Context-free language

from Wikipedia, the free encyclopedia

In the theoretical computer science a is context-free language ( English context-free language , CFL ) is a formal language , by a context-free grammar can be described. A context-free grammar allows a defined reading process (interpretation) of expressions in a formal language. On the one hand, it can be decided whether an expression corresponds to the rules of grammar and, on the other hand, a syntax tree can be created in the course of the analysis . A program that does this is called a parser . Parsers are used in particular to process programming languages. In computational linguistics , too, attempts are made to describe natural languages using rules of context-free grammars.

Context-free languages ​​are also known as Type 2 languages ​​of the Chomsky hierarchy . The class of all context-free languages ​​includes the regular languages (type 3 languages) and is included in the class of context-sensitive languages (type 1 languages).

One speaks of context-free languages ​​because the rules of context-free grammars are always applied regardless of the context. This distinguishes them from context-sensitive grammars , the rules of which also depend on the syntactic context.

characterization

The class of context-free languages ​​is the same as the class of languages accepted by nondeterministic push-down automata . The languages ​​accepted by deterministic push-down automata are called deterministic context-free languages and are identical to the class of LR (k) languages ​​(cf. LR (k) grammar ).

Examples

If an alphabet consists of the symbols and , the following languages ​​are examples of context-free languages:

The language contains the words : , , etc, so getting as many s as s. If you choose the symbols and instead of and , that corresponds to correctly nested brackets: for example or .

The language contains the words , , , etc, so all balanced words. Since they result in the same word when read from the front and the back, they are palindromes .

The language is context sensitive , but not context free.

Context-free languages ​​are used in the definition of the syntax of programming languages; for example, arithmetic expressions and generally correct bracket structures can be specified. The limits of context-free languages ​​lie in context-relevant properties, such as B. the type check in programming languages, which can only be represented by context-sensitive grammars . In practice, however, context-free parsers with additional functions and data structures are used.

In computational linguistics, context-free grammars are used to simulate natural languages. There is an alphabet of words of a language, for example , can be constructed with a few simple rules noun phrases: Due to the rules and are and correct expressions in the grammar.

properties

The class of context-free languages ​​is completed under the

It is not completed under

  • average
    • Counterexample : The languages and are context-free. But is not context free.
  • complement
    • Proof of contradiction : It has already been shown that there are context-free languages whose cut is not context-free. Let context-free languages ​​be completed with formation of complement. Then are also context free. Because of the isolation between Vereinigung and De Morgan, it follows that and is therefore context-free. Contradiction: the average was assumed to be not context-free.
  • Use of logarithmic space-constrained reduction
  • Symmetrical difference

The conclusion under union can be proven by the construction of a new, again context-free grammar: Be and context-free. New start symbol and new production . With

In the same way, one can show the isolation under concatenation for two context-free languages: Be and context-free. New start symbol and new production . With

The use of Kleene- * also corresponds to a new, context-free grammar: Be context-free. New start symbol and new production . With

Every regular language is also context free, since every regular grammar is also a context free grammar. There are context-sensitive languages that are not context-free. One such example is . Pumping lemmas exist in different variants for regular and context-free languages. For context-free languages, they describe necessary but not sufficient properties. The proof that a formal language is not context-free is usually made through the violation of these necessary properties. Often, the examined language is first suitably thinned out by intersecting with a regular language. This approach is justified by the above-mentioned conclusion under cut with regular languages.

An open problem is the question of whether the set of primitive words is context-free. A word above any alphabet is primitive if it is not a repetition of another word, i.e. does not have the form for any other, non-empty word of the same alphabet and an integer .

Typical decision problems

Let , and given context-free languages ​​above the alphabet . The following typical problems then arise:

  • Word problem : does a wordbelong?
  • Emptiness Problem : IsThe Empty Set?
  • Finiteness problem : isa finite set of words ()?

The problems listed above are decidable in context-free languages (the word problem by the Cocke-Younger-Kasami algorithm ). The equivalence problem ( ) is no longer decidable from and including this level of the Chomsky hierarchy .

Advanced properties

  • DLIN DCFL CFL GCSL CSL
  • REG DLIN LIN CFL
  • For each there are languages ​​that can be represented as intersections of context-free languages, but not as intersections of context-free languages.

Natural language

In linguistics , context-free grammars are also used to describe the syntax of natural languages. For Swiss German , for example, it has been shown that the language cannot be fully described with such a grammar. In computational linguistics, context-free grammars (or equivalent formalisms) with additional data structures are often used for languages ​​such as Swiss German.

See also

  • Backus-Naur-Form , a compact formal metalanguage for the representation of context-free grammars

literature

  • Uwe Schöning : Theoretical Computer Science - in a nutshell . 4th edition. Berlin 2003, spectrum, ISBN 3-8274-1099-1 .
  • Stuart M. Shieber: Evidence against the context-freeness of natural language . In: Linguistics and Philosophy 8, 1985, 3, ISSN  0924-4662 , pp. 333-343.
  • John E. Hopcroft , Rajeev Motwani , Jeffrey Ullman : Introduction to Automata Theory, Formal Languages, and Complexity Theory. 2nd revised edition. Pearson Studium, Munich 2002, ISBN 3-8273-7020-5 .
  • Leonard Y. Liu, Peter Weiner: An Infinite Hierarchy of Intersections of Context-Free Languages. In: Mathematical Systems Theory 7, 1973, ISSN  0025-5661 , pp. 185-192.

Web links

  • CFL . In: Complexity Zoo. (English)

Individual evidence

  1. Pál Dömösi, Masami Ito: Context-Free Languages and Primitive Words . World Scientific Publishing Co Pte Ltd , Singapore 2014, ISBN 978-981-4271-66-0 , pp. 447 ( Preview of Google Books [accessed November 30, 2015]).
  2. Context-Free Languages ​​and Primitive Words (see above), p. 11 ( preview of Google Books , accessed November 30, 2015)
  3. Stuart M. Shieber; Evidence against the context-freeness of natural language; In Linguistics and Philosophy 8; 1985 (pdf; 6.3 MB)