Attribute grammar

An attribute grammar is a context-free grammar that is expanded to include attributes as well as rules and conditions. The concept is used in compiler construction , for example to check compliance with rules that can not be formulated with context-free grammars . Such rules are e.g. For example, that every variable must be declared and used according to its data type . The concept of attribute grammars was originally introduced by Donald E. Knuth .

A compiler checks compliance with these rules during the semantic analysis . He only has the information available that is contained in the syntax tree of the program. Additional information that simplifies the semantic analysis can be integrated into the syntax tree as attributes.

For example, the type of an expression can be annotated as an attribute to the corresponding node in the syntax tree. Using attribute rules and conditions, you can also specify dependencies on other attributes (including other nodes in the syntax tree).

The programming of the relevant parts of the compiler is simplified if the productions of the grammar themselves are given appropriate attributes.

notation

${\ displaystyle X_ {i} .a: = f (\ ldots, X_ {j} .b, \ ldots) \ in R (p)}$ is an attribute that belongs to a non-terminal of the production with . ${\ displaystyle a}$ ${\ displaystyle X_ {i}}$ ${\ displaystyle p: X_ {0} \ rightarrow X_ {1} \ ldots X_ {n}}$ ${\ displaystyle 0 \ leq i, j \ leq n}$

Definitions

${\ displaystyle AG = (G, A, R, B)}$ is an attribute grammar that is defined by the following components:

${\ displaystyle G = (N, T, P, Z)}$ is a context-free grammar .
${\ displaystyle A = \ bigcup _ {X \ in (T \ cup N)} A (X)}$ is a finite set of attributes that are each clearly assigned to a symbol . The individual attribute sets are disjoint, so it applies . ${\ displaystyle X}$ ${\ displaystyle A (X)}$ ${\ displaystyle A (X) \ cap A (Y) \ neq \ varnothing \ Rightarrow X = Y}$
${\ displaystyle R = \ bigcup _ {p \ in P} R (p)}$ is a finite set of attribution rules.
${\ displaystyle B = \ bigcup _ {p \ in P} B (p)}$ is a finite set of conditions. The condition of a production can also be understood in this form as an attribute of the left side, so all conditions are also included with the attributes. ${\ displaystyle B (p)}$ ${\ displaystyle p: X_ {0} \ rightarrow X_ {1} \ ldots X_ {n}}$ ${\ displaystyle X_ {0} .b}$
${\ displaystyle AF (p): = \ {X_ {i} .a \; \ vert \; p: X_ {0} \ rightarrow X_ {1} \ ldots X_ {n}, 0 \ leq i \ leq n, X_ {i} .a: = f (\ ldots) \ in R (p) \}}$ is the set of attributes that are defined in the rules of a production . ${\ displaystyle R (p)}$ ${\ displaystyle p \ in P}$

The attributes of a symbol can be divided into two disjoint classes, as there is only one calculation rule of the form in for all attributes : ${\ displaystyle A (X)}$ ${\ displaystyle a \ in A (X)}$ ${\ displaystyle Xa \ leftarrow f (\ ldots)}$ ${\ displaystyle R}$

${\ displaystyle AS (X): = \ {Xa \; \ vert \; \ exists \; p: X \ rightarrow X_ {1} \ ldots X_ {n} \ in P \ land Xa \ in AF (p) \ }}$ is the set of synthesized (derived) attributes. These are the attributes that are defined in the rules of a production where is shown on the left. ${\ displaystyle r \ in R (p)}$ ${\ displaystyle p \ in P}$ ${\ displaystyle X}$
${\ displaystyle AI (X): = \ {Xa \; \ vert \; \ exists \; q: Y \ rightarrow uXv \ in P \ land Xa \ in AF (q) \}}$ is the set of inherited attributes. These are the attributes that are defined in the rules of a production where is shown on the right. ${\ displaystyle r \ in R (p)}$ ${\ displaystyle p \ in P}$ ${\ displaystyle X}$

Circularity

Attribute grammars are circular when the dependency graph of the attribute variable induced by the functional dependency contains a loop.

This circularity can be tested in exponential time.

A simplified test that allows fewer grammars calculates the problem in polynomial time.

Grammar types

S attribute grammars

S-attribute grammars, SAG for short, are attribute grammars that only work on synthetic attributes. In this way, they can be calculated directly during the reduce steps of the parse process of an LR (k) parser . Implemented in yacc .

L attribute grammars

L-attribute grammars (LAG) can be evaluated in a top-down pass from left to right through the abstract syntax tree. They can be evaluated for any LL grammar and can therefore be used for programming languages similar to Pascal. With these, only derived and following tree parts are allowed to access current attributes.

Example:

${\ displaystyle A \ rightarrow BC, a.1 = a.0, b.0 = b.1, c.2 = c.1, c.2 = c.0}$ (allowed)
${\ displaystyle A \ rightarrow BC, a.1 = a.2}$ (forbidden)

This facilitates the forward-looking declaration of variables and functions.

LR attribute grammars

A sub-class of L attribute grammars, specifically those that can be evaluated in one pass from left to right during LR parsing. Implementation: zyacc; Can be implemented manually in yacc using global variables. The advantage of the greater power of the LR parsing compared to the LL parsing thus manifests itself in a mirror-inverted manner in the disadvantage of the lower power of the LR attribute grammars compared to the L attribute grammars.

ECLR attribute grammars

A variant of the LR attribute grammars; it uses an equivalence relation to optimize the attribute evaluation. Implementation: rie.

Individual evidence

^ Donald E. Knuth: Semantics of context-free languages . In: Mathematical systems theory . tape 2 , no. 2 , ISSN 0025-5661 , p. 127–145 , doi : 10.1007 / BF01692511 ( springer.com [accessed January 8, 2017]).
^ Donald E. Knuth: Semantics of context-free languages: Correction . In: Mathematical systems theory . tape 5 , no. 2 , ISSN 0025-5661 , p. 95-96 , doi : 10.1007 / BF01702865 ( springer.com [accessed January 8, 2017]).

[1] Donald E. Knuth: Semantics of context-free languages . In: Mathematical systems theory . tape 2 , no. 2 , ISSN 0025-5661 , p. 127–145 , doi : 10.1007 / BF01692511 ( springer.com [accessed January 8, 2017]).

[2] Donald E. Knuth: Semantics of context-free languages: Correction . In: Mathematical systems theory . tape 5 , no. 2 , ISSN 0025-5661 , p. 95-96 , doi : 10.1007 / BF01702865 ( springer.com [accessed January 8, 2017]).