Recursive descent

from Wikipedia, the free encyclopedia

Recursive descent parser (English: recursive descent ) is a technique from the compiler , the direct way one (that is, without table..) Top down - parser implemented. It is characterized by its low complexity; the use of a parser generator is not necessary.

With this method, each non-terminal symbol is assigned a procedure which characterizes the production rule for this symbol. If the production rules allow recursion, then these procedures also call each other recursively.

A recursive descent can include backtracking . A waiver of this is guaranteed, however, if an LL ( k ) grammar is given for the language to be parsed. The following will assume the common case .

Code for the grammar rules of a nonterminal symbol

A procedure is defined for each non-terminal symbol in the grammar . If

all the rules with on the left are, the procedure looks in pseudo-code as follows:

 proc ()
    if lookahead in  then
       
    else if lookahead in  then
       
    ...
    else if lookahead in  then
       
    else
       error

The following applies:

  • lookaheadis the current input character (or token ).
  • is the set of terminal symbols (or tokens) that can appear at the beginning of a word generated by one of the words in the set .
  • is the set of terminal symbols that can appear directly to the right of in a word generated by the start symbol .
  • is the code for parsing .

The quantities must be included here because the empty word can contain; see LL (k) grammar .

Code for a sequence of grammar symbols

For (which can be terminals and / or non-terminals) consists of the code sections for in the same order.

The code for a single symbol looks like this:

  • If terminal is:
 if lookahead =  then
    lookahead := GetSymbol()
 else
    error
  • If it is non-terminal:
 ()

There is GetSymbola function that returns the next input character (or token).

example

The following formal grammar can be given in EBNF for arithmetic .

Produktionsregeln in EBNF
atom = number | "(" expression ")";
power = atom ["^" negation];
negation = ["-"] power;
multiplication = negation {("*" | "/") negation};
addition = multiplication {("+" | "-") multiplication};
expression = addition;

This is followed by a recursive descent in Python , which carries out a syntactic analysis and creates an abstract syntax tree . Before this, the input is broken down into a list of tokens using a lexical scanner scan . In retrospect, evaluatea recursive evaluation of the abstract syntax tree is demonstrated.

class SyntaxError(Exception):
    pass

def scan(s):
    a = []; i = 0; n = len(s)
    while i<n:
        if s[i] in "+-*/^()":
            a.append(s[i])
            i+=1
        elif s[i].isdigit():
            j = i
            while i<n and s[i].isdigit(): i+=1
            a.append(int(s[j:i]))
        elif s[i].isspace():
            i+=1
        else:
            raise SyntaxError(
                "unerwartetes Zeichen: '{}'".format(s[i]))
    a.append(None)
    return a

def expect_token(a,i):
    if a[i] is None:
        raise SyntaxError("unerwartetes Ende der Eingabe")
    else:
        return a[i]

def atom(a,i):
    t = expect_token(a,i)
    if isinstance(t,int):
        return i+1,a[i]
    elif t=="(":
        i,x = expression(a,i+1)
        if expect_token(a,i)!=")":
            raise SyntaxError(
                "')' wurde erwartet, es kam aber '{}'".format(a[i]))
        return i+1,x
    else:
        raise SyntaxError(
            "unerwartetes Symbol: '{}'".format(t))

def power(a,i):
    i,x = atom(a,i)
    if a[i]=="^":
        i,y = negation(a,i+1)
        return i,["^",x,y]
    else:
      return i,x

def negation(a,i):
    if a[i]=="-":
        i,x = power(a,i+1)
        return i,["~",x]
    else:
        return power(a,i)

def multiplication(a,i):
    i,x = negation(a,i)
    op = a[i]
    while op=="*" or op=="/":
        i,y = negation(a,i+1)
        x = [op,x,y]
        op = a[i]
    return i,x

def addition(a,i):
    i,x = multiplication(a,i)
    op = a[i]
    while op=="+" or op=="-":
        i,y = multiplication(a,i+1)
        x = [op,x,y]
        op = a[i]
    return i,x

def expression(a,i):
    return addition(a,i)

def ast(a):
    i,t = expression(a,0)
    if a[i] is None:
        return t
    else:
        raise SyntaxError(
            "Ende der Eingabe wurde erwartet, es kam aber: '{}'"
            .format(a[i]))

dispatch = {
    "+": lambda x,y: x+y,
    "-": lambda x,y: x-y,
    "*": lambda x,y: x*y,
    "/": lambda x,y: x/y,
    "^": lambda x,y: x**y,
    "~": lambda x: -x
}

def evaluate(t):
    if isinstance(t,int):
        return t
    else:
        return dispatch[t[0]](*map(evaluate,t[1:]))

while True:
    try:
        s = input("> ")
        if s=="": continue
        a = scan(s)
        t = ast(a)
        print("Abstrakter Syntaxbaum:\n  {}\n".format(t))
        print("Ergebnis:\n  {}".format(evaluate(t)))
    except SyntaxError as e:
        print("Syntaxfehler: "+str(e))
    except KeyboardInterrupt:
        print()
        break

literature

  • Aho, Alfred V., Sethi, Ravi, and Ullman, Jeffrey D. (1988). Compilers - Principles, Techniques, and Tools. Addison-Wesley. Sections 2.4 and 4.4, pp. 45–46 and 188–189.
  • Aho, Alfred V., Sethi, Ravi, and Ullman, Jeffrey D. (2008). Compilers Principles, Techniques, and Tools Pearson Studies. Sections 2.4.2 and 4.4.1, pp. 79–82 and 264–266.