Shunting yard algorithm

Graphic illustration of the algorithm as a 3-way turnout.

The shunting yard algorithm is a method to convert mathematical terms from the infix notation into the reverse Polish notation or into an abstract syntax tree . The algorithm was developed by Edsger W. Dijkstra invented and English shunting yard , marshalling yard ' named because he's way of working to a marshalling yard recalls.

functionality

The shunting yard algorithm requires both an input and an output queue for the conversion, as well as a stack , which is required for the intermediate storage of the operators. The input string is read character by character, with all numbers being written directly to the output variable in the same order. If the pending character is an operation character, it is placed on the operator stack. If an operator is already on the stack, the operator priority and operator associativity are used to decide whether the new operator should be placed directly on the stack or whether the stack should be emptied into the output first.

Opening brackets are also placed on the operator stack, but they are not written to the output stream when they are removed. With closing brackets, the stack is emptied until an opening bracket is encountered; if no opening brackets are found, the input string is incomplete, although the error handling is not defined by the algorithm.

In detail

This is followed by a German pseudo-code that can be easily translated and adapted into most programming languages.

    Stack mit LIFO-Prinzip und Ausgabe-Queue anlegen.
    SOLANGE Tokens verfügbar sind:
        Token einlesen.
        WENN Token IST-Zahl:
            Token ZU Ausgabe.
        ENDEWENN
        WENN Token IST-Funktion:
            Token ZU Stack.
        ENDEWENN
        WENN Token IST-Argumenttrennzeichen:
            BIS Stack-Spitze IST öffnende-Klammer:
                Stack-Spitze ZU Ausgabe.
                FEHLER-BEI Stack IST-LEER:
                    GRUND (1) Ein falsch platziertes Argumenttrennzeichen.
                    GRUND (2) Der schließenden Klammer geht keine öffnende voraus.
                ENDEFEHLER
            ENDEBIS
        ENDEWENN
        WENN Token IST-Operator
            SOLANGE Stack IST-NICHT-LEER UND
                    Stack-Spitze IST Operator UND
                    Token IST-linksassoziativ UND
                    Präzedenz von Token IST-KLEINER-GLEICH Präzedenz von Stack-Spitze
                Stack-Spitze ZU Ausgabe.
            ENDESOLANGE
            Token ZU Stack.
        ENDEWENN
        WENN Token IST öffnende-Klammer:
            Token ZU Stack.
        ENDEWENN
        WENN Token IST schließende-Klammer:
            BIS Stack-Spitze IST öffnende-Klammer:
                FEHLER-BEI Stack IST-LEER:
                    GRUND (1) Der schließenden Klammer geht keine öffnende voraus.
                ENDEFEHLER
                Stack-Spitze ZU Ausgabe.
            ENDEBIS
            Stack-Spitze (öffnende-Klammer) entfernen
            WENN Stack-Spitze IST-Funktion:
                Stack-Spitze ZU Ausgabe.
            ENDEWENN
        ENDEWENN
    ENDESOLANGE
    BIS Stack IST-LEER:
        FEHLER-BEI Stack-Spitze IST öffnende-Klammer:
            GRUND (1) Es gibt mehr öffnende als schließende Klammern.
        ENDEFEHLER
        Stack-Spitze ZU Ausgabe.
    ENDEBIS

This algorithm assumes that all tokens are correctly recognized and are valid. In particular, this algorithm does not take over the superposition of the characters “+” and “-”. A conflict between right and left associative operators with the same precedence is not caught.

The “reading” access to the stack (e.g. with “Stack top ACT”) and the “taking” (with “Stack top CLOSED”) are required.

Required functions are:

Recognize numbers
Recognize functions
Recognizing argument separators
Recognize operators
Determine operator associativity
Determining the operator precedence (here higher precedence means a stronger bond), the precedence of a function is maximal.

Examples

The following calculations, given in Infix notation, are to be transformed. It is documented what happens.

Precedents: ( +, −) <( ·, :) <( ^) <functions

(3 + 4) (5 - 6)

Summarize tokens: (3+4)(5−6)(Here: every character is a token)
( on the stack.
3 for output
+ on the stack
4 for output
):
1. + for output
2. ( remove from the stack
Operator expected but (found:
1. implicit ·on the stack
2. ( on the stack
5 for output
− on the stack
6 for output
):
1. − for output
2. ( remove from the stack
End: ·to issue

Output:34+56−·

1 + 2 - 3 4 + 5 ^ 6 ^ 7 8 - 9

Summarize tokens: 1+2−3·4+5^6^7·8−9(Here: every character is a token)
1 for output
+ on the stack
2 for output
−:
1. + for output
2. − on the stack
3 for output
· on the stack
4 for output
+:
1. · for output
2. − for output
3. + on the stack
5 for output
^ on the stack
6 for output
^on the stack ( ^is right-associative)
7 for output
·:
1. ^ for output
2. ^ for output
3. · on the stack
8 for output
−:
1. · for output
2. + for output
3. − on stack
9 for output
End: −to issue

Output: 12+34·−567^^8·+9−

Explicitly in parentheses ([(1 + 2) - (3 · 4)] + {[5 ^ (6 ^ 7)] · 8}) - 9 this is possibly easier to understand.

cos (1 + sin (ln (5) - exp (8)) ^ 2)

Summarize tokens: cos ( 1 + sin ( ln ( 5 ) − exp ( 8 ) ) ^ 2 )
cos on the stack
( on the stack
1 for output
+ on the stack
sin on the stack
( on the stack
ln on the stack
( on the stack
5 for output
):
1. (Remove the top one from the stack
2. ln for output
– on the stack
exp on the stack
( on the stack
8 for output
):
1. (Remove the top one from the stack
2. exp for output
):
1. – for output
2. (Remove the top one from the stack
3. sin for output
^ on the stack
2 for output
):
1. ^ for output
2. + for output
3. (Remove the top one from the stack
4. cos for output
End: No remaining elements in the stack, so you're done!

Output: 15ln8exp−sin2^+cos

Web links

Original description of the algorithm (PDF; 1.1 MB; English)