Lex (computer science)

Lex is a program used in compiler construction to generate scanners for the lexical analysis of source text. A lexical scanner is one component of a compiler.

Lex is often used in combination with Yacc, which is responsible for the syntactic analysis (parsing).

Lex was written in C by Mike Lesk at Bell Labs in the mid-1970s; the regular expression handling was contributed by Alfred V. Aho and Ken Thompson. In the summer of 1976 the program was reimplemented by Eric Schmidt, then an intern at Bell Labs.

How it works

To have Lex generate a scanner, a description file must be created in which the tokens are defined using regular expressions.

Here is an example of such a file:

%{
#include <stdlib.h>   /* for atoi() */
#include "y.tab.h"    /* token codes generated by yacc -d */
extern int yylval;    /* value of the current token, shared with the parser */
%}

%%
"="      { return EQ; }
"!="     { return NE; }
"+"      { return PLUS; }
"-"      { return MINUS; }
";"      { return SEMICOLON; }
"print"  { return PRINT; }
[0-9]+   { yylval = atoi(yytext); return NUMBER; }
...

The generated scanner reads the source code of the program to be compiled and divides it into tokens; if this is not possible, a lexical error has occurred. The tokens are then passed on to the parser, the syntactic analysis stage of the compiler.
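
If the scanner is used on its own rather than from Yacc, a small driver can call the generated function yylex() in a loop. The following sketch is an illustration only: it assumes that y.tab.h with the token codes has already been produced (e.g. by yacc -d), and the user-code section would be appended to the Lex file above after a second %% separator:

%%
/* Hypothetical standalone driver: read tokens until yylex()
   returns 0 at end of input and print each token code. */
#include <stdio.h>

int yywrap(void) { return 1; }   /* no further input files to scan */

int main(void)
{
    int token;
    while ((token = yylex()) != 0)
        printf("token %d: \"%s\"\n", token, yytext);
    return 0;
}

Built with lex scanner.l && cc lex.yy.c -o scanner (the file name scanner.l is assumed here), echo "print 15+5;" | ./scanner would then print the numeric codes of the five tokens listed in the example below.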

Example

For example, source code such as

 print 15+5;

produces the tokens:

  1. (PRINT,)
  2. (NUMBER, 15)
  3. (PLUS,)
  4. (NUMBER, 5)
  5. (SEMICOLON,)

Note that Lex has no knowledge of the permitted syntax. Specifically, this means that the sample code

 15+ print; 5

would also be converted into the same tokens (though in a different order).
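
Detecting such errors is the parser's job. A minimal sketch of a matching Yacc grammar, purely for illustration (the rule names and the restriction to print statements are assumptions, not part of the original example), could look like this:

%{
#include <stdio.h>
int yylex(void);    /* provided by the Lex-generated scanner */
int yyparse(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}

%token EQ NE PLUS MINUS SEMICOLON PRINT NUMBER

%%
program   : /* empty */
          | program statement
          ;
statement : PRINT expr SEMICOLON    /* accepts "print 15+5;" */
          ;
expr      : NUMBER
          | expr PLUS NUMBER
          | expr MINUS NUMBER
          ;
%%
int main(void) { return yyparse(); }

With this grammar, print 15+5; is accepted, whereas 15+ print; 5 causes yyparse() to report a syntax error: the syntax check happens in the parser, not in the scanner.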

Literature

  • Helmut Herold: lex & yacc. The professional tools for lexical and syntactic text analysis. Addison-Wesley, 2003, ISBN 3-8273-2096-8.
  • John R. Levine, Tony Mason, Doug Brown: lex & yacc. 2nd edition. O'Reilly, 1992, ISBN 1-56592-000-7.
  • M. E. Lesk, E. Schmidt: Lex - A Lexical Analyzer Generator. Computing Science Technical Report No. 39, Bell Laboratories, Murray Hill, NJ, 1975.
