Compiler


from Wikipedia, the free encyclopedia

A compiler (from English to compile, 'to gather'; ultimately from Latin compilare, 'to heap up') is a computer program that translates the source code of a particular programming language into a form that a computer can execute (more or less) directly. The result is a more or less directly executable program. Compilers are to be distinguished from interpreters, e.g. for early versions of BASIC, which do not generate machine code.

Sometimes a distinction is made between the terms translator and compiler. A translator translates a program from a formal source language into a semantically equivalent program in a formal target language. Compilers are special translators that convert program code from problem-oriented programming languages, so-called high-level languages, into the executable machine code of a certain architecture or into an intermediate code (bytecode, p-code or .NET code). This separation between the terms translator and compiler is not made in all cases.

The process of translation is also known as compilation (verb: to compile). The opposite, i.e. the reverse translation of machine language into the source text of a certain programming language, is called decompilation, and the corresponding programs are called decompilers.

Terminology

A translator is a program that accepts a program formulated in a source language as input and translates it into a semantically equivalent program in a target language. In particular, it is required that the generated program delivers the same results as the given program. The source language assembly is often seen as an exception: its translator (into machine code) is called an "assembler" and is generally not referred to as a "compiler". The translator's job includes a wide range of subtasks, from syntax analysis to target code generation. Another important task is to identify and report errors in the source program.

The word “compiler” comes from the English “to compile” and in its literal sense means “to gather together”. In the 1950s, the term was not yet firmly anchored in the computer world. Compiler originally referred to an auxiliary program that brought together an overall program from individual subroutines or formula evaluations in order to carry out special tasks. (This task is now performed by the linker, which can, however, also be integrated into the compiler.) The individual subroutines were still written “by hand” in machine language. From 1954, the term “algebraic compiler” came up for a program that independently converted formulas into machine code. The "algebraic" fell away over time.

At the end of the 1950s, the term compiler was still controversial in English-speaking countries. The Fortran development team held onto the term “translator” for years to refer to the compiler. This designation is even included in the name of the Fortran programming language itself: Fortran is composed of Formula and Translation, i.e. roughly “formula translation”. It was not until 1964 that the term compiler prevailed over the term translator in connection with Fortran. According to Carsten Busch, there is a “special irony in history” in the fact that the term compiler is rendered in German as “Übersetzer” (“translator”). However, some German publications also use the English term compiler instead of translator.

In a narrower sense, however, some German-language publications only use the technical term compiler when the source language is a higher-level programming language than the target language. Typical applications are the translation of a high-level programming language into the machine language of a computer, as well as the translation into the bytecode of a virtual machine. The target language of compilers (in this sense) can also be an assembly language. A translator for translating assembler source programs into machine language is referred to as an assembler.

History

Konrad Zuse was already planning a compiler for the first high-level programming language ever designed, his Plankalkül. Zuse referred to a single program as a calculation plan and as early as 1944 had the idea for a so-called plan production device, which was to automatically generate a punched tape with a corresponding machine plan for the Zuse Z4 computer from a mathematically formulated calculation plan.

More concrete than Zuse's idea of a plan production device was a concept by Heinz Rutishauser for automatic calculation plan production. In a lecture to the Society for Applied Mathematics and Mechanics (GAMM) as well as in his 1951 habilitation thesis at ETH Zurich, he described which additional programming commands (instructions) and hardware additions were necessary for the Z4, then in use at the ETHZ, so that the computer could also be used as an aid for automatic program creation.

Grace Hopper (1984)

An early compiler was designed by the mathematician Grace Hopper in 1949. Until then, programmers had to create machine code directly. (The first assembler was written by Nathaniel Rochester for an IBM 701 between 1948 and 1950.) To simplify this process, Grace Hopper developed a method that made it possible to express programs and their subroutines in a way more suited to humans than to the machine. On May 3, 1952, Hopper presented the first compiler, A-0, which called algorithms from a catalog, rewrote code, compiled it in the appropriate order, reserved memory space and organized the allocation of memory addresses. At the beginning of 1955, Hopper presented a prototype of the compiler B-0, which generated programs according to English, French or German instructions. Hopper called her talk on the first compiler “The Education of a Computer”.

The history of compiler construction was shaped by the programming languages current at the time (see the timeline of programming languages) and by hardware architectures. Other early milestones are the first Fortran compiler in 1957 and the first COBOL compiler in 1960. Many architectural features of today's compilers were not developed until the 1960s.

In the past, programs that merely put subroutines together were sometimes also called compilers. This misses the core task of a compiler today, because subroutines can nowadays be inserted by other means: either in the source text itself, for example by a preprocessor (see also precompiler), or, in the case of compiled components, by an independent linker.

Working method

The basic steps involved in translating a source code into a target code are:

Syntax check
It is checked whether the source code represents a valid program, i.e. corresponds to the syntax of the source language. Any errors found are logged. The result is an intermediate representation of the source code.
Analysis and optimization
The intermediate representation is analyzed and optimized. This step varies greatly in scope depending on the compiler and user settings. It ranges from simple efficiency optimizations to full program analysis.
Code generation
The optimized intermediate representation is translated into corresponding commands of the target language. Further, target-language-specific optimizations can be made here.
Note: modern compilers often no longer generate the final code themselves; generating code at runtime instead enables:
  • cross-module optimizations,
  • exact adaptation to the target platform (instruction set, adaptation to the capabilities of the CPU),
  • use of profiling information.

Structure of a compiler

Compiler construction, i.e. the programming of a compiler, is an independent discipline within computer science.

Modern compilers are divided into different phases, each of which takes on different subtasks of the compiler. Some of these phases can be implemented as independent programs (see precompiler , preprocessor ). They are executed sequentially . Basically, two phases can be distinguished: the front end (also analysis phase ), which analyzes the source text and generates an attributed syntax tree from it , and the back end (also synthesis phase ), which generates the target program from it.

Frontend (also "analysis phase")

In the front end, the code is analyzed, structured and checked for errors. It is itself divided into phases. Languages like modern C++ do not allow the syntax analysis to be split into lexical analysis, syntactic analysis and semantic analysis, owing to ambiguities in their grammar. Their compilers are correspondingly complex.

Lexical analysis

The lexical analysis divides the imported source text into lexical units ( tokens ) of different types, for example keywords , identifiers , numbers , strings or operators . This part of the compiler is called a tokenizer, scanner, or lexer.

A scanner occasionally uses a separate screener to skip whitespace ( white space , i.e. spaces, tabs, line endings, etc.) and comments .

Another function of the lexical analysis is to associate recognized tokens with their position (e.g. line number) in the source text. If errors are found in the further analysis phases, which are based on the tokens (e.g. errors of a syntactic or semantic kind), the error messages generated can be provided with a reference to the location of the error.

Lexical errors are characters or character strings that cannot be assigned to a token. For example, most programming languages ​​do not allow identifiers that begin with digits (e.g. "3foo").

Syntactic Analysis

The syntactic analysis checks whether the imported source code has the correct structure of the source language to be translated, i.e. corresponds to the context-free syntax (grammar) of the source language. The input is converted into a syntax tree. The syntactic analyzer is also known as a parser. If the source code does not match the grammar of the source language, the parser outputs a syntax error. The syntax tree created in this way is annotated with the "contents" of the nodes for the next phase (semantic analysis); for example, variable identifiers and numbers are passed on along with the information that they are such. The syntactic analysis checks, for example, whether the brackets are correct, i.e. whether every opening bracket is followed by a closing bracket of the same type and brackets are not interleaved. Keywords likewise mark out certain structures.

Semantic analysis

The semantic analysis checks the static semantics, i.e. conditions on the program that go beyond the syntactic analysis. For example, a variable must usually have been declared before it is used, and assignments must be made with compatible data types. This can be done with the help of attribute grammars. The nodes of the syntax tree generated by the parser are given attributes that contain information. For example, a list of all declared variables can be created. The output of the semantic analysis is then called a decorated or attributed syntax tree.

Backend (also "synthesis phase")

The backend generates the program code of the target language from the attributed syntax tree created by the frontend.

Intermediate code generation

Many modern compilers generate an intermediate code from the syntax tree , which can be relatively close to the machine, and carry out program optimizations on this intermediate code, for example. This is particularly useful for compilers that support multiple source languages ​​or different target platforms . Here the intermediate code can also be an exchange format.

Program optimization

The intermediate code is the basis of many program optimizations. See program optimization .

Code generation

During code generation, the program code of the target language is generated either directly from the syntax tree or from the intermediate code. If the target language is a machine language, the result can be an executable program directly or a so-called object file that leads to a library or an executable program by linking with the runtime library and possibly other object files . All of this is carried out by the code generator, which is part of the compiler system, sometimes as part of the compiler program, sometimes as a separate module.

Classification of different compiler types

Native compiler
Compiler that generates the target code for the platform on which it runs itself. The code is platform specific.
Cross compiler
Compiler that runs on one platform and generates target code for another platform, for example another operating system or another processor architecture .
A typical application is the creation of programs for an embedded system that itself does not contain any tools or no good tools for software creation, as well as the creation or porting of an operating system to a new platform.
Single pass compiler
Compiler that generates the target code from the source code in a single pass (in contrast to the multi-pass compiler); the compiler reads the source text from front to back only once and generates the result program at the same time. Such a compiler is usually very fast, but can only carry out simple optimizations. A single-pass compiler can only be created for certain programming languages, for example Pascal, C and C++, because the programming language must not contain any forward references (nothing may be used that has not already been declared "above", i.e. earlier, in the source code).
Multi-pass compiler
With this type of compiler, the source code is translated into the target code in several steps (originally: the source code is read in several times, or worked through several times "from front to back"). In the early days of compiler construction, the translation process was mainly divided into several runs because the computer often did not have sufficient capacity to hold the complete compiler and the program to be translated in main memory at the same time. Nowadays a multi-pass compiler is mainly used to resolve forward references (declaration of an identifier further down in the source code than its first use) and to carry out complex optimizations.

Special forms

  • A transcompiler (also transpiler or cross-translator) is a special compiler that translates the source code of one programming language into the source code of another programming language, for example Pascal into C. This process is called "code transformation" or "translation". However, since many programming languages have special properties and performance features, efficiency losses can occur if these are not taken into account by the transcompiler. Since programming languages usually follow different programming paradigms, the newly generated source text is often difficult for developers to read. Sometimes manual post-processing of the code is necessary, as the automatic translation does not work smoothly in all cases. There are also extensive subprogram libraries in many modern programming languages. Mapping library calls complicates the compilation process further.
  • Compiler compilers and compiler generators are auxiliary programs for the automatic generation of compiler parts or complete compilers. See also: ANTLR , Coco / R , JavaCC , Lex , Yacc
  • Just-in-time compilers (or JIT compilers ) do not translate source code or intermediate code into machine code until the program is executed. Program parts are only compiled when they are executed for the first time or several times. The degree of optimization usually depends on the frequency of use of the corresponding function.
  • With a compreter, the program source code is first translated into an intermediate code, which is then interpreted at runtime. Compreters are meant to combine the advantages of the compiler with the advantages of the interpreter. To reduce execution time, many of today's interpreters are effectively implemented internally as compreters that translate the source code at runtime before the program is executed. A bytecode interpreter is also a compreter, e.g. the virtual machines of Java up to version 1.2.

Program optimization (in detail)

Many optimizations that used to be the task of the compiler are now carried out within the CPU during code processing. Machine code is good if it has short critical paths and few surprises due to incorrectly predicted jumps, requests data from memory in good time, and uses all execution units of the CPU equally.

Parallel calculation of the Mandelbrot set on a Haswell i7 CPU (2014). The graphic shows the calculations taking place simultaneously on one core (data type: floating point, single precision), that is between 64 and 128 calculations in 8 to 16 instructions per core, divided between 2 threads. On a Haswell Core i7-5960X (8 cores) up to 1024 parallel calculations take place (96 billion iterations/sec), on a Haswell Xeon E7-8890 v3 up to 2304 (180 billion iterations/sec per socket). The job of modern compilers is to optimize code so as to allow this nesting of instructions. This is a fundamentally different job from the one compilers had in the late 1980s.

To control the translation, the source text can contain additional special compiler instructions in addition to the instructions of the programming language .

A compiler usually offers options for various optimizations with the aim of improving the runtime of the target program or minimizing its memory requirements . The optimizations are partly based on the properties of the hardware , for example how many and which registers the computer's processor makes available. It is possible for a program to run more slowly after optimization than it would have done without optimization. This can occur, for example, when an optimization for a program construct produces longer code that would actually be executed faster, but requires more time to be loaded into the cache . It is therefore only advantageous if it is used frequently.

Some optimizations result in the compiler generating target language constructs for which there are no direct equivalents in the source language. A disadvantage of such optimizations is that it is then hardly possible to follow the program flow with an interactive debugger in the source language.

Optimizations can be very complex. In many cases, especially in modern JIT compilers, it is therefore necessary to weigh up whether it is worthwhile to optimize a part of the program. With ahead-of-time compilers , all useful optimizations are used in the final compilation, but often not during software development (this reduces the compilation time required). For non-automatic optimizations on the part of the programmer, tests and application scenarios can be run through (see Profiler ) to find out where complex optimizations are worthwhile.

In the following, some optimization options for a compiler are considered. The greatest potential for optimization, however, often lies in changing the source program itself, for example in replacing an algorithm with a more efficient one. In most cases, this process cannot be automated, but has to be carried out by the programmer . On the other hand, simpler optimizations can be delegated to the compiler in order to keep the source code legible.

Saving of machine commands

In many high-level programming languages, for example, you need an auxiliary variable to swap the content of two variables:

Saving of machine commands

High-level language    Machine commands
                       without optimization       with optimization
help = a               a → Register 1             a → Register 1
                       Register 1 → help
a = b                  b → Register 1             b → Register 2
                       Register 1 → a             Register 2 → a
b = help               help → Register 1          Register 1 → b
                       Register 1 → b

With the optimization, only 4 assembler commands are required instead of 6, and the memory space for the auxiliary variable help is not needed. This means that this swap is carried out more quickly and requires less main memory. However, this only applies if sufficient registers are available in the processor. Storing data in registers instead of main memory is a common optimization option.

The command sequence shown above as optimized has another property that can be an advantage in modern CPUs with several processing pipelines: the two read commands and the two write commands can be processed in parallel without any problems, as neither depends on the result of the other. Only the first write command must definitely wait until the last read command has been carried out. More in-depth optimization passes may therefore insert machine commands between b → Register 2 and Register 2 → a that belong to a completely different high-level command line.

Static formula evaluation at compile time

Calculating the circumference with

pi = 3.14159
u  = 2 * pi * r

can be evaluated by the compiler at compile time and replaced by u = 6.28318 * r. This formula evaluation saves the multiplication 2 * pi at runtime of the generated program. This procedure is referred to as constant folding.

Elimination of dead program code

If the compiler can recognize that a part of the program will never be run through, then it can omit this part from the compilation.

Example:

100   goto 900
200   k=3
900   i=7
...   ...

If the jump label 200 is never the target of a GOTO in this program, the instruction 200 k=3 can be omitted. The jump command 100 goto 900 is then also superfluous.

Detection of unused variables

If a variable is not required, no memory space has to be reserved and no target code has to be generated.

Example:

subroutine test (a,b)
    b = 2 * a
    c = 3.14 * b
    return b

The variable c is not required here: it is not in the parameter list, is not used in subsequent calculations and is not output. The instruction c = 3.14 * b can therefore be omitted.

Optimizing loops

In particular , one tries to optimize loops by, for example

  • holds as many variables as possible in registers (usually at least the loop variable);
  • accesses the elements of an array not via an index but via a pointer that is advanced through the elements, which makes accessing the array elements cheaper;
  • Calculations within the loop, which produce the same result in each run, are only carried out once before the loop (loop-invariant code motion);
  • combines two loops that go over the same value range into one loop, so that the administrative effort for the loop is only incurred once;
  • the loop partially or (in the case of loops with a constant, low number of passes) completely unrolls (English loop unrolling ), so that the instructions within the loop are executed several times in direct succession, without a check of the loop condition and a jump to the beginning of the loop taking place each time after the instructions ;
  • reverses the loop (especially in counting loops with for ), since efficient jump commands (Jump-Not-Zero) can be used when counting down to 0 ;
  • reshapes the loop so that the check of the termination condition is carried out at the end of the loop (loops with an initial check always have a conditional and an unconditional jump instruction, while loops with an end check only have a conditional jump instruction);
  • removes the whole loop if (after some tweaking) it has an empty body. However, this can also remove delay loops that were deliberately inserted to slow a program down; for such waiting, functions of the operating system should be used wherever possible.
  • arranges nested loops (loops in loops), if the programming logic used allows it, in ascending order, from the outermost loop with the fewest loop passes to the innermost loop with the most loop passes. This avoids repeated initialization of the inner loop bodies.

Some of these optimizations are of no use or even counterproductive with current processors.

Insertion of subroutines

In the case of small subroutines , the effort involved in calling the subroutine is more significant than the work performed by the subroutine. Compilers therefore try to insert the machine code of smaller subroutines directly - similar to how some compilers / assemblers / precompilers break down macro instructions into source code. This technique is also known as inlining. In some programming languages ​​it is possible to use inline keywords to indicate to the compiler that certain subprograms should be inserted. The insertion of subroutines often opens up further possibilities for optimization, depending on the parameters.

Holding values ​​in registers

Instead of accessing the same variable in the memory several times, for example in a data structure, the value can only be read once and temporarily stored in registers or in the stack for further processing. In C , C ++ and Java this behavior may have to be switched off with the keyword volatile : A variable designated as volatile is read repeatedly from the original memory location each time it is used, since its value may have changed in the meantime. This can be the case, for example, if it is a hardware port or a thread running in parallel could have changed the value.

Example:

int a = array[25]->element.x;
int b = 3 * array[25]->element.x;

The machine program accesses array[25]->element.x only once; the value is stored temporarily and used twice. If x is volatile, it is accessed twice.

There is another reason besides volatile that makes intermediate storage in registers impossible: if the value of the variable v could be changed in memory by using the pointer z, keeping v in a register can lead to incorrect program behavior. Since the pointers often used in the C programming language are not limited to an array (they could point anywhere in main memory), the optimizer often does not have enough information to rule out a pointer changing a variable.

Use faster equivalent statements

Instead of multiplying or dividing an integer by a power of two, a shift command can be used. There are cases in which not only powers of two, but also other numbers (simple sums of powers of two) are used for this optimization: for example, (n << 1) + (n << 2) can be faster than n * 6. Instead of dividing by a constant, one can multiply by the reciprocal of the constant. Of course, such special optimizations should definitely be left to the compiler.

Omission of runtime checks

Programming languages such as Java require runtime checks when accessing arrays or variables. If the compiler determines that a particular access will always be in the permitted range (for example, a pointer that is known not to be NULL at this point), the code for these runtime checks can be omitted.

Reduction of paging during runtime

Closely related code areas, for example a loop body, should be located on the same memory page, or on as few memory pages as possible, in main memory during runtime ("page": a connected memory block managed by the operating system). This optimization is the task of the (optimizing) linker. It can be achieved, for example, by adding idle instructions ("NOPs", no operation) to the target code at suitable points. This makes the generated code larger, but because of the reduced number of TLB cache entries and page walks required, the program is executed faster.

Advancing or delaying memory accesses

By moving memory read accesses earlier and delaying write accesses, the ability of modern processors to work in parallel with different functional units can be exploited. For example, with the commands a = b * c; d = e * f; the operand e can already be loaded while another part of the processor is still busy with the first multiplication.

An example compiler

The following example, created with ANTLR, is intended to explain the cooperation between parser and lexer. The translator should be able to evaluate and compare expressions built from the basic arithmetic operations. The parser grammar converts file content into an abstract syntax tree (AST).

Grammars

The tree grammar is able to evaluate the lexemes stored in the AST. The operator of the arithmetic functions precedes the operands in the AST (prefix notation). Therefore, the grammar can perform calculations driven by the operator, without jumps, and still correctly handle parentheses and operations of different priorities.

tree grammar Eval;
options {
	tokenVocab=Expression;
	ASTLabelType=CommonTree;
}
@header {
import java.lang.Math;
}
start	: line+; //A file consists of several lines
line	: compare {System.out.println($compare.value);}
	;

compare returns [double value]
	: ^('+' a=compare b=compare) {$value = a+b;}
	| ^('-' a=compare b=compare) {$value = a-b;}
	| ^('*' a=compare b=compare) {$value = a*b;}
	| ^('/' a=compare b=compare) {$value = a/b;}
	| ^('%' a=compare b=compare) {$value = a%b;}
	| ^(UMINUS a=compare)        {$value = -1*a;} //The UMINUS token is needed so that unary minus
                                                      //is not confused with the binary operator
	| ^('^' a=compare b=compare) {$value = Math.pow(a,b);}
	| ^('=' a=compare b=compare) {$value = (a==b)? 1:0;} //true=1, false=0
	| INT {$value = Integer.parseInt($INT.text);}
	;

If one of the expressions identified above as compare is not yet a lexeme, the following lexer grammar divides it into individual lexemes. The lexer uses the technique of recursive descent . Expressions are broken down further and further until they can only be tokens of type number or operators.

grammar Expression;
options {
	output=AST;
	ASTLabelType=CommonTree;
}
tokens {
	UMINUS;
}
start	:	(line {System.out.println($line.tree==null?"null":$line.tree.toStringTree());})+;
line	:	compare NEWLINE -> ^(compare); //A line consists of an expression and a
                                              //terminal character
compare	:	sum ('='^ sum)?; //Sums can be compared with sums
sum	: 	product	('+'^ product|'-'^ product)*; //Sums consist of products (operator precedence)
product	:	pow ('*'^ pow|'/'^ pow|'%'^ pow)*; //Products (the modulo operation belongs here) can
                                                  //be composed of powers
pow	: 	term ('^'^ pow)?; //Powers are applied to terms
term	:	number //Terms consist of numbers, subterms or sums
		|'+' term -> term
		|'-' term -> ^(UMINUS term) //Subterm with sign
		|'('! sum ')'! //Subterm with parenthesized expression
		;
number	:	INT; //Numbers consist only of digits
INT	:	'0'..'9'+;
NEWLINE	:	'\r'? '\n';
WS	:	(' '|'\t'|'\n'|'\r')+ {skip();}; //Whitespace is ignored

The output after the token start also shows the expression just evaluated.

Output of the example

Input:

5 = 2 + 3
32 * 2 + 8
(2 * 2^3 + 2) / 3

Output (the first lines merely echo the input in its AST representation):

(= 5 (+ 2 3))
(+ (* 32 2) 8)
(/ (+ (* 2 (^ 2 3)) 2) 3)
1.0
72.0
6.0

The first expression is evaluated as true (1), with the other expressions the result of the calculation is output.

Literature

Web links

Wiktionary: Compiler  - explanations of meanings, word origins, synonyms, translations
Wiktionary: compile  - explanations of meanings, word origins, synonyms, translations

References

  1. Michael Eulenstein: Generation of portable compilers. The portable system POCO. (= Informatik-Fachberichte 164) Springer Verlag: Berlin, u. a., 1988, p. 1; Hans-Jochen Schneider (Hrsg.): Lexicon computer science and data processing. 4th edition, Oldenbourg Verlag: München, Berlin, 1998, 900; Manfred Broy: Computer Science. A basic introduction. Volume 2: System structures and theoretical computer science. 2nd edition, Springer Verlag: Berlin, Heidelberg, 1998, p. 173.
  2. a b Carsten Busch: Metaphors in Computer Science. Modeling - formalization - application. Springer Fachmedien: Wiesbaden, 1998, p. 171.
  3. Axel Rogat: Structure and mode of operation of compilers , Chapter 1.11: History ; Thomas W. Parsons: Introduction to Compiler Construction. Computer Science Press: New York, 1992, p. 1.
  4. For the translation of the English “compiler” with the German “Übersetzer” (translator) see, among others: Hans-Jürgen Siegert, Uwe Baumgarten: Operating systems. An introduction. 6th edition, Oldenbourg Verlag: München, Wien, 2007, p. 352; Christoph Prevezanos: Computer-Lexikon 2011. Markt + Technik Verlag: München, 2010, p. 940; Christoph Prevenzanos: Technical writing. For computer scientists, academics, technicians and everyday working life. Hanser Verlag: München, 2013, p. 130.
  5. ^ For example, Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman: Compiler. Principles, techniques and tools. 2nd edition, Pearson Studium: Munich, 2008.
  6. See also Hans-Jochen Schneider (Hrsg.): Lexicon Informatik und Datenverarbeitung. 4th edition, Oldenbourg Verlag: München, Berlin, 1998: Article “Compiler”, p. 158, and Article “Translators”, p. 900.
  7. Hartmut Ernst, Jochen Schmidt; Gert Beneken: Basic course in computer science. Basics and concepts for successful IT practice. A comprehensive, practice-oriented introduction. 5th edition, Springer: Wiesbaden, 2015, p. 409.
  8. ^ Hans Dieter Hellige: Stories of Computer Science. Visions, paradigms, leitmotifs. Springer, Berlin 2004, ISBN 3-540-00217-0 , pp. 45, 104, 105.
  9. Evelyn Boesch Trüeb: Heinz Rutishauser. In: Historical Lexicon of Switzerland . July 12, 2010 , accessed October 21, 2014 .
  10. Stefan Betschon: The magic of the beginning. Swiss computer pioneers. In: Franz Betschon , Stefan Betschon, Jürg Lindecker, Willy Schlachter (eds.): Engineers build Switzerland. First-hand history of technology. Verlag Neue Zürcher Zeitung, Zurich 2013, ISBN 978-3-03823-791-4 , pp. 381–383.
  11. ^ Friedrich L. Bauer: My years with Rutishauser.
  12. Stefan Betschon: The story of the future. In: Neue Zürcher Zeitung, December 6, 2016, p. 11
  13. ^ Inventor of the Week Archive. Massachusetts Institute of Technology , June 2006, accessed September 25, 2011 .
  14. ^ Kurt W. Beyer: Grace Hopper and the invention of the information age . Massachusetts Institute of Technology, 2009, ISBN 978-0-262-01310-9 ( Google Books [accessed September 25, 2011]).
  15. Kathleen Broome Williams: Grace Hopper . Naval Institute Press, 2004, ISBN 1-55750-952-2 ( Google Books [accessed September 25, 2011]).
  16. ^ FL Bauer, J. Eickel: Compiler Construction: An Advanced Course . Springer, 1975.
  17. transcompiler. In: Neogrid IT Lexicon. Accessed on November 18, 2011 : "If a compiler generates the source code of another from the source code of one programming language (e.g. C in C ++) it is called a transcompiler."
  18. Transpiler. bullhost.de, accessed on November 18, 2012 .