SECD machine

from Wikipedia, the free encyclopedia

The SECD machine is a virtual machine that is intended as the target architecture for compilers of functional programming languages . The letters stand for S tock, e nvironment, C ontrol, D ump, which are the internal registers of the machine. These registers contain pointers to lists in memory .

The SECD machine was the first virtual machine specifically designed to evaluate expressions of the lambda calculus . It was originally described by Peter J. Landin in 1963 as part of the definition of his ISWIM programming language . The description published by Landin was quite abstract and left many implementation decisions open, such as the question of operational semantics . This is why the SECD machine is often presented in more detail, such as in Peter Henderson's Lispkit LISP compiler, which has been distributed since 1980. It is also used as the target architecture for various other experimental compilers.

In 1989, researchers at the University of Calgary were working on a hardware implementation of the machine.

Register and memory

The SECD machine is a stacking machine whose instructions take their main arguments from a stack. Additional parameters for an instruction can be specified after the instruction name. Like all other internal data structures, the stack is represented by a list, with the S register pointing to the beginning of the list. This list is stored on a heap , so the stack does not necessarily have to be contained in a contiguous block of memory. As long as there are still free memory cells, stack space is still available. When all memory cells have been used, garbage collection may reclaim memory.

At the beginning, register C points to the first instruction from the program to be evaluated, which is displayed as a list of instructions. After each evaluation of an instruction, the C register points to the next instruction in the list and thus behaves similarly to an instruction counter with the exception that subsequent instructions do not have to be contained in immediately following memory locations.

Register E contains the current variable environment, a pointer to a list of lists. Each individual list contains the variable bindings of a certain abstraction level: the first list contains the parameters of the current function; Variables in the current function freely are, but they are bound in a surrounding function can be found in the following lists.

The "dump", to the beginning of which the register D points, is used as temporary memory for the contents of other registers that have to be saved in connection with function calls. In this respect, it plays roughly the role of the call stack on other machines.

The memory organization of the SECD machine is similar to the model used by the interpreters of most functional programming languages: memory is a set of memory cells, each either an atom , i.e. a primitive value, or an empty or non-empty one List may contain. Non-empty lists are represented by a pair of pointers, traditionally denoted by car and cdr . More modern developments often use the terms head and tail instead . The various types of values that can be contained in a cell, by a type of mark ( type tag ) distinguished. Often the different types of atoms (numbers, strings, etc.) are marked differently.

For example, a list that contains the numbers 1 , 2, and 3 and is usually written in the form "(1 2 3)" could look like this:

Adresse   Tag       Inhalt
----------------------------
      9 [ integer |     2 ]
      8 [ integer |     3 ]
      7 [ list    | 8 | 0 ]
      6 [ list    | 9 | 7 ]
      ...
      2 [ list    | 1 | 6 ]
      1 [ integer |     1 ]
      0 [ nil             ]

The memory cells 3 to 5 have been omitted here because they do not belong to our example list, which can be distributed over the memory as desired. Cell 2 contains the head of the list. It points to cell 1, in which the first element of the list can be found, and to cell 6, in which the remaining list can be found. The remaining list consists of a reference to cell 9, which contains the second element (“2”) of the list and cell 7, which in turn describes a remaining list. The remaining list now consists of a reference to cell 8 with the third element ("3") and a reference to an empty list ( nil ) as a conclusion. In the SECD machine, the cell with the address 0 always implicitly designates the empty list, so that a special marking for the empty list is superfluous.

The principle that the second pointer ("cdr") in a list cell must always point to another list is pure convention. If both “car” and “cdr” point to atoms, this can also be represented as a pair, which is usually noted as “(1. 2)”.

Instructions

The main instructions of the SECD machine are as follows:

  • nil writes a nil pointer on the stack
  • ldc c writes a constant argument c on the stack
  • ld v writes the value of a variable v on the stack. Variables are represented by pairs (1. p) , where l denotes the abstraction level and p the position within the parameters of this level. As an example, "(1. 3)" denotes the third parameter of the current function (abstraction level = 1).
  • sel l1, l2 removes a value from the stack and makes a case distinction based on the value found: If the top of the stack was different from nil , the instruction sequence pointed to by l1 is carried out, otherwise that pointed to by l2 . The register C is replaced by l1 or l2 ; in any case, a pointer to the instruction following the sel is saved on the dump.
  • join from a list pointer from dump and turns it into the new value of the register C . This instruction occurs at the end of the two alternatives of a sel instruction.
  • ldf f expects a list argument that represents the code of a function. It is a Closure prepared: a pair consisting of the code of the function and the current Environment E . This closure is stored on the stack.
  • ap removes a closure and a list of parameter values ​​from the stack. The closure is applied to the parameters by making its own environment the current one, writing the parameter list in front of it, emptying the stack and setting the C register to the function pointer from the closure. The previous values ​​of S and E and the continuation address from C are saved on the dump.
  • ret removes a return code from the stack, restores registers S , E, and C from the dump, removes the topmost item from the dump, and pushes the previously removed return code onto the restored stack.
  • dum puts a "dummy", an empty list, at the top of the environment list.
  • rap works like ap , except that it replaces an occurrence of a dummy environment with the current environment. In this way recursive functions are possible.

There are also a number of instructions for primitive functions such as “car”, “cdr”, list construction, addition, input / output and others. Any necessary arguments take these functions from the stack.

literature

Web links

Individual evidence

  1. A report on the design is available at SECD: DESIGN ISSUES .