Register renaming

from Wikipedia, the free encyclopedia

Register renaming refers to a phase of instruction decoding that is used in most superscalar microprocessors. It helps to avoid unnecessary serialization and improves out-of-order execution, i.e. the ability to run small parts of the program concurrently.

Motivation

Superscalar processors have a relatively small set of directly addressable architectural registers that are specified by the instruction set. New registers cannot simply be added to the instruction set with every processor generation, as this would break binary compatibility. Because so few registers are available and must be reused, dependencies between program parts can arise in the program sequence that prevent out-of-order or concurrent execution. These are not true data dependencies, however, but name dependencies, in this context also called data conflicts.

In the x86 family, register renaming was first used in the Pentium Pro, in which the eight registers of the instruction encoding that are visible to the programmer were mapped onto 96 physically available registers that are invisible to the programmer and cannot be accessed directly.

The Write After Read (WAR) and Write After Write (WAW) dependencies can be resolved by renaming registers. For this purpose, the processor provides a large number of shadow registers that cannot be used directly by the program. Usually in the decode stage (ID) of the processor, every time a register is defined, i.e. for every instruction that produces a result and stores it in a register, the destination register is renamed to a shadow register, hence the name. After that, the code contains only true data dependencies, and the independent parts can be executed in parallel or in a different order.
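
The renaming step described above can be sketched in a few lines of Python. This is a simplified illustration, not the mechanism of any particular processor: the function name `rename`, the physical register names `P0`, `P1`, ... and the tuple format `(dest, src1, src2)` are assumptions made for this example, and shadow registers are never returned to the free list (real hardware reclaims them once the old value is no longer needed).

```python
from collections import deque

def rename(instructions, arch_regs, num_physical):
    """Rename destination registers of (dest, src1, src2) instructions.

    Hypothetical sketch: a mapping table plus a free list of shadow registers.
    """
    # Initially, each architectural register maps to its own physical register.
    mapping = {reg: f"P{i}" for i, reg in enumerate(arch_regs)}
    # All remaining physical (shadow) registers are free.
    free = deque(f"P{i}" for i in range(len(arch_regs), num_physical))
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources are looked up first, so true (RAW) dependencies are preserved.
        p1, p2 = mapping[src1], mapping[src2]
        if not free:
            # Assumption: a real processor would stall here instead.
            raise RuntimeError("no free physical register")
        # Every definition receives a fresh shadow register.
        pd = free.popleft()
        mapping[dest] = pd
        renamed.append((pd, p1, p2))
    return renamed

# Usage: nine architectural registers R1..R9, sixteen physical registers.
arch = [f"R{i}" for i in range(1, 10)]
program = [
    ("R1", "R2", "R3"),
    ("R4", "R1", "R5"),
    ("R5", "R6", "R7"),
    ("R1", "R8", "R9"),
]
for instruction in rename(program, arch, 16):
    print(instruction)
```

Because the sources are translated before the destination is remapped, a later instruction that overwrites a register cannot disturb an earlier instruction that still reads the old value.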

Example

The following code can only be processed sequentially in this form, as there are data dependencies.

1:  R1 := R2 / R3
2:  R4 := R1 + R5   # RAW dependency on line 1
3:  R5 := R6 + R7   # WAR dependency on line 2
4:  R1 := R8 + R9   # WAW dependency on line 1

If the destination register of each defining operation is now renamed consistently, the WAR and WAW dependencies disappear:

1:  A := R2 / R3
2:  B := A  + R5   # RAW dependency on line 1
3:  C := R6 + R7
4:  D := R8 + R9

The block consisting of instructions 1 and 2, instruction 3, and instruction 4 can now be executed in any order or in parallel.
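
The dependencies in both versions of the example can be checked mechanically. The following Python sketch is an illustration under simplifying assumptions: the function name `hazards` and the tuple format `(dest, src1, src2)` are chosen for this example only, and the pairwise comparison ignores whether a register is redefined between the two instructions being compared.

```python
def hazards(program):
    """Return (kind, earlier, later) tuples using 1-based instruction numbers."""
    found = []
    for j, (dest_j, *srcs_j) in enumerate(program):
        for i in range(j):
            dest_i, *srcs_i = program[i]
            if dest_i in srcs_j:
                # Later instruction reads what the earlier one wrote.
                found.append(("RAW", i + 1, j + 1))
            if dest_j in srcs_i:
                # Later instruction overwrites what the earlier one reads.
                found.append(("WAR", i + 1, j + 1))
            if dest_j == dest_i:
                # Both instructions write the same register.
                found.append(("WAW", i + 1, j + 1))
    return found

original = [
    ("R1", "R2", "R3"),
    ("R4", "R1", "R5"),
    ("R5", "R6", "R7"),
    ("R1", "R8", "R9"),
]
renamed = [
    ("A", "R2", "R3"),
    ("B", "A", "R5"),
    ("C", "R6", "R7"),
    ("D", "R8", "R9"),
]
print(hazards(original))
print(hazards(renamed))
```

For the original code, the check also reports a WAR dependency between instructions 2 and 4, since instruction 2 reads R1 and instruction 4 overwrites it. For the renamed code, only the RAW dependency between instructions 1 and 2 remains.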

Alternatives

Another approach was taken with Explicitly Parallel Instruction Computing (EPIC) in the Itanium processor, but it turned out to be less successful. Instructions that can be executed in parallel are marked as such in the instruction encoding and combined into so-called instruction groups. A disadvantage of this method is that the target processor has to be known at compile time, and later adaptation is impossible or requires recompilation.

Literature

  • Jean-Loup Baer: Microprocessor Architecture: From Simple Pipelines to Chip Multiprocessors, pp. 89 ff., Cambridge University Press, 2010, ISBN 9780521769921.
  • Christian Müller-Schloer, Wolfgang Karl, Sami Yehia: Architecture of Computing Systems - ARCS 2010: 23rd International Conference, Hannover, Germany, February 22-25, 2010, Proceedings, pp. 127 ff., Springer Science & Business Media, 2010, ISBN 9783642119491.
  • John L. Hennessy, David A. Patterson: Computer Architecture: A Quantitative Approach, pp. 208 ff., Elsevier, 2012, ISBN 9780123838728.