Software pipelining

from Wikipedia, the free encyclopedia

Software pipelining is a technique for programming a processor with multiple execution units so that as many of them as possible are busy at the same time. Its purpose is to shorten the time a calculation takes by exploiting the parallelism available within the processor. The resulting assembly-line-like sequences of instructions are called "pipelines".

Background

Software pipelining is used to process instructions from a single thread in parallel (instruction-level parallelism). Unlike instruction-level parallelism in general, it applies when the same calculation is carried out on a vector of input data (in that respect it resembles SIMD) and particular attention is paid to the order of the instructions in the instruction stream.

A pipeline within a processor splits the processing of a single machine instruction into steps, so that several instructions (in different stages of completion) can be processed at the same time. A software pipeline, by contrast, spans several machine instructions that together perform a calculation on a set of input data. Software pipelining is therefore controlled explicitly by the programmer and is not a feature of the processor itself; rather, it exploits superscalarity and the pipelined architecture, which together allow instructions to execute in parallel. The processor pipeline, in contrast, cannot be manipulated by the programmer.

Example

The calculation y = (x + 3) * 2 is to be carried out on a vector: each value x(i) is increased by 3 and then doubled. If the processor has two execution units for arithmetic instructions, the work can be assigned as follows:


Cycle  i      Storage unit A  Execution unit A         Execution unit B         Storage unit B
1      0      r(0) = x(0)
2      1      r(1) = x(1)     r(0) = r(0) + 3
3      2      r(2) = x(2)     r(1) = r(1) + 3          r(0) = r(0) * 2
4      3      r(3) = x(3)     r(2) = r(2) + 3          r(1) = r(1) * 2          y(0) = r(0)
...
j + 1  j      r(j) = x(j)     r(j - 1) = r(j - 1) + 3  r(j - 2) = r(j - 2) * 2  y(j - 3) = r(j - 3)
...
n + 3  n + 2                                                                    y(n - 1) = r(n - 1)

Legend:

  • i is the current index
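  • Storage unit A loads the input values, execution unit A performs the addition, execution unit B performs the multiplication, and storage unit B writes the results back.
  • r(i) is a register that holds the intermediate result of the calculation for element i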
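
The schedule in the table can be sketched in C as a loop with a prologue that fills the pipeline, a kernel that represents the steady state (the row for cycle j + 1), and an epilogue that drains it. This is only an illustrative sketch: the function names scale_naive and scale_pipelined and the variables loaded, added and multiplied are hypothetical and stand in for the registers r(j - 1), r(j - 2) and r(j - 3). Whether the four kernel statements really execute in the same cycle depends on the compiler and the processor.

```c
#include <stddef.h>

/* Straightforward version: each element passes through all four steps
   (load, add, multiply, store) before the next element is started. */
void scale_naive(const int *x, int *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int r = x[i];   /* load     */
        r = r + 3;      /* add      */
        r = r * 2;      /* multiply */
        y[i] = r;       /* store    */
    }
}

/* Software-pipelined version (hypothetical sketch). */
void scale_pipelined(const int *x, int *y, size_t n)
{
    if (n < 3) {                    /* too short to fill the pipeline */
        scale_naive(x, y, n);
        return;
    }

    int loaded, added, multiplied;  /* one variable per pipeline stage */

    /* Prologue: cycles 1 to 3 of the table. */
    loaded = x[0];
    added = loaded + 3;      loaded = x[1];
    multiplied = added * 2;  added = loaded + 3;  loaded = x[2];

    /* Kernel: cycle j + 1 of the table. Each statement only reads a value
       produced in the previous iteration, so there are no true data
       dependencies between the four statements of one iteration. */
    for (size_t j = 3; j < n; j++) {
        y[j - 3] = multiplied;      /* store element j - 3    */
        multiplied = added * 2;     /* multiply element j - 2 */
        added = loaded + 3;         /* add for element j - 1  */
        loaded = x[j];              /* load element j         */
    }

    /* Epilogue: cycles n + 1 to n + 3, draining the pipeline. */
    y[n - 3] = multiplied;  multiplied = added * 2;  added = loaded + 3;
    y[n - 2] = multiplied;  multiplied = added * 2;
    y[n - 1] = multiplied;
}
```

Both functions produce the same output; the pipelined form merely arranges the instruction stream so that independent operations on different elements sit next to each other.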

Software pipelining requires that the processor can decode and execute more than one instruction at the same time.

The term pipelining comes from dividing an operation into individual steps, or stages, that are carried out one after the other, as on an assembly line. Since each value occupies only one stage of the pipeline in any given cycle, several data items (in different stages of completion) can be processed at the same time: once one operation has moved on to the second stage, the next operation can already start in the first stage.

In general, software pipelining is possible on all superscalar processors, often with the help of loop unrolling and register renaming performed by the compiler. The IA-64 architecture has dedicated support for software pipelining: loop unrolling is not necessary, and register renaming is carried out by the processor at run time through its rotating registers.
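
The benefit of the assembly-line arrangement can be estimated with a simple cycle count. The following sketch assumes an idealized pipeline in which every stage takes exactly one cycle and nothing ever stalls; the function names cycles_sequential and cycles_pipelined are hypothetical. For the four stages of the example (load, add, multiply, store), the pipelined count is n + 3, which matches the last row of the table.

```c
#include <stddef.h>

/* Cycles needed if each element completes all stages before the next
   element is started. Assumes one cycle per stage. */
size_t cycles_sequential(size_t n, size_t stages)
{
    return n * stages;
}

/* Cycles needed with an ideal pipeline (assumes stages >= 1): the first
   element needs `stages` cycles, and every further element finishes one
   cycle after its predecessor. */
size_t cycles_pipelined(size_t n, size_t stages)
{
    return n == 0 ? 0 : n + stages - 1;
}
```

Under these assumptions, 100 elements need 400 cycles sequentially but only 103 cycles when pipelined.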

References

  1. Markus Pister: Generic software pipelining at the assembly level. Chapter 5: Software pipelining, p. 48.