Very Long Instruction Word

Very Long Instruction Word (VLIW) denotes a property of the instruction set architecture (ISA) of a family of microprocessors. The aim is to speed up the execution of sequential programs by exploiting instruction-level parallelism. In contrast to superscalar processors, a VLIW processor does not assign instructions to its functional units dynamically at runtime; instead, the compiler groups the instructions that can be executed in parallel. VLIW does not preclude the use of a pipelined architecture.

Implementation

Figure: Parallel execution (EX) of instructions in a VLIW pipeline

During translation, the compiler checks which instructions of a program can be executed in parallel. These parallelizable instructions are collected into groups and encoded in the wide instruction format. The group size depends on the number of execution units that can work in parallel, which in turn depends on the architecture. At runtime, the operations contained in one instruction word, which may include no-operation (NOP) slots as padding, are processed in parallel by the execution units.
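The grouping step described above can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any particular compiler: the names `Op` and `pack_bundles` are hypothetical, every operation is assumed to take one cycle, every register is assumed to be written only once (so only read-after-write dependencies matter), and all execution units are assumed to be interchangeable.

```python
from typing import NamedTuple

NOP = "nop"  # padding for unused slots in an instruction word


class Op(NamedTuple):
    name: str    # mnemonic, used for display only
    dest: str    # register written (assumed to be written only once)
    srcs: tuple  # registers read


def pack_bundles(ops, width):
    """Greedily pack operations into instruction words of `width` slots.

    An operation is placed in the earliest cycle in which all of its
    source registers are available and a slot is still free.
    """
    bundles = []       # bundles[c] holds the op names issued in cycle c
    ready_cycle = {}   # register -> first cycle in which it can be read
    for op in ops:
        # Registers not produced in this block are available from cycle 0.
        cycle = max((ready_cycle.get(r, 0) for r in op.srcs), default=0)
        # Skip cycles whose instruction word is already full.
        while cycle < len(bundles) and len(bundles[cycle]) >= width:
            cycle += 1
        while len(bundles) <= cycle:
            bundles.append([])
        bundles[cycle].append(op.name)
        ready_cycle[op.dest] = cycle + 1  # result is usable one cycle later
    # Fill the remaining slots of every instruction word with NOPs.
    return [b + [NOP] * (width - len(b)) for b in bundles]


program = [
    Op("load a", "a", ("x",)),
    Op("load b", "b", ("y",)),
    Op("add c", "c", ("a", "b")),
    Op("mul d", "d", ("a",)),
    Op("add e", "e", ("c", "d")),
]
for word in pack_bundles(program, width=3):
    print(word)
```

With three slots per word, the five operations fit into three instruction words; the unused slots are padded with NOPs, which is the filling mentioned above.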

Properties

As the name suggests, one of the main features of VLIW is the wide instruction format, which contains several operations at once. In contrast to the superscalar technique, the compiler takes on the task of reordering the instructions and marking those that can be executed in parallel, with the aim of making the best possible use of the parallelism available in an instruction sequence. The additional hardware logic that a superscalar design needs for this purpose is not required, which leaves more space on the CPU for additional functional units.

The instruction-level parallelism that VLIW offers cannot always be fully exploited; for example, data dependencies may allow only one instruction to be executed in a given cycle. In these cases the full width of the instruction word goes unused. Some manufacturers try to reduce this overhead with their own VLIW extensions. Texas Instruments, for example, developed the VelociTI technology, in which instructions belonging to several consecutive cycles can be packed into one instruction word. Bits at the boundaries of the individual instructions indicate whether the following instruction should be executed in the same cycle or in the next one. Intel uses a similar concept in its IA-64 architecture.
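The idea of boundary bits can be illustrated with a small Python sketch. It is a simplified model and not the actual VelociTI or IA-64 encoding: the function name `split_into_cycles` and the representation of an instruction as a `(name, parallel_bit)` pair are assumptions made for this example. A set bit means that the following instruction executes in the same cycle; a cleared bit means that the following instruction starts a new cycle.

```python
def split_into_cycles(instruction_word):
    """Split one packed instruction word into per-cycle groups.

    `instruction_word` is a list of (name, parallel_bit) pairs.
    parallel_bit == 1: the next instruction runs in the same cycle.
    parallel_bit == 0: the next instruction runs in the following cycle.
    """
    cycles, current = [], []
    for name, parallel_bit in instruction_word:
        current.append(name)
        if not parallel_bit:
            # Boundary reached: close the group for this cycle.
            cycles.append(current)
            current = []
    if current:
        # The last bit was set; close the open group at the end of the word.
        cycles.append(current)
    return cycles


word = [("A", 1), ("B", 1), ("C", 0), ("D", 0), ("E", 1), ("F", 0)]
print(split_into_cycles(word))
```

In this model the six instructions occupy a single instruction word but execute over three cycles (A, B and C together, then D alone, then E and F), so no padding slots are needed for the cycle in which only D can run.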

Advantages:

More space for the functional units
Simple control path
Good utilization by compiler techniques such as software pipelining

Disadvantage:

Code cannot necessarily be ported to other processors without major changes
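Software pipelining, named among the advantages above, overlaps successive loop iterations so that one instruction word holds operations from different iterations. The following Python sketch only prints such a schedule; the function name `software_pipeline` is hypothetical, and it assumes that each loop stage takes one cycle, that a new iteration can start every cycle, and that iterations do not depend on one another.

```python
def software_pipeline(stages, iterations):
    """Return a cycle-by-cycle schedule for a software-pipelined loop.

    Stage number s of iteration i is issued in cycle i + s, so a new
    iteration starts every cycle while older ones are still in flight.
    """
    n_cycles = iterations + len(stages) - 1
    schedule = []
    for cycle in range(n_cycles):
        word = []
        for s, stage in enumerate(stages):
            i = cycle - s  # iteration whose stage s falls into this cycle
            if 0 <= i < iterations:
                word.append(f"{stage}[{i}]")
        schedule.append(word)
    return schedule


for cycle, word in enumerate(software_pipeline(["load", "mul", "store"], 4)):
    print(cycle, word)
```

The first and last two cycles form the prologue and epilogue, in which the instruction word is only partly filled. In the cycles in between, all three slots are used, which is the good utilization the list refers to.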

Examples

The VLIW architecture was first implemented in 1978 by Boris Babayan in the ELBRUS-1, a Russian superscalar computer. In 1999, with the international announcement of the Russian Elbrus 2000 microprocessor, the architecture was, according to that announcement, carried over to microprocessors for the first time.

Other pioneers were Cydrome (Bob Rau) in the 1980s, Multiflow (Josh Fisher), Culler-Harrison (Glen Culler) in the 1970s, and Norbert Fristacky in Czechoslovakia.

The VLIW architecture is used in the Crusoe and Efficeon CPUs from Transmeta. The (not mass-marketed) processors from Tilera Technologies, a joint venture of Intel among others that specializes in massively multi-core SMP processors, are also based on the VLIW architecture.

A modern, modified implementation of the VLIW architecture is Intel's Itanium CPU, where the approach is called EPIC (Explicitly Parallel Instruction Computing).

AMD uses VLIW technology in its graphics processors of the R600 to RV870 series to execute up to five instructions in parallel on one VLIW shader. The development of the R600 architecture, however, dates back to the time when ATI Technologies was still an independent company. Initially the architecture was inferior to Nvidia's in terms of performance, but it allowed AMD to compete successfully with its main competitor using significantly fewer transistors and lower shader clock rates. Nvidia's scalar design relies on high utilization and needs not only more transistors for comparable performance but also a much higher clock rate, which ultimately puts it at a considerable disadvantage to the VLIW architecture in terms of energy efficiency.

Literature

Binu Mathew: Very Large Instruction Word Architectures. In: Vojin G. Oklobdzija (Ed.): The Computer Engineering Handbook. CRC Press, Boca Raton 2001, ISBN 9780849308857.

