x86 processor

Intel i8086 processor in a DIP-40 package.

Compared to the i8086, the Intel i8088 has only an 8-bit wide data bus; it was used in the IBM PC.

x86 is the short name for a microprocessor architecture and its associated instruction sets, developed by chip manufacturers including Intel and AMD.

The x86 instruction set architecture (ISA) is named after the processors of the 8086/8088 series, with which it was introduced in 1978. The first successor processors were later named 80186, 80286 and so on. In the 1980s the architecture was therefore called 80x86; later the leading "80" was dropped. The x86 architecture has been extended with every processor generation since, and with the 80386 in 1985 it became a 32-bit architecture, which was also explicitly referred to as i386.

During the development of the Itanium, Intel retroactively named the x86 architecture, by then a 32-bit architecture, "Intel Architecture 32-bit", abbreviated IA-32. The retronym IA-16 for the 16-bit architecture of the 8086/80286 is also known but never became widespread. The older names "x86" and "i386" (for 32-bit x86), by contrast, remained in use.

Intel designated the independently developed, incompatible architecture of the Itanium IA-64. This can lead to confusion, because with AMD64, the 64-bit instruction set first available in 2003, AMD also turned the IA-32 instruction set architecture into a 64-bit architecture. Intel itself followed with Intel 64 in 2005; Intel 64 is compatible with AMD64. Modern 64-bit x86 processors can therefore still be described as belonging to the IA-32 architecture, although that term has since become ambiguous. To distinguish between 32 and 64 bits, the designation "x64" (x86 with 64 bits) was introduced for the 64-bit mode, modeled on "x86". The retronym "x32" (x86 with 32 bits) is rarely found and is also ambiguous, since it can mean either a 32-bit x86 processor (mode) or 32-bit addressing on a 64-bit processor running in 64-bit mode.

Since combinations of digits cannot be protected as trademarks, Intel and most of its competitors switched after the introduction of the 80486 to word marks such as Pentium or Celeron (Intel) and Athlon or Phenom (AMD). The old numbering scheme nevertheless survived as the name of the whole family.

History

The x86 architecture was introduced in 1978 with Intel's first 16-bit CPU, the 8086, which was intended to replace the older 8-bit processors 8080 and 8085. Although the 8086 was not particularly successful at first, in 1981 IBM introduced the first PC, which used a cut-down version of the 8086, the 8088, as its CPU. Owing to the enormous success of the IBM PC and its numerous clones, the so-called IBM PC compatibles, the x86 architecture became one of the most successful CPU architectures in the world within a few years and has remained so to this day.

In addition to Intel, other manufacturers have produced x86-compatible CPUs under license over the years, including Cyrix (now VIA Technologies), NEC, UMC, Harris, TI, IBM, IDT and Transmeta. The largest manufacturer of x86-compatible processors after Intel was and is AMD, which has become a driving force alongside Intel in the further development of the x86 standard.

Intel developed the 8086 in 1978 at the end of the 8-bit era. In 1985, Intel introduced the 80386, the first x86 CPU with a 32-bit architecture. Today this architecture is known as IA-32 (as a 32-bit architecture also as "i386"); it extends the instruction sets of the 8086 and 80286 to 32 bits while including them completely. The 32-bit era was the longest and most lucrative period in x86 history to date, with IA-32 under constant development, largely under Intel's leadership.

The 64-bit era began for x86 in 1999, this time on AMD's initiative. The 64-bit x86 standard, named x64 or x86-64, was introduced by AMD in 2003 as AMD64 and adopted by Intel in 2005 under the name Intel 64.

The IA-64 architecture used by Intel and HP in the Itanium product line is unrelated to IA-32, including x64. It is a new development which, apart from an x86 emulation (only in the oldest Itanium series), contains no trace of x86 technology. In contrast, IA-32 with the 64-bit extension x64 remains fully backward compatible with 32-bit and 16-bit x86.

Naming according to the instruction set

Since the instruction set is constantly being extended, speaking of an x86 instruction set architecture can only mean either the minimum required instruction set or the current state with all possible extensions. In this respect the designation "x86" is very ambiguous. A naming convention has developed that is rooted in the historical development.

Year	First designation	Alternative names	Instruction set	Operating modes
1972	IA-8	-	"Intel Architecture 8-bit": unofficial retronym for the 8-bit 8080, the predecessor of the 8086. This instruction set architecture is not compatible with x86.
1978	8086	80x86, x86	Processors and instruction set architectures compatible with the Intel 8086 and 8088.	Real mode
1982	80286	i286	Processors and instruction set architectures compatible with the 80286.	additionally 16-bit protected mode
	IA-16	x86	"Intel Architecture 16-bit": rarely used retronym for Intel's 16-bit x86, i.e. the instruction set of the 8086 (real mode) and the 80286 (16-bit protected mode). The designation x86 always includes the 16-bit real mode and is more common than IA-16.
1985	i386	IA-32	32-bit instruction set extension and addressing introduced with the 80386.	additionally 32-bit protected mode, virtual 8086 mode
1989	i486	-	Processors and instruction set architectures compatible with the 80486, including the math coprocessor (i486DX).	additionally 32-bit protected mode, virtual 8086 mode
	x87	8087, 80x87	The floating-point unit (FPU) as a separate math coprocessor for the 8086/8088 (8087), the 80286 (80287) and the i386 (80387 or i387). Starting with the i486DX, the floating-point unit is part of the processor, with the exception of the i486SX (80487 or i487, the last separate FPU).
1993	i586	-	Processors and instruction set architectures compatible with the Pentium.	like i386 and i486, new (optional) SIMD functions
1995	i686	P6	Processors and instruction set architectures compatible with the Pentium Pro (1995) or Pentium II (1997). The Pentium II already supports the MMX extension, which is why i686 builds often additionally use the vector acceleration instructions when they are available.	like i386 and i486, new (optional) SIMD functions
	IA-32	(x32)	"Intel Architecture 32-bit": retronym for 32-bit x86, i.e. the instruction set of the 80386 (32-bit protected mode). "x32" is a retronym for 32-bit x86 (derived from "x64" for 64-bit x86), but it is not widespread and is also ambiguous, since 32-bit addressing also exists within x64 mode (as in the x32 ABI under Linux).
2003	amd64	x86-64	Processors and instruction set architectures compatible with the AMD64 64-bit instruction set of the Opteron and Athlon 64. These include at least the instruction set extensions MMX, SSE and SSE2 as well as x87 and the NX bit. In the 32-bit mode of long mode, virtual 8086 mode is not available.	Legacy mode: like i386; long mode: 64-bit mode and (32-bit) compatibility mode; SIMD extensions
	x64	x86-64, amd64	x64 was introduced by Microsoft and Sun to distinguish pure 32-bit x86 from 64-bit x86, i.e. IA-32 with AMD64 or Intel 64. The abbreviation "x32" also stands for 32-bit addressing within the 64-bit long mode and is then part of x64 (64-bit x86).

While x86 is the least precise designation for the instruction set architecture, even the more precise designations listed above do not exactly characterize the machine instructions present (or required by software) or the exact instruction set integrated in a processor. Under Linux, for example, the specification "i686-pae" has become established for the Pentium II instruction set with PAE. GParted, for instance, provided 32-bit ISO images for "i486" and for "i686-pae"; if a processor lacked the PAE flag (such as the first Pentium M), one had to fall back on the i486 variant. Under Windows, too, it is not certain that the 64-bit variant actually runs on an older 64-bit x86 processor (with the AMD64 or Intel 64 extension), since from Windows 8.1 onward the CMPXCHG16b, PrefetchW and LAHF/SAHF instructions must be present in addition to the x64 instruction set extension.
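Whether a given processor offers the instructions a particular build expects can be checked from the feature flags reported by the operating system. The following is a minimal Python sketch, assuming the text format of Linux's /proc/cpuinfo and its flag names pae, cx16 (CMPXCHG16b), lahf_lm (LAHF/SAHF in long mode) and 3dnowprefetch (PrefetchW). The helper names and the sample text are hypothetical and serve only as illustration.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Assumed mapping: Linux flag names for the three features named in the text.
WINDOWS_81_X64_FLAGS = {"cx16", "lahf_lm", "3dnowprefetch"}


def missing_flags(cpuinfo_text, required):
    """Return the required flags that the processor does not report, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


# Hypothetical excerpt; on a real Linux system, read the file /proc/cpuinfo.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme pae mmx sse sse2 lm lahf_lm\n"

print(missing_flags(SAMPLE, WINDOWS_81_X64_FLAGS))
```

For the sample above, the processor reports PAE and LAHF/SAHF support but neither CMPXCHG16b nor PrefetchW.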

Design

The x86 architecture uses a CISC instruction set with variable instruction length. Word-sized memory accesses are also permitted to addresses that are not word-aligned. Words are stored in little-endian byte order. Easy portability of Intel 8085 assembly code was a driving force in the architecture's development, which led to some suboptimal and, in retrospect, problematic design decisions.
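Little-endian storage means that the least significant byte of a word sits at the lower address. A minimal Python sketch, using the standard struct module; the function names are hypothetical:

```python
import struct


def word_to_memory(word):
    """Return the two bytes of a 16-bit word in x86 (little-endian) order."""
    return struct.pack("<H", word)


def word_from_memory(data):
    """Read a 16-bit word back from two little-endian bytes."""
    return int.from_bytes(data[:2], byteorder="little")


print(word_to_memory(0x1234).hex())  # low byte 34 comes first, then 12
```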

Today's x86 processors are hybrid CISC/RISC processors: they first translate the x86 instruction set into RISC micro-instructions of constant length, to which modern microarchitectural optimizations can be applied. The micro-instructions are first passed to so-called reservation stations, small buffers placed upstream of the various execution units. The first hybrid x86 processor was the Pentium Pro.

Real mode

The Intel 8086 and 8088 had 14 16-bit registers. Four of them (AX, BX, CX, DX) were general-purpose registers. In addition, each had a special function:

AX (accumulator register) served as the primary target for arithmetic operations
BX (base register) was used to address the start of a data structure
CX (count register) served as a counter for loops (loop instruction) and shift operations
DX (data register) served as the data register for the second operand

Each of these registers could also be accessed as two separate bytes (the high byte of BX under the name BH, the low byte as BL). Of the two pointer registers, SP ("stack pointer") points to the top element of the stack, and BP ("base pointer") can point to another location on the stack or in memory (BP is often used as a pointer to a function's stack frame). The two index registers SI ("source index") and DI ("destination index") can be used for block operations or, together with BX or BP, as an index into an array. In addition, there are four segment registers, CS ("code segment"), DS ("data segment"), SS ("stack segment") and ES ("extra segment"), each of which defines the base address of a 64 kB memory segment. There is also the flag register, which holds flags such as carry, overflow and zero, and the instruction pointer (IP), which points to the current instruction.
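The relationship between a 16-bit register and its two byte halves can be sketched as follows. This is only an illustration in Python; the class Reg16 and its attribute names are hypothetical and not part of any x86 tooling.

```python
class Reg16:
    """A 16-bit register whose high and low bytes can be accessed separately."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    @property
    def high(self):
        # Corresponds to AH, BH, CH or DH.
        return (self.value >> 8) & 0xFF

    @high.setter
    def high(self, byte):
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    @property
    def low(self):
        # Corresponds to AL, BL, CL or DL.
        return self.value & 0xFF

    @low.setter
    def low(self, byte):
        self.value = (self.value & 0xFF00) | (byte & 0xFF)


bx = Reg16(0x1234)
bx.low = 0xFF          # writing BL leaves BH untouched
print(hex(bx.value))
```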

In real mode, memory access is "segmented": the segment address is shifted 4 bits to the left and an offset is added, producing a 20-bit address. The total address space in real mode is 2²⁰ bytes (1 megabyte), which was a lot in 1978. There are two addressing modes, near and far. In far mode, both the segment and the offset are specified. In near mode, only the offset is specified and the segment is taken from a register: DS for data, CS for code and SS for the stack. If, for example, DS is A000h and SI is 5677h, DS:SI points to the absolute address DS × 16 + SI = A5677h.

In this scheme, different segment/offset pairs can point to the same absolute address. If DS is A111h and SI is 4567h, DS:SI also points to the address A5677h mentioned above. The scheme was meant to make Intel 8085 code easier to port, but it ultimately made the programmer's work more difficult.
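The address calculation and the aliasing of segment/offset pairs can be reproduced with a few lines of Python. This is a minimal sketch; the function name is hypothetical, and the wrap-around to 20 bits reflects the 20 address lines of the 8086.

```python
def physical_address(segment, offset):
    """Real-mode address: segment * 16 + offset, limited to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# The two segment/offset pairs from the text map to the same absolute address.
first = physical_address(0xA000, 0x5677)
second = physical_address(0xA111, 0x4567)
print(hex(first), hex(second))
```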

In addition, the i8086 had 64 kB of 8-bit I/O address space (alternatively 32 k 16-bit ports) and a hardware-supported stack, likewise of 64 kB. Only words (2 bytes) can be placed on the stack. The stack grows towards lower addresses, and SS:SP points to the last word placed on the stack (the lowest address). There are 256 interrupts, which can be triggered by both hardware and software. Interrupts can be nested and use the stack to store the return address.
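The stack behaviour described above (word-sized entries, growth towards lower addresses, SS:SP pointing at the most recently pushed word) can be sketched in Python. The class and its names are hypothetical, and the model leaves out everything except push and pop.

```python
class RealModeStack:
    """Simplified model of the 8086 stack inside a 1 MB memory image."""

    def __init__(self, ss, sp):
        self.memory = bytearray(1 << 20)
        self.ss = ss
        self.sp = sp

    def _address(self, offset):
        # Same segment * 16 + offset calculation as for any real-mode access.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, word):
        # SP is decremented first, then the word is stored little-endian.
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self._address(self.sp)] = word & 0xFF
        self.memory[self._address(self.sp + 1)] = (word >> 8) & 0xFF

    def pop(self):
        low = self.memory[self._address(self.sp)]
        high = self.memory[self._address(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


stack = RealModeStack(ss=0x2000, sp=0x0100)
stack.push(0x1234)
print(hex(stack.sp))  # SP has moved down by two bytes
```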

Protected and enhanced mode

The Intel 80286 processor introduced another operating mode, "protected mode". By integrating an MMU (memory management unit) on the chip, up to 16 MB of memory could be addressed in protected mode. A special MMU register points to a segment table in main memory in which the 24-bit base addresses of the segments are specified. The segment registers then serve only as an index into this segment table. In addition, each segment could be assigned one of four privilege levels (called "rings"). Overall, these innovations were an improvement. However, software for protected mode was incompatible with the real mode of the 8086 processor.
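How a segment register value acts as an index into the segment table can be sketched as follows. The sketch assumes the usual selector layout (bits 15 to 3: table index, bit 2: table indicator, bits 1 to 0: requested privilege level) and replaces the in-memory table with a plain Python dictionary of base addresses; limit and privilege checks are left out, and all names are hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit protected-mode selector into its three fields."""
    return {
        "index": (selector >> 3) & 0x1FFF,
        "table": "LDT" if selector & 0b100 else "GDT",
        "rpl": selector & 0b11,
    }


def linear_address_286(selector, offset, segment_bases):
    """Add a 16-bit offset to the 24-bit base found via the selector index.

    segment_bases is a stand-in for the segment table: index -> base address.
    """
    base = segment_bases[decode_selector(selector)["index"]] & 0xFFFFFF
    return (base + (offset & 0xFFFF)) & 0xFFFFFF


bases = {2: 0x123400}  # hypothetical table with a single entry
print(hex(linear_address_286(0x0010, 0x1234, bases)))
```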

The Intel 80386 probably brought the biggest leap for the x86 architecture. With the exception of the Intel i386SX, which supported only 24-bit addressing and had a 16-bit data bus, all i386 processors were fully 32-bit capable: registers, instructions, I/O space and memory. Up to 4 GB of memory could be addressed. For this purpose, protected mode was extended to a "32-bit enhanced mode". As on the 80286, the segment registers in enhanced mode were used as an index into a segment table that described the division of memory. However, 32-bit offsets could now be used within each segment. This led to the so-called "flat memory model", in which each process is given only one data segment and one code segment; both start at address 0 and are 4 GB in size. The actual memory management is then handled solely by paging, which was also introduced with the 80386. This mechanism divides the entire memory into equally sized parts (pages) and allows an arbitrary mapping between logical and physical addresses for each process, which greatly simplified the implementation of virtual memory. No new general-purpose registers were added. However, apart from the segment registers, all registers were extended to 32 bits. The extended register AX was henceforth called EAX, and SI became ESI. Two new segment registers named FS and GS were also added.
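The mapping from logical to physical addresses by paging can be illustrated with a short Python sketch. It assumes the classic two-level scheme of the 80386 with 4 KB pages, in which a 32-bit address is split into a 10-bit directory index, a 10-bit table index and a 12-bit offset; the dictionary standing in for the page tables and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: classic 4 KB pages


def split_linear_address(address):
    """Split a 32-bit address into directory index, table index and offset."""
    directory_index = (address >> 22) & 0x3FF
    table_index = (address >> 12) & 0x3FF
    offset = address & 0xFFF
    return directory_index, table_index, offset


def translate(address, page_tables):
    """Map an address to a physical one.

    page_tables stands in for the in-memory structures:
    (directory_index, table_index) -> physical page frame number.
    """
    directory_index, table_index, offset = split_linear_address(address)
    frame = page_tables[(directory_index, table_index)]
    return frame * PAGE_SIZE + offset


tables = {(1, 3): 0x80}  # one hypothetical mapping
print(hex(translate(0x00403025, tables)))
```

Because each process can be given its own set of tables, the same logical address may refer to different physical pages in different processes.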

The basic architecture of the i386 processor became the basis for all further development of the x86 architecture and was retroactively named IA-32. All later 32-bit x86 processors work on the principle of the Intel 80386.

The previously separate math coprocessor 80387 was integrated directly into the processor starting with the next CPU, the Intel 80486 (with the exception of the 486SX, which has no coprocessor). With this coprocessor, floating-point calculations could be carried out in hardware. Without it, they had to be mapped onto integer calculations (emulation). Not only does each floating-point operation then require a large number of instructions, but loops and branches also occur frequently, so floating-point operations were comparatively very slow without the coprocessor.

Registers

AX / EAX / RAX: accumulator
BX / EBX / RBX: base
CX / ECX / RCX: counter
DX / EDX / RDX: data / general purpose
SI / ESI / RSI: source index (strings)
DI / EDI / RDI: destination index (strings)
SP / ESP / RSP: stack pointer
BP / EBP / RBP: base pointer (start address of the stack frame)
IP / EIP / RIP: instruction pointer

MMX and 3DNow!

In 1996 Intel introduced MMX technology (Matrix Math Extensions, often also called Multi-Media Extensions, especially in marketing). MMX defined 8 new SIMD registers with a width of 64 bits, which, however, occupied the same storage as the registers of the floating-point unit (FPU). This improved compatibility with existing operating systems, which still only had to save the familiar FPU registers when switching between applications. However, switching between MMX and FPU use was cumbersome. In addition, MMX was limited to integer operations and for a long time was not properly supported by compilers. Microsoft in particular found it difficult to equip its in-house compiler with at least support for MMX intrinsics. MMX was therefore used relatively rarely, most often for 2D video editing, image editing, video playback and similar tasks.
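The SIMD principle behind MMX, one instruction operating on several small integers packed into a 64-bit register, can be imitated in Python. The sketch below models a packed byte addition with wrap-around, in the manner of the MMX instruction PADDB; it only illustrates the semantics and says nothing about how the hardware implements it.

```python
def paddb(a, b):
    """Add the eight bytes packed into two 64-bit values lane by lane.

    Each lane wraps around at 8 bits; no carry crosses into the next lane.
    """
    result = 0
    for lane in range(8):
        shift = 8 * lane
        lane_sum = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane_sum << shift
    return result


# The lowest lane overflows (FF + 01) without disturbing its neighbour.
print(f"{paddb(0x01020304050607FF, 0x0101010101010101):016x}")
```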

In 1997, AMD extended the MMX instruction set with operations on single-precision floating-point numbers and called the resulting technology 3DNow!. This did not solve the compiler problems, but unlike MMX, 3DNow! could be used for 3D games, which rely on fast floating-point operations. Game developers and manufacturers of 3D graphics programs used 3DNow! to improve application performance on AMD's K6 and Athlon processors.

Streaming SIMD extensions

Growth in the number of instructions
Instruction set	Number	Total
x86 (base)	80	80
MMX	57	140
SSE	70	200
SSE2	144	350
SSE3	13	360
SSSE3	16	380
SSE4	54	430
SSE5	47	480

In 1999 Intel introduced the SSE instruction set with the Pentium III processor. Like AMD, Intel mainly added floating-point SIMD instructions. In addition, a separate functional unit with 8 new 128-bit registers (XMM0 to XMM7) was created on the processor for SSE; these registers no longer overlap with the floating-point registers. However, since the operating system must also save these new registers on a context switch, a lock was implemented in the CPU that SSE-capable operating systems must first release before the SSE registers become available to application programs.

AMD processors initially supported only the 64-bit instructions of the extension, which run in the MMX functional unit, since the separate functional unit was missing entirely. Most of these instructions operate only on integer data, which is why the designation ISSE exists, with the I standing for integer. SSE has been fully supported since the Athlon XP processor.

SSE2, introduced by Intel in 2001 with the Pentium 4, added first more integer instructions for the SSE registers and second SIMD instructions for 64-bit floating-point numbers. The former made MMX almost obsolete, and the latter allowed conventional compilers to use SIMD instructions as well. With the introduction of the 64-bit extension, AMD therefore made SSE2 an integral part of the AMD64 architecture, so that all 64-bit x86 processors support this extension (AMD processors from the Athlon 64 onward).

With the Prescott revision of the Pentium 4, Intel has shipped SSE3 since 2004; it mainly provides memory and thread management instructions to increase the performance of Intel's Hyper-Threading technology.

AMD processors have also supported the SSE3 instruction set since the Athlon 64 with the Venice and San Diego cores.

See also: SSSE3, SSE4, SSE4a and SSE5

64 bit

Around 2002, the memory fitted to modern x86 computers reached the 4 GB addressing limit that the x86 instruction set architecture imposes through its 32-bit address size. With PAE, Intel had already introduced a way of addressing more than 4 GB of RAM with the Pentium Pro, but using it was technically complex and the memory usable per process remained limited to 4 GB.
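The limits mentioned here and elsewhere in the article follow directly from the address width: n address bits allow 2ⁿ bytes to be addressed. A short Python check, assuming the 36-bit physical address width of PAE on the Pentium Pro given in the overview table; the function name is hypothetical.

```python
MIB = 1 << 20
GIB = 1 << 30


def address_space_bytes(address_bits):
    """Number of bytes addressable with the given number of address bits."""
    return 1 << address_bits


print(address_space_bytes(20) // MIB, "MB")  # 8086 real mode
print(address_space_bytes(24) // MIB, "MB")  # 80286 protected mode
print(address_space_bytes(32) // GIB, "GB")  # 80386
print(address_space_bytes(36) // GIB, "GB")  # physical addresses with PAE
```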

Intel originally wanted to make the jump to 64 bits with a new processor architecture called Itanium and therefore called it "Intel Architecture 64-Bit" (IA-64). The Itanium architecture was only able to establish itself as a niche product in the server and workstation segment. AMD, on the other hand, extended the existing 32-bit x86 processor architecture ("Intel Architecture 32-bit", IA-32, or 32-bit x86 from the i386 onward) to 64 bits. AMD called this extension "x86-64" during development and finally AMD64 when it was released in 2003. Intel adopted large parts of this extension under the name Intel 64 (from 2005). 64-bit x86 processors are therefore based on AMD64, and Intel 64 is largely compatible with it. x64 has become established as the general name for it, and in part also the original development name x86-64.

Virtualization

Although virtualizing an x86 processor is complex because of the extensive architecture, several products provide a virtual x86 processor, including VMware, Hyper-V and Virtual PC, as well as open-source software such as Xen and VirtualBox. Hardware-assisted virtualization is also available as an extension; Intel calls it "Intel VT" (Virtualization Technology) and AMD calls it "AMD Virtualization".

AVX - Advanced Vector Extensions

In 2008, Intel proposed "AVX" as a further extension of the SIMD extensions MMX and SSE 1 to 4. AVX was first implemented in 2011 in the Sandy Bridge microarchitecture. Compared to SSE, the width of data and registers was doubled to 256 bits. Many new instructions were added that can be used as 256-bit extensions of the SSE instructions. With a later revision, the Haswell microarchitecture, AVX was extended again with new instructions; this version, called AVX2, offers almost all SSE instructions in a 256-bit form.

Since energy efficiency is becoming increasingly important in high-performance computing and the SIMD concept enables progress here, AVX was completely revised for the Intel Xeon Phi (also in 2013): the data and register width was doubled to 512 bits and the number of registers was doubled to 32. Intel calls this extension AVX-512. It consists of several specified groups of new instructions, not all of which are implemented everywhere. The second Xeon Phi generation ("Knights Corner") received the "Foundation" group; the third generation ("Knights Landing") in 2016 additionally received the "CD", "ER" and "PF" extensions.

AVX-512 has also been announced for the Skylake Xeon server generation (EP/EX), itself announced for 2017.
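The effect of each doubling of the register width is easiest to see in the number of elements that one instruction can process. A minimal Python sketch using the register widths given in this article (64 bits for MMX, 128 for SSE, 256 for AVX, 512 for AVX-512); the names are hypothetical.

```python
REGISTER_BITS = {"MMX": 64, "SSE": 128, "AVX": 256, "AVX-512": 512}


def lanes(extension, element_bits):
    """Number of elements of the given width that fit into one register."""
    return REGISTER_BITS[extension] // element_bits


for name in REGISTER_BITS:
    # 32-bit elements correspond to single-precision floating-point numbers.
    print(name, lanes(name, 32))
```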

Overview of the x86 generations

Intel Pentium 4; early Northwood production

Prominent CPU types (end-user segment)	First introduced	Linear / physical address space	Significant new features
Intel 8086, Intel 8088	1978	16-bit / 20-bit (segmented)	first x86 microprocessor
Intel 80186, Intel 80188, NEC V20 / V30	1982	16-bit / 20-bit (segmented)	faster memory address resolution, MUL / DIV instructions
Intel 80286	1982	16-bit (30-bit virtual) / 24-bit (segmented)	MMU for protected mode and a larger address space
Intel 80386, AMD Am386	1985	32-bit (46-bit virtual) / 32-bit	32-bit instruction set, MMU with paging
Intel 486, AMD Am486	1989		RISC-like pipelining, integrated FPU, on-chip cache
Pentium, Pentium MMX, Rise mP6	1993		superscalar execution, 64-bit wide data bus, faster FPU (pipelined), SIMD for integer data with MMX
Cyrix 6x86, Cyrix MII, Cyrix III Joshua (2000)	1996		register renaming, speculative instruction execution
Pentium Pro, AMD K5, Nx586 (1994)	1995	32-bit / 32-bit physical (36-bit with PAE)	micro-instruction translation, PAE (Pentium Pro), integrated L2 cache (Pentium Pro), conditional move instructions (CMOV etc.)
AMD K6 / K6-2 / K6-III, Pentium II / III, IDT / Centaur C6	1997		L3 cache support, SIMD for floating-point data: AMD 3DNow!, Intel SSE
Athlon, Athlon XP	1999		superscalar FPU, three parallel integer pipelines (up to three x86 instructions per clock)
Pentium 4	2000		long pipelines, optimized for very high clock frequencies, SSE2, Hyper-Threading
Pentium M, VIA C7 (2005), Intel Core (2006)	2003		optimized for low power dissipation
Athlon 64, Opteron	2003	64-bit / 40-bit physical in the first AMD implementations	AMD64, on-die memory controller, HyperTransport
Pentium 4 Prescott 2M / Cedar Mill	2004		very long pipelines, designed for very high clock frequencies, SSE3, 64-bit (only for socket LGA 775)
Intel Core 2	2006		energy-efficient, multicore, medium-length pipeline, designed for lower clock frequencies than the P4, SSE4 (Penryn)
AMD Phenom	2007	64-bit / 48-bit physical in the AMD Phenom	monolithic quad-core, 128-bit FPUs, SSE4a, HyperTransport 3 or QuickPath, integrated memory controller, on-die L3 cache, SMT (only with i7), modular design
Intel Core i3, Intel Core i5, Intel Core i7	2008
Intel Atom			in-order instruction execution, pipelined, very energy-efficient
VIA Nano			out-of-order instruction execution, superscalar, hardware encryption, very energy-efficient, adaptive power management
AMD Bobcat	2011
Intel Sandy Bridge	2011	64-bit	Advanced Vector Extensions, AES-NI (hardware-accelerated encryption), SMT (only with i7), very modular design, CMT (only with Bulldozer-based processors), FMA (only with Bulldozer processors)
AMD Bulldozer	2011	64-bit
AMD Jaguar	2013	64-bit / 40-bit physical	AVX, AES, SSEx, very low power consumption, first HSA features
Intel Haswell	2013	64-bit	AVX2, FMA3, Iris Pro graphics
AMD Steamroller	2014	64-bit	improved CMT, twice as many decoders as Bulldozer
AMD Cliffcoaster	2016	64-bit	significantly better CMT, 1.5 times as many decoders as Steamroller

Manufacturer

x86-compatible processors have been designed and manufactured by many companies, including:

Web links

Large INTEL CPU archive - lots of pictures and information

sandpile.org - Extensive archive for x86-related documentation

cpu-collection.de - Extensive processor collection

Assembler x86 command lists / OpCode and descriptions

i8086.de 8086/88 assembler command reference

The x86 processor turns 30 - as Intel stormed all the summits thanks to IBM

References

Rask Ingemann Lambertsen: Re: New back end ia16: 16-bit Intel x86. In: [email protected] mailing list. August 1, 2007, accessed November 20, 2016: "It is also clear from the search results that outside of Intel, IA16 or IA-16 means the 16-bit x86 family members i8086-i80286 and IA32 or IA-32 means x86 family members starting with the i80386."
Christof Windeck: 64-bit names. In: Heise online. April 28, 2008, accessed November 19, 2016. Quote: "In terms of x86 processors with 64-bit expansion, x86-64, AMD64, EM64T, Intel 64 and x64 mean practically the same thing."
Windows 8 system requirements. Microsoft, accessed November 20, 2016.
Martin Fischer: Without a meltdown gap: Chinese x86 processors KX-5000 presented, attack on AMD's ZEN 2 with KX-7000 planned. In: heise-online.de. January 23, 2018, accessed January 23, 2018.
