x86 processor

Intel i8086 processor in a DIP-40 package.
Compared to the i8086, the Intel i8088 has only an 8-bit-wide data bus; it was used in the IBM PC.

x86 is the abbreviation for a microprocessor architecture and the associated instruction sets, which were developed by, among others, the chip manufacturers Intel and AMD.

The x86 instruction set architecture (ISA) is named after the processors of the 8086/8088 series, with which it was introduced in 1978. The first successor processors were later named 80186, 80286 and so on. In the 1980s the architecture was therefore referred to as 80x86; later the leading "80" was dropped. The x86 architecture has been extended with every processor generation and was already a 32-bit architecture with the 80386 in 1985, which was also explicitly referred to as i386.

During the development of the Itanium, Intel retroactively named the x86 architecture, which was then a 32-bit architecture, "Intel Architecture 32-bit", abbreviated IA-32. The retronym IA-16 for the 16-bit architecture of the 8086/80286 is also known but was never widely used. The old names "x86" and "i386" (for 32-bit x86), by contrast, remained in use.

Intel designated the architecture of the Itanium, which was developed independently and is incompatible with x86, as IA-64. This can lead to confusion, because AMD, with the AMD64 instruction set first available in 2003, extended the IA-32 instruction set architecture to 64 bits as well. Intel followed suit with Intel 64 in 2005; Intel 64 is compatible with AMD64. Modern 64-bit x86 processors can therefore still be described as belonging to the IA-32 architecture, although the term has since become ambiguous. To distinguish between 32-bit and 64-bit, the designation "x64" (x86 with 64 bits), modeled on "x86", was introduced for the 64-bit mode. The retronym "x32" (x86 with 32 bits) is rarely found and is moreover ambiguous, as it can refer either to a 32-bit x86 processor (or mode) or to 32-bit addressing on a 64-bit processor running in 64-bit mode.

Because combinations of digits cannot be protected under trademark law, Intel and most of its competitors switched to word marks such as Pentium or Celeron (Intel) and Athlon or Phenom (AMD) after the introduction of the 80486, but the old numbering scheme survived as the name of the whole family.

History

The x86 architecture was introduced in 1978 with Intel's first 16-bit CPU, the 8086, which was intended to replace the older 8-bit processors 8080 and 8085. Although the 8086 was not particularly successful at first, in 1981 IBM introduced the first PC, which used a cut-down version of the 8086, the 8088, as its CPU. Owing to the enormous success of the IBM PC and its numerous clones, the so-called IBM PC compatibles, the x86 architecture became one of the most successful CPU architectures in the world within a few years and has remained so to this day.

In addition to Intel, other manufacturers have produced x86-compatible CPUs under license over the years, including Cyrix (now VIA Technologies), NEC, UMC, Harris, TI, IBM, IDT and Transmeta. The largest manufacturer of x86-compatible processors after Intel was and remains AMD, which alongside Intel has become a driving force in the further development of the x86 standard.

Intel developed the 8086 in 1978, at the end of the 8-bit era. In 1985, Intel introduced the 80386, the first x86 CPU with a 32-bit architecture. Today this architecture is known as IA-32 (and, as a 32-bit architecture, also as "i386"); it extends the instruction sets of the 8086 and 80286 to 32 bits while fully including them. The 32-bit era has been the longest and most lucrative period in x86 history to date, with IA-32 under constant development, largely under Intel's leadership.

The 64-bit era began for x86 in 1999, this time on AMD's initiative. The 64-bit x86 standard, known as x64 or x86-64, was introduced by AMD in 2003 as AMD64 and adopted by Intel in 2005 under the name Intel 64.

The IA-64 architecture used by Intel and HP in the Itanium product line is unrelated to IA-32, and thus also to x64. It is a new development that, apart from an x86 emulation (only in the oldest Itanium series), contains no trace of x86 technology. By contrast, IA-32 with the 64-bit extension x64 remains fully backward compatible with 32-bit and 16-bit x86.

Naming according to the instruction set

Because the instruction set is constantly being extended, speaking of an x86 instruction set architecture means one can assume either only the minimum required instruction set or the current state with all possible extensions. In this respect the designation "x86" is very ambiguous. A naming convention has developed that is rooted in the historical development.

Year | First designation | Alternative names | Instruction set | Operating modes
1972 | IA-8 | - | "Intel Architecture 8-bit": unofficial retronym for the 8-bit 8080, the predecessor of the 8086. This instruction set architecture is not compatible with x86. |
1978 | 8086 | 80x86, x86 | Processors and instruction set architectures compatible with the Intel 8086 and 8088. | Real mode
1982 | 80286 | i286 | Processors and instruction set architectures compatible with the 80286. | additionally 16-bit protected mode
 | IA-16 | x86 | "Intel Architecture 16-bit": rarely used retronym by Intel for 16-bit x86, i.e. the instruction set of the 8086 (real mode) and the 80286 (16-bit protected mode). The designation x86 always includes the 16-bit real mode and is more common than IA-16. |
1985 | i386 | IA-32 | 32-bit instruction set extension and addressing introduced with the 80386. | additionally 32-bit protected mode, virtual 8086 mode
1989 | i486 | - | Processors and instruction set architectures compatible with the 80486, including the math coprocessor (i486DX). |
 | x87 | 8087, 80x87 | The floating-point unit (FPU) as a separate math coprocessor for the 8086/8088 (8087), the 80286 (80287) and the i386 (80387 or i387). Starting with the i486DX, the floating-point unit is part of the processor, with the exception of the i486SX (80487 or i487, the last separate FPU). |
1993 | i586 | - | Processors and instruction set architectures compatible with the Pentium. | like i386 and i486; new (optional) SIMD functions
1995 | i686 | P6 | Processors and instruction set architectures compatible with the Pentium Pro (1995) or Pentium II (1997). The Pentium II already supports the MMX extension, which is why i686 often additionally uses the vector extensions when they are available. |
 | IA-32 | (x32) | "Intel Architecture 32-bit": retronym for 32-bit x86, i.e. the instruction set of the 80386 (32-bit protected mode). "x32" is a retronym for 32-bit x86 (derived from "x64" for 64-bit x86), but it is not widespread and is also ambiguous, because 32-bit addressing also exists within x64 mode (as in the x32 ABI under Linux). |
2003 | amd64 | x86-64 | Processors and instruction set architectures compatible with the 64-bit AMD64 instruction set of the Opteron and Athlon 64. These include at least the instruction set extensions MMX, SSE and SSE2 as well as x87 and the NX bit. In the 32-bit mode of long mode, the virtual 8086 mode is not available. | Legacy mode: like i386; long mode: 64-bit mode and (32-bit) compatibility mode; SIMD extensions
 | x64 | x86-64, amd64 | x64 was introduced by Microsoft and Sun to distinguish between pure 32-bit x86 and 64-bit x86, i.e. IA-32 with AMD64 or Intel 64. |

The abbreviation “ x32 ” also stands for 32-bit addressing within the 64-bit long mode and is part of x64 (64-bit x86).

While x86 is the least precise designation for the instruction set architecture, even the more precise designations listed above do not exactly characterize the machine instructions that software requires or the exact instruction set built into a processor. Under Linux, for example, the designation "i686-pae" has become established for the Pentium II instruction set with PAE. GParted, for example, provided 32-bit ISO images for "i486" and for "i686-pae"; if a processor lacked the PAE flag (such as the first Pentium M), one had to fall back on the i486 variant. Under Windows, too, it is not certain that the 64-bit variant actually runs on an older 64-bit x86 processor (with the AMD64 or Intel 64 extension), because from Windows 8.1 onward the instructions CMPXCHG16b, PrefetchW and LAHF/SAHF must be present in addition to the x64 instruction set extension.
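The kind of check described above can be sketched as a comparison between the flags a processor reports and the flags a software build requires. This is only an illustration: the function name and the sample flag strings are hypothetical, and the flag spellings (lm, cx16, lahf_lm, 3dnowprefetch) are assumed to match those Linux prints in the "flags" line of /proc/cpuinfo.

```python
# Assumed Linux flag names for long mode, CMPXCHG16b, LAHF/SAHF in
# long mode and PrefetchW; these spellings are an assumption.
REQUIRED_FLAGS = {"lm", "cx16", "lahf_lm", "3dnowprefetch"}


def missing_flags(flags_line, required=REQUIRED_FLAGS):
    """Return the required flags that do not appear in flags_line.

    flags_line is a whitespace-separated list of CPU feature flags.
    """
    present = set(flags_line.split())
    return set(required) - present


# Hypothetical flag lists for an older and a newer 64-bit processor.
older_cpu = "fpu pae mmx sse sse2 lm"
newer_cpu = "fpu pae mmx sse sse2 lm cx16 lahf_lm 3dnowprefetch"

print(sorted(missing_flags(older_cpu)))
print(sorted(missing_flags(newer_cpu)))
```

An empty result means that every required feature is reported; anything else names the features that would force a fallback to a less demanding build.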

Design

The x86 architecture uses a CISC instruction set with variable instruction length. Word-sized memory accesses are also permitted at memory addresses that are not word-aligned. Words are stored in little-endian byte order. Easy porting of Intel 8085 assembly code was a driving force in the development of the architecture. This led to some suboptimal and, in retrospect, problematic design decisions.

Today's x86 processors are hybrid CISC/RISC processors, because they first translate the x86 instruction set into RISC-like micro-instructions of constant length, to which modern microarchitectural optimizations can be applied. The micro-instructions are first passed to so-called reservation stations, small buffers placed in front of the various arithmetic units. The first hybrid x86 processor was the Pentium Pro.

Real mode

The Intel 8086 and 8088 had fourteen 16-bit registers. Four of them (AX, BX, CX, DX) were general-purpose registers. In addition, each had a special function:

  • AX (accumulator register) served as the primary destination for arithmetic operations
  • BX (base register) was used to address the start of a data structure
  • CX (count register) served as a counter for loops (loop instruction) and shift operations
  • DX (data register) served as the data register for the second operand.

Each of these registers could also be accessed as two separate bytes (the high byte of BX under the name BH, the low byte as BL). Of the two pointer registers, SP ("Stack Pointer") points to the top element of the stack, and BP ("Base Pointer") can point to another location in the stack or memory (BP is often used as a pointer to a function frame). The two index registers SI ("Source Index") and DI ("Destination Index") can be used for block operations or, together with BX or BP, as an index into an array. In addition, there are four segment registers, CS ("Code Segment"), DS ("Data Segment"), SS ("Stack Segment") and ES ("Extra Segment"), each of which defines the base address of a 64 kB memory segment. There is also the flag register, which holds flags such as carry, overflow and zero, and the instruction pointer (IP), which points to the current instruction.
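The byte-wise access to the general-purpose registers can be modeled as follows. This is a minimal sketch with hypothetical helper names, treating a register simply as a 16-bit integer.

```python
def split_register(value):
    """Split a 16-bit register value into (high byte, low byte)."""
    value &= 0xFFFF                      # keep only 16 bits
    return (value >> 8) & 0xFF, value & 0xFF


def join_register(high, low):
    """Combine a high byte and a low byte into a 16-bit register value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


bx = 0x12AB
bh, bl = split_register(bx)
print(hex(bh), hex(bl))                  # high byte (BH), low byte (BL)

# Writing only the low byte leaves the high byte untouched.
bx = join_register(bh, 0xFF)
print(hex(bx))
```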

In real mode, memory access is "segmented". The segment address is shifted 4 bits to the left and an offset is added, producing a 20-bit address. The total address space in real mode is 2^20 bytes (1 megabyte), which was a lot in 1978. There are two addressing modes, near and far. In far mode, both the segment and the offset are specified. In near mode, only the offset is specified and the segment is taken from a register: DS for data, CS for code and SS for the stack. If, for example, DS is A000h and SI is 5677h, DS:SI points to the absolute address DS × 16 + SI = A5677h.

In this scheme, different segment/offset pairs can point to the same absolute address. If DS is A111h and SI is 4567h, DS:SI also points to the address A5677h mentioned above. The scheme was meant to make Intel 8085 code easier to port, but it ultimately made the programmer's work more difficult.
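The address calculation and the aliasing of segment/offset pairs can be reproduced with a few lines of code. The function name is hypothetical, and the masking to 20 bits models the assumption that addresses beyond 1 megabyte wrap around, as on a processor with only 20 address lines.

```python
def physical_address(segment, offset):
    """Return the 20-bit real-mode address for a segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment 4 bits to the left (multiply by 16), add the
    # offset, and keep the lower 20 bits of the result.
    return ((segment << 4) + offset) & 0xFFFFF


# Both pairs from the text resolve to the same absolute address.
print(hex(physical_address(0xA000, 0x5677)))  # 0xa5677
print(hex(physical_address(0xA111, 0x4567)))  # 0xa5677
```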

In addition, the i8086 had 64 kB of 8-bit I/O address space (alternatively 32 k 16-bit ports) and a hardware-supported stack, likewise of 64 kB. Only words (2 bytes) can be placed on the stack. The stack grows toward lower addresses, and SS:SP points to the last word placed on the stack (the lowest address). There are 256 interrupts, which can be triggered by both hardware and software. Interrupts can be nested and use the stack to store the return address.
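The stack behavior described here (word-sized entries, growth toward lower addresses, SP pointing at the last word pushed) can be illustrated with a toy model. The class and its names are hypothetical, and a single 64 kB byte array stands in for the stack segment.

```python
class RealModeStack:
    """Toy model of a word-oriented stack inside one 64 kB segment."""

    def __init__(self, sp=0xFFFE):
        self.memory = bytearray(0x10000)   # the 64 kB stack segment
        self.sp = sp                       # offset of the last pushed word

    def push(self, word):
        # The stack grows downward: decrement SP by one word first.
        self.sp = (self.sp - 2) & 0xFFFF
        # Little-endian: the low byte goes to the lower address.
        self.memory[self.sp] = word & 0xFF
        self.memory[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        low = self.memory[self.sp]
        high = self.memory[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


stack = RealModeStack(sp=0x0100)
stack.push(0x1234)
stack.push(0xABCD)
print(hex(stack.sp))     # SP after two pushes
print(hex(stack.pop()))  # the word pushed last comes off first
```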

Protected and enhanced mode

The Intel 80286 processor added another operating mode, "protected mode". By integrating an MMU (memory management unit) on the chip, up to 16 MB of memory could be addressed in protected mode. A special MMU register points to a segment table in main memory that holds the 24-bit base addresses of the segments. The segment registers then served only as indexes into this segment table. In addition, each segment could be assigned one of four privilege levels (called "rings"). Overall, these innovations were an improvement. However, software for protected mode was incompatible with the real mode of the 8086 processor.
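The lookup through the segment table can be sketched in a strongly simplified form. The table contents and all names below are hypothetical; the sketch assumes that the table index sits in the upper 13 bits of the segment register value and it leaves out the limit and access checks a real processor performs.

```python
# Hypothetical segment table: each entry has a 24-bit base and a ring.
SEGMENT_TABLE = [
    {"base": 0x000000, "ring": 0},
    {"base": 0x012000, "ring": 0},
    {"base": 0xFE0000, "ring": 3},
]


def linear_address(selector, offset, table=SEGMENT_TABLE):
    """Translate selector:offset into a 24-bit address via the table."""
    # Assumption: bits 3..15 of the selector hold the table index; the
    # low 3 bits carry other information and are ignored here.
    index = selector >> 3
    entry = table[index]
    # Base plus 16-bit offset, limited to the 24-bit (16 MB) space.
    return (entry["base"] + (offset & 0xFFFF)) & 0xFFFFFF


print(hex(linear_address(0x08, 0xFFFF)))
print(hex(linear_address(0x10, 0x0345)))
```

In contrast to real mode, the value in the segment register no longer takes part in the arithmetic itself; it only selects which base address is used.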

The Intel 80386 brought what was probably the biggest leap for the x86 architecture. With the exception of the Intel i386SX, which supported only 24-bit addressing and had a 16-bit data bus, all i386 processors were fully 32-bit capable: registers, instructions, I/O space and memory. Up to 4 GB of memory could be addressed. For this purpose, protected mode was extended to "32-bit enhanced mode". As on the 80286, the segment registers in enhanced mode served as indexes into a segment table that described the division of memory. However, 32-bit offsets could now be used within each segment. This led to the so-called "flat memory model", in which each process is given just one data segment and one code segment, both starting at address 0 and 4 GB in size. The actual memory management is then carried out solely by paging, which was also introduced with the 80386: a mechanism that divides the entire memory into equal-sized parts (pages) and allows an arbitrary mapping between logical and physical addresses for each process, which greatly simplified the implementation of virtual memory. No new general-purpose registers were added. However, all registers apart from the segment registers were extended to 32 bits. The extended AX was henceforth called EAX, SI became ESI, and so on. Two new segment registers named FS and GS were also added.
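The idea of a per-process mapping between logical and physical addresses can be shown with a deliberately simplified model. It assumes 4 kB pages and uses a flat dictionary per process as a hypothetical page table; the multi-level tables of real hardware are not modeled.

```python
PAGE_SIZE = 4096  # assumed page size of 4 kB


def translate(page_table, logical_address):
    """Map a logical address to a physical one using a page table.

    page_table maps logical page numbers to physical frame numbers.
    """
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError("page fault: page %d is not mapped" % page)
    return page_table[page] * PAGE_SIZE + offset


# Two hypothetical processes map the same logical page to different frames.
process_a = {0: 5, 1: 2}
process_b = {0: 9, 1: 7}

print(hex(translate(process_a, 0x1234)))
print(hex(translate(process_b, 0x1234)))
```

Because only the page number is translated and the offset within the page is kept, every process can see the same flat address range while occupying different physical memory.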

The basic architecture of the i386 processor became the basis of all further developments in the x86 architecture and was retroactively named IA-32. All later 32-bit x86 processors work on the principle of the Intel 80386.

The previously separate math coprocessor, the 80387, was integrated directly into the processor starting with the next CPU, the Intel 80486 (with the exception of the 486SX, which has no coprocessor). With this coprocessor, floating-point calculations could be carried out in hardware. Without it, they had to be mapped onto integer calculations (emulation). Not only does each floating-point operation then require a large number of instructions, but loops and branches also occur frequently, so floating-point operations were comparatively very slow without the coprocessor.

Registers

  • AX / EAX / RAX: accumulator
  • BX / EBX / RBX: base
  • CX / ECX / RCX: counter
  • DX / EDX / RDX: data / general purpose
  • SI / ESI / RSI: source index (strings)
  • DI / EDI / RDI: destination index (strings)
  • SP / ESP / RSP: stack pointer
  • BP / EBP / RBP: base pointer (start address of the stack frame)
  • IP / EIP / RIP: instruction pointer

MMX and 3DNow!

In 1996 Intel introduced the MMX technology (Matrix Math Extensions, although marketing in particular often called it Multi-Media Extensions). MMX defined 8 new SIMD registers with a width of 64 bits, which, however, shared their storage with the registers of the floating-point unit (FPU). This improved compatibility with existing operating systems, which still had to save only the familiar FPU registers when switching between applications. Switching between MMX and FPU use, however, was cumbersome. In addition, MMX was limited to integer operations and for a long time was not properly supported by compilers. Microsoft in particular found it difficult to equip its in-house compiler with at least support for MMX intrinsics. MMX was therefore used relatively rarely, most likely for 2D video editing, image editing, video playback and the like.
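What a SIMD integer operation on a 64-bit register means can be illustrated by emulating a packed byte addition in software. The function name is hypothetical; the sketch treats the register as eight independent 8-bit lanes whose sums wrap around, which is the assumed behavior of a non-saturating packed add.

```python
def packed_add_bytes(a, b):
    """Add two 64-bit values as eight independent 8-bit lanes."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        lane_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        # Each lane wraps on its own; no carry reaches the next lane.
        result |= (lane_sum & 0xFF) << shift
    return result


a = 0x01020304050607FF
b = 0x0101010101010101

print(hex(packed_add_bytes(a, b)))          # lane-wise addition
print(hex((a + b) & 0xFFFFFFFFFFFFFFFF))    # ordinary 64-bit addition
```

The two results differ in the second-lowest byte: the ordinary addition carries out of the lowest byte, while the packed addition keeps every lane separate.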

In 1997, AMD extended the MMX instruction set with operations on single-precision floating-point numbers and called the resulting technology 3DNow!. This did not solve the compiler problems, but unlike MMX, 3DNow! could be used for 3D games, which rely on fast floating-point operations. Game developers and manufacturers of 3D graphics programs used 3DNow! to improve application performance on AMD's K6 and Athlon processors.

Streaming SIMD extensions

Increase in instructions (approximate)
Instruction set | Number of instructions | Total
x86 (base) | 80 | 80
MMX | 57 | 140
SSE | 70 | 200
SSE2 | 144 | 350
SSE3 | 13 | 360
SSSE3 | 16 | 380
SSE4 | 54 | 430
SSE5 | 47 | 480

In 1999 Intel introduced the SSE instruction set with the Pentium III processor. Like AMD, Intel mainly added floating-point SIMD instructions. In addition, a separate functional unit was created on the processor for SSE, with 8 new 128-bit registers (XMM0 to XMM7) that no longer overlap the floating-point registers. Because the operating system must also save these new registers on a context switch, a lock was implemented in the CPU that SSE-capable operating systems must first release before the SSE registers become available to application programs.

AMD processors initially supported only the 64-bit instructions of the extension, which work in the MMX functional unit, because the separate functional unit was missing entirely. Most of these instructions operate only on integer data, hence the designation ISSE, where the I stands for integer. SSE has been fully supported since the Athlon XP processor.

SSE2, introduced by Intel in 2001 with the Pentium 4, added first more integer instructions for the SSE registers and second 64-bit SIMD floating-point instructions. The former made MMX almost obsolete, and the latter allowed conventional compilers to use SIMD instructions as well. When AMD introduced the 64-bit extension, it therefore made SSE2 an integral part of the AMD64 architecture, so that all 64-bit x86 processors support this extension (AMD processors from the Athlon 64 onward).

With the Prescott revision of the Pentium 4, Intel has shipped SSE3 since 2004; it mainly provides memory and thread management instructions intended to increase the performance of Intel's Hyper-Threading technology.

AMD processors have also supported the SSE3 instruction set since the Athlon 64 with the Venice and San Diego cores.

See also: SSSE3, SSE4, SSE4a and SSE5

64 bit

Around 2002, the memory installed in modern x86 computers reached the 4 GB addressing limit that the 32-bit address size imposes on the x86 instruction set architecture. With PAE, Intel had already introduced a way of addressing more than 4 GB of RAM with the Pentium Pro, but its use was technically complex, and the memory usable per process was still limited to a maximum of 4 GB.
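The limits mentioned here follow directly from the number of address bits. The short calculation below is only an illustration with hypothetical helper names; it prints binary units (KiB, MiB, GiB), which correspond to the kB, MB and GB figures used in this article, and it assumes the 36-bit physical addresses that PAE provides.

```python
def address_space_bytes(address_bits):
    """Number of bytes addressable with the given number of address bits."""
    return 1 << address_bits


def format_size(size):
    """Format a byte count using binary units (whole numbers only)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size //= 1024
        index += 1
    return "%d %s" % (size, units[index])


for label, bits in [("real mode", 20), ("80286 protected mode", 24),
                    ("32-bit x86", 32), ("32-bit x86 with PAE", 36)]:
    print(label, format_size(address_space_bytes(bits)))
```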

Intel originally wanted to make the jump to 64 bits with a new processor architecture called Itanium and therefore named it "Intel Architecture 64-bit" (IA-64). The Itanium architecture, however, was only able to establish itself as a niche product in the server and workstation market. AMD, on the other hand, extended the existing 32-bit x86 architecture ("Intel Architecture 32-bit", IA-32, that is, 32-bit x86 from the i386 onward) to 64 bits; it called this extension "x86-64" during development and finally AMD64 when it was released in 2003. Intel adopted large parts of this extension under the name Intel 64 (from 2005). 64-bit x86 processors are therefore based on AMD64, and Intel 64 is largely compatible with it. x64 has become established as the general name, alongside the original development name x86-64.

Virtualization

Although virtualizing an x86 processor is complex because of the extensive architecture, several products provide a virtual x86 processor, including VMware, Hyper-V and Virtual PC, and open-source software such as Xen or VirtualBox. Hardware-assisted virtualization is also available as an extension; Intel calls it "Intel VT" (Virtualization Technology) and AMD calls it "AMD Virtualization".

AVX - Advanced Vector Extensions

In 2008, Intel proposed "AVX" as a further extension of the SIMD extensions MMX and SSE 1-4. AVX was first implemented in 2011 in the Sandy Bridge microarchitecture. Compared to SSE, the word length for data and registers was doubled to 256 bits. Many new instructions were added that can be used as 256-bit extensions of the SSE instructions. With the next revision of the microarchitecture, Haswell, AVX was extended again with new instructions, henceforth called AVX2, and can offer almost all SSE instructions in a 256-bit version.

Because energy efficiency is becoming ever more important in high-performance computing and the SIMD concept promises progress here, AVX was completely revised for the Intel Xeon Phi (also in 2013): the data and register width was doubled to 512 bits and the number of registers was doubled to 32. Intel calls this extension AVX-512. It consists of several specified groups of new instructions, not all of which are implemented everywhere. The second Xeon Phi generation ("Knights Corner") received the "Foundation" group; the third generation ("Knights Landing") in 2016 also received the "CD", "ER" and "PF" extensions.
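The effect of the growing register width can be expressed as the number of elements one register holds. The following sketch uses the register widths named in this article (64 bits for MMX, 128 for SSE, 256 for AVX, 512 for AVX-512); the helper name is hypothetical, and the element sizes of 32 and 64 bits are chosen only as examples.

```python
REGISTER_BITS = {"MMX": 64, "SSE": 128, "AVX": 256, "AVX-512": 512}


def lanes(register_bits, element_bits):
    """Number of elements of element_bits that fit into one register."""
    if register_bits % element_bits != 0:
        raise ValueError("element size must divide the register width")
    return register_bits // element_bits


for name, bits in REGISTER_BITS.items():
    print(name, lanes(bits, 32), lanes(bits, 64))
```

Each doubling of the register width doubles the number of elements that a single instruction can process, provided the software is built to use the wider registers.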

AVX-512 has also been announced for the Skylake Xeon server generation (EP/EX), itself announced for 2017.

Overview of the x86 generations

Intel Pentium 4; early Northwood production
An empty cell repeats the value of the row above.

Prominent CPU types (end-user segment) | First introduced | Linear / physical address space | Significant new features
Intel 8086, Intel 8088 | 1978 | 16-bit / 20-bit (segmented) | first x86 microprocessor
Intel 80186, Intel 80188, NEC V20/V30 | 1982 | | faster memory address resolution, MUL/DIV instruction
Intel 80286 | | 16-bit (30-bit virtual) / 24-bit (segmented) | MMU, for protected mode and a larger address space
Intel 80386, AMD Am386 | 1985 | 32-bit (46-bit virtual) / 32-bit | 32-bit instruction set, MMU with paging
Intel 486, AMD Am486 | 1989 | | RISC-like pipelining, integrated FPU, on-chip cache
Pentium, Pentium MMX, Rise mP6 | 1993 | | superscalarity, 64-bit-wide data bus, faster FPU (pipeline), SIMD for integer data with MMX
Cyrix 6x86, Cyrix MII, Cyrix III Joshua (2000) | 1996 | | register renaming, speculative instruction execution
Pentium Pro, AMD K5, Nx586 (1994) | 1995 | 32-bit / 32-bit physical (36-bit with PAE) | µ-instruction conversion, PAE (Pentium Pro), integrated L2 cache (Pentium Pro), conditional move instructions (CMOV etc.)
AMD K6/-2/3, Pentium II/III, IDT/Centaur C6 | 1997 | | L3 cache support, SIMD for floating-point data: AMD 3DNow!, Intel SSE
Athlon, Athlon XP | 1999 | | superscalar FPU, three parallel integer pipelines (up to three x86 instructions per clock)
Pentium 4 | 2000 | | long pipelines, optimized for very high clock frequencies, SSE2, Hyper-Threading
Pentium M, VIA C7 (2005), Intel Core (2006) | 2003 | | optimized for low power dissipation
Athlon 64, Opteron | 2003 | 64-bit / 40-bit physical in the first AMD implementations | AMD64, on-die memory controller, HyperTransport
Pentium 4 Prescott 2M / Cedar Mill | 2004 | | very long pipelines, designed for very high clock frequencies, SSE3, 64-bit (only for socket LGA 775)
Intel Core 2 | 2006 | | energy-efficient, multicore, medium-length pipeline, designed for lower clock frequencies than the P4, SSE4 (Penryn)
AMD Phenom | 2007 | 64-bit / 48-bit physical in the AMD Phenom | monolithic quad-core, 128-bit FPUs, SSE4a, HyperTransport 3 or QuickPath, integrated memory controller, on-die L3 cache, SMT (only with i7), modular design
Intel Core i3, Intel Core i5, Intel Core i7 | 2008 | |
Intel Atom | | | in-order instruction execution, pipelined, very energy-efficient
VIA Nano | | | out-of-order instruction execution, superscalar, hardware encryption, very energy-efficient, adaptive power management
AMD Bobcat | 2011 | |
Intel Sandy Bridge | 2010 | 64-bit | Advanced Vector Extensions, AES-NI (hardware-accelerated encryption), SMT (only with i7), very modular design, CMT (only with Bulldozer-based processors), FMA (only with Bulldozer processors)
AMD Bulldozer | 2011 | |
AMD Jaguar | 2013 | 64-bit / 40-bit physical | AVX, AES, SSEx, very low power consumption, first HSA features
Intel Haswell | 2013 | 64-bit | AVX2, FMA3, Iris Pro graphics
AMD Steamroller | 2014 | | improved CMT, twice as many decoders as Bulldozer
AMD Cliffcoaster | 2016 | 64-bit | significantly better CMT, 1.5 times as many decoders as Steamroller

Manufacturers

x86-compatible processors have been designed and manufactured by many companies.
