64-bit architecture

In IT, a 64-bit architecture is a processor architecture with a processing width of 64 bits. With 64-bit address registers, such processors can provide individual processes with (non-segmented) address spaces larger than 4 GB.

Some processors support several architectures (for reasons of compatibility). Current PC processors from AMD and Intel support the x86-16 architecture of the Intel 8086, the x86-32 architecture of the Intel 80386 and the x86-64 architecture of AMD64 / Intel 64.

Similarly, operating systems and computer programs that are designed for such an architecture are described as 64-bit (e.g. "64-bit operating system" or "Windows 64-bit").

Development

The first architectures with individual properties of a 64-bit architecture emerged from the 1960s onwards in the field of supercomputers. The decisive factor was optimization for processing mathematical models with 64-bit floating-point support. By comparison, processors for the PC and workstation market were delivered without floating-point units until the 1990s; the unit had to be purchased separately as a coprocessor for its own socket. Address spaces, in contrast, were still small: the 18-bit addresses of the IBM 7030 Stretch covered only 262,144 words.

The further development of 64-bit architectures was driven by main memory becoming increasingly inexpensive to manufacture. This led to 64-bit architectures in the server sector from the early 1990s (MIPS R4000, DEC Alpha, SPARC64, HP PA-RISC, IBM Power 3, Intel IA-64), in the PC and workstation sector in the early 2000s (AMD64) and even in smartphones in the early 2010s (ARM64). Some older architectures had widened their data paths even before a full 64-bit architecture was developed, for example the Pentium P5 (64-bit data bus, instructions 8 to 120 bits long) or the Pentium 4 (two 64-bit data buses, which are generally used to transfer 512-bit words).

Early special-purpose supercomputer architectures with bus widths of 64 bits or more

1961: the IBM 7030 Stretch, with an 18-bit address bus, a 64-bit data bus and support for words of variable bit width

1974: the CDC STAR-100 (successor to the 60-bit computers from Control Data Corporation), a vector computer with a Harvard architecture. Using the 16-bit address bus, up to 65,536 superwords of 512 bits each can be transferred via a 512-bit data bus. There was a separate 128-bit bus for instructions.

1976: the Cray-1, the first 64-bit vector computer and forerunner of the Cray line of supercomputers: 24-bit address space, 16- or 32-bit instructions, 16 data buses of 64 bits each

1983: the Elxsi 6400, a so-called "mini supercomputer" with 64-bit data paths and 64-bit integer registers but a 32-bit address space, and support for clusters of up to 12 CPUs

64-bit architectures for servers in general-purpose processors

1991: the 64-bit MIPS architecture from MIPS (later SGI), starting with the MIPS R4000 (32- and 64-bit)

1992: the Alpha processor series from DEC

1995: the SPARC architecture from Sun Microsystems ("UltraSPARC", 32- and 64-bit)

1995: the PA-RISC series from Hewlett-Packard (32- and 64-bit)

1997: the Power series from IBM (32- and 64-bit)

1998: the PowerPC series from Apple, IBM and Motorola (Freescale from 2004) (32- and 64-bit)

2000: the System z series from IBM (formerly S/390)

2001: the Itanium architecture "IA-64" from Intel

2005: the SPARC64 V architecture from Fujitsu

64-bit architectures for servers, PCs, tablets and smartphones in universal processors

2003: x64 from AMD and Intel, an instruction set extension for the x86 processor family (16- and 32-bit, extended to 64 bits by x64)
- from AMD, the AMD64 extension of the IA-32 architecture (from 2003)
- from Intel, the Intel 64 extension, which is largely compatible with AMD64 (from 2005)

2013: the ARMv8 architecture from ARM Limited

Coprocessors have not always been limited in their development by the data paths of the main processor. The Intel 8087, the first mathematical coprocessor (FPU) for the 16-bit Intel 8086, already had 80-bit registers. Later graphics processors (GPUs) were optimized for 3D calculations on a four-fold packed representation of matrices, so that they developed into 128-bit and 256-bit processors. Since no applications or operating systems of their own exist with this bit width, they do not count as full architectures.

Hardware

The architecture of a processor says nothing about how individual functions are actually implemented in the chip design. Individual instructions may therefore still be executed internally as 32-bit operations (such as the shift instructions in MIPS R4000 processors).

The specific hardware of 64-bit processors is determined much more by the processor design of the years in which they were introduced. Typical features are:

mostly multicore systems
usually several 64-bit buses to main memory
always a superpipelined architecture
mostly out-of-order and superscalar execution
mostly vector instructions at least 128 bits wide
floating-point units, some of which can execute several dozen floating-point instructions per core at the same time
extensive cache architectures with 2 to 3, sometimes 4, levels
virtualization options for memory and, in some cases, for I/O operations

Expanding a 32-bit architecture to 64 bits increased the hardware effort by around 10 percent. The 32-bit Intel Core Duo T2700 made do with 151 million transistors, whereas the otherwise largely identical 64-bit Intel Core 2 Duo E4300 required 167 million. The reason is that almost everything in these processors was already 64 bits wide or wider, and only the last remaining components had to be widened to 64 bits.

In the PowerPC architecture, in contrast to x86, the expansion to 64 bits was planned from the beginning, since this CPU originally comes from the mainframe sector (IBM Power architecture). A 64-bit extension was also planned for the MIPS architecture at an early stage. In both cases the implementation in hardware did not take place until a few years later.

Software

Compatibility

Successful general-purpose 64-bit processors can still run 32-bit code, and sometimes 16-bit code. If the 64-bit operating system (which is required to execute 64-bit programs) also supports this capability, the older code can be executed natively under that operating system. For this, it must be possible to load programs in 32-bit mode, and the 32-bit API must continue to be supported, usually through a wrapper. Older 32-bit programs can then still be executed, and new 32-bit programs can be written and tested. As long as the operating system supports it, users can choose whether to run 32-bit or 64-bit programs.

Conversely, it is not possible to run 64-bit programs under a 32-bit operating system, even on a 64-bit processor.

Computer programs designed for a 64-bit architecture use 64 bits for addressing main memory (or their virtual memory) and are therefore not compatible with a processor architecture of a smaller bit width (e.g. a 32-bit architecture). On the other hand, some platforms offer the option of compiling or running programs from 32-bit predecessor systems directly on a 64-bit platform without revising them. The AMD64 processors (including Intel 64), for example, offer a 32-bit x86 compatibility mode for this purpose. To achieve this, the processors contain additional components for interpreting the 32-bit instruction set. Modern operating systems activate this mode per process: a mark in the program file indicates whether the program is to be run in the extended 64-bit mode or in the compatible 32-bit mode. In Windows, this is implemented by the WOW64 subsystem. Where the hardware does not offer backward compatibility, 32-bit programs can still be executed through a comparatively slow, software-based emulation.
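One concrete instance of such a mark, given here only as an illustration, is the ELF file format used by Linux and many Unix-like systems: the fifth byte of the file header (e_ident[EI_CLASS]) records whether the file contains 32-bit or 64-bit code. The following minimal sketch assumes an ELF file image; the function name elf_class_bits is hypothetical, and Windows executables use a different header format that is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: inspect the first bytes of a file image and
   report the ELF class. Returns 32 or 64, or 0 if the buffer does not
   start with a recognizable ELF header. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 5)
        return 0;
    /* ELF magic number: 0x7F 'E' 'L' 'F' */
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;
    /* Byte 4 is e_ident[EI_CLASS]: 1 = ELFCLASS32, 2 = ELFCLASS64 */
    switch (buf[4]) {
    case 1:
        return 32;
    case 2:
        return 64;
    default:
        return 0;
    }
}
```

A loader-like tool could read the first five bytes of a program file into a buffer and pass them to this function to decide which execution mode the file asks for.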

Programming model

In the C programming language, the orientation towards a 64-bit architecture is reflected in the size of both the pointer types (e.g. void*) and the integer types (in particular int and long). When moving from a 32-bit architecture, pointers and the data type long are usually widened to 64 bits, whereas the data type int remains 32 bits wide. This is called LP64 for short. For backward compatibility with the 32-bit architecture, which was mostly implemented as ILP32, some platforms instead left long identical to int; this model is known as LLP64. All of today's Unix-like 64-bit operating systems express the 64-bit architecture in an LP64 type model, while Windows uses the LLP64 model.

The ILP64 data model was introduced because the source code of legacy software was often developed under the impermissible assumption that an int could hold a pointer. It is found on early 64-bit systems whose vendors wanted to get to market quickly without having to clean up existing source code.
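The portable alternative to storing a pointer in an int can be sketched as follows. The sketch assumes a C11 compiler that provides the optional type uintptr_t from <stdint.h>, which is the case on common platforms; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sanity check for this sketch: uintptr_t is meant to hold any
   object pointer, so it should be at least as wide as void *. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t is expected to be at least pointer-sized");

/* Round-trips a pointer through uintptr_t. Returns 1 if the pointer
   survives the conversion unchanged. */
int pointer_roundtrip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p; /* pointer -> integer */
    void *back = (void *)bits;     /* integer -> pointer */
    return back == p;
}

/* Returns 1 if a plain int is narrower than a pointer, as is the
   case under the LP64 and LLP64 models. */
int int_too_narrow_for_pointer(void)
{
    return sizeof(int) < sizeof(void *);
}
```

Code that casts pointers to int instead of uintptr_t silently truncates them wherever int_too_narrow_for_pointer() returns 1.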

64-bit data models (type widths in bits)
Data model	short (integer)	int (integer)	long (integer)	long long (integer)	pointer	Example operating system / compiler
LLP64	16	32	32	64	64	Microsoft Win64 (x64 / IA-64)
LP64	16	32	64	64	64	Unix systems (e.g. Solaris) and Unix-like systems (e.g. Linux and macOS)
ILP64	16	64	64	64	64	Cray, DEC Alpha with Tru64 UNIX, DEC Alpha with Linux
SILP64	64	64	64	64	64	Some Unicos systems
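The table can be turned into a small check of which data model a given compiler and target use. This is a minimal sketch, not part of any standard API: the function names are hypothetical, and because only int, long and pointer widths are compared, SILP64 is reported as ILP64.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical classifier following the table: all widths in bits. */
const char *data_model_name(unsigned int_bits, unsigned long_bits,
                            unsigned ptr_bits)
{
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
    if (int_bits == 64 && long_bits == 64 && ptr_bits == 64) return "ILP64";
    return "other";
}

/* Data model of the compiler and target this file is built for. */
const char *current_data_model(void)
{
    return data_model_name((unsigned)(sizeof(int) * CHAR_BIT),
                           (unsigned)(sizeof(long) * CHAR_BIT),
                           (unsigned)(sizeof(void *) * CHAR_BIT));
}

/* Returns 1 if the current build uses the named data model. */
int is_data_model(const char *name)
{
    assert(name != NULL);
    return strcmp(current_data_model(), name) == 0;
}
```

On a typical 64-bit Linux or macOS build, is_data_model("LP64") would be expected to return 1, and on 64-bit Windows is_data_model("LLP64"); the result depends on the compiler and target, not on the operating system alone.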

Advantages

The main advantage of 64-bit programs running under a 64-bit operating system on a 64-bit processor is the enlarged address space. In addition, some architectures (e.g. x86-64) provide more general-purpose registers (15 instead of 7) and guarantee a minimum instruction set (on x86-64, SSE2 can be relied upon to be available). 64-bit integer arithmetic is necessary for calculating the addresses of operands but is rarely used for arithmetic calculations these days. Arithmetic is now mostly floating-point arithmetic, and graphics calculations are carried out on GPUs, mostly also as floating-point operations. Video decoding, for example, is largely executed on GPUs in specialized arithmetic units.

Enlarged address range

The theoretically possible address space of a 64-bit processor, 16 exbibytes, is usually not fully supported today. Typically only a 48-bit address space per process (256 tebibytes) is available, and some server CPUs now support 57 bits (128 pebibytes). The restriction results from the number of levels of page-table address resolution. An address range of more than 4 GB can make sense even for main memories far smaller than 4 GB, because

main memory can be extended through paging to include storage on hard drives or SSDs,
hard disk space can be mapped directly into the address space of processes, and
memory management benefits from a larger address range because data can be organized better (stack and heap do not get in each other's way, and the dreaded heap fragmentation does not occur).
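The address-space sizes quoted above follow directly from the address width: n address bits cover 2^n bytes. A minimal sketch of the arithmetic, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Number of addressable bytes for an address width below 64 bits.
   (2^64 itself does not fit into a 64-bit integer.) */
uint64_t address_space_bytes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}

/* The same size expressed in tebibytes (1 TiB = 2^40 bytes),
   for address widths from 40 to 63 bits. */
uint64_t address_space_tib(unsigned bits)
{
    assert(bits >= 40 && bits < 64);
    return (uint64_t)1 << (bits - 40);
}
```

Here address_space_bytes(32) yields 4,294,967,296 bytes (4 GiB), address_space_tib(48) yields 256 TiB and address_space_tib(57) yields 131,072 TiB, that is 128 PiB. The full 64 bits correspond to 2^64 bytes, or 16 EiB.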

Another advantage compared with a 32-bit architecture is that more than four gibibytes of RAM can be addressed directly (→ 4 GiB limit), which benefits applications with high memory requirements such as video processing and database systems. With 64 bits, up to 16 exbibytes can be addressed, which is currently (2016) and for the foreseeable future sufficient to address not only the available main memory but also hard disk space (e.g. via mmap).
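Addressing file contents through mmap can be sketched as follows. The example assumes a POSIX system; the function name sum_file_bytes is hypothetical and error handling is reduced to a single return code. On a 64-bit system the same code also works for files far larger than 4 GiB, because the mapping only consumes address space.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sums all bytes of a file by mapping it into the address space
   instead of reading it into a buffer. Returns -1 on error or for
   an empty file. */
long long sum_file_bytes(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    long long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i]; /* file contents are read like ordinary memory */

    munmap(data, len);
    return sum;
}
```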

Disadvantages

What is an advantage for data-intensive programs (for example database or file servers) can become a disadvantage in terms of memory consumption and speed, especially for small programs.

On 64-bit architectures, all address values are 64 bits wide, twice as wide as the 32 bits of a 32-bit architecture. Storing them therefore takes up twice as much space in RAM and in the caches. Other data types (e.g. long in the LP64 model) also take up twice as much space on 64-bit architectures as on 32-bit architectures. This becomes evident in the generated program files, which are typically around 25 to 30 percent larger than their 32-bit counterparts and can therefore place greater stress on RAM and caches (cache misses). In the worst case, this reduces the execution speed of programs by approximately the same factor. With AMD64 (and Intel 64), for example, this is counteracted by a greatly increased number of registers compared with IA-32, so that even unfavorable 64-bit programs are not significantly slower in practice. Many 64-bit architectures also support IP-relative addressing with signed 32-bit offsets, which can prevent an increase in instruction length.
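The effect of wider pointers on data structures can be illustrated with a hypothetical doubly linked list node consisting of two pointers and a 32-bit value. The sketch below computes the node size for a given pointer size under the common assumption that each member is aligned to its own size and the structure to its widest member; real compilers and ABIs may lay out structures differently.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds n up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Size in bytes of a hypothetical list node
       struct node { struct node *next, *prev; int32_t value; };
   for a given pointer size in bytes, assuming natural alignment. */
size_t node_size(size_t ptr_bytes)
{
    const size_t value_bytes = 4;
    size_t widest = ptr_bytes > value_bytes ? ptr_bytes : value_bytes;

    size_t size = 2 * ptr_bytes;                      /* next, prev */
    size = round_up(size, value_bytes) + value_bytes; /* value */
    return round_up(size, widest);                    /* tail padding */
}
```

Under these assumptions node_size(4) is 12 bytes and node_size(8) is 24 bytes: the node doubles in size although its payload is unchanged, so only half as many nodes fit into the same cache.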

Problems

Without a specially adapted execution environment, however, no benefit can be drawn from switching from 32-bit to 64-bit CPUs. This becomes particularly clear with backward-compatible CPUs such as the AMD Athlon 64 X2, AMD Phenom X3 / X4, Intel Pentium D, Intel Pentium Extreme Edition, Intel Core 2 Duo, Intel Core 2 Quad, Intel Core i7 or the 64-bit PowerPC CPUs. This applies not only to operating systems, whose 64-bit kernels manage paging with large addresses, but also to the auxiliary libraries of programs and the algorithms used in them: many old systems use algorithms optimized for 32 bits, which benefit from the 64-bit extension only after programmers have adapted them.

The need for adaptation particularly affects mathematical auxiliary functions (including multimedia and games), but also memory management. Many programs from the Unix world have a head start here, as 64-bit architectures have long been common there. In the course of workstation development, desktop programs in the Unix world (including Linux) had already been adapted to 64 bits for many years before Windows programs were adapted to the 64-bit editions of Windows. With macOS the development is mixed, since the Unix-based core and the desktop interface come from different development branches. The latter systems in particular make use of the ability of backward-compatible CPUs to run 32-bit and 64-bit programs in parallel on a 64-bit operating system kernel. They do, however, have the problem that the interaction between programs on the desktop can be impaired (known for browser plugins, for example).

Similar to SIMD or AltiVec extensions, specially adapted software is usually also required for 64-bit systems.

Web links

Andrew Josey: Data Size Neutrality and 64-bit Support. USENIX, December 4, 1997 (English; 32/64-bit programmer's view).
M. Jungowski: WoW64 Microsoft's start-up help for 64-bit Windows. Online magazine Planet 3DNow!, July 14, 2004 (32/64-bit mixed operation).

References

Harry Phillips: New Perspectives on Microsoft Windows Vista for Power Users. Cengage Learning, 2008, ISBN 978-1-4239-0603-2, p. 16 (limited preview in Google Book search).
Documents on the IBM 7030 Stretch
PA-RISC 2.0 Architecture Specifications, ftp.parisc-linux.org (English, PDF file)
ark.intel.com
ark.intel.com
Jorge Orchilles: Microsoft Windows 7 Administrator's Reference: Upgrading, Deploying, Managing, and Securing Windows 7. Syngress, 2010, ISBN 978-1-59749-562-2, p. 9 (limited preview in Google Book search).
64-Bit Programming Models: Why LP64? The Open Group, 1998, accessed January 1, 2016.
The data model is a property of the compiler under the corresponding target operating system, not of the operating system alone.
Cray C / C++ Reference Manual. Cray Inc, accessed January 2, 2016.
More performance: Linux with 64-bit programs. Detailed comparison of 32-bit and 64-bit application benchmarks under Linux.
Are 64-bit Binaries Really Slower than 32-bit Binaries? Comparison of 32-bit and 64-bit programs on Solaris / SPARC (English).


