Hyper-threading
Hyper-Threading Technology (HTT for short, usually just called Hyper-Threading or HT) is Intel's implementation of hardware multithreading in its processors; the approach was later also adopted by AMD. Using several complete register sets and a complex control unit, the core allocates its internally parallel pipeline stages to two parallel instruction and data streams. Conceptually, Hyper-Threading is a form of simultaneous multithreading (SMT).
Note: a processor core can process either a single thread (with the second thread switched off) or two different threads in parallel; in the latter case each thread may require its own page tables. The two threads can also belong to the same process and then compute in the same process context.
Concept
The idea behind Hyper-Threading is to make better use of a processor's execution units by having two threads share the resources of one complete core. Each thread can use whatever the other is not currently using, in particular ALU and FPU components, and can also fill pipeline gaps, which arise, for example, when a thread has to wait for main memory after a cache miss. During such a gap the second thread can compute, so the two run in parallel. To software, a CPU with Hyper-Threading looks much like a symmetric multiprocessing (SMP) system: the core presents two logical processors (called siblings in HT jargon), across which incoming work is distributed and which the operating system can manage with classic multiprocessing methods. Although an SMP-capable operating system can in principle handle HT without modification, an HT-aware operating system is preferable; otherwise the full performance cannot be exploited, and in individual cases performance may even drop.
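How an operating system exposes siblings can be illustrated with a short Python sketch. It assumes a Linux system, where each logical CPU has a `thread_siblings_list` file under `/sys/devices/system/cpu/cpuN/topology/`; the helper names `parse_cpu_list`, `group_siblings` and `read_sibling_lists` are made up for this illustration, and the example topology is hypothetical.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-1,8-9" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_siblings(sibling_lists):
    """Group logical CPUs into physical cores.

    sibling_lists maps a logical CPU number to the text of its sibling list.
    Each core is identified by the lowest-numbered logical CPU it contains.
    """
    cores = {}
    for text in sibling_lists.values():
        siblings = parse_cpu_list(text)
        cores[siblings[0]] = siblings
    return cores


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Read the sibling list of every cpuN directory (Linux only)."""
    result = {}
    for path in Path(root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(path.parent.parent.name[3:])  # directory name is "cpuN"
        result[cpu] = path.read_text()
    return result


# Hypothetical machine with 2 cores and 4 logical processors.
# Real data would come from read_sibling_lists().
example = {0: "0,2", 1: "1,3", 2: "0,2", 3: "1,3"}
print(group_siblings(example))
```

Two logical CPUs that appear in the same group are siblings and compete for the execution units of one core, which is the information an HT-aware scheduler needs.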
With the NetBurst microarchitecture in particular, a typical thread uses only about 35% of the execution resources. Hyper-Threading can raise this utilization: with a single running (unoptimized) application the gain is marginal and hardly noticeable, but the user benefits when several applications or threads run at the same time. The performance gain comes at a comparatively low cost in processor logic: only the threading logic and an additional register set for each further thread are needed, but no additional execution units.
The logical processors are equivalent. Because they must share execution units, they can interfere with one another, so the total computing power is significantly less than twice that of a single thread. If one of the two logical processors is switched off, the other can compute at full speed without interference. The operating system should therefore explicitly assign compute-intensive threads to a core that is not processing another thread (and switch off the second logical processor). The achievable computing power of a processor with HT-capable cores (two logical processors each) is generally well below that of a processor with the same number of "full" cores as logical processors.
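An application can approximate this placement itself. The following Python sketch assumes Linux, where the standard library offers `os.sched_setaffinity`; the `topology` mapping and both function names are hypothetical. Note that pinning only keeps this process off the second sibling; it does not switch the sibling off, and other processes may still be scheduled there.

```python
import os


def one_cpu_per_core(cores):
    """Pick the first logical CPU of every physical core.

    cores maps a core id to the list of its logical CPUs (hypothetical input).
    """
    return {siblings[0] for siblings in cores.values()}


def pin_current_process(cpus):
    """Restrict the calling process to the given logical CPUs, where supported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # the call is not available on every platform
    allowed = set(cpus) & os.sched_getaffinity(0)
    if not allowed:
        return False  # the assumed topology does not match this machine
    os.sched_setaffinity(0, allowed)
    return True


# Hypothetical topology: core 0 holds CPUs 0 and 2, core 1 holds CPUs 1 and 3.
topology = {0: [0, 2], 1: [1, 3]}
print(one_cpu_per_core(topology))
# pin_current_process(one_cpu_per_core(topology)) would then apply the choice.
```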
Hyper-Threading can bring significant performance gains even on in-order architectures such as the Atom processor. Because an in-order architecture cannot reorder operations within a thread and processes everything in program order, “gaps” often appear in the pipeline, which can then be filled with operations from another thread. Here Hyper-Threading has an effect similar to out-of-order execution, but only at thread level. Each individual thread still executes slowly because of the in-order execution, so Hyper-Threading increases multi-thread performance, not single-thread performance.
But even on modern “wide” (four-way superscalar) microarchitectures such as Nehalem, Hyper-Threading can, according to Intel, speed up normal programs by 10 to 20% and optimized programs by up to 33% in multitasking operation.
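The gap-filling effect can be made concrete with a toy model in Python. This is a deliberately simplified sketch, not a description of any real pipeline: it assumes a core that issues one instruction per cycle, strict in-order issue within each thread, and round-robin selection between threads. The function name `run` and the stall values are invented for the example.

```python
def run(threads):
    """Simulate a single-issue in-order core and return the cycles needed.

    threads is a list of threads; each thread is a list of stall lengths.
    Entry k is the number of cycles the thread must wait after issuing
    instruction k (for example because of a cache miss).
    """
    count = len(threads)
    next_index = [0] * count  # next instruction to issue, per thread
    ready_at = [0] * count    # first cycle in which the thread may issue again
    cycle = 0
    last = 0                  # thread that issued most recently
    while any(next_index[t] < len(threads[t]) for t in range(count)):
        # Round robin: start with the thread after the most recent issuer.
        for offset in range(1, count + 1):
            t = (last + offset) % count
            if next_index[t] < len(threads[t]) and ready_at[t] <= cycle:
                stall = threads[t][next_index[t]]
                ready_at[t] = cycle + 1 + stall
                next_index[t] += 1
                last = t
                break
        cycle += 1  # if no thread was ready, this cycle is a pipeline gap
    return cycle


work = [0, 3, 0, 3]      # two of the four instructions stall for three cycles
alone = run([work])
together = run([work, work])
print(alone, together, 2 * alone)
```

In this model two threads running together need fewer cycles than the same two threads run one after the other, while each individual thread finishes no earlier than it would alone.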
Functionality
In Hyper-Threading, CPU resources are divided into three categories:
- Replicated resources: each sibling keeps its own independent copy. This always includes the complete register set, including stack pointer and program counter.
- Partitioned resources: these exist only once but are subdivided, so that individual parts can be assigned to exactly one sibling. They include the instruction queues, the reorder buffer and the load/store buffers.
- Shared resources: all remaining resources are used by both siblings, usually in such a way that only one sibling can use them at any given time. At present these include in particular the arithmetic logic unit (ALU) and the floating-point unit (FPU).
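The three categories can be pictured as three different ownership rules. The following Python sketch is purely illustrative: the class names, the buffer size and the acquire/release protocol are assumptions made for the example and do not describe Intel's actual hardware.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RegisterSet:
    """Replicated: every sibling owns a complete private copy."""
    program_counter: int = 0
    stack_pointer: int = 0


@dataclass
class PartitionedBuffer:
    """Partitioned: one buffer, each entry reserved for exactly one sibling."""
    entries_per_sibling: int
    used: List[int] = field(default_factory=lambda: [0, 0])

    def allocate(self, sibling):
        if self.used[sibling] >= self.entries_per_sibling:
            return False  # this sibling's share is full; the other share is untouched
        self.used[sibling] += 1
        return True


@dataclass
class SharedUnit:
    """Shared: a single unit that serves only one sibling at a time."""
    owner: Optional[int] = None

    def acquire(self, sibling):
        if self.owner is not None:
            return False  # busy with the other sibling (or already held)
        self.owner = sibling
        return True

    def release(self):
        self.owner = None


registers = [RegisterSet(), RegisterSet()]                # one copy per sibling
reorder_buffer = PartitionedBuffer(entries_per_sibling=2)
alu = SharedUnit()
print(alu.acquire(0), alu.acquire(1))                     # second request must wait
```

The sketch shows why only shared resources cause interference: a full partition or a private register set never blocks the other sibling, whereas a busy shared unit does.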
Support
Hardware
On the Intel side, Hyper-Threading is found in the later models of the Pentium 4 series and their derivatives, in NetBurst-based Xeons, in many Core i models and in some Atom processors. AMD multi-core processors such as the Athlon 64 X2, Opteron and newer models report themselves as Hyper-Threading-capable via the processor flag, although they either are not or use a somewhat different design (closer to SMT). Likewise, various Intel multi-core processors such as the Pentium D or Core 2 Duo do not offer Hyper-Threading, depending on the version, yet still set the corresponding processor flag.
Hyper-Threading can usually be switched off in the BIOS or UEFI. This is done mainly on workstations and servers, so that each remaining CPU core delivers its full possible computing power. Although this generally reduces the computing power of the overall system when many threads are running, it increases it for software that uses no more threads than there are remaining cores.
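Because the flag alone is not conclusive, software usually compares the number of logical processors per package with the number of cores. The Python sketch below assumes the x86 Linux layout of `/proc/cpuinfo`, with an `ht` entry in the `flags` line and the fields `siblings` and `cpu cores`; the function names and the abridged sample texts are hypothetical.

```python
def parse_cpuinfo(text):
    """Return (has_ht_flag, siblings, cpu_cores) from the first processor entry."""
    fields = {}
    first_entry = text.strip().split("\n\n")[0]
    for line in first_entry.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    has_flag = "ht" in fields.get("flags", "").split()
    siblings = int(fields.get("siblings", 0))
    cores = int(fields.get("cpu cores", 0))
    return has_flag, siblings, cores


def smt_really_available(text):
    """True only if the flag is set and a core holds more than one logical CPU."""
    has_flag, siblings, cores = parse_cpuinfo(text)
    return has_flag and cores > 0 and siblings > cores


# Abridged, made-up samples: a dual-core that only sets the flag,
# and a dual-core with four logical processors.
FLAG_ONLY = "processor\t: 0\nflags\t\t: fpu vme ht sse2\nsiblings\t: 2\ncpu cores\t: 2\n"
WITH_HT = "processor\t: 0\nflags\t\t: fpu vme ht sse2\nsiblings\t: 4\ncpu cores\t: 2\n"
print(smt_really_available(FLAG_ONLY), smt_really_available(WITH_HT))
```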
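Whether the firmware setting took effect can be checked from within the operating system. The following Python sketch assumes a reasonably recent Linux kernel that exposes the files `control` and `active` under `/sys/devices/system/cpu/smt`; on other systems the files are absent and the function returns `None`. The function names are invented for this example.

```python
from pathlib import Path

SMT_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(smt_dir=SMT_DIR):
    """Return (control, active), or None if the files are not available."""
    smt_dir = Path(smt_dir)
    try:
        control = (smt_dir / "control").read_text().strip()
        active = (smt_dir / "active").read_text().strip() == "1"
    except OSError:
        return None
    return control, active


def describe(state):
    """Turn the raw state into a short human-readable sentence."""
    if state is None:
        return "SMT state is not exposed on this system"
    control, active = state
    if control in ("notsupported", "notimplemented"):
        return "the kernel reports no controllable SMT"
    if active:
        return "SMT is on: each core presents more than one logical processor"
    return f"SMT is off (control={control})"


print(describe(read_smt_state()))
```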
Software
The speed advantage of Hyper-Threading over classic single-threading can only be exploited with an SMP-capable operating system and with applications that are ideally optimized for Hyper-Threading, or for multithreading in general. Compared with classic multiprocessor systems, Hyper-Threading is at a disadvantage in raw performance, because the two threads on one processor core share its resources and each therefore runs more slowly than a single thread would on that core. How fast a thread runs on a logical processor depends largely on how well its resource requirements complement those of the other thread. For this reason, threads with hard real-time requirements, for example, must be permanently assigned to a core without a second thread, or Hyper-Threading must be switched off. When it introduced the technology, Intel advertised a performance increase of up to 33% per additional logical processor. That is probably the ideal case; in everyday use a CPU core consisting of two HT siblings delivers around 120–125% of the performance of a simple, full-fledged CPU core.
Hyper-Threading is much cheaper to implement than two full cores. When implemented properly, especially in the operating system, it raises processor efficiency in software environments that execute many threads at the same time, while preserving high single-thread performance when only a few threads are running. The processor cores can be made quite "wide": it makes sense to duplicate those execution units within the core that, on statistical average, both siblings of an HT pair often need at the same time, in order to reduce mutual interference. This increases the performance per thread. If software creates more threads than there are cores, computing power generally drops because the operating system frequently has to reload thread contexts. Modern operating systems try to reschedule a thread, even after an interruption, on the same CPU core on which it previously ran. This can speed things up considerably if the memory areas needed by the thread are still in that core's cache and therefore do not have to be reloaded.
Operating systems that support Hyper-Threading include Windows from Windows XP onwards, macOS, newer versions of FreeBSD and other BSDs, and Linux. Windows 2000 is compatible with Hyper-Threading but rarely benefits from it, because it does not distinguish between physical and logical processors (it lacks so-called “SMT awareness”). Performance can even drop owing to effects such as cache thrashing. For six-core processors with Hyper-Threading, Windows is recommended without reservation only from Windows 7 onwards, because the scheduler in Windows Vista and older versions cannot optimally handle the resulting twelve hardware threads.
Compilers that can generate Hyper-Threading-friendly code include the Intel compilers and the GNU Compiler Collection. However, Hyper-Threading only yields a speed advantage for applications whose computations can be parallelized, i.e. where the computation in one thread does not depend on the result of another.
Whether computer games benefit from Hyper-Threading depends primarily on how many demanding threads a game can offer the processor and how many of them the processor can execute at the same time. Dual-core CPUs benefit considerably from Hyper-Threading in current games, since most of these offer more than two demanding threads. On a quad-core processor, by contrast, the majority of games as of early 2011 even lose a little performance with Hyper-Threading enabled, because they offer hardly more than four threads while Hyper-Threading increases the administrative overhead within the processor. Since 2009 at the latest, however, there have been exceptions such as Anno 1404, which offer more than four demanding threads, so that quad-cores benefit from Hyper-Threading as well.
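The figures quoted above can be turned into a small worked calculation in Python. The only assumption is that the core's total throughput is split evenly between the two siblings; the function name is made up for the example.

```python
def per_thread_speed(core_throughput, threads=2):
    """Speed of each thread relative to a thread that has the core to itself,
    assuming the core's total throughput is shared evenly."""
    return core_throughput / threads


# 1.20 and 1.25 are the everyday figures, 1.33 the advertised best case.
for total in (1.20, 1.25, 1.33):
    each = per_thread_speed(total)
    print(f"core throughput {total:.0%}: each sibling runs at {each:.1%} of full speed")

# Two full cores, for comparison, would deliver 200% with each thread at 100%.
```

This is why a thread with hard timing requirements is better off without a sibling: under these figures it would run at only about 60 to 66% of its stand-alone speed, even though the core as a whole gets more work done.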
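The dependence on parallelizability can be estimated with Amdahl's law, which is brought in here as an illustration and is not part of the original text. The Python sketch below treats Hyper-Threading as raising one core's capacity for the parallel part of a program to 1.25 times, taken from the article's everyday figure of 120–125%; both that mapping and the function name are assumptions.

```python
def amdahl_speedup(parallel_fraction, capacity):
    """Amdahl's law: overall speedup when only the parallel fraction of a
    program runs `capacity` times faster and the rest stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / capacity)


HT_CAPACITY = 1.25  # assumed throughput of one core with two siblings

for fraction in (0.0, 0.5, 0.9, 1.0):
    print(f"parallel fraction {fraction:.0%}: speedup {amdahl_speedup(fraction, HT_CAPACITY):.3f}")
```

A program with no independent work gains nothing, and only a fully parallel one reaches the core's entire extra throughput.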
Web links
- Hyper-Threading Technology, Intel (English)
- Hyper-Threading Technology at Intel
- Hyper-Threading and other features of Intel processors in comparison (English)
- Hyper-Threading Technology at Computerbase
- Hyper-Threading from the perspective of the Linux scheduler (English)
- The technical basics of Intel Hyper-Threading (Server Mile Technet)
- Exploring the performance limits of simultaneous multithreading for memory intensive applications (English)
- Simultaneous Multithreading (English)
- Multithreading (English)
- Hyper-Threading Technology Architecture and Microarchitecture (English)