Streaming SIMD Extensions

from Wikipedia, the free encyclopedia
Comparison of the implementation of instruction set extensions by AMD (left) and Intel (right), as of 2013

The Streaming SIMD Extensions (SSE), formerly also called Internet Streaming SIMD Extensions (ISSE), are an instruction set extension of the x86 architecture developed by Intel. They were introduced with the Pentium III (Katmai) processor in 1999 and were therefore initially known as Katmai New Instructions (KNI). Their purpose is to accelerate programs through data parallelism at the instruction level, known as SIMD (single instruction, multiple data).

Concept

In contrast to the earlier MMX instruction set extension, SSE was designed specifically for floating-point data types. It also introduced 128-bit registers, twice as wide as the 64-bit MMX registers. The lack of floating-point support and the narrow registers were both frequently criticized weaknesses of MMX. Intel also decided to design the SSE instruction set from scratch and make it incompatible with the 3DNow! instructions that competitor AMD had published in 1998, which served a comparable purpose. This step proved successful in the long term: SSE prevailed over 3DNow!, and AMD later supported only SSE and let 3DNow! support lapse.

Although the Internet was initially mentioned explicitly in the name (ISSE), the technology has nothing directly to do with it. The reference served marketing purposes: when introducing the Pentium III, Intel advertised, among other things, that Internet surfing would become faster and generally more exciting. After a short time Intel dropped the "I", so that today only the name SSE is used.

Further development of SSE

The long-running competition between AMD and Intel over who gets to define the further development of the x86 architecture has led to mutually incompatible extensions of SSE since around SSE3.

SSE2, SSE3, SSSE3, SSE4, SSE4a and SSE5 are later extensions, or proposed extensions, of SSE from AMD and Intel. Further lines of development now exist in the form of the Advanced Vector Extensions (AVX), XOP and CVT16.

Technical structure

The eight 128-bit wide and dedicated SSE registers, named XMM0 to XMM7

The SSE instruction set extension originally comprised 70 instructions and 8 new registers (XMM0 to XMM7). Both the number of registers and the number of instructions were increased in later extensions.

Like AMD's 3DNow! extension, SSE is designed primarily for floating-point operations. With the Pentium III, however, Intel introduced new 128-bit registers, so that SSE instructions can process twice as much data in parallel as 3DNow!, which is based on 64-bit registers. On the processors of that time this did not translate into higher computational throughput, because each 128-bit SSE instruction was split internally into two 64-bit micro-ops: the execution units and their data paths were only 64 bits wide.
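The idea of processing several floating-point values with one instruction can be illustrated with a minimal C sketch. It assumes a compiler that provides the SSE intrinsics header xmmintrin.h; the function name add_floats and the macro DEMO_HAVE_SSE are hypothetical names chosen for this illustration, and the code falls back to a plain scalar loop when SSE is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Use SSE intrinsics only when the compiler targets an SSE-capable CPU. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   /* __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#define DEMO_HAVE_SSE 1
#else
#define DEMO_HAVE_SSE 0
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for n single-precision floats. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    assert(a != NULL && b != NULL && out != NULL);

#if DEMO_HAVE_SSE
    /* One 128-bit XMM register holds four 32-bit floats, so each
       iteration performs four additions with a single packed add. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif

    /* Scalar tail for leftover elements; also the complete fallback
       when the SSE path is compiled out. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Calling add_floats on two arrays of six floats would handle the first four elements in one packed addition and the remaining two in the scalar loop. The unaligned load and store variants are used here so that the sketch makes no assumption about the alignment of the caller's arrays.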

In current 64-bit processors, such as those based on the Core microarchitecture, a 128-bit SSE register is actually processed in a single step. The number of SSE registers has also been increased to 16; the additional registers are named XMM8 to XMM15, following the existing naming scheme.

CPU support

Since SSE was one of the first SIMD extensions to the x86 architecture and came onto the market in 1999, practically all x86 CPUs since the mid-2000s have supported it.

AMD, for example, supported some of the SSE instructions (including those that work with 64-bit registers) starting with the Athlon; this subset is also referred to as an extension of MMX. SSE has been fully supported since the Athlon XP processor, to the point that AMD has since abandoned its own 3DNow! extension.
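Because support differs between processor generations, software can query the CPU at run time before using SSE instructions. The following C sketch reads the feature flags of CPUID leaf 1. It assumes GCC or Clang on an x86 target, where the header cpuid.h provides __get_cpuid; the struct sse_support, the function detect_sse and the macro DEMO_HAVE_CPUID_H are hypothetical names for this illustration. On other compilers or architectures the sketch simply reports no support.

```c
#include <stdbool.h>

/* cpuid.h is a GCC/Clang header and only meaningful on x86 targets. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define DEMO_HAVE_CPUID_H 1
#else
#define DEMO_HAVE_CPUID_H 0
#endif

/* Hypothetical result type: one flag per SSE generation. */
struct sse_support {
    bool sse, sse2, sse3, ssse3, sse4_1, sse4_2;
};

/* Hypothetical helper: report which SSE generations the CPU advertises. */
struct sse_support detect_sse(void)
{
    struct sse_support s = { false, false, false, false, false, false };

#if DEMO_HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1 returns the feature flags in EDX and ECX;
       __get_cpuid returns 0 if the leaf is not supported. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        s.sse    = ((edx >> 25) & 1u) != 0;  /* EDX bit 25: SSE    */
        s.sse2   = ((edx >> 26) & 1u) != 0;  /* EDX bit 26: SSE2   */
        s.sse3   = ((ecx >> 0)  & 1u) != 0;  /* ECX bit 0:  SSE3   */
        s.ssse3  = ((ecx >> 9)  & 1u) != 0;  /* ECX bit 9:  SSSE3  */
        s.sse4_1 = ((ecx >> 19) & 1u) != 0;  /* ECX bit 19: SSE4.1 */
        s.sse4_2 = ((ecx >> 20) & 1u) != 0;  /* ECX bit 20: SSE4.2 */
    }
#endif

    return s;
}
```

A program could call detect_sse once at startup and select an SSE or a scalar code path accordingly. Note that these flags only describe what the processor advertises; the sketch does not check whether the operating system saves and restores the XMM registers.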

Below is an overview of the CPU family from which the respective manufacturers have integrated SSE:

References

  1. 3DNow! Instructions are Being Deprecated (English). developer.amd.com. Archived from the original on November 9, 2013 in the Internet Archive.
  2. Agner Fog: Stop the instruction set war (English). agner.org. December 5, 2009. Retrieved May 12, 2012.
  3. Jon Stokes: Into the Core: Intel's next-generation microarchitecture (English). arstechnica.com. April 5, 2006. Archived from the original on April 1, 2007. Retrieved May 12, 2012.