F16C

F16C (formerly CVT16) is an instruction set extension for microprocessors from Intel and AMD that simplifies the conversion between floating-point numbers of different precision.

History

On May 1, 2009, AMD announced this instruction set extension under the name CVT16. It provides some instructions of the SSE5 extension in a revised form and acts as a bridge to Intel's AVX extension. Intel has also used these instructions since 2012.

Function

The instruction set extension facilitates the conversion of half-precision (16-bit) floating-point numbers to single-precision (32-bit) floating-point numbers and vice versa. In the eight-value variants, this also involves moving data between an XMM register and a YMM register.
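As a rough illustration of what this conversion means numerically (not of the instructions themselves), the following Python sketch uses the standard `struct` module, whose `e` and `f` format codes correspond to IEEE 754 half and single precision. The helper names `to_half` and `to_single` are hypothetical and chosen only for this example.

```python
import struct


def to_half(value: float) -> float:
    """Round a value to half precision (16 bits) and return it as a Python float."""
    # Packing with 'e' rounds to the nearest representable half-precision value.
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value: float) -> float:
    """Round a value to single precision (32 bits) and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Narrowing to half precision generally loses information ...
h = to_half(0.1)
print(h, h == 0.1)

# ... whereas widening a half-precision value to single precision is exact,
# because every half-precision value is also a single-precision value.
print(to_single(h) == h)
```

The sketch only models the number formats. It says nothing about how values are packed into registers.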

Technical information

Each instruction exists in two variants: one converts four floating-point values using XMM registers only, the other converts eight floating-point values using an XMM register and a YMM register. The instruction names VCVTPH2PS and VCVTPS2PH are short for "vector convert packed half to packed single" and vice versa.

VCVTPH2PS xmmreg, xmmrm64 converts four half precision floating point values in memory or in the lower half of an XMM register to four single precision floating point values in an XMM register.
VCVTPH2PS ymmreg, xmmrm128 converts eight half precision floating point values in memory or an XMM register (the lower half of a YMM register) to eight single precision floating point values in a YMM register.
VCVTPS2PH xmmrm64, xmmreg, imm8 converts four single precision floating point values in an XMM register to half precision floating point values in memory or the lower half of an XMM register.
VCVTPS2PH xmmrm128, ymmreg, imm8 converts eight single precision floating point values in a YMM register to eight half precision floating point values in memory or an XMM register.
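The operand widths of the four forms listed above can be sketched as a small software model in Python. This is only an illustration under simplifying assumptions: the function names are hypothetical, registers and memory operands are represented as little-endian byte strings, rounding is always to nearest, and values too large for half precision raise `OverflowError` in `struct` instead of following the hardware behavior.

```python
import struct


def vcvtph2ps_model(src: bytes) -> bytes:
    """Model of VCVTPH2PS: 64 or 128 bits of halves -> 128 or 256 bits of singles."""
    if len(src) not in (8, 16):
        raise ValueError("source must hold four or eight half-precision values")
    count = len(src) // 2  # two bytes per half-precision value
    halves = struct.unpack(f"<{count}e", src)
    return struct.pack(f"<{count}f", *halves)


def vcvtps2ph_model(src: bytes) -> bytes:
    """Model of VCVTPS2PH: 128 or 256 bits of singles -> 64 or 128 bits of halves."""
    if len(src) not in (16, 32):
        raise ValueError("source must hold four or eight single-precision values")
    count = len(src) // 4  # four bytes per single-precision value
    singles = struct.unpack(f"<{count}f", src)
    # Assumption: round to nearest only; the imm8 rounding control is not modeled.
    return struct.pack(f"<{count}e", *singles)


# Four halves (64 bits) widen to four singles (128 bits, one XMM register).
four_halves = struct.pack("<4e", 1.0, -2.5, 0.5, 65504.0)
four_singles = vcvtph2ps_model(four_halves)
print(len(four_halves) * 8, "->", len(four_singles) * 8, "bits")

# Eight halves (128 bits) widen to eight singles (256 bits, one YMM register).
eight_halves = struct.pack("<8e", *range(8))
print(len(eight_halves) * 8, "->", len(vcvtph2ps_model(eight_halves)) * 8, "bits")
```

Converting the widened values back with `vcvtps2ph_model` reproduces the original bytes, since every half-precision value survives the round trip unchanged.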

The 8-bit immediate argument imm8 of VCVTPS2PH specifies the rounding mode. The values 0 to 3 select the mode directly (round to nearest / round down / round up / truncate), while the value 4 (bit 2 set) takes the rounding mode from MXCSR.RC instead. Support for these instructions is indicated by bit 29 of the ECX register after a CPUID query with EAX = 1.
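The two bit-level rules in this paragraph can be sketched in Python as follows. The sketch assumes the imm8 layout described in Intel's instruction reference (bits 1:0 hold the rounding mode, bit 2 selects MXCSR.RC, the upper bits are ignored). The function names are hypothetical, and the ECX value is passed in as a plain integer rather than read from the processor.

```python
ROUNDING_MODES = {
    0b00: "round to nearest (even)",
    0b01: "round down (toward negative infinity)",
    0b10: "round up (toward positive infinity)",
    0b11: "truncate (toward zero)",
}


def decode_imm8(imm8: int) -> str:
    """Describe the rounding behavior selected by the imm8 operand of VCVTPS2PH."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    if imm8 & 0b100:
        # Bit 2 set: the rounding mode comes from the MXCSR.RC field.
        return "use MXCSR.RC"
    # Bit 2 clear: bits 1:0 select the rounding mode directly.
    return ROUNDING_MODES[imm8 & 0b11]


def has_f16c(cpuid_1_ecx: int) -> bool:
    """Check bit 29 of the ECX value returned by CPUID with EAX = 1."""
    return bool((cpuid_1_ecx >> 29) & 1)


for value in range(5):
    print(value, decode_imm8(value))

# Example ECX values made up for illustration: only bit 29 differs.
print(has_f16c(1 << 29), has_f16c(0))
```

Obtaining the real ECX value requires executing CPUID, for example through a compiler intrinsic, which is outside the scope of this sketch.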

References

Chuck Walbourn: DirectXMath: F16C and FMA. In: Microsoft Developer Network. September 11, 2012, accessed January 11, 2017.
R. L. Uy: Beyond multi-core: A survey of architectural innovations on microprocessor. In: 2014 International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM). November 1, 2014, pp. 1–6, doi:10.1109/HNICEM.2014.7016212.
Daniel Kusswurm: Modern X86 Assembly Language Programming. Apress, 2014, ISBN 978-1-4842-0064-3, pp. 342 ff., doi:10.1007/978-1-4842-0064-3.
