Integer (data type)

from Wikipedia, the free encyclopedia

With Integer ([ ɪnteɡɐ ] English [ ɪntɪdʒə ], for integer ; of Latin numerus integer ) is used in the computer science a data type referred to, which stores integer values. The range of values ​​is finite. Calculations with integers are usually exact. Only an overflow can occur if the permissible value range is exceeded. As a basic arithmetic data type, integers are present in the hardware of almost all computer systems and are available in almost every programming language . Usually several types of integers are provided, which differ in their representation, length or the presence of a sign . The implemented arithmetic with integers has not yet been standardized and often has language-dependent ( JavaC ) or even compiler -dependent (C - sequence of evaluation of expressions) peculiarities. One attempt at standardization is with the “Language Independent Arithmetic” (ISO / IEC 10967).

Representations

Apart from exotic representations, there are three options for storing integer variables. The sign - if present - can be read from a certain number in all representations.

In the signed amount representation, the sign and the amount are stored and processed separately.

In the case of b -complement numbers ( two's complement numbers) exactly half the subset of the numbers with a large absolute value is interpreted as negative numbers, without the arithmetic of positive numbers being significantly changed. This leads to simple circuits and a simple rule for sign changes (digit-wise b -complement and subsequent increase in the number). There is no difference between two's complement arithmetic and purely positive binary numbers. There is a bijective mapping between the representations and the values ​​(no value has two representations). One can interpret b -complement numbers like technical counters (odometer in the car). The disadvantage of b -complement numbers is that the smallest negative value has no positive counterpart in the representation.

In the case of (b − 1) -complement numbers ( one's complement numbers ), on the other hand, the rule for changes in sign is simplified (the subsequent increase is omitted) and more case distinctions must be taken into account in arithmetic and, above all, two representations of zero (± 0)

In modern computing systems the base b is practically without exception b = 2 and the representation in two's complement has largely established itself.

Decimal equivalents to the two's and one's complement in the binary system would be tens and nine's complement numbers.

Some manufacturers often use a decimal format based on customer requests (banks). Almost without exception, a signed amount is chosen and the amount is stored in the so-called BCD form ( binary coded decimal ). The request is justified, as rounding errors occur when converting a decimal number into an integer or back, which makes accurate commercial accounting impossible.

Overview

Two's complement One's complement Signed amount representation BCD numbers
Base 2 2 2 10
Uniqueness reversibly unambiguous 2 representations for the same value (± 0) 2 representations for the same value (± 0) Representations of no value
Range of values maximal, asymmetrical symmetrical symmetrical symmetrical

Examples

(the example numbers are designed for 9 bits, as two-digit BCD numbers are possible, representation MSB  →  LSB ):

Two's complement One's complement Signed amount representation BCD numbers
maximum 0 1111 1111 (255) 0 1111 1111 (255) 0 1111 1111 (255) 0 1001 1001 (99)
17th 0 0001 0001 0 0001 0001 0 0001 0001 0 0001 0111
5 0 0000 0101 0 0000 0101 0 0000 0101 0 0000 0101
1 0 0000 0001 0 0000 0001 0 0000 0001 0 0000 0001
0 0 0000 0000 0 0000 0000 0 0000 0000 0 0000 0000
−0 1 1111 1111 1 0000 0000 1 0000 0000
−1 1 1111 1111 1 1111 1110 1 0000 0001 1 0000 0001
−2 1 1111 1110 1 1111 1101 1 0000 0010 1 0000 0010
−5 1 1111 1011 1 1111 1010 1 0000 0101 1 0000 0101
−17 1 1110 1111 1 1110 1110 1 0001 0001 1 0001 0111
Minimum + 1 1 0000 0001 (−255) 1 0000 0001 (−254) 1 1111 1110 (−254) 1 1001 1000 (−98)
minimum 1 0000 0000 (−256) 1 0000 0000 (−255) 1 1111 1111 (−255) 1 1001 1001 (−99)

Common forms of storage

An integer usually consists of 8, 16, 32, 64 or 128 bits (i.e. 1, 2, 4, 8 or 16 bytes ) - according to the word length of the respective CPU . Historically, other values ​​(12, 48, ... bits) were also used. In programming languages ​​the designations of these numbers are partially standardized: In Java they are called byte (8), short (16), int (32) and long(64 bit). In C there are the same identifiers for types, but their size varies depending on the architecture. For this purpose, C supports unsigned ( unsigned) integer variants, with which many older and also such processors work exclusively or primarily (directly), as they are still used today in microcontrollers and embedded systems . It was not until C99 that platform-independent types were specified which are clearly defined in bits according to their explicit word length, e.g. B. for the one byte wide: int8_tresp. uint8_t.

Computing systems usually process integers faster than floating point numbers , because often fewer bits have to be processed (the smallest IEEE 754 floating point number is 32 bits) and there is no processing of the exponent, which saves computing time and memory space. In addition, pure fixed-point arithmetic (integer-based) offers the advantage of precise processing (within fixed dynamic limits ) over floating-point arithmetic ; data-dependent effects such as denormalization or absorption do not occur. For this reason, pure integer processing is often required for internal software in financial institutions. B. is implemented at GnuCash .

When storing in the memory, in addition to the need to store the bits of the number representation at all, there is also the problem of the byte order and arrangement.

Maximum range of values

Size
(bit)
Typical names sign Limits of the range of values ​​( two's complement ) Decimal places
(unsigned)
min Max
8th char, byte / byte, modern: int8_t or uint8_t signed −128 127 3
unsigned 0 255 3
16 Word, Short / short, Integer, modern: int16_t or uint16_t signed −32,768 32,767 5
unsigned 0 65,535 5
32 DWord / Double Word, int, long (Windows on 16/32/64-bit systems; Unix / Linux / C99 on 16/32-bit systems), modern: int32_t or uint32_t signed −2,147,483,648 2,147,483,647 10
unsigned 0 4,294,967,295 10
64 Int64, QWord / Quadword, long long, Long / long (Unix / Linux / C99 on 64-bit systems), modern: int64_t or uint64_t signed −9,223,372,036,854,775,808 9,223,372,036,854,775,807 19th
unsigned 0 18,446,744,073,709,551,615 20th
128 Int128, Octaword, Double Quadword signed ≈ −1.70141 10 38 ≈ 1.70141 · 10 38 39
unsigned 0 ≈ 3.40282 · 10 38 39
n BigInteger signed −2 n − 1 2 n − 1 - 1 ⌈Log 10 2 n − 1
unsigned 0 2 n - 1 ⌈Log 10 2 n

Arithmetic overflow

Overflow for unsigned integers (bit length 3)
Overflow with integer numbers (bit length 3 + 1)

If an integer variable is assigned a value outside its value range, this leads to an arithmetic overflow . So z. B. with an unsigned 8-bit integer variable from 255 + 1 the value 0; with a signed one in two's complement, however, from 127 + 1 the value −128.

See also

literature

References and comments

  1. ISO / IEC 10967 in the English language Wikipedia
  2. Knuth: Volume 2. S. 195, 4.1 Positional number systems; P. 284, 4.3.2 Modular arithmetic
  3. ^ David Goldberg: What Every Computer Scientist Should Know About Floating-Point Arithmetic . In: ACM Computing Surveys . 23, 1991, pp. 5-48. doi : 10.1145 / 103162.103163 . Retrieved September 2, 2010.
  4. What's new in GnuCash 1.6? . gnucash.org. Retrieved September 3, 2010.
  5. a b c Agner Fog: Calling conventions for different C ++ compilers and operating systems: Chapter 3, Data Representation (PDF; 416 kB) February 16, 2010. Accessed August 30, 2010.
  6. ^ Eric Giguere: The ANSI Standard: A Summary for the C Programmer . December 18, 1987. Retrieved September 4, 2010.
  7. Randy Meyers: The New C: Integers in C99, Part 1 . drdobbs.com. December 1, 2000. Retrieved September 4, 2010.