Off-by-one error

from Wikipedia, the free encyclopedia

An off-by-one error (in German literally "by-one-beside error" or "plus-minus-one syndrome"; abbreviated OBOE), jokingly also called an "Obi-Wan error" because it sounds similar, or ±1 problem, is a programming error in which a number, or a number of steps carried out, is too large or too small by 1. For example, the size of a memory block is wrong by 1, or a buffer is written one step too far into memory, overwriting the memory of another buffer or a variable.

General

Off-by-one errors often occur when handling arrays, vectors, lists, strings or other indexable data types.

Off-by-one errors are easy to make and very difficult to find, especially since they often become noticeable only under very specific conditions. They are also easily overlooked when reading the source code. A further complication is that indices or offsets in the source code are usually formed from variables or formulas. Checks by compilers, interpreters or possibly operating systems that detect a buffer limit being exceeded by even a single byte likewise take effect only in the special case in which the entire reserved buffer is actually used.

Once an off-by-one error has been found and located, it is usually very easy to fix.

Common causes (selection)

Various mistakes can ultimately lead to an off-by-one error. Examples:

  • A common source of error is that in programming, counting often starts at 0 rather than 1 (zero-based numbering). For an array with 10 elements, this means that the elements have the indices 0 to 9, not 1 to 10.
  • Another common source of error is the use of zero bytes (null bytes), especially with character strings, i.e. text. The zero byte is a character with the value 0 that does not occur in text and marks the end of a string, while the actual buffer for the string can be many times larger. As a result, the buffer size does not have to be changed every time the content changes, and the length of the string does not have to be stored separately. Because of the zero byte, however, a string always occupies one character more in memory than its length. For example, the string "Hello" is 5 characters long but requires 6 characters in memory. Functions that determine and return the length of a string do not count the zero byte.
  • Off-by-one errors also often result from so-called fence post errors, i.e. from confusing the number of intervals with the number of items when indexing.
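
The last two points can be sketched in C as follows. The helper names duplicate_string and posts_needed are hypothetical and serve only as illustration: the first reserves strlen(s) + 1 bytes so that the terminating zero byte fits, the second records that a straight fence with n sections needs n + 1 posts.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL on failure.
   strlen() does not count the terminating zero byte, so the buffer
   must be one byte larger than the reported length. */
char *duplicate_string(const char *s)
{
    size_t length = strlen(s);          /* "Hello" -> 5 */
    char *copy = malloc(length + 1);    /* 6 bytes: 5 characters + '\0' */
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, length + 1);        /* copies the zero byte as well */
    return copy;
}

/* Hypothetical helper: a straight fence with `sections` gaps between
   posts needs one more post than it has sections. */
unsigned posts_needed(unsigned sections)
{
    return sections + 1;
}
```

The caller is responsible for releasing the copy with free(). Allocating only strlen(s) bytes would be the classic off-by-one mistake: the copy of the zero byte would land one byte past the buffer.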

Consequences

In a typical off-by-one error, the wrong termination or continuation condition is chosen for a loop that is supposed to process an array, so that the statement in the loop body that accesses the array by index is executed once too often or once too rarely. The loop then either attempts to access an array element that does not exist, or omits the last (or first) element. In the first case, an index-out-of-upper-range error (or similar) is often the noticeable consequence. In the second case, sometimes no error is visible as long as the entire buffer is not used, or an index-out-of-lower-range error is reported.

An off-by-one error can crash the program if important data (e.g. a pointer to a structure) lies in memory directly after the buffer and is overwritten by the loop. In principle, program code can also follow a buffer in main memory; accidentally overwriting it usually crashes the program as well, since the data does not correspond to a valid machine instruction.
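
A minimal sketch of the "once too rarely" case, using hypothetical function names. sum_all uses the correct condition i < count; sum_missing_last ends one iteration early and silently omits the last element, which yields a wrong result but no visible error. The variant that runs one iteration too long is deliberately not shown as runnable code, because reading past the end of an array is undefined behavior in C.

```c
#include <stddef.h>

/* Correct: visits the indices 0 .. count - 1. */
int sum_all(const int *values, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* Off by one: the condition i + 1 < count ends the loop one step
   early, so values[count - 1] is never added. */
int sum_missing_last(const int *values, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i + 1 < count; i++)
        sum += values[i];
    return sum;
}
```

For the array {1, 2, 3, 4}, sum_all returns 10 and sum_missing_last returns 6.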

Examples

Example in C:

int nettopreise[10];
int i;

/* initialize nettopreise (net prices) */
...

for (i = 0; i <= 10; i++)
    nettopreise[i] = nettopreise[i] * 1.19; // add VAT

In this case the condition should read i < 10 and not i <= 10: the declaration specifies 10 as the array size, but because C is zero-based, the maximum index is 9.
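
A corrected version might look like the following sketch. The constant PRICE_COUNT and the function add_vat are hypothetical names introduced here; deriving the loop bound from the same constant as the array size keeps the two from drifting apart. As a further assumption, the sketch uses integer arithmetic (times 119, divided by 100) instead of the floating-point factor 1.19, so results are truncated to whole units.

```c
#include <stddef.h>

#define PRICE_COUNT 10   /* hypothetical constant: number of array elements */

/* Adds 19 % VAT to every element; valid indices are 0 .. PRICE_COUNT - 1. */
void add_vat(int nettopreise[PRICE_COUNT])
{
    for (size_t i = 0; i < PRICE_COUNT; i++)   /* i < 10, not i <= 10 */
        nettopreise[i] = nettopreise[i] * 119 / 100;
}
```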

This type of error often results from the fact that people count from 1 to N, whereas array indices in many programming languages run from 0 to N − 1. In addition, strict and non-strict comparison operators (such as < and <=) are easily confused. The termination condition can also be more complicated, which makes off-by-one errors at this point more likely.

The case in which a data structure starts at 1 but loop counters over it start at 0 is particularly tricky.
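
As an illustration, assume a hypothetical table whose slot 0 is unused and whose data occupies the slots 1 to count. The habitual C loop from 0 to count - 1 would read the unused slot and miss the last element; the loop has to follow the convention of the data structure instead.

```c
#include <stddef.h>

/* Hypothetical 1-based table: slot 0 is unused, the data occupies
   the slots 1 .. count, so the array has count + 1 elements. */
int sum_one_based(const int *table, size_t count)
{
    int sum = 0;
    for (size_t i = 1; i <= count; i++)   /* not: i = 0; i < count */
        sum += table[i];
    return sum;
}
```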

An off-by-one error can also easily occur if, for range limits, it is not taken into account whether the lower and upper bounds are inclusive or exclusive. For example, the substring method in Java returns the portion of a string that includes the lower bound but not the upper bound.

For example, to extract the substring “bar” from the word “Foobar” by counting letters, it is easy to get the upper limit wrong even when counting correctly starts at 0. Since “bar” consists of the letters at the indices 3, 4 and 5, one is tempted to call substring(3, 5). The result would be only “ba”; the correct call is substring(3, 6).

To avoid this problem, the substr functions of other programming languages such as C++, JavaScript or PHP take the start index and the length of the desired string as parameters, instead of the start index and end index.
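
The two parameter conventions can be compared in a short C sketch. Both helpers, copy_range and copy_length, are hypothetical and not part of any standard library. With an exclusive upper bound, the number of characters is simply end - start, with no + 1 correction.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the half-open range [start, end) of src
   into dst, like Java's substring(start, end). dst must have room for
   end - start characters plus the terminating zero byte. */
void copy_range(char *dst, const char *src, size_t start, size_t end)
{
    assert(start <= end && end <= strlen(src));
    size_t length = end - start;   /* exclusive end: no "+ 1" needed */
    memcpy(dst, src + start, length);
    dst[length] = '\0';
}

/* Hypothetical helper: the same operation with start index and length. */
void copy_length(char *dst, const char *src, size_t start, size_t length)
{
    copy_range(dst, src, start, start + length);
}
```

With the source string "Foobar", copy_range(out, "Foobar", 3, 5) yields "ba", while copy_range(out, "Foobar", 3, 6) and copy_length(out, "Foobar", 3, 3) both yield "bar".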
