Division remainder method

from Wikipedia, the free encyclopedia

The division remainder method (see also modulo) provides a hash function.

The function is:

h(k) = k mod m

where k is the key and m is the size of the hash table.
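As a minimal sketch, assuming non-negative integer keys, the function could be written in Python as follows. The name `division_hash` and the sample keys are hypothetical and chosen only for illustration.

```python
def division_hash(k: int, m: int) -> int:
    """Return the slot index k mod m for a hash table with m slots."""
    if m <= 0:
        raise ValueError("table size m must be positive")
    return k % m


# Distribute a few sample keys over a table with 7 slots.
keys = [10, 22, 31, 4, 15]
slots = [division_hash(k, 7) for k in keys]
print(slots)  # [3, 1, 3, 4, 1]
```

Keys 10 and 31 land in the same slot, as do 22 and 15. Such collisions have to be resolved by the hash table itself.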

Properties

  1. The hash function can be calculated very quickly.
  2. The choice of the table size m influences the collision probability of the function values of h.

For most input data, for example, choosing a power of two for m, that is m = 2^j, is unsuitable: it amounts to extracting the j least significant bits of k, so all more significant bits are ignored in the hash calculation.
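The effect can be checked with a short Python sketch. It assumes a table size m = 2^j with j = 4 and a handful of made-up keys that differ only in their higher bits. The helper name `division_hash` is hypothetical.

```python
def division_hash(k: int, m: int) -> int:
    """Division remainder method: k mod m."""
    return k % m


j = 4
m = 2 ** j       # table size 16, a power of two
mask = m - 1     # binary 1111, selects the j lowest bits

# Sample keys that share the low 4 bits (hex digit 3) but differ above them.
keys = [0x1A3, 0x2B3, 0xFF3, 0x003]

# k mod 2^j is the same as masking out the j least significant bits.
for k in keys:
    assert division_hash(k, m) == (k & mask)

hashes = [division_hash(k, m) for k in keys]
print(hashes)  # [3, 3, 3, 3]
```

All four keys collide because the bits that distinguish them never enter the calculation.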

For practical applications, choosing a prime number for m that is not a Mersenne prime (a prime of the form 2^j - 1) gives a low expected number of collisions for many input data distributions.
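The following Python sketch illustrates this for one assumed input pattern, keys that are all multiples of 8. It compares a power-of-two table size with a nearby prime. The pattern, the sizes 64 and 61, and the helper name `used_slots` are illustrative choices, and a single pattern says nothing about other distributions.

```python
def used_slots(keys, m: int) -> int:
    """Count the distinct table slots hit by the keys under k mod m."""
    return len({k % m for k in keys})


# 100 keys, all multiples of 8 (an assumed, deliberately regular pattern).
keys = range(0, 800, 8)

print(used_slots(keys, 64))  # power of two: only 8 slots are ever used
print(used_slots(keys, 61))  # prime, not a Mersenne prime: all 61 slots are used
```

With m = 64 the keys fall into just 8 of the 64 slots, while with m = 61 they spread over every slot.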

Hashing of strings

Strings can be hashed using the division method by converting them to integers in base b, where b denotes the size of the character set.

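To avoid integer overflows, the Horner scheme can be used to calculate the hash value for keys k. The following example shows the calculation of a hash value for a 7-bit ASCII character string s = (s_1, ..., s_l):

h(s) = (s_1 * 128^(l-1) + s_2 * 128^(l-2) + ... + s_l) mod m

The Horner scheme evaluates this step by step and reduces after every step: h_0 = 0, h_i = (h_(i-1) * 128 + s_i) mod m for i = 1, ..., l, and h(s) = h_l.

Since every h_(i-1) is at most m - 1 and every character code is at most 127, the largest intermediate result that can occur is (m - 1) * 128 + 127.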

In pseudocode:

Parameters: natural numbers i, h = 0; table size m; array s
 for i = 0 to length_of(s) - 1
	h = (h * 128 + s[i]) mod m
Result: h.
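The pseudocode above could be written in Python roughly as follows. This is a sketch: the function name `string_hash` and the table size 101 are hypothetical. Python integers do not overflow, so reducing in every step matters mainly in languages with fixed-width integers.

```python
def string_hash(s: str, m: int) -> int:
    """Hash a 7-bit ASCII string with Horner's scheme, reducing mod m each step."""
    h = 0
    for ch in s:
        code = ord(ch)
        if code > 127:
            raise ValueError("only 7-bit ASCII characters are supported")
        # Before the reduction the value is at most (m - 1) * 128 + 127.
        h = (h * 128 + code) % m
    return h


m = 101  # hypothetical prime table size
print(string_hash("AB", m))  # 3

# Reducing in every step gives the same result as reducing the full
# base-128 number once at the end: "A" = 65, "B" = 66.
assert string_hash("AB", m) == (65 * 128 + 66) % m
```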

The multiplication by 128 = 2^7 corresponds to a left bit shift by 7 positions (<< 7).
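A brief Python check of this equivalence, using two hypothetical helper functions for a single Horner step and an arbitrary small table size:

```python
def step_multiply(h: int, c: int, m: int) -> int:
    """One Horner step written with a multiplication by 128."""
    return (h * 128 + c) % m


def step_shift(h: int, c: int, m: int) -> int:
    """The same step written with a left shift by 7 bits."""
    return ((h << 7) + c) % m


m = 101  # arbitrary table size for the check
# Compare both variants for every possible previous hash value
# and every 7-bit character code.
assert all(
    step_multiply(h, c, m) == step_shift(h, c, m)
    for h in range(m)
    for c in range(128)
)
```

Whether the shift is actually faster than the multiplication depends on the language and compiler. Many compilers perform this rewrite on their own.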

References

  1. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein: Introduction to Algorithms. 2nd edition. MIT Press, Cambridge MA 2001, ISBN 0-262-03293-7, p. 231.