Perfect hash function

A perfect hash function is a hash function that maps different elements from a finite and fixed set of keys to different elements from an image set (no collisions, injectivity ). The injectivity has an important advantage: In the worst case, an element of a hash table that was created with a perfect hash function can be accessed in constant time . ${\ displaystyle h \ colon S \ rightarrow T}$ ${\ displaystyle x \ neq x '}$ ${\ displaystyle S}$ ${\ displaystyle h (x) \ neq h (x ')}$ ${\ displaystyle T}$

A perfect hash function is called minimal if , i. H. . This means that the image set of the function has as many elements as the original image set. In practice, this reduces the memory requirements of the array , the elements for each of stores, to the minimum. ${\ displaystyle T = \ {0, \ dotsc, \ left \ vert S \ right \ vert -1 \}}$ ${\ displaystyle \ left \ vert S \ right \ vert = \ left \ vert T \ right \ vert \,}$ ${\ displaystyle h (s)}$ ${\ displaystyle s \ in S}$

In contrast to imperfect hashing, which requires amortized access time and in the worst case , perfect hashing offers access to the elements in constant time even in the worst case , so it is significantly faster. This is achieved by storing the values of the keys in an array indexed from to at position ; In contrast to normal hashing, each bucket contains only exactly one element due to the injectivity of . For this you pay with computing time to construct the hash function and you need more storage space. ${\ displaystyle {\ mathcal {O}} (1)}$ ${\ displaystyle {\ mathcal {O}} (\ log n)}$ ${\ displaystyle {\ mathcal {O}} (1)}$ ${\ displaystyle s}$ ${\ displaystyle 0}$ ${\ displaystyle \ left \ vert T \ right \ vert -1}$ ${\ displaystyle h (s)}$ ${\ displaystyle h}$

In practice, one looks for hash functions with the following properties:

Construction in time, d. H. The construction time increases linearly as the number of keys increases . ${\ displaystyle {\ mathcal {O}} (n)}$ ${\ displaystyle \ left \ vert S \ right \ vert}$

Evaluation in , d. H. after construction, a key can be mapped to an index in constant time .

{\ displaystyle {\ mathcal {O}} (1)}

{\ displaystyle s \ in S}

{\ displaystyle h (s)}

The hash function requires as little memory as possible.

The hash function should be minimally perfect.

Currently common minimally perfect hash functions work in time to construction and require at least 1.56 bits per key. ${\ displaystyle O (n)}$

(Minimal) perfect hash functions are appropriate in practice when:

there is a fixed set of keys to which values are assigned (with constantly changing sets of keys, constant new construction would be too time-consuming), ${\ displaystyle S}$
there is enough time to construct the hash function,
constant access to the values is required,
additional memory is available for the hash function.

Individual evidence

^ Emmanuel Esposito, Thomas Mueller Graf, Sebastiano Vigna: RecSplit: Minimal Perfect Hashing via Recursive Splitting. 2020 Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX), 2020, accessed on January 23, 2020 .

[1] Emmanuel Esposito, Thomas Mueller Graf, Sebastiano Vigna: RecSplit: Minimal Perfect Hashing via Recursive Splitting. 2020 Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX), 2020, accessed on January 23, 2020 .