Cryptographic hash function

from Wikipedia, the free encyclopedia

A cryptological hash function or cryptographic hash function is a special form of a hash function ( hash function) that is collision-resistant . So it is practically impossible to find two different input values ​​that result in an identical hash value.

Areas of application for cryptological or cryptographic hash functions are primarily integrity checks of files or messages, the secure storage of passwords and digital signatures . They can also be used as pseudo- random number generators or for the construction of block ciphers .

Classification

A hash function is a mapping that efficiently maps a character string of any length (input value) to a character string of fixed length (hash value). This necessarily means that the same hash value is assigned to different input values, so the mapping is not injective . It is basically possible to find two input values ​​that result in the same hash value.

Cryptological hash functions are divided into keyless and key-dependent: A keyless hash function only has one input value, while a key-dependent hash function requires a secret key as a second input value.

The keyless hash functions are further divided into one-way hash functions (English One-Way Hash Function , short OWHF) and collision-resistant hash function (English Collision Resistant Hash Function , short CRHF).

A one-way hash function fulfills the following conditions:

  1. Way function (English: preimage resistance ): It is impossible in practice to a given output value an input value to find the the hash function to maps: .
  2. Weak collision resistance (English: 2nd-preimage resistance ): It is virtually impossible for a given value a dissimilar value to find the same hash result: .

For a collision-resistant hash function, the following also applies:

  1. Strong collision resistance : It is practically impossible to find two different input values and that result in the same hash value. The difference to the weak collision resistance is that both input values and can be freely selected here.

You can also request a resistance to near-collision resistance . This is supposed to be virtually impossible, two different input values and to find the hash values and differ only in a few bits.

The key-dependent hash functions include Message Authentication Codes (MAC). These include constructs such as HMAC , CBC-MAC or UMAC.

construction

The Merkle-Damgård construction generates the hash value from the message blocks by repeatedly applying the compression function

Most hash functions developed before 2010 follow the Merkle-Damgård construction. As a result of the SHA-3 competition, the Merkle-Damgård construction was replaced or extended by various construction methods.

Merkle-Damgård method

In the Merkle-Damgård construction , a compression function is iterated. A compression function takes two bit sequences of lengths and as input and outputs a bit sequence of length . In addition, it is a one-way function , so it should be difficult to find appropriate input values ​​for a given output. A block cipher is often used as a compression function, the entries are then used as a message and key.

In the Merkle-Damgård construction, the input message M is first divided into blocks of fixed length and padded with additional bits so that the input length is an integral multiple of the block length. The compression function has a message block as input and the output of the previous message blocks. The hash value of the entire message is the hash value of the last block:

IV denotes an initial value .

The compression function can be constructed in several ways.

Compression functions based on bit operations

The compression function can be specially developed for a hash function and then consists of simple operations directly on the message bits. This class includes: B. the MD4 family including SHA and RIPEMD .

Compression functions based on a block cipher

A distinction is made here between hash functions whose hash value has the same length as the block length and those whose hash value has twice the block length.

  1. Hash length equals block length:
    Let the encryption with a block cipher under the key , the message blocks and the outputs of the compression function. Three common constructions are as follows:
    • Matyas-Meyer-Oseas variant:
    • Davies-Meyer variant:
    • Miyaguchi Preneel variant:
  2. Hash length with double block length: These include MDC-2 and MDC-4. They essentially consist of the two or fourfold application of the Matyas-Meyer-Oseas construction.

Compression functions based on algebraic structures

In order to be able to reduce the security of the compression function to a difficult problem, its operation is defined in appropriate algebraic structures. The price for provable security is a loss of speed. MASH (Modular Arithmetic Secure Hash) uses an RSA-like modulus , with and prime numbers. The core of the compression function is:

A: constant : exclusive or , : inclusive Or

Sponge method

Sponge constructions have fundamentally different properties than Merkle-Damgård constructions. The best-known representative of this class is SHA-3 .

Attacks

Attacks against hash functions can be of a general nature and only depend on the bit length of the hash value and treat the hash algorithm as a black box. You can, on the other hand, go against the compression function. With hash functions based on a block cipher, an attack can take place against the underlying block cipher. In addition, attacks on the implementation of the hash algorithm are possible.

Black box attacks

Black box attacks are attacks on hash functions for which nothing is known about how the hash function actually works. Only the length of the hash value  is assumed to be known and it is assumed that the hash values ​​are evenly distributed .

  1. Guess (English: 2nd preimage ): The attacker randomly selects a message and compares its hash value with that of a given message. The success rate with this approach is a  hash value that is one bit long.
  2. Collision attack: the attacker creates many variations of a real message and many variations of a fake message. It then compares the two quantities and looks for two messages, and , that have the same hash value. A collision is to be expected after attempts.

Attacks on the compression function

Meet-in-the-middle
The attacker creates variations of the first half of a forged message and variations of the second half. It calculates the hash values ​​forwards starting with the starting value IV and backwards from the hash result and tries to find a collision at the point of attack. That is, he must be able to invert the compression function efficiently: find a pair given such that holds .
Correcting Block Attack
The attacker replaces all but one of the blocks in a message - say the first. He then defines this variable in such a way that it supplies the desired overall hash value in the course of the chaining.
Fixed point attack
The attacker is looking for a and , so that . In this case, he can insert message blocks at this point without changing the hash value.
Differential cryptanalysis
Differential cryptanalysis is an attack on block cipher systems that can be transferred to hash functions. Here input differences and the corresponding output differences are examined. A difference of zero then corresponds to a collision.
Boomerang Attack
The boomerang attack is an extension of differential cryptanalysis. It combines two independent differential paths into one attack.
Rebound attack
The internal structure of a hash function is considered to be three-part, with E = E (bw) · E (in) · E (iw). The inbound phase is a meet-in-the-middle attack, which is followed by a differential cryptanalysis forwards and backwards.
Herding
Here the attacker creates a structure (so-called diamond structure) from numerous intermediate values. Starting from each of the intermediate values, a message can be created which results in the same hash value H. Given a message P (preimage), the attacker looks for a single block which, appended to P, yields one of the stored intermediate values ​​in the structure. The attacker then creates a sequence of message blocks that connect this intermediate value with H.

Block encryption attacks

Weaknesses in a block encryption method, which are actually irrelevant as long as the method is used for encryption, can have significant effects if it is used to construct a hash method. These would be z. B. weak keys or a complementary property.

Overview of hash functions

Snefru
was designed by Ralph Merkle in 1990 . The core of the hash function is similar to the Khafre (Merkle) block cipher system. Snefru is considered unsafe.
N hash
was developed by Nippon Telephone and Telegraph in 1990. The algorithm is similar to the block cipher system FEAL (Nippon T&T). N-hash is considered unsafe.
FFT hash
is a hash function based on the Fast Fourier Transform . It was first presented by Schnorr in 1991, but was soon cracked. A second version followed later. It is considered unsafe.
MD4
was developed by Ronald Rivest in 1990 . After three rounds it generates a 128-bit long hash value. At the beginning, the length of the message is brought to an integer multiple of 512 bits. To do this, it is filled with a “1” and a corresponding number of “0” so that is. It is appended with the length of the original message in 64-bit format. Next, the buffer is initialized. The main loop consists of three rounds with 16 steps each. Each round receives as input a 512-bit long message block and the 128-bit long buffer content. Each round uses a non-linear round function 16 times. The hash value output is the concatenation (chaining) of the last 32-bit words in the buffer. MD4 is considered unsafe.
MD5
In 1991, Rivest released an improved hashing process before a serious weakness in MD4 was uncovered.
The main changes are: MD5 has a fourth round. The fourth round has a new round function; that of the second round has been replaced by a new function. The additive constants have been redefined.
The first partial attack on MD5 in 1993 found pseudocollisions; H. For a message block, two concatenation variables V1 and V2 which differ from one another in only a few bits can be found, which produce the same output. The attack had no serious consequences, however. A new, more efficient attack took place in 2005. The authors were looking for a message pair with two blocks each, which would cause a collision after the second block had been processed. MD5 is considered unsafe.
SHA
The NIST failed in 1993 before the Secure Hash Algorithm (SHA). Two years later it was replaced by SHA-1. The only difference between SHA-1 and its predecessor is an additional 1-bit rotation.
The message is padded as with MD4. The buffer is initialized with five constants. The main loop consists of four rounds with 20 steps each.
In 1998 a differential analysis against SHA-0 and SHA-1 was carried out.
In 2002, NIST published three more variants of the algorithm that generate larger hash values. These are SHA-256, SHA-384 and SHA-512, with the appended number indicating the length of the hash value in bits.
In 2004, an improved attack on SHA-0 was described. The authors found near-collisions here, as well as collisions for a version of SHA reduced to 65 laps. A year later, the same authors report an attack on the full number of rounds of SHA-0 with a complexity of 2 51 . In the same year, an improved attack against SHA-0 with a complexity of 2 39 hash operations and against SHA-1 with a complexity of 2 69 was successful . In February 2017, the first collision for SHA-1 was published.
RIPEMD
RIPE-MD was developed in 1992 as part of the European Union's RACE Integrity Primitives Evaluation (RIPE) project. In 1996 the original hash value length was expanded from 128 to 160 bits. The variants RIPEMD-256 and RIPEMD-320 were also introduced. The message is padded as with MD4. The buffer is initialized with five constants. The main loop consists of five rounds with 16 steps each. The algorithm runs in parallel in two versions. After each block, the two output values ​​of both lines are added to the linkage variables. In the original RIPEMD, collisions could be found with a complexity so it should not be used.

HAVAL
was introduced in 1992 and is also part of the MD4 family. The messages are processed in 1024-bit long blocks. The hash value can be 128, 160, 192, 224 or 256 bits long. The number of laps can also vary from three to five. Each round consists of 16 steps.
In 2003, collisions with three laps were found for HAVAL. The attack succeeds against all possible output lengths. The complexity corresponds to 2 29 calculation steps of the compression function. HAVAL should therefore not be used for applications that require collision resistance.
tiger
was developed by Anderson and Biham in 1996. Message padding is like MD4; H. the message is appended with a “1” plus a sequence of “0” and the message length as a 63-bit word. The result is divided into 512-bit long blocks. The hash value contains 192 bits. For reasons of compatibility, TIGER / 128 or TIGER / 160 are defined that use the first 128 or 160 bits of TIGER / 192, respectively.
PANAMA
is by Daemen and Clapp and dates from 1998. It processes message blocks with a length of 256 bits and outputs a hash value with 256 bits. The buffer is a linear shift register with 32 states with eight words each.
One of the authors was able to generate collisions in only 2 6 evaluations of the update function, so that Panama cannot be considered collision-resistant.
Whirlpool
was designed by Rijmen and Barreto. It is based on the Miyaguchi-Preneel scheme.
The message is padded as with MD4. The padded message is divided into 512-bit long blocks. The hash value is 512 bits long. Whirlpool uses an AES variant in 10 rounds as a function .
SMASH
was developed by Knudsen in 2005. After the message padding at the beginning, the message is optionally processed in blocks of 256 or 512 bit long and delivers a 256 or 512 bit long hash value. The main round consists of several rounds called H-rounds and L-rounds. Three different H-rounds are defined. Each H-round contains its own S-Box (substitution table), which is based on that of the Serpent block cipher . Left or right shifts are carried out in the L-round.
SMASH was soon successfully attacked and is considered unsafe.
FORK-256
was at the Cryptographic Hash Workshop by Hong et al. presented. It processes message blocks with a length of 512 bits, subdivided into 16 words and delivers a hash value of 256 bits. The main loop consists of four branches and eight steps per branch. FORK-256 is considered unsafe.
SHA-3 (Keccak)
The design principle of SHA-3 differs from the hash functions of the MD group including SHA-2. It is a so-called sponge construction . The sponge construction is an iterative function in which the state (number of bits in the internal state) is greater than the output (output bits). This is intended to ward off generic attacks such as a collision with complexity below .
BLAKE
2008 by Jean-Philippe Aumasson, Luca Henzen, Willi Meier and Raphael C.-W. Phan developed; was one of the finalists in the SHA-3 selection process.

See also

literature

  • Alfred J. Menezes, Paul C. van Oorschot, Scott A. Vanstone: Handbook of Applied Cryptography. CRC Press, 1996, pp. 321-384.
  • Bart Preneel: Cryptographic Primitives for Information Authentication - State of the Art . State of the Art in Applied Cryptography, LNCS 1528. Springer-Verlag, 1998, pp. 49-104.
  • Douglas R. Stinson: Cryptography - Theory and Practice. Chapman & Hall / CRC, 2002, pp. 117-154.

Web links

Individual evidence

  1. ^ A b Alfred Menezes, Paul van Oorschot, Scott Vanstone: Handbook of Applied Cryptography . CRC Press, 1996, chap. 9, p. 324 ( uwaterloo.ca [PDF]).
  2. ^ Antoine Joux, Thomas Peyrin: Hash Functions and the (Amplified) Boomerang Attack . Advances in Cryptology-CRYPTO2007, LNCS4622. Springer-Verlag, 2007, pp. 244-263
  3. Florian Mendel, Christian Rechberger, Martin Schlaffer, Soren S. Thomsen: The Rebound Attack: Cryptanalysis of Reduced Whirlpool and Grostl . Fast Software Encryption, LNCS5665. Springer-Verlag, 2009, pp. 260-276
  4. John Kelsey, Tadayoshi Kohno: Herding Hash Functions and the Nostradamus Attack . ( iacr.org [PDF]).
  5. Bruce Schneier : Applied Cryptography. Addison-Wesley, 1996, pp. 491-524
  6. Serge Vaudenay: FFT hash is not yet Collision-free . Advances in Cryptology-CRYPTO 92, LNCS 740. Springer-Verlag, 1993, pp. 587-593.
  7. Claus-Peter Schnorr , Serge Vaudenay: Parallel FFT-Hashing , Fast Software Encryption, LNCS 809. Springer-Verlag, 1994, pp. 149-156.
  8. Ronald L. Rivest : The MD4 Message Digest Algorithm , Advances in Cryptology-CRYPTO 90, LNCS 537. Springer Verlag, 1991, pp. 303-311.
  9. ^ William Stallings: Cryptography and Network Security. Prentice Hall, 1999, pp. 271-297.
  10. Bert den Boer, Antoon Bosselaers: Collisions for the Compression Function of MD5 , Advances in Cryptology-EUROCRYPT 93, LNCS 765. Springer-Verlag, 1994, pp. 293-304.
  11. Xiaoyun Wang, Hongbo Yu: How to Break MD5 and Other Hash Functions , Advances in Cryptology-EUROCRYPT 2005, LNCS 3496. Springer-Verlag, 2005, pp. 19-35.
  12. Florent Chabaud, Antoine Joux: Differential Collisions in SHA-0 , Advances in Cryptology-CRYPTO 98, LNCS 1462. Springer-Verlag, 1999, pp. 56-71.
  13. ^ Eli Biham , Rafi Chen: Near-Collisions of SHA-0 , Advances in Cryptology-CRYPTO 2004, LNCS 3152. Springer-Verlag, 2005, pp. 290-305.
  14. Eli Biham, Rafi Chen et al .: Collisions of SHA-0 and Reduced SHA-1 , Advances in Cryptology-EUROCRYPTO 2005, LNCS 3494. Springer-Verlag, 2005, pp. 526-541.
  15. Xiaoyun Wang, Hongbo Yu, Yiqun Lisa Yin: Efficient Collision Search Attacks on SHA-0 , Advances in Cryptology-CRYPTO 2005, LNCS 3621. Springer-Verlag, 2006, pp. 1-16.
  16. Xiaoyun Wang, Hongbo Yu, Yiqun Lisa Yin: Finding Collisions in the Full SHA-1 , Advances in Cryptology-CRYPTO 2005, LNCS 3621. Springer-Verlag, 2006, pp. 17-36.
  17. Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov. The first collision for full SHA-1
  18. Hans Dobbertin , A. Basselaers, Bart Preneel: RIPEMD-160: A Strengthened version of RIPEMD , Fast Software Encryption, LNCS 1039. Springer-Verlag, 1996, pp 71-79.
  19. Xiaoyun Wang et al .: Cryptanalysis of the Hash Functions MD4 and RIPEMD , Advances in Cryptology-EUROCRYPT 2005, LNCS 3494. Springer-Verlag, 2005, pp. 1-18.
  20. Yuliang Zheng, Josef Pieprzyk, Jennifer Seberry: HAVAL-A one-way hashing algorithm with variable length of output , AUSCRYPT 92, LNCS 718. Springer-Verlag, 1993, pp. 83-104.
  21. Bart Van Rompay, Alex Biryukov, Bart Preneel, Joos Vandewalle: Cryptanalysis of 3-Pass HAVAL , Advances in Cryptology-ASIACRYPT 2003, LNCS 2894. Springer-Verlag, 2003, pp. 228–245.
  22. Tiger: A Fast New Hash Function . (Designed in 1995)
  23. Joan Daemen , Craig Clapp: Fast Hashing and Stream Encryption with PANAMA , Fast Software Encryption, LNCS 1372. Springer-Verlag, 1998, pp. 60-74.
  24. ^ Joan Daemen, Gilles van Assche: Producing Collisions for PANAMA, Instantaneously , Fast Software Encryption, LNCS 4593. Springer-Verlag, 2007, pp. 1-18.
  25. The WHIRLPOOL Hash Function ( Memento from April 8, 2006 in the Internet Archive )
  26. Lars R. Knudsen: SMASH-A Cryptographic Hash Function , Fast Software Encryption, LNCS 3557. Springer-Verlag, 2005, pp. 228-242.
  27. Norbert Pramstaller, Christian Rechberger, Vincent Rijmen : Smashing SMASH , Cryptology ePrint Archive Report 2005/081
  28. http://csrc.nist.gov/groups/ST/hash/first_workshop.html