Hash tree

from Wikipedia, the free encyclopedia
A binary hash tree

A hash tree ( English hash tree or Merkle tree , after the scientist Ralph Merkle ) is a data structure in cryptography and computer science . A hash tree is a tree made up of hash values from data blocks , for example from a file . Hash trees are an extension of hash lists and are also used to ensure the integrity of data. If they use the Tiger hash function as a basis, they are called Tiger Trees or Tiger Tree Hashes .

history

Hash trees were invented by Ralph Merkle in 1979 . The original purpose was the efficient handling of many Lamport one-time signatures , which are among the quantum-secure procedures. However, a single Lamport key can only be used to sign a single message. In combination with hash trees, however, a Lamport key can be used for many messages, which is implemented using the Merkle signature method.

Applications

In addition to signatures, hash trees can be used to protect any type of data that is stored and exchanged against changes. The main application at present is to ensure in P2P networks that data blocks received from other peers are undamaged and unchanged. There are proposals to use hash trees in trusted computing . Sun Microsystems uses them in ZFS . Hash trees are also used by Google Wave and Tarsnap online data backup .

Well-known implementations of hash trees are the blockchain of the crypto currency Bitcoin and the version management Git .

functionality

A hash tree is a tree of hash values, with the leaves being hash values ​​of data blocks, for example a file. Nodes further up the tree are hash values ​​of their children. Most implementations use a binary tree (each node has at most two children), but a higher level of exit can just as easily be used.

Usually a cryptological hash function such as SHA-1 , Whirlpool or Tiger is used. If the hash tree is only to protect against unintentional damage, a (cryptographically insecure) checksum such as CRC can be used.

The root of the hash tree is known as the top hash , root hash or master hash . Before downloading a file on a P2P network, the top hash is usually obtained from a trusted source, such as a friend or a website with a good rating. If the top hash is available, the rest of the hash tree can also be obtained from an untrustworthy source, i.e. also from every peer in a P2P network. It can then be checked against the trustworthy top hash and, if necessary, rejected.

The main difference to a hash list is that each branch of the hash tree can be downloaded individually and checked for integrity immediately, even if the entire tree is not yet available. For example, if you want to check the integrity of the data block L2 in the image, it is sufficient to know the hash 0-0 and the hash 1. The hash 0-1 can be calculated from the data block L2 and the hash 0 can be calculated using the hash 0-0. The top hash is also calculated with hash 0 and hash 1. It is more efficient to split files into very small chunks for transfer so that only small pieces of it need to be reloaded if damaged. With very large files, however, this also results in relatively large hash lists or hash trees. If trees are used, however, individual branches can be quickly loaded and checked for integrity so that the actual file can be downloaded.

Tiger-tree hash

The tiger tree hash is a widely used binary hash tree based on the cryptological hash function Tiger . It is often used to check the integrity of large files during or after transfer. The tiger tree hash typically has 1024 byte blocks of data from the file at the leaf level. The root hash is then - with a high degree of probability - a unique identifier for the file. If the client has the complete Tiger hash tree, it can on the one hand verify whether the individual file blocks are correct and at the same time check whether the hash tree itself is correct.

Tiger tree hashes are used by the file sharing protocols Gnutella , Gnutella2 and Direct Connect , as well as by file sharing applications such as Phex , BearShare , LimeWire , Shareaza and DC ++ .

In text representation, the values ​​are usually specified as Base32- coded, either directly or as a Uniform Resource Name , e.g. B. urn:tree:tiger:LWPNACQDBZRYXW3VHJVCJ64QBZNGHOHHHZWCLNQfor a 0-byte file.

Web links

swell

Individual evidence

  1. RC Merkle: A digital signature based on a conventional encryption function , Crypto '87
  2. ZFS End-to-End Data Integrity. ( Memento of the original from October 12, 2010 in the Internet Archive ) Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. In: Jeff Bonwick's blog  @1@ 2Template: Webachiv / IABot / blogs.sun.com
  3. ^ Wave Protocol Verification Paper. ( Memento of the original from November 12, 2010 in the Internet Archive ) Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. Google Wave Federation Protocol @1@ 2Template: Webachiv / IABot / www.waveprotocol.org
  4. Satoshi Nakamoto: Bitcoin: A Peer-to-Peer Electronic Cash System. Retrieved March 1, 2018 .
  5. DC ++ 's feature list