Distributed hash table

A distributed hash table ( English Distributed Hash Table , DHT) is a data structure that can be used, for example, the location of a file in a P2P system to store. The focus here is on decentralization and the efficiency of data storage.

The data is distributed as evenly as possible over the existing storage nodes. Each storage node corresponds to an entry in the hash table . The self-organizing data structure can map the failure, joining and leaving of nodes. The basis for distributed hash tables are consistent hash functions .

One differentiates DHTs according to the storage scheme. The data can be stored directly within the DHT (direct storage) or a reference to the data can be held in the DHT (indirect storage). Direct storage is only suitable for small data (<1 kB), as otherwise the system would become too inflexible.

properties

Properties of DHTs are:

Fault tolerance : The system should function reliably even if nodes fail or leave the system.
Load sharing : Keys are distributed evenly across all nodes.
Robustness : The system should be able to function “correctly” even if a part (possibly a large part) of the nodes tries to disturb the system.
Self-organization : No manual configuration is necessary.
Scalability : The system should be able to remain functional even with a large number of nodes.

Basic way of working

Using a hash function , keys are assigned to the data objects in a linear value range , which is distributed as evenly as possible over the nodes of the node set. At least one node is responsible for each part of the key space. However, multiple nodes are often responsible for the same area, with responsibilities changing dynamically. A joining protocol regulates the inclusion of new nodes in the existing system. The protocol then establishes the connections to the neighboring nodes and usually also regulates the construction of routing tables .

The routing tables are used by the DHT nodes to determine other nodes that are responsible for certain data records. The definition of “distance” is linked to the structure and topology and varies in different systems. It does not necessarily have to match the physical organization of the nodes. A distributed hash table that places its nodes in Euclidean space could choose the node with the closest Euclidean distance to a key. The routing tables should allow each node to reach the next node to a key in O (log n ) search steps.

The implemented algorithms can be exchanged through a generic interface that only offers two functions publish(Schlüssel, Inhalt)and lookup(Schlüssel).

Partitioning the key space

With most DHTs, keys are mapped to nodes using a variant of consistent hashing or rendezvouz hashing . These two variants were developed simultaneously, but independently, to solve the DHT problem.

Both consistent hashing and rendezvouz hashing have the fundamental property that when a node joins or exits, only the keys of the neighboring nodes change and all other nodes are not affected. In conventional hash tables, however, almost the entire key range is redistributed when a bucket is added or removed. If the responsibility of data objects changes, data redistribution is necessary. This puts a strain on the network and the data bandwidth. Therefore, DHTs are designed in such a way that they can also efficiently support frequent entries and exits of nodes.

Consistent hashing

A distance function is used for consistent hashing . This indicates the distance between two keys and . The distance is independent of the geographical distance or the latency in the network. In addition, each node in the network receives a key, which we call its identifier (ID of the node ). Each node is then responsible for storing those items whose distance from his ID at least: . ${\ displaystyle \ delta (k_ {1}, k_ {2})}$ ${\ displaystyle k_ {1}}$ ${\ displaystyle k_ {2}}$ ${\ displaystyle x}$ ${\ displaystyle i_ {x}}$ ${\ displaystyle x}$ ${\ displaystyle x = \ operatorname {argmin} _ {y} \ delta (k, i_ {y})}$

For example, sets Chord a consistent hashing, wherein the nodes as points on a circle as the circular arc of to be construed in a clockwise direction. The circular key space therefore consists of connected segments, the end points of which are the node IDs. So if, for example, and are two consecutive node IDs in a circle, then the node is responsible for all keys between and . ${\ displaystyle \ delta (k_ {1}, k_ {2})}$ ${\ displaystyle k_ {1}}$ ${\ displaystyle k_ {2}}$ ${\ displaystyle i_ {1}}$ ${\ displaystyle i_ {2}}$ ${\ displaystyle i_ {1}}$ ${\ displaystyle i_ {1}}$ ${\ displaystyle i_ {2}}$

Rendezvous hashing

With rendezvouz hashing, all clients who want to map a key to one of the nodes use the same hash function selected at the beginning . In addition, all clients have the same list of IDs , one for each node. In order to determine the correct node for a key , hash weights are first calculated. The key is then associated with the node corresponding to the maximum of these values. A node with an ID is therefore responsible for all keys whose hash weight is higher than the hash weights of all other nodes for the key. ${\ displaystyle n}$ ${\ displaystyle h ()}$ ${\ displaystyle \ {S_ {1}, S_ {2}, ..., S_ {n} \}}$ ${\ displaystyle k}$ ${\ displaystyle n}$ ${\ displaystyle w_ {1} = h (S_ {1}, k), \ w_ {2} = h (S_ {2}, k), \ ..., \ w_ {n} = h (S_ {n }, k)}$ ${\ displaystyle S_ {x}}$ ${\ displaystyle k_ {m}}$ ${\ displaystyle h (S_ {x}, k_ {m})}$

Locality-preserving hashing

Locality-preserving hashing ensures that similar keys are assigned to similar nodes. This enables more efficient range queries. However, it can happen that the distribution of the key space to the nodes and thus their utilization is no longer uniformly random. The Self-Chord framework, for example, makes object keys independent of node IDs and sorts keys along a ring buffer using a statistical approach based on the swarm intelligence paradigm . The sorting ensures that neighboring nodes are responsible for similar keys and queries such as B. Range queries can be executed in logarithmic time.

Overlay network

The overlay network connects the nodes so that they can find the relevant node responsible for keys. Each node maintains connections to other nodes (its neighbors) in a routing table . A node selects its neighbors according to the network topology (structure of the network).

All DHT topologies have one basic property in common: for each key , each node either knows the ID of the node that is responsible for , or it has a link to a node whose ID is closer to , defined by a distance measure in section Partitioning of the key space . A message can then simply be routed to the responsible node by : At each step the message is forwarded to the node whose ID is closest to until the responsible node is reached. This algorithm is generally not globally optimal. Sometimes this is called key-based routing. ${\ displaystyle k}$ ${\ displaystyle k}$ ${\ displaystyle k}$ ${\ displaystyle k}$ ${\ displaystyle k}$

The overlay network has two parameters that have a major impact on its performance. The maximum route length should be small so that packets arrive quickly, and the maximum node degree should be small so that the overhead per node visited is small. The two parameters are in a trade-off relationship. Some typical relationships are described in the following table.

Max. Node degree	Max. Route length	Used in	comment
${\ displaystyle O (1)}$	${\ displaystyle O (n)}$		Worst possible route length, requests get very slow
${\ displaystyle O (\ log n)}$	${\ displaystyle O (\ log n)}$	Chord Kademlia Pastry Tapestry	Most common but not optimal (neighboring degree / route length ratio). Chord is the simplest version, Kademlia seems to be the most popular optimized variant (should have improved average time for inquiries)
${\ displaystyle O (\ log n)}$	${\ displaystyle O (\ log n / \ log (\ log n))}$	Coordinates	More complex implementation but requests can be faster (lower worst-case limit)
${\ displaystyle O ({\ sqrt {n}})}$	${\ displaystyle O (1)}$		Highest local storage space requirements, high communication load after joining and exiting a node

${\ displaystyle O (\ log n)}$ as the maximum neighboring degree and maximum route length is the most widespread parameterization. Although the neighbor degree / route length trade-off is not optimal, it often allows greater flexibility in the choice of neighbors. Many DHTs use this flexibility to select neighbors with the lowest possible latency in the physical network below. In general, DHTs create navigable small-world network topologies with the tradeoff between route length and network degree.

The maximum route length is closely related to the diameter (of the network): the maximum number of hops in any shortest path between two nodes. The worst-case route length of the network is obviously at least as large as the diameter, consequently DHTs have the fundamental limitation of the node degree / diameter tradeoff in graph theory . The route length can also be larger than the diameter, since the greedy routing algorithm may not find the shortest paths.

Overlay Network Algorithms

In addition to routing, there are many algorithms that use the structure of overlay networks in DHTs to send messages to all nodes or a subset. These algorithms are used by applications for overlay multicasts , range queries or for collecting statistics. Two systems based on this approach are Structella, which implements flooding and random walks on a pastry overlay, and DQ-DHT, which implements a dynamic query search algorithm over a chord network.

Implementations

On many computers, sending messages is significantly more expensive than local hash table access. This is why it makes sense to bundle many messages in one batch. Assuming that each node has a local batch of at most messages, the messages are bundled as follows. Each node first sorts its local batch according to the ID of the node responsible for the message. This is possible in time with bucket sort , whereby the number of nodes is in the DHT. If there are several operations for the same key in a batch, the batch is reduced before it is sent. For example, several requests for the same key can be reduced to one or several increment operations can be reduced to one add operation. This can be done with a local hash table. Finally, the operations are sent to the respective nodes. ${\ displaystyle b}$ ${\ displaystyle O (b + n)}$ ${\ displaystyle n}$

The following implementations of distributed hash tables currently exist:

IPFS
Kademlia - structures based on this algorithm exist in several P2P networks, but are usually not compatible with each other. Implementations:
- KAD - Development by the eMule development team, based on the Kademlia algorithm, to replace the servers of the eDonkey2000 network that fail over time .
- Mojito - Developed by the LimeWire development team for quick source identification within the Gnutella network.

Applications

DHTs for data storage

software

apt-p2p ( package management system ): Distributed update based on apt
Vuze : Bit torrent client
BitComet : BitTorrent client
BitTorrent (from version 4.1.0)
Coral : content distribution network
CSpace : Instant Messenger with Kademlia
Deluge (BitTorrent client)
eDonkey2000 , name: Overnet
eMule (from version 0.40) and aMule (from version 2.1.0), name: Kad
Free Download Manager : Free download manager that can also handle BitTorrent and DHT
Freenet : Anonymous data storage.
Halite : BitTorrent client
KTorrent : BitTorrent client
LimeWire (Name: Mojito )
MLDonkey (Overnet and Kademlia)
µTorrent : BitTorrent client
qBittorrent : BitTorrent client
RetroShare : serverless instant messenger, anonymous / file sharing, newsgroups, Voice over IP, e-mail
rTorrent : BitTorrent client
Tox : Instant Messenger
Transmission (BitTorrent) : BitTorrent client
YaCy : distributed search engine

DHT research

@1@ 2( Page no longer available , search in web archives: Project IRIS ) (Infrastructure for Resilient Internet Systems)

Individual evidence

^ Agostino Forestiero, Emilio Leonardi, Carlo Mastroianni, Michela Meo: Self-Chord: A Bio-Inspired P2P Framework for Self-Organizing Distributed Systems . In: IEEE / ACM Transactions on Networking . 18, No. 5, October 2010, pp. 1651-1664. doi : 10.1109 / TNET.2010.2046745 .
↑ Sarunas Girdzijauskas: Designing peer-to-peer overlays a small-world perspective . EPFL, 2009.
^ The (Degree, Diameter) Problem for Graphs . Maite71.upc.es. Archived from the original on February 17, 2012. Retrieved January 10, 2012.
↑ Gurmeet Singh Manku, Moni Naor, Udi Wieder: "Know your Neighbor's Neighbor: the Power of Lookahead in Randomized P2P Networks" . Proc. STOC, 2004.
^ Ali Ghodsi: Distributed k-ary System: Algorithms for Distributed Hash Tables ( Memento of May 22, 2007 in the Internet Archive ). KTH - Royal Institute of Technology, 2006.
↑ Miguel Castro, Manuel Costa, Antony Rowstron: Should we build Gnutella on a structured overlay? . In: ACM SIGCOMM Computer Communication Review . 34, No. 1, January 1, 2004, p. 131. doi : 10.1145 / 972374.972397 .
^ Domenico Talia, Paolo Trunfio: Enabling Dynamic Querying over Distributed Hash Tables . In: Journal of Parallel and Distributed Computing . 70, No. 12, December 2010, pp. 1254-1265. doi : 10.1016 / j.jpdc.2010.08.012 .
Jump up ↑ Peter Sanders, Kurt Mehlhorn, Martin Dietzfelbinger, Roman Dementiev: Sequential and Parallel Algorithms and Data Structures . Springer, S. 135 f .

[1] Agostino Forestiero, Emilio Leonardi, Carlo Mastroianni, Michela Meo: Self-Chord: A Bio-Inspired P2P Framework for Self-Organizing Distributed Systems . In: IEEE / ACM Transactions on Networking . 18, No. 5, October 2010, pp. 1651-1664. doi : 10.1109 / TNET.2010.2046745 .

[2] Sarunas Girdzijauskas: Designing peer-to-peer overlays a small-world perspective . EPFL, 2009.

[3] The (Degree, Diameter) Problem for Graphs . Maite71.upc.es. Archived from the original on February 17, 2012. Retrieved January 10, 2012.

[4] Gurmeet Singh Manku, Moni Naor, Udi Wieder: "Know your Neighbor's Neighbor: the Power of Lookahead in Randomized P2P Networks" . Proc. STOC, 2004.

[5] Ali Ghodsi: Distributed k-ary System: Algorithms for Distributed Hash Tables ( Memento of May 22, 2007 in the Internet Archive ). KTH - Royal Institute of Technology, 2006.

[6] Miguel Castro, Manuel Costa, Antony Rowstron: Should we build Gnutella on a structured overlay? . In: ACM SIGCOMM Computer Communication Review . 34, No. 1, January 1, 2004, p. 131. doi : 10.1145 / 972374.972397 .

[7] Domenico Talia, Paolo Trunfio: Enabling Dynamic Querying over Distributed Hash Tables . In: Journal of Parallel and Distributed Computing . 70, No. 12, December 2010, pp. 1254-1265. doi : 10.1016 / j.jpdc.2010.08.012 .

[8] Jump up ↑ Peter Sanders, Kurt Mehlhorn, Martin Dietzfelbinger, Roman Dementiev: Sequential and Parallel Algorithms and Data Structures . Springer, S. 135 f .