Distributed Hash Tables
Department of Computer Science
SDA Project
In Computer Science
Topic
— Kahlouche Nour-EL-houda
— Yahoui Soheib
2022/2023
1. Introduction

2. Concept of a DHT

2.1 Properties of a DHT

3. Structure of a DHT
- An overlay network that connects the nodes, allowing them to find the owner of any given key in the keyspace.
In computing, Chord is a protocol and
algorithm for a peer-to-peer distributed hash table. A distributed hash
table stores key-value pairs by assigning keys to different computers
(known as “nodes”); a node will store the values for all the keys for which
it is responsible. Chord specifies how keys are assigned to nodes, and how
a node can discover the value for a given key by first locating the node
responsible for that key.
Chord is one of the four original distributed hash table protocols, along
with CAN, Tapestry, and Pastry. It was introduced in 2001 by Ion Stoica,
Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan,
and was developed at MIT. For more information, one can refer to the
original paper. Chord is based on consistent hashing, which I have
implemented in a previous article.
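As a quick illustration of consistent hashing, the sketch below (in Python, with illustrative helper names of our own; hashing with truncated SHA-1 is an assumption for the example, not necessarily what Chord deployments do) assigns each key to the first node id found clockwise on the ring:

```python
import bisect
import hashlib

def chord_hash(name, m=8):
    """Hash a name into the identifier space [0, 2^m - 1] (SHA-1, truncated)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** m)

def successor(node_ids, key):
    """Consistent hashing: a key lives on the first node id clockwise from it."""
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, key)
    return ids[i % len(ids)]        # wrap past the largest id back to the start

# Nodes 000, 110 and 111 on an m=3 ring; key 100 lands on node 110.
ring = [0b000, 0b110, 0b111]
print(format(successor(ring, 0b100), "03b"))  # 110
```

Only the node set and the key need to be known; when a node joins or leaves, only the keys between it and its neighbour move.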
_E.g., pick identifiers from the range [0 … 2^m − 1].
Properties:
- Routing table size is O(log N), where N is the total number of nodes.
- Guarantees that a file is found in O(log N) hops.
Let us say that all ids and keys are in a keyspace of [0, 1, …, 2^3 − 1] and are
represented in binary. We can represent this space as a complete binary tree
where each leaf node is a key. Circled leaves are ids that correspond to a
participating computer in the network. In the example above (Figure 3), three
computers are participating in the protocol with ids of 000, 110, and 111
respectively.
Bit 0 of key 100 is 1. Next, consider key 100 and computer ids 110 and 111.
100, 110, and 111 all share a common prefix of 1. However, bit 1 of 110 and
bit 1 of 111 are equal, so we consider the first bit at which they differ: bit 2.
Since bit 2 of 110 equals bit 2 of 100 (both are 0), but bit 2 of 111 does not
equal bit 2 of 100, key 100 should be assigned to the computer whose id is 110.
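The bit-by-bit descent described above can be sketched as follows (a minimal Python rendering of the rule, with function names of our own): at each bit position, keep only the candidate ids whose bit matches the key's bit, unless none of them match.

```python
def assign(key, node_ids, m=3):
    """Walk the m-bit ids from the most significant bit down: at each bit,
    keep only the candidates matching the key's bit, unless none do."""
    candidates = list(node_ids)
    for i in range(m):
        bit = m - 1 - i                     # shift amount, MSB first
        want = (key >> bit) & 1
        matching = [n for n in candidates if (n >> bit) & 1 == want]
        if matching:
            candidates = matching
    return candidates[0]

# Key 100 with participating ids 000, 110, 111 ends up on 110, as argued above.
print(format(assign(0b100, [0b000, 0b110, 0b111]), "03b"))  # 110
```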
In this section, I briefly illustrate the technology behind Cassandra and how
it works.
storing a value on multiple peers) in TomP2P, and it is also exposed in the API.
Thus, an operation such as get or put will return immediately, and the user can
either block on the operation to wait for completion or add a listener that gets
notified when the operation completes.
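TomP2P itself is a Java library, but the block-or-listen pattern just described can be sketched generically with Python futures (dht_put here is a hypothetical stand-in, not the TomP2P API):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def dht_put(key, value):
    """Hypothetical stand-in for a network put; sleeps to mimic latency."""
    time.sleep(0.01)
    return f"stored {key}"

pool = ThreadPoolExecutor(max_workers=2)

# The call returns immediately with a future, like TomP2P's async operations.
future = pool.submit(dht_put, 85, "file contents")

# Option 1: block until the operation completes.
print(future.result())                                  # stored 85

# Option 2: attach a listener that fires on completion instead of blocking.
future2 = pool.submit(dht_put, 86, "more contents")
future2.add_done_callback(lambda f: print("done:", f.result()))
pool.shutdown(wait=True)
```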
In this example we will create a DHT with 8 nodes. These 8 nodes will
be connected such that the network conceptually forms a “circle”
starting from the first node. Each node will be assigned an ID from the
following array of IDs: {10, 20, 30, 40, 50, 60, 70, 80}. The first node
will have an ID of 10, the second an ID of 20, and so forth. (Fig. 3)
Figure 03:
Each node will have a successor (the node immediately after the current
node that is connected in the network, i.e. the next node) and a
predecessor (the node immediately before the current node that is connected
in the network, i.e. the previous node). (Fig. 4)
Figure 04:
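A minimal sketch of this ring in Python (the names are ours, not from any DHT library): each node keeps a reference to its successor and predecessor, wrapping around at the ends.

```python
class Node:
    """A toy DHT node that only knows its two ring neighbours."""
    def __init__(self, node_id):
        self.id = node_id
        self.successor = None
        self.predecessor = None

ids = [10, 20, 30, 40, 50, 60, 70, 80]
nodes = [Node(i) for i in ids]
for i, node in enumerate(nodes):
    node.successor = nodes[(i + 1) % len(nodes)]    # next node, wrapping around
    node.predecessor = nodes[(i - 1) % len(nodes)]  # previous node, wrapping around

print(nodes[0].successor.id, nodes[0].predecessor.id)  # 20 80
```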
Let us consider that the main purpose of the DHT in this example is to
store the contents of a particular set of files. The specific content of each
file does not matter for this example; however, I would like to use file
contents in order to make the example seem closer to a real-world
problem.
Figure 05:
Figure 06:
When a node is removed from the network, the first thing that needs to be
decided is where its key is going to be stored after the removal of the
node. If we want to remove the node with ID 80, we need to assign its
currently stored key to some other node. In this example we will follow a
basic rule that says that when removing a node, all its stored keys shall
be assigned to its successor.
_This rule was arbitrarily chosen for this example; there are many other
ways to manage the keys of nodes that are going to be removed in the future.
Figure 07:
Figure 08:
When a new node wants to join the DHT network, depending on its
position relative to its successor/predecessor, the node will acquire
a different key. In this example, we will bring back the node with ID 80
into the DHT network. The constraint that we will use in this example is
that, if on leaving a node's key was assigned to its successor, then when
bringing that node back into the network, the new node will acquire a new
key from its predecessor, relative to its new position.
_Here I am not applying the rule “Resource with key N will be managed
by Node with ID M, where M < N and M is the biggest ID number in the
set”. If we had to apply this rule, there would be only one correct
position for a node with ID 80.
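Since the join constraint above is stated only loosely, the sketch below (Python, illustrative names of our own) implements one consistent reading of the example: on removal a node's keys go to its successor, and on rejoining the node reclaims exactly the keys it handed over, from whichever node received them.

```python
ring = [10, 20, 30, 40, 50, 60, 70, 80]
stored = {n: [] for n in ring}
stored[80] = [85]                  # node 80 manages key 85, as in the example
handed_over = {}                   # leaving node id -> (receiver id, keys)

def remove_node(node_id):
    """Hand the leaving node's keys to its successor and drop it from the ring."""
    i = ring.index(node_id)
    receiver = ring[(i + 1) % len(ring)]
    keys = stored.pop(node_id)
    handed_over[node_id] = (receiver, keys)
    stored[receiver].extend(keys)
    ring.remove(node_id)

def rejoin_node(node_id):
    """Re-insert the node in id order and let it reclaim the keys it gave away."""
    ring.append(node_id)
    ring.sort()
    receiver, keys = handed_over.pop(node_id, (None, []))
    if receiver is not None:
        stored[receiver] = [k for k in stored[receiver] if k not in keys]
    stored[node_id] = keys

remove_node(80)
print(stored[10])                  # [85] -- 80's successor (node 10) took the key
rejoin_node(80)
print(stored[80])                  # [85] -- node 80 holds its key again
```

Note that node 80's successor wraps around to node 10, which is why key 85 temporarily lives there.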
Figure 09:
4. Querying the network for a particular resource [O(n)]
When performing a query for a specific key on the network, the parameter
with the greatest weight is “connectivity”. Connectivity here means the
number of direct connections a node has to other nodes, i.e. how many
successors/predecessors a given node has.
Let us consider that Node 10 would like to obtain the value for key 85. It
first needs to locate where that particular resource is stored, i.e. Node 80.
Initially, Node 10 would query its successor with the theoretical
question “Do you have key 85?”. If its successor, in this case Node 20,
does not have key 85, it will continue asking this question to its own
successor, and so on until Node 80 gets asked. Once Node 80 receives the
query and confirms that it manages key 85, it would return the message to
its caller, and the answer would propagate back along the chain of callers.
Because we are traversing linearly through the whole network in both
directions, the complexity of this query is O(n). (Fig. 10)
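The linear walk just described can be sketched as follows (Python, names of our own); the hop counter makes the O(n) behaviour visible:

```python
def linear_lookup(ring, stored, start_id, key):
    """Follow successor pointers from start_id until a node holding `key`
    is found. Returns (holder id, hops), or (None, hops) after a full loop."""
    i = ring.index(start_id)
    hops = 0
    for step in range(len(ring)):
        node = ring[(i + step) % len(ring)]
        if key in stored.get(node, []):
            return node, hops
        hops += 1                          # ask the next successor
    return None, hops

ring = [10, 20, 30, 40, 50, 60, 70, 80]
stored = {80: [85]}
print(linear_lookup(ring, stored, 10, 85))  # (80, 7): seven hops from node 10
```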
Figure 10:
| Parameter | CAN | Chord | Kademlia | Koorde | Pastry | Tapestry | Viceroy |
|---|---|---|---|---|---|---|---|
| Routing | Map key-value pairs to coordinate space | Matching key to node ID | Matching key to node ID | Matching key to node ID | Matching key and prefix in node ID | Suffix matching | Levels of tree, vicinity search |
| Routing (network size n) | O(d·n^(2/d)) | O(log(n)) | O(log(n))+c, c is small | Between O(log(log(n))) and O(log(n)) | O(log(n)) | O(log(n)) | O(log(n)) |
| Degree | 2d | O(log(n)) | O(log(n)) | Between constant and log(n) | O(2·log(n)) | O(log(n)) | Constant |
| Join/Leaves | 2d | (log(n))² | O(log(n))+c, c is small | O(log(n)) | O(log(n)) | O(log(n)) | O(log(n)) |
| Implementations | – | OpenChord, OverSIM | Ethereum [3], Mainline DHT (BitTorrent), I2P, Kad Network | – | FreePastry | OceanStore, Mnemosyne [4] | – |
_ The popularity of Kademlia over other DHTs is likely due to its relative simplicity
and performance. The rest of this section dives deeper into Kademlia.
6.2 Kademlia
- Nodes have enough information to route traffic through low-latency paths.
- Parallel and asynchronous queries are made to avoid timeout delays from failed nodes.
- The node existence algorithm resists certain basic distributed denial-of-service (DDoS) attacks.
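Kademlia's routing rests on the XOR metric between node ids and keys; the Python sketch below (illustrative names only) shows how the nodes closest to a key are selected:

```python
def xor_distance(a, b):
    """Kademlia's notion of distance between two ids is their bitwise XOR."""
    return a ^ b

def closest_nodes(key, node_ids, k=3):
    """Return the k node ids closest to `key` under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key))[:k]

nodes = [0b0001, 0b0101, 0b1000, 0b1110]
print(closest_nodes(0b0111, nodes, k=2))  # [5, 1] (distances 2 and 6)
```

Because XOR is symmetric and unidirectional, every node agrees on which peers are "close" to a key, which is what makes parallel queries safe.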
7. Circular DHT

- Each peer is only aware of its immediate successor and predecessor.
- With shortcuts, the query above is reduced from 6 to 3 messages.
- It is possible to design shortcuts with O(log N) neighbors and O(log N) messages per query.
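One classic way to obtain such O(log N) shortcuts is a Chord-style finger table, where each node keeps a pointer toward n + 2^i for every i; a minimal sketch (the 6-bit identifier space is an assumption for illustration):

```python
def finger_targets(node_id, m=6):
    """Ring positions a node keeps shortcuts toward: n + 2^i (mod 2^m)
    for i = 0 .. m-1 -- only O(m) = O(log N) neighbours per node."""
    return [(node_id + 2 ** i) % (2 ** m) for i in range(m)]

print(finger_targets(10))  # [11, 12, 14, 18, 26, 42]
```

Each hop toward one of these targets at least halves the remaining distance to the key, which is why lookups need only O(log N) messages.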
8. Advantages of DHT

9. Disadvantages of DHT
Conclusion

In this chapter, we have defined the notion of distributed hash table
systems. We have reviewed the structure of DHTs and popular DHT
protocols (Chord, Kademlia, Apache Cassandra, Koorde, TomP2P) together
with some of their properties and algorithms. We ended this chapter with
some examples to explain this project.
BIBLIOGRAPHY:
https://tlu.tarilabs.com/protocols/distributed-hash-tables
https://medium.com/the-code-vault/data-structures-distributed-hash-table-febfd01fc0af
https://medium.com/techlog/chord-building-a-dht-distributed-hash-table-in-golang-67c3ce17417b
https://fr.slideshare.net/atefbentahar/les-protocoles-de-routage-dans-les-rseaux-pair-apair-master-informatiquesr-2016
https://slideplayer.fr/amp/1817817/