
Chord (Peer-To-Peer) : Computing Peer-To-Peer Distributed Hash Table Key-Value Pairs

Chord is a peer-to-peer lookup protocol that arranges node keys in a circular identifier space ranging from 0 to 2^m - 1. Each node maintains a finger table pointing to other nodes in the identifier circle to efficiently locate successor nodes. The Chord protocol maps keys to nodes to balance load and allows nodes to join and leave the network without disruption by consistently hashing keys and node IDs into the same identifier space. DHash builds on Chord to provide reliable storage and retrieval of immutable data blocks identified by their content hash.

Uploaded by

arunkmmbai
Copyright
© Attribution Non-Commercial (BY-NC)

Chord (peer-to-peer)

In computing, Chord is a protocol and algorithm for a peer-to-peer distributed hash table. A
distributed hash table stores key-value pairs by assigning keys to different computers (known as
"nodes"); a node will store the values for all the keys it is responsible for. Chord specifies how
keys are to be assigned to nodes, and how a node can discover the value for a given key by first
locating the node responsible for that key.[1]

Chord is one of the four original distributed hash table protocols, along with CAN, Tapestry, and
Pastry. It was introduced in 2001 by Ion Stoica, Robert Morris, David Karger, Frans Kaashoek,
and Hari Balakrishnan, and was developed at MIT[2].

Overview
Using the Chord lookup protocol, node keys are arranged in a circle. The circle cannot have
more than 2^m nodes, and its IDs/keys range from 0 to 2^m − 1.

IDs and keys are assigned an m-bit identifier using consistent hashing. The SHA-1 algorithm is
the base hashing function for consistent hashing. Consistent hashing is integral to the
robustness and performance of Chord because both keys and IDs (IP addresses) are uniformly
distributed and in the same identifier space. Consistent hashing is also necessary to let nodes
join and leave the network without disruption.

Each node has a successor and a predecessor. The successor to a node (or key) is the next node
(key) in the identifier circle in a clockwise direction. The predecessor is the next node in the
counter-clockwise direction. If there is a node for each possible ID, the successor of node 2 is
node 3, and the predecessor of node 1 is node 0; however, normally there are holes in the
sequence. For example, the successor of node 153 may be node 167 (and nodes from 154 to 166
will not exist); in this case, the predecessor of node 167 will be node 153.

Since the successor (or predecessor) node may disappear from the network (because of failure or
departure), each node records a whole segment of the circle adjacent to it, i.e. the r nodes
preceding it and the r nodes following it. This list results in a high probability that a node
is able to correctly locate its successor or predecessor, even if the network in question
suffers from a high failure rate.
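
The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `chord_id`, `successor` and `predecessor` are hypothetical helpers, the ring is a plain sorted list rather than a distributed structure, and m = 8 is chosen for readability (SHA-1 produces 160 bits).

```python
import hashlib
from bisect import bisect_left

M = 8  # identifier bits; deliberately small for this sketch (assumption)

def chord_id(name, m=M):
    """Hash a node address or key name into the m-bit identifier space using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def successor(ident, node_ids):
    """First node at or clockwise after ident; node_ids must be sorted."""
    i = bisect_left(node_ids, ident)
    return node_ids[i % len(node_ids)]  # wrap around past the largest ID

def predecessor(node, node_ids):
    """Node immediately counter-clockwise of an existing node."""
    i = bisect_left(node_ids, node)
    return node_ids[i - 1]  # index -1 wraps to the largest ID

# Ring with holes, as in the example in the text: 154..166 do not exist.
nodes = [1, 153, 167, 240]
next_after_153 = successor((153 + 1) % 2 ** M, nodes)  # successor of *node* 153
```

Here `successor` follows the key convention (a key equal to a node ID is owned by that node), so the successor of a node is obtained by starting the search one position past it.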

Chord protocol
A 16-node Chord network. The "fingers" for one of the nodes are highlighted.

The Chord protocol is one solution for connecting the peers of a P2P network. Chord
consistently maps a key onto a node. Both keys and nodes are assigned an m-bit identifier. For
nodes, this identifier is a hash of the node's IP address. For keys, this identifier is a hash of a
keyword, such as a file name. It is not uncommon to use the words "nodes" and "keys" to refer to
these identifiers, rather than actual nodes or keys. There are many other algorithms in use by
P2P systems, but this is a simple and common approach.

A logical ring with positions numbered 0 to 2^m − 1 is formed among nodes. Key k is assigned to
node successor(k), which is the node whose identifier is equal to or follows the identifier of k.
If there are N nodes and K keys, then each node is responsible for roughly K / N keys. When a
node joins or leaves the network, responsibility for O(K / N) keys changes hands.

If each node knows only the location of its successor, a linear search over the network could
locate a particular key. This is a naive method for searching the network, since any given
message could potentially have to be relayed through most of the network. Chord implements a
faster search method.

Chord requires each node to keep a "finger table" containing up to m entries. The i-th entry of
node n (for 1 ≤ i ≤ m) will contain the address of successor((n + 2^(i−1)) mod 2^m). With such a
finger table, the number of nodes that must be contacted to find a successor in an N-node
network is O(log N). (See proof below.)
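
As a rough illustration of the finger table, the sketch below computes the table for one node of a small ring, using the convention that entry i (for i = 1..m) points at successor((n + 2^(i−1)) mod 2^m). The function names and the 10-node ring with m = 6 are assumptions made for the example, and the whole ring is visible locally, which is not the case in a real deployment.

```python
from bisect import bisect_left

def successor(ident, node_ids, m):
    """First node at or clockwise after ident on a ring of size 2^m (node_ids sorted)."""
    ident %= 2 ** m
    i = bisect_left(node_ids, ident)
    return node_ids[i % len(node_ids)]

def finger_table(n, node_ids, m):
    """List of m entries; position i-1 holds successor((n + 2^(i-1)) mod 2^m)."""
    return [successor(n + 2 ** (i - 1), node_ids, m) for i in range(1, m + 1)]

# Illustrative ring (an assumption for this example): m = 6, identifiers 0..63.
ring_ids = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
fingers_of_8 = finger_table(8, ring_ids, 6)
fingers_of_56 = finger_table(56, ring_ids, 6)
print(fingers_of_8)
print(fingers_of_56)
```

The first entries of a table tend to repeat the immediate successor, while the last entries jump roughly half-way around the ring; that spread is what makes logarithmic lookups possible.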

Potential uses
• Cooperative mirroring: A load-balancing mechanism in which a local network hosts
information available to computers outside of the local network. This scheme could allow
developers to balance the load between many computers instead of a central server to
ensure availability of their product.

• Time-shared storage: Once a computer joins the network, its available data is
distributed throughout the network for retrieval when that computer disconnects from the
network. Likewise, other computers' data is sent to the computer in question for offline
retrieval when they are no longer connected to the network. This is mainly for nodes without
the ability to connect full time to the network.

• Distributed indices: Retrieval of files over the network within a searchable database, e.g.
P2P file transfer clients.

• Large-scale combinatorial searches: Keys are candidate solutions to a problem, and
each key maps to the node, or computer, that is responsible for evaluating it as a
solution or not, e.g. code breaking.

Proof sketches

The routing path between nodes A and B. Each hop cuts the remaining distance in half (or
better).

With high probability, Chord contacts O(log N) nodes to find a successor in an N-node
network.

Suppose node n wishes to find the successor of key k. Let p be the predecessor of k. We wish to
find an upper bound for the number of steps it takes for a message to be routed from n to p.
Node n will examine its finger table and route the request to the closest predecessor of k that it
has. Call this node f. If f is the i-th entry in n's finger table, then both f and p are at
distances between 2^(i−1) and 2^i from n along the identifier circle. Hence, the distance between
f and p along this circle is at most 2^(i−1). Thus the distance from f to p is less than the
distance from n to f: the new distance to p is at most half the initial distance.

This process of halving the remaining distance repeats itself, so after t steps, the distance
remaining to p is at most 2^m / 2^t; in particular, after log N steps, the remaining distance is
at most 2^m / N. Because nodes are distributed uniformly at random along the identifier circle,
the expected number of nodes falling within an interval of this length is 1, and with high
probability, there are fewer than log N such nodes. Because the message always advances by at
least one node, it takes at most log N steps for a message to traverse this remaining distance.
The total expected routing time is thus O(log N).
If Chord keeps track of r = O(log N) predecessors/successors, then with high probability, if
each node has probability of 1/4 of failing, find_successor (see below) and find_predecessor
(see below) will return the correct nodes.

Simply, the probability that all r nodes fail is (1/4)^r = O(1/N), which is a low probability; so
with high probability at least one of them is alive and the node will have the correct pointer.
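
To put numbers on the two bounds in these proof sketches, here is a small worked calculation. It assumes that node failures are independent with probability 1/4 each and that r = log2 N exactly; the function names and the choice N = 1024, m = 160 are illustrative only.

```python
import math

def all_fail_probability(r, p_fail=0.25):
    """Probability that r neighbours all fail, assuming independent failures."""
    return p_fail ** r

def remaining_distance_bound(m, t):
    """Upper bound 2^m / 2^t on the identifier distance left after t halving hops."""
    return 2 ** m / 2 ** t

N = 1024
r = int(math.log2(N))  # r = 10 list entries for N = 1024 nodes

p_all_fail = all_fail_probability(r)      # (1/4)^10 = 2^-20
bound = remaining_distance_bound(160, r)  # 2^160 / 2^10 = 2^160 / N
print(p_all_fail, bound)
```

With these assumptions the chance of losing every entry in the list is 1/N^2, roughly one in a million for N = 1024, which is why a list of logarithmic length is enough in the argument above.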

Pseudocode
Definitions for pseudocode:

• finger[k]: first node that succeeds (n + 2^(k−1)) mod 2^m, for 1 ≤ k ≤ m
• successor: the next node from the node in question on the identifier ring
• predecessor: the previous node from the node in question on the identifier ring

The pseudocode to find the successor node of an id is given below:

// ask node n to find the successor of id
n.find_successor(id)
    if (id ∈ (n, successor])
        return successor;
    else
        // forward the query around the circle
        n0 = closest_preceding_node(id);
        return n0.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
    for i = m downto 1
        if (finger[i] ∈ (n, id))
            return finger[i];
    return n;
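
The lookup pseudocode can be exercised with a small in-memory Python sketch. Everything here is an assumption made for illustration: nodes are local objects rather than remote peers, `build_ring` is a hypothetical helper that fills in correct finger tables up front, the `finger` list is 0-based (so `finger[i-1]` plays the role of the pseudocode's finger[i]), and a hop counter is added to show the routing cost.

```python
def in_open(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero; a == b means "all but a"

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return x == b or in_open(x, a, b)

class Node:
    def __init__(self, ident, m):
        self.id = ident
        self.successor = self
        self.finger = [self] * m

    def find_successor(self, ident, hops=0):
        """Return (node responsible for ident, number of forwarding hops)."""
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor, hops
        n0 = self.closest_preceding_node(ident)
        if n0 is self:  # defensive guard for incomplete tables (not in the pseudocode)
            return self.successor, hops
        return n0.find_successor(ident, hops + 1)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):  # i = m downto 1
            if in_open(f.id, self.id, ident):
                return f
        return self

def build_ring(ids, m):
    """Hypothetical helper: create nodes with fully correct finger tables."""
    ids = sorted(ids)
    nodes = {i: Node(i, m) for i in ids}

    def succ_of(x):
        x %= 2 ** m
        return nodes[next((i for i in ids if i >= x), ids[0])]

    for n in nodes.values():
        n.finger = [succ_of(n.id + 2 ** k) for k in range(m)]
        n.successor = n.finger[0]
    return nodes

ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56], m=6)
owner, hops = ring[8].find_successor(54)
print(owner.id, hops)
```

In this example the query for key 54 starting at node 8 is forwarded to node 42 and then to node 51, which answers with its successor, node 56.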

The pseudocode to stabilize the Chord ring/circle after node joins and departures is as follows:

// create a new Chord ring.
n.create()
    predecessor = nil;
    successor = n;

// join a Chord ring containing node n'.
n.join(n')
    predecessor = nil;
    successor = n'.find_successor(n);

// called periodically. verifies n's immediate
// successor, and tells the successor about n.
n.stabilize()
    x = successor.predecessor;
    if (x ∈ (n, successor))
        successor = x;
    successor.notify(n);

// n' thinks it might be our predecessor.
n.notify(n')
    if (predecessor is nil or n' ∈ (predecessor, n))
        predecessor = n';

// called periodically. refreshes finger table entries.
// next stores the index of the finger to fix
n.fix_fingers()
    next = next + 1;
    if (next > m)
        next = 1;
    finger[next] = find_successor(n + 2^(next−1));

// called periodically. checks whether predecessor has failed.
n.check_predecessor()
    if (predecessor has failed)
        predecessor = nil;
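
A minimal single-process simulation of create, join, stabilize and notify is sketched below. It is not a faithful implementation: there are no finger tables (lookups walk successor pointers one at a time), no failures, and the periodic calls are replaced by the hypothetical helper `stabilize_all`, which runs a fixed number of rounds. A nil check on the successor's predecessor is also added, which the pseudocode leaves implicit.

```python
def in_open(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # wraps past zero; a == b means "all but a"

class Node:
    def __init__(self, ident):
        # Equivalent to n.create(): a ring containing only this node.
        self.id = ident
        self.predecessor = None
        self.successor = self

    def find_successor(self, ident):
        # Simplification: linear walk along successors instead of finger lookups.
        node = self
        while not (ident == node.successor.id
                   or in_open(ident, node.id, node.successor.id)):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if (self.predecessor is None
                or in_open(candidate.id, self.predecessor.id, self.id)):
            self.predecessor = candidate

def stabilize_all(members, rounds=3):
    """Hypothetical stand-in for the periodic timer that drives stabilize()."""
    for _ in range(rounds):
        for node in members:
            node.stabilize()

bootstrap = Node(1)
members = [bootstrap]
for ident in (21, 8, 32, 14):  # join order is arbitrary
    newcomer = Node(ident)
    newcomer.join(bootstrap)
    members.append(newcomer)
    stabilize_all(members)

# Walk the ring clockwise from the bootstrap node.
order, node = [], bootstrap
for _ in members:
    order.append(node.id)
    node = node.successor
print(order)
```

After each join, a couple of stabilization rounds are enough for the new node's predecessor to adopt it as successor, which is why the walk at the end visits every node in identifier order.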

What is Chord?

Chord is a peer-to-peer lookup algorithm. It allows a distributed set of participants to agree
on a single node as a rendezvous point for a given key, without any central coordination. In
particular, it provides a distributed evaluation of the successor(ID) function: given the
identifier of a key ID, the successor function returns the address of the node whose identifier
most closely follows ID in a circular identifier space. Identifiers are typically 160-bit
numbers. The Chord algorithm handles adjusting this mapping as the population of nodes changes
over time.

More details are described in publications found at our publications page. Chord has been used
to build a block storage infrastructure, naming services and various file sharing systems.

Chord is sometimes referred to as a distributed hash table; however, the Chord algorithm itself
does not specify any mechanism for storage of data. That is the role of DHash.

What is DHash?

DHash is also sometimes referred to as a distributed hash table. It is a layer built on top of
Chord and handles reliable storage of data blocks on participating nodes. It does this through
techniques such as replication and erasure coding. The logical application interface is simply:

key = put (data)
data = get (key)

Data stored in the system is immutable and identified by its contents, freeing DHash from having
to worry about semantics of multiple writes. DHash has been used to build a backup system,
various file systems (CFS and Ivy), and a Usenet News server. For details on how to write
programs to use DHash, see our hacking notes.
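
The put/get interface and the idea of identifying blocks by their contents can be illustrated with a toy in-memory store. This is not the DHash API: the class name `ContentStore` is hypothetical, SHA-1 is assumed as the content hash, and there is no distribution, replication or erasure coding.

```python
import hashlib

class ContentStore:
    """Toy single-process stand-in for a content-addressed put/get interface."""

    def __init__(self):
        self._blocks = {}

    def put(self, data):
        """Store an immutable block and return its key (hex SHA-1 of the contents)."""
        key = hashlib.sha1(data).hexdigest()
        self._blocks[key] = data  # storing the same bytes again changes nothing
        return key

    def get(self, key):
        """Fetch a block and check that it still matches its key."""
        data = self._blocks[key]
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError("block contents do not match key")
        return data

store = ContentStore()
key = store.put(b"hello, chord")
data = store.get(key)
same_key = store.put(b"hello, chord")  # identical contents map to the same key
```

Because the key is derived from the data, two writers can never disagree about what a key means, and a reader can verify any block it receives by rehashing it.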

How large of a system can Chord/DHash scale to?

In theory, the protocols themselves scale logarithmically with the number of nodes.

However, there aren't any wide-area p2p systems currently (2005) that scale much beyond
several million simultaneous users. Our implementation has never been tested with more than
hundreds of participating nodes and millions of data items.

Does Chord/DHash protect against malicious users?

No and sort-of. Security in distributed rendezvous protocols like Chord is still an open research
question (2005), though some early results are discussed in the Proceedings of the first
IPTPS.

DHash provides integrity protection of data by restricting the IDs used to be the output of a
cryptographic hash of the data or a public-key signature. However, it does not protect against
denial of service attacks where malicious nodes interfere with routing.

Does Chord/CFS support keyword search?

CFS does not support keyword search. Our investigations into keyword search have suggested
that simple solutions result in poor load-balance: for example, naively storing an index of all
items which contain a keyword K at the successor of the hash of K. For more information, see
this paper.

What protocols does Chord use?

Chord communicates with peers using standard RPCs. The specific protocols are defined using
XDR and can be viewed via our WebCVS interface at http://cvs.pdos.lcs.mit.edu/cvs/sfsnet/svc/.
However, these definitions do not specify the semantics in detail; if you want to implement
Chord, you may be better off doing it without seeking compatibility with our implementation.

What transport does Chord use?

Chord and DHash use a custom-built transport layer optimized for peer-to-peer communication
patterns; it uses the Vivaldi network coordinates system to predict round-trip times to remote
hosts and to window outstanding RPCs. This is because TCP turns out not to be a particularly
good transport for peer-to-peer DHT communication patterns: TCP relies on communication between
a single pair of nodes to form its RTT estimate and to find the proper window size.
Peer-to-peer nodes tend to communicate with many nodes briefly, which makes TCP set-up
expensive while the exchanges are too short to get out of slow-start or measure the RTT well.
If long-standing TCP connections were left open to hundreds of nodes, the kernel would run out
of buffers.

Our transport layer is implemented on top of the SFS asynchronous RPC libraries over UDP.
