
Distributed Systems

The document provides an overview of distributed leader election algorithms in synchronous networks. It begins with background on distributed computing and the leader election problem. It then discusses different network models and complexity measures. The document proceeds to summarize several classic leader election algorithms, including LCR for rings, HS for bidirectional rings, and FloodMax for general graphs. It also covers minimum spanning tree and leader election in anonymous rings. The summaries focus on time and message complexity analyses.

Uploaded by

aditi

Distributed Leader Election Algorithms in Synchronous Networks
Mitsou Valia
National Technical University of Athens
School of Applied Mathematics and Physics


Distributed Computing
Distributed computing is decentralised, parallel computing: two or more computers communicate over a network to accomplish a common task.
The collaborating processes are often identical.
One of the central problems is leader election.

Leader Election

Given a network of processes, exactly one process should output the decision that it is the leader.
It is usually required that all non-leader processes are informed of the leader's election.

Networks
The timing model:
  Synchronous
  Asynchronous
  Partially synchronous

The failure model:
  Completely reliable
  Partly faulty:
    Stopping failure
    Byzantine failure

The Synchronous Network Model


Directed graph G(V,E), |V| = n
Nodes represent processes
Edges represent (directed) communication channels

The Synchronous Network Model (formal)

Alphabet M of messages (null indicates the absence of a message).
At every node i ∈ V there is a process, which consists of:
  states_i (a not necessarily finite set of states)
  start_i (the initial state)
  msgs_i (a message generation function)
  trans_i (a state transition function)

With each edge (i, j) there is a link that can hold at most a single message in M.
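The formal model above can be sketched as a small round-based executor. This is only an illustration under stated assumptions: the function name run_rounds, the representation of msgs_i and trans_i as Python callables, the use of None for the null message, and the example process that remembers the largest value seen are all hypothetical and not part of the slides.

```python
from typing import Any, Callable, Dict, Hashable, List, Optional, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def run_rounds(
    edges: List[Edge],
    start: Dict[Node, Any],
    msgs: Callable[[Node, Any, Node], Optional[Any]],
    trans: Callable[[Node, Any, Dict[Node, Any]], Any],
    rounds: int,
) -> Dict[Node, Any]:
    """Run a synchronous network for a fixed number of rounds (illustrative)."""
    states = dict(start)
    for _ in range(rounds):
        # Step 1: every process applies msgs_i to its current state; each
        # link (i, j) holds at most one message (None plays the role of null).
        links = {(i, j): msgs(i, states[i], j) for (i, j) in edges}
        # Step 2: every process applies trans_i to its state and to the
        # messages found on its incoming links.
        states = {
            i: trans(i, s, {src: m for (src, dst), m in links.items() if dst == i})
            for i, s in states.items()
        }
    return states


# Hypothetical example process: remember the largest value seen so far.
def send_state(i, state, j):
    return state


def keep_max(i, state, inbox):
    return max([state] + [m for m in inbox.values() if m is not None])


ring = [(0, 1), (1, 2), (2, 0)]
final = run_rounds(ring, {0: 4, 1: 9, 2: 1}, send_state, keep_max, rounds=2)
```

Both steps of a round read only the states from the previous round, which is what makes the execution synchronous.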

Complexity measures
Time complexity: the number of rounds until all outputs are produced or all the processes halt.
Communication complexity: the number of non-null messages that are sent during the execution.

Leader Election in a Synchronous Ring

Setting
The network graph is a directed ring (unidirectional or bidirectional) consisting of n nodes (n may be unknown to the processes).
Processes run the same deterministic algorithm.
The only piece of information supplied to the processes is a unique integer identifier (UID).
UIDs may be used
  in comparisons only (comparison-based algorithms)
  in comparisons and other calculations (non-comparison-based algorithms).

Related Work and Important Results

Algorithm     Time complexity   Message complexity               Restrictions
LCR (79)      O(n)              O(n^2)
HS (80)       O(n)              O(n log n)                       Bidirectional
ML (06)       O(n)              O(n log n) (better constant)
TimeSlice     O(n u_min)        O(n)                             Non-comparison-based
Lower bound   Ω(n) (trivial)    Ω(n log n), FL (87)

The LCR Algorithm

Comparison-based Algorithm
The size of the ring is unknown to the processes
Unidirectional Ring
It elects the process with the maximum UID


The LCR Algorithm


Description
Each process sends its UID around the ring.
When a process receives a UID, it compares
this one to its own.
If the incoming UID is greater, then it
passes this UID to the next process.
If the incoming UID is smaller, then it
discards it.
If it is equal, then the process declares
itself the leader.
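The description above can be simulated round by round. The following is a minimal sketch, assuming that process i sends to process (i+1) mod n and that all UIDs are distinct; the function name lcr and its return value (leader UID, number of rounds, number of messages) are illustrative choices, not part of the slides.

```python
from typing import List, Optional, Tuple


def lcr(uids: List[int]) -> Tuple[int, int, int]:
    """Simulate LCR on a unidirectional ring of distinct UIDs.

    Returns (leader UID, rounds used, messages sent).
    """
    n = len(uids)
    # In the first round every process sends its own UID.
    outgoing: List[Optional[int]] = list(uids)
    rounds = messages = 0
    leader = None
    while leader is None:
        rounds += 1
        messages += sum(m is not None for m in outgoing)
        # Process i receives what process i-1 sent in this round.
        incoming = [outgoing[(i - 1) % n] for i in range(n)]
        outgoing = [None] * n
        for i, m in enumerate(incoming):
            if m is None:
                continue
            if m > uids[i]:
                outgoing[i] = m   # greater: pass it on
            elif m == uids[i]:
                leader = m        # own UID came back: declare leader
            # smaller: discard
    return leader, rounds, messages


print(lcr([4, 7, 3, 6]))
```

With the UIDs arranged in decreasing order along the direction of travel, for example lcr([5, 4, 3, 2, 1]), the simulation sends n(n+1)/2 messages, which illustrates the quadratic worst case.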

LCR Example
[Figure: a ring of four processes with UIDs 4, 7, 3 and 6.]

LCR Complexity Analysis


Time Complexity: O(n)
Message Complexity: O(n^2) in the worst case, O(n log n) in the average case

The HS Algorithm

Comparison-based Algorithm
The size of the ring is unknown to the processes
Bi-directional Ring
It elects the process with the maximum UID


The HS Algorithm
Description
Each process operates in phases 0, 1, 2, ...
In each phase k, process i sends tokens carrying its UID in both directions; each token is meant to travel distance 2^k and then return to i.
If both tokens return, process i continues to phase k+1.

Figure: The execution of the HS algorithm

The HS Algorithm (continued)


When a process receives an outbound UID, it compares it with its own.
  If the received UID is smaller, then it discards it.
  If the received UID is greater, then
    it passes it to the next process, if this is not the end of its path;
    otherwise it sends it back to the previous process (to travel back to the originating process).
  If it is equal, then the process declares itself the leader.
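The effect of the rules above can be summarised phase by phase without simulating individual tokens: a process gets both tokens back in phase k exactly when its UID is greater than every UID within distance 2^k on either side. The sketch below computes only this phase-level outcome; it is not a message-level implementation of HS, and the function name hs_phases is an illustrative assumption.

```python
from typing import List, Tuple


def hs_phases(uids: List[int]) -> Tuple[List[List[int]], int]:
    """Return the UIDs active at the start of each phase, and the leader.

    Positions in the list are positions on the ring; UIDs are distinct.
    """
    n = len(uids)
    active = set(range(n))
    history = []
    k = 0
    while True:
        history.append(sorted(uids[i] for i in active))
        d = 2 ** k
        if d >= n:
            # A token sent this far travels all the way round and reaches
            # its originator as an outbound token; only the maximum UID is
            # never discarded, so its owner declares itself the leader.
            return history, max(uids)
        survivors = set()
        for i in active:
            # Every process on the path compares the token with its own
            # UID, whether or not that process is still active.
            path = [uids[(i + s) % n] for s in range(-d, d + 1) if s != 0]
            if all(uids[i] > u for u in path):
                survivors.add(i)
        active = survivors
        k += 1


history, leader = hs_phases([3, 8, 1, 6, 5, 2, 7, 4])
print(history, leader)
```

The set of active processes shrinks quickly from one phase to the next, which is the intuition behind the O(n log n) message bound.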

HS Example (a ring of 20 nodes; UIDs shown in the figure for each phase)
Phase 0: all 20 nodes
Phase 1: 15, 19, 20, 18, 16, 12, 17
Phase 2: 15, 20, 18, 16, 17
Phase 3: 20, 17
Phase 4: 20
Phase 5: 20 is the LEADER

HS Complexity Analysis
Time Complexity: O(n)
Message Complexity: O(n log n)

Distributed Algorithms in a
General Synchronous Network


Leader Election in a General Network: The FloodMax Algorithm

The diameter of the graph (diam) is known to the processes.
The algorithm causes both the leader and the non-leaders to identify themselves.
It elects the process with the maximum UID.

FloodMax Algorithm
Every process keeps the maximum UID it has seen so far (initially its own).
At each round, each process sends this maximum value to every outgoing neighbor.
After diam rounds, if the maximum value is the process's own UID, then it elects itself the leader; otherwise it is a non-leader.
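The steps above can be sketched as a short simulation. This is a minimal illustration, assuming a strongly connected directed graph given as a dictionary of outgoing neighbors; the function name floodmax and its return value (status of every process, total messages sent) are illustrative choices.

```python
from typing import Dict, Hashable, List, Tuple


def floodmax(
    out_neighbors: Dict[Hashable, List[Hashable]],
    uid: Dict[Hashable, int],
    diam: int,
) -> Tuple[Dict[Hashable, str], int]:
    """Simulate FloodMax for diam rounds; return (status per node, messages)."""
    max_seen = dict(uid)  # initially every process knows only its own UID
    messages = 0
    for _ in range(diam):
        # All sends of a round use the values held at the start of the round.
        inbox = {v: [] for v in out_neighbors}
        for v, nbrs in out_neighbors.items():
            for w in nbrs:
                inbox[w].append(max_seen[v])
                messages += 1
        for v in inbox:
            max_seen[v] = max([max_seen[v]] + inbox[v])
    status = {
        v: "leader" if max_seen[v] == uid[v] else "non-leader" for v in uid
    }
    return status, messages


# A path a - b - c - d with links in both directions (diameter 3).
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
status, messages = floodmax(path, {"a": 5, "b": 2, "c": 9, "d": 1}, diam=3)
print(status, messages)
```

Every directed edge carries one message per round, so the total is diam times the number of directed edges.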

Complexity Analysis

Time Complexity: diam rounds
Communication Complexity: diam · |E| (|E| messages in every round).

Minimum Spanning Tree


Spanning tree of a graph G(V,E): a tree that consists
entirely of edges in E and contains every vertex of G.
The problem: Given an undirected graph G(V,E) find a
minimum weight (undirected) spanning tree for the
network.
Distributed output: Each process should determine which of
its incident edges belong to the tree.
Processes know n
Processes have UIDs

Minimum Spanning Tree (continued)

General Strategy for MST:


Start with the trivial spanning forest.
For every connected component C select a
minimum weight outgoing edge e.
Combine C with the component at the other
end of e, including e.
Stop when the forest has a single component.

Minimum Spanning Tree (continued)

Several well-known sequential MST algorithms are special cases of this general strategy:
  Prim (add a minimum-weight outgoing edge from the current component, attaching a single new node)
  Kruskal (add a minimum-weight edge that joins two separate components)

Minimum Spanning Tree (continued)


A distributed version could be:
Each component determines a minimum-weight outgoing edge, and all these edges are added to the forest at once, combining several components in a single step.
This strategy is not correct in general.
Example: a cycle could be created when edge weights are not distinct.
Lemma: If all edges of G have distinct weights, then there is exactly one MST.

Minimum Spanning Tree (continued)


The SynchGHS algorithm
(Based on an asynchronous algorithm developed
by Gallager, Humblet and Spira in 1983.)
The strategy mentioned before is used.
Assumption: Edge weights are all distinct.


Minimum Spanning Tree (continued)


The Algorithm:
The algorithm builds components in levels.
For each level k, the level k components are subtrees of the MST that together constitute a spanning forest.
Each level k component has at least 2^k nodes.
Every component at every level has a distinguished leader node.
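The level-by-level merging can be illustrated with a sequential sketch in which, at every level, each component selects its minimum-weight outgoing edge and all selected edges are added at once. This is a centralised simulation of the merging strategy only (essentially Borůvka's algorithm), not the distributed SynchGHS protocol; the function name boruvka_levels, the edge format (weight, u, v) and the union-find bookkeeping are assumptions made for the example. Edge weights are assumed to be distinct and the graph connected.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (weight, u, v) with nodes numbered 0..n-1


def boruvka_levels(n: int, edges: List[Edge]) -> Tuple[List[Edge], int]:
    """Return (MST edges, number of merging levels) for distinct weights."""
    parent = list(range(n))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = set()
    components = n
    levels = 0
    while components > 1:
        # Each component selects its minimum-weight outgoing edge.
        best = {}
        for e in edges:
            _, u, v = e
            cu, cv = find(u), find(v)
            if cu == cv:
                continue
            for c in (cu, cv):
                if c not in best or e < best[c]:
                    best[c] = e
        if not best:
            raise ValueError("graph is not connected")
        # All selected edges are added at once; with distinct weights
        # this cannot create a cycle.
        for w, u, v in best.values():
            cu, cv = find(u), find(v)
            if cu != cv:
                parent[cu] = cv
                mst.add((w, u, v))
                components -= 1
        levels += 1
    return sorted(mst), levels


tree, levels = boruvka_levels(4, [(1, 0, 1), (3, 1, 2), (2, 2, 3), (4, 3, 0)])
print(tree, levels)
```

Every merge at least doubles the size of the smallest component, which matches the claim that a level k component has at least 2^k nodes and gives at most log n levels.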

Minimum Spanning Tree (continued)

[Figures, four slides: the same example graph, with edge labels 17, 5, 10, 15, 3, 9, 13, 10, 14, 12, 2, 7, 16, 11.]

Minimum Spanning Tree (continued)


Complexity Analysis
Time Complexity: O(n log n)
  [log n levels] × [O(n) time per level, needed for synchronization].
Communication Complexity: O((n + |E|) log n)
  [log n levels] × [O(n) messages along tree edges + O(|E|) messages for finding the local minimum-weight outgoing edges].
It can be reduced to O(n log n + |E|).

Minimum Spanning Tree (continued)


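Non-unique edge weights:
Use an edge identifier: a triple (weight_{i,j}, u, u'), where u < u' are the UIDs of i and j.
Comparing these triples lexicographically defines a total ordering among the edge identifiers.
Example: in a triangle on the processes with UIDs 1, 2 and 3 in which every edge has weight 1, the edge identifiers are (1,1,2), (1,1,3) and (1,2,3).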
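A small sketch of the edge identifier, assuming that identifiers are compared lexicographically (which is how Python orders tuples); the function name edge_id is hypothetical.

```python
from typing import Tuple


def edge_id(weight: int, uid_i: int, uid_j: int) -> Tuple[int, int, int]:
    """Build the triple (weight, smaller UID, larger UID) for an edge."""
    return (weight, min(uid_i, uid_j), max(uid_i, uid_j))


# Triangle on UIDs 1, 2, 3 in which every edge has weight 1.
ids = sorted(edge_id(1, a, b) for a, b in [(2, 3), (3, 1), (1, 2)])
print(ids)
```

Because the UIDs are unique, no two edges share an identifier, so the identifiers can stand in for distinct weights.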

Minimum Spanning Tree (continued)


Leader Election:
The leaves of the MST begin a convergecast along the paths of the tree.
Internal nodes wait to receive messages from all but one neighbor. Then they send a message to the remaining neighbor.
If a node receives messages from every neighbor without having sent a message itself, then it becomes the leader.
If two neighboring nodes receive messages from each other in the same round, then the one with the greater UID becomes the leader.
Complexity: n-1 additional time and messages.
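The convergecast rules above can be simulated on a given tree. The sketch below assumes the tree is supplied as an undirected adjacency dictionary and that UIDs are distinct; the function name tree_leader and its return value (leader, rounds used) are illustrative.

```python
from typing import Dict, Hashable, List, Tuple


def tree_leader(
    adj: Dict[Hashable, List[Hashable]], uid: Dict[Hashable, int]
) -> Tuple[Hashable, int]:
    """Elect a leader on a tree by convergecast; return (leader, rounds)."""
    received = {v: set() for v in adj}  # neighbors each node has heard from
    sent_to = {}                        # node -> neighbor it sent to
    rounds = 0
    while True:
        rounds += 1
        # A node that has heard from all but one neighbor sends to that
        # neighbor (leaves do so in the first round).
        sends = []
        for v in adj:
            missing = set(adj[v]) - received[v]
            if v not in sent_to and len(missing) == 1:
                sends.append((v, missing.pop()))
        for v, w in sends:
            sent_to[v] = w
            received[w].add(v)
        # Two neighbors sent to each other in the same round:
        # the greater UID wins.
        for v, w in sends:
            if sent_to.get(w) == v:
                return max(v, w, key=uid.get), rounds
        # A node heard from every neighbor without having sent anything.
        for v in adj:
            if v not in sent_to and received[v] == set(adj[v]):
                return v, rounds
        if not sends:
            raise ValueError("adj does not describe a tree")


# A path a - b - c - d: the two middle nodes exchange messages in round 2.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(tree_leader(path, {"a": 4, "b": 1, "c": 2, "d": 3}))
```

The two return statements correspond to the two ways the slide says a leader can emerge.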

Leader Election in Anonymous Rings

General
Lemma: If the network is symmetric (e.g. a ring) and anonymous (the processes have no UIDs), then it is impossible to elect a leader by a deterministic algorithm. [by Angluin (1980)]

Probabilistic algorithms are used to break symmetry.

Itai and Rodeh Algorithm


Assumption: Processes know n.
The Algorithm
The algorithm proceeds in phases, each consisting of n rounds.
In every phase, a ≤ n processes are active (initially all of them). During each phase some processes may become inactive.
At the beginning of every phase, every active process decides with probability a^-1 whether or not to become a candidate.
To do that, it picks a random number r, 0 < r < 1, and if r < a^-1, then it becomes a candidate and initiates a pebble that travels around the ring.

Itai and Rodeh Algorithm


To compute the number of candidates (c), each process counts the pebbles it has seen.
Number of pebbles counted = Number of candidates.
At the end of the phase, every process has calculated c.
If c=1 then the sole candidate becomes the leader. If c>1 then a new phase begins in which the active processes are the candidates of the previous phase. If c=0 the phase was useless.
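The phase structure described above can be sketched as follows. The simulation works at the level of phases rather than individual pebble hops: it assumes that each pebble costs n messages (one trip around the ring) and that each phase costs n rounds. The function name itai_rodeh, the rng parameter (any object with a random() method, for example random.Random) and the numbering of processes, which is used only by the simulator since the ring itself is anonymous, are assumptions made for the example.

```python
import random
from typing import Tuple


def itai_rodeh(n: int, rng) -> Tuple[int, int, int, int]:
    """Simulate the phases of the algorithm on an anonymous ring of size n.

    Returns (leader position, phases, rounds, messages).
    """
    active = list(range(n))  # positions are bookkeeping for the simulator
    phases = 0
    messages = 0
    while True:
        phases += 1
        a = len(active)
        # Every active process becomes a candidate with probability 1/a.
        candidates = [p for p in active if rng.random() < 1 / a]
        c = len(candidates)
        messages += c * n  # each pebble travels once around the ring
        if c == 1:
            return candidates[0], phases, phases * n, messages
        if c > 1:
            active = candidates
        # c == 0: useless phase, the active set is unchanged


leader, phases, rounds, messages = itai_rodeh(10, random.Random(2024))
print(leader, phases, rounds, messages)
```

Passing a seeded random.Random makes a run reproducible; different seeds give different numbers of phases.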

Itai and Rodeh Algorithm: Example

[Figures, four slides:]
Phase 1: a^-1 = 1/10. Random numbers picked: 0.46, 0.04, 0.37, 0.35, 0.83, 0.64, 0.08, 0.22, 0.53, 0.93. c = 2.
Phase 2: a^-1 = 1/2 (the c = 2 candidates of phase 1 are active). Random numbers picked: 0.74, 0.88. Useless phase.
Phase 3: a^-1 = 1/2. Random numbers picked: 0.32, 0.69. c = 1.
The process that picked 0.32 is the leader.

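Itai and Rodeh Algorithm: Complexity Analysis

p(a,c): the probability that c out of a active processes become candidates. Then

\[ p(a,c) = \binom{a}{c}\, a^{-c} \left(1 - a^{-1}\right)^{a-c} \]

Proof:
Let X_i be a random variable with

\[ X_i = \begin{cases} 1, & \text{with probability } a^{-1} \\ 0, & \text{with probability } 1 - a^{-1} \end{cases} \]

X_i = 1 if process i becomes a candidate, else 0 (a Bernoulli trial).
Then X = \sum_{i=1}^{a} X_i is the number of processes that become candidates, and X follows a binomial distribution.
Thus

\[ P[X = c] = \binom{a}{c}\, a^{-c} \left(1 - a^{-1}\right)^{a-c} = p(a,c) \]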

Itai and Rodeh Algorithm: Complexity Analysis

Average Case
Time Complexity: 2.441716 n
Message Complexity: 2.441716 n
The number of pebbles initiated per phase is X (the number of active processes that become candidates).

\[ E[X] = E\left[\sum_{i=1}^{a} X_i\right] = \sum_{i=1}^{a} E[X_i] = a \cdot a^{-1} = 1 \]

Thus, the expected message complexity per phase is n.

Leader Election Protocols with Cheating Processes

The Model
Full- or Perfect-Information Model [BL 90]:
There is an adversary that controls t players.
The adversary has unlimited computational power.
Communication between players is by broadcast.
Delivery of messages is reliable.
The identity of the sender is protected.
The adversary has complete knowledge of the state of the protocol at any given moment.

The Network
We assume an asynchronous network with synchronization points:
Computation proceeds in rounds.
In each round processes send messages.
During a round we cannot force processes to act simultaneously.
Messages of round i precede those of round i+1.
Within a round, all cheaters have the opportunity to wait until they receive the messages of all honest players and then send their own.

Example
A leader election protocol for n processes (Baton Passing [Saks 89]):
In every round the baton is passed at random to a process that has not yet received it.
The last process left with the baton becomes the leader.

\[ P(i) = \frac{n-1}{n} \cdot \frac{n-2}{n-1} \cdots \frac{2}{3} \cdot \frac{1}{2} = \frac{1}{n} \]

If there are cheaters, when they take the baton they give it to an honest process, in order to increase the probability that a cheater is elected.
Baton Passing lasts n-2 rounds.
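The telescoping product for P(i) can be checked exactly, and the honest protocol is easy to simulate. The sketch below assumes that every process is honest, that the first holder is chosen uniformly at random, and that each pass goes to a uniformly random process that has not yet held the baton; the function names baton_passing and last_holder_probability are illustrative.

```python
import random
from fractions import Fraction


def baton_passing(n: int, rng: random.Random) -> int:
    """Simulate honest Baton Passing; return the last holder (the leader)."""
    holder = rng.randrange(n)
    waiting = [p for p in range(n) if p != holder]
    while waiting:
        # Pass the baton to a random process that has not yet received it.
        holder = waiting.pop(rng.randrange(len(waiting)))
    return holder


def last_holder_probability(n: int) -> Fraction:
    """Evaluate (n-1)/n * (n-2)/(n-1) * ... * 2/3 * 1/2 exactly."""
    p = Fraction(1)
    for k in range(n, 1, -1):
        p *= Fraction(k - 1, k)
    return p


print(last_holder_probability(7))
print(baton_passing(7, random.Random(0)))
```

The exact product equals 1/n for every n, so with honest players each process is equally likely to be elected.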

Failure probability
Let P(n,t) be a leader election protocol among n processes, t of which are corrupted.
fail_P(n,t): the probability that one of the cheaters is elected.
Proposition: For any n, any t ≥ 1 and any leader election protocol P:

1. fail_P(n,t) is non-decreasing in t.
2. fail_P(n,t) ≥ t/n
3. fail_P(n, ⌈n/2⌉) = 1

Resilience
Resilience: how many cheaters can be tolerated while the protocol still guarantees that an honest player is elected with positive probability.
Definition: P is resilient for t = b(n) iff there exists ε > 0 such that, for all suitably large n, fail_P(n, b(n)) ≤ 1 − ε.
However, if t ≥ 1 then fail_P(n,t) > 1/4 (P is the Lightest Bin protocol, which achieves optimal resilience).

[Figure: fail(n,t) plotted against t; the vertical axis marks 1 − ε and 1, the horizontal axis marks b(n) and n/2.]

Cheater's Edge
Cheater's edge: the factor by which fail_P(n,t) increases through cheating. (Antonakopoulos 2006)
Definition: edge_P(n,t) = (n/t) · fail_P(n,t) − 1 (≥ 0)
Example: assume a fair leader election protocol, in which honest play elects a cheater with probability t/n.
If edge = 1, then (n/t) · fail_P(n,t) − 1 = 1
⇒ (n/t) · fail_P(n,t) = 2
⇒ fail_P(n,t) = 2 · t/n
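The definition can be evaluated directly. The sketch below uses exact fractions; the function name cheaters_edge and the sample values n = 10, t = 2 are assumptions chosen only for illustration.

```python
from fractions import Fraction


def cheaters_edge(n: int, t: int, fail: Fraction) -> Fraction:
    """edge_P(n,t) = (n/t) * fail_P(n,t) - 1, where fail is fail_P(n,t)."""
    return Fraction(n, t) * fail - 1


# With n = 10 and t = 2, a failure probability of t/n gives edge 0,
# and a failure probability of 2t/n gives edge 1.
print(cheaters_edge(10, 2, Fraction(2, 10)))
print(cheaters_edge(10, 2, Fraction(4, 10)))
```

An edge of 0 means cheating brings no advantage over the t/n chance that a cheater would be elected anyway.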

Zero Edge Protocols


Zero Edge Protocols: protocols in which cheaters cannot increase their probability of election by cheating.
These protocols exist only for t=1.
For t>1 the adversary can find two players that can collude.
Example: Baton Passing (t=1)
Counterexample: Itai-Rodeh

Zero Edge Protocols


A picks a row
B picks a column
C picks a level
D, E are mute.

The selected square determines the leader.
A cannot increase the probability of his election.
The same holds for B and C.

Related Work and Important Results


Probabilistic arguments have established that:
1. There exists a leader election protocol AN1 with
bounded cheaters edge for all t n.
[Alon, Naor, 93].
2. For any δ < 1/2, there exists a protocol that is resilient for t = δn. [AN93, BN00] (In terms of resilience, this is optimal.)
Disadvantages
Non-constructive. Exhaustive search may be attempted, but could take time 2^(2^O(n)).
O(n) running time (very slow)

Related Work and Important Results: Reduction via Committees

Lemma: From a leader election protocol P(n,t) that is executed in r(n) rounds and constructed in s(n) time, we can obtain a leader election protocol cmt|P(log^d n, (t/n + c/log n) log^d n) that lasts r(log^d n)+1 rounds and is constructible in s(log^d n)+poly(n) time.
[Russell - Zuckerman 01]

Related Work and Important Results


General scheme to overcome this drawback:
1. Players pick a small committee.
2. Committee members pick a leader among themselves using a suitably good protocol, discovered via exhaustive search (so it does not have to be efficiently constructible).
After a long line of work, (log* n + O(1))-round protocols with optimal resilience were achieved.
[Russell, Zuckerman 01], [Feige 99].

None of them has bounded cheater's edge.

Related Work and Important Results


Antonakopoulos (2006) presented three leader election protocols with bounded cheater's edge that are poly(n)-time constructible:

Protocol   Condition                        Rounds
P*         t (n / log n)
P#         t (n / (log n log log n))        5 log n
P+                                          polylog n