Synchronization

The document discusses various issues related to synchronization in distributed systems. It explains concepts like clock synchronization, event ordering, mutual exclusion and election algorithms. It describes different clock synchronization techniques, including centralized and distributed algorithms.

 Processes in distributed systems communicate with each other.
 It is necessary to understand how processes cooperate and synchronize with each other.
 The following are the issues related to synchronization:

◦ Clock Synchronization

◦ Event ordering

◦ Mutual Exclusion

◦ Election Algorithms
 Most distributed applications require the clocks of the nodes to be synchronized with each other.

 Clocks tick at different rates:
◦ They create an ever-widening gap in perceived time.
◦ Clock Drift: the deviation of a clock from the real-time clock that was used for its initial setting.
◦ Clock Skew: the difference between two clocks at one point in time.
 Clock synchronization algorithms are classified into:
 Centralized Algorithms
 Passive Time Server
 Active Time Server
 Distributed Algorithms
 Global Averaging
 Localized Averaging
 One node has a real time receiver called the Time Server Node.

 The time of the server node is used as the reference time.

 The clocks of all other nodes are synchronized with the time server node.

 Depending on the role of the time server node, the centralized clock synchronization algorithms are classified into:
 Passive Time Server
 Active Time Server
➢ Each node periodically sends a message ("time = ?") to the time server node.
➢ The time server responds with a message ("time = Tserver"), where Tserver is the current time in the clock of the time server node.
➢ Request sent at client time T0; reply received at client time T1.
➢ Assuming network delays are symmetric:
New time = Tserver + (T1 – T0)/2

[Figure: the client sends its request at T0, the server stamps the reply with Tserver, and the client receives the reply at T1.]
 There may be an unpredictable variation in message
propagation time between the two nodes.
 Cristian Algorithm:
1. P requests the time from S
2. After receiving the request from P, S prepares a response and
appends the time T from its own clock.
3. P then sets its time to be T + RTT/2
 Assumes that the RTT is split equally between request and response,
which may not always be the case but is a reasonable assumption on
a LAN connection.
 Further accuracy can be gained by making multiple requests to S
and using the response with the average RTT.
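The adjustment step above can be sketched as follows (a minimal sketch; the function name and sample values are illustrative, not from the slides):

```python
def cristian_adjust(t0, t1, t_server):
    """New client time per Cristian's algorithm: Tserver + RTT/2.
    Assumes the round-trip delay is split symmetrically, as is
    reasonable on a LAN."""
    rtt = t1 - t0
    return t_server + rtt / 2

# Request sent at client time 100, reply received at 110 (RTT = 10);
# the server stamped its reply with 104, so the client sets 104 + 5 = 109.
new_time = cristian_adjust(t0=100, t1=110, t_server=104)
```

With multiple samples, the same adjustment would simply be applied to the response whose RTT is closest to the average, as the slides describe.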
 The time server periodically broadcasts its clock time ("time = Tserver").
 Other nodes receive the message and use the clock time to correct their own clocks.
 The node's clock is readjusted to the time:
 Tnew = Tserver + Ta
 (Ta → the approximate time required for the message propagation from the server to the node)
 The time server periodically sends a message ("time = ?").
 Each computer sends back its clock value to the time server
 The time server has a priori knowledge of the approximate time
required for the propagation of the message
 It then takes a fault-tolerant average of the clock values of all
computers.
 To take the fault-tolerant average, the time server chooses a subset
of all clock values that do not differ from one another by more
than a specified amount, and the average is taken only for the
clock values in this subset
 The calculated average time is the current time to which all clocks
should be readjusted.
 The time server readjusts its own clock to this value
 It then sends the amount of time by which each individual computer
has to adjust its time
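One way to compute such a fault-tolerant average is sketched below (selecting the subset around the median is an assumption of this sketch; the slides only require that the chosen clock values not differ from one another by more than a specified amount):

```python
def fault_tolerant_average(clock_values, max_skew):
    """Average only those clock readings that lie close together,
    discarding outliers such as faulty clocks."""
    vals = sorted(clock_values)
    median = vals[len(vals) // 2]
    subset = [v for v in vals if abs(v - median) <= max_skew]
    return sum(subset) / len(subset)

# Three well-behaved clocks and one faulty one: the faulty reading
# (500) is excluded, and the average is taken over {100, 101, 102}.
avg = fault_tolerant_average([100, 101, 102, 500], max_skew=5)
```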
 Each node’s clock is independently synchronized with real time.

 Each node of the system is equipped with a real time receiver.

 Two approaches used for internal synchronization are :

 Global Averaging Distributed Algorithms

 Localized Averaging Distributed Algorithms


 The clock process at each node broadcasts its time as a "resync" message when its local time equals T0 + iR,
◦ where T0 is a fixed time in the past agreed upon by all nodes, and
◦ R is a system parameter that depends on such factors as the total number of nodes in the system, the maximum allowable drift rate, and so on.
 This message is broadcast from each node at the beginning of every fixed interval.
 After broadcasting, the clock process of a node waits for a time T.
 During this waiting period, it collects the resync messages sent by other nodes and records the receipt time of each.
 At the end of the waiting period, the clock process estimates the skew of its clock with respect to each of the other nodes.
 It then computes the fault-tolerant average of the estimated skews and corrects the local clock.
 The nodes are arranged in the pattern of a ring or a grid.

 Periodically, each node exchanges its clock time with its neighbors in the ring or the grid.

 It then sets its clock time to the average of its own clock time and the clock times of its neighbors.
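One round of localized averaging might be sketched like this (the three-node ring and its clock values are illustrative assumptions):

```python
def localized_average_step(clocks, neighbors):
    """Each node sets its clock to the average of its own clock and its
    neighbors' clocks (neighbors defined by the ring or grid topology)."""
    return {node: (clocks[node] + sum(clocks[m] for m in neighbors[node]))
                  / (1 + len(neighbors[node]))
            for node in clocks}

# A three-node ring: every node neighbors the other two, so a single
# round drives all clocks to the common average, 20.0.
ring = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
new_clocks = localized_average_step({0: 10, 1: 20, 2: 30}, ring)
```

In larger rings or grids the clocks converge gradually over repeated rounds rather than in a single step.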
 Lamport defined a new relation called happened-before for
partial ordering of events.
 Also introduced the concept of logical clocks for ordering of
events based on the happened-before relation.
 The happened-before relation(→) on a set of events satisfies
the following conditions:
1. If a and b are events in the same process and a occurs before b,
then a → b
2. If a is the event of sending a message by one process and b is the
event of the receipt of the same message by another process, then
a→b
3. If a → b and b → c, then a → c; i.e., happened-before is a
transitive relation
 In terms of the happened-before relation, two events a and b
are said to be concurrent if they are not related by the
happened-before relation.

◦ That is, neither a → b nor b → a is true.

 Two events are concurrent if neither can causally affect the other. For this reason, the happened-before relation is sometimes also known as the relation of causal ordering.
 a → b is true if and only if there exists a path from a to b by moving
forward in time along process and message lines in the direction of the
arrows.
For example, e10 → e11, e20 → e24, e11 → e23
e30 → e24 , e11 → e32
 Two events a and b are concurrent if and only if no path exists either
from a to b or from b to a.
For example, e12 and e20, e21 and e30, e10 and e30, e11 and e31.
 Logical clocks concept is a way to associate a timestamp with
each system event so that events that are related to each other
by the happened-before relation can be properly ordered in
that sequence.
 Implementation of Logical Clocks
 C1 : If a and b are two events within the same process Pi and a
occurs before b then Ci(a) < Ci(b)
 C2 : If a is the sending of a message by process Pi and b is the
receipt of that message by process Pj then Ci(a) < Cj(b) .
 C3 : A clock Ci associated with a process Pi must always go
forward, never backward.
 To meet conditions C1, C2 and C3, the algorithm uses the following implementation rules:

 IR1: Each process Pi increments Ci between any two successive events.

 IR2: If event a is the sending of a message m by process Pi, the message m carries a timestamp Tm = Ci(a); upon receiving the message m, process Pj sets Cj to a value greater than or equal to its present value and greater than Tm.
 Two processes P1 and P2 with their counters C1 and C2, counters
act as logical clocks
 Counters are initialized to zero and the counter is incremented by
1 whenever an event occurs in the process.
 If the event is sending of a message, the process includes the
incremented value of the counter in the message
 If the event is receiving of a message, a check is made to see whether the incremented counter value is less than or equal to the timestamp in the received message.
 If so, the counter value is corrected (set to one more than the received timestamp); otherwise, the counter value is left as it is.
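Rules IR1 and IR2 can be sketched as a small counter class (an illustrative implementation, not the slides' own code):

```python
class LamportClock:
    """A logical clock implementing Lamport's rules IR1 and IR2."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # IR1: increment between any two successive local events
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned value is the
        # timestamp Tm carried by the message
        return self.tick()

    def receive(self, tm):
        # IR2: advance past both the local clock and the timestamp Tm
        self.time = max(self.time, tm) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
for _ in range(3):
    p1.tick()                # p1's clock: 1, 2, 3
tm = p1.send()               # p1's clock: 4; message carries Tm = 4
p2.tick()                    # p2's clock: 1
p2.receive(tm)               # p2's clock is corrected to max(1, 4) + 1 = 5
```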
[Figure: two processes P1 and P2 with counters C1 and C2 acting as logical clocks. C1 advances through events e01–e08; C2 advances through events e11–e14 and is corrected (e.g., from 3 to 5 at e13) when a received message carries a larger timestamp.]
 Each process has a physical clock associated with it.
 Each clock runs at a constant rate, but the rates at which different clocks run are different.
 For example, when the clock ticks 10 times in processor P1, it ticks 8 times in processor P2.
[Figure: two processors P1 and P2 whose clocks run at different rates; P1's clock advances 0, 10, 20, ..., 120 while P2's clock advances 0, 8, 16, ..., 96 over the same interval, with P2's corrected values shown alongside.]
When P1's clock reads 60, P1 sends a message to P2, which arrives when P2's clock reads 56. Since 56 is less than 60, P2 adjusts its clock forward to 61.
 Election algorithms are meant for electing a coordinator process from among the currently running processes, in such a manner that at any instant of time there is a single coordinator for all processes in the system.
 Assumptions:
◦ Each process in the system has a unique priority number.
◦ The process having the highest priority is elected as a coordinator.
◦ On recovery, a failed process can take appropriate actions to rejoin
the set of active processes.
 An election algorithm finds out which of the currently
active processes has the highest priority number and then
informs this to all other active processes.
 The two basic election algorithms are as follows :
◦ Bully Algorithm and
◦ Ring Algorithm
 Used for dynamically selecting a coordinator by process ID
number.
 When a process P determines that the current coordinator is down (because of message timeouts or the coordinator's failure to initiate a handshake), it initiates an election:
a. p sends an election message to all processes with higher numbers.
b. If nobody responds, then p wins and takes over.
c. If one of the processes answers, then p's job is done.
2. If a process p receives an election message from a
lower-numbered process at any time, it:
◦ a. sends a response message back.
◦ b. holds an election (unless it is already holding one).

3. A process announces its victory by sending all processes a message telling them that it is the new coordinator.
4. If a process that has been down recovers, it holds an
election.
• Process 5 sends out only one election message, to Process 6.
• When Process 6 does not respond, Process 5 declares itself the winner.
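The steps above can be sketched as a simulation over process ids (the `alive` set stands in for which processes answer messages; a real implementation detects non-response with timeouts):

```python
def bully_election(initiator, alive):
    """Sketch of the bully algorithm: the initiator challenges all
    higher-numbered processes; the highest live process wins."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator       # nobody higher answered: initiator wins
    # Each responder holds its own election in turn; by induction the
    # highest-numbered live process ends up announcing victory.
    return max(higher)

# Processes 4 and 6 have crashed. Process 2 starts an election;
# processes 3 and 5 answer, and 5 (the highest survivor) wins.
coordinator = bully_election(2, alive={1, 2, 3, 5})
```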
 In this algorithm, it is assumed that all the processes in the
system are organized in a logical ring. The ring is
unidirectional.
 Logically ordered processes are in a ring such that each
process knows who is its successor.
 Every process in the ring knows the structure of the ring:
◦ it knows if any process in the ring has failed, and
◦ can skip the crashed process.
 When a process Pi sends a request message to the current coordinator and does not receive a reply within a fixed amount of time, it assumes that the coordinator has crashed.
 Then, it initiates an election by sending an election message to its
successor.
 At each step, the sender adds its process id to the message.
 On receiving the election message, the successor appends its own
id number to the message & passes it to the next active member in
the ring.
 In this manner, the election message circulates over the ring from one active process to another and eventually returns to process Pi.
 Process Pi recognizes its own message, which now contains the list of id numbers of all the active processes in the ring.
 It elects the process having the highest id value as the coordinator.
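The circulating election message can be sketched as follows (the five-process ring and the `alive` set modeling skipped crashed processes are illustrative assumptions):

```python
def ring_election(initiator, ring, alive):
    """Sketch of the ring election algorithm: circulate a message that
    collects the ids of active processes, then elect the highest id."""
    n = len(ring)
    pos = (ring.index(initiator) + 1) % n
    collected = [initiator]
    while ring[pos] != initiator:      # until the message returns to Pi
        if ring[pos] in alive:         # crashed processes are skipped
            collected.append(ring[pos])
        pos = (pos + 1) % n
    return max(collected)              # highest active id is coordinator

# Ring order 3 -> 1 -> 5 -> 2 -> 4 -> 3; process 2 has crashed.
# Process 1 initiates; the message collects ids {1, 5, 4, 3}; 5 wins.
coordinator = ring_election(1, ring=[3, 1, 5, 2, 4], alive={1, 3, 4, 5})
```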
 There are several resources in a system that must not be used
simultaneously by multiple processes.
 Exclusive access to such a shared resource by a process must be
ensured. This exclusiveness of access is called Mutual Exclusion
between processes.
 The sections of a program that need exclusive access to shared
resources are referred to as Critical Sections.
 The two basic approaches used by different algorithms for
implementing mutual exclusion in distributed systems are :
◦ Centralized approach
◦ Distributed approach
1. Safety Property: At any instant, only one process can execute the
critical section.

2. Liveness Property: This property states the absence of deadlock and starvation. Two or more processes should not endlessly wait for messages that will never arrive.

3. Fairness: Each process gets a fair chance to execute the CS. Fairness generally means that CS execution requests are executed in the order of their arrival in the system (arrival time being determined by a logical clock).

 The decision making for mutual exclusion is distributed across
the entire system.
 Uses logical clock for event ordering.
 Non-token-based Algorithms:
◦ Lamport's Distributed Mutual Exclusion Algorithm
◦ Ricart–Agrawala Algorithm
◦ Maekawa's Algorithm

 Token-based Algorithms:
◦ Suzuki–Kasami's Broadcast Algorithm
◦ Raymond's Tree-based Algorithm

Lamport's algorithm uses logical clocks to generate a unique timestamp for each event in the system.

 When a process wants to enter a critical section, it sends a request message to all other processes. The message contains:
◦ The process identifier

◦ The name of the critical section that the process wants to enter

◦ A unique timestamp (TS) generated by the process for the request message
 When a process Pi in a distributed system wishes to enter the critical section, it broadcasts a timestamped REQUEST message, along with its own identifier, to all the processes.

 On receiving the REQUEST, every process Pj inserts this request in its request queue RQj, which contains all the incoming requests.

 Requests are inserted at the correct place in the queue, based on increasing timestamp value.

 The process Pi also inserts its own request in its own RQi.

 The queue RQi at any process Pi thus maintains the order in which processes enter the CR.
1. The critical region request message
• Pi sends a REQUEST (timestampi, i) to all Pj (j = 1..N for j ≠ i)
• Pi queues its own request in the queue, RQi.
2. Pj, on receiving the request message from Pi
• Pj queues this request in RQj in timestamp order.
• Sends a REPLY to Pi. /* The REPLY message is an acknowledgement that
RQj has been appended with the request of Pi*/

3. The execution of the CR
• A process Pi can enter the CR only if both of the following conditions are satisfied:
- Pi has received a REPLY from all other processes.
- The REQUEST of Pi is at the front of RQi.
4. The release of the CR
• Process Pi exits the CR and dequeues the top entry from its RQi.
• Broadcasts a RELEASE message (timestamped with its
corresponding REQUEST message) to all processes.
• On receiving RELEASE message from Pi, Pj removes Pi’s entry
from RQj.
/* Removing the entries indicates, that process has exited */
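The per-process bookkeeping of steps 1–4 can be sketched like this (message transport is abstracted away; the class and method names are illustrative assumptions, not the slides' code):

```python
import heapq

class LamportMutex:
    """Request-queue bookkeeping for one process in Lamport's algorithm."""

    def __init__(self, pid):
        self.pid = pid
        self.clock = 0
        self.rq = []                 # RQi: (timestamp, pid), a min-heap
        self.replied = set()         # peers whose REPLY has arrived

    def request(self):
        self.clock += 1              # timestamp the REQUEST
        heapq.heappush(self.rq, (self.clock, self.pid))
        self.replied.clear()
        return (self.clock, self.pid)    # broadcast to all other processes

    def on_request(self, ts, pid):
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.rq, (ts, pid))   # queue in timestamp order
        return 'REPLY'                       # acknowledge the sender

    def on_reply(self, sender):
        self.replied.add(sender)

    def can_enter(self, peers):
        # Enter the CR only if our REQUEST heads RQi and all peers replied
        return (bool(self.rq) and self.rq[0][1] == self.pid
                and self.replied >= set(peers))

    def release(self):
        heapq.heappop(self.rq)       # dequeue our own entry
        return 'RELEASE'             # broadcast so peers prune their RQj

m = LamportMutex(pid=1)
m.request()
m.on_reply(2)
m.on_reply(3)
entered = m.can_enter(peers={2, 3})  # front of queue and all replied
```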

 Dotted lines indicate Reply message

Performance Parameters
 Lamport's algorithm has a message overhead of 3(N − 1) messages per CR invocation:
N − 1 REQUEST messages to all other processes,
N − 1 REPLY messages, and
N − 1 RELEASE messages.

 It can be optimized by reducing the number of RELEASE messages sent.
 This algorithm is similar to Lamport's algorithm, the difference being in the sending of REPLY messages.

 It uses two types of messages: REQUEST and REPLY.

 A process Pi sends its REQUEST message, with its timestamp and identifier, to all the processes, indicating its wish to enter the CR.

 If a process receiving the request does not want to enter the CR (and is not currently executing in the CR), it sends a REPLY message back to the requesting process, granting its permission for CR entry.

 Only when the REPLY has come back from all processes does the requesting process enter the critical region.
 If the receiving process itself wants to enter the critical region, the timestamp of the incoming REQUEST message is compared with the timestamp of its own request to establish priority.

 The REPLY to the requesting process is deferred if the receiving process's own timestamp is lower (i.e., its own request is older).

 Note: Each process Pi maintains a request-deferred array, RDi, along with its request queue, RQi.
 Requesting the critical section:
a. When a process Pi wants to enter the CS, it broadcasts a
timestamped REQUEST message to all other processes.
b. When Process Pj receives a REQUEST message from Pi, it sends
a REPLY message to Pi; if Pj is neither requesting nor executing
the CS,
or
if Pj is also requesting and Pi's timestamp is smaller than Pj's timestamp.
Otherwise, the REPLY is deferred and Pj sets RDj[i] = 1.

 Executing the critical section:
c. Pi enters the CS after it has received a REPLY message from
every process it sent a REQUEST message to.
 Releasing the critical section:
d. When Pi exits the CS, it sends all the deferred REPLY messages: for each j with RDi[j] = 1, it sends a REPLY message to Pj and sets RDi[j] = 0.
Note:
 When a site receives a message, it updates its clock using the
timestamp in the message.
 When a site takes up a request for the CS for processing, it
updates its local clock and assigns a timestamp to the request
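The reply-or-defer decision of step (b) can be sketched as a pure function (the state names RELEASED/WANTED/HELD are an assumed convention, not terminology from the slides):

```python
def ra_should_reply(state, my_ts, my_id, req_ts, req_id):
    """Return True if the incoming REQUEST (req_ts, req_id) gets an
    immediate REPLY, False if it is deferred (RD[req_id] set to 1)."""
    if state == 'RELEASED':            # neither requesting nor in the CS
        return True
    if state == 'HELD':                # currently executing the CS
        return False
    # state == 'WANTED': compare (timestamp, id) pairs; the process id
    # breaks timestamp ties, and the older (smaller) request wins.
    return (req_ts, req_id) < (my_ts, my_id)

# P2's request (ts 3) is older than P1's own pending request (ts 5),
# so P1 replies; a younger request (ts 8) would be deferred.
replies = ra_should_reply('WANTED', my_ts=5, my_id=1, req_ts=3, req_id=2)
defers = ra_should_reply('WANTED', my_ts=5, my_id=1, req_ts=8, req_id=2)
```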
 The algorithm guarantees
◦ mutual exclusion because a process can enter its critical section
only after getting permission from all other processes, and in the
case of a conflict only one of the conflicting processes can get
permission from all other processes.
◦ freedom from starvation since entry to the critical section is
scheduled according to the timestamp ordering.
• It has also been proved that the algorithm is free from deadlock.
• Furthermore, if there are n processes, the algorithm requires n − 1 REQUEST messages and n − 1 REPLY messages, giving a total of 2(n − 1) messages per critical section entry.
Drawbacks:
1. In a system having n processes, the algorithm is liable to n
points of failure because if one of the processes fails, the
entire scheme collapses.
The failed process will not reply to request messages that will be
falsely interpreted as denial of permission by the requesting
processes, causing all the requesting processes to wait indefinitely.

2. The algorithm requires that each process know the identity


of all the processes participating in the mutual-exclusion
algorithm
3. In this algorithm, a process willing to enter a critical section
can do so only after communicating with all other
processes and getting permission from them.
The algorithm is suitable only for a small group of cooperating
processes.
 It is a quorum- or voting-based mutual exclusion algorithm.

 It suggests that a process Pi need not send its request to all processes, but only to a subset of processes (the quorum), called Ri.

 Each process in a quorum set gives permission to at most one process at a time.
Properties of a Quorum Set
Each process Pi is associated with a set of processes called its quorum or voting set, Ri. The sets should satisfy the following conditions:
/* The intersection of any two quorum sets is non-empty */
1. Ri ∩ Rj ≠ ∅, for all i, j with 1 ≤ i, j ≤ N

/* Each process belongs to its own quorum set */
2. Pi ∈ Ri, for all i, 1 ≤ i ≤ N

/* Each quorum set has the same size, K */
3. |Ri| = K, for all i, 1 ≤ i ≤ N

/* Each process appears in the same number M of quorum sets */
4. Any process Pj is contained in M of the Ri's, for 1 ≤ i, j ≤ N
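A simple way to build quorum sets satisfying conditions 1, 2 and 4 is a √N × √N grid (a sketch under an assumed construction: grid quorums have size 2√N − 1 rather than Maekawa's optimal projective-plane size K ≈ √N, but every pair of quorums intersects):

```python
import math

def grid_quorums(n):
    """Quorum sets from a k x k grid (assumes n is a perfect square).
    Process (r, c) takes its whole row plus its whole column, so any
    two quorums share at least one process: Ri ∩ Rj != empty."""
    k = math.isqrt(n)
    assert k * k == n, "this sketch assumes n = k * k"
    quorums = []
    for p in range(n):
        r, c = divmod(p, k)
        row = {r * k + j for j in range(k)}
        col = {i * k + c for i in range(k)}
        quorums.append(row | col)
    return quorums

# Nine processes on a 3 x 3 grid: each quorum has 2*3 - 1 = 5 members,
# contains its own process, and overlaps every other quorum.
qs = grid_quorums(9)
```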
The data structures used by each process Pi:
1. The request-deferred queue, RDi
/* of processes REQUESTing and not REPLIED to*/
2. A variable called ‘voted’ = FALSE is set initially;
/* TRUE when a reply is sent indicating that it has already granted
permission to a process in its quorum */

 Example (four processes with overlapping quorum sets):

R1 = {P1, P2, P4}   R2 = {P2, P1, P3}
R3 = {P3, P2, P4}   R4 = {P4, P3, P1}

[Figure: each process Pi starts with voted = False and an empty request-deferred queue RDi; REPLY messages flow between each process and the members of its quorum set.]
The process states are explained using the following algorithm.
1. The critical region request message from Pi
• Pi sends a REQUEST(timestampi, i) to all Pj in its quorum set Ri
2. Pj on receiving the REQUEST message from Pi
• If the variable votedj = TRUE, or if process Pj itself is currently executing the CR,
- then the REQUEST from Pi is deferred and pushed into the queue RDj;
- else a REPLY is sent to Pi and the variable votedj is set to TRUE
/* termed as permission granted */
3. The execution of the CR
• If all REPLY received from processes in Ri, enter CR.
4. The release of the CR
• After execution of the CR, process Pi sends a RELEASE message to all Pj in Ri.
5. The receipt of RELEASE message: /* Pi sends to all Pj in Ri*/
• If RDj is nonempty
(a) dequeue top of the queue RDj,
/* looks out for requests which came while Pi was in CR*/
(b) Pj sends a REPLY message to only this dequeued process.
(c) Set votedj to be TRUE
• If queue RDj was empty
(a) votedj = false

Performance Parameters
 Maekawa used the theory of projective planes and showed that N = K(K − 1) + 1; this relation gives |Ri| ≈ √N.
1. An execution of the CR requires √N REQUEST, √N REPLY and √N RELEASE messages, for a total of 3√N messages per CR execution.
2. The synchronization delay is 2T (where T is the average message delay).
3. M = K = √N works best.
