Synchronization
[Figure: processes P1, P2, and P3 communicating in a distributed system]
The processes in a distributed system communicate with each other.
It is necessary to understand how processes cooperate and synchronize with each other.
The following issues are related to synchronization:
◦ Clock Synchronization
◦ Event ordering
◦ Mutual Exclusion
◦ Election Algorithms
Most distributed applications require the clocks of the nodes to be synchronized with each other.
In the time server approach, the clocks of all other nodes are synchronized with the clock of the time server node.
[Figure: Cristian's algorithm — the client sends a request at time T0, the server replies with its clock value Tserver, and the client receives the reply at time T1]
There may be an unpredictable variation in message
propagation time between the two nodes.
Cristian's Algorithm:
1. P requests the time from the time server S.
2. After receiving the request from P, S prepares a response and appends the time T from its own clock.
3. P then sets its time to T + RTT/2, where RTT is the measured round-trip time of the request.
Assumes that the RTT is split equally between request and response,
which may not always be the case but is a reasonable assumption on
a LAN connection.
Further accuracy can be gained by making multiple requests to S
and using the response with the average RTT.
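A minimal Python sketch of these steps follows; query_server() is a hypothetical stand-in for the actual request/reply exchange with S, and repeating the request several times is one possible reading of the multi-request refinement mentioned above:

```python
import time

def query_server():
    # Hypothetical placeholder for the request/reply exchange with the
    # time server S; a real client would read T from the reply message.
    return time.time()

def cristian_sync(attempts=5):
    """Estimate the server's time as T + RTT/2, averaged over several requests."""
    estimates = []
    for _ in range(attempts):
        t0 = time.monotonic()         # request sent
        server_time = query_server()  # T, read from the server's clock
        t1 = time.monotonic()         # reply received
        rtt = t1 - t0
        estimates.append(server_time + rtt / 2)
    return sum(estimates) / len(estimates)

print(cristian_sync())
```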
The time server periodically broadcasts its clock time ("time = Tserver").
Other nodes receive the message and use the clock time
to correct their own clocks
The node's clock is readjusted to the time Tnew = Tserver + Ta
(Ta is the approximate time required for the message to propagate from the server to the node).
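For example, if a broadcast carries Tserver = 10:00:00.000 and the assumed propagation time Ta is 5 ms, the receiving node would reset its clock to 10:00:00.005 (the 5 ms figure is purely illustrative).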
The time server periodically sends a message ("time = ?").
Each computer sends back its clock value to the time server
The time server has a priori knowledge of the approximate time
required for the propagation of the message
It then takes a fault-tolerant average of the clock values of all
computers.
To take the fault-tolerant average, the time server chooses a subset
of all clock values that do not differ from one another by more
than a specified amount, and the average is taken only for the
clock values in this subset
The calculated average time is the current time to which all clocks
should be readjusted.
The time server readjusts its own clock to this value
It then sends the amount of time by which each individual computer
has to adjust its time
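A minimal Python sketch of the fault-tolerant averaging step, assuming the clock readings have already been collected and corrected for propagation delay (the readings, node names, and skew threshold are illustrative):

```python
def fault_tolerant_average(clock_values, max_skew):
    """Average only a subset of readings that lie close to one another,
    ignoring clocks that have drifted far from the rest."""
    best_subset = []
    for pivot in clock_values:
        # Simple approximation: take all readings within max_skew of this one.
        subset = [c for c in clock_values if abs(c - pivot) <= max_skew]
        if len(subset) > len(best_subset):
            best_subset = subset
    return sum(best_subset) / len(best_subset)

# The server readjusts its own clock and sends each node the *adjustment*
# (average minus that node's reading) rather than the absolute time.
readings = {"A": 100.0, "B": 100.4, "C": 99.8, "D": 250.0}  # D is faulty
avg = fault_tolerant_average(list(readings.values()), max_skew=1.0)
adjustments = {node: avg - value for node, value in readings.items()}
print(adjustments)
```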
Each node's clock is independently synchronized with real time.
Each node sets its clock time to the average of its own clock time and the clock times of its neighbors.
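For example, a node whose own clock reads 10:00:04 and whose neighbors report 10:00:00 and 10:00:02 would reset its clock to 10:00:02, the average of the three readings.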
Lamport defined a new relation called happened-before for the partial ordering of events.
He also introduced the concept of logical clocks for ordering events based on the happened-before relation.
The happened-before relation(→) on a set of events satisfies
the following conditions:
1. If a and b are events in the same process and a occurs before b,
then a → b
2. If a is the event of sending a message by one process and b is the
event of the receipt of the same message by another process, then
a→b
3. If a → b and b → c, then a → c, i.e., happened-before is a transitive relation.
In terms of the happened-before relation, two events a and b
are said to be concurrent if they are not related by the
happened-before relation.
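A minimal Python sketch of a logical clock that follows these rules (the class and method names are illustrative, not from the source):

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        # Events within one process: increment the clock so that
        # a occurring before b implies C(a) < C(b).
        self.time += 1
        return self.time

    def send_event(self):
        # A send is a local event; its timestamp travels with the message.
        return self.local_event()

    def receive_event(self, msg_timestamp):
        # On receipt, jump past the message's timestamp so that the
        # send always happens-before the receive: C(send) < C(receive).
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

# Example: a send on P1 happened-before the matching receive on P2.
p1, p2 = LamportClock(), LamportClock()
ts = p1.send_event()          # event a on P1, C(a) = 1
print(p2.receive_event(ts))   # event b on P2, C(b) = 2, so C(a) < C(b)
```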
[Figure: space-time diagram assigning logical clock values to events — e03, e04, and e05 carry clock values C1 = 3, 4, and 5 respectively]
For distributed mutual exclusion, the decision making is distributed across the entire system.
These algorithms use logical clocks for event ordering.
Non-token based Algorithms:
◦ Lamport’s Distributed Mutual Exclusion Algorithm
◦ Ricart-Agrawala Algorithm
◦ Maekawa’s Algorithm
Lamport's algorithm uses the logical clock to generate a unique timestamp for each event in the system.
A REQUEST message carries:
◦ The timestamp of the request
◦ The identifier of the requesting process
◦ The name of the critical section that the process wants to enter
When a process Pi in a distributed system wishes to enter the critical section, it broadcasts a timestamped REQUEST message, along with its own identifier, to all the other processes.
[Figure: message exchange in Lamport's mutual exclusion algorithm — dotted lines indicate REPLY messages]
Performance Parameters
Lamport’s algorithm has a total message overhead of 3(N − 1) messages per CS invocation:
N − 1 REQUEST messages to all other processes,
N − 1 REPLY messages, and
N − 1 RELEASE messages.
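For example, with N = 5 processes, each critical section invocation costs 3 × (5 − 1) = 12 messages.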
Ricart-Agrawala Algorithm:
This algorithm is similar to Lamport’s algorithm; the difference lies in the sending of REPLY messages.
If the process receiving the request does not want to enter the CR (and is not currently executing in the CR), it sends a REPLY message back to the requesting process, granting its permission for CR entry.
Only after REPLY messages have come back from all processes does the requesting process enter the critical region.
If the receiving process itself wants to enter the critical region, the timestamp of the incoming REQUEST message and the timestamp of its own request are compared to establish priority.
Requesting the critical section:
a. When a process Pi wants to enter the CS, it broadcasts a timestamped REQUEST message to all other processes.
b. When process Pj receives a REQUEST message from Pi, it sends a REPLY message to Pi if Pj is neither requesting nor executing the CS, or if Pj is requesting and Pi’s timestamp is smaller than Pj’s own request timestamp.
Otherwise, the reply is deferred and Pj sets RDj[i] = 1.
Executing the critical section:
c. Pi enters the CS after it has received a REPLY message from every process it sent a REQUEST message to.
Releasing the critical section:
d. When Pi exits the CS, it sends all the deferred REPLY messages: if RDi[j] = 1, then it sends a REPLY message to Pj and sets RDi[j] = 0.
Note:
When a site receives a message, it updates its clock using the timestamp in the message.
When a site takes up a request for the CS for processing, it updates its local clock and assigns a timestamp to the request.
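The rules above can be condensed into a short Python sketch. Message delivery is simulated with direct method calls rather than a real network, and all class, method, and variable names are illustrative rather than taken from the source:

```python
class RAProcess:
    """Minimal model of one Ricart-Agrawala participant."""
    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = peers            # the other RAProcess objects
        self.clock = 0                # Lamport clock
        self.requesting = False
        self.in_cs = False
        self.request_ts = None        # (clock, pid) of our pending request
        self.replies_pending = set()
        self.deferred = set()         # processes whose REPLY we deferred (RD_i)

    def request_cs(self):
        # Step a: broadcast a timestamped REQUEST to all other processes.
        self.clock += 1
        self.requesting = True
        self.request_ts = (self.clock, self.pid)
        self.replies_pending = {p.pid for p in self.peers}
        for p in self.peers:
            p.on_request(self.request_ts, self)

    def on_request(self, ts, sender):
        # Note: update the local clock from the message's timestamp.
        self.clock = max(self.clock, ts[0]) + 1
        # Step b: defer if we are in the CS, or if our own pending request
        # has a smaller (older) timestamp than the incoming one.
        if self.in_cs or (self.requesting and self.request_ts < ts):
            self.deferred.add(sender)         # RD_i[sender] = 1
        else:
            sender.on_reply(self.pid)

    def on_reply(self, from_pid):
        # Step c: enter the CS once every process has replied.
        self.replies_pending.discard(from_pid)
        if self.requesting and not self.replies_pending:
            self.in_cs = True

    def release_cs(self):
        # Step d: leave the CS and send all deferred REPLY messages.
        self.in_cs = False
        self.requesting = False
        for p in self.deferred:
            p.on_reply(self.pid)
        self.deferred.clear()

# Two-process example: the later request is deferred until the release.
a, b = RAProcess(1, []), RAProcess(2, [])
a.peers, b.peers = [b], [a]
a.request_cs()      # a gets b's REPLY at once and enters the CS
b.request_cs()      # b's REQUEST is deferred by a
a.release_cs()      # the deferred REPLY now lets b enter the CS
print(a.in_cs, b.in_cs)   # False True
```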
The algorithm guarantees
◦ mutual exclusion because a process can enter its critical section
only after getting permission from all other processes, and in the
case of a conflict only one of the conflicting processes can get
permission from all other processes.
◦ freedom from starvation since entry to the critical section is
scheduled according to the timestamp ordering.
• It has also been proved that the algorithm is free from deadlock.
• Furthermore, if there are n processes, the algorithm requires n − 1 REQUEST messages and n − 1 REPLY messages, giving a total of 2(n − 1) messages per critical section entry.
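For example, with n = 5 processes, Ricart-Agrawala needs 2 × (5 − 1) = 8 messages per critical section entry, compared with the 12 messages of Lamport's algorithm.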
Drawbacks:
1. In a system having n processes, the algorithm is liable to n points of failure, because if any one of the processes fails the entire scheme collapses.
The failed process will not reply to REQUEST messages, and this silence will be falsely interpreted as denial of permission by the requesting processes, causing all of them to wait indefinitely.
Example
[Figure: Maekawa's algorithm example — each process initially has Voted = False and an empty deferred-request queue (RD); Reply messages indicate granted votes]
The process states in Maekawa's algorithm are explained using the following steps.
1. The critical region request message from Pi
• Pi sends a REQUEST(timestampi, i) to all Pj in its quorum set Ri.
2. Pj on receiving the REQUEST message from Pi
• If the variable votedj = TRUE or the process Pj itself is currently executing the CR,
- then the REQUEST from Pi is deferred and pushed onto the queue RDj;
- else a REPLY is sent to Pi and the variable votedj is set to TRUE
/* termed as permission granted */
3. The execution of the CR
• If a REPLY has been received from every process in Ri, Pi enters the CR.
4. The release of the CR
• After executing the CR, process Pi sends a RELEASE message to all Pj in Ri.
5. The receipt of the RELEASE message /* Pi sends it to all Pj in Ri */
• If RDj is nonempty:
(a) dequeue the request at the head of the queue RDj
/* this serves requests that arrived while Pi was in the CR */
(b) Pj sends a REPLY message only to this dequeued process
(c) set votedj to TRUE
• If the queue RDj was empty:
(a) set votedj to FALSE
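A minimal Python sketch of these voting rules, with the requesting side (Pi) and the voting side (Pj) modelled separately; quorum construction and real message passing are omitted, and all names are illustrative:

```python
from collections import deque

class MaekawaVoter:
    """Receiving side Pj: one vote (voted_j) and a deferred-request queue RDj."""
    def __init__(self, pid):
        self.pid = pid
        self.voted = False
        self.rd = deque()            # RD_j
        # (a process that is itself in the CR would have voted for its own
        #  request, so the voted flag also covers that case in this sketch)

    def on_request(self, requester):
        # Step 2: grant the vote only if it has not already been given away.
        if self.voted:
            self.rd.append(requester)          # defer the REQUEST
        else:
            self.voted = True
            requester.on_reply(self.pid)       # REPLY = permission granted

    def on_release(self):
        # Step 5: pass the vote to the oldest deferred request, if any.
        if self.rd:
            nxt = self.rd.popleft()
            self.voted = True
            nxt.on_reply(self.pid)
        else:
            self.voted = False

class MaekawaRequester:
    """Requesting side Pi with quorum set Ri."""
    def __init__(self, pid, quorum):
        self.pid = pid
        self.quorum = quorum                   # R_i, a list of MaekawaVoter
        self.pending = set()
        self.in_cs = False

    def request_cs(self):
        # Step 1: send REQUEST to every member of the quorum.
        self.pending = {v.pid for v in self.quorum}
        for v in self.quorum:
            v.on_request(self)

    def on_reply(self, voter_pid):
        self.pending.discard(voter_pid)
        if not self.pending:
            self.in_cs = True                  # step 3: all votes collected

    def release_cs(self):
        # Step 4: send RELEASE to every member of the quorum.
        self.in_cs = False
        for v in self.quorum:
            v.on_release()

# Tiny usage example: one voter shared by two requesters.
v = MaekawaVoter("V")
p1, p2 = MaekawaRequester("P1", [v]), MaekawaRequester("P2", [v])
p1.request_cs()    # P1 obtains V's vote and enters the CR
p2.request_cs()    # P2's REQUEST is deferred in RD_V
p1.release_cs()    # V passes its vote to P2, which can now enter
print(p1.in_cs, p2.in_cs)   # False True
```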
Performance Parameters
Maekawa used the theory of projective planes and showed that N = K(K − 1) + 1. This relation gives |Ri| = K ≈ √N.
1. An execution of the CR requires √N REQUEST, √N REPLY and √N RELEASE messages, thus requiring a total of 3√N messages per CR execution.
2. The synchronization delay is 2T (where T is the average message delay).
3. M = K = √N works best (M is the number of quorum sets each process belongs to).
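For example, K = 3 gives N = 3 × 2 + 1 = 7 processes with quorums of size 3 (≈ √7), so one CR execution costs 3 × 3 = 9 messages, in line with the 3√N estimate.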