Distributed Synchronization

Uploaded by brekhna khan

Distributed Systems

Synchronization
Chapter 6

Why Synchronize?
• Often important to control access to a single,
shared resource.
• Also often important to agree on the ordering of
events.
• Synchronization in Distributed Systems is much
more difficult than in uniprocessor systems.
• We will study:
1. Synchronization based on “Actual Time”.
2. Synchronization based on “Relative Time”.
3. Synchronization based on Co-ordination (with Election Algorithms).
4. Distributed Mutual Exclusion.
5. Distributed Transactions.
Clock Synchronization
• Synchronization based on “Actual Time”.
• Note: time is really easy on a uniprocessor system.
• Achieving agreement on time in a DS is not trivial.
• Question: is it even possible to synchronize all the
clocks in a Distributed System?
• With multiple computers, “clock skew” ensures that
no two machines have the same value for the “current
time”. But, how do we measure time?

How Do We Measure Time?
• Turns out that we have only been
measuring time accurately with a “global”
atomic clock since Jan. 1st, 1958 (the
“beginning of time”).
• Bottom Line: measuring time is not as
easy as one might think it should be.
• Algorithms based on the current time
(from some Physical Clock) have been
devised for use within a DS.
Clock Synchronization
Cristian's Algorithm

• Getting the current time from a “time server”, using periodic client
requests.
• Major problem if the time reported by the time server is earlier than the
client’s current time – naively adopting it would result in time running
backwards on the client! (Which cannot be allowed – time does not go
backwards.)
• Minor problem results from the delay introduced by the network
request/response: latency.
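The latency can be partially compensated for. A minimal sketch in Python (the `get_server_time` callable stands in for the real network request and is not part of the slides): the client measures the round-trip time and assumes the reply spent roughly half of it in transit.

```python
import time

def cristian_sync(get_server_time, local_clock=time.time):
    """One round of Cristian's algorithm: estimate the current server
    time, compensating for network latency with half the round trip."""
    t0 = local_clock()                # local time when the request is sent
    server_time = get_server_time()   # the time server's clock reading
    t1 = local_clock()                # local time when the reply arrives
    round_trip = t1 - t0
    # Assume the reply spent roughly half the round trip in transit.
    return server_time + round_trip / 2

# Example with a simulated server whose clock runs 5 seconds ahead:
estimate = cristian_sync(lambda: time.time() + 5.0)
```

A client whose clock turns out to be ahead must not jump backwards; in practice the clock is slowed down gradually until the times agree.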
The Berkeley Algorithm (1)

(a) The time daemon asks all the other machines
for their clock values.
The Berkeley Algorithm (2)

(b) The machines answer.


The Berkeley Algorithm (3)

(c) The time daemon tells everyone
how to adjust their clock.
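Steps (a)–(c) above can be sketched as a single polling round (the helper name is illustrative, not from the slides): the daemon averages all clock values, its own included, and sends each machine the adjustment it should apply rather than an absolute time.

```python
def berkeley_round(daemon_clock, other_clocks):
    """One round of the Berkeley algorithm: poll, average, and send
    back per-machine adjustments (which may be negative)."""
    all_clocks = [daemon_clock] + list(other_clocks)
    average = sum(all_clocks) / len(all_clocks)
    # Each machine receives the delta it should apply to its own clock.
    return average, [average - c for c in all_clocks]

# Daemon reads 3:00, the others 2:50 and 3:25 (in minutes past 0:00):
average, adjustments = berkeley_round(180, [170, 205])
# average is 185 (3:05); adjustments are +5, +15 and -20 minutes
```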
Network Time Protocol

Getting the current time from a time server.


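The slide's figure is not reproduced here, but NTP's core calculation is standard: from one request/response exchange with four timestamps (T1 client send, T2 server receive, T3 server send, T4 client receive) the client estimates its clock offset and the round-trip delay, assuming the network delay is symmetric. A sketch:

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one NTP exchange.
    t1/t4 are read from the client's clock, t2/t3 from the server's."""
    offset = ((t2 - t1) + (t3 - t4)) / 2    # how far the client is behind
    delay = (t4 - t1) - (t3 - t2)           # time spent on the network
    return offset, delay

# Client clock 10 s behind the server, about 20 ms each way on the wire:
offset, delay = ntp_offset_delay(t1=100.000, t2=110.020,
                                 t3=110.021, t4=100.041)
```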
Logical Clocks
• Synchronization based on “relative time”.
• Note that (with this mechanism) there is no
requirement for “relative time” to have any
relation to the “real time”.
• What’s important is that the processes in the
Distributed System agree on the ordering in
which certain events occur.
• Such “clocks” are referred to as Logical Clocks.

Lamport’s Logical Clocks
• First point: if two processes do not interact, then their
clocks do not need to be synchronized – they can
operate concurrently without fear of interfering with
each other.
• Second (critical) point: it does not matter that two
processes share a common notion of what the “real”
current time is. What does matter is that the processes
have some agreement on the order in which certain
events occur.
• Lamport used these two observations to define the
“happens-before” relation (also often referred to within
the context of Lamport’s Timestamps).
Lamport logical clock timestamp
• Let timestamp(a) be the Lamport logical clock
timestamp of event a. Then:

a → b  =>  timestamp(a) < timestamp(b)
(if a happens before b, then timestamp(a) < timestamp(b))

timestamp(a) < timestamp(b)  does NOT imply  a → b
(if timestamp(a) < timestamp(b), it does NOT follow
that a happens before b – the events may be concurrent)
The “Happens-Before” Relation (4)
• The question to ask is:
– How can some event that “happens-before” some other
event possibly have occurred at a later time??
• The answer is: it can’t!
• So, Lamport’s solution is to have the receiving process
adjust its clock forward to one more than the sending
timestamp value. This allows the “happens-before”
relation to hold, and also keeps all the clocks running
in a synchronized state. The clocks are all kept in sync
relative to each other.

Lamport’s Logical Clocks (1)

• The "happens-before" relation → can be
observed directly in two situations:
1. If a and b are events in the same process,
and a occurs before b, then a → b is true.
2. If a is the event of a message being sent by
one process, and b is the event of the
message being received by another process,
then a → b.
Lamport’s Logical Clocks (2)

(a) Three processes, each with its own clock.
The clocks run at different rates.
Lamport’s Logical Clocks (3)

(b) Lamport’s algorithm corrects the clocks.


Lamport’s Logical Clocks (4)
• Updating counter Ci for process Pi :
1. Before executing an event Pi executes
Ci ← Ci + 1.
2. When process Pi sends a message m to Pj,
it sets m’s timestamp ts(m) equal to Ci
after having executed the previous step.
3. Upon the receipt of a message m, process Pj
adjusts its own local counter as
Cj ← max{Cj , ts(m)}, after which it then
executes the first step and delivers the message to
the application.
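The three update rules translate directly into code; a minimal sketch (class and method names are illustrative, not from the slides):

```python
class LamportClock:
    """Lamport's logical clock implementing the three rules above."""
    def __init__(self):
        self.counter = 0

    def tick(self):
        # Rule 1: increment the counter before executing an event.
        self.counter += 1
        return self.counter

    def send(self):
        # Rule 2: the message timestamp ts(m) is the counter after the tick.
        return self.tick()

    def receive(self, ts):
        # Rule 3: adjust to max{local, ts(m)}, then tick and deliver.
        self.counter = max(self.counter, ts)
        return self.tick()

pi, pj = LamportClock(), LamportClock()
ts_m = pi.send()                  # Pi's clock becomes 1; ts(m) = 1
pj.counter = 5                    # Pj is further ahead
delivery_ts = pj.receive(ts_m)    # max(5, 1) + 1 = 6
```

Note how `receive` can never move a clock backwards, which is exactly what keeps "happens-before" consistent with the timestamps.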
Lamport’s Logical Clocks (5)

The positioning of Lamport’s logical clocks in distributed systems.
Problem: Totally-Ordered Multicasting

Updating a replicated database and leaving it in an inconsistent
state: Update 1 adds 100 euro to an account, Update 2 calculates
and adds 1% interest to the same account. Due to network delays,
the updates may not happen in the correct order. Whoops!
Solution: Totally-Ordered Multicasting
• A multicast message is sent to all processes in the
group, including the sender, together with the
sender’s timestamp.
• At each process, the received message is added to
a local queue, ordered by timestamp.
• Upon receipt of a message, a multicast
acknowledgement/timestamp is sent to the group.
• Due to the “happens-before” relationship
holding, the timestamp of the acknowledgement
is always greater than that of the original
message.
More Totally Ordered Multicasting
• Only when a message is marked as acknowledged by
all the other processes will it be removed from the
queue and delivered to a waiting application.
• Lamport’s clocks ensure that each message has a
unique timestamp, and consequently, the local queue
at each process eventually contains the same contents.
• In this way, all messages are delivered/processed in
the same order everywhere, and updates can occur in a
consistent manner.

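The queueing and delivery rules above can be sketched for a single process. Names are illustrative, and the (timestamp, sender) pairs are an assumption: Lamport timestamps alone are only unique once a process ID breaks ties.

```python
import heapq

class TotallyOrderedQueue:
    """Local queue of one process in totally-ordered multicasting:
    a message is delivered only when it is at the head of the
    timestamp-ordered queue and acknowledged by the whole group."""
    def __init__(self, group_size):
        self.group_size = group_size
        self.queue = []        # heap of (timestamp, sender, message)
        self.acks = {}         # (timestamp, sender) -> set of ackers

    def on_multicast(self, ts, sender, message):
        heapq.heappush(self.queue, (ts, sender, message))
        self.acks.setdefault((ts, sender), set())

    def on_ack(self, ts, sender, acker):
        self.acks.setdefault((ts, sender), set()).add(acker)

    def deliverable(self):
        """Pop and return every head message already acked by everyone."""
        delivered = []
        while self.queue:
            ts, sender, message = self.queue[0]
            if len(self.acks[(ts, sender)]) < self.group_size:
                break                  # head not fully acked yet: wait
            heapq.heappop(self.queue)
            delivered.append(message)
        return delivered
```

Even if Update 2 is fully acknowledged first, it stays queued behind Update 1 until Update 1 has been acknowledged by everyone, so every replica delivers in the same order.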
Totally-Ordered Multicasting, Revisited

• Update 1 is time-stamped and multicast. Added to local queues.


• Update 2 is time-stamped and multicast. Added to local queues.
• Acknowledgements for Update 2 sent/received. Update 2 can now be
processed.
• Acknowledgements for Update 1 sent/received. Update 1 can now be
processed.
• (Note: all queues are the same, as the timestamps have been used
to ensure the “happens-before” relation holds.)
Vector Clocks (1)

Concurrent message transmission using logical clocks.


Vector Clocks (2)
• Vector clocks are constructed by letting each
process Pi maintain a vector VCi with the
following two properties:
1. VCi [ i ] is the number of events that have
occurred so far at Pi. In other words, VCi [ i ]
is the local logical clock at process Pi .
2. If VCi [ j ] = k then Pi knows that k events
have occurred at Pj. It is thus Pi’s knowledge
of the local time at Pj.
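The two properties translate into a small data structure; a sketch with illustrative names (the `happened_before` check is the standard component-wise comparison, not spelled out on this slide):

```python
class VectorClock:
    """Vector clock VCi maintained by process Pi in a group of n."""
    def __init__(self, i, n):
        self.i = i
        self.vc = [0] * n

    def event(self):
        self.vc[self.i] += 1        # property 1: count local events

    def send(self):
        self.event()
        return list(self.vc)        # the message carries a copy of VCi

    def receive(self, ts):
        # Property 2: merge the sender's knowledge (component-wise max),
        # then count the receive as a local event.
        self.vc = [max(a, b) for a, b in zip(self.vc, ts)]
        self.event()

def happened_before(a, b):
    """a -> b iff a <= b component-wise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b

p1, p2 = VectorClock(0, 2), VectorClock(1, 2)
m = p1.send()      # p1's clock: [1, 0]
p2.receive(m)      # p2's clock: [1, 1]
```

Unlike plain Lamport timestamps, comparing two vector timestamps also detects concurrency: if neither is component-wise less than the other, the events are concurrent.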
Vector Timestamps

[Figure: three processes on a physical-time axis, each event
labelled with its vector timestamp.
p1: a (1,0,0), b (2,0,0) – after b, message m1 is sent to p2
p2: c (2,1,0) receives m1, d (2,2,0) – after d, message m2 is sent to p3
p3: e (0,0,1), f (2,2,2) receives m2]
Example: Vector Logical Time
Physical Time

[Figure: four processes p1–p4, each starting at 0,0,0,0, exchange
messages; every event is labelled with its vector logical clock
n,m,p,q (vector timestamp), and every message carries the sender’s
vector timestamp.]
Mutual Exclusion within Distributed Systems

• It is often necessary to protect a shared resource
within a Distributed System using “mutual
exclusion” – for example, it might be necessary to
ensure that no other process changes a shared
resource while another process is working with it.
• In non-distributed, uniprocessor systems, we can
implement “critical regions” using techniques such
as semaphores, monitors and similar constructs –
thus achieving mutual exclusion.
• These techniques have been adapted to Distributed
Systems …
DS Mutual Exclusion: Techniques

• Centralized: a single coordinator controls
whether a process can enter a critical
region.
• Distributed: the group confers to
determine whether or not it is safe for a
process to enter a critical region.

Mutual Exclusion
A Centralized Algorithm (1)

(a) Process 1 asks the coordinator for permission
to access a shared resource. Permission is
granted.
Mutual Exclusion
A Centralized Algorithm (2)

(b) Process 2 then asks permission to access the
same resource. The coordinator does not
reply.
Mutual Exclusion
A Centralized Algorithm (3)

(c) When process 1 releases the resource, it tells
the coordinator, which then replies to 2.
Comments: The Centralized Algorithm
• Advantages:
– It works.
– It is fair.
– There’s no process starvation.
– Easy to implement.
• Disadvantages:
– There’s a single point of failure!
– The coordinator is a bottleneck on busy systems.
• Critical Question: When there is no reply, does this
mean that the coordinator is “dead” or just busy?
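The coordinator's behaviour in parts (a)–(c) fits in a few lines; a sketch with illustrative names:

```python
from collections import deque

class Coordinator:
    """Centralized mutual exclusion: grant the resource to one process
    at a time; queue the rest and reply to them only on release."""
    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANTED"
        self.waiting.append(pid)   # no reply is sent: the requester blocks
        return None

    def release(self, pid):
        assert pid == self.holder
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder         # the next process to be granted, if any

coord = Coordinator()
r1 = coord.request(1)    # (a) process 1 asks: permission granted
r2 = coord.request(2)    # (b) process 2 asks: coordinator does not reply
nxt = coord.release(1)   # (c) process 1 releases: coordinator replies to 2
```

The "no reply" in `request` is exactly the ambiguity noted above: the blocked requester cannot tell a busy coordinator from a dead one.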
Distributed Mutual Exclusion
• Based on work by Ricart and Agrawala
(1981).
• Requirement of their solution: total ordering
of all events in the distributed system (which
is achievable with Lamport’s timestamps).
• Note that messages in their system contain
three pieces of information:
1. The critical region ID.
2. The requesting process ID.
3. The current time.
Skeleton State Diagram for a Process

Mutual Exclusion: Distributed Algorithm
1. When a process (the “requesting process”) decides to enter a critical
region, a message is sent to all processes in the Distributed System
(including itself).
2. What happens at each process depends on the “state” of the critical region.
3. If not in the critical region (and not waiting to enter it), a process sends
back an OK to the requesting process.
4. If in the critical region, a process will queue the request and will not send
a reply to the requesting process.
5. If waiting to enter the critical region, a process will:
a) Compare the timestamp of the new message with that in its queue
(note that the lowest timestamp wins).
b) If the received timestamp wins, an OK is sent back, otherwise the
request is queued (and no reply is sent back).
6. When all the processes send OK, the requesting process can safely enter
the critical region.
7. When the requesting process leaves the critical region, it sends an OK to
all the processes in its queue, then empties its queue.
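Steps 3–5 (the receiver's three cases) can be sketched as one function. The state names and the (time, process ID) timestamp pairs are illustrative assumptions; the pairs are the usual way of making Lamport timestamps totally ordered.

```python
RELEASED, WANTED, HELD = "released", "wanted", "held"

def on_request(state, own_request_ts, incoming_ts, deferred):
    """Handle an incoming request at one process.
    Returns True if an OK reply should be sent now; otherwise the
    request is queued in `deferred` and answered on exit."""
    if state == RELEASED:
        return True                     # case 3: not using, not waiting
    if state == HELD:
        deferred.append(incoming_ts)    # case 4: using it, defer the reply
        return False
    # state == WANTED (case 5): the lowest timestamp wins the conflict.
    if incoming_ts < own_request_ts:
        return True
    deferred.append(incoming_ts)
    return False

# Both processes want in; timestamps (8, pid 0) and (12, pid 1):
q = []
ok_to_0 = on_request(WANTED, (12, 1), (8, 0), q)   # True: 8 beats 12
ok_to_1 = on_request(WANTED, (8, 0), (12, 1), q)   # False: deferred
```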
Distributed Algorithm (1)
• Three different cases:
1. If the receiver is not accessing the resource and does
not want to access it, it sends back an OK message
to the sender.
2. If the receiver already has access to the resource,
it simply does not reply. Instead, it queues the
request.
3. If the receiver wants to access the resource as well
but has not yet done so, it compares the timestamp
of the incoming message with the one contained in
the message that it has sent everyone. The lowest
one wins.
Distributed Algorithm (2)

(a) Two processes want to access a
shared resource at the same moment.
Distributed Algorithm (3)

(b) Process 0 has the lowest timestamp, so it wins.


Distributed Algorithm (4)

(c) When process 0 is done, it sends an OK also,
so 2 can now go ahead.
Comments: The Distributed Algorithm
• The algorithm works because in the case of a conflict, the
lowest timestamp wins as everyone agrees on the total ordering
of the events in the distributed system.
• Advantages:
– It works.
– There is no single point of failure.
• Disadvantages:
– We now have multiple points of failure!!!
– A “crash” is interpreted as a denial of entry to a critical region.
– (A patch to the algorithm requires all messages to be ACKed).
– Worse is that all processes must maintain a list of the current processes
in the group (and this can be tricky)
– Worse still is that one overworked process in the system can become a
bottleneck to the entire system – so, everyone slows down.
… Which Just Goes To Show
• That it isn’t always best to implement a distributed
algorithm when a reasonably good centralized
solution exists.
• Also, what’s good in theory (or on paper) may not be
so good in practice.
• Finally, think of all the message traffic this distributed
algorithm is generating (especially with all those
ACKs). Remember: every process is involved in the
decision to enter the critical region, whether they have
an interest in it or not (Oh dear … ).
A Token Ring Algorithm

(a) An unordered group of processes on a
network.
(b) A logical ring constructed in software.
Comments: Token-Ring Algorithm
• Advantages:
– It works (as there’s only one token, so mutual
exclusion is guaranteed).
– It’s fair – everyone gets a shot at grabbing the token
at some stage.
• Disadvantages:
– Lost token! How is the loss detected (it is in use or
is it lost)? How is the token regenerated?
– Process failure can cause problems – a broken ring!
– Every process is required to maintain the current
logical ring in memory – not easy.
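One circulation of the token can be sketched in-process (names illustrative; a real implementation would pass the token over the network and handle the loss and failure cases listed above):

```python
class TokenRingNode:
    """A node in the token-ring algorithm: it may enter its critical
    region only while holding the token, then passes the token on."""
    def __init__(self, pid, ring_size):
        self.pid = pid
        self.ring_size = ring_size
        self.wants_cs = False

    def on_token(self, critical_section):
        if self.wants_cs:
            critical_section(self.pid)   # safe: only the token holder runs
            self.wants_cs = False
        return (self.pid + 1) % self.ring_size   # next node in the ring

# Simulate one full circulation of the token around three nodes:
nodes = [TokenRingNode(i, 3) for i in range(3)]
nodes[1].wants_cs = True
entered, holder = [], 0
for _ in range(3):
    holder = nodes[holder].on_token(entered.append)
# only node 1 entered its critical region; the token is back at node 0
```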
Comparison: Mutual Exclusion Algorithms
Algorithm     Messages per entry/exit   Delay before entry (in message times)   Problems
Centralized   3                         2                                       Coordinator crash
Distributed   2(n–1)                    2(n–1)                                  Crash of any process
Token-Ring    1 to ∞                    0 to n–1                                Lost token, process crash
• None are perfect – they all have their problems!
• The “Centralized” algorithm is simple and efficient, but suffers from a single
point-of-failure.
• The “Distributed” algorithm has nothing going for it – it is slow,
complicated, inefficient of network bandwidth, and not very robust.
It “sucks”!
• The “Token-Ring” algorithm suffers from the fact that it can sometimes take
a long time to reenter a critical region having just exited it.
