
Distributed Systems

(4th edition, version 01)

Chapter 05: Coordination


Coordination Clock synchronization

Physical clocks
Problem
Sometimes we simply need the exact time, not just an ordering.

Solution: Coordinated Universal Time (UTC)

Note
UTC is broadcast through short-wave radio and satellite. Satellites can give an
accuracy of about ±0.5 ms.


Clock synchronization
Precision
The goal is to keep the deviation between two clocks on any two machines
within a specified bound, known as the precision π:

∀t, ∀p, q : |C_p(t) − C_q(t)| ≤ π

with C_p(t) the computed clock time of machine p at UTC time t.

Accuracy
In the case of accuracy, we aim to keep each clock within a bound α of UTC:

∀t, ∀p : |C_p(t) − t| ≤ α

Synchronization
• Internal synchronization: keep clocks precise
• External synchronization: keep clocks accurate


Clock drift
Clock specifications
• A clock comes specified with its maximum clock drift rate ρ.
• F(t) denotes the oscillator frequency of the hardware clock at time t.
• F is the clock’s ideal (constant) frequency ⇒ living up to specifications:

∀t : (1 − ρ) ≤ F(t)/F ≤ (1 + ρ)

Observation
By using hardware interrupts we couple a software clock to the hardware clock,
and thus also its clock drift rate:

∀t : (1 − ρ) ≤ dC_p(t)/dt ≤ (1 + ρ)

[Figure: fast, perfect, and slow clocks]
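
To make the drift bound concrete: two clocks that both respect ρ can drift apart at a rate of up to 2ρ, so keeping them within precision π requires resynchronizing at least every π/(2ρ) seconds. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the slides.

```python
def max_resync_interval(precision, rho):
    """Longest time (in seconds) two clocks may run without resynchronizing.

    precision: allowed deviation between the two clocks (pi), in seconds
    rho:       maximum clock drift rate of each clock (dimensionless)
    """
    if precision <= 0 or rho <= 0:
        raise ValueError("precision and rho must be positive")
    # Worst case: one clock runs fast at (1 + rho), the other slow at (1 - rho),
    # so their difference grows by 2 * rho per second of real time.
    return precision / (2 * rho)


if __name__ == "__main__":
    # Example: 1 ms precision with a drift rate of 10^-5.
    print(max_resync_interval(0.001, 1e-5))
```

With π = 1 ms and ρ = 10^-5, the clocks need to resynchronize roughly every 50 seconds.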


Detecting and adjusting incorrect times


Getting the current time from a timeserver
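
The figure for this slide is not preserved. As a hedged sketch, assuming the four-timestamp exchange used by NTP-style protocols: the client records its send time, the server records its receive and send times, and the client records its receive time. The names t1 to t4 and the function name are labels introduced here, and the estimate assumes the delay is roughly the same in both directions.

```python
def estimate_offset_and_delay(t1, t2, t3, t4):
    """Estimate the server's offset relative to the client, and the one-way delay.

    t1: request sent     (client clock)
    t2: request received (server clock)
    t3: reply sent       (server clock)
    t4: reply received   (client clock)
    """
    # Averaging the two apparent offsets cancels the delay if it is symmetric.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    # Round-trip time minus server processing time, split over both directions.
    delay = ((t4 - t1) - (t3 - t2)) / 2
    return offset, delay


if __name__ == "__main__":
    offset, delay = estimate_offset_and_delay(100, 160, 165, 115)
    print(offset, delay)
```

A positive offset means the client's clock is behind the server's. Because clocks should not run backward, a client that is ahead would normally slow its clock down gradually rather than set it back.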


Reference broadcast synchronization


Essence
• A node broadcasts a reference message m ⇒ each receiving node p
records the time T_{p,m} that it received m.
• Note: T_{p,m} is read from p’s local clock.

Problem: averaging will not capture drift ⇒ use linear regression

NO: Offset[p, q](t) = (1/M) ∑_{k=1}^{M} (T_{p,k} − T_{q,k})
YES: Offset[p, q](t) = αt + β
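
A minimal sketch of the regression step, assuming ordinary least squares over the observed differences T_{p,k} − T_{q,k}. The function name and the sample data are illustrative only.

```python
def fit_offset(times, offsets):
    """Least-squares fit of offsets ~ alpha * t + beta.

    times:   time at which each reference message was observed
    offsets: T_{p,k} - T_{q,k} for each reference message k
    Returns (alpha, beta): alpha captures relative drift, beta the base offset.
    """
    if len(times) != len(offsets) or len(times) < 2:
        raise ValueError("need at least two (time, offset) pairs")
    n = len(times)
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("times must not all be equal")
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    alpha = cov / var_t
    beta = mean_o - alpha * mean_t
    return alpha, beta


if __name__ == "__main__":
    # Offsets that grow by 0.1 every 10 time units: a plain average would hide this.
    alpha, beta = fit_offset([0, 10, 20, 30], [5.0, 5.1, 5.2, 5.3])
    print(alpha, beta)
```

The plain average of these offsets is 5.15, which is already wrong at t = 30 and gets worse over time; the fitted line keeps tracking the drift.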

Coordination Logical clocks

The Happened-before relationship


Issue
What usually matters is not that all processes agree on exactly what time it is,
but that they agree on the order in which events occur. This requires a notion of
ordering.

The happened-before relation


• If a and b are two events in the same process, and a comes before b,
then a → b.
• If a is the sending of a message, and b is the receipt of that message,
then a → b.
• If a → b and b → c, then a → c

Note
This introduces a partial ordering of events in a system with concurrently
operating processes.


Logical clocks
Problem
How do we maintain a global view of the system’s behavior that is consistent
with the happened-before relation?

Attach a timestamp C(e) to each event e, satisfying the following properties:
P1 If a and b are two events in the same process, and a → b, then we
demand that C(a) < C(b).
P2 If a corresponds to sending a message m, and b to the receipt of that
message, then also C(a) < C(b).

Problem
How do we attach a timestamp to an event when there is no global clock? ⇒
Maintain a consistent set of logical clocks, one per process.


Logical clocks: solution


Each process P_i maintains a local counter C_i and adjusts this counter as follows:
1. For each new event that takes place within P_i, C_i is incremented by 1.
2. Each time a message m is sent by process P_i, the message receives a
timestamp ts(m) = C_i (after step 1 has been executed, since sending is itself an event).
3. Whenever a message m is received by a process P_j, P_j adjusts its local
counter C_j to max{C_j, ts(m)}; then executes step 1 before passing m to
the application.

Notes
• Property P1 is satisfied by (1); Property P2 by (2) and (3).
• It can still occur that two events receive the same timestamp. Avoid this by
breaking ties through process IDs.
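
The three rules above can be sketched as a small class. This is a minimal illustration, not the middleware implementation; the class and method names are assumptions introduced here.

```python
class LamportClock:
    """Logical clock for one process, following the three update rules."""

    def __init__(self, proc_id):
        self.proc_id = proc_id
        self.counter = 0

    def local_event(self):
        # Rule 1: every new event increments the counter by 1.
        self.counter += 1
        return self.counter

    def send(self):
        # Rule 2: sending is an event; the message carries the new counter value.
        return self.local_event()

    def receive(self, ts):
        # Rule 3: take the maximum of the local counter and ts(m), then apply rule 1.
        self.counter = max(self.counter, ts)
        return self.local_event()

    def stamp(self):
        # Ties between processes are broken by process ID.
        return (self.counter, self.proc_id)


if __name__ == "__main__":
    p1, p2 = LamportClock(1), LamportClock(2)
    for _ in range(5):
        p2.local_event()          # p2 runs ahead of p1
    ts = p1.send()                # message stamped by p1
    print(ts, p2.receive(ts))     # receipt is stamped later than the send
```

Comparing the `(counter, proc_id)` pairs lexicographically gives a total order that is consistent with the happened-before relation.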


Logical clocks: example


Consider three processes with event counters operating at different
rates


Logical clocks: where implemented


Adjustments implemented in middleware


Example: Totally ordered multicast


Concurrent updates on a replicated database are seen in the same
order everywhere
• P1 adds $100 to an account (initial value: $1000)
• P2 increments account by 1%
• There are two replicas

Result
In the absence of proper synchronization:
replica #1 ← $1111, while replica #2 ← $1110.
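
The two results come from applying the same updates in different orders. A short sketch of the arithmetic, using whole dollars and hypothetical function names:

```python
def add_100(balance):
    """Update from P1: deposit $100."""
    return balance + 100


def add_interest(balance):
    """Update from P2: increase the balance by 1% (integer arithmetic)."""
    return balance * 101 // 100


def apply_in_order(balance, updates):
    """Apply the updates one after another and return the final balance."""
    for update in updates:
        balance = update(balance)
    return balance


# Replica #1 sees the deposit first, replica #2 sees the interest first.
replica_1 = apply_in_order(1000, [add_100, add_interest])
replica_2 = apply_in_order(1000, [add_interest, add_100])

if __name__ == "__main__":
    print(replica_1, replica_2)
```

Because the two updates do not commute, the replicas only stay consistent if both apply them in the same order.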

Solution
• Process P_i sends a timestamped message m_i to all others. The message
itself is put in a local queue queue_i.
• Any incoming message at P_j is queued in queue_j, according to its
timestamp, and acknowledged to every other process.

P_j passes a message m_i to its application if:
(1) m_i is at the head of queue_j
(2) for each process P_k, there is a message m_k in queue_j with a larger
timestamp.

Note
We are assuming that communication is reliable and FIFO ordered.
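
A sketch of the delivery test for totally ordered multicast, under stated assumptions: queue entries are hypothetical `(timestamp, sender, payload)` tuples, acknowledgments are queued like any other message, ties are broken by sender ID, and the sender of the head message is not required to have a later message of its own.

```python
def can_deliver(queue, process_ids):
    """Return True if the head of the queue may be passed to the application.

    queue:       list of (timestamp, sender, payload) tuples held by this process
    process_ids: IDs of all processes in the group
    """
    if not queue:
        return False
    # Condition (1): the candidate is the message with the smallest timestamp.
    head_ts, head_sender, _ = min(queue, key=lambda m: (m[0], m[1]))
    # Condition (2): every other process has a queued message with a larger timestamp.
    later = {sender for ts, sender, _ in queue if (ts, sender) > (head_ts, head_sender)}
    return all(pid in later for pid in process_ids if pid != head_sender)


if __name__ == "__main__":
    q = [(3, 1, "update"), (5, 2, "ack")]
    print(can_deliver(q, [1, 2, 3]))   # P3 has not been heard from yet
    q.append((6, 3, "ack"))
    print(can_deliver(q, [1, 2, 3]))
```

With reliable FIFO channels, a later message from P_k guarantees that nothing with a smaller timestamp can still arrive from P_k.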


Lamport’s clocks for mutual exclusion


class Process:
    # Note: self.chan, self.otherProcs, cleanupQ(), and the constants ENTER, ACK,
    # and RELEASE are set up in code that is not shown on this slide.
    def __init__(self, chanID, procID, procIDSet):
        self.chan.join(procID)
        self.procID = int(procID)
        self.otherProcs.remove(self.procID)
        self.queue = []  # The request queue
        self.clock = 0   # The current logical clock

    def requestToEnter(self):
        self.clock = self.clock + 1                                          # Increment clock value
        self.queue.append((self.clock, self.procID, ENTER))                  # Append request to q
        self.cleanupQ()                                                      # Sort the queue
        self.chan.sendTo(self.otherProcs, (self.clock, self.procID, ENTER))  # Send request

    def ackToEnter(self, requester):
        self.clock = self.clock + 1                                          # Increment clock value
        self.chan.sendTo(requester, (self.clock, self.procID, ACK))          # Permit other

    def release(self):
        tmp = [r for r in self.queue[1:] if r[2] == ENTER]                   # Remove all ACKs
        self.queue = tmp                                                     # and copy to new queue
        self.clock = self.clock + 1                                          # Increment clock value
        self.chan.sendTo(self.otherProcs, (self.clock, self.procID, RELEASE))  # Release

    def allowedToEnter(self):
        commProcs = set([req[1] for req in self.queue[1:]])                  # See who has sent a message
        return (self.queue[0][1] == self.procID and len(self.otherProcs) == len(commProcs))

    def receive(self):
        msg = self.chan.recvFrom(self.otherProcs)[1]  # Pick up any message
        self.clock = max(self.clock, msg[0])          # Adjust clock value...
        self.clock = self.clock + 1                   # ...and increment
        if msg[2] == ENTER:
            self.queue.append(msg)                    # Append an ENTER request
            self.ackToEnter(msg[1])                   # and unconditionally allow
        elif msg[2] == ACK:
            self.queue.append(msg)                    # Append a received ACK
        elif msg[2] == RELEASE:
            del(self.queue[0])                        # Just remove first message
        self.cleanupQ()                               # And sort and cleanup

Analogy with totally ordered multicast
• With totally ordered multicast, all processes build identical queues,
delivering messages in the same order
• Mutual exclusion is about agreeing in which order processes are allowed
to enter a critical region


Vector clocks
Observation
Lamport’s clocks do not guarantee that if C(a) < C(b), then a causally
preceded b.

[Figure: concurrent message transmission using logical clocks]

Observation
Event a: m1 is received at T = 16;
Event b: m2 is sent at T = 20.

Note
We cannot conclude that a causally
precedes b.


Causal dependency
Definition
We say that b may causally depend on a if ts(a) < ts(b), with:
• for all k, ts(a)[k] ≤ ts(b)[k] and
• there exists at least one index k′ for which ts(a)[k′] < ts(b)[k′]

Precedence vs. dependency


• We say that a causally precedes b.
• b may causally depend on a, as there may be information from a that is
propagated into b.
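
The definition translates directly into a component-wise comparison. A minimal sketch, with timestamps as equal-length tuples and helper names that are introduced here for illustration:

```python
def vc_less(a, b):
    """True if ts(a) < ts(b): every entry is <=, and at least one is strictly <."""
    if len(a) != len(b):
        raise ValueError("timestamps must have the same length")
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def concurrent(a, b):
    """True if neither timestamp is smaller than the other (and they differ)."""
    return a != b and not vc_less(a, b) and not vc_less(b, a)


if __name__ == "__main__":
    print(vc_less((2, 1, 0), (4, 3, 0)))      # b may causally depend on a
    print(concurrent((4, 1, 0), (2, 3, 0)))   # neither precedes the other
```

Unlike Lamport timestamps, two vector timestamps can be incomparable, which is exactly what signals that the events may be concurrent.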


Capturing potential causality


Solution: each P_i maintains a vector VC_i
• VC_i[i] is the local logical clock at process P_i.
• If VC_i[j] = k then P_i knows that k events have occurred at P_j.

Maintaining vector clocks


1. Before executing an event, P_i executes VC_i[i] ← VC_i[i] + 1.
2. When process P_i sends a message m to P_j, it sets m’s (vector)
timestamp ts(m) equal to VC_i after having executed step 1.
3. Upon the receipt of a message m, process P_j sets
VC_j[k] ← max{VC_j[k], ts(m)[k]} for each k, after which it executes step 1
and then delivers the message to the application.
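
A minimal sketch of the three maintenance steps, assuming processes are numbered from 0 and the group size is known in advance. The class and method names are illustrative.

```python
class VectorClock:
    """Vector clock for process `index` in a group of `n` processes."""

    def __init__(self, index, n):
        self.index = index
        self.vc = [0] * n

    def tick(self):
        # Step 1: increment the local entry before executing an event.
        self.vc[self.index] += 1

    def send(self):
        # Step 2: the timestamp is the vector after executing step 1.
        self.tick()
        return tuple(self.vc)

    def receive(self, ts):
        # Step 3: entry-wise maximum, then step 1, then deliver.
        self.vc = [max(own, other) for own, other in zip(self.vc, ts)]
        self.tick()
        return tuple(self.vc)


if __name__ == "__main__":
    p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
    m1 = p0.send()
    p1.receive(m1)
    m2 = p1.send()
    print(m1, m2, p0.receive(m2))
```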


Vector clocks: Example


Capturing potential causality when exchanging messages

[Figure: panels (a) and (b)]

Analysis

Situation   ts(m2)      ts(m4)      ts(m2) < ts(m4)   ts(m2) > ts(m4)   Conclusion
(a)         (2, 1, 0)   (4, 3, 0)   Yes               No                m2 may causally precede m4
(b)         (4, 1, 0)   (2, 3, 0)   No                No                m2 and m4 may conflict


Causally ordered multicasting


Observation
We can now ensure that a message is delivered only if all causally preceding
messages have already been delivered.

Adjustment
P_i increments VC_i[i] only when sending a message, and P_j “adjusts” VC_j
when receiving a message (i.e., effectively does not change VC_j[j]).

P_j postpones delivery of a message m sent by P_i until:
1. ts(m)[i] = VC_j[i] + 1
2. ts(m)[k] ≤ VC_j[k] for all k ≠ i
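
The two conditions can be checked entry by entry. A minimal sketch, assuming processes are numbered from 0 and vectors are plain sequences; the function name is hypothetical.

```python
def can_deliver(ts, sender, vc):
    """Return True if a message from `sender` with timestamp `ts` may be delivered.

    ts:     vector timestamp carried by the message
    sender: index i of the sending process
    vc:     the receiver's current vector clock VC_j
    """
    for k, (msg_k, local_k) in enumerate(zip(ts, vc)):
        if k == sender:
            # Condition 1: this is the next message expected from the sender.
            if msg_k != local_k + 1:
                return False
        elif msg_k > local_k:
            # Condition 2: the sender had seen a message we have not delivered yet.
            return False
    return True


if __name__ == "__main__":
    vc_j = [1, 0, 0]
    print(can_deliver((2, 0, 0), 0, vc_j))   # next message from process 0
    print(can_deliver((2, 1, 0), 1, vc_j))   # depends on an undelivered message
```

A message that fails the test stays buffered and is re-checked after each delivery updates the receiver's vector.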

Enforcing causal communication

Coordination Mutual exclusion

Mutual exclusion
Problem
Several processes in a distributed system want exclusive access to some
resource.

Basic solutions
Permission-based: A process wanting to enter its critical region, or access a
resource, needs permission from other processes.
Token-based: A token is passed between processes. The one who has the
token may proceed in its critical region, or pass it on when not
interested.


Permission-based, centralized
Simply use a coordinator

[Figure: panels (a), (b), and (c)]


(a) Process P1 asks the coordinator for permission to access a shared
resource. Permission is granted.
(b) Process P2 then asks permission to access the same resource. The
coordinator does not reply.
(c) When P1 releases the resource, it tells the coordinator, which then replies
to P2 .
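
Steps (a) to (c) can be sketched as a coordinator that tracks the current holder and a FIFO queue of waiting processes. This is a simplified, single-resource illustration; the class and method names are assumptions, and message passing is replaced by direct method calls.

```python
from collections import deque


class Coordinator:
    """Grants exclusive access to one shared resource."""

    def __init__(self):
        self.holder = None        # process currently holding the resource
        self.waiting = deque()    # processes whose requests were not answered yet

    def request(self, pid):
        """Return True if permission is granted now, False if the reply is withheld."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release the resource; return the process granted next (or None)."""
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


if __name__ == "__main__":
    c = Coordinator()
    print(c.request(1))   # (a) P1 is granted access
    print(c.request(2))   # (b) P2 gets no reply and waits
    print(c.release(1))   # (c) P1 releases; the coordinator now replies to P2
```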


Mutual exclusion: Ricart & Agrawala


The same as Lamport’s algorithm, except that acknowledgments are not sent automatically.
A process returns a response to a request only when:
• The receiving process has no interest in the shared resource; or
• The receiving process is waiting for the resource, but has lower priority
(known through comparison of timestamps).
In all other cases, the reply is deferred, implying some more local administration.
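
The reply rule can be sketched as a single decision function. The state names and the `(timestamp, process ID)` request format are assumptions introduced here; a lower pair means higher priority.

```python
def should_reply_now(state, own_request, incoming_request):
    """Decide whether to answer an incoming request immediately.

    state:            "RELEASED" (no interest), "WANTED" (waiting), or "HELD" (using it)
    own_request:      (timestamp, pid) of this process's pending request, or None
    incoming_request: (timestamp, pid) of the request just received
    """
    if state == "RELEASED":
        return True                                # no interest in the resource
    if state == "WANTED":
        return incoming_request < own_request      # the other request has priority
    return False                                   # resource in use: defer the reply


if __name__ == "__main__":
    print(should_reply_now("RELEASED", None, (8, 0)))
    print(should_reply_now("WANTED", (12, 2), (8, 0)))
    print(should_reply_now("WANTED", (8, 0), (12, 2)))
```

Deferred requests are kept in a local list and answered once the process leaves its critical region.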

Example with three processes

[Figure: panels (a), (b), and (c)]

(a) Two processes want to access a shared resource at the same moment.
(b) P0 has the lowest timestamp, so it wins.
(c) When process P0 is done, it sends an OK also, so P2 can now go ahead.


Mutual exclusion: Token ring algorithm


Essence
Organize processes in a logical ring, and let a token be passed between them.
The one that holds the token is allowed to enter the critical region (if it wants
to).

An overlay network constructed as a logical ring with a circulating token


Decentralized mutual exclusion


Principle
Assume every resource is replicated N times, with each replica having its own
coordinator ⇒ access requires a majority vote from m > N/2 coordinators. A
coordinator always responds immediately to a request.

Assumption
When a coordinator crashes, it will recover quickly, but will have forgotten
about permissions it had granted.


Mutual exclusion: comparison

Algorithm       Messages per entry/exit                   Delay before entry (in message times)
Centralized     3                                         2
Distributed     2(N − 1)                                  2(N − 1)
Token ring      1,..., ∞                                  0,..., N − 1
Decentralized   2kN + (k − 1)N/2 + N, k = 1, 2,...        2kN + (k − 1)N/2
