Distributed Computing Clocks
Distributed Computing Clocks
Minu Poulose
Assistant Professor
Div. of CSE,SOE
Contents
●
Synchronizing physical clocks – Berkeley
Algorithm
●
Network Time Protocol (NTP)
●
Logical Clocks
●
Vector Clocks
Berkeley Algorithm (1989)
●
An algorithm for internal synchronization of a group of computers
running Berkeley UNIX
●
A co-ordinator computer is chosen as master
●
A master polls other computers whose clocks are to be synchronized
(slaves)
●
Slaves send back their clock values
●
The master uses round trip times to estimate the slaves’ clock values
●
It takes an average of the values obtained including its own clock timing
Berkeley Algorithm (1989)
●
It eliminates faulty clock values to get a fault tolerant average
●
Accuracy of the protocol depends upon a nominal maximum round trip
time between the master and the slaves.
●
Instead of sending the updated correct time to the slaves, the master sends
the amount by which each individual slave’s clock has to be adjusted (+ve
or –ve value)
●
This eliminates further uncertainties that may occur due to message
transmission time
Berkeley Algorithm (1989)
●
Measurements
●
With 15 computers, clock synchronization of 20-25 millisecs was
achieved using the protocol (drift rate < 2x10-5 , maximum round trip
time taken as 10 millisecs)
●
If master fails, it can elect a new master as predecessor to take over (not
guaranteed in bounded time)
●
Christian’s and Berkeley’s algorithms are intended to be used within
intranets
Network Time Protocol (NTP)
●
NTP distribute time information over the Internet
●
NTP’s design aims and features:
(2) To provide a reliable service that can survive lengthy losses of connectivity
There are redundant servers and redundant paths between the servers. The
servers can reconfigure so as to continue to provide the service if one of
them becomes unreachable.
●
Network Time Protocol (NTP)
●
The NTP service is provided by a network of servers located across
the Internet.
●
Primary servers are connected directly to a time source such as a
radio clock receiving UTC
●
Secondary servers are synchronized, ultimately, with primary servers.
●
The servers are connected in a logical hierarchy called a
synchronization subnet as in Fig, whose levels are called strata.
Network Time Protocol (NTP)
●
Primary servers occupy stratum 1: they are at the root.
●
Stratum 2 servers are secondary servers that are synchronized
directly with the primary servers
●
Stratum 3 servers are synchronized with stratum 2 servers, and
so on.
●
The lowest level (leaf) servers execute in users’ workstations.
Example Synchronization subnet in an
NTP implementation
Primary servers are connected to Secondary servers are synchronized to
UTC sources primary servers
2 2
3 3 3
●
The clocks belonging to servers with high stratum numbers are liable to be less
accurate than those with low stratum numbers because errors are introduced at
each level of synchronization.
●
NTP takes into account the total message round trip delays of the servers to the
root
●
The synchronization subnet can reconfigure as servers become unreachable or
failures occur
Eg: a primary that loses its UTC source can become a secondary, a secondary
that loses its primary can use another primary
Network Time Protocol (NTP)
(1) Multicast mode
●
A server within a high speed LAN periodically multicasts time to other
servers which set clocks assuming some delay
●
Not very accurate; but sufficient for many purposes
●
Each message bears timestamps of recent events:
●
Local times of Send and Receive of previous message
●
Local times of Send and Receive of current message
● We have Ti-3, Ti-2, Ti-1, Ti for the messages m and m’ between servers A and
B
Messages exchanged between a
pair of NTP peers
Server B Ti-2 Ti-1
Time
m m'
Time
Server A Ti- 3 Ti
●
In symmetric mode there can be a non-negligible delay between arrival of
one message and the dispatch of the next
●
Also messages may be lost, but the three timestamps carried by each
message are valid
Accuracy of NTP
● For each pair of messages between two servers, NTP estimates an offset oi between
the two clocks and a delay di (total transmission time for the two messages)
●
If the true offset of the clock at B relative to A is o and if the actual transmission
times for m and m’ are t and t’, then we have
● Ti-2 = Ti-3 + t + o and Ti = Ti-1 + t’ - o
●
This gives us (by adding the equations) :
● di = t + t’ = Ti-2 - Ti-3 + Ti - Ti-1
●
Also (by subtracting the equations)
● o = oi + (t’ - t )/2 where oi = (Ti-2 - Ti-3 + Ti-1 - Ti)/2
Accuracy of NTP
● NTP servers use a data filtering algorithm to filter pairs <oi, di>,
estimating the offset o and calculating the quality of this estimate
as a statistical quantity called Filter Dispersion
●
High filter dispersion represents relatively unreliable data
● NTP servers retain eight most recent pairs <oi,di>
● The value oi of that corresponds to the minimum value di is chosen
to estimate o
Accuracy of NTP
●
A NTP server exchange with several peers in addition to with parent
●
Peers with lower stratum numbers are favored
●
Peer with the lowest synchronization dispersion are favored
●
Accuracy of tens of milliseconds over Internet paths and 1
millisecond on LANs
Logical time and Logical clocks
●
Instead of synchronizing clocks, event ordering can be used
●
Event ordering is based on
(1) If two events occurred at the same process pi (i = 1, 2, … N) then
they occurred in the order observed by pi, that is i
(2) When a message, m is sent between two processes, send(m)
happened before receive(m)
(3) The happened before relation is transitive
●
It is also known as the relation of casual ordering or potential casual
ordering
Logical time and Logical clocks
● The happened before relation can be defined as follows
●
HB1: If process
pi: e ie’,
then e e’
●
HB2: For any message m, send(m) receive(m)
●
HB3: IF e, e’ and e’’ are events such that e e’ and
e’e’’,
then ee’’
●
The relation illustrated for three processes p1, p2, p3 in figure
Events occurring at three processes
p1
a b m1
b c because of m1
a b (at p1)
p2 Physical
c d time
m2
a || e c d (at p2) d f because of m2
p3
e a f considering all the above chain f
●
Not all events are related by
●
Consider a and e (different processes and no chain of messages to relate them)
●
They are not related by ; they are said to be concurrent; write as a || e
Lamport’s Logical Clocks
●
Logical clocks – Lamport’s mechanism by which the HB ordering can be
captured numerically
●
A logical clock is a monotonically increasing software counter; not related
to a physical clock.
● Each process pi has a logical clock, Li which can be used to apply logical
Lamport’s timestamps to events
●
Processes update their logical clocks and transmit the values of their logical
clocks in messages as follows
● LC1: Li is incremented before each event is issued at process pi :
Li :=Li+1
● LC2: (a) When a process pi sends a message m, it piggybacks on m
the value t = Li;
●
Processes update their logical clocks and
●
e e’ L(e) < L(e’)
●
But the converse is not true; that is L(e) < L(e') does not imply e e’
● Figure illustrates the use of logical clocks; processes p1, p2, p3 has their
logical clocks initialized to 0
●
Clock values given are those immediately after the event to which they
are adjacent
●
Eg: L(b) > L(e) but b || e
Lamport’s timestamps for the events
1 2 Piggyback 2 on m1
p1
a b m1 At p2 on receipt of m1, get max (0, 2) = 2
Add 1 to get 3 as the value
3 4
p2 Physical
time
c d m2
1 5
p3
e f
Initially each of p1, p2, p3 has its logical clock set to zero.
The clock values are incremented by 1 when each event occurs.
Eg: 1 for a, 2 for b.
●
Some pairs of distinct events, generated by different processes have
numerically identical Lamport timestamps
●
We can create a total order of events by using both the timestamp and the
identifiers of the processes at which events occur <t,id> pair
●
Assumptions:
● Ti : local timestamp of event e occurring at pi
or Ti = Tj and i < j
●
Useful in some applications.
●
Eg: to order entry of processes to a critical section
Vector Clocks
●
Overcome the shortcomings of Lamport’s clocks; from L(e) < L(e') we
cannot conclude that e
e’
●
Vector clock for a system of N processes is an array of N integers
● Each process pi keeps a vector clock Vi which it uses to timestamp local
events
●
Processes piggyback vector timestamps on the messages they send to one
another
●
Rules for updating clocks are as follows
Vector Clocks
Vi[i] := Vi[i] +1
● VC3: pi includes the value t = Vi in every message it sends
● VC4: When pi receives a timestamp t in a message, it sets
Vi[j] :=max(Vi[j], t[j]), for j=1,2…,N (Taking component wise
maximum of two vector timestamps in this way is known as a
merge operation)
Vector Clocks
●
Let V(e) be the vector timestamp applied by the process at which e occurs,
we can show that
●
Converse is also true; V(e) < V(e`) e e`
●
Figure shows the vector time stamps of the events
●
Disadvantage: Amount of storage and message payload is proportional to
the number of processes N
Vector Clocks
(1,0,0) (2,0,0) At p1 a(1,0,0) b (2,0,0); piggyback (2,0,0) on m1
p1
a b m 1 At p2 on receipt of m1, get max ((0,0,0), (2,0,0)) = (2, 0, 0) Add 1
to own element = (2,1,0)
(2,1,0) (2,2,0)
p2 Physical
time
c d
m2
(0,0,1) (2,2,2)
p3
e f
V(a) < V(f) shows that a f