Module3 Notes

This document discusses the importance of time and global states in distributed systems, focusing on clock synchronization methods and algorithms for capturing global states. It covers various synchronization techniques, including Cristian's method, the Berkeley algorithm, and the Network Time Protocol (NTP), as well as logical clocks and vector clocks for event ordering. The document emphasizes the challenges of maintaining accurate time across distributed systems and the significance of both physical and logical clocks in achieving consistency and coordination.

Uploaded by

kaifmohammed6777

DISTRIBUTED SYSTEM (BCS515D)
MODULE-3 NOTES

TIME AND GLOBAL STATES



INTRODUCTION
• Time is an important and interesting issue in distributed
systems, for several reasons.
• First, time is a quantity we often want to measure accurately. In
order to know at what time of day a particular event occurred at
a particular computer it is necessary to synchronize its clock with
an authoritative, external source of time.
• Second, algorithms that depend upon clock synchronization
have been developed for several problems in distribution.
• These include
o maintaining the consistency of distributed data
o checking the authenticity of a request sent to a server
o eliminating the processing of duplicate updates
• In the first half of this chapter, we examine methods whereby
computer clocks can be approximately synchronized, using
message passing. We go on to introduce logical clocks, including
vector clocks, which are used to define an order of events
without measuring the physical time at which they occurred.
• In the second half, we describe algorithms whose purpose is to
capture global states of distributed systems as they execute.

CLOCKS, EVENTS AND PROCESS STATES


• CLOCKS :
o Clocks are electronic devices that count oscillations occurring in
a crystal at a definite frequency, and typically divide this count
and store the result in a counter register.
o Clock devices can be programmed to generate interrupts at
regular intervals.
o The operating system reads the node’s hardware clock value,
Hi(t), scales it and adds an offset so as to produce a software
clock Ci(t) = αHi(t) + β that approximately measures real,
physical time t for process pi.

• CLOCK SKEW AND CLOCK DRIFT :


o The instantaneous difference between the readings of any two
clocks is called their skew.
o Clock drift refers to several related phenomena where a clock
does not run at exactly the same rate as a reference clock.
o A clock’s drift rate is the change in the offset (difference in
reading) between the clock and a nominal perfect reference
clock per unit of time measured by the reference clock.
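
As a rough numerical illustration of the software clock, skew and drift definitions above, the following Python sketch models a hardware clock with a constant drift rate. The function names and the drift figure of 1e-6 seconds per second are illustrative assumptions, not values taken from these notes.

```python
def hardware_clock(t, drift_rate):
    """Reading H(t) of a hardware clock that gains drift_rate seconds per real second."""
    return (1.0 + drift_rate) * t


def software_clock(t, drift_rate, alpha=1.0, beta=0.0):
    """Software clock C(t) = alpha * H(t) + beta."""
    return alpha * hardware_clock(t, drift_rate) + beta


def skew(t, drift_a, drift_b):
    """Instantaneous difference between two software clocks at real time t."""
    return software_clock(t, drift_a) - software_clock(t, drift_b)


one_day = 24 * 60 * 60  # seconds
# Assumed figures: clock A runs fast by 1e-6 s/s, clock B runs slow by 1e-6 s/s.
print(skew(one_day, 1e-6, -1e-6))  # skew accumulated over one day, in seconds
```

With these assumed drift rates the two clocks move apart by roughly 0.17 seconds per day, which is why periodic resynchronization is needed.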

• COORDINATED UNIVERSAL TIME :


o Coordinated Universal Time, abbreviated as UTC, is an
international standard for timekeeping.
o UTC signals are synchronized and broadcast regularly from
land-based radio stations and satellites covering many parts of
the world.
o For example, in the USA, the radio station WWV broadcasts time
signals on several shortwave frequencies.
o Satellite sources include the Global Positioning System (GPS).
o Signals received from GPS satellites are accurate to about 1
microsecond. Computers with receivers attached can
synchronize their clocks with these timing signals.

SYNCHRONIZING PHYSICAL CLOCKS


• Physical clocks are fundamental tools in computing and
networking for measuring and synchronizing time. They are
crucial for coordinating events and actions across different
systems and devices.
• Physical clocks in distributed systems refer to the real-time
clocks within each node. These clocks are fundamental for
coordinating actions and maintaining the sequence of
operations.
• Physical clocks track the progression of time using hardware-
based mechanisms. These clocks are typically built into
computer systems and network devices.
• They provide a continuous time count from a specific starting
point.
• The two modes of synchronization are:
o External synchronization: For a synchronization bound D
> 0 and a source S of UTC time, |S(t) – Ci(t)| < D for i =
1, 2, ..., N and for all real times t in an interval I. Another
way of saying this is that the clocks Ci are accurate to
within the bound D.
o Internal synchronization: For a synchronization bound D >
0, |Ci(t) – Cj(t)| < D for i, j = 1, 2, ..., N and for all real
times t in an interval I. Another way of saying this is that
the clocks Ci agree within the bound D.
• Clocks that are internally synchronized are not necessarily
externally synchronized, since they may drift collectively from an
external source of time even though they agree with one
another.
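
The two definitions can be checked on a snapshot of clock readings taken at one instant. The sketch below is a minimal illustration; the function names and the sample readings are assumptions. It also shows a consequence of the triangle inequality: clocks that are externally synchronized within D are internally synchronized within 2D.

```python
def externally_synchronized(source, clocks, bound):
    """True if every clock reading is within `bound` of the UTC source reading."""
    return all(abs(source - c) < bound for c in clocks)


def internally_synchronized(clocks, bound):
    """True if every pair of readings differs by less than `bound`.

    Comparing the largest and smallest reading covers all pairs.
    """
    return max(clocks) - min(clocks) < bound


D = 0.5
utc = 100.0

# These clocks agree with one another but have drifted together away from UTC.
drifted = [103.0, 103.2, 102.9]
print(internally_synchronized(drifted, D), externally_synchronized(utc, drifted, D))

# These clocks are each within D of UTC, so they agree within 2D.
accurate = [99.7, 100.4]
print(externally_synchronized(utc, accurate, D), internally_synchronized(accurate, 2 * D))
```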

• SYNCHRONIZATION IN A SYNCHRONOUS SYSTEM


o In a synchronous system, bounds are known for the drift rate of
clocks, the maximum message transmission delay, and the time
required to execute each step of a process.
o One process sends the time t on its local clock to the other in a
message m. In principle, the receiving process could set its clock
to the time t + TTRANS, where TTRANS is the time taken to transmit
m between them. The two clocks would then agree.
o In practice TTRANS varies and is unknown, but there is always a
minimum transmission time, min, that can be measured or
conservatively estimated. In a synchronous system, by definition,
there is also an upper bound max on the time taken to transmit any
message. Let the uncertainty in the message transmission time be
u, so that u = (max – min). If the receiver sets its clock to be
t + min, then the clock skew may be as much as u, since the
message may in fact have taken time max to arrive.
o Similarly, if it sets its clock to t + max, the skew may again be as
large as u. If, however, it sets its clock to the halfway point, t +
(max + min) / 2, then the skew is at most u/2.
o In general, for a synchronous system, the optimum bound that
can be achieved on clock skew when synchronizing N clocks is
u(1 – 1/N).
o Most distributed systems found in practice are asynchronous:
the factors leading to message delays are not bounded in their
effect, and there is no upper bound max on
message transmission delays.
o For an asynchronous system, we may say only that
TTRANS = min + x, where x ≥ 0. The value of x is not known in a
particular case, although a distribution of values may be
measurable for a particular installation.
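
The reasoning for the synchronous case (known min and max transmission times) can be sketched in Python as follows. The function names and the figures of 2 ms and 10 ms are assumptions chosen only to make the arithmetic concrete.

```python
def receiver_setting(t, min_delay, max_delay):
    """Clock value the receiver adopts: the sender's time plus the midpoint delay."""
    return t + (min_delay + max_delay) / 2


def worst_case_skew(added_delay, min_delay, max_delay):
    """Largest possible skew if the receiver sets its clock to t + added_delay.

    The true transmission time lies somewhere in [min_delay, max_delay].
    """
    return max(abs(added_delay - min_delay), abs(max_delay - added_delay))


def optimum_bound(u, n):
    """Best achievable skew bound u * (1 - 1/N) when synchronizing N clocks."""
    return u * (1 - 1 / n)


min_delay, max_delay = 2, 10      # milliseconds (assumed)
u = max_delay - min_delay         # uncertainty u = max - min

print(worst_case_skew(min_delay, min_delay, max_delay))   # setting t + min
print(worst_case_skew(max_delay, min_delay, max_delay))   # setting t + max
print(worst_case_skew((min_delay + max_delay) / 2, min_delay, max_delay))  # midpoint
print(optimum_bound(u, 2))        # two clocks: same as the midpoint bound u/2
```

Setting the clock to either end of the range risks a skew of u, while the midpoint halves it. For N = 2 the general bound u(1 – 1/N) reduces to the same u/2.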

• CRISTIAN’S METHOD FOR SYNCHRONIZING CLOCKS


o Cristian [1989] suggested the use of a time server, connected to
a device that receives signals from a source of UTC, to
synchronize computers externally.

CLOCK SYNCHRONIZATION USING A TIME SERVER

o A process p requests the time in a message mr, and receives the
time value t in a message mt from the time server S.
o Process p records the total round-trip time Tround taken to send
the request mr and receive the reply mt.
o It can measure this time with reasonable accuracy if its rate of
clock drift is small.
o A simple estimate of the time to which p should set its clock is
t + Tround/2, which assumes that the elapsed time is split equally
before and after S placed t in mt.
o If the minimum transmission time min is known, the earliest
point at which S could have placed the time in mt was min after
p dispatched mr.
o The latest point at which it could have done this was min before
mt arrived at p.
o The time by S’s clock when the reply message arrives is
therefore in the range [t + min, t + Tround – min].
o The width of this range is Tround – 2min, so the accuracy is
±(Tround/2 – min).
o The greater the accuracy required, the smaller the probability of
achieving it.
o This is because the most accurate results are those in which both
messages are transmitted in a time close to min – an unlikely
event in a busy network.
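
A minimal Python sketch of one exchange in Cristian's method is given below. It is an illustration under stated assumptions: `request_time` is a hypothetical callable standing in for the request/reply exchange with the time server, and `min_delay` is the assumed minimum one-way transmission time (0 if unknown).

```python
import time


def cristian_estimate(t_server, t_round, min_delay=0.0):
    """Return (clock setting, +/- accuracy) for one request/reply exchange.

    t_server  -- time value t carried in the reply
    t_round   -- measured round-trip time Tround
    min_delay -- assumed minimum one-way transmission time
    """
    estimate = t_server + t_round / 2
    accuracy = t_round / 2 - min_delay
    return estimate, accuracy


def cristian_sync(request_time, min_delay=0.0):
    """Time one exchange with a monotonic local clock and apply the estimate.

    request_time is a hypothetical callable that sends the request and
    returns the server's time value t once the reply arrives.
    """
    sent = time.monotonic()
    t_server = request_time()
    t_round = time.monotonic() - sent
    return cristian_estimate(t_server, t_round, min_delay)


# Worked figures: t = 100.0 s, Tround = 20 ms, min = 4 ms.
print(cristian_estimate(100.0, 0.020, 0.004))
```

With these figures the clock is set to about 100.010 s with an accuracy of about ±6 ms. A shorter round trip gives a tighter bound, which matches the remark that the most accurate results come from exchanges whose transmission times are close to min.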

• THE BERKELEY ALGORITHM


o In the Berkeley algorithm, a coordinator computer is chosen to act
as the master.
o Unlike in Cristian’s protocol, this computer periodically polls the
other computers whose clocks are to be synchronized, called
slaves. The slaves send back their clock values to it.
o The master estimates their local clock times by observing the
round-trip times, and it averages the values obtained including
its own clock’s reading.
o The balance of probabilities is that this average cancels out the
individual clocks’ tendencies to run fast or slow.
o The accuracy of the protocol depends upon a nominal maximum
round-trip time between the master and the slaves.
o The master eliminates any occasional readings associated with
larger times than this maximum.

o The Berkeley algorithm also eliminates readings from faulty clocks.
o Such clocks could have a significant adverse effect if an ordinary
average were taken, so instead the master takes a fault-tolerant
average.
o That is, a subset is chosen of clocks that do not differ from one
another by more than a specified amount, and the average is
taken of readings from only these clocks.
o If the master fails, then another can be elected to take over and
function exactly as its predecessor.
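
One polling round of the master can be sketched as follows. This is an illustrative sketch, not a definitive implementation, and it makes several assumptions that the notes do not spell out: each slave's current time is estimated as its reported value plus half the round-trip time, the "subset" is taken to be the largest group of estimates lying within `max_spread` of one another, and the master returns a signed adjustment for each clock rather than an absolute time. All names are hypothetical.

```python
def fault_tolerant_average(readings, max_spread):
    """Average the largest group of readings that lie within max_spread of one another."""
    ordered = sorted(readings)
    best = []
    start = 0
    for end in range(len(ordered)):
        # Shrink the window until its extremes are within max_spread.
        while ordered[end] - ordered[start] > max_spread:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return sum(best) / len(best)


def berkeley_round(master_time, slave_replies, max_round_trip, max_spread):
    """slave_replies maps a name to (reported clock value, measured round-trip time)."""
    estimates = {"master": master_time}
    for name, (value, round_trip) in slave_replies.items():
        if round_trip <= max_round_trip:      # discard exchanges that took too long
            estimates[name] = value + round_trip / 2
    average = fault_tolerant_average(estimates.values(), max_spread)
    # Signed adjustment for every clock that contributed an estimate.
    return {name: average - est for name, est in estimates.items()}


# Times in minutes past midnight; round trips are taken as 0 for readability.
replies = {"a": (205.0, 0.0), "b": (170.0, 0.0), "c": (600.0, 0.0), "d": (181.0, 5.0)}
print(berkeley_round(180.0, replies, max_round_trip=1.0, max_spread=60.0))
```

In this run the reply from d is dropped because its round trip exceeds the nominal maximum, and the faulty clock c is left out of the average, which comes to 185. The master, a and b are adjusted by +5, -20 and +15 minutes, and c is told to move back by 415 minutes.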

• THE NETWORK TIME PROTOCOL


o The Network Time Protocol (NTP) defines an architecture for a
time service and a protocol to distribute time information over
the Internet.
o NTP’s chief design aims and features are as follows:
✓ To provide a service enabling clients across the Internet to be
synchronized accurately to UTC: NTP employs statistical
techniques for the filtering of timing data and it discriminates
between the quality of timing data from different servers.
✓ To provide a reliable service that can survive lengthy losses of
connectivity: There are redundant servers and redundant paths
between the servers. The servers can reconfigure so as to
continue to provide the service if one of them becomes
unreachable.
✓ To enable clients to resynchronize sufficiently frequently to
offset the rates of drift found in most computers: The service is
designed to scale to large numbers of clients and servers.
✓ To provide protection against interference with the time
service, whether malicious or accidental: The time service uses
authentication techniques to check that timing data originate
from the claimed trusted sources. It also validates the return
addresses of messages sent to it.
o The NTP service is provided by a network of servers located
across the Internet.
o The servers are connected in a logical hierarchy called a
synchronization subnet whose levels are called strata.
o Primary servers occupy stratum 1: they are at the root.
o Stratum 2 servers are secondary servers that are synchronized
directly with the primary servers; stratum 3 servers are
synchronized with stratum 2 servers, and so on.
o The clocks belonging to servers with high stratum numbers are
liable to be less accurate than those with low stratum numbers,
because errors are introduced at each level of synchronization.
o NTP servers synchronize with one another in one of three
modes: multicast, procedure-call and symmetric mode.
o Multicast mode is intended for use on a high-speed LAN. One or
more servers periodically multicast the time to the servers
running in other computers connected by the LAN, which set
their clocks assuming a small delay. This mode can achieve only
relatively low accuracies.
o In procedure-call mode, one server accepts requests from other
computers, which it processes by replying with its timestamp.
This mode is suitable where higher accuracies are required than
can be achieved with multicast, or where multicast is not
supported in hardware.
o For example, file servers on the same or a neighbouring LAN that
need to keep accurate timing information for file accesses could
contact a local server in procedure-call mode.
o Symmetric mode is intended for use by the servers that supply
time information in LANs and by the higher levels (lower strata)
of the synchronization subnet, where the highest accuracies are
to be achieved. A pair of servers operating in symmetric mode
exchange messages bearing timing information.
o For each pair of messages sent between two servers, NTP
calculates an offset o_i, which is an estimate of the actual offset
between the two clocks, and a delay d_i, which is the total
transmission time for the two messages. Suppose server A sends
message m at time T_{i-3}, server B receives it at T_{i-2}, B sends
the reply m' at T_{i-1} and A receives it at T_i, each time being read
from the local clock. If the true offset of the clock at B relative
to that at A is o, and if the actual transmission times for m and
m' are t and t', respectively, then we have:
T_{i-2} = T_{i-3} + t + o and T_i = T_{i-1} + t' - o
This leads to:
d_i = t + t' = T_{i-2} - T_{i-3} + T_i - T_{i-1}
and:
o = o_i + (t' - t)/2, where o_i = (T_{i-2} - T_{i-3} + T_{i-1} - T_i)/2
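
The offset and delay calculation can be checked numerically with a short Python sketch. The function name and the sample timestamps are assumptions. The four arguments are the timestamps of one message pair: m sent by A, m received by B, m' sent by B, m' received by A.

```python
def ntp_offset_delay(t_i3, t_i2, t_i1, t_i):
    """Return (o_i, d_i) from the four timestamps of one message pair.

    t_i3 -- A's clock when m is sent
    t_i2 -- B's clock when m is received
    t_i1 -- B's clock when m' is sent
    t_i  -- A's clock when m' is received
    """
    delay = (t_i2 - t_i3) + (t_i - t_i1)
    offset = ((t_i2 - t_i3) + (t_i1 - t_i)) / 2
    return offset, delay


# Assumed scenario: true offset o = 5, transmission times t = 2 and t' = 4.
o, t, t_prime = 5.0, 2.0, 4.0
t_i3 = 100.0                 # A sends m
t_i2 = t_i3 + t + o          # B receives m (B's clock is ahead by o)
t_i1 = 110.0                 # B sends m'
t_i = t_i1 + t_prime - o     # A receives m'

o_i, d_i = ntp_offset_delay(t_i3, t_i2, t_i1, t_i)
print(o_i, d_i)              # estimate 4.0, delay 6.0
print(o_i - d_i / 2 <= o <= o_i + d_i / 2)
```

The estimate o_i differs from the true offset by (t' – t)/2, which is 1 here. Because t and t' are both non-negative, the true offset always lies between o_i – d_i/2 and o_i + d_i/2, so a small measured delay means a tight bound on the error of the estimate.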

LOGICAL TIME AND LOGICAL CLOCKS

• LOGICAL CLOCKS
o Lamport [1978] invented a simple mechanism by which the
happened-before ordering can be captured numerically,
called a logical clock.
o A Lamport logical clock is a monotonically increasing
software counter, whose value need bear no particular
relationship to any physical clock.
o Each process pi keeps its own logical clock, Li , which it uses
to apply so-called Lamport timestamps to events. We denote
the timestamp of event e at pi by Li(e) , and by L(e) we denote
the timestamp of event e at whatever process it occurred at.
o To capture the happened-before relation, processes update
their logical clocks and transmit the values of their logical
clocks in messages as follows:
LC1:
Li is incremented before each event is issued at process pi:
Li := Li + 1.
LC2:
(a) When a process pi sends a message m, it piggybacks
on m the value t = Li.
(b) On receiving (m, t), a process pj computes Lj := max(Lj, t)
and then applies LC1 before timestamping the event
receive(m).
o Although we increment clocks by 1, we could have chosen
any positive value.
o It can easily be shown, by induction on the length of any
sequence of events relating two events e and e’, that
e -> e’ ⇒ L(e) < L(e’).
o In the example figure, each of the processes p1, p2 and p3 has
its logical clock initialized to 0.
o The clock values given are those immediately after the event
to which they are adjacent.
o Note that, for example, L(b) > L(e) but b || e.
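
Rules LC1 and LC2 can be sketched as a small Python class. The class and method names are assumptions, and the run below is a hypothetical three-process execution whose event names a to f are chosen so that it shows the situation noted above, where L(b) > L(e) although b and e are concurrent.

```python
class LamportClock:
    """Logical clock Li for one process, starting at 0."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """LC1: increment before each event and return the event's timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """LC2(a): sending is an event; its timestamp t is piggybacked on m."""
        return self.tick()

    def receive(self, t):
        """LC2(b): take max(Lj, t), then apply LC1 to timestamp receive(m)."""
        self.time = max(self.time, t)
        return self.tick()


p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()

a = p1.tick()        # local event at p1
b = p1.send()        # p1 sends m1
c = p2.receive(b)    # p2 receives m1
d = p2.send()        # p2 sends m2
e = p3.tick()        # local event at p3, unrelated to the messages
f = p3.receive(d)    # p3 receives m2

print(a, b, c, d, e, f)   # 1 2 3 4 1 5
```

Here b happened before f and L(b) < L(f), as the implication requires. Event e has no causal connection to b, yet L(b) = 2 is greater than L(e) = 1, so comparing Lamport timestamps alone cannot establish that one event happened before another.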

• VECTOR CLOCKS
o Mattern [1989] and Fidge [1991] developed vector clocks
to overcome the shortcoming of Lamport’s clocks: the fact
that from L(e) < L(e’) we cannot conclude that e -> e’.
o A vector clock for a system of N processes is an array of N
integers. Each process keeps its own vector clock, Vi , which
it uses to timestamp local events.
o As with Lamport timestamps, processes piggyback vector
timestamps on the messages they send to one another, and
there are simple rules for updating the clocks.
o For a vector clock Vi, Vi[i] is the number of events that pi
has timestamped, and Vi[j] (j ≠ i) is the number of events
that have occurred at pj that have potentially affected pi.
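
The notes do not list the update rules, so the Python sketch below assumes the commonly stated ones: every entry starts at 0, a process increments its own entry before timestamping each event, it piggybacks its whole vector on each message it sends, and on receipt it takes the component-wise maximum with the received vector before timestamping the receive event. The class, the comparison helpers and the three-process run are illustrative assumptions.

```python
class VectorClock:
    """Vector clock Vi for process number i in a system of n processes."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n

    def tick(self):
        """Increment the process's own entry and return a copy as the timestamp."""
        self.v[self.i] += 1
        return list(self.v)

    def send(self):
        """Sending is an event; its timestamp is piggybacked on the message."""
        return self.tick()

    def receive(self, t):
        """Merge by component-wise maximum, then timestamp the receive event."""
        self.v = [max(own, other) for own, other in zip(self.v, t)]
        return self.tick()


def happened_before(v, w):
    """True if timestamp v is less than w: every entry <= and the vectors differ."""
    return all(x <= y for x, y in zip(v, w)) and v != w


def concurrent(v, w):
    """True if neither timestamp is less than the other."""
    return not happened_before(v, w) and not happened_before(w, v)


p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)

a = p1.tick()        # [1, 0, 0]
b = p1.send()        # [2, 0, 0], p1 sends m1
c = p2.receive(b)    # [2, 1, 0]
d = p2.send()        # [2, 2, 0], p2 sends m2
e = p3.tick()        # [0, 0, 1]
f = p3.receive(d)    # [2, 2, 2]

print(happened_before(a, f), concurrent(b, e))   # True True
```

Unlike Lamport timestamps, comparing the vectors for b and e shows directly that neither event happened before the other.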
