Synchronization


SYNCHRONIZATION IN DISTRIBUTED SYSTEM

Clock synchronization is the mechanism for aligning the time of all the computers in a distributed
environment or system. Assume that there are three systems present in a distributed environment. To
send, receive, and manage data between these systems in a synchronized manner, their clocks must
agree; this process is known as Clock Synchronization. Synchronization in a distributed system is
more complicated than in a centralized system because it must rely on distributed algorithms.
Properties of Distributed Algorithms to Maintain Clock Synchronization:
• Relevant and correct information is scattered among multiple machines.
• Processes make decisions based only on local information.
• A single point of failure in the system must be avoided.
• No common clock or other source of precise global time exists.
• In distributed systems, time is therefore ambiguous.

Each node in a distributed system has its own clock, and the times on these clocks may drift apart.
It is therefore necessary to synchronize all the clocks in the distributed environment.
Types of Clock Synchronization:
1. Physical Clock Synchronization

2. Logical Clock Synchronization

3. Mutual Exclusion Synchronization

4. Election Algorithm

1. Physical Synchronization:
• In physical clock synchronization, all the computers have their own clocks.

• The physical clocks are needed to adjust the time of nodes. All the nodes in the system can share
their local time with all other nodes in the system.

• The time is set based on UTC (Coordinated Universal Time).

• The instantaneous difference between the readings of two clocks is known as “clock skew”, and the
gradual divergence of clocks over time is known as “clock drift”. Synchronization is necessary to
correct both.

• In physical synchronization, physical clocks are used to time stamp an event on that computer.

• If two events, E1 and E2, carry different timestamps t1 and t2, only the order in which the events
occurred is considered, not the exact time or day at which they occurred.
Several methods are used to synchronize physical clocks in distributed systems:
i. Cristian’s algorithm
ii. Berkeley’s algorithm
iii. NTP algorithm
i. Cristian’s Algorithm:
Cristian’s Algorithm is a clock synchronization algorithm used by client processes to
synchronize time with a time server. It works well in low-latency networks, where the
Round-Trip Time is short compared to the required accuracy, while redundancy-prone distributed
systems/applications do not go hand in hand with this algorithm. Here, Round-Trip Time
refers to the duration between the start of a Request and the end of the corresponding
Response.
Below is an illustration imitating the working of Cristian’s algorithm:

Algorithm:
• The process on the client machine sends a request to fetch the clock time (the time at the server)
to the Clock Server at time T0.
• The Clock Server listens to the request made by the client process and returns a response in the
form of the clock server’s time.
• The client process receives the response from the Clock Server at time T1 and calculates the
synchronized client clock time using the formula given below.

Tclient = Tserver + (T1 - T0)/2


Where, Tclient refers to the synchronized clock time,
Tserver refers to the clock time returned by the server,
T0 refers to the time at which request was sent by the client process,
T1 refers to the time at which response was received by the client process.
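The steps above can be sketched in Python; `request_server_time` is a hypothetical stand-in for the actual network call to the clock server.

```python
import time

def cristian_sync(request_server_time):
    """Estimate the synchronized client time via Cristian's algorithm."""
    t0 = time.monotonic()             # T0: request sent
    t_server = request_server_time()  # server's clock reading
    t1 = time.monotonic()             # T1: response received
    # Assume the one-way delay is half the round-trip time (T1 - T0).
    return t_server + (t1 - t0) / 2

# Usage with a simulated server whose clock reads 1000.0 seconds:
estimated = cristian_sync(lambda: 1000.0)
```

With a near-instant round trip, the estimate stays close to the server's reading; in a real deployment the accuracy is bounded by half the round-trip time.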
ii. Berkeley Algorithm:
Berkeley’s Algorithm is a clock synchronization technique used in distributed systems. The
algorithm assumes that each machine node in the network either doesn’t have an accurate time
source or doesn’t possess a UTC server.
Algorithm:
• An individual node is chosen as the master node from the pool of nodes in the network. This node
is the main node in the network; it acts as the master, and the rest of the nodes act as slaves.
• The master node is chosen using an election process/leader election algorithm.
• Master node periodically pings slave nodes and fetches clock time at them using Cristian’s
algorithm. The diagram below illustrates how the master sends requests to slave nodes:

• The diagram below illustrates how slave nodes send back time given by their system clock.

• Master node calculates the average time difference between all the clock times received and the
clock time given by the master’s system clock itself. This average time difference is added to the
current time at the master’s system clock and broadcasted over the network.
The formula for calculating the offset, given the clock readings Tm (master) and T1, T2, …, Tn
(slaves), averages all n + 1 values:
Offset = (Tm + T1 + T2 + T3 + … + Tn)/(n + 1)
After calculating this average, all the connected nodes (master and slave nodes) adjust their
clock time by adding or subtracting the difference between the average and their own clock, as
shown in the figure (C), given below.
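The averaging step can be sketched as follows (a simplified sketch: clock readings are plain numbers, and network delay is ignored):

```python
def berkeley_adjustments(master_time, slave_times):
    """Return the agreed average time and the adjustment each node
    (master first, then slaves) must apply to reach it."""
    times = [master_time] + slave_times
    average = sum(times) / len(times)
    return average, [average - t for t in times]

# Master reads 1000; slaves read 990, 1010, 1005:
average, adjustments = berkeley_adjustments(1000, [990, 1010, 1005])
# average = 1001.25; the master adds 1.25, and the slaves add
# 11.25, -8.75, and -3.75 respectively.
```

Note that each node applies a relative adjustment rather than being handed an absolute time, which is why Berkeley's algorithm tolerates transmission delay better than simply broadcasting a clock value.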

iii. Network Time Protocol (NTP) Algorithm:


NTP is a protocol that helps the clock times of computers in a network to be synchronized. It is
an application-layer protocol responsible for the synchronization of hosts on a TCP/IP network.
NTP was developed by David Mills in 1981 at the University of Delaware. It is required so that a
seamless connection is present between communicating computers.
Features of NTP:
Some features of NTP are:
• NTP servers have access to highly precise atomic clocks and GPS clocks.
• It uses Coordinated Universal Time (UTC) to synchronize CPU clock time.
• It minimizes vulnerabilities in information-exchange communication.
• Provides consistent timekeeping for file servers.
Working of NTP:
NTP works over the application layer; it uses a hierarchical system of time sources and
provides synchronization across stratum servers. At the topmost level there are highly
accurate time sources, e.g., atomic or GPS clocks. These clock sources are called stratum 0
servers, and they are linked to the NTP servers below them, called stratum 1, 2, 3, and so on.
These servers then provide the accurate date and time so that communicating hosts are synced
to each other.
Architecture of Network Time Protocol:

Consider the following scenario, in which a client and a server are communicating and their
clocks must be synchronized. Let T1 be the time the client sends its request, T2 the time the
server receives it, T3 the time the server sends its reply, and T4 the time the client receives
the reply. The time offset t and the corrected client time T4-new are then calculated as:
t = [(T2 - T1) + (T3 - T4)]/2
T4-new = T4 + t
For example, with T1 = 1100, T2 = 800, T3 = 850, and T4 = 1200:
t = [(800 - 1100) + (850 - 1200)]/2 = -325
T4-new = 1200 + (-325) = 875
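The same computation, written out in Python with the numbers from the worked example:

```python
def ntp_offset(t1, t2, t3, t4):
    """Clock offset t = [(T2 - T1) + (T3 - T4)] / 2."""
    return ((t2 - t1) + (t3 - t4)) / 2

# Timestamps from the worked example:
t = ntp_offset(t1=1100, t2=800, t3=850, t4=1200)
t4_new = 1200 + t
# t = -325, so the corrected client time T4-new is 875.
```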
Applications of NTP:
• Used in production systems where live sound is recorded.
• Used in the development of broadcasting infrastructure.
• Used where file-system updates need to be carried out across multiple computers based on
synchronized clock times.
• Used to implement security mechanisms that depend on consistent timekeeping over the
network.
• Used in network acceleration systems that rely on timestamp accuracy to measure
performance.
Advantages of NTP:
• It provides internet-based time synchronization between devices.
• It provides enhanced security within the premises.
• It is used in authentication systems such as Kerberos.
• It supports network acceleration, which helps in troubleshooting problems.
• It is used in file systems that are difficult to synchronize over the network.
Disadvantages of NTP:
• When servers are down, synchronization is affected across running communications.
• Servers are prone to error across different time zones, and conflicts may occur.
• Time accuracy may be slightly reduced.
• When NTP packet volume increases, synchronization can suffer conflicts.
• Synchronization can be manipulated, e.g., by spoofed time packets.

2. Logical Clocks:
• Logical clocks refer to implementing a protocol on all machines within your distributed system,
so that the machines can maintain consistent ordering of events within some virtual timespan.
• A logical clock is a mechanism for capturing chronological and causal relationships in a
distributed system.
• Distributed systems may have no physically synchronous global clock, so a logical clock allows
global ordering on events from different processes in such systems.
Example:
If we go on an outing, we plan in advance which place to visit first, second, and so on. We
don't visit the second place before the first; we follow the order that was planned beforehand.
In a similar way, operations on our PCs should be carried out one by one in an organized way.
Suppose we have more than 10 PCs in a distributed system and every PC is doing its own work;
how do we then make them work together? The solution to this is the LOGICAL CLOCK.

Method-1:
One approach to ordering events across processes is to try to sync all the clocks. This would
mean that if one PC shows 2:00 pm, then every PC should show the same time, which is practically
impossible: not every clock can be synced at once. So we cannot follow this method.
Method-2:
Another approach is to assign timestamps to events. Taking the example into consideration,
this means assigning the first place the number 1, the second place the number 2, the third
place the number 3, and so on. Then we always know that the first place comes first, and so on.
Similarly, if we give each PC its own number, the work is organized so that the 1st PC completes
its process first, then the second, and so on. But timestamps will only work as long as they
obey causality.
Causality:
Causality is based on the HAPPENED-BEFORE relationship.
• On a single PC, if two events A and B occur one after the other, then TS(A) < TS(B). If A has
a timestamp of 1, then B should have a timestamp greater than 1; only then does the
happened-before relationship hold.
• With 2 PCs, if event A on P1 (PC 1) is the sending of a message and event B on P2 (PC 2) is
its receipt, then the condition TS(A) < TS(B) must also hold. For example, suppose you send a
message at 2:00:00 pm and the other person receives it at 2:00:02 pm. Then it is obvious that
TS(sender) < TS(receiver).
Properties Derived from Happen Before Relationship:
• Transitive Relation:
If, TS(A) <TS(B) and TS(B) <TS(C), then TS(A) < TS(C)
• Causally Ordered Relation:
a -> b means that a occurs before b, so any change made in a can be reflected in b.
• Concurrent Event:
This means that not every process occurs one by one, some processes are made to happen
simultaneously i.e., A || B.
Two methods are used to implement logical clocks in distributed systems:
i. Lamport’s Logical Clock Algorithm
ii. Vector Clock Algorithm
1. Lamport’s Logical Clock Algorithm:
Lamport’s Logical Clock was created by Leslie Lamport. It is a procedure to determine the
order of events occurring. It provides a basis for the more advanced Vector Clock Algorithm.
Due to the absence of a Global Clock in a Distributed Operating System Lamport Logical
Clock is needed.
Algorithm:
• Happened before relation(->): a -> b, means ‘a’ happened before ‘b’.
• Logical Clock: The criteria for the logical clocks are:
➢ [C1]: Ci(a) < Ci(b) [ Ci -> logical clock; if ‘a’ happened before ‘b’ in the same process,
then the time of ‘a’ will be less than that of ‘b’. ]
➢ [C2]: Ci(a) < Cj(b) [ If ‘a’ is the sending of a message by process Pi and ‘b’ is the receipt
of that message by process Pj, then the clock value Ci(a) is less than Cj(b). ]
Reference:
• Process: Pi
• Event: Eij, where i is the process number and j denotes the jth event in the ith process.
• tm: timestamp carried by message m.
• Ci: the logical clock associated with process Pi.
• d: the clock increment; generally d is 1.
Implementation Rules[IR]:
• [IR1]: If a -> b [‘a’ happened before ‘b’ within the same process] then, Ci(b) =Ci(a) + d
• [IR2]: On receiving message m with timestamp tm, Cj = max(Cj + d, tm + d) [the receiver takes
the larger of its own next tick and tm + d]
For Example:

• Take the starting value as 1, since it is the 1st event and there is no incoming value at the
starting point:
➢ e11 = 1
➢ e21 = 1
• The value of the next point will go on increasing by d (d = 1), if there is no incoming value
i.e., to follow [IR1].
➢ e12 = e11 + d = 1 + 1 = 2
➢ e13 = e12 + d = 2 + 1 = 3
➢ e14 = e13 + d = 3 + 1 = 4
➢ e15 = e14 + d = 4 + 1 = 5
➢ e16 = e15 + d = 5 + 1 = 6
➢ e22 = e21 + d = 1 + 1 = 2
➢ e24 = e23 + d = 3 + 1 = 4
➢ e26 = e25 + d = 6 + 1 = 7
• When there will be incoming value, then follow [IR2] i.e., take the maximum value
between Cj and Tm + d.
➢ e17 = max(7, 5) = 7, [e16 + d = 6 + 1 = 7, e24 + d = 4 + 1 = 5, maximum among 7 and 5
is 7]
➢ e23 = max(3, 3) = 3, [e22 + d = 2 + 1 = 3, e12 + d = 2 + 1 = 3, maximum among 3 and 3
is 3]
➢ e25 = max(5, 6) = 6, [e24 + 1 = 4 + 1 = 5, e15 + d = 5 + 1 = 6, maximum among 5 and 6
is 6]
Limitation:
• If a -> b, then C(a) < C(b) always holds, so Lamport clocks are consistent with causality.
• However, the converse is not guaranteed: C(a) < C(b) does not imply a -> b, since the events
may be concurrent. This limitation motivates vector clocks.
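Rules [IR1] and [IR2] can be sketched as a minimal Python class (with d = 1, as above; the class and method names are illustrative):

```python
class LamportClock:
    """Scalar logical clock implementing [IR1] and [IR2]."""
    def __init__(self, d=1):
        self.d = d
        self.time = 0

    def tick(self):
        # [IR1]: a local (or send) event advances the clock by d.
        self.time += self.d
        return self.time

    def receive(self, tm):
        # [IR2]: take the larger of the local next tick and tm + d.
        self.time = max(self.time + self.d, tm + self.d)
        return self.time

# Reproducing e17 from the example: P1 has ticked six times (e11..e16)
# and then receives a message stamped tm = 4 (from e24):
p1 = LamportClock()
for _ in range(6):
    p1.tick()
e17 = p1.receive(4)   # max(6 + 1, 4 + 1) = 7
```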

2. Vector Clock Algorithm:


Vector Clock is an algorithm that generates a partial ordering of events and detects causality
violations in a distributed system. Vector clocks expand on scalar time to facilitate a causally
consistent view of the distributed system: they detect whether a given event has caused another
event. In essence, they capture all the causal relationships.
The algorithm labels every process with a vector (a list of integers), containing one integer
for the local clock of every process in the system. So for N given processes, there will be a
vector/array of size N.
How Does the Vector Clock Algorithm Work:
• Initially, all the clocks are set to zero.
• Every time an internal event occurs in a process, the process increments its own entry in the
vector by 1.
• Every time a process sends a message, it increments its own entry in the vector by 1 and
attaches the whole vector to the message.
• Every time a process receives a message, it updates each element of its vector to the maximum
of its own value and the corresponding value in the received vector, and then increments its
own entry in the vector by 1.
Example:
Consider N processes, each holding a vector of size N; the set of rules mentioned above is
executed by each vector clock:

The above example depicts the vector clock mechanism: the vector clocks are updated after the
execution of internal events, and the arrows indicate how the vector values are sent between
the processes (P1, P2, P3).
To sum up, vector clock algorithms are used in distributed systems to provide a causally
consistent ordering of events, but the entire vector is sent with every message to keep the
vector clocks in sync.
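The rules above can be sketched as a small Python class (illustrative names; `pid` is the process's own index in the vector):

```python
class VectorClock:
    """Vector clock for process `pid` in a system of `n` processes."""
    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n          # initially all zeros

    def internal_event(self):
        self.clock[self.pid] += 1     # increment own entry

    def send(self):
        self.clock[self.pid] += 1
        return list(self.clock)       # a copy travels with the message

    def receive(self, msg_clock):
        # element-wise maximum, then increment own entry
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_clock)]
        self.clock[self.pid] += 1

# P1 sends a message to P2 in a 3-process system:
p1, p2 = VectorClock(0, 3), VectorClock(1, 3)
msg = p1.send()       # p1.clock is now [1, 0, 0]
p2.receive(msg)       # p2.clock becomes [1, 1, 0]
```

Comparing two vectors element-wise then reveals ordering: if every entry of one is less than or equal to the other's (and at least one is strictly less), the first event happened before the second; otherwise the events are concurrent.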
3. Mutual Exclusion Synchronization:
Mutual exclusion is a concurrency control property introduced to prevent race conditions.
It is the requirement that a process cannot enter its critical section while another concurrent
process is executing in its critical section, i.e., only one process is allowed to execute the
critical section at any given instant.
Mutual Exclusion in Single Computer System Vs. Distributed System:
In a single computer system, memory and other resources are shared between processes. The status
of shared resources and of users is readily available in shared memory, so the mutual exclusion
problem can easily be solved with the help of shared variables (for example, semaphores).
In distributed systems, we have neither shared memory nor a common physical clock, and therefore
we cannot solve the mutual exclusion problem using shared variables. To solve the mutual
exclusion problem in a distributed system, an approach based on message passing is used.
A site in a distributed system does not have complete information about the state of the system,
due to the lack of shared memory and a common physical clock.
Requirements of Mutual Exclusion Algorithm:
• No Deadlock:
Two or more sites should not endlessly wait for a message that will never arrive.
• No Starvation:
Every site that wants to execute the critical section should get an opportunity to execute it
in finite time. No site should wait indefinitely to execute the critical section while another
site executes it repeatedly.
• Fairness:
Each site should get a fair chance to execute the critical section. Requests to execute the
critical section must be served in the order they are made, i.e., in the order of their arrival
in the system.
• Fault Tolerance:
In case of a failure, the system should be able to recognize it by itself and continue
functioning without disruption.
Solution to Distributed Mutual Exclusion:
As we know, shared variables or a local kernel cannot be used to implement mutual exclusion in
distributed systems. Message passing is the way to implement it. Below are the three approaches
based on message passing that implement mutual exclusion in distributed systems:
i. Token Based Algorithm:
• A unique token is shared among all the sites.
• If a site possesses the unique token, it is allowed to enter its critical section.
• This approach uses sequence numbers to order requests for the critical section.
• Each request for the critical section contains a sequence number, used to distinguish old
requests from current ones.
• This approach ensures mutual exclusion, as the token is unique.
• Example: Suzuki-Kasami’s Broadcast Algorithm
ii. Non-Token Based Approach:
• A site communicates with other sites to determine which site should execute the critical
section next. This requires the exchange of two or more successive rounds of messages among
sites.
• This approach uses timestamps instead of sequence numbers to order requests for the critical
section.
• Whenever a site makes a request for the critical section, it gets a timestamp. The timestamp
is also used to resolve conflicts between critical section requests.
• All algorithms that follow the non-token-based approach maintain a logical clock, updated
according to Lamport’s scheme.
• Example: Lamport’s Algorithm, Ricart-Agrawala Algorithm.
iii. Quorum Based Approach:
• Instead of requesting permission to execute the critical section from all other sites, each
site requests permission only from a subset of sites, called a quorum.
• Any two quorums contain at least one common site.
• This common site is responsible for ensuring mutual exclusion.
• Example: Maekawa’s Algorithm
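The intersection property that makes the quorum-based approach safe can be checked directly; the quorum sets below are hypothetical examples, not taken from Maekawa's construction:

```python
from itertools import combinations

def quorums_intersect(quorums):
    """True if every pair of quorums shares at least one common site."""
    return all(set(a) & set(b) for a, b in combinations(quorums, 2))

valid = [{1, 2, 3}, {3, 4, 5}, {1, 4, 6}]   # pairwise intersecting
broken = [{1, 2}, {3, 4}]                   # no common site
# quorums_intersect(valid) is True; quorums_intersect(broken) is False.
```

Because two conflicting requesters must both obtain permission from the shared site, and that site grants only one request at a time, the intersection property is what rules out two sites being in the critical section simultaneously.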

4. Election Algorithm:
A distributed algorithm is an algorithm that runs on a distributed system. A distributed system
is a collection of independent computers that do not share memory; each processor has its own
memory, and they communicate via communication networks. Communication is carried out by a
process on one machine exchanging messages with a process on another machine.
Many algorithms used in distributed systems require a coordinator that performs functions
needed by the other processes in the system.
Election algorithms are designed to choose such a coordinator.
Election Algorithms:
Election algorithms choose a process from a group of processes to act as a coordinator. If the
coordinator process crashes for some reason, a new coordinator is elected on another processor.
An election algorithm basically determines where a new copy of the coordinator should be
restarted. It assumes that every active process in the system has a unique priority number, and
the process with the highest priority is chosen as the new coordinator. Hence, when a
coordinator fails, the algorithm elects the active process with the highest priority number,
and this number is then sent to every active process in the distributed system. We have two
election algorithms for two different configurations of a distributed system.
i. The Bully Algorithm:
This algorithm applies to systems where every process can send a message to every other
process in the system.
Algorithm:
Suppose process P sends a message to the coordinator.
• If the coordinator does not respond within a time interval T, it is assumed that the
coordinator has failed.
• Process P then sends an election message to every process with a higher priority number.
• It waits for responses; if no one responds within time interval T, process P elects itself
as the coordinator.
• It then sends a message to all processes with lower priority numbers announcing that it has
been elected as their new coordinator.
• However, if an answer is received within time T from some other process Q:
➢ (I) Process P waits for a further time interval T’ to receive a message from Q announcing
that Q has been elected as coordinator.
➢ (II) If Q doesn’t respond within time interval T’, Q is assumed to have failed and the
algorithm is restarted.
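A simplified sketch of one Bully election round; this is an assumption-laden model where timeouts are collapsed into a membership check and responses arrive instantly:

```python
def bully_election(initiator, alive, all_ids):
    """Return the id elected as coordinator after `initiator` starts
    an election; `alive` is the set of processes currently up."""
    higher = [p for p in all_ids if p > initiator and p in alive]
    if not higher:
        return initiator   # no higher process answered within T
    # A higher process answered; the election restarts there, and
    # eventually the highest alive process elects itself.
    return max(higher)

# Process 3 detects the failure of coordinator 4; process 5 is up:
winner = bully_election(3, alive={1, 2, 3, 5}, all_ids=[1, 2, 3, 4, 5])
# winner is 5, the highest-priority alive process.
```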
ii. The Ring Algorithm:
This algorithm applies to systems organized as a ring (logically or physically). In this
algorithm we assume that the links between processes are unidirectional and that every process
can send messages only to the process on its right. The data structure this algorithm uses is
the active list, a list holding the priority numbers of all active processes in the system.
Algorithm:
• If process P1 detects a coordinator failure, it creates a new active list, which is initially
empty. It sends an election message to its neighbour on the right and adds the number 1 to its
active list.
• When process P2 receives the election message from the process on its left, it responds in
one of three ways:
➢ (I) If the received message’s active list does not contain P2’s number, P2 adds the number 2
to the active list and forwards the message.
➢ (II) If this is the first election message P2 has received or sent, it creates a new active
list with the numbers 1 and 2, and then sends election message 1 followed by election
message 2.
➢ (III) If process P1 receives its own election message 1, the active list at P1 now contains
the numbers of all active processes in the system. P1 then detects the highest priority number
in the list and elects that process as the new coordinator.
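One full circulation of the election message can be simulated as follows (a sketch; a real implementation passes the active list between separate nodes rather than iterating a local list):

```python
def ring_election(ring, initiator):
    """Simulate the election message travelling once around the ring.

    `ring` lists the alive process numbers in ring order; the message
    starts at `initiator`, and each process it visits appends its own
    number to the active list."""
    start = ring.index(initiator)
    active_list = []
    for i in range(len(ring)):
        active_list.append(ring[(start + i) % len(ring)])
    # Back at the initiator: the highest priority number wins.
    return max(active_list)

# Ring order 3 -> 1 -> 4 -> 2 (the failed coordinator removed);
# process 1 detects the failure and starts the election:
coordinator = ring_election([3, 1, 4, 2], initiator=1)
# coordinator is 4, the highest number collected in the active list.
```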
