CH 6

Uploaded by

Binyam Eshetu
Copyright
© All Rights Reserved

Chapter 6: Synchronization

Synchronization

• Synchronization in distributed systems is crucial for ensuring consistency,
coordination, and cooperation among distributed components.

• It addresses the challenges of maintaining data consistency, managing
concurrent processes, and achieving coherent system behavior across different
nodes in a network.

• By implementing effective synchronization mechanisms, distributed systems can
operate seamlessly, prevent data conflicts, and provide reliable and efficient
services.
Importance of Synchronization in Distributed Systems

• Synchronization in distributed systems is of paramount importance for the
following reasons:

• Data Integrity: Ensures that data remains consistent across all nodes, preventing
conflicts and inconsistencies.

• State Synchronization: Maintains a coherent state across distributed
components, which is crucial for applications like databases and file systems.

• Efficient Utilization: Optimizes the use of network and computational resources
by minimizing redundant operations.
• Task Coordination: Helps coordinate tasks and operations among distributed
nodes, ensuring they work together harmoniously.

• Resource Management: Manages access to shared resources, preventing
conflicts and ensuring fair usage.

• Redundancy Management: Ensures redundant systems are synchronized,
improving fault tolerance and system reliability.

• Recovery Mechanisms: Facilitates effective recovery mechanisms by maintaining
synchronized states and logs.
• Load Balancing: Ensures balanced distribution of workload, preventing
bottlenecks and improving overall system performance.

• Deadlock Prevention: Implements mechanisms to prevent deadlocks, where
processes wait indefinitely for resources.

• Scalable Operations: Supports scalable operations by ensuring that
synchronization mechanisms can handle increasing numbers of nodes and
transactions.
Types of Synchronization

1. Time/clock Synchronization

Clock synchronization ensures that all nodes in a distributed
system have a consistent view of time. This is crucial for coordinating
events, logging, and maintaining consistency in distributed applications.
Importance of Time Synchronization:

• Event Ordering: Ensures that events are recorded in the correct sequence across
different nodes.

• Consistency: Maintains data consistency in time-sensitive applications like
databases and transaction systems.

• Debugging and Monitoring: Accurate timestamps are vital for debugging,
monitoring, and auditing system activities.
Techniques:

• Network Time Protocol (NTP): Synchronizes clocks of computers over
a network.
• Precision Time Protocol (PTP): Provides higher accuracy time
synchronization for systems requiring precise timing.
• Logical Clocks: Ensure event ordering without relying on physical time
(e.g., Lamport timestamps).
2. Data Synchronization

• Data synchronization ensures that multiple copies of data across different nodes
in a distributed system remain consistent. This involves coordinating updates and
resolving conflicts to maintain a unified state.
• Importance of Data Synchronization:
- Consistency: Ensures that all nodes have the same data, preventing
inconsistencies.
- Fault Tolerance: Maintains data integrity in the presence of node failures and
network partitions.
- Performance: Optimizes data access and reduces latency by ensuring data is
correctly synchronized.
Techniques:

• Replication: Copies of data are maintained across multiple nodes to
ensure availability and fault tolerance.
• Consensus Algorithms: Protocols like Paxos and Raft, and Byzantine
fault-tolerant protocols, ensure agreement on the state of data across nodes.
• Eventual Consistency: Allows updates to be propagated
asynchronously, so that replicas converge to the same state over time (e.g.,
DynamoDB).
3. Process Synchronization

• Process synchronization coordinates the execution of processes in a distributed
system to ensure they operate correctly without conflicts. This involves managing
access to shared resources and preventing issues like race conditions, deadlocks,
and starvation.
• Importance of Process Synchronization:
- Correctness: Ensures that processes execute in the correct order and interact
safely.
- Resource Management: Manages access to shared resources to prevent conflicts
and ensure efficient utilization.
- Scalability: Enables the system to scale efficiently by coordinating process
execution across multiple nodes.
Techniques:

• Mutual Exclusion: Ensures that only one process accesses a critical section or
shared resource at a time (e.g., using locks, semaphores).

• Barriers: Synchronize the progress of processes, ensuring they reach a certain
point before proceeding.

• Condition Variables: Allow processes to wait for certain conditions to be met
before continuing execution.
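
The three techniques above can be illustrated with a minimal sketch. It uses threads inside a single Python process as a stand-in for distributed processes, which is an assumption made only to keep the example runnable; a real distributed system would need network-level equivalents such as distributed locks. The function name `run_demo` and its parameters are hypothetical.

```python
import threading


def run_demo(num_workers: int = 4, increments: int = 1000) -> int:
    """Increment a shared counter from several threads and return its final value."""
    counter = 0
    finished = 0
    lock = threading.Lock()                   # mutual exclusion for the counter
    barrier = threading.Barrier(num_workers)  # all workers start updating together
    done = threading.Condition()              # main thread waits on this condition

    def worker() -> None:
        nonlocal counter, finished
        barrier.wait()                        # block until every worker has arrived
        for _ in range(increments):
            with lock:                        # critical section: one thread at a time
                counter += 1
        with done:
            finished += 1
            done.notify_all()                 # wake anyone waiting for completion

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    with done:
        # Condition variable: sleep until the predicate becomes true.
        done.wait_for(lambda: finished == num_workers)
    for t in threads:
        t.join()
    return counter
```

Without the lock, the `counter += 1` updates could interleave and lose increments; with it, the result is always `num_workers * increments`.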

Clock Synchronization
• Centralized systems do not need clock synchronization because they run on a single
common clock. Distributed systems have no common clock: each machine runs on its
own internal clock and has its own notion of time.

• Time in distributed systems is measured in the following contexts:

- The time of day at which an event happened on a specific machine in the
network.
- The time interval between two events that happened on different machines in the
network.
- The relative ordering of events that happened on different machines in the network.
Clock synchronization
• Clock synchronization is the process of ensuring that physically distributed
processors have a common notion of time.

• It is the mechanism used to synchronize the time of all the computers in a
distributed environment or system.

• Clock synchronization deals with understanding the temporal ordering of events
produced by concurrent processes.

• It is useful for synchronizing senders and receivers of messages, determining
whether messages are related and their proper ordering, controlling joint activity,
and serializing concurrent access to shared objects.
Cont..

• Multiple autonomous processes running on different machines need to agree on, and
make consistent decisions about, the ordering of certain events in a system.

• Assume that there are three systems in a distributed environment. To send, receive,
and manage data between them with a shared, consistent notion of time, their clocks
have to be synchronized. The process of synchronizing these clocks is known as
clock synchronization.
Cont..

• Clock synchronization can be achieved in two ways: external and internal clock
synchronization.
- External clock synchronization: an external reference clock is present. It is used
as a reference, and the nodes in the system set and adjust their time accordingly.
- Internal clock synchronization: each node shares its time with the other nodes,
and all the nodes set and adjust their times accordingly.
Cont..
• There are two types of clock synchronization algorithms: centralized and
distributed.

1. Centralized algorithms use a time server as a reference. The single time server
propagates its time to the nodes, and all the nodes adjust their time accordingly.
Because they depend on a single time server, the whole system loses synchronization
if that node fails. Examples of centralized algorithms are the Berkeley Algorithm,
Passive Time Server, Active Time Server, etc.
Cont..

2. Distributed algorithms have no centralized time server. Instead, each node
compares its local time with the times of the other nodes and adjusts its clock by the
average of the differences. Distributed algorithms overcome the scalability and
single-point-of-failure problems of centralized algorithms. Examples of distributed
algorithms are the Global Averaging Algorithm, the Localized Averaging Algorithm,
NTP (Network Time Protocol), etc.
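
The averaging idea can be sketched as follows. This is a simplified illustration, not a full Global Averaging implementation: it assumes every node already knows every other node's clock reading at the same instant and ignores message delays and faulty clocks. The function name `averaging_corrections` and the node names are hypothetical.

```python
from typing import Dict


def averaging_corrections(clocks: Dict[str, float]) -> Dict[str, float]:
    """Return, for each node, the correction to add to its local clock.

    Each node averages the differences between every reported clock
    (including its own, which contributes 0) and its local clock.
    """
    corrections: Dict[str, float] = {}
    for node, local_time in clocks.items():
        differences = [other_time - local_time for other_time in clocks.values()]
        corrections[node] = sum(differences) / len(differences)
    return corrections


# Three nodes whose clocks read 100, 106 and 103 time units.
readings = {"A": 100.0, "B": 106.0, "C": 103.0}
adjust = averaging_corrections(readings)
# After applying the corrections every node agrees on 103.0.
synced = {node: readings[node] + adjust[node] for node in readings}
```

No node acts as a time server here: each one computes its own correction from the same set of readings, so losing a single node does not stop the others from synchronizing.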

Cont..

• Centralized clock synchronization algorithms suffer from two major drawbacks:

- They are subject to a single point of failure. If the time-server node fails, the clock
synchronization operation cannot be performed. This makes the system
unreliable. Ideally, a distributed system should be more reliable than its individual
nodes. If one goes down, the rest should continue to function correctly.

- From a scalability point of view, it is generally not acceptable to have all the time
requests serviced by a single time server. In a large system, such a solution puts a
heavy burden on that one process.
Cont..

• Distributed algorithms overcome these drawbacks because there is no centralized
time server. Instead, a simple method for clock synchronization is to
equip each node of the system with a real-time receiver so that each node's clock
can be independently synchronized with real time. Multiple real-time clocks (one
for each node) are normally used for this purpose.
Types of Clock Synchronization

1. Physical clock synchronization

2. Logical clock synchronization

1. Physical clock synchronization
• In physical clock synchronization, every computer has its own clock.

• The physical clocks are used to adjust the time of the nodes. All the nodes in the system can
share their local time with all other nodes in the system.

• The time is set based on UTC (Coordinated Universal Time), which serves as the reference
clock for the nodes in the system.

• Physical clocks keep the time of day, which should be consistent across systems.

• The difference between the readings of two clocks at a given instant is known as "clock skew".
The gradual divergence of a clock from the reference time is known as "clock drift". Because
clocks drift, they must be resynchronized periodically.
Cont..

• Physical clocks: In physical synchronization, physical clocks are used to
timestamp events on each computer.

• If two events, E1 and E2, have different timestamps t1 and t2, what is considered is the
order in which the events occurred, not the exact time or day at which
they occurred.
Cont..

• Several methods are used to synchronize the physical clocks in a
distributed system:

1. UTC (Coordinated Universal Time)

2. Cristian's algorithm

3. Berkeley algorithm
Coordinated Universal Time (UTC)

• Computers are generally synchronized to a standard time called
Coordinated Universal Time (UTC).
• UTC is the primary time standard by which the world regulates clocks
and time. It is available via radio signals, telephone lines,
and satellites (GPS).
• Computer servers and online services with a UTC receiver can be
synchronized by the satellite broadcast.
Cristian's Algorithm:

• This is the simplest algorithm for setting the time: a client issues a remote procedure
call (RPC) to the time server and obtains the time.

• Each machine sends a request to the time server at least every δ/2ρ seconds, where δ is
the maximum allowed difference between its clock and UTC and ρ is the maximum drift
rate of its clock.

• When the time server receives the request from the client, it replies with the
current UTC.
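
The resynchronization interval can be worked out with a small helper. This sketch assumes the usual bound of δ/(2ρ) seconds, where δ is the maximum tolerated difference from UTC and ρ is the maximum drift rate of a clock; the function name `max_resync_interval` is hypothetical.

```python
def max_resync_interval(delta: float, rho: float) -> float:
    """Longest time (seconds) a machine may wait between requests to the time server.

    delta: maximum tolerated difference between a clock and UTC, in seconds.
    rho:   maximum drift rate, in seconds of error per second of real time.
    Two clocks drifting in opposite directions diverge at up to 2 * rho,
    so they stay within delta of each other for at most delta / (2 * rho) seconds.
    """
    if delta <= 0 or rho <= 0:
        raise ValueError("delta and rho must be positive")
    return delta / (2.0 * rho)


# Tolerating 1 ms of error with clocks that drift at most 10 microseconds
# per second gives an interval of roughly 50 seconds.
interval = max_resync_interval(delta=0.001, rho=1e-5)
```

A tighter tolerance or a less stable oscillator shortens the interval, which in turn increases the load on the time server.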

CRISTIAN’S ALGORITHM

Algorithm:
- Let S be the time server and Ts be its time.
- Process P requests the time from S.
- After receiving the request from P, S prepares a response, appends the time Ts
from its own clock, and sends it back to P.
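
The exchange above can be sketched from the client's side. The steps listed stop at the server's reply; the sketch additionally assumes the commonly described final step in which P sets its clock to Ts plus half of the measured round-trip time, on the assumption that the request and reply take equally long. The names `cristian_estimate`, `request_server_time`, and `local_clock` are hypothetical.

```python
import time
from typing import Callable, Tuple


def cristian_estimate(
    request_server_time: Callable[[], float],
    local_clock: Callable[[], float] = time.monotonic,
) -> Tuple[float, float]:
    """Return (estimated current server time, measured round-trip time).

    request_server_time stands in for the RPC to the time server S and
    returns Ts; local_clock is P's own clock, used only to time the call.
    """
    t0 = local_clock()            # P sends the request
    ts = request_server_time()    # S replies with its time Ts
    t1 = local_clock()            # P receives the reply
    round_trip = t1 - t0
    # Assume the reply spent half of the round trip in transit.
    return ts + round_trip / 2.0, round_trip


# Usage with fake clocks: the request takes 0.5 s by P's clock and S reports 500.0.
ticks = iter([100.0, 100.5])
estimate, rtt = cristian_estimate(lambda: 500.0, lambda: next(ticks))
# estimate == 500.25, rtt == 0.5
```

The estimate is only as good as the symmetry assumption: the error is bounded by half the round-trip time, so clients often repeat the request and keep the reply with the shortest round trip.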
Berkeley Algorithm

• The Berkeley Algorithm synchronizes the clocks of a distributed system internally,
without requiring an external reference time source. One node acts as the
coordinator.
Operation:
- Coordinator Polling: A coordinator node periodically gathers time values
from the other nodes in the system.
- Clock Adjustment: The coordinator calculates the average time and
sends each node an adjustment; the nodes then adjust their local clocks
based on the received time difference.
- Handling Clock Drift: The algorithm accounts for clock drift by periodically
recalculating and adjusting the time offset.
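
The clock adjustment step can be sketched as below. This is a simplified illustration that assumes the coordinator already holds every node's reported time; it leaves out round-trip compensation and the rejection of outlier clocks. The function name `berkeley_adjustments` and the `"coordinator"` key are hypothetical.

```python
from typing import Dict


def berkeley_adjustments(
    coordinator_time: float, reported: Dict[str, float]
) -> Dict[str, float]:
    """Return the amount each clock (coordinator included) should be adjusted by."""
    # Offsets of every clock relative to the coordinator's own clock.
    offsets: Dict[str, float] = {"coordinator": 0.0}
    for node, node_time in reported.items():
        offsets[node] = node_time - coordinator_time
    average = sum(offsets.values()) / len(offsets)
    # Each node receives a relative adjustment, not an absolute time.
    return {node: average - offset for node, offset in offsets.items()}


# Times in minutes after midnight: coordinator 3:00, node A 3:25, node B 2:50.
adjustments = berkeley_adjustments(180.0, {"A": 205.0, "B": 170.0})
# adjustments == {"coordinator": 5.0, "A": -20.0, "B": 15.0}
```

After applying the adjustments all three clocks read 3:05. Sending a relative adjustment instead of an absolute time keeps the result from being skewed by the delay of the reply message.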

2. Logical clock synchronization

• If two systems do not interact with each other, there is no need to
synchronize them. What usually matters is that processes agree on the order in
which events occur rather than the time at which they occurred.

• Logical clocks do not need the exact time, so absolute time is not a constraint in
logical clocks.

• Logical clocks are concerned only with the order of events, such as the sending and
delivery of messages, and not with the times at which the events occurred.
Cont..
• The most common logical clock synchronization algorithm for distributed systems is
Lamport's algorithm. It is used in situations where the ordering matters, not the time.

• Lamport's logical clock (or timestamp) was proposed by Leslie Lamport in the 1970s and
has been widely used in distributed systems since then; almost all cloud computing
systems use some form of logical ordering of events.
Lamport’s algorithm
• Lamport defines the happens-before relation (->) between any pair of events with three rules:

1. If a and b are events in the same process, then a -> b if a occurs before b according to the local
clock.

2. If a process sends a message m to another process, then send(m) -> receive(m), where send(m)
and receive(m) are events in the first and second processes, respectively.

3. Happens-before is transitive, i.e. if a -> b and b -> c, then a -> c.

• If two events, a and b, happen in different processes that do not exchange messages, then
neither a -> b nor b -> a holds. These events are said to be concurrent (a || b).
Cont..

• The goal of Lamport’s logical clock is to assign timestamps to all events such that
these timestamps obey causality - if an event B is caused by an earlier event A,
then everyone must see A before seeing B. Formally, if an event A causally
happens before another event B, then timestamp(A) < timestamp(B). The
timestamp must always go forward and not backward.
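
The clock condition above can be written compactly. Here C(e) denotes the Lamport timestamp assigned to event e, a notation assumed for this sketch. The implication holds in one direction only:

```latex
a \rightarrow b \;\Longrightarrow\; C(a) < C(b),
\qquad \text{but} \qquad
C(a) < C(b) \;\not\Longrightarrow\; a \rightarrow b
```

Comparing two timestamps alone therefore cannot tell whether one event causally preceded the other or whether the two events are concurrent.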

Cont..

• Let's look at an example with 3 processes in the system under the following
conditions:

1. We assume each clock is a local integer counter (initial value 0), but the
increment of each clock is different.

2. A process increments its counter when an event happens or when it sends a
message. The counter value is assigned to the event as its timestamp, and a
message carries the timestamp of its send event.
• The messages m1 and m2 obey happens-before; however, messages m3
and m4 do not, so we need to correct the local clock. For example, if m3 is
sent at 50, then m3 should only be received at 51 or later. The algorithm
to update Lamport's counter is:
Cont..

1. Before executing an event, process A increments its counter, i.e.
timestamp(A) = timestamp(A) + increment.

2. When A sends a message to process B, it sends along timestamp(A).

3. Upon receiving the message, B adjusts its local clock, and the counter is
incremented by 1 before the message is considered received, i.e.
timestamp(B) = max(timestamp(A), timestamp(B)) + 1.
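
The three update rules can be sketched as a small class. This is a minimal illustration that follows the rules as stated above; the class and method names (`LamportClock`, `tick`, `send`, `receive`) are hypothetical, and message transport is left out.

```python
class LamportClock:
    """Logical clock for one process, following the three update rules."""

    def __init__(self, increment: int = 1) -> None:
        self.time = 0               # counter starts at 0
        self.increment = increment  # each process may use a different increment

    def tick(self) -> int:
        """Rule 1: advance the counter before executing a local event."""
        self.time += self.increment
        return self.time

    def send(self) -> int:
        """Rule 2: sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_timestamp: int) -> int:
        """Rule 3: jump past the sender's timestamp if the local clock is behind."""
        self.time = max(message_timestamp, self.time) + 1
        return self.time


# A message sent at 50 reaches a process whose clock reads 48.
receiver = LamportClock()
receiver.time = 48
received_at = receiver.receive(50)  # 51, so the receive is ordered after the send
```

Because `receive` always moves the counter past the sender's timestamp, every receive event gets a larger timestamp than the matching send event, and the counter never moves backward.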

Cont..
• On the other hand, if events happen in different processes that do not
exchange messages directly or indirectly, then nothing can be said about their
relation, and these events are said to be concurrent. Concurrent events are not
causally related, and their order is not guaranteed.

Thank you!
