chap2 ds
Synchronization in distributed systems is a critical aspect that ensures the consistent and coordinated
operation of multiple processes or nodes across a network. It involves various techniques to maintain
data consistency, manage concurrent access to shared resources, and achieve coherent system
behavior.
Key Challenges and Goals:
•Data Consistency: Ensuring that multiple copies of data remain consistent across different nodes.
•Concurrent Access: Managing simultaneous access to shared resources to prevent conflicts and data
corruption.
•Process Coordination: Coordinating the execution of processes to achieve a desired system
behavior.
•Fault Tolerance: Handling failures and ensuring that the system can recover and maintain consistency.
Types of Synchronization:
1.Time Synchronization:
•Ensures that all nodes in the system have a consistent view of time.
•Crucial for coordinating events, logging, and maintaining consistency in distributed applications.
•Algorithms:
•Network Time Protocol (NTP): Widely used for synchronizing clocks across the internet.
•Precision Time Protocol (PTP): Used for high-precision time synchronization in networks.
2.Data Synchronization:
•Maintains consistency between multiple copies of data across different nodes.
•Techniques:
•Replication: Creating multiple copies of data on different nodes.
•Distributed Locking: Using locks to control access to shared data.
•Optimistic Concurrency Control (OCC): Assuming that conflicts are rare and handling them when
they occur.
•Pessimistic Concurrency Control (PCC): Preventing conflicts by locking resources before
accessing them.
3.Process Synchronization:
•Coordinates the execution of processes to ensure they operate correctly without conflicts.
•Mechanisms:
•Mutual Exclusion: Ensuring that only one process can access a shared resource at a time.
•Semaphores: A synchronization primitive that can be used to control access to shared
resources.
•Monitors: A high-level synchronization construct that encapsulates shared data and the operations
that can be performed on it.
Common Synchronization Algorithms:
1.Lamport's Logical Clocks: A method for ordering events in a distributed system without relying on physical
clocks. Each event is assigned a timestamp, and the clocks are synchronized by passing these timestamps
between processes.
2.Vector Clocks: An extension of Lamport's clocks that provides partial ordering of events and can determine the
causal relationship between them.
3.Physical Clock Synchronization: Techniques like the Network Time Protocol (NTP) and Precision Time
Protocol (PTP) are used to synchronize physical clocks across a network.
4.Mutual Exclusion Algorithms: Algorithms like Ricart-Agrawala and Lamport's bakery algorithm ensure that
only one process can access a critical section at a time.
5.Token Ring Algorithm: A method where a token is passed around the network, and only the holder of the
token can perform certain actions, ensuring synchronized access to resources.
6.Election Algorithms: Algorithms like Bully and Ring algorithms are used to elect a coordinator or leader among
distributed processes.
These algorithms help maintain consistency and coordination in distributed systems, ensuring that all
components work together seamlessly.
Synchronization is crucial in distributed systems for several reasons:
1. Data Consistency:
•Preventing Data Corruption: Multiple nodes might access and modify the same data concurrently. Without synchronization, this
can lead to race conditions and data inconsistencies.
•Maintaining Data Integrity: Synchronization ensures that data remains consistent across different nodes, even in the presence
of updates and modifications.
2. Process Coordination:
•Ensuring Correct Execution: Synchronization mechanisms coordinate the execution of processes to prevent conflicts and
ensure that they operate correctly.
•Achieving Desired System Behavior: Synchronization allows processes to collaborate and achieve a specific system goal,
such as a distributed transaction or a distributed algorithm.
3. Fault Tolerance:
•Handling Failures: Synchronization mechanisms can help a distributed system recover from failures by ensuring that data
remains consistent and processes can be restarted correctly.
4. Resource Management:
•Fair Resource Allocation: Synchronization can be used to allocate shared resources fairly among different processes or
nodes.
•Preventing Deadlocks: Synchronization mechanisms can help prevent deadlocks, which occur when processes are waiting for
each other to release resources.
5. Scalability:
•Handling Increased Load: Synchronization techniques can help distribute the workload across multiple nodes, improving the
scalability of the system.
In Summary:
Synchronization is essential in distributed systems to ensure data consistency, process coordination, fault tolerance, resource
management, and scalability.
By effectively synchronizing the activities of different nodes, distributed systems can operate reliably and efficiently while maintaining data
integrity.
Synchronization in Centralized and Distributed Systems
In a distributed system, where multiple computers communicate and collaborate over a network, understanding the concepts
of clocks, events, and process states is crucial.
Clocks
In a distributed system, each computer has its own clock. However, these clocks are not perfectly synchronized. This leads to
several challenges:
Clock Drift vs. Clock Skew
In distributed systems, accurate timekeeping is crucial for various operations, such as timestamping events, coordinating
actions, and ensuring consistency. However, due to hardware imperfections and environmental factors, clocks in different
systems can diverge over time, leading to clock drift and skew.
Clock Drift: This refers to the gradual divergence of a clock's rate from a reference clock. It occurs due to variations in the
oscillator frequency of the clock's hardware. This means that one clock might run slightly faster or slower than another,
causing a gradual time difference.
Clock Skew: Clock skew is the difference in time between two clocks at a specific point in time. It can be caused by various
factors, including initial time differences, clock drift, and network delays.
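The relationship between drift (a rate) and skew (a difference at one instant) can be illustrated with a little arithmetic. The sketch below assumes constant drift rates expressed in parts per million (ppm); the function names and the example figures are illustrative, not taken from any particular system.

```python
def clock_reading(true_seconds, drift_ppm, initial_offset=0.0):
    """Reading of a clock that gains `drift_ppm` microseconds per true second."""
    return initial_offset + true_seconds * (1.0 + drift_ppm / 1_000_000)


def skew(true_seconds, drift_a_ppm, drift_b_ppm):
    """Difference between two clock readings taken at the same true instant."""
    return clock_reading(true_seconds, drift_a_ppm) - clock_reading(true_seconds, drift_b_ppm)


def max_resync_interval(max_skew_seconds, max_drift_ppm):
    """Longest time between synchronizations that keeps two clocks within
    `max_skew_seconds`, assuming each drifts by at most `max_drift_ppm`
    and that they may drift in opposite directions."""
    return max_skew_seconds / (2 * max_drift_ppm / 1_000_000)


# Two clocks drifting at +20 ppm and -20 ppm, compared after one day.
one_day_skew = skew(86_400, 20, -20)          # roughly 3.5 seconds
# How often to resynchronize to stay within 1 ms at 50 ppm worst-case drift.
interval = max_resync_interval(0.001, 50)     # roughly 10 seconds
```

Under these assumptions, skew grows linearly with time, which is why periodic resynchronization is needed even when clocks start out in agreement.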
Events
An event is a significant occurrence in a process's execution. It can be a message send, a message receive, or a change in
the process's state.
Clock drift
• Clock drift refers to the gradual divergence of a clock's time from a reference time standard. This
occurs due to variations in the clock's frequency and environmental factors.
• In simpler terms, it's like two watches that start at the same time, but one runs slightly
faster or slower than the other, causing them to gradually show different times.
• This phenomenon is particularly significant in distributed systems where multiple
computers need to coordinate their actions based on time. If their clocks are not
synchronized, it can lead to inconsistencies, errors, and even system failures.
To address clock drift, various synchronization techniques are employed, such as:
• Network Time Protocol (NTP): This protocol synchronizes clocks across a network by
periodically adjusting their time.
• Precision Time Protocol (PTP): This protocol provides more accurate time
synchronization, especially for real-time applications.
By using these techniques, distributed systems can maintain accurate time across all nodes,
ensuring smooth operation and preventing potential issues.
Impact of Clock Drift and Skew:
•Inaccurate Timestamps: Events may be incorrectly ordered or assigned incorrect timestamps.
•Coordination Problems: Distributed systems may experience difficulties in coordinating actions due to inconsistent time.
•Security Vulnerabilities: Cryptographic protocols and security mechanisms may be compromised if time is not synchronized.
Mitigation Techniques:
To mitigate the effects of clock drift and skew, various techniques are employed:
•Network Time Protocol (NTP): NTP is a widely used protocol for synchronizing clocks across a network. It uses a
hierarchical architecture to distribute time from reliable time sources.
•Precision Time Protocol (PTP): PTP is a more precise protocol that is often used in critical real-time systems. It typically
achieves higher accuracy than NTP, especially when the network hardware supports timestamping.
•Hardware-Based Time Synchronization: Some systems use hardware-based time synchronization mechanisms, such
as GPS receivers or atomic clocks, to achieve high accuracy.
By understanding and addressing clock drift and skew, we can ensure the reliable operation of distributed systems.
Lamport's Logical Clocks
• Lamport's logical clocks are a mechanism used in distributed systems to provide a partial ordering of events, even in the absence of a global clock. This is crucial for tasks like
synchronization, causal ordering, and conflict resolution.
• Partial Ordering:
1. If event A happens before event B on the same process, then the timestamp of A is less than the timestamp of B.
2. If event A (sending a message) happens before event B (receiving the message), then the timestamp of A is less than the timestamp of B.
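The two rules above can be sketched as a small class. This is a minimal illustration, assuming the usual update rules (increment before each local event; on receipt, take the maximum of the local clock and the message timestamp, then add one). The class and method names are hypothetical.

```python
class LamportClock:
    """Minimal Lamport logical clock for one process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock by one."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value is attached to the message."""
        return self.tick()

    def receive(self, msg_time):
        """Receiving: move past both the local clock and the message timestamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time


p1, p2 = LamportClock(), LamportClock()
a = p1.tick()        # local event on P1
b = p1.send()        # P1 sends a message
c = p2.receive(b)    # P2 receives it, so its timestamp exceeds b
```

Here `a < b` follows from rule 1 (same process) and `b < c` from rule 2 (send before receive).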
Lamport's Algorithm: A Distributed Mutual Exclusion Protocol
Lamport's algorithm is a distributed mutual exclusion algorithm that ensures only one process can access a shared resource at a time in a distributed system. It achieves
this by using logical clocks to assign timestamps to events and by using a request-reply mechanism to coordinate access to the critical section.
Here's how it works:
1.Logical Clocks:
•Each process maintains its own logical clock.
•Whenever a process generates an event (e.g., sending a message, entering the critical section), it increments its clock.
•Timestamps are assigned to events based on their causal order.
2.Requesting the Critical Section:
•A process that wants to enter the critical section generates a timestamped request message.
•It sends this request to all other processes.
3.Receiving Requests:
•When a process receives a request, it sets its local clock to one more than the maximum of its current value and the received timestamp.
•It places the request in its local queue, ordered by timestamp (ties are broken by process ID).
•It sends a timestamped reply to the requesting process.
4.Entering the Critical Section:
•A process can enter the critical section only if:
•Its own request is at the head of its local queue.
•It has received a message with a timestamp larger than that of its own request from every other process.
5.Releasing the Critical Section:
•After exiting the critical section, the process removes its request from its local queue.
•It sends a timestamped release message to all other processes, which remove that request from their own queues when they receive it.
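The request, reply, and release steps can be sketched as a single-threaded simulation. This is only an illustration: message delivery is modelled as an immediate, reliable method call (a real deployment would need FIFO channels and actual networking), replies are assumed to be timestamped, and every class, method, and variable name is hypothetical.

```python
import heapq


class Process:
    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.clock = 0
        self.queue = []        # min-heap of (timestamp, pid) requests
        self.last_seen = {}    # pid -> timestamp of the latest message from pid
        self.my_request = None

    def _tick(self, received=0):
        self.clock = max(self.clock, received) + 1
        return self.clock

    def request(self, peers):
        self.my_request = (self._tick(), self.pid)
        heapq.heappush(self.queue, self.my_request)
        for p in peers:
            p.on_request(self.my_request, self)

    def on_request(self, req, sender):
        self._tick(req[0])
        self.last_seen[sender.pid] = req[0]
        heapq.heappush(self.queue, req)
        sender.on_reply(self._tick(), self.pid)   # timestamped reply

    def on_reply(self, ts, pid):
        self._tick(ts)
        self.last_seen[pid] = ts

    def can_enter(self):
        # Own request must be at the head of the queue ...
        if self.my_request is None or self.queue[0] != self.my_request:
            return False
        # ... and every other process must have sent a later-stamped message.
        others = (p for p in range(self.n) if p != self.pid)
        return all(self.last_seen.get(p, 0) > self.my_request[0] for p in others)

    def release(self, peers):
        self.queue.remove(self.my_request)
        heapq.heapify(self.queue)
        self.my_request = None
        ts = self._tick()
        for p in peers:
            p.on_release(self.pid, ts)

    def on_release(self, pid, ts):
        self._tick(ts)
        self.last_seen[pid] = ts
        self.queue = [r for r in self.queue if r[1] != pid]
        heapq.heapify(self.queue)


p0, p1, p2 = (Process(i, 3) for i in range(3))
p0.request([p1, p2])
p1.request([p0, p2])
first = (p0.can_enter(), p1.can_enter())   # P0 asked first, so only P0 may enter
p0.release([p1, p2])
second = p1.can_enter()                    # after the release, P1 may enter
```

Storing requests as `(timestamp, pid)` tuples gives a total order for free: equal timestamps are broken by process ID.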
Applications:
•Distributed Synchronization:
•Ensuring that certain operations occur in a specific order, like committing transactions in a
distributed database.
•Conflict Resolution:
•Detecting and resolving conflicts in replicated data, such as in distributed file systems.
•Distributed Debugging:
•Analyzing the execution of distributed systems by ordering events and identifying causal
relationships.
Example:
Consider two processes, P1 and P2.
P1:
Event 1: Timestamp 1
Event 2: Timestamp 2
P2:
Event 3: Timestamp 3
Event 4: Timestamp 4
While the timestamps might suggest that Event 2 happened before Event 3, Lamport's clocks alone cannot
establish this if no message was exchanged between P1 and P2. They only guarantee that:
•Event 1 happened before Event 2.
•Event 3 happened before Event 4.
Logical Time and Logical Clocks
• Logical time is a concept used to order events in a distributed system, even without precise physical clocks. Logical clocks are mechanisms to assign timestamps to
events in a way that reflects their causal order.
Types of Logical Clocks:
• Lamport Clocks:
• Assigns a single integer timestamp to each event; timestamps can be made unique by using the process ID as a tie-breaker.
• Ensures a partial ordering of events.
• When a process sends a message, it includes its current timestamp.
• The receiver's clock is advanced to one more than the maximum of its current value and the received timestamp.
• Vector Clocks:
• Assigns a vector of timestamps to each event.
• Provides a more precise ordering of events, especially when dealing with concurrent events.
• Each process maintains a vector of logical clocks, one for each process in the system.
• When an event occurs, the process increments its own clock in the vector.
• When a message is sent, the entire vector is included in the message.
Global States
• A global state of a distributed system is a snapshot of the state of all its processes and channels at a specific point in time. Capturing a consistent global state is
challenging due to the lack of a global clock and asynchronous communication.
Techniques for Capturing Global States:
• Snapshot Algorithms:
• Chandy-Lamport Algorithm: A distributed algorithm in which processes exchange marker messages to record each process's local state together with the messages in transit on its incoming channels.
• Distributed Snapshot Algorithm: A more efficient algorithm that allows processes to record their states independently.
• Logical Clocks:
• Lamport clocks and vector clocks can be used to order events and infer potential global states.
• By understanding and applying these concepts, distributed systems can achieve accurate timekeeping, efficient coordination, and consistent global states, leading
to improved performance and reliability.
Process States
A process state represents the current state of a process at a particular point in time. It can be one of the following:
•Running: The process is currently executing instructions.
•Ready: The process is waiting to be executed.
•Blocked: The process is waiting for an event, such as I/O completion or a message arrival.
Ordering Events
To understand the causal relationships between events in a distributed system, we need to order them. Two common ordering
relations are:
•Happens-Before Relation (→):
•If a and b are events in the same process, and a occurs before b, then a → b.
•If a is the sending of a message, and b is the receiving of that message, then a → b.
•If a → b and b → c, then a → c (the relation is transitive).
•Concurrent Events: Two events a and b are concurrent if neither a → b nor b → a.
Logical Clocks
To assign timestamps to events in a distributed system, logical clocks are used. Two common types of logical clocks are:
•Lamport Clocks: Each process maintains a logical clock, which is incremented before each event. When a message is sent, the
sender's clock value is included in the message. The receiver's clock is then set to one more than the maximum of its current
value and the received timestamp.
•Vector Clocks: Each process maintains a vector of logical clocks, one for each process in the system. When an event occurs, the
process increments its own clock in the vector. When a message is sent, the entire vector is included in the message. The receiver
updates its vector by taking the maximum of each corresponding element and then increments its own entry.
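The vector clock rules, together with the happens-before and concurrency tests they make possible, can be sketched as follows. This is a minimal illustration for a fixed number of processes; the class and function names are hypothetical.

```python
class VectorClock:
    """Vector clock for process `pid` in a system of `n` processes."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        """Local event: increment this process's own entry."""
        self.v[self.pid] += 1
        return list(self.v)

    def send(self):
        """Sending is an event; the returned copy travels with the message."""
        return self.tick()

    def receive(self, other):
        """Merge element-wise with the received vector, then count the receive event."""
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        return self.tick()


def happened_before(a, b):
    """a -> b iff a <= b in every entry and the vectors differ."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a, b):
    """Neither a -> b nor b -> a."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
e1 = p0.tick()         # [1, 0]
e2 = p1.tick()         # [0, 1]
m = p0.send()          # [2, 0]
e3 = p1.receive(m)     # [2, 2]
```

Unlike Lamport timestamps, these vectors let us conclude that `e1` and `e2` are concurrent, while `e1` happened before `e3`.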
By understanding these concepts, we can analyze the behavior of distributed systems, debug problems, and design efficient
algorithms for various distributed applications.
Global States in Distributed Systems
A global state of a distributed system is a snapshot of the state of all its processes and channels at a specific
point in time. However, due to the lack of a global clock, capturing a consistent global state is challenging.
Challenges of Global States:
•Inconsistent Views: Different nodes may have different views of the system's state at a given time.
•Causality Violations: Events may appear to occur in a different order on different nodes.
Capturing Global States:
Several techniques are used to capture global states:
1.Snapshot Algorithms:
•Chandy-Lamport Algorithm: A distributed algorithm in which processes exchange marker messages to record each
process's local state together with the messages in transit on its incoming channels.
•Distributed Snapshot Algorithm: A more efficient algorithm that allows processes to record their states
independently.
2.Logical Clocks:
•Lamport clocks and vector clocks can be used to order events and infer potential global states.
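As one example of inferring global states from logical clocks, vector timestamps can be used to test whether a candidate global state (a "cut") is consistent, meaning that no process in the cut has seen an event that the cut does not include. The sketch below assumes each process contributes the vector timestamp of its last included event; the function name and the sample data are hypothetical.

```python
def is_consistent_cut(frontier):
    """frontier[i] is the vector timestamp of the last event of process i in the cut.

    The cut is consistent if no process j has observed more events of
    process i than process i itself has included in the cut.
    """
    n = len(frontier)
    return all(frontier[i][i] >= frontier[j][i]
               for i in range(n) for j in range(n))


# P0 and P1 have each done one local event and exchanged nothing: consistent.
ok = is_consistent_cut([[1, 0], [0, 1]])
# P1 has received a message sent at P0's 2nd event, but the cut stops P0 at event 1.
bad = is_consistent_cut([[1, 0], [2, 2]])
```

In the second case the cut contains a receive without the matching send, which is exactly the kind of causality violation a consistent snapshot must avoid.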
Clock Synchronization in Distributed Systems
Clock synchronization is the process of ensuring that all clocks in a distributed system are aligned to a
common time reference. This is crucial for various reasons, including:
•Coordinating Actions: Enables accurate coordination of events and actions across different nodes.
•Timestamping Events: Provides consistent timestamps for events, facilitating debugging and analysis.
•Distributed File Systems: Helps maintain consistency in distributed file systems.
•Real-time Systems: Ensures timely execution of tasks in real-time systems.
•Security Protocols: Plays a role in authentication and secure communication protocols.
Challenges of Clock Synchronization
•Clock Drift: Individual clocks tend to drift apart over time due to hardware variations and environmental
factors.
•Network Delays: Network latency can introduce inaccuracies in time measurements.
•Message Transmission Time: The time taken for messages to travel between nodes can vary.
Clock Synchronization: Centralized vs. Distributed Systems
Coordinated Universal Time, abbreviated as UTC (a compromise between the English and French word orders), is an
international standard for timekeeping. It is based on atomic time, but a so-called 'leap second' is occasionally inserted
(or, in principle, deleted) to keep it in step with astronomical time.
Clock Synchronization Algorithms
Several algorithms have been developed to address these challenges:
1.Network Time Protocol (NTP):
•Widely used for accurate time synchronization.
•Employs a hierarchical architecture with time servers at different levels.
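•Estimates clock offset and network delay from timestamped request/response exchanges, filters the samples statistically, and adjusts clocks accordingly.
2.Precision Time Protocol (PTP):
•Designed for precise time synchronization, especially in industrial and telecommunications networks.
•Measures path delay with timestamped message exchanges between a master clock and its slaves, often using hardware timestamping.
•Typically achieves higher accuracy than NTP, particularly on a local network with suitable hardware support.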
3.Cristian's Algorithm:
•A simple algorithm where a client requests the time from a time server.
•The client measures the round-trip time and sets its clock to the server's time plus half the round-trip time, assuming the network delay is the same in both directions.
•Less accurate than NTP and PTP, but can be used in simpler scenarios.
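Cristian's algorithm is simple enough to sketch in a few lines. The version below assumes symmetric network delay and takes the server lookup as a callable so that it can be exercised without a network; `request_server_time` and `local_clock` are hypothetical parameters, not part of any real library.

```python
import time


def cristian_sync(request_server_time, local_clock=time.time):
    """Estimate the current time from one request to a time server.

    Returns (estimated_now, offset, rtt), where `offset` is the amount
    the local clock would have to be adjusted by.
    """
    t0 = local_clock()                      # local time when the request is sent
    server_time = request_server_time()     # server's clock reading
    t1 = local_clock()                      # local time when the reply arrives
    rtt = t1 - t0
    estimated_now = server_time + rtt / 2   # assumes equal delay each way
    offset = estimated_now - t1
    return estimated_now, offset, rtt


# Simulated run: the local clock reads 100.0 and 100.2 around the request,
# and the server reports 205.0.
readings = iter([100.0, 100.2])
estimated_now, offset, rtt = cristian_sync(lambda: 205.0, lambda: next(readings))
```

The error of the estimate is bounded by half the round-trip time, so a client can discard samples whose round trip is unusually long.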
Key Considerations for Clock Synchronization:
•Accuracy: The desired level of accuracy depends on the specific application requirements.
•Reliability: The algorithm should be robust and fault-tolerant.
•Scalability: The algorithm should be able to handle large-scale distributed systems.
•Security: In security-sensitive applications, cryptographic techniques can be used to protect time synchronization.
By carefully selecting and implementing appropriate clock synchronization algorithms, distributed systems can achieve
the necessary level of time consistency, enabling reliable and efficient operation.
Global States in Distributed Systems
Why Global States Matter:
• Debugging: Understanding the system's state at a specific point in time can help identify and fix bugs.
• Performance Analysis: Analyzing global states can help identify performance bottlenecks.
• Fault Tolerance: Detecting and recovering from failures often requires knowledge of the system's global state.
[Figure: Process 1, Process 2, Process 3, …, Process n contending for a shared resource]
Coordination and Agreement
Distributed Mutual Exclusion
Ring-Based Algorithm
In the ring-based algorithm, processes are organized in a logical ring.
A token is passed among the processes, and only the process holding
the token can enter the critical section.
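The token-passing rule can be sketched as a short simulation. It assumes no failures and no lost token, and it models the ring as the process indices 0 to n-1; the function and parameter names are hypothetical.

```python
def token_ring(n, wants_cs, start=0, rounds=1):
    """Circulate a token around a ring of `n` processes.

    `wants_cs` is the set of processes waiting for the critical section.
    Returns the order in which they enter it.
    """
    pending = set(wants_cs)
    entered = []
    holder = start
    for _ in range(rounds * n):
        if holder in pending:
            entered.append(holder)     # only the token holder may enter
            pending.discard(holder)
        holder = (holder + 1) % n      # pass the token to the next neighbor
    return entered


order = token_ring(5, {3, 1, 4}, start=2)   # [3, 4, 1]
```

Access is granted in ring order rather than request order, and a process waits at most one full circulation of the token.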
Coordination and Agreement
Election
Ring-Based Election Algorithm
•Ring Topology: Processes are arranged in a logical ring, and each process knows its neighbors.
•Election Initiation: A process initiates an election by sending an election message with its ID
to its neighbor.
•Message Passing: Each process that receives an election message compares its own ID with the
received ID.
•If its ID is higher, it overwrites the message with its own ID and forwards it.
•If its ID is lower, it simply forwards the message.
•Coordinator Selection: When a process receives an election message carrying its own ID, the
message has travelled all the way around the ring, so that process has the highest ID and
becomes the coordinator. It then sends a coordinator message around the ring.
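The ring election can be sketched as a loop that follows the election message around the ring. The sketch assumes unique process IDs, no failures during the election, and a single initiator; the function name and the sample IDs are hypothetical.

```python
def ring_election(ids, initiator):
    """Run one election on a ring.

    `ids` lists the process IDs in ring order; `initiator` is the index
    of the process that starts the election.
    Returns (coordinator_id, election_messages_sent).
    """
    n = len(ids)
    candidate = ids[initiator]          # ID carried by the election message
    pos = (initiator + 1) % n           # first neighbor to receive it
    messages = 1
    # The election ends when a process sees its own ID come back.
    while ids[pos] != candidate:
        if ids[pos] > candidate:
            candidate = ids[pos]        # overwrite with the higher ID
        pos = (pos + 1) % n             # forward to the next neighbor
        messages += 1
    return candidate, messages


coordinator, sent = ring_election([3, 7, 2, 5], initiator=0)
```

With these sample IDs the process with ID 7 becomes coordinator; the final coordinator message that circulates afterwards is not counted in `sent`.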
Bully Algorithm
The following takes place when a process “P” sends a message to the coordinator:
1. If the coordinator does not respond within the specified time interval “t”, it is considered to have
failed.
2. Process “P” sends an election message to all processes with a higher priority number.
3. It waits for a response. If there is no response within time interval “t”, process “P” elects itself as coordinator. If a higher-priority process does respond, that process takes over the election.
4. The new coordinator sends a notification to all processes with lower priority numbers, announcing that it is now the coordinator.
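A compact way to see the outcome of these steps is to simulate them with a set of live processes. In this sketch, "no response within t" is modelled as "not in the `alive` set", and a higher-priority process that responds is assumed to take over the election, as in the usual formulation of the Bully algorithm. The function name and the example IDs are hypothetical.

```python
def bully_election(initiator, alive):
    """Return the ID of the coordinator elected when `initiator` starts an election.

    `alive` is the set of process IDs that currently respond to messages.
    """
    higher = [p for p in alive if p > initiator]
    if not higher:
        # Nobody with a higher priority answered: the initiator wins.
        return initiator
    # A higher-priority process answered and runs its own election.
    return bully_election(min(higher), alive)


# Processes 1..5; the old coordinator 5 has crashed and process 2 notices.
new_coordinator = bully_election(2, alive={1, 2, 3, 4})
```

Whichever live process starts the election, the live process with the highest priority number ends up as coordinator, which is where the algorithm gets its name.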
Coordination and Agreement
Multicast Communication