DC - Notes
1. Transparency: One of the primary goals of a distributed operating system is to make
multiple computers appear as a single system to users, ensuring transparency. This
means a collection of distinct machines connected by a communication subsystem
should function as a virtual uniprocessor. Achieving complete transparency is
challenging and requires support for multiple aspects of transparency. According to the
International Standards Organization's Reference Model for Open Distributed
Processing [ISO 1992], the eight forms of transparency are access transparency,
location transparency, replication transparency, failure transparency, migration
transparency, concurrency transparency, performance transparency, and scaling
transparency.
2. Reliability: Distributed systems are generally expected to be more reliable than
centralized systems due to the availability of multiple resource instances. However,
simply having multiple instances does not guarantee reliability. The distributed operating
system must be designed effectively to leverage this characteristic and enhance system
reliability.
A fault is a mechanical or algorithmic defect that can lead to an error, ultimately causing
system failure. System failures can be categorized into two types: fail-stop failures
[Schlichting and Schneider, 1983] and Byzantine failures [Lamport et al., 1982]. In a
fail-stop failure, the system ceases to function but does so in a detectable manner. In
contrast, a Byzantine failure allows the system to continue running while producing
incorrect results, often caused by undetected software bugs, making it more challenging
to handle than fail-stop failures.
To achieve higher reliability, the fault-handling mechanisms in a distributed operating
system must be designed to prevent faults, tolerate faults, and detect and recover from
faults. Various methods are commonly used to address these challenges.
3. Fault Avoidance: Fault avoidance focuses on designing system components to
minimize the occurrence of faults. This is often achieved through conservative design
practices, such as using high-reliability components, to enhance system reliability. While
a distributed operating system has little direct influence on the fault avoidance capability
of hardware components, its software components must be thoroughly tested to ensure
high reliability. Effective testing and robust design of the software components play a
crucial role in reducing faults and improving overall system stability.
4. Fault Tolerance: Fault tolerance enables a system to function despite partial failures.
Two key approaches in distributed operating systems are:
1. **Redundancy Techniques:** Replicating critical components (processes, files, or
storage) prevents single points of failure. A system is **k-fault tolerant** with **k+1
replicas** for fail-stop failures and **2k+1 replicas** for Byzantine failures (using majority
voting). However, redundancy increases system overhead, requiring a balance between
reliability and efficiency.
2. **Distributed Control:** Decentralizing services (file management, scheduling, name
resolution) avoids single points of failure. Independent servers ensure reliability,
preventing system-wide failures.
These strategies enhance resilience and ensure continued operation in distributed
systems.
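As a concrete illustration of the 2k+1 rule for Byzantine failures, here is a minimal Python sketch (not from the notes) of majority voting over replica results; the replica count and values are made up for the example.

```python
from collections import Counter

def majority_vote(replica_results):
    """Return the value reported by a majority of replicas, or None if no majority exists."""
    counts = Counter(replica_results)
    value, votes = counts.most_common(1)[0]
    # A value is trusted only if more than half of the replicas agree on it.
    return value if votes > len(replica_results) // 2 else None

# Example: k = 1 Byzantine failure tolerated with 2k + 1 = 3 replicas.
# One faulty replica returns a wrong answer; the majority still wins.
results = [42, 42, 99]          # third replica is faulty
print(majority_vote(results))   # -> 42
```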
5. Fault Detection and Recovery: To enhance reliability, distributed operating systems use
fault detection and recovery techniques:
● Atomic Transactions: Ensure that operations occur entirely or not at all, preventing
inconsistent data states after failures. Transactions simplify crash recovery by maintaining
data integrity.
● Stateless Servers: Unlike stateful servers, stateless servers do not retain client history,
making crash recovery simpler by eliminating complex state management.
● Acknowledgments & Timeouts: Lost messages due to failures are detected via
acknowledgment messages and retransmissions. Sequence numbers help avoid duplicate
messages.
These mechanisms improve reliability but introduce system overhead, requiring a balance
between efficiency and fault tolerance.
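One common way to get the all-or-nothing behavior of an atomic update at the file level is to write to a temporary file and then atomically rename it. The sketch below is illustrative only; the file name and JSON payload are assumptions, not part of the notes.

```python
import json
import os
import tempfile

def atomic_write(path, data):
    """Write `data` to `path` so that readers see either the old or the new
    contents, never a partially written file (all-or-nothing update)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the bytes reach the disk
        os.replace(tmp_path, path)   # atomic rename "commits" the update
    except Exception:
        os.unlink(tmp_path)          # abort: discard the partial update
        raise

atomic_write("account.json", {"balance": 100})   # hypothetical data file
```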
6. Flexibility: Flexibility in distributed operating systems is crucial for ease of modification
and enhancement. A flexible design allows seamless updates to fix bugs, adapt to
changing environments, and incorporate new functionalities. The choice of kernel model
significantly impacts flexibility—microkernel architecture offers higher modularity, making
it easier to modify and add services without system downtime. While it may introduce
slight performance overhead due to interprocess communication, its advantages in
maintainability, scalability, and customization outweigh this drawback. Modern distributed
OS designs prefer microkernels for their adaptability and user-centric configurability.
7. Performance: For a distributed system to be effective, its performance must match or
exceed that of a centralized system. Proper design of system components is essential to
avoid inefficiencies. Key principles for optimizing performance include:
Batch Processing – Sending data in large chunks and piggybacking acknowledgments improves
efficiency.
Caching – Storing frequently accessed data locally reduces network usage and speeds up
operations.
Minimizing Data Copying – Reducing unnecessary data transfers between memory and devices
decreases CPU overhead.
Reducing Network Traffic – Process migration and clustering minimize internode communication
costs.
Applying these strategies ensures better speed, lower latency, and improved overall system
performance.
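As a rough sketch of the caching principle, the following Python snippet memoizes a hypothetical remote read so repeated accesses are served locally; fetch_remote and its simulated latency are stand-ins, not real system calls.

```python
import functools
import time

@functools.lru_cache(maxsize=128)
def fetch_remote(block_id):
    """Stand-in for an expensive remote read (e.g., a file block over the network)."""
    time.sleep(0.1)            # simulated network latency
    return f"data-for-{block_id}"

fetch_remote(7)   # first access goes "over the network"
fetch_remote(7)   # repeated access is served from the local cache
```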
8. Scalability: Scalability is the ability of a system to handle increasing service loads,
making it a crucial consideration for distributed systems that are expected to grow over
time. A well-designed distributed operating system should accommodate this growth
without significant performance degradation or service disruptions. Below are key
principles for designing scalable systems:
1. Avoid Centralized Entities
Centralized components like a single file server or database can create bottlenecks as
the system scales. The failure of such entities can lead to system-wide failures, and
increased contention for resources can saturate network capacity. To avoid these issues,
distributed systems should employ techniques like resource replication and decentralized
control, ensuring that all nodes share an equal role in system operation.
Applying these principles ensures that distributed systems can effectively scale to meet
increasing demands while maintaining performance and resilience.
9. Heterogeneity: Heterogeneity in distributed systems arises from the use of different hardware, software,
communication protocols, and data formats across interconnected nodes. This diversity leads to
challenges in compatibility, requiring data translation between incompatible systems. The
complexity increases with the number of formats, making it difficult to manage and scale. Using
an intermediate standard data format for conversion reduces software complexity and improves
system interoperability.
10. Security: Enforcing security in distributed systems is more challenging than in centralized
systems due to the lack of a single control point and the use of insecure networks for
communication. Unlike centralized systems, where user authentication is straightforward, a
distributed system requires methods to authenticate both the client and server, ensuring that
messages are received by the intended recipient and are unaltered during transmission.
1. Ensuring the sender knows the message was received by the intended receiver.
2. Ensuring the receiver knows the message was sent by the genuine sender.
3. Guaranteeing that the contents of the message remain unchanged during transfer.
Types of Transparency:
1. Access Transparency: Access transparency ensures that users cannot distinguish between
local and remote resources in a distributed system. The system should provide a uniform
interface where system calls remain the same regardless of resource location. While complete
access transparency is challenging due to communication failures, global resource naming has
been successfully developed. Distributed shared memory also aids in access transparency but
has performance limitations for certain applications.
2. Location Transparency: Location transparency has two main aspects:
1. Name Transparency – Resource names should not reveal their physical location and
must remain unchanged even if resources move within the system. Names should be
unique systemwide.
2. User Mobility – Users should be able to access resources using the same name
regardless of which machine they log into, without requiring additional effort.
Both aspects rely on a global resource naming facility.
3. Replication Transparency: Replication Transparency ensures that users are unaware of the
existence and management of multiple copies of a resource in a distributed system. It involves:
1. Naming of Replicas – The system maps user-supplied resource names to appropriate
replicas without user intervention.
2. Replication Control – The system automatically handles decisions on the number,
placement, creation, and deletion of replicas for performance and reliability.
These tasks are managed entirely by the system to maintain seamless access.
4. Failure Transparency: Failure Transparency ensures that users remain unaware of partial
system failures, such as communication failures, machine crashes, or storage failures. A
distributed operating system with failure transparency continues functioning, possibly in a
degraded form.
For example, a failure-transparent file service can be implemented using multiple file servers
that cooperate to ensure uninterrupted access, even if some servers fail. However, designing
such systems requires balancing redundancy and overhead.
While complete failure transparency is impractical due to network failures and cost constraints,
partial failure handling improves system reliability.
5. Migration Transparency: Migration Transparency ensures that object movement (e.g., files
or processes) in a distributed system happens automatically without user awareness. Key
aspects include:
1. Automated Migration Decisions – The system determines which objects to move and
where.
2. Name Preservation – Objects retain their original names after migration.
3. Seamless Communication – Messages reach migrating processes without requiring
resending.
6. Concurrency Transparency: Concurrency transparency allows multiple users to share
resources concurrently without interference. It requires:
1. Event Ordering – Ensures consistent access sequencing for all users.
2. Mutual Exclusion – Prevents multiple processes from simultaneously accessing
resources that require exclusive use.
3. No Starvation – Guarantees that every process requesting a resource eventually gets
access.
4. No Deadlock – Prevents situations where processes block each other indefinitely.
7. Performance Transparency: Performance transparency allows the system to be reconfigured
automatically to improve performance as loads vary. It requires:
1. Load Balancing – Prevents some processors from being overloaded while others remain
idle.
2. Intelligent Resource Allocation – Efficiently distributes system resources among active
jobs.
3. Process Migration – Moves processes between nodes to maintain optimal workload
distribution.
8. Scaling Transparency: Scaling Transparency ensures that a distributed system can expand
without disrupting users. It requires an open-system architecture and the use of scalable
algorithms when designing system components.
These design principles enable smooth growth and adaptability in distributed systems.
Hardware Concepts:
Distributed systems consist of multiple CPUs, but their organization varies based on how they
interconnect and communicate. A fundamental classification method for multi-CPU systems is
Flynn’s taxonomy, which categorizes systems based on the number of instruction and data
streams.
Flynn’s Classification
1. SISD (Single Instruction Stream, Single Data Stream)
○ A single CPU executes one instruction stream on one data stream; traditional
uniprocessor computers.
2. SIMD (Single Instruction Stream, Multiple Data Streams)
○ Parallel processing with multiple data units executing the same instruction.
○ Common in supercomputers and applications requiring repetitive calculations,
such as vector processing.
3. MISD (Multiple Instruction Streams, Single Data Stream)
○ Multiple instruction streams operate on a single data stream; few practical
machines fit this category.
4. MIMD (Multiple Instruction Streams, Multiple Data Streams)
○ Comprises multiple independent computers, each with its own program and data.
○ All distributed systems belong to this category.
○ Each CPU has its own private memory, requiring message passing for
communication.
This classification helps in understanding how distributed systems operate and manage
communication among processors.
Bus-based Multiprocessor
Overall Summary
● Design Goal: Achieve high parallel performance by allowing multiple CPUs to operate
simultaneously on shared data, while minimizing the contention for a single bus.
● Key Trade-Off: A single bus is simpler but becomes a bottleneck as the number of
CPUs grows. Caches greatly reduce bus traffic but introduce the complexity of
maintaining coherence.
● Practical Approach: Equip each CPU with a private cache, implement a coherence
protocol to keep data consistent, and rely on the bus for misses and cache coherence
signals.
Bus-based multiprocessors with caches are one of the foundational designs for shared-memory
parallel systems. While straightforward to implement for a small number of CPUs, they require
careful design of cache-coherence mechanisms and bus arbitration to scale effectively.
By including caches, overall performance improves significantly (due to fewer bus transactions),
but ensuring coherence is key to correct program behavior. Different hardware protocols and
optimizations exist to handle coherence efficiently, especially as systems scale in CPU count
and complexity.
Switched Multiprocessors:
○ For n = 1024, a request must traverse 20 switching stages (10 outbound,
10 inbound).
○ If a CPU runs at 100 MIPS (10 ns per instruction), switching time must be 500
picoseconds (0.5 ns) per stage.
○ A large multiprocessor would require thousands of high-speed switches, making
it costly.
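The per-stage budget quoted above follows directly from the stage count and the instruction rate in the notes (a quick check):

$$ \text{stages} = 2\log_2 1024 = 20, \qquad \frac{10\ \text{ns per memory request}}{20\ \text{stages}} = 0.5\ \text{ns per stage}. $$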
5. NUMA (Non-Uniform Memory Access) Machines:
○ Introduces a hierarchical system where each CPU has its local memory.
○ Accessing local memory is fast, but accessing other CPUs’ memory is slower.
○ Requires careful software optimization to ensure most memory accesses are
local.
6. Overall Conclusion:
Message-Passing Model of IPC:
○ Information is physically copied from the sender’s address space to the receiver’s
address space.
○ Data is transmitted in the form of messages.
○ Conceptual model illustrated in Figure 3.1(b).
5. IPC in Distributed Systems:
Conclusion:
In distributed systems, message passing is the primary IPC mechanism because shared
memory is not available. A message-passing system simplifies communication by handling
network complexities and provides a foundation for advanced IPC methods like RPC and DSM.
Solutions:
1. Synchronization:
Blocking Semantics
A primitive is considered blocking if its invocation halts the execution of the process until a
certain condition is met.
● Blocking Send:
○ The sender process executes the send statement and gets blocked until it
receives an acknowledgment from the receiver confirming that the message has
been received.
○ If the receiver is not ready or fails, the sender remains blocked.
● Blocking Receive:
○ The receiver process executes the receive statement and gets blocked until a
message arrives.
○ The receiver cannot proceed until it successfully obtains the message.
Nonblocking Semantics
A primitive is considered nonblocking if its invocation does not stop execution. The process
continues running immediately after invoking the primitive.
● Nonblocking Send:
○ The sender places the message into a buffer and continues execution without
waiting for an acknowledgment.
○ It improves concurrency but requires additional mechanisms to ensure message
delivery.
● Nonblocking Receive:
○ The receiver executes the receive statement, but instead of waiting for a
message, it continues execution immediately.
○ The system must provide a way for the receiver to check if a message has
arrived.
1. Polling:
○ The receiver periodically checks the buffer using a test primitive to see if a
message is available.
○ If no message has arrived, it continues execution and checks again later.
○ This method can be inefficient due to frequent checks, which may waste CPU
resources.
2. Interrupts:
○ When a message arrives in the buffer, a software interrupt notifies the receiving
process.
○ This method allows the receiver to continue execution without unnecessary
polling.
○ It enables maximum parallelism but introduces complexity, as user-level
interrupts can be difficult to handle.
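To make the blocking/nonblocking distinction and the polling approach above concrete, here is a small Python sketch (an illustration, not the actual OS primitives) in which a standard queue stands in for the receiver's message buffer:

```python
import queue
import threading
import time

mailbox = queue.Queue()          # stands in for the receiver's message buffer

def sender():
    time.sleep(0.5)              # the message arrives a little later
    mailbox.put("hello")         # nonblocking send: enqueue and keep running

threading.Thread(target=sender).start()

# Nonblocking receive with polling: test the buffer, do other work, try again.
while True:
    try:
        msg = mailbox.get_nowait()   # "test" primitive: raises if nothing arrived
        break
    except queue.Empty:
        time.sleep(0.1)              # do other useful work, then poll again

print("received (polling):", msg)

# Blocking receive: the caller is suspended until a message is available.
threading.Thread(target=sender).start()
msg = mailbox.get()                  # blocks until the next message arrives
print("received (blocking):", msg)
```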
Synchronous Communication
● Occurs when both send and receive primitives use blocking semantics.
● Execution flow:
1. The sender process sends a message and waits for an acknowledgment
before proceeding.
2. The receiver executes the receive statement and remains blocked until the
message arrives.
3. The receiver then sends an acknowledgment message to the sender.
4. The sender resumes execution only after receiving the acknowledgment.
● Illustration of Synchronous Communication: (see figure).
● Thread-based Optimization:
○ In systems supporting multiple threads within a process (discussed in Chapter
8), blocking primitives can be used without significantly reducing concurrency.
○ One thread may be blocked on message communication while others continue
execution.
Asynchronous Communication
Conclusion
Synchronization in message-passing systems plays a vital role in determining system
performance, concurrency, and reliability.
2. Buffering:
In the message-passing model, a message sent by the sender must eventually be delivered to the
receiver. However, the receiver may not always be ready to accept the message at the time of
transmission. In such cases, the operating system must provide buffering mechanisms to
temporarily store messages until the receiver is ready.
● Null Buffering (No Buffering)
● Single-Message Buffering
● Unbounded-Capacity Buffering
● Finite-Bound Buffering (Multiple-Message Buffers)
3.5.1 Null Buffer (No Buffering)
In this strategy, no intermediate storage is used between the sender and receiver. Since there is
no place to temporarily store the message, one of the following methods is used:
● The message remains in the sender’s address space, and the sender is blocked until
the receiver executes a corresponding receive() operation.
● The sender process is suspended and restarts the send() operation once the receiver
is ready.
● The receiver executes receive(), sending an acknowledgment to the sender’s
kernel.
● Upon receiving the acknowledgment, the sender is unblocked and retries the send
operation.
Characteristics of Null Buffering:
✅ Minimal memory usage (no extra storage required).
✅ Ensures tight synchronization between sender and receiver.
❌ High synchronization overhead (sender and receiver must execute at the same time).
❌ Potential message loss (in case of timeout-based retransmissions).
Single-Message Buffering
How It Works:
● A single buffer is allocated at the receiver’s node.
● If the receiver is not ready, the message is temporarily stored in the buffer.
● The message remains ready for retrieval when the receiver executes receive().
● The buffer can be located in:
○ The kernel’s address space (managed by the OS).
○ The receiver’s address space (managed by the process).
Characteristics of Single-Message Buffering:
✅ Reduces synchronization constraints (sender and receiver don’t need to be active
simultaneously).
✅ Ensures message reliability (no immediate loss if the receiver is busy).
❌ Still limited to one message at a time (cannot handle high message traffic).
❌ Involves two copy operations (sender → buffer, buffer → receiver).
Unbounded-Capacity Buffering
How It Works:
● The system maintains a message queue of unlimited size for each receiver.
● Messages remain stored until explicitly retrieved by the receiver.
● The sender can send messages at any time, and they will be queued up for
processing.
Characteristics of Unbounded Buffering:
✅ Ideal for high-throughput systems (messages won’t be lost).
✅ Maximum flexibility (no blocking on sender or receiver).
❌ Impractical in real systems (memory is finite).
❌ Requires complex memory management (to prevent excessive storage usage).
Finite-Bound Buffering (Multiple-Message Buffers)
When the buffer reaches its limit, the system must decide what to do with new incoming
messages. Two strategies are commonly used:
1. Unsuccessful Communication (Message Discarded)
○ The new message is simply discarded when the buffer is full.
○ The sender receives an error notification and may choose to retry later.
❌ Reduces reliability, as messages may be lost.
2. Flow-Controlled Communication (Blocking Sender)
○ The sender blocks until the receiver processes some messages, creating space
in the buffer.
○ Synchronization is enforced, preventing message loss.
❌ May cause deadlocks if not handled properly.
Implementation Considerations
Characteristics of Finite-Bound Buffering:
✅ More efficient than unbounded buffering (prevents unlimited memory growth).
✅ More reliable than null/single-message buffering (can hold multiple messages).
❌ Risk of overflow (must manage buffer size carefully).
❌ Extra overhead in buffer management (memory allocation, message ordering, etc.).
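The two overflow policies described above can be sketched with Python's bounded queue; the buffer size and messages are arbitrary, and queue.Queue merely stands in for a kernel-managed buffer:

```python
import queue

buffer = queue.Queue(maxsize=2)      # finite-bound buffer with room for 2 messages

def send_discarding(msg):
    """Unsuccessful-communication policy: fail immediately when the buffer is full."""
    try:
        buffer.put_nowait(msg)
        return True
    except queue.Full:
        print("buffer full, message dropped:", msg)   # sender could retry later
        return False

def send_flow_controlled(msg):
    """Flow-controlled policy: block the sender until space frees up."""
    buffer.put(msg)                  # blocks while the buffer is full

send_discarding("m1")
send_discarding("m2")
send_discarding("m3")                # dropped: buffer already holds two messages
buffer.get()                         # receiver consumes one message
send_flow_controlled("m4")           # now succeeds without blocking
```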
Conclusion
Choosing the right buffering strategy depends on the application’s synchronization, reliability,
and memory requirements.
📌 Key Takeaways:
● Null buffering is the simplest but least flexible method.
● Single-message buffering improves performance but still has synchronization
limitations.
● Unbounded buffering is ideal in theory but impractical due to memory constraints.
● Finite-bound buffering is the most commonly used strategy in real-world systems.
🚀 By understanding these buffering mechanisms, developers can design efficient IPC models
tailored to their system requirements.
3. Multidatagram Messages and Maximum Transfer Unit (MTU)
A network’s maximum transfer unit (MTU) limits the size of a single packet, so messages fall
into two categories:
1. Single-Datagram Messages
○ If the message size is less than or equal to the MTU, it can be sent in a single
packet.
○ These messages do not require fragmentation and can be transmitted directly.
2. Multidatagram Messages
○ If the message size is greater than the MTU, it must be fragmented into multiple
packets.
○ Each fragment is sent separately and contains both control information and
message data.
○ The order of packets is important since the receiver must reassemble them
correctly.
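A minimal Python sketch of fragmentation and reassembly under an assumed MTU of 512 bytes (the value and the (id, sequence, total) header layout are illustrative, not prescribed by the notes):

```python
MTU = 512  # assumed maximum transfer unit in bytes (illustrative value)

def fragment(message: bytes, msg_id: int):
    """Split a large message into MTU-sized datagrams, each carrying control info."""
    chunks = [message[i:i + MTU] for i in range(0, len(message), MTU)]
    total = len(chunks)
    # Each fragment carries (message id, sequence number, total count) so the
    # receiver can reassemble the pieces in the correct order.
    return [(msg_id, seq, total, chunk) for seq, chunk in enumerate(chunks)]

def reassemble(fragments):
    """Reorder fragments by sequence number and rebuild the original message."""
    fragments = sorted(fragments, key=lambda f: f[1])
    assert len(fragments) == fragments[0][2], "some fragments are missing"
    return b"".join(chunk for _, _, _, chunk in fragments)

packets = fragment(b"x" * 1300, msg_id=1)                    # 1300 bytes -> 3 datagrams
assert reassemble(reversed(list(packets))) == b"x" * 1300    # order-independent
```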
📌 Conclusion:
Multidatagram messages play a vital role in enabling the transmission of large data across
networks while adhering to MTU constraints. Efficient fragmentation and reassembly are crucial
for maintaining data integrity and reliable communication in distributed systems. 🚀
● Encoding (Sender Side): Converts data into a stream format suitable for transmission.
● Decoding (Receiver Side): Converts the received data back into program objects.
● Ensures correct data representation across different architectures and systems.
4. Encoding Methods
● If the receiver gets badly encoded data (e.g., exceeds max length), it:
○ Fails to decode the message.
○ Returns an error message to the sender.
📌 Conclusion:
Encoding and decoding are critical for reliable message passing in distributed systems. The
choice between tagged and untagged representation depends on a trade-off between efficiency
and flexibility. 🚀
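As an illustration of a tagged representation, the sketch below sends each value together with a type tag and rejects badly encoded or oversized data; JSON is used purely for readability, and a real system would likely use a compact binary encoding.

```python
import json

def encode(obj):
    """Tagged encoding: every value is sent together with its type name."""
    return json.dumps({"type": type(obj).__name__, "value": obj}).encode("utf-8")

def decode(data, max_len=4096):
    """Decode a received byte stream, rejecting badly encoded or oversized data."""
    if len(data) > max_len:
        raise ValueError("message exceeds maximum allowed length")  # report error to sender
    try:
        tagged = json.loads(data.decode("utf-8"))
        return tagged["type"], tagged["value"]
    except (UnicodeDecodeError, json.JSONDecodeError, KeyError):
        raise ValueError("failed to decode message")                # report error to sender

wire = encode([1, 2, 3])
print(decode(wire))        # -> ('list', [1, 2, 3])
```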
5. Process Addressing
Machine-ID-Based Addressing (process address = machine_id@local_id):
❌ Cons:
● Process Migration is Impossible: If a process moves to another machine, its original
address becomes invalid.
Link-Based Addressing (the address embeds the creating node’s ID; forwarding links track moves):
✅ Pros:
● Supports Process Migration.
● Caches last known location to improve efficiency.
❌ Cons:
● Overhead: If a process moves frequently, finding it may take multiple hops.
● Failure Risk: If an intermediate machine crashes, the process may become
unreachable.
Location-Transparent Addressing (using a name server):
● Goal: Users should not need to know the physical location of a process.
● Solution: Do not embed the machine ID in process identifiers.
✅ Advantages:
● Supports process migration without modifying programs.
● Works for functional addressing (e.g., mapping a service name to multiple
processes).
❌ Disadvantages:
● Single point of failure (if the name server crashes).
● Scalability issues (high demand on the name server).
● Solution: Replicate the name server (but requires synchronization).
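A toy sketch of name-server-based, location-transparent addressing (the class and service names are invented for the example; a real name server would also be replicated and fault tolerant):

```python
class NameServer:
    """Maps location-independent process names to their current (machine, port)."""
    def __init__(self):
        self.registry = {}

    def register(self, name, machine, port):
        self.registry[name] = (machine, port)      # updated again after migration

    def lookup(self, name):
        return self.registry[name]                 # clients never embed machine IDs

ns = NameServer()
ns.register("printer-service", "node-3", 6000)
print(ns.lookup("printer-service"))              # -> ('node-3', 6000)
ns.register("printer-service", "node-7", 6000)   # process migrated; name unchanged
print(ns.lookup("printer-service"))              # -> ('node-7', 6000)
```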
Conclusion
🚀 Final Thought: Distributed systems should prioritize location transparency and scalability
for efficient process communication!
6. Failure Handling in Message-Based Communication
1. Loss of Request Message – The message may be lost due to a communication link
failure or because the receiver's node is down.
2. Loss of Response Message – The response may be lost due to network failure or if the
sender's node crashes.
3. Unsuccessful Request Execution – If the receiver's node crashes during request
processing, the request execution may be incomplete.
To address these issues, reliable IPC protocols use internal retransmissions and
acknowledgment messages to ensure message delivery. The sender's kernel retransmits a
message if no acknowledgment is received within a specified timeout period.
3. Three-Message IPC Protocol
○ The server starts a timer upon receiving the request. If it finishes before the timer
expires, the reply serves as an acknowledgment. Otherwise, a separate
acknowledgment is sent.
○ If no acknowledgment is received, the client retransmits the request.
○ The client acknowledges the reply to prevent unnecessary retransmissions.
4. Two-Message IPC Protocol (Used in Many Systems)
○ The client sends a request and waits for a reply.
○ The server processes the request and sends a response. If the client does not
receive the reply within the timeout, it retransmits the request.
○ This protocol follows at-least-once semantics, ensuring that the request is
executed at least once. However, it may cause duplicate executions, leading to
inconsistent results.
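The retransmission behavior of the two-message protocol, and why it yields at-least-once semantics, can be sketched as follows; the lossy channel is simulated rather than built on real sockets, and the request string is arbitrary:

```python
import random

def unreliable_send(request):
    """Stand-in for the network + server: the request is executed, but the reply
    is sometimes lost on the way back."""
    result = f"result-of-{request}"                     # server executes the request here
    return result if random.random() > 0.5 else None    # None models a lost reply

def call_at_least_once(request, max_retries=10):
    """Two-message protocol: retransmit the request until a reply arrives.
    The server may execute the same request more than once (at-least-once)."""
    for attempt in range(1, max_retries + 1):
        reply = unreliable_send(request)    # send request, wait for reply or timeout
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after repeated retransmissions")

print(call_at_least_once("debit(acct=42, amount=10)"))
```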
A client-server communication using the two-message IPC protocol may therefore experience lost
requests, lost replies, and duplicate execution of the same request when retransmissions occur.
Conclusion
Reliable IPC protocols mitigate failures through retransmissions and acknowledgments. The
choice of protocol depends on the application’s tolerance for duplicate executions and network
overhead considerations.
Remote Procedure Call (RPC) is an extension of the traditional procedure call mechanism that
allows a process to execute a procedure located in a different address space, possibly on
another computer. Unlike a regular procedure call, where the caller and the callee share
memory, in RPC, the caller (client process) and callee (server process) operate in separate
memory spaces and exchange information through message passing.
The typical steps in an RPC are:
1. Request:
○ The client process sends a request message to the server, containing the
procedure name and parameters.
○ The client then waits (blocks) for a response from the server.
2. Procedure Execution:
○ The server receives the request, extracts the parameters, and executes the
requested procedure.
○ Once execution is complete, the server sends the result back to the client in a
reply message.
3. Response Handling:
○ The client receives the reply message, extracts the result, and resumes
execution from the calling point.
Characteristics of RPC:
● Message Passing: Since the client and server do not share memory, they communicate
via request and response messages.
● Blocking by Default: The client usually waits (blocks) until it receives a response.
However, asynchronous RPC models allow the client to continue executing while waiting
for a response.
● Concurrency Models: The server can process requests sequentially or create threads
to handle multiple requests concurrently.
● Location Transparency: The client does not need to know the exact location of the
procedure—it only makes a request as if calling a local function.
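For a concrete feel of these characteristics, the sketch below uses Python's standard xmlrpc modules, which hide the request/reply messages behind what looks like a local call; the port number and the add procedure are arbitrary choices for the example.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    return a + b                      # the remote procedure executed by the server

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy("http://localhost:8000")
print(client.add(2, 3))               # looks like a local call; runs on the server
server.shutdown()
```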
A key design goal of RPC mechanisms is transparency, which means making remote
procedure calls behave as similarly as possible to local procedure calls. Transparency can be
categorized into two types:
1. Syntactic Transparency – The syntax of a remote procedure call should be identical to
that of a local procedure call. This ensures that programmers do not need to learn a
different way of writing function calls when using RPC.
2. Semantic Transparency – The behavior (semantics) of a remote procedure call should
match that of a local procedure call, ensuring that function execution, argument passing,
and return values work in the same way.
Despite efforts to make RPC resemble local procedure calls, several fundamental differences
make full transparency difficult:
1. Absence of Shared Memory
○ In local calls, functions can access variables and memory of the caller.
○ In RPC, the caller and callee operate in separate memory spaces (possibly on
different machines).
○ Passing pointers (e.g., linked lists, graphs) is problematic because memory
addresses are meaningful only within a single process. Workarounds like copying
values before sending them may change program behavior.
2. Increased Vulnerability to Failures
○ Local procedure calls execute within the same process, reducing failure risks.
○ RPCs involve multiple processes, a network, and potentially multiple computers,
making them prone to failures such as:
■ Network communication errors
■ Server crashes
■ Delayed responses due to congestion
3. Higher Latency
Due to these differences, some researchers argue that RPC should not be made fully transparent.
Conclusion
Multicast and broadcast communication play crucial roles in distributed systems, for example in
locating objects and services, supporting replicated services for fault tolerance, and propagating
updates to replicated data.
In multicast communication, receiver processes form groups. There are two main types of
groups:
1. Closed Groups – Only members of the group can send messages to the group.
2. Open Groups – Any process in the system can send messages to the group.
A flexible message-passing system should support both closed and open groups, depending
on application needs.
A simple approach to managing groups is through a centralized group server that handles group
creation, deletion, and membership changes. However, this approach has two major drawbacks:
● Poor reliability – If the central group server fails, the entire system is affected.
● Limited scalability – As the number of groups grows, the centralized server may
become a bottleneck.
To improve reliability, the group server can be replicated, but this introduces challenges in
maintaining consistency across all replicated servers.
In many-to-one communication, a single receiver accepts messages from multiple senders. The
receiver can be either of the following:
1. Selective Receiver – Specifies a unique sender and only exchanges messages with
that sender.
2. Nonselective Receiver – Accepts messages from any sender within a predefined set.
Since the receiver does not know which sender will send a message first, the communication is
nondeterministic. This is especially useful in scenarios where:
● The receiver waits for information from any available sender in a group.
● The receiver dynamically adjusts which senders it accepts messages from.
Such a system requires a way to handle nondeterministic behavior, often achieved using
guarded commands introduced by Dijkstra (1975). These allow conditions to dynamically
control message acceptance, ensuring synchronization between senders and the receiver.
Ordered Message Delivery
● Ensures that all messages are delivered to all receivers in a consistent order
acceptable to the application.
● Required for applications such as database replication, where receiving updates in
different orders can lead to data inconsistencies.
1. One-to-Many Communication:
○ Ensuring order is simple if the sender waits for confirmation before sending the
next message.
2. Many-to-One Communication:
○ Messages are delivered in the order they arrive at the receiver’s machine.
○ Ordering is handled naturally by the receiver.
3. Many-to-Many Communication:
● A message from one sender may arrive at a receiver before another sender’s
message, while the order may be reversed at a different receiver.
● Causes:
○ LAN contention: Multiple processes compete for network access, making
message order nondeterministic.
○ WAN routing differences: Messages take different, unpredictable paths to the
same destination.
Absolute ordering ensures that all messages are delivered to all receiver processes in the
exact order in which they were sent.
Absolute ordering requires globally synchronized clocks, which are difficult to implement.
However, many applications do not need absolute ordering. Instead, consistent ordering
ensures that all receiver processes receive messages in the same order, though this order
may differ from the original sending order.
Sequencer-Based Implementation
All messages are first sent to a special sequencer process, which assigns a global sequence
number to each message; receivers deliver messages in sequence-number order.
Limitation: The sequencer introduces a single point of failure and has poor reliability.
Distributed Implementation
1. Sender assigns a temporary sequence number to the message, ensuring it is larger
than any previously used number.
2. Each group member returns a proposed sequence number to the sender.
3. Sender selects the largest proposed sequence number as the final one and sends a
commit message to all members.
4. Each member assigns the final sequence number to the message upon receiving the
commit message.
5. Messages are delivered in the order of their final sequence numbers.
Causal Ordering (Vector-Based Implementation)
1. Vector Maintenance:
○ Each process maintains a vector with one component for each group member.
○ The i-th component tracks the last received message from the corresponding
sender.
2. Sending a Message:
○ The sender increments its own vector component and attaches the updated
vector to the message.
3. Message Delivery Conditions:
○ If the conditions fail, the message is buffered and rechecked when a new
message arrives.
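A sketch of the vector-based delivery check is given below. The delivery rule used here is the usual causal-ordering condition (an assumption, since the notes omit the exact conditions): a message from sender j carrying vector V is deliverable once V[j] equals the receiver's count for j plus one and V[k] is no greater than the receiver's count for every other k.

```python
class CausalReceiver:
    """Delivers multicast messages in causal order using per-member vectors."""
    def __init__(self, group_size):
        self.seen = [0] * group_size          # last delivered message from each member
        self.pending = []                     # buffered messages whose conditions fail

    def _deliverable(self, sender, vector):
        return (vector[sender] == self.seen[sender] + 1 and
                all(vector[k] <= self.seen[k]
                    for k in range(len(vector)) if k != sender))

    def receive(self, sender, vector, payload):
        self.pending.append((sender, vector, payload))
        delivered = []
        progress = True
        while progress:                        # recheck the buffer after each delivery
            progress = False
            for item in list(self.pending):
                s, v, p = item
                if self._deliverable(s, v):
                    self.seen[s] += 1
                    delivered.append(p)
                    self.pending.remove(item)
                    progress = True
        return delivered

r = CausalReceiver(group_size=2)
print(r.receive(1, [1, 1], "reply"))     # depends on message [1, 0]; buffered -> []
print(r.receive(0, [1, 0], "question"))  # delivers "question", then the buffered "reply"
```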
In distributed systems, certain processes need to take on special roles such as coordinator,
initiator, or sequencer. Election algorithms are used to select a coordinator among multiple
processes.
1. Goal of Election Algorithms
○ Election algorithms ensure that when an election starts, all processes agree on a
single coordinator.
○ The goal is to locate the process with the highest unique identifier (e.g., network
address) and designate it as the coordinator.
2. Assumptions in Election Algorithms
Bully Algorithm:
When a process, P, detects that the current coordinator is unresponsive, it initiates an election:
1. Election Start: P sends an ELECTION message to all processes with higher numbers.
2. Responses:
○ If no one responds, P wins the election and becomes the new coordinator.
○ If a higher-numbered process responds, it takes over the election, and P stops
participating.
3. Election Continuation:
○ If a previously failed process rejoins and has the highest number, it will initiate
an election and take over as the coordinator, "bullying" its way to the top.
Consider a system with eight processes (0 to 7), where process 7 was the coordinator but has
crashed.
1. Process 4 detects the failure and starts an election, sending messages to 5, 6, and 7.
2. Processes 5 and 6 respond, indicating they are alive, and take over the election.
3. Process 6 wins and announces itself as the new coordinator by sending a
COORDINATOR message to all processes.
If process 7 restarts, it will initiate an election and become the coordinator again since it has
the highest number.
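A simplified simulation of the Bully election (process liveness is modeled with a dictionary; a real implementation exchanges ELECTION, OK, and COORDINATOR messages with timeouts):

```python
def bully_election(initiator, alive):
    """Simplified Bully election: `alive` maps process numbers to up/down status.
    The initiator challenges all higher-numbered processes; the highest live
    process eventually wins and announces itself as coordinator."""
    responders = [p for p in alive if p > initiator and alive[p]]
    if not responders:
        return initiator                        # nobody higher answered: initiator wins
    # A responding process takes over and repeats the same procedure (simplified
    # here by letting the highest responder continue the election directly).
    return bully_election(max(responders), alive)

# Processes 0..7, coordinator 7 has crashed; process 4 detects it.
alive = {p: True for p in range(8)}
alive[7] = False
print("new coordinator:", bully_election(4, alive))   # -> 6
```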
The Ring Algorithm is an election algorithm used in distributed systems to select a new
coordinator when the existing one fails. It operates in a logically ordered ring of processes,
ensuring that every process knows its successor. Unlike token-based ring algorithms, it does not
require a special token for operation.
1. Election Initiation
○ A process that detects the coordinator has failed sends an ELECTION message,
containing its own process number, to its successor in the ring.
○ If the successor is down, the sender skips it and moves to the next active
process in the ring.
2. Message Propagation
○ Each process that receives the ELECTION message adds its own process
number to the list and forwards it further.
3. Completion of the Election
○ The ELECTION message circulates around the ring until it reaches the process
that started it.
○ The initiating process identifies the highest process number in the list as the
new coordinator.
4. Coordinator Announcement
○ The message type is changed to COORDINATOR and circulated around the ring
once more to inform every process of the new coordinator.
● In Fig. 3-13, two processes, 2 and 5, detect the failure of the previous coordinator
(process 7) simultaneously.
● Both independently initiate an election and circulate their messages.
● Since the election messages travel around the ring, both reach their starting points,
where they are converted into COORDINATOR messages.
● These COORDINATOR messages also circulate and inform all processes about the new
coordinator.
● Once both messages complete their rounds, they are removed, causing only minor
bandwidth overhead.
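A similarly simplified simulation of the Ring election (the ring order and the failed process are hard-coded for the example; a real implementation passes the ELECTION message from node to node):

```python
def ring_election(ring, initiator, alive):
    """Simplified Ring election: circulate an ELECTION list around the ring,
    skipping dead processes; the highest collected number becomes coordinator."""
    n = len(ring)
    i = ring.index(initiator)
    members = []
    while True:
        p = ring[i]
        if alive[p]:
            members.append(p)                  # each live process appends its number
        i = (i + 1) % n
        if ring[i] == initiator:               # message is back at the initiator
            break
    coordinator = max(members)
    # The initiator now circulates a COORDINATOR message announcing the winner.
    return coordinator, members

ring = [0, 1, 2, 3, 4, 5, 6, 7]                 # logical ring order
alive = {p: p != 7 for p in ring}               # previous coordinator 7 has crashed
print(ring_election(ring, initiator=2, alive=alive))   # -> (6, [2, 3, 4, 5, 6, 0, 1])
```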
Key Characteristics of the Ring Algorithm
● Multiple Elections: In the Bully algorithm, simultaneous elections cause conflicts but
eventually stabilize; in the Ring algorithm, there are no conflicts, the extra messages are
just redundant.
Conclusion
The Ring Algorithm is a structured approach to coordinator election, ideal for systems with
logically ordered processes. While it introduces extra message overhead, it ensures a fair
and distributed election process. However, in dynamic or large-scale systems, the Bully
Algorithm may be preferred due to its faster convergence.
Lamport’s Algorithm:
Ricart–Agrawala algorithm:
Maekawa’s Algorithm