Chapter 4
Contents
Performance, Replication, Virtualization, Scalability
Synchronous network model and leader election
Asynchronous shared memory model, fairness, and mutual
exclusion
Data-Centric Consistency Models
Multicore architectures and benchmarks
Models of Distributed Computing
In the context of distributed systems, the model of distributed
computing refers to the abstraction or framework that describes
how multiple independent computers or nodes collaborate and
interact to solve a particular problem.
These models help define the rules, communication mechanisms,
and assumptions under which the components of a distributed
system operate.
This chapter focuses on important models and concepts in
distributed computing, including Performance, Replication,
Virtualization, Scalability, and others.
Performance
Performance is critical to the overall effectiveness of the system. It
measures how well a distributed system meets its objectives, such as
computation speed, resource utilization, response time, and throughput.
Performance is often impacted by:
Network Latency: The delay in data transmission between nodes in the
system. This is a major concern in distributed systems, especially when
nodes are geographically dispersed.
Bandwidth: The rate at which data can be transferred across the network.
Limited bandwidth can reduce the system's performance, especially
when large amounts of data are exchanged.
Throughput: The amount of work completed by the system in a given
period. Higher throughput generally indicates better performance.
Performance
Fault Tolerance: The system's ability to continue functioning even in the
presence of failures. This can affect performance if redundancy or
recovery mechanisms slow down the system.
Load Balancing: The efficient distribution of tasks across multiple
nodes. Proper load balancing can improve performance by preventing
some nodes from being overloaded while others are underutilized.
Metrics to evaluate performance:
Response Time: Time taken for a node to respond to a request.
Execution Time: The total time required to execute a particular task.
Efficiency: How well the system utilizes its resources (CPU, memory,
etc.).
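To make these metrics concrete, here is a minimal Python sketch that times repeated calls to a stand-in request function and reports mean response time and throughput; the request function, repetition count, and names are illustrative assumptions, not part of the material above.

```python
import statistics
import time

def measure(request, n=100):
    """Time n calls to `request` and report mean response time and throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        request()                                   # stand-in for a remote call
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_response_time_s": statistics.mean(latencies),
        "throughput_req_per_s": n / elapsed,
    }

# Example: a fake request that takes about 1 ms.
print(measure(lambda: time.sleep(0.001)))
```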
Replication
Replication is the process of creating copies of data across multiple nodes or servers in a
distributed system. This strategy aims to ensure fault tolerance, data availability, and
load distribution.
Types of Replication:
Master-Slave Replication: One node is the master (primary), and others act as slaves
(secondary). All write operations are performed on the master, which then propagates
updates to the slave nodes.
It is often used for read-heavy applications with centralized write operations, such as in
transaction management systems in banks.
Multi-Master Replication: Multiple nodes can accept read and write requests, and the
updates are synchronized across all nodes.
It is useful in systems that require high availability and fault tolerance, such as
distributed ledger systems or global financial platforms.
Quorum-based Replication: A set of replicas is used to create a "quorum" that must
reach consensus for operations to proceed. It ensures consistency even when some
replicas are unavailable.
It is commonly used in systems requiring strong consistency and fault tolerance, like
Amazon DynamoDB or large-scale distributed payment systems.
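A common way to construct such quorums (standard practice, though not spelled out above) is to require a read quorum of R replicas and a write quorum of W replicas out of N total, with R + W > N and W > N/2; the first condition forces every read quorum to overlap every write quorum, and the second prevents two writes from being accepted concurrently on disjoint replica sets.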
Replication
Advantages:
Availability: Replication increases system availability. Even if some
replicas fail, the system can still operate using other replicas.
Fault Tolerance: Replicas can tolerate node failures without losing data,
ensuring the system continues to function.
Read Scalability: Replication can improve the system's ability to handle
read-heavy workloads by distributing the load across replicas.
Challenges:
Consistency: Maintaining consistency between replicas is difficult,
especially in systems that allow concurrent updates. Techniques such as
eventual consistency and strong consistency models are used to address
this.
Network Overhead: Replicating data increases network traffic, as
updates must be propagated to all replicas.
Virtualization
Virtualization refers to the creation of virtual instances (such as virtual
machines or containers) that abstract physical hardware resources.
This allows multiple virtual environments to run on the same physical
machine, providing isolation, resource allocation, and management.
Types of Virtualization:
Hardware Virtualization: The creation of virtual machines that run
their own operating systems.
Each virtual machine operates as an independent system on the physical
hardware.
Containerization: A lighter form of virtualization where containers share
the host operating system's kernel but remain isolated from each other.
Technologies like Docker and Kubernetes are widely used for
containerization in distributed systems.
Virtualization
Benefits of Virtualization:
Resource Efficiency: Virtualization enables efficient use of hardware resources,
as multiple virtual environments can share a single physical server.
Scalability and Flexibility: Virtual machines and containers can be easily
deployed, scaled, and managed, allowing the system to adapt to changing
workloads.
Fault Isolation: Faults in one virtual environment do not affect others,
improving reliability and fault tolerance.
Challenges:
Overhead: Virtualization introduces some overhead in terms of memory and
CPU usage, as the hypervisor (in the case of virtual machines) or container
engine (in the case of containers) manages multiple environments.
Security Risks: Virtualized environments can introduce security vulnerabilities
if not properly configured, as shared resources can be exploited by malicious
actors.
Scalability
Scalability refers to the ability of a distributed system to handle
increased load or expand its capacity without significant degradation in
performance.
There are two primary types of scalability in distributed systems:
Vertical Scalability (Scaling Up): Involves adding more resources (e.g.,
CPU, RAM) to a single node.
This approach has limitations due to physical hardware constraints.
Horizontal Scalability (Scaling Out): Involves adding more nodes to the
system to distribute the load.
Horizontal scaling is typically preferred in distributed systems as it
provides better fault tolerance and flexibility.
Scalability
Challenges in Scalability:
Load Balancing: Ensuring that the workload is evenly distributed across
multiple nodes or instances is essential for effective scalability.
Data Distribution: When scaling out, data must be distributed across
nodes efficiently. This is often achieved using partitioning or sharding.
Consistency: As the system grows, maintaining consistency (especially
in distributed databases) becomes more complex, and trade-offs between
consistency and availability may need to be made.
Synchronous Network Model
The synchronous network model refers to a type of network
communication where processes or nodes in the system are assumed to
operate in a synchronized manner, meaning that there is a known bound
on the communication delay and the execution time of operations.
This model is predicated on the idea that all processes in the system
operate with synchronized clocks and can send and receive messages in
a predictable, timely fashion.
Key Characteristics of the Synchronous Network Model:
Bounded Communication Delay: In a synchronous network, the time it
takes for a message to travel from one node to another is bounded.
That is, there is an upper limit on how long it will take for any message
to be delivered.
This predictable message delivery time allows processes to make
assumptions about the timing of operations and to coordinate effectively.
Synchronous Network Model
Synchronized Clocks: All processes in the network share a
synchronized global clock or time.
The clock is assumed to be accurate enough for processes to compare
time and make synchronized decisions based on it.
In practice, this synchronization is typically achieved using protocols
like the Network Time Protocol (NTP) or specialized algorithms that
synchronize clocks across the nodes.
Known Time Bound for Operations: Each operation in a synchronous
network is expected to finish within a known time frame. For example, a
process can be assured that it will not take more than a certain amount of
time for a message to be sent or for a computation to complete.
This predictability makes it easier to reason about system behaviour,
plan tasks, and synchronize actions.
Synchronous Network Model
Assumptions about the Environment: All nodes have a consistent
view of time, and there is no significant variance in message delivery
times or processing delays, so long as the system stays within the
predefined bounds. This eliminates uncertainties that could otherwise
affect coordination and consistency.
Advantages of the Synchronous Network Model:
Predictability and Simplicity
Easier Coordination and Synchronization
Simplified Fault Detection: within the known time bounds, a missing message or reply reliably indicates a failure.
Synchronous Network Model
Challenges and Limitations:
Clock Synchronization: In practice, maintaining perfectly synchronized clocks across
distributed nodes is challenging, especially in large-scale systems.
Rigid Timing Constraints
Scalability Issues: As the system grows in size, maintaining strict synchronization across
all nodes becomes increasingly difficult and resource-intensive.
Vulnerability to Network Delays
Asynchronous Shared Memory Model
In the asynchronous shared memory model, processes are allowed to run
independently without any assumption about the relative speeds of the
processes.
The only constraint is that each process can read from and write to the
shared memory at its own pace, and there’s no global synchronization or
clocks used.
The system may be non-blocking in terms of communication, meaning
that a process can initiate a communication (e.g., a read or write
operation) and continue executing without waiting for the other
process's response.
Fairness
Fairness in this context refers to ensuring that no process is indefinitely
postponed in accessing the shared memory.
For instance, in systems where multiple processes are competing for
access to shared resources (memory or critical sections), fairness
guarantees that every process will eventually get access to the resource,
and no process is starved (i.e., denied access indefinitely).
A typical goal in distributed systems is to ensure that every process has a
fair chance to execute its critical section or access the shared memory.
This is especially important in systems where several processes might be
contending for resources.
Mutual Exclusion (Mutex)
Mutual Exclusion (Mutex) is a fundamental concept in distributed
systems and concurrent computing, ensuring that only one process or
thread can access a critical section (a shared resource or memory) at a
time.
Without mutual exclusion, simultaneous access by multiple processes to
the same resource could lead to data corruption or inconsistent states,
thus violating the system's integrity.
In distributed systems, where processes do not share a common memory
space and may run on different physical machines, achieving mutual
exclusion becomes more complex.
The challenge arises due to the lack of a global clock and asynchronous
communication between distributed processes.
Mutual Exclusion in Centralized Systems
In centralized systems, mutual exclusion is achieved through centralized
coordination where a single entity, such as a central server or coordinator,
manages access to shared resources.
Below are some common mechanisms used to implement mutual exclusion in
centralized systems:
Test & Set: Test & Set is an atomic operation used for mutual exclusion. A
process tests the status of a shared variable (typically a lock), and if the variable
indicates that the resource is free, the process sets the variable to indicate that it
is now holding the resource.
The operation is atomic, ensuring that no two processes can simultaneously
modify the variable.
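A minimal Python sketch of a test-and-set spin lock follows; since real test-and-set is a single hardware instruction, its atomicity is simulated here with an internal lock, and all names are illustrative.

```python
import threading

class TestAndSetLock:
    """Spin lock built on a simulated atomic test-and-set."""
    def __init__(self):
        self._flag = False                 # False means the resource is free
        self._atomic = threading.Lock()    # stands in for hardware atomicity

    def _test_and_set(self):
        # Atomically return the old flag value and set the flag to True.
        with self._atomic:
            old, self._flag = self._flag, True
            return old

    def acquire(self):
        # Spin until the old value was False, i.e. the lock was free.
        while self._test_and_set():
            pass

    def release(self):
        with self._atomic:
            self._flag = False

# Usage: four threads increment a shared counter inside the critical section.
lock = TestAndSetLock()
counter = 0

def worker():
    global counter
    for _ in range(10_000):
        lock.acquire()
        counter += 1           # critical section
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                 # 40000: no updates were lost
```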
Semaphores: A semaphore is a synchronization primitive that controls access to
shared resources. It is essentially an integer variable used to signal availability.
Binary semaphores (or mutexes) are used for mutual exclusion by ensuring that
only one process can access the critical section at a time, while counting
semaphores manage access to a pool of resources.
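Both flavours are available in Python's standard library; a short sketch (the pool size of three is an arbitrary example):

```python
import threading

mutex = threading.Semaphore(1)   # binary semaphore: at most one holder
pool = threading.Semaphore(3)    # counting semaphore: a pool of 3 resources

def update_shared_data():
    with mutex:                  # P (wait) on entry, V (signal) on exit
        pass                     # critical section: only one thread at a time

def use_pooled_resource():
    with pool:                   # blocks once all 3 resources are in use
        pass                     # work with one resource from the pool
```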
Mutual Exclusion in Centralized Systems
Messages: Message-passing involves processes sending requests to a
central coordinator or server for permission to enter the critical section.
The central server grants access based on the process’s request, ensuring
that only one process at a time can enter the critical section.
This model can be implemented using explicit message passing.
Monitors: A monitor is an abstract data type used for mutual exclusion.
It encapsulates shared resources and provides procedures for accessing
these resources.
A monitor allows only one process to execute a procedure at a time,
ensuring that critical sections are executed without interference from
other processes.
Distributed Mutual Exclusion
Create an algorithm to allow a process to obtain exclusive access
to a resource
Algorithms
Centralized Algorithm
Token Ring Algorithm
Distributed Algorithm
Decentralized Algorithm
Centralized Algorithm
In this algorithm, one process is elected as coordinator.
A process that needs the resource sends a request to the coordinator.
If the resource is not being accessed by any other process, the coordinator allows the
requesting process to use it. If another process currently holds the resource:
The coordinator does not reply until the resource is released
It maintains a queue of pending requests
Requests are serviced in FIFO order
Benefits
Fair
All requests processed in order
Easy to implement, understand, verify
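A toy sketch of the coordinator's bookkeeping, with message passing abstracted into direct method calls and all names chosen for illustration:

```python
from collections import deque

class Coordinator:
    """Central coordinator granting exclusive access to one resource."""
    def __init__(self):
        self.holder = None        # process currently holding the resource
        self.queue = deque()      # pending requests, served in FIFO order

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANTED"
        self.queue.append(pid)    # no reply until the holder releases
        return "QUEUED"

    def release(self, pid):
        assert pid == self.holder
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder        # next process granted access, if any

# P1 gets the resource, P2 is queued, and is granted access when P1 releases.
c = Coordinator()
print(c.request("P1"))   # GRANTED
print(c.request("P2"))   # QUEUED
print(c.release("P1"))   # P2
```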
Distributed Algorithm
Ricart & Agrawala algorithm
Distributed algorithm using reliable multicast and logical clocks
Process wants to enter critical section:
– Compose message containing:
• Identifier (machine ID, process ID)
• Name of resource
• Timestamp (totally-ordered Lamport)
– Send request to all processes in group
– Wait until everyone gives permission
– Enter critical section / use resource
Distributed Algorithm
Ricart & Agrawala algorithm
When process receives request:
– If receiver not interested:
• Send OK to sender
– If receiver is in critical section
• Do not reply; add request to queue
– If receiver just sent a request as well:
• Compare timestamps: received & sent messages
• Earliest wins
• If receiver is loser, send OK
• If receiver is winner, do not reply, queue
When done with critical section
– Send OK to all queued requests
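A compact sketch of the per-process logic is given below; it assumes the surrounding system supplies reliable delivery and a send(dest, msg) callable, and all names are illustrative.

```python
class RAProcess:
    """Ricart & Agrawala request handling for a single process (sketch)."""
    def __init__(self, pid, peers, send):
        self.pid = pid
        self.peers = peers            # identifiers of all other processes
        self.send = send              # send(dest, msg), provided by the runtime
        self.clock = 0                # Lamport clock
        self.state = "RELEASED"       # RELEASED, WANTED, or HELD
        self.my_request = None        # (timestamp, pid): totally ordered
        self.deferred = []            # requests to answer after leaving the CS
        self.ok_count = 0

    def request_cs(self):
        self.clock += 1
        self.state = "WANTED"
        self.my_request = (self.clock, self.pid)
        self.ok_count = 0
        for p in self.peers:
            self.send(p, ("REQUEST", self.my_request))

    def on_request(self, sender, ts):
        self.clock = max(self.clock, ts[0]) + 1
        if self.state == "HELD" or (self.state == "WANTED" and self.my_request < ts):
            self.deferred.append(sender)      # we win: defer the reply
        else:
            self.send(sender, ("OK",))        # not interested, or we lose

    def on_ok(self):
        self.ok_count += 1
        if self.ok_count == len(self.peers):
            self.state = "HELD"               # everyone gave permission

    def release_cs(self):
        self.state = "RELEASED"
        for p in self.deferred:               # answer all queued requests
            self.send(p, ("OK",))
        self.deferred.clear()
```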
Distributed Algorithm
Lamport’s Mutual Exclusion
Request Queue: Each process maintains a queue of mutual exclusion
requests.
Requesting Critical Section: Process 𝑃𝑖 sends a request (i, Ti) to all
nodes, where 𝑇𝑖 is a timestamp.
It also places the request in its own queue.
Entering Critical Section: Process 𝑃𝑖 can enter the critical section when it
receives acknowledgments (with larger timestamps) from all other
processes and its request has the earliest timestamp in its queue.
Distributed Algorithm
Lamport’s Mutual Exclusion
Difference from Ricart-Agrawala: Unlike Ricart-Agrawala, all processes
respond to requests immediately (no hold-back). A process can decide to
enter the critical section based solely on whether its request has the
earliest timestamp.
Releasing Critical Section: Process 𝑃𝑖 removes its request from its
queue and sends a release message with a timestamp.
When a process receives a release message, it removes the request from
its queue. This may allow another process to access the critical section if
its request now has the earliest timestamp.
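As a rough sketch of the bookkeeping this implies for one process (transport, peers, and all names are assumed to be supplied by the surrounding system):

```python
import heapq

class LamportProcess:
    """Lamport mutual-exclusion bookkeeping for a single process (sketch)."""
    def __init__(self, pid, peers, send):
        self.pid, self.peers, self.send = pid, peers, send
        self.clock = 0
        self.queue = []                           # min-heap of (timestamp, pid)
        self.last_seen = {p: 0 for p in peers}    # latest timestamp from each peer

    def request_cs(self):
        self.clock += 1
        heapq.heappush(self.queue, (self.clock, self.pid))
        for p in self.peers:
            self.send(p, ("REQUEST", (self.clock, self.pid)))

    def on_message(self, sender, kind, ts):
        self.clock = max(self.clock, ts[0]) + 1
        self.last_seen[sender] = ts[0]
        if kind == "REQUEST":                     # reply immediately (no hold-back)
            heapq.heappush(self.queue, ts)
            self.send(sender, ("ACK", (self.clock, self.pid)))
        elif kind == "RELEASE":                   # drop the sender's request
            self.queue = [q for q in self.queue if q[1] != sender]
            heapq.heapify(self.queue)

    def may_enter(self):
        # Enter when our request heads the queue and every peer has been heard
        # from with a timestamp larger than our request's timestamp.
        return (self.queue and self.queue[0][1] == self.pid and
                all(t > self.queue[0][0] for t in self.last_seen.values()))

    def release_cs(self):
        self.queue = [q for q in self.queue if q[1] != self.pid]
        heapq.heapify(self.queue)
        self.clock += 1
        for p in self.peers:
            self.send(p, ("RELEASE", (self.clock, self.pid)))
```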
Decentralized Algorithm
In a decentralized algorithm, each replica of a resource has its own
coordinator, which is the node responsible for managing access to
that replica.
To use a resource, a process must obtain permission from the
coordinator of the corresponding replica.
This approach ensures exclusive access to the resource, as a
coordinator grants permission to only one process at a time,
preventing simultaneous access and maintaining mutual exclusion.
Election Algorithms (Leader Selection)
Many Distributed Systems require a process to act as coordinator (for various
reasons). The selection of this process can be performed automatically by an
“election algorithm”.
For simplicity, we assume the following:
Processes each have a unique, positive identifier.
All processes know all other process identifiers.
The process with the highest valued identifier is duly elected coordinator.
When an election “concludes”, a coordinator has been chosen and is known to
all processes.
The overriding goal of all election algorithms is to have all the processes in a
group agree on a coordinator.
There are two types of election algorithm:
1. Bully: “the biggest guy in town wins”.
2. Ring: a logical, cyclic grouping.
Bully Algorithm
P sends an ELECTION message to all processes with higher numbers.
If no one responds, P wins the election and becomes coordinator.
If one of the higher-ups answers, it takes over. P’s job is done.
Bully Algorithm
The bully election algorithm:
Process 4 holds an election.
5 and 6 respond, telling 4 to stop.
Now 5 and 6 each hold an election.
Process 6 tells 5 to stop.
Process 6 wins and tells everyone.
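A rough sketch of this rule for a single process, where alive() and send() stand in for timeout-based failure detection and messaging (all names are assumptions for illustration):

```python
class BullyProcess:
    """One process's view of the bully election (simplified sketch)."""
    def __init__(self, pid, all_pids, send, alive):
        self.pid = pid
        self.all_pids = sorted(all_pids)
        self.send = send              # send(dest, msg)
        self.alive = alive            # alive(pid) -> bool, e.g. via timeouts
        self.coordinator = None

    def start_election(self):
        higher = [p for p in self.all_pids if p > self.pid]
        responders = [p for p in higher if self.alive(p)]
        if not responders:
            # No higher-numbered process answered: this process wins.
            self.coordinator = self.pid
            for p in self.all_pids:
                if p != self.pid:
                    self.send(p, ("COORDINATOR", self.pid))
        else:
            # A live higher-numbered process takes over and holds its own election.
            for p in responders:
                self.send(p, ("ELECTION", self.pid))
```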
The “Ring” Election Algorithm
In the Ring Election Algorithm, processes are logically arranged in a ring
topology, where each process knows the identifier of its immediate successor
(and optionally, all other processes in the ring).
When a process detects that the coordinator (the leader or central process) has
failed or is down, it initiates an ELECTION message. This message contains
the process's own identifier and begins circulating around the ring.
Each process that receives the election message compares its own identifier
with the one in the message. If the process has a higher identifier, it adds its
own identifier to the message, thereby putting itself forward as a candidate for
election.
The message continues circulating around the ring until it returns to the original
process that initiated the election. At this point, the process identifies the
highest-numbered candidate in the message as the new coordinator.
The “Ring” Election Algorithm
Finally, the original process sends a COORDINATOR message to all other
processes in the ring, informing them of the newly elected coordinator.
Once the coordinator is elected and the COORDINATOR message is received
by all processes, the election process is complete, and the system can resume
normal operation with the new coordinator in place.
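A small sketch of one full circulation of the ELECTION message, with alive() standing in for failure detection and the ring given as a list of process ids (all names are illustrative):

```python
def ring_election(ring, initiator, alive):
    """Circulate an ELECTION message once around the ring and pick the winner."""
    candidates = [initiator]
    n = len(ring)
    i = (ring.index(initiator) + 1) % n
    while ring[i] != initiator:          # until the message returns to the initiator
        if alive(ring[i]):
            candidates.append(ring[i])   # live process adds itself as a candidate
        i = (i + 1) % n
    coordinator = max(candidates)        # highest identifier wins
    # The initiator would now send COORDINATOR(coordinator) around the ring.
    return coordinator

# Example: process 3 starts an election after the old coordinator 7 fails.
print(ring_election([2, 3, 4, 5, 6, 7], initiator=3, alive=lambda p: p != 7))  # 6
```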
Introduction to Consistency Model
In distributed systems, where data is stored across multiple nodes or locations,
consistency refers to the correctness and coherence of the data shared between these
nodes.
Consistency models define the rules and guarantees about how updates to shared data
are reflected across different nodes in a distributed system.
An important issue in distributed systems is the replication of data. Data are generally
replicated to enhance reliability or improve performance.
One of the major problems is keeping replicas consistent. Informally, this means that
when one copy is updated we need to ensure that the other copies are updated as well;
otherwise the replicas will no longer be the same.
If objects (or data) are shared, we need to do something about concurrent accesses to
guarantee state consistency.
To keep replicas consistent, we generally need to ensure that all conflicting operations
are done in the same order everywhere
Conflicting operations: From the world of transactions:
Read–write conflict: a read operation and a write operation act concurrently
Write–write conflicts: two concurrent write operations
Introduction to Consistency Model
Consistency Model
A consistency model is essentially a contract between processes and the
data store (distributed collection of storages accessible to clients). It says
that if processes agree to obey certain rules, the store promises to work
correctly.
Data Centric Consistency Models
Data-centric consistency models define how updates to shared data are viewed
across different processes or nodes.
These models are classified into strong consistency and weak consistency
models, based on how synchronization is handled between processes accessing
shared data.
1. Strong Consistency Models: In strong consistency models, operations on
shared data are synchronized, meaning the system guarantees a certain level of
consistency across processes.
These models generally ensure that all processes observe data in a consistent
manner, even if this requires heavy synchronization. Types of strong
consistency models are:
Strict consistency (related to absolute global time)
Linearizability (atomicity)
Sequential consistency (what we are used to - serializability)
Causal consistency (maintains only causal relations)
FIFO consistency (maintains only individual ordering)
Data Centric Consistency Models
2. Weak consistency models: Weak consistency models reduce synchronization,
offering more flexibility and better performance at the cost of weaker guarantees.
These models allow data synchronization only when necessary, typically when
shared data is locked or unlocked.
Types of Weak consistency models are:
General weak consistency
Release consistency
Entry consistency
Observation: The weaker the consistency model, the easier it is to build a
scalable solution.
Strong Consistency Models
Strict Consistency
Any read to a shared data item X returns the value stored by the most recent
write operation on X.
Observation: It doesn’t make sense to talk about “the most recent” in a
distributed environment.
Assume all data items have been initialized to NIL
W(x)a: value a is written to x
R(x)a: reading x returns the value a
The behavior shown in Figure (a) is correct for strict consistency
The behavior shown in Figure (b) is incorrect for strict consistency
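In the W/R notation just defined, the two behaviours are typically of the following form (an illustrative reconstruction, since the figures themselves are not included here):
(a) Strictly consistent – P2 reads x after P1's write and sees the new value:
P1: W(x)a
P2:           R(x)a
(b) Not strictly consistent – P2's first read occurs after W(x)a in absolute time, yet still returns the initial value NIL:
P1: W(x)a
P2:           R(x)NIL   R(x)a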
Strong Consistency Models
Sequential Consistency
Sequential consistency is a slightly weaker consistency model than strict
consistency. A data store is said to be sequentially consistent when it satisfies
the following condition:
The result of any execution is the same as if the (read and write) operations by
all processes on the data store were executed in some sequential order, and the
operations of each individual process appear in this sequence in the order
specified by its program.
Figure (a) – a sequentially consistent data store
P1 first performs W(x)a to x. Later in absolute time, P2 also performs
W(x)b to x
Both P3 & P4 first read value b and later value a. The write operation of
P2 appears to have taken place before that of P1 to both P3 & P4
Strong Consistency Models
Sequential Consistency
Figure (b) – a data store that is not sequentially consistent
Not all processes see the same interleaving of write operations
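Written out in the same notation, the two executions look roughly like this (an illustrative reconstruction of the figures):
(a) Sequentially consistent – all processes see the writes in the same order (b before a):
P1: W(x)a
P2:          W(x)b
P3:                    R(x)b   R(x)a
P4:                    R(x)b   R(x)a
(b) Not sequentially consistent – P3 and P4 see the two writes in different orders:
P1: W(x)a
P2:          W(x)b
P3:                    R(x)b   R(x)a
P4:                    R(x)a   R(x)b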
Strong Consistency Models
Linearizability
A consistency model that is weaker than strict consistency, but stronger than
sequential consistency, is linearizability.
A data store is said to be linearizable when each operation is timestamped and
the following condition holds:
The result of any execution is the same as if the (read and write) operations by
all processes on the data store were executed in some sequential order, and the
operations of each individual process appear in this sequence in the order
specified by its program. In addition, if ts(OP1(x)) < ts(OP2(y)), then operation
OP1(x) precedes OP2(y) in this sequence.
Strong Consistency Models
Causal Consistency
The causal consistency model is a weaker model than sequential consistency.
It makes a distinction between events that are potentially causally related and those that
are not.
If event B is caused or influenced by an earlier event, A, causality requires that
everyone else first see A, then see B.
Figure (a) – a data store that is not causally consistent
Two writes, W(x)a and W(x)b, are causally related since b may be a result of a
computation involving R(x)a
Figure (b) – a data store that is causally consistent
Strong Consistency Models
FIFO Consistency
FIFO consistency is weaker than causal consistency
A data store is said to be FIFO consistent when it satisfies the following
condition:
Writes done by a single process are received by all other processes in the order
in which they were issued, but writes from different processes may be seen in a
different order by different processes.
Weak Consistency Models
General Weak Consistency
Although FIFO consistency can give better performance than the stronger
consistency models, it is still unnecessarily restrictive for many applications,
because it requires that writes originating in a single process be seen
everywhere in order.
Not all applications require seeing all writes, or seeing them in order.
Solution: Use a synchronization variable. Synchronize(S) synchronizes all local
copies of the data store
Using synchronization variables to partly define consistency is called weak
consistency - has three properties:
Accesses to synchronization variables are sequentially consistent.
No access to a synchronization variable is allowed to be performed until all
previous writes have completed everywhere.
No data access is allowed to be performed until all previous accesses to
synchronization variables have been performed.
Weak Consistency Models
Figure (a) – a data store that is weakly consistent (i.e., a valid sequence)
P1 performs W(x)a and W(x)b and then synchronizes. P2 and P3 have not
yet been synchronized, thus no guarantees are given about what they see
Figure (b) – a data store that is not weakly consistent
Since P2 has synchronized, R(x) in P2 must read b
Weak Consistency Models
Release Consistency
General weak consistency has the problem that when a synchronization
variable is accessed, the data store does not know whether this is being done
because the process is either
Finished writing the shared data, or
About to start reading data
Consequently, the data store must take the actions required in both cases
Make sure that all locally initiated writes have been completed (i.e., propagated
to other copies)
Gathering in all writes from other copies
If the data store could tell the difference between entering a critical region or
leaving one, a more efficient implementation might be possible.
Weak Consistency Models
Release Consistency
Idea: Divide access to a synchronization variable into two parts: an acquire and
a release phase.
About to start accessing data - Acquire forces a requester to wait until the
shared data can be accessed
Finished accessing the shared data - Release sends requester’s local value
to other servers in data store.
Since P3 does not do an acquire before reading x, the data store has no
obligation to give it the current value of x, so returning a is ok.
Weak Consistency Models
Entry Consistency
With release consistency, all local updates are propagated to other copies/servers during
release of shared data.
With entry consistency, each shared data item is associated with a synchronization
variable.
In order to access consistent data, each synchronization variable must be explicitly
acquired.
Release consistency affects all shared data but entry consistency affects only those
shared data associated with a synchronization variable.
Since P2 did not do an acquire before reading y, P2 may not read the latest value of y.
The explicit acquire can be hidden from the programmer by having the distributed system
use and handle distributed shared objects (i.e., the system does the acquire on the object's
associated synchronization variable when a client accesses a shared distributed object).
Multicore Architectures in Distributed Systems
Multicore architectures are increasingly used in distributed systems to improve
performance, especially for parallel computing tasks.
Multi-core architectures can have an especially high number of cores (tens,
hundreds, or even more) and are widely used in distributed systems.
A multi-core computer is composed of two or more independent cores
Core (CPU): computing unit that reads/executes program instructions
Ex) dual-core, quad-core, hexa-core, octa-core, …
In these systems, multiple cores can work on different parts of a large
computation (like machine learning, data processing, or scientific simulations),
significantly reducing the overall execution time.
The challenge with multicore systems is to effectively utilize the resources
without introducing issues like race conditions, deadlocks, or performance
bottlenecks caused by contention for shared memory.
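As a small illustration of spreading one computation across cores without sharing memory between workers (a Python sketch; the workload and chunking are arbitrary):

```python
from multiprocessing import Pool
import os

def partial_sum(bounds):
    # Each worker process (typically scheduled on its own core) sums one slice.
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n = 1_000_000
    workers = os.cpu_count() or 4
    step = (n + workers - 1) // workers
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with Pool(processes=workers) as pool:
        # Workers exchange only inputs and results, so there are no shared-memory races.
        total = sum(pool.map(partial_sum, chunks))
    print(total)   # equals sum(i * i for i in range(n))
```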
Benchmarks for Multicore Systems
Benchmarks are essential tools for evaluating the performance of multicore
systems.
They are designed to measure various aspects of system performance, including
computation speed, memory access efficiency, and communication overhead
between cores.
Some common benchmarks used for multicore systems include:
SPEC CPU Benchmark: Measures the computational performance of systems.
PARSEC (Princeton Application Repository for Shared-Memory Computers):
Focuses on parallel applications and evaluates performance in multi-threaded
environments.
Linpack: Measures floating-point performance and is often used for high-
performance computing (HPC) systems.
STREAM Benchmark: Measures memory bandwidth and its impact on
multicore systems.
Thanks for your attention!