📝 Notes | Cloud Computing

Course Description:

1. Overview of distributed computing: trends of computing, introduction to distributed computing, cloud computing.

2. Distributed systems:
- Data centers
- Virtualization, synchronization, and replication
- Web 2.0
- Service- and utility-oriented computing
- Architectures of DS
- Communication in DS

3. Introduction to cloud computing: definitions, properties, characteristics, service and deployment models, and desired features of a cloud.

4. Building a cloud computing environment: application development, infrastructure and system development, computing platforms and technologies.

5. Cloud deployment models: architecture and components of SaaS, PaaS, and IaaS.

6. Major vendors in the public cloud: Amazon, Eucalyptus, and the vendors' products and services.

7. Migrating into a cloud: introduction, broad approaches, regulation issues, and conclusions.

8. Cloud applications: healthcare and biology, geoscience, business and consumer applications.

9. Cloud issues and challenges: cloud provider lock-in, security risks in the cloud, and mitigation measures.

Overview of Distributed Computing:

1. Trends of Computing:

The journey of computing has evolved through key phases:

• Mainframe Era: Centralized computing focused on resource-heavy mainframes.

• Personal Computing: The invention of desktop computers provided local, user-centric computing.

• Distributed Computing: The invention of networks allowed resource sharing across multiple devices.

• Grid and Peer-to-Peer Computing: Both involve a network of computers working together to perform large-scale tasks. These computers are geographically dispersed, and they collaborate to solve complex problems by sharing resources such as processing power, storage, and data.

The difference is that grid computing is commonly used in scientific research, financial modeling, and large-scale simulations, while in a peer-to-peer model each computer (called a "peer") acts as both a client and a server. Peers share resources directly with each other without relying on a central server.

• Cloud Computing: The modern trend, providing on-demand, scalable, utility-based services.

2. Introduction to Distributed Computing

Distributed computing is a paradigm in which a group of independent computers work together as a unified system to achieve a common objective. Instead of relying on a single central computer, distributed systems enable multiple machines to collaborate, share resources, and divide tasks. These systems can span local networks or the internet, forming the foundation for modern computing infrastructures like cloud computing, grid computing, and peer-to-peer networks.

3. Reasons behind Distributed Computing:

- Many of today's infrastructures are built on distributed systems; some infrastructures require more than one computer to get the job done.

- People need DC because they need high performance. In particular, they want to achieve:

1/ Parallelism: access to many CPUs, memories, storage devices, and data sources.

2/ Fault tolerance: if one computer fails, you can rely on another one.

3/ Physical reasons: in real life, some problems are naturally spread out in space (geographically dispersed), e.g., the banking system.

4/ Security: to isolate risks when running new software, adding new hardware, etc.

4. Key Features of Distributed Computing:

1. Resource Sharing: Utilizes the combined resources of multiple computers. Resources like processing power, storage, and data can be accessed and utilized by various machines within the network.

2. Scalability: Distributed computing can grow easily by adding more machines to increase computational power.

3. Fault Tolerance: If one computer fails, others can take over its tasks (a minimal failure-detection sketch follows this list).

4. Concurrency: Multiple tasks can run simultaneously across different nodes, increasing efficiency.

5. Transparency: Users perceive the system as a single, unified entity, hiding the complexity of the underlying distributed infrastructure.

6. Heterogeneity: Can include different types of computers and operating systems.
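To make fault tolerance concrete, here is a minimal, hypothetical Python sketch of heartbeat-based failure detection: each node periodically reports a heartbeat, and any node that stays silent longer than a timeout is treated as failed so its work can be handed to another node. The node names and the two-second timeout are invented for illustration.

```python
# Toy heartbeat-based failure detector (illustrative sketch only).
import time

HEARTBEAT_TIMEOUT = 2.0  # seconds of silence before a node is considered failed

class FailureDetector:
    def __init__(self):
        self.last_seen = {}  # node id -> timestamp of its last heartbeat

    def heartbeat(self, node_id):
        self.last_seen[node_id] = time.time()

    def alive_nodes(self):
        now = time.time()
        return [n for n, t in self.last_seen.items() if now - t <= HEARTBEAT_TIMEOUT]

    def failed_nodes(self):
        now = time.time()
        return [n for n, t in self.last_seen.items() if now - t > HEARTBEAT_TIMEOUT]

detector = FailureDetector()
detector.heartbeat("node-1")
detector.heartbeat("node-2")
time.sleep(2.5)                 # node-2 goes silent...
detector.heartbeat("node-1")    # ...while node-1 keeps reporting
print("alive:", detector.alive_nodes())    # ['node-1']
print("failed:", detector.failed_nodes())  # ['node-2'] -> its tasks can be reassigned
```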
5. Applications of Distributed Computing:

Today, distributed systems power major applications such as online banking, social media, e-commerce platforms, and real-time collaborative tools.

• Cloud Computing: Provides on-demand access to shared computing resources, such as servers, storage, and applications.
• Distributed Databases: Ensure data availability, replication, and reliability across geographically distributed systems.
• Internet of Things (IoT): Enables connected devices to share and process data in real time.
• Scientific Research: High-performance distributed systems are used for tasks like protein folding simulations and space exploration.

Conclusion:
Distributed computing has transformed the way we build and deploy
software systems by enabling efficient resource utilization, increased
system reliability, and enhanced scalability.

Distributed Systems (DS):

1. Definition:

A distributed system is a collection of independent computers (nodes) that appears to its users as a single coherent system. The focus is on how these nodes communicate, coordinate, synchronize, and control the sharing of resources (data, services) to provide a unified experience.

Fig. 1-1 shows four networked computers and three applications, of which application B is distributed across computers 2 and 3. The distributed system provides the means for components of a single distributed application to communicate with each other, but also lets different applications communicate.

Middleware in distributed systems is a layer of software that acts as an intermediary between different software applications or components within a distributed system, facilitating communication, data management, and service integration. It enables diverse and distributed systems to work together seamlessly, hiding the complexities of the underlying infrastructure and providing a consistent interface for applications.

2. Primary Goals
o Service Continuity: The system should keep operating even if some nodes or network links fail.
o Consistency & Coordination: Maintaining data consistency and ensuring state is synchronized across nodes.
o Scalability & Transparency: Ability to scale up by adding nodes while hiding the complexity from end users.

3. Difference between DS and DC:
• Distributed Systems: Emphasis on the system as a whole, ensuring reliability, coordination, and consistency across nodes.
• Distributed Computing: Emphasis on dividing and conquering computational tasks and optimizing for parallel execution.

• Distributed Systems: System design patterns, communication protocols, data consistency models.
• Distributed Computing: Parallel algorithms, scheduling, job distribution, result aggregation, performance optimization.

• A distributed system can provide the platform or infrastructure on which distributed computing tasks are executed.
• Conversely, distributed computing often utilizes components of distributed systems (e.g., messaging protocols, load balancers) to coordinate and manage the distributed workloads.

Examples
• Distributed Systems:
 Social Media Platforms that store and serve billions of user posts
from data centers worldwide.
 Cloud-Based services where each service runs on different nodes
but collectively forms a single application.
 Distributed Databases (e.g., Cassandra, MongoDB clusters)
managing data replication and partitioning.
• Distributed Computing:
 MapReduce / Spark Clusters performing big data analytics or
machine learning tasks on enormous datasets.
 Scientific Simulations (e.g., protein folding, climate modeling)
spread over high-performance computing clusters.

4. Types of Distributed Systems:

Distributed systems are networks of machines that exchange information through message passing, enabling resource sharing and coordination among computers. Here are the main types of distributed systems:

Client/Server Systems
In client/server systems, the client sends input to the server, which processes the request and sends back a response. This model is fundamental and can involve multiple servers.
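As a rough illustration of the request/response flow (not part of the original notes), the following Python sketch runs a tiny TCP server in a background thread and has a client send a request and read the reply. The port number and message format are arbitrary.

```python
# Minimal client/server exchange over TCP sockets (toy sketch).
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 5050

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen()
        conn, _ = srv.accept()                 # wait for one client
        with conn:
            request = conn.recv(1024)          # read the client's input
            conn.sendall(b"echo: " + request)  # process it and send a response

threading.Thread(target=server, daemon=True).start()
time.sleep(0.5)                                # give the server a moment to start listening

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect((HOST, PORT))
    client.sendall(b"hello server")
    print(client.recv(1024).decode())          # -> echo: hello server
```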
Peer-to-Peer Systems
Peer-to-peer systems are decentralized, with each node acting as both client
and server. Nodes perform tasks on their local memory and share data
through a supporting medium.

The N-tier Systems

Splits an application's functionality into logical layers or "tiers". Three-tier systems, for example, use one tier for each major function of a program: the Presentation Layer, the Application Layer, and the Data Layer. This architecture is commonly used in web applications.
N-tier systems, also known as multitier distributed systems, can have any number of tiers in the network. They are similar to three-tier systems but with more layers, and are commonly used in web applications and data systems.
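The sketch below is a toy, single-process illustration of the three-tier split: each tier is just a Python class here, whereas in a real deployment the presentation, application, and data layers would typically run on separate machines. All names and data are made up.

```python
# Toy three-tier layout: data tier, logic tier, presentation tier.
class DataLayer:                        # data tier: storage and retrieval
    def __init__(self):
        self._db = {"42": {"name": "Ada", "balance": 100}}
    def get_user(self, user_id):
        return self._db.get(user_id)

class ApplicationLayer:                 # logic tier: business rules
    def __init__(self, data):
        self.data = data
    def account_summary(self, user_id):
        user = self.data.get_user(user_id)
        if user is None:
            return None
        status = "ok" if user["balance"] >= 0 else "overdrawn"
        return {"name": user["name"], "status": status}

class PresentationLayer:                # presentation tier: formatting for the user
    def __init__(self, app):
        self.app = app
    def render(self, user_id):
        summary = self.app.account_summary(user_id)
        return "user not found" if summary is None else f"{summary['name']}: {summary['status']}"

ui = PresentationLayer(ApplicationLayer(DataLayer()))
print(ui.render("42"))   # -> Ada: ok
```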

Distributed Computing Systems

These are used for high-performance computation. Two main examples:

Cluster Computing: A collection of connected computers working together as a unit. Clusters are generally connected via fast local area networks, and each node runs the same operating system. It is used in web applications, weather modeling, and complex computational problems.

Grid Computing: A network of computer systems from different administrative domains working together. It is used for solving complex problems and collaborating with other organizations.

Distributed Information Systems

These systems involve distributed transaction processing and enterprise application integration:

Distributed Transaction Processing: Works across different servers using multiple communication models, ensuring atomicity, consistency, isolation, and durability (ACID) across those servers.

Enterprise Application Integration (EAI): The process of bringing different business applications together to ensure consistent data usage.

Remote Procedure Calls (RPC): Allow software elements to communicate by making what look like local method calls, even though the called procedure executes on a remote machine.

Distributed Pervasive Systems
Pervasive computing integrates everyday objects with microprocessors for communication. Examples:
Home Systems: Digital devices in homes that can be controlled remotely.

Electronic Health Systems: Smart medical wearable devices for health monitoring.

Sensor Networks (IoT devices): Devices that send, store, and process data efficiently.

These types of distributed systems enable efficient resource sharing, coordination, and scalability, making them essential for various applications in modern computing.

5. Conclusion:

In short, distributed systems involve the architecture and coordination of multiple networked computers to act as one logical system, while distributed computing focuses on splitting computational work across multiple machines to tackle large-scale problems more efficiently. They often overlap, but each has its own primary goals and challenges.
DS Concepts, Architectures and Communications
1. Distributed System Architectures:

A distributed system is a set of geographically dispersed computing components that communicate and coordinate their actions to achieve a common goal.

To define how those components communicate, exchange resources, and maintain data consistency, three types of distributed system architecture were conceptualized:
- Centralized
- Decentralized
- Hybrid

A. Centralized Architectures :

In centralized architectures, all requests and resources are managed by a single central node or server, while all other nodes (clients) rely on the central server for processing and communication.

Key Features of Centralized Architecture in Distributed Systems:

1. Single Point of Control: A central node or server coordinates and manages the operations, acting as the control point for the entire system.
2. Simplified Management: The central authority simplifies management and administration since the core tasks are handled from a single location.
3. Improved Resource Allocation: Resource distribution and task scheduling are more straightforward, as the central entity has a comprehensive view of the system.
4. Scalability Limitation: As the system grows, the central server might struggle to handle the increasing load, leading to performance issues.

Advantages of Centralized Architecture:

• Consistency: Centralized control ensures consistency across the system since all nodes rely on the central authority.
• Easier Maintenance: Updates, patches, and maintenance tasks are
simpler to perform as they only need to be applied to the central node.
• Efficient Communication: Communication between nodes is
streamlined through the central server, reducing the complexity of direct
peer-to-peer interactions.

Disadvantages of Centralized Architecture:

• Single Point of Failure (Potential Bottleneck): The central server becomes a critical vulnerability; if it fails, the entire system can be disrupted.
• Scalability limitation: Centralized systems can struggle to scale
efficiently, as the central server may become overwhelmed with
increasing demands.
• Latency: As all requests must pass through the central server, latency
can increase, especially in geographically distributed systems.

Examples of Centralized architecture


• Client-Server Systems(e.g., Web Applications)
• Database Management Systems (e.g., MySQL with a central server)
• File Storage Systems (e.g., Google Drive's centralized file storage)

Centralized architecture can be beneficial for smaller systems or specific use cases where central control is crucial. However, for large-scale, geographically distributed systems, a decentralized or hybrid approach might be more effective to address the limitations of centralized control.

B. Decentralized Architecture :

A decentralized architecture in a distributed system involves multiple nodes or components working independently without a single central authority or control point. Each node makes its own decisions and coordinates with other nodes to achieve a common goal. This approach provides increased fault tolerance, scalability, and resilience.

Key Features of Decentralized Architecture in Distributed Systems:

1. No Single Point of Control: Unlike centralized systems, decentralized architecture doesn't rely on a central server. Each node operates independently and collaborates with others.
2. Improved Fault Tolerance: If one node fails, the system continues to
function as other nodes can take over the responsibilities.
3. Scalability: Decentralized systems can easily scale by adding more
nodes, distributing the load evenly across the network.
4. Increased Resilience: The absence of a central point reduces the risk of
a single point of failure, making the system more resilient to attacks and
failures.
Advantages of Decentralized Architecture:

• Fault Tolerance: The system remains operational even if some nodes fail, enhancing overall reliability.
• Scalability: New nodes can be added without significantly impacting the
system's performance.
• Redundancy: Multiple nodes can perform the same tasks, providing
redundancy and backup options.
• Reduced Bottlenecks: Without a central authority, the risk of
bottlenecks is minimized, leading to improved performance.

Disadvantages of Decentralized Architecture:

• Complexity: Managing and coordinating a decentralized system can be more complex due to the lack of a central control point.
• Consistency: Achieving consistency across all nodes can be challenging,
especially in large and dynamic systems.
• Communication Overhead: Nodes must communicate and coordinate
with each other, leading to increased communication overhead.

Examples of Decentralized Systems:

• Blockchain: A prime example of decentralized architecture, where each node in the network maintains a copy of the blockchain and validates transactions independently.
• Peer-to-Peer Networks: In peer-to-peer (P2P) networks, nodes connect
and share resources directly with each other without a central server.
• BitTorrent File Sharing
• Decentralized Cloud Storage (e.g., IPFS - InterPlanetary File System)

Decentralized architecture offers several benefits, particularly for large-scale, geographically distributed systems where resilience, fault tolerance, and scalability are crucial. However, it also introduces challenges in terms of complexity and consistency management.
C. Hybrid Architecture :

A hybrid architecture in a distributed system combines elements of both centralized and decentralized architectures, leveraging the strengths and mitigating the weaknesses of each approach. This design allows for greater flexibility, scalability, fault tolerance, and efficient management.

Key Features of Hybrid Architecture in Distributed Systems:

1. Combination of Control Points: Hybrid architecture can have a central control point for certain tasks while allowing decentralized nodes for others.
2. Flexibility: It provides the ability to tailor the architecture to specific
requirements, balancing centralization and decentralization.
3. Enhanced Performance: By distributing tasks appropriately, hybrid
architecture can optimize performance and resource utilization.
4. Improved Fault Tolerance: The decentralized components add
resilience, while the centralized control ensures consistent coordination.

Advantages of Hybrid Architecture:

• Scalability: The system can scale more effectively by distributing load across decentralized nodes while maintaining central oversight for critical functions.
• Fault Tolerance: Decentralized components enhance the system's fault
tolerance, reducing the risk of single points of failure.
• Efficient Management: Central control points simplify management
and administration for critical tasks, ensuring consistency and reliability.
• Optimized Resource Utilization: The ability to balance centralization
and decentralization allows for better resource allocation and usage.
Disadvantages of Hybrid Architecture:

• Complexity: Designing and managing a hybrid system can be more complex due to the need to balance different architectural elements.
• Consistency Challenges: Ensuring consistency across centralized and
decentralized components can be challenging, especially in large
systems.
• Communication Overhead: Communication between centralized and
decentralized components can introduce overhead and latency.

Examples of Hybrid Systems:

• Content Delivery Networks (CDNs): CDNs use a centralized control system to manage and distribute content, while decentralized edge servers deliver content locally to users, optimizing performance and reducing latency.
• Cloud Services: Some cloud services use a hybrid approach, combining
central data centers with decentralized edge computing nodes to provide
scalable and resilient services.

Hybrid architecture is particularly beneficial for systems that require both the
reliability and consistency of centralized control and the scalability and fault
tolerance of decentralized components. By combining these approaches,
hybrid architecture can provide a balanced and efficient solution for complex,
large-scale distributed systems.

2. Data Center and its techniques

A data center in a distributed system is a facility used to house computer systems and associated components, such as telecommunications and storage systems. Data centers play a crucial role in distributed systems by providing centralized resources that can be accessed and utilized by multiple distributed nodes. Here's a closer look at the role and features of data centers within distributed systems:

Key Roles of Data Centers in Distributed Systems:

1. Centralized Resource Management: Data centers provide centralized management of resources such as servers, storage, and networking equipment, making it easier to maintain and allocate resources efficiently.
2. Data Storage and Processing: Data centers house the infrastructure
needed for data storage and processing, enabling distributed systems to
store and process large volumes of data.
3. High Availability and Reliability: Data centers are designed to provide
high availability and reliability, ensuring that the distributed system can
function continuously without interruption.
4. Security: Data centers implement robust security measures to protect
data and systems from unauthorized access, cyberattacks, and physical
threats.
5. Connectivity: Data centers provide the necessary connectivity
infrastructure, enabling communication and data exchange between
distributed nodes.

Data Center Architectures in Distributed Systems:

• Centralized Data Centers: In some distributed systems, a centralized data center manages and stores the majority of the data and resources. This architecture provides easier management but can create a single point of failure.
• Distributed Data Centers: Multiple data centers are distributed across
different geographical locations. This architecture enhances fault
tolerance, reduces latency, and improves disaster recovery capabilities.
• Hybrid Data Centers: A combination of centralized and distributed data
centers is used to balance the benefits of both architectures. Critical
resources may be housed in a centralized data center, while additional
resources are distributed across multiple data centers.

Fundamental Components:

◦ Hardware: servers, storage, and networking devices.
◦ Software: virtualization, containerization, and cloud management tools.
◦ Infrastructure: power supply, cooling systems, and security.

Examples of Data Centers in Distributed Systems:

• Cloud Data Centers: Cloud providers like AWS, Google Cloud, and
Microsoft Azure operate large-scale data centers that provide
infrastructure and services to distributed systems globally.
• Edge Data Centers: Edge data centers are smaller facilities located
closer to end-users to reduce latency and improve performance for
distributed systems, such as content delivery networks (CDNs) and
Internet of Things (IoT) applications.

Data centers are fundamental components of distributed systems, providing the necessary infrastructure, resources, and connectivity to enable efficient and reliable operation. Their role is critical in ensuring the scalability, availability, and security of distributed systems.

3. Key Concepts in Distributed Systems:

A. Edge computing

Edge computing is a model of computing where data processing and analysis occur closer to the source (or "edge") of the data, such as sensors, IoT devices, or local gateways, rather than in a centralized cloud or data center. By performing computing tasks at or near where the data is generated, edge computing aims to reduce latency, save bandwidth, and enhance real-time responsiveness.

Key Points

1. Location of Processing
a. Instead of sending large volumes of raw data to a distant cloud
server, the processing (computation and analytics) happens near
the devices—either on the device itself or on a local edge node.
b. This local or near-local node could be a specialized gateway, an on-
premises server, or even an embedded computer in a machine.
2. Benefits
a. Reduced Latency: Immediate processing near the data source
allows real-time or near real-time responses (crucial in applications
like self-driving cars, industrial automation, or health monitoring).
b. Bandwidth Savings: By not transmitting all raw data to the cloud,
organizations can lower network usage and associated costs.
c. Reliability: Systems remain partially functional even if the
connection to the central cloud is disrupted—helpful for remote
locations or harsh environments.
d. Data Privacy: Sensitive or proprietary data can be processed
locally, limiting what is sent over external networks.

3. Use Cases
a. IoT (Internet of Things): Smart sensors in manufacturing or smart
cities can analyze data at the edge, only sending summarized
results to the cloud.
b. Autonomous Vehicles: Quick decision-making (e.g., obstacle
avoidance) requires computing near the source (the car itself).
c. Healthcare: Wearable or bedside devices that need immediate
local analysis for patient monitoring.
d. Retail: Local servers in stores can handle point-of-sale data,
personalized recommendations, and inventory management
without depending on a constant, high-bandwidth connection.
4. Challenges
a. Infrastructure Complexity: Setting up edge nodes across multiple
locations can be more complex than using a single central cloud.
b. Security: More endpoints and distributed nodes can increase the
attack surface, requiring robust security measures.
c. Maintenance: Monitoring and updating a large number of edge
devices requires careful orchestration and management strategies.
5. Relation to Cloud Computing
a. Edge computing often complements cloud computing rather than
replaces it.
b. Hybrid Approach: Perform immediate or real-time processing at
the edge, then send aggregated or less time-sensitive data to the
cloud for further analysis, archiving, or advanced machine learning
tasks.
In Summary

Edge computing brings computational power closer to where data is generated, aiming for faster response times, lower bandwidth usage, and improved reliability. It's especially critical in scenarios requiring real-time decisions, handling massive data streams, or ensuring operational continuity in remote environments.
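A small, hypothetical Python sketch of the edge idea: the edge node summarizes raw sensor readings locally and forwards only a compact summary "to the cloud" (the upload is just a print, and all readings and thresholds are invented).

```python
# Toy edge aggregation: summarize locally, send only the summary upstream.
from statistics import mean

def read_sensor_batch():
    # Pretend these are raw temperature readings collected at the edge.
    return [21.4, 21.6, 21.5, 35.2, 21.5, 21.7]

def summarize(readings, alert_threshold=30.0):
    return {
        "count": len(readings),
        "mean": round(mean(readings), 2),
        "max": max(readings),
        "alerts": [r for r in readings if r > alert_threshold],
    }

def send_to_cloud(summary):
    # Stand-in for an HTTP call to a cloud endpoint.
    print("uploading summary:", summary)

raw = read_sensor_batch()         # could be hundreds of readings per minute
send_to_cloud(summarize(raw))     # only a few numbers cross the network
```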

B. Virtualisation:

Virtualization allows one physical machine to run multiple virtual machines (VMs), each acting like a separate computer. This means you can run different operating systems and applications on a single physical machine without conflicts.

• Types of Virtualization:
o Hardware Virtualization: This involves creating virtual versions of
physical hardware components, like CPUs, storage devices, and
network resources. Popular platforms include VMware, Hyper-V, and
VirtualBox.
o Software Virtualization: This allows multiple software
applications to run on a single physical machine, often using virtual
environments or containers like Docker.
o Desktop Virtualization: This lets you run multiple desktop
environments on a single physical computer, enabling users to
switch between operating systems or applications seamlessly.
o Network Virtualization: Combines multiple physical networks into
a single virtualized network (e.g., VPNs, SDN, NFV).
o Storage Virtualization: Pools multiple storage devices into a
single, logical unit for efficient management (e.g., SAN, NAS, Cloud-
based storage).

• Benefits:
o Cost Savings: Reduces the need for physical hardware, leading to
lower costs for purchasing, maintaining, and powering devices.
o Resource Optimization: Maximizes the utilization of physical
resources, as multiple VMs can share a single physical machine’s
CPU, memory, and storage.
o Scalability: Makes it easier to scale your infrastructure up or down
based on demand, by adding or removing VMs as needed.
o Disaster Recovery: Simplifies backup and recovery processes, as
VMs can be easily copied, moved, or restored in the event of
hardware failure.
• Example:
For a real application of virtualization that you can run on your own computer, search YouTube for "How to use VirtualBox - Tutorial for Beginners" and try to follow the same steps yourself.

Virtualization is a foundational technology for DS and cloud computing, making it possible to offer flexible, on-demand computing resources.

C. Consistency

• Definition: Consistency typically means stability, uniformity, or logical coherence. Something is consistent if it does not contradict itself and remains uniform over time.
• Examples:
 A person’s behavior is called consistent if they act in a predictable
or reliable manner.
In Computing (Especially Distributed Systems)

• Data Consistency: In distributed systems (where data is stored on multiple servers or nodes), consistency refers to the degree to which all nodes see the same data at the same time. Consistency in distributed systems means every node / replica has the same view of the data at a given point in time, irrespective of which client has updated the data. When you make a request to any node, you receive the exact same response (even if it's an error), so that from the outside it looks like there is a single node performing all the operations.
• Why It Matters:
 If one part of the system updates data but another part still shows
old data, that’s an inconsistency.
 Some systems prioritize strong consistency (all updates appear
everywhere instantly), while others allow eventual consistency
(updates spread gradually but eventually synchronize).

• CAP Theorem: Consistency is one of the three properties in the famous CAP Theorem (Consistency, Availability, Partition tolerance). The theorem states that a distributed system can fully guarantee only two of the three at the same time.

Real Life Examples:

- DNS entries across DNS servers and clients.
- Reviews or ratings of products on Amazon.
- Count of likes on Facebook.
- Views on YouTube videos.
- Stream of comments on Facebook live videos.
- Ticket price shown on the front page of an airline website.
- Fetching how many Facebook friends / WhatsApp contacts are online.
In Summary
• In everyday language, consistency means something is stable,
uniform, or logically coherent.
• In computer science, especially in distributed systems, consistency is
about ensuring multiple copies of data stay synchronized or at least
converge to a single up-to-date state over time.

D. Replication

Replication is an important concept in computing and data management. It involves creating copies of data or systems to ensure consistency, availability, and reliability. Here are some key aspects of replication:

1/ Data Replication:

i. Synchronous vs. Asynchronous Replication:

In synchronous replication, data is copied in real time to multiple locations. This ensures that all copies are identical at any given moment, but it may impact performance due to latency. In asynchronous replication, data is copied to multiple locations, but not in real time. This method has less impact on performance but may result in slight data inconsistencies for a brief period.

ii. PRIMARY-BACKUP REPLICATION:

Also known as active-passive replication, this involves designating one primary replica (active) to handle all updates (writes), while one or more backup replicas (passive) maintain copies of the data and synchronize with the primary. Example: cloud storage, where the primary server stores files and the backup server keeps a copy to prevent data loss.
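Below is a toy in-memory sketch (not any vendor's implementation) of a primary-backup pair that can propagate writes either synchronously or asynchronously; in asynchronous mode the backup is briefly stale until the pending writes are flushed.

```python
# Toy primary-backup replication with a synchronous/asynchronous switch.
class Replica:
    def __init__(self):
        self.store = {}
    def apply(self, key, value):
        self.store[key] = value

class Primary(Replica):
    def __init__(self, backup, synchronous=True):
        super().__init__()
        self.backup = backup
        self.synchronous = synchronous
        self._pending = []                      # writes not yet propagated

    def write(self, key, value):
        self.apply(key, value)                  # the primary always applies first
        if self.synchronous:
            self.backup.apply(key, value)       # ack only after the backup has it
        else:
            self._pending.append((key, value))  # propagate later

    def flush(self):
        for key, value in self._pending:        # asynchronous catch-up
            self.backup.apply(key, value)
        self._pending.clear()

backup = Replica()
primary = Primary(backup, synchronous=False)
primary.write("order:1", "paid")
print(backup.store)   # {}  -- backup is briefly stale in asynchronous mode
primary.flush()
print(backup.store)   # {'order:1': 'paid'}
```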

iii. MULTI-PRIMARY REPLICATION:

Allows multiple replicas to accept updates independently. Each replica acts as both a client (accepting updates) and a server (propagating updates to other replicas).
Example: collaborative document editing (e.g., Google Docs), where changes sync across multiple servers in real time.
iv. CHAIN REPLICATION:
Chain replication involves replicating data sequentially through a chain of nodes. Each node in the chain forwards updates to the next node in the sequence, with an acknowledgement typically returned back to the primary (head) node.
Example: cloud storage systems (e.g., Amazon S3).
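A minimal sketch of the chain idea, assuming in-memory nodes standing in for servers: writes enter at the head and are forwarded node by node, and reads are served from the tail, which only holds fully propagated data.

```python
# Toy chain replication: head receives writes, forwards them down the chain.
class ChainNode:
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.next = None                 # next node in the chain (None for the tail)

    def write(self, key, value):
        self.store[key] = value
        if self.next is not None:
            self.next.write(key, value)  # forward the update down the chain

head, middle, tail = ChainNode("head"), ChainNode("middle"), ChainNode("tail")
head.next, middle.next = middle, tail

head.write("photo.jpg", b"...bytes...")  # write enters at the head
print(tail.store)                        # reads go to the tail: {'photo.jpg': b'...bytes...'}
```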

v. DISTRIBUTED REPLICATION:
Distributed replication distributes data or services across multiple nodes in a less structured manner compared to primary-backup or chain replication. Replicas can be geographically or logically distributed across the network.
Example: Content Delivery Networks (CDNs), e.g., Netflix.

vi. Benefits of Data Replication:

Data Redundancy: Ensures that data is available even if one copy is lost or corrupted.

Improved Performance: Replicated data can be distributed across multiple servers, reducing access time and load on any single server.

Disaster Recovery: Allows for quick recovery of data in case of hardware failure, cyber-attacks, or natural disasters.

2/ System Replication:

i. Full System Replication: Creates a complete copy of an entire system, including operating systems, applications, and data. This is useful for creating backup systems or setting up testing environments.
ii. Partial Replication: Involves replicating specific components or parts of a system, such as particular applications or databases. This can be more efficient and cost-effective than full replication.

iii. Benefits:

High Availability: Ensures that critical systems and applications remain available even if one instance fails.
Load Balancing: Distributes workloads across multiple instances, improving overall system performance and reliability.
Scalability: Makes it easier to scale systems up or down by adding or removing replicated instances as needed.

Replication is a fundamental technique for achieving data resilience, ensuring business continuity, and optimizing system performance.

3/ Synchronization:

Synchronization is a technique closely related to replication; it refers to the process of ensuring that data or systems are consistent and up to date across multiple locations or devices. Here are some key aspects of synchronization:

Data Synchronization:
i. File Synchronization: Keeps files consistent across multiple
devices or storage locations. For example, syncing files between
your computer and cloud storage.
ii. Database Synchronization: Ensures that data in multiple
databases or database instances remains consistent. This is crucial
for distributed systems where data is spread across different
servers.
Benefits:
Consistency: Ensures that all copies of data are the same,
reducing the risk of data discrepancies.
Accessibility: Allows users to access the most recent version of
data from any location or device.
Collaboration: Facilitates real-time collaboration by ensuring
that changes made by one user are immediately reflected for
others.
System Synchronization:

Clock Synchronization: Ensures that the clocks of different systems are aligned. This is important for time-sensitive applications and for coordinating actions in distributed systems (a small clock-offset sketch follows at the end of this subsection).

State Synchronization: Ensures that the states of different systems or components are consistent. This is crucial for applications like multiplayer online games, where all players need to see the same game state.

Benefits:
Accuracy: Ensures that actions or events occur at the correct
time and in the correct sequence.
Reliability: Reduces the risk of errors or conflicts caused by
out-of-sync data or systems.
Efficiency: Improves the overall performance and
responsiveness of systems by ensuring that they operate in
harmony.

Synchronization is essential for maintaining data integrity, enabling seamless access to information, and ensuring the smooth operation of distributed systems.
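As promised above, here is a minimal sketch of Cristian-style clock synchronization (a common textbook technique, not something prescribed by these notes): the client asks a time server for its clock and corrects for network delay by assuming the reply took half of the measured round trip. The 3-second server skew is simulated.

```python
# Cristian-style clock offset estimation (simulated server clock).
import time

SERVER_SKEW = 3.0                      # pretend the server clock runs 3 s ahead

def server_time():
    # Stand-in for a network call to a remote time service.
    return time.time() + SERVER_SKEW

t_request = time.time()                # local time when the request is sent
remote = server_time()                 # server's reported time
t_reply = time.time()                  # local time when the reply arrives

round_trip = t_reply - t_request
estimated_server_now = remote + round_trip / 2   # assume symmetric delay
offset = estimated_server_now - t_reply

print(f"estimated offset: {offset:.3f} s")       # close to 3.000
corrected_local_clock = time.time() + offset     # local clock aligned to the server
```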

4/ Consistency Guarantees:

In distributed systems' replication, when we discuss system design we always talk about "consistency guarantees" (is it strong consistency, or weak consistency, etc.), which is a very important paradigm for ensuring the availability and reliability of the data/system.

To understand consistency guarantees: imagine there is only a single thread (program) accessing an object in your system. There is no interference from any other thread. How easy is it to define the correctness of the operations that happened on that object? Since operations happen one by one in a single thread, each operation completes only after the previous operation finishes. Hence, if we take inputs one by one, run the program to manipulate the object, and manually verify the output, as long as the output is consistent with the input provided, the operations are valid. But how would you define correctness in a multi-threaded, multi-processor environment where multiple threads / processes may access the same memory location together?

As developers, we need certain guarantees from the system regarding how different operations will be ordered, how updates will be visible, and essentially what the impact will be on the performance and correctness of the system in the presence of multiple threads of execution or processes. A consistency model defines that abstraction. Consistency models are trade-offs between concurrency of operations and ordering of operations, or in other words, between performance and correctness of the operations.

The Challenge with Consistency & Replication in Distributed Systems

For serving massive online traffic on the modern Web, it's very common to have an infrastructure set up with multiple replicas (or partitions). Whether it's your Facebook activity, Gmail data or Amazon order history, everything is replicated across data centers, availability zones and possibly across countries to ensure data is not lost and the systems are always highly available in case one of the replicas crashes. This poses a challenge: how do we keep data consistent across replicas? Without consistency, a mail that you recently sent through Gmail might disappear, or an item deleted from your Amazon cart might reappear. Or even worse, a financial transaction might be lost, causing thousands of $$$ in losses. Losing cart items is okay at times, but losing $$$ is a big NO NO!
There are two kinds of consistency:
- Weak Consistency and Strong Consistency.

Figure 1: Consistency guarantees scale

A/ Weak Consistency:
Weak consistency is typical of NoSQL data stores like MongoDB, Amazon DynamoDB, Cassandra, etc. These systems are usually known for built-in high availability and performance. In the presence of partitions and network issues, they accept weaker consistency to support such behaviour. As you can see in Figure 1, weaker consistency means higher availability, performance and throughput, although more anomalous data.

The most famous type of weak consistency guarantee is called:

Eventual Consistency (Weak Consistency):

Eventual consistency is the weakest and probably the most popular consistency level (e.g., NoSQL). In a distributed system, replicas eventually converge to the same state. Given that no write operation is in progress for a given data item, eventual consistency guarantees that all replicas eventually start serving read requests with the last updated value.

Figure 2: Eventual Consistency with replication. Courtesy: Google Cloud

Properties:

- There is absolutely no ordering guarantee on reads and writes; arbitrary orders apply.
- If a unit of execution writes some value to an object, then upon reading the same object back from another replica, the update could be invisible.
- Reading the same data from different nodes simultaneously may return stale data.
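To make the stale-read behaviour tangible, here is a toy Python simulation (replica names and data are invented): a write is acknowledged after reaching one replica and is only later propagated to the others, so reads in the meantime can return old values.

```python
# Toy simulation of eventual consistency across three replicas.
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

replicas = [Replica("us-east"), Replica("eu-west"), Replica("ap-south")]

def write(key, value, target):
    target.data[key] = value            # acknowledged after only one replica has it

def propagate():
    merged = {}
    for r in replicas:                  # naive "gossip": take the union of all writes
        merged.update(r.data)
    for r in replicas:
        r.data.update(merged)

write("likes:post42", 101, replicas[0])
print([r.data.get("likes:post42") for r in replicas])  # [101, None, None] -> stale reads
propagate()                                            # replicas eventually converge
print([r.data.get("likes:post42") for r in replicas])  # [101, 101, 101]
```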

B/ Strong Consistency:

Conceptually, strong consistency is the exact opposite of eventual consistency: all the replicas read the same value for a given data item at the same point in time. Certainly, ensuring strong consistency across data centers, or even across multiple nodes in a single data center, is expensive.
For more about consistency models in DS replication, see "Consistency Guarantees in Distributed Systems Explained Simply" by Kousik Nath on Medium.

4. Communication in a DS:
We said a distributed system is a collection of interconnected computers that work together to achieve a common goal. Without communication, we can't achieve coordination, data sharing, and resource distribution between the different components of a DS.

To make it easier to deal with the numerous levels and issues involved in
communication, the International Standards Organization (ISO) developed a
reference model that clearly identifies the various levels involved, gives them
standard names, and points out which level should do which job. This model
is called the Open Systems Interconnection Reference Model (Day and
Zimmerman, 1983), usually abbreviated as ISO OSI or sometimes just the
OSI model.

The OSI model is designed to allow open systems to communicate. An open system is one that is prepared to communicate with any other open system by using standard rules that govern the format, contents, and meaning of the messages sent and received. These rules are formalized in what are called protocols. To allow a group of computers to communicate over a network, they must all agree on the protocols to be used. A distinction is made between two general types of protocols. With connection-oriented protocols, before exchanging data the sender and receiver first explicitly establish a connection, and possibly negotiate the protocol they will use. When they are done, they must release (terminate) the connection.

In the OSI model, communication is divided up into seven levels or layers. Each layer deals with one specific aspect of the communication.
Communication in a DS can be categorized based on synchronization, delivery method, and architecture:

1. Synchronous vs Asynchronous Communication


2. Unicast, Multicast, and Broadcast Communication

3. Remote Procedure Call (RPC) Protocols


In 1984, Birrell and Nelson suggested allowing programs to call procedures located on other machines. When a process on machine A calls a procedure on machine B, the calling process on A is suspended, and execution of the called procedure takes place on B. Information can be transported from the caller to the callee in the parameters and can come back in the procedure result. No message passing at all is visible to the programmer. This method is known as Remote Procedure Call, or often just RPC.
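A minimal RPC round trip using Python's built-in xmlrpc modules, just to show the idea: the client calls add() as if it were a local function, while the procedure actually executes in the server process. The port number is arbitrary.

```python
# Minimal RPC example: the remote procedure looks like a local call.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    return a + b                      # the "remote procedure" running on the server

server = SimpleXMLRPCServer(("127.0.0.1", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = ServerProxy("http://127.0.0.1:8000")
print(proxy.add(2, 3))                # looks local, but executes in the server process -> 5
```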
5. Service- and utility-oriented computing:
Service- and utility-oriented computing are two paradigms that focus on
delivering computing resources and services efficiently. Here's a breakdown of
each:

A/ SERVICE-ORIENTED COMPUTING

• Definition: SOC is a computing paradigm where software applications


are built using independent, reusable services. These services
communicate with each other over a network, often using standard
protocols. such as:
 SOAP (Simple Object Access Protocol) –
A messaging protocol that ensures reliability and security.
 REST (Representational State Transfer) – A simpler, lightweight
method used in web applications.
 gRPC (Google Remote Procedure Call) – A high-performance protocol
for real-time communication.

• Key Features:
Modularity: Applications are composed of smaller,
independent services.
Interoperability: Services can work together, even if they are
built on different platforms or technologies.
LOOSE COUPLING — Services are independent and depend
on each other minimally. If one service fails or is updated, it does not
break the entire system.
DISCOVERABILITY — Services should be easily located and
accessed when needed. They are often registered in a service
directory (like UDDI—Universal Description, Discovery, and
Integration) where applications can search for and find available
services.
Scalability: Services can be scaled independently based on
demand.

Applications: SOC is commonly used in Service-Oriented Architecture (SOA) to create flexible and adaptable systems for businesses. It is also used in cloud computing as SaaS (Software as a Service). A minimal REST-style service sketch follows below.
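Here is the promised sketch: a tiny REST-style service built only with the Python standard library, exposing one hypothetical /status endpoint that any client can consume over HTTP. The endpoint, port, and payload are invented for illustration.

```python
# Toy REST-style service plus a client call, standard library only.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"service": "inventory", "healthy": True}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):     # silence default request logging
        pass

server = HTTPServer(("127.0.0.1", 8080), StatusHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any client (another service, a browser, curl) can consume it over HTTP.
with urlopen("http://127.0.0.1:8080/status") as resp:
    print(json.loads(resp.read()))    # {'service': 'inventory', 'healthy': True}
```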

B/ UTILITY-ORIENTED COMPUTING

• Definition: Utility computing is a model where computing resources (like storage, processing power, and applications) are provided on demand and billed based on usage, similar to utilities like electricity or water.
• Key Features:
o On-Demand Access:
Resources are available whenever needed.
o Pay-As-You-Go:
Users are charged based on their consumption (a toy billing sketch follows after this list).
o Scalability:
Resources can be scaled up or down dynamically.
o RESOURCE POOLING : Computing resources are pooled together
and shared among several users to maximize efficiency, reduce
waste, and lower costs by distributing resources.
o RESOURCE VIRTUALIZATION : Physical computer resources are
abstracted into virtual machines, storage, and networks, allowing
numerous users to access the same physical infrastructure in a
secure and efficient manner.
• Applications: Utility computing is a subset of cloud computing and is
widely used in Infrastructure as a Service (IaaS) and Platform as a Service
(PaaS) models.
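As a toy illustration of pay-as-you-go billing (the rates below are invented, not any vendor's real pricing), a bill is simply usage multiplied by a per-unit rate for each metered resource:

```python
# Toy pay-as-you-go bill: charge = metered usage * per-unit rate.
RATES = {
    "vm_hours": 0.045,         # $ per VM-hour (made-up rate)
    "storage_gb_month": 0.02,  # $ per GB-month (made-up rate)
    "egress_gb": 0.09,         # $ per GB transferred out (made-up rate)
}

usage = {"vm_hours": 720, "storage_gb_month": 50, "egress_gb": 12}

bill = {item: round(qty * RATES[item], 2) for item, qty in usage.items()}
print(bill)                                    # per-line-item charges
print("total:", round(sum(bill.values()), 2))  # -> total: 34.48
```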

Both paradigms aim to optimize resource usage and provide flexibility, but SOC focuses on building modular applications, while utility computing emphasizes resource provisioning and cost efficiency. Examples: Amazon Web Services, Microsoft Azure, virtual machines on cloud platforms.
