Lecture 1 - Introduction to Distributed Systems

The document introduces distributed systems, defining them as collections of independent computers that function as a single coherent system. It outlines various types of distributed systems, their goals, characteristics, advantages, challenges, and applications, emphasizing the importance of scalability, fault tolerance, and resource sharing. Real-world examples illustrate the relevance of distributed systems in areas such as cloud computing, big data, and financial services.

Uploaded by

bilopo9589
Copyright
© All Rights Reserved

Murang’a University of Technology

Innovation for Prosperity


Lecture 1

Introduction to Distributed Systems
Contextualizing Distributed Systems

• Alright, class, let’s start with a little reality check. Imagine this:
You’re binge-watching Netflix (because, obviously, who studies
all day?), and your favorite show starts buffering. Annoying,
right? Or worse, you’re trying to upload your assignments to
Google Drive five minutes before the deadline, and it just won’t
sync!

• Guess what? These are all distributed systems in action!

So why are we even studying this? Well, let’s rewind a bit:
- Once upon a time, computers were these big machines working
“alone”.
- But then came the networking era, and suddenly computers started
“talking” to each other.
- Fast forward to today: We’ve got distributed systems that coordinate
many connected computers all at the same time.

But here’s the catch:


Making all these systems work seamlessly together is hard. Like,
“coordinating a group project with procrastinators” hard. That’s what
distributed systems are all about—making multiple computers work
together without throwing tantrums.
Definition of Distributed Systems
• A distributed system is a collection of independent computers that
appear to the user as a single coherent system.

Here's a breakdown of key aspects:


• Multiple Computers: The system involves multiple machines
(nodes) working together.
• Independence: Each computer operates independently, with its
own resources and processing capabilities.
• Coherence: Despite the independence, the system presents a
unified view to the user. This means applications running on the
system can interact seamlessly, as if they were on a single machine.

Types of Distributed Systems
Distributed systems can be classified based on their architecture and purpose:

1. Clusters
• A group of interconnected computers working as a single unit.
• Commonly used in high-performance computing (HPC) for tasks like
scientific simulations.

2. Grids
• A collection of distributed resources (e.g., storage, processing
power) shared across multiple organizations.
• Example: SETI@home used idle computational power from
volunteers worldwide to analyze radio signals.

3. Peer-to-Peer (P2P) Systems
• Nodes act as both clients and servers, sharing resources equally.
• Examples: File-sharing platforms like BitTorrent or cryptocurrencies
like Bitcoin.

4. Sensor Networks
• Composed of low-power devices that collect and transmit data
(e.g., weather sensors, IoT devices).
• Example: Smart home systems, where sensors monitor
temperature, lighting, and security.

Goals of Distributed Systems
The design of distributed systems is driven by specific goals that make them efficient
and user-friendly.

1. Transparency
• Transparency means the system hides its complexity from the user. There are
several types, including access, location, migration, replication,
concurrency, and failure transparency.

2. Scalability
• The system should handle an increasing number of nodes, users, or data without
degradation in performance.

3. Resource Sharing
• Distributed systems allow efficient utilization of resources, such as computational
power, storage, and bandwidth, across multiple locations.

Characteristics of Distributed Systems
1. Scalability
• Scalability refers to the ability of a distributed system to handle
increased load by adding more resources, such as nodes or
hardware.
• It is a critical requirement for modern systems that must
accommodate growing user bases or data volumes.
• There are two main types of scalability:
i. Horizontal Scaling: Adding more machines or nodes to the system.
This is often seen in cloud environments where additional virtual
machines or servers are provisioned as needed.
ii. Vertical Scaling: Enhancing the power of existing nodes, such as
upgrading CPU, memory, or storage.
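
To make horizontal scaling concrete, here is a minimal sketch of spreading keys across nodes by hashing. The node names, the `user-N` keys, and the helper functions are all made up for illustration; real systems usually use more refined schemes.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Map a key to one node using a stable hash (simple modulo partitioning)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node responsible for them."""
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


# Hypothetical workload: 1000 user records.
keys = [f"user-{i}" for i in range(1000)]
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
# Horizontal scaling: add a fourth node and redistribute.
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

for label, placement in (("3 nodes", three_nodes), ("4 nodes", four_nodes)):
    print(label, {node: len(owned) for node, owned in placement.items()})
```

Adding a node lowers the share of keys each node owns. One caveat: with plain modulo partitioning, most keys change owner when the node count changes, which is why techniques such as consistent hashing are often preferred.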

2. Concurrency
• Concurrency in distributed systems arises from multiple processes
or users interacting with the system simultaneously.
• Unlike single systems, distributed systems must handle these
interactions across different nodes.
• For example, a distributed database might have thousands of users
querying and updating records at the same time.
• Managing concurrency involves addressing issues like resource
conflicts and deadlocks.
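
Both issues can be sketched on a single machine, with threads standing in for nodes (an assumption made purely for illustration). The hypothetical `Account` class and `transfer` function below use a lock per account to prevent conflicting updates, and always acquire locks in the same order so that two opposite transfers cannot deadlock.

```python
import threading


class Account:
    """A shared resource protected by its own lock (hypothetical example)."""

    def __init__(self, account_id: int, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src: Account, dst: Account, amount: int) -> None:
    if src is dst:
        return  # nothing to do, and locking the same lock twice would block
    # Always lock the lower id first. A fixed global order means two
    # opposite transfers can never each hold one lock and wait for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


def run_transfers(a: Account, b: Account, rounds: int = 1000) -> None:
    """Run transfers in both directions at the same time."""

    def a_to_b() -> None:
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a() -> None:
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


alice, bob = Account(1, 500), Account(2, 500)
run_transfers(alice, bob)
print(alice.balance, bob.balance)
```

In a real distributed system the locks would themselves be a distributed service, which is considerably harder to build than `threading.Lock`.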

3. Replication
• Replication involves creating and maintaining multiple copies of
data or services across nodes in a distributed system.
• This improves availability (data stays accessible even if a node
fails), reliability (data can be recovered from other nodes), and
performance (users access the closest replica, reducing latency).
• However, replication introduces challenges in consistency. Updates
to one copy must be reflected across all replicas.
• Real-world examples include distributed databases like Cassandra
and replication in content delivery networks (CDNs) like Akamai.
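
The trade-off between availability and consistency can be seen in a toy in-memory store. This is only a sketch under simplifying assumptions: `Replica` and `ReplicatedStore` are invented names, writes go to every replica that is up, and a replica that was down does not catch up on its own.

```python
class Replica:
    """One copy of the data (hypothetical in-memory stand-in for a node)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: dict[str, int] = {}
        self.alive = True


class ReplicatedStore:
    def __init__(self, replicas: list[Replica]) -> None:
        self.replicas = replicas

    def write(self, key: str, value: int) -> int:
        """Write to every live replica; return how many copies were updated."""
        written = 0
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no live replica available")
        return written

    def read(self, key: str) -> int:
        """Read from the first live replica that has the key."""
        for replica in self.replicas:
            if replica.alive and key in replica.data:
                return replica.data[key]
        raise KeyError(key)


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
store = ReplicatedStore([r1, r2, r3])

store.write("x", 1)             # all three replicas now hold x = 1
r1.alive = False                # one node fails
available = store.read("x")     # still readable: served by r2
store.write("x", 2)             # only r2 and r3 see the update
r1.alive = True                 # r1 comes back without catching up
stale = store.read("x")         # r1 answers first with its old copy
print(available, stale)
```

The final read returns the old value because the recovered replica missed an update, which is exactly the consistency challenge described above.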

4. Fault Tolerance
• Fault tolerance ensures that a distributed system can continue
functioning despite failures, such as a server crash, network
partition, or disk failure.
• This characteristic is achieved through redundancy (having backup
nodes or replicas) and failover mechanisms (switching to a backup
node automatically).
• Fault tolerance is critical for systems like financial applications,
where downtime or data loss can have severe consequences.
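
A failover mechanism can be sketched in a few lines. Everything here is hypothetical: `make_node` builds fake nodes that either answer or raise `NodeDown`, and `call_with_failover` simply tries the nodes in order until one answers.

```python
from typing import Callable

Node = Callable[[str], str]


class NodeDown(Exception):
    """Raised by a node that cannot serve the request."""


def make_node(name: str, healthy: bool) -> Node:
    """Build a fake node that either answers or fails (illustration only)."""

    def handle(request: str) -> str:
        if not healthy:
            raise NodeDown(f"{name} is down")
        return f"{name} handled {request}"

    return handle


def call_with_failover(nodes: list[Node], request: str) -> str:
    """Try the primary first, then each backup in turn."""
    for node in nodes:
        try:
            return node(request)
        except NodeDown:
            continue  # switch to the next node automatically
    raise NodeDown("all nodes failed")


nodes = [make_node("primary", healthy=False), make_node("backup", healthy=True)]
result = call_with_failover(nodes, "GET /balance")
print(result)
```

The caller never learns that the primary was down. Real failover also has to detect failures (typically with timeouts and health checks) rather than rely on an immediate exception.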

5. Transparency
• Transparency in distributed systems aims to hide the complexity of the
system from users, making it appear as a single, cohesive system.
• This means hiding the fact that its processes and resources are physically
distributed across multiple computers.
• A distributed system that presents itself to users as if it were a single
computer system is said to be transparent.
• Distribution transparency is a nice goal, but achieving it is a different
story.
• Examples include cloud storage platforms like Dropbox and Google Drive,
where users access files without worrying about backend details.

Types of Transparencies
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competing users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk


Characteristics of Distributed Systems
6. Coordination and Synchronization
• Coordination ensures that nodes in a distributed system work together
efficiently to achieve a common goal.
• Synchronization is a key aspect, especially when nodes need to agree on
the order of operations or the state of a resource.
• Key mechanisms include:
i. Clock Synchronization
ii. Mutual Exclusion
iii. Leader Election

• These mechanisms are essential for applications like distributed databases
and coordination in multi-node computations.
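
One well-known way for nodes to agree on the order of events without perfectly synchronized physical clocks is a Lamport logical clock. The sketch below is one possible illustration, not necessarily the mechanism this lecture has in mind; the class and method names are chosen for this example.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Jump past the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


node_a, node_b = LamportClock(), LamportClock()

node_a.tick()                       # local event on A
msg_ts = node_a.send()              # A sends a message to B
b_time = node_b.receive(msg_ts)     # B receives it
reply_ts = node_b.send()            # B replies
a_time = node_a.receive(reply_ts)   # A receives the reply

print(msg_ts, b_time, reply_ts, a_time)
```

Each receive gets a larger timestamp than the matching send, so causally related events are always ordered consistently, even though neither node ever reads a wall clock.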

7. Consistency
• Consistency ensures that all nodes in the system reflect the same
data at any given time.
• This is especially challenging in distributed systems due to network
delays and partitioning.
• Types of consistency:
- Strong Consistency: Once an update completes, every later read on any
node sees it (e.g., traditional databases).
- Eventual Consistency: Updates propagate to all nodes over time, and the
replicas converge once updates stop (e.g., NoSQL databases like DynamoDB,
whose reads are eventually consistent by default).
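
Eventual convergence can be sketched with a "last writer wins" merge, one simple reconciliation rule among several. The assumptions here are loud ones: each value carries a timestamp, timestamps from different nodes are comparable, and the `merge` function and sample data are invented for illustration.

```python
Versioned = dict[str, tuple[int, str]]  # key -> (timestamp, value)


def merge(local: Versioned, remote: Versioned) -> Versioned:
    """Keep the newest version of every key (last writer wins)."""
    merged = dict(local)
    for key, version in remote.items():
        # Comparing the whole (timestamp, value) tuple breaks timestamp ties
        # the same way on every node, so the result is order-independent.
        if key not in merged or version > merged[key]:
            merged[key] = version
    return merged


# Two replicas that accepted different updates while out of contact.
replica_1: Versioned = {"cart": (1, "book"), "theme": (5, "dark")}
replica_2: Versioned = {"cart": (3, "book, pen")}

# Each replica merges what the other one has seen.
replica_1_after = merge(replica_1, replica_2)
replica_2_after = merge(replica_2, replica_1)

print(replica_1_after)
print(replica_1_after == replica_2_after)
```

Before the exchange the replicas disagree about `cart`; afterwards both hold the same data. The price is that the older write is silently discarded, which is not acceptable for every application.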

8. Openness
• Openness refers to the ability of a distributed system to
integrate and operate with heterogeneous hardware, software,
and network technologies.
• Open systems rely on standardized protocols and interfaces to
ensure interoperability, scalability, and future-proofing.
• Key principles include Portability, Interoperability and Flexibility.

Advantages of Distributed Systems
Distributed systems offer several benefits:
1. Scalability
• Nodes can be added or removed easily without impacting overall
performance.
2. Fault Tolerance
• Distributed systems use redundancy to handle failures, much as RAID does
across disks.
3. Resource Sharing
• Enables sharing of resources across different users and organizations,
maximizing efficiency.
4. Performance
• Parallel processing across nodes improves performance for large-scale
computations.
5. Cost Efficiency
• A distributed system can use multiple low-cost computers instead of one
expensive machine.
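
The performance point rests on a split, process, and combine pattern, sketched below as a word count. Worker threads stand in for nodes here (an assumption for illustration), so the example shows the structure of the computation rather than a real speedup; the function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: list[str]) -> Counter:
    """Work done independently by one worker on its share of the data."""
    counts: Counter = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines: list[str], workers: int = 4) -> Counter:
    # Split: give each worker every `workers`-th line.
    chunks = [lines[i::workers] for i in range(workers)]
    # Process: each chunk is counted independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Combine: merge the partial results into one answer.
    total: Counter = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


lines = ["nodes share work", "nodes share data", "work work work"]
totals = parallel_word_count(lines)
print(totals.most_common(3))
```

Because no worker needs another worker's chunk, the same pattern carries over to separate machines, with the combine step running wherever the partial results are collected.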
Challenges in Distributed Systems
While distributed systems offer many benefits, they also present unique challenges:
1. Security
• Ensuring secure communication between nodes is critical. Challenges include data
breaches, unauthorized access, and network attacks.
2. Synchronization
• Keeping multiple nodes synchronized is challenging due to differences in local
clocks.
3. Fault Tolerance
• Identifying and recovering from failures is a major challenge.
4. Network Latency
• Communication between nodes is subject to delays caused by the network.
5. Consistency
• Ensuring that all nodes have the same data or state at a given time is difficult.

Applications of Distributed Systems
1. Cloud Computing
• Platforms like AWS, Microsoft Azure, and Google Cloud.
• Provide computing, storage, and networking services worldwide.
2. Big Data and Data Analytics
• Distributed frameworks like Hadoop and Spark process massive datasets.
3. Content Delivery Networks (CDNs)
• Systems like Akamai or Cloudflare distribute web content globally for
low-latency access.
4. Banking and Financial Systems
• ATMs, global stock exchanges, and payment gateways rely on distributed
systems.
5. Scientific Computing
• Simulations and research in fields like genetics, weather prediction, and
physics (e.g., SETI@home, CERN).