Introduction to Distributed Systems

Distributed systems are networks of interconnected computers that work together to present a unified interface to users, characterized by transparency, scalability, and resource sharing. They have evolved from monolithic architectures to modern cloud computing and IoT systems, employing various architectural styles like client-server and microservices. Key components include middleware for communication, processes for task execution, and mechanisms for fault tolerance and security.

Uploaded by

mwangi junior
Copyright
© All Rights Reserved
Introduction to Distributed Systems

Definition and Overview


Distributed systems consist of multiple interconnected computers that
collaborate to achieve a common goal. These systems appear to users as a
single, unified entity despite being physically separated across locations.
They rely on communication protocols and software to function cohesively,
ensuring resource sharing, fault tolerance, and scalability.
Characteristics of Distributed Systems
1. Transparency
• Access Transparency: Users access local and remote resources through
the same operations, without needing to know where a resource is
physically located or how it is reached. Example: Google Drive allows
file storage and retrieval without exposing server locations.
• Replication Transparency: Users are unaware that multiple copies of
the data exist, because the system handles replication internally.
Example: Content Delivery Networks (CDNs) like Akamai or Cloudflare.
• Failure Transparency: The system recovers from failures with minimal
impact on user experience. Example: Distributed databases like MongoDB
use replica sets with automatic failover.
2. Scalability
A scalable system handles increased workloads by adding more resources
(nodes or servers). For example, distributed databases like Cassandra enable
horizontal scaling by adding nodes to the cluster.
3. Resource Sharing
Resources such as printers, files, and databases are shared across nodes in
the system. Example: Network-attached storage (NAS) provides centralized
file storage accessible to multiple devices.
Real-World Examples
1. Cloud Computing Platforms
Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform
provide distributed computing resources for various applications, such as
web hosting and big data processing.
2. IoT Systems
Smart homes integrate IoT devices like sensors and cameras, enabling
automation and real-time monitoring.
3. Content Delivery Networks (CDNs)
CDNs like Akamai and Cloudflare optimize content delivery by caching data
in multiple geographically distributed locations.
Historical Evolution
1. Transition from monolithic mainframe systems to distributed
client-server architectures.
2. Introduction of peer-to-peer (P2P) networks for file sharing (e.g.,
Napster, which transferred files directly between peers but still relied
on a central index server).
3. Adoption of microservices and serverless computing for modern
distributed systems.
Emerging Trends
1. Edge Computing
Processing data close to the source (e.g., IoT devices) to reduce latency and
bandwidth usage.
2. Serverless Architectures
Function as a Service (FaaS) platforms, like AWS Lambda, run code on
demand without requiring developers to manage servers.

2. Architectural Styles
Overview
Distributed systems follow various architectural styles depending on the use
case, with each style offering unique trade-offs in terms of performance,
scalability, and complexity.
Architectural Styles
1. Client-Server Model
Centralized servers provide services to client devices.
Example: Web applications like Gmail.
2. Peer-to-Peer (P2P) Model
All nodes are equal, sharing resources without a central authority.
Example: Blockchain networks like Bitcoin or file-sharing networks like
BitTorrent.
3. Publish-Subscribe Model
Publishers send messages, and subscribers receive messages based on
topics of interest.
Example: Stock-price feeds in trading systems and event-streaming
platforms like Apache Kafka.
4. Microservices Architecture
Applications are decomposed into smaller, independent services that
communicate via APIs.
Example: Netflix’s architecture enables scalability and fault isolation.
Hybrid Architectures
Combine multiple architectural styles for optimized performance.
Example: IoT systems use client-server for cloud communication and publish-
subscribe for real-time messaging between devices.
Diagrams
1. Client-Server Model
A client sends a request to a central server, which processes the request and
sends back a response.
2. Peer-to-Peer Network
Nodes communicate directly with each other without a central server.
3. Publish-Subscribe Model
Publishers send messages to a broker, which routes them to relevant
subscribers.
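The broker-based routing described above can be sketched in a few lines.
This is a minimal in-process illustration, not how Kafka or any real
broker is implemented; the Broker class, the topic names, and the sample
message are all hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy in-process broker: routes each message to its topic's subscribers."""

    def __init__(self):
        # Maps a topic name to the list of handler callables subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable that receives every message published on topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver message to each subscriber of topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
prices = []   # messages seen by the price subscriber
alerts = []   # messages seen by the alert subscriber
broker.subscribe("stock.prices", prices.append)
broker.subscribe("stock.alerts", alerts.append)

# The publisher only names a topic; it never addresses a subscriber directly.
broker.publish("stock.prices", "ACME 101.5")
```

Only the price subscriber receives the message, which shows the
decoupling: publishers and subscribers know the broker and the topic, not
each other.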

3. Middleware
Definition
Middleware acts as an intermediary layer, facilitating communication, data
management, and integration between distributed components. It abstracts
underlying complexities and provides common services.
Types of Middleware
1. Message-Oriented Middleware (MOM)
Example: Apache Kafka processes high volumes of real-time streaming data.
2. Database Middleware
Example: JDBC (Java Database Connectivity) bridges Java applications with
databases.
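3. Transaction Middleware
Example: Transaction processing monitors like Oracle Tuxedo coordinate
transactions across multiple resources. IBM MQ, although usually classed
as message-oriented middleware, also delivers messages reliably within
transactional systems.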
Role in Distributed Systems
• Provides APIs for developers to build distributed applications.
• Handles tasks like message queuing, transaction management, and
security.
• Example: Enabling real-time messaging in chat apps or ensuring
reliable payment processing.
Middleware in IoT Systems
• Acts as a bridge between edge devices and the cloud.
• Example: AWS IoT Core and Azure IoT Hub integrate device data with
cloud services for analysis. (Google Cloud IoT Core, a similar service,
was retired in 2023.)

4. Processes
Processes in Distributed Systems
A process is an instance of a program running on a node. Examples include
worker nodes in distributed frameworks like Apache Hadoop or Apache
Spark.
Thread Synchronization
1. Locks
Ensure mutual exclusion when accessing shared resources.
Example: Redis-based distributed locks manage resource access in clustered
systems.
2. Semaphores
Counters that generalize locks by allowing up to a fixed number of
holders to access a pool of resources at the same time.
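The difference between a lock and a semaphore is easiest to see within a
single process. The sketch below uses Python's standard threading module;
it illustrates the two primitives only and is not a distributed lock such
as the Redis-based ones mentioned above. The worker function and the pool
size of 2 are assumptions chosen for the example.

```python
import threading
import time

counter = 0
counter_lock = threading.Lock()          # mutual exclusion: one holder at a time

pool = threading.BoundedSemaphore(2)     # at most 2 holders at a time
active = 0                               # workers currently inside the pool
max_active = 0                           # highest concurrency observed
stats_lock = threading.Lock()


def worker():
    global counter, active, max_active
    # Semaphore: limits how many threads use the "pool" concurrently.
    with pool:
        with stats_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)                 # pretend to use a pooled resource
        with stats_lock:
            active -= 1
    # Lock: protects the read-modify-write on the shared counter.
    for _ in range(1000):
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all five threads finish, the counter equals 5000 because every
increment was serialized by the lock, and the observed concurrency inside
the pool never exceeds the semaphore's limit of 2.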
Process Migration
• The movement of processes from one node to another for load
balancing or fault tolerance.
• Examples:
Live migration of virtual machines between hosts in cloud platforms.
Kubernetes rescheduling pods onto other nodes to optimize resource usage
(the pod is recreated on the new node rather than moved while running).
Diagram:
• Visualize process migration from one virtual machine to another.

5. Communication
Remote Procedure Calls (RPCs)
• Definition: An RPC lets a program call a procedure on a remote server
as if it were a local call.
• Example: gRPC supports efficient communication between
microservices.
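To show what "calling a remote procedure as if it were local" looks like,
here is a minimal sketch using XML-RPC from Python's standard library as a
stand-in for gRPC (gRPC itself needs generated stubs and an external
package). The add function and the call_remote_add helper are
hypothetical; the server runs in a background thread on a free local port.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The procedure that will be executed on the server side."""
    return a + b


def call_remote_add(a, b):
    """Start a local RPC server, call add() through a client proxy, shut down."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            # Looks like a local method call, but the arguments are serialized,
            # sent over HTTP, executed by the server, and the result sent back.
            return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()


result = call_remote_add(2, 3)
```

The client code never mentions sockets or serialization; the proxy object
hides them, which is the core idea behind any RPC framework.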
Synchronous vs. Asynchronous Messaging
1. Synchronous Messaging
Blocking communication until a response is received.
Example: HTTP requests.
2. Asynchronous Messaging
Non-blocking communication via message queues like RabbitMQ.
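The contrast between the two styles can be sketched with a plain function
call and an in-memory queue. This is an illustration only: queue.Queue
stands in for a message broker such as RabbitMQ, and the handle function
and message names are hypothetical.

```python
import queue
import threading


def handle(request):
    """Stand-in for the work a server does with one request."""
    return request.upper()


# Synchronous: the caller waits here until the reply is available.
sync_reply = handle("ping")

# Asynchronous: the caller enqueues messages and moves on; a separate
# consumer thread processes them whenever it gets to them.
inbox = queue.Queue()
processed = []


def consumer():
    while True:
        message = inbox.get()
        if message is None:          # sentinel value: no more messages
            break
        processed.append(handle(message))


worker = threading.Thread(target=consumer)
worker.start()
for msg in ("order-1", "order-2"):
    inbox.put(msg)                   # returns immediately, no reply awaited
inbox.put(None)
worker.join()                        # only so the example can inspect results
```

In the asynchronous half the producer never receives a return value; if it
needs one, the reply has to travel back as another message.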
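Multicast Communication
• Sending a single message to a group of recipients simultaneously.
• Use Case: IPTV services and financial market data feeds, which
deliver the same stream to many receivers. (Large public platforms like
YouTube Live typically rely on CDN-based unicast delivery instead.)
Diagram:
• Multicast tree showing a single source distributing data to multiple
recipients.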

6. Naming
Importance of Naming
Naming is crucial in distributed systems as it provides a mechanism to
identify and locate resources, processes, or services. Effective naming
ensures scalability, uniqueness, and ease of access.
Challenges
1. Scalability
Maintaining a large directory of names efficiently across nodes.
2. Uniqueness
Preventing name conflicts in globally distributed environments.
Techniques for Effective Naming
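1. Flat Naming
Names are unstructured identifiers that carry no location information.
Example: MAC addresses or UUIDs identifying devices.
2. Hierarchical Naming
Example: Domain Name System (DNS) resolves domain names to IP
addresses.
3. Attribute-Based Naming
Resources are located by describing their properties rather than by name.
Example: Directory services like LDAP look up entries by attributes.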
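Hierarchical resolution, as in DNS, walks the name from its most general
label to its most specific one. The sketch below is a toy resolver over an
in-memory tree, not a real DNS client; the ZONES data, the host names, and
the addresses (taken from the 192.0.2.0/24 documentation range) are
hypothetical.

```python
# Hypothetical name tree: top-level domain -> domain -> host -> address.
ZONES = {
    "com": {
        "example": {
            "www": "192.0.2.10",
            "mail": "192.0.2.25",
        }
    }
}


def resolve(name):
    """Resolve a dotted name by descending the tree from right to left.

    Returns the address string, or None if the name does not exist or
    refers to an intermediate level rather than a host.
    """
    labels = name.rstrip(".").split(".")
    node = ZONES
    for label in reversed(labels):       # "com" first, then "example", ...
        if not isinstance(node, dict) or label not in node:
            return None
        node = node[label]
    return node if isinstance(node, str) else None


address = resolve("www.example.com")
```

Each level of the tree only needs to know its own children, which is what
lets a hierarchical scheme scale and keeps names unique within a level.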
Advanced Naming Systems
Blockchain-based naming systems ensure decentralization and immutability.
Example: Ethereum Name Service (ENS).

7. Coordination
Clock Synchronization
1. Logical Clocks
Lamport timestamps order events consistently with causality (the
happened-before relation) without requiring real-time synchronization.
2. Physical Clocks
Network Time Protocol (NTP) synchronizes system clocks to Coordinated
Universal Time (UTC).
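The Lamport timestamp rules are short enough to write out: increment the
counter on every local event and on every send, and on receipt jump to one
past the larger of the local counter and the received timestamp. The
LamportClock class below is a minimal sketch with hypothetical method
names.

```python
class LamportClock:
    """Logical clock for one process, following Lamport's rules."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """A local event happened: advance the counter."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, timestamp):
        """On receipt, move past both our own time and the sender's."""
        self.time = max(self.time, timestamp) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                        # a's clock: 1
stamp = a.send()                # a's clock: 2, message carries 2
b.tick()                        # b's clock: 1
received_at = b.receive(stamp)  # b's clock: max(1, 2) + 1 = 3
```

The receive event always gets a larger timestamp than the matching send,
so the order of timestamps never contradicts cause and effect. The
converse does not hold: a smaller timestamp alone does not prove that one
event caused another.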
Election Algorithms
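1. Bully Algorithm
Steps:
• A node that detects the coordinator has failed sends an election
message to every node with a higher ID.
• If no higher node answers, it declares itself coordinator; otherwise
the higher nodes take over the election.
• The highest-ID live node wins and announces itself to all nodes.
Use Cases: Distributed databases electing a primary replica.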
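2. Ring Algorithm
Steps:
• Arrange nodes in a logical ring.
• Pass an election message around the ring, collecting node IDs; the
node with the highest ID becomes the leader.
Related: Token ring networks also pass a token around a ring, but to
grant fair access to a shared resource rather than to elect a leader.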
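The bully algorithm listed above can be simulated in a few lines. This is
a simplified, single-threaded sketch: real implementations exchange
messages with timeouts, whereas here the set of live node IDs is simply
passed in. The bully_election function and the node IDs are hypothetical.

```python
def bully_election(initiator, alive):
    """Return the coordinator elected when `initiator` starts an election.

    `alive` is the set of node IDs that currently respond to messages.
    """
    if initiator not in alive:
        raise ValueError("the initiating node must be alive")
    current = initiator
    while True:
        # The current node contacts every live node with a higher ID.
        higher = sorted(n for n in alive if n > current)
        if not higher:
            # Nobody higher answered, so this node declares itself coordinator.
            return current
        # A higher node answered and takes over the election. (Simplification:
        # we follow one responder instead of running all of them in parallel.)
        current = higher[0]


# Node 5 was the coordinator and has crashed; node 2 notices and starts
# an election among the remaining nodes.
new_coordinator = bully_election(2, alive={1, 2, 3, 4})
```

Whichever node starts the election, the result is the highest-ID node that
is still alive, which is where the algorithm gets its name.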
Use Cases of Coordination
Distributed file systems like Google File System rely on coordination,
for example a master granting leases to replicas, to keep concurrent file
access consistent.

8. Consistency and Replication


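CAP Theorem
The CAP theorem states that when a network partition occurs, a
distributed system must give up either consistency or availability; it
cannot guarantee all three properties below at once.
• Consistency: Every read reflects the most recent write.
• Availability: The system continues to respond to requests even during
failures.
• Partition Tolerance: The system remains functional despite network
partitions.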
Replication Techniques
1. Master-Slave (Primary-Replica) Replication
The master node handles writes, while slaves copy its data and serve read
operations.
Example: MySQL replication in web applications.
2. Peer-to-Peer Replication:
All nodes are equal, sharing replication responsibilities.
Example: Blockchain-based networks like Ethereum.
Consistency Models
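1. Strong Consistency
Ensures that any read issued after an update completes returns the
updated value, regardless of which node serves it.
Example: Traditional relational databases.
2. Eventual Consistency
Allows temporary inconsistencies; replicas converge to the same value
once updates stop arriving.
Example: Amazon DynamoDB reads are eventually consistent by default,
favoring availability, with strongly consistent reads available as an
option.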
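Eventual consistency is easiest to see with a primary that ships its
updates to a replica after a delay. The sketch below is a toy model with
hypothetical Primary, Replica, and sync names; it does not reflect how
DynamoDB or MySQL replicate internally.

```python
class Primary:
    """Accepts writes and remembers which ones the replica has not seen."""

    def __init__(self):
        self.data = {}
        self.log = []                 # updates not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own, possibly stale, copy of the data."""

    def __init__(self):
        self.data = {}

    def apply(self, updates):
        for key, value in updates:
            self.data[key] = value


def sync(primary, replica):
    """Ship pending updates; in a real system this runs in the background."""
    replica.apply(primary.log)
    primary.log.clear()


primary, replica = Primary(), Replica()
primary.write("cart", ["book"])

stale = replica.data.get("cart")      # read before replication: not there yet
sync(primary, replica)
fresh = replica.data.get("cart")      # read after replication: converged
```

A strongly consistent design would make the write wait until the replica
had applied it, trading latency and availability for the guarantee that
the first read could never be stale.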

9. Fault Tolerance
Definition
Fault tolerance is the system’s ability to function correctly even when
components fail.
Techniques
1. Replication
Redundancy ensures that if one node fails, others can take over.
Example: Data replication in Hadoop Distributed File System (HDFS).
2. Checkpointing
Periodically saving the system state allows recovery from failures.
Example: High-performance computing systems.
3. Byzantine Fault Tolerance (BFT)
Tolerates arbitrary faults, including malicious nodes that send incorrect
or conflicting information.
Example: Practical Byzantine Fault Tolerance (PBFT), used in some
blockchain systems, tolerates up to f faulty nodes out of 3f + 1.
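Of the techniques above, checkpointing is the simplest to demonstrate.
The sketch below saves progress to a JSON file after every step and
resumes from the last saved state after a simulated crash. The function
names, the file name, and the summing workload are hypothetical; writing
to a temporary file and renaming it is one common way to avoid leaving a
half-written checkpoint behind.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then atomically swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path, default):
    """Return the last saved state, or default if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, total, crash_at=None):
    """Sum 0..total-1, checkpointing after each step; may crash on purpose."""
    state = load_checkpoint(path, {"next": 0, "sum": 0})
    for i in range(state["next"], total):
        if i == crash_at:
            raise RuntimeError("simulated crash")
        state["sum"] += i
        state["next"] = i + 1
        save_checkpoint(path, state)
    return state["sum"]


checkpoint_path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
try:
    run(checkpoint_path, 10, crash_at=5)     # fails partway through
except RuntimeError:
    pass
resumed_total = run(checkpoint_path, 10)     # continues from step 5
```

The second run does not repeat steps 0 to 4; it picks up at the step
recorded in the checkpoint and still produces the same total as an
uninterrupted run.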
Diagram
• Workflow of Byzantine Fault Tolerance consensus.

10. Security
Encryption Techniques
1. Symmetric Encryption
Uses the same key for encryption and decryption.
Example: AES (Advanced Encryption Standard).
2. Asymmetric Encryption
Uses a public key for encryption and a private key for decryption.
Example: RSA (Rivest-Shamir-Adleman).
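The public-key and private-key roles can be illustrated with textbook RSA
on tiny numbers. This is a teaching sketch only: the primes are far too
small to be secure, there is no padding, and real systems should use a
vetted cryptography library. The primes, exponent, and message value are
assumptions chosen for the example.

```python
# Key generation with deliberately tiny primes (insecure, for illustration).
p, q = 61, 53
n = p * q                    # modulus shared by both keys: 3233
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: modular inverse (Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def encrypt(message, key):
    """Encrypt an integer message smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Recover the message with the private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
```

Anyone holding the public key can produce the ciphertext, but only the
holder of the private exponent can turn it back into the message. With a
symmetric cipher such as AES, the same key would be used in both
directions.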
Authentication Mechanisms
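1. OAuth 2.0
An authorization framework that grants applications token-based, scoped
access to APIs without sharing passwords.
Example: Third-party logins on websites, which typically layer OpenID
Connect on top of OAuth 2.0 for authentication.
2. Multi-Factor Authentication (MFA)
Combines passwords with additional security factors like biometrics or
one-time codes.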
Secure Communication
1. TLS (formerly SSL)
Encrypts data in transit to prevent eavesdropping and tampering. SSL is
the deprecated predecessor of TLS.
Example: HTTPS for secure web browsing.
2. Virtual Private Networks (VPNs)
Secure communication over public networks.
Example: Corporate VPNs for remote work.
