Introduction to Distributed Systems
2. Architectural Styles
Overview
Distributed systems follow various architectural styles depending on the use
case, with each style offering unique trade-offs in terms of performance,
scalability, and complexity.
Architectural Styles
1. Client-Server Model
Centralized servers provide services to client devices.
Example: Web applications like Gmail.
2. Peer-to-Peer (P2P) Model
All nodes are equal, sharing resources without a central authority.
Example: Blockchain networks like Bitcoin or file-sharing networks like
BitTorrent.
3. Publish-Subscribe Model
Publishers send messages, and subscribers receive messages based on
topics of interest.
Example: Stock trading systems and real-time messaging platforms like
Apache Kafka.
4. Microservices Architecture
Applications are decomposed into smaller, independent services that
communicate via APIs.
Example: Netflix runs its platform as many independently deployable services,
so each service can be scaled on its own and a failure stays contained.
Hybrid Architectures
Combine multiple architectural styles for optimized performance.
Example: IoT systems use client-server for cloud communication and publish-
subscribe for real-time messaging between devices.
Diagrams
1. Client-Server Model
A client sends a request to a central server, which processes the request and
sends back a response.
2. Peer-to-Peer Network
Nodes communicate directly with each other without a central server.
3. Publish-Subscribe Model
Publishers send messages to a broker, which routes them to relevant
subscribers.
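The broker-based routing described above can be sketched in a few lines. This is a minimal in-process illustration, not how Apache Kafka or any real broker is implemented; the `Broker` class, the topic names, and the sample message are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """In-process stand-in for a message broker (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # topic name -> callbacks registered by subscribers
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver message to every subscriber of topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
prices: List[str] = []
news: List[str] = []
broker.subscribe("stocks", prices.append)
broker.subscribe("news", news.append)
broker.publish("stocks", "ACME 101.5")   # reaches only the "stocks" subscriber
```

Publishers and subscribers never reference each other here; they share only the topic name, which is the decoupling the model is meant to provide.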
3. Middleware
Definition
Middleware acts as an intermediary layer, facilitating communication, data
management, and integration between distributed components. It abstracts
underlying complexities and provides common services.
Types of Middleware
1. Message-Oriented Middleware (MOM)
Example: Apache Kafka processes high volumes of real-time streaming data.
2. Database Middleware
Example: JDBC (Java Database Connectivity) bridges Java applications with
databases.
3. Transaction Middleware
Example: IBM MQ ensures reliable message delivery in transactional
systems.
Role in Distributed Systems
Provides APIs for developers to build distributed applications.
Handles tasks like message queuing, transaction management, and
security.
Example: Enabling real-time messaging in chat apps or ensuring
reliable payment processing.
Middleware in IoT Systems
Acts as a bridge between edge devices and the cloud.
Example: Google Cloud IoT Core, which Google retired in August 2023,
integrated device data with cloud services for analysis.
4. Processes
Processes in Distributed Systems
A process is an instance of a program running on a node. Examples include
worker nodes in distributed frameworks like Apache Hadoop or Apache
Spark.
Thread Synchronization
1. Locks
Ensure mutual exclusion when accessing shared resources.
Example: Redis distributed locks manage resource access in clustered
systems.
2. Semaphores
Counting locks that let up to N holders proceed at once, which suits
controlling access to a pool of several resources.
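As a rough illustration of the two primitives, the sketch below uses Python's standard `threading` module inside a single process. It is not a distributed lock such as the Redis example above; the variable names and the limit of two concurrent holders are assumptions chosen for the demonstration.

```python
import threading

counter = 0
counter_lock = threading.Lock()        # mutual exclusion: one holder at a time
pool_slots = threading.Semaphore(2)    # at most two holders at a time (assumed limit)
active = 0
peak_active = 0


def worker() -> None:
    global counter, active, peak_active
    with pool_slots:                   # wait for one of the two slots
        with counter_lock:
            active += 1
            peak_active = max(peak_active, active)
        for _ in range(1000):
            with counter_lock:         # protect the read-modify-write
                counter += 1
        with counter_lock:
            active -= 1


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock keeps the shared counter consistent, while the semaphore caps how many workers are inside the guarded region at once.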
Process Migration
The movement of processes from one node to another for load
balancing or fault tolerance.
Examples:
Virtual Machine migration in cloud platforms.
Kubernetes pods relocating to optimize resource usage.
Diagram:
Visualize process migration from one virtual machine to another.
5. Communication
Remote Procedure Calls (RPCs)
Definition: Enables a program to execute code on a remote server as if
it were local.
Example: gRPC supports efficient communication between
microservices.
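To make the "as if it were local" idea concrete, here is a minimal sketch of a client stub and a server dispatcher that exchange JSON strings. It is not gRPC: the transport is a direct function call standing in for the network, and `PROCEDURES`, `handle_request`, and `Stub` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict

# "Server" side: registry of procedures that may be called remotely.
PROCEDURES: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
}


def handle_request(raw: str) -> str:
    """Unmarshal a request, run the named procedure, marshal the reply."""
    request = json.loads(raw)
    try:
        result = PROCEDURES[request["method"]](*request["params"])
        return json.dumps({"result": result, "error": None})
    except Exception as exc:  # report failures to the caller instead of crashing
        return json.dumps({"result": None, "error": repr(exc)})


class Stub:
    """Client-side proxy: turns method calls into marshalled requests."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport

    def __getattr__(self, method: str) -> Callable[..., Any]:
        def call(*params: Any) -> Any:
            request = json.dumps({"method": method, "params": params})
            reply = json.loads(self._transport(request))
            if reply["error"] is not None:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return call


remote = Stub(transport=handle_request)  # a real system would send bytes over a socket
total = remote.add(2, 3)                 # reads like a local call
```

The caller sees an ordinary method call, while the stub hides marshalling and error propagation. A real RPC framework adds what this sketch omits, such as timeouts, retries, and a schema for the messages.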
Synchronous vs. Asynchronous Messaging
1. Synchronous Messaging
The sender blocks until a response is received.
Example: A conventional HTTP request-response call.
2. Asynchronous Messaging
Non-blocking communication via message queues like RabbitMQ.
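The difference between the two modes can be shown with an in-process queue. This sketch uses Python's `queue.Queue` and a consumer thread as a stand-in for a broker such as RabbitMQ; the `handle` function and the message contents are hypothetical.

```python
import queue
import threading
from typing import List


def handle(request: str) -> str:
    return request.upper()


# Synchronous: the caller waits for handle() to return before continuing.
sync_reply = handle("ping")

# Asynchronous: the caller enqueues messages and moves on; a consumer
# thread drains the queue independently.
mailbox: queue.Queue = queue.Queue()
processed: List[str] = []


def consumer() -> None:
    while True:
        message = mailbox.get()
        if message is None:          # sentinel: no more messages
            break
        processed.append(handle(message))


consumer_thread = threading.Thread(target=consumer)
consumer_thread.start()
for text in ("a", "b", "c"):
    mailbox.put(text)                # returns without waiting for processing
mailbox.put(None)
consumer_thread.join()               # only so the example can inspect the result
```

In the asynchronous half the producer never waits for `handle` to finish, which is what lets senders and receivers run at different speeds.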
Multicast Communication
Sending messages to multiple recipients simultaneously.
Use Case: Video streaming platforms like YouTube Live.
Diagram:
Multicast tree showing a single source distributing data to multiple
recipients.
6. Naming
Importance of Naming
Naming is crucial in distributed systems as it provides a mechanism to
identify and locate resources, processes, or services. Effective naming
ensures scalability, uniqueness, and ease of access.
Challenges
1. Scalability
Maintaining a large directory of names efficiently across nodes.
2. Uniqueness
Preventing name conflicts in globally distributed environments.
Techniques for Effective Naming
1. Flat Naming
Example: MAC addresses, which identify a device without encoding where it
is located.
2. Hierarchical Naming
Example: Domain Name System (DNS) resolves domain names to IP
addresses.
3. Attribute-Based Naming
Example: Using metadata or attributes to locate resources in large datasets.
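Hierarchical resolution, as in the DNS item above, can be sketched as a walk down a tree of labels. The tree contents below are hypothetical (the addresses come from the 192.0.2.0/24 documentation range), and real DNS adds delegation between servers, caching, and record types that this sketch leaves out.

```python
from typing import Any, Dict, Optional

# Hypothetical name tree: each level of nesting is one level of the hierarchy.
ROOT: Dict[str, Any] = {
    "com": {
        "example": {
            "www": "192.0.2.10",
            "mail": "192.0.2.25",
        },
    },
}


def resolve(name: str, root: Dict[str, Any] = ROOT) -> Optional[str]:
    """Walk labels right to left, descending one level per label."""
    node: Any = root
    for label in reversed(name.split(".")):
        if not isinstance(node, dict) or label not in node:
            return None              # unknown name
        node = node[label]
    return node if isinstance(node, str) else None


address = resolve("www.example.com")
```

Each level only needs to know its own children, which is why hierarchical naming scales better than one flat directory.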
Advanced Naming Systems
Blockchain-based naming systems ensure decentralization and immutability.
Example: Ethereum Name Service (ENS).
7. Coordination
Clock Synchronization
1. Logical Clocks
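Lamport timestamps give each event a counter value such that an event that
causally precedes another always gets the smaller timestamp, without
requiring synchronized physical clocks.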
2. Physical Clocks
Network Time Protocol (NTP) aligns system clocks with Coordinated Universal
Time (UTC).
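A logical clock of the Lamport kind fits in a small class. The sketch below follows the usual update rules (increment on a local event or send, take the maximum plus one on receive); the class and method names are hypothetical.

```python
class LamportClock:
    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Local event: advance the counter."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Move past the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                       # a is at 1
stamp = a.send()               # a is at 2; the message carries 2
b.tick()                       # b is at 1
received = b.receive(stamp)    # b becomes max(1, 2) + 1 = 3
```

The receive always gets a larger timestamp than the matching send, so causally related events are ordered correctly. Events with no causal link may still get arbitrary relative timestamps.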
Election Algorithms
1. Bully Algorithm
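Steps:
A node that detects the coordinator has failed sends an election message
to every node with a higher ID.
If no higher node answers, it declares itself coordinator and announces
this to all nodes.
If a higher node answers, that node takes over the election, so the
highest-ID live node ends up as coordinator.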
Use Cases: Distributed databases electing a primary replica.
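2. Ring Algorithm
Steps:
Arrange nodes in a logical ring.
Pass an election message around the ring, with each live node adding its
ID; when the message returns to the initiator, the highest ID becomes leader.
Related example: Token ring networks circulate a token around a ring in the
same way, although to grant fair access to the medium rather than to elect
a leader.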
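Both election algorithms can be simulated without a network. The sketch below assumes that node IDs double as priorities and that the set of live nodes is known to the simulation; `bully_election` and `ring_election` are hypothetical helpers that model only the outcome, not the messages, timeouts, or concurrent elections of a real deployment.

```python
from typing import List, Set


def bully_election(initiator: int, alive: Set[int]) -> int:
    """Return the coordinator chosen when `initiator` starts an election.

    The initiator challenges every higher ID. If none is alive it wins;
    otherwise a responding higher node takes over and repeats the process.
    """
    higher_alive = [node for node in alive if node > initiator]
    if not higher_alive:
        return initiator
    return bully_election(min(higher_alive), alive)


def ring_election(ring: List[int], initiator: int, alive: Set[int]) -> int:
    """Circulate an election message once around the ring, skipping failed nodes.

    Each live node adds its ID; when the message is back at the initiator,
    the highest collected ID is the leader.
    """
    start = ring.index(initiator)
    collected: List[int] = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)]
        if node in alive:
            collected.append(node)
    return max(collected)


live_nodes = {1, 2, 4}                       # nodes 3 and 5 have failed
bully_leader = bully_election(1, live_nodes)
ring_leader = ring_election([1, 2, 3, 4, 5], 2, live_nodes)
```

Under these assumptions both approaches settle on the highest-ID live node. They differ in message pattern: the bully algorithm contacts higher nodes directly, while the ring algorithm forwards one message hop by hop.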
Use Cases of Coordination
Distributed file systems like Google File System require consistent
coordination for file access.
9. Fault Tolerance
Definition
Fault tolerance is the system’s ability to function correctly even when
components fail.
Techniques
1. Replication
Redundancy ensures that if one node fails, others can take over.
Example: Data replication in Hadoop Distributed File System (HDFS).
2. Checkpointing
Periodically saving the system state allows recovery from failures.
Example: Long-running jobs in high-performance computing systems, which
restart from the last saved state instead of from the beginning.
3. Byzantine Fault Tolerance (BFT)
Tolerates malicious faults where nodes provide incorrect information.
Example: Practical Byzantine Fault Tolerance (PBFT) in blockchain systems.
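The checkpointing technique in the list above can be sketched as a job that saves its progress after every item and resumes from the saved state after a crash. The file layout, the `run` function, and the `crash_at` parameter are hypothetical, and the failure is simulated with an exception.

```python
import json
import os
import tempfile
from typing import Dict, List, Optional


def save_checkpoint(state: Dict[str, int], path: str) -> None:
    """Write to a temp file, then rename, so readers never see a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)       # the rename is atomic on POSIX systems


def load_checkpoint(path: str) -> Dict[str, int]:
    if not os.path.exists(path):
        return {"next_item": 0, "total": 0}   # fresh start
    with open(path) as handle:
        return json.load(handle)


def run(items: List[int], path: str, crash_at: Optional[int] = None) -> int:
    """Sum items, checkpointing after each one; resume from the last checkpoint."""
    state = load_checkpoint(path)
    for index in range(state["next_item"], len(items)):
        if index == crash_at:
            raise RuntimeError("simulated failure")
        state["total"] += items[index]
        state["next_item"] = index + 1
        save_checkpoint(state, path)
    return state["total"]
```

A first call such as `run([1, 2, 3, 4], path, crash_at=2)` fails partway through. Calling `run([1, 2, 3, 4], path)` again then continues from item 2 rather than recomputing the first two items. Checkpointing after every item is the simplest policy; real systems usually checkpoint less often to limit I/O cost.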
Diagram
Workflow of Byzantine Fault Tolerance consensus.
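The replica counts behind PBFT-style consensus follow from simple arithmetic: tolerating f Byzantine nodes takes at least 3f + 1 replicas, and a replica acts once 2f + 1 matching messages arrive. The helper names below are hypothetical; the sketch only computes these bounds and does not implement the protocol.

```python
def replicas_needed(f: int) -> int:
    """Minimum number of replicas that tolerates f Byzantine faults: 3f + 1."""
    return 3 * f + 1


def quorum_size(f: int) -> int:
    """Matching messages to wait for when there are 3f + 1 replicas: 2f + 1."""
    return 2 * f + 1


def max_byzantine_faults(n: int) -> int:
    """Largest f that n replicas can tolerate, from n >= 3f + 1."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3
```

With 3f + 1 replicas, any two quorums of 2f + 1 overlap in at least f + 1 replicas, so at least one honest replica is in both. That overlap is what stops two conflicting decisions from both gathering a quorum.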
10. Security
Encryption Techniques
1. Symmetric Encryption
Uses the same key for encryption and decryption.
Example: AES (Advanced Encryption Standard).
2. Asymmetric Encryption
Uses a public key for encryption and a private key for decryption.
Example: RSA (Rivest-Shamir-Adleman).
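The public/private key split can be seen in a toy version of textbook RSA. The primes below are tiny and there is no padding, so this is insecure and only illustrates the arithmetic; production code should use a vetted cryptography library. The specific numbers are assumptions chosen for the example.

```python
# Toy RSA with tiny primes: illustration only, NOT secure.
p, q = 61, 53
n = p * q                    # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: modular inverse (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone holding the public key (e, n) can encrypt."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key (d, n) can decrypt."""
    return pow(ciphertext, d, n)


ciphertext = encrypt(65)
recovered = decrypt(ciphertext)   # equals the original message, 65
```

A symmetric cipher such as AES would instead use one shared secret key for both directions. In practice the two are often combined: the asymmetric step protects a symmetric key, which then encrypts the bulk data.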
Authentication Mechanisms
1. OAuth 2.0
Delegates API access through scoped access tokens, so the application
never handles the user's password.
Example: Third-party logins on websites.
2. Multi-Factor Authentication (MFA)
Combines passwords with additional security factors like biometrics.
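Besides biometrics, a widely used second factor is a one-time code generated from a shared secret. The sketch below implements the HMAC-based one-time password (HOTP) computation from RFC 4226 with the standard library. The `verify_login` helper and its parameters are hypothetical, and a real deployment would also handle counter resynchronization and rate limiting.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password as specified in RFC 4226."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                        # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_login(password_ok: bool, submitted_code: str,
                 secret: bytes, counter: int) -> bool:
    """Both factors must pass: something the user knows and something they hold."""
    expected = hotp(secret, counter)
    return password_ok and hmac.compare_digest(submitted_code, expected)


shared_secret = b"12345678901234567890"   # test secret from RFC 4226, Appendix D
first_code = hotp(shared_secret, 0)
```

With the RFC's test secret, counter 0 yields `755224`, which matches the published test vector. A stolen password alone is not enough here, because the attacker would also need the device holding the secret.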
Secure Communication
1. TLS/SSL Protocols
Encrypt data in transit to prevent eavesdropping.
Example: HTTPS for secure web browsing.
2. Virtual Private Networks (VPNs)
Secure communication over public networks.
Example: Corporate VPNs for remote work.
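On the client side, enabling TLS mostly means creating a context with safe defaults. The sketch below uses Python's standard `ssl` module. Requiring TLS 1.2 or newer is an assumption made for the example, not a requirement stated above.

```python
import ssl

# Client-side context: verifies the server certificate and hostname by default.
context = ssl.create_default_context()

# Assumed policy for this example: refuse protocol versions older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2
```

A client would then wrap a connected socket with `context.wrap_socket(sock, server_hostname=...)`. The data exchanged over that socket is encrypted in transit, which is what HTTPS relies on.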