ds2

distributed system note


Unit 1

1. Define Distributed Systems.

o A distributed system is a network of independent computers that communicate
and coordinate to appear as a single system, providing benefits like scalability
and fault tolerance.
2. What is Middleware?
o Middleware is software that facilitates communication and data management
between different applications in a distributed system, simplifying
interactions.
3. List Middleware Standards in Distributed Systems.
o CORBA, COM, Java RMI, SOAP, REST, WebSockets, MQTT.
4. Explain Distributed Computation/Execution.
o Distributed computation involves dividing tasks across multiple nodes for
concurrent execution, enhancing performance and resource utilization.
5. Describe Parallelism/Concurrency in Distributed Systems.
o Concurrency allows multiple tasks to overlap in execution; parallelism
involves actual simultaneous execution across processors or nodes.
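The distinction between concurrency and parallelism can be sketched in Python; this is an illustrative example, not part of the original note:

```python
import concurrent.futures
import time

def work(n):
    # Simulated I/O-bound task; sleeping lets other threads run,
    # so the four calls overlap in time (concurrency).
    time.sleep(0.1)
    return n * n

# Four tasks overlap on a thread pool (concurrency); total time is
# roughly one sleep, not four.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(4)))

print(results)  # [0, 1, 4, 9]
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor turns the overlapped (concurrent) execution into genuinely parallel execution across CPU cores.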

Unit 2

1. Define NTP and its Purpose.
o NTP (Network Time Protocol) synchronizes clocks in a distributed system to
ensure consistent time references.
2. Examples of Problems Requiring Synchronization.
o Database Transactions, Distributed File Systems, Real-Time Systems,
Event Logging.
3. Variations in Scalar Time and Vector Time.
o Scalar Time: A single integer timestamp per event; it cannot distinguish
concurrent events from causally related ones.
o Vector Time: A vector of timestamps, one entry per process, capturing
causality accurately.
4. Message Ordering Paradigms.
o FIFO Ordering: Order maintained per sender.
o Causal Ordering: Respects causal relationships.
o Total Ordering: Global order for all messages.
o Unordered: No order guarantee.
5. Are Physical Clocks Needed for Distributed Systems?
o Yes, for synchronization and event logging, but they require additional
mechanisms (like NTP) to ensure consistency due to clock drift and message
latency.
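A minimal vector-clock sketch in Python shows how vector time captures causality; the update rules follow the standard definition, while the process count and variable names are illustrative:

```python
# Vector clocks for N processes: each process keeps one counter per process.
N = 3

def tick(clock, pid):
    """Local or send event at process pid: increment own component."""
    clock = clock.copy()
    clock[pid] += 1
    return clock

def merge(local, received, pid):
    """On message receipt: element-wise max, then increment own component."""
    merged = [max(a, b) for a, b in zip(local, received)]
    merged[pid] += 1
    return merged

# P0 performs a send event, then P1 merges the received timestamp.
c0 = tick([0, 0, 0], 0)       # [1, 0, 0]
c1 = merge([0, 0, 0], c0, 1)  # [1, 1, 0]
print(c0, c1)
```

Comparing two vectors component-wise then tells you whether one event causally precedes the other or whether the two are concurrent, which a single scalar timestamp cannot express.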

Designing a distributed system introduces several challenges due to its decentralized nature,
where multiple interconnected nodes work together to achieve a common goal. The main
issues and challenges include:

1. Fault Tolerance

 Challenge: A distributed system must continue functioning in the presence of
hardware failures, network issues, or software crashes.
 System Perspective: The system needs to handle partial failures without causing the
whole system to collapse. Techniques like replication, checkpointing, and
redundancy can improve fault tolerance, but they add complexity in terms of
consistency and synchronization.

2. Concurrency and Synchronization

 Challenge: In distributed systems, multiple processes may try to access shared
resources simultaneously.
 System Perspective: Maintaining consistency of shared data becomes complex due
to concurrency. Solutions such as distributed locks, semaphores, and consensus
algorithms (e.g., Paxos, Raft) are often used, but ensuring fairness and preventing
deadlock or livelock is challenging.
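A single-machine analogue of the shared-resource problem can be sketched with Python threads; distributed locks and consensus protocols generalize the same mutual-exclusion idea across nodes:

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(times):
    global counter
    for _ in range(times):
        # Without the lock, read-modify-write updates from different
        # threads can interleave and some increments would be lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 with the lock held
```

In a distributed setting the lock itself must be agreed upon across nodes, which is where coordination services and consensus algorithms such as Paxos or Raft come in.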

3. Consistency and Data Integrity

 Challenge: Achieving consistency across nodes in a distributed system is difficult,
especially in systems with high availability requirements.
 System Perspective: Distributed systems must often choose between strong
consistency (ensuring all nodes have the same data at any point in time) and
availability (the system is always operational), as highlighted by the CAP theorem
(Consistency, Availability, Partition tolerance). Consistency models range from
strong consistency to eventual consistency, with trade-offs in system performance
and complexity.

4. Latency and Bandwidth

 Challenge: The physical separation of nodes introduces latency in communication,
which can significantly impact performance.
 System Perspective: Reducing latency requires optimizing network communication,
possibly through data compression, local caching, and intelligent routing
mechanisms. Managing bandwidth efficiently becomes important in scenarios
involving large data transfers or real-time communication.

5. Scalability

 Challenge: As the system grows, it should continue to perform well without
bottlenecks.
 System Perspective: Horizontal scaling (adding more nodes) and vertical scaling
(adding resources to existing nodes) must be considered. However, as the number of
nodes increases, managing them and ensuring coordination, data consistency, and
fault tolerance becomes more difficult.

6. Security

 Challenge: Securing data transmission between distributed nodes, ensuring
authentication, and preventing malicious attacks.
 System Perspective: Securing a distributed system involves protecting against data
breaches, ensuring data encryption, authentication (e.g., certificates),
authorization (e.g., role-based access), and implementing firewalls or intrusion
detection systems. Managing security in a distributed environment is harder due to
the broader attack surface and multiple points of failure.

7. Network Partitioning

 Challenge: Network partitioning occurs when nodes cannot communicate with each
other due to network failures.
 System Perspective: Handling network partitioning requires making trade-offs
between consistency and availability. Systems like Cassandra or DynamoDB favor
availability over consistency during partitions, while others prioritize consistency at
the expense of availability.

8. Distributed Coordination

 Challenge: Coordinating actions across multiple nodes in a reliable and timely
manner is a complex problem.
 System Perspective: Leader election, distributed agreement, and coordination
services like Zookeeper or Chubby are essential to ensure that multiple nodes can
agree on critical tasks such as updates to shared data or resource management.
However, coordinating actions between many nodes is resource-intensive and can
lead to increased latency or complexity in handling faults.

9. Heterogeneity of Nodes

 Challenge: In large-scale distributed systems, nodes may differ significantly in terms
of hardware, operating systems, or network conditions.
 System Perspective: The system must abstract and hide these differences while
providing a unified interface. This introduces complexity in managing load balancing,
ensuring compatibility, and optimizing for the weakest link in the system.

10. Time Synchronization

 Challenge: Maintaining synchronized clocks across all nodes is crucial for ordering
events.
 System Perspective: Techniques like Network Time Protocol (NTP) or logical
clocks (e.g., Lamport timestamps) are used to synchronize time. However,
achieving perfectly synchronized clocks is impossible, leading to potential issues with
data consistency and causality.
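The Lamport timestamps mentioned above can be sketched in a few lines of Python; this follows the standard rules (increment on local/send events, max-and-increment on receive) but is a minimal sketch, not a full implementation:

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        # Increment, then attach the timestamp to the outgoing message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Advance past both the local clock and the message's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
ts = a.send()    # a.time == 1; ts travels with the message
b.local_event()  # b.time == 1
b.receive(ts)    # b.time == max(1, 1) + 1 == 2
print(a.time, b.time)
```

This guarantees that if event X causally precedes event Y, X's timestamp is smaller, even though the nodes' physical clocks may disagree.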

11. Load Balancing

 Challenge: In distributed systems, distributing the workload evenly across nodes is
essential for optimizing performance.
 System Perspective: Implementing dynamic load balancing techniques (like
hashing, least connections, or round-robin) ensures that no single node is
overwhelmed. However, managing load across geographically distributed nodes adds
complexity due to variations in latency, available resources, and workloads.
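Two of the policies named above, round-robin and least connections, can be sketched as follows; the node names and connection counts are hypothetical:

```python
import itertools

nodes = ["node-a", "node-b", "node-c"]  # hypothetical backend nodes

# Round-robin: hand out requests by cycling through the nodes in order.
rr = itertools.cycle(nodes)
assignments = [next(rr) for _ in range(5)]
print(assignments)  # ['node-a', 'node-b', 'node-c', 'node-a', 'node-b']

# Least connections: route to the node currently handling the fewest requests.
active = {"node-a": 3, "node-b": 1, "node-c": 2}
least = min(active, key=active.get)
print(least)  # node-b
```

Real load balancers combine such policies with health checks and latency measurements, since a geographically distant but idle node may still be a worse choice than a busy nearby one.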

12. Data Distribution and Replication

 Challenge: Distributing and replicating data across nodes for availability and fault
tolerance while maintaining consistency.
 System Perspective: Systems must determine how to partition data (sharding),
replicate data across nodes, and manage replication lag. Choosing between
synchronous and asynchronous replication involves a trade-off between
consistency and latency.
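Hash-based sharding with simple chained replica placement can be sketched as below; the node list, replica count, and key format are assumptions for illustration:

```python
import hashlib

NODES = ["n0", "n1", "n2", "n3"]  # hypothetical shard servers
REPLICAS = 2                      # one primary plus one replica

def shard(key):
    """Hash-partition a key to a primary node index."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return h % len(NODES)

def placement(key):
    """Primary plus the next REPLICAS-1 nodes around the ring."""
    p = shard(key)
    return [NODES[(p + i) % len(NODES)] for i in range(REPLICAS)]

print(placement("user:42"))
```

Writing to all replicas before acknowledging (synchronous replication) keeps copies consistent at the cost of latency; acknowledging after the primary alone (asynchronous) is faster but introduces replication lag.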

13. Monitoring and Debugging

 Challenge: Monitoring a distributed system is more complex than a centralized
system due to the number of components and potential failure points.
 System Perspective: Distributed tracing, logging, and monitoring tools (e.g.,
Prometheus, Grafana, Jaeger) are crucial, but aggregating logs, identifying the root
cause of errors, and monitoring performance across a large-scale system are
challenging.

14. Transparency

 Challenge: A distributed system should present itself as a single unified system to
users and developers, hiding the underlying complexity.
 System Perspective: Achieving transparency in terms of location, access,
migration, replication, and concurrency requires significant design efforts.
However, complete transparency can lead to performance bottlenecks or complexity
in implementation.

In a distributed system with perfectly synchronized physical clocks and a reliable
communication network, recording a global state involves capturing consistent snapshots
across all nodes. Since the clocks are synchronized, you can use the time to coordinate when
to capture the snapshots, ensuring consistency. Below is an outline of an algorithm to record
the global state:

Global State Recording Algorithm

Assumptions:

 Nodes: The system consists of N nodes P_1, P_2, ..., P_N.
 Clock Synchronization: All physical clocks are perfectly synchronized.
 Reliable Communication: Messages are delivered without loss, duplication, or reordering.

Algorithm Outline:

1. Coordinator Selection: Choose a node to act as a coordinator. This node will
initiate the global state recording. Alternatively, if a decentralized approach is
preferred, each node can initiate the recording independently but in a coordinated
manner using the synchronized clocks.
2. Initiation of Global State Recording: The coordinator selects a future time
T_snapshot (say 5 seconds from the current time) at which the global state will be
recorded. It broadcasts this time to all nodes in the system.
3. Local State Recording at Each Node:
o At time T_snapshot, each node records its local state, which includes:
 The values of its local variables.
 The messages that are currently in transit (i.e., sent but not yet received).
4. Message Recording:
o Since the physical clocks are perfectly synchronized, nodes can identify the
messages in transit based on the send and receive times:
 If a node P_i has sent a message to P_j before T_snapshot
but P_j hasn't received it by T_snapshot, this message is in transit.
o Each node will log any messages sent before T_snapshot but not received by that
time.
5. Global State Assembly:
o After recording their local state at T_snapshot, each node sends its local state
information (including messages in transit) to the coordinator.
o The coordinator collects all local states from each node and combines them to form
the global state of the system at time T_snapshot.
6. Global State Completion:
o Once all local states are collected, the coordinator completes the construction of the
global state and can provide it to any other node or process that needs it.

Algorithm Steps Summary:

1. Coordinator selects a snapshot time T_snapshot and broadcasts it to
all nodes.
2. Each node waits for T_snapshot, then records its local state (local
variables and messages in transit).
3. Nodes send their recorded local state to the coordinator.
4. The coordinator combines all local states to form the global state of the system.
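The steps above can be simulated in Python; the timestamps, node states, and messages below are invented purely to illustrate the in-transit rule (sent before T_snapshot, not yet received at T_snapshot):

```python
# Clock-based snapshot simulation: every node records its local state at
# T_snapshot; a message is "in transit" if sent before T_snapshot but
# received at or after it.

T_SNAPSHOT = 100  # snapshot time chosen and broadcast by the coordinator

# (sender, receiver, payload, send_time, receive_time) -- illustrative events
messages = [
    ("P1", "P2", "m1", 95, 98),    # delivered before the snapshot: not recorded
    ("P2", "P3", "m2", 99, 104),   # in transit at T_snapshot: recorded
    ("P1", "P3", "m3", 101, 105),  # sent after the snapshot: not recorded
]

# Local states recorded by each node at T_snapshot (step 3).
local_states = {"P1": {"x": 1}, "P2": {"y": 2}, "P3": {"z": 3}}

# Step 4: identify messages in transit from send/receive times.
in_transit = [m for m in messages if m[3] < T_SNAPSHOT <= m[4]]

# Step 5: the coordinator assembles the global state.
global_state = {"local_states": local_states, "in_transit": in_transit}
print(global_state["in_transit"])  # [('P2', 'P3', 'm2', 99, 104)]
```

Only m2 is part of the snapshot's channel state: m1 was already delivered and m3 did not exist yet at T_snapshot, which is exactly the consistency property the synchronized clocks buy us.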

Properties:

 Consistency: Since the physical clocks are synchronized, all nodes record their local state at
the same global time, ensuring a consistent snapshot.
 Minimal Overhead: With perfectly synchronized clocks, there's no need for complex
message coordination or extra control messages beyond the initial broadcast.

This algorithm leverages synchronized physical clocks to simplify the global state recording
process, reducing the need for coordination overhead and ensuring consistent snapshots
across the system.
