DS Unit 1
PROFESSOR EDUCATION
Think of the computers in a distributed system as a team that communicates to achieve a common goal. This teamwork allows for better efficiency and reliability.
1. Concurrency: Concurrency in distributed systems means that multiple tasks can progress
simultaneously. This is achieved through the parallel execution of processes on different
computers, enhancing overall system performance. For example, in a distributed database
system, multiple users can perform queries at the same time without waiting for each
other.
Simple Version: Multiple tasks happen at the same time, like cooking different dishes simultaneously in a kitchen (see the sketches after this list).
2. No Global Clock: Since computers in a distributed system may be spread across the
globe, maintaining a synchronized global clock is impractical. Each computer typically has
its own local clock, and mechanisms such as timestamps are used to order events. This
lack of a global clock makes coordinating actions more complex but is essential for the
independence of distributed components.
Simple Version: Since the computers can be far away, keeping them all in sync with the same time is tricky. They each have their own time, like having watches set to different times.
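To make the concurrency idea concrete, here is a minimal Python sketch; the user names, the queries, and the run_query function are all invented for illustration, and threads on one machine stand in for processes on different computers:

from concurrent.futures import ThreadPoolExecutor
import time

def run_query(user, query):
    # Stand-in for sending a query to a distributed database node.
    time.sleep(0.1)  # simulated network + query latency
    return f"{user}: results for {query!r}"

# Three users issue queries at the same time; nobody waits for the others.
queries = [("alice", "SELECT 1"), ("bob", "SELECT 2"), ("carol", "SELECT 3")]
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(run_query, user, q) for user, q in queries]
    for f in futures:
        print(f.result())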
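And a second small sketch for the no-global-clock point; the machine names and skew values are made up, and the only claim is that wall-clock timestamps taken on different machines cannot be trusted to order events:

import time

# Invented clock skews (in seconds): each machine keeps its own local clock.
SKEW = {"machine_A": 0.0, "machine_B": -2.5}

def local_time(machine):
    # Each machine timestamps events with its own, slightly wrong, clock.
    return time.time() + SKEW[machine]

send_ts = local_time("machine_A")   # A sends a message...
recv_ts = local_time("machine_B")   # ...then B receives it "2.5 s earlier"
print(f"send={send_ts:.2f}  receive={recv_ts:.2f}")
print("receive is stamped before send!" if recv_ts < send_ts
      else "timestamps happen to be ordered")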
2. Cloud Computing: Cloud computing platforms like AWS or Azure exemplify distributed
systems. These services provide users with virtualized computing resources on-demand.
Users can deploy applications on a network of servers, and the system dynamically
allocates resources based on demand. This scalability is crucial for handling varying
workloads efficiently.
Simple Version: Cloud computing is like renting internet space for your data, allowing easy access from anywhere. It's like storing photos in a virtual space without the bother of physical devices.
Advantages:
Applications:
1. Big Data Processing: Distributed systems play a pivotal role in big data processing,
where massive datasets are analyzed to extract valuable insights. Technologies like
Apache Hadoop distribute data processing tasks across a cluster of computers, enabling
the efficient handling of vast amounts of information.
Simple Version: When there's a ton of information to go through, like figuring out what people like on the internet, a distributed system makes it faster and easier (see the sketch after this list).
2. Online Social Networks: Social networking platforms rely on distributed systems to
manage user data, handle interactions, and ensure responsiveness. The distribution of
data and processing across multiple servers enables these platforms to support a large
user base and provide seamless user experiences.
Simple Version: When you post a picture or chat with friends online, a team
of computers across the world makes sure everything runs smoothly.
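As a rough illustration of the big data idea above, here is a toy map-reduce word count; this is not the Hadoop API, and local worker processes stand in for machines in a cluster:

from collections import Counter
from multiprocessing import Pool

def map_count(chunk):
    # "Map" step: each worker counts words in its own chunk of the data.
    return Counter(chunk.split())

if __name__ == "__main__":
    # In Hadoop the chunks would live on different machines; here we just
    # split the data and fan it out to local worker processes.
    chunks = ["big data big insights", "data data everywhere", "big big wins"]
    with Pool(processes=3) as pool:
        partial = pool.map(map_count, chunks)
    # "Reduce" step: merge the partial counts into one result.
    print(sum(partial, Counter()).most_common())

The shape is the same in a real cluster: the heavy "map" work runs where the data lives, and only the small partial counts travel over the network to be merged.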
Distributed Computing vs. Parallel Processing:
Distributed computing and parallel processing share the goal of improving computational
efficiency but differ in their approaches. Distributed computing involves multiple computers
collaborating over a network, often geographically dispersed. In contrast, parallel processing
utilizes multiple processors within a single computer.
Types of Transparency:
Simple Version (Access Transparency): You can get what you need without knowing where it is.
Simple Version (Location Transparency): It doesn't matter where something is; you can use it without worrying about where it's kept.
Simple Version (Concurrency Transparency): Many things happen at the same time, but you don't need to worry about them getting mixed up.
5. Failure Transparency: Failure transparency ensures that users are unaware of failures in
the system. Even if a component fails, the system continues to operate, and users may
not experience any disruption. This is achieved through redundancy, fault tolerance
mechanisms, and error recovery strategies. For instance, in a distributed database, data
replication can ensure that data remains available even if a part of the system fails.
Simple Version: Even if one part has a problem, the rest keeps going, and you might
not even notice.
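Here is a minimal sketch of failure transparency through replication; the replica names, the failure probability, and both functions are invented, and a fixed random seed makes the run deterministic:

import random

random.seed(1)  # fixed seed so the demo below is repeatable
REPLICAS = ["replica_1", "replica_2", "replica_3"]  # invented node names

def read_from(node, key):
    # Simulate a read that fails when the node happens to be down.
    if random.random() < 0.3:
        raise ConnectionError(f"{node} is down")
    return f"value of {key!r} served by {node}"

def transparent_read(key):
    # Try each replica in turn; the caller never sees individual failures.
    for node in REPLICAS:
        try:
            return read_from(node, key)
        except ConnectionError:
            continue  # fail over silently to the next replica
    raise RuntimeError("all replicas unavailable")

# With this seed, replica_1 is down, so replica_2 quietly serves the read.
print(transparent_read("user:42"))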
Example: File Sharing in a Distributed System
Consider a distributed file system where multiple nodes share access to a common set of files. In this scenario, a user on one node may request to read or write to a file that resides on another node. The system must manage file access, handle concurrency control, and ensure data consistency across nodes.
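A toy sketch of the concurrency-control part of this example: a real distributed file system needs locks that work across machines, but a single in-process lock is enough to show the idea of serializing access so the shared file stays consistent (the class and node names are invented):

import threading

class SharedFileStub:
    # Toy stand-in for a shared file: one lock serializes access so that
    # concurrent writers from different "nodes" cannot interleave badly.
    def __init__(self):
        self._lock = threading.Lock()
        self.contents = ""

    def write(self, node, text):
        with self._lock:                 # concurrency control
            self.contents += f"[{node}] {text}\n"

    def read(self):
        with self._lock:                 # readers see a consistent state
            return self.contents

shared = SharedFileStub()
writers = [threading.Thread(target=shared.write, args=(f"node{i}", "hello"))
           for i in range(3)]
for t in writers:
    t.start()
for t in writers:
    t.join()
print(shared.read())  # three whole lines, never interleaved fragments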
Here are the major issues in designing a distributed system, in simpler terms:
1. Heterogeneity:
• In a distributed system, we use different types of networks, operating systems,
computer hardware, and programming languages.
• Internet communication protocols help hide these differences between networks, and middleware (software that sits in between) can handle the other differences.
• Explanation: Imagine a group of friends using different types of phones,
laptops, and speaking different languages. Making them work together
seamlessly can be challenging.
2. Openness:
• A distributed system should be easy to expand. This means we should be able to
add new parts to the system easily.
• We need to create interfaces for different components so that new parts can be
added to the system without much trouble.
• Explanation: Think of a video game that can be updated with new levels.
The game should be designed so that adding these new levels doesn't break
the existing game and is easy to integrate.
3. Security:
• To keep information safe when it travels over a network, we use encryption. This
makes sure that shared resources are protected, and sensitive information
remains a secret.
• Security is crucial to prevent problems like denial of service attacks, which can
disrupt the normal functioning of the system.
• Explanation: Just like you use a secret code with your friend to pass notes so others can't understand, in a distributed system, we use encryption to keep information safe when it travels between different parts.
4. Scalability:
• Scalability means a system's ability to handle more work as it grows. For example,
if we need to add more machines to manage increased work, the system should
handle it easily.
• The design should allow the system to grow by adding more nodes (machines)
and users without causing problems.
• Explanation: Think of a library that can easily accommodate more books and
more readers as it gets popular without becoming overcrowded or slow.
5. Fault Avoidance:
• Fault avoidance is about designing the system components in a way that reduces the chances of errors or faults.
• Using reliable components helps improve the system's reliability, making it less likely to experience issues.
• Explanation: Similar to using strong and reliable materials to build a sturdy
bridge that doesn't break easily, in a distributed system, we design
components to minimize the chances of things going wrong.
6. Transparency:
• Transparency in a distributed system means hiding the complex details from
users.
• Users or programmers don't need to worry about where things are located, how
operations are accessed by other parts, or whether something is replicated or
moved. It's all made simple for them.
• Explanation: When you use a smartphone, you don't need to know the
complicated technology behind it. Similarly, in a distributed system, we
want to hide the complexity from users, making it easy for them to use
without understanding the technical details.
Why is scalability important in the design of a distributed system, and what are some guiding principles?
These principles make sure that a distributed system can grow smoothly, work well, and not cost too much to maintain.
Let's break down the limitations of distributed systems, with examples to illustrate each point:
3. Inconsistent Observations:
• Limitation Explanation: Due to the absence of a global clock and shared memory,
different processes in a distributed system may observe events at different times. This
inconsistency can make it difficult to reason about the state of the system.
• Example: In a collaborative document editing system, if two users edit the same
document simultaneously, the absence of a global clock and shared memory might result
in different servers processing the edits in a different order. As a result, the final state of
the document may vary across different parts of the system.
• Imagine friends collaborating on a document online. They might see some changes instantly but miss others because the system struggles to keep everything perfectly in sync.
In summary, the absence of a global clock and shared memory in distributed systems introduces
complexities that can lead to inconsistencies in observations, difficulties in obtaining a coherent
global state, and challenges in reasoning about system behavior and debugging. These
limitations highlight the need for careful design and coordination mechanisms in distributed
systems.
Causal ordering:
Causal ordering of messages in a distributed system is about making sure messages are received
in the right order based on their cause-and-effect relationships. If one message depends on
another, we want to make sure they are received in the correct sequence.
Problem Scenario: Imagine if we send two messages, let's call them m1 and m2, to someone. If
the person gets m2 before m1, it might cause problems because m2 depends on m1. This can
make the system work incorrectly.
Sending a Message:
• All messages are timestamped and sent out with a list of all timestamps of messages sent
to other processes.
• Locally store the timestamp of the sent message.
Receiving a Message:
• Using the list of timestamps carried by the message, check whether every message sent earlier to this process has already been delivered; if not, buffer the message instead of delivering it.
• Once a message is delivered, update the local clock and stored timestamps with the information it carried, and re-check whether any buffered messages have become deliverable.
Explanation:
So, the Schiper-Eggli-Sandoz algorithm is like a set of rules that helps us make sure messages are
received in the right order, respecting the cause-and-effect relationships between them. It's a
way to keep things organized and prevent problems caused by messages arriving in the wrong
sequence.
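Here is a simplified Python sketch of that buffer-until-deliverable idea. It is not the full Schiper-Eggli-Sandoz bookkeeping (which piggybacks a list of per-destination timestamps on every message); it uses plain vector clocks instead, and the process names and messages are invented:

# Vector clocks: one counter per process, carried on every message.
local_vc = {"P1": 0, "P2": 0}   # the receiver's view of each sender
buffer = []                     # messages waiting for their dependencies

def deliverable(msg_vc, sender):
    # Deliver only if this is the next message from its sender and we have
    # already delivered everything the sender had seen from other processes.
    return (msg_vc[sender] == local_vc[sender] + 1 and
            all(msg_vc[p] <= local_vc[p] for p in msg_vc if p != sender))

def receive(msg_vc, sender, payload):
    buffer.append((msg_vc, sender, payload))
    progress = True
    while progress:                     # deliver everything now unblocked
        progress = False
        for item in list(buffer):
            vc, s, p = item
            if deliverable(vc, s):
                buffer.remove(item)
                local_vc[s] = vc[s]     # update our view of sender s
                print("delivered:", p)
                progress = True

# m2 depends on m1 but arrives first: it waits until m1 is delivered.
receive({"P1": 2, "P2": 0}, "P1", "m2")   # buffered
receive({"P1": 1, "P2": 0}, "P1", "m1")   # delivers m1, then m2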
The Chandy-Lamport Snapshot Algorithm:
The Chandy-Lamport algorithm is like a plan for teams of computers to take a group
picture or snapshot together. They use special markers to make sure everyone knows when
the picture starts. This helps in capturing a clear snapshot of what all the computers are
doing at the same time, making it easier to understand and manage the entire team's
activities in a distributed system.
Marker Sending Rule (for a process P):
1. Record Own State: Process P writes down what it's currently doing (local state).
2. Send Markers: For every connection (channel) from P to other processes:
• P sends a special marker before sending any regular messages.
• (Other processes will see this marker and know something important is
happening.)
Marker Receiving Rule (for a process Q receiving a marker from P):
1. Check Own State: If Q hasn't written down what it's doing (local state) yet:
• Q notes that it hasn't received regular messages on its incoming channel
associated with the marker.
• After noting this, Q follows the marker sending rule.
2. If State Already Recorded:
• If Q has already written down its state:
• Q notes down the messages received on its incoming channel after
writing its state but before getting the marker from P.
Explanation:
• This algorithm helps all processes in a group take a snapshot of what they're doing at the
same time.
• When a process decides to take a snapshot (using a marker), it tells others by sending
markers.
• Other processes, upon receiving a marker, note down what messages they've received,
ensuring a clear snapshot.
• The process continues until all processes have taken a snapshot.
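A single-process toy simulation of the two marker rules above; the process name, channel names, and message values are invented, and a real system would run these rules on separate machines connected by FIFO channels:

class Process:
    def __init__(self, name, in_channels):
        self.name = name
        self.in_channels = in_channels
        self.state = 0                 # local state: a running total
        self.recorded_state = None     # filled in when the snapshot starts
        self.channel_log = {}          # channel -> messages caught in flight

    def on_marker(self, channel):
        if self.recorded_state is None:
            # First marker seen: record own state, record the marker's
            # channel as empty, and start logging the other channels.
            # (A full version would now send markers on outgoing channels.)
            self.recorded_state = self.state
            self.channel_log = {c: [] for c in self.in_channels}
            self.channel_log[channel] = None   # closed: recorded as empty
            print(f"{self.name} records local state = {self.recorded_state}")
        else:
            # Later marker: stop logging; the log is that channel's state.
            print(f"{self.name}: state of {channel} =", self.channel_log[channel])
            self.channel_log[channel] = None

    def on_message(self, channel, amount):
        self.state += amount
        log = self.channel_log.get(channel)
        if log is not None:            # snapshot running, channel still open
            log.append(amount)         # this message was caught in flight

q = Process("Q", in_channels=["P->Q", "R->Q"])
q.on_message("P->Q", 5)   # before the snapshot: only updates local state
q.on_marker("P->Q")       # Q records state = 5 and starts logging R->Q
q.on_message("R->Q", 7)   # caught in flight: goes into R->Q's log
q.on_marker("R->Q")       # prints: Q: state of R->Q = [7]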
Key Points:
• The recorded local states, together with the recorded channel states, form one consistent global snapshot of the whole system.
Termination Detection (Weight-Throwing):
Notations:
• B(DW): It's like a special message for doing work (computation). The 'DW' is how heavy
or important this work is.
• C(DW): This is another special message, but it's for controlling things, not for doing
actual work.
Algorithm:
• The controlling agent starts with weight 1; every other process starts with weight 0.
• When a process sends work, it splits off part of its weight and sends it inside the work message B(DW).
• When a process receives B(DW), it adds the weight DW to its own weight and becomes active.
• When a process finishes and becomes idle, it returns its weight to the controlling agent in a control message C(DW).
• When the controlling agent gets all the weight back (its weight equals 1 again), it knows all the work is done.
Explanation:
• Imagine the processes are like workers, and they send messages to each other.
• Work messages (B) are sent when a process wants to do some work, and control
messages (C) are sent when a process wants to take a break.
• The algorithm makes sure everyone is working together and taking breaks as needed
until all the work is done.
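A toy sketch of this weight-throwing scheme using the B(DW)/C(DW) notation above; the fifty-fifty weight splits and the single worker-plus-sub-worker scenario are invented for illustration:

TOTAL = 1.0
controller_weight = 1.0   # the controlling agent starts with all the weight

def send_work():
    # The controller splits off half its weight and ships it in B(DW).
    global controller_weight
    dw = controller_weight / 2
    controller_weight -= dw
    return dw                           # B(0.5) goes to a worker

def worker(dw):
    # The worker delegates half of its weight to a sub-worker; then both
    # finish, go idle, and return their weights in C(DW) messages.
    delegated = dw / 2                  # B(0.25) to the sub-worker
    return [dw - delegated, delegated]  # C(0.25) and C(0.25)

def on_control(weights):
    # The controller absorbs returned weight; termination when it's whole.
    global controller_weight
    for w in weights:
        controller_weight += w          # C(w) received
    if controller_weight == TOTAL:
        print("termination detected: all weight returned")

dw = send_work()         # controller -> worker: B(0.5)
returned = worker(dw)    # worker and sub-worker go idle
on_control(returned)     # 0.5 + 0.25 + 0.25 == 1.0, so all work is done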
Lamport's Logical Clocks:
Lamport's Logical Clocks are a mechanism for ordering events in a distributed system. The logical clock values assigned to events represent a partial ordering of the events, helping to establish a timeline in a distributed environment.
Lamport's Logical Clocks are like a shared timeline for events in a group of computers.
Each event gets a number, and these numbers help us see the order of events. It's like
giving each action a timestamp so we can understand when things happen in a network of
connected computers.
1. Clock Initialization:
• Each process starts with its logical clock initialized to zero.
2. Event Timestamping:
• Every time a process performs an internal event or sends a message, it increments
its logical clock value.
3. Message Reception:
• Upon receiving a message, a process sets its logical clock value to the maximum
of its current value and the timestamp of the received message, plus one.
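The three rules translate almost directly into code; here is a minimal sketch with an invented two-process event sequence:

class LamportClock:
    def __init__(self):
        self.time = 0                   # rule 1: start at zero

    def tick(self):
        # Rule 2: an internal event or a send increments the clock.
        self.time += 1
        return self.time

    def receive(self, msg_timestamp):
        # Rule 3: on receive, jump past the incoming message's timestamp.
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
ts = p1.tick()          # P1 sends a message stamped 1
p2.tick(); p2.tick()    # P2 performs two local events (clock is now 2)
print(p2.receive(ts))   # max(2, 1) + 1 = 3: the receive is ordered after the send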
In summary, while Lamport's Logical Clocks provide a useful mechanism for establishing a partial
ordering of events in distributed systems, they come with limitations related to the assumptions
made about event execution, message delivery, clock synchronization, and overhead of logical
clock maintenance. In scenarios where these limitations are critical, more advanced clock
synchronization mechanisms and algorithms may be considered.