Distributed system
A distributed system is a collection of independent computers that appear to the users of the system
as a single coherent system. These computers or nodes work together, communicate over a network,
and coordinate their activities to achieve a common goal by sharing resources, data, and tasks.
Example of a Distributed System
A social media platform, for example, can run a centralized computer network at its headquarters,
while the computer systems that any user can access to use its services act as the
autonomous systems in the distributed system architecture.
• Distributed System Software: This software enables the computers to coordinate their
activities and to share resources such as hardware, software, and data.
• Database: It stores the data processed by each node/system of the distributed system
connected to the centralized network.
• Each autonomous system runs a common application and can hold its own data, which is
shared through the centralized database system.
• Middleware services provide functionality that is not present by default in the local systems
or the centralized system, acting as an interface between the two. Using middleware
components, the systems communicate and manage data.
• The data transferred through the database is divided into segments or modules and shared
with the autonomous systems for processing.
• Once processed, the data is transferred back to the centralized system over the network and
stored in the database.
• Resource Sharing: The ability to use any hardware, software, or data anywhere in the
system.
• Openness: Concerned with extensions and improvements to the system (i.e., how openly the
software is developed and shared with others).
• Concurrency: Naturally present in distributed systems: the same activity or functionality can
be performed by separate users in remote locations. Every local system has its own
independent operating system and resources.
• Scalability: The scale of the system grows as more processors are added to serve more
users, improving the responsiveness of the system.
• Fault tolerance: Concerned with the reliability of the system; if there is a failure in hardware
or software, the system continues to operate properly without degrading performance.
• Transparency: Hides the complexity of the distributed system from users and application
programs, so each system keeps its privacy while the whole appears as a single unit.
• Resource Sharing (Autonomous systems can share resources from remote locations).
• It has extensibility, so the system can be extended to more remote locations and grown
incrementally.
• Security poses a problem because data is easily accessible when resources are shared across
multiple systems.
• Network saturation may hinder data transfer; if the network lags, users will have trouble
accessing data.
• In comparison to a single user system, the database associated with distributed systems is
much more complex and challenging to manage.
• If every node in a distributed system tries to send data at once, the network may become
overloaded.
While distributed systems offer many advantages, they also present some challenges that must be
addressed. These challenges include:
• Network latency: The communication network in a distributed system can introduce latency,
which can affect the performance of the system.
• Distributed coordination: Distributed systems require coordination among the nodes, which
can be challenging due to the distributed nature of the system.
• Security: Distributed systems are more vulnerable to security threats than centralized
systems due to the distributed nature of the system.
• Data consistency: Maintaining data consistency across multiple nodes in a distributed system
can be challenging.
Distributed systems and microservices are related concepts but not the same. Let’s break down the
differences:
1. Distributed Systems: A broad class of systems in which independent computers coordinate
over a network, covering concepts such as communication protocols, fault tolerance, and
concurrency control.
2. Microservices: An architectural style that structures an application as a collection of small,
independently deployable services, emphasizing modularity, scalability, and flexibility.
While microservices can be implemented in a distributed system, they are not the same. Microservices
focus on architectural design principles, emphasizing modularity, scalability, and flexibility, whereas
distributed systems encompass a broader range of concepts, including communication protocols,
fault tolerance, and concurrency control, among others.
Types
A Distributed System is a Network of Machines that can exchange information with each other
through Message-passing. It can be very useful as it helps in resource sharing. It enables computers
to coordinate their activities and to share the resources of the system so that users perceive the
system as a single, integrated computing facility.
1. Client/Server Systems
2. Peer-to-Peer Systems
3. Middleware
4. Three-tier
5. N-tier
1. Client/Server Systems: The client-server system is the most basic communication method, in
which the client sends input to the server and the server replies to the client with an output. The
client requests a resource or a task from the server; the server allocates the resource or performs
the task and sends the result back as a response to the client's request. A client-server system can
also be deployed with multiple servers.
3. Middleware: Middleware can be thought of as an application that sits between two separate
applications and provides service to both. It works as a base for interoperability between
applications running on different operating systems. Data can be transferred between applications
by using this service.
4. Three-tier: A three-tier system uses a separate layer and server for each function of a program. In
this model, client data is stored in the middle tier rather than on the client system or on the server,
which simplifies development. It includes a presentation layer, an application layer, and a data layer.
This model is mostly used in web or online applications.
5. N-tier: N-tier is also called a multitier distributed system. An N-tier system can contain any
number of functions in the network, and its structure is similar to three-tier architecture. An N-tier
arrangement arises whenever an application forwards a request to another application to perform a
task or provide a service. N-tier is commonly used in web applications and data systems.
A distributed system, a topic that spans distributed computing and distributed databases, consists
of independent components on different machines that exchange messages to achieve common
goals. As such, the distributed system appears to the end user as a single interface or computer.
Together, the system can maximize resources and information while preventing a failure from
affecting service availability.
1. Distributed Computing System
This type of distributed system is used for computation tasks that require high performance.
Cluster Computing
When input comes from a client to the main computer, the master node divides the task into simple
jobs and sends them to the slave nodes. When the slave nodes finish their jobs, they send the
results back to the master node, which then shows the result to the main computer.
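The master/worker flow above can be sketched with a thread pool standing in for the cluster's nodes. This is an illustrative simplification (threads rather than real machines): the master splits the task into simple jobs, the workers compute them in parallel, and the master combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(job):
    # Each "slave node" computes its simple job (here: summing a data segment).
    return sum(job)

# Master node: split the full task into simple jobs...
data = list(range(1, 101))
jobs = [data[i:i + 25] for i in range(0, len(data), 25)]

# ...farm the jobs out to the worker nodes (threads stand in for nodes)...
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(worker, jobs))

# ...and combine the partial results into the final answer.
total = sum(partial_results)
print(total)  # 5050
```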
1. High Performance
2. Easy to manage
3. Scalable
4. Expandability
5. Availability
6. Flexibility
7. Cost-effectiveness
8. Distributed applications
1. High cost.
5. In distributed systems, it is challenging to provide adequate security because both the nodes
and the connections must be protected.
1. Many web application functionalities, such as security, search engines, database servers,
web servers, proxies, and email, are built on clusters.
• Grid Computing: In grid computing, a subgroup consists of distributed systems, which are
often set up as a network of computer systems; each system can belong to a different
administrative domain and can differ greatly in terms of hardware, software, and network
technology.
Grid Computing
Different departments may have different computers running different operating systems. A control
node is present to help these computers communicate with each other and exchange messages to
get work done.
1. Can solve bigger and more complex problems in a shorter time frame, with easier
collaboration with other organizations and better use of existing equipment.
1. Organizations develop grid standards and practices that serve as guidelines.
3. It is a standards-based solution that can meet computing, data, and network needs.
o Consistent: The system state should remain consistent after the transaction has
completed.
o Durable: Once a transaction commits, the changes are permanent. Transactions are
often constructed as several sub-transactions, jointly forming a nested transaction.
Nested Transaction
Each database can execute its own part of a query, so data retrieved from two different databases is
combined into one single result.
In a company's middleware systems, the component that manages distributed (or nested)
transactions forms the core of application integration at the server or database level. This
component is referred to as the Transaction Processing Monitor (TP monitor). Its main task is to
allow an application to access multiple servers/databases by providing a transactional programming
model. Many requests are sent to the database; ensuring that each request is successfully executed
and that a result is delivered for each request is the work handled by the TP monitor.
• RPC: With Remote Procedure Calls (RPC), a software component sends a request to another
software component by invoking what looks like a local procedure name and retrieving the
data; the object-oriented variant is known as remote method invocation (RMI). An app can
keep different databases for managing different data, and these components can
communicate with each other across different platforms. For example, if you log in on your
Android device and watch a video on YouTube, then open YouTube on your laptop, you will
see the same video in your watch history. RPC and RMI have the disadvantage that the
sender and receiver must both be running at the time of communication.
Purposes
• Captures the application's business rules and implements them in the EAI system, so they
are preserved even if one of the line-of-business applications is replaced by another
vendor's application.
• An EAI system can use a group of applications as a front end, provide only one, consistent
access interface to those applications, and protect users from learning how to use different
software packages.
Pervasive computing, also known as ubiquitous computing, is the next step toward integrating
everyday objects with microprocessors so that they can communicate information. It is a computer
system available anywhere in the company, or as a generally available consumer system, that looks
the same everywhere, with the same functionality, but operates from computing power, storage,
and locations across the globe.
• Home system: Nowadays many devices used in the home are digital, so we can control them
from anywhere, effectively.
Home Systems
• Electronic Health System: Nowadays smart medical wearable devices are also present
through which we can monitor our health regularly.
Electronic Health System
• Sensor Network (IoT devices): Internet-connected devices send data to the client, which acts
according to the data sent to it.
Sensor Network
• Previously, sensor devices could only send data to the client; now they can also store and
process the data to manage it efficiently.
Synchronization in Distributed Systems
Synchronization in distributed systems is crucial for ensuring consistency, coordination, and
cooperation among distributed components. It addresses the challenges of maintaining data
consistency, managing concurrent processes, and achieving coherent system behavior across
different nodes in a network. By implementing effective synchronization mechanisms, distributed
systems can operate seamlessly, prevent data conflicts, and provide reliable and efficient services.
1. Data Integrity: Ensures that data remains consistent across all nodes, preventing conflicts
and inconsistencies.
3. Task Coordination: Helps coordinate tasks and operations among distributed nodes,
ensuring they work together harmoniously.
Synchronization in distributed systems presents several challenges due to the inherent complexity
and distributed nature of these systems. Here are some of the key challenges:
• Scalability:
• Fault Tolerance:
o Node Failures: Handling node failures and ensuring data consistency during recovery
requires robust synchronization mechanisms.
o Data Recovery: Synchronizing data recovery processes to avoid conflicts and ensure
data integrity is complex.
• Concurrency Control:
• Data Consistency:
• Time Synchronization:
o Clock Drift: Differences in system clocks (clock drift) can cause issues with time-
based synchronization protocols.
Types of Synchronization
1. Time Synchronization
Time synchronization ensures that all nodes in a distributed system have a consistent view of time.
This is crucial for coordinating events, logging, and maintaining consistency in distributed
applications.
• Event Ordering: Ensures that events are recorded in the correct sequence across different
nodes.
• Debugging and Monitoring: Accurate timestamps are vital for debugging, monitoring, and
auditing system activities.
Techniques:
• Precision Time Protocol (PTP): Provides higher accuracy time synchronization for systems
requiring precise timing.
• Logical Clocks: Ensure event ordering without relying on physical time (e.g., Lamport
timestamps).
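A Lamport timestamp, mentioned above, can be implemented in a few lines. This is an illustrative sketch: each node keeps a counter, ticks it on local and send events, and on receive jumps ahead of the sender's timestamp, so causally related events are always ordered correctly.

```python
class LamportClock:
    """Logical clock: orders events without relying on physical time."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        # Tick before sending; the returned value is the message timestamp.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # On receive, jump ahead of the sender's timestamp if necessary.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
a.local_event()          # A: t = 1
ts = a.send()            # A: t = 2, message carries timestamp 2
b.local_event()          # B: t = 1
b.receive(ts)            # B: t = max(1, 2) + 1 = 3
print(a.time, b.time)    # 2 3
```

Note that Lamport clocks give a consistent ordering but cannot detect concurrency; vector clocks extend the idea for that purpose.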
2. Data Synchronization
Data synchronization ensures that multiple copies of data across different nodes in a distributed
system remain consistent. This involves coordinating updates and resolving conflicts to maintain a
unified state.
• Consistency: Ensures that all nodes have the same data, preventing inconsistencies.
• Fault Tolerance: Maintains data integrity in the presence of node failures and network
partitions.
• Performance: Optimizes data access and reduces latency by ensuring data is correctly
synchronized.
Techniques:
• Replication: Copies of data are maintained across multiple nodes to ensure availability and
fault tolerance.
• Consensus Algorithms: Protocols like Paxos, Raft, and Byzantine Fault Tolerance ensure
agreement on the state of data across nodes.
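The replication technique above can be sketched as a toy in-memory store (a hypothetical example, not a real database client): every write goes to all replicas, so any single replica can serve reads even if the others fail. Real systems add quorums or consensus to handle writes that only partially succeed.

```python
class Replica:
    """One node holding a full copy of the data."""
    def __init__(self, name):
        self.name = name
        self.store = {}

class ReplicatedStore:
    """Write every update to all replicas; reads can be served by any of them."""
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        for r in self.replicas:            # synchronous write-all (sketch)
            r.store[key] = value

    def read(self, key, preferred=0):
        # Any replica can answer, which keeps data available if others fail.
        return self.replicas[preferred].store[key]

nodes = [Replica(f"node{i}") for i in range(3)]
store = ReplicatedStore(nodes)
store.write("user:42", "alice")
# Even if node0 is lost, the value survives on the remaining replicas.
print(store.read("user:42", preferred=2))  # alice
```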
3. Process Synchronization
• Correctness: Ensures that processes execute in the correct order and interact safely.
• Resource Management: Manages access to shared resources to prevent conflicts and ensure
efficient utilization.
• Scalability: Enables the system to scale efficiently by coordinating process execution across
multiple nodes.
Techniques:
• Mutual Exclusion: Ensures that only one process accesses a critical section or shared
resource at a time (e.g., using locks, semaphores).
• Barriers: Synchronize the progress of processes, ensuring they reach a certain point before
proceeding.
• Condition Variables: Allow processes to wait for certain conditions to be met before
continuing execution.
Synchronization Techniques
Synchronization in distributed systems is essential for coordinating the operations of multiple nodes
or processes to ensure consistency, efficiency, and correctness. Here are various synchronization
techniques along with their use cases:
• Network Time Protocol (NTP): NTP synchronizes the clocks of computers over a network to
within a few milliseconds of each other.
• Precision Time Protocol (PTP): PTP provides higher precision time synchronization (within
microseconds) suitable for systems requiring precise timing.
o Use Case: Ensuring the correct order of message processing in distributed databases
or messaging systems to maintain consistency.
• Replication: Replication involves maintaining copies of data across multiple nodes to ensure
high availability and fault tolerance.
o Use Case: Cloud storage systems like Amazon S3, where data is replicated across
multiple data centers to ensure availability even if some nodes fail.
• Consensus Algorithms: Algorithms like Paxos and Raft ensure that multiple nodes in a
distributed system agree on a single data value or state.
o Use Case: Distributed databases like Google Spanner, where strong consistency is
required for transactions across globally distributed nodes.
o Use Case: NoSQL databases like Amazon DynamoDB, which prioritize availability and
partition tolerance while providing eventual consistency for distributed data.
• Mutual Exclusion: Ensures that only one process can access a critical section or shared
resource at a time, preventing race conditions.
o Use Case: Managing access to a shared file or database record in a distributed file
system to ensure data integrity.
• Barriers: Barriers synchronize the progress of multiple processes, ensuring that all processes
reach a certain point before any proceed.
o Use Case: Parallel computing applications, such as scientific simulations, where all
processes must complete one phase before starting the next to ensure correct
results.
• Condition Variables: Condition variables allow processes to wait for certain conditions to be
met before continuing execution, facilitating coordinated execution based on specific
conditions.
Coordination mechanisms in distributed systems are essential for managing the interactions and
dependencies among distributed components. They ensure tasks are completed in the correct order,
and resources are used efficiently. Here are some common coordination mechanisms:
1. Locking Mechanisms
• Mutexes (Mutual Exclusion Locks): Mutexes ensure that only one process can access a
critical section or resource at a time, preventing race conditions.
• Read/Write Locks: Read/write locks allow multiple readers or a single writer to access a
resource, improving concurrency by distinguishing between read and write operations.
2. Semaphores
• Counting Semaphores: Semaphores are signaling mechanisms that use counters to manage
access to a limited number of resources.
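A counting semaphore can be demonstrated with a small sketch (illustrative only): ten workers compete for a resource pool that allows at most three concurrent users, and we track the peak concurrency to confirm the limit holds.

```python
import threading
import time

pool_slots = threading.Semaphore(3)   # at most 3 concurrent users of the resource
in_use = 0
peak = 0
stats_lock = threading.Lock()

def use_resource():
    global in_use, peak
    with pool_slots:                   # blocks once all 3 slots are taken
        with stats_lock:
            in_use += 1
            peak = max(peak, in_use)
        time.sleep(0.01)               # simulate holding the resource
        with stats_lock:
            in_use -= 1

threads = [threading.Thread(target=use_resource) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)  # never exceeds 3
```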
3. Barriers
4. Leader Election
• Bully Algorithm: A leader election algorithm in which nodes challenge one another and the
reachable node with the highest identifier becomes the leader.
• Raft Consensus Algorithm: A consensus algorithm that includes a leader election process to
ensure one leader at a time in a distributed system.
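The core idea of the bully algorithm can be sketched as follows. This is a simplified, message-free model (real implementations exchange ELECTION/OK/COORDINATOR messages and handle timeouts): the initiator challenges all higher-ID nodes, each responding node takes over the election, and the highest-ID live node wins.

```python
def bully_election(alive_ids, initiator):
    """Return the elected leader: the highest-ID node still alive.

    Each node challenges the nodes with higher IDs; if any responds, it
    takes over the election, so the recursion climbs to the top live node.
    """
    higher = [n for n in alive_ids if n > initiator]
    if not higher:
        return initiator               # nobody outranks the initiator
    return bully_election(alive_ids, min(higher))

# Node 7 (the old leader) has crashed; node 2 notices and starts an election.
alive = [1, 2, 3, 5, 6]
leader = bully_election(alive, initiator=2)
print(leader)  # 6
```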
5. Distributed Transactions
• Two-Phase Commit (2PC): A protocol that ensures all nodes in a distributed transaction
either commit or abort the transaction, maintaining consistency.
• Three-Phase Commit (3PC): An extension of 2PC that adds an extra phase to reduce the
likelihood of blocking in case of failures.
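The two-phase commit protocol described above can be sketched in a few lines (a hypothetical coordinator and participants; real 2PC also logs states durably and handles timeouts): phase 1 collects votes, and phase 2 applies a global commit only if every participant voted yes.

```python
class Participant:
    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes/no on whether this node can commit.
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def finish(self, commit):
        # Phase 2: apply the coordinator's global decision.
        self.state = "committed" if commit else "aborted"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # phase 1: prepare/vote
    decision = all(votes)                         # commit only if all vote yes
    for p in participants:                        # phase 2: commit or abort
        p.finish(decision)
    return decision

ok = two_phase_commit([Participant("db1"), Participant("db2")])
failed = two_phase_commit([Participant("db1"),
                           Participant("db2", will_commit=False)])
print(ok, failed)  # True False
```

A single "no" vote aborts everywhere, which is exactly how 2PC keeps all nodes consistent.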
Time synchronization in distributed systems is crucial for ensuring that all the nodes in the system
have a consistent view of time. This consistency is essential for various functions, such as
coordinating events, maintaining data consistency, and debugging. Here are the key aspects of time
synchronization in distributed systems:
1. Event Ordering: Ensures that events are ordered correctly across different nodes, which is
critical for maintaining data consistency and correct operation of distributed applications.
3. Logging and Debugging: Accurate timestamps in logs are essential for diagnosing and
debugging issues in distributed systems.
1. Clock Drift: Each node has its own clock, which can drift over time due to differences in
hardware and environmental conditions.
2. Network Latency: Variability in network latency can introduce inaccuracies in time
synchronization.
3. Fault Tolerance: Ensuring time synchronization remains accurate even in the presence of
node or network failures.
• Network Time Protocol (NTP):
o Use Case: General-purpose time synchronization for servers, desktops, and network
devices.
• Precision Time Protocol (PTP):
o Description: PTP is designed for higher precision time synchronization than NTP. It is
commonly used in environments where microsecond-level accuracy is required.
• Berkeley Algorithm:
o Description: A centralized algorithm where a master node periodically polls all other
nodes for their local time and then calculates the average time to synchronize all
nodes.
o Use Case: Suitable for smaller distributed systems with a manageable number of
nodes.
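The averaging step of the centralized (Berkeley-style) algorithm is easy to sketch. In this illustrative example, the master polls each node's clock, computes the average (including its own), and sends back the offset each node must apply; real implementations also compensate for network round-trip delay.

```python
def berkeley_sync(master_time, slave_times):
    """Return the clock offset each node must apply (master first).

    The master averages all polled clocks and tells every node how far
    to adjust so that all clocks agree on the average time.
    """
    clocks = [master_time] + slave_times
    average = sum(clocks) / len(clocks)
    return [average - t for t in clocks]

# Master reads 36000 s; one slave is 10 s fast, another 10 s slow.
offsets = berkeley_sync(36000, [36010, 35990])
print(offsets)  # [0.0, -10.0, 10.0]
```

Because offsets (not absolute times) are sent, the adjustment is unaffected by the delay of the correction message itself.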
Time synchronization plays a crucial role in many real-world distributed systems, ensuring consistency,
coordination, and reliability across diverse applications. Here are some practical examples:
1. Google Spanner
Google Spanner is a globally distributed database that provides strong consistency and high
availability. It uses TrueTime, a sophisticated time synchronization mechanism combining GPS and
atomic clocks, to achieve precise and accurate timekeeping across its global infrastructure.
TrueTime ensures that transactions across different geographical locations are correctly ordered and
that distributed operations maintain consistency.
2. High-Frequency Trading Platforms
High-frequency trading platforms in the financial sector require precise time synchronization to
ensure that trades are executed in the correct sequence and to meet regulatory requirements.
Precision Time Protocol (PTP) is often used to synchronize clocks with microsecond precision,
allowing for accurate timestamping of transactions and fair trading practices.
3. Telecommunications Networks
Cellular networks, such as those used by mobile phone operators, rely on precise synchronization to
manage handoffs between base stations and to coordinate frequency usage.
Network Time Protocol (NTP) and PTP are used to synchronize base stations and network elements,
ensuring seamless communication and reducing interference.
Remote Procedure Call (RPC) is a protocol used in distributed systems that allows a program to
execute a procedure (subroutine) on a remote server or system as if it were a local procedure call.
• RPC enables a client to invoke methods on a server residing in a different address space
(often on a different machine) as if they were local procedures.
• The client and server communicate over a network, allowing for remote interaction and
computation.
Remote Procedure Call (RPC) plays a crucial role in distributed systems by enabling seamless
communication and interaction between different components or services that reside on separate
machines or servers. Here’s an outline of its importance:
• Simplified Communication
o Abstraction of Complexity: RPC abstracts the complexity of network communication,
allowing developers to call remote procedures as if they were local, simplifying the
development of distributed applications.
o Resource Sharing: Enables sharing of resources and services across a network, such
as databases, computation power, or specialized functionalities.
The RPC (Remote Procedure Call) architecture in distributed systems is designed to enable
communication between client and server components that reside on different machines or nodes
across a network. The architecture abstracts the complexities of network communication and allows
procedures or functions on one system to be executed on another as if they were local. Here’s an
overview of the RPC architecture:
• Client: The client is the component that makes the RPC request. It invokes a procedure or
method on the remote server by calling a local stub, which then handles the details of
communication.
• Server: The server hosts the actual procedure or method that the client wants to execute. It
processes incoming RPC requests and sends back responses.
2. Stubs
• Client Stub: Acts as a proxy on the client side. It provides a local interface for the client to call
the remote procedure. The client stub is responsible for marshalling (packing) the procedure
arguments into a format suitable for transmission and for sending the request to the server.
• Server Stub: On the server side, the server stub receives the request, unmarshals (unpacks)
the arguments, and invokes the actual procedure on the server. It then marshals the result
and sends it back to the client stub.
3. Marshalling and Unmarshalling
• Marshalling: The process of converting procedure arguments and return values into a byte
stream suitable for transmission over the network.
• Unmarshalling: The reverse process of converting the received byte stream back into the
original data format that can be used by the receiving system.
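Marshalling and unmarshalling can be sketched with JSON as the wire format (an illustrative choice; real RPC frameworks often use binary formats like Protocol Buffers): the client stub packs the call into bytes, and the server stub unpacks it and dispatches to the actual procedure.

```python
import json

def marshal(procedure, args):
    # Client stub: pack the call into a byte stream for transmission.
    return json.dumps({"procedure": procedure, "args": args}).encode("utf-8")

def unmarshal(payload):
    # Server stub: restore the original call from the received bytes.
    message = json.loads(payload.decode("utf-8"))
    return message["procedure"], message["args"]

# Server-side table of procedures the stub can dispatch to.
procedures = {"add": lambda a, b: a + b}

wire_bytes = marshal("add", [2, 3])      # what travels over the network
name, args = unmarshal(wire_bytes)
result = procedures[name](*args)
print(result)  # 5
```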
4. Communication Layer
• Transport Protocol: RPC communication usually relies on a network transport protocol, such
as TCP or UDP, to handle the data transmission between client and server. TCP ensures that
data packets are reliably sent and received, while UDP trades that reliability for lower
overhead.
• Message Handling: This layer is responsible for managing network messages, including
routing, buffering, and handling errors.
5. RPC Framework
• Interface Definition Language (IDL): Used to define the interface for the remote procedures.
IDL specifies the procedures, their parameters, and return types in a language-neutral way.
This allows for cross-language interoperability.
• RPC Protocol: Defines how the client and server communicate, including the format of
requests and responses, and how to handle errors and exceptions.
6. Error Handling
• Timeouts and Retries: Mechanisms to handle network delays or failures by retrying requests
or handling timeouts gracefully.
• Exception Handling: RPC frameworks often include support for handling remote exceptions
and reporting errors back to the client.
7. Security
• Authentication and Authorization: Ensures that only authorized clients can invoke remote
procedures and that the data exchanged is secure.
• Encryption: Protects data in transit from being intercepted or tampered with during
transmission.
In distributed systems, Remote Procedure Call (RPC) implementations vary based on the
communication model, data representation, and other factors. Here are the main types of RPC:
1. Synchronous RPC
• Description: In synchronous RPC, the client sends a request to the server and waits for the
server to process the request and send back a response before continuing execution.
• Characteristics:
o Blocking: The client blocks until the server's response arrives.
o Use Cases: Suitable for simple request-response interactions where the result is
needed before the client can proceed.
2. Asynchronous RPC
• Description: In asynchronous RPC, the client sends a request to the server and continues its
execution without waiting for the server’s response. The server’s response is handled when
it arrives.
• Characteristics:
o Non-Blocking: The client does not wait for the server’s response, allowing for other
tasks to be performed concurrently.
o Use Cases: Useful for applications where tasks can run concurrently and where
responsiveness is critical.
3. One-Way RPC
• Description: One-way RPC involves sending a request to the server without expecting any
response. It is used when the client does not need a return value or acknowledgment from
the server.
• Characteristics:
o Fire-and-Forget: The client sends the request and does not wait for a response or
confirmation.
o Use Cases: Suitable for scenarios where the client initiates an action but does not
require immediate feedback, such as logging or notification services.
4. Callback RPC
• Description: In callback RPC, the client provides a callback function or mechanism to the
server. After processing the request, the server invokes the callback function to return the
result or notify the client.
• Characteristics:
o Asynchronous Response: The client does not block while waiting for the response;
instead, the server calls back the client once the result is ready.
o Use Cases: Useful for long-running operations where the client does not need to
wait for completion.
5. Batch RPC
• Description: Batch RPC allows the client to send multiple RPC requests in a single batch to
the server, and the server processes them together.
• Characteristics:
o Reduced Round-Trips: Grouping requests into one batch cuts down the number of
network round-trips.
o Use Cases: Suitable when many small, independent calls can be grouped and
processed together.
Performance and optimization of Remote Procedure Calls (RPC) in distributed systems are crucial for
ensuring that remote interactions are efficient, reliable, and scalable. Given the inherent network
latency and resource constraints, optimizing RPC can significantly impact the overall performance of
distributed applications. Here’s a detailed look at key aspects of performance and optimization for
RPC:
• Minimizing Latency
o Batching Requests: Group multiple RPC requests into a single batch to reduce the
number of network round-trips.
• Reducing Overhead
o Efficient Serialization: Use efficient serialization formats (e.g., Protocol Buffers, Avro)
to minimize the time and space required to marshal and unmarshal data.
o Request and Response Size: Optimize the size of requests and responses by
including only necessary data to reduce network load and processing time.
• Improving Scalability
o Load Balancers: Use load balancers to distribute RPC requests across multiple
servers or instances, improving scalability and preventing any single server from
becoming a bottleneck.
• Caching
o Result Caching: Cache the results of frequently invoked RPC calls to avoid redundant
processing and reduce response times.
o Local Caching: Implement local caches on the client side to store recent results and
reduce the need for repeated remote calls.
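Client-side result caching can be sketched with a memoizing decorator (an illustrative example; `cached_rpc` and its payload are hypothetical stand-ins for a real remote call): the first call pays the network round-trip, and repeated calls with the same arguments are served locally.

```python
import functools
import time

call_count = 0  # how many times the "network" was actually used

@functools.lru_cache(maxsize=128)
def cached_rpc(key):
    # Stand-in for an expensive remote call; the cache avoids repeating it.
    global call_count
    call_count += 1
    time.sleep(0.01)                  # simulated network round-trip
    return f"value-for-{key}"

first = cached_rpc("user:42")         # goes over the "network"
second = cached_rpc("user:42")        # served from the local cache
print(first == second, call_count)    # True 1
```

In practice the cache also needs an invalidation or expiry policy, since stale results reintroduce the data consistency problems discussed earlier.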