Distributed System 2
• Definition: A distributed system consists of multiple physically separate nodes linked together
over a network. All the nodes in the system communicate with one another and coordinate their
processes as a team. Each node runs a small portion of the distributed operating system software.
The system connects multiple computers through a single channel and uses many central
processors to serve multiple real-time applications and users.
• Types of distributed system
❑ Client/server systems
❑ Peer-to-peer systems
❑ Middleware
❑ Three-tier
❑ N-tier
Client/server systems
• In client-server systems, the client requests a resource or file and the server fetches that
resource. Clients and servers usually communicate over a computer network, so they form part
of a distributed system. A client is in contact with just one server.
Peer-to-peer systems
• The peer-to-peer architecture contains nodes that are equal participants in data sharing. The nodes
communicate with each other as needed to share resources, which is done over a network.
All tasks are divided equally among the nodes.
Middleware
• Middleware can be thought of as an application that sits between two separate applications and
provides services to both. It acts as a base for interoperability between applications running on
different operating systems. Data can be transferred between those applications using this service.
Three-tier
• A three-tier system uses a separate layer and server for each function of a program. Client data is
stored in the middle tier rather than on the client system or on the server, which makes
development easier. It includes a Presentation Layer, an Application Layer, and a Data Layer.
This architecture is mostly used in web and online applications.
N-tier
• N-tier is also called a multitier distributed system. An N-tier system can contain any number of
functions in the network. N-tier systems have a structure similar to three-tier architecture and are
used when one application needs to send a request to another application to perform a task or
provide a service. N-tier is commonly used in web applications and data systems.
Distributed system goals
• Boosting Performance: A distributed system tries to make things faster by dividing a
bigger task into small chunks and processing them simultaneously on different computers,
much like a group of people working together on a project. For example, when we search for
anything on the internet, the search engine distributes the work among several servers, then
gathers the results and displays the webpage within a few seconds.
• Enhancing Reliability: A distributed system ensures reliability by minimizing the impact of
individual computer failures. If one computer fails, the other computers keep the
system running smoothly. For example, when we search for something on social media and one
server has an issue, we can still access photos and posts because the system switches servers
quickly.
• Scaling for the Future: Distributed systems are experts at handling increased demand. They
manage it by incorporating more and more computers into the system; this way, everything
runs smoothly and more users can be handled.
Contd.
• Resourceful Utilization: Resource utilization is one of the most prominent features of a
distributed system. Instead of putting the load on one computer, it distributes tasks among all
available resources. This ensures that work is done by utilizing every resource.
• Fault Tolerance and Resilience: A distributed system comes with backup plans. If any
computer fails, its tasks are redirected to another computer, ensuring minimal delay and a smooth
experience.
• Security and Data Integrity: Distributed systems have special codes and locks to protect data
from others. They use well-known techniques for encryption and authentication to keep
information safe from unauthorized access. Distributed systems prioritize data security the way
you keep your secrets safe.
• Load Balancing: Distributed systems ensure good resource utilization and allow the system to
handle a high volume of data without slowing down. This is achieved by load balancing, which
distributes the load evenly across all available computers, thus preventing single-machine
overload and bottlenecks.
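The round-robin strategy described above can be sketched in a few lines of Python (the class and server names here are illustrative, not part of any real load-balancer API):

```python
from itertools import cycle

# Minimal round-robin load balancer sketch.
class RoundRobinBalancer:
    def __init__(self, servers):
        self._servers = cycle(servers)

    def route(self, request):
        # Each request goes to the next server in turn, spreading load evenly.
        server = next(self._servers)
        return server, request

balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.route(f"req-{i}")[0] for i in range(6)]
print(assignments)  # each node receives exactly two of the six requests
```

Real balancers add health checks and weighting, but the even-distribution idea is the same.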
Distributed System Models
• Following are different models in distributed system
❑ Physical Model
❑ Architectural Model
❑ Fundamental Model
Physical Model
• A physical model is basically a representation of the underlying hardware elements of a distributed
system. It encompasses the hardware composition of a distributed system in terms of computers
and other devices and their interconnections.
• It is primarily used to design, manage, implement and determine the performance of a distributed
system. A physical model majorly consists of the following components:
❖ Nodes
❖ Links
❖ Middleware
❖ Network Topology
❖ Communication Protocols
Architectural Model
• The architectural model of a distributed computing system is the overall design and structure of the
system: how its different components are organized to interact with each other and provide the
desired functionalities. It is an overview of the system and of how development, deployment,
and operations will take place.
• The key aspects of the architectural model are
❖ Client-Server model
❖ Peer-to-peer model
❖ Layered model
❖ Micro-services model
• In the micro-services model, a complex application or task is decomposed into multiple
independent services that run on different servers. Each service performs only a
single function and is focused on a specific business capability. This makes the overall system
more maintainable, scalable, and easier to understand.
Fundamental Model
• The fundamental model in a distributed computing system is a broad
conceptual framework that helps in understanding the key aspects of
the distributed systems. These are concerned with more formal
description of properties that are generally common in all architectural
models. It represents the essential components that are required to
understand a distributed system’s behaviour.
• Four fundamental models are as follows
1. Interaction Model
2. Remote Procedure Call (RPC)
3. Failure Model
4. Security Model
Interaction model
• Distributed computing systems are full of many processes interacting with
each other in highly complex ways. Interaction model provides a framework
to understand the mechanisms and patterns that are used for communication
and coordination among various processes. Different components that are
important in this model are –
• Message Passing – It deals with passing messages that may contain data,
instructions, a service request, or process synchronisation between different
computing nodes. It may be synchronous or asynchronous depending on the
types of tasks and processes.
• Publish/Subscribe Systems – Also known as pub/sub systems. In this model,
a publishing process publishes a message on a topic, and the processes
subscribed to that topic pick it up and act on it. It is especially important
in event-driven architectures.
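The pub/sub pattern above can be sketched as an in-process message bus (the `PubSub` class and topic names are hypothetical, for illustration only):

```python
from collections import defaultdict

# Minimal in-process publish/subscribe sketch.
class PubSub:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber of the topic receives the published message.
        for callback in self._subscribers[topic]:
            callback(message)

bus = PubSub()
received = []
bus.subscribe("orders", received.append)
bus.publish("orders", "order #42 created")
bus.publish("payments", "not seen by the orders subscriber")
print(received)  # only messages on the subscribed topic arrive
```

In a real distributed system the bus itself (e.g. a message broker) runs as a separate service, but the subscribe/publish contract is the same.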
Remote Procedure Call (RPC)
• It is a communication paradigm that allows a process to invoke a procedure or method in a
remote process as if it were a local procedure call. The client process makes a procedure call
using RPC and the message is passed to the required server process using communication
protocols. These message-passing protocols are abstracted away, and the result, once obtained
from the server process, is sent back to the client process to continue execution.
Failure model
• This model addresses the faults and failures that occur in the distributed
computing system. It provides a framework to identify and rectify the faults
that occur or may occur in the system. Fault tolerance mechanisms are
implemented so as to handle failures by replication and error detection and
recovery methods. Different failures that may occur are:
• Crash failures – A process or node unexpectedly stops functioning.
• Omission failures – These involve the loss of messages, resulting in the absence
of required communication.
• Timing failures – The process deviates from its expected time quantum and
may lead to delays or unsynchronised response times.
• Byzantine failures – The process may send malicious or unexpected
messages that conflict with the set protocols.
Security model
• Distributed computing systems may suffer malicious attacks, unauthorised access and data
breaches. Security model provides a framework for understanding the security
requirements, threats, vulnerabilities, and mechanisms to safeguard the system and its
resources. Various aspects that are vital in the security model are –
• Authentication – It verifies the identity of the users accessing the system. It ensures that
only the authorised and trusted entities get access. It involves –
• Password-based authentication – Users provide a unique password to prove their identity.
• Public-key cryptography – Entities possess a private key and a corresponding public key, allowing
verification of their authenticity.
• Multi-factor authentication – Multiple factors, such as passwords, biometrics, or security tokens,
are used to validate identity.
• Encryption – It is the process of transforming data into a format that is unreadable without
a decryption key. It protects sensitive information from unauthorized access or disclosure.
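The password-based authentication mentioned above is usually implemented by storing a salted hash rather than the password itself. A minimal sketch using only the Python standard library (function names are illustrative):

```python
import hashlib
import hmac
import os

# Sketch of password-based authentication: store a salted hash, never the password.
def make_record(password: str):
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison guards against timing attacks.
    return hmac.compare_digest(candidate, digest)

salt, digest = make_record("s3cret")
print(verify("s3cret", salt, digest))   # True
print(verify("wrong", salt, digest))    # False
```

The salt ensures that identical passwords produce different records, and the high iteration count slows down brute-force attacks.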
Design Issues in Distributed Systems
• The following are some of the major design issues of distributed systems
1. Heterogeneity: Heterogeneity is applied to the network, computer hardware, operating system, and
implementation of different developers. A key component of the heterogeneous distributed system
client-server environment is middleware. Middleware is a set of services that enables applications
and end-user to interact with each other across a heterogeneous distributed system.
2. Openness: The openness of the distributed system is determined primarily by the degree to which
new resource-sharing services can be made available to the users. Open systems are characterized
by the fact that their key interfaces are published. It is based on a uniform communication
mechanism and published interface for access to shared resources. It can be constructed from
heterogeneous hardware and software.
3. Scalability: The system should remain efficient even with a significant increase in the number of
users and resources connected. Whether a program runs on 10 or 100 nodes, performance
shouldn't vary. Scaling a distributed system requires consideration of several elements,
including size, geography, and management.
Contd.
4. Security: The security of an information system has three components: confidentiality, integrity,
and availability. Encryption protects shared resources and keeps sensitive information secret when
transmitted.
5. Failure Handling: When faults occur in hardware or in a software program, it may produce
incorrect results or stop before completing the intended computation, so corrective measures
should be implemented to handle such cases. Failure handling is difficult in distributed
systems because failures are partial, i.e., some components fail while others continue to function.
6. Concurrency: There is a possibility that several clients will attempt to access a shared resource at
the same time. Multiple users make requests on the same resources, i.e. read, write, and update. Each
resource must be safe in a concurrent environment. Any object that represents a shared resource in a
distributed system must ensure that it operates correctly in a concurrent environment.
7. Transparency: Transparency ensures that the distributed system is perceived as a single
entity by the users or the application programmers, rather than as a collection of autonomous
systems that cooperate. The user should be unaware of where the services are located, and the
transfer from a local machine to a remote one should be transparent.
Communication in distributed system
• Message passing is the interaction of exchanging messages between at least two processes. The
process which sends the message is known as the sender, and the process which receives the
message is known as the receiver.
• In a message-passing system, we can send a message using a send function and receive a
message using a receive function. The general syntaxes for the send function and receive
function are:
Send(receiver, message)
Receive(sender, message)
• Message passing is possible whenever the processors are in communication. The
communication of a message can be established in a distributed system in two ways:
• Interprocess communication (IPC)
• Remote procedure call (RPC)
Interprocess Communication
• Inter-process communication (IPC) is a mechanism that allows
processes to communicate with each other and synchronize their
actions. The communication between these processes can be seen as a
method of cooperation between them. Processes can communicate
with each other through both:
1. Shared Memory
2. Message Passing
• Communication between processes using shared memory requires
processes to share some variable, and it completely depends on how
the programmer will implement it.
Contd.
• One way of communication using shared memory can be imagined like this: Suppose process1 and process2
are executing simultaneously, and they share some resources or use some information from another process.
Process1 generates information about certain computations or resources being used and keeps it as a record in
shared memory.
• When process2 needs to use the shared information, it will check the record stored in shared memory and take
note of the information generated by process1, and act accordingly.
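The process1/process2 exchange described above can be sketched with Python's `multiprocessing` module, where `Value` places an integer in memory shared between processes (the function and variable names are illustrative; this assumes a POSIX system where `fork` is the default start method):

```python
from multiprocessing import Process, Value

# Sketch of shared-memory IPC: a writer process records a result in shared
# memory; the parent process reads it afterwards.
def process1(shared):
    with shared.get_lock():
        shared.value = sum(range(10))   # record the computation's result: 45

record = Value("i", 0)                  # 'i' = a C int held in shared memory
writer = Process(target=process1, args=(record,))
writer.start()
writer.join()
print(record.value)                     # the reader sees what the writer stored
```

Note that the programmer is responsible for synchronization (here, `get_lock()`), exactly as the text says: shared-memory communication depends entirely on how it is implemented.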
Message Passing Method
• In this method, processes communicate with each other without using
any kind of shared memory. If two processes p1 and p2 want to
communicate with each other, they proceed as follows:
1. Establish a communication link (if a link already exists, there is no need to
establish it again).
2. Start exchanging messages using basic primitives.
We need at least two primitives:
– send(message, destination) or send(message)
– receive(message, host) or receive(message)
Contd.
• The message size can be fixed or variable. If it is fixed, it is easy for the OS designer but
complicated for the programmer; if it is variable, it is easy for the programmer but complicated
for the OS designer. A standard message has two parts: header and body.
• The header part stores the message type, destination id, source id, message length, and control
information. The control information covers things like what to do if the buffer space runs out,
the sequence number, and the priority. Generally, messages are sent in FIFO order.
Message Passing through Communication Link
• Message passing is carried out via a link. A link has some capacity that determines the number of
messages that can reside in it temporarily; for this, every link has a queue associated with it, which
can be of zero capacity, bounded capacity, or unbounded capacity. With zero capacity, the sender
waits until the receiver informs it that the message has been received. In the non-zero-capacity
cases, a process does not know whether a message has been received after the send operation;
for this, the sender must communicate with the receiver explicitly. Implementation of the link
depends on the situation: it can be either a direct communication link or an indirect
communication link.
• Direct communication links are implemented when the processes use a specific process identifier
for communication, but it is hard to identify the sender ahead of time (for example, a print server).
• Indirect communication is done via a shared mailbox (port), which consists of a queue of
messages. The sender leaves messages in the mailbox and the receiver picks them up.
Synchronous vs Asynchronous Transmission
• Synchronous Transmission: In synchronous transmission, data is sent in the form of blocks or
frames. This transmission is full-duplex. Synchronization between sender and receiver is
compulsory, and there is no time gap between data. It is more efficient and more reliable than
asynchronous transmission for transferring large amounts of data.
• Examples: Chat rooms, telephonic conversations, video conferencing
Contd.
• Asynchronous Transmission: In asynchronous transmission, data is
sent in the form of bytes or characters. This transmission is half-duplex.
Start bits and stop bits are added to the data. It does not require
synchronization.
• Examples: Email, forums, letters
Message Passing through Exchanging the Messages.
• Synchronous and Asynchronous Message Passing:
• A process that is blocked is one that is waiting for some event, such as a resource becoming
available or the completion of an I/O operation. IPC is possible between processes on the same
computer as well as between processes running on different computers, i.e., in a
networked/distributed system.
• In both cases, the process may or may not be blocked while sending or receiving a
message, so message passing may be blocking or non-blocking. Blocking is considered
synchronous: a blocking send means the sender is blocked until the message is received by the
receiver, and a blocking receive means the receiver blocks until a message is available.
• Non-blocking is considered asynchronous: a non-blocking send lets the sender send the message
and continue, and a non-blocking receive returns either a valid message or null.
RPC
• RPC is an effective mechanism for building client-server systems that are distributed. RPC
enhances the power and ease of programming of the client/server computing concept.
• It's a protocol that allows one program to request a service from a program on another
computer in a network without having to know the network's details. The program that makes
the request is called the client, and the program that provides the service is called the server.
• The calling parameters are sent to the remote process during a Remote Procedure Call, and the
caller waits for a response from the remote procedure.
• There are 5 elements used in the working of RPC:
A. Client
B. Client Stub
C. RPC Runtime
D. Server Stub
E. Server
Contd.
• Client: The client process initiates RPC. The client makes a standard
call, which triggers a correlated procedure in the client stub.
Contd.
• Client Stub: Stubs are used by RPC to achieve semantic transparency.
The client calls the client stub, which performs the following tasks:
• First, when it receives a request from the client, it packs (marshals) the
parameters and the required specification of the remote/target procedure
into a message.
• Second, upon receiving the result values after execution, it unpacks
(unmarshals) those results and sends them to the client.
RPC Runtime:
• The RPC runtime is in charge of message transmission between client
and server via the network. Retransmission, acknowledgment, routing,
and encryption are all tasks performed by it.
• On the client side, it receives the result values in a message from the
server side and then forwards them to the client stub, whereas on the
server side, the RPC Runtime gets the result message from the server
stub and then forwards it to the client machine.
• It also accepts and forwards client machine call request messages to
the server stub.
Server Stub & Server
• The server stub performs the following tasks:
First, it unpacks (unmarshals) the call request message received
from the local RPC Runtime and makes a regular call to invoke the
required procedure in the server.
Second, when it receives the server's procedure execution result,
it packs (marshals) it into a message and asks the local RPC Runtime
to transmit it to the client stub, where it is unpacked.
• After receiving a call request from the client machine, the server stub passes
it to the server. The execution of the required procedure is made by the
server and finally, it returns the result to the server stub so that it can be
passed to the client machine using the local RPC Runtime.
RPC Process
• The client, the client stub, and one instance of RPC Runtime are all
running on the client machine.
• A client initiates a client stub process by giving parameters as normal.
The client stub acquires storage in the address space of the client.
• At this point, the user can access RPC by using a normal Local
Procedural Call. The RPC runtime is in charge of message
transmission between client and server via the network.
Retransmission, acknowledgment, routing, and encryption are all tasks
performed by it.
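The five elements and the marshalling steps described above can be sketched in-process, with a byte string standing in for the network (all names here are illustrative, not a real RPC library's API):

```python
import json

# In-process sketch of the five RPC elements; the "network" is a byte string.
def server_procedure(a, b):          # Server: the remote procedure itself
    return a + b

def client_stub(name, *args):
    request = json.dumps({"proc": name, "args": args}).encode()   # marshal
    reply = rpc_runtime(request)                                  # transmit
    return json.loads(reply.decode())["result"]                   # unmarshal

def server_stub(request: bytes) -> bytes:
    call = json.loads(request.decode())                           # unmarshal
    result = PROCEDURES[call["proc"]](*call["args"])              # local call
    return json.dumps({"result": result}).encode()                # marshal reply

def rpc_runtime(message: bytes) -> bytes:
    # Stand-in for message transmission over the network.
    return server_stub(message)

PROCEDURES = {"add": server_procedure}

print(client_stub("add", 2, 3))  # the client sees an ordinary call returning 5
```

The client only ever calls `client_stub`, so the marshalling and transmission are invisible to it — the semantic transparency the stubs exist to provide.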
Contd.
• On the server side, values are returned to the server stub, after the
completion of the server operation, which then packs (which is also
known as marshaling) the return values into a message. The transport
layer receives a message from the server stub.
Contd.
• Step 3: Creating Stub and Skeleton objects from the
implementation class using rmic
• The rmic tool is used to invoke the RMI compiler that creates the Stub
and Skeleton objects. Its prototype is rmic classname. For the above
program, the following command needs to be executed at the command
prompt:
rmic SearchQuery
• Step 4: Start the rmiregistry
Start the registry service by issuing the following command at the
command prompt:
start rmiregistry
Contd.
• Step 5: Create and execute the server application program
The next step is to create the server application program and execute it
on a separate command prompt.
• PRAM algorithms are mostly theoretical but can be used as a basis for
developing an efficient parallel algorithm for practical machines and
can also motivate building specialized machines.
PRAM Architecture Model
• A PRAM consists of the following modules:
1. It consists of a control unit, global memory, and an unbounded set of similar processors, each
with its own private memory.
2. An active processor reads from global memory, performs required computation, and then
writes to global memory.
3. Therefore, if there are N processors in a PRAM, then N number of independent operations
can be performed in a particular unit of time.
Models of PRAM
• While accessing the shared memory, there can be conflicts while performing the
read and write operation (i.e.), a processor can access a memory block that is
already being accessed by another processor. Therefore, there are various
constraints on a PRAM model which handles the read or write conflicts. They are:
• EREW: also called Exclusive Read Exclusive Write is a constraint that doesn’t
allow two processors to read or write from the same memory location at the same
instance.
• CREW: also called Concurrent Read Exclusive Write is a constraint that allows all
the processors to read from the same memory location but are not allowed to write
into the same memory location at the same time.
• ERCW: also called Exclusive Read Concurrent Write, is a constraint that allows all
the processors to write to the same memory location but does not allow them to read
the same memory location at the same time.
• CRCW: also called Concurrent Read Concurrent Write is a constraint that allows
all the processors to read from and write to the same memory location parallelly.
Example
• Suppose we wish to add an array consisting of N numbers. We
generally iterate through the array and use N steps to find the sum of
the array.
• So, if the size of the array is N and for each step, let’s assume the time
taken to be 1 second. Therefore, it takes N seconds to complete the
iteration.
• The same operation can be performed more efficiently using a CRCW
model of a PRAM. Let there be N/2 parallel processors for an array of
size N; then, in the accompanying illustration, the time taken for the
execution is 4 seconds, which is less than the N = 6 seconds of the
sequential version.
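The pairwise-summation idea can be simulated sequentially: with N/2 processors, each "parallel step" halves the number of partial sums, so N numbers need about log2(N) steps instead of N (a sketch; function and variable names are illustrative):

```python
# Sequential simulation of the PRAM pairwise-sum idea.
def pram_sum(values):
    values = list(values)
    steps = 0
    while len(values) > 1:
        # One parallel step: every processor adds one disjoint pair at once.
        pairs = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:          # odd element carries over unchanged
            pairs.append(values[-1])
        values = pairs
        steps += 1
    return values[0], steps

total, steps = pram_sum(range(1, 9))   # 8 numbers
print(total, steps)  # 36 in 3 parallel steps (log2 8), versus 7 sequential adds
```

Each step here is conflict-free (every processor touches a disjoint pair), so even the restrictive EREW model suffices for this particular algorithm.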
Contd.
Message Oriented Vs. Stream Oriented Communication
Process vs. Thread
Process | Thread
A system call is involved in it. | No system call is involved; it is created using APIs.
The process has its own Process Control Block, Stack, and Address Space. | A thread has its parent's PCB, its own Thread Control Block and Stack, and a common Address Space.
Changes to the parent process do not affect child processes. | Since all threads of the same process share the address space and other resources, any changes to the main thread may affect the behavior of the other threads of the process.
Process Migration
• Process migration is a particular type of process management by
which processes are moved from one computing environment to
another.
• From the example above, for every message, a process needs to send
before the other receives, so timestamp(send) < timestamp(receive) for
every message. Within the same process, for instance at process 2, we
know that the receiving of m1 happens before the sending of m3, hence
timestamp(receive_1) < timestamp(send_3). However, by construction
timestamp(receive_1) < timestamp(send_2), yet nothing can be said
about the ordering of the sending of m2 and the receiving of m1.
Contd.
• Hence, Lamport's logical timestamps obey the rule of
causality but cannot distinguish between causal and
concurrent events:
If two events follow the happens-before relationship, i.e.
E1 -> E2, then timestamp(E1) < timestamp(E2); but
timestamp(E1) < timestamp(E2) implies either (E1 -> E2) or
(E1 and E2 are concurrent)
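The Lamport rules — increment on every local/send event, and on receive jump past the incoming timestamp — can be sketched in a few lines (the class name is illustrative):

```python
# Minimal Lamport logical clock sketch.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):                  # local event or send: increment
        self.time += 1
        return self.time

    def send(self):
        return self.tick()           # the timestamp travels with the message

    def receive(self, msg_time):     # receiver jumps past the sender's clock
        self.time = max(self.time, msg_time) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
t_send = p1.send()           # event at process 1
t_recv = p2.receive(t_send)  # causally later event at process 2
print(t_send, t_recv)        # timestamp(send) < timestamp(receive)
```

The `max(...) + 1` in `receive` is exactly what guarantees timestamp(send) < timestamp(receive) for every message.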
Vector Clock
• The vector clock tries to overcome the shortcoming of the logical
clock. Suppose there are N processes in the system; each process
uses a vector of integer clocks where each vector has N elements.
We denote the vector maintained by process i as Vi[1…N]; the j-th
element of the vector at process i, Vi[j], is i's knowledge of the
latest events at process j.
• Vector Clock algorithm to assign and adjust vector timestamps in each
process:
1. On an internal or send event at process i, it increments only the
i-th element of its vector clock.
2. When a process sends a message, it attaches its vector clock to
the message.
3. When a process j receives a message from process i, it increments
the j-th element of its own vector clock and updates the other
elements from the message's vector V_i:
o V_j[j] = V_j[j] + 1
o V_j[k] = max(V_i[k], V_j[k]) for k ≠ j
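The three rules can be implemented directly (a sketch for N processes indexed 0..N-1; the class name is illustrative):

```python
# Sketch of the vector clock rules above.
class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def local_event(self):
        self.v[self.pid] += 1            # rule 1: bump own component

    def send(self):
        self.local_event()               # rule 2: attach clock to the message
        return list(self.v)

    def receive(self, msg_vector):
        self.v[self.pid] += 1            # rule 3: bump own component...
        for k, t in enumerate(msg_vector):
            if k != self.pid:
                self.v[k] = max(self.v[k], t)   # ...and merge the rest
        return list(self.v)

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
msg = p0.send()           # p0's clock: [1, 0]
after = p1.receive(msg)   # p1's clock: [1, 1] — p1 now knows about p0's event
print(msg, after)
```

Unlike a Lamport clock, comparing these vectors element-wise reveals whether two events are causally ordered or concurrent.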
Contd.
• In the example above, node 1 updates its vector to [1,0,0] to
represent the event of sending at A before sending to node 2.
Upon receiving, node 2 records the event of receiving at C.
When node 2 receives another message at F, it again records
the receive event and then adjusts the other elements in its
vector.
Contd.
• Using vector clocks, we define the relationships between two events a
and b as follows: a -> b (a causally precedes b) if and only if
V(a) ≤ V(b) element-wise and V(a) ≠ V(b); a and b are concurrent if
neither V(a) ≤ V(b) nor V(b) ≤ V(a).
• The Bully Algorithm – This algorithm applies to systems where every process can send a message
to every other process in the system.
Bully Algorithm
• Suppose process P sends a message to the coordinator.
1. If the coordinator does not respond within a time interval T, it is
assumed that the coordinator has failed.
2. Process P then sends an election message to every process with a higher
priority number.
3. It waits for responses; if no one responds within time interval T, process P
elects itself as coordinator.
4. It then sends a message to all processes with lower priority numbers
announcing that it has been elected as their new coordinator.
5. However, if an answer is received within time T from some other process Q:
o Process P waits for a further time interval T' to receive a message from
Q announcing that Q has been elected coordinator.
o If Q does not respond within time interval T', it is assumed to have
failed and the algorithm is restarted.
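Ignoring timeouts and message delays, the outcome of the election above can be sketched as a simple function over which processes are alive (names and the `alive` map are illustrative; a real implementation exchanges actual messages with timers):

```python
# Simplified bully election sketch: 'alive' maps priority number -> up/down.
def bully_election(initiator, alive):
    # Election messages go only to processes with higher priority numbers.
    higher = [p for p in alive if p > initiator and alive[p]]
    if not higher:
        return initiator          # no higher process responds: elect self
    return max(higher)            # the highest live process "bullies" the rest

alive = {1: True, 2: True, 3: True, 4: True, 5: False}  # coordinator 5 crashed
print(bully_election(2, alive))   # 4 becomes the new coordinator
```

The name "bully" comes from this last line: whichever live process has the highest number always takes over, regardless of who started the election.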
The Ring Algorithm
• This algorithm applies to systems organized as a ring (logically or physically). We assume that
the links between processes are unidirectional and that every process can send messages only to
the process on its right. The data structure this algorithm uses is the active list, a list holding the
priority numbers of all active processes in the system.
1. If process P1 detects a coordinator failure, it creates a new active list, which is initially
empty. It sends an election message to its neighbour on the right and adds the number 1 to
its active list.
2. If process P2 receives an election message from the process on its left, it responds in one
of three ways:
o If the received message does not already contain 2 in its active list, P2 adds 2 to the
active list and forwards the message.
o If this is the first election message it has received or sent, P2 creates a new active list
with the numbers 1 and 2. It then sends election message 1 followed by election message 2.
o If process P1 receives its own election message back, the active list for P1 now contains
the numbers of all the active processes in the system. P1 then picks the highest
priority number from the list and elects that process as the new coordinator.
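The circulation of the election message around the ring can be sketched as follows (a simulation with illustrative names; real implementations pass the growing active list in actual messages):

```python
# Simplified ring election sketch: the election message circulates clockwise,
# collecting the numbers of live processes; the highest becomes coordinator.
def ring_election(start, processes, alive):
    n = len(processes)
    active = []                          # the active list carried by the message
    i = processes.index(start)
    while True:
        p = processes[i % n]
        if alive[p]:
            if p in active:              # message returned to its originator
                return max(active)       # highest collected number wins
            active.append(p)             # add own number and forward right
        i += 1                           # dead processes are simply skipped

ring = [3, 1, 4, 2]                      # logical clockwise order
alive = {3: True, 1: True, 4: False, 2: True}
print(ring_election(3, ring, alive))     # 3 wins: highest live number on the ring
```

Unlike the bully algorithm, no process needs to reach every other process directly; one message circulating the ring suffices.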
Asynchronous Ring Election Algorithm (Clockwise leader
election)