Introduction To Distributed Systems
Distributed systems are everywhere. The Internet enables users throughout the world to access its
services wherever they may be located. Each organization manages an intranet, which provides
local services and Internet services for local users and generally provides services to other users
in the Internet. Small distributed systems can be constructed from mobile computers and other
small computational devices that are attached to a wireless network.
Resource sharing is the main motivating factor for constructing distributed systems. Resources
such as printers, files, web pages or database records are managed by servers of the appropriate
type. For example, web servers manage web pages and other web resources. Resources are
accessed by clients – for example, the clients of web servers are generally called browsers.
We define a distributed system as one in which hardware or software components located at
networked computers communicate and coordinate their actions only by passing messages. This
simple definition covers the entire range of systems in which networked computers can usefully
be deployed.
Definition: A distributed system is a collection of independent computers that appears to the
users of the system as a single computer.
Examples:
1. A network of workstations
3. The cloud
Various models are used for building distributed computing systems. These models can be
broadly classified into the following categories: minicomputer, workstation, processor-pool,
workstation-server, and hybrid. They are briefly described below.
Minicomputer Model:
The minicomputer model is a simple extension of the centralized time-sharing system. As
shown in Figure 1.2, a distributed computing system based on this model consists of a
few minicomputers (they may be large supercomputers as well) interconnected by a
communication network. Each minicomputer usually has multiple users simultaneously
logged on to it. For this, several interactive terminals are connected to each
minicomputer. Each user is logged on to one specific minicomputer, with remote access
to other minicomputers. The network allows a user to access remote resources that are
available on some machine other than the one onto which the user is currently logged.
The minicomputer model may be used when resource sharing (such as sharing of
information databases of different types, with each type of database located on a different
machine) with remote users is desired.
The early ARPANET is an example of a distributed computing system based on the
minicomputer model.
Workstation Model:
As shown in Fig. 1.3, a distributed computing system based on the workstation model
consists of several workstations interconnected by a communication network. A
company’s office or a university department may have several workstations scattered
throughout a building or campus, each workstation equipped with its own disk and
serving as a single-user computer.
It has often been found that in such an environment, at any one time (especially at night),
a significant proportion of the workstations are idle (not being used), resulting in the
waste of large amounts of CPU time. Therefore, the idea of the workstation model is to
interconnect all these workstations by a high-speed LAN so that idle workstations may be
used to process jobs of users who are logged onto other workstations and do not have
sufficient processing power at their own workstations to get their jobs processed
efficiently.
In this model, a user logs onto one of the workstations, called his or her "home"
workstation, and submits jobs for execution. When the system finds that the user's
workstation does not have sufficient processing power to execute the processes of the
submitted jobs efficiently, it transfers one or more of the processes from the user's
workstation to some other workstation that is currently idle, gets the processes executed
there, and finally returns the result of execution to the user's workstation.
Processor-Pool Model:
The processor-pool model is based on the observation that most of the time a user does not
need any computing power, but once in a while he or she may need a very large amount of
computing power for a short time (e.g., when recompiling a program consisting of a large
number of files after changing a basic shared declaration). Therefore, unlike the workstation-
server model, in which a processor is allocated to each user, in the processor-pool model the
processors are pooled together to be shared by the users as needed. The pool of processors
consists of a large number of microcomputers and minicomputers attached to the network. Each
processor in the pool has its own memory to load and run a system program or an application
program of the distributed computing system.
In the pure processor-pool model, the processors in the pool have no terminals attached directly
to them, and users access the system from terminals that are attached to the network via special
devices. These terminals are either small diskless workstations or graphic terminals, such as X
terminals. A special server (called a run server) manages and allocates the processors in the pool
to different users on a demand basis. When a user submits a job for computation, an appropriate
number of processors are temporarily assigned to his or her job by the run server. For example, if
the user's computation job is the compilation of a program having n segments, in which each of
the segments can be compiled independently to produce separate relocatable object files, n
processors from the pool can be allocated to this job to compile all the n segments in parallel.
When the computation is completed, the processors are returned to the pool for use by other
users.
In the processor-pool model there is no concept of a home machine. That is, a user does not log
onto a particular machine but to the system as a whole.
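The parallel-compilation scenario above maps naturally onto a process pool. Below is a minimal
sketch in Python, assuming hypothetical segment names and a placeholder compile step; a
fixed-size ProcessPoolExecutor stands in for the run server's allocation of processors.

    # Sketch of the processor-pool idea: n independent segments are
    # "compiled" in parallel by workers drawn from a shared pool.
    # compile_segment is a placeholder, not a real compiler invocation.
    from concurrent.futures import ProcessPoolExecutor

    def compile_segment(name):
        # Stands in for independently compiling one segment into a
        # separate relocatable object file.
        return name + ".o"

    if __name__ == "__main__":
        segments = ["segment%d" % i for i in range(8)]  # hypothetical names
        # The pool plays the role of the run server: it temporarily
        # assigns a worker to each submitted piece of work and
        # reclaims the worker when the work is done.
        with ProcessPoolExecutor(max_workers=4) as pool:
            object_files = list(pool.map(compile_segment, segments))
        print(object_files)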
Workstation-Server Model:
1. In general, it is much cheaper to use a few minicomputers equipped with large, fast disks
that are accessed over the network than a large number of diskful workstations, with each
workstation having a small, slow disk.
3. In the workstation-server model, since all files are managed by the file servers, users have
the flexibility to use any workstation and access the files in the same manner irrespective
of which workstation the user is currently logged onto. Note that this is not true with the
workstation model, in which each workstation has its local file system, because different
mechanisms are needed to access local and remote files.
4. In the workstation-server model, the request-response protocol described above is mainly
used to access the services of the server machines (a minimal sketch of such an exchange
appears after this list). Therefore, unlike the workstation model, this model does not need
a process migration facility, which is difficult to implement.
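As a rough illustration of such a request-response exchange, here is a minimal TCP client/server
pair in Python; the port number and message contents are arbitrary assumptions, not part of any
real protocol.

    # A single request-response round trip: the client sends a request
    # and blocks until the server's reply arrives.
    import socket
    import threading

    HOST, PORT = "localhost", 9999  # arbitrary choices for this sketch
    srv = socket.create_server((HOST, PORT))  # listening before the client connects

    def serve_one():
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024)             # receive the request
            conn.sendall(b"reply to " + request)  # send back the response

    threading.Thread(target=serve_one, daemon=True).start()

    with socket.create_connection((HOST, PORT)) as sock:
        sock.sendall(b"read file.txt")  # the client issues a request
        print(sock.recv(1024))          # and blocks until the reply arrives
    srv.close()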
Hybrid Model:
To combine the advantages of both the workstation-server and processor-pool models, a hybrid
model may be used to build a distributed computing system. The hybrid model is based on the
workstation-server model but with the addition of a pool of processors. The processors in the
pool can be allocated dynamically for computations that are too large for workstations or that
require several computers concurrently for efficient execution.
Design Issues:
Despite these complexities and difficulties, a distributed operating system must be designed to
provide all the advantages of a distributed system to its users. That is, the users should be able to
view a distributed system as a virtual centralized system that is flexible, efficient, reliable, secure
and easy to use. To meet this challenge, the designers of a distributed operating system must deal
with several design issues.
Transparency:
A distributed system that is able to present itself to users and applications as if it were only a single
computer system is said to be transparent. There are eight types of transparency in a distributed
system:
Replication Transparency: Hides the fact that multiple copies of a resource could exist
simultaneously. To hide replication, it is essential that the replicas have the same name.
Consequently, a system that supports replication should also support location
transparency.
Concurrency Transparency: It hides the fact that a resource may be shared by several
competing users. For example, two independent users may each have stored their files on the
same server and may be accessing the same table in a shared database. In such cases, it is
important that each user doesn’t notice that the others are making use of the same
resource.
Failure Transparency: Hides failure and recovery of the resources. It is the most
difficult task of a distributed system and is even impossible when certain apparently
realistic assumptions are made. Example: a user cannot distinguish between a very slow
resource and a dead one. The same error message comes when a server is down, when the
network is overloaded, or when the connection from the client side is lost. So here, the user is
unable to tell what has to be done: wait for the network to clear up, or try again later
when the server is working again.
For example, concurrent updates to the same file by two different processes should be prevented.
Concurrency transparency means that each user has a feeling that he or she is the sole user of the
system and other users do not exist in the system. For providing concurrency transparency, the
resource sharing mechanisms of the distributed operating system must have the following four
properties:
1. An event-ordering property ensures that all access requests to various system resources
are properly ordered to provide a consistent view to all users of the system.
2. A mutual-exclusion property ensures that at any time at most one process accesses a
shared resource, which must not be used simultaneously by multiple processes if program
operation is to be correct.
3. A no-starvation property ensures that if every process that is granted a resource, which
must not be used simultaneously by multiple processes, eventually releases it, every
request for that resource is eventually granted.
4. A no-deadlock property ensures that a situation will never occur in which competing
processes prevent their mutual progress even though no single one requests more
resources than available in the system.
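The mutual-exclusion property (property 2 above), for instance, can be illustrated with an
ordinary lock. A minimal sketch in Python; the shared counter and the thread and iteration
counts are arbitrary.

    # Mutual exclusion on a shared resource: at most one thread at a
    # time executes the critical section that updates the counter.
    import threading

    counter = 0
    lock = threading.Lock()

    def worker():
        global counter
        for _ in range(100_000):
            with lock:        # acquire; blocks while another thread holds it
                counter += 1  # critical section: the shared resource

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)  # always 400000 with the lock; unpredictable without it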
Reliability:
In general, distributed systems are expected to be more reliable than centralized systems due to
the existence of multiple instances of resources. However, the existence of multiple instances of
the resources alone cannot increase the system’s reliability. Rather, the distributed operating
system, which manages these resources, must be designed properly to increase the system’s
reliability by taking full advantage of this characteristic feature of a distributed system.
A fault is a mechanical or algorithmic defect that may generate an error. A fault in a system
causes system failure. Depending on the manner in which a failed system behaves, system
failures are of two types – fail-stop and Byzantine. In the case of fail-stop failure, the system
stops functioning after changing to a state in which its failure can be detected. On the other hand,
in the case of Byzantine failure, the system continues to function but produces wrong results.
Undetected software bugs often cause Byzantine failure of a system. Obviously, Byzantine
failures are much more difficult to deal with than fail-stop failures.
For higher reliability, the fault-handling mechanisms of a distributed operating system must be
designed properly to avoid faults, to tolerate faults, and to detect and recover from faults.
Flexibility:
Another important issue in the design of distributed operating systems is flexibility. Flexibility is
one of the most important features of open distributed systems. The design of a distributed
operating system should be flexible for the following reasons:
1. Ease of modification: From the experience of system designers, it has been found that
some parts of the design often need to be replaced / modified either because some bug is
detected in the design or because the design is no longer suitable for the changed system
environment or new-user requirements. Therefore, it should be easy to incorporate
changes in the system in a user-transparent manner or with minimum interruption caused
to the users.
2. Ease of enhancement: In every system, new functionalities have to be added from time
to time to make it more powerful and easier to use. Therefore, it should be easy to add new services
to the system. Furthermore, if a group of users do not like the style in which a particular
service is provided by the operating system, they should have the flexibility to add and
use their own service that works in the style with which the users of that group are more
familiar and feel more comfortable.
Fault Avoidance:
Fault avoidance deals with designing the components of the system in such a way that the
occurrence of faults is minimized. Conservative design practices, such as using high-reliability
components, are often employed to improve the system's reliability based on the idea of fault
avoidance. Although a distributed operating system often has little or no role to play in
improving the fault avoidance capability of a hardware component, the designers of the various
software components of the distributed operating system must test them thoroughly to make
these components highly reliable.
Fault Tolerance:
Fault tolerance is the ability of a system to continue functioning in the event of partial system
failure. The performance of the system might be degraded due to partial failure, but otherwise
the system functions properly. Some of the important concepts that may be used to improve the
fault tolerance ability of a distributed operating system are as follows:
Distributed control: For better reliability, many of the particular algorithms or protocols
used in a distributed operating system must employ a distributed control mechanism to
avoid single points of failure. For example, a highly available distributed file system
should have multiple and independent file servers controlling multiple and independent
storage devices. In addition to file servers, a distributed control technique could also be
used for name servers, scheduling algorithms, and other executive control functions. It is
important to note here that when multiple distributed servers are used in a distributed system to
provide a particular type of service, the servers must be independent. That is, the design
must not require simultaneous functioning of the servers; otherwise, the reliability will
become worse instead of getting better (see the failover sketch below). Distributed control
mechanisms are described throughout this book.
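To illustrate why independent replicas avoid a single point of failure, the sketch below tries each
of several file servers in turn and succeeds as long as any one of them is reachable. The server
URLs and the fetch helper are hypothetical.

    # Failover across independent replicas: a request succeeds as long
    # as at least one server is up, so no single server failure takes
    # the service down. The URLs below are hypothetical.
    import urllib.request

    REPLICAS = [
        "http://fileserver1.example.com",
        "http://fileserver2.example.com",
        "http://fileserver3.example.com",
    ]

    def fetch(path):
        last_error = None
        for url in REPLICAS:  # each replica functions independently
            try:
                with urllib.request.urlopen(url + "/" + path, timeout=2) as resp:
                    return resp.read()
            except OSError as err:  # this replica is down; try the next
                last_error = err
        raise RuntimeError("all replicas failed") from last_error

    # Hypothetical usage: data = fetch("reports/today.txt")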
The mechanisms described above may be employed to create a very reliable distributed
system. However, the main drawback of increased system reliability is potential loss of
execution time efficiency due to the extra overhead involved in these techniques. For
many systems it is just too costly to incorporate a large number of reliability mechanisms.
Therefore, the major challenge for distributed operating system designers is to integrate
these mechanisms in a cost-effective manner for producing a reliable system.
Performance:
If a distributed system is to be used its performance must be at least as good as a centralized
system. That is, when a particular application is run on a distributed system, its overall
performance should be better than or at least equal to that of running the same application on a
single-processor system. However, to achieve this goal, it is important that the various
components of the operating system of a distributed system be designed properly; otherwise, the
overall performance of the distributed system may turn out to be worse than a centralized system.
Some design principles considered useful for better performance are as follows:
3. Minimize copying of data: Data copying overhead (e.g., moving data in and out of
buffers) imposes a substantial CPU cost on many operations. For example, while being
transferred from its sender to its receiver, message data may take the following path on
the sending side:
Similarly, in several systems, the data copying overhead is also large for read and write
operations on block I/O devices. Therefore, for better performance, it is desirable to avoid
copying of data, although this is not always simple to achieve. Making optimal use of
memory management often helps in eliminating much data movement between the
kernel, block I/O devices, clients, and servers (a zero-copy sketch appears after this list).
4. Minimize network traffic: System performance may also be improved by reducing
internode communication costs. For example, accesses to remote resources require
communication, possibly through intermediate nodes. Therefore, migrating a process
closer to the resources it is using most heavily may be helpful in reducing network traffic
in the system if the decreased cost of accessing its favorite resource offsets the possible
increased cost of accessing its less favored ones. Another way to reduce network traffic is
to use the process migration facility to cluster two or more processes that frequently
communicate with each other on the same node of the system. Avoiding the collection of
global state information for making some decisions also helps in reducing network traffic.
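As one concrete way to avoid copying, many platforms can hand a file directly to the network
stack without first reading it into an application buffer. A minimal sketch in Python using
socket.sendfile; the host, port, and filename are assumptions for illustration.

    # Avoiding data copying: socket.sendfile transfers file contents to
    # the socket without copying them through a user-space buffer
    # (it uses os.sendfile where the platform supports it).
    import socket

    def send_file(host, port, path):
        with socket.create_connection((host, port)) as sock:
            with open(path, "rb") as f:
                # Kernel-level transfer: no per-chunk read() into an
                # application buffer followed by a send().
                sock.sendfile(f)

    # Hypothetical usage:
    # send_file("fileserver.example.com", 9000, "large_dataset.bin")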
Scalability:
Scalability refers to the capability of a system to adapt to increased service load. It is inevitable
that a distributed system will grow with time since it is very common to add new machines or an
entire sub network to the system to take care of increased workload or organizational changes in
a company. Therefore, a distributed operating system should be designed to easily cope with the
growth of nodes and users in the system. That is, such growth should not cause serious disruption
of service or significant loss of performance to users.
Chapter-3
PROCESSES
It turns out that having a finer granularity in the form of multiple threads of control per
process makes it easier to build distributed applications and to attain better performance.
Threads are often provided in the form of a thread package. Such a package contains
operations to create and destroy threads, as well as operations on synchronization
variables such as mutexes and condition variables.
3.4 Multithreaded servers: The file server normally waits for an incoming request for a file
operation, subsequently carries out the request, and then sends back the reply. One possible, and
particularly popular, organization is shown in the figure. Here one thread, the dispatcher, reads
incoming requests for a file operation. The requests are sent by clients to a well-known end point
for this server. After examining the request, the server chooses an idle (i.e., blocked) worker
thread and hands it the request. The worker proceeds by performing a blocking read on the
local file system, which may cause the thread to be suspended until the data are fetched from
disk. Another thread is selected to be executed.
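A minimal sketch of this dispatcher/worker organization in Python, with a queue standing in for
the hand-off of requests and a sleep standing in for the blocking disk read:

    # Dispatcher/worker organization: the dispatcher reads incoming
    # requests and hands each one to an idle worker thread; a worker
    # may block on (simulated) disk I/O without stalling the others.
    import queue
    import threading
    import time

    requests = queue.Queue()

    def worker():
        while True:
            req = requests.get()  # idle (blocked) until handed a request
            time.sleep(0.1)       # placeholder for a blocking disk read
            print("served", req)
            requests.task_done()

    for _ in range(4):            # a pool of worker threads
        threading.Thread(target=worker, daemon=True).start()

    def dispatcher(incoming):
        for req in incoming:      # requests arriving at the server's
            requests.put(req)     # well-known end point

    dispatcher(["file-op-%d" % i for i in range(8)])  # hypothetical requests
    requests.join()               # wait until every request is served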
3.5 The Role of virtualization in distributed systems: Every (distributed) computer system
offers a programming interface to higher-level software. There are many different types of interfaces,
ranging from the basic instruction set as offered by a CPU to the vast collection of application
programming interfaces that are shipped with many current middleware systems. In its essence,
virtualization deals with extending or replacing an existing interface so as to mimic the
behaviour of another system.
Fig: virtualizing system A on top of system B.
3.7 Code Migration: Code migration in distributed systems traditionally took place in the form of
process migration, in which an entire process was moved from one machine to another.
Performance can be improved if processes are moved from heavily-loaded to lightly-loaded
machines.
Chapter 4
COMMUNICATION
Persistent: messages are held by the middleware communication service until they can be
delivered (e.g., email).
- The sender can terminate after executing the send.
- The receiver will get the message the next time it runs.
Transient: messages exist only while the sender and receiver are running.
- Communication errors or an inactive receiver cause the message to be discarded.
- Transport-level communication is transient.
Asynchronous versus Synchronous Communication
Asynchronous (non-blocking): the sender resumes execution as soon as the message is passed
to the communication/middleware software.
Synchronous: the sender is blocked until
- the OS or middleware notifies acceptance of the message, or
- the message has been delivered to the receiver, or
- the receiver processes it and returns a response.
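The contrast can be sketched with a background thread standing in for the communication
middleware; both send calls below are illustrative stand-ins, not a real middleware API.

    # Asynchronous vs. synchronous send, with a background thread
    # playing the role of the communication middleware.
    import queue
    import threading

    outbox = queue.Queue()

    def deliver(msg):
        print("delivered:", msg)  # placeholder for the real transmission

    def middleware():             # drains the outbox in the background
        while True:
            msg = outbox.get()
            deliver(msg)
            outbox.task_done()

    threading.Thread(target=middleware, daemon=True).start()

    def send_async(msg):
        outbox.put(msg)           # returns immediately: non-blocking

    def send_sync(msg):
        outbox.put(msg)
        outbox.join()             # blocks until the middleware has
                                  # delivered all queued messages

    send_async("hello")           # the sender resumes right away
    send_sync("world")            # the sender waits for delivery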
Discrete versus Streaming Communication
Due to the absence of shared memory, all communication in distributed systems is based on
sending and receiving (low level) messages. When process A wants to communicate with process B, it
first builds a message in its own address space. Then it executes a system call that causes the operating
system to send the message over the network to B. A model called the Open Systems Interconnection
Reference Model (Day and Zimmermann, 1983) is used in communication.
Both connection-oriented and connectionless communication are used at this level.
Remote Method Invocation (RMI): RMI stands for Remote Method Invocation, a way of
invoking or calling methods on an object that resides on a remote server from a client.
4.2 Remote Procedure Call: When a process on machine A calls a procedure on machine B, the calling
process on A is suspended, and execution of the called procedure takes place on B. Information can be
transported from the caller to the callee in the parameters and can come back in the procedure result. No
message passing at all is visible to the programmer. This method is known as Remote Procedure Call, or
often just RPC.
The client stub packs the parameters into a message and requests that the message be sent
to the server.
When the message arrives at the server, the server's operating system passes it up to a server stub.
A server stub is the server-side equivalent of a client stub: a piece of code that transforms
requests coming in over the network into local procedure calls. The server stub unpacks the parameters
from the message and then calls the server procedure in the usual way.
The client stub inspects the message, unpacks the result, copies it to its caller, and returns in the
usual way. Packing parameters into a message is called parameter marshalling.
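Python's standard library includes a simple RPC facility that shows these stubs in miniature: the
ServerProxy object acts as the client stub (marshalling parameters into a request message) and
SimpleXMLRPCServer plays the server stub. The procedure name and port below are arbitrary
choices for this sketch.

    # Minimal RPC: the remote call proxy.add(2, 3) looks like a local
    # call, but the parameters are marshalled into a message, sent to
    # the server, unpacked by the server stub, and passed to add().
    import threading
    import xmlrpc.client
    from xmlrpc.server import SimpleXMLRPCServer

    def add(a, b):  # an ordinary local procedure on the server
        return a + b

    server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()

    proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
    print(proxy.add(2, 3))  # prints 5; no message passing is visible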
Naming
Introduction
A comparison between naming services at different layers of a name space:

Property                         Global layer   Administrational layer   Managerial layer
Geographical scale of network    Worldwide      Organization             Department
Responsiveness to lookups        Seconds        Milliseconds             Immediate
Availability requirement         Very high      High                     Low
Number of replicas               Many           None or few              None
Is client-side caching applied?  Yes            Yes                      Sometimes
Synchronization
Discussion points:
- the issue of synchronization based on time (actual time and relative ordering)
- distributed mutual exclusion to protect shared resources from simultaneous access by
multiple processes
- how a group of processes can appoint a process as a coordinator; this can be done by
means of election algorithms
6.1 Clock Synchronization
- In centralized systems, time can be unambiguously decided by a system call; e.g., if process
A at time t1 gets the time, say tA, and process B at time t2, where t1 < t2, gets the time, say
tB, then tA is always less than (possibly equal to but never greater than) tB.
- Achieving agreement on time in distributed systems is difficult.
- Even if all computers initially start at the same time, they will get out of synch after some
time because the crystals in different computers run at slightly different frequencies; this
phenomenon is called clock skew.
How is time actually measured?
- Earlier, time was measured astronomically: based on the amount of time it takes the earth
to complete one rotation relative to the sun; 1 solar second = 1/86400th of a solar day
(24*3600 = 86400).
- It was later discovered that the period of the earth's rotation is not constant: the earth is
slowing down due to tidal friction and atmospheric drag; geologists believe that 300 million
years ago there were about 400 days per year (the length of the year is not affected, only
the days have become longer).
- In some countries, UTC (Universal Coordinated Time) is broadcast on shortwave radio and
satellites (as a short pulse at the start of each UTC second) for those who need precise
time; but one has to compensate for the propagation delay.
Clock Synchronization Algorithms
Two situations:
- One machine has a receiver of UTC time: how do we synchronize all other machines to it?
- No machine has a receiver; each machine keeps track of its own time: how do we
synchronize them?
Many algorithms have been proposed. A model for all algorithms:
- Each machine has a timer that causes an interrupt H times per second; the interrupt
handler adds 1 to the clock.
- Let the value of the clock so obtained be C.
- When the UTC time is t, the value of the clock on machine p is Cp(t); if everything is
perfect, Cp(t) = t, or dC/dt = 1.
- But in practice there will be errors; a clock either ticks faster or slower than it should.
- If ρ is a constant such that 1 − ρ ≤ dC/dt ≤ 1 + ρ, then the timer is said to be working
within its specification.
- The constant ρ is set by the manufacturer and is called the maximum drift rate.
Fig: the relation between clock time and UTC when clocks tick at different rates.
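A quick worked example of what the drift bound implies, under the standard assumption that
two clocks drifting in opposite directions move apart at up to 2ρ seconds per second; the
numbers are illustrative.

    # If clocks must never differ by more than delta, they have to be
    # resynchronized at least every delta / (2 * rho) seconds.
    rho = 1e-5    # illustrative maximum drift rate
    delta = 0.01  # desired bound on the difference between two clocks: 10 ms

    resync_interval = delta / (2 * rho)
    print(resync_interval)  # 500.0 seconds between resynchronizations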
For some applications, it is sufficient if all machines agree on the same time, rather than on the
real time; we need internal consistency of the clocks rather than closeness to the real time.