Distributed Operating Systems: Unit - 2
Adv.O.S 1
Definition:
Distributed operating system:
The term distributed system is used to describe a
system with the following characteristics: it consists
of several computers that do not share a memory or
a clock;
the computers communicate with each other by
exchanging messages over a communication network;
and each computer has its own memory and runs its
own operating system.
Architecture of a distributed operating system
Figure: several nodes, each with its own CPU, memory, and disk, connected by a communication network.
Advantages of Distributed System:
Inherently Distributed Applications
Several applications are inherently distributed in nature and require a distributed
computing system for their realization
Examples:
Large bank with offices all over the world – customers can deposit or
withdraw money from their accounts at any branch of the bank
Airline reservation system
Employees database
Extensibility
Possible to gradually extend the power and functionality of a distributed
computing system by simply adding additional resources (HW and SW)
Attractive - resources can be extended without affecting the normal
functionality of the existing system (such systems are called open distributed systems)
Advantages of Distributed System
Information sharing among distributed users
Groupware / computer-supported cooperative working: an emerging
technology in which a group of users works cooperatively on a
distributed computing system by transferring files and logging into
each other's machines to run programs
Resource sharing
Advantages of Distributed System
Shorter response times and higher throughput
Expected to perform better than a centralized system
Metrics : Response time and throughput
By partitioning the computations
By evenly distributing the load from overloaded processors
Reliability
Design Issues of Distributed Operating Systems:
Global Knowledge
Scalability
Naming
Compatibility
Process Synchronization
Resource Management
Security
Structuring
Global knowledge
The up-to-date state of all the processes and resources (the global state
of the system) is completely and accurately known in shared-memory
computer systems, but such complete and accurate global knowledge is
not available in a distributed system.
Solution:
•Determine efficient techniques to implement decentralized system-wide
control
•Determine efficient ways to schedule the different events that happen at
different times in a distributed system.
Scalability
It refers to the capability of a system to adapt to increased service load.
A distributed computing system should be designed to easily cope with
the growth of nodes and users in the system.
Computation Migration: The computation migrates to
another location. Migrating computation may be
efficient under certain circumstances. The remote
procedure call mechanism has been widely used for
computation migration and for providing
communication between computers.
Security: The security of a system is the responsibility of
its operating system. Two issues that must be
considered in the design of security for computer
systems are authentication and authorization.
Authentication is the process of guaranteeing that an
entity is what it claims to be.
Authorization is the process of deciding what privileges
an entity has and making only these privileges
available.
Structuring: The structure of an operating system
defines how various parts of the operating system are
organized. The monolithic kernel, the collective
kernel structure, and the object-oriented operating system
are the common ways of structuring an operating system.
Communication Primitives
The communication primitives are the high level
constructs with which programs use the underlying
communication network. The communication
primitives influence a programmer’s choice of
algorithm as well as the ultimate performance of the
programs.
Two communication models, namely message passing
and remote procedure call, provide communication
primitives. These two models have been widely used
to develop distributed operating systems and
applications for distributed systems.
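As a minimal sketch of the message-passing model (the mailbox layout and the `send`/`receive` names here are illustrative assumptions, not any particular system's interface), two threads can exchange messages through per-process queues:

```python
import queue
import threading

# Each "process" owns a mailbox; send() deposits a message in the
# receiver's mailbox and receive() blocks until one arrives.
mailboxes = {"P1": queue.Queue(), "P2": queue.Queue()}

def send(dest, msg):
    mailboxes[dest].put(msg)

def receive(me):
    return mailboxes[me].get()   # blocking receive

def p2_body():
    req = receive("P2")          # wait for a request from P1
    send("P1", "reply to " + req)

t = threading.Thread(target=p2_body)
t.start()
send("P2", "request")
reply = receive("P1")
t.join()
print(reply)   # prints: reply to request
```

The blocking `receive` is what makes the primitive useful for synchronization as well as data transfer.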
INHERENT LIMITATIONS OF A DISTRIBUTED SYSTEM
A distributed system is a collection of computers that are
spatially separated and do not share a common
memory.
We discuss the inherent limitations of distributed systems
and their impact on the design and development of
distributed systems.
A Distributed system with two sites
Figure: sites S1 (holding account A) and S2 (holding account B) connected by a communication channel, shown in three successive states:
(a) A = $500, B = $200, channel empty;
(b) A = $450, B = $200, a $50 transfer in transit on the channel;
(c) A = $450, B = $250, channel empty.
Lamport’s Logical Clocks
Definition: Due to the absence of perfectly synchronized clocks and
global time in distributed systems, the order in which two events
occur at two different computers cannot be determined based on
the local time at which they occur.
Happened before relation:
a -> b : Event a occurred before event b. Events in the same process.
a -> b : If a is the event of sending a message m in a process and b is
the event of receipt of the same message m by another process.
a -> b, b -> c, then a -> c. “->” is transitive.
Causally Ordered Events
a -> b : Event a “causally” affects event b
Concurrent Events: two distinct events a and b are said to be
concurrent if a !-> b and b !-> a; in other words, concurrent events do
not causally affect each other.
For any two events a and b in a system, either a -> b, b -> a, or a || b.
Space-time Diagram
Example:
Figure: a space-time diagram showing internal events and messages exchanged between processes; process P2 has events e21, e22, e23, e24 along its time axis.
Logical Clocks
Conditions satisfied:
Ci is clock in Process Pi.
If a -> b in process Pi, Ci(a) < Ci(b)
Let a: sending message m in Pi; b : receiving message m in Pj;
then, Ci(a) < Cj(b).
Implementation Rules:
R1: Ci = Ci + d (d > 0); clock is updated between two successive
events.
R2: Cj = max(Cj, tm) + d; (d > 0); When Pj receives a message
m with a time stamp tm (tm assigned by Pi, the sender; tm =
Ci(a), a being the event of sending message m).
A reasonable value for d is 1
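Rules R1 and R2 above can be sketched directly (a minimal illustration assuming d = 1; the class and method names are chosen for this example):

```python
class LamportClock:
    """Implements rules R1 and R2 with d = 1."""
    def __init__(self):
        self.c = 0

    def tick(self):            # R1: clock updated between successive events
        self.c += 1
        return self.c

    def send(self):            # a send is an event; its timestamp is attached
        return self.tick()

    def receive(self, tm):     # R2: on receiving a message with timestamp tm
        self.c = max(self.c, tm) + 1
        return self.c

p1, p2 = LamportClock(), LamportClock()
a = p1.send()        # a = 1: event of sending m in P1
b = p2.receive(a)    # b = max(0, 1) + 1 = 2, so C1(a) < C2(b) as required
```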
Space-time Diagram
Example 2:
Figure: a space-time diagram showing how Lamport's logical clocks advance, with the clock values written next to each event.
Limitation of Lamport’s Clock
Figure: a space-time diagram in which P1 has events e11, e12, e13 with clock values (1), (2), (3).
C(e11) < C(e32), but the two events are not causally related.
This lack of dependency is not reflected in Lamport's clocks.
The above figure shows a computation over three
processes. Clearly C(e11) < C(e22) and C(e11) < C(e32).
However, we can see from the figure that event e11 is
causally related to event e22 but not to event e32, since
a path exists from e11 to e22 but not from e11 to e32.
Vector Clocks
Vector clocks is an algorithm for generating a partial ordering of events in a
distributed system and detecting causality violations. Just as in Lamport
timestamps, interprocess messages contain the state of the sending process's
logical clock. A vector clock of a system of N processes is an array/vector of N
logical clocks, one clock per process; a local "smallest possible values" copy of
the global clock-array is kept in each process, with the following rules for clock
updates:
Initially all clocks are zero.
Each time a process experiences an internal event, it increments its own logical
clock in the vector by one.
Each time a process prepares to send a message, it increments its own logical
clock in the vector by one and then sends its entire vector along with the
message being sent.
Each time a process receives a message, it increments its own logical clock in
the vector by one and updates each element in its vector by taking the
maximum of the value in its own vector clock and the value in the vector in the
received message (for every element).
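The update rules above can be sketched as follows (a minimal illustration; the class name and the two-process setup are assumptions for this example):

```python
class VectorClock:
    def __init__(self, n, i):
        self.v = [0] * n   # initially all clocks are zero
        self.i = i         # index of the owning process

    def internal(self):    # internal event: increment own entry by one
        self.v[self.i] += 1

    def send(self):        # increment own entry, ship a copy of the vector
        self.v[self.i] += 1
        return list(self.v)

    def receive(self, tm): # increment own entry, then elementwise maximum
        self.v[self.i] += 1
        self.v = [max(a, b) for a, b in zip(self.v, tm)]

p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
m = p0.send()     # p0's vector becomes [1, 0]
p1.receive(m)     # p1 increments to [0, 1], then maxes to [1, 1]
```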
Vector Clocks
Keep track of transitive dependencies among processes
for recovery purposes.
Ci[1..n]: is a “vector” clock at process Pi whose entries
are the “assumed”/”best guess” clock values of different
processes.
Ci[j] (j != i) is the best guess of Pi for Pj’s clock.
Vector clock rules:
Ci[i] = Ci[i] + d, (d > 0); for successive events in Pi
For all k, Cj[k] = max(Cj[k], tm[k]), when a message m with
time stamp tm is received by Pj from Pi.
Vector Clocks Comparison
Equal: ta = tb iff for all i, ta[i] = tb[i]
Not Equal: ta ≠ tb iff there exists an i such that ta[i] ≠ tb[i]
Less than or equal: ta ≤ tb iff for all i, ta[i] ≤ tb[i]
Not less than or equal: ta ≰ tb iff there exists an i such that ta[i] > tb[i]
Less than: ta < tb iff (ta ≤ tb and ta ≠ tb)
Not less than: ta ≮ tb iff not (ta < tb)
Concurrent: ta || tb iff (ta ≮ tb and tb ≮ ta)
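The comparison relations above can be sketched componentwise (a minimal illustration; the function names are assumed):

```python
def vc_leq(ta, tb):
    # ta <= tb iff every component of ta is <= the matching one in tb
    return all(x <= y for x, y in zip(ta, tb))

def vc_less(ta, tb):
    # ta < tb iff ta <= tb and ta != tb
    return vc_leq(ta, tb) and ta != tb

def vc_concurrent(ta, tb):
    # a || b iff neither timestamp dominates the other
    return not vc_less(ta, tb) and not vc_less(tb, ta)

assert vc_less([1, 0], [1, 1])         # causally before
assert vc_concurrent([1, 0], [0, 1])   # concurrent events
```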
Vector Clock …
Dissemination of time in vector clocks
Figure: process P3 with events e31 and e32 carrying vector timestamps (0,0,1) and (0,0,2).
Causal Ordering of Messages
Figure: a space-time diagram in which P1 performs Send(M1) and P2 later performs Send(M2); P3 receives the messages, labeled (1) and (2).
Message Ordering …
Here the concern is not maintaining clocks as such, but ordering the
messages sent and received among all processes in a
distributed system.
(e.g.,) Send(M1) -> Send(M2), M1 should be received ahead of M2 by
all processes.
This is not guaranteed by the communication network since M1 may
be from P1 to P2 and M2 may be from P3 to P4.
Message ordering:
Deliver a message only if the preceding one has already been delivered.
Otherwise, buffer it up
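The buffer-until-deliverable idea can be sketched as follows. This is a deliberately simplified sketch using per-message sequence numbers from a single sender; a full causal-ordering protocol would use vector timestamps across all senders:

```python
class CausalBuffer:
    """Deliver a message only after all earlier-numbered ones; else buffer."""
    def __init__(self):
        self.next_seq = 1    # next sequence number expected for delivery
        self.buffer = {}     # seq -> message, held back until deliverable
        self.delivered = []

    def on_receive(self, seq, msg):
        self.buffer[seq] = msg
        # Deliver as many buffered messages as are now in order.
        while self.next_seq in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

b = CausalBuffer()
b.on_receive(2, "M2")   # buffered: M1 has not been delivered yet
b.on_receive(1, "M1")   # delivers M1, then the buffered M2
```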
Global State
The global state of a distributed system mainly
consists of the local state of all processes along with
the messages that are being sent but are not
delivered.
On many occasions, the global state of a distributed
system needs to be known. In a distributed database
system, the local state consists of database records;
the records that are temporarily used for
communication purposes are removed.
Global State: global states and their transitions in the bank account example
Global State 1: A = $500, B = $200; channels C1 and C2 empty.
Global State 2: A = $450, B = $200; C1 carries the $50 transfer (Tx $50), C2 empty.
Global State 3: A = $450, B = $250; C1 and C2 empty.
Recording Global State...
(e.g.,) Global state of A is recorded in (1) and not in (2).
State of B, C1, and C2 are recorded in (2)
Extra amount of $50 will appear in global state
Reason: A’s state recorded before sending message and C1’s state
after sending message.
Inconsistent global state if n < n’, where
n is number of messages sent by A along channel before A’s state
was recorded
n’ is number of messages sent by A along the channel before
channel’s state was recorded.
Consistent global state: n = n’
Recording Global State...
Similarly, for consistency, m = m'
m': no. of messages received along the channel before B's state was recorded
m: no. of messages received along the channel by B before the channel's state was recorded
Also, n' >= m, since in no system can the number of messages received along a channel exceed the number sent along it
Hence, n >= m
A consistent global state should satisfy the above inequality.
Consistent global state:
Channel state: the sequence of messages sent before recording the sender's
state, excluding the messages received before the receiver's state was
recorded.
Only messages in transit are recorded in the channel state.
Recording Global State
Send(Mij): message M sent from Si to Sj
rec(Mij): message M received by Sj, from Si
time(x): Time of event x
LSi: local state at Si
send(Mij) is in LSi iff (if and only if) time(send(Mij)) <
time(LSi)
rec(Mij) is in LSj iff time(rec(Mij)) < time(LSj)
transit(LSi, LSj) : set of messages sent/recorded at LSi
and NOT received/recorded at LSj
Recording Global State …
inconsistent(LSi, LSj): set of messages NOT
sent/recorded at LSi and received/recorded at LSj
Global State, GS: {LS1, LS2, …, LSn}
Consistent Global State: GS = {LS1, …, LSn} AND for all i, j,
inconsistent(LSi, LSj) is null.
Transitless Global State: GS = {LS1, …, LSn} AND for all i, j,
transit(LSi, LSj) is null.
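The transit and inconsistent sets can be sketched over recorded local states (the dictionary layout is an assumption for illustration):

```python
# A local state is modeled as the sets of send/receive events it recorded.
def transit(ls_i, ls_j):
    # messages recorded as sent in LSi but not recorded as received in LSj
    return {m for m in ls_i["sent"] if m not in ls_j["received"]}

def inconsistent(ls_i, ls_j):
    # messages recorded as received in LSj but never recorded as sent in LSi
    return {m for m in ls_j["received"] if m not in ls_i["sent"]}

# Matching the two-site figure: M1 is in transit, M2 is inconsistent.
ls1 = {"sent": {"M1"}, "received": set()}
ls2 = {"sent": set(), "received": {"M2"}}
assert transit(ls1, ls2) == {"M1"}
assert inconsistent(ls1, ls2) == {"M2"}
```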
Recording Global State ..
Figure: sites S1 and S2 record local states LS1 and LS2; message M1 is recorded as sent in LS1 but not as received in LS2 (transit), while M2 is recorded as received in LS2 but not as sent in LS1 (inconsistent).
Recording Global State...
Strongly consistent global state: consistent and
transitless, i.e., all send and the corresponding receive
events are recorded in all LSi.
Figure: a space-time diagram with recorded local states LS11, LS12 of one site and LS21, LS22, LS23 of another, illustrating a strongly consistent global state.
Requirement of Mutual Exclusion algorithms
The primary objective of a mutual exclusion algorithm is to maintain mutual
exclusion: that is to guarantee that only one request accesses the critical
section at a time. In addition the following characteristics are considered
important in a mutual exclusion algorithm:
Freedom from Deadlocks: two or more sites should not endlessly wait for
messages that will never arrive.
Freedom from Starvation: A site should not be forced to wait indefinitely to
execute critical section while other sites are repeatedly executing critical
section. That is , every requesting site should get an opportunity to execute
critical section in a finite time.
Fairness: Fairness dictates that requests must be executed in the order they
are made. Since a physical global clock does not exist, time is determined by
logical clocks. Note that fairness implies freedom from starvation , but not
vice-versa.
Fault tolerance: a mutual exclusion algorithm is fault-tolerant if, in the wake
of a failure, it can reorganize itself so that it continues to function without any
disruptions.
DEADLOCK HANDLING STRATEGIES IN
DISTRIBUTED SYSTEMS
There are three strategies to handle deadlocks: deadlock prevention,
deadlock avoidance, and deadlock detection.
Deadlock Prevention: It is commonly achieved by either having a
process acquire all the needed resources simultaneously before it begins
execution or by preempting a process that holds the needed resource.
Deadlock Avoidance: In the deadlock avoidance approach to
distributed systems, a resource is granted to a process only if the resulting
global system state is safe. Deadlock avoidance is impractical in
distributed systems because of the following problems:
Every site has to maintain information on the global state of the system,
which translates into huge storage requirements and extensive
communication costs.
The process of checking for a safe global state must be mutually exclusive,
because if several sites concurrently perform checks for a safe global state,
they may each find the state safe, yet the net global state may not be safe.
Due to the large number of processes and resources , it
will be computationally expensive to check for a safe
state.
Deadlock Detection: Deadlock detection in distributed
systems has two favorable conditions:
1. Once a cycle is formed in the WFG, it persists until it is
detected and broken
2. Cycle detection can proceed concurrently with the
normal activities of a system
Control Organization for Distributed Deadlock Detection:
Centralized Control
Distributed Control
Hierarchical Control
Hierarchical Control : In hierarchical deadlock
detection algorithms, sites are arranged in a
hierarchical fashion, and a site detects deadlocks
involving only its descendant sites.
Hierarchical algorithms exploit access patterns local
to a cluster of sites to efficiently detect deadlocks.
They tend to get the best of both the centralized and
the distributed control organizations in that there is
no single point of failure and a site is not bogged
down by deadlock detection activities with which it is
not concerned.
Centralized Deadlock Detection Algorithms
The completely centralized algorithm is the simplest
centralized deadlock detection algorithm, wherein a
designated site called the control site, maintains the
WFG of the entire system and checks it for the
existence of deadlock cycles.
All sites request and release resources by sending
request resource and release resource messages to the
control site, respectively.
When the control site receives a request resource or a
release resource message, it correspondingly updates
its WFG.
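The control site's deadlock check reduces to finding a cycle in its WFG, which can be sketched as a depth-first search (the graph representation here, a dict mapping each process to the set of processes it waits for, is an assumption for illustration):

```python
def has_deadlock(wfg):
    """Return True iff the wait-for graph contains a cycle."""
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wfg.get(node, ()):
            if nxt in on_stack:          # back edge closes a cycle: deadlock
                return True
            if nxt not in visited and dfs(nxt):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(p) for p in wfg if p not in visited)

assert has_deadlock({"P1": {"P2"}, "P2": {"P1"}})      # mutual wait
assert not has_deadlock({"P1": {"P2"}, "P2": set()})   # simple wait chain
```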
Distributed Deadlock Detection Algorithms
In distributed deadlock detection algorithms, all sites
collectively cooperate to detect a cycle in the state
graph that is likely to be distributed over several sites
of the system.
A distributed deadlock detection algorithm can be
initiated whenever a process is forced to wait, and it
can be initiated either by the local site of the process
or by the site where the process waits.
Distributed Deadlock detection algorithms can be
divided into four classes – path-pushing, edge-chasing,
diffusion computation and global state detection
In Path Pushing Algorithms, the wait for dependency information of
the global WFG is disseminated in the form of paths.
In edge-chasing Algorithms, special messages called probes are
circulated along the edges of the WFG to detect a cycle. A process
declares a deadlock when it receives a probe initiated by it. An
interesting feature of edge-chasing algorithms is that probes are of a
fixed size.
Diffusion Computation type algorithms make use of echo algorithms
to detect deadlocks . To detect a deadlock , a process sends out query
messages along all the outgoing edges in the WFG. These queries are
successively propagated through the edges of the WFG.
Global state detection based deadlock detection algorithms exploit
the following facts
A consistent snapshot of a distributed system can be obtained without
freezing the underlying computation.
A consistent snapshot may not represent the system state at any moment in
time, but if a stable property holds in the system before the snapshot
collection is initiated, this property will still hold in the snapshot.
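As a sketch of the edge-chasing idea (loosely modeled on probe-based schemes such as Chandy-Misra-Haas for the AND model; the probe triple and graph shapes are assumptions for illustration), a fixed-size probe is forwarded along WFG edges and the initiator declares deadlock when a probe returns to it:

```python
def detect(initiator, wfg):
    """Return True iff a probe started by `initiator` comes back to it."""
    seen = set()
    # A probe is a fixed-size triple: (initiator, sender, receiver).
    stack = [(initiator, initiator, dep) for dep in wfg.get(initiator, ())]
    while stack:
        init, sender, receiver = stack.pop()
        if receiver == init:
            return True                  # probe returned: deadlock declared
        if receiver in seen:
            continue                     # already probed this process
        seen.add(receiver)
        for dep in wfg.get(receiver, ()):
            stack.append((init, receiver, dep))   # forward the probe
    return False

assert detect("P1", {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}})
assert not detect("P1", {"P1": {"P2"}, "P2": set()})
```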