Distributed System
UNIT
Characterization of
Distributed System
CONTENTS
Part-1 : Introduction, Example of Distributed System
Part-2 : Resource Sharing and Web Challenges, Architectural
Models, Fundamental Models
1-2 B (CS-Sem-7) Characterization of Distributed System
PART- 1
Questions-Answers
Que 1.1. What is a distributed system ? Describe the
characteristics of distributed system. Give two examples of
distributed system.
AKTU 2014-15, Marks 05
Answer
Distributed system :
1. A distributed system is a system in which software or hardware
components connected via a communication network communicate and
coordinate their actions only by passing messages.
2. Computers that are connected by a network may be spatially separated
by any distance.
3. Resources may be managed by servers and accessed by clients.
Characteristics of distributed system :
1. Heterogeneity : A distributed system enables users to access services
and run applications over a heterogeneous collection of computers and
networks.
2. Openness : The openness of a computer system is the characteristic
that determines whether the system can be extended and re-implemented
in various ways.
3. Concurrency : Concurrency in a distributed system allows different
users to access a shared resource at the same time.
4. Scalability : A system is described as scalable if it remains effective
when there is significant increase in the number of resources and the
numbers of users.
5. Security : Security provides confidentiality, integrity and availability of
the information resources.
Example of distributed system:
1. Internet : The Internet is a very large distributed system. It enables
users to make use of services such as the World Wide Web, e-mail and
file transfer.
2. Intranet :
a. An intranet is a private network that is contained within an
enterprise.
b. An intranet is connected to the Internet via a router, which allows
the users inside the intranet to make use of services such as web or
e-mail.
Answer
Distributed transparency is the property of distributed databases by the
virtue of which the internal details of the distribution are hidden from the
users.
Types of transparencies :
1. Access transparency : It enables local and remote resources to be
accessed using identical operations.
2. Location transparency : It enables resources to be accessed without
knowledge of their physical or network location.
3. Concurrency transparency : It enables several processes to operate
concurrently using shared resources without interference between
them.
4. Replication transparency : It enables multiple instances of resources
to be used to increase reliability and performance without knowledge of
the replicas by users or application programmers.
5. Failure transparency : It enables the concealment of faults, allowing
users and application programs to complete their tasks despite the failure
of hardware or software components.
6. Performance transparency : It allows the system to be reconfigured
to improve performance as load varies.
PART-2
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Answer
1. An architectural model of a distributed system simplifies and abstracts
the functions of the individual components of a distributed system.
2. It also considers the placement of the components across a network of
computers and the interrelationship between the components.
3. The main objective of these models is to make the system reliable,
manageable, adaptable and cost-effective.
4. The two main types of architectural model are :
a. Client-server model (e.g., search engine) :
i. Fig. 1.8.1 illustrates the simple structure in which client
processes interact with individual server processes in separate
host computers in order to access the shared resources that
they manage.
Fig. 1.8.1. Clients invoke individual servers.
ii. This is the architecture that is most widely employed.
iii. Client-server model offers a direct and simple approach to the
sharing of data and other resources.
iv. Servers may act as clients of other servers.
v. For example, a web server is often a client of a local file server
that manages the files in which the web pages are stored.
b. Peer-to-Peer model :
i. In this architecture, all of the processes which are involved in
a task play similar roles, interacting cooperatively as peers
without any distinction between client and server processes.
ii. Fig. 1.8.2 illustrates the form of a peer-to-peer application.
Fig. 1.8.2. Peer-to-peer application : Peer-1, Peer-2 and Peer-3 each
run application and coordination code.
PART-3
Questions-Answers
Answer
Limitations of distributed systems are as follows :
1. Absence of global clock:
a. In a distributed system, a global clock (or common clock) is not
present.
b. Suppose a global clock is available for all the processes in the
system.
c. In this case, two different processes can observe a global clock
value at different instants due to unpredictable message
transmission delays.
d. Therefore, two different processes may falsely perceive two
different instants in physical time to be a single instant in physical
time.
2. Absence of shared memory :
a. Since the computers in a distributed system do not share common
memory, an up-to-date state of the entire system is not available
to any individual process.
b. An up-to-date state is necessary for reasoning about the system's
behaviour, debugging, recovering from failures, etc.
c. A process in a distributed system can obtain a coherent but partial
view of the system or a complete but incoherent view of the system.
d. A view is said to be coherent if all the observations of different
processes (computers) are made at the same physical time.
e. Because of the absence of a global clock in a distributed system,
obtaining a coherent global state of the system is difficult.
Example :
Fig. 1.10.1. Local states of S1 (site A) and S2 (site B), with a message
in transit from A that has not yet reached B.
a. S1 records its local state (Rs. 450) just after the debit (- 50) and S2
records its local state (Rs. 200) before receiving the message.
b. If the message in transit is not taken care of :
Global state = Local state of S1 + Local state of S2
= 450 + 200
= Rs. 650, i.e., Rs. 50 is missing and the recorded global state is
incoherent.
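The arithmetic of this example can be checked with a minimal sketch. The function names are hypothetical, and the starting balance of Rs. 500 at A is an assumption inferred from the Rs. 450 recorded after the Rs. 50 debit.

```python
# Hypothetical helper names; amounts follow the example above.
def naive_global_state(local_a, local_b):
    """Sum of the recorded local states, ignoring messages in transit."""
    return local_a + local_b


def coherent_global_state(local_a, local_b, in_transit):
    """Sum of the local states plus every amount still on the channel."""
    return local_a + local_b + sum(in_transit)


balance_a = 500 - 50   # assumed: A held Rs. 500 and records just after the debit
balance_b = 200        # B records before the credit arrives
channel = [50]         # the Rs. 50 message has not yet reached B

naive = naive_global_state(balance_a, balance_b)
coherent = coherent_global_state(balance_a, balance_b, channel)
print(naive, coherent)
```

Counting the channel contents together with the local states is what restores the missing Rs. 50.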
PART-4
Questions-Answers
Que 1.12. What are Lamport logical clocks ? List the important
conditions to be satisfied by Lamport logical clocks. If A and B
represent two distinct events in a process and if A → B then
C(A) < C(B) but vice-versa not true. Justify the statement.
AKTU 2015-16, Marks 10
Answer
Lamport logical clocks :
A Lamport logical clock is a monotonically increasing software counter,
whose value need bear no particular relationship to any physical clock.
Following conditions are to be satisfied by Lamport logical clocks :
1. If a and b are two events within the same process P_i, and a occurs
before b, then C_i(a) < C_i(b).
2. If a is the sending of a message by process P_i and b is the receipt of
that message by process P_j, then C_i(a) < C_j(b).
3. A clock C_i associated with a process P_i must always go forward, never
backward. That is, corrections to time of a logical clock must always be
made by adding a positive value to the clock, never by subtracting a
value.
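The three conditions can be illustrated with a minimal sketch. The class and method names are hypothetical, and the receive rule (take the maximum of the local clock and the message timestamp, then add d) is the usual implementation rule for Lamport clocks, assumed here rather than quoted from the answer above.

```python
class LamportClock:
    """Logical clock of one process; d is the assumed positive increment."""

    def __init__(self, d=1):
        self.time = 0
        self.d = d

    def tick(self):
        """Local or send event: move the clock forward, return the timestamp."""
        self.time += self.d
        return self.time

    def receive(self, msg_time):
        """Receive event: jump past the sender's timestamp, never backward."""
        self.time = max(self.time, msg_time) + self.d
        return self.time


p1, p2 = LamportClock(), LamportClock()
a = p1.tick()             # local event on P1
send = p1.tick()          # P1 sends a message carrying this timestamp
recv = p2.receive(send)   # P2 receives the message
print(a, send, recv)
```

Both update rules only ever add a positive value, so the clock of each process never goes backward.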
Justification : Event A causally affects event B if A → B. Now, if A → B
then C(A) < C(B), but the reverse is not true, because nothing can be said
about the causal relation of two events by comparing their timestamps alone.
Fig. 1.13.1.
a. Fig. 1.13.1 shows a computation over three processes. Clearly,
C(e11) < C(e22) and C(e11) < C(e32).
b. However, we can see from Fig. 1.13.1 that event e11 is causally
related to event e22 but not to e32.
c. Note that the initial clock values are assumed to be zero and d is
assumed to equal 1.
d. In other words, in Lamport's system of clocks, we can guarantee
that if C(a) < C(b) then b does not happen before a; however, we
cannot say whether events a and b are causally related or not by
just looking at the timestamps of the events.
e. The reason for the above limitation is that each clock can
independently advance due to the occurrence of local events in a
process.
f. Lamport's clock system cannot distinguish between the advancement
of clocks due to local events and that due to the exchange of
messages between processes.
g. Therefore, using the timestamps assigned by Lamport's clocks we
cannot reason about the causal relationship between two events
occurring in different processes by just looking at the timestamps
of the events.
Que 1.14. What are vector clocks ? Explain with the help of
implementation rule of vector clocks, how they are implemented ?
Give the advantages of vector clock over Lamport clock.
AKTU 2014-15, Marks 05
Answer
Vector clocks:
1. Vector clocks are used in a distributed system to determine whether
pairs of events are causally related.
2. Using vector clocks, timestamps are generated for each event in the
system, and their causal relationship is determined by comparing those
timestamps.
Implementation of vector clocks :
1. Let n be the number of processes in a distributed system. Each process
P_i is equipped with a clock C_i, which is an integer vector of length n.
2. Let a, b be a pair of events. Let C(a)[i] be the ith element of the vector
clock for the event a.
3. C(a) is dominated by C(b), i.e., C(a) < C(b), if and only if the following
two conditions hold :
a. $\forall i,\ 0 \le i \le n-1 : C(a)[i] \le C(b)[i]$
b. $\exists i,\ 0 \le i \le n-1 : C(a)[i] < C(b)[i]$
4. To implement a system of vector clocks, initialize the vector clock of
each process to (0, 0, ..., 0) (n components).
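The dominance test in step 3 can be written as a short sketch. The function names `dominated` and `concurrent` are hypothetical, and the sample timestamps are illustrative values rather than data from a particular figure.

```python
def dominated(ca, cb):
    """True if C(a) < C(b): every entry is <= and at least one is strictly <."""
    pairs = list(zip(ca, cb))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def concurrent(ca, cb):
    """Neither timestamp dominates the other: no causal relation."""
    return not dominated(ca, cb) and not dominated(cb, ca)


n = 3
initial_clock = [0] * n                 # step 4: every process starts at (0, ..., 0)

causal = dominated((2, 1, 3), (2, 1, 4))
unrelated = concurrent((2, 1, 3), (0, 0, 4))
print(causal, unrelated)
```

Unlike a single Lamport counter, the pair of tests tells causally related events apart from concurrent ones.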
PART-5
Questions-Answers
If two messages m1 and m2 are not received by recipient Q in the order
they were sent by process P, message delivery is not causal.
Algorithm :
Schiper-Eggli-Sandoz algorithm :
Instead of maintaining a vector clock based on the number of messages sent
to each process, the vector clock for this algorithm can be incremented at
any rate and has no additional meaning related to the number of messages
sent to the processes.
Sending a message :
1. All messages are timestamped and sent out with a list of all timestamps
of messages sent to other processes.
Que 1.16. Explain the algorithm for causal ordering of messages in
distributed system.
Answer
Algorithm for causal ordering of messages in distributed system :
a. Birman-Schiper-Stephenson algorithm :
There are three basic principles to this algorithm :
1. All messages are timestamped by the sending process.
2. A message cannot be delivered until :
i. All the messages before this one have been delivered locally.
ii. All the other messages that have been sent out from the original
process have been accounted as delivered at the receiving
process.
3. When a message is delivered, the clock is updated.
This algorithm requires that the processes communicate through
broadcast messages which ensure that only one message could be
received at any one time.
b. Schiper-Eggli-Sandoz algorithm : Refer Q. 1.15, Page 1-16B,
Unit-1.
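The delivery condition of the Birman-Schiper-Stephenson algorithm can be sketched with vector timestamps. This is a minimal illustration under the usual formulation, which is assumed here and not spelled out in the answer above: a broadcast from process `sender` is deliverable when it is the next message expected from that sender and the receiver has already delivered everything the sender had seen. The function names are hypothetical.

```python
def can_deliver(vt_msg, vt_local, sender):
    """Check the two delivery conditions at the receiving process."""
    for k, (m, mine) in enumerate(zip(vt_msg, vt_local)):
        if k == sender:
            if m != mine + 1:    # must be the next message from the sender
                return False
        elif m > mine:           # sender saw a message not yet delivered here
            return False
    return True


def deliver(vt_msg, vt_local):
    """Principle 3: update the local clock when the message is delivered."""
    return [max(m, mine) for m, mine in zip(vt_msg, vt_local)]


local = [0, 0, 0]
m1, m2 = [1, 0, 0], [2, 0, 0]            # two broadcasts from process 0

early = can_deliver(m2, local, sender=0)  # m2 arrives first and must wait
ok = can_deliver(m1, local, sender=0)
local = deliver(m1, local)
late = can_deliver(m2, local, sender=0)   # m2 is deliverable after m1
print(early, ok, late)
```

A message that fails the test is simply buffered and rechecked after each later delivery.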
Answer
4. A process must record its local state before it receives a marker on any
of its incoming channels.
Chandy-Lamport algorithm :
1. Marker receiving rule for process j :
On receiving a marker along channel C :
If (j has not recorded its state) then
    Record its process state
    Record the state of C as the empty set
    Follow the "marker sending rule"
else
    Record the state of C as the set of messages received along C after
    j's state was recorded and before j received the marker along C.
2. Marker sending rule for process i :
a. Process i records its state.
b. For each outgoing channel C on which a marker has not been sent,
i sends a marker along C before i sends further messages along C.
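The two marker rules can be traced with a minimal sketch. The `Process` class, the list-based FIFO channels and the money-transfer scenario are all assumptions made for illustration; messages are plain integers that are added to the receiver's balance.

```python
MARKER = "MARKER"


class Process:
    """One process following the marker sending and receiving rules."""

    def __init__(self, state, incoming, outgoing):
        self.state = state            # application state (a balance)
        self.incoming = incoming      # names of incoming channels
        self.outgoing = outgoing      # channel name -> list used as FIFO queue
        self.recorded_state = None
        self.channel_state = {}       # recorded state of each incoming channel
        self.recording = set()        # incoming channels still being recorded

    def send_markers(self):
        """Marker sending rule: record own state, then send markers."""
        self.recorded_state = self.state
        self.recording = set(self.incoming)
        self.channel_state = {c: [] for c in self.incoming}
        for queue in self.outgoing.values():
            queue.append(MARKER)

    def receive(self, channel, message):
        """Marker receiving rule plus ordinary message handling."""
        if message == MARKER:
            if self.recorded_state is None:
                self.send_markers()           # this channel is recorded as empty
            self.recording.discard(channel)   # channel state is now final
        else:
            if channel in self.recording:
                self.channel_state[channel].append(message)  # was in transit
            self.state += message


ab, ba = [], []                               # FIFO channels A->B and B->A
a = Process(500, incoming=["ba"], outgoing={"ab": ab})
b = Process(200, incoming=["ab"], outgoing={"ba": ba})

a.state -= 50
ab.append(50)                 # A debits Rs. 50; the message is in transit
b.send_markers()              # B starts the snapshot and records Rs. 200
b.receive("ab", ab.pop(0))    # Rs. 50 arrives after B recorded its state
a.receive("ba", ba.pop(0))    # A gets the marker, records Rs. 450, sends marker
b.receive("ab", ab.pop(0))    # B gets A's marker and closes channel ab

total = (a.recorded_state + b.recorded_state
         + sum(a.channel_state["ba"]) + sum(b.channel_state["ab"]))
print(total)
```

The transfer that was in transit ends up in the recorded state of channel ab, so no money is lost from the recorded global state.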
AKTU2017-18, Marks 10
Answer
CONTENTS
Part-1 : Distributed Mutual Exclusion : Classification of
Distributed Mutual Exclusion, Requirement of Mutual
Exclusion Theorem
Part-2 : Token Based and Non-Token Based Algorithm,
Performance Metric for Distributed Mutual Exclusion Algorithm
Part-3 : Distributed Deadlock Detection : System Model,
Resource vs Communication Deadlocks, Deadlock
Prevention, Avoidance, Detection & Resolution
Part-4 : Centralized Deadlock Detection, Distributed
Deadlock Detection, Path-Pushing Algorithm,
Edge Chasing Algorithms
2-2 B (CS-Sem-7) Distributed Mutual Exclusion
PART- 1
Distributed Mutual Exclusion : Classification of Distributed Mutual
Exclusion, Requirement of Mutual Exclusion Theorem.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Answer
Mutual exclusion:
1. Mutual exclusion is a problem that arises if the process relies on a
common resource that can be used only by one process at a time.
2. Concurrent access to shared resources is prevented.
3. Mutual exclusion algorithm guarantees that only one request accesses
the critical section (CS) at a time.
4. There are two classes of distributed mutual exclusion algorithm:
a. Non-token based algorithm
b. Token based algorithm
Requirements of good mutual exclusion algorithm :
1. Freedom from deadlocks : Two or more sites should not endlessly
wait for messages that will never arrive.
2. Freedom from starvation : A site should not be forced to wait
indefinitely to execute CS, i.e., every requesting site should get an
opportunity to execute CS in a finite time.
3. Fairness : Fairness dictates that requests must be executed in the order in
which they arrive in the system.
4. Fault tolerance : A mutual exclusion algorithm is fault-tolerant if, in
the wake of a failure, it can reorganize itself so that it continues to
function without any (prolonged) disruptions.
PART-2
Token Based and Non-Token Based Algorithm, Performance Metric
for Distributed Mutual Exclusion Algorithm.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Answer
Lamport's algorithm :
1. In Lamport's algorithm, $\forall i : 1 \le i \le N :: R_i = \{S_1, S_2, \ldots, S_N\}$.
Every site S_i keeps a queue, request_queue_i, which contains mutual
exclusion requests ordered by their timestamps.
2. This algorithm requires messages to be delivered in FIFO order
between every pair of sites.
Algorithm :
1. Requesting the critical section :
a. When a site S_i wants to enter the critical section (CS), it sends a
REQUEST(ts_i, i) message to all the sites in its request set R_i and
places the request on request_queue_i, where (ts_i, i) is the timestamp
of the request.
b. When a site S_j receives the REQUEST(ts_i, i) message from site S_i,
it returns a timestamped REPLY message to S_i and places site S_i's
request on request_queue_j.
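The requesting step can be sketched as follows. The `Site` class and its method names are hypothetical, the queue is kept as a heap ordered by the pair (timestamp, site id), and advancing the local clock on receipt follows the usual Lamport clock rule, which is an assumption here.

```python
import heapq


class Site:
    """One site keeping a timestamp-ordered request queue."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.clock = 0
        self.request_queue = []      # min-heap of (timestamp, site id)

    def request_cs(self):
        """Step 1a: timestamp the request, queue it locally, return it."""
        self.clock += 1
        request = (self.clock, self.site_id)
        heapq.heappush(self.request_queue, request)
        return request               # to be sent to every site in the request set

    def on_request(self, request):
        """Step 1b: queue the remote request and answer with a REPLY."""
        ts, _ = request
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.request_queue, request)
        return ("REPLY", self.clock, self.site_id)


s1, s2 = Site(1), Site(2)
r1 = s1.request_cs()
r2 = s2.request_cs()
reply_from_s2 = s2.on_request(r1)
reply_from_s1 = s1.on_request(r2)
print(s1.request_queue[0], s2.request_queue[0])
```

Ties between equal timestamps are broken by site id, so both sites see the same request at the head of their queues.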
Answer
Suzuki-Kasami algorithm:
1. In the Suzuki-Kasami algorithm, if a site attempting to enter the CS
does not have the token, it broadcasts a request message for the
token to all other sites.
2. The main design issues in this algorithm are :
a. Distinguishing between outdated request messages and current
request messages.
b. Determining which site has an outstanding request for the critical
section.
Algorithm:
1. Requesting the critical section :
a. If the requesting site S_i does not have the token, then it increments
its sequence number, RN_i[i], and sends a REQUEST(i, sn) message
to all other sites (sn is the updated value of RN_i[i]).
b. When a site S_j receives this message, it sets RN_j[i] to max(RN_j[i],
sn). If S_j has the idle token, then it sends the token to S_i if
RN_j[i] = LN[i] + 1. (LN[i] is the sequence number of the request that
site S_i executed most recently.)
2. Executing the critical section :
a. Site S_i executes the CS when it has received the token.
3. Releasing the critical section : After finishing the execution of the
CS, site S_i takes the following actions :
a. It sets the LN[i] element of the token array equal to RN_i[i].
b. For every site S_j whose ID is not in the token queue, it appends its
ID to the token queue if RN_i[j] = LN[j] + 1.
c. If the token queue is non-empty after the above update, then it deletes
the top site ID from the queue and sends the token to the site
indicated by the ID.
Thus, after having executed its CS, a site gives priority to other sites with
outstanding requests for the CS (over its pending requests for the CS).
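The release step can be sketched as a single function. The name `release_cs` and the list and deque representations of RN, LN and the token queue are assumptions made for illustration, with sites numbered from 0.

```python
from collections import deque


def release_cs(i, rn_i, token_ln, token_queue):
    """Release step at site i, which holds the token.

    rn_i        -- RN array kept at site i
    token_ln    -- LN array carried inside the token
    token_queue -- deque of site IDs carried inside the token
    Returns the site that receives the token next, or None if it stays idle.
    """
    token_ln[i] = rn_i[i]                              # step a
    for j in range(len(rn_i)):                         # step b
        if j not in token_queue and rn_i[j] == token_ln[j] + 1:
            token_queue.append(j)
    if token_queue:                                    # step c
        return token_queue.popleft()
    return None


rn_0 = [1, 1, 0]          # site 0 has seen its own request and one from site 1
ln = [0, 0, 0]
queue = deque()

next_site = release_cs(0, rn_0, ln, queue)
print(next_site, ln, list(queue))
```

Site 0 is not re-queued because step a has already made RN_0[0] equal to LN[0].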
Que 2.8. Explain the Ricart-Agrawala algorithm for mutual
exclusion. Mention the performance of this algorithm.
AKTU 2014-15, Marks 05
Answer
Fig. 2.9.2. Synchronization delay.
3. Number of messages per CS : As the number of messages exchanged per
CS execution reduces, the performance improves.
4. System throughput : It is the rate at which the system executes
requests for the CS. If sd is the synchronization delay and E is the
average critical section execution time, then the throughput is given
by the following equation :
$\text{System throughput} = \dfrac{1}{sd + E}$
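As a quick numerical check of this formula, the sketch below uses hypothetical values (a synchronization delay of 2 ms and an average CS execution time of 8 ms); neither figure comes from the text.

```python
def system_throughput(sd, e):
    """CS requests served per unit time for synchronization delay sd
    and average critical section execution time e."""
    return 1.0 / (sd + e)


rate_per_ms = system_throughput(sd=2, e=8)   # hypothetical values in ms
print(rate_per_ms * 1000, "requests per second")
```

Halving either the synchronization delay or the CS execution time raises the throughput, which is why algorithms with a smaller sd are preferred.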
Answer
Difference : Refer Q. 2.2, Page 2-3B, Unit-2.
Classification of mutual exclusion algorithm :
1. Token based algorithm : In the token based algorithm, a unique
token is shared among all sites. If a site possesses the token then it is
allowed to enter its CS (critical section).
2. Non-token based algorithm : In non-token based algorithms, a site
communicates with a set of other sites to arbitrate who should execute
the CS next.
PART-3
Distributed Deadlock Detection : System Model,
Resource vs Communication Deadlocks, Deadlock Prevention,
Avoidance, Detection & Resolution.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Answer
1. Deadlock is a situation in which a set of processes are blocked because
each process is holding a resource and waiting for another resource
acquired by some other process.
2. The detection of deadlocks in a distributed DBMS is more complicated,
because it involves several different sites.
3. Thus, in a distributed DBMS it is necessary to draw a global wait-for
graph (GWFG) for the entire system to detect a deadlock situation.
Deadlock handling strategies in distributed system :
1. Deadlock prevention :
a. Deadlock prevention is achieved by having a process collect all the
needed resources at once before it begins executing or by preempting
a process that holds the needed resource.
b. Mutual exclusion, hold-and-wait, no preemption and circular
wait are the four necessary conditions for a deadlock to occur. If
one of these conditions is never satisfied then deadlock can be
prevented.
c. Deadlock prevention methods are :
i. Collective requests : These methods deny the hold-and-wait
condition by ensuring that whenever a process requests a
resource it does not hold any other resource.
ii. Ordered requests : In this method circular-wait is denied
by assigning each resource type a unique global
number to impose a total ordering on all resource types.
iii. Preemption : A preemptable resource is one whose state can
be easily saved and restored later. Deadlocks can be prevented
using resource allocation policies that deny the no-preemption
condition.
d. Deadlock prevention is highly inefficient and impractical in
distributed systems.
2. Deadlock avoidance :
a. For deadlock avoidance in distributed system, a resource is assigned
to a process if the resulting state of the global system is safe.
b. State of global system includes all processes and resources in
distributed system.
c. Deadlock avoidance algorithm can be done in the following steps :
i. When a process requests for a resource, if the resource is
available for allocation it is not immediately allocated to the
process. Rather, the system assumes that the request is
granted.
PART-4
Questions-Answers
d. The control site checks the WFG for deadlocks whenever a request
edge is added to the WFG.
2. The Ho-Ramamoorthy algorithms : Ho and Ramamoorthy gave
two centralized deadlock detection algorithms called the two-phase and
one-phase algorithms.
a. The two-phase algorithm :
i. In the two-phase algorithm, every site maintains a status table
that contains the status of all the processes initiated at that
site.
ii. Periodically, a designated site requests the status table from
all sites, constructs a WFG from the information received, and
searches it for cycles.
iii. If there is no cycle, then the system is free from deadlocks;
otherwise, the designated site again requests status tables
from all the sites and again constructs a WFG using only those
transactions which are common to both reports.
iv. If the same cycle is detected again, the system is declared
deadlocked.
b. The one-phase algorithm :
i. The one-phase algorithm requires only one status report from
each site; however, each site maintains two status tables : a
resource status table and a process status table.
ii. The resource status table at a site keeps track of the
transactions that have locked or are waiting for resources
stored at that site.
iii. The process status table at a site keeps track of the resources
locked by or waited for by all the transactions at that site.
iv. Periodically, a designated site requests both the tables from
every site, constructs a WFG using only those transactions for
which the entry in the resource table matches the
corresponding entry in the process table, and searches the WFG
for cycles.
v. If no cycle is found, then the system is not deadlocked; otherwise
a deadlock is detected.
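The step shared by both algorithms, merging the reported tables into a WFG and searching it for cycles, can be sketched as follows. The table format (a dictionary mapping each waiting transaction to the set of transactions it waits for) and the function names are assumptions made for illustration.

```python
def build_wfg(status_tables):
    """Merge the status tables reported by all sites into one wait-for graph."""
    wfg = {}
    for table in status_tables:
        for waiter, holders in table.items():
            wfg.setdefault(waiter, set()).update(holders)
    return wfg


def has_cycle(wfg):
    """Depth-first search; a back edge to a node on the current path is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in wfg.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(wfg))


site1 = {"T1": {"T2"}}    # at site 1, T1 waits for T2
site2 = {"T2": {"T3"}}
site3 = {"T3": {"T1"}}

deadlocked = has_cycle(build_wfg([site1, site2, site3]))
free = has_cycle(build_wfg([site1, site2]))
print(deadlocked, free)
```

In the two-phase algorithm this check would be run twice, the second time on the transactions common to both reports.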
Fig. 2.18.1. Control sites organized under a central control site.
d. As a result, a control site collects status tables from all the sites in its
cluster and applies the one-phase deadlock detection algorithm to
detect all deadlocks involving only intracluster transactions.
CONTENTS
Part-1 : Agreement Protocol : Introduction, System Models
Part-2 : Classification of Agreement Problem, Byzantine Agreement
Problem, Consensus Problem, Interactive Consistency Problem,
Solution to Byzantine Agreement Problem
3-2 B (CS-Sem-7) Agreement Protocols
PART- 1
Agreement Protocol : Introduction, System Models.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
PART-2
Classification of Agreement Problem, Byzantine Agreement Problem,
Consensus Problem, Interactive Consistency Problem, Solution to
Byzantine Agreement Problem.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Answer
Agreement protocol : Refer Q. 3.1, Page 3-2B, Unit-3.
Byzantine agreement problem, the consensus problem and
interactive consistency problem : Refer Q. 3.4, Page 3-4B, Unit-3.
Lamport-Shostak-Pease algorithm :
The Lamport-Shostak-Pease algorithm, also referred to as the Oral Message
algorithm OM(m), m > 0, solves the Byzantine agreement problem for 3m + 1
or more processors in the presence of at most m faulty processors.
Let n denote the total number of processors (where n ≥ 3m + 1). The
algorithm is recursively defined as follows :
Algorithm OM(0) :
1. The source processor sends its value to every processor.
2. Each processor uses the value it receives from the source. If it receives
no value, then it uses a default value of 0.
Algorithm OM(m), m > 0 :
1. The source processor sends its value to every processor.
2. For each i, let v_i be the value processor i receives from the source (if it
receives no value, then it uses a default value of 0). Processor i acts as
the new source and initiates algorithm OM(m - 1) wherein it sends the
value v_i to each of the other n - 2 processors.
3. For each i and j (where i and j are not equal), let v_j be the value
processor i received from processor j in step 2 using algorithm OM(m - 1).
If it receives no value, then it uses a default value of 0. Processor i uses
the value majority(v_1, v_2, ..., v_{n-1}).
4. The majority function is used to select the majority value out of the
values received in a round of message exchange.
5. The function majority(v_1, v_2, ..., v_{n-1}) computes the majority of the
values v_1, v_2, ..., v_{n-1} if it exists (otherwise it returns 0).
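The recursion can be simulated with a short sketch. The function names, the way a faulty processor is modelled (a `lie` callback that decides what it sends to each receiver) and the two sample behaviours are assumptions made for illustration; they are not part of the algorithm's definition.

```python
from collections import Counter


def majority(values, default=0):
    """Majority value if one exists, otherwise the default value 0."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default


def om(m, source, value, receivers, faulty, lie):
    """Return {receiver: value it uses} for OM(m) started by `source`."""
    # Step 1: the source sends its value to every receiver.
    received = {
        r: (lie(source, r, value) if source in faulty else value)
        for r in receivers
    }
    if m == 0:
        return received                      # OM(0): use the value received
    # Step 2: every receiver relays its value to the others using OM(m - 1).
    relayed = {
        j: om(m - 1, j, received[j],
              [r for r in receivers if r != j], faulty, lie)
        for j in receivers
    }
    # Step 3: majority of the direct value and the relayed values.
    return {
        i: majority([received[i]] + [relayed[j][i] for j in receivers if j != i])
        for i in receivers
    }


def flip(sender, receiver, value):
    """Hypothetical faulty behaviour: always send the opposite bit."""
    return 1 - value


def split(sender, receiver, value):
    """Hypothetical faulty behaviour: send a different bit to each receiver."""
    return receiver % 2


# n = 4, m = 1: processor 0 is the source, one processor is faulty.
loyal_source = om(1, 0, 1, [1, 2, 3], faulty={3}, lie=flip)
faulty_source = om(1, 0, 1, [1, 2, 3], faulty={0}, lie=split)
print(loyal_source, faulty_source)
```

With four processors and one fault, the non-faulty receivers end up with the same value in both runs, as the condition n ≥ 3m + 1 requires.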
Que 3.6. Describe Lamport-Shostak-Pease algorithm. How does
vector clock overcome the disadvantages of Lamport clock ? Explain
with an example. AKTU2016-17, Marks 15
Answer
Lamport-Shostak-Pease algorithm : Refer Q. 3.5, Page 3-5B, Unit-3.
Vector clock overcomes the disadvantage of Lamport clock : With Lamport
clocks, we cannot determine whether two events are causally related by
looking at the timestamps, because C(A) < C(B) does not always mean
A → B, while vector clocks allow us to compare the timestamps of the events
to determine whether they are causally related or not.
Fig. 3.6.1. Space-time diagram of events labelled with vector timestamps.
Que 3.7. What do you understand by Byzantine agreement
problem ? AKTU 2018-19, Marks 10
OR
What is Byzantine agreement problem ? Provide the solution to
Byzantine agreement problem. AKTU 2018-19, Marks 10
Answer
1. In the Byzantine agreement problem, a single value, which is to be agreed
on, is initialized by an arbitrary process and all non-faulty processes
have to agree on that value.
2. There are n processes, P = {p_1, p_2, ..., p_n}, with unique names over
N = {1, ..., n}, and at most t < n of the processes can be Byzantine.
3. Each process starts with an input value v from a set of values.
4. The goal of this protocol is to ensure that all non-faulty processes
eventually output the same value.
5. The output of a non-faulty process is called the decision value.
6. An algorithm solves the Byzantine agreement if the following conditions
hold :
a. Agreement : All non-faulty processes agree on the same value
(i.e., there are no two non-faulty processes that decide different
values).
b. Validity : If all non-faulty processes start with the same value v,
the decision value of all non-faulty participants is v.
c. Termination : All non-faulty processes decide a value.
7. Reaching agreement in presence of Byzantine processes is expensive,
as the number of messages grows quadratically with the number of
participants n and the number of rounds (time) grows linearly with the
number of Byzantine participants t (with n > 3t).
Solution to Byzantine agreement : Solution to Byzantine agreement
problem is given by Lamport-Shostak-Pease algorithm.
Lamport-Shostak-Pease algorithm : Refer Q. 3.5, Page 3-5B, Unit-3.
Que 3.8. Show that a Byzantine agreement cannot be reached
among three processors, where one processor is faulty.
OR
Explain treatment of impossible result for the solution of Byzantine
agreement problem.
Answer
1. Sometimes, the agreement problem may lead to a condition which
is impossible to solve.
2. The situation where agreement is impossible is called an impossible
result.
3. In this type of problem, agreement cannot be reached.
4. In a system, the impossible result situation is found with more than two
processors.
5. Let us check the situation of impossible result in a system with three
processors.
6. Consider a system with three processors, P0, P1 and P2.
7. We assume that there are only two values, 0 and 1, on which processors
agree, and processor P0 initiates the initial value.
8. There are two possibilities :
Case I : P0 is not faulty :
1. Assume P2 is faulty.
2. Suppose that P0 broadcasts an initial value of 1 to both P1 and P2.
3. Processor P2 acts maliciously and communicates a value of 0 to
processor P1.
4. Thus, P1 receives conflicting values from P0 and P2.
5. However, since P0 is non-faulty, processor P1 must accept 1 as the
agreed upon value.
Answer
Byzantine agreement problem and its solution : Refer Q. 3.7,
Page 3-7B, Unit-3.
Proof:
1. Consider a system with four processors P1, P2, P3 and P4. Assume that
processors are exchanging three values x, y and z with each other, P1
initiates the initial value and processors P2 and P4 are faulty.
2. To initiate the agreement, processor P1 executes algorithm OM(1) and
sends its value x to all other processors as shown in Fig. 3.9.1.
Fig. 3.9.1. P1 sends x to P2, P3 and P4.
3. After receiving the value x from source processor P1, processors P2, P3
and P4 execute the algorithm OM(0).
4. Processor P3 is non-faulty and sends value x to processors P2 and P4.
Faulty processors P2 and P4 send value y to (P3, P4) and z to (P2, P3)
respectively as shown in Fig. 3.9.2.
Fig. 3.9.2. Values exchanged among P2, P3 and P4.
5. After receiving all the messages, processors P1, P2, P3 and P4 decide on
the majority value.
Majority values for Byzantine solution :
Processor    Received values    Majority value
P2           (x, x, z)          x
P3           (x, y, z)          0
P4           (x, x, y)          x
6. According to the majority value table, the processors do not agree on a
single common majority value, which violates the condition of Byzantine
agreement problem.
7. This proves that Byzantine agreement cannot always be reached among
four processors if two processors are faulty.
PART-3
Application of Agreement Protocol, Atomic Commit in Distributed
Database System.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
PART-4
Questions-Answers
1. In a distributed file system, files can be stored at one machine and the
computation can be performed at another machine.
2. When a machine needs to access a file stored on a remote machine, the
remote machine performs the necessary file access operations and
returns data if a read operation is performed.
3. File servers are higher performance machines which are used to store
files and perform storage and retrieval operations.
4. Client machines are used for computational purpose and to access the
files stored on servers.
Figure : Clients (each with a client cache and a local disk) connected
through a communication network to servers (each with a server cache).
a. Name server :
i. A name server is a process that maps names specified by clients
to stored objects such as files and directories.
ii. The mapping occurs when a process references a file or
directory for the first time.
b. Cache manager :
i. A cache manager is a process that implements file caching.
ii. In file caching, a copy of data stored at a remote file server is
brought to the client's machine when referenced by the client.
iii. Cache managers can be present at both clients and file servers.
iv. Cache managers at the servers cache files in the main memory
to reduce delays due to disk latency.
v. If multiple clients are allowed to cache a file and modify it, the
copies can become inconsistent.
vi. To avoid this inconsistency problem, cache managers at both
servers and clients coordinate to perform data storage and
retrieval operations.
Que 3.16. Explain the mechanism for distributed file system.
Answer
Mechanism for building distributed file system :
1. Mounting :
a. A mount mechanism allows the binding of different file name spaces
to form a single hierarchically structured name space.
Figure : Name space trees of Server X, Server Y and Server Z joined at
mount points.
restricted to those that can recover after discovering that the cached
data is invalid, that is, the data should be self-validating upon use.
4. Bulk data transfer :
a. In this mechanism, multiple consecutive data blocks are transferred
from server to client.
b. This reduces file access overhead by obtaining multiple
blocks with a single seek, by formatting and transmitting multiple
large packets in a single context switch and by reducing
the number of acknowledgements that need to be sent.
c. Bulk transfer amortizes the cost of the fixed communication protocol
overheads and disk seek time over many consecutive blocks of a
file.
5. Encryption :
a. Encryption is the process used for data security in the distributed
system.
b. A number of possible threats exist, such as unauthorized release of
information and unauthorized modification of information.
c. Encryption prevents unauthorized release and modification of
information.
d. For performance, encryption/decryption may be performed by
special hardware at the client and server.
Que 3.17.
i. Explain typical architecture of distributed file system. Give the
mechanisms for building distributed file system.
ii. What is caching ? How is it useful in DFS ?
AKTU 2014-15, Marks 10
Answer
i. Architecture of distributed file system: Refer Q. 3.15, Page 3-14B,
Unit-3.
Mechanisms for building DFS:Refer Q. 3.16, Page 3-15B, Unit-3.
ii. Caching: Refer Q. 3.16, Page 3-15B, Unit-3.
Uses of caching in DFS:
a. The file system performance can be improved by caching, since
accessing remote disks is much slower than accessing local memory
or local disks.
b. Caching reduces the frequency of access to the file servers and the
communication network, thereby improving the scalability of a file
system.
Que 3.19. Write and explain various issues that must be addressed
in design and implementation of distributed file system.
AKTU 2017-18, Marks 05
Answer
Following are various issues that must be addressed in the design and
implementation of distributed file system:
1. Naming and name resolution :
a. A name in file systems is associated with an object.
b. Name resolution refers to the process of mapping a name to an
object or, in the case of replication, to multiple objects.
c. A name space is a collection of names which may or may not share
an identical resolution mechanism.
2. Caches on disk or main memory :
a. The advantages of having the cache in the main memory are as
follows :
i. Diskless workstations, which are cheaper, can take advantage of
caching.
b. A cache in the main memory has to deal with the memory contention
between the cache and the virtual memory system.
3. Writing policy :
a. The writing policy decides when a modified cache block at a client
should be transferred to the server.
b. The simplest policy is write-through.
c. In write-through, all writes requested by the applications at clients
are also carried out at the servers immediately.
d. The main advantage of write-through is reliability.
4. Availability:
a. Availability is one of the important issues in the design of distributed
file systems.
b. The failure of servers or the communication network can severely
affect the availability of files.
c. Replication is the primary mechanism used for enhancing the
availability of files in distributed file systems.
5. Scalability : The issue of scalability deals with the suitability of the
design of a system to handle the demands of the growing system.
6. Semantics :
a. The semantics of a file system characterizes the effects of accesses
on files.
b. The basic semantics are easily understood by programmers and are
easy to handle.
c. A read operation will return the data (stored) due to the latest write
operation.
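The write-through policy from item 3 above can be illustrated with a minimal sketch. The class names `FileServer` and `WriteThroughCache` are hypothetical, and the server is only an in-memory stand-in for a remote file server, so this is an illustration under those assumptions and not a real DFS client.

```python
class FileServer:
    """Hypothetical in-memory stand-in for a remote file server."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks.get(block_id)


class WriteThroughCache:
    """Client-side cache that forwards every write to the server at once."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def write(self, block_id, data):
        # Update the local copy and immediately carry out the write at the server.
        self.cache[block_id] = data
        self.server.write(block_id, data)

    def read(self, block_id):
        # Serve from the cache when possible; fetch from the server on a miss.
        if block_id not in self.cache:
            self.cache[block_id] = self.server.read(block_id)
        return self.cache[block_id]
```

Because the server copy is updated on every write, a client crash does not lose modified blocks, which is the reliability advantage mentioned above. The price is one server round trip per write.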
3. The naming and locating facilities jointly form a naming system that
hides the details of how and where an object is actually located in the
network.
4. The naming system plays a very important role in achieving the goal of
location transparency and in facilitating transparent migration and
replication of objects and object sharing.
Flat naming :
1. A flat name space is the simplest name space, in which names are
character strings.
2. Flat names are fixed size bit strings that can be efficiently handled by
machines.
3. Names defined in a flat name space are called primitive or flat names.
4. Flat names do not have any structure.
5. Flat names are suitable for use either for small name spaces having
names for only a few objects or for system-oriented name spaces.
Structured naming :
1. Structured names are organized into name spaces.
2. A structured name space is represented as a labeled directed graph
with two types of nodes.
3. A leaf node represents a named entity and stores information about
the entity.
4. A directory node stores a directory table of (edge label, node ID) pairs.
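Resolution of a structured name can be sketched as a walk over the labeled directed graph described above. The node IDs, the dictionary layout and the function name `resolve` are assumptions made only for this illustration.

```python
# Hypothetical name space: node ID -> node description.
# Directory nodes hold a table of (edge label -> node ID) pairs,
# leaf nodes hold information about the named entity.
NAME_SPACE = {
    "n0": {"kind": "directory", "table": {"home": "n1"}},
    "n1": {"kind": "directory", "table": {"notes.txt": "n2"}},
    "n2": {"kind": "leaf", "info": {"server": "X"}},
}


def resolve(graph, root_id, path):
    """Follow the edge labels in `path` from `root_id` and return the final node."""
    node_id = root_id
    for label in path:
        node = graph[node_id]
        if node["kind"] != "directory":
            raise KeyError(f"{node_id} is a leaf and has no entry {label!r}")
        # Look up the edge label in the directory table to get the next node ID.
        node_id = node["table"][label]
    return graph[node_id]
```

For example, `resolve(NAME_SPACE, "n0", ["home", "notes.txt"])` follows two directory tables and returns the leaf node that stores the entity's information.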
Que 3.21. Explain shared memory architecture and distributed
memory architecture. AKTU 2017-18, Marks 10
Answer
[Fig. 3.21.1. Shared memory architecture: memory reached over a shared bus.]
6. It belongs to the MIMD (Multiple Instruction, Multiple Data)
computational model.
Distributed memory architecture : Its main features are :
1. Each processor in the system is directly connected to its own memory
and caches. No processor can directly access another processor's
memory.
2. Each node has a network interface (NI).
3. All communication and synchronization between processors happens
via messages passed through the NI.
4. Since this approach uses messages for communication and
synchronization, it is often called message passing architecture.
5. This architecture belongs to the MIMD (Multiple Instruction Stream,
Multiple Data Stream) programming model.
6. A schematic view of the distributed memory approach is shown in the
Fig. 3.21.2, where each processor has local memory and processors
(each denoted by P) communicate through an interconnection network.
Also, each processor has its own cache (denoted by $).
[Fig. 3.21.2. Distributed memory architecture: processors (P), each with local memory (M) and I/O.]
Answer
Benefits of caching :
a. Due to caching, data is cached in the main memory at the servers to
reduce disk access latency.
b. File system performance can be improved by caching.
c. It improves the scalability of a file system.
Assumptions : Following assumptions must hold for caching to be useful :
1. Client-specific assumptions : The assumptions that are client-specific
are :
a. The client might issue a list of requests instead of individual
requests. This is known as a request-list. The client suffers an
insignificant performance penalty in the construction of this list.
b. The client will receive a bundle that is a collection of responses to
some or all of its requests (in a request-list). The client will expend a
fixed amount of resources to disassemble the bundle.
c. The client is prepared to receive responses not in the same order as
that of its requests. In addition, the order of requests and responses
is not important to the client.
2. Server-specific assumptions : The assumptions that are server-specific
are :
a. The server will determine whether or not to use the PC-Bundle
mechanism whenever a request-list is received. The server incurs
insignificant overhead for this action.
b. The server will determine which of the responses to a request-list
should be bundled. It will expend a fixed amount of resources to
determine and assemble a bundle.
3. Proxy-specific assumptions : The assumptions that are proxy-specific
are :
PART-5
Design Issues in Distributed Shared Memory.
Questions-Answers
Answer
1. Distributed Shared Memory (DSM) is a form of memory architecture
where the (physically separate) memories can be addressed as one
(logically shared) address space.
2. The shared memory model provides a virtual address space shared
between all nodes.
3. DSM is primarily a tool for any distributed application in which individual
shared data items can be accessed directly.
Architecture:
1. DSM consists of a number of nodes or computers, each of which is
connected to the others through a high speed communication channel.
[Fig. 3.23.1. Node 1, Node 2, ..., Node n sharing a distributed shared memory.]
2. Each node consists of one or more Central Processing Units and a single
memory unit.
3. A message is passed from one node to another by means of a simple
message passing technique.
4. Data moves between main memory and secondary memory (within a
node) and between main memories of different nodes.
5. The main memory of individual nodes is used to cache pieces of the
shared memory space using a mapping manager.
6. When a process accesses data in the shared address space, the mapping
manager maps the shared memory address to physical memory.
7. The mapping manager is a layer of software implemented either in the
operating system kernel or as a runtime library routine.
8. For the mapping operation, the shared memory space is partitioned into
blocks.
9. A simple message passing system allows processes on different nodes to
exchange messages with each other.
5. Replacement strategy :
a. If the local memory of a node is full, a cache miss at that node implies
not only a fetch of the accessed data block from a remote node but
also a replacement.
b. Therefore, a cache replacement strategy is also necessary in the
design of a DSM system.
PART-6
Questions-Answers
Answer
[Fig. 3.25.1. Clients and data block.]
Answer
2. Because many nodes can write shared data concurrently, the access to
shared data must be controlled to maintain its consistency.
3. One simple way to maintain consistency is to use a gap-free sequencer.
4. In this scheme, all nodes wishing to modify shared data will send the
modifications to a sequencer.
5. The sequencer will assign a sequence number and multicast the
modification with the sequence number to all the nodes that have a copy
of the shared data item.
6. Each node processes the modification requests in the sequence number
order.
7. A gap between the sequence number of a modification request and the
expected sequence number at a node indicates that one or more
modifications have been missed.
8. Under such circumstances, the node will ask for the retransmission of
the modifications it has missed.
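The gap-free sequencer scheme described above can be sketched in a few lines. The class names `Sequencer` and `Node`, the `retransmit` method and the use of direct method calls instead of a real multicast are assumptions made for illustration only.

```python
class Sequencer:
    """Assigns consecutive sequence numbers and keeps a log for retransmission."""

    def __init__(self):
        self.next_seq = 1
        self.log = {}

    def submit(self, modification):
        seq = self.next_seq
        self.next_seq += 1
        self.log[seq] = modification
        return seq, modification  # in a real system this pair would be multicast

    def retransmit(self, seq):
        return seq, self.log[seq]


class Node:
    """Applies modifications strictly in sequence number order."""

    def __init__(self, sequencer):
        self.sequencer = sequencer
        self.expected = 1
        self.pending = {}
        self.applied = []

    def receive(self, seq, modification):
        if seq < self.expected:
            return  # duplicate of an already applied modification
        self.pending[seq] = modification
        # A gap below `seq` means some modifications were missed: request them again.
        for missing in range(self.expected, seq):
            if missing not in self.pending:
                s, m = self.sequencer.retransmit(missing)
                self.pending[s] = m
        # Apply everything that is now contiguous with the expected number.
        while self.expected in self.pending:
            self.applied.append(self.pending.pop(self.expected))
            self.expected += 1
```

If a node receives sequence number 3 while still expecting 2, it detects the gap, fetches modification 2 from the sequencer and then applies 2 and 3 in order.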
Answer
Following are the advantages of DSM:
1. DSM provides a simple abstraction for sharing data.
2. DSM systems allow complex structures to be passed by reference, thus
simplifying the development of algorithms for distributed applications.
3. DSM takes advantage of the locality of reference exhibited by programs
and thereby cuts down on the overhead of communicating over the
network.
4. DSM systems are cheaper to build.
CONTENTS
Part-1 : Failure Recovery in Distributed System : Concepts in Backward
and Forward Recovery, Recovery in Concurrent System ...... 4-2B to 4-4B
4-1 B (CS-Sem-7)
4-2 B (CS-Sem-7) Failure Recovery in Distributed System
PART- 1
Failure Recovery in Distributed System : Concepts in Backward
and Forward Recovery, Recovery in Concurrent System.
Questions-Answers
Answer
Forward and backward recovery, advantages and disadvantages
of forward recovery : Refer Q. 4.1, Page 4-2B, Unit-4.
Advantages of backward recovery :
1. Backward recovery can handle unpredictable errors caused by residual
design faults.
2. Backward recovery can be used regardless of the damage sustained by
the state.
3. Backward recovery can handle transient or permanent arbitrary
faults.
Disadvantages of backward recovery :
1. Backward recovery requires significant resources (i.e., time,
computation, and stable storage) to perform checkpointing and
recovery.
2. The implementation of backward recovery often requires that the
system be halted temporarily.
3. Restoring the previous state of a system or process is relatively
costly (performance wise).
Que 4.3. What do you mean by recovery in concurrent system ?
Explain. AKTU 2014-15, Marks 05
OR
What do you mean by backward and forward error recovery? Discuss
recovery in concurrent systems in detail.
AKTU 2015-16, Marks 10
Answer
Backward and forward error recovery : Refer Q. 4.1, Page 4-2B,
Unit-4.
Recovery in concurrent system :
1. In concurrent systems, several processes cooperate by exchanging
information to accomplish a task.
2. The information exchange can be through a shared memory in the case
of shared memory machines or through messages in the case of
distributed system.
3. Recovery in a concurrent system means rolling back all the affected
processes at the time of failure.
4. During failure, the system assigns a recovery point at the point where
the failure occurs in the process.
5. If the failed process is associated with an active process, then the active
process must also roll back to an earlier state.
6. The recovery point helps to undo the effect caused by the failed process.
PART-2
Questions-Answers
Answer
Consistent checkpoints :
Answer
Method to obtain consistent set of checkpoints :
1. Assume that the action of taking a checkpoint and the action of sending
or receiving a message are indivisible; that is, they are not interrupted
by any other events.
2. If every process takes a checkpoint after sending every message, the set
of the most recent checkpoints is always consistent.
3. The set of latest checkpoints is consistent because the latest checkpoint
at every process corresponds to a state where all the messages recorded
as received in it have already been recorded elsewhere as sent.
4. Therefore, rolling back a process to its latest checkpoint would not
result in any orphan messages.
5. Taking a checkpoint after each message is sent is expensive, so
we reduce the overhead by taking a checkpoint after every K (K > 1)
messages sent.
6. However, this method suffers from the domino effect.
[Fig. 4.7.1. Process timelines showing (a) the failure and (b) the 2nd rollback.]
6. Process X, after resuming from x1, sends n2 and receives m2.
7. However, because X is rolled back, there is no record of sending n1
and hence Y has to roll back for the second time.
8. This forces X to roll back too, as it has received m2, and there is no
record of sending m2 at Y.
9. This situation can repeat indefinitely, preventing the system from
making any progress.
ii. Domino effect:
1. Consider the system activity illustrated in Fig. 4.7.2.
2. In Fig. 4.7.2, X, Y and Z are three processes that cooperate by
exchanging information (shown by the arrows).
3. Each symbol marks a recovery point to which a process can be
rolled back in the event of a failure.
[Fig. 4.7.2. Processes X, Y and Z over time, with their recovery points.]
Answer
Synchronous checkpointing :
1. In this approach, processes synchronize their checkpointing activity so
that a globally consistent set of checkpoints is always maintained in the
system.
2. In this method, a consistent set of checkpoints is used, which avoids
livelock problems during recovery.
Algorithm:
1. It assumes the following characteristics :
a. Processes communicate by exchanging messages through
communication channels.
b. Channels are FIFO in nature.
c. Communication failures do not partition the network.
2. The checkpoint algorithm takes two kinds of checkpoints on stable
storage :
a. Permanent checkpoint : A permanent checkpoint is a local
checkpoint at a process and is a part of a consistent global checkpoint.
b. Tentative checkpoint : A tentative checkpoint is a temporary
checkpoint that is made a permanent checkpoint on the successful
termination of the checkpoint algorithm.
3. Processes roll back only to their permanent checkpoints.
4. The algorithm has two phases :
a. First phase :
i. An initiating process Pi takes a tentative checkpoint and requests
all the processes to take tentative checkpoints.
ii. Each process informs process Pi whether it accepts or rejects
the request of taking a tentative checkpoint.
iii. When all the processes have successfully taken tentative
checkpoints, Pi decides to make these checkpoints
permanent. Otherwise, the tentative checkpoints are
discarded.
b. Second phase :
i. Pi informs all the processes of the decision it reached at the
end of the first phase.
ii. A process, on receiving the message from Pi, will act accordingly.
iii. Therefore, either all or none of the processes take permanent
checkpoints.
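The two phases of the checkpoint algorithm can be sketched as follows. The `Process` class, its `willing` flag and the `synchronous_checkpoint` function are hypothetical names, and message passing is replaced by direct method calls, so this is only an illustration of the all-or-none decision.

```python
class Process:
    def __init__(self, name, willing=True):
        self.name = name
        self.willing = willing   # assumption: whether the process accepts the request
        self.state = 0           # stand-in for the process state to be saved
        self.tentative = None
        self.permanent = 0

    def take_tentative(self):
        """First phase: try to take a tentative checkpoint and report the outcome."""
        if not self.willing:
            return False
        self.tentative = self.state
        return True

    def make_permanent(self):
        self.permanent = self.tentative
        self.tentative = None

    def discard(self):
        self.tentative = None


def synchronous_checkpoint(initiator, others):
    participants = [initiator] + list(others)
    # First phase: everyone (including the initiator) takes a tentative checkpoint.
    replies = [p.take_tentative() for p in participants]
    decision = all(replies)
    # Second phase: the initiator's decision is applied at every process.
    for p in participants:
        if decision:
            p.make_permanent()
        else:
            p.discard()
    return decision
```

A single rejecting process makes every process discard its tentative checkpoint, so the set of permanent checkpoints changes either everywhere or nowhere.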
Que 4.9. Write short note on lost message.
Answer
Lost messages :
a. Suppose that checkpoints x1 and y1 (Fig. 4.9.1) are chosen as the recovery
points for processes X and Y, respectively.
b. In this case, the event that sent message m is recorded in x1, while the
event of its receipt at Y is not recorded in y1.
c. If Y fails after receiving message m, the system is restored to state
{x1, y1}, in which message m is lost as process X is past the point where
it sends message m.
d. This condition can also arise if m is lost in the communication channel
and processes X and Y are in states x1 and y1, respectively.
e. Both the above conditions are indistinguishable.
[Fig. 4.9.1. Timelines of processes X and Y showing message m and the failure.]
2. It also assumes that the checkpoint and the rollback recovery algorithms
are not concurrently invoked.
3. This algorithm has two phases :
a. First phase :
i. An initiating process Pi checks whether all the processes are
willing to restart from their previous checkpoints.
ii. A process may reject the request of Pi if it is already participating
in a checkpointing or a recovering process initiated by some
other process.
iii. If all the processes accept the request of Pi to restart from
their previous checkpoints, Pi decides to restart all the
processes.
b. Second phase :
i. Pi propagates its decision to all the processes. On receiving Pi's
decision, a process will act accordingly.
ii. The recovery algorithm requires that no process sends
messages related to the underlying computation while it is
waiting for Pi's decision.
Que 4.12. Write the difference between deadlock and livelock.
Answer
Difference:
S. No. Deadlock Livelock
PART-3
Fault Tolerance :Issues in Fault Tolerance.
Questions-Answers
Answer
Faults : A fault is an anomalous physical condition. The causes of a fault
include design errors, manufacturing problems, damage fatigue or other
deterioration, and external disturbances.
Failure : Failure of a system occurs when the system does not perform its
services in the manner specified. An erroneous state of the system is a state
which could lead to a system failure by a sequence of valid state transitions.
Different fault tolerance approaches :
1. Replication :
a. Replication is the process of creating and maintaining multiple copies
of data objects or processes on several nodes.
b. Therefore, if a failure occurs on one node, the data will still be
accessible to the user from another node.
c. Replication provides high data availability and performance.
2. Checkpointing :
a. Fault tolerance can be achieved through checkpointing.
b. Checkpointing means to periodically save the consistent state of
the system in a reliable storage medium. Each such instance when
a system is in the consistent state is called a checkpoint.
c. Checkpointing is primarily used to avoid losing all the useful
processing done before a fault has occurred.
d. In case of a fault, a checkpoint enables the execution of a program to
be resumed from a previous consistent state rather than resuming
the execution from the beginning.
Que 4.14. Discuss at least three main issues that are relevant to
the understanding of distributed fault tolerance system. Explain
how that makes it important. AKTU2015-16, Marks 10
Answer
Issues in the fault tolerance are as follows :
1. Process deaths:
a. When a process dies, it is important that the resources allocated to
that process are recouped, otherwise they may be permanently
lost.
b. Many distributed systems are structured along the client-server
model in which a client requests a service by sending a message to
a server.
c. If the server process fails, it is necessary that the client machine be
informed so that the client process, waiting for a reply, can be
unblocked to take suitable action.
d. Similarly, if a client process dies after sending a request to a server,
it is imperative that the server be informed that the client process
no longer exists.
e. This will facilitate the server in reclaiming any resources it has
allocated to the client process.
2. Machine failure :
a. In the case of machine failure, all the processes running at the
machine will die.
b. As far as the behaviour of a client process or a server process is
concerned, there is not much difference in their behaviour in the
event of a machine failure or a process death.
c. In case of machine failure, an absence of any kind of message
indicates either process death or a machine failure.
3. Network failure:
a. A communication link failure can partition a network into subnets,
making it impossible for a machine to communicate with another
machine in a different subnet.
b. A process cannot distinguish between a machine failure and a
communication link failure, unless the underlying communication
network (such as a slotted ring network) can recognize a machine
failure.
c. If the communication network (such as Ethernet) cannot recognize
machine failures and thus cannot return a suitable error code, a
fault-tolerant design will have to assume that a machine may be
operating and processes on that machine are active.
PART-4
Commit Protocol, Voting Protocol, Dynamic Voting Protocol.
Questions-Answers
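Long Answer Type and Medium Answer Type Questions
[Figure: message exchange between coordinator and participant. Step 1: the coordinator is prepared to commit (waiting for votes) and sends canCommit?. Step 2: the participant is prepared to commit (uncertain) and replies yes. Step 3: the coordinator is committed and sends doCommit. Step 4: the participant is committed and sends haveCommitted; the coordinator is then done.]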
Que 4.16. What are commit protocols ? Explain how two phase
protocols respond to failure of participating site and failure of
coordinator. AKTU 2014-15, Marks 05
Answer
Commit protocols :
1. In distributed system commit protocols ensure the atomicity across the
sites, i.e., when a transaction executes at multiple sites it must either be
committed at all the sites or aborted at all the sites.
2. The goal of commit protocols is to have all the concerned participants
agree either to commit or to abort a transaction.
Handling a failure of a participating site :
Let us assume that the failed site is Si and the Transaction Coordinator is TC.
There are two things we need to look into to handle failure of a participating
site :
1. The response of the Transaction Coordinator of transaction T :
a. If the failed site has not sent any message (<ready T>), the TC
cannot decide to commit the transaction. Hence, the transaction T
should be aborted and the other participating sites are to be informed.
b. If the failed site has sent a message (<ready T>), the TC can
assume that the failed site also was ready to commit, hence the
transaction can be committed by TC and the other sites will be
informed to commit. In this case, the site which recovers from
failure has to execute the two phase commit (2PC) protocol to bring
its local database up-to-date.
2. The response of the failed site when it recovers :
a. When it recovers from failure, the recovering site Si must identify the
fate of the transactions which were going on during the failure of Si.
This can be done by examining the log file entries of site Si.
b. This is how the two phase commit (2PC) protocol handles the failure
of a participating site.
Handling the failure of a coordinator site :
Let us suppose that the coordinator site failed during execution of the two
phase commit (2PC) protocol for a transaction T. This situation can be handled
in the following two ways :
1. The other sites which are participating in the transaction T may try to
decide the fate of the transaction. That is, they may try to decide on
commit or abort of T using the control messages available in every site.
2. The second way is to wait until the coordinator site recovers.
Que 4.17. Describe three phase commit protocol. How is three phase
commit protocol different from two phase commit protocol ?
AKTU 2017-18, Marks 10
Answer
Phases in three phase commit protocol :
1. The three-phase commit (3PC) protocol is a distributed algorithm which
lets all nodes in a distributed system agree to commit a transaction.
2. 3PC is a non-blocking protocol.
3. 3PC places an upper bound on the amount of time required before a
transaction either commits or aborts.
4. This property ensures that if a given transaction holds some resource
locks, it will release the locks after the time-out.
5. The three-phase commit (3PC) protocol is more complicated and more
expensive than the two-phase commit protocol.
Phase 1 : Voting / Prepare phase :
1. The Transaction Coordinator (TC) of the transaction writes a
BEGIN_COMMIT message in its log file, sends a PREPARE message
to all the participating sites and waits.
2. On receiving the PREPARE message, if a site is ready to commit, then the
site's Transaction Manager (TM) writes READY in its log and sends
VOTE_COMMIT to the TC.
3. If any site is not ready to commit, it writes ABORT in its log and
responds with VOTE_ABORT to the TC.
Phase 2 : Buffering / Pre-commit phase :
1. If the TC received VOTE_COMMIT from all the participating sites, then it
writes PREPARE_TO_COMMIT in its log and sends a
PREPARE_TO_COMMIT message to all the participating sites.
2. If the TC receives any one VOTE_ABORT message, it writes ABORT in its
log, sends GLOBAL_ABORT to all the participating sites and also
writes an END_OF_TRANSACTION message in its log.
3. On receiving the message PREPARE_TO_COMMIT, the TMs of the
participating sites write PREPARE_TO_COMMIT in their logs and
respond with a READY_TO_COMMIT message to the TC.
4. If they receive a GLOBAL_ABORT message, then the TMs of the sites write
ABORT in their logs and acknowledge the abort.
Phase 3 : Decision / Commit or abort phase :
1. If all responses are READY_TO_COMMIT, then the TC writes COMMIT
in its log and sends a GLOBAL_COMMIT message to all the participating
sites' TMs.
2. The TMs of all sites then write COMMIT in their logs and send an
acknowledgement to the TC. Then, the TC writes
END_OF_TRANSACTION in its log.
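The message flow of the three phases can be traced with a small failure-free sketch. The message and log names are taken from the steps above; the `Participant` class, the `run_3pc` function and the use of direct method calls in place of network messages are assumptions for illustration, and time-outs and site failures are deliberately left out.

```python
class Participant:
    """Transaction Manager (TM) of one participating site."""

    def __init__(self, ready):
        self.ready = ready  # assumption: whether this site is ready to commit
        self.log = []

    def on_prepare(self):
        if self.ready:
            self.log.append("READY")
            return "VOTE_COMMIT"
        self.log.append("ABORT")
        return "VOTE_ABORT"

    def on_prepare_to_commit(self):
        self.log.append("PREPARE_TO_COMMIT")
        return "READY_TO_COMMIT"

    def on_global_commit(self):
        self.log.append("COMMIT")
        return "ACK"

    def on_global_abort(self):
        if self.log[-1] != "ABORT":  # a site that voted abort has already logged it
            self.log.append("ABORT")
        return "ACK"


def run_3pc(participants):
    """Run the coordinator (TC) side and return (outcome, coordinator log)."""
    log = ["BEGIN_COMMIT"]
    # Phase 1: voting.
    votes = [p.on_prepare() for p in participants]
    if "VOTE_ABORT" in votes:
        log.append("ABORT")
        for p in participants:
            p.on_global_abort()
        log.append("END_OF_TRANSACTION")
        return "ABORT", log
    # Phase 2: pre-commit.
    log.append("PREPARE_TO_COMMIT")
    replies = [p.on_prepare_to_commit() for p in participants]
    if not all(r == "READY_TO_COMMIT" for r in replies):
        raise NotImplementedError("failure handling is outside this sketch")
    # Phase 3: decision.
    log.append("COMMIT")
    for p in participants:
        p.on_global_commit()
    log.append("END_OF_TRANSACTION")
    return "COMMIT", log
```

With all sites ready, the coordinator log reads BEGIN_COMMIT, PREPARE_TO_COMMIT, COMMIT, END_OF_TRANSACTION; a single VOTE_ABORT short-circuits the protocol after phase 1.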
Three phase vs. two phase commit protocol :
1. In the two-phase commit protocol, when the coordinator fails during
execution, the participating sites are unable to determine whether the
coordinator has made a decision to abort or commit the transaction, which
causes the participating sites to be in a blocked state.
2. To remove this blocking problem in 2PC, the three phase commit protocol
was proposed. The three-phase commit protocol is able to prevent this
blocking problem by taking the decision based on the decision of all
sites.
[Fig. 4.18.1. Sites labelled with their votes (1, 1 and 2) and times (750 msecs, 750 msecs and 100 msecs).]
2. Sites 1, 2 and 3 can still collect a quorum (also referred to as
a majority), while site 4 cannot collect a quorum.
3. If another partition or a failure of a site occurs, making any site
unavailable, the system cannot serve any read or write requests as a
quorum cannot be collected in any partition.
4. In other words, the system is completely unavailable, which is a serious
problem.
5. Dynamic voting protocols solve this problem by adapting the number of
votes or the set of sites that can form a quorum to the changing state of
the system due to site and communication failures.
6. In the dynamic protocols, the following two approaches are used to enhance
availability :
a. Majority based approach : The set of sites that can form a majority
to allow access to replicated data changes with the changing state
of the system.
b. Dynamic vote reassignment : The number of votes assigned to
a site changes dynamically.
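The effect of the two dynamic approaches on quorum collection can be illustrated numerically. The sketch below assumes four sites with one vote each (a hypothetical assignment, not the one in the figure) and a hypothetical helper `has_quorum`; a real protocol also needs version numbers or similar bookkeeping to change the majority set or the votes safely, which is omitted here.

```python
def has_quorum(partition, votes, eligible=None):
    """Return True if `partition` holds a strict majority of the eligible votes.

    partition: set of sites that can currently talk to each other
    votes:     dict mapping site -> number of votes
    eligible:  sites whose votes count (defaults to all sites)
    """
    eligible = set(votes) if eligible is None else set(eligible)
    total = sum(votes[s] for s in eligible)
    held = sum(votes[s] for s in partition if s in eligible)
    return 2 * held > total


# Assumed static assignment: four sites, one vote each.
STATIC_VOTES = {1: 1, 2: 1, 3: 1, 4: 1}
```

With the static assignment, sites {1, 2, 3} form a quorum but {1, 2} alone do not. Under the majority based approach, once {1, 2, 3} is the current majority set, `has_quorum({1, 2}, STATIC_VOTES, eligible={1, 2, 3})` succeeds; under vote reassignment, raising the votes of sites 1 and 2 has a similar effect.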
Method to obtain consistent set of checkpoint : Refer Q. 4.6, Page
4-7B, Unit-4.
CONTENTS
Part-1 : Transaction and Concurrency Control : Transaction, Nested
Transactions ...... 5-2B to 5-4B
Part-2 : Locks, Optimistic Concurrency Control, Timestamp
Ordering, Comparison of Methods for Concurrency Control ...... 5-4B to 5-10B
Part-3 : Distributed Transaction : Flat and Nested
Transaction ...... 5-10B to 5-14B
5-1 B (CS-Sem-7)
5-2 B (CS-Sem-7) Transaction and Concurrency Control
PART-1
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Answer
Nested transaction :
1. In a nested transaction, the top-level transaction can open
subtransactions, and each subtransaction can open further
subtransactions down to any depth of nesting.
2. Fig. 5.2.1 shows a client's transaction T that opens two
subtransactions T1 and T2, which access objects at servers X and Y.
3. The subtransactions T1 and T2 open further subtransactions T11,
T12, T21 and T22, which access objects at servers M, N, O and P.
4. In the nested case, subtransactions at the same level can run
concurrently, so T1 and T2 are concurrent, and as they invoke objects in
different servers, they can run in parallel.
5. The four subtransactions T11, T12, T21 and T22 also run
concurrently.
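[Fig. 5.2.1. Nested transaction: client transaction T with subtransactions T1 (server X) and T2 (server Y) and their subtransactions at further servers.]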
Que 5.3. Explain how the two phase commit protocol for nested
transaction ensures that if the top level transaction commits, all
the right descendants are committed or aborted ?
AKTU 2015-16, Marks 10
Answer
1 Consider the top-level transaction T and its subtransactions shown in
Fig. 5.3.1.
[Fig. 5.3.1. Top-level transaction T and its subtransactions, including one that aborts at M and one that is aborted at Y.]
PART-2
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Que 5.4. What is lock ? What are the different modes in which
transaction can lock a data object.
Answer
1. A lock is a variable associated with a shared resource, such as a data item,
that determines whether a read/write operation can be performed on
that data item.
2. In lock based techniques, each data object has a lock associated with it.
3. A transaction can hold, request or release the lock on a data object, as
required by the transaction.
4. The transaction is said to have locked the data object if it holds a lock.
5. There are two modes of locking in which a transaction can lock a data object :
a. Exclusive :
i. If a transaction has locked the data object in exclusive mode,
no other transaction can lock it in any mode.
ii. In this locking scheme, the server attempts to lock any object.
iii. If a client requests access to an object that is already locked, the
request is suspended and the client must wait until the object is
unlocked.
b. Shared :
i. If the transaction has locked the data object in shared mode,
other transactions can concurrently lock it but only in shared
mode.
ii. If a client requests access to an object, the request is always
successful.
iii. All the transactions reading the same object can share their
read lock.
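The compatibility of the two lock modes can be sketched with a single lock object. The class name `DataObjectLock` and the convention of returning `False` when the requester would have to wait are assumptions for illustration; a real lock manager would queue or block the waiting transaction instead.

```python
class DataObjectLock:
    """Lock on one data object, held in 'shared' or 'exclusive' mode."""

    def __init__(self):
        self.mode = None      # None means the object is unlocked
        self.holders = set()

    def try_lock(self, txn, mode):
        """Return True if `txn` obtains the lock, False if it must wait."""
        if not self.holders:
            self.mode = mode
            self.holders = {txn}
            return True
        # Only shared locks are compatible with each other.
        if self.mode == "shared" and mode == "shared":
            self.holders.add(txn)
            return True
        return False

    def release(self, txn):
        self.holders.discard(txn)
        if not self.holders:
            self.mode = None
```

Two readers can hold the lock together, while a writer has to wait until every reader has released it, and vice versa.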
Answer
[Figure: lock timeline over time: first lock acquisition, second lock acquisition, release of first lock, release of second lock.]
Answer
Optimistic concurrency control assumes that conflicts among transactions
are rare in a distributed database system. Because this is only an assumption,
the scheme is called optimistic. In an optimistic concurrency control scheme,
each transaction goes through three phases :
1. Working phase : During this phase, each transaction has a tentative
version of each of the objects that it updates. The use of tentative versions
allows the transaction to abort either during the working phase or in the
validation phase. The rules for read/write are :
a. A read operation is performed immediately; if a tentative version for
that transaction already exists, the read accesses it.
b. Write operations record the new values of the objects as tentative
values, which are invisible to other transactions.
2. Validation phase : When the close transaction request is received, the
transaction is validated to establish whether or not its operations on
objects conflict with operations of other transactions on the same objects.
3. Update phase : If the transaction is validated, all the changes recorded
in its tentative versions are made permanent. Read-only transactions
can commit immediately after passing validation.
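The three phases can be sketched on a single server. The validation test used here (a transaction fails if something it read was overwritten by a transaction that committed after it started, often called backward validation) is an assumption chosen for the example, since the text above does not fix a particular rule; the class names `Store` and `Transaction` are likewise hypothetical.

```python
class Store:
    """Committed object values plus the write sets of committed transactions."""

    def __init__(self):
        self.committed = {}
        self.commit_log = []  # one set of written keys per committed transaction


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start = len(store.commit_log)  # commits that happened before we began
        self.tentative = {}                 # private tentative versions
        self.read_set = set()

    # Working phase
    def read(self, key):
        self.read_set.add(key)
        if key in self.tentative:
            return self.tentative[key]
        return self.store.committed.get(key)

    def write(self, key, value):
        self.tentative[key] = value  # invisible to other transactions

    def close(self):
        # Validation phase: compare our reads with writes committed since we started.
        for write_set in self.store.commit_log[self.start:]:
            if write_set & self.read_set:
                self.tentative.clear()  # abort: discard tentative versions
                return False
        # Update phase: make the tentative versions permanent.
        if self.tentative:
            self.store.committed.update(self.tentative)
            self.store.commit_log.append(set(self.tentative))
        return True
```

A transaction that only reads has no tentative versions, so after passing validation it commits without touching the store.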
Que 5.8. Discuss the optimistic methods for distributed
concurrency control. What are the different validation conditions
for optimistic concurrency control ? Explain.
AKTU 2015-16, Marks 10
Answer
Answer
Validation conditions for optimistic concurrency control :
Refer Q. 5.8, Page 5-7B, Unit-5.
Effects of validation conditions on transactions in a distributed
system :
1. If the validation conditions are satisfied, then the transaction can
commit.
2. If the validation conditions fail, then some form of conflict resolution
must be used and the current transaction will be aborted.
3. Rules 1 and 2 test whether there is an overlap between the objects of a
pair of transactions Ti and Tj.
4. Rule 3 ensures that no two transactions can overlap in the update phase.
5. Due to the restriction on write operations, no dirty read can occur.
Que 5.10. Write short notes on timestamp ordering transaction
management. AKTU 2015-16, Marks 05
Answer
1. In a distributed transaction, each coordinator issues globally unique
timestamps.
2. A globally unique transaction timestamp is issued to the client by the
first coordinator accessed by a transaction.
3. The transaction timestamp is passed to the coordinator at each server
whose objects perform an operation in the transaction.
4. The servers of distributed transactions are jointly responsible for
ensuring that they are performed in a serially equivalent manner.
5. A timestamp consists of a pair <local timestamp, server-id>.
6. The agreed ordering of pairs of timestamps is based on a comparison in
which the server-id part is less significant.
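The ordering of <local timestamp, server-id> pairs can be illustrated with a small Python sketch. The names `TransactionTimestamp` and `earlier` are hypothetical and chosen only for this example; the sketch relies on the fact that Python compares tuples field by field, so the server-id is consulted only when the local timestamps are equal.

```python
from typing import NamedTuple


class TransactionTimestamp(NamedTuple):
    local_timestamp: int  # more significant part
    server_id: int        # less significant part, only breaks ties


def earlier(a, b):
    """Return True if timestamp a is ordered before timestamp b."""
    return a < b  # tuple comparison: local_timestamp first, then server_id
```

For example, a timestamp with local value 5 issued by server 9 is ordered before one with local value 6 issued by server 1, because the local timestamp is compared first.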
Que 5.11. Explain strict two phase locking with its rules.
Answer
Strict two phase locking : Refer Q. 5.5, Page 5-5B, Unit-5.
The rules for the use of locks in a strict two phase locking
implementation are as follows :
1. When an operation accesses an object within a transaction :
a. If the object is not already locked, it is locked and the operation
proceeds.
b. If the object has a conflicting lock set by another transaction, the
transaction waits until it is unlocked.
c. If the object has a non-conflicting lock set by another transaction,
the lock is shared and the operation proceeds.
d. If the object has already been locked in the same transaction, the
lock will be promoted if necessary and the operation proceeds.
2. When a transaction is committed or aborted, the server unlocks all
objects it locked for the transaction.
These rules ensure strictness because the locks are held until a transaction
has either committed or aborted.
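The locking rules above can be sketched as a small lock table in Python. This is a simplified, single-threaded illustration under stated assumptions: the class `LockManager` and its method names are hypothetical, and instead of blocking, `acquire` returns the string "wait" when the transaction would have to wait.

```python
class LockManager:
    def __init__(self):
        self.locks = {}  # object -> (mode, set of transactions holding the lock)

    def acquire(self, txn, obj, mode):
        """mode is 'read' or 'write'. Returns 'granted' or 'wait'."""
        if obj not in self.locks:                     # rule 1a: not locked yet
            self.locks[obj] = (mode, {txn})
            return "granted"
        held_mode, holders = self.locks[obj]
        others = holders - {txn}
        if others and (mode == "write" or held_mode == "write"):
            return "wait"                             # rule 1b: conflicting lock
        if txn not in holders:                        # rule 1c: share a read lock
            holders.add(txn)
            return "granted"
        if mode == "write" and held_mode == "read":   # rule 1d: promote the lock
            self.locks[obj] = ("write", holders)
        return "granted"

    def release_all(self, txn):
        """Rule 2: called only when txn commits or aborts."""
        for obj in list(self.locks):
            _, holders = self.locks[obj]
            holders.discard(txn)
            if not holders:
                del self.locks[obj]
```

Because `release_all` is the only way to release a lock, no lock is given up before the transaction has committed or aborted, which is what makes the scheme strict.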
Answer
Advantages of multiversion timestamp ordering :
1. It allows more concurrency in distributed system.
2. It improves system responsiveness by providing multiple versions.
3. It reduces the probability of conflicting transactions.
4. A read request never fails and is never made to wait.
Disadvantages of multiversion timestamp ordering :
1. Reading of a data item also requires the updating of the read timestamp
field, resulting in two potential disk accesses rather than one.
2. The conflicts between transactions are resolved through rollbacks, rather
than through waits.
3. It requires a huge amount of storage for storing multiple versions of data
objects.
4. It does not ensure recoverability and cascadelessness.
PART-3
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Que 5.16.
i. What are the goals of distributed transaction ? Distinguish
between flat and nested transaction along with its structure.
Answer
i. Goals of distributed transaction :
1. The goal of distributed transaction is to ensure that all of the objects
managed by a server remain in a consistent state when they are
accessed by multiple transactions and in the presence of server
crashes.
2. To maintain ACID properties of transaction in distributed system.
3. To complete overall transaction occurring at different nodes.
4. To ensure the consistency of a set of shared data objects accessed
by users at the time of failures and concurrent access.
Difference between flat and nested transaction :
S. No. | Flat transaction | Nested transaction
1. | A flat client transaction completes each of its requests before going on to the next one. Therefore, each transaction accesses server objects sequentially. | In a nested transaction, the top-level transaction can open subtransactions, and each subtransaction can open further subtransactions down to any depth of nesting.
2. | In Fig. 5.16.1, transaction T is a flat transaction that invokes operations on objects in servers X, Y and Z. | In a nested transaction, as shown in Fig. 5.16.2, subtransactions at the same level can run concurrently, so T1 and T2 are concurrent, and as they invoke objects in different servers, they can run in parallel.
Fig. 5.16.1.
Fig. 5.16.2.
Fig. 5.17.1. Distributed transaction management model (at each site : transactions, TM, scheduler, DM and database).
Execution of a transaction at the TM results in the execution of its
actions at the DM.
e. So, the DM executes a stream of transaction actions, directed
towards it by the TM.
PART-4
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Que 5.19. Explain two phase and three phase commit protocol.
Answer
Two phase commit protocol : Refer Q. 4.15, Page 4-15B, Unit-4.
Three phase commit protocol : Refer Q. 4.17, Page 4-17B, Unit-4.
Que 5.20. Write short note on conflict resolution.
Answer
A conflict is resolved by taking one of the following actions :
1. Wait: The requesting transaction is made to wait until the conflicting
transaction either completes or aborts.
2. Restart :
a. Either the requesting transaction or the transaction it conflicts
with is aborted and started afresh.
b. Restarting is achieved by using one of the following primitives :
3. Die : The requesting transaction aborts and starts afresh.
4. Wound :
a. The transaction in conflict with the requesting transaction is tagged
as wounded and a message "wounded" is sent to all sites that the
wounded transaction has visited.
b. If the message is received before the wounded transaction has
committed at a site, the concurrency control algorithm at that site
initiates an abort of the wounded transaction, otherwise the
message is ignored.
c. If a wounded transaction is aborted, it is started again.
d. The requesting transaction proceeds after the wounded transaction
completes or aborts.
Answer
Following are the algorithms for conflict resolution in timestamps
concurrency control :
1. Wait-die algorithm :
a. The wait-die algorithm is a nonpreemptive algorithm because a
requesting transaction never forces the transaction holding the
requested data object to abort.
b. Suppose requesting transaction Ti is in conflict with a transaction
Tj. If Ti is older (i.e., has a smaller timestamp), then Ti waits,
otherwise Ti dies.
c. The older transaction waits for the younger transaction if the
younger has accessed the granule first.
d. The younger transaction is aborted and restarted if it tries to access
a granule after an older concurrent transaction.
2. Wound-wait algorithm :
a. The wound-wait algorithm is a preemptive algorithm.
b. Suppose a requesting transaction Ti is in conflict with a transaction
Tj. If Ti is older, it wounds Tj, otherwise it waits.
c. The older transaction preempts the younger by suspending it if the
younger transaction tries to access a granule after an older
concurrent transaction.
d. An older transaction will wait for a younger one to commit if the
younger has accessed a granule that both want.
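The decision made by each algorithm when a requesting transaction conflicts with a transaction holding the data object can be written as two small Python functions. The function names and the returned strings are hypothetical; a smaller timestamp is taken to mean an older transaction, as stated above.

```python
def wait_die(requester_ts, holder_ts):
    """Nonpreemptive: an older requester waits, a younger requester dies."""
    return "wait" if requester_ts < holder_ts else "die"


def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    return "wound" if requester_ts < holder_ts else "wait"
```

In both schemes only the younger transaction of a conflicting pair is ever restarted, so a transaction that keeps its original timestamp across restarts eventually becomes the oldest and cannot be restarted forever.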
Que 5.22. What are the advantages, problems and applications of
optimistic concurrency control ?
Answer
Advantages :
i. Optimistic concurrency control is very efficient when conflicts are rare.
ii. The occasional conflicts result in transaction rollback.
iii. The rollback involves only the local copy of data, and thus no cascading
rollback occurs.
Problems :
i. Conflicts are expensive to deal with.
ii. Longer transactions are more likely to have conflicts and may be
repeatedly rolled back because of conflicts with short transactions.
Applications :
i. Only suitable for environments where there are few conflicts and no
long transactions.
ii. Acceptable for mostly read or query database systems that require very
few update transactions.
Que 5.23. Explain the schemes which detect conflicts in obtaining local
locks.
Answer
Schemes which detect conflicts in obtaining local locks :
1. Write-locks-all, read-locks-one :
a. In this scheme exclusive locks are acquired on all copies, while
shared locks are acquired only on one arbitrary copy.
b. A conflict is always detected, because a shared-exclusive conflict is
detected at the site where the shared lock is requested and
exclusive-exclusive conflicts are detected at all sites.
2. Majority locking :
a. Both shared and exclusive locks are requested at a majority of the
copies of the data item.
b. If two transactions are required to lock the same item, there is at
least one copy of it where the conflict is discovered.
3. Primary copy locking : In primary copy locking, one copy of each data
item is assigned as the primary copy and all locks must be requested at this
copy so that conflicts are discovered at the site where the primary copy
resides.
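Why the first two schemes always discover a conflict can be checked with a counting argument: if two transactions together lock more copies than exist, at least one copy is requested by both. The following Python sketch illustrates this; the function names `copies_to_lock` and `must_overlap` are hypothetical. Primary copy locking is not covered by the sketch, since there the overlap is guaranteed simply because every lock goes to the same copy.

```python
def copies_to_lock(scheme, mode, n):
    """Number of copies (out of n) that one transaction must lock."""
    if scheme == "write-locks-all":
        return n if mode == "exclusive" else 1   # all copies, or one arbitrary copy
    if scheme == "majority":
        return n // 2 + 1                        # a majority for either mode
    raise ValueError("unknown scheme: " + scheme)


def must_overlap(count_a, count_b, n):
    """True if two sets of locked copies of these sizes always share a copy."""
    return count_a + count_b > n
```

With five copies, an exclusive lock under write-locks-all takes all five, so it overlaps any shared lock; two shared locks need not overlap, which is correct because shared locks do not conflict.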
PART-5
Questions-Answers
Que 5.25. Write short notes on wait for graph with example of
distributed transaction. AKTU2015-16, Marks 05
Answer
1. A Wait-For Graph (WFG) is a graph where :
a. Each node represents a process.
b. An edge Pi -> Pj means that Pi is blocked waiting for Pj to release
a resource.
2. A system is deadlocked if and only if there is a directed cycle in the WFG.
3. In Distributed Database Systems (DDBS), users access the data objects
of the database by executing transactions.
4. The data objects of a database can be viewed as resources that are
acquired (through locking) and released (through unlocking) by
transactions.
5. In DDBS a wait-for graph is referred to as a transaction-wait-for graph
(TWF graph).
6. In a TWF graph, nodes are transactions and there is a directed edge
from node Ti to node Tj if Ti is blocked and is waiting for Tj to release
some resource.
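Since deadlock corresponds to a directed cycle in the wait-for graph, deadlock detection reduces to cycle detection. The sketch below uses a depth-first search over a graph held in a single dictionary; the function name `has_deadlock` and the dictionary layout are assumptions made for this example, and a real distributed detector would first have to assemble the graph from several sites.

```python
def has_deadlock(wait_for):
    """wait_for maps each transaction to the transactions it is waiting for."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:                  # edge back into the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))
```

For example, if T1 waits for T2, T2 waits for T3 and T3 waits for T1, the function reports a deadlock; removing the last edge breaks the cycle.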
3. The order of the entries in the log reflects the order in which
transactions have prepared, committed and aborted at that server.
4. During the normal operation of a server, its recovery manager is called
whenever a transaction prepares to commit, commits or aborts a
transaction.
5. When the server is prepared to commit a transaction, the recovery
manager appends all the objects in its intentions list to the recovery file,
followed by the current status of that transaction (prepared) together
with its intentions list.
6. When a transaction is eventually committed or aborted, the recovery
manager appends the corresponding status of the transaction to its
recovery file.
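The way the recovery manager appends entries can be sketched as follows. This is an illustrative model only: the class `RecoveryManager`, the tuple layout of the entries and the in-memory list standing in for the recovery file are all assumptions, and the `committed_values` replay is added just to show how the order of entries can be used after a crash.

```python
class RecoveryManager:
    def __init__(self):
        self.recovery_file = []  # append-only list standing in for the file

    def prepare(self, txn, intentions):
        """intentions maps object name -> tentative value."""
        refs = []
        for name, value in intentions.items():
            refs.append((name, len(self.recovery_file)))   # position of the object
            self.recovery_file.append(("object", name, value))
        # status entry (prepared) together with the intentions list
        self.recovery_file.append(("status", txn, "prepared", refs))

    def commit(self, txn):
        self.recovery_file.append(("status", txn, "committed"))

    def abort(self, txn):
        self.recovery_file.append(("status", txn, "aborted"))

    def committed_values(self):
        """Replay the log in order and return the values of committed transactions."""
        intentions, values = {}, {}
        for entry in self.recovery_file:
            if entry[0] != "status":
                continue
            if entry[2] == "prepared":
                intentions[entry[1]] = entry[3]
            elif entry[2] == "committed":
                for name, pos in intentions.get(entry[1], []):
                    values[name] = self.recovery_file[pos][2]
        return values
```

Objects written by a transaction whose last status entry is "aborted", or which never got past "prepared", are ignored by the replay.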
Answer
PART-6
Replication : System Model and Group Communication, Fault-Tolerant
Services, Highly Available Services and Transactions with
Replicated Data.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Replica manager :
1. Replica manager is a subsystem that is responsible for managing the
synchronization of replicas.
2. Replica refers to a single copy of the data in a system that employs
replication.
3. Replica managers hold various replicas and perform operations upon
them.
4. A replica manager acts as a server in client-server environment.
5. A collection of replica managers provides service to clients.
Architectural model for replicated data :
Fig. 5.30.1. Clients, front ends (FE) and replica managers (RM), with requests and replies passing between them.
OR
Explain the processing of queries and update operation in gossip
services.
Answer
Fig. 5.31.1. Gossip service : replica managers (RM) exchanging gossip messages.
i. Gossip architecture :
1 Gossip architecture is a framework for implementing highly available
services by replicating data close to the points where groups of clients
need it.
2. Here the replica managers exchange 'gossip' messages periodically in
order to convey the updates received from clients.
3. A gossip service provides two basic types of operation : queries (read
only operations) and updates (modify but do not read the state).
4. A key feature is that front ends send queries and updates to any replica
manager that is available and can provide reasonable response times.
5. The system makes two guarantees :
a. Each client obtains a consistent service over time.
b. Relaxed consistency between replicas.
Processing of queries and update operations in gossip service :
a. Request :
i. The front end normally sends requests to only a single replica
manager at a time.
ii. However, a front end will communicate with a different replica
manager when the one it normally uses fails or becomes
unreachable, or if the normal manager is heavily loaded.
iii. Front ends, and thus clients, may be blocked on query
operations.
iv. The default arrangement for update operation is to return to the
client as soon as the operation has been passed to the front end; the
front end then propagates the operation in the background.
b. Update response : If the request is an update then the replica manager
replies as soon as it has received the update.
c. Coordination :
i. The replica manager that receives a request does not process it
until it can apply the request according to the required ordering
constraints.
ii. This may involve receiving updates from other replica managers,
in gossip messages.
d. Execution : The replica manager executes the request.
e. Query response : If the request is a query then the replica manager
replies at this point.
f. Agreement : The replica managers update one another by exchanging
gossip messages, which contain the most recent updates they have
received.
ii. Quorum consensus methods :
1. A quorum is a subgroup of replica managers whose size gives it the right
to carry out operations.
2. In this scheme, an update operation on a logical object may be completed
successfully by a subgroup of its group of replica managers.
3. The other members of the group will therefore have out-of-date copies
of the object.
4. Version numbers may be used to determine whether copies are
up-to-date.
5. Each copy of an object has a version number, but only the copies that
are up-to-date have the current version number.
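The use of version numbers with quorums can be illustrated by a minimal Python sketch. The names `Replica`, `write` and `read` are hypothetical, and the sketch assumes that the caller always picks quorums large enough to overlap (for example, any 2 out of 3 replicas for both reads and writes); that size rule is the usual quorum condition and is an assumption here, not something stated in the points above.

```python
class Replica:
    def __init__(self):
        self.version = 0    # version number of this copy
        self.value = None


def write(quorum, value):
    """Update only the replicas in the write quorum and bump the version."""
    new_version = max(r.version for r in quorum) + 1
    for r in quorum:
        r.version, r.value = new_version, value


def read(quorum):
    """Return the value held by the most up-to-date copy in the read quorum."""
    newest = max(quorum, key=lambda r: r.version)
    return newest.value
```

After a write to two of three replicas, the third copy is out of date, but any read quorum of two replicas still contains at least one copy with the current version number.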
Que 5.32. Discuss the following in terms of distributed system :
i. Sequential consistency
ii. Highly available services AKTU 2018-19, Marks 10
OR
Write short note on highly available services and sequentially
consistency. AKTU2015-16, Marks 10
Answer
i. Sequential consistency :
1. Sequential consistency is a strong safety property for concurrent systems.
2. A system is sequentially consistent if the result of any execution of the
operations of all the processors is the same as if they were executed in a
sequential order, and the operations of each individual processor appear
in this sequence in the order specified by its program.
3. It implies that operations appear to take place in some total order, and
that order is consistent with the order of operations on each individual
process.
4. Sequential consistency cannot be totally available; in the event of a
network partition, some or all nodes will be unable to make progress.
ii. Highly available services :
1. Availability of service means the percentage of time that a service is up.
2. Highly available service is the service whose availability is close to
100% with reasonable response time.
3. It may not conform to sequential consistency.
4. Gossip architecture is a framework for implementing highly available
services by replicating data close to the points where groups of clients
need it.
Que 5.33. Describe the architecture of replicated transactions.
Answer
Architectures for replicated transactions :
1. In this architecture, we assume that a front end sends client requests to
one of the group of replica managers of a logical object.
2. In the primary copy approach, all front ends communicate with a
distinguished 'primary' replica manager to perform an operation, and
that replica manager keeps the backups up to date.
3. Otherwise, front ends may communicate with any replica manager to
perform an operation.
4. The replica manager that receives a request to perform an operation on
a particular object is responsible for getting the cooperation of
the other replica managers in the group that have copies of that object.
5. Different replication schemes have different rules as to how many of
the replica managers in a group are required for the successful
completion of an operation.
6. In the read-one write-all scheme, a read request can be performed by a
single replica manager, whereas a write request must be performed by
all the replica managers in the group, as shown in Fig. 5.33.1.
7. Quorum consensus schemes are designed to reduce the number of
replica managers that must perform update operations, but at the expense
of increasing the number of replica managers required to perform
read-only operations.
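The trade-off between read-one write-all and quorum consensus can be made concrete by looking at how many replica managers each operation needs. The sketch below assumes the usual quorum conditions (a read quorum and a write quorum must overlap, and two write quorums must overlap); these conditions and the function name `is_valid_scheme` are assumptions added for illustration.

```python
def is_valid_scheme(n, r, w):
    """n replica managers, r needed for a read, w needed for a write."""
    in_range = 1 <= r <= n and 1 <= w <= n
    reads_see_latest_write = r + w > n   # every read quorum meets every write quorum
    writes_are_ordered = 2 * w > n       # any two write quorums share a member
    return in_range and reads_see_latest_write and writes_are_ordered
```

With five replica managers, read-one write-all corresponds to r = 1, w = 5. Choosing r = 3, w = 3 lowers the write cost from five to three managers but raises the read cost from one to three, which is the trade-off described in point 7.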
Fig. 5.33.1.
Fig. 5.34.1. Services provided for process groups (group address expansion, multicast communication, group membership management; operations : join, leave, fail, group send).
Que 5.36. What are stub and skeleton and why are they needed in
remote procedure calls ? AKTU 2017-18, Marks 10
Answer
Stub is a function which converts the function call into a network request
and a network response into a function return.
Skeleton converts network requests into function calls and function returns
into network replies.
Need of stub and skeleton in Remote Procedure Call (RPC):
1. RPC allows a local computer (client) to remotely call procedures on a
different computer (server).
2. The client and server use different address spaces, so parameters used
in a function (procedure) call have to be converted, otherwise the
values of those parameters could not be used, because pointers to
parameters in one computer's memory would point to different data
on the other computer.
3. The client and server may also use different data representations,
even for simple parameters.
4. Stubs perform the conversion of the parameters, so a remote procedure
call looks like a local function call for the remote computer.
5. Stub libraries must be installed on both the client and server side.
6. A client stub is responsible for conversion of parameters used in a
function call and deconversion of results passed from the server after
execution of the function.
7. A server skeleton, the stub on the server side, is responsible for
deconversion of parameters passed by the client and conversion of the
results after the execution of the function.
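The division of work between the client stub and the server skeleton can be sketched in Python. This is a toy, in-process illustration under stated assumptions: the names `client_stub`, `server_skeleton` and `PROCEDURES` are hypothetical, JSON is used as the common data representation, and a direct function call on bytes stands in for the network.

```python
import json


def add(a, b):
    """The remote procedure, which lives on the server."""
    return a + b


PROCEDURES = {"add": add}


def server_skeleton(request_bytes):
    # Deconvert the parameters sent by the client, call the real procedure,
    # then convert the result for the reply.
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def client_stub(proc, *args, transport=server_skeleton):
    # Convert the call into a request message; only bytes cross the "network".
    request_bytes = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    # Deconvert the reply into an ordinary function return value.
    return json.loads(reply_bytes.decode("utf-8"))["result"]
```

To the caller, `client_stub("add", 2, 3)` behaves like a local function call, although no pointer or in-memory object is ever shared between the two sides.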
Que 5.38. How does a server know that one of the remote objects
provided by it is no longer used by clients and can be collected ?
How does Java RMI handle this problem and what alternatives
are there ? AKTU 2017-18, Marks 10
Answer
1. When a client first receives a reference to a remote object, a "referenced"
message is sent to the server that is exporting the object.
2. Every subsequent reference within the client's local machine causes a
reference counter to be incremented.
3. As a local reference is finalized, the reference count is decremented,
and once the count goes to zero, an 'unreferenced' message is sent to
the server.
4. Once the server has no more live references to an object and there are
no local references, it is free to be finalized and garbage collected.
5. This condition tells a server that a remote object provided by him is no
longer used by clients and can be collected.
RMI uses its distributed garbage collection feature to collect remote server
objects that are no longer referenced by any client in the network.
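The reference counting steps above can be modelled with a short Python sketch. It is a simplified model of the idea, not the Java RMI API: the classes `Server` and `Client` and their method names are hypothetical, and direct method calls stand in for the "referenced" and "unreferenced" messages.

```python
class Server:
    def __init__(self):
        self.holders = {}  # object id -> set of client ids that reference it

    def referenced(self, obj_id, client_id):
        self.holders.setdefault(obj_id, set()).add(client_id)

    def unreferenced(self, obj_id, client_id):
        self.holders.get(obj_id, set()).discard(client_id)

    def collectable(self, obj_id):
        """True when no client holds a live reference to the object."""
        return not self.holders.get(obj_id)


class Client:
    def __init__(self, client_id, server):
        self.client_id, self.server = client_id, server
        self.counts = {}   # object id -> local reference count

    def acquire(self, obj_id):
        if self.counts.get(obj_id, 0) == 0:       # first reference on this client
            self.server.referenced(obj_id, self.client_id)
        self.counts[obj_id] = self.counts.get(obj_id, 0) + 1

    def release(self, obj_id):
        self.counts[obj_id] -= 1
        if self.counts[obj_id] == 0:              # last local reference finalized
            self.server.unreferenced(obj_id, self.client_id)
```

The server is contacted only on the first and last local reference of each client, so the object becomes collectable once every client that referenced it has sent its "unreferenced" message.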
Que 5.39. De-activation is a technology used to preserve server
resources where a server which provides remote objects to clients
can de-activate those remote objects. Clients should not know
about this. What must the server do to avoid surprises for the
clients ? AKTU2017-18, Marks 10
Answer
While using de-activation technologies to avoid surprises for the clients,
server must do the following :
1. It must give the client permission to recreate (activate) the object
again.
2. The remote objects must be available for a long period without any
predetermined expiration time out.
3. The remote object's state must not be lost between individual invocations
and must be available to all clients.
4. It may provide remote objects whose lifetime is controlled by clients.