Distributed Computing UNIT-4
Syllabus
Contents
4.1 Consensus and Agreement Algorithms: Problem Definition
4.2 Byzantine Agreement Problem
4.3 Overview of Results
4.4 Solution to Byzantine Agreement Problem
4.5 Agreement in a Failure-Free System (Synchronous and Asynchronous)
4.6 Agreement in Synchronous Systems with Failures (May-22, Marks 13)
4.11 Checkpoint-based Recovery
4.12 Coordinated Checkpointing Algorithm
Distributed Computing 4-2 Consensus and Recovery
• In distributed databases, there may be a situation where data managers have to
decide whether to commit or abort a transaction. When there is no failure,
reaching an agreement is easy.
• However, in case of failures, processes must exchange their values with other
processes and relay the values received from others several times to isolate the
effect of the faulty processors.
system.
issues the order, lieutenants to the commander are to decide to attack or retreat.
• But one or more of the generals may be treacherous, i.e. faulty.
• If a lieutenant is treacherous, he tells one of his peers that the commander told
him to attack and another that they are to retreat.
• The source processor broadcasts its value to the others. A solution must meet the following
objectives :
Agreement : All non-faulty processors agree on the same value.
Validity : If the source is non-faulty, then the common agreed value must be the value
supplied by the source processor.
• "If the source is faulty, then all non-faulty processors can agree on any common
value." "The value agreed upon by faulty processors is irrelevant."
Fig. 4.2.1 (labels: General 1, Retreat)
• No solution for three processes can handle a single traitor. In a system with m
faulty processes, agreement can be achieved only if at least 2m+1 processes (more than 2/3
of the total) are functioning correctly, i.e. the system has at least 3m+1 processes.
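The bound above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes the oral-message setting in which n processes tolerate m traitors only when n >= 3m + 1.

```python
def agreement_possible(n: int, m: int) -> bool:
    """True if n processes can tolerate m Byzantine (traitorous) processes.

    Assumes the oral-message bound n >= 3m + 1, i.e. at least 2m + 1
    processes are functioning correctly.
    """
    return n >= 3 * m + 1


def max_tolerable_traitors(n: int) -> int:
    """Largest m satisfying n >= 3m + 1 (never negative)."""
    return max((n - 1) // 3, 0)
```

For example, three processes cannot handle even one traitor, while four processes can.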
iv. All non faulty processors must agree on a single common value.
• If initial value of non-faulty processors is different then all non - faulty processors
processors is irrelevant".
ii, All non faulty processors must agree on a set of common values.
• In all the previous mentioned problems, all non faulty processors must reach an
agreement.
• In the Byzantine agreement problem, only one processor initializes the value, whereas
in the other two cases, every processor has its own initial value.
Consensus is not solvable in an asynchronous system even if only one process can fail by
crashing.
Sr. No. | Failure mode      | Synchronous system                                        | Asynchronous system
1.      | No failure        | Agreement attainable                                      | Agreement attainable
2.      | Crash failure     | Agreement attainable [f < n processes]                    | Agreement not attainable
3.      | Byzantine failure | Agreement attainable [f ≤ (n - 1)/3 Byzantine processes]  | Agreement not attainable
Fig. 4.4.1
This algorithm is also known as the Oral Message algorithm OM(m), where m is the
number of faulty processors.
Algorithm OM(0)
1. Source processor sends its value to every processor.
2. Each processor uses the value it receives from the source. [If no value is received,
default value 0 is used.]
Algorithm OM(m), m > 0
1. Source x broadcasts its value to all processes.
:
with 4 is is faulty.
Fig. 4.4.2
Example 2:
System with 4 processors : p0, p1, p2, p3. p0 is source, and is faulty.
Assumption :
Possible values are only 1 and 0.
Step 1 : p0 initiates the initial value to be 1 for p1 and p3. For p2, it sends a
Step 3 : Majority function at p1, p2, p3 is still the same (1), which is the desired
result.
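The example can be replayed with a small simulation of the oral message algorithm. This is a sketch under stated assumptions, not the textbook's own code: the function names and the `lie` callback (which models what a faulty sender tells each receiver) are hypothetical, and the value the faulty source p0 sends to p2 is assumed to be 0, since only the values for p1 and p3 are given in the text.

```python
from collections import Counter

DEFAULT = 0  # default value used when there is no strict majority


def majority(values):
    """Return the strict-majority value, or DEFAULT if none exists."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT


def send(sender, receiver, value, faulty, lie):
    """A correct sender forwards `value`; a faulty one sends lie(...)."""
    return lie(sender, receiver, value) if sender in faulty else value


def om(m, source, lieutenants, value, faulty, lie):
    """Simulate OM(m); returns {lieutenant: value it decides on}."""
    # Step 1: the source sends its value to every lieutenant.
    received = {p: send(source, p, value, faulty, lie) for p in lieutenants}
    if m == 0:
        return received
    # Step 2: each lieutenant relays what it received using OM(m - 1).
    relayed = {q: [] for q in lieutenants}
    for p in lieutenants:
        others = [q for q in lieutenants if q != p]
        for q, v in om(m - 1, p, others, received[p], faulty, lie).items():
            relayed[q].append(v)
    # Step 3: each lieutenant takes the majority of all values it holds.
    return {q: majority([received[q]] + relayed[q]) for q in lieutenants}


# Faulty source p0: sends 1 to p1 and p3, and (assumed) 0 to p2.
decisions = om(1, 0, [1, 2, 3], 1, faulty={0},
               lie=lambda s, r, v: 0 if r == 2 else 1)
```

Under these assumptions every non-faulty processor ends with the same value, which is the agreement condition; validity does not apply because the source is faulty.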
Fig. 4.4.3
the system.
A distributed mechanism would have each process broadcast its value to the others,
and each process computes the same function on the values received.
Asynchronous system : Consensus can similarly be reached in a constant number
of message hops.
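A minimal sketch of the failure-free mechanism: every process broadcasts its value, and every process applies the same function to the values it receives. The function name and the choice of `min` as the common function are assumptions for illustration; any deterministic function (max, majority) works equally well.

```python
def failure_free_consensus(initial_values, decide=min):
    """Each process broadcasts its value; all apply the same `decide`.

    With no failures every process receives the same multiset of values,
    so every process computes the same result.
    """
    n = len(initial_values)
    # inbox[i] is what process i holds after one round of broadcasts.
    inbox = [list(initial_values) for _ in range(n)]
    return [decide(values) for values in inbox]
```

Calling `failure_free_consensus([3, 1, 2])` gives every process the decision 1.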
The validity condition is satisfied because processes do not send fictitious values
in this failure model.
most O(n) in each round, and each message has one integer. Hence the total
University Question
1. List the agreement statements that should be followed in synchronous systems with failure.
When a failure occurs, the process rolls back to its most recent checkpoint, assumes
the state saved in that checkpoint and resumes execution.
University Question
• The computation is asynchronous, i.e., each process progresses at its own speed
and messages are exchanged through reliable channels, whose transmission delays
are finite but arbitrary.
• A local checkpoint is a snapshot of the state of the process at a given instant, and
the event of recording the state of a process is called local checkpointing.
by Cp,i"
• We also assume that each process Pp takes an initial checkpoint Cp,0 immediately
before execution begins and ends with a virtual checkpoint that represents the last
state attained before termination.
• The i-th checkpoint interval of process Pp denotes all the computation performed
between its i-th and (i+1)-th checkpoints, including the i-th checkpoint but not the
(i+1)-th checkpoint.
• Checkpointing in distributed systems requires that all processes (sites) that interact
with one another establish periodic checkpoints.
• All the sites save their local states as local checkpoints. All the local checkpoints,
one from each site, collectively form a global checkpoint.
Fig. 4.9.1
rollbacks.
[Figure: timelines of P1 and P2 showing a failure, a message sent from P2 to P1, and an inconsistent cut]
• Establish a set of local checkpoints (one for each process in the set) such that no
• There is one recovery point for each process in the set. During the interval spanned
by checkpoints, there is no information flow between any pair of processes in the
set, or between a process in the set and any process outside the set.
• Fig. 4.9.3 shows a consistent set of checkpoints.
of another process.
Consistent set of checkpoints
Checkpoint notation
• Each node maintains:
1. A monotonically increasing counter with which each message from that node is
labelled.
last_label_ received,M
first_label_sent [X]
Fig. 4.9.4
Note : "sl" denotes a "smallest label" that is < any other label and "l" denotes
"largest label" that is > any other label.
checkpoints.
4.9.1.1 Checkpointing Algorithm
• Make some simplifying assumptions :
1. Processes communicate by exchanging messages through channels.
2. Channels are FIFO; end-to-end protocols cope with message loss due to
rollback recovery.
Phase One
Initiating process Pi takes a tentative checkpoint and requests that all the
processes take tentative checkpoints.
TECHNICAL PUBLICATIONS
an up-thrust for knowledge
Phase Two
The initiating process propagates its decision to all processes.
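The two phases can be sketched in Python as follows. This is a simplified model, not the full algorithm: the `Process` class and its method names are hypothetical, message passing is replaced by direct method calls, and the decision rule (make tentative checkpoints permanent only if every process took one, otherwise discard them) is the usual statement of the two-phase scheme and is assumed here.

```python
import copy


class Process:
    """Hypothetical process model used only for this illustration."""

    def __init__(self, name, state, can_checkpoint=True):
        self.name = name
        self.state = state
        self.can_checkpoint = can_checkpoint  # False models a refusal
        self.tentative = None
        self.permanent = None

    def take_tentative(self):
        """Phase one: record a tentative checkpoint and report success."""
        if not self.can_checkpoint:
            return False
        self.tentative = copy.deepcopy(self.state)
        return True

    def apply_decision(self, commit):
        """Phase two: make the tentative checkpoint permanent or drop it."""
        if commit and self.tentative is not None:
            self.permanent = self.tentative
        self.tentative = None


def coordinated_checkpoint(initiator, others):
    """Run both phases; returns True if the checkpoints became permanent."""
    everyone = [initiator, *others]
    # Phase one: initiator and all other processes take tentative checkpoints.
    replies = [p.take_tentative() for p in everyone]
    commit = all(replies)
    # Phase two: the initiator's decision is propagated to all processes.
    for p in everyone:
        p.apply_decision(commit)
    return commit
```

If any process cannot take its tentative checkpoint, every process keeps its previous permanent checkpoint, so the set of permanent checkpoints always comes from a single completed round.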
• Does this guarantee we have a strongly consistent state ? Can you construct an
example that shows we can still have lost messages ?
take a checkpoint
sent after Y sent the first message after the last checkpoint (last_recv(x, y) ≥
first_send(y, x)).
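The condition can be written as a small predicate. This sketch assumes that labels are positive integers drawn from the sender's monotonically increasing counter, that 0 stands for the "smallest label" (no message), and that the comparison reads as "greater than or equal"; the function and constant names are hypothetical.

```python
NO_MESSAGE = 0  # assumed encoding of the "smallest label"


def must_checkpoint(last_recv_x_y: int, first_send_y_x: int) -> bool:
    """Should Y take a checkpoint when X takes one?

    last_recv_x_y  : label of the last message X received from Y
    first_send_y_x : label of the first message Y sent to X after
                     Y's last checkpoint (NO_MESSAGE if none was sent)
    """
    return first_send_y_x > NO_MESSAGE and last_recv_x_y >= first_send_y_x
```

If Y has sent nothing to X since its last checkpoint, X's checkpoint cannot record the receipt of an unrecorded send, so Y does not need a new checkpoint.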
• When a process takes a checkpoint, it will ask all other processes that sent
progress.
affect performance.
• Restore the system state to a consistent state after a failure with assumptions:
single initiator, checkpoint and rollback recovery algorithms are not invoked
concurrently.
Phase One :
Process Pi checks whether all processes are willing to restart from their previous
checkpoints.
• If all processes are willing to restart from their previous checkpoints, Pi decides
that all processes should restart from their previous checkpoints.
• Otherwise, Pi decides that all the processes continue with their normal activities.
Phase Two :
Pi propagates its decision to all processes.
Optimization
• A minimum number of processes roll back.
• Y will restart from its permanent checkpoint only if X is rolling back to a state
• In-transit messages : Messages that have been sent but not yet received.
rollback
• Duplicate messages : Arise due to message logging and replaying during process
recovery.
4.10 Issues in Failure Recovery    AU : May-22, Dec.-22
Recovery refers to restoring a system to its normal operational state. Once a failure
has occurred, it is essential that the process where the failure happened can
recover to a correct state.
• In distributed process recovery, undo the effect of interactions of the failed
process with other processes.
Failure of a system occurs when the system does not perform its service in the
manner specified.
An erroneous state of the system is a state which could lead to a system failure by
a sequence of valid state transitions.
• A system is said to "fail" when it cannot meet its promises. A failure is brought
about by the existence of "errors" in the system.
• A system is said to have a failure if the service it delivers to the user deviates
from compliance with the system specification for a specified period of time.
• Fig. 4.10.1 shows concept of fault and recovery.
Fig. 4.10.1 Fault and recovery (a fault causes an error; an error leads to a failure)
• Error : The part of the system state that differs from its intended value.
University Questions
2. Illustrate the different types of failures in distributed systems and explain how to prevent
them.    AU : Dec.-22, Marks 13
state. By saving the current state of the system periodically or before critical code
sections, it provides the baseline information needed for the restoration of lost
non-volatile storage.
• The checkpointing mechanism takes a snapshot of the system state and stores the
data on some non-volatile storage medium.
• Clearly, the cost of a checkpoint will vary with the amount of state required to be
saved and the bandwidth available to the storage mechanism being used to save
the state.
• In the event of a system failure, the internal state of the system can be restored,
and it can continue service from the point at which its state was last saved.
Typically this involves restarting the failed task or system, and providing some
parameter indicating that there is state to be recovered.
• Depending on the task complexity, the amount of state, and the bandwidth to the
as it did previously.
• This will tolerate any transient fault; however, if the fault was caused by a design
error, then the system will continue to fail and recover endlessly. In some cases,
this may be the most important type of fault to guard against, but not in every
case.
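A minimal single-process sketch of checkpoint and restart follows. It is an illustration under assumptions, not a prescribed implementation: the function names, the state layout (`step`, `total`) and the use of a local file as the "non-volatile storage medium" are all hypothetical. The state is written to a temporary file and then renamed, so a crash during the write leaves the previous checkpoint intact.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write the state to stable storage, replacing the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def load_checkpoint(path, default):
    """Return the last saved state, or `default` if none exists."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, crash_at=None):
    """Sum 0..total_steps-1, checkpointing after every step."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated transient fault")
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(state, path)
    return state["total"]
```

Restarting `run` with the same path after the simulated fault resumes from the last saved step instead of from the beginning. A design error, in contrast, would raise again on every restart, which matches the endless fail-and-recover behaviour described above.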
1. Uncoordinated Checkpointing
• Each process has autonomy in deciding when to take checkpoints.
Disadvantages :
a. Domino effect during a recovery.
b. Recovery from a failure is slow because processes need to iterate to find a
consistent set of checkpoints.
• Assume each process Pi starts its execution with an initial checkpoint Ci,0.
• When Pj receives a message m during Ij,y, it records the dependency from Ii,x (the
interval in which Pi sent m) to Ij,y.
• When a failure occurs, the recovering process initiates rollback by broadcasting a
2. Coordinated Checkpointing
storage.
However, the approach suffers from high overhead associated with the
checkpointing process.
Two approaches are used to reduce the overhead : the first is to minimize the number
of synchronization messages and the number of checkpoints, the other is to make
the checkpointing process nonblocking.
Blocking Checkpointing
After a process takes a local checkpoint, to prevent orphan messages, it remains
blocked until the entire checkpointing activity is complete.
Non-blocking Checkpointing
The processes need not stop their execution while taking checkpoints.
Key issue with coordinated checkpointing: Being able to prevent a process from
receiving application messages that could make the checkpoint inconsistent.
Fig. 4.11.2 Non-blocking checkpoint : panels (a), (b) and (c) show the initiator, the checkpoint request, and processes P0 and P1
post-checkpoint message.
3. Communication-induced Checkpointing
The Communication-Induced Checkpointing (CIC) protocols are popular, because
they help in bounding rollback propagation during failure recovery, by ensuring
checkpoints independently.
• It avoids the domino effect, while allowing processes to take some of their checkpoints
independently.
• Communication-induced checkpointing forces each process to take checkpoints
based on information piggybacked on the application messages it receives from
other processes.
• Checkpoints are taken such that a system-wide consistent state always exists on
stable storage, thereby avoiding the domino effect.
• Communication-induced checkpointing piggybacks protocol-related information on
each application message.
• The receiver uses the piggybacked information to determine if it has to take a
forced checkpoint to advance the global recovery line.
• Index-based communication-induced checkpointing works by assigning
4.11.1 Difference between Uncoordinated, Coordinated and Communication Induced Check Pointing
Parameter             | Uncoordinated | Coordinated | Communication induced
Number of checkpoints | Many          | One         | Many
Domino effect         | Possible      | No          | No
Orphan process        | Possible      | No          | No
4.12 Coordinated Checkpointing Algorithm
• Uncoordinated checkpointing may lead to domino effect or to live-lock.
• Two basic approaches to checkpoint coordination :
1. The Koo-Toueg algorithm, which has a process initiate the system-wide
all other processes from whom it has received a message since taking its last
checkpoint. Call the set of such processes Π.
• The message tells each process in Π (e.g., Q) the last message, m_qp, that P has
received from it before the tentative checkpoint was taken.
set.
its corresponding
• Here we discuss the Juang-Venkatesan algorithm for asynchronous checkpointing and
recovery.
• Assumptions : communication channels are reliable and deliver messages in FIFO
order, buffers are infinite, and message transmission delay is arbitrary but finite.
• They gave an algorithm that is based on asynchronous checkpointing. During
recovery, we need to find a consistent set of checkpoints to which the system can
be restored.
• In this recovery algorithm, each process keeps track of both the number of
messages it has sent to and received from other processes. This algorithm avoids
the existence of orphan messages. Several iterations of rollback by processes are
also involved in this recovery.
• Whenever a process rolls back, it is necessary for all other processes to find out
whether any message sent by the rolled-back process has become an orphan message.
• Orphan messages are discovered by comparing counts : if the number of messages
received by process Pi from process Pj is greater than the number of messages sent
by process Pj to process Pi, according to the current states of the processes, then
one or more messages received by Pi from Pj are orphan messages.
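The count comparison can be sketched as below. The dictionary layout and the function name are assumptions made for illustration; the sketch only performs the detection step (received count greater than sent count) and does not implement the iterative rollback itself.

```python
def find_orphan_channels(sent, received):
    """Return (sender, receiver) pairs whose counts reveal orphan messages.

    sent[j][i]     : number of messages Pj's current state says it sent to Pi
    received[i][j] : number of messages Pi's current state says it got from Pj
    A pair is reported when the receiver has recorded more messages than the
    sender has recorded sending.
    """
    orphans = []
    for receiver, counts in received.items():
        for sender, got in counts.items():
            if got > sent.get(sender, {}).get(receiver, 0):
                orphans.append((sender, receiver))
    return orphans


# P1 rolled back to a state in which it had sent only 2 messages to P2,
# but P2's state records 3 messages received from P1.
sent = {"P1": {"P2": 2}, "P2": {"P1": 1}}
received = {"P2": {"P1": 3}, "P1": {"P2": 1}}
orphan_pairs = find_orphan_channels(sent, received)
```

Here the pair ("P1", "P2") is reported, so P2 would also have to roll back to a state in which it has received no more than 2 messages from P1.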
periodically.
b. Stable log : longer time to access but remains if crashed
to checkpoint of computation to
checkpoint
• Idea : From the set of checkpoints, find a set of consistent checkpoints. This is
done based on the number of messages sent and received.
4.14 Two Marks Questions with Answers
AU :Dec.-22
single value.
AU: May-22
initial value.
in the presence of
Ans. : The difference between the agreement problem and the consensus problem is
that, in the agreement problem, a single process has the initial value, whereas in
the consensus problem, all processes have an initial value.
Q.7 List classification of failures.
Ans. : Failures in a computer system can be classified as follows :
1. Process failure
2. System failure
process after
the crash
its
of
recovery.
whose state is inconsistent
specified.
equal.
The local checkpoints of different processes are not coordinated to form a global
consistent checkpoint.
to the protocol.
checkpoint according
Q.20 Explain useless checkpoints.
Ans. : A useless checkpoint of a process is one that will never be part of a consistent
global state. Useless checkpoints add performance overhead.
execution of a
is the
process.
sequence of events between two consecutive
Ans. : Messages whose receive is recorded but whose send is not recorded are called
orphan messages.
Ans. : The goal in achieving an optimal assignment is to find a minimum-weight cutset. The
weight of a cutset is the sum of the weights of the edges in the cutset. This sums up
the execution and communication costs for that assignment.
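As an illustration of what the cutset weight adds up, the sketch below computes the total cost of an assignment directly (execution cost of each task on its node plus communication cost between tasks placed on different nodes) and finds the cheapest assignment by exhaustive search. The function names and the sample costs are hypothetical, and brute force stands in for the min-cut computation; it is only practical for very small task sets.

```python
from itertools import product


def assignment_cost(exec_cost, comm_cost, assignment):
    """Execution cost of every task plus IPC cost across different nodes."""
    total = sum(exec_cost[task][node] for task, node in assignment.items())
    for (a, b), cost in comm_cost.items():
        if assignment[a] != assignment[b]:
            total += cost  # tasks on different nodes must communicate
    return total


def best_assignment(exec_cost, comm_cost, nodes):
    """Try every task-to-node mapping and keep the cheapest one."""
    tasks = sorted(exec_cost)
    best = None
    for choice in product(nodes, repeat=len(tasks)):
        assignment = dict(zip(tasks, choice))
        cost = assignment_cost(exec_cost, comm_cost, assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


exec_cost = {"t1": {"n1": 5, "n2": 10}, "t2": {"n1": 2, "n2": 1}}
comm_cost = {("t1", "t2"): 4}
cost, assignment = best_assignment(exec_cost, comm_cost, ["n1", "n2"])
```

With these sample numbers, placing both tasks on n1 is cheapest because splitting them would add the communication cost of 4.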
e, such that (e; --> e; )and (e; --> c;) and (e; -/-> c;)
Q.25 What is the basic idea behind task assignment approach ? AU:May-17
are known.
c. The cost of processing each task on every node is known.
d. The IPC costs between every pair of tasks are known.
e. Precedence relationships among the tasks are known.
out of date.