Lecture 3
Matrix Time
In the system of matrix clocks, the time is represented by a set of n x
n matrices of non-negative integers.
mti[i, j] denotes the latest knowledge that process pi has about the
local logical clock, mtj[j, j], of process pj. Row mti[i, *] is pi's vector clock vti.
mti[j, k] represents the knowledge that process pi has about the latest
knowledge that pj has about the local logical clock, mtk[k, k], of pk.
The entire matrix mti denotes pi’s local view of the global logical time.
Data Structures Week 1
Matrix Time
Process pi uses the rules below to update its clock:
Rule 1: Before executing an event, pi advances its local logical time:
mti[i, i] := mti[i, i] + d (d > 0)
Rule 2: Each message m is piggybacked with the sender's matrix time mt.
When pi receives such a message (m, mt) from a process pj,
pi executes the following sequence:
1. Update its global logical time as follows:
(a) 1 ≤ k ≤ n : mti[i, k] := max(mti[i, k], mt[j, k]) (That is,
update its own row with pj's row, keeping the higher values.)
(b) 1 ≤ k, l ≤ n : mti[k, l] := max(mti[k, l], mt[k, l])
2. Execute Rule 1.
3. Deliver message m.
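The update rules can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the class name `MatrixClock`, its method names, and the 0-based process ids are assumptions made here, and Rule 1 is taken to be the usual local increment mti[i, i] := mti[i, i] + d, with d = 1 by default.

```python
class MatrixClock:
    """Matrix clock of process i in a system of n processes (0-based ids)."""

    def __init__(self, i, n):
        self.i = i
        self.n = n
        self.mt = [[0] * n for _ in range(n)]

    def tick(self, d=1):
        # Rule 1 (assumed form): advance the local clock before an event.
        self.mt[self.i][self.i] += d

    def send(self):
        # A send is an event: tick, then piggyback a copy of the matrix.
        self.tick()
        return [row[:] for row in self.mt]

    def receive(self, j, mt):
        # Rule 2, step 1(a): merge sender j's row into our own row.
        for k in range(self.n):
            self.mt[self.i][k] = max(self.mt[self.i][k], mt[j][k])
        # Rule 2, step 1(b): element-wise maximum over the whole matrix.
        for k in range(self.n):
            for l in range(self.n):
                self.mt[k][l] = max(self.mt[k][l], mt[k][l])
        # Rule 2, step 2: the receive is itself an event.
        self.tick()


p0, p1 = MatrixClock(0, 3), MatrixClock(1, 3)
p1.receive(0, p0.send())
print(p1.mt)  # [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
```

Delivering the message to the application (step 3) is left to the caller in this sketch.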
Matrix Time
Current update is:
1 ≤ k ≤ n : mti[i, k] := max(mti[i, k], mt[j, k])
In addition, matrix clocks have the following property:
– min_k (mti[k, i]) ≥ t ⇒ process pi knows that every other process
pk knows that pi's local time has progressed to at least t.
If this holds, pi knows that all other processes
have received the message pi sent at time t, and hence pi can discard that message.
In many applications, this implies that processes will no longer
require certain information from pi (it might be holding information
until all processes are updated about some message from pi) and can use
this fact to discard obsolete information.
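The property above amounts to taking the minimum over column i of pi's own matrix. A small sketch, assuming the matrix is a list of lists with 0-based process ids; the function names and the `buffer` dictionary (send time mapped to message) are hypothetical and only illustrate the idea:

```python
def known_by_all(mt, i, t):
    """True if p_i's matrix shows every process k has seen p_i's clock reach t."""
    return min(row[i] for row in mt) >= t


def discard_stable(mt, i, buffer):
    """Keep only buffered messages whose send time is not yet known to all."""
    stable = min(row[i] for row in mt)
    return {t: m for t, m in buffer.items() if t > stable}


# p_0's matrix: column 0 is (3, 2, 2), so everyone knows p_0 reached time 2.
mt0 = [[3, 1, 0],
       [2, 2, 0],
       [2, 0, 1]]
print(known_by_all(mt0, 0, 2))                             # True
print(discard_stable(mt0, 0, {1: "a", 2: "b", 3: "c"}))  # {3: 'c'}
```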
Teaser 4: Use matrix clocks for the above statement and answer: "Has D's message
reached everyone?"
A Distributed Program in Execution
How can causal order among messages be
guaranteed when messages are broadcast? Read about
it in "An Efficient Causal Order Algorithm for
Message Delivery in Distributed System" by Jang,
Park, Cho, and Yoon.
Global States and Snapshots
To detect stable states – deadlock detection
For checkpointing and recovery
Global States and Snapshots
Consider the distributed execution shown above.
Global States and Snapshots
If the state of the account A is recorded at time t0 and the state
of account B and that of the channels C12 and C21 are
recorded at t2, what is the total money in the system?
Global States and Snapshots
Initially: 1000 + 1500 = 2500
At the checkpoint: 1000 (A) + 1400 (B) + 150 (C12) + 100 (C21) = 2650
An extra Rs 150!
If this state is used for checkpointing, then there is trouble!
Global State of a Distributed System
We cannot stop the processes to observe the global state, nor
can we synchronise all the systems and ask them to
take a snapshot at a particular instant.
The recorded state is not necessarily one the system has visited, but it is one
that the system could have visited.
Applications: checkpointing and recovery; deadlock detection, which needs
the current state to decide whether a deadlock is real.
Issues: We need to know both node and channel states, as
some messages might be in transit.
Assumption: The system cannot be stopped to record the global state,
and there is no global clock to coordinate the recording.
Global State of a Distributed System
One can think of the collection of local states as the global
state.
Recall that the local state of a process, Pi, is the contents of
the processor registers, stack, local memory, etc.
The state of a channel is the set of messages in transit in the
channel.
Events lead to changes in the state of local process(es).
Global States and Snapshots
Recording the global state of a distributed system on-the-fly is
an important paradigm.
The lack of globally shared memory and of a global clock, together with
unpredictable message delays in a distributed system, makes
this problem non-trivial.
Building on the definition of consistent global states, we discuss
the issues to be addressed to obtain consistent distributed
snapshots.
Then several algorithms to determine such snapshots on-the-fly
are presented for several types of networks.
Global State of a Distributed System
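At any instant, the state of a process Pi, denoted by LSi, is the
result of the sequence of all events executed by Pi up to that
instant. $LS_i^x$ denotes the state of Pi after its x-th event $e_i^x$. The state of a channel Cij is denoted by SCij.
• We write $send(m_{ij}) \le LS_i^x$ when the send event of $m_{ij}$ is among the events leading to $LS_i^x$ (similarly for rec). Using this, the state of channel Cij, denoted
$SC_{ij}^{x,y}$, is defined as follows.
$SC_{ij}^{x,y} = \{\, m_{ij} \mid send(m_{ij}) \le LS_i^x \ \wedge\ rec(m_{ij}) \not\le LS_j^y \,\}$
In words, $SC_{ij}^{x,y}$ denotes all messages that have been
sent by Pi up to event $e_i^x$ and not received by Pj up to event $e_j^y$.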
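The channel-state definition can be made concrete with a short sketch. The representation is an assumption made here for illustration: each message id maps to the (1-based) index of its send event on Pi and, if it has been received, of its receive event on Pj; the function name `channel_state` is hypothetical.

```python
def channel_state(sent_by_i, received_by_j, x, y):
    """Messages sent by P_i within its first x events and
    not received by P_j within its first y events.

    sent_by_i:     dict message id -> index of its send event on P_i
    received_by_j: dict message id -> index of its receive event on P_j
    """
    return {
        m for m, s in sent_by_i.items()
        if s <= x and not (m in received_by_j and received_by_j[m] <= y)
    }


sent = {"m1": 1, "m2": 3, "m3": 5}
received = {"m1": 2, "m2": 6}
# m1 was already received, m3 was not yet sent, so only m2 is in transit.
print(channel_state(sent, received, x=4, y=4))  # {'m2'}
```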
What went wrong here?
In the upper image, the Rs 150 transfer had not started at time t0, when A's state was recorded, and hence
should not be counted as a message in transit. So in the upper image the total should be
1000 (A at t0) + 1400 (B at t2) + 100 (SC21) = 2500.
Note that the lower image, though not what actually happened, shows a possible consistent state equivalent to the
upper image. Consider taking the snapshot at time t0 (lower image).
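The arithmetic of the banking example can be checked with a short script. The dictionary keys are labels chosen here for illustration; the amounts are the ones given in the example.

```python
initial_total = 1000 + 1500

# Inconsistent recording: A at t0, B and both channels at t2.
recorded = {"A": 1000, "B": 1400, "C12": 150, "C21": 100}
inconsistent_total = sum(recorded.values())

# The Rs 150 transfer was sent after A's state was recorded,
# so it must not be counted in channel C12.
consistent = dict(recorded, C12=0)
consistent_total = sum(consistent.values())

print(initial_total, inconsistent_total, consistent_total)  # 2500 2650 2500
```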
Conditions for Consistent Global States
The global state of a distributed system is a collection of the
local states of the processes and the channels.
Notationally, global state GS is defined as,
– $GS = \{\, \bigcup_i LS_i,\ \bigcup_{i,j} SC_{ij} \,\}$
A global state GS is a consistent global state iff it satisfies the
following two conditions:
C1: $send(m_{ij}) \in LS_i \Rightarrow m_{ij} \in SC_{ij} \oplus rec(m_{ij}) \in LS_j$ ($\oplus$ is the
Exclusive-OR operator)
C1 states the law of conservation of messages.
– Every message that is recorded as sent in the local state
of some process is either captured in the state of the
channel or is captured in the local state of the receiver.
C2: $send(m_{ij}) \notin LS_i \Rightarrow m_{ij} \notin SC_{ij} \wedge rec(m_{ij}) \notin LS_j$
C2 states that for every effect, its cause must be present.
– If a message is not recorded as sent in the local state of a
process Pi, then the message cannot be included in the
state of the channel Cij or be captured as received by Pj.
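Conditions C1 and C2 can be checked mechanically for a recorded global state. The sketch below assumes a simplified representation chosen here for illustration: sets of message ids recorded as sent, as in a channel, and as received. The function name and the message labels are hypothetical.

```python
def is_consistent(messages, sent, in_channel, received):
    """Check C1 and C2 for every message in a recorded global state."""
    for m in messages:
        if m in sent:
            # C1: a sent message is in the channel XOR at the receiver.
            if (m in in_channel) == (m in received):
                return False
        else:
            # C2: an unsent message may appear nowhere.
            if m in in_channel or m in received:
                return False
    return True


msgs = {"A_to_B_150", "B_to_A_100"}
# Banking example: A's recorded state (t0) does not include sending Rs 150,
# yet the channel recorded at t2 contains it, which violates C2.
print(is_consistent(msgs, sent={"B_to_A_100"},
                    in_channel={"A_to_B_150", "B_to_A_100"},
                    received=set()))  # False
print(is_consistent(msgs, sent={"B_to_A_100"},
                    in_channel={"B_to_A_100"},
                    received=set()))  # True
```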
Interpretation in terms of cuts
A cut in a space-time diagram is a line joining an arbitrary point
on each process line that slices the space-time diagram into a
PAST and a FUTURE.
A consistent global state corresponds to a cut in which every
message received in the PAST of the cut was sent in the PAST of
that cut.
Such a cut is known as a consistent cut.
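The cut condition can be tested directly on an event trace. In this sketch, a cut is assumed to be a list giving, for each process, how many of its events lie in the PAST, and each message is a tuple of sender, send-event index, receiver, and receive-event index (1-based). These representations and the function name are assumptions made here.

```python
def is_consistent_cut(cut, messages):
    """cut[p] = number of events of process p that lie in the PAST.

    messages: iterable of (sender, send_index, receiver, recv_index).
    """
    for sender, s, receiver, r in messages:
        received_in_past = r <= cut[receiver]
        sent_in_past = s <= cut[sender]
        # A message received in the PAST must also be sent in the PAST.
        if received_in_past and not sent_in_past:
            return False
    return True


# One message: sent as event 2 of process 0, received as event 3 of process 1.
trace = [(0, 2, 1, 3)]
print(is_consistent_cut([1, 3], trace))  # False: received but not yet sent
print(is_consistent_cut([2, 3], trace))  # True
print(is_consistent_cut([2, 2], trace))  # True: message is in transit
```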