Lecture 1
• Slip Tests (~7): 15
• Project: 20
• Assignments (3): 20 (2 coding and 1 theory)
• 1 Quiz + Midsem + Endsem: (10 + 12) + 23 = 45
Syllabus (Tentative)
Module 1
Introduction
Time and Synchronization
Mutual Exclusion
Deadlock Detection
Module 2
Distributed graph algorithms
Consensus, Agreement, Locking
Deadlock Handling
Distributed File systems
Module 4
Limitations of distributed computing
Self-Stabilization
CAP Theorem
Blockchain and Bitcoin (Guest lecture)
TAs
Ramaguru Guru
Kannav Mehta
Akshit Garg
Snehal Kumar
Sahiti Reddy
Prince Varshney
Nikunj Nawal
Lokesh V
Motivation
"A group of computers working together so as to appear as a
single computer to the end-user." – Tanenbaum
E.g.: airline reservation, online banking
Required features:
Many computers
Perform concurrently
Fail independently
Don't share a global clock
Motivation
• Horizontal scaling to fit data, shifting data to manage load – scalability.
There is a limit to how much you can upgrade a single machine: every machine has a threshold for
RAM, storage, and processing power.
• Enhanced reliability
• Fault tolerance
Data Structures Week 1
• Lack of global knowledge – a participating node has information only about its local
memory and local computations (no shared memory).
• Concurrency control – data is distributed across nodes, and each node has its own
clock. How do you implement mutual exclusion or critical sections now?
• Failure and recovery – nodes may fail, communication channels may fail. How do
you make sure the system still functions? When a node recovers, how do you apply
the updates it missed?
Any distributed program has three types of events:
Local actions
Message sends
Message receives
One can then view a distributed program as a
sequential execution of the above events.
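The three event types above can be sketched as a small data model; this is a minimal illustration (the names `EventType`, `Event`, and the sample run are my own, not from the lecture):

```python
from dataclasses import dataclass
from enum import Enum, auto

class EventType(Enum):
    INTERNAL = auto()   # local action
    SEND = auto()       # message send
    RECEIVE = auto()    # message receive

@dataclass(frozen=True)
class Event:
    process: int        # id of the process where the event occurs
    seq: int            # position in that process's local event sequence
    kind: EventType

# One possible execution, viewed as a sequence of events:
run = [
    Event(1, 1, EventType.INTERNAL),
    Event(1, 2, EventType.SEND),
    Event(2, 1, EventType.RECEIVE),
    Event(2, 2, EventType.INTERNAL),
]
```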
Logical Time
• Logical time is a notion we introduce to order
events. It provides a mechanism to define the
causal order in which events occur at different
processes. The ordering is based on the following:
• Two events occurring at the same process happen
in the order in which they are observed by the
process.
• If a message is sent from one process to another,
the sending of the message happened before the
receiving of the message.
• If e occurred before e' and e' occurred before e"
then e occurred before e".
Causality between events
Think of events happening at a process Pi.
One can order these events sequentially;
in other words, place a linear order on these events.
Let h_i = {e_i^1, e_i^2, …} be the events at process P_i.
A linear order H_i is a binary relation →_i on the events
h_i such that e_i^k →_i e_i^j if and only if the event e_i^k occurs
before the event e_i^j at process P_i.
The dependencies captured by →_i are often called
the causal dependencies among the events h_i at
P_i.
Causality for messages
What about events generated due to messages?
Consider a binary relation →_msg across messages
exchanged.
Clearly, for any message m, send(m) →_msg recv(m).
These two relations →_i and →_msg allow us to view
the execution of a distributed program.
Causality Definition
One can now view the execution of a distributed
program as a collection of events.
Consider the set of events H = ∪_i h_i, where h_i is
the set of events that occurred at process P_i.
Define a binary relation → expressing causality
among pairs of events that possibly occur at
different processes. → is defined as: e_i^x → e_j^y if and only if
• i = j and x < y, or
• e_i^x →_msg e_j^y, or
• there exists an event e_k^z such that e_i^x → e_k^z and e_k^z → e_j^y.
• Concurrent: e_3^1 and e_1^3? e_2^4 and e_3^1?
• Causal: e_3^3 → e_1^5, e_1^2 → e_2^3, e_3^4 → e_1^5?
(Here e_i^x denotes the x-th event at process P_i; the questions refer to the
space-time diagram on the slide.)
Logical Time
• Armed with our notation of events and precedences amongst
events, we now study logical time.
• We now see how logical time can be maintained in a distributed
system.
• Three ways to implement logical time -
– scalar time,
– vector time, and
– matrix time
Logical Time
The logical clock C is a function that maps an event e in a
distributed system to an element in the time domain T, denoted
C(e) and called the timestamp of e, defined as follows:
C : H → T
such that for two events e_i and e_j, e_i → e_j ⟹ C(e_i) < C(e_j).
Logical Time
Consistent: when T and C satisfy the condition that for two events
e_i and e_j, e_i → e_j ⟹ C(e_i) < C(e_j).
This property is called the clock consistency condition.
Strongly consistent: when T and C satisfy the stronger condition that
for two events e_i and e_j, e_i → e_j ⟺ C(e_i) < C(e_j),
then the system of clocks is said to be strongly consistent.
Logical Time
Each process needs some data structures to
represent logical time:
A local logical clock, denoted lc_i, that helps process
p_i measure its own progress.
A global logical clock, denoted gc_i, that is a
representation of process p_i's local view of the logical
global time; it allows p_i to assign consistent timestamps
to its local events.
Typically, lc_i is a part of gc_i.
Logical Time
Each process needs a protocol to update these
data structures to ensure the consistency
condition:
Rule 1: specifies how the local logical clock is updated
by a process when it executes an event.
Rule 2: specifies how a process updates its global
logical clock to update its view of the global time and
global progress.
All logical clock systems implement Rule 1 and
Rule 2.
Scalar Time
Proposed by Lamport in 1978
Lamport won the 2013 Turing award for “contributions
to the theory and practice of distributed and
concurrent systems, notably the invention of concepts
such as causality and logical clocks, .....”.
The time domain is the set of non-negative integers.
The local logical clock of a process p_i and its local
view of the global time are combined into one
integer variable C_i.
Scalar Time
Data structure: the logical local clock of a process p_i
and its local view of the global time are squashed
into one integer variable C_i.
Rule 1: before executing an event (send, receive, or
internal), process p_i executes the following:
C_i := C_i + d (d > 0)
Rule 2: each message piggybacks the clock value of its
sender at sending time. When a process p_i receives a
message with timestamp C_msg, it executes the following
actions:
1. C_i := max(C_i, C_msg)
2. Execute Rule 1.
3. Deliver the message.
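Rules 1 and 2 above can be sketched as a small per-process clock class; this is a minimal sketch (the class and method names are my own, not from the lecture), assuming the default increment d = 1:

```python
class LamportClock:
    """Scalar (Lamport) clock for one process: Ci plus the increment d."""

    def __init__(self, d=1):
        assert d > 0
        self.d = d
        self.c = 0  # Ci: local clock and local view of the global time

    def local_event(self):
        # Rule 1: increment Ci before executing any event.
        self.c += self.d
        return self.c

    def send(self):
        # A send is also an event: apply Rule 1, then piggyback Ci.
        self.c += self.d
        return self.c  # timestamp carried by the outgoing message

    def receive(self, c_msg):
        # Rule 2: take the max with the piggybacked timestamp,
        # then apply Rule 1, then deliver the message.
        self.c = max(self.c, c_msg)
        self.c += self.d
        return self.c

# p1 sends to p2: p2's clock jumps past the sender's timestamp.
p1, p2 = LamportClock(), LamportClock()
t = p1.send()        # p1: Ci = 1
r = p2.receive(t)    # p2: max(0, 1) + 1 = 2
```

Note that the receiver's timestamp always exceeds the sender's, which is exactly what the consistency condition requires for send(m) → recv(m).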
Scalar Time
Example: use d = 1 and assign timestamps to
the events (space-time diagram on the slide).
Scalar Time
Example (figure on the slide).
Total ordering: one can use the logical time given by scalar
clocks to induce a total order on events.
Note that the timestamps alone do not induce a total order:
two events at different processes can have an identical
timestamp.
But ordering the tuples (t, i), with (t1, i1) < (t2, i2) if either t1
< t2 or (t1 == t2 and i1 < i2), gives a total order. In (t, i), t denotes
the timestamp and i the identity of the process.
This total order is consistent with the relation →.
Note: according to the total order above, for events e1 and
e2, e1 < e2 ⟹ either e1 → e2 or e1 || e2.
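The (t, i) tie-breaking rule above coincides with lexicographic comparison of pairs, which Python's built-in tuple comparison provides directly; the event values below are hypothetical:

```python
# Each event is tagged with the pair (t, i): scalar timestamp t, process id i.
# Python's tuple comparison is exactly the lexicographic order
# (t1, i1) < (t2, i2) iff t1 < t2 or (t1 == t2 and i1 < i2).
events = [(3, 5), (4, 1), (3, 2)]   # hypothetical (timestamp, process) pairs

assert (3, 2) < (3, 5)   # equal timestamps: tie broken by process id
assert (3, 5) < (4, 1)   # smaller timestamp comes first

totally_ordered = sorted(events)
```

Sorting by these pairs yields one total order of all events that never contradicts the causal order →.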
Sriram Murthy
Scalar Time
Event counting: set the increment d to 1 always.
Then if some event e has timestamp t, e depends
on t – 1 other events having occurred.
This value is called the height of event e.
No strong consistency: note that scalar time does
not provide strong consistency. [Strong consistency
requires that e_i → e_j ⟺ C(e_i) < C(e_j).]
An example suffices; refer to the timeline again.
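A counterexample of the kind the slide alludes to can be built with two processes that never communicate; this is a hand-worked sketch (the event names a, b, f are my own) using d = 1:

```python
# Two processes that never exchange messages: every event at P1 is
# concurrent with every event at P2, yet scalar timestamps still order them.
d = 1
c1 = c2 = 0

c1 += d          # event a at P1: C(a) = 1
c1 += d          # event b at P1: C(b) = 2
c_b = c1
c2 += d          # event f at P2: C(f) = 1
c_f = c2

# C(f) < C(b), but f and b are concurrent (no message path relates them).
# So C(ei) < C(ej) does not let us conclude ei → ej.
```

This is why scalar clocks satisfy the consistency condition but not the strong one: timestamps can order events that causality does not.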
Scalar Time
Vector time solves this problem, but with larger data
structures.