
UNIT 1

1
What are the trends in distributed systems?
Distributed systems are undergoing a period of significant change,
which can be traced back to a number of influential trends:
 the emergence of pervasive networking technology;
 the emergence of ubiquitous computing coupled with the desire to
support user mobility in distributed systems;
 the increasing demand for multimedia services;
 the view of distributed systems as a utility.
 Pervasive networking and the modern Internet

The Internet is also a very large distributed system. It enables
users, wherever they are, to make use of services such as the World
Wide Web, email and file transfer.

The set of services is open-ended: it can be extended by the
addition of server computers and new types of services. The role of a
firewall is to protect an intranet by preventing unauthorized
messages from leaving or entering.
 Mobile and ubiquitous computing

Mobile computing is the performance of computing tasks while
the user is on the move, or visiting places other than their usual
environment. In mobile computing, users who are away from their
home intranet are still provided with access to resources via the
devices they carry with them.

Ubiquitous computing is the harnessing of many small, cheap
computational devices that are present in users' physical
environments, including the home, office and even natural settings.

 Distributed multimedia systems

Distributed systems should be able to perform the same
functions for continuous media types such as audio and video; that
is, they should be able to store and locate audio or video files, to
transmit them across the network, to support the presentation of the
media types to the user, and optionally also to share the media
types across a group of users.
 Distributed computing as a utility

With the increasing maturity of distributed systems
infrastructures, a number of companies are promoting the view of
distributed resources as a utility, drawing the analogy between
distributed resources and other utilities such as electricity. This
model applies to both physical and more logical services.
2

Message-passing systems versus shared memory systems


 In shared memory systems there is a (common) shared address
space throughout the system.
 Communication among processors takes place via shared data
variables, and control variables for synchronization (Semaphores and
monitors) among the processors.
 If a shared memory is distributed then it is called distributed shared
memory.
 Multicomputer systems (NUMA as well as message-passing) that do
not have a shared address space communicate by message passing.
1.5.1 Emulating message passing on a shared memory system (MP → SM)
 The shared address space is partitioned into disjoint parts, one part
being assigned to each processor.
 "Send" and "receive" operations are implemented by writing to
and reading from the destination/sender processor's address space,
respectively.
 Specifically, a separate location is reserved as a mailbox
(assumed to be unbounded in size) for each ordered pair of processes.
 A Pi–Pj message passing can be emulated by a write by Pi to the
mailbox followed by a read by Pj from the mailbox.
 The write and read operations are controlled using synchronization
primitives to inform the receiver/sender after the data has been
sent/received.
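The mailbox scheme above can be sketched as follows. This is a minimal hypothetical sketch, assuming threads stand in for processes and a Python list in the shared address space stands in for each unbounded mailbox; the names N, mailbox, and available are illustrative, and a semaphore plays the role of the synchronization primitive.

```python
import threading

N = 3  # number of processes (illustrative)

# One unbounded mailbox per ordered pair (i, j) of processes, held in
# the shared address space, plus a semaphore to inform the receiver
# after data has been sent.
mailbox = {(i, j): [] for i in range(N) for j in range(N)}
available = {(i, j): threading.Semaphore(0) for i in range(N) for j in range(N)}

def send(i, j, msg):
    """Pi 'sends' to Pj by writing into the (i, j) mailbox."""
    mailbox[(i, j)].append(msg)
    available[(i, j)].release()   # inform the receiver

def receive(i, j):
    """Pj 'receives' from Pi by reading from the (i, j) mailbox."""
    available[(i, j)].acquire()   # block until data has been sent
    return mailbox[(i, j)].pop(0)

send(0, 1, "hello")
print(receive(0, 1))  # -> hello
```

The semaphore makes the receive blocking, so a Pi–Pj message pass is exactly a write by Pi followed by a read by Pj, as described above.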
1.5.2 Emulating shared memory on a message-passing system (SM → MP)
 This involves use of “send” and “receive” operations for “write”
and “read” operations.
 Each shared location can be modeled as a separate process; a
"write" to a shared location is emulated by sending an update
message to the corresponding owner process, and a "read" by
sending a query message.
 As accessing another processor’s memory requires send and
receive operations, this emulation is expensive.
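The owner-process scheme above can be sketched as follows. This is a hypothetical sketch, assuming a thread stands in for the owner process of one shared location and queues stand in for message channels; the names owner, inbox, write and read are illustrative.

```python
import threading
import queue

def owner(inbox):
    """Owner process for one shared location: serves update/query messages."""
    value = 0                       # the shared location's contents
    while True:
        op, arg, reply = inbox.get()
        if op == "write":           # update message
            value = arg
            reply.put("ack")
        else:                       # query message
            reply.put(value)

inbox = queue.Queue()
threading.Thread(target=owner, args=(inbox,), daemon=True).start()

def write(v):
    """Emulated 'write': send an update message and wait for the ack."""
    reply = queue.Queue()
    inbox.put(("write", v, reply))
    reply.get()

def read():
    """Emulated 'read': send a query message and wait for the value."""
    reply = queue.Queue()
    inbox.put(("read", None, reply))
    return reply.get()

write(42)
print(read())  # -> 42
```

Every access costs a send and a receive plus a round trip to the owner, which illustrates why this emulation is expensive compared with real shared memory.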
 In a MIMD message-passing multicomputer system, each
"processor" may be a tightly coupled multiprocessor system with
shared memory. Within the multiprocessor system, the processors
communicate via shared memory.
 Between computers, communication is by message passing, which
is more suited to wide-area distributed systems.

Logical Time
3.2.1 Definition
 A system of logical clocks consists of a time domain T and a logical
clock C.
 Elements of T form a partially ordered set over a relation < called
happened before or causal precedence.
 The logical clock C is a function that maps an event e to the time
domain T, denoted as C(e) and called the timestamp of e, and is
defined as follows:
C : H → T
 If the following monotonicity property is satisfied, it is
called the clock consistency condition:
for two events ei and ej, ei → ej ⇒ C(ei) < C(ej).
 When T and C satisfy the following condition, the system of clocks
is said to be strongly consistent:
for two events ei and ej, ei → ej ⇔ C(ei) < C(ej).
3.2.2 Implementing logical clocks
 Implementation of logical clocks requires addressing two issues:
 data structures local to every process to represent logical time and
 a protocol to update the data structures to ensure the consistency
condition.
 Each process pi maintains data structures with the following two
capabilities:
o A local logical clock, lci, that helps process pi to measure its
own progress.
o A logical global clock, gci, that represents process pi's local
view of logical global time. It allows this process to assign consistent
timestamps to its local events.
 The protocol ensures that a process’s logical clock, and thus its
view of global time, is managed consistently.
 The protocol consists of the following two rules:
o R1 This rule governs how the local logical clock is updated by
a process when it executes an event (send, receive, or internal).
o R2 This rule governs how a process updates its global logical
clock to update its view of the global time and global progress. It
dictates what information about the logical time is piggybacked in a
message and how it is used by the process to update its view of
global time.
3.3 Scalar time
 3.3.1 Definition
 The scalar time representation was proposed by Lamport to totally
order events in a distributed system. Time domain is represented as
the set of non-negative integers.
 The logical local clock of a process pi and its local view of global
time are squashed into one integer variable Ci.
Rules R1 and R2 used to update the clocks are as follows:
 R1: Before executing an event (send, receive, or internal), process pi
executes:
Ci := Ci + d (d > 0)
d can have a different value at each execution of R1 and may be
application-dependent. Here d is kept at 1.
 R2: Each message piggybacks the clock value of its sender at
sending time. When a process pi receives a message with timestamp
Cmsg, it executes the following actions:
1. Ci := max(Ci, Cmsg);
2. execute R1;
3. deliver the message.
 Figure 3.1 shows the evolution of scalar time with d=1.

Figure 3.1 Evolution of scalar time
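Rules R1 and R2 can be sketched in code as follows. This is a hypothetical sketch with d = 1; the class and method names are illustrative, not from the source.

```python
class ScalarClock:
    """Lamport's scalar clock for one process, with increment d = 1."""

    def __init__(self):
        self.c = 0

    def tick(self):
        # R1: before executing any event, Ci := Ci + d
        self.c += 1
        return self.c

    def send(self):
        # Piggyback the clock value of the sender at sending time.
        return self.tick()

    def receive(self, c_msg):
        # R2: Ci := max(Ci, Cmsg), then execute R1, then deliver.
        self.c = max(self.c, c_msg)
        return self.tick()

p1, p2 = ScalarClock(), ScalarClock()
t1 = p1.send()        # P1's send event gets timestamp 1
t2 = p2.receive(t1)   # P2 jumps past the sender: max(0, 1) + 1 = 2
print(t1, t2)         # -> 1 2
```

Because the receiver takes the max before ticking, the receive event's timestamp always exceeds the send event's, which is exactly the consistency condition ei → ej ⇒ C(ei) < C(ej).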


3.3.2 Basic properties
Consistency property
 Scalar clocks satisfy the monotonicity and hence the consistency
property, i.e., for two events ei and ej,
ei → ej ⇒ C(ei) < C(ej).

Total Ordering
 Scalar clocks can be used to totally order events in a distributed
system.
 Problem in totally ordering events: Two or more events at different
processes may have an identical timestamp. i.e., for two events e1
and e2, C(e1) = C(e2) ⇒ e1|| e2.
 In Figure 3.1, 3rd event of process P1 and 2nd event of process P2
have same scalar timestamp. Thus, a tie-breaking mechanism is
needed to order such events.
 A tie among events with identical scalar timestamps is broken on the
basis of their process identifiers: the lower the process identifier,
the higher the priority.
 The timestamp of an event is a tuple (t, i), where t is the time of
occurrence and i is the identity of the process at which it occurred.
The total order relation ≺ on two events x and y with
timestamps (h, i) and (k, j) is:
x ≺ y ⇔ (h < k or (h = k and i < j))
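Python's lexicographic comparison of tuples implements this total order directly, so (t, i) timestamp pairs can be compared as-is (the values below are illustrative):

```python
# x ≺ y ⇔ (h < k) or (h = k and i < j) is exactly lexicographic
# comparison of (t, i) tuples.
x = (3, 1)    # event at scalar time 3 on process 1
y = (3, 2)    # event at scalar time 3 on process 2
print(x < y)  # -> True: identical timestamps, tie broken by process id
```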
Event counting
 If the increment value d is always 1, then if an event e has
timestamp h, h − 1 represents the minimum number of events that
happened before producing the event e.
 In Figure 3.1, five events precede event b on the longest causal
path ending at b.
No strong consistency
 The system of scalar clocks is not strongly consistent; that is, for
two events ei and ej,
C(ei) < C(ej) ⇏ ei → ej.
 In Figure 3.1, the 3rd event of process P1 has a smaller scalar
timestamp than the 3rd event of process P2, yet it does not causally
precede it.
3.4 Vector time
Definition
 The system of vector clocks was developed independently by Fidge,
Mattern, and Schmuck.
 Here, the time domain is represented by a set of n-dimensional
non-negative integer vectors.
 Each process pi maintains a vector vti[1..n], where vti[i] is the local
logical clock of pi that specifies the progress at process pi.
 vti[j] represents process pi's latest knowledge of process pj's local
time.
 If vti[j] = x, then process pi knows that the local time at process pj
has progressed till x.
 The entire vector vti constitutes pi’s view of global logical time and
is used to timestamp events.
 Process pi uses the following two rules R1 and R2 to update its
clock:
R1:
 Before executing an event, process pi updates its local logical time
as follows:
vti[i] = vti[i] + d (d>0)
R2:
 Each message m is piggybacked with the vector clock vt of the
sender process at sending time.
 On receipt of message (m,vt), process pi executes:
1. update its global logical time as follows:
1 ≤ k ≤n : vti[k] := max(vti[k], vt[k])
2. execute R1;
3. deliver the message m.
 The timestamp associated with an event is the value of vector clock
of its process when the event is executed.
 The vector clocks progress with the increment value d = 1. Initially,
it is [0, 0, 0, .. , 0].
 The following relations are defined to compare two vector
timestamps, vh and vk:
vh = vk ⇔ ∀x : vh[x] = vk[x]
vh ≤ vk ⇔ ∀x : vh[x] ≤ vk[x]
vh < vk ⇔ vh ≤ vk and ∃x : vh[x] < vk[x]
vh || vk ⇔ ¬(vh < vk) ∧ ¬(vk < vh)
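Rules R1 and R2 for vector clocks, together with the comparison relations above, can be sketched as follows. This is a hypothetical sketch with d = 1; the class and function names are illustrative.

```python
class VectorClock:
    """Vector clock for process i in a system of n processes, d = 1."""

    def __init__(self, i, n):
        self.i = i
        self.vt = [0] * n               # initially [0, 0, ..., 0]

    def tick(self):
        # R1: vti[i] := vti[i] + d
        self.vt[self.i] += 1

    def send(self):
        # Piggyback the whole vector on the outgoing message.
        self.tick()
        return list(self.vt)

    def receive(self, vt_msg):
        # R2: componentwise max with the piggybacked vector, then R1.
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        self.tick()

def leq(vh, vk):          # vh ≤ vk
    return all(a <= b for a, b in zip(vh, vk))

def lt(vh, vk):           # vh < vk
    return leq(vh, vk) and vh != vk

def concurrent(vh, vk):   # vh || vk
    return not lt(vh, vk) and not lt(vk, vh)

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.send()                   # p0's vector becomes [1, 0]
p1.receive(m)                   # p1's vector becomes [1, 1]
print(lt(m, p1.vt))             # -> True: the send causally precedes the receive
print(concurrent([0, 1], [1, 0]))  # -> True: neither vector dominates
```

The two final checks illustrate the isomorphism discussed next: x → y exactly when vh < vk, and x || y exactly when neither vector dominates the other.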

3.4.2 Basic properties


Isomorphism
 The relation "→" induces a partial order on the set of events in a
distributed execution.
 If events are timestamped using vector clocks, then for two events
x and y with timestamps vh and vk, respectively,
x → y ⇔ vh < vk
x || y ⇔ vh || vk
 Thus, there is an isomorphism between the set of partially ordered
events and their vector timestamps.
 Hence, to compare two timestamps, if events x and y occurred at
processes pi and pj and are assigned timestamps vh and vk,
respectively, then
x → y ⇔ vh[i] ≤ vk[i]
x || y ⇔ vh[i] > vk[i] ∧ vh[j] < vk[j]
Strong consistency
 The system of vector clocks is strongly consistent;
 Hence, by examining the vector timestamps of two events, it can be
determined whether the events are causally related.
Event counting
 If d is always 1 in rule R1, then the ith component of vector clock at
process pi, vti[i], denotes the number of events that have occurred at
pi until that instant.
 So, if an event e has timestamp vh, vh[j] denotes the number of
events executed by process pj that causally precede e.
 Σj vh[j] − 1 represents the total number of events that causally
precede e in the distributed computation.
Applications
 As vector time tracks causal dependencies exactly, its applications
include:
 distributed debugging;
 implementations of causal ordering communication and causal
distributed shared memory;
 establishment of global breakpoints to determine the consistency of
checkpoints in recovery.
Linear extension
 A linear extension of a partial order (E, ≺) is a linear ordering of E
that is consistent with the partial order: if two events are ordered
in the partial order, they are also ordered in the linear order. It
can be viewed as projecting all the events from the different
processes onto a single time axis.
Dimension
 The dimension of a partial order is the minimum number of linear
extensions whose intersection gives exactly the partial order.

4
Synchronous versus asynchronous executions
 An asynchronous execution is an execution in which
(i) there is no processor synchrony and there is no bound on the drift
rate of processor clocks,
(ii) message delays (transmission + propagation times) are finite but
unbounded, and
(iii) there is no upper bound on the time taken by a process to
execute a step.

Figure 1.9 An example timing diagram of an asynchronous execution
in a message-passing system.
 A synchronous execution is an execution in which
(i) processors are synchronized and the clock drift rate between
any two processors is bounded,
(ii) message delivery (transmission + delivery) times are such
that they occur in one logical step or round, and
(iii) there is a known upper bound on the time taken by a
process to execute a step.
If processors are allowed to have an asynchronous execution
for a period of time and then they synchronize, then the
granularity of the synchrony is coarse. This is really a virtually
synchronous execution, and the abstraction is sometimes termed
as virtual synchrony.
Ideally, many programs want the processes to execute a series of
instructions in rounds (also termed steps or phases)
asynchronously, with the requirement that after each
round/step/phase, all the processes should be synchronized and
all messages sent should be delivered.
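The round structure described above can be sketched with a barrier. This is a hypothetical sketch, assuming threads stand in for processes; the names N, ROUNDS and delivered are illustrative.

```python
import threading

N, ROUNDS = 3, 2
barrier = threading.Barrier(N)          # all N processes meet here each round
delivered = [[] for _ in range(ROUNDS)]  # messages "sent" in each round

def process(pid):
    for r in range(ROUNDS):
        # "Send" a message in round r; appending before the barrier
        # means it is delivered by the time the next round begins.
        delivered[r].append((pid, r))
        barrier.wait()                  # end of round: everyone synchronizes

threads = [threading.Thread(target=process, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print([len(msgs) for msgs in delivered])  # -> [3, 3]
```

The barrier guarantees that no process enters round r + 1 until every process has finished round r, so all messages sent in a round are received within that same round.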

Figure 1.10 An example of a synchronous execution in a message-passing
system. All the messages sent in a round are received within
that same round.
1.7.1 Emulating an asynchronous system by a synchronous
system (A→S)
An asynchronous program can be emulated on a synchronous
system fairly trivially, as the synchronous system is a special
case of an asynchronous system – all communication finishes
within the same round in which it is initiated.
1.7.2 Emulating a synchronous system by an asynchronous
system (S →A)
A synchronous program (written for a synchronous system)
can be emulated on an asynchronous system using a tool
called a synchronizer.
1.7.3 Emulations
Using the emulations shown, any class can be emulated by
any other. If system A can be emulated by system B,
denoted A/B, and if a problem is not solvable in B, then it is
also not solvable in A. Likewise, if a problem is solvable in A,
it is also solvable in B.
Hence, all four classes are equivalent in terms of
"computability" – what can and cannot be computed – in
failure-free systems.

Figure 1.11 Emulations among the principal system classes in
a failure-free system
5
3.9 Physical clock synchronization: NTP
3.9.1 Motivation
 In centralized systems:
o There is no need for clock synchronization because there is
only a single clock. A process gets the time by issuing a system call to
the kernel.
o When another process gets the time after that, it will get a
higher time value.
Thus, there is a clear ordering of events and no ambiguity
about event occurrences.
 In distributed systems:
o There is no global clock or common memory.
o Each processor has its own internal clock and its own notion
of time; in practice, these clocks drift apart by several seconds per
day, accumulating significant errors over time.
 Most applications and algorithms that run in a distributed
system require:
1. The time of the day at which an event happened on a machine in
the network.
2. The time interval between two events that happened on different
machines in the network.
3. The relative ordering of events that happened on different
machines in the network.
 Example applications that need synchronization are: secure
systems, fault diagnosis and recovery, scheduled operations,
database systems.
 Clock synchronization is the process of ensuring that physically
distributed processors have a common notion of time.
 Due to differing clock rates, the clocks at various sites may diverge
with time.
 To correct this, clock synchronization is performed periodically.
Clocks are synchronized to an accurate real-time standard like UTC
(Coordinated Universal Time).
 Clocks that must not only be synchronized with each other but also
have to adhere to physical time are termed physical clocks.
Definitions and terminology
Let Ca and Cb be any two clocks.
1. Time: The time of a clock in a machine p is given by the function
Cp(t), where Cp(t) = t for a perfect clock.
2. Frequency: Frequency is the rate at which a clock progresses. The
frequency at time t of clock Ca is Ca'(t).
3. Offset: Clock offset is the difference between the time reported by
a clock and the real time. The offset of clock Ca is given by
Ca(t) − t. The offset of clock Ca relative to Cb at time t ≥ 0 is given
by Ca(t) − Cb(t).
4. Skew: The skew of a clock is the difference in the frequencies of
the clock and the perfect clock. The skew of clock Ca relative to
clock Cb at time t is Ca'(t) − Cb'(t).
If the skew is bounded by ρ, then, as per Eq. (3.1), clock values are
allowed to diverge at a rate in the range 1 − ρ to 1 + ρ.
5. Drift (rate): The drift of clock Ca is the second derivative of the
clock value with respect to time, namely, Ca''(t). The drift of clock
Ca relative to clock Cb at time t is Ca''(t) − Cb''(t).
Clock inaccuracies
 Physical clocks are synchronized to an accurate real-time standard
like UTC.
However, due to clock inaccuracy, a timer (clock) is said to be
working within its specification if
1 − ρ ≤ dC/dt ≤ 1 + ρ      (3.1)
where the constant ρ is the maximum skew rate.


Offset delay estimation method
 The Network Time Protocol (NTP), widely used for clock
synchronization on the Internet, uses the offset delay estimation
method.
 The design of NTP involves a hierarchical tree of time servers.
 The primary server at the root synchronizes with the UTC.
 The next level contains secondary servers, which act as a backup to
the primary server.
 At the lowest level is the synchronization subnet which has the
clients.
Clock offset and delay estimation
 This protocol performs several trials and chooses the trial with the
minimum delay, so as to accurately estimate the local time on the
target node despite varying message or network delays between the
nodes.
 Let T1, T2, T3, T4 be the values of the four most recent timestamps,
as shown in the figure.
 Assume that clocks A and B are stable and running at the same
speed. Let
a = T1 − T3 and b = T2 − T4.
 If the network delay difference from A to B and from B to A, called
the differential delay, is small, the clock offset θ and round-trip
delay δ of B relative to A at time T4 are approximately given by
θ = (a + b)/2,    δ = a − b.
 Each NTP message includes the latest three timestamps T1, T2, and
T3, while T4 is determined upon arrival.
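The offset and delay computation, together with the minimum-delay trial selection, can be sketched as follows. This is a hypothetical sketch; the function name and the timestamp values are illustrative.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Compute clock offset theta and round-trip delay delta from the
    four timestamps T1..T4, using a = T1 - T3 and b = T2 - T4."""
    a = t1 - t3
    b = t2 - t4
    theta = (a + b) / 2   # clock offset of B relative to A
    delta = a - b         # round-trip delay
    return theta, delta

# Several trials; the trial with the smallest round-trip delay gives
# the most accurate estimate, since it suffered the least network delay.
trials = [
    (105, 100, 101, 107),   # illustrative (T1, T2, T3, T4) values
    (205, 200, 201, 205),
]
best = min((offset_and_delay(*t) for t in trials), key=lambda od: od[1])
print(best)  # -> (-0.5, 9)
```

Picking the minimum-delay trial is exactly the "several trials" step described above: a small round-trip delay means the differential delay was small, so the offset formula is most accurate for that trial.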
Figure :The Behaviour of fast, slow and perfect clocks with respect
to UTC.
Figure: The network time protocol (NTP) synchronization protocol.
