
Chapter 2 – Basic Concepts

Contents
 Parallel computing.
 Concurrency.
 Parallelism levels; parallel computer architecture.
 Distributed systems.
 Processes, threads, events, and communication channels.
 Global state of a process group.
 Logical clocks.
 Runs and cuts.
 The snapshot protocol.
 Atomic actions.
 Consensus algorithms.
 Modeling concurrency with Petri Nets.
 Client-server paradigm.
Cloud Computing: Theory and Practice.
Dan C. Marinescu Chapter 2 2
The path to cloud computing
 Cloud computing is based on ideas and the experience
accumulated in many years of research in parallel and distributed
systems.
 Cloud applications are based on the client-server paradigm: relatively
simple software, a thin client, runs on the user's machine, while the
computations are carried out on the cloud.
 Concurrency is important; many cloud applications are data-intensive and
use a number of instances which run concurrently.
 Checkpoint-restart procedures are used as many cloud computations run for
extended periods of time on multiple servers. Checkpoints are taken
periodically in anticipation of the need to restart a process when one or
more systems fail.
 Communication is at the heart of cloud computing. Messages supporting
the coordination of distributed processes travel through noisy and
unreliable communication channels which may lose messages or deliver
duplicate, distorted, or out-of-order messages.
Parallel computing
 Parallel hardware and software systems allow us to:
 Solve problems demanding resources not available on a single system.
 Reduce the time required to obtain a solution.
 The speed-up S measures the effectiveness of parallelization:
S(N) = T(1) / T(N)
T(1)  the execution time of the sequential computation.
T(N)  the execution time when N parallel computations are executed.
 Amdahl's Law: if α is the fraction of running time a sequential program
spends on non-parallelizable segments of the computation, then the
speed-up is bounded by
S ≤ 1/α
 Gustafson's Law: the scaled speed-up with N parallel processes is
S(N) = N – α(N-1)
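As an illustration of the two laws, a short Python sketch; the values α = 0.1 and N = 8 are arbitrary examples, and Amdahl's formula is written in its general form, whose limit as N → ∞ is 1/α:

```python
# Speed-up laws (alpha = sequential fraction of the running time).
def amdahl_speedup(alpha, n):
    """Amdahl: fixed problem size; bounded above by 1/alpha."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

def gustafson_speedup(alpha, n):
    """Gustafson: scaled speed-up with n parallel processes."""
    return n - alpha * (n - 1)

print(amdahl_speedup(0.1, 8))     # ~4.71, well below the bound 1/0.1 = 10
print(gustafson_speedup(0.1, 8))  # 7.3
```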

Concurrency; race conditions and deadlocks
 Concurrent execution can be challenging.
 It could lead to race conditions, an undesirable effect when the results of
concurrent execution depend on the sequence of events.
 Shared resources must be protected by locks/ semaphores /monitors to
ensure serial access.
 Deadlocks and livelocks are possible.
 The four Coffman conditions for a deadlock:
 Mutual exclusion - at least one resource must be non-sharable, only one
process/thread may use the resource at any given time.
 Hold and wait - at least one process/thread must hold one or more
resources while waiting for others.
 No-preemption - the scheduler or a monitor should not be able to force a
process/thread holding a resource to relinquish it.
 Circular wait - given the set of n processes/threads {P1, P2 , P3 , …,Pn }.
Process P1 waits for a resource held by P2 , P2 waits for a resource held
by P3, and so on, Pn waits for a resource held by P1.
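A standard way to defeat the circular-wait condition is to impose a global order on lock acquisition. The Python sketch below (thread and lock names are illustrative) has two threads acquire the same two locks in one fixed order, so no cycle of waiting processes can form:

```python
import threading

# Both threads acquire the locks in the same global order (here: by id),
# so the circular-wait condition can never hold and deadlock is impossible.
lock_a, lock_b = threading.Lock(), threading.Lock()
ordered = sorted([(id(lock_a), lock_a), (id(lock_b), lock_b)])

result = []
def worker(name):
    for _, lk in ordered:        # acquire in the agreed order
        lk.acquire()
    result.append(name)          # critical section
    for _, lk in reversed(ordered):
        lk.release()

threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(result)  # both threads complete, in some order
```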
A monitor provides special procedures to access the data in a critical section.
More challenges
 Livelock condition: two or more processes/threads continually change
their state in response to changes in the other processes; then none of
the processes can complete its execution.
 Very often processes/threads running concurrently are assigned
priorities and scheduled based on these priorities. Priority inversion
occurs when a higher-priority process/task is indirectly preempted by a
lower-priority one.
 Discovering parallelism is often challenging and the development of
parallel algorithms requires a considerable effort. For example, many
numerical analysis problems, such as solving large systems of linear
equations or solving systems of PDEs (Partial Differential Equations),
require algorithms based on domain decomposition methods.

Parallelism
 Fine-grain parallelism  relatively small blocks of the code can be
executed in parallel without the need to communicate or
synchronize with other threads or processes.
 Coarse-grain parallelism  large blocks of code can be executed in
parallel.
 The speed-up of applications displaying fine-grain parallelism is
considerably lower than that of coarse-grained applications;
the processor speed is orders of magnitude higher than the
communication speed, even on systems with a fast interconnect.
 Data parallelism  the data is partitioned into several blocks and
the blocks are processed in parallel.
 Same Program Multiple Data (SPMD)  data parallelism when
multiple copies of the same program run concurrently, each one on
a different data block.

Parallelism levels
 Bit-level parallelism. The number of bits processed per clock cycle,
often called the word size, has increased gradually from 4 bits to 8, 16,
32, and 64 bits. This has reduced the number of instructions required to
process larger operands and allowed a significant performance
improvement. During this evolutionary process the number of address
bits has also increased, allowing instructions to reference a larger
address space.
 Instruction-level parallelism. Today's computers use multi-stage
processing pipelines to speed up execution.
 Data parallelism or loop parallelism. The program loops can be
processed in parallel.
 Task parallelism. The problem can be decomposed into tasks that can
be carried out concurrently. For example, SPMD. Note that data
dependencies cause different flows of control in individual tasks.

Parallel computer architecture
 Michael Flynn’s classification of computer architectures is based on
the number of concurrent control/instruction and data streams:
 SISD (Single Instruction Single Data) – scalar architecture with one
processor/core.
 SIMD (Single Instruction, Multiple Data) - supports vector processing.
When a SIMD instruction is issued, the operations on individual vector
components are carried out concurrently.
 MIMD (Multiple Instructions, Multiple Data) - a system with several
processors and/or cores that function asynchronously and
independently; at any time, different processors/cores may be
executing different instructions on different data. We distinguish several
types of systems:
 Uniform Memory Access (UMA).

 Cache Only Memory Access (COMA).

 Non-Uniform Memory Access (NUMA).

Distributed systems
 Collection of autonomous computers connected through a network, with
distribution software called middleware that enables the computers to
coordinate their activities and to share system resources.
 Characteristics:
 The users perceive the system as a single, integrated computing facility.
 The components are autonomous.
 Scheduling and other resource management and security policies are
implemented by each system.
 There are multiple points of control and multiple points of failure.
 The resources may not be accessible at all times.
 Can be scaled by adding additional resources.
 Can be designed to maintain availability even at low levels of
hardware/software/network reliability.

Desirable properties of a distributed system
 Access transparency - local and remote information objects are
accessed using identical operations.
 Location transparency - information objects are accessed without
knowledge of their location.
 Concurrency transparency - several processes run concurrently using
shared information objects without interference among them.
 Replication transparency - multiple instances of information objects
increase reliability without the knowledge of users or applications.
 Failure transparency - the concealment of faults.
 Migration transparency - the information objects in the system are
moved without affecting the operation performed on them.
 Performance transparency - the system can be reconfigured based on
the load and quality of service requirements.
 Scaling transparency - the system and the applications can scale without
a change in the system structure and without affecting the applications.
Processes, threads, events
 Dispatchable units of work:
 Process  a program in execution.
 Thread  a light-weight process.
 State of a process/thread  the ensemble of information we need to
restart a process/thread after it was suspended.
 Event  is a change of state of a process.
 Local events.
 Communication events.
 Process group  a collection of cooperating processes; the
processes work in concert and communicate with one another in
order to reach a common goal.
 The global state of a distributed system consisting of several
processes and communication channels is the union of the states of
the individual processes and channels

Messages and communication channels
 Message  a structured unit of information.
 Communication channel  provides the means for processes or
threads to communicate with one another and coordinate their
actions by exchanging messages. Communication is done only by
means of send(m) and receive(m) communication events, where m
is a message.
 The state of a communication channel : given two processes pi and
pj, the state of the channel ξi,j, from pi to pj consists of messages
sent by pi but not yet received by pj.
 Protocol  a finite set of messages exchanged among processes to
help them coordinate their actions.

Space-time diagrams display local and communication events during a
process lifetime. Local events are shown as small black circles.
Communication events in different processes are connected by lines from
the send event to the receive event.
(a) All events in process p1 are local; the process is in state σ1
immediately after the occurrence of event e11 and remains in that state
until the occurrence of event e12.
(b) Two processes p1 and p2; event e12 is a communication event, p1
sends a message to p2; e23 is a communication event, process p2 receives
the message sent by p1.
(c) Three processes interact by means of communication events.
Global state of a process group
 The global states of a distributed computation with n processes form
an n-dimensional lattice.
 How many paths exist to reach a global state? The more paths, the
harder it is to identify the events leading to a given state; a large
number of paths makes debugging the system more difficult.
 In the case of two threads the number of paths from the initial state Σ(0,0)
to the state Σ(m,n) is:
N(m,n) = (m+n)! / (m!n!)
 In the two-dimensional case the global state Σ(m,n) can only be
reached from two states, Σ(m-1,n) and Σ(m,n-1).
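The path count is the binomial coefficient; a quick check using only the Python standard library:

```python
from math import comb, factorial

# Number of paths from Sigma(0,0) to Sigma(m,n): (m+n)! / (m! * n!)
def num_paths(m, n):
    return factorial(m + n) // (factorial(m) * factorial(n))

print(num_paths(2, 2))                # 6 runs lead to Sigma(2,2)
assert num_paths(2, 2) == comb(4, 2)  # same as the binomial coefficient
```

The value 6 agrees with the six possible event sequences leading to the state Σ(2,2).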

(a) The lattice of the global states of two processes, with the
space-time diagram showing only the first two events per process.
(b) The six possible sequences of events leading to the state Σ(2,2).
Communication protocols - coordination
 Communication in the presence of channel failures is a major concern. It
is impossible to guarantee that two processes will reach an agreement
in case of channel failures (see next slide).
 In practice, error detection and error correction codes allow processes
to communicate reliably through noisy digital channels. The
redundancy of a message is increased by adding extra bits and packaging
the message as a codeword.
 Communication protocols implement:
 Error control mechanisms – using error detection and error correction
codes.
 Flow control - provides feedback from the receiver, it forces the sender to
transmit only the amount of data the receiver can handle.
 Congestion control - ensures that the offered load of the network does not
exceed the network capacity.

Processes p1 and p2 exchange messages 1, 2, ..., n-1, n.
Process coordination in the presence of errors; each message may be lost with
probability ε. If a protocol consisting of n messages exists, then the protocol
should be able to function properly with n-1 messages reaching their
destination, one of them being lost.

Time and time intervals
 Process coordination requires:
 A global concept of time shared by cooperating entities.
 The measurement of time intervals, the time elapsed between two
events.
 Two events in the global history may be unrelated, neither one is the
cause of the other; such events are said to be concurrent events.
 Local timers provide relative time measurements. An isolated system
can be characterized by its history expressed as a sequence of
events, each event corresponding to a change of the state of the
system.
 Global agreement on time is necessary to trigger actions that should
occur concurrently.
 Timestamps are often used for event ordering using a global time
base constructed on local virtual clocks.

Logical clocks
 Logical clock (LC)  an abstraction necessary to ensure the clock
condition in the absence of a global clock.
 A process maps events to positive integers. LC(e) the local variable
associated with event e.
 Each process time-stamps each message m it sends with the value
of the logical clock at the time of sending:
TS(m) = LC(send(m)).
 The rules to update the logical clock:
LC(e) = LC + 1 if e is a local or send event
LC(e) = max(LC, TS(m)) + 1 if e is a receive event.
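The update rules can be sketched as a small Python class (class and method names are illustrative):

```python
# Minimal sketch of Lamport's logical clock rules.
class LogicalClock:
    def __init__(self):
        self.lc = 0

    def local_event(self):
        self.lc += 1                    # LC(e) = LC + 1
        return self.lc

    def send(self):
        self.lc += 1                    # a send is also an event
        return self.lc                  # TS(m) = LC(send(m))

    def receive(self, ts):
        self.lc = max(self.lc, ts) + 1  # LC(e) = max(LC, TS(m)) + 1
        return self.lc

p1, p2 = LogicalClock(), LogicalClock()
p1.local_event()       # p1's clock: 1
ts = p1.send()         # p1's clock: 2, message carries TS(m) = 2
p2.local_event()       # p2's clock: 1
print(p2.receive(ts))  # p2 jumps to max(1, 2) + 1 = 3
```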

Three processes and their logical clocks. The logical clock values along
the events of p1 are 1, 2, 3, 4, 5, 12; along p2: 1, 2, 6, 7, 8, 9; and
along p3: 1, 2, 3, 10, 11; messages m1, ..., m5 carry the timestamps
that advance the receivers' clocks.
Message delivery rules; causal delivery
 The communication channel abstraction makes no assumptions about
the order of messages; a real-life network might reorder messages.
 First-In-First-Out (FIFO) delivery  messages are delivered in the
same order they are sent.
 Causal delivery  an extension of the FIFO delivery to the case when
a process receives messages from different sources.
 Even if the communication channel does not guarantee FIFO delivery,
FIFO delivery can be enforced by attaching a sequence number to
each message sent. The sequence numbers are also used to
reassemble messages out of individual packets.
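A minimal sketch of enforcing FIFO delivery with sequence numbers, assuming a channel that may reorder but does not lose messages (class and field names are illustrative):

```python
# Each message carries a per-sender sequence number; the receiver buffers
# out-of-order messages and delivers them strictly in sequence.
class FifoReceiver:
    def __init__(self):
        self.next_seq = 0
        self.buffer = {}     # seq -> payload, held until deliverable
        self.delivered = []

    def receive(self, seq, payload):
        self.buffer[seq] = payload
        # Deliver every message whose turn has come.
        while self.next_seq in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

r = FifoReceiver()
for seq, msg in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:  # network reorders
    r.receive(seq, msg)
print(r.delivered)  # ['a', 'b', 'c', 'd']
```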

Message receiving and message delivery are two distinct operations:
process pi sends over the channel, the channel/process interface of pj
receives the message, and the interface delivers it to process pj. The
channel-process interface implements the delivery rules, e.g., FIFO
delivery.
Violation of causal delivery when more than two processes are involved:
message m1 is delivered to process p2 after message m3, though m1
was sent before m3. Indeed, message m3 was sent by process p1 after
receiving m2, which in turn was sent by process p3 after sending
message m1.
Runs and cuts
 Run  a total ordering of all the events in the global history of a
distributed computation consistent with the local history of each
participant process; a run implies a sequence of events as well as a
sequence of global states.
 Cut  a subset of the local history of all processes.
 Frontier of the cut in the global history of n processes  an n-tuple
consisting of the last event of every process included in the cut.
 Cuts provide the intuition to generate global states based on an
exchange of messages between a monitor and a group of processes.
The cut represents the instant when requests to report their individual
state are received by the members of the group.

Consistent and inconsistent cuts and runs
 Consistent cut  a cut closed under the causal precedence
relationship.
 A consistent cut establishes an instance of a distributed computation;
given a consistent cut we can determine if an event e occurred before
the cut.
 Consistent run  the total ordering of events imposed by the run is
consistent with the partial order imposed by the causal relation.
 The causal history of event e is the smallest consistent cut of the
process group including event e.

Inconsistent and consistent cuts.
The cut C1 = (e14, e25, e32) is inconsistent because it includes e24, the
event triggered by the arrival of message m3 at process p2, but does not
include e33, the event triggered by process p3 sending m3; thus, the cut
C1 violates causality.
The cut C2 = (e15, e26, e33) is consistent; there is no causal
inconsistency: it includes event e26, the sending of message m4, without
including its effect, e34, the receiving of the message by process p3.
The causal history of event e25 is the smallest consistent cut including e25.
The snapshot protocol of Chandy and Lamport
 A protocol to construct consistent global states.
 The protocol has three steps:
 Process p0 sends to itself a take snapshot message.
 Let pf be the process from which pi receives the take snapshot message
for the first time. Upon receiving the message, the process pi records its
local state and relays the take snapshot along all its outgoing channels
without executing any events on behalf of its underlying computation;
channel state is set to empty and process pi starts recording messages
received over each of its incoming channels.
 Let ps be the process from which pi receives the take snapshot
message beyond the first time; process pi stops recording messages
along the incoming channel from ps and declares channel state between
processes pi and ps as those messages that have been recorded.
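The three steps above can be sketched from the point of view of a single process pi; the class and method names below are illustrative assumptions, and the message transport around the process is omitted:

```python
# One process's role in the Chandy-Lamport snapshot protocol (sketch).
class SnapshotProcess:
    def __init__(self, incoming_channels):
        self.local_state = None
        self.channel_state = {}                  # channel -> in-flight messages
        self.recording = set(incoming_channels)  # channels not yet closed

    def on_take_snapshot(self, channel, current_state):
        if self.local_state is None:
            # First marker: record local state, declare the marker's channel
            # empty, keep recording all other incoming channels, relay marker.
            self.local_state = current_state
            self.channel_state[channel] = []
            self.recording.discard(channel)
            return "relay-marker"
        # Subsequent marker: stop recording on that channel; its state is
        # whatever messages were recorded in the meantime.
        self.channel_state.setdefault(channel, [])
        self.recording.discard(channel)
        return "channel-closed"

    def on_message(self, channel, msg):
        # Messages arriving on still-open channels belong to the channel state.
        if self.local_state is not None and channel in self.recording:
            self.channel_state.setdefault(channel, []).append(msg)

p = SnapshotProcess(incoming_channels=["c01", "c21"])
p.on_take_snapshot("c01", current_state="s")  # first marker: relay it
p.on_message("c21", "m")                      # in-flight message recorded
p.on_take_snapshot("c21", current_state="s")  # second marker: close channel
print(p.channel_state)  # {'c01': [], 'c21': ['m']}
```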

 Six processes (p0, ..., p5) executing the snapshot protocol of Chandy
and Lamport; message 1 is the take snapshot message p0 sends to itself,
and messages labeled 2 are the relayed take snapshot messages.
Concurrency
 Required by system and application software:
 Reactive systems respond to external events; e.g., the kernel of an
operating system, embedded systems.
 Improve performance - parallel applications partition the workload and
distribute it to multiple threads running concurrently.
 Support a variable load and shorten the response time - distributed
applications, including transaction management systems and
applications based on the client-server paradigm.

Context switching when a page fault occurs during the instruction fetch
phase. The Virtual Memory Manager attempts to translate the virtual
address of the next instruction of thread 1 into a (page #, displacement)
pair and encounters a page fault because the page is not in primary
storage. Thread 1 is suspended, waiting for the event signaling that the
page was paged in from the disk. The Exception Handler saves the PC,
identifies the missing page, and invokes the Multi-Level Memory
Manager on behalf of thread 1. The Scheduler dispatches thread 2.
Atomic actions
 Parallel and distributed applications must take special precautions
for handling shared resources.
 Atomic operation  a multi-step operation should be allowed to
proceed to completion without any interruptions and should not
expose the state of the system until the action is completed.
 Hiding the internal state of an atomic action reduces the number of
states a system can be in and thus simplifies the design and
maintenance of the system.
 Atomicity requires hardware support:
 Test-and-Set  instruction which writes to a memory location and
returns the old content of that memory cell as non-interruptible.
 Compare-and-Swap  instruction which compares the contents of a
memory location to a given value and, only if the two values are the
same, modifies the contents of that memory location to a given new
value.
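A spin-lock built on compare-and-swap illustrates why the instruction matters. Python exposes no user-level CAS instruction, so the sketch below emulates one with a lock purely for illustration; a real implementation relies on the hardware instruction:

```python
import threading

# Emulated compare-and-swap (illustration only; real CAS is a single
# hardware instruction).
class CAS:
    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Atomically: if value == expected, set it to new; return old value."""
        with self._guard:
            old = self._value
            if old == expected:
                self._value = new
            return old

class SpinLock:
    def __init__(self):
        self._flag = CAS(0)   # 0 = free, 1 = held

    def acquire(self):
        while self._flag.compare_and_swap(0, 1) != 0:
            pass              # spin until the CAS succeeds

    def release(self):
        self._flag.compare_and_swap(1, 0)

lock = SpinLock()
counter = 0
def bump():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1          # protected increment
        lock.release()

threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000: no increments lost
```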
All-or-nothing atomicity
 Either the entire atomic action is carried out, or the system is left in the
same state it was before the atomic action was attempted;
a transaction is either carried out successfully, or the record targeted
by the transaction is returned to its original state.
 Two phases:
 Pre-commit  during this phase it should be possible to back up from it
without leaving any trace. Commit point - the transition from the first to the
second phase. During the pre-commit phase all steps necessary to prepare
the post-commit phase, e.g., check permissions, swap in main memory all
pages that may be needed, mount removable media, and allocate stack
space must be carried out; during this phase no results should be exposed
and no actions that are irreversible should be carried out.
 Post-commit phase  should be able to run to completion. Shared
resources allocated during the pre-commit cannot be released until after the
commit point.

The states of an all-or-nothing action: a new action enters the Pending
state; Commit moves it to the Committed state, Abort moves it to the
Aborted state, and a pending action may also be Discarded.
Storage models: cell storage holds the current version of each variable
(a, ..., k, ..., z); journal storage adds a manager implementing the
READ, WRITE, COMMIT, and ABORT actions of an action, together with a
catalog, outcome records, and the version histories of the variables
kept in cell storage.
Cell storage does not support all-or-nothing actions. When we maintain
the version histories it is possible to restore the original content, but
we need to encapsulate the data access and provide mechanisms to
implement the two phases of an atomic all-or-nothing action. The journal
storage does precisely that.
Atomicity
 Before-or-after atomicity  the effect of multiple actions is as if
these actions have occurred one after another, in some order.
 A systematic approach to atomicity must address several delicate
questions:
 How to guarantee that only one atomic action has access to a shared
resource at any given time.
 How to return to the original state of the system when an atomic action
fails to complete.
 How to ensure that the order of several atomic actions leads to
consistent results.

Consensus protocols
 Consensus  process of agreeing to one of several alternates
proposed by a number of agents.
 Consensus service  set of n processes; clients send requests,
propose a value and wait for a response; the goal is to get the set of
processes to reach consensus on a single proposed value.
 Assumptions:
 Processes run on processors and communicate through a network;
processors and network may experience failures, but not Byzantine
failures.
 Processors: (i) operate at arbitrary speeds; (ii) have stable storage and
may rejoin the protocol after a failure; (iii) send messages to one another.
 The network: (i) may lose, reorder, or duplicate messages; (ii) messages
are sent asynchronously and may take arbitrarily long to reach their
destination.

Paxos
 Paxos  a family of protocols to reach consensus based on a finite
state machine approach.
 Basic Paxos considers several types of entities:
 Client  agent that issues a request and waits for a response.
 Proposer  agent with the mission to advocate a request from a client,
convince the acceptors to agree on the value proposed by a client, and to
act as a coordinator to move the protocol forward in case of conflicts.
 Acceptor  agent acting as the fault-tolerant memory of the protocol.
 Learner  agent acting as the replication factor of the protocol and taking
action once a request has been agreed upon.
 Leader  a distinguished proposer.
 Quorum  subset of all acceptors.
 A proposal has a proposal number pn and contains a value v.
 Several types of requests flow through the system, e.g., prepare and
accept requests.

The Paxos algorithm has two phases
 Phase I.
 Proposal preparation: a proposer (the leader) chooses a proposal number
pn=k and sends a prepare message to a majority of acceptors requesting:
 that proposals with pn < k should not be accepted;
 the highest-numbered proposal with pn < k already accepted by each acceptor,
if any.
 Proposal promise: an acceptor must remember the highest proposal number it
has ever accepted and the highest proposal number it has ever responded to. It
can accept a proposal with pn=k if and only if it has not responded to a
prepare request with pn > k; if it has already replied to a prepare request for
a proposal with pn > k, it should not reply.
 Phase II.
 Accept request: if a majority of acceptors respond, then the proposer
chooses the value v of the proposal as follows:
 the value of the highest-numbered proposal among all the responses;
 an arbitrary value if none of the responses contains a proposal.
The proposer then sends an accept request message with (pn=k, v) to a quorum
of acceptors.


 Accept: If an acceptor receives an accept message for a proposal with the proposal
number pn=k it must accept it if and only if it has not already promised to consider
proposals with a pn > k.
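The acceptor's side of the two phases can be condensed into a small state machine; this is an illustrative sketch of the promise/accept rules above, not the full protocol (class and field names are assumptions):

```python
# Sketch of a single Paxos acceptor's promise/accept rules.
class Acceptor:
    def __init__(self):
        self.promised_pn = -1  # highest prepare request responded to
        self.accepted = None   # (pn, value) of the highest accepted proposal

    def on_prepare(self, pn):
        if pn > self.promised_pn:
            self.promised_pn = pn
            # Promise, reporting the highest proposal already accepted (if any).
            return ("promise", self.accepted)
        return None            # already promised a higher pn: do not reply

    def on_accept(self, pn, value):
        if pn >= self.promised_pn:  # not promised to a higher-numbered proposal
            self.promised_pn = pn
            self.accepted = (pn, value)
            return ("accepted", pn, value)
        return None

a = Acceptor()
assert a.on_prepare(1) == ("promise", None)
assert a.on_prepare(0) is None    # lower-numbered prepare is ignored
print(a.on_accept(1, "v"))        # ('accepted', 1, 'v')
```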
The flow of messages for the Paxos consensus algorithm. Individual
clients propose different values (a, b, c, d, e, f, ...) to the leader,
who initiates the algorithm. Acceptor A accepts the value in the message
with proposal number pn=k; acceptor B does not respond with a promise,
while acceptor C responds with a promise but ultimately does not accept
the proposal.
1. Prepare – the leader chooses a proposal with proposal number pn=k.
2. Promise – an acceptor promises to accept the proposal only if it has not
responded to a proposal with pn > k. (B does not respond.)
3. Accept request – the leader chooses the value v of the highest proposal
number from all acceptors who have sent a promise and sends it to all of them.
4. Accept – an acceptor accepts a proposal with pn=k only if it has not promised
to accept a proposal with pn > k; it stores the accepted value in persistent
memory. (C does not accept.)
5. The leader accepts a proposal if the quorum of acceptors send an accept
message for the proposal with pn=k and informs all acceptors that the
proposal has been accepted.
The basic Paxos with the actors: proposer (P), three acceptors (A1, A2,
A3), and two learners (L1, L2). The client (C) sends a request to one of
the entities playing the role of a proposer. The message sequence is:
client request(v); prepare request(v); promise request(1, null); accept
request(1, v); accepted(1, v); client response(v).
(a) Successful first round when there are no failures.
(b) Successful first round of Paxos when an acceptor (A2) fails.
Petri Nets (PNs)

 PNs  bipartite graphs; tokens that flow through the graph.


 Used to model the dynamic rather than static behavior of systems,
e.g., detect synchronization anomalies.
 Bipartite graph  graphs with two classes of nodes; arcs always
connect a node in one class with one or more nodes in the other
class.
 PNs are also called Place-Transition (P/T) Nets. The two classes of
nodes are: places and transitions. Arcs connect one place with one or
more transitions or a transition with one or more places.
 Firing of a transition removes tokens from its input places and adds
them to its output places.
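The firing rule can be stated in a few lines of Python; markings are represented as dictionaries mapping places to token counts (an illustrative encoding):

```python
# A transition is enabled when every input place holds at least as many
# tokens as the arc weight; firing moves tokens from inputs to outputs.
def enabled(marking, inputs):
    return all(marking.get(p, 0) >= w for p, w in inputs.items())

def fire(marking, inputs, outputs):
    assert enabled(marking, inputs), "transition not enabled"
    m = dict(marking)
    for p, w in inputs.items():
        m[p] -= w                 # remove tokens from input places
    for p, w in outputs.items():
        m[p] = m.get(p, 0) + w    # add tokens to output places
    return m

# Example: a transition consuming 2 tokens from p1 and 1 from p2,
# producing 3 tokens in p3.
marking = {"p1": 2, "p2": 1, "p3": 0}
print(fire(marking, {"p1": 2, "p2": 1}, {"p3": 3}))
# {'p1': 0, 'p2': 0, 'p3': 3}
```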

Modeling concurrency with Petri Nets
 Petri Nets can model different activities in a distributed system.
 A transition may model the occurrence of an event, the execution of a
computational task, the transmission of a packet, a logic statement.
 The input places of a transition model the pre-conditions of an event,
the input data for the computational task, the presence of data in an
input buffer, the pre-conditions of a logic statement.
 The output places of a transition model the post-conditions associated
with an event, the results of the computational task, the presence of
data in an output buffer, or the conclusions of a logic statement.

Petri Net firing rules. (a) An unmarked net with one transition, t1, with two input places, p1 and p2, and one output place, p3; the arc weights are 2, 1, and 3, respectively. (b) The marked net, i.e., the net with the places populated by tokens, before firing the enabled transition t1. (c) The marked net after firing transition t1: two tokens from place p1 and one token from place p2 are removed, and tokens are deposited in place p3.
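The firing rule illustrated in this figure can be sketched in a few lines of Python; the representation below (markings and arc weights as dictionaries) is one possible encoding, not a standard library. The marking of panel (b) enables t1, and firing it yields the marking of panel (c).

```python
# Minimal Petri Net firing sketch (illustrative encoding).
# Arc weights follow the figure: t1 consumes 2 tokens from p1 and
# 1 token from p2, and produces 3 tokens in p3.

def enabled(marking, inputs):
    # A transition is enabled when every input place holds
    # at least as many tokens as the weight of its input arc.
    return all(marking[p] >= w for p, w in inputs.items())

def fire(marking, inputs, outputs):
    if not enabled(marking, inputs):
        raise ValueError("transition not enabled")
    m = dict(marking)
    for p, w in inputs.items():
        m[p] -= w                 # remove tokens from input places
    for p, w in outputs.items():
        m[p] = m.get(p, 0) + w    # deposit tokens in output places
    return m

t1_inputs = {"p1": 2, "p2": 1}
t1_outputs = {"p3": 3}
m0 = {"p1": 2, "p2": 1, "p3": 0}            # marked net of panel (b)
print(fire(m0, t1_inputs, t1_outputs))       # {'p1': 0, 'p2': 0, 'p3': 3}
```

After firing, t1 is no longer enabled, since p1 and p2 are empty.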
Petri Nets modeling.
(a) Choice: only one of the transitions, t1 or t2, may fire.
(b) Symmetric confusion: transitions t1 and t3 are concurrent and, at the same time, they are in conflict with t2; if t2 fires, then t1 and/or t3 are disabled.
(c) Asymmetric confusion: transition t1 is concurrent with t3 and it is in conflict with t2 if t3 fires before t1.
Classes of Petri Nets
 Petri Nets are classified based on the number of input and output flow relations from/to a transition or a place, and by the manner in which transitions share input places:
 State Machines - used to model finite state machines; cannot model concurrency and synchronization.
 Marked Graphs - cannot model choice and conflict.
 Free-choice Nets - cannot model confusion.
 Extended Free-choice Nets - cannot model confusion, but allow inhibitor arcs.
 Asymmetric Choice Nets - can model asymmetric confusion, but not symmetric confusion.
Classes of Petri Nets
The classes form a containment hierarchy:
State Machines, Marked Graphs ⊂ Free Choice Nets ⊂ Asymmetric Choice Nets ⊂ Petri Nets
The client-server paradigm
 Based on enforced modularity  the modules are forced to interact only by sending and receiving messages.
 This paradigm leads to:
 A more robust design; the clients and the servers are independent modules and may fail separately.
 Stateless servers; the server does not have to maintain state information, so it may fail and then come up without the clients being affected, or even noticing the failure of the server.
 A lower likelihood of attack; it is difficult for an intruder to guess the format of the messages or the sequence numbers of the segments when the messages are transported by TCP.
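A minimal sketch of the paradigm using Python's standard sockets may help make enforced modularity concrete: the client and the server share no state and interact only through messages over a TCP connection. The echo protocol below is illustrative, not part of the book.

```python
# Minimal client-server sketch over TCP (illustrative echo protocol).
import socket
import threading

def server(sock):
    # The server accepts one connection and interacts with the client
    # only through messages; it keeps no state between requests.
    conn, _ = sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"echo: " + request)

def roundtrip(message):
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))              # ephemeral local port
    srv.listen(1)
    threading.Thread(target=server, args=(srv,), daemon=True).start()

    cli = socket.socket()
    cli.connect(("127.0.0.1", srv.getsockname()[1]))
    cli.sendall(message)                    # client request
    reply = cli.recv(1024)                  # server response
    cli.close()
    srv.close()
    return reply

print(roundtrip(b"hello"))                  # b'echo: hello'
```

Because the two modules communicate only through the socket, either side can be restarted or replaced independently, which is the robustness argument made above.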
 (a) Email service: the sender and the receiver communicate asynchronously using inboxes and outboxes; mail daemons run at each site.
 (b) An event service supports coordination in a distributed system environment. The service is based on the publish-subscribe paradigm: an event producer publishes events and an event consumer subscribes to events. The server maintains queues for each event and delivers notifications to clients when an event occurs.
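The publish-subscribe paradigm described in (b) can be sketched in a few lines; the `EventService` class and method names below are illustrative, not a real event-service API, and a real service would deliver notifications over the network rather than through local callbacks.

```python
# Minimal publish-subscribe sketch (illustrative names).
from collections import defaultdict

class EventService:
    def __init__(self):
        # One list of subscribers per event type.
        self.subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        # An event consumer registers interest in an event type.
        self.subscribers[event].append(callback)

    def publish(self, event, payload):
        # An event producer publishes; the service notifies
        # every consumer subscribed to this event type.
        for callback in self.subscribers[event]:
            callback(payload)

svc = EventService()
received = []
svc.subscribe("task-done", received.append)   # consumer subscribes
svc.publish("task-done", "job-42")            # producer publishes
print(received)                               # ['job-42']
```

The key property is decoupling: the producer never names its consumers; it only names the event.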
World Wide Web: the interaction between a Browser and a Web Server.
The three-way handshake involves the first three messages exchanged between the client and the server: a SYN from the client, a SYN from the server, and an ACK piggybacked with the first HTTP request; TCP connection establishment takes one round-trip time (RTT). Once the TCP connection is established, the HTTP server takes its time to construct the page to respond to the first request (the server residence time; here the Web page is created on the fly) before the data transmission time of the page itself. To satisfy the second request, the HTTP server must retrieve an image from the disk (another server residence time), followed by the image transmission time.
The response time includes the RTT, the server residence time, and the data transmission time.
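The response-time decomposition above can be turned into a back-of-the-envelope calculation; the numbers below are hypothetical, chosen only to illustrate the formula response time = RTT + server residence time + transmission time.

```python
# Rough response-time estimate (hypothetical figures).
def response_time(rtt_s, residence_s, size_bytes, bandwidth_bps):
    # Transmission time is the payload size divided by the link bandwidth.
    transmission_s = size_bytes * 8 / bandwidth_bps
    return rtt_s + residence_s + transmission_s

# e.g., 40 ms RTT, 10 ms to build the page, a 100 KB page over 10 Mb/s:
t = response_time(0.040, 0.010, 100_000, 10_000_000)
print(round(t, 3))  # 0.13 seconds
```

Such an estimate makes it clear which term dominates: for small pages the RTT dominates, while for large objects the transmission time does.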
A Web client can: (a) communicate directly with the HTTP server listening on TCP port 80; (b) communicate through a proxy, which receives the request from the client, forwards it to the server, and relays the server's response back to the client; (c) use tunneling to cross the network.
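The three access modes can be sketched with Python's standard `urllib.request` module; the proxy address below is a placeholder, not a real server, and no request is actually sent.

```python
# Sketch of the three Web access modes (proxy address is a placeholder).
import urllib.request

# (a) direct: the default opener talks to the origin server.
direct = urllib.request.build_opener()

# (b) via a proxy: all HTTP requests are sent to the proxy,
# which forwards them to the server.
proxy = urllib.request.ProxyHandler(
    {"http": "http://proxy.example.com:8080"})
via_proxy = urllib.request.build_opener(proxy)

# (c) HTTPS through a proxy uses the CONNECT method, i.e., the proxy
# opens a tunnel and relays bytes without interpreting them.
tunnel = urllib.request.ProxyHandler(
    {"https": "http://proxy.example.com:8080"})
via_tunnel = urllib.request.build_opener(tunnel)
```

Calling `via_proxy.open(url)` would then route the request through the configured proxy instead of contacting the server directly.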