CS3551 DC 5 Units Notes
UNIT III
DISTRIBUTED MUTEX & DEADLOCK
Mutual exclusion in a distributed system states that only one process is allowed to execute the
critical section (CS) at any given time.
Message passing is the sole means for implementing distributed mutual exclusion.
The decision as to which process is allowed access to the CS next is arrived at by
message passing, in which each process learns about the state of all other processes in
some consistent way.
There are three basic approaches for implementing distributed mutual exclusion:
1. Token-based approach:
A unique token is shared among all the sites.
If a site possesses the unique token, it is allowed to enter its critical section
This approach uses sequence numbers to order requests for the critical section.
Each request for the critical section contains a sequence number. This sequence
number is used to distinguish old and current requests.
This approach ensures mutual exclusion as the token is unique.
Eg: Suzuki-Kasami’s Broadcast Algorithm
2. Non-token-based approach:
A site communicates with other sites in order to determine which site should
execute the critical section next. This requires the exchange of two or more
successive rounds of messages among sites.
This approach uses timestamps instead of sequence numbers to order requests
for the critical section.
Whenever a site makes a request for the critical section, it gets a timestamp.
The timestamp is also used to resolve any conflict between critical section requests.
All algorithms that follow the non-token-based approach maintain a logical
clock. Logical clocks are updated according to Lamport's scheme.
Eg: Lamport's algorithm, Ricart–Agrawala algorithm
3. Quorum-based approach:
Each site requests permission to execute the critical section from a subset of
sites called a quorum. Any two quorums have a common site, and this common
site ensures that only one request executes the critical section at any time.
Eg: Maekawa's algorithm
Preliminaries
The system consists of N sites, S1, S2, S3, …, SN.
Assume that a single process is running on each site.
The process at site Si is denoted by pi. All these processes communicate
asynchronously over an underlying communication network.
A process wishing to enter the CS requests all other or a subset of processes by
sending REQUEST messages, and waits for appropriate replies before entering the
CS.
While waiting, the process is not allowed to make further requests to enter the CS.
A site can be in one of the following three states: requesting the CS, executing the CS,
or neither requesting nor executing the CS.
In the requesting the CS state, the site is blocked and cannot make further requests for
the CS.
In the idle state, the site is executing outside the CS.
In the token-based algorithms, a site can also be in a state where a site holding the
token is executing outside the CS. Such state is referred to as the idle token state.
At any instant, a site may have several pending requests for CS. A site queues up
these requests and serves them one at a time.
N denotes the number of processes or sites involved in invoking the critical section, T
denotes the average message delay, and E denotes the average critical section
execution time.
Safety property:
The safety property states that at any instant, only one process can execute the
critical section. This is an essential property of a mutual exclusion algorithm.
Liveness property:
This property states the absence of deadlock and starvation. Two or more sites
should not endlessly wait for messages that will never arrive. In addition, a site must
not wait indefinitely to execute the CS while other sites are repeatedly executing the
CS. That is, every requesting site should get an opportunity to execute the CS in finite
time.
Fairness:
Fairness in the context of mutual exclusion means that each process gets a fair
chance to execute the CS. In mutual exclusion algorithms, the fairness property
generally means that the CS execution requests are executed in order of their arrival in
the system.
LAMPORT’S ALGORITHM
Lamport's Distributed Mutual Exclusion Algorithm is a permission-based algorithm
proposed by Lamport as an illustration of his synchronization scheme for distributed
systems.
In permission-based algorithms, a timestamp is used to order critical section requests
and to resolve any conflict between requests.
In Lamport's algorithm, critical section requests are executed in increasing order of
timestamps, i.e., a request with a smaller timestamp is given permission to execute the
critical section before a request with a larger timestamp.
Three types of messages (REQUEST, REPLY, and RELEASE) are used, and
communication channels are assumed to follow FIFO order.
A site sends a REQUEST message to all other sites to get their permission to enter the
critical section.
A site sends a REPLY message to the requesting site to give its permission to enter the
critical section.
A site sends a RELEASE message to all other sites upon exiting the critical section.
Every site Si, keeps a queue to store critical section requests ordered by their
timestamps.
request_queuei denotes the queue of site Si.
A timestamp is given to each critical section request using Lamport’s logical clock.
The timestamp is used to determine the priority of critical section requests: a smaller
timestamp gets higher priority than a larger one. Critical section requests are executed
always in the order of their timestamps.
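The following Python sketch shows one site's bookkeeping for Lamport's algorithm. It is a
minimal illustration, not the textbook's code: the send(message, site) transport callback,
the on_message event handler, and all names here are assumptions, and a real system would
need an actual FIFO network layer underneath.

import heapq

class LamportMutex:
    # One site Si; send(msg, j) is an assumed FIFO transport primitive.
    def __init__(self, site_id, all_sites, send):
        self.id, self.sites, self.send = site_id, all_sites, send
        self.clock = 0
        self.queue = []                  # request_queue_i: (timestamp, site) min-heap
        self.last_seen = {j: 0 for j in all_sites}  # latest timestamp from each site

    def request_cs(self):
        self.clock += 1
        self.my_request = (self.clock, self.id)
        heapq.heappush(self.queue, self.my_request)
        for j in self.sites:
            if j != self.id:
                self.send(('REQUEST', self.my_request), j)

    def on_message(self, kind, ts_pair, sender):
        self.clock = max(self.clock, ts_pair[0]) + 1    # Lamport clock update
        self.last_seen[sender] = ts_pair[0]
        if kind == 'REQUEST':
            heapq.heappush(self.queue, ts_pair)
            self.clock += 1
            self.send(('REPLY', (self.clock, self.id)), sender)
        elif kind == 'RELEASE':
            self.queue = [r for r in self.queue if r != ts_pair]
            heapq.heapify(self.queue)

    def can_enter_cs(self):
        # L1: a message with a larger timestamp has arrived from every other site
        # (with FIFO channels, no earlier request can still be in flight);
        # L2: own request is at the top of the request queue.
        l1 = all(self.last_seen[j] > self.my_request[0]
                 for j in self.sites if j != self.id)
        return l1 and bool(self.queue) and self.queue[0] == self.my_request

    def release_cs(self):
        self.queue = [r for r in self.queue if r != self.my_request]
        heapq.heapify(self.queue)
        for j in self.sites:
            if j != self.id:
                self.send(('RELEASE', self.my_request), j)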
Correctness
Theorem: Lamport’s algorithm achieves mutual exclusion.
Proof: Proof is by contradiction.
Suppose two sites Si and Sj are executing the CS concurrently. For this to happen,
conditions L1 and L2 must hold at both sites concurrently (L1: the site has received a
message with a timestamp larger than its own request's timestamp from every other site;
L2: the site's own request is at the top of its request queue).
This implies that at some instant in time, say t, both Si and Sj have their own requests
at the top of their request queues and condition L1 holds at them. Without loss of
generality, assume that Si's request has a smaller timestamp than the request of Sj.
From condition L1 and the FIFO property of the communication channels, it is clear that at
instant t the request of Si must be present in request_queuej when Sj was executing its
CS. This implies that Sj's own request is at the top of its own request queue when a
smaller timestamp request, Si's request, is present in request_queuej – a
contradiction!
Message Complexity:
Lamport’s Algorithm requires invocation of 3(N – 1) messages per critical section execution.
These 3(N – 1) messages involve:
(N – 1) request messages
(N – 1) reply messages
(N – 1) release messages
Performance:
Synchronization delay is equal to maximum message transmission time. It requires 3(N – 1)
messages per CS execution. Algorithm can be optimized to 2(N – 1) messages by omitting
the REPLY message in some situations.
RICART–AGRAWALA ALGORITHM
The Ricart–Agrawala algorithm is an optimization of Lamport's algorithm that dispenses
with RELEASE messages by cleverly merging them with REPLY messages: a site defers its
REPLY to lower-priority requests and sends the deferred REPLYs when it exits the CS.
Message Complexity:
Ricart–Agrawala algorithm requires invocation of 2(N – 1) messages per critical section
execution. These 2(N – 1) messages involve:
(N – 1) request messages
(N – 1) reply messages
Performance:
Synchronization delay is equal to the maximum message transmission time. It requires
2(N – 1) messages per critical section execution.
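A minimal sketch of the rule that lets Ricart–Agrawala drop RELEASE messages: REPLYs to
lower-priority requests are deferred and sent on exit from the CS. The send primitive and
the class layout are assumptions made for illustration only.

class RicartAgrawala:
    # One site Si; send(msg, j) is an assumed transport primitive.
    def __init__(self, site_id, all_sites, send):
        self.id, self.sites, self.send = site_id, all_sites, send
        self.clock, self.requesting, self.my_ts = 0, False, None
        self.replies = set()
        self.deferred = []               # sites whose REPLY we are withholding

    def request_cs(self):
        self.clock += 1
        self.requesting, self.my_ts, self.replies = True, self.clock, set()
        for j in self.sites:
            if j != self.id:
                self.send(('REQUEST', self.my_ts, self.id), j)

    def on_request(self, ts, j):
        self.clock = max(self.clock, ts) + 1
        # Defer the REPLY if our own pending request has priority
        # (smaller timestamp; site id breaks ties).
        if self.requesting and (self.my_ts, self.id) < (ts, j):
            self.deferred.append(j)
        else:
            self.send(('REPLY', self.id), j)

    def on_reply(self, j):
        self.replies.add(j)              # enter the CS once all N-1 replies are in

    def release_cs(self):
        self.requesting = False
        for j in self.deferred:          # deferred REPLYs double as releases
            self.send(('REPLY', self.id), j)
        self.deferred = []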
MAEKAWA's ALGORITHM
Maekawa's algorithm is a quorum-based approach to ensuring mutual exclusion in
distributed systems: instead of asking permission from every other site, a site Si asks
only the sites in its request set (quorum) Ri.
Maekawa used the theory of projective planes and showed that N = K(K − 1) + 1, where
K = |Ri| is the size of each request set. This relation gives |Ri| ≈ √N.
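A small illustrative computation: solving N = K(K − 1) + 1 for the request-set size K
gives K = (1 + √(4N − 3))/2, which is approximately √N.

import math

def request_set_size(n):
    # Solve N = K(K - 1) + 1 for K = |Ri|.
    return (1 + math.sqrt(4 * n - 3)) / 2

for n in (7, 13, 21):                    # sizes for which projective planes exist
    print(n, request_set_size(n))        # 7 -> 3.0, 13 -> 4.0, 21 -> 5.0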
Correctness
Theorem: Maekawa’s algorithm achieves mutual exclusion.
Proof: Proof is by contradiction.
Suppose two sites Si and Sj are concurrently executing the CS. Then both must have
obtained permission from every site in their respective request sets. Since any two request
sets have a common site (Ri ∩ Rj ≠ ∅), that common site must have granted its permission
to both requests at the same time, which is impossible because a site locks itself to one
request at a time – a contradiction.
Message Complexity:
Maekawa's algorithm requires invocation of 3√N messages per critical section execution, as
the size of a request set is √N. These 3√N messages involve:
√N request messages
√N reply messages
√N release messages
Performance:
Synchronization delay is equal to twice the message propagation delay time. It requires
3√N messages per critical section execution.
SUZUKI–KASAMI's BROADCAST ALGORITHM
Correctness
Mutual exclusion is guaranteed because there is only one token in the system and a site holds
the token during the CS execution.
Theorem: A requesting site enters the CS in finite time.
Proof: Token request messages of a site Si reach other sites in finite time.
Since one of these sites will have the token in finite time, site Si's request will be placed
in the token queue in finite time.
Since there can be at most N − 1 requests in front of this request in the token queue, site Si
will get the token and execute the CS in finite time.
Message Complexity:
The algorithm requires no message invocation if the site already holds the idle token at the
time of its critical section request, or a maximum of N messages per critical section
execution otherwise. These N messages involve:
(N – 1) request messages
1 reply message
Performance:
Synchronization delay is 0 and no message is needed if the site holds the idle token at the
time of its request. If the site does not hold the idle token, the maximum synchronization
delay is equal to the maximum message transmission time and a maximum of N messages are
required per critical section invocation.
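A sketch of the bookkeeping usually described for Suzuki–Kasami: RN records the highest
request number seen per site, and the token carries an LN array plus a queue. The send
callback and the class layout are assumptions; this is an illustration, not a full
implementation.

class SuzukiKasami:
    def __init__(self, sid, n, send):
        self.id, self.n, self.send = sid, n, send
        self.RN = [0] * n                # highest request sequence number per site
        self.token = None                # {'LN': [...], 'queue': [...]} when held

    def request_cs(self):
        if self.token:                   # idle token already here: zero messages
            return
        self.RN[self.id] += 1
        for j in range(self.n):          # broadcast REQUEST(site, sequence number)
            if j != self.id:
                self.send(('REQUEST', self.id, self.RN[self.id]), j)

    def on_request(self, j, sn):
        self.RN[j] = max(self.RN[j], sn)  # sequence numbers filter out old requests
        # Pass the token if it is idle here and Sj's request is outstanding.
        if self.token and self.RN[j] == self.token['LN'][j] + 1:
            tok, self.token = self.token, None
            self.send(('TOKEN', tok), j)

    def release_cs(self):
        tok = self.token
        tok['LN'][self.id] = self.RN[self.id]      # own request is now satisfied
        for j in range(self.n):                    # queue every outstanding request
            if j not in tok['queue'] and self.RN[j] == tok['LN'][j] + 1:
                tok['queue'].append(j)
        if tok['queue']:
            nxt = tok['queue'].pop(0)
            self.token = None
            self.send(('TOKEN', tok), nxt)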
Distributed approach:
In the distributed approach, different nodes work together to detect deadlocks. There is
no single point of failure, as the workload is equally divided among all nodes.
The speed of deadlock detection also increases.
Hierarchical approach:
This is the most advantageous approach, as it combines the centralized and
distributed approaches to deadlock detection in a distributed system.
In this approach, some selected nodes or cluster of nodes are responsible for deadlock
detection and these selected nodes are controlled by a single node.
System Model
A distributed program is composed of a set of n asynchronous processes p1, p2, . . .
, pi, . . . , pn that communicate by message passing over the communication
network.
Without loss of generality we assume that each process is running on a different
processor.
The processors do not share a common global memory and communicate solely by
passing messages over the communication network.
There is no physical global clock in the system to which processes have
instantaneous access.
The communication medium may deliver messages out of order, messages may be
lost, garbled, or duplicated due to timeout and retransmission, processors may fail,
and communication links may go down.
We make the following assumptions:
The systems have only reusable resources.
Processes are allowed to make only exclusive access to resources.
There is only one copy of each resource.
A process can be in two states: running or blocked.
In the running state (also called active state), a process has all the needed
resources and is either executing or is ready for execution.
In the blocked state, a process is waiting to acquire some resource.
Wait-for graph (WFG)
This is used for deadlock detection. A graph is drawn based on the request and
acquirement of resources. If the graph contains a closed loop or a cycle, then there is a
deadlock.
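A minimal sketch of this check: with the WFG stored as an adjacency map, a back edge met
during depth-first search reveals a cycle (and hence a deadlock under the AND model
discussed below). The representation is an assumption made for illustration.

def has_cycle(wfg):
    # wfg: dict mapping each process to the set of processes it waits for.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {p: WHITE for p in wfg}

    def visit(p):
        color[p] = GRAY                          # p is on the current DFS path
        for q in wfg.get(p, ()):
            if color.get(q, WHITE) == GRAY:      # back edge: a closed loop exists
                return True
            if color.get(q, WHITE) == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color[p] == WHITE and visit(p) for p in wfg)

# P1 waits for P2, P2 for P3, P3 for P1: a closed loop, hence a deadlock.
print(has_cycle({'P1': {'P2'}, 'P2': {'P3'}, 'P3': {'P1'}}))   # True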
MODELS OF DEADLOCKS
The models of deadlocks are explained based on their hierarchy. The diagrams illustrate the
working of the deadlock models. Pa, Pb, Pc, Pd are passive processes that have already
acquired the resources. Pe is an active process that is requesting the resource.
AND Model
In the AND model, a passive process becomes active (i.e., its activation condition is
fulfilled) only after a message from each process in its dependent set has arrived.
In the AND model, a process can request more than one resource simultaneously and the
request is satisfied only after all the requested resources are granted to the process.
The requested resources may exist at different locations.
The out-degree of a node in the WFG for the AND model can be more than 1.
The presence of a cycle in the WFG indicates a deadlock in the AND model.
Each node of the WFG in such a model is called an AND node.
OR Model
A process can make a request for numerous resources simultaneously and the request
is satisfied if any one of the requested resources is granted.
Presence of a cycle in the WFG of an OR model does not imply a deadlock
in the OR model.
In the OR model, the presence of a knot indicates a deadlock.
With every blocked process, there is an associated set of processes called dependent
set.
A process shall move from an idle to an active state on receiving a grant message
from any of the processes in its dependent set.
A process is permanently blocked if it never receives a grant message from any of the
processes in its dependent set.
A set of processes S is deadlocked if all the processes in S are permanently blocked.
In short, a process is deadlocked or permanently blocked if the following conditions
are met:
1. Each of the processes in the set S is blocked.
2. The dependent set for each process in S is a subset of S.
3. No grant message is in transit between any two processes in set S.
A blocked process P in the set S becomes active only after receiving a grant message
from a process in its dependent set, which is a subset of S.
Note that AND requests for p resources can be stated as p-out-of-p requests, and OR
requests for p resources can be stated as 1-out-of-p requests.
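To contrast cycle and knot, here is a small sketch (the helper names are hypothetical):
a process is in a knot only if everything it can reach can reach it back, so a single
escape edge out of a cycle removes the OR-model deadlock.

def reachable(wfg, v):
    seen, stack = set(), [v]
    while stack:
        for q in wfg.get(stack.pop(), ()):
            if q not in seen:
                seen.add(q); stack.append(q)
    return seen

def in_knot(wfg, v):
    # v is in a knot iff v reaches someone and every process reachable from v
    # can reach v back (no escape toward a process that might send a grant).
    r = reachable(wfg, v)
    return bool(r) and all(v in reachable(wfg, u) for u in r)

# Cycle P1 -> P2 -> P3 -> P1 with an escape edge P2 -> P4 (P4 is not blocked):
wfg = {'P1': {'P2'}, 'P2': {'P3', 'P4'}, 'P3': {'P1'}, 'P4': set()}
print(in_knot(wfg, 'P1'))   # False: a cycle but no knot, so no OR-model deadlock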
Unrestricted model
No assumptions are made regarding the underlying structure of resource requests.
In this model, only one assumption that the deadlock is stable is made and hence it is
the most general model.
This model helps separate concerns: Concerns about properties of the problem (stability
and deadlock) are separated from underlying distributed systems computations (e.g.,
message passing versus synchronous communication).
Therefore, distributed deadlocks can be detected by taking a snapshot of the system and
examining it for the condition of a deadlock
MITCHELL–MERRITT ALGORITHM FOR THE SINGLE-RESOURCE MODEL
Features:
1. Only one process in a cycle detects the deadlock. This simplifies deadlock
resolution – this process can abort itself to resolve the deadlock. This algorithm can
be improved by including priorities, so that the lowest-priority process in a cycle
detects the deadlock and aborts.
2. In this algorithm, a process that is detected in deadlock is aborted spontaneously, even
though under this assumption phantom deadlocks cannot be excluded. It can be
shown, however, that only genuine deadlocks will be detected in the absence of
spontaneous aborts.
Each node of the WFG has two local variables, called labels:
1. a private label, which is unique to the node at all times, though it is not constant.
2. a public label, which can be read by other processes and which may not be unique.
Each process is represented as u/v, where u and v are the public and private labels,
respectively. Initially, private and public labels are equal for each process. A global WFG
is maintained and it defines the entire state of the system.
The algorithm is defined by the four state transitions shown in Fig. 3.10, where z =
inc(u, v), and inc(u, v) yields a unique label greater than both u and v; labels that are
not shown do not change.
The transitions defined by the algorithm are block, activate, transmit, and
detect.
Block creates an edge in the WFG.
Two messages are needed, one resource request and onemessage back to the blocked
process to inform it of thepublic label of the process it is waiting for.
Activate denotes that a process has acquired the resource from the process it was
waiting for.
Transmit propagates larger labels in the opposite direction of the edges of the WFG.
Detect means that the probe with the private label of some process has returned to it,
indicating a deadlock.
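A toy sketch of these transitions; a global counter stands in for a distributed
unique-label generator, which is an assumption of this sketch, and activate is omitted
since it would simply remove the wait edge.

class Node:
    def __init__(self, label):
        self.public = self.private = label   # initially equal and unique per process

counter = 100                                # stand-in for a unique-label source

def inc(u, v):
    global counter
    counter = max(counter, u, v) + 1         # unique and greater than both labels
    return counter

def block(p, q):
    # Block: p starts waiting for q and takes a fresh label larger than both publics.
    p.public = p.private = inc(p.public, q.public)

def transmit(p, q):
    # Transmit: p, waiting for q, copies q's public label if larger, so the largest
    # label in a cycle propagates opposite to the direction of the WFG edges.
    if q.public > p.public:
        p.public = q.public

def detect(p, q):
    # Detect: p, waiting for q, sees its own private label on q - the label has
    # travelled all the way around, so p is on a cycle.
    return q.public == p.private

a, b, c = Node(1), Node(2), Node(3)
block(a, b); block(b, c); block(c, a)        # wait cycle a -> b -> c -> a
transmit(b, c); transmit(a, b)               # c blocked last; its label flows back
print(detect(c, a))                          # True: c alone detects the deadlock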
This algorithm can easily be extended to include priorities, so that whenever a
deadlock occurs, the lowest priority process gets aborted.
This priority-based algorithm has two phases.
1. The first phase is almost identical to the algorithm.
2. In the second phase, the smallest priority is propagated around the cycle. The
propagation stops when one process recognizes the propagated priority as its
own.
Message Complexity:
If we assume that a deadlock persists long enough to be detected, the worst-case complexity
of the algorithm is s(s - 1)/2 Transmit steps, where s is the number of processes in the cycle.
Fig 3.11: Chandy–Misra–Haas algorithm for the AND model
Performance analysis
In the algorithm, one probe message is sent on every edge of the WFG which
connects processes on two sites.
The algorithm exchanges at most m(n − 1)/2 messages to detect a deadlock that
involves m processes and spans over n sites.
The size of messages is fixed and is very small (only three integer words).
The delay in detecting a deadlock is O(n).
Advantages:
It is easy to implement.
Each probe message is of fixed length.
There is very little computation.
There is very little overhead.
There is no need to construct a graph, nor to pass graph information to other sites.
This algorithm does not find false (phantom) deadlocks.
There is no need for special data structures.
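A sketch of the probe mechanism of Fig. 3.11, collapsed into a single program for
illustration: each send_probe call stands for a probe(i, j, k) message crossing sites,
and the names and data layout are assumptions rather than the algorithm's official API.

from collections import defaultdict

class Process:
    def __init__(self, pid, waits_for):
        self.pid, self.waits_for = pid, waits_for  # AND-model dependent set
        self.dependent = defaultdict(bool)         # dependent[i]: probe i seen here

procs = {}

def send_probe(i, j, k):
    # Probe (initiator i, sender j, receiver k) travels along a WFG edge.
    pk = procs[k]
    if k == i:
        return True                                # probe returned: deadlock
    if not pk.waits_for or pk.dependent[i]:
        return False                               # not blocked, or already visited
    pk.dependent[i] = True
    return any(send_probe(i, k, m) for m in pk.waits_for)

def detect_deadlock(i):
    return any(send_probe(i, i, j) for j in procs[i].waits_for)

for pid, wf in {1: {2}, 2: {3}, 3: {1}}.items():   # P1 -> P2 -> P3 -> P1
    procs[pid] = Process(pid, wf)
print(detect_deadlock(1))                          # True: P1's probe returns to it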
Performance analysis (Chandy–Misra–Haas algorithm for the OR model)
For every deadlock detection, the algorithm exchanges e query messages and e reply
messages, where e = n(n − 1) is the number of edges.
UNIT IV
CONSENSUS AND RECOVERY
Consensus and agreement algorithms: Problem definition – Overview of results – Agreement in
a failure-free system (synchronous and asynchronous) – Agreement in synchronous systems
with failures. Check pointing and rollback recovery: Introduction – Background and definitions
– Issues in failure recovery – Checkpoint-based recovery – Coordinated check pointing algorithm
– Algorithm for asynchronous check pointing and recovery.
Agreement: All non-faulty processes must agree on the same (single) value.
Validity: If all the non-faulty processes have the same initial value, then the agreed upon value
by all the non-faulty processes must be that same value.
The overhead bounds are for the given algorithms, and not necessarily tight bounds for the
problem.
Validity: If the source process is non-faulty, then the agreed upon value by all the non- faulty
processes must be the same as the initial value of the source.
Each phase has a unique "phase king" derived, say, from PID. Each phase has two rounds:
1. In the 1st round, each process sends its estimate to all other processes.
2. In the 2nd round, the "phase king" process arrives at an estimate based on the values it
received in the 1st round, and broadcasts its new estimate to all others.
(f + 1) phases, (f + 1)[(n − 1)(n + 1)] messages, and can tolerate up to f < ⌈n/4⌉ malicious processes.
Correctness Argument
Among the f + 1 phases, at least one phase k has a non-malicious phase king Pk.
In phase k, all non-malicious processes Pi and Pj will have the same estimate of the
consensus value as Pk does, in each of three cases:
1. Pi and Pj both use their own majority values (each has mult > n/2 + f).
2. Pi uses its majority value; Pj uses the phase king's tie-breaker value (Pi's mult > n/2 + f,
Pj's mult > n/2 for the same value).
3. Pi and Pj both use the phase king's tie-breaker value (in the phase in which Pk is
non-malicious, it sends the same value to Pi and Pj).
In all 3 cases, Pi and Pj end up with the same value as their estimate.
If all non-malicious processes have the value x at the start of a phase, they will continue
to have x as the consensus value at the end of the phase.
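A single-machine simulation sketch of these rounds with no faults injected (so every
process tallies the same multiset of estimates); it shows the mult > n/2 + f acceptance
rule and the king's tie-breaker. The function name and layout are illustrative assumptions.

from collections import Counter

def phase_king(values, f):
    n = len(values)
    assert n > 4 * f, "phase king needs f < n/4"
    est = list(values)                   # est[i] is Pi's current estimate
    for phase in range(f + 1):           # f+1 phases: some king is non-malicious
        # Round 1: everyone broadcasts; each process tallies what it received.
        majority, mult = [], []
        for i in range(n):
            val, cnt = Counter(est).most_common(1)[0]
            majority.append(val); mult.append(cnt)
        # Round 2: the phase king broadcasts its majority as the tie-breaker.
        tiebreak = majority[phase]
        for i in range(n):
            # Trust own majority only when its multiplicity exceeds n/2 + f.
            est[i] = majority[i] if mult[i] > n / 2 + f else tiebreak
    return est

print(phase_king([0, 1, 1, 0, 1], f=1))  # all processes end with the same value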
Agreement: All non-faulty processes must make a decision and the values decided upon by any
two non-faulty processes must be within range of each other.
Validity: If a non-faulty process Pi decides on some value vi , then that value must be within the
range of values initially proposed by the processes.
Termination: Each non-faulty process must eventually decide on a value. The algorithm for the
message-passing model assumes n ≥ 5f + 1, although the problem is solvable for n > 3f + 1.
Not possible to go from bivalent to univalent state if even a single failure is allowed. Difficulty is
not being able to read & write a variable atomically.
Weakening the consensus problem, e.g., k-set consensus, approximate consensus, and
renaming using atomic registers.
Using memory that is stronger than atomic Read/Write memory to design wait-free
consensus algorithms. Such a memory would need corresponding access primitives.
Are there objects (with supporting operations) using which there is a wait-free (i.e., (n − 1)-crash
resilient) algorithm for reaching consensus in an n-process system? Yes, e.g., Test&Set, Swap,
Compare&Swap. The crash failure model requires the solutions to be wait-free.
An object is defined to be universal if that object along with read/write registers can simulate
any other object in a wait-free manner. In any system containing up to k processes, an object X
such that CN(X) = k is universal.
For any system with up to k processes, the universality of objects X with consensus number k is
shown by giving a universal algorithm to wait-free simulate any object using objects of type X
and read/write registers.
1. First, a universal algorithm is given that wait-free simulates any object using
arbitrary k-process consensus objects and read/write registers.
2. Then, the arbitrary k-process consensus objects are simulated with objects of type X,
having consensus number k. This trivially follows after the first step.
A nonblocking operation, in the context of shared memory operations, is an operation that may
not complete itself but is guaranteed to complete at least one of the pending operations in a
finite number of steps.
The linked list stores the linearized sequence of operations and states following each operation.
Operations to the arbitrary object Z are simulated in a nonblocking way using an arbitrary
consensus object (the field op.next in each record) which is accessed via the Decide call.
Each process attempts to thread its own operation next into the linked list.
A single pointer/counter cannot be used instead of the array Head, because reading and
updating the pointer cannot be done atomically in a wait-free manner.
Some protocols assume that the communication uses first-in-first-out (FIFO) order, while
other protocols assume that the communication subsystem can lose, duplicate, or reorder
messages.
Rollback-recovery protocols therefore must maintain information about the internal
interactions among processes and also the external interactions with the outside world.
A local checkpoint
All processes save their local states at certain instants of time.
A local checkpoint is a snapshot of the state of the process at a given instant.
Assumption
– A process stores all local checkpoints on the stable storage
– A process is able to roll back to any of its existing local checkpoints
1. In-transit messages
messages that have been sent but not yet received
2. Lost messages
messages whose "send" is done but "receive" is undone due to rollback
3. Delayed messages
messages whose "receive" is not recorded because the receiving process was
either down or the message arrived after rollback
4. Orphan messages
messages with "receive" recorded but message "send" not recorded
do not arise if processes roll back to a consistent global state
5. Duplicate messages
arise due to message logging and replaying during process recovery
In-transit messages
In the figure, the global state {C1,8, C2,9, C3,8, C4,8} shows that message m1 has been sent but
not yet received. We call such a message an in-transit message. Message m2 is also an in-transit
message.
Delayed messages
Messages whose receive is not recorded because the receiving process was either down or the
message arrived after the rollback of the receiving process are called delayed messages. For
example, messages m2 and m5 in the figure are delayed messages.
Lost messages
Messages whose send is not undone but receive is undone due to rollback are called lost messages.
This type of message occurs when the process rolls back to a checkpoint prior to reception of the
message while the sender does not roll back beyond the send operation of the message. In the
figure, message m1 is a lost message.
Duplicate messages
Duplicate messages arise due to message logging and replaying during process
recovery. For example, in the figure, message m4 was sent and received before the
rollback. However, due to the rollback of process P4 to C4,8 and process P3 to C3,8,
both send and receipt of message m4 are undone.
The computation comprises three processes Pi, Pj, and Pk, connected through a communication
network. The processes communicate solely by exchanging messages over fault-free, FIFO
communication channels.
Checkpoint-based recovery
Checkpoint-based rollback-recovery techniques can be classified into three categories:
1. Uncoordinated checkpointing
2. Coordinated checkpointing
3. Communication-induced checkpointing
1. Uncoordinated Checkpointing
Each process has autonomy in deciding when to take checkpoints
Advantages
The lower runtime overhead during normal execution
Disadvantages
1. Domino effect during a recovery
2. Recovery from a failure is slow because processes need to iterate to find a
consistent set of checkpoints
3. Each process maintains multiple checkpoints and periodically invokes a
garbage collection algorithm
4. Not suitable for applications with frequent output commits
The processes record the dependencies among their checkpoints caused by message
exchange during failure-free operation
The following direct dependency tracking technique is commonly used in uncoordinated
checkpointing.
Direct dependency tracking technique
Assume each process Pi starts its execution with an initial checkpoint Ci,0
Ii,x : checkpoint interval, the interval between Ci,x−1 and Ci,x
When Pj receives a message m during Ij,y that was sent by Pi in interval Ii,x, it records
the dependency from Ii,x to Ij,y, which is later saved onto stable storage when Pj takes Cj,y
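A sketch of this tracking, with the sender's current interval piggybacked on each message;
the class and method names are hypothetical.

class Proc:
    def __init__(self, pid):
        self.pid = pid
        self.interval = 1            # x in I_{i,x}, the interval after C_{i,x-1}
        self.deps = set()            # dependencies recorded since the last checkpoint

    def send(self, dest, payload):
        dest.receive(self.pid, self.interval, payload)   # piggyback (i, x)

    def receive(self, sender, sender_interval, payload):
        # m sent in I_{i,x}, received in I_{j,y}: record I_{i,x} -> I_{j,y}.
        self.deps.add(((sender, sender_interval), (self.pid, self.interval)))

    def take_checkpoint(self):
        saved, self.deps = self.deps, set()  # written to stable storage with C_{j,y}
        self.interval += 1
        return saved

pi, pj = Proc('i'), Proc('j')
pi.send(pj, 'm')
print(pj.take_checkpoint())      # {(('i', 1), ('j', 1))}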
When a failure occurs, the recovering process broadcasts a dependency-request message
to collect the dependency information maintained by every process. When a process receives
this message, it stops its execution and replies with the dependency information saved on the
stable storage as well as with the dependency information, if any, which is associated with
its current state.
The initiator then calculates the recovery line based on the global dependency information
and broadcasts a rollback request message containing the recovery line.
Upon receiving this message, a process whose current state belongs to the recovery line
simply resumes execution; otherwise, it rolls back to an earlier checkpoint as indicated by
the recovery line.
2. Coordinated Checkpointing
In coordinated checkpointing, processes orchestrate their checkpointing activities so that all
local checkpoints form a consistent global state
Types
1. Blocking Checkpointing: After a process takes a local checkpoint, to prevent orphan
messages, it remains blocked until the entire checkpointing activity is complete
Disadvantages: The computation is blocked during the checkpointing
2. Non-blocking Checkpointing: The processes need not stop their execution while taking
checkpoints. A fundamental problem in coordinated checkpointing is to prevent a process
from receiving application messages that could make the checkpoint inconsistent.
Algorithm
The algorithm consists of two phases. During the first phase, the checkpoint initiator
identifies all processes with which it has communicated since the last checkpoint and sends
them a request.
Upon receiving the request, each process in turn identifies all processes it has
communicated with since the last checkpoint and sends them a request, and so on, until
no more processes can be identified.
During the second phase, all processes identified in the first phase take a checkpoint. The
result is a consistent checkpoint that involves only the participating processes.
In this protocol, after a process takes a checkpoint, it cannot send any message until the
second phase terminates successfully, although receiving a message after the checkpoint
has been taken is allowable.
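The first phase amounts to a transitive closure over "has communicated since the last
checkpoint"; here is a sketch under that reading, with a hypothetical input format.

from collections import deque

def participants(initiator, talked_to):
    # talked_to[p]: processes p has communicated with since its last checkpoint.
    need, frontier = {initiator}, deque([initiator])
    while frontier:                      # spread requests until no new process appears
        p = frontier.popleft()
        for q in talked_to.get(p, ()):
            if q not in need:
                need.add(q); frontier.append(q)
    return need                          # second phase: exactly these checkpoint

print(participants('P1', {'P1': {'P2'}, 'P2': {'P3'}, 'P4': {'P5'}}))
# {'P1', 'P2', 'P3'} -- P4 and P5 are left undisturbed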
3. Communication-induced Checkpointing
Communication-induced checkpointing is another way to avoid the domino effect, while allowing
processes to take some of their checkpoints independently. Processes may be forced to take
additional checkpoints
Two types of checkpoints
1. Autonomous checkpoints
2. Forced checkpoints
The checkpoints that a process takes independently are called local checkpoints, while those that
a process is forced to take are called forced checkpoints.
Communication-induced checkpointing piggybacks protocol-related information on
each application message
The receiver of each application message uses the piggybacked information to determine
if it has to take a forced checkpoint to advance the global recovery line
The forced checkpoint must be taken before the application may process the contents of
the message
In contrast with coordinated checkpointing, no special coordination messages are
exchanged
Two types of communication-induced checkpointing
1. Model-based checkpointing
2. Index-based checkpointing.
Model-based checkpointing
Model-based checkpointing prevents patterns of communications and checkpoints
that could result in inconsistent states among the existing checkpoints.
No control messages are exchanged among the processes during normal operation. All
information necessary to execute the protocol is piggybacked on application messages
Suppose a set of processes crashes. A process p outside this set becomes an orphan when p
itself does not fail and p's state depends on the execution of a nondeterministic event e
whose determinant cannot be recovered from the stable storage or from the volatile memory
of a surviving process. Formally, the always-no-orphans condition can be stated as:
∀e: ¬Stable(e) ⇒ Depend(e) ⊆ Log(e)
Types
1. Pessimistic Logging
Pessimistic logging protocols assume that a failure can occur after any non-deterministic
event in the computation. However, in reality failures are rare
Pessimistic protocols implement the following property, often referred to as synchronous
logging, which is stronger than the always-no-orphans condition.
Synchronous logging
– ∀e: ¬Stable(e) ⇒ |Depend(e)| = 0
That is, if an event has not been logged on the stable storage, then no process can depend
on it.
Example:
Suppose processes 𝑃1 and 𝑃2 fail as shown, restart from checkpoints B and C, and roll
forward using their determinant logs to deliver again the same sequence of messages as in
the pre-failure execution
Once the recovery is complete, both processes will be consistent with the state of 𝑃0
that includes the receipt of message 𝑚7 from 𝑃1
2. Optimistic Logging
• Consider the example shown in the figure. Suppose process P2 fails before the determinant for
m5 is logged to the stable storage. Process P1 then becomes an orphan process and must
roll back to undo the effects of receiving the orphan message m6. The rollback of P1
further forces P0 to roll back to undo the effects of receiving message m7.
• Advantage: better performance in failure-free execution
• Disadvantages:
• coordination required on output commit
• more complex garbage collection
• Since determinants are logged asynchronously, output commit in optimistic logging
protocols requires a guarantee that no failure scenario can revoke the output. For example,
if process P0 needs to commit output at state X, it must log messages m4 and m7 to the
stable storage and ask P2 to log m2 and m5. In this case, if any process fails, the
computation can be reconstructed up to state X.
3. Causal Logging
• Combines the advantages of both pessimistic and optimistic logging at the expense of a more
complex recovery protocol
• Like optimistic logging, it does not require synchronous access to the stable storage except
during output commit
• Like pessimistic logging, it allows each process to commit output independently and never
creates orphans, thus isolating processes from the effects of failures at other processes
• Make sure that the always-no-orphans property holds
• Each process maintains information about all the events that have causally affected its state
• Consider the example in the figure. Messages m5 and m6 are likely to be lost on the failures
of P1 and P2 at the indicated instants.
• Process P0 at state X will have logged the determinants of the nondeterministic events that
causally precede its state according to Lamport's happened-before relation.
• These events consist of the delivery of messages m0, m1, m2, m3, and m4.
• The determinant of each of these non-deterministic events is either logged on the stable
storage or is available in the volatile log of process P0.
• The determinant of each of these events contains the order in which its original receiver
delivered the corresponding message.
• The message sender, as in sender-based message logging, logs the message content. Thus,
process P0 will be able to “guide” the recovery of P1 and P2 since it knows the order in
which P1 should replay messages m1 and m3 to reach the state from which P1 sent message
m4.
• Similarly, P0 has the order in which P2 should replay message m2 to be consistent with
both P0 and P1.
• The content of these messages is obtained from the sender log of P0 or regenerated
deterministically during the recovery of P1 and P2.
• Note that information about messages m5 and m6 is lost due to failures. These messages
may be resent after recovery possibly in a different order.
• However, since they did not causally affect the surviving process or the outside world, the
recovered state remains consistent.
COORDINATED CHECKPOINTING ALGORITHM (KOO–TOUEG)
First Phase
1. An initiating process Pi takes a tentative checkpoint and requests all other processes to take
tentative checkpoints. Each process informs Pi whether it succeeded in taking a tentative
checkpoint.
2. A process says “no” to a request if it fails to take a tentative checkpoint
3. If Pi learns that all the processes have successfully taken tentative checkpoints, Pi decides
that all tentative checkpoints should be made permanent; otherwise, Pi decides that all the
tentative checkpoints should be thrown away.
Second Phase
1. Pi informs all the processes of the decision it reached at the end of the first phase.
2. A process, on receiving the message from Pi will act accordingly.
3. Either all or none of the processes advance the checkpoint by taking permanent
checkpoints.
4. The algorithm requires that after a process has taken a tentative checkpoint, it cannot
send messages related to the basic computation until it is informed of Pi’s decision.
Correctness: for two reasons
i. Either all or none of the processes take a permanent checkpoint
ii. No process sends a message after taking a permanent checkpoint
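A sketch of this two-phase structure with stand-in methods for taking, committing, and
discarding tentative checkpoints (all names here are illustrative assumptions).

class Proc:
    def __init__(self, name, can_checkpoint=True):
        self.name, self.can_checkpoint = name, can_checkpoint
    def take_tentative(self):
        return self.can_checkpoint       # a "no" vote if the tentative checkpoint fails
    def make_permanent(self):
        print(self.name, 'checkpoint made permanent')
    def discard_tentative(self):
        print(self.name, 'tentative checkpoint discarded')

def coordinated_checkpoint(processes):
    # Phase 1: request tentative checkpoints and collect every vote.
    votes = [p.take_tentative() for p in processes]
    decision = all(votes)
    # Phase 2: broadcast the decision; all or none become permanent.
    for p in processes:
        p.make_permanent() if decision else p.discard_tentative()
    return decision

coordinated_checkpoint([Proc('Pi'), Proc('Pj', can_checkpoint=False)])
# one "no" vote makes every process discard its tentative checkpoint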
An Optimization
The above protocol may cause a process to take a checkpoint even when it is not necessary
for consistency. Since taking a checkpoint is an expensive operation, unnecessary
checkpoints should be avoided. In the event of a failure of process X, the above protocol
will require processes X, Y, and Z to restart from checkpoints x2, y2, and z2, respectively.
Process Z need not roll back because there has been no interaction between process Z and the
other two processes since the last checkpoint at Z.
ALGORITHM FOR ASYNCHRONOUS CHECKPOINTING AND RECOVERY (JUANG–VENKATESAN)
Basic idea
Since the algorithm is based on asynchronous checkpointing, the main issue in the
recovery is to find a consistent set of checkpoints to which the system can be restored.
The recovery algorithm achieves this by making each processor keep track of both the
number of messages it has sent to other processors as well as the number of messages it
has received from other processors.
Whenever a processor rolls back, it is necessary for all other processors to find out if any
message has become an orphan message. Orphan messages are discovered by comparing
the number of messages sent to and received from neighboring processors.
For example, if RCVDi←j(CkPti) > SENTj→i(CkPtj) (that is, the number of messages received
by processor pi from processor pj is greater than the number of messages sent by processor pj to
processor pi, according to the current states the processors), then one or more messages at
processor pj are orphan messages.
The Algorithm
When a processor restarts after a failure, it broadcasts a ROLLBACK message announcing that it has failed.
Procedure RollBack_Recovery
processor pi executes the following:
STEP (a)
if processor pi is recovering after a failure then
CkPti := latest event logged in the stable storage
else
CkPti := latest event that took place in pi {The latest event at pi can be either in stable or in
volatile storage.}
end if
STEP (b)
for k = 1 to N {N is the number of processors in the system} do
for each neighboring processor pj do
compute SENTi→j(CkPti)
send a ROLLBACK(i, SENTi→j(CkPti)) message to pj
end for
for every ROLLBACK(j, c) message received from a neighbor j do
if RCVDi←j(CkPti) > c {Implies the presence of orphan messages} then
find the latest event e such that RCVDi←j(e) = c {Such an event e may be in the volatile storage
or stable storage.}
CkPti := e
end if
end for
end for{for k}
D. An Example
Consider an example shown in Figure 2 consisting of three processors. Suppose processor Y
fails and restarts. If event ey2 is the latest checkpointed event at Y, then Y will restart from the
state corresponding to ey2.
Since RCVDZ←Y (CkPtZ) = 2 > 1, Z will set CkPtZ to ez1 satisfying RCVDZ←Y (ez1) = 1 ≤
1.
At Y, RCVDY←X(CkPtY) = 1 < 2 and RCVDY←Z(CkPtY) = 1 = SENTZ→Y(CkPtZ).
Y need not roll back further.
UNIT V
CLOUD COMPUTING
Definition of Cloud Computing – Characteristics of Cloud – Cloud Deployment Models –
Cloud Service Models – Driving Factors and Challenges of Cloud – Virtualization – Load
Balancing – Scalability and Elasticity – Replication – Monitoring – Cloud Services and
Platforms: Compute Services – Storage Services – Application Services
The term cloud refers to a network or the internet. It is a technology that uses remote
servers on the internet to store, manage, and access data online rather than local drives. The
data can be anything such as files, images, documents, audio, video, and more.
Cloud Computing is defined as storing and accessing of data and computing services
over the internet. It doesn’t store any data on your personal computer. It is the on-demand
availability of computer services like servers, data storage, networking, databases, etc. The
main purpose of cloud computing is to give access to data centers to many users. Users can
also access data from a remote server.
Cloud computing decreases the hardware and software demand from the user’s side.
The only thing the user must be able to run is the cloud computing system's interface software,
which can be as simple as a web browser, and the cloud network takes care of the rest. We all
have experienced cloud computing at some instant of time; some of the popular cloud services
we have used or are still using are mail services like Gmail, Hotmail, or Yahoo.
Characteristics of Cloud
3) High Scalability
The cloud offers "on-demand" provisioning of resources on a large scale, without the need
to engineer for peak loads.
4) Multi-Sharing
With the help of cloud computing, multiple users and applications can work more efficiently
with cost reductions by sharing common infrastructure.
5) Device and Location Independence
Cloud computing enables the users to access systems using a web browser regardless of their
location or what device they use e.g. PC, mobile phone, etc. As infrastructure is off-site
(typically provided by a third-party) and accessed via the Internet, users can connect from
anywhere.
6) Maintenance
Maintenance of cloud computing applications is easier, since they do not need to be installed
on each user's computer and can be accessed from different places. So, it reduces the cost also.
7) Low Cost
By using cloud computing, the cost will be reduced because to take the services of cloud
computing, IT company need not to set its own infrastructure and pay-as-per usage of
resources.
The cloud deployment model identifies the specific type of cloud environment based
on ownership, scale, access, and the cloud’s nature and purpose. There are various deployment
models are based on the location and who manages the infrastructure.
Public Cloud
The public cloud is available to the general public, and resources are shared between
all users. They are available to anyone, from anywhere, using the Internet. The public cloud
deployment model is one of the most popular types of cloud.
This computing model is hosted at the vendor’s data center. The public cloud model
makes the resources, such as storage and applications, available to the public over the
WWW. It serves all the requests; therefore, resources are almost infinite.
Highly available anytime and anywhere, with robust permission and authentication
mechanisms.
There is no need to maintain the cloud.
Does not have any limit on the number of users.
The cloud service providers fully subsidize the entire Infrastructure. Therefore, you
don’t need to set up any hardware.
Does not cost you any maintenance charges as the service provider does it.
It works on the Pay as You Go model, so you don’t have to pay for items you don’t
use.
There is no significant upfront fee, making it excellent for enterprises that require
immediate access to resources.
The private cloud deployment model is a dedicated environment for one user or customer.
You don’t share the hardware with any other users, as all the hardware is yours. It is a one-to-
one environment for single use, so there is no need to share your hardware with anyone else.
The main difference between the private and public cloud deployment models is how you
handle the hardware. The private cloud is also referred to as an "internal cloud," which refers
to the ability to access systems and services within an organization's boundary.
You have complete command over service integration, IT operations, policies, and user
behavior.
Companies can customize their solution according to market demands.
It offers exceptional reliability in performance.
A private cloud enables the company to tailor its solution to meet specific needs.
It provides higher control over system configuration according to the company’s
requirements.
Private cloud works with legacy systems that cannot access the public cloud.
This Cloud Computing Model is small, and therefore it is easy to manage.
It is suitable for storing corporate information that only permitted staff can access.
You can incorporate as many security services as possible to secure your cloud.
A hybrid cloud deployment model combines public and private clouds. Creating a
hybrid cloud computing model means that a company uses the public cloud but owns on-
premises systems and provides a connection between the two. They work as one system, which
is a beneficial model for a smooth transition into the public cloud over an extended period.
Some companies cannot operate solely in the public cloud because of security concerns
or data protection requirements. So, they may select the hybrid cloud to combine the
requirements with the benefits of a public cloud. It enables on-premises applications with
sensitive data to run alongside public cloud applications.
It is applicable only when a company has varied use or demand for managing the
workloads.
Managing a hybrid cloud is complex, so if you use a hybrid cloud, you may spend too
much.
Its security features are not as good as those of the private cloud.
Community Cloud
A community cloud is an infrastructure shared by several organizations that have common
concerns, such as security or compliance requirements.
Because of its restricted bandwidth and storage capacity, community resources often
pose challenges.
It is not a very popular and widely adopted cloud computing model.
Security and segmentation are challenging to maintain.
Multi-cloud Model
Multi-cloud computing refers to using public cloud services from many cloud service
providers. In a multi-cloud configuration, a company runs workloads on IaaS or PaaS from
multiple vendors, such as Azure, AWS, or Google Cloud Platform.
There are many reasons an organization selects a multi-cloud strategy. Some use it to
avoid vendor lock-in problems, while others combat shadow IT through multi-cloud
deployments. So, employees can still benefit from a specific public cloud service even if it
does not meet strict IT policies.
A multi-cloud deployment model helps organizations choose the specific services that
work best for them.
It provides a reliable architecture.
With multi-cloud models, companies can choose the best Cloud service provider based
on contract options, flexibility with payments, and customizability of capacity.
It allows you to select cloud regions and zones close to your clients.
Companies are extensively using these cloud computing models all around the world.
Each of them solves a specific set of problems. So, finding the right Cloud Deployment Model
for you or your company is important.
Here are points you should remember for selecting the right Cloud Deployment Model:
Scalability: You need to check if your user activity is growing quickly or unpredictably
with spikes in demand.
Privacy and security: Select a service provider that protects your privacy and the
security of your sensitive data.
Cost: You must decide how many resources you need for your cloud solution. Then
calculate the approximate monthly cost for those resources with different cloud
providers.
Ease of use: You must select a model with no steep learning curve.
Legal Compliance: You need to check whether any relevant law stops you from
selecting any specific cloud deployment model.
SaaS, PaaS, and IaaS are the three main cloud computing service model categories.
You can access all three via an Internet browser or online apps available on different devices.
The cloud service model enables teams to collaborate online instead of creating offline
and then sharing online.
Software as a Service (SaaS) is a web-based deployment model that makes the software
accessible through a web browser. SaaS software users don’t need to care where the software
is hosted, which operating system it uses, or even which programming language it is written
in. The SaaS software is accessible from any device with an internet connection.
This cloud service model ensures that consumers always use the most current version
of the software. The SaaS provider handles maintenance and support. In the SaaS model, users
don’t control the infrastructure, such as storage, processing power, etc.
Characteristics of SaaS
Advantages SaaS
The biggest benefit of using SaaS is that it is easy to set up, so you can start using it
instantly.
Compared with on-premises software, it is more cost-effective.
You don’t need to manage or upgrade the software, as it is typically included in a SaaS
subscription or purchase.
It won’t use your local resources, such as the hard disk typically required to install
desktop software.
It is a cloud computing service category that provides a wide range of hosted
capabilities and services.
Developers can easily build and deploy web-based software applications.
You can easily access it through a browser.
Disadvantages SaaS
You must opt for configuration over customization within a SaaS-based
delivery model.
You must carefully understand the usage rates and set clear objectives to achieve
SaaS adoption.
You can complement your SaaS solution with integrations and security options to make
it more user-oriented.
This model provides all the facilities required to support the complete life cycle of
building and delivering web applications and services entirely over the Internet. This cloud
computing model enables developers to rapidly develop, run, and manage their apps without
building and maintaining the infrastructure or platform.
Characteristics of PaaS
Advantages PaaS
Disadvantages of PaaS
You have control over the app’s code and not its infrastructure.
The PaaS organization stores your data, so it sometimes poses a security risk to your
app’s users.
Vendors provide varying service levels, so selecting the right services is essential.
The risk of lock-in with a vendor may affect the ecosystem you need for your
development environment.
Here are essential things you need to consider before PaaS implementation:
Analyze your business needs, decide the automation levels, and also decide whether
you want a self-service or fully automated PaaS model.
You need to determine whether to deploy on a private or public cloud.
Plan through the customization and efficiency levels.
Infrastructure as a Service (IaaS)
Organizations can purchase resources on-demand and as needed instead of buying the
hardware outright.
The IaaS cloud vendor hosts the infrastructure components traditionally present in an
on-premises data center, including servers, storage, networking hardware, and the
hypervisor (virtualization layer).
This Model contains the basic building blocks for your web application. It provides
complete control over the hardware that runs your application (storage, servers, VMs, networks
& operating systems). IaaS model gives you the best flexibility and management control over
your IT resources.
Characteristics of IaaS
Advantages of IaaS
Easy to automate the deployment of storage, networking, and servers.
Hardware purchases can be based on consumption.
Clients keep complete control of their underlying infrastructure.
The provider can deploy the resources to a customer’s environment anytime.
It can be scaled up or downsized according to your needs.
Disadvantages of IaaS
You should ensure that your apps and operating systems are working correctly and
providing the utmost security.
You’re in charge of the data, so if any of it is lost, it’s up to you to recover it.
IaaS firms only provide the servers and API, so you must configure everything else.
Here are some specific considerations you should remember before IaaS Implementation:
You should clearly define your access needs and your network’s bandwidth to
facilitate smooth implementation and functioning.
Plan out detailed data storage and security strategy to streamline the business process.
Ensure that your organization has a proper disaster recovery plan to keep your data
safe and accessible.
Here are some essential criteria for selecting the best cloud service provider:
Financial stability: Look for a well-financed cloud provider that has steady profits
from the infrastructure. If the company shuts down because of monetary issues, your
solutions will also be in jeopardy.
Industries that prefer the solution: Before finalizing cloud services, examine its
existing clients and markets. Your cloud service provider should be popular among
companies in your niche or neighboring ones.
Datacenter locations: To avoid safety risks, ensure that cloud providers enable your
data’s geographical distribution.
Encryption standards: You should make sure the cloud provider supports major
encryption algorithms.
Check accreditation and auditing: The widely used online auditing standard is
SSAE. This procedure helps you to verify the safety of online data storage. ISO
27001 certificate verifies that a cloud provider complies with international safety
standards for data storage.
Backup: The provider should support incremental backups so that you can store
offsite and quickly restore.
Challenges of Cloud Computing
Data Security and Privacy
Even if the cloud service provider assures data integrity, it is your responsibility to carry
out user authentication and
authorization, identity management, data encryption, and access control. Security issues on the
cloud include identity theft, data breaches, malware infections, and a lot more which eventually
decrease the trust amongst the users of your applications. This can in turn lead to potential loss
in revenue alongside reputation and stature. Also, dealing with cloud computing requires
sending and receiving huge amounts of data at high speed, and therefore is susceptible to data
leaks.
Cost Management
Even as almost all cloud service providers have a “Pay As You Go” model, which
reduces the overall cost of the resources being used, there are times when huge costs are
incurred by the enterprise using cloud computing. Under-optimization of the resources, say
servers that are not being used to their full potential, adds up to the hidden costs. If there is
degraded application performance or sudden spikes or overages in
the usage, it adds up to the overall cost. Unused resources are one of the other main reasons
why the costs go up. If you turn on the services or an instance of cloud and forget to turn it off
during the weekend or when there is no current use of it, it will increase the cost without even
using the resources.
Multi-Cloud Environments
Due to an increase in the options available to the companies, enterprises not only use a
single cloud but depend on multiple cloud service providers. Most of these companies use
hybrid cloud tactics and close to 84% are dependent on multiple clouds. This often ends up
being hindered and difficult to manage for the infrastructure team. The process most of the
time ends up being highly complex for the IT team due to the differences between multiple
cloud providers.
Performance Challenges
Challenges also arise in the case of fault tolerance, which means the operations continue as
required even when one or more of the components fail.
When an organization uses a specific cloud service provider and wants to switch to
another cloud-based solution, it often turns out to be a tedious procedure, since applications
written for one cloud with the application stack are required to be re-written for the other cloud.
There is a lack of flexibility from switching from one cloud to another due to the complexities
involved. Handling data movement, setting up the security from scratch and network also add
up to the issues encountered when changing cloud solutions, thereby reducing flexibility.
Since cloud computing deals with provisioning resources in real-time, it deals with
enormous amounts of data transfer to and from the servers. This is only made possible due to
the availability of the high-speed network. Although these data and resources are exchanged
over the network, this can prove to be highly vulnerable in case of limited bandwidth or cases
when there is a sudden outage. Even when enterprises can cut their hardware costs, they
need to ensure that the internet bandwidth is high and that there are zero network outages, or
else it can result in a potential business loss. It is therefore a major challenge for smaller
enterprises that have to maintain network bandwidth that comes with a high cost.
Due to the complex nature of the cloud and the high demand for research, working with
the cloud often ends up being a highly tedious task. It requires immense knowledge and
wide expertise on the subject.
update themselves. Cloud computing is a highly paid job due to the extensive gap between
demand and supply. There are a lot of vacancies but very few talented cloud engineers,
developers, and professionals. Therefore, there is a need for upskilling so these professionals
can actively understand, manage and develop cloud-based applications with minimum issues
and maximum reliability.
Virtualization
Virtualization is the technique of creating a virtual (rather than physical) version of a
computing resource, such as a server, storage device, network, or operating system, so that
multiple virtual resources can run on the same physical hardware.
Host Machine: The machine on which the virtual machine is going to be built is known as Host
Machine.
Guest Machine: The virtual machine is referred to as a Guest Machine.
Benefits of Virtualization
Drawback of Virtualization
High Initial Investment: Clouds have a very high initial investment, but it is also true
that it will help in reducing the cost of companies.
Learning New Infrastructure: As the companies shifted from Servers to Cloud, it
requires highly skilled staff who have skills to work with the cloud easily, and for this,
you have to hire new staff or provide training to current staff.
Risk of Data: Hosting data on third-party resources can lead to putting the data at risk,
it has the chance of getting attacked by any hacker or cracker very easily.
Characteristics of Virtualization
Increased Security: The ability to control the execution of a guest program in a
completely transparent manner opens new possibilities for delivering a secure,
controlled execution environment. All the operations of the guest programs are
generally performed against the virtual machine, which then translates and applies them
to the host programs.
Managed Execution: In particular, sharing, aggregation, emulation, and isolation are
the most relevant features.
Sharing: Virtualization allows the creation of a separate computing environment
within the same host.
Aggregation: It is possible to share physical resources among several guests, but
virtualization also allows aggregation, which is the opposite process.
Types of Virtualization
1. Application Virtualization
2. Network Virtualization
3. Desktop Virtualization
4. Storage Virtualization
5. Server Virtualization
6. Data virtualization
1. Application Virtualization:
2. Network Virtualization:
The ability to run multiple virtual networks, each with a separate control and data plane, co-existing together on top of one physical network. Each virtual network can be managed by individual parties that are potentially confidential to each other. Network virtualization provides a facility to create and provision virtual networks (logical switches, routers, firewalls, load balancers, VPNs, and workload security) within days or even weeks.
3. Desktop Virtualization:
4. Storage Virtualization:
5. Server Virtualization:
This is a kind of virtualization in which the masking of server resources takes place. Here, the central (physical) server is divided into multiple virtual servers by changing the identity number and processors, so each virtual server can run its own operating system in an isolated manner, while each sub-server knows the identity of the central server. This increases performance and reduces operating cost by deploying main server resources as sub-server resources. It is beneficial for virtual migration, reducing energy consumption, reducing infrastructural costs, etc.
6. Data Virtualization:
This is the kind of virtualization in which data is collected from various sources and managed in a single place, without users needing to know technical details such as how the data is collected, stored, and formatted. The data is arranged logically so that its virtual view can be accessed remotely by interested stakeholders and users through various cloud services. Many large companies provide data virtualization services, such as Oracle, IBM, AtScale, CData, etc.
Load Balancing
Load balancing is the method of evenly balancing the amount of work being done across different devices or pieces of hardware equipment. Typically, the load is balanced between different servers or between the CPU and hard drives in a single cloud server.
Load balancing was introduced for various reasons. One is to improve the speed and performance of each single device; the other is to protect individual devices from hitting their limits, which would degrade their performance.
Traffic on the Internet is growing rapidly, increasing by almost 100% annually. The workload on the servers is therefore increasing just as rapidly, leading to overloading of the servers, mainly of popular web servers. There are two primary solutions to overcome the problem of overloading on the server: upgrading to a single, more powerful server (scaling up), or building a scalable service over a cluster of servers (scaling out).
Cloud-based servers can achieve more precise scalability and availability by using server farm load balancing. Load balancing is beneficial with almost any type of service, such as HTTP, SMTP, DNS, FTP, and POP/IMAP.
1. Static Algorithm
Static algorithms are built for systems with very little variation in load. In a static algorithm, the entire traffic is divided equally between the servers. This algorithm requires in-depth knowledge of the server resources, which is determined at the beginning of the implementation, for better processor performance.
However, the decision of load shifting does not depend on the current state of the system. One major drawback of static load balancing algorithms is that the load balancing tasks work only after they have been created; they cannot be moved to other devices for load balancing.
2. Dynamic Algorithm
The dynamic algorithm first finds the lightest-loaded server in the entire network and gives it priority for load balancing. This requires real-time communication with the network, which can add to the system's traffic. Here, the current state of the system is used to make decisions about managing the load.
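As a rough illustration (not part of the original notes), the Python sketch below picks the least-loaded server for each incoming job; the server names and job costs are made-up values.

    # Minimal sketch of dynamic load balancing: each job goes to the
    # server with the smallest current load (hypothetical values).
    servers = {"s1": 0, "s2": 0, "s3": 0}        # server -> current load

    def assign(job_cost):
        target = min(servers, key=servers.get)   # least-loaded server
        servers[target] += job_cost
        return target

    for cost in [5, 3, 8, 2]:
        print(assign(cost), servers)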
Round Robin Load Balancing Algorithm
The round robin load balancing algorithm uses the round-robin method to assign jobs. It first selects a node at random and then assigns jobs to the remaining nodes in a round-robin manner. This is one of the easiest methods of load balancing.
Processes are assigned to processors circularly, without defining any priority. Round robin gives a fast response when the workload is distributed uniformly among the processes. However, processes have different loading times, so some nodes may be heavily loaded while others remain under-utilised.
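A minimal Python sketch of the idea (illustrative only; the node names are hypothetical): the first node is picked at random, after which jobs are handed out in circular order.

    import itertools, random

    nodes = ["node1", "node2", "node3"]
    start = random.randrange(len(nodes))    # first node chosen at random
    rr = itertools.cycle(nodes[start:] + nodes[:start])

    for job in range(6):
        print("job", job, "->", next(rr))   # jobs dispatched in circular order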
Weighted Round Robin Load Balancing Algorithm
Weighted round robin load balancing algorithms were developed to address the most challenging issues of round robin algorithms. In this algorithm, there is a specified set of weights, and tasks are distributed according to the weight values.
Processors with a higher capacity are given a higher weight, so the most capable servers get more tasks. When the full load level is reached, the servers receive stable traffic.
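One simple way to realize this, sketched below in Python with made-up weights, is to repeat each node in the dispatch cycle in proportion to its weight.

    import itertools

    # Hypothetical capacities: "big" receives 3x the tasks of "small".
    weights = {"big": 3, "medium": 2, "small": 1}
    schedule = [n for n, w in weights.items() for _ in range(w)]
    wrr = itertools.cycle(schedule)

    for job in range(12):
        print("job", job, "->", next(wrr))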
Opportunistic Load Balancing (OLB) Algorithm
The opportunistic load balancing algorithm tries to keep every node busy. It never considers the current workload of each system: regardless of the current workload on each node, OLB distributes all unfinished tasks to the nodes.
The processing of a task is executed slowly under OLB, since it does not take the node's execution time into account, which causes bottlenecks even when some nodes are free.
Min-Min Load Balancing Algorithm
Under the min-min load balancing algorithm, the expected completion time of every task is computed first, and the task with the overall minimum completion time is selected. According to that minimum time, the task is scheduled on the corresponding machine.
The machine's load is then updated, and the task is removed from the list. This process continues until the final task is assigned. This algorithm works best where many small tasks outweigh large tasks.
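The following Python sketch shows the scheduling loop under assumed task execution times and two machines; all of the numbers are illustrative.

    # Min-min sketch: repeatedly pick the (task, machine) pair with the
    # smallest completion time, schedule it, and update the machine load.
    tasks = {"t1": 4, "t2": 1, "t3": 3}      # task -> execution time
    machines = {"m1": 0, "m2": 0}            # machine -> ready time

    while tasks:
        best_task, best_machine = min(
            ((t, m) for t in tasks for m in machines),
            key=lambda tm: machines[tm[1]] + tasks[tm[0]],
        )
        machines[best_machine] += tasks.pop(best_task)
        print(best_task, "->", best_machine, machines)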
Hardware-based load balancers: Hardware-based load balancers are dedicated boxes that contain application-specific integrated circuits (ASICs) optimized for a particular use. ASICs allow network traffic to be forwarded at high speeds and are often used for transport-level load balancing, because hardware-based load balancing is faster than a software solution.
Major Examples of Load Balancers
Direct Routing Request Dispatch Technique: This method of request dispatch is similar to the one implemented in IBM's NetDispatcher. A real server and the load balancer share a virtual IP address. The load balancer takes an interface built with the virtual IP address that accepts request packets and routes the packets directly to the selected server.
Linux Virtual Load Balancer: This is an open-source, enhanced load balancing solution used to build highly scalable and highly available network services such as HTTP, POP3, FTP, SMTP, media and caching, and Voice over Internet Protocol (VoIP). It is a simple and powerful product designed for load balancing and fail-over. The load balancer itself is the primary entry point to the server cluster system. It runs the Internet Protocol Virtual Server (IPVS), which implements transport-layer load balancing in the Linux kernel, also known as layer-4 switching.
Network load balancing takes advantage of network-layer information to decide where network traffic should be sent. This is accomplished through layer 4 load balancing, which handles TCP/UDP traffic. It is the fastest load balancing solution, but it cannot make distribution decisions based on application-level content.
HTTP(S) load balancing is the oldest type of load balancing, and it relies on layer 7, meaning that load balancing operates at the application layer. It is the most flexible type of load balancing because it lets you make delivery decisions based on information retrieved from HTTP requests.
Load balancers can be further divided into hardware, software, and virtual load balancers.
Hardware load balancer: It depends on dedicated physical hardware to distribute network and application traffic. Such devices can handle a large traffic volume, but they come with a hefty price tag and have limited flexibility.
Software load balancer: It comes in open-source or commercial form and must be installed before it can be used. Software load balancers are more economical than hardware solutions.
Virtual load balancer: It differs from a software load balancer in that it deploys the software of a hardware load-balancing device on a virtual machine.
The technology of load balancing is less expensive and also easy to implement. This
allows companies to work on client applications much faster and deliver better results at a
lower cost.
Cloud load balancing can provide scalability to control website traffic. By using
effective load balancers, it is possible to manage high-end traffic, which is achieved using
network equipment and servers. E-commerce companies that need to deal with multiple
visitors every second use cloud load balancing to manage and distribute workloads.
Load balancers can handle sudden traffic bursts as they arrive. For example, when university results are published, a website may go down because of too many requests. When one uses a load balancer, one does not need to worry about the traffic flow: whatever the size of the traffic, the load balancer divides the entire load of the website equally across different servers and provides maximum results in minimum response time.
Greater Flexibility
The main reason for using a load balancer is to protect the website from sudden crashes. When the workload is distributed among different network servers or units, if a single node fails, the load is transferred to another node. This offers flexibility, scalability, and the ability to handle traffic better. Because of these characteristics, load balancers are beneficial in cloud environments, where the goal is to avoid a heavy workload on a single server.
Cloud Elasticity
Elasticity is the capacity to grow or shrink infrastructure resources (such as compute, storage, or network) dynamically, on an as-needed basis, to adapt to workload changes in the applications in an autonomic manner.
Example: Consider an online shopping site whose transaction workload increases during festive seasons such as Christmas. For this specific period, additional resources are needed to handle the spike. In situations like this, a Cloud Elasticity service is preferable to Cloud Scalability. As soon as the season is over, the deployed resources can be released.
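A threshold-based autoscaler captures this behaviour; the Python sketch below uses made-up utilisation thresholds and is not any provider's actual policy.

    instances = 2

    def autoscale(avg_cpu):
        global instances
        if avg_cpu > 0.8:                        # scale out on high load
            instances += 1
        elif avg_cpu < 0.3 and instances > 1:    # scale in when idle
            instances -= 1
        return instances

    for cpu in [0.9, 0.85, 0.4, 0.2, 0.2]:
        print("cpu=%.2f -> instances=%d" % (cpu, autoscale(cpu)))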
Cloud Scalability
Cloud scalability is used to handle the growing workload where good performance is
also needed to work efficiently with software or applications. Scalability is commonly used
where the persistent deployment of resources is required to handle the workload statically.
Example: Consider you are the owner of a company whose database size was small in its earlier days, but as time passed your business grew and the size of your database increased as well. In this case, you just need to request that your cloud service vendor scale up your database capacity to handle the heavy workload.
This is quite different from Cloud Elasticity described above. Scalability is used to fulfill the static needs of an organization, while elasticity is used to fulfill its dynamic needs. Scalability is a similar kind of service provided by the cloud, where customers pay per use. In conclusion, scalability is useful where the workload remains high and increases statically.
Types of Scalability
1. Vertical Scalability
In this type of scalability, the power of the existing resources in the working environment is increased in an upward direction.
2. Horizontal Scalability
3. Diagonal Scalability
It is a mixture of both Horizontal and Vertical scalability where the resources are added
both vertically and horizontally.
Replication
The simplest form of data replication in a cloud computing environment is to store a copy of a file; in its expanded form, this is the copying and pasting found in any modern operating system. Replication is the reproduction of the original data in unchanged form. In general, changing data accesses are expensive under replication. In the frequently encountered master/slave replication, a distinction is made between the original data (the primary data) and the dependent copies. With peer copies (as in version control), data sets must be merged (synchronization). Sometimes it is important to know which data sets the replicas must contain. Depending on the type of replication, a certain period of time lies between the processing and creation of the primary data and their replication. This period is usually referred to as latency.
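The master/slave idea can be sketched in a few lines of Python (illustrative only): writes go to the primary and are replayed on the replica later, so the replica lags by the latency described above.

    primary, replica, log = {}, {}, []

    def write(key, value):
        primary[key] = value
        log.append((key, value))      # change recorded for later replay

    def sync():
        while log:                    # replay pending changes on the replica
            k, v = log.pop(0)
            replica[k] = v

    write("x", 1)
    print(primary, replica)           # replica lags behind the primary
    sync()
    print(primary, replica)           # now consistent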
Array-based data replication uses the storage array itself to copy the data; the advantages and disadvantages below refer to this approach.
Advantages:
More robust
Requires less coordination when deployed
The work gets offloaded from the servers to the storage device
Disadvantages:
Requires homogeneous storage environments: the source and target arrays have to be similar
It is costly to implement
Host-based data replication uses the servers to copy data from one site to another site. Host-based replication software usually includes options such as compression, encryption, throttling, and failover. Using this method has several advantages and disadvantages.
Advantages:
Disadvantages:
Network-based data replication uses a device or appliance that sits on the network in
the path of the data to manage replication. The data is then copied to a second device. These
devices usually have proprietary replication technology but can be used with any host server
and storage hardware.
Advantages:
Disadvantages:
Higher initial set-up cost because it requires proprietary hardware, as well as ongoing
operational and management costs
Requires implementation of a storage area network (SAN)
Monitoring
Cloud monitoring is the continuous evaluation of the health and performance of the cloud infrastructure. This continuous evaluation of resource levels, server response times, and speed predicts possible vulnerability to future issues before they arise.
• Hardware Layer
• Virtualization Layer
Partitions the physical hardware resources into multiple virtual resources, enabling pooling of resources.
• Platform Layer
Builds upon the IaaS layers below and provides standardized stacks of services such as database services, queuing services, application frameworks and run-time environments, messaging services, monitoring services, analytics services, etc.
• Applications Layer
Compute Service
Compute services allow users to launch virtual machine instances. In Amazon EC2, for example, the user selects an Amazon Machine Image (AMI) to launch the instance. Users can also create their own AMIs with custom applications, libraries and data. Instances can be launched with a variety of operating systems.
• Instance Sizes
When you launch an instance, you specify the instance type (micro, small, medium, large, extra-large, etc.), the number of instances to launch based on the selected AMI, and the availability zones for the instances.
• Key-pairs
When launching a new instance, the user selects a key-pair from the existing key-pairs or creates a new key-pair for the instance. Key-pairs are used to securely connect to an instance after it launches.
• Security Groups
The security groups to be associated with the instance can be selected from the
instance launch wizard. Security groups are used to open or block a specific network
port for the launched instances.
• Launching Instances
To create a new instance, the user selects an instance machine type, a zone in which the instance will be launched, and a machine image for the instance, and provides an instance name, instance tags, and metadata.
• Disk Resources
Every instance is launched with a disk resource. Depending on the instance type, the disk resource can be scratch disk space or persistent disk space. The scratch disk space is deleted when the instance terminates, whereas persistent disks live beyond the life of an instance.
• Network Options
Network options allow you to control the traffic to and from the instances. By default, traffic between instances in the same network (over any port and any protocol) and incoming SSH connections from anywhere are enabled.
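Putting the pieces above together, the sketch below launches an instance with the AWS SDK for Python (boto3); the AMI ID, key-pair name, and security group ID are placeholders, not real resources.

    import boto3

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(
        ImageId="ami-12345678",           # hypothetical AMI
        InstanceType="t2.micro",          # instance size
        MinCount=1,
        MaxCount=1,
        KeyName="my-keypair",             # existing key-pair for SSH access
        SecurityGroupIds=["sg-12345678"], # opens the needed network ports
    )
    print(response["Instances"][0]["InstanceId"])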
Storage Services
Cloud storage services allow storage and retrieval of any amount of data, at any time
from anywhere on the web.
Most cloud storage services organize data into buckets or containers.
Scalability
Cloud storage services provide high capacity and scalability. Objects up to several terabytes in size can be uploaded, and multiple buckets/containers can be created in cloud storage.
Replication
Access Policies
Cloud storage services provide several security features such as Access Control
Lists (ACLs), bucket/container level policies, etc. ACLs can be used to selectively grant
access permissions on individual objects. Bucket/container level policies can also be
defined to allow or deny permissions across some or all of the objects within a single
bucket/container.
Encryption
Cloud storage services provide Server Side Encryption (SSE) options to encrypt all
data stored in the cloud storage.
Consistency
Strong data consistency is provided for all upload and delete operations. Therefore,
any object that is uploaded can be immediately downloaded after the upload is complete.
ACLs are used to control access to objects and buckets. ACLs can be configured
to share objects and buckets with the entire world, a Google group, a Google-hosted
domain, or specific Google account holders.
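As one concrete example of the access-policy and encryption features above, the boto3 sketch below uploads an object to Amazon S3 with a per-object ACL and server-side encryption; the bucket and key names are placeholders.

    import boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="my-example-bucket",
        Key="reports/data.csv",
        Body=b"col1,col2\n1,2\n",
        ACL="private",                     # per-object access control
        ServerSideEncryption="AES256",     # SSE with S3-managed keys
    )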
Google App Engine is the platform-as-a-service (PaaS) from Google, which includes
both an application runtime and web frameworks.
Runtimes
- App Engine provides runtime environments for the Java, Python, PHP and Go programming languages.
Sandbox
- Applications run in a secure sandbox environment isolated from other applications.
- The sandbox environment provides limited access to the underlying operating system.
Web Frameworks
- App Engine provides a simple Python web application framework called webapp2.
App Engine also supports any framework written in pure Python that speaks WSGI,
including Django, CherryPy, Pylons, web.py, and web2py.
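A minimal webapp2 application, in the classic App Engine Python style, looks like this (the handler and route are illustrative):

    import webapp2

    class MainPage(webapp2.RequestHandler):
        def get(self):
            self.response.headers["Content-Type"] = "text/plain"
            self.response.write("Hello, App Engine!")

    app = webapp2.WSGIApplication([("/", MainPage)], debug=True)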
Datastore
- App Engine provides a NoSQL data storage service.
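For example, with the NDB client library (this runs inside the App Engine Python runtime; the Greeting model is a made-up entity):

    from google.appengine.ext import ndb

    class Greeting(ndb.Model):
        content = ndb.StringProperty()
        date = ndb.DateTimeProperty(auto_now_add=True)

    key = Greeting(content="Hello").put()   # store an entity
    print(key.get().content)                # read it back by key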
Authentication
- App Engine applications can be integrated with Google Accounts for user
authentication.
URL Fetch service
- URL Fetch service allows applications to access resources on the Internet, such as
web services or other data.
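A short sketch of the URL Fetch API (App Engine Python runtime; the URL is illustrative):

    from google.appengine.api import urlfetch

    result = urlfetch.fetch("http://www.example.com/")
    if result.status_code == 200:
        print(result.content)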
Other services
- Email service
- Image Manipulation service
- Memcache
- Task Queues
- Scheduled Tasks service