0% found this document useful (0 votes)

47 views101 pages

Fault

The document summarizes key concepts related to fault tolerance in distributed systems. It discusses types of faults and failures, and techniques for achieving fault tolerance through redundancy including information, time, and physical redundancy. It also covers process resilience through replication, reliable communication using group communication and virtual synchrony, and ensuring distributed agreement and transactions through protocols like two-phase commit.

Uploaded by

s.b.v.seshagiri1407

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

47 views101 pages

Fault

Uploaded by

s.b.v.seshagiri1407

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

You are on page 1/ 101

Chapter 8

Fault Tolerance
Part I Introduction Part II Process Resilience Part III Reliable Communication Part IV Distributed Commit Part V Recovery

CSCE455/855 Distributed Operating Systems

Giving credit where credit is due:

Most of the lecture notes are based on slides by Prof. Jalal Y. Kawash at Univ. of Calgary and Dr. Daniel M. Zimmerman at
CALTECH

Some of the lecture notes are based on slides by Scott Shenker and Ion Stoica at Univ.of California, Berkeley, Timo Alanko at Univ. of Helsinki, Finland, Hugh C. Lauer at Worcester Polytechnic Institute, Xiuwen Liu at Florida State
University

I have modified them and added new slides

Chapter 8

Fault Tolerance
Part I Introduction

Fault Tolerance
A DS should be fault-tolerant
Should be able to continue functioning in the presence of faults

Fault tolerance is related to dependability

Dependability
Dependability Includes Availability Reliability Safety Maintainability

Availability & Reliability (1)

Availability: A measurement of whether a system is ready to be used immediately
System is up and running at any given moment

Reliability: A measurement of whether a system can run continuously without failure

System continues to function for a long period of time

Availability & Reliability (2)

A system goes down 1ms/hr has an availability of more than 99.99%, but is unreliable A system that never crashes but is shut down for a week once every year is 100% reliable but only 98% available

Safety & Maintainability

Safety: A measurement of how safe failures are
System fails, nothing serious happens For instance, high degree of safety is required for systems controlling nuclear power plants

Maintainability: A measurement of how easy it is to repair a system

A highly maintainable system may also show a high degree of availability Failures can be detected and repaired automatically? Self-healing systems?

Faults
A system fails when it cannot meet its promises (specifications) An error is part of a system state that may lead to a failure A fault is the cause of the error Fault-Tolerance: the system can provide services even in the presence of faults Faults can be:
Transient (appear once and disappear) Intermittent (appear-disappear-reappear behavior)
A loose contact on a connector intermittent fault

Permanent (appear and persist until repaired)

Failure Models
Type of failure Crash failure Omission failure Receive omission Send omission Timing failure Response failure Value failure State transition failure Arbitrary failure (Byzantine failure) Description A server halts, but is working correctly until it halts A server fails to respond to incoming requests A server fails to receive incoming messages A server fails to send messages A server's response lies outside the specified time interval The server's response is incorrect The value of the response is wrong The server deviates from the correct flow of control A server may produce arbitrary responses at arbitrary times

Failure Masking

Redundancy is key technique for hiding failures Redundancy types: 1. Information: add extra (control) information
Error-correction codes in messages

2. Time: perform an action persistently until it succeeds:

Transactions Process replication, electronic circuits

3. Physical: add extra components (S/W & H/W)

Example Redundancy in Circuits (1)

Example Redundancy in Circuits (2)

Triple modular redundancy.

Chapter 8

Fault Tolerance
Part II Process Resilience

Process Resilience
Mask process failures by replication
Organize processes into groups, a message sent to a group is delivered to all members

If a member fails, another should fill in

Flat Groups versus Hierarchical Groups

a) b)

Communication in a flat group. Communication in a simple hierarchical group

Process Replication
Replicate a process and group replicas in one group How many replicas do we create? A system is k fault-tolerant if it can survive and function even if it has k faulty processes
For crash failures (a faulty process halts, but is working correctly until it halts)
k+1 replicas

For Byzantine failures (a faulty process may produce arbitrary responses at arbitrary times)
2k+1 replicas

Agreement
Need agreement in DS:
Leader, commit, synchronize

Distributed Agreement algorithm: all nonfaulty processes achieve consensus in a finite number of steps Perfect processes, faulty channels: two-army Faulty processes, perfect channels: Byzantine generals

Two-Army Problem

Byzantine Generals Problem

Byzantine Generals -Example (1)

The Byzantine generals problem for 3 loyal generals and1 traitor. a) The generals announce the time to launch the attack (by messages marked by their ids). b) The vectors that each general assembles based on (a) c) The vectors that each general receives in step 3, where every general passes his vector from (b) to every other general.

Byzantine Generals Example (2)

The same as in previous slide, except now with 2 loyal generals and one traitor.

Byzantine Generals
Given three processes, if one fails, consensus is impossible Given N processes, if F processes fail, consensus is impossible if N 3F

OceanStore
Global-Scale Persistent Storage on Untrusted Infrastructure

Update Model
Concurrent updates w/o wide-area locking
Conflict resolution

A master replica?

Updates Serialization

Role of primary tier of replicas

Incompatible with the untrusted infrastructure assumption

A secondary tier of replicas

All updates submitted to primary tier of replicas which chooses a final total order by following Byzantine agreement protocol
the result of the updates is multicast down the dissemination tree to all the secondary replicas

The Path of an OceanStore Update

Chapter 8

Fault Tolerance
Part III Reliable Communication

Reliable Group Communication

When a group is static and processes do not fail
Reliable communication = deliver the message to all group members
Any order delivery Ordered delivery

Basic Reliable-Multicasting Schemes

A simple solution to reliable multicasting when all receivers are known and assumed not to fail a) Message transmission b) Reporting feedback

Atomic Multicast
All messages are delivered in the same order to all processes
Group view: the view on the set of processes contained in the group Virtual synchronous multicast: a message m multicast to a group view G is delivered to all non-faulty processes in G

Virtual Synchrony System Model

The logical organization of a distributed system to distinguish between message receipt and message delivery

Message Delivery
Delivery of messages - new message => HBQ - decision making - delivery order - deliver or not to deliver? - the message is allowed to be delivered: HBQ => DQ - when at the head of DQ: message => application (application: receive )

Application
delivery
hold-back queue

delivery queue

Message passing system

Virtual Synchronous Multicast

a) Message is not delivered
A A

b) Message is delivered

C Gi = (A, B, C) Gi+1 = (B, C)

C
Gi = (A, B, C) Gi+1 = (B, C)

Virtual Synchronous Multicast

a) Message is not delivered
A A

b) ???

C Gi = (A, B, C) Gi+1 = (B, C)

C
Gi = (A, B, C) Gi+1 = (B, C)

Virtual Synchronous Multicast

a) ???
A

C Gi = (A, B, C) Gi+1 = (B, C)

Reliability of Group Communication?

A sent message is received by all members
(acks from all => ok)

Problem: during a multicast operation

an old member disappears from the group a new member joins the group

Solution
membership changes synchronize multicasting
during a MC operation no membership changes

Virtual synchrony: all processes see message and membership change in the same order

Virtual Synchronous Multicast

a) Message is not delivered
A A

b) Message is delivered

C Gi = (A, B, C) Gi+1 = (B, C)

C
Gi = (A, B, C) Gi+1 = (B, C)

Virtual Synchrony Implementation: [Birman et al., 1991]

Only stable messages are delivered Stable message: a message received by all processes in the messages group view Assumptions (can be ensured by using TCP):
Point-to-point communication is reliable Point-to-point communication ensures FIFO-ordering

How to determine if a message is stable?

Virtual Synchrony Implementation: Example

Gi = {P1, P2, P3, P4, P5} P5 fails P1 detects that P5 has failed P1 send a view change message to every process in Gi+1 = {P1, P2, P3, P4} P1
P2 P3

change view
P4

Virtual Synchrony Implementation: Example

Every process
Send each unstable message m from Gi to members in Gi+1 Marks m as being stable Send a flush message to mark that all unstable messages have been sent
unstable message
P2 P3

P1 P4

flush message

Virtual Synchrony Implementation: Example

Every process
After receiving a flush message from all processes in Gi+1 installs Gi+1
P1 P4

Announcement
2nd Midterm in the week after Spring Break
March 27, Wednesday

Chapters 6, 7, 8.1, & 8.2

Distributed Commit
Goal: Either all members of a group decide to perform an operation, or none of them perform the operation
Atomic transaction: a transaction that happens completely or not at all

Assumptions
Failures:
Crash failures that can be recovered Communication failures detectable by timeouts

Notes:
Commit requires a set of processes to agree similar to the Byzantine generals problem but the solution much simpler because stronger assumptions

Distributed Transactions
client
server

Atomic atomic Consistent isolated Isolated serializable Durable

Database

server

Database

client

A Distributed Banking Transaction

openTransaction closeTransaction . join BranchX T Client
T = openTransaction a.withdraw(4); c.deposit(4); b.withdraw(3); d.deposit(3); closeTransaction

join

participant

a.withdraw(4);

participant B join BranchY participant C D BranchZ c.deposit(4); d.deposit(3); b.withdraw(3);

One-phase Commit
One-phase commit protocol
One site is designated as a coordinator The coordinator tells all the other processes whether or not to locally perform the operation in question This scheme however is not fault tolerant

Transaction Processing (1)

S1
coordinator
F1

T_Id
flag: init
P1 27

client .
Open transaction T_write F1,P1 T_write F2,P2 T_write F3,P3 Close transaction . join S2
participant
F2

T_Id
flag: init
P2 27

S3
participant

T_Id
flag: init
P3 2745

Transaction Processing (2)

Close
F1

coordinator

T_Id
init committed wait done
P1 27

client .
Open transaction T_read F1,P1 T_write F2,P2 T_write F3,P3 Close transaction .

doCommit ! canCommit?

Yes HaveCommitted

T_Id committed ready init

P2 27

Yes HaveCommitted
T_Id committed ready init
P3 2745

Two Phase Commit (2PC)

Coordinator send VOTE_REQ to all Participants

send vote to coordinator if (vote == no) decide abort halt

if (all votes yes) decide commit send COMMIT to all else decide abort send ABORT to all who voted yes halt

if receive ABORT, decide abort else decide commit halt

Two-Phase Commit (1)

a) b)

The finite state machine for the coordinator in 2PC. The finite state machine for a participant.

Two-Phase Commit (2)

a) b)

The finite state machine for the coordinator in 2PC. The finite state machine for a participant.

Two-Phase Commit (3)

State of Q COMMIT ABORT Action by P Make transition to COMMIT Make transition to ABORT

INIT
READY

Make transition to ABORT

Contact another participant

Actions taken by a participant P when residing in state READY and having contacted another participant Q.

Two-Phase Commit (4)

actions by coordinator:

write START _2PC to local log; multicast VOTE_REQUEST to all participants; while not all votes have been collected { wait for any incoming vote; if timeout { write GLOBAL_ABORT to local log; multicast GLOBAL_ABORT to all participants; exit; } record vote; } if all participants sent VOTE_COMMIT and coordinator votes COMMIT{ write GLOBAL_COMMIT to local log; multicast GLOBAL_COMMIT to all participants; } else { write GLOBAL_ABORT to local log; multicast GLOBAL_ABORT to all participants; }

Outline of the steps taken by the coordinator in 2PC.

Two-Phase Commit (5)

actions by participant:
write INIT to local log; wait for VOTE_REQUEST from coordinator; if timeout { write VOTE_ABORT to local log; exit; } if participant votes COMMIT { write VOTE_COMMIT to local log; send VOTE_COMMIT to coordinator; wait for DECISION from coordinator; if timeout { multicast DECISION_REQUEST to other participants; wait until DECISION is received; /* remain blocked */ write DECISION to local log; } if DECISION == GLOBAL_COMMIT write GLOBAL_COMMIT to local log; else if DECISION == GLOBAL_ABORT write GLOBAL_ABORT to local log; } else { write VOTE_ABORT to local log; send VOTE ABORT to coordinator; }

Steps taken by participant process in 2PC.

Two-Phase Commit (6)

actions for handling decision requests: /* executed by separate thread */
while true { wait until any incoming DECISION_REQUEST is received; /* remain blocked */ read most recently recorded STATE from the local log; if STATE == GLOBAL_COMMIT send GLOBAL_COMMIT to requesting participant; else if STATE == INIT or STATE == GLOBAL_ABORT send GLOBAL_ABORT to requesting participant; else skip; /* participant remains blocked */

Steps taken by participant process for handling incoming decision requests.

Two-Phase Commit(7)
When all participants are in the ready states, no final decision can be reached Two-phase commit is a blocking commit protocol

Three-Phase Commit (1)

There is no state from which a transition can be made to either Commit or Abort There is no state where it is not possible to make a final decision and from which transition can be made to Commit non-blocking commit protocol

Three-Phase Commit (2)

Coordinator sends Vote_Request (as before) If all participants respond affirmatively,
Put Precommit state into log on stable storage Send out Prepare_to_Commit message to all

After all participants acknowledge,

Put Commit state in log Send out Global_Commit

Three-Phase Commit (3)

Coordinator blocked in Wait state
Safe to abort transaction

Coordinator blocked in Precommit state

Safe to issue Global_Commit Any crashed or partitioned participants will commit when recovered

Three-Phase Commit (4)

Participant blocked in Precommit state
Contact others Collectively decide to commit

Participant blocked in Ready state

Contact others If any in Abort, then abort transaction If any in Precommit, the move to Precommit state If all in Ready state, then abort transaction

Chapter 8

Fault Tolerance
Part V Recovery

Recovery
Weve talked a lot about fault tolerance, but not about what happens after a fault has occurred A process that exhibits a failure has to be able to recover to a correct state There are two basic types of recovery:

Backward Recovery Forward Recovery

Backward Recovery
The goal of backward recovery is to bring the system from an erroneous state back to a prior correct state The state of the system must be recorded checkpointed - from time to time, and then restored when things go wrong Examples

Reliable communication through packet retransmission

Forward Recovery

The goal of forward recovery is to bring a system from an erroneous state to a correct new state (not a previous state) Examples:

Reliable communication via erasure correction, such as an (n, k) block erasure code

More on Backward Recovery

Backward recovery is far more widely applied Checkpointing is costly, so its often combined with message logging

Stable Storage

In order to store checkpoints and logs, information needs to be stored safely - not just able to survive crashes, but also able to survive hardware faults RAID is the typical example of stable storage

Checkpointing

Related to checkpointing, let us first discuss the global state and the distributed snapshot algorithm

Determining Global States

The Global State of a distributed computation is the set of local states of all individual processes involved in the computation + the states of the communication channels How?

Obvious First Solution

Synchronize clocks of all processes and ask all processes to record their states at known time t Problems?

Time synchronization possible only approximately

distributed

banking applications: no approximations!

Does not record the state of messages in the channels

Global State

We cannot determine the exact global state of the system, but we can record a snapshot of it

Distributed Snapshot: a state the system might have been in [Chandy and Lamport]

A nave snapshot algorithm

Processes record their state at any arbitrary point A designated process collects these states

+ So simple!! - Correct??

Example
Producer Consumer problem
p q

p records its state

Example
p m q

Example
p q

q records its state

Example The recorded state

p q

The sender has no record of the sending The receiver has the record of the receipt

Whats Wrong?
p m

Result:

Global state has record of the receive event but no send event violating the happens-before concept!!

Cut

A consistent cut (meaningful global state) ?

Cut

A consistent cut (meaningful global state) ?

Cuts

a) b)

A consistent cut (meaningful global state) An inconsistent cut

The Snapshot Algorithm

Records a set of process and channel states such that the combination is a consistent GS. Assumptions:
All messages arrive intact, exactly once Communication channels are unidirectional and FIFOordered There is a comm. path between any two processes Any process may initiate the snapshot (sends Marker) Snapshot does not interfere with normal execution Each process records its state and the state of its incoming channels

The Snapshot Algorithm (2)

1. Marker sending rule for initiator process P0
After P0 has recorded its state

for each outgoing channel C, sends a marker on C

2. Marker receiving rule for a process Pk, on receipt of a marker over channel C
if Pk has not yet recorded its state - records Pks state - records the state of C as empty - turns on recording of messages over other incoming channels for each outgoing channel C, sends a marker on C - else - records the state of C as all the messages received over C since Pk saved its state

Snapshot Example
P1 P2 P3
e10 e11,2
M a

e13 e14
M M b

e15 e24 e25

M M M

e20 e 1,2,3 2 e30

e31,2,3

e34

1- P1 initiates snapshot: records its state (S1); sends Markers to P2 & P3; turns on recording for channels C21 and C31 2- P2 receives Marker over C12, records its state (S2), sets state(C12) = {} sends Marker to P1 & P3; turns on recording for channel C32 3- P1 receives Marker over C21, sets state(C21) = {a} 4- P3 receives Marker over C13, records its state (S3), sets state(C13) = {} sends Marker to P1 & P2; turns on recording for channel C23 5- P2 receives Marker over C32, sets state(C32) = {b} 6- P3 receives Marker over C23, sets state(C23) = {} 7- P1 receives Marker over C31, sets state(C31) = {}

Snapshot Example
P1 P2 P3
e10
a

e13 e24
b

e20 e30

Distributed Snapshot Algorithm

When a process finishes local snapshot, it collects its local state (S and C) and sends it to the initiator of the distributed snapshot The initiator can then analyze the state One algorithm for distributed global snapshots, but its not particularly efficient for large systems

Checkpointing
Weve discussed distributed snapshots The most recent distributed snapshot in a system is also called the recovery line

Independent Checkpointing

It is often difficult to find a recovery line in a system where every process just records its local state every so often - a domino effect or cascading rollback can result:

Coordinated Checkpointing
To solve this problem, systems can implement coordinated checkpointing Weve discussed one algorithm for distributed global snapshots, but its not particularly efficient for large systems Another way to do it is to use a two-phase blocking protocol (with some coordinator) to get every process to checkpoint its local state simultaneously

Coordinated Checkpointing
Make sure that processes are synchronized when doing the checkpoint Two-phase blocking protocol

1. 2.
3.

Coordinator multicasts CHECKPOINT_REQUEST Processes take local checkpoint Delay further sends Acknowledge to coordinator Send state
Coordinator multicasts CHECKPOINT_DONE

Message Logging

Checkpointing is expensive - message logging allows the occurrences between checkpoints to be replayed, so that checkpoints dont need to happen as frequently

Message Logging

We need to choose when to log messages Message-logging schemes can be characterized as pessimistic or optimistic by how they deal with orphan processes

An orphan process is one that survives the crash of another process but has an inconsistent state after the other process recovers

Message Logging

An example of an incorrect replay of messages

Message Logging
We assume that each message m has a header containing all the information necessary to retransmit m (sender, receiver, sequence no., etc.) A message is called stable if it can no longer be lost - a stable message can be used for recovery by replaying its transmission

Message Logging

Each message m leads to a set of dependent processes DEP(m), to which either m or a message causally dependent on m has been delivered

Message Logging

The set COPY(m) consists of the processes that have a copy of m, but not in their local stable storage any process in COPY(m) could deliver a copy of m on request

Message Logging

Process Q is an orphan process if there is a nonstable message m, such that Q is contained in DEP(m), and every process in COPY(m) has crashed

Message Logging

To avoid orphan processes, we need to ensure that if all processes in COPY(m) crash, no processes remain in DEP(m)

Pessimistic Logging
For each nonstable message m, ensure that at most one process P is dependent on m The worst that can happen is that P crashes without m ever having been logged No other process can have become dependent on m, because m was nonstable, so this leaves no orphans

Optimistic Logging
The work is done after a crash occurs, not before If, for some m, each process in COPY(m) has crashed, then any orphan process in DEP(m) gets rolled back to a state in which it no longer belongs in DEP(m) Dependencies need to be explicitly tracked, which makes this difficult to implement - as a result, pessimistic approaches are preferred in real-world implementations

Fault Tolerance Notes
No ratings yet
Fault Tolerance Notes
101 pages
Distributed Computing System Quiz Questions
75% (4)
Distributed Computing System Quiz Questions
9 pages
Sec 2425 L02
No ratings yet
Sec 2425 L02
56 pages
01 Da24 Introduction
No ratings yet
01 Da24 Introduction
55 pages
Lecture23 FaultTolerance
No ratings yet
Lecture23 FaultTolerance
56 pages
Unit # IV Replication and Fault Tolerance
No ratings yet
Unit # IV Replication and Fault Tolerance
82 pages
ProcessResilience FaultTolerance Recovery
No ratings yet
ProcessResilience FaultTolerance Recovery
21 pages
Du3 1
No ratings yet
Du3 1
54 pages
Fault Tolerance
No ratings yet
Fault Tolerance
40 pages
DC - Unit 4 Latest
No ratings yet
DC - Unit 4 Latest
110 pages
# Consensus and Agreement Algorithms: Distributed Computing
No ratings yet
# Consensus and Agreement Algorithms: Distributed Computing
9 pages
Blockchain - Unit1
No ratings yet
Blockchain - Unit1
115 pages
CST402 Distributed Computing M5
No ratings yet
CST402 Distributed Computing M5
41 pages
ADS Chapter 4 Concurrency Control Techniques
No ratings yet
ADS Chapter 4 Concurrency Control Techniques
36 pages
Chapter 8-Fault Tolerance
No ratings yet
Chapter 8-Fault Tolerance
51 pages
Fault Tolerance FDCC
No ratings yet
Fault Tolerance FDCC
76 pages
Final GB DC CH 02 D Group Communication
No ratings yet
Final GB DC CH 02 D Group Communication
43 pages
Module 5 Notes
No ratings yet
Module 5 Notes
10 pages
Module 5
No ratings yet
Module 5
11 pages
Week 04
No ratings yet
Week 04
49 pages
UNIT 4 DC Final
No ratings yet
UNIT 4 DC Final
38 pages
Fault Tolerance (1) : Distributed Systems Principles and Paradigms
No ratings yet
Fault Tolerance (1) : Distributed Systems Principles and Paradigms
49 pages
Chapter 8
No ratings yet
Chapter 8
107 pages
Chen 07
No ratings yet
Chen 07
39 pages
Fault Tolerant Message Passing Systems
No ratings yet
Fault Tolerant Message Passing Systems
26 pages
DS Unit-3 Notes
No ratings yet
DS Unit-3 Notes
35 pages
Chapter-6 - Transactions-Concurrency and Recovery
No ratings yet
Chapter-6 - Transactions-Concurrency and Recovery
42 pages
Chapter 8-Fault Tolerance
No ratings yet
Chapter 8-Fault Tolerance
37 pages
Unit Iv Consensus and Recovery
No ratings yet
Unit Iv Consensus and Recovery
38 pages
Chapter 8-Fault Tolerance
100% (1)
Chapter 8-Fault Tolerance
71 pages
Pmic Safety Detroit Techday
No ratings yet
Pmic Safety Detroit Techday
61 pages
Lm1-Consensus Algorithm
No ratings yet
Lm1-Consensus Algorithm
35 pages
Binance-Spot Trade History-202505012320
No ratings yet
Binance-Spot Trade History-202505012320
4 pages
DS CH7 - Fault Tolerance
No ratings yet
DS CH7 - Fault Tolerance
17 pages
Intro To DS Chapter 6
No ratings yet
Intro To DS Chapter 6
51 pages
ch08 Ts TK Fault Tolerance I
No ratings yet
ch08 Ts TK Fault Tolerance I
29 pages
Chapter 06 Fault - Tolerance
No ratings yet
Chapter 06 Fault - Tolerance
30 pages
Unit 1
No ratings yet
Unit 1
1 page
Chapte Four DS
No ratings yet
Chapte Four DS
37 pages
Elec Chap 3
No ratings yet
Elec Chap 3
28 pages
CSE446 Lecture 4
No ratings yet
CSE446 Lecture 4
32 pages
Chapter 3 Problem
0% (2)
Chapter 3 Problem
5 pages
Lec 3
No ratings yet
Lec 3
30 pages
Chapter 8-Fault Tolerance
No ratings yet
Chapter 8-Fault Tolerance
30 pages
Cloud Computing 1 74
No ratings yet
Cloud Computing 1 74
74 pages
08 Falhas
No ratings yet
08 Falhas
41 pages
Distributed Systems Ii Fault-Tolerant Broadcast (CNT.) : Prof Philippas Tsigas
No ratings yet
Distributed Systems Ii Fault-Tolerant Broadcast (CNT.) : Prof Philippas Tsigas
65 pages
UNIT-2 FBCT
0% (1)
UNIT-2 FBCT
77 pages
Unit5 Compressed Fault Tolerance - PACE
No ratings yet
Unit5 Compressed Fault Tolerance - PACE
11 pages
Ds Chapter 7
No ratings yet
Ds Chapter 7
21 pages
Unit - Iv
No ratings yet
Unit - Iv
10 pages
Chapter 8 Fault Tolerance
No ratings yet
Chapter 8 Fault Tolerance
20 pages
CH 4
No ratings yet
CH 4
25 pages
DS Chapter V8.0fault Tolerance
No ratings yet
DS Chapter V8.0fault Tolerance
23 pages
Lecture 18: Distributed Agreement: CSC 469H1F / CSC 2208H1F Fall 2007 Angela Demke Brown
No ratings yet
Lecture 18: Distributed Agreement: CSC 469H1F / CSC 2208H1F Fall 2007 Angela Demke Brown
35 pages
Distributed Systems: Fault Tolerance: Fall 2013
No ratings yet
Distributed Systems: Fault Tolerance: Fall 2013
42 pages
Fault Tolerance: Click To Add Text Dealing Successfully With Partial System. Key Technique: Redundancy
No ratings yet
Fault Tolerance: Click To Add Text Dealing Successfully With Partial System. Key Technique: Redundancy
48 pages
CBDT3103 Answer
No ratings yet
CBDT3103 Answer
9 pages
$17k Finding Lost Crypto With WalletCracker
No ratings yet
$17k Finding Lost Crypto With WalletCracker
5 pages
Consensus Failure
No ratings yet
Consensus Failure
79 pages
Unit 4
No ratings yet
Unit 4
11 pages
Robot Chassis and Drivetrain Fundamentals
No ratings yet
Robot Chassis and Drivetrain Fundamentals
65 pages
Fault Tolerance and Consensus: C. Bettini - Distributed and Pervasive Systems
No ratings yet
Fault Tolerance and Consensus: C. Bettini - Distributed and Pervasive Systems
10 pages
Block Chain Techonology
No ratings yet
Block Chain Techonology
2 pages
AURIX MultiCore Lauterbach Handout
No ratings yet
AURIX MultiCore Lauterbach Handout
56 pages
Module 4 - Cloud Programming and Software Environments
No ratings yet
Module 4 - Cloud Programming and Software Environments
25 pages
Adc Student: Andrew Brown Jonathan Warner Laura Strickland
No ratings yet
Adc Student: Andrew Brown Jonathan Warner Laura Strickland
51 pages
Distributed Computing: Farhad Muhammad Riaz
No ratings yet
Distributed Computing: Farhad Muhammad Riaz
18 pages
H Da Barth Functional Safety On Multicore
No ratings yet
H Da Barth Functional Safety On Multicore
23 pages
Distributed Systems - Fault Tolerance
No ratings yet
Distributed Systems - Fault Tolerance
21 pages
Fault Tolerance:-: Introduction, Process Resilience, Distributed Commit, Recovery
No ratings yet
Fault Tolerance:-: Introduction, Process Resilience, Distributed Commit, Recovery
52 pages
Analog Electronics
No ratings yet
Analog Electronics
254 pages
Define - Distributed System
No ratings yet
Define - Distributed System
33 pages
Consensus & Agreement: Arvind Krishnamurthy Fall 2003
No ratings yet
Consensus & Agreement: Arvind Krishnamurthy Fall 2003
41 pages
07 DRITSANOS IoT-Conference Schneider-Electric
No ratings yet
07 DRITSANOS IoT-Conference Schneider-Electric
13 pages
WASE 2021 Cloud Computing Handout - S2 - 23
No ratings yet
WASE 2021 Cloud Computing Handout - S2 - 23
25 pages
Hbase问题处理
No ratings yet
Hbase问题处理
11 pages
Software Testing and Reliability
No ratings yet
Software Testing and Reliability
77 pages
Fault System One
No ratings yet
Fault System One
19 pages
Group Communications: (A) Fault Tolerance Based On Replicated Servers
No ratings yet
Group Communications: (A) Fault Tolerance Based On Replicated Servers
10 pages
Introduction To Electronic Design
No ratings yet
Introduction To Electronic Design
18 pages
Introduction To Electronic Design
No ratings yet
Introduction To Electronic Design
18 pages
Gowda2019 ECU Inter - Processor Data Communication
No ratings yet
Gowda2019 ECU Inter - Processor Data Communication
11 pages
Gowda2019 ECU Inter - Processor Data Communication
No ratings yet
Gowda2019 ECU Inter - Processor Data Communication
11 pages
L17 FET DC Analysis
No ratings yet
L17 FET DC Analysis
19 pages
Electronic Stability Control: NHTSA's Notice of Proposed Rulemaking
No ratings yet
Electronic Stability Control: NHTSA's Notice of Proposed Rulemaking
18 pages
Distributed Systems: Election Algorithms
No ratings yet
Distributed Systems: Election Algorithms
44 pages
Amazon EKS
No ratings yet
Amazon EKS
36 pages
Analytical Study of Sense Amplifier
100% (1)
Analytical Study of Sense Amplifier
5 pages
ZD 007 Sample Definition Project Management EN
No ratings yet
ZD 007 Sample Definition Project Management EN
1 page
QEP en Quality Requirements For Samples
No ratings yet
QEP en Quality Requirements For Samples
3 pages
Mach Final Case Study
No ratings yet
Mach Final Case Study
24 pages
Unit 1 CC
No ratings yet
Unit 1 CC
4 pages
DC 70
No ratings yet
DC 70
2 pages
Mutual Exclusion
No ratings yet
Mutual Exclusion
7 pages
State Recording Algorithm
No ratings yet
State Recording Algorithm
11 pages
09 Transaksi (2) - NDN
No ratings yet
09 Transaksi (2) - NDN
26 pages
3) Unit - 1 - P1 - Block Chain Technology
No ratings yet
3) Unit - 1 - P1 - Block Chain Technology
26 pages
A Review On Blockchain Security Issues and Challenges
No ratings yet
A Review On Blockchain Security Issues and Challenges
7 pages
Pros and Cons of Permissioned Blockchain Pros
No ratings yet
Pros and Cons of Permissioned Blockchain Pros
2 pages
Module 2: Goals of Parallelism Week 2 Learning Outcomes:: General-Purpose Computing On Graphics Processing Units
No ratings yet
Module 2: Goals of Parallelism Week 2 Learning Outcomes:: General-Purpose Computing On Graphics Processing Units
11 pages
Matlab DCE
No ratings yet
Matlab DCE
21 pages
4 Trangination and Concurency
No ratings yet
4 Trangination and Concurency
17 pages
OS-Chapter 17 Distributed Coordination
No ratings yet
OS-Chapter 17 Distributed Coordination
5 pages
Types of Consistency Models Used in DSM: 1.weak Consistency 2.release Consistency 3.entry Consistency
No ratings yet
Types of Consistency Models Used in DSM: 1.weak Consistency 2.release Consistency 3.entry Consistency
13 pages
Lecture1 (I) Course Profile
No ratings yet
Lecture1 (I) Course Profile
3 pages