15-440 Distributed Systems: Fault Tolerance, Logging and Recovery Thursday Oct 8, 2015

The document summarizes key concepts from a lecture on fault tolerance, logging, and recovery in distributed systems. It discusses: 1) Types of failures and fault tolerance approaches like redundancy and masking failures. 2) Recovery strategies like backward and forward recovery using checkpointing. 3) The challenges of making transactions reliable in distributed systems using techniques like write-ahead logging and shadow paging to provide durability and atomicity. 4) Key aspects of write-ahead logging including log records, transaction and dirty page tables, and a three pass recovery process using analysis, redo, and undo passes.


15-440 Distributed Systems

Lecture 10
Fault Tolerance, Logging and Recovery
Thursday Oct 8th, 2015
Logistics Updates

• P1 checkpoint due (11:59EST, Oct 04th)

• Part A Due: Oct 13th
• Part B Due: Oct 25th

• HW2 released
• Due Oct 13th
• (*No Late Days*) => time to prepare for Mid term

• Mid term Tuesday Oct 18th in class

• Will cover everything until the first half of class

Today's Lecture Outline

• Real Systems (are often unreliable)

• We ignored failures till now
• Fault Tolerance basic concepts

• Fault Tolerance – Checkpointing

• Fault Tolerance – Logging and Recovery

What is Fault Tolerance?

• Dealing successfully with partial failure within a distributed system
• Fault tolerant ~> dependable systems
• Dependability implies the following:
1. Availability
2. Reliability
3. Safety
4. Maintainability
Dependability Concepts

• Availability – the system is ready to be used immediately.
• Reliability – the system runs continuously without failure.
• Safety – if a system fails, nothing catastrophic will happen (e.g., process control systems).
• Maintainability – when a system fails, it can be repaired easily and quickly (sometimes without its users noticing the failure). Also called recovery.
• What is a failure? A system fails when it cannot meet its goals; the cause of a failure is a fault.
• Faults can be: Transient, Intermittent, Permanent
Failure Models
Masking Failures by Redundancy

• Strategy: hide the occurrence of failure from other processes using redundancy.
1. Information Redundancy – add extra bits to
allow for error detection/recovery (e.g.,
Hamming codes and the like).
2. Time Redundancy – perform an operation and, if
need be, perform it again. Think about how
transactions work (BEGIN/END/COMMIT/ABORT).
3. Physical Redundancy – add extra (duplicate)
hardware and/or software to the system.
Masking Failures by Redundancy

Triple modular redundancy in a circuit (b)

A,B,C are circuit elements and V* are voters
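The voters in triple modular redundancy can be sketched in software as a majority vote over three replicas of the same stage. This is a minimal illustration, not the circuit from the slide: the names `majority_vote`, `tmr`, `healthy` and `faulty` are hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replica outputs")
    return value


def tmr(replicas, x):
    """Run every replica of a stage on input x and vote on the result."""
    return majority_vote([replica(x) for replica in replicas])


# Hypothetical stage: two healthy copies and one faulty copy.
healthy = lambda x: x + 1
faulty = lambda x: -1
result = tmr([healthy, faulty, healthy], 41)
```

A single faulty replica is masked because the other two still agree; two simultaneous faults that disagree with each other make the vote fail rather than return a wrong value.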
Today's Lecture Outline

• Real Systems (are often unreliable)

• We ignored failures till now
• Fault Tolerance basic concepts

• Fault Tolerance – Recovery using Checkpointing

• Fault Tolerance – Logging and Recovery

Achieving Fault Tolerance in DS

• Process Resilience (when processes fail) T8.2

• Have multiple processes (redundancy)
• Group them (flat, hierarchically), voting
• Reliable RPCs (communication failures) T8.3
• Several cases to consider (lost reply, client crash, …)
• Several potential solutions for each case
• Distributed Commit Protocols
• Perform operations by all group members, or not at all
• 2 phase commit, … (last lecture)
• Today: A failure has occurred, can we recover?
Recovery Strategies
• When a failure occurs, we need to bring the
system into an error free state (recovery). This
is fundamental to Fault Tolerance.
1. Backward Recovery: return the system to
some previous correct state (using checkpoints),
then continue executing. Example?
• Packet retransmit in case of lost packet
2. Forward Recovery: bring the system into a
correct new state, from which it can then
continue to execute. Example?
• Erasure coding, (n,k) where k < n <= 2k
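As a small forward-recovery sketch, the simplest erasure code that satisfies k < n <= 2k is (n, k) = (3, 2): two data blocks plus one XOR parity block. The functions `encode` and `recover` are hypothetical names, and the sketch assumes at most one block is lost.

```python
def encode(a: bytes, b: bytes) -> list:
    """(n, k) = (3, 2): two data blocks plus one XOR parity block."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]


def recover(blocks: list) -> tuple:
    """Rebuild both data blocks when at most one block is lost (None)."""
    if sum(blk is None for blk in blocks) > 1:
        raise ValueError("more than one block lost; cannot recover")
    a, b, parity = blocks
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    return a, b


stored = encode(b"ab", b"cd")
stored[0] = None                 # simulate losing the first data block
recovered = recover(stored)
```

No rollback is needed: the receiver moves forward to a correct state using only the redundancy it already holds, which is why every tolerated error has to be planned for in advance.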
Forward and Backward Recovery

• Major disadvantage of Backward Recovery:
• Checkpointing can be very expensive (especially
when errors are very rare).
• [Despite the cost, backward recovery is implemented
more often. The “logging” of information can be
thought of as a type of checkpointing.]
• Major disadvantage of Forward Recovery:
• In order to work, all potential errors need to be
accounted for up-front.
• When an error occurs, the recovery mechanism then
knows what to do to bring the system forward to a
correct state.
Checkpointing

A recovery line to detect the correct distributed snapshot

This becomes challenging if checkpoints are un-coordinated
Independent Checkpointing

The domino effect – Cascaded rollback

P2 crashes and rolls back, but the two checkpoints are inconsistent (P2's checkpoint shows m received, but P1's does not show m sent)
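The cascaded rollback can be sketched as a search for a recovery line. The sketch assumes each checkpoint records the cumulative sets of message ids sent and received, and that index 0 of each history is the initial state with both sets empty. The data layout and the name `find_recovery_line` are illustrative only.

```python
def find_recovery_line(history):
    """Pick the latest checkpoint per process such that no message is
    recorded as received without also being recorded as sent."""
    idx = {p: len(cps) - 1 for p, cps in history.items()}
    while True:
        sent = set().union(*(history[p][i]["sent"] for p, i in idx.items()))
        rolled_back = False
        for p, i in idx.items():
            if history[p][i]["received"] - sent:     # orphan message found
                if i == 0:
                    raise ValueError("initial checkpoint is inconsistent")
                idx[p] = i - 1                        # roll this process back
                rolled_back = True
                break
        if not rolled_back:
            return idx


# P2's latest checkpoint shows m2 received, but P1's does not show m2 sent.
history = {
    "P1": [{"sent": set(), "received": set()},
           {"sent": set(), "received": {"m1"}}],
    "P2": [{"sent": set(), "received": set()},
           {"sent": {"m1"}, "received": {"m2"}}],
}
line = find_recovery_line(history)
```

Rolling P2 back erases its record of sending m1, which in turn orphans m1 at P1, so both processes end up at their initial state: the domino effect.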
Coordinated Checkpointing

• Key idea: each process takes a checkpoint after a globally coordinated action. (why is this good?)
• Simple Solution: 2-phase blocking protocol
• Coordinator multicasts a checkpoint_REQUEST message
• Each participant receives the message, takes a checkpoint, stops sending
(application) messages, and sends back checkpoint_ACK
• Once all participants ACK, the coordinator sends checkpoint_DONE to allow
blocked processes to go on
• Optimization: consider only processes that depend on the recovery of the coordinator (those it sent a message to since the last checkpoint)
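The two-phase blocking protocol can be sketched as a single-process simulation. The message names follow the slide; the `Participant` class, its method names and `coordinate_checkpoint` are hypothetical, and real multicast, timeouts and failures are left out.

```python
class Participant:
    """A process that checkpoints its state when the coordinator asks."""

    def __init__(self, name, state):
        self.name = name
        self.state = state          # application state (a dict here)
        self.checkpoint = None      # last saved copy of the state
        self.blocked = False        # True while application sends are paused

    def on_checkpoint_request(self):
        self.checkpoint = dict(self.state)   # take the checkpoint
        self.blocked = True                  # stop sending application messages
        return "checkpoint_ACK"

    def on_checkpoint_done(self):
        self.blocked = False                 # resume normal operation

    def send(self, outbox, message):
        if self.blocked:
            raise RuntimeError(f"{self.name} is blocked until checkpoint_DONE")
        outbox.append((self.name, message))


def coordinate_checkpoint(participants):
    """Phase 1: request and collect ACKs. Phase 2: release everyone."""
    acks = [p.on_checkpoint_request() for p in participants]
    if not all(ack == "checkpoint_ACK" for ack in acks):
        return False
    for p in participants:
        p.on_checkpoint_done()
    return True


procs = [Participant("P1", {"x": 1}), Participant("P2", {"y": 2})]
ok = coordinate_checkpoint(procs)
```

Because no participant sends application messages between its checkpoint and checkpoint_DONE, no message can cross the checkpoint boundary, so the saved states form a consistent cut.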

Recovery – Stable Storage

(a) Stable storage.

(b) Crash after drive 1 is updated.
(c) Bad spot.
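The three cases in the figure can be sketched with two dictionaries standing in for the two drives and a CRC standing in for the per-block error check. The recovery rule used here (drive 1 wins when both copies are good but differ) is the usual textbook rule and is an assumption about what the figure shows; all function names are hypothetical.

```python
import zlib


def make_block(data: bytes) -> dict:
    """Store data together with a checksum so bad spots can be detected."""
    return {"data": data, "crc": zlib.crc32(data)}


def is_good(block) -> bool:
    return block is not None and zlib.crc32(block["data"]) == block["crc"]


def stable_write(drive1, drive2, addr, data, crash_after_first=False):
    """Always write drive 1 first, then drive 2."""
    drive1[addr] = make_block(data)
    if crash_after_first:
        return                      # simulated crash between the two writes
    drive2[addr] = make_block(data)


def recover(drive1, drive2, addr):
    """Make both copies good and identical again after a crash."""
    b1, b2 = drive1.get(addr), drive2.get(addr)
    if is_good(b1) and (not is_good(b2) or b1["data"] != b2["data"]):
        drive2[addr] = dict(b1)     # crash after drive 1 updated, or bad drive 2
    elif is_good(b2) and not is_good(b1):
        drive1[addr] = dict(b2)     # bad spot on drive 1


d1, d2 = {}, {}
stable_write(d1, d2, 0, b"old")
stable_write(d1, d2, 0, b"new", crash_after_first=True)
recover(d1, d2, 0)
```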
Today's Lecture Outline

• Real Systems (are often unreliable)

• We ignored failures till now
• Fault Tolerance basic concepts

• Fault Tolerance – Checkpointing

• Fault Tolerance – Logging and Recovery

Goal: Make transactions Reliable

• …in the presence of failures

• Machines can crash. Disk contents survive (OK), memory is volatile, machines
don’t misbehave
• Networks are flaky and lose packets; handle using timeouts
• If we store database state in memory, a crash will cause loss of “Durability”.
• May violate atomicity, i.e. we must recover such that uncommitted transactions either COMMIT or ABORT.
• General idea: store enough information on disk to determine global state (in the form of a LOG)

Challenges:

• Disk performance is poor (vs memory)

• Cannot save all transactions to disk
• Memory typically several orders of magnitude faster

• Writing to disk to handle arbitrary crash is hard

• Several reasons, but HDDs and SSDs have buffers

• Same general idea: store enough data on disk so as to recover to a valid state after a crash:
• Shadow pages and Write-ahead Logging (WAL)

Shadow Paging vs WAL

• Shadow Pages
• Provide Atomicity and Durability, “page” = unit of storage
• Idea: When writing a page, make a “shadow” copy
• No references from other pages, edit easily!
• ABORT: discard shadow page
• COMMIT: Make shadow page “real”. Update pointers to
data on this page from other pages (recursive). Can be
done atomically
• Essentially “copy-on-write” to avoid in-place page update
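A minimal in-memory sketch of the shadow-page idea follows. The class `ShadowStore` and its methods are hypothetical, and a single reference assignment stands in for the atomic pointer update that a real system would perform on disk.

```python
class ShadowStore:
    """Copy-on-write pages: writers edit shadow copies, never live pages."""

    def __init__(self, pages):
        self.root = dict(pages)      # live page table: page id -> contents

    def begin(self):
        return {}                    # shadow pages private to one transaction

    def write(self, txn, page_id, data):
        txn[page_id] = data          # goes to the shadow copy only

    def read(self, txn, page_id):
        return txn.get(page_id, self.root[page_id])

    def abort(self, txn):
        txn.clear()                  # discard shadow pages; live pages untouched

    def commit(self, txn):
        new_root = {**self.root, **txn}
        self.root = new_root         # one assignment makes the shadows "real"


store = ShadowStore({"p1": "old", "p2": "keep"})
t1 = store.begin()
store.write(t1, "p1", "new")
before_commit = store.root["p1"]
store.commit(t1)
after_commit = store.root["p1"]
```

Until `commit` runs, other readers of `store.root` never see the new contents, which is what gives the scheme its atomicity.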

Shadow Paging vs WAL

• Write-Ahead-Logging
• Provide Atomicity and Durability
• Idea: create a log recording every update to database
• Updates considered reliable when stored on disk
• Updated versions are kept in memory (page cache)
• Logs typically store both REDO and UNDO operations
• After a crash, recover by replaying log entries to
reconstruct correct state

• WAL is more common: it needs fewer disk operations, and transactions are considered committed once the log is written.

Write-Ahead Logging

• View the log as a sequence of entries, each with a sequential number

• Log-Sequence Number (LSN)
• Database: fixed size PAGES, storage at page level
• Pages on disk, some also in memory (page cache)
• “Dirty pages”: page in memory differs from one on disk
• Reconstruction of global consistent state
• Log files + disk contents + (page cache)
• Logs consist of sequence of records
• Begin: LSN, TID (begins a TXN)
• End: LSN, TID, PrevLSN (finishes a TXN: abort or commit)
• Update: LSN, TID, PrevLSN, pageID, offset, old value, new value

Write-Ahead Logging

• Logs consist of sequence of records

• To record an update to state
• Update: LSN, TID, PrevLSN, pageID, offset, old value, new value
• PrevLSN forms a backward chain of operations for each TID
• Storing “old” and “new” values allows REDO operations to bring a page
up to date, or UNDO to revert an update to an earlier version
• Transaction Table (TT): All TXNS not written to disk
• Including Seq Num of the last log entry they caused
• Dirty Page Table (DPT): all dirty pages in memory
• Modified pages, but not written back to disk.
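The record layout and the two tables can be sketched as follows. Field names mirror the slide; the `Log` class, its `append` and `page_flushed` methods, and the choice to store in the DPT the LSN of the first update that dirtied each page are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    kind: str                        # "BEGIN", "UPDATE", or "END"
    tid: int
    prev_lsn: Optional[int] = None   # previous record of the same TID
    page_id: Optional[int] = None
    offset: Optional[int] = None
    old: Optional[bytes] = None
    new: Optional[bytes] = None


class Log:
    def __init__(self):
        self.records = []
        self.tt = {}     # Transaction Table: TID -> LSN of its last log entry
        self.dpt = {}    # Dirty Page Table: pageID -> LSN that first dirtied it

    def append(self, kind, tid, **fields):
        lsn = len(self.records) + 1
        rec = LogRecord(lsn, kind, tid, prev_lsn=self.tt.get(tid), **fields)
        self.records.append(rec)
        if kind == "END":
            self.tt.pop(tid, None)            # transaction is finished
        else:
            self.tt[tid] = lsn                # extend the PrevLSN chain
        if kind == "UPDATE":
            self.dpt.setdefault(rec.page_id, lsn)
        return rec

    def page_flushed(self, page_id):
        self.dpt.pop(page_id, None)           # page on disk is current again


log = Log()
log.append("BEGIN", 1)
u1 = log.append("UPDATE", 1, page_id=7, offset=0, old=b"a", new=b"b")
u2 = log.append("UPDATE", 1, page_id=7, offset=4, old=b"c", new=b"d")
```

Following `u2.prev_lsn` leads to `u1`, and from there to the Begin record, which is the backward chain that abort and the undo pass walk.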

Write-Ahead-Logging

• Commit a transaction
• Log file must be up to date on disk through the commit entry
• Don't update actual disk pages; the log file has the information
• Keep "tail" of log file in memory => those entries are not yet committed
• If the tail gets wiped out (crash), then partially executed
transactions will be lost. Can still recover to a reliable state
• Abort a transaction
• Locate last entry from TT, undo all updates so far
• Use PrevLSN to revert in-memory pages to start of TXN
• If page on disk needs undo, wait (come back to this)
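Aborting a transaction by following its PrevLSN chain might look like the sketch below. It covers only the in-memory case; writing the End record and handling pages already on disk are omitted. The dictionary layout and the name `abort` are hypothetical.

```python
def abort(log, tt, pages, tid):
    """Undo every update made by tid by walking its PrevLSN chain backward."""
    lsn = tt.pop(tid, None)          # last log entry caused by this transaction
    while lsn is not None:
        rec = log[lsn]
        if rec["kind"] == "UPDATE":
            pages[rec["page"]][rec["offset"]] = rec["old"]   # restore old value
        lsn = rec["prev"]            # None once we pass the Begin record


log = {
    1: {"kind": "BEGIN", "tid": 7, "prev": None},
    2: {"kind": "UPDATE", "tid": 7, "prev": 1,
        "page": "A", "offset": 0, "old": 10, "new": 11},
    3: {"kind": "UPDATE", "tid": 7, "prev": 2,
        "page": "A", "offset": 1, "old": 20, "new": 21},
}
tt = {7: 3}
pages = {"A": [11, 21]}              # in-memory page after both updates
abort(log, tt, pages, 7)
```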

Recovery using WAL – 3 passes

• Analysis Pass
• Reconstruct TT and DPT (from start or last checkpoint)
• Get copies of all pages at the start
• Recovery Pass (redo pass)
• Replay log forward, make updates to all dirty pages
• Bring everything to a state at the time of the crash
• Undo Pass
• Replay log file backward, revert any changes made by
transactions that had not committed (use PrevLSN)
• For each update undone, write a Compensation Log Record (CLR)
• Once you reach BEGIN TXN, write an END TXN entry

WAL can be integrated with 2PC

• WAL can integrate with 2PC

• Have additional log entries that capture 2PC operation
• Coordinator: Include list of participants
• Participant: Indicates coordinator
• Votes to commit or abort
• Indication from coordinator to Commit/Abort
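One way a participant might use such entries after a crash is sketched below. The record kinds (`VOTE_COMMIT`, `VOTE_ABORT`, `COMMIT`, `ABORT`) and the function `participant_outcome` are hypothetical names; the decision rule is the standard 2PC one, in which a participant that voted to commit but logged no decision must ask the coordinator.

```python
def participant_outcome(log, tid):
    """Decide what a recovering participant should do for transaction tid."""
    kinds = {rec["kind"] for rec in log if rec["tid"] == tid}
    if "COMMIT" in kinds:
        return "commit"              # coordinator's decision was logged
    if "ABORT" in kinds or "VOTE_ABORT" in kinds:
        return "abort"
    if "VOTE_COMMIT" in kinds:
        return "ask coordinator"     # in doubt: voted yes, outcome unknown
    return "abort"                   # never voted, safe to abort unilaterally


log = [
    {"tid": 1, "kind": "VOTE_COMMIT", "coordinator": "C"},
    {"tid": 1, "kind": "COMMIT"},
    {"tid": 2, "kind": "VOTE_COMMIT", "coordinator": "C"},
    {"tid": 3, "kind": "VOTE_ABORT", "coordinator": "C"},
]
outcomes = [participant_outcome(log, t) for t in (1, 2, 3, 4)]
```

The in-doubt case is why the participant's log entry indicates the coordinator: that is who it must contact to learn the outcome.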

Optimizing WAL

• As described earlier:
• Replay operations back to the beginning of time
• Log file would be kept forever, (entire Database)
• In practice, we can do better with CHECKPOINT
• Periodically save DPT, TT
• Store any dirty pages to disk, indicate in LOG file
• Prune initial portion of log file: all transactions up to the
checkpoint have been committed or aborted.
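A checkpoint with log pruning could be sketched as below. It assumes the Transaction Table maps each active TID to its last LSN, that flushing a page is delegated to a caller-supplied `flush_page` function, and that only the prefix of the log older than every active transaction's first record is pruned. All names are hypothetical.

```python
def checkpoint(log, tt, dpt, flush_page):
    """Flush dirty pages, record the checkpoint, prune the old log prefix."""
    for page_id in list(dpt):
        flush_page(page_id)          # store the dirty page to disk
    dpt.clear()
    lsn = log[-1]["lsn"] + 1 if log else 1
    log.append({"lsn": lsn, "kind": "CHECKPOINT", "tt": dict(tt)})
    active = set(tt)
    # Earliest record still needed to undo an active transaction.
    first_needed = min(
        (rec["lsn"] for rec in log if rec.get("tid") in active),
        default=lsn,
    )
    log[:] = [rec for rec in log if rec["lsn"] >= first_needed]
    return first_needed


log = [
    {"lsn": 1, "kind": "BEGIN", "tid": 1},
    {"lsn": 2, "kind": "UPDATE", "tid": 1, "page": "A"},
    {"lsn": 3, "kind": "END", "tid": 1},
    {"lsn": 4, "kind": "BEGIN", "tid": 2},
    {"lsn": 5, "kind": "UPDATE", "tid": 2, "page": "B"},
]
tt = {2: 5}                          # only transaction 2 is still active
dpt = {"B": 5}
flushed = []
kept_from = checkpoint(log, tt, dpt, flushed.append)
```

Records of transaction 1 can be dropped because it finished and its pages are on disk; records of transaction 2 are kept so it can still be undone if it never commits.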

Summary

• Real Systems (are often unreliable)

• Introduced basic concepts for Fault Tolerant Systems
including redundancy, process resilience, RPC

• Fault Tolerance – Backward recovery using checkpointing, both independent and coordinated

• Fault Tolerance – Recovery using Write-Ahead-Logging, balances the overhead of checkpointing
and ability to recover to a consistent state

Transactions: ACID Properties

• Atomicity: Each transaction completes in its entirety, or is aborted. If aborted, it should not
have any effect on the shared global state.
• Example: Update account balance on multiple servers

• Consistency: Each transaction preserves a set of invariants about global state. (exact nature is
system dependent).
• Example: in a bank system, law of conservation of $$

Transactions: ACID Properties
• Isolation: Also means serializability. Each
transaction executes as if it were the only one with
the ability to RD/WR shared global state.

• Durability: Once a transaction has been completed, or “committed” there is no going back.
In other words there is no “undo”.

• Transactions can also be nested

• “Atomic Operations” => Atomicity + Isolation
