
Practical Byzantine Fault Tolerance

Appears in the Proceedings of the Third Symposium on Operating Systems Design and Implementation, New Orleans, USA, February 1999


The Problem
• Provide a reliable answer to a computation even in the presence of Byzantine faults.
• A client would like to:
• Transmit a request
• Wait for enough matching replies (f+1, where f bounds the number of faulty nodes)
• Conclude that the answer is correct
The Model
• Networks are unreliable
• Can delay, reorder, drop, or retransmit messages
• Some fraction of nodes are unreliable
• May behave in any way, and need not follow the
protocol.
• Nodes can verify the authenticity of messages
Failures
• The system requires 3f+1 nodes to withstand f failures
• The f faulty nodes may not respond at all
• But among the n-f replies that do arrive, up to f may come from faulty nodes, since the silent nodes may merely have been slow
• So good replies must outnumber bad ones
• This holds if n-2f > f, i.e. n > 3f
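To make the arithmetic concrete, a minimal Python sketch (the function names are illustrative, not part of the protocol):

```python
def cluster_size(f: int) -> int:
    """Minimum number of replicas needed to tolerate f Byzantine faults."""
    return 3 * f + 1

def quorum_size(f: int) -> int:
    """Votes needed so that good nodes outnumber bad ones: after waiting
    for n - f responses, up to f of those may still be faulty, so we need
    n - 2f > f correct ones, i.e. n > 3f."""
    return 2 * f + 1

for f in range(1, 4):
    print(f"f={f}: n={cluster_size(f)}, quorum={quorum_size(f)}")
```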
Nodes
• Maintain state:
• A message log
• The current view number
• The service state
• Can perform a set of operations
• Need not be simple reads/writes
• Must be deterministic
• Well-behaved nodes must:
• Start in the same state
• Execute requests in the same order
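A minimal sketch of this per-replica state in Python (field names are assumptions for illustration, not taken from the paper):

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    replica_id: int
    view: int = 0                              # current view number
    log: list = field(default_factory=list)    # accepted protocol messages
    state: dict = field(default_factory=dict)  # the service state

    def execute(self, op):
        """Apply a deterministic operation to the service state.
        Well-behaved replicas that start in the same state and execute
        the same operations in the same order end in the same state."""
        return op(self.state)
```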
Views
• Operations occur within views
• For a given view, one node is designated the primary, and the others are backups
• Primary p = v mod n
• n is the number of nodes
• v is the view number
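The selection rule is simple enough to show directly (a sketch):

```python
def primary(view: int, n: int) -> int:
    """Round-robin primary selection: p = v mod n."""
    return view % n

# With n = 4 replicas, the primary rotates 0, 1, 2, 3, 0, 1 across views:
print([primary(v, 4) for v in range(6)])
```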
Protocol
A three-phase protocol:
• Pre-prepare: the primary proposes an order
• Prepare: the backups agree on the sequence number
• Commit: nodes agree to commit
Agreement
• Quorum based
• 2f+1 nodes must report the same value
• The system has 3f+1 nodes
• Any two subsets of size 2f+1 overlap in at least f+1 nodes, so they have at least one good node in common
• Good nodes don't lie
• So a quorum yields the same decision at each node
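The intersection claim can be checked by brute force for small f (a sketch, not part of the protocol):

```python
from itertools import combinations

f = 1
n = 3 * f + 1
nodes = range(n)

# Any two quorums of size 2f+1 out of 3f+1 nodes overlap in at least
# f+1 nodes; since at most f are faulty, at least one is good.
min_overlap = min(
    len(set(q1) & set(q2))
    for q1 in combinations(nodes, 2 * f + 1)
    for q2 in combinations(nodes, 2 * f + 1)
)
assert min_overlap >= f + 1
print(f"minimum quorum overlap: {min_overlap} (f+1 = {f + 1})")
```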
Messages
• The following messages are used by the protocol, and each is signed by its sender
• Request <o,t,c> (called m)
• Sent from the client to the primary
• Contains: the operation o, a timestamp t, and the client id c
• Reply <v,t,c,i,r>
• Sent from each replica i back to the client, carrying the result r
• Pre-prepare <v,n,d>, m
• Multicast from the primary to the backups
• Contains: the view number v, the sequence number n, and the digest d of m
• The request m itself may be transmitted separately
Messages
• Prepare <v,n,d,i>
• Sent amongst the backups
• Commit <v,n,d,i>
• Replica i is prepared to commit sequence number n in view v
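A sketch of these message formats as Python records (the spelled-out field names are assumptions; the signature each sender attaches is omitted):

```python
from dataclasses import dataclass

@dataclass
class Request:         # <o, t, c>, called m (client -> primary)
    operation: str     # o
    timestamp: int     # t
    client: int        # c

@dataclass
class Reply:           # <v, t, c, i, r> (replica i -> client)
    view: int
    timestamp: int
    client: int
    replica: int
    result: object     # r

@dataclass
class PrePrepare:      # <v, n, d> (primary -> backups; m travels alongside)
    view: int
    seq: int
    digest: str        # d: digest of m

@dataclass
class Prepare:         # <v, n, d, i> (sent amongst backups)
    view: int
    seq: int
    digest: str
    replica: int

@dataclass
class Commit:          # <v, n, d, i>
    view: int
    seq: int
    digest: str
    replica: int
```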

• Messages are accepted in each phase only if:

• The receiving node is in view v
• The sequence number n is within the accepted range (between the low and high water marks)
• The node has not received contradictory messages for the same view and sequence number
• The digest matches the digest computed from the request
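A sketch of these acceptance checks (names are assumptions; the paper calls the sequence-number bounds the low and high water marks):

```python
import hashlib

def compute_digest(request: bytes) -> str:
    """Digest of a request; the paper uses a collision-resistant hash."""
    return hashlib.sha256(request).hexdigest()

def accept(node_view, msg, low, high, accepted, request):
    """msg is a (view, seq, digest) triple; `accepted` maps (view, seq)
    to the digest already logged, to detect contradictory messages."""
    view, seq, digest = msg
    if view != node_view:                        # must be in the current view
        return False
    if not (low < seq <= high):                  # within the water marks
        return False
    prior = accepted.get((view, seq))
    if prior is not None and prior != digest:    # contradictory message
        return False
    return digest == compute_digest(request)     # digest must match
```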
Pre-prepare
• The client sends a request to the primary
• The primary assigns a sequence number to the request and multicasts a pre-prepare message
• Backups:
• Receive the pre-prepare message
• Validate it, and drop it if invalid
• Record the request, the pre-prepare message, and a newly generated prepare message in the log
• Multicast the prepare message to the other replicas
Prepare
• A prepare message indicates a backup's willingness to accept a given sequence number for a request
• Once a quorum of matching prepare messages is received, a commit message is multicast
Commit
• Nodes must ensure that enough nodes are prepared before applying changes, so:
• A node waits for a quorum of commit messages before applying a change
• Changes are applied in order of sequence number
• A change cannot be applied until all lower-numbered changes have been applied
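The prepare and commit conditions can be summarized as two predicates, loosely following the paper's prepared and committed-local definitions (a simplified sketch; the log is a list of tuples):

```python
def prepared(log, v, n, d, f):
    """True once the pre-prepare for (v, n, d) plus 2f matching prepare
    messages from distinct replicas are in the log."""
    have_preprepare = ("pre-prepare", v, n, d) in log
    prepare_senders = {m[4] for m in log
                       if m[0] == "prepare" and m[1:4] == (v, n, d)}
    return have_preprepare and len(prepare_senders) >= 2 * f

def committed_local(log, v, n, d, f):
    """True once prepared holds and 2f+1 matching commit messages are
    in the log; only then may the change be applied."""
    commit_senders = {m[4] for m in log
                      if m[0] == "commit" and m[1:4] == (v, n, d)}
    return prepared(log, v, n, d, f) and len(commit_senders) >= 2 * f + 1

log = [("pre-prepare", 0, 1, "d1"),
       ("prepare", 0, 1, "d1", 1), ("prepare", 0, 1, "d1", 2)]
print(prepared(log, 0, 1, "d1", f=1))   # True: pre-prepare + 2f prepares
```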
Truncating the log
• Checkpoints are taken at regular intervals
• Every request is either in the log or already reflected in a stable checkpoint
• Each node maintains multiple copies of the state:
• A copy of the last proven checkpoint
• Zero or more unproven checkpoints
• The current working state
• A node multicasts a checkpoint message when it generates a new checkpoint
• A checkpoint is proven when a quorum agrees on it
• The checkpoint then becomes stable
• The log is truncated and older checkpoints are discarded
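A sketch of checkpoint stabilization (the message layout is an assumption for illustration):

```python
from collections import defaultdict

def stable_checkpoints(checkpoint_msgs, f):
    """A checkpoint (seq, state_digest) is proven, and becomes stable,
    once 2f+1 replicas have sent matching checkpoint messages for it."""
    votes = defaultdict(set)
    for seq, state_digest, replica in checkpoint_msgs:
        votes[(seq, state_digest)].add(replica)
    return {cp for cp, voters in votes.items() if len(voters) >= 2 * f + 1}

# Three of four replicas agree on the state at sequence number 100:
msgs = [(100, "h1", 0), (100, "h1", 1), (100, "h1", 2), (100, "h2", 3)]
print(stable_checkpoints(msgs, f=1))   # {(100, 'h1')}
```

Once a checkpoint is stable, log entries at or below its sequence number can be discarded.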
View change
• The view change mechanism protects against faulty primaries
• Backups propose a view change when a timer expires
• The timer runs whenever a backup has accepted some message and is waiting to execute it
• Once a view change is proposed, the backup no longer does work (other than checkpointing) in the current view
View change 2
• A view change message contains:
• The sequence number of the last request covered by the stable checkpoint
• The checkpoint messages proving it
• For each prepared but uncheckpointed request, its pre-prepare message
• And proof that it was prepared
• The new primary declares a new view when it receives a quorum of view change messages
New view
• The new primary computes:
• The maximum checkpointed sequence number
• The maximum sequence number among uncheckpointed (prepared) messages
• It then constructs new pre-prepare messages for every sequence number in between (see the sketch below):
• Either a pre-prepare in the new view for a message prepared in a previous view
• Or a no-op pre-prepare, so there are no gaps in the sequence numbers
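A sketch of that gap-filling computation (names and layout are assumptions; the paper uses a special null request for the no-ops):

```python
def new_view_preprepares(new_view, min_s, prepared_digests):
    """min_s is the latest stable checkpoint's sequence number and
    prepared_digests maps prepared sequence numbers to request digests;
    every gap in between is filled with a no-op pre-prepare."""
    NOOP = "null"   # stands in for the digest of a no-op request
    max_s = max(prepared_digests, default=min_s)
    return [("pre-prepare", new_view, n, prepared_digests.get(n, NOOP))
            for n in range(min_s + 1, max_s + 1)]

# Requests 101 and 103 were prepared; 102 gets a no-op to avoid a gap:
print(new_view_preprepares(new_view=2, min_s=100,
                           prepared_digests={101: "d101", 103: "d103"}))
```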
New view 2
• The new primary sends a new view message
• It contains all of the view change messages
• And all of the computed pre-prepare messages
• Recipients verify:
• The pre-prepare messages
• That they have the latest checkpoint
• If not, they can fetch a copy
• Each recipient then sends a prepare message for each pre-prepare and enters the new view
Controlling View Changes
• To avoid moving through views too quickly, nodes will wait longer before proposing the next view change if:
• No useful work was done in the previous view (i.e. only re-execution of previous requests)
• Or enough nodes accepted the change, but no new view was declared
• If a node receives f+1 view change requests with higher view numbers:
• It sends its own view change message for the smallest of those view numbers
• This is safe, because at least one of the f+1 messages came from a non-faulty replica
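A sketch of that rule (representing the received requests as a map from replica id to proposed view is an assumption):

```python
def maybe_join_view_change(current_view, proposals, f):
    """If f+1 distinct replicas propose views above ours, at least one
    of them is non-faulty, so it is safe to join with the smallest
    such view rather than wait for our own timer."""
    higher = [v for v in proposals.values() if v > current_view]
    if len(higher) >= f + 1:
        return min(higher)   # send our own view change for this view
    return None              # keep waiting in the current view
```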
Nondeterminism
• The model requires that requests be deterministic
• But this is not always the case
• E.g. updating a timestamp using the current clock
• Two solutions:
• Let the primary propose a value
• Create a <value, request> pair and proceed as before
• Or allow the backups to select values
• The primary waits for 2f+1 proposed values
• Then the three-phase protocol starts as usual
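A sketch of the first solution (names are illustrative):

```python
import time

def primary_choose_value(request):
    """The primary picks the nondeterministic value once, and the
    <value, request> pair is what the three-phase protocol orders,
    so every replica executes with the same value."""
    value = int(time.time())   # e.g. the primary's current clock
    return (value, request)    # ordered and executed as a unit
```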
Optimizations
• Don't send f+1 full results back to the client
• Instead, one replica sends the result and the others send digests
• If they don't match, the client retries with the full protocol
• Tentative commit
• After the prepare phase, a backup may tentatively execute the request
• The client waits for a quorum of matching tentative replies; otherwise it retries and waits for f+1 replies
• Read-only requests
• Clients multicast directly to the replicas
• Replicas execute the request, wait until no tentative requests are pending, and return the result
• The client waits for a quorum of results
Implementation
• The protocol is implemented as a replication library
• The library provides no mechanism to change views
• Uses upcalls to allow servers to:
• Invoke requests (client)
• Execute requests
• Create and delete checkpoints
• Retrieve checkpoints
• Compute digests (of checkpoints)
Implementation 2
• Communication
• UDP for point-to-point communication
• UDP multicast for group communication
Micro benchmark
• Compares a service that executes a no-op operation
• An unreplicated single server vs. replication using the protocol
BFS
• An implementation of NFS using the replication library
• Looks like normal NFS to clients
• The replication library runs requests via a relay
• The server maintains filesystem state in memory-mapped files
BFS 2
• The server maintains at most two checkpoints
• Using copy-on-write
• Digests are computed incrementally
• For efficiency
