Vincent Gramoli
University of Sydney and EPFL
Vincent Gramoli
Data61-CSIRO and University of Sydney
Consensus is a fundamental problem of distributed computing. While this problem has been
known to be unsolvable in asynchronous networks since 1985, protocols have been designed over the
past three decades to solve consensus under various assumptions. Today, with the recent advent of
blockchains, various consensus implementations have been proposed to make replicas reach an agreement
on the order of transactions updating what is often referred to as a distributed ledger. Very little
work has, however, been devoted to exploring their theoretical ramifications. As a result, existing
proposals are sometimes misunderstood and it is often unclear whether the problems arising during their
executions are due to implementation bugs or more fundamental design issues.
In this paper, we discuss the mainstream blockchain consensus algorithms and how the classic
Byzantine consensus can be revisited for the blockchain context. In particular, we discuss proof-
of-work consensus and illustrate the differences between the Bitcoin and the Ethereum proof-of-
work consensus algorithms. Based on these definitions, we warn about the dangers of using these
blockchains without understanding precisely the guarantees their consensus algorithm offers. In par-
ticular, we survey attacks against the Bitcoin and the Ethereum consensus algorithms. We finally
discuss the advantage of the recent Blockchain Byzantine consensus definition over previous defini-
tions, and the promises offered by emerging consistent blockchains.
1 Introduction
The blockchain technology [33] promises to radically transform the way individuals and companies
exchange digital assets and securely track ownership of these assets without the control of a central
authority. At its heart lies a distributed ledger that is consistent with high probability when particular
assumptions are fulfilled. In particular, the distributed set of participants guarantees its consistency
despite potentially malicious participants that behave arbitrarily, also called Byzantine failures [29].
The novelty of blockchain is a genuine combination of well-known research results taken from
distributed computing, cryptography and game theory. Its distributed nature guarantees the persis-
tence of the ledger data. Its public key crypto-system offers the capabilities for a user to sign trans-
actions that transfer assets from her account to other accounts. Its incentive mechanisms guarantee
that a subset of participants maintain the validity of the transactions. And finally, a Byzantine tolerant
consensus protocol aims at guaranteeing the integrity of the ledgers by defining a total order on newly
appended blocks of transactions.
Put into the blockchain context, the consensus problem is for the non-faulty or correct processes of
a distributed system to agree on one block of transactions at a given index of a chain of blocks. This
consensus problem can be stated along three properties: (i) agreement: no two correct processes
decide different blocks; (ii) validity: the decided block is a block that was proposed by one process;
(iii) termination: all correct processes eventually decide. A protocol solving the consensus problem is
necessary to guarantee that blocks are totally ordered, hence preventing concurrently appended blocks
from containing conflicting transactions.
Today, with the recent advent of blockchains, various consensus implementations have been proposed
to make replicas reach an agreement on the order of blocks of transactions updating the distributed
ledger. However, consensus has been known to be unsolvable since 1985. While protocols have been
designed over the past three decades to solve consensus under various assumptions, it remains unclear
what guarantees blockchain consensus algorithms offer and what conditions are necessary
for these guarantees to be satisfied. While the source code of most blockchain protocols is
publicly available, the theoretical ramifications of the blockchain abstraction are rather informal. As
the main blockchain systems, like Bitcoin [33] and Ethereum [46], are now used to trade millions of US$
every day, it has become crucial to precisely identify their theoretical ramifications to anticipate the
situations where large volumes of assets could be lost.
In this paper, we illustrate the danger of using proof-of-work blockchains without understanding
precisely their guarantees by listing vulnerabilities that affect the predominant proof-of-work
blockchain systems, namely Bitcoin and Ethereum. To this end, we describe the consensus algorithms
at the heart of these two blockchain systems. We also relate these consensus algorithms to decades of
research on the topic of distributed computing. More precisely, we identify situations where
proof-of-work blockchain consensus is violated by: (i) presenting a survey of existing attacks against
the Bitcoin consensus protocol and (ii) showing how Ethereum, which copes with some of these attacks,
may suffer from recent attacks, namely the blockchain anomaly [35] and the balance attack [34]. We
elaborate on the risks for users of misconfiguring proof-of-work blockchain systems when deploying them
as private or consortium blockchains, and report on our own experience with the settings of the R3
Ethereum testbed. The fact that both main proof-of-work blockchains are vulnerable allows us to conclude
that more research is necessary to design safe consensus algorithms suited for blockchains.
The rest of the paper is organized as follows. Section 2 presents the general blockchain model.
Section 3 introduces the classic Byzantine consensus problem and its probabilistic variant.
Section 4 specifies the differences between the consensus algorithms used in Bitcoin and Ethereum. Section 5
describes the attacks against Bitcoin and two recent attacks against the Ethereum consensus algorithm.
Section 6 redefines the Byzantine consensus in the light of the blockchain context. Section 7 discusses
the consortium model and recent reliable consensus proposals. Section 8 concludes.
transfer of digital assets or coins from the account of the source process $p_i$ to the account of a destination
process $p_j \neq p_i$. Each transaction is uniquely identified and broadcast to all processes in a best-effort
manner. We assume that a process re-issuing the same transfer multiple times creates as many distinct
Processes that initiate the consensus protocol are called miners; they initiate the consensus through
a propose function depicted at lines 7–12 of Alg. 1, allowing them to propose new blocks. Processes
decide upon a new block at a given index at line 18 depending on a function get-main-branch that is
specific to the type of proof-of-work blockchain system in use (cf. Sections 4.3.1 and 4.3.2 for the
corresponding Bitcoin and Ethereum functions, respectively). We refer to the computational power of a
miner as its mining power and we denote by t the total mining power, that is, the sum of the mining
powers of all miners in V. Each miner tries to group a set T of transactions it heard about into a
block b ⊇ T as long as the transactions of T do not conflict and the account balances remain
non-negative. For the sake of simplicity in the presentation, the graph G is static, meaning that no
process can join or leave the system; however, processes may fail as described in Section 2.3.
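The grouping of non-conflicting transactions into a block can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout of a transaction, the function name `assemble_block` and the greedy, arrival-order selection are ours and are not part of Alg. 1.

```python
def assemble_block(balances, pending):
    """Greedily select transactions that keep every balance non-negative.

    balances: dict account -> current balance
    pending:  list of (tx_id, source, destination, amount) tuples
              (hypothetical layout, in the order the miner heard of them)
    """
    projected = dict(balances)       # balances after the selected transfers
    block, seen = [], set()
    for tx_id, src, dst, amount in pending:
        if tx_id in seen:
            continue                 # same transaction heard twice
        if amount <= 0 or src == dst:
            continue                 # not a transfer between distinct accounts
        if projected.get(src, 0) < amount:
            continue                 # would make the source balance negative
        projected[src] = projected.get(src, 0) - amount
        projected[dst] = projected.get(dst, 0) + amount
        seen.add(tx_id)
        block.append(tx_id)
    return block, projected
```

A transaction that would overdraw its source account is simply left out of the block; a real miner could also reorder or prioritize transactions, which this sketch does not attempt.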
Figure 1: The global state $\ell_0$ of a blockchain results from the union of the distributed local views $\ell_1$, $\ell_2$
and $\ell_3$ of the blockchain. Panels: (a) view $\ell_1$; (b) view $\ell_2$; (c) view $\ell_3$; (d) global state $\ell_0 = \ell_1 \cup \ell_2 \cup \ell_3$.
• Agreement: no two correct processes decide different blocks;
• Termination: all correct processes eventually decide a block;
• Validity: the decided block is a block proposed by some process.
An algorithm has to fulfil these three properties to solve the Byzantine Consensus problem.
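As a minimal illustration, these properties can be checked on a finished execution. The sketch below assumes a hypothetical trace format (dictionaries keyed by process identifier) that is not part of the model; termination is only checked in the finite sense that every correct process has recorded a decision.

```python
def check_consensus(proposals, decisions, correct):
    """Check agreement, validity and finite-trace termination.

    proposals: dict process -> proposed block (hypothetical trace format)
    decisions: dict process -> decided block (absent if undecided)
    correct:   set of processes assumed to be correct
    """
    decided = [decisions[p] for p in correct if p in decisions]
    # Agreement: no two correct processes decide different blocks.
    agreement = len(set(decided)) <= 1
    # Validity: every decided block was proposed by some process.
    validity = all(block in proposals.values() for block in decided)
    # Termination (finite trace): every correct process has decided.
    termination = all(p in decisions for p in correct)
    return {"agreement": agreement, "validity": validity,
            "termination": termination}
```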
Blockchain systems operate over a network, like the Internet, in which the assumption of commu-
nication synchrony, where every message gets delivered within a known period of time, might be unre-
alistic. Unfortunately, consensus is known to be unsolvable in asynchronous networks even in the case
of a simple crash failure [21]. To cope with this impossibility, various proposals relaxed the guarantees
of the classic Byzantine consensus in favor of probabilistic guarantees by exploiting randomization.
Figure 2: The blockchain structure starts with a genesis block at index 0 and links successive blocks in
reverse order of their index; a new block is decided at index i > 0 when the blockchain depth reaches
i + k (note that a blockchain of depth 0 is the genesis block)
Interestingly, one could propose a definition different from the problem solved by Bitcoin-NG by
defining the termination of the consensus protocol used in Bitcoin [33] as follows. One
can observe the creation of distinct blocks at the same index of a blockchain as a transient violation of
agreement, as depicted with the two blocks at index i + k in Figure 2. Under the synchrony assumption,
however, a reorganization (cf. Section 4.3) guarantees with high probability that the block at index i is
uniquely decided when the chain depth reaches i + k. Garay et al. [22] noted that this probability
approaches 1 exponentially fast with the depth. Applications can then consider the depth reaching i + k as
the termination of consensus for the block at index i, indicating that the transactions of this block are
successfully committed [25] and, for example, that goods bought by these transactions can be shipped.
Following this reasoning and selecting an appropriate parameter k, one can show that Bitcoin can,
in principle (provided that message delays are bounded and that correct processes have sufficient
computational power), solve the Monte Carlo Byzantine consensus problem (described below).
• Probabilistic agreement: no two correct processes decide different blocks with probability at least δ;
• Termination: all correct processes eventually decide a block;
• Validity: the decided block is a block proposed by some process.
An algorithm has to fulfil these three properties to solve the Monte Carlo Byzantine Consensus problem.
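How the agreement probability depends on the depth can be illustrated with the attacker catch-up calculation of the Bitcoin white paper [33]. The sketch below is that simplified model, not the analysis of [22]: it assumes an attacker holding a fraction q of the total mining power, and the function names are ours.

```python
import math

def attacker_success(q, z):
    """Probability that an attacker with mining-power fraction q ever
    overtakes a branch that is z blocks ahead (white-paper model)."""
    p = 1.0 - q
    if q >= p:
        return 1.0                   # a majority attacker always catches up
    lam = z * q / p                  # expected attacker progress meanwhile
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total

def smallest_depth(q, epsilon, limit=1000):
    """Smallest z whose overtaking probability is at most epsilon."""
    for z in range(limit + 1):
        if attacker_success(q, z) <= epsilon:
            return z
    return None                      # no depth up to `limit` is sufficient
```

In this model, `smallest_depth(0.1, 0.001)` returns 5: with 10% of the mining power, the attacker overtakes a branch that is 5 blocks ahead with probability below 0.1%.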
This variant of consensus would guarantee that the blockchain application returns a (sometimes
incorrect) result to the client. To avoid being unresponsive, the application could decide on a timeout
after which it considers the transaction successful even though the blockchain consensus did not
acknowledge this success. For example, a merchant could wait for a predetermined period during which
she observes any possible invalidation of the transaction by the blockchain. After this period and if no
invalidation occurred, the transaction is considered valid. Note that there exist variants of the Monte
Carlo consensus problem where the validity is also probabilistic [31].
In the worst case scenario, the merchant may be wrong and the transaction may eventually be
considered invalid, in which case the merchant will lose goods. Provided that this scenario occurs
with a sufficiently small probability over all transactions, the merchant can predetermine her waiting
period based on her expected gain over a long series of transactions.
2. Unique-pointer-per-block assumption: Each non-genesis block contains exactly one hash of another
block, hence its outdegree is 1.
Algorithm 1 describes the progressive construction of the blockchain at a particular node pi upon
reception of blocks from other processes by simply aggregating the newly received blocks to the known
blocks (lines 13–15). As every added block contains a hash to a previous block that eventually leads
back to the genesis block indicated by its parent field, each block is associated with a fixed index. By
convention we consider the genesis block at index 0, and the blocks at k hops away from the genesis
block as the blocks at index k. As an example, consider the simple blockchain $\ell_1 = \langle B_1, P_1 \rangle$ depicted
in Figure 1(a) where $B_1 = \{g, b_1\}$ and $P_1 = \{\langle b_1, g \rangle\}$. The genesis block $g$ has index 0 and the block $b_1$
has index 1.
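The index convention can be sketched by following the parent pointers back to the genesis block. The dictionary representation of the pointer set and the function name below are assumptions made for illustration.

```python
def block_index(block, parent, genesis="g"):
    """Return the number of hops from `block` back to the genesis block.

    parent: dict mapping each non-genesis block to the block it points to
            (one pointer per block, as in the unique-pointer assumption)
    """
    hops = 0
    while block != genesis:
        block = parent[block]        # follow the hash pointer of the block
        hops += 1
    return hops

# The view of Figure 1(a): blocks {g, b1} and the single pointer <b1, g>.
P1 = {"b1": "g"}
```

With this view, `block_index("g", P1)` is 0 and `block_index("b1", P1)` is 1. A block whose ancestors are not yet known locally raises a `KeyError` in this sketch.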
The point where distinct blocks of the global blockchain DAG have the same predecessor block is called
a fork. As an example Figure 1(d) depicts a fork with two branches pointing to the same block: g in this
In the remainder of this paper, we refer to the DAG as a tree rooted in g with upward pointers
allowing children blocks to point to their parent block.
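To make the notions of global state and fork concrete, the following sketch merges local views and reports the blocks that have several children. Representing a view by its set of (child, parent) pointers is an assumption of the sketch; the two views used in the test correspond to the pointers from b1 to g and from b2 to g.

```python
from collections import defaultdict

def merge_views(*views):
    """Global set of pointers: the union of the local pointer sets."""
    return set().union(*views)

def find_forks(pointers):
    """Map each block with several children to the sorted list of children.

    pointers: iterable of (child, parent) pairs
    """
    children = defaultdict(set)
    for child, parent in pointers:
        children[parent].add(child)
    return {block: sorted(kids)
            for block, kids in children.items() if len(kids) > 1}
```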
Algorithm 2 The additional field and functions used by the Bitcoin consensus at pi
19: m = 5, the number of blocks to be appended after the block containing
20: tx, for tx to be committed in Bitcoin
Algorithm 2 depicts the Bitcoin-specific pseudocode that includes its consensus protocol to decide
on a particular block at some index (lines 21–31) and the choice of parameter m (line 19) explained later
in Section 4.4. When a fork occurs, the Bitcoin protocol resolves it by selecting the deepest branch as
the main branch (lines 21–28) by iteratively selecting the root of the deepest subtree (line 24). When
process pi is done with this pruning, it obtains the main branch of its blockchain view. Note that
the pseudocode for checking whether a block is decided and a transaction committed based on this
parameter m is common to Bitcoin and Ethereum, and was presented in lines 13–18 of Alg. 1; only the
parameter m used in these lines differs between the Bitcoin consensus algorithm (Alg. 2, line 19) and
this variant of the Ethereum consensus algorithm (Alg. 3, line 19).
6 At the time of writing, the Ethereum consensus algorithm in use differs significantly from the GHOST protocol. For the sake
of simplicity, we will focus here on the GHOST protocol when referring to the Ethereum consensus algorithm.
Figure 3: Nakamoto’s consensus protocol at the heart of Bitcoin selects the main branch as the deepest
branch (in black) whereas the GHOST consensus protocol at the heart of Ethereum follows the heaviest
subtree (in grey)
Algorithm 3 The additional field and functions used by the Ethereum consensus at pi
19: m = 11, the number of blocks to be appended after the block containing
20: tx, for tx to be committed in Ethereum (since Homestead v1.3.5)
The main difference between the Bitcoin and Ethereum consensus protocols is depicted in Figure 3,
where the black blocks represent the main branch selected by Nakamoto’s consensus protocol and the
grey blocks represent the main branch selected by GHOST.
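The two selection rules can be contrasted on a small tree. The sketch below is not a transcription of Alg. 2 or Alg. 3: the tree `TREE` is hypothetical (it is not the tree of Figure 3), the function names are ours, and ties are broken by block name, which is an assumption.

```python
from collections import defaultdict

def children_map(parents):
    """Invert the child -> parent pointers into parent -> children."""
    children = defaultdict(list)
    for child, parent in parents.items():
        children[parent].append(child)
    return children

def depth(block, children):
    """Length of the longest chain rooted at `block` (deepest-branch rule)."""
    return 1 + max((depth(c, children) for c in children.get(block, [])),
                   default=0)

def weight(block, children):
    """Number of blocks in the subtree of `block` (heaviest-subtree rule)."""
    return 1 + sum(weight(c, children) for c in children.get(block, []))

def main_branch(genesis, parents, score):
    """Walk down from the genesis block, picking the best child each time."""
    children = children_map(parents)
    branch = [genesis]
    while children.get(branch[-1]):
        kids = sorted(children[branch[-1]])      # deterministic tie-break
        branch.append(max(kids, key=lambda c: score(c, children)))
    return branch

# Hypothetical tree: a long chain under a, a bushier subtree under b.
TREE = {"a": "g", "a1": "a", "a2": "a1", "a3": "a2",
        "b": "g", "b1": "b", "b2": "b", "b3": "b", "b4": "b1"}
```

On `TREE`, the deepest-branch rule (`score=depth`) follows g, a, a1, a2, a3, whereas the heaviest-subtree rule (`score=weight`) follows g, b, b1, b4, because the subtree of b contains five blocks against four for the subtree of a.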
7 This period increases regularly under the influence of a recent algorithm called “the time bomb” that adapts the difficulty of
4.4 Decided blocks and committed transactions
A blockchain system S must define when the block at an index is agreed upon. To this end, it has
to define a point in its execution where a prefix of the main branch can be “reasonably” considered
as persistent. More precisely, there must exist a parameter m provided by S for an application to
consider a block as decided and its transactions as committed. This parameter is typically $m_{\mathrm{bitcoin}} = 5$ in
Bitcoin (Alg. 2, line 19) and $m_{\mathrm{ethereum}} = 11$ in Ethereum (Alg. 3, line 19). Note that these two choices
do not lead to the same probability of success [23] and different numbers are suggested by different
applications [35].
Definition 3 (Transaction commit). Let $\ell_i = \langle B_i, P_i \rangle$ be the blockchain view at node $p_i$ in system S. For a
transaction tx to be locally committed at $p_i$, the conjunction of the following properties must hold in $p_i$’s view
$\ell_i$:
1. Transaction tx has to be in a block $b_0 \in B_i$ of the main branch of system S. Formally,
$tx \in b_0 \wedge b_0 \in B'_i : c_i = \langle B'_i, P'_i \rangle = \text{get-main-branch}()_i$.
2. There should be a subsequence of m blocks $b_1, \ldots, b_m$ appended after block $b_0$. Formally, $\exists b_1, \ldots, b_m \in B_i$ :
$\langle b_1, b_0 \rangle, \langle b_2, b_1 \rangle, \ldots, \langle b_m, b_{m-1} \rangle \in P_i$. (In short, we say that $b_0$ is decided.)
A transaction tx is committed if there exists a process pi where tx is locally committed.
Property (1) is needed because processes eventually agree on the main branch that defines the cur-
rent state of accounts in the system—blocks that are not part of the main branch are ignored. Property
(2) is necessary to guarantee that the blocks and transactions currently in the main branch will persist
and remain in the main branch. Before these additional blocks are created, processes may not have
reached consensus regarding the unique block b at index j in the chain. This is illustrated by the fork
of Figure 1 where processes consider, respectively, the pointer $\langle b_1, g \rangle$ and the pointer $\langle b_2, g \rangle$ in their
local blockchain view. By waiting for m blocks, where m is given by the blockchain system, the system
guarantees with a reasonably high probability that processes will agree on the same block b.
For example, consider a fictive blockchain system with $m_{\mathrm{fictive}} = 2$ that selects the heaviest branch
(Alg. 3, lines 21–28) as its main branch. If the blockchain state was the one depicted in Figure 3, then
blocks $b_2$ and $b_5$ would be decided and all their transactions would be committed. This is because they
are both part of the main branch and they are followed by at least 2 blocks, $b_8$ and $b_{13}$. (Note that we
omit the genesis block as it is always considered decided but does not include any transaction.)
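The commit rule of Definition 3 can be sketched as follows, given a main branch that has already been selected. The list and dictionary representations and the function names are assumptions; the branch used in the usage note is taken from the example above and is not checked against Figure 3.

```python
def decided_blocks(main_branch, m):
    """Blocks of the main branch that are followed by at least m blocks.

    main_branch: list of blocks from the genesis block (first) to the tip
    The genesis block is omitted, as it carries no transaction.
    """
    last_decided = len(main_branch) - 1 - m
    return [b for i, b in enumerate(main_branch) if 1 <= i <= last_decided]

def is_committed(tx, transactions, main_branch, m):
    """True if tx belongs to a decided block of the main branch.

    transactions: dict block -> set of transactions it contains
    """
    return any(tx in transactions.get(b, set())
               for b in decided_blocks(main_branch, m))
```

For the branch `["g", "b2", "b5", "b8", "b13"]` and m = 2, `decided_blocks` returns b2 and b5; with m = 11 no block of such a short branch is decided.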
have successfully used Ethereum to transfer digital assets based on parameter $m_{\mathrm{ethereum}} = 11$ [35].
to double spend in Bitcoin and Ethereum, and some already translated into significant financial losses.
As we describe in Sections 5.2 and 5.3, some of these issues are not specific to the Bitcoin consensus
protocol but could also occur with the Ethereum consensus protocol. With the advent of consortium
and private blockchains, some of these factors are even simply produced by a misconfiguration of the
deployed blockchain systems.
Figure 4: The blockchain anomaly: a first client issues $t_i$ that gets successfully mined and committed,
then a second client issues $t_j$, with $t_j$ being conditional on the commit of $t_i$ (note that $j \geq i + k$ for $t_i$
to be committed before $t_j$ gets issued), but the transaction $t_j$ gets finally reorganized and successfully
committed before $t_i$, hence violating the dependency between $t_i$ and $t_j$. Steps shown in the figure:
1. $t_i$ is proposed; 2. $t_i$ appears committed; 3. $t_j$ is proposed by another node; 4. $t_j$ is committed first.
chain the reward system does not necessarily incentivize many processes to mine correctly. Note that
in the R3 experiments not all processes were mining because it was decided they would not do so [34].
• Agreement: no two correct processes decide different blocks;
• Termination: all correct processes eventually decide a block;
• Validity: a decided block is valid, i.e., it satisfies the predefined predicate valid.
An algorithm has to fulfil these three properties to solve the Blockchain Byzantine Consensus problem.
As far as we know, the only algorithm that solves the Blockchain Byzantine Consensus problem is
called the Democratic BFT (DBFT) and was first formalized in [15]. The reason why this definition of
consensus is better suited to blockchain is twofold. First, the valid predicate allows the blockchain to
decide a block of transactions that was proposed by Byzantine participants. This difference is possible
thanks to the use of the valid predicate that defines the validity of a block proposed by a Byzantine
participant. Without this valid predicate, the decided value could not be one of the values proposed
by a Byzantine participant as these are undefined. Second, the decided value does not need to be one of
the proposed values. This allows deciding a number of transactions that potentially grows with the number
of participants. To solve the classic Byzantine consensus, only one of the proposed blocks could be
decided, hence limiting the number of decided blocks to 1 out of the n − t blocks of transactions proposed
by correct participants. To solve the Blockchain consensus, however, the decided block could represent
the union of all the n − t blocks proposed by correct participants.
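The second point can be illustrated by a block that combines several proposals while satisfying a validity predicate. The sketch below is not DBFT [15] and contains no agreement protocol; the predicate `no_double_spend`, the (coin, recipient) layout of a transaction and the function names are hypothetical.

```python
def no_double_spend(block):
    """Hypothetical valid predicate: each coin is spent at most once."""
    coins = [coin for coin, _recipient in block]
    return len(coins) == len(set(coins))

def combine_proposals(proposals, valid):
    """Build one block from the union of proposals, keeping it valid.

    proposals: list of blocks, each a list of (coin, recipient) pairs
    valid:     predicate over a candidate block
    """
    combined = []
    for block in proposals:
        for tx in block:
            # Add a transaction only if the block stays valid with it.
            if tx not in combined and valid(combined + [tx]):
                combined.append(tx)
    return combined
```

The resulting block need not equal any individual proposal: it can carry transactions from every proposer, which is what lets the number of decided transactions grow with the number of participants.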
1. Permissioned: only a specified set of institutions can participate in the consensus of the consortium
blockchain. The fact that each participant needs a permission to participate in the consensus
does not prevent other users from potentially accessing the current state of the blockchain; they simply
cannot take part in the decision process. The appealing aspect of this consortium is that the
decision is not controlled by a leader [26] or a single institution as in the case of fully-private
blockchains. This is also in contrast with the permissionless Bitcoin and Ethereum main chains in
which any participant connected to the Internet can join at any time, and alleviates the problem of
having an uncontrollable number of nodes wasting resources.
2. Global knowledge: given that the membership is pre-determined, one can reasonably assume
that most participants are aware of the exact list of the n participants of the consortium. Consequently,
any participant that lags behind simply needs to contact a majority or a quorum of
participants to catch up with the most up-to-date system size n. Moreover, this fixed list of participants
naturally prevents an attacker from executing a Sybil attack by forging multiple identities
it can control.
3. Bound on the number of failures: given that the list of participants is known, one can reason-
ably assume that a malicious participant cannot convince the consortium to introduce a large
number of fake identities in comparison to the consortium size. Moreover, one can assume that
new participants go through a detailed KYC (know-your-customer) process before getting the
permission to join the consortium. This makes it realistic to limit coalitions of f malicious participants
to $f \ll n$ at the same time.
Despite these differences, the consortium blockchain model is close to the original blockchain model
as we explain below.
1. The failure model is the same as in the classic model. It is still necessary to tolerate Byzantine
failures in a consortium model as the participating institutions can have conflicting interests, and
the blockchain should protect from the possible misbehavior of a participant.
2. The communication model is also the same as in the classic model. These institutions may be
located in different regions of the globe and typically communicate through the Internet when issuing
transactions. The Internet is unpredictable and the delay of a message cannot be known in
advance. In particular, the Internet is shared by machines external to the consortium
and is subject to large localized failures due to disasters; it is thus impossible to control or even
anticipate traffic disruptions, congestion and delays.
Provided that $f < n/3$, practical Byzantine fault tolerant solutions could be used realistically to solve
the Byzantine consensus in the consortium blockchain model and without the need for proof-of-work.
Of course, practical Byzantine fault tolerant solutions remain quite limited for several reasons: (i) they
usually require a leader election that is difficult to implement and that conflicts with the inherent de-
centralization aim of blockchains: in particular it is impossible to elect a correct leader as a Byzantine
leader could act correctly up to the point where it gets elected, (ii) they often employ complex tech-
niques to circumvent the impossibility result like a global random coin that returns the same random
value to any process and whose values cannot be anticipated by Byzantine processes, and (iii) they
typically rely on costly public-key cryptosystems to guarantee authentication.
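As a small worked example of the bound f < n/3, the largest number of Byzantine participants compatible with a consortium of size n can be computed as follows. The function names are ours.

```python
def tolerates(n, f):
    """True if f Byzantine participants satisfy f < n/3 (integer test)."""
    return 3 * f < n

def max_byzantine(n):
    """Largest f such that f < n/3, for a consortium of n >= 1 participants."""
    return (n - 1) // 3
```

A consortium of 4 participants thus tolerates 1 Byzantine participant and a consortium of 10 tolerates 3, whereas 3 participants tolerate none.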
7.3.1 Ripple
The consensus protocol of Ripple, the third largest digital currency in market capitalization, was
proposed as a white paper and uses mutually intersecting sets of replicas, also known as quorums [41].
The protocol bootstraps with a hard-coded list of initial replicas. Each node requests a different list of
replicas, also called a unique node list (UNL), and waits until a quorum, which represents at least 80%
of this list, answers. It also requires minimal connectivity and an intersection size among UNLs that
represents at least 20% of each UNL. However, there has been a debate about whether the assumptions of the
Ripple consensus were sufficient to implement consensus. In particular, the intersection property alone
was shown to be insufficient to solve consensus, and an intersection of strictly more than 40% was shown
to be required [2].
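The two thresholds can be compared on concrete UNLs with the sketch below. It assumes that the overlap of two UNLs is measured relative to the larger of the two lists, which is our reading of "of each UNL"; the function names and the set representation are also assumptions, and this is not Ripple's implementation.

```python
from fractions import Fraction
from itertools import combinations

def min_unl_overlap(unls):
    """Smallest pairwise overlap, as a fraction of the larger UNL of a pair.

    unls: dict node -> non-empty set of replicas in its unique node list
    """
    fractions = [Fraction(len(a & b), max(len(a), len(b)))
                 for a, b in combinations(unls.values(), 2)]
    return min(fractions) if fractions else Fraction(1)

def quorum_size(unl, percent=80):
    """Number of answers needed to reach `percent` of the UNL (rounded up)."""
    return (percent * len(unl) + 99) // 100
```

Two UNLs of five replicas sharing two of them overlap by exactly 40%: they satisfy the original 20% requirement but not the stricter condition of more than 40%.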
7.3.2 The Hyperledger fabric
IBM is a key partner in the Hyperledger project [12], a recent industry-wide collaborative effort to
develop an open-source blockchain. Although the current version of the Hyperledger codebase (v0.6)
features a naive consensus approach relying on a central server for testing purposes, the next generation
of Hyperledger is expected to feature the practical Byzantine fault tolerant protocol (PBFT) [13] and a
variant of Apache Kafka. PBFT [13] and Kafka are being implemented as modular consensus protocols that one
can plug into Hyperledger. Hyperledger also features a subledger abstraction that allows partners to
collaborate within a consortium blockchain without revealing the content of the blockchain to other
12 concurrentsystems/rbbc/.
become a consensus participants in charge of deciding.
8 Conclusion
While the blockchain technology is reshaping ownership tracking through distributed ledgers, it remains
difficult for blockchain users to understand the guarantees this technology has to offer. This
paper describes the causes of this difficulty in mainstream proof-of-work blockchain systems, namely
Bitcoin and Ethereum. One cause is the probabilistic nature of their consensus algorithms: although it
appears that one should wait longer to increase the probability of agreement in case of network delays,
most applications rely on a fixed predicate to define the termination of consensus. Another cause is
that users have started deploying blockchain protocols in either a private or a consortium context, often
involving fewer miners with a different distribution of the mining power and where network delays
can be artificially introduced.
While the recent redefinition of the consensus problem in the context of blockchain helps address
the major tradeoff between consistency and performance, it is crucial to design new provable algorithms
especially tailored for blockchains and validate them through large-scale experiments.
