Untangling Blockchain: A Data Processing View of Blockchain Systems
arXiv:1708.05665v1 [cs.DB] 17 Aug 2017
Abstract—Blockchain technologies have gained massive momentum in the last few years. Blockchains are distributed ledgers that
enable parties who do not fully trust each other to maintain a set of global states. The parties agree on the existence, values and
histories of the states. As the technology landscape is expanding rapidly, it is both important and challenging to have a firm grasp of
what the core technologies have to offer, especially with respect to their data processing capabilities. In this paper, we first survey the
state of the art, focusing on private blockchains (in which parties are authenticated). We analyze both in-production and research
systems in four dimensions: distributed ledger, cryptography, consensus protocol and smart contract. We then present BLOCKBENCH,
a benchmarking framework for understanding performance of private blockchains against data processing workloads. We conduct a
comprehensive evaluation of three major blockchain systems based on BLOCKBENCH, namely Ethereum, Parity and Hyperledger
Fabric. The results demonstrate several trade-offs in the design space, as well as big performance gaps between blockchain and
database systems. Drawing from design principles of database systems, we discuss several research directions for bringing blockchain
performance closer to the realm of databases.
1 INTRODUCTION
Blockchain technologies are taking the world by storm, largely due to the success of Bitcoin [1]. A blockchain, also called distributed ledger, is essentially an append-only data structure maintained by a set of nodes which do not fully trust each other. Nodes in the blockchain agree on an ordered set of blocks, each containing multiple transactions, thus the blockchain can be viewed as a log of ordered transactions. In the database context, blockchain can be viewed as a solution to distributed transaction management: nodes keep replicas of the data and agree on an execution order of transactions. However, traditional databases assume a trusted environment and employ well known concurrency control techniques [2], [3], [4] to order transactions. Blockchain's key property is that it assumes nodes behave in an arbitrary (or Byzantine) manner. Being able to tolerate Byzantine failure by design, blockchain offers stronger security than incumbent database systems.

In the original design, Bitcoin's blockchain stores coins as the system states. For this application, Bitcoin nodes implement a simple replicated state machine model which moves coins from one address to another. Since then, blockchain has grown beyond crypto-currencies to support user-defined states and Turing complete state machine models. For example, Ethereum [5] enables any decentralized, replicated applications known as smart contracts. More importantly, interest from the industry has started to drive development of new blockchain platforms designed for private settings where participants are authenticated. Blockchain systems in such environments are called private (or permissioned), as opposed to the early systems operating in public environments (or permissionless) where anyone can join and leave. Applications such as security trading and settlement [6], asset and finance management [7], [8], banking and insurance [9] are being built and evaluated. These applications are currently supported by enterprise-grade database systems like Oracle and MySQL, but blockchain has the potential to disrupt this status quo because it incurs lower infrastructure and human costs [9]. In particular, blockchain's immutability and transparency help reduce human errors and the need for manual intervention due to conflicting data. Blockchain can help streamline business processes by removing duplicate efforts in data governance. Goldman Sachs estimated 6 billion saving in current capital market [9], and J.P. Morgan forecast that blockchains will start to replace currently redundant infrastructure by 2020 [8].

Amid the growing commercial and academic interest, a large number of blockchain systems have sprung up, each claiming some unique capabilities. Both the private and public sectors are clamoring to adopt blockchains, but they face overwhelming choices. While challenging, it is important to have a firm grasp on what the technology can and cannot do. A quest for understanding blockchain must ultimately answer the following questions:
1) What is a blockchain? Specifically, what are its unique properties that benefit current and future applications?
2) How do current blockchains differ from each other, both qualitatively in the design and quantitatively in their performance?
3) What are the current challenges? And what do future blockchains look like?

• Tien Tuan Anh Dinh, Rui Liu, Ji Wang, Beng Chin Ooi are with the Department of Computer Science, National University of Singapore, Singapore.
E-mail: {dinhtta, liur, wangji, ooibc}@comp.nus.edu.sg
• Meihui Zhang is with Singapore University of Technology and Design, Singapore.
E-mail: meihui [email protected]
• Gang Chen is with Zhejiang University.
E-mail: [email protected]
To answer these questions, in this paper, we start by distinguishing two major classes of blockchain systems, namely public and private blockchains. We then explain four key technical concepts by which current systems can be categorized: distributed ledger, cryptography, consensus protocol and smart contract. Next, we describe BLOCKBENCH [10], our benchmarking framework for quantitatively evaluating and comparing private blockchains. Using BLOCKBENCH, we conduct a comprehensive evaluation of three major blockchains: Ethereum [5], Parity [11] and Hyperledger [12]. The results show that current blockchains' performance is limited, far below what a state-of-the-art database system can offer. Finally, we draw from our experience in building large-scale database systems several design principles that can improve future blockchains.

[Figure 1 depicts two consecutive blocks, block t and block t+1, each holding a PrevBlock pointer, a nonce, a transaction root hash and a transaction list.]
Fig. 1: Blockchain data structure. Transactions are packed into blocks which are linked to previous blocks.
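The linked structure in Fig. 1 can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the format of any surveyed system: the helper names (block_hash, append_block, verify_chain) are hypothetical, and each block is hashed as a whole JSON document, whereas real systems hash a compact header that carries a transaction root hash.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block deterministically (assumption: whole block as sorted JSON)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block whose prev_hash is the hash of the current last block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64  # genesis has no parent
    chain.append({"prev_hash": prev, "transactions": list(transactions)})


def verify_chain(chain):
    """Return True if every block still points to the hash of its predecessor."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


chain = []
append_block(chain, ["genesis"])
append_block(chain, ["A pays B 5", "B pays C 2"])
append_block(chain, ["C pays A 1"])
```

Changing a transaction in any block that already has a successor changes that block's hash, so verify_chain returns False. The newest block has no successor yet, so this check alone does not protect it.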
In summary, our contributions are:
1) We provide an in-depth survey of blockchain systems. We discuss the state of the art, and categorize current systems along four dimensions: distributed ledger, cryptography, consensus protocol and smart contract.
2) We describe our benchmarking framework, BLOCKBENCH, that is designed for understanding performance of private blockchains against data processing workloads.
3) We present a comprehensive evaluation of Ethereum, Parity and Hyperledger. The results show the limitation of blockchains as data processing platforms. They identify several performance bottlenecks, and therefore can serve as a baseline for future blockchain research and development.

In the next section, we provide an overview of blockchain systems, separating them into public and private settings. Section 3 explains the four building blocks which are used in Section 4 to categorize existing blockchains. Section 5 describes BLOCKBENCH, followed by the evaluation of three blockchains in Section 6. Section 7 discusses a number of lessons learned from the performance study, and how to bring design principles from databases to improve blockchains. Section 8 concludes.

2 BLOCKCHAINS: PRIVATE VS. PUBLIC

A typical blockchain system consists of multiple nodes which do not fully trust each other. Some nodes exhibit Byzantine behavior, but the majority is honest. Together, the nodes maintain a set of shared, global states and perform transactions modifying the states. Blockchain is a special data structure which stores historical states and transactions. All nodes in the system agree on the transactions and their order. Figure 1 shows the blockchain data structure, in which each block is linked to its predecessor via a cryptographic pointer, all the way back to the first (genesis) block. Because of this, blockchain is often referred to as a distributed ledger.

A transaction in a blockchain is the same as in traditional databases: a sequence of operations applied on some states. As such, the blockchain transaction requires the same ACID semantics. The key difference is the failure model under consideration. Current transactional, distributed databases [13], [14] employ classic concurrency control techniques such as two-phase commit to ensure ACID. They can achieve high performance, because of the simple crash failure model. In contrast, the original blockchain design considers a much more hostile environment in which nodes exhibit Byzantine behavior. Under this model the overhead of concurrency control is much higher [15].

At a high level, a blockchain system can be categorized as either public or private. In the former, any node can join and leave the system, thus the blockchain is fully decentralized, resembling a peer-to-peer system [16]. In the latter, the blockchain enforces strict membership. More specifically, there is an access control mechanism to determine who can join the system. As a result, every node is authenticated and its identity is known to the other nodes.

2.1 Public Blockchain

Bitcoin [1] is the most well known example of public blockchains. In Bitcoin the states are digital coins (crypto-currencies), and a transaction moves coins from one set of addresses to another. Each node broadcasts a set of transactions it wants to perform. Special nodes called miners collect transactions into blocks, check for their validity, and start a consensus protocol to append the blocks onto the blockchain. Bitcoin uses proof-of-work (PoW) for consensus: only a miner which has successfully solved a computationally hard puzzle (finding the right nonce for the block header) can append to the blockchain. PoW is tolerant of Byzantine failure, but it is probabilistic in nature: it is possible that two blocks are appended at the same time, creating a fork in the blockchain. Bitcoin resolves this by only considering a block as confirmed after it is followed by a number of blocks (typically six blocks). This probabilistic guarantee leads to security and performance issues: attacks have been demonstrated by an adversary controlling only 25% of the nodes [17], and Bitcoin transaction throughput remains very low (7 transactions per second [18]).

Most public blockchain systems employ variants of PoW for consensus. PoW works well in the public settings because it guards against Sybil attacks [16]. However, being
non-deterministic and computationally expensive, it is unsuitable for applications such as banking and finance which must handle large volumes of transactions in a deterministic manner.

2.2 Private Blockchain

Hyperledger [12] is among the most popular private blockchains. Since node identities are known in the private settings, most blockchains adopt one of the protocols from the vast literature on distributed consensus. Zab [19], Raft [20], Paxos [21], PBFT [15] are popular protocols that are in active use today. Hyperledger directly uses PBFT (see footnote 1), while others like Parity [11], Ripple [6] and ErisDB [22] develop their own variants. PBFT is a three-phase protocol. In the pre-prepare phase, a leader broadcasts a value to be committed by other nodes. Next, in the prepare phase, the nodes broadcast the values they are about to commit. Finally, the commit phase confirms the committed value when more than two thirds of the nodes agree in the previous phase. PBFT is communication bound, but it achieves both safety and liveness in partially synchronous networks. Besides deterministic consensus, another key property of private blockchains is that they support smart contracts which can express highly complex transaction logic. These properties are particularly desirable in business and financial systems. Indeed, private blockchains evoke such interest from major banking and financial institutions that some even claim that they have the potential to disrupt current practices in data management [8], [9].

1. Hyperledger has two main releases: v0.6.0 and v1.0.0-rc1. The former supports PBFT, but the latter adopts a non-Byzantine consensus protocol based on Kafka.

3 KEY CONCEPTS

Categorizing blockchains as public or private is useful for identifying major characteristics of many blockchains. However, understanding their subtle differences warrants a finer taxonomy. This section introduces four underpinning concepts, based on which a more detailed classification of the systems can be obtained.

3.1 Distributed Ledger

A ledger is a data structure that consists of an ordered list of transactions. For example, a ledger may record monetary transactions between multiple banks, or goods exchanged among known parties. In blockchains, the ledger is replicated over all the nodes. Furthermore, transactions are grouped into blocks which are then chained together. Thus, the distributed ledger is essentially a replicated append-only data structure. A blockchain starts with some initial states, and the ledger records the entire history of update operations made to the states.

A system supporting distributed ledgers can be characterized in three dimensions, as illustrated in Table 1. First, the application built on top of the ledger determines the data model of what is stored in the ledger. For example, a crypto-currency application may adopt the user-account model resembling traditional banking systems. On the other hand, a general-purpose blockchain may use a low-level model such as table or key-value. Second, the system may have one or more ledgers which may be connected to each other. A large enterprise, for example, may own multiple ledgers, one for each of its departments: engineering, customer care, supply chain, payroll, etc. Third, ledger ownership may vary from completely open to the public to strictly controlled by one party. Bitcoin, for example, is completely open, and as a consequence requires an expensive consensus protocol to identify who can update the ledger. Parity [11], on the other hand, pre-determines a set of owners who can write to the ledger simply by signing the blocks.

3.2 Consensus

The content of the ledger reflects historical and current states maintained by the blockchain. Being replicated, updates to the ledger must be agreed on by all parties. In other words, multiple parties must come to a consensus. Note that this is not the case in many real-world applications such as fiat currency, in which one entity (e.g. the bank or the government) decides the updates.

One key property of a blockchain system is that the nodes do not trust each other, meaning that some may behave in a Byzantine manner. The consensus protocol must therefore tolerate Byzantine failures. The research literature on distributed consensus is vast, and there are many variants of previously proposed protocols being developed for blockchains [23]. They can be largely classified along a spectrum. One extreme consists of purely computation based protocols that use proof of computation to randomly select a node which single-handedly decides the next operation. Bitcoin's proof-of-work (PoW) is an example. At the other extreme are purely communication based protocols in which nodes have equal votes and go through multiple rounds of communication to reach consensus. These protocols, PBFT [15] being the prime example, are used in private settings because they assume authenticated nodes. In between these extremes are hybrid protocols which aim to improve performance of PoW and PBFT. For instance, Proof-of-Elapsed-Time (PoET) eliminates expensive mining in PoW by leveraging trusted hardware such as Intel SGX. Another example is Proof-of-Authority (PoA) [24] which improves PBFT by pre-selecting a small set of trusted nodes that vote among themselves to reach consensus. Similarly, Stellar [25] and Ripple [6] improve PBFT by executing consensus in smaller networks.

3.3 Cryptography

Blockchain systems make heavy use of cryptographic techniques to ensure integrity of the ledgers. Integrity here refers to the ability to detect tampering of the blockchain data. This property is vital in public settings where there is no pre-established trust. For example, public confidence in crypto-currencies like Bitcoin, which determines values of the currencies, is predicated upon the integrity of the ledger; that is, the ledger must be able to detect double spending. Even in private blockchains, integrity is equally essential because the authenticated nodes can still act maliciously.
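To make the computation based end of the consensus spectrum concrete, the following toy Python sketch searches for a nonce in the spirit of the proof-of-work puzzle described for Bitcoin. It is an illustration under stated assumptions rather than Bitcoin's actual procedure: the header is an arbitrary string, a single SHA-256 is applied, and the names mine, check and difficulty_bits are hypothetical.

```python
import hashlib


def digest_value(header, nonce):
    """Interpret SHA-256(header:nonce) as a 256-bit integer."""
    data = f"{header}:{nonce}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def check(header, nonce, difficulty_bits):
    """A nonce is valid if the digest has at least difficulty_bits leading zero bits."""
    return digest_value(header, nonce) < (1 << (256 - difficulty_bits))


def mine(header, difficulty_bits):
    """Try nonces 0, 1, 2, ... until one satisfies the difficulty target."""
    nonce = 0
    while not check(header, nonce, difficulty_bits):
        nonce += 1
    return nonce


# Toy difficulty: about 2**12 hash evaluations are expected.
nonce = mine("prev_hash|tx_root", difficulty_bits=12)
```

Finding the nonce takes many hash evaluations on average, while verifying it with check takes one. This asymmetry is what lets other nodes cheaply validate a miner's claim.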
TABLE 1: Examples of distributed ledgers.
Data Model         Number of ledgers  Owner           Example
Accounts           One                Administrator   Traditional ledgers used in financial institutions.
Assets             Many               Group of users  Private ledger used within a financial institution, or between small groups of financial organizations, e.g. global financial services.
Coins or accounts  One                Any user        Crypto-currencies like Bitcoin or Ethereum.
where b_h is the current block at height h, bal(M) returns how many coins are in M's account, and age returns how much time has passed since the creation of a block at a certain height.

Ethereum's upcoming PoS protocol is implemented as a smart contract. Referred to as Casper, it allows miners to become validators by depositing Ethers to the Casper account. The contract then picks a validator to propose the next block according to the deposit amount. Its unique feature, however, is to force validators to behave correctly or else risk losing the entire deposit. In particular, each validator places a bet on whether a certain block will be confirmed in the future. If the block is confirmed, the validator gets a small reward. But if it is not, the validator loses its deposit. This mechanism avoids the nothing-at-stake problem in which validators can propose blocks in different branches. Tezos implements a simplified version of Casper in which the nodes buy in to become authorities which can then approve changes to the underlying blockchain. Tezos aims to provide an amendable blockchain in which soft forks and hard forks are inherent features of the blockchain.

PBFT variants

PoW suffers from non-finality, that is, a block appended to a blockchain is not confirmed until it is extended by many other blocks. Even then, its existence in the blockchain is only probabilistic. For example, eclipse attacks on Bitcoin [75] exploit this probabilistic guarantee to allow double spending. In contrast, the original PBFT protocol [15] is deterministic. Implemented in the earlier version of Hyperledger (v0.6), the protocol ensures that once a block is appended, it is final and cannot be replaced or modified. It incurs O(N^2) network messages for each round of agreement where N is the number of nodes in the network. In practice, however, the original protocol scales poorly and collapses even before reaching the network limit [76]. We observe the same scalability issues in our evaluation of Hyperledger with BLOCKBENCH.

Tendermint proposes a small modification on top of PBFT. Instead of each node having an equal vote, in Tendermint each node may have different voting power, proportional to its stake in the network. To reach agreement in Tendermint it is only necessary to gather over 2/3 of the total voting power. This may be cheaper than waiting for 2/3 of the network to respond when there is a small number of nodes with high stakes.

Recent works on improving PBFT have mainly focused on its performance. Zyzzyva [77] optimizes for normal cases (when there are no failures) via speculative execution. XFT [78] assumes a network less hostile than purely Byzantine, and demonstrates better performance by reducing the number of network messages. HoneyBadger [79], on the other hand, focuses on improving security under asynchronous networks. It employs a randomized agreement protocol which achieves safety with overwhelming probability even under network asynchrony. By optimizing the network layer, it is shown to outperform PBFT even when the network is
synchronous. Zyzzyva, XFT and HoneyBadger all hold great promise, but they have not been integrated into any blockchains.

Trusted hardware

Most overheads of PoW and PBFT can be attributed to the assumption that nodes behave in a Byzantine manner. The availability of Intel SGX [80] or ARM TrustZone [81], however, makes it possible to relax the trust model in the Byzantine settings. In particular, a node equipped with trusted hardware can be reliably checked for certain properties, for example, that it is running a specific software.

Sawtooth Lake leverages SGX to replace PoW with a more efficient protocol called PoET. Specifically, PoET runs inside an enclave protected by SGX. It starts by taking a block number as input and generating a timer of a random duration t. Afterward, it can produce certificates indicating how much time has passed since the timer starts. A node whose PoET generates the smallest t can append the block when the timer expires. In particular, the node attaches its PoET certificate to the block, and as long as t is smaller than that generated by any other node, the block is accepted.

A2M [82] and Hybster [83] both exploit trusted hardware to reduce the number of replicas needed to tolerate f failures from 3f + 1 to 2f + 1. This means an N-node network can now tolerate up to N/2 adversarial nodes, as opposed to N/3 adversarial nodes in the original PBFT. A2M's and Hybster's safety are dependent on the trusted code bases (TCBs) that implement simple functions: a log data structure in the former and a monotonic counter in the latter.

Federated

Despite numerous improvements to the original protocol, PBFT-based consensus remains communication bound, thus it ultimately fails to scale beyond a certain number of nodes. To overcome this hard limit without sacrificing safety, Stellar and Ripple adopt an approach that partitions the network into smaller groups called federates. Each federate runs a local consensus protocol among its members, which does not run into scalability problems because of the small network size. Local agreements are then propagated to the entire network via nodes lying in the intersections of the federates. Global consensus can be achieved under certain conditions. For Stellar, the condition is that every two federates intersect at non-Byzantine nodes. Ripple's safety conditions are that there is a large majority of honest nodes in every federate, and that the intersection of any two federates contains at least one honest node.

Both Stellar and Ripple assume federates are pre-defined and their safety conditions can be enforced by a network administrator. In a decentralized environment where node identities are unknown, such assumptions do not hold. Byzcoin [57] and Elastico [65] propose novel, two-phase protocols that combine PoW and PBFT. In the first phase, PoW is used to form a consensus group. Byzcoin implements this by having a sliding window over the blockchain and selecting the miners of the blocks within the window. Elastico [65] groups nodes by their identities that change in every epoch. In particular, a node identity is its solution to a cryptographic puzzle. In the second phase, the selected nodes perform PBFT to determine the next block. The end result is faster block confirmation time at a scale much greater than traditional PBFT (over 1000 nodes).

Similar to Byzcoin and Elastico, Dfinity [39] and Algorand [84] select at each round a random set of nodes that can propose blocks. Unlike the former, they dispense with PoW and instead use verifiable random functions (VRFs) to select the consensus group. In Dfinity, the VRF is based on the threshold signature of the previous block. In Algorand, it is based on a random seed published in the previous block and the node's secret key.

Non-Byzantine

The systems described so far in this section tolerate Byzantine failures, rendering them attractive for public settings and for private settings where the cost of engaging trusted parties (for example, for escrowing assets) is high. Some blockchains, however, assume trusted parties in order to simplify their designs. These blockchains have no safety guarantees when any of such parties behaves maliciously.

Openchain [35] relies on a single trusted party (called validator) that determines the next block. Consequently, it is most vulnerable to attacks as the validator is the single point of failure. Multichain and Parity have more than one trusted party, each referred to as an authority in their systems. Each authority is given a time slice, via round-robin scheduling, during which it can append new blocks to the chain. This simple proof-of-authority (PoA) protocol avoids single point of failure while ensuring balanced workloads among the authorities. HydraChain and BigchainDB also have multiple authorities, but one authority cannot unilaterally decide the next blocks. Instead, the block is decided via majority voting. Quorum [33] employs Raft [20] as the consensus protocol among its authorities. Raft implements crash tolerant state machine replication, which is an important building block of modern distributed database systems. Using Raft, Quorum is able to make safe progress even when some authority nodes crash.

Corda's consensus protocol is executed by a set of trusted parties called notaries which check if a given transaction has been executed before. By delegating this check to an entity outside of the blockchain, Corda can justify using Raft for consensus. Transactions in Corda are sent to the notaries before being confirmed in the blockchain. The notaries then use Raft to ensure that the transactions are replicated among themselves and remain highly available despite crashes.

The latest release of Hyperledger (v1.0) outsources the consensus component to Kafka, another building block often found in distributed database systems. More specifically, transactions are sent to a centralized Kafka service which orders them into a stream of events. Every node subscribes to the same Kafka stream and therefore is notified of new transactions in the same order as they are published. Since there is only one Kafka service, the observed transaction sequence is the same at every node.

Others

IOTA [36] uses its own consensus protocol called Tangle in which the blocks form a directed acyclic graph (DAG) as opposed to a chain. In addition, a block in Tangle consists
of only one transaction. When appended, the block must approve two other blocks, creating links to them in the DAG. The block is confirmed when it is approved by many other blocks. Targeting IoT environments, Tangle's main goal is efficiency and low-cost payment. Although its security has not been rigorously analyzed, the low values of transactions (micropayments) in Tangle could in practice discourage Byzantine behavior.

Kadena [26] proposes an extension to Raft that handles Byzantine failures. It introduces various techniques on top of Raft, such as message signatures, client verification and incremental hashing. However, like Tangle, it is unclear whether the protocol guarantees safety and liveness.

4.4 Smart Contracts

Recall that a smart contract system can be characterized by its language expressiveness or by its execution environment. Except for Openchain, IOTA, Ripple and Stellar, all systems listed in Table 2 let users customize transaction logic to suit their applications. In the following, we group them by the contract language expressiveness.

Scripts

Bitcoin provides approximately 200 opcodes, but many of them are disabled in the latest implementation. Users can write stack-based programs with the opcodes. The most popular contracts in Bitcoin are related to multi-signatures. One example is the escrow contract that requires 2 out of 3 signatures before a coin can be released. The language can also implement bounty-hunting style contracts, for example, one that releases the reward coins when the pre-image of a hash value is found.

BigchainDB [27] adopts a more expressive language called crypto-condition. Developed as part of the Interledger Protocol project [85], crypto-condition allows specifying complex boolean expressions over many types of signatures. A crypto-condition script contains conditions and fulfillments which are treated as inputs and outputs of the script. The available conditions include timeout which enables time-release contracts. Crypto-condition's encoding is higher level than Bitcoin opcodes, making it easy to express complex logic.

Turing complete

Ethereum is among the first blockchains offering Turing-complete smart contracts. Users write their contracts in either Solidity, Serpent or LLL language, which then get compiled to EVM bytecodes. EVM executes normal crypto-currency transactions, and it treats smart contract bytecodes as a special transaction. Specifically, each smart contract is given its own memory to store local states. The memory is exposed as a key-value storage, though Solidity provides high-level data types such as map, array and composite structures. Resources consumed during execution of the contract, both in terms of CPU and memory, are tracked by EVM and charged to the transaction sender's account. EVM also keeps track of intermediate state changes and reverses them if there are insufficient funds to pay for the execution.

Hyperledger does not have its own bytecodes. Instead, it runs its language-agnostic smart contracts inside Docker containers. Specifically, a contract can be written in any language, which is then compiled into native code and packed into a Docker image. When the contract is uploaded, each node starts a new container with that image. Invoking the contract is done via Docker APIs. The contract can access the blockchain states via two methods getState and putState exposed by a shim layer. One benefit of Hyperledger is that it supports multiple high-level programming languages like Go and Java. However, its key-value interface with the blockchain necessitates extra application logic for mapping high-level data structures into key-value tuples.

Sawtooth Lake supports smart contracts in the form of transaction families. Each family is a user-defined Python class loaded into the ledger during start up. The contract is executed in the native runtime environment as a normal Python program.

One consequence of supporting Turing complete contracts is that software bugs are all but inevitable. While empowering, the Ethereum smart contract model receives strong criticism because it directly exposes Ethers to programming bugs. The security concerns indeed materialized in the DAO attack [43] in which the attacker stole $50M worth of asset. The attack exploits a concurrency bug in the DAO smart contract which allows one to repeatedly draw more money than what is specified in the transaction. Such bugs are inherent in a language like EVM which has weak or no formal specifications of its semantics. OYENTE [86] presents three major causes of security bugs: transaction order dependencies, timestamp dependencies and mishandled exceptions. It formalizes Ethereum semantics and proposes a tool for checking bugs directly on EVM bytecodes. The tool discovered over 8000 Ethereum contracts (worth over $60M) with potential security bugs.

Like any other transactions on the blockchain, smart contract executions are transparent. It means the inputs, outputs and the states of the contract are visible to the network. Hawk [87] extends Zerocash to provide transaction privacy for smart contracts. The main challenge compared to Zerocash lies in the arbitrary transaction logic, whereas in Zerocash the logic is constrained by a small set of operations. Another challenge is to protect local states, which is not applicable in Zerocash. Given a contract, Hawk compiles it with zkSNARK to make it privacy preserving. Transaction inputs and outputs are pre- and post-processed via Hawk to hide the complex cryptographic details. Although the protocols incur large overhead both in time and space, Hawk represents a practical cryptographic system that achieves both transaction privacy and fairness.

Verifiable

Even before the DAO attack, some blockchains had rejected the models that allow for unconstrained computations. The languages of Kadena, Tezos and Corda are more powerful than Bitcoin scripts, but they trade Turing completeness for safety. Kadena's language is a Lisp-like functional language called Pact [26]. A Pact contract is stored in the ledger in human readable form, which is then parsed and executed in Ocaml. It is strongly typed and can be formally verified. Similarly, Tezos's stack-based language called Michelson
5 BLOCKBENCH
The previous section has presented a thorough qualitative analysis of existing blockchains. In
this section, we describe our benchmarking framework called BLOCKBENCH [10]. Designed for
quantitative analysis of blockchains as data processing platforms, the framework targets
private blockchains with Turing-complete smart contracts. BLOCKBENCH is open source [88] and
contains data processing workloads commonly found in database benchmarks.
5.1 Layers
BLOCKBENCH targets blockchains that function as data processing platforms. Such a blockchain
must have no restrictions on the application logic, thus it must support Turing-complete smart
contracts. Figure 3 shows the logical components of the blockchain software stack, from which
we refine the taxonomy described in Section 4 into four concrete layers shown in Figure 4. For
each layer, there are multiple BLOCKBENCH workloads for evaluating it individually.
The consensus layer implements the consensus protocol. The data model layer contains the
structure, content and operations on the blockchain data. The execution layer includes details
of the runtime environment for executing smart contracts. Finally, the application layer
includes classes of blockchain applications. Croman et al. [18] proposed to divide blockchain
into several planes: network, consensus, storage, view and side plane. While similar to
BLOCKBENCH’s four layers, the plane abstraction was geared towards crypto-currency applications
and did not take into account the execution of smart contracts.
[Diagram labels: application (crypto-currency, asset management, securities settlement, ...),
execution engine, data model, consensus; blocks t and t+1 with block headers, transactions,
contracts and root hashes; smart contracts with code, input/output and state storage; CPU,
storage, network.]
Fig. 3: Blockchain software stack on a fully validating node.
5.2 Implementation
ations, number of clients, threads, etc.). It collects runtime statistics which are used to
compute five important metrics.
• Throughput: the number of successful transactions per second. A workload can be configured
with multiple clients and threads per client to saturate the blockchain throughput.
• Latency: the response time per transaction. Driver implements blocking transactions, i.e. it
waits for one transaction to finish before starting another.
• Scalability: the changes in throughput and latency when increasing the number of nodes and
number of concurrent workloads.
• Fault tolerance: the changes in throughput and latency during node failure. We simulate
crashes, network delays and random message corruptions.
• Security metrics: the ratio between the total number of blocks included in the main branch
and the total number of confirmed blocks. The lower the ratio, the less vulnerable the system
is to double spending or selfish mining.
3. We assume that the smart contract implementing the workload’s logic is already implemented
and deployed on the blockchain.
5.3 Workloads
BLOCKBENCH comes with macro benchmark workloads for evaluating the application layer, and micro
benchmark workloads for analyzing the lower layers. Smart contract implementations of the
workloads shown in Figure 4 are available and can be readily deployed on Ethereum, Parity and
Hyperledger.
Macro benchmark workloads
We port two popular database benchmark workloads into BLOCKBENCH, namely YCSB and Smallbank.
YCSB is widely used for evaluating NoSQL databases, for which we implement a simple smart
contract which functions as a key-value storage. The WorkloadClient is based on the YCSB
driver [89] which preloads each storage with a number of records, and supports requests with
different ratios of read and write operations. For Smallbank [90], a popular benchmark for OLTP
workloads, we implement a smart contract that transfers money from one account to another.
Besides database workloads, BLOCKBENCH also provides three other workloads based on real
Ethereum contracts. The first is EtherId, a popular contract implementing a domain name
registrar. The second is Doubler, the pyramid scheme contract shown earlier in Figure 2. The
third is WavesPresale that implements a crowdfunding campaign via digital token sales.

Micro benchmark workloads
For the consensus layer, BLOCKBENCH provides the DoNothing workload in which the smart contract
accepts a transaction as input and simply returns. Since the contract execution involves a
minimal number of operations at the execution and data model layers, the overall performance
will be determined by the consensus layer.
For the data model layer, BLOCKBENCH provides the Analytics workload that is similar to an OLAP
workload. In particular, it performs scan-like and aggregate queries whose performance is
determined by the system’s data model. Specifically, there are two queries:
Q1: Compute the total transaction values committed between block i and block j.
Q2: Compute the largest transaction value involving a given state (account) between block i
and block j.
For Ethereum and Parity, both queries can be implemented via JSON-RPC APIs that return
transaction details and account balances at a specific block. For Hyperledger, however, the
second query must be implemented via a smart contract (VersionKVStore), because Hyperledger has
no APIs for querying historical states. Figure 5 shows the contract implementation in
Hyperledger. To support historical data lookup, the contract appends a counter to the key of
each account. To fetch a specific version of an account, the key account:version is used. The
latest version is stored at the key account:latest. The contract also keeps a CommitBlock value
in the data field for every version to point to the block number in which the current version
is committed. To fetch the balances of a given account in a given block range, the contract
scans all versions of this account and returns the corresponding balance when the version’s
CommitBlock value is in the specified range.

    type account_t struct {
        Balance     int
        CommitBlock int
    }

    type transaction_t struct {
        From string
        To   string
        Val  int
    }

    // Append a transfer to the list of pending transactions.
    func Invoke_SendValue(from_account string,
            to_account string, value int) {
        var pending_list []transaction_t
        pending_list = decode(GetState("pending_list"))
        var new_txn transaction_t
        new_txn = transaction_t{from_account, to_account, value}
        pending_list = append(pending_list, new_txn)
        PutState("pending_list", encode(pending_list))
    }

    // Return the transactions committed in the given block.
    func Query_BlockTransactionList(block_number int) []transaction_t {
        return decode(GetState("block:" + block_number))
    }

    // Scan the versions of an account from the latest one backwards and
    // return those committed in [start_block, end_block).
    func Query_AccountBlockRange(account string,
            start_block int, end_block int) []account_t {
        version := decode(GetState(account + ":latest"))
        var ret []account_t
        for {
            var acc account_t
            acc = decode(GetState(account + ":" + version))
            if acc.CommitBlock >= start_block &&
                    acc.CommitBlock < end_block {
                ret = append(ret, acc)
            } else if acc.CommitBlock < start_block {
                break
            }
            version -= 1
        }
        return ret
    }

Fig. 5: Code snippet from the VersionKVStore smart contract for Analytics workload (Q1 and Q2).

Another workload for the data model layer stresses the persistent storage. In particular, the
IOHeavy workload evaluates the blockchain’s IO performance by invoking a contract that performs
a large number of random writes and random reads to the local states.
Finally, for the execution layer BLOCKBENCH provides the CPUHeavy workload. It measures the
efficiency of the execution layer for computationally heavy tasks by invoking a contract that
executes the quicksort algorithm over a large array.
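To make the versioned-key scheme of VersionKVStore (Fig. 5) concrete, the sketch below reimplements it against in-memory maps instead of the Hyperledger ledger. It is a minimal illustration under stated assumptions, not the authors' contract: the VersionedStore type, the Put and BlockRange names and the use of plain Go maps are all hypothetical, and versions are assumed to be committed in non-decreasing block order.

```go
package main

import "fmt"

// Account mirrors account_t in Fig. 5.
type Account struct {
	Balance     int
	CommitBlock int
}

// VersionedStore is a hypothetical in-memory stand-in for the ledger state.
type VersionedStore struct {
	data   map[string]Account // key "account:version"
	latest map[string]int     // key "account:latest" -> newest version number
}

func NewVersionedStore() *VersionedStore {
	return &VersionedStore{data: map[string]Account{}, latest: map[string]int{}}
}

func versionKey(account string, version int) string {
	return fmt.Sprintf("%s:%d", account, version)
}

// Put stores a new version of the account, committed at the given block.
// The first version of an account is numbered 0.
func (s *VersionedStore) Put(account string, balance, block int) {
	v, ok := s.latest[account+":latest"]
	if ok {
		v++
	}
	s.data[versionKey(account, v)] = Account{Balance: balance, CommitBlock: block}
	s.latest[account+":latest"] = v
}

// BlockRange scans versions from the latest one backwards and returns those
// with start <= CommitBlock < end, stopping once versions are older than start.
func (s *VersionedStore) BlockRange(account string, start, end int) []Account {
	v, ok := s.latest[account+":latest"]
	if !ok {
		return nil
	}
	var ret []Account
	for ; v >= 0; v-- {
		acc := s.data[versionKey(account, v)]
		if acc.CommitBlock < start {
			break
		}
		if acc.CommitBlock < end {
			ret = append(ret, acc)
		}
	}
	return ret
}

func main() {
	s := NewVersionedStore()
	s.Put("alice", 100, 1)
	s.Put("alice", 90, 5)
	s.Put("alice", 120, 9)
	fmt.Println(s.BlockRange("alice", 4, 10)) // versions committed in blocks 5 and 9
}
```

Unlike the loop in Fig. 5, this sketch also stops when the version counter reaches zero, so an account whose oldest version still falls inside the range does not cause a lookup of a non-existent key.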
6 EVALUATION
We selected Ethereum, Parity and Hyperledger for a comparative study using BLOCKBENCH. They
occupy different positions in the design space, and are considered the most mature in terms of
the codebase and user base. We used the popular Go implementation of Ethereum, geth v1.4.18,
and the Parity release v1.6.0. Unless otherwise specified, the Hyperledger version is
v0.6.0-preview. We set up a private testnet for Ethereum and Parity by defining a genesis block
and directly adding peers to the miner network. For Ethereum, we manually tuned the difficulty
variable in the genesis block to ensure that miners do not diverge in large networks. For
Parity, we set the stepDuration variable to 1. In both Ethereum and Parity, confirmationLength
is set to 5 seconds. The default batch size in Hyperledger is 500.
The experiments were run on a 48-node commodity cluster. Each node has an E5-1650 3.5GHz CPU,
32GB RAM, 2TB hard drive, running Ubuntu 14.04 Trusty, and connected to the other nodes via 1GB
switch. For Ethereum, we reserved 8 cores out of the available 12 cores per machine, so that
the periodic polls from the client’s driver process do not interfere with the mining process.
Our main findings are as follows:
• Hyperledger performs consistently better than Ethereum and Parity across the benchmarks. But
it fails to scale up to more than 16 nodes.
• Ethereum and Parity are more resilient to node failures, but they are vulnerable to security
attacks that fork the blockchain.
• The main bottlenecks in Hyperledger and Ethereum are the consensus protocols, but for Parity
the bottleneck is caused by transaction signing.
• Ethereum and Parity incur large overheads in terms of memory and disk usage. Their execution
engine is also less efficient than that of Hyperledger.
• Hyperledger’s data model is low level, but its flexibility enables customized optimization
for analytical queries.

6.1 Macro benchmarks
This section discusses the performance of the blockchains at the application layer, using YCSB
and Smallbank benchmarks.

[Figures 6 and 7 (bar charts): panels "Throughput" (#tx/s), "Latency" (second) and "Throughput
vs. H-Store" (#tx/s), comparing Ethereum, Parity, Hyperledger and H-Store on YCSB and
Smallbank.]

Throughput and latency
Figure 6 shows the peak performance with 8 servers and 8 concurrent clients over the period of
5 minutes. We observe that in terms of throughput, Hyperledger outperforms the other two in
both benchmarks. The gap between Hyperledger and Ethereum is due to the difference in the
consensus protocols: one is based on PBFT while the other is based on PoW. With 8 servers, the
communication cost from broadcasting messages is cheaper than block mining whose difficulty is
set at roughly 2.5s per block. The gap between Parity and Hyperledger is not due to consensus
protocols, as Parity’s PoA protocol is expected to be simpler and more efficient than both PoW
and PBFT. Instead, we observe that Parity processes transactions at a constant rate, and that
it enforces a maximum client request rate at around 80 tx/s.
To put their performance in context, we compare the three blockchains against a popular
in-memory database system, namely H-Store, using the YCSB and Smallbank workloads. Blockchains
and databases do not necessarily share the same design goal: the former are not designed for
general data processing, nor do the latter protect data integrity against Byzantine failures.
Nonetheless, we argue that the comparison offers useful insights into the design trade-offs and
relative performance of the two systems. We ran H-Store’s own benchmark driver and set the
transaction rate at 100,000 tx/s. Figure 7 shows at least an order of magnitude gap in
throughput and two orders of magnitude in latency. Specifically, H-Store achieves over 140K
tx/s throughput while maintaining sub-millisecond latency. The gap in performance is due to the
cost of consensus protocols. For YCSB, for example, H-Store requires almost no coordination
among peers, whereas Ethereum and Hyperledger suffer the overhead of PoW and PBFT. An
interesting observation is the overhead of Smallbank. Recall that compared to YCSB, Smallbank
consists of more complex transactions in which multiple keys are updated in a single
transaction. Smallbank is simple but is representative of the large class of transactional
workloads such as TPC-C. We observe that in H-Store, Smallbank achieves 6.6x lower throughput
and 4x higher latency than YCSB, which reflects the cost of distributed transaction management.
In contrast, the blockchains suffer modest degradation in performance: 10% in throughput and
20% in latency. This is because each node in the blockchains maintains the complete states,
therefore it pays no overhead in coordinating distributed transactions since the states are not
partitioned.

Scalability
We fixed the client request rate (320 requests per second for Hyperledger, 160 requests per
second for Ethereum and Parity) and increased both the number of clients and the number of
servers. Figure 8 illustrates how well the three systems scale to handle larger YCSB workloads.
Parity’s performance remains constant as the network size and offered load increase, due to the
constant transaction processing rate at the servers. Interestingly, while Ethereum’s throughput
and latency degrade almost linearly beyond 8 servers, Hyperledger stops working beyond 16
servers.
or view change messages are dropped with high probability.
[Figure residue (plots): #tx/s over time (second); execution time (second) and memory (MB)
versus input size (1M, 10M, 100M); time (second) and MB versus # tuples (0.8M to 12.8M);
latency (second) for Ethereum, Parity and Hyperledger; transaction throughput (#tx/s) for
Smallbank, YCSB and DoNothing; Hyperledger v0.6.0 versus Hyperledger v1.0.0-rc1 in ops/sec and
MB versus # tuples (0.2M to 1.2M), with panels (a) Write, (b) Read.]
throughputs of v1.0 are an order of magnitude worse than those of v0.6. Furthermore, v1.0
crashes with more than 0.8M operations, reporting exceptions about oversized messages. The
significant gap can be attributed to the changes in the system architecture from v0.6 to v1.0.
In the former, the nodes take part in PBFT to confirm a block. In this case, transactions in
the IOHeavy workload incur no consensus overhead because there is only one node. In the latter,
a new service, the orderer, is introduced into the network to order transactions and provide
the consensus. With this new service, transactions in the IOHeavy workload now need to
communicate with the orderer for them to be confirmed. More specifically, the nodes in v1.0
perform three more steps to finish a transaction compared to v0.6. As communication overhead
increases, the throughputs decrease. This result suggests that replacing PBFT with a
centralized service not only fails to protect the blockchain against Byzantine failures, but it
may also impair the overall performance.

Data model - Analytics
We implemented the analytics workload by initializing the three systems with over 120,000
accounts with a fixed balance. We then pre-loaded them with 100,000 blocks, each containing 3
transactions on average. Each transaction transfers a value from one random account to another
random account. Due to Parity’s overheads in signing transactions when there are many accounts,
we considered transactions using only 1024 accounts. We then executed the two queries described
in Section 5.3 and measured their latencies. Figure 13 shows that the performance for Q1 is
similar, whereas Q2 sees a significant gap between Hyperledger and the rest. We note that the
main bottleneck for both Q1 and Q2 is the number of network (RPC) requests sent by the client.
For Q1, the client sends the same number of requests to all systems, therefore their
performance is similar. On the other hand, for Q2 the client sends one RPC per block to
Ethereum and Parity, but only one RPC to Hyperledger because of our customized smart contract
implementation. This saving in network roundtrip time translates to over 10x improvement in Q2
latency.

6.2.1 Consensus
We deployed the DoNothing smart contract that accepts a transaction and returns immediately. We
measured the throughput of this workload and compared it against that of YCSB and Smallbank.
The difference compared to the other workloads, shown in Figure 13(c), is indicative of the
cost of the consensus protocol versus the rest of the software stack. In particular, for
Ethereum we observe a 10% increase in throughput as compared to YCSB, which means that
execution of the YCSB transaction accounts for the 10% overhead. We observe no differences
among these workloads in Parity, because the bottleneck in Parity is due to transaction signing
(even empty transactions still need to be signed), not due to consensus or transaction
execution.

7 DISCUSSION
In this section, we first distill the lessons learned during the comparative studies of
Ethereum, Parity and Hyperledger. We then discuss how design principles from database systems
could help improve blockchain performance.
Usability of blockchain. Our experience in working with the three blockchain systems confirms
that in their current states, the blockchains are not yet ready for mass usage. Their designs
and codebases are still being refined constantly, and there are no other established
applications beside crypto-currency. Of the three systems, Ethereum is the most mature in terms
of its codebase, user base and developer community. Another usability issue we encountered