Blockchain As Software Connector
Abstract: Blockchain is an emerging technology for decentralized and transactional data sharing across a large network of untrusted participants. It enables new forms of distributed software architectures, where components can find agreements on their shared states without trusting a central integration point or any particular participating components. Considering the blockchain as a software connector helps make explicit important architectural considerations on the resulting performance and quality attributes (for example, security, privacy, scalability and sustainability) of the system. Based on our experience in several projects using blockchain, in this paper we provide rationales to support the architectural decision on whether to employ a decentralized blockchain as opposed to other software solutions, like traditional shared data storage. Additionally, we explore specific implications of using the blockchain as a software connector, including design trade-offs regarding quality attributes.

Index Terms: Blockchain; Architecture connector; Design; Trade-off

I. INTRODUCTION

Blockchain is an emerging technology that enables new forms of distributed software architectures, where components can find agreements on their shared states for decentralized and transactional data sharing across a large network of untrusted participants without relying on a central integration point that should be trusted by every component within the system.

The blockchain data structure is a timestamped list of blocks, which records and aggregates data about transactions that have ever occurred within the blockchain network. Thus, the blockchain provides an immutable data storage, which only allows inserting transactions without updating or deleting any existing transaction on the blockchain, to prevent tampering and revision. The whole network reaches a consensus before a transaction is included into the immutable data storage. The next writer of new records on the immutable data storage is decided via different mechanisms, for example, Proof-of-work or Proof-of-stake [24].

The first generation of blockchain is a public ledger for monetary transactions with very limited capability to support programmable transactions. A typical type of application is cryptocurrency [24]. Cryptocurrency is a digital currency that is based on a peer-to-peer network and cryptographic tools. Cryptocurrencies are low-cost and inherently independent of any centralized authority to transfer virtual money or issue new units of money. New units of money are issued by the users of the cryptocurrency through mining. The virtual money can be transferred among peer-to-peer users without going through a trusted authority to purchase goods and services in the real world. Bitcoin is the first and most widely used cryptocurrency.

The second generation of blockchain became a generally programmable infrastructure with a public ledger that records computational results. Smart contracts [20] were introduced as autonomous programs running across the blockchain network. They can express triggers, conditions and business logic to enable complex programmable transactions. Smart contracts are more versatile than simple currency transactions.

The design of a blockchain-based system has not yet been systematically explored, and there is little understanding about the impact of introducing the blockchain in a software architecture. In this paper, we discuss our experience obtained from applying the blockchain in a number of projects, which resulted in operational prototypes we built using readily available blockchain techniques. The prototypes included in this paper are 1) a decentralized trading market for data sharing, and 2) a platform for participating organisations to securely negotiate and store sensitive data values, which represents a scenario of secure data exchange and negotiation.

Based on this experience, from an architectural perspective, according to the taxonomy of software connectors [16], we propose to consider the blockchain as a novel kind of software connector, which should be considered as a possible decentralized alternative to existing centralized shared data storage. Such a view helps us make explicit important architectural considerations on the resulting quality attributes of the applications. We found that using the blockchain as a software connector could improve information transparency and traceability. However, the mining mechanism increases the communication latency, which might cause poor user experience. Likewise, the amount of data that can be stored on the blockchain is very limited, thus making it important to decide which data (or meta-data) should be stored on-chain vs. off-chain.

The paper proceeds by introducing background information about the blockchain in Section II, followed by discussing blockchain from an architecture perspective in Section III. Section IV compares the blockchain with existing software connectors. Section V discusses the detailed architecture of our prototypes using blockchain as a software connector. Section VI enumerates the lessons learned from our experience, before Section VII concludes the paper.

Table I: Examples of blockchain applications and platforms

Cryptocurrencies
  Bitcoin [19]     https://bitcoin.org/
  Peercoin         http://peercoin.net/
  Colouredcoins    http://coloredcoins.org/
  Omni             http://www.omnilayer.org/
  Nxt              http://nxt.org/
Smart contract platforms
  Ethereum         https://www.ethereum.org/
  Counterparty     http://counterparty.io/
Ledger platforms
  Factom           http://factom.org/
  Ripple           https://ripple.com/
  Eris             https://erisindustries.com/
  MultiChain       http://www.multichain.com/
  Enigma           http://enigma.media.mit.edu/
II. BLOCKCHAIN
A. Background
Initially, the blockchain was the key technique behind Bitcoin [19]. The blockchain is a public ledger maintained by all the nodes within the cryptocurrency network. The blockchain stores all the transactions that have ever occurred in the cryptocurrency system. Later, the concept was generalized to a distributed ledger that exploits the blockchain to verify and store transactions without needing cryptocurrency or tokens [27].

The blockchain network does not rely on any central trusted authority with the power to control the system, unlike traditional centralized banking and payment systems. Instead, trust is achieved as an emergent property from the interactions between nodes within the network. In this paper, we use blockchain to refer to the data structure replicated on the nodes and blockchain network to refer to the infrastructure composed of a decentralized peer-to-peer network of nodes.

Blocks and transactions are the two essential elements making up the blockchain. Seen as a data structure, the blockchain is an ordered list of blocks. Blocks are the containers aggregating transactions. Every block is identifiable, and linked back to its previous block in the chain.

Transactions represent state transitions with ownership information, which could include new data records and transfer of control among participants. The transactions in cryptocurrencies are the data structures that encode the monetary value being transferred between accounts. More generally, such as in Ethereum, the transactions are a set of identifiable data packages that store monetary value, code, and/or parameters and results of function calls. The integrity of the transactions is ensured by cryptographic techniques.

Once created, the transaction is signed with the signature of the transaction's initiator, which indicates the authorization to spend the money, create the contract, or pass the data parameters associated with the transaction. If the signed transaction is properly formed, it is valid and contains all the information needed to be executed.

The transaction is sent to a node connected to the blockchain network, which knows how to validate the transaction. Invalid transactions are discarded, while valid transactions are propagated to three to four other connected nodes, which will further validate the transactions and send them to their peers until the transaction reaches every node in the network.

This flooding approach guarantees that a valid transaction will reach the whole network within a few seconds. The senders do not need to trust the nodes they use to broadcast the transactions, as long as they use more than one to ensure that the transaction propagates. The recipients do not need to trust the senders either, because the transactions are signed and contain no confidential information or credentials such as private keys.

When a transaction reaches a mining node, it is verified and included in a block, which is propagated to the network. The block is chained into the blockchain once the whole network reaches a consensus. Once recorded on the blockchain and confirmed by sufficient subsequent blocks, the transaction becomes a permanent part of the public ledger and is accepted as valid in principle by all nodes within the blockchain network.

B. Blockchain Applications and Platforms

Table I gives some examples of blockchain platforms that use the blockchain at the core of their architecture.

1) Cryptocurrency: Cryptocurrency uses cryptography to control the monetary issuance and to secure the transactions. The first cryptocurrency, Bitcoin, created in 2009, is still the most widely used cryptocurrency [1]. Bitcoin allows developers to add 40 bytes of arbitrary data to a transaction, which can be permanently recorded on the blockchain. Thus, the blockchain of Bitcoin has been used to register assets and ownership other than monetary transactions, as in Ascribe1.

Some cryptocurrencies are overlay networks on Bitcoin, for example, coloured coins, which taint a subset of Bitcoin to represent and manage real-world assets. Other overlay networks define a completely new transaction syntax, such as Omni and Counterparty. There are also cryptocurrencies that have their own blockchains built from scratch, such as Nxt. Please refer to [18], [3] and [27] for more comprehensive surveys on the state of the art of existing cryptocurrencies.

1 Ascribe: https://www.ascribe.io/
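To make the data structure of Section II-A and the asset-registration use of Section II-B concrete, the following is a minimal sketch in Python. It is not Bitcoin's actual block or transaction format: the `Transaction`, `Block` and `Chain` names are hypothetical, signatures and consensus are omitted, and the only figure taken from the text is the 40-byte payload limit. Each block stores the hash of its predecessor, so rewriting an earlier block breaks every later link.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Iterable, Tuple

MAX_PAYLOAD_BYTES = 40  # payload limit quoted in the text for Bitcoin


@dataclass(frozen=True)
class Transaction:
    """Hypothetical transaction carrying a small arbitrary payload."""
    sender: str
    payload: bytes

    def __post_init__(self) -> None:
        if len(self.payload) > MAX_PAYLOAD_BYTES:
            raise ValueError("payload exceeds 40 bytes")


@dataclass(frozen=True)
class Block:
    """Container of transactions, linked back to the previous block."""
    index: int
    prev_hash: str
    transactions: Tuple[Transaction, ...]

    def digest(self) -> str:
        # Hash a canonical JSON encoding of the block contents.
        body = json.dumps(
            {
                "index": self.index,
                "prev_hash": self.prev_hash,
                "txs": [[t.sender, t.payload.hex()] for t in self.transactions],
            },
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Chain:
    """Append-only ordered list of blocks (no consensus, no signatures)."""

    def __init__(self) -> None:
        self.blocks = [Block(0, "0" * 64, ())]  # genesis block

    def append(self, transactions: Iterable[Transaction]) -> Block:
        prev = self.blocks[-1]
        block = Block(prev.index + 1, prev.digest(), tuple(transactions))
        self.blocks.append(block)
        return block

    def is_consistent(self) -> bool:
        # Every block must reference the digest of the block before it.
        return all(
            later.prev_hash == earlier.digest()
            for earlier, later in zip(self.blocks, self.blocks[1:])
        )


# Register an asset by storing only its 32-byte SHA-256 digest on-chain.
chain = Chain()
chain.append([Transaction("alice", hashlib.sha256(b"artwork file").digest())])
chain.append([Transaction("bob", b"another record")])
```

A 32-byte digest fits within the 40-byte limit, which is why registering a fingerprint of an asset, rather than the asset itself, is feasible under such a constraint.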
2) Smart contract: The smart contract is the most important element in the second generation of blockchains, which enables a generally programmable infrastructure. The smart contract is deployed and executed on the blockchain network and can be used by the components connected to the blockchain to reach agreements and solve common problems with minimal trust.

There are platforms that allow end users to build self-executing contracts on the Bitcoin blockchain network, for example, smartcontract2. The smart contract can still be updated after being submitted and before being propagated to the network. However, smart contracts on the Bitcoin blockchain network are very simple due to the limited expressiveness of the corresponding scripting language, which does not support complex control flow.

Ethereum, as a blockchain-based platform, views the smart contract as its first-class element. Ethereum has built its own blockchain from scratch with a built-in Turing-complete script language for writing smart contracts. Counterparty has recreated the Ethereum smart contract platform on Bitcoin3. The smart contract has been used to enable programmable transactions and machine-to-machine communication in IoT (Internet-of-Things), for example, in the ADEPT (Autonomous Decentralized Peer-To-Peer Telemetry) project of IBM [10].

2 Smartcontract: http://www.smartcontract.com/
3 http://counterparty.io/news/counterparty-recreates-ethereums-smart-contract-platform-on-bitcoin/

III. THE BLOCKCHAIN CONNECTOR

A. Software Connector

Software connectors are the fundamental building blocks of software interactions [16]. A connector is an interaction mechanism for the components. Connectors include pipes, repositories, and sockets. For example, middleware can be viewed as a connector between the components that use the middleware [6]. Connectors in distributed systems are the key elements to achieve system properties, such as performance, reliability and security. Connectors provide interaction services, which are largely independent of the functionality of the interacting components [26]. The services provided by a software connector can be classified into four categories: communication, coordination, conversion and facilitation. Communication services transfer data among components, while coordination services transfer control among components. Conversion services adjust the interactions to allow components that have not been exactly tailored for each other to establish interactions. Facilitation services help to support and optimise components' interactions.

B. Overview

Fig. 1 gives an overview of the blockchain playing the role of software connector. The blockchain is a complex, network-based software connector, which provides communication, coordination (through transactions, smart contracts and validation oracles) and facilitation services [16]. The validation oracle facilitates component coordination within the network using external, independently managed state. Other facilitation services include cryptography-based secure clearing payment, mining, transaction validation, incentive mechanisms, and permission management.

Figure 1: Overview of blockchain as connector (diagram labels: node; application layer with off-chain data and off-chain control; blockchain layer; blockchain connector comprising permission management, incentive mechanism, validation oracle, transaction validation, mining, secure clearing payment and chain; blockchain network)

Every node in the blockchain network has two layers, namely, the application layer and the blockchain layer. Part of the application is implemented inside the blockchain connector in terms of smart contracts. The part of the application outside the blockchain connector might host off-chain data and application logic, and interacts with the blockchain through transactions. Table II shows some design decisions developers need to consider when using blockchain as a connector and summarizes the corresponding impact on quality attributes.

One of the main architectural decisions for a software connector is what functionality is implemented in the connector and what is implemented in the component. In the case of blockchain, this decision concerns which data and computation should be placed on-chain or kept off-chain (Application Design Decision 1 in Table II). While the blockchain provides a trust-less network that can verify partial computational results and provide agreements on the outcomes of transactions, the amount of computational power and data storage space available on the blockchain network remains limited.

Another decision concerns the access scope of the blockchain: public, private or consortium/community [4] (Application Design Decision 2 in Table II). Most of the cryptocurrencies are built on top of public blockchains, which can be accessed and mined by anyone with Internet access. Using a public blockchain results in better information transparency and auditability, but sacrifices information privacy. A consortium blockchain is used across multiple organizations. The consensus process of a consortium blockchain is controlled by authorized nodes. The right to read the blockchain may be public, or restricted to the participants of the blockchain
network. A private blockchain's write permission is kept to one organization. Using consortium and private blockchains requires a permission management component to authorize the participants within the network. There are many platforms that support building consortium chains and private chains, for example, MultiChain and Eris.

Additionally, a blockchain-based system can maintain a unique chain to record all types of transactions together or maintain multiple chains to isolate information of separate parties or of separate concerns, for example, using one chain to store transactions, and using a separate chain to store access control information (Application Design Decision 3 in Table II).

Table II: Design decisions and quality attribute trade-offs

Blockchain Design Decision 1
  Mechanisms of improving transaction processing rate
  Larger block size; Off-chain transactions; Smaller transaction without signature; Scalable protocol
Blockchain Design Decision 2
  Mechanisms of selecting the next block included in the blockchain
  Proof-of-work, Proof-of-stake, Proof-of-burn, Proof-of-retrievability
Application Design Decision 1
  Scope: on-chain
    Enable verification of computational result, limited computation power and data storage
    Examples: Metadata (V-A), Negotiable value (V-B)
  Scope: off-chain
    More computation power and data storage, less cost, additional trust required
    Examples: Raw personal data (V-A), Sensitive information (V-B)
Application Design Decision 2
  Public chain
    Information transparency, growth potential to larger scale, trustworthy, existing user base
    Examples: V-A
  Private chain
    Easier management, better privacy
    Examples: Consortium blockchain (V-B)
Application Design Decision 3
  Single chain
    Easier chain management and permission management, harder data management and isolation
    Examples: V-A, V-B
  Multiple chains
    Information isolation, harder chain management and permission management
Application Design Decision 4
  External validation oracle
    Introduce a third party trusted by the whole network
    Examples: Arbitrator (V-A)
  Internal validation oracle
    Periodically injecting external state into the blockchain might introduce latency issues. The source of external state also needs to be trusted.
Application Design Decision 5
  Permissionless vs. permissioned blockchain
    Trade-offs: Performance, cost, censorship, reversibility, finality, flexibility in governance
    Permissions: Read/Join network, submit transaction, mine, create assets
    Example: Permissioned (V-A, V-B)

Challenges of public blockchain: Scalability is one of the main criticisms of public blockchains. Currently, public blockchains, like Bitcoin and Ethereum, can only handle on average 3-20 transactions per second, while a mainstream payment service, like VISA, can handle on average 2000 transactions per second. There are works trying to improve the scalability (Blockchain Design Decision 1 in Table II). Bitcoin plans to increase its block size from 1MB to 8MB to allow miners to include more transactions in one block. The Bitcoin Lightning Network [21] moves some of the transactions off-chain. A multisignature transaction is established between two participants as a micropayment channel to transfer value off-chain. Once both sides wish to close the micropayment channel and finalize the value transfer, a transaction is submitted to the global Bitcoin blockchain. Segregated witness4 proposes to remove the signatures from transactions to reduce the size of transactions, so that one block can contain more transactions. Bitcoin-NG [8] decouples Bitcoin's blockchain operation into two planes: leader election and transaction serialization. Once a leader is selected randomly, it is entitled to serialize transactions until the next leader is selected.

Another concern of using blockchain is that all the information on the blockchain is publicly available to all the participants within the network, especially the information on a public blockchain, which is publicly accessible by everyone. Cryptography is the only way to preserve data privacy.

Besides, if a public blockchain is used, running computations on the blockchain costs actual money. Thus, applications are not supposed to deploy all computations and store all data on the blockchain. A common practice, which we also observed in our projects, is to keep the big and private raw data off-chain, and to store the meta-data on-chain.

4 https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki

C. Communication Service

The communication service is a primary building block of component interaction, which transfers data among components. Fig. 2 shows the internal structure of a node within the blockchain network. Components use blockchain as a mediator to transfer data. There are two ways to store data on the blockchain. One is to add data into transactions, as in Bitcoin; the other is to add data into contract storage, as in Ethereum. Both ways store data through submitting transactions to the blockchain, which may contain the information of money transfer together with some arbitrary data. After the transaction is included in the blockchain, the data becomes publicly accessible to the components within the network.

Some blockchain platforms provide an API and/or tools to access and filter the historical transactions. Ethereum suggests caching all transactions to prevent the blockchain network from being under heavy stress due to frequent queries. The authors of MultiChain also plan to establish a bridge between its blockchains and regular relational databases in its future versions. Using ordinary database indexing techniques, the historical transactions can be analyzed more efficiently.

Other than transactions, blocks also contain the state of the whole system after applying those transactions.
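The idea that a block records the state reached after applying its transactions can be read as a fold over the transaction list. The sketch below is an illustration only. It assumes a deliberately simple account-balance state and a hypothetical `Transfer` type, which is not the state model of any particular platform.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Transfer:
    """Hypothetical transaction moving `amount` units between accounts."""
    sender: str
    recipient: str
    amount: int


def apply_transfer(state: Dict[str, int], tx: Transfer) -> Dict[str, int]:
    """Return the state after `tx`; the input state is left untouched."""
    if tx.amount <= 0 or state.get(tx.sender, 0) < tx.amount:
        raise ValueError(f"invalid transfer: {tx}")
    new_state = dict(state)
    new_state[tx.sender] -= tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state


def replay(genesis: Dict[str, int], transfers: Iterable[Transfer]) -> Dict[str, int]:
    """Recompute the current state by applying every transfer in order."""
    state = dict(genesis)
    for tx in transfers:
        state = apply_transfer(state, tx)
    return state


history = [Transfer("alice", "bob", 4), Transfer("bob", "carol", 1)]
final_state = replay({"alice": 10}, history)
```

Because the state is a pure function of the ordered transactions, any node holding the same transaction history can recompute the same state independently.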
In Bitcoin, the state is the collection of coins of all the accounts that have not been spent yet. In Ethereum, the state of the system is the changes of the whole contract storage. In Ethereum, every contract has its own storage, to which only the contract can write. The contract storage can be viewed as a flexible key-value data store. The data stored in contract storage can be updated through sending transactions with new values to the corresponding contract. The contract has an address, which is used to query the contract storage. In the block, the state is stored in a tree data structure. For example, Bitcoin uses a Merkle tree whereas Ethereum uses a Patricia tree. As with the transactions, the state of contract storage can be queried through an API. By default, the query returns the current state.

Figure 2: Interaction between applications and blockchain (diagram labels: two nodes, each with an application layer holding off-chain data and off-chain control, and a blockchain layer holding contracts; transactions (TX) between the layers; validation oracle)

D. Coordination Service

Different components of the architecture can coordinate their computations through the blockchain. To do so, it is possible to submit transactions to smart contracts to invoke the functions defined in the smart contracts, or to use a validation oracle to sign transactions, the outcome of which depends on the external state.

As shown in Fig. 2, the control of the application flows through transactions initiated from externally owned accounts and transferred among contract accounts. Contracts behave like autonomous agents that live in the execution environment of the blockchain network. Contracts are instantiated by submitting transactions with the source code of the contracts to the blockchain network. A contract defines a set of functions. At the function level, the contract runs the code of a function when receiving a transaction calling the function with its required parameters. At the contract level, a contract can create a new contract by sending a transaction. The contract can also kill itself. The contract cannot receive any transactions after killing itself, but the source code of the contract cannot be removed from the blockchain. The source code is permanently stored within the transaction that creates the contract.

Validation oracle: The execution environment of blockchain is a closed environment, which is not allowed to import external states through polling external servers. To address this limitation, the concept of a validation oracle is introduced to evaluate conditions that cannot be expressed within blockchains.

A validation oracle is a mechanism that facilitates component coordination within the network using external state. The validation oracle is part of the blockchain connector, but can be independent of the blockchain network (Application Design Decision 4 in Table II). When the validation of a transaction depends on some external state, the validation oracle is requested to validate the transaction and sign the valid transaction. This will block the progress of the transaction until a condition over the external state is verified by the validation oracle.

If the validation cannot be automated, a human arbitrator can validate transactions and sign valid transactions. If the validation can be automated, an automated arbitrator could periodically pull the value of the variables from contract storage as the state of the application to validate the transactions. However, both ways introduce an externally trusted party again. The last way is to inject the external state into the blockchain through periodically updating the value of variables within the contract storage. This last way can cause a time delay between the change of the external state and the change being injected into the blockchain.

In Bitcoin, an automated validation oracle can be implemented as a server outside the blockchain network, which has its own key pair. When a transaction requires external state to be validated, the validation oracle is requested to sign the transaction on demand. The logic of validating the transaction is defined by the user. Thus, the validation oracle signs a transaction when the user-defined expression on the server evaluates to true [2]. To reduce the required trust, Orisi5 runs a set of independent validation oracles. Orisi allows the participants involved in a contract to select a set of oracles they are both comfortable using before initiating the contract, and then sign a contract requiring a certain number of the validation oracle signatures.

5 Orisi: Distributed Bitcoin oracles: http://orisi.org/

E. Facilitation Services

1) Transaction validation: The mechanism to validate transactions is specific to blockchains. Generally, the transactions are validated by being re-executed by the node that receives the transactions. For example, in Bitcoin, the validation of transactions relies on two scripts: a locking script in the output of a transaction, which specifies the conditions to spend the coins referred to by the transaction, and an unlocking script, which satisfies the conditions placed on a transaction output by the locking script. When a transaction is validated, the unlocking script in each input is executed alongside the corresponding locking script to see if it satisfies the spending condition. The transaction is valid if the result of executing the scripts is “TRUE”, which means the unlocking script has succeeded in resolving the spending condition. If the result is not “TRUE” after executing the combined script, the transaction is invalid.

2) Mining mechanism: Mining is a process in which some nodes within the blockchain network aggregate transactions into blocks. These nodes are called miners in the blockchain network. Once a new block is generated by a miner, the miner propagates the block to the blockchain network.
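Proof-of-work, named in Section I as one mechanism for deciding the next writer of the ledger, can be illustrated as a search for a value that is costly to find but cheap to check. The sketch below is a simplification under stated assumptions: it uses a single SHA-256 over a hypothetical byte-string header plus a nonce, and a `difficulty_bits` parameter introduced here for illustration. It is not Bitcoin's actual header format or difficulty encoding.

```python
import hashlib


def block_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with a candidate nonce, as an integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check: a single hash compared against the target."""
    target = 1 << (256 - difficulty_bits)
    return block_hash(header, nonce) < target


def mine(header: bytes, difficulty_bits: int) -> int:
    """Costly search: try nonces until the hash falls below the target."""
    nonce = 0
    while not verify(header, nonce, difficulty_bits):
        nonce += 1
    return nonce


# Each extra difficulty bit doubles the expected number of attempts.
example_header = b"hypothetical block header"
example_nonce = mine(example_header, difficulty_bits=12)
```

Finding `example_nonce` takes on the order of 2^12 hash evaluations on average, while checking it takes exactly one, which is the asymmetry that makes the proof costly to produce and easy to verify.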
block is included into the public blockchain after the whole sanctions and seizure of assets [7]. Besides, the “Proof-of-
network reaching consensus. work” process, used as a Sybil protection mechanism, is costly
There are different mechanisms to select the miner as the and wasteful.
next author to update the ledger (Blockchain Design Decision By contrast, permissioned blockchain networks, like Ripple
2 in Table II). In Bitcoin, the miner is chosen at random and Eris Industries, are more congruent with traditional bank-
through “Proof-of-work”. “Proof-of-work” is a piece of data ing systems and can provide more utility to financial institu-
that is very costly to produce but easy to be verified. Producing tions [25]. In permissioned blockchain networks, the identity
a “Proof-of-work” is a random process with low probability. of the validators or even the participants is whitelisted through
Thus, the miners in Bitcoin network compete to generate the some types of KYC (Know Your Customer) procedure, which
“Proof-of-work” through burning their CPU time. The first is a widely used method of managing identity in traditional
miner to find the “Proof-of-work” is the potential next author finance. It means that the participants of the system require
of the blockchain. The difficulty of the work is adjusted to legal identities in real world to validate transactions. Thus,
generate a new block every 10 minutes. However, proof of permissioned blockchains are able to legally host off-chain
work largely limits the capacity of processing transactions. assets in the real world, while permissionless systems cannot.
“Proof-of-stake” is an alternative mechanism, which grants Other than the permission of validation, basic permissions
mining rights to participants in proportion to their holding like joining the network, submitting transactions, mining, and
of the currency within the blockchain network. For example, creating assets can be also managed by the permission man-
the miners in Peercoin blockchain network need to prove agement service. Once joining a blockhain network, the par-
the ownership of a certain amount of currency to mine ticipant inherently has the read permission on the blockchain
blocks. “Proof-of-stake” blockchains provide protection from a malicious attack because executing an attack would require the attackers to own a large amount of currency, which is very expensive. Besides, miners owning a large stake most probably won’t attack the system, for example, through double spending. In the long term, such attacks would decrease the value of the cryptocurrency and the value of their stake.

The “Proof-of-burn” process used in the Counterparty blockchain involves destroying Bitcoins and generating a proportional amount of XCP (the coins used in Counterparty). More recently, Permacoin proposes a modification to Bitcoin [17], which uses “Proof-of-retrievability” to re-purpose Bitcoin’s mining resource to distributed storage of archival data. This approach provides additional incentives to contribute resources to the network.

3) Secure clearing payment: Blockchain provides a service of trusted peer-to-peer payment through cryptography. Every transaction is associated with the public key of its initiator. The transaction can be broadcast to the blockchain network only after the initiator signs the transaction with the corresponding private key. Thus, authenticity is enforced by the key pairs. Transaction validation checks whether new extra money is created in the blockchain network by a specific transaction.

4) Permission management: Blockchains can be classified into permissioned blockchains and permissionless blockchains (Application Design Decision 5 in Table II). The service of permission management is provided by a permissioned blockchain network.

The participants of a permissionless blockchain network are either pseudonymous or anonymous, as in Bitcoin and Ethereum. Using anonymous validators carries the risk of a Sybil attack, where the attacker gains a disproportionate amount of influence on the system. For example, in Bitcoin, any participant with a sufficient share of computational power is able to change the records in the blockchain without respecting jurisdictional boundaries and therefore undermine financial

because all the information recorded is publicly available. The permission information can be stored on-chain as well.

There are trade-offs between permissioned and permissionless systems, including transaction processing rate, cost, censorship, reversibility, finality [25] and the flexibility in changing and optimizing the network rules.

5) Economic incentive: Every blockchain introduces economic incentives, reputation and rating mechanisms for miners to validate transactions and generate blocks, and for participants to be honest.

For example, in Bitcoin, the miners have two incentives to mine blocks: the reward for generating new blocks and the transaction fees associated with the transactions aggregated into the blocks. Ethereum also charges a computation fee, paid to the miner, for executing smart contracts. Enigma has a fixed price for storage, data retrieval, and computation within the network. Besides, a node is required to submit a security deposit to join the network. If a node is found to lie, its deposit will be split among the other honest nodes.

IV. COMPARISON WITH OTHER CONNECTORS

A. Centralized, Shared Data Store

Shared data stores, like key-value stores, export a basic Create/Read/Update/Delete (CRUD) interface. The blockchain is an append-only data store, as it does not support update but rather supports the creation of new transactions. Any changes/updates on contract states are appended to the blockchain as new transactions. An analogy to this so-called ledger in data stores is the concept of a log, where data items get appended but never deleted or updated. This immutability-of-stored-information property is the key to the traceability of the relevant assets recorded on the blockchain.

Traditional shared data stores use different strategies to improve sustainability and throughput, and reduce latency, such as master-slave replication and multi-master replication. Blockchain provides a more sustainable data storage because the data is duplicated on every node within the blockchain
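As an illustration of the append-only behaviour discussed in Section IV-A, the following minimal sketch models a store that offers no update or delete operation: an "update" is recorded as a new entry, and the current state is derived by replaying the log. The class name `AppendOnlyLedger` and its methods are hypothetical names chosen for this example and do not correspond to any blockchain API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class AppendOnlyLedger:
    """Toy append-only store (hypothetical): entries are added, never changed."""
    _entries: List[Tuple[str, Any]] = field(default_factory=list)

    def append(self, key: str, value: Any) -> int:
        """Record a new entry and return its position in the log."""
        self._entries.append((key, value))
        return len(self._entries) - 1

    def current_state(self) -> Dict[str, Any]:
        """Derive the latest value of every key by replaying the whole log."""
        state: Dict[str, Any] = {}
        for key, value in self._entries:
            state[key] = value  # later entries supersede earlier ones
        return state

    def history(self, key: str) -> List[Any]:
        """Return every value ever recorded for a key, oldest first."""
        return [value for k, value in self._entries if k == key]


ledger = AppendOnlyLedger()
ledger.append("asset-1/owner", "alice")
ledger.append("asset-1/owner", "bob")  # an "update" is just a new entry
```

Because earlier entries are never overwritten, `history` can still report the previous owner after the change, which is the traceability property the text attributes to immutability.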
network. But the throughput of some blockchains is not comparable with that of shared data stores due to the latency caused by mining.

Traditional shared data stores have their own consensus protocols to synchronize replicas [11] in a fully trusted environment, such as 2-Phase Commit and Paxos. The consensus protocol of a blockchain aims to tolerate the Byzantine Generals’ Problem [13], in which components of the system aim at reaching agreement among themselves to process correct operations despite a faulty component. The comparison of the consensus protocols used for blockchains and for general distributed systems is detailed in Section IV-B.

Besides, blockchain is able to validate the consistency of transactions based on rules attached to the transactions in the form of smart contracts. Such rules can be applied on the whole blockchain, for example, to prevent the double-spending problem by checking whether new extra money is created during a spending transaction.

B. Replicated State Machine

The replicated state machine [22] is a general method to implement a fault-tolerant service with a distributed system. To cope with failures, it replicates the service at several servers and coordinates the service requests issued by the clients. Similarly, the blockchain uses distribution so as not to depend or rely on any single entity.

State machine replication typically relies on a consensus protocol that takes as input the requests of the components and decides upon one of these requests [12]. In the case of a distributed locking service, the consensus will guarantee that only one particular client acquires a lock when multiple clients request it concurrently. Blockchain also features a consensus protocol to ensure that among multiple conflicting proposed transactions, only one gets approved, preventing, for example, a double spending of the same coins.

To reach consensus on a particular transaction request, the replicated state machine requires sufficiently many votes. Replicated state machines rely on quorums of voters [15] that stem from the concept of weighted votes [9]. Typical blockchain implementations also require sufficiently many votes. In Ripple, sufficiently many votes are obtained when a minimum number of nodes in a unique node list have voted, whereas in Bitcoin sufficiently many votes are obtained when a sufficiently complex challenge (Proof-of-work) is solved.

A replicated state machine supports communication by transmitting data among components. Components can store and retrieve information that will persist despite failures. State machine replication guarantees that the information stored by one component gets replicated and delivered to other components upon request even when some failures occur.

To address arbitrary failures or Byzantine failures [13], replicated state machines exploit security mechanisms. The sender of a message is typically authenticated with public-key cryptography so that the encryption with the sender’s private key serves as a signature to whoever decrypts the message with the corresponding public key. The digital signature resulting from public-key cryptography is also used in blockchains to preserve the ownership of coins. Collision-resistant hash functions help verify the integrity of the message. They take the content of the message and produce a digest. This digest, once sent encrypted, allows the receiver to observe that the signed message was not altered. As an example, Bitcoin uses SHA-256 whereas the early replicated state machine tolerating Byzantine failures [5] used the AdHash solution based on MD5.

Finally, a replicated state machine totally orders the requests from components. It controls concurrency by scheduling requests issued by components and thus serves as a facilitation connector. This total order is also the key property of the blockchain. In Bitcoin, each block contains the hash of the previous block according to this total order, which allows preceding transactions to be audited by backtracking the blockchain up to the genesis block. To maintain this total order and to prevent the chain from becoming a tree, miners always append blocks to the first chain of maximal length they hear of, and all transactions that are part of forked blocks in shorter branches are simply discarded.

V. PROJECT RETROSPECTIVE

A. Data Monetization

Our first project is a platform to support the scenario of data monetization, in which the data owners increase the value of their data through trading their data sets with data consumers. We consider two use cases. In one case, the data providers publish their data sets on the platform and the data consumers could select data sets from one or more data owners to do analytics for different purposes, and pay the data owners according to the value of their data sets. In the other case, the data consumers first post their analytics jobs with the price information on the platform, after which data owners could browse the list of analytics jobs and select the jobs based on the conditions defined in the offer.

The platform can be used in different business scenarios, for example, trading personal data produced by individuals. This scenario is inspired by [14], which discusses an economy of micropayment based on the Web to compensate people for original creative work they post on the Web. Thus, personal data is treated as private property that can be traded.

Another possible scenario is data analytics across organizations. The organizations doing data analytics pay the organizations that provide the data. The amount of money is calculated based on the value of the data set. For example, in an analytics job based on two data sets from two different insurance companies, the data set from a company with a larger number of customers is more valuable than the one from a company with a smaller number of customers. Thus, the insurance company that provides the more valuable data set gets more money from the organization doing the analytics using the data sets.

Fig. 3 shows the architecture of the platform. The platform provides mainly three functions, including data trading,
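The insurance example in Section V-A can be made concrete with a small worked calculation. The sketch below assumes, purely for illustration, that a data set's value is proportional to its number of customers; the paper does not define a value model, and the function name `split_payment`, the owner labels and the amounts are all hypothetical.

```python
from typing import Dict


def split_payment(total_cents: int, customers: Dict[str, int]) -> Dict[str, int]:
    """Split a payment among data owners in proportion to customer counts.

    Proportionality is an illustrative assumption, not the platform's
    actual value model.
    """
    total_customers = sum(customers.values())
    if total_customers <= 0:
        raise ValueError("at least one data set must have customers")
    # Integer arithmetic in cents avoids floating-point rounding surprises.
    shares = {
        owner: (total_cents * count) // total_customers
        for owner, count in customers.items()
    }
    # Any cents lost to rounding down go to the largest contributor.
    remainder = total_cents - sum(shares.values())
    largest = max(customers, key=customers.get)
    shares[largest] += remainder
    return shares


payout = split_payment(10000, {"insurer_a": 300000, "insurer_b": 100000})
```

Under this assumption the company with three times as many customers receives three quarters of the payment, which matches the qualitative statement that the more valuable data set earns more money.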
compliance checking of user-defined usage policy and data analytics. The policy compliance checking and data analytics are off-chain functionality, the technical detail of which is out of the scope of this paper.

[Figure 3: Architecture of data monetization platform. Owner and consumer interact (register data set, upload data analytics script, transfer raw data). On-chain: dataset registry (metadata, policy address), job registry (contribution criteria, dataset requirement), tamper-proof log of events (usage policy compliance result; when and what analytics job), conditional payment. Off-chain: data analytics infrastructure, hosted raw data with usage policy specification, policy compliance checker (policy compiler, conflict resolution, policy enforcement).]

The blockchain in this project allows the communication and facilitates the interactions between data owners and data consumers through running a set of smart contracts, logging events in an immutable data storage and providing a conditional payment infrastructure.

On the blockchain, there is a data set registry implemented as a smart contract, which stores all the data sets registered on the platform and allows data owners to register a new data set on the blockchain. The new data set is registered through calling the data set registry contract to create a data set contract, which stores the hash of the data set to allow consumers to check the integrity of the off-chain data. The data set contract also holds the metadata of the data set, like its description and size, and a pointer to the corresponding user-defined usage policy, which is stored off-chain.

Similarly, we use another smart contract to implement a job registry that stores the list of the existing analytics jobs on the platform and allows data consumers to add new analytics jobs. Every analytics job is a contract, which defines the requirement of the requested data sets and the criteria to measure the contribution of an involved data set, for example, the size of the data set, the publish date of the data set, and the coverage of the data set. A more comprehensive value model is out of the scope of this project. The trading and negotiation logic are implemented in smart contracts as well.

Blockchain provides a tamper-proof log of events that ever occurred in the platform, including both on-chain and off-chain events/activities/data. On-chain events/activities include registering, trading and negotiating. Off-chain data include results of usage policy compliance checking and the information of analytics jobs, such as processing time, the data sets involved, and the monetary value eventually paid to each of the data owners. Besides, the blockchain is inherently a payment infrastructure that supports conditional payment. In our case, for example, the payment is triggered before the analytics job starts. The amount of money is calculated by the smart contract according to the contribution criteria associated with the analytics job and the metadata of the involved data set.

Users, as data owners or data consumers, interact with the smart contracts running on the blockchain. In this platform, due to its size, the raw data is stored and transferred off-chain. This reflects current practices of popular Web applications which allow users to download the data associated with their accounts, for example, the Google Takeout service (footnote 6).

One issue with this kind of marketplace is how to verify that the data being sold complies with the owner’s description. In our platform, we introduce reputation and rating mechanisms for data owners to be honest. A similar ongoing industrial project, called Slur (footnote 7), is an anonymous marketplace for trading secret information. Slur introduces an additional role, called Arbitrators, that validates the data. When the buyers claim that the content does not match the seller’s description, Slur randomly selects several arbitrators to evaluate the content. The arbitrators are paid for their effort.

B. Organizations Sharing Sensitive Data

Another project we are working on is a platform for participating organizations to securely negotiate and store sensitive data values (such as prices, delivery dates, or legal contracts). The architecture of this project is given in Fig. 4. This scenario requires secure data exchange and negotiations. Some details had to be omitted and generalised due to Intellectual Property reasons.

The users could log on to the platform via federated access. There are negotiation templates stored in a central place with specific value fields in the template to be negotiated. The sensitive information is still kept inside the organizations where the data originated, and thus not available to other organizations using the same platform or stored in any centralised third party platform. The negotiation can be initiated, negotiated and signed through a web application or a mobile application.

As a part of the platform, blockchain is used to facilitate the negotiation by using smart contracts, and to store the different versions of sensitive data produced during negotiation. One smart contract is used to represent one negotiation. The negotiation is initiated by a participant from an existing template by selecting initial values for the negotiable variables. All the values produced during the negotiation are included in the blockchain. Since the information on the blockchain is publicly available to all users, the value is encrypted before being

6 https://www.google.com/settings/takeout
7 http://slur.io
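The integrity check enabled by the data set registry of Section V-A (the hash is kept on-chain, the raw data travels off-chain) can be sketched as follows. This is a minimal in-memory stand-in, not the project's smart contract: the class name `DataSetRegistry`, its methods, the data set identifier and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from typing import Dict


class DataSetRegistry:
    """Toy stand-in (hypothetical) for an on-chain data set registry."""

    def __init__(self) -> None:
        self._hashes: Dict[str, str] = {}

    def register(self, dataset_id: str, raw_data: bytes) -> str:
        """Store only the digest of the data set; the raw data stays off-chain."""
        if dataset_id in self._hashes:
            raise ValueError(f"{dataset_id} is already registered")
        digest = hashlib.sha256(raw_data).hexdigest()  # SHA-256 is an assumption
        self._hashes[dataset_id] = digest
        return digest

    def verify(self, dataset_id: str, received: bytes) -> bool:
        """Check data received off-chain against the registered digest."""
        expected = self._hashes.get(dataset_id)
        if expected is None:
            return False  # unknown data set
        return hashlib.sha256(received).hexdigest() == expected


registry = DataSetRegistry()
registry.register("claims-2015", b"policy_id,amount\n1,100\n")
```

A consumer who receives the raw data off-chain would call `verify` with the received bytes; any modification of the data changes the digest and makes the check fail.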
included into the blockchain. When a negotiation is created by an involved participant, our platform generates a secret key associated with the negotiation, which is used to encrypt the value of the negotiable variables before adding the information into the blockchain. Then a smart contract is generated to facilitate this negotiation. The smart contract 1) implements the negotiation process, 2) has access-control management to restrict the access to the negotiation, and 3) distributes the secret key of the contract by encrypting it with the participant’s public key, which is his/her blockchain address, and allows the participant to retrieve his/her encrypted contract secret key. Once a participant gets the encrypted contract secret key, he/she decrypts it with his/her private key. With the contract secret key, the participant can query the encrypted value of the negotiable variable and decrypt it. The retrieval and decryption of the negotiable variable are transparent to the end user.

[Figure 4: Overview of the legal platform. Users initiate, negotiate and sign. On-chain: key distribution, access control, tamper-proof log of events (proposal of new value; agree/disagree). Off-chain: contract templates, negotiation template, key generation, federated authentication, document generator. Confidential: Organisations A, B and C, each holding its own documents and identities.]

The negotiation is done peer-to-peer and may require manual user intervention. Every activity, such as proposing a new value, or agreeing or disagreeing on a value, is included in the blockchain as a different version of the negotiation. Once an agreement is reached and signed by all the involved participants, the negotiation is finalized and a digital document of the negotiation is generated. The digital document is stored internally in the organisation. To bind the digital document and the corresponding smart contract, the address of the smart contract is included in the digital document, and then the hash of the digital document is included in the smart contract. After the binding, the smart contract could be killed to avoid further interaction and modification.

In this platform, the blockchain prevents tampering and enforces integrity and auditability of the sensitive data. A consortium blockchain or public blockchain can be selected in this scenario since the privacy of the data is preserved through cryptography.

VI. DISCUSSION

Lesson: Scalability and performance. The performance of public blockchains is very limited. As mentioned earlier, public blockchains can only process 3-20 transactions per second. The average transaction processing rate we calculated from the whole blockchain (1020156 blocks at 18/02/2016 00:21:12 GMT) is 1.7 transactions per block, and the average transaction processing rate from the latest 100000 blocks (920156-1020156) is 3.4 transactions per block. The average mining time is 17 seconds.

We conducted a small experiment to test the performance of a private blockchain, and compared the result with the public blockchain. We built an Ethereum private blockchain and created 50 accounts in the genesis block. We issued simple transactions which transfer 0.001 ether from one account to another. The sender and the recipient of the transactions were chosen randomly from the 50 accounts. During the experiment, we found a bug in Ethereum that caused many transactions to fail to be included. The issue was reported and confirmed as legitimate (footnote 8). After fixing the issue, the performance of the private chain became much better than that of the public chain. The number of transactions included in one block was around 15000 on average, and the mining time was around 41 seconds on average. Thus, the transaction processing rate was around 366 transactions per second.

Lesson: Privacy. Public blockchains do not guarantee data privacy. Also, permissionless blockchains cannot preserve the privacy of the data because anyone can join the blockchain network without permission, and all the data on the blockchain is visible to all participants. Thus, for scenarios like the legal contract platform, a permissioned blockchain is more appropriate, as it allows developers to explicitly grant permissions to the participants. Besides, the information on the blockchain might need to be encrypted to preserve privacy. In this case, the key needs to be generated and stored off-chain. Thus, the blockchain doesn’t hold enough information for components without permission to access the sensitive data.

Lesson: Trusted third-party. Using external state does not always introduce the need for trusting an additional party. For example, in the licence renewal scenario, the government is a trusted party anyway; thus, we use the government as a validation oracle that injects external state into the blockchain.

Lesson: Incentives. If a blockchain-based system has computation run off-chain or data stored off-chain, an additional economic incentive is required for the participants to be honest. Incentives for miners may include rewards, transaction

8 https://github.com/ethereum/go-ethereum/issues/2139
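The throughput figures reported under the scalability lesson can be checked with a short calculation. The private-chain rate follows directly from the stated block size and mining time. For the public chain, the text gives only per-block averages, so dividing them by the 17-second average mining time is our assumption about how the two figures combine. The function name `tx_per_second` is hypothetical.

```python
def tx_per_second(tx_per_block: float, seconds_per_block: float) -> float:
    """Average throughput implied by a block size and a block interval."""
    if seconds_per_block <= 0:
        raise ValueError("seconds_per_block must be positive")
    return tx_per_block / seconds_per_block


# Private chain experiment: about 15000 transactions per block, 41 s per block.
private_tps = tx_per_second(15000, 41)

# Public chain, assuming the 17 s average mining time applies to both averages.
public_tps_all = tx_per_second(1.7, 17)      # whole chain
public_tps_recent = tx_per_second(3.4, 17)   # latest 100000 blocks
```

Rounded to the nearest integer, `private_tps` reproduces the reported 366 transactions per second, while the public-chain per-block averages correspond to roughly 0.1 and 0.2 transactions per second under the stated assumption.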
fees, computation fees, or data storage fees. Incentives for participants to be honest can involve security deposits, or the reputation and rating mechanisms used in our first project.

Lesson: Reducing cost. Applications on top of the blockchain could reduce the number of transactions included in the blockchain, for example, by establishing a micropayment channel, which only submits a transaction once the channel is closed by either party. Transient state does not need to be included in the blockchain; for example, not all the activities during a negotiation are worth including in the blockchain. To reduce the number of submitted transactions, an alternative design is to only record the different values of the negotiable variables and the final voting result on each value rather than record every single voting activity.

Lesson: Data and contract management. If the data to be stored by the application is associated with the state of the contract processing it, the data will be discarded once the functionality of the contract is updated through uploading a new version of the contract to the blockchain. To address this problem, we suggest separating the computation from the data in dedicated smart contracts.

Once deployed on the blockchain, the smart contract is always “running” and responding to requests. We suggest killing the contract explicitly once its functionality is no longer used, to avoid further interaction and unnecessary cost.

Lesson: Off-chain data store. We stored metadata on-chain to be publicly accessible, and kept the raw private data off-chain. For example, we put the hash of personal data on-chain, but transfer the raw data off-chain.

Due to the limited size of the data store provided by the blockchain [23], an off-chain data store is necessary for some applications. There are existing platforms providing a data layer on top of the blockchains, such as Factom, which stores only the hash of the private data and small amounts of public data in its own blockchain. Factom also anchors into the Bitcoin blockchain every 10 minutes to be more secure. Distributed data storage, like IPFS (footnote 9) and DHT (Distributed Hash Table), is also sometimes used in combination with blockchains to build decentralized applications.

VII. CONCLUSIONS

In this paper we have presented our experience from using the blockchain in several projects. The blockchain provides communication and coordination services through transactions, validation oracles and smart contracts, and specific facilitation services, including permission management, cryptography-based secure payment, transaction validation, mining and incentives. We have compared the blockchain to related software connectors such as the shared data store and the replicated state machine, highlighting the most important theoretical differences. Based on the practical project experience we have distilled important design decisions implied by the choice of introducing a blockchain in the architecture and discussed the corresponding trade-offs.

9 IPFS: https://ipfs.io/

ACKNOWLEDGMENTS

NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.

REFERENCES

[1] Crypto-currency market capitalizations. http://coinmarketcap.com/.
[2] bitcoinwiki. Contract. https://en.bitcoin.it/wiki/Contract#Example_7:_Rapidly-adjusted_.28micro.29payments_to_a_pre-determined_party.
[3] J. Bonneau, A. Miller, J. Clark, A. Narayanan, J. A. Kroll, and E. W. Felten. SoK: Research perspectives and challenges for Bitcoin and cryptocurrencies. In the 36th IEEE Symposium on Security and Privacy (SP2015), pages 104–121, May 2015.
[4] V. Buterin. On public and private blockchains. https://blog.ethereum.org/2015/08/07/on-public-and-private-blockchains/.
[5] M. Castro and B. Liskov. Practical Byzantine fault tolerance. In Proc. of OSDI, pages 173–186, 1999.
[6] P. Clements, F. Bachman, L. Bass, D. Garlan, J. Ivers, R. Little, R. Nord, and J. Stafford. Documenting Software Architectures: Views and Beyond. Addison-Wesley, 2003.
[7] EBA. EBA (European Banking Authority) opinion on “virtual currencies”. 2014.
[8] I. Eyal, A. E. Gencer, E. G. Sirer, and R. van Renesse. Bitcoin-NG: A scalable blockchain protocol. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), Santa Clara, CA, Mar. 2016. USENIX Association.
[9] D. K. Gifford. Weighted voting for replicated data. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles, pages 150–162. ACM Press, 1979.
[10] IBM. Device democracy saving the future of the internet of things. 2015.
[11] B. Kemme and G. Alonso. Database replication: a tale of research across communities. Proceedings of the VLDB Endowment, 3(1-2):5–12, 2010.
[12] L. Lamport. The part-time parliament. ACM TOCS, 16(2):133–169, 1998.
[13] L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Trans. Program. Lang. Syst., 4(3):382–401, July 1982.
[14] J. Lanier. Who Owns the Future? Simon and Schuster, 2013.
[15] D. Malkhi and M. Reiter. Byzantine quorum systems. In Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, STOC ’97, pages 569–578, 1997.
[16] N. R. Mehta, N. Medvidovic, and S. Phadke. Towards a taxonomy of software connectors. In Proc. of ICSE, pages 178–187, June 2000.
[17] A. Miller, A. Juels, E. Shi, B. Parno, and J. Katz. Permacoin: Repurposing Bitcoin work for data preservation. In IEEE Symposium on Security and Privacy, May 2014.
[18] M. Morisse. Cryptocurrencies and Bitcoin: Charting the research landscape, August 2015.
[19] S. Nakamoto. Bitcoin: A peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf.
[20] S. Omohundro. Cryptocurrencies, smart contracts, and artificial intelligence. AI Matters, 1(2):19–21, Dec. 2014.
[21] J. Poon and T. Dryja. The Bitcoin lightning network: Scalable off-chain instant payments. 2016.
[22] F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial.
[23] P. Snow, B. Deery, J. Lu, D. Johnston, and P. Kirby. Business processes secured by immutable audit trails on the blockchain. 2014.
[24] M. Swan. Blockchain: Blueprint for a New Economy. O’Reilly, US, 2015.
[25] T. Swanson. Consensus-as-a-service: a brief report on the emergence of permissioned, distributed ledger systems. 2015.
[26] R. N. Taylor, N. Medvidovic, and E. M. Dashofy. Software Architecture: Foundations, Theory, and Practice. Wiley, 2009.
[27] F. Tschorsch and B. Scheuermann. Bitcoin and beyond: A technical survey on decentralized digital currencies. IACR Cryptology ePrint Archive, 2015:464, 2015.