Formal Verification of Smart Contracts
Microsoft Research
Inria
Harvard University
{antdl,fournet,gonthier,aseemr,nswamy,santiago}@microsoft.com
{karthikeyan.bhargavan,nadim.kobeissi,thomas.sibut-pinote}@inria.fr
[email protected]
Abstract
1. Introduction
The blockchain technology pioneered by Bitcoin [7] provides a globally consistent, append-only ledger that does not
rely on a central trusted authority. In Bitcoin, this ledger records transactions of a virtual currency, which is created
by a process called mining. In the proof-of-work mining scheme, each node of the network can earn the right to append
the next block of transactions to the ledger by finding a formatted value (which includes all transactions to appear
in the block) whose SHA256 digest is below some difficulty threshold. The system is designed to ensure that blocks are
mined at a constant rate: when too many blocks are submitted too quickly, the difficulty increases, thus raising the
computational cost of mining.
Ethereum is similarly built on a blockchain based on proof-of-work; however, its ledger is considerably more
expressive than Bitcoin's: it stores Turing-complete programs in the form of Ethereum Virtual Machine (EVM)
bytecode, while transactions are construed as function calls and can carry additional data in the form of arguments.
Furthermore, contracts may also use non-volatile storage and log events, both of which are recorded in the ledger.
The initiator of a transaction pays a fee for its execution, measured in units of gas. The miner who manages to
append a block including the transaction gets to claim the fee, converted to Ether at a specified gas price. Some
operations are more expensive than others: for instance, writing to storage and initiating a transaction are four
orders of magnitude more expensive than an arithmetic operation on stack values. Therefore, Ethereum can be thought
of as a distributed computing platform where anyone can run code by paying for the associated gas charges.
The integrity of the system relies on the honesty of a majority of miners: a miner may try to cheat by not running
the program, or by running it incorrectly, but honest miners will reject the block and fork the chain. Since the
longest chain is the one that is considered valid, miners are incentivized not to cheat and to verify that others do
not cheat either, since their block reward may be lost unless malicious miners can supply the majority of new blocks
to the network.
As Ethereum's adoption has led to smart contracts managing millions of dollars in currency, the security of
these contracts has become highly sensitive. For instance, a variant of a well-documented reentrancy attack was
recently exploited in TheDAO [2], a contract that implements a decentralized autonomous venture capital fund, leading
to the theft of more than $50M worth of Ether, and raising the question of whether similar bugs could be found by
static analysis [6].
In this paper, we outline a framework to analyze and formally verify Ethereum smart contracts using F* [9], a
functional programming language aimed at program verification. Such contracts are generally written in Solidity [3],
2016/8/11
1. Given a Solidity program, we can use Solidity* to translate it to F* and verify, at the source level, functional correctness specifications such as contract invariants, as well
as safety with respect to runtime errors.
⟨statement⟩ ::=
  | ⟨type⟩ @identifier (= ⟨expression⟩)?    (* decl *)
  | if ( ⟨expression⟩ ) ⟨statement⟩ (else ⟨statement⟩)?
  | { (⟨statement⟩ ;)* }
  | return (⟨expression⟩)?
  | throw
  | ⟨expression⟩

⟨lhs expression⟩ ::=
  | @identifier
  | ⟨lhs expression⟩ [ ⟨lhs expression⟩ ]
  | ⟨lhs expression⟩ . @identifier

⟨binop⟩ ::= + | - | * | / | %
  | && | || | == | != | > | < | >= | <=
⟨unop⟩ ::= + | - | !

[Figure 1: architecture diagram. Labels: Solidity source code, verified translation, Solidity* (subset of F*); EVM compiled bytecode, verified decompilation, EVM* (subset of F*); equivalence proof; F* verifies functional correctness and runtime safety.]
Our smart contract verification framework takes a two-pronged approach (Figure 1) based on F*. F* comes with
a type system that includes dependent types and monadic effects, which we apply to generate automated queries that
statically verify properties of EVM bytecode and Solidity sources.
While it is clearly preferable to have both the Solidity source code and the EVM bytecode of a target smart contract,
we design our architecture under the assumption that the verifier may only have the bytecode. At the time of writing,
only 396 out of 112,802 contracts have their source code available on http://etherscan.io. Therefore, we provide
separate tools for decompiling EVM bytecode (EVM*) and for analyzing Solidity source code (Solidity*).
2. Translating Solidity to F*
6. to translate assignments, we keep an environment of local, state, and ambient global variable names: local variable declarations and assignments are translated to let
bindings; globals are replaced with library calls; state
properties are replaced with updates on the state type;
7. built-in method calls (e.g. address.send()) are replaced by library calls.
We show a minimalistic Solidity contract and its F* translation in Figure 3. The only type annotation added by the
translation is a custom Eth effect on the contract's methods,
which we describe in Section 2.2. The Solidity library defines the mapping type (a reference to a map) and the associated functions update_map and lookup. Furthermore,
it defines the numeric types used in Solidity, which are unsigned 256-bit by default.
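To make the role of this library concrete, the following is a minimal Python model of a mapping with lookup and update_map over unsigned 256-bit values. It is only a sketch: the names lookup and update_map come from the text, but their signatures, the helper uint256, and the use of a functional update on a dictionary (rather than a reference to a map) are assumptions made for illustration and do not reproduce the actual F* library.

```python
UINT256_MOD = 2 ** 256  # Solidity's default uint is 256 bits wide


def uint256(value: int) -> int:
    """Wrap an integer into the unsigned 256-bit range (hypothetical helper)."""
    return value % UINT256_MOD


def lookup(mapping: dict, key):
    """Read a key; keys that were never written read as zero."""
    return mapping.get(key, 0)


def update_map(mapping: dict, key, value: int) -> dict:
    """Return a new mapping in which key is bound to the wrapped value."""
    updated = dict(mapping)
    updated[key] = uint256(value)
    return updated


balances = {}
# Deposit 10, then subtract 11 without any bounds check.
balances = update_map(balances, "alice", lookup(balances, "alice") + 10)
balances = update_map(balances, "alice", lookup(balances, "alice") - 11)
```

In this model, subtracting more than the stored value silently wraps around instead of failing, which is the kind of runtime error that a verified translation aims to rule out.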
2.2
The example in Figure 3 captures two major pitfalls of Solidity programming. First, many contracts fail to account for the fact that
send and its variants are not guaranteed to succeed (send
returns a bool). This is highly surprising for Solidity programmers because all other runtime errors (such as running out of gas or call stack overflows) trigger an exception.
Such exceptions (including the ones triggered by throw) revert all transactions and all changes to the contract's properties. This is not the case for send: the programmer needs
to undo side effects manually when it returns false, e.g.
if(!addr.send(x)) throw.
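The difference between a failed send and a throw can be illustrated with a small Python model. This is a sketch under stated assumptions: run_transaction, failing_send, and the two withdraw functions are hypothetical names, and the revert is modeled by running each method on a copy of the state and discarding the copy when an exception is raised.

```python
import copy


class Throw(Exception):
    """Models Solidity's throw: aborts the call and reverts its effects."""


def run_transaction(state: dict, method) -> dict:
    """Run method on a copy of state; commit on success, revert on Throw."""
    working = copy.deepcopy(state)
    try:
        method(working)
    except Throw:
        return state  # every change made by the method is discarded
    return working


def failing_send(amount: int) -> bool:
    """Hypothetical send that always fails and only reports it via a bool."""
    return False


def withdraw_unchecked(state: dict) -> None:
    failing_send(state["balance"])  # return value ignored
    state["balance"] = 0            # balance cleared although nothing was sent


def withdraw_checked(state: dict) -> None:
    amount = state["balance"]
    state["balance"] = 0
    if not failing_send(amount):
        raise Throw()               # mirrors: if(!addr.send(x)) throw


initial = {"balance": 100}
after_unchecked = run_transaction(initial, withdraw_unchecked)
after_checked = run_transaction(initial, withdraw_checked)
```

Under this model, the unchecked variant commits a zero balance even though no funds moved, whereas the checked variant leaves the state exactly as it was before the call.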
The other problem illustrated in MyBank is reentrancy.
Since transactions are also method calls, calling send is a
transfer of program control. Consider the following malicious contract:
contract Malicious {
  uint balance;
  MyBank bank = MyBank(0xdeadbeef8badf00d...);
  function Malicious(){
    balance = msg.value;
    bank.Deposit.value(balance)();
    bank.Withdraw.value(0)(balance); // forwarding gas
  }
Translation to F*
It attacks the Withdraw method of MyBank by calling recursively into it at the point where it does its send. The if
condition in the second Withdraw call is still satisfied (because the balances are updated after send, and there is no
check that it was successful). Even though the send in the
second call to Withdraw is guaranteed to fail (because, unlike method calls, send allocates only 2300 gas for the call),
it still corrupts the balance by decreasing it twice, causing an
unsigned integer underflow. After corrupting the balance,
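The double decrement described here can be reproduced with a toy Python simulation. It is a sketch, not the contracts themselves: the bank's withdraw logic (check the balance, send, then subtract) is reconstructed from the description in the text, the attacker's fallback method is an assumption, and gas is reduced to a single rule, namely that a send issued with no more than the 2300-gas stipend fails.

```python
UINT256_MOD = 2 ** 256
SEND_STIPEND = 2300  # gas forwarded by send, as stated in the text


class MyBank:
    """Toy model of the vulnerable bank: send first, update the balance after."""

    def __init__(self):
        self.balances = {}

    def deposit(self, sender, value):
        current = self.balances.get(sender.name, 0)
        self.balances[sender.name] = (current + value) % UINT256_MOD

    def send(self, recipient, amount, gas):
        # Simplifying assumption: a send issued from a stipend-limited call fails.
        if gas <= SEND_STIPEND:
            return False
        recipient.ether += amount
        recipient.fallback(self, SEND_STIPEND)  # control passes to the recipient
        return True

    def withdraw(self, sender, amount, gas):
        if self.balances.get(sender.name, 0) >= amount:
            self.send(sender, amount, gas)  # result is never checked
            current = self.balances[sender.name]
            self.balances[sender.name] = (current - amount) % UINT256_MOD


class Malicious:
    """Hypothetical attacker whose fallback re-enters withdraw."""

    name = "malicious"

    def __init__(self, amount):
        self.ether = 0
        self.amount = amount

    def fallback(self, bank, gas):
        bank.withdraw(self, self.amount, gas)


bank = MyBank()
attacker = Malicious(10)
bank.deposit(attacker, 10)
bank.withdraw(attacker, 10, gas=100000)
```

In this run the inner call sets the balance to 0 and the outer call then subtracts 10 again, so the recorded balance wraps around to 2^256 - 10 while the attacker has received only 10 units.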
module MyBank
open Solidity

contract MyBank {
  mapping (address => uint) balances;
  function Deposit() {
    balances[msg.sender] += msg.value;
  }
Evaluation. Despite the limitations of our tool (in particular, it does not support many syntactic features of Solidity),
we are able to translate and typecheck 46 out of the 396
contracts we collected on https://etherscan.io. Out of
these, only a handful are valid in the Eth effect. This is a
clear sign that a large-scale analysis of published contracts is
likely to uncover widespread vulnerabilities; we leave such
analysis to future work.
3.
let myBank () =
  burn 6 (* opcodes: PUSH1 60, PUSH1 40 *);
  mstore [0x40uy] [0x60uy];
  ...
  let x_28 = eqw [0xF8uy; 0xF8uy; 0xA9uy; 0x12uy] x_3 in
  burn 10 (* opcode JUMPI *);
  if nonZero x_28 then
  begin (* offset: 165 *)
    // decompiled code of Balance method
  end
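The decompiled fragment above threads explicit burn calls through the code to account for gas. A minimal Python model of that bookkeeping is sketched below, assuming a simple meter with a fixed limit; GasMeter, OutOfGas, and dispatch_prefix are hypothetical names, and only the two burn amounts that appear in the fragment are modeled (the cost of mstore and of the elided instructions is left out).

```python
class OutOfGas(Exception):
    """Raised when the accumulated cost would exceed the gas limit."""


class GasMeter:
    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def burn(self, amount: int) -> None:
        """Charge amount units of gas, failing if the limit would be exceeded."""
        if self.used + amount > self.limit:
            raise OutOfGas(f"needed {self.used + amount}, limit {self.limit}")
        self.used += amount


def dispatch_prefix(meter: GasMeter) -> None:
    """Replay the two burn calls shown in the decompiled fragment."""
    meter.burn(6)   # PUSH1 60, PUSH1 40
    meter.burn(10)  # JUMPI


meter = GasMeter(limit=100)
dispatch_prefix(meter)
```

Making the cost of each instruction group explicit in this way is what allows a bound on total gas consumption to be stated and checked as a property of the program.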
4. Conclusion
Our preliminary experiments in using F* to verify smart contracts show that the type and effect system of F* is flexible
enough to express and prove non-trivial properties. In parallel, Luu et al. [6] used symbolic execution to detect flaws
in EVM bytecode programs, and an experimental Why3 [5]
formal verification backend is now available from the Solidity web IDE [4].
The examples we considered are simple enough that we
did not have to write a full implementation of EVM bytecode. We plan to complete a verified reference implementation and use it to verify that the output of the Solidity compiler is functionally equivalent to the sources.
We implemented EVM* and Solidity* in OCaml. It would
be interesting to implement and verify parts of these tools
using F* instead. For instance, we could prove that the stack
and control flow analysis done in EVM* is sound with respect to a stack machine semantics.
References
[2] V. Buterin. Critical update re: DAO vulnerability.
https://blog.ethereum.org/2016/06/17/critical-update-re-dao-vulnerability, 2016.
[3] Ethereum. Solidity documentation, Release 0.2.0.
http://solidity.readthedocs.io/, 2016.
[9] N. Swamy, C. Hritcu, C. Keller, A. Rastogi, A. Delignat-Lavaud, S. Forest, K. Bhargavan, C. Fournet, P.-Y. Strub,
M. Kohlweiss, J.-K. Zinzindohoue, and S. Zanella-Beguelin.
Dependent types and multi-monadic effects in F*. In 43rd
Annual ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, POPL '16, pages 256–270. ACM,
2016.