4 Smart Contracts and Ethereum 101


Smart Contracts and Ethereum 101:
Smart Contracts: Definition, Ricardian contracts.
Ethereum 101: Introduction, Ethereum blockchain, Elements of the Ethereum blockchain, Precompiled contracts.

Smart Contracts :
 Smart contracts were first theorized by Nick Szabo in the late 1990s

Smart contracts are described by Szabo as follows:

"A smart contract is a computerized transaction protocol that executes the terms of a contract.
The general objectives are to satisfy common contractual conditions (such as payment terms,
liens, confidentiality, and even enforcement), minimize exceptions both malicious and accidental,
and minimize the need for trusted intermediaries. Related economic goals include lowering fraud
loss, arbitrations and enforcement costs, and other transaction costs."

Definition: A smart contract is a secure and unstoppable computer program representing an
agreement that is automatically executable and enforceable.

 A smart contract is in fact a computer program that is written in a language that a computer
or target machine can understand. It also encompasses agreements between parties in
the form of business logic.
 Another key idea is that smart contracts are automatically executed when certain
conditions are met.
 They are enforceable, which means that all contractual terms are executed as defined and
expected, even in the presence of adversaries.
 Enforcement is a broader term that encompasses traditional enforcement in the form
of law, along with implementation of certain measures and controls that make it
possible to execute contract terms without requiring any mediation.
 Smart contracts should not rely on traditional methods of enforcement. Instead, they
should work on the principle that code is law, meaning that there is no need for an
arbitrator or a third party to control or influence the execution of the smart contract.
Smart contracts are self-enforcing as opposed to legally enforceable.
 Smart contracts are secure and unstoppable, which means that these computer
programmes must be designed in such a fashion that they are fault tolerant and
executable in a reasonable amount of time. These programmes should be able to execute
and maintain a healthy internal state, even if external factors are unfavorable.

 For example, imagine a normal computer programme which is encoded with some
logic and executes according to the instruction coded within it, but if the environment
it is running in or external factors it relies on deviate from the normal or expected
state, the programme may react arbitrarily or simply abort. It is important that smart
contracts are immune to this type of issue.
 Secure and unstoppable may well be considered requirements or desirable features,
but greater benefits will follow in the long run if the security and unstoppability
properties are included in the smart contract definition from the beginning. There is
also a suggestion by some researchers that smart contracts need not be automatically
executable; instead, they can be what is called automatable, due to the manual human
input required in certain scenarios.
 Whilst it's true that in some cases human input and control is desirable, it is not
absolutely necessary; and, for a contract to be truly smart, it has to be fully
automated. Certain inputs that need to be provided by people can and should also be
automated via the use of Oracles.
 Smart contracts usually operate by managing their internal state using a state machine model.
This allows development of an effective framework for programming smart contracts, where
the state of a contract is advanced further based on some predefined criteria and conditions.
 There is also on-going debate on the question of whether code is acceptable as a contract in a
court of law. This raises several questions around how a smart contract can be legally binding:
 Can it be developed in such a way that it is readily acceptable and understandable in a
court of law?
 How can dispute resolution be implemented within the code, and is it possible?
 Moreover, regulatory and compliance requirements are another topic that needs to be
addressed before smart contracts can be used as effectively as traditional legal
documents.

The preceding questions open up various possibilities, such as making smart contract code readable
not only by machines but also by people. If humans and machines can both understand the code
written in a smart contract it might be more acceptable in legal situations, as opposed to just a piece
of code that no-one understands except for programmers.

 Smart contracts are inherently required to be deterministic in nature.


This property will allow a smart contract to be run by any node on a network and achieve the
same result. If the result differs even slightly between nodes, consensus then cannot be
achieved and a whole paradigm of distributed consensus on blockchain can fail. Moreover, it is
also desirable that the contract language itself is deterministic thus ensuring the integrity and
stability of the smart contracts. Deterministic in the sense that there are no non-deterministic
functions used in the language which can produce varied results on different nodes.
 For example, various floating point operations calculated by various functions in a variety
of programming languages can produce different results in different runtime
environments.
 Another example is of some math functions in JavaScript which can produce different
results for the same input on different browsers, and which can in turn lead to various
bugs. This is highly undesirable in smart contracts because, if results are inconsistent
between nodes, then consensus will never be achieved.

A deterministic feature ensures that smart contracts always produce the same output for a
specific input. In other words, programs once compiled produce a solid and accurate business logic
that is completely in line with the requirements programmed in the high level code.
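The floating point concern can be illustrated with a minimal Python sketch. It provokes the discrepancy by changing the order of evaluation on one machine rather than by switching runtimes, which is an assumption made to keep the example reproducible; the function names and amounts are hypothetical.

```python
def sum_floats(values):
    """Add IEEE-754 floats left to right; rounding makes the result order-dependent."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_integers(values):
    """Add integer amounts (for example, Wei); integer addition is exact in any order."""
    total = 0
    for v in values:
        total += v
    return total


float_amounts = [0.1, 0.2, 0.3]
forward = sum_floats(float_amounts)
backward = sum_floats(list(reversed(float_amounts)))

# The same amounts expressed as whole units of a smallest denomination.
integer_amounts = [10**17, 2 * 10**17, 3 * 10**17]
forward_int = sum_integers(integer_amounts)
backward_int = sum_integers(list(reversed(integer_amounts)))

print(forward == backward)          # False: two "nodes" would disagree
print(forward_int == backward_int)  # True: every node computes the same value
```

This is one reason why contract platforms tend to work with integers in the smallest currency unit rather than with floating point values.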

In summary, a smart contract has the following four properties:


 Automatically executable
 Enforceable
 Semantically sound
 Secure and unstoppable

The first two properties are required as a minimum, whereas the latter two may not be required or
implementable in certain scenarios and can be relaxed.

For example, a derivatives contract perhaps does not need to be semantically sound and
unstoppable but should at least be automatically executable and enforceable at a basic level.

On the other hand, a title deed needs to be semantically sound and complete; therefore, in order for
it to be implemented as a smart contract, the language must be understood by both computers and
people. This issue of interpretation was addressed by Ian Grigg with his invention of Ricardian
contracts.

 Deploying smart contracts on a blockchain : Smart contracts may or may not be deployed
on a blockchain but it makes sense to deploy them on a blockchain due to the distributed
consensus mechanism provided by blockchain.
 Ethereum is an example of a blockchain that natively supports the development and
deployment of smart contracts. Smart contracts on the Ethereum blockchain are usually
part of a larger application such as a Decentralized Autonomous Organization (DAO).
 As a comparison, in the bitcoin blockchain the lock_time field in the bitcoin transaction can
be seen as an enabler of a basic version of a smart contract. The lock_time field enables
a transaction to be locked until a specified time or after a number of blocks, thus
enforcing a basic contract that a certain transaction can only be unlocked if certain
conditions (elapsed time or number of blocks) are met.

However, this is very limited in nature and should be only viewed as an example of a
basic smart contract. In addition to the above mentioned example, bitcoin scripting
language, though limited, can be used to construct basic smart contracts.

 Programming languages: Code for smart contracts in Ethereum is written in languages such as:
 Serpent: A Python-like high-level language (HLL) that can be used to write smart contracts for Ethereum.
 LLL: Low-level Lisp-like Language, used to write smart contracts.
 Solidity: An HLL developed for Ethereum, with JavaScript-like syntax, used to write code for smart
contracts.
 Vyper (originally named Viper): A new language developed from scratch to be secure, simple and
auditable.
Serpent and LLL are no longer supported. The most commonly used language is Solidity.

Ricardian contracts
 Ricardian contracts were originally proposed in the Financial Cryptography in 7 Layers
paper by Ian Grigg in the late 1990s.
 These contracts were used initially in a bond trading and payment system called Ricardo.
 The key idea is to write a document which is understandable and acceptable by both a
court of law and computer software.
 Ricardian contracts address the challenge of issuance of value over the Internet.

 It identifies the issuer and captures all the terms and clauses of the contract in a document
in order to make it acceptable as a legally binding contract.
 A Ricardian contract is a document that has several of the following properties:
 A contract offered by an issuer to holders
 A valuable right held by holders, and managed by the issuer
 Easily readable by people (like a contract on paper)
 Readable by programs (parseable, like a database)
 Digitally signed
 Carries the keys and server information
 Allied with a unique and secure identifier
In practice, the contracts are implemented by producing a single document that contains the terms
of the contract in legal prose and the required machine-readable tags. This document is digitally
signed by the issuer using their private key. This document is then hashed using a message digest
function to produce a hash by which the document can be identified. This hash is then further used
and signed by parties during the performance of the contract in order to link each transaction, with
the identifier hash thus serving as evidence of intent.
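The hashing and linking steps described above can be sketched in Python. This is only an illustration under stated assumptions: SHA-256 is chosen as the message digest function, the digital signature step is omitted, and the document text, function names and transaction layout are all hypothetical.

```python
import hashlib


def contract_id(document: str) -> str:
    """Hash the full contract document; the digest serves as its identifier."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def make_transaction(contract_hash: str, payer: str, payee: str, amount: int) -> dict:
    """Build a transaction record that carries the contract identifier."""
    return {"contract": contract_hash, "from": payer, "to": payee, "amount": amount}


def refers_to(transaction: dict, document: str) -> bool:
    """Check that a transaction is linked to exactly this version of the document."""
    return transaction["contract"] == contract_id(document)


# Hypothetical contract: legal prose plus machine-readable tags.
document = "The issuer promises to pay the holder on demand.\n[issuer=Alice]\n[unit=gold-gram]\n"
identifier = contract_id(document)
payment = make_transaction(identifier, "Alice", "Bob", 5)

print(refers_to(payment, document))                # True
print(refers_to(payment, document + "[fee=1]\n"))  # False: any change alters the hash
```

Because any change to the prose or the tags produces a different digest, a transaction that carries the identifier can only refer to one exact version of the contract.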

This is depicted in the diagram, usually called a bowtie model. The diagram shows a number of
elements:

 The World of Law on the left hand side from where the document originates. This
document is a written contract in legal prose with some machine-readable tags.
 It is then hashed
 The resultant message digest is used as an identifier throughout the World of Accountancy
shown on right hand side of the diagram.

The World of Accountancy element represents any accounting, trading and information systems
that are being used in a business to perform various business operations. The idea behind this flow is
that the message digest generated by hashing the document is first used in a so called genesis
transaction, or first transaction, and then used in every transaction as an identifier throughout the
operational execution of the contract. This way, a secure link is created between the original written
contract and every transaction in the World of Accountancy.

Ricardian contracts, bowtie diagram

A Ricardian contract is different from a smart contract in the sense that a smart contract does not
include any contractual document and is focused purely on the execution of the contract.

A Ricardian contract, on the other hand, is more concerned with the semantic richness and
production of a document that contains contractual legal prose.

The semantics of a contract can be divided into two types:

 Operational semantics
 Denotational semantics

The first type defines the actual execution, correctness and safety of the contract, and the latter is
concerned with the real-world meaning of the full contract.

Some researchers have differentiated between smart contract code and smart legal contracts,
where smart contract code is only concerned with the execution of the contract and a smart legal
contract encompasses both the denotational and operational semantics of a legal agreement. It
perhaps makes sense to categorize smart contracts based on the difference between semantics, but it
is better to consider smart contracts as a standalone entity that is capable of encoding legal prose
and code (business logic) in it.

In bitcoin, a very simple implementation of a smart contract can be observed which is fully oriented
towards the execution of the contract, whereas a Ricardian contract is more geared towards
producing a document that is understandable by humans, with some parts that a computer program
can understand.

This can be viewed as legal semantics vs operational performance (semantics vs performance) as
shown in the following diagram.

A smart contract is meant to have both of these elements (performance and semantics)
embedded together, which completes an ideal model of a smart contract.

A Ricardian contract can be represented as a tuple of three objects, namely:

 Prose: It represents the legal contract in regular language.
 Parameters: Parameters join the appropriate parts of the legal contract to the equivalent
code.
 Code: It represents the program that is a computer-understandable representation of legal
prose.

Ricardian contracts have been implemented in many systems, such as CommonAccord,
OpenBazaar, OpenAssets, and Askemos.

Ethereum 101

Introduction
 Ethereum was conceptualized by Vitalik Buterin in November 2013.
 The key idea proposed was the development of a Turing-complete language that allows the
development of arbitrary programs (smart contracts) for blockchain and decentralized
applications. This is in contrast to bitcoin, where the scripting language is very limited and
allows basic and necessary operations only.

The Ethereum stack

The Ethereum stack consists of various components.

 At the core, there is the Ethereum blockchain running on the P2P Ethereum network.
 Secondly, there's an Ethereum client (usually geth) that runs on the nodes and connects to
the peer-to-peer Ethereum network from where blockchain is downloaded and stored
locally. It provides various functions, such as mining and account management. The local
copy of the blockchain is synchronized regularly with the network.
 Another component is the web3.js library that allows interaction with geth via the Remote
Procedure Call (RPC) interface.

This can be visualized in the following diagram:

Ethereum blockchain

Ethereum, just like any other blockchain, can be visualized as a transaction-based state machine.
The idea is that a genesis state is transformed into a final state by executing transactions
incrementally. The final transformation is then accepted as the absolute undisputed version of the
state.

In the following diagram, the Ethereum state transition function is shown, where a transaction
execution has resulted in a state transition.

In the preceding example, a transfer of 2 Ether from Address 4718bf7a to Address 741f7a2 is
initiated. The initial state represents the state before the transaction execution and the final state is
what the morphed state looks like.
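The state transition idea can be sketched as a pure function that takes a state and a transaction and returns a new state. The two addresses come from the example above; the starting balances and the function name are hypothetical, and real Ethereum state also tracks nonces, gas and contract storage.

```python
def apply_transfer(state: dict, sender: str, recipient: str, amount: int) -> dict:
    """Return the state that results from moving `amount` from sender to recipient."""
    if state.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # the previous state is left untouched
    new_state[sender] = new_state[sender] - amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


# Hypothetical balances in Ether before the transaction executes.
initial_state = {"4718bf7a": 10, "741f7a2": 5}
final_state = apply_transfer(initial_state, "4718bf7a", "741f7a2", 2)

print(initial_state)  # {'4718bf7a': 10, '741f7a2': 5}
print(final_state)    # {'4718bf7a': 8, '741f7a2': 7}
```

Applying every transaction of every block in order, starting from the genesis state, yields the current state.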

Currency (ETH and ETC)

As an incentive to the miners, Ethereum rewards them in its native currency, called Ether and
abbreviated as ETH. After the DAO hack, a hard fork was proposed in order to mitigate the issue;
therefore, there are now two Ethereum blockchains: one is called Ethereum Classic and its currency
is represented by ETC, whereas the hard-forked version is ETH, which continues to grow and on
which active development is being carried out. ETC, however, has its own following with a dedicated
community that is further developing ETC, which is the nonforked original version of Ethereum.

Forks

The Homestead release introduced major protocol upgrades, which resulted in a hard fork.

The protocol was upgraded at block number 1,150,000, resulting in the migration from the first
version of Ethereum, known as Frontier, to the second version of Ethereum, called Homestead.

A recent unintentional fork that occurred on November 24, 2016, at 14:12:07 UTC was due to a bug
in the geth client's journaling mechanism. Network fork occurred at block number 2,686,351. This
bug resulted in geth failing to revert empty account deletions in the case of the empty out-of-gas
exception. This was not an issue in parity (another popular Ethereum client).

This means that from block number 2,686,351, the Ethereum blockchain was split into two, one
running with parity clients and the other with geth. This issue was resolved with the release of geth
version 1.5.3.

GAS

 Another key concept in Ethereum is that of gas.


 All transactions on the Ethereum blockchain are required to cover the cost of computation
they are performing. The cost is covered by something called gas or crypto fuel, which is a
new concept introduced by Ethereum.
 This gas as execution fee is paid upfront by the transaction originators. The fuel is consumed
with each operation. Each operation has a predefined amount of gas associated with it. Each
transaction specifies the amount of gas it is willing to consume for its execution.
 If it runs out of gas before the execution is completed, any operation performed by the
transaction up to that point is rolled back. If the transaction is successfully executed, then
any remaining gas is refunded to the transaction originator.
 Gas should not be confused with the mining fee. Gas measures the computation performed,
whereas the mining fee is the amount of Ether, paid for that gas, that goes to the miners.
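The gas rules listed above can be sketched as a small metering loop. The operation names and their costs are illustrative values, not a table from the Ethereum specification, and the behaviour that all gas is consumed when execution runs out of gas is stated here as an assumption of the sketch.

```python
# Illustrative per-operation gas costs (hypothetical values).
GAS_COST = {"add": 3, "store": 20000}


def execute(state: dict, operations: list, gas_limit: int, gas_price: int):
    """Run operations on a copy of state.

    Returns (resulting_state, gas_used, refund_in_wei).
    """
    working = dict(state)
    gas_left = gas_limit
    for name, key, value in operations:
        cost = GAS_COST[name]
        if cost > gas_left:
            # Out of gas: discard every change made so far (roll back) and
            # treat the whole gas limit as consumed, so nothing is refunded.
            return state, gas_limit, 0
        gas_left -= cost
        if name == "store":
            working[key] = value
        else:  # "add"
            working[key] = working.get(key, 0) + value
    # Success: unused gas is refunded to the originator at the gas price.
    return working, gas_limit - gas_left, gas_left * gas_price


operations = [("store", "x", 5), ("add", "x", 2)]
ok_state, ok_used, ok_refund = execute({"x": 0}, operations, 30000, 2)
failed_state, failed_used, failed_refund = execute({"x": 0}, operations, 20001, 2)

print(ok_state, ok_used, ok_refund)              # {'x': 7} 20003 19994
print(failed_state, failed_used, failed_refund)  # {'x': 0} 20001 0
```

In the second run the store operation succeeds but the add operation cannot be paid for, so the earlier store is rolled back as well.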

The consensus mechanism

The consensus mechanism in Ethereum is based on the GHOST protocol originally proposed by Zohar
and Sompolinsky in December 2013.

Ethereum uses a simpler version of this protocol, where the chain that has the most computational
effort spent on it in order to build it is identified as the definitive version. Another way of looking
at it is to find the longest chain, as the longest chain must have been built by consuming adequate
mining effort. Greedy Heaviest Observed Subtree (GHOST) was first introduced as a mechanism to
alleviate the issues arising out of fast block generation times that led to stale or orphan blocks.

In GHOST, stale blocks are added in calculations to figure out the longest and heaviest chain of
blocks. Stale blocks are called Uncles or Ommers in Ethereum.
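The difference between the longest chain and the heaviest chain can be sketched on a small block tree. This follows the general GHOST idea of counting every block in a subtree (including stale ones), not Ethereum's simplified variant; the block names and the tree shape are hypothetical.

```python
def children_of(parents: dict) -> dict:
    """Invert a child -> parent map into a parent -> [children] map."""
    children = {}
    for block, parent in parents.items():
        children.setdefault(parent, []).append(block)
    return children


def subtree_size(block: str, children: dict) -> int:
    """Count the block itself plus every block built on top of it."""
    return 1 + sum(subtree_size(c, children) for c in children.get(block, []))


def longest_chain(block: str, children: dict) -> list:
    """Follow the branch with the greatest number of blocks in a single line."""
    best = [block]
    for child in children.get(block, []):
        candidate = [block] + longest_chain(child, children)
        if len(candidate) > len(best):
            best = candidate
    return best


def ghost_chain(block: str, children: dict) -> list:
    """At every fork, step into the child with the heaviest subtree."""
    chain = [block]
    while children.get(chain[-1]):
        chain.append(max(children[chain[-1]], key=lambda c: subtree_size(c, children)))
    return chain


# Hypothetical tree: branch A is a single line of four blocks; branch B is
# shorter but has more blocks in total because of stale siblings B2 and B3.
parents = {
    "A": "G", "A1": "A", "A2": "A1", "A3": "A2",
    "B": "G", "B1": "B", "B2": "B", "B3": "B", "B1a": "B1",
}
tree = children_of(parents)

print(longest_chain("G", tree))  # ['G', 'A', 'A1', 'A2', 'A3']
print(ghost_chain("G", tree))    # ['G', 'B', 'B1', 'B1a']
```

Branch B wins under the heaviest-subtree rule because its stale blocks still represent mining effort spent on top of B.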

The following diagram shows a quick comparison between the longest and heaviest chain

The world state

 The world state in Ethereum represents the global state of the Ethereum blockchain.
 It is basically a mapping between Ethereum addresses and account states.
 The addresses are 20 bytes long. This mapping is a data structure that is serialized using
Recursive Length Prefix (RLP). RLP is a specially developed encoding scheme that is used in
Ethereum to serialize binary data for storage or transmission over the network and also to
save the state in a Patricia tree.
 The RLP function takes an item as an input, which can be a string or a list of items, and
produces raw bytes that are suitable for storage and transmission over the network. RLP
does not define how individual data types are encoded; instead, its main purpose is to
encode structure (nested sequences of byte strings).
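A minimal sketch of an RLP encoder in Python follows. It handles the two kinds of item named above, byte strings and lists of items, using the length-prefix rules of the RLP scheme; the function names are hypothetical and decoding is not covered.

```python
def encode_length(length: int, offset: int) -> bytes:
    """Build the prefix for a payload of the given length.

    offset is 0x80 for byte strings and 0xC0 for lists.
    """
    if length < 56:
        return bytes([offset + length])
    # Long payloads: the prefix holds the size of the length field,
    # followed by the length itself in big-endian form.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode a byte string or an arbitrarily nested list of byte strings."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    if isinstance(item, list):
        payload = b"".join(rlp_encode(element) for element in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError("RLP items are byte strings or lists of items")


print(rlp_encode(b"dog").hex())             # 83646f67
print(rlp_encode([b"cat", b"dog"]).hex())   # c88363617483646f67
print(rlp_encode(b"").hex())                # 80
print(rlp_encode([]).hex())                 # c0
```

Integers, addresses and other typed values have to be converted to byte strings by the caller before they are passed to the encoder.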

The account state

The account state consists of four fields: nonce, balance, storageRoot and codeHash, which are
described in detail here.

Nonce: This is a value that is incremented every time a transaction is sent from the address. In the
case of contract accounts, it represents the number of contracts created by the account. Contract
accounts are one of the two types of accounts that exist in Ethereum.

Balance: This value represents the number of Wei held by the address. Wei is the smallest unit of
the currency (Ether) in Ethereum.

Storage root: This field represents the root node of a Merkle Patricia tree that encodes the storage
contents of the account.

Code hash: This is an immutable field that contains the hash of the smart contract code that is
associated with the account. In the case of normal accounts, this field contains the Keccak 256-bit
hash of an empty string. This code is invoked via a message call.

The world state and its relationship with accounts trie, accounts, and block header can be visualized
in the following diagram.

It shows the account data structure in the middle of the diagram, which contains a storage root
hash derived from the root node of the account storage trie shown on the left. The account data
structure is then used in the world state trie, which is a mapping between addresses and account
states.

Finally, the root node of the world state trie is hashed using the Keccak 256-bit algorithm and made
part of the block header data structure, which is shown on the right-hand side of the diagram as
state root hash

Transactions

A transaction in Ethereum is a data packet, digitally signed using a private key, that contains
instructions which, when completed, result in either a message call or a contract creation.

Transactions can be divided into two types based on the output they produce:

 Message call transactions: This transaction simply produces a message call that is used to
pass messages from one account to another.
 Contract creation transactions: As the name suggests, these transactions result in the
creation of a new contract. This means that when this transaction is executed successfully, it
creates an account with the associated code.

Both of these transactions are composed of a number of common fields, which are described
here.

Nonce: Nonce is a number that is incremented by one every time a transaction is sent by the
sender. It must be equal to the number of transactions sent and is used as a unique identifier for the
transaction. A nonce value can only be used once.

gasPrice: The gasPrice field represents the amount of Wei that the sender pays for each unit of gas
used to execute the transaction.

gasLimit: The gasLimit field contains the value that represents the maximum amount of gas that can
be consumed in order to execute the transaction. For now, it is sufficient to say that gasLimit
multiplied by gasPrice is the maximum fee in Ether that a user (for example, the sender of the
transaction) is willing to pay for computation.
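The fee arithmetic for the gasPrice and gasLimit fields can be sketched as follows. The gas price, gas limit and gas used are illustrative numbers chosen for this example; only the relationship that 1 Ether equals 10^18 Wei is a fixed fact.

```python
from decimal import Decimal

WEI_PER_ETHER = 10**18


def wei_to_ether(wei: int) -> Decimal:
    """Convert an integer amount of Wei to Ether without floating point error."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


# Illustrative transaction parameters (hypothetical values).
gas_price = 20 * 10**9   # Wei paid per unit of gas
gas_limit = 50_000       # most gas the sender allows the transaction to use
gas_used = 21_000        # gas actually consumed during execution

max_fee = gas_limit * gas_price     # reserved upfront
actual_fee = gas_used * gas_price   # charged for the work done
refund = max_fee - actual_fee       # returned to the sender

print(wei_to_ether(max_fee))     # 0.001
print(wei_to_ether(actual_fee))  # 0.00042
print(wei_to_ether(refund))      # 0.00058
```

All amounts are kept as integers in Wei and only converted to Ether for display, which avoids rounding differences.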

To: As the name suggests, the to field is a value that represents the address of the recipient of the
transaction.

Value : Value represents the total number of Wei to be transferred to the recipient; in the case of a
contract account, this represents the balance that the contract will hold.

Signature: Signature is composed of three fields, namely v, r, and s. These values represent the
digital signature (R, S) and some information (V) that can be used to recover the public key of the
transaction signer, from which the sender of the transaction can also be determined.

The signature is based on the ECDSA scheme and makes use of the secp256k1 curve. In this section,
ECDSA will be presented in the context of its usage in Ethereum. V is a single byte value that depicts
the size and sign of the elliptic curve point and can be either 27 or 28. V is used in the ECDSA
recovery contract as a recovery ID. This value is used to recover (derive) the public key from the
signature. In secp256k1, the recovery ID is expected to be either 0 or 1. In Ethereum, this is offset
by 27.

R is derived from a calculated point on the curve. First, a random number is picked, which is
multiplied with the generator of the curve to calculate a point on the curve. The x coordinate part of
this point is R. R is encoded as a 32 byte sequence. R must be greater than 0 and less than the
secp256k1n limit
(115792089237316195423570985008687907852837564279074904382605163141518161494337).

S is calculated by multiplying R with the private key, adding the result to the hash of the message
to be signed, and finally dividing by the random number chosen to calculate R. S is also a 32 byte
sequence. R and S together represent the signature.

In order to sign a transaction, the ECDSASIGN function is used, which takes the message to be
signed and the private key as an input and produces V, a single byte value; R, a 32 byte value, and S,
another 32 byte value.
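The calculation of V, R and S described above can be sketched in pure Python (3.8 or later, for the modular inverse via pow). This is a teaching sketch under loud assumptions: SHA-256 stands in for Keccak-256, which is not in the standard library; the private key and the per-signature number k are fixed hypothetical values, which would be insecure in practice; and the function names are illustrative rather than any client's API.

```python
import hashlib

# Public domain parameters of the secp256k1 curve.
P = 2**256 - 2**32 - 977
N = 115792089237316195423570985008687907852837564279074904382605163141518161494337
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def point_mul(k, point):
    """Multiply a point by an integer using double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def ecdsa_sign(message_hash, private_key, k):
    """Return (V, R, S) as described in the text above."""
    rx, ry = point_mul(k, G)
    r = rx % N
    s = pow(k, -1, N) * (message_hash + r * private_key) % N
    return 27 + (ry & 1), r, s  # recovery ID (0 or 1) offset by 27


def ecdsa_verify(message_hash, r, s, public_key):
    """Check a signature against the signer's public key."""
    w = pow(s, -1, N)
    point = point_add(point_mul(message_hash * w % N, G),
                      point_mul(r * w % N, public_key))
    return point is not None and point[0] % N == r


message_hash = int.from_bytes(hashlib.sha256(b"transfer 2 Ether").digest(), "big")
private_key = 0xC0FFEE   # hypothetical key, for illustration only
k = 0x1234567            # hypothetical fixed number; must be random in practice
public_key = point_mul(private_key, G)
v, r, s = ecdsa_sign(message_hash, private_key, k)
print(v in (27, 28), ecdsa_verify(message_hash, r, s, public_key))
```

Verification is included only so that the sketch can check itself; recovering the public key from V, R and S, as the recovery contract does, is not shown.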

The equation is as follows: ECDSASIGN (Message, Private Key) = (V, R, S)

Init: The Init field is used only in transactions that are intended to create contracts.

This represents a byte array of unlimited length that specifies the EVM code to be used in the
account initialization process. The code contained in this field is executed only once, when the
account is created for the first time, and gets destroyed immediately after that. Init also returns
another code section called body, which persists and runs in response to message calls that the
contract account may receive.

These message calls may be sent via a transaction or an internal code execution.

Data: If the transaction is a message call, then the data field is used instead of init, which represents
the input data of the message call. It is also unlimited in size and is organized as a byte array.

This can be visualized in the following diagram, where a transaction is a tuple of the fields mentioned
earlier, which is then included in a transaction trie (a modified Merkle-Patricia tree) composed of the
transactions to be included.

Finally, the root node of transaction trie is hashed using a Keccak 256-bit algorithm and is included
in the block header along with a list of transactions in the block.

Transactions can be found in either transaction pools or blocks.

When a mining node starts its operation of verifying blocks, it starts with the highest paying
transactions in the transaction pool and executes them one by one. When the gas limit is reached or
no more transactions are left to be processed in the transaction pool, the mining starts.
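The selection step can be sketched as a greedy loop over the transaction pool. The field names, the block gas limit and the pool contents are hypothetical, and each transaction's own gas limit is used as a stand-in for the gas it would actually consume, which is a simplification.

```python
def select_transactions(pool: list, block_gas_limit: int):
    """Pick the highest paying transactions that fit into the block gas limit."""
    selected = []
    gas_used = 0
    for tx in sorted(pool, key=lambda t: t["gas_price"], reverse=True):
        if gas_used + tx["gas_limit"] > block_gas_limit:
            continue  # does not fit; a cheaper, smaller transaction may still fit
        selected.append(tx)
        gas_used += tx["gas_limit"]
    return selected, gas_used


# Hypothetical transaction pool.
pool = [
    {"id": "a", "gas_price": 5, "gas_limit": 30000},
    {"id": "b", "gas_price": 9, "gas_limit": 50000},
    {"id": "c", "gas_price": 7, "gas_limit": 40000},
    {"id": "d", "gas_price": 1, "gas_limit": 10000},
]
chosen, total_gas = select_transactions(pool, 100000)

print([tx["id"] for tx in chosen], total_gas)  # ['b', 'c', 'd'] 100000
```

Skipping a transaction that does not fit, instead of stopping at it, is a design choice of this sketch; a miner is free to apply a different policy.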

In this process, the block is repeatedly hashed until a valid nonce is found that, once hashed with
the block, results in a value less than the difficulty target. Once the block is successfully mined, it will
be broadcast immediately to the network, claiming success, and will be verified and accepted by
the network. This process is similar to Bitcoin's mining process. The only difference is that
Ethereum's Proof of Work algorithm, known as Ethash, is ASIC-resistant: finding a nonce
requires a large amount of memory.
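The nonce search can be sketched as a simple loop. SHA-256 is used here purely as a stand-in, since Ethash and its memory-hard dataset are far more involved; the header bytes, the target and the function names are hypothetical.

```python
import hashlib


def block_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with a candidate nonce and return it as an integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, target: int, max_nonce: int = 1_000_000):
    """Try nonces in order until the hash falls below the target."""
    for nonce in range(max_nonce):
        if block_hash(header, nonce) < target:
            return nonce
    return None  # no valid nonce found within the search bound


def is_valid(header: bytes, nonce: int, target: int) -> bool:
    """Verification needs a single hash, however long the search took."""
    return block_hash(header, nonce) < target


header = b"hypothetical block header"
target = 2**256 // 1000  # roughly one hash in a thousand qualifies
nonce = mine(header, target)
print(nonce is not None and is_valid(header, nonce, target))  # True
```

A lower target makes the search longer while leaving verification just as cheap, which is what allows other nodes to check a mined block quickly.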

Contract creation transaction

There are a few essential parameters that are required when creating an account. These parameters
are listed as follows:

 Sender
 Original transactor
 Available gas
 Gas price
 Endowment, which is the amount of ether allocated initially
 Initialization EVM code, which is a byte array of arbitrary length
 Current depth of the message call/contract-creation stack (current depth means the number
of items that are already there in the stack)

Addresses generated as a result of a contract creation transaction are 160 bits in length. They are
the rightmost 160 bits of the Keccak hash of the RLP encoding of the structure containing only the
sender and the nonce.

Initially, the nonce in the account is set to zero. The balance of the account is set to the value
passed to the contract. Storage is also set to empty. Code hash is Keccak 256-bit hash of the empty
string. The account is initialized when the EVM code (Initialization EVM code) is executed.

In the case of any exception during code execution, such as not having enough gas, the state does
not change. If the execution is successful, then the account is created after the payment of
appropriate gas costs.

The current version of Ethereum (Homestead) specifies that the result of a contract creation
transaction is either a new contract with its balance, or no new contract and no transfer of value.
This is in contrast to previous versions, where the contract could be created regardless of whether
the contract code deployment succeeded or failed due to an out-of-gas exception.

Message call transaction

A message call requires several parameters for execution, which are listed as follows:

 Sender
 The transaction originator
 Recipient
 The account whose code is to be executed
 Available gas
 Value
 Gas price
 Input data of the call, which is an arbitrary length byte array
 Current depth of the message call/contract creation stack

Message calls result in state transition. Message calls also produce output data, which is not used if
transactions are executed. In cases where message calls are triggered by VM code, the output
produced by the transaction execution is used.

In the following diagram, the segregation between two types of transaction is shown

Elements of the Ethereum blockchain
Ethereum virtual machine (EVM)
 EVM is a simple stack-based execution machine that runs bytecode instructions in order to
transform the system state from one state to another.
 The word size of the virtual machine is set to 256-bit.
 The stack size is limited to 1024 elements, and the stack operates on the LIFO (Last In, First Out) principle.
 EVM is a Turing-complete machine but is limited by the amount of gas that is required to run
any instruction. This means that infinite loops that can result in denial of service attacks are
not possible due to gas requirements.
 EVM also supports exception handling in case exceptions occur, such as not having enough
gas or invalid instructions, in which case the machine would immediately halt and return the
error to the executing agent.
 EVM is a fully isolated and sandboxed runtime environment. The code that runs on the EVM
does not have access to any external resources, such as a network or filesystem.
 EVM is big-endian by design and it uses 256-bit wide words. This word size allows for
Keccak 256-bit hash and elliptic curve cryptography computations. There are two types of
storage available to contracts and EVM.
 The first one is called memory, which is a byte array. When a contract finishes the code
execution, the memory is cleared. It is akin to the concept of RAM. Memory is unlimited but
constrained by gas fee requirements.
 The other type, called storage, is permanently stored on the blockchain. It is a key value
store. The storage associated with the virtual machine is a word-addressable word array that
is nonvolatile and is maintained as part of the system state. Keys and values are both 32 bytes
in size.
 The program code is stored in a virtual readonly memory (virtual ROM) that is accessible
using the CODECOPY instruction. The CODECOPY instruction is used to copy the program
code into the main memory. Initially, all storage and memory is set to zero in the EVM.
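The difference between memory and storage can be sketched with a small Python model. The class and method names are hypothetical (they only echo the idea of load and store operations), gas costs are left out, and a Python dictionary stands in for the storage trie.

```python
WORD = 2**256  # values and storage keys are 256-bit (32 byte) words


class ContractContext:
    """Volatile memory plus persistent storage for one call into a contract."""

    def __init__(self, storage: dict):
        self.storage = storage      # survives between calls
        self.memory = bytearray()   # starts empty (all zero) for every call

    def mstore(self, offset: int, value: int) -> None:
        """Write a 32 byte big-endian word, growing memory with zeros if needed."""
        end = offset + 32
        if end > len(self.memory):
            self.memory.extend(bytes(end - len(self.memory)))
        self.memory[offset:end] = (value % WORD).to_bytes(32, "big")

    def mload(self, offset: int) -> int:
        """Read a 32 byte word; bytes never written read as zero."""
        data = bytes(self.memory[offset:offset + 32]).ljust(32, b"\x00")
        return int.from_bytes(data, "big")

    def sstore(self, key: int, value: int) -> None:
        self.storage[key % WORD] = value % WORD

    def sload(self, key: int) -> int:
        return self.storage.get(key % WORD, 0)  # unset slots read as zero


def run_call(storage: dict, body) -> None:
    """Run one call; its memory is discarded when the call finishes."""
    body(ContractContext(storage))


contract_storage = {}
observed = {}


def first_call(ctx):
    ctx.mstore(0, 42)
    ctx.sstore(1, ctx.mload(0))


def second_call(ctx):
    observed["memory"] = ctx.mload(0)
    observed["storage"] = ctx.sload(1)


run_call(contract_storage, first_call)
run_call(contract_storage, second_call)
print(observed)  # {'memory': 0, 'storage': 42}
```

The value written to memory in the first call is gone in the second call, whereas the value written to storage is still there.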

The following diagram shows the design of the EVM where the virtual ROM stores the program code
that is copied into main memory using CODECOPY. The main memory is then read by the EVM by
referring to the program counter and executes instructions step by step. The program counter and
EVM stack are updated accordingly with each instruction execution.

 EVM optimization is an active area of research and recent research has suggested that EVM
can be optimized and tuned to a very fine degree in order to achieve high performance.
 Research into the possibility of using WebAssembly (WASM) is already underway. WASM is
developed by Google, Mozilla, and Microsoft and is now being designed as an open standard
by the W3C community group. The aim of WASM is to be able to run machine code in the
browser that will result in execution at native speed.
 Similarly, the aim of EVM 2.0 is to be able to run the EVM instruction set (opcodes) natively
in CPUs, thus making it faster and more efficient.

Execution environment

There are some key elements that are required by the execution environment in order to execute
the code. The key parameters are provided by the execution agent, for example, a transaction.
These are listed as follows:

1. The address of the account that owns the executing code.

2. The address of the sender of the transaction and the originating address of this execution.

3. The gas price in the transaction that initiated the execution.

4. Input data or transaction data, depending on the type of execution agent. This is a byte array
that holds the input to a message call; if the execution agent is a transaction, then the
transaction data is included as input data.

5. The address of the account that initiated the code execution, or the transaction sender. This is
the address of the sender if the code execution is initiated by a transaction; otherwise, it is the
address of the account that made the call.

6. The value or transaction value. This is the amount in Wei. If the execution agent is a transaction,
then it is the transaction value.

7. The code to be executed presented as a byte array that the iterator function picks up in each
execution cycle.

8. The block header of the current block

9. The number of message calls or contract creation transactions currently in execution. In other
words, this is the number of CALLs or CREATEs currently in execution.

The execution environment can be visualized as a tuple of nine elements, as follows:

In addition to the previously mentioned nine fields, system state and the remaining gas are also
provided to the execution environment.

The execution results in producing the resulting state, gas remaining after the execution, self-
destruct or suicide set (described later), log series (described later), and any gas refunds.
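
The nine inputs and the results listed above can be sketched as plain data structures. This is a minimal illustration in Python; the class names and field names are hypothetical and are chosen only to mirror the numbered list, not taken from any Ethereum client.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExecutionEnvironment:
    """The nine-element tuple handed to the EVM (field names are illustrative)."""
    code_owner: bytes    # 1. address of the account that owns the executing code
    origin: bytes        # 2. sender of the transaction that originated the execution
    gas_price: int       # 3. gas price of the originating transaction, in Wei
    data: bytes          # 4. input data (transaction data for a transaction)
    caller: bytes        # 5. account that initiated the code execution
    value: int           # 6. value passed along, in Wei
    code: bytes          # 7. code to be executed, as a byte array
    block_header: dict   # 8. header of the current block
    depth: int           # 9. number of CALLs or CREATEs currently in execution


@dataclass
class ExecutionResult:
    """What an execution produces, following the paragraph above."""
    state: dict          # resulting system state
    gas_remaining: int   # gas left after the execution
    suicide_set: set     # accounts scheduled for removal
    logs: list           # log series
    refund: int          # gas refunds


# The system state and the remaining gas are supplied alongside the tuple.
env = ExecutionEnvironment(
    code_owner=b"\x01" * 20, origin=b"\x02" * 20, gas_price=25 * 10**9,
    data=b"", caller=b"\x02" * 20, value=0, code=b"\x00",
    block_header={}, depth=0,
)
print(len(fields(ExecutionEnvironment)))
```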

Machine state

Machine state is also maintained internally by the EVM. Machine state is updated after each
execution cycle of EVM. An iterator function runs in the virtual machine, which outputs the results of
a single cycle of the state machine.

Machine state is a tuple that consists of the following elements:

 Available gas
 The program counter, which is a non-negative integer less than 2^256

 Memory contents
 Active number of words in memory
 Contents of the stack

The EVM is designed to handle exceptions and will halt (stop execution) in case any of the
following exceptions occur:

 Not having enough gas required for execution


 Invalid instructions
 Insufficient stack items
 Invalid destination of jump Opcodes
 Invalid stack size (greater than 1024)

The iterator function

The iterator function sets the next state of the machine and, eventually, the world state. In each
cycle, it performs the following steps:

 It fetches the next instruction from a byte array where the machine code is stored in the
execution environment.
 It adds/removes (PUSH/POP) items from the stack accordingly.
 It reduces the available gas according to the gas cost of the instructions/Opcodes.
 It increments the program counter (PC).

Machine state can be viewed as a tuple shown in the following diagram

The virtual machine can also halt under normal conditions if a STOP, SUICIDE, or RETURN Opcode
is encountered during the execution cycle.
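
The fetch, stack update, gas deduction, and program counter steps can be illustrated with a toy machine. The sketch below is not the EVM: it supports only three instructions (STOP, PUSH1, and ADD), and the class and function names are hypothetical. It is meant only to show one cycle of an iterator function and the exceptional halts listed earlier.

```python
from dataclasses import dataclass, field

# A three-instruction subset, used here purely for illustration.
STOP, ADD, PUSH1 = 0x00, 0x01, 0x60
GAS_COST = {STOP: 0, ADD: 3, PUSH1: 3}
MAX_STACK = 1024


class OutOfGas(Exception):
    pass


class InvalidInstruction(Exception):
    pass


class StackError(Exception):
    pass


@dataclass
class MachineState:
    gas: int
    pc: int = 0
    stack: list = field(default_factory=list)


def step(state, code):
    """Run one cycle; return False once the machine halts normally."""
    # Fetch the next instruction; running past the end behaves like STOP.
    op = code[state.pc] if state.pc < len(code) else STOP
    if op not in GAS_COST:
        raise InvalidInstruction(hex(op))
    if state.gas < GAS_COST[op]:
        raise OutOfGas()
    state.gas -= GAS_COST[op]            # reduce gas by the instruction cost
    if op == STOP:
        return False
    if op == PUSH1:
        operand = code[state.pc + 1] if state.pc + 1 < len(code) else 0
        state.stack.append(operand)      # add an item to the stack
        state.pc += 2                    # skip the opcode and its operand
    elif op == ADD:
        if len(state.stack) < 2:
            raise StackError("insufficient stack items")
        a, b = state.stack.pop(), state.stack.pop()
        state.stack.append((a + b) % 2**256)
        state.pc += 1                    # increment the program counter
    if len(state.stack) > MAX_STACK:
        raise StackError("stack size greater than 1024")
    return True


def run(code, gas):
    state = MachineState(gas=gas)
    while step(state, code):
        pass
    return state


# PUSH1 2, PUSH1 3, ADD, STOP
final = run(bytes([0x60, 2, 0x60, 3, 0x01, 0x00]), gas=100)
print(final.stack, final.gas, final.pc)
```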

Accounts

 Accounts are one of the main building blocks of the Ethereum blockchain.
 The state is created or updated as a result of the interaction between accounts.
 Operations performed between and on the accounts represent state transitions.

State transition is achieved using what's called the Ethereum state transition function, which works
as follows:

1. Confirm the transaction validity by checking the syntax, signature validity, and nonce.

2. Transaction fee is calculated and the sending address is resolved using the signature.
Furthermore, sender's account balance is checked and subtracted accordingly and nonce is
incremented. An error is returned if the account balance is not enough.

3. Provide enough ether (gas price) to cover the cost of the transaction. This is charged per byte
incrementally according to the size of the transaction.

4. In this step, the actual transfer of value occurs. The flow is from the sender's account to receiver's
account. The account is created automatically if the destination account specified in the transaction
does not exist yet. Moreover, if the destination account is a contract, then the contract code is
executed. This also depends on the amount of gas available. If enough gas is available, then the
contract code will be executed fully; otherwise, it will run up to the point where it runs out of gas.

5. In cases of transaction failure due to insufficient account balance or gas, all state changes are
rolled back with the exception of fee payment, which is paid to the miners.

6. Finally, the remainder (if any) of the fee is sent back to the sender as change and fee is paid to the
miners accordingly. At this point, the function returns the resulting state.
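
The six steps can be sketched as a single function. This is a simplified illustration under stated assumptions: state is a plain dictionary, signature and syntax checks are assumed to have happened already, `intrinsic_gas` stands in for the size-dependent charge of step 3, and `run_code` is a hypothetical callback representing contract execution. None of these names come from a real client.

```python
import copy


class InvalidTransaction(Exception):
    pass


class OutOfGas(Exception):
    pass


def apply_transaction(state, tx, run_code=lambda state, tx, gas: gas):
    """Return (new_state, miner_fee) without modifying the input state."""
    sender = state.get(tx["sender"])
    # Step 1: validity (signature and syntax are assumed to be checked already).
    if sender is None or tx["nonce"] != sender["nonce"]:
        raise InvalidTransaction("unknown sender or wrong nonce")
    # Step 2: the fee is computed up front and the balance is checked.
    upfront = tx["gas_limit"] * tx["gas_price"]
    if sender["balance"] < upfront + tx["value"]:
        raise InvalidTransaction("insufficient balance")
    # Step 3: the gas limit must cover the size-dependent base charge.
    if tx["gas_limit"] < tx["intrinsic_gas"]:
        raise InvalidTransaction("gas limit below intrinsic gas")

    state = copy.deepcopy(state)
    sender = state[tx["sender"]]
    sender["balance"] -= upfront
    sender["nonce"] += 1
    gas = tx["gas_limit"] - tx["intrinsic_gas"]
    checkpoint = copy.deepcopy(state)   # fee paid, nothing else changed yet
    try:
        # Step 4: transfer value, creating the destination account if needed.
        receiver = state.setdefault(tx["to"], {"balance": 0, "nonce": 0})
        sender["balance"] -= tx["value"]
        receiver["balance"] += tx["value"]
        gas = run_code(state, tx, gas)  # returns the gas left after execution
    except OutOfGas:
        # Step 5: roll back everything except the fee payment.
        state, gas = checkpoint, 0
    # Step 6: refund unused gas to the sender; the rest goes to the miner.
    state[tx["sender"]]["balance"] += gas * tx["gas_price"]
    miner_fee = (tx["gas_limit"] - gas) * tx["gas_price"]
    return state, miner_fee
```

Working on a copy of the state keeps the rollback in step 5 trivial: the checkpoint taken after the fee is charged is simply returned.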

Types of accounts

There are two types of accounts in Ethereum:

 Externally owned accounts (EOAs)


 Contract accounts

Externally owned accounts (EOAs) are similar to accounts that are controlled by a private key in
bitcoin. Contract accounts are accounts that have code associated with them; they are controlled
by that code rather than by a private key. An EOA has an ether balance, is able to send
transactions, and has no associated code.

A Contract Account (CA) has an ether balance, associated code, and the ability to get triggered and
execute code in response to a transaction or a message. Due to the Turing-completeness property of
the Ethereum blockchain, the code within contract accounts can be of any level of complexity. The
code is executed by the EVM on each mining node on the Ethereum network. In addition, contract
accounts are able to maintain their own permanent state and can call other contracts.

Block: Blocks are the main building blocks of a blockchain. Ethereum blocks consist of various
components:

 The block header


 The transactions list
 The list of headers of Ommers or Uncles

The transaction list is simply a list of all transactions included in the block. In addition, the list of
headers of Uncles is also included in the block.

The most important and complex part is the block header, which is discussed here.

Block header : Block headers are the most critical and detailed components of an Ethereum block.
The header contains valuable information, which is described in detail here.

 Parent hash: This is the Keccak 256-bit hash of the parent (previous) block's header.
 Ommers hash: This is the Keccak 256-bit hash of the list of Ommers (Uncles) blocks included
in the block.
 Beneficiary: The beneficiary field contains the 160-bit address of the recipient that will receive
the mining reward once the block is successfully mined.
 State root: The state root field contains the Keccak 256-bit hash of the root node of the state
trie. It is calculated after all transactions have been processed and finalized.
 Transactions root: The transactions root is the Keccak 256-bit hash of the root node of the
transaction trie. The transaction trie represents the list of transactions included in the block.
 Receipts root: The receipts root is the Keccak 256-bit hash of the root node of the
transaction receipt trie. This trie is composed of receipts of all transactions included in the
block. Transaction receipts are generated after each transaction is processed and contain
useful post-transaction information.
 Logs bloom: The logs bloom is a bloom filter that is composed of the logger address and log
topics from the log entry of each transaction receipt of the included transaction list in the
block.
 Difficulty: The difficulty level of the current block.
 Number: The total number of all previous blocks; the genesis block is block zero.
 Gas limit: The field contains the value that represents the limit set on the gas consumption
per block.
 Gas used: The field contains the total gas consumed by the transactions included in the
block.
 Timestamp: The timestamp is the epoch Unix time of the time of block initialization.
 Extra data: The extra data field can be used to store arbitrary data related to the block.
 Mixhash: The mixhash field contains a 256-bit hash that, once combined with the nonce, is used
to prove that adequate computational effort has been spent in order to create this block.
 Nonce: The nonce is a 64-bit hash (a number) that is used to prove, in combination with the
mixhash field, that adequate computational effort has been spent in order to create this
block.
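
As a compact summary, the fifteen header fields can be written down as a record with the sizes stated above. This is a sketch only: the class name, field names, and the `header_problems` helper are hypothetical, and no hashing or encoding is performed.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BlockHeader:
    parent_hash: bytes        # 32 bytes: Keccak 256-bit hash of the parent header
    ommers_hash: bytes        # 32 bytes: hash of the list of Ommers
    beneficiary: bytes        # 20 bytes: 160-bit address receiving the reward
    state_root: bytes         # 32 bytes: root of the state trie
    transactions_root: bytes  # 32 bytes: root of the transaction trie
    receipts_root: bytes      # 32 bytes: root of the receipt trie
    logs_bloom: bytes         # bloom filter over logger addresses and log topics
    difficulty: int
    number: int               # genesis block is block zero
    gas_limit: int
    gas_used: int
    timestamp: int            # Unix epoch time
    extra_data: bytes
    mix_hash: bytes           # 32 bytes: 256-bit hash used with the nonce
    nonce: bytes              # 8 bytes: 64-bit value


HASH_FIELDS = ("parent_hash", "ommers_hash", "state_root",
               "transactions_root", "receipts_root", "mix_hash")


def header_problems(header):
    """Return a list of size or consistency problems (empty if none found)."""
    problems = [name for name in HASH_FIELDS if len(getattr(header, name)) != 32]
    if len(header.beneficiary) != 20:
        problems.append("beneficiary")
    if len(header.nonce) != 8:
        problems.append("nonce")
    if header.gas_used > header.gas_limit:
        problems.append("gas_used")
    return problems
```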

Precompiled contracts
There are four precompiled contracts in Ethereum. Here is the list of these contracts and details.

 The elliptic curve public key recovery function


 The SHA-256 bit hash function
 The RIPEMD-160 bit hash function
 The identity function

The elliptic curve public key recovery function :

ECDSARECOVER (Elliptic curve DSA recover function) is available at address 1. It is denoted as ECREC
and requires 3000 gas for execution. If the signature is invalid, then no output is returned by this
function. Public key recovery is a standard mechanism by which the public key can be derived from
the signature and the hash of the signed message in elliptic curve cryptography.

The ECDSA recovery function is shown as follows: ECDSARECOVER(H, V, R, S) = Public Key

It takes four inputs: H, which is a 32 byte hash of the message to be signed, and V, R, and S, which
represent the ECDSA signature with the recovery ID. It produces a 64 byte public key.

V, R, and S have been discussed in detail previously in this chapter.

In Ethereum, V is a single byte value that depicts the size and sign of the elliptic curve point and
can be either 27 or 28. V is used in the ECDSA recovery contract as a recovery ID.

R is derived from a calculated point on the curve. First, a random number is picked, which is
multiplied with the generator of the curve to calculate a point on the curve. The x coordinate part
of this point is R. R is encoded as a 32 byte sequence. R must be greater than 0 and less than the
secp256k1n limit
(115792089237316195423570985008687907852837564279074904382605163141518161494337).

S is calculated by multiplying R with the private key, adding the result to the hash of the message
to be signed, and finally dividing the sum by the random number chosen to calculate R. S is also a
32 byte sequence. R and S together represent the signature.

In order to sign a transaction, the ECDSASIGN function is used, which takes the message to be
signed and the private key as an input and produces V, a single byte value; R, a 32 byte value, and S,
another 32 byte value.
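
The relationship between signing and recovery can be made concrete with a small, unoptimized sketch over secp256k1. The function names are hypothetical, the nonce k is passed in explicitly so that the example is reproducible (a real signer must use a fresh secret value), and the rare case where the x coordinate exceeds the curve order is ignored. It is not constant-time and is not suitable for real keys.

```python
# secp256k1 domain parameters; N is the limit quoted in the text above.
P = 2**256 - 2**32 - 977
N = 115792089237316195423570985008687907852837564279074904382605163141518161494337
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def inv(a, m):
    """Modular inverse via Fermat's little theorem (m is prime here)."""
    return pow(a % m, m - 2, m)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        slope = 3 * a[0] * a[0] * inv(2 * a[1], P) % P
    else:
        slope = (b[1] - a[1]) * inv(b[0] - a[0], P) % P
    x = (slope * slope - a[0] - b[0]) % P
    return (x, (slope * (a[0] - x) - a[1]) % P)


def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k > 0:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def sign(h, private_key, k):
    """Return (V, R, S) for message hash h, using the nonce k."""
    point = mul(k, G)
    r = point[0] % N
    s = inv(k, N) * (h + r * private_key) % N   # S = (H + R * key) / k
    v = 27 + (point[1] & 1)                     # recovery ID from the y parity
    return v, r, s


def recover(h, v, r, s):
    """Return the public key point, or None if the signature is invalid."""
    if v not in (27, 28) or not (0 < r < N and 0 < s < N):
        return None
    y_squared = (pow(r, 3, P) + 7) % P
    y = pow(y_squared, (P + 1) // 4, P)         # square root, since P % 4 == 3
    if y * y % P != y_squared:
        return None
    if (y & 1) != (v - 27):
        y = P - y
    # Q = R^-1 * (S * point - H * G)
    combined = add(mul(s, (r, y)), mul((-h) % N, G))
    return mul(inv(r, N), combined)


private_key = 0xC0FFEE
message_hash = 0x1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF
v, r, s = sign(message_hash, private_key, k=0xDEADBEEF)
public_key = recover(message_hash, v, r, s)
encoded = public_key[0].to_bytes(32, "big") + public_key[1].to_bytes(32, "big")
print(len(encoded))
```

Recovery never sees the private key: it rebuilds the curve point from R and V, then solves the signing equation for the public key.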

The SHA-256 bit hash function

The SHA-256 bit hash function is a precompiled contract that is available at address 2 and produces
a SHA256 hash of the input. It is almost like a pass-through function. Gas requirement for SHA-256
(SHA256) depends on the input data size. The output is a 32 byte value.

The RIPEMD-160 bit hash function

The RIPEMD-160 bit hash function is used to provide RIPEMD 160-bit hash and is available at address
3. The output of this function is a 20-byte value. Gas requirement, similar to SHA-256, is dependent
on the amount of input data.

The identity function

The identity function is available at address 4 and is denoted by ID. It simply defines output as
input; in other words, whatever input is given to the ID function, it will output the same value. Gas
requirement is calculated by a simple formula: 15 + 3 * ceil(|Id| / 32), where Id is the input data
and |Id| is its length in bytes.

This means that at a high level, the gas requirement is dependent on the size of the input data albeit
with some calculation performed, as shown in the preceding equation.
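
A short sketch of the identity contract and its gas formula follows, reading the bracketed term as the number of 32 byte words rounded up. The helper names are hypothetical; the SHA-256 line uses Python's standard `hashlib` only to confirm the 32 byte output size mentioned earlier.

```python
import hashlib


def words(data: bytes) -> int:
    """Number of 32 byte words needed to hold data, rounded up."""
    return (len(data) + 31) // 32


def identity(data: bytes) -> bytes:
    """The identity contract returns its input unchanged."""
    return data


def identity_gas(data: bytes) -> int:
    """Gas requirement: 15 + 3 * ceil(len(data) / 32)."""
    return 15 + 3 * words(data)


payload = b"x" * 33
print(identity(payload) == payload, identity_gas(payload))
print(len(hashlib.sha256(payload).digest()))
```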

All the previously mentioned precompiled contracts can become native extensions and can be
included in the EVM opcodes in the future.

Transaction validation and execution

Transactions are executed after verifying the transactions for validity.

Initial tests are listed as follows:

 A transaction must be well-formed and RLP-encoded without any additional trailing bytes
 The digital signature used to sign the transaction is valid
 Transaction nonce must be equal to the sender's account's current nonce
 Gas limit must not be less than the intrinsic gas used by the transaction
 The sender's account contains enough balance to cover the execution cost

The transaction sub state

A transaction sub-state is created during the execution of the transaction that is processed
immediately after the execution completes. This transaction sub-state is a tuple that is composed
of three items.

 Suicide set: This element contains the list of accounts that are disposed of after the
transaction is executed.
 Log series: This is an indexed series of checkpoints that allow the monitoring and
notification of contract calls to the entities external to the Ethereum environment, such as
application frontends. It works like a trigger mechanism that is executed every time a
specific function is invoked or a specific event occurs. Logs are created in response to events
occurring in the smart contract. They can also be used as a cheaper form of storage.
 Refund balance: This is the total price of gas in the transaction that initiated the execution.
Refunds are not immediately executed; instead, they are used to partially offset the total
execution cost.

The following diagram describes the transaction sub-state tuple

The block validation mechanism

An Ethereum block is considered valid if it passes the following checks:

 It is consistent with Uncles and transactions. This means that all Ommers (Uncles) satisfy the
property that they are indeed Uncles and that the Proof of Work for Uncles is valid.
 The previous block (parent) exists and is valid.
 The timestamp of the block is valid. This basically means that the current block's
timestamp must be higher than the parent block's timestamp. Also, it should be less
than 15 minutes into the future. All block times are calculated in epoch time (Unix time).

If any of these checks fails, the block will be rejected.
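
The timestamp rule is simple enough to state as code. This is a minimal sketch with a hypothetical function name; the current time can be passed in explicitly so that the check is easy to test.

```python
import time

MAX_FUTURE_SECONDS = 15 * 60  # at most 15 minutes into the future


def timestamp_is_valid(block_timestamp, parent_timestamp, now=None):
    """Check that the block is later than its parent and not too far ahead."""
    if now is None:
        now = time.time()
    return parent_timestamp < block_timestamp < now + MAX_FUTURE_SECONDS


print(timestamp_is_valid(block_timestamp=110, parent_timestamp=100, now=105))
```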

Block finalization
Block finalization is a process that is run by miners in order to validate the contents of the block
and apply rewards. It results in four steps being executed. These steps are described here in detail.

 Ommers validation: Validate Ommers (stale blocks, also called Uncles). In the case of mining,
determine Ommers. The validation process of the headers of stale blocks checks whether
the header is valid and the relationship of the Uncle with the current block satisfies the
maximum depth of six blocks. A block can contain a maximum of two Uncles.
 Transaction validation: Validate transactions. In the case of mining, determine transactions.
The process involves checking whether the total gas used in the block is equal to the final
gas consumption after the final transaction.
 Reward application: Apply rewards, which means updating the beneficiary's account with a
reward balance. In Ethereum, a reward is also given to miners for including stale blocks, which
is 1/32 of the block reward. Uncles that are included in the blocks also receive 7/8 of the total
block reward. The current block reward is 5 Ether. A block can have a maximum of two Uncles.
 State and nonce validation: Verify the state and nonce. In the case of mining, compute a
valid state and nonce.
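
The reward figures above can be worked through in integer Wei. The sketch below assumes the flat 7/8 figure from the text for every included Uncle and a 1/32 bonus per Uncle for the including miner; the function name is hypothetical and gas fees are left out.

```python
ETHER = 10**18                 # Wei per Ether
BLOCK_REWARD = 5 * ETHER
MAX_UNCLES = 2


def block_rewards(num_uncles):
    """Return (miner_reward, list_of_uncle_rewards) in Wei."""
    if not 0 <= num_uncles <= MAX_UNCLES:
        raise ValueError("a block can contain at most two Uncles")
    miner_reward = BLOCK_REWARD + num_uncles * BLOCK_REWARD // 32
    uncle_reward = BLOCK_REWARD * 7 // 8
    return miner_reward, [uncle_reward] * num_uncles


miner, uncles = block_rewards(2)
print(miner / ETHER, [u / ETHER for u in uncles])
```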

Block difficulty
Block difficulty is increased if the time between two blocks decreases, whereas it is decreased if
the time between two blocks increases. This is required to maintain a roughly consistent block
generation time.

The difficulty adjustment algorithm in Ethereum's homestead release is shown as follows:

block_diff = parent_diff + parent_diff // 2048 *
    max(1 - (block_timestamp - parent_timestamp) // 10, -99) +
    int(2**((block.number // 100000) - 2))

The preceding algorithm means that, if the time difference between the generation of the parent
block and the current block is less than 10 seconds, the difficulty goes up.

If the time difference is between 10 and 19 seconds, the difficulty level remains the same.

Finally, if the time difference is 20 seconds or more, the difficulty level decreases. This decrease is
proportional to the time difference. In addition to timestamp-difference-based difficulty adjustment,
there is also another part that increases the difficulty exponentially after every 100,000 blocks.

This is the so called difficulty time bomb or Ice age introduced in the Ethereum network, which will
make it very hard to mine on the Ethereum blockchain at some point in the future.

This will encourage users to switch to Proof of Stake as mining on the POW chain will eventually
become prohibitively difficult. According to the latest update and estimates based on the algorithm,
the block generation time will become significantly high during the second half of the year 2017 and
in 2021, it will become so high that it will be virtually impossible to mine on the POW chain. This
way, miners will have no choice but to switch to the Proof of Stake scheme proposed by Ethereum
called Casper.
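
The Homestead difficulty formula given earlier can be run directly to see the three timing cases and the exponential term. This is a sketch of that formula only; the function name is hypothetical, and any protocol rules not shown in the formula (such as a minimum difficulty) are left out.

```python
def homestead_difficulty(parent_diff, parent_timestamp, block_timestamp, block_number):
    """Evaluate the difficulty formula for a new block."""
    # +1 below 10 seconds, 0 for 10 to 19 seconds, negative beyond, floor of -99.
    adjustment = max(1 - (block_timestamp - parent_timestamp) // 10, -99)
    # The exponential "time bomb" term grows every 100,000 blocks.
    bomb = int(2 ** ((block_number // 100000) - 2))
    return parent_diff + parent_diff // 2048 * adjustment + bomb


parent = 2048000
for gap in (5, 15, 25):
    print(gap, homestead_difficulty(parent, 0, gap, block_number=0))
```

With a parent difficulty of 2,048,000, each adjustment step is worth 1,000, so a 5 second gap raises the difficulty, a 15 second gap leaves it unchanged, and a 25 second gap lowers it.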

Ether
 Ether is minted by miners as a currency reward for the computational effort they spend in
order to secure the network by verifying and validating transactions and blocks.
 Ether is used within the Ethereum blockchain to pay for the execution of contracts on the
EVM. Ether is used to purchase gas as crypto fuel, which is required in order to perform
computation on the Ethereum blockchain.
 The denomination table is shown as follows:
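
A minimal sketch of the commonly used Ether denominations, expressed as powers of ten of Wei, is given below. The unit names are the widely used ones; the dictionary and helper function names are hypothetical.

```python
from decimal import Decimal

# Number of Wei in one unit of each denomination.
WEI_PER_UNIT = {
    "wei": 1,
    "kwei": 10**3,
    "mwei": 10**6,
    "gwei": 10**9,
    "szabo": 10**12,
    "finney": 10**15,
    "ether": 10**18,
}


def to_wei(amount, unit):
    """Convert an amount in the given unit to an integer number of Wei."""
    return int(Decimal(str(amount)) * WEI_PER_UNIT[unit])


def from_wei(wei, unit):
    """Convert an integer number of Wei to the given unit."""
    return Decimal(wei) / WEI_PER_UNIT[unit]


print(to_wei(1, "ether"), from_wei(25 * 10**9, "ether"))
```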

Fees are charged for each computation performed by the EVM on the blockchain
Gas
 Gas is required to be paid for every operation performed on the Ethereum blockchain.
 This is a mechanism that ensures that infinite loops cannot cause the whole blockchain to
stall due to the Turing-complete nature of the EVM.
 A transaction fee is charged as some amount of Ether and is taken from the account balance
of the transaction originator. A fee is paid for transactions to be included by miners for
mining. If this fee is too low, the transaction may never be picked up; the higher the fee, the
higher the chances that the transaction will be picked up by the miners for inclusion in
the block.
 Conversely, if the transaction that has an appropriate fee paid is included in the block by
miners but has too many complex operations to perform, it can result in an out-of-gas
exception if the gas cost is not enough. In this case, the transaction will fail but will still be
made part of the block and the transaction originator will not get any refund.

Transaction cost can be estimated using the following formula:

Total cost = gasUsed * gasPrice

Here, gasUsed is the total gas that is supposed to be used by the transaction during the
execution and gasPrice is specified by the transaction originator as an incentive to the
miners to include the transaction in the next block. This is specified in Ether.

 Each EVM opcode has a fee assigned to it. The total cost is an estimate because the gas used
can be more or less than the value originally specified by the transaction originator.

For example, if computation takes too long or the behavior of the smart contract changes in
response to some other factors, then the transaction execution may perform more or fewer
operations than originally intended and can result in consuming more or less gas.

 If the execution runs out of gas, everything is immediately rolled back; otherwise, if the
execution is successful and there is some remaining gas, then it is returned to the
transaction originator.
 Each operation costs some gas; a high level fee schedule of a few operations is shown as
an example here

Based on the preceding fee schedule and the formula discussed earlier, an example
calculation for the SHA3 operation is as follows:

SHA3 costs 30 gas.

Current gas price is 25 GWei, which is 0.000000025 Ether.

Multiplying both: 0.000000025 * 30 = 0.00000075 Ether

In total, 0.00000075 Ether is the total fee that will be charged.
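
The same calculation can be done in integer Wei, which avoids rounding problems with small decimal fractions. This is a small sketch with a hypothetical function name; the figures are the ones used in the example above.

```python
from decimal import Decimal

GWEI = 10**9    # Wei per GWei
ETHER = 10**18  # Wei per Ether


def transaction_cost_wei(gas_used, gas_price_wei):
    """Total cost = gasUsed * gasPrice, in Wei."""
    return gas_used * gas_price_wei


cost = transaction_cost_wei(gas_used=30, gas_price_wei=25 * GWEI)
print(cost, Decimal(cost) / Decimal(ETHER))
```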

Fee schedule
Gas is charged in three scenarios as a prerequisite to the execution of an operation:
 The computation of an operation
 For contract creation or message call
 Increase in the usage of memory

Messages
 Messages are the data and value that are passed between two accounts.
 A message is a data packet passed between two accounts. This data packet contains
data and value (amount of ether).
 It can either be sent via a smart contract (autonomous object) or from an external
actor (externally owned account) in the form of a transaction that has been digitally
signed by the sender.
 Contracts can send messages to other contracts.
 Messages only exist in the execution environment and are never stored.
 Messages are similar to transactions; however, the main difference is that they are
produced by the contracts, whereas transactions are produced by entities external
(externally owned accounts) to the Ethereum environment.
A message consists of the components mentioned here:
1. Sender of the message
2. Recipient of the message
3. Amount of Wei to transfer with the message to the contract address
4. Optional data field (Input data for the contract)
5. Maximum amount of gas that can be consumed
Messages are generated when CALL or DELEGATECALL Opcodes are executed by the
contracts.

Calls
 A call does not broadcast anything to the blockchain; instead, it is a local call to a
contract function and runs locally on the node. It is almost like a local function call. It
does not consume any gas as it is a read-only operation. It is akin to a dry run.
 Calls are executed locally on a node and generally do not result in any state change.
 Call is the act of passing a message from one account to another. If the destination
account has an associated EVM code, then the virtual machine will start upon the
receipt of the message to perform the required operations. If the message sender is
an autonomous object, then the call passes any data returned from the virtual
machine operation.
 State is altered by transactions. These are created by external actors and are signed
and then broadcast to the Ethereum network.
Mining
Mining is the process by which new currency is added to the blockchain.
This is an incentive for the miners to validate and verify blocks made up of transactions.
The mining process helps secure the network by verifying computations. At a theoretical
level, a miner performs the following functions:
1. Listens for the transactions broadcasted on the Ethereum network and determines the
transactions to be processed.
2. Determines stale blocks called Uncles or Ommers and includes them in the block.
3. Updates the account balance with the reward earned from successfully mining the block.
4. Finally, a valid state is computed and block is finalized, which defines the result of all state
transitions.

 The current method of mining is based on Proof of Work, which is similar to that of
bitcoin. When a block is deemed valid, it has to satisfy not only the general
consistency requirements, but it must also contain the Proof of Work for a given
difficulty.
 The Proof of Work algorithm is due to be replaced with the Proof of Stake algorithm
with the release of Serenity.
 Considerable research work has been carried out in order to build a Proof of Stake
algorithm suitable for the Ethereum network. An algorithm named Casper has been
developed, which will replace the existing Proof of Work algorithm in Ethereum. This
is a security-deposit-based economic protocol where nodes are required to
place a security deposit before they can produce blocks. Nodes have been named
bonded validators in Casper, whereas the act of placing the security deposit is
named bonding.

Ethash
Ethash is the name of the Proof of Work algorithm used in Ethereum.
Similar to bitcoin, the core idea behind mining is to find a nonce that, once hashed, produces a
result that satisfies a predetermined difficulty level. Initially, the difficulty was low when
Ethereum was new and even CPU and single GPU mining was profitable to a certain extent, but that
is no longer the case. Now either pooled mining is profitable, or large GPU mining farms are used
for mining purposes.
Ethash is a memory-hard algorithm, which makes it difficult to implement on
specialized hardware. In bitcoin, ASICs have been developed, which have resulted in
mining centralization over the years, but memory-hard Proof of Work algorithms are one
way of thwarting this threat, and Ethereum implements Ethash to discourage ASIC
development for mining.
This algorithm requires choosing subsets of a fixed resource called DAG (Directed Acyclic
Graph) depending on the nonce and block headers. DAG is around 2 GB in size and changes
every 30000 blocks.
Mining can only start when DAG is completely generated the first time a mining node starts.
The time between every 30000 blocks is around 5.2 days and is called epoch.
This DAG is used as a seed by the Proof of Work algorithm called Ethash.
According to current specifications, the epoch time is defined as 30,000 blocks.
The current reward scheme is 5 Ether for successfully finding a valid nonce.
In addition to receiving 5 Ethers, the successful miner also receives the cost of the gas
consumed within the block and an additional reward for including stale blocks (Uncles) in
the block
A maximum of two Uncles are allowed per block and are rewarded 7/8 of the normal block
reward.
In order to achieve a 12 second block time, block difficulty is adjusted at every block.
The rewards are directly proportional to the miner's hash rate, which basically means how
fast a miner can hash.
Mining can be performed by simply joining the Ethereum network and running an
appropriate client.

The key requirement is that the node should be fully synced with the main network before
mining can start.
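
The epoch arithmetic mentioned above is easy to check. In this sketch (function names are hypothetical), an epoch of 30,000 blocks lasts about 5.2 days when the average block time is 15 seconds; at a 12 second block time the same epoch would last roughly 4.2 days.

```python
EPOCH_LENGTH = 30000       # blocks per epoch
SECONDS_PER_DAY = 86400


def epoch_of(block_number):
    """Epoch index of a block; the DAG changes when this value changes."""
    return block_number // EPOCH_LENGTH


def epoch_duration_days(block_time_seconds):
    """Length of one epoch in days for a given average block time."""
    return EPOCH_LENGTH * block_time_seconds / SECONDS_PER_DAY


print(epoch_of(29999), epoch_of(30000))
print(round(epoch_duration_days(15), 1), round(epoch_duration_days(12), 1))
```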

In the upcoming section, various methods of mining are mentioned

 CPU mining: Even though not profitable on the main net, CPU mining is still valuable
on the test network or even a private network to experiment with mining and
contract deployment. For example, geth can be started with the --mine switch in
order to start CPU mining.

 GPU mining
 Mining rigs: As difficulty increased over time for mining Ether, miners started to build
mining rigs with multiple GPUs. A mining rig usually contains
around five GPU cards, and all of them work in parallel for mining, thus improving
the chances of finding valid nonces for mining. Mining rigs can be built with some
effort and are also available commercially from various vendors.
A typical mining rig configuration includes the following components:

Motherboard: A specialized motherboard with multiple PCI-E x1 or x16 slots, for
example, BIOSTAR Hi-Fi or ASRock H81, is required.
SSD hard drive: An SSD hard drive is required. The SSD drive is recommended
because of its much faster performance over the mechanical equivalent. This will be
mainly used to store the blockchain.
GPU: The GPU is the most important component of the rig as it is the main
workhorse that will be used for mining. For example, it can be a Sapphire AMD
Radeon R9 380 with 4 GB RAM.

Linux Ubuntu's latest version is usually chosen as the operating system for the rig.
There is also another variant of Linux available, called EthOS, that is especially built
for Ethereum mining and supports mining operations natively.

Finally, mining software such as Ethminer and geth is installed. Additionally,
some remote monitoring and administration software is also installed so that rigs
can be monitored and managed remotely, if required.
It is also important to put appropriate air conditioning or cooling mechanisms in
place as running multiple GPUs can generate a lot of heat. This also calls for
appropriate monitoring software that can alert users if there
are any problems with the hardware, for example, if the GPUs are overheating.

Mining pools: There are many online mining pools that offer Ethereum mining.

The Ethereum network


The Ethereum network is a peer-to-peer network where nodes participate in order to
maintain the blockchain and contribute to the consensus mechanism. Networks can be
divided into three types, based on requirements and usage.
 MainNet: MainNet is the current live network of Ethereum. The current version of
MainNet is Homestead.
 TestNet: TestNet is also called Ropsten and is the test network for the Ethereum
blockchain. This blockchain is used to test smart contracts and DApps before they are
deployed to the production live blockchain. Moreover, being a test network, it
allows experimentation and research.
 Private net(s): As the name suggests, this is a private network that can be created
by generating a new genesis block. This is usually the case in distributed ledger
networks, where a private group of entities start their own blockchain and use it as a
permissioned blockchain.

Supporting protocols
There are various supporting protocols that are in development in order to support the
complete decentralized ecosystem. These include the Whisper and Swarm protocols. In addition
to the contracts layer, which is the core blockchain layer, there are additional layers that
need to be decentralized in order to achieve a complete decentralized ecosystem. These
include decentralized storage and decentralized messaging.
Whisper, being developed for Ethereum, is a decentralized messaging protocol, whereas
Swarm is a decentralized storage protocol. Both of these technologies are being developed
currently and have been envisaged to provide the basis for a fully decentralized web.

Whisper: Whisper provides decentralized peer-to-peer messaging capabilities to the
Ethereum network. In essence, Whisper is a communication protocol that nodes use in order
to communicate with each other. The data and routing of messages are encrypted within
Whisper communications. Moreover, it is designed to be used for smaller data transfers and
in scenarios where real-time communication is not required.
Whisper is also designed to provide a communication layer that cannot be traced and
provides "dark communication" between parties.
Blockchain can be used for communication, but that is expensive and consensus is not really
required for messages exchanged between nodes.

Swarm: Swarm is being developed as a distributed file storage platform. It is a
decentralized, distributed, and peer-to-peer storage network. Files in this network are
addressed by the hash of their content. This is in contrast to the traditional centralized
services, where storage is available at a central location only. This is developed as a native
base layer service for the Ethereum web 3.0 stack. Swarm is integrated with DevP2P, which
is the multiprotocol network layer of Ethereum. Swarm is envisaged to provide a DDOS
(Distributed Denial of Service)-resistant and fault-tolerant distributed storage layer for
Ethereum Web 3.0.

Both Whisper and Swarm are under development and, even though Proof of Concept and
alpha code have been released for Swarm, there is no stable production version available yet.
The following figure gives a high level overview of how Swarm and whisper fit together and
work with blockchain:

Applications developed on Ethereum

There are various implementations of DAOs and smart contracts in Ethereum, most notably, the
DAO, which was recently hacked and required a hard fork in order for funds to be recovered. The
DAO was created to serve as a decentralized platform to collect and distribute investments. Augur,
a decentralized prediction market, is another DApp that has been implemented on Ethereum.
