Bitcoin Notes
Saravanan Vijayakumaran
Department of Electrical Engineering
Indian Institute of Technology Bombay
Email: [email protected]
Version 0.1
October 4, 2017
Abstract
Contents
1 Introduction
4 The Blockchain
4.1 Rewarding Blockchain Updation
4.2 The Block Header
4.3 Mining
4.4 Bitcoin Transactions
4.5 Bitcoin Ownership
4.6 Double Spending Attacks
4.7 Blockchain Integrity
4.8 The 51% Attacker
4.9 Summary
5 Bitcoin Transactions
5.1 Block Format
5.2 Pre-SegWit Regular Transactions
5.3 Pre-SegWit Coinbase Transactions
5.4 Bitcoin Script
5.5 Pre-SegWit Standard Scripts
5.6 Pre-SegWit Signature Generation
5.7 Transaction Malleability
5.8 SegWit Standard Scripts
6 Contracts
6.1 Escrow
6.2 Micropayments
6.3 Decentralized Lotteries
Introduction
• A method for ensuring that only the rightful owner of bitcoins stored in
an address can move them to a new address.
network where the participants are anonymous and not accountable for their
behaviour.
The main innovation in Bitcoin is that the maintenance of the blockchain
database is linked to the creation of new bitcoins. The blockchain consists of
a linked list or chain of blocks where each block contains a set of transactions.
Blocks are appended to the blockchain one at a time where each addition
requires finding a solution to a computationally hard search problem. Nodes
in the Bitcoin network which successfully add a block to the blockchain are
rewarded with new bitcoins. Such nodes are called miners and their search
for solutions of the computationally hard problems is called mining.
In the forthcoming chapters, we will describe the different aspects of the
Bitcoin system in detail.
Chapter 2
In public key cryptography, each entity owns a pair of related keys: a public
key and a private key. Given the private key, it is easy to calculate the
corresponding public key. Finding the private key from the public key is
computationally hard. As the names suggest, the private key needs to be kept
a secret while the public key can be advertised. When public key cryptography
is used for encrypted message transmission, the sender uses the receiver’s
public key to encrypt the message. The receiver can decrypt the encrypted
message using its private key. It is computationally infeasible to decrypt the
encrypted message without knowledge of the private key. When public key
cryptography is used for implementing digital signatures, the signer uses its
private key to create the signature on a given message. Verifying the validity
of a signature on a message only requires the public key of the signer. It is
computationally infeasible to create a valid signature without knowledge of
the private key. These concepts are illustrated in Figure 2.1.
Elliptic curve cryptography (ECC) is a method for implementing public
key cryptography. Bitcoin uses public keys derived from the secp256k1 elliptic
curve¹ to derive Bitcoin addresses. Ownership of a Bitcoin address is proved
by generating a digital signature using the corresponding private key and the
elliptic curve digital signature algorithm (ECDSA). Such a proof of ownership
is required in order to spend the bitcoins which have been received by a Bitcoin
address.
Understanding the specifics of the ECC used in Bitcoin requires knowledge
of some abstract algebra. In order to motivate the required prerequisites, let
us look at the structure of the private and public keys as specified by the
secp256k1 domain parameters. All undefined terms will be discussed in the
following sections.
¹ The secp256k1 elliptic curve domain parameters are specified in http://www.secg.org/sec2-v2.pdf.
CHAPTER 2. ELLIPTIC CURVE CRYPTOGRAPHY 4
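[Figure 2.1. Labels recovered from the figure: computing the public key from the private key is "Easy", the reverse direction is "Hard". Encrypted Communication: Receiver's Public Key, Receiver's Private Key. Digital Signatures: Message → Signer (Signer's Private Key) → (Message, Signature) → Verifier (Signer's Public Key) → Decision on Signature Validity.]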
\[
p = \underbrace{\texttt{FFFFFFFF}\ \texttt{FFFFFFFF} \cdots \texttt{FFFFFFFF}}_{48 \text{ hexadecimal digits}}\ \texttt{FFFFFFFE}\ \texttt{FFFFFC2F}. \tag{2.1}
\]
Let $F_p$ denote the corresponding finite field. Consider the solutions $(x, y) \in F_p^2$
to the equation
\[
y^2 = x^3 + 7. \tag{2.2}
\]
The solutions are nothing but points on the curve with coordinates from $F_p$.
Let E be the set of such points. If we add a special element O called the
“point at infinity” to E, then a binary operation can be defined on E ∪ {O}
which makes E ∪ {O} into a group. This binary operation is called “addition”
for convenience and denoted by +. The set E ∪ {O} is called the elliptic curve
corresponding to $y^2 = x^3 + 7$ over $F_p$.
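As a small numerical illustration of equation (2.2), the following Python sketch searches for one point of E. It assumes the value of p from equation (2.1). The helper name `find_point` is hypothetical, and the square-root shortcut relies on p leaving remainder 3 when divided by 4, a fact which is not stated in the text above but is checked in the code.

```python
# Prime p of equation (2.1): six words FFFFFFFF followed by FFFFFFFE FFFFFC2F.
p = int("FFFFFFFF" * 6 + "FFFFFFFE" + "FFFFFC2F", 16)
assert p % 4 == 3  # needed for the square-root shortcut used below


def find_point(p):
    """Return the first (x, y) with y^2 = x^3 + 7 (mod p), scanning x = 1, 2, ..."""
    x = 1
    while True:
        rhs = (pow(x, 3, p) + 7) % p
        # For p = 3 (mod 4), rhs^((p+1)/4) is a square root of rhs whenever one exists.
        y = pow(rhs, (p + 1) // 4, p)
        if (y * y) % p == rhs:
            return x, y
        x += 1


x, y = find_point(p)
print(hex(x), hex(y))
```

Roughly half of the candidate x values have a square root, so the loop ends after a few iterations. If (x, y) is a solution then so is (x, p − y).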
\[
\underbrace{P + P + \cdots + P + P}_{k \text{ times}} \tag{2.5}
\]
2.1 Groups
Let G be a set. A binary operation on G is a rule for combining pairs of
elements from G. Familiar examples of binary operations are addition and
multiplication over the integers.
Definition 1 (Group). Let G be a set with a binary operation ∗ defined on
it. G is called a group if it satisfies the following properties:
(i) ∗ is closed: For all x, y ∈ G, x ∗ y belongs to G.
(ii) ∗ is associative: For all x, y, z ∈ G,
(x ∗ y) ∗ z = x ∗ (y ∗ z).
(iii) An identity exists: There exists an element e ∈ G such that for all x ∈ G,
x ∗ e = e ∗ x = x.
(iv) Inverses exist: For every x ∈ G, there exists an element y ∈ G such that
x ∗ y = y ∗ x = e.
• A set G can have more than one binary operation defined on it. While
the above definition refers to the set G as a group, the binary operation
which endows G with the group structure needs to be specified explicitly
in case of ambiguity.
• G is a subgroup of itself.
In the last example, the order of H is 2 which divides the order 6 of Z6 . This
is not a coincidence. It is an example of Lagrange’s theorem for finite groups
which we state without proof.
Let us return to Z6 to motivate the next result. Note that the subgroup
H = {0, 3} can be written as {3 + 3, 3}. Now consider an arbitrary element
in Z6 , say 4. By repeatedly adding 4 to itself modulo 6 we get {4, 4 + 4 =
2, 4 + 4 + 4 = 0} = {0, 2, 4}, which is also a subgroup of Z6 . Again, this is not
a coincidence but an example of the following theorem about finite groups.
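The two observations about Z6 made above can be checked numerically. The sketch below is only an illustration: the function name `generated_subgroup` is hypothetical, and Z6 is modelled as the integers 0 to 5 under addition modulo 6. It lists the set obtained by repeatedly adding each element to itself and checks that its size divides 6.

```python
def generated_subgroup(x, n):
    """Elements obtained by repeatedly adding x to itself modulo n."""
    elements, current = set(), x % n
    while current not in elements:
        elements.add(current)
        current = (current + x) % n
    return elements


for x in range(6):
    H = generated_subgroup(x, 6)
    # The last column reports whether the order of H divides the order of Z6.
    print(x, sorted(H), 6 % len(H) == 0)
```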
\[
u * v = x^m * x^n = (\underbrace{x * \cdots * x}_{m \text{ times}}) * (\underbrace{x * \cdots * x}_{n \text{ times}}) = \underbrace{x * \cdots * x}_{m+n \text{ times}},
\]
where the last equality follows from the fact that $x^m$ and $x^n$ are also
elements of the group G where the associativity of ∗ allows us to disregard
the parentheses.
(iii) Identity: There exists e ∈ ⟨x⟩ such that e ∗ u = u ∗ e = u for all u ∈ ⟨x⟩.
⟨x⟩ is a subset of a finite group G. But the definition of ⟨x⟩ involves
an infinite sequence of powers of x, all of which belong to ⟨x⟩ since ∗
is closed. This is possible only if the powers of x start repeating. Let
$m, n \in \mathbb{Z}^+$ be such that $m \neq n$ and $x^m = x^n$. There is no loss of
generality in assuming that m > n. Let $x^{-1} \in G$ be the inverse of x. It
exists because G is a group. We have not yet proved that $x^{-1}$ belongs
to ⟨x⟩. Multiplying both sides of the equation $x^m = x^n$ by n copies of
$x^{-1}$, we get
\[
\begin{aligned}
& x^m * \underbrace{x^{-1} * \cdots * x^{-1}}_{n \text{ times}} = x^n * \underbrace{x^{-1} * \cdots * x^{-1}}_{n \text{ times}} \\
\implies{} & \underbrace{x * \cdots * x}_{m \text{ times}} * \underbrace{x^{-1} * \cdots * x^{-1}}_{n \text{ times}} = \underbrace{x * \cdots * x}_{n \text{ times}} * \underbrace{x^{-1} * \cdots * x^{-1}}_{n \text{ times}} \\
\implies{} & \underbrace{x * \cdots * x}_{m-n \text{ times}} = x^{m-n} = e.
\end{aligned}
\]
(iv) Inverse: For any u ∈ ⟨x⟩, there exists a v ∈ ⟨x⟩ such that u ∗ v = v ∗ u = e.
Let p = m − n, where m and n are as in the previous part. Then $x^p = e$. For u ∈ ⟨x⟩,
let $u = x^k$ for some $k \in \mathbb{Z}^+$. Consider the remainder r when k is divided
by p, i.e. r = k mod p. Then 0 ≤ r ≤ p − 1.
If r = 0, then k = pq for some $q \in \mathbb{Z}^+$. This implies that
\[
u = x^k = x^{pq} = \underbrace{x^p * \cdots * x^p}_{q \text{ times}} = \underbrace{e * \cdots * e}_{q \text{ times}} = e.
\]
The order of the group E ∪ {O} is a 256-bit prime number n. If the base
point P is not the identity of E ∪{O}, then by the above theorem the subgroup
generated by P is equal to E ∪ {O}. To prove that P is not the identity of
E ∪ {O} will require us to define the binary operation on E ∪ {O}. For now,
we can give an argument based on common sense. If the base point P were
the identity element of E ∪ {O}, then all the public keys kP would be equal to
P . The digital signature scheme would break down since the same public key
would validate signatures created by any private key. Also, since the public
keys are used to derive Bitcoin addresses, there would be only one Bitcoin
address in the whole system!
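\[
\begin{aligned}
& k_1 P + \underbrace{(-P) + \cdots + (-P)}_{k_2 \text{ times}} = k_2 P + \underbrace{(-P) + \cdots + (-P)}_{k_2 \text{ times}} \\
\implies{} & \underbrace{P + \cdots + P}_{k_1 \text{ times}} + \underbrace{(-P) + \cdots + (-P)}_{k_2 \text{ times}} = \underbrace{P + \cdots + P}_{k_2 \text{ times}} + \underbrace{(-P) + \cdots + (-P)}_{k_2 \text{ times}} \\
\implies{} & \underbrace{P + \cdots + P}_{k_1 - k_2 \text{ times}} = O \implies (k_1 - k_2)P = O.
\end{aligned}
\]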
2.2 Fields
The next algebraic object we need in order to understand ECDSA is called a
field. A field can be described succinctly in terms of abelian groups.
In abelian groups, the result of the binary operation between a pair of elements
is independent of the order of the elements. All the examples of groups
we encountered in the previous section were abelian groups. An example of
a non-abelian group is the set of n × n nonsingular matrices with real entries
where n ≥ 2 and matrix multiplication is the binary operation. For instance,
let n = 2 and consider the following calculation where A and B are 2 × 2
nonsingular matrices.
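\[
\underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}}_{B}
= \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}
\neq \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}
= \underbrace{\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}}_{B}
\underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}}_{A}
\]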
While + and ∗ can be arbitrary binary operations which satisfy the required
properties, they are usually referred to as “addition” and “multiplication”
respectively. The third requirement in the field definition is called the
distributivity of multiplication over addition.
Here are two familiar examples of fields.
• The real numbers R form a field with + and ∗ defined as the usual
addition and multiplication of real numbers.
• The rational numbers Q form a field with + and ∗ defined as the usual
addition and multiplication of rational numbers.
Both R and Q are fields with an infinite number of elements. Fields with finite
cardinality are called finite fields. To describe the ECDSA with the secp256k1
domain parameters, we need to define a finite field whose cardinality is a prime
number. Such fields are called prime fields.
Prime Fields
Let p be a prime number. Let Fp = {0, 1, 2, . . . , p − 1} be the integers from
0 to p − 1. Define + and ∗ on Fp as integer addition modulo p and integer
multiplication modulo p respectively, i.e. for all x, y ∈ Fp
x + y = x + y mod p,
x ∗ y = xy mod p.
We prove that Fp is a field.
Proof. We need to check the three properties stated in the field definition.
• Since Fp is the same as Zp with + as the binary operation, it is a group
with 0 as the identity element. It is an abelian group as x + y mod p =
y + x mod p.
• $F_p^* = F_p \setminus \{0\} = \{1, 2, \ldots, p - 1\}$. To prove that $F_p^*$ is a group with ∗
as the binary operation, we need to check closure of ∗, associativity of
∗, existence of identity and inverses. Once we have shown that $F_p^*$ is a
group, it will follow that it is an abelian group as xy mod p = yx mod p.
– Closure: The result of x ∗ y is an integer from $F_p$. The only way ∗
can fail to be closed on $F_p^*$ is when x ∗ y = 0 for some x, y ∈ $F_p^*$. But
x ∗ y = 0 implies that the integer product xy is a multiple of p,
i.e. xy = kp for some integer k. Since p is a prime number, it would
then have to divide x or y. This is impossible as x and y are
positive integers less than p.
gcd(x, y) = xu + yv.
The identity states that the gcd of two integers x, y can be written
as an integer linear combination of x and y. For example, the
integers 15 and 35 have gcd 5 which can be written as 5 = 35(1) +
15(−2). The integers u and v can be calculated from the Euclidean
algorithm for finding gcds.
To find the multiplicative inverse of x ∈ $F_p^*$, consider the gcd of x
and p. Since x is a positive integer less than p and p is a prime,
gcd(x, p) = 1. By Bézout’s identity, there exist integers u, v such
that xu + pv = 1. Considering both sides of this equation modulo
p, we get xu mod p = 1.
If the integer u belongs to $F_p^*$, then it is the inverse of x as x ∗ u =
u ∗ x = 1. If u ∉ $F_p^*$, then divide it by p to get the remainder r,
i.e. u = qp + r where 0 ≤ r ≤ p − 1. Note that r cannot be 0 as this
would mean u is a multiple of p and xu mod p = 0. So r belongs
to $F_p^*$ and is the inverse of x as
\[
x * r = xr \bmod p = x(u - qp) \bmod p = xu \bmod p = 1.
\]
 + | 0 1 2 3 4          ∗ | 0 1 2 3 4
---+-----------        ---+-----------
 0 | 0 1 2 3 4          0 | 0 0 0 0 0
 1 | 1 2 3 4 0          1 | 0 1 2 3 4
 2 | 2 3 4 0 1          2 | 0 2 4 1 3
 3 | 3 4 0 1 2          3 | 0 3 1 4 2
 4 | 4 0 1 2 3          4 | 0 4 3 2 1
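The inverse computation based on Bézout’s identity can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation. The function names `extended_gcd` and `inverse_mod` are hypothetical, and the recursive form of the Euclidean algorithm is one of several possible choices.

```python
def extended_gcd(x, y):
    """Return (g, u, v) with g = gcd(x, y) = x*u + y*v (Bezout's identity)."""
    if y == 0:
        return x, 1, 0
    g, u, v = extended_gcd(y, x % y)
    # g = y*u + (x % y)*v and x % y = x - (x // y)*y,
    # hence g = x*v + y*(u - (x // y)*v).
    return g, v, u - (x // y) * v


def inverse_mod(x, p):
    """Multiplicative inverse of x in F_p^* for a prime p."""
    g, u, _ = extended_gcd(x, p)
    assert g == 1
    return u % p  # the remainder r of u when divided by p


print(extended_gcd(15, 35))
print([inverse_mod(x, 5) for x in range(1, 5)])
```

The inverses printed for p = 5 can be compared with the positions of the entry 1 in the multiplication table above.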
[Figure: plots of two elliptic curves over the real numbers, with axes x and y. (a) $y^2 = x^3 - x + 2$ (b) $y^2 = x^3 - 2x$]
\[
y = mx + y_1 - mx_1
\]
where $m = \frac{y_2 - y_1}{x_2 - x_1}$. To find the points of intersection between this line and the
curve given in equation (2.6), let us eliminate y from the curve equation by
substituting the expression for y from the line. We get
\[
\begin{aligned}
& (mx + y_1 - mx_1)^2 = x^3 + ax + b \\
\implies{} & x^3 - m^2 x^2 + [a - 2m(y_1 - mx_1)]\, x + b - (y_1 - mx_1)^2 = 0.
\end{aligned} \tag{2.7}
\]
The solutions to the above cubic equation will be the x coordinates of the
points of intersection between the line and the elliptic curve. We already
know two such coordinates, namely $x_1$ and $x_2$. If we denote the x coordinate
of the third intersection point $R'$ by $x_3'$, another representation for the cubic
in equation (2.7) is
\[
(x - x_1)(x - x_2)(x - x_3') = 0.
\]
Equating the coefficients of $x^2$ in the two representations of the cubic, we get
\[
m^2 = x_1 + x_2 + x_3' \implies x_3' = m^2 - x_1 - x_2.
\]
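[Figure 2.3: Point addition on an elliptic curve over the real numbers, showing the points P, Q, the third intersection point R′ and its reflection R. Panels (a) and (b) show the addition of distinct points, with P and Q on a vertical line and P + Q = O in panel (b). (c) $x_1 = x_2$, $y_1 = y_2 \neq 0$]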
Substituting $x_3'$ into the line equation, we get the y coordinate of the third
intersection point $R'$ as
\[
y_3' = m x_3' + y_1 - m x_1.
\]
Since R is the reflection of $R'$ about the x-axis, its coordinates are $(x_3, y_3) =
(x_3', -y_3')$.
Now consider the case when P and Q are distinct points on the curve which
lie on a vertical line as shown in Figure 2.3(b). The coordinates of P and Q
are related as x1 = x2 and y1 = −y2 . In this case, we define P + Q to be equal
to the point at infinity O. It is convenient to think of O as a point which lies
at infinite height along the line joining P and Q. With this interpretation of
O, the line joining any P ∈ E and O will be a vertical line which intersects
the curve again at Q, the reflection of P about the x-axis. The reflection of Q
will be P itself. So by the intersection-reflection procedure for point addition
we described earlier, we get P + O = O + P = P . This almost makes O the
identity element of E ∪ {O} under point addition. We say almost because the
case of P = O is not handled as P was assumed to be a point in E. To make
O into the identity element, we will define O + O = O.
If O is the identity element of E ∪ {O}, then the inverse of a point P ∈ E
is its reflection Q about the x-axis since we defined P + Q = Q + P = O. We
will denote the inverse of P by −P . For P = (x, y), we have −P = (x, −y).
The inverse of O is O itself.
Finally, let us consider the case when P and Q are not distinct points,
i.e. P = Q. The point addition in this case is called point doubling as P + P
is denoted as 2P . We will apply the intersection-reflection procedure using
the tangent line at P as illustrated in Figure 2.3(c). The resulting point R will be
defined to be equal to 2P . For P = (x1 , y1 ), the slope of the tangent to the
curve at P is given by
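\[
m_1 = \frac{dy}{dx} = -\left.\frac{\partial f / \partial x}{\partial f / \partial y}\right|_{x = x_1,\, y = y_1} = \frac{3x_1^2 + a}{2y_1}
\]
\[
(x - x_1)^2 (x - x_2') = 0.
\]
\[
y_2' = m_1 x_2' + y_1 - m_1 x_1.
\]
The reflection R of $R'$ about the x-axis has coordinates $(x_2, y_2) = (x_2', -y_2')$.
We now summarize the point addition operation on E ∪ {O}.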
We now summarize the point addition operation on E ∪ {O}.
\[
x_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2, \qquad
y_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x_1 - x_3) - y_1. \tag{2.8}
\]
Note that the case of point doubling when y1 = 0 is subsumed by the second
case above as (x1 , y1 ) = (x1 , −y1 ) for y1 = 0.
So have we proved that E ∪ {O} is a group under the point addition operation?
The operation is closed by construction, O is the identity element and
every element in E ∪ {O} has an inverse. The only group property remaining
to be checked is the associativity of point addition. While + is indeed associative,
we will skip the proof as it is a tedious exercise to prove associativity
using the above rules.
[Figure 2.4: Points of two elliptic curves over $F_{11}$, plotted with x and y ranging from 0 to 10. (a) $y^2 = x^3 + 10x + 2$ (b) $y^2 = x^3 + 9x$]
Table 2.2: Point addition for the elliptic curve $y^2 = x^3 + 10x + 2$ over $F_{11}$
The point addition operation for the curve in Figure 2.4(a) is illustrated in
Table 2.2. For example, to find the result of adding (3, 2) to (3, 9), we find
the entry in the table which lies both in the row starting with (3, 2) and the
column starting with (3, 9). The entry in this case is O as −2 = 9 in F11 .
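The entries of such a point addition table can be recomputed with a short Python sketch. It assumes the curve $y^2 = x^3 + 10x + 2$ over $F_{11}$, the chord formula of equation (2.8) and the tangent slope $(3x_1^2 + a)/(2y_1)$ for point doubling. The names `add`, `points` and the use of `None` for the point at infinity O are illustrative choices, not notation from the text.

```python
P_MOD, A, B = 11, 10, 2  # curve y^2 = x^3 + 10x + 2 over F_11
O = None                 # stand-in for the point at infinity


def add(P, Q):
    """Point addition on E ∪ {O} for the curve above."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O  # Q = -P; this also covers doubling a point with y1 = 0
    if P == Q:
        m = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)  # tangent slope
    else:
        m = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)         # chord slope
    m %= P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    y3 = (m * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


# All affine points on the curve, preceded by the point at infinity.
points = [O] + [(x, y) for x in range(P_MOD) for y in range(P_MOD)
                if (y * y - x**3 - A * x - B) % P_MOD == 0]

print(points)
print(add((3, 2), (3, 9)))  # prints None, i.e. O, since -2 = 9 in F_11
```

The modular inverse is computed with `pow(value, -1, modulus)`, which requires Python 3.8 or later. Looping `add` over all pairs from `points` reproduces the whole table.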
2.5 ECDSA
Signing a message using a digital signature algorithm involves the creation of a
bit string which is easy to create if a private key is known and computationally
infeasible otherwise. The created bit string is called a digital signature. It is
attached to the message as proof that an entity with knowledge of a private
key has signed the message (see Figure 2.1). Unlike the usual handwritten
signatures, digital signatures are message dependent. If a digital signature did
not depend on the message being signed, it could be attached to a different
message which the signer did not sign and still serve as a valid signature.
But plain dependence on the messages being signed is not enough. A digital
signature algorithm needs to be resistant to forgery. Informally, unforgeability
requires that an adversary with access to signatures on a set of messages
created using a private key should not be able to create a valid signature on
a new message in a computationally feasible manner.
The unforgeability of the digital signatures created using the ECDSA is
due to the difficulty of solving the elliptic curve discrete logarithm problem
(ECDLP). The ECDLP refers to the problem of finding a private key k given
the public key kP for elliptic curves defined over finite fields. The original
discrete logarithm problem (DLP) was defined using a prime field $F_p$. It
involves finding the positive integer x given a, b ∈ $F_p$ such that $a^x = b \bmod p$.
Here x can be interpreted as the discrete logarithm of b to the base a. Since
$a^x$ is the result of combining x copies of a with the multiplication operation,
• Due to the random choice of j, two different runs of the signature generation
algorithm for the same message m will yield different signatures.
• Suppose an adversary who has access to the message m and the
corresponding signature (r, s) wants to recover the private key k. Since
m is known, the adversary can calculate the message digest e. If the
adversary can find j, then the private key k can be calculated as
$k = r^{-1}(js - e) \bmod n$. Finding j from r involves finding j given jP,
i.e. solving the ECDLP which is computationally infeasible.
Given a message m, a public key kP , and a digital signature (r, s), the
signature is verified as follows:
To see why the above verification procedure works, consider a valid signature
(r, s) on a message m with digest e. Then there exists an integer j ∈
{1, 2, . . . , n − 1} such that $jP = (x_1, y_1)$, $r = x_1 \bmod n$, and $s = j^{-1}(e + kr)$.
The integer j satisfies $j = s^{-1}(e + kr)$ which gives us
In the above equation, we could suppress the mod n expressions while substituting
for $j_1$ and $j_2$ because nP = O and O is the identity for point
addition. Since Q = jP, their x coordinates are equal modulo n. Since
j ∈ {1, 2, . . . , n − 1}, jP is never equal to O. This is the reason behind
rejecting the signature in step 5.
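The signing and verification relations discussed above can be exercised end to end with the following Python sketch. It is an illustration under several assumptions which are not taken from this section: the group order n and the coordinates of the base point P are copied from the SEC 2 parameter document cited earlier, the message digest e is taken to be a single SHA-256 hash read as an integer, and $j_1 = es^{-1} \bmod n$, $j_2 = rs^{-1} \bmod n$ follow the usual ECDSA verification equations. The function names are hypothetical, and the code is neither constant-time nor suitable for real keys.

```python
import hashlib
import secrets

p = 2**256 - 2**32 - 977  # field prime of equation (2.1)
# Group order n and base point P (assumed values from the SEC 2 document).
n = int("FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE"
        "BAAEDCE6AF48A03BBFD25E8CD0364141", 16)
BASE = (int("79BE667EF9DCBBAC55A06295CE870B07"
            "029BFCDB2DCE28D959F2815B16F81798", 16),
        int("483ADA7726A3C4655DA4FBFC0E1108A8"
            "FD17B448A68554199C47D08FFB10D4B8", 16))
O = None  # stand-in for the point at infinity


def add(P, Q):
    """Point addition on y^2 = x^3 + 7 over F_p."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P == Q:
        m = 3 * x1 * x1 * pow(2 * y1, -1, p) % p
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)


def mul(k, P):
    """Scalar multiple kP computed by double-and-add."""
    R = O
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R


def digest(m):
    """Message digest e (assumption: one round of SHA-256)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big")


def sign(m, k):
    e = digest(m)
    while True:
        j = secrets.randbelow(n - 1) + 1  # random j in {1, ..., n-1}
        r = mul(j, BASE)[0] % n
        s = pow(j, -1, n) * (e + k * r) % n
        if r != 0 and s != 0:
            return r, s


def verify(m, public_key, signature):
    r, s = signature
    if not (1 <= r < n and 1 <= s < n):
        return False
    e, s_inv = digest(m), pow(s, -1, n)
    j1, j2 = e * s_inv % n, r * s_inv % n
    Q = add(mul(j1, BASE), mul(j2, public_key))
    return Q is not O and Q[0] % n == r  # reject if Q is the point at infinity


k = 12345               # toy private key, for illustration only
kP = mul(k, BASE)       # corresponding public key
sig = sign(b"hello", k)
print(verify(b"hello", kP, sig), verify(b"hullo", kP, sig))
```

Running `sign` twice on the same message gives different signatures because j is chosen at random, yet both pass `verify`.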
Chapter 3
Cryptographic Hash Functions
Hash functions are defined as functions which map bit strings of arbitrary
length to bit strings of fixed length. As the number of possible inputs is
larger than the number of possible outputs, a hash function is a many-to-one
function. A cryptographic hash function H is defined as a hash function which
has the following properties:
3.1 SHA-256
The SHA-256 hash function was announced in 2001 by the National Institute
of Standards and Technology (NIST), an agency which is part of the U.S. Department
of Commerce. It was specified as part of the Secure Hash Standard¹
detailed in the Federal Information Processing Standards Publication 180-2
(FIPS PUB 180-2). The SHA in SHA-256 is an abbreviation of “secure hash
algorithm” and the 256 indicates the output length in bits.
While the definition of a hash function allowed its input to be a bit string of
arbitrary length, the SHA-256 function specification restricts the input to be
at most $2^{64} - 1$ bits long. This restriction is imposed because the specification
requires the length of the input to be stored in a 64-bit unsigned integer. The
SHA-256 operation can be divided into preprocessing and hash computation.
For convenience, let us refer to the input bit string as the message and denote
it by M .
Preprocessing
The preprocessing step consists of message padding and state initialization.
Message padding involves appending some bits to the message resulting in a
¹ See http://dx.doi.org/10.6028/NIST.FIPS.180-4 for the latest version of this standard.
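1. If the message M is l bits long, then find the smallest non-negative solution
k to the equation
\[
k + l + 65 = 0 \bmod 512. \tag{3.1}
\]
2. Append a single 1 followed by k zeros to the message M.
3. Append the 64-bit representation of the message length l to the result of step 2.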
Since the message M is l bits long, the padded message after step 2 is l + k + 1
bits long. After step 3, the padded message is l + k + 65 bits long. The
requirement that the final padded message length be a multiple of 512 is
satisfied by choosing k according to equation (3.1). Note that padding is
done even if the original message length l was already a multiple of 512. We
will discuss the reasoning behind the choice of this particular padding scheme
after we discuss the hash computation step. As an example, consider the 6-
bit message M = 101010. For l = 6, the smallest non-negative solution to
equation (3.1) is k = 441. The 64-bit representation of 6 is 00 · · · 00110. The
512-bit padded message is given by
\[
\underbrace{101010}_{M}\ 1\ \underbrace{00000 \cdots 00000}_{441 \text{ zeros}}\ \underbrace{00 \cdots 00110}_{l}.
\]
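The padding rule can be written out as a short Python sketch. For readability it represents bit strings as Python strings of the characters 0 and 1, which is an illustrative choice and not how an implementation would store messages. The function name `sha256_pad` is hypothetical.

```python
def sha256_pad(bits: str) -> str:
    """Pad a bit string: a single 1, then k zeros, then the 64-bit length."""
    l = len(bits)
    k = (-(l + 65)) % 512  # smallest non-negative k with k + l + 65 = 0 mod 512
    return bits + "1" + "0" * k + format(l, "064b")


padded = sha256_pad("101010")
print(len(padded), padded.count("0", 7, 7 + 441))
```

A message whose length is already a multiple of 512 still gains a full extra block, since the 1 and the length field have to be placed somewhere.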
State initialization involves setting the value of a 256-bit initial hash value
$H^{(0)}$ to a fixed constant. Let $H_0^{(0)}, H_1^{(0)}, \ldots, H_7^{(0)}$ be eight 32-bit words which
constitute $H^{(0)}$. They are initialized as follows.
\[
\begin{aligned}
H_0^{(0)} &= \texttt{0x6a09e667}, & H_1^{(0)} &= \texttt{0xbb67ae85}, \\
H_2^{(0)} &= \texttt{0x3c6ef372}, & H_3^{(0)} &= \texttt{0xa54ff53a}, \\
H_4^{(0)} &= \texttt{0x510e527f}, & H_5^{(0)} &= \texttt{0x9b05688c}, \\
H_6^{(0)} &= \texttt{0x1f83d9ab}, & H_7^{(0)} &= \texttt{0x5be0cd19}.
\end{aligned} \tag{3.2}
\]
Hash Computation
Let the padded message consist of N 512-bit blocks $M^{(1)}, M^{(2)}, \ldots, M^{(N)}$.
The hash computation is an iterative process where the ith message block
$M^{(i)}$ is combined with the previous hash value $H^{(i-1)}$ using a function f to
generate the next hash value $H^{(i)}$.
• For $U = u_0 u_1 \cdots u_{30} u_{31}$ and 1 ≤ n ≤ 31, the shift right and rotate right
operations on U are defined as
\[
\mathrm{SHR}^n(U) = \underbrace{000 \cdots 000}_{n \text{ zeros}}\, u_0 u_1 \cdots u_{30-n} u_{31-n},
\]
\[
\mathrm{ROTR}^n(U) = u_{31-n+1} u_{31-n+2} \cdots u_{30} u_{31}\, u_0 u_1 \cdots u_{30-n} u_{31-n},
\]
respectively.
• Let
\[
\mathrm{Ch}(U, V, W) = (U \wedge V) \oplus (\neg U \wedge W),
\]
\[
\mathrm{Maj}(U, V, W) = (U \wedge V) \oplus (U \wedge W) \oplus (V \wedge W),
\]
where Ch and Maj are short for Choice and Majority. The Ch function
performs a bitwise choice between the bits of V and W depending on
whether the corresponding bit in U is 1 or 0. The Maj function finds
the bitwise majority among the bits of U, V, and W.
• Let
Each run of the above equation is called a round. In each of the 64 rounds,
the calculation of the variable $T_1$ involves the constant $K_j$ and the message
schedule block $W_j$.
4. The eight 32-bit words $H_0^{(i)}, H_1^{(i)}, \ldots, H_7^{(i)}$ which constitute the ith hash
value $H^{(i)}$ are calculated as
\[
(H_0^{(i)}, H_1^{(i)}, \ldots, H_7^{(i)}) = (A + H_0^{(i-1)}, B + H_1^{(i-1)}, \ldots, H + H_7^{(i-1)}).
\]
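The whole computation (padding, state initialization, message schedule, 64 rounds and the final addition in step 4) fits in a short Python sketch. Several details below are assumptions taken from FIPS 180-4 rather than from the text above: the rotation amounts in the Σ and σ functions, the message schedule recurrence, and the rule that the initial hash words and the round constants are the first 32 fractional bits of the square roots of the first 8 primes and of the cube roots of the first 64 primes. All function names are hypothetical. The result is compared against Python's `hashlib`.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF  # all additions are modulo 2^32


def first_primes(count):
    primes, c = [], 2
    while len(primes) < count:
        if all(c % q for q in primes):
            primes.append(c)
        c += 1
    return primes


def icbrt(x):
    """Largest integer r with r**3 <= x (binary search)."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
H_INIT = [isqrt(q << 64) & MASK for q in PRIMES[:8]]  # equation (3.2)
K = [icbrt(q << 96) & MASK for q in PRIMES]           # round constants K_j


def rotr(u, n):
    return ((u >> n) | (u << (32 - n))) & MASK


def ch(u, v, w):
    return (u & v) ^ (~u & w)


def maj(u, v, w):
    return (u & v) ^ (u & w) ^ (v & w)


def compress(h, block):
    """One application of f to a chaining value and a 64-byte block."""
    w = [int.from_bytes(block[4 * t:4 * t + 4], "big") for t in range(16)]
    for t in range(16, 64):  # message schedule W_16, ..., W_63
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((s1 + w[t - 7] + s0 + w[t - 16]) & MASK)
    a, b, c, d, e, f, g, hh = h
    for t in range(64):      # the 64 rounds
        big1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        big0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        t1 = (hh + big1 + ch(e, f, g) + K[t] + w[t]) & MASK
        t2 = (big0 + maj(a, b, c)) & MASK
        a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, hh), h)]


def sha256(msg: bytes) -> str:
    l = 8 * len(msg)
    k = (-(l + 65)) % 512
    # For whole-byte messages, the 1 and the first 7 zeros form the byte 0x80.
    padded = msg + b"\x80" + b"\x00" * ((k - 7) // 8) + l.to_bytes(8, "big")
    h = H_INIT
    for i in range(0, len(padded), 64):
        h = compress(h, padded[i:i + 64])
    return "".join(format(x, "08x") for x in h)


print(sha256(b"abc") == hashlib.sha256(b"abc").hexdigest())
```

Deriving the constants instead of listing them keeps the sketch short; a production implementation would normally hard-code the tables.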
Properties
While the SHA-256 hash computation looks complicated, it can be efficiently
implemented as it consists of simple operations on 32-bit words. However, this
[Figure 3.1: SHA-256 assigns the messages $M_1, M_2, M_3, \ldots$ to bins labelled by the message digests $y_1, y_2, y_3, \ldots$]
To solve for the variables $m_i$, we have to solve the equation $f(M^{(1)}, H^{(0)}) =
y$. We would have to express each bit in the 256-bit message digest y as a
function of the variables $m_i$ and find a simultaneous solution to the resulting
256 equations. While it is easy to calculate the compression function output
for a given input, it is difficult to express the output bits as a function of
the input. Even if we somehow managed to find the 256 equations, they would
involve complicated functions of the variables and solving them simultaneously
would still be difficult.
To see why finding second preimages or collisions for SHA-256 is hard, let
us interpret each of the $2^{256}$ message digests as corresponding to a bin. Each
input message is assigned to one of these bins in a deterministic manner (see
Figure 3.1). But one cannot predict which bin a particular message will go
Finding a second preimage is equivalent to fixing one of the bins and finding
a message which maps to that bin. Finding a collision is equivalent to finding
two distinct messages which are assigned to the same bin. But the number
of bins is $2^{256}$ which is very large. Finding a second message which maps to
a fixed bin or a pair of messages which map to the same bin is extremely
unlikely.
To get an idea of how unlikely such events are, let us assume that messages
are equally likely to be assigned to any of the $2^{256}$ bins. Then the probability
of finding a second preimage of a given message using n distinct messages is
approximately $\frac{n}{2^{256}}$. The number of messages n would have to be close to $2^{256}$
to make this probability non-negligible. The probability of finding a collision
using n distinct messages is approximately $\frac{n^2}{2^{257}}$. In this case, n would have to
be close to $2^{128}$ to make this probability non-negligible.
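The collision estimate can be tried out at a scale where the search is feasible. The sketch below is an illustration only: it keeps just the first 24 bits of each SHA-256 digest, so that there are $2^{24}$ bins and a collision is expected after roughly $2^{12}$ messages. The truncation, the choice of the messages "0", "1", "2", . . . and the function names are all assumptions made for this example.

```python
import hashlib


def truncated_digest(msg: bytes, bits: int) -> int:
    """First `bits` bits of the SHA-256 digest of msg, as an integer."""
    full = int.from_bytes(hashlib.sha256(msg).digest(), "big")
    return full >> (256 - bits)


def find_collision(bits: int):
    """Hash the messages b"0", b"1", b"2", ... until two share a bin."""
    seen = {}
    i = 0
    while True:
        msg = str(i).encode()
        d = truncated_digest(msg, bits)
        if d in seen:
            return seen[d], msg, i + 1
        seen[d] = msg
        i += 1


m1, m2, tried = find_collision(24)
print(m1, m2, tried, 2**12)
```

Each extra bit kept in the digest multiplies the expected work by about the square root of 2, which is why the full 256-bit output puts collisions out of reach.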
Let us now discuss the motivation behind the specific message padding
scheme used in SHA-256. The main requirement for a padding scheme used
in a hash function is that distinct messages should not result in the same
padded message. Let pad(M) be the result of padding a message M. If
M ≠ M′ but pad(M) = pad(M′), then M and M′ will have the same message
digest resulting in a collision. For example, consider the simple padding
scheme which consists of only appending zeros to a message M until the length
becomes a multiple of 512. Then the messages 111 and 1110 will result in the
same padded message which consists of 3 ones followed by 509 zeros.
The padding scheme used in SHA-256 appends a 1 followed by zeros and
the length of the message as a 64-bit field. Suppose that the length field was
not appended and the number of zeros appended after the 1 was chosen to
make the padded message length a multiple of 512. This padding scheme
satisfies pad(M) ≠ pad(M′) for all M ≠ M′. To see this, first consider the
case when messages M and M′ have the same length. Then the same bit
string S = 100 · · · 00 will be appended to both of them during the padding
operation resulting in pad(M) = M‖S and pad(M′) = M′‖S where ‖ denotes
the concatenation operation between bit strings. Clearly, pad(M) ≠ pad(M′)
if M ≠ M′. Now consider the case when M and M′ have different lengths.
Let pad(M) = M‖S and pad(M′) = M′‖S′ where both S and S′ have the
form 100 · · · 00. If S and S′ have a different number of zeros after the 1, then
pad(M) ≠ pad(M′) since they end in a different number of zeros. If S = S′,
then pad(M) and pad(M′) have different lengths and are again distinct.
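The difference between the two padding schemes discussed above can be observed directly. In the following Python sketch bit strings are again modelled as strings of the characters 0 and 1, and the function names `pad_zeros_only` and `pad_one_then_zeros` are hypothetical. The length field is deliberately left out of both schemes, as in the argument above.

```python
from itertools import product


def pad_zeros_only(bits: str) -> str:
    """Append zeros until the length is a multiple of 512."""
    return bits + "0" * (-len(bits) % 512)


def pad_one_then_zeros(bits: str) -> str:
    """Append a 1, then zeros until the length is a multiple of 512."""
    return bits + "1" + "0" * (-(len(bits) + 1) % 512)


# The two messages from the text collide under the zeros-only scheme.
print(pad_zeros_only("111") == pad_zeros_only("1110"))
print(pad_one_then_zeros("111") == pad_one_then_zeros("1110"))

# Exhaustive check over all messages of at most 8 bits.
messages = ["".join(b) for n in range(9) for b in product("01", repeat=n)]
print(len({pad_one_then_zeros(m) for m in messages}) == len(messages))
```

The final 1 acts as an end-of-message marker: stripping the trailing zeros and that 1 from a padded string recovers the original message.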
If the length field is not required to guarantee that distinct messages result
in distinct padded messages, then why is it appended to the message?
Appending the length field prevents some types of attacks for finding collisions
or second preimages. We will not discuss these attacks here and refer
the reader to Chapter 9 of the Handbook of Applied Cryptography for more
details².
Note that if the length field is finally appended to the message, we could
technically avoid appending the single 1 and only append zeros. This padding
scheme would also result in distinct padded messages for distinct messages.
The number of 512-bit blocks generated by appending only zeros is sometimes
smaller than the number of 512-bit blocks generated by appending a 1 followed
by zeros. For example, suppose the message M is 448 bits long. The
padding scheme which appends only zeros will append the 64-bit length field
L resulting in the single 512-bit block M‖L. The padding scheme which appends
a 1 followed by zeros cannot accommodate the length field in the 63
bits which remain in the first block after the 1 is appended to M. So it adds
511 zeros after the 1 increasing the intermediate padded message length to
960 bits. Appending the 64-bit length field L gives a 1024-bit final padded
message M‖1‖000 · · · 000‖L which splits into two 512-bit blocks. In spite of
this disadvantage, the padding scheme which appends a 1 followed by zeros
and a length field has been used in SHA-256. This was probably a result of a
conservative design approach taken by the SHA-256 designers.
3.2 RIPEMD-160
The RIPEMD-160 hash function was announced in 1996 by researchers from
the German Information Security Agency and the Katholieke Universiteit Leuven,
Belgium. It is an enhanced version of an earlier hash function called
RIPEMD which was developed in 1992 as part of the European Union project
RACE Integrity Primitives Evaluation (RIPE). The MD in RIPEMD stands
for “message digest”. The number 160 indicates the output length in bits.
As in SHA-256, the input to RIPEMD-160 can be at most $2^{64} - 1$ bits
long. The padding scheme is similar to the one used in SHA-256 with a single
1 being appended to the message followed by some zeros and a 64-bit field
which contains the message length. The number of zeros appended is chosen
to make the padded message length a multiple of 512. The difference is that
the 64-bit length field appears in little-endian format, i.e. the least significant
32-bit word appears first. For example, a message length of 6 bits would be
appended as 0x0000 0000 0000 0006 in the SHA-256 padding scheme and
as 0x0000 0006 0000 0000 in the RIPEMD-160 padding scheme.
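The word ordering in this example can be reproduced with a few lines of Python. The sketch works at the level of 32-bit words only, mirroring the hexadecimal strings in the text. It does not model the order of bytes inside each word, and the function name `length_field_words` is hypothetical.

```python
def length_field_words(l: int, low_word_first: bool) -> str:
    """Render a 64-bit length as two 32-bit words in the requested order."""
    high, low = (l >> 32) & 0xFFFFFFFF, l & 0xFFFFFFFF
    words = (low, high) if low_word_first else (high, low)
    return " ".join(format(word, "08x") for word in words)


print(length_field_words(6, low_word_first=False))  # SHA-256 ordering
print(length_field_words(6, low_word_first=True))   # RIPEMD-160 ordering
```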
² Available for free download at http://cacr.uwaterloo.ca/hac/
The hash computation is an iterative process like in SHA-256 but the chaining
variable used is 160 bits long and the compression function is different. Let
the padded message consist of N 512-bit blocks $M^{(1)}, M^{(2)}, \ldots, M^{(N)}$. The
state of the computation is initialized by setting the 160-bit initial hash value
$H^{(0)}$ to a fixed constant. The ith message block $M^{(i)}$ is combined with the
previous hash value or chaining variable $H^{(i-1)}$ using a compression function
g to generate the next hash value $H^{(i)}$.
Here the function g accepts 672 bits as input and returns 160 bits as output.
The final hash value $H^{(N)}$ corresponds to the 160-bit output of the RIPEMD-160
hash function.
All that remains to completely specify the RIPEMD-160 hash function is to
define the compression function g as we had defined the compression function
f used in SHA-256. We will not define g here and refer the reader to Algorithm
9.55 of the Handbook of Applied Cryptography for an exact description. The
description of the SHA-256 compression function f in the previous section
served as a concrete example of a complicated input-output relationship which
is easy to compute in the forward direction but resistant to finding preimages,
second preimages, and collisions. It would be redundant to give yet another
example of the same phenomenon by describing g.
1. Encode each leading zero byte (if any) as a 1. For example, if the first
m leading bytes $b_n, b_{n-1}, \ldots, b_{n-m+1}$ are all zero bytes then they will be
encoded as m ones.
2. Let $b_{n-m}$ be the first byte which is not a zero. Let N be the integer whose
big-endian representation is given by $b_{n-m} b_{n-m-1} \cdots b_0$. Then representing
each byte $b_i$ as an integer from 0 to 255, we have
$$N = \sum_{i=0}^{n-m} b_i \, 256^i$$
For example, consider the byte string 0x00001234 given in hexadecimal for-
mat. In base 256, it is given by 0 0 18 52. It has two leading zero bytes and
the remaining portion corresponds to the integer N = 4660 = 18 × 256 + 52.
In base 58, N is given by 1 22 20 as $4660 = 58^2 + 22 \times 58 + 20$. The Base58
encoding of the byte string 0x00001234 is then given by 112PM where the two
leading ones encode the two leading zero bytes, the 2 encodes the integer 1, P
encodes 22 and M encodes 20.
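The worked example above can be reproduced with a short Python sketch. It assumes the standard Bitcoin Base58 alphabet, in which the character at position d encodes the base 58 digit d; the function name `base58_encode` is chosen here for illustration only.

```python
# Standard Bitcoin Base58 alphabet (no 0, O, I or l); position d encodes digit d.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Base58-encode a byte string, following the two steps described in the text."""
    # Step 1: every leading zero byte is encoded as the character '1'.
    stripped = data.lstrip(b"\x00")
    leading_zeros = len(data) - len(stripped)

    # Step 2: interpret the remaining bytes as a big-endian integer N
    # and write N in base 58, most significant digit first.
    n = int.from_bytes(stripped, "big")
    digits = []
    while n > 0:
        n, remainder = divmod(n, 58)
        digits.append(BASE58_ALPHABET[remainder])

    return "1" * leading_zeros + "".join(reversed(digits))


print(base58_encode(bytes.fromhex("00001234")))  # 112PM
```

Python integers have arbitrary precision, so the same function works unchanged for the 25-byte strings that arise in address generation.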
Figure 3.2: P2PKH address generation. Public key (uncompressed format) →
SHA-256 → RIPEMD-160 → R → prefix address version byte → B‖R → double
SHA-256 → extract first four bytes C4 → B‖R‖C4 → Base58 encoding → P2PKH
address.
Recall that an ECDSA public key in Bitcoin is a point kP on the secp256k1
elliptic curve where k is a 256-bit integer representing the private key and P
is the base point. In uncompressed format, the public key can be represented
using 65 bytes consisting of the single byte 0x04 followed by the 32-byte x
and y coordinates of kP (see Section 2.4). Let X and Y denote the 32-byte
big-endian representations of the x and y coordinates of kP respectively. Let ‖
denote the bit string concatenation operator. The P2PKH address generation
procedure is illustrated in Figure 3.2 and proceeds as follows:
1. Calculate the SHA-256 hash of the uncompressed public key:
S = SHA-256(0x04‖X‖Y).
S is 32 bytes long.
2. Calculate the RIPEMD-160 hash of S:
R = RIPEMD-160(S).
R is 20 bytes long.
3. Prefix R with a single byte B which contains the address version number to
get B‖R which is 21 bytes long. The address version number for P2PKH
addresses is 0x00 on the main Bitcoin P2P network and 0x6f on testnet
which is a network used for testing Bitcoin features.
4. Calculate the result of applying the SHA-256 hash function twice on B‖R:
C = SHA-256(SHA-256(B‖R)).
C is 32 bytes long.
5. Let C4 denote the first four bytes of C. Append C4 to B‖R to get a 25-byte
string B‖R‖C4.
6. Encode B‖R‖C4 using Base58 encoding to get the P2PKH address A:
A = Base58(B‖R‖C4).
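Steps 3 to 6 of the procedure can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: it takes the 20-byte hash R as its input, because RIPEMD-160 is not guaranteed to be available in every build of Python's `hashlib`. The function names and the decoder used to re-check the four-byte field C4 are assumptions made for this sketch.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58_encode(data: bytes) -> str:
    stripped = data.lstrip(b"\x00")
    n = int.from_bytes(stripped, "big")
    digits = []
    while n > 0:
        n, remainder = divmod(n, 58)
        digits.append(BASE58_ALPHABET[remainder])
    return "1" * (len(data) - len(stripped)) + "".join(reversed(digits))


def base58_decode(text: str) -> bytes:
    leading_ones = len(text) - len(text.lstrip("1"))
    n = 0
    for ch in text:  # leading '1' characters are digit 0 and contribute nothing
        n = n * 58 + BASE58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_ones + body


def p2pkh_address(r: bytes, version: bytes = b"\x00") -> str:
    """Steps 3-6: prefix the version byte, append C4, Base58-encode."""
    if len(r) != 20 or len(version) != 1:
        raise ValueError("expected a 20-byte hash R and a 1-byte version B")
    b_r = version + r                    # B || R, 21 bytes
    c4 = double_sha256(b_r)[:4]          # first four bytes of C
    return base58_encode(b_r + c4)       # Base58(B || R || C4)


def c4_matches(address: str) -> bool:
    """Decode an address and recompute C4 from B || R."""
    raw = base58_decode(address)
    if len(raw) != 25:
        return False
    return double_sha256(raw[:21])[:4] == raw[21:]


address = p2pkh_address(bytes(20))  # R consisting of twenty zero bytes
print(address, c4_matches(address))
```

Passing `version=b"\x6f"` gives the corresponding testnet form of the same hash.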
When the address version byte B is 0x00, the P2PKH address begins with a
1 as the Base58 encoding represents leading zero bytes with ones. For example,
the P2PKH address corresponding to the public key equal to the base point P
given in equation (2.4) is given by 1EHNa6Q4Jz2uvNExL497mE43ikXhwF6kZm.
When B is 0x00, the P2PKH address consists of at most 34 Base58 characters
including the leading 1. This is because the largest base 256 integer which can
fit in the 24-byte R‖C4 field is $256^{24} - 1$. This number is smaller than $58^{33} - 1$,
the largest base 58 number which has 33 digits. The P2PKH address can have
fewer than 34 Base58 characters because the integer in R‖C4 can sometimes be
represented using 32 digits in base 58. While typing a Base58-encoded P2PKH
address is cumbersome, it is more convenient than typing the 25-byte B‖R‖C4
as 50 hexadecimal digits or 200 bits.
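The length bound can be checked with exact integer arithmetic. The helper name `base58_digit_count` below is hypothetical and used only for this check.

```python
def base58_digit_count(n: int) -> int:
    """Number of base 58 digits needed to write the non-negative integer n."""
    count = 1
    while n >= 58:
        n //= 58
        count += 1
    return count


largest_field_value = 256 ** 24 - 1          # largest integer in the 24-byte field
digits = base58_digit_count(largest_field_value)
print(digits, 1 + digits)                    # 33 34 (the extra 1 is the leading '1')
```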
The C4 field serves the role of a checksum which can be used to detect
errors introduced while typing a P2PKH address. Before an address A is
used, Base58 decoding is performed to get B‖R‖C4. The double SHA-256
hash of B‖R is calculated and its first four bytes are checked to be equal
to C4. This is called checksum validation. The field C4, being the partial
output of the double SHA-256 function, can be assumed to behave like a
random 32-bit value. If someone were to make an error in typing or writing
down the P2PKH address A, then the checksum validation will succeed only
with a probability $1/2^{32}$. But why invoke the SHA-256 function twice when
one invocation would
also yield a checksum with the same property? The double SHA-256 function
Figure 3.3: Steps involved in recovering a private key from a P2PK address
and a P2PKH address. From a P2PK address: solve ECDLP to obtain the private
key. From a P2PKH address: find a RIPEMD-160 preimage, then find a SHA-256
preimage, then solve ECDLP to obtain the private key.
The Blockchain
CHAPTER 4. THE BLOCKCHAIN 39
The motivation for and design of SegWit will be discussed in the next chapter
as it requires us to first understand some shortcomings in the pre-SegWit
Bitcoin system.
In this chapter, we will describe the blockchain and explain the motivation
behind different aspects of its design.
nVersion 4 bytes
hashPrevBlock 32 bytes
hashMerkleRoot 32 bytes
nTime 4 bytes
nBits 4 bytes
nNonce 4 bytes
Figure 4.3: The hashPrevBlock field contains the double SHA-256 hash of
the previous block header
The hashMerkleRoot field stores the root hash of a Merkle tree formed
using the transactions in the block. The transactions are arranged as a list and
the double SHA-256 hash of each of them is computed. Using these hashes as
leaves, a binary tree is created where each node is associated with a double
SHA-256 hash of the concatenation of its child hashes. This is illustrated in
Figure 4.4 for the case when four transactions are used to construct a Merkle
tree. In the figure, $t_0, t_1, t_2, t_3$ represent transactions, ‖ denotes the
concatenation operator, and H(·) is used to denote the double SHA-256 function,
i.e. H(x) = SHA-256(SHA-256(x)). The hash value h associated with the
root of the tree is called the root hash or Merkle root of the tree. When the
number of transactions is not a power of two, some nodes in the Merkle tree
will have only one child. In that case, the hash of the single child is
concatenated with itself and then hashed to derive the hash value associated
with the parent node. Figure 4.5 illustrates the calculation of the Merkle root
corresponding to three transactions. In this case, the node corresponding to
$h_1$ has only one child $h_{10}$. The value $h_{10}$ is concatenated with itself
and hashed to get $h_1$.
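The construction, including the rule for a node with a single child, can be sketched in Python. Transactions are treated here as opaque byte strings; the actual transaction serialization and Bitcoin's byte-order conventions are not modelled, and the function names are illustrative.

```python
import hashlib
from typing import List


def H(x: bytes) -> bytes:
    """Double SHA-256, i.e. SHA-256(SHA-256(x))."""
    return hashlib.sha256(hashlib.sha256(x).digest()).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Root hash of the Merkle tree built over the given transactions."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [H(t) for t in transactions]       # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            # A node with a single child: concatenate the child hash with itself.
            level.append(level[-1])
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Three transactions: h1 is computed from h10 concatenated with itself.
t0, t1, t2 = b"t0", b"t1", b"t2"
h0 = H(H(t0) + H(t1))
h1 = H(H(t2) + H(t2))
print(merkle_root([t0, t1, t2]) == H(h0 + h1))  # True
```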
The root hash stored in hashMerkleRoot is a compact representation of all
the transactions in a block. Any change to a transaction in a block will result
in a change in the hashMerkleRoot field as the SHA-256 function is collision
resistant. A change in the hashMerkleRoot will in turn result in a change in
the block hash of the block. The block hash of a block also depends on the
hashMerkleRoot of the previous block through the hashPrevBlock. As this
dependence is recursive, the block hash of a particular block depends on all
the transactions in all the previous blocks all the way up to the genesis block.
Changing any past transaction will involve a recalculation of all the block
hashes of blocks which are subsequent to and including the block containing
that transaction. This property will turn out to be important for guaranteeing
tamper resistance of the transaction data.
But why use a Merkle tree of the transactions? Why not hash the
concatenation of all the transactions and put the resulting hash in the block
header? To guarantee tamper resistance of the transaction data, including the
hash $H(t_0 \| t_1 \| \cdots \| t_{n-1})$ of the transactions $t_0, t_1, \ldots, t_{n-1}$
in the block header instead of the Merkle root would have been sufficient. The
reason for using the Merkle root is that it enables efficient membership proofs
of transactions within a block. For example, suppose we want to prove that the
transaction $t_1$ was involved in the calculation of the root hash h in Figure 4.4.
We only need to provide the Merkle branch consisting of the hashes $h_{00}$ and
$h_1$ as shown in Figure 4.6. The hashes $h_{01}$, $h_0$, and h can be calculated
from $t_1$ and the Merkle branch. If the root hash h appears in the
hashMerkleRoot field of a block, then by the second preimage resistance of the
SHA-256 hash function we can be certain that this block contains the
transaction $t_1$. In general, the Merkle branch required to prove the existence
of a transaction in a block containing n transactions has size $O(\log_2 n)$. On
the other hand, if $H(t_0 \| t_1 \| \cdots \| t_{n-1})$ had been used instead of the
Merkle root then the membership proof of a transaction would require us to
specify all the transactions in the block, whose size is O(n). There are nodes
in the Bitcoin network called simplified payment verification (SPV) nodes which
store only the block headers and not the whole blocks like full nodes. When
they require information about transactions in the blockchain which contain
their Bitcoin addresses, they contact full nodes which respond with Merkle
branches to prove the existence of the relevant transactions.
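A membership proof of this kind can be sketched as follows. The representation of a branch as a list of (sibling hash, sibling-is-left) pairs is an assumption made for this illustration and is not the format used on the Bitcoin network; transactions are again opaque byte strings.

```python
import hashlib
from typing import List, Tuple


def H(x: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(x).digest()).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    level = [H(t) for t in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_branch(transactions: List[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Sibling hashes on the path from leaf `index` to the root."""
    level = [H(t) for t in transactions]
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1                      # neighbour within the same pair
        branch.append((level[sibling], sibling < index))
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return branch


def verify_branch(tx: bytes, branch: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a transaction and its Merkle branch."""
    h = H(tx)
    for sibling, sibling_is_left in branch:
        h = H(sibling + h) if sibling_is_left else H(h + sibling)
    return h == root


txs = [b"t0", b"t1", b"t2", b"t3"]
branch = merkle_branch(txs, 1)                   # proof for t1: [h00, h1]
print(len(branch), verify_branch(b"t1", branch, merkle_root(txs)))  # 2 True
```

The verifier only needs the transaction, the branch, and the root hash from the block header, which is what makes the proof usable by a node that stores headers only.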
The last three fields in the block header nTime, nBits, and nNonce are
related to mining. They are explained in the next section.
4.3 Mining
Mining is the process by which new blocks are added to the blockchain. Each
block consists of a block header followed by a list of transactions. The list
begins with a special transaction called the coinbase transaction which encodes
the transfer of the block reward (block subsidy plus the transaction fees from
the other transactions) to the miner which added the block to the blockchain.
Each coinbase transaction involves the creation of new bitcoins. The amount of
bitcoins created is equal to the block subsidy which is currently 12.5 bitcoins.
The other transactions in the list are called regular transactions. They encode
the transfer of bitcoins which were created in some previous block. A block
must contain exactly one coinbase transaction but it may contain zero or more
regular transactions. The maximum number of regular transactions in a block
is limited by the block size which was 1 MB until August 2017 and 4 MB after
that.
Nodes which want to record new regular transactions in the blockchain
broadcast them on the Bitcoin network. When other nodes hear these new
transactions, they add them to a transaction memory pool (mempool) which
is stored in local memory (RAM). A miner node forms a candidate block by
collecting some transactions from its mempool. The miner includes a coinbase
transaction in the candidate block which transfers the block reward to its own
Bitcoin address. There will be several miner nodes competing to add the next
block in the blockchain and claim the resulting block reward. The candidate
blocks created by these different miner nodes will differ in the coinbase
transactions as each node will insert its own Bitcoin address as the recipient of
the block reward. The candidate blocks may also differ in the regular
transactions included in them as different miner nodes may have different sets of
transactions in their respective mempools. This may be due to the miner
nodes receiving transactions broadcast on the Bitcoin network at different
times due to network latencies.
The height of a block in the blockchain is the number of blocks preceding
it. The genesis block has height 0, the immediate successor of the genesis
block has height 1 and so on. Suppose a miner node is attempting to add
a candidate block at height N . The newest block in the node’s copy of the
blockchain has height N − 1. The hashPrevBlock field of the candidate block
header is populated with the block hash of the block at height N − 1. The
hashMerkleRoot field is populated with the Merkle root of the transactions
in the candidate block. The nVersion field contains the current block version
number.
The nTime field is populated with a timestamp in Unix time format to
record the time of candidate block creation. The Unix time is the number of
seconds which have elapsed since 12:00 AM Coordinated Universal Time on
January 1st, 1970 with deductions to account for leap seconds.1 Each node in
the network has a local clock which is not necessarily synchronized with the
local clocks of the other nodes. So there is no globally unique notion of time
in the network. The Bitcoin system does not specify an explicit algorithm
for calculating the nTime field in a candidate block. However, it imposes two
constraints to ensure that the timestamp in the nTime field is approximately
correct:
A miner node is free to set the nTime field to any value which satisfies these
constraints. The first constraint specifies a lower bound on nTime which can
be calculated from the current blocks in the blockchain. The upper bound
specified by the second constraint cannot be explicitly calculated by the miner
as it does not know the network-adjusted times of the other nodes in the
network. But it can hope to satisfy the upper bound by using nTime values
which are equal or close to its own network-adjusted time.
1
See https://en.wikipedia.org/wiki/Unix_time
Table 4.1: Examples of nBits field values and corresponding target thresholds
The nBits field in the block header encodes a 256-bit unsigned integer
called the target threshold using a base 256 version of the scientific notation.
Let $b_1 b_2 b_3 b_4$ be the four bytes in nBits. The first byte $b_1$ plays the role of
the exponent and the remaining three bytes encode the mantissa. The target
threshold T is derived as
$$T = b_2 b_3 b_4 \times 256^{b_1 - 3},$$
$226815 \approx 262144 = 2^{18}$. In this case, the average number of trials required
was approximately
$$\frac{2^{256}}{T + 1} = \frac{2^{256}}{226815 \times 256^{24-3} + 1} \approx \frac{2^{256}}{2^{18} \times 2^{168}} = 2^{70}.$$
Such a large number of trials required to find a valid block hash is the reason
why mining is a computationally difficult problem. A miner which successfully
finds a block hash for a candidate block below the target threshold is said to
have found or mined a valid block. Since the only way to mine a valid block
is to search through a large number of candidate block hashes, the valid block
is called a proof-of-work (PoW) solution. It proves that a certain amount of
work was performed on the average in order to find it.
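The decoding of nBits and the trial count worked out above can be reproduced in Python. The sketch assumes that nBits is handled as a 32-bit integer whose most significant byte is the exponent byte b1, and it only covers exponents of at least 3. The function names are illustrative.

```python
import math


def nbits_to_target(nbits: int) -> int:
    """Target threshold T = mantissa * 256**(exponent - 3)."""
    exponent = nbits >> 24          # first byte b1
    mantissa = nbits & 0xFFFFFF     # remaining three bytes b2 b3 b4
    if exponent < 3:
        raise ValueError("this sketch only handles exponents of at least 3")
    return mantissa * 256 ** (exponent - 3)


def log2_expected_trials(target: int) -> float:
    """log2 of 2**256 / (T + 1), the average number of hashes per valid block."""
    return 256 - math.log2(target + 1)


# Exponent 24 and mantissa 226815, as in the calculation above.
nbits = (24 << 24) | 226815
T = nbits_to_target(nbits)
print(round(log2_expected_trials(T), 1))  # 70.2, i.e. about 2**70 trials
```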
What happens after a miner finds a valid block at height N ? Such a miner
immediately broadcasts the block on the Bitcoin network. It also appends
the block to its local copy of the blockchain and begins mining for the next
block at height N + 1. When other nodes receive this broadcast block, their
reaction depends on the state of their local copy of the blockchain. For now,
assume that all the other nodes in the network have the same copy of the
blockchain consisting of blocks from the genesis block to a block at height
N − 1. We will relax this assumption later. When the new block at height N
arrives, miner nodes which are still mining their candidate blocks at height N
stop mining, append the new block to their local copy of the blockchain, and
start mining for the next block at height N + 1. If the receiving node is a full
node which does not perform mining, then it will just add the new block to
its local copy of the blockchain.
How is the target threshold value chosen? The rate at which a computing
device can calculate block hashes is measured in megahashes per second
(MH/s), gigahashes per second (GH/s), or terahashes per second (TH/s).
These units correspond to $10^6$, $10^9$, and $10^{12}$ hashes per second respectively.
A typical personal computer (PC) can calculate block hashes at a rate less
than 100 MH/s. To calculate $2^{70}$ block hashes, a PC operating at 100 MH/s
will require more than 300,000 years. Nowadays, mining is done using
application specific integrated circuits (ASICs) designed specifically to compute
several instances of the double SHA-256 function in parallel. One can purchase
mining rigs which combine several such ASIC chips to deliver hash rates of
the order of a few TH/s. A single mining rig operating at 1 TH/s will still
require more than 30 years to calculate $2^{70}$ hashes. The mining landscape is
dominated by companies which have consolidated thousands of such mining
rigs into datacenters in locations with low electricity and cooling costs. There
are also mining pools where geographically distributed nodes combine their
respective mining hash rates to reduce the time required to mine a valid block.
The total hash rate available across the whole Bitcoin network on January 1,
2017 was estimated2 to be 2,463,610 TH/s. Using this hash rate, $2^{70}$ block
2
Source: https://blockchain.info/charts/hash-rate
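The time estimates quoted in this discussion follow from dividing the number of hashes by the hash rate. A short Python check, using a year of 365.25 days as a simplifying assumption, is given below; the variable names are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: average year length

NUM_HASHES = 2 ** 70

MH = 10 ** 6    # hashes per second in 1 MH/s
TH = 10 ** 12   # hashes per second in 1 TH/s

pc_years = NUM_HASHES / (100 * MH) / SECONDS_PER_YEAR      # PC at 100 MH/s
rig_years = NUM_HASHES / (1 * TH) / SECONDS_PER_YEAR       # mining rig at 1 TH/s
network_seconds = NUM_HASHES / (2_463_610 * TH)            # whole network, Jan 1, 2017

print(f"PC: {pc_years:,.0f} years")
print(f"Rig: {rig_years:.1f} years")
print(f"Network: {network_seconds / 60:.1f} minutes")
```

At the network-wide hash rate the same amount of work takes a matter of minutes rather than years.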
Figure 4.7: Illustration of network state in the event of a blockchain fork with
two branches each having one block.
What if the miner runs out of nNonce values to try? Since the nNonce
field is only 4 bytes long, a miner can generate only $2^{32}$ trials by modifying
it. In case a miner cannot find a block hash below the threshold in these
trials, it can modify a field called the coinbase which is present in the coinbase
transaction. This field is of variable length and can have a maximum length
of 100 bytes. Except for the first four bytes which are reserved for a specific
purpose, the coinbase can hold arbitrary data. Once a miner has exhausted
the $2^{32}$ trials by modifying the nNonce, it can change some bits in the coinbase
which will in turn modify the hashMerkleRoot value in the block header. The
miner can now retry the $2^{32}$ nNonce values as they will result in new block
hash values.
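The interplay between the nNonce search and the coinbase field can be illustrated with a toy mining loop. Everything in this sketch is a simplification chosen for illustration: the nonce space is shrunk to $2^{10}$ values so that it is exhausted quickly, the target is very easy, the block contains only a stand-in coinbase "transaction", the header fields are packed in little-endian order, and the block hash is read as a little-endian integer. The counter `extra_nonce` is a hypothetical name for the bits changed in the coinbase.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_header(version, prev_hash, merkle_root, ntime, nbits, nonce) -> bytes:
    """80-byte header: nVersion, hashPrevBlock, hashMerkleRoot, nTime, nBits, nNonce."""
    return (struct.pack("<I", version) + prev_hash + merkle_root
            + struct.pack("<III", ntime, nbits, nonce))


def mine(prev_hash: bytes, target: int, nonce_space: int = 2 ** 10,
         max_extra_nonce: int = 10_000):
    """Search nonces; when they run out, change the coinbase and start over."""
    for extra_nonce in range(max_extra_nonce):
        # Changing the coinbase data changes the Merkle root of the block.
        coinbase = b"toy coinbase " + extra_nonce.to_bytes(4, "little")
        merkle_root = double_sha256(coinbase)   # block with a single transaction
        for nonce in range(nonce_space):
            header = build_header(1, prev_hash, merkle_root, 0, 0, nonce)
            block_hash = int.from_bytes(double_sha256(header), "little")
            if block_hash <= target:
                return extra_nonce, nonce, header
    raise RuntimeError("no valid block found within the search limits")


easy_target = 2 ** 244                          # about 1 in 4096 hashes succeeds
extra_nonce, nonce, header = mine(bytes(32), easy_target)
print(extra_nonce, nonce, len(header))
```

With these parameters the inner loop alone usually does not suffice, so the outer loop over the coinbase data is exercised in the same way as described above.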
What if two miners find valid blocks at around the same time and broadcast
the blocks? Once again let us assume that all the network nodes have the same
copy of the blockchain which ends in a block at height N − 1. Suppose two
valid blocks A and B both at height N are found by two different miners and
are broadcast before each miner receives the other miner’s block. This is
possible due to the delays inherent in propagating blocks over the network.
The other nodes either receive block A first or block B first. Each node
will accept the first block at height N that it receives and reject the second
block. So if a miner receives block A first it will append it to its local copy
of the blockchain and start mining for a block at height N + 1 with block A
as the previous block. If the miner later receives block B, it will reject it.
Eventually every node in the network would have received either block A or
block B and extended its local copy of blockchain with the received block.
This is illustrated in Figure 4.7 where MA and MB denote the miner nodes
which created blocks A and B respectively. Edges have been drawn between
nodes which are peers in the Bitcoin P2P network. Nodes labelled A have
received block A first and nodes labelled B received block B first. Such a
situation is called a blockchain fork since the state of blockchain as seen by
the network as a whole consists of two branches both originating from the
same parent block at height N − 1.
Figure 4.8(b): Block chain state in the event that both branches in a fork get
extended to equal height
Figure 4.8(c): Block chain state in the event one branch in a fork becomes
longer than the other
Another way to represent the blockchain fork is shown in Figure 4.8(a).
Nodes which first received block A extended their copy of the blockchain using
the upper branch containing block A. The nodes which first received block
B extended the blockchain using the lower branch. Both branches will have
some proportion of the miners in the Bitcoin network working to extend them.
It is possible that valid blocks are found once again around the same time on
both branches and broadcast on the network. This results in the situation
shown in Figure 4.8(b). Blocks A′ and B′ were found by miners trying to extend
the branches containing blocks A and B respectively. Due to the randomness
inherent in the mining process and the block propagation in the network, it is
unlikely that both branches in a blockchain fork get extended to equal height
indefinitely. Eventually, one branch will become longer than the other. This
situation is shown in Figure 4.8(c) where the branch starting from block A
has been extended to height N + 2 while the branch starting from block B
has been extended to height N + 1. The Bitcoin protocol requires the network
nodes to switch to the longest branch they become aware of. So when the
block A″ is received by the miner nodes which are working on extending the
branch starting at block B, they will switch to the branch starting at block
A and begin mining candidate blocks which have block A″ as their previous
block. They will request the intermediate blocks A and A′ from the peer who
communicated block A″ to them. Non-miner full nodes which have block B′
as the latest block in their copy of the blockchain also switch to the branch
starting from block A upon receiving block A″. The branch consisting of blocks
B and B′ will no longer be extended. Blocks belonging to such abandoned
branches are called stale blocks and they are eventually deleted. By having
all nodes switch to the longest branch, the protocol ensures that only a single
linear list of blocks survives after the resolution of blockchain forks. The
network is said to have achieved consensus about which linear list of blocks
constitutes the blockchain.
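The branch-switching behaviour can be modelled with a small Python sketch. It follows the longest-branch rule exactly as stated in the text and is only a toy model: blocks are identified by labels, no validation is performed, and a block which loses the race is remembered but not adopted, whereas the text describes it as rejected. The class and method names are hypothetical.

```python
class Node:
    """Toy model of a node's view of the blockchain during a fork."""

    def __init__(self, genesis_id):
        self.parent = {genesis_id: None}   # block label -> parent label
        self.height = {genesis_id: 0}
        self.tip = genesis_id              # last block of the branch being extended

    def receive(self, block_id, parent_id):
        if parent_id not in self.parent:
            raise KeyError("unknown parent; it would be requested from a peer")
        self.parent[block_id] = parent_id
        self.height[block_id] = self.height[parent_id] + 1
        # Switch only to a strictly longer branch; on a tie keep the first one seen.
        if self.height[block_id] > self.height[self.tip]:
            self.tip = block_id

    def active_chain(self):
        chain, block = [], self.tip
        while block is not None:
            chain.append(block)
            block = self.parent[block]
        return chain[::-1]

    def stale_blocks(self):
        active = set(self.active_chain())
        return sorted(b for b in self.parent if b not in active)


# A node which heard block B first, then sees the A branch grow longer.
node = Node("N-1")
for block, parent in [("B", "N-1"), ("A", "N-1"), ("B'", "B"),
                      ("A'", "A"), ("A''", "A'")]:
    node.receive(block, parent)

print(node.active_chain())   # ['N-1', 'A', "A'", "A''"]
print(node.stale_blocks())   # ['B', "B'"]
```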
What about the transactions in the stale blocks? A transaction is valid
only if it belongs to a block which survives after any blockchain forks have
been resolved. The coinbase transactions in stale blocks become invalid. A
regular transaction in a stale block could already be present in one of the
blocks which survived after fork resolution. If not, it is added back to the
mempool of transactions which nodes use to construct new candidate blocks.
Figure 4.9: Coinbase transaction with two outputs. Output 0: amount x1,
challenge script C1. Output 1: amount x2, challenge script C2.
input because the source of bitcoins is not a previous transaction output but
the block reward, i.e. the sum of the block subsidy and the transaction fees
from the transactions in the block. Each output in the coinbase transaction
specifies two items:
• The amount of bitcoins from the block reward which are associated with
this output.
• A script which specifies the conditions under which the bitcoins associ-
ated with this output can be spent.
The script in an output can be thought of as a challenge. An entity which
provides a satisfactory response can transfer the bitcoins associated with the
output. Figure 4.9 illustrates a coinbase transaction with two outputs. The
first output specifies an amount x1 and a challenge script C1 . A satisfactory
response to C1 is needed to spend the x1 bitcoins. Similarly, a satisfactory
response to the challenge script C2 is needed to spend the x2 bitcoins in the
second output.
To see an example of a challenge and a satisfactory response to it, consider
a miner who creates a block and wants the block reward to be paid to P2PKH
addresses it owns. Ownership of a P2PKH address is the same as knowing
the private key corresponding to the public key used to create the address.
In this case, the challenge script in an output of the coinbase transaction will
contain a P2PKH address. The challenge script will require anyone who wants
to spend the bitcoins to provide a response script which consists of two items:
• A public key which hashes to the given P2PKH address.
The sum of the amounts in all the outputs of the coinbase transaction
should not exceed the block reward. Suppose that the block reward in the
block containing the coinbase transaction of Figure 4.9 is R bitcoins. Then the
amounts x1 and x2 must satisfy x1 + x2 ≤ R. This ensures that the transaction
does not spend more than the amount of bitcoins which are available for
spending. If x1 + x2 < R, then the R − x1 − x2 bitcoins from the block reward
become unspendable. So coinbase transactions set the sum of the output
amounts to be equal to the block reward. In the past, errors in coinbase
transaction creation by miners have resulted in blocks where this sum is not
equal to the block reward.
To spend the bitcoins earned in a coinbase transaction, the miner would
have to create a regular transaction. Regular transactions have at least one
input and at least one output. Each input specifies three items:
• A response script which will satisfy the conditions required to spend the
bitcoins in the output.
Figure 4.10: Regular transaction with three inputs and two outputs. Each
input gives a transaction identifier, an output index, and a response script.
Figure 4.10 illustrates a regular transaction which has three inputs and
two outputs. The first two inputs refer to two outputs from a previous regular
transaction with transaction identifier equal to TXID1. The third input refers
to the output of a previous coinbase transaction with transaction identifier
equal to TXID2. The first output in the previous regular transaction specifies
a bitcoin amount x1 and a challenge script C1 . The first input of the regular
transaction unlocks the x1 bitcoins by providing a satisfactory response R1 to
C1 . Similarly, the second and third inputs of the regular transaction unlock x2
and x3 bitcoins by providing satisfactory responses R2 and R3 to challenges C2
and C3 respectively. A total of x1 + x2 + x3 bitcoins are available for transfer
to the outputs. The two outputs of the regular transaction are allocated y1
and y2 bitcoins respectively, where y1 + y2 ≤ x1 + x2 + x3 . Challenge scripts
C4 and C5 are included in the outputs specifying conditions under which these
outputs can be spent. A transaction fee of x1 + x2 + x3 − y1 − y2 can be claimed
by the miner which includes this transaction in a block on the blockchain.
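The amount constraints on regular and coinbase transactions can be expressed in a few lines of Python. Amounts are represented here as integers in satoshis, where one bitcoin equals $10^8$ satoshis; this unit and the function names are not taken from the text and are used only to avoid floating-point rounding in the sketch.

```python
SATOSHI_PER_BITCOIN = 100_000_000


def transaction_fee(input_amounts, output_amounts) -> int:
    """Fee = sum of unlocked input amounts minus sum of output amounts."""
    total_in, total_out = sum(input_amounts), sum(output_amounts)
    if total_out > total_in:
        raise ValueError("outputs exceed the amount available from the inputs")
    return total_in - total_out


def coinbase_outputs_valid(output_amounts, block_subsidy, total_fees) -> bool:
    """Coinbase outputs must not exceed the block reward (subsidy plus fees)."""
    return sum(output_amounts) <= block_subsidy + total_fees


# Three inputs x1, x2, x3 and two outputs y1, y2 (values are made up).
inputs = [5 * SATOSHI_PER_BITCOIN, 3 * SATOSHI_PER_BITCOIN, 2 * SATOSHI_PER_BITCOIN]
outputs = [6 * SATOSHI_PER_BITCOIN, 399_000_000]
fee = transaction_fee(inputs, outputs)
print(fee)  # 1000000 satoshis, i.e. 0.01 bitcoins

subsidy = 1_250_000_000  # 12.5 bitcoins
print(coinbase_outputs_valid([subsidy + fee], subsidy, fee))  # True
```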
How is the value of the transaction fee determined? Miners aim to max-
imize their block reward while constructing candidate blocks. As the block
subsidy is fixed, they seek to maximize the sum of the transaction fees from
the transactions included in the block. A high transaction fee is not the only
across nodes whose local copies differ. For example, consider the blockchain
fork illustrated in Figure 4.8(a) where miners MA and MB have mined blocks
A and B respectively. The UTXO set in nodes which consider the branch
containing block A to be the longest branch will contain the output of the
coinbase transaction which transfers the block reward to MA . This output
will not be present in nodes which consider the branch ending in block B to
be the longest branch. The UTXO set in these nodes will contain the output
of the coinbase transaction which transfers the block reward to MB . Hence
the ownership of the block reward in the block at height N is undetermined
until the fork is resolved. Once blockchain forks are resolved, the UTXO set
(and consequently the bitcoin ownership record) seen by all the nodes in the
network will be identical. But the temporary ambiguity about the UTXO
set during blockchain forks should be taken into account in situations where
bitcoins are used as a mode of payment.
(a) Block chain state when Bob hands over the goods to Alice (after t1
receives m confirmations)
(b) Block chain state after Alice succeeds in creating a longer branch
containing t2
2. Alice will broadcast t1 on the Bitcoin network for inclusion in the blockchain.
She will keep t2 a secret.
6. All the nodes in the Bitcoin network will eventually switch to the t2 branch
and the t1 branch will be abandoned. Usually, transactions which are in
stale blocks, i.e. blocks which are in abandoned branches, are added back
to the transaction pool if they have not already appeared in the surviving
branch. Miners use this transaction pool for constructing new candidate
blocks. However, miners which have switched to the t2 branch will not
add t1 to their transaction pools as it conflicts with t2 . The end result is
that Bob has already transferred the goods to Alice but the x bitcoins he
thought he received from Alice in t1 are back in Alice’s possession. Since
Alice can now spend these bitcoins again, this attack is called a double
spending attack.
The double spending attack as described above will always succeed if Alice
can influence 50% or more of the total network hash rate to work on the branch
containing t2 . With the majority of the network hash rate working to extend
it, the t2 branch will eventually overtake the t1 branch. Alice could herself be
a miner who controls a majority of the network hash rate or she could collude
with a group of miners who collectively control a majority of the network hash
rate. If less than 50% of the network hash rate is used to attempt a double
spending attack, it may or may not succeed. Suppose that a fraction q of the
network hash rate is used to mount the double spending attack where q < 1/2
and that the remaining p = 1 − q fraction of the network hash rate continues
to extend the branch containing t1 . If Bob waits for m confirmations before
transferring the goods to Alice, then the success probability of the double
spending attack is
$$P(m, q) = 1 - \sum_{k=0}^{m} \binom{m+k-1}{k} \left[ p^m q^k - p^{k-1} q^{m+1} \right]. \tag{4.1}$$
It is derived in Appendix B. Table 4.2 lists P (m, q) for some values of m and
q. For all values of q less than 0.5, P (m, q) decreases with increasing m. So
waiting for more confirmations reduces the risk of a successful double spending
attack. While the rate of decrease in P (m, q) is exponential for q = 0.1, it
slows down as q increases. For q = 0.49, a double spending attack has a 95%
chance of success if Bob waits for only one confirmation. Even if Bob waits
for ten confirmations, a double spending attack has a 90% chance of success.
For a fixed value of m, P (m, q) increases exponentially with q, eventually
becoming one when q exceeds 0.5.
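Equation (4.1) is straightforward to evaluate numerically. The Python sketch below implements the formula as written, for m ≥ 1, and returns 1 when q is at least 1/2 in line with the earlier statement that such an attack always succeeds. The function name is illustrative.

```python
from math import comb


def double_spend_success_probability(m: int, q: float) -> float:
    """P(m, q) from equation (4.1), for m >= 1 confirmations."""
    if m < 1:
        raise ValueError("this sketch assumes at least one confirmation")
    if not 0 <= q <= 1:
        raise ValueError("q must be a fraction between 0 and 1")
    if q >= 0.5:
        return 1.0
    p = 1 - q
    total = 0.0
    for k in range(m + 1):
        total += comb(m + k - 1, k) * (p ** m * q ** k - p ** (k - 1) * q ** (m + 1))
    return 1 - total


print(round(double_spend_success_probability(1, 0.49), 3))  # 0.951
for m in (1, 2, 6):
    print(m, double_spend_success_probability(m, 0.1))
```

The first printed value matches the 95% figure quoted above for q = 0.49 and a single confirmation.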
In spite of the guaranteed success of a double spending attack mounted
using a majority of the network hash rate, such attacks have not been observed
in practice. This is probably because successful double spending attacks would
undermine confidence in the Bitcoin currency as a mode of payment and reduce
its exchange rate in terms of fiat currency. The miners who control significant
fractions of the network hash rate have invested large amounts of capital in
establishing their mining infrastructure. These investments will continue to
generate a profit for the miners when the bitcoins earned as part of the block
reward can be sold at a high price. So these miners will avoid any behaviour,
like attempting double spending attacks, which may negatively affect the price.
Nevertheless, double spending attacks are still possible by an attacker who
does not have a long term stake in Bitcoin. For example, a hacker may be
able to temporarily take control of a miner node and divert its hash rate
toward mounting a double spending attack. So it is prudent for merchants
like Bob to wait for confirmations on a transaction before considering it valid.
How many confirmations m should Bob wait for? Bob does not know the
fraction of network hash rate q which will be used by Alice to mount a double
spending attack. If q ≥ 1/2, then the attack will be successful irrespective of the
value of m. By accepting bitcoins as a valid mode of payment, Bob is implicitly
assuming that Alice cannot gain control over a majority of the network hash
rate. If q < 1/2, then the success probability of the double spending attack
decreases as m increases. As he does not know the value of q, Bob cannot
choose the value of m to bring the success probability below a predetermined
level like 0.01. He can only hope to reduce the success probability by increasing
m. But m cannot be very large as each confirmation takes approximately 10
minutes to appear. Consequently, all customers, irrespective of whether they
are honest or malicious, will experience a delay of about 10m minutes before
Bob transfers the goods to them. Several merchants in the Bitcoin ecosystem
wait for six confirmations (m = 6), which corresponds to a delay of about an
hour before goods are transferred from a merchant to a customer. Smaller or
larger values of m are used by merchants depending on the value of the goods
being sold.
A zero confirmation transaction is one which has been broadcast on the
Bitcoin network but has not been included in a valid block on the blockchain.
When the value of goods involved is small and confirmation delays cannot
be tolerated, merchants may accept zero confirmation transactions as valid
payment. For example, suppose Bob runs a coffee shop where bitcoins can be
used to buy coffee. To pay for a cup of coffee, Alice broadcasts a transaction
on the Bitcoin network paying Bob the required amount of bitcoins. Bob
may choose to give Alice her coffee as soon as he hears the transaction on
the network and before it has been included in a valid block. Since this
transaction has zero confirmations, it can be cancelled more easily through a
double spending attack. But Bob may take this risk because it is unlikely that
Alice will undertake the effort involved in a double spending attack just for a
cup of coffee’s worth of bitcoins. Also, the delay incurred in waiting for even
one confirmation may be undesirable in case Alice is an honest customer.
value which makes B'_N a valid block. To replace B_N with B'_N in all the copies
of the blockchain stored across the Bitcoin network, Alice has to create a branch
containing B'_N which is longer than the branch containing B_N (as illustrated
in Figure 4.12(b)) and broadcast it. Once all the nodes in the Bitcoin network
switch to the branch containing B'_N, it will become the block at height N
spending attack scenario, the t1 branch has no lead in terms of blocks when
Alice begins mining the t2 branch.
Even though Alice is interested in only modifying the block at height N,
she has to construct new valid blocks at heights N + 1, N + 2 and so on.
This is because the block B'_N is not a drop-in replacement for B_N in the
blockchain. The block header of B_{N+1} contains the block hash of B_N in the
hashPrevBlock field. As the block headers of B_N and B'_N differ, their block
hashes also differ by the collision resistance of the SHA-256 hash function. So
B_{N+1} is not a valid successor block to B'_N. Alice has to mine a new valid
block B'_{N+1} which has the block hash of B'_N in its hashPrevBlock field. By
⁴ A block is said to have received m confirmations if the transactions in it have received
m confirmations as defined in Section 4.6.
[Figure 4.12(b): Blockchain state when the branch containing B'_N overtakes the branch
containing B_N. The original branch has blocks at heights N − 1 through N + m − 1. The
new branch has blocks B'_N, B'_{N+1}, ..., B'_{N+n} at heights N through N + n. Here n ≥ m.]
tinue mining blocks on the BN branch as it is the longest branch they know.
Let q be the fraction of the total network hash rate which Alice controls. If
q ≥ 1/2, then Alice will eventually succeed in constructing the longer branch
irrespective of the value of m. If q < 1/2, then the probability⁵ that Alice
succeeds is (q/(1 − q))^(m+1) where m is the number of confirmations B_N has received
when Alice begins mining the B'_N branch. As this probability decreases exponentially
with m, blocks become more difficult to tamper with as the number
of confirmations they have received increases. Table 4.3 lists Alice’s success
probability for some values of m and q. Unless Alice controls a fraction of the
network hash rate which is close to 0.5, it becomes nearly impossible for her
to tamper with blocks which have received 50 or more confirmations. Even
with q = 0.4, Alice only has a one in a billion chance of modifying a block
with 50 confirmations. Given that a block receives about six confirmations in
an hour, a block with 50 confirmations is about eight hours old.
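The one in a billion figure can be reproduced with a few lines of Python. This is only a sketch of the closed form (q/(1 − q))^(m+1) quoted above. The function name tamper_success is a hypothetical label introduced here.

```python
def tamper_success(m: int, q: float) -> float:
    """Probability that an attacker with hash rate fraction q replaces a
    block which already has m confirmations."""
    if not 0 <= q <= 1:
        raise ValueError("q must lie between 0 and 1")
    if q >= 0.5:
        return 1.0  # a majority attacker eventually succeeds
    return (q / (1.0 - q)) ** (m + 1)


if __name__ == "__main__":
    for q in (0.1, 0.25, 0.4):
        print(q, tamper_success(6, q), tamper_success(50, q))
```

With q = 0.4 and m = 50 the value is slightly above 10^-9.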
Even if Alice controls the majority of the network hash rate, she cannot
make modifications to existing blocks which require her to know the private
keys of other users. For example, suppose a block contains a transaction where
Bob transfers some bitcoins to Carol. Such a transaction will have an input
which unlocks a UTXO owned by Bob and an output which creates a UTXO
that can be unlocked only by Carol. Alice cannot modify this transaction to
make herself the recipient of the bitcoins instead of Carol. This is because
the response script Bob uses to unlock his UTXO requires a digital signature
which can only be generated using Bob’s private key. The output of the
transaction which specifies Carol as the recipient is part of the message
that is used to generate the signature. If Alice replaces this output with
an output which specifies her as the recipient, the message used to generate
the signature changes and Bob’s private key is needed to generate the new
signature. Furthermore, Alice cannot even change the amount of bitcoins
which Bob is transferring to Carol as this amount is also part of the message
that is used to generate the signature. A more detailed description of the
signature generation procedure is given in Section 5.6.
⁵ See Appendix B for the derivation.
such transactions in new blocks. She can also decide the minimum fee rate
for transactions by not including those transactions which pay less than this
minimum into new blocks. Such behaviour will cause the Bitcoin system to
become less attractive as a mode of payment. The attacker can even cause
the Bitcoin system to stop functioning as a payment system by mining only
empty blocks. These are blocks which contain only the coinbase transaction
and no regular transactions. Such blocks are considered valid by the Bitcoin
protocol. Without any new regular transactions appearing on the blockchain,
the Bitcoin currency would be worthless.
While the presence of a 51% attacker can lead to the collapse of the Bitcoin
system, such an event is unlikely due to the prohibitive costs involved in
generating a majority of the network hash rate. Even if the cost of acquiring
the required mining equipment can be ignored, the electricity and cooling costs
involved in keeping the equipment running will be high. The 51% attacker
cannot hope to recover these costs by selling bitcoins as the attack itself will
drive the Bitcoin price down to zero by undermining the effectiveness of the
Bitcoin system.
4.9 Summary
The blockchain is the main innovation in Bitcoin. By incentivizing addition of
blocks to the blockchain, the Bitcoin system ensures that multiple copies of the
blockchain are maintained across a geographically distributed network. The
network achieves consensus over the state of the blockchain by having each
node switch to the longest branch it hears. The computationally demanding
task of mining valid blocks not only ensures a predictable rate of new currency
creation but also makes the whole system resistant to control by a single entity.
With attackers not able to control significant fractions of the network hash
rate, double spending attacks become unlikely and transactions with a few
dozen confirmations can be considered irreversible.
Chapter 5
Bitcoin Transactions
[Figure: layout of the transaction list in a block]
  Number of Transactions n    VarInt (1 – 9 bytes)
  List of Transactions        Coinbase Transaction
                              Regular Transaction 1
                              Regular Transaction 2
                              ...
                              Regular Transaction n − 1
• If n ∈ {253, 254, . . . , 2^16 − 1}, encode n using three bytes. The first byte
contains the prefix 253 followed by the representation of n as a 16-bit
unsigned integer.
• If n ∈ {2^16, 2^16 + 1, . . . , 2^32 − 1}, encode n using five bytes. The first byte
contains the prefix 254 followed by the representation of n as a 32-bit
unsigned integer.
• If n ∈ {2^32, 2^32 + 1, . . . , 2^64 − 1}, encode n using nine bytes. The first
byte contains the prefix 255 followed by the representation of n as a
64-bit unsigned integer.
Decoding an integer stored in VarInt form is trivial as the first byte tells
us the length of the encoding. The VarInt encoding is used frequently in the
Bitcoin transaction format to specify the lengths of lists or variable length
fields.
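The VarInt rules can be sketched in Python as follows. The function names are illustrative. The sketch relies on two assumptions not spelled out in the cases listed above: values below 253 are stored in a single byte, and the multi-byte integers are stored in little-endian byte order.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer below 2^64 in VarInt form."""
    if not 0 <= n < 2**64:
        raise ValueError("n must satisfy 0 <= n < 2^64")
    if n < 253:
        return n.to_bytes(1, "little")          # single byte (assumption)
    if n < 2**16:
        return b"\xfd" + n.to_bytes(2, "little")  # prefix 253
    if n < 2**32:
        return b"\xfe" + n.to_bytes(4, "little")  # prefix 254
    return b"\xff" + n.to_bytes(8, "little")      # prefix 255


def decode_varint(data: bytes, offset: int = 0):
    """Return (value, next_offset); the first byte fixes the length."""
    first = data[offset]
    if first < 253:
        return first, offset + 1
    width = {253: 2, 254: 4, 255: 8}[first]
    start = offset + 1
    return int.from_bytes(data[start:start + width], "little"), start + width


if __name__ == "__main__":
    for n in (1, 252, 253, 65535, 65536, 2**32):
        print(n, encode_varint(n).hex())
```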
Transaction Version
The transaction begins with a 4-byte field called nVersion which is used
to store the version number of the transaction format. The version number
dictates the rules for interpreting the fields in a transaction. As of August
2017, the transaction version number can be either 1 or 2. The two versions
differ in the interpretation of the nSequence field in the transaction inputs
(to be discussed later).
[Figure 5.2: Regular transaction format]
Regular Transaction Format
  nVersion                               4 bytes
  Number of Inputs N                     1 – 9 bytes
  Input 0, Input 1, ..., Input N − 1
  Number of Outputs M                    1 – 9 bytes
  Output 0, Output 1, ..., Output M − 1
  nLockTime                              4 bytes
Input Format
  hash                                   32 bytes
  n                                      4 bytes
  scriptSigLen                           1 – 9 bytes
  scriptSig                              scriptSigLen bytes
  nSequence                              4 bytes
Output Format
  nValue                                 8 bytes
  scriptPubkeyLen                        1 – 9 bytes
  scriptPubkey                           scriptPubkeyLen bytes
Block Height  ···  h − 10  h − 9  h − 8  h − 7  h − 6  h − 5  h − 4  h − 3  h − 2  h − 1  h
nTime              8:13    8:23   8:33   8:43   8:53   9:03   9:13   9:23   9:33   9:43   9:53
Figure 5.3: Illustration of the one hour delay in the actual lock time due to
usage of median-time-past to determine expiry.
What if we want to have a lock time at a block height which is greater than 5 × 10^8
or at a Unix time which is less than 5 × 10^8? Given that 2,016 blocks are
mined approximately every two weeks, it would take more than 9,500 years
for the block height to exceed 5 × 10^8. Not being able to specify lock times
that far ahead in the future is effectively not a restriction. The Unix time of
5 × 10^8 corresponds to a time in the past (12:53 AM on November 5, 1985).
Thus a lock time that expires at a Unix time which is less than 5 × 10^8 will
never be required.
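The two numbers in this argument are easy to verify. The Python sketch below assumes the rule implied by the discussion above, namely that nLockTime values below 5 × 10^8 are read as block heights and all other values as Unix times. The names LOCKTIME_THRESHOLD and interpret_locktime are chosen here for illustration.

```python
from datetime import datetime, timezone

LOCKTIME_THRESHOLD = 500_000_000  # 5 x 10^8


def interpret_locktime(n_lock_time: int) -> str:
    """Describe how an nLockTime value is read under the assumed rule."""
    if n_lock_time < LOCKTIME_THRESHOLD:
        return f"block height {n_lock_time}"
    t = datetime.fromtimestamp(n_lock_time, tz=timezone.utc)
    return "Unix time " + t.strftime("%Y-%m-%d %H:%M:%S UTC")


# Years needed to reach the threshold height at 2,016 blocks per 14 days.
years_to_threshold = LOCKTIME_THRESHOLD / 2016 * 14 / 365.25

if __name__ == "__main__":
    print(interpret_locktime(419328))
    print(interpret_locktime(LOCKTIME_THRESHOLD))
    print(round(years_to_threshold))
```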
The policy of using the median-time-past to check nLockTime validity
was proposed in BIP 113. It was made mandatory from the block at height
419328 which was mined in July 2016. Prior to that, a transaction was allowed
to be part of a block if the nTime value in the block’s header exceeded the
nLockTime value. This was changed because miners could set the nTime in a
candidate block to a future value in order to include transactions whose lock
times had not yet expired and increase the transaction fees they received. As
the median of the nTime values is known to all the miners, manipulating it
by setting the nTime value to a future value in mined blocks does not give
any specific miner an advantage over the other miners. Another advantage of
comparing nLockTime to median-time-past values instead of nTime values is
that median-time-past values increase monotonically with block height while
nTime values are not required to. So if the lock time of a transaction expires
at a certain block height, it remains in the expired state irrespective of the
nTime values in the blocks at subsequent heights.
A consequence of using the median-time-past is that the lock time of a
transaction expires approximately one hour after the time specified in the
nLockTime field. For example, suppose the nLockTime field of a transaction t
specifies a Unix time corresponding to 9:00 AM on January 1, 2018. Figure 5.3
illustrates the state of the blockchain when the lock time of this transaction
expires. The latest block in the blockchain has height h and an nTime value
of 9:53 AM on January 1, 2018. The nTime values in the 11 blocks at heights
h − 10 through h have been chosen to be exactly 10 minutes apart for the
sake of illustration (the AM suffix and date have been omitted in the figure).
The median-time-past of the block at height h is 9:03 AM on January 1, 2018
corresponding to the nTime of the block at height h − 5. So the transaction
t can be included in the next block at height h + 1. Note that the lock
time of the transaction t expired after the block at height h was added to
the blockchain. Since the nTime field is populated before the mining of the
candidate block begins, the block at height h was probably broadcast on the
network approximately 10 minutes after 9:53 AM. So the effective time of
expiry of the lock time on the transaction t is around 10:00 AM rather than
9:00 AM on January 1, 2018.
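The example of Figure 5.3 can be replayed with a short Python sketch. For readability it measures time in minutes after midnight rather than in Unix time. The nTime of the block at height h − 11 (8:03) is an assumption obtained by extending the 10 minute spacing one block backwards, and the helper names are hypothetical.

```python
def median_time_past(n_times):
    """Median of the nTime values of the last 11 blocks in the list."""
    window = sorted(n_times[-11:])
    return window[len(window) // 2]


def lock_time_expired(n_lock_time, n_times):
    """True if a time-based lock has expired for inclusion in the next block."""
    return n_lock_time < median_time_past(n_times)


def minutes(hour, minute):
    return 60 * hour + minute


# nTime values at heights h-11, h-10, ..., h, spaced 10 minutes apart.
chain = [minutes(8, 3) + 10 * i for i in range(12)]
lock = minutes(9, 0)

if __name__ == "__main__":
    mtp = median_time_past(chain)
    print(divmod(mtp, 60))                       # median-time-past at height h
    print(lock_time_expired(lock, chain[:-1]))   # tip at height h-1
    print(lock_time_expired(lock, chain))        # tip at height h
```

With the tip at height h − 1 the median-time-past is 8:53 and the lock has not expired. It expires only once the block at height h, with median-time-past 9:03, has been added.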
For the nLockTime to be considered while evaluating a transaction for
inclusion in a block, at least one of the transaction inputs must have an
nSequence value which is less than 0xFFFFFFFF. The nSequence field occupies
4 bytes and can have a maximum value of 0xFFFFFFFF. If all the inputs have
nSequence value equal to 0xFFFFFFFF, then the nLockTime field is ignored
and the transaction can be included in any block.
If the lock time applies to the whole transaction, why is its validity con-
trolled by the nSequence fields in all the transaction inputs? Why not use a
single field to indicate whether the nLockTime should be ignored or not? The
nSequence field was originally intended to enable multiple entities to collabo-
ratively construct a multi-input transaction where each entity was responsible
for one of the inputs. The initial version of the transaction would have a lock
time in the future and nSequence values less than 0xFFFFFFFF. The entities
would indicate a newer version of a transaction by increasing the nSequence
values in the inputs owned by them. The original Bitcoin Core reference client
implementation allowed the replacement of a transaction in a node’s transac-
tion mempool with newer versions until the transaction’s lock time expired. A
transaction was considered final once all inputs had nSequence values equal to
0xFFFFFFFF and could be included in a block immediately without waiting for
the lock time to expire. Transaction replacement based solely on nSequence
values was disabled in 2010 as a malicious node could flood the network with
newer versions of the same transaction without incurring any penalty. There
was also no way to guarantee that miners would include a newer version of
a transaction in a block when the older version paid a higher transaction
fee. When disabling transaction replacement, the transaction format was not
changed and the nSequence field in each input continued to control the validity
of the transaction lock time.³
³ A fee-based transaction replacement policy called opt-in full replace-by-fee (RBF) which
does not suffer from the shortcomings of the nSequence-based transaction replacement was
proposed in BIP 125. An implementation of opt-in full RBF was added to the Bitcoin Core
reference client in 2016. But it is a policy which does not affect block validity and is not
required to be strictly followed by the nodes in the network.
Input Format
Each input in a pre-SegWit regular transaction has the same five fields. Figure
5.2 shows these fields for the first input. The fields in the other inputs are not
shown for brevity. The input fields have the following semantics.
• The hash field contains the 256-bit transaction identifier (TXID) of a
previous transaction containing the output which will be unlocked by
this input. The field is called hash because the TXID is the double
SHA-256 hash of the previous transaction.
• The 4-byte n field contains the index of the output being unlocked in
the previous transaction. The index of the first output in the previous
transaction is 0, the index of the second output is 1, and so on.
• The scriptSigLen field is a VarInt encoding of the length of the re-
sponse script which is used to unlock the output.
• The scriptSig field contains the response script itself. The format of
response and challenge scripts will be described in Section 5.4.
• The 4-byte nSequence field is interpreted differently depending on the
transaction version. In both version 1 and version 2 transactions, if all
transaction inputs have their nSequence fields set to 0xFFFFFFFF, then
the nLockTime field is ignored.
In version 2 transactions, the nSequence field can be used to specify
a relative lock time of an input. The relative lock time can have units
of either number of blocks or seconds. Before we go into the details of
the encoding, let us consider the functionality of the relative lock time.
Suppose the relative lock time of an input is k blocks. If the output
which is being unlocked by this input is in a block with height K, then
a transaction containing this input cannot be included in a block whose
height is less than K +k. Now suppose the relative lock time of the input
is specified as t seconds. If the output being unlocked by this input is
in a block at height h_o, let the median-time-past of the block at height
h_o − 1 be T seconds. Then a transaction containing the input cannot be
included in a block at height h_i until the median-time-past of the block
at height h_i − 1 exceeds T + t seconds.
The relative lock time is unlike the lock time specified by nLockTime
which specifies an absolute block height or median-time-past before
which a transaction cannot be included in a block. The relative lock
time was introduced in BIP 68 in order to enable smart contracts which
involve a sequence of dependent transactions with minimum delays be-
tween them (see Chapter 6). Absolute lock times cannot be used to
ensure a delay between two transactions if the time or block height at
which the first transaction is added to the blockchain is not known.
[Flowchart: Start. If nSequence[31] = 1, then nSequence does not encode a
relative lock time. Otherwise k = nSequence[15:0], and the bit nSequence[22]
selects the units of the relative lock time.]
Figure 5.4: Relative lock time encoding in the nSequence field of version 2
transactions
The flowchart in Figure 5.4 illustrates the relative lock time encoding in
version 2 transactions. Let nSequence[i] denote the bit with index i
in the field where nSequence[0] is the least significant bit (LSB) and
nSequence[31] is the most significant bit (MSB). If nSequence[31] is
set, then the nSequence field does not encode a relative lock time. If it
is not set, then the nSequence field encodes a relative lock time whose
units are determined by the bit nSequence[22]. The magnitude of the
relative lock time is given by the 16 least significant bits nSequence[15]
to nSequence[0]. Let k be the unsigned integer represented by these
16 bits. If nSequence[22] is not set, then the relative lock time is
k blocks. If it is set, then the relative lock time is k × 512 seconds.
The multiplier 512 was chosen because it is the power of two closest
to 600 (the average number of seconds required to find a block). This
choice allows the relative lock time to specify similar durations of time
using either blocks or seconds. The maximum relative lock time in
blocks is 2^16 − 1 = 65,535 blocks which corresponds to approximately
1.25 years. In terms of seconds, the maximum relative lock time is
(2^16 − 1) × 512 = 33,553,920 seconds ≈ 1.06 years.
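The decoding rule described above translates almost line by line into Python. The sketch below is illustrative only: the function name and the (unit, amount) return convention are choices made here, and the rule is assumed to apply only to inputs of version 2 transactions.

```python
def decode_relative_lock_time(n_sequence: int):
    """Return None if no relative lock time is encoded, otherwise a
    (unit, amount) pair with unit 'blocks' or 'seconds'."""
    if not 0 <= n_sequence <= 0xFFFFFFFF:
        raise ValueError("nSequence is a 4-byte unsigned integer")
    if (n_sequence >> 31) & 1:       # nSequence[31] set: no relative lock time
        return None
    k = n_sequence & 0xFFFF          # nSequence[15] to nSequence[0]
    if (n_sequence >> 22) & 1:       # nSequence[22] set: units of 512 seconds
        return ("seconds", k * 512)
    return ("blocks", k)


if __name__ == "__main__":
    print(decode_relative_lock_time(0xFFFFFFFF))
    print(decode_relative_lock_time(144))
    print(decode_relative_lock_time((1 << 22) | 0xFFFF))
```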
In spite of relying on the same nSequence field, absolute lock times (us-
ing nLockTime) and relative lock times can be enabled independently of each
other. Both of them can be disabled by setting all the nSequence values
to 0xFFFFFFFF. Setting nSequence[31] to 0 in any transaction input enables
both the absolute lock time for the whole transaction and the relative lock time
for that particular input. Choosing nSequence strictly less than 0xFFFFFFFF
but with nSequence[31] equal to 1 in any transaction input enables the ab-
solute lock time for the whole transaction and disables the relative lock time
for that particular input. Finally, setting nSequence[31] to 0 and nLockTime
to 0 effectively disables the absolute lock time for the whole transaction and
enables the relative lock time for that particular input.
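The combinations listed above can be summarised in a small Python helper. It is a sketch for version 2 transactions only. The name lock_time_status and its return convention are assumptions made for illustration.

```python
MAX_SEQUENCE = 0xFFFFFFFF


def lock_time_status(n_lock_time: int, n_sequences):
    """Return (absolute_enabled, relative_enabled_per_input).

    The absolute lock time takes effect only if nLockTime is nonzero and at
    least one input has nSequence below 0xFFFFFFFF. The relative lock time of
    an input is enabled when nSequence[31] is 0.
    """
    absolute = n_lock_time != 0 and any(s < MAX_SEQUENCE for s in n_sequences)
    relative = [((s >> 31) & 1) == 0 for s in n_sequences]
    return absolute, relative


if __name__ == "__main__":
    print(lock_time_status(500, [MAX_SEQUENCE, MAX_SEQUENCE]))  # both disabled
    print(lock_time_status(500, [5, MAX_SEQUENCE]))             # both enabled
    print(lock_time_status(500, [0xFFFFFFFE]))                  # absolute only
    print(lock_time_status(0, [5]))                             # relative only
```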
Output Format
Each output in a pre-SegWit regular transaction has the same three fields.
Figure 5.2 shows these fields for the first output. The fields in the other out-
puts are not shown for brevity. The output fields have the following semantics.
[Figure: Coinbase transaction format]
Coinbase Transaction Format
  nVersion                               4 bytes
  Number of Inputs = 1                   1 byte
  Dummy Input
  Number of Outputs M                    1 – 9 bytes
  Output 0, ..., Output M − 1
  nLockTime                              4 bytes
Input Format
  hash                                   32 bytes
  n                                      4 bytes
  scriptSigLen                           1 – 9 bytes
  scriptSig                              scriptSigLen bytes
  nSequence                              4 bytes
Input Format
The sole input in the coinbase transaction is a dummy input in spite of having
the five fields corresponding to a regular transaction input. It does not unlock
a previous transaction output. The input fields have the following semantics.
• The scriptSig field does not contain a response script which can unlock
a previous output. Instead, it contains the coinbase field which can have
a maximum length of 100 bytes. The length of the coinbase field is
specified in the scriptSigLen field.
The height of the block containing the coinbase transaction is stored
at the beginning of the coinbase field. The rest of the bytes can be
arbitrarily set by the miner creating the coinbase transaction. Miners
use these bytes to modify the hashMerkleRoot value in the block header
in case they run out of nNonce values to try when mining the block.
The height currently occupies the first four bytes in the coinbase field.
The first byte is set to 0x03 to indicate that the height is encoded in
the next three bytes. The following three bytes contain the height as a
signed integer in little-endian format. When the block height increases
Why were the hash, n, nSequence, and nLockTime fields included in the coin-
base transaction format if they either take fixed values or are ignored? While
there is no way to know for sure, these fields were probably included to maintain
consistency with the regular transaction format. While wasteful in
terms of space, this consistency translates to some minor conveniences in the
C++ code of the Bitcoin Core reference client.
Output Format
The outputs in the coinbase transaction have the same fields as the outputs
in a regular transaction. As in a regular transaction output, the nValue
field specifies the amount of bitcoins (in satoshis) locked in the output, the
scriptPubkey field contains a challenge script, and the scriptPubkeyLen field
contains the length of scriptPubkey in bytes. The main difference is that the
amount specified in the nValue field should be less than or equal to the block
reward (block subsidy + transaction fees) in the block. Furthermore, the sum
of the nValue fields from all the coinbase outputs should not exceed the block
reward.
An output in a coinbase transaction cannot be spent until it has received
101 confirmations. So an input unlocking the coinbase transaction output of
a block at height h has to be in a block at height strictly higher than h + 100.
This rule ensures that coinbase transaction outputs are spent only after they
are unlikely to become invalid due to a blockchain fork.
Why is the block height stored in the coinbase field? This is to ensure that
the coinbase transactions in different blocks on the blockchain have different
TXIDs. The coinbase transactions in two blocks mined by the same miner can
contain the same challenge scripts in the scriptPubkey fields of the outputs.
The nValue field values can also be the same. If the block height is not
included in the coinbase field, then the scriptSig fields can also be the same.
Consequently, the two coinbase transactions will have the same TXID (the
double SHA-256 hash). This is undesirable as we will not be able to distinguish
between the two coinbase transactions when we specify this TXID in the hash
field of an input. The idea of including the block height in the coinbase field
was proposed in BIP 34. This behaviour was made mandatory from the block
at height 224,413 which was mined in March 2013.
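The four-byte height prefix of the coinbase field described earlier (a 0x03 byte followed by the height in three little-endian bytes) can be sketched in Python as below. The function names are hypothetical, and the sketch deliberately covers only heights for which a three-byte signed encoding is assumed to apply.

```python
def encode_coinbase_height(height: int) -> bytes:
    """Return the assumed 4-byte prefix: 0x03 then the height, little-endian."""
    # Assumption: only heights that need exactly three signed bytes are handled.
    if not 2**15 <= height < 2**23:
        raise ValueError("height outside the three-byte range handled here")
    return b"\x03" + height.to_bytes(3, "little")


def decode_coinbase_height(coinbase_field: bytes) -> int:
    """Read the height back from the start of a coinbase field."""
    if coinbase_field[0] != 0x03:
        raise ValueError("expected a three-byte height push")
    return int.from_bytes(coinbase_field[1:4], "little")


if __name__ == "__main__":
    prefix = encode_coinbase_height(224413)
    print(prefix.hex())
    print(decode_coinbase_height(prefix + b"arbitrary miner data"))
```

Because the height differs from block to block, two coinbase transactions with identical outputs still serialize differently and so have different TXIDs.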
Table 5.2: Script operators which push data onto the stack
• The operator OP_1NEGATE pushes the number −1 onto the stack. The
operators OP_1 to OP_16 are used to push numbers in the range 1 to 16
onto the stack. While these push operations can be performed using the
opcode 0x01, doing so will require two bytes: one byte for the opcode
0x01 and one byte for the number to be pushed. Specific operators
for pushing the numbers −1, 1, 2, . . . , 16 were probably included because
these small numbers are more likely to occur in scripts.
• The opcode 0x50 is reserved for future use. There is a minor convenience
in skipping 0x50 and using 0x51 as the opcode for OP_1. The number
to be pushed by the operators OP_1NEGATE, OP_1, . . . , OP_16 can be expressed
as the difference between their opcodes and 0x50. For example,
−1 = 0x4F − 0x50 and 1 = 0x51 − 0x50.
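The opcode arithmetic in the last bullet can be written as a one-line rule. The Python helper below is a sketch with an illustrative name. It rejects every opcode other than 0x4F and 0x51 to 0x60.

```python
def small_number_from_opcode(opcode: int) -> int:
    """Number pushed by OP_1NEGATE (0x4F) or OP_1 ... OP_16 (0x51 ... 0x60)."""
    if opcode == 0x4F or 0x51 <= opcode <= 0x60:
        return opcode - 0x50
    raise ValueError(f"opcode {opcode:#04x} does not push a small number")


if __name__ == "__main__":
    print([small_number_from_opcode(op) for op in (0x4F, 0x51, 0x52, 0x60)])
```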
The opcodes in the range 0x61 to 0xB2 (97 to 178 in decimal) specify operators
which perform flow control, stack manipulation, arithmetic, and cryptographic
operations.⁴ Table 5.3 lists some examples of such
operators. A complete list of all the operators and their definitions can be
found in the Bitcoin Wiki.⁵ We will describe some of these operators in the
next section.
Script uses postfix notation to express operations which are not the data
push operations listed in Table 5.2. In postfix notation, the parameters of an
operator are specified before the operator. For example, the postfix notation
for the sum 2 + 3 is 2 3 +. In Script, this postfix expression would be given
by the bytestring 0x525393 which corresponds to OP_2 OP_3 OP_ADD when the
opcodes are replaced with operator names. Figure 5.6 shows the state of the
stack when different parts of the expression are executed. We assume that the
stack is initially empty. The expression is evaluated from left to right. The
OP_2 operator is executed first resulting in the number 2 being pushed onto
the stack. The execution of OP_3 pushes the number 3 onto the stack. When
OP_ADD is executed, the numbers 3 and 2 are popped off the stack and their
sum 5 is pushed onto the stack.
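The left-to-right evaluation of 0x525393 can be mimicked by a toy interpreter. The Python sketch below understands only the opcodes 0x51 to 0x60 and 0x93 used in this example. It is not a Script implementation, and the name run_script is chosen here for illustration.

```python
OP_ADD = 0x93


def run_script(script: bytes) -> list:
    """Execute a script made of OP_1..OP_16 and OP_ADD; return the stack."""
    stack = []
    for opcode in script:
        if 0x51 <= opcode <= 0x60:      # OP_1 ... OP_16 push 1 ... 16
            stack.append(opcode - 0x50)
        elif opcode == OP_ADD:          # pop two numbers, push their sum
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x}")
    return stack


if __name__ == "__main__":
    print(run_script(bytes.fromhex("525393")))  # OP_2 OP_3 OP_ADD
```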
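[Figure 5.6: remaining script and stack contents (top element first)]
  Remaining script       Stack
  OP_2 OP_3 OP_ADD       (empty)
  OP_3 OP_ADD            2
  OP_ADD                 3, 2
  (end)                  5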
contain valid response scripts. If any of the response scripts are invalid, the
nodes will reject the transaction and not re-broadcast it to their neighbors.
The nodes validate a response script in the following manner:
1. The response script is first executed using an empty stack. If the re-
sponse script execution terminates with an error, it is considered invalid.
2. If the response script execution succeeds, the state of the stack at the
end of the execution is used to execute the challenge script. If the
challenge script execution terminates with an error, the response script
is considered invalid.
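[Stack diagram: the response script leaves x1, x2, ..., xn on the stack (top
element first). Executing <Challenge Script> on this stack leaves y1, y2, ..., ym.]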
Figure 5.7: Stack state during the execution of the response and challenge
scripts
For convenience, let S denote the 256-bit string appearing in the above script.
Figure 5.8 shows the state of the stack during the execution of this script. The
top stack element is assumed to be x before script execution. The OP_HASH256
operator pops the top stack element x and pushes its double SHA-256 hash
H(x) onto the stack. The 0x20 operator pushes a 32-byte array containing S
onto the stack. The OP_EQUAL operator pops the top two stack elements and
pushes the number 1 onto the stack if H(x) and S are equal. If they are not
equal, it pushes the number 0 onto the stack.
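The three steps of this challenge script can be traced in Python using the standard hashlib module. This is a sketch of the stack behaviour only. The function names and the sample value b"secret" are assumptions made for illustration.

```python
import hashlib


def hash256(data: bytes) -> bytes:
    """Double SHA-256, written H(x) in the text."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def run_hash_puzzle(x: bytes, s: bytes) -> int:
    """Trace OP_HASH256 0x20 S OP_EQUAL starting with x on top of the stack."""
    stack = [x]
    stack.append(hash256(stack.pop()))   # OP_HASH256: replace x by H(x)
    stack.append(s)                      # 0x20: push the 32-byte array S
    top, below = stack.pop(), stack.pop()
    stack.append(1 if top == below else 0)  # OP_EQUAL
    return stack[-1]


if __name__ == "__main__":
    s = hash256(b"secret")
    print(run_hash_puzzle(b"secret", s), run_hash_puzzle(b"guess", s))
```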
  Remaining script               Stack (top element first)
  OP_HASH256 0x20 S OP_EQUAL     x
  0x20 S OP_EQUAL                H(x)
  OP_EQUAL                       S, H(x)
  (end)                          0 or 1
Figure 5.8: Stack state during the execution of the challenge script OP_HASH256
0x20 S OP_EQUAL
CPU time or RAM at each network node during script validation. This constitutes
a denial-of-service (DoS) attack on the network as regular transactions
broadcast by non-malicious nodes will experience delays before they are
recorded on the blockchain. To avoid such DoS attacks, network nodes will
not relay transactions containing scripts which do not belong to a limited set
of standard scripts. Prior to SegWit activation, the set of standard challenge
scripts consisted of five script templates: Pay to Public Key (P2PK), Pay
to Public Key Hash (P2PKH), m-of-n Multi-signature, Pay to Script Hash
(P2SH), and Null Data.⁶
  Remaining script               Stack (top element first)
  <Public Key> OP_CHECKSIG       <Signature>
  OP_CHECKSIG                    <Public Key>, <Signature>
  (end)                          True/False
Figure 5.9: Stack state during the execution of P2PK response and challenge
scripts
[Figure: recovering the public key hash from a P2PKH address. The address is
Base58 decoded, the last four bytes are discarded, and then the address
version prefix byte is discarded.]
• The OP_DUP operator duplicates the top stack element, i.e. it pushes a
copy of the top stack element onto the stack.
• The OP_HASH160 operator pops the top stack element x and pushes
RIPEMD-160(SHA-256(x)) onto the stack.
• The OP_EQUALVERIFY operator pops the top two stack elements and compares
them. If they are equal, the script execution continues. If they
are not equal, the script terminates with an error.
A valid response to the P2PKH challenge script contains exactly two data
pushes: the first one pushes a byte array containing a valid signature and the
second one pushes a byte array containing an uncompressed public key. So
the scriptSig field is of the form
<Signature> <Public Key>.
Figure 5.11 shows the state of the stack during the execution of the P2PKH
response and challenge scripts. The execution proceeds as follows:
1. The response script pushes <Signature> and <Public Key> onto the
stack.
  Remaining script                                                         Stack (top element first)
  <Public Key> OP_DUP OP_HASH160 <PubKeyHash> OP_EQUALVERIFY OP_CHECKSIG   <Signature>
  OP_DUP OP_HASH160 <PubKeyHash> OP_EQUALVERIFY OP_CHECKSIG                <Public Key>, <Signature>
  OP_HASH160 <PubKeyHash> OP_EQUALVERIFY OP_CHECKSIG                       <Public Key>, <Public Key>, <Signature>
  <PubKeyHash> OP_EQUALVERIFY OP_CHECKSIG                                  <PubKeyHashCalc>, <Public Key>, <Signature>
  OP_EQUALVERIFY OP_CHECKSIG                                               <PubKeyHash>, <PubKeyHashCalc>, <Public Key>, <Signature>
  OP_CHECKSIG                                                              <Public Key>, <Signature>
  (end)                                                                    True/False
Figure 5.11: Stack state during the execution of P2PKH response and chal-
lenge scripts
2. The OP DUP operator in the challenge script pushes a copy of the top
stack element <Public Key> onto the stack.
3. The OP HASH160 operator pops the top stack element <Public Key> and
calculates its SHA-256 + RIPEMD-160 hash <PubKeyHashCalc>. This
hash is then pushed onto the stack.
4. The public key hash <PubKeyHash> from the challenge script is pushed
onto the stack.
5. The OP EQUALVERIFY operator pops and compares the top two stack el-
ements <PubKeyHash> and <PubKeyHashCalc>. If they are equal, then
the script execution proceeds. If they are not equal, the script execution
terminates with an error. Equality implies that the <Public Key> pro-
vided by the response script has a SHA-256 + RIPEMD-160 hash which
is equal to the <PubKeyHash> given in the challenge script.
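The steps above can be traced with a toy stack machine. The following Python sketch is illustrative only: the names hash160 and run_p2pkh are hypothetical, the signature check is a caller-supplied stub rather than real ECDSA validation, and the hash falls back to a SHA-256-only stand-in when the local Python build does not provide RIPEMD-160.

```python
import hashlib

def hash160(data: bytes) -> bytes:
    """SHA-256 followed by RIPEMD-160, as used by OP_HASH160."""
    sha = hashlib.sha256(data).digest()
    try:
        return hashlib.new("ripemd160", sha).digest()
    except ValueError:
        # Assumption: stand-in used only when RIPEMD-160 is unavailable.
        # It is NOT the hash used by Bitcoin.
        return hashlib.sha256(sha).digest()[:20]

def run_p2pkh(signature, pubkey, pubkey_hash, checksig):
    """Trace a P2PKH response script followed by the challenge script."""
    stack = [signature, pubkey]            # step 1: response script pushes
    stack.append(stack[-1])                # step 2: OP_DUP
    stack.append(hash160(stack.pop()))     # step 3: OP_HASH160
    stack.append(pubkey_hash)              # step 4: push <PubKeyHash>
    if stack.pop() != stack.pop():         # step 5: OP_EQUALVERIFY
        return False                       # script terminates with an error
    key, sig = stack.pop(), stack.pop()    # OP_CHECKSIG pops key, then signature
    return checksig(sig, key)              # stubbed signature validation
```

For example, run_p2pkh(sig, key, hash160(key), lambda s, k: True) returns True, while passing a hash that does not match the key returns False before the signature is ever examined.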
The values in the response script can cause the challenge script execution
to fail in two9 ways:
• The <Signature> given in the response script fails the secp256k1 ECDSA
signature validation procedure performed using the <Public Key> given
in the response script.
<PrivKeyC>. Thus the output can be spent if any two of Alice, Bob and
Carol agree to spend it.
where the OP 0 operator, which pushes an empty array onto the stack, is present
to account for a bug in the OP CHECKMULTISIG operator implementation. The
bug causes OP CHECKMULTISIG to pop one extra item off the stack. This bug
cannot be fixed without requiring all the nodes in the network to upgrade their
client software. If some of the nodes do not upgrade their clients, it would
result in a hard fork in the Bitcoin blockchain due to upgraded and non-
upgraded nodes disagreeing on the validity of the multisig challenge scripts.10
Figure 5.12 shows the state of the stack during the execution of the m-of-n
multisig response and challenge scripts. For brevity, some of the intermediate
stack states consisting of only data push operations have been omitted from
the figure. The execution proceeds as follows:
2. The challenge script pushes the integer m, the n public keys <Public Key
1>, . . ., <Public Key n>, and the integer n onto the stack.
10
See Chapter 7 for details about why hard forks are undesirable.
11
A subsequence of an ordered sequence of elements $x_1, x_2, \ldots, x_n$ is given by
$x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ where $1 \le i_1 < i_2 < \cdots < i_k \le n$, i.e. some elements can be omitted
from the original sequence but the order of the remaining elements is unchanged.
[Figure 5.12: Stack state during the execution of m-of-n multisig response and challenge scripts]
where Alice, Bob, and Carol control the private keys corresponding to the
public keys <PubKeyA>, <PubKeyB>, and <PubKeyC> respectively. Suppose
<SigA>, <SigB>, and <SigC> are signatures created by Alice, Bob, and Carol
respectively. Then the following response script is valid because the signatures
from Alice and Carol appear in the same order as their public keys in the
challenge script.
OP 0 <SigA> <SigC>.
But the following response script is invalid because the signatures from Alice
and Carol do not appear in the same order as their public keys in the challenge
script.
OP 0 <SigC> <SigA>.
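The ordering requirement can be illustrated with a short Python sketch. It is a simplification written for this example, not the procedure used by any particular client: the function check_multisig is hypothetical, and verify is a stub in which a signature is "valid" for a key when their final letters agree.

```python
def check_multisig(sigs, pubkeys, verify):
    """Return True if each signature matches a distinct public key and the
    matched keys appear in the same order as the signatures."""
    key_index = 0
    for sig in sigs:
        # Skip keys until one validates the current signature.
        while key_index < len(pubkeys) and not verify(sig, pubkeys[key_index]):
            key_index += 1
        if key_index == len(pubkeys):
            return False        # no remaining key validates this signature
        key_index += 1          # a key is never reused for a later signature
    return True

def verify(sig, key):
    # Stub standing in for ECDSA validation: "SigA" matches "PubKeyA", etc.
    return sig[-1] == key[-1]

keys = ["PubKeyA", "PubKeyB", "PubKeyC"]
in_order = check_multisig(["SigA", "SigC"], keys, verify)      # accepted
out_of_order = check_multisig(["SigC", "SigA"], keys, verify)  # rejected
```

Because skipped keys are never revisited, <SigA> finds no candidate key once <SigC> has consumed <PubKeyC>, which is why the second response script fails.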
While the OP CHECKMULTISIG operator allows n to be as large as 20, the
Bitcoin protocol considers m-of-n multisig challenge scripts with n greater
than 3 to be non-standard. But m-of-n multisig scripts with values of n
up to 15 are considered standard if these scripts are embedded inside a P2SH
challenge script. To distinguish between the two types of multisig scripts, the
former version is called a bare multisig script while the latter is called a P2SH
multisig script.
where <Redeem Script Byte Array> is a data push of the entire redeem
script <Redeem Script> as a byte array onto the stack as a single item. The
<Response To Redeem Script> portion of the scriptSig field contains a
valid response to the redeem script <Redeem Script>.
Figure 5.13 shows the state of the stack during the execution of the P2SH
response and challenge scripts. We assume that the <Response To Redeem
Script> portion of the response script consists of data push operations push-
ing data xn , xn−1 , . . . , x1 onto the stack. The execution proceeds as follows:
2. The redeem script specified by the byte array <Redeem Script Byte
Array> is pushed onto the stack. The state of the stack at this point is
saved for later use.
3. The OP HASH160 operator pops the top stack element <Redeem Script>
and calculates its SHA-256 + RIPEMD-160 hash <RedeemScriptHashCalc>.
This hash is then pushed onto the stack.
5. The OP EQUAL operator pops and compares the top two stack elements
<RedeemScriptHash> and <RedeemScriptHashCalc>. If they are equal,
then the number 1 is pushed onto the stack and the script execution
continues. If they are not equal, the number 0 is pushed onto the stack
and the script execution terminates with an error. Equality implies that
the redeem script provided by the response script has a SHA-256 +
RIPEMD-160 hash which is equal to the <RedeemScriptHash> given in
the challenge script.
6. If the top stack element is 1, then the state of the stack which was saved
in step 2 is restored. The redeem script <Redeem Script> is popped
from the stack and executed. If the top stack element evaluates to
True after the redeem script execution, then the P2SH response script
specified in the scriptSig field is considered valid. Otherwise, it is
considered invalid.
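The two-phase evaluation described in these steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: run_p2sh is a hypothetical name, the hash function and the redeem script interpreter are supplied by the caller, and the stand-in hash used in the example is not the SHA-256 + RIPEMD-160 hash of the real protocol.

```python
import hashlib

def run_p2sh(response_items, redeem_script, script_hash, hash160, run_script):
    """Trace a P2SH response script followed by the challenge script."""
    stack = list(response_items)         # data pushes of the response
    stack.append(redeem_script)          # step 2: redeem script as one item
    saved = list(stack)                  # stack state saved for later use
    stack.append(hash160(stack.pop()))   # step 3: OP_HASH160
    stack.append(script_hash)            # push <RedeemScriptHash>
    if stack.pop() != stack.pop():       # step 5: OP_EQUAL
        return False                     # hashes differ, so the response fails
    stack = saved                        # step 6: restore the saved stack
    script = stack.pop()                 # pop the redeem script ...
    return run_script(script, stack)     # ... and execute it on what remains

def stand_in_hash(data: bytes) -> bytes:
    # Assumption: truncated double SHA-256, used only to keep the sketch
    # runnable everywhere. Bitcoin uses SHA-256 followed by RIPEMD-160.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:20]

def toy_interpreter(script: bytes, stack: list) -> bool:
    # Hypothetical interpreter: accepts if the script equals the single
    # byte 0x51 and the response pushed exactly one item.
    return script == b"\x51" and len(stack) == 1
```

Note that the redeem script is treated as data during the hash comparison and as code only after the comparison succeeds.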
[Figure 5.13: Stack state during the execution of P2SH response and challenge scripts]
As an example, let us consider the case when the redeem script is a 2-of-3
multisig challenge script. The form of the scriptPubkey field is unchanged
from the general case. The scriptSig field is given by
Figure 5.14 shows the state of the stack during the execution of the P2SH 2-of-
3 multisig response and challenge scripts. Figure 5.14(a) shows the execution
until the beginning of the redeem script execution and Figure 5.14(b) shows
the redeem script execution. The last state in the former figure is repeated
as the first state in the latter figure for continuity. In Figure 5.14(a), we have
omitted the intermediate states showing the data pushes of the empty byte
array (by OP 0) and <Sig1> for brevity. After <Sig2> is pushed onto the stack,
the entire redeem script byte array is pushed onto the stack as a single item.
In the redeem script execution shown in Figure 5.14(b), the operators in the
redeem script are executed. The redeem script is enclosed in angle brackets
<...> to differentiate the data push of the redeem script as a byte array from
its execution.
Suppose Alice wants Bob to make a bitcoin payment to an output which
can be unlocked by providing signatures created by any two out of three private
keys. She can of course share the three public keys required to create the 2-of-
3 bare multisig challenge script with Bob. Alternatively, she can specify this
challenge script as the redeem script in the P2SH script template and send the
SHA-256 + RIPEMD-160 hash of the redeem script to Bob. This hash can be
conveniently shared with Bob using a P2SH address which is similar to the
P2PKH address described in Section 3.3. The generation of a P2PKH address
from a public key is illustrated in Figure 3.2. The P2SH address generation
procedure is essentially the same except for the following two differences.
1. Instead of the uncompressed public key, the redeem script is hashed first
with SHA-256 and then with RIPEMD-160.
2. The address version byte which is prefixed to the hash is 0x05 for main-
net addresses and 0xC4 for testnet addresses.
The checksum calculation and the Base58 encoding procedures are the same
as in the P2PKH address generation.
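The address generation procedure can be sketched in Python as follows. The sketch assumes that the 20-byte hash of the redeem script has already been computed and is passed in directly; the function names base58check and p2sh_address are chosen for this example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(version: int, payload: bytes) -> str:
    """Prefix the version byte, append a 4-byte checksum, encode in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    digits = ""
    while num > 0:
        num, rem = divmod(num, 58)
        digits = ALPHABET[rem] + digits
    # Each leading zero byte is represented by the character '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + digits

def p2sh_address(script_hash: bytes, testnet: bool = False) -> str:
    """Encode a 20-byte redeem script hash as a P2SH address."""
    if len(script_hash) != 20:
        raise ValueError("expected a 20-byte redeem script hash")
    return base58check(0xC4 if testnet else 0x05, script_hash)
```

With the mainnet version byte 0x05, the encoded string always starts with the character '3', which is the Base58 symbol for the digit 2.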
As the address version byte for P2SH addresses on mainnet is 0x05, the
input to the Base58 encoding procedure is a 25-byte number (50 hexadecimal
digits) in the range 0x050000...0000 to 0x05FFFF...FFFF. All the
numbers in this range lie between $2 \times 58^{33}$ and $2 \times 58^{33} + 25 \times 58^{32}$. Hence
they all begin with the number 2 and consist of exactly 34 digits in base 58
[Figure 5.14: Stack state during the execution of 2-of-3 P2SH multisig response and challenge scripts. (a) P2SH multisig execution until the beginning of redeem script execution. (b) Redeem script execution.]
Null Data
The null data challenge script is a method to store small amounts of data
(up to 80 bytes) on the blockchain. It has a scriptPubkey of the form
OP RETURN <Data>
where <Data> can contain at most 80 bytes of arbitrary data. The scriptPubkey
field itself has a maximum size of 83 bytes with the OP RETURN operator oc-
cupying 1 byte, the length of the data occupying at most 2 bytes,14 and the
data occupying up to 80 bytes.
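The size accounting above can be checked with a small Python sketch. The function name null_data_script is hypothetical; the opcode values 0x6A (OP_RETURN) and 0x4C (OP_PUSHDATA1) are the standard ones.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C

def null_data_script(data: bytes) -> bytes:
    """Build the scriptPubkey of a null data output."""
    if len(data) > 80:
        raise ValueError("null data payload is limited to 80 bytes")
    if len(data) <= 75:
        push = bytes([len(data)])                 # one-byte length prefix
    else:
        push = bytes([OP_PUSHDATA1, len(data)])   # two bytes for 76 to 80
    return bytes([OP_RETURN]) + push + data
```

An 80-byte payload gives an 83-byte script, and a 75-byte payload gives a 77-byte script. For a timestamping application, data would typically be a 32-byte hash of the document.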
The OP RETURN operator causes script execution to terminate immediately
irrespective of the state of the stack. So there exists no response script which
can provide a valid response to this challenge script. For this reason, null
data outputs are unspendable and any bitcoins locked by a null data challenge
script will be lost forever. Outputs containing null data challenge scripts are
not added to the set of UTXOs even if they have some bitcoins associated
with them.
The sole reason for including the null data challenge script in the list of
standard scripts is to provide a means to securely store data on the blockchain.
Once the block containing the null data output receives a few dozen confir-
mations, it becomes computationally infeasible to change the data recorded
in the output. This property can be exploited to build timestamping applica-
tions where the hash of some document can be recorded on the blockchain to
prove the existence of the document prior to some point in time.
13
The base 58 representation of a positive integer N is the sequence of digits $a_k a_{k-1} \cdots a_0$
where $0 \le a_i \le 57$ and $N = \sum_{i=0}^{k} a_i 58^i$.
14
The length of the data is encoded using the data push operators given in Table 5.2.
Data lengths up to 75 bytes need only one byte to encode while data lengths from 76 to 80
need two bytes.
SIGHASH ALL
The SIGHASH ALL hash type is the default option where all the inputs and
outputs in the transaction are signed. Consider the regular transaction shown
in Figure 5.15. The number 0x02 which appears once before the inputs and
once before the outputs indicates there are two inputs and two outputs in the
transaction.
[Figure 5.15: Message used to generate signature for the first input using the SIGHASH ALL hash type. Left: regular transaction. Right: message for Input 0 signatures, consisting of nVersion, 0x02, hash0, n0, prevScriptPubkeyLen0, prevScriptPubkey0, nSequence0, hash1, n1, 0x00, nSequence1, 0x02, nValue0, scriptPubkeyLen0, scriptPubkey0, nValue1, scriptPubkeyLen1, scriptPubkey1, nLockTime, nHashType.]
Input 0 unlocks a previous output which is at index n0 of a
transaction with TXID hash0. Let prevScriptPubkey0 denote the scriptPubkey
field of this output and let prevScriptPubkeyLen0 denote its length. If the
challenge script specified by prevScriptPubkey0 requires a signature to be
present in scriptSig0 with signature hash type SIGHASH ALL, then the message
to be signed is shown on the right in Figure 5.15. This message is obtained
according to the following rules:
• The nVersion field and the number of inputs are included without mod-
ification.
• The number of outputs, the fields in the outputs, and the nLockTime
field are included without modification.
• The 4-byte signature hash type is appended at the end as shown by the
nHashType field. This step is not unique to the SIGHASH ALL signature
hash type and is done for all the other hash types as well.
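The message construction can be sketched in Python as follows. This is a minimal sketch, assuming a transaction held in a plain dictionary whose keys mirror the field names of Figure 5.15; the function names and the sample transaction are made up for illustration, and length prefixes of 0xFD or more are not handled.

```python
import hashlib
import struct

SIGHASH_ALL = 0x01

def compact_size(n: int) -> bytes:
    """Length prefix; this sketch only handles values below 0xFD."""
    if n >= 0xFD:
        raise ValueError("longer encodings are omitted from this sketch")
    return bytes([n])

def sighash_all_message(tx: dict, index: int, prev_script_pubkey: bytes) -> bytes:
    """Serialize the message signed for input `index` under SIGHASH_ALL."""
    msg = struct.pack("<I", tx["nVersion"]) + compact_size(len(tx["inputs"]))
    for i, txin in enumerate(tx["inputs"]):
        # The input being signed carries the previous challenge script;
        # every other input carries an empty script (a single zero byte).
        script = prev_script_pubkey if i == index else b""
        msg += txin["hash"] + struct.pack("<I", txin["n"])
        msg += compact_size(len(script)) + script
        msg += struct.pack("<I", txin["nSequence"])
    msg += compact_size(len(tx["outputs"]))
    for txout in tx["outputs"]:
        msg += struct.pack("<q", txout["nValue"])
        msg += compact_size(len(txout["scriptPubkey"])) + txout["scriptPubkey"]
    msg += struct.pack("<I", tx["nLockTime"])
    return msg + struct.pack("<I", SIGHASH_ALL)   # 4-byte nHashType

def message_digest(msg: bytes) -> bytes:
    """Double SHA-256 hash of the message."""
    return hashlib.sha256(hashlib.sha256(msg).digest()).digest()

# Made-up transaction with two inputs and two outputs.
tx = {
    "nVersion": 1,
    "inputs": [
        {"hash": b"\x11" * 32, "n": 0, "nSequence": 0xFFFFFFFF},
        {"hash": b"\x22" * 32, "n": 1, "nSequence": 0xFFFFFFFF},
    ],
    "outputs": [
        {"nValue": 5000, "scriptPubkey": b"\x51"},
        {"nValue": 7000, "scriptPubkey": b"\x51"},
    ],
    "nLockTime": 0,
}
```

Calling sighash_all_message(tx, 0, script) and sighash_all_message(tx, 1, script) yields messages that differ only in which input carries the previous challenge script.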
The message digest is calculated as the double SHA-256 hash of this mes-
sage and signed with the private key to generate the signature. The signature
will appear in the scriptSig0 field of Input 0. As the signature is not known
at the time of message digest calculation, the scriptSig0 field cannot be
included in the message. If any of the fields included in the message are mod-
ified after the signature generation, the signature will become invalid. This
property has the following semantic consequences:
• Since hash0 and n0 are included in the message, the UTXO being un-
locked by Input 0 cannot be changed. This is because hash0 contains
the TXID of the transaction containing the UTXO and n0 contains the
index of the UTXO in the list of outputs in that transaction. Specify-
ing the UTXO location in the message prevents the signature generated
to unlock the UTXO from being used again to unlock another UTXO
locked by the same challenge script (for example, two UTXOs may have
the same P2PKH challenge script).16
• As hash1 and n1 are included in the message, the UTXO which will be
unlocked by Input 1 cannot be changed. This is useful in scenarios when
the response scripts in Input 0 and Input 1 are provided by two different
entities. The entity generating the signatures for Input 0 can be sure
that the amount of bitcoins contributed by Input 1 will not change as
the corresponding UTXO cannot change.
• As all the output fields are included in the message, the intended recip-
ients (specified by the challenge scripts) of the bitcoins unlocked by the
transaction inputs and the amounts being sent to them cannot be changed.
[Figure 5.16: Message used to generate signature for the second input using the SIGHASH ALL hash type]
The message used to generate the signature for Input 1 of the transaction
from Figure 5.15 using the SIGHASH ALL hash type is shown in Figure 5.16.
We have repeated the message used for the Input 0 signature in this figure for
easy comparison between the two messages. The two messages differ only in
which input fields are included. The message for Input 1 signatures replaces
the scriptSig0 and scriptSigLen0 from Input 0 with a zero byte. This is
done even if the scriptSig0 field is already known, so that the signatures
for the two inputs can be generated in any order. Let prevScriptPubkey1
and prevScriptPubkeyLen1 be the challenge script and its length from the
UTXO being unlocked by Input 1. These fields replace the scriptSig1 and
scriptSigLen1 fields in Input 1.
While we used a transaction with two inputs and two outputs to illustrate
the message generation for the SIGHASH ALL hash type, the procedure for
transactions with an arbitrary number of inputs and outputs is similar. All the
outputs in the transaction are included in the message. When generating the
message for a particular input, the scriptSig and scriptSigLen fields in
that input are replaced with the scriptPubkey and scriptPubkeyLen fields
from the UTXO being unlocked. The scriptSig and scriptSigLen fields
of every other input are replaced with a single zero byte.
[Figure 5.17: Message used to generate signature for the first input using the SIGHASH NONE hash type]
As discussed in Section 5.5, for all signature hash types the ECDSA signature
generated using the message digest and private key is encoded using
DER encoding. The least significant byte of the signature hash type is
appended to this DER encoded signature to indicate its type. This byte is used
by the operators OP CHECKSIG and OP CHECKMULTISIG to generate the correct
message from the transaction for signature verification.
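The encoding just described can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that r and s are at most 256 bits long so that every DER length fits in a single byte.

```python
def der_int(x: int) -> bytes:
    """DER INTEGER: tag 0x02, length, big-endian value."""
    body = x.to_bytes((x.bit_length() + 7) // 8 or 1, "big")
    if body[0] & 0x80:
        body = b"\x00" + body      # keep the integer non-negative
    return b"\x02" + bytes([len(body)]) + body

def encode_signature(r: int, s: int, hash_type: int) -> bytes:
    """DER SEQUENCE of (r, s) followed by the one-byte hash type."""
    body = der_int(r) + der_int(s)
    der = b"\x30" + bytes([len(body)]) + body
    return der + bytes([hash_type & 0xFF])   # least significant byte only
```

For the toy values r = 1 and s = 2 with SIGHASH ALL, the result is the byte string 30 06 02 01 01 02 01 02 01, where the final byte is the hash type.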
SIGHASH NONE
When the SIGHASH NONE signature hash type is used, none of the outputs
in the transaction are included in the message being signed. Figure 5.17
shows the message used to generate the signatures for the first input of the
transaction from Figure 5.15. The fields related to the inputs of the transaction
are obtained by the same procedure used in the SIGHASH ALL case with one
exception. The nSequence1 field from Input 1 is set to zero (0x00000000)
in the message. This allows the nSequence1 field to be modified before the
signatures in Input 1 are generated.
The field indicating number of outputs is set to zero and none of the fields
from the outputs of the transactions are included in the message. This may
seem insecure because the outputs can be modified without invalidating the
signature. If a regular transaction in which all the inputs have signatures with
SIGHASH NONE hash type is broadcast on the network, the miner can replace
the addresses receiving the payment with its own address and include the
transaction in the blockchain. This can be avoided by having at least one of
the transaction inputs have signatures with the SIGHASH ALL hash type.
The utility of the SIGHASH NONE hash type is that it enables entities which
trust each other to construct transactions where the receiver of the payment is
not known beforehand. For example, suppose Alice and Bob want to purchase
a rare book by pooling their bitcoin funds. They will create a transaction
containing two inputs where each of them will provide signatures for one
of the inputs. But the book is not currently available and requires them to
search for a seller who has it. Bob offers to search for the book. If Alice trusts
Bob, she can unlock the UTXO containing her contribution to the cost of the
book using signatures of SIGHASH NONE hash type and give the transaction
to Bob. The message used to generate Alice’s signature will not contain the
outputs. Hence the receiver of the payment is not fixed. When Bob finds
a seller having the book, he can include the seller’s Bitcoin address in the
transaction outputs and unlock the UTXO containing his contribution using
signatures of SIGHASH ALL hash type. When this transaction is broadcast on
the network, the outputs cannot be modified as they are protected by the
signatures in Bob’s input.
For transactions with an arbitrary number of inputs and outputs, the message
for generating the signatures for each input always excludes the outputs and
sets the number of outputs to zero. Apart from setting the nSequence fields
in all the other inputs to zero when generating the message for a particular
input, the portion of the message related to the transaction inputs is generated
as in the SIGHASH ALL case.
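The modifications made for SIGHASH NONE can be sketched at the level of fields rather than bytes. The following Python sketch assumes a transaction held in a plain dictionary; the function name sighash_none_view and the key names are hypothetical, and serialization of the resulting view is omitted.

```python
import copy

SIGHASH_NONE = 0x02

def sighash_none_view(tx: dict, index: int, prev_script_pubkey: bytes) -> dict:
    """Return the fields that are signed for input `index` under SIGHASH_NONE."""
    view = copy.deepcopy(tx)
    for i, txin in enumerate(view["inputs"]):
        if i == index:
            txin["script"] = prev_script_pubkey   # previous challenge script
        else:
            txin["script"] = b""                  # empty script
            txin["nSequence"] = 0                 # other sequences are zeroed
    view["outputs"] = []                          # number of outputs set to zero
    view["nHashType"] = SIGHASH_NONE
    return view
```

Since the view contains no outputs, two transactions that differ only in their outputs produce identical views, which is why the signature remains valid when the outputs are changed.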
SIGHASH SINGLE
The SIGHASH SINGLE signature hash type is used in situations when each
entity unlocking a UTXO in a multi-input transaction wants to sign only one
of the outputs. In the message for signatures in the input at index i, only the
output at index i is included. The fields related to the inputs are included in
the message by the same procedure used in the SIGHASH NONE case.
Figure 5.18 shows the messages used to generate the signatures in each of
the two inputs of the transaction from Figure 5.15. For the Input 0 message,
the number of outputs is set to one (0x01) and only the fields from Output 0
are included in the message. For the Input 1 message, the number of outputs
is set to two (0x02). A null output is included instead of Output 0. It consists
of a 64-bit nValue field set to all ones (0xFFFF FFFF FFFF FFFF) followed by
a single zero byte representing an empty scriptPubkey field. All the fields
from Output 1 are included in the message. In general, the message for an
input with index i includes the number of outputs set to i + 1, i null
outputs, and the unmodified output with index i. The outputs with index
greater than i are ignored.
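The output portion of the SIGHASH SINGLE message can be sketched in Python as follows. The sketch is consistent with the Input 1 example above, in which a single null output precedes Output 1. The function names are hypothetical, outputs are given as (nValue, scriptPubkey) pairs, and scripts longer than 252 bytes are not handled.

```python
import struct

# Null output: 64-bit nValue of all ones followed by an empty scriptPubkey.
NULL_OUTPUT = b"\xff" * 8 + b"\x00"

def serialize_output(value: int, script: bytes) -> bytes:
    if len(script) >= 0xFD:
        raise ValueError("longer scripts are omitted from this sketch")
    return struct.pack("<q", value) + bytes([len(script)]) + script

def sighash_single_outputs(outputs: list, index: int) -> bytes:
    """Output count and outputs signed by the input at position `index`."""
    if index >= len(outputs):
        raise IndexError("no output with the same index as the input")
    value, script = outputs[index]
    return (bytes([index + 1])             # number of outputs set to index + 1
            + NULL_OUTPUT * index          # one null output per earlier index
            + serialize_output(value, script))
```

For index 0 the result contains only Output 0 with a count of one, while for index 1 it contains a count of two, one null output, and Output 1.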
[Figure 5.18: Messages used to generate signatures for each input using the SIGHASH SINGLE hash type]
hash type which was used to unlock a UTXO can be reused to unlock other
UTXOs locked by the same challenge script. This bug is due to an oversight
in an early implementation of the Bitcoin Core client and has remained unfixed
as it requires a soft fork to fix.
The SIGHASH SINGLE hash type is useful in scenarios when multiple entities
fund the different inputs in a transaction. Each entity wants to ensure that a
certain amount of bitcoins from the inputs is paid to a receiver of their choice.
But each entity does not care how the remaining bitcoins are
spent.
SIGHASH ANYONECANPAY
All three previous signature hash types include all the inputs of the transaction
in the message being signed. The SIGHASH ANYONECANPAY hash type specifies
that the message used to generate signatures for a particular transaction input
includes only that input. It is always used in conjunction with one of the
three previous signature hash types. For example, the SIGHASH ANYONECANPAY
|SIGHASH ALL hash type corresponds to the case when all the outputs are
signed and only one of the inputs is signed. This hash type is represented by
the value 0x81 which is the bitwise OR of the SIGHASH ANYONECANPAY and
SIGHASH ALL hash type values from Table 5.4. Once the signature of this hash
type has been generated, the outputs in the transaction cannot be modified
but the other inputs can be modified. Hence the name “anyone can pay”.
Figure 5.19 shows the messages used to generate the signatures having
hash type SIGHASH ANYONECANPAY|SIGHASH ALL in each of the two inputs of
the transaction from Figure 5.15. For the Input 0 message, the number of
inputs is set to one (0x01) and only the fields from Input 0 are included in
the message. For the Input 1 message, the number of inputs is once again
set to one and only the fields from Input 1 are included. In both cases, all the
output fields are included in the message.
The SIGHASH ANYONECANPAY|SIGHASH NONE and SIGHASH ANYONECANPAY
|SIGHASH SINGLE hash types have values 0x82 and 0x83 respectively. The messages
generated by these hash types include only one input at a time. The inclu-
sion of the outputs follows the procedure described for SIGHASH NONE and
SIGHASH SINGLE respectively.
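The combined hash type values can be checked with a short Python sketch. The function name describe is hypothetical, and masking with 0x1F to isolate the base type is an assumption made for this example.

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_ANYONECANPAY = 0x80

def describe(hash_type: int) -> tuple:
    """Return which inputs and which outputs a hash type commits to."""
    base = hash_type & 0x1F   # assumption: low bits select ALL/NONE/SINGLE
    if hash_type & SIGHASH_ANYONECANPAY:
        inputs = "one input"
    else:
        inputs = "all inputs"
    outputs = {
        SIGHASH_NONE: "no outputs",
        SIGHASH_SINGLE: "one output",
    }.get(base, "all outputs")
    return inputs, outputs
```

For instance, describe(SIGHASH_ANYONECANPAY | SIGHASH_ALL) evaluates to ("one input", "all outputs"), matching the value 0x81 discussed above.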
To see the utility of the SIGHASH ANYONECANPAY hash type, consider a
crowdfunding scenario where the recipient of the funds and the amount of
funds required are known. Some people want to participate in the crowdfund-
ing by unlocking UTXOs containing their contribution. A transaction is cre-
ated where each funder’s contribution is represented by an input and the out-
put contains the recipient’s address. Without using the SIGHASH ANYONECANPAY
hash type, the number of funders and their UTXO details will need to be
fixed before the signatures for each input can be generated. But with the
SIGHASH ANYONECANPAY hash type, the signatures for each input can be gen-
[Figure 5.19: Messages used to generate signatures for each input using the SIGHASH ANYONECANPAY|SIGHASH ALL hash type]
[Figure: The TXID of a regular transaction is the double SHA-256 hash of its serialized fields: nVersion, the number of inputs N, Inputs 0 to N − 1 (hash, n, scriptSigLen, scriptSig, nSequence), the number of outputs M, Outputs 0 to M − 1 (nValue, scriptPubkeyLen, scriptPubkey), and nLockTime.]
• They both specify the same previous UTXOs as the source of bitcoins.
The hash and n fields in the inputs of both the transactions have to be
identical.
• They both specify the same new UTXOs as destinations of the bitcoins
being transferred. The nValue and scriptPubkey fields in the outputs
of both transactions have to be identical.
Note that the scriptSig fields are not required to be identical for two trans-
actions to be functionally identical. The scriptSig field contains ECDSA sig-
natures whose generation involves a random integer (see Section 2.5). Hence
there are multiple valid signatures corresponding to the same message digest
and private key. Each of these signatures will result in a different scriptSig
field in an input and consequently a different TXID for the transaction. So an
entity which knows any one of the private keys needed to generate a signature
required in the response script can change the TXID by simply regenerating
the signature with a different random integer.
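The dependence of the TXID on the scriptSig field can be illustrated with a small Python sketch. It assumes a transaction held in a plain dictionary with hypothetical key names, handles only length prefixes below 0xFD, and follows the common convention of displaying the double SHA-256 hash in reversed byte order.

```python
import hashlib
import struct

def serialize(tx: dict) -> bytes:
    """Serialize a pre-SegWit regular transaction (short fields only)."""
    out = struct.pack("<I", tx["nVersion"]) + bytes([len(tx["inputs"])])
    for txin in tx["inputs"]:
        out += txin["hash"] + struct.pack("<I", txin["n"])
        out += bytes([len(txin["scriptSig"])]) + txin["scriptSig"]
        out += struct.pack("<I", txin["nSequence"])
    out += bytes([len(tx["outputs"])])
    for txout in tx["outputs"]:
        out += struct.pack("<q", txout["nValue"])
        out += bytes([len(txout["scriptPubkey"])]) + txout["scriptPubkey"]
    return out + struct.pack("<I", tx["nLockTime"])

def txid(tx: dict) -> str:
    """Double SHA-256 of the serialization, shown in reversed byte order."""
    digest = hashlib.sha256(hashlib.sha256(serialize(tx)).digest()).digest()
    return digest[::-1].hex()
```

Two transactions that agree in every field except scriptSig spend the same UTXOs and create the same outputs, yet txid returns different strings for them.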
Transaction malleability can even be effected by entities which do not know
any of the private keys required to generate a valid response script. Using
notation from Section 2.5, let (r, s) be a valid secp256k1 ECDSA signature
for a message m. In Appendix A, we show that (r, n − s) is also a valid signature
for the message m, where n is the 256-bit prime order of the secp256k1 group (see Section 2.5).
The integer s is not allowed to be zero in an ECDSA signature. Since n is
an odd integer, n − s ≠ s mod n for all s ≠ 0. Hence replacing the byte
representation of (r, s) in a scriptSig field of a transaction with the byte
representation of (r, n − s) will modify the TXID without invalidating the
signature. This change does not require knowledge of the private key which
was used to generate (r, s).
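The claim that (r, n − s) verifies whenever (r, s) does can be checked numerically. The following Python sketch implements textbook ECDSA over secp256k1 in affine coordinates; it is a toy written for this check, not a secure implementation, and the private key, the fixed random integer k, and the message are arbitrary values chosen for illustration. It requires Python 3.8 or later for modular inverses via pow.

```python
import hashlib

P = 2**256 - 2**32 - 977                      # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def sign(z, d, k):
    r = mul(k, G)[0] % N
    return r, pow(k, -1, N) * (z + r * d) % N

def verify(z, Q, r, s):
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    pt = add(mul(z * w % N, G), mul(r * w % N, Q))
    return pt is not None and pt[0] % N == r

d = 0xC0FFEE                                   # toy private key
Q = mul(d, G)                                  # corresponding public key
z = int.from_bytes(hashlib.sha256(b"example message").digest(), "big")
r, s = sign(z, d, k=0xDECAF)                   # fixed k: insecure, demo only
both_valid = verify(z, Q, r, s) and verify(z, Q, r, N - s)
```

Here both_valid evaluates to True even though the second signature was produced without using the private key d.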
Once a transaction receives a confirmation i.e. once it is included in a
block on the blockchain, changing its TXID will change the hashMerkleRoot
in the block header. With a high probability, this will cause the block hash
to exceed the target threshold invalidating the block. As a result, transac-
tion malleability does not pose problems in situations when the transaction
outputs will be spent after the transaction is confirmed. However, there are
protocols where unconfirmed transaction outputs are referenced by a spend-
ing transaction. Transaction malleability will cause such protocols to fail by
breaking the dependency between the transactions.
To understand how transaction malleability causes protocols involving
spending of unconfirmed transaction outputs to fail, consider the following
situation. Suppose Alice is a professor who wants to teach her student Bob
about Bitcoin transactions. Bob does not own any bitcoins. So Alice decides
to transfer x bitcoins to Bob with the intention of getting them back. But
Alice does not have enough trust in Bob’s integrity or competence to send the
bitcoins to a Bitcoin address which requires only Bob’s signature to spend (for
example, a P2PK or P2PKH address derived from Bob’s private key). Bob
may decide to cheat Alice by refusing to provide the signature required to send
the bitcoins from his Bitcoin address back to Alice. Or he may make a mis-
take in constructing the refund transaction resulting in the bitcoins being sent
to an address not owned by Alice. To prevent these undesirable situations,
Alice executes a protocol which is illustrated in Figure 5.21. Figure 5.21(a)
illustrates the transactions t1 and t2 Alice creates to initiate the protocol and
Figure 5.21(b) illustrates the messages exchanged during the protocol. The
[Figure 5.21: Alice’s protocol for transferring funds to a 2-of-2 multisig output and reclaiming them. (a) Transactions t1 and t2 created by Alice. (b) Messages exchanged during the protocol.]
input contains the TXID i1 of t1 and the index of the output containing
the 2-of-2 multisig challenge script. But the response script in t2 ’s input
has no signatures.
Even though Alice creates t1 in step 1, she does not broadcast it on the network
until Bob sends her the refund transaction t2 with his signature included in
it. The dependency between t1 and t2 is created using the TXID i1 of t1 .
Transaction malleability can be used to cause Alice’s protocol to fail by
breaking the dependency between t1 and t2 . When Alice broadcasts t1 in step
5, Bob or any node on the network can replace Alice’s signature (r, s) in t1 with
the valid signature (r, n − s) and rebroadcast the modified t1 on the network.
Let t1′ denote the modified t1 that has a different TXID i1′. Since both t1 and
t1′ spend the same UTXO owned by Alice, only one of them will be included in
a block. If t1′ gets included in a block, then the refund transaction t2 is invalid
as its input contains the TXID i1 of t1 . Now Alice will have to request Bob
to sign a new version of t2 in order to get her bitcoins back. If Bob refuses
to cooperate, then she cannot get her bitcoins back. While Bob cannot spend
the bitcoins as this requires Alice’s signature, he can inconvenience Alice by
making the funds unspendable.
Transaction malleability is possible because all the signature
hash types exclude the scriptSig field from the message used
to create signatures in a transaction. But this field is included in the TXID
calculation. Replacing the signatures in the scriptSig field with new valid
signatures changes the TXID without invalidating the transaction. SegWit
solves the problem of transaction malleability by defining two new script tem-
plates which move the signature data out of the scriptSig field and into a
separate structure called the witness. This witness structure is not included
in the TXID calculation. In the Bitcoin Core client, the witness structure is
stored in a field called scriptWitness.
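State 1. Stack: (empty). Remaining script: OP_0 0x14 <PubKeyHash>
State 2. Stack (top first): <Empty Array>. Remaining script: 0x14 <PubKeyHash>
State 3. Stack (top first): <PubKeyHash>, <Empty Array>. Remaining script: (none)
Figure 5.22: Stack state during the execution of an empty response script
followed by a P2WPKH challenge script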
<PubKeyHash>, i.e.
scriptSig: (empty),
scriptWitness: <Signature> <Public Key>.
The fields in the scriptWitness are pushed onto an empty stack and the
P2PKH challenge script is executed on this stack. Note that the fields in the
scriptWitness are identical to the scriptSig fields of the P2PKH response
script described in Section 5.5. The script execution is as shown in Figure
5.11.
Suppose a miner mines a new valid block which has a transaction with an
input which spends a P2WPKH output. When this block is broadcast on the
network, nodes running SegWit-capable clients receive both the scriptSig
and scriptWitness fields. But nodes running pre-SegWit clients do not re-
ceive the scriptWitness fields. They see only the empty scriptSig fields
which are valid responses as per the pre-SegWit script execution procedure.
Hence the block is considered valid by the nodes running pre-SegWit clients.
While each transaction input which unlocks an output locked by a SegWit
challenge script has a scriptWitness field associated with it, this field is
not included in the input data structure to maintain backward compatibility
with pre-SegWit clients. The signature data which was part of the scriptSig
field in pre-SegWit regular transactions is moved to the scriptWitness field.
In cryptography, digital signatures are considered witnesses which prove that
the signer knows the private key corresponding to a public key. The phrase
“segregated witness” is motivated by the separation of the signature data from
the inputs.
The P2WPKH challenge script can be embedded inside a P2SH script
template resulting in a P2SH-P2WPKH challenge script. The scriptPubkey
field has a P2SH challenge script of the form
The 0x16 operator in the scriptSig field pushes the 22-byte P2WPKH challenge
script (which is the P2SH redeem script) onto the stack. The scriptWitness
field once again contains a signature and a compressed public key. While
the response to the redeem script was present in the scriptSig field in pre-
SegWit P2SH scripts, it is located in the scriptWitness field in the P2SH-
P2WPKH script. The advantage of a P2SH-P2WPKH script template over
the P2WPKH script template is that an entity requesting bitcoin payment to
an output locked by the former can simply share the P2SH address generated
from the RedeemScriptHash.
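The byte counts quoted above can be checked with a short Python sketch. The function names are hypothetical and the public key hash is a placeholder, not the hash of a real key. The sketch builds the P2WPKH challenge script OP_0 0x14 <PubKeyHash> and the scriptSig that pushes it as the P2SH redeem script.

```python
OP_0 = 0x00


def p2wpkh_script(pubkey_hash):
    """Build the challenge script OP_0 0x14 <PubKeyHash>."""
    if len(pubkey_hash) != 20:
        raise ValueError("PubKeyHash must be 20 bytes long")
    return bytes([OP_0, 0x14]) + pubkey_hash


def p2sh_p2wpkh_script_sig(pubkey_hash):
    """Build a scriptSig consisting of a single push of the redeem script."""
    redeem_script = p2wpkh_script(pubkey_hash)
    return bytes([len(redeem_script)]) + redeem_script


pubkey_hash = bytes(range(20))  # placeholder hash
redeem_script = p2wpkh_script(pubkey_hash)
script_sig = p2sh_p2wpkh_script_sig(pubkey_hash)
print(len(redeem_script), hex(script_sig[0]))
```

The redeem script is 22 bytes long, which is why the push operator at the start of the scriptSig is 0x16.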
When pre-SegWit clients encounter a transaction input which unlocks a
P2SH-P2WPKH output, they will first verify that the SHA-256 + RIPEMD-
160 hash RedeemScriptHashCalc of the redeem script in the scriptSig field
matches the RedeemScriptHash given in the scriptPubkey of the output.
When they execute the redeem script, the OP_0 operator pushes an empty
byte array and the 0x14 operator pushes the 20-byte PubKeyHash onto the
stack making it the top stack element. As PubKeyHash is extremely unlikely
to be the all zeros bytestring, it evaluates to True and the script execution
succeeds. This is illustrated in Figure 5.23. On the other hand, SegWit-
capable clients will insert the PubKeyHash in the redeem script into a P2PKH
challenge script as before and use the contents of the scriptWitness field to
execute it.
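[Figure 5.23: stack state when a pre-SegWit client validates an input spending a P2SH-P2WPKH output. The redeem script OP_0 0x14 <PubKeyHash> is pushed onto the stack and hashed to <RedeemScriptHashCalc>, which is compared with <RedeemScriptHash> by OP_EQUAL (result 0 or 1). During redeem script execution, <Empty Array> is pushed first and <PubKeyHash> is pushed on top of it.]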
redeem scripts, we will call the redeem script in the scriptSig field the P2SH
redeem script and the redeem script in the scriptWitness field the P2WSH
redeem script. The scriptPubkey field has a P2SH challenge script of the
form
OP_HASH160 <P2SH RedeemScriptHash> OP_EQUAL
where P2SH RedeemScriptHash is the 20-byte SHA-256 + RIPEMD-160 hash
of the P2WSH challenge script (which is the P2SH redeem script). The
scriptSig and scriptWitness fields are given by
The 0x22 operator in the scriptSig field pushes the 34-byte P2WSH chal-
lenge script onto the stack. The scriptWitness field once again contains the
response to the P2WSH redeem script followed by the script itself.
As in the P2SH-P2WPKH case, pre-SegWit clients validate inputs spend-
ing P2SH-P2WSH outputs based on the equality of the P2SH redeem script
hash and the hash given in the scriptPubkey field. SegWit-capable clients
perform the additional step of executing the P2WSH redeem script in the
scriptWitness field.
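As a rough check of the 34-byte figure, the following Python sketch builds a P2WSH challenge script and the scriptSig that pushes it. It assumes, as in BIP 141, that the P2WSH challenge script is OP_0 followed by a push of the 32-byte single SHA-256 hash of the script carried in the witness. The function names and the one-byte placeholder script are hypothetical.

```python
import hashlib

OP_0 = 0x00


def p2wsh_script(witness_script):
    """Build OP_0 0x20 <SHA-256 hash of the script carried in the witness>."""
    return bytes([OP_0, 0x20]) + hashlib.sha256(witness_script).digest()


def p2sh_p2wsh_script_sig(witness_script):
    """Build a scriptSig consisting of a single push of the P2WSH script."""
    p2sh_redeem_script = p2wsh_script(witness_script)
    return bytes([len(p2sh_redeem_script)]) + p2sh_redeem_script


witness_script = bytes([0x51])  # placeholder script consisting only of OP_1
p2sh_redeem_script = p2wsh_script(witness_script)
script_sig = p2sh_p2wsh_script_sig(witness_script)
print(len(p2sh_redeem_script), hex(script_sig[0]))
```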
[Figure: the two transaction serializations. Serialization for TXID calculation: nVersion, Number of Inputs N, Input 0, ..., Input N − 1, Number of Outputs M, Output 0, ..., Output M − 1, nLockTime. Its double SHA-256 hash is the TXID. Second serialization: nVersion, Marker Byte = 0x00, Flag Byte = 0x01, Number of Inputs N, Input 0, ..., Input N − 1, Number of Outputs M, Output 0, ..., Output M − 1, Witness 0, ..., Witness N − 1, nLockTime. Its double SHA-256 hash is the WTXID.]
The double SHA-256 hash of the second serialization is called the witness
transaction identifier (WTXID). If all the inputs in a transaction are SegWit
inputs, the TXID is not malleable as its calculation does not involve any
signatures which will be present in the witness structures. If even one of
the inputs in a transaction is a non-SegWit input, then the TXID of the
transaction is malleable. The WTXID of a transaction will be malleable as
[Figure: serialization of the witness structures. Witness i − 1, Witness i, and Witness i + 1 appear consecutively. Each witness structure consists of the number of stack items n (VarInt) followed, for each of the stack items 1, ..., n, by the length of the stack item in bytes (VarInt) and the stack item itself.]
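The two serializations and the witness layout can be sketched in Python as follows. This is a simplified illustration: inputs and outputs are treated as opaque, already serialized byte strings, and the function names are hypothetical.

```python
import hashlib


def sha256d(data):
    """Double SHA-256 hash."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def varint(n):
    """Variable-length integer encoding used for counts and lengths."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


def serialize_witness(stack_items):
    """Item count, then each stack item prefixed by its length in bytes."""
    out = varint(len(stack_items))
    for item in stack_items:
        out += varint(len(item)) + item
    return out


def serialize(version, inputs, outputs, locktime, witnesses=None):
    """Serialization without witnesses (TXID) or with them (WTXID)."""
    out = version.to_bytes(4, "little")
    if witnesses is not None:
        out += b"\x00\x01"  # marker byte and flag byte
    out += varint(len(inputs)) + b"".join(inputs)
    out += varint(len(outputs)) + b"".join(outputs)
    if witnesses is not None:
        out += b"".join(serialize_witness(w) for w in witnesses)
    return out + locktime.to_bytes(4, "little")


def txid(version, inputs, outputs, locktime):
    return sha256d(serialize(version, inputs, outputs, locktime))


def wtxid(version, inputs, outputs, locktime, witnesses):
    return sha256d(serialize(version, inputs, outputs, locktime, witnesses))


inputs, outputs = [bytes(41)], [bytes(9)]  # placeholder input and output bytes
w1 = wtxid(1, inputs, outputs, 0, [[b"signature-1", b"public-key"]])
w2 = wtxid(1, inputs, outputs, 0, [[b"signature-2", b"public-key"]])
print(txid(1, inputs, outputs, 0).hex(), w1 != w2)
```

Changing a stack item in a witness changes the WTXID, but `txid` never sees the witness data at all.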
[Figure 5.26 diagram: a Merkle tree with leaves t0, t1, t2, t3 and root h = H(h0 ‖ h1).]
Figure 5.26: WTXID Merkle tree for a block with four transactions
where the 0x24 operator indicates that the following 36 bytes are of interest.
The 4-byte field containing 0xAA21A9ED is the fixed commitment header for
SegWit witness commitments. Figure 5.27 illustrates a coinbase transaction
for a block containing SegWit transactions. There are two outputs in the
transaction: the first output is a regular P2PKH output used by the miner
to send the block reward to a P2PKH address owned by him and the sec-
ond output is a null data output containing the witness commitment hash.
The second output has an nValue field equal to zero as null data outputs are
unspendable. The scriptPubkeyLen field is set to 0x26 indicating that the
scriptPubkey is 38 bytes long. The scriptPubkey field contains the null
data challenge script containing the witness commitment hash. Even though
the dummy input in the coinbase transaction is not a SegWit input, it has
a witness structure containing a single stack item consisting of the 32-byte
witness reserved value.
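The 38-byte length of the null data challenge script can be verified with a short Python sketch. It assumes, following BIP 141, that the commitment hash is the double SHA-256 hash of the witness Merkle root concatenated with the witness reserved value. The all-zero root and reserved value below are placeholders.

```python
import hashlib

OP_RETURN = 0x6A
COMMITMENT_HEADER = bytes.fromhex("aa21a9ed")


def sha256d(data):
    """Double SHA-256 hash."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def commitment_hash(witness_root, reserved_value):
    """Assumed definition: SHA256d(witness root || witness reserved value)."""
    return sha256d(witness_root + reserved_value)


def commitment_script(commit_hash):
    """OP_RETURN 0x24 0xAA21A9ED <32-byte commitment hash>."""
    payload = COMMITMENT_HEADER + commit_hash
    return bytes([OP_RETURN, len(payload)]) + payload


witness_root = bytes(32)    # placeholder WTXID Merkle root
reserved_value = bytes(32)  # placeholder witness reserved value
script = commitment_script(commitment_hash(witness_root, reserved_value))
print(hex(len(script)), hex(script[1]))
```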
Coinbase transaction fields:
  nVersion
  Number of Inputs = 1
  Dummy Input
  Number of Outputs = 2
  Output 0 (P2PKH output)
  Output 1 (Null data output):
    nValue = 0
    scriptPubkeyLen = 0x26
    scriptPubkey = OP_RETURN 0x24 0xAA21A9ED <32-byte Commitment Hash>
  Witness Structure Storing Reserved Value:
    Number of Stack Items = 0x01
    Length in Bytes of Stack Item 1 = 0x20
  nLockTime
Figure 5.27: Example of a coinbase transaction for a block with SegWit transactions
shown in Figure 5.15. The fields corresponding to the two transaction outputs
appear in both the messages. The fields containing the TXID and index of the
previous outputs (hash0, n0, hash1, n1) being unlocked by the inputs appear
in both the messages. The sequence number fields of the inputs (nSequence0,
nSequence1) appear in both the messages. The consequence of such repetitions
is that the amount of data to be hashed by the double SHA-256 hash
function increases quadratically, i.e. O(N^2), with the number of inputs N.
This is undesirable as this quadratic complexity is due to an oversight in the
original design. The amount of data hashed can be made to increase only
linearly with the number of inputs if the fields which are repeatedly hashed
are hashed only once and the hash values reused. Such a modification of the
message digest calculation will require a hard fork change in the Bitcoin pro-
tocol which entails upgrading all the nodes in the network. For this reason,
the message digest calculation was not changed. But when SegWit was pro-
posed as a soft fork solution to transaction malleability, the Bitcoin developers
saw an opportunity to introduce a more efficient message digest calculation
algorithm for the signatures in SegWit inputs. As pre-SegWit clients see SegWit
outputs as anyone-can-spend outputs, they do not need to verify the
signatures required to validate the SegWit inputs which unlock these outputs.
SegWit-capable clients use the new message digest calculation algorithm for
validating SegWit inputs and the old message digest calculation algorithm for
validating non-SegWit inputs.
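The difference between quadratic and linear growth can be illustrated with a rough byte-count model in Python. All sizes below are assumptions made for the illustration: every input is signed with hash type ALL, counts are encoded in one byte, the previous challenge script is a 25-byte P2PKH script, and the outputs occupy 34 bytes in total.

```python
def legacy_bytes_hashed(num_inputs, outputs_bytes=34, script_bytes=25):
    """Approximate total bytes hashed to sign every input before SegWit."""
    per_input = 32 + 4 + 1 + 4  # TXID, index, empty scriptSig length, nSequence
    fixed = 4 + 1 + 1 + 4 + 4   # nVersion, two counts, nLockTime, nHashType
    # Each message contains every input plus the script of the signed input.
    message = fixed + per_input * num_inputs + script_bytes + outputs_bytes
    return num_inputs * message


def segwit_bytes_hashed(num_inputs, outputs_bytes=34, script_bytes=26):
    """Approximate total bytes hashed when the three hash fields are reused."""
    # hashPrevouts, hashSequence, and hashOutputs are computed only once.
    one_time = (36 + 4) * num_inputs + outputs_bytes
    # nVersion, hashPrevouts, hashSequence, outpoint, script with length,
    # amount, nSequence, hashOutputs, nLockTime, nHashType.
    message = 4 + 32 + 32 + 36 + script_bytes + 8 + 4 + 32 + 4 + 4
    return one_time + num_inputs * message


for n in (10, 100, 1000):
    print(n, legacy_bytes_hashed(n), segwit_bytes_hashed(n))
```

In this model, doubling the number of inputs roughly quadruples the pre-SegWit total while only doubling the SegWit total.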
Figure 5.28 shows the messages used to generate the signatures in each
of the two inputs of a regular transaction. The message always contains the
nVersion field, the nLockTime field, and the nHashType field. Three new fields
hashPrevouts, hashSequence, and hashOutputs not present in pre-SegWit
signature messages are also always included in the message. The signature
message corresponding to a particular input includes six fields related to that
input. They are as follows:
2. The index (n0 or n1) of the output being unlocked in the transaction
containing it.
Regular transaction:
  nVersion
  0x02 (number of inputs)
  Input 0: hash0, n0, scriptSigLen0, scriptSig0, nSequence0
  Input 1: hash1, n1, scriptSigLen1, scriptSig1, nSequence1
  0x02 (number of outputs)
  Output 0: nValue0, scriptPubkeyLen0, scriptPubkey0
  Output 1: nValue1, scriptPubkeyLen1, scriptPubkey1
  nLockTime
Message for Input 0 signatures:
  nVersion, hashPrevouts, hashSequence
  Input 0 fields: hash0, n0, prevScriptPubkeyLen0, prevScriptPubkey0, prevNValue0, nSequence0
  hashOutputs, nLockTime, nHashType
Message for Input 1 signatures:
  nVersion, hashPrevouts, hashSequence
  Input 1 fields: hash1, n1, prevScriptPubkeyLen1, prevScriptPubkey1, prevNValue1, nSequence1
  hashOutputs, nLockTime, nHashType
Figure 5.28: Messages used to generate SegWit signatures for both inputs
of a transaction with two inputs
the output. This field was not present in any of the messages used in
the pre-SegWit signature generation. The motivation for including this
field was to make the offline signing of transactions by devices which
do not have access to the whole blockchain safer. When an untrusted
entity requests an offline device to sign a transaction, the presence of
the amount in the message ensures that the signature becomes invalid
if the amount was misrepresented.
use SHA256d(·) to denote the SHA256d hash function. The following rules
are used to define the fields:
2. If the hash type is ALL, the hashSequence field is set to the SHA256d hash of
all the sequence number fields from the inputs. Otherwise, it is set to the
256-bit all zeros bitstring. This definition of hashSequence follows from
the fact that all the sequence numbers are included in the pre-SegWit
messages only when the hash type is ALL.
3. If the hash type is ALL or ALL|ANYONECANPAY, the hashOutputs field is
set to the SHA256d hash of the concatenation of all the output fields.
If the hash type is SINGLE or SINGLE|ANYONECANPAY, the hashOutputs
field is set to the SHA256d hash of the output which has the same
index as the input being signed. For example, messages for Input 0 with
SINGLE hash type will have a hashOutputs field equal to
SHA256d(nValue0 ‖ scriptPubkeyLen0 ‖ scriptPubkey0).
If the hash type is NONE or NONE|ANYONECANPAY, the hashOutputs field
is set to the 256-bit all zeros bitstring. So hashOutputs either contains
a hash of all the outputs, a single output, or none of the outputs.
These three fields help reduce the amount of data which is hashed if the hash
types of the inputs result in identical field values. For example, in Figure 5.28
suppose both Input 0 and Input 1 have signatures of hash type ALL. Then the
values of the hashPrevouts, hashSequence, and hashOutputs fields are the
same for both the inputs. These fields will be calculated once for Input 0 and
reused for Input 1.
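Rules 2 and 3 can be sketched in Python as follows. The hash type is modelled as a base type string ("ALL", "NONE", or "SINGLE") plus an ANYONECANPAY flag, which is an assumption of this sketch and not the wire encoding. Sequence numbers and outputs are passed as already serialized byte strings. The bounds check for SINGLE follows BIP 143.

```python
import hashlib

ZERO32 = bytes(32)  # the 256-bit all zeros bitstring


def sha256d(data):
    """Double SHA-256 hash."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def hash_sequence(sequences, base_type, anyonecanpay):
    """Rule 2: hash all sequence numbers only for hash type ALL."""
    if base_type == "ALL" and not anyonecanpay:
        return sha256d(b"".join(sequences))
    return ZERO32


def hash_outputs(outputs, base_type, input_index):
    """Rule 3: hash all outputs, the matching output, or nothing."""
    if base_type == "ALL":
        return sha256d(b"".join(outputs))
    if base_type == "SINGLE" and input_index < len(outputs):
        return sha256d(outputs[input_index])
    return ZERO32


# Placeholder serialized fields for a transaction with two inputs and outputs.
sequences = [b"\xff\xff\xff\xff", b"\xfe\xff\xff\xff"]
outputs = [b"output-0-bytes", b"output-1-bytes"]

# With hash type ALL on both inputs the values coincide and can be reused.
reused = hash_outputs(outputs, "ALL", 0) == hash_outputs(outputs, "ALL", 1)
print(reused, hash_sequence(sequences, "ALL", False).hex())
```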
For example, the below P2SH response script with a redeem script con-
sisting of a 1-of-2 multisig script contributes 2 sigops. We have explicitly
indicated the push of the N -byte redeem script by the PushN operator.
Note that the OP_CHECKMULTISIG in the redeem script does not contribute
20 sigops (according to rule 2) in spite of being in a scriptSig
field. This is because when the operators in the P2SH response script
are parsed, the OP_CHECKMULTISIG in the redeem script is considered part
of the N-byte data which is pushed onto the stack. It is not interpreted
as an operator in the scriptSig field.
SegWit increased both the block size and sigop limits for blocks containing
SegWit transactions while keeping these limits the same for blocks without
SegWit transactions. We discuss the block size increase first followed by the
sigop limit increase.
Sigop Limit
SegWit increases the limit on the number of sigops in a block to 80,000. The
number of sigops in a block is calculated as follows where the first three rules
are the same as the pre-SegWit sigop rules with a scaling factor of four.
For a non-SegWit input, rules 4 through 7 do not apply. The sigop count is
exactly four times the pre-SegWit sigop count. Since the sigop limit is also
four times the previous limit, a non-SegWit block which has at most 20,000
sigops will continue to be valid under the SegWit sigop calculation rules.
Chapter 6
Contracts
6.1 Escrow
Consider the scenario where Alice (the buyer) wants to purchase a used book
from Bob (the seller) using bitcoins. Alice and Bob live in different cities mak-
ing it infeasible for them to meet and perform the transaction. Bob promises
to ship the book to Alice once he receives the bitcoin payment. But Alice does
not trust Bob and fears that he may not send her the book after receiving the
payment. To reduce her risk, Alice proposes to use an escrow contract to pay
Bob. The contract needs a third party Carol (the escrow) whom both Alice and
Bob trust. The contract proceeds as follows:
1. Alice requests public keys from Bob and Carol. Let these keys be
PubKeyB and PubKeyC respectively.
2. Alice transfers x bitcoins to a 2-of-3 multisig output which has the chal-
lenge script
OP_2 <PubKeyA> <PubKeyB> <PubKeyC> OP_3 OP_CHECKMULTISIG
where PubKeyA is Alice’s public key.
3. Once Bob sees that Alice’s transaction has appeared on the blockchain,
he ships the book to Alice.
4. The funds locked in the multisig output can be spent if any two of Alice,
Bob, and Carol provide signatures created by their respective private
keys. Any of the three following scenarios can happen.
(i) Alice is happy with the book she has received. She signs a transac-
tion which unlocks the 2-of-3 multisig output and transfers the x
bitcoins (minus the transaction fees) to the P2PK address contain-
ing Bob’s public key. She sends this transaction to Bob who adds
his own signature and broadcasts it on the network for inclusion
on the blockchain.
(ii) Alice receives the book but refuses to sign the transaction paying
Bob. Bob provides proof of shipment to the escrow Carol and
requests her to sign a transaction paying him. If Carol is convinced
that Bob actually shipped the book to Alice, she will send the
signed transaction to Bob, who will add his own signature to the
transaction and broadcast it on the network.
(iii) Bob does not ship the book to Alice. Furthermore, he refuses to
sign the transaction refunding the bitcoins to Alice. In this case,
Alice requests Carol to sign a transaction refunding the bitcoins. If
Carol complies, Alice adds her own signature to the refund trans-
action and broadcasts it on the network.
The above escrow contract fails if the escrow Carol colludes with Alice or Bob.
If Alice and Carol collude, then they can refuse to pay Bob even if he sent the
book to Alice. If Bob and Carol collude, then they can transfer the bitcoins
to any address without sending the book to Alice. Another weakness of the
contract is that it is difficult for Bob to give proof of shipment. He can send
the tracking information of the package to Carol but the package itself may be
empty. A solution to the empty package problem is to choose an escrow Carol
who lives in the same city as Alice, ask Bob to ship the book to Carol, and
have Alice collect it from her. Alice can open the package in Carol’s presence
and Carol can verify that it is not empty.
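The 2-of-3 challenge script used in step 2 of the contract can be assembled byte by byte, as in the Python sketch below. The public keys are placeholders with the length of compressed keys (33 bytes), and `can_spend` is a hypothetical helper which only models the two-signatures-out-of-three rule, not signature verification.

```python
OP_2, OP_3, OP_CHECKMULTISIG = 0x52, 0x53, 0xAE
PARTIES = {"Alice", "Bob", "Carol"}


def push(data):
    """Direct push of up to 75 bytes: a length byte followed by the data."""
    if not 1 <= len(data) <= 75:
        raise ValueError("only direct pushes are handled in this sketch")
    return bytes([len(data)]) + data


def escrow_script(pubkey_a, pubkey_b, pubkey_c):
    """OP_2 <PubKeyA> <PubKeyB> <PubKeyC> OP_3 OP_CHECKMULTISIG."""
    keys = push(pubkey_a) + push(pubkey_b) + push(pubkey_c)
    return bytes([OP_2]) + keys + bytes([OP_3, OP_CHECKMULTISIG])


def can_spend(signers):
    """The output can be spent when any two of the three parties sign."""
    return len(set(signers) & PARTIES) >= 2


# Placeholder 33-byte values standing in for compressed public keys.
pubkey_a = b"\x02" + bytes([1]) * 32
pubkey_b = b"\x02" + bytes([2]) * 32
pubkey_c = b"\x02" + bytes([3]) * 32
script = escrow_script(pubkey_a, pubkey_b, pubkey_c)
print(len(script), can_spend({"Alice", "Bob"}), can_spend({"Carol"}))
```

The three scenarios in step 4 correspond to the signer sets {Alice, Bob}, {Bob, Carol}, and {Alice, Carol}.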
6.2 Micropayments
Every Bitcoin transaction involves paying transaction fees, which makes using
Bitcoin to make small payments expensive (the transaction fees may exceed
the payment amount). But if a sequence of small payments is to be made
to the same entity, the micropayment contract can be used. It aggregates
the small payments and requires that transaction fees be paid for only one
transaction.
Consider the scenario where Alice offers proofreading and editing services
online in return for bitcoins. Clients can email Alice their documents and
Alice will reply with typos and grammatical errors she has found in the docu-
ments. Alice charges her clients a fixed amount of bitcoins per edited page. To
avoid the situation where a client refuses payment after receiving the edited
document, Alice uses the micropayment contract. This contract enables her
to get payment incrementally for each page she edits. Let Bob be a client who
wishes to have a 100 page document proofread by Alice. Let us assume that
Alice charges 0.0001 bitcoins per page. So Bob expects to pay a maximum of
0.01 bitcoins to Alice. The protocol proceeds as follows:
1. Bob requests a public key from Alice. He also generates a public key
for himself. Let PubKeyA and PubKeyB be Alice’s and Bob’s public keys
respectively.
Bob does not broadcast t1 on the network at this point. If he does, then
he is liable to have his funds locked in the multisig output forever in
the event that Alice refuses to sign any transaction which spends this
output.
7. Alice edits only the first page of the document. She creates a transaction
e1 which unlocks the 2-of-2 multisig output in t1 and pays her 0.0001
bitcoins and the remaining 0.0099 bitcoins (minus transaction fees) to
Bob.
8. Alice includes her signature in e1 and sends it to Bob along with the
first page edits.
(i) If Bob refuses to sign e1 , then Alice is unpaid only for the effort
spent in editing one page. She terminates the contract. Bob broad-
casts the refund transaction t2 after the relative lock time expires
and receives the 0.01 bitcoins (minus transaction fees).
(ii) If Bob signs e1 and returns it to Alice, then Alice is guaranteed at
least 0.0001 bitcoins if she broadcasts e1 before the relative lock
time on t2 expires. But Alice does not broadcast e1 at this point.
9. Alice edits the second page of the document. She creates a transaction
e2 which unlocks the 2-of-2 multisig output in t1 and pays her 0.0002
bitcoins and the remaining 0.0098 bitcoins (minus transaction fees) to
Bob.
10. Alice includes her signature in e2 and sends it to Bob along with the
second page edits.
(i) If Bob refuses to sign e2 , then Alice can broadcast e1 and get paid
for the edits in the first page. She is unpaid only for the effort
spent in editing the second page. She terminates the contract.
When Alice broadcasts e1 , Bob receives 0.0099 bitcoins (minus
transaction fees).
(ii) If Bob signs e2 and returns it to Alice, then Alice is guaranteed at
least 0.0002 bitcoins if she broadcasts e2 before the relative lock
time on t2 expires. But Alice does not broadcast e2 at this point.
11. Alice continues this process of sending edits for the next page along with
a transaction requesting cumulative payment for all pages edited so far.
Once all the pages have been edited, the contract terminates. Figure 6.1
illustrates the steps in the protocol when neither Alice nor Bob cheats.
Alice has to take care to finish editing before the relative lock time on
t2 expires. So she has n days after t1 is confirmed to finish the edits.
If Bob refuses to sign any of the ei transactions, Alice will not edit the subse-
quent pages. But Bob can always cheat Alice out of the payment for the last
page (page 100) as he receives the edits for the last page along with a request
to sign e100 . This risk should be acceptable to Alice as she anyway receives
payment for the first 99 pages. If Alice wants to avoid not getting paid for the
last page, she can distribute the cost of editing the last page across the cost
of editing the first 99 pages and offer the last page edits for free.
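The amounts in the transactions e1, e2, ..., e100 follow a simple pattern which the Python sketch below computes. Amounts are kept in satoshis (1 bitcoin = 100,000,000 satoshis) to avoid floating point rounding, transaction fees are ignored, and the function name `payout` is hypothetical.

```python
SATOSHI_PER_BTC = 100_000_000
PRICE_PER_PAGE = 10_000  # 0.0001 bitcoins per edited page
DEPOSIT = 1_000_000      # 0.01 bitcoins locked in the 2-of-2 multisig output


def payout(pages_edited):
    """Return (Alice's amount, Bob's amount) in the transaction e_i."""
    alice = PRICE_PER_PAGE * pages_edited
    if not 0 <= alice <= DEPOSIT:
        raise ValueError("payment exceeds the funds locked in t1")
    return alice, DEPOSIT - alice


for i in (1, 2, 100):
    alice, bob = payout(i)
    print(i, alice / SATOSHI_PER_BTC, bob / SATOSHI_PER_BTC)
```

Each e_i spends the same output of t1, so at most one of them can ever be confirmed. Alice has no reason to broadcast an earlier one, since the latest transaction signed by Bob pays her the most.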
[Figure 6.1 diagram, message sequence between Alice and Bob: Request public key; Send PubKeyA; Create PubKeyB; Create t1; Create t2; Send t2 with B’s sig; ...; e100 confirmation.]
Figure 6.1: Illustration of the steps in the micropayments protocol when neither
Alice nor Bob cheats
[Figure: Alice and Bob each generate a random bit, xa and xb respectively. The winner is decided by the test xa ⊕ xb = 0, with one branch for Yes and one for No.]
commit to their bits at the beginning of the lottery. The protocol proceeds as
follows:
3. The successful execution of the protocol requires Alice and Bob to re-
veal their secret bytestrings. To ensure this, Alice broadcasts a deposit
transaction on the network which unlocks a UTXO she owns and pays
two bitcoins to an output locked by a P2SH challenge script with redeem
script given below (indented for readability).
OP_IF
    OP_SHA256 <HashA> OP_EQUALVERIFY <PubKeyA>
OP_ELSE
    <Timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <PubKeyB>
OP_ENDIF
OP_CHECKSIG
The OP_IF operator pops the top stack element and checks if it
evaluates to True. If yes, the script between the OP_IF operator and
the OP_ELSE operator is executed. Otherwise, the script between the
OP_ELSE operator and the OP_ENDIF operator is executed. Note that
the OP_CHECKSIG operator is executed at the very end in both cases.
The OP_CHECKLOCKTIMEVERIFY operator checks that the top stack
element is less than or equal to the nLockTime field of the transaction
which unlocks the output of the deposit transaction. If yes, the script
execution continues. Otherwise, it terminates. As transactions with a lock
time enabled cannot be included in a block until the lock time expires, the
OP_CHECKLOCKTIMEVERIFY operator ensures that the deposit transaction
output is not spent via the OP_ELSE branch until after the block height
or Unix time specified in the Timeout field has been reached.
There are two possible response scripts to the above redeem script. Alice
can spend the output in the deposit transaction at any time by providing
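State 1. Stack (top first): 1, <SecretA>, <SigAlice>. Remaining script: OP_IF OP_SHA256 <HashA> OP_EQUALVERIFY <PubKeyA> OP_ELSE <Timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <PubKeyB> OP_ENDIF OP_CHECKSIG
State 2. Stack (top first): <SecretA>, <SigAlice>. Remaining script: OP_SHA256 <HashA> OP_EQUALVERIFY <PubKeyA> OP_CHECKSIG
State 3. Stack (top first): <HashSecretA>, <SigAlice>. Remaining script: <HashA> OP_EQUALVERIFY <PubKeyA> OP_CHECKSIG
State 4. Stack (top first): <HashA>, <HashSecretA>, <SigAlice>. Remaining script: OP_EQUALVERIFY <PubKeyA> OP_CHECKSIG
State 5. Stack (top first): <SigAlice>. Remaining script: <PubKeyA> OP_CHECKSIG
State 6. Stack (top first): <PubKeyA>, <SigAlice>. Remaining script: OP_CHECKSIG
State 7. Stack (top first): True/False. Remaining script: (none)
Figure 6.3: Stack state during the execution of the deposit transaction redeem
script given Alice’s response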
<SigAlice> <SecretA> OP_1
where SigAlice is Alice’s signature. Figure 6.3 shows the state of the
stack during the execution of the redeem script given Alice’s response
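State 1. Stack (top first): <Empty Array>, <SigBob>. Remaining script: OP_IF OP_SHA256 <HashA> OP_EQUALVERIFY <PubKeyA> OP_ELSE <Timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <PubKeyB> OP_ENDIF OP_CHECKSIG
State 2. Stack (top first): <SigBob>. Remaining script: <Timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <PubKeyB> OP_CHECKSIG
State 3. Stack (top first): <Timeout>, <SigBob>. Remaining script: OP_CHECKLOCKTIMEVERIFY OP_DROP <PubKeyB> OP_CHECKSIG
State 4. Stack (top first): <Timeout>, <SigBob>. Remaining script: OP_DROP <PubKeyB> OP_CHECKSIG
State 5. Stack (top first): <SigBob>. Remaining script: <PubKeyB> OP_CHECKSIG
State 6. Stack (top first): <PubKeyB>, <SigBob>. Remaining script: OP_CHECKSIG
State 7. Stack (top first): True/False. Remaining script: (none)
Figure 6.4: Stack state during the execution of the deposit transaction redeem
script given Bob’s response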
to it. Alternatively, if the current block height or Unix time exceeds the
timeout encoded in the Timeout field then Bob can spend the output in
the deposit transaction by providing a response of the form
<SigBob> OP_0.
Figure 6.4 shows the state of the stack during the execution of the redeem
script given Bob’s response to it. Recall that the OP_0 operator pushes
an empty byte array onto the stack which evaluates to False.
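The two ways of spending the deposit output can be mimicked with a toy evaluation of the redeem script in Python. This is a simplified model, not a Script interpreter: signature checking is replaced by a caller-supplied stand-in `check_sig`, a stack element counts as True if it has any non-zero byte, and the lock time check only compares the timeout with the nLockTime of the spending transaction.

```python
import hashlib


def run_deposit_script(stack, hash_a, timeout, tx_locktime, check_sig):
    """Evaluate the deposit redeem script on a response (top of stack last)."""
    stack = list(stack)
    branch = stack.pop()  # OP_IF pops the top stack element
    if any(branch):
        # OP_SHA256 <HashA> OP_EQUALVERIFY <PubKeyA>
        secret = stack.pop()
        if hashlib.sha256(secret).digest() != hash_a:
            return False
        key_owner = "A"
    else:
        # <Timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP <PubKeyB>
        if timeout > tx_locktime:
            return False
        key_owner = "B"
    signature = stack.pop()
    return check_sig(signature, key_owner)  # OP_CHECKSIG


def toy_check_sig(signature, key_owner):
    """Stand-in for signature verification, for illustration only."""
    return signature == b"sig-" + key_owner.encode()


secret_a = b"alice's secret"
hash_a = hashlib.sha256(secret_a).digest()

alice_response = [b"sig-A", secret_a, b"\x01"]  # <SigAlice> <SecretA> OP_1
bob_response = [b"sig-B", b""]                  # <SigBob> OP_0

print(run_deposit_script(alice_response, hash_a, 500, 0, toy_check_sig))
print(run_deposit_script(bob_response, hash_a, 500, 400, toy_check_sig))
print(run_deposit_script(bob_response, hash_a, 500, 600, toy_check_sig))
```

Alice's response succeeds regardless of the lock time, while Bob's response succeeds only when the spending transaction has an nLockTime of at least the timeout.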
(ii) If the lengths of SecretA and SecretB are not equal, the re-
sponse to the redeem script contains Bob’s signature followed by
the bytestrings SecretA and SecretB. The scriptSig field is given
by
While Alice and Bob know their own secret bytestrings, they need the
other secret bytestring to construct a valid response script. We will
discuss how they acquire the other bytestring below.
The redeem script consists of three functional parts shown below.
The <Check Hashes> portion of the redeem script checks that the bytes-
trings given in the response script hash to HashA and HashB. In Script
notation, it is given by
where the OP_2DUP operator duplicates the top two stack elements, the
OP_SHA256 operator replaces the top stack element with its SHA-256
hash, and the OP_EQUALVERIFY operator pops the top two stack elements
and checks them for equality (if they are equal script execution
proceeds, otherwise it terminates). Figure 6.5 shows the state of the
stack during the execution of the <Check Hashes> portion of the redeem
script. We assume that the response to the redeem script (with Alice’s
signature) has already been pushed onto the stack. The stack items
HashSecretA and HashSecretB represent the SHA-256 hashes of the
bytestrings SecretA and SecretB respectively. The two OP_EQUALVERIFY
operators check that these items are equal to the HashA and HashB
bytestrings provided in the redeem script. If either of the secret bytestrings
provided in the response does not have the required hash, the script
execution terminates and the remaining portion of the redeem script
is not executed. As the stack can only store byte arrays, the secrets
SecretA and SecretB were chosen to be bytestrings of length either 16
or 17 bytes. If the stack had allowed storage of arbitrary bitstrings, we
could have chosen the secrets to be bitstrings of length 128 or 129 bits.
If the <Check Hashes> portion of the redeem script succeeds, the script
execution proceeds with the <Compute Winner> portion which compares
the lengths of SecretA and SecretB. If the lengths are equal, then Alice
is the winner. Otherwise, Bob is the winner. This is akin to the bitwise
XOR to decide the winner where Alice won if the bits were equal and
Bob won otherwise. After the execution of <Compute Winner>, the top
stack element is set to 0 to indicate that Alice is the winner and set
to 1 to indicate that Bob is the winner. In Script notation, <Compute
Winner> is given by
where the OP_SIZE operator pushes the length of the top stack element
in bytes onto the stack, the OP_ROT operator cyclically rotates the top
three stack elements once, and the OP_NIP operator deletes the stack
item below the top stack element. Figure 6.6 shows the state of the
stack during the execution of the <Compute Winner> portion of the re-
deem script. The <Check Sig> portion of the redeem script checks the
validity of the signature provided in the response to the redeem script.
Let PubKeyA and PubKeyB be public keys belonging to Alice and Bob
respectively. In Script notation, the <Check Sig> portion is given below.
OP_IF
    OP_DROP <PubKeyB> OP_CHECKSIG
OP_ELSE
    OP_DROP <PubKeyA> OP_CHECKSIG
OP_ENDIF
The OP_DROP operator deletes the top stack element. It is used to get
rid of the SecretB stack item as shown in Figure 6.7. We have assumed
that the top stack element after the execution of the <Compute Winner>
portion is 0.
For convenience, let us call the transaction created in this step the fund-
ing transaction as it funds the lottery by unlocking the UTXOs owned
by Alice and Bob. It requires signatures from both Alice and Bob to be
valid.
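The commit-and-reveal logic behind the lottery can be sketched in Python. Each party encodes a secret bit as the length (16 or 17 bytes) of a random bytestring and commits to it with a SHA-256 hash. The winner is decided by comparing lengths, following the rule stated in the text that equal lengths mean Alice wins. The function names are hypothetical and signatures are not modelled.

```python
import hashlib
import secrets


def commit(bit):
    """Encode a bit as a random 16-byte or 17-byte secret and hash it."""
    secret = secrets.token_bytes(16 + bit)
    return secret, hashlib.sha256(secret).digest()


def lottery_winner(secret_a, secret_b, hash_a, hash_b):
    """Check both revealed secrets against the hashes, then compare lengths."""
    if hashlib.sha256(secret_a).digest() != hash_a:
        raise ValueError("SecretA does not hash to HashA")
    if hashlib.sha256(secret_b).digest() != hash_b:
        raise ValueError("SecretB does not hash to HashB")
    return "Alice" if len(secret_a) == len(secret_b) else "Bob"


secret_a, hash_a = commit(secrets.randbelow(2))
secret_b, hash_b = commit(secrets.randbelow(2))
print(lottery_winner(secret_a, secret_b, hash_a, hash_b))
```

Because the hashes are exchanged before any secret is revealed, neither party can change its bit after seeing the other party's bit.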
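State 1. Stack (top first): <SecretB>, <SecretA>, <SigAlice>. Remaining script: OP_2DUP OP_SHA256 <HashB> OP_EQUALVERIFY OP_SHA256 <HashA> OP_EQUALVERIFY
State 2. Stack (top first): <SecretB>, <SecretA>, <SecretB>, <SecretA>, <SigAlice>. Remaining script: OP_SHA256 <HashB> OP_EQUALVERIFY OP_SHA256 <HashA> OP_EQUALVERIFY
State 3. Stack (top first): <HashSecretB>, <SecretA>, <SecretB>, <SecretA>, <SigAlice>. Remaining script: <HashB> OP_EQUALVERIFY OP_SHA256 <HashA> OP_EQUALVERIFY
State 4. Stack (top first): <HashB>, <HashSecretB>, <SecretA>, <SecretB>, <SecretA>, <SigAlice>. Remaining script: OP_EQUALVERIFY OP_SHA256 <HashA> OP_EQUALVERIFY
State 5. Stack (top first): <HashSecretA>, <SecretB>, <SecretA>, <SigAlice>. Remaining script: <HashA> OP_EQUALVERIFY
State 6. Stack (top first): <HashA>, <HashSecretA>, <SecretB>, <SecretA>, <SigAlice>. Remaining script: OP_EQUALVERIFY
State 7. Stack (top first): <SecretB>, <SecretA>, <SigAlice>. Remaining script: (none)
Figure 6.5: Stack state during the execution of the <Check Hashes> portion
of the lottery redeem script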
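State 1. Stack (top first): <SecretB>, <SecretA>, <SigAlice>. Remaining script: OP_SIZE OP_ROT OP_SIZE OP_NIP OP_EQUAL
State 2. Stack (top first): <LengthSecretB>, <SecretB>, <SecretA>, <SigAlice>. Remaining script: OP_ROT OP_SIZE OP_NIP OP_EQUAL
State 3. Stack (top first): <SecretA>, <LengthSecretB>, <SecretB>, <SigAlice>. Remaining script: OP_SIZE OP_NIP OP_EQUAL
State 4. Stack (top first): <LengthSecretA>, <SecretA>, <LengthSecretB>, <SecretB>, <SigAlice>. Remaining script: OP_NIP OP_EQUAL
State 5. Stack (top first): <LengthSecretA>, <LengthSecretB>, <SecretB>, <SigAlice>. Remaining script: OP_EQUAL
State 6. Stack (top first): 0 or 1, <SecretB>, <SigAlice>. Remaining script: (none)
Figure 6.6: Stack state during the execution of the <Compute Winner> portion
of the lottery redeem script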
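State 1. Stack (top first): 0, <SecretB>, <SigAlice>. Remaining script: OP_IF OP_DROP <PubKeyB> OP_CHECKSIG OP_ELSE OP_DROP <PubKeyA> OP_CHECKSIG OP_ENDIF
State 2. Stack (top first): <SecretB>, <SigAlice>. Remaining script: OP_DROP <PubKeyA> OP_CHECKSIG
State 3. Stack (top first): <SigAlice>. Remaining script: <PubKeyA> OP_CHECKSIG
State 4. Stack (top first): <PubKeyA>, <SigAlice>. Remaining script: OP_CHECKSIG
State 5. Stack (top first): True/False. Remaining script: (none)
Figure 6.7: Stack state during the execution of the <Check Sig> portion of
the lottery redeem script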
party with probability 1/N) receiving N bitcoins. The protocol proceeds as
follows:
2. Both li and si are kept secret by each party. The parties exchange the
hashes hi of the secret bytestrings si .
4. The winner of the lottery will be the party with index j where
$$ j = \sum_{i=0}^{N-1} l_i \bmod N. $$
All the parties sign a funding transaction which unlocks UTXOs con-
taining one bitcoin owned by each of them and pays the sum to the
lottery winner.
5. If all the parties reveal their secrets si before the timeout, the lottery
winner claims the jackpot by spending the output in the funding trans-
action.
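The winner selection in step 4 amounts to a sum modulo N, as the small Python sketch below shows. It assumes only that each l_i is a non-negative integer contributed by party i, and the function name is hypothetical.

```python
def winner_index(values):
    """Return j, the sum of all l_i modulo the number of parties N."""
    if not values:
        raise ValueError("at least one party is required")
    return sum(values) % len(values)


print(winner_index([3, 1, 4]))
print(winner_index([16, 17, 16, 18]))
```

If at least one l_i is chosen uniformly at random modulo N and independently of the others, the index j is uniformly distributed, so each party wins with probability 1/N.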
Chapter 7
Bitcoin Development
which is titled “BIP Process, revised”.2 This BIP overrides BIP 1 which
was the original specification of the BIP workflow. Process BIPs do not
involve any code changes in the Bitcoin Core client. If there are no
objections to a draft process BIP, its status changes to active.
Figure 7.1 illustrates the various BIP status transitions. The status of a
draft BIP may be changed to deferred either by the BIP authors themselves
at any time or by the BIP editor if no progress is being
made on the BIP. A deferred BIP may be changed back to a draft BIP once
progress is made. A draft BIP may also be withdrawn by the BIP authors
at any time. The status of a draft or proposed BIP is changed to rejected if
there has been no progress for three years. The status of a final or active BIP
is changed to replaced if another BIP supersedes the feature it describes. If
2 https://github.com/bitcoin/bips/blob/master/bip-0002.mediawiki
3 https://github.com/bitcoin/bips/blob/master/bip-0050.mediawiki
4 https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki
[Figure 7.1: BIP status transitions. The surviving state labels are Active, Obsolete, Withdrawn, and Rejected.]
the feature described in a final or active BIP is no longer relevant, its status
is changed to obsolete.
blocks which violate the old size limit. The net effect is that the deploy-
ment of the block size limit increase fails.
• If the upgraded miners control the majority of the network hashrate, the
branch not containing the 9 MB block will be abandoned by them. But
this branch will not be abandoned by the non-upgraded miners as it is
the only valid branch they see. So as long as non-upgraded miners exist,
this branch will continue to be extended by them. For this reason, hard
forks like block size limit increases require all the miners to upgrade for
successful deployment.
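The two outcomes of the block size example can be reproduced with a toy model. The sketch below assumes an old limit of 1 MB and a hypothetical new limit of 10 MB, represents a branch as a list of block sizes in MB, and uses the number of blocks as a stand-in for accumulated proof of work. All names are illustrative.

```python
OLD_LIMIT_MB = 1    # limit enforced by non-upgraded miners (assumed)
NEW_LIMIT_MB = 10   # limit enforced by upgraded miners (hypothetical value)

def branch_valid(block_sizes, limit_mb):
    """A branch is valid for a miner if every block respects that miner's limit."""
    return all(size <= limit_mb for size in block_sizes)

def chosen_branch(branches, limit_mb):
    """Name of the longest branch that is valid under the given size limit."""
    valid = {name: sizes for name, sizes in branches.items()
             if branch_valid(sizes, limit_mb)}
    return max(valid, key=lambda name: len(valid[name]))

# Upgraded miners hold the majority: the branch with the 9 MB block grows faster.
majority_upgraded = {"big": [1, 9, 1, 1], "small": [1, 1, 1]}
# Upgraded miners are a minority: the branch with the 9 MB block falls behind.
minority_upgraded = {"big": [1, 9], "small": [1, 1, 1]}

split = (chosen_branch(majority_upgraded, NEW_LIMIT_MB),
         chosen_branch(majority_upgraded, OLD_LIMIT_MB))
no_split = (chosen_branch(minority_upgraded, NEW_LIMIT_MB),
            chosen_branch(minority_upgraded, OLD_LIMIT_MB))
```

In the majority case the two groups of miners settle on different branches, so the chain stays split for as long as non-upgraded miners keep mining. In the minority case both groups settle on the branch without the 9 MB block and the deployment fails.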
Soft forks refer to protocol changes which require only miners controlling
a majority of the network hashrate to upgrade in order to be successfully de-
ployed. SegWit was a soft fork change to the Bitcoin protocol. For example,
any P2WPKH output looks like an anyone-can-spend output to a miner run-
ning a pre-SegWit client. Such a miner will accept a transaction which spends
this output by providing an empty scriptSig and no witness structure as
valid. If a miner running a pre-SegWit client broadcasts a valid block contain-
ing the spending transaction, miners running pre-SegWit clients will accept
the block while miners running SegWit clients will reject the block as invalid.
The miners running SegWit clients will continue working on extending the
longest branch that is valid under SegWit rules. Miners running pre-SegWit
clients will see a blockchain fork with one branch containing the transaction
spending the P2WPKH output without providing a witness structure and the
other branch not containing this transaction. Miners running SegWit clients
will not see the blockchain fork as they do not consider the transaction spend-
ing the P2WPKH output without providing a witness structure as valid. The
subsequent events can unfold in two ways.
• If the miners running pre-SegWit clients control the majority of the net-
work hashrate, then the branch not containing the transaction spend-
ing the P2WPKH output without providing a witness structure will be
abandoned. This will prevent SegWit from being successfully deployed
on the network.
• If the miners running SegWit clients control the majority of the net-
work hashrate, then the branch containing the transaction spending the
P2WPKH output without providing a witness structure will be aban-
doned by the miners running pre-SegWit clients. This will happen for
any branch containing transactions which spend SegWit outputs with-
out providing witness structures. Thus SegWit remains successfully de-
ployed as long as miners controlling a majority of the network hashrate
run SegWit clients.
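The SegWit example can be checked with a similar toy model. In this sketch a block is reduced to a single flag saying whether it contains a transaction spending a P2WPKH output without a witness structure, and branch length stands in for accumulated proof of work. The function and variable names are assumptions made for illustration.

```python
def valid_pre_segwit(branch):
    """Pre-SegWit rules treat the P2WPKH output as anyone-can-spend,
    so every block in this simplified model is acceptable."""
    return True

def valid_segwit(branch):
    """SegWit rules reject any block holding a witness-less P2WPKH spend."""
    return not any(branch)

def chosen_branch(branches, is_valid):
    """Name of the longest branch that passes the given validity rule."""
    valid = {name: blocks for name, blocks in branches.items() if is_valid(blocks)}
    return max(valid, key=lambda name: len(valid[name]))

# True marks a block containing the witness-less spend.
segwit_majority = {"with_spend": [False, True, False],
                   "without_spend": [False, False, False, False]}
pre_segwit_majority = {"with_spend": [False, True, False, False],
                       "without_spend": [False, False, False]}

converged = (chosen_branch(segwit_majority, valid_pre_segwit),
             chosen_branch(segwit_majority, valid_segwit))
diverged = (chosen_branch(pre_segwit_majority, valid_pre_segwit),
            chosen_branch(pre_segwit_majority, valid_segwit))
```

With a SegWit majority both groups end up on the branch without the offending transaction, because that branch is also valid under the old rules. This is why a soft fork needs only a hashrate majority, unlike the hard fork case.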
To gauge miner readiness prior to activating a soft fork feature like SegWit,
BIP 9 proposed allowing miners to indicate their readiness by setting bits in
the nVersion field of blocks they mine.5 Each proposed soft fork is allotted a bit
in the nVersion field. Recall that the mining target threshold is recalculated
every 2016 blocks. This duration is called a retarget period. If at least 95%
of the blocks (≥ 1916 out of 2016) mined in a retarget period have the bit
corresponding to a soft fork set, then the soft fork is considered locked-in and
is activated at the end of the next retarget period.
5 https://github.com/bitcoin/bips/blob/master/bip-0009.mediawiki
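The lock-in rule described above amounts to counting signalling blocks in one retarget period. The sketch below is a simplified illustration: it checks only the bit assigned to the soft fork and ignores the other conditions in BIP 9 (such as start times and timeouts). The function names and the choice of bit 1 in the example are assumptions.

```python
RETARGET_PERIOD = 2016   # blocks per retarget period
LOCK_IN_THRESHOLD = 1916 # at least 95% of 2016 blocks

def signals(n_version: int, bit: int) -> bool:
    """True if the given bit of a block's nVersion field is set."""
    return (n_version >> bit) & 1 == 1

def is_locked_in(n_versions, bit: int) -> bool:
    """Apply the threshold to the nVersion values of one full retarget period."""
    if len(n_versions) != RETARGET_PERIOD:
        raise ValueError("expected the nVersion values of exactly 2016 blocks")
    count = sum(1 for v in n_versions if signals(v, bit))
    return count >= LOCK_IN_THRESHOLD

# Illustrative period: 1916 blocks signal on bit 1, the remaining 100 do not.
period = [0x20000002] * 1916 + [0x20000000] * 100
locked_in = is_locked_in(period, bit=1)
```

One signalling block fewer (1915 out of 2016) would leave the soft fork short of the threshold for that period.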
Appendix A
ECDSA Signature
Malleability
Consider the verification procedure for the signature (r, n − s) when (r, s)
is a valid signature. The point Q corresponding to (r, n − s) is given by
\begin{align*}
Q &= (n - s)^{-1}(e + kr)P \\
  &\overset{(a)}{=} (n - s^{-1})(e + kr)P \\
  &= nP - s^{-1}(e + kr)P \\
  &\overset{(b)}{=} O - s^{-1}(e + kr)P \\
  &\overset{(c)}{=} -jP \\
  &\overset{(d)}{=} (x, -y).
\end{align*}
In the above equality chain, equality (a) follows from Lemma 1, equality (b)
follows from nP = O which was argued in Section 2.4, equality (c) follows
from O being the additive identity together with s^{-1}(e + kr)P = jP for the
valid signature (r, s), and equality (d) follows from the fact that
the additive inverse of jP = (x, y) is (x, −y). Since r = x mod n, (r, n − s)
is a valid signature.
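The argument can be checked numerically on the secp256k1 curve. The sketch below is a minimal pure-Python illustration, not production code: it follows the notation above (base point P, private key k, nonce j, hash value e), uses a fixed nonce purely for reproducibility, and requires Python 3.8 or later for modular inverses via pow. The helper names add, mul, sign and verify are assumptions.

```python
import hashlib

# secp256k1 domain parameters: field prime, group order n, base point P.
FIELD_P = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
P = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(A, B):
    """Add two curve points; None represents the identity O."""
    if A is None:
        return B
    if B is None:
        return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return None  # A + (-A) = O
    if A == B:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, FIELD_P)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P)
    lam %= FIELD_P
    x3 = (lam * lam - x1 - x2) % FIELD_P
    return (x3, (lam * (x1 - x3) - y1) % FIELD_P)

def mul(m, A):
    """Scalar multiplication mA by double-and-add."""
    R = None
    while m:
        if m & 1:
            R = add(R, A)
        A = add(A, A)
        m >>= 1
    return R

def sign(e, k, j):
    """Return (r, s) with r = x mod n for jP = (x, y), s = j^{-1}(e + kr) mod n."""
    x, _ = mul(j, P)
    r = x % n
    s = pow(j, -1, n) * (e + k * r) % n
    return r, s

def verify(e, K, r, s):
    """Check that Q = s^{-1}(eP + rK) has x-coordinate equal to r mod n."""
    if not (1 <= r < n and 1 <= s < n):
        return False
    w = pow(s, -1, n)
    Q = add(mul(e * w % n, P), mul(r * w % n, K))
    return Q is not None and Q[0] % n == r

k = 0x1234                      # illustrative private key
j = 0x5678                      # fixed nonce, for illustration only
K = mul(k, P)                   # public key kP
e = int.from_bytes(hashlib.sha256(b"message").digest(), "big")
r, s = sign(e, k, j)
both_valid = verify(e, K, r, s) and verify(e, K, r, n - s)
```

Both (r, s) and (r, n − s) pass the same verification routine, so anyone who sees a valid signature can produce a second valid signature without knowing the private key.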
Appendix B
Probability of a successful
double spending attack