MIT 6.824 - Lecture 19 - Bitcoin
Bitcoin is a digital currency for making online payments. I'll start this post by making a
case for digital currencies, before describing Bitcoin and how it solves the double-
spending problem.
Table of Contents
Digital Currencies
Bitcoin
Limitations of the model so far
An attacker can steal a private key
The current owner can double-spend
Addressing double-spend
Publishing to a log
The Bitcoin blockchain
Adding a new block
Validation Checks
Temporary double-spending is possible
Scenario #1
Scenario #2
Peers will abandon the shorter branch
FAQ
Where are new coins from?
Can an attacker change an existing block in the middle of the blockchain?
Can an attacker start a fork from an old block?
Conclusion
Resources
Digital Currencies #
Most e-commerce today relies on trusted third parties like banks and
other financial institutions to process payments. These third parties offer some
protection from fraud at the cost of increased transaction fees.
But with digital currencies like Bitcoin, two people can make payments directly without
a trusted third party involved. These payments rely on other methods to prevent fraud,
such as cryptographic proofs. Digital currencies offer several advantages, including:
Digital currencies, however, come with their own technical and social challenges. One major
technical challenge is in solving the double-spending problem, i.e., how does one
prevent a digital coin from being spent more than once?
Bitcoin #
Bitcoin is a decentralized digital currency for making payments. Decentralized here
means there is no single authority or entity like a central bank on the Bitcoin network.
The network is run by many peers, which are computers that collaborate to make any
necessary decisions. Anyone can add a peer to the network.
The Bitcoin network comprises coins, each owned by someone. A coin is a chain of
transaction records, where each record represents one transfer of the coin from its
owner to a new owner as payment. The latest transaction record in the chain shows the
coin's current owner.
Each coin owner has a public/private key pair which the network uses to verify the
integrity of transactions. I'll go over how that works soon, but you can read the
previous post on Certificate Transparency or this article on public-key cryptography for
more detail.
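When the current owner of a coin wants to transfer the coin to a new owner, they
create a transaction record which contains the new owner's public key, a hash of the
coin's previous transaction record, and a signature made with the current owner's
private key.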
This information allows the new owner to verify that it received the coin from the right
owner. To illustrate this, if a user Y owns a coin that they received from user X, the
latest transaction record in the coin will look like:
Figure 1: Let the latest transaction be T1, the record will then contain the hash for T0.
If user Y then transfers the same coin to another user Z in a new transaction, T2, the
coin will have a chain which now includes a new transaction record:
User Z will verify that the coin actually belongs to user Y by checking that the
signature in T2 verifies against the public key recorded in T1.
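The chain of records and the check that user Z performs can be sketched as follows. This is a minimal illustration, not Bitcoin's actual record format: the field names, the `transfer` and `verify` helpers, and the "genesis" starting record are all hypothetical, and the signatures use toy RSA keys built from tiny textbook primes (Bitcoin itself uses ECDSA signatures).

```python
import hashlib
from dataclasses import dataclass

# Toy RSA keys from tiny primes. Insecure, for illustration only.
# A public key is (n, e); the matching private key is (n, d).
X_PUBLIC, X_PRIVATE = (3233, 17), (3233, 2753)  # n = 61 * 53
Y_PUBLIC, Y_PRIVATE = (2773, 17), (2773, 157)   # n = 47 * 59
Z_PUBLIC = (2881, 17)                           # n = 43 * 67


@dataclass(frozen=True)
class Transaction:
    new_owner_public: tuple  # public key of the new owner
    prev_hash: str           # hash of the coin's previous transaction record
    signature: int           # made with the previous owner's private key


def tx_hash(tx):
    """Hash that the next record in the chain will point to."""
    data = f"{tx.new_owner_public}|{tx.prev_hash}|{tx.signature}".encode()
    return hashlib.sha256(data).hexdigest()


def digest(new_owner_public, prev_hash, n):
    """Reduce the signed fields to a number smaller than the modulus n."""
    data = f"{new_owner_public}|{prev_hash}".encode()
    return int(hashlib.sha256(data).hexdigest(), 16) % n


def transfer(prev_tx, owner_private, new_owner_public):
    """The current owner signs a new record that hands the coin on."""
    n, d = owner_private
    prev = tx_hash(prev_tx)
    signature = pow(digest(new_owner_public, prev, n), d, n)
    return Transaction(new_owner_public, prev, signature)


def verify(prev_tx, tx):
    """Check tx against the public key stored in the previous record."""
    n, e = prev_tx.new_owner_public
    expected = digest(tx.new_owner_public, tx.prev_hash, n)
    return tx.prev_hash == tx_hash(prev_tx) and pow(tx.signature, e, n) == expected


t0 = Transaction(X_PUBLIC, "genesis", 0)  # hypothetical record that gave X the coin
t1 = transfer(t0, X_PRIVATE, Y_PUBLIC)    # X pays Y
t2 = transfer(t1, Y_PRIVATE, Z_PUBLIC)    # Y pays Z
print(verify(t1, t2))                     # True
```

Note that `verify` only needs public information: the previous record and the new record. The private key never leaves its owner.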
Limitations of the model so far #
An attacker can steal a private key #
As shown in Figure 2, the current owner of a coin uses their private key to sign the next
transaction. If an attacker steals the private key from the owner's computer, they can
spend the coin. This is a real possibility and is a hard problem to solve.
The current owner can double-spend #
In the previous example, there's nothing stopping user Y from spending the same coin
on both user Z and another user Q. User Y can create two transactions that both
reference the hash of the same previous record.
Trusted third parties like banks shine here since they can protect a payee from a
double-spending payer. But if Bitcoin is to operate without a third party, a payee needs
to know that the previous owner of a coin did not sign any earlier transactions.
Addressing double-spend #
Similar to Certificate Transparency, we could introduce a log which all peers must
publish transactions to, ensuring that all the transactions in the network are visible to
all peers. This log must have the same requirements as a certificate log:
If all the transactions are visible to all peers and no peer can delete a transaction, a
peer will detect when a coin has been previously spent on an earlier transaction.
Using the above double-spending example where a user Y attempts to spend the same
coin for users Z and Q, these requirements will ensure that if Y→Z happened before
Y→Q, then:
User Z will see Y→Z came before Y→Q and will accept Y→Z.
User Q will see Y→Z came before Y→Q and will reject Y→Q.
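As a rough sketch of why such a log is enough, assume every spend names the hash of the record it spends from. The `TransactionLog` class and its method names below are hypothetical, and the sketch ignores how peers agree on the log's contents, which is the hard part discussed next.

```python
class TransactionLog:
    """Append-only log in which the first spend of a record wins."""

    def __init__(self):
        self._entries = []  # ordered (prev_hash, payee) pairs, never deleted
        self._spent = {}    # prev_hash -> position of the entry that spent it

    def append(self, prev_hash, payee):
        """Record a spend; refuse it if prev_hash was already spent."""
        if prev_hash in self._spent:
            return False
        self._spent[prev_hash] = len(self._entries)
        self._entries.append((prev_hash, payee))
        return True

    def entries(self):
        """Read-only view of the log, in order."""
        return tuple(self._entries)


log = TransactionLog()
print(log.append("hash(T1)", "Z"))  # True: Y→Z is the first spend of T1
print(log.append("hash(T1)", "Q"))  # False: Y→Q reuses the same record
```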
Publishing to a log #
One challenge with this log approach is in determining what gets published to the log
and in what order. For example, we could have a central log server, or a leader that
decides the order of transactions as in Raft, but this goes against Bitcoin's
decentralization goal.
Another way to manage the log is to send new transactions to all peers and have them
vote on which transaction to append to the log, with the majority winning the vote. To
determine the majority, we could count one vote per IP address, but an attacker can
forge IP addresses to claim a majority and vote multiple times. The impact of this will
be that:
When user Z asks, the attacker's majority says, "Y→Z is in the log before Y→Q"
When user Q asks, the attacker's majority says, "Y→Q is in the log before Y→Z"
Bitcoin addresses the double-spending problem by using a blockchain, and I'll describe
that next.
The Bitcoin blockchain #
Each peer in the network has a complete copy of the chain. When a peer wants to add
a new block to the chain, it broadcasts the block to all the peers. Any new transactions
also get flooded to all the peers. All the blocks and transactions on the Bitcoin network
are publicly visible; the Blockchain Explorer listed under Resources shows them.
When you make a payment, the payee won't accept it until the transaction is in the
blockchain. And since all transactions are in the blockchain, the payee will find out if
the coin has been spent before.
The challenge posed in the previous section was on determining what gets added to
the log. The next section will cover Bitcoin's approach.
Adding a new block #
When a peer receives new transactions, it collects them into a block. Before it can add
the block to the blockchain, it needs to do actual CPU work. This is called a proof-of-
work.
To explain this, assume we have a peer S with an unpublished block containing a list of
transactions and the previous block's hash. S needs to create a hash to identify the
unpublished block using its contents, and that involves solving a hard computational
puzzle. The puzzle is this:
Given that peer S has a list of transactions (let's call this l) and the previous
block's hash (hp), S must find a value x such that when it applies a hash function
to the combination of l, hp, and x, it gets an output that begins with a long run of
zeros.
hash(l + hp + x) = 000000000000000...
This value of x is known as the nonce and the difficult process of finding this nonce is
called mining.
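The puzzle can be sketched in a few lines. This is a simplified model under stated assumptions: it hashes a text encoding of the block once with SHA-256 and counts leading zero hex digits, whereas real Bitcoin hashes a binary block header twice and compares the result against a numeric target. The function names are hypothetical.

```python
import hashlib


def block_hash(transactions, prev_hash, nonce):
    """hash(l + hp + x) from the text, using SHA-256 over a text encoding."""
    data = "|".join(transactions) + "|" + prev_hash + "|" + str(nonce)
    return hashlib.sha256(data.encode()).hexdigest()


def mine(transactions, prev_hash, zero_digits):
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * zero_digits
    nonce = 0
    while not block_hash(transactions, prev_hash, nonce).startswith(prefix):
        nonce += 1
    return nonce


transactions = ["X->Y", "Y->Z"]
prev_hash = "00ab"  # placeholder for the previous block's hash
nonce = mine(transactions, prev_hash, zero_digits=3)
print(nonce, block_hash(transactions, prev_hash, nonce))
```

Each extra zero hex digit multiplies the expected number of attempts by 16, while checking a proposed nonce still takes a single hash. That asymmetry is what makes the proof cheap to verify and costly to produce.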
The exact number of leading zeros required in the output varies with the speed with
which peers generate new blocks. If the peers are generating new blocks too quickly,
the network will make the proof-of-work more difficult by increasing the number of
leading zeros required.
Finding this proof-of-work for a block is a costly operation as it requires major CPU
power, and so this system essentially limits peers to one vote per CPU.
The difficulty is tuned so that the network as a whole mines a new block about every
10 minutes on average, which means that the parties involved in a transaction have to
wait about 10 minutes before it appears on the blockchain.
A peer broadcasts a block to all peers after finding its proof of work.
Validation Checks #
When a peer receives a new block, it validates the block by checking, for each
transaction in the block, that:
No other transaction has spent the same previous transaction (recall that a
transaction record contains a hash of the previous transaction).
The transaction's signature was made by the private key that matches the public key
in the previous transaction, as described earlier.
If the block is valid, the peer shows its acceptance by working on creating the next
block in the chain—using the accepted block's hash as the previous hash.
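A sketch of these two checks, assuming each transaction is reduced to a `(prev_tx_hash, payee)` pair and that the signature check is supplied by the caller. Both `validate_block` and the `signature_ok` callback are hypothetical names; a real peer also verifies the block's proof-of-work and its link to the previous block.

```python
def validate_block(transactions, already_spent, signature_ok):
    """Return True if no transaction double-spends and every signature checks out.

    transactions:  list of (prev_tx_hash, payee) pairs in the new block
    already_spent: set of prev_tx_hash values spent earlier in the chain
    signature_ok:  caller-supplied function (prev_tx_hash, payee) -> bool
    """
    seen = set(already_spent)
    for prev_tx_hash, payee in transactions:
        if prev_tx_hash in seen:
            return False  # same previous transaction spent twice
        if not signature_ok(prev_tx_hash, payee):
            return False  # not signed by the previous record's owner
        seen.add(prev_tx_hash)
    return True


def always_ok(prev_tx_hash, payee):
    """Stand-in signature check that accepts everything."""
    return True


print(validate_block([("t1", "Z")], {"t0"}, always_ok))               # True
print(validate_block([("t1", "Z"), ("t1", "Q")], {"t0"}, always_ok))  # False
```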
Temporary double-spending is possible #
Scenario #1 #
One possible scenario in the Bitcoin network is that two peers, A and B, are both mining
the next block in the chain, each find a nonce at about the same time, and broadcast
their blocks to all peers. Because of network delays, some peers accept A's block
before B's block, while others accept B's block first.
This causes a fork on the blockchain where it now has two branches.
Scenario #2 #
Another possible scenario is that a peer sends one transaction to a subset of peers
and another one to a different subset using the same coin.
For example, a peer Y could tell some peers about a transaction Y→Z and others about
Y→Q, which both use the same coin. This will create a fork as illustrated below.
Peers will abandon the shorter branch #
When a peer receives two different successors to the same block, it will start working
on the first one it received but save the other branch in case it gets longer. A branch
will get longer when a peer finds the next proof-of-work.
If a peer is working on a branch and sees that another branch has gotten longer, it will
abandon its current branch and switch to the longer one. This will also cause any
transactions on the shorter branch to get abandoned.
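The switching rule can be sketched as below, modelling a branch as a list of blocks and a block as a list of transaction labels. The helper names are hypothetical, and the sketch compares branches by block count as described here; real Bitcoin peers compare the total proof-of-work in each branch.

```python
def choose_branch(current, candidates):
    """Keep the current branch unless a candidate is strictly longer."""
    best = current
    for branch in candidates:
        if len(branch) > len(best):
            best = branch
    return best


def abandoned_transactions(old_branch, new_branch):
    """Transactions that were in the old branch but are not in the new one."""
    kept = {tx for block in new_branch for tx in block}
    return [tx for block in old_branch for tx in block if tx not in kept]


common = [["X->Y"]]
branch_a = common + [["Y->Z"]]              # the branch this peer is working on
branch_b = common + [["Y->Q"], ["Q->R"]]    # a competing branch that grew longer

best = choose_branch(branch_a, [branch_b])
print(best is branch_b)                            # True
print(abandoned_transactions(branch_a, branch_b))  # ['Y->Z']
```

A tie leaves the peer on the branch it saw first, which matches the behaviour described above.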
During this period where different peers are seeing different branches of the chain, an
attacker will be able to double-spend a coin. But what makes Bitcoin work is that the
shorter branch will eventually get abandoned and only one transaction will remain on
the blockchain.
This possibility of a fork is why careful Bitcoin clients wait until there are a few
successor blocks (typically six) to the one that contains their transaction before
believing it was successful. If a block has many successor blocks, it is unlikely that a
dubious fork will overtake it.
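To put a rough number on "unlikely", the original Bitcoin paper (listed under Resources) models the race as a gambler's-ruin problem: an attacker who controls a fraction q of the mining power and is z blocks behind ever catches up with probability (q/p)^z, where p = 1 - q, or with certainty if q ≥ p. The function below is a sketch of that simplified formula only; the name is hypothetical, and the paper's full analysis also accounts for blocks the attacker mines while the payee is waiting.

```python
def catch_up_probability(attacker_share, blocks_behind):
    """Chance that an attacker ever catches up from blocks_behind blocks back."""
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # an attacker with half the power or more always catches up
    return (q / p) ** blocks_behind


for z in (1, 3, 6):
    print(z, catch_up_probability(0.1, z))
```

With 10% of the mining power the chance falls by a factor of nine for every additional successor block, which is why waiting for several blocks is an effective defence.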
FAQ #
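Where are new coins from? #
When a peer mines a block and the other peers accept it, the peer receives a reward of
newly created bitcoins (6.25 bitcoins at the time of writing; the reward halves roughly
every four years). This is an incentive for people to operate Bitcoin peers.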
Can an attacker change an existing block in the middle of the blockchain? #
An attacker may want to do this so they can delete a spend of their coin from the
blockchain and spend it again.
Bitcoin prevents this because each block stores the hash of the block before it.
Deleting a transaction changes that block's hash, so it no longer matches the previous
hash stored in the next block, and the peers will detect the mismatch.
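This hash-linking argument can be checked with a small sketch. It assumes a block is just a list of transaction labels plus the previous block's hash, and it leaves out the nonce for brevity. The helper names are hypothetical.

```python
import hashlib


def block_hash(transactions, prev_hash):
    """Hash of a block's contents, including the link to its predecessor."""
    data = "|".join(transactions) + "|" + prev_hash
    return hashlib.sha256(data.encode()).hexdigest()


def build_chain(blocks_of_transactions):
    """Link each block to the hash of the block before it."""
    chain, prev = [], "0" * 64
    for transactions in blocks_of_transactions:
        chain.append({"transactions": list(transactions), "prev_hash": prev})
        prev = block_hash(transactions, prev)
    return chain


def first_broken_link(chain):
    """Index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        before = chain[i - 1]
        expected = block_hash(before["transactions"], before["prev_hash"])
        if chain[i]["prev_hash"] != expected:
            return i
    return None


chain = build_chain([["X->Y"], ["Y->Z"], ["Z->Q"]])
print(first_broken_link(chain))           # None: the chain is intact
chain[1]["transactions"].remove("Y->Z")   # attacker deletes a spend
print(first_broken_link(chain))           # 2: the next block's link is broken
```

To hide the change, the attacker would have to recompute the hash of every later block, which in the real system means redoing the proof-of-work for each of them.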
Can an attacker start a fork from an old block? #
If an attacker wants to double-spend a coin, they may start a fork from the block that
precedes the one with the first spend of the coin and mine new blocks until the forked
branch is the longest branch of the chain.
For this to be successful, the attacker must have enough CPU power to come from
behind and mine blocks faster than all the honest peers.
If the attacker can create the longest branch, everyone will switch to it and so the
attacker can double-spend a coin. But if an attacker has that much CPU power to mine
blocks faster than all the honest peers, they might as well use it to generate new coins
instead of reusing an old one.
Conclusion #
There's a lot more to learn about Bitcoin and I've offered a one-sided view so far, but
my goal here is to present an idea of how it works. Some downsides of using Bitcoin
are that the proof-of-work takes too much power and the 10-minute confirmation wait
is too long, among other points.
Resources #
Bitcoin: A Peer-to-Peer Electronic Cash System - Original Bitcoin paper.
How the Bitcoin protocol actually works by Michael Nielsen.
Blockchain Explorer - View the latest blocks and transactions in the network.
Proof of work - Further explanation of the Proof of work and what makes it difficult.
Lecture 19: Bitcoin - MIT 6.824 lecture notes.
Bitcoin FAQ - Additional material from 6.824.