Module 1 - Introduction To Crypto Currencies
Figure 1.2 (diagram: possible inputs mapped to possible
outputs). Because the number of inputs exceeds the number of
outputs, we are guaranteed that there must be at least one output
to which the hash function maps more than one input.
Property 1: Collision‐resistance
Now, to make things even worse, we said that it has to be
impossible to find a collision. Yet, there are methods that are
guaranteed to find a collision.
Consider the following simple method for finding a collision
for a hash function with a 256‐bit output size: pick
2^256 + 1 distinct values, compute the hash of each of them,
and check whether any two outputs are equal. Since we
picked more inputs than possible outputs, some pair of them
must collide when you apply the hash function.
In fact, if we randomly choose just 2^130 + 1 inputs, it turns
out there’s a 99.8% chance that at least two of them are
going to collide. The fact that we can find a collision by only
examining roughly the square root of the number of possible
outputs results from a phenomenon in probability known as
the birthday paradox.
For a hash function with a 256‐bit output, you would have
to compute the hash function 2^256 + 1 times in the worst case,
and about 2^128 times on average.
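The square-root effect is easy to observe on a deliberately weakened hash. The sketch below is only an illustration: it assumes a toy hash made from the first 3 bytes (24 bits) of SHA-256, and the names truncated_hash and find_collision are hypothetical. With 2^24 possible outputs, a collision typically turns up after a few thousand inputs, far fewer than 2^24.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, n_bytes: int = 3) -> bytes:
    """Toy hash (an assumption for this demo): first n_bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 3):
    """Hash distinct inputs until two of them share an output."""
    seen = {}  # maps each output seen so far to the input that produced it
    for i in count():
        x = str(i).encode()
        h = truncated_hash(x, n_bytes)
        if h in seen:
            # Return both colliding inputs and how many inputs were hashed.
            return seen[h], x, i + 1
        seen[h] = x

a, b, tries = find_collision()
print(f"{a!r} and {b!r} collide after hashing {tries} inputs")
```

The same search against the full 256‐bit output would need on the order of 2^128 hashes, which is why it is considered infeasible.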
If a computer calculates 10,000 hashes per second, it would
take more than one octillion (10^27) years to calculate 2^128
hashes!
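The estimate can be checked with a few lines of arithmetic. This is a rough sketch that assumes a 365‐day year; the constant names are illustrative.

```python
HASHES_PER_SECOND = 10_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year

# Time needed to compute 2^128 hashes at the stated rate.
years = 2**128 / HASHES_PER_SECOND / SECONDS_PER_YEAR
print(f"{years:.2e} years")  # roughly 1.08e+27, just over one octillion
```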
Consider, for example, the following hash function:
H(x) = x mod 2^256
This function meets our requirements of a hash function as it
accepts inputs of any length, returns a fixed‐size output
(256 bits), and is efficiently computable. But this function
also has an efficient method for finding a collision. Notice
that this function just returns the last 256 bits of the input.
One collision then would be the values 3 and 3 + 2^256.
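A quick check of that collision, treating inputs as non‐negative integers (a minimal sketch; Python integers have arbitrary precision, so 2^256 is exact):

```python
def H(x: int) -> int:
    """The insecure example hash: keep only the low 256 bits of x."""
    return x % 2**256

a = 3
b = 3 + 2**256

# Different inputs, identical outputs: a collision found with no search at all.
print(a != b, H(a) == H(b))  # True True
```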
In some cases, such as the old MD5 hash function, collisions
were eventually found after years of work, leading the
function to be deprecated and phased out of practical use.
For the hash functions we rely on in practice, people have
tried hard to find collisions and have not yet succeeded, and
so we choose to believe that those are collision resistant.
Application: Message digests
If we know that two inputs x and y to a collision‐resistant
hash function H are different, then it’s safe to assume that
their hashes H(x) and H(y) are different — if someone
knew an x and y that were different but had the same hash,
that would violate our assumption that H is collision
resistant.
This argument allows us to use hash outputs as a
message digest.
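Consider SecureBox, an authenticated online file storage
system that allows users to upload files and ensure their
integrity when they download them. Suppose Alice uploads a
large file and later wants to verify that the file she
downloads is the same one she uploaded, without keeping a
local copy to compare against.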
Collision‐free hashes provide an elegant and efficient
solution to this problem. Alice just needs to remember the
hash of the original file.
When she later downloads the file from SecureBox, she
computes the hash of the downloaded file and compares it to
the one she stored.
If the hashes are the same, then she can conclude that the file
is indeed the one she uploaded, but if they are different, then
Alice can conclude that the file has been tampered with.
The hash serves as a fixed length digest, or
unambiguous summary, of a message.
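Alice's check can be sketched as follows. SecureBox is only the example service named above, so the file contents are stand‐in byte strings, SHA‐256 is assumed as the hash, and digest and is_intact are hypothetical helper names.

```python
import hashlib

def digest(data: bytes) -> str:
    """Fixed-length digest of a file's contents (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()

def is_intact(downloaded: bytes, expected_digest: str) -> bool:
    """True if the downloaded contents hash to the digest Alice remembered."""
    return digest(downloaded) == expected_digest

original = b"contents of Alice's large file"
stored_digest = digest(original)  # Alice keeps only these 32 bytes (as hex)

print(is_intact(original, stored_digest))                 # True
print(is_intact(b"tampered contents", stored_digest))     # False
```

The digest is 256 bits long no matter how large the file is, which is what makes it cheaper to remember than the file itself.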
Property 2: Hiding
The hiding property asserts that if we’re given the output of
the hash function y = H(x), there’s no feasible way to figure
out what the input, x, was.
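Consider the following simple example: we’re going to do
an experiment where we flip a coin. If the result of the coin
flip was heads, we’re going to announce the hash of the
string “heads”. If the result was tails, we’re going to
announce the hash of the string “tails”. Someone who sees
the announced hash can simply compute the hashes of both
strings and see which one matches, so hiding can only hold
when the input comes from a set that is very spread out.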
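The coin‐flip experiment shows why a small input space defeats hiding. In this sketch (SHA‐256 assumed as H, recover_flip a hypothetical name), an observer recovers the flip just by hashing both candidate strings:

```python
import hashlib

def H(s: str) -> str:
    return hashlib.sha256(s.encode()).hexdigest()

def recover_flip(announced: str) -> str:
    """Try every possible input; with only two candidates this is trivial."""
    for guess in ("heads", "tails"):
        if H(guess) == announced:
            return guess
    raise ValueError("announced value is not the hash of heads or tails")

announced = H("tails")          # what the experimenter publishes
print(recover_flip(announced))  # tails
```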
Hiding. A hash function H is hiding if: when a secret value r
is chosen from a probability distribution that has high
min‐entropy, then given H(r ‖ x) it is infeasible to find x.
In information theory, min‐entropy is a measure of
how predictable an outcome is, and high
min‐entropy captures the intuitive idea that the distribution
(i.e., random variable) is very spread out.
For a concrete example, if r is chosen uniformly from among
all of the strings that are 256 bits long, then any particular
string was chosen with probability 1/2^256, which is an
infinitesimally small value.
Application: Commitments
A commitment is the digital analog of taking a value,
sealing it in an envelope, and putting that envelope out on
the table where everyone can see it.
When you do that, you’ve committed yourself to what’s
inside the envelope. But you haven’t opened it, so even
though you’ve committed to a value, the value remains a
secret from everyone else.
Later, you can open the envelope and reveal the value that
you committed to earlier.
Commitment scheme
A commitment scheme consists of two algorithms:
com := commit(msg, nonce): The commit function takes
a message and a secret random value, called a nonce, as input
and returns a commitment.
verify(com, msg, nonce): The verify function takes a
commitment, message, and nonce as input. It returns true if
com == commit(msg, nonce) and false otherwise.
We require that the following two security properties
hold:
Hiding: Given com, it is infeasible to find msg.
Binding: It is infeasible to find two pairs (msg, nonce)
and (msg’, nonce’) such that msg ≠ msg’ and
commit(msg, nonce) == commit(msg’, nonce’).
Every time you commit to a value, it is important that you
choose a new random value nonce. In cryptography, the
term nonce is used to refer to a value that can only be used
once.
Consider the following commitment scheme:
commit(msg, nonce) := H(nonce ‖ msg), where nonce
is a random 256‐bit value.
To commit to a message, we generate a random 256‐bit
nonce. Then we concatenate the nonce and the message and
return the hash of this concatenated value as the
commitment. To verify, someone will compute this same
hash of the nonce they were given concatenated with the
message. And they will check whether that’s equal to the
commitment that they saw.
If we substitute the instantiation of commit and verify as
well as H(nonce ‖ msg) for com , then these properties
become:
Hiding: Given H(nonce ‖ msg), it is infeasible to find
msg.
Binding: It is infeasible to find two pairs (msg, nonce)
and (msg’, nonce’) such that msg ≠ msg’ and H(nonce ‖
msg) == H(nonce’ ‖ msg’).
The hiding property of commitments is exactly the
hiding property that we required for our hash functions. If
nonce was chosen as a random 256‐bit value then the hiding
property says that if we hash the concatenation of nonce and
the message, then it’s infeasible to recover the message from
the hash output.
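The scheme commit(msg, nonce) := H(nonce ‖ msg) can be sketched in a few lines. This is a minimal illustration, assuming SHA‐256 as H, byte strings for messages, and Python's secrets module as the source of the random 256‐bit nonce.

```python
import hashlib
import secrets

def commit(msg: bytes, nonce: bytes) -> bytes:
    """com := H(nonce || msg), with SHA-256 assumed as H."""
    return hashlib.sha256(nonce + msg).digest()

def verify(com: bytes, msg: bytes, nonce: bytes) -> bool:
    """Recompute the commitment from the revealed values and compare."""
    return com == commit(msg, nonce)

nonce = secrets.token_bytes(32)   # fresh random 256-bit nonce for this commitment
com = commit(b"heads", nonce)     # publish com; keep msg and nonce secret

# Later, reveal (msg, nonce) so anyone can open the "envelope".
print(verify(com, b"heads", nonce))  # True
print(verify(com, b"tails", nonce))  # False
```

Because the nonce is drawn from a high min‐entropy distribution, the trick of hashing every candidate message and comparing no longer reveals which message was committed.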
Property 3: Puzzle friendliness
Puzzle friendliness. A hash function H is said to be
puzzle‐friendly if for every possible n‐bit output value y, if
k is chosen from a distribution with high min‐entropy, then
it is infeasible to find x such that H(k ‖ x) = y in time
significantly less than 2^n.
Intuitively, this means that if someone wants to target
the hash function to come out to some particular output
value y, and part of the input is chosen in a
suitably randomized way, then it’s very difficult to find another
value that hits exactly that target.
Application: Search puzzle
We’re going to build a search puzzle, a mathematical
problem which requires searching a very large space in
order to find the solution. In particular, a search puzzle has
no shortcuts. That is, there’s no way to find a valid solution
other than searching that large space.
Search puzzle. A search puzzle consists of
a hash function, H,
a value, id (which we call the puzzle‐ID), chosen from a
high min‐entropy distribution, and
a target set, Y.
A solution to this puzzle is a value, x, such that
H(id ‖ x) ∈ Y.
If the hash function H is puzzle‐friendly, this implies that there’s
no solving strategy for this puzzle which is much better than
just trying random values of x. And so, if we want to pose a
puzzle that’s difficult to solve, we can do it this way as long
as we can generate puzzle‐IDs in a suitably random way.
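A small search puzzle can be sketched as below. Several choices here are assumptions made for illustration: SHA‐256 is used as H, the target set Y is taken to be all outputs whose first difficulty_bits bits are zero, x is encoded as an 8‐byte integer, and the solver tries x = 0, 1, 2, ... rather than random values (for a puzzle‐friendly hash the order of guesses should not matter).

```python
import hashlib
import secrets
from itertools import count

def in_target_set(puzzle_id: bytes, x: int, difficulty_bits: int) -> bool:
    """True if H(id || x) falls in Y, i.e. is below 2^(256 - difficulty_bits)."""
    h = hashlib.sha256(puzzle_id + x.to_bytes(8, "big")).digest()
    return int.from_bytes(h, "big") < (1 << (256 - difficulty_bits))

def solve(puzzle_id: bytes, difficulty_bits: int) -> int:
    """Search for a solution by brute force; no shortcut is known."""
    for x in count():
        if in_target_set(puzzle_id, x, difficulty_bits):
            return x

puzzle_id = secrets.token_bytes(32)      # high min-entropy puzzle-ID
x = solve(puzzle_id, difficulty_bits=16) # about 2^16 hashes expected
print(x, in_target_set(puzzle_id, x, 16))
```

Checking a proposed solution costs one hash, while finding one takes about 2^difficulty_bits hashes on average, so the size of Y controls how hard the puzzle is.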
SHA‐256:
If we can build a hash function that works on fixed‐length
inputs, there’s a generic method to convert it into a hash
function that works on arbitrary‐length inputs. It’s called the
Merkle‐Damgård transform.
SHA‐256 is one of a number of commonly used hash
functions that make use of this method. In common
terminology, the underlying fixed‐length collision‐resistant
hash function is called the compression function.
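The general shape of the transform can be sketched as follows. This is a toy illustration, not SHA‐256 itself: the compression function here is a stand‐in built from hashlib's SHA‐256, the all‐zero IV is hypothetical, and the padding (a 0x80 byte, zero bytes, then the 8‐byte message bit length) is one common choice assumed for the example.

```python
import hashlib

BLOCK_SIZE = 64          # bytes per message block
STATE_SIZE = 32          # bytes of chaining value, and of the final output
IV = bytes(STATE_SIZE)   # hypothetical all-zero initialization vector

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in fixed-length compression function (not SHA-256's real one)."""
    assert len(state) == STATE_SIZE and len(block) == BLOCK_SIZE
    return hashlib.sha256(state + block).digest()

def pad(msg: bytes) -> bytes:
    """Pad to a multiple of BLOCK_SIZE, encoding the message length at the end."""
    zeros = (-(len(msg) + 1 + 8)) % BLOCK_SIZE
    return msg + b"\x80" + bytes(zeros) + (8 * len(msg)).to_bytes(8, "big")

def md_hash(msg: bytes) -> bytes:
    """Feed the padded message through compress one block at a time."""
    state = IV
    padded = pad(msg)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state

print(md_hash(b"").hex())
print(md_hash(b"a message of arbitrary length " * 10).hex())
```

Each call to compress takes a fixed 96 bytes of input, yet md_hash accepts a message of any length and always returns 32 bytes.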