Gf45 Final Report
September 2019
Abstract
This paper surveys the theory underlying Zero Knowledge Proofs. We develop the idea of interactive proving protocols, which we formalize first as Interactive Proofs and then as more complex knowledge-withholding schemes, namely Perfect and Computational Zero Knowledge Proofs. Along the way we present an interactive proof for Graph Non-Isomorphism (a co-NP problem), and zero knowledge proofs for both Graph Isomorphism and Graph 3-Colourability. Finally, we detail some applications of the theory, showing various "flavours" of ZKP for the Discrete Logarithm problem that culminate in Schnorr Signatures. We build on this to obtain the Secure Remote Password protocol, which uses a similar ZKP to authenticate a user without communicating their password to the server. We close with a very simple Zero Knowledge Range Proof, an instance of a very general class that has a wide range of applications and is currently one of the most active avenues of research. The paper also introduces Sagemath worksheets that demonstrate nearly all of the protocols detailed here.
Declaration
I declare that the material submitted for assessment is my own work except where credit is explicitly given to others by citation or acknowledgement. This work was performed during the current academic year except where otherwise stated. The main text of this project report is 14896 words long, including project specification and plan. In submitting this project report to the University of St Andrews, I give permission for it to be made available for use in accordance with the regulations of the University Library. I also give permission for the report to be made available on the Web, for this work to be used in research within the University of St Andrews, and for any software to be released on an open source basis. I retain the copyright in this work, and ownership of any resulting intellectual property.
Contents
1 Introduction
2 Structure
3 Motivation
4 Preliminaries
  4.1 Mathematical Notation
  4.2 Computational Setting
  4.3 Adversarial Settings
  4.4 Notion of Proof
5 Proof Requirements
  5.1 Parties in play
  5.2 Soundness
  5.3 Completeness
  5.4 Zero Knowledge
6 Proof Systems
  6.1 Interactive Turing Machine
  6.2 Interactive Proof Systems
  6.3 An Example: GNI
7 Zero Knowledge Proof
8 Applications
  8.1 Schnorr Signatures
  8.2 Secure Remote Password
  8.3 Zero Knowledge Range Proofs
9 Implementation
  9.1 How to run it
  9.2 Implementation Details
References
1 Introduction
The concept of proof is one that resonates quite strongly throughout society. Having positive verification that an untrusted party's claim actually holds opens up a variety of avenues for collaboration and enquiry. This paper is a survey of the theory of a particular variety of proofs, namely Zero Knowledge Proofs. We will build an understanding of Interactive Proofs, which are processes in which a verifier is successively convinced by a prover. The fundamental idea is that the proofs we consider are not fixed proofs as in Mathematics^1, but instead interactive processes, in which a verifier poses questions (or challenges) to the prover and is progressively convinced by correct answers. The general problem that we look at is the following. Suppose that we have some object x such that the proposition P(x) holds. How do we convince a skeptic of this fact? One way would be to just reveal x, but of course this might not be desirable for a variety of reasons. For example, x might be a password, or contain other sensitive information. However, if P(·) has enough structure, we are able to devise a protocol in which the verifier can ask questions that the prover can only consistently answer if it indeed has x, and from which the verifier learns nothing about such an x, except the fact that it exists. We will formalize such a process, investigate the structure of the resulting class, and show some examples of protocols that satisfy said properties. Finally, we will survey some applications of Zero Knowledge Proofs, in the fields of digital signatures and authentication.
2 Structure
The structure of the dissertation is as follows. Section 3 provides some motivation showing why the effort of developing Zero Knowledge Proofs is justified. Section 4 covers the preliminaries, which include both the mathematical machinery needed and the definition of the computational setting that we will use extensively, namely that of Turing Machines. This Section also introduces complexity classes that are relevant to our analysis, such as P, NP and BPP, and introduces the concept of a proof in a computational sense. Section 5 explores this concept in more depth, examining via examples the characterization of Zero Knowledge Proofs: the general setup, soundness, completeness and the zero knowledge property. Section 6 formalizes the notion of proofs by introducing Interactive Turing Machines, and introduces the wide class of interactive proofs, of which Zero Knowledge Proofs are a subset. We then give an example of an interactive proof for the Graph Non-Isomorphism problem, complete with a proof of its correctness. In the following Section 7 we finally introduce Zero Knowledge Proofs, in both the perfect and computational formulations. We continue by providing an example of a Perfect Zero Knowledge Proof for deciding Graph Isomorphism, and conclude the section by introducing bit commitments and showing that if one way permutations^2 exist then every language in NP has a Zero Knowledge Proof. This is done in Subsection 7.4 by providing a computational Zero Knowledge Proof for the NP-complete problem of Graph 3-Colouring. In Section 8 we then look at some applications, in three different contexts. First we show how we can use a ZKP for the Discrete Logarithm in order to digitally sign a message (Schnorr signatures). Secondly, we show how fundamentally similar proofs can be used to implement an authentication scheme in which the server never receives the user's password. Finally, we show an extremely simple example of a Zero Knowledge Range Proof, a strategy by which a party can prove that a certain value it knows lies within a suitable range, without revealing what that value actually is. This family of proofs has far-reaching implications, as it allows complex systems to uphold invariants while maintaining anonymity, enabling, for example, payment systems or government sanctioned age verification. Finally, Section 9 refers the reader to the Sage implementation of the proofs described in the paper.

^1 They resemble mathematical proofs more closely when we consider Non Interactive Zero Knowledge Proofs.
^2 In fact this holds for one way functions as well, but the permutation proof is simpler.
3 Motivation
Zero Knowledge Proofs have applications in a number of fields outside of theory. Examples vary from classical cryptography settings such as message signing and authentication to more practical ones such as online age verification and electronic voting. The development^3 of online blockchain based transaction systems has given indications of another area in which ZKP applications could emerge. In this section we will focus on this application, as it gives a good indication of a complex problem that ZKP can solve. Consider a blockchain system, such as Bitcoin^4. As is well known, Bitcoin accounts (also known as wallets) are uniquely identified by an address, and this address is visible in every transaction that the account is involved in. In fact, knowing the ownership of an account allows anyone to reconstruct from the public ledger the entire transaction history, complete with transaction amounts. Furthermore, the balance of any address is publicly available. One direction of development is to allow for a blockchain system to be completely private, so that transactions can be carried out between actors while maintaining anonymity. Of course, many technical problems arise when trying to do so, but let us focus on one in particular. Suppose account A wants to transfer an amount X to an account B. What party B would like to verify is that A actually has enough balance to transfer that amount. However, since the system we are designing is private, B does not have the ability to peer into A's account or its balance. Instead, to solve this problem, we make it so that when A initiates the transaction, it produces a "proof" that its balance is greater than X. This proof has to have three properties. First of all, it must be convincing (which we refer to as complete), so that if A does indeed have enough funds, then B will be confident that the proof provided is correct. Secondly, it must be sound, so that if A does not have enough funds B will never be convinced that it does. Thirdly, it must be zero knowledge, ensuring it will not leak to B any information about A (like balance, account number et cetera) other than the fact that A has more than X in its balance. This can be done via what is called a Zero Knowledge Range Proof, of which we will see a very simple example in Section 8. One project implementing this in practice is Zcash [1].

^3 And the "hype" associated with it.
^4 But not limited to; in fact, as far as I know, every major blockchain system works in essentially the same way.
4 Preliminaries
4.1 Mathematical Notation
We start by defining some crucial sets of strings^5.
Note in particular the use of ⟨·⟩ to denote the binary representation of the inner element. In general, as long as the element can be encoded in binary in polynomial space, we will not concern ourselves too much with said representation, and will sometimes even omit the brackets. In particular note that, unless specified otherwise, integers are encoded in binary and without brackets.
Now, we introduce a couple of utilities for talking about the asymptotic complexity of various functions.

Definition 4. For every f : N → N we define the following sets:

O(f) = {g : N → N | ∃c > 0, ∃N ∈ N s.t. g(n) ≤ c · f(n) for all n > N}

We also allow for two slight relaxations of notation, writing for example O(f) as O(f(n)) and f ∈ O(g) as f = O(g). Furthermore, we define the following useful set:

Definition 5. We define

poly = ⋃_{i=0}^{∞} O(n^i)

^5 A set of elements (also known as an alphabet) generates the set of all finite sequences of elements of the original set.
Definition 6. Let µ : N → R. We say µ is negligible if for every positive polynomial poly(·) there exists an N ∈ N s.t. for every n > N we have:

µ(n) < 1/poly(n)

As an example, functions such as 2^{−n} or n^{−n} are negligible. In particular the set of negligible functions is closed under multiplication by polynomials. This is significant, as we usually aim for the attacker's success probability to be negligible, and this means that even repeating the attack polynomially many times will not create a non-negligible threat.
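As a quick numeric illustration (this snippet and its helper names are mine, not part of the original text), a short Python sketch can locate the threshold N past which the negligible function 2^{−n} stays below the inverse polynomial 1/n^3:

```python
# Numeric sketch of Definition 6 (hypothetical helper names): the negligible
# function mu(n) = 2^-n eventually stays below the inverse polynomial 1/n^3.

def mu(n):
    return 2.0 ** -n

def inv_poly(n, degree=3):
    return 1.0 / n ** degree

# First N such that mu(m) < 1/m^3 for every m from N up to 99
# (the inequality then holds for all larger m too, since m^3 / 2^m -> 0).
N = next(n for n in range(1, 100)
         if all(mu(m) < inv_poly(m) for m in range(n, 100)))
print(N)  # prints 10, since m^3 < 2^m for every m >= 10
```

The same search with a higher degree polynomial would simply produce a larger threshold N, which is exactly what the definition requires.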
There are a couple of notions of probability that we will need. First of all, we will use the notation x ←$ A to say that x is uniformly selected from the set A^6.

Definition 7. Suppose we have a series of n independent events X_i, each one of them occurring with probability θ. Then the total number of successes, which we denote as X = Σ_i X_i, follows a binomial distribution, which means:

Pr[X = k] = (n choose k) θ^k (1 − θ)^{n−k}

In particular, we have that in those n trials we expect E[X] = nθ events to occur. This is useful especially since the probability of an event occurring every time is Pr[X = n] = θ^n, which lets us estimate the probability of some algorithm succeeding multiple times when it is not supposed to. Note in particular that if an algorithm spuriously succeeds with probability θ ∈ [0, 1) then Pr[X = n] is a negligible function of n.
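These quantities are easy to check numerically. The following sketch (the concrete values of n and θ are my own choice, not from the text) computes the binomial pmf, the expectation nθ, and the all-successes probability θ^n:

```python
from math import comb, isclose

# Sketch of Definition 7 (illustrative parameters):
# Pr[X = k] = C(n, k) * theta^k * (1 - theta)^(n - k)

def binom_pmf(n, k, theta):
    return comb(n, k) * theta ** k * (1 - theta) ** (n - k)

n, theta = 20, 0.5
expected = n * theta                    # E[X] = n * theta
all_successes = binom_pmf(n, n, theta)  # Pr[X = n] = theta^n
total = sum(binom_pmf(n, k, theta) for k in range(n + 1))  # pmf sums to 1

print(expected, all_successes, isclose(total, 1.0))
```

Here θ^n = 0.5^20 is already below one in a million, in line with the observation that Pr[X = n] is negligible for any fixed θ < 1.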
The following definition gives us a compact notation for selecting permutations.

Definition 8. The group of permutations acting on X is:

S_X = {f : X → X | f is a bijection}

Finally, a lot of our examples will deal with graphs and isomorphisms between graphs.

Definition 9. A graph G = (V, E) is a pair of sets, where V is referred to as the vertex set, and E, a set of 2-element subsets of V, is the set of edges.

Definition 10. Let G1 = (V1, E1) and G2 = (V2, E2). A bijection φ : V1 → V2 is an isomorphism between G1, G2 iff

{u, v} ∈ E1 ⇐⇒ {φ(u), φ(v)} ∈ E2

In the case such an isomorphism exists, the two graphs are said to be isomorphic, and we write this as G1 ≅ G2.

^6 Equivalently, when A is finite, we have defined a random variable X such that for every a ∈ A we have Pr[X = a] = 1/|A|.
Finally, we introduce the following useful notation for the graph obtained by permuting all vertices of another graph.

Definition 11. Let G = (V, E), and π ∈ S_V, i.e. a permutation on the set V. Then we define π(G) = (V, E′) where:

E′ = {{π(u), π(v)} : {u, v} ∈ E}

Clearly, G ≅ π(G).
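Definition 11 is easy to make concrete. In the sketch below (the function names and the example graph are mine), edges are 2-element frozensets and isomorphism is checked by brute force over S_V:

```python
from itertools import permutations

# Sketch of Definition 11: pi(G) = (V, E') with
# E' = {{pi(u), pi(v)} : {u, v} in E}.

def apply_perm(pi, edges):
    return {frozenset(pi[v] for v in e) for e in edges}

def isomorphic(vertices, e1, e2):
    """Brute-force search over S_V for a permutation mapping e1 onto e2."""
    return any(apply_perm(dict(zip(vertices, p)), e1) == e2
               for p in permutations(vertices))

V = [0, 1, 2, 3]
path = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}
pi = {0: 2, 1: 0, 2: 3, 3: 1}
print(isomorphic(V, path, apply_perm(pi, path)))  # G ≅ pi(G): True
```

The brute-force search is of course exponential in |V|; it merely illustrates the definition, not an efficient procedure.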
4.2 Computational Setting
We say that a language L ∈ P if there exists a probabilistic polynomial time TM M such that:

x ∈ L ⇐⇒ Pr[M(x) = 1] = 1

can affect the running time and thus the length of the string. If the machine M is always polynomial time we can then find a bound p s.t. ∀r, T(M_r(x)) ≤ p(|x|), and as such having r ∈ {0, 1}^{p(|x|)} will be enough.
{p | p ∈ N and p prime}^8
{(a, b, c) | a² + b² = c²}

In general, a language being in P implies there exists a machine such that all the computation paths accept the input if the input is indeed in the language. Compare this to the definition of NP.

Definition 14. We say that a language L ∈ NP if there exists a probabilistic polynomial time TM M such that:

x ∈ L ⇐⇒ Pr[M(x) = 1] > 0

The definition implies that there is at least one path on which the input is accepted. It is easy to see that P ⊆ NP, but the question of whether P = NP is still open [5] [6]. There are many NP problems that we believe do not belong in P, such as the following:
the probabilistic computation closely mimics the capabilities of a general purpose computer, which can use the external environment as a source of randomness. Secondly, polynomial time computation is feasible and, as long as Moore's law holds, modern computers improve every year faster than one can increase the problem size^10. Furthermore, the error probability of a probabilistic algorithm can be made exponentially small by repeating the algorithm polynomially many times. This means that if an algorithm has a 1/c probability of failing, repeating it n times will yield an algorithm that fails with probability c^{−n}. Let us consider an example to clarify how a probabilistic algorithm works:
Computation-wise, this might require fewer calculations. Assuming we can select a random k-vector in O(k) steps, the first step is O(m), the two computations take respectively O(nm) and O(k(n + m))^11, and the comparison is O(n). So in total the algorithm is O(nm + kn + mk + n). We claim that, trivially, if A = BC then the algorithm always answers correctly. Furthermore, if A ≠ BC then the algorithm answers correctly in the majority of cases. Let D = BC. Note that if the algorithm is to output incorrectly, it
^10 What I mean is: if an algorithm takes poly(|x|) steps, and in year n a computer can perform O(2^n) steps, then by waiting a suitable and polynomial number of years the time needed decreases exponentially. E.g., suppose that a given problem takes |x|^3 steps to be solved, and in year 0 a computer can perform 1000 steps per year. Then in year 0 we can solve instances of size at most |x| = 10. However, suppose in year n the power of the computer is given by 1000 · 2^n. This implies that in year n we can handle instances of size |x| = 10 · 2^{n/3}, which is exponential in n.
^11 This is optimized by computing B(Cx).
must be that Ax = Dx. Then it must be that (A − D)x = 0, and as such x ∈ ker(A − D). Note that A − D ∈ F^{n×m}, and as such A − D can be seen as a linear transformation from F^m to F^n. The whole problem reduces to finding the probability that a vector picked at random belongs to the kernel of a linear transformation. Let |F| = q, the size of the underlying field^12. First of all note that since A ≠ D at least one of the rows of the matrix is non zero, and as such dim ker(A − D) < m. Since the size of any vector space V over F is |V| = q^{dim V}, the size of the kernel is at most q^{m−1}. As such the probability of a random vector belonging to the kernel is at most q^{m−1}/q^m = 1/q. We can then calculate that, repeating the algorithm l times, the probability of getting at least one vector not in the kernel is 1 − (1/q)^l, and if we want to ensure that this probability is greater than 2/3 we can set:

l > log(1/3)/log(1/q) = log 3/log q
Finally, we can see that repeating the procedure l times^13 (call this algorithm M) will have the following result:
• A = BC =⇒ Pr[M(A, B, C) = 1] = 1
• A ≠ BC =⇒ Pr[M(A, B, C) = 0] > 2/3
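The whole procedure, known in the literature as Freivalds' algorithm, can be sketched as follows (the modulus q, the test matrices, and the function names are my own choices, not from the text):

```python
import random

# Sketch of the randomized product check described above: instead of forming
# BC in O(nmk), test A x == B (C x) for l random vectors x over F_q.

def mat_vec(M, x, q):
    return [sum(a * b for a, b in zip(row, x)) % q for row in M]

def verify_product(A, B, C, q, l):
    for _ in range(l):
        x = [random.randrange(q) for _ in range(len(C[0]))]
        # Compute B(Cx) in O(k(n+m)) rather than forming BC.
        if mat_vec(A, x, q) != mat_vec(B, mat_vec(C, x, q), q):
            return 0  # a witness x was found: certainly A != BC
    return 1  # A == BC except with probability at most (1/q)^l

random.seed(0)
q = 101
B = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]
good = [[19, 22], [43, 50]]   # the true product BC
bad = [[19, 22], [43, 51]]    # off by one entry
print(verify_product(good, B, C, q, 5), verify_product(bad, B, C, q, 5))
```

With a correct product the check never fails; with the corrupted matrix each trial catches the error with probability at least 1 − 1/q, so five trials miss it only with probability at most q^{−5}.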
4.3 Adversarial Settings
In general, the strictest level of security that we might require is that of information theoretic security, which holds even in the case that an attacker has infinite computational power. We will see that Perfect Zero Knowledge Proofs have an information theoretic flavour, even though saying that they are secure against an infinitely powerful attacker might be a stretch^15. Below that level, we usually require that the attacker has computational capabilities bounded by a polynomial. This quite closely mirrors the real world requirements for most schemes [8]. We will see that Computational Zero Knowledge Proofs offer this kind of protection, as they rely on the concept of computational indistinguishability.
5 Proof Requirements
In order to start visualizing the concept of ZKP, we can start with a couple of
toy examples that are helpful. Here and in the rest of the text we will refer to
the prover as Peggy and to the verifier as Victor.
1. If Peggy were colourblind, she would not be able to tell whether Victor switched the balls or not, and so the best she could do is guess, getting it right 1/2 of the time^17.
2. If Peggy is not colourblind, she will be able to answer correctly every time, and thus convince Victor.
3. Victor does not learn anything else about his environment, apart from the fact that Peggy is not colourblind.
Example 3. This example is due to Jean-Jacques Quisquater [9]. Consider a cave shaped like a ring, with a single entrance. On the side opposite the entrance, there is a gate that divides the cave into two. The gate can only be opened by a secret word, which only Peggy knows. Peggy wants to prove to Victor that she possesses this secret word, without of course letting him in on the secret. She devises an experiment as follows. She tells Victor to wait at the entrance of the cave, and she goes by one of the two possible paths (without letting Victor know which one she took) to the gate, where she can utter the secret word without being heard. Then she asks Victor to name one of the two paths, and she goes back through the named path. As before:
1. If Peggy did not know the secret word, she would not be able to go back on a path different from the one she initially took, and as such has only a 1/2 chance of getting it right.
2. Instead, if Peggy does indeed know the secret word, she will always be able to take the path that Victor names, and as such convince him.
3. Furthermore, Victor does not gain any knowledge of the secret word, apart from the fact that Peggy knows it.
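The first point can be simulated directly. In this toy simulation (the whole setup is mine, not from the text), a prover without the secret word can only exit along the path she entered, so she survives a round only when Victor happens to name that path:

```python
import random

# Toy simulation of the cave experiment: estimate the success probability
# of a prover who does NOT know the secret word.

def round_without_secret(rng):
    entered = rng.choice(["left", "right"])    # Peggy's hidden choice
    challenge = rng.choice(["left", "right"])  # Victor's request
    return entered == challenge                # she complies only if they match

rng = random.Random(1)
trials = 100_000
wins = sum(round_without_secret(rng) for _ in range(trials))
print(wins / trials)  # close to 1/2; a prover who knows the word always wins
```

Repeating the round n times drives a cheating prover's overall success probability down to roughly 2^{−n}, which is the amplification argument used throughout the paper.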
This illustrates the general idea of Zero Knowledge Proofs. The proof is devised as an experiment or challenge from the verifier to the prover, which the prover can only answer by having the necessary piece of information or "power". The prover cannot fool the verifier consistently, and an honest prover and honest verifier will agree on the fact. Furthermore, no new information is learned by the verifier. Also, if external observers were to witness the exchange they would not be convinced of the fact (as Peggy and Victor could have coordinated their answers). Note however that in this case a simpler protocol can actually fulfil the requirement. For example, Peggy could go into the cave, showing Victor which entrance she took, and come out from the other entrance. This in effect amounts to removing the need for interaction between Peggy and Victor, which is quite desirable in real world applications. In particular we will see this in Section 8, with regard to Schnorr signatures. Note that this transformation also removes the probabilistic aspect, which in general we cannot do for non trivial languages, but I digress. We seek to formalize the ideas from these two examples next.
^17 And as such repeating the experiment n times will bring her probability of always guessing right to 1/2^n.
5.1 Parties in play
In the sequel, we will have two parties interacting, the prover and the verifier, which, as before, will be colloquially referred to as Peggy and Victor. In mathematical notation we will have P, V always referring, respectively, to the prover and the verifier. As in mathematics, we aim to have the verification procedure be as efficient as possible, while most of the computational burden is placed on the prover. This mirrors the definition of NP we used above, as the class of all problems whose solutions can be efficiently verified. Furthermore, it is crucial to understand that the verifier inherently does not trust the prover (otherwise there would be no need for the proof to be provided), and as such Victor will be skeptical of everything that Peggy says. The general situation will be Peggy and Victor having access to some common input, plus each one possibly having access to some additional private input, which will often be related to the shared input.
5.2 Soundness
The condition of soundness asserts that an honest verifier cannot be tricked into accepting a false statement by a possibly cheating prover. In the first example, this is equivalent to stating that, if Peggy is indeed colourblind, then she cannot convince Victor that she is not. In the second one, if Peggy did not know the password to the gate, then she would not be able to convince Victor, who will then not be fooled. It is important to note that in the above examples we actually allow for a dishonest prover to successfully fool Victor with probability bounded from above^18.
5.3 Completeness
Conversely, the condition of completeness states that an honest prover will be able to convince an honest verifier of the veracity of the claim, if the claim is indeed true. So, moving back to the first example, Peggy will be able to prove she is not colourblind if that is indeed true and Victor acts honestly, following the protocol. While in the examples Victor is always positively convinced at the end of the interaction, this is not a strict requirement; instead (similarly as in Soundness) we just require the interaction to succeed with probability strictly bounded from below^19. As an example of when this might be the case, consider the following.

^18 This probability can be decreased by repeating the experiment multiple times.
^19 And again this probability can be increased by repeating the experiment and taking the majority answer.
her opinion on how it was brewed. The experiment is repeated n times^20, and Victor is convinced if Peggy is right more than c times, where c depends on how certain we want to be of Peggy's ability.
This is the exact subject of hypothesis testing; in particular it reduces to the problem of finding whether a coin is biased. This can be modelled by a binomial distribution with mean µ = n/2 and standard deviation σ = √n/2. In particular, if we apply the central limit theorem (and as such have n > 30) we can model it as a normal distribution; then, using the 3σ rule^21, we can see that if |c − n/2| ≥ 3√n/2 then we can affirm with a more than 99% chance that Peggy can distinguish the two. For example, if we set n = 100 and Peggy guesses right more than c = 100/2 + 3·√100/2 = 65 times (or fewer than 35 times) then we can be almost certain that Peggy is truthful (in the latter case she can still distinguish the two, but for some reason she is convinced the espresso machine coffee is brewed in the cup, and vice versa).
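The threshold computation above is a one-liner; the sketch below (function name mine) reproduces the worked n = 100 case:

```python
from math import sqrt

# Sketch of the acceptance threshold: under pure guessing, Peggy's score is
# binomial with mean n/2 and standard deviation sqrt(n)/2; we accept her
# claim when the score deviates from the mean by at least `sigmas` sigma.

def acceptance_threshold(n, sigmas=3):
    mean = n / 2
    sd = sqrt(n) / 2
    return mean + sigmas * sd

print(acceptance_threshold(100))  # 50 + 3 * 5 = 65.0
```

Note how the threshold grows only like √n: doubling the number of trials makes the test relatively sharper, which is the same amplification phenomenon used for soundness and completeness bounds.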
6 Proof Systems
In our drive to formalize the intuition that we gained in the previous sections, we will start by extending the model of a Turing Machine to allow two Turing Machines to interact. Then we will formally define interactive proofs and give examples.

^20 According to [10], and assuming Peggy weighs around 70kg, we would not recommend for
• Communication: each of the two machines can choose to send some information (often related to their private information) to the other, using communication tapes.

To fulfil those goals, we define a pair of interactive machines as follows:

private. There are some cases in which this requirement might be lifted, in so-called public coin protocols, but we will not discuss them here.
^24 This is an alternate, equivalent definition of randomized Turing Machines; it just makes
• The interconnected randomized Turing machine pair terminates when either of the two machines terminates.
• An interactive proof system runs on some common input, which is initially written on both machines' input tapes.
• By convention, A is initially running, and B is idle.

It is to be noted that there is an alternative and ultimately equivalent model that allows each machine to have an additional private input tape, which allows both machines to be polynomial time. For simplicity we work here with the definition with fewer tapes, but it is good to be aware of the alternative.
Using the above definition, we can define notation for the output of an execution of a pair of ITMs.
Definition 17. Let A, B be interconnected randomized Turing machines, such that both A and B always terminate in finitely many steps. Then we define ⟨A, B⟩(x) to be a random variable modelling the output of machine B after interacting with machine A on common input x (with the computation path string uniformly and independently selected).
Note that the definition is asymmetric, as it only accounts for the output of B; however, for all our intents and purposes, this will suffice. Furthermore, we need to accurately represent time complexity in this model, and we say that:

Definition 18. An ITM A has time complexity t : N → N if, for any string x, every linked ITM B, and any computation path, it halts within t(|x|) steps.

In a nutshell, the above implies that, regardless of the messages that machine B sends, machine A always terminates quickly enough.
• Completeness: if x ∈ L then:

Pr[⟨P, V⟩(x) = 1] ≥ 2/3

• Soundness: if x ∉ L then:

Pr[⟨P, V⟩(x) = 1] ≤ 1/3
There are some things to note in the above definition. First of all, note that only the verifier is required to be computationally bounded, while the prover has no such limit. This is required in the general case, as we will see in the next example, but often in practical applications we can make the prover polynomially bounded by resorting to a similar model which makes use of auxiliary input. Secondly, as in many other examples, the bounds of 1/3, 2/3 can be replaced^25 by 1/2 ∓ ε for any 1/2 > ε > 0. From the above definition, we can prove that any language in NP has an interactive proof system. Letting IP be the set of all languages with an interactive proof system:
Theorem 1. NP ⊆ IP

Proof. If L is in NP then^26 there exists a polynomially recognizable relation R_L such that x ∈ L ⇐⇒ ∃y : (x, y) ∈ R_L. Also, such a y satisfies |y| < p(|x|) for some polynomial p.
We design the following interactive system, which has as common input x. First of all, the prover finds such a y. This can be done by searching for any string s in ⋃_{i=0}^{p(|x|)} {0, 1}^i that satisfies (x, s) ∈ R_L. Since the prover has no computational bound this can be done without issue. The prover then sends this s to the verifier. On a message s the verifier accepts if (x, s) ∈ R_L, and rejects otherwise. Since R_L is recognizable in polynomial time, the verifier runs in polynomial time as required.
Completeness is easily verified, as the verifier will accept with probability 1. Soundness requires a little more thought, but essentially since x ∉ L =⇒ ∀y : (x, y) ∉ R_L, the verifier will never accept (i.e. will accept with probability 0). This implies that the above is an interactive proof system for L. Since L was an arbitrary element of NP, NP ⊆ IP.
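The verifier's side of this construction is genuinely simple. Below is a sketch (my own instantiation, using graph 3-colourability, the NP-complete problem of Section 7.4, as the relation R_L) of the polynomial-time membership check (x, s) ∈ R_L:

```python
# Sketch of the verifier in the NP ⊆ IP construction, instantiated with
# 3-colourability: x is a graph, the prover's message s is a colouring,
# and the verifier checks (x, s) in R_L in polynomial time.

def in_relation(edges, colouring):
    """(x, s) in R_L iff s is a proper 3-colouring of the graph x."""
    return (all(c in {0, 1, 2} for c in colouring.values())
            and all(colouring[u] != colouring[v] for u, v in edges))

triangle = [(0, 1), (1, 2), (0, 2)]
print(in_relation(triangle, {0: 0, 1: 1, 2: 2}))  # accept: True
print(in_relation(triangle, {0: 0, 1: 0, 2: 2}))  # reject: False
```

Note that this check runs in time linear in the size of the graph, while finding the certificate in the first place is left to the computationally unbounded prover, exactly as in the proof above.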
In fact, as we will see next, IP contains languages that are not believed to be in NP. We will not prove this, but it turns out that IP = PSPACE [12], i.e. the set of all languages decidable in polynomial space^27. Also note that in the case of NP problems we can even make the prover machine polynomial time, if we use the additional input model and give the prover the certificate.
Proof that G1, G2 are not isomorphic
Common input: G1 = (V1, E1), G2 = (V2, E2).
Verifier: choose i ←$ {1, 2} and π ←$ S_{V_i}, compute F = π(G_i), and send F to the prover.
Prover: find j s.t. G_j ≅ F, and send j to the verifier.
Verifier: accept iff i = j.

V1 (if n = |V1|) and checking for isomorphism, outputting 1 if one is found and 2 if none (of course this relies on the verifier being honest, which for completeness and soundness is always the case).
input graphs, and as such the prover always has a unique choice of j, which must equal the i that the verifier chose. As such the verifier always accepts, and the completeness bound is 1.
Intuitively, for the soundness bound, we claim that a possibly malicious prover cannot convince the verifier that the graphs are non-isomorphic when in fact they are isomorphic. In particular, we claim that in case they are isomorphic, the best the prover can do is guess what the verifier chose, succeeding with probability 1/2.
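One round of the protocol can be simulated directly. In the sketch below (the example graphs and helper names are mine), the computationally unbounded prover is played by a brute-force isomorphism search:

```python
import random
from itertools import permutations

# Simulation of one round of the GNI protocol above, on non-isomorphic
# inputs (a path versus a triangle plus an isolated vertex).

def permute(pi, edges):
    return frozenset(frozenset(pi[v] for v in e) for e in edges)

def prover_answer(V, G1, G2, F):
    """The unbounded prover: find j such that Gj is isomorphic to F."""
    for p in permutations(V):
        pi = dict(zip(V, p))
        if permute(pi, G1) == F:
            return 1
        if permute(pi, G2) == F:
            return 2

V = [0, 1, 2, 3]
G1 = frozenset({frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})})
G2 = frozenset({frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2})})

rng = random.Random(0)
i = rng.choice([1, 2])                    # verifier's secret coin
pi = dict(zip(V, rng.sample(V, len(V))))  # uniformly random permutation
F = permute(pi, G1 if i == 1 else G2)     # challenge sent to the prover
j = prover_answer(V, G1, G2, F)
print(j == i)  # on non-isomorphic inputs the honest prover is always right
```

If G1 and G2 were isomorphic, F would reveal nothing about i, and any prover strategy would be reduced to guessing, which is exactly the content of the two theorems that follow.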
We start by proving this preliminary claim:

Theorem 3. Let I be a random variable uniformly selected from {1, 2}. Let G1, G2 be two graphs such that G1 ≅ G2. We also let Π be a random variable uniformly distributed over the set of permutations of the vertex set of G1 (which without loss of generality we assume equal to the vertex set of G2). Then, for every graph F ≅ G1 ≅ G2, the following holds:

Pr[I = 1 | Π(G_I) = F] = Pr[I = 2 | Π(G_I) = F] = 1/2
Proof. First of all let F be a graph isomorphic to G1, G2. Then consider the sets S1, S2 defined by S_δ = {π : π(G_δ) = F}. We first claim that |S1| = |S2|. To see this, note that G1 ≅ G2 =⇒ ∃φ : G1 = φ(G2). Then π ∈ S1 =⇒ π(G1) = F =⇒ π(φ(G2)) = F =⇒ π ◦ φ ∈ S2, where the last step is justified since α(β(G)) = (α ◦ β)(G). So we can see that f : S1 → S2 defined as f(σ) = σ ◦ φ is well defined, and we can prove that it is a bijection^29, and as such |S1| = |S2|.
Now using this fact we have that:

Pr[Π(G_I) = F | I = 1] = Pr[Π(G1) = F]
                       = Pr[Π ∈ S1]
                       = Pr[Π ∈ S2]
                       = Pr[Π(G2) = F]
                       = Pr[Π(G_I) = F | I = 2]

It follows that Pr[Π(G_I) = F | I = 1] = Pr[Π(G_I) = F], and therefore, by Bayes' rule:

Pr[I = 1 | Π(G_I) = F] = Pr[Π(G_I) = F | I = 1] Pr[I = 1] / Pr[Π(G_I) = F] = Pr[I = 1] = 1/2

The same computation with I = 2 yields the claim.
Theorem 4. Let V be the verifier in the above protocol. Then for any prover E and any G1 ≅ G2 we have:

Pr[⟨E, V⟩(G1, G2) = 1] ≤ 1/2
Proof. We see that, in the protocol, the verifier only ever accepts if the prover successfully figures out which graph was permuted and sent to it. Using the notation of the above discussion, the permuted graph is represented by the random variable Π(G_I). Letting E be a random process, we see that the verifier accepts if E(Π(G_I)) = I. Then:

Pr[E(Π(G_I)) = I] = Σ_{G′} Pr[Π(G_I) = G′] Pr[E(G′) = I | Π(G_I) = G′]
We now can use the proof from before and conclude that, for any G′:

Pr[E(G′) = I | Π(G_I) = G′] = Σ_i Pr[E(G′) = i ∧ I = i | Π(G_I) = G′]
= Σ_i Pr[E(G′) = i] Pr[I = i | Π(G_I) = G′]

Here we can use the fact proved before: for i ∈ {1, 2}, the probability on the right equals 1/2, and as such

Σ_i Pr[E(G′) = i] Pr[I = i | Π(G_I) = G′]
= (Pr[E(G′) = 1] + Pr[E(G′) = 2]) / 2
= Pr[E(G′) ∈ {1, 2}] / 2
≤ 1/2
where the last inequality holds since the probability is maximized when E always outputs an element of {1, 2}. Going back to the original equation:

Pr[E(Π(G_I)) = I] ≤ (1/2) Σ_{G′} Pr[Π(G_I) = G′] = 1/2

Since E was arbitrary, this implies that any possible prover can fool the verifier with probability at most 1/2.
This shows that the soundness bound of the protocol is 1/2, and repeating the protocol twice yields an interactive proof with completeness bound 1 and soundness bound 1/4.
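To complement the analysis, one round of the protocol can be simulated. The sketch below (toy graphs and encodings of our own choosing) uses a brute-force isomorphism test, which is only feasible because the unbounded prover is here played by exhaustive search:

```python
import random
from itertools import permutations

def permute(graph, p):
    return frozenset(frozenset(p[v] for v in e) for e in graph)

def isomorphic(g, h, n):  # brute force; fine only for tiny n
    return any(permute(g, p) == h for p in permutations(range(n)))

def gni_round(G1, G2, n):
    i = random.choice((1, 2))                     # verifier's secret choice
    p = tuple(random.sample(range(n), n))         # random permutation
    challenge = permute(G1 if i == 1 else G2, p)  # the graph pi(G_i)
    # Unbounded prover: answer 1 iff the challenge is isomorphic to G1.
    # When G1 is isomorphic to G2 this is no better than a guess.
    j = 1 if isomorphic(challenge, G1, n) else 2
    return j == i                                 # verifier accepts iff j = i

# Non-isomorphic pair: a triangle (vertex 3 isolated) vs. a path on 4 vertices.
G1 = frozenset(map(frozenset, [(0, 1), (1, 2), (2, 0)]))
G2 = frozenset(map(frozenset, [(0, 1), (1, 2), (2, 3)]))
print(all(gni_round(G1, G2, 4) for _ in range(200)))  # completeness: True

# Isomorphic pair: the prover is reduced to guessing the verifier's bit.
G3 = permute(G1, (3, 1, 0, 2))
acc = sum(gni_round(G1, G3, 4) for _ in range(2000)) / 2000
print(acc)  # close to 1/2, matching the soundness bound
```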
7 Zero Knowledge Proof
Now that we have introduced interactive proofs, we can naturally extend them to encompass the idea of zero knowledge. In the previous sections we have given some intuition into what it means to gain knowledge, and now we aim to formalize this. As before, we consider an interactive proof system, composed of a prover and a verifier, respectively P, V. The first observation is that in an interactive proof system the prover usually (either by being more powerful computationally or by having some additional information) has a "knowledge advantage" over the verifier. With zero knowledge we aim to ensure that this advantage does not transfer to the verifier, no matter what clever things it may do. From this we can conclude that being zero knowledge is strictly a property of the prover in the interaction, in the same way that soundness is a property of the verifier. Secondly, in a sense, we can express the zero knowledge requirement as the fact that any verifier V′, when interacting with the prover on input x, will not be able to compute anything that it wouldn't have been able to compute on input x alone. In order to formalize this, we turn to the idea of a simulation of the interaction of P, V′.
Let us consider why this might be a helpful notion. Our loose definition of zero knowledge as a gain in computational power in effect means that a polynomial-time prover cannot yield any knowledge to a verifier, as the verifier already has all the knowledge to start with. For this reason, every problem in BPP has a Zero Knowledge Proof, and we refer to these problems as the trivial ones. Now consider a non-trivial language, and a corresponding prover that is more powerful than a polynomial one. How can we use a simulation to show that such a prover does not leak information? The key idea is that we make the simulator just as powerful as the verifier. Then, if a simulator with limited computational capabilities is able to replicate the interaction, the verifier cannot learn anything from it, as the simulator has no knowledge advantage to start with. Then, since the simulator and the prover interact indistinguishably with the verifier, the verifier cannot learn anything from the prover either. In essence, the existence of the simulator shows that all the verifier can learn is that which is given by a BPP machine, and since the verifier is BPP to start with, it gains zero knowledge. We formalize this by saying that a language L is zero knowledge if it has an interactive proof system P, V such that, for every possible verifier V′, there exists a probabilistic polynomial time Turing machine MV′ that for any x ∈ L satisfies that ⟨P, V⟩(x) and MV′(x) are similarly distributed31. Using this intuition and playing around with the loose parts of it, we can start defining the interesting classes of zero knowledge proofs.
the definition of zero knowledge proof to simply require that the ensembles ⟨P, V⟩(x) and MV′(x) are identically distributed. However, as far as we know, no non-trivial languages satisfy that requirement. To see why this needs to be the case, note that if a simulator can perfectly and infallibly simulate an interaction between prover and verifier, despite having less computational power (and/or information) available, then this effectively implies that the prover is not using its superior capabilities to their best, or at all! This in turn implies that the language being decided is in fact of the trivial class. So, in order to allow us to decide harder languages, the simulator will at some point have to give up. We can formulate this by giving the following definition of Perfect Zero Knowledge Proof.
As always, the bound on how often the simulator is allowed to fail can be made negligible. It is interesting to note that the identical distribution requirement mentioned is a notion mostly used in information theoretic contexts, and as such this scheme is in a sense perfectly secure. To be more precise, this implies that even a possibly cheating, computationally unbounded verifier cannot, quite amusingly, detect whether it is in a simulation or in a real interaction. Furthermore, we can make the observation that, for any verifier V′, its execution is completely determined by the contents of its random tape and the messages that are sent on the communication tape by P. Knowing this we can define a random variable viewP,V′(x) to describe the transcript of the interaction between P and V′, including the contents of V′'s random tape and all the messages sent from P to V′. Using this, we can replace ⟨P, V⟩(x) by viewP,V′(x) and obtain an equivalent definition that is slightly easier to work with, as it allows us to account for verifiers that arbitrarily deviate from the protocol (e.g. some that terminate between steps). With this definition, let us now give an example of a Perfect Zero Knowledge Proof for the language of Graph Isomorphism.
32 Pr[mV′(x) = α] = Pr[MV′(x) = α | MV′(x) ≠ ⊥]
Example 6. First of all, let us define the language GI.
Note that GI is33, as far as we know, neither NP-complete nor in BPP. This is important since every language in BPP has a trivial proof, while known ZKP languages that are NP-complete require further assumptions. The following protocol [13] [14] is a Perfect Zero Knowledge Proof System for GI.
Proof that G1 ≅ G2

  Prover                                      Verifier
  G1 = (V1, E1), G2 = (V2, E2)                (common input)
  σ ←$ S_{V2}
  G′ := σ(G2)
                 ———— G′ ————→
                                              i ←$ {1, 2}
                 ←———— i ————
  if i = 2 then π := σ
  else π := σ ∘ φ
                 ———— π ————→
                                              return π(G_i) =? G′
or that it is only known to the prover. Strictly speaking the second formulation requires a slightly different definition of ZKP, but the proof follows similarly
3. Once V receives the graph G0 from P , it randomly selects an i ∈ {1, 2},
and sends it to P .
4. When P receives i, it acts as follows. If i = 2, then P sends σ, else it
sends σ ◦ φ.
The rationale behind this is quite straightforward. On the first step, the prover
generates a challenge graph, that it claims is isomorphic to both of the input
graphs. Of course, if the two graphs are non isomorphic no graph can satisfy
this claim. On receiving this graph, the verifier issues a challenge, asking to
receive an isomorphism between the received graph and one of the two inputs. If
the graphs are isomorphic, then the prover can always easily (if the isomorphism φ is known) answer the query. Otherwise, it must fail with some probability.
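The exchange just described is easy to play out in code. The sketch below (our own toy encoding, with prover and verifier collapsed into one round function) checks completeness over many rounds:

```python
import random

def permute(graph, p):
    return frozenset(frozenset(p[v] for v in e) for e in graph)

def compose(a, b):  # (a ∘ b)(v) = a[b[v]]
    return tuple(a[b[v]] for v in range(len(a)))

n = 5
G1 = frozenset(map(frozenset, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]))
phi = tuple(random.sample(range(n), n))  # prover's secret isomorphism
G2 = permute(G1, phi)                    # common input: G2 = phi(G1)

def round_ok():
    sigma = tuple(random.sample(range(n), n))
    Gp = permute(G2, sigma)              # prover sends G' = sigma(G2)
    i = random.choice((1, 2))            # verifier's challenge
    pi = sigma if i == 2 else compose(sigma, phi)  # sigma or sigma ∘ phi
    return permute(G1 if i == 1 else G2, pi) == Gp  # verifier's check

print(all(round_ok() for _ in range(100)))  # completeness: True
```

The i = 1 branch works because (σ ∘ φ)(G1) = σ(φ(G1)) = σ(G2) = G′, exactly as in the completeness argument below.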
Theorem 5. The above is a Perfect Zero Knowledge Proof for GI.
Proof. First of all, note that the verifier can easily be implemented in polynomial
time. We do not know of a way to similarly implement the prover, but luckily
this is not a requirement. If the permutation φ is known by P a priori then
we can actually find such an implementation. First of all, we aim to show that
P, V is an interactive proof system.
• Completeness. If (G1, G2) ∈ GI, then P is always able to find φ such that φ(G1) = G2. Then G′ = σ(G2) = (σ ∘ φ)(G1). Then the prover can always satisfy the verifier's challenge, and as such the completeness bound is 1.
• Soundness. If (G1, G2) ∉ GI, then there is no isomorphism between the two input graphs. Therefore for any G′ that a possibly cheating prover E might choose, there exists a j ∈ {1, 2} such that G_j ≇ G′. So, since the verifier chooses i ∈R {1, 2} after G′ is fixed, Pr[i = j] = 1/2 and as such the prover will fail to convince the verifier at least 1/2 of the time, and that is the soundness bound.
Now, for the arguably more challenging part of the proof, let us prove that the above protocol is indeed zero knowledge. We exhibit a simulator35 that satisfies the zero knowledge condition. Let V′ be an arbitrary polynomial time randomized ITM. We define the following simulator MV′, running on input x ≡ (G1, G2):
2. The simulator randomly selects j ∈ {1, 2} and a permutation π ∈ S_{Vj}, and computes the graph H = π(G_j).
3. The simulator now starts simulating V′, giving it x as common input, r as the random tape, and H on the incoming message tape.
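The simulator's guess-and-check loop can be sketched as follows. Here verifier_choice stands in for an arbitrary deterministic strategy of V′ (any function of the input, random tape and received graph); the encoding and all names are our own:

```python
import random

def permute(graph, p):
    return frozenset(frozenset(p[v] for v in e) for e in graph)

# A toy stand-in for V': any deterministic map from (input, random tape,
# received graph H) to a challenge in {1, 2}.
def verifier_choice(x, r, H):
    return 1 + (hash((x, r, H)) % 2)

def simulate(G1, G2, n, r):
    j = random.choice((1, 2))                 # guess the coming challenge
    pi = tuple(random.sample(range(n), n))
    H = permute(G1 if j == 1 else G2, pi)     # H = pi(G_j)
    i = verifier_choice((G1, G2), r, H)       # run V' on H
    if i != j:
        return None                           # ⊥: wrong guess, give up
    return (r, H, pi)                         # a consistent transcript

# A path 0-1-2 and an isomorphic copy of it.
G1 = frozenset(map(frozenset, [(0, 1), (1, 2)]))
G2 = permute(G1, (2, 0, 1))
runs = [simulate(G1, G2, 3, r) for r in range(1000)]
fails = sum(t is None for t in runs) / len(runs)
print(fails)  # about 1/2, as the claim below predicts
```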
A couple of notes are in order. First of all, as long as the verifier is polynomial time, the above simulator can be implemented in polynomial time. Secondly, the simulator does not follow exactly the steps of the prover, of course, since it is required to run in polynomial time. Thirdly, the simulator is not able to perfectly answer the query every time, as it does not know φ as the prover does, and as such it is required to fail sometimes. Now onto proving that this simulator satisfies the requirement. Let x ≡ (G1, G2) ∈ GI. We first show that:

Pr[MV′(x) = ⊥] ≤ 1/2
Recall that38 we showed that, for any randomized algorithm E, any two isomorphic graphs G1, G2, and with the variables Π ∈R S_{V1}, I ∈R {1, 2}, we have that:

Pr[E(Π(G_I)) = I] ≤ 1/2

Since we see that Pr[MV′(x) = ⊥] ≤ Pr[E(Π(G_I)) = I] (as the simulation only fails when V′ is able to guess j from H = π(G_j)), the claim is proved.
Using now the alternative characterization of zero knowledge, we aim to show that viewP,V′(x) and mV′(x) are identically distributed. First of all, note that both deal with quadruples of the form (x, r, ·, ·), with r in both cases randomly and uniformly selected. As such, we can move on to consider only the next two components, namely the first and second message sent by the prover to the verifier. Let us define s(x, r) to be the last two elements of the quadruple outputted by the simulator (as long as the simulation succeeds), and similarly p(x, r) the same elements of the quadruple of the verifier's view of the interaction. We aim to show these are identically distributed. First of all note that once x, r, H are fixed, the output of V′, which we will call v(x, r, H), is uniquely determined. It turns out that s(x, r), p(x, r) are identically distributed over the set:

C_{x,r} = {(H, π) | H = π(G_{v(x,r,H)})}
37 This makes the discussion easier, also if V 0 terminated we can still apply the normalization
Unfortunately, the proof is rather tedious39, and not particularly related to cryptography, so for now I have decided to skip it until I find a simpler one. This concludes our proof that the above is a valid Perfect Zero Knowledge simulator for the interactive proof system, and as such we are done.
is a negligible function of n.
A couple of notes are, as usual, in order. First of all, the 1^n parameter is used to rule out degenerate cases where the two distributions yield strings logarithmic in the size of n, so that the D algorithm would be unable to run in time polynomial in n. The δ function is often referred to as the attacker's advantage. The idea is that this metric measures how much better the attacker can do over the general strategies of always accepting, always rejecting, or guessing, all of which yield an advantage of41 0. In order to further explain how this is useful let us consider the following two examples [15].
Example 7. Consider the two ensembles {An}n∈N and {Bn}n∈N where we set Pr[An = n] = Pr[An = n + 1] = 1/2, and Pr[Bn = n] = 1. So An is either n or n + 1, while Bn always equals n. Then {An}n∈N and {Bn}n∈N are not computationally indistinguishable.
Proof. Let D be the algorithm that accepts (x, 1^n) if x = n. Then Pr[D(An, 1^n) = 1] = 1/2 while Pr[D(Bn, 1^n) = 1] = 1, and so δ(n) = |1/2 − 1| = 1/2.
second part apparently requires at least a discussion of the automorphisms of a graph, which is beyond our scope
40 An ensemble {An}n∈N is just an infinite set of random variables over {0, 1}* indexed by either an integer n, or by some string in some language
41 For example, if an algorithm D always accepts then δ(n) = |1 − 1| = 0
Since δ(n) is a constant, it is non-negligible, and as such the two ensembles are distinguishable.
Example 8. Consider the two ensembles {An}n∈N and {Bn}n∈N where we define Pr[An = 0^n] = 2^−n, Pr[An = 1^n] = 1 − 2^−n and Pr[Bn = 1^n] = 1. To make this intuitive, An can be generated by an algorithm that flips n coins and outputs 0^n if all landed tails, and 1^n otherwise, while Bn always outputs 1^n. Then {An}n∈N and {Bn}n∈N are computationally indistinguishable.
Proof. First of all note that the only case in which the two distributions differ is when An = 0^n. Intuitively, this happens only with negligible probability, and as such the overall advantage that any D could gain occurs scarcely enough that it does not matter. Let us give a formal proof of this fact. Let D be any polynomial time algorithm. Then note that:

Pr[D(An, 1^n) = 1 | An ≠ 0^n] = Pr[D(Bn, 1^n) = 1]

Pr[D(An, 1^n) = 1] = Pr[D(An, 1^n) = 1 ∧ An ≠ 0^n] + Pr[D(An, 1^n) = 1 ∧ An = 0^n]

Now note that the second term is at least 0, and as such:

Pr[D(An, 1^n) = 1] ≥ Pr[D(An, 1^n) = 1 ∧ An ≠ 0^n]
= Pr[D(An, 1^n) = 1 | An ≠ 0^n] Pr[An ≠ 0^n]
= Pr[D(Bn, 1^n) = 1] Pr[An ≠ 0^n]
= Pr[D(Bn, 1^n) = 1] (1 − 2^−n)
Similarly we can use the fact that Pr[A ∧ B] ≤ Pr[A] to conclude that:

Pr[D(An, 1^n) = 1]
= Pr[D(An, 1^n) = 1 ∧ An ≠ 0^n] + Pr[D(An, 1^n) = 1 ∧ An = 0^n]
≤ Pr[D(An, 1^n) = 1 | An ≠ 0^n] Pr[An ≠ 0^n] + Pr[An = 0^n]
= Pr[D(Bn, 1^n) = 1] (1 − 2^−n) + 2^−n

Combining the two bounds, the advantage δ(n) = |Pr[D(An, 1^n) = 1] − Pr[D(Bn, 1^n) = 1]| is at most 2^−n, which is negligible.
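Since the two distributions differ only on the event An = 0^n, even a computationally unbounded distinguisher has advantage at most 2^−n, and a two-line computation makes the decay visible (the function name is our own):

```python
# Best possible advantage for Example 8: the optimal D accepts exactly the
# strings that A_n is more likely to output than B_n, here only 0^n.
def best_advantage(n: int) -> float:
    p_accept_A = 2.0 ** -n   # Pr[A_n = 0^n]
    p_accept_B = 0.0         # Pr[B_n = 0^n]
    return abs(p_accept_A - p_accept_B)

for n in (10, 20, 40):
    print(n, best_advantage(n))  # halves with every extra bit: negligible
```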
Using this we can define the idea of a Computational Zero Knowledge Proof.
Definition 22. Let L be a language. Then L has a computational zero knowledge proof system if it has an interactive proof system P, V such that, for every possible verifier V′, there exists a probabilistic polynomial time Turing machine MV′ for which the ensembles:
1. {⟨P, V⟩(x)}x∈L
2. {MV′(x)}x∈L
are computationally indistinguishable.
As you can see, other than the different level of "closeness" of the two distributions, the main other difference is that we do not require bounded failure of the simulation by the ⊥ symbol. The rationale behind this is quite simple. First of all, remember that in perfect zero knowledge we can repeat the interaction multiple times42 to reduce the probability of outputting ⊥ to a negligible one (in the size of the input). In particular this implies that the simulator only differs from the interaction a negligible fraction of the time, and the above example shows how two ensembles that only differ with negligible probability are computationally indistinguishable. Computational Zero Knowledge Proofs are
42 In order for this to be formal we would actually need to prove that sequential composition
of zero knowledge proofs does not yield any knowledge, but for the sake of this discussion just
know this holds
in a sense more efficient than Perfect Zero Knowledge proofs, as in practical applications there is no need to perfectly simulate the interaction to guarantee zero knowledge. From a practical standpoint as well, Computational Zero Knowledge Proofs are those that are generally referred to as simply ZKP, and those around which applications mostly focus.
flavour44. In particular, note that once S has committed to a value, in order to provide verification, it can send its private input and the random coins it used to R, which can then run S's algorithm to verify the certificate in polynomial time.
As far as we know, the existence of bit commitment schemes requires some assumptions, namely the existence of one way functions, functions that are easy to compute but hard to invert. For a more thorough description, look no further than Appendix A. While a proof that uses solely those kinds of functions exists [18], it requires some discussion of pseudorandom generators, which is not particularly insightful nor too related. Instead, we can use the stronger assumption of the existence of one way permutations (i.e. one way functions that are also bijections), which results in a simpler proof [19].
Let us first discuss the intuition that makes the following construction useful. Assume that we have access to a one way permutation f, and that we want to commit to a bit b. We can make use of a property of one way functions, namely that they have hard cores: a property of the input of a one way function that cannot be guessed from the output (discussed extensively in Appendix A). Let us denote by h(·) one of those hard cores. Then let us select some random string s that we will call the certificate. We will be sending to the other party the pair (f(s), h(s) ⊕ b). This has a double effect. First of all, since f is a permutation, there is no s′ ≠ s with f(s′) = f(s), and so effectively we won't be able to find another certificate with which to trick the other party. On the other hand, since h is a hard core, the other party will not be able to guess it efficiently from f(s), and as such h(s) ⊕ b, and consequently b, cannot be recovered by it. Of course, revealing s, b will let the other party efficiently verify that we did not cheat.
Theorem 6. Let f : {0, 1}* → {0, 1}* be a one-way permutation. Let h : {0, 1}* → {0, 1} be a hard-core of f. Then the following is a bit commitment scheme.
• Commitment. To commit to the bit i, uniformly select s ∈ {0, 1}^n. Send (f(s), h(s) ⊕ i) to the receiver.
• Reveal. To reveal, the sender reveals s, i. The receiver, which had received the commitment (α, j), accepts if f(s) = α and h(s) ⊕ i = j.
Proof. The secrecy requirement is a direct consequence of f being one way, and h being one of its hard cores. The unambiguity requirement is a consequence of f being injective, as no ambiguous view exists.
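A runnable sketch of this scheme needs concrete stand-ins for f and h; below we (heuristically!) use SHA-256 in place of the one-way permutation and one derived bit in place of the hard core. SHA-256 is not a permutation, so this illustrates the commit/reveal interface rather than actually instantiating the theorem:

```python
import hashlib
import secrets

# Heuristic stand-ins for the theorem's primitives (illustration only).
def f(s: bytes) -> bytes:          # plays the one-way permutation
    return hashlib.sha256(s).digest()

def h(s: bytes) -> int:            # plays the hard-core bit of f
    return hashlib.sha256(b"hardcore" + s).digest()[0] & 1

def commit(bit: int):
    s = secrets.token_bytes(16)    # the certificate
    return (f(s), h(s) ^ bit), s   # commitment sent; s kept for the reveal

def verify(commitment, s: bytes, bit: int) -> bool:
    alpha, j = commitment
    return f(s) == alpha and h(s) ^ bit == j

c, s = commit(1)
print(verify(c, s, 1), verify(c, s, 0))  # True False
```

Note how flipping the claimed bit makes verification fail: the pair (f(s), h(s) ⊕ b) pins down b once s is revealed.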
44 This commitment scheme is referred to as perfectly binding. There is also a corresponding, mutually exclusive, "dual"-like notion of perfectly hiding schemes [16], in which the secrecy requirement is information theoretic and the unambiguity is computational. These can be used to provide Perfect ZK Arguments for languages in NP [17], but they are relatively more advanced and as such will not be discussed here
Furthermore, the above scheme has one additional desirable feature: it only requires one-way interaction (from the prover to the verifier). As such we define:
Definition 24. A one-way-interaction bit commitment scheme is a polynomial time algorithm F such that:
• For s uniformly drawn from {0, 1}^n, the ensembles {F_s(0, 1^n)}n∈N and {F_s(1, 1^n)}n∈N are computationally indistinguishable.
• There do not exist s, s′ such that F_s(0, 1^n) = F_{s′}(1, 1^n).
Proof that G is 3-colourable

  Prover                                      Verifier
  π a 3-colouring of G
  σ ←$ S3
  φ := σ ∘ π
  s_i ←$ {0, 1}^n, ∀i ∈ V
  c_i := C_{s_i}(φ(i))
                 ———— c_1, . . . , c_n ————→
                                              (u, v) ←$ E
                 ←———— u, v ————
  γ_u := φ(u)
  γ_v := φ(v)
                 ———— s_u, γ_u, s_v, γ_v ————→
                                              verify commitments; accept iff γ_u ≠ γ_v
Example 9. Let the common input be G = (V, E), where V = {1, . . . , n}. This
graph is 3-colourable, with colouring π, which is unknown to the verifier45 .
1. The prover randomly uniformly selects σ ∈ S3 . It sets φ ≡ σ ◦ π. For
each i ∈ V , it generates a random string si ∈ {0, 1}n and computes ci ≡
Csi (φ(i)). The prover then sends c1 , . . . , cn to the verifier.
2. The verifier uniformly selects an edge (u, v) ∈ E, and sends it to the prover
3. The prover sends (su , φ(u)) and (sv , φ(v)) to the verifier
4. On the messages (α, i), (β, j), with α, β ∈ {0, 1}^n and i, j ∈ {1, 2, 3}, the verifier checks that c_u = C_α(i) and that c_v = C_β(j). If not, it rejects. Finally, it accepts if i ≠ j, that is, if the two endpoints are differently coloured.
A small discussion is, of course, in order. The fundamental idea is that the
prover creates n boxes, each containing a colour. It locks these boxes and sends
45 As always, such π could be either computed by a superior prover, or given as auxiliary
input
them to the verifier. The verifier then chooses two of these boxes, such that they are connected by an edge in the original graph. It asks the prover to open the boxes, and after having verified that they haven't been tampered with, the verifier concludes that the colouring is valid if the two boxes it chose have different colours. Intuitively, the verifier only ever learns that two vertices have different colours, which of course is a requirement for any colouring. In particular, note the role of the permutation σ. The idea is that, on every run, the prover will select a different σ, and as such repeating the protocol will possibly yield different colourings for two vertices. This way, a cheating verifier cannot alternately ask for the colouring of each vertex in the graph, and so it cannot piece back the original colouring.
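One round of the protocol can be sketched as follows, with a hash-based stand-in for the commitment scheme C of the previous section (illustrative only; the graph, colouring and names are our own):

```python
import hashlib
import random
import secrets

# Commitment C_s(colour): a hash-based stand-in, illustration only.
def C(s: bytes, colour: int) -> bytes:
    return hashlib.sha256(s + bytes([colour])).digest()

# A 3-colourable graph and a valid colouring, known only to the prover.
E = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]
colouring = {0: 1, 1: 2, 2: 3, 3: 2}

def round_ok():
    sigma = dict(zip((1, 2, 3), random.sample((1, 2, 3), 3)))
    phi = {v: sigma[c] for v, c in colouring.items()}  # relabelled colouring
    s = {v: secrets.token_bytes(16) for v in phi}
    commitments = {v: C(s[v], phi[v]) for v in phi}    # sent to the verifier
    u, v = random.choice(E)                            # verifier's challenge
    # Prover opens the two endpoints; verifier checks the commitments and
    # that the revealed colours differ.
    return (commitments[u] == C(s[u], phi[u])
            and commitments[v] == C(s[v], phi[v])
            and phi[u] != phi[v])

print(all(round_ok() for _ in range(100)))  # completeness: True
```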
And so the protocol is an interactive proof system for 3COL, with completeness bound 1 and with a weak soundness bound: a cheating prover is caught with probability at least 1/|E| per round. Repeating the protocol yields an interactive proof as required46. For any verifier V′, we show a simulator MV′ that establishes that the prover has the zero knowledge property.
• Let the input be G = (V, E), where V = {1, . . . , n}.
• Again, let q(|G|) be a polynomial bound on the running time of V′. Select a uniformly random string r ∈ {0, 1}^{q(|G|)} to serve as the random bits to be used.
not yield any knowledge. Luckily, the sequential composition lemma that we assumed earlier ensures that this is the case
• After V′ has been simulated, we can assume that it will have some message m on its outgoing message tape. We can also assume that m ∈ E, as if it is not we can adapt MV′ so that it selects some edge in E to be used instead.
• Let (u, v) = m. Then if e_u ≠ e_v the simulator halts, outputting the transcript of the simulated interaction
• Else we output ⊥
First of all, you might be puzzled by the presence of the ⊥ symbol in this simulator. However, we noted before that computational proofs do not need such help, as repeating the simulation often will result in it failing negligibly often, and as such the interaction will still be computationally indistinguishable. So, adding the capability of outputting ⊥ does not "add" any extra power to the definition, and as such we can use it here to simplify our life (requiring, as always, that it is not output too often). We first note that the ⊥ symbol is only output when the two endpoints of the edge selected by the verifier V′ have been assigned the same colour. Intuitively, we would like to use the fact that commitments are computationally indistinguishable in order to show that the verifier cannot do much better than to randomly pick an edge (in which case the probability of its two endpoints having the same colour is 1/3). This is easy to show, as any V′ that is able to distinguish commitments will directly yield an algorithm that violates the secrecy property of the scheme47. Now, for the next step we aim to show that the view of the interaction of the prover and the verifier is computationally indistinguishable from the output of the simulator. In particular, let mV′(G) be the random variable denoting the output of MV′ on input G conditioned on it being different from ⊥. Let also A be any probabilistic polynomial time algorithm. We let:
verifier gets computationally indistinguishable ensembles, the two probabilities cannot differ by a non-negligible factor (as then V′ would yield an algorithm that contradicts the secrecy requirement of the scheme). One can refer to [14] for a complete proof.
A couple of things are worth noting. First of all, the proof above has a weak soundness bound, and as such a one round protocol has only a moderate chance of convincing the verifier. However, as always, repeating the algorithm k · |E| times yields a proof with error probability bounded by e^−k, and the sequential composition lemma will also guarantee that this is zero knowledge. Secondly, the main significance of this proof lies in the fact that 3COL is an NP-complete language, and as such every other problem in NP is polynomial time reducible to it. This implies that, if one way functions exist, then every language in NP has a computational zero knowledge proof.
8 Applications
In the above sections, we discussed some of the mathematical definitions of zero knowledge proofs, their properties and some basic proofs for languages such as GI and 3COL. Of course, all would be for naught if Zero Knowledge Proofs did not have practical applications, and were not a capable tool for solving various problems. Taking inspiration from [22], we hope to show a variety of proofs that are applicable to real life problems.
Proof of knowledge of the discrete log x of β in base α

  Prover                                      Verifier
  α, β, N                                     (common input48)
  r ←$ {0, . . . , φ(N) − 1}
  γ := α^r mod N
                 ———— γ ————→
                                              b ←$ {0, 1}
                 ←———— b ————
  y ≡ r + bx (mod φ(N))
                 ———— y ————→
                                              return α^y ≡? γ β^b (mod N)
One can easily verify that, if both parties follow the protocol, then the equation
αy = αr+bx = αr αbx = γ(αx )b = γβ b holds. The main idea is that the prover
commits itself to a value r, that is kept hidden from the verifier thanks to the
hardness of the discrete log. Then the verifier issues a challenge, in this case
the bit b. If a cheating prover did not know x, then he would not be able to
48 In particular N is chosen so that the discrete log is hard in the group Z×_N. α is chosen to be a primitive root modulo N
find a y that satisfies the above equation. Furthermore, the random parameter r ensures that the verifier does not learn anything from y about x, ensuring the scheme is zero knowledge. Now, the idea is that, using something called the Fiat-Shamir Heuristic [25], we can turn an interactive proof into a non-interactive one. In summary, since the only contribution of the verifier to the interaction is to provide a challenge, we can replace this interaction with the use of a cryptographic hash function: the prover computes a hash of the public parameters, and uses the hash as the challenge. Since the challenge is determined by public parameters, the verifier can make sure that the prover is not cheating by selecting a challenge that would advantage it. Also, since the hash function behaves indistinguishably from a random function, in order to fool the verifier the prover would have to find a collision in the hash function output, which is computationally at least comparably hard to solving the original problem! A transformation of the above protocol to a non-interactive one is as follows:
  b′ := H(α, β, γ)
  return α^y ≡? γ β^{b′} (mod N)
• The prover now computes b := H(α, β, γ), which will be its challenge
• Finally, the prover computes y ≡ r + bx (mod φ(N)), and sends the pair (γ, y) to the verifier
• The verifier, or indeed anyone, can check that α^y ≡? γ β^b (mod N). Also note how the verifier can compute b from the public parameters easily.
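A toy instantiation of the Fiat-Shamir transformed proof, with deliberately tiny (and therefore insecure) parameters of our own choosing:

```python
import hashlib
import random

# Toy parameters (far too small for security). N is prime, so Z*_N has
# order N - 1, and 2 is a primitive root modulo 1019.
N = 1019
alpha = 2
x = 347                     # the prover's secret discrete log
beta = pow(alpha, x, N)     # public: beta = alpha^x mod N

def H(*vals) -> int:        # Fiat-Shamir challenge from public values
    data = ",".join(map(str, vals)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (N - 1)

def prove():
    r = random.randrange(N - 1)
    gamma = pow(alpha, r, N)       # commitment
    b = H(alpha, beta, gamma)      # challenge derived from the transcript
    y = (r + b * x) % (N - 1)      # response, reduced mod the group order
    return gamma, y

def verify(gamma, y) -> bool:
    b = H(alpha, beta, gamma)      # anyone can recompute the challenge
    return pow(alpha, y, N) == (gamma * pow(beta, b, N)) % N

print(verify(*prove()))  # True
```

The check succeeds because α^y = α^{r+bx} = γ β^b, exactly as in the interactive protocol; the hash has simply taken the verifier's place in choosing b.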
The reasoning for why this works is exactly the same as for the interactive version, but the hash function H(·) allows us to fairly select a challenge without the possibility of cheating. In particular, a further modification of the protocol gives us Schnorr signatures, a way to sign a message using 4t-bit signatures, where t is a security parameter. In fact, it was later shown [26] that only 3t bits are required. Verification in the scheme works as follows:

  γ′ ≡ α^y β^e (mod N)
  return e =? H(γ′, M)
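Under the same toy parameters as before, Schnorr signing and verification can be sketched as below; we take y ≡ r − ex (mod φ(N)) so that the verifier's recomputation γ′ ≡ α^y β^e recovers γ, matching the check above (parameters and names are our own illustrative choices):

```python
import hashlib
import random

N = 1019                    # toy prime modulus (insecure, illustration only)
alpha = 2                   # primitive root mod N
x = 123                     # signing key
beta = pow(alpha, x, N)     # verification key: beta = alpha^x mod N

def H(gamma: int, M: str) -> int:
    return int.from_bytes(hashlib.sha256(f"{gamma},{M}".encode()).digest(), "big")

def sign(M: str):
    r = random.randrange(N - 1)
    gamma = pow(alpha, r, N)
    e = H(gamma, M)                # challenge bound to the message
    y = (r - e * x) % (N - 1)      # so alpha^y * beta^e = alpha^r = gamma
    return e, y                    # the signature is the pair (e, y)

def verify(M: str, e: int, y: int) -> bool:
    gamma_p = (pow(alpha, y, N) * pow(beta, e, N)) % N  # gamma' = alpha^y beta^e
    return e == H(gamma_p, M)

e, y = sign("hello")
print(verify("hello", e, y), verify("tampered", e, y))  # True False
```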
8.2 Secure Remote Password
The Secure Remote Password protocol [27], or SRP for short, is the most common example of zero knowledge proofs used for authentication. Common schemes usually use a plaintext approach, in which the user communicates a username and a password to a (trusted) server, which then has the burden of securely maintaining said information. On the other hand, SRP only requires that the user itself maintain a secret, is resistant to dictionary attacks, and crucially requires no trusted party. In particular, one of its most attractive features is that it can work as a drop-in replacement that does not require users to change their workflow. SRP is based on Diffie-Hellman, and as such it has some of the same cryptographic requirements, namely the intractability of the discrete logarithm and the existence of one way functions. The protocol consists of two steps: the registration step and the authentication step.
Secure Remote Password, with sec param k

  User I                                      Host
  . . . . . . . . . . . . Registration . . . . . . . . . . . .
  p, s ←$ {0, 1}*
  x := H(s, p)
  v := g^x
                 ———— s, v ————→
                                              store (I, s, v)
  . . . . . . . . . . . Authentication . . . . . . . . . . .
  a ←$ {1, . . . , |G|}
  A := g^a
                 ———— I, A ————→
                                              lookup s, v from I
                                              b ←$ {1, . . . , |G|}
                                              B := kv + g^b
                 ←———— s, B ————
  u := H(A, B)                                u := H(A, B)
  x := H(s, p)                                S := (A v^u)^b
  S := (B − k g^x)^{a+ux}                     K := H(S)
  K := H(S)
• Preliminaries:
• Both parties need to agree on a large safe prime N, and consequently on a group G ≅ Z_N. Also, they agree on a generator g of the group, and on a parameter k ∈ G.
• Both parties select a cryptographically secure hash function H.
• Finally, the parties agree on an identifier I for the user U.
• The user selects a password p and a random (usually small) salt s.
• It computes x := H(s, p). Also, it computes v := g^x.
• Finally, the user sends to the host the pair (s, v)
• The host stores the triple (I, s, v)
Once this is done, the user can authenticate using the following protocol:
• The user selects a random a, and computes A := g^a; it then sends (I, A) to the host.
• The host looks up (s, v) using I, selects a random b and computes B := kv + g^b. It then sends (s, B) to the user.
• Both parties set u := H(A, B)
• The user computes x := H(s, p), and from it the session key S := (B − k g^x)^{a+ux}. Finally it computes the key K := H(S).
• On its end the host computes the session key S := (A v^u)^b. The final key is then K := H(S).
Finally, the proof that the authentication succeeded is given by the two parties comparing the final key. Technically speaking, SRP is a Zero Knowledge Password Proof, which is a narrower class of Zero Knowledge Proof; however, you can see how the ideas that were developed in the previous sections are still easily transferable. In particular, the prover is the user, which shows that it has knowledge of a value x such that the equation below holds for any random a, b, u, and for v = g^x:

(kv + g^b − k g^x)^{a+ux} =? (g^a v^u)^b

Of course, the security lies in the fact that, assuming the hardness of the discrete logarithm, computing x from g^x should be infeasible.
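A complete toy run of the protocol, with small insecure parameters of our own choosing (a real deployment would follow the group, hash and padding details of RFC 5054), confirms that both sides derive the same key:

```python
import hashlib
import random

def H(*vals) -> int:
    data = ",".join(map(str, vals)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

N, g, k = 1019, 2, 3   # toy group: safe prime N, generator g, parameter k

# Registration: the user derives v = g^x from its password and a salt.
p, s = "correct horse", 42
x = H(s, p) % (N - 1)
v = pow(g, x, N)                    # the host stores (s, v), never p or x

# Authentication.
a = random.randrange(1, N - 1)
A = pow(g, a, N)                    # user -> host
b = random.randrange(1, N - 1)
B = (k * v + pow(g, b, N)) % N      # host -> user (along with the salt s)
u = H(A, B)

S_user = pow((B - k * pow(g, x, N)) % N, a + u * x, N)
S_host = pow((A * pow(v, u, N)) % N, b, N)
print(S_user == S_host)             # both sides hold the same session key
K = H(S_user)                       # the shared key the parties compare
```

The agreement follows because B − k·g^x ≡ g^b, so both sides compute g^{b(a+ux)}.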
such as Bitcoin to be truly anonymous and confidential. This is not the case currently as in
every transaction the addresses of the parties in play are known, and the amount transferred
as well
is that, if x is the date of birth of the user, they could prove to some sort of service that they are over 18, without revealing any private information. The kinds of proofs that are used to solve this problem are referred to as Zero Knowledge Range Proofs. In particular, the proofs that we are most interested in in this case are those referred to as Non Interactive. In this kind of proof, the challenge and answer method that we presented before is lost: Peggy instead compiles a "transcript" of a proof that Victor can check on his own, of course while leaking as little information as possible. These kinds of proofs are particularly helpful in a network setting, in which communication is expensive. Let us look at a simple example. First of all we need to introduce the idea of the Pedersen Commitment [28].
Definition 27. Let G be a group in which the discrete log is hard, and let g, h be two generators. For a secret m and a random value r, the Pedersen Commitment of m is:
C_r(m) = g^m h^r
Pedersen Commitments have the desirable property that they are homomorphic, i.e. C_{r_1+r_2}(m_1 + m_2) = C_{r_1}(m_1) · C_{r_2}(m_2). Now, suppose we want to show in one interaction that x ∈ [0, 2^N − 1]. A way to do that is to show that we can encode x = Σ_{i=0}^{N−1} b_i 2^i where each b_i ∈ {0, 1}. A non interactive proof [29] for this fact is as follows:
• Select a random value r with bits r_i, so that r = Σ_{i=0}^{N−1} r_i 2^i.
• Compute the commitments c_i = C_{r_i}(b_i), and let c = C_r(x).
• Send to the verifier the commitments c_i together with a proof π that each c_i is a commitment to a value which is either 0 or 1. This can be done by combining discrete log knowledge proofs such as [23] with a proof of partial knowledge such as those described in [30]. Essentially, by the property of the commitment, the value of c_i is either h^{r_i} or g·h^{r_i}. We prove that we either know the discrete log of c_i in base h or that we know the discrete log of g^{−1}c_i in base h.
• Finally, the verifier checks the proof π and that c = Π_{i=0}^{N−1} c_i^{2^i}.
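The commitment arithmetic above (leaving the proofs π aside) can be sketched as follows. The toy subgroup of Z_23^×, the generators g, h, and the bit length N are all illustrative assumptions, far too small for real use.

```python
import secrets

p, q = 23, 11        # toy safe prime p = 2q + 1; far too small for real use
g, h = 2, 3          # two generators of the order-q subgroup (log_g h must be unknown)
N = 4                # we show x fits in N bits, i.e. x in [0, 2^N - 1]

def commit(m, r):
    # Pedersen commitment C_r(m) = g^m * h^r mod p
    return (pow(g, m, p) * pow(h, r, p)) % p

x = 13                                    # the secret, 13 = 1101 in binary
r = secrets.randbelow(2 ** N)             # toy choice so that r itself fits in N bits
bits  = [(x >> i) & 1 for i in range(N)]  # the b_i
rbits = [(r >> i) & 1 for i in range(N)]  # the r_i, bits of r

c  = commit(x, r)
cs = [commit(bi, ri) for bi, ri in zip(bits, rbits)]

# Verifier's check c == prod_i c_i^(2^i), which holds by the homomorphic property:
# prod_i g^(b_i 2^i) h^(r_i 2^i) = g^x h^r = c
check = 1
for i, ci in enumerate(cs):
    check = (check * pow(ci, 2 ** i, p)) % p
assert check == c
```

Of course, without π a cheating prover could commit to non-bit values; the sketch only shows why the product check binds c to its claimed bit decomposition.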
Just to fill in the gap, the proof of partial knowledge in this case can be extracted and simplified; it works as follows. Recall that in that step of the proof the prover has committed c_i = g^{b_i} h^{r_i}, where b_i ∈ {0, 1}, so that c_i = h^{r_i} or c_i = g·h^{r_i}. Note that in the first case we know a discrete log of c_i, while in the second we know the discrete log of g^{−1}c_i (in both cases in base h, and in both cases the log is r_i). If we manage to prove either one of the facts, without letting the verifier know which one actually holds, then we will be set. A protocol for this works as follows (due to [29]):
• Let E, f, g ∈ G be common inputs. E is equal to either f^z or g·f^z, where z is known by the prover only.
• The prover's actions are conditional on which case holds:
– If E = f^z, the prover uniformly and independently selects integers w, r_1, c_1, and computes a ≡ f^w and b ≡ f^{r_1}(E/g)^{−c_1}.
– If E = g·f^z, the prover uniformly and independently selects integers w, r_2, c_2, and computes a ≡ f^{r_2}E^{−c_2} and b ≡ f^w.
The prover then sends a, b to the verifier.
• The verifier sends a random integer c to the prover.
• Again, the prover acts differently in the two cases:
– If E = f^z, the prover computes c_2 ≡ c − c_1 and r_2 ≡ w + zc_2.
– If E = g·f^z, the prover computes c_1 ≡ c − c_2 and r_1 ≡ w + zc_1.
Finally, the prover sends r_1, r_2, c_1, c_2 to the verifier.
• The verifier accepts if the following hold:
c =? c_1 + c_2
f^{r_1} =? b(E/g)^{c_1}
f^{r_2} =? aE^{c_2}
9 Implementation
As supplemental material to this document, we produced a variety of Sagemath
worksheets that allow the user to experiment with most of the Zero Knowledge
Proofs presented in the text. In particular, the protocols presented are those
for:
1. Probabilistic Matrix Multiplication checking (Figure 1)
2. Graph Non Isomorphism Interactive protocol (Figure 2)
3. Graph Isomorphism Perfect Zero Knowledge Proof (Figure 3)
4. Graph 3-Colouring Computational Zero Knowledge Proof (Figure 4)
5. Discrete Logarithm Interactive Zero Knowledge Proof (Figure 5)
6. Discrete Logarithm Non Interactive Zero Knowledge Proof (Figure 6)
7. Schnorr Signature protocol (Figure 7)
8. Secure Remote Password protocol (Figure 8)
We considered providing a worksheet for the Zero Knowledge Range Proof described in 8.3, but the branching needed for the implementation translated very poorly to worksheets, and as such it would not have been insightful. In the first four worksheets we also provided “experiments”: that is, we set the worksheet up so that the corresponding protocol is executed multiple times serially, in order to show that the probabilities we derived correspond to the experimental results.
relabelling in order to correctly behave with randomly selected permutations. In every example that uses a Hash Function (Non Interactive Discrete Log, Schnorr Signature, Secure Remote Password) I needed to essentially hack my way around converting a hash back to an integer, which does not make me particularly proud.
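For reference, the conversion in question can be done with a couple of standard library calls. This is only a sketch: the choice of sha256 and of the reduction modulus q are assumptions, not what the worksheets necessarily use.

```python
import hashlib

def hash_to_int(data: bytes, q: int) -> int:
    # Interpret the digest as a big-endian integer, then reduce it into Z_q
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % q

q = 101  # stand-in group order; any modulus used by the protocol works here
c = hash_to_int(b"A || B || message", q)
assert 0 <= c < q
```

Since the digest is deterministic, both parties computing the challenge this way will always agree, which is exactly what the Fiat–Shamir style constructions require.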
Now, from a security standpoint there are a few things that could go wrong. First of all, the parameters that I used were chosen absolutely arbitrarily. They are far too small for any kind of serious application52. Furthermore, in cryptography you should always use parameters that are battle-tested, such as the NIST-endorsed ones [32]. This is because, even if in general we believe that a problem is intractable, this does not guarantee that all of its instances are, and there might53 be some particular configurations of parameters that yield very easy to solve problems. Next, consider the sources of randomness. In these worksheets, I have used the standard random number generator that comes with Sage, which, as far as I know, is not cryptographically secure. This could have many implications. First of all, suitable attacks could fix the seed, and thus render the prover deterministic, which could then be used to break the Zero Knowledge property. This is because, if the verifier knows the random bits of the prover, then it can simulate the prover's following actions perfectly, and as such gain knowledge by predicting every branch and then comparing it with the prover's actual message. Secondly, let us consider the Graph 3-Colouring protocol. In this protocol we have to use bit commitments, and I emulated the construction based on one way permutations that we detailed before. In order to model a one way function, I simply used a standard RNG to generate a string of bits. There are two implications of this. Since I have used a one way function instead of a one way permutation, it might be the case that there are collisions, and as such the unambiguity requirement might be violated (which makes the proof system weaker). Also, there probably are attacks on such a generator, which would allow a verifier to peek at the committed bit values before the reveal phase, allowing him to at least partially recover the three colouring.
graph which is too big is quite hard, but feasible for small ones
53 There almost always is
large n, and U_n uniformly distributed over {0, 1}^n, the following is negligible:
Pr[E(f(U_n), 1^n) ∈ f^{−1}(f(U_n))]
Definition 29. A function f : {0, 1}∗ → {0, 1}∗ is weakly one way if:
1. There exists a deterministic polynomial time algorithm A such that A(x) = f(x)
2. For every probabilistic polynomial time algorithm E, every sufficiently large n, and U_n uniformly distributed over {0, 1}^n, the following is non-negligible:
Pr[E(f(U_n), 1^n) ∉ f^{−1}(f(U_n))]
f_mult(x, y) = x · y
f_discrete(x) = g^x
f_ssum(x_1, . . . , x_N, I) = (x_1, . . . , x_N, Σ_{i∈I} x_i)
Here f_mult simply multiplies two integers together and, assuming the intractability of integer factorization, it is weakly one way. The second function, f_discrete, is the discrete logarithm problem, where g^x is taken in some group in which the discrete log is assumed to be hard, and g is a generator for the group (more in the discrete log appendix). Instead, f_ssum is very closely linked to the NP-complete problem known as subset sum. While it is true that if P = NP then f_ssum is not one way, the converse does not hold. To see this, note that subset sum could have exponential worst-case running time, yet be polynomially solvable in the average case, making f_ssum not one way54. In fact, P ≠ NP is a necessary but not sufficient condition for one way functions to exist. It turns out [14] that strongly one way functions exist if and only if weakly one way functions do. This justifies our assuming only the existence of one way functions, without specifying the type. While for one way functions it is hard to recover the complete input in general, there is nothing stopping a possible attacker from computing some partial information about the input. For example, if f is one way we can define the function55 g(x, r) = f(x)||r, which will be one way, yet always leak half of its input to the attacker. On the other hand, there is some information that such a g does not
54 In fact, we report f_ssum here only for historical relevance: originally it was thought to be a good candidate, while nowadays it is known that the SSUM problem is actually efficiently solved in many commonly arising cases [33], and as such we know that it is not one way
55 We use || to denote concatenation
leak, for example (if f is strongly one way): x ◦ r. Using this we can define the
following:
Definition 30. Let f : {0, 1}∗ → {0, 1}∗ be a one way function. A predicate
b : {0, 1}∗ → {0, 1} is a hard-core of f if:
average57. In particular, cyclic groups such as Z_p^× are often good candidates, even though if p − 1 has only small factors then there are efficient algorithms such as Pohlig–Hellman [34]. In practice, if modular arithmetic is used, the prime p is chosen so that p = 2q + 1 for some other prime q. Using this, and suitably large primes, the discrete logarithm problem is considered intractable.
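The p = 2q + 1 condition (such p are often called safe primes) is straightforward to check. A toy sketch using trial division, adequate only for the small illustrative sizes used in this report:

```python
def is_prime(n):
    # Trial division: fine for toy sizes only, not for cryptographic ones
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_safe_prime(p):
    # p is "safe" when p = 2q + 1 with q itself prime
    return is_prime(p) and is_prime((p - 1) // 2)

assert is_safe_prime(23)       # 23 = 2*11 + 1, and 11 is prime
assert not is_safe_prime(29)   # 29 = 2*14 + 1, and 14 is not prime
```

The point of the condition is that the group order p − 1 = 2q then has no small factors besides 2, which blocks the Pohlig–Hellman style decomposition.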
The other alternative to modular arithmetic is using elliptic curves. An elliptic curve is defined over a field58 F, which in practice is of the form Z_p for p prime; the curve is then the set of points (x, y) ∈ F^2, together with ∞, such that59:
y^2 = x^3 + αx + β
For suitable constant parameters α, β, we can define addition in a geometric manner: the sum of two points P, Q is given by first taking the line through the points (or the tangent at P if P = Q), finding the other point in which it intersects the curve, and then reflecting it in the x-axis. In particular, this gives us a group with the following properties:
• We denote by E(F) the elliptic curve over F with parameters α, β.
• We have ∞ as the identity, and we let P + ∞ = ∞ + P = P for all P ∈ E(F).
This definition is important, especially since the addition laws are easy to implement and optimize, yielding performance often superior to that of modular exponentiation. As an example we have E(F_29), with α = 4, β = 20. This group in particular has 37 points (including of course ∞), such as (2, 6) and (20, 26). We can also verify that (5, 22) + (16, 27) = (13, 6), and that 2(5, 22) = (14, 6) (and
57 It can be shown, using random self reducibility, that if a hard instance exists then on average the problem will also be hard
58 In fact, it is required that this field has characteristic not 2 or 3, otherwise the equations
of course that closure holds). Using this kind of construction, with well tested60 parameters such as the ones NIST recommends [32], we can obtain cryptosystems that are resistant to any non-quantum attacker that we know of, and with pretty good performance as well.
2. x ∈ L ⇐⇒ M(x) = 1
Also, if L ∈ P we say that L is recognizable in polynomial time.
We can now prove the equivalence of our two definitions of P; let us call the probabilistic definition P_prob and the classical one simply P.
Proof. Let L ∈ P. Then there exists a Turing Machine M such that the machine outputs 1 for each x ∈ L and always halts in polynomially many steps. Construct a probabilistic TM M_1 as follows: copy everything from M except the transition function, replacing each transition of the form φ(s, i) = x with φ_1(s, i) = (x, x). Recall from the definition that on each step the machine M_1 will randomly choose between the left and right transition. Since for each transition left and right are always equal, the choice is meaningless, and for each transition it will take the same action as M (taking of course the same exact number of steps). Since M recognizes L, M_1 will as well, and as such L ∈ P_prob, which implies P ⊆ P_prob.
Conversely, let L ∈ P_prob. Then there exists a Turing Machine M s.t. M halts in polynomial time, and x ∈ L ⇐⇒ Pr[M(x) = 1] = 1. Let l be a bound on the running time of M. This implies that, for any x ∈ L, we have ∀r ∈ {0, 1}^l : M_r(x) = 1. Construct a machine M_1 as follows: keep everything equal except for the transition function, replacing each transition of the form φ(s, i) = (x_0, x_1) with one φ(s, i) = x_0. This is equivalent to picking r to be 0^l, and as such M_1 will also output 1 for each element in L. Of course the running time will still be bounded by l, and as such L ∈ P, which in turn implies P_prob ⊆ P.
So, since P ⊆ P_prob and P_prob ⊆ P, it must be that P = P_prob
Definition 34. We say that a language L ∈ NP if there exists a relation R_L ⊆ {0, 1}∗ × {0, 1}∗ and a polynomial p(·) such that the following hold:
1. R_L is recognizable in polynomial time
2. x ∈ L ⇐⇒ ∃y such that (x, y) ∈ R_L and |y| ≤ p(|x|)
Such a y is a witness for membership of x ∈ L.
In particular, note the bound requirement on the witness y. This ensures that the certificate is not more than polynomially bigger than the problem statement. If this were not the case, consider for example the language TAUT, i.e. the language of all boolean formulas that return true for every input. If the bound on the polynomial size of the certificate were lifted, then we could allow for a relation language R_TAUT = {(φ, 1^{2^n}) | n ∈ N, φ ∈ TAUT}. Such a language is recognizable in polynomial time in the certificate (as evaluating a boolean formula of n variables takes polynomial time in the formula length, and enumerating all possible n boolean variables takes O(2^n), which is polynomial in the length of the certificate); however, if the bound is upheld we have no evidence of TAUT belonging to NP (in fact, we know that TAUT ∈ co-NP, and the relation between the two classes is still not clear).
We can similarly prove that our previous definition of NP (call it NP_prob) is the same as the classical one.
Proof. Let L ∈ NP. Then there exists a relation R_L s.t. R_L is recognizable in polynomial time, x ∈ L ⇐⇒ ∃y : (x, y) ∈ R_L, and y is at most polynomial in the size of x. Consider a probabilistic polynomial time machine M as follows:
1. On each step randomly guess a bit of the certificate y
2. Randomly decide whether to generate another bit or to run a polynomial time algorithm to decide whether (x, y) ∈ R_L
3. If we recognized (x, y) ∈ R_L, accept, else reject.
Since the certificate exists, and it is polynomial in the size of x, there will always be at least one extremely lucky run that guesses it, and as such Pr[M(x) = 1] > 0 for x ∈ L. So L ∈ NP_prob, which implies NP ⊆ NP_prob.
Conversely, let L ∈ NP_prob. Then there exists a probabilistic polynomial time machine M s.t. x ∈ L ⇐⇒ Pr[M(x) = 1] > 0. This implies that ∀x ∈ L ∃r s.t. M_r(x) = 1. Construct R_L = {(x, r) | x ∈ L and M_r(x) = 1}. First of all, note that R_L is polynomial time recognizable, as one can run M_r(x) and accept depending on the output (this only works since M is guaranteed to be polynomial time). The condition for existence is met by our above reasoning. Finally, since r is bounded by the running time of the machine, which is polynomial, the condition for the size of the certificate is met. As such L ∈ NP, which in turn implies NP_prob ⊆ NP.
So, since NP ⊆ NP_prob and NP_prob ⊆ NP, it must be that NP = NP_prob
Definition 35. A language L is polynomial time reducible to a language M if there exists a polynomial time computable function f s.t. x ∈ L ⇐⇒ f(x) ∈ M. We write in this case L ≤ M.
Definition 36. We say a language L is NP-Hard if ∀M ∈ NP, M ≤ L.
References
[1] Daira Hopwood, Sean Bowe, Taylor Hornby, and Nathan Wilcox. Zcash
protocol specification. GitHub: San Francisco, CA, USA, 2016.
[10] Michael Chambers. ChemIDplus - 0000302272 - XFSBVAOIAHNAPC-NPVHKAFCSA-N - Aconitine [USP] - Similar structures search, synonyms, formulas, resource links, and other chemical information. https://chem.nlm.nih.gov/chemidplus/rn/302-27-2.
[11] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The Knowledge Complexity of Interactive Proof Systems. SIAM Journal on Computing, 18(1):186–208, February 1989.
[12] Adi Shamir. IP = PSPACE. Journal of the ACM, 39(4):869–877, October 1992.
[13] Oded Goldreich, Silvio Micali, and Avi Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. Journal of the ACM, 38(3):690–728, July 1991.
[14] Oded Goldreich. Foundations of Cryptography. Vol. 1: Basic Tools. Cambridge Univ. Press, Cambridge, digitally print. 1. paperback version edition, 2007. OCLC: 255762567.
[18] Moni Naor. Bit Commitment Using Pseudo-Randomness. Journal of Cryptology, 4(2):151–158, 1991.
[19] Manuel Blum. Coin flipping by telephone: a protocol for solving impossible problems. ACM SIGACT News, January 1983. New York, NY, USA.
[22] Eduardo Morais, Tommy Koens, Cees van Wijk, and Aleksei Koren. A survey on zero knowledge range proofs and applications. SN Applied Sciences, 1(8):946, July 2019.
[23] C. P. Schnorr. Efficient signature generation by smart cards. Journal of
Cryptology, 4(3):161–174, January 1991.
[24] David Chaum, Jan-Hendrik Evertse, and Jeroen van de Graaf. An Improved Protocol for Demonstrating Possession of Discrete Logarithms and Some Generalizations. In David Chaum and Wyn L. Price, editors, Advances in Cryptology — EUROCRYPT ’87, Lecture Notes in Computer Science, pages 127–141, Berlin, Heidelberg, 1988. Springer.
[25] Amos Fiat and Adi Shamir. How To Prove Yourself: Practical Solutions to Identification and Signature Problems. In Andrew M. Odlyzko, editor, Advances in Cryptology — CRYPTO ’86, Lecture Notes in Computer Science, pages 186–194, Berlin, Heidelberg, 1987. Springer.
[26] Gregory Neven, Nigel P. Smart, and Bogdan Warinschi. Hash function
requirements for Schnorr signatures. Journal of Mathematical Cryptology,
3(1), January 2009.
[27] Thomas Wu. The Secure Remote Password Protocol. In Proceedings of the 1998 Internet Society Network and Distributed System Security Symposium, pages 97–111, 1998.
[34] S. Pohlig and M. Hellman. An improved algorithm for computing logarithms over GF(p) and its cryptographic significance (Corresp.). IEEE Transactions on Information Theory, 24(1):106–110, January 1978.