
Electronic Colloquium on Computational Complexity, Revision 1 of Report No. 57 (2022)

When Arthur has Neither Random Coins nor Time to Spare: Superfast Derandomization of Proof Systems

Lijie Chen * Roei Tell †

May 22, 2022

Abstract
What is the actual cost of derandomization? And can we get it for free? These
questions were recently raised by Doron et al. (FOCS 2020) and have been attracting
considerable interest. In this work we extend the study of these questions to the setting
of derandomizing interactive proof systems.
First, we show conditional derandomization of MA and of AM with optimal runtime
overhead, where optimality is under the #NSETH assumption. Specifically, denote
by AMTIME^{[c]}[T] a protocol with c turns of interaction in which the verifier runs
in time T. We prove that for every ε > 0 there exists δ > 0 such that:

MATIME[T] ⊆ NTIME[T^{2+ε}],

AMTIME^{[c]}[T] ⊆ NTIME[n · T^{⌈c/2⌉+ε}],

assuming the existence of properties of Boolean functions that can be recognized quickly
from the function's truth-table, such that functions with the property are hard for proof
systems that receive a near-maximal amount of non-uniform advice.
To obtain faster derandomization, we introduce the notion of a deterministic doubly
efficient argument system. This is a doubly efficient proof system in which the verifier
is deterministic, and the soundness is relaxed to be computational, as follows: For every
probabilistic polynomial-time adversary P̃, the probability that P̃ finds an input x ∉ L
and a misleading proof π such that V(x, π) = 1 is negligible.
Under strong hardness assumptions, we prove that any constant-round doubly efficient
proof system can be compiled into a deterministic doubly efficient argument
system, with essentially no time overhead. As one corollary, under strong hardness
assumptions, for every ε > 0 there is a deterministic verifier V that gets an n-bit formula
Φ of size 2^{o(n)}, runs in time 2^{ε·n}, and satisfies the following: An honest prover running
in time 2^{O(n)} can print, for every Φ, a proof π such that V(Φ, π) outputs the number
of satisfying assignments for Φ; and for every adversary P̃ running in time 2^{O(n)}, the
probability that P̃ finds Φ and π such that V(Φ, π) outputs an incorrect count is 2^{−ω(n)}.

* Massachusetts Institute of Technology, MA. Email: [email protected]


† The Institute for Advanced Study at Princeton NJ and the DIMACS Center at Rutgers University, NJ. Email:
[email protected]

ISSN 1433-8092
Contents
1 Introduction 1
1.1 Superfast derandomization of MA . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Superfast derandomization of AM . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 A lunch that looks free: Deterministic doubly efficient arguments . . . . . . . 4
1.4 Implications for derandomization of prBPP . . . . . . . . . . . . . . . . . . . 7

2 Technical overview 8
2.1 Warm-up: Proof of Theorem 1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Proof of Theorem 1.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Proofs of the results from Section 1.3 . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Proof of Theorem 1.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3 Preliminaries 15

4 Superfast derandomization of MA 23

5 Superfast derandomization of AM 26
5.1 Refining the reconstructive PRG from Proposition 4.1 . . . . . . . . . . . . . . 26
5.2 The superfast derandomization result: Basic version . . . . . . . . . . . . . . 30
5.3 The superfast derandomization result: Stronger version . . . . . . . . . . . . 34
5.4 Uniform trade-offs for AM ∩ co AM . . . . . . . . . . . . . . . . . . . . . . . 42

6 Optimality under #NSETH 47

7 Deterministic doubly efficient argument systems 48


7.1 Warm-up: The case of an MA-style system . . . . . . . . . . . . . . . . . . . . 49
7.2 Basic case: Doubly efficient proof systems with few random coins . . . . . . 51
7.3 The general case: Constant-round doubly efficient proof systems . . . . . . . 61

A Useful properties are necessary for derandomization of MA 70

B Proof of Lemma 7.5.3.4 72

1 Introduction
We study the well-known problem of eliminating randomness from interactive proof
systems. While we do not expect to be able to eliminate randomness from general proof
systems (as that would imply that NP = IP = PSPACE, by [Sha92]), a classical
line of works suggests that for proof systems that only use constantly many rounds
of interaction, this might be doable. In particular, assuming sufficiently strong circuit
lower bounds,1 we have that AM = NP (for a subset of prominent works in this area,
see, e.g., [KM98; IKW02; MV05; SU05; GSTS03; SU07]).
Recently, Doron et al. [DMO+20] raised the fine-grained derandomization question
of what is the minimal time overhead that we need to pay when eliminating randomness
from probabilistic algorithms. In their paper and in two subsequent works [CT21b;
CT21a], the following picture emerged: Under sufficiently strong lower bounds for
algorithms that use non-uniformity, we can derandomize probabilistic algorithms that
run in time T by deterministic algorithms that run in time n · T^{1+ε}, for an arbitrarily
small ε > 0 (see [CT21b] for details);2 and furthermore, if we are willing to settle for
derandomization that errs on a negligible fraction of inputs over any polynomial-time
samplable distribution, then we can derandomize probabilistic time T in deterministic
time n^ε · T for an arbitrarily small ε > 0 (see [CT21a]).
The current paper studies the question of superfast derandomization of proof systems:
We ask what is the minimal time overhead that we need to pay when eliminating
randomness from proof systems such as MA and AM. While it is well-known that
proof systems with constantly many rounds can be simulated by proof systems with
two rounds (see [BM88]), in this work we care about fine-grained time bounds, and
therefore we care about the precise number of rounds in any such system. We denote
by AMTIME^{[c]}[T] the set of problems solved by proof systems that use c turns of
interaction, where in each turn the relevant party communicates T(n) bits (recall that
the verifier just sends random bits; see Definition 3.5 for precise details).

Our contributions, from a bird's-eye view. First, we construct essentially optimal
derandomization algorithms for MA, and for AM protocols with constantly many rounds. The
derandomization algorithms are constructed under appealing hardness hypotheses,
which compare favorably to hypotheses from previous works (and the claim that they
are essentially optimal is under the assumption #NSETH). The results for MA and for
AM are presented in Sections 1.1 and 1.2, respectively.
Secondly, we introduce the notion of deterministic doubly efficient argument systems,
which are deterministic doubly efficient proof systems such that no polynomial-time
adversary can find, with non-negligible probability, an input x ∉ L and a proof π
such that the verifier accepts x given proof π. Under strong hardness assumptions,
we compile every constant-round doubly efficient proof system into a deterministic
doubly efficient argument system with essentially no time overhead.
As a special case of the latter result, under strong hardness assumptions, we construct
a verifier that certifies the number of solutions for a given n-bit formula of size
2^{o(n)} in time 2^{ε·n}, for any ε > 0, such that no 2^{O(n)}-time algorithm can find a formula
and a proof on which this verifier errs. The general result mentioned in the preceding
paragraph as well as the special case of #SAT are presented in Section 1.3.
1 For example, it suffices to assume that for some ε > 0 it holds that E = DTIME[2^{O(n)}] is hard for
SVN circuits of size 2^{ε·n} on all input lengths (see [MV05]).
2 This is tight for worst-case derandomization, under the hypothesis #NSETH (see Assumption 3.19).

Useful properties. Many of our hardness hypotheses will rely on the existence of
useful properties, in the sense of Razborov and Rudich [RR97]: Loosely speaking, these
are algorithms verifying that a given string is the truth-table of a function that is
hard for a certain class. Specifically, for two complexity classes C and C′, we say
that a property Π of truth-tables is C′-constructive and useful against C if there is a C′-
algorithm that decides whether a truth-table is in Π, and if every truth-table in Π is
hard for algorithms from C (see Definition 3.3 for precise details).3
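In symbols, the two requirements just sketched can be written as follows (this is our paraphrase; the formal statement is Definition 3.3):

```latex
% Paraphrase of "C'-constructive and useful against C" (cf. Definition 3.3).
% A property is a set of truth-tables, $\Pi \subseteq \{0,1\}^*$.
\begin{enumerate}
  \item \textbf{(Constructivity.)} There is a $\mathcal{C}'$-algorithm that,
        given a truth-table $f \in \{0,1\}^{N}$ with $N = 2^{n}$,
        decides whether $f \in \Pi$.
  \item \textbf{(Usefulness.)} Every $f \in \Pi$, viewed as a function
        $f \colon \{0,1\}^{n} \to \{0,1\}$, is not computable by any
        algorithm from $\mathcal{C}$.
\end{enumerate}
```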
The reason that we consider useful properties is that they are necessary for any
derandomization of proof systems, even just of MA. In fact, the useful properties that are
necessary are constructive in quasilinear time (this follows using the well-known
approach of Williams [Wil13; Wil16]; see Appendix A for a proof). Accordingly, we will
consider useful properties in which truth-tables of length N = 2^n can be recognized in
non-deterministic polynomial time N^k = 2^{k·n} or even in non-deterministic near-linear
time N^{1+ε} = 2^{(1+ε)·n}, for a small constant ε > 0.
Previous works deduced superfast derandomization from hard functions whose
truth-tables can be printed quickly (i.e., they have bounded amortized complexity;
see, e.g., [CT21b]). The assumptions above are more relaxed: We only require
recognizing the truth-table (rather than printing it), we allow non-determinism (i.e., a proof)
in recognizing the truth-table, and we allow for a collection of hard truth-tables per
input length, rather than just a single hard truth-table per input length.

1.1 Superfast derandomization of MA


Our first result is a derandomization of MA that induces a (near-)quadratic time
overhead. This derandomization is (conditionally) tight, and it relies on the existence
of constructive properties that are useful against NP ∩ coNP machines that receive
near-maximal non-uniform advice. That is:

Theorem 1.1 (quadratic derandomization of MA and useful properties). For every ε >
0 there exists δ > 0 such that the following holds. Assume that there exists an NTIME[N^{2+ε/4}]-
constructive property useful against (N ∩ coN)TIME[2^{(2−δ)·n}]/2^{(1−δ)·n}. Then,

prMATIME[T] ⊆ prNTIME[T^{2+ε}].

We stress that, in contrast to derandomization of prBPP, for MA the quadratic
time overhead above might be unavoidable. To see this, assume that #NSETH holds;
that is, for every ε > 0 there is no non-deterministic algorithm running in time 2^{(1−ε)·n}
and counting the number of satisfying assignments for a given n-bit formula of size
2^{o(n)} (see Assumption 3.19). Then, for every ε > 0 we have that MATIME[T] ⊄
NTIME[T^{2−ε}]: The reason is that MA algorithms may simulate a sumcheck protocol
to count the number of satisfying assignments of a given formula in time Õ(2^{n/2})
(see [Wil16]). See Theorem 6.1 for a more general statement.
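To make the sumcheck step concrete, here is a minimal Python sketch of the protocol idea (our own toy illustration, not the construction from [Wil16]: we arithmetize a fixed 3-variable formula over a prime field, fold the honest prover into the same routine for brevity, and all function names are ours). The prover's round messages are univariate polynomials, represented here by their evaluations at three points.

```python
import random
from itertools import product

P = 2**61 - 1  # prime field for the arithmetization

def or_gate(a, b):
    # Arithmetization of OR: 1 - (1 - a)(1 - b).
    return (1 - (1 - a) * (1 - b)) % P

def g(x):
    # Arithmetization of the toy formula (x1 OR x2) AND x3;
    # field multiplication plays the role of AND.
    x1, x2, x3 = x
    return (or_gate(x1, x2) * x3) % P

N_VARS = 3

def prover_message(prefix, X):
    # Honest prover's message s_i(X): g summed over boolean settings of the
    # remaining variables, with earlier variables fixed to random points.
    total = 0
    for tail in product([0, 1], repeat=N_VARS - len(prefix) - 1):
        total += g(tuple(prefix) + (X,) + tail)
    return total % P

def run_sumcheck(claimed, rng):
    # Verifier: each round, check s_i(0) + s_i(1) against the running claim,
    # then move the claim to a random point r_i; finally query g itself once.
    inv2 = pow(2, P - 2, P)
    prefix, current = [], claimed % P
    for _ in range(N_VARS):
        e0, e1, e2 = (prover_message(prefix, X) for X in (0, 1, 2))
        if (e0 + e1) % P != current:
            return False
        r = rng.randrange(P)
        # Lagrange interpolation of s_i (degree <= 2) at the random point r.
        current = (e0 * (r - 1) * (r - 2) * inv2
                   - e1 * r * (r - 2)
                   + e2 * r * (r - 1) * inv2) % P
        prefix.append(r)
    return current == g(tuple(prefix))
```

Since the toy prover is always honest, a correct claim (here 3, the formula's number of satisfying assignments) is accepted, and any other claimed count is already caught by the first-round consistency check. The point relevant to the text is that the verifier's work per round is tiny, while only the prover sums over the Boolean cube.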
The technical result underlying Theorem 1.1 is actually stronger: Under the
hypothesis we deduce that prBPTIME[T] ⊆ prNTIME[T^{2+ε}] (see Theorem 4.2), and
if we assume that a deterministic algorithm can quickly print the hard truth-tables, we
can deduce that prBPTIME[T] ⊆ prDTIME[T^{2+ε}] (see Theorem 4.3). The latter
extension to prBPTIME compares favorably to assumptions needed to get the same
conclusion in previous works [DMO+20; CT21b]; we elaborate on this in Section 1.4.
3 We also assume non-triviality, i.e., for every n ∈ N the property Π contains functions on n input bits.

1.2 Superfast derandomization of AM
Our next step is to consider derandomization of AM protocols, and for simplicity
of presentation we focus on protocols with perfect completeness.4 Our second main
result is a derandomization of AMTIME^{[c]}[T], for any constant number c ∈ N of
turns, that is (conditionally) tight: Loosely speaking, we prove that any such protocol
can be simulated in non-deterministic time ≈ T^{c/2}, assuming the existence of
constructive properties that are useful against MAM protocols that receive near-maximal
non-uniform advice. That is:

Theorem 1.2 (superfast derandomization of AM). For every ε > 0 there exists δ > 0
such that the following holds. Assume that for every k ≥ 1 there exists an NTIME[N^{k+ε/3}]-
constructive property useful against MAM[2^{(1−δ)·k·n}]/2^{(1−δ)·n}. Then, for every polynomial
T it holds that

prAMTIME[T] ⊆ prNTIME[n · T^{1+ε}],

and furthermore for every constant c ∈ N it holds that

prAMTIME^{[c]}[T] ⊆ prNTIME[n · T^{⌈c/2⌉+ε}].

The conclusions in Theorem 1.2 are essentially tight, assuming #NSETH; see
Theorem 6.2 for details. Moreover, the hardness assumption is quite mild compared to
what one might expect; let us explain why we claim so, comparing our assumption
to ones in previous results. Recall that in classical results, to derandomize AM we
assume lower bounds for the non-uniform analogue of NP (see, e.g., [SU05]). The
assumption above is for a stronger class, namely the non-uniform analogue of MAM.
However, interestingly, this assumption still suffices to optimally derandomize AM
with any constant number of turns; that is, the derandomization overhead is optimal
for any number of rounds given hardness only for (non-uniform) MAM.
Now, observe that our assumption refers to hardness for time bounds 2^{k·n} ≫ 2^n
(i.e., upper bounds of 2^{k·n} and lower bounds of 2^{(1−δ)·k·n}, for every constant k ≥ 1).
This is quantitatively reminiscent of a hardness assumption from [CT21b] that was
used to deduce that prBPTIME[T] ⊆ prDTIME[n · T^{1+ε}]. However, the latter
result also needed an additional hardness assumption, namely the existence of one-way
functions; in contrast, in Theorem 1.2 we do not rely on any cryptographic assumption.
In fact, similarly to Section 1.1, the techniques underlying Theorem 1.2 also extend
to the setting of derandomization of prBPP, and allow us to deduce a conclusion as
in [CT21b] without cryptographic assumptions. For further details see Section 1.4.

A strengthening. We further strengthen Theorem 1.2 by relaxing its hypothesis.
Specifically, recall that in Theorem 1.2 we assumed that for every k ≥ 1 there is a
corresponding useful property, and let us denote it by L_k. We prove a stronger version
that relies on the same hardness hypothesis for L_1, but for every k > 1 only requires
L_k to be hard for NTIME[2^{(1−δ)·k·n}]/2^{(1−δ)·n} (rather than for MAM protocols with
the same complexity). See Section 5.3 and Theorem 5.18 for details.
4 Our results extend to the case of protocols with imperfect completeness, at the cost of strengthening
the hardness assumptions (see, e.g., Remarks 5.4 and 7.6).

Uniform tradeoffs for AM ∩ coAM. The results above relied on hardness for
protocols that use a large number of non-uniform advice bits. Classical results were able
to deduce derandomization of the more restricted class AM ∩ coAM relying only on
hardness assumptions for uniform protocols (see, e.g., [GSTS03; SU07]).
Indeed, we are able to show an "extreme high-end" analogue of these results too.
Assuming that there exists a function whose truth-tables can be recognized in near-linear
time but that is hard for 7-round protocols running in time 2^{(1−δ)·n}, we show
that AM ∩ coAM can be derandomized with only a quadratic time overhead:

Theorem 1.3 (uniform tradeoffs for superfast derandomization). For every ε > 0 there
exists δ > 0 such that the following holds. Assume that there exists L ∉ i.o.(MA ∩
coMA)TIME_2^{[7]}[2^{(1−δ)·n}] such that truth-tables of L of length N = 2^n can be recognized
in non-deterministic time N^{1+ε/3}.5 Then, for every time-computable T it holds that
(AM ∩ coAM)TIME[T] ⊆ (N ∩ coN)TIME[T^{2+ε}].

Needless to say, hardness for protocols with constantly many rounds did not appear
in previous results [GSTS03; SU07], and this is because previous results did not
consider fine-grained time bounds (in which case constantly many rounds can be
simulated by two rounds [BM88]).

1.3 A lunch that looks free: Deterministic doubly efficient arguments


In two previous works [CT21b; CT21a], the way to bypass the #NSETH barrier and
obtain faster derandomization results was to consider derandomization that is allowed
to err on a tiny fraction of inputs, and to construct derandomization algorithms that use
new non-black-box techniques. In the current work we follow in the same vein.
Let us first state a striking special case of the general results that we prove. Under
a very strong hardness assumption (we will be formal about the assumption when we
state the general case), we prove that for every ε > 0 there is a verifier that counts the
number of satisfying assignments for a given n-bit formula in time 2^{ε·n}, with one caveat: The
soundness of the verifier is only computational, rather than information-theoretic.

Theorem 1.4 (effectively certifying #SAT in deterministic time 2^{ε·n}; informal). Assume
that a certain lower bound for probabilistic machines with oracle access to proof systems holds
(see Theorem 7.9). Then, for every ε > 0 there is a deterministic verifier V that gets as input
an n-bit formula Φ of size at most 2^{o(n)}, runs in time 2^{ε·n}, and satisfies the following:

1. (Completeness.) There is an algorithm that, given any input formula Φ as above, runs
in time 2^{O(n)} and outputs a proof π such that V(Φ, π) = #SAT(Φ).6

2. (Computational soundness.) For every probabilistic algorithm P̃ running in time
2^{O(n)}, the probability that P̃(1^n) prints an n-bit formula Φ of size 2^{o(n)} and a proof π
such that V(Φ, π) ∉ {⊥, #SAT(Φ)} is 2^{−ω(n)}.
We remind the reader that constructing a verifier as above without the relaxation to
computational soundness (i.e., a verifier that is sound for all inputs and with any proof)
is believed to be impossible; this is known as the #NETH assumption, which is weaker
than the "strong" version #NSETH mentioned above. To the best of our knowledge,
the existence of a verifier such as the one in Theorem 1.4 was not conjectured before.
5 That is, we have that tt(L) ∈ NTIME[N^{1+ε/3}], where tt(L) = {f_n ∈ {0,1}^{N=2^n}}_{n∈N} and f_n is the
truth-table of L on n-bit inputs. Also, the subscript "2" in the notation (MA ∩ coMA)TIME_2^{[7]} refers
to MA ∩ coMA protocols with imperfect completeness (say, 2/3).
6 We denote by #SAT(Φ) the number of assignments that satisfy the formula Φ.
1.3.1 Deterministic doubly efficient argument systems
More generally, consider the notion of doubly efficient proof systems, as introduced by
Goldwasser, Kalai, and Rothblum [GKR15] (for a survey, see [Gol18]). These are
interactive proof systems in which the verifier is very fast (say, runs in quasilinear time)
and the honest prover is slower, but still efficient (say, runs in polynomial time). We
denote by deIP^{[c]}[T] the class of problems solvable by doubly efficient proof systems
with c turns of interaction, a verifier that runs in time T, and an honest prover that
runs in time poly(T) (see Definition 3.6).7
We initiate a study of deterministic doubly efficient argument systems. Recall that
argument systems are a standard relaxation of proof systems in which soundness is
guaranteed only against computationally bounded provers (see, e.g., [Tha22] for an
exposition, and [Gol18, Section 1.5] for a discussion of probabilistic doubly efficient
argument systems). In our notion of deterministic doubly efficient argument systems
we require the verifier to be deterministic (i.e., an NP-type verifier), and the honest prover
to be efficient, but we further relax the soundness requirement such that it is guaranteed
only against inputs that can be found by computationally bounded adversaries.

Definition 1.5 (deterministic doubly efficient argument system). We say that L ⊆
{0,1}* is in deDARG[T] if there exists a deterministic T-time verifier V such that the
following holds:

1. There exists a deterministic algorithm P that, when given x ∈ L, runs in time poly(T)
and outputs π such that V(x, π) = 1.

2. For every polynomial p, every probabilistic algorithm P̃ running in time p(T), and
every sufficiently large n ∈ N, the probability that P̃(1^n) prints x ∉ L of length n and
π ∈ {0,1}^{T(n)} such that V(x, π) = 1 is T(n)^{−ω(1)}.

Intuitively, the meaning of the soundness condition in Definition 1.5 is that no
efficient adversary can find a false claim x ∉ L and then convince the verifier that the
false claim is true, except with negligible probability.
One may wonder whether we can relax the condition that the honest prover is
efficient. We remark that such a definition would be too relaxed, and potentially allow
the resulting class to contain essentially all possible problems (even uncomputable
ones); we elaborate and further discuss our definitional choices in Section 3.2.2.

1.3.2 Derandomizing doubly efficient proofs into deterministic doubly efficient arguments
Our main result in this context is that, under strong hardness hypotheses, every
constant-round doubly efficient proof system can be simulated by a deterministic doubly
efficient argument system with essentially no time overhead. We stress that even if we
start with a system that has many rounds, after this simulation we still end up with an
NP-type verifier that has essentially the same time complexity. This opens the door to
first using a protocol with many rounds to solve a problem faster, and then simulating
this fast protocol by a deterministic doubly efficient argument system of roughly the
same complexity. (This is the approach that underlies Theorem 1.4.)
7 The original definition of doubly efficient proof systems uses the fixed time bound T = Õ(n) (see,
e.g., [GKR15; Gol18]). In this work we also consider more general time bounds.

As a first step, consider any deIP^{[c]}[T] protocol in which the verifier uses only
n^{o(1)} random coins. We simulate it by a deterministic doubly efficient argument
system, under the following assumption: There exists f : {0,1}^{O(T)} → {0,1}^{n^{o(1)}} such that
each output bit of f can be computed in time T̄, but no probabilistic machine with
oracle access to prAMTIME^{[c]}[n] can print (an approximate version of) the entire
string f(x) in time slightly larger than T̄; and this hardness holds with high probability
when choosing x from any efficiently samplable distribution.
Assumption 1.6 (non-batch-computability assumption; see Assumption 7.4). The (N ↦
K, K′, η)-non-batch-computability assumption for time T̄ with oracle access to O is the
following. There exists a function f : {0,1}^N → {0,1}^K such that:

1. There exists a deterministic algorithm that gets input (z, i) ∈ {0,1}^N × [K] and outputs
the i-th bit of f(z) in time T̄.

2. For every oracle machine M running in time T̄ · K′ with oracle access to O, and every
distribution z over {0,1}^N that is samplable in time polynomial in T̄, with probability
at least 1 − T̄^{−ω(1)} over the choice of z ∼ z it holds that Pr[M^O(z)_i ≠ f(z)_i] ≥ η, where
the probability is over i ∈ [K] and the random coins of M.8

Indeed, the name "non-batch-computability" in Assumption 1.6 refers to the fact
that each individual output bit of f is computable in time T̄, whereas printing (an
approximate version of) the entire K-bit string is hard for time T̄ · K′ > T̄ (even with
oracle access to O). Our first general result is then the following:
Theorem 1.7 (simulating doubly efficient proof systems with few coins by deterministic
doubly efficient argument systems; informal, see Theorem 7.5). Let L ∈ deIP^{[c]}[T]
such that the verifier for L uses only R(n) = n^{o(1)} random coins. Assume that for some
α, β ∈ (0,1) the (N ↦ K, K^β, α)-non-batch-computability assumption holds for time T̄ with
oracle queries of length O(T) to prAMTIME_2^{[c]}[n], where

N(n) = n + c · T,
K(n) = poly(R(n)),
T̄(n) = T · K.

Then L ∈ deDARG[T · n^{o(1)}].


The hardness hypothesis in Theorem 1.7 is very strong, but the corresponding
derandomization conclusion is also exceptionally strong. We discuss the hardness
hypothesis in detail in Section 7.2.1, and in particular compare it to known results about
batch-computing functions by interactive protocols, by Reingold, Rothblum, and
Rothblum [RRR21; RRR18]. Indeed, even an assumption considerably stronger than the one
in Theorem 1.7 still seems reasonable, in particular an assumption in which the oracle
is a linear-space machine (rather than a linear-time verifier in an interactive protocol);
see Section 7.2.1 for details. We also comment that "non-batch-computability" hardness
against probabilistic machines (without oracles) is necessary for derandomization
with no overhead in limited special cases (see [CT21a, Section 6.3]).
Theorem 1.7 implies Theorem 1.4 as a special case, because the underlying
probabilistic proof system for #SAT (i.e., a constant-round sumcheck protocol) uses a number
of random coins that is logarithmic in the running time. Moreover, in the special
8 In all our results we will use this assumption with T̄(N) = poly(N), in which case the error bound
T̄^{−ω(1)} is just any negligible function in the input length.

case of Theorem 1.4 we are able to considerably relax the hypothesis, relying on certain
properties of the proof system for #SAT; see Theorems 3.17, 7.8 and 7.9 for details.

The general case. Turning to the general case of simulating doubly efficient proof
systems (without restricting the number of random coins) by deterministic doubly
efficient argument systems, we can do so under one additional hypothesis:

Theorem 1.8 (simulating general doubly efficient proof systems by deterministic
doubly efficient argument systems; informal, see Corollary 7.12). For every α, β, ε ∈ (0,1)
there exist η, δ > 0 such that for every polynomial T(n) and constant c ∈ N the following
holds. Assume that:

1. There exists L_hard ∉ MATIME^{[c+1]}[2^{(1−δ)·n}]/2^{(1−δ)·n} such that, given n ∈ N, the
truth-table of L_hard on n-bit inputs can be printed in time 2^{(1+ε/3)·n}.

2. The (N ↦ K, K^β, α)-non-batch-computability assumption holds for time T̄ with oracle
queries of length O(T) to prAMTIME_2^{[c]}[n], where N = n + c · T^{1+ε/2} and K =
polylog(T) and T̄ = T^{1+ε/2} · K.

Then, deIP^{[c]}[T] ⊆ deDARG[T^{1+ε}].
The additional hypothesis in Theorem 1.8 (i.e., the one in Item (1)) is a strengthening
of the one in Theorem 1.2 that refers to the value k = 1. The strengthening is
because now we consider hardness for protocols with c + 1 turns rather than 3 turns,
we assume a single hard problem rather than a useful property, and the upper bound
on the complexity of truth-tables is deterministic rather than non-deterministic.

1.3.3 Deterministic argument systems for NP-relations

In Definition 1.5 the honest prover runs in time T̄ = poly(T), and therefore such argument
systems exist only for problems in DTIME[T̄]. An alternative definition, which
would allow constructing deterministic argument systems for NTIME[T̄]-relations,
is to give the honest prover a witness for the corresponding relation as auxiliary input
(while still insisting that it runs in time T̄). Our proofs extend to this setting, allowing
us to simulate AMTIME^{[c]}[T] protocols for NTIME[T̄]-relations, in which the
honest prover is efficient given a witness for the relation, by deterministic argument
systems running in time T^{1+ε}. For further details see Remark 7.7.

1.4 Implications for derandomization of prBPP

Our proof techniques also extend to the setting of derandomization of prBPP (rather
than just of proof systems), and in this setting they yield results that compare favorably
to previous works. For derandomization in this setting we replace the assumptions
about useful properties with assumptions that truth-tables of hard functions can be
efficiently printed. (This is since our derandomization algorithm cannot guess-and-verify
a hard truth-table, but needs to efficiently print it.) It is useful to think of an
algorithm that prints a truth-table of a function as "batch-computing" the function.
The techniques underlying Theorem 1.1 extend to show a quadratic-time
derandomization of prBPP, assuming that a function whose truth-tables can be printed in
time 2^{(2+ε/3)·n} is hard for NTIME[2^{(2−δ)·n}]/2^{(1−δ)·n} (see Theorem 4.3).9 This
compares favorably to two relevant results in the previous works [DMO+20; CT21b], as
follows. (In the table below, the upper bound is for a deterministic algorithm that
prints the truth-table of the hard function on n bits.)

Upper bound       Lower bound                        Time overhead   Source

2^{(2+Θ(ε))·n}    NTIME[2^{(2−δ)·n}]/2^{(1−δ)·n}     T^{2+ε}         Theorem 4.3
2^{(2+Θ(ε))·n}    MATIME[2^{(1−δ)·n}]/2^{(1−δ)·n}    T^{2+ε}         [DMO+20]
2^{(3/2)·n}       NTIME[2^{(1−δ)·n}]/2^{(1−δ)·n}     T^{3+ε}         [CT21b, Thm 1.8]

While the three results above are formally incomparable, in the current work we
are able to obtain derandomization with quadratic time overhead from hardness for
NTIME machines with advice, whereas previous works either assumed hardness
for MATIME machines with advice or paid a cubic time overhead.
Moreover, the techniques underlying Theorem 1.2 extend to yield the conclusion
prBPTIME[T] ⊆ prDTIME[n · T^{1+ε}], which is optimal under #NSETH, under an
assumption that compares favorably to the one used to obtain derandomization with
such overhead in our previous work [CT21b]. The previous work obtained this conclusion
using a cryptographic assumption (the existence of one-way functions) as well as
a hardness assumption that is necessary for obtaining the conclusion using PRGs. In
the current work we are able to replace the cryptographic assumption with a hardness
assumption for MA protocols with advice (see Theorem 5.5 for precise details).

2 Technical overview
Most of the proofs in this work build sequentially on each other's ideas, and we
therefore encourage readers to read the current section sequentially.
In Section 2.1 we describe the simplest proof, which is that of Theorem 1.1. Then, in
Section 2.2, we build on this proof to prove Theorem 1.2. In Section 2.3 we describe
the more involved proofs in this paper, which are for the results of Section 1.3; these are
inherently different from the other proofs, and use non-black-box techniques while also
relying on the previous proofs. We conclude by explaining the proof of Theorem 1.3
in Section 2.4 (this proof does not rely on Section 2.3).

Basic building-block: A simple and efficient PRG. The starting point for our re-
sults is a very simple PRG, which was recently used for superfast derandomization
in [DMO+20; CT21b] (its ideas date back to [Sip88] and variations on them have been
used for derandomization of proof systems, e.g. [MV05]). Recall that a reconstructive
PRG is a pair of efficient algorithms: A generator Gen maps a truth-table f to a list G_f
of strings; and a reconstruction Rec maps every distinguisher D for G_f to an efficient
procedure Rec^D that computes f.^{10} Indeed, if f is hard for algorithms with complexity
as that of Rec^D, then it follows that G_f is pseudorandom.
^9 For simplicity, in this section we assert lower bounds for N T IME and MAT IME . The results
that we mention only require hardness for (N ∩ co N )T IME and (MA ∩ co MA)T IME , respectively.
^{10} When we say that D is a distinguisher for a list G_f ⊆ {0,1}^n, we mean that D distinguishes the
uniform distribution over the list from a truly uniform n-bit string.

The simple reconstructive PRG works as follows. Given f, the generator Gen
encodes it to f̄ = Enc(f) by an error-correcting code that is encodable in near-linear
time and locally list-decodable, partitions f̄ into consecutive substrings of size
N = |f|^{.99}, and outputs the list of |f̄|/N ≈ N^{.01} substrings. Indeed, this generator
Gen runs in near-linear time, and is thus particularly suitable for constructing superfast
derandomization algorithms. In contrast, the reconstruction Rec is inefficient, and in
addition only works when the distinguisher D : {0,1}^N → {0,1} is extremely biased;
for example, when D accepts all but 2^{N^{.99}} strings and yet rejects 1% of the strings in G_f.^{11}
Note that for any such D, a random pairwise-independent hash function h : {0,1}^N →
{0,1}^{N^{.999}} will not have collisions in the set D^{−1}(0). The reconstruction Rec gets as
advice a description of such h and the hash values of the N-bit substrings of f̄ that G_f
outputs. Given input z ∈ [|f|], it runs the local list-decoder for Enc, and whenever
the decoder queries q ∈ [|f̄|], the reconstruction non-deterministically de-hashes the
corresponding substring of f̄ to obtain f̄_q. That is, denoting by f̄^{(i)} the substring of f̄
that contains the index q, the reconstruction Rec guesses w ∈ {0,1}^N, verifies that
h(w) = h(f̄^{(i)}) using the stored hash value, verifies that w ∈ D^{−1}(0) by querying D,
and if both verifications pass then w = f̄^{(i)} (since there are no collisions in D^{−1}(0))
and Rec outputs the bit of w corresponding to index q.
We refer the reader to Proposition 4.1 for a clean statement that outlines the fore-
going basic version of this reconstructive PRG.
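To make the generator's structure concrete, here is a minimal Python sketch of Gen under simplifying assumptions: the function enc below is only a toy placeholder for Enc (a real instantiation uses an error-correcting code that is encodable in near-linear time and locally list-decodable, which the placeholder is not), and only the partition-into-substrings step that defines the output list is implemented faithfully.

```python
# Hypothetical sketch of the generator Gen: encode the truth-table f,
# partition the encoding f_bar into consecutive substrings of length
# N = |f|^{.99}, and output the list of substrings.

def enc(f: str) -> str:
    # Placeholder for Enc: a real instantiation is an error-correcting code
    # that is encodable in near-linear time and locally list-decodable.
    # This toy "code" (append the reversal) has neither property.
    return f + f[::-1]

def gen(f: str) -> list[str]:
    f_bar = enc(f)
    n_sub = round(len(f) ** 0.99)  # substring length N = |f|^{.99}
    # partition f_bar into consecutive length-N substrings
    return [f_bar[i:i + n_sub] for i in range(0, len(f_bar) - n_sub + 1, n_sub)]

f = "0110100110010110" * 64  # a toy 1024-bit truth-table
outputs = gen(f)
assert all(len(s) == round(len(f) ** 0.99) for s in outputs)
```

The point of the sketch is only the near-linear running time of Gen: everything it does is a single pass of encoding followed by slicing.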

Preliminary observations: Using this approach for derandomizing proof systems.


Note that the reconstruction Rec is non-deterministic, and thus when using (Gen, Rec)
above we need to assume hardness for non-deterministic procedures. Such assump-
tions are natural in the current context of derandomizing proof systems. In addition,
the natural way to use Gen for derandomizing proof systems, which is indeed the one
that we will use for many of our results, is to receive a truth-table f from a prover, ver-
ify that f is indeed hard (using an assumption that there is a constructive and useful
property), and then instantiate the PRG Gen f .

2.1 Warm-up: Proof of Theorem 1.1


The main bottleneck in the previous proofs [DMO+20; CT21b] that used (Gen, Rec)
came from the fact that Gen only fools extremely biased distinguishers, whereas our
goal is to construct a PRG that fools all distinguishers. In previous works this was
bridged by straightforward error-reduction: They considered a procedure D̄(z) that
uses a randomness-efficient sampler Samp and verifies that z satisfies

    Pr_{i∈[N^{1.01}]}[D(Samp(z, i)) = 1] ≈ Pr_{r∈{0,1}^N}[D(r) = 1],

and noted that if Gen_f fools D̄, then S ∘ Gen_f = {Samp(Gen_f, s)}_{s∈[N^{1.01}]} fools D.
(See [CT21b, Section 5.2] for a detailed explanation.)
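The following toy sketch illustrates this error-reduction wrapper. The function samp below is a hypothetical stand-in for the randomness-efficient sampler Samp (real constructions use, e.g., expander-based samplers, so that the seed z can be short), and the parameters are far smaller than the actual N^{1.01} samples; the sketch only exhibits why all but a tiny fraction of seeds are accepted by D̄.

```python
import random

# Toy illustration of the error-reduction wrapper D_bar: accept z iff the
# empirical acceptance rate of D over Samp(z, .) approximates D's true
# acceptance probability. Parameters are toy-scale stand-ins.

N = 12            # distinguisher input length (toy scale)
NUM_SAMPLES = 200

def samp(z: int, i: int) -> int:
    # derive the i-th sample from the seed z (hypothetical stand-in sampler)
    return random.Random(z * 1_000_003 + i).getrandbits(N)

def d(x: int) -> int:
    # some fixed distinguisher D: accept strings with a majority of ones
    return 1 if bin(x).count("1") > N // 2 else 0

def d_bar(z: int, true_p: float, err: float = 0.1) -> int:
    emp = sum(d(samp(z, i)) for i in range(NUM_SAMPLES)) / NUM_SAMPLES
    return 1 if abs(emp - true_p) <= err else 0

true_p = sum(d(x) for x in range(2 ** N)) / 2 ** N
good = sum(d_bar(z, true_p) for z in range(100))  # almost all seeds are "good"
```

Note that D̄ rejects only the rare seeds whose samples badly misestimate D's acceptance probability, which is exactly the extreme bias that Rec exploits.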
To see why this is a bottleneck, note that the resulting D̄ is of size O(N^{2.01}) (because
Samp uses N^{1.01} seeds s). Thus, if we use f that is hard for non-deterministic circuits
of such size, we need |f| > N^{2.01}, and the number of pseudorandom strings is (|f̄|/N) ·
N^{1.01} > N^{2.01}. Evaluating D on each of the strings, this yields derandomization in
near-cubic time.^{12}

^{11} Thus, the reconstructive PRG (Gen, Rec) is particularly suitable for the task of quantified derandomization;
see [GW14; DMO+20; CT21b; Tel21] for further details about this application.
The observation leading the way to Theorem 1.1 is simple: We do not really need
f to be hard for non-deterministic circuits of size N^{2.01}, but only for non-deterministic
algorithms that run in time N^{2.01} and use |f|^{.99} + N bits of advice. There can indeed be
such truth-tables of size |f| = N^{1.01}, as in the hypothesis of Theorem 1.1.
To elaborate, let us sketch the proof, and in fact let us show how to derandomize
pr BPP in near-quadratic non-deterministic time. We are given D : {0,1}^N → {0,1}
of linear size, and we want to approximate its acceptance probability up to a small
additive error. We guess f of size |f| ≈ N^{1.01} and verify in time N^{2.02} that f is hard for
non-deterministic algorithms running in time N^{2.01} and using O(N) bits of advice. Our
generator is S ∘ Gen_f, yielding N^{1.01} pseudorandom strings and derandomization in
quadratic time. The reconstruction Rec gets D and the hash values as advice; whenever
the list-decoder queries D̄, then Rec computes D̄ with queries to D in time N^{2.01}. See
Section 4 for further details.
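Glossing over the non-deterministic guess-and-verify step for f (which is omitted here), the deterministic part of this warm-up derandomization is just an enumeration; a minimal sketch with stand-in choices for D and for the pseudorandom list:

```python
# Toy sketch of the final derandomization step: after f has been guessed and
# verified to be hard (the non-deterministic part, omitted here), we simply
# average D over the pseudorandom strings output by S ∘ Gen_f.

def estimate_acceptance(d, pseudorandom_strings):
    # one evaluation of D per string; with ~N^{1.01} strings and a
    # linear-size D, this step dominates and gives near-quadratic time
    return sum(d(s) for s in pseudorandom_strings) / len(pseudorandom_strings)

d = lambda s: s[0] == "1"                         # stand-in distinguisher
strings = [format(i, "08b") for i in range(256)]  # stand-in pseudorandom list
p = estimate_acceptance(d, strings)               # approximates Pr[D(r) = 1]
```

The quadratic bound in the text is exactly this product: N^{1.01} pseudorandom strings times one linear-size evaluation of D per string.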

2.2 Proof of Theorem 1.2


Turning to AM protocols, let V be an AMT IME [T] verifier, and recall that we want
to derandomize V in N T IME [n · T^{1.01}]. The approach above can still be used, but it
yields derandomization in quadratic time rather than in time n · T^{1.01}.
The main idea in the proof is to compose the generator S ∘ Gen with itself, using
different hard truth-tables, to get a PRG with only n^{1.01} seeds rather than T^{1.01} seeds.
At a high level, we first use S ∘ Gen_{f1} with a long truth-table (of size |f1| = T^{1.01}) to
transform V into a verifier V′ with running time T^{1.01} that uses only O(log T) random
coins, and then use S ∘ Gen_{f2} with a short truth-table (of size |f2| = n^{1.01}) to fully
derandomize V′, using a seed of length (1.01) · log(n).^{13} Details follow.
As a first step, we will need a refined version of (Gen, Rec). Recall that in
Theorem 1.2 we are assuming hardness for probabilistic protocols. Following an idea
from [DMO+20], the reconstruction can now compute D̄ probabilistically rather than
deterministically, and thus use only a small number of queries to D rather than N^{1.01};
this reduces the running time to be close to T rather than to T^2. Furthermore, we
observe that the reconstruction Rec does not actually use the full power of its oracle access to
D: Loosely speaking, there is α ∈ (0, 1) such that for a good non-deterministic guess,
it suffices to “prove” to Rec that an α-fraction of Rec's queries are accepted by D; and
simultaneously, for a bad non-deterministic guess, less than an α-fraction of Rec's queries
will be accepted by D. See Proposition 5.2 for precise details and a proof.
Now, when derandomizing AM with perfect completeness, the distinguisher D
is a non-deterministic procedure testing whether there exists a satisfying witness for
a given input and random coins.^{14} A naive instantiation of Rec would thus yield an
MAN P machine, whereas we want to assume hardness only for MAM. To do so
we rely on the observations about Rec above: Our reconstruction algorithm will first
receive a witness for Rec, then it will use randomness to choose a small number of
queries (thereby reducing the running time to near-linear), and finally it will send the
queries to the prover to obtain non-deterministic witnesses for D on these queries. This
yields an MAM protocol, and we show that it indeed suffices for the reconstruction
to work (for details see Proposition 5.2 and the proof of Proposition 5.3).

^{12} An alternative approach from these works is to allow D̄ to use randomness, in which case it is only
of size O(N) and the derandomization runs in quadratic time. However, this requires assuming that f is
hard for non-uniform MA circuits, an assumption we are trying to avoid.
^{13} This approach follows an idea from [CT21b], wherein superfast derandomization was achieved by
composing two PRGs in a similar way. However, in [CT21b] one of the PRGs relied on cryptographic
assumptions, and the other was the Nisan-Wigderson [NW94] PRG. In contrast, in this work there are no
cryptographic assumptions, and we simply compose two instantiations of (Gen, Rec).
^{14} In works concerning derandomization of AM, the output of D is often negated, and it is then thought
of as a co-nondeterministic circuit. This is done when considering hitting-set generators for D, for derandomizing
protocols with perfect completeness. We avoid doing so, and as mentioned in Section 1 our
results easily generalize to derandomization of protocols with imperfect completeness (see Remark 5.4).
Now, denote the running time of the protocol V that we want to derandomize
by T(n) = n^k. Recall that our derandomization will compose S ∘ Gen using a truth-table
f1 of length T^{1.01} and another truth-table f2 of length n^{1.01}. We assume that
f1 can be verified in near-linear time T^{1.01} and is hard for MAM[2^{(1−δ)·n}]/2^{(1−δ)·n},
and that f2 can be verified in time n^{k+.01} and is hard for MAM[2^{(1−δ)k·n}]/2^{(1−δ)·n}.
Thus, we can verify both in time T^{1.01}, and using the reconstruction argument above,
both of them are pseudorandom for V, which runs in time T. Our derandomization
enumerates over the output strings of S ∘ Gen_{f2}, and uses each string as a seed for
S ∘ Gen_{f1}.^{15} The resulting pseudorandom set has n^{1.01} strings and can be computed in
time n^{1.01} · T^{1.01} < n · T^{1.02}, which suffices for our derandomization of V.
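The composition step can be sketched as follows, where outer and inner are hypothetical stand-ins for the output lists of S ∘ Gen_{f1} and S ∘ Gen_{f2}; the only point of the sketch is that the composed list inherits the (small) size of the inner list.

```python
# Toy sketch of composing the two generators: each output string of the inner
# generator (from the short truth-table f2) is used, after truncation, as a
# seed selecting an output of the outer generator (from the long truth-table
# f1), so the composed list is as small as the inner list.

def compose(outer_outputs, inner_outputs, seed_len):
    composed = []
    for s in inner_outputs:
        # only the first seed_len bits of s serve as a seed; the rest of s
        # is discarded (cf. footnote 15)
        idx = int(s[:seed_len], 2) % len(outer_outputs)
        composed.append(outer_outputs[idx])
    return composed

outer = [format(i, "016b") for i in range(1024)]  # stand-in for S ∘ Gen_{f1}
inner = [format(i, "010b") for i in range(32)]    # stand-in for S ∘ Gen_{f2}
result = compose(outer, inner, seed_len=10)
assert len(result) == len(inner)  # composed set inherits the inner list's size
```

Computing the composed list costs one outer-generator lookup per inner string, which is the source of the n^{1.01} · T^{1.01} bound in the text.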
To extend this result to AM protocols with constantly many rounds, we first sim-
ulate any constant-round protocol by a two-round protocol, and then apply the result
above as a black-box. Indeed, the key observation is that the simulation overhead when
using the classical result of [BM88] allows us to obtain tight results (under #NSETH),
while only assuming hardness for MAM with advice; see Section 5.2.2 for details.

A relaxation of the hypothesis. As mentioned after the statement of Theorem 1.2,


we further relax its hypothesis, by requiring that for k > 1, the property Lk will be
useful only against N T IME machines with the specified time and advice complexity
(rather than against MAM protocols of this complexity). Our approach for doing so
is to replace the inner PRG S ∘ Gen_{f2} with the Nisan-Wigderson generator [NW94]. This
can be useful for us, because the NW generator is suitable for our superfast parameter
setting when it is instantiated for a small output length (i.e., |f|^η where |f| is the
truth-table length and η ≪ e is a sufficiently small constant; see Theorem 5.16), and
S ∘ Gen_{f1} reduces the number of random coins to O(log(T)) = O(log(n)).
The main obstacle towards applying the NW generator in this setting is that it
requires hardness against machines with advice and (non-adaptive) oracle access to
N T IME (i.e., to the distinguisher, which in this case is an N T IME machine),
whereas we only assume hardness against N T IME machines with advice. To bridge
this gap, we use an idea of Shaltiel and Umans [SU06] (following [FF93; SU05]), which
allows one to transform truth-tables that are hard for N T IME machines with advice into
truth-tables that are hard for machines with advice and non-adaptive oracle access
to N T IME . We use their transformation while analyzing it in a more careful way,
which allows us to bound the overheads in the hardness of the truth-table incurred by
the transformation. (See Section 5.3.1 for further details.)

2.3 Proofs of the results from Section 1.3


We simulate probabilistic doubly efficient proof systems by deterministic doubly
efficient argument systems using non-black-box derandomization algorithms, rather than
PRGs. Specifically, we will use a targeted pseudorandom generator, which takes an input
z and prints a list of strings that looks pseudorandom to efficient algorithms that also
have access to the same z.

^{15} Indeed, the length of the output strings of S ∘ Gen_{f2} is an “overkill”, since they are of length close to n
but we only use the first O(log(n)) bits in them as a seed for S ∘ Gen_{f1}.
Our targeted PRG will be reconstructive (i.e., based on a hard function), and will
thus yield an “instance-wise” hardness-vs-randomness tradeoff: If the underlying
function is hard to compute on a set S of inputs of density 1 − µ, then the targeted
PRG is pseudorandom on each and every input in S.16 The specific generator that we
will use, from [CT21a], is denoted G_f and relies on a function f : {0,1}* → {0,1}*
with multiple output bits such that:

1. (Upper bound.) Individual output bits of the string f(z) can be computed in
time T′.

2. (Lower bound.) The entire string f(z) cannot be approximately printed in time
T′ · |f(z)|^β, for a small constant β > 0.^{17}

Note that the obvious algorithm prints f(z) in time T′ · |f(z)|, but f is “non-batch-computable”,
in the sense that printing (an approximate version of) the entire string
cannot be done in time close to that of computing a single bit (i.e., in time T′ · |f(z)|^β).
As a first step we will consider derandomizing verifiers that use a small number
of coins, say n^{o(1)}. We instantiate G_f accordingly with n^{o(1)} output bits, and with f
and a time bound T′ that is slightly larger than the running time of the verifier. In
this setting G_f runs in time T′ · n^{o(1)} and prints n^{o(1)} random strings, and thus reduces
the number of random coins (to o(log(n))) without increasing the time complexity. To
deduce that G_f is pseudorandom for an algorithm from a class C on input z, it suffices
to assume that f is hard to approximately print by algorithms running in time T′ · n^{o(1)}
with oracle access to C. (See Theorem 7.1 for a precise statement.)
Our goal will be to use G_f in order to replace the random coins of the verifier by
pseudorandom coins. If we are able to reduce the number of coins to (say) o(log(n)),
then the verifier can just ask the prover in advance to send all possible n^{o(1)} transcripts
of interaction, and then check for consistency and compute the probability that it
would have accepted (see, e.g., the proof of Theorem 7.5 for details). However, a naive
application of G_f to the common input x to the protocol does not seem to suffice
for this purpose: This is because in any round of interaction, when we replace the
verifier's random coins by pseudorandom coins, the verifier's behavior depends not
only on x but also on the messages sent by the prover.^{18}

The main idea: Using transcripts as a source of hardness. The main idea that un-
derlies our constructions is using the transcript of the interaction in each round as a source
of hardness for G f . That is, in each round of interaction we apply the targeted generator
G f with the current transcript as input, and obtain coins that the verifier can use in that
specific round, given the previous interaction.
^{16} To be more accurate, for every potential distinguisher M there exists a “reconstruction” algorithm F_M
such that the following holds: On every z on which F_M fails to compute the hard function, the targeted
PRG with input z is pseudorandom for M on input z.
^{17} The meaning of “approximately print” here is as in Assumption 1.6: A probabilistic algorithm M
approximately prints f(x) with error δ if Pr[M(x)_i = f(x)_i] ≥ 1 − δ, where the probability is over the
random coins of M and over an output index i. For simplicity, we think of δ as a small constant for now.
^{18} Another way to see this is to think of a worst-case verifier that tries to distinguish the pseudorandom
strings from random ones. In this case, the prover's messages can be thought of as supplying this
worst-case verifier with non-uniform advice.

Recall, however, that we do not assume that f is hard on all inputs, and thus an
all-powerful prover could potentially find transcripts on which f is easy (in which
case G f is not guaranteed to be pseudorandom). This is where our relaxation of the
soundness condition comes in: Since we are only interested in soundness with respect
to polynomial-time provers and polynomial-time samplable inputs, we can think of the
transcript as an input (to G_f) sampled from a polynomial-time samplable distribution. By our
assumption f cannot be computed with non-negligible probability over inputs chosen
from any polynomial-time samplable distribution, and thus for all but a negligible
fraction of transcripts G_f will be pseudorandom.
We stress that the function f is hard for algorithms running in fixed polynomial
time T′ · n^{o(1)} (this models the verifier), but its hardness holds over inputs (i.e.,
transcripts) chosen according to any polynomial-time samplable distribution, where
the latter polynomial may be arbitrarily large (this models an input chosen from a
polynomial-time samplable distribution and the prover's corresponding messages).
Materializing this approach turns out to be considerably more subtle than it might
seem. We thus include a self-contained proof for an easy “warm-up” case, namely that
of derandomizing doubly efficient proof systems with one prover message (similar to
MA). In this setting, if we are willing to assume that one-way functions exist, then
we can reduce the number of random coins to be ne (for an arbitrarily small e > 0)
and carry out the strategy above quite easily. See Section 7.1 for details.

The key step: Proof of Theorem 1.7. Let us start with Theorem 1.7, in which we
derandomize protocols with c rounds in which the verifier uses n^{o(1)} random coins,
under the assumption that f is hard to approximately print for algorithms with oracle
access to AM protocols with c rounds. Recall that we are interested in protocols
with perfect completeness, and thus when replacing random coins by pseudorandom
ones we only need to argue that soundness is maintained. (For protocols with few
random coins this can be assumed without loss of generality; see Remark 7.6.)
In each round we feed the current transcript to G_f and thus reduce the number
of coins to o(log(n)). Naturally, we want to use a hybrid argument, and claim that
if in any round the residual acceptance probability of the verifier (when considering
the continuation of the interaction after this round) significantly increased,^{19} then we
can compute the hard function f with the transcript at that round as input. Note that
our “distinguisher” for G_f in this case is the function that computes the acceptance
probability of the residual verifier after fixing the current transcript.
When this interaction happens with an all-powerful prover, the distinguisher for
G f can be computed by an AM protocol with at most c turns, and we can indeed
use our hardness hypothesis. However, when we consider the acceptance probability
with respect to efficient provers, then the distinguisher will be an argument system
(rather than an AM protocol). This is a challenge for us (rather than an advantage),
because we want to use this distinguisher to contradict the hardness of f – but the class
of argument systems is broader than AM, so this will necessitate using a stronger
hardness hypothesis (i.e., against algorithms with oracle access to argument systems).
To handle this challenge we use a careful hybrid argument: In each round we
replace not only random coins by pseudorandom ones, but also gradually replace
all-powerful prover strategies by efficient prover strategies; that is, in the ith hybrid we
replace the random coins in the ith round by pseudorandom coins, and replace the
all-powerful prover strategy in the ith round by an efficient strategy. Assuming that
the common input x is a NO instance, the first hybrid (with random coins and an all-powerful
prover) has low acceptance probability, and we are interested in analyzing
the case where the last hybrid (with pseudorandom coins and an efficient prover)
has higher acceptance probability, in which case there is a noticeable increase in the
acceptance probability between a pair of consecutive hybrids, say at the ith hybrid.

^{19} By “acceptance probability” here we mean the maximal probability that the verifier accepts, when
considering interaction with a prover (in the relevant class of provers) that maximizes this probability. In
some sources this is referred to as the value of the game/protocol.
In our analysis, we now further replace the efficient prover strategy in the ith round
by an all-powerful prover strategy; crucially, this only causes an increase in the acceptance
probability gap.^{20} At this point the “residual protocol” in both distributions that
were obtained in the ith hybrid uses an all-powerful prover, and (with some work) we
can show that there exists a distinguisher for G_f (with the transcript as input) that is
an AM protocol with c rounds. (See the part titled “Obtaining an
AMT IME^{[c]} distinguisher” in the proof of Claim 7.5.2 for further details.)

The general case: Proof of Theorem 1.8. The argument above works under the
assumption that the protocol uses n^{o(1)} coins, and we now want to extend it to general
protocols. To do so we use an additional assumption, namely that there exist functions
whose truth-tables can be verified in near-linear time, but that are hard for AM protocols
with c + 1 rounds that run in time 2^{(1−δ)·n} and use 2^{(1−δ)·n} bits of non-uniform
advice. We will use this additional assumption to reduce the number of random coins
in each round from poly(n) to O(log(n)), using the reconstructive PRG (Gen, Rec),
and then invoke Theorem 1.7 as a black-box.
The argument here follows in the same spirit as the one above, first replacing the
coins in all rounds simultaneously and then using a careful hybrid argument (and a
reconstruction argument) to obtain a contradiction. To support the setting that is ob-
tained via the hybrid argument, we refine the reconstructive PRG (Gen, Rec) yet again,
this time showing that it works when the distinguisher is any function that agrees with
a certain promise problem, where the advice to the reconstruction algorithm depends
only on the promise problem rather than on the function. (See Section 7.3.1.)

2.4 Proof of Theorem 1.3


Finally, we briefly explain the ideas in the proof of Theorem 1.3. The most important
change, compared to the proofs of Theorem 1.1 and Theorem 1.2, is that since now
we are only assuming hardness against uniform protocols (our hardness assumption is
that L ∉ i.o.(MA ∩ co MA)T IME^{[7]}[2^{(1−δ)·n}]), we need to make the reconstruction
algorithm uniform as well.
We will use yet another refinement of (Gen, Rec) above. Recall that Rec (after our
last refinement) gets the “right” advice adv, and can then compute the function f if
the distinguisher “proves” to the verifier that sufficiently many queries are 1-instances.
Following [MV05; GSTS03; SU07], we strengthen Rec so that it meets a resiliency
condition: Namely, using a small number of rounds of interaction, Rec is able to get any
prover to send advice adv and commit to a single truth-table f_adv specified by the
advice.^{21} That is, given adv and a few rounds of interaction, there is a single f_adv such
that Rec answers according to it. Working carefully, we are able to make Rec resilient
without significant time overhead (see Definition 5.1 and Proposition 5.2).

^{20} This is the place in the proof where we use the fact that the protocol has perfect completeness (i.e.,
we are only arguing that soundness is maintained given pseudorandom coins).
^{21} To see the challenge, assume that adv is a non-deterministic circuit that supposedly computes the
truth-table. In this case, different non-deterministic guesses could yield different truth-tables.
If our reconstruction could be convinced that the f_adv to which the prover committed
is the “right” truth-table (i.e., f_adv = f), then on query x it could simply output
f_adv(x). But what if the prover committed to a truth-table different from f? To resolve
this issue we initially encode f using a highly efficient PCP of proximity (i.e., the one
of [BGH+05]), which then allows us to locally test the encoded string without incurring
significant time overheads and deduce that f_adv is close to f; otherwise we reject
(see Theorem 3.16). We combine this with local error correction, to compute f given
any truth-table that is close to f. The price that we pay for using the PCPP is that the
truth-table (which is now a PCPP witness of the original truth-table) is necessarily of
size T^{1.01} (i.e., slightly larger than the time complexity of the function), and thus our
PRG uses seeds of length (1.01) · log(T) and we get derandomization in quadratic time T^{2.01}
rather than in time n · T^{1.01}.

3 Preliminaries
Throughout the paper, we will typically denote random variables by boldface, and
will denote the uniform distribution over {0,1}^n by u_n and the uniform distribution
over a set [n] by u_{[n]}. Recall the following standard definition of a distinguisher for
a distribution w, by which we (implicitly) mean a distinguisher between w and the
uniform distribution.

Definition 3.1 (distinguisher). We say that a function D : {0,1}^n → {0,1} is an
e-distinguisher for a distribution w over {0,1}^n if Pr[D(w) = 1] ∉ Pr[D(u_n) = 1] ± e.
We say that D is an (α, β)-distinguisher if Pr[D(w) = 1] ≥ α and Pr[D(u_n) = 1] ≤ β.
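As a toy sanity check of Definition 3.1 (with hypothetical choices of D, w, and the parameters), both notions amount to comparing two acceptance probabilities:

```python
from itertools import product

# Toy check of Definition 3.1: an e-distinguisher's acceptance probability
# on w deviates from its acceptance probability on u_n by more than e.

n = 4
d = lambda x: x[0]                    # a toy distinguisher (first bit)
w = [(1, 0, 0, 0), (1, 1, 0, 0)]      # a toy distribution (uniform over a list)

p_w = sum(d(x) for x in w) / len(w)                        # Pr[D(w) = 1]
p_u = sum(d(x) for x in product((0, 1), repeat=n)) / 2**n  # Pr[D(u_n) = 1]

is_eps_distinguisher = abs(p_w - p_u) > 0.25  # e-distinguisher with e = 0.25
is_alpha_beta = p_w >= 0.9 and p_u <= 0.6     # (0.9, 0.6)-distinguisher
```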

We also fix a standard notion of “nice” time bounds for complexity classes, where
we are only concerned with time bounds that are not sub-linear.

Definition 3.2 (time bound). We say that T : N → N is a time bound if T is
time-computable and non-decreasing, and for every n ∈ N we have that T(n) ≥ n.

Recall that pr MAT IME [ T ] denotes the class of promise problems (rather than lan-
guages) that can be solved by MA protocols with a verifier running in time T. De-
randomization of pr MAT IME [ T ] as in the conclusions of Theorems 1.1 and 1.2 is
stronger than derandomization of MAT IME [ T ].

3.1 Useful properties


Following Razborov and Rudich [RR97], we now define useful properties, which are
sets of truth-tables that can be efficiently recognized and that describe functions that
are hard to compute in a certain class C. Our definition is a bit more careful than usual,
since we are interested in the case where C is a class decidable by uniform machines
that get non-uniform advice (rather than a class decidable by non-uniform circuits).

Definition 3.3 (useful property). Let L ⊆ {0,1}* be a collection of strings such that every
f ∈ L is of length that is a power of two, and let C be a class of languages. We say that L is a
C′-constructive property useful against C if the following three conditions hold:

1. (Non-triviality.) For every N = 2^n it holds that L_n = L ∩ {0,1}^N ≠ ∅.

2. (Constructivity.) L ∈ C′.

3. (Usefulness.) For every L ∈ C and every sufficiently large n ∈ N it holds that L_n ∉ L_n,
where L_n ∈ {0,1}^{2^n} is the truth-table of L on n-bit inputs.

To clarify the meaning of the “usefulness” condition above, let us consider the case
where C is a class of Turing machines with advice of bounded length. In this case, the
condition asserts that for every fixed machine M and infinite sequence adv of advice
strings, and every sufficiently large n ∈ N, the machine M with advice adv fails to
compute any truth-table in Ln .
Since each string in L is of length 2^n for some n ∈ N, and we think of it as a
truth-table of a function over n bits, we will usually denote the input length to L as
N = 2^n. For example, when we refer to an N T IME [N^2]-constructive property useful
against N T IME [2^{1.99·n}] we mean that 2^n-length truth-tables in L can be recognized
in non-deterministic time 2^{2n}, but that the corresponding n-bit functions cannot be
computed in non-deterministic time 2^{1.99·n}.

3.2 Proof systems


The following definition refers to non-deterministic unambiguous computation, which
captures non-deterministic computation of both the language and its complement:

Definition 3.4 (non-deterministic unambiguous computation). We say that a machine
M is non-deterministic and unambiguous if for every x ∈ {0,1}* there exists a value
L(x) ∈ {0,1} such that the following holds:

1. There exists a non-deterministic guess π such that M(x, π) = L(x).

2. For every non-deterministic guess π′ it holds that M(x, π′) ∈ {L(x), ⊥}.

3.2.1 Arthur-Merlin proof systems


Let us now recall the definition of Arthur-Merlin proof systems (i.e., of AM). Since
we will be concerned with precise time bounds, we also specify the precise structure
of the interaction, as follows.

Definition 3.5 (Arthur-Merlin proof systems). We say that L ∈ AMT IME^{[c]}[T] if
there is a proof system in which, on a shared input x ∈ {0,1}*, a verifier interacts with a
prover, taking turns in sending each other information, such that the following holds.

• Public coins: Whenever the verifier sends information to the prover, that information is
just uniformly chosen bits.

• Structure of the interaction: The number of turns is c, and we always assume that
the first turn is the verifier sending random bits to the prover.

• Running time: The number of bits that are sent in each turn is exactly T(|x|), and in
the end the verifier performs a deterministic linear-time computation on the transcript
(which is of length c · T(|x|) = O(T(|x|))) and outputs a single bit.

• Completeness and soundness: For every x ∈ L there exists a prover such that the
verifier accepts with probability 1; for every x ∉ L and every prover, the verifier rejects
with probability at least 2/3.

Furthermore, if the verifier sends at most R(n) random coins in each round, then we say
that L ∈ AMT IME^{[c]}[T, R]. When L can be decided by an interaction as above in which
the prover takes the first turn, we say that L ∈ MAT IME^{[c]}[T]. In all the definitions
above, when omitting the number of messages c, we mean that c = 2.

Following standard conventions, we will sometimes refer to the verifier as Arthur
and to the prover as Merlin. Note that when c is odd (meaning that the last turn is the
verifier's), in the last turn the verifier does not need to send any random bits to the
prover, but may run a randomized linear-time computation on the transcript rather
than a deterministic one.^{22}
When we want to refer to proof systems with imperfect completeness (i.e., for any x ∈
L there is a prover such that the verifier accepts with probability at least 2/3), we
explicitly add a subscript “2”; for example, AMT IME_2^{[c]}[T] or (MA ∩ co MA)T IME_2^{[c]}[T].

3.2.2 Doubly efficient proof systems and deterministic argument systems


As mentioned in Section 1.3, our derandomization results will apply to proof systems
in which the honest prover is efficient. We define this notion as follows:

Definition 3.6 (doubly efficient proof systems). We say that L ∈ deIP^{[c]}[T] if it meets
all the conditions in Definition 3.5, and in addition meets the following condition: There exists
a deterministic algorithm P running in time polynomial in T such that for every x ∈ L, when
the verifier interacts with P on common input x, it accepts with probability 1. We define
deIP^{[c]}[T, R] analogously.

Note that Definition 3.6 focuses on doubly efficient proof systems with public
coins; the known constructions of doubly efficient proof systems all use public coins
(see [GKR15; RRR21; GR18; Gol18]). In addition, one could use a broader definition
that allows the honest prover to run in time T̄ ≫ T that is not necessarily polynomial
in T; in this work the narrower definition will suffice for us.
We will also be interested in a natural subclass of doubly efficient proof systems,
which we now define. Loosely speaking, we say that a system has an efficient universal
prover if for any partial transcript, the maximum acceptance probability of the residual
protocol across all provers is (approximately) attained by an efficient prover. That is:

Definition 3.7 (universal provers). Let L ⊆ {0, 1}∗ and let V be a verifier in a proof system
for L. For µ : N → [0, 1), we say that the proof system has a µ-approximate universal prover
with running time T̄ if there exists an algorithm P that on any input x and π, where π is a
partial transcript for the proof system, runs in time T̄ (| x |), and satisfies that

Pr[⟨V, P⟩(x, π) = 1] > max_{P̄} {Pr[⟨V, P̄⟩(x, π) = 1]} − µ(|x|) ,


where the notation ⟨V, P⟩(x, π) denotes the outcome of the interaction of V and P on input x
when the first part of the transcript is fixed to π, and the maximum on the RHS is over all
prover strategies P̄ (regardless of their efficiency).

Note that given x ∈ L, the universal prover acts as an honest prover that convinces
the verifier to accept with probability at least 2/3 − µ (or 1 − µ, if the protocol admits
perfect completeness). In particular, a proof system with an efficient universal prover
22 This is equivalent to the definition above, in which the verifier sends random coins in the last turn

then runs a deterministic linear-time computation on the transcript (which includes these random coins).

is a doubly efficient proof system. However, we do not think of the universal prover
as an honest prover, since given x ∉ L or a dishonest partial transcript, the universal
prover still tries to maximize the acceptance probability of the verifier.
A well-known doubly efficient proof system that has a universal prover is the
sumcheck protocol; see Theorem 3.17 for details.
Let us now recall Definition 1.5 of deterministic doubly efficient argument systems
and discuss a few definitional issues.
Definition 3.8 (deterministic doubly efficient argument system; Definition 1.5, restated).
We say that L ⊆ {0, 1}∗ is in deDARG[T] if there exists a deterministic T-time
verifier V such that the following holds:
1. There exists a deterministic algorithm P that, when given x ∈ L, runs in time poly( T )
and outputs π such that V ( x, π ) = 1.

2. For every polynomial p, every probabilistic algorithm P̃ running in time p(T), and
every sufficiently large n ∈ N, the probability that P̃(1^n) prints x ∉ L and π ∈ {0,1}^T
such that V(x, π) = 1 is T(n)^{−ω(1)}.
Observe that deDARG[T] ⊆ DTIME[poly(T)], since the honest prover runs in
time poly(T). Removing the efficiency restriction on the honest prover makes the
definition too broad to be meaningful: Under a plausible hardness assumption, every
language (regardless of its complexity) has a proof system as above in which the honest
prover P runs in time larger than that of the adversaries P̃.^{23}
Thus, for the definition to be meaningful we need to have T = T_V < T_P < T_{P̃},
where the three latter notations represent the running times of the verifier V, of the
honest prover P, and of the adversaries P̃, respectively. While Definition 3.8 couples
these bounds so that they are all polynomially related, a broader definition in which
the gaps are super-polynomial still makes sense. We use the narrower definition only
because it suffices for our purposes in the current work.

3.3 Error-correcting codes


We recall the definition of locally list-decodable codes, and state a standard construc-
tion that we will use.
Definition 3.9 (locally list decodable codes). We say that Enc : Σ^N → Σ^M is locally list-
decodable from agreement ρ in time t and with output-list size L if there exists a randomized
oracle machine Dec : [N] × [L] → Σ running in time t such that the following holds. For every
z ∈ Σ^M that satisfies Pr_{i∈[M]}[z_i = Enc(x)_i] ≥ ρ for some x ∈ Σ^N, there exists a ∈ [L] such
that for every i ∈ [N] we have that Pr[Dec^z(i, a) = x_i] ≥ 2/3, where the probability is over
the internal randomness of Dec.
Theorem 3.10 (a locally list-decodable code, see [STV01]). For every constant η > 0 there
exists a constant η′ > 0 such that the following holds. For every m ∈ N and ρ = ρ(m) there
exists a code Enc : {0,1}^m → Σ^{m̄}, where |Σ| = O(m^{η′}/ρ²) and m̄ = O_{η′}(m/ρ^{2/η′}), such
that:

1. The code is computable in time Õ(m̄ · log(|Σ|)) = Õ(m/ρ^{2/η′}).
^{23} To see this, assume that there exists a relation R = {(x, π)} that can be decided in time T, but every
probabilistic algorithm P̃ getting input x and running in time poly(T) fails to find π such that (x, π) ∈ R,
except with negligible probability (over x and over the random coins of P̃). Then, for every L, define V
that accepts (x, π) iff (x, π) ∈ R, and observe that this verifier meets the relaxed definition of deDARG.

2. The code is locally list-decodable from agreement ρ in time m^{η′} · (1/ρ)^{1/η′} and with
output list size O(1/ρ). Furthermore, the local decoder issues its queries in parallel, as
a function of the randomness and the input.

We also need the following two uniquely decodable codes:

Theorem 3.11 (uniquely decodable code, see e.g., [GLR+91] and [AB09, Section 19.4]).
For every constant η > 0 the following holds. For every m ∈ N there exists a code Enc : {0,1}^m →
{0,1}^{m̄}, where m̄ = Õ(m), such that:
1. The code is computable in time Õ(m̄) = Õ(m).

2. The code is locally decodable from agreement 0.9 with decoding circuit size m^η. Fur-
thermore, the decoding circuit issues its queries in parallel and outputs correctly with
probability at least 1 − 1/m.

Lemma 3.12 (unique decoding for low-degree univariate polynomials; see [WB86]).
Let q be a prime power. Given t pairs (x_i, y_i) of elements of F_q, there is at most one polyno-
mial g : F_q → F_q of degree at most u for which g(x_i) = y_i for more than (t + u)/2 pairs.
Furthermore, there is a polynomial-time algorithm that finds g or reports that no such g exists.
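The polynomial-time algorithm in the "furthermore" part is the Berlekamp–Welch decoder. The following Python sketch is our own illustration (all names and parameter choices are ours, not from the paper): it solves the linear system Q(x_i) = y_i · E(x_i), with E monic of degree equal to the number of tolerable errors, and recovers g = Q/E.

```python
def solve_mod(A, b, q):
    """Gaussian elimination over F_q: return one solution of A z = b, or None."""
    rows, cols = len(A), len(A[0])
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    pivots, r = [], 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], -1, q)
        M[r] = [v * inv % q for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vr) % q for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(not any(M[i][:cols]) and M[i][cols] for i in range(r, rows)):
        return None  # inconsistent system: too many corrupted pairs
    z = [0] * cols
    for i, c in enumerate(pivots):
        z[c] = M[i][cols]
    return z

def poly_eval(coeffs, x, q):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

def poly_div_exact(num, den, q):
    """Divide num by the monic polynomial den; return the quotient iff exact."""
    num, dd = num[:], len(den) - 1
    quot = [0] * max(1, len(num) - dd)
    for i in range(len(num) - dd - 1, -1, -1):
        c = quot[i] = num[i + dd]
        for j, dc in enumerate(den):
            num[i + j] = (num[i + j] - c * dc) % q
    return quot if not any(num) else None

def berlekamp_welch(pairs, u, q):
    """Find the degree-<=u polynomial agreeing with > (t+u)/2 pairs, or None."""
    t = len(pairs)
    e0 = max(0, (t - u - 1) // 2)  # number of errors the system can absorb
    nq = u + e0 + 1                # coefficients of Q, where deg(Q) <= u + e0
    A, b = [], []
    # Constraints Q(x_i) = y_i * E(x_i), with E monic of degree e0; the unknowns
    # are the coefficients of Q and the e0 low-order coefficients of E.
    for x, y in pairs:
        row = [pow(x, j, q) for j in range(nq)]
        row += [-y * pow(x, j, q) % q for j in range(e0)]
        A.append(row)
        b.append(y * pow(x, e0, q) % q)
    z = solve_mod(A, b, q)
    if z is None:
        return None
    g = poly_div_exact(z[:nq], z[nq:] + [1], q)  # g = Q / E
    if g is None:
        return None
    agree = sum(poly_eval(g, x, q) == y for x, y in pairs)
    return g if 2 * agree > t + u else None

# g(x) = 3x^2 + 5x + 7 over F_97, evaluated at x = 0..8, with 3 corrupted values;
# t = 9 and u = 2, so agreement 6 > (9+2)/2 suffices for unique decoding.
q = 97
pairs = [(x, (3 * x * x + 5 * x + 7) % q) for x in range(9)]
for i in (1, 4, 7):
    pairs[i] = (pairs[i][0], (pairs[i][1] + 1) % q)
assert berlekamp_welch(pairs, 2, q) == [7, 5, 3]
```

Any solution (Q, E) of the linear system with E monic satisfies Q = gE whenever the agreement condition of the lemma holds, which is why the division is exact.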

3.4 Near-linear time constructions: Extractors, hash functions, cryptographic PRGs
In this section we state several known algorithms that we will use in our proofs and
that run in near-linear time. The first is a pairwise-independent hash function based
on convolution hashing [MNT93].

Theorem 3.13 (a quasilinear-time computable pairwise independent hash function;
see, e.g., [CT21b, Theorem 3.12] for a proof). For every m, m′ ∈ N there exists a family
H ⊆ {h : {0,1}^m → {0,1}^{m′}} of quasilinear-time computable functions such that for every
distinct x, x′ ∈ {0,1}^m it holds that Pr_{h∈H}[h(x) = h(x′)] ≤ 2^{−m′}.
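For intuition, the following sketch verifies the collision bound exhaustively for the standard Toeplitz-matrix family over GF(2); this is an illustrative stand-in with the same pairwise-independence guarantee, not necessarily the convolution family of [MNT93].

```python
from itertools import product

def toeplitz_hash(s, b, x, m, mp):
    """h(x) = T_s x XOR b over GF(2), where row i of T_s is s[i], ..., s[i+m-1]."""
    out = []
    for i in range(mp):
        bit = b[i]
        for j in range(m):
            bit ^= s[i + j] & x[j]
        out.append(bit)
    return tuple(out)

# Enumerate every key (Toeplitz string s plus offset b) and every pair of
# distinct inputs, and record the worst collision probability.
m, mp = 4, 2
keys = [(s, b) for s in product((0, 1), repeat=m + mp - 1)
               for b in product((0, 1), repeat=mp)]
points = list(product((0, 1), repeat=m))
worst = 0.0
for a in range(len(points)):
    for c in range(a + 1, len(points)):
        coll = sum(toeplitz_hash(s, b, points[a], m, mp) ==
                   toeplitz_hash(s, b, points[c], m, mp) for s, b in keys)
        worst = max(worst, coll / len(keys))
assert worst <= 2 ** -mp  # collision probability at most 2^{-m'}, as in the theorem
```

For this family the bound is met with equality: every distinct pair collides on exactly a 2^{−m′} fraction of the keys.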

The second construction is of a seeded randomness extractor that runs in linear
time, which was presented by Doron et al. [DMO+20] following [TSZS06, Theorem 5].

Theorem 3.14 (a linear-time computable extractor, see [DMO+20]). There exists c ≥ 1
such that for every γ < 1/2 the following holds. There exists a strong oblivious (δ, e)-
sampler Samp : {0,1}^n × {0,1}^d → {0,1}^m for δ = 2^{n^{1−γ}−n} and e ≥ c · n^{−1/2+γ} and
d ≤ (1 + c · γ) · log(n) + c · log(1/e) and m = (1/c) · n^{1−2γ} that is computable in linear time.

The third construction is that of a cryptographic PRG that works in near-linear
time. Such a PRG can be obtained assuming standard one-way functions, by starting
with a fast PRG that has small stretch and then applying standard techniques for
extending the output length.

Theorem 3.15 (OWFs yield PRGs with near-linear running time; see [CT21a, Theorem
3.4]). If there exists a polynomial-time computable one-way function secure against
polynomial-time algorithms, then for every e > 0 there exists a PRG that has seed length
ℓ(n) = n^e, is computable in time n^{1+e}, and fools every polynomial-time algorithm with negli-
gible error. Moreover, if the one-way function is secure against polynomial-sized circuits, then
the PRG fools every polynomial-sized circuit.
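The standard output-extension technique mentioned above iterates a small-stretch PRG on its own state, emitting the extra bits at each step. The toy sketch below illustrates only this composition; the SHA-256-based stand-in for the small-stretch PRG is an assumption for illustration and carries no security claim.

```python
import hashlib

N = 16  # toy state length in bytes (a 128-bit state)

def g(state):
    """Stand-in for a small-stretch PRG G : {0,1}^n -> {0,1}^(n+1); illustrative only."""
    return hashlib.sha256(state).digest()[:N + 1]

def stretch(seed, out_bits):
    """Extend G to arbitrary stretch: emit one bit per step, feed the rest back."""
    assert len(seed) == N
    state, out = seed, []
    for _ in range(out_bits):
        block = g(state)
        out.append(str(block[-1] & 1))  # the "extra" bit becomes output
        state = block[:N]               # the remaining N bytes are the next state
    return "".join(out)

bits = stretch(b"\x00" * N, 64)
assert len(bits) == 64 and set(bits) <= {"0", "1"}
```

The hybrid argument showing that this iteration preserves pseudorandomness is standard; the point here is only that each step costs one invocation of the base PRG, so a near-linear-time base PRG yields a near-linear-time stretched PRG.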

3.5 Pair languages and PCPPs
A pair language L is a subset of {0,1}∗ × {0,1}∗. We say that L ∈ NTIME[T] if
L(x, y) can be computed by a nondeterministic algorithm in time T(|x| + |y|). We
also say that L has stretch K, for a function K : N → N, if for all (x, y) ∈ L it holds that
|y| = K(|x|). We will use the following PCP of proximity for pair languages by Ben-
Sasson et al. [BGH+05]:

Theorem 3.16 (PCPP with short witnesses [BGH+05]). Let K : N → N such that K (n) ≥
n for all n ∈ N. Suppose that L is a pair language in DT IME [ T ] for some non-decreasing
function T : N → N such that L has stretch K. There is a verifier V and an algorithm A such
that for every x ∈ {0, 1}n , denoting T = T (n + K (n)) and K = K (n):

1. (Efficiency.) When V is given input x and oracle access to y ∈ {0,1}^K and to a
proof π ∈ {0,1}^{2^r}, where r ≤ log(T) + O(log log(T)), the verifier V uses r bits of
randomness, makes at most polylog(T) non-adaptive queries to both y and π, and
runs in time poly(n, log(K), log(T)).

2. (Completeness.) Let M be an O(T)-time nondeterministic machine that decides L. That
is, (x, y) ∈ L if and only if there exists w ∈ {0,1}^{O(T)} such that M((x, y), w) = 1.
For every y ∈ {0,1}^K and w ∈ {0,1}^{O(T)} such that M((x, y), w) = 1, the algorithm
A(x, y, w) runs in time Õ(T) and outputs a proof π ∈ {0,1}^{2^r} such that

Pr[V^{y,π}(x, u_r) = 1] = 1 .

3. (Soundness.) Let Z = {z ∈ {0,1}^K : (x, z) ∈ L}. Then, for every y ∈ {0,1}^K that
has Hamming distance at least K/log(T) from every z ∈ Z, and every π ∈ {0,1}^{2^r},

Pr[V^{y,π}(x, u_r) = 1] ≤ 1/3 .

3.6 Constant-round sumcheck and #NSETH


The following result is an “asymmetric” version of Williams’ [Wil16] adaptation of the
sumcheck protocol [LFK+92] into a constant-round protocol for counting the number
of satisfying assignments of a given formula. Specifically, while in [Wil16] the n vari-
ables are partitioned into symmetric subsets and each sumcheck round is a summation
over one of the subsets, here we partition the variables such that the first set is smaller,
and in the corresponding sumcheck protocol the first round is shorter.

Theorem 3.17 (a constant-round protocol for counting satisfying assignments of a for-
mula). Let k ∈ N, δ ∈ (0, 1), and γ = (1 − δ)/k be constants. For any s(n) ≤ 2^{o(n)}, there is
an MATIME^{[2k]}[2^{max(δ,γ)·n+o(n)}] protocol Π that gets as input a formula C : {0,1}^n →
{0,1} of size s(n) and (with probability 1) outputs the number of satisfying assignments of
C, such that the first prover message has length at most 2^{δn+o(n)}, and the remaining k − 1
prover messages have length at most 2^{γn+o(n)}.
Moreover, the protocol has the following two properties:

1. After the prover sends its first message, the maximum acceptance probability of the sub-
sequent protocol is either 1 or at most 1/3.

2. There is a 1/n-approximate universal prover running in time 2^{O(n)} for Π.^{24}


24 For the definition of a universal prover, see Definition 3.7.

Proof. The protocol is essentially identical to that of [Wil16, Theorem 3.4]. The only
difference is that in [Wil16, Theorem 3.4] the n input variables are partitioned into
k + 1 blocks, each of length n/(k + 1), whereas we will partition them into a single
block of length δ · n and k other blocks of length γ · n. Due to this fact, and since we
are also claiming additional properties in the “moreover” part that were not stated
in [Wil16], we include a complete proof.
Let s(n) ≤ 2^{o(n)}, and let C : {0,1}^n → {0,1} be a formula of size s(n). The prover
and verifier will work with a prime q ∈ (2^n, 2^{n+1}].^{25} Let P : F_q^n → F_q be the arithmetic
circuit constructed by the standard arithmetization of the Boolean circuit C (see the
proof of [Wil16, Theorem 3.3] for details) such that P has size poly(s(n)) ≤ 2^{o(n)} and
degree at most poly(s(n)) ≤ 2^{o(n)}.
For simplicity, we assume that δ · n and γ · n are integers. We partition the n
variables x_1, ..., x_n into k + 1 blocks S_1, ..., S_{k+1}, such that |S_1| = δ · n and |S_i| = γ · n
for every i ≥ 2. For each i ∈ [k + 1], via interpolation similar to the proof of [Wil16,
Theorem 3.4], we define a degree-2^{|S_i|} polynomial Φ_i : F_q → F_q^{|S_i|} such that for every
j ∈ {0, 1, ..., 2^{|S_i|} − 1}, (Φ_i(j))_ℓ is the ℓ-th bit in the |S_i|-bit binary representation of j.
Using a fast interpolation algorithm (see [Wil16, Theorem 2.2]), the polynomial Φ_i
(i.e., the list of the coefficients of |S_i| univariate polynomials, each corresponding to
one of Φ_i’s output values) can be constructed in 2^{|S_i|} · poly(n) time.
The protocol Π is specified as follows:

• In the first round, the honest prover computes the coefficients of the polynomial

Q_1(y) = Σ_{j_2,...,j_{k+1} ∈ {0,1,...,2^{γ·n}−1}} P(Φ_1(y), Φ_2(j_2), ..., Φ_{k+1}(j_{k+1}))     (3.1)

via interpolation, and sends Q_1 to the verifier (note that Q_1 has degree 2^{o(n)+δ·n},
so this message has size 2^{o(n)+δ·n}). The verifier then chooses r_1 ∈ F_q uniformly
at random and sends it to the prover.

• In the t-th round for t ∈ {2, 3, ..., k}, the honest prover sends the coefficients of
the degree-2^{o(n)+γ·n} polynomial

Q_t(y) = Σ_{j_{t+1},...,j_{k+1} ∈ {0,1,...,2^{γ·n}−1}} P(Φ_1(r_1), ..., Φ_{t−1}(r_{t−1}), Φ_t(y), Φ_{t+1}(j_{t+1}), ..., Φ_{k+1}(j_{k+1}))     (3.2)

to the verifier. The verifier first checks if

Σ_{j_t ∈ {0,1,...,2^{γ·n}−1}} Q_t(j_t) = Q_{t−1}(r_{t−1}),     (3.3)

and rejects immediately if the equality does not hold. The verifier then picks
r_t ∈ F_q uniformly at random, and sends it to the prover if t < k.
^{25} The honest prover can find the smallest prime q that is larger than 2^n, which is at most 2^{n+1} by
Bertrand’s postulate, in time 2^{O(n)} and send q to the verifier. The verifier checks that the received
number lies in (2^n, 2^{n+1}] and is a prime, in deterministic time poly(n) (e.g., using [AKS04; AKS19]).

• Finally, at the end of the k-th round, the verifier checks if

Q_k(r_k) = Σ_{j_{k+1} ∈ {0,1,...,2^{γ·n}−1}} P(Φ_1(r_1), ..., Φ_k(r_k), Φ_{k+1}(j_{k+1})),

and rejects immediately if the equality does not hold. Otherwise, it outputs

Σ_{j_1 ∈ {0,1,...,2^{δ·n}−1}} Q_1(j_1) .

The upper bound on message lengths and the running time of the verifier can be
verified directly from the protocol above. The analysis establishing the completeness
and soundness of this protocol, as well as the upper bound on the complexity of
the honest prover, follow a standard analysis of the sumcheck protocol; see [Wil16,
Theorem 3.4]. We therefore focus on establishing the moreover part.
To see the “moreover” part, for every i ∈ [k ], let Di be the degree of the polynomial
Qi . Fix a partial transcript π for all the interaction before the t-th prover message, and
without loss of generality assume that π = ( Q̃1 , r1 , . . . , Q̃t−1 , rt−1 ) (if t = 1 then π is
empty). We prove the following claim.
Claim 3.18. Given a partial transcript π = (Q̃_1, r_1, ..., Q̃_{t−1}, r_{t−1}), if t = 1 or Q̃_{t−1}(r_{t−1}) =
Q_{t−1}(r_{t−1}), then the maximum acceptance probability of the subsequent protocol is 1, and can
be achieved by a 2^{O(n)}-time prover. Otherwise, it is at most (k + 1 − t)/n².

Proof. We prove the claim by induction on t, starting from k + 1 and moving down-
ward to 1. In the base case t = k + 1 all messages have been sent, and the verifier
accepts if and only if Q̃_{t−1}(r_{t−1}) = Q_{t−1}(r_{t−1}). Hence the claim holds immediately.
Now, for t ∈ [k], assume the claim holds for t + 1. If t = 1 or Q̃_{t−1}(r_{t−1}) =
Q_{t−1}(r_{t−1}), then the prover simply sends the correct polynomial Q_t defined by (3.1)
or (3.2) and proceeds as the honest prover (note that the check (3.3) will pass).
Otherwise, we have that t ≥ 2 and Q̃_{t−1}(r_{t−1}) ≠ Q_{t−1}(r_{t−1}). In this case, in order
not to be rejected immediately, the prover has to send a polynomial Q̃_t such that
Σ_{j_t ∈ {0,1,...,2^{γ·n}−1}} Q̃_t(j_t) = Q̃_{t−1}(r_{t−1}). In particular, this means that Q̃_t ≠ Q_t, where Q_t
is defined by (3.2). Therefore, with probability at most D_t/q ≤ 1/n² over the choice
of r_t, it holds that Q̃_t(r_t) = Q_t(r_t). By the induction hypothesis, we know that the
maximum acceptance probability is at most 1/n² + (k − t)/n² ≤ (k + 1 − t)/n². □
The existence of a 1/n-approximate universal prover with running time 2^{O(n)} (the sec-
ond item of the “moreover” part) follows immediately from Claim 3.18.
To see the first item, we note that if Q̃_1 = Q_1, where Q_1 is defined by (3.1), then
the maximum acceptance probability is 1, by Claim 3.18. If Q̃_1 ≠ Q_1, then since
both of them have degree at most D_1, the probability of Q̃_1(r_1) = Q_1(r_1) is at most
D_1/q ≤ 1/n², hence the overall acceptance probability is at most 1/n, using the same
argument as in the proof of Claim 3.18.
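To make the round structure concrete, here is a toy sumcheck run in Python, with one variable per round (rather than the asymmetric block partition of Theorem 3.17) and a small prime; the formula, field size, and all names are illustrative choices of ours, not parameters from the theorem.

```python
import random

q = 97          # toy field size; Theorem 3.17 works over a prime in (2^n, 2^{n+1}]
n, deg = 3, 2   # number of variables, and a bound on the degree in each variable

def P(x1, x2, x3):
    # Arithmetization of (x1 OR x2) AND (NOT x2 OR x3):
    # OR(a,b) -> 1-(1-a)(1-b), NOT(a) -> 1-a, AND -> multiplication.
    return (1 - (1 - x1) * (1 - x2)) * (1 - x2 * (1 - x3)) % q

def lagrange_eval(vals, r):
    """Evaluate at r the unique degree-<=len(vals)-1 polynomial with g(i) = vals[i]."""
    total = 0
    for i, v in enumerate(vals):
        num = den = 1
        for j in range(len(vals)):
            if j != i:
                num = num * (r - j) % q
                den = den * (i - j) % q
        total = (total + v * num * pow(den, -1, q)) % q
    return total

def round_poly(prefix):
    """Honest prover's message: values of g_t(y), summed over boolean suffixes."""
    free = n - len(prefix) - 1
    vals = []
    for y in range(deg + 1):
        s = 0
        for mask in range(2 ** free):
            suffix = [(mask >> b) & 1 for b in range(free)]
            s = (s + P(*(list(prefix) + [y] + suffix))) % q
        vals.append(s)
    return vals

def sumcheck(rng):
    prefix, claim, output = [], None, None
    for t in range(n):
        g = round_poly(prefix)           # prover sends g_t
        check = (g[0] + g[1]) % q        # verifier computes g_t(0) + g_t(1)
        if t == 0:
            output = check               # the claimed number of satisfying assignments
        elif check != claim:
            return None                  # reject, as after Eq. (3.3)
        r = rng.randrange(q)             # verifier's random challenge
        claim = lagrange_eval(g, r)
        prefix.append(r)
    return output if claim == P(*prefix) else None  # final spot-check of P

assert sumcheck(random.Random(0)) == 4   # the formula has 4 satisfying assignments
```

With an honest prover every consistency check passes regardless of the verifier's coins, so the output is exactly the satisfying-assignment count; a dishonest first message survives each round with probability at most deg/q, mirroring the D_t/q bound in Claim 3.18.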

The “savings” in running time above (i.e., the improvements over the brute-force
algorithm that runs in time 2^{(1+o(1))·n}) were achieved via a probabilistic interactive pro-
tocol. The following assumption, denoted #NSETH, asserts that without randomness
it is impossible to achieve running time 2^{(1−e)·n}, for any constant e > 0.
Assumption 3.19 (#NSETH). There do not exist a constant e > 0 and a non-deterministic
machine M that gets as input a formula Φ over n variables of size 2^{o(n)}, runs in time 2^{(1−e)·n},
and satisfies the following:

1. There exist non-deterministic choices such that M outputs the number of satisfying
assignments for Φ.

2. For all non-deterministic choices, M either outputs the number of satisfying assignments
for Φ or outputs ⊥.

Recall that the standard strong exponential-time hypothesis SETH asserts that for
every e > 0 it is hard to solve k-SAT with n variables in time 2^{(1−e)·n}, where k = k_e
is sufficiently large. The assumption #NSETH is incomparable to SETH: On the one
hand, in #NSETH we assume hardness with respect to a larger class of formulas (i.e.,
n-bit formulas of size 2^{o(n)}), and also assume hardness of the counting problem; but on
the other hand, the hardness in #NSETH is for non-deterministic machines rather than
just for deterministic algorithms.

4 Superfast derandomization of MA
The following is the “basic version” of the highly efficient reconstructive PRG, which
was mentioned in the beginning of Section 2. We will further refine this version later
on in Propositions 5.2 and 7.10.

Proposition 4.1 (a reconstructive PRG with unambiguous non-deterministic recon-
struction). For every e_0 > 0 there exists δ_0 > 0 and a pair of algorithms that for any N ∈ N
and f ∈ {0,1}^{N^{1+e_0/3}} satisfy the following:

1. (Generator.) The generator G gets input 1^N, oracle access to f, and a random seed of
length (1 + e_0) · log(N), and outputs an N-bit string in time N^{1+e_0}.

2. (Reconstruction.) For any D : {0,1}^N → {0,1} such that Pr_{s∈[N^{1+e_0}]}[D(G^f(1^N, s)) =
1] > Pr_{r∈{0,1}^N}[D(r) = 1] + 1/10, there exists a string adv of length |f|^{1−δ_0} such that
the following holds. When the reconstruction R gets input x ∈ [|f|] and oracle access to
D and non-uniform advice adv, it runs in non-deterministic time |f|^{1−δ_0}, issues its queries
in parallel, and unambiguously computes f_x.

Proof. For a sufficiently small γ = γ(e_0) to be determined later, let Samp : {0,1}^{N̄} ×
[L̄] → {0,1}^N be the sampler from Theorem 3.14, instantiated with parameter γ, with
a sufficiently small constant error, and with N̄ = N^{1+O(γ)} and L̄ = N̄^{1+O(γ)}.

The generator G. The generator encodes f to f̄ = Enc(f) using the code Enc in
Theorem 3.10, instantiated with parameters m = N and ρ, η that are sufficiently small
constants. We think of f̄ as a binary string (by naively encoding each symbol using
log(|Σ|) bits), in which case |f̄| = m̄ · log(|Σ|) = Õ(N^{1+e_0/3}). The generator then
partitions f̄ into L = |f̄|/N̄ consecutive substrings f̄_1, ..., f̄_L, and given seed (i, j) ∈
[L] × [L̄] it outputs the N-bit string Samp(f̄_i, j). The number of strings in the set is

L · L̄ = (|f̄|/N̄) · (N̄^{1+O(γ)}) = N^{1+e_0/3+O(γ)} < N^{1+e_0} ,

and the running time of the generator is Õ(N^{1+e_0/3}) < N^{1+e_0}.

The reconstruction R. Fix a (1/10)-distinguisher D : {0,1}^N → {0,1}. Denoting the
uniform distribution over the output-set of G^f by G, our assumption is that Pr[D(G) =
1] > Pr_{r∈{0,1}^N}[D(r) = 1] + 1/10.
Let D̄ : {0,1}^{N̄} → {0,1} be the function

D̄(z) = 1 ⟺ Pr_{j∈[L̄]}[D(Samp(z, j)) = 1] ≤ Pr_{r∈{0,1}^N}[D(r) = 1] + .01 ,     (4.1)

and let S = D̄^{−1}(1) and T = S̄ = D̄^{−1}(0).^{26} Note that |S̄| ≤ 2^{N̄^{1−γ}}, by the properties
of Samp. Then, by the definition of T we have that

Pr[D(G) = 1] = Pr_{i∈[L], j∈[L̄]}[D(Samp(f̄_i, j)) = 1]
≤ Pr_i[f̄_i ∈ T] + Pr_{i,j}[D(Samp(f̄_i, j)) = 1 | f̄_i ∉ T]
≤ Pr_i[f̄_i ∈ T] + (Pr_{r∈{0,1}^N}[D(r) = 1] + .01) .     (4.2)

Since Pr[D(G) = 1] − Pr_r[D(r) = 1] > 1/10, we deduce that Pr_i[f̄_i ∈ T] > 1/10 −
.01 > ρ, where the last inequality is by a sufficiently small choice of ρ.
Computing a “corrupted” version of f̄. We first construct a machine M that computes
a “corrupted” version of f̄, as follows. Let H be the hash family in Theorem 3.13,
using parameters m = N̄ and m′ = N̄^{1−γ/2}. We argue that:

Fact 4.1.1. With probability at least 1 − 2^{−N̄^{1−γ}} over h ∼ H, for every distinct z, z′ ∈ S̄ it
holds that h(z) ≠ h(z′).

Proof. For every distinct z, z′ ∈ S̄, the probability over h ∼ H that h(z) = h(z′) is at
most 2^{−N̄^{1−γ/2}}. By a union bound over |S̄|² ≤ 2^{2N̄^{1−γ}} pairs, with probability at least
1 − 2^{−N̄^{1−γ/2}} · 2^{2N̄^{1−γ}} > 1 − 2^{−N̄^{1−γ}} there does not exist a colliding pair in S̄. □
Let I = {i ∈ [L] : f̄_i ∈ T}. The machine M gets as advice the foregoing h, the set
{(i, h(f̄_i)) : i ∈ I}, and the value Pr_{r∈{0,1}^N}[D(r) = 1]. Given x ∈ [|f̄|], the machine
computes i ∈ [L] such that the index x belongs to the i-th substring of f̄, and if i ∉ I it
outputs zero. Otherwise, the machine:

1. Non-deterministically guesses a preimage z ∈ {0,1}^{N̄} for f̄_i under h.

2. Verifies that h(z) = f¯i using the advice value (i, h( f¯i )).

3. Verifies that z ∉ S, using the oracle access to D, the sampler Samp, and the
acceptance probability of D (that is given as advice).

4. If either of the two verifications failed, the machine aborts. Otherwise, it outputs
the bit in z corresponding to index x.

Note that the foregoing machine computes, in an unambiguous non-deterministic
manner, a string f̃ such that Pr_x[f̄_x = f̃_x] ≥ ρ. (This is because for every x belonging
to a substring indexed by i ∉ I it holds that f̃_x = 0, and for every other x it holds that
f̃_x = f̄_x.) The number of advice bits that M uses is at most Õ(N̄ + L · N̄^{1−γ/2} + N) ≤
N^{1+O(γ)}, and its running time is dominated by computing the hash function once
(which takes time Õ(N̄)) and computing D̄ once.

^{26} Denoting the complement of S both by S̄ and by T might seem unnecessarily cumbersome. However,
this formulation will generalize more easily later on when we prove the “furthermore” statement.
Computing f̄, and thus also f. Consider the execution of the local list-decoder Dec
from Theorem 3.10 with agreement ρ, when it is given oracle access to the “corrupted”
version of f̄ computed by M (i.e., the version in which a block f̄_i indexed by i ∈ I has
the correct values of f̄, and all other blocks are filled with zeroes). Fixing the “right”
index η ∈ [O(1/ρ)] of f in the corresponding list of codewords, we reduce the error
probability of Dec to less than 1/N² (by O(log(N)) repetitions) and now consider its
execution with a fixed random string.
The reconstruction procedure R gets as advice the non-uniform advice for M, the
index η, and the fixed random string for Dec (we will see that the latter string is of
length N^{O(γ)}). Given x ∈ [|f̄|], it runs Dec and answers its queries using M. If any
of the queries to M was answered by ⊥, we abort and output ⊥, and otherwise we
output the result of the list-decoder’s computation. Note that Dec can be described by
an oracle circuit of size poly(1/ρ) · N^η + log(1/ρ) + O(1) ≤ N^{2η} ≤ N^{O(γ)}, where we
relied on a sufficiently small choice of η. It issues its queries in parallel, since both Dec
and the machine M issue their queries in parallel.
We thus obtained a procedure for f̄ that is non-deterministic and unambiguous,
and uses at most N^{1+O(γ)} < |f|^{1−δ_0} advice bits, where the inequality relies on suffi-
ciently small choices of γ and of δ_0. Denoting the time for verifying that z ∈ T by K,
the running time of the procedure for f̄ is at most

N^{O(γ)} · (Õ(N̄) + K) = N^{1+O(γ)} + N^{O(γ)} · K ,

and using the naive algorithm for D̄ (which enumerates over i ∈ [L̄]) this is at most
N^{1+O(γ)} < |f|^{1−δ_0}.

Given the reconstructive PRG in Proposition 4.1, we can now prove Theorem 1.1:

Theorem 4.2 (derandomization with quadratic overhead from useful properties against
SVN circuits). For every e > 0 there exists δ > 0 such that the following holds. As-
sume that there exists an NTIME[N^{2+e/4}]-constructive property L useful against (N ∩
coN)TIME[2^{(2−δ)·n}]/2^{(1−δ)·n}. Then, for any time bound T we have that prBPTIME[T] ⊆
pr(N ∩ coN)TIME[T^{2+e}].
Proof. Let A be a prBPTIME[T] machine, fix a sufficiently large input length |x|,
and let N = T(|x|). Given input x, our derandomization algorithm guesses a string f_n
of length 2^n = N^{1+e/3} and verifies that f_n ∈ L (if the verification fails, it aborts).^{27} It
then enumerates over the seeds of the generator from Proposition 4.1, when the latter
is instantiated with e_0 = e/2 and with f_n as the oracle, to obtain N^{1+e/2} pseudoran-
dom strings w_1, ..., w_{N^{1+e/2}} ∈ {0,1}^N; and it outputs MAJ{A(x, w_i)}_{i∈[N^{1+e/2}]}. This
derandomization algorithm runs in time O(N^{2+e}) = O(T(n)^{2+e}).
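In the abstract, the enumerate-and-vote step of this algorithm looks as follows; A, gen, and the toy distinguisher below are placeholder names of ours, not objects from the proof.

```python
def derandomize(A, x, gen, num_seeds):
    """Replace A's random tape by each pseudorandom string and take a majority vote."""
    votes = sum(A(x, gen(s)) for s in range(num_seeds))
    return int(2 * votes > num_seeds)

# Toy check: a two-sided-error decider that errs only on the tape "11" (one
# quarter of all tapes) is decided correctly once the majority is taken.
tapes = {0: "00", 1: "01", 2: "10", 3: "11"}
A = lambda x, r: int(r != "11")
assert derandomize(A, None, lambda s: tapes[s], 4) == 1
```

The point of the reconstructive PRG is precisely that this vote is correct unless A(x, ·) distinguishes the pseudorandom strings from uniform, in which case the reconstruction argument below applies.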
Let δ ≤ δ_0/2 be sufficiently small. For any n ∈ N and N = 2^{n/(1+e/3)}, assume
that there is x ∈ {0,1}^{T^{−1}(N)} and f_n ∈ (L ∩ {0,1}^{2^n}) such that D_x(r) = A(x, r) is a
(1/10)-distinguisher for the generator above, when the latter guesses the truth-table
f_n. Let D̄_x be either the function computed by D_x (if D_x accepts a pseudorandom
input with higher probability than it does a random input) or the negation of that
function (otherwise). By Proposition 4.1, there is a reconstruction algorithm R for
f_n that runs in time |f_n|^{1−2δ} and uses oracle access to D̄_x and |f_n|^{1−2δ} bits of advice.
To simulate the oracle for R, we supply R with additional advice x of length |x| =
T^{−1}(2^{n/(1+e/3)}) ≤ |f_n|^{1−2δ} and with an advice bit indicating whether or not to flip
the output of D_x. Plugging in the time complexity of D_x as N = |f_n|^{1/(1+e/3)}, the
running time of R is |f_n|^{(1−2δ)+1/(1+e/3)} < |f_n|^{2−δ}, and it non-deterministically and
unambiguously computes the function whose truth-table is f_n.

^{27} For simplicity we assume that N^{1+e/3} is a power of two, since rounding issues do not meaningfully
affect the proof.
Now, assume towards a contradiction that there are infinitely many x ∈ {0,1}∗ and
f_n such that D_x is a (1/10)-distinguisher for the generator with f_n, and fix correspond-
ing advice strings for R as above. (On input lengths for which there are no suitable
x and f_n, the advice string indicates that R should compute the all-zero function.)
This yields a language in (N ∩ coN)TIME[2^{(2−δ)·n}]/2^{(1−δ)·n} whose truth-tables are included,
infinitely often, in L, contradicting the usefulness of L.

The proof of Theorem 4.2 also yields the following result, in which both the hy-
pothesis and the conclusion are stronger:

Theorem 4.3 (derandomization with quadratic overhead from batch-computable truth-
tables). For every e > 0 there exists δ > 0 such that the following holds. Assume that there
exists L ∉ i.o.(N ∩ coN)TIME[2^{(2−δ)·n}]/2^{(1−δ)·n} such that there is an algorithm that gets
input 1^n and prints the truth-table of L on n-bit inputs in time 2^{(2+e/4)·n}. Then, for any time
bound T we have that prBPTIME[T] ⊆ prDTIME[T^{2+e}].

We think of the algorithm in the hypothesis of Theorem 4.3 as “batch-computing”
the hard function: It prints the entire truth-table in time slightly larger than 2^{2n},
whereas computing individual entries in the truth-table cannot be done in time slightly
less than 2^{2n} (even using unambiguous non-determinism). The only difference be-
tween the proof of Theorem 4.2 and the proof of Theorem 4.3 is that in the latter,
instead of guessing and verifying the truth-table of a hard function, we explicitly com-
pute the truth-table using the hypothesized algorithm.

5 Superfast derandomization of AM
In this section we prove Theorems 1.2 and 1.3. Towards this purpose, in Section 5.1
we refine the reconstructive PRG from Proposition 4.1. Then, in Section 5.2.1, we
conditionally construct two PRGs that rely on different hardness hypotheses and have
different parameters, both of which use the foregoing refined reconstructive PRG. In
Section 5.2.2 we compose the two PRGs in order to prove Theorem 1.2. Lastly, in
Section 5.4 we use the reconstructive PRG in a different way to prove Theorem 1.3.

5.1 Refining the reconstructive PRG from Proposition 4.1


We now extend Proposition 4.1 in two ways. First, we argue that by allowing random-
ness in the reconstruction, we can reduce its query complexity (this is along the lines
of ideas from [DMO+20; CT21b]). Secondly, we claim that the reconstruction does not
need “full oracle access” to D; loosely speaking, the functionality of the reconstruction
is maintained as long as we can guarantee that D answers zero on a sufficiently large
fraction of the queries.

Definition 5.1 (α-indicative sequences). Let t, s ∈ N such that s|t, let α ∈ (0, 1), and let
d ∈ {0,1}^t. We say that d is (s, α)-valid if for every i ∈ [t/s] it holds that Σ_{j∈[s]} d_{(i−1)·s+j} ≥
α · s. We say that a sequence q ∈ {0,1}^t is (s, α)-indicative of d if for every i ∈ [t/s] it holds
that Σ_{j∈[s]} (q_{(i−1)·s+j} ∧ d_{(i−1)·s+j}) ≥ α · s. We say that a sequence q ∈ {0,1}^t is (s, α)-deficient
if there exists i ∈ [t/s] such that Σ_{j∈[s]} q_{(i−1)·s+j} < α · s.
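The three predicates of Definition 5.1 are straightforward to transcribe; in this sketch (our own, with block i occupying positions (i−1)s+1, ..., is) bits are 0/1 integers.

```python
def blocks(seq, s):
    """Split a length-t sequence into t/s consecutive blocks of size s."""
    return [seq[i:i + s] for i in range(0, len(seq), s)]

def is_valid(d, s, alpha):
    """(s, alpha)-valid: every block of d has at least alpha*s ones."""
    return all(sum(blk) >= alpha * s for blk in blocks(d, s))

def is_indicative(q, d, s, alpha):
    """(s, alpha)-indicative: every block has at least alpha*s common ones."""
    return all(sum(qi & di for qi, di in zip(bq, bd)) >= alpha * s
               for bq, bd in zip(blocks(q, s), blocks(d, s)))

def is_deficient(q, s, alpha):
    """(s, alpha)-deficient: some block of q has fewer than alpha*s ones."""
    return any(sum(blk) < alpha * s for blk in blocks(q, s))

d = [1, 1, 0, 1,  1, 0, 1, 1]        # t = 8, two blocks of s = 4
qq = [1, 0, 0, 1,  1, 0, 1, 0]
assert is_valid(d, 4, 0.5)           # each block of d has at least 2 ones
assert is_indicative(qq, d, 4, 0.5)  # each block has at least 2 common ones
assert not is_deficient(qq, 4, 0.5)
```

Note that indicativeness implies non-deficiency (a common one is in particular a one of q), matching how the three notions interact in Proposition 5.2.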

Proposition 5.2 (an extension of the PRG from Proposition 4.1). In the reconstruction
of Proposition 4.1, for any D satisfying the hypothesis and any input x and witness w for the
reconstruction procedure, denote by R̄^D(x, w) ∈ {f_x, ⊥} the output of R with oracle access to
D and advice adv. If we allow R to use randomness, then we may assume that it makes only
t̄ = N^{e_0/10} parallel queries to D and satisfies Pr[R^D(x, w) = R̄^D(x, w)] ≥ 2/3.
Furthermore, denoting N̄ = N^{1+e_0/3}, there exists s ∈ N satisfying s|t̄ and α ∈ (0, 1) and
a random variable h over {0,1}^{N̄^{1−2δ_0}} that can be sampled in quasilinear time, such that for
any D : {0,1}^N → {0,1} satisfying Pr_{r∈{0,1}^N}[D(r) = 1] ≤ 1/3, with probability at least
1 − 1/N over h ∼ h the following holds. For any input x ∈ [N̄] and witness w and random
coins γ, denote by d_{x,w,γ} ∈ {0,1}^{t̄} the evaluations of D on the queries made by R, and denote
by a_{x,w,γ} ∈ {0,1}^{t̄} the answers that R received to these queries. Then,

1. (Honest oracle.) For any f ∈ {0,1}^{N̄}, assume that Pr_{s∈[N^{1+e_0}]}[D(G^f(s)) = 1] ≥ 1/2.
Then, there exists adv ∈ {0,1}^{N̄^{1−2δ_0}} such that when R gets advice (h, adv) the following
holds.

(a) Completeness: For every x there exists w such that with probability 1 − 1/N over
γ it holds that d x,w,γ is (s, α)-valid, and if a x,w,γ is (s, α)-indicative of d x,w,γ then
R( x, w) outputs f x .
(b) Soundness: For every ( x, w), with probability at least 1 − 1/N over γ, if a x,w,γ is
(s, α)-indicative of d x,w,γ , then R( x, w) outputs either f x or ⊥.
2. (Dishonest oracles.) For every adv ∈ {0,1}^{N̄^{1−2δ_0}} there exists g ∈ {0,1}^{N̄} such that
when R gets advice (h, adv) the following holds. For every (x, w), with probability at
least 1 − 1/N over γ, if a_{x,w,γ} is (s, α)-indicative of d_{x,w,γ} then R(x, w) outputs either
g_x or ⊥.

3. (Deficient oracles.) For every advice (h, adv) to R and any ( x, w) and any choice of
γ, if a x,w,γ is (s, α)-deficient then R outputs ⊥.

We stress that the “dishonest oracles” claim and the “deficient oracles” claim do not depend on any particular choice of f for the generator G. That is, the two claims refer only to the behavior of the reconstruction R given advice (h, adv) and oracle access to D such that Pr_{r∈{0,1}^N}[D(r) = 1] ≤ 1/3.

Proof of Proposition 5.2. We follow the same proof as for Proposition 4.1 and explain the necessary changes. Let us start with proving the bound of N^{e_0/10} on the number of queries when R is allowed to use randomness.
We modify the definition of D̄ in Eq. (4.1). Specifically, let ν = Pr_{r∈{0,1}^N}[D(r) = 1], and let D̄ be a probabilistic procedure that gets z ∈ {0, 1}^N̄, uniformly samples s = O(log(N)) values i_1, . . . , i_s ∈ [L̄], and outputs 1 if and only if Pr_{j∈[s]}[D(Samp(z, i_j)) = 1] < ν + .005. We also modify the definitions of S and of T, as follows:
 

S = { z ∈ {0, 1}^N̄ : Pr_{j∈[L̄]}[D(Samp(z, j)) = 1] ≤ ν + .001 } ,

and
 

T = { z ∈ {0, 1}^N̄ : Pr_{j∈[L̄]}[D(Samp(z, j)) = 1] > ν + .01 } .

Note that T ⊆ S̄, that |S̄| ≤ 2^{N̄^{1−γ}} (by the properties of Samp, assuming we instantiate it with a sufficiently small error), and that Pr_i[f̄_i ∈ T] > ρ (using the exact same calculation as in Eq. (4.2)). Also, D̄ accepts every z ∈ S with probability at least 1 − 1/N², and rejects every z ∈ T with probability at least 1 − 1/N².
Given these properties, the rest of the proof continues with only one change. Specifically, in the third step of the execution of M, we compute D̄(z) (using randomness), and continue only if the output is zero. Since any z ∈ S will be accepted with high probability, if we continue then we are confident that z ∈ S̄, and thus (given that we verified h(z) = h(f̄_i)) we are certain that z = f̄_i. The final procedure that uses Dec and M uses at most t̄ = s · N^{O(γ)} < N^{e_0/10} queries to D.

The “furthermore” part. We partition the queries of R into t̄/s sets of size s in the natural way (i.e., each subset corresponds to a set of s queries made by one of the executions of M).28 The variable h is the choice of hash function, and as proved in Fact 4.1.1, with probability at least 1 − 2^{−N̄^{1−γ}} > 1 − 1/N, there are no distinct z, z′ ∈ S̄ such that h(z) = h(z′). We also modify the definition of D̄ and S and T above to use the value ν = 1/3 instead of ν = Pr_{r∈{0,1}^N}[D(r) = 1], and let α = 1/3 + .005.
Deficient oracles. Assume, for a moment, that all the queries that R makes to M during
its execution are to indices in substrings f¯i such that i ∈ I. Then, the soundness
condition for deficient oracles follows immediately by the definitions of M and R: If
in any execution of M less than α of its queries are answered by 1 then D̄ (z) = 0 and
M aborts (in which case R outputs ⊥).
The only gap is that some queries to M might be to indices in substrings f̄_i where i ∉ I. To handle this gap we change the definition of M as follows. Whenever i ∉ I, the original machine just outputs 0, whereas our new modified machine will non-deterministically guess z ∈ {0, 1}^N̄ and verify that z ∉ S, and only if the verification passes it outputs 0 (otherwise it aborts). This does not affect the original proof in any way, but now the soundness condition for deficient oracles holds even without the assumption that queries to M are such that i ∈ I.
Honest oracle. We now assume that D accepts a pseudorandom string of G^f with probability at least 1/2. Observe that Pr_i[f̄_i ∈ T] ≥ ρ as in the proof of Proposition 4.1; to see this, use the same calculation as in Eq. (4.2), only replacing the original value Pr_{r∈{0,1}^N}[D(r) = 1] with the new value ν = 1/3.
The string adv consists of the hash values {(i, h(f̄_i)) : i ∈ I}, where I = {i ∈ [L] : f̄_i ∈ T}, and of the local list decoding circuit Dec. (The circuit Dec is just as in the proof of Proposition 4.1: We consider the execution of Dec on the corrupted codeword defined by D and h and I, use naive error-reduction, and hard-wire a good random string. This circuit is given to R as part of the advice adv.) We can assume that the advice string is of length |f|^{1−2δ_0} (rather than |f|^{1−δ_0}) by choosing the parameter δ_0 to be smaller than in Proposition 4.1.
28 Recall that the queries that Dec makes to M are non-adaptive and that the queries that M makes to

D are also non-adaptive.

Now, recall that the non-determinism w for R yields non-deterministic strings for each of the executions of M. For every execution of M on query q and with non-determinism z, with probability at least 1 − 1/N², if at least α · s of M’s oracle queries are answered by 1, and D indeed evaluates to 1 on these queries, then M outputs the value σ ∈ {0, 1, ⊥} such that Pr[M^D(q, z) = σ] ≥ 2/3. By a union bound, with probability 1 − 1/O(N) this happens for all queries to M.
In this case, we can think of the oracle that Dec gets access to as a fixed string y over {0, 1, ⊥}. Given any input, if Dec gets an answer of ⊥ from M, then it outputs ⊥; and otherwise it outputs the answer corresponding to the unique codeword determined by y and by the list-decoding index.29
To prove completeness, note that for every x there exists w for which σ ∈ {0, 1}
for every possible query q to M. Also, for such w, all the corresponding z’s will be in
S̄. Thus, with probability 1 − 1/O( N ) over the coins of M, at least an α-fraction of the
queries of M to the oracle will be to 1-instances of D. In this case, when at least an
α-fraction of the received answers are 1, then Dec (and hence also R) outputs f x .
Dishonest oracles. Fix any advice adv to R. We can assume without loss of generality that adv consists of hash values {(i, h_i) : i ∈ I}, where I ⊆ [L] is such that |I| ≥ ρ · L, and of the index η ∈ [O(1/ρ)] of a codeword for the local decoder Dec and a random string for Dec (otherwise R can always output ⊥, regardless of x and w).
We partition [L] into Ī = [L] \ I, I_0 = {i ∈ I : ∃z ∈ S̄, h(z) = h_i}, and I_1 = I \ I_0. For every i ∈ I_0 let z(i) be the unique string in S̄ such that h(z(i)) = h_i. Denote by ḡ ∈ {0, 1}^{|f̄|} the following string: For every q ∈ [|ḡ|], let i(q) be the i ∈ [L] such that q indexes a location in ḡ_i, and let

ḡ(q) = 0 if i(q) ∈ Ī ∪ I_1, and ḡ(q) = z(i)_q if i(q) ∈ I_0,

where z(i)_q is the bit in z(i) corresponding to the location indexed by q. Note that ḡ along with the index η together define a unique codeword g (i.e., g is the η-th codeword in the list of codewords that agree with ḡ on at least a ρ fraction of the indices).
Now fix any (x, w). For any query q to M with non-determinism z′, if i(q) ∉ I then by definition M can only output either ḡ(q) = 0 or ⊥. Also, with probability at least 1 − 1/N² over γ, if at least α · s of M’s oracle queries are answered by 1 and D evaluates to 1 on these queries:

1. If i(q) ∈ I_0 then M outputs ḡ(q) if z′ = z(i), and ⊥ otherwise.

2. If i(q) ∈ I_1 then M outputs ⊥. (Because the oracle answers guarantee that z′ ∉ S̄, and hence h(z′) ≠ h_i.)

By a union bound, with probability 1 − 1/O(N) the above holds for all queries that Dec makes to M. Now, consider an execution of Dec on input x ∈ [|f|] with oracle access to ḡ and with the index η and the randomness that R provides it. Denote by g the string such that g_x is the answer of Dec on input x ∈ [|f|] in such an execution. Note that g depends only on D, on h, and on the advice adv.
The last step is to analyze the behavior of Dec when it gets oracle access to M rather than to ḡ, under the condition on M’s random choices above and the assumption that a_{x,w,γ} is (s, α)-indicative of d_{x,w,γ}. Recall that Dec issues its queries in parallel, and therefore it makes the same queries to M and to ḡ. We now consider two
29 To be accurate, in the latter case Dec returns the answer corresponding to the string y′ in which ⊥’s are replaced with 0’s.

cases: If at least one query of Dec to M is answered by ⊥, then R aborts and outputs
⊥ (by the definition of R). Otherwise, all of the queries of Dec to M are answered
according to ḡ; in this case Dec outputs gx .

5.2 The superfast derandomization result: Basic version


In this section we prove Theorem 1.2. First, in Section 5.2.1 we present two PRGs that use the refined construction from Proposition 5.2. Then, in Section 5.2.2 we show how to compose these PRGs to obtain our hardness vs. randomness results for AM protocols.

5.2.1 Two PRGs that will be composed later


The first PRG, presented next, uses a truth-table that can be recognized in near-linear time, and is hard for MAM protocols running in time 2^{(1−δ)·n} with 2^{(1−δ)·n} bits of non-uniform advice, for a small constant δ > 0. This PRG allows us to reduce the number of coins the verifier uses in any T-time protocol to approximately log(T). (Recall that in our final result we want the number of coins to be approximately log(n).)

Proposition 5.3 (the “outer PRG” – radically reducing the number of random coins). For every e > 0 there exists δ > 0 such that the following holds. Assume that there exists an NTIME[N^{1+e/3}]-constructive property L useful against MAM[2^{(1−δ)·n}]/2^{(1−δ)·n}. Then, for every time bound T it holds that prAMTIME[T] ⊆ AMTIME[T^{1+e}, (1 + e) · log(T)].

Proof. Fix a problem Π = (Y, N) ∈ prAMTIME[T], and let V be a T-time verifier for Π. Given input x ∈ {0, 1}^n, let N = T(n)^{1+e/3}. We guess f_n ∈ {0, 1}^N and verify that f_n ∈ L (otherwise we abort). Consider the generator G from Proposition 5.2 with e_0 = e, input 1^N, and oracle access to f_n, and denote its set of outputs by s_1, . . . , s_{N^{1+e}} ∈ {0, 1}^N. The new verifier V′ chooses a random i ∈ [N^{1+e}] and simulates V at input x with random coins s_i. Note that this verifier indeed runs in time O(N^{1+e}).
The reconstruction argument. Assume towards a contradiction that there exists an infinite set S of pairs (x, f_n) such that x ∈ N and Pr_{i∈[N^{1+e}]}[∃w : V(x, s_i, w) = 1] ≥ .5, where the s_i’s are the outputs of G on non-deterministic guess f_n. For every such pair, denote by D_x : {0, 1}^N → {0, 1} the function D_x(z) = 1 ⟺ ∃w : V(x, z, w) = 1, and note that Pr_{r∈{0,1}^N}[D_x(r) = 1] ≤ 1/3 (because x ∈ N).
We design an MAM protocol for f n as follows. Consider the reconstruction al-
gorithm R from Proposition 5.2 with oracle access to Dx and with the corresponding
advice string. Given input z ∈ [| f n |], our protocol acts as follows:

1. The prover sends non-determinism w for R.

2. The verifier simulates R using random coins, computes a set q_1, . . . , q_t of queries to D_x, and sends them back to the prover.

3. The prover sends t responses w1 , . . . , wt , to be used as non-determinism for Dx


with queries q1 , . . . , qt .

4. (Deterministic step.) The verifier computes the values {d_i = V(x, q_i, w_i)}_{i∈[t]}. Then it simulates R with witness w and with the query answers {d_i}_{i∈[t]}, and accepts iff R accepts.

We now use the “furthermore” part of Proposition 5.2. By the “honest oracle”
part, and our assumption about Dx , there exists adv and s ∈ N and α ∈ (0, 1) such
that for every x there exists w for which with high probability over R’s random coins,
the oracle queries can be answered in a manner that will be (s, α)-indicative of the
(s, α)-valid sequence of answers by Dx , and whenever that happens R outputs f x ;
and whenever the sequence of answers to oracle queries are (s, α)-indicative of the
sequence of answers by Dx , then R outputs a value in {⊥, f x }.
We hard-wire adv, s, α as non-uniform advice to the MAM protocol. For every x the prover can send the “correct” w in the first step, and witnesses w_1, . . . , w_t in the third step such that d_i = D_x(q_i) for all i ∈ [t], in which case the output of R is f_x, with high probability. To establish the soundness of the protocol, observe that we always have d_i ≤ D_x(q_i) (i.e., it is impossible that d_i = 1 whereas D_x(q_i) = 0). Then, by Proposition 5.2, with high probability the following holds: If the sequence of answers to the verifier’s queries is (s, α)-deficient, then R outputs ⊥; otherwise, the answers are (s, α)-indicative of the sequence of answers by D_x, in which case we are in the soundness case and the output is either f_x or ⊥.
By hard-wiring appropriate advice for every input length n such that ( x, f n ) ∈ S,
the protocol above computes a language whose truth-tables are included in L infinitely
often. (As in the proof of Theorem 4.2, on input lengths for which there are no suitable
x and f n , the advice string indicates that the protocol should compute the all-zero
function.) The time complexity of the protocol is at most

O( |f_n|^{1−δ_0} + N^{e/10} · N ) < |f_n|^{1−δ} ,    (5.1)

in which |f_n|^{1−δ_0} is the complexity of R, N^{e/10} is the number of queries (i.e., t), and N is the complexity of V,

where the inequality relies on a sufficiently small choice of δ; and similarly, the advice needed for this protocol is of length |x| + |f_n|^{1−δ_0} + O(1) < |f_n|^{1−δ}. This contradicts the usefulness of L.

Remark 5.4 (handling AM with imperfect completeness). Observe that the proof
above can be easily adapted to yield derandomization of AM protocols that do
not have perfect completeness, at the cost of strengthening the hardness assumption.
Specifically, assume that f_n is hard even for MA protocols (with the specific running time and advice) that can issue oracle queries to AMTIME[O(n)]. Then, given x ∈ Y such that Pr[∃w : V(x, s_i, w) = 1] is too low, we can define D̄_x = 1 − D_x, the negation of the function D_x defined in the proof above, and run the reconstruction argument with oracle access to D̄_x, contradicting the hardness of f_n. We chose to present our result as above merely because AM with perfect completeness is a natural class to consider, and we preferred using the weaker hypothesis.

A tangent: Derandomization of prBPP. Using the proof approach of Proposition 5.3, we can obtain a derandomization of prBPP (rather than of AM) in time that is optimal under #NSETH, using hypotheses that compare favorably to previous results. Specifically, we show that:

Theorem 5.5 (optimal derandomization of prBPP without OWFs). For every e > 0 there exists δ > 0 such that for any polynomial T(n) the following holds. Assume that:

1. There exists L ∉ i.o.MATIME[2^{(1−δ)·n}]/2^{(1−δ)·n} and a deterministic algorithm that gets input 1^n, runs in time 2^{(1+e/3)·n}, and prints the truth-table of L on n-bit inputs.

2. For a constant k = k_T ≥ 1 there exists L ∉ i.o.DTIME[2^{(k−δ)·n}]/2^{(1−δ)·n} and a deterministic algorithm that gets input 1^n, runs in time 2^{(k+1)·n}, and prints the truth-table of L on n-bit inputs.

Then, prBPTIME[T] ⊆ prDTIME[n · T^{1+e}].
The conclusion in Theorem 5.5 is identical to that in [CT21b], and so is the hypoth-
esis in Item (2). The new part is that the hypothesis in Item (1) of Theorem 5.5 replaces
the cryptographic hypothesis that one-way functions exist in [CT21b].
Proof sketch for Theorem 5.5. Using the hypothesis in Item (1), we mimic the proof of Proposition 5.3 to argue that prBPTIME[T] ⊆ prBPTIME[T^{1+e/2}, (1 + e/2) · log(T)], where the latter class is that of problems that can be decided in time T^{1+e/2} with (1 + e/2) · log(T) random coins. (This follows since the generator in Proposition 5.3 can now deterministically print f_n, which is the truth-table of L on (1 + e/6) · log(N) input bits; and since the reconstruction protocol does not need an additional round, because the distinguisher is now just a deterministic function.)
We then use the hypothesis in Item (2) with the Nisan-Wigderson generator, when the latter is instantiated for small output length (see Theorem 5.16 for a statement of the generator’s parameters). Instantiating this generator with the truth-table of L on inputs of length (1 + Θ(e)) · log(n) as in [CT21b, Section 4.2] or in Proposition 5.17, we deduce that prBPTIME[T^{1+e/2}, (1 + e/2) · log(T)] ⊆ prDTIME[n · T^{1+e}].

The inner PRG. The next PRG that we present assumes that there is a function whose truth-tables can be recognized in time approximately n^k and that is hard for MAM protocols running in time 2^{(1−δ)·k·n} with 2^{(1−δ)·n} bits of non-uniform advice (again, δ > 0 here is a small constant). Under this assumption, any protocol that uses at most linearly-many random coins can be derandomized with time overhead that is multiplicative in the input length.
Proposition 5.6 (the “inner PRG” – derandomizing AM protocols with few random coins). For every e > 0 there exists δ > 0 such that the following holds. Fix k ≥ 1, and assume that there exists an NTIME[N^{k+e}]-constructive property L useful against MAM[2^{(1−δ)·k·n}]/2^{(1−δ)·n}. Then,

AMTIME[n^k, n] ⊆ NTIME[n^{1+(1+e)·k}] .
Proof. Fix Π ∈ AMTIME[n^k, n], and let V be the n^k-time verifier that uses at most n random coins. Given input x ∈ {0, 1}^n and witness w ∈ {0, 1}^{n^k}, we guess f_n ∈ {0, 1}^{n^{1+e/3}} and verify that f_n ∈ L. Consider G from Proposition 5.2 with e_0 = e, input 1^n, and oracle access to f_n. We enumerate over the output-set s_1, . . . , s_{n^{1+e}} ∈ {0, 1}^n of G and output MAJ_i{V(x, s_i, w)}. Note that this verifier indeed runs in time O(n^{(1+e/3)·(k+e)} + n^{1+e} · n^k) < n^{1+(1+e)·k}.
The reconstruction argument is essentially identical to the one in the proof of Proposition 5.3, the only difference being its time complexity. Specifically, using a calculation analogous to Eq. (5.1), the time complexity of the reconstruction in our parameter setting is

O( |f_n|^{1−δ_0} + n^{e/10} · n^k ) < |f_n|^{(1−δ)·k} ,

in which |f_n|^{1−δ_0} is the complexity of R, n^{e/10} is the number of queries, and n^k is the complexity of V, where the inequality relies on the fact that k + e/10 < (1 + e/3)(1 − δ) · k, using a sufficiently small choice of δ > 0.

5.2.2 Composing the two PRGs
By a straightforward combination of Proposition 5.3 and Proposition 5.6, we obtain
the following result, which is the first conclusion in Theorem 1.2:

Corollary 5.7 (superfast derandomization of AM; the first part of Theorem 1.2). For every e > 0 there exists δ > 0 such that the following holds. Assume that for every k ≥ 1 there exists an NTIME[N^{k+e/3}]-constructive property useful against MAM[2^{(1−δ)·k·n}]/2^{(1−δ)·n}. Then, for every polynomial T it holds that prAMTIME[T] ⊆ prNTIME[n · T^{1+e}].

Proof. Let T(n) = n^c. By Proposition 5.3 with parameter value e/3 (using our hypothesis with k = 1), we have that prAMTIME[T] ⊆ prAMTIME[T^{1+e/3}, (1 + e/3) · log(T)]. By Proposition 5.6 instantiated with parameter values e/3 and k = (1 + e/3) · c, such that T^{1+e/3} = n^k, we have that prAMTIME[n^k, O(log(n))] ⊆ prNTIME[n^{1+(1+e/3)·k}], and note that n^{1+(1+e/3)·k} ≤ n · T^{1+e}.

To prove the “furthermore” part of Theorem 1.2 we combine the foregoing result
with a careful application of the standard round-reduction procedure for AM proto-
cols by Babai and Moran [BM88].

Corollary 5.8 (superfast derandomization of AMTIME^{[c]}; the “furthermore” part of Theorem 1.2, restated). For every e > 0 there exists δ > 0 such that the following holds. Assume that for every k ≥ 1 there exists an NTIME[N^{k+e/3}]-constructive property useful against MAM[2^{(1−δ)·k·n}]/2^{(1−δ)·n}. Then, for every polynomial T and constant c ∈ ℕ,

prAMTIME^{[c]}[T] ⊆ prNTIME[n · T^{⌈c/2⌉+e}] ,

and

prMATIME^{[c]}[T] ⊆ prNTIME[T^{⌊c/2⌋+1+e}] .

Proof. Let us first prove the claim about AM, and note that the case of c = 2 was
proved in Corollary 5.7. For c ≥ 3, we first use the standard round-reduction of
Babai and Moran [BM88] to unconditionally simulate the class in AM, while carefully
tracking the simulation overheads, and then we apply Corollary 5.7.
Specifically, denote by prAMTIME^{[c,T′]}[T] (resp., prMATIME^{[c,T′]}[T]) the class of problems decidable by a protocol with c turns in which the verifier runs in time T in each turn and the prover sends T′ bits in each turn. We use the following result:
Proposition 5.8.1 (the round-reduction of [BM88] for AM; see, e.g., [Gol08, Appendix F.2.2.1, Extension]). For every constant c ≥ 3 we have that prAMTIME^{[c,T′]}[T] ⊆ prAMTIME^{[max{c−2,2},O(T′)]}[T · T′].
We can apply Proposition 5.8.1 for c′ = ⌈c/2⌉ − 1 times to deduce that

prAMTIME^{[c]}[T] ⊆ AMTIME[O(T^{⌈c/2⌉})] ,

and the claim follows from Corollary 5.7.

The case of MATIME^{[c]}. Let Π = (Y, N) ∈ prMATIME^{[c]}[T] and let V be a corresponding T-time protocol. Let V_{>1} be the sub-protocol of V after the first turn, which takes an input x ∈ {0, 1}^n to Π and a first prover message w_1 ∈ {0, 1}^{T(n)} as input. Note that V_{>1} is an AMTIME^{[c−1]}[O(n)] protocol on n + T(n) bits of input. From the definition of MATIME^{[c]}[T], we have:

∃w ∈ {0, 1}^{T(n)} s.t. V_{>1}(x, w) accepts with probability 1, for x ∈ Y ;
∀w ∈ {0, 1}^{T(n)}, V_{>1}(x, w) accepts with probability at most 1/3, for x ∈ N .

Let Π̄ = (Ȳ, N̄) such that (x, w) ∈ Ȳ if V_{>1}(x, w) accepts with probability at least 2/3, and (x, w) ∈ N̄ if V_{>1}(x, w) accepts with probability at most 1/3. Note that Π̄ ∈ prAMTIME^{[c−1]}[O(n)], and hence (by the first part of the proof) we have that Π̄ ∈ prNTIME[n^{1+⌈(c−1)/2⌉+e}] = prNTIME[n^{1+⌊c/2⌋+e}].
Let M(x, w) be the corresponding nondeterministic machine that decides Π̄. We can decide Π as follows: Given input x ∈ {0, 1}^n, guess w ∈ {0, 1}^{T(n)}, simulate M(x, w), and accept iff M accepts. Note that for x ∈ Y there exists w ∈ {0, 1}^{T(n)} such that (x, w) ∈ Ȳ, and hence M(x, w) accepts on this particular w. And when x ∈ N, for all w ∈ {0, 1}^{T(n)} we have (x, w) ∈ N̄, and thus M(x, w) rejects on all possible w. Therefore, we have that Π ∈ prNTIME[T^{1+⌊c/2⌋+e}].

5.3 The superfast derandomization result: Stronger version


In this section we prove the stronger version of Theorem 1.2, which was mentioned
after the original theorem’s statement. Our main goal will be to relax the hypothesis
that is needed for values of k > 1, by requiring hardness only against N T IME
machines with advice, rather than against MAM protocols with advice.
To do so, we replace the “inner” generator from Proposition 5.6 with the Nisan-
Wigderson generator, which is similar to a proof in [CT21b]. The key challenge is that
the latter generator requires hardness against algorithms (with advice) that have oracle
access to N T IME , whereas we only want to assume hardness against N T IME ma-
chines (with advice). To bridge this gap, we first show, in Section 5.3.1, how to trans-
form a truth-table that is hard for N T IME machines with advice into a truth-table
that is hard for DT IME machines with advice that have oracle access to N T IME .
This proof follows an argument from [SU06], but uses a more careful analysis to obtain
tighter bounds on the running time. Then, in Section 5.3.2 we combine this transfor-
mation with an application of the Nisan-Wigderson generator as above to prove the
stronger version of Theorem 1.2 (for the result statement, see Theorem 5.16).

5.3.1 A refined analysis of the “random curve reduction”


In this section it will be more convenient to work with the notion of non-uniform programs, rather than with Turing machines that take advice. A non-uniform program A on n-bit inputs with advice complexity α and running time T is a RAM program Π of description size α (i.e., |Π| ≤ α). Given an input x ∈ {0, 1}^n, A(x) is defined to be the output of running the program Π on input x for at most T steps (if Π does not stop, we define the output to be 0). The notation “A on n-bit inputs” means that we only care about A’s outputs on n-bit inputs.30
30 This is the reason why we use program instead of algorithm. We want to emphasize the fact that a non-uniform program is a non-asymptotic object and we only care about its behavior on a fixed input length n; this is similar to a circuit with n-bit inputs.

We will need the notions of non-uniform single-valued programs (which are the non-uniform analogues of NP ∩ coNP) and of non-uniform non-adaptive SAT-oracle programs (which are the non-uniform analogues of P with oracle access to NP), defined as follows.
Definition 5.9 (non-uniform SVN programs). A single-valued nondeterministic program
A on n-bit inputs with advice complexity α and running time T is a nonuniform program with
the same advice complexity and running time which receives two inputs: an input x ∈ {0, 1}n
and a second input y of length at most T, and outputs two bits: the value bit and the flag bit.
A computes the function f : {0, 1}n → {0, 1} if the following hold:
• For every x ∈ {0, 1}n and y, if the flag bit of A( x, y) equals 1, then the value bit of
A( x, y) equals f ( x ).

• For every x, there is y such that the flag bit of A( x, y) equals 1.


We note that if α = T, then a single-valued nondeterministic program is essentially
a single-valued nondeterministic circuit (see [SU06]).
Definition 5.10 (non-uniform SAT-oracle programs). A non-adaptive non-uniform SAT-oracle program A on n-bit inputs is a pair of non-uniform programs A_pre and A_post. The program A_pre has n-bit inputs, and on an input x ∈ {0, 1}^n it outputs queries q_1, . . . , q_k.31 The program A_post receives x ∈ {0, 1}^n together with k bits a_1, . . . , a_k, where a_i = 1 if and only if q_i ∈ SAT, and outputs a single answer bit.
The running time (resp. advice complexity) of A is defined as the sum of the running time
(resp. advice complexity) of Apre and Apost . We also call k the query complexity of A.
For convenience, we will assume that the SAT oracle takes the description of a for-
mula φ as input, and the number of variables in φ is at most the description length of
φ. In other words, to prove that a formula φ with m-bit description length is satisfiable,
one only needs to provide a satisfying assignment with at most m bits.

The transformation. The main result that we prove in this section is a transformation
of truth-tables of functions that are hard for nondeterministic programs to truth-tables
of functions that are hard for SAT-oracle programs, as follows:
Theorem 5.11 (the “random curve reduction” with a careful analysis of overheads).
There is a universal constant c > 1 and an algorithm Alow-d which takes an input function
f : {0, 1}n → {0, 1} , a real e ∈ (0, 1), and an integer k ≤ 2en/100 as input, and outputs
the truth-table of another function g : {0, 1}(1+e)n → {0, 1} such that for all sufficiently large
n ∈ N:
1. Alow-d ( f , e, k) runs in Õ(2(1+e)n ) time.

2. If f does not have a single-valued nondeterministic program with running time T and
advice complexity α, then g does not have a non-adaptive non-uniform SAT-oracle pro-
gram with running time T · (n1/e · k )−c , advice complexity α − O(n2 · k ), and query
complexity k.
To prove Theorem 5.11 we will need the following technical tools. First, we need
the standard low-degree extension of Boolean functions, and we use the precise defi-
nition from [SU06].
31 On all inputs from {0, 1}^n the program A_pre outputs the same number of queries.

Definition 5.12 (low-degree extension). Let f : {0, 1}^n → {0, 1} be a function, let h, q be powers of 2 such that h ≤ q, and let d ∈ ℕ be such that h^d ≥ 2^n. Let H be the first h elements from F_q, and let I be an efficiently computable injective mapping from {0, 1}^n to H^d. The low-degree extension of f with respect to q, h, d is the unique d-variable polynomial f̂ : F_q^d → F_q with degree h − 1 in each variable, such that f̂(I(x)) = f(x) for all x ∈ {0, 1}^n and f̂(v) = 0 for all v ∈ H^d \ Im(I).
We will also consider the Boolean version of f̂, denoted f_bool : {0, 1}^{d log q + log log q} → {0, 1} and defined by f_bool(x, i) = f̂(x)_i, where f̂(x)_i denotes the i-th bit of the binary representation of f̂(x).
Since in Definition 5.12 we always consider finite fields F_q whose size is a power of 2, we can naturally encode each element of F_q by exactly log q bits. Hence, the mapping I can be constructed as follows: given x ∈ {0, 1}^n, partition it into d consecutive blocks x^{(1)}, . . . , x^{(d)}, each of size log h, and ignore the remaining bits (note that n ≥ d · log h), and then interpret each x^{(i)} as an element of H via a natural bijection (first interpret x^{(i)} as an integer from [h], and then interpret the obtained integer u as the u-th element of H). Also, by standard interpolation, the truth-table of f̂ or f_bool can be computed in Õ(q^d) time from the truth-table of f. For simplicity, we will choose the representation of F_q in such a way that for every u ∈ {0, 1}^n, f(u) = f_bool(I(u), 1).
We also need the notion of parametric curves, and a concentration bound for func-
tions on random curves that was proved in [SU06].
Definition 5.13 (parametric curves). Let q be a prime power and let f_1, . . . , f_q be an enumeration of the elements of F_q; for convenience we will assume f_1 = 0. Given r elements v_1, . . . , v_r ∈ F_q^d for r ≤ q, we define the curve passing through v_1, . . . , v_r to be the unique degree-(r − 1) polynomial function c : F_q → F_q^d such that c(f_i) = v_i for every i ∈ [r]. We say that a curve c is one-to-one if c(f_i) ≠ c(f_j) for every distinct pair i, j from [q].
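A sketch of such a curve via Lagrange interpolation, again over a prime field for simplicity (interpolation works over any field, but the modular inverse below uses primality); we enumerate the field elements as f_i = i − 1, so f_1 = 0 as in the definition:

```python
def curve_through(points, q):
    """Return the degree r-1 curve c: F_q -> F_q^d with c(f_i) = points[i-1],
    for the enumeration f_i = i - 1 of the prime field F_q."""
    r, d = len(points), len(points[0])
    def c(t):
        out = [0] * d
        for i, v in enumerate(points):
            num, den = 1, 1
            for j in range(r):
                if j != i:
                    num = num * (t - j) % q
                    den = den * (i - j) % q
            coeff = num * pow(den, q - 2, q) % q  # Lagrange coefficient L_i(t)
            for k in range(d):
                out[k] = (out[k] + v[k] * coeff) % q
        return tuple(out)
    return c
```

Each coordinate of c is a univariate polynomial of degree r − 1, so restricting a degree-(h − 1)d polynomial such as f̂ to a curve yields a low-degree univariate polynomial — the fact exploited throughout this section.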
The following technical lemma from [SU06] asserts that any function (or set of functions) satisfies good concentration bounds on random curves. To state it, denote by c(x, v_1, . . . , v_r) the unique curve passing through x, v_1, . . . , v_r, and note that c(x, v_1, . . . , v_r)(0) = x by its definition. We also use F*_q to denote F_q \ {0}. For two finite sets W, Z such that W ⊆ Z and h : Z → [0, 1], we define

μ_W(h) = (1/|W|) · Σ_{i∈W} h(i) .
Then, the technical lemma from [SU06] is as follows:


Lemma 5.14 (curves are good samplers; see [SU06]). Let q be a prime power and let r be an integer such that 2 ≤ r < q. For every point x ∈ F_q^d, every list of k functions h_1, . . . , h_k : F_q^d → [0, 1], and every δ ∈ (0, 1), the probability over a random choice of points v_1, . . . , v_r ∈ F_q^d that c(x, v_1, . . . , v_r) is one-to-one and32

| μ_{c(x,v_1,...,v_r)(F*_q)}(h_i) − μ_{F_q^d}(h_i) | < δ

for every i ∈ [k] is at least

1 − 8k · ( ( 2r / ((q − 1) · δ²) )^{r/2} + 1/q^{d−2} ) .

32 We use c(x, v_1, . . . , v_r)(F*_q) to denote the set {c(x, v_1, . . . , v_r)(u) : u ∈ F*_q}.
We now turn to the actual proof of Theorem 5.11.

Proof of Theorem 5.11. We mimic the proof of [SU06, Theorem 3.2] with a more careful choice of parameters. We also keep track of the running time and non-uniformity separately, instead of using a single measure of non-uniform circuit size as in [SU06].
We define pw(x) = 2^{⌈log x⌉}; that is, pw(x) is the smallest power of 2 that is at least x. Let c_0 ∈ ℕ be a large enough constant to be chosen later. We first define the following parameters:

1. r = c_0 · n.

2. h = pw(h̃), where h̃ = max{ c_0 · r² · (kn)⁴ , (c_0 · (n + 3) · r)^{3/e} }.

3. d = ⌈n/ log h⌉ + 3.

4. q = pw(q̃/2), where q̃ = c_0 · h · d · r.
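These parameters can be instantiated numerically; the sketch below checks the inequality of Fact 5.14.1 (stated next) for one concrete setting. We stress our illustrative assumptions: the statement leaves c_0 unspecified and only claims the inequality for sufficiently large n, so we take c_0 = 4 and n = 2000 with e = 1/2 and k = 2^{en/100} = 1024 purely as an example:

```python
import math

def pw(x):
    """pw(x) = 2^ceil(log x): the smallest power of 2 that is at least x."""
    return 1 << math.ceil(math.log2(x))

def parameters(n, e, k, c0):
    """Compute (r, h, d, q) as defined above; 3/e is assumed integral here."""
    r = c0 * n
    h = pw(max(c0 * r**2 * (k * n)**4, (c0 * (n + 3) * r) ** int(round(3 / e))))
    d = math.ceil(n / math.log2(h)) + 3
    q = pw(c0 * h * d * r // 2)
    return r, h, d, q

def fact_5_14_1_holds(n, e, k, c0):
    """Check that d * log q + log log q <= (1 + e) * n."""
    r, h, d, q = parameters(n, e, k, c0)
    return d * math.log2(q) + math.log2(math.log2(q)) <= (1 + e) * n
```

With the example setting one gets h = 2^156, d = 16, and q = 2^174, so d · log q + log log q ≈ 2791 ≤ 1.5 · 2000 = 3000, as the fact requires.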

Construction of the function g. Now, we define f̂ : F_q^d → F_q and f_bool : {0, 1}^{d·log q+log log q} → {0, 1} as the low-degree extension of f with respect to q, h, d, according to Definition 5.12. We then define our output function g so that given an input x ∈ {0, 1}^{(1+e)·n}, g(x) computes f_bool on the length-(d · log q + log log q) prefix of x.
Fact 5.14.1. By our parameter choices we have that (1 + e) · n ≥ d · log q + log log q, and hence g is well-defined.

Proof. By the definition of q, we have that q ≤ q̃ = h · (c_0 · d · r). By the definition of h, we have that

h ≥ h̃ ≥ (c_0 · (n + 3) · r)^{3/e} ≥ (c_0 · d · r)^{3/e} .
The above further implies that q ≤ q̃ ≤ h^{1+e/3}. Hence, for a sufficiently large n ∈ ℕ, we have that

d · log q + log log q = (⌈n/ log h⌉ + 3) · log q + log log q ≤ n · (log q / log h) + 5 · log q ≤ (1 + e/3) · n + 5 · log q .

Next, from the definition of q and the assumption that k ≤ 2^{en/100}, we have

log q ≤ O(log n) + log h ≤ O(log n) + 4 · log k ≤ O(log n) + 4 · (e/100) · n ≤ (e/20) · n .

Putting the above together, we have

d · log q + log log q ≤ (1 + e/3 + e/4) · n < (1 + e) · n .


As mentioned after Definition 5.12, the truth-table of g can be computed in Õ(2^{(1+e)n}) time. This proves Item (1).

Construction of a probabilistic program B0 for f_bool. To prove Item (2), it suffices to show the following:

• If f_bool has a non-adaptive non-uniform SAT-oracle program A = (Apre, Apost) with running time T ≥ 1, advice complexity α ≥ 0, and query complexity k,

• then f_bool has a single-valued nondeterministic program B with advice complexity α + O(n^2 · k) and running time O(T · q + poly(q)).33

We will first construct a probabilistic single-valued nondeterministic program B0 that computes f_bool, and then fix its randomness to obtain the desired program that computes f (the definition of such a probabilistic program will be clear later in the proof). Using our assumption about f_bool, we can construct a non-adaptive non-uniform SAT-oracle program Â = (Âpre, Âpost) with query complexity m = k · log q, advice complexity α, and running time T · log q that computes f̂ : F_q^d → F_q.

For x ∈ F_q^d, let Q1(x), . . . , Qm(x) and A1(x), . . . , Am(x) be the queries and answers associated with Â, respectively, on input x. We then define p_i = µ_{F_q^d}(A_i) for every i ∈ [m] and δ = 1/(9m). The probabilistic program B0 takes the α-bit advice of Â together with all of p1, . . . , pm as advice.34
On an input (x, b) ∈ {0, 1}^(d·log q) × [log q], B0 works as follows:

1. Pick v1, . . . , vr ∈ F_q^d uniformly at random, and set x_a = c(x, v1, . . . , vr)(a) for every a ∈ F_q. Simulate Âpre to compute the queries Qi(x_a) for every i ∈ [m] and a ∈ F*_q.

2. For every i ∈ [m], set n_i = ⌊(p_i − δ) · (q − 1)⌋, guess z_i ∈ {0, 1}^(F_q) with exactly n_i ones, and guess T-bit strings {w_{i,a}}_{a∈F*_q}.

3. For every i ∈ [m] and a ∈ F∗q , check that (zi ) a = 1 implies that wi,a is a wit-
ness that query Qi ( x a ) is answered positively (recall that the query is to SAT);
otherwise, set the flag bit in the output to be 0 and halt.

4. For every a ∈ F∗q , compute y a = Âpost ( x a , (z1 ) a , (z2 ) a , . . . , (zm ) a ).

5. Run the algorithm from Lemma 3.12 on the q − 1 pairs (a, y_a) with degree bound u = h · d · r to obtain a polynomial τ : F_q → F_q of degree at most u. Set the flag bit in the output to be 1. If no such τ exists, set the value bit in the output to be 0, and otherwise set the value bit to be the b-th bit of τ(0).
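Step 5 invokes Lemma 3.12 to recover a low-degree polynomial from evaluation pairs that are mostly (but not entirely) correct. As a toy stand-in for that decoding step (the efficient decoder of Lemma 3.12 is not reproduced here; this sketch simply brute-forces all small subsets, which suffices when far more than half of the pairs are correct):

```python
from itertools import combinations

def lagrange_eval(pts, x, p):
    # Evaluate at x (mod prime p) the unique polynomial of degree < len(pts)
    # passing through the pairs in pts.
    total = 0
    for i, (xi, yi) in enumerate(pts):
        num = den = 1
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

def decode(pairs, deg, p):
    # Interpolate every (deg+1)-subset of the pairs and keep the candidate
    # agreeing with the most pairs; if well over half of the pairs lie on a
    # single degree-deg polynomial, that polynomial is the unique winner.
    best = max(
        (sum(1 for a, y in pairs if lagrange_eval(sub, a, p) == y), sub)
        for sub in combinations(pairs, deg + 1)
    )
    return lambda x: lagrange_eval(best[1], x, p)

# Example over F_97: the "received word" agrees with 3x^2 + 5x + 7 on 13 of
# 15 pairs; decoding recovers the polynomial, and in particular its value at 0.
p = 97
pairs = [(a, (3 * a * a + 5 * a + 7) % p) for a in range(1, 16)]
pairs[2] = (pairs[2][0], (pairs[2][1] + 1) % p)    # corrupt two pairs
pairs[9] = (pairs[9][0], (pairs[9][1] + 40) % p)
tau = decode(pairs, deg=2, p=p)
```

An efficient decoder (as Lemma 3.12 provides) avoids the exponential subset search; the brute force above is only for illustration of the guarantee being used.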

Analysis of the program B0 . We need the following claim.


Claim 5.15. For every (x, b) ∈ F_q^d × [log q], with probability more than 1 − 2^(−n)/(2·log q) over the choice of v1, . . . , vr, the following two conditions hold:

1. For all guesses z_i and w_{i,a} such that the flag bit of the output is set to 1, the value bit of the output is f_bool(x, b).

2. There exist guesses zi and wi,a such that the flag bit of the output is set to 1.
33 By the discussion after Definition 5.12, we have f(u) = f_bool(I(u), 1). Hence B can be used to compute f as well, with a minor overhead in the running time.
34 Note that each p_i can be described by a single integer between 0 and q^d. Hence these m reals take m · O(d · log q) = O(m · n) bits to store.

Proof. Fix x ∈ F_q^d. We apply Lemma 5.14 to show that over a random choice of points v1, . . . , vr ∈ F_q^d,

c(x, v1, . . . , vr) is one-to-one and, for all i ∈ [m], |µ_{c(x,v1,...,vr)(F*_q)}(A_i) − µ_{F_q^d}(A_i)| < δ   (1)

holds with probability at least

1 − 8m · (2r/((q − 1) · δ^2) + 1/q^(d−2))^(r/2) .
We need to show that the above is lower bounded by 1 − 2^(−n)/(2·log q). First, by our choices of q, h, d, we have

1/q^(d−2) = 1/q^(⌈n/log h⌉+1) ≤ (1/h^(⌈n/log h⌉)) · (1/q) ≤ 2^(−n)/(4·log q) .

Next, note that 2r/((q − 1) · δ^2) ≤ 4 · 9^2 · r · m^2/q < 1/2 by our choice of q (note that q > h > c0 · r^2 · (kn)^4 > c0 · r · m^2, and c0 is sufficiently large). We then have

8m · (2r/((q − 1) · δ^2))^(r/2) < 8m · 2^(−r/2) ≤ 2^(−n)/(4·log q) ,

where the last inequality follows from r = c0 · n for a sufficiently large c0 ∈ N, and m ≤ log q · k ≤ log q · 2^(en/100). Putting everything together, we have that (1) holds with probability more than 1 − 2^(−n)/(2·log q).
Next we show that whenever (1) holds, the two items in the claim hold. For the second item, note that since (1) holds, for every i ∈ [m] we know that A_i(x_a) = 1 for at least n_i distinct elements a ∈ F*_q. Therefore, if for every i ∈ [m] the guess z_i is the string with exactly n_i ones, in entries indexed by those a with A_i(x_a) = 1, then there are witnesses {w_{i,a}}_{a∈F*_q} that pass the check together with all the z_i.
For the first item, for all guesses z_i and w_{i,a} such that the flag bit is 1, we know that for all i ∈ [m] and a ∈ F*_q, (z_i)_a = 1 implies A_i(x_a) = 1, and there are exactly n_i ones in z_i. On the other hand, by (1), for every i ∈ [m], the number of a ∈ F*_q such that A_i(x_a) = 1 is at most ⌈(p_i + δ) · (q − 1)⌉. Hence, we can bound the number of errors associated with query i as follows:

|{a ∈ F*_q : A_i(x_a) ≠ (z_i)_a}| ≤ ⌈(p_i + δ) · (q − 1)⌉ − ⌊(p_i − δ) · (q − 1)⌋ ≤ 2δq.

The above in particular means that for all but at most 2δq · m many a ∈ F*_q, we have (z_i)_a = A_i(x_a) for all i ∈ [m], and consequently y_a = f̂(x_a). Letting p be the restriction f̂ ∘ c(x, v1, . . . , vr), it follows from the discussion above that for at least (q − 1) − 2δqm = (1 − 2δm) · q − 1 = (7/9) · q − 1 of the pairs (a, y_a) we have y_a = p(a). Note that the degree of p is at most h · d · r (since f̂ is a d-variate polynomial of individual degree at most h − 1, and we compose it with a curve of degree r), and q ≥ q̃/2 ≥ (c0/2) · h · d · r. Since c0 is a sufficiently large constant, we can apply Lemma 3.12 to compute p(0) = f̂(x), and output the b-th bit of f̂(x) as desired, which completes the proof. 
Finally, let I : {0, 1}^n → F_q^d be the injective function associated with the low-degree extension f̂. Let S = {(x, b) : x ∈ I({0, 1}^n), b ∈ [log q]}. Since |S| ≤ 2^n · log q, by a union bound, there exist fixed v̂1, . . . , v̂r such that B0 with randomness fixed to v̂1, . . . , v̂r correctly computes f_bool on all inputs from S. We then define the single-valued nondeterministic program B by fixing the randomness of B0 to v̂1, . . . , v̂r.

Verifying the complexity of B. We have already established that B computes f_bool.
It remains to verify the running time and advice complexity of B. First, B takes the
following advice:

1. Advice for Â, of length α.

2. Description of the reals p1 , . . . , pm ∈ [0, 1], of total length O(m · n).

3. Fixed randomness v̂1, . . . , v̂r, which takes d · log q · r ≤ O(n^2) bits.

Hence, the number of advice bits of B is α + O(mn) + O(n^2) ≤ α + O(n^2 · k).


Finally, the running time of B is dominated by the simulation of (q − 1) calls to Â, the final decoding algorithm, and the time to compute the curve c(x, v̂1, . . . , v̂r) by direct interpolation (which takes poly(r, q) = poly(q) time). The total running time of B can thus be bounded by

O(q · T) + poly(q) ≤ T · poly(q) ,

and the final bound follows since q ≤ n^(8/e) · k^4.

5.3.2 An inner PRG relying on weaker hypotheses


We will use the following version of the Nisan-Wigderson [NW94] generator, when it
is combined with the locally decodable code of Sudan, Trevisan, and Vadhan [STV01]
and with the weak designs of Raz, Reingold, and Vadhan [RRV02]. This version of the
generator is instantiated for a sufficiently small output length, and we carefully bound
its running time, the running time of the reconstruction argument, and the number of
advice bits that the reconstruction argument needs.

Theorem 5.16 (NW generator for small output length). There exists a universal constant cNW > 1 such that for all eNW > 0 and µNW > cNW · eNW there exist two algorithms that for any N ∈ N and f ∈ {0, 1}^N satisfy the following:

1. (Generator.) When given input 1^N and oracle access to f, the generator G runs in time N^(1+2·cNW·µNW) and outputs a set of strings in {0, 1}^(N^eNW).

2. (Reconstruction.) For any (1/N^eNW)-distinguisher D : {0, 1}^(N^eNW) → {0, 1} for G(1^N)^f there exists a string adv of length N^(1−µNW) such that the following holds. When the reconstruction R gets input x ∈ [|f|] and oracle access to D and non-uniform advice adv, it runs in time N^(cNW·√eNW), makes non-adaptive queries to D, and outputs f_x.

Proof Sketch. The proof is the standard analysis of the NW generator as in [STV01;
RRV02]; specifically, we follow the proof in [CT21a, Appendix A.2] and explain the
necessary changes. (These changes are needed since in the current statement we de-
couple µNW and eNW , instead of choosing a fixed value for µNW .)
Denote e = eNW and µ = µNW and M = N^e. Instead of using designs with log(ρ) = (1 − 3β) · ℓ we use designs with log(ρ) = (1 − 2µ) · ℓ. Then, the seed length of the generator becomes (1 + O(β + µ)) · log(N) = (1 + O(µ)) · log(N). In the reconstruction, the oracle machine P computing a corrupted version of f runs in time poly(M) = N^(O(e)) and uses M · N^((1+β)·(1−2µ)) = N^(1+2β−2µ) bits of non-uniform advice. Thus, the final reconstruction uses N^(1−µ) bits of advice and runs in time N^(O(√e)).
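For intuition about the object being analyzed, here is a minimal sketch of the NW generator's combinatorial core, with a trivial "lines" design standing in for the weak designs of [RRV02] and parity standing in for a genuinely hard function (parity is of course not hard; it only fixes the interface):

```python
def line_design(l):
    # Design sets S_{a,b} = {graph of x -> a*x + b (mod l)} inside a universe
    # of size l*l; two distinct lines meet in at most one point, so any two
    # sets intersect in at most one element.
    return [frozenset(x * l + (a * x + b) % l for x in range(l))
            for a in range(l) for b in range(l)]

def nw_outputs(f, seed, l):
    # The NW construction: one output bit per design set, obtained by
    # restricting the seed to that set and applying f to the restriction.
    return [f(tuple(seed[i] for i in sorted(S))) for S in line_design(l)]

# Placeholder function (parity is NOT hard; it only fixes the interface).
parity = lambda bits: sum(bits) % 2
out = nw_outputs(parity, seed=[1, 0, 1, 1, 0, 0, 1, 0, 1], l=3)
```

The point of the design property is that any two output bits share at most one seed coordinate, which is what enables the hybrid-argument reconstruction analyzed above.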

We now use the NW generator above along with Theorem 5.11 to replace the "inner" PRG from Proposition 5.6 with a PRG that relies only on lower bounds against NTIME machines with advice, rather than against MAM protocols with advice.
Proposition 5.17 (the "inner PRG" – derandomizing AM protocols with few random coins). For every e > 0 there exist δ, η > 0 such that the following holds. Fix k ≥ 1, and assume that there exists an NTIME[N^(k+e/3)]-constructive property L useful against NTIME[2^((1−δ)·k·n)]/2^((1−δ)·n). Then,

prAMTIME[n^k, n^η] ⊆ prNTIME[n^(1+(1+e)·k)] .

Proof. Let η > 0 be a sufficiently small constant to be determined later, let Π = (Y, N) ∈ prAMTIME[n^k, n^η], and let V be an n^k-time verifier for Π that uses at most n^η random coins. Given x ∈ {0, 1}^n, the deterministic verifier guesses a witness w ∈ {0, 1}^(n^k) and f_n ∈ {0, 1}^(n^(1+e/7)) and checks that f_n ∈ L. It then uses the algorithm from Theorem 5.11 with parameter e/7 and with a bound of Q = n^((1+e/3)·cNW·√η) on the number of queries to obtain a truth-table g_n ∈ {0, 1}^(n^(1+e/3)).35
The verifier uses G from Theorem 5.16 with parameters eNW = η and N = n^(1+e/3) and µNW = C · √η for a sufficiently large universal constant C > 0, giving G oracle access to g_n. Denoting the output strings of G^(g_n)(1^N) by s_1, . . . , s_L, the verifier outputs MAJ_i{V(x, s_i, w)}. Note that the running time of G and its number of output strings L are bounded by n^((1+e/3)·(1+O(√η))) < n^(1+e), assuming that η > 0 is sufficiently small. Thus, the deterministic verifier runs in time n^(1+(1+e)·k).

The reconstruction argument. For any x ∈ {0, 1}^n, let D_x : {0, 1}^(n^η) → {0, 1} be the function

D_x(r) = 1 ⇐⇒ ∃w : V(x, r, w) = 1 .

Assume towards a contradiction that for infinitely many pairs (x, f_n) such that x ∈ N and f_n ∈ {0, 1}^(|x|^(1+e/7)) it holds that Pr_{i∈[L]}[D_x(s_i) = 1] ≥ 1/2, where the s_i's are the result of running G with oracle access to g_n and with the parameters above.

For every such pair, there is an advice string adv of length |g_n|^(1−C·√η) such that the reconstruction R from Theorem 5.16 computes the function whose truth-table is g_n in time |g_n|^(cNW·√η) when given adv as non-uniform advice and oracle access to D_x. By adding x to the advice (we can assume that |x| < |adv| by choosing η to be sufficiently small), this is an algorithm that runs in time |g_n|^(cNW·√η), uses |g_n|^(1−C·√η) bits of advice, and makes Q non-adaptive oracle queries to NTIME[n^k].36 Using an efficient reduction of NTIME[n^k] to SAT (see [Tou01; FLM+05]), we obtain an algorithm that runs in time T′ = Õ(n^k), uses α′ = |g_n|^(1−C·√η) bits of advice, and makes Q non-adaptive queries to a SAT oracle.
The algorithm R above yields a non-adaptive non-uniform SAT-oracle program with running time T′, advice complexity α′, and query complexity Q, in the sense of Definition 5.10. By Theorem 5.11, it holds that g_n can be computed by a nondeterministic non-uniform program with running time T = T′ · γ and advice complexity α = α′ · γ, where γ = (Q · log(|f_n|)^(1/e))^(cSU) = n^(cSU·cNW·(1+e/3)·√η) = |g_n|^(cSU·cNW·√η) and cSU is the constant from Theorem 5.11 that depends on e > 0.
35 The truth-table that Theorem 5.11 outputs is of length n^((1+e/7)^2), and by padding it with zeroes we can assume that it is of length n^(1+e/3).
36 In more detail, queries to the oracle are of the form (x, r) ∈ {0, 1}^n × {0, 1}^(n^η), and the oracle answers "yes" iff there exists w such that V(x, r, w) = 1.

By the straightforward simulation of the infinite sequence of non-uniform programs by a Turing machine with advice,37 there exists a nondeterministic machine M that runs in time Õ(T) with an advice sequence of length α, and there are infinitely many g_n ∈ L such that on input length log(|g_n|), when M is given the appropriate advice, it computes g_n. Plugging in the parameters, we have that

Õ(T) = Õ(n^k · |g_n|^(cSU·cNW·√η)) = Õ(|g_n|^(k/(1+e/3)+cSU·cNW·√η)) < |g_n|^(k/(1+e/4))

α = |g_n|^(1−C·√η) · |g_n|^(cSU·cNW·√η) ≤ |g_n|^(1−√η) ,

where the upper bound on T follows by choosing a sufficiently small η = η(e) > 0. This contradicts the usefulness of L if we choose δ > 0 to be sufficiently small.

By composing the "outer" PRG from Proposition 5.3 with the PRG from Proposition 5.17, in the exact same way as in Section 5.2.2, we obtain the following stronger version of Theorem 1.2:

Theorem 5.18 (stronger version of Theorem 1.2). For every e > 0 there exists δ > 0 such that the following holds. Assume that:

1. There exists an NTIME[N^(k+e/3)]-constructive property useful against MAM[2^((1−δ)·n)]/2^((1−δ)·n).

2. For every k ≥ 1 there exists an NTIME[N^(k+e/3)]-constructive property useful against NTIME[2^((1−δ)·k·n)]/2^((1−δ)·n).

Then, for every constant c ∈ N it holds that

prAMTIME^[c][T] ⊆ prNTIME[n · T^(⌈c/2⌉+e)] .

5.4 Uniform trade-offs for AM ∩ coAM

In this section we prove Theorem 1.3. The main claim in the proof is the following, which shows that under the hardness assumption of Theorem 1.3, one can reduce the randomness complexity of any (AM ∩ coAM)TIME[T] protocol to roughly log T(n). That is:

Proposition 5.19 (radically reducing the number of random coins of AM ∩ coAM). For every e > 0 there exists δ > 0 such that the following holds. Assume that there exists L ∉ i.o.(MA ∩ coMA)TIME^[7][2^((1−δ)·n)] such that truth-tables of L of length N = 2^n can be recognized in nondeterministic time N^(1+e/3). Then, for every time-computable T it holds that

(AM ∩ coAM)TIME[T] ⊆ (AM ∩ coAM)TIME[T^(1+e), (1 + e) · log(T(n))].

Note that Theorem 1.3 follows immediately from Proposition 5.19 by enumerating all possible random choices; that is, the deterministic verifier asks the prover to send its responses to all possible T^(1+e) random challenges, checks the responses for consistency, and computes the probability that the original random verifier would have accepted.
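In spirit, the enumeration step is elementary (this sketch ignores the interaction, which the actual transformation handles by having the prover answer all challenges at once; the toy predicate below is purely illustrative):

```python
from itertools import product

def acceptance_probability(accepts, coin_bits):
    # With only coin_bits = (1+e)*log T random coins, all 2^coin_bits
    # outcomes can be enumerated and the acceptance probability computed
    # exactly, at a multiplicative cost of T^(1+e) over a single simulation.
    coins = list(product((0, 1), repeat=coin_bits))
    return sum(map(accepts, coins)) / len(coins)

# Toy predicate standing in for the residual deterministic check.
prob = acceptance_probability(lambda r: r[0] == r[1], coin_bits=4)
```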

Proof of Proposition 5.19. Fixing any L ∈ (AM ∩ coAM)TIME[T], we prove that L ∈ AMTIME[T^(1+e), (1 + e) · log T(n)]; since the same argument can also be applied to the complement of L, it follows that L ∈ coAMTIME[T^(1+e), (1 + e) · log T(n)].
37 That is, the machine gets as advice the description of the program and simulates it.

By definition, there are two T-time verifiers V1(x, y, z) and V0(x, y, z) with |y| = |z| = T(|x|), such that the following holds for every x ∈ {0, 1}^n and σ = L(x):

Pr_{y∼u_{T(n)}}[∃z s.t. V_σ(x, y, z) = 1] = 1 ,

Pr_{y∼u_{T(n)}}[∃z s.t. V_{1−σ}(x, y, z) = 1] ≤ 1/3 .

Let δ0 be the corresponding constant from Proposition 5.2 when setting e0 = e, and
without loss of generality assume that δ0 < e0 .

Construction of the new verifier V′. Given input x ∈ {0, 1}^n, let N = T(n) and let ℓ = (1 + e/3 − δ0/10) · log N. V′ guesses f_ℓ ∈ {0, 1}^(N^(1+e/3−δ0/10)) and verifies that f_ℓ ∈ tt(L) by guessing the witness for the nondeterministic algorithm that recognizes tt(L) (otherwise V′ rejects). V′ then computes Enc(f_ℓ) using the encoder in Theorem 3.11 with parameters m = |f_ℓ| and η > 0 that is a sufficiently small constant.
Next, we consider the following pair language

Lpair = {(1^ℓ, Enc(f_ℓ)) : f_ℓ is the truth-table of L on ℓ-bit inputs}.

Note that Lpair has stretch K(ℓ) = |Enc(f_ℓ)| = Õ(2^ℓ), and is decidable in NTIME[T̃(ℓ)] for some T̃(ℓ) = Õ(2^ℓ). Let M be a T̃(ℓ)-time nondeterministic machine such that (x, y) ∈ Lpair if and only if there exists w ∈ {0, 1}^(T̃(|x|)) such that M((x, y), w) = 1.
We apply Theorem 3.16 to Lpair to obtain a poly(ℓ)-time verifier Vpair and an Õ(2^ℓ)-time algorithm Apair. By Theorem 3.16, for some r(ℓ) ≤ ℓ + O(log ℓ) and for every sufficiently large ℓ ∈ N, the following hold:

1. For every w ∈ {0, 1}^(T̃(ℓ)) such that M((1^ℓ, Enc(f_ℓ)), w) = 1, Apair(1^ℓ, Enc(f_ℓ), w) outputs a proof π ∈ {0, 1}^(2^r(ℓ)) such that

Pr[Vpair^(Enc(f_ℓ), π)(1^ℓ, u_r) = 1] = 1.

2. For every y ∈ {0, 1}^(K(ℓ)) that has Hamming distance at least K(ℓ)/20 from Enc(f_ℓ), and for every π ∈ {0, 1}^(2^r), it holds that

Pr[Vpair^(y, π)(1^ℓ, u_r) = 1] ≤ 1/3.

Then, V′ guesses w ∈ {0, 1}^(T̃(ℓ)) such that M((1^ℓ, Enc(f_ℓ)), w) = 1 (V′ rejects immediately if this does not hold), and computes π_ℓ^w = Apair(1^ℓ, Enc(f_ℓ), w), which has length 2^r(ℓ). Now we define

Λ_ℓ^w = Enc(f_ℓ) ◦ π_ℓ^w ◦ 0^(N^(1+e/3) − |Enc(f_ℓ)| − |π_ℓ^w|) .

Note that |Λ_ℓ^w| = N^(1+e/3). Consider the generator G from Proposition 5.2 with e0 = e, input 1^N, and oracle access to Λ_ℓ^w, and denote its list of outputs by s_1^w, . . . , s_{N^(1+e)}^w ∈ {0, 1}^N. The new verifier V′ chooses a random i ∈ [N^(1+e)] and simulates V1 at input x with random coins s_i^w. Note that this verifier indeed runs in time O(N^(1+e)). For notational convenience, we denote the set of all accepted w by

W = {w : w ∈ {0, 1}^(T̃(ℓ)) ∧ M((1^ℓ, Enc(f_ℓ)), w) = 1}.

We note that |W| ≥ 1 since M decides Lpair.

The reconstruction argument. Assume towards a contradiction that V′ fails to solve L. Since V1 has perfect completeness and |W| ≥ 1, V′ can only make mistakes when x ∉ L. In particular, there exist x ∈ {0, 1}^n and w* ∈ W such that

x ∉ L ∧ Pr_{i∈[N^(1+e)]}[∃ω : V1(x, s_i^{w*}, ω) = 1] > 1/2 . (5.2)

Denote by D_x : {0, 1}^N → {0, 1} the function D_x(z) = 1 ⇐⇒ ∃ω : V1(x, z, ω) = 1. From now on, we will use Λ_ℓ to denote Λ_ℓ^{w*} for simplicity.

We first design an MATIME^[7] protocol Π1 for f_ℓ. Let τ = c0 · log N for a big enough constant c0 > 1. Given an input µ ∈ [|f_ℓ|], our protocol acts as follows. (See Figure 1 for a visual diagram of the protocol.)

1. The prover sends a (supposedly bad) input x ∈ {0, 1}^n. From now on, we consider the reconstruction algorithm R from Proposition 5.2 with oracle access to D_x. Let t̄ = N^(e0/10) be the number of parallel queries R makes, and let s, α ∈ N be the parameters from Proposition 5.2.

2. The verifier draws h ∼ h (the distribution h is defined in Proposition 5.2) and τ queries z_1, z_2, . . . , z_τ ∈ {0, 1}^N uniformly at random, and sends them to the prover. Note that h ∈ {0, 1}^(|f_ℓ|^(1−2δ0)).
3. The prover sends an advice string adv ∈ {0, 1}^(|f_ℓ|^(1−2δ0)), together with witnesses ω_1, ω_2, . . . , ω_τ ∈ {0, 1}^N.
4. The verifier then performs the following:

(a) It rejects immediately if

Pr [V0 ( x, zi , ωi ) = 1] ≤ 1/2.
i ∈[τ ]

(b) Otherwise, it draws α_1, α_2, . . . , α_τ ∈ {0, 1}^(r(ℓ)), and then simulates Vpair(1^ℓ, α_i) for each i ∈ [τ] to obtain a list of polylog(ℓ) queries to the input oracle (which it hopes will be Enc(f_ℓ)) and to the proof oracle (which it hopes will be π). We can view those queries as queries q ∈ [N^(1+e/3)] to some values of Λ_ℓ.38

(c) The verifier also simulates the local decoder for Enc with input µ, which issues |f_ℓ|^η many queries β_1, β_2, . . . , β_{|f_ℓ|^η} ∈ [|Enc(f_ℓ)|]. Again, we view all these |f_ℓ|^η queries as queries q ∈ [N^(1+e/3)] to some values of Λ_ℓ.

(d) To summarize, in this turn the verifier obtained Nq = O(|f_ℓ|^η) many queries q_1, . . . , q_{Nq} ∈ [N^(1+e/3)] and sent them to the prover.
5. For every i ∈ [Nq], the prover sends a witness w_i ∈ {0, 1}^(|f_ℓ|^(1−δ0)), which is supposed to be the witness for the execution of R on q_i.

6. For every i ∈ [Nq], the verifier simulates R on input q_i with witness w_i with fresh random coins γ_i, obtains queries z_{i,1}, . . . , z_{i,t̄} ∈ {0, 1}^N, and sends them to the prover.
38 More formally, querying the i-th bit of the input oracle corresponding to Enc(f_ℓ) translates to querying (Λ_ℓ)_i, and querying the i-th bit of the proof oracle corresponding to π translates to querying (Λ_ℓ)_{i+|Enc(f_ℓ)|}.

P → V: x (for constructing a distinguisher D_x)

V → P: h, and queries z_1, . . . , z_τ for verifying Pr_r[D_x(r) = 1] ≤ 1/2

P → V: adv (after seeing h), and witnesses ω_1, . . . , ω_τ for V0 on z_1, . . . , z_τ

V → P: queries q_1, . . . , q_{Nq} issued by the local decoder and by Vpair

P → V: witnesses w_1, . . . , w_{Nq} for executing R on the q_i's

V → P: queries z_{i,j} (i ∈ [Nq], j ∈ [t̄]) for executing R on each q_i with w_i

P → V: witnesses ω_{i,j} for D_x on each query z_{i,j}

Figure 1: An illustration of the MATIME^[7] protocol for f_ℓ in the reconstruction argument. The verifier V sends the messages marked V → P and the prover P sends those marked P → V; observe that the prover speaks first.

7. For every i ∈ [ Nq ] and j ∈ [t̄], the prover sends a witness ωi,j ∈ {0, 1} N , which is
supposed to be the witness for Dx (zi,j ).

8. (Deterministic step.) Finally, the verifier performs the following verifications:

(a) For every i ∈ [Nq], it constructs the sequence ρ_i ∈ {0, 1}^t̄ such that (ρ_i)_j = V1(x, z_{i,j}, ω_{i,j}) for every j ∈ [t̄]. If ρ_i is (s, α)-deficient for any i ∈ [Nq], then it immediately rejects.

(b) Otherwise, for every i ∈ [Nq], let d_i ∈ {0, 1}^t̄ be such that (d_i)_j = D_x(z_{i,j}) for every j ∈ [t̄]. We know that ρ_i is (s, α)-indicative of d_i.39 The verifier then finishes the executions of R on all the q_i's, and rejects immediately if it gets ⊥ from any of these executions. Next, the verifier uses the obtained values to finish the simulation of Vpair(1^ℓ, α_i) for every i ∈ [τ], and rejects immediately if Vpair(1^ℓ, α_i) = 0 for any i. Finally, the verifier finishes the simulation of the local decoder for Enc to obtain an output ω ∈ {0, 1}, and accepts iff ω = 1.

Completeness. For every µ ∈ [|f_ℓ|] such that (f_ℓ)_µ = 1, we will show that there exists a prover strategy in the protocol Π1 such that the verifier accepts with high probability. In Step (1) the prover sends the ("bad") input x ∈ {0, 1}^n such that (5.2) holds. Since x ∉ L, we have that Pr[D_x(u_N) = 1] ≤ 1/3 and that Pr_{z∼u_N}[∃ω V0(x, z, ω) = 1] = 1.
39 This holds since for every j ∈ [t̄] (ρi ) j = 1 implies (di ) j = 1 by the definition of ρi and Dx .

Therefore, in Step (3), the prover can always send ω1 , . . . , ωτ so that the verifier does
not reject in Step (4a) in Π1 .
By the completeness case of the "Honest oracle" part of Proposition 5.2, and since Eq. (5.2) holds, with probability 1 − 1/N over h ∼ h, there exists adv ∈ {0, 1}^(|f_ℓ|^(1−2δ0)) such that by sending adv to the verifier in Step (3), the following holds: Given correct witnesses in Step (5), with probability at least 1 − O(Nq)/N, for all of the verifier's O(Nq) simulations of R, given correct witnesses in Step (7), the verifier obtains the correct values in Λ_ℓ. In particular, this means that Vpair(1^ℓ, α_i) = 1 for all i ∈ [τ], and the local decoder returns the correct value (f_ℓ)_µ = 1. Putting the above together, the verifier accepts with probability at least 2/3.

Soundness. Let µ ∈ [|f_ℓ|] be such that (f_ℓ)_µ = 0. We first note that if the prover sends x ∈ L in Step (1), then we have Pr_{z∼u_N}[∃ω V0(x, z, ω) = 1] ≤ 1/3, meaning that with probability at least 1 − 1/N, the verifier rejects in Step (4a), no matter what witnesses ω_1, . . . , ω_τ it receives in Step (3). Therefore, we can assume that x ∉ L in Step (1) and that the verifier does not reject immediately in Step (4a). In particular, since x ∉ L, we have that Pr_{z∼u_N}[D_x(z) = 1] ≤ 1/3.
Now, by the "Dishonest oracles" case of Proposition 5.2, with probability 1 − 1/N over h ∼ h, for every possible adv sent by the prover in Step (3) there exists g ∈ {0, 1}^(N^(1+e/3)) such that the following holds for every i ∈ [Nq]: for every possible w_i sent by the prover in Step (5), with probability at least 1 − 1/N over the verifier's randomness γ_i drawn in Step (6), either the verifier rejects in Step (8a), or the simulated R(q_i, w_i) in Step (8b) given advice (h, adv) outputs either g(q_i) or ⊥ (this holds since ρ_i in Step (8a) is either (s, α)-deficient, in which case the verifier rejects, or (s, α)-indicative of d_i, in which case R(q_i, w_i) outputs either g(q_i) or ⊥).
By the above discussion and a union bound, with probability at least 1 − O(Nq)/N, at Step (8b) either the verifier rejects (meaning that some R(q_i, w_i) outputs ⊥), or all the simulated R(q_i, w_i) output g(q_i). We denote this event by E.
From now on we condition on the event E and assume that the verifier does not reject in Step (8a). Let y be the first |Enc(f_ℓ)| bits of g, and let π be the next |π_ℓ| bits of g. Note that g is already determined by the end of Step (3), and hence the verifier's randomness α_1, . . . , α_τ in Step (4) is independent of g. Hence, if y has Hamming distance at least |Enc(f_ℓ)|/20 from Enc(f_ℓ), then with probability at least 1 − 1/N, Vpair^(y,π)(1^ℓ, α_i) = 0 for at least one i ∈ [τ], and the verifier rejects in Step (8b).40 Hence, we can assume that y has Hamming distance at most |Enc(f_ℓ)|/20 from Enc(f_ℓ). By Theorem 3.11, the local decoder returns the correct value (f_ℓ)_µ = 0 with probability at least 1 − 1/N, and the verifier rejects at the end. Putting everything together, the verifier rejects with probability at least 2/3.

Protocol Π0 for the complement of f_ℓ. Finally, we modify Π1 to obtain another MATIME^[7] protocol Π0. The only difference between Π0 and Π1 is that at Step (8b) of Π0, the verifier accepts if and only if ω = 0 instead of ω = 1. The completeness and soundness of Π0 for computing the complement of f_ℓ follow from the same proof as that for Π1.
40 Here we crucially used the fact that g (and thus y and π) is fixed before the verifier draws the αi ’s.

Running time of the protocols Π1 and Π0. Finally, we bound the running time of the verifier in Π1 (and hence also in Π0), as follows:

Õ(|Λ_ℓ|^(1−2δ0))   [Step (2)]
+ Õ(N) + polylog(N) + |f_ℓ|^η   [Steps (4a)+(4b)+(4c)]
+ O(Nq · |Λ_ℓ|^(1−δ0))   [Step (6)]
+ O(Nq · N · t̄)   [Step (8a)]
+ O(Nq · |Λ_ℓ|^(1−δ0) + polylog(N) + |f_ℓ|^η)   [Step (8b)]
≤ Nq · O(N · t̄ + |Λ_ℓ|^(1−δ0))
= |f_ℓ|^η · O(N^(1+e0/10) + |Λ_ℓ|^(1−δ0)) ,

and this can be made smaller than |Λ_ℓ|^(1−δ) by choosing η and δ to be sufficiently small. This contradicts the hardness of f_ℓ.

6 Optimality under #NSETH


In this section we prove that the derandomization conclusions in Theorems 1.1 and 1.2 are essentially optimal, under the assumption #NSETH. First we lower bound the derandomization overhead of protocols in which the prover speaks first (i.e., of MA and MATIME^[c]), as follows:

Theorem 6.1 (a lower bound on derandomization of MATIME^[c], under #NSETH). Suppose that #NSETH holds. Then, for every integer c ≥ 2, real number d ≥ 1, and e ∈ (0, 1), letting T(n) = n^d, it holds that

MATIME^[c][T] ⊄ NTIME[T^(⌊c/2⌋+1−e)] .
Proof. Without loss of generality we can assume e ∈ (0, 0.01). For the sake of contradiction, we assume that

MATIME^[c][T] ⊆ NTIME[T^(⌊c/2⌋+1−e)] .
We instantiate Theorem 3.17 with k = ⌊c/2⌋ and δ = 1/(k+1), in which case γ = 1/(k+1). It follows that there is an MATIME^[2k][2^(n/(k+1)+o(n))] protocol Π that computes the number of satisfying assignments to a formula C of size 2^(o(n)) with n input bits.

We first define a decisional MATIME^[2k] protocol ΠD, such that ΠD(C, z) = 1 for an n-input formula C and an integer z ∈ {0, 1, . . . , 2^n} if the number of satisfying assignments to C is z. (Indeed, ΠD can be constructed by simply simulating Π on the input C and only accepting if Π accepts and the accepted output is z.)
Now we pad the input of the protocol ΠD to be of length N = 2^((1+τ)·n/((k+1)·d)) for a sufficiently small constant τ ∈ (0, 1) to be specified later. Then, the running time of the protocol ΠD is bounded by 2^(n/(k+1)+o(n)) ≤ 2^((1+τ)·n/(k+1)) = T(N). Hence, by our assumption, ΠD has a T(N)^(k+1−e) = 2^((1+τ)·n·(k+1−e)/(k+1))-time nondeterministic algorithm MD.

Setting τ = e/(4(k+1)), it holds that (1+τ)·n·(k+1−e)/(k+1) < (1−τ)·n. Now we can construct a nondeterministic algorithm refuting #NSETH as follows: given a formula C : {0, 1}^n → {0, 1} of size 2^(o(n)), guess z ∈ {0, 1, . . . , 2^n}, simulate MD on input (C, z), output z if MD accepts and ⊥ otherwise. By the above discussion, this is a nondeterministic algorithm that counts the number of solutions to n-bit formulas of size 2^(o(n)) in time 2^((1−τ)·n), a contradiction to #NSETH.
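For completeness, the inequality behind this choice of τ (left implicit above) can be verified directly:

```latex
\frac{(1+\tau)\cdot(k+1-e)}{k+1} < 1-\tau
\;\Longleftrightarrow\; (1+\tau)\cdot(k+1-e) < (1-\tau)\cdot(k+1)
\;\Longleftrightarrow\; \tau\cdot\bigl(2(k+1)-e\bigr) < e ,
```

and with τ = e/(4(k+1)) the left-hand side equals e · (2(k+1) − e)/(4(k+1)) < e/2 < e.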

By a more careful argument, we now lower bound the derandomization overhead of protocols in which the verifier speaks first (i.e., of AMTIME^[c]), as follows:

Theorem 6.2 (a lower bound on derandomization of AMTIME^[c], under #NSETH). Suppose that #NSETH holds. Then, for every integer c ≥ 2, real number d ≥ 1, and e ∈ (0, 1), letting T(n) = n^d, it holds that

AMTIME^[c][T] ⊄ NTIME[n · T^(⌈c/2⌉−e)] .

Proof. Without loss of generality we can assume e ∈ (0, 0.01). For the sake of contradiction, we assume that

AMTIME^[c][T] ⊆ NTIME[n · T^(⌈c/2⌉−e)] .

We instantiate Theorem 3.17 with k = ⌈c/2⌉, δ = 1/(dk+1), and γ = (1−δ)/k. Note that since d ≥ 1, we have δ ≤ γ. It follows that there is an MATIME^[2k][2^(γ·n+o(n))] protocol Π that computes the number of satisfying assignments to a formula C of size 2^(o(n)) with n input bits, and the first message of Π has length 2^(δ·n+o(n)).

As in the proof of Theorem 6.1, we first define a decisional MATIME^[2k] protocol ΠD, such that ΠD(C, z) = 1 for an n-input formula C and an integer z ∈ {0, 1, . . . , 2^n} if the number of satisfying assignments to C is z. We note that the prover messages of the honest prover in ΠD are identical to those in Π. Furthermore, by the moreover part of Theorem 3.17, the first message of ΠD is such that the acceptance probability of the subsequent protocol is either 1 or at most 1/3. (Indeed, ΠD inherits this property from Π.)
Let Π^D_{>1} be the sub-protocol of ΠD after the first message, and let ℓ(n) = 2^((1+τ)·δ·n), where τ ∈ (0, 1) is a small enough constant to be specified later. We define a new language L0 such that L0(x, π, 1^(ℓ(n)−|x|−|π|)) = 1 if Π^D_{>1} accepts the input/first-message pair (x, π) with probability 1, and L0(x, π, 1^(ℓ(n)−|x|−|π|)) = 0 if Π^D_{>1} accepts (x, π) with probability at most 1/3. Note that (by the discussion above) the problem L0 is indeed a language (i.e., a total function rather than a promise problem), and L0 can be decided by Π^D_{>1}, which is an AMTIME^[2k−1] protocol with running time 2^(γ·n+o(n)).
Note that Π^D_{>1} takes ℓ(n) bits as input, and that ℓ(n)^d = 2^((1+τ)·δ·d·n) > 2^(γ·n+o(n)) by the definition of δ and γ. Hence, the language L0 ∈ AMTIME^[c][T]. By our assumption, L0 ∈ NTIME[n · T^(k−e)].
Now we construct an algorithm that refutes #NSETH: Given an n-bit formula C of size 2^(o(n)), guess z ∈ {0, 1, . . . , 2^n}, guess a proof π of length 2^(δ·n+o(n)), output z if L0((C, z), π, 1^(ℓ(n)−|(C,z)|−|π|)) = 1, and output ⊥ otherwise. This nondeterministic algorithm indeed counts the number of satisfying assignments, and its running time is at most

ℓ(n) · ℓ(n)^(d·(k−e)) = ℓ(n)^(d·(k−e)+1) = 2^(((1+τ)·(d(k−e)+1)/(dk+1))·n) < 2^((1−Ω(1))·n) ,

where the last inequality follows by setting τ to be small enough. This is a contradic-
tion to #NSETH.
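For completeness, the two calculations left implicit above are the identity δ · d = γ (which is what makes T(ℓ(n)) match the running time of the sub-protocol) and the final exponent bound:

```latex
\delta \cdot d \;=\; \frac{d}{dk+1} \;=\; \frac{1-\frac{1}{dk+1}}{k} \;=\; \gamma ,
\qquad
\frac{(1+\tau)\cdot\bigl(d(k-e)+1\bigr)}{dk+1}
\;=\; (1+\tau)\cdot\Bigl(1-\frac{de}{dk+1}\Bigr)
\;<\; 1-\Omega(1) \quad \text{for } \tau < \frac{de}{2(dk+1)} .
```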

7 Deterministic doubly efficient argument systems


In this section we prove the results from Section 1.3 concerning derandomization of
doubly efficient proof systems; that is, we prove Theorems 1.7, 1.4 and 1.8.

Let us first set up some preliminaries. The derandomization algorithms in this
section will be non-black-box, and in particular will use the following construction of a
reconstructive targeted HSG from our very recent work [CT21a].

Theorem 7.1 (a reconstructive targeted HSG, see [CT21a, Proposition 6.2]). For every
α′, β′ > 0 and sufficiently small η = η_{α′,β′} > 0 the following holds. Let T̄, k : ℕ → ℕ
be time-computable functions such that T̄(N) ≥ N, and let g : {0, 1}^N → {0, 1}^k (where
k = k(N)) such that the mapping of (x, i) ∈ {0, 1}^N × [k] to g(x)_i is computable in time
T̄(N). Then, there exists a deterministic algorithm G_g and a probabilistic algorithm Rec that
for every z ∈ {0, 1}^N satisfy the following:

1. Generator. When G_g gets input z and η > 0, it runs in time k · T̄(N) + poly(k) and
outputs a list of poly(k) strings in {0, 1}^{k^η}.^{41}

2. Reconstruction. When Rec gets as input z and η > 0, and gets oracle access to
a function D_z : {0, 1}^{k^η} → {0, 1} that (1/k^η)-distinguishes the uniform distribution
over the output-list of G_g(z, η) from a uniform k^η-bit string, it runs in time Õ(k^{1+β′}) +
k^{β′} · T̄(N), makes Õ(k^{1+β′}) queries to D_z, and with probability at least 1 − 2^{−k^η} outputs
a string that agrees with g(z) on at least 1 − α′ of the bits.

The reconstruction algorithm above can be thought of as approximately printing the
string g(z) (i.e., printing a string that agrees with g(z) on at least 1 − α′ of the bits).
For convenience, we define the following corresponding notion of hardness, which is
failing to approximately print.

Definition 7.2 (failing to approximately print). We say that a probabilistic algorithm M
fails to approximately print a function f : {0, 1}^n → {0, 1}* with error α on a given string
x ∈ {0, 1}^n if Pr[M(x)_i = f(x)_i] ≤ 1 − α, where the probability is over i ∈ [|f(x)|] and over
the random coins of M. We will use shorthand notation and say that M fails to approximately
print f(x) with error α.
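As a concrete illustration of Definition 7.2, the following sketch empirically estimates the per-bit agreement probability; the algorithm `M` and the function `f` are hypothetical stand-ins for the objects in the definition.

```python
import random

def agreement(M, f, x, trials=1000, rng=random):
    """Estimate Pr[M(x)_i = f(x)_i], where the probability is over a
    uniform output position i and fresh random coins of M."""
    fx = f(x)
    hits = 0
    for _ in range(trials):
        out = M(x)                      # each call uses fresh coins of M
        i = rng.randrange(len(fx))      # uniform position i in [|f(x)|]
        hits += (out[i] == fx[i])
    return hits / trials

def fails_to_approximately_print(M, f, x, alpha, trials=1000):
    # M fails with error alpha if its per-bit agreement is at most 1 - alpha.
    return agreement(M, f, x, trials) <= 1 - alpha
```

For instance, an M that prints f(x) exactly has agreement 1 and does not fail for any α > 0.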

7.1 Warm-up: The case of an MA-style system


Towards presenting our result, we first present an appealing special case whose proof
is far less involved. Denote by deIP_{MA}^{[2]}[T] a doubly efficient proof system in which
the prover speaks first, sending a proof π, and then the verifier tosses random coins
and decides whether to accept or reject the input x with the proof π. Under suitable
hardness assumptions, we simulate deIP_{MA}^{[2]} by deterministic doubly efficient
argument systems, with essentially no time overhead.
Theorem 7.3 (derandomizing deIP_{MA}^{[2]} into deterministic doubly efficient argument
systems, with almost no overhead). Suppose that non-uniformly secure one-way functions
exist. Let T(n) be any polynomial, and assume that for every ε′ > 0 there exist α, β ∈ (0, 1)
and a function f = f^{(ε′)} mapping n + T(n) bits to k(n) = n^{ε′} bits such that:

1. There exists a non-deterministic unambiguous machine that gets input ((x, π), i) ∈
{0, 1}^{n+T} × [n^{1+ε′}] and outputs the ith bit of f(x, π) in time T̄ = T(n) · k.
41 The fact that the number of strings is poly( k ) is not mentioned in the original statement in [CT21a,

Proposition 6.2], but this is just an omission. This fact is established in the proof of the proposition and
the applications of the proposition rely on it.

2. For every probabilistic algorithm M running in time T̄ · n^β and every distribution P over
{0, 1}^{n+T} that is samplable in polynomial time, with probability at least 1 − n^{−ω(1)} over
(x, π) ∼ P it holds that M fails to approximately print f(x, π) with error α.

Then, for every ε > 0 it holds that deIP_{MA}^{[2]}[T] ⊆ deDARG[n^ε · T].
Proof. Let L ∈ deIP_{MA}^{[2]}[T], let V be a corresponding verifier for L, and let

Y_V = {(x, π) : Pr_r[V(x, π, r) = 1] ≥ 2/3}
N_V = {(x, π) : Pr_r[V(x, π, r) = 1] ≤ 1/3},

where in the expressions V(x, π, r) above, x is an input, π is a proof, and r is a random
string. Note that the promise problem (Y_V, N_V) can be decided in probabilistic
linear time.
Let ε′ = ε/c for a sufficiently large constant c > 1, and let k(n) = n^{ε′}. Let f = f^{(ε′)}
be the corresponding function from our hypothesis, and note that the upper bound in
the first item is T̄ = T · k, whereas the lower bound in the second item is T̄ · k^{1+β}.
First step: Reduce the number of random coins to n^μ. For a sufficiently small constant
μ = μ(ε′) > 0 that will be determined later, let G_crypto be the PRG from Theorem 3.15,
instantiated with stretch n^μ ↦ T(n). Consider the verifier V′ that uses only n^μ coins
and is defined by V′(x, π, s) = V(x, π, G_crypto(s)). Note that for every fixed x, π we
have that Pr_s[V′(x, π, s) = 1] ∈ Pr_r[V(x, π, r) = 1] ± n^{−ω(1)}. Hence, Y_{V′} = Y_V, where
Y_{V′} := {(x, π) : Pr_s[V′(x, π, s) = 1] ≥ .66}, and similarly N_{V′} = N_V, where N_{V′} :=
{(x, π) : Pr_s[V′(x, π, s) = 1] ≤ .33}. The running time of V′ is O(T^{1+μ}) and its number
of random coins is n^μ.
Main step: Targeted PRG using the transcript as a source of hardness. Now, let D be
the following deterministic verifier. On input x ∈ {0, 1}^n and proof π ∈ {0, 1}^T,
consider the generator from Theorem 7.1, instantiated with parameters

N = n + T ,   T̄(N) = T(n) · k ,
g = f^{(ε′)} : {0, 1}^N → {0, 1}^{n^{1+ε′}} ,   sufficiently small α′, β′, η ;

we now also fix the parameter μ of G_crypto above to be such that k^η = n^μ. The verifier
D runs G_g to obtain a set of k · T̄ + poly(k) ≤ T · poly(k) strings of length n^μ, denoted
s_1, . . . , s_{T·poly(k)}, and outputs MAJ_{i∈[T·poly(k)]}{V′(x, π, s_i)}. Assuming that the constant
c > 1 is sufficiently large, the running time of this algorithm is at most T · n^ε.
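The verifier D is just a majority vote over the generator's output list; a minimal sketch, where `V_prime` and `hsg_outputs` are hypothetical stand-ins for V′ and the output list of G_g on input (x, π):

```python
def derandomized_verifier(x, pi, V_prime, hsg_outputs):
    """Deterministically decide by taking the majority vote of the
    coin-reduced verifier V' over all pseudorandom strings produced
    by the targeted generator on input (x, pi)."""
    votes = sum(V_prime(x, pi, s) for s in hsg_outputs)
    return 1 if 2 * votes > len(hsg_outputs) else 0
```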
Analysis. The honest prover for D is identical to that of V. We now show an algorithm
F that runs in time T̄ · n^β, and for every fixed (x, π) ∈ Y_V such that D(x, π) = 0 it holds
that F(x, π) α-approximates f(x, π). By a symmetric argument (which is identical
and omitted), there exists another algorithm F′ with precisely the same guarantee for
any (x, π) ∈ N_V such that D(x, π) = 1. By our hypothesis, for every polynomial-time
samplable distribution P over {0, 1}^{n+T}, with probability at least 1 − n^{−ω(1)} over
the choice of (x, π) ∼ P, both algorithms F and F′ fail to α-approximate f(x, π). Hence,
the probability that a probabilistic polynomial-time algorithm can find (x, π) such that
D(x, π) errs is at most n^{−ω(1)}.
Thus, it is left to construct the algorithm F. Fix (x, π) ∈ Y_V such that D(x, π) =
0, and denote by D_{x,π} : {0, 1}^{k^η} → {0, 1} the function D_{x,π}(r) = V′(x, π, r). By our
assumption, D_{x,π} is a (1/10)-distinguisher for the uniform distribution over the output-set
of G_g. We invoke the reconstruction Rec from Theorem 7.1; the running time of the
algorithm, accounting for answering its Õ(k^{1+β′}) queries to D_{x,π}, is

Õ(k^{1+β′}) + k^{β′} · T̄(n) + Õ(k^{1+β′}) · T < Õ(T · k · k^{β′}) < T̄ · n^β ,

for a sufficiently small choice of β′; and with probability 1 − o(1) > 1 − α/2 the
reconstruction outputs a string that agrees with f(x, π) on at least 1 − α′ >
1 − α/2 of the bits, for a sufficiently small α′ > 0.

Note that the proof above works as-is even if the initial deIP_{MA}^{[2]} system has
imperfect completeness.

7.2 Basic case: Doubly efficient proof systems with few random coins
The main goal in this section is to state and prove Theorem 1.7, which asserts that
under strong hardness assumptions, we can simulate every deIP^{[c]} protocol by a
deterministic doubly efficient argument system, with essentially no time overhead.
In Section 7.2.1 we state Theorem 1.7 and discuss its hypothesis, and then in Section
7.2.2 we prove the result. In Section 7.2.3 we state and prove a strong version of
the result that holds for doubly efficient proof systems that have an efficient universal
prover, and in Section 7.2.4 we deduce Theorem 1.4 as a corollary of the latter.

7.2.1 The result statement and a discussion of the hypothesis


The following is a generic form of the hardness hypothesis that we will use in
Theorem 1.7. It is inspired by a hardness assumption from our previous work [CT21a],
but the current assumption is stronger because it refers to probabilistic oracle machines
(rather than just to probabilistic algorithms). In the assumption we fix a "complexity
parameter" n ∈ ℕ, and think of all other parameters as functions of n.

Assumption 7.4 (non-batch-computability assumption; Assumption 1.6, restated). For
N, K, K′, T̄ : ℕ → ℕ and η : ℕ → (0, 1), the (N ↦ K, K′, η)-non-batch-computability
assumption for time T̄ with oracle access to O is the following. There exists f : {0, 1}* →
{0, 1}* that for every n ∈ ℕ maps N(n) bits to K(n) bits and satisfies:

1. There exists a deterministic algorithm that gets input (z, i) ∈ {0, 1}^{N(n)} × [K(n)] and
outputs the ith bit of f(z) in time T̄(n).

2. For every oracle machine M^O running in time T̄ · K′ and having oracle access to O, and
every collection z = {z_{N(n)}}_{n∈ℕ} of distributions such that z_{N(n)} is over {0, 1}^{N(n)} and
can be sampled in time polynomial in T̄(n), and every sufficiently large n ∈ ℕ, with
probability at least 1 − T̄(n)^{−ω(1)} over the choice of z ∼ z_{N(n)} it holds that M^O fails to
approximately print f(z) with error η(n).

We will use Assumption 7.4 with T̄ that is a polynomial, in which case the error
bound T̄(n)^{−ω(1)} is just n^{−ω(1)}; that is, the error bound is any negligible function of
the input length. We are now ready to state Theorem 1.7.

Theorem 7.5 (derandomizing constant-round doubly efficient proof systems into
deterministic doubly efficient argument systems, with almost no overhead; Theorem 1.7,
restated). For every α, β ∈ (0, 1) there exists η > 0 such that the following holds. Let
c ∈ ℕ be a constant, let T(n) be a polynomial, let R(n) < T(n) be time-computable, and let
N(n) = n + c · T(n) and K(n) = R(n)^{1/η}. Assume that the (N ↦ K, K^β, α)-non-batch-computability
assumption holds for time T̄ = T · K with oracle access to prAMTIME_2^{[c]}[n]
on inputs of length O(T). Then,

deIP^{[c]}[T, R] ⊆ deDARG[T · R^{O(c/η)}] ,

where the O-notation hides a universal constant.
To discuss the hardness hypothesis, let us fix the parameter value R = T^{o(1)}, which
will be the value we use in the proofs of Theorems 1.4 and 1.8. Then, loosely speaking,
the assumption in Theorem 7.5 can be described as follows: There exists a function f
from T bits to K = T^{o(1)} bits such that –

1. Each output bit of f can be computed in time T̄ = T · K = T^{1+o(1)}.

2. The entire string f(x) cannot be (approximately) printed in probabilistic time
T̄ · K^β, while making oracle queries of length T to a prAMTIME^{[c]} protocol
that runs in time linear in T. (And this hardness holds with high probability over any
polynomial-time samplable distribution of T-bit strings x.)

Our proof allows relaxing the hypothesis further: For example, the oracle machine
against which we assume hardness only makes non-adaptive queries to its oracle, and
the number of queries is only poly(k). (We avoid including these relaxations in the
assumption statement for simplicity of presentation.)

The difference from known results about batch-computability. The complexity of


the oracle machine for which we assume hardness is reminiscent of the complexity of
unconditionally known interactive protocols for batch-computing functions. However,
there is a crucial difference between the two settings, which we now explain.
For any f : {0, 1}^T → {0, 1}^K whose individual output bits are computable in time
T̄, Reingold, Rothblum, and Rothblum [RRR18, Theorem 13] (following their previous
result in [RRR21]) constructed a constant-round protocol for batch verification of the K
output bits of f such that the verifier runs in time Õ(K · T) + K^β · T̄^{1+β} = O(K^β · T̄^{1+β}),
where β > 0 is an arbitrarily small constant.^{42}
The running time of their protocol almost matches (from above) the complexity
of our oracle machine, which might seem alarming. However, we stress that our
oracle machine is not an interactive protocol by itself, but rather only makes queries
to a protocol; these queries are of length T = T̄/K, and the protocol is only allowed
linear running time O(T). Therefore, the main point of difference is that the interactive
proof oracle in our hypothesized lower bound is allowed less running time than the upper
bound T̄ on computing even a single output bit of f. Needless to say, the techniques for
batch-verification from [RRR21; RRR18] do not extend to such a setting.
In fact, a hardness assumption that is even stronger than the one we make still
makes sense: The hypothesis would still seem reasonable if the probabilistic machine
were to make queries of length T to a linear-space oracle (i.e., to a machine running in
space O(T) and time 2^{O(T)}), rather than to a linear-time interactive proof.
42 The result of [RRR18] applies in a more general setting than the one we compare to, since they
consider applying f to K different inputs x_1, . . . , x_K, since their prover is efficient (i.e., runs in time
poly(T̄)), and since their result holds even when the upper bound on the complexity of f is UP rather
than P.

7.2.2 Proof of Theorem 7.5

Let L ∈ deIP^{[c]}[T, R], and let V be a T-time verifier in a protocol with c rounds for L
such that V sends R random coins in each turn. Denote by c′ = ⌈c/2⌉ the number
of turns of the verifier in the interaction. Given any prover P and input x, we think
of the corresponding interaction as a function of c′ strings of R random coins that are
chosen (uniformly) in advance, but are revealed to P during the interaction. We denote
by

⟨V, P, x⟩(r_1, . . . , r_{c′})

the result of the interaction between V and P on input x with random coins r_1, . . . , r_{c′}.
A verifier for L with O(log(R)) random coins. Our goal now is to construct an alternative
verifier V′ for L such that in each round of interaction, V′ uses O(log(R)) random
coins. Let G = G_f be the generator from Theorem 7.1, instantiated with the function
f : {0, 1}^N → {0, 1}^{R^{1/η}}, with sufficiently small β′ < β and α′ < α, and with output
length R; the number of output strings of G_f is R̄ := poly(K) = R^{O(1/η)}, where
η = η_{α′,β′} is the parameter from Theorem 7.1, and its running time is T̄ · poly(K) =
T · R^{O(1/η)}. For any given z ∈ {0, 1}^N and s ∈ [R̄], let G^z(s) = G(z)_s ∈ {0, 1}^R be the
sth output string of G when G is given input z.
In turn i ∈ [c′], the verifier V′ chooses a random seed s ∈ [R̄] and sends G^{x,π_i}(s)
to the prover, where x is the input and π_i is the interaction communicated between
the parties up to turn i of the interaction, padded with zeroes to be of length precisely
N − n (for convenience we denote π_1 = 0^{N−n}). Then it applies the same final predicate
to the interaction as V. In other words, continuing our notation for interactions as
functions of random coins, we have that

⟨V′, P, x⟩(s_1, . . . , s_{c′}) = ⟨V, P, x⟩(G^{x,π_1}(s_1), . . . , G^{x,π_{c′}}(s_{c′})) .

The running time of V′ is larger than that of V, but we will carefully account for
it later on. For now it suffices to observe that the only runtime overhead of V′ on top
of V comes from computing the generator G in each of the c′ turns, rather than just
sending random coins.
Analysis: Soundness of V′. For convenience, we will think of any probabilistic
polynomial-time algorithm P as an efficiently samplable distribution over deterministic
polynomial-time algorithms, obtained in the natural way (i.e., by randomly choosing coins in
advance and running the algorithm with the fixed chosen coins).
Consider an arbitrary probabilistic polynomial-time algorithm P. Our analysis uses
the following hybrid argument. Let V_0 be the original verifier, and for i ∈ [c′] let V_i be
the verifier in which in the first i turns, the verifier uses log(R̄) coins as above, instead
of R(n) random coins; that is, for any fixed P ∼ P,

⟨V_i, P, x⟩(s_1, . . . , s_i, r_{i+1}, . . . , r_{c′}) = ⟨V, P, x⟩(G^{x,π_1}(s_1), . . . , G^{x,π_i}(s_i), r_{i+1}, . . . , r_{c′}) .

Observe that V_0 = V and that V_{c′} = V′. For any fixed prover P and 1 ≤ i <
j ≤ c′, denote by P^{i...j} the partial prover strategy of P that refers only to rounds
i, i + 1, . . . , j.^{43} That is, we think of the prover as a sequence of c′ functions, where the
43 Note that we are now referring to rounds rather than to turns. That is, the interaction consists of c
turns (where a turn is when one of the players speaks) and of c′ = ⌈c/2⌉ rounds (where a round consists
of two turns, except possibly the last round).

ith function maps a transcript in the ith round (which is of length i · R + (i − 1) · T) to
the prover's response (which is of length T); then, P^{i...j} is simply a subsequence of the
c′ functions that define P. We use the notation P^{i...j} ∼ P to denote the random variable
that is obtained by sampling P ∼ P and outputting the partial prover strategy P^{i...j}.
And for two partial prover strategies P^{1...i−1} and P^{i...c′}, we denote by P^{1...i−1} ◦ P^{i...c′} the
(complete) prover strategy that is obtained by combining both partial strategies in the
obvious way (i.e., by simply concatenating the two sequences of functions).
Our main definition for the hybrid argument is a sequence of hybrids that involves
both a partial replacement of the random coins, and a partial replacement of
the "existential" prover strategy in the soundness condition (i.e., that there does not
exist an all-powerful prover that convinces the verifier) with a partial strategy chosen
according to P. In more detail, for any fixed i ∈ [c′] we denote

p_{x,i} = max_{P^{i+1...c′}} { Pr_{s_1,...,s_i, r_{i+1},...,r_{c′}, P^{1...i} ∼ P} [⟨V_i, P^{1...i} ◦ P^{i+1...c′}, x⟩(s_1, . . . , s_i, r_{i+1}, . . . , r_{c′}) = 1] } .

The perfect completeness of V′ follows immediately from the perfect completeness
of V, since V′ simply simulates V with pseudorandom choices of coins. Thus, we
focus on establishing the soundness of V′. To do so, we first argue that:

Fact 7.5.1. For any fixed x ∉ L such that

Pr_{P∼P, s_1,...,s_{c′}} [⟨V′, P, x⟩(s_1, . . . , s_{c′}) = 1] > .4 ,    (7.1)

there exists i ∈ [c′] such that p_{x,i} − p_{x,i−1} > 1/20c′.

Proof. Fix x ∉ L such that Eq. (7.1) holds. Observe that p_{x,0} ≤ 1/3, by the soundness
of the original protocol V (which holds for every fixed x and an all-powerful P). On
the other hand, we have that p_{x,c′} = Pr_{P∼P, s_1,...,s_{c′}}[⟨V′, P, x⟩(s_1, . . . , s_{c′}) = 1], and thus
p_{x,c′} > .4 by Eq. (7.1). Since p_{x,c′} − p_{x,0} = Σ_{i∈[c′]} (p_{x,i} − p_{x,i−1}), for some i ∈ [c′] it holds
that p_{x,i} − p_{x,i−1} > (p_{x,c′} − p_{x,0})/c′ > 1/20c′. □
The main claim in the analysis is the following:

Claim 7.5.2. For any i ∈ [c′], the probability over x ∼ x that x ∉ L and

p_{x,i} > p_{x,i−1} + 1/20c′

is negligible.

Since the proof of Claim 7.5.2 is quite involved, we defer it for a moment, and first
complete the argument assuming that Claim 7.5.2 is true.
By combining Fact 7.5.1 and Claim 7.5.2, we deduce that for every probabilistic
polynomial-time algorithm P, with probability 1 − neg(|x|) over x ∼ x, if x ∉ L then
Pr_{P∼P, s_1,...,s_{c′}}[⟨V′, P, x⟩(s_1, . . . , s_{c′}) = 1] ≤ .4. (Intuitively, this establishes that V′ is an
argument system that uses few random coins.)


From V′ to a deterministic doubly efficient argument system. Given any input x, the
total number of possible messages from V′ to a prover (across all turns) is R̄^{c′},
corresponding to a choice of seeds s̄ ∈ [R̄]^{c′}. Our deterministic verifier V′′ expects the
prover to send a corresponding transcript for each choice of s̄. It then:

1. Verifies that the sent transcripts are consistent with a prover strategy (i.e., that
the prover did not respond to two identical prefixes in different ways).

2. For each transcript corresponding to a choice of s̄, it verifies that the pseudorandom
coins in each message (which are part of the transcript) are what one
obtains by applying G to the transcript at that point, using the corresponding
seed (which is part of s̄).

3. Computes the average acceptance probability of V′ over all choices of s̄.
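The three checks can be sketched as follows. This is a minimal illustration, not the paper's implementation; `transcripts`, `G`, and `final_predicate` are hypothetical stand-ins for the prover's messages, the targeted generator, and the final predicate of V′, and the 2/3 acceptance threshold is one natural choice given completeness 1.

```python
from itertools import product

def deterministic_verifier(x, transcripts, G, final_predicate, R_bar, c_prime):
    """Sketch of V'': `transcripts` maps each seed tuple s_bar in [R_bar]^{c'}
    to the list of prover messages for that interaction; G(x, prefix, seed)
    recomputes the pseudorandom coins at each point (check 2); the verifier
    rejects if the prover answered one prefix in two ways (check 1), and
    otherwise averages the final predicate over all seed choices (check 3)."""
    responses = {}                        # prefix -> prover message seen so far
    accepted = 0
    for s_bar in product(range(R_bar), repeat=c_prime):
        msgs, prefix = transcripts[s_bar], ()
        for i, s in enumerate(s_bar):
            coins = G(x, prefix, s)       # check 2: coins come from G
            key = prefix + (coins,)
            if responses.setdefault(key, msgs[i]) != msgs[i]:
                return 0                  # check 1: inconsistent prover strategy
            prefix = key + (msgs[i],)
        accepted += final_predicate(x, prefix)
    # Check 3: accept iff the average acceptance probability is at least 2/3.
    return 1 if 3 * accepted >= 2 * R_bar ** c_prime else 0
```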


Recall that there is an efficient prover P such that for any x ∈ L it holds that
Pr_{r_1,...,r_{c′}}[⟨V, P, x⟩(r_1, . . . , r_{c′}) = 1] = 1. When interacting with P, the verifier V′ simply
uses pseudorandom coins instead of random ones, and thus Pr_{s_1,...,s_{c′}}[⟨V′, P, x⟩(s_1, . . . , s_{c′}) = 1] = 1.
The honest prover for the verifier V′′ enumerates over s̄ ∈ [R̄]^{c′}, and for every
choice of s̄ it simulates the corresponding interaction with P, while computing the
generator G at each round. Since P runs in time poly(T) and G can be computed in
time polynomial in T̄ ≤ poly(T), the honest prover for V′′ runs in time poly(T).
Now, the verifier V′′ inherits its soundness immediately from that of V′.^{44} Relying
on the fact that the soundness holds against all polynomial-time algorithms P, we
further argue that the soundness error of V′′ is not only .4 but actually neg(n):

Claim 7.5.3. For every probabilistic polynomial-time algorithm P, with probability 1 − neg(n)
over an n-bit x ∼ x, if x ∉ L then Pr_{P∼P}[V′′(x, P(x)) = 1] < neg(n).
Proof. Assume towards a contradiction that for some P and polynomial p(n) there are
infinitely many n ∈ ℕ such that, with non-negligible probability over an n-bit x ∼ x it
holds that x ∉ L and Pr_{P∼P}[V′′(x, P(x)) = 1] ≥ 1/p(n).
Consider the prover P′ that on input x runs P for t = p(n)^2 times to obtain candidate
proofs π_1, . . . , π_t, for each i ∈ [t] checks whether V′′(x, π_i) = 1, and if it
finds π_i for which the latter holds then it prints this π_i (otherwise it prints some
fixed default proof). Note that P′ runs in polynomial time, and for every x such
that Pr_{P∼P}[V′′(x, P(x)) = 1] ≥ 1/p(n) we have that Pr_{P′∼P′}[V′′(x, P′(x)) = 1] ≥ .99.
Thus, there is a polynomial-time prover P′ and infinitely many n ∈ ℕ such that with
non-negligible probability over an n-bit x ∼ x it holds that x ∉ L and
Pr_{P′∼P′}[V′′(x, P′(x)) = 1] > .4, contradicting the soundness of V′′.
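The choice t = p(n)^2 gives the .99 bound via a standard calculation: with t independent runs, each finding an accepting proof with probability at least 1/p(n),

```latex
\Pr[\text{all } t \text{ runs fail}]
  \;\le\; \left(1-\frac{1}{p(n)}\right)^{p(n)^2}
  \;\le\; e^{-p(n)} \;<\; .01
```

for every sufficiently large n, so P′ finds an accepting proof with probability at least .99.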
We now bound the running time of V′′. Recall that the prover sends R̄^{c′} transcripts
to V′′, corresponding to all possible choices of seeds s̄ ∈ [R̄]^{c′} by V′. We assume that
the prover sends the transcripts in prefix-order of s̄. For each prefix s_1, . . . , s_i of s̄, the
verifier checks that all transcripts corresponding to that prefix are identical (i.e., verifies
that the prover strategy is consistent); and then computes the set of pseudorandom
strings obtained by applying G to the foregoing transcript with all possible choices
of s_{i+1}, "in a batch"; by Theorem 7.1, for each prefix this can be done in time T̄ ·
K + poly(K) ≤ T · R^{O(1/η)}. In the end, it computes the original V on each of the
R̄^{c′} = R^{O(c/η)} choices of seeds. The final running time of V′′ is thus T · R^{O(c/η)}.
Finally, to complete the proof the only missing piece is to prove Claim 7.5.2.

Proof of Claim 7.5.2. Fix i ∈ [c′] and assume that with non-negligible probability over
x ∼ x we have that p_{x,i} > p_{x,i−1} + 1/20c′. For any fixed p̄ = P^{1...i−1} and s̄ = s_1, . . . , s_{i−1},
44 To see this, note that if the prover sends consistent answers to all message sequences then those
answers yield a prover strategy that could have been used in the interaction with V′.

the interaction in the first i − 1 rounds between any prover whose partial strategy in
these rounds is p̄ and any verifier that behaves like V_{i−1} in these rounds is also fixed
(i.e., it is a deterministic function of p̄ and s̄); we denote the transcript of this interaction
by π_{p̄,s̄}. We denote by (x, π) the distribution that is obtained by sampling x ∼ x and
p̄ = P^{1...i−1} ∼ P and s̄ ∈ ([R̄])^{i−1} and outputting (x, π_{p̄,s̄}). By our assumptions about P
and x and the fact that G runs in polynomial time, the distribution (x, π) is samplable
in polynomial time.
Fixing the first i − 1 rounds of interaction. For every fixed (x, π_{s̄,p̄}), denote

(p_{x,i−1}|π_{s̄,p̄}) = max_{P^{i...c′}} { Pr_{r_i, r_{i+1},...,r_{c′}} [⟨V_{i−1}, p̄ ◦ P^{i...c′}, x⟩(s̄, r_i, r_{i+1}, . . . , r_{c′}) = 1] } ,

(p_{x,i}|π_{s̄,p̄}) = max_{P^{i+1...c′}} { Pr_{s_i, r_{i+1},...,r_{c′}, P^i ∼ P} [⟨V_i, p̄ ◦ P^{i...c′}, x⟩(s̄, s_i, r_{i+1}, . . . , r_{c′}) = 1] } ,

and observe that:


Claim 7.5.3.1. We have that p_{x,i} = E_{s̄,p̄}[(p_{x,i}|π_{s̄,p̄})] and p_{x,i−1} = E_{s̄,p̄}[(p_{x,i−1}|π_{s̄,p̄})].

Proof. Note that in p_{x,i} and in (p_{x,i}|π_{s̄,p̄}) (resp., in p_{x,i−1} and (p_{x,i−1}|π_{s̄,p̄})) the maximum
is over functions P^{i+1...c′} (resp., P^{i...c′}) that take π_{s̄,p̄} as part of their input. Thus, first
choosing π_{s̄,p̄} from some distribution and then taking the maximum-achieving function
is identical to first taking the maximum-achieving function and then choosing π_{s̄,p̄}
from the same distribution.^{45} □
We call a pair (x, π_{s̄,p̄}) good if (p_{x,i}|π_{s̄,p̄}) > (p_{x,i−1}|π_{s̄,p̄}) + 1/40c′, and argue that:

Claim 7.5.3.2. With non-negligible probability over (x, π_{s̄,p̄}) ∼ (x, π) we have that (x, π_{s̄,p̄})
is good.

Proof. By our assumption, with non-negligible probability over x ∼ x we have that
p_{x,i} > p_{x,i−1} + 1/20c′. Now, let ∆_{s̄,p̄} = (p_{x,i}|π_{s̄,p̄}) − (p_{x,i−1}|π_{s̄,p̄}), and note that

E_{s̄,p̄}[∆_{s̄,p̄}] = E_{s̄,p̄}[(p_{x,i}|π_{s̄,p̄})] − E_{s̄,p̄}[(p_{x,i−1}|π_{s̄,p̄})] = p_{x,i} − p_{x,i−1} > 1/20c′ ,

where the second equality above relies on Claim 7.5.3.1. Thus, if we had that
Pr_{s̄,p̄}[∆_{s̄,p̄} > 1/40c′] < n^{−ω(1)}, then we would also have E_{s̄,p̄}[∆_{s̄,p̄}] ≤ n^{−ω(1)} +
(1/40c′) < 1/20c′, a contradiction. □

Obtaining an AMTIME_2^{[c]} distinguisher. Fixing a good (x, π_{s̄,p̄}), we denote

p(r_i) = max_{P^{i...c′}} { Pr_{r_{i+1},...,r_{c′}} [⟨V_{i−1}, p̄ ◦ P^{i...c′}, x⟩(s̄, r_i, . . . , r_{c′}) = 1] } ,

and argue that:


45 In other words, we use the fact that for every D and R and ν : R → ℝ it holds that

max_{f : D→R} E_{r∈D}[ν(f(r))] = E_{r∈D}[ max_{f : D→R} {ν(f(r))} ] = E_{r∈D}[ν(f*(r))] ,

where f* is the function that maps any r ∈ D to t ∈ R such that ν(t) is maximal.

Claim 7.5.3.3. The following two statements hold:

1. E_{r_i ∈ {0,1}^R}[p(r_i)] = (p_{x,i−1}|π_{s̄,p̄}) .

2. E_{s_i ∈ [R̄]}[p(G^{x,π_{s̄,p̄}}(s_i))] ≥ (p_{x,i}|π_{s̄,p̄}) .

Proof. For the first item, note that

E_{r_i∈{0,1}^R}[p(r_i)]
= E_{r_i∈{0,1}^R}[ max_{P^{i...c′}} Pr_{r_{i+1},...,r_{c′}}[⟨V_{i−1}, p̄ ◦ P^{i...c′}, x⟩(s̄, r_i, . . . , r_{c′}) = 1] ]
= max_{P^{i...c′}} Pr_{r_i, r_{i+1},...,r_{c′}}[⟨V_{i−1}, p̄ ◦ P^{i...c′}, x⟩(s̄, r_i, . . . , r_{c′}) = 1]
= (p_{x,i−1}|π_{s̄,p̄}) ,

where the second equality is because (as in the proof of Claim 7.5.3.1) the maximum
is over functions P^{i...c′} that take r_i as part of their input (and thus first drawing a random
r_i and then choosing a maximum-achieving function is identical to first choosing
a maximum-achieving function and then drawing a random r_i).
For the second item, the argument is a bit more subtle, as follows:

E_{s_i∈[R̄]}[p(G^{x,π_{s̄,p̄}}(s_i))]
= E_{s_i∈[R̄]}[ max_{P^{i...c′}} Pr_{r_{i+1},...,r_{c′}}[⟨V_{i−1}, p̄ ◦ P^{i...c′}, x⟩(s̄, G^{x,π_{s̄,p̄}}(s_i), r_{i+1}, . . . , r_{c′}) = 1] ]
= max_{P^{i...c′}} Pr_{s_i, r_{i+1},...,r_{c′}}[⟨V_{i−1}, p̄ ◦ P^{i...c′}, x⟩(s̄, G^{x,π_{s̄,p̄}}(s_i), r_{i+1}, . . . , r_{c′}) = 1]
= max_{P^{i...c′}} Pr_{s_i, r_{i+1},...,r_{c′}}[⟨V_i, p̄ ◦ P^{i...c′}, x⟩(s̄, s_i, r_{i+1}, . . . , r_{c′}) = 1]
≥ max_{P^{i+1...c′}} { Pr_{s_i, r_{i+1},...,r_{c′}, P^i∼P}[⟨V_i, p̄ ◦ P^{i...c′}, x⟩(s̄, s_i, r_{i+1}, . . . , r_{c′}) = 1] }
= (p_{x,i}|π_{s̄,p̄}) ,

where the main difference from the proof of the first item is the inequality, which
asserts that taking the maximum-achieving prover strategy in round i can only increase
the acceptance probability compared to choosing P^i ∼ P.^{46} □
Relying on Claim 7.5.3.3, for the fixed good (x, π_{s̄,p̄}) we have that

E_{s_i∈[R̄]}[p(G^{x,π_{s̄,p̄}}(s_i))] − E_{r_i∈{0,1}^R}[p(r_i)] ≥ (p_{x,i}|π_{s̄,p̄}) − (p_{x,i−1}|π_{s̄,p̄}) > 1/40c′ .

The foregoing assertion implies that the real-valued function p(·) behaves differently
on the pseudorandom distribution G^{x,π_{s̄,p̄}}(u_{[R̄]}) and on the uniform distribution
46 To see that the third equality above holds, note the following. In the upper row we are referring to
the verifier V_{i−1}, which uses the ith random string given to it as-is, and are giving it the string G^{x,π_{s̄,p̄}}(s_i)
for a random s_i. In the bottom row we are referring to the verifier V_i, which applies G^{x,π_{s̄,p̄}} to the ith
random string given to it, and are giving V_i a random s_i. Thus, in both cases the random string used
in the ith round is G^{x,π_{s̄,p̄}}(s_i) for a random s_i.

u_R. To obtain a Boolean-valued distinguisher (rather than a real-valued one such
as p(·)), and furthermore a Boolean-valued distinguisher that is computable by an
AMTIME^{[c]} protocol, we rely on the following claim: It asserts that if two real-valued
RVs x and y in [0, 1] have expectations that differ by ε > 0, then there are
two thresholds ℓ′ and ℓ′ + Θ(ε) such that the probability that x exceeds ℓ′ + Θ(ε) is
noticeably higher than the probability that y exceeds ℓ′.
Lemma 7.5.3.4. Let x, y be two random variables taking values from [0, 1] and ε ∈ (0, 1) such
that ε^{−1} ∈ ℕ. If E[x] − E[y] > ε, then there exists j ∈ [4/ε] such that

Pr[x ≥ (ε/4) · (j + 1)] − Pr[y ≥ (ε/4) · j] > ε/2 .
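Lemma 7.5.3.4 can be sanity-checked numerically; the following sketch searches for the promised index j, given equally weighted sample sets for the two random variables (the helper and the sample data are illustrative only):

```python
def find_threshold_index(xs, ys, eps):
    """Return the first j in [4/eps] with
    Pr[x >= (eps/4)(j+1)] - Pr[y >= (eps/4) j] > eps/2,
    or None if no such j exists; xs and ys are samples in [0, 1]."""
    def pr_geq(samples, t):               # empirical Pr[sample >= t]
        return sum(v >= t for v in samples) / len(samples)
    for j in range(1, int(4 / eps) + 1):
        if pr_geq(xs, (eps / 4) * (j + 1)) - pr_geq(ys, (eps / 4) * j) > eps / 2:
            return j
    return None

# Two empirical distributions on [0,1] whose means differ by .35 > eps = 1/4:
xs = [0.9, 0.8, 1.0, 0.7]
ys = [0.5, 0.6, 0.4, 0.5]
```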

The proof of Lemma 7.5.3.4 is elementary but tedious, so for convenience we defer
it to Appendix B. Now, denote ε = 1/40c′ and consider the following promise
problem:

Y = { (x, π_{s̄,p̄}, r_i, τ) : p(r_i) ≥ τ + ε/4 }
N = { (x, π_{s̄,p̄}, r_i, τ) : p(r_i) ≤ τ } .
Observe that the problem (Y, N) can be decided in prAMTIME_2^{[c]}[O(n)], since the
verifier can run the original protocol V_{i−1} from the definition of p(·) for O(1) times
in parallel, estimate the acceptance probability of V_{i−1} with the given prover up to
accuracy ε/8 and with high probability, and accept if and only if this probability is
more than τ + ε/8. Note that the runtime of this protocol is linear, because ε = Ω(1)
and the input size to this problem is O(T).
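The estimation step is ordinary parallel repetition. A sketch, where `run_protocol` is a hypothetical stand-in for one execution of the protocol V_{i−1} with the given prover, and the repetition count follows Hoeffding's inequality:

```python
import math
import random

def estimate_acceptance(run_protocol, eps, delta=0.01, rng=random):
    """Average t = O(log(1/delta)/eps^2) independent runs to estimate the
    acceptance probability up to additive error eps/8, except with
    probability at most delta (by Hoeffding's inequality)."""
    t = math.ceil(2 * math.log(2 / delta) / (eps / 8) ** 2)
    return sum(run_protocol(rng) for _ in range(t)) / t

def decide_YN(run_protocol, tau, eps):
    # Accept iff the estimate exceeds tau + eps/8; under the promise
    # (p >= tau + eps/4, or p <= tau) this decides (Y, N) w.h.p.
    return 1 if estimate_acceptance(run_protocol, eps) > tau + eps / 8 else 0
```

Since ε = 1/40c′ = Ω(1), the number of parallel repetitions is a constant.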
Now, by instantiating Lemma 7.5.3.4 with the RVs x = p(G^{x,π_{s̄,p̄}}(u_{[R̄]})) and y =
p(u_R), for some τ ∈ {0, ε/4, 2ε/4, . . . , 1} it holds that

Pr[(x, π_{s̄,p̄}, G^{x,π_{s̄,p̄}}(u_{[R̄]}), τ) ∈ Y] > 1 − Pr[(x, π_{s̄,p̄}, u_R, τ) ∈ N] + ε/2 .

For any fixed (x, π) and τ ∈ {0, ε/4, 2ε/4, . . . , 1} we define the following procedure
D = D_{x,π,τ}: Given r ∈ {0, 1}^R (which we think of as coming either from the
uniform distribution or from the pseudorandom distribution) the procedure creates
the string z = (x, π, r, τ) and solves the promise problem (Y, N) on input z. Note that
indeed D ∈ prAMTIME_2^{[c]}[O(n)], and that

Pr[D(G^{x,π_{s̄,p̄}}(u_{[R̄]})) = 1] > Pr[D(u_R) = 1] + ε/2 ,

where the inequality relies on the fact that Pr[D(u_R) = 1] = 1 − Pr[D(u_R) = 0] ≤
1 − Pr[(x, π_{s̄,p̄}, u_R, τ) ∈ N].
The reconstruction argument. To obtain a contradiction, we show an algorithm that
runs in time T̄ · K^β, where T̄ = T · K, and with non-negligible probability over the
polynomial-time samplable distribution (x, π) ∼ (x, π) manages to approximately
print f(x, π) (with high probability over its random coins).
Consider an algorithm F_0 that gets input (x, π) and randomly chooses τ ∈ {0, ε/4, 2ε/4, . . . , 1}.
Then F_0 runs the reconstruction Rec with input (x, π), giving it oracle access to D_{x,π,τ}.
Specifically, whenever Rec queries D_{x,π,τ} on input r, the algorithm F_0 queries its
oracle on input (x, π, r, τ) and returns the corresponding answer. The algorithm F_0 runs in time

Õ(K^{1+β′}) · T + T̄ · K^{β′} = Õ(T̄ · K^{β′}) ,

where the first term is the number of queries times the length of a query to D_{x,π,τ},
and the second term is the additional runtime of Rec;
and assuming that the guess of τ was correct, with high probability F_0 prints a string
that agrees with f(x, π) on 1 − α′ of the bits.
To construct an algorithm F that succeeds with high probability, we run F_0 for O(c′)
times, such that with high probability at least one iteration successfully printed a string
that agrees with f(x, π) on 1 − α′ of the bits. Then, for each of the O(c′) candidate
strings, the algorithm F estimates the agreement of this string with f(x, π) up to a
sufficiently small constant error, by randomly sampling output bits and computing
the corresponding bits of f(x, π) (recall that each bit can be computed in time T̄). The
running time of F is at most Õ(T̄ · K^{β′}) < T̄ · K^β, and with high probability it succeeds
in outputting a string that agrees with f on at least 1 − α of the bits. □
Having proved Claim 7.5.2, we have completed the proof of Theorem 7.5.
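As a toy illustration of the final estimation step performed by F above (not the paper's actual construction; all names and parameters here are hypothetical), one can estimate each candidate's agreement with f(x, π) by sampling a few positions and keep the best candidate:

```python
import random

def estimate_agreement(candidate, target_bit, samples, rng):
    """Estimate the fraction of positions on which `candidate` agrees with the
    target string, by sampling `samples` random positions and computing the
    corresponding target bits (each such bit is cheap to compute relative to
    printing the whole string)."""
    hits = sum(candidate[i] == target_bit(i)
               for i in (rng.randrange(len(candidate)) for _ in range(samples)))
    return hits / samples

def best_candidate(candidates, target_bit, samples, rng):
    """Among several candidate strings (one per run of F'), return the one
    whose estimated agreement with the target is highest."""
    return max(candidates,
               key=lambda c: estimate_agreement(c, target_bit, samples, rng))
```

A constant number of samples already estimates the agreement up to a small constant error with high probability, which is all the argument needs.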

Remark 7.6 (handling imperfect completeness). The proof of Theorem 7.5 implies a similar derandomization for deIP^[c] protocols with imperfect completeness that use a small number of random coins. To see this, observe that the well-known transformation of AM protocols into protocols with perfect completeness [FGM+89] yields the following: Any deIP^[c] protocol in which the verifier uses only R coins can be simulated by a deIP^[c] protocol with perfect completeness such that the new protocol has the same number of rounds, the verifier uses R̄ = Õ(R) random coins, its communication complexity and running time increase by a multiplicative factor of O(R̄), and the prover is still efficient but is now probabilistic.47
Since we think of R as small in the result above (indeed, we will use this result with R = O(log(T)) in Theorem 7.9 and Corollary 7.12), we can start from a protocol
with imperfect completeness, simulate it by a protocol with similar running time, and
appeal to Theorem 7.5 as a black-box. Indeed, the only gap is that the honest prover
in the resulting deIP [ c] protocol is now probabilistic rather than deterministic.

Remark 7.7 (argument systems for N P -relations). The honest prover in the argument
system that is constructed in the proof of Theorem 7.5 is very similar to the original
honest prover, and in particular has almost the same time complexity. (This is because
the new prover enumerates over all possible seeds for the targeted PRG, across all
rounds, and for each choice it simulates the verifier’s interaction with the original
prover using the chosen seeds to generate pseudorandom coins.) One implication
of this fact is that our results extend naturally to constant-round probabilistic proof
systems in which the honest prover gets a witness as auxiliary input.
In more detail, let R be a relation, let L R be the decision problem defined by R, and
let Π be a probabilistic proof system for L R with c turns of interaction such that the
verifier in Π runs in time T, and the honest prover in Π runs in time poly( T ) when
given a witness for the input (in the relation R).48 Then, under the same assumption as
in Theorem 7.5, our proof gives a deterministic argument system for L_R in which the verifier runs in time T^{1+e}, soundness is precisely as in Definition 3.8, and the honest prover runs in time poly(T) when given a witness (in the relation R) for the input.
47 These properties are not explicitly stated in the original work, but they readily follow from the proof. Specifically, in the original proof the message lengths of the verifier and the prover are coupled, but the proof only relies on the error being smaller than 1/R; and the original proof shows that for every x ∈ L, with high probability over the choice of initial strings by the prover (as a basis for simulating copies of the protocol in parallel), the verifier accepts with probability 1.
48 That is, we consider the interaction between the verifier and the honest prover when the former is given input x and the latter is given input (x, w) ∈ R. Note that any such relation is in MATIME[poly(T)], since a verifier can guess w and simulate the interaction with the honest prover.

7.2.3 A stronger result for doubly efficient proof systems with a universal prover
We now argue that in the special case of doubly efficient proof systems that have an efficient universal prover, we can derandomize such systems under a hypothesis that is weaker than the one in Theorem 7.5. Specifically, we require hardness only against probabilistic algorithms that have oracle access to deIP^[c], rather than to AMTIME^[c]. (We will use this generic claim in the proof of Theorem 1.4, since the proof system for #SAT indeed has an efficient universal prover.)
Theorem 7.8 (derandomizing constant-round doubly efficient proof systems with an efficient universal prover into deterministic doubly efficient argument systems). For every α, β ∈ (0, 1) there exists η > 0 such that the following holds. Let c ∈ N be a constant, let T(n) be a polynomial, let R(n) < T(n) be time-computable, and let N(n) = n + c · T(n) and K(n) = R(n)^{1/η}. Assume that the (N ↦ K, K^β, α)-non-batch-computability assumption holds for time T̄ = T · K with oracle access to pr-deIP^[c][n] on inputs of length O(T). Then,

deIP_uni^[c][T, R] ⊆ deDARG[T · R^{O(c/η)}] ,

where the O-notation hides a universal constant, and deIP_uni^[c][T, R] denotes all problems solvable by deIP^[c][T, R] systems that, for every constant µ > 0, have a µ-approximate universal prover running in time poly(T).
Proof. The proof is very similar to that of Theorem 7.5, the only difference being that the reconstruction algorithm in Claim 7.5.2 can now only access a pr-deIP^[c][n] oracle rather than a pr-AMTIME_2^[c][n] oracle.
Recall that in the proof of Claim 7.5.2, the algorithm uses an oracle that solves the following promise problem, which is defined with respect to a parameter τ:

Y = { (x, π_{s̄,p̄}, r_i, τ) : p(r_i) ≥ τ + e/4 }
N = { (x, π_{s̄,p̄}, r_i, τ) : p(r_i) ≤ τ } .

Our goal is to argue that (Y, N) ∈ pr-deIP^[c][n], by arguing that there is an efficient honest prover. (The rest of the proof then continues without change.)
The key observation is that our current assumption asserts the existence of an efficient prover that, on any x and (π_{s̄,p̄}, r_i), almost maximizes the residual acceptance probability of the protocol when the prefix of the transcript is fixed to (π_{s̄,p̄}, r_i). For any constant µ, we denote the µ-approximate universal prover by P_µ, and for a fixed x̄ = (x, π_{s̄,p̄}, r_i), we denote by ν = ν_x̄ the maximal acceptance probability of the residual protocol, across all provers.
The verifier for (Y, N) is identical to that in the original proof of Claim 7.5.2; that is, the verifier simulates the residual protocol poly(1/e) times in parallel, and accepts if and only if the average acceptance probability across the simulations, denoted ν̃, satisfies ν̃ ≥ τ + e/8. When (x̄, τ) ∈ N, for any prover, with high probability we have that ν̃ < τ + e/8. On the other hand, when (x̄, τ) ∈ Y, a prover that simulates P_{e/16} in the poly(1/e) parallel instantiations of the protocol yields ν̃ ≥ τ + e/8, with high probability. Thus, (Y, N) ∈ pr-deIP^[c][n], as we wanted.
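The verifier's estimation step can be sketched as follows; this is only an illustrative toy, and `run_residual_protocol`, the repetition count, and the threshold are stand-ins rather than the paper's exact construction:

```python
import random

def threshold_verifier(run_residual_protocol, tau, eps, repetitions, rng):
    """Toy sketch of the (Y, N) verifier: simulate the residual protocol
    several times in parallel, and accept iff the average acceptance rate
    clears the threshold tau + eps/8.  `run_residual_protocol(rng)` stands
    in for one interaction with the prover, returning 0 or 1."""
    avg = sum(run_residual_protocol(rng) for _ in range(repetitions)) / repetitions
    return avg >= tau + eps / 8
```

With enough repetitions the estimate concentrates: a residual protocol whose acceptance probability is at most τ is rejected with high probability, while one with acceptance probability at least τ + e/4 is accepted with high probability.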

7.2.4 A deterministic argument system for #SAT with runtime 2^{e·n}

We now state and prove Theorem 1.4, as a particularly appealing special case of Theorem 7.8. Specifically, we show that under strong hardness assumptions, we can solve #SAT by a deterministic doubly efficient argument system running in time 2^{e·n}, for an arbitrarily small e > 0.
Theorem 7.9 (a deterministic argument system for #SAT with runtime 2^{e·n}). For every α, β ∈ (0, 1) there exists η > 0 such that the following holds. Assume that for every constant c ∈ N and logarithmic function ℓ(n) = O(log(n)), the (N ↦ K, K^β, α)-non-batch-computability assumption holds for time T̄ = T · K with oracle access to pr-deIP^[c][n] on inputs of length O(T), where

T(n) = O(n) ,
N(n) = n + c · T(n) ,
K(n) = ℓ(n)^{1/η} .

Then, for every e > 0 there exists a deterministic verifier V that gets as input an n-bit formula Φ of size at most 2^{o(n)}, runs in time 2^{e·n}, and satisfies the following:
1. There exists an algorithm that gets input Φ, runs in time 2^{O(n)}, and prints a proof π such that V(Φ, π) = #SAT(Φ).
2. For every probabilistic algorithm P running in time 2^{O(n)} and every sufficiently large n ∈ N, the probability that P(1^n) outputs an n-bit formula Φ of size 2^{o(n)} and a proof π such that V(Φ, π) ∉ {⊥, #SAT(Φ)} is 2^{−ω(n)}.
Proof. Let e′ > 0 be a sufficiently small constant. Using Theorem 3.17 with a sufficiently large constant k ∈ N and with δ > 0 such that δ/(1 − δ) = 1/k, we have a deIP_uni^[2k+1][2^{e′·n}] protocol for counting the number of satisfying assignments of an n-bit formula of size 2^{o(n)}, which uses at most O(n) random coins.49
By a padding argument, we think of the input as of size n̄ = 2^{e′·n} and of the protocol as running in linear time with logarithmically many coins. Also, we consider the 2n = O(log n̄) Boolean protocols U_1, . . . , U_n, Ū_1, . . . , Ū_n, where each U_i is a protocol for proving that the ith bit of #SAT(Φ) is 1 and each Ū_i is a protocol for proving that the ith bit of #SAT(Φ) is 0.
We apply Theorem 7.8 to each of the 2n protocols, with parameters T(n̄) = O(n̄) and R(n̄) = O(log n̄), to obtain a sequence of 2n deterministic verifiers D_1, . . . , D_n, D̄_1, . . . , D̄_n, each running in time Õ(T). Given a formula Φ, our verifier V expects to receive from the prover a value ρ ∈ {0, 1}^n and n witnesses w_1, . . . , w_n ∈ {0, 1}^{Õ(n̄)} such that for every i ∈ [n] it holds that D_i(w_i) = 1 if ρ_i = 1, and D̄_i(w_i) = 1 if ρ_i = 0. Whenever this happens V outputs ρ, and otherwise it outputs ⊥.
For every algorithm P running in time 2^{O(n)}, by a union-bound, the probability that P outputs a formula Φ and a proof that falsely convinces some D_i or D̄_i is negligible. Also, the running time of V is Õ(n · n̄) < 2^{e·n}.

7.3 The general case: Constant-round doubly efficient proof systems


In this section we prove Theorem 1.8. First, in Section 7.3.1, we show yet another
refinement of the reconstructive PRG from Proposition 5.2, which will be used in the
proof. Then in Section 7.3.2 we prove Theorem 1.8.
49 Note that Theorem 3.17 yields an MATIME^[2k] protocol that has a universal prover running in time 2^{O(n)}. In particular, such a protocol can be simulated by a deIP^[2k+1] protocol with the same verifier running time.

7.3.1 Yet another refinement of the reconstructive PRG
We now further refine the reconstructive PRG from Proposition 5.2. The goal now is for the PRG to work not only with standard distinguishers, but also with distinguishers that solve a promise problem (i.e., a pseudorandom input lands in Y with higher probability than a random input lands outside of N). The crux of the proof is to show that the advice depends only on the promise problem, rather than on any particular oracle that solves the problem (and may behave arbitrarily on queries outside the promise).

Proposition 7.10 (an extension of the PRG from Proposition 5.2 to “promise problem” distinguishers). For any constant e′ > 0, we can replace the “furthermore” claim in Proposition 5.2 with the following claim. Fix a promise problem (Y, N) such that

Pr[G^f(u_{(1+e′)·log(N)}) ∈ Y] > Pr[u_N ∉ N] + e′ .

Then, there exist s ∈ N satisfying s | t̄, and α ∈ (0, 1), and an advice string adv of length |f|^{1−δ′} such that the following holds. Denoting by a_{x,w,γ} the sequence of answers to R's queries on input x ∈ [|f|] and witness w and random choices γ, we have that:

1. (Completeness.) For any x there exists w such that with probability 1 − 1/N over γ, any function that agrees with (Y, N) yields a sequence of answers to the oracle queries that is (s, α)-valid, and if a_{x,w,γ} is (s, α)-indicative of a sequence that agrees with (Y, N), then R(x, w) outputs f_x.

2. (Soundness.) For every (x, w), with probability at least 1 − 1/N over γ, if a_{x,w,γ} is (s, α)-indicative of a sequence that agrees with (Y, N) on the oracle queries, then R(x, w) outputs either ⊥ or f_x.

3. (Deficient oracles.) If a_{x,w,γ} is (s, α)-deficient then R outputs ⊥.

Proof. We instantiate Samp : {0, 1}^N̄ × [L̄] → {0, 1}^N just as in the proof of Proposition 4.1, and recall that its accuracy δ is sub-constant. We instantiate Enc with an agreement parameter ρ = ρ(e′) that is now a sufficiently small constant depending on e′ > 0. Other than that, we use the exact same generator G, and again denote the uniform distribution over the output-set of G^f by G.
In the current proof, instead of assuming that we have oracle access to a function D : {0, 1}^N → {0, 1} that is a (1/10)-distinguisher for G, we first fix a promise problem (Y, N) such that

Pr[G ∈ Y] > Pr[u_N ∉ N] + e′ ,     (7.2)

and only assume that we have access to some (arbitrary) D : {0, 1}^N → {0, 1} that solves (Y, N). Our goal is for the advice to depend only on (Y, N), but for the reconstruction to work with any oracle D that agrees with (Y, N).
The error-reduced “promise distinguisher”. We define the following two sets:

S = { z ∈ {0, 1}^N̄ : Pr_{j∈[L̄]}[Samp(z, j) ∉ N] ≤ Pr[u_N ∉ N] + e′/100 } ,

and

T = { z ∈ {0, 1}^N̄ : Pr_{j∈[L̄]}[Samp(z, j) ∈ Y] > Pr[u_N ∉ N] + e′/10 } .

Observe that T ⊆ S̄; this is the case because for any z we have that Pr_j[Samp(z, j) ∈ Y] ≤ Pr_j[Samp(z, j) ∉ N], and thus if z ∈ T then z ∉ S. Also note that |S̄| ≤ 2^{N̄^{1−γ}}, by the properties of Samp (we crucially use the fact that S is defined with respect to the specific event of being not in N). Finally, similarly to Eq. (4.2), we have that

Pr[G ∈ Y] = Pr_{i,j}[Samp(f̄_i, j) ∈ Y]
          ≤ Pr_i[f̄_i ∈ T] + Pr_{i,j}[Samp(f̄_i, j) ∈ Y | f̄_i ∉ T]
          ≤ Pr_i[f̄_i ∈ T] + Pr[u_N ∉ N] + e′/10 ,

and using Eq. (7.2) we deduce that Pr_i[f̄_i ∈ T] ≥ ρ, relying on the assumption that ρ = ρ(e′) is sufficiently small.
Computing a corrupted codeword. Analogously to the proofs of Propositions 4.1 and 5.2, we construct a machine M that, given any oracle D that agrees with (Y, N), computes a “corrupted” version of f̄. We first fix a hash function h ∈ H such that there are no collisions in S̄, as in the previous proofs. The machine M gets as advice h, the value Pr[u_N ∉ N], the set I = { (i, h(f̄_i)) : i ∈ [L] ∧ f̄_i ∈ T }, and the value α = Pr[u_N ∉ N] + .005. We stress the following fact:
Observation 7.10.1. The advice to M depends only on (Y, N), on Samp, and on h.
Now, given q ∈ [|f̄|], the machine M first guesses a preimage z ∈ {0, 1}^N̄ for the corresponding block f̄_i and verifies that h(z) = h(f̄_i) using the stored hash value. Then, M uniformly samples s = O(log(N)) values j_1, . . . , j_s ∈ [L̄], calls its oracle D on the corresponding strings Samp(z, j_k), and proceeds if and only if

ν := Pr_{k∈[s]}[D(Samp(z, j_k)) = 1] ≥ Pr[u_N ∉ N] + e′/50 .

If both verifications succeeded, then M outputs the bit in z corresponding to index q, and otherwise M outputs ⊥. The reconstruction R then uses the list-decoder (with fixed index and randomness), just as in the proof of Proposition 5.2.
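As a rough illustration of M's two checks (hash verification, then the sampled estimate of the oracle's acceptance rate), under entirely hypothetical interfaces for `h`, `D`, and `samp`:

```python
import random

def machine_M(q, z_guess, stored_hash, h, D, samp, L_bar, s, threshold, rng):
    """Toy sketch of the machine M: verify the guessed preimage against the
    stored hash value, estimate how often the oracle answers 1 on sampled
    strings samp(z, j), and output the requested bit of z only if the
    estimate clears the threshold; otherwise output None (i.e., ⊥).
    All parameter names and interfaces here are illustrative."""
    if h(z_guess) != stored_hash:
        return None  # wrong preimage guessed
    js = [rng.randrange(L_bar) for _ in range(s)]
    nu = sum(D(samp(z_guess, j)) for j in js) / s
    if nu < threshold:
        return None  # oracle did not confirm that z is a "good" preimage
    return z_guess[q]
```

Note that M never needs to know which particular oracle D it is given: the advice (the stored hash values and the threshold) depends only on (Y, N), which is exactly the point of Observation 7.10.1.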
The claim about deficient oracles follows immediately from the definition of M above (recall that R outputs ⊥ whenever M returns ⊥ in at least one execution). To demonstrate completeness, observe that for any x there exists w such that, with probability at least 1 − 1/N, on every query q to M, any sequence of answers that is consistent with (Y, N) will have at least an α-fraction of 1-answers (since the corresponding z is in T); and any sequence of answers with an α-fraction of 1's causes M to correctly compute the corrupted codeword (since the corresponding z is the right preimage for the query under h). When this happens, Dec (and R) output f_x.
For the soundness case, fix (x, w) and again recall that the non-determinism w for R yields non-deterministic strings for each of the executions of M. For every execution of M on query q and with non-determinism z, if z ∈ S then with probability at least 1 − 1/N² it holds that

Pr_{k∈[s]}[Samp(z, j_k) ∉ N] < Pr[u_N ∉ N] + e′/50 ,

in which case there does not exist a sequence of answers with at least an α-fraction of 1's that agrees with (Y, N). In this case the soundness claim holds vacuously (because there does not exist an (s, α)-valid sequence of answers that agrees with (Y, N)).
Therefore, by a union-bound over the queries, we may assume that z ∈ S̄ for all queries to M, and that M outputs either ⊥ (if z ∈ S̄ is not the unique preimage under h for the corresponding query) or the corresponding bit in the corrupted codeword (if

z ∈ S̄ is indeed the unique preimage under h for the corresponding query). In case one of the queries of Dec is answered by ⊥, then R outputs ⊥ by definition; and otherwise, the execution of Dec is identical to an execution with access to the corrupted codeword, in which case R(x, w) = f_x.

7.3.2 The proof of Theorem 1.8

We now show that under strong hardness assumptions we can reduce the number of random coins in an arbitrary doubly efficient proof system to be logarithmic, without increasing the number of rounds. The hardness assumptions will refer to a function whose truth-tables can be recognized in near-linear time, but that is hard for multi-round AM protocols running in time 2^{(1−δ)·n} with 2^{(1−δ)·n} bits of non-uniform advice, for a small δ > 0.
Theorem 7.11 (drastically reducing the number of random coins in a constant-round proof system). For every e > 0 there exists δ > 0 such that the following holds. Let c ∈ N be a constant, and assume that there exists Lhard ∉ MATIME^[c+1][2^{(1−δ)·n}]/2^{(1−δ)·n} such that given n ∈ N, the truth-table of Lhard on n-bit inputs can be printed in time 2^{(1+e/3)·n}. Then, for every polynomial T(n) it holds that

deIP^[c][T] ⊆ deIP^[c][T^{1+e}, (1 + e) · log(T)] .

Proof. Let L ∈ deIP^[c][T], and let V be the corresponding T-time verifier for L. Denote by c′ = ⌈c/2⌉ the number of turns of the verifier in the interaction. As in the proof of Theorem 7.5, we denote by ⟨V, P, x⟩(r_1, . . . , r_{c′}) the result of an interaction between V and P on common input x, where the coins r_1, . . . , r_{c′} are gradually revealed in the c′ turns of V.

The new verifier. We define a verifier V′ that gets input x ∈ {0, 1}^n and acts as follows. For N = T(n), let f_n be the truth-table of Lhard on inputs of length (1 + e/3) · log(N). Consider the generator G from Proposition 7.10 with e′ = e, input 1^N, and oracle access to f_n, and denote the number of its outputs by N̄ = N^{1+e}.
The verifier V′ computes f_n, and in each turn i it chooses a random s_i ∈ [N̄] and sends G^{f_n}(s_i) to the prover, instead of a random N-bit string. In the end, V′ applies the predicate V to the transcript. Note that V′ uses (1 + e) · log(N) random coins in each turn, and runs in time O(N^{1+e}).

Completeness and soundness. The honest prover for V′ behaves identically to the prover for V. Since the new protocol simulates the original protocol with a pseudorandom subset of the random strings, completeness follows immediately. We thus focus on proving the soundness of V′. Fix x ∉ L.
Notation. Let us introduce some useful notation. For every i ∈ [c′] let V_i be the verifier that in the first i turns uses pseudorandom coins as above, and in the rest of the interaction uses random coins. By definition, we have that

⟨V_i, P, x⟩(s_1, . . . , s_i, r_{i+1}, . . . , r_{c′}) = ⟨V, P, x⟩(G^{f_n}(s_1), . . . , G^{f_n}(s_i), r_{i+1}, . . . , r_{c′}) .

We define a sequence of hybrids, as follows:

p_{x,i} = max_P { Pr_{s_1,...,s_i, r_{i+1},...,r_{c′}}[⟨V_i, P, x⟩(s_1, . . . , s_i, r_{i+1}, . . . , r_{c′}) = 1] } ,     (7.3)

where i = 0, . . . , c′. Indeed, p_{x,0} is just the maximal acceptance probability of V on input x, whereas p_{x,c′} is the maximal acceptance probability of V′ on input x (where in both cases the maximum is taken over all provers).
Now, let P̄ be the prover that maximizes p_{x,c′}. For any i ∈ [c′] and prover P, we think of P as a concatenation P^{1...i} ∘ P^{i+1...c′},50 and for any i ∈ {0, . . . , c′} we denote

(p_{x,i} | P̄) = max_{P^{i+1...c′}} Pr_{s_1,...,s_i, r_{i+1},...,r_{c′}}[⟨V_i, P̄^{1...i} ∘ P^{i+1...c′}, x⟩(s_1, . . . , s_i, r_{i+1}, . . . , r_{c′}) = 1] .

A hybrid argument. Assume that p_{x,c′} = (p_{x,c′} | P̄) ≥ 1/2. Since x ∉ L, we have that (p_{x,0} | P̄) ≤ 1/3. It follows that

1/6 < (p_{x,c′} | P̄) − (p_{x,0} | P̄) = Σ_{i∈[c′]} [(p_{x,i} | P̄) − (p_{x,i−1} | P̄)] ,

and hence for some i ∈ [c′] it holds that (p_{x,i} | P̄) − (p_{x,i−1} | P̄) > 1/20c′. Observe that

( p x,i | P̄) − ( p x,i−1 | P̄)


 hD E i
0
= max0 Pr Vi , P̄1...i ◦ Pi+1...c , x (s1 , . . . , si , ri+1 , . . . , rc0 ) = 1
Pi+1...c s1 ,...,si ,ri+1 ,...,rc0
 hD E i
1...i −1 i...c0
− max0 Pr Vi−1 , P̄ ◦ P , x ( s 1 , . . . , s i −1 , r i , . . . , r c 0 ) = 1
Pi...c s1 ,...,si−1 ,ri ,...,rc0
 hD E i
1...i −1 i...c0
≥ max0 Pr Vi , P̄ ◦ P , x ( s 1 , . . . , s i , r i +1 , . . . , r c 0 ) = 1
Pi...c s1 ,...,si ,ri+1 ,...,rc0
 hD E i
1...i −1 i...c0
− max0 Pr Vi−1 , P̄ ◦ P , x ( s 1 , . . . , s i −1 , r i , . . . , r c 0 ) = 1
Pi...c s1 ,...,si−1 ,ri ,...,rc0
  hD E i
1...i −1 i...c0
= E max0 Pr Vi , P̄ ◦ P , x ( s 1 , . . . , s i , r i +1 , . . . , r c 0 ) = 1
s1 ,...si Pi...c ri+1 ,...,rc0
  hD E i
0
− E max0 Pr Vi−1 , P̄1...i−1 ◦ Pi...c , x (s1 , . . . , si−1 , ri , . . . , rc0 ) = 1 ,
s1 ,...,si−1 ,ri Pi...c ri+1 ,...,rc0
(7.4)

where the last equality is justified using precisely the same argument as in Claim 7.5.3.1.
Defining a distinguisher. For any s̄ = (s_1, . . . , s_{i−1}) and r_i ∈ {0, 1}^N, denote

p_s̄(r_i) = max_{P^{i...c′}} Pr_{r_{i+1},...,r_{c′}}[⟨V_{i−1}, P̄^{1...i−1} ∘ P^{i...c′}, x⟩(s̄, r_i, . . . , r_{c′}) = 1] ,

and observe that Eq. (7.4) can thus be rewritten as

E_s̄[ E_{s_i∈[N̄]}[p_s̄(G^{f_n}(s_i))] − E_{r_i∈{0,1}^N}[p_s̄(r_i)] ] > 1/20c′ .

We fix s̄ = (s_1, . . . , s_{i−1}) such that the expected difference above is attained. Now, using Lemma 7.5.3.4 with x = p_s̄(G^{f_n}(u_{[N̄]})) and y = p_s̄(u_N) and e = 1/20c′, there exists τ ∈ {0, e/4, 2e/4, . . . , 1} such that

Pr[p_s̄(G^{f_n}(u_{[N̄]})) ≥ τ + e/4] − Pr[p_s̄(u_N) ≥ τ] > e/2 .     (7.5)
50 Recall that, as in the proof of Theorem 7.5, we think of P as a sequence of c′ functions, mapping transcripts to N-bit responses.

Now, consider the following promise problem:

Y = { (x, r_i, π_{s̄,p̄}, τ) : p_s̄(r_i) ≥ τ + e/4 }
N = { (x, r_i, π_{s̄,p̄}, τ) : p_s̄(r_i) ≤ τ } .

Note that by Eq. (7.5), we have that

Pr[G^{f_n}(u_{[N̄]}) ∈ Y] − Pr[u_N ∉ N] > e/2 .

We further argue that (Y, N) is in pr-AMTIME^[c][O(n)], via the following protocol U. (Note that the input length for this problem is N, and thus the running time O(N) that we will prove is linear in the input length.) The protocol U simulates the protocol underlying p_s̄ (i.e., it interacts with its prover using random coins and applies the final predicate that V applies to the interaction, when the prefix is π) constantly many times in parallel, to estimate its acceptance probability, with high probability, up to an error of e/8. The procedure accepts if and only if its estimate is more than τ + e/8. Note that U is indeed an AMTIME^[c] protocol that runs in linear time, uses a linear number of advice bits, and solves (Y, N).
A reconstruction argument. Let D be a protocol for the pr-AMTIME^[c][O(n)] problem above. Consider the reconstruction algorithm R from Proposition 7.10, with the advice string adv and s ∈ N and α > 0 that correspond to (Y, N) above. Our protocol for the hard function f will simulate R, and resolve its queries by simulating D with our prover.
By the completeness of R, for every z there exists w for which, with high probability over R's random coins, any sequence of answers to the oracle queries that agrees with (Y, N) is (s, α)-valid, and any sequence of answers that is (s, α)-indicative of a sequence that agrees with (Y, N) causes R to output f_z. Thus, the prover can simply convince the verifier of the veracity of all queries that are Y-instances: In this case, regardless of what the protocol answers on N-instances, the sequence of answers agrees with (Y, N) and is thus (s, α)-indicative of itself, causing R to output f_z.
To establish the soundness of the protocol, denote the sequence of answers to the queries of R by d_1, . . . , d_t̄, and observe that with high probability, for every query q_i we have d_i = 1 ⇒ q_i ∉ N. Then, by Proposition 7.10, with high probability the following holds: If the sequence of answers to the verifier's queries is (s, α)-deficient, then R outputs ⊥; and otherwise, the answers are (s, α)-indicative of a sequence of answers that agrees with (Y, N), in which case we are in the soundness case and the output is either f_z or ⊥.
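For intuition only, here is a plausible toy reading of the (s, α)-validity condition used above (the precise definition accompanies Proposition 5.2 and is not restated in this section, so the block structure below is an assumption of this sketch): every consecutive length-s block of oracle answers should contain at least an α-fraction of 1's.

```python
def is_s_alpha_valid(answers, s, alpha):
    """Toy sketch: split the answer sequence into consecutive blocks of
    length s and require that every block contains at least an alpha
    fraction of 1-answers.  Illustrative only; the paper's exact
    definition of (s, alpha)-validity may differ in its details."""
    assert len(answers) % s == 0  # s divides the number of answers (s | t̄)
    blocks = [answers[k:k + s] for k in range(0, len(answers), s)]
    return all(sum(block) / s >= alpha for block in blocks)
```

Under such a reading, a deficient answer sequence is one with a block falling below the α threshold, which is exactly the case in which R is required to output ⊥.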
The procedure above is an MATIME^[c+1] protocol (since we are using the first round to receive w, and then c rounds to simulate D), and it runs in time

|f_n|^{1−δ′} + N^{e/10} · O(N) < |f_n|^{1−δ} ,

where |f_n|^{1−δ′} is the complexity of R, N^{e/10} is the number of queries, and O(N) is the length of each query, relying on a sufficiently small choice of δ > 0. Similarly, the total number of advice bits that it uses is |f_n|^{1−δ′} + O(N) < |f_n|^{1−δ}. This contradicts the hardness of Lhard.

By combining Theorems 7.11 and 7.5, we now show that under strong hardness assumptions we can simulate any AMTIME^[c] protocol by a deterministic doubly efficient argument system, with essentially no time overhead.

Corollary 7.12 (simulating proof systems by deterministic doubly efficient argument systems; Theorem 1.8, restated). For every α, β, e ∈ (0, 1) there exist η, δ > 0 such that for every c ∈ N the following holds. Assume that:

1. There exists Lhard ∉ MATIME^[c+1][2^{(1−δ)·n}]/2^{(1−δ)·n} such that given n ∈ N, the truth-table of Lhard on n-bit inputs can be printed in time 2^{(1+e/3)·n}.

2. The (N ↦ K, K^β, α)-non-batch-computability assumption holds for time T′ with oracle access to pr-AMTIME_2^[c][n] on inputs of length Θ(T^{1+e/2} · R), where R = (1 + e) · log(T) and N = n + (c · ⌈c/2⌉) · T^{1+e/2} · R and K = R^{1/η} and T′ = T^{1+e/2} · K · R.

Then, deIP^[c][T] ⊆ deDARG[T^{1+e}].
Proof. Let L ∈ deIP^[c][T]. By the first hypothesis and Theorem 7.11, we have that L ∈ deIP^[c][T̄, R] for T̄ = T^{1+e/2} and R = (1 + e/2) · log(T). And by the second hypothesis and Theorem 7.5 with time bound T̄ and randomness R, we have that L ∈ deDARG[T̄ · poly(R)] ⊆ deDARG[T^{1+e}].

As in Remark 7.7, the derandomization in Corollary 7.12 extends to the setting


in which the original honest prover (in the probabilistic proof system) was efficient
only when given a witness in a relation, and in this case the honest prover in the
argument system is also efficient only when given a witness in the same relation.
(This is because, similarly to Remark 7.7, the honest prover in the argument system
underlying Theorem 7.11 is identical to the original honest prover.)

Acknowledgements
The authors are very grateful to Ron Rothblum, for explaining to us some of the
possibilities for batch-computing functions with doubly efficient interactive proofs.
We also thank Avi Wigderson, Ryan Williams, Oded Goldreich, Vinod Vaikuntanathan
and Alex Lombardi for useful discussions. The first author thanks Avi Wigderson for
hosting him at the IAS for a visit in which most of this work was carried out.
The first author is supported by NSF CCF-2127597 and an IBM Fellowship. Part
of this work was conducted while the second author was at DIMACS and partially
supported by the National Science Foundation under grant number CCF-1445755.

References
[AB09] Sanjeev Arora and Boaz Barak. Computational complexity: A modern ap-
proach. Cambridge University Press, Cambridge, 2009.
[AKS04] Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. “PRIMES is in P”.
In: Annals of Mathematics. Second Series 160.2 (2004), pp. 781–793.
[AKS19] Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. “Errata: PRIMES is
in P [ MR2123939]”. In: Annals of Mathematics. Second Series 189.1 (2019),
pp. 317–318.
[BGH+05] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and
Salil P. Vadhan. “Short PCPs Verifiable in Polylogarithmic Time”. In: 20th
Annual IEEE Conference on Computational Complexity (CCC 2005), 11-15
June 2005, San Jose, CA, USA. IEEE Computer Society, 2005, pp. 120–134.

[BM88] László Babai and Shlomo Moran. “Arthur-Merlin games: a randomized
proof system, and a hierarchy of complexity classes”. In: Journal of Com-
puter and System Sciences 36.2 (1988), pp. 254–276.
[BSGH+06] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and
Salil Vadhan. “Robust PCPs of proximity, shorter PCPs and applications
to coding”. In: SIAM Journal of Computing 36.4 (2006), pp. 889–974.
[CT21a] Lijie Chen and Roei Tell. “Hardness vs Randomness, Revised: Uniform,
Non-Black-Box, and Instance-Wise”. In: Proc. 62nd Annual IEEE Sympo-
sium on Foundations of Computer Science (FOCS). 2021.
[CT21b] Lijie Chen and Roei Tell. “Simple and fast derandomization from very
hard functions: Eliminating randomness at almost no cost”. In: Proc. 53st
Annual ACM Symposium on Theory of Computing (STOC). 2021.
[DMO+20] Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman. “Nearly
Optimal Pseudorandomness From Hardness”. In: Proc. 61st Annual IEEE
Symposium on Foundations of Computer Science (FOCS). 2020.
[FF93] Joan Feigenbaum and Lance Fortnow. “Random-self-reducibility of com-
plete sets”. In: SIAM Journal of Computing 22.5 (1993), pp. 994–1005.
[FGM+89] Martin Fürer, Oded Goldreich, Yishay Mansour, Michael Sipser, and
Stathis Zachos. “On Completeness and Soundness in Interactive Proof
Systems”. In: Advances in Computing Research 5 (1989).
[FLM+05] Lance Fortnow, Richard Lipton, Dieter van Melkebeek, and Anastasios
Viglas. “Time-space lower bounds for satisfiability”. In: Journal of the
ACM 52.6 (2005), pp. 835–865.
[GKR15] Shafi Goldwasser, Yael Tauman Kalai, and Guy N. Rothblum. “Delegat-
ing computation: interactive proofs for muggles”. In: Journal of the ACM
62.4 (2015), Art. 27, 64.
[GLR+91] Peter Gemmell, Richard J. Lipton, Ronitt Rubinfeld, Madhu Sudan, and
Avi Wigderson. “Self-Testing/Correcting for Polynomials and for Ap-
proximate Functions”. In: Proceedings of the 23rd Annual ACM Symposium
on Theory of Computing, May 5-8, 1991, New Orleans, Louisiana, USA. Ed.
by Cris Koutsougeras and Jeffrey Scott Vitter. ACM, 1991, pp. 32–42.
[Gol08] Oded Goldreich. Computational Complexity: A Conceptual Perspective. New
York, NY, USA: Cambridge University Press, 2008.
[Gol18] Oded Goldreich. “On doubly-efficient interactive proof systems”. In: Foundations and Trends in Theoretical Computer Science 13.3 (2018), front matter, 1–89.
[GR18] Oded Goldreich and Guy N. Rothblum. “Simple doubly-efficient inter-
active proof systems for locally-characterizable sets”. In: Proc. 9th Con-
ference on Innovations in Theoretical Computer Science (ITCS). Vol. 94. 2018,
Art. No. 18, 19.
[GSTS03] Dan Gutfreund, Ronen Shaltiel, and Amnon Ta-Shma. “Uniform hard-
ness versus randomness tradeoffs for Arthur-Merlin games”. In: Compu-
tational Complexity 12.3-4 (2003), pp. 85–130.

[GW14] Oded Goldreich and Avi Wigderson. “On derandomizing algorithms that err extremely rarely”. In: Proc. 46th Annual ACM Symposium on Theory of Computing (STOC). Full version available online at Electronic Colloquium on Computational Complexity: ECCC, 20:152 (Rev. 2), 2013. 2014, pp. 109–118.
[IKW02] Russell Impagliazzo, Valentine Kabanets, and Avi Wigderson. “In search
of an easy witness: exponential time vs. probabilistic polynomial time”.
In: Journal of Computer and System Sciences 65.4 (2002), pp. 672–694.
[KM98] Adam Klivans and Dieter van Melkebeek. “Graph Nonisomorphism has
Subexponential Size Proofs Unless the Polynomial-Time Hierarchy Col-
lapses”. In: Electronic Colloquium on Computational Complexity: ECCC 5
(1998), p. 75.
[LFK+92] Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. “Alge-
braic methods for interactive proof systems”. In: Journal of the Association
for Computing Machinery 39.4 (1992), pp. 859–868.
[MNT93] Yishay Mansour, Noam Nisan, and Prasoon Tiwari. “The computational complexity of universal hashing”. In: Theoretical Computer Science 107.1 (1993), pp. 121–133.
[MV05] Peter Bro Miltersen and N. V. Vinodchandran. “Derandomizing Arthur-
Merlin games using hitting sets”. In: Computational Complexity 14.3 (2005),
pp. 256–279.
[NW94] Noam Nisan and Avi Wigderson. “Hardness vs. randomness”. In: Journal
of Computer and System Sciences 49.2 (1994), pp. 149–167.
[RR97] Alexander A. Razborov and Steven Rudich. “Natural proofs”. In: Journal
of Computer and System Sciences 55.1, part 1 (1997), pp. 24–35.
[RRR18] Omer Reingold, Guy N. Rothblum, and Ron D. Rothblum. “Efficient
Batch Verification for UP”. In: Proc. 33rd Annual IEEE Conference on Com-
putational Complexity (CCC). 2018, 22:1–22:23.
[RRR21] Omer Reingold, Guy N. Rothblum, and Ron D. Rothblum. “Constant-round interactive proofs for delegating computation”. In: SIAM Journal on Computing 50.3 (2021), STOC16–255–STOC16–340.
[RRV02] Ran Raz, Omer Reingold, and Salil Vadhan. “Extracting all the random-
ness and reducing the error in Trevisan’s extractors”. In: Journal of Com-
puter and System Sciences 65.1 (2002), pp. 97–128.
[Sha92] Adi Shamir. “IP = PSPACE”. In: Journal of the ACM 39.4 (1992), pp. 869–
877.
[Sip88] Michael Sipser. “Expanders, randomness, or time versus space”. In: Jour-
nal of Computer and System Sciences 36.3 (1988), pp. 379–383.
[STV01] Madhu Sudan, Luca Trevisan, and Salil Vadhan. “Pseudorandom genera-
tors without the XOR lemma”. In: Journal of Computer and System Sciences
62.2 (2001), pp. 236–266.
[SU05] Ronen Shaltiel and Christopher Umans. “Simple extractors for all min-
entropies and a new pseudorandom generator”. In: Journal of the ACM
52.2 (2005), pp. 172–216.
[SU06] Ronen Shaltiel and Christopher Umans. “Pseudorandomness for Approximate Counting and Sampling”. In: Computational Complexity 15.4 (2006), pp. 298–341.
[SU07] Ronen Shaltiel and Christopher Umans. “Low-end uniform hardness vs.
randomness tradeoffs for AM”. In: Proc. 39th Annual ACM Symposium on
Theory of Computing (STOC). 2007, pp. 430–439.
[Tel21] Roei Tell. “How to Find Water in the Ocean: A Survey on Quantified
Derandomization”. In: Electronic Colloquium on Computational Complexity:
ECCC 28 (2021), p. 120.
[Tha22] Justin Thaler. Proofs, Arguments, and Zero-Knowledge. Accessed at https://people.cs.georgetown.edu/jthaler/ProofsArgsAndZK.html, March 28, 2022. 2022.
[Tou01] Iannis Tourlakis. “Time-space tradeoffs for SAT on nonuniform machines”.
In: Journal of Computer and System Sciences 63.2 (2001), pp. 268–287.
[TSZS06] Amnon Ta-Shma, David Zuckerman, and Shmuel Safra. “Extractors from
Reed-Muller codes”. In: Journal of Computer and System Sciences 72.5 (2006),
pp. 786–812.
[WB86] Lloyd R Welch and Elwyn R Berlekamp. Error correction for algebraic block
codes. US Patent 4,633,470. 1986.
[Wil13] Ryan Williams. “Improving Exhaustive Search Implies Superpolynomial Lower Bounds”. In: SIAM Journal on Computing 42.3 (2013), pp. 1218–1244.
[Wil16] Richard Ryan Williams. “Strong ETH breaks with Merlin and Arthur:
short non-interactive proofs of batch evaluation”. In: Proc. 31st Annual
IEEE Conference on Computational Complexity (CCC). Vol. 50. 2016, Art.
No. 2, 17.
[Žák83] Stanislav Žák. “A Turing machine time hierarchy”. In: Theoretical Com-
puter Science 26.3 (1983), pp. 327–333.
A Useful properties are necessary for derandomization of MA
In this appendix we prove that useful properties that are constructive in near-linear
time are necessary for derandomization of MA. We first prove that infinitely often
useful properties are necessary for any derandomization of MA, following the proof
approach of Williams [Wil13; Wil16]; and then we prove that useful properties (in the
standard sense, i.e. not only infinitely often) are necessary for black-box superfast
derandomization of MA.

Definition A.1 (io-useful property; compare with Definition 3.3). We say that a collection 𝓛 ⊆ {0,1}* of strings is a C′-constructive property io-useful against C if for every N = 2^n it holds that 𝓛_N = 𝓛 ∩ {0,1}^N ≠ ∅, and 𝓛 ∈ C′, and for every L ∈ C there are infinitely many n ∈ ℕ such that L_n ∉ 𝓛_{2^n}, where L_n ∈ {0,1}^{2^n} is the truth-table of L on n-bit inputs.

Proposition A.2 (io-useful properties are necessary for derandomization of MA). If every unary language in MA is also in NP, then there is a quasilinear-time-constructive property io-useful against circuits of size 2^{ε·n}, for some constant ε > 0.

Proof. We first use the well-known proof approach of [Wil13; Wil16] to argue that witnesses for a certain unary language in NTIME[2^n] have exponential circuit complexity, infinitely often; we then follow [Wil16] in observing that the latter statement yields an io-useful property for exponential-sized circuits. Details follow.

Let L be a unary language in NTIME[2^n] \ NTIME[2^{n/2}] (see [Žák83]), and let V be a PCP verifier for L with running time poly(n), randomness n + O(log n), perfect completeness, and soundness error 1/3 (see [BSGH+06]). Consider an MA verifier V^{PCP} that on n-bit inputs guesses a circuit C of size 2^{ε·n}, where ε > 0 is sufficiently small, and runs V while resolving its oracle queries using C.

Assume towards a contradiction that for every sufficiently large n ∈ ℕ, if 1^n ∈ L then there is w ∈ {0,1}^{2^n} that is the truth-table of a function computable by circuits of size 2^{ε·n} such that Pr[V(1^n, w) = 1] = 1. In this case, V^{PCP} is an MA verifier that decides L in time Õ(2^{ε·n}), and thus (by our hypothesis, and relying on ε > 0 being small) it holds that L ∈ NTIME[2^{n/2}], a contradiction.

Thus, there is an infinite set S ⊆ ℕ such that for every n ∈ S there exists w ∈ {0,1}^{2^n} such that Pr[V(1^n, w) = 1] = 1, and every w satisfying this condition is the truth-table of a function whose circuit complexity is more than 2^{ε·n}. Our io-useful property consists of all such w's, for every n ∈ S. The constructive algorithm for the property gets f_n ∈ {0,1}^{2^n} and enumerates over the randomness of V to compute Pr[V(1^n, f_n) = 1]; if the latter value is 1 then it accepts, and otherwise it rejects. Indeed, the property is non-trivial infinitely often, its truth-tables are hard for circuits of size 2^{ε·n}, and it is constructive in time Õ(2^n).
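The constructive algorithm above is nothing more than brute-force derandomization: enumerate all of V's coin sequences and count the accepting ones. A minimal sketch in Python, with a hypothetical toy verifier standing in for the PCP verifier V (nothing here is the actual construction; the real V has randomness n + O(log n), which is what yields running time Õ(2^n)):

```python
from itertools import product

def acceptance_probability(verifier, x, witness, num_coins):
    """Compute the verifier's acceptance probability exactly, by enumerating
    all 2^num_coins coin sequences (this exhaustive enumeration is what makes
    the constructive algorithm deterministic)."""
    accepts = sum(
        1 for coins in product((0, 1), repeat=num_coins)
        if verifier(x, witness, coins)
    )
    return accepts / 2 ** num_coins

# Hypothetical toy verifier: uses its coins to pick a position in the
# witness and accepts iff the bit there equals 1.
def toy_verifier(x, witness, coins):
    index = int("".join(map(str, coins)), 2) % len(witness)
    return witness[index] == 1

full_acceptance = acceptance_probability(toy_verifier, "1111", [1, 1, 1, 1], 3)
```

The property's constructive algorithm accepts a candidate truth-table exactly when this probability equals 1.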
We now prove that useful properties (not only “infinitely often” useful ones) that are constructive in near-linear time are necessary for any superfast derandomization of MA that uses non-deterministic PRGs (NPRGs). The proof follows the standard transformation of PRGs into hard functions, adapted to our setting: we replace PRGs with NPRGs, and hard functions with useful properties.

Definition A.3 (non-deterministic PRG). A non-deterministic machine M is a non-deterministic ε-PRG (ε-NPRG, in short) for a circuit class C if for every n ∈ ℕ, when M is given input 1^n it satisfies the following:

1. There exists a non-deterministic guess such that M prints a set S ⊆ {0,1}^n for which there is no circuit on n inputs in C that is an ε-distinguisher.^{51}

2. For every non-deterministic guess, either M prints a set S ⊆ {0,1}^n for which there is no circuit on n inputs in C that is an ε-distinguisher, or M outputs ⊥.
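Spelled out, the distinguisher condition (footnote 51) compares a circuit's acceptance probability on the printed set S against its acceptance probability on a uniformly random input. A brute-force sketch, feasible only for toy values of n (here `circuit` is any Boolean predicate on n-bit tuples, and all names are illustrative):

```python
from itertools import product

def is_eps_distinguisher(circuit, S, n, eps):
    """True iff |Pr_{s in S}[C(s) = 1] - Pr_{x in {0,1}^n}[C(x) = 1]| > eps."""
    p_on_S = sum(1 for s in S if circuit(s)) / len(S)
    p_on_uniform = sum(1 for x in product((0, 1), repeat=n) if circuit(x)) / 2 ** n
    return abs(p_on_S - p_on_uniform) > eps

# Toy example: S is the set of 4-bit strings starting with 1, so the circuit
# that reads the first bit distinguishes S from uniform with advantage 1/2,
# while a constant circuit has no advantage at all.
n = 4
S = [x for x in product((0, 1), repeat=n) if x[0] == 1]
first_bit = lambda x: x[0] == 1
```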

Proposition A.4 (useful properties are necessary for derandomization with NPRGs). For every ε > 0 there exists δ > 0 such that the following holds. If there is a (1/10)-NPRG for linear-sized circuits that is computable in time n^{1+ε}, then there is an NTIME[N^{1+3ε}]-computable property L useful against circuits of size 2^{(1−δ)·n}.

Proof. For every n ∈ ℕ, let S_n be the collection of all possible sets S that the NPRG can print on input 1^n (i.e., when considering all non-deterministic guesses that do not cause M to output ⊥). Fix S ∈ S_n, and note that |S| ≤ n^{1+ε} (due to the running time of M). For ℓ(n) = ⌈(1 + 2ε) · log(n)⌉, let f_S : {0,1}^ℓ → {0,1} be such that f_S(x) = 1 if and only if x is a prefix of some z ∈ S. Note that there is no circuit C of size O(n) = 2^{Ω(ℓ/(1+2ε))} = 2^{(1−δ)·ℓ} that computes f_S (otherwise we could use C to construct a circuit that avoids S but has high acceptance probability).

For every ℓ ∈ ℕ, we include in L all the functions f_S obtained from S ∈ S_n such that ℓ = ℓ(n). The argument above shows that L is useful against circuits of size 2^{(1−δ)·ℓ}, and its non-triviality follows since the NPRG works on all input lengths. Finally, the truth-tables of the f_S's included in L can be recognized in non-deterministic time Õ(N^{1+2ε}) < N^{1+3ε}, by computing the NPRG.

^{51} That is, for every C ∈ C on n input bits it holds that Pr_{s∈S}[C(s) = 1] ∈ Pr_{x∈{0,1}^n}[C(x) = 1] ± ε.

Indeed, the proof above also works with the weaker notion of a non-deterministic HSG (NHSG); we formulated it using NPRGs merely because the latter notion is better known.
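The prefix function f_S from the proof of Proposition A.4 is simple to write down explicitly. A sketch with a hypothetical toy set S in place of the NPRG's output (note that each z ∈ S contributes exactly one length-ℓ prefix, so f_S accepts at most |S| of the 2^ℓ inputs; this sparsity is what lets a small circuit for f_S be turned into a circuit that avoids S):

```python
from itertools import product

def f_S(x, S):
    """f_S(x) = 1 iff x is a prefix of some z in S (strings as bit tuples)."""
    return any(z[:len(x)] == x for z in S)

def truth_table(S, ell):
    """Truth-table of f_S on ell-bit inputs, in lexicographic order."""
    return [f_S(x, S) for x in product((0, 1), repeat=ell)]

# Toy stand-in for the NPRG's output set on 6-bit strings, with ell = 3
# playing the role of ceil((1 + 2*eps) * log n).
S = [(0, 0, 1, 0, 1, 1), (1, 0, 1, 1, 0, 0), (1, 1, 0, 0, 0, 1)]
tt = truth_table(S, ell=3)
```

Here `tt` has ones exactly at the 3-bit prefixes of the strings in S, so the number of ones never exceeds |S|.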

B Proof of Lemma 7.5.3.4
We now restate Lemma 7.5.3.4 from the proof of Theorem 7.5 and prove it.

Lemma B.1. Let x, y be two random variables taking values in [0,1], let ε ∈ (0,1) be such that ε^{−1} ∈ ℕ, and let τ = 4/ε. If E[x] − E[y] > ε, then there exists ℓ ∈ [τ] such that

Pr[x ≥ (ℓ+1)/τ] − Pr[y ≥ ℓ/τ] > ε/2.

Proof. For every i ∈ {0, 1, . . . , τ + 1}, let

p_i = Pr[x ∈ [i/τ, (i+1)/τ)]   and   q_i = Pr[y ∈ [i/τ, (i+1)/τ)].

Note that p_{τ+1} = q_{τ+1} = 0 (we define them purely for notational convenience), and that ∑_{i=0}^{τ} p_i = ∑_{i=0}^{τ} q_i = 1. Also, we have that

E[x] ∈ [ ∑_{i=0}^{τ} p_i · (i/τ) , τ^{−1} + ∑_{i=0}^{τ} p_i · (i/τ) ),
E[y] ∈ [ ∑_{i=0}^{τ} q_i · (i/τ) , τ^{−1} + ∑_{i=0}^{τ} q_i · (i/τ) ),

which by our hypothesis E[x] − E[y] > ε implies that

∑_{i=0}^{τ} p_i · (i/τ) − ∑_{i=0}^{τ} q_i · (i/τ) > ε − τ^{−1}.

Now, denote p_{≥ℓ} = ∑_{j=ℓ}^{τ} p_j and q_{≥ℓ} = ∑_{j=ℓ}^{τ} q_j. Then, we have that

∑_{i=0}^{τ} p_i · (i/τ) = ∑_{ℓ=1}^{τ} p_{≥ℓ}/τ = E_{ℓ∈[τ]}[p_{≥ℓ}],
∑_{i=0}^{τ} q_i · (i/τ) = E_{ℓ∈[τ]}[q_{≥ℓ}],

and therefore

E_{ℓ∈[τ]}[p_{≥ℓ} − q_{≥ℓ}] > ε − τ^{−1}.

Next, observe that

E_{ℓ∈[τ]}[p_{≥ℓ+1}] = E_{ℓ∈[τ]}[p_{≥ℓ}] − E_{ℓ∈[τ]}[p_ℓ] ≥ E_{ℓ∈[τ]}[p_{≥ℓ}] − 1/τ.

Putting these together, we have

E_{ℓ∈[τ]}[p_{≥ℓ+1} − q_{≥ℓ}] > ε − 2τ^{−1} = ε/2,

and hence there exists ℓ ∈ [τ] such that p_{≥ℓ+1} − q_{≥ℓ} > ε/2.
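The threshold-shifting argument in Lemma B.1 can be sanity-checked numerically for finitely supported x and y. A sketch, with distributions given as value→probability dictionaries (all helper names are hypothetical):

```python
def expectation(dist):
    """E of a finitely supported distribution given as {value: prob}."""
    return sum(v * p for v, p in dist.items())

def tail(dist, t):
    """Pr[variable >= t]."""
    return sum(p for v, p in dist.items() if v >= t)

def find_good_threshold(x, y, eps):
    """Search for ell in [tau] with Pr[x >= (ell+1)/tau] - Pr[y >= ell/tau] > eps/2;
    Lemma B.1 guarantees one exists whenever E[x] - E[y] > eps, for tau = 4/eps."""
    tau = round(4 / eps)
    assert expectation(x) - expectation(y) > eps
    for ell in range(1, tau + 1):
        if tail(x, (ell + 1) / tau) - tail(y, ell / tau) > eps / 2:
            return ell
    return None  # would contradict the lemma

# Toy instance: E[x] = 0.9 and E[y] = 0.3, so the gap exceeds eps = 0.25 (tau = 16).
x = {0.9: 1.0}
y = {0.1: 0.75, 0.9: 0.25}
ell = find_good_threshold(x, y, eps=0.25)
```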
ECCC ISSN 1433-8092

https://eccc.weizmann.ac.il
