Variational quantum algorithms

M. Cerezo,1, 2, 3, ∗ Andrew Arrasmith,1, 3 Ryan Babbush,4 Simon C. Benjamin,5 Suguru Endo,6 Keisuke Fujii,7, 8, 9
Jarrod R. McClean,4 Kosuke Mitarai,7, 10, 11 Xiao Yuan,12, 13 Lukasz Cincio,1, 3 and Patrick J. Coles1, 3, †
1 Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2 Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
3 Quantum Science Center, Oak Ridge, TN 37931, USA
4 Google Quantum AI Team, Venice, CA 90291, United States of America
5 Department of Materials, University of Oxford, Parks Road, Oxford OX1 3PH, United Kingdom
6 NTT Secure Platform Laboratories, NTT Corporation, Musashino, Tokyo 180-8585, Japan
7 Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
8 Center for Quantum Information and Quantum Biology, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Osaka 560-8531, Japan
9 Center for Emergent Matter Science, RIKEN, Saitama 351-0198, Japan
10 Center for Quantum Information and Quantum Biology, Institute for Open and Transdisciplinary Research Initiatives, Osaka 560-8531, Japan
11 JST, PRESTO, Saitama 332-0012, Japan
12 Center on Frontiers of Computing Studies, Department of Computer Science, Peking University, Beijing 100871, China
13 Stanford Institute for Theoretical Physics, Stanford University, Stanford California 94305, USA

arXiv:2012.09265v2 [quant-ph] 4 Oct 2021

Applications such as simulating complicated quantum systems or solving large-scale linear algebra
problems are very challenging for classical computers due to the extremely high computational cost.
Quantum computers promise a solution, although fault-tolerant quantum computers will likely not
be available in the near future. Current quantum devices have serious constraints, including limited
numbers of qubits and noise processes that limit circuit depth. Variational Quantum Algorithms
(VQAs), which use a classical optimizer to train a parametrized quantum circuit, have emerged as
a leading strategy to address these constraints. VQAs have now been proposed for essentially all
applications that researchers have envisioned for quantum computers, and they appear to be the best
hope for obtaining quantum advantage. Nevertheless, challenges remain including the trainability,
accuracy, and efficiency of VQAs. Here we overview the field of VQAs, discuss strategies to overcome
their challenges, and highlight the exciting prospects for using them to obtain quantum advantage.

∗ e-mail: [email protected]
† e-mail: [email protected]

I. INTRODUCTION

Quantum computing holds promise for a number of applications that have motivated the decades-long quest to build the necessary physical hardware. For example, with an exponential speedup over classical methods, quantum algorithms could factor numbers [1], simulate quantum systems [2], or solve linear systems of equations [3]. In 2016, access to the first cloud-based quantum computer [4] became available, but noise and qubit limitations prevented serious implementations of the aforementioned quantum algorithms [5]. However, excitement grew as to what could be done with these new devices, which have been called Noisy Intermediate-Scale Quantum (NISQ) computers [6]. Current state-of-the-art device size ranges from 50 to 100 qubits, which allows one to achieve 'quantum supremacy': outperforming the best classical supercomputer, for certain contrived mathematical tasks [7, 8].

Nevertheless, the true promise of quantum computers, speedup for practical applications, which is often called quantum advantage, has yet to be realized. Moreover, the availability of fault-tolerant quantum computers appears to still be many years, or even decades, away. The key technological question is therefore how to make best use of today's NISQ devices to achieve quantum advantage. Any such strategy must account for: limited numbers of qubits, limited connectivity of the qubits, and coherent and incoherent errors that limit quantum circuit depth.

Variational Quantum Algorithms (VQAs) have emerged as the leading strategy to obtain quantum advantage on NISQ devices. Accounting for all of the constraints imposed by NISQ computers with a single strategy requires an optimization-based or learning-based approach, precisely what VQAs use. VQAs are arguably the quantum analog of highly successful machine-learning methods, such as neural networks. Moreover, VQAs leverage the toolbox of classical optimization, since VQAs use parametrized quantum circuits to be run on the quantum computer, and then outsource the parameter optimization to a classical optimizer. This approach has the added advantage of keeping the quantum circuit depth shallow and hence mitigating noise, in contrast to quantum algorithms developed for the fault-tolerant era.

VQAs have already been considered for a plethora of applications (see Figure 3), covering essentially all of the applications that researchers had envisioned for quantum computers. Although they may be the key to obtaining near-term quantum advantage, VQAs still face important challenges, including their trainability, accuracy, and efficiency. In this Review, we discuss the exciting prospects for VQAs, and we highlight the challenges that must be overcome to obtain the ultimate goal of quantum advantage.

II. BASIC CONCEPTS AND TOOLS

One of the main advantages of VQAs is that they provide a general framework that can be used to solve a variety of problems. Although this versatility translates into different algorithmic structures with different levels of complexity, there are basic elements that most (if not all) VQAs have in common. In this section we review the building blocks of VQAs.

Let us start by considering a task one wishes to solve. This implies having access to a description of the problem, and also possibly to a set of training data. As schematically shown in Fig. 1, the first step to developing a VQA is to define a cost (or loss) function C which encodes the solution to the problem. One then proposes an ansatz, that is, a quantum operation depending on a set of continuous or discrete parameters θ that can be optimized (see below for a more in-depth discussion of ansatzes). This ansatz is then trained in a hybrid quantum-classical loop to solve the optimization task

\[ \boldsymbol{\theta}^* = \arg\min_{\boldsymbol{\theta}} C(\boldsymbol{\theta}) . \tag{1} \]

The trademark of VQAs is that they use a quantum computer to estimate the cost function C(θ) (or its gradient) while leveraging the power of classical optimizers to train the parameters θ. In what follows, we provide additional details for each step of the VQA architecture shown in Fig. 1.

FIG. 1. Schematic diagram of a Variational Quantum Algorithm (VQA). The inputs to a VQA are: a cost function C(θ), with θ a set of parameters that encodes the solution to the problem, an ansatz whose parameters are trained to minimize the cost, and (possibly) a set of training data {ρ_k} used during the optimization. Here, the cost can often be expressed in the form in Eq. (3), for some set of functions {f_k}. Also, the ansatz is shown as a parameterized quantum circuit (on the left), which is analogous to a neural network (also shown schematically on the right). At each iteration of the loop one uses a quantum computer to efficiently estimate the cost (or its gradients). This information is fed into a classical computer that leverages the power of optimizers to navigate the cost landscape C(θ) and solve the optimization problem in Eq. (1). Once a termination condition is met, the VQA outputs an estimate of the solution to the problem. The form of the output depends on the precise task at hand. The red box indicates some of the most common types of outputs.

A. Cost function

A crucial aspect of a VQA is encoding the problem into a cost function. Similar to classical machine learning, the cost function maps values of the trainable parameters θ to real numbers. More abstractly, the cost defines a hyper-surface usually called the cost landscape (see Fig. 1) such that the task of the optimizer is to navigate through the landscape and find the global minima. Without loss of generality, the cost can be expressed as

\[ C(\boldsymbol{\theta}) = f\big(\{\rho_k\}, \{O_k\}, U(\boldsymbol{\theta})\big) , \tag{2} \]

where f is some function, U(θ) is a parametrized unitary, θ is composed of discrete and continuous parameters, {ρ_k} are input states from a training set, and {O_k} is a set of observables. Often it is useful, and possible, to express the cost in the form

\[ C(\boldsymbol{\theta}) = \sum_k f_k\big( \mathrm{Tr}[O_k U(\boldsymbol{\theta}) \rho_k U^\dagger(\boldsymbol{\theta})] \big) , \tag{3} \]

for some set of functions {f_k}. Note that the task at hand will determine the choice of f in Eq. (2) or the choice of {f_k} in Eq. (3). During the optimization, one uses a finite statistic estimator of the cost or its gradients. (See below for an overview of optimizers used to train the cost function.)

Let us now discuss desirable criteria that the cost function should meet. First, the cost must be 'faithful' in
that the minimum of C(θ) corresponds to the solution of the problem. Second, one must be able to 'efficiently estimate' C(θ) by performing measurements on a quantum computer and possibly performing classical post-processing. An implicit assumption here is that the cost should not be efficiently computable with a classical computer, as this would imply that no quantum advantage can be achieved with the VQA. In addition, it is also useful for C(θ) to be 'operationally meaningful', so that smaller cost values indicate a better solution quality. Finally, the cost must be 'trainable', which means that it should be possible to efficiently optimize the parameters θ. We will later discuss in more detail the issue of trainability for VQAs.
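
To make the notion of estimating C(θ) from measurements concrete, the following minimal Python sketch simulates a finite-statistics estimate of a toy cost. It assumes, purely for illustration, a single qubit prepared as R_y(θ)|0⟩ and the observable O = Z, so that the exact cost is cos θ. The function names and the classical pseudo-random sampling of measurement outcomes are hypothetical stand-ins for a quantum device and are not taken from this Review.

```python
import math
import random


def exact_cost(theta):
    """Exact toy cost C(theta) = <Z> for the state R_y(theta)|0>, equal to cos(theta)."""
    return math.cos(theta)


def estimate_cost(theta, shots, rng):
    """Finite-shot estimate of C(theta) from simulated Z-basis measurements.

    Each shot returns the eigenvalue +1 with probability cos^2(theta/2)
    and -1 otherwise (Born rule for the assumed toy state).
    """
    p_plus = math.cos(theta / 2) ** 2
    total = 0
    for _ in range(shots):
        total += 1 if rng.random() < p_plus else -1
    return total / shots


rng = random.Random(0)  # fixed seed so that repeated runs give the same samples
theta = 1.0
for shots in (100, 10_000):
    estimate = estimate_cost(theta, shots, rng)
    print(shots, estimate, abs(estimate - exact_cost(theta)))
```

The statistical error of such an estimate decreases as the inverse square root of the number of shots, which is why the measurement budget matters for the optimizers discussed later.
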
For a given VQA to be implementable in NISQ hardware, the quantum circuits used to estimate C(θ) must keep the circuit depth and ancilla requirements small. This is due to the fact that NISQ devices are prone to gate errors, have limited qubit counts, and that these qubits have short decoherence times. Hence the construction of efficient cost evaluation circuits is an important aspect of VQA research.

FIG. 2. Schematic diagram of an ansatz. The unitary U(θ), with θ a set of parameters, can be expressed as a product of L unitaries U_l(θ_l) sequentially acting on an input state. As indicated, each unitary U_l(θ_l) can in turn be decomposed into a sequence of parametrized and unparametrized gates.

B. Ansatzes

Another important aspect of a VQA is its ansatz. Generically speaking, the form of the ansatz dictates what the parameters θ are, and hence, how they can be trained to minimize the cost. The specific structure of an ansatz will generally depend on the task at hand, as in many cases one can use information about the problem to tailor an ansatz. These are the so-called 'problem-inspired ansatzes'. However, some ansatz architectures are generic and 'problem-agnostic', meaning that they can be used even when no relevant information is readily available.

For the cost function in Eq. (3), the parameters θ can be encoded in a unitary U(θ) that is applied to the input states to the quantum circuit. As shown in Fig. 2, U(θ) can be generically expressed as the product of L sequentially applied unitaries

\[ U(\boldsymbol{\theta}) = U_L(\boldsymbol{\theta}_L) \cdots U_2(\boldsymbol{\theta}_2) U_1(\boldsymbol{\theta}_1) , \tag{4} \]

with

\[ U_l(\boldsymbol{\theta}_l) = \prod_m e^{-i\theta_m H_m} W_m . \tag{5} \]

Here W_m is an unparametrized unitary and H_m is a Hermitian operator; θ_l is the l-th element in θ. Below we describe some of the most widely used ansatzes in the literature, starting with those that can be expressed as Eq. (4), and then presenting more general architectures.

1. Hardware efficient ansatz

The hardware efficient ansatz [9] is a generic name used for ansatzes that are aimed at reducing the circuit depth needed to implement U(θ) when using a given quantum hardware. Here one uses unitaries W_m and e^{−iθ_m H_m} that are taken from a gate alphabet (set of quantum gates) determined from the connectivity and interactions specific to a quantum hardware, which avoids the circuit depth overhead arising from translating an arbitrary unitary into a sequence of gates easily implementable in a device. One of the main advantages of the hardware efficient ansatz is its versatility, as it can accommodate encoding symmetries [10, 11] and bringing correlated qubits closer for depth reduction [12], as well as being especially useful to study Hamiltonians that are similar to the device's interactions [13]. Such is the case, for instance, of local spin Hamiltonians, although in this case it has been heuristically shown that near criticality the ansatz requires depths proportional to the system size [14]. Additionally, 'layered' hardware efficient ansatzes, where gates act on alternating pairs of qubits in a brick-like structure, have been prominently used as problem-agnostic architectures. However, this ansatz can lead to trainability problems when randomly initialized.

2. Unitary coupled cluster ansatz

The Unitary Coupled Cluster (UCC) ansatz is a problem-inspired ansatz widely used in quantum chemistry problems where the goal is to obtain the ground state energy of a fermionic molecular Hamiltonian H. The UCC ansatz proposes a candidate for such ground state based on exciting some reference state |ψ_0⟩ (usually the Hartree-Fock state of H) as e^{T(θ) − T†(θ)} |ψ_0⟩. Here, T = Σ_k T_k is the cluster operator [15, 16] and T_k are excitation operators. In the so-called UCCSD ansatz (SD
stands for single and double) the summation is truncated to contain single excitations T_1 = Σ_{i,j} θ_i^j a_i† a_j, and double excitations T_2 = Σ_{i,j,k,l} θ_{i,j}^{k,l} a_i† a_j† a_k a_l, where {a_i†} ({a_i}) are fermionic creation (annihilation) operators. To implement this ansatz in a quantum computer one uses the Jordan-Wigner or the Bravyi-Kitaev transformations [17] to map the fermionic operators to spin operators, resulting in an ansatz of the form Eq. (4). There are many variants of the UCC ansatz [18], with some of them reducing the circuit depth by considering more efficient methods for compiling the fermionic operators [19–22].

3. Quantum alternating operator ansatz

The Quantum Approximate Optimization Algorithm (QAOA) was originally introduced to obtain approximate solutions for combinatorial optimization problems [23]. The ansatz used in QAOA involves an alternating structure and is often called the quantum alternating operator ansatz [24], sharing the same acronym as the algorithm (although we will use QAOA to refer to the algorithm in this Review). This ansatz was first shown to be computationally universal for certain Hamiltonians in Ref. [25], with the proof of its universality being generalized in Ref. [26] for families of ansatzes defined by sets of graphs and hyper-graphs. The ansatz in QAOA is inspired by a Trotterized adiabatic transformation where the order p of the Trotterization determines the precision of the solution. The goal of this ansatz is to map an input state |ψ_0⟩ to the ground state of a given problem Hamiltonian H_P by sequentially applying a problem unitary e^{−iγ_l H_P} and a mixer unitary e^{−iβ_l H_M}, where H_M is a Hermitian operator known as the mixing Hamiltonian [27]. Specifically, the ansatz takes the form U(γ, β) = Π_{l=1}^{p} e^{−iβ_l H_M} e^{−iγ_l H_P}, where θ = (γ, β). This ansatz is naturally of the form in Eq. (4), although decomposing these unitaries into native gates may result in a lengthy circuit due to many-body terms in H_P and limited device connectivity. One of the strengths of this ansatz is the fact that the feasible subspace for certain problems is smaller than the full Hilbert space, and this restriction may result in a better-performing algorithm.

4. Variational Hamiltonian ansatz

Inspired by the QAOA ansatz, the variational Hamiltonian ansatz also aims to prepare a trial ground state for a given Hamiltonian H = Σ_k H_k (where H_k are Hermitian operators, usually Pauli strings) by Trotterizing an adiabatic state preparation process [28]. Here, each Trotter step corresponds to a variational ansatz so that the unitary is given by U(θ) = Π_l (Π_k e^{−iθ_{l,k} H_k}), and again is of the form Eq. (4). Due to its versatility, the variational Hamiltonian ansatz has been implemented for quantum chemistry [21, 29], optimization [24], and for quantum simulation problems [30].

5. Variable structure ansatz

In many ansatzes, one optimizes over continuous parameters (such as rotation angles), while the structure of the circuit is kept fixed. Although this enables the control of the overall circuit complexity, it may miss refinements attained by optimizing the circuit structure itself, including the addition of circuit elements or the removal of unnecessary ones. Optimizing the circuit structure was initially explored in a framework called ADAPT-VQE [31], which seeks to adaptively add specific elements to the ansatz to maximize the benefit while minimizing the number of circuit elements in quantum chemistry applications. (Improvements to ADAPT-VQE and variable ansatz for quantum chemistry have been introduced in Refs. [32, 33], and a variable structure version of the QAOA ansatz was introduced in Ref. [34].) One can then view this problem as a sparse model problem, and whereas such an optimization is known to be hard, heuristic or greedy approximations that seek to add one term at a time have been shown to be helpful [31, 32].

Machine learning-aided evolutionary algorithms for circuit design have also been explored in Refs. [35, 36], where individuals (quantum circuits) from a population are upgraded to grow the circuit and explore the Hilbert space. In addition, Refs. [37–41] use tools from machine learning to develop variational ansatzes for various VQA applications. Complementary approaches based on exploring different ansatz variants simultaneously as an evolving cohort have also shown promising performance [42].

6. Sub-logical ansatz and quantum optimal control

The parameters θ are often specified at the logical circuit level (such as rotation angles); however, sometimes they have a direct translation to device-level parameters below the logical level. Hence, one can include these device-level parameters in the definition of the ansatz, as this can offer additional flexibility [43]. This approach also establishes a connection to the idea of quantum optimal control, which is often used to determine the translation from logical to physical device parameters, and which is especially applicable for quantum simulations [44, 45]. Refs. [46, 47] have explored using VQAs for the construction of optimal control sequences. Although this can increase the number of parameters, the additional flexibility may allow for on-the-fly calibration effects that have been seen to reduce the effects of coherent noise [46–48].
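
Returning briefly to the quantum alternating operator ansatz (item 3 above), its alternating structure can be illustrated with a small state-vector sketch. The example below is not taken from the cited references: it assumes a toy two-qubit problem Hamiltonian H_P = Z⊗Z, the mixer H_M = X⊗I + I⊗X, and the input state |+⟩|+⟩, and all function names are hypothetical.

```python
import cmath
import math


def apply_problem(state, gamma):
    """Apply exp(-i*gamma*Z(x)Z), which is diagonal in the computational basis."""
    out = []
    for index, amp in enumerate(state):
        q0, q1 = (index >> 1) & 1, index & 1
        zz = 1 if q0 == q1 else -1  # eigenvalue of Z(x)Z on |q0 q1>
        out.append(cmath.exp(-1j * gamma * zz) * amp)
    return out


def apply_mixer(state, beta):
    """Apply exp(-i*beta*X) = cos(beta) I - i sin(beta) X to each qubit in turn."""
    c, s = math.cos(beta), -1j * math.sin(beta)
    for bit in (0, 1):
        mask = 1 << bit
        new = list(state)
        for index in range(4):
            new[index] = c * state[index] + s * state[index ^ mask]
        state = new
    return state


def qaoa_state(gammas, betas):
    """Alternate problem and mixer unitaries, one pair per layer, starting from |+>|+>."""
    state = [0.5 + 0j] * 4
    for gamma, beta in zip(gammas, betas):
        state = apply_problem(state, gamma)
        state = apply_mixer(state, beta)
    return state


def cost(gammas, betas):
    """Expectation value <Z(x)Z> in the ansatz state (toy cost to be minimized)."""
    signs = (1, -1, -1, 1)  # Z(x)Z eigenvalues for |00>, |01>, |10>, |11>
    state = qaoa_state(gammas, betas)
    return sum(sign * abs(amp) ** 2 for sign, amp in zip(signs, state))


print(cost([math.pi / 4], [3 * math.pi / 8]))
```

For p = 1 this toy cost evaluates to sin(2γ) sin(4β), so a single layer with γ = π/4 and β = 3π/8 already reaches the minimum value −1, that is, a state in the ground space of Z⊗Z. Realistic problem Hamiltonians generally need larger p and a classical optimizer over (γ, β).
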
7. Hybrid ansatzes

In some cases, it is possible to combine quantum ansatzes with classical strategies to push some of the complexity onto the classical device. For instance, in quantum chemistry one can exploit the classical simulability of free fermion dynamics to apply quantum operations via classical post-processing [49–54]. A different approach is to use as ansatz a trainable linear combination of parametrized states |ψ({c_µ}, θ)⟩ = Σ_µ c_µ |ψ_µ(θ_µ)⟩ with {c_µ} classically optimizable coefficients [55–61]. Moreover, given that quantum circuits can be viewed as tensor networks [62], it is natural to combine the existing tensor network techniques with a quantum ansatz [63–67]. For instance, it has been shown that it is possible to unitarily contract tensor networks on a quantum computer [63, 64]. An alternative hybrid approach was proposed via the deep variational quantum eigensolver, where the algorithm divides the whole system into small subsystems and sequentially solves each subsystem and the interaction between the subsystems [68]. Finally, there is also a hybrid method that combines variational Monte Carlo techniques with a quantum ansatz to classically apply the so-called Jastrow operator e^{Σ_{i,j} J_{ij} σ_i σ_j} (for J a symmetric matrix, and σ_i and σ_j Pauli operators) to a parametrized quantum state |ψ(θ)⟩ with the goal of obtaining a more accurate result by optimizing together J and θ (Ref. [69]).

8. Ansatz for mixed states

Since mixed states play an important role in many applications, such as systems at finite temperature, several ansatzes have been developed to construct a mixed state ρ = Σ_i p_i |ψ_i⟩⟨ψ_i| of n qubits (here p_i are the eigenvalues of ρ such that Σ_i p_i = 1). A first approach (which comes at the cost of requiring up to 2n qubits) is based on preparing a pure state that has ρ as a reduced state in some subsystem of qubits. Refs. [65, 70] have proposed a method to variationally obtain a purification |ψ⟩ = Σ_i √p_i |ψ_i⟩|φ_i⟩ of ρ, whereas Ref. [71] introduced a method to construct a state |ρ⟩ = (1/c) Σ_i p_i |ψ_i⟩|ψ_i⟩ with normalization c = √(Σ_i p_i²). Alternatively, one can also train a probability distribution {p_i(φ)} and a set of states {|ψ_i(θ_i)⟩} to construct ρ as the statistical ensemble ρ(φ, {θ_i}) = Σ_i p_i(φ) |ψ_i(θ_i)⟩⟨ψ_i(θ_i)|. Ref. [70] proposed to use a simple product distribution based on physical insights, whereas a more general proposal for energy based models was introduced in Ref. [72]. More recently, there has been a proposal to generate mixed states which uses the autoregressive model [73].

9. Ansatz expressibility

Given the wide range of ansatzes one can use, a relevant question is whether a given architecture can prepare a target state by optimizing its parameters. In this sense, there are different ways to judge the quality of an ansatz [74] by considering two different notions: the expressibility and the entangling capability of an ansatz. An ansatz is expressible if the circuit can be used to uniformly explore the entire space of quantum states. Thus one way to quantify the expressibility of an ansatz U(θ) is to compare the distribution of states obtained from U(θ) to the maximally expressive uniform (Haar) distribution of states U_Haar. Motivated by this line of thought, the expressibility of a circuit is measured by [74] ||A^{(t)}||, where

\[ A^{(t)}(U) := \int dU_{\mathrm{Haar}}\, U_{\mathrm{Haar}}^{\otimes t} |0\rangle\langle 0| (U_{\mathrm{Haar}}^{\dagger})^{\otimes t} - \int dU\, U^{\otimes t} |0\rangle\langle 0| (U^{\dagger})^{\otimes t} . \tag{6} \]

Other expressibility measures can be considered as well [74], and the expressibility of different ansatzes was investigated further in Ref. [75]. Ref. [74] also introduced a measure of entangling capability for ansatzes, which quantifies the average entanglement of states produced from randomly sampling the circuit parameters θ.

Quantifying expressibility for particular ansatzes is an active area of research [74–78], with certain quantum architectures exhibiting higher expressibility (according to certain measures) relative to classical architectures [77].

C. Gradients

Once the cost function and ansatz have been defined, the next step is to train the parameters θ and solve the optimization problem of Eq. (1). It is known that for many optimization tasks, using information in the cost function gradient (or in higher-order derivatives) can help in speeding up and guaranteeing the convergence of the optimizer. One of the main advantages of many VQAs is that, as discussed below, one can analytically evaluate the cost function gradient.

1. Parameter-shift rule

Let us consider for simplicity a cost function of the form in Eq. (3) with f_k(x) = x, and let θ_l be the l-th element in θ, which parametrizes a unitary e^{iθ_l σ_l / 2} in the ansatz. Here, σ_l is a Pauli operator. Surprisingly, there is a hardware-friendly protocol to evaluate the partial derivative of C(θ) with respect to θ_l, often referred to as the parameter-shift rule [79–82]. Explicitly, the parameter-shift rule states that the equality

\[ \frac{\partial C}{\partial \theta_l} = \sum_k \frac{1}{2\sin\alpha} \Big( \mathrm{Tr}\big[O_k U(\boldsymbol{\theta}_+) \rho_k U^\dagger(\boldsymbol{\theta}_+)\big] - \mathrm{Tr}\big[O_k U(\boldsymbol{\theta}_-) \rho_k U^\dagger(\boldsymbol{\theta}_-)\big] \Big) , \tag{7} \]

with θ_± = θ ± α e_l, holds for any real number α with sin α ≠ 0. Here e_l is a vector having 1 as its l-th element and 0 otherwise. Equation (7) shows that one can evaluate the gradient by shifting the l-th parameter by some amount α. Note that the accuracy of the evaluation depends on the coefficient 1/(2 sin α), since each of the ±α terms is evaluated by sampling O_k. This accuracy is maximized at α = π/2, since 1/sin α is minimized at this point. Although the parameter-shift rule might resemble a naive finite difference, it evaluates the analytic gradient with respect to the parameter by virtue of the coefficient 1/(2 sin α). A detailed comparison between the parameter-shift rule and the finite difference can be found in Ref. [83]. Finally, the gradient for more general f_k(x) can be obtained from Eq. (7) by using the chain rule.

2. Other derivatives

Higher-order derivatives of the cost function can be evaluated by straightforward extensions of the parameter-shift rule. For example, the second derivative for the previous example can be written as

\[ \frac{\partial^2 C}{\partial \theta_l^2} = \sum_k \frac{1}{4\sin^2\alpha} \Big( \mathrm{Tr}\big[O_k U(\boldsymbol{\theta} + 2\alpha e_l) \rho_k U^\dagger(\boldsymbol{\theta} + 2\alpha e_l)\big] + \mathrm{Tr}\big[O_k U(\boldsymbol{\theta} - 2\alpha e_l) \rho_k U^\dagger(\boldsymbol{\theta} - 2\alpha e_l)\big] - 2\,\mathrm{Tr}\big[O_k U(\boldsymbol{\theta}) \rho_k U^\dagger(\boldsymbol{\theta})\big] \Big) , \]

by applying the parameter-shift rule twice. Other higher-order ones such as ∂²C/(∂θ_l ∂θ_{l'}) or ∂³C/∂θ_l³ can be obtained in a similar fashion. Explicit formulas can be found in Refs. [83, 84]. These observations relate to the fact that the cost function can be expanded into a trigonometric series that admits a classically efficient, analytical approximation around any reference point. One can thus infer a classical model of the cost function, and minimise it, to offload more work from the quantum processor to the classical supervising system [85, 86].

Other types of derivatives of the parametrized quantum state not directly related to the cost function, such as a metric tensor of a state (∂⟨ψ(θ)|/∂θ_l)(∂|ψ(θ)⟩/∂θ_{l'}) (with |ψ(θ)⟩ = U(θ)|ψ_0⟩ for some initial state |ψ_0⟩), are sometimes used in sophisticated optimization algorithms [87–89] and variational quantum simulation [90–92] (see the section on dynamical quantum simulation). As quantities such as (∂⟨ψ(θ)|/∂θ_l)(∂|ψ(θ)⟩/∂θ_{l'}) are essentially overlaps of different states, they can be evaluated via Hadamard-test like protocols [91]. However, as shown in Ref. [93], those can also be reduced to the parameter-shift technique.

D. Optimizers

As for any variational approach, the success of a variational quantum algorithm (VQA) depends on the efficiency and reliability of the optimization method used. The classical optimization problems associated with VQAs are expected to be NP-hard in general as they involve cost functions that can have many local minima [94]. In addition to the typical difficulties encountered in complex classical optimizations, it has been shown that when training a VQA one can encounter new challenges. These include issues such as the inherently stochastic environment due to the finite budget for measurements, hardware noise, and the presence of barren plateaus (see main text). This has led to the development of many quantum-aware optimizers, with the optimal choice still being an active topic of debate. Here we discuss a selection of optimizers that have been designed or promoted for use with VQAs. For convenience, these will be grouped into two categories based on whether or not they implement some version of gradient descent.

1. Gradient descent methods

One of the most common approaches to optimization is to make iterative steps in directions indicated by the gradient. Given that only statistical estimates are available for these gradients, these strategies fall under the umbrella of Stochastic Gradient Descent (SGD). One SGD method that has been imported from the machine learning community is Adam, which adapts the size of the steps taken during the optimization to allow for more efficient and precise solutions than those obtained through basic SGD [95]. An alternative method inspired by the machine learning literature adapts the precision (the number of shots taken for each estimate), rather than the step size, at each iteration in an attempt to be frugal with the quantum resources used [96]. It is possible to attain an unbiased estimator for a partial derivative with even just a single shot [97], so adapting the number of shots when low precision is acceptable can lead to significant reductions in the overall shot cost of an algorithm.

A different gradient-based approach is based on simulating an imaginary time evolution [87], or equivalently on using the quantum natural gradient descent method, which is based on notions of information geometry [88, 89]. Whereas standard gradient descent takes steps in the steepest descent direction in the l2 (Euclidean) geometry of the parameter space, natural gradient descent works instead on a space with a metric tensor that encodes the sensitivity of the quantum state to variations in the parameters. Using this metric tensor typically accelerates the convergence of the gradient update steps, allowing a given level of precision to be attained with fewer iterations. This method has also been extended to incorporate the effects of noise [89].
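
As a minimal sketch of how the parameter-shift rule of Eq. (7) can feed a gradient-descent update, the Python example below differentiates a toy single-qubit cost and minimizes it. It assumes that each parameter enters through a rotation of the form exp(−iθσ/2), for which the cost is a sinusoid of period 2π in each parameter and the prefactor 1/(2 sin α) applies. It also uses exact expectation values instead of shot estimates, so the loop is plain rather than stochastic gradient descent. The circuit R_y(θ_2) R_x(θ_1)|0⟩, the observable Z, the learning rate, and all function names are illustrative choices, not prescriptions from the literature.

```python
import math


def cost(thetas):
    """Toy cost <Z> after R_y(theta_2) R_x(theta_1) acts on |0> (exact simulation)."""
    t1, t2 = thetas
    # Amplitudes of R_x(t1)|0>, with R_x(t) = exp(-i t X / 2)
    a0, a1 = complex(math.cos(t1 / 2)), -1j * math.sin(t1 / 2)
    # Apply R_y(t2) = exp(-i t Y / 2), a real rotation matrix
    c, s = math.cos(t2 / 2), math.sin(t2 / 2)
    b0, b1 = c * a0 - s * a1, s * a0 + c * a1
    return abs(b0) ** 2 - abs(b1) ** 2


def parameter_shift_gradient(cost_fn, thetas, alpha=math.pi / 2):
    """Gradient from two shifted cost evaluations per parameter, as in Eq. (7)."""
    grad = []
    for l in range(len(thetas)):
        plus, minus = list(thetas), list(thetas)
        plus[l] += alpha
        minus[l] -= alpha
        grad.append((cost_fn(plus) - cost_fn(minus)) / (2 * math.sin(alpha)))
    return grad


def gradient_descent(cost_fn, thetas, learning_rate=0.2, steps=200):
    """Plain gradient descent driven by parameter-shift gradients."""
    thetas = list(thetas)
    for _ in range(steps):
        grad = parameter_shift_gradient(cost_fn, thetas)
        thetas = [t - learning_rate * g for t, g in zip(thetas, grad)]
    return thetas


theta_opt = gradient_descent(cost, [2.0, 0.5])
print(theta_opt, cost(theta_opt))
```

Because the rule is exact for any α with sin α ≠ 0, changing `alpha` in `parameter_shift_gradient` leaves the result unchanged up to floating-point error, in contrast to a finite difference, whose error depends on the step size.
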
7

2. Other methods

A different method which uses gradients, but has


a more complicated update step than SGD, is meta-
learning [98]. In this context, the optimizer ‘learns to
learn’ by training a neural network to make a good up-
date step based on the optimization history and current
gradient with similar optimization problems. Because
the update steps taken are based on rules learned from
similar cost functions, this meta-learning approach has
significant potential to be highly efficient when used on
a new instance of a common class of optimizations.
Of the optimization methods proposed for use with VQAs which do not directly utilize gradients, the one that is perhaps the most closely related to SGD is the simultaneous perturbation stochastic approximation (SPSA) method [99]. SPSA can be considered as an approximation to gradient descent where the gradient is approximated by a single partial derivative computed by a finite difference along a randomly chosen direction. SPSA has thus been put forward as an efficient method for VQAs as it avoids the expense of computing many gradient components at each iteration. Moreover, it has been shown that for a restricted set of problems, SPSA has a faster theoretical convergence rate (in terms of the number of function calls) than SGD performed with finite differences [99].

Finally, another noteworthy gradient-free approach has been developed specifically for the context of VQAs for problems where the objective function is a linear function of an operator expectation value, so that $C(\theta)$ can be expressed as a sum of trigonometric functions. Using this insight, one can fit the functional dependence on a few parameters (with the rest held fixed), allowing one to make local parameter updates [85, 100]. Performing such local updates sequentially over all parameters, or subsets of parameters, and iterating, one obtains an optimization method that is gradient-free and does not depend on hyper-parameters. Additionally, a variation of this method using Anderson acceleration (a method that adds a linear combination of prior steps to each new update step) to speed up convergence has been proposed [100].

3. Convergence analysis

The cost landscapes of VQAs are generally non-convex and can be complicated [101], making it difficult to obtain general guarantees about the computational expense of the optimizations. However, for simplified landscapes, SGD convergence guarantees have been derived which are similar to those provided in the machine learning literature [97, 102]. Furthermore, within a convex region about a minimum, SGD methods using gradients calculated via the parameter-shift rule have been shown to have smaller upper bounds on the optimization complexity than methods using only objective values (including finite difference methods) [102].

FIG. 3. Applications of Variational Quantum Algorithms (VQAs). Many applications have been envisioned for VQAs. Here we show some of the key applications that are discussed in this Review.

III. APPLICATIONS

One of the main advantages of the VQA paradigm is that it allows for task-oriented programming. That is, VQAs provide a framework that can be used to tackle a wide array of tasks. This has led to VQAs being proposed for essentially all applications envisioned for quantum computers, and in fact, it has been shown that VQAs allow for universal quantum computing [103]. In this section we provide an overview of some of the main applications of VQAs and their state-of-the-art implementation. These applications are also summarized in Figure 3. We also refer the reader to Section V for an overview of applications where VQAs can be potentially used to obtain a quantum advantage.

A. Finding ground and excited states

The best-known application of VQAs is estimating low-lying eigenstates and corresponding eigenvalues of a given Hamiltonian. Previous quantum algorithms to find the ground state of a given Hamiltonian $H$ were based on adiabatic state preparation and quantum phase estimation subroutines [104, 105], both of which have circuit depth requirements beyond those available in the NISQ era. Hence, the first proposed VQA, the Variational Quantum Eigensolver (VQE), was developed to provide a near-term solution to this task. Here we review both the original VQE architecture and some more advanced methods for

finding ground and excited states.

FIG. 4. Variational Quantum Eigensolver (VQE) implementation. The VQE algorithm can be used to estimate the ground state energy $E_G$ of a molecule. The interactions of the system are encoded in a Hamiltonian $H$, usually expressed as a linear combination of simple operators $h_k$ with coefficients $c_k$. Taking $H$ as input, VQE outputs an estimate $\tilde{E}_G$ of the ground-state energy. The lower part of the figure shows the results of a VQE implementation for the electronic structure problem of an H2 molecule, whose exact energy is shown as a dashed line. The experimental results were obtained using two of the five qubits in one of IBM's superconducting quantum processors (the inset illustrates qubit connectivity with Q0 . . . Q4 denoting the qubits). Due to the presence of hardware noise the estimated energy $\tilde{E}_G$ has a gap with the true energy. In fact, amplifying the noise strength (that is, increasing the quantity $s$) deteriorates the solution quality. However, as discussed below, one can use error mitigation techniques to improve the solution quality. Figure adapted from Ref. [106], Springer Nature Limited.

1. Variational quantum eigensolver

As shown in Fig. 4, VQE is aimed at finding the ground state energy $E_G$ of a Hamiltonian $H$ [16]. Here the cost function is defined as $C(\theta) = \langle\psi(\theta)|H|\psi(\theta)\rangle$. That is, one seeks to minimize the expectation value of $H$ over a trial state $|\psi(\theta)\rangle = U(\theta)|\psi_0\rangle$ for some ansatz $U(\theta)$ and initial state $|\psi_0\rangle$. According to the Rayleigh-Ritz variational principle, the cost is meaningful and faithful as $C(\theta) \geq E_G$, with the equality holding if $|\psi(\theta)\rangle$ is the ground state $|\psi_G\rangle$ of $H$. In practice, the Hamiltonian $H$ is usually represented as a linear combination of products of Pauli operators $\sigma_k$ as $H = \sum_k c_k \sigma_k$ ($c_k \in \mathbb{R}$), so that the cost function $C(\theta)$ is obtained from a linear combination of expectation values of $\sigma_k$. Since practical physical systems are generally described by sparse Hamiltonians, the cost function can be efficiently estimated on quantum computers with a computational cost that usually grows at most polynomially with the system size.

2. Orthogonality constrained VQE

Once an approximated ground state $|\tilde{\psi}_G\rangle = U(\theta^*)|\psi_0\rangle$ has been obtained, one can use it to find excited states of $H$. Let $a$ be a positive constant that is much larger than the energy gap between the ground state and the first excited state. Then, $H' = H + a|\tilde{\psi}_G\rangle\langle\tilde{\psi}_G|$ is a Hamiltonian whose ground state is the first excited state of $H$ (Ref. [107]). Thus, by using the VQE for $H'$ with an updated cost $C(\theta) = \langle\psi(\theta)|H|\psi(\theta)\rangle + a\langle\psi(\theta)|\tilde{\psi}_G\rangle\langle\tilde{\psi}_G|\psi(\theta)\rangle$, one may find the first excited state of $H$. The first term here is evaluated as in VQE, and the second term can be obtained by computing the state overlap between $|\tilde{\psi}_G\rangle$ and $|\psi(\theta)\rangle$ (Refs. [38, 108]). This procedure can be further generalized to approximate higher excited states. Moreover, it has been shown that incorporating an imaginary time evolution can help to improve the calculation robustness [109].

3. Subspace expansion method

Another way to discover low energy excited states using information of the estimated ground state $|\tilde{\psi}_G\rangle$ is via the subspace expansion method [55]. Here one runs an additional optimization in a subspace of states $\{|\psi_k\rangle\}$ generated from $|\tilde{\psi}_G\rangle$. For instance, one creates states $|\psi_k\rangle = \sigma_k|\tilde{\psi}_G\rangle$ for low-weight Pauli operators $\sigma_k$, and expands the candidates for the eigenstates as $|E\rangle = \sum_k \alpha_k|\psi_k\rangle$. Then, one obtains approximations to the lowest eigenstates by training the coefficients $\alpha = (\alpha_0, \alpha_1, \alpha_2, \ldots)$ while solving the generalised eigenvalue problem $H\alpha = ES\alpha$, with $H_{k,j} = \langle\psi_k|H|\psi_j\rangle$ and $S_{k,j} = \langle\psi_k|\psi_j\rangle$.

4. Subspace VQE

The main idea behind subspace VQE [110] is to train a unitary for preparing states in the lowest energy subspace of $H$. There are two variants of subspace VQE called weighted and non-weighted subspace VQE. For the weighted subspace VQE, one considers a cost function $C(\theta) = \sum_{i=0}^{m} w_i \langle\varphi_i|U^\dagger(\theta)HU(\theta)|\varphi_i\rangle$ with ordered weights $w_0 > w_1 > \cdots > w_m$ and easily prepared mutually-orthogonal states $\{|\varphi_i\rangle\}$. By minimizing the cost function, one approximates the subspace of the lowest eigenstates as $\{U(\theta^*)|\varphi_i\rangle\}_{i=0}^{m}$. Since the weights are in decreasing order, each state $U(\theta^*)|\varphi_i\rangle$ corresponds to an eigenstate of the (non-degenerate) Hamiltonian with increasing energies.
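As a purely classical illustration of the weighted subspace VQE cost above, the following sketch replaces the quantum device by exact linear algebra on a single qubit. The Hamiltonian $H = Z + 0.5X$, the one-parameter rotation ansatz, the weights $(2, 1)$, and the brute-force parameter scan are all assumptions made for illustration and are not taken from Ref. [110].

```python
import numpy as np

# Toy single-qubit Hamiltonian H = Z + 0.5 X (illustrative choice, not from the text).
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])

def ansatz(theta):
    """One-parameter ansatz U(theta): a real rotation (the matrix of an R_y gate)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s],
                     [s, c]])

def energies(theta):
    """E_i(theta) = <phi_i| U^dagger H U |phi_i> for the basis states |0> and |1>."""
    U = ansatz(theta)
    return np.real(np.diag(U.conj().T @ H @ U))

def weighted_cost(theta, weights):
    """Weighted subspace cost C(theta) = sum_i w_i E_i(theta)."""
    return float(np.dot(weights, energies(theta)))

weights = np.array([2.0, 1.0])                  # ordered weights, w_0 > w_1
thetas = np.linspace(0.0, 2 * np.pi, 20001)     # dense scan stands in for an optimizer
costs = [weighted_cost(t, weights) for t in thetas]
theta_star = thetas[int(np.argmin(costs))]

ssvqe_energies = energies(theta_star)           # energies of U(theta*)|phi_i>
exact_energies = np.linalg.eigvalsh(H)          # exact eigenvalues, ascending
```

Because $w_0 > w_1$, the state carrying the larger weight is driven to the lower eigenvalue, so `ssvqe_energies` should agree with `exact_energies` in increasing order. With a single weight ($m = 0$) the same cost reduces to the standard VQE cost.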

The non-weighted subspace VQE makes use of the cost function $C_1(\theta) = \sum_{i=0}^{m} \langle\varphi_i|U^\dagger(\theta)HU(\theta)|\varphi_i\rangle$. Minimizing $C_1$ again gives the subspace of lowest eigenstates. As each state $U(\theta^*)|\varphi_i\rangle$ is in a superposition of the eigenstates, one needs to further optimize a second cost $C_2(\theta^*, \phi) = \langle\varphi_i|V^\dagger(\phi)U^\dagger(\theta^*)HU(\theta^*)V(\phi)|\varphi_i\rangle$ over parameters $\phi$ to rotate each state $U(\theta^*)V(\phi)|\varphi_i\rangle$ to an eigenstate.

5. Multistate contracted VQE

The multistate contracted VQE [56] can be regarded as a midway point between subspace expansion and subspace VQE. It first obtains the lowest energy subspace $\{U(\theta^*)|\varphi_i\rangle\}_{i=0}^{m}$ by optimizing $C_1(\theta)$ as in the non-weighted subspace VQE. Instead of optimizing an additional unitary, the multistate contracted VQE approximates each eigenstate as $|E\rangle = \sum_i \alpha_i U(\theta^*)|\varphi_i\rangle$ with coefficients $\alpha_i$ which are obtained by solving a generalised eigenvalue problem similar to subspace expansion with $S = 1$.

6. Adiabatically assisted VQE

Quantum adiabatic optimization seeks to find a solution to an optimization problem by slowly transforming the ground state of a simple problem to that of a complex problem. These methods have a close connection with classical homotopy schemes that are used to find the solutions of classical problems in optimization [111]. In light of this connection, the adiabatically assisted VQE [112] uses a cost function $C(\theta) = \langle\psi(\theta)|H(s)|\psi(\theta)\rangle$, where $H(s) = (1 - s)H_0 + sH_P$ and $|\psi(\theta)\rangle = U(\theta)|\psi_0\rangle$. Here $H_P$ is the problem Hamiltonian of interest and $H_0$ is a simple Hamiltonian whose known ground state is taken as the initial state $|\psi_0\rangle$. During the parameter optimization, one slowly changes $s$ from 0 to 1. The idea of Hamiltonian transformation has been used as a type of ansatz to obtain solutions near the more challenging endpoint [113].

7. Accelerated VQE

As previously mentioned, whereas Quantum Phase Estimation (QPE) provides a means to estimate eigenenergies in the fault-tolerant era, it is not implementable in the near-term. However, one of the positive features of this algorithm is that a precision $\epsilon$ can be obtained with a number of measurements which scales as $O(\log(1/\epsilon))$. This is in contrast with VQE, which requires $O(1/\epsilon^2)$ measurements for the same precision. This scaling motivated the Accelerated VQE algorithm, which interpolates between the VQE and QPE algorithms [114–116]. The interpolation involves taking the VQE algorithm and replacing the measurement process with a tunable version of QPE called α-QPE. This allows the measurement cost to interpolate between that of VQE and QPE.

Dynamical quantum simulation

Apart from static eigenstate problems, VQAs can also be applied to simulate the dynamical evolution of a quantum system. Conventional quantum Hamiltonian simulation algorithms, such as the Trotter-Suzuki product formula [117], generally discretize time into small time steps and simulate each time evolution with a quantum circuit. Therefore, the circuit depth generally increases polynomially with the system size and simulated time. Given the noise inherent in NISQ devices, the accumulated hardware errors for such deep quantum circuits can prove prohibitive. To address this, VQAs for dynamical quantum simulation only use a shallow depth circuit, significantly reducing the impact of hardware noise.

8. Iterative approach

Instead of directly implementing the unitary evolution described by the Schrödinger equation $\frac{d|\psi(t)\rangle}{dt} = -iH|\psi(t)\rangle$, iterative variational algorithms [90, 91] consider trial states $|\psi(\theta)\rangle$ and map the evolution of the state to the evolution of the parameters $\theta$. By iteratively updating the parameters, the quantum state is effectively updated and hence evolved. Specifically, by using variational principles, such as McLachlan's principle [118] to solve the minimization $\min_{\dot{\theta}} \delta\|(\frac{d}{dt} + iH)|\psi(\theta)\rangle\|$, one obtains a linear equation for the parameters as $M \cdot \dot{\theta} = V$. Here $\||\psi\rangle\| = \sqrt{\langle\psi|\psi\rangle}$, $\dot{\theta} = \frac{d\theta}{dt}$, $M_{i,j} = \mathrm{Re}\left(\partial_i\langle\psi(\theta)|\,\partial_j|\psi(\theta)\rangle\right)$, $V_i = \mathrm{Im}\left(\langle\psi(\theta)|H\partial_i|\psi(\theta)\rangle\right)$, and $\partial_i|\psi(\theta)\rangle = \frac{\partial|\psi(\theta)\rangle}{\partial\theta_i}$. Each element of $M$ and $V$ can be efficiently measured with a modified Hadamard test circuit. By solving the linear equation, one can iteratively update the parameters from $\theta$ to $\theta + \dot{\theta}\Delta t$ with a small time step $\Delta t$. Similar variational algorithms could be applied for simulating the Wick-rotated Schrödinger equation of imaginary time evolution [87] and general first-order derivative equations with non-Hermitian Hamiltonians [92]. A systematic comparison between different variational principles for different problems can be found in Ref. [91]. Recent works also extend the algorithms to use an adaptive ansatz to reduce the circuit depth [119, 120].

9. Subspace approach

The weighted subspace VQE [110] provides an alternative way to simulate dynamics in the subspace of the low energy eigenstates [121]. Here one uses the weighted subspace VQE unitary operator $U(\theta^*)$ that maps computational basis states $\{|\varphi_j\rangle\}$ to the low energy eigenstates $\{|E_j\rangle\}$ as $U(\theta^*)|\varphi_j\rangle \approx e^{i\delta_j}|E_j\rangle$, with $\delta_j$ an unknown phase. Considering the low energy subspace, the time evolution operator can be approximated as $\exp(-iHt) \approx U(\theta^*)T(t)U^\dagger(\theta^*)$ with $T(t) = \sum_j \exp(-iE_j t)|\psi_j\rangle\langle\psi_j|$. The procedure could intuitively be understood as first, rotating the state to the computational basis with $U^\dagger(\theta^*)$, second, evolving the state with $T(t)$, and third, rotating the basis back with $U(\theta^*)$. Therefore, for any state $|\psi(0)\rangle = \sum_j \alpha_j|E_j\rangle$ that is a superposition of the low energy eigenstates, its time evolution can be simulated as $|\psi(t)\rangle = U(\theta^*)T(t)U^\dagger(\theta^*)|\psi(0)\rangle$. Since the time evolution is directly implemented via $T(t)$, it does not involve iterative parameter updates and the circuit depth is independent of the simulation time.
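The three-step picture above (rotate, apply a diagonal phase evolution, rotate back) can be checked numerically on a small matrix. In the sketch below, exact diagonalization stands in for the trained unitary $U(\theta^*)$ and the full spectrum is used rather than a low energy subspace; the random 4x4 Hamiltonian, the function names, and the Taylor-series reference are assumptions introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random 4x4 Hermitian matrix standing in for a two-qubit Hamiltonian (assumption).
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2

# Exact diagonalization plays the role of the trained unitary U(theta*):
# column j of U is the eigenstate |E_j> with energy E[j].
E, U = np.linalg.eigh(H)

def fast_forward(psi0, t):
    """Apply U T(t) U^dagger, where T(t) is diagonal with entries exp(-i E_j t)."""
    coeffs = U.conj().T @ psi0               # step 1: rotate into the eigenbasis
    coeffs = np.exp(-1j * E * t) * coeffs    # step 2: diagonal phase evolution T(t)
    return U @ coeffs                        # step 3: rotate back

def taylor_evolve(psi0, t, order=60):
    """Reference exp(-iHt)|psi0> from a truncated Taylor series."""
    term = psi0.astype(complex)
    out = term.copy()
    for k in range(1, order + 1):
        term = (-1j * t) * (H @ term) / k    # k-th Taylor term built from the previous one
        out = out + term
    return out

psi0 = np.array([1, 0, 0, 0], dtype=complex)
t = 1.0
psi_ff = fast_forward(psi0, t)
psi_ref = taylor_evolve(psi0, t)
```

Changing `t` only changes the phases in the diagonal step, which mirrors the statement that the circuit depth does not grow with the simulation time. On hardware, the accuracy would instead be limited by how well the trained unitary approximates the eigenbasis.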
10. Variational fast forwarding

Similar to the subspace approach, variational fast forwarding [122, 123] simulates the time evolution operator $\exp(-iHt)$ as $U(\theta^*)T(E, t)U^\dagger(\theta^*)$ with $T(E, t) = \sum_j \exp(-iE_j t)|\psi_j\rangle\langle\psi_j|$ a trainable diagonal matrix and $U(\theta^*)$ a trainable unitary that maps between the eigenstates of $H$ and the computational basis. Although the subspace approach obtains $T(E, t)$ and $U(\theta^*)$ via weighted subspace VQE, variational fast forwarding optimises a cost given by the fidelity between $e^{-iH\delta t}$ and $U(\theta^*)T(E, \delta t)U^\dagger(\theta^*)$ for a small time step $\delta t$ via the so-called local Hilbert-Schmidt test [124]. Then, according to the Trotter-Suzuki product formula, for a total time $T = M\delta t$ one has $e^{-iHT} = (e^{-iH\delta t})^M \approx U(\theta^*)\,(T(E, \delta t))^M\,U^\dagger(\theta^*)$. Again, since the time evolution is implemented in $T(E, t)$, one can simulate the evolution for arbitrary time $t$ with the same circuit structure. As shown in Ref. [125], the ensuing Trotter error of this approach can be removed by diagonalizing instead the Hamiltonian $H$ that generates the evolution.

11. Simulating open systems

The VQA framework can also be extended to simulate dynamical evolution of open quantum systems. Suppose that the dynamics of the system is described by $\frac{d\rho}{dt} = \mathcal{L}(\rho)$, where $\mathcal{L}$ denotes a super-operator for a dissipative process. Similarly to the iterative approach for pure states [90], one maps the evolution of the mixed state to one of the variational parameters via McLachlan's principle, which solves the minimization $\min_{\dot{\theta}} \|(\frac{d}{dt} - \mathcal{L})\rho(\theta)\|$. The solution determines the evolution of the parameters $M \cdot \dot{\theta} = V$ with $M_{i,j} = \mathrm{Tr}\left[\partial_i\rho(\theta)^\dagger \partial_j\rho(\theta)\right]$, $V_i = \mathrm{Tr}\left[\partial_i\rho(\theta)^\dagger \mathcal{L}(\rho)\right]$ and $\partial_i\rho(\theta) = \frac{\partial\rho(\theta)}{\partial\theta_i}$. Each term of $M$ and $V$ can be computed by applying the SWAP test circuit on two copies of the purified states [91]. Here, to simulate an open system of $n$ qubits, one needs to apply operations on $4n + 1$ qubits. An alternative approach [92] which reduces this overhead is to simulate the stochastic Schrödinger equation, which unravels the evolution of the density matrix into trajectories of pure states. Each pure state trajectory experiences a continuous damping effect and jump processes due to the noise operators, both of which can be efficiently simulated. Since this method only controls a single copy of the pure state, it only requires $n + 1$ qubits.

FIG. 5. Quantum Approximate Optimization Algorithm (QAOA). a. Schematic representation of the Trotterized adiabatic transformation in the ansatz. The algorithm only loosely follows the evolution of the ground state of $H(t) = (1 - t)H_M + tH_P$ for every $t \in [0, 1]$, as one is interested in making the final state close to the ground state of the problem Hamiltonian $H_P$, with $H_M$ being a mixer Hamiltonian. The free parameters $\{\beta_l\}_{l=1}^{p}$ and $\{\gamma_l\}_{l=1}^{p}$ are trained, with $p$ being the number of QAOA rounds. b. Problem Hamiltonian $H_P$ and graph $\langle jk\rangle$ for a Max-Cut task. Each node in the graph (circle) represents a spin. Edges connecting two nodes indicate an interaction $\sigma_j^z\sigma_k^z$ in $H_P$, with $\sigma_k^z$ the Pauli $z$ operator on spin $k$. The solution is encoded in the ground state of $H_P$ where some spins are pointing up (green) whereas others point down (blue).

B. Optimization

Thus far we have discussed using VQAs for tasks which are inherently quantum in nature, that is, finding ground states and simulating the evolution of quantum states. In this subsection we discuss a different possibility where one uses a VQA to solve a classical optimization problem [126].

The most famous VQA for quantum-enhanced optimization is the QAOA [23], originally introduced to approximately solve combinatorial problems such as Constraint-Satisfaction (SAT) [127] and Max-Cut problems [128].

Combinatorial optimization problems are defined on binary strings $s = (s_1, \cdots, s_N)$ with the task of maximizing a given classical objective function $L(s)$. QAOA encodes $L(s)$ in a quantum Hamiltonian $H_P$ by promoting each classical variable $s_j$ to a Pauli spin-1/2 operator $\sigma_j^z$, so that the goal is to prepare the ground state of $H_P$. Motivated by the quantum adiabatic algorithm, QAOA replaces adiabatic evolution with $p$ rounds of alternating time propagation between the problem Hamiltonian $H_P$ and appropriately chosen mixer Hamiltonian $H_M$,

see Fig. 5. As discussed in the subsection on quantum alternating operator ansatz, the evolution time intervals are treated as variational parameters and are optimized classically. Hence, defining $\theta = \{\gamma, \beta\}$, the cost function is $C(\gamma, \beta) = \langle\psi_p(\gamma, \beta)|H_P|\psi_p(\gamma, \beta)\rangle$ with

$$|\psi_p(\gamma, \beta)\rangle = e^{-i\beta_p H_M} e^{-i\gamma_p H_P} \cdots e^{-i\beta_1 H_M} e^{-i\gamma_1 H_P}|\psi_0\rangle , \qquad (8)$$

and where $|\psi_0\rangle$ is the ground state of $H_M$.

Finding optimal values $\gamma$ and $\beta$ is a hard problem since the optimization landscape in QAOA is non-convex with many local optima [129]. Hence, great efforts have been devoted to finding a good classical optimizer that would require as few calls to the quantum computer as possible. Gradient-based [130, 131], derivative-free [43, 132], and reinforcement learning [133] methods were investigated, and guaranteeing good performance for the QAOA remains an active field of research.

C. Mathematical applications

Several VQAs have been proposed to tackle relevant mathematical problems such as solving linear systems of equations or integer factorization. Since in many cases there exist quantum algorithms for the fault-tolerant era aimed at these tasks, the goal of VQAs is to have heuristic scalings comparable to the provable scaling of these non-near-term algorithms while keeping the algorithm requirements compatible with the NISQ era.

1. Linear systems

Solving systems of linear equations has wide-ranging applications in science and engineering. Quantum computers offer the possibility of exponential speedup for this task. Specifically, for an $N \times N$ linear system $Ax = b$ (with $A$ an $N \times N$ matrix, and $b$ an $N \times 1$ column vector defined from the linear systems problem), one considers the Quantum Linear Systems Problem (QLSP) where the task is to prepare a normalized state $|x\rangle$ such that $A|x\rangle \propto |b\rangle$, where $|b\rangle = b/\|b\|$ is also a normalized state. The classical algorithmic complexity for this task scales polynomially in the dimension $N$, whereas the now-famous Harrow–Hassidim–Lloyd (HHL) quantum algorithm [3] has a complexity that scales logarithmically in $N$, with some scaling improvements having been proposed [134–137]. These pioneering quantum algorithms, however, will be difficult to implement in the near-term due to the enormous circuit depth requirements [138].

This situation has motivated VQAs for the QLSP [139–141]. A common feature in these algorithms is the assumption that $A = \sum_k c_k A_k$ is given as a linear combination of unitaries $A_k$ that can be efficiently implemented, weighted by real coefficients $c_k$. One can then construct a Hamiltonian whose ground state is the solution to the QLSP and apply a variational approach to minimize the cost $C(\theta) = \langle\psi(\theta)|H_G|\psi(\theta)\rangle$. Refs. [139–141] considered the Hamiltonian $H_G = A^\dagger(1 - |b\rangle\langle b|)A$ (which was also considered outside of the variational setting [142]), which annihilates any state $|x\rangle$ satisfying $A|x\rangle \propto |b\rangle$. The aforementioned cost can have gradients that vanish exponentially in the number of qubits $n$ (that is, a so-called barren plateau in the cost landscape). This problem can be mitigated by considering a local Hamiltonian with the same ground state [139] or by using a hybrid ansatz strategy [141] where $|\psi(\theta)\rangle = \sum_i \alpha_i|\psi_i(\theta_1)\rangle$ with $\alpha_i$ being variational parameters. A study [139] was conducted with $n = 10, \ldots, 30$ qubits for Ising-inspired linear systems and with $n = 2, \ldots, 7$ qubit random (sparse) linear systems. The study showed that the time to solution scales logarithmically in $N$ (and also efficiently in condition number and solution precision) for these problems. Provided that larger systems display similar behavior, the observed heuristic scaling suggests that VQAs could potentially give an exponential speedup, analogous to HHL, for the QLSP.

2. Matrix-vector multiplication

Another related problem is matrix-vector multiplication, that is, to prepare a normalized state $|x\rangle$ such that $|x\rangle \propto A|b\rangle$ with normalized vector $|b\rangle$. When $A = 1 - iH\delta t$, the problem becomes the task of Hamiltonian simulation. Similar to solving the QLSP, one constructs the Hamiltonian $H_M = 1 - A|b\rangle\langle b|A^\dagger/\|A|b\rangle\|^2$, whose ground state is $|x\rangle$ with zero energy [140]. Here $\|A|b\rangle\| = \sqrt{\langle b|A^\dagger A|b\rangle}$ is the Euclidean norm. Given an approximate solution $|\psi(\theta^*)\rangle$, one can lower bound the fidelity to the exact solution as $|\langle\psi(\theta^*)|x\rangle|^2 \geq 1 - \langle\psi(\theta^*)|H_M|\psi(\theta^*)\rangle$, thus verifying the solution's correctness whenever the cost function is small.

3. Non-linear equations

Non-linear equations are important to various fields, especially in the form of non-linear partial differential equations. However, mapping such equations onto quantum computers requires careful thought since the underlying mathematics of quantum mechanics is linear. To address this, a VQA for such non-linear problems was proposed in Ref. [143]. The approach was illustrated for the time-independent non-linear Schrödinger equation, where the cost function is the total energy (sum of potential, kinetic, and interaction energies), and where the space was discretized into a finite grid. By using multiple copies of variational quantum states in the cost-evaluation circuit, this VQA can compute non-linear functions.

An alternative approach has been proposed for non-linear differential equations that is based on using a set of basis functions rather than a finite grid [144]. First, the basis functions are encoded as non-linear feature maps (state preparation unitaries that are a function of the

variables from the system). Next, a parameterized ansatz prepares a state that represents a linear combination of these basis functions. The corresponding function value is then output as an expectation value of an operator. Additionally, derivatives of this function are computed with the parameter shift rule. This method then optimizes a cost function that is minimized when the non-linear differential equation of interest is satisfied at a chosen set of points.

4. Factoring

Large-scale implementations of Shor's algorithm are not possible in the near term. Hence, a VQA for factoring as a potential near-term alternative was introduced in Ref. [145]. This proposal relies on the fact that factoring can be formulated as an optimization problem, and in particular, as a ground state problem for a classical Ising model. The authors used the QAOA to variationally search for the ground state. Their numerical heuristics suggest that a linear number of layers in the ansatz ($p \in O(n)$) leads to a large overlap with the ground state.

5. Principal Component Analysis

An important primitive in data science is reducing the dimensionality of data with Principal Component Analysis (PCA). This involves diagonalizing the covariance matrix for a data set and selecting the eigenvectors with the largest eigenvalues as the key features of the data. Because the covariance matrix is positive semi-definite, one can store it in a density matrix, that is, in a quantum state, and then any diagonalization method for quantum states can be used for PCA. This idea was exploited in Ref. [146] to propose a quantum algorithm for PCA. However, quantum phase estimation and density matrix exponentiation were subroutines in this algorithm, making it non-implementable in the NISQ era. To potentially make this application more near-term, Ref. [147] proposed a variational quantum state diagonalization algorithm, where the cost function $C(\theta)$ quantifies the Hilbert-Schmidt distance between the state $\tilde{\rho}(\theta) = U(\theta)\rho U(\theta)^\dagger$ and $\mathcal{Z}(\tilde{\rho}(\theta))$, and where $\mathcal{Z}$ is the dephasing channel. This VQA outputs estimates of all the eigenvalues and eigenvectors of $\rho$, but it comes at the cost of requiring $2n$ qubits for an $n$ qubit state. This qubit requirement can be reduced with the VQA of Ref. [113], which requires only $n$ qubits. Here one exploits the connection between diagonalization and majorization to define a cost function of the form $C(\theta) = \mathrm{Tr}[\tilde{\rho}(\theta)H]$ where $H$ is a non-degenerate Hamiltonian. Due to Schur concavity, this cost function is minimized when $\tilde{\rho}(\theta)$ is diagonalized.

D. Compilation and unsampling

A natural task that NISQ devices can potentially accelerate is the compiling of quantum programs. In quantum compiling, the goal is to transform a given unitary $V$ into a native gate sequence $U(\theta)$ with an optimally short circuit depth. Quantum compiling plays a major role in error mitigation, as errors increase with circuit depth. Quantum compiling is a challenging problem for classical computers to perform optimally, due to the exponential complexity of classically simulating quantum dynamics. Hence, several VQAs have been introduced that can potentially be used to accelerate this task [124, 148–151]. These algorithms can be categorized as either Full Unitary Matrix Compiling (FUMC) or Fixed Input State Compiling (FISC), which respectively aim to compile the target unitary $V$ over all input states or for a particular input state. In Ref. [124] a VQA for FUMC was presented, which uses cost functions closely related to entanglement fidelities to quantify the distance between $V$ and $U(\theta)$. The proposal in Ref. [148] also treats the FUMC case, but with an alternative approach to quantifying the cost using the average gate fidelity, averaged over many input and output states. The FISC case was treated in Ref. [149], where the problem was reformulated as a ground state energy task, hence making the connection with VQE. The connection with VQE was also generalized to FUMC [150], showing that variational quantum compiling, in general, is a special kind of VQE problem. Ref. [151] introduced and experimentally implemented a compiling scheme which can be thought of as FISC, although the architecture here is focused on the application of unpreparing a quantum state. Finally, it is worth noting that both FUMC and FISC exhibit resilience to hardware noise, in that the global minimum of the cost landscape is unaffected by various types of noise [150]. This noise resilience feature is crucial for the utility of variational quantum compiling for error mitigation, and we discuss this in more detail later.

E. Error correction

Quantum Error Correction (QEC) protects qubits from hardware noise. Due to the large qubit requirements of QEC schemes, their implementation is beyond NISQ device capabilities. Nevertheless, QEC could still benefit NISQ hardware by suppressing the error to a certain extent and by combining it with other error mitigation methods. Specifically, conventional universal approaches for implementing QEC codes generally involve an unnecessarily long circuit that does not take into account the hardware structure or the type of noise. Hence, two VQAs have been introduced to solve these problems by automatically discovering or compiling a small quantum error-correcting code for any quantum hardware and any noise.

The Variational Quantum Error Corrector (QVECTOR) was first proposed to discover a device-tailored

quantum error-correcting code for a quantum mem- 1. Classifiers


ory [152]. For any k-qubit input state |ψi = US |0i,
prepared by a unitary US acting on a reference state The classification of data is a ubiquitous task in
|0i, QVECTOR considered two parametrized circuits machine learning. Given training data of the form
V (θ 1 ) (on n ≥ k qubits) and W (θ 2 ) (on n + r qubits), {x(i) , y (i) }, where x(i) are inputs, and y (i) labels, the goal
which respectively encode the input logical state into n is to train a classifier to accurately predict the label of
qubits with n − k ancillary qubits and realize recovery each input. Since a key aspect for the success of classi-
operations with r ancillary qubits. By sequentially ap- cal neural networks is their non-linearity, one can expect
plying encoding, recovery, and decoding on the input this property to also arise in a quantum classifier. As
state, one obtains an output ρout = W (θ 1 )V (θ 1 )(ψ ⊗ shown in Ref. [156], parametrized quantum circuits can
|0ih0|⊗n−k+r )V (θ 1 )† W (θ 1 )† . Projecting the n − k an- support linear transformations and non-linearity can be
cillary qubits back to |0ih0| and discarding the last exploited from the tensor product structure of a quantum
r ancillary qubits, one finds a quantum channel ρ = system. More precisely, defining an input data depen-
E(θ 1 , θ 2 )(ψ) on the input state ψ.R The target of QVEC- dent unitary V (x), then the tensor product V (x) ⊗ V 0 (x)
TOR is to maximize the fidelity ψ dψF (ψ, E(θ 1 , θ 2 )(ψ)) or the multiplication V (x)V 0 (x) results in a non-linear
between the output ρ and the input ψ averaged overall all function of the input data x. In this sense, the unitary
ψ or any US that forms a unitary 2-design. The solution V (x) can be used as a quantum non-linear feature map,
will give the quantum circuit that maximally protects the where the Hilbert space can be exploited for a feature
input state. Numerical simulations showed that QVEC- space [157, 158]. Interestingly, the tensor network struc-
TOR can find quantum codes that outperform existing ture of quantum mechanics has even inspired classical
ones [152]. machine learning methods [159].
Instead of discovering new device-tailored QEC codes, Here, after embedding the input data x into the
Ref. [153] considered how to compile conventional QEC quantum state, a linear transformation is performed
codes into a given quantum hardware with specific noise. using a parametrized quantum circuit, U (θ)V (x)|ψ0 i.
Suppose one aims to implement the logical state $|\psi\rangle_L = \alpha|0\rangle_L + \beta|1\rangle_L$ with logical state basis $\{|0\rangle_L, |1\rangle_L\}$. Note that $|\psi\rangle_L$ is the ground state of the stabilizers $G_k$ as well as the logical operator $P = |\psi\rangle_L\langle\psi|_L - |\psi^\perp\rangle_L\langle\psi^\perp|_L$ with orthogonal state $|\psi^\perp\rangle_L$. Then one can construct a frustration-free Hamiltonian $H = -a_0 P - \sum_{k\geq 1} a_k G_k$ with positive coefficients $a_0, a_k$, and with $|\psi\rangle_L$ the ground state with energy $E_G = -(a_0 + \sum_{k\geq 1} a_k)$. One then uses a VQA to discover the circuit that implements $|\psi\rangle_L$ with a given hardware structure. Since the eigenstate energies are known, the fidelity, $F$, of the discovered state can be bounded by $F \geq 1 - (E - E_G)/a$ with the discovered energy $E$ and $a = \min\{a_0, a_k\}$. Numerical studies showed the encoding circuits for the five- and seven-qubit codes with different noisy hardware [153].

F. Machine learning and data science

Quantum machine learning (QML) generally refers to the tasks of using a quantum computer to learn patterns in quantum data with the goal of making accurate predictions on unknown and unseen data [154]. Although an in-depth overview of QML is beyond the scope of this Review, we present several QML applications for which the VQA framework can be readily implemented. Specifically, here one learns a parametrized quantum circuit to solve a given task [80, 155]. This connection between VQAs and (typical) QML applications shows that the lessons learned in one field can be of great use in the other, hence providing a close connection between these two fields.

The cost function is then defined as the error between the true label and the expectation value of an easily measurable observable $A$, that is, $C(\theta) = \sum_i [y^{(i)} - \langle\psi_0|V^\dagger(x^{(i)})U^\dagger(\theta) A U(\theta)V(x^{(i)})|\psi_0\rangle]^2$. This approach has been used in generalization and in classification tasks [80, 156], with Refs. [76, 80, 160, 161] discussing different ways of embedding classical data into quantum states (such as data re-uploading), and with Ref. [158] showing an experimental demonstration of variational classification.

Moreover, as shown in Refs. [157, 158], instead of using a parameterized unitary $U(\theta)$ one can use products of quantum feature vectors $\langle\psi_0|V^\dagger(x')V(x)|\psi_0\rangle$ to perform a kernel method. Finally, the quantum kernel trick, which means that the dimensions of the quantum-enhanced feature space are larger than the number of data sets, has been demonstrated experimentally by using an ensemble of nuclear spins [162].

2. Autoencoders

The autoencoder for data compression is an important primitive in machine learning. The idea is to force information through a bottleneck while still maintaining the recoverability of the data. As a quantum analog, Ref. [163] introduced a VQA for quantum autoencoding, with the goal of compressing quantum data (see Refs. [164, 165] for alternative approaches to quantum autoencoders). The input to the algorithm is an ensemble of pure quantum states $\{p_\mu, |\psi_\mu\rangle\}$ on a bipartite system $AB$ (here $p_\mu$ are real and positive coefficients such that $\sum_\mu p_\mu = 1$). The goal is then to train an ansatz $U(\theta)$ to compress this ensemble into the $A$ subsystem, such that
one can recover each state $|\psi_\mu\rangle$ with high fidelity from subsystem $A$. The $B$ subsystem is discarded and hence can be thought of as the ‘trash’. Given the close connection between data compression and decoupling [163], the cost function is based on the overlap between the output state on $B$ and a fixed pure state. Recently, a local version of this cost function was also proposed and was shown to train well for large-scale problems [166]. Moreover, in Ref. [167], the autoencoder scheme was generalized to mixed states and a noise-assisted algorithm was provided to improve the recovery fidelity for mixed/pure states. Quantum autoencoders have seen experimental implementation on quantum hardware [168], and will likely be an important primitive in QML.

3. Generative models

The idea of training a parameterized quantum circuit for a QML implementation can also be applied to a generative model [169–171], which is an unsupervised statistical learning task with the goal of learning a probability distribution that generates a given data set. Let $\{x^{(i)}\}_{i=1}^{D}$ be a data set of size $D$ sampled from a probability distribution $q(x)$. Here one learns $q(x)$ as the parameterized probability distribution $p_\theta(x) = |\langle x|U(\theta)|\psi_0\rangle|^2$ obtained by applying $U(\theta)$ to an input state and measuring in the computational basis; that is, it corresponds to a quantum circuit Born machine [170]. In principle one wishes to minimize the difference between the two distributions. However, since $q(x)$ is not available, the cost function is defined by the negative log-likelihood $C(\theta) = -\frac{1}{D}\sum_i \log(p_\theta(x^{(i)}))$. In Ref. [170] a variational framework for training quantum circuit Born machines was introduced and demonstrated for both classical data, such as the bars-and-stripes data set, and for synthetic data sets related to the preparation of cat states and coherent thermal states. In Ref. [172], an implicit generative model has been constructed by comparing the distance in the Gaussian kernel feature space. The representation power of the generative model has been investigated in Ref. [171]. Finally, it has been shown that quantum circuit Born machines can simulate the restricted Boltzmann machine and perform a sampling task that is hard for a classical computer [173].

4. Variational Quantum Generators

Generative Adversarial Networks (GANs) use two neural networks, a generator and a discriminator, in competition. The generator aims to convince the discriminator that its output is coming from the true distribution associated with the training data. GANs play an important role in classical machine learning for applications such as image synthesis and molecular discovery. Ref. [174] proposed a VQA for learning continuous distributions which is meant to be a quantum version of GANs. Here one still considers classical data, but encoded into a quantum circuit. This encoding is followed by a variational quantum circuit that generates quantum states, which are then measured to produce a fake sample. This fake sample then enters either a classical discriminator or a quantum discriminator, and the cost function is optimized to minimize the discrimination probability with respect to real samples. The target application is to accelerate classical GANs using quantum computers.

5. Quantum Neural Network architectures

Several Quantum Neural Network (QNN) architectures have been proposed; for instance, Refs. [155, 175, 176] proposed perceptron-based QNNs. In these architectures each node in the neural network represents a qubit, and their connections are given by parameterized unitaries of the form in Eq. (4) acting on the input states. In addition, Ref. [177] introduced Quantum Convolutional Neural Networks (QCNNs). QCNNs have been used for error correction [177], image recognition [178], and to discriminate quantum states belonging to different topological phases [177]. Moreover, it has been shown that QCNNs and QNNs with tree tensor network architectures do not exhibit barren plateaus [179, 180] (which will be discussed later), potentially making them a generically trainable architecture for large-scale implementations.

G. New frontiers

In this section we discuss some exciting, recently proposed applications of VQAs. These applications highlight the fact that VQAs could be used to understand and exploit the mathematical and physical structure of quantum states, and quantum theory in general.

1. Quantum foundations

NISQ computers will likely play an important role in understanding the foundations of quantum mechanics. In a sense, these devices offer experimental platforms to test foundational ideas ranging from quantum gravity to quantum Darwinism [181]. For example, the emergence of classicality in quantum systems will soon be a computationally tractable field of study due to the increasing size of NISQ computers. Along these lines, Ref. [182] proposed the Variational Consistent Histories (VCH) algorithm. Consistent Histories is a formal approach to quantum mechanics that has proven to be useful in studying the quantum-to-classical transition and quantum cosmology. In this formalism, interference between different paths (histories) is quantified in the decoherence functional [183]. The exponential number of terms in this decoherence functional makes the formalism computationally expensive on classical devices. VCH
provides a way to prepare a density matrix representation of the entire functional, allowing one to efficiently examine the consistency of a set of histories. The application of standard VQAs to foundational situations can also provide a framework for new insights. For example, Ref. [184] showed that a Full Unitary Matrix Compiling (FUMC) strategy (discussed above) cannot efficiently learn a scrambling unitary. This result provides insight into the black hole information paradox as one would need to have a representation of a black hole’s scrambling unitary in order to unscramble information from emitted Hawking radiation [185].

2. Quantum information theory

Another field that will likely see renewed interest due to NISQ computers is quantum information theory [186]. For example, in Ref. [163] it was remarked that the quantum autoencoder algorithm could potentially be used to learn encodings and achievable rates for quantum channel transmission. Another area of research is using NISQ computers to compute key quantities in quantum information theory, such as the von Neumann entropy or distinguishability measures such as the trace distance. Although it is known that these problems are hard for general quantum states [187], Ref. [188] introduced a VQA to estimate the quantum fidelity between an arbitrary state $\sigma$ and a low-rank state $\rho$. Moreover, in Ref. [72] a VQA was introduced to learn modular Hamiltonians, which provides an upper bound on the von Neumann entropy of a quantum state. Here one attempts to variationally decorrelate a quantum state by minimizing the relative entropy to a product distribution, and hence this method is suited for states that can be easily decorrelated.

3. Entanglement Spectroscopy

Characterizing entanglement is crucial for understanding condensed matter systems, and the entanglement spectrum has proven to be useful in studying topological order. Several VQAs have been introduced to extract the entanglement spectrum of a quantum state [113, 147, 189]. Since the entanglement spectrum can be viewed as the principal components of a reduced density matrix, algorithms for PCA can be used for this purpose, including the VQAs discussed before. In addition, one can also use the variational algorithm for quantum singular value decomposition introduced in Ref. [189]. These algorithms could potentially characterize the entanglement (and, for example, topological order) in a ground state that was prepared by VQE, and hence different VQAs can be used together in a complementary manner.

4. Quantum metrology

Quantum metrology is a field where one seeks the optimal setup for probing a parameter of interest (for example a magnetic field) with minimal shot noise. In the absence of noise during the probing process, the analytical solution for the optimal probe state can be derived. However, when general physical noises are present, an analytical solution is hard to find. Variational-state quantum metrology variationally searches for the optimal probe state [190–193]. For state preparation, variational quantum circuits are used in Refs. [190, 192, 193] whereas optical tweezer arrays are considered in Ref. [191]. More concretely, one prepares a probe state with variational parameters, probes the magnetic field with physical noises, measures the quantum Fisher information (QFI) as a cost function, and updates the parameters to maximize it. Note that since the QFI cannot be efficiently computed, an approximation of the QFI can be heuristically found by optimizing the measurement basis, or by computing upper and lower bounds on the QFI [193].

IV. CHALLENGES AND POTENTIAL SOLUTIONS

Despite the tremendous developments in the field of VQAs, there are still many challenges that need to be addressed to maintain the hope of achieving quantum speedups when scaling up these near-term architectures. Understanding the limitations of VQAs is crucial to developing strategies that can be used to construct better algorithms, prove certain guarantees on their performance, and even to build better quantum hardware.

A. Trainability

1. Barren plateaus

The so-called barren plateau (BP) phenomenon in the cost function landscape has received considerable attention as one of the main bottlenecks for VQAs. When a given cost function $C(\theta)$ exhibits a BP, the magnitude of its partial derivatives will be, on average, exponentially vanishing with the system size [194]. As shown in Fig. 6, this has the effect of the landscape being essentially flat. Hence, in a BP one needs an exponentially large precision to resolve against finite sampling noise and determine a cost-minimizing direction, with this being valid independently of using a gradient-based [84] or gradient-free optimization method [195]. The exponential scaling in the precision due to BPs could erase a potential quantum advantage with a VQA, as its complexity would be comparable to the exponential scaling typically associated with classical algorithms. Hence, analyzing the existence of BPs in a given VQA is fundamental to preserve the hope of using it to achieve a quantum advantage.

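
The decay of the gradient variance with the number of qubits can be illustrated numerically with a small statevector simulation. The following is a minimal sketch under stated assumptions, not the setup of Ref. [194]: the ansatz (layers of RX and RY rotations on every qubit followed by a chain of CZ gates), the observable $Z_0 Z_1$, the fixed depth of 20 layers, the sample count, and all function names are hypothetical choices made for illustration. The partial derivative with respect to the first parameter is evaluated with the parameter-shift rule.

```python
import numpy as np


def rx(theta):
    """Single-qubit rotation exp(-i * theta * X / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])


def ry(theta):
    """Single-qubit rotation exp(-i * theta * Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a state stored as a (2,)*n tensor."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)


def apply_cz(state, q1, q2):
    """Apply CZ in place: flip the sign where both qubits are in state 1."""
    idx = [slice(None)] * state.ndim
    idx[q1], idx[q2] = 1, 1
    state[tuple(idx)] *= -1
    return state


def cost(params, n):
    """Expectation value of Z_0 Z_1; params has shape (layers, n, 2), n >= 2."""
    state = np.zeros((2,) * n, dtype=complex)
    state[(0,) * n] = 1.0
    for layer in params:
        for q in range(n):
            state = apply_1q(state, rx(layer[q, 0]), q)
            state = apply_1q(state, ry(layer[q, 1]), q)
        for q in range(n - 1):
            state = apply_cz(state, q, q + 1)
    # Marginal distribution of qubits 0 and 1 in the computational basis.
    probs = (np.abs(state) ** 2).reshape(2, 2, -1).sum(axis=2)
    return probs[0, 0] + probs[1, 1] - probs[0, 1] - probs[1, 0]


def grad_first(params, n):
    """Parameter-shift derivative of the cost w.r.t. the first RX angle."""
    shift = np.zeros_like(params)
    shift[0, 0, 0] = np.pi / 2
    return 0.5 * (cost(params + shift, n) - cost(params - shift, n))


def gradient_variances(qubit_counts, layers=20, samples=100, seed=0):
    """Sample variance of the derivative over uniformly random parameters."""
    rng = np.random.default_rng(seed)
    result = {}
    for n in qubit_counts:
        grads = [
            grad_first(rng.uniform(0, 2 * np.pi, size=(layers, n, 2)), n)
            for _ in range(samples)
        ]
        result[n] = float(np.var(grads))
    return result


if __name__ == "__main__":
    for n, var in gradient_variances((2, 4, 6, 8)).items():
        print(f"n = {n}: estimated Var[dC/dtheta] = {var:.2e}")
```

For these small sizes the estimated variance should shrink as $n$ grows, although with only 100 samples per point the estimates are noisy. Exact statevector simulation is used here, so the finite-sampling noise that makes BPs costly on hardware is not modeled.
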
FIG. 6. Barren plateau (BP) phenomenon. a. Variance of the cost function partial derivative, $\mathrm{Var}(\partial_{\theta_{1,1}} E)$, for a particular parameter $\theta_{1,1}$ in the ansatz versus number of qubits ($n$). Results were obtained from a Variational Quantum Eigensolver implementation with a deep unstructured ansatz. The y-axis is on a log scale. As the number of qubits increases the variance vanishes exponentially with the system size. b. Visualization of the landscape of a global cost function which exhibits a BP for the quantum compilation implementation. The orange (blue) landscape was obtained for $n = 24$ ($n = 4$) qubits. As the number of qubits increases, the landscape becomes flatter. Moreover, this cost also exhibits the narrow gorge phenomenon [166], where the volume of parameters leading to small cost values shrinks exponentially with $n$. Panel a is adapted from Ref. [194], CC BY 4.0; Panel b is adapted from Ref. [166], CC BY 4.0.

The phenomenon of BPs was originally discovered in Ref. [194] where it was shown that deep unstructured parametrized quantum circuits exhibit BPs when randomly initialized. The proof of this result relies on the fact that these unstructured ansatzes become 2-designs when their depth grows polynomially with the number of qubits $n$ [196, 197]. One can view this phenomenon as stemming from the fact that the ansatz is problem-agnostic and hence needs to explore an exponentially large space to find the solution. Therefore, the probability of finding the solution when randomly initializing the ansatz is exponentially small.

The analysis of BPs was extended to shallow random layered ansatzes [166] where it was shown that the BP phenomenon is cost-function dependent: global cost functions (when one compares operators or states living in exponentially large Hilbert spaces) exhibit BPs, whereas local cost functions (when one compares objects at the single-qubit level) exhibit gradients which vanish polynomially in $n$ so long as the circuit depth is at most logarithmic in $n$. This implies a connection between locality and trainability and informs our intuition as to what types of cost functions might have to be avoided. These results have been numerically verified in Ref. [198] and further extended in Ref. [199]. Here one can understand the presence of BPs for global costs as stemming from the fact that the Hilbert space grows exponentially with $n$ and hence the probability of two objects being close is exponentially small.

In addition, it has been shown that BPs can arise in more general problems such as learning a scrambler [184], where any choice of variational ansatz will lead to, with high probability, a BP. This again is due to the intrinsic randomness in a scrambler. In addition, the BP phenomenon has been studied when training randomly initialized perceptron-based quantum neural networks [200, 201]. Here, BPs arise from the significant amount of entanglement created by the perceptrons connecting a large number of qubits in visible and hidden layers. Specifically, when tracing out the qubits in the hidden layers, the state of the visible qubits becomes exponentially close to being maximally mixed (due to concentration of measure), which makes it difficult to extract information from such a state.

Although previous results rely on the randomness of the ansatz, there is a conceptually different phenomenon that can lead to BPs. Recently, it was shown in Ref. [202] that noise can induce barren plateaus, regardless of the ansatz employed. Here, the presence of noise acting throughout the circuit progressively corrupts the state towards the fixed point of the noise model, usually the maximally mixed state [203]. Such a phenomenon was shown to arise when the circuit depth needs to be linear (or larger) with the system size, meaning that it will affect many widely-used ansatzes.

2. Ansatz and initialization strategies

Hand-in-hand with the theoretical progress on the analysis of the BP phenomenon, great effort has been dedicated to avoiding or mitigating the effect of BPs. The main strategy here has been to break the assumptions leading to BPs. In what follows we present two main approaches: parameter initialization and choice of ansatz.

• Parameter initialization. Randomly initializing an ansatz can lead to the algorithm starting far from the solution, near a local minimum, or even in a region with barren plateaus. Hence, optimally choosing the seed for $\theta$ at the beginning of the optimization is an important task. The importance of parameter initialization was made clear in Ref. [204] where it was noted that the optimal parameters in QAOA exhibit persistent patterns. Based on these observations, initialization strategies were proposed which were heuristically shown to outperform randomly initialized optimizations. Additionally, in Refs. [29, 205] an initialization strategy was developed specifically to address BPs in deep circuits. Here, one selects at random a subset of parameters in $\theta$, and chooses the value of the remaining ones so that the circuit is a sequence of shallow unitaries that evaluates to the identity. The main idea behind this method is to reduce the randomness and depth of the circuit to break the assumption that the circuit approximates a 2-design, a condition necessary for BPs to arise in deep ansatzes. Similar to the previous method,
other schemes have been introduced to prevent BPs by restricting the randomization of the ansatz. For instance, the proposal in Ref. [206] showed that correlating the parameters in the ansatz effectively reduces the dimension of the hyperparameter space and can lead to large cost function gradients. In addition, Ref. [207] introduced a method where one uses layer-by-layer training: one initially trains shallow circuits and progressively adds components to the circuit. Whereas the latter guarantees that the number of parameters and randomness remains small for the first steps of the training, it has been shown [208] that this method can lead to an abrupt transition in the ability of quantum circuits to be trained. Finally, a method was introduced in Ref. [209] where one pre-trains the parameters in the quantum circuits by using classical neural networks.

• Ansatz strategies. Another strategy for preventing BPs is using structured ansatzes which are problem-inspired. The goal here is to restrict the space explored by the ansatz during the optimization. As discussed in the section on ansatzes, the UCC ansatz for VQE or the quantum alternating operator ansatz [23, 24] for optimization are problem-inspired ansatzes which are usually trainable even when randomly initialized. Other ansatz strategies include the proposal in Ref. [209] for learning a mixed state, where one leverages knowledge of the target Hamiltonian to create a Hamiltonian variational ansatz. In addition, Refs. [60, 61] presented an approach where the ansatz for the solution is $|\psi(\{c_\mu\})\rangle = \sum_\mu c_\mu |\psi_\mu\rangle$, for a fixed set of states $\{|\psi_\mu\rangle\}$ determined by the problem at hand. Here the optimization over the coefficients $\{c_\mu\}$ can be solved using a quadratically constrained quadratic program.

Finally, we remark that along with ansatz strategies there are other ways of potentially addressing BPs. These include optimizers tailored to mitigate the effect of BPs [210], local cost functions [113], or architectures such as the QCNN, which has been shown to avoid BPs [179].

B. Efficiency

Another requirement that must be met for VQAs to provide a quantum advantage is having an efficient way to estimate expectation values (and more general cost functions). The existence of BPs can exponentially increase the precision requirements needed for the optimization portion of VQAs, as discussed in the section on BPs, but even in the absence of such BPs these expectation value estimations are not guaranteed to be efficient. Indeed, early estimations of resource requirements suggested that the number of measurements that would be required for interesting quantum chemistry VQE problems would be astronomical, hence addressing this issue is essential for realizing quantum advantage [28]. More reasonable resource estimates can be reached for restricted problems such as the Hubbard model [211, 212]. Although in principle one could always take projective measurements onto the eigenbasis of the operator in question, in general both the computational complexity of finding the required unitary, and the depth required to implement that transformation, may be intractable. However, given that arbitrary Pauli operators are diagonalizable with one layer of single-qubit rotations, it is common for the operators of interest (such as quantum chemistry Hamiltonians) to be expressed by their decomposition into such Pauli operators. That is, $H = \sum_i c_i \sigma_i$, where $\{c_i\}$ are real coefficients and $\{\sigma_i\}$ are Pauli operators. The drawback of this approach is that, for many interesting Hamiltonians, this decomposition contains many terms. For example, for chemical Hamiltonians the number of distinct Pauli strings scales as $n^4$ where $n$ is the number of orbitals (and thus qubits) for large molecules. In what follows we discuss several methods whose goal is to obtain measurement frugality in estimating the cost function.

1. Commuting sets of operators

In the interest of reducing the number of measurements required to estimate an operator expectation value, a number of methods have been proposed for partitioning sets of Pauli strings into commuting (simultaneously measurable) subsets. The choice of the subsets is also of course non-unique and has been mapped onto the combinatorial problems of graph coloring [213, 214], finding the minimum clique cover [215–218], or finding the maximal flow in network flow graphs [219], which makes it possible to import the heuristics and formal results from those problems.

Perhaps the simplest approach to such a partitioning is to look for subsets that are qubit-wise commuting (QWC), which is to say that the Pauli operators on each qubit commute. Indeed, this was the first method introduced [220]. However, whereas the QWC methods help reduce the number of operators, they do not change the asymptotic scaling for quantum chemistry applications, motivating more general commutative groupings to be considered. To this end, it has been shown that by considering general commutations (and increasing the number of gates of the circuit quadratically with $n$) the scaling of the number of measurements can be reduced to $n^3$ [213, 214, 216–219].

For using VQE on fermionic systems, this scaling can actually be brought down to either quadratic or, for simpler cases, even linear [221] in $n$. This significant improvement is found by considering factorizations of the two-electron integral tensors, rather than working at the operator level. The success of this approach suggests that using background information on the problem may significantly improve the measurement efficiency of estimating
an expectation value.

2. Optimized sampling

In addition to reducing the number of individual operators that need to be measured, measurement efficiency can also be improved by carefully allocating the number of shots among the Pauli operators. Since operators with smaller coefficients will tend to contribute less to the overall variance, assigning the same number of shots to each operator is usually inefficient. Instead, the optimal approach [222] is to give each Pauli operator a number of shots proportional to $|c_i|\sqrt{\mathrm{Var}(\sigma_i)}$, where $c_i$ is the coefficient of the $i$th Pauli operator $\sigma_i$ and $\mathrm{Var}(\sigma_i)$ is the variance of $\langle\sigma_i\rangle$. During an optimization where low precision steps may be allowed early on, this allocation can instead be performed randomly with probabilities proportional to $|c_i|\sqrt{\mathrm{Var}(\sigma_i)}$. Making the allocation randomly in this way allows for unbiased estimates with as little as one shot, potentially significantly increasing the efficiency of the optimization [223]. Optimizing the sampling of the metric tensor has also been explored, with the conclusion that these costs need not be dominant in metric-aware VQAs [224].

3. Classical shadows

Another promising approach to efficient measurements is the construction of classical shadows [225], also known as shadow tomography. In this approach, an approximate classical representation of the state (the classical shadow) is constructed by summing over the collection of states that a sequence of different measurements projects onto. These measurements are taken in the basis of randomly chosen strings so that a partial tomography of the state is completed. Combining the measurements in this way, each shot contributes to the estimation of each Pauli operator expectation value, resulting in a number of measurements that scales logarithmically with $n$. As with direct measurement approaches discussed above, this approach can also be further optimized by tuning the probability distribution for the Pauli operators that define the measurements to match the properties of the operator and state [226].

4. Neural network tomography

A different approach using partial tomography is to train an approximate restricted Boltzmann machine (RBM) representation of the desired quantum state [227]. This RBM is fitted using measurements of the Pauli operators that are needed to directly estimate a given operator’s expectation value, and so does not inherently reduce the number of operators to measure. However, by computing the expectation value on an approximate RBM instead of directly from measurements, the sampling variance for a given number of shots is substantially reduced at the cost of introducing a small, positive bias [227].

C. Accuracy

One of the main goals for VQAs is to enable a practical use for NISQ devices. For this goal, VQAs provide a strategy to deal with hardware noise as they can potentially minimize quantum circuit depth. Moreover, as discussed below, error mitigation methods can be combined with VQAs to further improve accuracy. However, one can still ask what the impact of hardware noise will be on the accuracy of a VQA.

1. Impact of hardware noise

There are multiple aspects of the impact of hardware noise: it could potentially slow down the training process, it could bias the landscape so that the noisy global optimum no longer corresponds to the noise-free global optimum, and it could affect the final value of the optimal cost.

• Effect of noise on training. The question of whether noise can help with the training process was posed in Ref. [228]. In practice, it is typical to observe that noise slows down the training. For example, it was heuristically observed that the noise-free cost achieves lower values with noise-free training than with noisy training [96, 223, 229]. As discussed in the section on BPs, the intuition behind this slowing down is that the cost landscape is flattened, and hence gradient magnitudes are reduced, by the presence of incoherent noise [202, 230, 231]. Moreover, gradients decay exponentially with the algorithm’s depth, meaning that the deeper the circuit, the more it will be affected. This can be further understood from the fact that cost functions are typically extremized by pure states, and since incoherent noise reduces state purity, one expects this noise to erode the extremal points of the landscape [203]. The presence of noise-induced BPs and their effect on trainability is one of the leading challenges for VQAs, with potential solutions being developing better quantum hardware or shorter-depth algorithms. It is worth remarking that the results discussed here do not account for the use of error mitigation techniques, and the extent to which these could help is still an open question.

• Effect of noise on cost evaluation. In Refs. [202, 203] it was also shown that in the presence of local Pauli noise, the cost landscape concentrates exponentially with the depth of the ansatz around the value of the cost associated with the maximally
mixed state. Whereas the proof of this exponential concentration of the cost was for general VQAs, some previous works had also observed this effect for the special case of the QAOA [230, 231]. The exponential concentration of the cost is of course important beyond the issue of trainability. Even if one is able to train, the final cost value will be corrupted by noise. There are certain VQAs where this is not an important issue (for example, in QAOA where one can classically compute the cost after sampling). However, for VQE problems, this is important, since one is ultimately interested in an accurate estimation of the energy. This emphasizes the importance of understanding to what degree error mitigation methods can correct for this issue.

2. Noise resilience

One reason for the interest in VQAs is their ability to naturally overcome certain types of noise in hardware, especially in near-term implementations. This noise resilience is a crucial, non-trivial feature of VQAs.

• Inherent resilience to coherent noise. By construction, VQAs are insensitive to the specific parameter values, ultimately only sampling physical observables from the resulting state. More specifically, if the physical implementation of a unitary results in a coherent error within the parameter space, so that $U(\theta)$ actually results in $U(\theta + \delta)$, then under mild assumptions the optimizer can calibrate this block unitary on the fly to improve the physical state produced. This effect was first conjectured theoretically [220] and later seen experimentally in superconducting qubits [48], where errors after the variational procedure were reduced in some cases by over an order of magnitude. Success in this endeavor depends upon the ability to optimize faster than the drift of calibration in the device, and sufficient variational flexibility in the ansatz, but may continue to be effective even into the early fault-tolerant regime where coherent errors can be especially insidious.

• Inherent resilience to incoherent noise. It is an interesting question as to what degree incoherent noise, such as decoherence, random gate errors, and measurement errors, will impact VQAs. For example, it was shown that optimization in the presence of some noise channels can automatically move the state into subspaces that are resilient to those channels as an energetic trade-off [55]. However, one could operate under the assumption of perfect training (which may still be possible for either weak noise or shallow ansatzes), and ask whether the global optimum in the cost landscape is robust to such noise. This was the approach of Ref. [150], where it was shown that VQAs for quantum compiling (see section on compilation) exhibit a special type of noise resilience known as Optimal Parameter Resilience (OPR). OPR is the notion that global minima of the noisy cost function correspond to global minima of the noise-free cost function. In this sense, if an algorithm exhibits OPR, then minimizing the cost in the presence of noise will still obtain the correct optimal parameters, and hence the optimal parameters are resilient. Since quantum compilation is a special case of VQE, the question remains open as to whether other VQAs exhibit this type of noise resilience for certain noise models. A different type of noise resilience was analyzed in Ref. [232] in the context of the holographic quantum simulation of many-body systems. Specifically, it was shown that under certain conditions the expectation values of local observables measured on the prepared ground state are perturbed by, at most, a function that does not depend on the size, but rather only on the noise parameter.

3. Error mitigation

Quantum Error Mitigation (QEM) generally suppresses physical errors in expectation values of observables via classical post-processing of measurement outcomes [233]. An intuitive, but powerful, example is the extrapolation method [90, 234]. Even if the error rate cannot be reduced, in many cases it can be deliberately boosted: for example, as shown in Fig. 7, by inserting additional noisy pulses or making gate operations longer, one makes the quantum device undergo more physical errors. Then, by obtaining measurement outcomes at several noise levels and extrapolating them, one can estimate the error-free result using the so-called zero-noise extrapolation method. Due to the propagation of uncertainty, the variance of the error-mitigated result is amplified and hence one needs to have a larger sampling cost, which is the overhead of QEM. First, Richardson extrapolation was proposed [90, 106, 234], and it was shown that single- and multi-exponential extrapolation work well for Markovian gate errors, with the latter subsequently shown to have very broad efficacy [235, 236]. In addition, extrapolation using least-squares fitting for several noise parameters has been proposed [237]. Furthermore, it has been observed that the extrapolation method can mitigate algorithmic errors that arise due to insufficiency in the number of time steps [238].

Although extrapolation methods by design cannot fully mitigate physical errors [234, 235], probabilistic error cancellation in theory can obtain unbiased expectation values by inverting the noise process with additional probabilistic gate operations (if a complete characterization of noise is provided). Note that since an inverse map of physical errors is generally unphysical, it is necessary to post-process measurement outcomes according
to applied recovery operations. This method was first introduced in Ref. [234], and Ref. [235] found that gate set tomography is a suitable noise characterization strategy and proposed a set of operations that can compensate for general Markovian errors. Furthermore, based on probabilistic error cancellation, stochastic error mitigation, which works for general continuous systems such as analog quantum simulators and digital quantum computers, was introduced [239].

FIG. 7. Qubit trajectories on the Bloch sphere with the Zero-Noise Extrapolation (ZNE) technique. The accuracy of a noisy quantum computer can be improved with the ZNE error mitigation method. a. Here, one repeats a given calculation with different levels of noise. The green curve corresponds to a rotation on the Bloch sphere with a higher noise level than that leading to the red curve. b. Taking data from the red and green curves, ZNE can be used to estimate what the trajectory (blue) would be like in the absence of noise. Adapted from Ref. [106], Springer Nature Limited.

A different approach to QEM relies on the classical simulability of near-Clifford circuits. The basic idea behind this approach is to compare the classically computed exact expectation values for near-Clifford circuits with their noisy counterparts evaluated on actual hardware [240–242]. Taking this approach can allow one to implement a probabilistic error mitigation protocol without needing to construct a full error model for an experiment [240]. Alternatively, one can perform a simple regression with this Clifford data to estimate how the observables have been affected and invert this regression to estimate desired noise-free expectation values [241]. Finally, zero-noise extrapolation can be merged with this regression to have an extrapolation to zero noise whose form is tuned via the Clifford data, reducing the risk of blind extrapolations [242].

Several additional QEM methods have been proposed. Symmetry verification is especially useful for ansatzes that preserve symmetries such as particle and spin number [11, 243, 244]. Since physical errors break the symmetry, one can mitigate physical noise by measuring the symmetry and discarding the outcomes that violate it (similarly to error detection). Unlike other QEM methods, symmetry verification can recover the quantum state itself. One can also take a post-processing approach using the information of the symmetry, at the cost of a larger number of samples [244]. The use of symmetry verification to augment error extrapolation and probabilistic error cancellation was taken still further in Ref. [236].

In an alternative and complementary approach, the subspace expansion method was also shown to be useful for QEM in Ref. [245]. Here, using subspace expansion one can mitigate physical noise for eigenstates of the Hamiltonian as well as evaluate excited states, because the state is expanded in a larger subspace. Note that this method works better for coherent noise than for stochastic noise. A distinct approach, which comes at the cost of increasing the number of qubits, was introduced in Refs. [246, 247]. Here, by entangling and measuring M copies of a noisy state ρ, one can compute expectation values with respect to the state ρ^M / Tr[ρ^M]. Under the assumption that the principal eigenvector of ρ is the desired state, this method can exponentially suppress errors with M. Finally, we remark that Ref. [248] introduced a method to mitigate expectation values against correlated measurement errors, whereas Ref. [249] implemented an error mitigation technique to suppress the effects of photon loss for a Gaussian boson sampling device.

V. OPPORTUNITIES FOR NEAR-TERM QUANTUM ADVANTAGE

VQAs are widely regarded as the best candidate for providing quantum advantage for practical applications. That is, it is expected that a VQA can solve a problem more efficiently than any classical state-of-the-art method. As discussed in the main text, tremendous effort has been dedicated to this goal with the development of efficient ansatz strategies, quantum-aware optimization methods, new VQAs, and error mitigation techniques. Although many challenges still remain to be addressed, such as the need for larger and better quantum devices, one can nevertheless pose the question as to what specific applications will provide the first quantum advantage for a practical scenario. In this section we discuss some of the most exciting possibilities where quantum advantage could arise.

A. Chemistry and material sciences

The ability to simulate and understand the static and dynamical properties of molecules and strongly correlated electronic systems is a fundamental task in many areas of science. For instance, this task is relevant in biology to understand protein folding dynamics, and in pharmaceutical sciences one could analyze drug-receptor interactions to improve drug discovery capabilities [250–252]. Similarly, analyzing the electronic structure of complex correlated materials is very important for studying
high-temperature superconductivity or for analyzing transition metal materials near a Mott transition.

1. Molecular structure

In the past few decades there have been great developments in the classical treatment of the structure of molecular systems. These include approximate methods such as Hartree-Fock or density functional theory, or methods closely connected to quantum information, like the density matrix renormalization group approach that utilizes matrix product states as an ansatz [253, 254]. However, even for these sophisticated approaches, systems of interest such as the FeMo cofactor are beyond the reach of an accurate description due to the entanglement structure of the electrons and orbitals. The relevant electronic space in which one needs to treat correlations accurately for these systems is relatively modest, and for that reason, these may be good targets for near-term quantum computers to play a role. As discussed in the main text, the Variational Quantum Eigensolver algorithm [16] (and associated architectures) has shown promising advances towards the goal of performing molecular quantum chemistry on quantum computers [255], with large-scale implementations already being executed [256].

2. Molecular dynamics

As for the dynamics of chemical and other quantum systems, there have been a number of strides in evaluating or compressing these evolutions using variational approaches [90, 91]. Much like variational principles connected to the ground state, there are a number of time-dependent variational principles that can be used to approximate time dynamics. Here there are two timescales of interest. The first is the electronic timescale over which electrons rearrange upon excitation. The second, much slower than the first, is the rearrangement of nuclei that is induced by forces derived from the electrons in their respective configurations, excited or not. Generally speaking, treating the detailed dynamics of the electrons accurately has been extremely challenging for classical approaches despite its relevance in phenomena related to photovoltaics and light-emitting diodes [257, 258]. The separation between the two timescales has motivated the development of methods that treat them separately, often using a classical or semi-classical representation for the nuclei and a quantum representation for the electrons [259]. Variational methods can be applied incrementally in these cases, by stepping the electronic wavefunction forward with time-dependent variational principles [90, 91] and sampling the forces [260] to move the nuclei classically, resulting in a Born-Oppenheimer type molecular dynamics. Early test systems for quantum molecular dynamics often include photo-dissociation reactions and conical intersections of small molecular systems [261]. Ultimately, these methods may help unlock proton-coupled electron transfer mechanisms [262] in proteins and help with the design of novel organic photovoltaics [257] and related systems.

3. Materials science

Classical methods for materials simulations usually use density-functional theory coupled with approximation methods, such as the local density approximation [263], to tackle weakly correlated materials. However, many effects arising from strongly correlated systems are beyond the reach of such classical methods. Since long-term algorithms for material simulation require phase estimation [264–266], these lie beyond the scope of near-term devices. In contrast, near-term VQAs for analyzing strong-correlation problems are aimed at reducing the circuit depth by using smart initializations [267], or by optimizing the circuit structure itself [31, 32].

B. Nuclear and particle physics

1. Nuclear physics

Similar to the chemistry applications discussed above, VQAs have the potential to convey a quantum advantage in studying nuclear structure and dynamics. The most studied potential contribution is the utility of the VQE method to find nuclear ground states. This was first demonstrated for computing the deuteron (²H) binding energy [268], and has been extended to other light nuclei such as the triton (³H), ³He, and an alpha particle (⁴He) [269]. Additionally, using VQE to prepare the ground state of a triton has served as an initial step towards simulating neutrino-nucleon scattering [270]. These low-energy applications, along with the general progress towards studying higher-energy nuclear interactions (quantum chromodynamics) via VQA lattice gauge theory approaches (discussed below), show that VQAs have the potential to provide a significant advantage over classical methods for nuclear physics.

2. Particle physics

In particle physics many analytical tools have been developed to describe and study theories, but there are many areas that remain intractable. In particular, the study of important gauge theories such as quantum chromodynamics is often handled by mapping the problem onto a lattice to allow for numerical studies. One of the major drawbacks of such Lattice Gauge Theories (LGTs) for classical computation is that they exhibit the sign problem and as a result are usually not classically simulable. Although large-scale, fault-tolerant quantum
computers will eventually be able to handle this difficulty [271, 272], there is also the potential for achieving a significant quantum advantage in this area with VQAs in the Noisy Intermediate-Scale Quantum (NISQ) era [273]. Advances in this direction include work on VQAs for LGT simulation [13] and variational determinations of mass gaps, Green’s functions, and running coupling constants [274–276]. In addition, an approach using a VQA to determine interpolation operators to accelerate classical LGT computations has been proposed [277]. Finally, the impact of decoherence caused by hardware noise on LGT calculations has been studied, finding that gauge violations caused by decoherence only grow linearly at short times, suggesting that short-depth approaches may be possible [278]. Taken together, these results show that studying LGTs is a viable candidate for NISQ quantum advantage.

C. Optimization and machine learning

Although it is natural to consider that VQAs can bring an advantage on tasks which are inherently quantum in nature, the prospect of using quantum algorithms to solve classical problems is also an exciting one. Generally, one here aims to use the large dimension of the Hilbert space to encode big problems or large amounts of data, with the premise that the quantum nature of the algorithm (such as coherence or entanglement between qubits [279]) helps in speeding up a given task.

1. Optimization

Many optimization problems can be encoded in relatively simple mathematical models such as the Max-Cut [128] or the Max-Sat [127] problems. These include tasks such as electronic circuitry layout design, state problems in statistical physics [280], and even automotive configuration [281]. Applying the Quantum Approximate Optimization Algorithm (QAOA) to classical optimization problems is widely considered to be one of the leading candidates for achieving quantum advantage on NISQ devices [131]. There are several reasons for this optimism. QAOA has provable performance guarantees [23, 282] for p = 1. In general, even the p = 1 QAOA ansatz cannot be efficiently simulated on any classical device [283]. At the same time, increasing p can only improve QAOA performance. It was also shown that the ‘bang-bang’ evolution that motivates the QAOA ansatz is the optimal approach given fixed quantum computation time [43]. However, there are problems for which a shallow QAOA ansatz does not perform well [284, 285], suggesting that p may have to grow with the problem size. Larger p requires improvements in the parametrization and optimization [204]. Similarly to quantum chemistry, large-scale experiments of QAOA have already been implemented [286].

2. Machine learning

In the past few decades, the use of machine learning has become common in most, if not all, areas of science. Although the problem of loading classical data on quantum computers is still an active topic of research, there have been significant efforts put forward to use quantum algorithms for machine learning applications [154, 157, 174, 287]. For instance, it has been shown that quantum neural networks can achieve a significantly higher capacity, as measured by the effective dimension, than comparable classical neural networks [77], implying that the former can express a broader class of functions than the latter. Moreover, it has also been pointed out that quantum algorithms can outperform classical ones in deep learning problems [288] and can potentially provide an exponentially better ability to generalize when trained to predict the outcome of physical processes [289]. More recently, a VQA has been proposed for deep reinforcement learning [290]. An exciting prospect for using quantum neural networks is that certain architectures are immune to barren plateaus, and hence are trainable even for large problems [179, 180].

VI. OUTLOOK

In the quest for quantum advantage, analytical and heuristic scaling analysis of VQAs will be increasingly important. Better methods to analyze VQA scalability are anticipated in the future. This will likely include both gradient scaling and other scaling aspects, such as the density of local minima and the shape of the cost landscape. These fundamental results will help to guide the search for quantum advantage.

At the same time, the future will also see an improved toolbox for VQAs. Quantum-aware optimizers will exploit knowledge gained about the cost landscape. These improved optimizers will mitigate the impacts of small gradients and avoid local minima to facilitate rapid training of the parameters in VQAs. Moreover, commercial software packages will streamline the testing of VQAs and further speed up the parameter optimization [82, 291, 292].

Application-specific ansatzes will continue to be developed. Better ansatzes will enhance gradient magnitudes to improve trainability and they may also reduce the impact of noise on VQAs. This will likely include adaptive ansatz strategies, which appear promising. Hybrid quantum-classical models [72] are a natural extension of VQAs where one parameterizes both a classical (for example, neural network) and a quantum ansatz, and such models could also facilitate near-term applications.

New error mitigation strategies are anticipated in the future. These will be crucial for obtaining accurate results from VQAs and will improve accuracy by orders of magnitude. Error mitigation will be hard-coded into
cloud-based quantum computing platforms, to allow users to obtain accurate results with ease.

The future will also see better quantum hardware become available, both in terms of qubit count and noise levels. VQAs will certainly benefit from such improved hardware. Moreover, VQAs will play a central role in benchmarking the capabilities of these new platforms.

In the near future, VQAs will likely see a shift from the proposal and development phase to the implementation phase. Researchers will aim to implement larger, more realistic problems with VQAs instead of toy problems. These implementations will incorporate multiple state-of-the-art strategies for enhancing VQA performance. Combining strategies for improving the accuracy, trainability, and efficiency of VQAs will test their ultimate capabilities and will push the boundaries of NISQ devices, with the grand vision of obtaining quantum advantage.

In the more distant future, VQAs will find use even when the fault-tolerant era arrives. Transitioning from estimating expectation values from Hamiltonian averaging to phase estimation may be an important component here [114]. QAOA may be a good candidate VQA to find usage in the fault-tolerant era, albeit with caveats about the overhead [293]. Strategies that address challenges in the NISQ era, such as keeping circuit depth shallow and avoiding barren plateaus, could still play a role in the fault-tolerant era. Therefore, current research on VQAs will likely remain useful even when fault-tolerant quantum devices arrive.

VII. REFERENCES

[1] Peter W Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” in Proceedings 35th annual symposium on foundations of computer science (IEEE, 1994) pp. 124–134.
[2] Seth Lloyd, “Universal quantum simulators,” Science 273, 1073–1078 (1996).
[3] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd, “Quantum algorithm for linear systems of equations,” Physical Review Letters 103, 150502 (2009).
[4] “IBM Makes Quantum Computing Available on IBM Cloud to Accelerate Innovation,” (2016), press release at https://www-03.ibm.com/press/us/en/pressrelease/49661.wss.
[5] Adetokunbo Adedoyin, John Ambrosiano, Petr Anisimov, Andreas Bärtschi, William Casper, Gopinath Chennupati, Carleton Coffrin, Hristo Djidjev, David Gunter, Satish Karra, et al., “Quantum algorithm implementations for beginners,” arXiv preprint arXiv:1804.03719 (2018).
[6] J. Preskill, “Quantum computing in the NISQ era and beyond,” Quantum 2, 79 (2018).
[7] Frank Arute et al., “Quantum supremacy using a programmable superconducting processor,” Nature 574, 505–510 (2019).
[8] Han-Sen Zhong, Hui Wang, Yu-Hao Deng, Ming-Cheng Chen, Li-Chao Peng, Yi-Han Luo, Jian Qin, Dian Wu, Xing Ding, Yi Hu, et al., “Quantum computational advantage using photons,” Science 370, 1460–1463 (2020).
[9] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, “Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,” Nature 549, 242–246 (2017).
[10] Bryan T Gard, Linghua Zhu, George S Barron, Nicholas J Mayhall, Sophia E Economou, and Edwin Barnes, “Efficient symmetry-preserving state preparation circuits for the variational quantum eigensolver algorithm,” npj Quantum Information 6, 1–9 (2020).
[11] Matthew Otten, Cristian L Cortes, and Stephen K Gray, “Noise-resilient quantum dynamics using symmetry-preserving ansatzes,” arXiv preprint arXiv:1910.06284 (2019).
[12] Nikolay V Tkachenko, James Sud, Yu Zhang, Sergei Tretiak, Petr M Anisimov, Andrew T Arrasmith, Patrick J Coles, Lukasz Cincio, and Pavel A Dub, “Correlation-informed permutation of qubits for reducing ansatz depth in VQE,” arXiv preprint arXiv:2009.04996 (2020).
[13] Christian Kokail, Christine Maier, Rick van Bijnen, Tiff Brydges, Manoj K Joshi, Petar Jurcevic, Christine A Muschik, Pietro Silvi, Rainer Blatt, Christian F Roos, et al., “Self-verifying variational quantum simulation of lattice models,” Nature 569, 355–360 (2019).
[14] Carlos Bravo-Prieto, Josep Lumbreras-Zarapico, Luca Tagliacozzo, and José I Latorre, “Scaling of variational quantum circuit depth for condensed matter systems,” Quantum 4, 272 (2020).
[15] Andrew G. Taube and Rodney J. Bartlett, “New perspectives on unitary coupled-cluster theory,” International Journal of Quantum Chemistry 106, 3393–3401 (2006).
[16] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, “A variational eigenvalue solver on a photonic quantum processor,” Nature Communications 5, 4213 (2014).
[17] Sergey B Bravyi and Alexei Yu Kitaev, “Fermionic quantum computation,” Annals of Physics 298, 210–226 (2002).
[18] Joonho Lee, William J. Huggins, Martin Head-Gordon, and K. Birgitta Whaley, “Generalized unitary coupled cluster wave functions for quantum computation,” Journal of Chemical Theory and Computation 15, 311–324 (2019).
[19] Mario Motta, Erika Ye, Jarrod R McClean, Zhendong Li, Austin J Minnich, Ryan Babbush, and Garnet Kin Chan, “Low rank representations for quantum simulation of electronic structure,” arXiv preprint arXiv:1808.02625 (2018).
[20] Yuta Matsuzawa and Yuki Kurashige, “Jastrow-type decomposition in quantum chemistry for low-depth quantum circuits,” Journal of Chemical Theory and Computation 16, 944–952 (2020).
[21] Ian D Kivlichan, Jarrod McClean, Nathan Wiebe, Craig Gidney, Alán Aspuru-Guzik, Garnet Kin-Lic Chan, and Ryan Babbush, “Quantum simulation of electronic structure with linear depth and connectivity,” Physical Review Letters 120, 110501 (2018).
[22] Kanav Setia, Sergey Bravyi, Antonio Mezzacapo, and James D Whitfield, “Superfast encodings for fermionic
quantum simulation,” Physical Review Research 1, 033033 (2019).
[23] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann, “A quantum approximate optimization algorithm,” arXiv preprint arXiv:1411.4028 (2014).
[24] S. Hadfield, Z. Wang, B. O’Gorman, E. G. Rieffel, D. Venturelli, and R. Biswas, “From the quantum approximate optimization algorithm to a quantum alternating operator ansatz,” Algorithms 12, 34 (2019).
[25] Seth Lloyd, “Quantum approximate optimization is computationally universal,” arXiv preprint arXiv:1812.11075 (2018).
[26] Mauro ES Morales, JD Biamonte, and Zoltán Zimborás, “On the universality of the quantum approximate optimization algorithm,” Quantum Information Processing 19, 1–26 (2020).
[27] Zhihui Wang, Nicholas C Rubin, Jason M Dominy, and Eleanor G Rieffel, “XY mixers: Analytical and numerical results for the quantum alternating operator ansatz,” Physical Review A 101, 012320 (2020).
[28] Dave Wecker, Matthew B Hastings, and Matthias Troyer, “Progress towards practical quantum variational algorithms,” Physical Review A 92, 042303 (2015).
[29] Roeland Wiersema, Cunlu Zhou, Yvette de Sereville, Juan Felipe Carrasquilla, Yong Baek Kim, and Henry Yuen, “Exploring entanglement and optimization within the Hamiltonian variational ansatz,” PRX Quantum 1, 020319 (2020).
[30] Wen Wei Ho and Timothy H Hsieh, “Efficient variational simulation of non-trivial quantum states,” SciPost Phys 6, 029 (2019).
[31] Harper R Grimsley, Sophia E Economou, Edwin Barnes, and Nicholas J Mayhall, “An adaptive variational algorithm for exact molecular simulations on a quantum computer,” Nature Communications 10, 1–9 (2019).
[32] Ho Lun Tang, VO Shkolnikov, George S Barron, Harper R Grimsley, Nicholas J Mayhall, Edwin Barnes, and Sophia E Economou, “qubit-ADAPT-VQE: An adaptive algorithm for constructing hardware-efficient ansatze on a quantum processor,” arXiv preprint arXiv:1911.10205 (2019).
[33] Yordan S Yordanov, V Armaos, Crispin HW Barnes, and David RM Arvidsson-Shukur, “Iterative qubit-excitation based variational quantum eigensolver,” arXiv preprint arXiv:2011.10540 (2020).
[34] Linghua Zhu, Ho Lun Tang, George S Barron, Nicholas J Mayhall, Edwin Barnes, and Sophia E Economou, “An adaptive quantum approximate optimization algorithm for solving combinatorial problems on a quantum computer,” arXiv preprint arXiv:2005.10258 (2020).
[35] Arthur G Rattew, Shaohan Hu, Marco Pistoia, Richard Chen, and Steve Wood, “A domain-agnostic, noise-resistant, hardware-efficient evolutionary variational quantum eigensolver,” arXiv preprint arXiv:1910.09694 (2019).
[36] D Chivilikhin, A Samarin, V Ulyantsev, I Iorsh, AR Oganov, and O Kyriienko, “MoG-VQE: Multiobjective genetic variational quantum eigensolver,” arXiv preprint arXiv:2007.04424 (2020).
[37] Lukasz Cincio, Kenneth Rudinger, Mohan Sarovar, and Patrick J Coles, “Machine learning of noise-resilient quantum circuits,” PRX Quantum 2, 010324 (2021).
[38] L. Cincio, Y. Subaşı, A. T. Sornborger, and P. J. Coles, “Learning the quantum algorithm for state overlap,” New Journal of Physics 20, 113022 (2018).
[39] Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, and Dacheng Tao, “Quantum circuit architecture search: error mitigation and trainability enhancement for variational quantum solvers,” arXiv preprint arXiv:2010.10217 (2020).
[40] Shi-Xin Zhang, Chang-Yu Hsieh, Shengyu Zhang, and Hong Yao, “Differentiable quantum architecture search,” arXiv preprint arXiv:2010.08561 (2020).
[41] M Bilkis, M Cerezo, Guillaume Verdon, Patrick J Coles, and Lukasz Cincio, “A semi-agnostic ansatz with variable structure for quantum machine learning,” arXiv preprint arXiv:2103.06712 (2021).
[42] Arthur G. Rattew, Shaohan Hu, Marco Pistoia, Richard Chen, and Steve Wood, “A domain-agnostic, noise-resistant, hardware-efficient evolutionary variational quantum eigensolver,” arXiv preprint arXiv:1910.09694 (2019).
[43] Zhi-Cheng Yang, Armin Rahmani, Alireza Shabani, Hartmut Neven, and Claudio Chamon, “Optimizing variational quantum algorithms using Pontryagin’s minimum principle,” Phys. Rev. X 7, 021027 (2017).
[44] Alicia B Magann, Christian Arenz, Matthew D Grace, Tak-San Ho, Robert L Kosut, Jarrod R McClean, Herschel A Rabitz, and Mohan Sarovar, “From pulses to circuits and back again: A quantum optimal control perspective on variational quantum algorithms,” PRX Quantum 2, 010101 (2021).
[45] Alexandre Choquette, Agustin Di Paolo, Panagiotis Kl Barkoutsos, David Sénéchal, Ivano Tavernelli, and Alexandre Blais, “Quantum-optimal-control-inspired ansatz for variational quantum algorithms,” arXiv preprint arXiv:2008.01098 (2020).
[46] Jun Li, Xiaodong Yang, Xinhua Peng, and Chang-Pu Sun, “Hybrid quantum-classical approach to quantum optimal control,” Phys. Rev. Lett. 118, 150503 (2017).
[47] Dawei Lu, Keren Li, Jun Li, Hemant Katiyar, Annie Jihyun Park, Guanru Feng, Tao Xin, Hang Li, Guilu Long, Aharon Brodutch, Jonathan Baugh, Bei Zeng, and Raymond Laflamme, “Enhancing quantum control by bootstrapping a quantum processor of 12 qubits,” npj Quantum Information 3, 45 (2017).
[48] Peter JJ O’Malley, Ryan Babbush, Ian D Kivlichan, Jonathan Romero, Jarrod R McClean, Rami Barends, Julian Kelly, Pedram Roushan, Andrew Tranter, Nan Ding, et al., “Scalable quantum simulation of molecular energies,” Physical Review X 6, 031007 (2016).
[49] Tyler Takeshita, Nicholas C. Rubin, Zhang Jiang, Eunseok Lee, Ryan Babbush, and Jarrod R. McClean, “Increasing the representation accuracy of quantum simulations of chemistry without extra quantum resources,” Phys. Rev. X 10, 011004 (2020).
[50] Leslie G. Valiant, “Quantum circuits that can be simulated classically in polynomial time,” SIAM Journal on Computing 31, 1229–1254 (2002).
[51] Barbara M. Terhal and David P. DiVincenzo, “Classical simulation of noninteracting-fermion quantum circuits,” Phys. Rev. A 65, 032325 (2002).
[52] Richard Jozsa and Akimasa Miyake, “Matchgates and classical simulation of quantum circuits,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 464, 3089–3106 (2008).
[53] Wataru Mizukami, Kosuke Mitarai, Yuya O. Nakagawa, tions for ground-state calculations in near-term quan-
Takahiro Yamamoto, Tennin Yan, and Yu-ya Ohnishi, tum computers,” Phys. Rev. Lett. 123, 130501 (2019).
“Orbital optimized unitary coupled cluster theory for [70] John Martyn and Brian Swingle, “Product spectrum
quantum computer,” Phys. Rev. Research 2, 033421 ansatz and the simplicity of thermal states,” Phys. Rev.
(2020). A 100, 032107 (2019).
[54] Igor O. Sokolov, Panagiotis Kl. Barkoutsos, Pauline J. [71] Nobuyuki Yoshioka, Yuya O Nakagawa, Kosuke Mi-
Ollitrault, Donny Greenberg, Julia Rice, Marco Pis- tarai, and Keisuke Fujii, “Variational quantum algo-
toia, and Ivano Tavernelli, “Quantum orbital-optimized rithm for nonequilibrium steady states,” Physical Re-
unitary coupled cluster methods in the strongly corre- view Research 2, 043289 (2020).
lated regime: Can quantum algorithms outperform their classical equivalents?” The Journal of Chemical Physics 152, 124107 (2020).
[55] Jarrod R McClean, Mollie E Kimchi-Schwartz, Jonathan Carter, and Wibe A de Jong, “Hybrid quantum-classical hierarchy for mitigation of decoherence and determination of excited states,” Physical Review A 95, 042308 (2017).
[56] Robert M Parrish, Edward G Hohenstein, Peter L McMahon, and Todd J Martínez, “Quantum computation of electronic transitions using a variational quantum eigensolver,” Physical Review Letters 122, 230401 (2019).
[57] Robert M Parrish and Peter L McMahon, “Quantum filter diagonalization: Quantum eigendecomposition without full quantum phase estimation,” arXiv preprint arXiv:1909.08925 (2019).
[58] William J Huggins, Joonho Lee, Unpil Baek, Bryan O’Gorman, and K Birgitta Whaley, “A non-orthogonal variational quantum eigensolver,” New Journal of Physics 22, 073009 (2020).
[59] Nicholas H Stair, Renke Huang, and Francesco A Evangelista, “A multireference quantum Krylov algorithm for strongly correlated electrons,” Journal of Chemical Theory and Computation 16, 2236–2245 (2020).
[60] Kishor Bharti and Tobias Haug, “Iterative quantum assisted eigensolver,” arXiv preprint arXiv:2010.05638 (2020).
[61] Kishor Bharti and Tobias Haug, “Quantum assisted simulator,” arXiv preprint arXiv:2011.06911 (2020).
[62] Igor L. Markov and Yaoyun Shi, “Simulating quantum computation by contracting tensor networks,” SIAM Journal on Computing 38, 963–981 (2008).
[63] Isaac H Kim and Brian Swingle, “Robust entanglement renormalization on a noisy quantum computer,” arXiv preprint arXiv:1711.07500 (2017).
[64] Isaac H Kim, “Holographic quantum simulation,” arXiv preprint arXiv:1702.02093 (2017).
[65] Jin-Guo Liu, Yi-Hong Zhang, Yuan Wan, and Lei Wang, “Variational quantum eigensolver with fewer qubits,” Phys. Rev. Research 1, 023025 (2019).
[66] Fergus Barratt, James Dborin, Matthias Bal, Vid Stojevic, Frank Pollmann, and Andrew G Green, “Parallel quantum simulation of large systems on small quantum computers,” arXiv preprint arXiv:2003.12087 (2020).
[67] Xiao Yuan, Jinzhao Sun, Junyu Liu, Qi Zhao, and You Zhou, “Quantum simulation with hybrid tensor networks,” arXiv preprint arXiv:2007.00958 (2020).
[68] Keisuke Fujii, Kosuke Mitarai, Wataru Mizukami, and Yuya O Nakagawa, “Deep variational quantum eigensolver: a divide-and-conquer method for solving a larger problem with smaller size quantum computers,” arXiv preprint arXiv:2007.10917 (2020).
[69] Guglielmo Mazzola, Pauline J. Ollitrault, Panagiotis Kl. Barkoutsos, and Ivano Tavernelli, “Nonunitary opera-
[72] Guillaume Verdon, Jacob Marks, Sasha Nanda, Stefan Leichenauer, and Jack Hidary, “Quantum Hamiltonian-based models and the variational quantum thermalizer algorithm,” arXiv preprint arXiv:1910.02071 (2019).
[73] Jin-Guo Liu, Liang Mao, Pan Zhang, and Lei Wang, “Solving quantum statistical mechanics with variational autoregressive networks and quantum circuits,” Machine Learning: Science and Technology 2, 025011 (2021).
[74] Sukin Sim, Peter D Johnson, and Alán Aspuru-Guzik, “Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms,” Advanced Quantum Technologies 2, 1900070 (2019).
[75] Kouhei Nakaji and Naoki Yamamoto, “Expressibility of the alternating layered ansatz for quantum computation,” arXiv preprint arXiv:2005.12537 (2020).
[76] Maria Schuld, Ryan Sweke, and Johannes Jakob Meyer, “The effect of data encoding on the expressive power of variational quantum machine learning models,” arXiv preprint arXiv:2008.08605 (2020).
[77] Amira Abbas, David Sutter, Christa Zoufal, Aurélien Lucchi, Alessio Figalli, and Stefan Woerner, “The power of quantum neural networks,” arXiv preprint arXiv:2011.00027 (2020).
[78] Zoë Holmes, Kunal Sharma, M. Cerezo, and Patrick J Coles, “Connecting ansatz expressibility to gradient magnitudes and barren plateaus,” arXiv preprint arXiv:2101.02138 (2021).
[79] Gian Giacomo Guerreschi and Mikhail Smelyanskiy, “Practical optimization for hybrid quantum-classical algorithms,” arXiv preprint arXiv:1701.01450 (2017).
[80] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, “Quantum circuit learning,” Phys. Rev. A 98, 032309 (2018).
[81] Maria Schuld, Ville Bergholm, Christian Gogolin, Josh Izaac, and Nathan Killoran, “Evaluating analytic gradients on quantum hardware,” Phys. Rev. A 99, 032331 (2019).
[82] Ville Bergholm, Josh Izaac, Maria Schuld, Christian Gogolin, M Sohaib Alam, Shahnawaz Ahmed, Juan Miguel Arrazola, Carsten Blank, Alain Delgado, Soran Jahangiri, et al., “PennyLane: Automatic differentiation of hybrid quantum-classical computations,” arXiv preprint arXiv:1811.04968 (2018).
[83] Andrea Mari, Thomas R Bromley, and Nathan Killoran, “Estimating the gradient and higher-order derivatives on quantum hardware,” Physical Review A 103, 012405 (2021).
[84] M. Cerezo and Patrick J Coles, “Impact of barren plateaus on the Hessian and higher order derivatives,” arXiv preprint arXiv:2008.07454 (2020).
[85] Ken M Nakanishi, Keisuke Fujii, and Synge Todo, “Sequential minimal optimization for quantum-classical hybrid algorithms,” Physical Review Research 2, 043158
(2020).
[86] Bálint Koczor and Simon C Benjamin, “Quantum analytic descent,” arXiv preprint arXiv:2008.13774 (2020).
[87] Sam McArdle, Tyson Jones, Suguru Endo, Ying Li, Simon C Benjamin, and Xiao Yuan, “Variational ansatz-based quantum simulation of imaginary time evolution,” npj Quantum Information 5, 1–6 (2019).
[88] James Stokes, Josh Izaac, Nathan Killoran, and Giuseppe Carleo, “Quantum natural gradient,” Quantum 4, 269 (2020).
[89] Bálint Koczor and Simon C Benjamin, “Quantum natural gradient generalised to non-unitary circuits,” arXiv preprint arXiv:1912.08660 (2019).
[90] Ying Li and Simon C Benjamin, “Efficient variational quantum simulator incorporating active error minimization,” Physical Review X 7, 021050 (2017).
[91] Xiao Yuan, Suguru Endo, Qi Zhao, Ying Li, and Simon C Benjamin, “Theory of variational quantum simulation,” Quantum 3, 191 (2019).
[92] Suguru Endo, Jinzhao Sun, Ying Li, Simon C Benjamin, and Xiao Yuan, “Variational quantum simulation of general processes,” Physical Review Letters 125, 010501 (2020).
[93] Kosuke Mitarai and Keisuke Fujii, “Methodology for replacing indirect measurements with direct measurements,” Phys. Rev. Research 1, 013006 (2019).
[94] Lennart Bittel and Martin Kliesch, “Training variational quantum algorithms is NP-hard–even for logarithmically many qubits and free fermionic systems,” arXiv preprint arXiv:2101.07267 (2021).
[95] Diederik P Kingma and Jimmy Ba, “Adam: A method for stochastic optimization,” in Proceedings of the 3rd International Conference on Learning Representations (ICLR) (2015).
[96] Jonas M Kübler, Andrew Arrasmith, Lukasz Cincio, and Patrick J Coles, “An adaptive optimizer for measurement-frugal variational algorithms,” Quantum 4, 263 (2020).
[97] Ryan Sweke, Frederik Wilde, Johannes Jakob Meyer, Maria Schuld, Paul K Fährmann, Barthélémy Meynard-Piganeau, and Jens Eisert, “Stochastic gradient descent for hybrid quantum-classical optimization,” Quantum 4, 314 (2020).
[98] Max Wilson, Sam Stromswold, Filip Wudarski, Stuart Hadfield, Norm M Tubman, and Eleanor Rieffel, “Optimizing quantum heuristics with meta-learning,” arXiv preprint arXiv:1908.03185 (2019).
[99] James C Spall, “Multivariate stochastic approximation using a simultaneous perturbation gradient approximation,” IEEE Transactions on Automatic Control 37, 332–341 (1992).
[100] Robert M Parrish, Joseph T Iosue, Asier Ozaeta, and Peter L McMahon, “A Jacobi diagonalization and Anderson acceleration algorithm for variational quantum algorithm parameter optimization,” arXiv preprint arXiv:1904.03206 (2019).
[101] Patrick Huembeli and Alexandre Dauphin, “Characterizing the loss landscape of variational quantum circuits,” Quantum Science and Technology 6, 025011 (2021).
[102] Aram Harrow and John Napp, “Low-depth gradient measurements can improve convergence in variational hybrid quantum-classical algorithms,” arXiv preprint arXiv:1901.05374 (2019).
[103] Jacob Biamonte, “Universal variational quantum computation,” Phys. Rev. A 103, L030401 (2021).
[104] Daniel S. Abrams and Seth Lloyd, “Quantum algorithm providing exponential speed increase for finding eigenvalues and eigenvectors,” Phys. Rev. Lett. 83, 5162–5165 (1999).
[105] Alán Aspuru-Guzik, Anthony D Dutoi, Peter J Love, and Martin Head-Gordon, “Simulated quantum computation of molecular energies,” Science 309, 1704–1707 (2005).
[106] Abhinav Kandala, Kristan Temme, Antonio D Córcoles, Antonio Mezzacapo, Jerry M Chow, and Jay M Gambetta, “Error mitigation extends the computational reach of a noisy quantum processor,” Nature 567, 491–495 (2019).
[107] Oscar Higgott, Daochen Wang, and Stephen Brierley, “Variational quantum computation of excited states,” Quantum 3, 156 (2019).
[108] Harry Buhrman, Richard Cleve, John Watrous, and Ronald De Wolf, “Quantum fingerprinting,” Physical Review Letters 87, 167902 (2001).
[109] Tyson Jones, Suguru Endo, Sam McArdle, Xiao Yuan, and Simon C Benjamin, “Variational quantum algorithms for discovering Hamiltonian spectra,” Physical Review A 99, 062304 (2019).
[110] Ken M Nakanishi, Kosuke Mitarai, and Keisuke Fujii, “Subspace-search variational quantum eigensolver for excited states,” Physical Review Research 1, 033062 (2019).
[111] Jarrod R McClean, Matthew P Harrigan, Masoud Mohseni, Nicholas C Rubin, Zhang Jiang, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven, “Low depth mechanisms for quantum optimization,” arXiv preprint arXiv:2008.08615 (2020).
[112] A Garcia-Saez and JI Latorre, “Addressing hard classical problems with adiabatically assisted variational quantum eigensolvers,” arXiv preprint arXiv:1806.02287 (2018).
[113] M. Cerezo, Kunal Sharma, Andrew Arrasmith, and Patrick J Coles, “Variational quantum state eigensolver,” arXiv preprint arXiv:2004.01372 (2020).
[114] Daochen Wang, Oscar Higgott, and Stephen Brierley, “Accelerated variational quantum eigensolver,” Physical Review Letters 122, 140504 (2019).
[115] Guoming Wang, Dax Enshan Koh, Peter D Johnson, and Yudong Cao, “Minimizing estimation runtime on noisy quantum computers,” PRX Quantum 2, 010346 (2021).
[116] Guoming Wang, Dax Enshan Koh, Peter D Johnson, and Yudong Cao, “Bayesian inference with engineered likelihood functions for robust amplitude estimation,” Preprint at https://arxiv.org/abs/2006.09350 (2020).
[117] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information: 10th Anniversary Edition, 10th ed. (Cambridge University Press, New York, NY, USA, 2011).
[118] AD McLachlan, “A variational solution of the time-dependent Schrödinger equation,” Molecular Physics 8, 39–44 (1964).
[119] Yong-Xin Yao, Niladri Gomes, Feng Zhang, Thomas Iadecola, Cai-Zhuang Wang, Kai-Ming Ho, and Peter P. Orth, “Adaptive variational quantum dynamics simulations,” arXiv preprint arXiv:2011.00622 (2020).
[120] Zi-Jian Zhang, Jinzhao Sun, Xiao Yuan, and Man-Hong
Yung, “Low-depth Hamiltonian simulation by adaptive product formula,” arXiv preprint arXiv:2011.05283 (2020).
[121] Kentaro Heya, Ken M Nakanishi, Kosuke Mitarai, and Keisuke Fujii, “Subspace variational quantum simulator,” arXiv preprint arXiv:1904.08566 (2019).
[122] Cristina Cirstoiu, Zoe Holmes, Joseph Iosue, Lukasz Cincio, Patrick J Coles, and Andrew Sornborger, “Variational fast forwarding for quantum simulation beyond the coherence time,” npj Quantum Information 6, 1–10 (2020).
[123] Joe Gibbs, Kaitlin Gili, Zoë Holmes, Benjamin Commeau, Andrew Arrasmith, Lukasz Cincio, Patrick J Coles, and Andrew Sornborger, “Long-time simulations with high fidelity on quantum hardware,” arXiv preprint arXiv:2102.04313 (2021).
[124] S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sornborger, and P. J. Coles, “Quantum-assisted quantum compiling,” Quantum 3, 140 (2019).
[125] Benjamin Commeau, M. Cerezo, Zoë Holmes, Lukasz Cincio, Patrick J Coles, and Andrew Sornborger, “Variational Hamiltonian diagonalization for dynamical quantum simulation,” arXiv preprint arXiv:2009.02559 (2020).
[126] Nikolaj Moll, Panagiotis Barkoutsos, Lev S Bishop, Jerry M Chow, Andrew Cross, Daniel J Egger, Stefan Filipp, Andreas Fuhrer, Jay M Gambetta, Marc Ganzhorn, et al., “Quantum optimization using variational algorithms on near-term quantum devices,” Quantum Science and Technology 3, 030503 (2018).
[127] Cedric Yen-Yu Lin and Yechao Zhu, “Performance of QAOA on typical instances of constraint satisfaction problems with bounded degree,” arXiv preprint arXiv:1601.01744 (2016).
[128] Z. Wang, S. Hadfield, Z. Jiang, and E. G. Rieffel, “Quantum approximate optimization algorithm for MaxCut: A fermionic view,” Phys. Rev. A 97, 022304 (2018).
[129] Ruslan Shaydulin, Ilya Safro, and Jeffrey Larson, “Multistart methods for quantum approximate optimization,” in 2019 IEEE High Performance Extreme Computing Conference (HPEC) (IEEE, 2019) pp. 1–8.
[130] Jonathan Romero, Ryan Babbush, Jarrod R McClean, Cornelius Hempel, Peter J Love, and Alán Aspuru-Guzik, “Strategies for quantum computing molecular energies using the unitary coupled cluster ansatz,” Quantum Science and Technology 4, 014008 (2018).
[131] Gavin E Crooks, “Performance of the quantum approximate optimization algorithm on the maximum cut problem,” arXiv preprint arXiv:1811.08419 (2018).
[132] Dave Wecker, Matthew B Hastings, and Matthias Troyer, “Training a quantum optimizer,” Physical Review A 94, 022309 (2016).
[133] Sami Khairy, Ruslan Shaydulin, Lukasz Cincio, Yuri Alexeev, and Prasanna Balaprakash, “Learning to optimize variational quantum circuits to solve combinatorial problems,” Proceedings of the AAAI Conference on Artificial Intelligence 34, 2367–2375 (2020).
[134] A Ambainis, “Variable time amplitude amplification and a faster quantum algorithm for solving systems of linear equations,” in 29th Int. Symp. Theoretical Aspects of Computer Science (STACS 2012), Vol. 14 (2012) pp. 636–647.
[135] Y. Subaşı, R. D. Somma, and D. Orsucci, “Quantum algorithms for systems of linear equations inspired by adiabatic quantum computing,” Phys. Rev. Lett. 122, 060504 (2019).
[136] A. Childs, R. Kothari, and R. Somma, “Quantum algorithm for systems of linear equations with exponentially improved dependence on precision,” SIAM J. Computing 46, 1920–1950 (2017).
[137] Shantanav Chakraborty, András Gilyén, and Stacey Jeffery, “The Power of Block-Encoded Matrix Powers: Improved Regression Techniques via Faster Hamiltonian Simulation,” in 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), Leibniz International Proceedings in Informatics (LIPIcs), Vol. 132 (2019) pp. 33:1–33:14.
[138] A. Scherer, B. Valiron, S.-C. Mau, S. Alexander, E. Van den Berg, and T. E. Chapuran, “Concrete resource analysis of the quantum linear-system algorithm used to compute the electromagnetic scattering cross section of a 2D target,” Quantum Information Processing 16, 60 (2017).
[139] Carlos Bravo-Prieto, Ryan LaRose, M. Cerezo, Yigit Subasi, Lukasz Cincio, and Patrick J Coles, “Variational quantum linear solver: A hybrid algorithm for linear systems,” arXiv preprint arXiv:1909.05820 (2019).
[140] Xiaosi Xu, Jinzhao Sun, Suguru Endo, Ying Li, Simon C Benjamin, and Xiao Yuan, “Variational algorithms for linear algebra,” arXiv preprint arXiv:1909.03898 (2019).
[141] Hsin-Yuan Huang, Kishor Bharti, and Patrick Rebentrost, “Near-term quantum algorithms for linear systems of equations,” arXiv preprint arXiv:1909.07344 (2019).
[142] Yiğit Subaşı, Rolando D Somma, and Davide Orsucci, “Quantum algorithms for systems of linear equations inspired by adiabatic quantum computing,” Physical Review Letters 122, 060504 (2019).
[143] Michael Lubasch, Jaewoo Joo, Pierre Moinier, Martin Kiffner, and Dieter Jaksch, “Variational quantum algorithms for nonlinear problems,” Physical Review A 101, 010301 (2020).
[144] Oleksandr Kyriienko, Annie E Paine, and Vincent E Elfving, “Solving nonlinear differential equations with differentiable quantum circuits,” arXiv preprint arXiv:2011.10395 (2020).
[145] Eric Anschuetz, Jonathan Olson, Alán Aspuru-Guzik, and Yudong Cao, “Variational quantum factoring,” in International Workshop on Quantum Technology and Optimization Problems (Springer, 2019) pp. 74–85.
[146] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost, “Quantum principal component analysis,” Nature Physics 10, 631–633 (2014).
[147] Ryan LaRose, Arkin Tikku, Étude O’Neel-Judy, Lukasz Cincio, and Patrick J Coles, “Variational quantum state diagonalization,” npj Quantum Information 5, 1–10 (2019).
[148] Kentaro Heya, Yasunari Suzuki, Yasunobu Nakamura, and Keisuke Fujii, “Variational quantum gate optimization,” arXiv preprint arXiv:1810.12745 (2018).
[149] Tyson Jones and Simon C Benjamin, “Quantum compilation and circuit optimisation via energy dissipation,” arXiv preprint arXiv:1811.03147 (2018).
[150] Kunal Sharma, Sumeet Khatri, M. Cerezo, and Patrick J Coles, “Noise resilience of variational quantum compiling,” New Journal of Physics 22, 043006 (2020).
[151] Jacques Carolan, Masoud Mohseni, Jonathan P Olson,
Mihika Prabhu, Changchen Chen, Darius Bunandar, Murphy Yuezhen Niu, Nicholas C Harris, Franco NC Wong, Michael Hochberg, et al., “Variational quantum unsampling on a quantum photonic processor,” Nature Physics 16, 322–327 (2020).
[152] Peter D Johnson, Jonathan Romero, Jonathan Olson, Yudong Cao, and Alán Aspuru-Guzik, “Qvector: an algorithm for device-tailored quantum error correction,” arXiv preprint arXiv:1711.02249 (2017).
[153] Xiaosi Xu, Simon C Benjamin, and Xiao Yuan, “Variational circuit compiler for quantum error correction,” arXiv preprint arXiv:1911.05759 (2019).
[154] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd, “Quantum machine learning,” Nature 549, 195–202 (2017).
[155] Edward Farhi and Hartmut Neven, “Classification with quantum neural networks on near term processors,” arXiv preprint arXiv:1802.06002 (2018).
[156] Maria Schuld, Alex Bocharov, Krysta M Svore, and Nathan Wiebe, “Circuit-centric quantum classifiers,” Physical Review A 101, 032308 (2020).
[157] Maria Schuld and Nathan Killoran, “Quantum machine learning in feature Hilbert spaces,” Physical Review Letters 122, 040504 (2019).
[158] Vojtěch Havlíček, Antonio D Córcoles, Kristan Temme, Aram W Harrow, Abhinav Kandala, Jerry M Chow, and Jay M Gambetta, “Supervised learning with quantum-enhanced feature spaces,” Nature 567, 209–212 (2019).
[159] Edwin Stoudenmire and David J Schwab, “Supervised learning with tensor networks,” in Advances in Neural Information Processing Systems (2016) pp. 4799–4807.
[160] Seth Lloyd, Maria Schuld, Aroosa Ijaz, Josh Izaac, and Nathan Killoran, “Quantum embeddings for machine learning,” arXiv preprint arXiv:2001.03622 (2020).
[161] Adrián Pérez-Salinas, Alba Cervera-Lierta, Elies Gil-Fuster, and José I Latorre, “Data re-uploading for a universal quantum classifier,” Quantum 4, 226 (2020).
[162] Takeru Kusumoto, Kosuke Mitarai, Keisuke Fujii, Masahiro Kitagawa, and Makoto Negoro, “Experimental quantum kernel machine learning with nuclear spins in a solid,” arXiv preprint arXiv:1911.12021 (2019).
[163] J. Romero, J. P. Olson, and A. Aspuru-Guzik, “Quantum autoencoders for efficient compression of quantum data,” Quantum Science and Technology 2, 045001 (2017).
[164] Kwok Ho Wan, Oscar Dahlsten, Hlér Kristjánsson, Robert Gardner, and MS Kim, “Quantum generalisation of feedforward neural networks,” npj Quantum Information 3, 36 (2017).
[165] Guillaume Verdon, Jason Pye, and Michael Broughton, “A universal training algorithm for quantum deep learning,” arXiv preprint arXiv:1806.09729 (2018).
[166] M. Cerezo, Akira Sone, Tyler Volkoff, Lukasz Cincio, and Patrick J Coles, “Cost function dependent barren plateaus in shallow parametrized quantum circuits,” Nature Communications 12, 1–12 (2021).
[167] Chenfeng Cao and Xin Wang, “Noise-assisted quantum autoencoder,” arXiv preprint arXiv:2012.08331 (2020).
[168] Alex Pepper, Nora Tischler, and Geoff J Pryde, “Experimental realization of a quantum autoencoder: The compression of qutrits via machine learning,” Physical Review Letters 122, 060501 (2019).
[169] Guillaume Verdon, Michael Broughton, and Jacob Biamonte, “A quantum algorithm to train neural networks using low-depth circuits,” arXiv preprint arXiv:1712.05304 (2017).
[170] Marcello Benedetti, Delfina Garcia-Pintos, Oscar Perdomo, Vicente Leyton-Ortega, Yunseong Nam, and Alejandro Perdomo-Ortiz, “A generative modeling approach for benchmarking and training shallow quantum circuits,” npj Quantum Information 5, 1–9 (2019).
[171] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, and Dacheng Tao, “Expressive power of parametrized quantum circuits,” Physical Review Research 2, 033125 (2020).
[172] Jin-Guo Liu and Lei Wang, “Differentiable learning of quantum circuit Born machines,” Physical Review A 98, 062324 (2018).
[173] Brian Coyle, Daniel Mills, Vincent Danos, and Elham Kashefi, “The Born supremacy: Quantum advantage and training of an Ising Born machine,” npj Quantum Information 6, 1–11 (2020).
[174] Jonathan Romero and Alán Aspuru-Guzik, “Variational quantum generators: Generative adversarial quantum machine learning for continuous distributions,” arXiv preprint arXiv:1901.00848 (2019).
[175] MV Altaisky, “Quantum neural network,” arXiv preprint quant-ph/0107012 (2001).
[176] Kerstin Beer, Dmytro Bondarenko, Terry Farrelly, Tobias J Osborne, Robert Salzmann, Daniel Scheiermann, and Ramona Wolf, “Training deep quantum neural networks,” Nature Communications 11, 1–6 (2020).
[177] Iris Cong, Soonwon Choi, and Mikhail D Lukin, “Quantum convolutional neural networks,” Nature Physics 15, 1273–1278 (2019).
[178] Lukas Franken and Bogdan Georgiev, “Explorations in quantum neural networks with intermediate measurements,” in Proceedings of ESANN (2020).
[179] Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T Sornborger, and Patrick J Coles, “Absence of barren plateaus in quantum convolutional neural networks,” arXiv preprint arXiv:2011.02966 (2020).
[180] Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, and Dacheng Tao, “Toward trainability of quantum neural networks,” arXiv preprint arXiv:2011.06258 (2020).
[181] Wojciech Hubert Zurek, “Quantum Darwinism,” Nature Physics 5, 181–188 (2009).
[182] A. Arrasmith, L. Cincio, A. T. Sornborger, W. H. Zurek, and P. J. Coles, “Variational consistent histories as a hybrid algorithm for quantum foundations,” Nature Communications 10, 3438 (2019).
[183] Robert B Griffiths, Consistent quantum theory (Cambridge University Press, 2003).
[184] Zoë Holmes, Andrew Arrasmith, Bin Yan, Patrick J Coles, Andreas Albrecht, and Andrew T Sornborger, “Barren plateaus preclude learning scramblers,” arXiv preprint arXiv:2009.14808 (2020).
[185] Patrick Hayden and John Preskill, “Black holes as mirrors: quantum information in random subsystems,” Journal of High Energy Physics 2007, 120 (2007).
[186] Mark M Wilde, Quantum information theory (Cambridge University Press, 2013).
[187] B. Rosgen and J. Watrous, “On the hardness of distinguishing mixed-state quantum computations,” in 20th Annual IEEE Conference on Computational Complexity (CCC’05) (2005) pp. 344–354.
[188] M. Cerezo, Alexander Poremba, Lukasz Cincio, and
Patrick J Coles, “Variational quantum fidelity estimation,” Quantum 4, 248 (2020).
[189] Carlos Bravo-Prieto, Diego García-Martín, and José I Latorre, “Quantum singular value decomposer,” Physical Review A 101, 062310 (2020).
[190] Bálint Koczor, Suguru Endo, Tyson Jones, Yuichiro Matsuzaki, and Simon C Benjamin, “Variational-state quantum metrology,” New Journal of Physics 22, 083038 (2020).
[191] Raphael Kaubruegger, Pietro Silvi, Christian Kokail, Rick van Bijnen, Ana Maria Rey, Jun Ye, Adam M Kaufman, and Peter Zoller, “Variational spin-squeezing algorithms on programmable quantum sensors,” Physical Review Letters 123, 260505 (2019).
[192] Ziqi Ma, Pranav Gokhale, Tian-Xing Zheng, Sisi Zhou, Xiaofei Yu, Liang Jiang, Peter Maurer, and Frederic T Chong, “Adaptive circuit learning for quantum metrology,” arXiv preprint arXiv:2010.08702 (2020).
[193] Jacob L Beckey, M. Cerezo, Akira Sone, and Patrick J Coles, “Variational quantum algorithm for estimating the quantum Fisher information,” arXiv preprint arXiv:2010.10488 (2020).
[194] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven, “Barren plateaus in quantum neural network training landscapes,” Nature Communications 9, 4812 (2018).
[195] Andrew Arrasmith, M. Cerezo, Piotr Czarnik, Lukasz Cincio, and Patrick J Coles, “Effect of barren plateaus on gradient-free optimization,” arXiv preprint arXiv:2011.12245 (2020).
[196] Aram W Harrow and Richard A Low, “Random quantum circuits are approximate 2-designs,” Communications in Mathematical Physics 291, 257–302 (2009).
[197] Fernando GSL Brandao, Aram W Harrow, and Michał Horodecki, “Local random quantum circuits are approximate polynomial-designs,” Communications in Mathematical Physics 346, 397–434 (2016).
[198] Alexey Uvarov, Jacob D. Biamonte, and Dmitry Yudin, “Variational quantum eigensolver for frustrated quantum systems,” Phys. Rev. B 102, 075104 (2020).
[199] Alexey Uvarov and Jacob Biamonte, “On barren plateaus and cost function locality in variational quantum algorithms,” arXiv preprint arXiv:2011.10530 (2020).
[200] Kunal Sharma, M. Cerezo, Lukasz Cincio, and Patrick J Coles, “Trainability of dissipative perceptron-based quantum neural networks,” arXiv preprint arXiv:2005.12458 (2020).
[201] Carlos Ortiz Marrero, Mária Kieferová, and Nathan Wiebe, “Entanglement induced barren plateaus,” arXiv preprint arXiv:2010.15968 (2020).
[202] Samson Wang, Enrico Fontana, M. Cerezo, Kunal Sharma, Akira Sone, Lukasz Cincio, and Patrick J Coles, “Noise-induced barren plateaus in variational quantum algorithms,” arXiv preprint arXiv:2007.14384 (2020).
[203] Daniel Stilck Franca and Raul Garcia-Patron, “Limitations of optimization algorithms on noisy quantum devices,” arXiv preprint arXiv:2009.05532 (2020).
[204] Leo Zhou, Sheng-Tao Wang, Soonwon Choi, Hannes Pichler, and Mikhail D Lukin, “Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices,” Physical Review X 10, 021067 (2020).
[205] Edward Grant, Leonard Wossnig, Mateusz Ostaszewski, and Marcello Benedetti, “An initialization strategy for addressing barren plateaus in parametrized quantum circuits,” Quantum 3, 214 (2019).
[206] Tyler Volkoff and Patrick J Coles, “Large gradients via correlation in random parameterized quantum circuits,” Quantum Science and Technology 6, 025008 (2021).
[207] Andrea Skolik, Jarrod R McClean, Masoud Mohseni, Patrick van der Smagt, and Martin Leib, “Layerwise learning for quantum neural networks,” arXiv preprint arXiv:2006.14904 (2020).
[208] Ernesto Campos, Aly Nasrallah, and Jacob Biamonte, “Abrupt transitions in variational quantum circuit training,” Physical Review A 103, 032607 (2021).
[209] Guillaume Verdon, Michael Broughton, Jarrod R McClean, Kevin J Sung, Ryan Babbush, Zhang Jiang, Hartmut Neven, and Masoud Mohseni, “Learning to learn with quantum neural networks via classical neural networks,” arXiv preprint arXiv:1907.05415 (2019).
[210] Abhinav Anand, Matthias Degroote, and Alán Aspuru-Guzik, “Natural evolutionary strategies for variational quantum computation,” arXiv preprint arXiv:2012.00101 (2020).
[211] Zhenyu Cai, “Resource estimation for quantum variational simulations of the Hubbard model,” Phys. Rev. Applied 14, 014059 (2020).
[212] Chris Cade, Lana Mineh, Ashley Montanaro, and Stasja Stanisic, “Strategies for solving the Fermi-Hubbard model on near-term quantum computers,” Physical Review B 102, 235122 (2020).
[213] Andrew Jena, Scott Genin, and Michele Mosca, “Pauli partitioning with respect to gate sets,” arXiv preprint arXiv:1907.07859 (2019).
[214] Ophelia Crawford, Barnaby van Straaten, Daochen Wang, Thomas Parks, Earl Campbell, and Stephen Brierley, “Efficient quantum measurement of Pauli operators in the presence of finite sampling error,” Quantum 5, 385 (2021).
[215] Vladyslav Verteletskyi, Tzu-Ching Yen, and Artur F Izmaylov, “Measurement optimization in the variational quantum eigensolver using a minimum clique cover,” The Journal of Chemical Physics 152, 124114 (2020).
[216] Artur F Izmaylov, Tzu-Ching Yen, Robert A Lang, and Vladyslav Verteletskyi, “Unitary partitioning approach to the measurement problem in the variational quantum eigensolver method,” Journal of Chemical Theory and Computation 16, 190–195 (2019).
[217] Andrew Zhao, Andrew Tranter, William M Kirby, Shu Fay Ung, Akimasa Miyake, and Peter J Love, “Measurement reduction in variational quantum algorithms,” Physical Review A 101, 062322 (2020).
[218] Tzu-Ching Yen, Vladyslav Verteletskyi, and Artur F Izmaylov, “Measuring all compatible operators in one series of single-qubit measurements using unitary transformations,” Journal of Chemical Theory and Computation 16, 2400–2409 (2020).
[219] Pranav Gokhale and Frederic T Chong, “$O(N^3)$ measurement cost for variational quantum eigensolver on molecular Hamiltonians,” arXiv preprint arXiv:1908.11857 (2019).
[220] Jarrod R McClean, Jonathan Romero, Ryan Babbush, and Alán Aspuru-Guzik, “The theory of variational hybrid quantum-classical algorithms,” New Journal of Physics 18, 023023 (2016).
[221] William J Huggins, Jarrod R McClean, Nicholas C Rubin, Zhang Jiang, Nathan Wiebe, K Birgitta Whaley, and Ryan Babbush, “Efficient and noise resilient measurements for quantum chemistry on near-term quantum computers,” npj Quantum Information 7, 1–9 (2021).
[222] Nicholas C Rubin, Ryan Babbush, and Jarrod McClean, “Application of fermionic marginal constraints to hybrid quantum algorithms,” New Journal of Physics 20, 053020 (2018).
[223] Andrew Arrasmith, Lukasz Cincio, Rolando D Somma, and Patrick J Coles, “Operator sampling for shot-frugal optimization in variational algorithms,” arXiv preprint arXiv:2004.06252 (2020).
[224] Barnaby van Straaten and Bálint Koczor, “Measurement cost of metric-aware variational quantum algorithms,” arXiv preprint arXiv:2005.05172 (2020).
[225] Hsin-Yuan Huang, Richard Kueng, and John Preskill, “Predicting many properties of a quantum system from very few measurements,” Nature Physics 16, 1050–1057 (2020).
[226] Charles Hadfield, Sergey Bravyi, Rudy Raymond, and Antonio Mezzacapo, “Measurements of quantum Hamiltonians with locally-biased classical shadows,” arXiv preprint arXiv:2006.15788 (2020).
[227] Giacomo Torlai, Guglielmo Mazzola, Giuseppe Carleo, and Antonio Mezzacapo, “Precise measurement of quantum observables with neural-network estimators,” Physical Review Research 2, 022060 (2020).
[228] Laura Gentini, Alessandro Cuccoli, Stefano Pirandola, Paola Verrucchi, and Leonardo Banchi, “Noise-resilient variational hybrid quantum-classical optimization,” Physical Review A 102, 052414 (2020).
[229] Enrico Fontana, M. Cerezo, Andrew Arrasmith, Ivan Rungger, and Patrick J Coles, “Optimizing parametrized quantum circuits via noise-induced breaking of symmetries,” arXiv preprint arXiv:2011.08763 (2020).
[230] Cheng Xue, Zhao-Yun Chen, Yu-Chun Wu, and Guo-Ping Guo, “Effects of quantum noise on quantum approximate optimization algorithm,” arXiv preprint arXiv:1909.02196 (2019).
[231] Jeffrey Marshall, Filip Wudarski, Stuart Hadfield, and Tad Hogg, “Characterizing local noise in QAOA circuits,” IOP SciNotes 1, 025208 (2020).
[232] Isaac H Kim, “Noise-resilient preparation of quantum many-body ground states,” arXiv preprint arXiv:1703.00032 (2017).
[233] Suguru Endo, Zhenyu Cai, Simon C Benjamin, and Xiao Yuan, “Hybrid quantum-classical algorithms and quantum error mitigation,” Journal of the Physical Society of Japan 90, 032001 (2021).
[234] Kristan Temme, Sergey Bravyi, and Jay M Gambetta, “Error mitigation for short-depth quantum circuits,” Physical Review Letters 119, 180509 (2017).
[235] Suguru Endo, Simon C Benjamin, and Ying Li, “Practical quantum error mitigation for near-future applications,” Physical Review X 8, 031027 (2018).
[236] Zhenyu Cai, “Multi-exponential error extrapolation and combining error mitigation techniques for NISQ applications,” arXiv preprint arXiv:2007.01265 (2020).
[237] Matthew Otten and Stephen K Gray, “Recovering noise-free quantum observables,” Physical Review A 99, 012338 (2019).
[238] Suguru Endo, Qi Zhao, Ying Li, Simon Benjamin, and Xiao Yuan, “Mitigating algorithmic errors in a Hamiltonian simulation,” Physical Review A 99, 012334 (2019).
[239] Jinzhao Sun, Xiao Yuan, Takahiro Tsunoda, Vlatko Vedral, Simon C Benjamin, and Suguru Endo, “Mitigating realistic noise in practical noisy intermediate-scale quantum devices,” Physical Review Applied 15, 034026 (2021).
[240] Armands Strikis, Dayue Qin, Yanzhu Chen, Simon C Benjamin, and Ying Li, “Learning-based quantum error mitigation,” arXiv preprint arXiv:2005.07601 (2020).
[241] Piotr Czarnik, Andrew Arrasmith, Patrick J Coles, and Lukasz Cincio, “Error mitigation with Clifford quantum-circuit data,” arXiv preprint arXiv:2005.10189 (2020).
[242] Angus Lowe, Max Hunter Gordon, Piotr Czarnik, Andrew Arrasmith, Patrick J Coles, and Lukasz Cincio, “Unified approach to data-driven quantum error mitigation,” arXiv preprint arXiv:2011.01157 (2020).
[243] Sam McArdle, Xiao Yuan, and Simon Benjamin, “Error-mitigated digital quantum simulation,” Physical Review Letters 122, 180501 (2019).
[244] Xavi Bonet-Monroig, Ramiro Sagastizabal, M Singh, and TE O’Brien, “Low-cost error mitigation by symmetry verification,” Physical Review A 98, 062339 (2018).
[245] Jarrod R McClean, Zhang Jiang, Nicholas C Rubin, Ryan Babbush, and Hartmut Neven, “Decoding quantum errors with subspace expansions,” Nature Communications 11, 1–9 (2020).
[246] Bálint Koczor, “Exponential error suppression for near-term quantum devices,” arXiv preprint arXiv:2011.05942 (2020).
[247] William J Huggins, Sam McArdle, Thomas E O’Brien, Joonho Lee, Nicholas C Rubin, Sergio Boixo, K Birgitta Whaley, Ryan Babbush, and Jarrod R McClean, “Virtual distillation for quantum error mitigation,” arXiv preprint arXiv:2011.07064 (2020).
[248] Sergey Bravyi, Sarah Sheldon, Abhinav Kandala, David C McKay, and Jay M Gambetta, “Mitigating measurement errors in multi-qubit experiments,” arXiv preprint arXiv:2006.14044 (2020).
[249] Daiqin Su, Robert Israel, Kunal Sharma, Haoyu Qi, Ish Dhand, and Kamil Brádler, “Error mitigation on a near-term quantum photonic device,” arXiv preprint arXiv:2008.06670 (2020).
[250] Yudong Cao, Jonathan Romero, and Alán Aspuru-Guzik, “Potential of quantum computing for drug discovery,” IBM Journal of Research and Development 62, 6–1 (2018).
[251] Yudong Cao, Jonathan Romero, Jonathan P Olson, Matthias Degroote, Peter D Johnson, Mária Kieferová, Ian D Kivlichan, Tim Menke, Borja Peropadre, Nicolas PD Sawaya, et al., “Quantum chemistry in the age of quantum computing,” Chemical Reviews 119, 10856–10915 (2019).
[252] Carlos Outeiral, Martin Strahm, Jiye Shi, Garrett M Morris, Simon C Benjamin, and Charlotte M Deane, “The prospects of quantum computing in computational molecular biology,” Wiley Interdisciplinary Reviews: Computational Molecular Science, e1481 (2020).
[253] Steven R White, “Density matrix formulation for quantum renormalization groups,” Physical Review Letters 69, 2863 (1992).
[254] Garnet Kin-Lic Chan and Sandeep Sharma, “The density matrix renormalization group in quantum chem-
31

istry,” Annual Review of Physical Chemistry 62, 465– [268] Eugene F Dumitrescu, Alex J McCaskey, Gaute Ha-
481 (2011). gen, Gustav R Jansen, Titus D Morris, T Papenbrock,
[255] Sam McArdle, Suguru Endo, Alan Aspuru-Guzik, Si- Raphael C Pooser, David Jarvis Dean, and Pavel
mon C Benjamin, and Xiao Yuan, “Quantum com- Lougovski, “Cloud quantum computing of an atomic nu-
putational chemistry,” Reviews of Modern Physics 92, cleus,” Physical Review Letters 120, 210501 (2018).
015003 (2020). [269] Hsuan-Hao Lu, Natalie Klco, Joseph M Lukens, Ti-
[256] Frank Arute, Kunal Arya, Ryan Babbush, Dave Ba- tus D Morris, Aaina Bansal, Andreas Ekström, Gaute
con, Joseph C. Bardin, Rami Barends, Sergio Boixo, Hagen, Thomas Papenbrock, Andrew M Weiner, Mar-
et al., “Hartree-fock on a superconducting qubit quan- tin J Savage, et al., “Simulations of subatomic many-
tum computer,” Science 369, 1084–1089 (2020). body physics on a quantum frequency processor,” Phys-
[257] Artem A Bakulin, Stoichko D Dimitrov, Akshay Rao, ical Review A 100, 012320 (2019).
Philip CY Chow, Christian B Nielsen, Bob C Schroeder, [270] Alessandro Roggero, Andy CY Li, Joseph Carlson, Ra-
Iain McCulloch, Huib J Bakker, James R Durrant, and jan Gupta, and Gabriel N Perdue, “Quantum comput-
Richard H Friend, “Charge-transfer state dynamics fol- ing for neutrino-nucleus scattering,” Physical Review D
lowing hole and electron transfer in organic photovoltaic 101, 074038 (2020).
devices,” The Journal of Physical Chemistry Letters 4, [271] Julian Bender, Erez Zohar, Alessandro Farace, and
209–215 (2013). J Ignacio Cirac, “Digital quantum simulation of lattice
[258] Markus Gross, David C Müller, Heinz-Georg Nothofer, gauge theories in three spatial dimensions,” New Jour-
Ulrich Scherf, Dieter Neher, Christoph Bräuchle, and nal of Physics 20, 093001 (2018).
Klaus Meerholz, “Improving the performance of doped [272] Mari Carmen Banuls, Rainer Blatt, Jacopo Catani,
π-conjugated polymers for use in organic light-emitting Alessio Celi, Juan Ignacio Cirac, Marcello Dalmonte,
diodes,” Nature 405, 661–665 (2000). Leonardo Fallani, Karl Jansen, Maciej Lewenstein, Si-
[259] JR Schmidt, Priya V Parandekar, and John C Tully, mone Montangero, et al., “Simulating lattice gauge
“Mixed quantum-classical equilibrium: Surface hop- theories within quantum technologies,” The European
ping,” The Journal of Chemical Physics 129, 044104 physical journal D 74, 1–42 (2020).
(2008). [273] John Preskill, “Simulating quantum field theory with
[260] Thomas E O’Brien, Bruno Senjean, Ramiro Sagas- a quantum computer,” arXiv preprint arXiv:1811.10085
tizabal, Xavier Bonet-Monroig, Alicja Dutkiewicz, (2018).
Francesco Buda, Leonardo DiCarlo, and Lucas Viss- [274] Suguru Endo, Iori Kurata, and Yuya O Nakagawa,
cher, “Calculating energy derivatives for quantum chem- “Calculation of the green’s function on near-term quan-
istry on a quantum computer,” npj Quantum Informa- tum computers,” Physical Review Research 2, 033281
tion 5, 1–12 (2019). (2020).
[261] John C Tully and Richard K Preston, “Trajectory sur- [275] Chinmay Mishra, Shane Thompson, Raphael Pooser,
face hopping approach to nonadiabatic molecular col- and George Siopsis, “Quantum computation of an in-
lisions: the reaction of h+ with d2,” The Journal of teracting fermionic model,” Quantum Science and Tech-
Chemical Physics 55, 562–572 (1971). nology 5, 035010 (2020).
[262] David R Weinberg, Christopher J Gagliardi, Jonathan F [276] Danny Paulson, Luca Dellantonio, Jan F Haase, Alessio
Hull, Christine Fecenko Murphy, Caleb A Kent, Celi, Angus Kan, Andrew Jena, Christian Kokail, Rick
Brittany C Westlake, Amit Paul, Daniel H Ess, van Bijnen, Karl Jansen, Peter Zoller, et al., “Towards
Dewey Granville McCafferty, and Thomas J Meyer, simulating 2d effects in lattice gauge theories on a
“Proton-coupled electron transfer,” Chemical Reviews quantum computer,” arXiv preprint arXiv:2008.09252
112, 4016–4093 (2012). (2020).
[263] Walter Kohn and Lu Jeu Sham, “Self-consistent equa- [277] A Avkhadiev, PE Shanahan, and RD Young, “Accel-
tions including exchange and correlation effects,” Phys- erating lattice quantum field theory calculations via in-
ical review 140, A1133 (1965). terpolator optimization using noisy intermediate-scale
[264] Bela Bauer, Dave Wecker, Andrew J Millis, Matthew B quantum computing,” Physical Review Letters 124,
Hastings, and Matthias Troyer, “Hybrid quantum- 080501 (2020).
classical approach to correlated materials,” Physical Re- [278] Jad C Halimeh, Valentin Kasper, and Philipp Hauke,
view X 6, 031045 (2016). “Fate of lattice gauge theories under decoherence,”
[265] Ryan Babbush, Craig Gidney, Dominic W Berry, arXiv preprint arXiv:2009.07848 (2020).
Nathan Wiebe, Jarrod McClean, Alexandru Paler, [279] Kunal Sharma, M. Cerezo, Zoë Holmes, Lukasz Cincio,
Austin Fowler, and Hartmut Neven, “Encoding elec- Andrew Sornborger, and Patrick J Coles, “Reformu-
tronic spectra in quantum circuits with linear t com- lation of the no-free-lunch theorem for entangled data
plexity,” Physical Review X 8, 041015 (2018). sets,” arXiv preprint arXiv:2007.04900 (2020).
[266] Dominic W Berry, Mária Kieferová, Artur Scherer, Yu- [280] Francisco Barahona, Martin Grötschel, Michael Jünger,
val R Sanders, Guang Hao Low, Nathan Wiebe, Craig and Gerhard Reinelt, “An application of combinatorial
Gidney, and Ryan Babbush, “Improved techniques for optimization to statistical physics and circuit layout de-
preparing eigenstates of fermionic hamiltonians,” npj sign,” Operations Research 36, 493–513 (1988).
Quantum Information 4, 1–7 (2018). [281] Wolfgang Küchlin and Carsten Sinz, “Proving consis-
[267] Pierre-Luc Dallaire-Demers, Jonathan Romero, Libor tency assertions for automotive product data manage-
Veis, Sukin Sim, and Alán Aspuru-Guzik, “Low- ment,” Journal of Automated Reasoning 24, 145–163
depth circuit ansatz for preparing correlated fermionic (2000).
states on a quantum computer,” arXiv preprint [282] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann,
arXiv:1801.01053 (2018). “A Quantum Approximate Optimization Algorithm Ap-
32

ACKNOWLEDGEMENTS

MC is thankful to Kunal Sharma for helpful discussions. MC was initially supported by the Laboratory Directed Research and Development (LDRD) program of Los Alamos National Laboratory (LANL) under project number 20180628ECR, and later supported by the Center for Nonlinear Studies at LANL. AA was initially supported by the LDRD program of LANL under project number 20200056DR, and later supported by the U.S. Department of Energy (DOE), Office of Science, Office of High Energy Physics QuantISED program under Contract Nos. DE-AC52-06NA25396 and KA2401032. SCB acknowledges financial support from EPSRC Hub grants under the agreement numbers EP/M013243/1 and EP/T001062/1, and from EU H2020-FETFLAG-03-2018 under the grant agreement No 820495 (AQTION). SE was supported by MEXT Quantum Leap Flagship Program (MEXT QLEAP) Grant Numbers JPMXS0120319794 and JPMXS0118068682, and JST ERATO Grant Number JPMJER1601. KF was supported by JSPS KAKENHI Grant No. 16H02211, JST ERATO JPMJER1601, and JST CREST JPMJCR1673. KM was supported by JST PRESTO Grant No. JPMJPR2019 and JSPS KAKENHI Grant No. 20K22330. KM and KF were also supported by MEXT Quantum Leap Flagship Program (MEXT QLEAP) Grant Numbers JPMXS0118067394 and JPMXS0120319794. XY acknowledges support from the Simons Foundation. LC was initially supported by the LDRD program of LANL under project number 20190065DR, and later supported by the U.S. DOE, Office of Science, Office of Advanced Scientific Computing Research, under the Quantum Computing Application Teams (QCAT) program. PJC was initially supported by the LANL ASC Beyond Moore’s Law project, and later supported by the U.S. DOE, Office of Science, Office of Advanced Scientific Computing Research, under the Accelerated Research in Quantum Computing (ARQC) program. Most recently, MC, LC, and PJC were supported by the Quantum Science Center (QSC), a National Quantum Information Science Research Center of the U.S. Department of Energy (DOE).

AUTHOR CONTRIBUTIONS

All authors have read, discussed and contributed to the writing of the manuscript.

COMPETING INTERESTS

The authors declare no competing interests.

KEY POINTS:

• Variational quantum algorithms (VQAs) are the leading proposal for achieving quantum advantage using near-term quantum computers.
• VQAs have been developed for a wide range of applications, including finding ground states of molecules, simulating dynamics of quantum systems, and solving linear systems of equations.
• VQAs share a common structure, where a task is encoded into a parameterized cost function that is evaluated using a quantum computer, and a classical optimizer trains the parameters in the VQA.
• The adaptive nature of VQAs is well suited to handle the constraints of near-term quantum computers.
• Trainability, accuracy, and efficiency are three challenges that arise when applying VQAs to large-scale applications, and strategies are currently being developed to address these challenges.
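
The common structure named in the key points (a parameterized cost function evaluated on a quantum computer, with a classical optimizer training the parameters) can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and is not taken from the paper: the single-qubit ansatz R_y(θ)|0⟩, the Pauli-Z observable, the function names, the learning rate and the step count. The quantum device is replaced by an exact NumPy state-vector calculation, and the gradient is obtained with the parameter-shift rule, which is exact for this one-parameter rotation.

```python
import numpy as np

# Pauli-Z observable; its lowest eigenvalue (-1) plays the role of the target energy.
Z = np.array([[1.0, 0.0], [0.0, -1.0]])


def ansatz_state(theta):
    """Hypothetical one-parameter ansatz: R_y(theta)|0> = [cos(theta/2), sin(theta/2)]."""
    return np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])


def cost(theta):
    """Cost C(theta) = <psi(theta)|Z|psi(theta)>.

    On hardware this expectation value would be estimated from repeated
    measurements; here it is computed exactly from the state vector.
    """
    psi = ansatz_state(theta)
    return float(psi @ Z @ psi)


def parameter_shift_gradient(theta):
    """Gradient of the cost from two shifted cost evaluations (parameter-shift rule)."""
    return 0.5 * (cost(theta + np.pi / 2.0) - cost(theta - np.pi / 2.0))


def train(theta0=0.3, learning_rate=0.4, steps=100):
    """Classical outer loop: plain gradient descent on the ansatz parameter."""
    theta = theta0
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(theta)
    return theta, cost(theta)


if __name__ == "__main__":
    theta_opt, cost_opt = train()
    print(f"theta = {theta_opt:.6f}, cost = {cost_opt:.6f}")
```

In this toy problem the cost reduces to cos θ, so the loop drives θ toward π, where the state is |1⟩ and the cost reaches the minimum eigenvalue of Z. A realistic VQA would swap in a multi-qubit ansatz, a problem-specific cost, and shot-based estimates in place of the exact expectation value.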
