General Framework For Randomized Benchmarking

The document presents a comprehensive framework for randomized benchmarking (RB) protocols, which are essential for characterizing quantum gates while being robust against state preparation and measurement errors. It introduces a general RB protocol that encompasses various existing methods and offers novel extensions, including scalable postprocessing techniques to enhance practical feasibility. The work also establishes rigorous conditions for the output of RB experiments, linking decay rates to quality measures like average fidelity, thereby advancing the understanding and application of RB in quantum computing.


PRX QUANTUM 3, 020357 (2022)

General Framework for Randomized Benchmarking

J. Helsen,1,* I. Roth,2,3 E. Onorati,4 A.H. Werner,5,6 and J. Eisert3,7,8
1 QuSoft & Korteweg-de Vries Institute, University of Amsterdam, Science Park, Amsterdam 123 1098 XG, Netherlands
2 Quantum Research Centre, Technology Innovation Institute, Abu Dhabi, UAE
3 Dahlem Center for Complex Quantum Systems, Freie Universität Berlin, Arnimallee 14 14195, Germany
4 Department of Computer Science, University College London, 66-72 Gower Street, London WC1E 6EA, United Kingdom
5 Department of Mathematical Sciences, University of Copenhagen, København 2100, Denmark
6 NBIA, Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, København 2100, Denmark
7 Mathematics and Computer Science, Freie Universität Berlin, Takustraße 9, Berlin 14195, Germany
8 Helmholtz-Zentrum Berlin für Materialien und Energie, Hahn-Meitner-Platz 1, Berlin 14109, Germany

(Received 8 December 2020; revised 23 December 2021; accepted 4 March 2022; published 16 June 2022)

Randomized benchmarking refers to a collection of protocols that in the past decade have become central methods for characterizing quantum gates. These protocols aim at efficiently estimating the quality of a set of quantum gates in a way that is resistant to state preparation and measurement errors. Over the years many versions have been developed, however a comprehensive theoretical treatment of randomized benchmarking has been missing. In this work, we develop a rigorous framework of randomized benchmarking general enough to encompass virtually all known protocols as well as novel, more flexible extensions. Overcoming previous limitations on error models and gate sets, this framework allows us, for the first time, to formulate realistic conditions under which we can rigorously guarantee that the output of any randomized benchmarking experiment is well described by a linear combination of matrix exponential decays. We complement this with a detailed analysis of the fitting problem associated with randomized benchmarking data. We introduce modern signal processing techniques to randomized benchmarking, prove analytical sample complexity bounds, and numerically evaluate performance and limitations. In order to reduce the resource demands of this fitting problem, we introduce novel, scalable postprocessing techniques to isolate exponential decays, significantly improving the practical feasibility of a large set of randomized benchmarking protocols. These postprocessing techniques overcome shortcomings in efficiency of several previously proposed methods such as character benchmarking and linear-cross entropy benchmarking. Finally, we discuss, in full generality, how and when randomized benchmarking decay rates can be used to infer quality measures like the average fidelity. On the technical side, our work substantially extends the recently developed Fourier-theoretic perspective on randomized benchmarking by making use of the perturbation theory of invariant subspaces, as well as ideas from signal processing.

DOI: 10.1103/PRXQuantum.3.020357
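
As a numerical illustration of what a "linear combination of matrix exponential decays" can look like, the following minimal Python sketch evaluates a signal of the form p(m) = Σ_k Tr(A_k M_k^m). The function name `rb_signal`, the block sizes, and all numerical values are hypothetical choices made purely for illustration and are not taken from the paper. The 2×2 block is chosen, as an assumption, to have a complex-conjugate eigenvalue pair, so that the resulting signal is a damped oscillation rather than a plain scalar decay.

```python
import numpy as np

def rb_signal(blocks, lengths):
    """Evaluate p(m) = sum_k Tr(A_k @ M_k^m) for every sequence length m."""
    return np.array([
        sum(np.trace(A @ np.linalg.matrix_power(M, m)).real for A, M in blocks)
        for m in lengths
    ])

# Hypothetical example blocks (A_k, M_k); values are illustrative only.
blocks = [
    # 1x1 block with M = 1: contributes a constant offset of 0.5.
    (np.array([[0.5]]), np.array([[1.0]])),
    # 2x2 block whose M has eigenvalues 0.95 +/- 0.03i (modulus below 1).
    (np.diag([0.3, 0.2]), np.array([[0.95, 0.03], [-0.03, 0.95]])),
]

lengths = np.arange(0, 60)
p = rb_signal(blocks, lengths)

# For this specific choice the signal has a closed form: a constant plus
# a damped cosine, with r and theta the modulus and phase of the eigenvalues.
r = np.hypot(0.95, 0.03)
theta = np.arctan2(0.03, 0.95)
closed_form = 0.5 + 0.5 * r**lengths * np.cos(lengths * theta)
```

For these made-up values, `p` and `closed_form` agree, which shows how a single 2×2 block already produces behavior that a single scalar exponential cannot fit.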

* [email protected]

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
2691-3399/22/3(2)/020357(54)

I. INTRODUCTION

In the last few years significant steps have been taken towards the development of large-scale quantum computers. A key part of the development of these quantum computers are tools that provide diagnostics, certification, and benchmarking. Particularly for quantum operations, stringent conditions have to be met to achieve fault tolerance. Motivated by this observation, in recent years a significant body of work has been dedicated to the development of tools for the certification and benchmarking of quantum gates. A prominent role in this collection of tools is taken by methods that can be collectively referred to as randomized benchmarking (RB). These methods have risen to prominence because they conform well to the demands of realistic experimental settings. They estimate the magnitude of an average error of a set of quantum gates in a fashion that is robust to errors in state preparation and measurement (SPAM) and moreover is, in many settings,
efficient, in the sense that the resources required scale polynomially with the number of qubits in the device. The various versions of RB apply sequences of randomly chosen quantum gates of varying length. Small errors are thus amplified with the sequence length, and gate quality measures can be extracted from the dependence of the output data on sequence length.

In RB protocols, group structures feature strongly, in that the gate set considered is in almost all cases a subset of a finite group. Such group structures not only make it possible to efficiently make predictions for error-free sequences and compute inverses, but they also provide the means to analyze the error contribution after averaging. Originally proposed for random unitary gates [1–3], RB is now most prominently executed with gates from the so-called Clifford group [4–6], a set of efficiently classically simulatable quantum gates that take a key role specifically in fault-tolerant quantum computing [7]. It has also been considered for other (subsets of) finite groups [8–15]. Moreover, RB has been generalized to capture other figures of merit of gate sets, such as relative average gate fidelities to specific anticipated target gates [4], fidelities within a symmetry sector [9,10], or the unitarity [16]. More recently, with the challenges of realizing fault-tolerant quantum computers in mind, emphasis has been put on capturing losses, leakage, and crosstalk in a scheme [17–19]. Also, data from RB (or rather, suitably combined data from multiple such experiments) can be sufficient to acquire full tomographic information about a quantum gate [20–22]. This adds up to a wealth of RB protocols [23] proposed over the previous years. Figure 2 summarizes (to our knowledge) an up-to-date list of theoretical proposals for RB procedures presently known.

A significant body of work moreover deals with the limitations and precise preconditions of RB. The originally rather stringent assumptions on noise being necessarily identical across different quantum gates have over time been relaxed for particular protocols in later work [24–26], and the connection between the output of RB and operationally relevant quantities (such as average fidelity) has been studied in some detail [26,27].

And yet, it seems fair to say that a comprehensive picture of RB schemes for the quantum technologies [28] has been lacking so far. In particular, what has been missing is a theoretical framework that is broad enough to formalize the required preconditions ensuring the proper functioning of RB protocols beyond case-by-case arguments for specific protocols. This is unsatisfactory, as the development of higher-quality quantum gates currently relies heavily on a plethora of tailor-made variants of RB. This motivates our current effort at providing a clear rigorous underpinning for RB and exploring its underlying mathematical structure, putting all variants of RB on a common footing.

With this effort we aim to not only better understand these protocols, but also to increase trust in them, making it possible to reliably use them without a detailed understanding of their inner workings. This is a timely effort, as procedures that fit within the RB framework, such as linear-cross-entropy benchmarking [29] and the behavior of noisy random circuits more generally, have been the topic of significant attention recently [30–32], including for the purpose of benchmarking [33]. Given how we identify linear-cross-entropy benchmarking as a randomized benchmarking procedure, we relate our general framework to this timely discussion.

At the same time our framework allows us to go significantly beyond current protocols and establish a series of novel theoretical results and benchmarking schemes, addressing several shortcomings of the current state of the art. Among others, these novel results include a rigorous error bound for generator-style randomized benchmarking, a formal equivalence between linear-cross-entropy benchmarking and randomized benchmarking, and a novel, scalable method for isolating signals in RB experiments, an absolute requirement if one wants to apply RB to nonstandard gate sets. This latter method, which we call filtered RB, is a significant conceptual improvement over standard RB schemes, promising greater flexibility and applicability. Notably, it also obviates the need for physically implemented inversion gates in randomized benchmarking experiments and the preparation of specific input states, making its implementation significantly more straightforward. As such, this framework also constitutes a solid basis for developing new schemes of randomized benchmarking. Altogether these results substantially advance the understanding of the possibilities and requirements of randomized benchmarking as a practical tool for estimation and certification.

II. OVERVIEW OF RESULTS

In this work, we aim at developing a mathematically comprehensive framework of randomized benchmarking protocols, synthesizing, generalizing, and substantially strengthening previous work. This paper covers a variety of different aspects of randomized benchmarking, from general theorems on the validity of RB data, to a detailed study of the classical postprocessing of data generated by RB and an in-depth discussion of the connection between the outputs of RB and average fidelity. As our work is often quite technical, we formulate a series of “take-home” messages at the end of this section, summarizing the key takeaways of our work for experimental practice.

A. A general framework for randomized benchmarking

We begin by providing a general framework that generalizes and covers (to the best of our knowledge) all RB procedures currently present in the literature. This can also be thought of as an attempt at a formal definition of RB protocols, and is largely an effort to organize and formalize knowledge already present in the RB literature.
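
The basic recipe recalled in the Introduction (random gate sequences of varying length, an inversion gate, and extraction of a decay from the dependence on sequence length) can be mimicked in a small classical simulation. The following is a minimal sketch under several assumptions that are not part of the source: a single qubit, the 24-element Clifford group represented by its rotations of the Bloch vector, gate-independent Pauli noise (the hypothetical matrix `noise`) applied after every random gate, and a noise-free inversion gate. Under these assumptions the sequence-averaged survival probability is 1/2 + f^m/2, with f the mean of the three diagonal noise entries, so a straight-line fit of log(p(m) - 1/2) against m recovers f. All function and variable names are illustrative.

```python
import numpy as np

# Bloch-sphere (3x3) action of the Hadamard and phase gates.
H = np.array([[0, 0, 1], [0, -1, 0], [1, 0, 0]])
S = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])

def generate_group(generators):
    """Close a set of integer matrices under multiplication."""
    identity = np.eye(3, dtype=int)
    elements = {identity.tobytes(): identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = h @ g
            if new.tobytes() not in elements:
                elements[new.tobytes()] = new
                frontier.append(new)
    return list(elements.values())

def survival_probability(group, noise, m, rng):
    """Survival probability of one random sequence of length m."""
    state = np.array([0.0, 0.0, 1.0])    # Bloch vector of |0>
    total = np.eye(3, dtype=int)         # ideal product of the gates so far
    for _ in range(m):
        g = group[rng.integers(len(group))]
        state = noise @ (g @ state)      # ideal gate followed by noise
        total = g @ total
    state = total.T @ state              # noise-free inversion gate
    return 0.5 * (1.0 + state[2])        # probability of measuring |0>

rng = np.random.default_rng(seed=1)
group = generate_group([H, S])
noise = np.diag([0.97, 0.97, 0.99])      # hypothetical Pauli noise channel
lengths = np.array([1, 2, 4, 8, 16, 32])
n_sequences = 300

# Data collection: average over random sequences for each length.
p = np.array([
    np.mean([survival_probability(group, noise, m, rng)
             for _ in range(n_sequences)])
    for m in lengths
])

# Data processing: single-exponential fit with known asymptote 1/2.
slope, _ = np.polyfit(lengths, np.log(p - 0.5), 1)
f_est = np.exp(slope)
f_expected = np.trace(noise) / 3
```

The fixed asymptote of 1/2 is a simplification that holds for the unital noise assumed here; for general noise and SPAM errors the offset and prefactor would have to be fitted as well.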
RB protocols can be divided into two separate phases: a data-collection phase, and a data-processing phase (see Fig. 1).

(a) The data-collection phase corresponds to the part of the protocol involving the actual quantum computer and can be described as (1) the preparation of a quantum state, (2) the application of a sequence of random quantum operations, capped by (3) an inversion operator mapping the state (ideally) to a specified final state (usually the initial state), upon which (4) a measurement is then performed.
(b) This process yields estimates of a success probability p(m) for different sequence lengths m, which constitutes the input to the data-processing phase. In this phase, which is completely classical, the data p(m) is fitted to a functional model, generally a linear combination of exponential decays. One can consider the decay rates of these exponential decays as direct measures of quality of the implementation, or further relate it to operational quantities like the average fidelity.

FIG. 1. The basic structure of RB. The RB data-collection phase iterates the following steps: after (1) the preparation of an initial state $\rho_0$, (2) a sequence of m random gates $g_i$ is applied, (3) followed by the gate inverting the sequence $g_{\mathrm{inv}}$ up to an end gate $g_{\mathrm{end}}$ and (4) a final POVM measurement. In the data-processing phase the measurement data, for many random sequences and different sequence lengths is postprocessed in a classical computer to extract decay parameters quantifying the imperfections in the implemented gates.

Starting with the data-collection phase, we write down a general RB protocol (Algorithm 1). This protocol depends on a number of input parameters, and by making particular choices for these parameters we can obtain all RB protocols currently in the literature. The key parameters are as follows:

1. A group G, encoding the gates that are applied during the RB protocol. A common choice for this group is the multiqubit Clifford group $C_q$ but many other choices are possible.
2. A reference implementation $\phi_r$ assigning to each element of the group G an ideal quantum operation to be implemented. In the standard scenario this map is a representation of the group G (denoted ω). In general this map need not be a representation, but it is in all known cases obtained from a representation by some fixed mapping. The paradigmatic example of such an implementation map is the standard conjugation action $g \mapsto U_g \rho U_g^{\dagger}$, which associates a unitary action to every element g of the group.
3. A probability distribution ν encoding the probability with which gates are selected from G. In the standard case this probability distribution is simply the uniform distribution over the group. We also consider the situation where this probability distribution can vary throughout different steps of the protocol.
4. An ending gate $g_{\mathrm{end}}$ governing the total operation performed in each RB sequence. Typically this is the identity, but other choices are relevant, and it can even be chosen at random.

Different choices for these key parameters can be collected into classes, yielding a typology of randomized benchmarking procedures, an overview of which can be seen in Fig. 2. This typology consists of three classes:

(a) Uniform RB, which is characterized by uniform random sampling of operations and reference implementations that are representations.
(b) Interleaved RB, where the reference implementations involve the application of “interleaved” gates.
(c) Nonuniform RB, which is characterized by nonuniform random sampling of operations. This last class comes with two subtypes: approximate RB, where the sampling distribution is close to uniform, and subset RB, where the sampling distribution is very far from being uniform (for instance, taking only nonzero values on a small set of generators).

These classes of RB procedures are motivated by the qualitatively different behavior of the associated output data p(m), which we discuss in more detail later. They also partially but not completely align with notions already present in the literature. In particular, we see that the behavior of this data is dictated by the group G and the reference representation ω. We can always decompose this representation ω into a direct sum of irreducible subrepresentations, i.e., $\omega = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$, where the $\sigma_\lambda$ are irreducible (and occur with multiplicity $n_\lambda$).

A key tenet of RB is that this decomposition decides the functional form of the output data p(m) as a function of sequence length m. More precisely, we expect behavior of
the form

$$p(m) \approx \sum_{\lambda\in\Lambda} \mathrm{Tr}\left(A_\lambda M_\lambda^{m}\right), \tag{1}$$

where $A_\lambda$, $M_\lambda$ are $n_\lambda \times n_\lambda$ matrices encoding state preparation and measurement errors, and the quality of gate implementation, respectively. This formalizes in a precise way the general idea that RB data is well described by a linear combination of exponential decays and allows for the classical processing of RB output data, thus providing the connection between the data-collection and the data-processing phases. Note, however, that if irreducible subrepresentations appear with nontrivial multiplicities the functional form of Eq. (1) includes matrix exponential decays. These can have qualitatively different features than scalar exponential decays, requiring a more sophisticated data-processing approach. It is, for instance, possible for these matrices to have complex eigenvalue pairs, which will lead to damped-oscillation behavior in randomized benchmarking data.

FIG. 2. An overview of RB schemes, indicating how they fit within our typology (see Sec. V D) of RB schemes and what theorem covers the behavior of their output data (see Sec. VI). An ∗ indicates that the protocol has a nontrivial postprocessing scheme, while ∗∗ indicates that the protocol in its original specification has no inversion gate. We discuss how this is equal to uniform RB (with inversion) together with a postprocessing step in Sec. VIII.

B. The functional form of randomized benchmarking data

At the core of the RB literature is the promise that RB output data has a very specific form, namely that of a linear combination of (matrix) exponential decays [as expressed in Eq. (1)], decaying with the length of the sequences of random gates. Moreover, this linear combination is of a specific structure, determined by the implemented gate set. However, this functional form of the RB output data is not guaranteed by the protocol itself, but is instead derived from assumptions on the noisy implementation of the random quantum operations. In early work this assumption took the form of the gate-independent noise assumption. Later, it was realized that this assumption is not satisfactory [26] and it was subsequently generalized for standard Clifford RB to the more general assumption that the noisy implementations of gates are Markovian and time independent, and moreover either that the gate-dependent variation of the noise is upper bounded in the diamond norm (in the work of Ref. [24]), or lower bounded in average fidelity (in the work of Ref. [25]). Here, we provide a series of theorems generalizing these works to (almost) all existing RB protocols, justifying Eq. (1) in broad circumstances. The theorems we prove make claims of different strength for different classes of RB protocols, as per the typology outlined in Fig. 2.

(a) We prove that the output data of uniform RB protocols (as per the typology in Fig. 2) can be described as a linear combination of exponentials, up to an
exponentially small error, provided that the gate implementations are Markovian, time independent, and are on average close in diamond norm to an ideal implementation that is a representation. This closeness is independent of the particular RB protocol and independent of the underlying Hilbert-space dimension. The complete statement is given as Theorem 8 that can be summarized as follows.

Theorem 1: (Informal version of Theorem 8.) Consider a RB experiment with sequence length m, with gates uniformly drawn from a group G and implemented through a reference representation $\omega(g) = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}(g)$. Denote the corresponding noisy implementation on the quantum computer as φ(g) (note that this assumes time independent and Markovian noise). If we have

$$\frac{1}{|G|}\sum_{g\in G} \left\| \omega(g) - \phi(g) \right\|_\diamond \le \delta \le \frac{1}{9}, \tag{2}$$

then the output data p(m) of the RB experiment obeys the relation

$$\left| p(m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}\left(A_\lambda M_\lambda^{m}\right) \right| \le O(\delta^{m}), \tag{3}$$

with the error exponentially suppressed in m. Here $A_\lambda$ and $M_\lambda$ are $n_\lambda \times n_\lambda$ matrices, with $M_\lambda$ depending only on the actual implementation φ.

The proof of this theorem relies on a combination of techniques from earlier works: taking the matrix Fourier-transform perspective introduced to RB in Ref. [25] and combining it with the realization in Ref. [24] that the diamond distance (averaged over random gates) is the correct distance measure for the formulation of assumptions on noisy gate implementations. We also make heavy use of the perturbation theory of invariant subspaces of non-normal matrices [52,53]. We note that the specific parameter 1/9 is an artifact of the proof techniques and probably suboptimal.

(b) Building on Theorem 8, we prove multiple theorems for nonuniform RB protocols. The first subtype, approximate RB, is covered by Theorem 9, a direct generalization of Theorem 8, and also features an exponentially suppressed error. For the second subtype, subset RB, on the other hand, we can give only a weaker statement, guaranteeing that the RB output data is described by a linear combination of exponentials up to constant error (in sequence length) as long as the sequence length m is taken to be larger than a mixing length $m_{\mathrm{mix}}$. This mixing length indicates the moment where the m-fold convolution $\nu^{*m}$ of the probability distribution ν, which governs the sampling of random gates, becomes close to the uniform distribution and is a function of both the initial distribution ν and the underlying group G. We can summarize our result on subset RB as follows.

Theorem 2: (Informal version of Theorem 10). Consider a RB experiment with sequence length m, with gates drawn from a group G according to a probability distribution ν and implemented through a reference representation $\omega(g) = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}(g)$. Denote the corresponding (noisy) actual implementation on the quantum computer as φ(g). If we have, for some sequence length $m_{\mathrm{mix}}$ that

$$\sum_{g\in G} \nu(g) \left\| \omega(g) - \phi(g) \right\|_\diamond \le \frac{\delta}{m_{\mathrm{mix}}}, \tag{4}$$

$$\sum_{g\in G} \left| \nu^{*m_{\mathrm{mix}}}(g) - \frac{1}{|G|} \right| \le \delta', \tag{5}$$

and $\delta + \delta' \le 1/9$, then the output data p(m) of the RB experiment obeys the relation

$$\left| p(m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}\left(A_\lambda M_\lambda^{m-m_{\mathrm{mix}}}\right) \right| \le O(\delta + \delta'), \tag{6}$$

with the error bound independent of m. Here $A_\lambda$ and $M_\lambda$ are $n_\lambda \times n_\lambda$ matrices, with $M_\lambda$ depending only on the actual implementation φ.

This theorem cannot guarantee an exponential error bound, but still improves on the state of the art [14,15], both in the generality of the assumptions made and the size of the possible error. Note also the appearance of the $m_{\mathrm{mix}}^{-1}$ term in the average diamond-norm deviation. This can be read as the requirement that the generating gates are of sufficiently high quality that any (composite) uniformly randomly chosen gate will be close in diamond norm to its ideal version. In this sense this requirement is of the same stringency as Eq. (2).

(c) We discuss the behavior of interleaved RB protocols, illustrating how standard interleaved RB, as well as all but one nonstandard interleaved RB protocol, are covered by Theorem 8. We consider two nonstandard interleaved RB protocols, namely cycle benchmarking [13], which is covered by our theorems in a nontrivial way and robust RB tomography [50], which is not covered by our theorems. We argue that this is not a weakness of our argument but rather that the RB output data of this protocol behaves in a nonstandard manner, requiring tailor-made analysis.
(d) In Sec. X, we provide a discussion of the central assumption $|G|^{-1}\sum_{g\in G} \left\| \omega(g) - \phi(g) \right\|_\diamond \le \delta$, made on the behavior of noisy gates in the above
theorems. We argue that this assumption is a natural one to make (Theorem 18) and moreover that it cannot be replaced by a similar assumption involving the average fidelity without requiring the gate to be exponentially close to perfect in the number of qubits. This also answers an open question posed in Ref. [25] in the negative.

The unifying conceptual theme of all of our theorems is the fact that RB can be seen as a “power iteration in frequency space.” The behavior of the output data is dictated by the dominant eigenvalues of a fixed matrix that is obtained from the Fourier transform [25] (in a specific sense defined later) of the noisy implementation map φ. Taking powers of this matrix results in the exponential suppression of all but the largest eigenvalues.

Together, these results provide a rigorous justification for the folkloric knowledge that RB protocols function under broad experimental circumstances.

C. A framework for randomized benchmarking data processing

The second phase of the RB protocol, the data-processing phase, takes in RB output data, which is well described by a linear combination of exponentials and outputs the decay rates associated with those exponentials. If the data is well described by a single exponential decay this can be done by off-the-shelf curve-fitting procedures, but if the RB output data is of a more complex form (such as a linear combination of several exponentials) a more flexible approach is required. Here we provide a self-contained discussion of modern signal-processing methods for extracting decay parameters from data with a functional form given by Eq. (1). We review signal-processing algorithms, in particular the multiple signal classification (MUSIC) and estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithms, that are, at least in principle, applicable to the most general form of RB output data, even including matrix exponentials. Beyond that, we discuss theoretical guarantees that were derived for these algorithms and discuss their implications for RB data processing. Building upon these guarantees, we derive a sampling complexity statement that ensures the recovery of decay rates with these algorithms under measurements with finite statistics. We complement our analytical discussion with numerical evaluations and simulations that demonstrate the practical performance of these algorithms. Importantly, our discussions detail the fundamental limitations of postprocessing RB output data featuring many exponential decays.

D. A general postprocessing scheme for isolating exponential decays

Even with modern methods, fitting multiple exponential decays is a difficult affair, and in many scenarios one is only interested in a subset of the decay parameters that describe the output data of a particular RB experiment. Because of this, several methods have been developed to isolate particular exponential decays. Examples of this include the class of uniform RB protocols without inversion gates (indicated with a double asterisk “∗∗” in Fig. 2) and a variety of other protocols that take linear combinations of RB output data with different ending gates $g_{\mathrm{end}}$ to isolate particular exponential decays (indicated with a single asterisk “∗”). In Sec. VIII, we give a novel class of protocols called filtered RB that subsumes all these earlier approaches. For simplicity, we consider only uniform RB, but our results generalize to other types of RB.

This class of protocols is based on the realization that RB output data (indexed by an ending gate $g_{\mathrm{end}}$) can be seen as a vector in the group algebra of the group being benchmarked. This allows for the design of filter functions $\alpha_\lambda : G \to \mathbb{C}$, based on the matrix elements of irreducible representations, that isolate exponential decays associated with subrepresentations of the ideal implementation of the gates in the group G. Using these filter functions we can write down a general postprocessing scheme for the isolation of exponential decays and prove that it works when the assumptions of Theorem 8 are satisfied. We prove a theorem of the following form.

Theorem 3: [Theorem 16 (informal)]. Let $\alpha_\lambda : G \to \mathbb{C}$ be the filter function associated with the irreducible representation $\sigma_\lambda$ and let $p(m, g_{\mathrm{end}})$ be the output data associated with a uniform RB experiment with ending gate $g_{\mathrm{end}}$, satisfying the condition Eq. (2) with parameter δ. We have that

$$k_\lambda(m) := \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G} \alpha_\lambda(g_{\mathrm{end}})\, p(m, g_{\mathrm{end}}) \tag{7}$$

satisfies

$$\left| k_\lambda(m) - \mathrm{Tr}\left(B_\lambda M_\lambda^{m}\right) \right| \le O(\delta^{m}), \tag{8}$$

with $M_\lambda$ associated with the irreducible subrepresentation $\sigma_\lambda$ [as per Eq. (1)].

Beyond this theoretical result we note that this novel class of protocols allows one (by a simple reparametrization) to eliminate the need for an explicitly implemented inversion gate in RB, making the protocol significantly simpler to implement in practice.

We also give a statistical analysis of this postprocessing scheme. In particular, we prove that if the measurement positive operator-valued measure (POVM) performed in the RB experiment is (proportional to) a state 3-design, the sample complexity of the complete benchmarking procedure (data collection plus postprocessing) is asymptotically independent of the dimension of the underlying
Hilbert space for arbitrary benchmarking groups. This is a strong improvement on previous attempts at such a general postprocessing procedure. Note that the 3-design condition appearing here plays a similar role in controlling the variance in scalable estimation procedures such as shadow estimation [54,55].

We stress, however, that the 3-design condition is a sufficient condition and there are examples in the literature covered by this postprocessing scheme where this condition is not met but the overall procedure is still scalable. In particular, we discuss the recently proposed linear-cross-entropy benchmarking procedure (XEB) [29] in Sec. VIII C. We argue that the variant of XEB that performs multiple random gate sequences is an example of uniform RB (as per the typology) combined with an instance of our general postprocessing scheme. Furthermore, we argue that the sample complexity of linear XEB is asymptotically independent of the underlying Hilbert-space dimension even though the POVM being measured is not itself a 3-design.

E. Randomized benchmarking and average fidelity

RB has originally been designed to estimate the average gate fidelity of a group of gates. Under the assumption of gate-independent noise, it can be proven (as has already been done in Ref. [1]) that the decay rates estimated in a RB experiment correspond exactly to the average fidelity of the noise associated to the gates. However, if this condition is relaxed, the connection between these decay rates and the average fidelity is less clear. Even more strongly, it has been argued in Ref. [26] that due to a so-called gauge freedom in the representation of the gate set, the entire premise of a connection between RB decay rates and average fidelity may be suspect. This is because the choice of the gauge does not influence the RB decay rates, but it does affect the average gate fidelity. Indeed, it has been shown that under some transformations the two quantities may differ by orders of magnitude, even in the gate-dependent noise case (where the previously proven connection can be seen as a “natural” gauge choice).

Subsequently proposals have been made to reconnect the average gate fidelity and RB decay rates in the context of standard Clifford RB: a natural gauge called the depolarizing gauge [25] and the noise-in-between-gates

[...]

positive implementation map, which is not completely positive in the depolarizing gauge (or equivalently has noncompletely positive noise in-between gates). This implies that both these interpretations of RB decay rates are not fully satisfactory, because they cannot be guaranteed to correspond to the average fidelity of a physical process. That said, this does not mean that RB decay rates are not useful figures of merit, as they can always be interpreted as meaningful benchmarks in their own right.

Complementing this, following the approximate approach of Ref. [27], we show that the problem of connecting RB decay rates with the average gate fidelity can be (approximately) reduced to the deviation between the dominant (ideal) unperturbed eigenvectors and their (implemented) perturbed version in Fourier space. We show that, as long as this overlap is sufficiently close to 1, any gauge choice that corresponds to a completely positive and trace-preserving (CPT) channel will connect RB parameters to the average gate fidelity. Hence we obtain, under precise conditions, an approximate version of the connection between average fidelity and RB decay rates.

More formally, we leverage the Fourier-transform framework introduced in Ref. [25] to derive the following expression for the entanglement fidelity, which is linearly related to the average fidelity, averaged over all elements of the group as

$$F_e(\phi, \omega) = \frac{1}{d^{2}}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, f_{\max}(\sigma_\lambda)\, \alpha_{\mathrm{overlap}} + \alpha_{\mathrm{res}}, \tag{9}$$

where $f_{\max}(\sigma_\lambda)$ is the RB decay parameter associated with the irreducible subrepresentation $\sigma_\lambda$. In the Fourier framework $f_{\max}(\sigma_\lambda)$ corresponds to the largest eigenvalue of the Fourier transform of the implementation map φ evaluated at $\sigma_\lambda$. Furthermore, the parameter $\alpha_{\mathrm{overlap}}$ encodes the overlap between the (left and right) eigenvectors associated with this largest eigenvalue, and the eigenvector of the Fourier transform of the reference representation ω evaluated at $\sigma_\lambda$. Finally, the term $\alpha_{\mathrm{res}}$, the residuum, encodes information about the subdominant eigenspaces of the Fourier transform. The factors $\alpha_{\mathrm{overlap}}$, $\alpha_{\mathrm{res}}$ are gauge dependent. We give bounds on the overlap and residuum in terms of the deviation of φ from the reference ω and discuss relevant scenarios where these terms contribute only negligibly to the entanglement fidelity (and thus when
framework. Both of these proposals provide an exact con- RB decay data corresponds approximately to an average
nection between the decay rates of RB and the average fidelity).
fidelity. However, several crucial questions of interpreta-
tion have still been left open, and in this work we aim to F. Nontechnical discussion
address some of them, and sharpen others. In this work, we develop a comprehensive theory of
In Sec. IX B 2, we substantially generalize both pro- randomized benchmarking. Our main motivation has been
posed connections between decay rates and average our desire to give a mathematical framework for RB and
fidelity to RB with arbitrary finite groups. What is more, to classify known schemes. It should be clear, however,
we argue that these two proposals are in fact equivalent. that our work goes significantly beyond a mere classi-
Moreover, we present an explicit example of a completely fication of what is present in the literature. Since our

020357-7
J. HELSEN et al. PRX QUANTUM 3, 020357 (2022)

work is in parts rather technical, in the following we formulate a series of "take-home messages": actionable advice for experimentalists interested in using RB in the laboratory and developing new protocols to suit their needs.

1. RB gives exponential decays under broad (Markovian) circumstances. Confirming experimental intuition, and extending earlier results for specific groups, our main result (Theorem 8) proves that RB protocols behave (up to an exponentially small correction factor) as expected whenever the noise afflicting the gate set is Markovian and time independent. Because the correction factor is so small, any deviation from the prescribed functional form can in fact be taken as evidence of non-Markovian or time-dependent noise processes (as suggested earlier by Ref. [24]). We do wish to emphasize that the error term in Theorem 8 can be quite significant for small sequence lengths. Hence we recommend as a rule of thumb that RB experiments should not include very short (m ≤ 5) sequence lengths, especially when strong gate-dependent (but Markovian) noise is suspected, as this might bias the estimator for the decay rate.
2. RB is broadly resistant to deviations from uniform sampling. Similar to robustness against gate-dependent Markovian noise, we prove (Theorem 9) that RB gives correct results even when the group is not being sampled exactly uniformly. This broadly justifies the use of (generically applicable) Markov chain techniques for sampling group elements [14], overcoming a key technical hurdle in running RB protocols with new groups.
3. The decay rates given by RB can be interpreted as an average fidelity (but caveats apply). We find that the decay rates of general RB experiments can always be exactly associated to the average fidelity of a fixed process, however, this process need not be physical (i.e., it does not always correspond to a completely positive and trace-preserving map). Alternatively, we show that RB decay rates can always be connected approximately to the average fidelity of a physical process, but the degree of approximation is dependent upon external beliefs about the underlying noise process. Hence, we believe the interpretation of RB decay rates as an average fidelity to be broadly valid, but subject to technical caveats.

These three messages can be considered folklore knowledge in the RB community, for which we provide a rigorous underpinning. However, our work also contains new conceptual developments, notably the following.

1. Filtering scalably extends RB to a large class of groups. As formalized in Sec. VII, a major practical hurdle to applying RB with arbitrary finite groups, is that this generically requires the fitting of output data to multiexponential decays. This is a difficult problem both in theory and in practice and it has so far contributed to the limited experimental use of RB beyond a few groups (such as the Clifford group). Our new procedure, which we call filtering (or filtered RB), takes a major step towards solving this problem by giving a generic procedure for isolating exponential decays in a fully scalable manner. This approach is discussed in great detail in Sec. VIII, with the protocol given explicitly in Algorithm Box 2. This procedure is guaranteed to be scalable for all groups as long as the measurement POVM forms a 3-design, but we believe that it applies beyond that (see, in particular, the example of linear cross-entropy benchmarking discussed in Sec. VIII C).
2. Inversion gates are not required for RB. Another key practical difficulty in performing randomized benchmarking has been the necessity to compute and implement a global inversion gate. However, filtered RB has the bonus property that it does not require the application of inverses. Instead a random noisy gate sequence can be directly compared to a perfect classical simulated version to extract the same RB decay rates, making the quantum part of the protocol significantly easier to implement. However, this simplicity is gained at a (constant) extra sampling overhead, as the inversion gate in standard RB also suppresses the sampling complexity [56].

With these new contributions, our framework serves as a convenient basis to design new schemes that come with rigorous performance bounds built in. We expect this to facilitate and accelerate the development of more sophisticated and tailor-made benchmarking schemes as required by experimental practitioners. Steps in this direction have already been made [57–59]. In particular, Ref. [58] explores the framework put forth here for continuous groups of quantum gates.

G. Structure of this work

In Sec. III, we discuss mathematical preliminaries: we set the notation for the rest of the work and recall standard notions from representation theory. This section can be skipped by experienced readers.

In Sec. IV, we discuss implementation maps: linear maps from finite groups to superoperators, a central concept in our treatment of RB. We also give an introduction into matrix-valued Fourier theory and explicitly state
several results from the perturbation theory of non-normal matrices, which we use throughout the rest of the work.

In Sec. V, we give a general framework for RB, with its two phases: the data-collection and data-processing phases, and give a general protocol for the data-collection phase. This protocol, which depends on a range of input parameters, covers (the data-collection phase of) all known versions of RB. We also discuss a typology of RB schemes, dividing up the known protocols into a few generic classes.

In Sec. VI, we present a series of general theorems that govern the behavior of the output data of a RB protocol. We confirm the folklore knowledge that RB data is well described by a linear combination of (matrix) exponentials, under some general assumptions.

In Sec. VII, we discuss general procedures for extracting decay parameters from RB output data. We discuss implementation and general limitations and prove a sampling complexity statement for RB.

In Sec. VIII, we propose a general postprocessing method for isolating exponential decays associated with particular subrepresentations. We argue that this postprocessing method covers many previously proposed procedures. We also prove a sufficient condition under which this postprocessing scheme is scalable for any RB protocol and analyze linear cross-entropy benchmarking as an example.

In Sec. IX, we discuss the relation between the decay rates generated by RB and the average fidelity, focusing in particular on the gauge freedom in the presentation of the underlying noise channels.

Finally, in Sec. X, we argue that the assumptions made in Sec. VI are natural and in some sense necessary for the correct behavior of RB.

III. PRELIMINARIES: QUANTUM CHANNELS AND GROUP REPRESENTATIONS

In this section, we go over some of the basic mathematical machinery needed to talk about randomized benchmarking and prove our central theorems. We discuss quantum channels and their matrix representations (Sec. III A), and groups and group representations (Sec. III B). This is fairly standard material, and beyond the setting of notation it can be skipped by an experienced reader.

We begin by setting the stage and introducing some basic notation used throughout our work. We denote complex vector spaces by $V$ or more explicitly by $\mathbb{C}^d$. We denote by $M_d$ the vector space of complex linear transformations of $\mathbb{C}^d$ and by $S_d$ the space of linear transformations of $M_d$, often called superoperators. Here $d$ is an integer that in many cases can be thought of as being a power of 2 ($d = 2^q$), however, all theorems are valid for general $d$ unless explicitly stated. We denote by $\mathrm{Tr}_V$ the partial trace over a tensor factor $V$ (of an implied tensor product space $V \otimes W$ for some $W$). Finally we denote the complex conjugate by a bar (i.e., $\bar{A}$ is the entrywise complex conjugate of $A$).

A. Quantum channels and the operator-matrix representation

Unitary operations as they are generated by quantum gates, which are the focus of attention in this work, are quantum channels. Formally, quantum channels are superoperators, that is elements of $S_d$, that are trace preserving and completely positive. In order to represent quantum channels (and elements of $S_d$ more generally), we make use of the operator matrix representation. Given a quantum channel $\mathcal{E} \in S_d$, we can represent it as an element of $M_{d^2}$ by choosing an orthonormal basis (with respect to the trace or Hilbert-Schmidt inner product) $\{b_j\}_{j=1}^{d^2}$ for $M_d$. Thus $\mathcal{E}$ (abusing notation) is a $d^2 \times d^2$ matrix with components

$$\mathcal{E}_{j,k} := \mathrm{Tr}\big[b_j^\dagger\, \mathcal{E}(b_k)\big]. \tag{10}$$

Analogously, (density) matrices $\rho \in M_d$ can be represented as vectors,

$$|\rho\rangle = \begin{pmatrix} \rho_1 \\ \rho_2 \\ \vdots \\ \rho_{d^2} \end{pmatrix} \quad \text{with} \quad \rho_k := \mathrm{Tr}\big[b_k^\dagger \rho\big]. \tag{11}$$

Note that the action $\mathcal{E}(\rho)$ now corresponds to a matrix-vector multiplication $\mathcal{E}|\rho\rangle$ and the concatenation of two channels $\mathcal{E}$ and $\mathcal{E}'$ into a matrix multiplication $\mathcal{E}\mathcal{E}'$. We can analogously write a (POVM element) matrix $\Pi \in M_d$ as a covector

$$\langle\Pi| = \begin{pmatrix} \Pi_1 & \Pi_2 & \cdots & \Pi_{d^2} \end{pmatrix} \quad \text{with} \quad \Pi_k := \langle \Pi, b_k \rangle = \mathrm{Tr}[\Pi\, b_k]. \tag{12}$$

With this, the probability to obtain an outcome described by the POVM element $\Pi$ when measuring $\rho$ is $p(\Pi|\rho) = \langle \Pi, \rho \rangle = \mathrm{Tr}[\Pi \rho]$.

B. Representations of groups

At the heart of our discussion are notions of representations of groups. In this section, we hence recall some basic facts about the representations of finite (and compact) groups over complex vector spaces, with a focus on their use in quantum computation. For a more in-depth treatment of this topic we refer to Refs. [60,61]. In this work we restrict our attention to finite groups, keeping the notation more concise. Most results can be analogously stated for continuous, compact groups and derived following the same strategy. Reference [58] carefully discusses the required modifications and gives explicit reformulations for continuous compact groups.
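The operator-matrix representation of Eqs. (10)-(12) is easy to check numerically. The following is a minimal Python sketch, assuming a single qubit (d = 2) and the normalized Pauli matrices as the orthonormal basis; the helper names (`channel_matrix`, `state_vector`, `effect_covector`, `depolarize`, `hadamard`) and the choice of example channels are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Assumed basis: normalized Paulis, orthonormal w.r.t. the Hilbert-Schmidt inner product.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
BASIS = [P / np.sqrt(2) for P in (I2, X, Y, Z)]

def channel_matrix(channel):
    """Components E_{j,k} = Tr[b_j^dagger E(b_k)], cf. Eq. (10)."""
    return np.array([[np.trace(bj.conj().T @ channel(bk)) for bk in BASIS]
                     for bj in BASIS])

def state_vector(rho):
    """Components rho_k = Tr[b_k^dagger rho], cf. Eq. (11)."""
    return np.array([np.trace(bk.conj().T @ rho) for bk in BASIS])

def effect_covector(effect):
    """Components Pi_k = Tr[Pi b_k], cf. Eq. (12)."""
    return np.array([np.trace(effect @ bk) for bk in BASIS])

def depolarize(p):
    """Illustrative noise channel: rho -> (1 - p) rho + p Tr[rho] 1/2."""
    return lambda rho: (1 - p) * rho + p * np.trace(rho) * I2 / 2

def hadamard(rho):
    """Illustrative unitary channel: conjugation by the Hadamard gate."""
    H = (X + Z) / np.sqrt(2)
    return H @ rho @ H.conj().T

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|, used as state and effect
E_dep = channel_matrix(depolarize(0.1))
E_had = channel_matrix(hadamard)

# Channel action becomes matrix-vector multiplication,
# and the outcome probability becomes covector . matrix . vector.
prob = (effect_covector(rho0) @ E_had @ E_dep @ state_vector(rho0)).real
```

In this basis the depolarizing channel is the diagonal matrix diag(1, 1 - p, 1 - p, 1 - p), and composing channels amounts to multiplying their matrices, which is the property the RB analysis relies on.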


1. Representations

Let $G$ be a finite group and consider the space $M_d$ of linear transformations of $\mathbb{C}^d$. A representation $\omega$ is a map $\omega : G \to M_d$ that preserves the group multiplication, i.e.,

$$\omega(g)\omega(h) = \omega(gh), \quad \forall g, h \in G. \tag{13}$$

We require the operators $\omega(g)$ to be unitary as well (for finite groups this can always be done).

2. Reducible and irreducible representations

If there is a nontrivial subspace $W$ of $\mathbb{C}^d$ such that for all vectors $w \in W$ we have

$$\omega(g) w \in W, \quad \forall g \in G, \tag{14}$$

then the representation $\omega$ is called reducible. The restriction of $\omega$ to the subspace $W$ is also a representation, which we call a subrepresentation of $\omega$. If there are no nontrivial subspaces $W$ such that Eq. (14) holds the representation $\omega$ is called irreducible. We generally reserve the letter $\sigma$ to denote irreducible representations.

Two representations $\omega, \omega'$ of a group $G$ are called equivalent if there exists an invertible linear map $T$ such that

$$T\omega(g) = \omega'(g) T, \quad \forall g \in G. \tag{15}$$

We denote this by $\omega \simeq \omega'$. For finite groups $G$ the set of irreducible representations (up to the above equivalence) is finite. We denote it by $\mathrm{Irr}(G)$.

3. Sums, products, and Maschke's lemma

We make use of sums and products of representations. Given representations $\omega, \omega'$, the maps

$$\omega \oplus \omega' : G \to M_d \oplus M_d : g \mapsto \omega(g) \oplus \omega'(g), \tag{16}$$
$$\omega \otimes \omega' : G \to M_d \otimes M_d : g \mapsto \omega(g) \otimes \omega'(g), \tag{17}$$

are again representations. They are, however, generally not irreducible (even if $\omega$ and $\omega'$ are). However, Maschke's lemma ensures that every representation $\omega$ of a group can be uniquely written as a direct sum of irreducible representations, that is

$$\omega(g) \simeq \bigoplus_{\lambda \in \Lambda} \sigma_\lambda(g)^{\oplus n_\lambda}, \quad \forall g \in G, \tag{18}$$

where the index set $\Lambda$ is a subset of the set $\mathrm{Irr}(G)$ and $n_\lambda$ is an integer denoting the number of copies (or multiplicity) of $\sigma_\lambda$ present in $\omega$.

4. Characters

Characters are a central object in representation theory, given by the trace of a representation.

Definition 4: (Character of a representation). The character $\chi_\omega$ of a representation $\omega$ of a group $G$ is defined as

$$\chi_\omega(g) = \mathrm{Tr}[\omega(g)]. \tag{19}$$

One of the most important properties for characters of irreducible representations is the following orthogonality relation.

Proposition 5: (Orthogonality formula). Let $\chi_\lambda, \chi_{\lambda'}$ be the characters of two irreducible representations $\sigma_\lambda, \sigma_{\lambda'}$ of a group $G$. Then

$$\frac{1}{|G|} \sum_{g \in G} \overline{\chi_\lambda(g)}\, \chi_{\lambda'}(g) = \begin{cases} 1 & \text{if } \sigma_\lambda \simeq \sigma_{\lambda'}, \\ 0 & \text{if } \sigma_\lambda \not\simeq \sigma_{\lambda'}. \end{cases} \tag{20}$$

5. Projections onto irreducible representation

Given a representation $\omega = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$ on a vector space $V_\omega = \bigoplus_{\lambda \in \Lambda} V_\lambda^{\oplus n_\lambda}$ we can choose a basis $\{v_j^\lambda \mid j \in 1, \ldots, d_\lambda\}$ for each $V_\lambda$. Each vector $v$ in $V_\omega$ can thus be written as a linear combination $v = \sum_{\lambda \in \Lambda} \sum_{j=1}^{d_\lambda} c_j^\lambda v_j^\lambda$. We can conversely identify the basis vector components of any vector $v$ by application of an appropriate projection $P_j^\lambda$, such that $P_j^\lambda v = c_j^\lambda v_j^\lambda$, where

$$P_j^\lambda = \frac{d_\lambda}{|G|} \sum_{g \in G} \overline{\sigma_\lambda(g)_{j,j}}\; \omega(g). \tag{21}$$

Note that, in order to construct these projections, the knowledge of the diagonal elements of the corresponding irreducible representation $\sigma_\lambda$ is required. However, it is also possible to project any vector onto distinct irreducible subspaces (up to multiplicity) by using only knowledge of the character of a representation:

$$P_\lambda = \frac{d_\lambda}{|G|} \sum_{g \in G} \overline{\chi_\lambda(g)}\; \omega(g). \tag{22}$$

This last formula follows simply from the definition of the character as $\chi_\lambda(g) = \mathrm{Tr}[\sigma_\lambda(g)]$.

IV. FOURIER TRANSFORMS AND PERTURBATION THEORY OF IMPLEMENTATION MAPS

In this section, we review the concept of group implementation maps and their Fourier theory (Sec. IV B).
Mathematically this corresponds to noncommutative harmonic analysis of matrix-valued functions. We also discuss perturbation theory for non-normal matrices. This material is somewhat less well known, so we spend more time discussing these concepts.

A. Implementation maps

Given a group $G$, we can assign quantum circuits [elements of $U(d)$] to each group element, which gives rise to a representation of the group. However, in practice, quantum circuits will not be executed perfectly, but rather include noise. This noise can be modeled by a quantum channel, and we can thus envision assigning to each group element a quantum channel modeling the real implementation of that circuit. These quantum channels can be composed, but this composition will not necessarily maintain group structure and will thus in general not form a representation. However, we can define the more general concept of an "implementation map" $\phi$, which is a function from a finite group $G$ to the space of superoperators $S_d$,

$$\phi : G \to S_d, \tag{23}$$

where we usually assume that $\phi(g)$ is a trace nonincreasing quantum channel for all $g$. If we want to draw explicit attention to this fact we call $\phi$ completely positive if and only if $\phi(g)$ is completely positive for all $g \in G$. Finally, note that if $\phi(g)\phi(h) = \phi(gh)$ for all $g, h \in G$ then $\phi$ would be a representation. We can think of the implementation map as being an abstract presentation of the noisy implementation of the group elements, which depends on the noise processes in the quantum computer but also on other choices such as the compilation of circuits into elementary gates.

B. Fourier transforms of implementation maps

When considering an implementation map one can ask precisely when it is a representation, and failing that, if it is close to a representation (in some reasonable way). To answer this question we need to introduce some mathematical machinery. This machinery was first introduced into the theory of randomized benchmarking by Ref. [25], based on work by Gowers and Hatami [62], which is itself a partial review of older mathematical work. In this section, we consider general maps $\phi$ from a group $G$ to a space of $d \times d$ matrices $M_d$. Thinking of $S_d$ as a matrix space, our notion of implementation map can be seen to be a special case of these maps. Given a map $\phi$ we define its Fourier transform $\mathcal{F}(\phi)$ as

$$\mathcal{F}(\phi)[\sigma_\lambda] = \frac{1}{|G|} \sum_{g \in G} \overline{\sigma_\lambda(g)} \otimes \phi(g) \tag{24}$$

for all $\lambda \in \mathrm{Irr}(G)$. So the Fourier transform $\mathcal{F}(\phi)$ is a function from the set $\mathrm{Irr}(G)$ of irreducible representations of $G$ to a set of matrices. This definition has all the properties of a Fourier transform. Firstly, it has an inverse transform, which maps $\mathcal{F}(\phi)$ back to $\phi$, given by

$$\mathcal{F}^{-1}\big[\mathcal{F}(\phi)\big](g) = \sum_{\lambda \in \mathrm{Irr}(G)} d_\lambda\, \mathrm{Tr}_{V_\lambda}\Big[\mathcal{F}(\phi)[\sigma_\lambda]\, \big(\overline{\sigma_\lambda(g^{-1})} \otimes \mathbb{1}\big)\Big] \tag{25}$$

for all $g \in G$ and where $d_\lambda$ is the dimension of $V_\lambda$, the space on which the representation $\sigma_\lambda$ acts.

Secondly, it has the correct behavior with respect to convolutions of implementation maps: the Fourier transform of a convolution corresponds to a product of Fourier transforms. Recalling the definition of a convolution of two implementation maps $\phi, \phi'$

$$\phi * \phi'(g) = \frac{1}{|G|} \sum_{g' \in G} \phi(g g'^{-1})\, \phi'(g') \tag{26}$$

we can easily see the following:

$$\begin{aligned} \mathcal{F}(\phi * \phi')[\sigma_\lambda] &= \frac{1}{|G|^2} \sum_{g, g' \in G} \overline{\sigma_\lambda(g)} \otimes \phi(g g'^{-1})\, \phi'(g') \\ &= \frac{1}{|G|^2} \sum_{g, g' \in G} \overline{\sigma_\lambda(g g')} \otimes \phi(g)\, \phi'(g') \\ &= \mathcal{F}(\phi)[\sigma_\lambda]\, \mathcal{F}(\phi')[\sigma_\lambda] \end{aligned} \tag{27}$$

for all $\lambda \in \mathrm{Irr}(G)$. Another useful property is the Parseval identity

$$\frac{1}{|G|} \sum_{g \in G} \mathrm{Tr}\big[\phi(g)^\dagger \phi'(g)\big] = \sum_{\lambda \in \mathrm{Irr}(G)} d_\lambda\, \mathrm{Tr}\big\{\mathcal{F}(\phi)[\sigma_\lambda]^\dagger\, \mathcal{F}(\phi')[\sigma_\lambda]\big\}. \tag{28}$$

Finally, we note that the Fourier transform (evaluated at an irreducible representation) of a representation is an orthogonal projector with its rank given by the multiplicity of that irreducible representation. To see this, consider a representation $\omega = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$. We have that

$$\big\{\mathcal{F}(\omega)[\sigma_\lambda]\big\}^2 = \frac{1}{|G|^2} \sum_{g, g' \in G} \overline{\sigma_\lambda(g g')} \otimes \omega(g g') = \frac{|G|}{|G|^2} \sum_{g \in G} \overline{\sigma_\lambda(g)} \otimes \omega(g) = \mathcal{F}(\omega)[\sigma_\lambda] \tag{29}$$

for all $\lambda \in \mathrm{Irr}(G)$. Moreover for $\lambda \in \Lambda$ we have

$$\mathrm{Tr}\big\{\mathcal{F}(\omega)[\sigma_\lambda]\big\} = \frac{1}{|G|} \sum_{g \in G} \overline{\chi_{\sigma_\lambda}(g)}\, \chi_\omega(g) = n_\lambda \tag{30}$$

by the character orthogonality formula.
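The properties stated in Eqs. (27), (29), and (30) can be verified on a small example. The sketch below is a hedged illustration rather than anything from the paper: it assumes the group S_3, its three irreducible representations built by hand, the three-dimensional permutation representation as the reference representation, and the convention that the irreducible representation enters the transform complex conjugated (the convention under which the trace in Eq. (30) counts multiplicities). All function and variable names are hypothetical.

```python
import itertools
import numpy as np

# Toy group: S_3, elements stored as tuples p with p[i] the image of i.
GROUP = list(itertools.permutations(range(3)))

def compose(p, q):
    """Group product p*q, acting as i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0, 0, 0]
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def perm_matrix(p):
    """Permutation representation on C^3 (sends e_i to e_{p[i]})."""
    M = np.zeros((3, 3))
    for i, pi in enumerate(p):
        M[pi, i] = 1.0
    return M

# Orthonormal basis of the subspace orthogonal to (1,1,1); it carries the 2-dim irrep.
B = np.array([[1, -1, 0], [1, 1, -2]]).T / np.sqrt([2, 6])

IRREPS = {
    "trivial": lambda p: np.eye(1),
    "sign": lambda p: np.array([[np.round(np.linalg.det(perm_matrix(p)))]]),
    "standard": lambda p: B.T @ perm_matrix(p) @ B,
}

def fourier(phi, irrep):
    """(1/|G|) sum_g conj(sigma(g)) (x) phi(g); phi is a dict g -> matrix."""
    return sum(np.kron(np.conj(irrep(g)), phi[g]) for g in GROUP) / len(GROUP)

def convolve(phi1, phi2):
    """(phi1 * phi2)(g) = (1/|G|) sum_h phi1(g h^{-1}) phi2(h), cf. Eq. (26)."""
    return {
        g: sum(phi1[compose(g, inverse(h))] @ phi2[h] for h in GROUP) / len(GROUP)
        for g in GROUP
    }

omega = {g: perm_matrix(g) for g in GROUP}            # a genuine representation
rng = np.random.default_rng(0)
phi1 = {g: rng.normal(size=(3, 3)) for g in GROUP}    # arbitrary matrix-valued maps
phi2 = {g: rng.normal(size=(3, 3)) for g in GROUP}

# Multiplicities of each irrep in the permutation representation, read off as traces.
multiplicities = {name: np.trace(fourier(omega, s)).real for name, s in IRREPS.items()}
```

The permutation representation of S_3 decomposes into one copy of the trivial and one copy of the standard representation, so the traces come out as 1, 0, and 1 for the trivial, sign, and standard irreducible representations. For the random maps `phi1` and `phi2`, which are not representations, the convolution property still holds while the projector property does not.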


1. Fourier operators

We also give another, useful way to think about the matrix Fourier transform, namely in terms of what we call Fourier operators.

Note that the set of maps $\phi : G \to S_d$ can be seen as a vector space under pointwise addition (of the superoperators). We can further lift this vector space to an algebra by considering the convolution operator $*$ [as defined in Eq. (26)] on the functions in the vector space. We can construct a faithful (i.e., injective) matrix representation of this algebra as

$$F(\phi) = \frac{1}{|G|} \sum_{g \in G} \bigoplus_{\lambda \in \mathrm{Irr}(G)} \overline{\sigma_\lambda(g)} \otimes \phi(g) = \frac{1}{|G|} \sum_{g \in G} \omega_G(g) \otimes \phi(g), \tag{31}$$

with $\omega_G = \bigoplus_{\lambda \in \mathrm{Irr}(G)} \overline{\sigma_\lambda}$. This is just the Fourier transform of $\phi$ gathered in a direct sum [note that $\mathrm{Irr}(G)$, and hence the sum, is finite for any finite group]. By the Peter-Weyl theorem for finite groups one can equally well think of $\omega_G(g)$ as an element of the group algebra $\mathbb{C}[G]$ associated with $G$, but we do not use this point of view explicitly. We call $F(\phi)$ the Fourier operator of $\phi$. From the properties of the Fourier transform we immediately see that $F(\phi) F(\phi') = F(\phi * \phi')$. It is useful to equip the algebra of Fourier operators with several norms, based on the diamond norm $\|\cdot\|_\diamond$ for $S_d$ (in principle, this construction will work for any norm on $S_d$). We define

$$\|F(\phi)\|_{\max} = \max_{g \in G} \Big\| \mathrm{Tr}_{V_{\omega_G}}\big[ D_{\omega_G} \big(\omega_G(g^{-1}) \otimes \mathbb{1}\big) F(\phi) \big] \Big\|_\diamond = \max_{g \in G} \|\phi(g)\|_\diamond, \tag{32}$$

$$\|F(\phi)\|_{m} = \frac{1}{|G|} \sum_{g \in G} \Big\| \mathrm{Tr}_{V_{\omega_G}}\big[ D_{\omega_G} \big(\omega_G(g^{-1}) \otimes \mathbb{1}\big) F(\phi) \big] \Big\|_\diamond = \frac{1}{|G|} \sum_{g \in G} \|\phi(g)\|_\diamond, \tag{33}$$

where $D_{\omega_G} = \bigoplus_{\lambda \in \mathrm{Irr}(G)} d_\lambda \mathbb{1}_\lambda$ collects the relevant dimensional factors and where the second equality follows from the properties of the Fourier transform. These norms are bona fide matrix norms on the algebra of Fourier operators, notably they are submultiplicative, viz.,

$$\begin{aligned} \|F(\phi) F(\phi')\|_{\max} = \|F(\phi * \phi')\|_{\max} &= \max_{g \in G} \|\phi * \phi'(g)\|_\diamond \\ &\le \max_{g \in G} \frac{1}{|G|} \sum_{\hat g \in G} \|\phi(g \hat g^{-1})\|_\diamond\, \|\phi'(\hat g)\|_\diamond \\ &\le \max_{g, \hat g \in G} \|\phi(g)\|_\diamond\, \|\phi'(\hat g)\|_\diamond \\ &= \|F(\phi)\|_{\max}\, \|F(\phi')\|_{\max} \end{aligned} \tag{34}$$

and similarly for $\|\cdot\|_m$. We also have a bound involving both norms

$$\|F(\phi) F(\phi')\|_{\max} = \|F(\phi * \phi')\|_{\max} = \max_{g \in G} \|\phi * \phi'(g)\|_\diamond \tag{35}$$

$$\le \max_{g \in G} \frac{1}{|G|} \sum_{\hat g \in G} \|\phi(g \hat g^{-1})\|_\diamond\, \|\phi'(\hat g)\|_\diamond \le \|F(\phi)\|_{\max}\, \|F(\phi')\|_{m}, \tag{36}$$

which will be helpful later.

C. Perturbation theory

In this section, we gather some technical tools from matrix perturbation theory that are essential to many of the proofs in this paper. Our sources for this section are the standard books of Stewart and Sun [53] and Kato [52]. For the rest of this section, we assume that $\|\cdot\|$ denotes a submultiplicative matrix norm on $M_d$, i.e., $\|AB\| \le \|A\| \|B\|$ for all $A, B \in M_d$.

Let $A \in M_d$ be a complex Hermitian matrix. Assume that there exists a unitary matrix $X = [X_1, X_2]$ such that the columns of $X_1$ and $X_2$ span invariant subspaces of $A$, that is

$$[X_1, X_2]^\dagger A [X_1, X_2] = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}, \tag{37}$$

with $A_1 = X_1^\dagger A X_1$ and $A_2 = X_2^\dagger A X_2$. We call this a spectral resolution of $A$. We can think of $A_1, A_2$ as the matrix $A$ restricted to subspaces of $\mathbb{C}^d$ spanned by the columns of $X_1, X_2$, respectively, and furthermore we assume that the eigenvalues of $A_1$ are all distinct from the ones of $A_2$: the subspaces are then said to be simple. These subspaces are invariant under the action of $A$ in the sense that $A X_1 = X_1 A_1$ and are hence called invariant subspaces. It turns out that spectral resolutions, and invariant subspaces more generally, are stable against (small) perturbations. That is, given a perturbation matrix $E$ (not necessarily Hermitian) we can find matrices $R = [R_1, R_2]$ and $L = [L_1, L_2]$ such that $L^\dagger = R^{-1}$ and

$$[L_1, L_2]^\dagger (A + E) [R_1, R_2] = \begin{pmatrix} A_1' & 0 \\ 0 & A_2' \end{pmatrix} \tag{38}$$

for some $A_1', A_2'$ and the matrices $R, L$ are close to $X$ in a well-specified sense. This is what one would expect from a perturbation theorem. It, however, holds only if the perturbation $E$ is small with respect to the difference between
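$A_1$ and $A_2$. This difference is made quantitative by the so-called separation function:

$$\mathrm{sep}(A_1, A_2) = \min_{Z \neq 0} \frac{\|A_1 Z - Z A_2\|}{\|X_1 Z X_2^\dagger\|}. \tag{39}$$

This separation function has some rather nice properties. Firstly, it is symmetric in its arguments:

$$\mathrm{sep}(A_1, A_2) = \mathrm{sep}(A_2, A_1). \tag{40}$$

Secondly, it is stable against perturbations, i.e., given a perturbation $A + E$ of $A$ we have

$$|\mathrm{sep}(A_1 + E_1, A_2 + E_2) - \mathrm{sep}(A_1, A_2)| \le \|E_1\| + \|E_2\|. \tag{41}$$

With this function we can state the following theorem, which can be derived from Theorem 2.8 in Ref. [53, p. 238].

Theorem 6: (Reference [53]). Let $A$ be a complex Hermitian matrix with spectral resolution $\mathrm{diag}(A_1, A_2)$ induced by a unitary $X = [X_1, X_2]$. Also, let $\|\cdot\|$ be a matrix norm. Now let $E$ be a complex matrix. If $E$ has the properties

$$\frac{\|X_1^\dagger E X_2\|\, \|X_2^\dagger E X_1\|}{\big(\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1\| - \|X_2^\dagger E X_2\|\big)^2} < \frac{1}{4}, \tag{42}$$

$$\frac{\|X_1^\dagger E X_2\|\, \|X_2^\dagger E X_1\| + \|X_1^\dagger E X_2\|\, \|X_1^\dagger E X_2\|}{\big(\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1\| - \|X_2^\dagger E X_2\|\big)^2} < \frac{1}{2} \tag{43}$$

then there exist matrices $P_1, P_2$ such that

$$\|P_1\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1\| - \|X_2^\dagger E X_2\|}, \tag{44}$$

$$\|P_2\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1 + X_1^\dagger E X_2 P_1\| - \|X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2\|} \tag{45}$$

and

$$[L_1, L_2]^\dagger (A + E) [R_1, R_2] = \begin{pmatrix} A_1' & 0 \\ 0 & A_2' \end{pmatrix}, \tag{46}$$

with

$$[R_1, R_2] = [X_1, X_2] \begin{pmatrix} \mathbb{1} & 0 \\ P_1 & I \end{pmatrix} \begin{pmatrix} \mathbb{1} & P_2 \\ 0 & I \end{pmatrix}, \tag{47}$$

$$[L_1, L_2]^\dagger = \begin{pmatrix} \mathbb{1} & -P_2 \\ 0 & I \end{pmatrix} \begin{pmatrix} \mathbb{1} & 0 \\ -P_1 & I \end{pmatrix} [X_1, X_2]^\dagger, \tag{48}$$

and $A_1' = A_1 + X_1^\dagger E X_1 - X_2^\dagger E X_1 P_1$ and $A_2' = A_2 + X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2$. Equivalently, we have

$$A + E = R_1 A_1' L_1^\dagger + R_2 A_2' L_2^\dagger. \tag{49}$$

Proof. From the first property in Eq. (42), and Theorem 2.8 in Ref. [53] we conclude the existence of a matrix $P_1$ such that

$$\|P_1\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_2^\dagger E X_2\| - \|X_1^\dagger E X_1\|}, \tag{50}$$

and

$$\begin{pmatrix} \mathbb{1} & 0 \\ -P_1 & I \end{pmatrix} [X_1, X_2]^\dagger (A + E) [X_1, X_2] \begin{pmatrix} \mathbb{1} & 0 \\ P_1 & I \end{pmatrix} = \begin{pmatrix} A_1' & E_{12} \\ 0 & A_2' \end{pmatrix}, \tag{51}$$

with $E_{12} = X_1^\dagger E X_2$ and $A_1' = A_1 + X_1^\dagger E X_1 - X_2^\dagger E X_1 P_1$ and $A_2' = A_2 + X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2$. Now considering the above as a perturbation of $A' = \begin{pmatrix} A_1' & 0 \\ 0 & A_2' \end{pmatrix}$ we can apply Theorem 2.8 from Ref. [53] again so long as

$$\mathrm{sep}(A_2', A_1') > 0. \tag{52}$$

Using the stability and symmetry of the sep function a sufficient condition for the above is

$$\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1 + X_1^\dagger E X_2 P_1\| - \|X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2\| > 0, \tag{53}$$

which by submultiplicativity and the norm bound on $P_1$ is true if the second property in Eq. (42) holds. Hence Theorem 2.8 in Ref. [53] provides for the existence of a $P_2$ with norm bound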
$$\|P_2\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1 + X_1^\dagger E X_2 P_1\| - \|X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2\|} \tag{54}$$

and the property that

$$[L_1, L_2]^\dagger (A + E) [R_1, R_2] = \begin{pmatrix} A_1' & 0 \\ 0 & A_2' \end{pmatrix}, \tag{55}$$

with

$$[R_1, R_2] = [X_1, X_2] \begin{pmatrix} \mathbb{1} & 0 \\ P_1 & I \end{pmatrix} \begin{pmatrix} \mathbb{1} & P_2 \\ 0 & I \end{pmatrix}, \tag{56}$$

$$[L_1, L_2]^\dagger = \begin{pmatrix} \mathbb{1} & -P_2 \\ 0 & I \end{pmatrix} \begin{pmatrix} \mathbb{1} & 0 \\ -P_1 & I \end{pmatrix} [X_1, X_2]^\dagger. \tag{57}$$

We note that in Eq. (42) the first property implies the second if $\|X_1^\dagger E X_2\| \le \|X_2^\dagger E X_1\|$.

While eigenvalues and invariant subspaces are stable under small perturbations (as discussed above), that is, they are holomorphic functions with respect to analytic perturbations, the same is not true for eigenvectors. This is due to the fact that a vector basis spanning a multidimensional eigenspace is not uniquely determined, and thus the eigenvectors of the perturbed eigenspace may be completely different from the unperturbed basis. However, if an unperturbed eigenvalue $a_1$ is simple, the related eigenvector $x_1$ is unique (up to a scalar factor), and it is thus stable. We can make this more explicit by specializing Theorem 6 to simple invariant subspaces of dimension one. Let us again consider a Hermitian matrix $A$ and adopt a unitary basis transformation $X = [x_1, X_2]$ so that

$$[x_1, X_2]^\dagger A [x_1, X_2] = \begin{pmatrix} a_1 & 0 \\ 0 & A_2 \end{pmatrix}, \tag{58}$$

where $A_2 \in M_{d-1}$. In this specific setting, the separation function becomes [63]

$$\mathrm{sep}(a_1, A_2) = \big\|(a_1 \mathbb{1} - A_2)^{-1}\big\|^{-1}. \tag{59}$$

From Theorem 6, we then have the following.

Corollary 7: (Perturbation of a 1-dim simple subspace). The left and right perturbed eigenvectors originated from $x_1$ are

$$r_1 \approx x_1 + X_2 (a_1 \mathbb{1} - A_2)^{-1} X_2^\dagger E x_1 \quad \text{and} \quad l_1^\dagger \approx x_1^\dagger + x_1^\dagger E X_2 (a_1 \mathbb{1} - A_2)^{-1} X_2^\dagger, \tag{60}$$

where we neglect terms $O(\|E\|^2)$.

Finally, to analyze perturbations of eigenvalues, we make use of the Bauer-Fike Theorem [53, Theorem 1.6]: let $A$ be diagonalizable such that $S^{-1} A S = \mathrm{diag}[(a_j)_j]$ and let $E$ be an arbitrary operator of the same dimension. Then, for any eigenvalue $\tilde a$ of $A + E$, the bound

$$|\tilde a - a_j| \le \|S\|\, \|S^{-1}\|\, \|E\| \tag{61}$$

is satisfied for some eigenvalue $a_j$ in any vector-induced norm. This implies that, if $A$ is Hermitian, then

$$|\tilde a - a_j| \le \|E\|. \tag{62}$$

V. THE RANDOMIZED BENCHMARKING PROTOCOL

The name randomized benchmarking is conventionally given to a class of methods that assess the quality of a set of quantum gates. These methods are probabilistic, and can be seen as constructing an estimator for a quantity that captures some notion of gate quality. In this section, we make an attempt at defining randomized benchmarking. By this we mean that we attempt to organize and make explicit various ideas that have been present in the literature. We begin (in Sec. V A) by dividing RB into two parts: a data-collection phase and a data-processing phase. These correspond roughly to the parts of RB performed on a quantum computer and on a classical computer, respectively. Within this division we focus first on the data-collection phase. In Secs. V B and V C, we give a general protocol for the data-collection phase of RB. This general protocol depends on a number of input parameters, and we can obtain every known RB protocol from a choice of these input parameters. We complement this protocol with a classification of RB protocols into a few types in Sec. V D. This classification, which pertains only to the data-collection phase of RB is largely a formalization of knowledge implicit in the literature but we see that it is a useful organizing tool when proving theorems about the data generated by RB. This data we discuss in Sec. V E.

We note that the output of RB data is assumed to be of a very particular form, namely that of a linear combination of (matrix) exponential decays. However, this form is incumbent upon assumptions on the quantum computer on which (the data-collection phase of) RB is implemented. We discuss what assumptions have been made before in the literature and propose our own set of assumptions, which we justify later in the text.
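The statement that RB output data is a linear combination of (matrix) exponential decays, and the role of the Bauer-Fike bound of Eq. (62) in locating the decay rates, can be illustrated with a toy computation. The sketch below is only an illustration under stated assumptions: the matrix `A`, the perturbation `E`, and the vectors `left` and `right` are hypothetical stand-ins (for an ideal Fourier-space operator, its noisy deviation, the measurement covector, and the input-state vector) and are not derived from any actual RB protocol.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# Hypothetical "ideal" Hermitian matrix A with well-separated eigenvalues a_j.
a = np.array([0.98, 0.90, 0.30, 0.10])
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))      # random orthogonal eigenbasis
A = Q @ np.diag(a) @ Q.T

# Small perturbation E, deliberately not Hermitian, so M = A + E is non-normal.
E = 1e-3 * rng.normal(size=(d, d))
M = A + E

# Stand-ins for the measurement covector and the input-state vector.
left = rng.normal(size=d)
right = rng.normal(size=d)

def signal(m):
    """Matrix-exponential form of the data: left . M^m . right."""
    return left @ np.linalg.matrix_power(M, m) @ right

# Diagonalize M = R diag(evals) R^{-1}; then left . M^m . right = sum_i c_i evals_i^m.
evals, R = np.linalg.eig(M)
coeffs = (left @ R) * np.linalg.solve(R, right)

def signal_from_decays(m):
    """The same quantity written as a linear combination of scalar exponentials."""
    return np.sum(coeffs * evals**m).real

# Bauer-Fike for Hermitian A: every eigenvalue of A + E lies within ||E|| (spectral
# norm) of some unperturbed eigenvalue a_j.
shift = max(np.min(np.abs(ev - a)) for ev in evals)
bound = np.linalg.norm(E, 2)
```

In this toy model the decay rates are the eigenvalues of `M`, and `shift <= bound` expresses that each of them stays within the spectral norm of `E` of an unperturbed eigenvalue.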


A. The data-collection and data-processing phases

RB is composed of two major parts, a data-collection phase and a data-processing phase. The data-collection phase consists of what one typically thinks of as RB: one randomly selects a sequence of quantum gates and applies them to a quantum state together with a global inverse, and measures the resulting state. Averaging over many random choices of these gates one obtains RB output data that depends on the length of the random sequence in a controlled way. This vague description can be made more precise in many different ways and we provide a general framework for this procedure in the next few subsections.

The data-processing phase, on the other hand, consists of what one then does with the data given by a RB experiment. This can be as simple as fitting the data to an exponential decay, but in many cases also involves more sophisticated processing techniques. The key feature of the RB protocol that allows for a structured approach to data processing is the fact that the RB output data has a very controlled form. We discuss this form in Sec. V E after more formally discussing the data-collection phase of RB.

B. Input parameters

The data-collection phase of a RB procedure is characterized by a set of input parameters. These input parameters fully define a protocol (which we write down in Sec. V C) that can be executed on a quantum computer, yielding probabilistic data that can then be interpreted. Below is a list of all input parameters to RB, together with an explanation and examples of choices for these parameters that correspond to versions of RB present in the literature.

1. A gate-set/group: A finite set of unitaries (quantum gates) on $\mathbb{C}^d$. In (almost) all RB protocols this gate set is also a finite subgroup $G \subset U(d)$ of the unitary group. In a large section of the RB literature the group considered is the $q$-qubit Clifford group $C_q$, but a range of other choices (such as the Pauli group $P_q$ [13], the real Clifford group [35] or the CNOT-dihedral group [37,38]) are possible. Choosing a group fixes what gates RB assesses the quality of and partially determines the structure of the output data. In generator-style RB [14,15] this group is defined implicitly by the set of generators.
2. A reference implementation and representation: A map $\phi_r$ from the gate-set/group $G$ to the $d$-dimensional superoperators that specifies how the gates in $G$ should be implemented in the quantum computer. This map takes into account aspects of the specific RB protocol but also how gates are composed of elementary gates and other implementation details. In uniform RB the map $\phi_r$ is a representation of the group $G$ on $\mathcal{S}_d$. The prototypical example is the action on the space of Hermitian matrices $\rho$ by conjugation, i.e., $\phi_r(g)(\rho) = \omega(g)(\rho) = U_g \rho U_g^\dagger$. In general, however, the reference implementation $\phi_r$ is not a representation, though we see that for any known RB procedure the reference implementation can be written as $\phi_r(g) = \mathcal{A}\omega(g)\mathcal{B}$, where $\mathcal{A}, \mathcal{B}$ are (unitary) quantum channels. We refer to $\omega$ as the reference representation.
3. An ending gate: A group element $g_{\rm end}$ that dictates the global action of a RB sequence. For most proposals this gate is simply the identity, but in other proposals nontrivial choices for $g_{\rm end}$ (such as choosing it uniformly at random [13,37,39,45]) play an essential role in data-processing schemes. This ending gate also allows us to include RB schemes that do not involve an inversion gate [16,19,42,64]. We emphasize that it is not necessary to implement this gate physically, but rather it arises from compilation.
4. A set of sequence lengths: A set of integers $\mathbb{M}$ denoting the length of the random sequences of gates implemented in a RB experiment. We denote elements of this set by $m$ and the largest element of this set by $M$.
5. An input state: A state $\rho_0$ that is prepared at the beginning of a RB experiment. This state will typically be a pure state (such as the $|0,\ldots,0\rangle$ state vector), but is chosen mixed in some versions of RB [56].
6. An output POVM: A POVM that is measured at the end of a RB experiment. We denote this POVM as $\{\Pi_i\}_{i\in I}$ with some index set $I$. In many cases this is a two-component POVM, but some RB procedures explicitly call for more complex measurements (such as a computational basis measurement [29]).
7. A set of sampling distributions: A set of probability distributions $\nu_i$ for $i \in \{1,\ldots,M\}$ over the group $G$ that govern the random sampling of group elements in RB. We often consider the scenario where all these probability distributions are the same, in which case we drop the subscript $i$ and just write $\nu$ for the probability distribution. Moreover, in almost all instances in the literature this distribution is uniform, i.e., $\nu(g) = 1/|G|$, and unless stated explicitly we always assume this to be the case.

C. The data-collection protocol

Given the input parameters discussed above we can write down a formal procedure for the data-collection phase of RB. It has as output an estimator $\hat{p}(i,m)$ of a probability $p(i,m)$ for each POVM element $\Pi_i$ for $i \in I$ and each sequence length $m \in \mathbb{M}$.

Note that the probabilities $p(i,m)$ depend in a nontrivial manner on the initial state $\rho_0$, the POVM $\{\Pi_i\}_{i\in I}$ and the


ending gate $g_{\rm end}$. We, however, suppress this dependence unless it is explicitly necessary to refer to it.

D. A typology of randomized benchmarking protocols

Given protocol Algorithm 1, different choices of the parameters discussed in Sec. V B give rise to different RB procedures. More strongly, (the data-collection phases of) all variants of RB currently in the literature can be expressed by choosing these input parameters correctly. Surveying the literature we can distinguish three major types that are differentiated by their reference implementations and sampling distributions. The output data associated with these classes of protocols has varying behavior and we treat each class separately in Sec. VI. All protocols included in these classes can be found in Fig. 2 (here we give only illustrative examples).

1. Uniform randomized benchmarking: This is the basic type of RB. It is characterized by the fact that the probability distributions $\nu_i$ are the uniform distribution for all $i \in \{1,\ldots,m_{\max}\}$, and that the reference implementation map $\phi_r$ is exactly a representation $\omega$, usually the standard action by conjugation given by $\omega(g)(\rho) = \phi_r(g)(\rho) = U_g \rho U_g^\dagger$ for unitaries $U_g$ (other choices have been made in Refs. [42,65]). Randomized benchmarking proposals of this type are mainly distinguished by what group $G$ they consider as a gate set (at least when it comes to the data-collection phase, different proposals in this class might have radically different data-processing procedures). Protocols of this type include the original RB proposals [1,66] and many others.
2. Nonuniform randomized benchmarking: The defining feature of this class is that the sampling distributions $\nu_i$ are not the uniform distribution. It comes in two flavors, which we discuss separately:

   (a) Subset RB: Here, the distributions $\nu_i$ are far from uniform (and typically only have support on a small subset of the group $G$). Examples from the literature are Refs. [14,15,39,67].
   (b) Approximate RB: Here the $\nu_i$ are close to uniform. This latter class will turn out to be essentially the same as uniform RB. This class has been discussed in Ref. [14] and also arises in the original "NIST" RB proposal [5] (as per the analysis of Ref. [51]).

   In all works of this type so far the reference implementations are representations (akin to uniform RB).
3. Interleaved randomized benchmarking: This class of RB protocol is characterized by the addition of an extra "interleaving gate" in the RB procedure. This is a class that is somewhat idiosyncratic, having one standard subtype and a collection of "nonstandard" protocols:

   (a) Standard interleaved randomized benchmarking: In this class the interleaving gate is an element of the benchmarked group $G$. In this

Algorithm 1. RB (data-collection phase)
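The data-collection phase described in Secs. V B and V C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's Algorithm 1 verbatim: the gate set is taken to be the single-qubit Clifford group generated by H and S, the ending gate is the identity, sampling is uniform, and the implementation map is a hypothetical gate-independent depolarizing model with parameter `f`. The helper names (`generate_group`, `noisy_gate`, `rb_data_collection`) are invented for this sketch, and outcome probabilities are computed exactly per sequence instead of being sampled from shots.

```python
import numpy as np

# Generators of the single-qubit Clifford group (assumed gate set for this sketch).
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)

def canonical(U):
    """Fix the global phase so that unitaries equal up to phase compare equal."""
    flat = U.flatten()
    first = flat[np.flatnonzero(np.abs(flat) > 1e-9)[0]]
    return U * (np.abs(first) / first)

def generate_group(generators):
    """Close the generators under multiplication, modulo global phase."""
    group = [np.eye(2, dtype=complex)]
    frontier = list(group)
    while frontier:
        new = []
        for U in frontier:
            for g in generators:
                V = canonical(g @ U)
                if not any(np.allclose(V, W) for W in group):
                    group.append(V)
                    new.append(V)
        frontier = new
    return group

def noisy_gate(U, rho, f):
    """Hypothetical implementation map: ideal conjugation followed by
    gate-independent depolarizing noise of strength f."""
    return f * (U @ rho @ U.conj().T) + (1 - f) * np.trace(rho) * np.eye(2) / 2

def rb_data_collection(group, lengths, n_sequences, f, rng):
    """Estimate p(0, m) for each sequence length m (g_end = identity, uniform nu)."""
    rho0 = np.array([[1, 0], [0, 0]], dtype=complex)    # input state |0><0|
    povm0 = np.array([[1, 0], [0, 0]], dtype=complex)   # POVM element Pi_0
    estimates = {}
    for m in lengths:
        total = 0.0
        for _ in range(n_sequences):
            idx = rng.integers(len(group), size=m)      # uniform sampling over G
            rho, product = rho0, np.eye(2, dtype=complex)
            for k in idx:
                rho = noisy_gate(group[k], rho, f)
                product = group[k] @ product
            rho = noisy_gate(product.conj().T, rho, f)  # global inverse gate
            total += np.real(np.trace(povm0 @ rho))
        estimates[m] = total / n_sequences
    return estimates

group = generate_group([H, S])
data = rb_data_collection(group, [1, 5, 20], 10, 0.98, np.random.default_rng(1))
```

Because depolarizing noise commutes with unitary conjugation, this toy model gives $p(0,m) = \tfrac12 + \tfrac12 f^{m+1}$ for every sequence, which is the single-exponential form $A f^m + B$. Realistic, gate-dependent noise would require averaging over many sequences and shots.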


case we find that it is most useful to interpret interleaved RB as uniform RB, with the reference implementation a representation $\omega$, but with the probability distributions $\nu_i$ uniform for even $i$ and peaked on a single group element (the interleaving gate) for odd $i$. We consider this in more detail in Sec. VI C. The paradigmatic example is Ref. [4], but nearly all uniform RB protocols have an interleaved version.

   (b) Nonstandard interleaved randomized benchmarking: These protocols are characterized by the addition of interleaving gates that are not part of the group $G$ as well as nonuniform sampling distributions. We discuss these protocols on a more case-by-case basis in Sec. VI C.

1. Protocols without inversion gates

A number of RB protocols have been developed that do not feature an inversion gate $g_{\rm inv}$. These protocols are indicated with ∗∗ in Fig. 2. While not immediately obvious, these protocols are actually covered by the general procedure written down in Algorithm 1. We can think of these protocols as choosing the ending gate $g_{\rm end}$ at random for each experimental run and averaging over the results. Because of the invariance of uniform group averages this is equivalent to not including an inversion gate and ending the protocol on a random group element. In Sec. VIII we see that protocols without inversion gate can be seen as a special case of a general postprocessing scheme for RB data.

E. Output data

There is a folkloric notion that the output data of RB has an exponential dependence on the sequence length, with the rate of decay dependent only on the implementation $\phi$ of the gates in $G$. This was first established to be true for uniform RB (in our typology) with the unitary and Clifford groups, where, under certain assumptions (see Sec. V F) on the quantum computer implementing operations, one can prove that $p(i,m) = A f^m + B$, where $f$ depends only on the implementation map $\phi$ and $A, B$ are constants depending on SPAM. However, if the group $G$ was not the Clifford group it was found that the RB output data did not follow a single exponential decay but rather was of the form $p(i,m) = \sum_\lambda A_\lambda f_\lambda^m$ with the decay constants $f_\lambda$ depending only on the implementation of the quantum operations and associated with the irreducible subrepresentations of the reference representation $\omega$.

However, this functional form is only valid if the reference representation $\omega$ has no multiplicities (no irreducible subrepresentation occurs more than once), and hence does not describe all possible RB experiments. In this paper we argue that for a general reference representation of the form $\omega = \oplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$ for $\Lambda \subset \mathrm{Irr}(G)$ RB data takes the form

$$ p(i,m) \approx \sum_{\lambda\in\Lambda} \mathrm{Tr}(A_\lambda M_\lambda^m), \tag{63} $$

where $M_\lambda$ is an $n_\lambda \times n_\lambda$ real matrix that depends only on the implementation $\phi$ and $A_\lambda$ is an $n_\lambda \times n_\lambda$ matrix encoding SPAM behavior. Note that the matrices $M_\lambda$ are not required to be normal, or even diagonalizable. This means that $p(i,m)$ can appear to be strikingly nonexponential (at least if $m$ is fairly small) unless $\omega$ is known to be multiplicity-free. We discuss this in greater detail in Sec. VII when we discuss general fitting procedures.

F. Assumptions

The functional form of RB output data given in Eq. (63) does not immediately follow from the specification of the protocol in Algorithm 1. Rather it must be derived based on assumptions on the behavior of the operations being performed inside the quantum computer. Here we give a run down of assumptions that are made throughout the literature, and which we make in order to derive Eq. (63). The assumptions we make are not the most general possible that still lead to Eq. (63), but we attempt to strike a balance between generality and operational motivation. In the list we point out where assumptions can be generalized and refer to work where this is done (for some versions of RB).

(a) State preparation and measurement consistency: We assume that the initial state $\rho_0$ and the measurement POVM $\{\Pi_i\}_{i\in I}$ are always prepared in the same manner, independently of the gates being implemented. Slightly stronger, we assume the existence of quantum channels $\mathcal{E}_{\rm SP}$ and $\mathcal{E}_{\rm M}$ such that the implemented initial state is given by $\mathcal{E}_{\rm SP}(\rho_0)$ and the elements of the implemented measurement POVM are given by $\mathcal{E}_{\rm M}(\Pi_i)$. This assumption is made throughout the RB literature.

(b) Markovianity and time independence: We assume that the implementation of a gate $g \in G$ is always the same, independently of when it is performed in the RB protocol and independently of its context (the gates being performed before and after). This assumption leads to the concept of an implementation map $\phi : G \to \mathcal{S}_d$ which assigns to each group element $g$ a completely positive superoperator $\phi(g)$ modeling the actual implementation of the gate.

   (i) This assumption is not always justified, as the implementation of a gate can in principle depend on, e.g., the gates being implemented before it or the amount of time elapsed in the protocol. It can also depend on external uncontrolled variables (either deterministic or


random). In Ref. [68], a model of time dependence has been considered and in Refs. [69–71] the effect of gate correlations and certain uncontrolled variables such as quasistatic noise were investigated. In all of these scenarios, however, the exponential behavior of Eq. (63) breaks down. It might be possible to derive assumptions beyond the setting of Markovian time independence that lead to output data of the correct form, but we do not pursue this here.

(c) Closeness to reference implementation: In order to derive Eq. (63) we must make additional assumptions on the implementation map $\phi$. We assume that

$$ \frac{1}{|G|} \sum_{g\in G} \|\phi_r(g) - \phi(g)\|_\diamond \le \delta, \tag{64} $$

for sufficiently small $\delta > 0$. The appearance of the diamond distance might strike one as overly pessimistic, however, we show that it is in fact required in Sec. IX. It is also not the most general possible assumption that still guarantees Eq. (63) (see below), but it has the advantage of making reference only to physical quantities and being operationally interpretable.

   (i) In early works on RB the standard assumption was that of gate-independent noise. This means the implementation map $\phi$ is of the form $\phi(g) = \mathcal{A}\phi_r(g)$ for all $g$ with some fixed quantum channel $\mathcal{A}$. This is not a very realistic assumption and several attempts were made to replace it with a weaker assumption. In Ref. [66] it has been proposed to consider a perturbation $\phi(g) = \mathcal{A}\phi_r(g) + \mathcal{A}_g\phi_r(g)$. In Ref. [26], however, this analysis was shown to not be strong enough to actually justify behavior of the form Eq. (63). Here, an analysis of uniform Clifford randomized benchmarking as a power iteration of a matrix was proposed (see also early work in this direction by Ref. [41]), justifying the exponential decay model (but with nonoptimal correction). Subsequently, in Ref. [24] Eq. (63) was derived (with an exponentially small correction) for uniform RB with the multiqubit Clifford group under the assumption that there exist superoperators $\mathcal{R}, \mathcal{L}$ such that

$$ \frac{1}{|G|} \sum_{g\in G} \|\phi(g) - \mathcal{R}\omega(g)\mathcal{L}\|_\diamond \le \delta \tag{65} $$

   for small enough $\delta$. This assumption is quite general, but has as its main drawback that the operators $\mathcal{R}, \mathcal{L}$ are not guaranteed to be completely positive, complicating the interpretation of this assumption as being a belief on physical quantities. Finally, Ref. [25] derives (introducing the Fourier analysis also used here) Eq. (63) (up to an exponentially small correction) for uniform RB with the multiqubit Clifford group under an assumption on the fidelity of the implementation map $\phi$ with respect to its reference implementation,

$$ \frac{1}{|G|} \sum_{g\in G} F[\omega(g), \phi(g)] \ge 1 - \delta. \tag{66} $$

   This assumption has the advantage of making reference to physical objects only, but suffers from the drawback that $\delta$ must grow inversely proportional to the underlying Hilbert-space dimension for the argument in Ref. [25] to hold. We discuss this further in Sec. X.

VI. THE RANDOMIZED BENCHMARKING FITTING MODEL

In this section, we prove a general theorem about the behavior of RB output data, i.e., the probabilities $p(i,m)$ associated with a RB experiment with its input parameters specified as in Sec. V B and described in protocol Algorithm 1. We argue that for a broad variety of choices for reference implementations and probability distributions this data is well described by a linear combination of exponential (matrix) decays [as in Eq. (63)], as long as the physical implementation $\phi$ is close to its ideal version: the reference implementation $\phi_r$. By close we mean that the diamond distance between reference and actual implementations, averaged over the group, has to be bounded as

$$ \frac{1}{|G|} \sum_{g\in G} \|\phi_r(g) - \phi(g)\|_\diamond \le \delta. \tag{67} $$

One can think of the above equation as a relatively weak initial belief one must hold about one's quantum computer (instantiated in $\phi$) before one can trust the outcome of RB.

For the rest of the work we adopt the transfer-matrix framework (discussed in Sec. III) for describing the action of superoperators. We also explicitly write implementation noise on the initial state $\rho_0$ and output POVM $\{\Pi_i\}_{i\in I}$ through quantum channels $\mathcal{E}_{\rm SP}$ (state preparation) and $\mathcal{E}_{\rm M}$ (measurement). This is notationally somewhat clumsy, but it makes explicit one of the assumptions underlying RB, namely that SPAM noise is independent of sequence length.

The theorems we present in this section are generalizations of the theorems given in Ref. [24], encompassing


almost all known RB procedures, but the techniques used are based on the cleaner conceptual framework of matrix-valued Fourier transforms provided by Ref. [25], which we reviewed in Sec. IV B. The central observation of Ref. [25] is that the data-collection phase of uniform RB can be seen as evaluating $m$-fold convolutions of the implementation map $\phi$. This observation generalizes beyond uniform RB to arbitrary implementation maps, and, in particular, we see that

\begin{align}
p(i,m) = \sum_{g_1,\ldots,g_m\in G} \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\, &\nu_1(g_1)\cdots\nu_m(g_m) \nonumber\\
&\times \phi(g_{\rm end}\, g_1^{-1}\cdots g_m^{-1})\,\phi(g_m)\cdots\phi(g_1)\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \tag{68}
\end{align}

can be rewritten, using the invariance of the uniform sum over $G$ under changes of variables, as

\begin{align}
p(i,m) &= \sum_{g_1,\ldots,g_m\in G} \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\phi(g_{\rm end}\, g_m^{-1})\,\nu_m(g_m g_{m-1}^{-1})\,\phi(g_m g_{m-1}^{-1})\cdots\nu_1(g_1)\phi(g_1)\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \tag{69}\\
&= \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\big[\phi * (\nu_m\phi) * \cdots * (\nu_1\phi)\big](g_{\rm end})\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \tag{70}
\end{align}

where we use the definition of convolution of implementation maps given in Eq. (26) and where $(\nu_i\phi)(g) = \nu_i(g)\phi(g)$. We see that often the convolution product map $\phi * (\nu_m\phi) * \cdots * (\nu_1\phi)$ can be written exactly as an $m$-fold convolution $\phi'^{*m}$ (for some $\phi'$ that is not necessarily the same as $\phi$).

We begin in Sec. VI A with discussing the case of uniform RB (as per the RB typology in Sec. V D). This is the easiest case, but the results derived there will go a long way in analyzing the other two types (nonuniform and interleaved RB).

A. Uniform randomized benchmarking

Here we discuss the behavior of RB output data given by a uniform RB scheme (as defined in Sec. V D). We prove that this data behaves as expected (i.e., a controlled linear combination of exponential decays), as long as the implementation map $\phi$ is close enough to its reference implementation $\phi_r$. As we saw in Sec. V B, for uniform RB protocols this reference implementation is exactly a representation, which we denote by $\omega$. We can always decompose $\omega$ into a direct sum of irreducible representations. We write this as $\omega = \oplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$ with some index set $\Lambda$ and $\sigma_\lambda$ irreducible subrepresentations appearing with multiplicity $n_\lambda$. As discussed in Sec. V, we expect the RB output data to be approximately well described by a linear combination of the form

$$ p(i,m) \approx \sum_{\lambda\in\Lambda} \mathrm{Tr}(A_\lambda M_\lambda^m), \tag{71} $$

where $M_\lambda$ is an $n_\lambda \times n_\lambda$ matrix depending only on the actual implementation $\phi$. In particular, $M_\lambda$ is given by the projection of the Fourier mode $\mathcal{F}(\phi)[\sigma_\lambda]$ onto the subspace associated with its $n_\lambda$ largest (in absolute value) eigenvalues. This is the content of Theorem 8. The essential idea in Theorem 8 is the fact that convolutions correspond to matrix multiplication in Fourier space, together with a careful use of the subspace perturbation techniques discussed in Sec. III.

Theorem 8: (Output data of uniform randomized benchmarking). Let $p(i,m)$ be the outcome probability associated with a uniform RB experiment with group $G$, initial state $\rho_0$, reference representation $\omega = \oplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$, and ending gate $g_{\rm end}$, for a specific sequence length $m \in \mathbb{M}$ and POVM element $\Pi_i$ in the POVM $\{\Pi_i\}_i$ (as described in protocol Algorithm 1). Let $\phi$ be the implementation map describing the actually implemented operations. Moreover, assume that there exists a $\delta > 0$ such that

$$ \frac{1}{|G|} \sum_{g\in G} \|\omega(g) - \phi(g)\|_\diamond \le \delta \le 1/9. \tag{72} $$

The RB output probability $p(i,m)$ is well approximated as

$$ \Big| p(i,m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}(A_\lambda M_\lambda^m) \Big| \le 8\left[\delta\left(1 + \frac{2\delta}{1-5\delta}\right)\right]^m, \tag{73} $$

where $M_\lambda, A_\lambda$ are $n_\lambda \times n_\lambda$ real matrices and $M_\lambda$ depends only on the implementation $\phi$.

Proof. Note from Eq. (69) with $\nu_i$ the uniform probability distribution for all $i \in \{1,\ldots,m\}$ that

$$ p(i,m) = \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,(\phi * \phi^{*m})(g_{\rm end})\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle. \tag{74} $$

Inserting the Fourier transform of $\phi$, we get

\begin{align}
p(i,m) &= \sum_{\lambda\in\mathrm{Irr}(G)} d_\lambda\, \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big\{\mathcal{F}(\phi)^{m+1}[\sigma_\lambda]\,\sigma_\lambda(g_{\rm end}^{-1}) \otimes \mathbb{1}\big\}\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \tag{75}\\
&= \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\omega_G}}\Big[\Big(\frac{1}{|G|}\sum_{g\in G} \omega_G(g) \otimes \phi(g)\Big)^{m+1} [D_G\,\omega_G(g_{\rm end}^{-1})] \otimes \mathbb{1}\Big]\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle, \tag{76}
\end{align}


where $\omega_G(g) = \oplus_{\lambda\in\mathrm{Irr}(G)} \sigma_\lambda(g)$ is the direct sum of all irreducible representations of $G$ and $D_G = \oplus_{\lambda\in\mathrm{Irr}(G)} d_\lambda \mathbb{1}_\lambda$ accounts for the dimensional factor in the inverse Fourier transform. Now we can consider the Fourier operator $F(\phi)$ [as defined in Eq. (31)] associated with $\phi$ as a perturbation of its ideal version $F(\omega)$. From our discussion of Fourier transforms and Fourier operators we know that $F(\omega) = \frac{1}{|G|}\sum_{g\in G} \oplus_{\lambda\in\Lambda} \overline{\sigma_\lambda}(g) \otimes \omega(g)$ is an orthogonal projection, with rank given by the number of irreducible subrepresentations of $\omega$ ($\mathrm{Rk}[F(\omega)] = \sum_{\lambda\in\Lambda} n_\lambda$). Recall also that there is a natural matrix norm $\|\cdot\|_{\rm m}$ on the space of Fourier operators and that

\begin{align}
\|F(\phi-\omega)\|_{\rm m} &= \frac{1}{|G|}\sum_{g\in G} \big\| \mathrm{Tr}_{V_{\omega_G}}\big\{F(\phi-\omega)\, D_G\,\omega_G(g^{-1}) \otimes \mathbb{1}\big\} \big\|_\diamond \nonumber\\
&= \frac{1}{|G|}\sum_{g\in G} \|\phi(g) - \omega(g)\|_\diamond. \tag{77}
\end{align}

The plan is now to use the perturbation theorem (Theorem 6) to split the above into dominant and subdominant invariant subspaces. To do this note that $F(\omega)$ is a projector so we trivially get a spectral resolution with $X_1 = F(\omega)$, $X_2 = \mathbb{1} - F(\omega)$ with $F(\omega)$ acting as the identity on the column and row space of $X_1$ and as the zero operator on the column and row space of $X_2$. Thinking of $F(\phi-\omega)$ as a perturbation to $F(\omega)$ we need to ensure the conditions in Eq. (42) are satisfied with respect to the norm $\|\cdot\|_{\rm m}$. Using the submultiplicativity of this norm and the fact that $\|X_1\|_{\rm m} = 1$ by construction together with the triangle inequality, we get the following sufficient condition for the applicability of Theorem 6:

$$ \frac{\|X_1^\dagger F(\phi-\omega) X_2\|_{\rm m}\, \|X_2^\dagger F(\phi-\omega) X_1\|_{\rm m}}{\big[\mathrm{sep}(1,0) - \|X_1^\dagger F(\phi-\omega) X_1\|_{\rm m} - \|X_2^\dagger F(\phi-\omega) X_2\|_{\rm m}\big]^2} \le \frac{[2\|F(\phi-\omega)\|_{\rm m}]^2}{[1 - 5\|F(\phi-\omega)\|_{\rm m}]^2} < \frac{1}{4} \tag{78} $$

where we also use that $\mathrm{sep}(1,0) = 1$, which is easy to see from the definition of sep (see Sec. IV C). Working out, we see that the above is satisfied if Eq. (72) is true, which it is by assumption. Hence we can use Theorem 6 to conclude the existence of operators $R = [R_1, R_2]$, $L = [L_1, L_2]$ with $L^\dagger = R^{-1}$ and $P_1$ such that

\begin{align}
F(\phi) = {}& R_1\big[X_1^\dagger F(\omega) X_1 + X_1^\dagger F(\phi-\omega)(X_1 + X_2 P_1)\big] L_1^\dagger \nonumber\\
&+ R_2\big[X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi-\omega) X_2\big] L_2^\dagger. \tag{79}
\end{align}

Using the fact that $L^\dagger = R^{-1}$ (and thus that $L_2^\dagger R_1 = L_1^\dagger R_2 = 0$) we can now write $p(m, g_{\rm end}, \Pi_i)$ as a sum of two terms corresponding to the above spectral resolution:

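\begin{align}
p(i,m) = {}& \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\omega_G}}\Big\{[D_G\,\omega_G(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, F(\phi)\, R_1\big[X_1^\dagger F(\omega) X_1 + X_1^\dagger F(\phi-\omega)(X_1 + X_2 P_1)\big]^{m} L_1^\dagger\Big\}\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \nonumber\\
&+ \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\omega_G}}\Big\{[D_G\,\omega_G(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, R_2\big[X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi-\omega) X_2\big]^{m+1} L_2^\dagger\Big\}\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle. \tag{80}
\end{align}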
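The split into a dominant and a subdominant term can be illustrated numerically on a toy example. The sketch below does not use the paper's Fourier operators: as an assumption, a random rank-2 orthogonal projector `P` stands in for $F(\omega)$ and a random matrix `E` of spectral norm 0.05 stands in for $F(\phi-\omega)$. The function name `split_power` is hypothetical, and the eigendecomposition plays the role of the operators $R$, $L$ with $L^\dagger = R^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, rank, eps = 6, 2, 0.05

# Toy stand-in for F(omega): a random orthogonal projector of the given rank.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
P = Q[:, :rank] @ Q[:, :rank].T

# Toy stand-in for F(phi - omega): a perturbation with spectral norm eps.
E = rng.normal(size=(dim, dim))
E *= eps / np.linalg.norm(E, 2)
F = P + E

def split_power(F, rank, m):
    """Split F^m into its part on the dominant invariant subspace and the rest."""
    evals, R = np.linalg.eig(F)
    L_dag = np.linalg.inv(R)                  # plays the role of L^dagger = R^{-1}
    dom = np.argsort(-np.abs(evals))[:rank]   # the `rank` largest |eigenvalues|
    dominant = (R[:, dom] * evals[dom] ** m) @ L_dag[dom, :]
    remainder = np.linalg.matrix_power(F, m) - dominant
    return evals[dom], dominant, remainder

for m in (1, 5, 10):
    dom_evals, dominant, remainder = split_power(F, rank, m)
    print(m, np.abs(dom_evals), np.linalg.norm(remainder, 2))
```

In this toy setting the dominant eigenvalues stay within `eps` of 1 while the remainder shrinks geometrically with `m`, which mirrors the roles of the first and second terms in Eq. (80).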

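We consider both of these terms separately. We deal first with the second term. Note that, using the definitions of $R, L$ from Theorem 6, we have

\begin{align}
(2) &\le \Big\| \mathrm{Tr}_{V_{\omega_G}}\Big\{[D_G\,\omega_G(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, R_2\big[X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi-\omega) X_2\big]^{m+1} L_2^\dagger\Big\} \Big\|_\diamond \tag{81}\\
&\le \Big\| (X_2 + X_1 P_2 + X_2 P_1 P_2)\big[X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi-\omega) X_2\big]^{m+1} (X_2 - P_1 X_1)^\dagger \Big\|_{\max}, \tag{82}
\end{align}

which is just a statement about the max norm of a Fourier operator. Note that $X_2^\dagger F(\omega) X_2 = 0$ by construction so the above depends only on $F(\phi-\omega)$. Now using the max-mean norm inequality in Eq. (35) several times and the fact that $X_2 = \mathbb{1} - X_1$, we can upper bound this as

\begin{align}
(2) &\le \big\| (X_2 + X_1 P_2 + X_2 P_1 P_2)(X_2 - P_1 X_1)^\dagger F(\phi-\omega) X_2 (X_2 - P_1 X_1)^\dagger \big\|_{\max}\, \big\| [F(\phi-\omega) X_2 (X_2 - P_1 X_1)^\dagger]^m \big\|_{\rm m} \tag{83}\\
&\le 2\|F(\phi-\omega)\|_{\max} \big[(1 + \|P_2\|_{\rm m})(1 + \|P_1\|_{\rm m}) + \|P_1\|_{\rm m}^2 \|P_2\|_{\rm m} (3 + \|P_1\|_{\rm m})\big] \big[\|F(\phi-\omega)\|_{\rm m}(1 + \|P_1\|_{\rm m})\big]^m. \tag{84}
\end{align}

Now we use from Theorem 6, the upper bounds on

$$ \|P_1\|_{\rm m} \le \frac{\|X_2^\dagger F(\phi-\omega) X_1\|_{\rm m}}{1 - \|X_1^\dagger F(\phi-\omega) X_1\|_{\rm m} - \|X_1^\dagger F(\phi-\omega) X_1\|_{\rm m}} \le \frac{2\|F(\phi-\omega)\|_{\rm m}}{1 - 5\|F(\phi-\omega)\|_{\rm m}} \le \frac{2\delta}{1-5\delta} \tag{85} $$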


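and

\begin{align}
\|P_2\|_{\rm m} &\le \frac{2\|F(\phi-\omega)\|_{\rm m}}{1 - 5\|F(\phi-\omega)\|_{\rm m} - 2\|P_1\|_{\rm m}\, \|X_1^\dagger F(\phi-\omega) X_2\|_{\rm m}} \tag{86}\\
&\le \frac{2\|F(\phi-\omega)\|_{\rm m}}{1 - 5\|F(\phi-\omega)\|_{\rm m} - \frac{8\|F(\phi-\omega)\|_{\rm m}^2}{1 - 5\|F(\phi-\omega)\|_{\rm m}}} \tag{87}\\
&\le \frac{2\delta(1-5\delta)}{1 - 8\delta^2} \tag{88}\\
&\le \frac{\delta}{1-\delta}, \tag{89}
\end{align}

where we exploit the assumption $\delta \le 1/9$ in the last line. Inserting these bounds into the main expression we get

\begin{align}
(2) &\le 4\left[\left(1 + \frac{2\delta}{1-5\delta}\right)\left(1 + \frac{\delta}{1-\delta}\right) + \left(\frac{2\delta}{1-5\delta}\right)^2 \frac{\delta}{1-\delta}\left(3 + \frac{2\delta}{1-5\delta}\right)\right] \left[\delta\left(1 + \frac{2\delta}{1-5\delta}\right)\right]^m \tag{90}\\
&\le \frac{115}{16}\left[\delta\left(1 + \frac{2\delta}{1-5\delta}\right)\right]^m \tag{91}\\
&\le 8\left[\delta\left(1 + \frac{2\delta}{1-5\delta}\right)\right]^m, \tag{92}
\end{align}

where we use that $\|F(\phi-\omega)\|_{\max} \le 2$ and $\delta \le 1/9$. Next we consider the first term in Eq. (80). For this term we desire an exact expression. We begin by noting that both $F(\omega)$ and $F(\phi)$ are block diagonal with respect to the decomposition of $\omega_G$ into irreducible representations. This implies that the matrices $R, L$ are block diagonal with respect to this decomposition as well, and that, moreover, we can take the matrices $P_1, P_2$ to be block diagonal with the blocks labeled by the irreducible subrepresentations present in $\omega = \oplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$. Writing $P = \oplus_{\lambda\in\Lambda} P^\lambda$, and similarly for other operators we can write the first term of Eq. (80) as

\begin{align}
(1) = \sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\sigma_\lambda}}\Big\{&[\overline{\sigma_\lambda}(g_{\rm end}^{-1}) \otimes \mathbb{1}] \nonumber\\
&\times \mathcal{F}(\phi)[\sigma_\lambda]\, R_1^\lambda \big[(X_1^\lambda)^\dagger \mathcal{F}(\phi)[\sigma_\lambda](X_1^\lambda + X_2^\lambda P_1^\lambda)\big]^m L_1^{\lambda\dagger}\Big\} \nonumber\\
&\times |\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle, \tag{93}
\end{align}

where we also use that $F(\omega) X_2 = 0$ by construction. To continue further we need to pick a convenient basis to express $X_1^\lambda, R_1^\lambda$.

For this note that we can specify rank-1 Fourier operators in $L(V_{\omega_G}) \otimes \mathcal{S}_d$ by specifying pairs of superoperators $\mathcal{A}, \mathcal{B}$ and looking at Fourier operators of the form $F(\mathcal{A}\omega\mathcal{B})$. It is useful to think of the Fourier operator $F(\omega)$ as a vectorization operation on $\mathcal{A}, \mathcal{B}$. We can express $X_1 = F(\omega)$ in this way by considering the operators $F(P_\lambda^j \omega P_\lambda^j)$ where $P_\lambda^j$ is the (superoperator) projector onto the $j$th copy of the representation $\sigma_\lambda$ in $\omega = \oplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$. Note that these operators are rank-one orthogonal projectors and moreover that

$$ \sum_{j_\lambda=1}^{n_\lambda} F(P_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda}) = F(\sigma_\lambda^{\oplus n_\lambda}) = X_1^\lambda \tag{94} $$

holds true. Now noting that $(X_1^\lambda + X_2^\lambda P_1^\lambda) = R_1^\lambda$ is a rank $n_\lambda$ matrix with $X_1^\lambda R_1^\lambda = X_1^\lambda$, we can similarly find $n_\lambda$ superoperators $R_\lambda^{j_\lambda}$ ($j_\lambda \in 1,\ldots,n_\lambda$) such that

$$ R_1^\lambda = \sum_{j_\lambda=1}^{n_\lambda} F(R_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda}), \tag{95} $$

where $F(P_\lambda^{j_\lambda} \omega R_\lambda^{j_\lambda})$ is again of rank one (but no longer orthogonal). Note that $X_1^\lambda R_1^\lambda = X_1^\lambda$ gives rise to the orthogonality property

$$ F(P_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda})\, F(R_\lambda^{j'_\lambda} \omega P_\lambda^{j'_\lambda}) = \delta_{j_\lambda, j'_\lambda}\, F(P_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda}). \tag{96} $$

Using these resolutions of $X_1^\lambda, R_1^\lambda$ and the orthogonality property we can express the first term in Eq. (80) further as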

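\begin{align}
(1) &= \sum_{\lambda\in\Lambda} d_{\sigma_\lambda} \sum_{j_\lambda^1,\ldots,j_\lambda^{2m}=1}^{n_\lambda} \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\sigma_\lambda}}\Big\{[\overline{\sigma_\lambda}(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, \mathcal{F}(\phi)[\sigma_\lambda]\, R_1^\lambda \tag{97}\\
&\qquad \times F(P_\lambda^{j_\lambda^1} \omega P_\lambda^{j_\lambda^1})\, \mathcal{F}(\phi)[\sigma_\lambda]\, F(R_\lambda^{j_\lambda^2} \omega P_\lambda^{j_\lambda^2}) \cdots \mathcal{F}(\phi)[\sigma_\lambda]\, F(R_\lambda^{j_\lambda^{2m}} \omega P_\lambda^{j_\lambda^{2m}})\, L_1^{\lambda\dagger}\Big\}\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \tag{98}\\
&= \sum_{\lambda\in\Lambda} d_{\sigma_\lambda} \sum_{j_\lambda^1, j_\lambda^{2m}=1}^{n_\lambda} \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\sigma_\lambda}}\Big\{[\overline{\sigma_\lambda}(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, \mathcal{F}(\phi)[\sigma_\lambda]\, R_1^\lambda\, F(P_\lambda^{j_\lambda^1} \omega P_\lambda^{j_\lambda^{2m}})\, L_1^{\lambda\dagger}\Big\}\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle\, [M_\lambda^m]_{j_\lambda^1, j_\lambda^{2m}} \tag{99}
\end{align}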


with

\begin{align}
[M_\lambda]_{j_\lambda, j'_\lambda} &= \mathrm{Tr}\big[F(P_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda})\, F(\phi)\, F(R_\lambda^{j'_\lambda} \omega P_\lambda^{j_\lambda})\big] \nonumber\\
&= \mathrm{Tr}\big[F(\phi)\, F(R_\lambda^{j'_\lambda} \omega P_\lambda^{j_\lambda})\big], \tag{100}
\end{align}

by the fact that $F(P_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda})$, $F(R_\lambda^{j_\lambda} \omega P_\lambda^{j_\lambda})$ are of rank one. Now writing

$$ [A_\lambda]_{j_\lambda, j'_\lambda} = d_{\sigma_\lambda}\, \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,\mathrm{Tr}_{V_{\sigma_\lambda}}\Big\{[\overline{\sigma_\lambda}(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, \mathcal{F}(\phi)[\sigma_\lambda]\, R_1^\lambda\, F(P_\lambda^{j'_\lambda} \omega P_\lambda^{j_\lambda})\, L_1^{\lambda\dagger}\Big\}\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle \tag{101} $$

we can combine the two terms in Eq. (80) to get

$$ \Big| p(i,m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}(A_\lambda M_\lambda^m) \Big| \le 8\left[\delta\left(1 + \frac{2\delta}{1-5\delta}\right)\right]^m. \tag{102} $$

B. Randomized benchmarking with nonuniform sampling

Several works [5,14,15,39,51] discuss adaptations of RB where the elements of the group are no longer sampled exactly at random, but are instead sampled according to (1) a distribution close to uniform [5,14,51] (which we call "approximate RB" in Sec. V D, following Ref. [14]), or (2) a distribution that only has support on a small subset of the group; group generators in the case of Ref. [14] (see also early work on the Clifford group by Ref. [67]), subgroup cosets in the case of Ref. [39], and constant depth circuits (layers) in the case of Ref. [15]. In Sec. V D, we call these approaches "subset RB."

We begin by treating the case of approximate RB. This corresponds to performing RB as described in protocol Algorithm 1 but instead of sampling group elements from the group $G$ uniformly at random one samples group elements according to some prescribed probability distributions $\nu_i : G \to [0,1]$ (with $i$ indicating the time at which the gate is applied). In Ref. [14] it has been argued that as long as the distributions $\nu_i$ are all close to the uniform distribution in the $l_1$ norm, then the output data of approximate RB is close to the output data of exact RB. As a corollary of Theorem 8 we obtain a similar result. Our result is somewhat less general than the one given in Theorem 17 of Ref. [14]. In particular, we assume that all distributions $\nu_i$ are equal to a fixed distribution $\nu$. In return for this restriction we are able to make a much stronger statement on the behavior of the RB output data. Moreover, our approach does not require the gate-independent noise assumption [replacing it with the more general diamond-norm assumption of Eq. (72)]. We have the following statement.

Theorem 9: (Randomized benchmarking data with nonuniform sampling). Let $\nu$ be a probability distribution on $G$ and $p_\nu(i,m)$ be the outcome probability associated with a nonuniform RB experiment with implementation map $\phi$ and reference representation $\omega(g) = \oplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$. Moreover, assume that there exists $\delta, \delta' > 0$ such that

$$ \frac{1}{|G|} \sum_{g\in G} \|\omega(g) - \phi(g)\|_\diamond \le \delta, \tag{103} $$

$$ \sum_{g\in G} \Big| \nu(g) - \frac{1}{|G|} \Big| \le \delta', \tag{104} $$

with $\delta + \delta' \le 1/9$. Now $p_\nu(i,m)$ is well approximated as

$$ \Big| p_\nu(i,m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}[A_\lambda (M_\lambda^\nu)^m] \Big| \le 8\left[(\delta+\delta')\left(1 + \frac{2(\delta+\delta')}{1 - 5(\delta+\delta')}\right)\right]^m, \tag{105} $$

where $M_\lambda^\nu, A_\lambda$ are $n_\lambda \times n_\lambda$ real matrices, $M_\lambda^\nu$ depends on the implementation $\phi$ and the measure $\nu$.

Proof. Consider the map $\phi_\nu : G \to \mathcal{S}_d : g \mapsto |G|\nu(g)\phi(g)$. Note that we can think of nonuniform RB as being uniform RB with this (not trace preserving but still completely positive) implementation map. In particular, we have

$$ p_\nu(i, g_{\rm end}, m) = \langle\!\langle \mathcal{E}_{\rm M}(\Pi_i)|\,(\phi * \phi_\nu^{*m})(g_{\rm end})\,|\mathcal{E}_{\rm SP}(\rho_0)\rangle\!\rangle, \tag{106} $$

which is just Eq. (74) but with the "effective implementation" $\phi_\nu$. From the assumptions of the theorem we have

$$ \frac{1}{|G|} \sum_{g\in G} \|\omega(g) - \phi_\nu(g)\|_\diamond \le \frac{1}{|G|} \sum_{g\in G} \|\omega(g) - \phi(g)\|_\diamond + \sum_{g\in G} \Big| \nu(g) - \frac{1}{|G|} \Big| \le \frac{1}{9}. \tag{107} $$

Hence, the proof of Theorem 8 immediately applies to $p_\nu(i, g_{\rm end}, m)$, yielding Eq. (105).

We note that in the case of NIST RB [51] the probability distribution over (a subgroup of) the single-qubit Clifford group is not strictly speaking close enough to uniform to apply the above theorem. This can be easily solved


by blocking a few gate applications together, defining a new effective implementation map $\phi' = (\nu\phi) * (\nu\phi) * \cdots * (\nu\phi)$, which is close enough to uniformly distributed to apply Theorem 9.

The above approach fails utterly when applied to subset RB. In this scenario the distribution $\nu$ only has support on a small subset $A$ of $G$ and consequently $\sum_{g\in G} |\nu(g) - 1/|G|| \approx 1$ in many cases. This is not necessarily a weakness of Theorem 8 but rather a statement of the fact that strong deviations from exponential behavior can be observed if one does not give the distribution $\nu$ time to converge to the uniform distribution through repeated convolution. This was already noted more or less explicitly in previous papers on subset RB. There are two approaches to solving this problem. The first, followed in Refs. [14,15,39,67] is to restrict the set of sequence lengths $M$ at which RB data is gathered to $m \ge m_{\rm mix}$ where $m_{\rm mix}$ is related to the mixing time of the distribution $\nu$. Note that in the direct RB proposal [15], this convergence time is instead enforced directly by applying a uniformly random gate before applying nonuniformly sampled gates. The second approach is to take this deviation from uniform RB behavior at face value [13] and draw conclusions from the RB output directly. We believe this latter approach is more accurately classified as an interleaved benchmarking scheme and we discuss it there.

With regards to the first approach we can make a statement akin to Theorem 8 about subset RB procedures by making the (natural) assumption that upon equilibration of the distribution $\nu$ the quality of the total gates has not degraded too much. Intuitively, this means that the gates that have high weight in the initial distribution are of high enough quality to generate (by composition) good-quality implementations of all gates in the group. Concretely, we have the following theorem.

Theorem 10: (Subset randomized benchmarking). Let $\nu$ be a probability distribution on $G$ and $p_\nu(i, m)$ be the outcome probability associated with a nonuniform RB experiment with implementation map $\phi$ and reference representation $\omega(g) = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$. Moreover, assume that there exists an integer $m_{\rm mix}$ and real numbers $\delta', \delta'' > 0$ such that

$$\sum_{g\in G} \Big|\nu^{*m_{\rm mix}}(g) - \frac{1}{|G|}\Big| \le \delta', \tag{108}$$

$$\sum_{g\in G} \nu(g)\,\big\|\omega(g) - \phi(g)\big\|_\diamond \le \frac{\delta''}{m_{\rm mix}} \tag{109}$$

with $\delta' + \delta'' \le 1/9$. Now $p_\nu(i, m)$ is well approximated as

$$\Big|p_\nu(i, m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}\big(A_\lambda M_\lambda^{m-m_{\rm mix}}\big)\Big| \le \varepsilon \tag{110}$$

with $M_\lambda$ the projection onto the $n_\lambda$-dimensional dominant invariant subspace of $F(\nu\phi)[\sigma_\lambda]$ and where

2δ δ
ε ≤ 2δ 1+ 1+
1 − 5δ 1−δ
2
2δ δ 2δ
+ 3+ ≤ 4δ
1 − 5δ 1−δ 1 − 5δ
(111)

with $\delta = \delta' + \delta''$.

Note that this theorem is qualitatively less strong than Theorem 8. In particular, we cannot guarantee that the distance between the output data of subset RB and the exponential decays associated with the irreducible subrepresentations of the reference representation closes exponentially fast with increasing sequence length. However, our bound on this distance is stronger than previous rigorous statements (Theorem 20 in Ref. [14]) and works under weaker assumptions. The distance bound given in Ref. [39] (Theorem 3) does close exponentially but the proof relies critically on the fact that $\nu$ is uniformly nonzero on a (large) subgroup coset in $G$, and thus applies only to a far more restricted situation. Note also that it does not directly apply to the approach taken in Ref. [15]. However, we believe that with very minor alterations the reasoning below can be made to fit.

Proof. Consider again the map $\phi_\nu : G \to S_d : g \mapsto |G|\,\nu(g)\phi(g)$. We have

$$p_\nu(i, m) = \langle\!\langle \mathcal{E}_M(\Pi_i)|\,(\phi * \phi_\nu^{*m})(g_{\rm end})\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle. \tag{112}$$

We now establish a bound on the quality of $\phi_\nu^{*m_{\rm mix}}$, namely we show that

$$\frac{1}{|G|}\sum_{g\in G} \big\|\phi_\nu^{*m_{\rm mix}}(g) - \omega(g)\big\|_\diamond \le \delta' + \delta'' \le \frac{1}{9}. \tag{113}$$

This can be seen as follows:

$$\frac{1}{|G|}\sum_{g\in G} \big\|\phi_\nu^{*m_{\rm mix}}(g) - \omega(g)\big\|_\diamond \le \frac{1}{|G|}\sum_{g\in G} \big\|\omega_\nu^{*m_{\rm mix}}(g) - \omega(g)\big\|_\diamond + \frac{1}{|G|}\sum_{g\in G} \big\|\phi_\nu^{*m_{\rm mix}}(g) - \omega_\nu^{*m_{\rm mix}}(g)\big\|_\diamond \tag{114}$$

with $\omega_\nu(g) = |G|\,\nu(g)\,\omega(g)$. Writing out the convolution in the first term and changing variables, we get


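$$\frac{1}{|G|}\sum_{g\in G} \big\|\omega_\nu^{*m_{\rm mix}}(g) - \omega(g)\big\|_\diamond = \frac{1}{|G|}\sum_{g\in G} \Big\| \sum_{g_1,\ldots,g_{m_{\rm mix}-1}\in G} |G|\,\nu(g g_{m_{\rm mix}-1}^{-1}) \cdots \nu(g_1)\,\omega(g g_{m_{\rm mix}-1}^{-1}) \cdots \omega(g_1) - \omega(g) \Big\|_\diamond \tag{115}$$

$$\le \frac{1}{|G|}\sum_{g\in G} \Big| \sum_{g_1,\ldots,g_{m_{\rm mix}-1}\in G} |G|\,\nu(g g_{m_{\rm mix}-1}^{-1}) \cdots \nu(g_1) - 1 \Big|\, \big\|\omega(g)\big\|_\diamond \tag{116}$$

$$= \sum_{g\in G} \Big| \frac{1}{|G|} - \nu^{*m_{\rm mix}}(g) \Big| \tag{117}$$

for the first term and

$$\frac{1}{|G|}\sum_{g\in G} \big\|\phi_\nu^{*m_{\rm mix}}(g) - \omega_\nu^{*m_{\rm mix}}(g)\big\|_\diamond = \frac{1}{|G|}\sum_{g\in G} \Big\| \sum_{j=1}^{m_{\rm mix}} \big[\phi_\nu^{*(m_{\rm mix}-j)} * (\phi_\nu - \omega_\nu) * \omega_\nu^{*(j-1)}\big](g) \Big\|_\diamond \tag{118}$$

$$\le m_{\rm mix} \sum_{g\in G} \nu(g)\, \big\|\phi(g) - \omega(g)\big\|_\diamond, \tag{119}$$

where we use the telescoping series identity $A^m - B^m = \sum_{j=1}^{m} A^{m-j}(A - B)B^{j-1}$, which holds for any elements $A, B$ of an associative algebra (such as the implementation maps with convolution), the submultiplicativity of the diamond norm, and the fact that $\|\phi(g)\|_\diamond = \|\omega(g)\|_\diamond = 1$ for all $g \in G$. Together with the theorem assumptions, this yields Eq. (113).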
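As a quick numerical sanity check of the telescoping identity used above, the following minimal sketch compares both sides for a pair of random matrices. It is only an illustration: ordinary matrices with the spectral norm stand in for implementation maps with the diamond norm, and the matrix size, perturbation strength, and power `m` are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def unit_norm(X):
    """Rescale X so that its spectral norm equals 1."""
    return X / np.linalg.norm(X, 2)

# Two nearby matrices of unit spectral norm (stand-ins for phi_nu and omega_nu).
A = unit_norm(rng.normal(size=(4, 4)))
B = unit_norm(A + 0.05 * rng.normal(size=(4, 4)))
m = 6

power = np.linalg.matrix_power
lhs = power(A, m) - power(B, m)
# Telescoping sum: sum_{j=1}^m A^(m-j) (A - B) B^(j-1)
rhs = sum(power(A, m - j) @ (A - B) @ power(B, j - 1) for j in range(1, m + 1))

telescoping_holds = np.allclose(lhs, rhs)

# Submultiplicativity plus unit norms gives ||A^m - B^m|| <= m ||A - B||,
# the matrix analogue of the step from Eq. (118) to Eq. (119).
gap = np.linalg.norm(lhs, 2)
bound = m * np.linalg.norm(A - B, 2)
print(telescoping_holds, gap <= bound)
```

The same bookkeeping is what produces the factor $m_{\rm mix}$ in Eq. (119): each of the $m_{\rm mix}$ terms in the sum contributes at most one copy of the single-step deviation.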
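Now as in Theorem 8, we can write the RB output data as

$$p_\nu(i, m) = \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \mathrm{Tr}_{V_{\omega_G}} D_G \big[\omega_G(g_{\rm end}^{-1}) \otimes 1\big] F(\phi) F(\phi_\nu)^{m'} F(\phi_\nu^{*m_{\rm mix}})\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle, \tag{120}$$

where $m' = m - m_{\rm mix}$. We can again consider $F(\phi_\nu^{*m_{\rm mix}})$ as a perturbation of $F(\omega)$. Since $F(\omega)$ is a projector, the operator $F(\phi_\nu^{*m_{\rm mix}})$ will resolve into a dominant and subdominant invariant subspace (as in Theorem 8). We have

$$\begin{aligned}
p_\nu(i, m) ={}& \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \mathrm{Tr}_{V_{\omega_G}} D_G \big[\omega_G(g_{\rm end}^{-1}) \otimes 1\big] F(\phi) F(\phi_\nu)^{m-m_{\rm mix}} R_1 \Big[ X_1^\dagger F(\omega) X_1 + (X_1^\dagger - P_1 X_2^\dagger) F(\phi_\nu^{*m_{\rm mix}} - \omega) X_1 \Big] L_1^\dagger\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \\
&+ \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \mathrm{Tr}_{V_{\omega_G}} D_G \big[\omega_G(g_{\rm end}^{-1}) \otimes 1\big] F(\phi) F(\phi_\nu)^{m-m_{\rm mix}} R_2 \Big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi_\nu^{*m_{\rm mix}} - \omega) X_2 \Big] L_2^\dagger\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle.
\end{aligned} \tag{121}$$

Now note that $F(\phi_\nu^{*m_{\rm mix}})$ and $F(\phi_\nu)$ commute, and hence share invariant subspaces. This means we can write the first term in Eq. (121) as

$$(1) = \sum_{\lambda\in\Lambda} \mathrm{Tr}\big(A_\lambda M_\lambda^{m-m_{\rm mix}}\big). \tag{122}$$

Finally, we can bound the second term in Eq. (121) as

$$|(2)| \le \Big\| F(\phi) F(\phi_\nu)^{m-m_{\rm mix}} R_2 \Big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi_\nu^{*m_{\rm mix}} - \omega) X_2 \Big] L_2^\dagger \Big\|_{\max} \tag{123}$$

$$\le \Big\| R_2 \Big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi_\nu^{*m_{\rm mix}} - \omega) X_2 \Big] L_2^\dagger \Big\|_{m}\, \big\| F(\phi_\nu)^{m-m_{\rm mix}-1} \big\|_{m}\, \big\| F(\phi) \big\|_{\max} \tag{124}$$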


using the max-mean inequality of the norms on Fourier operators. Note now that

$$\big\| F(\phi_\nu)^{m-m_{\rm mix}-1} \big\|_{m} \le \Big[ \sum_{g\in G} \nu(g)\, \|\phi(g)\|_\diamond \Big]^{m-m_{\rm mix}-1} \le 1, \tag{125}$$

where we use that $\nu$ is a probability distribution and that $\|\phi\|_\diamond \le 1$. Moreover, we have that $\|F(\phi)\|_{\max} \le 1$. Using this and the reasoning from Theorem 8 we can thus bound the second term as
2
2δ δ 2δ δ 2δ
| (2) | ≤ 2 F(φν∗mmix − ω)m 1+ 1+ + 3+ (126)
1 − 5δ 1−δ 1 − 5δ 1−δ 1 − 5δ

with $\delta = \delta' + \delta''$. Inserting the assumption that $\|F(\phi_\nu^{*m_{\rm mix}} - \omega)\|_{m} \le \delta$ we obtain the statement of the theorem.

C. Interleaved randomized benchmarking

As discussed in Sec. V D, a common variant of RB is interleaved randomized benchmarking (IRB). IRB is performed like uniform RB, as formulated in Algorithm 1, but the reference implementation is not a representation. Instead a fixed operation $C$ is being interleaved between the application of randomly selected group elements. The outcome of this experiment is then compared to the same RB experiment without the interleaving gate to infer the quality of the interleaved gate $C$. The literature splits into two sections, standard interleaved RB [4,48] and nonstandard interleaved RB [9,47]. We emphasize here that we discuss the so-called “interleaved step” of the interleaved RB protocol, and do not interpret the resulting decay rate (for a thorough discussion of the relationship of interleaved RB decay rates and their interpretation see Ref. [72]).

1. Standard interleaved randomized benchmarking

In the standard protocol the interleaved operation $C$ is applied after every randomly selected gate and is also a part of the group $G$. Hence at the end of a random sequence, the inversion step can be performed inside the group. An IRB output data is thus of the form

$$p_{\rm IRB}(i, g_{\rm end}, m) = \frac{1}{|G|^m} \sum_{g_1,\ldots,g_m\in G} \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \phi\big[g_{\rm end}(g_1 C \cdots g_m C)^{-1}\big]\, \phi(C)\phi(g_m) \cdots \phi(C)\phi(g_1)\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{127}$$

for a POVM element $\Pi_i$, an ending gate $g_{\rm end}$, a sequence length $m$, an implementation map $\phi$, and an initial state $\rho_0$. It is interesting to interpret this procedure in the light of the protocol given in Sec. V C. Namely we can think of defining a probability distribution $\nu_C$ over $G$, that takes the value 1 for $g = C$ and 0 for all other group elements. With this probability distribution, we can reconsider the above as a RB experiment according to the protocol written in Algorithm 1, we have

$$p_{\rm IRB}(i, g_{\rm end}, m) = p(i, g_{\rm end}, 2m) = \sum_{g_1,\ldots,g_{2m}\in G} \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \phi\big[g_{\rm end}(g_1 g_2 \cdots g_m)^{-1}\big]\, \nu_C(g_{2m})\phi(g_{2m})\, \mu(g_{2m-1})\phi(g_{2m-1}) \cdots \nu_C(g_2)\phi(g_2)\, \mu(g_1)\phi(g_1)\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle, \tag{128}$$

where $\mu$ is the uniform distribution on $G$. Hence, we can think of standard IRB as being a RB experiment with a particular choice of sampling distributions. In this picture, it becomes trivial to extend Theorem 8 to standard interleaved RB by considering the map $\phi_C = (\nu_C \phi) * \phi$. By the standard change of variables we can see

$$p_{\rm IRB}(i, g_{\rm end}, m) = \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \phi * \phi_C^{*m}(g_{\rm end})\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{129}$$

and hence interleaved RB is just uniform RB with the implementation map $\phi_C$. If $\phi(C)$ is close enough to its reference representation element $\omega(C)$ the assumption Eq. (72) is reasonable for $\phi_C$ as well. Hence, Theorem 8 holds equally well for interleaved RB.

Nonstandard interleaved RB protocols [9,13,47,50] depart from the above framework by including interleaved gates that are not part of the group $G$ (the Pauli group in the case of Ref. [13] and the Clifford group in the case of Ref. [47]) and sampling from the group in a nonuniform manner. These are somewhat idiosyncratic so we treat them separately. We see that the protocols of Refs. [9,47] are covered by Theorem 8, while the protocols of Ref. [13] and Ref. [50] are not covered. We expect that it is possible to make guarantees on the output data of these protocols with suitable adaptations to Theorem 8 but we do not pursue this here.
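The sampling picture behind standard IRB can be made concrete on a small group. The sketch below, an illustration rather than part of the protocol, checks on the symmetric group $S_3$ that composing a fixed element $C$ (drawn from the point distribution $\nu_C$) with a uniformly random element again gives a uniformly distributed element, which is why the interleaved sequence still samples the group uniformly. The choice of group, the element `C`, and the unnormalized convolution of probability distributions used here are assumptions made for the example.

```python
from itertools import permutations

def compose(p, q):
    """Group product p*q on permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))            # the symmetric group S_3
C = (1, 0, 2)                               # hypothetical interleaved element
mu = {g: 1 / len(G) for g in G}             # uniform distribution on G
nu_C = {g: 1.0 if g == C else 0.0 for g in G}  # point distribution at C

def convolve(f1, f2):
    """Law of the product x*h for independent x ~ f1 and h ~ f2."""
    return {g: sum(f1[compose(g, inverse(h))] * f2[h] for h in G) for g in G}

interleaved_step = convolve(nu_C, mu)       # law of C*g with g uniform
is_uniform = all(abs(p - 1 / len(G)) < 1e-12 for p in interleaved_step.values())
print(is_uniform)
```

Replacing `mu` by a distribution supported on a few generators turns the same helper into a simple way to watch repeated convolution approach the uniform distribution.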


2. Interleaved T-gate randomized benchmarking

In Ref. [47] the quality of a $T$ gate (with ideal implementation $\mathcal{T}$), with an associated noisy implementation $\tilde{\mathcal{T}}$ is assessed by estimating the following quantity

$$p_T(m) = \frac{1}{|P_q|^m |C_q|^m} \sum_{\substack{p_1,\ldots,p_m\in P_q \\ g_1,\ldots,g_m\in C_q}} \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \phi\big\{[g_m t(p_m) \cdots g_1 t(p_1)]^{-1}\big\}\, \phi(g_m)\tilde{\mathcal{T}}\phi(p_m)\tilde{\mathcal{T}} \cdots \phi(g_1)\tilde{\mathcal{T}}\phi(p_1)\tilde{\mathcal{T}}\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{130}$$

with $C_q$ the $q$-qubit Clifford group, $P_q \subset C_q$ the Pauli group, and $\phi : C \to S_d$ an implementation of the Clifford group (and the Pauli group) and $t(p) : P_q \to C_q$ is an injective map mapping Pauli elements $p$ to $TpT^\dagger$. Because $T$ is in the third level of the Clifford hierarchy we have $TpT^\dagger \in C_q$ for all $p \in P_q$ making the above well defined. By defining the map $\phi_T(g) : C_q \to S_d : g \mapsto \nu_T(g)\,\mathcal{T}^\dagger \phi[t^{-1}(g)]\,\mathcal{T}$ with

$$\nu_T(g) = \frac{|C_q|}{|P_q|}\, I[g \in \mathrm{Im}(t)] \tag{131}$$

a probability distribution on $C_q$ taking nonzero value only on the image of the map $t$ [strictly speaking $t^{-1}(g)$ is not defined for $g \notin \mathrm{Im}(t)$, but $\nu_T$ is zero there anyway]. With these definitions we can rewrite the output probability as

$$p_T(m) = \langle\!\langle \mathcal{E}_M(\Pi_i)|\, [\phi * (\phi * \phi_T)^{*m}](e)\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle. \tag{132}$$

Hence, Theorem 8 generalizes to $p_T(m)$ as long as Eq. (72) is satisfied for the convoluted map $\phi * \phi_T$. In the ideal case of $\phi = \omega$ (the reference representation) and $\tilde{\mathcal{T}} = \mathcal{T}$ we see that $\phi * \phi_T(g) = \omega(g)$. Hence this is a reasonable assumption to make, and Theorem 8 thus covers the protocol presented in Ref. [47].

3. Individual gate benchmarking

Individual RB, as proposed in Ref. [9], is an interleaved RB protocol characterized by uniform probability distributions and, interestingly, a reference implementation $\phi_r$ that is not a representation. Rather, the reference implementation is of the form $\phi_r(g) = \mathcal{U}\omega(g)$ where $\omega(g)$ is the standard action by conjugation, i.e., $\omega(g)(\rho) = U_g \rho U_g^\dagger$, and $\mathcal{U}(\rho) = U\rho U^\dagger$ is a fixed unitary gate (that is not a part of the group $G$). Moreover, $\mathcal{U}$ is assumed to commute with the representation $\omega(g)$. The output RB data $p(i, m)$ associated with this procedure is of the form, Eq. (74), however, the central assumption [Eq. (72)] of Theorem 8 is generally far from satisfied (unless $\mathcal{U}$ is the identity). However, we can make the alternative assumption that

$$\frac{1}{|G|}\sum_{g\in G} \big\|\mathcal{U}\omega(g) - \tilde{\mathcal{U}}\phi(g)\big\|_\diamond \le \delta, \tag{133}$$

where $\tilde{\mathcal{U}}$ is the noisy implementation of the unitary $\mathcal{U}$ and $\phi$ is the implementation of the reference representation $\omega(g)$. This is a reasonable assumption to make since

$$\frac{1}{|G|}\sum_{g\in G} \big\|\mathcal{U}\omega(g) - \tilde{\mathcal{U}}\phi(g)\big\|_\diamond \le \frac{1}{|G|}\sum_{g\in G} \Big( \big\|\mathcal{U}\omega(g) - \mathcal{U}\phi(g)\big\|_\diamond + \big\|\mathcal{U}\phi(g) - \tilde{\mathcal{U}}\phi(g)\big\|_\diamond \Big) \tag{134}$$

$$\le \big\|\mathcal{U} - \tilde{\mathcal{U}}\big\|_\diamond + \frac{1}{|G|}\sum_{g\in G} \big\|\omega(g) - \phi(g)\big\|_\diamond \tag{135}$$

so as long as the implementation of the interleaving unitary $\mathcal{U}$ is of sufficient quality Eq. (133) is reasonable. Furthermore we note that due to the commutation assumption $[\omega(g), \mathcal{U}] = 0$ the Fourier operator $F(\mathcal{U}\omega)$ has the same dominant invariant subspace as $F(\omega)$ [since $F(\mathcal{U}\omega) = (1_G \otimes \mathcal{U})F(\omega) = F(\omega)(1_G \otimes \mathcal{U})$]. Hence the proof of Theorem 8 goes through for individual gate benchmarking as well, replacing the assumption Eq. (72) with Eq. (133).

4. Cycle benchmarking

Cycle benchmarking [13] is a recently developed RB protocol that can also be subsumed under the framework of Theorem 8, albeit after some nontrivial considerations we discuss in this section.

The data-collection phase of cycle benchmarking can be seen as interleaved RB over the Pauli group with the interleaving gate $C$ being a (non-Pauli) Clifford gate. In particular, cycle benchmarking implements sequences $C, g_m, \ldots, C, g_1$ where $g$ is drawn uniformly at random from the Pauli group $P_q$ and $C$ is a Clifford gate.

A key aspect of cycle benchmarking is the cycle length, i.e., an integer $c$ such that $C^c = e$ (note that for any Clifford gate such a cycle length exists). In cycle benchmarking the number of random Pauli elements implemented is always a multiple of the cycle length. Writing $\phi(g)$ for the noisy implementation of the standard conjugation representation of the Pauli group, and $\tilde{\mathcal{C}}$ for the noisy implementation of the Clifford gate $C$ we can define the cycle implementation map (on the Pauli group):

$$\phi_c(g) = \frac{1}{|P_q|^{c-1}} \sum_{\substack{g_1,\ldots,g_c\in P_q \\ Cg_c \cdots Cg_1 = g}} \tilde{\mathcal{C}}\phi(g_c) \cdots \tilde{\mathcal{C}}\phi(g_1). \tag{136}$$
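The cycle length $c$ entering Eq. (136) is easy to compute for a given gate. The following minimal sketch finds the smallest $c$ for which a unitary matrix raised to the power $c$ is proportional to the identity. Treating $C^c = e$ up to a global phase is an assumption made here, motivated by the fact that global phases drop out of the conjugation action; the helper name `cycle_length` and the example gates are illustrative choices.

```python
import numpy as np

def cycle_length(U, max_order=1000, atol=1e-9):
    """Smallest c >= 1 such that U^c is proportional to the identity."""
    d = U.shape[0]
    P = np.eye(d, dtype=complex)
    for c in range(1, max_order + 1):
        P = P @ U
        # For a unitary P, |Tr P| = d holds exactly when P = exp(i*theta) * identity.
        if np.isclose(abs(np.trace(P)), d, rtol=0.0, atol=atol):
            return c
    raise ValueError("no cycle length found up to max_order")

# Standard matrix representations of a few Clifford gates.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.diag([1, 1j])
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

lengths = {
    "H": cycle_length(H),
    "S": cycle_length(S),
    "SH": cycle_length(S @ H),
    "CNOT": cycle_length(CNOT),
}
print(lengths)
```

In a cycle benchmarking experiment the sequence lengths would then be chosen as multiples of the value returned for the interleaved gate.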


Note that because the Clifford group contains the Pauli group the equation $Cg_c \cdots Cg_1 = g$ makes sense. Now because of the cycle property

$$Cg_c \cdots Cg_1 = \big(C^{-(c-1)} g_c C^{(c-1)}\big) \cdots C^{-1} g_2 C\, g_1 = g'_c \cdots g'_1 \tag{137}$$

since $C^{-1} g C$ is always a Pauli element. Hence the equation has exactly $|P_q|^{(c-1)}$ solutions. Furthermore, we have that

$$\frac{1}{|P_q|}\sum_{g\in P_q} \phi_c(g) = \frac{1}{|P_q|^c} \sum_{g\in P_q} \sum_{\substack{g_1,\ldots,g_c\in P_q \\ Cg_c \cdots Cg_1 = g}} \tilde{\mathcal{C}}\phi(g_c) \cdots \tilde{\mathcal{C}}\phi(g_1) = \frac{1}{|P_q|^c} \sum_{g_1,\ldots,g_c\in P_q} \tilde{\mathcal{C}}\phi(g_c) \cdots \tilde{\mathcal{C}}\phi(g_1) \tag{138}$$

and thus that

$$\frac{1}{|P_q|^{mc}} \sum_{g_{1,1},\ldots,g_{m,c}\in P_q} \tilde{\mathcal{C}}\phi(g_{mc}) \cdots \tilde{\mathcal{C}}\phi(g_1) = \phi_c^{*m}(e), \tag{139}$$

which means cycle benchmarking can be framed as RB with the implementation map $\phi_c$. Moreover, since in the limit of perfect gates we have, if $Cg_c \cdots Cg_1 = g$, that

$$\mathcal{C}\omega(g_c) \cdots \mathcal{C}\omega(g_1) = \omega(g) \tag{140}$$

we can reasonably make the assumption that $\phi_c$ is close to its reference implementation [i.e., Eq. (72)]. Hence the behavior of cycle benchmarking data is covered by Theorem 8. What is less clear is how to interpret the resulting exponential decays (especially in terms of the implementations $\phi$ and $\tilde{\mathcal{C}}$). This requires a more sophisticated analysis, which is done in Ref. [13].

5. Robust benchmarking tomography

In robust benchmarking tomography [50] one uses a RB protocol as a subroutine to extract tomographic information from a superoperator (not necessarily a unitary) $\mathcal{E}$. This is done by estimating the probability

$$p(i, m) = \frac{1}{|G|^m} \sum_{g_1,\ldots,g_m\in G} \langle\!\langle \mathcal{E}_M(\Pi_i)|\, \phi\big[g'(g_1 \cdots g_m)^{-1}\big]\, \mathcal{E}\phi(g' g_m) \cdots \mathcal{E}\phi(g' g_1)\, |\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle, \tag{141}$$

where $g'$ is a fixed element of the group $G$ and $\phi$ is the implementation of a reference representation $\omega$ [the goal is to estimate correlations between $\omega(g')$ and $\mathcal{E}$]. We can consider this as an interleaved RB scheme with reference implementation $\phi_{\rm tom}(g) = \omega(g' g)$ (thinking of $\mathcal{E}$ as a noisy implementation of the identity gate). However, this reference implementation is not close to a representation (unless $g' = e$), which means that Theorem 8 does not apply. This is not an artifact of the proof technique but rather a reflection of the fact that robust benchmarking tomography features extremely rapid exponential decays. In the gate-independent noise case the decay rate is set by the average fidelity $F[\omega(g'), \mathcal{E}]$, which can be very small. In the language of matrix Fourier theory this means that the dominant eigenvalues of the Fourier operator $F(\phi_{\rm tom})$ will be small even in the ideal case. Hence, we do not expect an assumption of the form, Eq. (72), to be strong enough to guarantee exponential behavior of the RB output data in this scenario.

VII. DATA PROCESSING AND SAMPLE COMPLEXITY

As discussed before the randomized benchmarking protocol can be divided into data collection and postprocessing phases. The data-collection protocol is summarized in Algorithm 1. The outputs of the data-collection phase are mean estimators $\hat{p}(i, m, g_{\rm end})$ that estimate the average over all sequences of length $m$ according to the measures $\nu_i$ and the quantum-measurement statistics, simultaneously. The main theorems of the data-collection phase (Theorems 8–10) state that the expectation value, again both over the measurement statistics and the random sequences, is well-approximated by a linear combination of (matrix) exponentials in $m$.

The figures of merit that RB experiments report are the decay parameters associated with the linear combination of (matrix) exponentials. Extracting these decay parameters is the objective of the data-processing phase that is the focus of the current section. For gate-independent noise and reference representations without multiplicities the decay parameters can be directly connected to the average gate fidelity of the noise. In the more general case, the interpretation of the decay parameters in terms of other operational measures of quality can be more complicated. We consider the connection between the decay parameters and the average gate-set fidelity in Sec. IX. Here we want to take a more pragmatic approach for the postprocessing phase. The deviation of the decay parameters from unity can directly be regarded as a measure of quality that captures the deviation of the actually implemented gates from an ideal implementation. In principle, the set of decay parameters itself provides a refined image of the quality of the implementation, as compared to the average gate fidelity. This motivates us to limit the postprocessing phase to the extraction of the decay parameters. The estimation of other measures of quality from the decay parameters is then left to an optional subsequent processing phase.

In the simplest RB setting (e.g., uniform RB with the Clifford group), featuring a single noise-affected


representation, the data-processing phase involves only fitting a single exponential decay curve. The analysis of RB data arising in more general settings, however, requires a considerably more flexible approach for the data processing.

Extracting multiple decay coefficients, or poles, from a discrete series of data points is a well-studied problem in signal processing that arises in many different disciplines. For this reason, this section includes a review of modern approaches to this fitting problem that not only have been generalized to the fitting of matrix exponentials but also come with theoretical performance guarantees and bounds. The pole-finding algorithms we review (MUSIC and ESPRIT) come with multiple merits: (1) they are easily and efficiently implementable, (2) they are flexible enough to in principle analyze any RB signal of the general form, Eq. (63), (3) they come with in-built denoising and superresolution capabilities, (4) they feature theoretical bounds that can (4a) inform the design of experimental parameters, and (4b), very importantly, can be used to identify parameter regimes where distinguishing the different decay parameters becomes infeasible in practice.

Following this review we combine analytical guarantees and numerical simulations to evaluate the performance of these algorithmic approaches for the processing of RB data. In particular, we discuss the effect of the configuration of the decay parameters, such as their number and spacings, on the overall number of required measurements and the maximal sequence length in the experiment. We thereby provide theoretical guiding principles for designing RB experiments and explicitly work out limitations where the experimental precision required in order to separate multiple decays becomes impractical.

These fundamental limitations in analyzing RB data have previously motivated a variety of more resource-intensive data-gathering protocols that take further data from which one can isolate different decay curves in the classical postprocessing phase. We turn our attention to devising a novel general method for isolating matrix exponentials in Sec. VIII. We begin by a detailed description of the data-processing problem.

A. The randomized benchmarking data-processing phase

The theorems on the data-collection phase, morally summarized by Eq. (63), state that in expectation RB output data is well-approximated by a linear combination of (matrix) exponentials in $m$. Every matrix $M_\lambda \in \mathbb{C}^{n_\lambda \times n_\lambda}$ in the expansion is associated with an irreducible representation $\lambda$ of the reference representation $\omega$ and $n_\lambda$ is the multiplicity of $\sigma_\lambda$ in the decomposition of $\omega$. From the collected data, a RB protocol subsequently extracts decay parameters that describe the exponential decay. The decay parameters associated with a matrix $M_\lambda$ are its eigenvalues $\mathrm{spec}(M_\lambda) = \{z_i^{(\lambda)}\}_{i=1}^{n_\lambda}$. If $M_\lambda$ is diagonalizable, then

$$\mathrm{Tr}(A_\lambda M_\lambda^m) = \sum_{i}^{n_\lambda} a_i^{(\lambda)} \big(z_i^{(\lambda)}\big)^m \tag{142}$$

with coefficients $a_i^{(\lambda)}$ depending on the overlap of $A_\lambda$ with the eigenspaces. More generally, let $M_\lambda = S^{-1} J S$ be the Jordan normal decomposition of $M_\lambda$ with Jordan blocks $J = \mathrm{diag}(J_1, J_2, \ldots)$, $J_i \in \mathbb{R}^{\mu_i \times \mu_i}$ and $\{z_i^{(\lambda)}\}$ being the corresponding eigenvalues. For $m \ge \mu_i$, the $j$th diagonal of the $m$th power of the $i$th Jordan block contains the entry $\binom{m}{j} z_i^{m-j}$. Therefore, the matrix exponential takes the form

$$\mathrm{Tr}(A_\lambda M_\lambda^m) = \sum_{i} \sum_{j\in[\mu_i]} a_i^{(\lambda,j)} \binom{m}{j} \big(z_i^{(\lambda)}\big)^{m-j} \tag{143}$$

with real coefficients $a_i^{(\lambda,j)}$. Note that $\binom{m}{j}$ are falling polynomials in $m$. Thus, the function space of $\mathrm{Tr}(A_\lambda M_\lambda^m)$ is in general spanned by exponential functions parametrized by the eigenvalues modulated by falling polynomials.

With the pole-finding techniques, which we discuss in the next section, one can extract the set of all poles

$$Q = \bigcup_{\lambda\in\Lambda} \big\{z_i^{(\lambda)} \,\big|\, i \in [n_\lambda]\big\} \tag{144}$$

from RB output data. Thus, the general postprocessing task of RB is the following: given a data-series $\hat{p}(m)$ that is approximately described by linear combinations of polynomial modulated decays, extract the set $Q$ of all poles.

Loosely speaking, estimating $Q$ is typically possible, provided that the coefficients of all representations are sufficiently large and the poles are sufficiently spaced. In the remainder of this section, we assess this statement quantitatively using analytical and numerical methods.

In practice, one might operate under additional assumptions and does not need to extract all poles individually. For example, if one expects multiple poles in the data series that are all more or less aligned, the data-processing problem becomes equivalent to extracting a single pole. The general form of the data-processing task, however, stays the same, namely extracting the poles in the data series. Without additional assumptions or postprocessing, the resulting poles are unlabeled, in the sense that one does not know which pole is associated with which irreducible representation. This issue is addressed when we turn our attention to techniques that filter the RB data for specific representations in Sec. VIII.
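The data model of Eqs. (142) and (143) can be checked numerically in a few lines. The sketch below builds a diagonalizable matrix with prescribed real poles and confirms that $\mathrm{Tr}(A M^m)$ equals a sum of pure exponentials, then verifies the polynomially modulated entry of a power of a Jordan block. All matrices, pole values, and the sequence-length range are arbitrary choices made for illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(seed=1)

# Diagonalizable case, Eq. (142): M = S^{-1} diag(z) S with poles z.
z = np.array([0.98, 0.90, 0.75])
S = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # well-conditioned change of basis
S_inv = np.linalg.inv(S)
M = S_inv @ np.diag(z) @ S
A = rng.normal(size=(3, 3))

# Tr(A M^m) = Tr(S A S^{-1} diag(z)^m), so the coefficients are the diagonal of S A S^{-1}.
a = np.diag(S @ A @ S_inv)

ms = np.arange(0, 30)
trace_form = np.array([np.trace(A @ np.linalg.matrix_power(M, int(m))) for m in ms])
decay_form = np.array([np.sum(a * z**m) for m in ms])
diagonalizable_ok = np.allclose(trace_form, decay_form)

# Jordan block case, Eq. (143): the first off-diagonal of J^m is binom(m, 1) * z^(m-1).
z0, m0 = 0.9, 7
J = np.array([[z0, 1.0],
              [0.0, z0]])
Jm = np.linalg.matrix_power(J, m0)
jordan_ok = np.isclose(Jm[0, 1], math.comb(m0, 1) * z0 ** (m0 - 1))
print(diagonalizable_ok, jordan_ok)
```

The array `decay_form` is exactly the kind of noise-free series that the pole-finding algorithms of the next subsection take as input.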


B. Data-processing algorithms and guarantees

1. Fitting single decays

Many proposals for RB derive a data model that is well approximated by a single decay curve. This is, for example, the case when the group is a unitary 2-design, the reference representation $\omega$ is the adjoint representation and the actual implementation is close to being trace-preserving [24]. The adjoint representation of a unitary 2-group acts irreducibly on the space of traceless matrices and yields a single dominant decay curve.

A single dominant decay parameter can be extracted using nonlinear least-squares fitting algorithms such as Levenberg-Marquardt, see, e.g., Ref. [73, Chapter 3.2]. In Ref. [56] it has been shown that in RB for the Clifford group the variance of the data points is expected to strongly vary with the sequence length $m$. This observed heteroskedasticity motivates us to use iteratively reweighted variants of least-squares fitting algorithms.

Reference [74] analyzes a simplified fitting procedure that estimates the decay parameter from the ratio of the data for two sufficiently separated sequence lengths. In the regime of high fidelity, it establishes a multiplicative error in the deviation of the decay parameter from an efficient number of samples. Relatedly, Ref. [45] gives an estimation scheme for a RB procedure that estimates, in parallel, multiple single exponential decays with multiplicative accuracy. This scheme makes use of postprocessing techniques to guarantee the “single-exponential” shape of the data. We discuss this more in Sec. VIII.

2. Fitting multiple decays with pole-finding algorithms: MUSIC and ESPRIT

Algorithms for simultaneously identifying multiple poles (frequencies and decay parameters) from a discrete series of data points date back to at least the work of Prony [75]. A zoo of modern algorithmic approaches has been developed in the context of direction-of-arrival estimation in array signaling. In principle, these techniques can extract poles that are closer together than the grid spacing defined by the finite sampling rate, a phenomenon dubbed superresolution. The theoretical framework to derive guarantees for these algorithms that go beyond a perturbative analysis of special noise models or very simple configurations, was only developed recently [76,77], first focusing on convex optimization.

Here, we analyze the performance of the MUSIC algorithm [78] and the ESPRIT [79] algorithm on RB data. Performance guarantees for these two subspace algorithms were derived in Refs. [80–83] for the multiplicity-free case. Furthermore, the ESPRIT algorithm was extended to polynomially modulated exponentials of the type we encounter in RB data with multiplicities in Refs. [84,85]. We summarize the required modification in Sec. VII B 5.

For the sake of clarity, we now start reviewing the algorithms for identifying multiple poles without polynomial modulation. This corresponds to the case of RB with a multiplicity-free reference representation. For the rest of this section we denote the output data as $y_m$ instead of $\hat{p}(m)$, in keeping with the signal-processing literature. We also assume equidistant spacing of the available sequence lengths $m$. As we point out in Sec. VII B 5, this requirement can be relaxed by running a low-rank completion algorithm on incomplete data and thereby infer equidistantly spaced data $y_m$. When clear from the context, we write the data series simply as a vector $y$, dropping the explicit dependence on $m$.

The strategy of both algorithms, MUSIC and ESPRIT, is to identify the range of the subspaces associated with the dominant singular values of the Hankel matrix of the data series $\{y_m\}_m$. The crucial observation is that from this subspace the poles can be extracted. Let $y \in \mathbb{R}^M$ be the RB data with $M$ the maximal sequence length. The Hankel matrix for $1 \le L < M$ is given by

$$\mathrm{Hankel}_L(y) = \begin{pmatrix} y_0 & y_1 & \cdots & y_{M-L} \\ y_1 & y_2 & \cdots & y_{M-L+1} \\ \vdots & \vdots & \ddots & \vdots \\ y_L & y_{L+1} & \cdots & y_M \end{pmatrix}. \tag{145}$$

We denote the Vandermonde matrix of size $n \times M$ for poles $z = (z_1, \ldots, z_n)$ by

$$W_M(z) = W_M(z_1, \ldots, z_n) = \begin{pmatrix} 1 & z_1 & z_1^2 & \cdots & z_1^{M-1} \\ 1 & z_2 & z_2^2 & \cdots & z_2^{M-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & z_n & z_n^2 & \cdots & z_n^{M-1} \end{pmatrix}. \tag{146}$$

If $n = 1$, and thus $z \in \mathbb{C}$ we refer to $W_M(z)$ as the Vandermonde vector of length $M$ and pole $z$.

With this notation, the data vector $y$, without noise, is in the range of $W_M(z)^T$. Furthermore, cyclically shifting the entries of $y$ amounts to multiplication of the summands with the respective poles. In effect, the Hankel matrix has a Vandermonde decomposition

$$\mathrm{Hankel}_L(y) = W_L^T(z)\,\mathrm{diag}(a)\,W_{M-L}(z) + \mathrm{Hankel}_L(\alpha), \tag{147}$$

where we denote by $\alpha$ the deviation of $y$ from an ideal linear combination of exponentials due to the perturbative error $\varepsilon(m)$ and finite statistics and where $a$ is the vector of prefactors given in Eq. (143).

To identify the signal subspace and distinguish it from the noise subspace, the MUSIC and ESPRIT algorithms employ a singular value decomposition (SVD) of the Hankel matrix, $\mathrm{Hankel}_L(y) = U\Sigma V^T$. In the absence of noise


and perturbation, i.e., α = 0, HankelL (y) has n nonvanish- [defined similarly to Eq. (148) but using Psignal ] has been
ing singular values and the corresponding singular vectors derived for poles z of unit absolute value (sinusoids).
form an orthonormal basis of the signal space span WTM (z). The argument, however, holds verbatim for all z ∈ Cn .
Let Usignal be the matrix consisting of the singular vec-
tors of the nontrivial singular values as columns and let Theorem 11: (Noise-correlation function bound [82],
Unoise be the matrix consisting of an orthonormal basis Proposition 4.2). Let E = HankelL (α) denote the Hankel
of the complement. It is convenient to define associated matrix of the perturbation and noise of the signal vector y.
noise space (Pnoise ) and signal space (Psignal ) projectors as Let εmin be the smallest singular value of the Hankel matrix
† †
Pnoise = Unoise Unoise and Psignal = Usignal Usignal . In the pres- of the noise-free signal. Suppose L ≥ n, M − L + 1 ≥ n
ence of noise, analogously choosing the singular vectors of and 2 E∞ < εmin . Then,
the n largest singular values yields an estimate of the signal
space. 2 E∞
† |Rnoise (z) − Rsignal (z)| ≤ (151)
From the noise-space projector Pnoise = Unoise Unoise , the εmin
MUSIC algorithm defines the inverse noise-space correla-
tion function R−1noise : C → R, for all z ∈ C.

WL (z)2 We observe that the bound on Rnoise (z) is proportional to

R−1
noise (z) = . (148)
Pnoise WL (z)2 the spectral norm of the noise in the signal but in addi-
tion is decorated by a noise-enhancing factor inversely
The poles z can then be identified as the peaks of R−1
noise (z). proportional to the smallest singular value εmin of the Han-
These can be found by a continuous scan of the values of kel matrix. The bound on Rnoise (z) can thus not be directly
R−1
noise (z), which can be done numerically. translated into a bound on the precision in recovering the
A slightly different approach that avoids the continuous poles z without further assumptions, see Ref. [80, Theorem
search for poles is taken by the ESPRIT algorithm. The 4] in this context. Nonetheless, the peaks of R−1 noise (z) are
ESPRIT algorithm exploits a so-called “rotational invari- typically very sharp, and the bound on Rnoise (z) indicates a
↓ ↑
ance” property. To this end, let WL (z) and WL (z) be the regime where one can typically expect MUSIC to accu-
submatrices of the Vandermonde matrix WTL (z) that omit rately work. For the ESPRIT algorithm, similar bounds
the last and first column, respectively. These submatrices can be found in Refs. [81,83]. The bounds for ESPRIT
are related via additionally involve the minimum singular value of the
↓ ↑ truncation (as defined above) of the Hankel matrix.
WL (z) = WL (z)diag(z). (149)

This rotational invariance property is inherited by $U_{\rm signal}$. In consequence, let $H^{\downarrow}$ and $H^{\uparrow}$ be the submatrices of the Hankel matrix $H$ that omit the last and first rows, respectively. Then, in the noiseless case, a solution matrix $\Psi$ of the equation

$$H^{\downarrow} = H^{\uparrow}\,\Psi \tag{150}$$

has nonzero eigenvalues $z$, which are the poles contained in the data. It is given explicitly by the pseudoinverse of $H^{\uparrow}$ applied to $H^{\downarrow}$. Again, noisy signals can be considerably denoised by projecting $H^{\uparrow}$ to the signal space before inversion. Altogether, we find the algorithmic strategy of ESPRIT to be (i) calculate the SVD of the Hankel matrix of $y$ and determine $P_{\rm signal} = U_{\rm signal} U_{\rm signal}^{\dagger}$, (ii) calculate $\Psi = (P_{\rm signal} H^{\uparrow})^{+} H^{\downarrow}$, and (iii) determine $z$ as the eigenvalues of $\Psi$.

3. Performance guarantees

Nonperturbative analysis of the performance of MUSIC has been conducted in Refs. [80,82]. Therein, the following bound for the deviation of the noise-correlation function $R_{\rm noise}(z)$ from the ideal noiseless counterpart $R_{\rm signal}(z)$

4. Conditioning of Vandermonde matrices

The performance guarantees for MUSIC (and ESPRIT) show a noise enhancement inversely proportional to the minimum singular value $\varepsilon_{\min}$ of the Hankel matrix of the ideal signal. The minimum singular value $\varepsilon_{\min}$ in turn can be regarded as a measure for the conditioning of the Vandermonde matrices into which the Hankel matrix decomposes. This conditioning depends on the system parameters and on the configuration of poles. Given expected values for the poles and the maximal sequence length, it is straightforward to calculate the minimum singular value numerically. This can provide valuable information in the design of RB experiments.

More systematically, it is informative to understand the scaling behavior of the conditioning of the Vandermonde matrices with the help of theoretical bounds. One such bound that allows us to study its asymptotic behavior is briefly reviewed in this section. A lot of work has been devoted to studying the often surprisingly favorable conditioning of Vandermonde matrices for poles on the unit circle, which describe sinusoidal oscillations; see, e.g., Ref. [83] and references therein for a discussion of the phenomenon of superresolution.
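As a rough numerical companion to the ESPRIT strategy (i)-(iii) and to the remark that the minimum singular value of the Hankel matrix is easy to compute, the following Python sketch builds the Hankel matrix of an ideal two-pole decay, returns its singular values, and recovers the poles. It is an illustration under stated assumptions, not the implementation used for the figures: it relies on NumPy, the function names `hankel_matrix` and `esprit` are hypothetical, the poles 0.9 and 0.99 with equal weights are chosen only for illustration, and step (ii) is carried out in the common signal-subspace form (least-squares solution of a shift equation on the leading left singular vectors), whose row-deletion convention may differ from the labeling used in the text.

```python
import numpy as np

def hankel_matrix(y, L):
    """L x (M - L + 1) Hankel matrix with entries H[i, j] = y[i + j]."""
    y = np.asarray(y)
    cols = len(y) - L + 1
    return np.array([y[i:i + cols] for i in range(L)])

def esprit(y, n_poles, L=None):
    """Estimate n_poles poles of y[m] = sum_k a_k z_k**m (sketch, assumed conventions)."""
    y = np.asarray(y, dtype=complex)
    if L is None:
        L = len(y) // 2
    H = hankel_matrix(y, L)
    # (i) SVD of the Hankel matrix; the leading left singular vectors span the signal space
    U, sing_vals, _ = np.linalg.svd(H, full_matrices=False)
    U_sig = U[:, :n_poles]
    # (ii) shift invariance: rows 0..L-2 times Psi equal rows 1..L-1 (least squares)
    Psi = np.linalg.pinv(U_sig[:-1]) @ U_sig[1:]
    # (iii) the eigenvalues of Psi are the pole estimates
    return np.linalg.eigvals(Psi), sing_vals

# Ideal (noise-free) two-pole decay; pole values are illustrative assumptions
m = np.arange(40)
y = 0.5 * 0.9**m + 0.5 * 0.99**m
poles, sing_vals = esprit(y, n_poles=2)
print("estimated poles:", np.sort(poles.real))
print("smallest signal singular value of the Hankel matrix:", sing_vals[1])
```

For noisy data the same routine can be called unchanged; the quality of the estimate is then governed by the size of the smallest signal singular value relative to the noise, in line with the noise-enhancement discussion above.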


In the context of RB, we are conversely interested in poles that are on the real line. A more general characterization of the conditioning of Vandermonde matrices with poles inside the unit circle (allowing for decays beyond oscillations) has been studied in Ref. [86]. The conditioning obviously depends on the set of poles $z$ and the size $M$ of the Vandermonde matrix. To state the result given in Ref. [86] we define several quantities. To the set of poles $z = (z_1, \ldots, z_n)$, we associate $\check z := \max_j |z_j|$, $\hat z := \min_j |z_j|$ and $\ddot z := \min_{j \neq k} |z_j - z_k|$. Furthermore, let us define

$$Q_M(z) = [W_M(z) W_M^{\dagger}(z)]^{-1/2}. \tag{152}$$

Note that $W_M(z) W_M^{\dagger}(z)$ is the frame operator of the frame defined by the rows of the Vandermonde matrix and $Q_M(z)$ is the orthogonalizing matrix arising in symmetric orthogonalization. With the help of $Q_M(z)$, we define the matrix

$$F_M(z) := Q_M(z)\,\mathrm{diag}(z)\,Q_M^{-1}(z), \tag{153}$$

which will play a prominent role for analyzing the Vandermonde conditioning. In particular, its departure from normality as measured by $D^2[F_M(z)] = \|F_M(z)\|_F^2 - \|z\|_2^2$ will appear.

In Ref. [86] a bound is derived for the 2-norm condition number $\kappa_2(W_M) = \|W_M\|_\infty \|W_M^{+}\|_\infty$ through the bounding of the Frobenius norm condition number $\kappa_F(W_M) = \|W_M\|_2 \|W_M^{+}\|_2$. Here $X^{+}$ denotes the (Moore-Penrose) pseudoinverse of a matrix $X$. The condition number of a linear map $A$ gives a worst-case bound on the relative reconstruction error in $\ell_2$ norm induced by an additive error in $\ell_2$ norm for a linear inverse problem. But here we are more concerned with how it enters into the accuracy of identifying poles in the MUSIC and ESPRIT algorithms. For the analysis of the MUSIC and ESPRIT algorithms, we want to upper bound the inverse of the minimum singular value, $\varepsilon_{\min}^{-1}$. By means of the Vandermonde decomposition (147) and the submultiplicativity of the spectral norm, we have $\varepsilon_{\min}^{-1} \leq \|W_{M-L}^{+}\|_\infty \|W_L^{+}\|_\infty\, \hat z^{-1}$. Since $\|W_M\|_\infty \geq 1$, we conclude that

$$\varepsilon_{\min}^{-1} \leq \kappa_2(W_{M-L})\,\kappa_2(W_L)\,\hat z^{-1}. \tag{154}$$

For the condition number the following bound holds.

Theorem 12: (Conditioning of Vandermonde matrices [86], Theorem 6). For $M > n \geq 2$, for a Vandermonde matrix $W_M(z)$, it holds that

$$\frac{\varepsilon_1[F_M(z)]}{\check z} \leq \kappa_2[W_M(z)] \leq \frac{1}{2}\left(\rho + \sqrt{\rho^2 - 4}\right) \tag{155}$$

with

$$\rho = n\left[1 + \frac{D^2[F_M(z)]}{(n-1)\,\ddot z^{2}}\right]^{\frac{n-1}{2}} \frac{\phi_L(\check z)}{\phi_L(\hat z)} - n + 2. \tag{156}$$

Most interesting in our context is the asymptotic scaling in the limit of large maximal sequence length $M$, for poles inside the unit disc, $|z_i| < 1$ for all $i$. In this limit, the above bounds become tight and the following holds true.

Lemma 13: (Asymptotics of condition number [86], Lemma 8). Let $z = (z_1, \ldots, z_n) \in \mathbb{C}^n$ with $|z_i| < 1$ for all $i \in [n]$. Define $C(z) \in \mathbb{C}^{n \times n}$ as the matrix with entries

$$C_{i,j}(z) = \frac{1}{1 - z_i \bar z_j}. \tag{157}$$

Then,

$$\lim_{M \to \infty} \kappa_2[W_M(z)] = \sqrt{\kappa_2[C(z)]}. \tag{158}$$

Later in this section we use this bound to perform numerical investigations of the resolving power of the MUSIC and ESPRIT algorithms and to give a sampling complexity bound for general RB.

5. Extensions of the algorithms

a. Incomplete data or logarithmic grids. So far the presented algorithms and analysis relied on having an equidistant grid of sequence lengths. It is well known that a low-rank matrix can under fairly general assumptions be completed from the knowledge of just a subset of its entries [87]. Thus, given only data $y_m$ for values $m$ on an irregular subset of a regular grid, one can attempt to complete the Hankel matrix for the regular grid using a low-rank matrix completion algorithm. This preprocessing step can be combined with MUSIC or ESPRIT to arrive at pole-finding algorithms that do not rely on complete data from an equidistant grid [80]. In particular, we suspect that for exponential decays a logarithmic grid can potentially yield improved recovery similar to the multiplicative error bounds for the fitting of single exponentials derived in Ref. [74], but we leave formally verifying this to future work.

b. Generalization of ESPRIT to matrix exponentials. References [84,85] have generalized the ESPRIT algorithm to signal spaces spanned by products of falling polynomials and exponentials. This is exactly the signal model, Eq. (143), that we encountered for RB output data, when the reference representation has multiplicities. The key insight in this generalization is that the Hankel matrix of such signals admits a decomposition analogous to the Vandermonde decomposition (147) in terms of Pascal-Vandermonde matrices. These Pascal-Vandermonde matrices feature the same rotational invariance property underlying the ESPRIT algorithm.


Thus, one can show that when applying the standard ESPRIT algorithm to data of this form, the eigenvalues of the solution matrix of Eq. (150) still form the vector of poles $z$, with the eigenvalues appearing in multiplicities according to the maximal degree of the associated falling polynomial. Hence, ESPRIT can be directly applied to estimate matrix-exponential data series. Noise in the signal will generically break the degeneracy of the eigenvalue spectrum, corresponding to the fact that a generic matrix has nondegenerate eigenvalues. Searching for regular polygons of poles allows for matching groups of perturbed poles corresponding to the same unperturbed pole. We refer to Refs. [84,85] for further details.

C. Randomized benchmarking sampling complexity: estimation of the Hankel matrix

The performance bounds on the pole-finding algorithms, such as Theorem 11, depend on the deviation of the Hankel matrix from ideal data in spectral norm. In RB protocols this error has two contributions:

1. The finite sampling statistics of the measurements, which yields a statistical error of the mean estimator $\hat p(m)$.
2. The perturbative error that comes from neglecting subdominant eigenvalues, which is controlled by our Theorems 8, 9, and 10.

For the finite sampling error, we provide the following bound. To this end, we model the individual measurement performed during the RB protocol by a random variable $\hat Y_m$. To simplify the notation in the proof, we assume that the number of different sequence lengths is even and use a square Hankel matrix.

Lemma 14: (Statistical estimation). Let $M$ be even and $L = M/2$. For $m \in [M]$, let $\hat Y_m$ be a random variable taking values in $[0, 1]$ with $\mathrm{Var}[\hat Y_m] \leq \varepsilon^2$. Furthermore, let $\hat p(m) = \frac{1}{N}\sum_{i=1}^{N} \hat Y_m^{(i)}$ be the corresponding mean estimator of $N$ independent identically distributed (IID) copies $\hat Y_m^{(i)}$ of $\hat Y_m$. We denote with $\mathrm{Hankel}_L(\hat p)$ the Hankel matrix of the vector $\hat p = [\hat p(m)]_{m \in [M]} \in \mathbb{R}^M$. Then, for any $\tilde\epsilon > 0$,

$$\left\| \mathrm{Hankel}_L(\hat p) - \mathbb{E}\,\mathrm{Hankel}_L(\hat p) \right\|_\infty \leq \tilde\epsilon \tag{159}$$

with probability $1 - \delta$ provided that

$$N \geq 4 \max\left\{ \frac{M \varepsilon^2}{\tilde\epsilon^{2}}, \frac{2}{3\tilde\epsilon} \right\} \log\frac{M}{\delta}. \tag{160}$$

Combining Lemma 14 with the performance bound for MUSIC, Theorem 11, and Eq. (154) we can state the following result for the overall sampling complexity of randomized benchmarking experiments.

Corollary 15: (Sampling complexity). Let $M$ be even and $L = M/2$, and let $z = (z_i)_{i=1}^{n}$ be a set of poles. For $m \in [M]$ let $\hat p(m)$ be the mean estimator of IID copies of random variables with variance bounded by $\varepsilon^2$. Choose $\tilde\epsilon, \delta > 0$. Provided that the total number of random trials satisfies

$$N_{\rm total} \geq 8\,\kappa_2^4[W_{M/2}(z)]\,\hat z^{-2}\,\frac{M \varepsilon^2}{\tilde\epsilon^{2}} \log\frac{M}{\delta} \tag{161}$$

and

$$N_{\rm total} \geq \frac{16}{3}\,\kappa_2^2[W_{M/2}(z)]\,\hat z^{-1}\,\frac{1}{\tilde\epsilon} \log\frac{M}{\delta}, \tag{162}$$

it holds for the noise-space correlation function (148) defined by the MUSIC algorithm with input data $\hat p$ that $|R_{\rm noise}(z') - R_{\rm signal}(z')| \leq \tilde\epsilon$ with probability at least $1 - \delta$.

We state this bound in terms of the condition number of the Vandermonde matrix, which allows us to make analytic claims about the behavior of the sampling complexity in various regimes. However, one can state an equivalent bound in terms of the smallest singular value, which will often be significantly smaller. It is, however, difficult to work with analytically.

For the application of Corollary 15 to RB data processing, one has to additionally control the perturbative error appearing in Theorems 8, 9, and 10. The perturbative error per RB data point, see, e.g., Eq. (73), yields an additive error in the noise correlation function of order $M \hat z^{-1} \kappa_2^2[W_{M/2}(z)]$. The scaling with $M$ originates from the spectral norm of the Hankel matrix and the factor of $\hat z^{-1} \kappa_2^2[W_{M/2}(z)]$ captures the noise enhancement.

Lemma 14 follows from the matrix Bernstein bound [88,89] that requires us to control the spectral norm and matrix variance statistics in order to provide a tail bound for sums of matrices. We follow the same strategy as presented in Ref. [88] for Toeplitz matrices.

Proof of Lemma 14. With the help of the $L \times L$ exchange matrix

$$J_{i,j} = \begin{cases} 1 & j = L - i + 1 \\ 0 & \text{else} \end{cases} \tag{163}$$

and the $L \times L$ (noncyclic) shift matrix $X$ that has ones on its first upper off-diagonal and zeros everywhere else we can write

$$\mathrm{Hankel}_L(\hat p) = \sum_{k=-L+1}^{L-1} \hat p_k X^k J, \tag{164}$$

where we identify the elements of $p$ cyclically. We define

$$S_k^{(i)} := \frac{1}{N}\left(\hat Y_k^{(i)} - \mathbb{E}[\hat Y_k^{(i)}]\right) X^k J \tag{165}$$


such that $\mathrm{Hankel}_L(\hat p) - \mathbb{E}\,\mathrm{Hankel}_L(\hat p) = \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} S_k^{(i)}$ is the sum of the random matrices $S_k^{(i)}$. Since

$$\|X\|_\infty = \|J\|_\infty = 1 \tag{166}$$

and $\hat Y_k$ takes values in $[0, 1]$, we have that

$$\left\| S_k^{(i)} \right\|_\infty \leq 2/N \tag{167}$$

for all $i, k$. For the matrix variance we calculate that

$$\sum_{k=-L+1}^{L-1} \mathbb{E}\left[S_k^{(i)} (S_k^{(i)})^{\dagger}\right] = \frac{1}{N^2} \sum_{k=-L+1}^{L-1} \mathrm{Var}[\hat Y_k]\, X^k X^{-k} = \frac{1}{N^2} \sum_{k=-L+1}^{L-1} \mathrm{Var}[\hat Y_k]\, P_k, \tag{168}$$

with $P_k$ a diagonal projector having $k$ ones on the diagonal and zeros everywhere else. One finds the same structure for $\sum_{k=-L+1}^{L-1} \mathbb{E}[(S_k^{(i)})^{\dagger} S_k^{(i)}]$ analogously. By the assumption of the lemma, $\mathrm{Var}[\hat Y_k] \leq \varepsilon^2$. Therefore, the matrix variance statistics is dominated as

$$\max\left\{ \left\| \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} \mathbb{E}\left(S_k S_k^{\dagger}\right) \right\|_\infty, \left\| \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} \mathbb{E}\left(S_k^{\dagger} S_k\right) \right\|_\infty \right\} \leq \frac{M \varepsilon^2}{N}. \tag{169}$$

The matrix Bernstein inequality [88] yields

$$\mathbb{P}\left[ \left\| \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} S_k^{(i)} \right\|_\infty \geq \tilde\epsilon \right] \leq M \exp\left( - \min\left\{ \frac{N \tilde\epsilon^{2}}{4 M \varepsilon^2}, \frac{3 N \tilde\epsilon}{8} \right\} \right). \tag{170}$$

Requiring the right-hand side to be dominated by $\delta$ and solving for $N$ yields the lemma's assertion.

D. Vandermonde conditioning for randomized benchmarking decays

The noise-enhancement factor in the performance guarantee for the tone-finding algorithms MUSIC and ESPRIT is given by the inverse of the minimum singular value, $\varepsilon_{\min}^{-1}$, of the Hankel matrix of the ideal, noise-free signal. This minimum singular value, Eq. (154), is in turn controlled by the minimal absolute value of the poles and the conditioning of the Vandermonde matrix $W_L(z)$ associated with the poles and the signal length. Here we numerically investigate this conditioning in various scenarios relevant to RB. We express all data in terms of the dimension of the Hankel matrix $L$, which one can generally take as being about half of the maximal sequence length $M$.

When the RB data model is described by many poles that are close in value, the noise enhancement due to bad conditioning can be the limiting factor rendering the extraction of poles infeasible.

Increasing the sequence length improves the conditioning of $W_L(z)$, see Fig. 3. But Theorem 12 shows that the condition number of $W_L(z)$ is, even in the asymptotic limit $W_\infty(z)$ of large $L$, bounded away from zero. Thus, increasing the length of observed RB series only improves the conditioning up to a certain point.

FIG. 3. The condition number of the Vandermonde matrix for different Hankel matrix dimensions ($\propto$ RB sequence length) for three different sets of poles. The dashed lines indicate the asymptotic expression of Lemma 13. Note that the minimum for the green line is due to the scaling of the maximal eigenvalue of $W_{M-L}$. We observe (not depicted here) that the minimum singular value of the Hankel matrix, as appearing in Theorem 11, is monotonically increasing in $L$ for all three sets of poles.
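The kind of comparison summarized in Fig. 3 can be sketched numerically as follows. This is a minimal illustration, assuming NumPy and the convention that the Vandermonde matrix has one row per pole (so that its frame operator is an $n \times n$ matrix, consistent with Lemma 13); the helper names `vandermonde`, `kappa2` and `kappa2_asymptotic` are hypothetical, and the pole set is the $n = 2$ linearly spaced family with $\alpha = 0.9$ from Table I rather than the sets actually plotted.

```python
import numpy as np

def vandermonde(z, L):
    """n x L Vandermonde matrix with entries W[i, m] = z_i**m, m = 0..L-1."""
    z = np.asarray(z, dtype=complex)
    return z[:, None] ** np.arange(L)[None, :]

def kappa2(A):
    """2-norm condition number: ratio of largest to smallest singular value."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

def kappa2_asymptotic(z):
    """sqrt(kappa_2[C(z)]) with C_ij = 1 / (1 - z_i * conj(z_j)), cf. Lemma 13."""
    z = np.asarray(z, dtype=complex)
    C = 1.0 / (1.0 - np.outer(z, z.conj()))
    return np.sqrt(kappa2(C))

poles = [0.9, 0.95]  # assumed example: linearly spaced family, alpha = 0.9, n = 2
for L in (5, 10, 20, 50, 200, 2000):
    print(L, kappa2(vandermonde(poles, L)))
print("asymptotic value:", kappa2_asymptotic(poles))
```

Printing the condition number for increasing $L$ next to the asymptotic value makes the saturation described above directly visible for a chosen pole set.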


FIG. 4. Here we show the dependency of the condition number of the Vandermonde matrix on the spacing of two poles $z_0, z_1$, for infinite sequence length. We see that the conditioning depends drastically on the distance between the two poles, but not on the absolute location of the poles on the real line. The orange line at $10^2$ is added for the purpose of comparison.
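A two-pole scan of the type shown in Fig. 4 can be sketched with the asymptotic expression of Lemma 13. The snippet below is only an illustration under assumptions: it uses NumPy, the function name `kappa_inf` is hypothetical, and the grid of relative spacings is chosen for illustration and is not the grid used for the figure.

```python
import numpy as np

def kappa_inf(z0, z1):
    """Infinite-length condition number sqrt(kappa_2[C(z)]) for two poles (Lemma 13)."""
    z = np.array([z0, z1], dtype=complex)
    C = 1.0 / (1.0 - np.outer(z, z.conj()))
    s = np.linalg.svd(C, compute_uv=False)
    return np.sqrt(s[0] / s[-1])

# First pole deviates from 1 by r; second pole is placed at a relative distance from it.
for r in (1e-2, 1e-3, 1e-4):
    z0 = 1.0 - r
    for rel in (0.5, 0.1, 0.01):       # assumed relative spacings |z1 - z0| / r
        z1 = 1.0 - r * (1.0 - rel)
        print(f"r={r:g}  spacing={r * rel:g}  kappa={kappa_inf(z0, z1):.4g}")
```

In such a scan the condition number grows as the two poles approach each other, and rows with the same relative spacing give similar values for different $r$.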

The explicit expressions of the upper and lower bounds on the condition number in Theorem 12 have a rather complicated dependency on the geometrical constellation of the poles. One can argue that for RB data with poles on the real line there are, roughly speaking, two effects coming into play: (1) the spacing of the poles and (2) the number of poles.

To illustrate the dependency on the spacing of the poles, we numerically evaluate $\kappa_2[W_\infty(z)]$ for different pairs of poles as they might appear in RB data. The result is shown in Fig. 4. The first pole is chosen to deviate from 1 by a value $r \in \{10^{-2}, 10^{-3}, 10^{-4}\}$; the second pole is chosen at different values around the first one. We observe that as both poles move together the condition number diverges. Importantly, the size of the interval in which the condition number grows over a certain threshold scales with $r$. Correspondingly, we expect that poles closer to 1 can still be resolved with a smaller spacing compared to poles that deviate considerably from 1.

Secondly, even if the poles are spaced such that the ratio of the departure from normality and the minimum spacing is fixed, the upper bound in Theorem 12 exhibits an exponential dependency on the number of poles. We numerically evaluate this dependency for different families of poles that each define a set of poles for every cardinality, see Table I. These families include linearly spaced poles within the interval $(\alpha, 1)$ and the pole families $F_a(n) = (z_i = 1 - 10^{-i/a} \mid i \in [n])$ for positive real $a$. For example, $F_1(n) = (.9, .99, .999, \ldots)$, which can be regarded as featuring exponentially spaced "infidelities". Figure 5 depicts the dependency of $\kappa_2[W_\infty(z)]$ on the number of poles $n$ for different families. We find that, due to a typically exponential dependency, the conditioning indicates that the reconstruction of multiple poles becomes demanding for already small numbers $n$.

Note that the conditioning is significantly improved if the poles are not exclusively on the real line but also have nonvanishing imaginary parts. Such pole sets, for example, arise in the RB variant of Ref. [9] focusing on individual gates.

TABLE I. Examples of pole families for different numbers of poles n.

Family          n = 2            n = 4                              n = 6
Lin. α = .9     (0.9, 0.95)      (0.9, 0.925, 0.95, 0.975)          (0.9, 0.9167, 0.9333, 0.95, 0.9667, 0.9833)
Lin. α = .5     (0.5, 0.75)      (0.5, 0.625, 0.75, 0.875)          (0.5, 0.5833, 0.6667, 0.75, 0.8333, 0.9167)
F1              (.9, .99)        (.9, .99, .999, .9999)             (.9, .99, .999, .9999, .99999, .999999)
F2              (0.9, 0.9684)    (0.9, 0.9684, 0.99, 0.9968)        (0.9, 0.9684, 0.99, 0.9968, 0.999, 0.9997)

E. Performance evaluation

After collecting evidence that the reconstruction of multiple poles quickly becomes a demanding task, we here show that for moderate configurations (i.e., not too many poles, not too close together) the ESPRIT algorithm is suitable for the postprocessing of RB data. To this end, we implement the ESPRIT algorithm in Python. For a fixed set of poles the ideal data series (constructed from


the poles and a fixed identical prefactor) is made noisy by randomly sampling binomial distributions. This simulates the random noise due to finite statistics for a certain number of samples per sequence length. Subsequently, the set of poles is reconstructed from the noisy data using the ESPRIT algorithm. We compare the reconstructed set of poles with the ideal set of poles using the symmetric Hausdorff distance. Let $z \in \mathbb{C}^{n}$ and $z' \in \mathbb{C}^{n'}$; then

$$d_H(z, z') = \max\{\vec d_H(z'; z), \vec d_H(z; z')\}, \qquad \vec d_H(z, z') = \max_{k \in [n]} \min_{k' \in [n']} |z_k - z'_{k'}|. \tag{171}$$

FIG. 5. The dependency of the condition number in the limit of infinite sequence length on the number of poles for different families of poles. These families are defined in Table I.

FIG. 6. Mean Hausdorff distance between the true set of poles and the reconstructed set of poles (via ESPRIT) for different families of poles (as defined in Table I) and Hankel dimension $L$ ($\propto$ maximal RB sequence length $M$) versus the number of samples used per expectation value estimation. Each data point is averaged over 100 repetitions. For all families we see that the reconstruction essentially fails until a sampling threshold is reached; after this threshold the accuracy of the estimation increases rapidly with increased number of samples. This threshold increases strongly with the number of poles in the family across all families and also depends on the maximal sequence length. This latter dependence is mediated by the actual locations of the poles in the complex plane, which is as expected.

Figure 6 displays the mean Hausdorff distance for different numbers of samples. Each data point is averaged over 100 repetitions. Figure 7 depicts the mean Hausdorff distance for different numbers of samples and maximal


sequence lengths. In both of these plots we note a threshold effect where the reconstruction of the poles essentially fails until a threshold of samples and maximal sequence length is reached, after which reconstruction accuracy increases with increasing number of samples. This phenomenon is observed for different families of poles and the location of the threshold depends strongly on the number of poles in the signal. It is interesting to note in Fig. 7 that the minimal number of samples needed for reconstruction is dependent on the maximal sequence length. Since increasing the maximal sequence length has an implicit sampling cost, this points to a nontrivial optimization problem in allocating resources. We leave further investigation of the optimal point for a family of poles for further research. The conclusion from these numerical investigations is that the RB decay-rate recovery problem is feasible using modern methods when the number of poles is small but rapidly becomes impractical as the number of poles grows.

FIG. 7. Mean Hausdorff distance of the reconstruction (via ESPRIT) for poles $z = F_2(4) = (0.9, 0.968, 0.99, 0.997)$ for different numbers of samples and Hankel dimension $L$ ($\propto$ maximal RB sequence length $M$). Each data point is averaged over 100 repetitions. We see again that reconstruction essentially fails until a threshold is reached both in the number of samples and in maximal sequence length, after which the accuracy of reconstruction increases with increasing number of samples and in $L$.

VIII. ISOLATING MATRIX EXPONENTIALS ASSOCIATED WITH A REPRESENTATION

We have seen in Sec. VI that for uniform randomized benchmarking the output data is well described by a linear combination of (matrix) exponential decays associated with irreducible subrepresentations of a reference representation. The decay rates can, in principle, be extracted by the methods described in Sec. VII. However, two issues crop up here: (1) the sample complexity of extraction is strongly dependent on the number of decays present in the RB output data, limiting RB to groups with reference representations containing at most a few irreducible subrepresentations, and (2) upon successful extraction of decay constants, it is not clear a priori how they are related to the different irreducible subrepresentations present, making it hard to relate the decay constants to the average fidelity.

A data-processing technique that addresses this problem was proposed in various papers (marked with a ∗ in Fig. 2) such as the dihedral benchmarking scheme [37] for the single-qubit dihedral group, the character benchmarking scheme [39], which works for general groups (with some technical constraints on the reference representation) and the Pauli channel tomography scheme [45] and cycle benchmarking [13] for the Pauli group (in Ref. [45] multiple decays are actually estimated in parallel). The unifying theme in all of these procedures is that one estimates RB output data $p(i, m, g_{\rm end})$ for different ending gates $g_{\rm end} \in G$, and then correlates the resulting vector of signals $[p(i, m, g_{\rm end})]_{g_{\rm end}}$ with a scalar function $f_\lambda(g_{\rm end})$ (which can be thought of as a dual vector) that depends on an irreducible subrepresentation $\sigma_\lambda$ of the reference representation $\omega$.

In this section we take this idea and generalize it as far as possible. In particular, we propose a postprocessing method that, for any group $G$ and reference representation $\omega$, takes in RB output data $p(i, m, g_{\rm end})$ (for all $g_{\rm end} \in G$) and an irreducible subrepresentation $\sigma_\lambda$ of the reference representation $\omega$, and outputs postprocessed data $k_\lambda(m)$ that depends only on the (matrix) exponential decay associated with $\sigma_\lambda$. We state theorems for uniform RB, but the discussion below generalizes to the other types of RB.

We note that all examples of RB schemes without inversion gates (marked with a ∗∗ in Fig. 2) can be seen as special cases of the procedure given below, where the output data $[p(i, m, g_{\rm end})]_{g_{\rm end}}$ is simply averaged over $g_{\rm end}$. We would also like to note that the procedure defined here obviates the need for explicitly implementing the inversion gate (as it can be simply absorbed by redefining $g_{\rm end}$). This makes the protocol more experimentally practical.

A. The postprocessing procedure

We begin by defining filter functions $\alpha_\lambda$ (associated with a representation $\sigma_\lambda$)

$$\alpha_\lambda : G \times I \to \mathbb{C} : (g, i) \mapsto \langle\!\langle \Pi_i | P_\lambda\, \omega(g) | \rho_0 \rangle\!\rangle, \tag{172}$$

where $P_\lambda : S_d \to S_d$ is the projection onto the subrepresentation $\sigma_\lambda^{\oplus n_\lambda}$ of the reference representation $\omega$. This is (up to normalization) the matrix element of the subrepresentation $\sigma_\lambda^{\oplus n_\lambda}$ corresponding to the vectors $|\rho_0\rangle\!\rangle$ and $\langle\!\langle \Pi_i|$. From the RB data and the above matrix element function we can now compute the following quantity we call the


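λ-filtered RB output data:

$$k_\lambda(m) = \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sum_{i \in I} N_\lambda^{-1}\, \alpha_\lambda(i, g_{\rm end})\, p(i, m, g_{\rm end}), \tag{173}$$

where the normalization constant is given by

$$N_\lambda = \frac{1}{|G|} \sum_{g \in G} \sum_{i \in I} \alpha_\lambda(g, i)\, \langle\!\langle \Pi_i | \omega(g) | \rho_0 \rangle\!\rangle. \tag{174}$$

One can think of this quantity as measuring the presence of the subrepresentation $\sigma_\lambda$ in the data $p(i, g_{\rm end}, m)$. We make this more precise in the following theorem.

Theorem 16: (Measuring subrepresentations in the data). Let $G$ be a finite group and $\omega : G \to S_d$ a reference representation of $G$ with decomposition $\omega = \bigoplus_{\lambda' \in \Lambda} \sigma_{\lambda'}^{\oplus n_{\lambda'}}$. Moreover, let $\phi$ be an implementation of $\omega$ for which Theorem 8 holds. For a fixed $\lambda \in \Lambda$ consider the λ-filtered data $k_\lambda(m)$ as defined in Eq. (173). As a function of $m$ we now have that

$$\left| k_\lambda(m) - \mathrm{Tr}(B_\lambda M_\lambda^m) \right| \leq 8K \left[ \delta \left( 1 + \frac{2\delta}{1 - 5\delta} \right) \right]^m, \tag{175}$$

where $B_\lambda$ is an $n_\lambda \times n_\lambda$ matrix encoding SPAM terms, $M_\lambda$ is given by the projection onto the subspace associated with the $n_\lambda$ largest eigenvalues of $\mathcal{F}(\phi)[\sigma_\lambda]$ (as given in Theorem 8), and $K$ is some constant independent of $m$.

Proof. We know from Theorem 8 that

$$\left| p(i, m, g_{\rm end}) - \sum_{\lambda' \in \Lambda} \mathrm{Tr}(A_{\lambda'} M_{\lambda'}^m) \right| \leq 8 \left[ \delta \left( 1 + \frac{2\delta}{1 - 5\delta} \right) \right]^m, \tag{176}$$

with $A_{\lambda'}$ given in Eq. (101). From the definition of $k_\lambda(m)$, we can thus compute

$$k_\lambda(m) = \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sum_{i \in I} N_\lambda^{-1}\, \alpha_\lambda(i, g_{\rm end}) \sum_{\lambda' \in \Lambda} \mathrm{Tr}(A_{\lambda'} M_{\lambda'}^m) \tag{177}$$

$$\qquad + \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sum_{i \in I} N_\lambda^{-1}\, \alpha_\lambda(i, g_{\rm end}) \left[ p(i, g_{\rm end}, m) - \sum_{\lambda' \in \Lambda} \mathrm{Tr}(A_{\lambda'} M_{\lambda'}^m) \right]. \tag{178}$$

Considering only the first term, and inserting the definition of $\alpha_\lambda(i, g_{\rm end})$, we are interested in the SPAM operator quantity

$$B_{\lambda,\lambda'} = \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sum_{i \in I} \langle\!\langle \Pi_i | P_\lambda\, \omega(g_{\rm end}) | \rho_0 \rangle\!\rangle\, A_{\lambda'} \tag{179}$$

for $\lambda' \in \Lambda$. From the proof of Theorem 8 [Eq. (101)], we can recover an expression for the $n_{\lambda'} \times n_{\lambda'}$ matrix $A_{\lambda'}$:

$$[A_{\lambda'}]_{j,j'} = d_{\sigma_{\lambda'}}\, \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \mathrm{Tr}_{V_{\sigma_{\lambda'}}}\!\left\{ [\sigma_{\lambda'}(g_{\rm end}^{-1}) \otimes \mathbb{1}]\, R^{1}_{\lambda'}\, \mathcal{F}(P^{j}_{\lambda'}\, \omega\, P^{j'}_{\lambda'})\, L^{1\dagger}_{\lambda'} \right\} | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle, \tag{180}$$

where $P^{j}_{\lambda'}$ is the projector onto the $j$'th copy of $\sigma_{\lambda'}$ in the reference representation $\omega$ and $R_{\lambda'}$, $L_{\lambda'}$ encode the deviation of $\phi$ from $\omega$ (their precise shape is not relevant for our argument). By linearity, we can now consider

where we use that

$$\frac{1}{|G|} \sum_{g_{\rm end} \in G} P_\lambda\, \omega(g_{\rm end}) \otimes \sigma_{\lambda'}(g_{\rm end}) = \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sigma_\lambda^{\oplus n_\lambda}(g_{\rm end}) \otimes \sigma_{\lambda'}(g_{\rm end}) = \delta_{\lambda,\lambda'}\, \mathcal{F}(\omega)[\sigma_\lambda], \tag{184}$$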


which is the Fourier transform analog of the orthogonality of characters of irreducible representations. Hence $B_{\lambda,\lambda'} = \delta_{\lambda,\lambda'} B_{\lambda,\lambda} := B_\lambda$.

Plugging this back into the expression for $k_\lambda$ we get

$$k_\lambda(m) = \mathrm{Tr}(B_\lambda M_\lambda^m) + \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sum_{i \in I} N_\lambda^{-1}\, \alpha_\lambda(i, g_{\rm end}) \left[ \sum_{\lambda' \in \Lambda} \mathrm{Tr}[A(\Pi_i, g_{\rm end})_{\lambda'} M_{\lambda'}^m] - p(i, g_{\rm end}, m) \right]. \tag{185}$$

We can thus upper bound the difference $k_\lambda(m) - \mathrm{Tr}(B_\lambda M_\lambda^m)$ by considering the magnitude of the difference term. Note that we know from Theorem 8 that $\left| \sum_{\lambda' \in \Lambda} \mathrm{Tr}[A(\Pi_i, g_{\rm end})_{\lambda'} M_{\lambda'}^m] - p(i, m, g_{\rm end}) \right| \leq O(\delta^m)$. It follows that there exists a $K$ such that

$$\left| \frac{1}{|G|} \sum_{g_{\rm end} \in G} \sum_{i \in I} N_\lambda^{-1}\, \alpha_\lambda(i, g_{\rm end}) \left[ \sum_{\lambda' \in \Lambda} \mathrm{Tr}[A(\Pi_i, g_{\rm end})_{\lambda'} M_{\lambda'}^m] - p(i, m, g_{\rm end}) \right] \right| \leq 8K \left[ \delta \left( 1 + \frac{2\delta}{1 - 5\delta} \right) \right]^m. \tag{186}$$

Hence, the λ-filtered output data has essentially the same behavior as regular RB data, except that only the Fourier mode associated with $\sigma_\lambda$ is included in the signal. One can think of the λ filter function $\alpha_\lambda$ as placing a δ-peak filter function centered on the "frequency" $\sigma_\lambda$. Note that by linearity we get essentially the same result if one defines a filter function associated with nonirreducible representations (via a direct sum of irreducible representations). This can be thought of as placing a frequency comb on the RB data. Finally, it is interesting to explicitly write down the form of the SPAM matrix $B_\lambda$ in the limit of no SPAM and perfect gates. In the case of a multiplicity-free reference representation $\omega$ we have

$$B_\lambda = N_\lambda^{-1} \sum_{i \in I} \langle\!\langle \Pi_i^{\otimes 2} | \mathcal{F}(\omega)[\sigma_\lambda] | \rho_0^{\otimes 2} \rangle\!\rangle, \tag{187}$$

which emphasizes the importance of the normalization constant (on which more later), but also the importance of choosing $\rho_0$ and $\{\Pi_i\}_{i \in I}$ such that $B_\lambda$ is nonzero.

B. Statistical estimation

When computing the filtered output data $k_\lambda(m)$ in the previous section we assumed we had access to the RB output data $p(i, g_{\rm end}, m)$ for all $i \in I$ and $g_{\rm end} \in G$. This is not realistic since both the size of the POVM $\{\Pi_i\}_{i \in I}$ and the size of the group $|G|$ can be exponential in the number of qubits. In practice, we need to construct a statistical estimator $\hat k_\lambda$ for $k_\lambda$, and argue that $\hat k_\lambda$ is a good approximation for a reasonable number of samples. This we do in this section.

Note that the normalization factor $N_\lambda$ is essential in lower bounding the magnitude of the filtered function $k_\lambda$ (i.e., making sure that the number $k_\lambda$ is not too small). However, this normalization factor can be proportional to the Hilbert-space dimension $d$, making it tricky to set up an estimator for $k_\lambda$ that has a sampling complexity that does not grow with $d$ (which would make sampling practically impossible for more than a few qubits). This is the task we turn to now. We can construct an estimator for $k_\lambda(m)$ essentially directly from its definition.

It is easy to see that the mean of this estimator is equal to the λ-filtered output data $k_\lambda(m)$. However, this does not mean that the associated estimation procedure is efficient. A priori the variance of the estimator could scale with Hilbert-space dimension $d$, since the magnitude of the filter function $N_\lambda^{-1} \alpha_\lambda$ does so in general. We cannot prove that this estimator is efficient for all groups $G$ and POVMs $\{\Pi_i\}_{i \in I}$. We can, however, make some partial statements. In particular, we can prove that the estimator is efficient as long as the POVM $\{\Pi_i\}_{i \in I}$ is generated by a

Algorithm 1

L
1 −1
k̂λ (m) = Nλ αλ (i, gend l )fi (gend l ) (188)
L
l=1

Algorithm 2. An estimator for kλ (m)


3-design. This is a restrictive condition, but not impossi- totically independent of the Hilbert-space dimension d.
ble to fulfill. We discuss how to implement such a POVM
after stating and proving the following theorem, which
Proof. First we calculate the effect of the 3-design condi-
essentially states that under the 3-design condition, the
tion on the normalization factor of the correlation function
variance of the estimator k̂λ (m) does not scale with the α(i, ·), by direct calculation we have
Hilbert-space dimension d. This means that the sampling
resources required by the protocol do not depend on the 1
number of qubits in the system, making the postprocessing Nλ = αλ (i, g) i |ω(g)|ρ0 , (189)
|G| g∈G i∈I
step scalable (at least with respect to sampling). We note
.
that this theorem gives an extremely crude bound on the d2 1
variance, and the actual variance is liable to be substan- = dψψ ⊗2 |ω(g)⊗2 Pλ ⊗ 1|ρ0⊗2 ,
|I | |G| g∈G
tially smaller. For simplicity, we assume that there is no
SPAM or gate noise, but the conclusions made here easily (190)
generalize.
d 2
1 Tr[Pλ (ρ0 )] Tr(ρ0 )
= Tr[ρ P (ρ
0 λ 0 )] + ,
Theorem 17: (Efficient estimators). Consider a uniform |I | d2 − 1 d2
RB experiment of sequence length m, with group G, refer- (191)
ence representation ω, measurement POVM { i }i∈I , and 2
1 d
initial state ρ0 , and further assume that the POVM { i }i∈I = Tr[ρ0 Pλ (ρ0 )] + Tr[Pλ (ρ0 )] , (192)
|I | d2 − 1
is an (exact) 3-design,
that is i =- d/|I ||χi χi | with states
|χi and 1/I i∈I |χi χi |⊗3 = dψ|ψ ψ|⊗3 . Then for
all λ ∈ the variance of the estimator k̂λ (m) is asymp-

where we use the fact that the Haar measure is invariant under unitary action to absorb the $\omega(g)$ dependence, as well as a standard formula for the second moment of a Haar average over the unitary group, see, e.g., Ref. [55, Proposition 37] or Ref. [54] [and that $\mathrm{Tr}(\rho_0) = 1$]. We can now calculate the variance. We denote by $\hat k_\lambda(m, g_{\rm end})$ the estimator of $\sum_{i \in I} N_\lambda^{-1} \alpha(i, g_{\rm end})\, p(i, m, g_{\rm end})$ for a fixed $g_{\rm end} \in G$. By the law of total variance we can write
) *
1
V[k̂λ (m)] = V k̂λ (m, gend ) + VG α(i, gend )p(i, m, gend ) (193)
|G| g ∈G i∈I
end
) *2
1 −2 1
≤ N α(i, gend )2 p(i, m, gend ) + N −1 α(i, gend )p(i, m, gend ) , (194)
|G| g ∈G i∈I λ |G| g ∈G i∈I λ
end end

by dropping the negative terms in the variances. We begin with calculating the second term. For this note that for all
gend ∈ G (again using the invariance of the Haar measure):

−1
d2
Nλ−1 α(i, gend )p(i, m, gend ) =I 2 Tr[ρ0 Pλ (ρ0 )] + Tr[Pλ (ρ0 )] (195)
i∈I
d −1
.
d2
× dψ ψ ⊗2 |{ω(g) ⊗ [EM φ ∗m (gend )ESP ]} Pλ ⊗ 1 |ρ0⊗2 (196)
I

d ∗m

= 2 Tr ρ0 Pλ EM φ (gend )ESP (ρ0 ) + Tr Pλ (ρ0 ) (197)
d −1
2 −1
d
× 2 Tr[ρ0 Pλ (ρ0 )] + Tr[Pλ (ρ0 )] , (198)
d −1

where we use the expression for $p(i, m, g_{\rm end})$ from Eq. (69). Note that this expression is asymptotically independent of the Hilbert-space dimension (depending only on how well the initial state overlaps with the projector $P_\lambda$). Next we discuss the first term, given by


1 −2
N α(i, gend )2 p(i, m, gend ) (199)
|G| g ∈G i∈I λ
end
.
−2 d
3
1
= Nλ dψψ ⊗3 |{ω(gend )⊗2 ⊗ [EM φ ∗m (gend )ESP ]} Pλ⊗2 ⊗ 1 |ρ0⊗3 (200)
|I | |G| g ∈G
2
end
3 .
d
= Nλ−2 2 dψψ ⊗3 |{Pλ⊗2 ⊗ [EM φ ∗m (e)ESP ]}|ρ0⊗3 . (201)
|I |

Here appears a third moment of a Haar average, which can be evaluated using Weingarten calculus (see, for instance, Eqs.
S35 and S36 in Ref. [54], Ref. [55] or Ref. [90] more generally). In this particular instance, we get
.

dψ ψ | Pλ (ρ0 ) |ψ ψ | Pλ (ρ0 ) |ψ ψ | EM φ ∗m (e)ESP (ρ0 ) |ψ (202)

Tr Pλ (ρ0 )|2t + 2 Tr{Pλ (ρ0 )|2t EM φ ∗m (e)ESP (ρ0 )}
= (203)
(d + 2)(d + 1)d
2
Tr Pλ (ρ0 )] Tr{Pλ (ρ0 )|t EM φ ∗m (e)ESP (ρ0 )} Tr Pλ (ρ0 )
+ + , (204)
d2 (d + 1) d3

where A|t = A − Tr(A)1 for matrices A. By isolating a common d−3 factor and plugging back in, we get

1 −2
N α(i, g)2 p(i, m, gend ) (205)
|G| g ∈G i∈I λ
end

2 Tr Pλ (ρ0 )|2t Tr{Pλ (ρ0 )|2t EM φ ∗m (e)ESP (ρ0 )
= (206)
(d + 2)(d + 1)d−2

Tr Pλ (ρ0 ) Tr{Pλ (ρ0 )|t EM φ ∗m (e)ESP (ρ0 )} 2
+ + Tr Pλ (ρ0 ) (207)
d−1 (d + 1)
2 −2
d
× 2 Tr[ρ0 Pλ (ρ0 )] + Tr[Pλ (ρ0 )] , (208)
d −1

which is again asymptotically independent of the Hilbert-space dimension.

Measurement POVMs that are proportional to 3-designs are not very common. However, when considering a system of $q$ qubits it is possible to construct one by considering computational basis measurements conjugated by a random element of the $q$-qubit Clifford group $C_q$. That is, we consider the POVM

$$\{\Pi_{x,C}\} = \left\{ \frac{1}{|C_q|}\, C |x\rangle\langle x| C^{\dagger} \;\middle|\; x \in \{0, 1\}^q,\ C \in C_q \right\}. \tag{209}$$

It is easy to see that this is a POVM,

$$\sum_{C \in C_q} \sum_{x \in \{0,1\}^q} \frac{1}{|C_q|}\, C |x\rangle\langle x| C^{\dagger} = \frac{1}{|C_q|} \sum_{C \in C_q} C C^{\dagger} = I, \tag{210}$$

and it is also proportional to a 3-design, because the multiqubit Clifford group is a unitary 3-design [91,92], and hence every orbit $\{C |x\rangle\langle x| C^{\dagger}\}_{C \in C_q}$ is a state 3-design (and thus so is the union over $x$).

We emphasize that the 3-design condition is only a sufficient condition for a controlled variance of the estimator for the filtered output data, which works for any group $G$ and subrepresentation $\sigma_\lambda$. For particular choices of $G$ and $\sigma_\lambda$ the estimator $\hat k_\lambda(m)$ might be efficient for other choices of the POVM $\{\Pi_i\}_{i \in I}$. It is, for instance, easy to see that the variance will also be controlled if the degree $d_\lambda$ of the irrep $\sigma_\lambda$ is small. This follows from the fact that the normalization factor $N_\lambda$ can be written as

$$N_\lambda = \frac{1}{d_\lambda} \sum_{i \in I} \mathrm{Tr}[\Pi_i P_\lambda(\Pi_i)]\, \mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] \tag{211}$$


so, assuming the POVM $\{\Pi_i\}_{i\in I}$ and the initial state $\rho_0$ can be chosen to have sufficient (larger than $1/d$) overlap with the subrepresentation $\sigma_\lambda$, the magnitude of the inverse normalization factor $N_\lambda^{-1}$, and hence the size of the support of the probability distribution $\{N_\lambda^{-1}\alpha(i,g_{\mathrm{end}})\}_{p(i,m,g_{\mathrm{end}})}$, is controlled by $1/d_\lambda$. Hence, if $d_\lambda$ is small, the estimator $\hat{k}_\lambda(m)$ is efficient. This follows because it is constructed by sampling from a bounded [$O(1)$ in $d$] random variable. Examples of this behavior have been noted in the literature [37,39,45].

Alternatively, there are situations where the dimension of the representation $\sigma_\lambda$ scales with the total Hilbert-space dimension $d$ but the estimator $\hat{k}_\lambda(m)$ is still efficient because the group $G$ under consideration is sufficiently randomizing (roughly, it spans its own 3-design due to the randomization over the ending gate $g_{\mathrm{end}}$). An example of this is the recently introduced linear-cross-entropy benchmarking procedure, which we discuss in the next section.

Finally, we would like to add that if one reuses the same experimental data $p(m,g_{\mathrm{end}})$ to estimate $k_\lambda(m)$ for different $\lambda$, the resulting estimates for $k_\lambda(m)$ (and consequently the associated decay rates) will be correlated. This must be taken into account when performing joint statistical inferences on estimates for several $M_\lambda$. This can of course be remedied by gathering new data for each representation label $\lambda$.

C. Example: linear cross-entropy benchmarking

Recently, Ref. [29] introduced an RB-like protocol referred to as linear-cross-entropy benchmarking, in short XEB. We see in this section that this protocol falls into the framework of the benchmarking schemes introduced here. In fact, it can be seen as uniform RB with $G$ the full unitary group, together with a postprocessing scheme that is a special case of the above filtering scheme. Let $\phi: U(2^q)\to S_d$ be an implementation map of the unitary group, let $\{\Pi_x\}_{x\in\{0,1\}^q}$ be the computational basis POVM, and let $\rho_0 = |0\rangle\langle 0|$. The linear cross-entropy fidelity is now given by

$$F_{\mathrm{XEB}} = d\sum_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\,|\langle x|U|0\rangle|^2\,\langle\!\langle\Pi_x|\mathcal{E}_M\phi(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{212}$$

with $\mathcal{E}_M$, $\mathcal{E}_{SP}$ being the usual SPAM error channels. Setting $\alpha(x,U) = |\langle x|U|0\rangle|^2 = \langle\!\langle\Pi_x|\omega(U)|\rho_0\rangle\!\rangle$ we see that $F_{\mathrm{XEB}}$ can be interpreted as an RB experiment of sequence length "0" with $g_{\mathrm{end}} = U$ together with postprocessing by correlation with the adjoint representation $\omega(U) = U\cdot U^\dagger$. Note that the dimensional factor almost precisely serves as the correct normalization factor for $\alpha(x,U)$, since

$$d\sum_{x}\int dU\,|\langle x|U|0\rangle|^4 = \frac{2d}{d+1}. \tag{213}$$

We can extend this interpretation by considering the linear cross entropy of a sequence of $m$ random unitaries (this is done implicitly in Ref. [29]). This gives

$$F_{\mathrm{XEB},m} = d\sum_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU_1\cdots dU_m\,|\langle x|U_m\cdots U_1|0\rangle|^2\,\langle\!\langle\Pi_x|\mathcal{E}_M\phi(U_m)\cdots\phi(U_1)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle. \tag{214}$$

Using the invariance of the Haar measure and the linearity of the trace and the tensor product we can rewrite this as

\begin{align}
F_{\mathrm{XEB},m} &= d\sum_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU_m\,|\langle x|U_m|0\rangle|^2\,\langle\!\langle\Pi_x|\mathcal{E}_M\phi^{*m}(U_m)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{215}\\
&= d\sum_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU_m\,|\langle x|U_m|0\rangle|^2\,p(x,U_m,m) \tag{216}
\end{align}

with $p(x,U_m,m)$ the output probability of a regular RB experiment. Now noting that $\omega(U)$ decomposes into the trivial representation (on the space $\{a|1\rangle\!\rangle \mid a\in\mathbb{C}\}$) and the adjoint representation [on the space $\{|A\rangle\!\rangle \mid \mathrm{Tr}(A)=0\}$] we apply Theorem 8 to the above to get

$$F_{\mathrm{XEB},m} = A_{\mathrm{tr}}\,s_{\mathrm{tr}}^m + A_{\mathrm{adj}}\,f_{\mathrm{adj}}^m \tag{217}$$

up to a correction exponentially small in $m$, where $s_{\mathrm{tr}}$ ($f_{\mathrm{adj}}$) is the largest eigenvalue of the Fourier transform of $\phi$ evaluated at the trivial (adjoint) representation. Recall that $s_{\mathrm{tr}} = 1$ if $\phi(U)$ is trace preserving for all $U$, and that we can moreover interpret $f_{\mathrm{adj}}$ as affinely related to the average fidelity (certainly in the gate-independent noise setting). Hence, through Theorem 8 and our general postprocessing scheme the linear-cross-entropy benchmarking procedure inherits both the stability and interpretation of uniform RB.

It is notable that the estimator $\hat{k}_\lambda(m)$, which in this case estimates the linear cross-entropy fidelity $F_{\mathrm{XEB},m}$, is actually efficient, in the sense of Theorem 17. We can sketch an argument for this by directly estimating the variance of the estimator. For this argument we assume gate-independent noise [i.e., $\phi(U) = A\omega(U)$ for some completely positive $A$]. Following Theorem 17, we have


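\begin{align}
\mathbb{V}[\hat{k}_\lambda(m)] &\le d^2\sum_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\,|\langle x|U|0\rangle|^4\,\langle\!\langle\Pi_x|\mathcal{E}_M\phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{218}\\
&\quad+ d^2\int_{\mathrm{Haar}} dU\,\Bigg[\sum_{x\in\{0,1\}^q}|\langle x|U|0\rangle|^2\,\langle\!\langle\Pi_x|\mathcal{E}_M\phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle\Bigg]^2 \tag{219}\\
&\le d^3\max_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\,|\langle x|U|0\rangle|^4\,\langle\!\langle\Pi_x|\mathcal{E}_M\phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{220}\\
&\quad+ d^4\max_{x,x'\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\,|\langle x|U|0\rangle|^2\,|\langle x'|U|0\rangle|^2 \tag{221}\\
&\qquad\times\langle\!\langle\Pi_x|\mathcal{E}_M\phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle\,\langle\!\langle\Pi_{x'}|\mathcal{E}_M\phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle. \tag{222}
\end{align}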
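The Haar fourth moment that fixes the normalization in Eq. (213), and that also enters the bound in Eqs. (218)–(222), can be sanity-checked numerically. The following is a minimal Monte Carlo sketch and not part of the original analysis. It assumes that $U|0\rangle$ for Haar-random $U$ is distributed as a normalized complex Gaussian vector; the function names, sample size, and seed are illustrative choices. It estimates $d\sum_x\int dU\,|\langle x|U|0\rangle|^4$, which should approach $2d/(d+1)$.

```python
import math
import random


def haar_random_state(d, rng):
    """Sample U|0> for Haar-random U as a normalized complex Gaussian vector."""
    amps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(d)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]


def estimate_fourth_moment(d, samples=20000, seed=1234):
    """Monte Carlo estimate of d * sum_x E_U |<x|U|0>|^4."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        psi = haar_random_state(d, rng)
        total += d * sum(abs(a) ** 4 for a in psi)
    return total / samples


if __name__ == "__main__":
    # Compare the estimate with the closed form 2d/(d+1) for a few dimensions.
    for dim in (2, 4, 8):
        print(dim, estimate_fourth_moment(dim), 2 * dim / (dim + 1))
```

The estimate is statistical, so agreement with $2d/(d+1)$ should only be expected up to sampling error.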

Using the gate-independent noise assumption and the fact that $\omega(U)(\rho) = U\rho U^\dagger$, the first term on the rhs of Eqs. (218)–(222) is a Haar integral of a degree-3 homogeneous polynomial in the entries of $U$, $\bar{U}$, and the second term is a Haar integral of a degree-4 homogeneous polynomial. The asymptotic behavior of such integrals (in the limit of large $d$) is well known [90] and evaluates to $O(d^{-3})$ and $O(d^{-4})$, respectively. Hence, the overall variance is $O(1)$ in $d$. One could fill in the exact constants by evaluating the Haar integrals (like we did in Theorem 17), but we do not pursue this here.

IX. RANDOMIZED BENCHMARKING AND AVERAGE FIDELITY

To date, we have treated the information extracted from RB procedures, and in particular the decay rates, as figures of merit in their own right, without establishing a direct connection to other well-known quantities such as the average gate fidelity. Indeed, this latter object is often portrayed as the conclusive result of an RB protocol. In this section, we provide a series of arguments to validate the interpretation of the RB parameters as standalone information, by showing that connecting RB decays to the average gate fidelity presents complications that are hard to overcome. The underlying reason for this incompatibility is the gauge-dependent nature of the average gate fidelity (as argued in Ref. [26]), which can be neither established nor controlled under RB. More precisely, in Sec. IX A we provide an explicit example showing that adopting a gauge to match the average gate fidelity gives rise to a channel that is not physical. In Sec. IX B, we substantiate our argument with an analysis of the expression of the entanglement fidelity, a quantity closely related to the average fidelity, in terms of RB decay parameters and the adopted gauge. Observing this expression we conclude that RB parameters and fidelity can be linked only if there is a close overlap between the dominant eigenvector of the ideal operator and the dominant, gauge-dependent left and right eigenvectors of its implemented version; the critical point is that ascertaining whether this requirement is met is not possible with an RB procedure. We want to highlight that this intricacy in connecting RB to other well-established quantities does not mean RB protocols are inherently flawed, but only that the information they provide has to be regarded independently, with decay rates as the defining quantities to characterize the accuracy of experimentally implemented sets of gates.

A. The depolarizing gauge and in-between noise average fidelity

In an attempt to resolve the apparent disconnect between fidelity and RB decay parameters in the gate-dependent noise setting, proposals have been made in Refs. [24] and [25] for the precise connection between RB decay rates and average fidelity. In Ref. [24], it has been noted that the output data of Clifford RB could be exactly fitted to a single exponential whose decay rate is exactly interpreted as the average fidelity of the "noise in between gates," a manifestly gauge-invariant quantity. Similarly, in Ref. [25] it has been argued that the decay of Clifford RB can be regarded as the average fidelity of the implementation with regard to a particular gauge choice, namely the one in which the average implementation inverted with the reference representation is precisely a depolarizing channel. We show here that (1) both of these statements can be generalized to RB with arbitrary groups, (2) both statements in fact say the exact same thing, and (3) both interpretations suffer from the same problem, namely that the channel of which the average fidelity is measured by RB is not necessarily a completely positive (CP) map (i.e., physical), even if the implementation map $\phi$ is.

In Ref. [24], the RB decay rate is interpreted as measuring the fidelity of "the noise in between gates." (A general version of) this construction goes as follows. For an implementation $\phi$ of a group $G$, close to some reference representation $\omega = \bigoplus_{\lambda\in\Lambda}\sigma_\lambda$, we can pick the dominant


eigenvectors $\mathrm{vec}(R_\lambda)$ of the Fourier transform $\mathcal{F}(\phi)$ evaluated at the irreducible subrepresentation $\sigma_\lambda\subset\omega$ (for now assuming no multiplicities; this easily generalizes). We can devectorize these eigenvectors and sum them up to create a superoperator $R$ with the property

$$\frac{1}{|G|}\sum_{g\in G}\phi(g)R\,\omega(g)^\dagger = R\,\Lambda_{\mathrm{dep}}, \tag{223}$$

where $\Lambda_{\mathrm{dep}}$ is the generalized depolarizing channel $\Lambda_{\mathrm{dep}} = \sum_{\lambda\in\Lambda} f_\lambda P_\lambda$ with $f_\lambda$ the eigenvalue corresponding to $R_\lambda$. Without loss of generality we can assume that $R$ is invertible (as a matrix). Note also that for any $\phi$ we can write $\phi(g) = R\omega(g)L(g)$, where $L(g)$ is some implementation map (not necessarily completely positive).

With this parametrization the noise between two gates $g, g'$ (which in this parametrization only depends on $g$) is given by $L(g)R$. The entanglement fidelity with regard to the identity, averaged over all $g\in G$, of this map is

\begin{align}
\frac{1}{|G|}\sum_{g\in G}F_{\mathrm{avg}}[L(g)R, 1] &= \frac{1}{|G|}\sum_{g\in G}F_{\mathrm{avg}}[R^{-1}R\omega(g)L(g)R\omega(g)^\dagger, 1]\\
&= F_{\mathrm{avg}}(\Lambda_{\mathrm{dep}}, 1), \tag{224}
\end{align}

where we use the linearity and unitary invariance of the average fidelity. Note that $F_{\mathrm{avg}}(\Lambda_{\mathrm{dep}}, 1) = \frac{1}{d^2-1}\sum_{\lambda\in\Lambda} f_\lambda d_\lambda - 1$ is precisely the average fidelity one would obtain by plugging the RB decay rates $f_\lambda$ into Eq. (242).

On the other hand, Ref. [25] connects the RB decay rates to the average fidelity of the implementation map $\phi$ in a particular gauge, that is, a particular choice of invertible superoperator $S$ such that

$$\frac{1}{|G|}\sum_{g\in G}F_{\mathrm{avg}}[S^{-1}\phi(g)S, \omega(g)] = F_{\mathrm{avg}}(\Lambda_{\mathrm{dep}}, 1). \tag{225}$$

This map $\phi_{\mathrm{dep}} = S^{-1}\phi S$ is called the depolarizing gauge. According to Ref. [25] the correct interpretation of the RB decay rates is that they measure the fidelity of the implementation map $\phi$ in the depolarizing gauge with respect to the reference implementation $\omega$. It turns out that the correct choice for $S$ is precisely the operator $R$ mentioned above, which can be easily seen by explicit computation:

\begin{align}
\frac{1}{|G|}\sum_{g\in G}F_{\mathrm{avg}}[R^{-1}\phi(g)R, \omega(g)] &= F_{\mathrm{avg}}\Bigg(\frac{1}{|G|}\sum_{g\in G}R^{-1}\phi(g)R\,\omega(g)^\dagger, 1\Bigg)\\
&= F_{\mathrm{avg}}\big(R^{-1}R\,\Lambda_{\mathrm{dep}}, 1\big). \tag{226}
\end{align}

We can connect the above two interpretations by inserting the parametrization $\phi(g) = R\omega(g)L(g)$ into the expression for $\phi_{\mathrm{dep}}$ as

$$\phi_{\mathrm{dep}}(g) = R^{-1}R\omega(g)L(g)R = \omega(g)L(g)R. \tag{227}$$

Hence, the depolarizing gauge is precisely the gauge in which each superoperator $\phi_{\mathrm{dep}}(g)$ is viewed as the ideal superoperator $\omega(g)$ preceded by the noise in between gates $L(g)R$ (in the sense of Ref. [24]). Hence, these two interpretations of the RB decay rates as corresponding to an average fidelity of "something" neatly map to each other.

A central open question in both the above constructions is whether the noise in between gates, or equivalently the noise in the implementation in the depolarizing gauge, can always be chosen to be a completely positive implementation map. This is essential if we want to consider these interpretations as actual descriptions of reality. Here we answer this question in the negative by giving an example (an adaptation of a construction given in Ref. [26]) of a pointwise CP implementation map $\phi$ where the noise in between gates (the implementation in the depolarizing gauge) is not completely positive. Let $G$ be the single-qubit Clifford group, and consider, in the Pauli basis, the following superoperators:

$$T(\gamma) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & \sqrt{\gamma} & 0 & 0\\ 0 & 0 & \sqrt{\gamma} & 0\\ 1-\gamma & 0 & 0 & \gamma \end{pmatrix},\quad M_1(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & \alpha & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},\quad M_2(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & \alpha^{-1} \end{pmatrix}. \tag{228}$$

From these we can construct the implementation $\phi(g) = T(\gamma)M_1(\alpha)\omega(g)M_2(\alpha)$, with $\omega(g)(\rho) = U_g\rho U_g^\dagger$ the standard reference representation. It is easy to see that the transformation to the depolarizing gauge is given by $M_2(\alpha)\phi(g)M_2(\alpha)^{-1} = M_2(\alpha)T(\gamma)M_1(\alpha)\omega(g)$. Equivalently, the noise in between gates is given by $M_2(\alpha)T(\gamma)M_1(\alpha)$. The claim is now that there exist pairs $\alpha, \gamma$ such that $\phi(g)$ is completely positive for all $g\in G$ but $M_2(\alpha)T(\gamma)M_1(\alpha)$ is not. An easy pathological example


can be obtained by setting $\gamma = 0$. In this case we have

$$\phi(g) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix},\qquad M_2(\alpha)T(0)M_1(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ \alpha^{-1} & 0 & 0 & 0 \end{pmatrix}. \tag{229}$$

Hence, for all $\alpha < 1$ the maps $\phi(g)$ are CP while the map $M_2(\alpha)T(0)M_1(\alpha)$ is not (this can be verified by using the complete positivity conditions for qubit channels from Ref. [93]). For $\gamma < 1$ one can always construct interval conditions on $\alpha$ such that the same holds. Hence, the interpretations [24,25] both suffer from a problem, namely that in order to imagine RB as "measuring the average fidelity" of some object, this object has to be chosen in a way that is not necessarily physical. This possibility was already indicated by both papers, but no explicit example was given. It is unclear how to resolve this problem: one could, for instance, try to find natural conditions on $\phi$ such that the noise in between gates, or equivalently the implementation in the depolarizing gauge, is always completely positive. Alternatively one could adopt the framework of Ref. [27], where one relaxes the problem by asking for a positive-gauged implementation map that has a fidelity approximately given by the RB decay rates (with approximate meaning small relative to $1 - f_\lambda$). This can be done for Clifford RB on a single qubit [27], but generalizing to higher dimensions seems difficult (although some work in this direction has been done [94]).

B. Connecting average fidelity and randomized benchmarking decay rates

In the previous subsection we showed that the depolarizing gauge does not always give rise to a CP implementation map, and hence cannot be connected in all cases to the average fidelity of a physical process. Here we want to investigate the link between fidelity and the RB decay parameters under a general gauge choice $S$. We do this using the tools of perturbation theory we have used earlier to establish Theorem 8.

1. The randomized benchmarking measurement outcome

Let us consider a special case of Theorem 8 corresponding to reference representations $\omega$ that are multiplicity-free (for simplicity), and make the gauge freedom $S$ explicit. In this situation, we can write the Fourier operator $\mathcal{F}(\omega)$ as a direct sum of rank-1 orthogonal projections, since from Eqs. (29) and (30) it follows that for each unitary irreducible representation $\sigma_\lambda$ of $G$

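$$\mathcal{F}(\pi)[\sigma_\lambda] = \begin{cases} |z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| \ \text{(a rank-1 orthogonal projection)} & \text{if } \pi \text{ and } \sigma_\lambda \text{ are equivalent irreducible representations},\\ 0 & \text{otherwise}. \end{cases} \tag{230}$$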
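Equation (230) can be illustrated on a small group. The sketch below is an illustration only and rests on two assumptions not fixed by the text above: the Fourier transform convention $\mathcal{F}(\pi)[\sigma] = |G|^{-1}\sum_g \overline{\sigma(g)}\otimes\pi(g)$ (Eqs. (29) and (30) are not reproduced here), and the choice of $G = S_3$ with its real two-dimensional irrep. All function names are hypothetical. Under these assumptions, $\mathcal{F}(\sigma)[\sigma]$ should be idempotent with unit trace (a rank-1 projection), while $\mathcal{F}(\pi)[\sigma]$ should vanish when $\pi$ is the trivial irrep.

```python
import math


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker product of two nested-list matrices."""
    rows_b, cols_b = len(b), len(b[0])
    return [[a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
             for j in range(len(a[0]) * cols_b)]
            for i in range(len(a) * rows_b)]


def s3_standard_irrep():
    """Real orthogonal 2-dim irrep of S3, listed as [e, r, r^2, s, s r, s r^2]."""
    c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    rot = [[c, -s], [s, c]]
    refl = [[1.0, 0.0], [0.0, -1.0]]
    rotations = [identity, rot, matmul(rot, rot)]
    return rotations + [matmul(refl, r) for r in rotations]


def fourier_transform(pi, sigma):
    """Assumed convention: F(pi)[sigma] = (1/|G|) sum_g conj(sigma(g)) (x) pi(g)."""
    order = len(pi)
    rows = len(sigma[0]) * len(pi[0])
    cols = len(sigma[0][0]) * len(pi[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for sigma_g, pi_g in zip(sigma, pi):
        term = kron([[x.conjugate() for x in row] for row in sigma_g], pi_g)
        for i in range(rows):
            for j in range(cols):
                total[i][j] += term[i][j] / order
    return total


if __name__ == "__main__":
    sigma = s3_standard_irrep()
    trivial = [[[1.0]] for _ in sigma]
    print(fourier_transform(sigma, sigma))    # equivalent irreps
    print(fourier_transform(trivial, sigma))  # inequivalent irreps
```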

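Furthermore, we also assume that the Fourier transform $\mathcal{F}(\sigma_\lambda)$ is a diagonalizable operator. Since the set of diagonalizable matrices is dense [95], it is always possible to find such a diagonalizable matrix arbitrarily close to any given operator. We can thus write the Fourier transform of the implementation map on the irreducible representations appearing in the decomposition of $\omega$ as the perturbation $\tilde{E}(\sigma_\lambda) := \mathcal{F}(S\phi S^{-1} - \omega)[\sigma_\lambda]$ of the rank-1 operator $\mathcal{F}(\omega)[\sigma_\lambda]$,

\begin{align}
\mathcal{F}(S\phi S^{-1})[\sigma_\lambda] &= \mathcal{F}(\omega)[\sigma_\lambda] + \mathcal{F}(S\phi S^{-1} - \omega)[\sigma_\lambda] \tag{231}\\
&= \mathcal{F}(\omega)[\sigma_\lambda] + \tilde{E}(\sigma_\lambda) \tag{232}\\
&= f_{\max}(\sigma_\lambda)\,|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)| + \sum_{j_\lambda=1}^{d_\lambda-1} f_{j_\lambda}(\sigma_\lambda)\,|r_{j_\lambda}(\sigma_\lambda)\rangle\langle\ell_{j_\lambda}(\sigma_\lambda)|, \tag{233}
\end{align}

where $f_{\max}(\sigma_\lambda)$ is the largest eigenvalue of $\mathcal{F}(S\phi S^{-1})[\sigma_\lambda]$ and $\{f_{j_\lambda}\}_{j_\lambda=1}^{d_\lambda-1}$ are the other eigenvalues. The sets of left and right eigenvectors form a biorthogonal system, that is, $\langle\ell_{\max}(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle = \langle\ell_{j_\lambda}(\sigma_\lambda)|r_{j_\lambda}(\sigma_\lambda)\rangle = 1$ and $\langle\ell_{\max}(\sigma_\lambda)|r_{j_\lambda}(\sigma_\lambda)\rangle = \langle\ell_{j_\lambda}(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle = \langle\ell_{j_\lambda}(\sigma_\lambda)|r_{k_\lambda}(\sigma_\lambda)\rangle = 0$ for $j_\lambda\neq k_\lambda$. The important remark that we should make here is that this basis of eigenvectors reflects the gauge transformation $S\phi S^{-1}$.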


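In this scenario, we can thus write Eq. (75) in the proof of Theorem 8 for $g_{\mathrm{end}} = 1$ as

\begin{align}
p(i,m) &= \sum_{\lambda\in\mathrm{Irr}(G)} d_\lambda\,\langle\!\langle\mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\{\mathcal{F}(S\phi S^{-1})^{m+1}[\sigma_\lambda][\sigma_\lambda(1)\otimes 1]\}\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{234}\\
&= \sum_{\lambda\in\Lambda}\Bigg( d_\lambda\,[f_{\max}(\sigma_\lambda)]^{m+1}\,\langle\!\langle\mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big(|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)|\big)\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{235}\\
&\qquad+ \sum_{j_\lambda} d_\lambda\,[f_{j_\lambda}(\sigma_\lambda)]^{m+1}\,\langle\!\langle\mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big(|r_{j_\lambda}(\sigma_\lambda)\rangle\langle\ell_{j_\lambda}(\sigma_\lambda)|\big)\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle\Bigg) \tag{236}\\
&\quad+ \sum_{\gamma\notin\Lambda} d_\gamma\,\langle\!\langle\mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_{\sigma_\gamma}}\{\mathcal{F}(S\phi S^{-1})^{m+1}[\sigma_\gamma][\sigma_\gamma(1)\otimes 1]\}\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle. \tag{237}
\end{align}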
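The decomposition in Eqs. (234)–(237) is a sum of exponentials in the sequence length $m$. The following minimal sketch is a scalar caricature of that structure, with hypothetical amplitudes and decay rates that are not taken from the text. It shows how the ratio of successive signal values approaches the dominant decay rate once $m$ is large enough for the subdominant terms to be suppressed.

```python
def rb_signal(m, amplitudes, decays):
    """Evaluate sum_k A_k * f_k**(m + 1), a scalar stand-in for Eqs. (235)-(236)."""
    return sum(a * f ** (m + 1) for a, f in zip(amplitudes, decays))


def effective_decay(m, amplitudes, decays):
    """Ratio p(m+1)/p(m), which tends to the dominant decay rate for large m."""
    return rb_signal(m + 1, amplitudes, decays) / rb_signal(m, amplitudes, decays)


# Hypothetical values: one dominant eigenvalue close to 1 and one subdominant one.
AMPLITUDES = [0.5, 0.3]
DECAYS = [0.98, 0.4]

if __name__ == "__main__":
    for m in (0, 5, 20, 50):
        print(m, effective_decay(m, AMPLITUDES, DECAYS))
```

For short sequences the subdominant term visibly biases the ratio; for long sequences it is numerically indistinguishable from the dominant rate.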

By Eq. (62), it follows that $f_{\max}(\sigma_\lambda)$ for each $\sigma_\lambda$ in the irreducible decomposition of $\omega$ is lower bounded by $1 - \|\tilde{E}(\sigma_\lambda)\|_2$, while the subdominant eigenvalues, which correspond to perturbations of the kernel of $\mathcal{F}(\omega)[\sigma_\lambda]$, are upper bounded by $\|\tilde{E}(\sigma_\lambda)\|_2$. Moreover, by Theorem 18 presented in Sec. X, the eigenvalues in those subspaces not related to irreducible representations appearing in the decomposition are again dominated by $\|\tilde{E}(\sigma_\lambda)\|_2$. Hence, we can choose $m$ large enough such that $f_{\max}^m(\sigma_\lambda)\gg f_{j_\lambda}^m(\sigma_\lambda)$ for all $f_{j_\lambda}(\sigma_\lambda)$ and for each irreducible representation $\sigma_\lambda$ occurring in the decomposition of $\omega$, and such that the leakage of the perturbation in nonoccurring irreducible subspaces is suppressed.

For these values of $m$, we then retrieve the formula for the power law in Eq. (63), but here with respect to 1-dim parameters,

$$p(i,m) \approx \sum_{\lambda\in\Lambda}[f_{\max}(\sigma_\lambda)]^{m+1}\,\xi(S,\sigma_\lambda,\Pi_i,\rho_0), \tag{238}$$

where $\xi(S,\sigma_\lambda,\Pi_i,\rho_0) := d_\lambda\,\langle\!\langle\mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big(|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)|\big)\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle$.

2. Average gate fidelity and entanglement fidelity

The first RB protocols based on the Clifford group [5,34] linked a single decay parameter $f$ to the average fidelity of a quantum channel $R$, under the assumption of gate-independent noise, i.e., $\phi(g) = R\omega(g)$. The relation is given by

$$F_{\mathrm{avg}}(R) = f + \frac{1-f}{d}. \tag{239}$$

This formula generalizes to uniform RB with an arbitrary group $G$ with reference representation $\omega = \bigoplus_{\lambda\in\Lambda}\sigma_\lambda^{\oplus n_\lambda}$, again under the assumption of gate-independent noise. However, it is more convenient to express it in terms of the entanglement fidelity, defined as

$$F_e(R) := \langle\!\langle\Phi|\,1\otimes R\,|\Phi\rangle\!\rangle = \frac{1}{d^2}\mathrm{Tr}(R), \tag{240}$$

with $|\Phi\rangle$ the maximally entangled state, where the trace is taken over the superoperators, and related to the average gate fidelity by

$$F_{\mathrm{avg}}(R) = \frac{dF_e(R)+1}{d+1}. \tag{241}$$

In particular, we have (first formally written down in Ref. [14])

$$F_{\mathrm{avg}}(R) = \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_\lambda\,\mathrm{Tr}(M_\lambda) \tag{242}$$

with $M_\lambda$ again an $n_\lambda\times n_\lambda$ matrix.

The connection between the RB decay rates and the fidelity has been challenged in Ref. [26], where it has been argued that the average fidelity and the output of RB are not related in a unique way. In doing so they introduced the concept of gauge freedom into the RB literature.

In the context of RB, gauge freedom is the observation that two implementation maps $\phi$ and $\phi'$ give rise to the same RB output data $p(m)$ if they are related by a similarity transformation $S$, i.e., $\phi' = S\phi S^{-1}$. However, the average fidelity of these implementation maps (relative to some reference implementation) will generally differ. Note that this is an issue even with the assumption of gate-independent noise; however, in this case there is a "canonical" choice of gauge for which the RB decay rates and the fidelity are related. In the gate-dependent noise scenario there is no such obvious gauge choice. The rest of this section will be concerned with this question.

The entanglement fidelity, averaged over $G$, can be expressed in terms of Fourier transforms (as has first been


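noted in Ref. [25]). Indeed, we have

\begin{align}
\mathbb{E}_g F_e[S\phi S^{-1}(g), \omega(g)] &= \mathbb{E}_g F_e[\omega^\dagger(g)S\phi S^{-1}(g)] \tag{243}\\
&= \frac{1}{d^2}\,\mathbb{E}_g \mathrm{Tr}[\omega^\dagger(g)S\phi S^{-1}(g)] \tag{244}\\
&= \frac{1}{d^2}\sum_{\lambda\in\mathrm{Irr}(G)} d_\lambda\,\mathrm{Tr}\big(\{\mathcal{F}(\omega)[\sigma_\lambda]\}^\dagger\,\mathcal{F}(S\phi S^{-1})[\sigma_\lambda]\big), \tag{245}
\end{align}

where we use the second Parseval identity (28).

At this point we can again make use of the property in Eq. (230) for $\mathcal{F}(\omega)[\sigma_\lambda]$ and the reformulation in Eq. (233) for $\mathcal{F}(S\phi S^{-1})[\sigma_\lambda]$ and write

\begin{align}
\mathbb{E}_g F_e[S\phi S^{-1}(g), \omega(g)] &= \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_\lambda\,\mathrm{Tr}\Bigg[|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)|\Bigg( f_{\max}(\sigma_\lambda)\,|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)| \tag{246}\\
&\qquad\qquad+ \sum_{j_\lambda} f_{j_\lambda}(\sigma_\lambda)\,|r_{j_\lambda}(\sigma_\lambda)\rangle\langle\ell_{j_\lambda}(\sigma_\lambda)|\Bigg)\Bigg] \tag{247}\\
&= \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\,f_{\max}(\sigma_\lambda)\,\langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle + \alpha_{\mathrm{res}}, \tag{248}
\end{align}

where we define the residuum term

$$\alpha_{\mathrm{res}} := \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\sum_{j_\lambda} f_{j_\lambda}(\sigma_\lambda)\,\langle z(\sigma_\lambda)|r_{j_\lambda}(\sigma_\lambda)\rangle\langle\ell_{j_\lambda}(\sigma_\lambda)|z(\sigma_\lambda)\rangle. \tag{249}$$

This establishes a connection between the decay parameters $f_{\max}(\sigma_\lambda)$ retrieved from Eq. (238) and the entanglement fidelity as expressed in Eq. (248).

We observe that this connection is complicated by two factors. Firstly, it depends on the gauge-dependent overlap $\langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle$ between the rank-1 projection and the perturbed dominant eigenvectors, a quantity that we cannot retrieve from RB data and that might deviate significantly from 1 depending on the gauge choice. Secondly, the residuum $\alpha_{\mathrm{res}}$ may be large, constituting a non-negligible part of the entanglement fidelity. The rest of the section is concerned with analyzing these gauge-dependent connective factors.

We begin by deriving a bound on $\alpha_{\mathrm{res}}$, showing that this term is small, more precisely, of third order in the gauge-dependent perturbation term $\tilde{E}(\sigma_\lambda)$. For this, we use Corollary 7, where in this specific case $a_1 = 1$ and $A_2 = 0_{(d^2-1),(d^2-1)}$ and where

$$Q_{z(\sigma_\lambda)} := X_2X_2^\dagger = 1 - |z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| \tag{250}$$

is the orthogonal complement of the projection $|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)|$. Then, the relation between unperturbed and perturbed dominant eigenvectors is given by

\begin{align}
|r_{\max}(\sigma_\lambda)\rangle &= |z(\sigma_\lambda)\rangle + Q_{z(\sigma_\lambda)}\tilde{E}(\sigma_\lambda)|z(\sigma_\lambda)\rangle + O[\|\tilde{E}(\sigma_\lambda)\|_2^2], \tag{251}\\
\langle\ell_{\max}(\sigma_\lambda)| &= \langle z(\sigma_\lambda)| + \langle z(\sigma_\lambda)|\tilde{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)} + O[\|\tilde{E}(\sigma_\lambda)\|_2^2]. \tag{252}
\end{align}

Furthermore, let us define the matrix

$$\tilde{K}(\sigma_\lambda) := \sum_{j_\lambda} f_{j_\lambda}(\sigma_\lambda)\,|r_{j_\lambda}(\sigma_\lambda)\rangle\langle\ell_{j_\lambda}(\sigma_\lambda)|, \tag{253}$$

where we have

$$\tilde{K}(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle = |0\rangle \quad\text{and}\quad \langle\ell_{\max}(\sigma_\lambda)|\tilde{K}(\sigma_\lambda) = \langle 0|, \tag{254}$$

and the bound on the 2-norm

\begin{align}
\|\tilde{K}(\sigma_\lambda)\|_2 &= \big\|\mathcal{F}(S\phi S^{-1})[\sigma_\lambda] - f_{\max}(\sigma_\lambda)\,|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)|\big\|_2 \tag{255}\\
&= \big\||z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| + \tilde{E}(\sigma_\lambda) - f_{\max}(\sigma_\lambda)\,|r_{\max}(\sigma_\lambda)\rangle\langle\ell_{\max}(\sigma_\lambda)|\big\|_2 \tag{256}\\
&= \Big\||z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| + \tilde{E}(\sigma_\lambda) - f_{\max}(\sigma_\lambda)\Big(|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| + Q_{z(\sigma_\lambda)}\tilde{E}(\sigma_\lambda)|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| \tag{257}\\
&\qquad+ |z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)|\tilde{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)} + Q_{z(\sigma_\lambda)}\tilde{E}(\sigma_\lambda)|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)|\tilde{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)}\Big)\Big\|_2 \tag{258}\\
&\le |1 - f_{\max}(\sigma_\lambda)|\,\big\||z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)|\big\|_2 + O[\|\tilde{E}(\sigma_\lambda)\|_2] \tag{259}\\
&\le O[\|\tilde{E}(\sigma_\lambda)\|_2], \tag{260}
\end{align}

where we use the fact that $|1 - f_{\max}(\sigma_\lambda)| \le \|\tilde{E}(\sigma_\lambda)\|_2$.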
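The statement that the dominant eigenvalue of a perturbed rank-1 projection stays within the perturbation norm of 1 can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not made in the text: the perturbation is a small real symmetric matrix (so that power iteration with a Rayleigh quotient is adequate), its entries are hypothetical, and the 2-norm is bounded from above by the Frobenius norm.

```python
import math


def matvec(a, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def dominant_eigenvalue(a, iterations=500):
    """Power iteration followed by a Rayleigh quotient (adequate for symmetric a)."""
    v = [1.0] * len(a)
    for _ in range(iterations):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return sum(x * y for x, y in zip(v, matvec(a, v)))


def frobenius_norm(a):
    """Frobenius norm, an upper bound on the spectral (2-)norm."""
    return math.sqrt(sum(x * x for row in a for x in row))


# Rank-1 projector |z><z| with z = (1, 0, 0), plus a hypothetical symmetric perturbation.
PROJECTOR = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
PERTURBATION = [[-0.02, 0.01, 0.0], [0.01, 0.005, -0.01], [0.0, -0.01, 0.0]]
PERTURBED = [[p + e for p, e in zip(row_p, row_e)]
             for row_p, row_e in zip(PROJECTOR, PERTURBATION)]

F_MAX = dominant_eigenvalue(PERTURBED)

if __name__ == "__main__":
    print("f_max          =", F_MAX)
    print("|1 - f_max|    =", abs(1.0 - F_MAX))
    print("||E||_F bound  =", frobenius_norm(PERTURBATION))
```

In the general setting of the text the perturbation need not be symmetric, in which case left and right eigenvectors differ and this simple iteration would have to be replaced by a general eigenvalue solver.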


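Now, inserting Eqs. (251)–(253) into Eq. (249) and using the Cauchy-Schwarz inequality, we obtain the following bound on the residuum:

\begin{align}
|\alpha_{\mathrm{res}}| &= \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\Big|\langle\ell_{\max}(\sigma_\lambda)|\Big(1 - \tilde{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)} + O[\|\tilde{E}(\sigma_\lambda)\|_2^2]\Big)\tilde{K}(\sigma_\lambda)\Big(1 - Q_{z(\sigma_\lambda)}\tilde{E}(\sigma_\lambda) + O[\|\tilde{E}(\sigma_\lambda)\|_2^2]\Big)|r_{\max}(\sigma_\lambda)\rangle\Big| \tag{261}\\
&\le \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\Big|\langle\ell_{\max}(\sigma_\lambda)|\tilde{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)}\tilde{K}(\sigma_\lambda)Q_{z(\sigma_\lambda)}\tilde{E}(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\Big| \tag{262}\\
&\quad+ \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\,O[\|\tilde{E}(\sigma_\lambda)\|_2^3]\,\|\tilde{K}(\sigma_\lambda)\|_2\,\|\ell_{\max}(\sigma_\lambda)\|\,\|r_{\max}(\sigma_\lambda)\| + O[\|\tilde{E}(\sigma_\lambda)\|_2^4] \tag{263}\\
&\le \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\,O[\|\tilde{E}(\sigma_\lambda)\|_2^3]\,\|\ell_{\max}(\sigma_\lambda)\|\,\|r_{\max}(\sigma_\lambda)\| + O[\|\tilde{E}(\sigma_\lambda)\|_2^4]. \tag{264}
\end{align}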

This bound for αres has a significant implication: it means that the residuum will not cover the leading term in Eq. (248) if
the latter is (4E (σλ )22 ), for all gauge choices S that yield max (σλ ) · rmax (σλ ) smaller than 1/4
E (σλ )2 .
Note that it is important to compare αres to the difference between 1 (the value of the entanglement fidelity of a perfect
implementation) and the dominant eigenvalues in Eq. (248). This distance is indeed what RB protocols are designed to
detect, and in order for the connection between fidelity and decay rates to be meaningful we require αres to be negligible
in comparison. To analyze this further, we first write

1
max := dσ 1 − fmax (σλ ) z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) |, (265)
d2 λ∈ λ

and we calculate deviation of the absolute of the overlap from 1, which is remarkably only in second order in perturbation,

z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) (266)

4 4
= z(σλ ) | 1 + Qz(σλ ) E (σλ ) + O[E (σλ )2 ] |z(σλ )
2
z(σλ ) | 1 + E (σλ )Qz(σλ ) + O[E (σλ )2 ] |z(σλ )
4 4 2
(267)
2

≤ 1 + O[4 E (σλ )22 ] (268)

≤ 1 + O[4
E (σλ )22 ]. (269)

This bound on the overlap, together with the one on the residuum, implies that the parameters fmax (σλ ) obtained from the
fitting of the RB model in Eq. (238) yield a meaningful characterization of the fidelity on the condition when they are
[4E (σλ )2 ].
Having derived a bound on the residuum we can consider Eq. (248) in different regimes [always assuming small
perturbations, i.e., 4
E (σλ )2 1]. In the first regime we make the assumption

[4 E (σλ )22 ] = |1 − fmax (σλ )| 1 − z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) |, (270)

corresponding to the situation where the parameters {fmax (σλ )}λ∈ are more sensitive to the perturbation than the overlap of
the dominant eigenvectors. As we mentioned before, this is indeed the regime where RB provides a meaningful estimation
of the ﬁdelity. Indeed, we have
&
1
max ≥ 2 dσ |fmax (σλ ) − 1| · |z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) | (271)
d λ∈ λ
'

− 1 − z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) | (272)


& ! " '

In a second regime we can assume the converse, namely that

|1 − fmax (σλ )| 1 − z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) | = [4
E (σλ )22 ] (276)

holds true. This case is analogous, since we now have

&
1

max ≥ 2
dσλ 1 − z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) | (277)
d λ∈
'
− |fmax (σλ ) − 1| · |z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) | (278)
& ! "'
1
≥ 2 dσλ 1 − z(σλ )|rmax (σλ ) 4
max (σλ )|z(σλ ) | − |fmax (σλ ) − 1| 1 + O[E (σλ )2 ]
2
(279)
d λ∈
1
= dσ [4
E (σλ )22 ] (280)
d2 λ∈ λ
|αres |. (281)

This situation is, however, problematic, since RB gives us that

(282) (284)
−1
= EG SφS (g) − ω(g)2F , (285)

which is troublesome not only for the fact that we cannot where we apply Parseval’s identity. Note, however, that the
retrieve the overlap but also because in this case max may lhs of this expression runs over all irreducible representa-
be of the same magnitude or smaller than |αres |. Indeed, in tions of G and not the only ones decomposing ω.
this regime the residuum can then play a significant role in
the characterization of the average gate fidelity.
X. RANDOMIZED BENCHMARKING UNDER
The conclusion we draw from this analysis is that the
DIAMOND NORM AND FIDELITY CONSTRAINTS
overlap z(σλ )|rmax (σλ ) max (σλ )|z(σλ ) is the key factor
to consider when relating RB decays to the fidelity. This In Theorem 8, we have argued that randomized bench-
overlap must be sufficiently close to 1 under the adopted marking output data associated with an implementation of
gauge relative to the difference |1 − fmax (σλ )|. a group G could be approximated as a sum of (matrix)
Finally, we wish to relate {4 E (σλ )2 }λ to a promise exponentials provided the implementation map φ was
on a physical quantity related to the perturbation of the close to a reference representation ω with respect to the
ideal gate implementation ω. We recall that 4 E (σλ ) = diamond norm (averaged over all group elements). Here
F (SφS −1 − ω)[σλ ] and consider that ·2 ≤ ·F such we argue that this is a natural condition to demand in the


context of RB. In particular, we show that this condition is stable, in the sense that it is impossible to be close [in the sense of Eq. (72)] to two inequivalent representations at once, and, moreover, we show that this requirement cannot be replaced with a weaker one involving the average fidelity, resolving an open question in Ref. [25].

A. Stability of representations under diamond norm

First, we prove that "closeness to a representation" is a stable concept, that is, it is impossible to be close to two representations at once (in a suitable sense).

Theorem 18: (Stability of representations). Let $\phi$ be an implementation map of a group $G$ taking values in $S_d$ such that

$$\frac{1}{|G|}\sum_{g\in G}\|1 - \phi(g)\phi(g^{-1})\|_\diamond \le \delta \tag{286}$$

and let $\omega, \omega'$ be representations of $G$ on $V_n, V_{n'}$ with embedding maps $L: S_d\to V_n$, $L': S_d\to V_{n'}$ and $R: V_n\to S_d$, $R': V_{n'}\to S_d$ such that

\begin{align}
\frac{1}{|G|}\sum_{g\in G}\|\phi(g) - R\omega(g)L\|_\diamond &\le \epsilon, \tag{287}\\
\frac{1}{|G|}\sum_{g\in G}\|\phi(g) - R'\omega'(g)L'\|_\diamond &\le \epsilon'. \tag{288}
\end{align}

Moreover, assume that there exists $K$ such that $\|R\omega(g)L\|_\diamond \le K$, $\|R'\omega'(g)L'\|_\diamond \le K$ for all $g\in G$. If the inequality $K(\epsilon+\epsilon') + 3\delta + 2\epsilon + \epsilon^2 < 1$ holds then the representations $\omega, \omega'$ are equivalent on a subspace of dimension at least $d^2$.

Proof. Consider the map $LR': V_{n'}\to V_n$, as well as its twirled version

$$T = \frac{1}{|G|}\sum_{g\in G}\omega(g)\,LR'\,\omega'(g)^\dagger. \tag{289}$$

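We would like to argue that $T$ is a map of rank at least $d^2$, as then we can decide the theorem by application of Schur's lemma. To do this, consider the distance to the identity of the natural pullback of $T$ to $S_d$, namely $RTL'$. We can calculate

\begin{align}
\|1 - RTL'\|_\diamond &\le \Big\|1 - \frac{1}{|G|}\sum_{g\in G}R\omega(g)LR\omega(g)^\dagger L\Big\|_\diamond + \Big\|\frac{1}{|G|}\sum_{g\in G}R\omega(g)LR\omega(g)^\dagger L - RTL'\Big\|_\diamond \tag{290}\\
&\le \frac{1}{|G|}\sum_{g\in G}\Big(\|1 - R\omega(g)LR\omega(g)^\dagger L\|_\diamond + \|R\omega(g)LR\omega(g)^\dagger L - R\omega(g)LR'\omega'(g)^\dagger L'\|_\diamond\Big). \tag{291}
\end{align}

We upper bound these two terms separately. For the first term, consider

\begin{align}
&\frac{1}{|G|}\sum_{g\in G}\|1 - R\omega(g)LR\omega(g)^\dagger L\|_\diamond \tag{292}\\
&\quad\le \frac{1}{|G|}\sum_{g\in G}\Big(\|1 - \phi(g)\phi(g^{-1})\|_\diamond + \|1 - \phi(g)R\omega(g^{-1})L\|_\diamond\\
&\qquad\qquad+ \|1 - R\omega(g)L\phi(g^{-1})\|_\diamond + \|\phi(g) - R\omega(g)L\|_\diamond\,\|\phi(g^{-1}) - R\omega(g^{-1})L\|_\diamond\Big) \tag{293}\\
&\quad\le \delta + \epsilon^2 + \frac{1}{|G|}\sum_{g\in G}\Big(\|1 - \phi(g)\phi(g^{-1})\|_\diamond + \|\phi(g)\|_\diamond\,\|\phi(g^{-1}) - R\omega(g^{-1})L\|_\diamond \tag{294}\\
&\qquad\qquad+ \|1 - \phi(g)\phi(g^{-1})\|_\diamond + \|\phi(g) - R\omega(g)L\|_\diamond\,\|\phi(g^{-1})\|_\diamond\Big) \tag{295}\\
&\quad\le 3\delta + 2\epsilon + \epsilon^2, \tag{296}
\end{align}

where we exploit the submultiplicativity of the diamond norm and the fact that $\|\phi(g)\|_\diamond = 1$ for all $g\in G$. Similarly, for the second term we get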


\begin{align}
\frac{1}{|G|}\sum_{g\in G}\|R\omega(g)LR\omega(g)^\dagger L - R\omega(g)LR'\omega'(g)^\dagger L'\|_\diamond &= \frac{1}{|G|}\sum_{g\in G}\|R\omega(g)L[R\omega(g)^\dagger L - R'\omega'(g)^\dagger L']\|_\diamond \tag{297}\\
&\le \frac{1}{|G|}\sum_{g\in G}\|R\omega(g)L\|_\diamond\,\|R\omega(g)^\dagger L - R'\omega'(g)^\dagger L'\|_\diamond \tag{298}\\
&\le K(\epsilon+\epsilon'). \tag{299}
\end{align}

Combining all of this we get

$$\|1 - RTL'\|_\diamond \le K(\epsilon+\epsilon') + 3\delta + 2\epsilon + \epsilon^2 < 1, \tag{300}$$

by the assumptions of the theorem. Now assume that $T$ has an image of dimension strictly less than $d^2$. This means there exists a Hermitian $X\in M_d$ such that $RTL'(X) = 0$. But this implies that

$$\|1 - RTL'\|_\diamond \ge \frac{\|X - RTL'(X)\|_1}{\|X\|_1} = 1, \tag{301}$$

which is a contradiction. Hence, the rank of $T$ is at least $d^2$. Since $T$ by construction commutes with the representations $\omega, \omega'$ we can conclude that there exists a representation $\omega''$ of degree at least $d^2$ which is a subrepresentation of both $\omega$ and $\omega'$, and moreover that both $R\omega''(g)L$ and $R'\omega''(g)L'$ are of rank at least $d^2$ for all $g\in G$.

Next, we state a complementary theorem, saying that closeness to a representation is a concept stable under perturbations of the implementation. This is just a trivial consequence of the triangle inequality.

Theorem 19: (Stability of the closeness to a representation). Let $\phi, \phi'$ be implementations of a group $G$ on the superoperators $S_d$ such that

$$\frac{1}{|G|}\sum_{g\in G}\|\phi(g) - \phi'(g)\|_\diamond \le \delta \tag{302}$$

and let $\omega$ be a representation of $G$ on $V_n$ with associated maps $L: S_d\to V_n$ and $R: V_n\to S_d$ such that

$$\frac{1}{|G|}\sum_{g\in G}\|\phi(g) - R\omega(g)L\|_\diamond \le \epsilon; \tag{303}$$

then

$$\frac{1}{|G|}\sum_{g\in G}\|R\omega(g)L - \phi'(g)\|_\diamond \le \delta + \epsilon. \tag{304}$$

B. Randomized benchmarking under fidelity constraints

In this subsection, we argue that the condition Eq. (72) is in some sense necessary for the correct behavior of RB, in the sense that it cannot be replaced with a natural weaker condition. Given the worst-case nature of the diamond norm, Eq. (72) is rather restrictive, and one might wonder if it is possible to replace this diamond-norm constraint with a more congenial constraint based on the average fidelity. That is, one can imagine replacing Eq. (72) with a constraint of the form

$$\frac{1}{|G|}\sum_{g\in G}F_{\mathrm{avg}}[\phi(g), \omega(g)] \ge 1 - \delta \tag{305}$$

for some $\delta > 0$. Indeed, this is the assumption made in Ref. [25] to prove a version of Theorem 8 for the Clifford group. There, it has been noted that in order to guarantee correct behavior the constant $\delta$ must be chosen inversely proportional to the Hilbert-space dimension ($\delta\sim 1/d$). It has been speculated that this dimensional scaling could perhaps be an artifact of the proof techniques used.

We argue that this scaling is in fact real, by providing an explicit family (inspired by example 8.1 in Ref. [96]) of examples of implementations $\phi_L$ (where $L$ is an integer independent of $d$) of a group $G$ with

$$\frac{1}{|G|}\sum_{g\in G}F_{\mathrm{avg}}[\phi_L(g), \omega(g)] \ge 1 - \frac{2L}{d} \tag{306}$$

relative to a reference implementation $\omega$ but with associated RB output data that is not even qualitatively of the form Eq. (63). In fact, by choosing $L$ large (but constant in $d$) we can obtain almost arbitrary nonexponential behavior in the RB output data associated with $\phi_L$.

Example 1: Real scaling. Choose $G$ to be the $q$-qubit Clifford group with standard reference implementation $\omega(g) = U_g\cdot U_g^\dagger$. Now let $\Lambda_L^\mu$ be a superoperator indexed by an integer $L$ and a real number $0\le\mu\le 1$, defined by its action on the basis matrices $|i\rangle\langle j|$ as

$$\Lambda_L^\mu(|i\rangle\langle j|) = \delta_{i,j}\sum_{k=1}^{d}[S_L^\mu]_{i,k}\,|k\rangle\langle k| \tag{307}$$


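with $S_L^\mu$ a $d \times d$ stochastic matrix of the form

$$
[S_L^\mu]_{i,j} =
\begin{cases}
\mu & \text{if } i = j \le L - 1,\\
1 & \text{if } i = j \ge L,\\
1 - \mu & \text{if } i = j - 1 \le L - 1,\\
0 & \text{otherwise.}
\end{cases} \tag{308}
$$

For convenience we write $\Lambda$ for $\Lambda_L^\mu$ in the following. It is easy to see that $\Lambda$ is a quantum channel and moreover that if $i, j \le L$ then $\Lambda(|i\rangle\langle j|) \in \mathrm{Span}\{|i'\rangle\langle j'| : i', j' \le L\}$.

Consider now the following implementation map defined by its action on $X \in M_d$:

$$
\phi(g)(X) = \Lambda(P_L X P_L) + U_g (I - P_L) X (I - P_L) U_g^\dagger, \tag{309}
$$

where $P_L$ is the projection onto the space $\mathrm{Span}\{|i\rangle : i \le L\}$. This map can be seen as checking whether a state is in the support of $P_L$ (through a measurement) and then applying $\Lambda$ or $U_g$ depending on the outcome. We can calculate the average fidelity $F_{\mathrm{avg}}[\phi(g), \omega(g)]$ directly as

$$
F_{\mathrm{avg}}[\phi(g), \omega(g)] = \int d\psi\, \mathrm{Tr}\big[\phi(g)(|\psi\rangle\langle\psi|)\,\omega(g)(|\psi\rangle\langle\psi|)\big] \tag{310}
$$

$$
= \int d\psi\, \mathrm{Tr}\big[U_g|\psi\rangle\langle\psi|U_g^\dagger\,\Lambda(P_L|\psi\rangle\langle\psi|P_L)\big] + \int d\psi\, \mathrm{Tr}\big[|\psi\rangle\langle\psi|(I - P_L)|\psi\rangle\langle\psi|(I - P_L)\big] \tag{311}
$$

$$
= \int d\psi\, \mathrm{Tr}\big[U_g|\psi\rangle\langle\psi|U_g^\dagger\,\Lambda(P_L|\psi\rangle\langle\psi|P_L)\big] + \int d\psi\,\big[1 - 2\langle\psi|P_L|\psi\rangle + (\langle\psi|P_L|\psi\rangle)^2\big] \tag{312}
$$

$$
\ge 1 - 2\int d\psi\,\langle\psi|P_L|\psi\rangle \tag{313}
$$

$$
= 1 - \frac{2L}{d}, \tag{314}
$$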
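The bound of Eqs. (310)-(314) and the behavior of the stochastic matrix of Eq. (308) can be sanity-checked numerically. The sketch below is illustrative only and rests on several assumptions: all function names are hypothetical, Haar-random unitaries stand in for the Clifford elements $U_g$ (the pointwise bound $F \ge 1 - 2\langle\psi|P_L|\psi\rangle$ does not depend on which unitary is used), and Eq. (308) is read so that only the states $1, \dots, L-1$ hop to the right, which keeps every row normalized. It also tabulates $[(S_L^\mu)^m]_{1,1} + [(S_L^\mu)^m]_{1,L}$, the matrix-power expression that the example identifies with the RB signal in Eq. (316).

```python
import numpy as np


def stochastic_matrix(d, L, mu):
    """Row-stochastic d x d matrix modelled on Eq. (308), stored with 0-based indices.

    Assumed reading: states 1..L-1 stay put with probability mu and hop one
    step to the right with probability 1 - mu; states L..d never move.
    """
    S = np.eye(d)
    for i in range(L - 1):
        S[i, i] = mu
        S[i, i + 1] = 1.0 - mu
    return S


def haar_unitary(d, rng):
    """Haar-random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale column k by phases[k]


def fidelity_sample(S, L, U, psi):
    """Return (Tr[U psi U^dag phi(g)(psi)], <psi|P_L|psi>) with phi(g) as in Eq. (309)."""
    in_PL = np.arange(len(psi)) < L
    weights = np.where(in_PL, np.abs(psi) ** 2, 0.0)  # diagonal of P_L psi P_L
    lam_diag = S.T @ weights                           # diagonal of Lambda(P_L psi P_L)
    chi = U @ psi                                      # U |psi>
    rest = U @ np.where(in_PL, 0.0, psi)               # U (I - P_L) |psi>
    fid = np.sum(lam_diag * np.abs(chi) ** 2) + np.abs(np.vdot(chi, rest)) ** 2
    return float(fid), float(weights.sum())


def walk_signal(S, L, m_max):
    """[(S^m)]_{1,1} + [(S^m)]_{1,L} for m = 0..m_max (1-based labels, requires L >= 2)."""
    row = np.zeros(S.shape[0])
    row[0] = 1.0
    values = []
    for _ in range(m_max + 1):
        values.append(row[0] + row[L - 1])
        row = row @ S
    return np.array(values)


rng = np.random.default_rng(0)
d, L, mu = 16, 4, 0.9
S = stochastic_matrix(d, L, mu)

samples = []
for _ in range(2000):
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    psi /= np.linalg.norm(psi)  # Haar-random pure state
    samples.append(fidelity_sample(S, L, haar_unitary(d, rng), psi))
fids, overlaps = np.array(samples).T

print("Monte Carlo average fidelity:", fids.mean())
print("lower bound 1 - 2L/d:        ", 1 - 2 * L / d)
print("signal for m = 0..5:         ", walk_signal(S, L, 5))
print("signal at m = 200:           ", walk_signal(S, L, 200)[-1])
```

Because states $L, \dots, d$ are absorbing in this reading, the tabulated signal starts as $\mu^m$, dips, and then climbs back towards one, which is far from a single exponential decay.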

In Eq. (313) we make use of the fact that $\Lambda(P_L|\psi\rangle\langle\psi|P_L) \ge 0$, since $\Lambda$ is CP. Note that for constant $L$ we can make the fidelity arbitrarily high by choosing $d = 2^q$ large enough. Now consider RB with input state $\rho = |1\rangle\langle 1|$ and measurement POVM $\{|1\rangle\langle 1| + |L\rangle\langle L|,\ 1 - |1\rangle\langle 1| - |L\rangle\langle L|\}$ and implementation map $\phi_L$ as defined above. The RB probability for the POVM element $|1\rangle\langle 1| + |L\rangle\langle L|$ is going to be (setting $g_{\mathrm{end}} = e$ and assuming no SPAM errors)

$$
p(|1\rangle\langle 1| + |L\rangle\langle L|, m) = \mathrm{Tr}\big[(|1\rangle\langle 1| + |L\rangle\langle L|)\,\phi_L^{*m}(|1\rangle\langle 1|)\big]. \tag{315}
$$

Note that since $P_L|1\rangle\langle 1| = |1\rangle\langle 1|P_L$ we have that $\phi_L^{*m}(g)(|1\rangle\langle 1|) = (\Lambda_L^\mu)^m(|1\rangle\langle 1|)$ for all $g$. From this it follows that

$$
p(|1\rangle\langle 1| + |L\rangle\langle L|, m) = \mathrm{Tr}\big[(|1\rangle\langle 1| + |L\rangle\langle L|)(\Lambda_L^\mu)^m(|1\rangle\langle 1|)\big] = [(S_L^\mu)^m]_{1,L} + [(S_L^\mu)^m]_{1,1}. \tag{316}
$$

This data shows curious behavior. For small sequence lengths we have $p(|1\rangle\langle 1| + |L\rangle\langle L|, m) \approx \mu^m$, but with increasing sequence length we observe wildly nonexponential behavior.

XI. CONCLUSIONS

In this work, we have introduced a comprehensive theory of RB. As such, it goes beyond a mere classification of known protocols (a task that we also hope to achieve). At the same time, it provides a deeper understanding, a more precise formulation and interpretation of what the data acquired in RB means, actionable advice to experimentalists and theoretical practitioners, and a conceptual platform from which new schemes can be derived. Specifically, we show how RB gives rise to exponential decays under broad classes of Markovian noise models, show (importantly in practical contexts) in what sense RB is robust to deviations from uniform sampling, and provide further evidence for the interpretation in terms of average gate fidelities. Maybe most important for our work to serve as a basis for substantial further development of methods and protocols are new conceptual insights into how inversion gates are, in contrast to common belief, not required for RB and into how large classes of groups in RB can become available by means of new filtering techniques. This contributes to overcoming the problem of isolating exponential decays in a fully scalable manner. First steps into exploiting the insights established here when devising new schemes have already been made [57–59]. We hope that this work provides a starting point


for a further rich class of new protocols of quantum certification and benchmarking, providing stringent and rigorous quality criteria, while respecting experimental needs and desiderata.

ACKNOWLEDGMENTS

J.H. would like to acknowledge helpful conversations with Michael Walter, Bas Dirkse, and Freek Witteveen. I.R. would like to thank Richard Kueng, Martin Kliesch, Marios Ioannou, Dominik Hangleiter, and Jonas Haferkamp for helpful discussions and Susane Calegari for contributions to the illustration. The authors would also like to acknowledge an anonymous referee for pointing out the correct way to include cycle benchmarking into the framework of Theorem 8. The Berlin team has been supported by the BMBF project DAQC, for which it introduces new methods for randomized benchmarking of near-term superconducting quantum platforms, and BMBF project MUNIQC-ATOMS, for which it introduces a starting point to develop schemes of analog randomized benchmarking. It has also been funded by the DFG (EI 519/9-1, for which this work develops ideas of signal processing, and DFG CRC 183, for which this is an internode work Berlin-Copenhagen, as well as DFG EI 519/14-1), and the Munich Quantum Valley (K-8). This work has also received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 817482 (PASQuanS), for which it assesses feasible benchmarking schemes in quantum computing and simulation, and the Einstein Foundation. E.O. has been supported by the Royal Society. A.H.W. thanks the VILLUM FONDEN for its support with a Villum Young Investigator Grant (Grant No. 25452) and its support via the QMATH Centre of Excellence (Grant No. 10059).

[1] J. Emerson, R. Alicki, and K. Zyczkowski, Scalable noise estimation with random unitary operators, J. Opt. B 7, S347 (2005).
[2] C. Dankert, R. Cleve, J. Emerson, and E. Livine, Exact and approximate unitary 2-designs and their application to fidelity estimation, Phys. Rev. A 80, 012304 (2009).
[3] B. Lévi, C. C. López, J. Emerson, and D. G. Cory, Efficient error characterization in quantum information processing, Phys. Rev. A 75, 022314 (2007).
[4] E. Magesan, J. M. Gambetta, B. R. Johnson, C. A. Ryan, J. M. Chow, S. T. Merkel, M. P. Da Silva, G. A. Keefe, M. B. Rothwell, T. A. Ohki, et al., Efficient Measurement of Quantum Gate Error by Interleaved Randomized Benchmarking, Phys. Rev. Lett. 109, 080505 (2012).
[5] E. Knill, D. Leibfried, R. Reichle, J. Britton, R. B. Blakestad, J. D. Jost, C. Langer, R. Ozeri, S. Seidelin, and D. J. Wineland, Randomized benchmarking of quantum gates, Phys. Rev. A 77, 012307 (2008).
[6] J. Emerson, M. Silva, O. Moussa, C. Ryan, M. Laforest, J. Baugh, D. G. Cory, and R. Laflamme, Symmetrized characterization of noisy quantum processes, Science 317, 1893 (2007).
[7] E. T. Campbell, B. M. Terhal, and C. Vuillot, Roads towards fault-tolerant universal quantum computation, Nature 549, 172 (2017).
[8] R. Barends, et al., Rolling quantum dice with a superconducting qubit, Phys. Rev. A 90, 030303 (2014).
[9] E. Onorati, A. H. Werner, and J. Eisert, Randomized Benchmarking for Individual Quantum Gates, Phys. Rev. Lett. 123, 060501 (2019).
[10] A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Characterizing universal gate sets via dihedral benchmarking, Phys. Rev. A 92, 060302 (2015).
[11] A. W. Cross, E. Magesan, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, Scalable randomized benchmarking of non-Clifford gates, npj Quant. Inf. 2, 16012 (2016).
[12] J. Helsen, X. Xue, L. M. K. Vandersypen, and S. Wehner, A new class of efficient randomized benchmarking protocols, npj Quant. Inf. 5, 1 (2019).
[13] A. Erhard, J. J. Wallman, L. Postler, M. Meth, R. Stricker, E. A. Martinez, P. Schindler, T. Monz, J. Emerson, and R. Blatt, Characterizing large-scale quantum computers via cycle benchmarking, Nat. Commun. 10 (2019).
[14] D. S. Franca and A. K. Hashagen, Approximate randomized benchmarking for finite groups, J. Phys. A 51, 395302 (2018).
[15] T. J. Proctor, A. Carignan-Dugas, K. Rudinger, E. Nielsen, R. Blume-Kohout, and K. Young, Direct Randomized Benchmarking for Multiqubit Devices, Phys. Rev. Lett. 123, 030503 (2019).
[16] J. Wallman, C. Granade, R. Harper, and S. T. Flammia, Estimating the coherence of noise, New J. Phys. 17, 113020 (2015).
[17] J. M. Gambetta, A. D. Córcoles, S. T. Merkel, B. R. Johnson, J. A. Smolin, J. M. Chow, C. A. Ryan, C. Rigetti, S. Poletto, T. A. Ohki, M. B. Ketchen, and M. Steffen, Characterization of Addressability by Simultaneous Randomized Benchmarking, Phys. Rev. Lett. 109, 240504 (2012).
[18] J. J. Wallman, M. Barnhill, and J. Emerson, Robust Characterization of Loss Rates, Phys. Rev. Lett. 115, 060501 (2015).
[19] J. J. Wallman, M. Barnhill, and J. Emerson, Robust characterization of leakage errors, New J. Phys. 18, 043021 (2016).
[20] S. Kimmel, M. P. da Silva, C. A. Ryan, B. R. Johnson, and T. Ohki, Robust Extraction of Tomographic Information via Randomized Benchmarking, Phys. Rev. X 4, 011050 (2014).
[21] I. Roth, R. Kueng, S. Kimmel, Y.-K. Liu, D. Gross, J. Eisert, and M. Kliesch, Recovering Quantum Gates from few Average Gate Fidelities, Phys. Rev. Lett. 121, 170502 (2018).
[22] S. T. Flammia and J. J. Wallman, Efficient estimation of Pauli channels (2019), ArXiv:1907.12976.
[23] J. Eisert, D. Hangleiter, N. Walk, I. Roth, D. Markham, R. Parekh, U. Chabaud, and E. Kashefi, Quantum certification and benchmarking, Nat. Rev. Phys. 2, 382 (2020).


[24] J. J. Wallman, Randomized benchmarking with gate-dependent noise, Quantum 2, 47 (2018).
[25] S. T. Merkel, E. J. Pritchett, and B. H. Fong, Randomized benchmarking as convolution: Fourier analysis of gate dependent errors (2018).
[26] T. Proctor, K. Rudinger, K. Young, M. Sarovar, and R. Blume-Kohout, What Randomized Benchmarking Actually Measures, Phys. Rev. Lett. 119, 130502 (2017).
[27] A. Carignan-Dugas, K. Boone, J. J. Wallman, and J. Emerson, From randomized benchmarking experiments to gate-set circuit fidelity: How to interpret randomized benchmarking decay parameters, New J. Phys. 20, 092001 (2018).
[28] A. Acin, I. Bloch, H. Buhrman, T. Calarco, C. Eichler, J. Eisert, J. Esteve, N. Gisin, S. J. Glaser, F. Jelezko, S. Kuhr, M. Lewenstein, M. F. Riedel, P. O. Schmidt, R. Thew, A. Wallraff, I. Walmsley, and F. K. Wilhelm, The European quantum technologies roadmap, New J. Phys. 20, 080201 (2018).
[29] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. S. L. Brandao, D. A. Buell, et al., Quantum supremacy using a programmable superconducting processor, Nature 574, 505 (2019).
[30] A. Bouland, B. Fefferman, C. Nirkhe, and U. Vazirani, On the complexity and verification of quantum random circuit sampling, Nat. Phys. 15, 159 (2019).
[31] K. Noh, L. Jiang, and B. Fefferman, Efficient classical simulation of noisy random quantum circuits in one dimension, Quantum 4, 318 (2020).
[32] A. M. Dalzell, N. Hunter-Jones, and F. G. S. L. Brandão, Random quantum circuits transform local noise into global white noise (2021), ArXiv:2111.14907.
[33] Y. Liu, M. Otten, R. Bassirianjahromi, L. Jiang, and B. Fefferman, Benchmarking near-term quantum computers via random circuit sampling (2021), ArXiv:2105.05232.
[34] E. Magesan, J. M. Gambetta, and J. Emerson, Scalable and Robust Randomized Benchmarking of Quantum Processes, Phys. Rev. Lett. 106, 180504 (2011).
[35] A. K. Hashagen, S. T. Flammia, D. Gross, and J. J. Wallman, Real randomized benchmarking, Quantum 2, 85 (2018).
[36] J. M. Gambetta, A. D. Córcoles, S. T. Merkel, B. R. Johnson, J. A. Smolin, J. M. Chow, C. A. Ryan, C. Rigetti, S. Poletto, T. A. Ohki, et al., Characterization of Addressability by Simultaneous Randomized Benchmarking, Phys. Rev. Lett. 109, 240504 (2012).
[37] A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Characterizing universal gate sets via dihedral benchmarking, Phys. Rev. A 92, 060302 (2015).
[38] A. W. Cross, E. Magesan, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, Scalable randomised benchmarking of non-Clifford gates, npj Quant. Inf. 2, 16012 (2016).
[39] J. Helsen, X. Xue, L. M. K. Vandersypen, and S. Wehner, A new class of efficient randomized benchmarking protocols, npj Quant. Inf. 5, 1 (2019).
[40] W. G. Brown and B. Eastin, Randomized benchmarking with restricted gate sets, Phys. Rev. A 97, 062323 (2018).
[41] T. Chasseur and F. K. Wilhelm, Complete randomized benchmarking protocol accounting for leakage errors, Phys. Rev. A 92, 042333 (2015).
[42] C. J. Wood and J. M. Gambetta, Quantification and characterization of leakage errors, Phys. Rev. A 97, 032306 (2018).
[43] R. N. Alexander, P. S. Turner, and S. D. Bartlett, Randomized benchmarking in measurement-based quantum computing, Phys. Rev. A 94, 032303 (2016).
[44] J. Combes, C. Granade, C. Ferrie, and S. T. Flammia, Logical randomized benchmarking (2017), ArXiv:1702.03688.
[45] S. T. Flammia and J. J. Wallman, Efficient estimation of Pauli channels, Nat. Phys. (2020), ArXiv:1907.12976.
[46] R. Harper, S. T. Flammia, and J. J. Wallman, Efficient learning of quantum noise, Nat. Phys. (2020).
[47] R. Harper and S. T. Flammia, Estimating the fidelity of T gates using standard interleaved randomized benchmarking, Quant. Sc. Tech. 2, 015008 (2017).
[48] S. Sheldon, L. S. Bishop, E. Magesan, S. Filipp, J. M. Chow, and J. M. Gambetta, Characterizing errors on qubit operations via iterative randomized benchmarking, Phys. Rev. A 93, 012301 (2016).
[49] T. Chasseur, D. M. Reich, C. P. Koch, and F. K. Wilhelm, Hybrid benchmarking of arbitrary quantum gates, Phys. Rev. A 95, 062335 (2017).
[50] S. Kimmel, M. P. da Silva, C. A. Ryan, B. R. Johnson, and T. Ohki, Robust Extraction of Tomographic Information via Randomized Benchmarking, Phys. Rev. X 4, 011050 (2014).
[51] K. Boone, A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Randomized benchmarking under different gate sets, Phys. Rev. A 99, 032329 (2019).
[52] T. Kato, Perturbation Theory for Linear Operators, Vol. 132 (Springer-Verlag Berlin Heidelberg, Berlin, 1995).
[53] G. W. Stewart and Ji-Guang Sun, Matrix Perturbation Theory (Academic Press, Boston, 1990).
[54] H.-Y. Huang, R. Kueng, and J. Preskill, Predicting many properties of a quantum system from very few measurements, Nat. Phys. 16, 1050 (2020).
[55] M. Kliesch and I. Roth, Theory of quantum system certification, PRX Quantum 2, 010201 (2021).
[56] J. Helsen, J. J. Wallman, S. T. Flammia, and S. Wehner, Multiqubit randomized benchmarking using few samples, Phys. Rev. A 100, 032304 (2019).
[57] J. Helsen, S. Nezami, M. Reagor, and M. Walter, Matchgate benchmarking: Scalable benchmarking of a continuous family of many-qubit gates (2020), ArXiv:2011.13048.
[58] L. Kong, A framework for randomized benchmarking over compact groups, ArXiv:2111.10357.
[59] J. Helsen, M. Ioannou, I. Roth, J. Kitzinger, E. Onorati, A. H. Werner, and J. Eisert, Estimating gate-set properties from random sequences (2021), ArXiv:2110.13178.
[60] R. Goodman and N. R. Wallach, Representations and Invariants of the Classical Groups (Cambridge University Press, Cambridge, 2000).
[61] W. Fulton and J. Harris, Representation Theory: a First Course, Vol. 129 (Springer Science & Business Media, New York, 2013).
[62] W. T. Gowers and O. Hatami, Inverse and stability theorems for approximate representations of finite groups, Sbornik: Math. 208, 1784 (2017).
[63] M. E. Kilmer and D. P. O’Leary, Selected Works with Commentaries, edited by G.W. Stewart (Birkhäuser Basel, 2010).


[64] B. Dirkse, J. Helsen, and S. Wehner, Efficient unitarity randomized benchmarking of few-qubit Clifford gates, Phys. Rev. A 99, 012315 (2019).
[65] J. J. Wallman, M. Barnhill, and J. Emerson, Robust characterization of leakage errors, New J. Phys. 18, 043021 (2016).
[66] E. Magesan, R. Blume-Kohout, and J. Emerson, Gate fidelity fluctuations and quantum process invariants, Phys. Rev. A 84, 012309 (2011).
[67] C. A. Ryan, M. Laforest, and R. Laflamme, Randomized benchmarking of single- and multi-qubit control in liquid-state NMR quantum information processing, New J. Phys. 11, 013034 (2009).
[68] J. J. Wallman and S. T. Flammia, Randomized benchmarking with confidence, New J. Phys. 16, 103032 (2014).
[69] J. M. Epstein, A. W. Cross, E. Magesan, and J. M. Gambetta, Investigating the limits of randomized benchmarking protocols, Phys. Rev. A 89, 062321 (2014).
[70] B. H. Fong and S. T. Merkel, Randomized benchmarking, correlated noise, and Ising models (2017), ArXiv:1703.09747.
[71] M. A. Fogarty, M. Veldhorst, R. Harper, C. H. Yang, S. D. Bartlett, S. T. Flammia, and A. S. Dzurak, Nonexponential fidelity decay in randomized benchmarking with low-frequency noise, Phys. Rev. A 92, 022326 (2015).
[72] A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Bounding the average gate fidelity of composite channels using the unitarity, New J. Phys. 21, 053016 (2019).
[73] C. T. Kelley, Iterative Methods for Optimization (SIAM, Philadelphia, 1999).
[74] R. Harper, I. Hincks, C. Ferrie, S. T. Flammia, and J. J. Wallman, Statistical analysis of randomized benchmarking, Phys. Rev. A 99, 052350 (2019).
[75] P. R. Prony, Essai experimentale et analytique, J. de l’Ecole Polytechnique 1, 24 (1795).
[76] E. J. Candès and C. Fernandez-Granda, Super-resolution from noisy data, J. Fourier An. App. 19, 1229 (2013).
[77] E. J. Candès and C. Fernandez-Granda, Towards a mathematical theory of super-resolution, Comm. Pure App. Math. 67, 906 (2014).
[78] R. Schmidt, Multiple emitter location and signal parameter estimation, IEEE Trans. Ant. Prop. 34, 276 (1986).
[79] R. Roy, A. Paulraj, and T. Kailath, in MILCOM 1986-IEEE Military Communications Conference: Communications-Computers: Teamed for the 90’s, Vol. 3 (IEEE, 1986).
[80] W. Liao and A. Fannjiang, MUSIC for single-snapshot spectral estimation: Stability and super-resolution, Appl. Comp. Harm. An. 40, 33 (2016).
[81] A. Fannjiang, Compressive spectral estimation with single-snapshot ESPRIT: Stability and resolution (2016), ArXiv:1607.01827.
[82] W. Li and W. Liao, Stable super-resolution limit and smallest singular value of restricted Fourier matrices (2017), ArXiv:1709.03146.
[83] W. Li, W. Liao, and A. Fannjiang, Super-resolution limit of the ESPRIT algorithm (2019), ArXiv:1905.03782.
[84] R. Badeau, B. David, and G. Richard, High-resolution spectral analysis of mixtures of complex exponentials modulated by polynomials, IEEE Trans. Sig. Proc. 54, 1341 (2006).
[85] R. Badeau, G. Richard, and B. David, Performance of ESPRIT for estimating mixtures of complex exponentials modulated by polynomials, IEEE Trans. Sig. Proc. 56, 492 (2008).
[86] F. S. V. Bazan, Conditioning of rectangular Vandermonde matrices with nodes in the unit disk, SIAM J. Mat. An. App. 21, 679 (2006).
[87] L. T. Nguyen, J. Kim, and B. Shim, Low-rank matrix completion: A contemporary survey, IEEE Access 7, 94215 (2019).
[88] J. A. Tropp, User-friendly tail bounds for sums of random matrices, Found. Comput. Math. 12, 389 (2012).
[89] R. Ahlswede and A. Winter, Strong converse for identification via quantum channels, IEEE Trans. Inform. Th. 48, 569 (2002).
[90] A. Ginory and J. Kim, Weingarten calculus and the IntHaar package for integrals over compact matrix groups, J. Symb. Comp. (2019).
[91] Z. Webb, The Clifford group forms a unitary 3-design, Quantum Inf. Comput. 16, 1379 (2016).
[92] H. Zhu, Multiqubit Clifford groups are unitary 3-designs, Phys. Rev. A 96, 062336 (2017).
[93] M. B. Ruskai, S. Szarek, and E. Werner, An analysis of completely-positive trace-preserving maps on M2, Lin. Alg. App. 347, 159 (2002).
[94] A. Carignan-Dugas, M. Alexander, and J. Emerson, A polar decomposition for quantum channels (with applications to bounding error propagation in quantum circuits), Quantum 3, 173 (2019).
[95] D. J. Hartfiel, Dense sets of diagonalizable matrices, Proc. Am. Math. Soc. 123, 1669 (1995).
[96] M. M. Wolf, Quantum channels & operations. Guided tour. Lecture notes available at http://www-m5.ma.tum.de/foswiki/pubM, 5, 2012.

Richard Johnson
No ratings yet
Jaeger Distributed Tracing in Practice: Definitive Reference for Developers and Engineers
From Everand
Jaeger Distributed Tracing in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Continuous Delivery Engineering: Definitive Reference for Developers and Engineers
From Everand
Continuous Delivery Engineering: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Big-O Notation Demystified: Definitive Reference for Developers and Engineers
From Everand
Big-O Notation Demystified: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
OpenTelemetry in Practice: Definitive Reference for Developers and Engineers
From Everand
OpenTelemetry in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Chaos Mesh for Resilient Kubernetes Deployments: The Complete Guide for Developers and Engineers
From Everand
Chaos Mesh for Resilient Kubernetes Deployments: The Complete Guide for Developers and Engineers
William Smith
No ratings yet
Litmus Chaos Experiments in Practice: The Complete Guide for Developers and Engineers
From Everand
Litmus Chaos Experiments in Practice: The Complete Guide for Developers and Engineers
William Smith
No ratings yet
Essential Apache Beam: Definitive Reference for Developers and Engineers
From Everand
Essential Apache Beam: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
OpenTracing in Distributed Systems: Definitive Reference for Developers and Engineers
From Everand
OpenTracing in Distributed Systems: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Transformers: Principles and Applications
From Everand
Transformers: Principles and Applications
Richard Johnson
No ratings yet
CodePipeline in Depth: Definitive Reference for Developers and Engineers
From Everand
CodePipeline in Depth: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet

I. INTRODUCTION

and benchmarking. Particularly for quantum operations,

2691-3399/22/3(2)/020357(54) 020357-1 Published by the American Physical Society

protocols, and is largely an effort to organize and for-

(a) The data-collection phase corresponds to the part

the form combination of (matrix) exponential decays [as expressed

1 which will be helpful later.

and A1 = A1 + X1 † EX1 − X2 † EX1 P1 and A2 = A2 + X2 † sep(A2 , A1 ) > 0. (52)

and the property that Finally, to analyze perturbations of eigenvalues, we

|ã − aj | ≤ E. (62)

Algorithm 1. RB (data-collection phase)

case we find that it is most useful to interpret form

(c) Closeness to reference implementation: In order

Now we use from Theorem 6, the upper bounds on

and of Eq. (80) as

with assumption [replacing it with the more general diamond-

for the first term and

Finally, we can bound the second term in Eq. (121) as

1 where U # is the noisy implementation of the unitary U and φ

× φ{[gm t(pm ) . . . g1 t(p1 )]−1 }φ(gm )T#φ(pm )T# 1  

spec (Mλ ) = {zi(λ) }i=1

processing. Tr(Aλ Mλm ) = a(λ) (λ) m

a discrete series of data points is a well-studied problem

WL (z)2 We observe that the bound on Rnoise (z) is proportional to

This rotational invariance property is inherited by Usignal.

4. Conditioning of Vandermonde matrices

FIG. 3. The condition number

TABLE I. Examples of pole families for different numbers of poles n.

FIG. 5. The dependency of the

RB output data, limiting RB to groups with reference rep-

now have that

where we use that

Algorithm 2. An estimator for kλ (m)

F (SφS −1 )[σλ ] = F (ω)[σλ ] + F (SφS −1 − ω)[σλ ] (231)

1  max (σλ ) | = z(σλ ) | + z(σλ ) | 4

4(σλ )2 = F (SφS −1 )[σλ ] − fmax (σλ ) |rmax (σλ ) 

where we use the fact that |1 − fmax (σλ )| ≤ 4

z(σλ )|rmax (σλ )  max (σλ )|z(σλ ) (266)

In a second regime we can assume the converse, namely that

holds true. This case is analogous, since we now have

This situation is, however, problematic, since RB gives us that

Combining all of this we get

B. Randomized benchmarking under fidelity
