General Framework For Randomized Benchmarking
(Received 8 December 2020; revised 23 December 2021; accepted 4 March 2022; published 16 June 2022)
Randomized benchmarking refers to a collection of protocols that in the past decade have become central methods for characterizing quantum gates. These protocols aim at efficiently estimating the quality of a set of quantum gates in a way that is resistant to state preparation and measurement errors. Over the years many versions have been developed; however, a comprehensive theoretical treatment of randomized benchmarking has been missing. In this work, we develop a rigorous framework of randomized benchmarking general enough to encompass virtually all known protocols as well as novel, more flexible extensions. Overcoming previous limitations on error models and gate sets, this framework allows us, for the first time, to formulate realistic conditions under which we can rigorously guarantee that the output of any randomized benchmarking experiment is well described by a linear combination of matrix exponential decays. We complement this with a detailed analysis of the fitting problem associated with randomized benchmarking data. We introduce modern signal processing techniques to randomized benchmarking, prove analytical sample complexity bounds, and numerically evaluate performance and limitations. In order to reduce the resource demands of this fitting problem, we introduce novel, scalable postprocessing techniques to isolate exponential decays, significantly improving the practical feasibility of a large set of randomized benchmarking protocols. These postprocessing techniques overcome shortcomings in efficiency of several previously proposed methods such as character benchmarking and linear-cross-entropy benchmarking. Finally, we discuss, in full generality, how and when randomized benchmarking decay rates can be used to infer quality measures like the average fidelity. On the technical side, our work substantially extends the recently developed Fourier-theoretic perspective on randomized benchmarking by making use of the perturbation theory of invariant subspaces, as well as ideas from signal processing.
DOI: 10.1103/PRXQuantum.3.020357
efficient, in the sense that the resources required scale polynomially with the number of qubits in the device. The various versions of RB apply sequences of randomly chosen quantum gates of varying length. Small errors are thus amplified with the sequence length, and gate quality measures can be extracted from the dependence of the output data on sequence length.

In RB protocols, group structures feature strongly, in that the gate set considered is in almost all cases a subset of a finite group. Such group structures not only make it possible to efficiently make predictions for error-free sequences and compute inverses, but they also provide the means to analyze the error contribution after averaging. Originally proposed for random unitary gates [1–3], RB is now most prominently executed with gates from the so-called Clifford group [4–6], a set of efficiently classically simulatable quantum gates that take a key role specifically in fault-tolerant quantum computing [7]. It has also been considered for other (subsets of) finite groups [8–15]. Moreover, RB has been generalized to capture other figures of merit of gate sets, such as relative average gate fidelities to specific anticipated target gates [4], fidelities within a symmetry sector [9,10], or the unitarity [16]. More recently, with the challenges of realizing fault-tolerant quantum computers in mind, emphasis has been put on capturing losses, leakage, and crosstalk in RB schemes [17–19]. Also, data from RB (or rather, suitably combined data from multiple such experiments) can be sufficient to acquire full tomographic information about a quantum gate [20–22]. This adds up to a wealth of RB protocols [23] proposed over the previous years. Figure 2 summarizes (to our knowledge) an up-to-date list of theoretical proposals for RB procedures presently known.

A significant body of work moreover deals with the limitations and precise preconditions of RB. The originally rather stringent assumptions on noise being necessarily identical across different quantum gates have over time been relaxed for particular protocols in later work [24–26], and the connection between the output of RB and operationally relevant quantities (such as average fidelity) has been studied in some detail [26,27].

And yet, it seems fair to say that a comprehensive picture of RB schemes for the quantum technologies [28] has been lacking so far. In particular, a theoretical framework that is broad enough to formalize the required preconditions ensuring the proper functioning of RB protocols, beyond case-by-case arguments for specific protocols, has been missing. This is unsatisfactory, as the development of higher-quality quantum gates currently relies heavily on a plethora of tailor-made variants of RB. This motivates our current effort at providing a clear rigorous underpinning for RB and exploring its underlying mathematical structure, putting all variants of RB on a common footing.

With this effort we aim to not only better understand these protocols, but also to increase trust in them, making it possible to reliably use them without a detailed understanding of their inner workings. This is a timely effort, as procedures that fit within the RB framework, such as linear-cross-entropy benchmarking [29] and the behavior of noisy random circuits more generally, have been the topic of significant attention recently [30–32], including for the purpose of benchmarking [33]. Since we identify linear-cross-entropy benchmarking as a randomized benchmarking procedure, our general framework relates directly to this timely discussion.

At the same time our framework allows us to go significantly beyond current protocols and establish a series of novel theoretical results and benchmarking schemes, addressing several shortcomings of the current state of the art. Among others, these novel results include a rigorous error bound for generator-style randomized benchmarking, a formal equivalence between linear-cross-entropy benchmarking and randomized benchmarking, and a novel, scalable method for isolating signals in RB experiments, an absolute requirement if one wants to apply RB to nonstandard gate sets. This latter method, which we call filtered RB, is a significant conceptual improvement over standard RB schemes, promising greater flexibility and applicability. Notably, it also obviates the need for physically implemented inversion gates in randomized benchmarking experiments and the preparation of specific input states, making its implementation significantly more straightforward. As such, this framework also constitutes a solid basis for developing new schemes of randomized benchmarking. Altogether these results substantially advance the understanding of the possibilities and requirements of randomized benchmarking as a practical tool for estimation and certification.

II. OVERVIEW OF RESULTS

In this work, we aim at developing a mathematically comprehensive framework of randomized benchmarking protocols, synthesizing, generalizing, and substantially strengthening previous work. This paper covers a variety of different aspects of randomized benchmarking, from general theorems on the validity of RB data, to a detailed study of the classical postprocessing of data generated by RB and an in-depth discussion of the connection between the outputs of RB and average fidelity. As our work is often quite technical, we formulate a series of “take-home” messages at the end of this section, summarizing the key takeaways of our work for experimental practice.

A. A general framework for randomized benchmarking

We begin by providing a general framework that generalizes and covers (to the best of our knowledge) all RB procedures currently present in the literature. This can also be thought of as an attempt at a formal definition of RB
GENERAL FRAMEWORK FOR RANDOMIZED BENCHMARKING PRX QUANTUM 3, 020357 (2022)
J. HELSEN et al. PRX QUANTUM 3, 020357 (2022)
FIG. 2. An overview of RB schemes, indicating how they fit within our typology (see Sec. V D) of RB schemes and what theorem
covers the behavior of their output data (see Sec. VI). An ∗ indicates that the protocol has a nontrivial postprocessing scheme, while
∗∗ indicates that the protocol in its original specification has no inversion gate. We discuss how this is equal to uniform RB (with
inversion) together with a postprocessing step in Sec. VIII.
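As a minimal illustration of how a gate-quality measure is read off from the dependence of RB output data on sequence length, the following Python sketch assumes the simplest possible setting: gate-independent depolarizing noise, for which the data follows a single exponential p(m) = A f^m + B. The function names, the constants A = B = 0.5, and the value f = 0.98 are hypothetical choices made for this example only, and the conversion to an average fidelity uses the standard relation for a depolarizing channel in dimension d. It is a sketch under these assumptions, not the general multiexponential fitting procedure discussed in this work.

```python
# Idealized single-exponential RB model (an assumption for illustration):
# p(m) = A * f**m + B, with decay parameter f and SPAM-dependent constants A, B.

def rb_survival(m, f, A=0.5, B=0.5):
    """Idealized RB output data at sequence length m."""
    return A * f**m + B


def estimate_decay(p):
    """Estimate f from noiseless data p taken at consecutive sequence lengths.

    Differences of neighbouring points cancel the offset B, so
    (p[i+1] - p[i+2]) / (p[i] - p[i+1]) equals f for every i.
    """
    ratios = [(p[i + 1] - p[i + 2]) / (p[i] - p[i + 1]) for i in range(len(p) - 2)]
    return sum(ratios) / len(ratios)


def depolarizing_average_fidelity(f, d):
    """Average fidelity of the depolarizing channel rho -> f*rho + (1-f)*I/d."""
    return f + (1.0 - f) / d


f_true = 0.98                      # hypothetical decay parameter
lengths = range(1, 21)             # sequence lengths m = 1, ..., 20
data = [rb_survival(m, f_true) for m in lengths]

f_est = estimate_decay(data)
fidelity = depolarizing_average_fidelity(f_est, d=2)   # single qubit, d = 2
print(f"estimated f = {f_est:.6f}, average fidelity = {fidelity:.6f}")
```

The ratio-of-differences estimator is chosen here only because it needs no fitting library and is exact for noiseless data; with finite statistics a least-squares fit of the same model would be the more usual choice.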
exponentially small error, provided that the gate implementations are Markovian, time independent, and are on average close in diamond norm to an ideal implementation that is a representation. This closeness is independent of the particular RB protocol and independent of the underlying Hilbert-space dimension. The complete statement is given as Theorem 8 that can be summarized as follows.

Theorem 1: (Informal version of Theorem 8.) Consider a RB experiment with sequence length $m$, with gates uniformly drawn from a group $G$ and implemented through a reference representation $\omega(g) = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}(g)$. Denote the corresponding noisy implementation on the quantum computer as $\phi(g)$ (note that this assumes time independent and Markovian noise). If we have

$$\frac{1}{|G|}\sum_{g\in G} \|\omega(g) - \phi(g)\|_\diamond \le \delta \le \frac{1}{9}, \tag{2}$$

then the output data $p(m)$ of the RB experiment obeys the relation

$$\Big| p(m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}(A_\lambda M_\lambda^{m}) \Big| \le O(\delta^{m}), \tag{3}$$

with the error exponentially suppressed in $m$. Here $A_\lambda$ and $M_\lambda$ are $n_\lambda \times n_\lambda$ matrices, with $M_\lambda$ depending only on the actual implementation $\phi$.

The proof of this theorem relies on a combination of techniques from earlier works: taking the matrix Fourier-transform perspective introduced to RB in Ref. [25] and combining it with the realization in Ref. [24] that the diamond distance (averaged over random gates) is the correct distance measure for the formulation of assumptions on noisy gate implementations. We also make heavy use of the perturbation theory of invariant subspaces of non-normal matrices [52,53]. We note that the specific parameter 1/9 is an artifact of the proof techniques and probably suboptimal.

(b) Building on Theorem 8, we prove multiple theorems for nonuniform RB protocols. The first subtype, approximate RB, is covered by Theorem 9, a direct generalization of Theorem 8, and also features an exponentially suppressed error. For the second subtype, subset RB, on the other hand, we can give only a weaker statement, guaranteeing that the RB output data is described by a linear combination of exponentials up to constant error (in sequence length) as long as the sequence length $m$ is taken to be larger than a mixing length $m_{\mathrm{mix}}$. This mixing length indicates the moment where the $m$-fold convolution $\nu^{*m}$ of the probability distribution $\nu$, which governs the sampling of random gates, becomes close to the uniform distribution and is a function of both the initial distribution $\nu$ and the underlying group $G$. We can summarize our result on subset RB as follows.

Theorem 2: (Informal version of Theorem 10). Consider a RB experiment with sequence length $m$, with gates drawn from a group $G$ according to a probability distribution $\nu$ and implemented through a reference representation $\omega(g) = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}(g)$. Denote the corresponding (noisy) actual implementation on the quantum computer as $\phi(g)$. If we have, for some sequence length $m_{\mathrm{mix}}$, that

$$\sum_{g\in G} \nu(g)\, \|\omega(g) - \phi(g)\|_\diamond \le \frac{\delta}{m_{\mathrm{mix}}}, \tag{4}$$

$$\sum_{g\in G} \Big| \nu^{*m_{\mathrm{mix}}}(g) - \frac{1}{|G|} \Big| \le \delta', \tag{5}$$

and $\delta + \delta' \le 1/9$, then the output data $p(m)$ of the RB experiment obeys the relation

$$\Big| p(m) - \sum_{\lambda\in\Lambda} \mathrm{Tr}(A_\lambda M_\lambda^{m-m_{\mathrm{mix}}}) \Big| \le O(\delta + \delta'), \tag{6}$$

with the error bound independent of $m$. Here $A_\lambda$ and $M_\lambda$ are $n_\lambda \times n_\lambda$ matrices, with $M_\lambda$ depending only on the actual implementation $\phi$.

This theorem cannot guarantee an exponential error bound, but still improves on the state of the art [14,15], both in the generality of the assumptions made and the size of the possible error. Note also the appearance of the $m_{\mathrm{mix}}^{-1}$ term in the average diamond-norm deviation. This can be read as the requirement that the generating gates are of sufficiently high quality that any (composite) uniformly randomly chosen gate will be close in diamond norm to its ideal version. In this sense this requirement is of the same stringency as Eq. (2).

(c) We discuss the behavior of interleaved RB protocols, illustrating how standard interleaved RB, as well as all but one nonstandard interleaved RB protocol, are covered by Theorem 8. We consider two nonstandard interleaved RB protocols, namely cycle benchmarking [13], which is covered by our theorems in a nontrivial way, and robust RB tomography [50], which is not covered by our theorems. We argue that this is not a weakness of our argument but rather that the RB output data of this protocol behaves in a nonstandard manner, requiring tailor-made analysis.

(d) In Sec. X, we provide a discussion of the central assumption $|G|^{-1}\sum_{g\in G} \|\omega(g) - \phi(g)\|_\diamond \le \delta$, made on the behavior of noisy gates in the above
theorems. We argue that this assumption is a natural one to make (Theorem 18) and moreover that it cannot be replaced by a similar assumption involving the average fidelity without requiring the gate to be exponentially close to perfect in the number of qubits. This also answers an open question posed in Ref. [25] in the negative.

The unifying conceptual theme of all of our theorems is the fact that RB can be seen as a “power iteration in frequency space.” The behavior of the output data is dictated by the dominant eigenvalues of a fixed matrix that is obtained from the Fourier transform [25] (in a specific sense defined later) of the noisy implementation map $\phi$. Taking powers of this matrix results in the exponential suppression of all but the largest eigenvalues.

Together, these results provide a rigorous justification for the folkloric knowledge that RB protocols function under broad experimental circumstances.

C. A framework for randomized benchmarking data processing

The second phase of the RB protocol, the data-processing phase, takes in RB output data, which is well described by a linear combination of exponentials, and outputs the decay rates associated with those exponentials. If the data is well described by a single exponential decay this can be done by off-the-shelf curve-fitting procedures, but if the RB output data is of a more complex form (such as a linear combination of several exponentials) a more flexible approach is required. Here we provide a self-contained discussion of modern signal-processing methods for extracting decay parameters from data with a functional form given by Eq. (1). We review signal-processing algorithms, in particular the multiple signal classification (MUSIC) and estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithms, that are, at least in principle, applicable to the most general form of RB output data, even including matrix exponentials. Beyond that, we discuss theoretical guarantees that were derived for these algorithms and discuss their implications for RB data processing. Building upon these guarantees, we derive a sampling complexity statement that ensures the recovery of decay rates with these algorithms under measurements with finite statistics. We complement our analytical discussion with numerical evaluations and simulations that demonstrate the practical performance of these algorithms. Importantly, our discussions detail the fundamental limitations of postprocessing RB output data featuring many exponential decays.

D. A general postprocessing scheme for isolating exponential decays

Even with modern methods, fitting multiple exponential decays is a difficult affair, and in many scenarios one is only interested in a subset of the decay parameters that describe the output data of a particular RB experiment. Because of this, several methods have been developed to isolate particular exponential decays. Examples of this include the class of uniform RB protocols without inversion gates (indicated with a double asterisk “∗∗” in Fig. 2) and a variety of other protocols that take linear combinations of RB output data with different ending gates $g_{\mathrm{end}}$ to isolate particular exponential decays (indicated with a single asterisk “∗”). In Sec. VIII, we give a novel class of protocols called filtered RB that subsumes all these earlier approaches. For simplicity, we consider only uniform RB, but our results generalize to other types of RB.

This class of protocols is based on the realization that RB output data (indexed by an ending gate $g_{\mathrm{end}}$) can be seen as a vector in the group algebra of the group being benchmarked. This allows for the design of filter functions $\alpha_\lambda : G \to \mathbb{C}$, based on the matrix elements of irreducible representations, that isolate exponential decays associated with subrepresentations of the ideal implementation of the gates in the group $G$. Using these filter functions we can write down a general postprocessing scheme for the isolation of exponential decays and prove that it works when the assumptions of Theorem 8 are satisfied. We prove a theorem of the following form.

Theorem 3: [Theorem 16 (informal)]. Let $\alpha_\lambda : G \to \mathbb{C}$ be the filter function associated with the irreducible representation $\sigma_\lambda$ and let $p(m, g_{\mathrm{end}})$ be the output data associated with a uniform RB experiment with ending gate $g_{\mathrm{end}}$, satisfying the condition Eq. (2) with parameter $\delta$. We have that

$$k_\lambda(m) := \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G} \alpha_\lambda(g_{\mathrm{end}})\, p(m, g_{\mathrm{end}}) \tag{7}$$

satisfies

$$\big| k_\lambda(m) - \mathrm{Tr}(B_\lambda M_\lambda^{m}) \big| \le O(\delta^{m}), \tag{8}$$

with $M_\lambda$ associated with the irreducible subrepresentation $\sigma_\lambda$ [as per Eq. (1)].

Beyond this theoretical result we note that this novel class of protocols allows one (by a simple reparametrization) to eliminate the need for an explicitly implemented inversion gate in RB, making the protocol significantly simpler to implement in practice.

We also give a statistical analysis of this postprocessing scheme. In particular, we prove that if the measurement positive operator-valued measure (POVM) performed in the RB experiment is (proportional to) a state 3-design, the sample complexity of the complete benchmarking procedure (data collection plus postprocessing) is asymptotically independent of the dimension of the underlying
Hilbert space for arbitrary benchmarking groups. This is a strong improvement on previous attempts at such a general postprocessing procedure. Note that the 3-design condition appearing here plays a similar role in controlling the variance in scalable estimation procedures such as shadow estimation [54,55].

We stress, however, that the 3-design condition is a sufficient condition and there are examples in the literature covered by this postprocessing scheme where this condition is not met but the overall procedure is still scalable. In particular, we discuss the recently proposed linear-cross-entropy benchmarking procedure (XEB) [29] in Sec. VIII C. We argue that the variant of XEB that performs multiple random gate sequences is an example of uniform RB (as per the typology) combined with an instance of our general postprocessing scheme. Furthermore, we argue that the sample complexity of linear XEB is asymptotically independent of the underlying Hilbert-space dimension even though the POVM being measured is not itself a 3-design.

E. Randomized benchmarking and average fidelity

RB has originally been designed to estimate the average gate fidelity of a group of gates. Under the assumption of gate-independent noise, it can be proven (as has already been done in Ref. [1]) that the decay rates estimated in a RB experiment correspond exactly to the average fidelity of the noise associated to the gates. However, if this condition is relaxed, the connection between these decay rates and the average fidelity is less clear. Even more strongly, it has been argued in Ref. [26] that due to a so-called gauge freedom in the representation of the gate set, the entire premise of a connection between RB decay rates and average fidelity may be suspect. This is because the choice of the gauge does not influence the RB decay rates, but it does affect the average gate fidelity. Indeed, it has been shown that under some transformations the two quantities may differ by orders of magnitude, even in the gate-dependent noise case (where the previously proven connection can be seen as a “natural” gauge choice).

Subsequently proposals have been made to reconnect the average gate fidelity and RB decay rates in the context of standard Clifford RB: a natural gauge called the depolarizing gauge [25] and the noise-in-between-gates framework. Both of these proposals provide an exact connection between the decay rates of RB and the average fidelity. However, several crucial questions of interpretation have still been left open, and in this work we aim to address some of them, and sharpen others.

In Sec. IX B 2, we substantially generalize both proposed connections between decay rates and average fidelity to RB with arbitrary finite groups. What is more, we argue that these two proposals are in fact equivalent. Moreover, we present an explicit example of a completely positive implementation map, which is not completely positive in the depolarizing gauge (or equivalently has non-completely positive noise in-between gates). This implies that both these interpretations of RB decay rates are not fully satisfactory, because they cannot be guaranteed to correspond to the average fidelity of a physical process. That said, this does not mean that RB decay rates are not useful figures of merit, as they can always be interpreted as meaningful benchmarks in their own right.

Complementing this, following the approximate approach of Ref. [27], we show that the problem of connecting RB decay rates with the average gate fidelity can be (approximately) reduced to the deviation between the dominant (ideal) unperturbed eigenvectors and their (implemented) perturbed version in Fourier space. We show that, as long as this overlap is sufficiently close to 1, any gauge choice that corresponds to a completely positive and trace-preserving (CPT) channel will connect RB parameters to the average gate fidelity. Hence we obtain, under precise conditions, an approximate version of the connection between average fidelity and RB decay rates.

More formally, we leverage the Fourier-transform framework introduced in Ref. [25] to derive the following expression for the entanglement fidelity, which is linearly related to the average fidelity, averaged over all elements of the group as

$$F_e(\phi, \omega) = \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, f_{\max}(\sigma_\lambda)\, \alpha_{\mathrm{overlap}} + \alpha_{\mathrm{res}}, \tag{9}$$

where $f_{\max}(\sigma_\lambda)$ is the RB decay parameter associated with the irreducible subrepresentation $\sigma_\lambda$. In the Fourier framework $f_{\lambda,\max}$ corresponds to the largest eigenvalue of the Fourier transform of the implementation map $\phi$ evaluated at $\sigma_\lambda$. Furthermore, the parameter $\alpha_{\mathrm{overlap}}$ encodes the overlap between the (left and right) eigenvectors associated with this largest eigenvalue, and the eigenvector of the Fourier transform of the reference representation $\omega$ evaluated at $\sigma_\lambda$. Finally, the term $\alpha_{\mathrm{res}}$, the residuum, encodes information about the subdominant eigenspaces of the Fourier transform. The factors $\alpha_{\mathrm{overlap}}$, $\alpha_{\mathrm{res}}$ are gauge dependent. We give bounds on the overlap and residuum in terms of the deviation of $\phi$ from the reference $\omega$ and discuss relevant scenarios where these terms contribute only negligibly to the entanglement fidelity (and thus when RB decay data corresponds approximately to an average fidelity).

F. Nontechnical discussion

In this work, we develop a comprehensive theory of randomized benchmarking. Our main motivation has been our desire to give a mathematical framework for RB and to classify known schemes. It should be clear, however, that our work goes significantly beyond a mere classification of what is present in the literature. Since our
work is in parts rather technical, in the following we formulate a series of “take-home messages”: actionable advice for experimentalists interested in using RB in the laboratory and developing new protocols to suit their needs.

1. RB gives exponential decays under broad (Markovian) circumstances. Confirming experimental intuition, and extending earlier results for specific groups, our main result (Theorem 8) proves that RB protocols behave (up to an exponentially small correction factor) as expected whenever the noise afflicting the gate set is Markovian and time independent. Because the correction factor is so small, any deviation from the prescribed functional form can in fact be taken as evidence of non-Markovian or time-dependent noise processes (as suggested earlier by Ref. [24]). We do wish to emphasize that the error term in Theorem 8 can be quite significant for small sequence lengths. Hence we recommend as a rule of thumb that RB experiments should not include very short (m ≤ 5) sequence lengths, especially when strong gate-dependent (but Markovian) noise is suspected, as this might bias the estimator for the decay rate.
2. RB is broadly resistant to deviations from uniform sampling. Similar to robustness against gate-dependent Markovian noise, we prove (Theorem 9) that RB gives correct results even when the group is not being sampled exactly uniformly. This broadly justifies the use of (generically applicable) Markov chain techniques for sampling group elements [14], overcoming a key technical hurdle in running RB protocols with new groups.
3. The decay rates given by RB can be interpreted as an average fidelity (but caveats apply). We find that the decay rates of general RB experiments can always be exactly associated to the average fidelity of a fixed process; however, this process need not be physical (i.e., it does not always correspond to a completely positive and trace-preserving map). Alternatively, we show that RB decay rates can always be connected approximately to the average fidelity of a physical process, but the degree of approximation is dependent upon external beliefs about the underlying noise process. Hence, we believe the interpretation of RB decay rates as an average fidelity to be broadly valid, but subject to technical caveats.

These three messages can be considered folklore knowledge in the RB community, for which we provide a rigorous underpinning. However, our work also contains new conceptual developments, notably the following.

1. Filtering scalably extends RB to a large class of groups. As formalized in Sec. VII, a major practical hurdle to applying RB with arbitrary finite groups is that this generically requires the fitting of output data to multiexponential decays. This is a difficult problem both in theory and in practice and it has so far contributed to the limited experimental use of RB beyond a few groups (such as the Clifford group). Our new procedure, which we call filtering (or filtered RB), takes a major step towards solving this problem by giving a generic procedure for isolating exponential decays in a fully scalable manner. This approach is discussed in great detail in Sec. VIII, with the protocol given explicitly in Algorithm Box 2. This procedure is guaranteed to be scalable for all groups as long as the measurement POVM forms a 3-design, but we believe that it applies beyond that (see, in particular, the example of linear cross-entropy benchmarking discussed in Sec. VIII C).
2. Inversion gates are not required for RB. Another key practical difficulty in performing randomized benchmarking has been the necessity to compute and implement a global inversion gate. However, filtered RB has the bonus property that it does not require the application of inverses. Instead a random noisy gate sequence can be directly compared to a perfect classically simulated version to extract the same RB decay rates, making the quantum part of the protocol significantly easier to implement. However, this simplicity is gained at a (constant) extra sampling overhead, as the inversion gate in standard RB also suppresses the sampling complexity [56].

With these new contributions, our framework serves as a convenient basis to design new schemes that come with rigorous performance bounds built in. We expect this to facilitate and accelerate the development of more sophisticated and tailor-made benchmarking schemes as required by experimental practitioners. Steps in this direction have already been made [57–59]. In particular, Ref. [58] explores the framework put forth here for continuous groups of quantum gates.

G. Structure of this work

In Sec. III, we discuss mathematical preliminaries: we set the notation for the rest of the work and recall standard notions from representation theory. This section can be skipped by experienced readers.

In Sec. IV, we discuss implementation maps: linear maps from finite groups to superoperators, a central concept in our treatment of RB. We also give an introduction into matrix-valued Fourier theory and explicitly state
several results from the perturbation theory of non-normal matrices, which we use throughout the rest of the work.

In Sec. V, we give a general framework for RB, with its two phases: the data-collection and data-processing phases, and give a general protocol for the data-collection phase. This protocol, which depends on a range of input parameters, covers (the data-collection phase of) all known versions of RB. We also discuss a typology of RB schemes, dividing up the known protocols into a few generic classes.

In Sec. VI, we present a series of general theorems that govern the behavior of the output data of a RB protocol. We confirm the folklore knowledge that RB data is well described by a linear combination of (matrix) exponentials, under some general assumptions.

In Sec. VII, we discuss general procedures for extracting decay parameters from RB output data. We discuss implementation and general limitations and prove a sampling complexity statement for RB.

In Sec. VIII, we propose a general postprocessing method for isolating exponential decays associated with particular subrepresentations. We argue that this postprocessing method covers many previously proposed procedures. We also prove a sufficient condition under which this postprocessing scheme is scalable for any RB protocol and analyze linear cross-entropy benchmarking as an example.

In Sec. IX, we discuss the relation between the decay rates generated by RB and the average fidelity, focusing in particular on the gauge freedom in the presentation of the underlying noise channels.

Finally, in Sec. X, we argue that the assumptions made in Sec. VI are natural and in some sense necessary for the correct behavior of RB.

III. PRELIMINARIES: QUANTUM CHANNELS AND GROUP REPRESENTATIONS

In this section, we go over some of the basic mathematical machinery needed to talk about randomized benchmarking and prove our central theorems. We discuss quantum channels and their matrix representations (Sec. III A), and groups and group representations (Sec. III B). This is fairly standard material, and beyond the setting of notation it can be skipped by an experienced reader.

We begin by setting the stage and introducing some basic notation used throughout our work. We denote complex vector spaces by $V$ or more explicitly by $\mathbb{C}^d$. We denote by $M_d$ the vector space of complex linear transformations of $\mathbb{C}^d$ and by $S_d$ the space of linear transformations of $M_d$, often called superoperators. Here $d$ is an integer that in many cases can be thought of as being a power of

$V \otimes W$ for some $W$). Finally we denote the complex conjugate by a bar (i.e., $\bar{A}$ is the entrywise complex conjugate of $A$).

A. Quantum channels and the operator-matrix representation

Unitary operations as they are generated by quantum gates, which are the focus of attention in this work, are quantum channels. Formally, quantum channels are superoperators, that is, elements of $S_d$, that are trace preserving and completely positive. In order to represent quantum channels (and elements of $S_d$ more generally), we make use of the operator matrix representation. Given a quantum channel $\mathcal{E} \in S_d$, we can represent it as an element of $M_{d^2}$ by choosing an orthonormal basis (with respect to the trace or Hilbert-Schmidt inner product) $\{b_j\}_{j=1}^{d^2}$ for $M_d$. Thus $\mathcal{E}$ (abusing notation) is a $d^2 \times d^2$ matrix with components

$$\mathcal{E}_{j,k} := \mathrm{Tr}\big[ b_j^\dagger\, \mathcal{E}(b_k) \big]. \tag{10}$$

Analogously, (density) matrices $\rho \in M_d$ can be represented as vectors,

$$|\rho\rangle\rangle = \begin{pmatrix} \rho_1 \\ \rho_2 \\ \vdots \\ \rho_{d^2} \end{pmatrix} \quad \text{with} \quad \rho_k := \mathrm{Tr}\big[ b_k^\dagger \rho \big]. \tag{11}$$

Note that the action $\mathcal{E}(\rho)$ now corresponds to a matrix-vector multiplication $\mathcal{E}|\rho\rangle\rangle$ and the concatenation of two channels $\mathcal{E}$ and $\mathcal{E}'$ into a matrix multiplication $\mathcal{E}\mathcal{E}'$. We can analogously write a (POVM element) matrix $\Pi \in M_d$ as a covector

$$\langle\langle \Pi | = \begin{pmatrix} \Pi_1 & \Pi_2 & \cdots & \Pi_{d^2} \end{pmatrix} \quad \text{with} \quad \Pi_k := \langle \Pi, b_k \rangle = \mathrm{Tr}[\Pi\, b_k]. \tag{12}$$

With this, the probability to obtain an outcome described by the POVM element $\Pi$ when measuring $\rho$ is $p(\Pi|\rho) = \langle \Pi, \rho \rangle = \mathrm{Tr}[\Pi \rho]$.

B. Representations of groups

At the heart of our discussion are notions of representations of groups. In this section, we hence recall some basic facts about the representations of finite (and compact) groups over complex vector spaces, with a focus on their use in quantum computation. For a more in-depth treatment of this topic we refer to Refs. [60,61]. In this work we restrict our attention to finite groups, keeping the notation more concise. Most results can be analogously stated for continuous, compact groups and derived following
2 (d = 2q ), however, all theorems are valid for general d the same strategy. Reference [58] carefully discusses the
unless explicitly stated. We denote by TrV the partial trace required modifications and gives explicit reformulations
over a tensor factor V (of an implied tensor product space for continuous compact groups.
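The operator-matrix representation of Eqs. (10)-(12) can be sketched numerically for a single qubit. The sketch below assumes the normalized Pauli basis as the orthonormal basis $\{b_k\}$; the helper names (`kraus_channel`, `operator_matrix`, `ket`, `bra`) and the two example channels (a Hadamard gate and amplitude damping with $\gamma = 0.1$) are illustrative choices made here, not objects defined in the text.

```python
import numpy as np

# Normalized single-qubit Pauli basis {b_k}: orthonormal under the
# Hilbert-Schmidt inner product <A, B> = Tr[A^dagger B] (our choice of basis).
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
BASIS = [p / np.sqrt(2) for p in PAULIS]


def kraus_channel(kraus_ops):
    """Return the map rho -> sum_K K rho K^dagger for a list of Kraus operators."""
    return lambda rho: sum(k @ rho @ k.conj().T for k in kraus_ops)


def operator_matrix(channel):
    """Eq. (10): E_{j,k} = Tr[b_j^dagger E(b_k)]."""
    return np.array([[np.trace(bj.conj().T @ channel(bk)) for bk in BASIS]
                     for bj in BASIS])


def ket(rho):
    """Eq. (11): components rho_k = Tr[b_k^dagger rho]."""
    return np.array([np.trace(b.conj().T @ rho) for b in BASIS])


def bra(povm_element):
    """Eq. (12): components Pi_k = Tr[Pi b_k]."""
    return np.array([np.trace(povm_element @ b) for b in BASIS])


# Hypothetical example channels: a Hadamard gate and amplitude damping.
hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
gamma = 0.1
gate = kraus_channel([hadamard])
damping = kraus_channel([
    np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
    np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex),
])

rho = np.array([[1, 0], [0, 0]], dtype=complex)            # input state |0><0|
povm_plus = np.array([[1, 1], [1, 1]], dtype=complex) / 2  # POVM element |+><+|

# "Damping after Hadamard": composition of channels becomes a matrix product.
noisy_gate = operator_matrix(damping) @ operator_matrix(gate)
probability = (bra(povm_plus) @ noisy_gate @ ket(rho)).real
direct = np.trace(povm_plus @ damping(gate(rho))).real
```

Here `probability` is the covector-matrix-vector product and `direct` is the same quantity computed as $\mathrm{Tr}[\Pi\,\mathcal{E}(\rho)]$ without vectorization; the two should agree up to floating-point error.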
J. HELSEN et al. PRX QUANTUM 3, 020357 (2022)
1. Representations

Let $G$ be a finite group and consider the space $M_d$ of linear transformations of $\mathbb{C}^d$. A representation $\omega$ is a map $\omega : G \to M_d$ that preserves the group multiplication, i.e.,

\[ \omega(g)\omega(h) = \omega(gh), \quad \forall g, h \in G. \tag{13} \]

We require the operators $\omega(g)$ to be unitary as well (for finite groups this can always be done).

2. Reducible and irreducible representations

If there is a nontrivial subspace $W$ of $\mathbb{C}^d$ such that for all vectors $w \in W$ we have

\[ \omega(g) w \in W, \quad \forall g \in G, \tag{14} \]

then the representation $\omega$ is called reducible. The restriction of $\omega$ to the subspace $W$ is also a representation, which we call a subrepresentation of $\omega$. If there are no nontrivial subspaces $W$ such that Eq. (14) holds, the representation $\omega$ is called irreducible. We generally reserve the letter $\sigma$ to denote irreducible representations.

Two representations $\omega, \omega'$ of a group $G$ are called equivalent if there exists an invertible linear map $T$ such that

\[ T\omega(g) = \omega'(g) T, \quad \forall g \in G. \tag{15} \]

We denote this by $\omega \simeq \omega'$. For finite groups $G$ the set of irreducible representations (up to the above equivalence) is finite. We denote it by $\mathrm{Irr}(G)$.

3. Sums, products, and Maschke's lemma

We make use of sums and products of representations. Given representations $\omega, \omega'$, the maps

\[ \omega \oplus \omega' : G \to M_d \oplus M_{d'} : g \mapsto \omega(g) \oplus \omega'(g), \tag{16} \]
\[ \omega \otimes \omega' : G \to M_d \otimes M_{d'} : g \mapsto \omega(g) \otimes \omega'(g), \tag{17} \]

are again representations. They are, however, generally not irreducible (even if $\omega$ and $\omega'$ are). However, Maschke's lemma ensures that every representation $\omega$ of a group can be uniquely written as a direct sum of irreducible representations, that is

\[ \omega(g) \simeq \bigoplus_{\lambda \in \Lambda} \sigma_\lambda(g)^{\oplus n_\lambda}, \quad \forall g \in G, \tag{18} \]

where the index set $\Lambda$ is a subset of the set $\mathrm{Irr}(G)$ and $n_\lambda$ is an integer denoting the number of copies (or multiplicity) of $\sigma_\lambda$ present in $\omega$.

4. Characters

Characters are a central object in representation theory, given by the trace of a representation.

Definition 4: (Character of a representation). The character $\chi_\omega$ of a representation $\omega$ of a group $G$ is defined as

\[ \chi_\omega(g) = \mathrm{Tr}[\omega(g)]. \tag{19} \]

One of the most important properties for characters of irreducible representations is the following orthogonality relation.

Proposition 5: (Orthogonality formula). Let $\chi_\lambda, \chi_{\lambda'}$ be the characters of two irreducible representations $\sigma_\lambda, \sigma_{\lambda'}$ of a group $G$. Then

\[ \frac{1}{|G|} \sum_{g \in G} \overline{\chi_\lambda(g)}\, \chi_{\lambda'}(g) = \begin{cases} 1 & \text{if } \sigma_\lambda \simeq \sigma_{\lambda'}, \\ 0 & \text{if } \sigma_\lambda \not\simeq \sigma_{\lambda'}. \end{cases} \tag{20} \]

5. Projections onto irreducible representation

Given a representation $\omega = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$ on a vector space $V_\omega = \bigoplus_{\lambda \in \Lambda} V_\lambda^{\oplus n_\lambda}$ we can choose a basis $\{v_j^\lambda \mid j \in 1, \ldots, d_\lambda\}$ for each $V_\lambda$. Each vector $v$ in $V_\omega$ can thus be written as a linear combination $v = \sum_{\lambda \in \Lambda} \sum_{j=1}^{d_\lambda} c_j^\lambda v_j^\lambda$. We can conversely identify the basis vector components of any vector $v$ by application of an appropriate projection $P_j^\lambda$, such that $P_j^\lambda v = c_j^\lambda v_j^\lambda$, where

\[ P_j^\lambda = \frac{d_\lambda}{|G|} \sum_{g \in G} \overline{\sigma_\lambda(g)}_{j,j}\, \omega(g). \tag{21} \]

Note that, in order to construct these projections, the knowledge of the diagonal elements of the corresponding irreducible representation $\sigma_\lambda$ is required. However, it is also possible to project any vector onto distinct irreducible subspaces (up to multiplicity) by using only knowledge of the character of a representation:

\[ P_\lambda = \frac{d_\lambda}{|G|} \sum_{g \in G} \overline{\chi_\lambda(g)}\, \omega(g). \tag{22} \]

This last formula follows simply from the definition of the character as $\chi_\lambda(g) = \mathrm{Tr}[\sigma_\lambda(g)]$.

IV. FOURIER TRANSFORMS AND PERTURBATION THEORY OF IMPLEMENTATION MAPS

In this section, we review the concept of group implementation maps and their Fourier theory (Sec. IV B).
GENERAL FRAMEWORK FOR RANDOMIZED BENCHMARKING PRX QUANTUM 3, 020357 (2022)
Mathematically this corresponds to noncommutative harmonic analysis of matrix-valued functions. We also discuss perturbation theory for non-normal matrices. This material is somewhat less well known, so we spend more time discussing these concepts.

A. Implementation maps

Given a group $G$, we can assign quantum circuits [elements of $U(d)$] to each group element, which gives rise to a representation of the group. However, in practice, quantum circuits will not be executed perfectly, but rather include noise. This noise can be modeled by a quantum channel, and we can thus envision assigning to each group element a quantum channel modeling the real implementation of that circuit. These quantum channels can be composed, but this composition will not necessarily maintain group structure and will thus in general not form a representation. However, we can define the more general concept of an "implementation map" $\phi$, which is a function from a finite group $G$ to the space of superoperators $S_d$,

\[ \phi : G \to S_d, \tag{23} \]

where we usually assume that $\phi(g)$ is a trace nonincreasing quantum channel for all $g$. If we want to draw explicit attention to this fact we call $\phi$ completely positive if and only if $\phi(g)$ is completely positive for all $g \in G$. Finally, note that if $\phi(g)\phi(h) = \phi(gh)$ for all $g, h \in G$ then $\phi$ would be a representation. We can think of the implementation map as being an abstract presentation of the noisy implementation of the group elements, which depends on the noise processes in the quantum computer but also on other choices such as the compilation of circuits into elementary gates.

B. Fourier transforms of implementation maps

When considering an implementation map one can ask precisely when it is a representation, and failing that, if it is close to a representation (in some reasonable way). To answer this question we need to introduce some mathematical machinery. This machinery was first introduced into the theory of randomized benchmarking by Ref. [25], based on work by Gowers and Hatami [62], which is itself a partial review of older mathematical work. In this section, we consider general maps $\phi$ from a group $G$ to a space of $d \times d$ matrices $M_d$. Thinking of $S_d$ as a matrix space, our notion of implementation map can be seen to be a special case of these maps. Given a map $\phi$ we define its Fourier transform $\mathcal{F}(\phi)$ as

\[ \mathcal{F}(\phi)[\sigma_\lambda] = \frac{1}{|G|} \sum_{g \in G} \sigma_\lambda(g) \otimes \phi(g) \tag{24} \]

for all $\lambda \in \mathrm{Irr}(G)$. So the Fourier transform $\mathcal{F}(\phi)$ is a function from the set $\mathrm{Irr}(G)$ of irreducible representations of $G$ to a set of matrices. This definition has all the properties of a Fourier transform. Firstly, it has an inverse transform, which maps $\mathcal{F}(\phi)$ back to $\phi$, given by

\[ \mathcal{F}^{-1}[\mathcal{F}(\phi)](g) = \sum_{\lambda \in \mathrm{Irr}(G)} d_\lambda\, \mathrm{Tr}_{V_\lambda}\big\{ \mathcal{F}(\phi)[\sigma_\lambda]\, \big(\sigma_\lambda(g^{-1}) \otimes 1\big) \big\} \tag{25} \]

for all $g \in G$ and where $d_\lambda$ is the dimension of $V_\lambda$, the space on which the representation $\sigma_\lambda$ acts.

Secondly, it has the correct behavior with respect to convolutions of implementation maps: the Fourier transform of a convolution corresponds to a product of Fourier transforms. Recalling the definition of a convolution of two implementation maps $\phi, \phi'$

\[ (\phi * \phi')(g) = \frac{1}{|G|} \sum_{g' \in G} \phi(g g'^{-1})\, \phi'(g') \tag{26} \]

we can easily see the following:

\[ \begin{aligned} \mathcal{F}(\phi * \phi')[\sigma_\lambda] &= \frac{1}{|G|^2} \sum_{g, g' \in G} \sigma_\lambda(g) \otimes \phi(g g'^{-1})\, \phi'(g') \\ &= \frac{1}{|G|^2} \sum_{g, g' \in G} \sigma_\lambda(g g') \otimes \phi(g)\, \phi'(g') \\ &= \mathcal{F}(\phi)[\sigma_\lambda]\, \mathcal{F}(\phi')[\sigma_\lambda] \end{aligned} \tag{27} \]

for all $\lambda \in \mathrm{Irr}(G)$. Another useful property is the Parseval identity

\[ \frac{1}{|G|} \sum_{g \in G} \mathrm{Tr}\big[\phi(g)^\dagger \phi'(g)\big] = \sum_{\lambda \in \mathrm{Irr}(G)} d_\lambda\, \mathrm{Tr}\big\{ \mathcal{F}(\phi)[\sigma_\lambda]^\dagger\, \mathcal{F}(\phi')[\sigma_\lambda] \big\}. \tag{28} \]

Finally, we note that the Fourier transform (evaluated at an irreducible representation) of a representation is an orthogonal projector with its rank given by the multiplicity of that irreducible representation. To see this, consider a representation $\omega = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$. We have that

\[ \{\mathcal{F}(\omega)[\sigma_\lambda]\}^2 = \frac{1}{|G|^2} \sum_{g, g' \in G} \sigma_\lambda(g g') \otimes \omega(g g') = \frac{|G|}{|G|^2} \sum_{g \in G} \sigma_\lambda(g) \otimes \omega(g) = \mathcal{F}(\omega)[\sigma_\lambda] \tag{29} \]

for all $\lambda \in \mathrm{Irr}(G)$. Moreover for $\lambda \in \Lambda$ we have

\[ \mathrm{Tr}\{\mathcal{F}(\omega)[\sigma_\lambda]\} = \frac{1}{|G|} \sum_{g \in G} \chi_{\sigma_\lambda}(g)\, \chi_\omega(g) = n_\lambda \tag{30} \]

by the character orthogonality formula.
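The properties in Eqs. (22), (24)-(27), (29), and (30) can be checked on a small example. The sketch below is an illustration under assumptions made here, not part of the original analysis: it takes $G = S_3$ (whose irreducible representations can be chosen real, so complex conjugation plays no role), builds the two-dimensional irreducible representation from permutation matrices, and uses the nine-dimensional representation $\omega(g) = P(g) \otimes P(g)$ as a test case. All function names are hypothetical.

```python
import itertools
import numpy as np

# S3 as permutations of (0, 1, 2); compose(p, q) applies q first, then p.
GROUP = list(itertools.permutations(range(3)))


def compose(p, q):
    return tuple(p[q[i]] for i in range(3))


def inverse(p):
    inv = [0, 0, 0]
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)


def perm_matrix(p):
    mat = np.zeros((3, 3))
    for i, pi in enumerate(p):
        mat[pi, i] = 1.0
    return mat


def sign(p):
    inversions = sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3))
    return (-1.0) ** inversions


# Orthonormal basis of the subspace orthogonal to (1, 1, 1): carries the 2-dim irrep.
Q = np.array([[1, -1, 0], [1, 1, -2]], dtype=float).T
Q /= np.linalg.norm(Q, axis=0)

IRREPS = {
    "trivial": lambda p: np.eye(1),
    "sign": lambda p: np.array([[sign(p)]]),
    "standard": lambda p: Q.T @ perm_matrix(p) @ Q,
}


def fourier(phi, sigma):
    """Eq. (24): (1/|G|) sum_g sigma(g) (x) phi(g)."""
    return sum(np.kron(sigma(g), phi[g]) for g in GROUP) / len(GROUP)


def convolve(phi1, phi2):
    """Eq. (26): (phi1 * phi2)(g) = (1/|G|) sum_h phi1(g h^-1) phi2(h)."""
    return {g: sum(phi1[compose(g, inverse(h))] @ phi2[h] for h in GROUP) / len(GROUP)
            for g in GROUP}


def inverse_fourier(modes, g, d):
    """Eq. (25): sum_lambda d_lambda Tr_V{ F[sigma] (sigma(g^-1) (x) 1) }."""
    out = np.zeros((d, d))
    for name, sigma in IRREPS.items():
        s = sigma(inverse(g))
        dl = s.shape[0]
        x = (modes[name] @ np.kron(s, np.eye(d))).reshape(dl, d, dl, d)
        out = out + dl * np.einsum("iaib->ab", x)  # partial trace over V_lambda
    return out


def character_projector(omega, sigma):
    """Eq. (22): (d_lambda/|G|) sum_g conj(chi(g)) omega(g)."""
    dl = sigma(GROUP[0]).shape[0]
    return dl / len(GROUP) * sum(np.conj(np.trace(sigma(g))) * omega[g] for g in GROUP)


# Test representation omega = P (x) P and two random (non-representation) maps.
omega = {g: np.kron(perm_matrix(g), perm_matrix(g)) for g in GROUP}
rng = np.random.default_rng(0)
phi1 = {g: rng.normal(size=(9, 9)) for g in GROUP}
phi2 = {g: rng.normal(size=(9, 9)) for g in GROUP}

modes_omega = {name: fourier(omega, s) for name, s in IRREPS.items()}
multiplicities = {name: int(round(np.trace(f))) for name, f in modes_omega.items()}
modes_phi1 = {name: fourier(phi1, s) for name, s in IRREPS.items()}
projector_ranks = {name: int(round(np.trace(character_projector(omega, s))))
                   for name, s in IRREPS.items()}
```

In this example `multiplicities` recovers the decomposition of $P \otimes P$ into two trivial, one sign, and three two-dimensional components, and `projector_ranks` gives $d_\lambda n_\lambda$ for each irreducible representation.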
1. Fourier operators

We also give another, useful way to think about the matrix Fourier transform, namely in terms of what we call Fourier operators.

Note that the set of maps $\phi : G \to S_d$ can be seen as a vector space under pointwise addition (of the superoperators). We can further lift this vector space to an algebra by considering the convolution operator $*$ [as defined in Eq. (26)] on the functions in the vector space. We can construct a faithful (i.e., injective) matrix representation of this algebra as

\[ F(\phi) = \frac{1}{|G|} \sum_{g \in G} \bigoplus_{\lambda \in \mathrm{Irr}(G)} \overline{\sigma_\lambda(g)} \otimes \phi(g) \tag{31} \]

\[ \le \max_{g, \hat{g} \in G} \|\phi(g)\|\, \|\phi'(\hat{g})\| = \|F(\phi)\|_{\max}\, \|F(\phi')\|_{\max} \tag{34} \]

and similarly for $\|\cdot\|_m$. We also have an identity involving both norms

\[ \|F(\phi) F(\phi')\|_{\max} = \|F(\phi * \phi')\|_{\max} = \max_{g \in G} \|(\phi * \phi')(g)\| \tag{35} \]
\[ \le \max_{g \in G} \frac{1}{|G|} \sum_{\hat{g} \in G} \|\phi(g \hat{g}^{-1})\|\, \|\phi'(\hat{g})\| \le \|F(\phi)\|_{\max}\, \|F(\phi')\|_m, \tag{36} \]
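$A_1$ and $A_2$. This difference is made quantitative by the so-called separation function:

\[ \mathrm{sep}(A_1, A_2) = \min_{Z \neq 0} \frac{\|A_1 Z - Z A_2\|}{\|X_1 Z X_2^\dagger\|}. \tag{39} \]

This separation function has some rather nice properties. Firstly, it is symmetric in its arguments:

\[ \mathrm{sep}(A_1, A_2) = \mathrm{sep}(A_2, A_1). \tag{40} \]

Secondly, it is stable against perturbations, i.e., given a perturbation $A + E$ of $A$ we have

\[ |\mathrm{sep}(A_1 + E_1, A_2 + E_2) - \mathrm{sep}(A_1, A_2)| \le \|E_1\| + \|E_2\|. \tag{41} \]

With this function we can state the following theorem, which can be derived from Theorem 2.8 in Ref. [53, p. 238].

Theorem 6: (Reference [53]). Let $A$ be a complex Hermitian matrix with spectral resolution $\mathrm{diag}(A_1, A_2)$ induced by a unitary $X = [X_1, X_2]$. Also, let $\|\cdot\|$ be a matrix norm. Now let $E$ be a complex matrix. If $E$ has the properties

\[ \frac{\|X_1^\dagger E X_2\|\, \|X_2^\dagger E X_1\|}{\big[\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1\| - \|X_2^\dagger E X_2\|\big]^2} < \frac{1}{4}, \tag{42} \]
\[ \frac{\|X_1^\dagger E X_2\|\, \|X_2^\dagger E X_1\| + \|X_1^\dagger E X_2\|\, \|X_1^\dagger E X_2\|}{\big[\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1\| - \|X_2^\dagger E X_2\|\big]^2} < \frac{1}{2} \tag{43} \]

then there exist matrices $P_1, P_2$ such that

\[ \|P_1\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1\| - \|X_2^\dagger E X_2\|}, \tag{44} \]
\[ \|P_2\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1 + X_1^\dagger E X_2 P_1\| - \|X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2\|} \tag{45} \]

and

\[ [L_1, L_2]^\dagger (A + E) [R_1, R_2] = \begin{pmatrix} A_1' & 0 \\ 0 & A_2' \end{pmatrix}, \tag{46} \]

with

\[ [R_1, R_2] = [X_1, X_2] \begin{pmatrix} 1 & 0 \\ P_1 & I \end{pmatrix} \begin{pmatrix} 1 & P_2 \\ 0 & I \end{pmatrix}, \tag{47} \]
\[ [L_1, L_2]^\dagger = \begin{pmatrix} 1 & -P_2 \\ 0 & I \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -P_1 & I \end{pmatrix} [X_1, X_2]^\dagger, \tag{48} \]

and

\[ \begin{pmatrix} 1 & 0 \\ -P_1 & I \end{pmatrix} [X_1, X_2]^\dagger (A + E) [X_1, X_2] \begin{pmatrix} 1 & 0 \\ P_1 & I \end{pmatrix} = \begin{pmatrix} A_1' & E_{12} \\ 0 & A_2' \end{pmatrix}, \tag{51} \]

with $E_{12} = X_1^\dagger E X_2$ and $A_1' = A_1 + X_1^\dagger E X_1 + X_1^\dagger E X_2 P_1$ and $A_2' = A_2 + X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2$. Now considering the above as a perturbation of $A' = \begin{pmatrix} A_1' & 0 \\ 0 & A_2' \end{pmatrix}$ we can apply Theorem 2.8 from Ref. [53] again so long as

\[ \|P_2\| \le \frac{\|X_2^\dagger E X_1\|}{\mathrm{sep}(A_1, A_2) - \|X_1^\dagger E X_1 + X_1^\dagger E X_2 P_1\| - \|X_2^\dagger E X_2 - P_1 X_1^\dagger E X_2\|} \tag{54} \]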
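The separation function of Eq. (39) can be explored numerically in a special case. The sketch below assumes the Frobenius norm and isometric $X_1, X_2$, so that the denominator reduces to $\|Z\|_F$ and $\mathrm{sep}$ becomes the smallest singular value of the Sylvester map $Z \mapsto A_1 Z - Z A_2$. This is a simplification chosen for illustration; the theorem above is stated for a general matrix norm, and the function name `sep_frobenius` is hypothetical.

```python
import numpy as np


def sep_frobenius(a1, a2):
    """Smallest singular value of the Sylvester map Z -> A1 Z - Z A2 (Frobenius norm)."""
    n1, n2 = a1.shape[0], a2.shape[0]
    # Column-major vectorization: vec(A1 Z - Z A2) = (I (x) A1 - A2^T (x) I) vec(Z).
    sylvester = np.kron(np.eye(n2), a1) - np.kron(a2.T, np.eye(n1))
    return np.linalg.svd(sylvester, compute_uv=False).min()


# Blocks of a projector (identity and zero), as for a spectral resolution of a projector.
rng = np.random.default_rng(1)
a1, a2 = np.eye(3), np.zeros((2, 2))
e1 = 0.05 * rng.normal(size=(3, 3))   # assumed small perturbations
e2 = 0.05 * rng.normal(size=(2, 2))

base = sep_frobenius(a1, a2)
swapped = sep_frobenius(a2, a1)                 # symmetry, Eq. (40)
perturbed = sep_frobenius(a1 + e1, a2 + e2)
# Stability, Eq. (41): this slack should be non-negative.
slack = np.linalg.norm(e1) + np.linalg.norm(e2) - abs(perturbed - base)
```

For the identity and zero blocks the separation evaluates to one, and the perturbed value moves by no more than the combined size of the perturbations.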
A. The data-collection and data-processing phases

RB is composed of two major parts, a data-collection phase and a data-processing phase. The data-collection phase consists of what one typically thinks of as RB: one randomly selects a sequence of quantum gates and applies them to a quantum state together with a global inverse, and measures the resulting state. Averaging over many random choices of these gates one obtains RB output data that depends on the length of the random sequence in a controlled way. This vague description can be made more precise in many different ways and we provide a general framework for this procedure in the next few subsections.

The data-processing phase, on the other hand, consists of what one then does with the data given by a RB experiment. This can be as simple as fitting the data to an exponential decay, but in many cases also involves more sophisticated processing techniques. The key feature of the RB protocol that allows for a structured approach to data processing is the fact that the RB output data has a very controlled form. We discuss this form in Sec. V E after more formally discussing the data-collection phase of RB.

B. Input parameters

The data-collection phase of a RB procedure is characterized by a set of input parameters. These input parameters fully define a protocol (which we write down in Sec. V C) that can be executed on a quantum computer, yielding probabilistic data that can then be interpreted. Below is a list of all input parameters to RB, together with an explanation and examples of choices for these parameters that correspond to versions of RB present in the literature.

1. A gate-set/group: A finite set of unitaries (quantum gates) on $\mathbb{C}^d$. In (almost) all RB protocols this gate set is also a finite subgroup $G \subset U(d)$ of the unitary group. In a large section of the RB literature the group considered is the $q$-qubit Clifford group $C_q$, but a range of other choices (such as the Pauli group $P_q$ [13], the real Clifford group [35] or the CNOT-dihedral group [37,38]) are possible. Choosing a group fixes what gates RB assesses the quality of and partially determines the structure of the output data. In generator-style RB [14,15] this group is defined implicitly by the set of generators.
2. A reference implementation and representation: A map $\phi_r$ from the gate-set/group $G$ to the $d$-dimensional superoperators that specifies how the gates in $G$ should be implemented in the quantum computer. This map takes into account aspects of the specific RB protocol but also how gates are composed of elementary gates and other implementation details. In uniform RB the map $\phi_r$ is a representation of the group $G$ on $S_d$. The prototypical example is the action on the space of Hermitian matrices $\rho$ by conjugation, i.e., $\phi_r(g)(\rho) = \omega(g)(\rho) = U_g \rho U_g^\dagger$. In general, however, the reference implementation $\phi_r$ is not a representation, though we see that for any known RB procedure the reference implementation can be written as $\phi_r(g) = \mathcal{A}\,\omega(g)\,\mathcal{B}$, where $\mathcal{A}, \mathcal{B}$ are (unitary) quantum channels. We refer to $\omega$ as the reference representation.
3. An ending gate: A group element $g_{\mathrm{end}}$ that dictates the global action of a RB sequence. For most proposals this gate is simply the identity, but in other proposals nontrivial choices for $g_{\mathrm{end}}$ (such as choosing it uniformly at random [13,37,39,45]) play an essential role in data-processing schemes. This ending gate also allows us to include RB schemes that do not involve an inversion gate [16,19,42,64]. We emphasize that it is not necessary to implement this gate physically, but rather it arises from compilation.
4. A set of sequence lengths: A set of integers $\mathbb{M}$ denoting the length of the random sequences of gates implemented in a RB experiment. We denote elements of this set by $m$ and the largest element of this set by $M$.
5. An input state: A state $\rho_0$ that is prepared at the beginning of a RB experiment. This state will typically be a pure state (such as the $|0, \ldots, 0\rangle$ state vector), but is chosen mixed in some versions of RB [56].
6. An output POVM: A POVM that is measured at the end of a RB experiment. We denote this POVM as $\{\Pi_i\}_{i \in I}$ with some index set $I$. In many cases this is a two-component POVM, but some RB procedures explicitly call for more complex measurements (such as a computational basis measurement [29]).
7. A set of sampling distributions: A set of probability distributions $\nu_i$ for $i \in \{1, \ldots, M\}$ over the group $G$ that govern the random sampling of group elements in RB. We often consider the scenario where all these probability distributions are the same, in which case we drop the subscript $i$ and just write $\nu$ for the probability distribution. Moreover, in almost all instances in the literature this distribution is uniform, i.e., $\nu(g) = 1/|G|$, and unless stated explicitly we always assume this to be the case.

C. The data-collection protocol

Given the input parameters discussed above we can write down a formal procedure for the data-collection phase of RB. It has as output an estimator $\hat{p}(i, m)$ of a probability $p(i, m)$ for each POVM element $\Pi_i$ for $i \in I$ and each sequence length $m \in \mathbb{M}$.

Note that the probabilities $p(i, m)$ depend in a nontrivial manner on the initial state $\rho_0$, the POVM $\{\Pi_i\}_{i \in I}$ and the
ending gate $g_{\mathrm{end}}$. We, however, suppress this dependence unless it is explicitly necessary to refer to it.

D. A typology of randomized benchmarking protocols

Given protocol Algorithm 1, different choices of the parameters discussed in Sec. V B give rise to different RB procedures. More strongly, (the data-collection phases of) all variants of RB currently in the literature can be expressed by choosing these input parameters correctly. Surveying the literature we can distinguish three major types that are differentiated by their reference implementations and sampling distributions. The output data associated with these classes of protocols has varying behavior and we treat each class separately in Sec. VI. All protocols included in these classes can be found in Fig. 2 (here we give only illustrative examples).

1. Uniform randomized benchmarking: This is the basic type of RB. It is characterized by the fact that the probability distributions $\nu_i$ are the uniform distribution for all $i \in \{1, \ldots, m_{\max}\}$, and that the reference implementation map $\phi_r$ is exactly a representation $\omega$, usually the standard action by conjugation given by $\omega(g)(\rho) = \phi_r(g)(\rho) = U_g \rho U_g^\dagger$ for unitaries $U_g$ (other choices have been made in Refs. [42,65]). Randomized benchmarking proposals of this type are mainly distinguished by what group $G$ they consider as a gate set (at least when it comes to the data-collection phase, different proposals in this class might have radically different data-processing procedures). Protocols of this type include the original RB proposals [1,66] and many others.
2. Nonuniform randomized benchmarking: The defining feature of this class is that the sampling distributions $\nu_i$ are not the uniform distribution. It comes in two flavors, which we discuss separately:
(a) Subset RB: Here, the distributions $\nu_i$ are far from uniform (and typically only have support on a small subset of the group $G$). Examples from the literature are Refs. [14,15,39,67].
(b) Approximate RB: Here the $\nu_i$ are close to uniform. This latter class will turn out to be essentially the same as uniform RB. This class has been discussed in Ref. [14] and also arises in the original "NIST" RB proposal [5] (as per the analysis of Ref. [51]).
In all works of this type so far the reference implementations are representations (akin to uniform RB).
3. Interleaved randomized benchmarking: This class of RB protocol is characterized by the addition of an extra "interleaving gate" in the RB procedure. This is a class that is somewhat idiosyncratic, having one standard subtype and a collection of "nonstandard" protocols:
(a) Standard interleaved randomized benchmarking: In this class the interleaving gate is an element of the benchmarked group $G$. In this
random). In Ref. [68], a model of time dependence has been considered and in Refs. [69–71] the effect of gate correlations and certain uncontrolled variables such as quasistatic noise were investigated. In all of these scenarios, however, the exponential behavior of Eq. (63) breaks down. It might be possible to derive assumptions beyond the setting of Markovian time independence that lead to output data of the correct form, but we do not pursue this here.

operators $R, L$ are not guaranteed to be completely positive, complicating the interpretation of this assumption as being a belief on physical quantities. Finally, Ref. [25] derives (introducing the Fourier analysis also used here) Eq. (63) (up to an exponentially small correction) for uniform RB with the multiqubit Clifford group under an assumption on the fidelity of the implementation map $\phi$ with respect to its reference implementation,
almost all known RB procedures, but the techniques used are based on the cleaner conceptual framework of matrix-valued Fourier transforms provided by Ref. [25], which we reviewed in Sec. IV B. The central observation of Ref. [25] is that the data-collection phase of uniform RB can be seen as evaluating $m$-fold convolutions of the implementation map $\phi$. This observation generalizes beyond uniform RB to arbitrary implementation maps, and, in particular, we see that

\[ p(i, m) = \sum_{g_1, \ldots, g_m \in G} \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \nu_1(g_1) \cdots \nu_m(g_m)\, \phi(g_{\mathrm{end}}\, g_1^{-1} \cdots g_m^{-1})\, \phi(g_m) \cdots \phi(g_1)\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle \tag{68} \]

can be rewritten, using the invariance of the uniform sum over $G$ under changes of variables, as

\[ p(i, m) = \sum_{g_1, \ldots, g_m \in G} \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \phi(g_{\mathrm{end}}\, g_m^{-1})\, \nu_m(g_m g_{m-1}^{-1})\, \phi(g_m g_{m-1}^{-1}) \cdots \nu_1(g_1)\, \phi(g_1)\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle \tag{69} \]
\[ = \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \big[\phi * (\nu_m \phi) * \cdots * (\nu_1 \phi)\big](g_{\mathrm{end}})\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle \tag{70} \]

where we use the definition of convolution of implementation maps given in Eq. (26) and where $(\nu_i \phi)(g) = \nu_i(g)\phi(g)$. We see that often the convolution product map $\phi * (\nu_m \phi) * \cdots * (\nu_1 \phi)$ can be written exactly as an $m$-fold convolution $\phi'^{*m}$ (for some $\phi'$ that is not necessarily the same as $\phi$).

We begin in Sec. VI A with discussing the case of uniform RB (as per the RB typology in Sec. V D). This is the easiest case, but the results derived there will go a long way in analyzing the other two types (nonuniform and interleaved RB).

A. Uniform randomized benchmarking

Here we discuss the behavior of RB output data given by a uniform RB scheme (as defined in Sec. V D). We prove that this data behaves as expected (i.e., a controlled linear combination of exponential decays), as long as the implementation map $\phi$ is close enough to its reference implementation $\phi_r$. As we saw in Sec. V B, for uniform RB protocols this reference implementation is exactly a representation, which we denote by $\omega$. We can always decompose $\omega$ into a direct sum of irreducible representations. We write this as $\omega = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$ with some index set $\Lambda$ and $\sigma_\lambda$ irreducible subrepresentations appearing with multiplicity $n_\lambda$. As discussed in Sec. V, we expect the RB output data to be approximately well described by a linear combination of the form

\[ p(i, m) \approx \sum_{\lambda \in \Lambda} \mathrm{Tr}(A_\lambda M_\lambda^m), \tag{71} \]

where $M_\lambda$ is an $n_\lambda \times n_\lambda$ matrix depending only on the actual implementation $\phi$. In particular, $M_\lambda$ is given by the projection of the Fourier mode $\mathcal{F}(\phi)[\sigma_\lambda]$ onto the subspace associated with its $n_\lambda$ largest (in absolute value) eigenvalues. This is the content of Theorem 8. The essential idea in Theorem 8 is the fact that convolutions correspond to matrix multiplication in Fourier space, together with a careful use of the subspace perturbation techniques discussed in Sec. III.

Theorem 8: (Output data of uniform randomized benchmarking). Let $p(i, m)$ be the outcome probability associated with a uniform RB experiment with group $G$, initial state $\rho_0$, reference representation $\omega = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$, and ending gate $g_{\mathrm{end}}$, for a specific sequence length $m \in \mathbb{M}$ and POVM element $\Pi_i$ in the POVM $\{\Pi_i\}_i$ (as described in protocol Algorithm 1). Let $\phi$ be the implementation map describing the actually implemented operations. Moreover, assume that there exists a $\delta > 0$ such that

\[ \frac{1}{|G|} \sum_{g \in G} \|\omega(g) - \phi(g)\|_\diamond \le \delta \le 1/9. \tag{72} \]

The RB output probability $p(i, m)$ is well approximated as

\[ \Big| p(i, m) - \sum_{\lambda \in \Lambda} \mathrm{Tr}(A_\lambda M_\lambda^m) \Big| \le 8 \left[ \delta \left( 1 + \frac{2\delta}{1 - 5\delta} \right) \right]^m, \tag{73} \]

where $M_\lambda, A_\lambda$ are $n_\lambda \times n_\lambda$ real matrices and $M_\lambda$ depends only on the implementation $\phi$.

Proof. Note from Eq. (69) with $\nu_i$ the uniform probability distribution for all $i \in \{1, \ldots, m\}$ that

\[ p(i, m) = \langle\!\langle \mathcal{E}_M(\Pi_i) |\, (\phi * \phi^{*m})(g_{\mathrm{end}})\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle. \tag{74} \]

Inserting the Fourier transform of $\phi$, we get

\[ p(i, m) = \sum_{\lambda \in \mathrm{Irr}(G)} d_\lambda\, \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \mathrm{Tr}_{V_\lambda}\big\{ \mathcal{F}(\phi)^{m+1}[\sigma_\lambda]\, \big(\sigma_\lambda(g_{\mathrm{end}}^{-1}) \otimes 1\big) \big\}\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle \tag{75} \]
\[ = \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \mathrm{Tr}_{V_{\omega_G}}\!\left[ \left( \frac{1}{|G|} \sum_{g \in G} \omega_G(g) \otimes \phi(g) \right)^{m+1} \big( [D_G\, \omega_G(g_{\mathrm{end}}^{-1})] \otimes 1 \big) \right] | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle, \tag{76} \]
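where $\omega_G(g) = \bigoplus_{\lambda \in \mathrm{Irr}(G)} \sigma_\lambda(g)$ is the direct sum of all irreducible representations of $G$ and $D_G = \bigoplus_{\lambda \in \mathrm{Irr}(G)} d_\lambda 1_\lambda$ accounts for the dimensional factor in the inverse Fourier transform. Now we can consider the Fourier operator $F(\phi)$ [as defined in Eq. (31)] associated with $\phi$ as a perturbation of its ideal version $F(\omega)$. From our discussion of Fourier transforms and Fourier operators we know that $F(\omega) = \frac{1}{|G|} \sum_{g \in G} \bigoplus_{\lambda \in \Lambda} \overline{\sigma_\lambda(g)} \otimes \omega(g)$ is an orthogonal projection, with rank given by the number of irreducible subrepresentations of $\omega$ ($\mathrm{Rk}[F(\omega)] = \sum_{\lambda \in \Lambda} n_\lambda$). Recall also that there is a natural matrix norm $\|\cdot\|_m$ on the space of Fourier operators and that

\[ \|F(\phi - \omega)\|_m = \frac{1}{|G|} \sum_{g \in G} \big\| \mathrm{Tr}_{V_{\omega_G}}\big\{ F(\phi - \omega)\, [D_G\, \omega_G(g^{-1}) \otimes 1] \big\} \big\| = \frac{1}{|G|} \sum_{g \in G} \|\phi(g) - \omega(g)\|. \tag{77} \]

The plan is now to use the perturbation theorem (Theorem 6) to split the above into dominant and subdominant invariant subspaces. To do this note that $F(\omega)$ is a projector so we trivially get a spectral resolution with $X_1 = F(\omega)$, $X_2 = 1 - F(\omega)$ with $F(\omega)$ acting as the identity on the column and row space of $X_1$ and as the zero operator on the column and row space of $X_2$. Thinking of $F(\phi - \omega)$ as a perturbation to $F(\omega)$ we need to ensure the conditions in Eq. (42) are satisfied with respect to the norm $\|\cdot\|_m$. Using the submultiplicativity of this norm and the fact that $\|X_1\|_m = 1$ by construction together with the triangle inequality, we get the following sufficient condition for the applicability of Theorem 6:

\[ \frac{\|X_1^\dagger F(\phi - \omega) X_2\|_m\, \|X_2^\dagger F(\phi - \omega) X_1\|_m}{\big[\mathrm{sep}(1, 0) - \|X_1^\dagger F(\phi - \omega) X_1\|_m - \|X_2^\dagger F(\phi - \omega) X_2\|_m\big]^2} \le \frac{[2 \|F(\phi - \omega)\|_m]^2}{[1 - 5 \|F(\phi - \omega)\|_m]^2} < \frac{1}{4} \tag{78} \]

where we also use that $\mathrm{sep}(1, 0) = 1$, which is easy to see from the definition of sep (see Sec. IV C). Working out, we see that the above is satisfied if Eq. (72) is true, which it is by assumption. Hence we can use Theorem 6 to conclude the existence of operators $R = [R_1, R_2]$, $L = [L_1, L_2]$ with $L^\dagger = R^{-1}$ and $P_1$ such that

\[ F(\phi) = R_1 \big[ X_1^\dagger F(\omega) X_1 + X_1^\dagger F(\phi - \omega)(X_1 + X_2 P_1) \big] L_1^\dagger + R_2 \big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi - \omega) X_2 \big] L_2^\dagger. \tag{79} \]

Using the fact that $L^\dagger = R^{-1}$ (and thus that $L_2^\dagger R_1 = L_1^\dagger R_2 = 0$) we can now write $p(m, g_{\mathrm{end}}, \Pi)$ as a sum of two terms corresponding to the above spectral resolution:

\[ \begin{aligned} p(i, m) ={}& \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \mathrm{Tr}_{V_{\omega_G}}\big\{ [D_G\, \omega_G(g_{\mathrm{end}}^{-1}) \otimes 1]\, F(\phi)\, R_1 \big[ X_1^\dagger F(\omega) X_1 + X_1^\dagger F(\phi - \omega)(X_1 + X_2 P_1) \big]^m L_1^\dagger \big\}\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle \\ &+ \langle\!\langle \mathcal{E}_M(\Pi_i) |\, \mathrm{Tr}_{V_{\omega_G}}\big\{ [D_G\, \omega_G(g_{\mathrm{end}}^{-1}) \otimes 1]\, R_2 \big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi - \omega) X_2 \big]^{m+1} L_2^\dagger \big\}\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle. \end{aligned} \tag{80} \]

We consider both of these terms separately. We deal first with the second term. Note that, using the definitions of $R, L$ from Theorem 6, we have

\[ (2) \le \big\| \mathrm{Tr}_{V_{\omega_G}}\big\{ [D_G\, \omega_G(g_{\mathrm{end}}^{-1}) \otimes 1]\, R_2 \big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi - \omega) X_2 \big]^{m+1} L_2^\dagger \big\} \big\| \tag{81} \]
\[ \le \big\| (X_2 + X_1 P_2 + X_2 P_1 P_2) \big[ X_2^\dagger F(\omega) X_2 + (X_2^\dagger - P_1 X_1^\dagger) F(\phi - \omega) X_2 \big]^{m+1} (X_2 - P_1 X_1)^\dagger \big\|_{\max}, \tag{82} \]

which is just a statement about the max norm of a Fourier operator. Note that $X_2^\dagger F(\omega) X_2 = 0$ by construction so the above depends only on $F(\phi - \omega)$. Now using the max-mean norm inequality in Eq. (35) several times and the fact that $X_2 = 1 - X_1$, we can upper bound this as

\[ (2) \le \big\| (X_2 + X_1 P_2 + X_2 P_1 P_2)(X_2 - P_1 X_1)^\dagger F(\phi - \omega) X_2 (X_2 - P_1 X_1) \big\|_{\max}\, \big\| [F(\phi - \omega) X_2 (X_2 - P_1 X_1)^\dagger]^m \big\|_m \tag{83} \]
\[ \le 2 \|F(\phi - \omega)\|_{\max} \big[ (1 + \|P_2\|_m)(1 + \|P_1\|_m) + \|P_1\|_m^2 \|P_2\|_m (3 + \|P_1\|_m) \big] \big[ \|F(\phi - \omega)\|_m (1 + \|P_1\|_m) \big]^m. \tag{84} \]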
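As a numerical aside, the convolution form of the RB output probability in Eq. (74) and the resulting decay structure can be illustrated on a toy model. The sketch below makes several assumptions that are choices made here, not the setting of the proof: the group is the 24-element single-qubit Clifford group, the implementation map is $\phi(g) = \Lambda\,\omega(g)$ with gate-independent amplitude-damping noise $\Lambda$, state preparation and measurement are ideal, and $g_{\mathrm{end}}$ is the identity. In this special case the twirl of $\Lambda$ acts as a single decay parameter $f$ on the traceless block, so $p(m)$ should be exactly of the form $A f^m + B$.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [np.eye(2, dtype=complex), X, Y, Z]


def ptm(kraus_ops):
    """Operator matrix of a channel in the normalized Pauli basis (real for CP maps)."""
    apply = lambda rho: sum(k @ rho @ k.conj().T for k in kraus_ops)
    return np.array([[0.5 * np.trace(pj @ apply(pk)).real for pk in PAULIS]
                     for pj in PAULIS])


# Reference representation omega: Clifford group generated from H and S.
H = (X + Z) / np.sqrt(2)
S = np.diag([1, 1j])
generators = [np.rint(ptm([H])), np.rint(ptm([S]))]
key = lambda mat: tuple(np.rint(mat).astype(int).ravel())
omega = [np.eye(4)]
index = {key(omega[0]): 0}
for g in omega:                       # the list grows while iterating (closure)
    for s in generators:
        new = s @ g
        if key(new) not in index:
            index[key(new)] = len(omega)
            omega.append(new)

n = len(omega)
identity = index[key(np.eye(4))]
inv = [index[key(g.T)] for g in omega]                 # unitary PTMs are orthogonal
mul = [[index[key(a @ b)] for b in omega] for a in omega]

# Assumed noise model: amplitude damping applied after every ideal gate.
gamma = 0.05
noise = ptm([np.diag([1, np.sqrt(1 - gamma)]),
             np.array([[0, np.sqrt(gamma)], [0, 0]])])
phi = [noise @ g for g in omega]


def convolve(a, b):
    """Eq. (26): (a * b)(g) = (1/|G|) sum_h a(g h^-1) b(h)."""
    return [sum(a[mul[g][inv[h]]] @ b[h] for h in range(n)) / n for g in range(n)]


ket0 = np.array([1, 0, 0, 1]) / np.sqrt(2)   # |rho_0>> for rho_0 = |0><0|
bra0 = ket0                                   # <<Pi| for Pi = |0><0|
lengths = np.arange(1, 9)
power, p = phi, []
for m in lengths:
    power = convolve(phi, power)              # (m+1)-fold convolution, as in Eq. (74)
    p.append(bra0 @ power[identity] @ ket0)
p = np.array(p)

# Single decay parameter of the twirled noise on the traceless block.
f = np.trace(noise[1:, 1:]) / 3
expected = 0.5 * (1 + gamma) + 0.5 * (1 - gamma) * f ** lengths
```

The brute-force convolution values `p` should match the closed form `expected`, a constant plus one exponential, which is the two-term instance of the linear combination in Eq. (71) for a reference representation containing the trivial and the three-dimensional irreducible representation once each.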
nλ
(1) = dσλ EM ( i )| TrVσλ [σ λ (gend −1 ) ⊗ 1]F (φ)[σλ ]Rλ1 (97)
λ∈ j 1 ,...,j 2m =1
λ λ
jλ1 jλ1 jλ2 jλ2 jλ2m jλ2m λ †
× F(Pλ ωPλ )F (φ)[σλ ]F(Rλ ωPλ ) · · · F (φ)[σλ ]F(Rλ ωPλ ) L1 |ESP (ρ0 ) (98)
nλ j
[σ λ (gend −1 ) ⊗ 1]F (φ)[σλ ]Rλ1 F(Pλλ ωPλλ )Lλ1 † |ESP (ρ0 ) [Mλm ]jλ ,j 2m
j
= dσλ EM ( i )| TrVσλ (99)
λ
λ∈ j 1 ,j 2m =1
λ λ
by blocking a few gate applications together, defining a new effective implementation map $\phi' = (\nu\phi) * (\nu\phi) \cdots * (\nu\phi)$, which is close enough to uniformly distributed to apply Theorem 9.

The above approach fails utterly when applied to subset RB. In this scenario the distribution $\nu$ only has support on a small subset $A$ of $G$ and consequently $\sum_{g \in G} |\nu(g) - 1/|G|| \approx 1$ in many cases. This is not necessarily a weakness of Theorem 8 but rather a statement of the fact that strong deviations from exponential behavior can be observed if one does not give the distribution $\nu$ time to converge to the uniform distribution through repeated convolution. This was already noted more or less explicitly in previous papers on subset RB. There are two approaches to solving this problem. The first, followed in Refs. [14,15,39,67], is to restrict the set of sequence lengths $\mathbb{M}$ at which RB data is gathered to $m \ge m_{\mathrm{mix}}$ where $m_{\mathrm{mix}}$ is related to the mixing time of the distribution $\nu$. Note that in the direct RB proposal [15], this convergence time is instead enforced directly by applying a uniformly random gate before applying nonuniformly sampled gates. The second approach is to take this deviation from uniform RB behavior at face value [13] and draw conclusions from the RB output directly. We believe this latter approach is more accurately classified as an interleaved benchmarking scheme and we discuss it there.

With regards to the first approach we can make a statement akin to Theorem 8 about subset RB procedures by making the (natural) assumption that upon equilibration of the distribution $\nu$ the quality of the total gates has not degraded too much. Intuitively, this means that the gates that have high weight in the initial distribution are of high enough quality to generate (by composition) good-quality implementations of all gates in the group. Concretely, we have the following theorem.

Theorem 10: (Subset randomized benchmarking). Let $\nu$ be a probability distribution on $G$ and $p_\nu(i, m)$ be the outcome probability associated with a nonuniform RB experiment with implementation map $\phi$ and reference representation $\omega(g) = \bigoplus_{\lambda \in \Lambda} \sigma_\lambda^{\oplus n_\lambda}$. Moreover, assume that there exists an integer $m_{\mathrm{mix}}$ and real numbers $\delta, \delta' > 0$ such that

\[ \sum_{g \in G} \Big| \nu^{*m_{\mathrm{mix}}}(g) - \frac{1}{|G|} \Big| \le \delta', \tag{108} \]
\[ \sum_{g \in G} \nu(g)\, \|\omega(g) - \phi(g)\|_\diamond \le \frac{\delta}{m_{\mathrm{mix}}} \tag{109} \]

with $\delta + \delta' \le 1/9$. Now $p_\nu(i, m)$ is well approximated as

\[ \Big| p_\nu(i, m) - \sum_{\lambda \in \Lambda} \mathrm{Tr}(A_\lambda M_\lambda^{m - m_{\mathrm{mix}}}) \Big| \le \varepsilon \tag{110} \]

with $M_\lambda$ the projection onto the $n_\lambda$-dimensional dominant invariant subspace of $\mathcal{F}(\nu\phi)[\sigma_\lambda]$ and where

\[ \varepsilon \le 2\delta'' \left[ \left( 1 + \frac{2\delta''}{1 - 5\delta''} \right) \left( 1 + \frac{\delta''}{1 - \delta''} \right) + \left( \frac{2\delta''}{1 - 5\delta''} \right)^{2} \frac{\delta''}{1 - \delta''} \left( 3 + \frac{2\delta''}{1 - 5\delta''} \right) \right] \le 4\delta'' \tag{111} \]

with $\delta'' = \delta + \delta'$.

Note that this theorem is qualitatively less strong than Theorem 8. In particular, we cannot guarantee that the distance between the output data of subset RB and the exponential decays associated with the irreducible subrepresentations of the reference representation closes exponentially fast with increasing sequence length. However, our bound on this distance is stronger than previous rigorous statements (Theorem 20 in Ref. [14]) and works under weaker assumptions. The distance bound given in Ref. [39] (Theorem 3) does close exponentially but the proof relies critically on the fact that $\nu$ is uniformly nonzero on a (large) subgroup coset in $G$, and thus applies only to a far more restricted situation. Note also that it does not directly apply to the approach taken in Ref. [15]. However, we believe that with very minor alterations the reasoning below can be made to fit.

Proof. Consider again the map $\phi_\nu : G \to S_d : g \mapsto |G|\nu(g)\phi(g)$. We have

\[ p_\nu(i, m) = \langle\!\langle \mathcal{E}_M(\Pi_i) |\, (\phi * \phi_\nu^{*m})(g_{\mathrm{end}})\, | \mathcal{E}_{SP}(\rho_0) \rangle\!\rangle. \tag{112} \]

We now establish a bound on the quality of $\phi_\nu^{*m_{\mathrm{mix}}}$, namely we show that

\[ \frac{1}{|G|} \sum_{g \in G} \big\| \phi_\nu^{*m_{\mathrm{mix}}}(g) - \omega(g) \big\|_\diamond \le \delta + \delta' \le \frac{1}{9}. \tag{113} \]

This can be seen as follows:

\[ \frac{1}{|G|} \sum_{g \in G} \big\| \phi_\nu^{*m_{\mathrm{mix}}}(g) - \omega(g) \big\|_\diamond \le \frac{1}{|G|} \sum_{g \in G} \big\| \omega_\nu^{*m_{\mathrm{mix}}}(g) - \omega(g) \big\|_\diamond + \frac{1}{|G|} \sum_{g \in G} \big\| \phi_\nu^{*m_{\mathrm{mix}}}(g) - \omega_\nu^{*m_{\mathrm{mix}}}(g) \big\|_\diamond \tag{114} \]

with $\omega_\nu(g) = |G|\nu(g)\omega(g)$. Writing out the convolution in the first term and changing variables, we get
1 1
ω∗mmix (g) − ω(g) = |G|ν(ggmmix −1 ) . . . ν(g1 )ω(ggmmix −1 ) · · · ω(g1 ) − ω(g)
−1 −1
(115)
|G| g∈G ν
|G| g∈G
g
1 ,...gmmix −1 ∈G
1
≤ |G|ν(ggm−1mix −1 ) . . . ν(g1 ) − 1 ω(g) (116)
|G| g∈G g ,...g
1 mmix −1
1
= ∗m mix (g)
|G| − ν (117)
g∈G
where we use the telescoping series identity $A^m - B^m = \sum_{j=1}^{m} A^{m-j}(A - B)B^{j-1}$, which holds for any elements A, B of an associative algebra (such as the implementation maps with convolution), the submultiplicativity of the diamond norm, and the fact that $\|\phi(g)\|_\diamond = \|\omega(g)\|_\diamond = 1$ for all g ∈ G. Together with the theorem assumptions, this yields Eq. (113).
Now as in Theorem 8, we can write the RB output data as
−1 ∗mmix
pν (i, m) = EM ( i )| TrVω G D G [ω (g
G end ) ⊗ 1]F(φ)F(φν )m
F(φν ) |ESP (ρ0 ) , (120)
∗m
where m = m − mmix . We can again consider F(φν mix ) as a perturbation of F(ω). Since F(ω) is a projector, the operator
∗m
F(φν mix ) will resolve into a dominant and subdominant invariant subspace (as in Theorem 8). We have
p(i, m) = EM ( i )| TrVωG DG [ωG (gend −1 ) ⊗ 1] F(φ)F(φν )m−mmix R1 X1 † F(ω)X1
+ (X1 −
†
P1 X2 )F(φν∗mmix
†
− ω)X1 L1 †
|ESP (ρ0 )
+ EM ( i )| TrVωG DG [ωG (gend −1 ) ⊗ 1] F(φ)F(φν )m−mmix R2 X2 † F(ω)X2
+ (X2 † − P1 X1 † )F(φν∗mmix − ω)X2 L2 † |ESP (ρ0 ) . (121)
∗mmix
Now note that F(φν ) and F(φν ) commute, and hence share invariant subspaces. This means we can write the first term
in Eq. (121) as
m−m
(1) = Tr Aλ Mλ mix ). (122)
λ∈
using the max-mean inequality of the norms on Fourier operators. Note now that
⎡ ⎤m−mmix −1
F(φν )m−mmix −1 ≤ ⎣ ν(g) φ(g) ⎦ ≤ 1, (125)
m
g∈G
where we use that ν is a probability distribution and that φ ≤ 1. Moreover, we have that F(φ)max ≤ 1. Using this
and the reasoning from Theorem 8 we can thus bound the second term as
2
2δ δ 2δ δ 2δ
| (2) | ≤ 2 F(φν∗mmix − ω)m 1+ 1+ + 3+ (126)
1 − 5δ 1−δ 1 − 5δ 1−δ 1 − 5δ
with δ the sum of the two constants appearing in the assumptions, Eqs. (108) and (109). Inserting the assumption that $\|F(\phi_\nu^{*m_{\mathrm{mix}}} - \omega)\|_m \le \delta$ we obtain the statement of the theorem.

C. Interleaved randomized benchmarking

As discussed in Sec. V D, a common variant of RB is interleaved randomized benchmarking (IRB). IRB is performed like uniform RB, as formulated in Algorithm 1, but the reference implementation is not a representation. Instead, a fixed operation C is interleaved between the applications of randomly selected group elements. The outcome of this experiment is then compared to the same RB experiment without the interleaving gate to infer the quality of the interleaved gate C. The literature splits into two strands, standard interleaved RB [4,48] and nonstandard interleaved RB [9,47]. We emphasize here that we discuss the so-called “interleaved step” of the interleaved RB protocol, and do not interpret the resulting decay rate (for a thorough discussion of the relationship of interleaved RB decay rates and their interpretation see Ref. [72]).

1. Standard interleaved randomized benchmarking

In the standard protocol the interleaved operation C is applied after every randomly selected gate and is also a part of the group G. Hence, at the end of a random sequence, the inversion step can be performed inside the group. The IRB output data are thus of the form

$$p_{\mathrm{IRB}}(i, g_{\mathrm{end}}, m) = \frac{1}{|G|^m} \sum_{g_1,\dots,g_m \in G} \langle\!\langle E_M(\Pi_i)|\, \phi[g_{\mathrm{end}} (g_1 C \cdots g_m C)^{-1}]\, \phi(C)\phi(g_m) \cdots \phi(C)\phi(g_1)\, |E_{\mathrm{SP}}(\rho_0)\rangle\!\rangle \tag{127}$$

for a POVM element $\Pi_i$, an ending gate $g_{\mathrm{end}}$, a sequence length m, an implementation map φ, and an initial state $\rho_0$. It is interesting to interpret this procedure in the light of the protocol given in Sec. V C. Namely, we can think of defining a probability distribution $\nu_C$ over G that takes the value 1 for g = C and 0 for all other group elements. With this probability distribution, we can reconsider the above as a RB experiment according to the protocol written in Algorithm 1. We have

$$p_{\mathrm{IRB}}(i, g_{\mathrm{end}}, m) = p(i, g_{\mathrm{end}}, 2m) = \sum_{g_1,\dots,g_{2m} \in G} \langle\!\langle E_M(\Pi_i)|\, \phi[g_{\mathrm{end}} (g_1 g_2 \cdots g_m)^{-1}]\, \nu_C(g_{2m})\phi(g_{2m})\, \mu(g_{2m-1})\phi(g_{2m-1}) \cdots \nu_C(g_2)\phi(g_2)\, \mu(g_1)\phi(g_1)\, |E_{\mathrm{SP}}(\rho_0)\rangle\!\rangle, \tag{128}$$

where μ is the uniform distribution on G. Hence, we can think of standard IRB as being a RB experiment with a particular choice of sampling distributions. In this picture, it becomes trivial to extend Theorem 8 to standard interleaved RB by considering the map $\phi_C = (\nu_C \phi) * \phi$. By the standard change of variables we can see

$$p_{\mathrm{IRB}}(i, g_{\mathrm{end}}, m) = \langle\!\langle E_M(\Pi_i)|\, \phi * \phi_C^{*m}(g_{\mathrm{end}})\, |E_{\mathrm{SP}}(\rho_0)\rangle\!\rangle \tag{129}$$

and hence interleaved RB is just uniform RB with the implementation map $\phi_C$. If φ(C) is close enough to its reference representation element ω(C), the assumption Eq. (72) is reasonable for $\phi_C$ as well. Hence, Theorem 8 holds equally well for interleaved RB.

Nonstandard interleaved RB protocols [9,13,47,50] depart from the above framework by including interleaved gates that are not part of the group G (the Pauli group in the case of Ref. [13] and the Clifford group in the case of Ref. [47]) and by sampling from the group in a nonuniform manner. These are somewhat idiosyncratic, so we treat them separately. We see that the protocols of Refs. [9,47] are covered by Theorem 8, while the protocols of Ref. [13] and Ref. [50] are not covered. We expect that it is possible to make guarantees on the output data of these protocols with suitable adaptations to Theorem 8, but we do not pursue this here.
2. Interleaved T-gate randomized benchmarking we can make the alternative assumption that
In Ref. [47] the quality of a T gate (with ideal imple-
mentation T ), with an associated noisy implementation T# 1 U ω(g) − U
#φ(g) ≤δ, (133)
is assessed by estimating the following quantity |G| g∈G
Note that because the Clifford group contains the Pauli this reference implementation is not close to a represen-
group the equation Cgc · · · Cg1 = g makes sense. Now tation (unless g = e), which means that Theorem 8 does
because of the cycle property not apply. This is not an artifact of the proof technique
but rather a reflection of the fact that robust benchmarking
Cgc · · · Cg1 = (C−(c−1) gc C(c−1) ) · · · C−1 g2 Cg1 = gc · · · g1 tomography features extremely rapid exponential decays.
(137) In the gate-independent noise case the decay rate is set by
the average fidelity F[ω(g ), E ], which can be very small.
since C−1 gC is always a Pauli element. Hence the equation In the language of matrix Fourier theory this means that the
has exactly |P(c−1) | solutions. Furthermore, we have that dominant eigenvalues of the Fourier operator F(φtom ) will
1
be small even in the ideal case. Hence, we do not expect
1 1 an assumption of the form, Eq. (72), to be strong enough
φc (g) = C#φ(gc ) · · · C#φ(g1 ) to guarantee exponential behavior of the RB output data in
|Pq | g∈P |Pq |c g∈P
q q g1 ,...,gc ∈Pq this scenario.
Cgc ···Cg1 =g
1
= C#φ(gc ) · · · C#φ(g1 ) VII. DATA PROCESSING AND SAMPLE
|Pq |c
g1 ,...,gc ∈Pq COMPLEXITY
(138) As discussed before the randomized benchmarking pro-
tocol can be divided into data collection and postprocess-
and thus that
ing phases. The data-collection protocol is summarized in
1 Algorithm 1. The outputs of the data-collection phase are
C#φ(gmc ) · · · C#φ(g1 ) = φc∗m (e), (139) mean estimators p̂(i, m, gend ) that estimate the average over
|Pq |mc g1,1 ,...gm,c ∈Pq all sequences of length m according to the measures νi
and the quantum-measurement statistics, simultaneously.
which means cycle benchmarking can be framed as RB The main theorems of the data-collection phase (Theo-
with the implementation map φc . Moreover, since in the rems 8–10) state that the expectation value, again both
limit of perfect gates we have, if Cgc · · · Cg1 = g, that over the measurement statistics and the random sequences,
is well-approximated by a linear combination of (matrix)
C ω(gc ) · · · C ω(g1 ) = ω(g) (140) exponentials in m.
The figures of merit that RB experiments report are the
we can reasonably make the assumption that φc is close decay parameters associated with the linear combination
to its reference implementation [i.e., Eq. (72)]. Hence of (matrix) exponentials. Extracting these decay param-
the behavior of cycle benchmarking data is covered by eters is the objective of the data-processing phase that
Theorem 8. What is less clear is how to interpret the result- is the focus of the current section. For gate-independent
ing exponential decays (especially in terms of the imple- noise and reference representations without multiplicities
mentations φ and C#). This requires a more sophisticated the decay parameters can be directly connected to the
analysis, which is done in Ref. [13]. average gate fidelity of the noise. In the more general
case, the interpretation of the decay parameters in terms
5. Robust benchmarking tomography of other operational measures of quality can be more com-
In robust benchmarking tomography [50] one uses a RB plicated. We consider the connection between the decay
protocol as a subroutine to extract tomographic informa- parameters and the average gate-set fidelity in Sec. IX.
tion from a superoperator (not necessarily a unitary) E . Here we want to take a more pragmatic approach for the
This is done by estimating the probability postprocessing phase. The deviation of the decay param-
eters from unity can directly be regarded as a measure of
1 quality that captures the deviation of the actually imple-
p(i, m) = EM ( i )|φ[g (g1 . . . gm )−1 ]
|G|m g1 ,...,gm ∈G
mented gates from an ideal implementation. In principle,
the set of decay parameters itself provides a refined image
× E φ(g gm ) · · · E φ(g g1 )|ESP (ρ0 ) , (141) of the quality of the implementation, as compared to the
average gate fidelity. This motivates us to limit the post-
where g is a fixed element of the group G and φ is processing phase to the extraction of the decay parameters.
the implementation of a reference representation ω [the The estimation of other measures of quality from the
goal is to estimate correlations between ω(g ) and E ]. We decay parameters is then left to an optional subsequent
can consider this as an interleaved RB scheme with ref- processing phase.
erence implementation φtom (g) = ω(g g) (thinking of E In the simplest RB setting (e.g., uniform RB with
as a noisy implementation of the identity gate). However, the Clifford group), featuring a single noise-affected
B. Data-processing algorithms and guarantees

1. Fitting single decays

Many proposals for RB derive a data model that is well approximated by a single decay curve. This is, for example, the case when the group is a unitary 2-design, the reference representation ω is the adjoint representation and the actual implementation is close to being trace-preserving [24]. The adjoint representation of a unitary 2-design acts irreducibly on the space of traceless matrices and yields a single dominant decay curve.

A single dominant decay parameter can be extracted using nonlinear least-squares fitting algorithms such as Levenberg-Marquardt, see, e.g., Ref. [73, Chapter 3.2]. In Ref. [56] it has been shown that in RB for the Clifford group the variance of the data points is expected to vary strongly with the sequence length m. This observed heteroskedasticity motivates us to use iteratively reweighted variants of least-squares fitting algorithms.

Reference [74] analyzes a simplified fitting procedure that estimates the decay parameter from the ratio of the data for two sufficiently separated sequence lengths. In the regime of high fidelity, it establishes a multiplicative error in the deviation of the decay parameter from an efficient number of samples. Relatedly, Ref. [45] gives an estimation scheme for a RB procedure that estimates, in parallel, multiple single exponential decays with multiplicative accuracy. This scheme makes use of postprocessing techniques to guarantee the “single-exponential” shape of the data. We discuss this more in Sec. VIII.

2. Fitting multiple decays with pole-finding algorithms: MUSIC and ESPRIT

Algorithms for simultaneously identifying multiple poles (frequencies and decay parameters) from a discrete series of data points date back to at least the work of Prony [75]. A zoo of modern algorithmic approaches has been developed in the context of direction-of-arrival estimation in array signal processing. In principle, these techniques can extract poles that are closer together than the grid spacing defined by the finite sampling rate, a phenomenon dubbed superresolution. The theoretical framework to derive guarantees for these algorithms that go beyond a perturbative analysis of special noise models or very simple configurations was only developed recently [76,77], first focusing on convex optimization.

Here, we analyze the performance of the MUSIC algorithm [78] and the ESPRIT algorithm [79] on RB data. Performance guarantees for these two subspace algorithms were derived in Refs. [80–83] for the multiplicity-free case. Furthermore, the ESPRIT algorithm was extended to polynomially modulated exponentials of the type we encounter in RB data with multiplicities in Refs. [84,85]. We summarize the required modification in Sec. VII B 5.

For the sake of clarity, we start by reviewing the algorithms for identifying multiple poles without polynomial modulation. This corresponds to the case of RB with a multiplicity-free reference representation. For the rest of this section we denote the output data as $y_m$ instead of $\hat{p}(m)$, in keeping with the signal-processing literature. We also assume equidistant spacing of the available sequence lengths m. As we point out in Sec. VII B 5, this requirement can be relaxed by running a low-rank completion algorithm on incomplete data and thereby inferring equidistantly spaced data $y_m$. When clear from the context, we write the data series simply as a vector y, dropping the explicit dependence on m.

The strategy of both algorithms, MUSIC and ESPRIT, is to identify the range of the subspaces associated with the dominant singular values of the Hankel matrix of the data series $\{y_m\}_m$. The crucial observation is that from this subspace the poles can be extracted. Let $y \in \mathbb{R}^M$ be the RB data with M the maximal sequence length. The Hankel matrix for 1 ≤ L < M is given by

$$\operatorname{Hankel}_L(y) = \begin{pmatrix} y_0 & y_1 & \cdots & y_{M-L} \\ y_1 & y_2 & \cdots & y_{M-L+1} \\ \vdots & \vdots & \ddots & \vdots \\ y_L & y_{L+1} & \cdots & y_M \end{pmatrix}. \tag{145}$$

We denote the Vandermonde matrix of size n × M for poles $z = (z_1, \dots, z_n)$ by

$$W_M(z) = W_M(z_1, \dots, z_n) = \begin{pmatrix} 1 & z_1 & z_1^2 & \cdots & z_1^{M-1} \\ 1 & z_2 & z_2^2 & \cdots & z_2^{M-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & z_n & z_n^2 & \cdots & z_n^{M-1} \end{pmatrix}. \tag{146}$$

If n = 1, and thus $z \in \mathbb{C}$, we refer to $W_M(z)$ as the Vandermonde vector of length M and pole z.

With this notation, the data vector y, without noise, is in the range of $W_M(z)^T$. Furthermore, cyclically shifting the entries of y amounts to multiplication of the summands with the respective poles. In effect, the Hankel matrix has a Vandermonde decomposition

$$\operatorname{Hankel}_L(y) = W_L^T(z)\, \operatorname{diag}(a)\, W_{M-L}(z) + \operatorname{Hankel}_L(\alpha), \tag{147}$$

where we denote by α the deviation of y from an ideal linear combination of exponentials due to the perturbative error at each sequence length m and finite statistics, and where a is the vector of prefactors given in Eq. (143).

To identify the signal subspace and distinguish it from the noise subspace, the MUSIC and ESPRIT algorithms employ a singular value decomposition (SVD) of the Hankel matrix, $\operatorname{Hankel}_L(y) = U \Sigma V^T$. In the absence of noise and perturbation, i.e., α = 0, $\operatorname{Hankel}_L(y)$ has n nonvanishing singular values and the corresponding singular vectors form an orthonormal basis of the signal space span $W_M^T(z)$. Let $U_{\mathrm{signal}}$ be the matrix consisting of the singular vectors of the nontrivial singular values as columns and let $U_{\mathrm{noise}}$ be the matrix consisting of an orthonormal basis of the complement. It is convenient to define associated noise-space ($P_{\mathrm{noise}}$) and signal-space ($P_{\mathrm{signal}}$) projectors as $P_{\mathrm{noise}} = U_{\mathrm{noise}} U_{\mathrm{noise}}^\dagger$ and $P_{\mathrm{signal}} = U_{\mathrm{signal}} U_{\mathrm{signal}}^\dagger$. In the presence of noise, analogously choosing the singular vectors of the n largest singular values yields an estimate of the signal space.

From the noise-space projector $P_{\mathrm{noise}} = U_{\mathrm{noise}} U_{\mathrm{noise}}^\dagger$, the MUSIC algorithm defines the inverse noise-space correlation function $R^{-1}_{\mathrm{noise}} : \mathbb{C} \to \mathbb{R}$,

[defined similarly to Eq. (148) but using $P_{\mathrm{signal}}$] has been derived for poles z of unit absolute value (sinusoids). The argument, however, holds verbatim for all $z \in \mathbb{C}^n$.

Theorem 11: (Noise-correlation function bound [82], Proposition 4.2). Let $E = \operatorname{Hankel}_L(\alpha)$ denote the Hankel matrix of the perturbation and noise of the signal vector y. Let $\varepsilon_{\min}$ be the smallest singular value of the Hankel matrix of the noise-free signal. Suppose L ≥ n, M − L + 1 ≥ n and $2\|E\|_\infty < \varepsilon_{\min}$. Then,

$$|R_{\mathrm{noise}}(z) - R_{\mathrm{signal}}(z)| \le \frac{2\|E\|_\infty}{\varepsilon_{\min}} \tag{151}$$

for all $z \in \mathbb{C}$.
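The subspace strategy described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the numerical results: it builds the Hankel matrix of Eq. (145), takes the n leading left singular vectors as the signal subspace, and applies the textbook ESPRIT step, which reads the poles off the shift invariance of that subspace. The function names `hankel_matrix` and `esprit_poles`, the default L = len(y) // 2, and the two test poles 0.9 and 0.97 are illustrative choices.

```python
import numpy as np

def hankel_matrix(y, L):
    """Hankel matrix with L + 1 rows and entries H[j, k] = y[j + k], cf. Eq. (145)."""
    y = np.asarray(y, dtype=float)
    cols = len(y) - L
    return np.array([y[j:j + cols] for j in range(L + 1)])

def esprit_poles(y, n, L=None):
    """Estimate n poles from equidistant data with textbook ESPRIT (a sketch)."""
    if L is None:
        L = len(y) // 2  # illustrative default: roughly square Hankel matrix
    H = hankel_matrix(y, L)
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    U_signal = U[:, :n]  # singular vectors of the n largest singular values
    # Shift invariance: the signal basis without its first row equals the basis
    # without its last row times a matrix whose eigenvalues are the poles.
    Psi = np.linalg.pinv(U_signal[:-1]) @ U_signal[1:]
    return np.sort_complex(np.linalg.eigvals(Psi)), s

# Noise-free toy data: two real decays with equal prefactors (assumed values).
m = np.arange(41)
y = 0.5 * 0.9**m + 0.5 * 0.97**m
poles, singular_values = esprit_poles(y, n=2)
print(poles.real)
print(singular_values[:4])
```

On noise-free data the Hankel matrix has numerical rank n, so the singular values beyond the second are at the level of floating-point error. With sampling noise they become comparable to the spectral norm of the noise Hankel matrix, which is the quantity entering Theorem 11.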
In the context of RB, we are conversely interested in Most interesting in our context is the asymptotic scaling
poles that are on the real line. A more general characteri- in the limit of large maximal sequence length M , for poles
zation of the conditioning of Vandermonde matrices with inside the unit disc |zi | < 1 for all i. In this limit, the above
poles inside the unit circle (allowing for decays beyond bounds become tight and the following holds true.
oscillations) has been studied in Ref. [86]. The condition-
ing obviously depends on the set of poles z and the size Lemma 13: (Asymptotics of condition number [86],
M of the Vandermonde matrix. To state the result given in Lemma 8). Let z = (z1 , . . . , zn ) ∈ Cn with |zi | < 1 for all
Ref. [86] we define several quantities. To the set of poles i ∈ [n]. Define C(z) ∈ Cn×n as the matrix with entries
z = (z1 , . . . , zn ), we associate ž := maxj |zj |, ẑ := minj |zj |
and z̈ := minj =k |zj − zk |. Furthermore, let us define 1
Ci,j (z) = . (157)
1 − zi z̄j
[WM (z)WM (z)]−1/2 .
†
QM (z) = (152)
† Then,
Note that WM (z)WM (z) is the frame operator of the frame
defined by the rows of the Vandermonde matrix and %
lim κ2 [WM (z)] = κ2 [C(z)]. (158)
QM (z) is the orthogonalizing matrix arising in symmetric M →∞
orthogonalization. With the help of QM (z), we define the
matrix Later in this section we use this bound to perform
numerical investigations of the resolving power of the
FM (z) := QM (z) diag(z)Q−1
M (z), (153) MUSIC and ESPRIT algorithms and to give a sampling
complexity bound for general RB.
which will play a prominent role for analyzing the Van-
dermonde conditioning. In particular, its departure from
5. Extensions of the algorithms
normality as measured by D2 [FM (z)] = FM (z)2F − z22
will appear.
In Ref. [86] a bound is derived a. Incomplete data or logarithmic grids. So far the pre-
for the 2-norm condition
number κ2 (WM ) = WM ∞ W+
M ∞ through the bound-
sented algorithms and analysis relied on having an equidis-
ing of the
Frobenius
norm condition number κF (WM ) = tant grid of sequence length. It is well known that a
WM 2 W+
M 2 . Here X
+
denotes the (Moore-Penrose) low-rank matrix can under fairly general assumptions be
pseudoinverse of a matrix X . The condition number of completed from the knowledge of just a subset of their
a linear map A gives a worst-case bound on the relative entries [87]. Thus, given only data ym for values m on
reconstruction error in 2 norm induced by an additive an irregular subset regular grid, one can attempt at com-
error in 2 norm for a linear inverse problem. But here we pleting the Hankel matrix for the regular grid using a
are more concerned with how it enters into the accuracy of low-rank matrix completion algorithm. This preprocessing
identifying poles in the MUSIC and ESPRIT algorithms. step can be combined with MUSIC or ESPRIT to arrive at
For the analysis of the MUSIC and ESPRIT algorithm, we pole-finding algorithms that do not rely on complete data
−1
want to upper bound the minimum singular value εmin . By from an equidistant grid [80]. In particular, we suspect
means of the Vandermonde decomposition (147) and the that for exponential decays a logarithmic grid can poten-
−1
+ + of−1the spectral norm, we have εmin ≤
submultiplicativity tially yield improved recovery similar to the multiplicative
W
M −L ∞ WL ∞ ẑ . Since WM ∞ ≥ 1, we conclude
error bounds for the fitting of single exponentials derived
that in Ref. [74], but we leave formally verifying this to future
work.
−1
εmin ≤ κ2 (WM −L )κ2 (WL )ẑ −1 . (154)
For the condition number the following bound holds.
b. Generalization of ESPRIT to matrix exponentials.
Theorem 12: (Conditioning of Vandermonde matrices References [84,85] have generalized the ESPRIT algorithm
[86], Theorem 6). For M > n ≥ 2, for a Vandermonde to signal spaces spanned by products of falling polynomials
matrix WM (z), it holds that and exponentials. This is exactly the signal model,
Eq. (143), that we encountered for RB output data,
ε1 [FM (z)] 1! % "
when the reference representation has multiplicities.
≤ κ2 [WM (z)] ≤ ρ + ρ2 − 4 (155)
ž 2 The key insight in this generalization is that the
with Hankel matrix of such signals admits a decompo-
sition analogous to the Vandermonde decomposition
n−1
2 φL (ž) (147) in terms of Pascal-Vandermonde matrices. These
D2 [FM (z)]
ρ =n 1+ 2
− n + 2. (156) Pascal-Vandermonde matrices feature the same rotational
(n − 1)z̈ 2 φL (ẑ)
2 invariance property underlying the ESPRIT algorithm.
Thus, one can show that when applying the standard ESPRIT algorithm to data of this form, the vector of eigenvalues of the matrix is still the vector of poles z with the eigenvalues appearing in multiplicities according to the maximal degree of the associated falling polynomial. Hence, ESPRIT can be directly applied to estimate matrix-exponential data series. Noise in the signal will generically break the degeneracy of the eigenvalue spectrum, corresponding to the fact that a generic matrix has nondegenerate eigenvalues. Searching for regular polygons of poles allows for matching groups of perturbed poles corresponding to the same unperturbed pole. We refer to Refs. [84,85] for further details.

C. Randomized benchmarking sampling complexity—estimation of the Hankel matrix

The performance bounds on the pole-finding algorithms, such as Theorem 11, depend on the deviation of the Hankel matrix from ideal data in spectral norm. In RB protocols this error has two contributions:

1. The finite sampling statistics of the measurements, which yields a statistical error of the mean estimator $\hat{p}(m)$.
2. The perturbative error that comes from neglecting subdominant eigenvalues, which is controlled by our Theorems 8, 9, and 10.

For the finite sampling error, we provide the following bound. To this end, we model the individual measurement performed during the RB protocol by a random variable $\hat{Y}_m$. To simplify the notation in the proof, we assume that the number of different sequence lengths is even and use a square Hankel matrix.

Lemma 14: (Statistical estimation). Let M be even and L = M/2. For m ∈ [M], let $\hat{Y}_m$ be a random variable taking values in [0, 1] with $\operatorname{Var}[\hat{Y}_m] \le \varepsilon^2$. Furthermore, let $\hat{p}(m) = \frac{1}{N}\sum_{i=1}^{N} \hat{Y}_m^{(i)}$ be the corresponding mean estimator of N independent identically distributed (IID) copies $\hat{Y}_m^{(i)}$ of $\hat{Y}_m$. We denote with $\operatorname{Hankel}_L(\hat{p})$ the Hankel matrix of the vector $\hat{p} = [\hat{p}(m)]_{m \in [M]} \in \mathbb{R}^M$. Then,

$$\|\operatorname{Hankel}_L(\hat{p}) - \mathbb{E}\operatorname{Hankel}_L(\hat{p})\|_\infty \le \tilde{\epsilon} \tag{159}$$

with probability 1 − δ provided that

$$N \ge 4 \max\left\{\frac{M\varepsilon^2}{\tilde{\epsilon}^2}, \frac{2}{3\tilde{\epsilon}}\right\} \log\frac{M}{\delta}. \tag{160}$$

Combining Lemma 14 with the performance bound for MUSIC, Theorem 11, and Eq. (154) we can state the following result for the overall sampling complexity of randomized benchmarking experiments.

Corollary 15: (Sampling complexity). Let M be even and L = M/2, and let $z = (z_i)_{i=1}^{n}$ be a set of poles. For m ∈ [M] let $\hat{p}(m)$ be the mean estimator of IID copies of random variables with variance bounded by $\varepsilon^2$. Choose $\tilde{\epsilon}, \delta > 0$. Provided that the total number of random trials satisfies

$$N_{\mathrm{total}} \ge 8 \kappa_2^4[W_{M/2}(z)]\, \hat{z}^{-2}\, \frac{M\varepsilon^2}{\tilde{\epsilon}^2} \log\frac{M}{\delta} \tag{161}$$

and

$$N_{\mathrm{total}} \ge \frac{16}{3} \kappa_2^2[W_{M/2}(z)]\, \hat{z}^{-1}\, \frac{1}{\tilde{\epsilon}} \log\frac{M}{\delta}, \tag{162}$$

for the noise-space correlation function (148) defined by the MUSIC algorithm with input data $\hat{p}$ it holds that $|R_{\mathrm{noise}}(z) - R_{\mathrm{signal}}(z)| \le \tilde{\epsilon}$ with probability at least 1 − δ.

We state this bound in terms of the condition number of the Vandermonde matrix, which allows us to make analytic claims about the behavior of the sampling complexity in various regimes. However, one can state an equivalent bound in terms of the smallest singular value, which will often be significantly smaller. It is, however, difficult to work with analytically.

For the application of Corollary 15 to RB data processing, one has to additionally control the perturbative error appearing in Theorems 8, 9, and 10. The perturbative error per RB data point, see, e.g., Eq. (73), yields an additive error in the noise correlation function of order of $M \hat{z}^{-1} \kappa_2^2[W_{M/2}(z)]$. The scaling with M originates from the spectral norm of the Hankel matrix and the factor of $\hat{z}^{-1} \kappa_2^2[W_{M/2}(z)]$ captures the noise enhancement.

Lemma 14 follows from the matrix Bernstein bound [88,89] that requires us to control the spectral norm and matrix variance statistics in order to provide a tail bound for sums of matrices. We follow the same strategy as presented in Ref. [88] for Toeplitz matrices.

Proof of Lemma 14. With the help of the L × L exchange matrix

$$J_{i,j} = \begin{cases} 1 & j = L - i + 1 \\ 0 & \text{else} \end{cases} \tag{163}$$

and the L × L (noncyclic) shift matrix X that has ones on its first upper off-diagonal and zeros everywhere else we can write

$$\operatorname{Hankel}_L(\hat{p}) = \sum_{k=-L+1}^{L-1} \hat{p}_k X^k J, \tag{164}$$

where we identify the elements of $\hat{p}$ cyclically. We define

$$S_k^{(i)} := \frac{1}{N}\left(\hat{Y}_k^{(i)} - \mathbb{E}[\hat{Y}_k^{(i)}]\right) X^k J \tag{165}$$
such that $\operatorname{Hankel}_L(\hat{p}) - \mathbb{E}\operatorname{Hankel}_L(\hat{p}) = \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} S_k^{(i)}$ is the sum of the random matrices $S_k^{(i)}$. Since

$$\|X\|_\infty = \|J\|_\infty = 1 \tag{166}$$

and $\hat{Y}_k$ takes values in [0, 1], we have that

$$\|S_k^{(i)}\|_\infty \le 2/N \tag{167}$$

for all i, k. For the matrix variance we calculate that

$$\sum_{k=-L+1}^{L-1} \mathbb{E}[S_k^{(i)} (S_k^{(i)})^\dagger] = \frac{1}{N^2} \sum_{k=-L+1}^{L-1} \operatorname{Var}[\hat{Y}_k]\, X^k X^{-k} = \frac{1}{N^2} \sum_{k=-L+1}^{L-1} \operatorname{Var}[\hat{Y}_k]\, P_k, \tag{168}$$

with $P_k$ a diagonal projector having k ones on the diagonal and zeros everywhere else. One finds the same structure for $\sum_{k=-L+1}^{L-1} \mathbb{E}[(S_k^{(i)})^\dagger S_k^{(i)}]$ analogously. By the assumption of the lemma $\operatorname{Var}[\hat{Y}_k] \le \varepsilon^2$. Therefore, the matrix variance statistics is dominated as

$$\max\left\{ \Big\| \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} \mathbb{E}(S_k S_k^\dagger) \Big\|_\infty, \Big\| \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} \mathbb{E}(S_k^\dagger S_k) \Big\|_\infty \right\} \le \frac{M\varepsilon^2}{N}. \tag{169}$$

The matrix Bernstein inequality [88] yields

$$\mathbb{P}\left[ \Big\| \sum_{i=1}^{N} \sum_{k=-L+1}^{L-1} S_k^{(i)} \Big\|_\infty \ge \tilde{\epsilon} \right] \le M \exp\left( -\min\left\{ \frac{N\tilde{\epsilon}^2}{4M\varepsilon^2}, \frac{3N\tilde{\epsilon}}{8} \right\} \right). \tag{170}$$

Requiring the right-hand side to be dominated by δ and solving for N yields the lemma’s assertion.

D. Vandermonde conditioning for randomized benchmarking decays

The noise-enhancement factor in the performance guarantee for the tone-finding algorithms MUSIC and ESPRIT is given by the inverse of the minimum singular value $\varepsilon_{\min}^{-1}$ of the Hankel matrix of the ideal, noise-free signal. This minimum singular value, Eq. (154), is in turn controlled by the minimal absolute value of the poles and the conditioning of the Vandermonde matrix $W_L(z)$ associated with the poles and the signal length. Here we numerically investigate this conditioning in various scenarios relevant to RB. We express all data in terms of the dimension of the Hankel matrix L, which one can generally take as being about half of the maximal sequence length M.

When the RB data model is described by many poles that are close in value the noise enhancement due to bad conditioning can be the limiting factor rendering the extraction of poles infeasible.

Increasing the sequence length improves the conditioning of $W_L(z)$, see Fig. 3. But Theorem 12 shows that the condition number of $W_L(z)$ is even in the asymptotic limit $W_\infty(z)$ for large L bounded away from zero. Thus, increasing the length of observed RB series only improves the conditioning up to a certain point.
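The saturation of the conditioning with growing L can be checked numerically. The sketch below compares the 2-norm condition number of the finite Vandermonde matrix of Eq. (146) with the asymptotic value from Lemma 13, i.e. the square root of the condition number of the matrix with entries $1/(1 - z_i \bar{z}_j)$. The function names and the example poles (0.9, 0.95) are illustrative assumptions and are not taken from the paper's numerics.

```python
import numpy as np

def vandermonde(z, L):
    """n x L Vandermonde matrix with rows (1, z_i, ..., z_i^(L-1)), cf. Eq. (146)."""
    z = np.asarray(z, dtype=complex)
    return z[:, None] ** np.arange(L)[None, :]

def kappa_finite(z, L):
    """2-norm condition number of the finite Vandermonde matrix W_L(z)."""
    return np.linalg.cond(vandermonde(z, L), 2)

def kappa_asymptotic(z):
    """Large-L limit of the condition number for |z_i| < 1, following Lemma 13."""
    z = np.asarray(z, dtype=complex)
    C = 1.0 / (1.0 - np.outer(z, z.conj()))  # C_ij = 1 / (1 - z_i * conj(z_j))
    return np.sqrt(np.linalg.cond(C, 2))

poles = [0.9, 0.95]  # assumed example poles
for L in (5, 20, 100, 2000):
    print(L, kappa_finite(poles, L))
print("limit", kappa_asymptotic(poles))
```

For these two poles the finite-L values approach the asymptotic one from above: short sequences are conditioned noticeably worse, while going beyond a few hundred sequence lengths changes the condition number very little.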
FIG. 4. Here we show the dependency of the condition number of the Vandermonde matrix on the spacing of two poles $z_0$, $z_1$, for infinite sequence length. We see that the conditioning depends drastically on the distance between the two poles, but not on the absolute location of the poles on the real line. The orange line at $10^2$ is added for the purpose of comparison.
The explicit expressions of the upper and lower bounds on the condition number in Theorem 12 have a rather complicated dependency on the geometrical constellation of the poles. One can argue that for RB data with poles on the real line there are, roughly speaking, two effects coming into play: (1) the spacing of the poles and (2) the number of poles.
To illustrate the dependency on the spacing of the poles, we numerically evaluate $\kappa_2[W_\infty(z)]$ for different pairs of poles as they might appear in RB data. The result is shown in Fig. 4. The first pole is chosen to deviate from 1 by a value $r \in \{10^{-2}, 10^{-3}, 10^{-4}\}$; the second pole is chosen at different values around the first one. We observe that as both poles move together the condition number diverges. Importantly, the size of the interval in which the condition number grows over a certain threshold scales with $r$. Correspondingly, we expect that poles closer to 1 can still be resolved with a smaller spacing compared to poles that deviate considerably from 1.
Secondly, even if the poles are spaced such that the ratio of the departure from normality and the minimum spacing is fixed, the upper bound in Theorem 12 exhibits an exponential dependency on the number of poles. We numerically evaluate this dependency for different families of poles that each define a set of poles for every fixed cardinality, see Table I. These families include linearly spaced poles within the interval $(\alpha, 1)$ and the pole families $F_a(n) = (z_i = 1 - 10^{-i/a} \mid i \in [n])$ for positive real $a$. For example, $F_1(n) = (.9, .99, .999, \ldots)$, which can be regarded as featuring exponentially spaced “infidelities”. Figure 5 depicts the dependency of $\kappa_2[W_\infty(z)]$ on the number of poles $n$ for different families. We find that, due to a typically exponential dependency, the conditioning indicates that the reconstruction of multiple poles becomes demanding already for small numbers $n$.
Note that the conditioning is significantly improved if the poles are not exclusively on the real line but also have nonvanishing imaginary parts. Such pole sets arise, for example, in the RB variant of Ref. [9] focusing on individual gates.
E. Performance evaluation
After collecting evidence that the reconstruction of multiple poles quickly becomes a demanding task, we here show that for moderate configurations (i.e., not too many poles, not too close together) the ESPRIT algorithm is suitable for the postprocessing of RB data. To this end, we implement the ESPRIT algorithm in Python. For a fixed set of poles the ideal data series (constructed from
n 2 4 6
Lin. α = .9 (0.9, 0.95) (0.9, 0.925, 0.95, 0.975) (0.9, 0.9167, 0.9333, 0.95, 0.9667, 0.9833)
Lin. α = .5 (0.5, 0.75) (0.5, 0.625, 0.75, 0.875) (0.5, 0.5833, 0.6667, 0.75, 0.8333, 0.9167)
F1 (.9, .99) (.9, .99, .999, .9999) (.9, .99, .999, .9999, .99999, .999999)
F2 (0.9, 0.9684) (0.9, 0.9684, 0.99, 0.9968) (0.9, 0.9684, 0.99, 0.9968, 0.999, 0.9997)
GENERAL FRAMEWORK FOR RANDOMIZED BENCHMARKING PRX QUANTUM 3, 020357 (2022)
the poles and a fixed identical prefactor) is made noisy by randomly sampling binomial distributions. This simulates the random noise due to finite statistics for a certain number of samples per sequence length. Subsequently, the set of poles is reconstructed from the noisy data using the ESPRIT algorithm. We compare the reconstructed set of poles with the ideal set of poles using the symmetric Hausdorff distance. Let $z \in \mathbb{C}^n$ and $z' \in \mathbb{C}^{n'}$; then
$$d_H(z, z') = \max\{dd_H(z'; z),\, dd_H(z; z')\}, \qquad dd_H(z, z') = \max_{k\in[n]}\,\min_{k'\in[n']} |z_k - z'_{k'}|. \tag{171}$$
Figure 6 displays the mean Hausdorff distance for different numbers of samples. Each data point is averaged over 100 repetitions. Figure 7 depicts the mean Hausdorff distance for different numbers of samples and maximal
FIG. 6. Mean Hausdorff distance between the real set of poles and the reconstructed set of poles (via ESPRIT) for different families of poles (as defined in Table I) and Hankel dimension L (∝ maximal RB sequence length M) versus the number of samples used per expectation value estimation. Each data point is averaged over 100 repetitions. For all families we see that the reconstruction essentially fails until a sampling threshold is reached; beyond this threshold the accuracy of the estimation increases rapidly with the number of samples. This threshold increases strongly with the number of poles in the family across all families and also depends on the maximal sequence length. This latter dependence is mediated by the actual locations of the poles in the complex plane, as expected.
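The data-generation and comparison steps of this numerical experiment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind Figs. 6 and 7: the helper names `pole_family`, `noisy_decay`, and `hausdorff`, the identical prefactor 1/n, and the clipping step are choices made here so that the ideal signal is a valid probability. The ESPRIT reconstruction step itself is not shown.

```python
import numpy as np

def pole_family(a, n):
    """Pole family F_a(n) = (1 - 10**(-i/a) for i = 1, ..., n), as defined in the text."""
    i = np.arange(1, n + 1)
    return 1.0 - 10.0 ** (-i / a)

def noisy_decay(poles, lengths, shots, rng):
    """Ideal series sum_k z_k**m / n, replaced by binomial frequencies from `shots` samples."""
    poles = np.asarray(poles, dtype=float)
    lengths = np.asarray(lengths)
    # Identical prefactor 1/n (an assumption) keeps the ideal signal inside [0, 1].
    ideal = (poles[None, :] ** lengths[:, None]).mean(axis=1)
    ideal = np.clip(ideal, 0.0, 1.0)
    return rng.binomial(shots, ideal) / shots

def hausdorff(z, z_prime):
    """Symmetric Hausdorff distance of Eq. (171) between two finite sets of poles."""
    z = np.asarray(z, dtype=complex)
    z_prime = np.asarray(z_prime, dtype=complex)
    dist = np.abs(z[:, None] - z_prime[None, :])
    # Directed distances: max over one set of the min distance to the other set.
    return max(dist.min(axis=1).max(), dist.min(axis=0).max())

rng = np.random.default_rng(0)
poles = pole_family(1, 2)                      # the poles (0.9, 0.99)
data = noisy_decay(poles, np.arange(1, 51), shots=1000, rng=rng)
print(hausdorff(poles, [0.9, 0.98]))           # distance to a perturbed pole set
```

In a full experiment, `data` would be passed to a pole-reconstruction routine and the returned poles compared with `poles` via `hausdorff`, averaged over repetitions.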
J. HELSEN et al. PRX QUANTUM 3, 020357 (2022)
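λ-filtered RB output data:
$$k_\lambda(m) = \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-1}\,\alpha_\lambda(i, g_{\mathrm{end}})\,p(i, m, g_{\mathrm{end}}), \tag{173}$$
where the normalization constant is given by
$$N_\lambda = \frac{1}{|G|}\sum_{g\in G}\sum_{i\in I} \alpha_\lambda(i, g)\,\langle\!\langle \Pi_i|\omega(g)|\rho_0\rangle\!\rangle. \tag{174}$$
One can think of this quantity as measuring the presence of the subrepresentation $\sigma_\lambda$ in the data $p(i, g_{\mathrm{end}}, m)$. We make this more precise in the following theorem.
Theorem 16: (Measuring subrepresentations in the data). Let $G$ be a finite group and $\omega : G \to \mathcal{S}_d$ a reference representation of $G$ with decomposition $\omega = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$. Moreover, let $\phi$ be an implementation of $\omega$ for which Theorem 8 holds. For a fixed $\lambda \in \Lambda$ consider the λ-filtered data $k_\lambda(m)$ as defined in Eq. (173). As a function of m we
Proof. We know from Theorem 8 that
$$\Big|p(i, m, g_{\mathrm{end}}) - \sum_{\lambda'\in\Lambda}\mathrm{Tr}(A_{\lambda'} M_{\lambda'}^m)\Big| \le 8\left[\delta\left(1 + \frac{2\delta}{1 - 5\delta}\right)\right]^m, \tag{176}$$
with $A_{\lambda'}$ given in Eq. (101). From the definition of $k_\lambda(m)$, we can thus compute
$$k_\lambda(m) = \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-1}\,\alpha_\lambda(i, g_{\mathrm{end}}) \sum_{\lambda'\in\Lambda}\mathrm{Tr}(A_{\lambda'} M_{\lambda'}^m) \tag{177}$$
$$\quad + \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-1}\,\alpha_\lambda(i, g_{\mathrm{end}}) \Big[p(i, g_{\mathrm{end}}, m) - \sum_{\lambda'\in\Lambda}\mathrm{Tr}(A_{\lambda'} M_{\lambda'}^m)\Big]. \tag{178}$$
Considering only the first term, and inserting the definition of $\alpha_\lambda(i, g_{\mathrm{end}})$, we are interested in the SPAM operator quantity
$$B_{\lambda,\lambda'} = \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} \langle\!\langle \Pi_i|P_\lambda\,\omega(g_{\mathrm{end}})|\rho_0\rangle\!\rangle\, A_{\lambda'} \tag{179}$$
where $P_\lambda^j$ is the projector onto the $j$'th copy of $\sigma_\lambda$ in the reference representation $\omega$ and $R_\lambda$, $L_{\lambda 1}$ encode the deviation of $\phi$ from $\omega$ (their precise shape is not relevant for our argument). By linearity, we can now consider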
[Bλ,λ ]j ,j (181)
1
[Pλ ω(gend −1 )] ⊗ TrVσ [σ λ (gend −1 ) ⊗ 1]Rλ1 F(Pλ ωPλ )Lλ1 † |ρ0 ⊗ ESP (ρ0 )
j j
= dσλ i ⊗ EM ( i )|
i∈I
|G| g ∈G λ
end
(182)
δλ,λ [F (ω)[σλ ] ⊗ 1]1 ⊗ [Rλ1 F(Pλ ωPλ )Lλ1 † ] |ρ0 ⊗ ESP (ρ0 ) ,
j j
= dσλ i ⊗ EM ( i )| TrVσ (183)
λ
i∈I
which is the Fourier transform analog of the orthogonality of characters of irreducible representations. Hence $B_{\lambda,\lambda'} = \delta_{\lambda,\lambda'} B_{\lambda,\lambda}$, and we write $B_\lambda := B_{\lambda,\lambda}$.
Plugging this back into the expression for $k_\lambda$ we get
$$k_\lambda(m) = \mathrm{Tr}(B_\lambda M_\lambda^m) + \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-1}\,\alpha_\lambda(i, g_{\mathrm{end}}) \Big[\sum_{\lambda'\in\Lambda}\mathrm{Tr}[A(\Pi_i, g_{\mathrm{end}})_{\lambda'} M_{\lambda'}^m] - p(i, g_{\mathrm{end}}, m)\Big]. \tag{185}$$
We can thus upper bound the difference $k_\lambda(m) - \mathrm{Tr}(B_\lambda M_\lambda^m)$ by considering the magnitude of the difference term. Note that we know from Theorem 8 that $\big|\sum_{\lambda'\in\Lambda}\mathrm{Tr}[A(\Pi_i, g_{\mathrm{end}})_{\lambda'} M_{\lambda'}^m] - p(i, m, g_{\mathrm{end}})\big| \le O(\delta^m)$. It follows that there exists a $K$ such that
$$\Big|\frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-1}\,\alpha_\lambda(i, g_{\mathrm{end}}) \Big[\sum_{\lambda'\in\Lambda}\mathrm{Tr}[A(\Pi_i, g_{\mathrm{end}})_{\lambda'} M_{\lambda'}^m] - p(i, m, g_{\mathrm{end}})\Big]\Big| \le 8K\left[\delta\left(1 + \frac{2\delta}{1 - 5\delta}\right)\right]^m. \tag{186}$$
Hence, the λ-filtered output data has essentially the same behavior as regular RB data, except that only the Fourier mode associated with $\sigma_\lambda$ is included in the signal. One can think of the λ filter function $\alpha_\lambda$ as placing a δ-peak filter function centered on the “frequency” $\sigma_\lambda$. Note that by linearity we get essentially the same result if one defines a filter function associated with nonirreducible representations (via a direct sum of irreducible representations). This can be thought of as placing a frequency comb on the RB data. Finally, it is interesting to explicitly write down the form of the SPAM matrix $B_\lambda$ in the limit of no SPAM and perfect gates. In the case of a multiplicity-free reference representation $\omega$ we have
$$B_\lambda = N_\lambda^{-1}\sum_{i\in I} \langle\!\langle \Pi_i^{\otimes 2}|\,\mathcal{F}(\omega)[\sigma_\lambda]\,|\rho_0^{\otimes 2}\rangle\!\rangle, \tag{187}$$
which emphasizes the importance of the normalization constant (on which more later), but also the importance of choosing $\rho_0$ and $\{\Pi_i\}_{i\in I}$ such that $B_\lambda$ is nonzero.
B. Statistical estimation
When computing the filtered output data $k_\lambda(m)$ in the previous section we assumed we had access to the RB output data $p(i, g_{\mathrm{end}}, m)$ for all $i \in I$ and $g_{\mathrm{end}} \in G$. This is not realistic since both the size of the POVM $\{\Pi_i\}_{i\in I}$ and the size of the group $|G|$ can be exponential in the number of qubits. In practice, we need to construct a statistical estimator $\hat{k}_\lambda$ for $k_\lambda$, and argue that $\hat{k}_\lambda$ is a good approximation for a reasonable number of samples. This we do in this section.
Note that the normalization factor $N_\lambda$ is essential in lower bounding the magnitude of the filtered function $k_\lambda$ (i.e., making sure that the number $k_\lambda$ is not too small). However, this normalization factor can be proportional to the Hilbert-space dimension $d$, making it tricky to set up an estimator for $k_\lambda$ that has a sampling complexity that does not grow with $d$ (which would make sampling practically impossible for more than a few qubits). This is the task we turn to now. We can construct an estimator for $k_\lambda(m)$ essentially directly from its definition.
It is easy to see that the mean of this estimator is equal to the λ-filtered output data $k_\lambda(m)$. However, this does not mean that the associated estimation procedure is efficient. A priori the variance of the estimator could scale with Hilbert-space dimension $d$, since the magnitude of the filter function $N_\lambda^{-1}\alpha_\lambda$ does so in general. We cannot prove that this estimator is efficient for all groups $G$ and POVMs $\{\Pi_i\}_{i\in I}$. We can, however, make some partial statements. In particular, we can prove that the estimator is efficient as long as the POVM $\{\Pi_i\}_{i\in I}$ is generated by a
Algorithm 1
$$\hat{k}_\lambda(m) = \frac{1}{L}\sum_{l=1}^{L} N_\lambda^{-1}\,\alpha_\lambda(i, g_{\mathrm{end}}^{\,l})\,f_i(g_{\mathrm{end}}^{\,l}) \tag{188}$$
3-design. This is a restrictive condition, but not impossible to fulfill. We discuss how to implement such a POVM after stating and proving the following theorem, which essentially states that under the 3-design condition, the variance of the estimator $\hat{k}_\lambda(m)$ does not scale with the Hilbert-space dimension $d$. This means that the sampling resources required by the protocol do not depend on the number of qubits in the system, making the postprocessing step scalable (at least with respect to sampling). We note that this theorem gives an extremely crude bound on the variance, and the actual variance is liable to be substantially smaller. For simplicity, we assume that there is no SPAM or gate noise, but the conclusions made here easily generalize.
Theorem 17: (Efficient estimators). Consider a uniform RB experiment of sequence length $m$, with group $G$, reference representation $\omega$, measurement POVM $\{\Pi_i\}_{i\in I}$, and initial state $\rho_0$, and further assume that the POVM $\{\Pi_i\}_{i\in I}$ is an (exact) 3-design, that is, $\Pi_i = (d/|I|)\,|\chi_i\rangle\langle\chi_i|$ with states $|\chi_i\rangle$ and $\frac{1}{|I|}\sum_{i\in I} |\chi_i\rangle\langle\chi_i|^{\otimes 3} = \int d\psi\, |\psi\rangle\langle\psi|^{\otimes 3}$. Then for all $\lambda \in \Lambda$ the variance of the estimator $\hat{k}_\lambda(m)$ is asymptotically independent of the Hilbert-space dimension $d$.
Proof. First we calculate the effect of the 3-design condition on the normalization factor of the correlation function $\alpha(i, \cdot)$. By direct calculation we have
$$N_\lambda = \frac{1}{|G|}\sum_{g\in G}\sum_{i\in I} \alpha_\lambda(i, g)\,\langle\!\langle \Pi_i|\omega(g)|\rho_0\rangle\!\rangle, \tag{189}$$
$$= \frac{d^2}{|I|}\frac{1}{|G|}\sum_{g\in G}\int d\psi\, \langle\!\langle \psi^{\otimes 2}|\,\omega(g)^{\otimes 2}\,(P_\lambda \otimes 1)\,|\rho_0^{\otimes 2}\rangle\!\rangle, \tag{190}$$
$$= \frac{d^2}{|I|}\left[\frac{1}{d^2 - 1}\mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] + \frac{\mathrm{Tr}[P_\lambda(\rho_0)]\,\mathrm{Tr}(\rho_0)}{d^2}\right], \tag{191}$$
$$= \frac{1}{|I|}\left[\frac{d^2}{d^2 - 1}\mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] + \mathrm{Tr}[P_\lambda(\rho_0)]\right], \tag{192}$$
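where we use the fact that the Haar measure is invariant under unitary action to absorb the $\omega(g)$ dependence, as well as a standard formula for the second moment of a Haar average over the unitary group, see, e.g., Ref. [55, Proposition 37] or Ref. [54] [and that $\mathrm{Tr}(\rho_0) = 1$]. We can now calculate the variance. We denote by $\hat{k}_\lambda(m, g_{\mathrm{end}})$ the estimator of $\sum_{i\in I} N_\lambda^{-1}\alpha(i, g_{\mathrm{end}})\,p(i, m, g_{\mathrm{end}})$ for a fixed $g_{\mathrm{end}} \in G$. By the law of total variance we can write
$$\mathbb{V}[\hat{k}_\lambda(m)] = \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G} \mathbb{V}[\hat{k}_\lambda(m, g_{\mathrm{end}})] + \mathbb{V}_G\Big[\sum_{i\in I} N_\lambda^{-1}\alpha(i, g_{\mathrm{end}})\,p(i, m, g_{\mathrm{end}})\Big] \tag{193}$$
$$\le \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-2}\alpha(i, g_{\mathrm{end}})^2\,p(i, m, g_{\mathrm{end}}) + \frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\Big[\sum_{i\in I} N_\lambda^{-1}\alpha(i, g_{\mathrm{end}})\,p(i, m, g_{\mathrm{end}})\Big]^2, \tag{194}$$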
by dropping the negative terms in the variances. We begin with calculating the second term. For this note that for all
gend ∈ G (again using the invariance of the Haar measure):
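$$\sum_{i\in I} N_\lambda^{-1}\alpha(i, g_{\mathrm{end}})\,p(i, m, g_{\mathrm{end}}) = |I|\left[\frac{d^2}{d^2 - 1}\mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] + \mathrm{Tr}[P_\lambda(\rho_0)]\right]^{-1} \tag{195}$$
$$\quad \times \frac{d^2}{|I|}\int d\psi\, \langle\!\langle \psi^{\otimes 2}|\{\omega(g_{\mathrm{end}}) \otimes [\mathcal{E}_M \phi^{*m}(g_{\mathrm{end}})\mathcal{E}_{SP}]\}\,(P_\lambda \otimes 1)\,|\rho_0^{\otimes 2}\rangle\!\rangle \tag{196}$$
$$= \left[\frac{d^2}{d^2 - 1}\mathrm{Tr}\big[\rho_0 P_\lambda\,\mathcal{E}_M \phi^{*m}(g_{\mathrm{end}})\mathcal{E}_{SP}(\rho_0)\big] + \mathrm{Tr}[P_\lambda(\rho_0)]\right] \tag{197}$$
$$\quad \times \left[\frac{d^2}{d^2 - 1}\mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] + \mathrm{Tr}[P_\lambda(\rho_0)]\right]^{-1}, \tag{198}$$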
where we use the expression for p(i, m, gend ) from Eq. (69). Note that this expression is asymptotically independent of the
Hilbert-space dimension (depending only on how well the initial state overlaps with the projector Pλ ). Next we discuss
the first term, given by
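$$\frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-2}\alpha(i, g_{\mathrm{end}})^2\,p(i, m, g_{\mathrm{end}}) \tag{199}$$
$$= N_\lambda^{-2}\frac{d^3}{|I|^2}\frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\int d\psi\, \langle\!\langle \psi^{\otimes 3}|\{\omega(g_{\mathrm{end}})^{\otimes 2} \otimes [\mathcal{E}_M \phi^{*m}(g_{\mathrm{end}})\mathcal{E}_{SP}]\}\,(P_\lambda^{\otimes 2} \otimes 1)\,|\rho_0^{\otimes 3}\rangle\!\rangle \tag{200}$$
$$= N_\lambda^{-2}\frac{d^3}{|I|^2}\int d\psi\, \langle\!\langle \psi^{\otimes 3}|\{P_\lambda^{\otimes 2} \otimes [\mathcal{E}_M \phi^{*m}(e)\mathcal{E}_{SP}]\}|\rho_0^{\otimes 3}\rangle\!\rangle. \tag{201}$$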
Here appears a third moment of a Haar average, which can be evaluated using Weingarten calculus (see, for instance, Eqs.
S35 and S36 in Ref. [54], Ref. [55] or Ref. [90] more generally). In this particular instance, we get
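$$\int d\psi\, \langle\psi|P_\lambda(\rho_0)|\psi\rangle\,\langle\psi|P_\lambda(\rho_0)|\psi\rangle\,\langle\psi|\mathcal{E}_M \phi^{*m}(e)\mathcal{E}_{SP}(\rho_0)|\psi\rangle \tag{202}$$
$$= \frac{\mathrm{Tr}\big[P_\lambda(\rho_0)|_t^2\big] + 2\,\mathrm{Tr}\{P_\lambda(\rho_0)|_t^2\,\mathcal{E}_M \phi^{*m}(e)\mathcal{E}_{SP}(\rho_0)\}}{(d + 2)(d + 1)d} \tag{203}$$
$$\quad + \frac{\mathrm{Tr}[P_\lambda(\rho_0)]\,\mathrm{Tr}\{P_\lambda(\rho_0)|_t\,\mathcal{E}_M \phi^{*m}(e)\mathcal{E}_{SP}(\rho_0)\}}{d^2(d + 1)} + \frac{\mathrm{Tr}[P_\lambda(\rho_0)]^2}{d^3}, \tag{204}$$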
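where $A|_t = A - \mathrm{Tr}(A)1$ for matrices $A$. By isolating a common $d^{-3}$ factor and plugging back in, we get
$$\frac{1}{|G|}\sum_{g_{\mathrm{end}}\in G}\sum_{i\in I} N_\lambda^{-2}\alpha(i, g_{\mathrm{end}})^2\,p(i, m, g_{\mathrm{end}}) \tag{205}$$
$$= \Bigg(\frac{\mathrm{Tr}\big[P_\lambda(\rho_0)|_t^2\big] + 2\,\mathrm{Tr}\{P_\lambda(\rho_0)|_t^2\,\mathcal{E}_M \phi^{*m}(e)\mathcal{E}_{SP}(\rho_0)\}}{(d + 2)(d + 1)d^{-2}} \tag{206}$$
$$\quad + \frac{\mathrm{Tr}[P_\lambda(\rho_0)]\,\mathrm{Tr}\{P_\lambda(\rho_0)|_t\,\mathcal{E}_M \phi^{*m}(e)\mathcal{E}_{SP}(\rho_0)\}}{d^{-1}(d + 1)} + \mathrm{Tr}[P_\lambda(\rho_0)]^2\Bigg) \tag{207}$$
$$\quad \times \left[\frac{d^2}{d^2 - 1}\mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] + \mathrm{Tr}[P_\lambda(\rho_0)]\right]^{-2}, \tag{208}$$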
which is again asymptotically independent of the Hilbert-space dimension.
Measurement POVMs that are proportional to 3-designs are not very common. However, when considering a system of $q$ qubits it is possible to construct one by considering computational basis measurements conjugated by a random element of the $q$-qubit Clifford group $C_q$. That is, we consider the POVM
$$\{\Pi_{x,C}\} = \Big\{\frac{1}{|C_q|}\,C|x\rangle\langle x|C^\dagger \;\Big|\; x \in \{0,1\}^q,\ C \in C_q\Big\}. \tag{209}$$
It is easy to see that this is a POVM
$$\sum_{C\in C_q}\sum_{x\in\{0,1\}^q}\frac{1}{|C_q|}\,C|x\rangle\langle x|C^\dagger = \frac{1}{|C_q|}\sum_{C\in C_q} CC^\dagger = I \tag{210}$$
and it is also proportional to a 3-design, because the multiqubit Clifford group is a unitary 3-design [91,92], and hence every orbit $\{C|x\rangle\langle x|C^\dagger\}_{C\in C_q}$ is a state 3-design (and thus so is the union over $x$).
We emphasize that the 3-design condition is only a sufficient condition for a controlled variance of the estimator for the filtered output data, which works for any group $G$ and subrepresentation $\sigma_\lambda$. For particular choices of $G$ and $\sigma_\lambda$ the estimator $\hat{k}_\lambda(m)$ might be efficient for other choices of the POVM $\{\Pi_i\}_{i\in I}$. It is, for instance, easy to see that the variance will also be controlled if the degree $d_\lambda$ of the irrep $\sigma_\lambda$ is small. This follows from the fact that the normalization factor $N_\lambda$ can be written as
$$N_\lambda = \frac{1}{d_\lambda}\sum_{i\in I}\mathrm{Tr}[\Pi_i P_\lambda(\Pi_i)]\,\mathrm{Tr}[\rho_0 P_\lambda(\rho_0)] \tag{211}$$
so assuming the POVM $\{\Pi_i\}_{i\in I}$ and the initial state $\rho_0$ can be chosen to have sufficient (larger than $1/d$) overlap with the subrepresentation $\sigma_\lambda$, the magnitude of the inverse normalization factor $N_\lambda^{-1}$, and hence the size of the support of the probability distribution $\{N_\lambda^{-1}\alpha(i, g_{\mathrm{end}})\}_{p(i,m,g_{\mathrm{end}})}$, is controlled by $1/d_\lambda$. Hence, if $d_\lambda$ is small, the estimator $\hat{k}_\lambda(m)$ is efficient. This follows because it is constructed by sampling from a [$O(1)$ in $d$] bounded random variable. Examples of this behavior have been noted in the literature [37,39,45].
Alternatively, there are situations where the dimension of the representation $\sigma_\lambda$ scales with the total Hilbert-space dimension $d$ but the estimator $\hat{k}_\lambda(m)$ is still efficient because the group $G$ under consideration is sufficiently randomizing (roughly, it spans its own 3-design due to the randomization over the ending gate $g_{\mathrm{end}}$). An example of this is the recently introduced linear-cross-entropy benchmarking procedure, which we discuss in the next section.
Finally, we would like to add that if one reuses the same experimental data $p(m, g_{\mathrm{end}})$ to estimate $k_\lambda(m)$ for different $\lambda$, the resulting estimates for $k_\lambda(m)$ (and consequently the associated decay rates) will be correlated. This must be taken into account when performing joint statistical inferences on estimates for several $M_\lambda$. This can of course be remedied by gathering new data for each representation label $\lambda$.
C. Example: linear cross-entropy benchmarking
Recently, Ref. [29] has introduced a RB-like protocol referred to as linear-cross-entropy benchmarking, in short XEB. We see in this section that this protocol falls into the framework of the benchmarking schemes introduced here. In fact, it can be seen as uniform RB with $G$ the full unitary group, together with a postprocessing scheme that is a special case of the above filtering scheme. Let $\phi : U(2^q) \to \mathcal{S}_d$ be an implementation map of the unitary group, also let $\{\Pi_x\}_{x\in\{0,1\}^q}$ be the computational basis POVM, and $\rho_0 = |0\rangle\langle 0|$. The linear cross-entropy fidelity is now given by
$$F_{\mathrm{XEB}} = d\int_{\mathrm{Haar}} dU \sum_{x\in\{0,1\}^q} |\langle x|U|0\rangle|^2\,\langle\!\langle \Pi_x|\mathcal{E}_M \phi(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{212}$$
with $\mathcal{E}_M$, $\mathcal{E}_{SP}$ being the usual SPAM error channels. Setting $\alpha(x, U) = |\langle x|U|0\rangle|^2 = \langle\!\langle \Pi_x|\omega(U)|\rho_0\rangle\!\rangle$ we see that $F_{\mathrm{XEB}}$ can be interpreted as a RB experiment of sequence length “0” with $g_{\mathrm{end}} = U$ together with postprocessing by correlation with the adjoint representation $\omega(U) = U \cdot U^\dagger$. Note that the dimensional factor almost precisely serves as the correct normalization factor for $\alpha(x, U)$, since
$$d\int dU \sum_{x} |\langle x|U|0\rangle|^4 = \frac{2d}{d + 1}. \tag{213}$$
We can extend this interpretation by considering the linear cross entropy of a sequence of $m$ random unitaries (this is done implicitly in Ref. [29]). This gives
$$F_{\mathrm{XEB},m} = d\int_{\mathrm{Haar}} dU_1 \ldots dU_m \sum_{x\in\{0,1\}^q} |\langle x|U_m \cdots U_1|0\rangle|^2\,\langle\!\langle \Pi_x|\mathcal{E}_M \phi(U_1 \cdots U_m)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle. \tag{214}$$
Using the invariance of the Haar measure and the linearity of the trace and the tensor product we can rewrite this as
$$F_{\mathrm{XEB},m} = d\int_{\mathrm{Haar}} dU_m \sum_{x\in\{0,1\}^q} |\langle x|U_m|0\rangle|^2\,\langle\!\langle \Pi_x|\mathcal{E}_M \phi^{*m}(U_m)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{215}$$
$$= d\int_{\mathrm{Haar}} dU_m \sum_{x\in\{0,1\}^q} |\langle x|U_m|0\rangle|^2\,p(x, U_m, m) \tag{216}$$
with $p(x, U_m, m)$ the output probability of a regular RB experiment. Now noting that $\omega(U)$ decomposes into the trivial representation (on the space $\{a|1\rangle\!\rangle \mid a \in \mathbb{C}\}$) and the adjoint representation [on the space $\{|A\rangle\!\rangle \mid \mathrm{Tr}(A) = 0\}$] we apply Theorem 8 to the above to get
$$F_{\mathrm{XEB},m} = A_{\mathrm{tr}}\,s_{\mathrm{tr}}^m + A_{\mathrm{adj}}\,f_{\mathrm{adj}}^m \tag{217}$$
up to a correction exponentially small in $m$, where $s_{\mathrm{tr}}$ ($f_{\mathrm{adj}}$) is the largest eigenvalue of the Fourier transform of $\phi$ evaluated at the trivial (adjoint) representation. Recall that $s_{\mathrm{tr}} = 1$ if $\phi(U)$ is trace preserving for all $U$, and that we can moreover interpret $f_{\mathrm{adj}}$ as affinely related to the average fidelity (certainly in the gate-independent noise setting). Hence, through Theorem 8 and our general postprocessing scheme the linear-cross-entropy benchmarking procedure inherits both the stability and interpretation of uniform RB.
It is notable that the estimator $\hat{k}_\lambda(m)$, which in this case estimates the linear cross entropy fidelity $F_{\mathrm{XEB},m}$, is actually efficient, in the sense of Theorem 17. We can sketch an argument for this by directly estimating the variance of the estimator. For this argument we assume gate-independent noise [i.e., $\phi(U) = A\omega(U)$ for some completely positive $A$]. Following Theorem 17, we have
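$$\mathbb{V}[\hat{k}_\lambda(m)] \le d^2 \sum_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\, |\langle x|U|0\rangle|^4\,\langle\!\langle \Pi_x|\mathcal{E}_M \phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{218}$$
$$\quad + d^2 \int_{\mathrm{Haar}} dU \Big[\sum_{x\in\{0,1\}^q} |\langle x|U|0\rangle|^2\,\langle\!\langle \Pi_x|\mathcal{E}_M \phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle\Big]^2 \tag{219}$$
$$\le d^3 \max_{x\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\, |\langle x|U|0\rangle|^4\,\langle\!\langle \Pi_x|\mathcal{E}_M \phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle \tag{220}$$
$$\quad + d^4 \max_{x,x'\in\{0,1\}^q}\int_{\mathrm{Haar}} dU\, |\langle x|U|0\rangle|^2\,|\langle x'|U|0\rangle|^2 \tag{221}$$
$$\quad \times \langle\!\langle \Pi_x|\mathcal{E}_M \phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle\,\langle\!\langle \Pi_{x'}|\mathcal{E}_M \phi^{*m}(U)\mathcal{E}_{SP}|\rho_0\rangle\!\rangle. \tag{222}$$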
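The claim that the single-shot XEB estimator has a dimension-independent spread can be explored with a small Monte Carlo simulation. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the noise is modeled as a global depolarizing channel with parameter `f` (so that the outcome distribution is `f * alpha + (1 - f) / d`), and only the first column of each Haar-random unitary is sampled, as a normalized complex Gaussian vector. The closed-form value it compares against uses the standard Haar moment $\int dU\,|\langle x|U|0\rangle|^4 = 2/[d(d+1)]$.

```python
import numpy as np

def haar_states(num, d, rng):
    """First columns U|0> of Haar-random unitaries: normalized complex Gaussian vectors."""
    psi = rng.normal(size=(num, d)) + 1j * rng.normal(size=(num, d))
    return psi / np.linalg.norm(psi, axis=1, keepdims=True)

def xeb_single_shot(d, f, num, rng):
    """One measurement outcome per random unitary; returns the samples d * alpha(x, U)."""
    alpha = np.abs(haar_states(num, d, rng)) ** 2        # alpha(x, U) = |<x|U|0>|^2
    # Toy noise model (assumption): global depolarizing noise with parameter f.
    p = f * alpha + (1.0 - f) / d
    # Draw one outcome x per row by inverting the cumulative distribution.
    u = rng.random(num)
    x = (np.cumsum(p, axis=1) < u[:, None]).sum(axis=1)
    x = np.minimum(x, d - 1)                             # guard against rounding
    return d * alpha[np.arange(num), x]

rng = np.random.default_rng(1)
d, f = 8, 0.9
samples = xeb_single_shot(d, f, 100_000, rng)
estimate = samples.mean()
exact = 1.0 + f * (d - 1) / (d + 1)    # follows from E|<x|U|0>|^4 = 2 / (d (d + 1))
print(estimate, exact, samples.var())
```

In this toy model the mean has the form of a constant (trivial-representation) term plus a term proportional to `f`, and the sample variance stays of order one as `d` is increased.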
Using the gate-independent noise assumption and the fact that $\omega(U)(\rho) = U\rho U^\dagger$, the first term on the rhs is a Haar integral of a degree-3 homogeneous polynomial in the entries of $U$, $\bar{U}$, and the second term is a Haar integral of a degree-4 homogeneous polynomial. The asymptotic behavior of such integrals (in the limit of large $d$) is well known [90] and evaluates to $O(d^{-3})$ and $O(d^{-4})$, respectively. Hence, the overall variance is $O(1)$ in $d$. One could fill in the exact constants by evaluating the Haar integrals (like we did in Theorem 17), but we do not pursue this here.
IX. RANDOMIZED BENCHMARKING AND AVERAGE FIDELITY
To date, we have treated the information extracted from RB procedures, and in particular the decay rates, as figures of merit in their own right, without establishing a direct connection to other well-known quantities such as the average gate fidelity. Indeed, this latter object is often portrayed as the conclusive result of an RB protocol.
In this section, we provide a series of arguments to validate the interpretation of the RB parameters as standalone information, by showing that connecting RB decays to the average gate fidelity presents complications that are hard to overcome. The underlying reason for this incompatibility is the gauge-dependent nature of the average gate fidelity (as argued in Ref. [26]), which can be neither established nor controlled under RB. More precisely, in Sec. IX A we provide an explicit example showing that adopting a gauge to match the average gate fidelity gives rise to a channel that is not physical. In Sec. IX B, we substantiate our argument with an analysis of the expression of the entanglement fidelity (a quantity closely related to the average fidelity) in terms of RB decay parameters and the adopted gauge. Observing this expression we conclude that RB parameters and fidelity can be linked only if there is a close overlap between the dominant eigenvector of the ideal operator and the dominant, gauge-dependent left and right eigenvectors of its implemented version; the critical point is that ascertaining whether this requirement is met is not possible with a RB procedure. We want to highlight that this intricacy in connecting RB to other well-established quantities does not mean RB protocols are inherently flawed, but only that the information they provide has to be regarded independently, with decay rates as the defining quantities to characterize the accuracy of experimentally implemented sets of gates.
A. The depolarizing gauge and in-between noise average fidelity
In an attempt to resolve the apparent disconnect between fidelity and RB decay parameters in the gate-dependent noise setting, in Refs. [24] and [25] proposals have been made for the precise connection between RB decay rates and average fidelity. In Ref. [24], it has been noted that the output data of Clifford RB could be exactly fitted to a single exponential whose decay rates are exactly interpreted as the average fidelity of the “noise in between gates,” a manifestly gauge invariant quantity. Similarly in Ref. [25], it has been argued that the decay of Clifford RB can be regarded as the average fidelity of the implementation with regard to a particular gauge choice, namely the one in which the average implementation inverted with the reference representation is precisely a depolarizing channel. We show here that (1) both of these statements can be generalized to RB with arbitrary groups, (2) both statements in fact say the exact same thing, and (3) both interpretations suffer from the same problem, namely that the channel of which the average fidelity is measured by RB is not necessarily a completely positive (CP) map (i.e., physical), even if the implementation map $\phi$ is.
In Ref. [24], the RB decay rate is interpreted as measuring the fidelity of “the noise in between gates.” (A general version of) this construction goes as follows. For an implementation $\phi$ of a group $G$, close to some reference representation $\omega = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda$ we can pick the dominant
eigenvectors $\mathrm{vec}(R_\lambda)$ of the Fourier transform $\mathcal{F}(\phi)$ evaluated at the irreducible subrepresentation $\sigma_\lambda \subset \omega$ (for now assuming no multiplicities, this easily generalizes). We can devectorize these eigenvectors and sum them up to create a superoperator $R$ with the property
$$\frac{1}{|G|}\sum_{g\in G} \phi(g) R\,\omega(g)^\dagger = R\,\mathrm{dep}, \tag{223}$$
where $\mathrm{dep}$ is the generalized depolarizing channel $\mathrm{dep} = \sum_{\lambda\in\Lambda} f_\lambda P_\lambda$ with $f_\lambda$ the eigenvalue corresponding to $R_\lambda$. Without loss of generality we can assume that $R$ is invertible (as a matrix). Note also that for any $\phi$ we can write $\phi(g) = R\,\omega(g) L(g)$, where $L(g)$ is some implementation map (not necessarily completely positive).
With this parametrization the noise between two gates $g, g'$ (which in this parametrization only depends on $g$) is given by $L(g)R$. The entanglement fidelity with regards to the identity averaged over all $g \in G$ of this map is
$$\frac{1}{|G|}\sum_{g\in G} F_{\mathrm{avg}}[L(g)R, 1] = \frac{1}{|G|}\sum_{g\in G} F_{\mathrm{avg}}[R^{-1}R\,\omega(g)L(g)R\,\omega(g)^\dagger, 1] = F_{\mathrm{avg}}(\mathrm{dep}, 1), \tag{224}$$
where we use the linearity and unitary invariance of the average fidelity. Note that $F_{\mathrm{avg}}(\mathrm{dep}, 1) = 1/d^2 - 1 \sum_{\lambda\in\Lambda} f_\lambda d_\lambda - 1$ is precisely the average fidelity one would obtain by plugging the RB decay rates $f_\lambda$ into Eq. (242).
On the other hand, Ref. [25] connects the RB decay rates to the average fidelity of the implementation map $\phi$ in a particular gauge, that is, a particular choice of invertible superoperator $S$ such that
$$\frac{1}{|G|}\sum_{g\in G} F_{\mathrm{avg}}[S^{-1}\phi(g)S, \omega(g)] = F_{\mathrm{avg}}(\mathrm{dep}, 1). \tag{225}$$
This map $\phi_{\mathrm{dep}} = S^{-1}\phi S$ is called the depolarizing gauge. According to Ref. [25] the correct interpretation of the RB decay rates is that they measure the fidelity of the implementation map $\phi$ in the depolarizing gauge with respect to the reference implementation $\omega$. It turns out that the correct choice for $S$ is precisely the operator $R$ mentioned above, which can be easily seen by explicit computation
$$\frac{1}{|G|}\sum_{g\in G} F_{\mathrm{avg}}[R^{-1}\phi(g)R, \omega(g)] = F_{\mathrm{avg}}\Big(\frac{1}{|G|}\sum_{g\in G} R^{-1}\phi(g)R\,\omega(g)^\dagger, 1\Big) = F_{\mathrm{avg}}\big(R^{-1}R\,\mathrm{dep}, 1\big). \tag{226}$$
We can connect the above two interpretations by inserting the parametrization $\phi(g) = R\,\omega(g)L(g)$ into the expression for $\phi_{\mathrm{dep}}$ as
$$\phi_{\mathrm{dep}}(g) = R^{-1}R\,\omega(g)L(g)R = \omega(g)L(g)R. \tag{227}$$
Hence, the depolarizing gauge is precisely the gauge in which each superoperator $\phi_{\mathrm{dep}}(g)$ is viewed as the ideal superoperator $\omega(g)$ preceded by the noise in between gates $L(g)R$ (in the sense of Ref. [24]). Hence, these two interpretations of the RB decay rates as corresponding to an average fidelity of “something” neatly map to each other.
A central open question in both the above constructions is whether the noise in between gates, or equivalently the noise in the implementation in the depolarizing gauge, can always be chosen to be a completely positive implementation map. This is essential if we want to consider these interpretations as actual descriptions of reality. Here we answer this question in the negative by giving an example (an adaptation of a construction given in Ref. [26]) of a pointwise CP implementation map $\phi$ where the noise in between gates (the implementation in the depolarizing gauge) is not completely positive. Let $G$ be the single-qubit Clifford group, and consider, in the Pauli basis, the following superoperators:
$$T(\gamma) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \sqrt{\gamma} & 0 & 0 \\ 0 & 0 & \sqrt{\gamma} & 0 \\ 1-\gamma & 0 & 0 & \gamma \end{pmatrix}, \quad M_1(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad M_2(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \alpha^{-1} \end{pmatrix}. \tag{228}$$
From these we can construct the implementation $\phi(g) = T(\gamma)M_1(\alpha)\omega(g)M_2(\alpha)$, with $\omega(g)(\rho) = U_g \rho U_g^\dagger$ the standard reference representation. It is easy to see that the transformation to the depolarizing gauge is given by $M_2(\alpha)\phi(g)M_2(\alpha)^{-1} = M_2(\alpha)T(\gamma)M_1(\alpha)\omega(g)$. Equivalently, the noise in between gates is given by $M_2(\alpha)T(\gamma)M_1(\alpha)$. The claim is now that there exist pairs $\alpha, \gamma$ such that $\phi(g)$ is completely positive for all $g \in G$ but $M_2(\alpha)T(\gamma)M_1(\alpha)$ is not. An easy pathological example
can be obtained by setting $\gamma = 0$. In this case we have
$$\phi(g) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \qquad M_2(\alpha)T(0)M_1(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \alpha^{-1} & 0 & 0 & 0 \end{pmatrix}. \tag{229}$$
Hence, for all $\alpha < 1$ the maps $\phi(g)$ are CP while the map $M_2(\alpha)T(0)M_1(\alpha)$ is not (this can be verified by using the complete positivity conditions for qubit channels from Ref. [93]). For $\gamma < 1$ one can always construct interval conditions on $\alpha$ such that the same holds. Hence, the interpretations [24,25] both suffer from a problem, namely that in order to imagine RB as “measuring the average fidelity” of some object, this object has to be chosen in a way that is not necessarily physical. This possibility was already indicated by both papers, but no explicit example was given. It is unclear how to resolve this problem: one could, for instance, try to find natural conditions on $\phi$ such that the noise in between gates, or equivalently the implementation in the depolarizing gauge, is always completely positive. Alternatively one could adopt the framework of Ref. [27] where one relaxes the problem by asking for a positive-gauged implementation map that has a fidelity approximately given by the RB decay rates (with approximate meaning small relative to $1 - f_\lambda$). This can be done for Clifford RB on a single qubit [27] but generalizing to higher dimensions seems difficult (although some work in this direction has been done [94]).
B. Connecting average fidelity and randomized benchmarking decay rates
In the previous subsection we showed that the depolarizing gauge does not always give rise to a CP implementation map, and hence, cannot be connected in all cases to the average fidelity of a physical process. Here we want to investigate the link between fidelity and the RB decay parameters under a general gauge choice $S$. We do this using the tools of perturbation theory we have used earlier to establish Theorem 8.
1. The randomized benchmarking measurement outcome
Let us consider a special case of Theorem 8 corresponding to reference representations $\omega$ that are multiplicity-free (for simplicity), and making the gauge freedom $S$ explicit. In this situation, we can write the Fourier operator $\mathcal{F}(\omega)$ as a direct sum of rank-1 orthogonal projections, since from Eqs. (29) and (30) it follows that for each unitary irreducible representation $\sigma_\lambda$ of $G$
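$$\mathcal{F}(\pi)[\sigma_\lambda] = \begin{cases} |z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| \ \text{(rank-1 orthogonal projection)} & \text{if } \pi \text{ and } \sigma_\lambda \text{ are equivalent irreducible representations,} \\ 0 & \text{otherwise.} \end{cases} \tag{230}$$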
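The dichotomy in Eq. (230) can be checked numerically on a small group. The sketch below is illustrative only: it uses the symmetric group $S_3$ with its two-dimensional irreducible representation, and it assumes the Fourier transform convention $\mathcal{F}(\pi)[\sigma] = \frac{1}{|G|}\sum_g \bar{\sigma}(g) \otimes \pi(g)$, whose tensor ordering may differ from the definition used earlier in the paper. The function names are hypothetical.

```python
import numpy as np

def s3_standard():
    """Six real 2x2 matrices of the two-dimensional irreducible representation of S_3."""
    c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
    rot = np.array([[c, -s], [s, c]])
    ref = np.diag([1.0, -1.0])
    rots = [np.eye(2), rot, rot @ rot]
    return rots + [r @ ref for r in rots]

def fourier(pi, sigma):
    """(1/|G|) sum_g conj(sigma(g)) (x) pi(g); the tensor ordering is an assumed convention."""
    return sum(np.kron(s.conj(), p) for s, p in zip(sigma, pi)) / len(pi)

std = s3_standard()
trivial = [np.eye(1)] * 6
sign = [np.eye(1)] * 3 + [-np.eye(1)] * 3    # same element ordering as `std`

same = fourier(std, std)           # pi equivalent to sigma
other = fourier(trivial, std)      # pi inequivalent to sigma
print(np.linalg.matrix_rank(same), np.allclose(other, 0))
```

For equivalent representations the result is a Hermitian idempotent matrix of rank one; for inequivalent ones it vanishes, which is Schur orthogonality in matrix form.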
Furthermore, we also assume that the Fourier transform F (σλ ) is a diagonalizable operator. Since the set of diagonalizable
matrices is dense [95], it is always possible to find such a diagonalizable matrix at arbitrary proximity of any given
operator. We can thus write the Fourier transform of the implementation map on the irreducible representation appearing
in the decomposition of ω as the perturbation 4
E (σλ ) := F (SφS −1 − ω)[σλ ] of the rank-1 operator F (ω)[σλ ],
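where $f_{\max}(\sigma_\lambda)$ is the largest eigenvalue of $\mathcal{F}(S\phi S^{-1})[\sigma_\lambda]$ and $\{f_{j_\lambda}\}_{j_\lambda=1}^{d-1}$ are the other eigenvalues. The sets of left and right eigenvectors form a biorthogonal system, that is, $\langle \ell_{\max}(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle = \langle \ell_{j_\lambda}(\sigma_\lambda)|r_{j_\lambda}(\sigma_\lambda)\rangle = 1$ and $\langle \ell_{\max}(\sigma_\lambda)|r_{j_\lambda}(\sigma_\lambda)\rangle = \langle \ell_{j_\lambda}(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle = \langle \ell_{j_\lambda}(\sigma_\lambda)|r_{k_\lambda}(\sigma_\lambda)\rangle = 0$, for $j_\lambda \neq k_\lambda$. The important remark that we should make here is that this basis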
of eigenvectors reflects the gauge transformation SφS −1 .
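In this scenario, we can thus write Eq. (75) in the proof of Theorem 8 for $g_{\mathrm{end}} = 1$
$$p(i, m) = \sum_{\lambda\in\mathrm{Irr}(G)} d_\lambda\,\langle\!\langle \mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\{\mathcal{F}(S\phi S^{-1})^{m+1}[\sigma_\lambda][\bar{\sigma}_\lambda(1) \otimes 1]\}\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{234}$$
$$= \sum_{\lambda\in\Lambda}\Big( d_\lambda [f_{\max}(\sigma_\lambda)]^{m+1}\,\langle\!\langle \mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big[|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|\big]\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \tag{235}$$
$$\quad + \sum_{j_\lambda} d_\lambda [f_{j_\lambda}(\sigma_\lambda)]^{m+1}\,\langle\!\langle \mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big[|r_{j_\lambda}(\sigma_\lambda)\rangle\langle \ell_{j_\lambda}(\sigma_\lambda)|\big]\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle \Big) \tag{236}$$
$$\quad + \sum_{\gamma\notin\Lambda} d_\gamma\,\langle\!\langle \mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_{\sigma_\gamma}}\{\mathcal{F}(S\phi S^{-1})^{m+1}[\sigma_\gamma][\bar{\sigma}_\gamma(1) \otimes 1]\}\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle. \tag{237}$$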
By Eq. (62), it follows that $f_{\max}(\sigma_\lambda)$ for each $\sigma_\lambda$ in the irreducible decomposition of $\omega$ is lower bounded by 1 − 4 E(σλ)², while the subdominant eigenvalues, corresponding to perturbations of the kernel of $\mathcal{F}(\omega)[\sigma_\lambda]$, are upper bounded by 4 E(σλ)². Moreover, by Theorem 18 presented in Sec. X, the eigenvalues in those subspaces not related to irreducible representations appearing in the decomposition are again dominated by 4 E(σλ)². Hence, we can choose $m$ large enough such that $f_{\max}^m(\lambda) \gg f_{j_\lambda}^m(\sigma_\lambda)$ for all $f_{j_\lambda}(\sigma_\lambda)$ and for each irreducible representation $\sigma_\lambda$ occurring in the decomposition of $\omega$, and such that the leakage of the perturbation in nonoccurring irreducible subspaces is suppressed.
For these values of $m$, we then retrieve the formula for the power law in Eq. (63), but here with respect to 1-dim parameters,
$$p(i, m) \approx \sum_{\lambda\in\Lambda} [f_{\max}(\lambda)]^{m+1}\,\xi(S, \sigma_\lambda, \Pi_i, \rho_0), \tag{238}$$
where $\xi(S, \sigma_\lambda, \Pi_i, \rho_0) := d_\lambda\,\langle\!\langle \mathcal{E}_M(\Pi_i)|\,\mathrm{Tr}_{V_\lambda}\big[|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|\big]\,|\mathcal{E}_{SP}(\rho_0)\rangle\!\rangle$.
2. Average gate fidelity and entanglement fidelity
The first RB protocols based on the Clifford group [5,34] linked a single decay parameter $f$ to the average fidelity of a quantum channel $R$, under the assumption of gate-independent noise, i.e., $\phi(g) = R\,\omega(g)$. The relation is given by
$$F_{\mathrm{avg}}(R) = f + \frac{1 - f}{d}. \tag{239}$$
This formula generalizes to uniform RB with an arbitrary group $G$ with reference representation $\omega = \bigoplus_{\lambda\in\Lambda} \sigma_\lambda^{\oplus n_\lambda}$, again under the assumption of gate-independent noise. However, it is more convenient to express it in terms of the entanglement fidelity, defined as
$$F_e(R) := \langle\!\langle \Phi|\,1 \otimes R\,|\Phi\rangle\!\rangle = \frac{1}{d^2}\mathrm{Tr}(R), \tag{240}$$
where the trace is taken over the superoperators, and related to the average gate fidelity by
$$F_{\mathrm{avg}}(R) = \frac{d F_e(R) + 1}{d + 1}. \tag{241}$$
In particular, we have (first formally written down in Ref. [14])
$$F_{\mathrm{avg}}(R) = \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_\lambda\,\mathrm{Tr}(M_\lambda) \tag{242}$$
with $M_\lambda$ again an $n_\lambda \times n_\lambda$ matrix.
The connection between the RB decay rates and the fidelity has been challenged in Ref. [26], where it has been argued that the average fidelity and the output of RB are not related in a unique way. In doing so they introduced the concept of gauge freedom into the RB literature.
In the context of RB, gauge freedom is the observation that two implementation maps $\phi$ and $\phi'$ give rise to the same RB output data $p(m)$ if they are related by a similarity transformation $S$, i.e., $\phi' = S\phi S^{-1}$. However, the average fidelity of these implementation maps (relative to some reference implementation) will generally differ. Note that this is an issue even with the assumption of gate-independent noise; however, in this case there is a “canonical” choice of gauge for which the RB decay rates and the fidelity are related. In the gate-dependent noise scenario there is no such obvious gauge choice. The rest of this section will be concerned with this question.
The entanglement fidelity, averaged over $G$, can be expressed in terms of Fourier transforms (as has first been
020357-45
J. HELSEN et al. PRX QUANTUM 3, 020357 (2022)
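noted in Ref. [25]). Indeed, we have
$$\mathbb{E}_g F_e[S\phi S^{-1}(g), \omega(g)] = \mathbb{E}_g F_e[\omega^\dagger(g) S\phi S^{-1}(g)] \tag{243}$$
$$= \frac{1}{d^2}\mathbb{E}_g \mathrm{Tr}[\omega^\dagger(g) S\phi S^{-1}(g)] \tag{244}$$
$$= \frac{1}{d^2}\sum_{\lambda\in\mathrm{Irr}(G)} d_\lambda \mathrm{Tr}\big(\{\mathcal{F}(\omega)[\sigma_\lambda]\}^\dagger \mathcal{F}(S\phi S^{-1})[\sigma_\lambda]\big), \tag{245}$$
where we use the second Parseval identity (28).
At this point we can again make use of the property in Eq. (230) for $\mathcal{F}(\omega)[\sigma_\lambda]$ and the reformulation in Eq. (233) for $\mathcal{F}(S\phi S^{-1})[\sigma_\lambda]$ and write
$$\mathbb{E}_g F_e[S\phi S^{-1}(g), \omega(g)] = \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_\lambda \mathrm{Tr}\Big\{|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| \times \Big[f_{\max}(\sigma_\lambda)\,|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)| \tag{246}$$
$$+ \sum_{j_\lambda} f_{j_\lambda}(\sigma_\lambda)\,|r_{j_\lambda}(\sigma_\lambda)\rangle\langle \ell_{j_\lambda}(\sigma_\lambda)|\Big]\Big\} \tag{247}$$

We observe that this connection is complicated by two factors. Firstly, it depends on the gauge-dependent overlap $\langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle$ between the rank-1 projection and the perturbed dominant eigenvectors (a quantity that we cannot retrieve from RB data), which might deviate significantly from 1 depending on the gauge choice. Secondly, the residuum $\alpha_{\rm res}$ may be large, constituting a non-negligible part of the entanglement fidelity. The rest of the section is concerned with analyzing these gauge-dependent connective factors.
We begin by deriving a bound on $\alpha_{\rm res}$, showing that this term is small, more precisely, of third order in the gauge-dependent perturbation term $\widehat{E}(\sigma_\lambda)$. For this, we use Corollary 7, where in this specific case $a_1 = 1$ and $A_2 = 0_{(d^2-1),(d^2-1)}$ and where
$$Q_{z(\sigma_\lambda)} := X_2 X_2^\dagger = \mathbb{1} - |z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)| \tag{250}$$
is the orthogonal complement of the projection $|z(\sigma_\lambda)\rangle\langle z(\sigma_\lambda)|$. Then, the relation between unperturbed and perturbed dominant eigenvectors is given by
$$|r_{\max}(\sigma_\lambda)\rangle = |z(\sigma_\lambda)\rangle + Q_{z(\sigma_\lambda)}\widehat{E}(\sigma_\lambda)\,|z(\sigma_\lambda)\rangle + O[\|\widehat{E}(\sigma_\lambda)\|_2^2], \tag{251}$$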
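The first-order relation in Eq. (251) can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: the unperturbed operator is taken to be a rank-1 projector $|z\rangle\langle z|$ (so $a_1 = 1$, $A_2 = 0$), the perturbation `E` is a random matrix of spectral norm `eps`, and the eigenvector is normalized such that $\langle z|r\rangle = 1$. The function names, dimension, and seed are hypothetical and not taken from the paper.

```python
import numpy as np

def dominant_right_eigvec(z, E):
    """Dominant right eigenvector of |z><z| + E, scaled so that <z|r> = 1."""
    A = np.outer(z, z.conj()) + E
    vals, vecs = np.linalg.eig(A)
    r = vecs[:, np.argmax(np.abs(vals))]
    return r / (z.conj() @ r)

def first_order_error(eps, dim=4, seed=0):
    """Distance between the exact eigenvector and |z> + Q_z E |z>."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    z /= np.linalg.norm(z)
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    E = eps * G / np.linalg.norm(G, 2)        # spectral norm of E equals eps
    Q = np.eye(dim) - np.outer(z, z.conj())   # projector onto the complement of |z>
    r_exact = dominant_right_eigvec(z, E)
    r_first = z + Q @ E @ z                   # first-order expression, cf. Eq. (251)
    return np.linalg.norm(r_exact - r_first)

for eps in (1e-2, 1e-3):
    print(eps, first_order_error(eps))
```

In this toy setting the printed deviation should shrink roughly quadratically with `eps`, in line with the $O[\|\widehat{E}(\sigma_\lambda)\|_2^2]$ remainder.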
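Now, inserting Eqs. (251)–(253) into Eq. (249) and using the Cauchy-Schwarz inequality, we obtain the following bound on the residuum:
$$|\alpha_{\rm res}| = \Big|\frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, \langle \ell_{\max}(\sigma_\lambda)|\,\big(\mathbb{1} - \widehat{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)} + O[\|\widehat{E}(\sigma_\lambda)\|_2^2]\big)\,\widehat{K}(\sigma_\lambda)\,\big(\mathbb{1} - Q_{z(\sigma_\lambda)}\widehat{E}(\sigma_\lambda) + O(\|\widehat{E}(\sigma_\lambda)\|_2^2)\big)\,|r_{\max}(\sigma_\lambda)\rangle\Big| \tag{261}$$
$$\le \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, \big|\langle \ell_{\max}(\sigma_\lambda)|\,\widehat{E}(\sigma_\lambda)Q_{z(\sigma_\lambda)}\widehat{K}(\sigma_\lambda)Q_{z(\sigma_\lambda)}\widehat{E}(\sigma_\lambda)\,|r_{\max}(\sigma_\lambda)\rangle\big| \tag{262}$$
$$+ \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, O[\|\widehat{E}(\sigma_\lambda)\|_2^3]\,\|\widehat{K}(\sigma_\lambda)\|_2\,\|\ell_{\max}(\sigma_\lambda)\|\,\|r_{\max}(\sigma_\lambda)\| + O[\|\widehat{E}(\sigma_\lambda)\|_2^4] \tag{263}$$
$$\le \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\, O[\|\widehat{E}(\sigma_\lambda)\|_2^3]\,\|\ell_{\max}(\sigma_\lambda)\|\,\|r_{\max}(\sigma_\lambda)\| + O[\|\widehat{E}(\sigma_\lambda)\|_2^4]. \tag{264}$$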
This bound for $\alpha_{\rm res}$ has a significant implication: it means that the residuum will not cover the leading term in Eq. (248) if the latter is $\Omega(\|\widehat{E}(\sigma_\lambda)\|_2^2)$, for all gauge choices $S$ that yield $\|\ell_{\max}(\sigma_\lambda)\| \cdot \|r_{\max}(\sigma_\lambda)\|$ smaller than $1/\|\widehat{E}(\sigma_\lambda)\|_2$.
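The role of the gauge $S$ can be made concrete with a small single-qubit sketch. Everything in it is an illustrative assumption rather than the construction analyzed here: the gate set is the two-element group $\{I, X\}$ in the Pauli-transfer-matrix picture, the noise is depolarizing with strength `f = 0.9`, and `S` is a hypothetical gauge mixing the $X$ and $Y$ components. The gauge-transformed maps are only used as linear maps and are not guaranteed to be completely positive. The sketch compares sequence probabilities (with SPAM transformed accordingly) and the group-averaged entanglement fidelity before and after the transformation.

```python
import numpy as np

# Pauli-transfer matrices for one qubit (d = 2), basis order (I, X, Y, Z).
OMEGA = {"I": np.diag([1.0, 1.0, 1.0, 1.0]),
         "X": np.diag([1.0, 1.0, -1.0, -1.0])}
f = 0.9
R = np.diag([1.0, f, f, f])                    # depolarizing noise (assumed)
PHI = {g: R @ w for g, w in OMEGA.items()}     # phi(g) = R omega(g)

# Hypothetical gauge transformation mixing the X and Y components.
a = 0.1
S = np.eye(4)
S[1, 2] = S[2, 1] = a
S_inv = np.linalg.inv(S)
PHI_S = {g: S @ p @ S_inv for g, p in PHI.items()}

# State |0><0| and effect |0><0| in the normalized Pauli basis.
rho = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
E = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

def sequence_prob(gates, effect, state, seq):
    """Outcome probability <<E| phi(g_m) ... phi(g_1) |rho>>."""
    v = state
    for g in seq:
        v = gates[g] @ v
    return effect @ v

def compare(seq):
    """Sequence probability in the original and in the transformed gauge."""
    p = sequence_prob(PHI, E, rho, seq)
    p_S = sequence_prob(PHI_S, E @ S_inv, S @ rho, seq)
    return p, p_S

def avg_entanglement_fidelity(gates, d=2):
    """Group average of Tr[omega(g)^T phi(g)] / d^2."""
    return np.mean([np.trace(OMEGA[g].T @ gates[g]) for g in gates]) / d**2

F = avg_entanglement_fidelity(PHI)
F_S = avg_entanglement_fidelity(PHI_S)
print(compare(["X", "I", "X", "X"]), F, F_S)
```

In this toy model the two sequence probabilities coincide for any sequence, while `F` and `F_S` differ, which is the gauge ambiguity discussed above.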
Note that it is important to compare $\alpha_{\rm res}$ to the difference between 1 (the value of the entanglement fidelity of a perfect implementation) and the dominant eigenvalues in Eq. (248). This distance is indeed what RB protocols are designed to detect, and in order for the connection between fidelity and decay rates to be meaningful we require $\alpha_{\rm res}$ to be negligible in comparison. To analyze this further, we first write
$$\Delta_{\max} := \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\,\big|1 - f_{\max}(\sigma_\lambda)\,\langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle\big|, \tag{265}$$
and we calculate the deviation of the absolute value of the overlap from 1, which is remarkably only of second order in the perturbation,
$$\cdots \le 1 + O[\|\widehat{E}(\sigma_\lambda)\|_2^2]. \tag{269}$$
This bound on the overlap, together with the one on the residuum, implies that the parameters $f_{\max}(\sigma_\lambda)$ obtained from the fitting of the RB model in Eq. (238) yield a meaningful characterization of the fidelity on the condition that they are $\Omega[\|\widehat{E}(\sigma_\lambda)\|_2]$.
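For orientation, the gate-independent relations in Eqs. (239)–(241) can be cross-checked with a few lines of code. The sketch assumes a depolarizing channel whose superoperator, in a trace-orthonormal operator basis with the identity first, is $\mathrm{diag}(1, f, \dots, f)$; the function names are hypothetical.

```python
import numpy as np

def favg_from_decay(f, d):
    """Average fidelity from a single decay parameter, Eq. (239)."""
    return f + (1 - f) / d

def entanglement_fidelity(R, d):
    """Entanglement fidelity as the normalized superoperator trace, Eq. (240)."""
    return np.trace(R).real / d**2

def favg_from_fe(fe, d):
    """Average fidelity from the entanglement fidelity, Eq. (241)."""
    return (d * fe + 1) / (d + 1)

def depolarizing_superoperator(f, d):
    """Assumed noise model: identity component kept, all others scaled by f."""
    return np.diag([1.0] + [f] * (d**2 - 1))

for d in (2, 4, 8):
    for f in (0.5, 0.9, 0.99):
        fe = entanglement_fidelity(depolarizing_superoperator(f, d), d)
        print(d, f, favg_from_decay(f, d), favg_from_fe(fe, d))
```

For this noise model both routes give the same average fidelity, since $d F_e + 1 = (d+1)\,[f + (1-f)/d]$ when $F_e = [1 + (d^2-1) f]/d^2$.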
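Having derived a bound on the residuum we can consider Eq. (248) in different regimes [always assuming small perturbations, i.e., $\|\widehat{E}(\sigma_\lambda)\|_2 \ll 1$]. In the first regime we make the assumption
$$\Omega[\|\widehat{E}(\sigma_\lambda)\|_2^2] = |1 - f_{\max}(\sigma_\lambda)| \gg \big|1 - \langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle\big|, \tag{270}$$
corresponding to the situation where the parameters $\{f_{\max}(\sigma_\lambda)\}_{\lambda\in\Lambda}$ are more sensitive to the perturbation than the overlap of the dominant eigenvectors. As we mentioned before, this is indeed the regime where RB provides a meaningful estimation of the fidelity. Indeed, we have
$$\Delta_{\max} \ge \frac{1}{d^2}\sum_{\lambda\in\Lambda} d_{\sigma_\lambda}\Big[\,|f_{\max}(\sigma_\lambda) - 1| \cdot \big|\langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle\big| \tag{271}$$
$$-\ \big|1 - \langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle\big|\,\Big] \tag{272}$$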
$$\cdots \tag{282}$$
which is troublesome not only for the fact that we cannot retrieve the overlap but also because in this case $\Delta_{\max}$ may be of the same magnitude or smaller than $|\alpha_{\rm res}|$. Indeed, in this regime the residuum can then play a significant role in the characterization of the average gate fidelity.
The conclusion we draw from this analysis is that the overlap $\langle z(\sigma_\lambda)|r_{\max}(\sigma_\lambda)\rangle\langle \ell_{\max}(\sigma_\lambda)|z(\sigma_\lambda)\rangle$ is the key factor to consider when relating RB decays to the fidelity. This overlap must be sufficiently close to 1 under the adopted gauge relative to the difference $|1 - f_{\max}(\sigma_\lambda)|$.
Finally, we wish to relate $\{\|\widehat{E}(\sigma_\lambda)\|_2\}_\lambda$ to a promise on a physical quantity related to the perturbation of the ideal gate implementation $\omega$. We recall that $\widehat{E}(\sigma_\lambda) = \mathcal{F}(S\phi S^{-1} - \omega)[\sigma_\lambda]$ and consider that $\|\cdot\|_2 \le \|\cdot\|_F$ such that
$$\cdots \tag{284}$$
$$= \mathbb{E}_G \|S\phi S^{-1}(g) - \omega(g)\|_F^2, \tag{285}$$
where we apply Parseval's identity. Note, however, that the lhs of this expression runs over all irreducible representations of $G$ and not only the ones decomposing $\omega$.

X. RANDOMIZED BENCHMARKING UNDER DIAMOND NORM AND FIDELITY CONSTRAINTS
In Theorem 8, we have argued that randomized benchmarking output data associated with an implementation of a group $G$ could be approximated as a sum of (matrix) exponentials provided the implementation map $\phi$ was close to a reference representation $\omega$ with respect to the diamond norm (averaged over all group elements). Here we argue that this is a natural condition to demand in the context of RB. In particular, we show that this condition is stable, in the sense that it is impossible to be close [in the sense of Eq. (72)] to two inequivalent representations at once, and, moreover, we show that this requirement cannot be replaced with a weaker one involving the average fidelity, resolving an open question in Ref. [25].

A. Stability of representations under diamond norm
First, we prove that "closeness to a representation" is a stable concept, that is, it is impossible to be close to two representations at once (in a suitable sense).

Theorem 18: (Stability of representations). Let $\phi$ be an implementation map of a group $G$ taking values in $S_d$ such that
$$\frac{1}{|G|}\sum_{g\in G}\|\mathbb{1} - \phi(g)\phi(g^{-1})\|_\diamond \le \delta \tag{286}$$
and let $\omega, \omega'$ be representations of $G$ on $V_n, V_{n'}$ with embedding maps $L$, $L'$ and $R$, $R'$ connecting $S_d$ with $V_n$ and $V_{n'}$, respectively, such that
$$\frac{1}{|G|}\sum_{g\in G}\|\phi(g) - R\,\omega(g)L\|_\diamond \le \varepsilon, \tag{287}$$
$$\frac{1}{|G|}\sum_{g\in G}\|\phi(g) - R'\omega'(g)L'\|_\diamond \le \varepsilon'. \tag{288}$$
Moreover, assume that there exists $K$ such that $\|R\,\omega(g)L\|_\diamond \le K$, $\|R'\omega'(g)L'\|_\diamond \le K$ for all $g \in G$. If the inequality $K(\varepsilon + \varepsilon') + 3\delta + 2\varepsilon + \varepsilon^2 < 1$ holds then the representations $\omega, \omega'$ are equivalent on a subspace of dimension at least $d^2$.

Proof. Consider the map $LR' : V_{n'} \to V_n$, as well as its twirled version
$$T = \frac{1}{|G|}\sum_{g\in G}\omega(g)\,LR'\,\omega'(g)^\dagger. \tag{289}$$
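The twirl in Eq. (289) can be illustrated on a small example. The sketch below is a toy instance under stated assumptions, not the setting of the theorem: $G = \mathbb{Z}_3$, $\omega$ is its regular representation (cyclic shifts), $\omega'$ is a hypothetical three-dimensional representation containing the trivial character once and a nontrivial character twice, and a random matrix `M` stands in for $LR'$. It checks that the twirled map intertwines the two representations, which is the property that Schur's lemma exploits.

```python
import numpy as np

n = 3                                    # order of the cyclic group Z_3
w = np.exp(2j * np.pi / n)               # primitive third root of unity
shift = np.roll(np.eye(n), 1, axis=0)    # generator of the regular representation

def omega(g):
    """Regular representation of Z_3 (contains every character once)."""
    return np.linalg.matrix_power(shift, g)

def omega_p(g):
    """Hypothetical second representation: trivial character + w^g twice."""
    return np.diag([1.0, w**g, w**g])

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))   # stand-in for L R'

# Twirl of M over the group, cf. Eq. (289).
T = sum(omega(g) @ M @ omega_p(g).conj().T for g in range(n)) / n

for h in range(n):
    print(h, np.allclose(omega(h) @ T, T @ omega_p(h)))
print("rank of T:", np.linalg.matrix_rank(T, tol=1e-10))
```

Because $T$ intertwines $\omega'$ and $\omega$, its rank is limited by the irreducible content the two representations share; in this example the shared content has dimension two, so a generic `M` yields a rank-2 twirl even though `M` itself has full rank.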
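We would like to argue that $T$ is a map of rank at least $d^2$, as then we can decide the theorem by application of Schur's lemma. To do this, consider the distance to the identity of the natural pullback of $T$ to $S_d$, namely $RTL'$. We can calculate
$$\|\mathbb{1} - RTL'\|_\diamond \le \Big\|\mathbb{1} - \frac{1}{|G|}\sum_{g\in G} R\,\omega(g)LR\,\omega(g)^\dagger L\Big\|_\diamond + \Big\|\frac{1}{|G|}\sum_{g\in G} R\,\omega(g)LR\,\omega(g)^\dagger L - RTL'\Big\|_\diamond \tag{290}$$
$$\le \frac{1}{|G|}\sum_{g\in G}\Big[\|\mathbb{1} - R\,\omega(g)LR\,\omega(g)^\dagger L\|_\diamond + \|R\,\omega(g)LR\,\omega(g)^\dagger L - R\,\omega(g)LR'\omega'(g)^\dagger L'\|_\diamond\Big]. \tag{291}$$
We upper bound these two terms separately. For the first term, consider
$$\frac{1}{|G|}\sum_{g\in G}\|\mathbb{1} - R\,\omega(g)LR\,\omega(g)^\dagger L\|_\diamond \tag{292}$$
$$\le \frac{1}{|G|}\sum_{g\in G}\Big[\|\mathbb{1} - \phi(g)\phi(g^{-1})\|_\diamond + \|\mathbb{1} - \phi(g)R\,\omega(g^{-1})L\|_\diamond + \|\mathbb{1} - R\,\omega(g)L\phi(g^{-1})\|_\diamond + \|\phi(g) - R\,\omega(g)L\|_\diamond\,\|\phi(g^{-1}) - R\,\omega(g^{-1})L\|_\diamond\Big] \tag{293}$$
$$\le \delta + \varepsilon^2 + \frac{1}{|G|}\sum_{g\in G}\Big[\|\mathbb{1} - \phi(g)\phi(g^{-1})\|_\diamond + \|\phi(g)\|_\diamond\,\|\phi(g^{-1}) - R\,\omega(g^{-1})L\|_\diamond \tag{294}$$
$$+ \|\mathbb{1} - \phi(g)\phi(g^{-1})\|_\diamond + \|\phi(g) - R\,\omega(g)L\|_\diamond\,\|\phi(g^{-1})\|_\diamond\Big] \tag{295}$$
$$\le 3\delta + 2\varepsilon + \varepsilon^2, \tag{296}$$
where we exploit the submultiplicativity of the diamond norm and the fact that $\|\phi(g)\|_\diamond = 1$ for all $g \in G$. Similarly, for the second term we get
$$\frac{1}{|G|}\sum_{g\in G}\|R\,\omega(g)LR\,\omega(g)^\dagger L - R\,\omega(g)LR'\omega'(g)^\dagger L'\|_\diamond = \frac{1}{|G|}\sum_{g\in G}\|R\,\omega(g)L\,[R\,\omega(g)^\dagger L - R'\omega'(g)^\dagger L']\|_\diamond \tag{297}$$
$$\le \frac{1}{|G|}\sum_{g\in G}\|R\,\omega(g)L\|_\diamond\,\|R\,\omega(g)^\dagger L - R'\omega'(g)^\dagger L'\|_\diamond \tag{298}$$
$$\le K(\varepsilon + \varepsilon'). \tag{299}$$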
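The step from Eq. (292) to Eq. (293) rests on an exact algebraic identity followed by the triangle inequality and submultiplicativity. A minimal numerical sketch of that identity is given below; it uses random real matrices and the spectral norm as a stand-in for the diamond norm, both of which are assumptions made only for illustration. The variables `phi`, `phi_inv`, `A`, and `B` are hypothetical placeholders for $\phi(g)$, $\phi(g^{-1})$, $R\,\omega(g)L$, and $R\,\omega(g^{-1})L$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Placeholders for phi(g), phi(g^-1), R omega(g) L and R omega(g^-1) L.
phi, phi_inv, A, B = (rng.normal(size=(n, n)) for _ in range(4))
I = np.eye(n)

# Exact identity behind Eq. (293):
# 1 - AB = (1 - phi B) + (1 - A phi') - (1 - phi phi') - (phi - A)(phi' - B)
lhs = I - A @ B
rhs = ((I - phi @ B) + (I - A @ phi_inv)
       - (I - phi @ phi_inv) - (phi - A) @ (phi_inv - B))

def norm(X):
    """Spectral norm, used here in place of the diamond norm."""
    return np.linalg.norm(X, 2)

# Triangle inequality plus submultiplicativity gives the bound of Eq. (293).
bound = (norm(I - phi @ phi_inv) + norm(I - phi @ B)
         + norm(I - A @ phi_inv) + norm(phi - A) * norm(phi_inv - B))
print(np.allclose(lhs, rhs), norm(lhs) <= bound)
```

The same pattern, applied once more to the two middle terms, produces the $3\delta + 2\varepsilon + \varepsilon^2$ of Eq. (296).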
$$\frac{1}{|G|}\sum_{g\in G}\|R\,\omega(g)L - \phi'(g)\|_\diamond \le \delta + \varepsilon. \tag{304}$$
$$\Gamma_L^\mu(|i\rangle\langle j|) = \delta_{i,j}\sum_{k=1}^{d}[S_L^\mu]_{i,k}\,|k\rangle\langle k| \tag{307}$$
with $S_L^\mu$ a $d \times d$ stochastic matrix of the form
$$[S_L^\mu]_{i,j} = \begin{cases} \mu & \text{if } i = j \le L - 1,\\ 1 & \text{if } i = j \ge L,\\ 1 - \mu & \text{if } i = j - 1 \le L,\\ 0 & \text{otherwise.} \end{cases} \tag{308}$$
For convenience we write $\Gamma$ for $\Gamma_L^\mu$ in the following. It is easy to see that $\Gamma$ is a quantum channel and moreover that if $i, j \le L$ then $\Gamma(|i\rangle\langle j|) \in \mathrm{Span}\{|i'\rangle\langle j'| : i', j' \le L\}$.
Consider now the following implementation map defined by its action on $X \in M_d$:
$$\phi(g)(X) = \Gamma(P_L X P_L) + U_g (I - P_L) X (I - P_L) U_g^\dagger, \tag{309}$$
where $P_L$ is the projection onto the space $\mathrm{Span}\{|i\rangle : i \le L\}$. This map can be seen as checking whether a state is in the support of $P_L$ (through a measurement) and then applying $\Gamma$ or $U_g$ depending on the outcome. We can calculate the average fidelity $F_{\rm avg}[\phi(g), \omega(g)]$ directly as
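$$F_{\rm avg}[\phi(g), \omega(g)] = \int d\psi\, \mathrm{Tr}\big[\phi(g)(|\psi\rangle\langle\psi|)\,\omega(g)^\dagger(|\psi\rangle\langle\psi|)\big] \tag{310}$$
$$= \int d\psi\, \mathrm{Tr}\big[U_g|\psi\rangle\langle\psi|U_g^\dagger\,\Gamma(P_L|\psi\rangle\langle\psi|P_L)\big] + \int d\psi\, \mathrm{Tr}\big[|\psi\rangle\langle\psi|(I - P_L)|\psi\rangle\langle\psi|(I - P_L)\big] \tag{311}$$
$$= \int d\psi\, \mathrm{Tr}\big[U_g|\psi\rangle\langle\psi|U_g^\dagger\,\Gamma(P_L|\psi\rangle\langle\psi|P_L)\big] + \int d\psi\,\big[1 - 2\langle\psi|P_L|\psi\rangle + (\langle\psi|P_L|\psi\rangle)^2\big] \tag{312}$$
$$\ge 1 - 2\int d\psi\, \langle\psi|P_L|\psi\rangle \tag{313}$$
$$\ge 1 - \frac{2L}{d}, \tag{314}$$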
where we make use of the fact that $\Gamma(P_L|\psi\rangle\langle\psi|P_L) \ge 0$, since $\Gamma$ is CP. Note that for constant $L$ we can make the fidelity arbitrarily high by choosing $d = 2^q$ large enough. Now consider RB with input state $\rho = |1\rangle\langle 1|$ and measurement POVM $\{|1\rangle\langle 1| + |L\rangle\langle L|,\ \mathbb{1} - |1\rangle\langle 1| - |L\rangle\langle L|\}$ and implementation map $\phi_L$ as defined above. The RB probability for the POVM element $|1\rangle\langle 1| + |L\rangle\langle L|$ is going to be (setting $g_{\rm end} = e$ and assuming no SPAM errors)
$$p(|1\rangle\langle 1| + |L\rangle\langle L|, m) = \mathrm{Tr}\big[(|1\rangle\langle 1| + |L\rangle\langle L|)\,\phi_L^{*m}(|1\rangle\langle 1|)\big]. \tag{315}$$
Note that since $P_L|1\rangle\langle 1| = |1\rangle\langle 1|P_L$ we have that $\phi_L(g)(|1\rangle\langle 1|) = (\Gamma_L^\mu)^m(|1\rangle\langle 1|)$ for all $g$. From this it follows that
$$p(|1\rangle\langle 1| + |L\rangle\langle L|, m) = \mathrm{Tr}\big[(|1\rangle\langle 1| + |L\rangle\langle L|)(\Gamma_L^\mu)^m(|1\rangle\langle 1|)\big] = [(S_L^\mu)^m]_{1,L} + [(S_L^\mu)^m]_{1,1}. \tag{316}$$
This data shows curious behavior. For small sequence lengths we have $p(|1\rangle\langle 1| + |L\rangle\langle L|, m) \approx \mu^m$, but with increasing sequence length we observe wildly nonexponential behavior.

XI. CONCLUSIONS
In this work, we have introduced a comprehensive theory of RB. As such, it goes beyond a mere classification of known protocols (a task that we also hope to achieve). But at the same time, it provides a deeper understanding, a more precise formulation and interpretation of what the data acquired in RB means, actionable advice to experimentalists and theoretical practitioners, and a conceptual platform from which new schemes can be derived. Specifically, we show how RB gives rise to exponential decays under broad classes of Markovian noise models, show (importantly in practical contexts) in what sense RB is robust to deviations from uniform sampling, and provide further evidence for the interpretation in terms of average gate fidelities. Maybe most important for our work to serve as a basis for substantial further development of methods and protocols are new conceptual insights into how inversion gates are, in contrast to common belief, not required for RB and into how large classes of groups in RB can become available by means of new filtering techniques. This contributes to overcoming the problem of isolating exponential decays in a fully scalable manner. First steps into exploiting the insights established here when devising new schemes have already been made [57–59]. We hope that this work provides a starting point
for a further rich class of new protocols of quantum certification and benchmarking, providing stringent and rigorous quality criteria, while respecting experimental needs and desiderata.

ACKNOWLEDGMENTS
J.H. would like to acknowledge helpful conversations with Michael Walter, Bas Dirkse, and Freek Witteveen. I.R. would like to thank Richard Kueng, Martin Kliesch, Marios Ioannou, Dominik Hangleiter, and Jonas Haferkamp for helpful discussions and Susane Calegari for contributions to the illustration. The authors would also like to acknowledge an anonymous referee for pointing out the correct way to include cycle benchmarking into the framework of Theorem 8. The Berlin team has been supported by the BMBF project DAQC, for which it introduces new methods for randomized benchmarking of near-term superconducting quantum platforms, and BMBF project MUNIQC-ATOMS, for which it introduces a starting point to develop schemes of analog randomized benchmarking. It has also been funded by the DFG (EI 519/9-1, for which this work develops ideas of signal processing, and DFG CRC 183, for which this is an internode work Berlin-Copenhagen, as well as DFG EI 519/14-1), and the Munich Quantum Valley (K-8). This work has also received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 817482 (PASQuanS), for which it assesses feasible benchmarking schemes in quantum computing and simulation, and the Einstein Foundation. E.O. has been supported by the Royal Society. A.H.W. thanks the VILLUM FONDEN for its support with a Villum Young Investigator Grant (Grant No. 25452) and its support via the QMATH Centre of Excellence (Grant No. 10059).

[1] J. Emerson, R. Alicki, and K. Zyczkowski, Scalable noise estimation with random unitary operators, J. Opt. B 7, S347 (2005).
[2] C. Dankert, R. Cleve, J. Emerson, and E. Livine, Exact and approximate unitary 2-designs and their application to fidelity estimation, Phys. Rev. A 80, 012304 (2009).
[3] B. Lévi, C. C. López, J. Emerson, and D. G. Cory, Efficient error characterization in quantum information processing, Phys. Rev. A 75, 022314 (2007).
[4] E. Magesan, J. M. Gambetta, B. R. Johnson, C. A. Ryan, J. M. Chow, S. T. Merkel, M. P. Da Silva, G. A. Keefe, M. B. Rothwell, and T. A. Ohki, et al., Efficient Measurement of Quantum Gate Error by Interleaved Randomized Benchmarking, Phys. Rev. Lett. 109, 080505 (2012).
[5] E. Knill, D. Leibfried, R. Reichle, J. Britton, R. B. Blakestad, J. D. Jost, C. Langer, R. Ozeri, S. Seidelin, and D. J. Wineland, Randomized benchmarking of quantum gates, Phys. Rev. A 77, 012307 (2008).
[6] J. Emerson, M. Silva, O. Moussa, C. Ryan, M. Laforest, J. Baugh, D. G. Cory, and R. Laflamme, Symmetrized characterization of noisy quantum processes, Science 317, 1893 (2007).
[7] E. T. Campbell, B. M. Terhal, and C. Vuillot, Roads towards fault-tolerant universal quantum computation, Nature 549, 172 (2017).
[8] R. Barends, et al., Rolling quantum dice with a superconducting qubit, Phys. Rev. A 90, 030303 (2014).
[9] E. Onorati, A. H. Werner, and J. Eisert, Randomized Benchmarking for Individual Quantum Gates, Phys. Rev. Lett. 123, 060501 (2019).
[10] A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Characterizing universal gate sets via dihedral benchmarking, Phys. Rev. A 92, 060302 (2015).
[11] A. W. Cross, E. Magesan, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, Scalable randomized benchmarking of non-Clifford gates, npj Quant. Inf. 2, 16012 (2016).
[12] J. Helsen, X. Xue, L. M. K. Vandersypen, and S. Wehner, A new class of efficient randomized benchmarking protocols, npj Quant. Inf. 5, 1 (2019).
[13] A. Erhard, J. J. Wallman, L. Postler, M. Meth, R. Stricker, E. A. Martinez, P. Schindler, T. Monz, J. Emerson, and R. Blatt, Characterizing large-scale quantum computers via cycle benchmarking, Nat. Commun. 10 (2019).
[14] D. S. Franca and A. K. Hashagen, Approximate randomized benchmarking for finite groups, J. Phys. A 51, 395302 (2018).
[15] T. J. Proctor, A. Carignan-Dugas, K. Rudinger, E. Nielsen, R. Blume-Kohout, and K. Young, Direct Randomized Benchmarking for Multiqubit Devices, Phys. Rev. Lett. 123, 030503 (2019).
[16] J. Wallman, C. Granade, R. Harper, and S. T. Flammia, Estimating the coherence of noise, New J. Phys. 17, 113020 (2015).
[17] J. M. Gambetta, A. D. Córcoles, S. T. Merkel, B. R. Johnson, J. A. Smolin, J. M. Chow, C. A. Ryan, C. Rigetti, S. Poletto, T. A. Ohki, M. B. Ketchen, and M. Steffen, Characterization of Addressability by Simultaneous Randomized Benchmarking, Phys. Rev. Lett. 109, 240504 (2012).
[18] J. J. Wallman, M. Barnhill, and J. Emerson, Robust Characterization of Loss Rates, Phys. Rev. Lett. 115, 060501 (2015).
[19] J. J. Wallman, M. Barnhill, and J. Emerson, Robust characterization of leakage errors, New J. Phys. 18, 043021 (2016).
[20] S. Kimmel, M. P. da Silva, C. A. Ryan, B. R. Johnson, and T. Ohki, Robust Extraction of Tomographic Information via Randomized Benchmarking, Phys. Rev. X 4, 011050 (2014).
[21] I. Roth, R. Kueng, S. Kimmel, Y.-K. Liu, D. Gross, J. Eisert, and M. Kliesch, Recovering Quantum Gates from few Average Gate Fidelities, Phys. Rev. Lett. 121, 170502 (2018).
[22] S. T. Flammia and J. J. Wallman, Efficient estimation of Pauli channels (2019), ArXiv:1907.12976.
[23] J. Eisert, D. Hangleiter, N. Walk, I. Roth, D. Markham, R. Parekh, U. Chabaud, and E. Kashefi, Quantum certification and benchmarking, Nat. Rev. Phys. 2, 382 (2020).
[24] J. J. Wallman, Randomized benchmarking with gate-dependent noise, Quantum 2, 47 (2018).
[25] S. T. Merkel, E. J. Pritchett, and B. H. Fong, Randomized benchmarking as convolution: Fourier analysis of gate dependent errors (2018).
[26] T. Proctor, K. Rudinger, K. Young, M. Sarovar, and R. Blume-Kohout, What Randomized Benchmarking Actually Measures, Phys. Rev. Lett. 119, 130502 (2017).
[27] A. Carignan-Dugas, K. Boone, J. J. Wallman, and J. Emerson, From randomized benchmarking experiments to gate-set circuit fidelity: How to interpret randomized benchmarking decay parameters, New J. Phys. 20, 092001 (2018).
[28] A. Acin, I. Bloch, H. Buhrman, T. Calarco, C. Eichler, J. Eisert, J. Esteve, N. Gisin, S. J. Glaser, F. Jelezko, S. Kuhr, M. Lewenstein, M. F. Riedel, P. O. Schmidt, R. Thew, A. Wallraff, I. Walmsley, and F. K. Wilhelm, The European quantum technologies roadmap, New J. Phys. 20, 080201 (2018).
[29] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. S. L. Brandao, and D. A. Buell, et al., Quantum supremacy using a programmable superconducting processor, Nature 574, 505 (2019).
[30] A. Bouland, B. Fefferman, C. Nirkhe, and U. Vazirani, On the complexity and verification of quantum random circuit sampling, Nat. Phys. 15, 159 (2019).
[31] K. Noh, L. Jiang, and B. Fefferman, Efficient classical simulation of noisy random quantum circuits in one dimension, Quantum 4, 318 (2020).
[32] A. M. Dalzell, N. Hunter-Jones, and F. G. S. L. Brandão, Random quantum circuits transform local noise into global white noise (2021), ArXiv:2111.14907.
[33] Y. Liu, M. Otten, R. Bassirianjahromi, L. Jiang, and B. Fefferman, Benchmarking near-term quantum computers via random circuit sampling (2021), ArXiv:2105.05232.
[34] E. Magesan, J. M. Gambetta, and J. Emerson, Scalable and Robust Randomized Benchmarking of Quantum Processes, Phys. Rev. Lett. 106, 180504 (2011).
[35] A. K. Hashagen, S. T. Flammia, D. Gross, and J. J. Wallman, Real randomized benchmarking, Quantum 2, 85 (2018).
[36] J. M. Gambetta, A. D. Córcoles, S. T. Merkel, B. R. Johnson, J. A. Smolin, J. M. Chow, C. A. Ryan, C. Rigetti, S. Poletto, and T. A. Ohki, et al., Characterization of Addressability by Simultaneous Randomized Benchmarking, Phys. Rev. Lett. 109, 240504 (2012).
[37] A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Characterizing universal gate sets via dihedral benchmarking, Phys. Rev. A 92, 060302 (2015).
[38] A. W. Cross, E. Magesan, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, Scalable randomised benchmarking of non-Clifford gates, npj Quant. Inf. 2, 16012 (2016).
[39] J. Helsen, X. Xue, L. M. K. Vandersypen, and S. Wehner, A new class of efficient randomized benchmarking protocols, npj Quant. Inf. 5, 1 (2019).
[40] W. G. Brown and B. Eastin, Randomized benchmarking with restricted gate sets, Phys. Rev. A 97, 062323 (2018).
[41] T. Chasseur and F. K. Wilhelm, Complete randomized benchmarking protocol accounting for leakage errors, Phys. Rev. A 92, 042333 (2015).
[42] C. J. Wood and J. M. Gambetta, Quantification and characterization of leakage errors, Phys. Rev. A 97, 032306 (2018).
[43] R. N. Alexander, P. S. Turner, and S. D. Bartlett, Randomized benchmarking in measurement-based quantum computing, Phys. Rev. A 94, 032303 (2016).
[44] J. Combes, C. Granade, C. Ferrie, and S. T. Flammia, Logical randomized benchmarking (2017), ArXiv:1702.03688.
[45] S. T. Flammia and J. J. Wallman, Efficient estimation of Pauli channels, Nat. Phys. (2020), ArXiv:1907.12976.
[46] R. Harper, S. T. Flammia, and J. J. Wallman, Efficient learning of quantum noise, Nat. Phys. (2020).
[47] R. Harper and S. T. Flammia, Estimating the fidelity of T gates using standard interleaved randomized benchmarking, Quant. Sc. Tech. 2, 015008 (2017).
[48] S. Sheldon, L. S. Bishop, E. Magesan, S. Filipp, J. M. Chow, and J. M. Gambetta, Characterizing errors on qubit operations via iterative randomized benchmarking, Phys. Rev. A 93, 012301 (2016).
[49] T. Chasseur, D. M. Reich, C. P. Koch, and F. K. Wilhelm, Hybrid benchmarking of arbitrary quantum gates, Phys. Rev. A 95, 062335 (2017).
[50] S. Kimmel, M. P. da Silva, C. A. Ryan, B. R. Johnson, and T. Ohki, Robust Extraction of Tomographic Information via Randomized Benchmarking, Phys. Rev. X 4, 011050 (2014).
[51] K. Boone, A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Randomized benchmarking under different gate sets, Phys. Rev. A 99, 032329 (2019).
[52] T. Kato, Perturbation Theory for Linear Operators, Vol. 132 (Springer-Verlag Berlin Heidelberg, Berlin, 1995).
[53] G. W. Stewart and Ji-Guang Sun, Matrix Perturbation Theory (Academic Press, Boston, 1990).
[54] H.-Y. Huang, R. Kueng, and J. Preskill, Predicting many properties of a quantum system from very few measurements, Nat. Phys. 16, 1050 (2020).
[55] M. Kliesch and I. Roth, Theory of quantum system certification, PRX Quantum 2, 010201 (2021).
[56] J. Helsen, J. J. Wallman, S. T. Flammia, and S. Wehner, Multiqubit randomized benchmarking using few samples, Phys. Rev. A 100, 032304 (2019).
[57] J. Helsen, S. Nezami, M. Reagor, and M. Walter, Matchgate benchmarking: Scalable benchmarking of a continuous family of many-qubit gates (2020), ArXiv:2011.13048.
[58] L. Kong, A framework for randomized benchmarking over compact groups, ArXiv:2111.10357.
[59] J. Helsen, M. Ioannou, I. Roth, J. Kitzinger, E. Onorati, A. H. Werner, and J. Eisert, Estimating gate-set properties from random sequences (2021), ArXiv:2110.13178.
[60] R. Goodman and N. R. Wallach, Representations and Invariants of the Classical Groups (Cambridge University Press, Cambridge, 2000).
[61] W. Fulton and J. Harris, Representation Theory: a First Course, Vol. 129 (Springer Science & Business Media, New York, 2013).
[62] W. T. Gowers and O. Hatami, Inverse and stability theorems for approximate representations of finite groups, Sbornik: Math. 208, 1784 (2017).
[63] M. E. Kilmer and D. P. O'Leary, Selected Works with Commentaries, edited by G.W. Stewart (Birkhäuser Basel, 2010).
[64] B. Dirkse, J. Helsen, and S. Wehner, Efficient unitarity randomized benchmarking of few-qubit Clifford gates, Phys. Rev. A 99, 012315 (2019).
[65] J. J. Wallman, M. Barnhill, and J. Emerson, Robust characterization of leakage errors, New J. Phys. 18, 043021 (2016).
[66] E. Magesan, R. Blume-Kohout, and J. Emerson, Gate fidelity fluctuations and quantum process invariants, Phys. Rev. A 84, 012309 (2011).
[67] C. A. Ryan, M. Laforest, and R. Laflamme, Randomized benchmarking of single-and multi-qubit control in liquid-state nmr quantum information processing, New J. Phys. 11, 013034 (2009).
[68] J. J. Wallman and S. T. Flammia, Randomized benchmarking with confidence, New J. Phys. 16, 103032 (2014).
[69] J. M. Epstein, A. W. Cross, E. Magesan, and J. M. Gambetta, Investigating the limits of randomized benchmarking protocols, Phys. Rev. A 89, 062321 (2014).
[70] B. H. Fong and S. T. Merkel, Randomized benchmarking, correlated noise, and Ising models (2017), ArXiv:1703.09747.
[71] M. A. Fogarty, M. Veldhorst, R. Harper, C. H. Yang, S. D. Bartlett, S. T. Flammia, and A. S. Dzurak, Nonexponential fidelity decay in randomized benchmarking with low-frequency noise, Phys. Rev. A 92, 022326 (2015).
[72] A. Carignan-Dugas, J. J. Wallman, and J. Emerson, Bounding the average gate fidelity of composite channels using the unitarity, New J. Phys. 21, 053016 (2019).
[73] C. T. Kelley, Iterative Methods for Optimization (SIAM, Philadelphia, 1999).
[74] R. Harper, I. Hincks, C. Ferrie, S. T. Flammia, and J. J. Wallman, Statistical analysis of randomized benchmarking, Phys. Rev. A 99, 052350 (2019).
[75] P. R. Prony, Essai experimentale et analytique, J. de l'Ecole Polytechnique 1, 24 (1795).
[76] E. J. Candès and C. Fernandez-Granda, Super-resolution from noisy data, J. Fourier An. App. 19, 1229 (2013).
[77] E. J. Candes and C. Fernandez-Granda, Towards a mathematical theory of super-resolution, Comm. Pure App. Math. 67, 906 (2014).
[78] R. Schmidt, Multiple emitter location and signal parameter estimation, IEEE Trans. Ant. Prop. 34, 276 (1986).
[79] R. Roy, A. Paulraj, and T. Kailath, in MILCOM 1986-IEEE Military Communications Conference: Communications-Computers: Teamed for the 90's, Vol. 3 (IEEE, 1986).
[80] W. Liao and A. Fannjiang, Music for single-snapshot spectral estimation: Stability and super-resolution, Appl. Comp. Harm. An. 40, 33 (2016).
[81] A. Fannjiang, Compressive spectral estimation with single-snapshot esprit: Stability and resolution (2016), ArXiv:1607.01827.
[82] W. Li and W. Liao, Stable super-resolution limit and smallest singular value of restricted fourier matrices (2017), ArXiv:1709.03146.
[83] W. Li, W. Liao, and A. Fannjiang, Super-resolution limit of the ESPRIT algorithm (2019), ArXiv:1905.03782.
[84] R. Badeau, B. David, and G. Richard, High-resolution spectral analysis of mixtures of complex exponentials modulated by polynomials, IEEE Trans. Sig. Proc. 54, 1341 (2006).
[85] R. Badeau, G. Richard, and B. David, Performance of esprit for estimating mixtures of complex exponentials modulated by polynomials, IEEE Trans. Sig. Proc. 56, 492 (2008).
[86] F. S. V. Bazan, Conditioning of rectangular Vandermonde matrices with nodes in the unit disk, SIAM J. Mat. An. App. 21, 679 (2006).
[87] L. T. Nguyen, J. Kim, and B. Shim, Low-rank matrix completion: A contemporary survey, IEEE Access 7, 94215 (2019).
[88] J. A. Tropp, User-friendly tail bounds for sums of random matrices, Found. Comput. Math. 12, 389 (2012).
[89] R. Ahlswede and A. Winter, Strong converse for identification via quantum channels, IEEE Trans. Inform. Th. 48, 569 (2002).
[90] A. Ginory and J. Kim, Weingarten calculus and the IntHaar package for integrals over compact matrix groups, J. Symb. Comp. (2019).
[91] Z. Webb, The Clifford group forms a unitary 3-design, Quantum Inf. Comput. 16, 1379 (2016).
[92] H. Zhu, Multiqubit Clifford groups are unitary 3-designs, Phys. Rev. A 96, 062336 (2017).
[93] M. B. Ruskai, S. Szarek, and E. Werner, An analysis of completely-positive trace-preserving maps on m2, Lin. Alg. App. 347, 159 (2002).
[94] A. Carignan-Dugas, M. Alexander, and J. Emerson, A polar decomposition for quantum channels (with applications to bounding error propagation in quantum circuits), Quantum 3, 173 (2019).
[95] D. J. Hartfiel, Dense sets of diagonalizable matrices, Proc. Am. Math. Soc. 123, 1669 (1995).
[96] M. M. Wolf, Quantum channels & operations. Guided tour. Lecture notes available at http://www-m5.ma.tum.de/foswiki/pubM, 5, 2012.