Bugs in Quantum Computing Platforms: An Empirical Study

MATTEO PALTENGHI, University of Stuttgart, Germany


MICHAEL PRADEL, University of Stuttgart, Germany
The interest in quantum computing is growing, and with it, the importance of software platforms to develop
quantum programs. Ensuring the correctness of such platforms is important, and it requires a thorough
understanding of the bugs they typically suffer from. To address this need, this paper presents the first
in-depth study of bugs in quantum computing platforms. We gather and inspect a set of 223 real-world bugs
from 18 open-source quantum computing platforms. Our study shows that a significant fraction of these bugs
(39.9%) are quantum-specific, calling for dedicated approaches to prevent and find them. The bugs are spread
across various components, but quantum-specific bugs occur particularly often in components that represent,
compile, and optimize quantum programming abstractions. Many quantum-specific bugs manifest through
unexpected outputs, rather than more obvious signs of misbehavior, such as crashes. Finally, we present a
hierarchy of recurrent bug patterns, including ten novel, quantum-specific patterns. Our findings not only
show the importance and prevalence of bugs in quantum computing platforms, but they help developers to avoid
common mistakes and tool builders to tackle the challenge of preventing, finding, and fixing these bugs.
CCS Concepts: • Software and its engineering → Software libraries and repositories; • General and
reference → Empirical studies.
Additional Key Words and Phrases: quantum computing platform, software bugs, empirical study
ACM Reference Format:
Matteo Paltenghi and Michael Pradel. 2022. Bugs in Quantum Computing Platforms: An Empirical Study. Proc.
ACM Program. Lang. 6, OOPSLA1, Article 86 (April 2022), 27 pages. https://doi.org/10.1145/3527330

1 INTRODUCTION
Quantum computing has been making immense progress recently. Over the past decades, the field
has evolved from an idea that seemed a long way off to a domain with huge investments from both
public and private players [Ove 2021]. Quantum algorithms [Farhi et al. 2014; Grover 1996; Harrow
et al. 2009; Shor 1999] are on their way to becoming, in selected domains, a serious competitor for
classical computing.
Executing a quantum algorithm on a quantum computer or a simulator requires a complex
software stack. We call this software stack a quantum computing platform. Such a platform encom-
passes a quantum programming language, a compiler, and an execution environment that supports
running quantum programs. Various efforts currently compete for building quantum computing
platforms, e.g., Qiskit by IBM, Cirq by Google, and Q# by Microsoft. Given the key role of quantum
computing platforms in this growing domain, ensuring the correctness of these platforms is a high
priority. There are various approaches for preventing and finding bugs in other kinds of critical
software infrastructure, e.g., for testing [Chen et al. 2020] or verifying [Leroy 2009] compilers.
We argue that quantum computing platforms play a role similar to language implementations for
traditional computing, and hence deserve a similar level of attention.
Authors’ addresses: Matteo Paltenghi, University of Stuttgart, Stuttgart, Germany, [email protected]; Michael Pradel,
University of Stuttgart, Stuttgart, Germany, [email protected].

This work is licensed under a Creative Commons Attribution 4.0 International License.
© 2022 Copyright held by the owner/author(s).
2475-1421/2022/4-ART86
https://doi.org/10.1145/3527330

Proc. ACM Program. Lang., Vol. 6, No. OOPSLA1, Article 86. Publication date: April 2022.

 def fold_global(circuit: QPROGRAM, stretch: float, **kwargs) -> QPROGRAM:
     """Gives a circuit by folding the global unitary of the input circuit."""
     ...
     # Fold remaining gates until the stretch is reached
     ops = list(base_circuit.all_operations())
-    num_to_fold = int(round(fractional_stretch * len(ops)))
+    num_to_fold = int(round(fractional_stretch * len(ops) / 2))
     if num_to_fold > 0:
         folded += Circuit([inverse(ops[-num_to_fold:])], [ops[-num_to_fold:]])

Fig. 1. Example of a quantum bug that induces an incorrect output. (UUID: 1687, from Mitiq)

An important prerequisite for preventing and detecting bugs is understanding what bugs exist in
the wild. Studies in other domains, e.g., on concurrency bugs [Lu et al. 2008] or compiler bugs [Sun
et al. 2016], have proven useful to guide future work toward addressing relevant problems. However,
there currently is no detailed study of bugs in quantum programming platforms.
To fill this gap, this paper presents the first in-depth study of bugs in quantum computing
platforms. We gather a set of 223 real-world bugs from 18 open-source projects, including highly
popular platforms, such as Qiskit, Cirq, and Q#. We thoroughly inspect and annotate these bugs in
an iterative process, to understand how many of them are quantum-specific, what components
of a quantum computing platform the bugs are in, how these bugs manifest (and hence can be
detected), and how complex the corresponding bug fixes are. Moreover, we identify a set of recurring
bug patterns that highlight kinds of mistakes developers make repeatedly, even across different
platforms.
As an example of a quantum bug considered in our study, Figure 1 shows a problem from the
Mitiq project [LaRose et al. 2021], along with the fix applied by the developers. The function
fold_global is part of a noise mitigation technique called “Zero-noise extrapolation”, which adds
pairs of opposite operations to the program. These additional operations are intended to make the
computation longer and noisier, while preserving the overall mathematical result, with the goal to
inject noise to extrapolate the result without noise. The code adds a number of pairs of operations
determined by num_to_fold. In the buggy version, the code fails to consider that each folding step
adds two operations, and not one, which is then fixed by dividing num_to_fold by two. The bug
results in adding twice the intended number of operations, and hence more noise than intended. The example
is representative of several commonly observed characteristics of bugs in quantum computing
platforms: The problem is specific to the domain of quantum computing, is located in a component
that is about evaluating the state of a quantum program, and manifests through incorrect output,
which makes the bug non-trivial to detect.
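
The arithmetic behind this fix can be illustrated with a minimal, self-contained sketch. It does not use Mitiq or Cirq: gates are plain strings, and the helper names (inverse, fold_tail), the gate labels, and the reading of fractional_stretch as the fraction by which the circuit should grow are assumptions made only for illustration.

```python
def inverse(ops):
    """Return the inverse of a gate sequence: reversed order, each gate inverted.

    Gates are plain strings; a trailing "^-1" marks a (hypothetical) inverse gate.
    """
    return [op[:-3] if op.endswith("^-1") else op + "^-1" for op in reversed(ops)]


def fold_tail(ops, fractional_stretch, halve):
    """Append inverse(tail) + tail for the last num_to_fold operations.

    Assumption: fractional_stretch is the fraction of len(ops) by which the
    circuit should grow. halve=False mimics the buggy computation of
    num_to_fold, halve=True the fixed one.
    """
    extra = fractional_stretch * len(ops)
    num_to_fold = int(round(extra / 2 if halve else extra))
    if num_to_fold == 0:
        return list(ops)
    tail = ops[-num_to_fold:]
    # Each folded operation contributes two new operations: its inverse and itself.
    return list(ops) + inverse(tail) + tail


base = ["H", "X", "Z", "S"]
buggy = fold_tail(base, 0.5, halve=False)
fixed = fold_tail(base, 0.5, halve=True)
print(len(base), len(buggy), len(fixed))
```

In this sketch, a growth of 50% on four operations should add two operations. The fixed variant folds one operation and adds exactly two, whereas the buggy variant folds two operations and adds four.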
More generally, the key findings of our study include:
• 39.9% of all studied bugs in quantum computing platforms are quantum-specific, which
motivates future work on dedicated approaches to prevent and find them.
• Bugs are spread across various components of the studied platforms. Components that
represent, compile, and optimize quantum programming abstractions are particularly prone
to quantum-specific bugs.
• While many (92 out of 223) bugs manifest through program crashes, i.e., generic and easy-to-
detect signs of misbehavior, quantum-specific bugs tend to more frequently cause incorrect
outputs, making them harder to detect.
• We present a hierarchy of recurring bug patterns and quantify how many bugs match each
pattern. Besides patterns also known from other kinds of software, we identify ten novel,
quantum-specific bug patterns, such as incorrectly ordered qubits and incorrect scheduling
of low-level quantum operations.
• Many bugs in quantum computing platforms can be fixed by changing only a few lines of
code, making them an interesting target for automated program transformation and program
repair techniques.
The results of this study are useful for both researchers and developers. Researchers working on
bug-related techniques can benefit from insights into this new domain and the kinds of problems
it causes. Some of our results call for quantum-specific approaches to prevent and detect kinds
of bugs not targeted by traditional approaches. Developers of quantum computing platforms can
learn from our bug patterns as recurring mistakes to avoid in the future. They can also use our
results on which components are most bug-prone to guide the allocation of testing and analysis
efforts. As quantum computing platforms have clearly evolved beyond mere research prototypes,
but are not yet as widely used as, e.g., traditional language implementations, our study of bugs and
their characteristics has the chance to positively influence the design of future versions of these
platforms. To allow others to build upon our results, we share the set of studied bugs, including all
annotations produced during the study. For example, we envision the dataset to serve as a basis for
evaluating future work on finding bugs in quantum computing platforms.
Given the young age of practical quantum computing, there currently are only a few empirical
studies on it. Huang and Martonosi [2019a] describe their experience of developing and testing
quantum algorithms. A recent position paper [Campos and Souto 2021] underlines the need for
studying quantum bugs. Both approaches focus on developing quantum algorithms, whereas we
study bugs in quantum computing platforms. Others have collected a dataset of 36 bugs [Zhao et al.
2021b], but only from a single quantum computing platform, and without a deeper analysis of the
properties of these bugs.
In summary, this paper contributes the following:
• The first in-depth study of 223 real-world bugs in 18 quantum computing platforms.
• Insights about the components and symptoms of bugs, as well as the complexity of their
fixes.
• A hierarchy of recurring bug patterns, including ten novel, quantum-specific patterns.
• A publicly shared dataset (see footnote 1) of annotated, real-world bugs to support future work on preventing
and detecting bugs in quantum computing platforms.

2 BACKGROUND
We briefly discuss some basic quantum computing concepts (Section 2.1) and the typical structure
of a quantum computing platform (Section 2.2).

2.1 Quantum Computing Concepts


Qubits. Unlike a classical bit, which can be either in state 0 or 1, a qubit is a superposition of the
two states 0 and 1, written $|q\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$. Upon measurement, one can observe a probabilistic
state given by the coefficients of the superposition, which satisfy $|\alpha_0|^2 + |\alpha_1|^2 = 1$. Thus, a qubit can effectively
encode more information than a classical bit, which is the reason for the speedup behind quantum
computing. Once measured, a qubit collapses to state 0, with probability $|\alpha_0|^2$, or to state 1, with
probability $|\alpha_1|^2$. Multiple runs of a quantum program may return different results, as defined
by these probabilities. When encoding classical information into qubits, the order of qubits is
important. For example, programs may store the most or the least significant bit first.
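
As a minimal sketch of these two points, the following plain-Python example samples measurement outcomes of a single qubit from its amplitudes and decodes a list of measured bits under both ordering conventions. The function names and the example amplitudes are assumptions for illustration and do not correspond to the API of any particular platform.

```python
import random


def measurement_probabilities(alpha0, alpha1):
    """Return (P(0), P(1)) for the state alpha0|0> + alpha1|1>."""
    p0, p1 = abs(alpha0) ** 2, abs(alpha1) ** 2
    if abs(p0 + p1 - 1.0) > 1e-9:
        raise ValueError("amplitudes must be normalized")
    return p0, p1


def sample_measurements(alpha0, alpha1, shots, seed=0):
    """Simulate repeated runs; each run collapses the qubit to 0 or 1."""
    p0, _ = measurement_probabilities(alpha0, alpha1)
    rng = random.Random(seed)
    outcomes = [0 if rng.random() < p0 else 1 for _ in range(shots)]
    return {0: outcomes.count(0), 1: outcomes.count(1)}


def bits_to_int(bits, most_significant_first):
    """Decode a list of measured bits under either ordering convention."""
    ordered = bits if most_significant_first else list(reversed(bits))
    value = 0
    for bit in ordered:
        value = (value << 1) | bit
    return value


# Equal superposition with a complex phase on |1>.
amp0, amp1 = 2 ** -0.5, 1j * 2 ** -0.5
print(measurement_probabilities(amp0, amp1))
print(sample_measurements(amp0, amp1, shots=1000))
# The same measured bits decode to different integers under the two conventions.
print(bits_to_int([1, 1, 0], True), bits_to_int([1, 1, 0], False))
```

The last line shows why qubit order matters: the bit list [1, 1, 0] is read as 6 when the most significant bit comes first and as 3 otherwise.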
1 Dataset available at https://doi.org/10.5281/zenodo.5834281 and https://github.com/MattePalte/Bugs-Quantum-Computing-Platforms.
Note that the universally unique identifiers (UUIDs) of bugs given in the paper refer to this shared dataset.


Quantum programming language: Quantum abstractions; Classical abstractions; Domain-specific abstractions
Compilation: Intermediate representations; Optimizations; Machine code generation
Execution environment: Interface to quantum computer; Simulator; Quantum state evaluation
Auxiliary components: Testing; Visualization and plotting; Infrastructure scripts and glue code

Fig. 2. Main components of quantum computing platforms.

Circuits. Quantum programming languages provide abstractions for storing and manipulating
qubits. To store classical and quantum information, classical registers and quantum registers are
available, respectively. The most widespread model to express quantum computations is the gate
model, where a program is expressed in a quantum circuit that describes elementary operations
performed on qubits in a predefined sequence. These operations are represented with gates, which
are the building blocks of circuits, similar to classical logic gates on conventional digital circuits.
Once defined by a developer, circuits are executed either on a quantum computer or a simulator.
For mapping circuits onto hardware and for scheduling the operations, the count of qubits used
in a circuit is important. As typically not all qubits are measured at the end, there is a distinction
between total qubits and measured qubits.
Noise. A quantum program is not only probabilistic due to the superposition and measurement,
but it is also affected by noise induced during a computation. The underlying reason is the
phenomenon of crosstalk [Murali et al. 2020], i.e., the computation on some qubits physically disturbs
the information stored in some neighboring qubits, together with the physical errors of executing
operations on hardware. Because of noise, statistical tests on random variables are often used to
interpret results.
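
One possible such test, sketched here under stated assumptions, is Pearson's chi-square test on measurement counts. The counts below are hypothetical, the ideal distribution is assumed to be known and to have only non-zero probabilities, and the function name is ours.

```python
def chi_square_statistic(counts, expected_probabilities):
    """Pearson chi-square statistic for observed measurement counts.

    counts maps each outcome to how often it was measured;
    expected_probabilities maps each outcome to its ideal (non-zero) probability.
    """
    shots = sum(counts.values())
    statistic = 0.0
    for outcome, probability in expected_probabilities.items():
        expected = probability * shots
        observed = counts.get(outcome, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts from 1,000 runs of a circuit whose ideal output is uniform.
observed_counts = {"0": 520, "1": 480}
ideal = {"0": 0.5, "1": 0.5}

# Critical value of the chi-square distribution with one degree of freedom
# at the 5% significance level.
CRITICAL_VALUE = 3.841

statistic = chi_square_statistic(observed_counts, ideal)
deviates = statistic > CRITICAL_VALUE
print(statistic, deviates)
```

With these counts the statistic stays below the critical value, so the deviation from the ideal distribution is compatible with sampling noise at this significance level.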

2.2 Quantum Computing Platforms


To express and execute quantum programs, developers build upon quantum computing platforms,
by which we mean the entire software stack that enables quantum computing. Broadly speaking,
such platforms consist of three main parts, which we further decompose into several components,
as illustrated in Figure 2. First, each platform has a quantum programming language that allows
developers to express quantum algorithms. Some “languages” are APIs provided as a library for a
well-known host language, such as Qiskit, which provides quantum programming abstractions
as a Python library. Others are stand-alone languages, such as Q#, Silq [Bichsel et al. 2020], and
Quipper [Green et al. 2013a]. These programming languages provide quantum abstractions, e.g.,
qubits, gates, circuits, and channels, but usually also classical abstractions that are useful to ex-
press quantum algorithms, e.g., matrices, tensors, and directed acyclic graphs. In addition, some
languages come with domain-specific abstractions, e.g., to allow for expressing quantum algorithms
in chemistry, finance, or machine learning.
Second, a quantum computing platform has a compiler to translate programs written in the
quantum programming language into a low-level instruction set. The tasks performed in a quantum
compiler resemble those known from traditional compilers. Focusing on components most relevant
for our study, Figure 2 shows three components of a typical quantum compiler. The intermediate
representation includes code for creating and manipulating in-memory representations of a to-be-
compiled quantum program. Many platforms have more than one intermediate representation, e.g.,
a high-level, AST-like representation and an assembly-level language, such as QASM [Cross et al.
2017]. To improve the efficiency of executing a quantum program, compilers implement various
optimizations, e.g., to reduce the circuit depth by aggregating, removing, or rearranging gates.
Eventually, quantum compilers have a component for machine code generation, where “machine”
refers to one or more execution environments, as described next.
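
To make the idea of a gate-removing optimization concrete, the following sketch cancels directly adjacent pairs of identical self-inverse gates. The circuit representation (a list of gate name and qubit tuples) and the function name are assumptions for illustration; real platforms use richer intermediate representations and more aggressive passes.

```python
# Gates that are their own inverse, so applying one twice in a row is the identity.
SELF_INVERSE = {"H", "X", "Y", "Z", "CNOT"}


def cancel_adjacent_inverses(circuit):
    """Remove pairs of identical self-inverse gates acting on the same qubits.

    circuit is a list of (gate_name, qubits) tuples in execution order.
    Only directly adjacent entries are compared, so the pass is deliberately
    conservative. Using a stack lets pairs that become adjacent after an
    earlier cancellation be removed as well.
    """
    result = []
    for gate in circuit:
        name, _ = gate
        if result and result[-1] == gate and name in SELF_INVERSE:
            result.pop()
        else:
            result.append(gate)
    return result


circuit = [("H", (0,)), ("X", (1,)), ("X", (1,)), ("H", (0,)), ("CNOT", (0, 1))]
optimized = cancel_adjacent_inverses(circuit)
print(optimized)
```

Even such a small pass has to respect qubit identity and gate order, which is the kind of domain knowledge that the quantum-specific bugs studied here tend to involve.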
Third, a quantum computing platform comes with support for one or more execution environ-
ments. Dedicated quantum computers have made impressive progress in recent years. They can be
categorized into devices based on a discrete gate model, a continuous gate model, and adiabatic
quantum computation [Fingerhuth et al. 2018]. Because quantum computers are typically oper-
ated in a computing center, developers access them through some kind of interface to a quantum
computer, similar to traditional cloud computing interfaces. Simulators, which aim at mimicking the
operations performed by a quantum computer on traditional hardware, are an alternative to dedicated
quantum hardware. To faithfully simulate the execution, simulators typically include a model
of quantum noise induced by stray electromagnetic fields or material defects. An important aspect
of executing a quantum program, both on dedicated hardware and in a simulator, is to evaluate
the state of the program, e.g., by measuring the values of qubits after a computation. We call the
component that implements measurements quantum state evaluation, which typically includes
code for calibrating measurements and for mitigating errors.
Beyond the three main parts of quantum computing platforms, there are several auxiliary
components. With a focus on those that are most relevant for our study, Figure 2 shows three
auxiliary components. Testing includes code both for testing the quantum computing platform itself
and to enable developers of quantum algorithms to test their code. To help handle the complexity
of quantum programs and their results, platforms often provide a component for visualization and
plotting. For example, this component may visualize the circuit of a quantum program or plot the
probabilities that describe the possible output states. Finally, like every software project, quantum
computing platforms need infrastructure scripts and glue code, e.g., to install the software and to
connect different parts of the platform.

3 METHODOLOGY
This section describes how we conduct our study by formulating the research questions we address
(Section 3.1), presenting the projects we study (Section 3.2), describing how we identify bugs to
study (Section 3.3), and detailing how we annotate the bugs to answer our research questions
(Section 3.4).

3.1 Research Questions


Our study is driven by five research questions:
• RQ1: How many of the bugs in quantum computing platforms are specific to quantum computing,
as opposed to classical bugs that may also occur in other projects? This question is relevant
to understand to what extent quantum computing platforms can benefit from traditional
bug-related techniques, and whether dedicated approaches for preventing and detecting
quantum bugs are needed.
• RQ2: Where in quantum computing platforms do the bugs occur? Understanding which components
of a platform are most bug-prone will serve as guidance on allocating efforts toward
detecting and preventing bugs. The answer to this question is relevant both for practitioners,
e.g., to decide where to spend testing efforts, and for researchers, e.g., when developing novel
type systems for avoiding bugs or novel bug detection techniques.
• RQ3: How do the bugs manifest? This question helps understand to what degree bugs may be
found through generic signs of misbehavior, e.g., a program crash, or require application-
specific oracles, e.g., because a bug causes an incorrect measurement of quantum states. In
addition, knowing the consequences of bugs will make us aware of the risks incurred by
leaving bugs undetected in a quantum computing platform.
• RQ4: What recurring bug patterns exist? Identifying common programming mistakes serves
as a basis for creating techniques that prevent and detect specific kinds of bugs. Furthermore,
a collection of recurring bug patterns will help educate practitioners by highlighting mistakes
to avoid.
• RQ5: How complex are the bug fixes? Studying the complexity of patches is relevant for work
toward automating the process of fixing bugs in quantum computing platforms, e.g., via
synthesizing program transformations [Gao et al. 2020; Miltner et al. 2019] or automated
program repair [Bader et al. 2019; Berabi et al. 2021; Le Goues et al. 2019; Li et al. 2020a].

3.2 Selecting Projects to Study


The goal of this paper is to study bugs in quantum computing platforms, where a complete
platform covers all components described in Section 2.2. In practice, different software projects
cover different parts of a complete platform. Some projects are more focused on simulation, such
as Qulacs [Suzuki et al. 2021], others on interacting with quantum computing devices, such as
Amazon Braket [Gonzalez 2021]. Others again are specialized on domain-specific abstractions,
such as Pennylane [Bergholm et al. 2020] and Tequila [Kottmann et al. 2021], or on advanced error
mitigation techniques, such as Mitiq [LaRose et al. 2021]. Qiskit [Qis 2021] and Cirq [Developers
2021], two large-scale projects backed by IBM and Google, respectively, are perhaps closest to a
complete platform, but studying only them would ignore more specialized projects.
To adequately cover all aspects of quantum computing platforms, we hence study not only a
single or a few projects, as done in prior work [Huang and Martonosi 2019a; Zhao et al. 2021b],
but a total of 18 projects. We focus on open-source projects on GitHub, enabling us to apply the
same process to each project. While this focus may potentially bias our results, we are not aware of
any closed-source platform that is as mature as, e.g., Qiskit and Cirq. To select projects, we draw
inspiration from previous work [Fingerhuth et al. 2018] and extend their list with all projects on a list
of “quantum full stack libraries” curated by the Quantum Open Source Foundation (footnote 2). The resulting
selection covers the most popular Python libraries according to the PyPI download statistics [Finke
2021], and aims at covering different components of a quantum computing platform. Of course, the
set of studied platforms is not comprehensive, and other platforms, such as Tket (footnote 3), could be added in
the future.
Table 1 shows the 18 projects we study. For each project, the table shows the components of a
hypothetical, complete quantum platform that the project covers. We consider a project to cover a
particular component if the project repository has at least one source code file that implements a
functionality of this component, which we determine by manually inspecting the code base and
reading the source code. Code that simply calls into another project to use its implementation of a
component is not considered as covering the component. Overall, Table 1 shows that our selection
of projects covers all main components of quantum computing platforms, and that studying a

2 https://qosf.org/project_list/
3 https://github.com/CQCL/tket

Proc. ACM Program. Lang., Vol. 6, No. OOPSLA1, Article 86. Publication date: April 2022.
Bugs in Quantum Computing Platforms: An Empirical Study 86:7

Table 1. Repositories considered in the study with the total commits, the commits satisfying our keyword
heuristic based on łfixž, and the sampled commits divided in real bugs and false positives. We also report the
components implemented by each and the programming language: QA (quantum abstraction), CA (classical
abstraction), DA (domain-specific abstraction), IR (intermediate Representation), OPT (optimization), MCG
(machine code generation), INTER (interface to quantum computer), SIM (simulation), QSE (quantum state
evaluation), TEST (testing), VIZ (visualization and plotting).

Commits Components
Sampled

INTER

TEST
MCG
OPT

QSE
SIM

VIZ
QA

DA
CA

IR
Repository Total Fix Bug FP Languages
PennyLane 2,089 132 18 2 ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
ProjectQ 238 26 15 5 ✓ ✓ ✓✓ ✓ ✓ ✓ ✓ ✓ Py
OpenQL 2,503 44 12 8 ✓ ✓ ✓ ✓✓ ✓ ✓ ✓ C++, Py
Qiskit Aer 1,197 187 16 4 ✓ ✓ ✓✓ ✓ ✓ ✓ C++, Py
Qiskit Ignis 572 33 12 8 ✓ ✓ ✓ ✓ ✓ Py
Qiskit Terra 5,838 748 14 6 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
Tequila 1,140 27 14 6 ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
Braket 396 53 11 9 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
Dwawe-System 1,121 37 9 11 ✓ ✓ ✓ ✓ ✓ ✓ Py
XACC 2,370 58 18 2 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ C++, Py
QDK Libraries 433 44 3 17 ✓ ✓ ✓ ✓ ✓ Q#
QDK Q# Compiler 514 74 14 6 ✓ ✓ ✓ ✓ ✓ ✓ C#, F#, Q#
QDK Q# Runtime 438 72 14 6 ✓ ✓ ✓ ✓ ✓ C#, C++, Q#
Cirq 2,450 341 12 8 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
Qualcs 692 69 13 7 ✓ ✓ ✓ ✓ ✓ C++, C
Pyquil 1,101 109 9 11 ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
mitiq 665 37 6 14 ✓ ✓ ✓ ✓ Py
StrawberryFields 1,100 49 13 7 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ Py
Total 24,857 2,140 223 137

single project would give only a partial view on particular components or on a single programming
language.

3.3 Identifying Bugs to Study


Based on the selected projects to study, we gather a set of bugs in these projects. Following the
usual definition of “bug”, we consider a bug as a problem in the source code that causes the behavior
of the program to diverge from the expected behavior. To identify bugs in the given projects, we
first automatically gather a set of commits that are likely to fix a bug, and then manually validate
them to keep only actual bugs.
 for field in fields:
-    if not hasattr(field, self._options):
+    if not hasattr(self._options, field):
         raise AttributeError(
             "Options field %s is not valid for this "
             "backend" % field)

Fig. 3. Example of a classical bug where a Python API is misused. (UUID: 733, from Qiskit Terra)

 def is_identity(term):
-    return len(term) == 0
+    if isinstance(term, PauliTerm):
+        return (len(term) == 0) and (not np.isclose(term.coefficient, 0))
+    elif isinstance(term, PauliSum):
+        return (len(term.terms) == 1) and (len(term.terms[0]) == 0) and \
+            (not np.isclose(term.terms[0].coefficient, 0))
+    else:
+        raise TypeError("is_identity only checks PauliTerms and PauliSum objects!")

Fig. 4. Example of a quantum bug. (UUID: 1988, from Pyquil)

3.3.1 Automated Selection of Bug Candidates. We scan the version history of each project for
commit messages that are likely to refer to bug fixes. Similar to previous studies of bugs [Karampatsis
and Sutton 2020; Ray et al. 2016], we search for the keyword “fix” within the first line of the commit
message. To reduce the number of coincidental matches, we further filter commits by checking that
the message refers to an issue or a pull request by containing a “#” followed by a number. Ensuring
that the studied bugs all refer to an issue or a pull request also helps in understanding the bugs
based on the description of the issue or discussions among developers. Finally, we ignore commits
with a message that contains any of the keywords “refactor”, “typo”, “requirement”, “import”, and
“style”, as such commits often are not about bugs but other code improvements. Table 1 shows for
each project the total number of commits and how many of them match our commit message-based
filtering. In total, there are 2,140 bug candidates.
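
The commit-message heuristic of Section 3.3.1 can be sketched as follows. This is not the script used in the study: the function name, the case-insensitive substring matching, and the example messages are assumptions made for illustration.

```python
import re

# Keywords that exclude a commit from the set of bug candidates.
EXCLUDED_KEYWORDS = ("refactor", "typo", "requirement", "import", "style")
# A "#" followed by a number, i.e., a reference to an issue or pull request.
ISSUE_REFERENCE = re.compile(r"#\d+")


def is_bug_candidate(message):
    """Apply the three commit-message checks to one commit message."""
    lines = message.splitlines()
    first_line = lines[0].lower() if lines else ""
    if "fix" not in first_line:
        return False
    if not ISSUE_REFERENCE.search(message):
        return False
    lowered = message.lower()
    return not any(keyword in lowered for keyword in EXCLUDED_KEYWORDS)


# Hypothetical commit messages.
messages = [
    "Fix qubit ordering in transpiler (#1234)",
    "Fix typo in docs (#12)",
    "Fix crash in simulator",
    "Add new gate\n\nfixes #3",
]
candidates = [m for m in messages if is_bug_candidate(m)]
print(candidates)
```

Under these assumptions only the first message is kept. Note that plain substring matching is coarse: a first line such as "Fix important bug (#5)" would be dropped because it contains "import", which is one reason why a manual validation step is still needed.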
3.3.2 Manual Validation. Given the automatically gathered bug candidates, we manually inspect
a random sample of them to keep only commits that actually fix a bug. To this end, we sample
commits from the 18 projects. By carefully inspecting each sampled commit, we label it either as
a bug or as a false positive, until having labeled 20 instances per project. The bugs are the basis
for the remainder of this paper. By false positives we mean commits that modify the code in some
other way, e.g., by fixing a comment or documentation. If a bug-fixing commit involves other,
bug-unrelated code changes, we keep it for further study and focus on the bug-related part of the
commit. For bug-fixing commits that address multiple bugs at once, we count and study each bug
individually. Table 1 shows the number of inspected and selected commits. Overall, we identify and
study 223 bugs.

3.4 Understanding and Annotating Properties of Bugs


Answering our research questions requires a solid understanding of the studied bugs. The following
describes how we inspect the bugs to annotate them with various properties.
3.4.1 Classical vs. Quantum-Specific. To address RQ1, we classify each bug as either classical or
quantum-specific. We consider a bug to be quantum-specific if the mistake is in handling quantum-
specific concepts, which typically implies that understanding and fixing the bug requires knowledge
of the quantum programming domain. This definition is analogous to bugs in a traditional compiler
that require knowledge of the programming language’s semantics to be fixed, such as when a
program is miscompiled or a type soundness promise is broken. Inversely, we consider all other
bugs to be classical, which includes mistakes that could occur outside of quantum computing
platforms and that do not require quantum-specific knowledge to be fixed.
Figure 3 shows a classical bug, where the developer accidentally misuses the built-in Python
function hasattr. Other examples of classical bugs are missing library imports and bugs caused by
confusing basic data types, such as Python’s tuple and list. In contrast, Figure 4 is an example
of a quantum bug. The bug is in the function is_identity(), which checks if term implements
an identity function on the quantum state. The incorrect code fails to distinguish between two
quantum-specific concepts, PauliTerm and PauliSum. To fix the bug, the developer needs to
understand these concepts and how they influence the identity check. Other examples of quantum
bugs are mistakes in accurately representing quantum abstractions within a compiler, or numerical
computations that fail to correctly reflect the underlying quantum phenomena.
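
The effect of the swapped hasattr arguments in Figure 3 can be reproduced in isolation. The Options class and the two validation functions below are hypothetical stand-ins, not Qiskit code; the sketch assumes that the options object is not a string.

```python
class Options:
    """Hypothetical stand-in for a backend's options object."""

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)


def validate_buggy(options, fields):
    for field in fields:
        # Arguments swapped: hasattr expects (object, attribute_name).
        if not hasattr(field, options):
            raise AttributeError("Options field %s is not valid" % field)


def validate_fixed(options, fields):
    for field in fields:
        if not hasattr(options, field):
            raise AttributeError("Options field %s is not valid" % field)


options = Options(shots=1024)
validate_fixed(options, ["shots"])  # a valid field passes silently

buggy_error = None
try:
    validate_buggy(options, ["shots"])
except TypeError as error:
    # hasattr rejects a non-string attribute name.
    buggy_error = type(error).__name__
print("buggy version raised:", buggy_error)
```

Because the buggy call fails with a generic TypeError for every field, no quantum-specific knowledge is needed to notice or repair it, which is what makes the bug classical.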

3.4.2 Identifying Components, Symptoms, and Recurring Bug Patterns. Addressing RQ2, RQ3, and
RQ4, we annotate all bugs with respect to the component a bug is in, the symptom through which
a bug manifests, and any recurring patterns a bug belongs to. To create these annotations, we
repeatedly inspect all bugs while refining the annotations and discussing unclear annotations
among the authors. While inspecting a bug, we consider not only the actual code change, but also
its commit message, any associated issues, and any discussion among the developers, e.g., as part
of a pull request. We may annotate a bug with multiple components and multiple bug patterns. For
example, if the affected code is at the interface between two components, then we annotate the
bug with both components. In contrast, each bug has a single symptom. If we cannot associate a
specific kind of annotation to a bug based on all information available to us, then we leave this bug
unannotated. In total, the annotation process results in 886 annotations added to the 223 studied
bugs.

3.4.3 Complexity of Bug Fixes. To address RQ5, we count the number of lines of code (LoC) that
are changed to fix a bug. If the same fix is applied at multiple locations within a single commit
and if each fix is independent of the others, then we count the number of lines required to fix a
single location. The rationale is that we want to estimate how difficult fixing a bug would be for an
automated tool, which could be applied at multiple locations. For commits that modify lines not
directly related to fixing the bug, we count only those lines relevant for the fix. When a commit
addresses multiple bugs, then we count the affected lines for each bug separately. In particular, we
exclude any modifications of comment lines and, unless the bug is in a test, changes to test files.
The number of “changed lines” is the sum of added lines and removed lines, where a line that gets
edited in a minor way, e.g., to replace one token with another, is counted only once.
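
The counting rule can be approximated mechanically. The sketch below is not the procedure used in the study, which relies on manual inspection; it assumes a unified diff of a single file, treats every run of consecutive removed and added lines as one block, and counts max(removed, added) per block so that an edited line is counted once. The function name and the comment prefix parameter are hypothetical.

```python
def count_changed_lines(diff_text: str, comment_prefix: str = "#") -> int:
    """Approximate the number of changed lines in a unified diff.

    Assumption: inside a block of consecutive '-' and '+' lines, a removed
    line paired with an added line is one edited line, so each block
    contributes max(removed, added). Blank and comment-only lines are ignored.
    """
    total = 0
    removed = added = 0
    for line in diff_text.splitlines() + [""]:  # empty sentinel flushes the last block
        is_file_header = line.startswith(("--- ", "+++ "))
        marker = line[:1]
        if not is_file_header and marker in ("-", "+"):
            body = line[1:].strip()
            if not body or body.startswith(comment_prefix):
                continue  # skip blank and comment-only changes
            if marker == "-":
                removed += 1
            else:
                added += 1
        else:
            # Context lines and headers end the current block.
            total += max(removed, added)
            removed = added = 0
    return total


# One token replaced on a single line: counted once, not twice.
WRONG_IDENTIFIER_FIX = """\
--- a/example.py
+++ b/example.py
@@ -3,2 +3,2 @@
 if device is None:
-    token = getpass.getpass(prompt='IBM device > ')
+    device = getpass.getpass(prompt='IBM device > ')
"""
print(count_changed_lines(WRONG_IDENTIFIER_FIX))  # 1
```

The heuristic cannot tell a minor edit from an unrelated removal and addition that happen to be adjacent, which is one reason a manual check remains necessary.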

4 RESULTS
This section presents detailed answers to our five research questions. The IDs mentioned in example
bugs refer to the supplementary material, which includes the full dataset of bugs with references to
their commits and all annotations created during the study. The supplementary material will be
made publicly available once the paper gets accepted.

4.1 RQ1: Classical vs. Quantum-Specific Bugs


This research question is about the ratio of classical and quantum-specific bugs in the quantum
computing platforms, which helps understand whether these platforms require domain-specific
approaches for handling bugs. Based on the classification described in Section 3.4, we find that 134
out of the 223 bugs are classical, whereas 89 are quantum-specific bugs. That is, even though a
majority of all bugs are still classical, there also is a large percentage of bugs where detecting and
fixing the bug requires domain knowledge about quantum computing.

Answer to RQ1: 39.9% of all studied bugs in quantum computing platforms are quantum-
specific, which motivates dedicated approaches for preventing and detecting quantum bugs.
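
The percentage in the answer follows directly from the two counts reported above; the short check below only reproduces that arithmetic.

```python
total_bugs = 223
classical_bugs = 134
quantum_bugs = 89

# The two classes partition the dataset.
assert classical_bugs + quantum_bugs == total_bugs

quantum_share = quantum_bugs / total_bugs
classical_share = classical_bugs / total_bugs
print(f"quantum-specific: {quantum_share:.1%}")  # quantum-specific: 39.9%
print(f"classical: {classical_share:.1%}")       # classical: 60.1%
```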

Proc. ACM Program. Lang., Vol. 6, No. OOPSLA1, Article 86. Publication date: April 2022.
86:10 Matteo Paltenghi and Michael Pradel

[Stacked bar chart. x-axis: Component; y-axis: Count (0 to 30); legend (Type of Bug): Classical, Quantum.]

Fig. 5. Number of quantum and classical bugs per component.

Implications. The comforting consequence of the above finding is that traditional, application-
independent bug detection techniques can contribute to improving the code of quantum computing
platforms. At the same time, we see a large potential for techniques targeting quantum-specific
bugs. The results also show a need for software developers with an in-depth understanding of
quantum computing. Regarding the future relevance of this finding, the substantial portion of
quantum-specific bugs (39.9%) motivates, and gives additional evidence for, continued research on
quantum software engineering [Zhao 2021].

4.2 RQ2: Where in Quantum Computing Platforms Do the Bugs Occur?


Understanding where in quantum programming platforms bugs occur will help allocate efforts for
preventing and detecting bugs to components where bugs are most likely to appear. Figure 5 shows
how the studied bugs distribute across the components of a quantum computing platform. The
components are sorted in decreasing order of quantum-specific bugs per component.
We find five components where at least half of all bugs are quantum-specific: quantum state
evaluation, machine code generation, domain abstractions, optimizations, and visualization and
plotting. Among those, the most striking is the optimizations component, which has almost exclu-
sively quantum-specific bugs. Some other components, such as quantum abstractions and simulator,
show a high number of both quantum-specific and classical bugs. For quantum abstractions, this
observation can be explained by the pervasiveness of code that represents quantum primitives,
such as qubits and gates, which causes a high number of mistakes. The code of simulators usually
involves the encoding of quantum operations, typically represented with matrices, and classical
code to handle those large matrices, both of which can be a source of bugs. In contrast to the above,
there also are components with many classical but few or even no quantum-specific bugs, such as
scripts and glue code, as well as testing.


Bug symptoms
  Functional
    Application-specific
      Failing test (16)
      Incorrect output (77)
        Incorrect visualization (2)
        Incorrect final measurement (15)
    Generic
      Compilation error (2)
      Crash (92)
        OS/PL level (67)
        Application level (25)
      Non-termination (1)
  Non-functional
    Inefficiency (8)
    Other (6)

Fig. 6. Common symptoms of bugs in the quantum platforms.

Answer to RQ2: Bugs occur across a wide range of components in the studied platforms.
Quantum-specific bugs are particularly prevalent in components that represent, compile, and
optimize quantum programming abstractions, whereas infrastructural scripts, glue code, and
testing code are mostly plagued by traditional bugs.

Implications. Components with a high number of quantum-specific bugs are most likely to benefit
from quantum-specific techniques. For example, there is a huge potential for techniques to detect
bugs in optimization code, analogous to related work on analyzing traditional compilers [Barany
2018; Vafeiadis et al. 2015]. Recent work on testing and verifying compilation and optimization
passes of quantum computing platforms [Hietala et al. 2021; Shi et al. 2020; Wang et al. 2021b] tries
to exploit this potential, and our results provide an empirical justification of its assumptions.
The surprisingly large number of bugs in infrastructural scripts and glue code shows a need for
better language support to prevent bugs in this component, more effective integration testing, and
strong classical software engineering skills, even among quantum developers, not all of whom are
computer scientists by training. The low number of quantum-specific bugs in the interface to the
hardware is a peculiarity of bugs in quantum computing, which reflects the current maturity of
hardware research. Future empirical studies should therefore monitor this number as quantum
computing platforms evolve.

4.3 RQ3: How Do the Bugs Manifest?


Understanding how bugs in quantum computing platforms manifest is a prerequisite for determining
how to effectively detect them. Figure 6 shows a hierarchy of symptoms that describe how a bug
comes to the attention of users or developers. The hierarchy summarizes similar symptoms into
more general classes. Our annotations are for those symptoms shown with a number in the figure,
where the number indicates how many of all studied 223 bugs manifest through this symptom.
We broadly distinguish between functional and non-functional symptoms. The functional symp-
toms are further classified into application-specific and generic symptoms. One of the most prevalent
functional symptoms is that the program yields an incorrect output, which is the case for 77 of all
studied bugs. In particular, this category includes bugs that cause a quantum program to produce an
incorrect final measurement and bugs that lead to an incorrect plot or diagram. Detecting incorrect
output bugs is inherently difficult, as determining what output is expected typically is highly


import tequila as tq

simulator = 'qiskit'

qc = tq.gates.X(0)
qc += tq.gates.Z(1)
result = \
    tq.simulate(qc, backend=simulator, samples=1, read_out_qubits=[0, 1])
print(result)  # Output: +1.0000|10>
result = \
    tq.simulate(qc, backend=simulator, samples=1, read_out_qubits=[1, 0])
print(result)  # Output: +1.0000|01>

Fig. 7. Minimal quantum program, copied from the corresponding issue, that exposes an incorrect measure-
ment bug. (UUID: 1738, from Tequila)

domain-specific. Figure 7 shows an example of a bug that manifests as an incorrect measurement:


The two print statements are supposed to print the same output.
In contrast, only 16 of the application-specific bugs are detected via failing tests. This result
suggests that quantum computing platforms could benefit from more rigorous testing, possibly
supported by automatically generated test suites. However, the probabilistic nature of quantum computing
results and the fact that quantum states collapse to classical values when being observed pose a
major challenge [Huang and Martonosi 2019b].
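
Because measurement results are samples from a distribution, a test oracle cannot simply compare two runs for equality. One common workaround, sketched here under our own assumptions, is to compare empirical distributions with a tolerance. The toy sampler stands in for a real backend, and the function names and the 0.1 threshold are illustrative choices rather than values from the study.

```python
import random
from collections import Counter
from typing import Dict


def total_variation_distance(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Half the L1 distance between two empirical outcome distributions."""
    n_a, n_b = sum(a.values()), sum(b.values())
    outcomes = set(a) | set(b)
    return 0.5 * sum(abs(a.get(o, 0) / n_a - b.get(o, 0) / n_b) for o in outcomes)


def sample_bell_state(shots: int, seed: int) -> Counter:
    """Toy stand-in for a backend: an ideal Bell state yields '00' or '11'
    with probability 1/2 each."""
    rng = random.Random(seed)
    return Counter(rng.choice(["00", "11"]) for _ in range(shots))


counts_a = sample_bell_state(shots=4000, seed=1)
counts_b = sample_bell_state(shots=4000, seed=2)
distance = total_variation_distance(counts_a, counts_b)

# Exact equality of the two counters is not expected; a small distance is.
THRESHOLD = 0.1  # illustrative tolerance, to be tuned to the number of shots
print("distributions agree:", distance < THRESHOLD)
```

The tolerance trades false alarms against missed bugs: with more shots the sampling noise shrinks and a tighter threshold becomes possible.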
Generic, functional symptoms include compilation errors, program crashes, and non-termination.
By far the most common of these symptoms is a program crash, which we observe in 92
bugs. We further classify crashes into those induced by the operating system or the programming
language, and those induced by application-specific exceptions. The former include, e.g., runtime
memory errors and runtime type errors, whereas the latter are the result of defensive programming,
e.g., when code in Qiskit raises a CircuitNotValid exception if a circuit is in an unexpected
state. The large majority of program crashes (67 vs. 25) is induced by the operating system or the
programming language, which shows the effectiveness of generic checks, but also suggests that
additional checking in the platforms could detect additional bugs.
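
The difference between the two kinds of crashes can be illustrated with a toy circuit class. Everything in this sketch is hypothetical, including the class names and the exception type; it does not reproduce code from any of the studied platforms. The defensive check turns what would otherwise surface later as a generic language-level error into an application-level exception with a domain-specific message.

```python
class InvalidCircuitError(Exception):
    """Hypothetical application-level exception for malformed circuits."""


class ToyCircuit:
    """Minimal circuit container used only for illustration."""

    def __init__(self, num_qubits: int):
        self.num_qubits = num_qubits
        self.gates = []  # list of (gate name, qubit index) pairs

    def add_gate(self, name: str, qubit: int) -> None:
        # Defensive check: reject invalid indices at the point of construction
        # instead of failing later, e.g., with an IndexError in a simulator.
        if not 0 <= qubit < self.num_qubits:
            raise InvalidCircuitError(
                f"gate {name!r} targets qubit {qubit}, "
                f"but the circuit has {self.num_qubits} qubit(s)"
            )
        self.gates.append((name, qubit))


circuit = ToyCircuit(num_qubits=2)
circuit.add_gate("x", 0)
try:
    circuit.add_gate("z", 2)
except InvalidCircuitError as err:
    print("rejected:", err)
```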
Non-functional bugs are relatively infrequent in our study, with only 14 out of all 223 bugs. We
further classify them into inefficiencies, i.e., the program is slower than it should be, and other
symptoms, e.g., producing a correct but suboptimal result. Since our study is based on reported
and fixed bugs, one interpretation of the low number of such bugs is that detecting non-functional
problems in quantum computing platforms is difficult with existing techniques.
To put these results in context, we compare against a study of bugs in deep learning compil-
ers [Shen et al. 2021]. Those bugs also most commonly manifest through crashes and incorrect
outputs, but bugs in quantum computing platforms cause fewer crashes (41.2% vs. 59.37%) and
more incorrect outputs (34.5% vs. 26.26%) than bugs in deep learning compilers.
To better understand how the different symptoms distribute over classical and quantum-specific
bugs, Figure 8 reports the number of classical and quantum bugs per symptom. The most striking
observation is that most classical bugs, but relatively few quantum bugs, manifest via crashes. In
contrast, quantum-specific bugs often manifest via symptoms that are harder to identify, such as
incorrect outputs and incorrect final measurements.


[Stacked bar chart. x-axis: Symptom; y-axis: Count (0 to 70); legend (Type of Bug): Classical, Quantum.]

Fig. 8. Number of quantum and classical bugs per symptom.

Answer to RQ3: Most bugs manifest through functional rather than non-functional symptoms
(188 vs. 14 bugs). The two most common symptoms are crashes (92 bugs) and incorrect outputs
(77 bugs), whereas tests are not yet the main way to find bugs in quantum computing platforms
(16 bugs). Classical bugs in these platforms often manifest via an exception raised by the
operating system or the programming language. In contrast, quantum-specific bugs often
create incorrect output, making them more difficult to detect.

Implications. The large number of bugs that manifest through application-specific symptoms
calls for domain-specific analysis techniques for quantum computing platforms, which could have
an impact similar to the success of compiler testing over the past decade [Chen et al. 2020]. In
particular, our results show a need for better support for testing quantum computing platforms.
One way to address this need is to develop approaches for testing quantum programs, which require
specific support for the peculiar characteristics of quantum computing [Huang and Martonosi 2019b]. Another promising
direction is testing the platforms themselves, where we see potential for future work on differential
testing and metamorphic testing, which researchers have only started to explore [Wang et al. 2021b].
Finally, developers of quantum computing platforms could benefit from support for specifying
their expectations about platform-internal states, e.g., in the form of invariants that intermediate
representations should preserve. Regarding the future relevance of this finding, shifts in the
distribution of bug symptoms can be an important metric for monitoring the maturity of a platform
over time, and this study provides the first data point in that sequence. For example, crashes that now are
caught at the operating system level or at the language level might be converted into more specific
application-level exceptions.
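
As one possible shape for such invariants, the sketch below wraps a transformation pass in a check that the pass does not introduce qubits absent from its input. The circuit encoding, the decorator name, and the toy optimization are our assumptions and are not drawn from any studied platform.

```python
import functools
from typing import Callable, List, Set, Tuple

Gate = Tuple[str, Tuple[int, ...]]  # (gate name, qubit indices)
Circuit = List[Gate]


def used_qubits(circuit: Circuit) -> Set[int]:
    """All qubit indices referenced by the circuit."""
    return {q for _, qubits in circuit for q in qubits}


def preserves_qubits(pass_fn: Callable[[Circuit], Circuit]) -> Callable[[Circuit], Circuit]:
    """Check the invariant 'a pass never introduces new qubits' on every call."""
    @functools.wraps(pass_fn)
    def wrapped(circuit: Circuit) -> Circuit:
        before = used_qubits(circuit)
        result = pass_fn(circuit)
        extra = used_qubits(result) - before
        if extra:
            raise AssertionError(f"{pass_fn.__name__} introduced qubits {sorted(extra)}")
        return result
    return wrapped


@preserves_qubits
def cancel_adjacent_x(circuit: Circuit) -> Circuit:
    """Toy optimization: remove pairs of adjacent X gates on the same qubit."""
    out: Circuit = []
    for gate in circuit:
        if out and gate[0] == "x" and out[-1] == gate:
            out.pop()  # X followed by X is the identity
        else:
            out.append(gate)
    return out


print(cancel_adjacent_x([("x", (0,)), ("x", (0,)), ("h", (1,))]))  # [('h', (1,))]
```

A violated invariant then surfaces as an application-level failure at the pass that caused it, rather than as an incorrect output much later.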

4.4 RQ4: Bug Patterns


The perhaps most interesting outcome of this study is a hierarchy of recurring bug patterns that
we identify in the 223 bugs. The bug patterns include kinds of bugs known from other domains, but
also novel patterns that are specific to quantum computing platforms. Figure 9 shows the hierarchy
of bug patterns, along with the frequency of each pattern in our dataset. Patterns that are specific


Bug pattern
  API-related
    API misuse (13)
    Outdated API client (13)
  Incorrect application logic
    Intermediate representation
      Missing information (8)
      Wrong information (21)
    Overlooked corner case (40)
    Refer to wrong program element
      Wrong concept (9)
      Wrong identifier (11)
    Qubit-related
      MSB-LSB convention mismatch (3)
      Incorrect qubit order (5)
      Incorrect qubit count (3)
    Incorrect scheduling (5)
  Math-related
    Incorrect numerical computation (11)
    Incorrect randomness handling (5)
  Others
    Misconfiguration (32)
    Type problem (18)
    Typo (8)
    Incorrect string (6)
    GPU-related (4)
    Flaky test (4)
    Memory leak (3)

Fig. 9. Recurring bug patterns. Quantum-specific patterns are printed in italics.

to the domain of quantum computing platforms are printed in italics. We identify three broad
categories of bug patterns, API-related bugs, incorrect application logic, and math-related bugs, as
well as a set of seven patterns summarized under other. The following discusses each of these broad
categories in detail and illustrates the patterns with examples.
4.4.1 API-Related Bugs. A class of bugs commonly observed in other kinds of software [Selakovic
and Pradel 2016; Zhong et al. 2020] are API-related bugs. We also encounter such bugs in quantum
computing platforms, and find two recurring subpatterns. On the one hand are API misuses, such
as passing the wrong parameters, passing the right parameters in the wrong order [Rice et al.


  case Operations::OpType::diagonal_matrix:
-     BaseState::qreg_.apply_diagonal_matrix(op.qubits, op.params);
+     BaseState::qreg_.apply_diagonal_unitary_matrix(op.qubits, op.params);

Fig. 10. Example of API misuse on a function not present in the current namespace. (UUID: 1019, from Qiskit
Aer)

Table 2. API-related bug patterns and what kinds of API they concern.

Bug pattern              Kind of API
                         Internal    External
API misuse                     10           3
Outdated API client             5           8

  def expand_tape(tape, depth=1, stop_at=None, expand_measurements=False):
      """Expand all objects in a tape to a specific depth."""
      ...
      new_tape = tape.__class__()
+     new_tape.__bare__ = getattr(tape, "__bare__", tape.__class__)

Fig. 11. Example of bug where a part of the intermediate representation is lost because the code forgot to
update the attributes of a tape. (UUID: 136, from Pennylane)

2017], or simply calling the wrong API, which account for 13 of the studied bugs. Figure 10 shows
an example of an API misuse bug, where the code was accidentally calling an API function not
present in the current namespace. On the other hand are bugs due to API client code that has not
yet been adapted to an API change, which we call outdated API client, comprising a total of 13 bugs.
To further guide efforts toward finding API-related bugs, we study what kinds of APIs are incor-
rectly used. Specifically, we classify all API-related bugs depending on whether the incorrect code
uses a project-internal API or an API in a third-party library, referred to as “external”. Table 2 reports
the result of this classification. We find that external APIs often are the source of outdated API client
bugs, whereas the internal APIs are more commonly misused. These results confirm the previous
observation that programming against evolving third-party APIs may cause mistakes [McDonnell
et al. 2013].
4.4.2 Incorrect Application Logic. The next family of bug patterns is about mistakes in implementing
the application logic. Naturally, many of these bug patterns are specific to the domain of quantum
computing platforms. We identify five prevalent bug patterns, which the following discusses in
detail.
Intermediate Representation. A common class of bugs is about corrupting an intermediate rep-
resentation of a quantum program, which may occur while creating the representation or while
manipulating it. We identify two subpatterns that are about missing information and wrong in-
formation in the intermediate representation, which account for eight and 21 bugs, respectively.
Figure 11 gives an example of a wrong information bug, where the code fails to add some attributes
while expanding the intermediate representation of a tape, which is a data structure to represent
quantum circuits and measurement statistics.


  def depth(self):
      """Return circuit depth (i.e., length of critical path)."""
      ...
+     # If no registers return 0
+     if reg_offset == 0:
+         return 0

Fig. 12. Example of bug where a corner case has not been considered. (UUID: 505, from Qiskit Terra)

  void visit(ConditionalFunction& c) {
      auto visitor = std::make_shared<QuilVisitor>();
-     quilStr += "JUMP-UNLESS @" + c.getName() + " [" + std::to_string(c.getConditionalQubit()) + "]\n";
+     auto classicalBitIdx = qubitToClassicalBitIndex[c.getConditionalQubit()];
+     quilStr += "JUMP-UNLESS @" + c.getName() + " [" + std::to_string(classicalBitIdx) + "]\n";
      for (auto inst : c.getInstructions()) {
          inst->accept(visitor);
      }

Fig. 13. Example of a bug that confuses the index of a qubit with the index of its corresponding classical
bit. (UUID: 1869, from XACC)

  if token is None:
      token = getpass.getpass(prompt='IBM Q token > ')
  if device is None:
-     token = getpass.getpass(prompt='IBM device > ')
+     device = getpass.getpass(prompt='IBM device > ')

Fig. 14. Example of wrong identifier bug. (UUID: 1122, from ProjectQ)

Overlooked Corner Case. Overlooked corner cases are a common kind of mistake, in quantum computing
platforms as in other software. The fix of such bugs typically extends the code to handle
a rare input that the developer forgot to consider. With a total of 40 bugs, overlooked corner
cases are the most prevalent of all bug patterns. Figure 12 shows an example bug, where the code to
compute the depth of a circuit did not consider the corner case of not having any registers.
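
To make the corner case concrete, the following self-contained sketch computes the depth of a toy gate list. It is not the Qiskit Terra implementation; the gate encoding and the function name are assumptions. The empty circuit is handled through the default argument of max, which is the kind of input that is easy to overlook.

```python
from typing import Dict, Iterable, Sequence, Tuple


def circuit_depth(gates: Iterable[Tuple[str, Sequence[int]]]) -> int:
    """Length of the critical path of a gate list.

    Each gate is a (name, qubit indices) pair and occupies the first layer
    after the latest layer already used by any of its qubits.
    """
    layer: Dict[int, int] = {}  # qubit index -> last layer used on that qubit
    for _, qubits in gates:
        level = 1 + max((layer.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            layer[q] = level
    # Corner case: without any gates there are no layers, so the depth is 0.
    return max(layer.values(), default=0)


print(circuit_depth([]))                                          # 0
print(circuit_depth([("h", [0]), ("cx", [0, 1]), ("x", [1])]))    # 3
print(circuit_depth([("h", [0]), ("h", [1])]))                    # 1
```

Without the default argument, the final max would raise a ValueError on an empty circuit, turning the overlooked corner case into a crash.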

Refer to Wrong Program Element. This bug pattern occurs when a developer confuses two program
elements, and hence, the code refers to one program element instead of another. We distinguish two
subpatterns. The wrong concept pattern means that the developer confuses two related concepts in
the application domain of quantum computing platforms. For example, Figure 13 shows a bug where
the code confuses the index of a qubit with the index of its corresponding classical bit. The bug is fixed
by looking up the bit via qubitToClassicalBitIndex. Wrong concept bugs are quantum-specific,
and we find a total of nine of them.
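
One way to make this kind of confusion harder, sketched here as our own illustration rather than as a fix adopted by any platform, is to give the two concepts distinct types so that passing one where the other is expected fails immediately. The class names and the function are hypothetical; the emitted string merely mirrors the format visible in Figure 13.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class QubitIndex:
    """Index of a qubit in a quantum register."""
    value: int


@dataclass(frozen=True)
class ClassicalBitIndex:
    """Index of a classical bit that stores a measurement result."""
    value: int


def jump_unless(label: str, bit: ClassicalBitIndex) -> str:
    """Emit a conditional jump that depends on a classical bit."""
    if not isinstance(bit, ClassicalBitIndex):
        raise TypeError(f"expected ClassicalBitIndex, got {type(bit).__name__}")
    return f"JUMP-UNLESS @{label} [{bit.value}]"


# Hypothetical mapping from measured qubits to the bits holding their results.
qubit_to_classical_bit: Dict[QubitIndex, ClassicalBitIndex] = {
    QubitIndex(3): ClassicalBitIndex(0),
}

measured = QubitIndex(3)
print(jump_unless("end", qubit_to_classical_bit[measured]))  # JUMP-UNLESS @end [0]
```

Passing the qubit index directly, as in jump_unless("end", measured), raises a TypeError here, and a static type checker would flag the same call before the code runs.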
The other subpattern is wrong identifier bugs, which means that the code confuses two identifiers,
e.g., because they have a similar name. Figure 14 shows an example. Wrong identifier bugs may be
the result of copied-and-pasted code [Li et al. 2006] and have been addressed in work on finding
VarMisuse bugs [Allamanis et al. 2018; Dinella et al. 2020; Hellendoorn et al. 2020; Kanade et al.
2020; Rabin et al. 2021; Vasic et al. 2018].


  void GateFuser::visit(CNOT& cnot) {
      Eigen::MatrixXcd cnotMat{Eigen::MatrixXcd::Zero(4, 4)};
      cnotMat << 1, 0, 0, 0,
                 0, 1, 0, 0,
                 0, 0, 0, 1,
                 0, 0, 1, 0;
-     m_gates.emplace_back(cnotMat, cnot.bits());
+     m_gates.emplace_back(cnotMat, reverseBitIdx(cnot.bits()));
  }

Fig. 15. Example of a quantum-specific bug caused by confusing most significant bit with least significant bit
representation. (UUID: 1900, from XACC).

  void PrintCCLighQasm(Bundles& bundles, bool verbose = false) {
      ...
-     size_t curr_cycle = 1;
+     size_t curr_cycle = 0;  // first instruction should be with pre-interval 1, 'bs 1'

Fig. 16. Example of an incorrect scheduling bug. (UUID: 1245, from OpenQL)

Qubit-Related. As qubits are the basic unit of quantum information, it may come as no surprise
that we find several qubit-related bug patterns. The first pattern refers to a mistake in representing
multiple qubits in memory, where the code stores the most significant bit (MSB) first, instead of
the least significant bit (LSB), or vice versa. We call this pattern MSB-LSB convention mismatch.
The second bug pattern is about computing an incorrect qubit count, e.g., in a function that counts
the number of qubits required by a circuit. Finally, the third bug pattern is about code that causes
incorrectly ordered qubits, e.g., when a mapping of the qubit indices is not properly maintained.
Figure 15 shows a bug in an optimization pass to merge gates, which suffers from an MSB-LSB
convention mismatch. The developers fix the problem by reversing the qubits before using them.
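
The effect of the two conventions can be seen on basis-state indices. The helper functions below are illustrative assumptions and are unrelated to the reverseBitIdx function in Figure 15: the same assignment of values to qubits maps to different integer indices depending on whether qubit 0 is treated as the most or the least significant bit.

```python
from typing import Sequence


def basis_index(bits: Sequence[int], qubit0_is_msb: bool) -> int:
    """Integer index of a computational basis state.

    bits[i] is the value of qubit i. If qubit0_is_msb is True, qubit 0 is the
    most significant bit of the index; otherwise it is the least significant.
    """
    ordered = list(bits) if qubit0_is_msb else list(reversed(bits))
    index = 0
    for b in ordered:
        index = (index << 1) | b
    return index


def reverse_bit_order(index: int, num_qubits: int) -> int:
    """Convert an index between the two conventions by reversing its bits."""
    return int(format(index, f"0{num_qubits}b")[::-1], 2)


state = [1, 0]  # qubit 0 is 1, qubit 1 is 0
print(basis_index(state, qubit0_is_msb=True))    # 2
print(basis_index(state, qubit0_is_msb=False))   # 1
print(reverse_bit_order(2, num_qubits=2))        # 1
```

Two components that silently assume different conventions agree on symmetric states such as 00 and 11, which is part of why such mismatches can go unnoticed in simple tests.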

Incorrect Scheduling. Our final bug pattern among the application logic-related patterns is about
code that schedules the low-level instructions to be executed on a quantum computation device. If
such code accidentally schedules an instruction to be executed at the wrong timestep, we call it an
instance of the incorrect scheduling bug pattern. We find five bugs that match this pattern, all of
which are in the machine code generation component. Figure 16 shows an example, where the fix
adapts the incorrect starting time of instructions scheduled in OpenQL.

4.4.3 Math-Related Bugs. Some parts of quantum computation platforms implement mathemati-
cally modeled phenomena, which poses a risk of introducing math-related bugs. We find 16 such
bugs in our study, and classify them into two subpatterns. The first pattern is incorrect numerical
computations, i.e., code that uses a wrong mathematical formula or model to represent a phenom-
enon. For example, Figure 17 shows a bug in the implementation of the __rpow__ operation on
the PauliString, which was incorrectly dividing by four instead of two. Understanding this and
other incorrect numerical computations requires a solid understanding of the mathematics behind
quantum computing.
The second math-related bug pattern is incorrect randomness handling, which means that code
related to probabilities and randomness uses these concepts incorrectly. Figure 18 shows an example.
The bug is due to missing initialization of a random seed, which causes all subsystems of the same
size within a randomized benchmark to have exactly the same gate instructions.
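
The following toy reproduction shows the general mechanism, not the Qiskit Ignis code: when every element is generated from the same seed, all elements receive the same gate sequence, whereas deriving a distinct seed per element keeps the result reproducible without the unintended correlation. The gate set and all function names are assumptions.

```python
import random
from typing import List

GATES = ["x", "y", "z", "h", "s"]


def random_gate_sequence(length: int, seed: int) -> List[str]:
    """Draw a gate sequence from a generator initialized with the given seed."""
    rng = random.Random(seed)
    return [rng.choice(GATES) for _ in range(length)]


def sequences_shared_seed(count: int, length: int, seed: int) -> List[List[str]]:
    """Buggy variant: every element reuses the same seed."""
    return [random_gate_sequence(length, seed) for _ in range(count)]


def sequences_unique_seeds(count: int, length: int, seed: int) -> List[List[str]]:
    """Fixed variant: each element gets its own seed derived from the base seed."""
    return [random_gate_sequence(length, seed + i) for i in range(count)]


shared = sequences_shared_seed(count=3, length=20, seed=42)
unique = sequences_unique_seeds(count=3, length=20, seed=42)

print(all(seq == shared[0] for seq in shared))                 # True: identical sequences
print(unique == sequences_unique_seeds(3, 20, 42))             # True: still reproducible
```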


  return pauli_string_phasor.PauliStringPhasor(
      PauliString(qubit_pauli_map=self._qubit_pauli_map),
-     exponent_neg=+half_turns / 4,
-     exponent_pos=-half_turns / 4)
+     exponent_neg=+half_turns / 2,
+     exponent_pos=-half_turns / 2)

Fig. 17. Example of a quantum bug due to an incorrect numerical computation. (UUID: 1496, from Cirq)

  def randomized_benchmarking_seq(nseeds: int = 1, ...):
      ...
      for _ in range(length_multiplier[rb_pattern_index]):
+         # make the seed unique for each element
+         if rand_seed:
+             rand_seed += (seed + 1)
          new_elmnt = rb_group.random(rb_q_num, rand_seed)

Fig. 18. Example of incorrect randomness handling. (UUID: 27, from Qiskit-Ignis)

  def beamsplitter(self, t, r, mode1, mode2):
      """Perform a beamsplitter operation on the specified modes.
-     t (complex): transmittivity parameter
+     t (float): transmittivity parameters
      ...
+     if isinstance(t, complex):
+         raise ValueError("Beamsplitter transmittivity t must be a float.")

Fig. 19. Example of type problem. (UUID: 1820, from Strawberry Fields)

4.4.4 Other Recurrent Bug Patterns. Beyond the three families of bug patterns described above,
we find seven additional patterns, shown under others in Figure 9. These seven patterns are not
quantum-specific, and hence described only briefly. With 32 examples, the largest category is
misconfiguration bugs, which means that an incorrect configuration parameter causes the build
scripts, testing scripts, or installation scripts to behave in an unexpected way. Another frequent
pattern is type problems, such as the example in Figure 19, where the bug fix raises an error if an
incorrect type is passed as a parameter. We also observe multiple examples of simple typos in the
code, string-related bugs [Eghbali and Pradel 2020], flaky tests, memory leaks, and GPU-related bugs.

4.4.5 Distribution of Classical vs. Quantum-Specific Bugs Across the Bug Patterns. The bug patterns
described above and our classification of each bug into quantum-specific or classical (RQ1) are, a
priori, independent of each other. The following studies whether some bug patterns are particularly
prevalent among quantum-specific or classical bugs. Figure 20 shows for each bug pattern the
number of quantum-specific and classical bugs.
The results allow for several interesting observations. First, some bug patterns are clearly
dominated by quantum-specific bugs. This is obviously the case for our novel, quantum-specific
bug patterns, such as the qubit-related patterns, but also for incorrect numerical computations and
wrong information in the intermediate representation. Second, wrong concept bugs are more likely
to be quantum-specific than classical, suggesting that concepts in quantum programming, such as
qubits and circuits, are likely causes of confusion among developers. Third, some bug patterns are
populated both by many quantum-specific bugs and by many classical bugs, such as overlooked


[Horizontal bar chart. y-axis: Bug pattern; x-axis: Count (0 to 30); legend (Type of Bug): Quantum, Classical. Patterns from top to bottom: Overlooked corner case; Incorrect IR - Wrong information; Incorrect numerical computation; Wrong concept; Incorrect qubit count; Wrong identifier; Incorrect IR - Missing information; Type problem; Incorrect scheduling; MSB-LSB convention mismatch; API misuse; GPU related; Outdated API client; Incorrect qubit order; Typo; Flaky test; Incorrect randomness handling; Misconfiguration; Memory leak; Incorrect string.]

Fig. 20. Number of quantum-specific and classical bugs per bug pattern.

corner cases. Finally, bug patterns known from other application domains, e.g., misconfigurations
and type problems, are mostly found among the classical bugs.

Answer to RQ4: Among the 223 studied bugs, there are various recurring bug patterns. We
find both patterns known from other domains, e.g., incorrect API uses (26 bugs) and type
problems (18 bugs), and quantum-specific bug patterns, e.g., wrong or incomplete intermediate
representations (29 bugs), mistakes related to the order or count of qubits (8 bugs), and mistakes
in encoding the underlying mathematical formulas into numeric computations (11 bugs).

Implications. Our analysis of recurring bug patterns motivates several lines of future work.
First, the fact that there are quantum-specific bug patterns shows a need for new techniques that
target such bugs. For example, we envision bug detection tools that address common yet domain-
specific problems on a pattern-by-pattern basis, similar to existing static bug detectors [Aftandilian
et al. 2012]. Our annotated bug dataset can serve as a starting point for measuring the detection
abilities of such bug detectors. Second, the observation that there are almost no bugs that are
both type-related and quantum-specific motivates work on type systems to reason about quantum
abstractions. For example, such type systems could help reduce wrong concept bugs by representing
different concepts with different types. Third, the prevalence of incorrect numerical computation
bugs underlines the need for developers of quantum computing platforms to carefully check the
implementations of the underlying mathematical concepts. Fourth, generic end-to-end approaches,
such as differential testing, could be adapted to find bugs in quantum computing platforms, as shown
by Wang et al. [2021b]. We also expect existing techniques for generating realistic programs, such
as those based on learning from a corpus of existing programs, to be useful for detecting quantum-related
bugs. Finally, most quantum-specific bugs, such as incorrect qubit order, incorrect scheduling, or
incorrect numerical computation, happen in components independent of the execution environment.


[Stacked bar chart. x-axis: Bug-Fix Complexity (LoC), 1 to 20+; y-axis: Count (0 to 50); legend (Type of Bug): Classical, Quantum. Median LoC for classical bug fixes: 4.0; median LoC for quantum bug fixes: 8.0.]

Fig. 21. Distribution of number of lines of code (LoC) for the bugs under study, divided by quantum and
classical bugs.

This fact implies that, while waiting for future error-corrected quantum computers, the use of
simulators is a good and viable option to test quantum platforms since execution on simulators can
successfully trigger a large part of these bug patterns.

4.5 RQ5: How Complex Are the Bug Fixes?


Motivated by the recent progress on synthesizing program transformations [Gao et al. 2020; Miltner
et al. 2019] and automated program repair [Bader et al. 2019; Berabi et al. 2021; Le Goues et al.
2019; Li et al. 2020a], we study the complexity of the bug fixes applied in quantum computing
platforms. We measure complexity as the number of lines changed to fix a bug. Figure 21 shows
the number of bugs with a specific number of changed lines in its fix. The stacked bars show the
number of quantum-specific and classical bugs. For example, there are 10 quantum-specific bugs
and 39 classical bugs that can be fixed by changing only one line.
Overall, we observe a similar distribution of bug fix complexity across quantum-specific and
classical bugs. Both groups of bugs include a large number of problems that can be fixed by changing
only one or two lines, but also a relatively large share of bugs that need at least 20 changed lines.
Computing the median size of a bug fix for both quantum-specific and classical bugs, we obtain
8.0 and 4.0, respectively. The difference suggests that despite the overall similarity of the two
distributions, fixing quantum-specific bugs tends to require larger code changes.
Besides the number of changed lines, other measures of bug fix complexity exist, such as the
number of changed methods, hunks, classes, files, or packages. For each bug fix, we also report the
number of code change hunks in Figure 22 and the number of changed files in Figure 23. Note that we
minimize each bug-fix commit via manual inspection and keep only the code change hunks that fix the
bug. For example, this might involve inspecting sub-commits of a pull request to separate the
bug-fixing changes from those that refactor the code or add regression tests. The majority of bug
fixes contain fewer than five code change hunks and fewer than two changed files.
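The three measures (changed lines, hunks, and files) can be derived mechanically from a minimized fix in unified diff format. The sketch below is a minimal illustration under stated assumptions: it counts every added or removed line inside a hunk as one changed line (the text does not specify how modified lines are counted), and the names `measure_fix`, `FixSize`, and the example diff with its file names are hypothetical.

```python
import re
from dataclasses import dataclass

# Matches unified-diff hunk headers such as "@@ -10,3 +10,3 @@ def f():".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@")


@dataclass
class FixSize:
    files: int
    hunks: int
    changed_lines: int


def measure_fix(unified_diff: str) -> FixSize:
    """Count files, hunks, and added/removed lines in a git-style diff."""
    files = hunks = changed = 0
    in_hunk = False
    for line in unified_diff.splitlines():
        if line.startswith("diff --git "):
            files += 1
            in_hunk = False  # the "---"/"+++" file headers are not changes
        elif HUNK_HEADER.match(line):
            hunks += 1
            in_hunk = True
        elif in_hunk and line.startswith(("+", "-")):
            changed += 1  # assumption: added + removed lines
    return FixSize(files, hunks, changed)


# Hypothetical bug fix touching two files; not taken from the dataset.
EXAMPLE_DIFF = """\
diff --git a/circuit.py b/circuit.py
--- a/circuit.py
+++ b/circuit.py
@@ -10,3 +10,3 @@ def bits(n):
     out = []
-    for i in range(n):
+    for i in reversed(range(n)):
         out.append(i)
@@ -40,2 +40,3 @@
     x = 1
+    y = 2
     return x
diff --git a/util.py b/util.py
--- a/util.py
+++ b/util.py
@@ -1,1 +1,1 @@
-A = 1
+A = 2
"""

print(measure_fix(EXAMPLE_DIFF))
```

Under this counting convention, a one-line modification contributes two changed lines (one removed, one added); a different convention would shift the resulting distribution accordingly.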

[Figure 22: stacked bar chart. x-axis: Bug-Fix Complexity (# hunks), 1 to 10+; y-axis: Count; legend: Type of Bug (Classical, Quantum); annotated medians: 2.0 hunks for classical bug fixes, 3.0 hunks for quantum bug fixes.]

Fig. 22. Distribution of code change hunks per bug fix.

[Figure 23: stacked bar chart. x-axis: Bug-Fix Complexity (# files), 1 to 10+; y-axis: Count; legend: Type of Bug (Classical, Quantum); annotated medians: 1.0 files for classical bug fixes, 1.0 files for quantum bug fixes.]

Fig. 23. Distribution of number of edited files per bug fix.

Answer to RQ5: Many bugs in quantum computing platforms can be fixed by changing only
one or two lines. While this observation holds for both quantum-specific and classical bugs,
the former also include a large number of bug fixes with 20 or more lines. Most of these bugs
can be fixed by editing only one or two files.

Implications. Quantum-specific bugs that can be fixed by changing only a few lines are an
attractive target for automated program transformation and program repair techniques. Yet, we
also observe a disproportionately large number of bug fixes with 20 or more lines, which likely
puts them out of reach of today’s bug fix automation approaches. For future work, quantifying the
scope of bug fixes in terms of the number of lines, hunks, and files can guide the design of
program repair techniques. Moreover, in the area of software evolution, this work is, to the best
of our knowledge, the first to quantify these metrics for quantum computing platforms, giving a
reference point for future work.

5 RELATED WORK
5.1 Studies of Quantum Bugs
The need for and the challenges of studying quantum bugs are also discussed in a short position
paper [Campos and Souto 2021], which, however, does not address these challenges. Zhao et al. [2021b]
provide a dataset of 36 bugs gathered from a single quantum computing platform (Qiskit). Our
study contributes both in terms of scale and depth, by studying an order of magnitude more bugs
than the only existing bug dataset and by performing a detailed analysis of the bugs. Importantly,
our study is based on bugs from 18 different projects, allowing us to draw more general conclusions
than a study on a single platform can.
Related to our hierarchy of bug patterns, Zhao et al. [2021a] define eight bug patterns that
focus on misuses of features of the Qiskit language. In contrast to our work, these patterns are
not based on bugs found in the wild, and hence, it remains unclear whether, and if so, how often
the bug patterns occur. Moreover, their patterns are about quantum programs written on top of a
quantum computing platform, whereas we focus on bugs in the platforms themselves. Huang and
Martonosi [2019a] implement three quantum algorithms in the Scaffold and ProjectQ languages and
propose six quantum bug patterns based on their experience while programming those algorithms.
Instead of reporting our own experience, we study real-world bugs
across 18 projects that many developers contribute to.

5.2 Studies of Other Kinds of Bugs


Prior work studies various other kinds of bugs. Most closely related to our work are studies of bugs
in widely used frameworks and platforms. For example, Chou et al. [2001] study bugs in operating
systems, while more recent work studies bugs in compilers [Sun et al. 2016] and deep learning
libraries [Islam et al. 2019; Shen et al. 2021]. Our work is similar in that we study bugs in a platform
used by various applications, but we cover a domain missed by prior work. Other studies are about
specific kinds of bugs, e.g., concurrency bugs [Lu et al. 2008], performance bugs [Han and Yu 2016;
Jin et al. 2012; Selakovic and Pradel 2016], and string-related bugs [Eghbali and Pradel 2020].

5.3 Correctness of Quantum Computing and Other Platforms


Several approaches to increase the correctness of quantum computing platforms have been proposed
recently. Shi et al. [2020] describe a verification framework to check the correctness of Qiskit’s
compiler passes in a semi-automatic way. Wang et al. [2021b] use differential testing [McKeeman
1998] to test several quantum computing platforms against each other. The results of our study will
be useful for guiding future verification and testing efforts towards components and bug patterns
that are not sufficiently addressed by today’s approaches. Beyond quantum computing platforms,
other widely used platforms are subject to testing and verification approaches, e.g., in the form of
compiler testing [Chen et al. 2020; Le et al. 2014, 2015; Yang et al. 2011; Zhang et al. 2017], compiler
verification [Leroy 2009], or automated testing of deep learning libraries [Pham et al. 2019; Wang
et al. 2021a]. Apart from the few existing approaches cited above, adapting these ideas to quantum
computing platforms remains a promising line of future work.
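The idea behind differential testing can be illustrated at toy scale. The sketch below is not QDiff and does not use any real platform: it compares two hypothetical single-qubit "backends" (`backend_matrix` and `backend_rules`, both invented for this example) on all short circuits over the gates H, X, and Z, and reports circuits on which their measurement probabilities disagree. The `z_bug` flag seeds an artificial bug that silently drops the Z gate.

```python
import itertools
import math

S = 1 / math.sqrt(2)
# Real-valued 2x2 matrices of the three gates used in this toy example.
MATRICES = {
    "H": ((S, S), (S, -S)),
    "X": ((0.0, 1.0), (1.0, 0.0)),
    "Z": ((1.0, 0.0), (0.0, -1.0)),
}


def backend_matrix(circuit):
    """Reference backend: generic matrix-vector multiplication."""
    a, b = 1.0, 0.0  # amplitudes of |0> and |1>
    for gate in circuit:
        (m00, m01), (m10, m11) = MATRICES[gate]
        a, b = m00 * a + m01 * b, m10 * a + m11 * b
    return (a * a, b * b)  # measurement probabilities


def backend_rules(circuit, z_bug=False):
    """Second backend: hand-written update rule per gate."""
    a, b = 1.0, 0.0
    for gate in circuit:
        if gate == "H":
            a, b = (a + b) * S, (a - b) * S
        elif gate == "X":
            a, b = b, a
        elif gate == "Z":
            if not z_bug:  # seeded bug: Z is dropped when z_bug is True
                b = -b
        else:
            raise ValueError(f"unknown gate: {gate}")
    return (a * a, b * b)


def find_disagreements(backend_a, backend_b, max_len=3, tol=1e-9):
    """Run all circuits up to max_len gates on both backends and compare."""
    disagreements = []
    for n in range(1, max_len + 1):
        for circuit in itertools.product("HXZ", repeat=n):
            pa, pb = backend_a(circuit), backend_b(circuit)
            if any(abs(x - y) > tol for x, y in zip(pa, pb)):
                disagreements.append(circuit)
    return disagreements


def buggy_backend(circuit):
    return backend_rules(circuit, z_bug=True)


print(find_disagreements(backend_matrix, backend_rules))
print(find_disagreements(backend_matrix, buggy_backend))
```

In this toy setting, the seeded phase bug is invisible for circuits of one or two gates and only surfaces once a later gate turns the phase difference into a difference in measurement probabilities, for example in the circuit H, Z, H.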

5.4 Correctness of Quantum Programs


Finding bugs in quantum programs, i.e., programs that run on a quantum computing platform,
is the primary goal of another line of research. These programs are more difficult to debug than
traditional programs due to the impossibility of copying quantum information [Wootters and
Zurek 1982] and the inherently probabilistic nature of measurements. To tackle these challenges,
Huang and Martonosi [2019b] propose statistical methods to perform assertions in a quantum
program. Li et al. [2020b] describe a projection-based runtime assertion scheme for quantum
programs that ensures that testing an assertion does not affect the tested state if it satisfies the
assertion. Yu and Palsberg [2021] propose an abstract interpretation-based static analysis for
quantum programs. These approaches and work on ensuring the correctness of platforms, such as
ours, are complementary to each other, but share the overall goal of mitigating the risk of bugs in
quantum computing.
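As a rough illustration of what a statistical assertion on measurement outcomes can look like, the sketch below applies a chi-square goodness-of-fit test to measurement counts. It is a simplified example and not the specific method of any of the cited works; the function names and the counts are hypothetical, and the threshold 3.841 is the standard chi-square critical value for one degree of freedom at significance level 0.05.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841


def chi_square_statistic(counts, expected_probs):
    """Pearson chi-square statistic of observed counts vs. expected probabilities.

    Assumes every expected probability is non-zero and that expected_probs
    lists all outcomes that can occur.
    """
    shots = sum(counts.values())
    stat = 0.0
    for outcome, prob in expected_probs.items():
        expected = shots * prob
        observed = counts.get(outcome, 0)
        stat += (observed - expected) ** 2 / expected
    return stat


def passes_uniform_check(counts):
    """Statistical assertion: a single qubit is in a uniform superposition."""
    stat = chi_square_statistic(counts, {"0": 0.5, "1": 0.5})
    return stat <= CRITICAL_DF1_ALPHA_05


# Made-up measurement counts for 1000 shots each.
print(passes_uniform_check({"0": 520, "1": 480}))  # small deviation
print(passes_uniform_check({"0": 600, "1": 400}))  # large deviation
```

Such a check can only bound the probability of a wrong verdict, since measurement outcomes are probabilistic; the number of shots and the significance level trade off cost against confidence.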

5.5 Quantum Programming Languages and Their Implementations


Quantum programming languages and their implementations are an active field of research. One
line of work is about language constructs to simplify particularly tricky aspects of quantum
programming, such as uncomputation, which is about resetting temporary quantum values, usually
before discarding them. Convenience functions that simplify this step are proposed as ApplyWith
in Q# [Svore et al. 2018] or with_computed in Quipper [Green et al. 2013b]. Paradis et al. [2021]
automatically synthesize uncomputation code for quantum circuits. The bug patterns identified in
our study could motivate other language constructs or code synthesis techniques.
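The compute, use, uncompute pattern behind such convenience functions can be sketched in a few lines. The code below is a toy model and not the API of Q# or Quipper: circuits are plain lists of (gate name, qubits) pairs, and the helper `with_computed`, the `INVERSE_OF` table, and the example circuit are hypothetical names chosen for this illustration.

```python
# Inverse of each gate in this toy gate set (H, X, CNOT are self-inverse).
INVERSE_OF = {"H": "H", "X": "X", "CNOT": "CNOT", "T": "Tdg", "Tdg": "T"}


def inverse(gates):
    """Invert a circuit: reverse the gate order and invert every gate."""
    return [(INVERSE_OF[name], qubits) for name, qubits in reversed(gates)]


def with_computed(compute, body):
    """Return compute, then body, then the automatically derived uncompute."""
    return list(compute) + list(body) + inverse(compute)


# Toy example: prepare a temporary value, use it once, then reset it.
compute = [("H", (0,)), ("T", (0,)), ("CNOT", (0, 1))]
body = [("CNOT", (1, 2))]
for gate in with_computed(compute, body):
    print(gate)
```

Deriving the uncompute step from the compute step removes one opportunity for mistakes, namely writing the inverse by hand in the wrong order or with the wrong inverse gate. The sketch assumes that the body leaves the temporary qubits in a state for which running the inverse restores them.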

Another line of work is about optimizing the execution of quantum programs. Häner et al. [2020]
propose an optimization technique based on assertions about entanglements between qubits, which
is implemented in the ProjectQ platform. Meuli et al. [2020] describe an optimization aimed at
avoiding numeric approximation errors while reducing the cost of computing with high accuracy,
which is implemented in the Q# platform. Gleipnir computes bounds on the errors caused by
noise in quantum computations, which can help in evaluating how effectively quantum compiler
optimizations mitigate errors [Tao et al. 2021]. Our study finds the optimization component to
have a particularly high ratio of quantum-specific bugs, showing that correctly implementing an
optimization deserves particular attention.

6 CONCLUSIONS
Motivated by the increasing importance of quantum computing platforms, this paper presents the
first empirical study of bugs in these platforms. Based on a set of 223 real-world bugs from 18
open-source projects, we study how many bugs are quantum-specific, where the bugs occur, how
they manifest, whether there are any recurring bug patterns, and how complex it is to fix the bugs.
We find that quantum-specific bugs are common and identify a novel set of quantum-specific bug
patterns. These findings show that, while platform developers can benefit from existing bug-related
tools, there is a need for new, quantum-specific techniques to prevent, detect, and fix bugs. For
example, future work could design type systems to prevent developers from confusing related
quantum concepts, language constructs to encode the order of qubits, static bug detectors that
target quantum-specific bug patterns, and generators of quantum programs to test quantum computing
platforms. Our study and its associated dataset provide concrete guidance for these research
directions, and a starting point for evaluating such approaches.
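As one possible reading of the idea of encoding the order of qubits, the sketch below attaches the bit order to a measured bitstring so that code cannot silently mix two conventions. It is a hypothetical design, not a construct of any existing platform; the names `BitOrder`, `Bitstring`, and `same_outcome` are invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class BitOrder(Enum):
    LITTLE_ENDIAN = "little"  # qubit 0 is the rightmost character
    BIG_ENDIAN = "big"        # qubit 0 is the leftmost character


@dataclass(frozen=True)
class Bitstring:
    bits: str
    order: BitOrder

    def to(self, order: BitOrder) -> "Bitstring":
        """Convert to another bit order; reversing the string flips the order."""
        if order is self.order:
            return self
        return Bitstring(self.bits[::-1], order)

    def qubit(self, index: int) -> int:
        """Value measured on the qubit with the given index."""
        if self.order is BitOrder.BIG_ENDIAN:
            return int(self.bits[index])
        return int(self.bits[-1 - index])


def same_outcome(a: Bitstring, b: Bitstring) -> bool:
    """Compare two measurement results after normalizing their bit order."""
    return a.to(BitOrder.BIG_ENDIAN).bits == b.to(BitOrder.BIG_ENDIAN).bits


little = Bitstring("001", BitOrder.LITTLE_ENDIAN)
big = Bitstring("100", BitOrder.BIG_ENDIAN)
print(little.qubit(0), big.qubit(0), same_outcome(little, big))
```

With the order carried in the value, a comparison of raw strings such as "001" and "100" is replaced by a comparison that first normalizes both sides, which is the kind of confusion an explicit construct is meant to rule out.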

ACKNOWLEDGMENTS
This work was supported by the European Research Council (ERC, grant agreement 851895), and
by the German Research Foundation within the ConcSys and Perf4JS projects.

REFERENCES
2021. Overview on Quantum Initiatives Worldwide - Update Mid 2021.
https://www.qureca.com/overview-on-quantum-initiatives-worldwide-update-mid-2021/.
2021. Qiskit/Qiskit. https://github.com/Qiskit/qiskit.
Edward Aftandilian, Raluca Sauciuc, Siddharth Priya, and Sundaresan Krishnan. 2012. Building Useful Program Analysis
Tools Using an Extensible Java Compiler. In 2012 IEEE 12th International Working Conference on Source Code Analysis and
Manipulation. 14–23. https://doi.org/10.1109/SCAM.2012.28
Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. 2018. Learning to Represent Programs with Graphs.
In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018,
Conference Track Proceedings. OpenReview.net.
Johannes Bader, Andrew Scott, Michael Pradel, and Satish Chandra. 2019. Getafix: Learning to Fix Bugs Automatically.
Proceedings of the ACM on Programming Languages 3, OOPSLA (Oct. 2019), 159:1–159:27. https://doi.org/10.1145/3360585
Gergö Barany. 2018. Finding Missed Compiler Optimizations by Differential Testing. In Proceedings of the 27th International
Conference on Compiler Construction (CC 2018). Association for Computing Machinery, New York, NY, USA, 82–92.
https://doi.org/10.1145/3178372.3179521
Berkay Berabi, Jingxuan He, Veselin Raychev, and Martin Vechev. 2021. TFix: Learning to Fix Coding Errors with a
Text-to-Text Transformer. In Proceedings of the 38th International Conference on Machine Learning. PMLR, 780–791.
Ville Bergholm, Josh Izaac, Maria Schuld, Christian Gogolin, M. Sohaib Alam, Shahnawaz Ahmed, Juan Miguel Arrazola,
Carsten Blank, Alain Delgado, Soran Jahangiri, Keri McKiernan, Johannes Jakob Meyer, Zeyue Niu, Antal
Száva, and Nathan Killoran. 2020. PennyLane: Automatic Differentiation of Hybrid Quantum-Classical Computations.
arXiv:1811.04968 [physics, physics:quant-ph] (Feb. 2020). arXiv:1811.04968 [physics, physics:quant-ph]
Benjamin Bichsel, Maximilian Baader, Timon Gehr, and Martin T. Vechev. 2020. Silq: a high-level quantum language
with safe uncomputation and intuitive semantics. In Proceedings of the 41st ACM SIGPLAN International Conference on
Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020, Alastair F. Donaldson and
Emina Torlak (Eds.). ACM, 286–300. https://doi.org/10.1145/3385412.3386007
José Campos and André Souto. 2021. QBugs: A Collection of Reproducible Bugs in Quantum Algorithms and a Supporting
Infrastructure to Enable Controlled Quantum Software Testing and Debugging Experiments. arXiv:2103.16968 [cs] (March
2021). arXiv:2103.16968 [cs]
Junjie Chen, Jibesh Patra, Michael Pradel, Yingfei Xiong, Hongyu Zhang, Dan Hao, and Lu Zhang. 2020. A Survey of
Compiler Testing. ACM Comput. Surv. 53, 1 (2020), 4:1–4:36. https://doi.org/10.1145/3363562
Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson Engler. 2001. An Empirical Study of Operating
Systems Errors. In Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP ’01). Association
for Computing Machinery, New York, NY, USA, 73–88. https://doi.org/10.1145/502034.502042
Andrew W. Cross, Lev S. Bishop, John A. Smolin, and Jay M. Gambetta. 2017. Open Quantum Assembly Language.
arXiv:1707.03429 [quant-ph] (July 2017). arXiv:1707.03429 [quant-ph]
Cirq Developers. 2021. Cirq. Zenodo. https://doi.org/10.5281/zenodo.5182845
Elizabeth Dinella, Hanjun Dai, Ziyang Li, Mayur Naik, Le Song, and Ke Wang. 2020. Hoppity: Learning Graph Transformations
to Detect and Fix Bugs in Programs. In 8th International Conference on Learning Representations, ICLR 2020, Addis
Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.net/forum?id=SJeqs6EFvB
Aryaz Eghbali and Michael Pradel. 2020. No Strings Attached: An Empirical Study of String-Related Software Bugs. In
Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE ’20). Association for
Computing Machinery, New York, NY, USA, 956–967. https://doi.org/10.1145/3324884.3416576
Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. 2014. A Quantum Approximate Optimization Algorithm. arXiv:1411.4028
[quant-ph] (Nov. 2014). arXiv:1411.4028 [quant-ph]
Mark Fingerhuth, Tomáš Babej, and Peter Wittek. 2018. Open Source Software in Quantum Computing. PLOS ONE 13, 12
(Dec. 2018), e0208561. https://doi.org/10.1371/journal.pone.0208561
Doug Finke. 2021. Relative Popularity of Different Quantum Programming Platforms - Quantum Computing Report.
https://web.archive.org/web/20210619213740/https://quantumcomputingreport.com/relative-popularity-of-different-quantum-programming-platforms/.
Xiang Gao, Shraddha Barke, Arjun Radhakrishna, Gustavo Soares, Sumit Gulwani, Alan Leung, Nachiappan Nagappan, and
Ashish Tiwari. 2020. Feedback-driven semi-supervised synthesis of program transformations. Proc. ACM Program. Lang.
4, OOPSLA (2020), 219:1–219:30. https://doi.org/10.1145/3428287
Constantin Gonzalez. 2021. Cloud Based QC with Amazon Braket. Digitale Welt 5, 2 (April 2021), 14–17.
https://doi.org/10.1007/s42354-021-0330-z
Alexander S. Green, Peter LeFanu Lumsdaine, Neil J. Ross, Peter Selinger, and Benoît Valiron. 2013a. Quipper: a scalable
quantum programming language. In ACM SIGPLAN Conference on Programming Language Design and Implementation,
PLDI ’13, Seattle, WA, USA, June 16-19, 2013, Hans-Juergen Boehm and Cormac Flanagan (Eds.). ACM, 333–342.
https://doi.org/10.1145/2491956.2462177
Alexander S. Green, Peter LeFanu Lumsdaine, Neil J. Ross, Peter Selinger, and Benoît Valiron. 2013b. Quipper: A Scalable
Quantum Programming Language. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language
Design and Implementation (PLDI ’13). Association for Computing Machinery, New York, NY, USA, 333–342.
https://doi.org/10.1145/2491956.2462177
Lov K. Grover. 1996. A Fast Quantum Mechanical Algorithm for Database Search. In Proceedings of the Twenty-Eighth
Annual ACM Symposium on Theory of Computing (STOC ’96). Association for Computing Machinery, New York, NY,
USA, 212–219. https://doi.org/10.1145/237814.237866
Xue Han and Tingting Yu. 2016. An Empirical Study on Performance Bugs for Highly Configurable Software Systems. In
Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM
’16). Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/2961111.2962602
Thomas Häner, Torsten Hoefler, and Matthias Troyer. 2020. Assertion-based optimization of Quantum programs. Proc. ACM
Program. Lang. 4, OOPSLA (2020), 133:1–133:20. https://doi.org/10.1145/3428201
Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. 2009. Quantum Algorithm for Solving Linear Systems of Equations.
Physical Review Letters 103, 15 (Oct. 2009), 150502. https://doi.org/10.1103/PhysRevLett.103.150502 arXiv:0811.3171
Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, and David Bieber. 2020. Global Relational Models
of Source Code. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30,
2020. OpenReview.net. https://openreview.net/forum?id=B1lnbRNtwr
Kesha Hietala, Robert Rand, Shih-Han Hung, Xiaodi Wu, and Michael Hicks. 2021. A Verified Optimizer for Quantum Circuits.
Proceedings of the ACM on Programming Languages 5, POPL (Jan. 2021), 37:1–37:29. https://doi.org/10.1145/3434318
Yipeng Huang and Margaret Martonosi. 2019a. QDB: From Quantum Algorithms Towards Correct Quantum Programs.
arXiv:1811.05447 [quant-ph] (2019), 14 pages. https://doi.org/10.4230/OASIcs.PLATEAU.2018.4 arXiv:1811.05447
[quant-ph]
Yipeng Huang and Margaret Martonosi. 2019b. Statistical Assertions for Validating Patterns and Finding Bugs in Quantum
Programs. In Proceedings of the 46th International Symposium on Computer Architecture (ISCA ’19). Association for
Computing Machinery, New York, NY, USA, 541–553. https://doi.org/10.1145/3307650.3322213
Md Johirul Islam, Giang Nguyen, Rangeet Pan, and Hridesh Rajan. 2019. A Comprehensive Study on Deep Learning Bug
Characteristics. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and
Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Association for Computing Machinery, New
York, NY, USA, 510–520. https://doi.org/10.1145/3338906.3338955
Guoliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu. 2012. Understanding and Detecting Real-World
Performance Bugs. ACM SIGPLAN Notices 47, 6 (June 2012), 77–88. https://doi.org/10.1145/2345156.2254075
Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, and Kensen Shi. 2020. Learning and Evaluating Contextual Embedding
of Source Code. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual
Event (Proceedings of Machine Learning Research, Vol. 119). PMLR, 5110–5121. https://doi.org/10.48550/ARXIV.2001.00059
Rafael-Michael Karampatsis and Charles Sutton. 2020. How Often Do Single-Statement Bugs Occur? The ManySStuBs4J
Dataset. In Proceedings of the 17th International Conference on Mining Software Repositories (MSR ’20). Association for
Computing Machinery, New York, NY, USA, 573–577. https://doi.org/10.1145/3379597.3387491
Jakob S. Kottmann, Sumner Alperin-Lea, Teresa Tamayo-Mendoza, Alba Cervera-Lierta, Cyrille Lavigne, Tzu-Ching Yen,
Vladyslav Verteletskyi, Philipp Schleich, Abhinav Anand, Matthias Degroote, Skylar Chaney, Maha Kesibi, Naomi Grace
Curnow, Brandon Solo, Georgios Tsilimigkounakis, Claudia Zendejas-Morales, Artur F. Izmaylov, and Alán Aspuru-Guzik.
2021. Tequila: A Platform for Rapid Development of Quantum Algorithms. Quantum Science and Technology 6, 2 (April
2021), 024009. https://doi.org/10.1088/2058-9565/abe567 arXiv:2011.03057
Ryan LaRose, Andrea Mari, Sarah Kaiser, Peter J. Karalekas, Andre A. Alves, Piotr Czarnik, Mohamed El Mandouh,
Max H. Gordon, Yousef Hindy, Aaron Robertson, Purva Thakre, Nathan Shammah, and William J. Zeng. 2021. Mitiq:
A Software Package for Error Mitigation on Noisy Quantum Computers. arXiv:2009.04417 [quant-ph] (Aug. 2021).
arXiv:2009.04417 [quant-ph]
Vu Le, Mehrdad Afshari, and Zhendong Su. 2014. Compiler validation via equivalence modulo inputs. In ACM SIGPLAN
Conference on Programming Language Design and Implementation, PLDI ’14, Edinburgh, United Kingdom - June 09 - 11,
2014, Michael F. P. O’Boyle and Keshav Pingali (Eds.). ACM, 216–226. https://doi.org/10.1145/2594291.2594334
Vu Le, Chengnian Sun, and Zhendong Su. 2015. Finding Deep Compiler Bugs via Guided Stochastic Program Mutation.
ACM SIGPLAN Notices 50, 10 (Oct. 2015), 386–399. https://doi.org/10.1145/2858965.2814319
Claire Le Goues, Michael Pradel, and Abhik Roychoudhury. 2019. Automated program repair. Commun. ACM 62, 12 (2019),
56–65. https://doi.org/10.1145/3318162
Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM 52, 7 (July 2009), 107–115.
https://doi.org/10.1145/1538788.1538814
Gushu Li, Li Zhou, Nengkun Yu, Yufei Ding, Mingsheng Ying, and Yuan Xie. 2020b. Projection-based runtime assertions
for testing and debugging Quantum programs. Proc. ACM Program. Lang. 4, OOPSLA (2020), 150:1–150:29.
https://doi.org/10.1145/3428218
Yi Li, Shaohua Wang, and Tien N. Nguyen. 2020a. DLFix: Context-Based Code Transformation Learning for Automated
Program Repair. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (ICSE ’20).
Association for Computing Machinery, New York, NY, USA, 602–614. https://doi.org/10.1145/3377811.3380345
Z. Li, S. Lu, S. Myagmar, and Y. Zhou. 2006. CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code.
IEEE Transactions on Software Engineering 32, 3 (March 2006), 176–192. https://doi.org/10.1109/TSE.2006.28
Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. 2008. Learning from Mistakes: A Comprehensive Study on Real
World Concurrency Bug Characteristics. In Proceedings of the 13th International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS XIII). Association for Computing Machinery, New York, NY,
USA, 329–339. https://doi.org/10.1145/1346281.1346323
Tyler McDonnell, Baishakhi Ray, and Miryung Kim. 2013. An Empirical Study of API Stability and Adoption in the Android
Ecosystem. In 2013 IEEE International Conference on Software Maintenance, Eindhoven, The Netherlands, September 22-28,
2013. IEEE Computer Society, 70–79. https://doi.org/10.1109/ICSM.2013.18
William M. McKeeman. 1998. Differential Testing for Software. Digit. Tech. J. 10, 1 (1998), 100–107.
Giulia Meuli, Mathias Soeken, Martin Roetteler, and Thomas Häner. 2020. Enabling accuracy-aware Quantum compilers using
symbolic resource estimation. Proc. ACM Program. Lang. 4, OOPSLA (2020), 130:1–130:26. https://doi.org/10.1145/3428198
Anders Miltner, Sumit Gulwani, Vu Le, Alan Leung, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari, and Abhishek Udupa.
2019. On the fly synthesis of edit suggestions. PACMPL 3, OOPSLA (2019), 143:1–143:29. https://doi.org/10.1145/3360569
Prakash Murali, David C. Mckay, Margaret Martonosi, and Ali Javadi-Abhari. 2020. Software Mitigation of Crosstalk on Noisy
Intermediate-Scale Quantum Computers. In Proceedings of the Twenty-Fifth International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS ’20). Association for Computing Machinery, New
York, NY, USA, 1001–1016. https://doi.org/10.1145/3373376.3378477
Anouk Paradis, Benjamin Bichsel, Samuel Steffen, and Martin T. Vechev. 2021. Unqomp: synthesizing uncomputation
in Quantum circuits. In PLDI ’21: 42nd ACM SIGPLAN International Conference on Programming Language Design and
Implementation, Virtual Event, Canada, June 20-25, 2021, Stephen N. Freund and Eran Yahav (Eds.). ACM, 222–236.
https://doi.org/10.1145/3453483.3454040
Hung Viet Pham, Thibaud Lutellier, Weizhen Qi, and Lin Tan. 2019. CRADLE: cross-backend validation to detect and localize
bugs in deep learning libraries. In Proceedings of the 41st International Conference on Software Engineering, ICSE 2019,
Montreal, QC, Canada, May 25-31, 2019, Joanne M. Atlee, Tevfik Bultan, and Jon Whittle (Eds.). IEEE / ACM, 1027–1038.
https://doi.org/10.1109/ICSE.2019.00107
Md Rafiqul Islam Rabin, Vincent J. Hellendoorn, and Mohammad Amin Alipour. 2021. Understanding Neural Code Intelligence
through Program Simplification. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference
and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021). Association for Computing Machinery, New
York, NY, USA, 441–452. https://doi.org/10.1145/3468264.3468539
Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, and Premkumar Devanbu. 2016. On
the "Naturalness" of Buggy Code. In Proceedings of the 38th International Conference on Software Engineering (ICSE ’16).
Association for Computing Machinery, New York, NY, USA, 428–439. https://doi.org/10.1145/2884781.2884848
Andrew Rice, Edward Aftandilian, Ciera Jaspan, Emily Johnston, Michael Pradel, and Yulissa Arroyo-Paredes. 2017. Detecting
Argument Selection Defects. Proceedings of the ACM on Programming Languages 1, OOPSLA (Oct. 2017), 104:1–104:22.
https://doi.org/10.1145/3133928
Marija Selakovic and Michael Pradel. 2016. Performance Issues and Optimizations in JavaScript: An Empirical Study. In
Proceedings of the 38th International Conference on Software Engineering (ICSE ’16). Association for Computing Machinery,
New York, NY, USA, 61–72. https://doi.org/10.1145/2884781.2884829
Qingchao Shen, Haoyang Ma, Junjie Chen, Yongqiang Tian, Shing-Chi Cheung, and Xiang Chen. 2021. A Comprehensive
Study of Deep Learning Compiler Bugs. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering
Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021). Association for Computing
Machinery, New York, NY, USA, 968–980. https://doi.org/10.1145/3468264.3468591
Yunong Shi, Runzhou Tao, Xupeng Li, Ali Javadi-Abhari, Andrew W. Cross, Frederic T. Chong, and Ronghui Gu. 2020.
CertiQ: A Mostly-automated Verification of a Realistic Quantum Compiler. arXiv:1908.08963 [quant-ph] (Nov. 2020).
arXiv:1908.08963 [quant-ph]
Peter W. Shor. 1999. Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer.
SIAM Rev. 41, 2 (Jan. 1999), 303–332. https://doi.org/10.1137/S0036144598347011
Chengnian Sun, Vu Le, Qirun Zhang, and Zhendong Su. 2016. Toward understanding compiler bugs in GCC and LLVM. In
Proceedings of the 25th International Symposium on Software Testing and Analysis, ISSTA 2016, Saarbrücken, Germany, July
18-20, 2016, Andreas Zeller and Abhik Roychoudhury (Eds.). ACM, 294–305. https://doi.org/10.1145/2931037.2931074
Yasunari Suzuki, Yoshiaki Kawase, Yuya Masumura, Yuria Hiraga, Masahiro Nakadai, Jiabao Chen, Ken M. Nakanishi, Kosuke
Mitarai, Ryosuke Imai, Shiro Tamiya, Takahiro Yamamoto, Tennin Yan, Toru Kawakubo, Yuya O. Nakagawa, Yohei Ibe,
Youyuan Zhang, Hirotsugu Yamashita, Hikaru Yoshimura, Akihiro Hayashi, and Keisuke Fujii. 2021. Qulacs: A Fast and
Versatile Quantum Circuit Simulator for Research Purpose. Quantum 5 (Oct. 2021), 559.
https://doi.org/10.22331/q-2021-10-06-559
Krysta Svore, Alan Geller, Matthias Troyer, John Azariah, Christopher Granade, Bettina Heim, Vadym Kliuchnikov, Mariia
Mykhailova, Andres Paz, and Martin Roetteler. 2018. Q#: Enabling Scalable Quantum Computing and Development with
a High-level DSL. In Proceedings of the Real World Domain Specific Languages Workshop 2018 (RWDSL2018). Association
for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/3183895.3183901
Runzhou Tao, Yunong Shi, Jianan Yao, John Hui, Frederic T. Chong, and Ronghui Gu. 2021. Gleipnir: toward practical error
analysis for Quantum programs. In PLDI ’21: 42nd ACM SIGPLAN International Conference on Programming Language
Design and Implementation, Virtual Event, Canada, June 20-25, 2021, Stephen N. Freund and Eran Yahav (Eds.). ACM,
48–64. https://doi.org/10.1145/3453483.3454029
Viktor Vafeiadis, Thibaut Balabonski, Soham Chakraborty, Robin Morisset, and Francesco Zappa Nardelli. 2015. Common
Compiler Optimisations Are Invalid in the C11 Memory Model and What We Can Do about It. In Proceedings of the
42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’15). Association for
Computing Machinery, New York, NY, USA, 209–220. https://doi.org/10.1145/2676726.2676995
Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, and Rishabh Singh. 2018. Neural Program Repair by Jointly
Learning to Localize and Repair. In International Conference on Learning Representations.
Jiyuan Wang, Qian Zhang, Guoqing Harry Xu, and Miryung Kim. 2021b. QDiff: Differential Testing of Quantum Software
Stacks. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). 692–704.
https://doi.org/10.1109/ASE51524.2021.9678792
Song Wang, Nishtha Shrestha, Abarna Kucheri Subburaman, Junjie Wang, Moshi Wei, and Nachiappan Nagappan. 2021a.
Automatic Unit Test Generation for Machine Learning Libraries: How Far Are We?. In 43rd IEEE/ACM International
Conference on Software Engineering, ICSE 2021, Madrid, Spain, 22-30 May 2021. IEEE, 1548–1560.
https://doi.org/10.1109/ICSE43902.2021.00138
W. K. Wootters and W. H. Zurek. 1982. A Single Quantum Cannot Be Cloned. Nature 299, 5886 (Oct. 1982), 802–803.
https://doi.org/10.1038/299802a0
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and Understanding Bugs in C Compilers. ACM SIGPLAN
Notices 46, 6 (June 2011), 283–294. https://doi.org/10.1145/1993316.1993532
Nengkun Yu and Jens Palsberg. 2021. Quantum abstract interpretation. In PLDI ’21: 42nd ACM SIGPLAN International
Conference on Programming Language Design and Implementation, Virtual Event, Canada, June 20-25, 2021, Stephen N.
Freund and Eran Yahav (Eds.). ACM, 542–558. https://doi.org/10.1145/3453483.3454061
Qirun Zhang, Chengnian Sun, and Zhendong Su. 2017. Skeletal Program Enumeration for Rigorous Compiler Testing. ACM
SIGPLAN Notices 52, 6 (June 2017), 347–361. https://doi.org/10.1145/3140587.3062379
Jianjun Zhao. 2021. Quantum Software Engineering: Landscapes and Horizons. arXiv:2007.07047 [quant-ph] (Dec. 2021).
arXiv:2007.07047 [quant-ph]
Pengzhan Zhao, Jianjun Zhao, and Lei Ma. 2021a. Identifying Bug Patterns in Quantum Programs. In 2021 IEEE/ACM 2nd
International Workshop on Quantum Software Engineering (Q-SE). IEEE Computer Society, 16–21.
https://doi.org/10.1109/Q-SE52541.2021.00011
Pengzhan Zhao, Jianjun Zhao, Zhongtao Miao, and Shuhan Lan. 2021b. Bugs4Q: A Benchmark of Real Bugs for Quantum
Programs. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). 1373–1376.
https://doi.org/10.1109/ASE51524.2021.9678908
Hao Zhong, Na Meng, Zexuan Li, and Li Jia. 2020. An Empirical Study on API Parameter Rules. In Proceedings of the
ACM/IEEE 42nd International Conference on Software Engineering (ICSE ’20). Association for Computing Machinery, New
York, NY, USA, 899–911. https://doi.org/10.1145/3377811.3380922
