Analyzing Quantum Programs with LintQ: A Static Analysis Framework for Qiskit

Matteo Paltenghi (ORCID 0000-0003-2266-453X), University of Stuttgart, Stuttgart, Germany, and Michael Pradel (ORCID 0000-0003-1623-498X), University of Stuttgart, Stuttgart, Germany

(2024; 2023-09-28; 2024-04-16)

Abstract.

As quantum computing is rising in popularity, the number of quantum programs and the number of developers writing them are increasing rapidly. Unfortunately, writing correct quantum programs is challenging due to various subtle rules developers need to be aware of. Empirical studies show that 40–82% of all bugs in quantum software are specific to the quantum domain. Yet, existing static bug detection frameworks are mostly unaware of quantum-specific concepts, such as circuits, gates, and qubits, and hence miss many bugs. This paper presents LintQ, a comprehensive static analysis framework for detecting bugs in quantum programs. Our approach is enabled by a set of abstractions designed to reason about common concepts in quantum computing without referring to the details of the underlying quantum computing platform. Built on top of these abstractions, LintQ offers an extensible set of ten analyses that detect likely bugs, such as operating on corrupted quantum states, redundant measurements, and incorrect compositions of sub-circuits. We apply the approach to a newly collected dataset of 7,568 real-world Qiskit-based quantum programs, showing that LintQ effectively identifies various programming problems, with a precision of 91.0% in its default configuration with the six best-performing analyses. A comparison with a general-purpose linter and two existing quantum-aware techniques shows that almost all problems (92.1%) found by LintQ during our evaluation are missed by prior work. LintQ hence takes an important step toward reliable software in the growing field of quantum computing.

program analysis, quantum computing, static analysis, bug detection

CCS Concepts: Theory of computation → Program analysis; Computer systems organization → Quantum computing. Published in PACMSE, volume 1, number FSE, article 95, July 2024. DOI: 10.1145/3660802.

1. Introduction

Given the rising interest in quantum computing, ensuring the correctness of quantum software is increasingly important. Studies on bugs in quantum computing platforms (Paltenghi and Pradel, 2022) and quantum programs (Luo et al., 2022) show that many bugs in such software are problems specific to the quantum computing domain. For example, Paltenghi and Pradel (2022) and Luo et al. (2022) report that 40% and 82%, respectively, of the bugs found in quantum software are due to quantum-specific bug patterns. Detecting bugs in quantum programs is especially important because many bugs silently lead to unexpected results, which may be hard to spot due to the probabilistic results of quantum computations.

Unfortunately, popular bug detection tools, such as CodeQL (Avgustinov et al., 2016), Pylint (Pyl, [n. d.]), Flake8 (Fla, [n. d.]), Infer (Calcagno et al., 2015), and ErrorProne (Google, 2015), are unaware of quantum computing. These tools consist of two parts: first, a framework that provides a set of abstractions to reason about general properties of programs, such as data flow and control flow; second, a set of analyses built on top of the framework, each of which detects a particular kind of bug. This design has proven effective for general-purpose bug detection, as it allows reusing the framework across different analyses. Yet, the abstractions provided by these frameworks are not sufficient to reason about quantum-specific concepts, such as quantum gates, quantum circuits, and quantum registers.

Figure 1. Example of a quantum program with two bugs.

As a motivating example, Figure 1 shows a buggy quantum program. The program is based on Qiskit, a popular quantum computing platform, where a quantum program is Python code that uses a specific library. The code creates a circuit with a quantum register and a classical register, and then applies a sequence of gates and measurements to the circuit. The image on the right of the figure shows a graphical representation of the circuit. Despite being a simple program, the code contains two bugs. First, the code creates a circuit with four qubits, but then uses only three of them. Oversizing a circuit is strongly discouraged in quantum computing, because it wastes resources and because current hardware offers only a limited number of qubits. Second, the code measures the state of qubit 0 at line 9, and afterwards applies a gate to the same qubit at line 10. Unfortunately, due to the properties of quantum mechanics, the measurement destroys the quantum state, and hence, the program feeds a collapsed state to the ry operation. Finding such bugs with an automated bug detector requires the ability to reason about quantum-specific concepts.

Applying a general-purpose static bug detector, e.g., Pylint (Pyl, [n. d.]), to the code in Figure 1 does not reveal the quantum-specific bugs. Recently, the first techniques aimed at quantum programs have been proposed. One of them, QSmell (Chen et al., 2023), relies on dynamic analysis for most of its checks and hence is inherently harder to apply to real-world programs than a static analysis. Another one, QChecker (Zhao et al., 2023b), operates directly on the AST representation of quantum programs but does not provide a general framework that abstracts over the details of the underlying quantum computing platform. Besides their conceptual limitations, neither QSmell nor QChecker detects the bugs in Figure 1, showing that there is a need for a comprehensive static bug detection framework for quantum programs. (Here and in our empirical evaluation, QSmell refers to the static subset of its checks because getting arbitrary quantum programs to execute is non-trivial, e.g., due to unresolved dependencies and user input expected by a program.)

This paper introduces LintQ, a static analysis framework for detecting bugs in quantum programs. The approach is enabled by two key contributions. First, LintQ offers a set of abstractions of common concepts in quantum computing, such as circuits, gates, and qubits. These abstractions lift code written based on a specific quantum computing API, such as Qiskit, onto a higher level of abstraction. Second, we implement on top of these abstractions an extensible set of ten analyses, each of which queries the code for a particular kind of quantum programming problem. To benefit from prior work on general-purpose program analysis, the approach builds on an existing analysis framework, CodeQL (Avgustinov et al., 2016).

Applying LintQ to the code in Figure 1 leads to warnings about the two bugs. To reach this conclusion, the framework first represents the different elements of the program using our quantum-specific abstractions. For example, this representation expresses the fact that the circuit created at line 5 has four qubits. An analysis checking for oversized circuits then uses this information to determine that the circuit only uses the first three of the four qubits. To find the second bug, an analysis that warns about operations applied to a qubit after it has been measured builds upon the fact, provided by our framework, that lines 9 and 10 operate on the same qubit. Importantly, none of the analyses needs to reason about specific API calls in the Python code. Instead, each analysis reasons about the quantum program at the level of LintQ’s abstractions, which greatly simplifies the implementation of analyses.
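The two checks described above can be sketched in plain Python as follows. This is a toy model and not LintQ's CodeQL implementation: the Op record, the helper names, and the example operation list are assumptions made for illustration, and the list only loosely follows the shape of the program in Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str      # "gate" or "measure"
    name: str      # e.g. "h", "cx", "ry", "measure"
    qubits: tuple  # indices of the qubits the operation acts on


def unused_qubits(num_qubits, ops):
    """Oversized-circuit check: qubits that no operation touches."""
    used = {q for op in ops for q in op.qubits}
    return sorted(set(range(num_qubits)) - used)


def ops_after_measurement(ops):
    """Operation-after-measurement check: gates on an already measured qubit."""
    measured = set()
    findings = []
    for position, op in enumerate(ops):
        if op.kind == "gate":
            findings += [(position, op.name, q) for q in op.qubits if q in measured]
        elif op.kind == "measure":
            measured.update(op.qubits)
    return findings


# Hypothetical four-qubit circuit: qubit 3 is never used, and qubit 0 is
# measured before the ry gate is applied to it.
ops = [
    Op("gate", "h", (0,)),
    Op("gate", "cx", (0, 1)),
    Op("gate", "cx", (1, 2)),
    Op("measure", "measure", (0,)),
    Op("gate", "ry", (0,)),
]
print(unused_qubits(4, ops))       # [3]
print(ops_after_measurement(ops))  # [(4, 'ry', 0)]
```

Both checks only need to know which qubits each operation acts on and in which order operations are added, which is the kind of information the abstractions are meant to provide.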

We evaluate our approach by applying it to a novel dataset of 7,568 real-world, Qiskit-based quantum programs. The analyses built on top of our framework identify various problems in these programs. Manually inspecting a sample of 345 warnings from ten analyses shows that LintQ identifies 216 legitimate programming problems. Moreover, when using the default and recommended configuration of LintQ with six analyses, it achieves a precision of 91.0% (121 true positives out of 133 warnings). We reported 70 problems, seven of which have already been confirmed or even fixed. Our evaluation also shows that implementing an analysis on top of our abstractions is relatively simple, with an average of only 10 LoC per analysis, and that the analysis time is reasonable, with an average of 1.3 seconds per program.

In summary, this work makes the following contributions:

• A comprehensive framework for quantum program analysis, which provides reusable abstractions to reason about quantum software.

• Ten analyses implemented on top of these abstractions, which focus on programming problems reported in existing work on quantum-specific bugs (Zhao et al., 2021a; Kaul et al., 2023), or mentioned in GitHub issues and on StackExchange (Egretta.Thula, 2023; Opt, [n. d.]).

• A novel dataset of 7,568 real-world, Qiskit-based quantum programs, which is the largest such dataset and hence may serve as a basis for future work.

• A thorough evaluation of the effectiveness of the analyses, showing that the approach in its default configuration finds real-world issues with a precision of 91.0% while taking only 1.3 seconds to analyze a program.

2. Background

2.1. Quantum Programming

Several quantum programming languages have been proposed, such as Qiskit (Developers, 2021b), Cirq (Developers, 2021a), Q# (Svore et al., 2018), and Silq (Bichsel et al., 2020). Quantum programs, also called quantum circuits, are expressed as a sequence of operations, called quantum gates, applied to individual qubits. In Figure 1, we show a Qiskit program with four qubits and one classical bit, represented as horizontal lines, whereas the gates are shown as boxes or colored vertical lines. A special type of gate, called measurement gate, is used to measure the state of a qubit and store the result in a classical bit. The measurement gate, represented in black in the figure, produces a certain bit, either 0 or 1, with probabilities determined by the qubit state. Once the circuit has been defined, it is sent to a backend that executes it, typically a simulator or a real quantum computer. Note that a measurement has the important side-effect of destroying the quantum state. To sufficiently characterize its output, the circuit is executed multiple times, called shots, each time measuring the qubit.
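To make the role of measurements and shots concrete, the following sketch simulates repeated measurements of a single qubit in plain Python. It is an illustration only: the function names, the amplitude-pair representation of the state, and the fixed random seed are assumptions, and no quantum computing platform is involved.

```python
import math
import random
from collections import Counter


def measure(state, rng):
    """Measure a single qubit given its amplitudes (alpha, beta).

    Returns the classical bit and the collapsed post-measurement state.
    """
    alpha, _beta = state
    p_zero = abs(alpha) ** 2
    bit = 0 if rng.random() < p_zero else 1
    # The measurement destroys the superposition: only a basis state remains.
    collapsed = (1.0, 0.0) if bit == 0 else (0.0, 1.0)
    return bit, collapsed


def run_shots(state, shots, seed=0):
    """Prepare the same state `shots` times and count the measured bits."""
    rng = random.Random(seed)
    return Counter(measure(state, rng)[0] for _ in range(shots))


plus = (1 / math.sqrt(2), 1 / math.sqrt(2))  # state after a Hadamard gate on |0>
counts = run_shots(plus, shots=1000)
print(counts)  # roughly half zeros and half ones; exact counts depend on the seed
```

A single shot yields only one bit, so the output distribution becomes visible only across many shots, and any gate applied after `measure` would act on the collapsed state.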

To define and run quantum programs, developers rely on a quantum computing platform. A popular approach is to implement a platform on top of Python, as done by Qiskit (Developers, 2021b), Cirq (Developers, 2021a), Tket (Sivarajah et al., 2020), and Pennylane (Bergholm et al., 2020), so that a quantum program is essentially Python code that uses a specific library. Most platforms, with some noteworthy exceptions (Bichsel et al., 2020; Paradis et al., 2021; Weigold et al., 2021), describe programs on the level of circuits, and LintQ focuses on the analysis of quantum circuits. In particular, LintQ focuses on code written using the Qiskit quantum computing platform due to its popularity in both practice (Dargan, 2022) and research (Paltenghi and Pradel, 2022; Zhao et al., 2021b; Luo et al., 2022; Fortunato et al., 2022b, a; Wang et al., 2021b; Paltenghi and Pradel, 2023).

2.2. Static Analysis with CodeQL

CodeQL (Avgustinov et al., 2016; https://codeql.github.com/) is a popular engine for static analysis. It extracts facts from a program, such as its syntactic structure, data flow, and control flow, and then stores them in a relational database, which can be queried with the QL logic language. A query refers to classes, which we call abstractions, and their predicates, which represent relations between the abstractions. As a simple example, the listing below shows a CodeQL query that finds redundant if statements in Python. In the from section of the query, we define which program elements we consider, namely all if statements (If) and all statements (Stmt). Then, the where section restricts the query to those if statements whose body starts with a pass statement. Finally, the select section specifies the warning and its message. At line 3, we see how the If abstraction offers a getStmt predicate, which connects the if statement with the first statement in its “if true” block. Although Python is the language of choice of many quantum computing platforms (Qis, 2021; Developers, 2021a; Sivarajah et al., 2020), the built-in QL library for Python extracts no information regarding quantum computing concepts, such as the quantum registers, the qubit position in a register, the quantum gates, or the difference between a state manipulation and a measurement, further motivating the need for LintQ.

\begin{figure}
 \begin{minipage}{\textwidth}
 \begin{minipage}[b]{.5\textwidth}
\begin{lstlisting}
import python
from If ifstmt, Stmt pass
where pass = ifstmt.getStmt(0) and
  pass instanceof Pass
select ifstmt, "This 'if' statement is redundant."
\end{lstlisting}
 \caption{Example CodeQL query to find redundant if statements.}
 \label{fig:example_codeql}
 \end{minipage}
 \hfill
 \begin{minipage}[b]{.45\textwidth}
 \includegraphics[width=\textwidth]{images/overview.pdf}
 \caption{Overview of \approachName{}.}
 \label{fig:method_overview}
 \end{minipage}
 \end{minipage}
\end{figure}
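For readers unfamiliar with QL, the same check can be approximated with Python's standard \code{ast} module. The sketch below is an analogue for illustration and not part of CodeQL or \approachName{}; the function name and the example program are assumptions.

```python
import ast


def redundant_ifs(source):
    """Return line numbers of if statements whose first body statement is `pass`."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        # Mirrors `pass = ifstmt.getStmt(0) and pass instanceof Pass` in the query.
        if isinstance(node, ast.If) and isinstance(node.body[0], ast.Pass)
    ]


example = """
x = 1
if x > 0:
    pass
if x < 0:
    x = -x
"""
print(redundant_ifs(example))  # [3]
```

The QL version expresses the same condition declaratively, and CodeQL evaluates it over the extracted database instead of walking a syntax tree by hand.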

\section{Overview of \approachName{}}
\label{sec:overview}

Figure~\ref{fig:method_overview} shows an overview of \approachName{}, which takes a quantum program written in Qiskit as its input, and then outputs warnings about quantum-specific programming problems in this program.
\approachName{} is a static analysis framework realized in three stages:
\begin{enumerate}
 \item The existing static analysis engine CodeQL (Section~\ref{sec:background}) extracts general information about Python code, such as control flow paths, data flow facts, and how to resolve imports.
 \item The core of our approach, called \core{}, represents the behavior of the quantum program using a set of reusable quantum programming abstractions, such as qubits, gates, and circuits.
 Section~\ref{sec:quantum_abstractions} describes these abstractions in detail.
 The key benefit provided by this stage is to lift the program representation from a large and diverse set of Python constructs and Qiskit APIs into a smaller set of reusable abstractions.
 \item An extensible set of analyses builds on the abstractions to identify programming problems.
 Each analysis is formulated as a query over facts provided by CodeQL and \core{}, which allows for writing concise yet precise analyses.
 Section~\ref{sec:target_bug_patterns} describes \nPatternDetectors{} analyses in detail.
\end{enumerate}

Stages~2 and~3 of \approachName{} are the main technical contributions of our work.
As Qiskit is a Python library, building on top of CodeQL in Stage~1 allows us to reuse its abilities at reasoning about Python programs.
At the same time, CodeQL does not have any knowledge of quantum programming, which is why we introduce \core{} in Stage~2.

\section{Quantum Abstractions}
\label{sec:quantum_abstractions}

\core{} provides a set of abstractions that represent concepts commonly found in quantum programs.
The motivation for introducing these abstractions is that quantum programming platforms, such as Qiskit, typically offer a wide range of APIs to express quantum computations.
An alternative to \core{} would be to define analyses directly w.r.t.\ these APIs, which would require each analysis to consider the diversity of the Python language and the Qiskit APIs.
As an example to illustrate this diversity, consider how a program may refer to qubits.
First, there are multiple Python constructs for this purpose, including a single integer literal, e.g., \code{1}, a single integer variable, e.g., \code{qubit\_idx}, a sequence of integer literals or integer variables, e.g., \code{[0, 1, 2]}, and expressions that retrieve a value from a variable that holds a qubit register, e.g., \code{qreg[2]}.
Second, all the above references to qubits may occur in various code locations.
For example, Qiskit offers over 50 functions to add different kinds of gates to a circuit.
Each of these functions expects references to qubits at one or more argument positions.
Instead of considering the full diversity of Python and Qiskit in each analysis, \core{} lifts quantum programs into more general abstractions.
These abstractions enable us to write concise analyses that reason about quantum computing concepts instead of a low-level API that implements these concepts.
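The diversity of qubit references listed above can be illustrated with a small normalizer built on Python's \code{ast} module. It is a sketch under stated assumptions and not how \core{} is implemented (\core{} is written in QL on top of CodeQL's data flow): the function name, the \code{int\_vars} map of known integer variables, and the \code{(register, index)} output format are all hypothetical.

```python
import ast


def resolve_qubit_refs(expr_source, int_vars=None):
    """Map a qubit-reference expression to a list of (register, index) pairs.

    A register of None means "index into the circuit itself". The result is
    None when some part of the expression cannot be resolved statically.
    """
    int_vars = int_vars or {}
    node = ast.parse(expr_source, mode="eval").body

    def as_int(n):
        if isinstance(n, ast.Constant) and type(n.value) is int:
            return n.value                      # integer literal, e.g. 1
        if isinstance(n, ast.Name) and n.id in int_vars:
            return int_vars[n.id]               # known integer variable
        return None

    def resolve(n):
        if isinstance(n, (ast.List, ast.Tuple)):  # e.g. [0, 1, 2]
            parts = [resolve(e) for e in n.elts]
            if any(p is None for p in parts):
                return None
            return [ref for p in parts for ref in p]
        if isinstance(n, ast.Subscript) and isinstance(n.value, ast.Name):
            sl = n.slice
            if sl.__class__.__name__ == "Index":  # wrapper used before Python 3.9
                sl = sl.value
            idx = as_int(sl)                      # e.g. qreg[2]
            return None if idx is None else [(n.value.id, idx)]
        idx = as_int(n)
        return None if idx is None else [(None, idx)]

    return resolve(node)


print(resolve_qubit_refs("1"))                            # [(None, 1)]
print(resolve_qubit_refs("qubit_idx", {"qubit_idx": 2}))  # [(None, 2)]
print(resolve_qubit_refs("[0, 1, 2]"))                    # [(None, 0), (None, 1), (None, 2)]
print(resolve_qubit_refs("qreg[2]"))                      # [('qreg', 2)]
print(resolve_qubit_refs("idx"))                          # None
```

Once all four syntactic forms map to the same representation, an analysis no longer has to distinguish between them.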

\begin{figure}
 \centering
 \includegraphics[width=0.95\textwidth]{images/modelling_v2.pdf}
 \caption{Examples of \approachName{}’s abstractions and how they are represented in the program and the circuit.}
 \label{fig:abstraction_example}
\end{figure}

Figure~\ref{fig:abstraction_example} gives an overview of the abstractions offered by \core{} and illustrates through an example how they relate to the Qiskit API.
Each abstraction corresponds to a CodeQL class that offers a number of predicates.
These predicates express relationships between abstractions, e.g., to reason about the quantum register that stores the qubit a gate operates on, or properties of the specific abstraction, e.g., the size of a register.
Although some of our abstractions borrow names used in the quantum circuit model~\cite{deutschQuantumComputationalNetworks1997, nielsenQuantumComputationQuantum2002} and in Qiskit, the abstractions represent more general concepts than simple types or API calls.
For example, as illustrated above, a single qubit or gate can be referred to in various ways, but all these references map to the same abstraction.
To clearly distinguish between the abstractions and the Qiskit API, we show abstractions in italics, e.g., \emph{QuantumCircuit}, and Python/Qiskit code in monospace, e.g., \code{QuantumCircuit(qreg, creg)}.

\subsection{Registers}

Registers are the central storage facility of quantum programs.
Lines~2 to~7 of Figure~\ref{fig:abstraction_example} illustrate different ways of creating registers and associating them with circuits.
For example, registers can be created explicitly by calling the respective constructors (lines~2 to~5), but also implicitly by passing the size of a register to a newly created circuit (line~8).
Reasoning about the size of a register requires us to identify the integer constant that determines the register size, as illustrated in line~1.
\approachName{} models registers and their relationships with the remaining program using two abstractions: \emph{QuantumRegister} and \emph{ClassicalRegister} for storing qubits and bits, respectively.
Each register abstraction is defined by a data flow node corresponding to the register’s allocation site in the source code, e.g., a call to the constructor \code{QuantumRegister}.
The abstraction offers predicates to retrieve the size of the register and the circuit it is associated with.
For example, the \emph{getSize} predicate of the \emph{ClassicalRegister} created at line~3 returns \code{3} because \approachName{} tracks the value of the variable \code{n} (line~1) and its relationship with the register.
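The idea behind \emph{getSize}, i.e., following an integer constant from its definition to a register's allocation site, can be sketched as follows. This is a strong simplification and not \approachName{}'s implementation: it only handles straight-line, module-level assignments, whereas \approachName{} relies on CodeQL's data flow analysis. The function name and the example program are assumptions; the program is only parsed, never executed, so Qiskit is not required.

```python
import ast

REGISTER_CONSTRUCTORS = ("QuantumRegister", "ClassicalRegister")


def register_sizes(source):
    """Resolve the sizes of registers allocated at module level."""
    constants = {}  # variable name -> integer literal assigned to it
    sizes = {}      # register variable -> (constructor name, size or None)
    for stmt in ast.parse(source).body:
        if not (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            continue
        target, value = stmt.targets[0].id, stmt.value
        if isinstance(value, ast.Constant) and type(value.value) is int:
            constants[target] = value.value
        elif (isinstance(value, ast.Call) and isinstance(value.func, ast.Name)
              and value.func.id in REGISTER_CONSTRUCTORS and value.args):
            arg = value.args[0]
            if isinstance(arg, ast.Constant) and type(arg.value) is int:
                size = arg.value
            elif isinstance(arg, ast.Name):
                size = constants.get(arg.id)  # None if not statically known
            else:
                size = None                   # e.g. a size read from user input
            sizes[target] = (value.func.id, size)
    return sizes


program = """
n = 3
qreg = QuantumRegister(n, "q")
creg = ClassicalRegister(n, "c")
extra = QuantumRegister(int(input()))
"""
print(register_sizes(program))
```

For the example program, both \code{qreg} and \code{creg} resolve to size 3 via the variable \code{n}, while the size of \code{extra} stays unknown.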

\subsection{Quantum Circuits}

A quantum circuit describes a sequence of instructions that operate on data stored in registers.
Qiskit provides several APIs for creating circuits, composing larger circuits from smaller ones, and for associating circuits with other program elements, e.g., registers and gates.
For example, lines~6 and~8 of Figure~\ref{fig:abstraction_example} create two circuits and then add one as a sub-circuit into the other.

\approachName{} models circuits using the \textit{QuantumCircuit} abstraction.
The predicates offered by this abstraction include \emph{isSubcircuitOf}, which allows for checking whether a circuit is a sub-circuit of another one.
The \emph{getNumberOfQubits} and \emph{getNumberOfClassicalBits} predicates track the size of each circuit, even when it uses multiple registers, such as in line~6, or when the registers are added later in the code, such as in line~7.
Finally, \emph{getAGate} and \emph{getAQuantumRegister} express the relationship between a circuit and any of its gates or quantum registers.

The \textit{QuantumCircuit} abstraction is lifted from different source-code constructs: (1) calls to the \code{QuantumCircuit} constructor, (2) any call to a user-defined function returning a quantum circuit, (3) any built-in constructor of parametrized circuits, such as \code{EfficientSU2}, (4) any unknown object created via an external function call that uses methods specific to quantum circuits, such as \code{to\_gate}, \code{to\_instruction}, \code{assign\_parameters}, (5) any copy of an existing circuit created by calling \code{copy} on another circuit, and (6) a call to \code{transpile}, which returns a new version of the circuit that is compatible with the instruction set and connectivity of the target quantum computer.
All of these cases are modeled as subclasses of the \textit{QuantumCircuit} abstraction, giving analyses the option to refer to specific kinds of circuits.
For example, the \textit{TranspiledCircuit} abstraction can be used to enforce rules about circuits once they have been transpiled.

\subsection{Subcircuits and Composition}
To create larger circuits from smaller ones, quantum programming platforms offer APIs to compose circuits.
\approachName{} models how different circuits are composed with each other via the \emph{isSubcircuitOf} predicate.
The predicate \emph{isSubcircuitOf} tracks via dataflow analysis all those quantum circuits that flow into the \code{append} and \code{compose} methods of a quantum circuit object, and it identifies the object on which the method is called as the \emph{parent} circuit and the circuit passed as the argument as its \emph{subcircuit}.
In addition, \core{} models two other cases where a circuit is not yet explicitly composed into another one, but is likely to be a subcircuit: (a) when the circuit is returned by a function and (b) when an entire circuit is converted into an atomic instruction or gate via the \code{to\_instruction} or \code{to\_gate} methods.

\subsection{Quantum Operators: Reversible and Irreversible}

Instructions in quantum programs are expressed via quantum operators being added to a circuit.
Intuitively, a quantum operator is any function that manipulates the quantum state by acting on the values stored in one or more qubits.
There are two main types of quantum operators: reversible and irreversible.
Because irreversible operators, such as measurements, destroy the quantum state, they are typically placed at the end of a quantum program.
\core{} represents quantum operators using the \emph{QuantumOperator} abstraction, which is a superclass of the \emph{Gate}, \emph{Measurement}, and \emph{Reset} abstractions.

The \emph{Gate} abstraction represents reversible quantum operators, such as the Hadamard gates used in lines~12 and~13 of Figure~\ref{fig:abstraction_example}.
Overall, there are several dozen different APIs for creating gates and many more for connecting gates with other parts of a quantum program, e.g., the qubits a gate operates on.
To enable analyses to reason about gates without repeatedly listing all gate-related APIs, \core{} offers the \emph{Gate} abstraction, which captures all gates and their properties.
The abstraction provides predicates to reason about a gate’s relations to other program elements.
For example, the \emph{getQuantumCircuit} predicate relates a gate to the circuit it is added to, and the \emph{getATargetQubit} predicate allows for reasoning about the qubit a gate operates on.
For illustration, consider the control-not gate created at line~14 of Figure~\ref{fig:abstraction_example}.
The \emph{getATargetQubit} predicate returns the fact that this gate operates on the qubits stored at indices 0 and 1 of the quantum register created at line~4.

To represent irreversible quantum operators, \core{} offers multiple abstractions: \emph{Measurement} and \emph{MeasurementAll} to represent measurements of a single qubit and all qubits in a register, respectively; \emph{Reset} for operations that reset a qubit to the $|0\rangle$ state; and \emph{Initialize} for operations that initialize one or more qubits with a vector of complex numbers.
In Figure~\ref{fig:abstraction_example}, \approachName{} creates measurement abstractions for the code at lines~18 and~19.

\subsection{Uses of Qubits and Classical Bits}

Quantum information stored in qubits is typically used and manipulated by multiple quantum operators.
At the level of the Qiskit API, uses of qubits come in various forms.
For example, as illustrated above, a program may refer to a qubit via an integer passed as an argument to a gate operation or via an index into an array that represents a register.
Reasoning about qubit uses is complicated by the fact that different gate operations use different parameter indices to refer to the qubits they operate on.
For example, when adding a controlled unitary gate to a circuit via \code{qc.cu(1, 2, 3, 4, 5, 6)}, only the last two arguments refer to qubits, whereas the others are parameters of the gate.
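One way to picture this bookkeeping is a table that records, for each gate method, which positional arguments refer to qubits. The sketch below is illustrative and not \core{}'s actual encoding: the table name and helper are hypothetical, and the table covers only a handful of gates, with argument positions following the positional signatures of the corresponding \code{QuantumCircuit} methods in Qiskit.

```python
# Positions of the qubit arguments for a few gate methods.
QUBIT_ARG_SLICES = {
    "h": slice(0, 1),   # h(qubit)
    "cx": slice(0, 2),  # cx(control_qubit, target_qubit)
    "rx": slice(1, 2),  # rx(theta, qubit)
    "ry": slice(1, 2),  # ry(theta, qubit)
    "cu": slice(4, 6),  # cu(theta, phi, lam, gamma, control_qubit, target_qubit)
}


def qubit_args(gate_name, args):
    """Return the positional arguments that refer to qubits, or None if the gate is unknown."""
    positions = QUBIT_ARG_SLICES.get(gate_name)
    return None if positions is None else list(args[positions])


print(qubit_args("cu", (1, 2, 3, 4, 5, 6)))  # [5, 6]
print(qubit_args("ry", (0.5, 0)))            # [0]
print(qubit_args("my_custom_gate", (0,)))    # None
```

Keeping this mapping in one place means that individual analyses never need to know the argument layout of a particular gate.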

To help analyses in precisely reasoning about qubit uses, \approachName{} offers the \emph{BitUse} abstraction, split into its quantum and classical subclasses: \emph{QubitUse} and \emph{ClbitUse}.
The \emph{QubitUse} abstraction uniquely identifies a used qubit based on the register the qubit is stored in and the integer index of the qubit in this register.
The \emph{BitUse} abstraction offers predicates to connect to other abstractions, e.g., for obtaining the gate where the (qu)bit is used, the register where the (qu)bit is stored, and the corresponding circuit.
The \emph{getAnIndex} and \emph{getAnAbsoluteIndex} predicates return the position of the qubit in the register and the position of the qubit in the circuit, respectively.
The latter predicate keeps track of all the registers added to the circuit before the currently accessed register and shifts the index accordingly.
For our running example in Figure~\ref{fig:abstraction_example}, the \emph{QubitUse} abstraction represents each of the many references to qubits, such as the use of qubit~1 of \code{qregA} at line~13 or the use of qubit~0 of register \code{qregB} at line~14.
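The index shift performed by \emph{getAnAbsoluteIndex} amounts to adding the sizes of all registers that precede the accessed one. A minimal sketch, assuming registers are given as \code{(name, size)} pairs in the order they were added to the circuit; the function name and the register sizes in the example are hypothetical and not taken from Figure~\ref{fig:abstraction_example}:

```python
def absolute_index(registers, register_name, index):
    """Position of a qubit in the circuit, given its register and index in that register."""
    offset = 0
    for name, size in registers:
        if name == register_name:
            if not 0 <= index < size:
                raise IndexError(f"{register_name}[{index}] is out of range")
            return offset + index
        offset += size  # skip over all qubits of the preceding register
    raise KeyError(register_name)


# Hypothetical circuit with two quantum registers, added in this order.
registers = [("qregA", 2), ("qregB", 3)]
print(absolute_index(registers, "qregA", 1))  # 1
print(absolute_index(registers, "qregB", 0))  # 2
```

With absolute indices, a use such as \code{qregB[0]} and a plain integer reference to the same circuit position can be recognized as the same qubit.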

\subsection{Quantum Data Flow}
\label{sec:quantum data flow}

An important property of quantum computations, which does not have a direct correspondence in classical computing, is the order in which quantum operators are applied to qubits.
However, the order of adding two quantum operators to a circuit does not necessarily imply that the operator added first is executed before another operator added to the circuit later.
Instead, the order of applying operations depends on the qubits the operators act on.
\core{} derives the ordering of two operators if and only if they act on the same qubit.
To this end, the approach uses the \emph{QubitUse} abstraction described above to check if the qubits that two operators $op_1$ and $op_2$ manipulate are the same, and if so, derives an ordering relation based on the order in which $op_1$ and $op_2$ are added into the circuit.
Following this reasoning for all quantum operators yields a partial order between quantum operators, which we call \emph{quantum data flow}, since it describes how data stored in qubits flows between quantum operators.
\approachName{} exposes this partial order to analyses via the \emph{mayFollow} predicate that relates two quantum operators if and only if the two are part of the same circuit and the first may follow the second according to the derived quantum data flow.

For example, consider the control-not gate (line~14) and the measurement (line~18) in Figure~\ref{fig:abstraction_example}.
Since both operate on the same qubit belonging to the \code{qregB} register, \approachName{} exposes their ordering in the \emph{mayFollow} predicate.
In contrast, the analysis does not claim any order between the gates created at lines~15 and~16, as they act on different registers.

In addition to the \emph{mayFollow} predicate, \core{} offers: (i) \emph{mayFollowDirectly}, a variant where the two quantum operators are applied on the same qubit directly after one another, without any other operation in between, and (ii) \emph{sortedInOrder}, which checks whether three quantum operators may appear in the given order according to the quantum data flow.
Note that both are defined only for unambiguous quantum operator additions, i.e., when the approach knows for sure that the operators act on the same qubit.
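The ordering rule can be sketched in a few lines of plain Python. This is an illustrative model, not the QL predicates themselves: operators are \code{(name, qubits)} pairs listed in the order they are added to a single circuit, the function names and argument order are assumptions, and the sketch relates only operators that share a qubit, without computing transitive orderings.

```python
def may_follow(ops, later, earlier):
    """True if ops[later] may follow ops[earlier]: added later and sharing a qubit."""
    shared = set(ops[earlier][1]) & set(ops[later][1])
    return earlier < later and bool(shared)


def may_follow_directly(ops, later, earlier):
    """Like may_follow, with no other operator in between on some shared qubit."""
    if not may_follow(ops, later, earlier):
        return False
    for q in set(ops[earlier][1]) & set(ops[later][1]):
        if not any(q in ops[k][1] for k in range(earlier + 1, later)):
            return True
    return False


# Hypothetical circuit; the trailing comments give each operator's position.
ops = [
    ("h", (0,)),        # 0
    ("h", (1,)),        # 1
    ("cx", (0, 1)),     # 2
    ("x", (2,)),        # 3
    ("measure", (1,)),  # 4
]
print(may_follow(ops, 2, 0))           # True: both act on qubit 0
print(may_follow(ops, 3, 0))           # False: disjoint qubits, no order is derived
print(may_follow_directly(ops, 4, 1))  # False: the cx on qubit 1 sits in between
print(may_follow_directly(ops, 4, 2))  # True
```

An operation-after-measurement check then reduces to asking whether some gate may follow some measurement.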

\subsection{Access to Low-Level Constructs}

Although our abstractions help in writing concise analyses, the analysis developer is not limited to those, but can refer to lower-level constructs when necessary.
For example, an analysis could restrict a \emph{Gate} to refer to a specific kind of gate or reason about any other specific parameter of an API, if needed.
Ultimately, the main benefit of \approachName{} is to avoid repeating the same low-level details across different analyses by capturing commonly required abstractions.

\subsection{Soundness and Precision: Unknown Quantum Operators}

Following the philosophy of existing linters and many other static analyzers~\cite{Livshits2015}, \approachName{} aims neither at full soundness nor at full precision.
Instead, the approach offers a pragmatic compromise between the two, with the goal of finding as many programming problems as possible without overwhelming developers with spurious warnings.

To prevent analyses from raising incorrect warnings due to general limitations of static analysis, \core{} explicitly models unknown information, which gives an analysis the option to (not) draw conclusions based on such information.
Specifically, \core{} exposes a \emph{QubitUse} only if the approach can unambiguously resolve both the register and the index.
For example, if a program applies a Hadamard gate with \code{qc.h(idx)} where \code{idx} is a variable obtained from user input, then \approachName{} leaves the qubit access unresolved so that analyses do not draw inaccurate conclusions.
To the same end, \core{} exposes an \emph{UnknownQuantumOperator} abstraction when the analysis framework cannot resolve some of the qubits used in that operator, e.g., \code{qc.cx(0, i)}, where the value of \code{i} is not statically known.
\approachName{} also considers functions that may extend a circuit as an \emph{UnknownQuantumOperator}.
We identify those as either: (i) a call to an unknown function, where a \emph{QuantumCircuit} flows in as an argument, or (ii) a call to a function that directly modifies the \emph{QuantumCircuit} by referring to it via a global variable.

As a preprocessing step to reduce the number of \emph{UnknownQuantumOperators}, when reasoning about programs with loops, the framework unrolls loops that have a statically known number of iterations.
Such loops are relatively frequent in quantum programs, e.g., when the programmer applies the same gates multiple times and specifies the loop bound with \code{range(1, 3)} or \code{range(4)}.
We limit unrolling to loops with at most ten iterations as a tradeoff between better modeling of the program and the risk of introducing too many new program elements, thus affecting the scalability of the approach.
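The decision of whether a loop can be unrolled can be sketched with Python's \code{ast} module. This is an assumption-laden illustration and not \approachName{}'s implementation: it only recognizes \code{range} calls whose arguments are non-negative integer literals, and the names \code{MAX\_UNROLL}, \code{static\_iterations}, and \code{first\_loop} are hypothetical.

```python
import ast

MAX_UNROLL = 10  # iteration limit stated in the text


def static_iterations(loop):
    """Return the values a `for ... in range(...)` loop iterates over, or None.

    None means the bounds are not integer literals or the loop has more
    than MAX_UNROLL iterations.
    """
    call = loop.iter
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            and call.func.id == "range" and not call.keywords
            and 1 <= len(call.args) <= 3):
        return None
    bounds = []
    for arg in call.args:
        if not (isinstance(arg, ast.Constant) and type(arg.value) is int):
            return None  # bound is not statically known
        bounds.append(arg.value)
    if len(bounds) == 3 and bounds[2] == 0:
        return None      # range() rejects a zero step
    values = range(*bounds)
    return list(values) if len(values) <= MAX_UNROLL else None


def first_loop(source):
    """Parse `source` and return its first for loop."""
    return next(n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.For))


print(static_iterations(first_loop("for i in range(1, 3):\n    qc.h(i)")))  # [1, 2]
print(static_iterations(first_loop("for i in range(4):\n    qc.h(i)")))     # [0, 1, 2, 3]
print(static_iterations(first_loop("for i in range(n):\n    qc.h(i)")))     # None
print(static_iterations(first_loop("for i in range(50):\n    qc.h(i)")))    # None
```

When the iteration values are known, each loop body can be duplicated with the loop variable replaced by a concrete integer, so that \code{qc.h(i)} becomes a resolvable qubit use.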

\section{Analyses for Finding Quantum Programming Problems}
\label{sec:target_bug_patterns}

\begin{table}
 \caption{Analyses for finding quantum programming problems.}
 \newcommand{\vmyrow}{\vspace*{1mm}}
 \label{tab:checkers}
 \small
 \setlength{\tabcolsep}{2pt}
 \begin{tabular}{@{}lp{33em}r@{}}
 \toprule
 \vmyrow{}Analysis name & Description & Origin \\
 \midrule
 \rowcolor{light-gray} \multicolumn{3}{l}{\vmyrow{} Measurement-related and gate-related problems:}\\
 DoubleMeas & Two measurements measure the same qubit state one after the other. & \cite{zhaoIdentifyingBugPatterns2021} \\
 OpAfterMeas & A gate operates on a qubit after it has been measured. & \cite{zhaoIdentifyingBugPatterns2021} \\
 MeasAllAbuse & Measurement results are stored in an implicitly created new register, even though another classical register already exists. & \cite{zhaoIdentifyingBugPatterns2021} \\
 CondWoMeas & Conditional gate without measurement of the associated register. & \cite{kaulUniformRepresentationClassical2023} \\
 \vmyrow{}ConstClasBit & A qubit is measured but has not been transformed. & \cite{kaulUniformRepresentationClassical2023} \\
 \rowcolor{light-gray} \multicolumn{3}{l}{\vmyrow{} Resource allocation problems:}\\
 InsuffClasReg & Classical bits do not suffice to measure all qubits. & \cite{zhaoIdentifyingBugPatterns2021} \\
 \vmyrow{}OversizedCircuit & The quantum register contains unused qubits. & \cite{user19571QuestionRemoveInactive2022} \\
 \rowcolor{light-gray} \multicolumn{3}{l}{\vmyrow{} Implicit API constraints:}\\
 GhostCompose & Composing two circuits without using the resulting composed circuit. & \cite{egretta.thulaAnswerWhyDoes2023} \\
 OpAfterOpt & A gate is added after transpilation. & \cite{OptimizeSwapBeforeMeasurePassDrops} \\
 OldIdenGate & Using a now-removed API to create an identity gate. & \cite{zhaoQCheckerDetectingBugs2023} \\
 \bottomrule
 \end{tabular}
\end{table}

To illustrate the usefulness of \core{}, the following presents an extensible set of \nPatternDetectors{} analyses built on top of the abstractions provided by the framework.

\subsection{Methodology: Collecting a Catalogue of Bug Patterns}
To identify a set of quantum programming problems that can be detected by static analysis, we search through existing literature and developer discussions.
We collect literature that studies programming issues in quantum programs by querying both the ACM Digital Library and IEEE Xplore, looking for any work that contains both keywords ``quantum'' and ``bug'' in its metadata.
We apply a cutoff on the search results and inspect the top 50 results of each database, sorted by relevance, which yields 100 candidate papers.
Next, we exclude papers that match one of the following:
they are duplicate papers (\miniSurveyExclusionReasonDuplicate{});
they focus on hardware faults (\miniSurveyExclusionReasonHardwareRelated{});
they discuss only quantum computing concepts (\miniSurveyExclusionReasonOnlyMentioningQuantum{}) or give an overview of the field (\miniSurveyExclusionReasonFieldOverview{});
they are theory papers (\miniSurveyExclusionReasonTheoryPaper{});
they only use quantum computing methods and are not focused on bug detection (\miniSurveyExclusionReasonOnlyUsingQuantum{});
they focus on quantum-related software, such as quantum computing platforms, but not on quantum programs (\miniSurveyExclusionReasonNotOnPrograms{});
they do not provide a list of bugs or issues (\miniSurveyExclusionReasonNotBugList{});
or they require a specification of each quantum program, which typically is not available to a linter (\miniSurveyExclusionReasonSpecRequired{}).
We also exclude the dataset paper of Bugs4Q~\cite{zhaoBugs4QBenchmarkExisting2023}, because we use it as a benchmark in our evaluation.
After this filtering, we are left with \miniSurveyIncludedWorks{} papers that contain a list of programming issues or bug patterns in quantum programs.
To complement the list of bug patterns from the literature, we collect \miniSurveyPatternsFromDevDiscussion{} additional patterns that we identified in developer discussions on StackOverflow and in GitHub issues using an approach similar to prior work~\cite{luoComprehensiveStudyBug2022a}.

Inspecting the selected papers and discussions, we find a total of \miniSurveyTotalBugPatternsNoDuplicates{} unique bug patterns for which we have clear examples of the problem to be identified.
We exclude patterns for the following reasons:
implementing an accurate analysis requires knowledge of the exact hardware the program will run on (\miniSurveyBugPatternRequireHardwareKnowledge{});
they require a specification for each quantum program, e.g., describing the ``correct'' gate, which is generally not available to a linter (\miniSurveyBugPatternSpecRequired{});
they require runtime information (\miniSurveyBugPatternRuntimeInfo{});
they require abstractions not available in LintQ (\miniSurveyBugPatternLintqModelingLimitations{});
the described pattern is not an issue anymore in the current Qiskit release (\miniSurveyBugPatternNotAnIssue{});
or they look for rare combinations of APIs that hardly appear in our large evaluation dataset (\miniSurveyBugPatternTooApiSpecific{}).
After this filtering, we obtain a list of \miniSurveyTotalAnalysesImplemented{} analyses to implement in \approachName{}, which are listed in Table~\ref{tab:checkers}.
Note that, although the ultimate goal of LintQ is to find bugs, we acknowledge that some of the patterns could also be considered code smells or anti-patterns.
The following describes each analysis in detail by first introducing the problem and then describing how to query for instances of the problem using the \approachName{} abstractions.

\subsection{Measurement-Related and Gate-Related Problems}

\textbf{Double measurement.}
Any two subsequent measurements on the same qubit produce the same classical result, making the second measurement not only redundant but also a possible sign of unintended behavior or a misunderstanding of the properties of quantum information.
Figure~\ref{fig:bug_and_query_redundant_measurement} (left) shows an example of the problem.
\textit{Analysis}: The query to spot this problem is shown in Figure~\ref{fig:bug_and_query_redundant_measurement} (right).
It searches for two consecutive measurements of the same qubit by checking whether the two operations are directly adjacent w.r.t.\ the order derived from quantum data flow (Section~\ref{sec:quantum data flow}).
Note that simply relying on the integer index to spot two gates operating on the same qubit is ineffective, since the two gates might refer to the same position in two different registers.
To avoid this problem, the analysis relies on the \textit{mayFollowDirectly} predicate provided by \core{}, which leads to a simple and concise analysis.
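
To illustrate why a qubit must be identified by its register together with its index, the following Python sketch pairs each operation with the next operation on the same qubit. It is a toy model under strong assumptions: it handles only straight-line code, whereas the actual \textit{mayFollowDirectly} predicate is derived from quantum data flow, and all names in the sketch are hypothetical.

```python
from typing import NamedTuple


class Op(NamedTuple):
    kind: str      # "measure" or "gate"
    register: str  # name of the quantum register
    index: int     # position of the qubit within that register


def directly_following_pairs(ops):
    """Yield (a, b) where b is the next operation after a on the same qubit."""
    last_on_qubit = {}
    for op in ops:
        qubit = (op.register, op.index)  # identity is register AND index
        if qubit in last_on_qubit:
            yield last_on_qubit[qubit], op
        last_on_qubit[qubit] = op


def double_measurements(ops):
    """Second measurement of every directly adjacent measurement pair."""
    return [b for a, b in directly_following_pairs(ops)
            if a.kind == "measure" and b.kind == "measure"]


ops = [
    Op("measure", "qr_a", 0),
    Op("measure", "qr_b", 0),  # same index, different register: unrelated qubit
    Op("measure", "qr_a", 0),  # next operation on qr_a[0] is again a measurement
]
print(double_measurements(ops))
```

Only the third operation is reported: the measurement on the other register neither triggers a warning nor hides the repeated measurement.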

273

\begin{figure}[t]
 \begin{minipage}[b]{.93\textwidth}
 \begin{minipage}[b]{.44\textwidth}
 \begin{lstlisting}[language=Python, numbers=left]
circuit = QuantumCircuit(3, 3)
circuit.ccx(0, 1, 2)
circuit.measure(0, 0)
circuit.measure(2, 2)
# Problem: Qubit 0 already measured
circuit.measure(0, 1)\end{lstlisting}
 \end{minipage}
 \hfill
 \begin{minipage}[b]{.5\textwidth}
 \begin{lstlisting}[language=codeql, numbers=left]
from Measurement m1, Measurement m2, int q
where mayFollowDirectly(m1, m2, q)
select m2, "Redundant measurement on the same qubit"\end{lstlisting}
\end{minipage}
\end{minipage}
\caption{Redundant measurement example (left) and its analysis (right).}
\label{fig:bug_and_query_redundant_measurement}
\end{figure}

\textbf{Operation after measurement.} When a qubit is measured, its quantum state collapses to a classical value, either 0 or 1.
The measurement operation thus effectively destroys the quantum state.
Any subsequent operation after the measurement acts on a destroyed quantum state, which is unlikely to be the intended behavior.
Figure~\ref{fig:bug_and_query_op_after_measurement} (left) shows an example of the problem, where qubit 0 is measured and then a Pauli-Z gate is applied to it.
Note how this case differs from a redundant measurement, where the clash is between two measurements of the same qubit, whereas here it is between a measurement and a gate.
\textit{Analysis}: The query searches for a quantum gate that is applied to a qubit that has just been measured.
Note that a trivial check of the usage of API calls that syntactically happen one after the other is insufficient, e.g., because the control flow may be more complex and the two calls could refer to two different qubits.
Instead, our query ensures that the two operations are both applied to the same qubit belonging to the same register thanks to the \code{mayFollowDirectly} predicate, as shown in Figure~\ref{fig:bug_and_query_op_after_measurement} (right).
The query excludes cases where applying the gate depends on a classical bit that resulted from a measurement, since this is a common pattern in quantum programs and the measurement preceding the gate could be required.

\begin{figure}[t]
 \begin{minipage}[b]{.93\textwidth}
 \begin{minipage}[b]{.5\textwidth}
 \begin{lstlisting}[language=Python, numbers=left]
qc = QuantumCircuit(2, 2)
qc.h(1)
qc.cx(1, 0)
qc.measure(0, 0)
qc.measure(1, 1)
qc.z(0) # Problem: Qubit 0 has collapsed
qc.measure(0, 0)\end{lstlisting}
 \end{minipage}
 \hfill
 \begin{minipage}[b]{.43\textwidth}
 \begin{lstlisting}[language=codeql, numbers=left]
from Measurement m, Gate g, int q
where mayFollowDirectly(m, g, q)
 and not g.isConditional()
select g, "Gate after measurement on qubit " + q\end{lstlisting}
 \end{minipage}
\end{minipage}
 \caption{Operation after measurement example (left) and its analysis (right).}
 \label{fig:bug_and_query_op_after_measurement}
\end{figure}

\begin{figure}[t]
 \begin{minipage}[b]{.93\textwidth}
 \begin{minipage}[b]{.43\textwidth}
 \begin{lstlisting}[language=codeql, numbers=left]
from
 QuantumCircuit c, MeasurementAll m
where c = m.getQuantumCircuit() and
 c.getNumberOfClassicalBits() > 0
 and m.createsNewRegister()
select m, "measure_all() with classical register"\end{lstlisting}
 \end{minipage}
\end{minipage}
 \caption{Measure all abuse example (left) and its analysis (right).}
 \label{fig:bug_and_query_measure_all}
\end{figure}

\subsection{Resource Allocation Problems}

The current generation of quantum computers is still limited in terms of qubits and gates, so resources must be managed carefully to avoid wasting them.
At the same time, enough resources must be allocated for the quantum state to evolve and to be measured correctly.

\textbf{Insufficient classical register.} This problem happens when we define a quantum program that uses more qubits than can be measured into the classical register allocated in the beginning.
For example, the problem arises when the developer allocates a classical register with only two bits and then works on three qubits, i.e., \code{QuantumCircuit(3, 2)}.
\textit{Analysis}: The query searches for circuits with the number of qubits greater than the number of classical bits.
The \textit{QuantumCircuit} abstraction is used to reason about the number of classical and quantum bits, but also to check with the predicate \textit{isSubCircuit} that the circuit is not used as a sub-circuit.
This is necessary to reduce false positives, since sub-circuits are often used legitimately without a classical register.
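
The core comparison behind this analysis can be sketched in a few lines of Python. The sketch is purely syntactic and is not the query used by \approachName{}: it only recognizes constructor calls with two integer literals, it omits the sub-circuit check described above, and the function name is hypothetical.

```python
import ast


def insufficient_classical_registers(source: str):
    """Line numbers of QuantumCircuit(n, m) calls with integer literals n > m."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "QuantumCircuit"
            and len(node.args) == 2
            and all(
                isinstance(a, ast.Constant) and type(a.value) is int
                for a in node.args
            )
        ):
            n_qubits, n_clbits = (a.value for a in node.args)
            if n_qubits > n_clbits:
                hits.append(node.lineno)
    return sorted(hits)


program = (
    "qc1 = QuantumCircuit(3, 2)\n"  # three qubits, two classical bits
    "qc2 = QuantumCircuit(2, 2)\n"  # enough classical bits
    "qc3 = QuantumCircuit(n, 1)\n"  # size unknown, so no conclusion is drawn
)
print(insufficient_classical_registers(program))  # [1]
```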

\textbf{Oversized circuit.} This problem happens when a program allocates a quantum register that is larger than the number of qubits actually used.
Given the high cost of implementing a single qubit in hardware, this issue implies a waste of resources.
The motivating example in Figure~\ref{fig:bug_example} shows an instance of this problem, where the program allocates a quantum register of size four but uses only three of the qubits.
To the best of our knowledge, this work is the first to describe this programming problem.
\textit{Analysis:} As shown in Figure~\ref{fig:bug_and_query_oversized_quantum_circuit} (bottom), the query scans all gates used in a quantum circuit and raises a warning if any of the slots in the quantum register is not used (line~\ref{line:not_for_each_qubit}).
To ensure precision, the query checks several situations where no warning should be reported, e.g., circuits with an unknown register size or an unknown gate, and circuits that have sub-circuits.

\begin{figure}[t]
 \centering
 \begin{minipage}{0.93\textwidth}
 \begin{lstlisting}[language=codeql, numbers=left, escapechar=\%]
from QuantumCircuitConstructor circ, int numQubits
where
  // the circuit has a number of qubits
  numQubits = circ.getNumberOfQubits() and numQubits > 0 and
  // there is one qubit position not accessed by any gate
  not exists(QubitUse bu, int i | i in [0 .. numQubits - 1] | %\label{line:not_for_each_qubit}%
    bu.getAnAbsoluteIndex() = i and bu.getAGate().getQuantumCircuit() = circ
  ) and
  // the circuit has no (unknown) sub-circuits
  not exists(SubCircuit sub | sub.getAParentCircuit() = circ) and %\label{line:not_usage_as_subcrcuit}%
  // there is no initialize op, because it can potentially touch all qubits
  not exists(Initialize init | init.getQuantumCircuit() = circ) and
  // all its registers have well-known size
  not exists(QuantumRegisterV2 reg | reg = circ.getAQuantumRegister() and not reg.hasKnownSize()) and %\label{line:not_unknown_reg_size}%
  // there are no unknown quantum operators
  not exists(UnknownQuantumOperator unkOp | unkOp.getQuantumCircuit() = circ) %\label{line:not_unknown_quantum_op}%
select circ, "Circuit has unused qubits"
 \end{lstlisting}
 \end{minipage}
 \caption{Analysis for the oversized circuit problem.}
 \label{fig:bug_and_query_oversized_quantum_circuit}
\end{figure}

Figure 2. Ghost composition problem. (ID: 3)

Figure 3. Examples of bugs found by LintQ.

2.3. RQ3: Precision and Recall

Table 1. Warnings and precision of the LintQ analyses (left) and the result of manual inspection (right).

Analysis Name	Tot. warnings	% Files	Precision	FP / NW / TP
DoubleMeas	39	0.36%	72.0%	4/3/18
OpAfterMeas	127	0.92%	100.0%	0/0/44
MeasAllAbuse	22	0.26%	94.1%	0/1/16
ConstClasBit	533	4.29%	48.3%	21/10/29
CondWoMeas	46	0.22%	100.0%	0/0/28
InsuffClasReg	3489	17.35%	34.8%	22/21/23
OversizedCircuit	378	3.01%	50.0%	16/13/29
OpAfterTransp	7	0.05%	100.0%	0/0/7
GhostCompose	12	0.09%	66.7%	0/4/8
OldIdenGate	46	0.37%	50.0%	11/3/14
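
The precision column of Table 1, and the aggregate figures discussed below, can be recomputed from the FP / NW / TP counts, taking precision as the share of true positives among all inspected warnings. The following Python sketch does so; the counts are copied from the table, while the variable and function names are our own.

```python
from statistics import median

# (FP, NW, TP) per analysis, copied from Table 1.
COUNTS = {
    "DoubleMeas": (4, 3, 18),
    "OpAfterMeas": (0, 0, 44),
    "MeasAllAbuse": (0, 1, 16),
    "ConstClasBit": (21, 10, 29),
    "CondWoMeas": (0, 0, 28),
    "InsuffClasReg": (22, 21, 23),
    "OversizedCircuit": (16, 13, 29),
    "OpAfterTransp": (0, 0, 7),
    "GhostCompose": (0, 4, 8),
    "OldIdenGate": (11, 3, 14),
}


def precision(fp: int, nw: int, tp: int) -> float:
    """True positives over all inspected warnings (noteworthy ones count against)."""
    return tp / (fp + nw + tp)


def pooled_precision(names) -> float:
    """Precision over the union of the inspected warnings of the given analyses."""
    tp = sum(COUNTS[n][2] for n in names)
    total = sum(sum(COUNTS[n]) for n in names)
    return tp / total


per_analysis = {name: precision(*c) for name, c in COUNTS.items()}
default_config = [n for n, p in per_analysis.items() if p > 0.5]

print(f"overall: {pooled_precision(COUNTS):.1%}")                # 62.6%
print(f"median:  {median(per_analysis.values()):.1%}")           # 69.3%
print(f"default: {pooled_precision(default_config):.1%} "
      f"from {len(default_config)} analyses")                    # 91.0% from 6 analyses
```

The two analyses at exactly 50.0% are excluded by the strict threshold, which is why six analyses remain.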

Precision. Because precision is crucial for practical adoption of static analyzers (Flanagan et al., 2002; Bessey et al., 2010; Johnson et al., 2013), we assess to what extent the analyses suffer from false positives. We manually inspect a random sample of ten warnings for each analysis, or all produced warnings if that number is lower than ten. Two of the authors, who are both experienced in static analysis and quantum computing, independently inspect the warnings and then discuss them to reach a consensus. After the initial inspection, a 70.1% agreement was reached, and after the discussion, all disagreements were resolved. Based on the agreement, we compile an annotation protocol and a single author proceeds to annotate more warnings up to reaching a statistically relevant sample with a confidence level of 90% and a margin of error of 10% for each of the ten analyses, similar to related work (Ghaleb et al., 2023). We categorize each warning into one of three categories. A true positive is a warning that reveals clearly incorrect behavior in the program. Such incorrect behavior may result in a program crash, in incorrect output, or in an unnecessary performance degradation. A noteworthy warning is a potential problem where the analysis correctly detects an instance of the targeted programming problem, but we cannot say with certainty whether the behavior is unexpected by the developer. Finally, a false positive is a warning reported despite the code being correct, which is typically caused by overly strong assumptions made by an analysis. Based on this classification, we compute precision as the percentage of true positives among all inspected warnings. That is, our notion of precision underestimates the true precision, as it includes noteworthy warnings in the denominator, but not in the numerator. Table 1 illustrates the results of our manual inspection. Each analysis identifies at least a few true positives. The median precision across all analyses is 69.3%.
The overall precision across all inspected warnings is 62.6% (216 true positives out of 345 inspected warnings). In practice, we recommend enabling those analyses that produce sufficiently precise results for the usage environment, and inspecting warnings by high-precision analyses first. For example, keeping only analyses with precision above 50% yields an overall precision of 91.0% from the remaining six analyses, which we recommend as the default configuration for LintQ.

Root causes of false positives. To better understand the reasons for false positives, we discuss representative cases in the following. False positives of the InsuffClasReg analysis happen when: (i) the circuit has more qubits than classical bits because of the presence of ancilla qubits, i.e., auxiliary storage used during a computation that does not need to be measured; (ii) the circuit is used as a submodule of a bigger circuit, thus it is not responsible for instantiating the classical bits. Better distinguishing between ancilla qubits and missed classical bits remains a challenge for future work. The OversizedCircuit analysis causes false positives when the circuit has sub-circuits generated via a function call (e.g., qc.append(QFT(3), qargs=[0, 1, 2])), which LintQ currently does not track, thus making the circuit appear underused.

Recall. Since we do not know the ground truth of all bugs in the 7,568 real-world quantum programs, we cannot compute the recall of LintQ on this large dataset. Instead, we use Bugs4Q (Zhao et al., 2021b; Luo et al., 2022; https://github.com/Z-928/Bugs4Q-Framework), an existing benchmark of 42 quantum bugs. We run LintQ on the 42 buggy files, manually inspect the warnings by using the same annotation procedure applied in the rest of this work, and then check how many of LintQ’s true positives match a known bug in Bugs4Q.
LintQ raises four true positive warnings (two by OpAfterMeas, one by MeasAllAbuse, and one by OldIdenGate) that correspond to three known bugs in Bugs4Q. Thus, the recall of LintQ is 7.1% (3/42). While this number may seem low, it is actually higher than the recall of popular static bug detectors on Defects4J (Just et al., 2014), which prior work has measured to be between 1% and 3%, depending on the bug detector (Habib and Pradel, 2018). Interestingly, LintQ also finds some problems in the benchmark code beyond the known bugs of the benchmark. For example, the InsuffClasReg analysis raises several true positive warnings because some circuits do not have any classical register, which is likely due to the fact that the examples are incomplete code snippets gathered from issues and forum questions.

2.4. RQ4: Comparison with Prior Work

Because quantum software engineering is a young field, there are only a few static analyzers aimed at quantum programs: (i) QSmell (Chen et al., 2023), which detects smells in quantum programs, (ii) QChecker (Zhao et al., 2023b), an AST-based static analysis tool for quantum programs, and (iii) QCPG (Kaul et al., 2023), a toolkit that extends Code Property Graphs (Yamaguchi et al., 2014) to analyze quantum code. Moreover, since the quantum programs we analyze are written in Python, we also compare with Pylint (Pyl, [n. d.]), a popular linter for Python designed for classical software. For QSmell we focus on their two static analysis-based detectors; for QChecker we use their eight AST-based checkers; for Pylint, we run the tool in its default configuration. Unfortunately, we had to drop QCPG from the comparison because the tool is not available yet. (Although the authors plan to release the tool, the repository states the code is undergoing export checks, as confirmed by email.)

Prior work applied to problems found by LintQ. We run all three competitors on the programs where LintQ detects one of the 216 true positives that we have manually confirmed (Section 2.3), and we check which of them the existing techniques detect. QSmell, QChecker, and Pylint raise 77, 200, and 8,627 warnings, respectively. We inspect each warning that is at the same line as one of the LintQ warnings, which corresponds to 0, 9, and 42 warnings, and we assess whether they refer to the same problem or coincidentally flag the same line. For the QChecker warnings, we found that seven problems are found by both QChecker and LintQ, two of QChecker’s warnings coincidentally flag the same line, and the remaining 207 are missed by QChecker and found only by LintQ.
For the Pylint warnings, 10 Pylint warnings correspond to the same issues flagged by OldIdenGate (9) and OversizedCircuit (1), 32 warnings coincidentally flag the same line, including conventions and coding style (29) and uses of missing APIs and arguments (3), and the remaining 174 warnings are missed by Pylint and found only by LintQ. Overall, prior work can find only 7.9% (17/216) of the true positives found by LintQ, overlooking the remaining 92.1% (199/216).

LintQ applied to problems found by prior work. We also study the opposite direction: How many of the warnings raised by the competitors are also raised by LintQ? To this end, we run each quantum-specific competitor and inspect a sample of up to ten warnings produced by each of their detectors. For QSmell and QChecker we inspect, respectively, 20 warnings from two detectors and 59 warnings from six detectors. For QSmell, we unfortunately found no true positives among the 20 inspected warnings. The first detector (NC) flags a file that has more run and execute calls, used to run a circuit on a simulator or real hardware, than bind_parameters calls, used to convert any parametric gate into its concrete version before execution. These warnings are false positives, for two reasons: (1) The warning is emitted also when there is a single execute call, which is normal for any circuit that uses only concrete gates. (2) The detector does not model the assign_parameters API, which is a legitimate alternative to calling bind_parameters. The second detector (LPQ) checks if there is a transpile API call without the argument initial_layout set, since passing that argument is a good practice when running on a real quantum computer. Again, all the inspected warnings are false positives, for two main reasons: (1) The initial_layout argument is missing in calls that target a simulator backend, which in practice has no hardware constraints to respect.
(2) The rule considers any transpile call, even those not belonging to Qiskit, which have no initial_layout argument. For QChecker, we found three true positives among the 59 inspected warnings. The true positives are raised by the Deprecated Order detector, which flags a deprecated usage of the iden gate, analogously to our OldIdenGate analysis. However, QChecker also reports many false positives, primarily because it warns about any function call that includes the substring iden, which also happens in functions unrelated to the quantum library. All three true positives found by QChecker are also detected by LintQ, because our OldIdenGate analysis targets the same kind of problem. The main difference is that LintQ raises fewer false positives, because it explicitly models gates, instead of relying on a text-based matching of the API name.
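
The difference between matching a substring of the API name and matching the call itself can be illustrated with a short Python sketch. It reproduces neither QChecker nor LintQ: both checks below are simplified stand-ins with hypothetical names, and the exact-name check is only a rough proxy for LintQ's gate modeling, which additionally ties the call to a quantum circuit.

```python
import ast


def called_names(source: str):
    """(line, function or method name) for every call in the source."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Attribute):
                names.append((node.lineno, func.attr))
            elif isinstance(func, ast.Name):
                names.append((node.lineno, func.id))
    return sorted(names)


def substring_check(source: str):
    """Flag every call whose name merely contains 'iden'."""
    return [line for line, name in called_names(source) if "iden" in name]


def exact_check(source: str):
    """Flag only calls whose name is exactly 'iden'."""
    return [line for line, name in called_names(source) if name == "iden"]


program = (
    "qc.iden(0)\n"                 # the removed identity-gate API
    "user = identify(request)\n"   # unrelated function
    "m = np.identity(2)\n"         # unrelated NumPy call
)
print(substring_check(program))  # [1, 2, 3]
print(exact_check(program))      # [1]
```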

2.5. RQ5: Efficiency

We measure the time spent for analyzing all 7,568 quantum programs with all ten analyses. All experiments are run on an Ubuntu machine with an Intel Xeon Silver 4214 CPU with 48 cores and 252 GB of RAM. There are three main computational steps: (i) Using CodeQL to build the database of facts about the Python code, which takes 74 minutes for all 7,568 programs; (ii) Compiling the query plan of the analyses, which takes 97.0 seconds; and (iii) Running the analyses on the database, which takes 162 minutes. Inspecting the computational cost of individual analyses shows that the two most expensive analyses are those that reason about the gate execution order, DoubleMeasurement and OpAfterMeasurement, which take 2,290 and 1,761 seconds to evaluate, respectively. Taking all three steps together, LintQ takes 1.3 seconds per analyzed program.

3. Threats to Validity

Internal Validity

First, our analyses scan each program file individually, not considering the other files in the same repository. Applying LintQ at the repository level may produce different warnings. Second, we inspect only a subset of all warnings. To mitigate this threat, we sample the inspected warnings randomly and use a statistically relevant sample size. Third, our implementation may contain bugs. To mitigate this risk, we implement test cases for the abstractions and the analyses, and we make our implementation publicly available as open-source. Fourth, our literature review may have missed some relevant bug patterns. However, LintQ is designed to be extensible, i.e., additional bug patterns can be added to the framework in the future. Fifth, the manual inspection of warnings is inherently subjective. To mitigate this, two authors participated in the process, collaboratively developing and agreeing on an annotation protocol. This protocol has been documented and is made available in our artifact.
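
As an illustration of what a sample size at a 90% confidence level and a 10% margin of error amounts to, the sketch below uses Cochran's formula with a finite-population correction. This formula is an assumption on our side, since the text does not state which calculation was used; the function name and the conservative choice p = 0.5 are likewise ours.

```python
from statistics import NormalDist


def sample_size(population: int, confidence: float = 0.90,
                margin: float = 0.10, p: float = 0.5) -> float:
    """Cochran's sample size with finite-population correction (unrounded)."""
    # Two-sided z-score, about 1.645 for a 90% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z * z * p * (1 - p) / margin ** 2        # infinite-population size
    return n0 / (1 + (n0 - 1) / population)       # correct for a finite population


for name, total in [("ConstClasBit", 533), ("InsuffClasReg", 3489), ("OpAfterMeas", 127)]:
    print(name, round(sample_size(total)))
```

Under this assumption, rounding to the nearest integer gives 60, 66, and 44 for the three analyses, which agrees with the number of inspected warnings (FP + NW + TP) reported for them in Table 1.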

External Validity

First, while the abstractions of LintQ are designed to be also applicable to other quantum computing platforms, such as Cirq (Developers, 2021a) and Tket (Sivarajah et al., 2020), we cannot claim that our results generalize beyond Qiskit. Second, we cannot guarantee that our results generalize to other quantum programs. To mitigate this threat, we evaluate the approach on 7,568 real-world programs, which represents the largest such dataset to date.

4. Related Work

Quantum Software Testing

Miranskyy et al. (2020) highlight quantum-specific debugging issues when working with quantum programs and discuss how classical solutions could be adapted to the quantum domain. Regarding platform code, various approaches have been proposed, including differential testing (Wang et al., 2021b), metamorphic testing (Paltenghi and Pradel, 2023), and fuzzing (Xia et al., 2024). However, they all focus on platform code and require executing the code, whereas LintQ focuses on application code and is based on static analysis. Regarding application code, various techniques have been proposed. QuanFuzz (Wang et al., 2021a) tests a single algorithm with different inputs with the goal of maximizing branch coverage via a genetic algorithm. Quito (Ali et al., 2021) relies on a program specification and statistical tests to evaluate the correctness of a single small program. Huang and Martonosi (2019) propose statistical approaches to evaluate assertions in a quantum program. Li et al. (2020) describe a projection-based runtime assertion scheme that allows for asserting in the middle of the circuit without affecting the tested state if the assertion is satisfied. All these approaches assume single-circuit programs that can easily be executed multiple times, which may not be the case in practice. Moreover, they rely on executing the programs, whereas LintQ is based on static analysis.

Quantum Program Analysis

Few analyses for quantum programs have been proposed so far, including QSmell (Chen et al., 2023), ScaffCC (JavadiAbhari et al., 2014), QChecker (Zhao et al., 2023b), and QCPG (Kaul et al., 2023). QSmell mostly relies on dynamic analysis, and it focuses on code smells only. ScaffCC is a compiler that performs a limited set of analyses using a new flavor of QASM, whereas we focus on Qiskit-based Python code. QChecker is a static analysis tool that relies only on AST information, but does not provide a general framework to build new analyses and does not model any control flow. QCPG (Kaul et al., 2023) extends Code Property Graphs (Yamaguchi et al., 2014) to analyze quantum code in a single circuit, whereas LintQ is designed to analyze entire programs and models the composition of circuits, as well as unknown quantum operators. There also exist quantum-specific program analyses, such as entanglement analysis (JavadiAbhari et al., 2014; Perdrix, 2008) and automatic uncomputation (Bichsel et al., 2020; Paradis et al., 2021; Xia and Zhao, 2023). However, most of them address a single problem each, whereas LintQ offers a set of general abstractions. Quantum abstract interpretation (Yu and Palsberg, 2021) and runtime assertions (Li et al., 2020) are two techniques to assert properties of quantum computations. They require manually crafted, algorithm-specific assertions, whereas LintQ does not require any prior knowledge of the program. In summary, LintQ goes beyond the purely syntactic level and single-circuit approaches by providing reusable abstractions to build a wide range of analyses in realistic settings with multiple circuits and unknown quantum operators.

Datasets of Quantum Programs

Paltenghi and Pradel (2022) share a large dataset of bugs in quantum computing platforms. However, the focus of LintQ is on application code written in Qiskit, and not on platform code. Two application-level datasets are QASMBench (Li et al., 2023), which includes 48 programs written in OpenQASM, and work by Long and Zhao (2023), which proposes a dataset of 63 Q# programs. However, they are not suitable for our evaluation because they are not written in Python/Qiskit. Luo et al. (2022) and Zhao et al. (2021b) study the bugs in quantum computing programs in Qiskit, mainly collected from GitHub issues of the official Qiskit repository and StackOverflow questions. In contrast, we present a much larger dataset of 7,568 real-world programs, including many programs that are not part of the Qiskit repositories.

Domain-Specific API Modeling

Previous work has modeled other specialized Python libraries, e.g., in machine learning (Lagouvardos et al., 2020; Baker et al., 2022), to spot bugs with static analysis. Our work also relates to general static API misuse detectors (Amann et al., 2019), which mostly focus on Java and traditional application domains. Instead, we focus on the quantum domain, which comes with its own concepts and APIs to model.

5. Conclusion

We present LintQ, a framework for statically analyzing quantum programs and an extensible set of ten analyses. The approach introduces a set of abstractions that capture common concepts in quantum programs, such as circuits, gates, and qubits, as well as the relations between these concepts. Thanks to these abstractions, analyses aimed at finding specific kinds of programming problems can be easily implemented in a few lines of code (10 LoC). To evaluate LintQ, we apply the approach to a novel dataset of 7,568 quantum programs, and in its default configuration with six analyses, LintQ achieves a precision of 91.0% (121 true positives out of 133 warnings).

6. Data Availability

LintQ, our dataset, and all results are publicly available at https://github.com/sola-st/LintQ and archived at https://zenodo.org/records/11095456.

7. Acknowledgments

This work was supported by the European Research Council (ERC, grant agreement 851895), and by the German Research Foundation within the ConcSys, DeMoCo, and QPTest projects.

References

(1)
Fla ([n. d.]) [n. d.]. Flake8: Your Tool For Style Guide Enforcement — Flake8 6.0.0 Documentation. https://flake8.pycqa.org/en/latest/.
Nbc ([n. d.]) [n. d.]. Nbconvert: Convert Notebooks to Other Formats — Nbconvert 7.2.9 Documentation. https://nbconvert.readthedocs.io/en/latest/index.html.
Opt ([n. d.]) [n. d.]. OptimizeSwapBeforeMeasure Pass Drops Swap Gate (Even If There Is NO Measure after It) $\cdot$ Issue #7642 $\cdot$ Qiskit/Qiskit. https://github.com/Qiskit/qiskit/issues/7642.
Pyl ([n. d.]) [n. d.]. Pylint - Code Analysis for Python — Www.Pylint.Org. https://www.pylint.org/.
Qis (2021) 2021. Qiskit/Qiskit. https://github.com/Qiskit/qiskit.
Ali et al. (2021) Shaukat Ali, Paolo Arcaini, Xinyi Wang, and Tao Yue. 2021. Assessing the Effectiveness of Input and Output Coverage Criteria for Testing Quantum Programs. In 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST). 13–23. https://doi.org/10.1109/ICST49551.2021.00014
Amann et al. (2019) Sven Amann, Hoan Anh Nguyen, Sarah Nadi, Tien N. Nguyen, and Mira Mezini. 2019. A Systematic Evaluation of Static API-Misuse Detectors. IEEE Transactions on Software Engineering 45, 12 (Dec. 2019), 1170–1188. https://doi.org/10.1109/TSE.2018.2827384
Avgustinov et al. (2016) Pavel Avgustinov, Oege de Moor, Michael Peyton Jones, and Max Schäfer. 2016. QL: Object-oriented Queries on Relational Data. In 30th European Conference on Object-Oriented Programming (ECOOP 2016) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 56), Shriram Krishnamurthi and Benjamin S. Lerner (Eds.). Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 2:1–2:25. https://doi.org/10.4230/LIPIcs.ECOOP.2016.2
Baker et al. (2022) Wilson Baker, Michael O’Connor, Seyed Reza Shahamiri, and Valerio Terragni. 2022. Detect, Fix, and Verify TensorFlow API Misuses. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). 925–929. https://doi.org/10.1109/SANER53432.2022.00110
Bergholm et al. (2020) Ville Bergholm, Josh Izaac, Maria Schuld, Christian Gogolin, M. Sohaib Alam, Shahnawaz Ahmed, Juan Miguel Arrazola, Carsten Blank, Alain Delgado, Soran Jahangiri, Keri McKiernan, Johannes Jakob Meyer, Zeyue Niu, Antal Száva, and Nathan Killoran. 2020. PennyLane: Automatic Differentiation of Hybrid Quantum-Classical Computations. (Feb. 2020). arXiv:1811.04968 [physics, physics:quant-ph]
Bessey et al. (2010) Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, and Dawson Engler. 2010. A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World. Commun. ACM 53, 2 (Feb. 2010), 66–75. https://doi.org/10.1145/1646353.1646374
Bichsel et al. (2020) Benjamin Bichsel, Maximilian Baader, Timon Gehr, and Martin Vechev. 2020. Silq: A High-Level Quantum Language with Safe Uncomputation and Intuitive Semantics. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020). Association for Computing Machinery, New York, NY, USA, 286–300. https://doi.org/10.1145/3385412.3386007
Calcagno et al. (2015) Cristiano Calcagno, Dino Distefano, Jérémy Dubreil, Dominik Gabi, Pieter Hooimeijer, Martino Luca, Peter O’Hearn, Irene Papakonstantinou, Jim Purbrick, and Dulma Rodriguez. 2015. Moving fast with software verification. In NASA Formal Methods: 7th International Symposium, NFM 2015, Pasadena, CA, USA, April 27-29, 2015, Proceedings 7. Springer, 3–11.
Chen et al. (2023) Qihong Chen, Rúben Câmara, José Campos, André Souto, and Iftekhar Ahmed. 2023. The Smelly Eight: An Empirical Study on the Prevalence of Code Smells in Quantum Computing - Artifact. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE) (Jan. 2023). https://doi.org/10.5281/ZENODO.7556360
Dargan (2022) James Dargan. 2022. Top 5 Quantum Programming Languages in 2022.
Deutsch and Penrose (1997) David Elieser Deutsch and Roger Penrose. 1997. Quantum Computational Networks. Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences 425, 1868 (Jan. 1997), 73–90. https://doi.org/10.1098/rspa.1989.0099
Developers (2021a) Cirq Developers. 2021a. Cirq. Zenodo. https://doi.org/10.5281/zenodo.5182845
Developers (2021b) Qiskit Developers. 2021b. Qiskit: An Open-Source Framework for Quantum Computing. https://doi.org/10.5281/zenodo.2573505
Egretta.Thula (2023) Egretta.Thula. 2023. Answer to “Why Does Composing a Clifford Circuit to Another Circuit Not Work? (Qiskit)” - Quantum Computing Stack Exchange.
Flanagan et al. (2002) Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B. Saxe, and Raymie Stata. 2002. Extended Static Checking for Java. ACM SIGPLAN Notices 37, 5 (May 2002), 234–245. https://doi.org/10.1145/543552.512558
Fortunato et al. (2022a) Daniel Fortunato, José Campos, and Rui Abreu. 2022a. Mutation Testing of Quantum Programs Written in QISKit. In Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings (ICSE ’22). Association for Computing Machinery, New York, NY, USA, 358–359. https://doi.org/10.1145/3510454.3528649
Fortunato et al. (2022b) Daniel Fortunato, José Campos, and Rui Abreu. 2022b. QMutPy: A Mutation Testing Tool for Quantum Algorithms and Applications in Qiskit. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2022). Association for Computing Machinery, New York, NY, USA, 797–800. https://doi.org/10.1145/3533767.3543296
Ghaleb et al. (2023) Asem Ghaleb, Julia Rubin, and Karthik Pattabiraman. 2023. AChecker: Statically Detecting Smart Contract Access Control Vulnerabilities. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). 945–956. https://doi.org/10.1109/ICSE48619.2023.00087
Google (2015) Google. 2015. Error Prone: static analysis tool for Java. http://errorprone.info/.
Habib and Pradel (2018) Andrew Habib and Michael Pradel. 2018. How Many of All Bugs Do We Find? A Study of Static Bug Detectors. In 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE). 317–328. https://doi.org/10.1145/3238147.3238213
Huang and Martonosi (2019) Yipeng Huang and Margaret Martonosi. 2019. Statistical Assertions for Validating Patterns and Finding Bugs in Quantum Programs. In Proceedings of the 46th International Symposium on Computer Architecture (ISCA ’19). Association for Computing Machinery, New York, NY, USA, 541–553. https://doi.org/10.1145/3307650.3322213
JavadiAbhari et al. (2014) Ali JavadiAbhari, Shruti Patil, Daniel Kudrow, Jeff Heckey, Alexey Lvov, Frederic T. Chong, and Margaret Martonosi. 2014. ScaffCC: A Framework for Compilation and Analysis of Quantum Computing Programs. In Proceedings of the 11th ACM Conference on Computing Frontiers (CF ’14). Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/2597917.2597939
Johnson et al. (2013) Brittany Johnson, Yoonki Song, Emerson Murphy-Hill, and Robert Bowdidge. 2013. Why Don’t Software Developers Use Static Analysis Tools to Find Bugs?. In Proceedings of the 2013 International Conference on Software Engineering (ICSE ’13). IEEE Press, San Francisco, CA, USA, 672–681.
Just et al. (2014) René Just, Darioush Jalali, and Michael D. Ernst. 2014. Defects4J: A Database of Existing Faults to Enable Controlled Testing Studies for Java Programs. In Proceedings of the 2014 International Symposium on Software Testing and Analysis (ISSTA 2014). Association for Computing Machinery, New York, NY, USA, 437–440. https://doi.org/10.1145/2610384.2628055
Kaul et al. (2023) Maximilian Kaul, Alexander Küchler, and Christian Banse. 2023. A Uniform Representation of Classical and Quantum Source Code for Static Code Analysis. https://doi.org/10.48550/arXiv.2308.06113 arXiv:2308.06113 [cs]
Lagouvardos et al. (2020) Sifis Lagouvardos, Julian Dolby, Neville Grech, Anastasios Antoniadis, and Yannis Smaragdakis. 2020. Static Analysis of Shape in TensorFlow Programs. In 34th European Conference on Object-Oriented Programming (ECOOP 2020) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 166), Robert Hirschfeld and Tobias Pape (Eds.). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 15:1–15:29. https://doi.org/10.4230/LIPIcs.ECOOP.2020.15
Li et al. (2023) Ang Li, Samuel Stein, Sriram Krishnamoorthy, and James Ang. 2023. QASMBench: A Low-Level Quantum Benchmark Suite for NISQ Evaluation and Simulation. ACM Transactions on Quantum Computing 4, 2 (Feb. 2023), 10:1–10:26. https://doi.org/10.1145/3550488
Li et al. (2020) Gushu Li, Li Zhou, Nengkun Yu, Yufei Ding, Mingsheng Ying, and Yuan Xie. 2020. Projection-Based Runtime Assertions for Testing and Debugging Quantum Programs. Proceedings of the ACM on Programming Languages 4, OOPSLA (Nov. 2020), 150:1–150:29. https://doi.org/10.1145/3428218
Livshits et al. (2015) Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondrej Lhoták, José Nelson Amaral, Bor-Yuh Evan Chang, Samuel Z. Guyer, Uday P. Khedker, Anders Møller, and Dimitrios Vardoulakis. 2015. In defense of soundiness: a manifesto. Commun. ACM 58, 2 (2015), 44–46.
Long and Zhao (2023) Peixun Long and Jianjun Zhao. 2023. Equivalence, Identity, and Unitarity Checking in Black-Box Testing of Quantum Programs. https://doi.org/10.48550/arXiv.2307.01481 arXiv:2307.01481 [quant-ph]
Luo et al. (2022) Junjie Luo, Pengzhan Zhao, Zhongtao Miao, Shuhan Lan, and Jianjun Zhao. 2022. A Comprehensive Study of Bug Fixes in Quantum Programs. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). 1239–1246. https://doi.org/10.1109/SANER53432.2022.00147
Miranskyy et al. (2020) Andriy Miranskyy, Lei Zhang, and Javad Doliskani. 2020. Is Your Quantum Program Bug-Free? Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results (June 2020), 29–32. https://doi.org/10.1145/3377816.3381731 arXiv:2001.10870
Nielsen et al. (2002) Michael A Nielsen, Isaac Chuang, and Lov K Grover. 2002. Quantum Computation and Quantum Information. Am. J. Phys. 70, 5 (2002), 4.
Paltenghi and Pradel (2022) Matteo Paltenghi and Michael Pradel. 2022. Bugs in Quantum Computing Platforms: An Empirical Study. Proceedings of the ACM on Programming Languages 6, OOPSLA1 (April 2022), 86:1–86:27. https://doi.org/10.1145/3527330
Paltenghi and Pradel (2023) Matteo Paltenghi and Michael Pradel. 2023. MorphQ: Metamorphic Testing of the Qiskit Quantum Computing Platform. In Proceedings of the 45th International Conference on Software Engineering (ICSE ’23). IEEE Press, Melbourne, Victoria, Australia, 2413–2424. https://doi.org/10.1109/ICSE48619.2023.00202
Paradis et al. (2021) Anouk Paradis, Benjamin Bichsel, Samuel Steffen, and Martin Vechev. 2021. Unqomp: Synthesizing Uncomputation in Quantum Circuits. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI 2021). Association for Computing Machinery, New York, NY, USA, 222–236. https://doi.org/10.1145/3453483.3454040
Perdrix (2008) Simon Perdrix. 2008. Quantum Entanglement Analysis Based on Abstract Interpretation. In Static Analysis (Lecture Notes in Computer Science), María Alpuente and Germán Vidal (Eds.). Springer, Berlin, Heidelberg, 270–282. https://doi.org/10.1007/978-3-540-69166-2_18
Sivarajah et al. (2020) Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington, and Ross Duncan. 2020. t$|$ket$\rangle$: A Retargetable Compiler for NISQ Devices. Quantum Science and Technology 6, 1 (Nov. 2020), 014003. https://doi.org/10.1088/2058-9565/ab8e92
Svore et al. (2018) Krysta Svore, Alan Geller, Matthias Troyer, John Azariah, Christopher Granade, Bettina Heim, Vadym Kliuchnikov, Mariia Mykhailova, Andres Paz, and Martin Roetteler. 2018. Q#: Enabling Scalable Quantum Computing and Development with a High-level DSL. In Proceedings of the Real World Domain Specific Languages Workshop 2018 (RWDSL2018). Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/3183895.3183901
user19571 (2022) user19571. 2022. Question: “Remove Inactive Qubits from Qiskit Circuit” - Quantum Computing Stack Exchange.
Wang et al. (2021a) Jiyuan Wang, Fucheng Ma, and Yu Jiang. 2021a. Poster: Fuzz Testing of Quantum Program. In 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST). 466–469. https://doi.org/10.1109/ICST49551.2021.00061
Wang et al. (2021b) Jiyuan Wang, Qian Zhang, Guoqing Harry Xu, and Miryung Kim. 2021b. QDiff: Differential Testing of Quantum Software Stacks. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). 692–704. https://doi.org/10.1109/ASE51524.2021.9678792
Weigold et al. (2021) Manuela Weigold, Johanna Barzen, Frank Leymann, and Marie Salm. 2021. Encoding Patterns for Quantum Algorithms. IET Quantum Communication 2, 4 (2021), 141–152. https://doi.org/10.1049/qtc2.12032
Xia et al. (2024) Chunqiu Steven Xia, Matteo Paltenghi, Jia Le Tian, Michael Pradel, and Lingming Zhang. 2024. Fuzz4All: Universal Fuzzing with Large Language Models. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (ICSE ’24). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3597503.3639121
Xia and Zhao (2023) Shangzhou Xia and Jianjun Zhao. 2023. Static Entanglement Analysis of Quantum Programs. In 2023 IEEE/ACM 4th International Workshop on Quantum Software Engineering (Q-SE). 42–49. https://doi.org/10.1109/Q-SE59154.2023.00013
Yamaguchi et al. (2014) Fabian Yamaguchi, Nico Golde, Daniel Arp, and Konrad Rieck. 2014. Modeling and Discovering Vulnerabilities with Code Property Graphs. In 2014 IEEE Symposium on Security and Privacy. 590–604. https://doi.org/10.1109/SP.2014.44
Yu and Palsberg (2021) Nengkun Yu and Jens Palsberg. 2021. Quantum Abstract Interpretation. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI 2021). Association for Computing Machinery, New York, NY, USA, 542–558. https://doi.org/10.1145/3453483.3454061
Zhao et al. (2023a) Pengzhan Zhao, Zhongtao Miao, Shuhan Lan, and Jianjun Zhao. 2023a. Bugs4Q: A Benchmark of Existing Bugs to Enable Controlled Testing and Debugging Studies for Quantum Programs. Journal of Systems and Software 205 (Nov. 2023), 111805. https://doi.org/10.1016/j.jss.2023.111805
Zhao et al. (2023b) Pengzhan Zhao, Xiongfei Wu, Zhuo Li, and Jianjun Zhao. 2023b. QChecker: Detecting Bugs in Quantum Programs via Static Analysis. https://doi.org/10.48550/arXiv.2304.04387 arXiv:2304.04387 [cs]
Zhao et al. (2021a) Pengzhan Zhao, Jianjun Zhao, and Lei Ma. 2021a. Identifying Bug Patterns in Quantum Programs. In 2021 IEEE/ACM 2nd International Workshop on Quantum Software Engineering (Q-SE). 16–21. https://doi.org/10.1109/Q-SE52541.2021.00011
Zhao et al. (2021b) Pengzhan Zhao, Jianjun Zhao, Zhongtao Miao, and Shuhan Lan. 2021b. Bugs4Q: A Benchmark of Real Bugs for Quantum Programs. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). https://doi.org/10.1109/ASE51524.2021.9678908