Data Encoding Patterns For Quantum Computing
MANUELA WEIGOLD, JOHANNA BARZEN, FRANK LEYMANN, and MARIE SALM, IAAS, University of Stuttgart
Quantum computers have the potential to solve certain problems faster than classical computers. However, loading data into a quantum
computer is not trivial. To load the data, it must be encoded in quantum bits (qubits). Qubits can represent the data in several ways,
so multiple data encodings are possible. Both the data itself and the chosen encoding influence the runtime of the loading
process. In the worst case, loading requires exponential time. This is critical because quantum algorithms that promise a speed-up assume
that loading data can be done faster, in logarithmic or linear time. To outline abstract knowledge about encodings and the consequences
of choosing a particular data encoding, we present three common encodings as patterns. Especially in complex domains like quantum
computing, patterns can contribute to making this new technology and its broad potential accessible to users with different backgrounds. In
particular, they facilitate the development of quantum applications for software developers.
Categories and Subject Descriptors: [PLoPourri]: —Patterns For Quantum Computing
General Terms: Algorithms, Measurement
Additional Key Words and Phrases: Quantum Computing, Quantum Algorithms, Data Encoding, Speed-up, Patterns
ACM Reference Format:
M. Weigold, J. Barzen, F. Leymann, and M. Salm. 2020. Data Encoding Patterns for Quantum Algorithms. HILLSIDE Proc. of Conf. on Pattern
Lang. of Prog. 22 (October 2020), 11 pages.
1. INTRODUCTION
For decades, the concept of a bit has been the fundamental unit for information encoding in computer science.
Recent advances in quantum computing have led to the first commercial quantum computers which operate on
quantum bits (qubits) instead of bits [National Academies of Sciences, Engineering and Medicine 2019; LaRose
2019]. Analogous to a bit that can be either 0 or 1, a qubit can also take one of two states: $|0\rangle$ and $|1\rangle$. But in
addition, due to quantum mechanics, it can also be in a combination of these two states at once, a superposition of
states. Empowered by superposition and other fundamental properties of quantum mechanics, quantum computers
have the potential to solve certain problems faster than conventional computers [Horodecki et al. 2009]. In fact,
various algorithms for quantum computers exist for which a theoretical linear or exponential speed-up over their
classical counterparts was demonstrated, e.g., for the prime factorization of integers [Shor 1999].
As the number of available qubits of quantum computers increases, more companies start to explore quantum
computing. However, it is expected that near-term devices will only contain up to a few hundred qubits [Preskill 2018].
Another restricting factor is that these qubits are not perfect: Their states are only stable for a short amount of time.
Because of their rapid decay, only a limited number of operations can be executed on them. Thus, successfully
programming quantum computers today is limited by the available hardware.
Author’s address: M. Weigold, Universitätsstraße 38, 70569 Stuttgart, Germany; email: weigold@iaas.uni-stuttgart.de; J. Barzen,
Universitätsstraße 38, 70569 Stuttgart, Germany; email: barzen@iaas.uni-stuttgart.de; F. Leymann, Universitätsstraße 38, 70569 Stuttgart, Germany;
email: leymann@iaas.uni-stuttgart.de; M. Salm, Universitätsstraße 38, 70569 Stuttgart, Germany; email: salm@iaas.uni-stuttgart.de
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the
first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. A preliminary
version of this paper was presented in a writers’ workshop at the 27th Conference on Pattern Languages of Programs (PLoP).
PLoP’20, OCTOBER 12–16, Virtual Online. Copyright 2020 is held by the author(s). HILLSIDE 978-1-941652-16-9
Besides, the sheer fact that quantum computers obey the laws of quantum mechanics results in effects that are unusual from the point of view of
a software developer. To illustrate how different quantum computing is, we describe implications
for two basic programming tasks: reading and loading data. For the first task of reading a qubit, its quantum state
must be accessed. This can only be done by measuring it. Unfortunately, measurement causes a qubit to collapse
to either $|0\rangle$ or $|1\rangle$. Thus, the state of a qubit that is in superposition cannot be read out directly.
The second task consists of loading data into a quantum computer. This task is at the beginning of almost every
algorithm that processes input data. After the initial loading process, the data is represented by qubits via a specific
encoding. Each algorithm expects that a certain data encoding is used, and then processes the data by performing
calculations. Unfortunately, loading data cannot always be done efficiently. In the worst case, loading requires
exponential time. This slows down algorithms with an otherwise logarithmic or linear runtime: With an exponential
loading time, their overall runtime is also exponential. This ruins the theoretical linear or exponential speed-up of an
algorithm, which was one of the reasons for using a quantum computer in the first place. In general,
the time for loading depends (i) on the routine that loads the data in a specific encoding and (ii) on the data itself.
Thus, loading data is a non-trivial task that influences the runtime complexity of a quantum algorithm.
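As a worked illustration of this argument, the runtime can be written as a sum of a loading term and a processing term. The symbols $T_{\text{load}}$, $T_{\text{alg}}$, and $T_{\text{total}}$ are introduced here only for illustration, and the concrete bounds (exponential loading, linear processing in the number of qubits $n$) are assumptions chosen to match the worst case described above:

```latex
T_{\text{total}}(n) \;=\; T_{\text{load}}(n) + T_{\text{alg}}(n)
               \;=\; O(2^n) + O(n)
               \;=\; O(2^n)
```

Under these assumptions the loading term dominates, so the overall runtime is exponential even though the processing part alone is efficient.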
To help software developers understand the implications of using a specific encoding to load data, we formulate
three common data encodings as patterns. A pattern in the spirit of Alexander et al. [1977] describes a proven
solution to a re-occurring problem. For the development of software, documenting patterns is commonly used to
capture knowledge about a specific domain [Coplien 1996; Buschmann et al. 1996]. Especially in an interdisciplinary
and complex domain like quantum computing, patterns can be used to make proven solutions explicit, explain
’how’ they work, and ’why’ a solution (e.g. an encoding) should be used [Meszaros and Doble 1997].
The remainder of this paper is structured as follows: Section 2 describes fundamentals of quantum computing.
Section 3 starts with an overview of patterns for quantum algorithms and then presents the new encoding patterns.
Related work is discussed in Section 4. Finally, Section 5 concludes the paper and describes future work.
Sometimes the bit strings are transformed into decimal representations resulting in natural numbers; thus, the
state vectors are written even more compactly as $|0\rangle$, $|1\rangle$, $|2\rangle$, and $|3\rangle$.
1 http://quantumcomputingpatterns.org
[Figure 1 omitted. Labels: Pre-processing, Initialization, Unitary Transformation, Measurement, Post-processing; Uniform Superposition, Creating Entanglement, Quantum-Classic Split, Speedup by Verification.]
Fig. 1. Overview of patterns for quantum computing. In the center, the steps of a quantum algorithm are shown (based on [Leymann et al.
2020]). The new encoding patterns (highlighted in bold) are part of the first step that is executed on a quantum computer (State Preparation).
INITIALIZATION
Summary: At the beginning of an algorithm, its initial state is prepared: First, the registers are
initialized in the $|0 \ldots 0\rangle$ state. If input data is used by the algorithm, a suitable state preparation
routine encodes the data via a specific encoding.
Forces. Encoding data in qubits is not trivial. Current devices contain a limited number of qubits that are stable
for a short amount of time. In order to make use of current devices, the representation must be compact and use
only a few qubits and few quantum gates. Because qubits decay fast and quantum gates are error-prone too, the
number of operations to prepare the quantum state must be small. To encode even a large number of data values
efficiently, a logarithmic or linear runtime is ideal, i.e., the state preparation routine consists of a logarithmic or
linear number of parallel operations. Each encoding is essentially a trade-off between two major forces: (i) the
number of required qubits and (ii) the runtime complexity for the loading process. Besides that, an additional force
requires that data must be represented in a suitable format for further operations. For arithmetic operations like
addition or multiplication, the exact values of the data often need to be represented. For other operations, it may
be sufficient to represent their relative values (e.g., as a relatively small or large amplitude of a quantum state with
AMPLITUDE ENCODING).
BASIS ENCODING
Context. A quantum algorithm requires numerical input data X for further calculations.
Solution. The main idea for this encoding is to use the computational basis $\{|0 \ldots 00\rangle, |0 \ldots 01\rangle, \ldots, |1 \ldots 11\rangle\}$ to
encode the input data: An input number $x$ is approximated by a binary format $x := b_{n-1} \ldots b_1 b_0$ which is then turned
into the corresponding basis vector $|x\rangle := |b_{n-1} \ldots b_1 b_0\rangle$. For example, the number "2" is represented as 10 which
is then encoded by $|10\rangle$ (Fig. 2). In general, this leads to the following encoding: $X \approx \sum_{i=-k}^{m} b_i 2^i \mapsto |b_m \ldots b_{-k}\rangle$
where $X$ is first approximated with a precision of $k + m$ significant digits and then represented by a basis vector.
Fig. 2. Basis encoding. A number is approximated by a binary bit string (first step) and encoded by a computational basis state (second step).
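The two steps of this encoding can be sketched classically. The following is a minimal illustration, not part of the original pattern: the helper names `to_bits` and `basis_state` are hypothetical, the approximation is done by truncation (the text does not prescribe a rounding rule), and the state is represented as a plain list of amplitudes rather than on a quantum device.

```python
def to_bits(x, m, k):
    """Approximate a non-negative number x by the bit string b_m ... b_0 b_-1 ... b_-k.

    Assumption for this sketch: the approximation truncates (rounds down).
    """
    if x < 0 or x >= 2 ** (m + 1):
        raise ValueError("x does not fit into the given number of integer bits")
    scaled = int(x * 2 ** k)  # shift the k fractional bits into the integer part
    return format(scaled, "0{}b".format(m + k + 1))


def basis_state(bits):
    """Return the amplitude vector of the computational basis state |bits>."""
    state = [0.0] * (2 ** len(bits))
    state[int(bits, 2)] = 1.0  # exactly one amplitude is non-zero
    return state


bits_for_two = to_bits(2, m=1, k=0)        # the example from the text: "2" becomes 10
state_for_two = basis_state(bits_for_two)  # amplitude vector of |10>
bits_fraction = to_bits(2.75, m=1, k=2)    # 2.75 is 10.11 in binary
```

The amplitude vector of a basis encoded value has length $2^{m+k+1}$ but only a single non-zero entry, which already hints at the sparsity discussed for superpositions of basis encoded values.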
Context. A quantum algorithm requires multiple numerical values X as input for further calculations.
Solution. Use a quantum associative memory (QuAM) to prepare a superposition of basis encoded values in
the same qubit register [Leymann and Barzen 2020b]. In Fig. 3 this is illustrated for the three values $x_1$, $x_2$ and
$x_3$ in binary format. Note that the resulting encoding is an equally weighted superposition of the basis encoded
values, i.e., all amplitudes are of the same magnitude.
[Figure 3 omitted: three data values stored in the qubits $q_2 q_1 q_0$, each with amplitude $1/\sqrt{3}$.]
Fig. 3. Resulting Encoding. Each data value represented by a row on the left is encoded in BASIS ENCODING and an amplitude of $1/\sqrt{n}$.
To load the data, the register of the quantum associative memory is in superposition of two states, a processing
and a memory branch (Fig. 4). Both branches have a load and a storage part. An additional element is first
prepared in the load part of both branches (step 1). Next, the processing branch is split in such a manner that
the new element gets a proper amplitude (step 2), so that it can be stored by bringing it into superposition with
the already added elements (step 3). Finally, UNCOMPUTE cleans up both branches so that they are ready for the next
iteration (step 4). See [Ventura and Martinez 2000] for a more detailed description of the individual steps.
Result. The resulting encoding is a digital encoding and therefore suitable for arithmetic computations [Leymann
and Barzen 2020b]. For $n$ input numbers that are each approximated by $l$ digits, $l$ qubits are needed for this
representation. Each of the $n$ encoded input values is represented by a basis vector with an amplitude of $1/\sqrt{n}$.
All other $2^l - n$ amplitudes of the register are zero; in our example, these are the amplitudes of $|000\rangle$, $|001\rangle$, $|100\rangle$, $|101\rangle$, and $|111\rangle$. The
amplitude vector is therefore often sparse for this encoding [Schuld and Petruccione 2018].
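The resulting amplitude vector can be written down classically to check the counts given above. This is a minimal sketch that only constructs the target state; it does not implement the loading routine of Ventura and Martinez [2000]. The function name `quam_target_state` is hypothetical, and the three stored values 010, 011, and 110 are inferred from the list of zero amplitudes in the example.

```python
import math


def quam_target_state(values, num_qubits):
    """Amplitude vector of an equally weighted superposition of basis encoded values.

    values: bit strings of length num_qubits (duplicates are stored only once).
    """
    distinct = sorted(set(values))
    for bits in distinct:
        if len(bits) != num_qubits:
            raise ValueError("every value must use exactly num_qubits bits")
    amplitude = 1.0 / math.sqrt(len(distinct))  # 1/sqrt(n) for n stored values
    state = [0.0] * (2 ** num_qubits)
    for bits in distinct:
        state[int(bits, 2)] = amplitude
    return state


state = quam_target_state(["010", "011", "110"], num_qubits=3)
# Basis states whose amplitude is zero: 2^l - n = 8 - 3 = 5 of them.
zero_labels = [format(i, "03b") for i, a in enumerate(state) if a == 0.0]
```

For $l = 3$ qubits and $n = 3$ values, five of the eight amplitudes are zero, which matches the basis states listed in the text.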
Related Pattern. This pattern refines INITIALIZATION and makes use of UNCOMPUTE. UNIFORM SUPERPOSITION
creates a superposition of all computational basis states. Each of the computational basis states also
represents a value in BASIS ENCODING.
[Figure 4 omitted: processing branch and memory branch of the register, with the steps split (2), store (3), and UNCOMPUTE (4).]
Fig. 4. Illustration of the state preparation routine of [Ventura and Martinez 2000]. In each iteration, an element is loaded and brought into
superposition with the already stored elements.
Known Uses. The presented state preparation routine based on Ventura and Martinez [2000] can be used
whenever multiple data values need to be represented in BASIS ENCODING. Shor’s algorithm [Shor 1999] for the
prime factorization of integers, a quantum version of the Fourier transform [Coppersmith 2002], and Grover’s
algorithm [Grover 1996] for unstructured search rely on this encoding. Various algorithms extend or use Grover’s
algorithm and therefore also make use of this encoding.
AMPLITUDE ENCODING
Alias. This encoding has also been referred to as Wavefunction Encoding by LaRose and Coyle [2020].
Every quantum system is described by its wavefunction ψ which also defines the measurement probabilities. By
expressing that the wavefunction is used to encode data, it is therefore implied that amplitudes of the quantum
system are used to represent data values.
Context. A numerical input data vector $(x_0, \ldots, x_{n-1})^T$ must be encoded for an algorithm.
Solution. Use amplitudes to encode the data. As the squared moduli of the amplitudes of a quantum state must
sum up to 1, the input vector needs to be normalized to length 1. This is illustrated in Fig. 5 for a 2-dimensional
input vector that contains two data points. To associate each amplitude with a component of the input vector, the
dimension of the vector must be equal to a power of two because the vector space of an $n$ qubit register has
dimension $2^n$. If this is not the case, the input vector can be padded with additional zeros to increase the dimension
of it. Using a suitable state preparation routine (see Known Uses), the input vector is encoded in the amplitudes of
the quantum state as follows: $|\psi\rangle = \sum_{i=0}^{n-1} x_i |i\rangle$. As the amplitudes depend on the data, the process of encoding
the data (but not the encoding itself) is often referred to as arbitrary state preparation.
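The classical pre-processing described in this solution (padding to a power of two and normalizing to length 1) can be sketched as follows. This is an illustration under stated assumptions: the function name `amplitude_encode` is hypothetical, the data is assumed to be real-valued, and the function only computes the target amplitudes; it is not a state preparation routine.

```python
import math


def amplitude_encode(data):
    """Return (amplitudes, num_qubits) for a real-valued input vector.

    The vector is padded with zeros to the next power of two and then
    normalized so that the squared amplitudes sum to 1.
    """
    if not data:
        raise ValueError("input vector must not be empty")
    norm = math.sqrt(sum(x * x for x in data))
    if norm == 0.0:
        raise ValueError("the zero vector cannot be normalized")
    # Smallest number of qubits whose 2^n amplitudes can hold the data.
    num_qubits = max(1, (len(data) - 1).bit_length())
    padded = list(data) + [0.0] * (2 ** num_qubits - len(data))
    return [x / norm for x in padded], num_qubits


# Three data points need two qubits; the fourth amplitude is padding.
amplitudes, num_qubits = amplitude_encode([1.0, 2.0, 2.0])
```

Note that normalization discards the original length of the vector, so only the relative values of the data points are represented in the amplitudes.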
Result. A data input vector of length $l$ can be represented by $\lceil \log_2(l) \rceil$ qubits, which is a very compact
representation. For an arbitrary state represented by $n$ qubits (which represents $2^n$ data values), it is known that
at least $\frac{1}{n} 2^n$ parallel operations are needed [Schuld and Petruccione 2018]. Current state preparation routines
perform slightly better than $2^n$ operations [Schuld and Petruccione 2018]. However, depending on the data, it may
still be possible to realize an encoding in a logarithmic runtime. For example, a UNIFORM SUPERPOSITION can be
created by applying a Hadamard gate to each of the $n$ qubits, which can be done in parallel and thus in a single
step. This represents a $2^n$-dimensional vector in which all data entries are $1/\sqrt{2^n}$. Similarly, sparse data vectors can
also be prepared more efficiently [Schuld and Petruccione 2018].
Fig. 5. Amplitude Encoding for 3 data points. The input vector (left) is normalized and represented by the amplitudes in the resulting encoding.
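The uniform superposition mentioned in the Result paragraph can be checked with a small classical simulation. This is a sketch only: the helper names `apply_hadamard` and `uniform_superposition` are hypothetical, real amplitudes are assumed, and the simulation applies the gates one after another, whereas on a device the Hadamard gates act on different qubits and can run in parallel.

```python
import math


def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a state vector.

    qubit 0 is the least significant bit of the basis state index.
    """
    h = 1.0 / math.sqrt(2.0)
    new_state = list(state)
    mask = 1 << qubit
    for i in range(len(state)):
        if (i & mask) == 0:  # handle each pair (i, i with the qubit set) once
            a, b = state[i], state[i | mask]
            new_state[i] = h * (a + b)
            new_state[i | mask] = h * (a - b)
    return new_state


def uniform_superposition(num_qubits):
    """Start in |0...0> and apply a Hadamard gate to every qubit."""
    state = [0.0] * (2 ** num_qubits)
    state[0] = 1.0
    for qubit in range(num_qubits):
        state = apply_hadamard(state, qubit)
    return state


uniform = uniform_superposition(3)
expected_amplitude = 1.0 / math.sqrt(2 ** 3)
```

For three qubits, all eight amplitudes come out equal, which illustrates why this particular data vector needs only one layer of gates instead of an exponential number of operations.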
It must be noted that if the output is also encoded in the amplitude, multiple measurements must be taken to
obtain a good estimate of the output result. The number of measurements scales with the number of amplitudes;
as $n$ qubits contain $2^n$ amplitudes, this is costly [Schuld and Petruccione 2018].
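The need for repeated measurements can be illustrated with a classical sampling sketch. The function name `estimate_probabilities`, the shot count, and the fixed seed are assumptions made for this illustration; the sketch samples basis states with probability equal to the squared amplitude, which is what a measurement in the computational basis yields.

```python
import random
from collections import Counter


def estimate_probabilities(amplitudes, shots, seed=0):
    """Estimate the measurement probabilities |a_i|^2 from repeated sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    probabilities = [abs(a) ** 2 for a in amplitudes]
    outcomes = rng.choices(range(len(amplitudes)), weights=probabilities, k=shots)
    counts = Counter(outcomes)
    return [counts[i] / shots for i in range(len(amplitudes))]


# Amplitudes of a normalized two-qubit state (squared values: 1/9, 4/9, 4/9, 0).
amplitudes = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0, 0.0]
estimates = estimate_probabilities(amplitudes, shots=10000)
```

Each shot yields only one basis state, so the relative frequencies approach the squared amplitudes only as the number of shots grows. The frequencies also reveal nothing about the sign or phase of an amplitude.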
Related Patterns. This pattern refines INITIALIZATION. The encoding is more compact (in terms of qubits) than
BASIS, ANGLE or QRAM ENCODING.
Known Uses. AMPLITUDE ENCODING is required by many quantum machine learning algorithms [LaRose and
Coyle 2020]. Another example is the algorithm of Harrow, Hassidim and Lloyd [Harrow et al. 2009] (often referred
to as the HHL algorithm) for solving linear equations. The pre-condition that the data values can be normalized is a
common assumption in machine learning [Duarte and Ståhl 2019], e.g., in support vector machines.
There are various ways to construct a state preparation routine for this encoding. For example, Plesch and
Brukner [2011] and Iten et al. [2016] use the Schmidt Decomposition. For the latter, an implementation in
Mathematica was presented [Iten et al. 2019]. Shende et al. [2006] presented an alternative way to construct
an arbitrary quantum state which was implemented by Qiskit [Qis 2020]. PennyLane offers a loading routine for
AMPLITUDE ENCODING [Pen 2020]. The library also includes an arbitrary state preparation routine that uses the
algorithm proposed by Möttönen and Vartiainen [2005]. The state preparation routine by Möttönen and Vartiainen
[2005] requires an exponential number of operations to encode $2^n$ data values. Q# provides functionality to
compute a state preparation routine that approximates the desired amplitude encoding [QSh 2020].
4. RELATED WORK
Our patterns are based on the concept of patterns by Alexander et al. [1977] who introduced patterns for
documenting best practices in the domain of buildings. Since then, the concept has been adapted by various other
areas and is especially popular for the domain of software [Coplien 1996]. Leymann [2019] already presented
patterns for quantum algorithms that we reviewed in Section 3. In this work, we extend the brief pattern format that
was used by Leymann [2019] and present three patterns for the encoding of data. To our knowledge, no other
patterns for the domain of quantum computing exist.
Perdrix [2007] introduces quantum patterns and types that are part of a formal quantum programming language.
But these are not patterns in the sense of Alexander et al. [1977] as they only reflect technical details instead of
describing a problem or context.
Several authors discussed the process of loading data into a quantum computer and the implications on runtime.
Biamonte et al. [2017] refer to it as the input problem, as data cannot always be loaded efficiently. Aaronson [2015]
examines loading data for the HHL algorithm for solving linear equations. He points out that the logarithmic runtime
for this algorithm can only be achieved if the AMPLITUDE ENCODING of the data can be prepared in logarithmic
time. He concludes that this is a general drawback for algorithms that use this encoding, which we also emphasize
in our pattern for this encoding.
Salm et al. [2020] consider given input data to support the selection of concrete quantum algorithm implementations
and suitable quantum computers for execution. To this end, they estimate the required number of qubits
and sequentially executable gates of an implementation depending on the size of the input data.
Yan et al. [2016] review different quantum representations for quantum image processing. In particular, BASIS
ENCODING and ANGLE ENCODING are used in various representations. They outline similarities, applications, and
drawbacks of the representations but do not draw general conclusions for data encodings.
Schuld and Petruccione [2018] as well as LaRose and Coyle [2020] define various data encodings for quantum
computing. We refer to these definitions in our data encoding patterns and visualize them in greater detail. LaRose
and Coyle [2020] also compare data encodings in the context of classification with quantum computers. They show
that in a noiseless setting, different data encodings lead to different decision boundaries that can be learned by a
quantum classifier. While they discuss the findings for quantum classifiers, they do not consider implications for
data encodings in general. In particular, LaRose and Coyle [2020] do not consider BASIS ENCODING as it is
not common for quantum classifiers.
Schuld and Killoran [2019] point out how data encodings and kernels in machine learning are related. They show
that an input encoding (that maps a numerical input value into the high dimensional vector space of a quantum
system) defines a quantum kernel. They refer to a specific encoding as a quantum feature map φ and point out that
different encodings lead to different values of the inner product between the encoded data values. Recently,
there has been active research on learning suitable data encodings for quantum machine learning [LaRose and Coyle
2020; Lloyd et al. 2020]. Here, we take a more general view on encodings and do not focus on machine learning.
ACKNOWLEDGMENT
We thank our shepherd Dana (Peng) Zhang as well as the members of our writing groups for helpful comments
and suggestions. We would also like to thank Daniel Vietz, Benjamin Weder, and Karoline Wild for discussions
about quantum computing and patterns. This work is partially funded by the BMWi project PlanQK (01MK20005N).
2 https://github.com/PatternAtlas/pattern-atlas-docker