Sampling on Quantum Computer
Maximilian Balthasar Mansky∗, Jonas Nüßlein∗ , David Bucher†, Daniëlle Schuman∗, Sebastian Zielinski∗ ,
and Claudia Linnhoff-Popien∗
∗ LMU Munich
Munich, Germany
Email: [email protected]
† Aqarios GmbH
Munich, Germany
arXiv:2402.16341v1 [quant-ph] 26 Feb 2024
Abstract: Due to advances in the manufacturing of quantum hardware in recent years, significant research efforts have been directed towards employing quantum methods to solve problems in various areas of interest. Thus a plethora of novel quantum methods have been developed in recent years. In this paper, we provide a survey of quantum sampling methods alongside the needed theory and applications of those sampling methods as a starting point for research in this area. This work focuses in particular on Gaussian Boson sampling, quantum Monte Carlo methods, quantum variational Monte Carlo, quantum Boltzmann Machines and quantum Bayesian networks. We strive to provide a self-contained overview of the mathematical background, technical feasibility and applicability to other problems, and point out potential areas of future research.

I. INTRODUCTION

Sampling from a population is a well-established way to learn about the structure of large data or to learn about a distribution of properties in spaces that are too large for an exhaustive examination. Sampling is the act of drawing samples from some unknown distribution with the goal of learning about the underlying distribution by using statistical reasoning. One samples to generate a smaller population from a distribution because querying all possible elements is prohibitively expensive. For many problems sampling provides a reasonable approximation of the total distribution and is an important approximative method. Sampling is used to build an understanding of complex systems with limited resources and is therefore an important tool in many areas of science [1].

For reliable statistical reasoning, one needs a sufficient number of samples. The number of samples depends on the problem at hand and the applicable statistical rigour and is an important consideration when using a sampling approach. Samples also need to be independent and identically distributed (i.i.d.), meaning that subsequent samples should have no relationship with each other and should not affect each other's result, and that they need to come from the same total distribution. In turn, this means that sampling processes can be easily parallelized for fast querying.

These benefits of sampling approaches to problem solving, namely fast approximative solutions to complex systems, explain their ubiquity and applicability across a broad range of subjects [2], [3]. There is continuous effort to improve current theory and provide better approaches to creating independent samples for individual problems.

A novel approach to generating samples is the use of a quantum computer. Quantum computing is a new computational paradigm with promises of significant computational speedup [4]. The technology is structurally different from classical computing and relies on the effects of quantum mechanics to process information. Instead of representing information through a binary encoding, quantum computing relies on the superposition of states for encoding. The states have a physical representation depending on the underlying hardware and are typically different energy levels in a quantum-mechanical system. Superposition means that the state is composed from a complex-valued superposition of base states. Information can be changed on the quantum computer through operations called gates and can be retrieved for further classical processing and interpretation with a measurement.

The two approaches, quantum computing and sampling from the unknown, can be combined to shed light on distributions within quantum-mechanical systems that are otherwise difficult to calculate or to model classically. In the context of sampling, this means that obtaining a sample via classical simulation is difficult, but drawing a sample via a quantum computer is easy. Many of the physical systems that one is interested in form a high-dimensional space whose structure is difficult to model. Even simple composite two-state systems suffer from a combinatorial explosion in their complete state representation, where the number of dimensions scales as $O(2^n)$, with n the number of qubits. Since the system size of a quantum computer scales in the same way, a quantum computer is an appropriate tool to simulate these systems. This is reflected in the approaches discussed in this paper, which draw inspiration from physical systems.

Classical distributions can also be modeled on a quantum computer, taking advantage of the larger representation space in order to represent more complex distributions and then draw samples from them. The expressiveness of these systems can be higher than an equivalent classical formulation, again due to the larger available state space.

The topic of quantum-based sampling has received significant attention over the past few years, with experimental breakthroughs in the size of the experiments [5].

In this work we introduce the state of the art of quantum sampling techniques and applications and provide an overview of their technical feasibility, current technical state and applicability as a solution to other problems. We present the topic as self-contained as possible to allow a fast translation towards a solution.

In the remainder of the paper we provide a background on sampling as an explicit approach on classical and quantum systems. In section III we introduce all current approaches to quantum-based sampling, including their mathematical basis, technological readiness, open questions and applicability as a solution path. Lastly we provide a discussion of the presented methods.

II. BACKGROUND ON SAMPLING TECHNIQUES

A sampling algorithm can be imagined as a machine that transforms uniform random bits into non-uniformly distributed random bits [6]. In the context of sampling from a population, it means taking independent and identical samples from the distribution. This simple structure is a starting point into questions of statistics, probability modeling, conditional inference and much more. The sampling algorithm that transforms the bit strings is often opaque to the questions, if known at all.

A. Sampling on a classical system

A sampling algorithm turns a source of uniform randomness into a non-uniform one. Running the algorithm many times generates a statistical sample with meaningful insights into the underlying problem. A classic example is sampling π by generating uniform random x, y positions in the unit square. Here, the sampling algorithm takes uniform randomness (the position) and assigns a new value based on the point's distance from the origin. After generating many points, the fraction of points within unit distance of the origin, multiplied by 4, provides an estimate of the value of π. This is visualized in Figure 1.

Fig. 1. Visualization of a sampling process to determine the value of π. Black dots represent random (x, y) positions. If the distance to the origin is smaller than or equal to 1 (red area), the point is counted towards the red bin; otherwise it is only counted towards the total number of samples. The fraction of the count of elements in the red bin to the total number of samples, multiplied by 4, approximates π.

B. Sampling on quantum circuits

A quantum computer can similarly be used for a) performing calculations and b) providing a translation between uniform randomness and a biased distribution. The necessary randomness is however not easy to control, despite the fact that a quantum computer is fundamentally a probabilistic machine [4]. The obvious quantum circuit for generating randomness, applying a Hadamard gate to each available qubit in the $|0\rangle$ state, results in uniform independent random qubits when measured in the computational basis. In order to represent any uniformly random state of the Hilbert space, one needs a circuit whose depth is exponential in the number of qubits [7]. This is in stark contrast to the classical world, where the randomness of a bit string is fundamentally the same as the randomness of an equivalently sized group of individually sampled bits [8].

Suitable construction of the quantum circuit can also generate an appropriate sampling statistic without a source of randomness. This makes use of the probabilistic nature of a quantum computer, where a measured mixed state returns a probabilistic value based on the measurement basis. Mixed states form a statistical ensemble that can be expressed through a density matrix ρ [4].

For this discussion, we assume a perfect quantum computer without extrinsic noise. The noise of current imperfect quantum computers is an unwanted source of randomness and will in most cases bias the desired calculation.

III. EXAMPLES OF SAMPLING PROBLEMS

We provide a thorough coverage of different quantum computing approaches to sampling problems. We provide a self-contained background on the mathematical basis. Technical feasibility on current hardware is explored as well, especially for the cases where an implementation has already been achieved. We also highlight the applicability as a solution path to other areas of science and take note of open questions for future work.

We start with Gaussian Boson Sampling, an experiment on current hardware that explicitly samples from an unknown distribution and is very time-consuming to model classically. Quantum-Enhanced Markov Chain Monte Carlo expands the classical MCMC structure to quantum computers with promises of significant speed-up when investigating quantum systems. Variational Monte Carlo is especially applicable to material science and chemistry. Quantum Boltzmann Machines are a sampling-based machine learning model that takes advantage of the quantum computing structure for faster and easier sampling from arbitrary distributions. Lastly, quantum Bayesian Networks are a direct translation of classical models for chained probabilities. There are indications that the quantum version has significantly higher expressive power.

A. Gaussian Boson Sampling

Boson Sampling is a simplified, non-universal model of quantum computation first introduced by Aaronson and Arkhipov [9] in which n Bosons, originally in an input arrangement k, are scattered by a passive, linear unitary
transformation U into $m \gg n$ output modes. The Boson Sampling problem consists of producing a fair sample of the output probability distribution $P(l \mid k, U)$, where l is the output arrangement [10], [11]. Aaronson and Arkhipov argue that the existence of an efficient classical algorithm which accomplishes this given a random transformation U implies the ability to estimate the permanent of an arbitrary complex-valued matrix, a #P-hard problem [9]. This means that the problem is in fact hard for classical computers to solve and provides an argument for the superiority of quantum computers over classical ones, as the Boson Sampling problem can be solved efficiently by the former, as well as evidence against the extended Church-Turing thesis [11].

The primary hurdle in implementing Boson Sampling experimentally lies in the fact that currently available single photon sources are spontaneous, meaning that the cost of producing an input state with exactly n photons grows exponentially in n. To combat this issue, Lund et al. [11] suggest using Gaussian states, which can be produced deterministically with high purity. They describe a Gaussian Boson Sampler, a quantum optical processor consisting of 2-mode squeezed input states and a non-adaptive linear optical network, which produces photon number counting statistics as its output. They argue that in one particular case, namely in the context of the generalized Boson Sampling problem, such a device can efficiently sample distributions which are hard to sample for classical counterparts. Furthermore, Lund et al. contend that approximate Boson Sampling is also a hard problem, even in the generalized case [11].

In [12], Hamilton et al. formally introduce Gaussian Boson Sampling (GBS), which, unlike previous protocols involving Gaussian states, takes full advantage of the Gaussian nature of the states. In the GBS setup, Single Mode Squeezed States (SMSS) enter a linear interferometer, and the output patterns are sampled in the photon number basis. They show that the probability of measuring a specific output distribution of a Gaussian input state is related to the hafnian, a matrix function more general than the permanent which resides in the #P complexity class [12], [13]. With this result, Hamilton et al. prove that the GBS protocol resides in #P along with the approximate sampling problem. This protocol differs above all in that the sampling matrix describes not only the action of the interferometer, but also the shape of the Gaussian input state. This implies that a coherent superposition of all n-photon patterns from the input can be used and no exact input pattern must be heralded as in other protocols. As such, GBS increases photon generation probability relative to standard boson sampling protocols which use single photon Fock states [10]. Furthermore, GBS reduces the size of the sampling space by a factor of $\binom{N^2}{N}$ compared to Scattershot Boson sampling, thereby significantly advancing experimental possibilities. While the Boson Sampling problem with Gaussian states has not definitively been assigned a complexity class, it has been shown that the special case of sampling from a multimode thermal state resides in $\mathrm{BPP}^{\mathrm{NP}}$ [12]. In [13], Kruse et al. built upon the work of Hamilton by adjusting the protocol to account for displaced squeezed states and higher-order photon number contributions.

Since the conception of GBS, a number of experimental implementations have advanced the study of the protocol. Most notably, Zhong et al. [14] used a photonic quantum computer, Jiuzhang, to execute the GBS protocol with 50 indistinguishable single-mode squeezed states and a 100-mode ultralow-loss interferometer with full connectivity and sampled the output using 100 high-efficiency single-photon detectors. Jiuzhang registered up to 76-photon coincidences, with an output state space dimension of $10^{30}$, and outpaced classical state-of-the-art simulation on supercomputers by a factor of $10^{14}$.

Significant progress has also been achieved in the classical simulation of GBS. Bulmer et al. [15] present a classical GBS simulation method using threshold detectors, which demonstrates a speedup of nine orders of magnitude over previous classical algorithms that employ photon-number-resolving detectors. The novel GBS simulation using threshold detectors was applied to two separate sampling algorithms, a probability chain rule method and Metropolis independence sampling, and was able to simulate the GBS protocol with up to 92 photons and 100 modes, reducing computation time from 600 million years to a matter of months. However, such an approach only proves useful for verification purposes, as state-of-the-art GBS setups, such as Jiuzhang, require only minutes for the same computation.

B. Quantum-Enhanced Markov Chain Monte Carlo

Markov Chain Monte Carlo (MCMC) is a statistical approach for generating random samples from a target probability distribution. The basic idea of Markov Chains is to start from an initial state and repeatedly jump to new states according to a transition rule. This allows for a computationally inexpensive estimation of various statistics (e.g., mean, variance) of the target distribution. In various parts of physics, MCMC methods are widely used to estimate observables of statistical systems whose probability distributions are inaccessible through direct computation [16]. Furthermore, they are used for sampling from Boltzmann distributions [17] (see also Sec. III-D) and for combinatorial optimization using the simulated annealing heuristic [18].

Sampling from the Boltzmann distribution of a classical Ising spin-glass at low temperatures is known to be a hard problem [19]. The probability of a certain classical spin string $s \in \{\pm 1\}^N$ is given by

$$p(s) = \frac{1}{Z} e^{-\beta E(s)}, \tag{1}$$

where β is the reciprocal temperature given by $\beta^{-1} = k_B T$, with $k_B$ being the Boltzmann constant. The energy of the general Ising system [20] is given by

$$E(s) = -\sum_{i<j} J_{ij} s_i s_j - \sum_{i} h_i s_i. \tag{2}$$

The partition function Z is defined as the sum over the Boltzmann factors, $Z = \sum_{\{s\}} e^{-\beta E(s)}$. Although the Boltzmann
factors are easy to compute independently, the partition function is not, because of the exponential number of summands. Without specific domain knowledge and advanced analytical methods, the calculation is intractable.

The Metropolis-Hastings algorithm is an MCMC method to sample from many distributions, like this Boltzmann distribution. Essentially, we start at a random spin configuration and then propose update steps. These are accepted or rejected according to the transition probabilities [21], [22]. The acceptance probability of a proposed update $s \to s'$ is given by

$$A(s, s') = \min\left(1, \frac{p(s')\, Q(s \mid s')}{p(s)\, Q(s' \mid s)}\right), \tag{3}$$

with $Q(s' \mid s)$ being the transition probability of the system moving from s to s′. These transition probabilities are chosen such that the samples from this procedure resemble the demanded probability distribution.

Commonly, the transition probabilities are chosen symmetrically, meaning that the probability is the same no matter if the state is moving from $s \to s'$ or the other way round. In the acceptance probability (3) the expression then reduces to just the ratio of the state probabilities, $p(s')/p(s)$. Furthermore, note that Z cancels in that ratio, so we never need to calculate the partition function explicitly. A metric to measure how well the Markov chain samples through the probability distribution is the acceptance ratio α, i.e., the number of accepted proposals compared to the number of all trials. Theoretical investigations suggest an optimal value of α = 0.234 for random walk problems [23]. However, most importantly, the acceptance ratio should not drop close to zero in order to maintain a good sampling quality. Small acceptance rates are an indication that the chain gets stuck in a trench in the energy landscape.

Typically, update-proposal strategies comprise
• Local spin flips: Here, just a random spin is chosen to be flipped. In many scenarios, mainly where the energy landscape is very rugged, single spin flips cannot get the chain out of deep trenches, as also depicted in Fig. 2. Since only a limited number of proposals are available, if all of them are unlikely to be accepted due to higher energy, the chain gets stuck, and the acceptance rate drops.
• Uniform updates: Now, update candidates s′ are chosen randomly. In higher temperature simulations, this strategy works fine and is able to traverse the whole state space relatively quickly, which means fast convergence of the Markov chain. However, the acceptance rate drops rapidly as the temperature decreases [24], see Fig. 2.
• Cluster updates: In the phase transition between the magnetized and disordered state of the Ising model, generally, ordered patches emerge in the material. For this reason, cluster update proposals have been introduced [25], [26]. While they are able to explore the state space rather quickly with high acceptance rates, they only work in critical phases of the material and similarly lose their advantage as soon as the temperature falls significantly below the critical point.

[Figure 2: plot of E(s) against s, with proposals labelled Local, Uniform and Quantum.]
Fig. 2. Visualization of different candidate proposal techniques. The local one achieves relatively good acceptance rates but does not explore the state space. Uniform updating tries to explore the state space but struggles with acceptance since the proposed state most likely has high energy. However, the discussed quantum proposal routine samples states that are far away in the state space while having comparable energy, thus also leading to high acceptance rates.

In general, sampling from low-temperature Ising spin glasses with Markov chains is plagued by slow convergence and, therefore, long runtimes.

To counteract this problem, Layden et al. [24] suggested a quantum routine to propose updates. The quantum register is prepared in the state of the current MCMC chain and undergoes an arbitrarily chosen unitary evolution. The authors of [24] chose to use a time evolution of the problem Hamiltonian paired with a mixer Hamiltonian known from the Quantum Alternating Operator Ansatz (QAOA) [27], [28]. The joint Hamiltonian can be expressed through

$$H_{\mathrm{int}} = (1 - \gamma)\,\alpha \sum_{s} E(s)\, |s\rangle\langle s| - \gamma \sum_{i} \sigma_i^x, \tag{4}$$

where α is a normalizing factor and $\gamma \in [0, 1]$ controls the strength of the quantum transitions by increasing the effect of the mixer.

The output of measuring the state

$$\exp(-i H_{\mathrm{int}} t)\, |s\rangle \approx \prod^{T} \exp(-i H_{\mathrm{int}} t / T)\, |s\rangle \tag{5}$$

is the proposed spin configuration, where the Hamiltonian is time-evolved using Trotterization. Important to note is the symmetry of the transition probability, $|\langle s| U |s'\rangle| = |\langle s'| U |s\rangle|$, using this approach. Yet, there are still two free parameters γ and t that need to be set. To circumvent the need for tuning them, [24] chose to sample the parameters of each Monte Carlo iteration randomly, decreasing the bias of a constant setting. The achieved effect, compared to local and uniform updating, is visualized in Fig. 2.

The authors of [24] have found a significant improvement in convergence speed when compared to local and uniform updating procedures. Furthermore, with clever error mitigation in use, they have also been able to observe a performance increase when running on quantum hardware. The gain was not as big as the simulations suggested; nevertheless, faster convergence than with local and uniform updates has been achieved. Remarkably, the proposed algorithm never miscalculates a quantity based on quantum imperfections since the precise values are computed using the Metropolis algorithm. The quantum routine only produces update proposals, which still need to be accepted in order to be included in the computation. The quantum routine only helps with increasing the acceptance rate and the exploration speed, leading to faster convergence.

How the authors [24] deal with the parameters in the time evolution surely removes bias but is not the ideal setting. Thus

Hamiltonian efficiently using MC sampling:

$$E(\Omega) = \langle \psi_\Omega | H | \psi_\Omega \rangle \tag{6}$$

$$= \sum_{a} \sum_{\{x\},\{x'\}} \psi_\Omega^{*}(x)\, \psi_\Omega(x')\, \langle x | H_a | x' \rangle \tag{7}$$
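As a small numerical check of the step from Eq. (6) to Eq. (7), the following sketch evaluates both expressions for a toy system and compares them. It assumes that the Hamiltonian decomposes as H = Σ_a H_a, as the index a in Eq. (7) suggests. The two-qubit terms in `terms`, the random state standing in for ψ_Ω, and all variable names are illustrative choices that are not taken from the paper.

```python
import numpy as np

# Pauli matrices and the identity (standard definitions).
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

# Hypothetical two-qubit Hamiltonian written as a sum of terms H_a.
# The specific terms are an assumption made only for this sketch.
terms = [np.kron(Z, Z), 0.5 * np.kron(X, I2), 0.5 * np.kron(I2, X)]
H = sum(terms)

# Stand-in for psi_Omega: a random, normalized complex state vector.
rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

# Eq. (6): E = <psi|H|psi>. np.vdot conjugates its first argument.
energy_direct = np.vdot(psi, H @ psi).real

# Eq. (7): explicit sum over the terms a and the basis states x, x'.
energy_sum = 0.0
for H_a in terms:
    for x in range(4):
        for xp in range(4):
            energy_sum += (np.conj(psi[x]) * psi[xp] * H_a[x, xp]).real

print(energy_direct, energy_sum)
```

Both values agree up to floating-point rounding. The explicit double loop is exponential in the number of qubits, which is why a sampling-based estimate becomes attractive for larger systems.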