Dr. Z.-A. Jia, Prof. Y.-C. Wu, Prof. G.-C. Guo, Prof. G.-P. Guo
Key Laboratory of Quantum Information, Chinese Academy of Sciences, School of Physics, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
E-mail: [email protected]; [email protected]

Dr. Z.-A. Jia, Prof. Y.-C. Wu, Prof. G.-C. Guo, Prof. G.-P. Guo
CAS Center For Excellence in Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China

Dr. Z.-A. Jia
Microsoft Station Q and Department of Mathematics, University of California, Santa Barbara, CA 93106-6105, USA

B. Yi
Department of Mathematics, Capital Normal University, Beijing 100048, P. R. China

R. Zhai
Department of Engineering Physics, Institute of Technical Physics, Tsinghua University, Beijing 100084, P. R. China

Prof. G.-P. Guo
Origin Quantum Computing, Hefei, Anhui 230026, P. R. China

The ORCID identification number(s) for the author(s) of this article can be found under https://doi.org/10.1002/qute.201800077

DOI: 10.1002/qute.201800077
that we put neurons at both the input and output ends. These input and output neurons are not neurons of the kind introduced previously; their form depends on the learning problem, and there may or may not be activation functions associated with them. In what follows, we briefly introduce the feed-forward and convolutional neural networks and the Boltzmann machine.
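As a small illustration (ours, not part of the original paper), the activation functions collected in Table 1 below can be written in a few lines of NumPy; the function names are our own.

```python
import numpy as np

def logistic(x):            # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):                # ReLU(x) = max{0, x}
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):      # x for x >= 0, alpha*(e^x - 1) otherwise
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1.0))

def softplus(x):            # SP(x) = ln(e^x + 1)
    return np.log1p(np.exp(x))

def softmax(x):             # acts on a vector, usually in the final layer
    z = np.exp(x - np.max(x))   # shift for numerical stability
    return z / z.sum()
```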
Table 1. Some popular activation functions.
Logistic function: f(x) = 1/(1 + e^{-x})
tanh: tanh(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})
cos: cos(x)
Softmax a): σ(x)_j = e^{x_j} / Σ_i e^{x_i}
Rectified linear unit: ReLU(x) = max{0, x}
Exponential linear unit: ELU(x) = x for x ≥ 0, and α(e^{x} - 1) for x < 0
Softplus: SP(x) = ln(e^{x} + 1)
a) The softmax function acts on vectors x; it is usually used in the final layer of the neural network.

training data and test data, respectively; here y_i (resp. t_i) is the label of x_i (resp. z_i). Our aim is to find the weights and biases of the neural network such that the network output y(x_i) (which depends on the network parameters Ω = {w_ij, b_i}) approximates y_i for all training inputs x_i. To quantify how well the neural network approximates the given labels, we introduce a cost function, which measures the difference between y(x_i) and y_i,

C(Ω) := Σ_i C(y(x_i), y_i) = (1/2N) Σ_{i=1}^{N} ‖y(x_i) - y_i‖²   (3)

where N denotes the number of data in the training set, Ω the set of network parameters w_ij and b_i, and the sum runs over all data in the training set. Because we choose the quadratic norm here, the cost function is called quadratic. Our aim now is to minimize the cost function, viewed as a multivariable function of the network parameters, so that C(Ω) ≈ 0; this can be done by the well-known gradient descent method.

The intuition behind gradient descent is that we can regard the cost function as the height of a landscape whose locations are labeled by the network parameters. Our aim is to walk downhill repeatedly from some initial place (a given configuration of the neural network) until we reach the lowest point. Formally, starting from a given configuration of the neural network, that is, given parameters w_ij and b_i, the gradient descent algorithm repeatedly computes the gradient ∇C = (∂C/∂w_ij, ∂C/∂b_i). The updating formulae are given by

w_ij → w'_ij = w_ij - η ∂C/∂w_ij   (4)

b_i → b'_i = b_i - η ∂C/∂b_i   (5)

where η is a small positive parameter known as the learning rate.

In practice, there are many difficulties in applying the gradient method to train a neural network. A modified form, stochastic gradient descent, is usually used to speed up the training process. In the stochastic gradient method, sampling over the training set is introduced; that is, we randomly choose N samples S = {(X_1, Y_1), ..., (X_N, Y_N)} such that the average gradient of the cost function over S roughly equals the average gradient over the whole training set. The updating formulae are accordingly modified as

w_ij → w'_ij = w_ij - (η/N) Σ_{i=1}^{N} ∂C(X_i)/∂w_ij   (6)

b_i → b'_i = b_i - (η/N) Σ_{i=1}^{N} ∂C(X_i)/∂b_i   (7)

where C(X_i) = ‖y(X_i) - Y_i‖²/2 is the cost function over the training input X_i.

The test data T is usually chosen differently from S; when the training process is done, the test data are used to assess the performance of the neural network, which for many traditionally difficult problems (such as classification and recognition) is very good. As discussed later, the feed-forward neural network and many other neural networks also work well in approximating quantum states,[49,50] this being the main theme of this paper.

2.1.2. Convolutional Neural Network

Convolutional neural networks are another important class of neural network and are most commonly used to analyze images. A typical convolutional neural network consists of a sequence of different interleaved layers, including convolutional layers, pooling layers, and a fully connected layer. Through a differentiable function, every layer transforms the former layer's data (usually pixels) into a new set of data (pixels).

For regular neural networks, each neuron is fully connected with the neurons in the previous layer. However, for the convolutional layer of a convolutional neural network, the neurons connect only with neurons in a local neighborhood of the previous layer. More precisely, in a convolutional layer, the new pixel values of the kth layer are obtained from the (k - 1)th layer by a filter that determines the size of the neighborhood, which gives v^{(k)}_{ij} = Σ_{p,q} w_{ij;pq} v^{(k-1)}_{p,q}, where the sum runs over the neurons in the local neighborhood of v^{(k)}_{ij}. After the filter scans the whole image (all pixel values), a new image (a new set of pixel values) is obtained. Pooling layers are usually inserted periodically between successive convolutional layers, and their function is to reduce the size of the data. For example, max (or average) pooling chooses the maximum (or average) value of the pixels of the previous layer contained in the filter. The last, fully connected layer is the same as the one in a regular neural network and outputs a class label used to determine which class the image is categorized in.

The weights and biases of the convolutional neural network are learnable parameters, whereas variables such as the size of the filter and the number of interleaved convolutional and pooling layers are usually fixed. The convolutional neural network performs well in classification-type machine learning tasks such as image recognition.[18,51,52] As has been shown numerically,[53] the convolutional neural network can also be used to build quantum many-body states.
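To make the training loop concrete, the following minimal NumPy sketch (our illustration, not code from the paper) implements the quadratic cost of Equation (3) and the mini-batch updates of Equations (6) and (7) for a single linear layer; all names and shapes are illustrative assumptions.

```python
import numpy as np

def forward(W, b, X):
    """Single-layer network output y(X) = W X + b (activation omitted for brevity)."""
    return X @ W.T + b

def sgd_step(W, b, batch_X, batch_Y, eta=0.1):
    """One stochastic-gradient update, Equations (6) and (7), for the quadratic cost (3)."""
    N = batch_X.shape[0]
    err = forward(W, b, batch_X) - batch_Y      # y(X_i) - Y_i
    cost = 0.5 * np.mean(np.sum(err**2, axis=1))
    grad_W = err.T @ batch_X / N                # (1/N) sum_i dC(X_i)/dW
    grad_b = err.mean(axis=0)                   # (1/N) sum_i dC(X_i)/db
    return W - eta * grad_W, b - eta * grad_b, cost
```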
Now we introduce another special type of artificial neural network, the Boltzmann machine (BM, also known as the stochastic Hopfield network with hidden units), which is an energy-based neural network model.[54,55] BMs have recently been introduced into many different physical areas,[26,27,29,31,38,39,56-61] and their quantum versions, quantum BMs, have also been investigated.[62] As the BM is very similar to the classical Ising model, here we explain the BM neural network by frequently referring to the terminology of the Ising model. Notice that the BM is very different from the perceptron and the logistic neural network, as it does not treat each neuron individually; therefore, there is no activation function attached to each specific neuron. Instead, the BM treats the neurons as a whole.

Given a graph G with vertex set V(G) and edge set E(G), the neurons s_1, ..., s_n (spins in the Ising model) are put on the vertices, n = |V(G)|. If two vertices i and j are connected, there is a weight w_ij (coupling constant in the Ising model) between the corresponding neurons s_i and s_j. For each neuron s_i, there is also a corresponding local bias b_i (local field in the Ising model). As is done for the Ising model, for each series of input values s = (s_1, ..., s_n) (spin configuration in the Ising model), we define its energy as

E(s) = - Σ_{⟨ij⟩∈E(G)} w_ij s_i s_j - Σ_i s_i b_i   (8)

Up to now, everything is just as for the Ising model; no new concepts or techniques are introduced. The main difference is that the BM construction introduces a coloring on each vertex: each vertex receives a label, hidden or visible. We assume the first k neurons are hidden neurons, denoted by h_1, ..., h_k, and the remaining l neurons are visible neurons, denoted by v_1, ..., v_l, with k + l = n. The energy is therefore now E(h, v). The BM is a parametric model of a joint probability distribution between the variables h and v, with the probability given by

p(h, v) = e^{-E(h,v)} / Z   (9)

where Z = Σ_{h,v} e^{-E(h,v)} is the partition function.

The general BM is very difficult to train, and therefore a restricted architecture of the BM is introduced. The restricted BM (RBM) was initially invented by Smolensky[63] in 1986. In the RBM, it is assumed that the graph G is bipartite: the hidden neurons only connect with visible neurons, and there are no intra-layer connections. This kind of restricted structure makes the neural network easier to train, and it has therefore been extensively investigated and used.[26,27,29,31,38,39,56-61] The RBM can approximate every discrete probability distribution.[64,65]

The BM is most notably a stochastic recurrent neural network, whereas the perceptron and the logistic neural network are feed-forward neural networks. There are many other types of neural networks; for a more comprehensive list, see textbooks such as refs. [66,67]. The BM is crucial in quantum neural network states, and hence its neural network states are also the most studied. In later sections, we shall discuss the physical properties of the BM neural network states and their applications.

Tensor networks are certain contraction patterns of tensors, which play an important role in many scientific areas such as condensed matter physics, quantum information and quantum computation, computational physics, and quantum chemistry.[3-7,12] We discuss some details of tensor networks later in this review; here, we only comment on the connection between tensor networks and machine learning.

Many different tensor network structures have been developed over the years for solving different problems, such as matrix product states (MPS),[68-70] projected entangled pair states (PEPS),[14] the multiscale entanglement renormalization ansatz (MERA),[16] branching MERA,[71,72] tree tensor networks,[73] matrix product operators,[74-77] projected entangled pair operators,[78-80] and continuous tensor networks.[81-83] A large number of numerical algorithms based on tensor networks are now available, including the density-matrix renormalization group,[13] the folding algorithm,[15] entanglement renormalization,[16] time-evolving block decimation,[17] and the tangent space method.[84]

One of the most important properties that empowers tensor networks is that entanglement is much easier to treat in this representation. Many studies have appeared in recent years indicating that tensor networks have a close relationship with state-of-the-art neural network architectures. On the theory side, machine learning architectures were shown in ref. [85] to be understandable via tensor networks and their entanglement patterns. In practical applications, tensor networks can also be used for many machine-learning tasks, for example, performing learning tasks by optimizing an MPS,[86,87] preprocessing datasets with layered tree tensor networks,[88] classifying images via MPS and tree tensor networks,[86-89] and realizing quantum machine learning via tensor networks.[90] Both tensor networks and neural networks can be applied to represent quantum many-body states; the differences and connections between the two kinds of representations are extensively explored in several works.[27,29,38,40,43,60,91] We shall review some of this progress in detail in Section 3.

2.2. Representational Power of Neural Networks

Next we comment on the representational power of neural networks, which is important for understanding the representational power of quantum neural network states. In 1900, Hilbert formulated his famous list of 23 problems, among which the thirteenth problem is devoted to the possibility of representing an n-variable function as a superposition of functions of a smaller number of variables. This problem is closely related to the representational power of neural networks. Kolmogorov[92,93] and Arnold[94] proved that for continuous n-variable functions this is indeed the case. The result is known as the Kolmogorov-Arnold representation theorem (alternatively, the Kolmogorov superposition theorem):

Theorem 1. Any n-variable real continuous function f : [0, 1]^n → R expands as sums and compositions of continuous univariate functions; more precisely, there exist real positive numbers a, b, λ_p, λ_{p,q}
and a real monotonic increasing function φ : [0, 1] → [0, 1] such that

f(x_1, ..., x_n) = Σ_{q=1}^{2n+1} F( Σ_{p=1}^{n} λ_p φ(x_p + aq) + bq )   (10)

Theorem 3. Any discrete probability distribution p : B^n → R_{≥0} can be approximated arbitrarily well, in the metric of the Kullback-Leibler divergence, by an RBM with k + 1 hidden neurons, where k = |supp(p)| is the cardinality of the support of p (i.e., the number of vectors with non-zero probabilities).

The theorem states that any discrete probability distribution can be approximated by the RBM. The bound on the number of hidden neurons was later improved.[65]

Here we must stress that these representation theorems only guarantee that the given function or probability distribution can be represented by the neural network. In practice, the number of parameters to be learned cannot be too large relative to the number of input neurons when we build a neural network. If a neural network can represent a function or distribution in polynomial time (the number of parameters depends polynomially on the number of input neurons), we say that the representation is efficient.

E(Ω) = ⟨Ψ(Ω)|H|Ψ(Ω)⟩ / ⟨Ψ(Ω)|Ψ(Ω)⟩   (13)

In accordance with the variational method, the aim now is to minimize the energy functional and obtain the corresponding parameter values, with which the (approximate) ground state is obtained. The process of adjusting the parameters and finding the minimum of the energy functional is performed using neural network learning (see Figure 2). Alternatively, if an appropriate dataset exists, we can also build the quantum neural network states by standard machine learning procedures rather than by minimizing the energy functional. We first build a neural network with learnable parameters Ω and then train the network with the available dataset. Once the training process is completed, the
parameters of the neural network are fixed, and we thereby obtain the corresponding approximate quantum states.

Figure 2. Schematic diagram for the neural network ansatz state.

The notion of the efficiency of the neural network ansatz in representing a quantum many-body state is defined through the dependence of the number of non-zero parameters |Ω| involved in the representation on the number of physical particles N: if |Ω| = O(poly(N)), the representation is called efficient. The aim when solving a given eigenvalue equation is therefore to build a neural network for which the ground state can be represented efficiently.

To obtain quantum neural network states from the above construction, we first need to make the neural network a complex neural network, specifically, to use complex parameters and to output complex values. In practice, some neural networks may have difficulty outputting complex values. Therefore, we need another way to build a quantum neural network state |Ψ⟩. We know that the wavefunction Ψ(v) can be written as Ψ(v) = R(v)e^{iθ(v)}, where the amplitude R(v) and the phase θ(v) are both real functions; hence, we can represent them by two separate neural networks with parameter sets Ω_1 and Ω_2. The quantum state is then determined by the two networks together.

The first neural network state we consider is the logistic neural network state, where the weights and biases now must be chosen as complex numbers and the activation function f(z) = 1/(1 + e^{-z}) is also a complex function. As shown in Figure 3, we take the two-qubit state as an example. We assume the biases are b_1, ..., b_4 for the hidden neurons h_1, ..., h_4, respectively; the weights between neurons are denoted by w_ij. We construct the state coefficient neuron by neuron next.

In Figure 3, the output of h_i, i = 1, 2, 3 is y_i = f(v_1 w_{1i} + v_2 w_{2i} - b_i), respectively. These outputs are transmitted to h_4; after acting with h_4, we get the state coefficient,

Ψ_log(v_1, v_2, Ω) = f(w_{14} y_1 + w_{24} y_2 + w_{34} y_3 - b_4)   (14)

where Ω = {w_ij, b_i}. Summing over all possible input values, we obtain the quantum state |Ψ_log(Ω)⟩ = Σ_{v_1,v_2} Ψ_log(v_1, v_2, Ω)|v_1, v_2⟩, up to a normalization factor. We see that the logistic neural network states have a hierarchical iteration control structure that is responsible for the representation power of the network in representing states. However, when we want to give the neural network parameters of a given state |Ψ⟩ explicitly, we find that f(z) = 1/(1 + e^{-z}) cannot exactly take the values zero and one, as these are the asymptotic
values of f. This shortcoming can be remedied by smoothing a step function in another way. Here we give a real-function solution; the complex case can be done similarly. The idea is very simple: we cut the function into pieces and then glue them together in some smooth way. Suppose that we want to construct a smooth activation function F(x) such that

F(x) = 0 for x ≤ -a/2;  F(x) ∈ (0, 1) for -a/2 < x < a/2;  F(x) = 1 for x ≥ a/2   (15)

We can choose a kernel function

K(x) = 4x/a² + 2/a for -a/2 ≤ x ≤ 0;  K(x) = 2/a - 4x/a² for 0 ≤ x < a/2   (16)

The required function can then be constructed as

F(x) = ∫_{x-a/2}^{x+a/2} K(x - t) s(t) dt   (17)

As Ψ(v) = R(v)e^{iθ(v)}, we can also represent the amplitude and phase by two neural networks separately, as R(Ω_1, v) and θ(Ω_2, v), where Ω_1 and Ω_2 are the two respective parameter sets of the neural networks. This approach is used in representing a density operator by purification, to be discussed in Section 4.

For the BM states, we notice that the classical BM networks can approximate a discrete probability distribution. The quantum state coefficient Ψ(v) is the square root of the probability distribution and therefore should also be able to be represented by the BM. This is one reason why the BM states are introduced as a representation of quantum states. Here we first treat instances of fully connected BM states (Figure 3); instances for the RBM and DBM are similar. As in the logistic states, the weights and biases of the BM are now complex numbers. The energy function is defined as

E(h, v) = -( Σ_i a_i v_i + Σ_j b_j h_j + Σ_{i,j} w_{ij} v_i h_j + Σ_{j,j′} w_{jj′} h_j h_{j′} + Σ_{i,i′} w_{ii′} v_i v_{i′} )   (18)

where a_i and b_j are the biases of visible neurons and hidden neurons, respectively; w_{ij}, w_{jj′}, and w_{ii′} are connection weights. The state coefficients are now

Ψ_BM(v, Ω) = Σ_{h_1} ··· Σ_{h_l} e^{-E(h,v)} / Z   (19)

with Z = Σ_{v,h} e^{-E(h,v)} the partition function, and the sum runs over all possible values of the hidden neurons. The quantum state is |Ψ_BM(Ω)⟩ = Σ_v Ψ_BM(v, Ω)|v⟩ / N, where N is the normalizing factor.

Because the fully connected BM states are extremely difficult to train in practice, the more commonly used ones are the RBM states, for which there is one hidden layer and one visible layer and there are no intra-layer connections [hidden (resp. visible) neurons do not connect with hidden (resp. visible) neurons]. In this instance, the energy function becomes

E(h, v) = - Σ_i a_i v_i - Σ_j b_j h_j - Σ_{i,j} v_i W_{ij} h_j = - Σ_i a_i v_i - Σ_j h_j ( b_j + Σ_i v_i W_{ij} )   (20)

Then the wavefunction is

Ψ(v, Ω) ∼ Σ_{h_1} ··· Σ_{h_l} exp[ Σ_i a_i v_i + Σ_j h_j ( b_j + Σ_i v_i W_{ij} ) ]   (21)

These are the RBM states.

The DBM has more than one hidden layer; indeed, as has been shown in ref. [29], any BM can be transformed into a DBM with two hidden layers. Hence, we shall only be concerned with the DBM with two hidden layers. The wavefunction is written explicitly as

Ψ(v, Ω) ∼ Σ_{h_1} ··· Σ_{h_l} Σ_{g_1} ··· Σ_{g_q} exp[ -E(v, h, g) ] / Z   (22)

where the energy function is now of the form E(v, h, g) = - Σ_i v_i a_i - Σ_k g_k c_k - Σ_j h_j b_j - Σ_{i,j} W_{ij} v_i h_j - Σ_{j,k} W′_{kj} h_j g_k. It is also difficult to train the DBM in general, but the DBM states have a stronger representational power than the RBM states; the details are discussed in the next subsection.
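As an illustration (ours, not from the paper), the binary hidden units in Equation (21) can be summed out analytically, giving Ψ(v) ∝ exp(Σ_i a_i v_i) Π_j (1 + exp(b_j + Σ_i v_i W_ij)); the sketch below evaluates this RBM amplitude with complex parameters and estimates the energy functional of Equation (13) by brute-force enumeration, using a randomly generated Hermitian matrix as a stand-in "Hamiltonian".

```python
import numpy as np
from itertools import product

def rbm_amplitude(v, a, b, W):
    """Unnormalized RBM amplitude, Eq. (21), with the binary hidden units summed out:
    Psi(v) = exp(sum_i a_i v_i) * prod_j (1 + exp(b_j + sum_i v_i W_ij))."""
    return np.exp(a @ v) * np.prod(1.0 + np.exp(b + v @ W))

def variational_energy(H, a, b, W, n):
    """Energy functional E = <Psi|H|Psi> / <Psi|Psi> of Eq. (13), evaluated by
    enumerating all 2^n visible configurations (only feasible for very small n)."""
    configs = [np.array(c) for c in product([0, 1], repeat=n)]
    psi = np.array([rbm_amplitude(v, a, b, W) for v in configs])
    return (psi.conj() @ H @ psi) / (psi.conj() @ psi)

# Toy usage: 3 visible spins, 2 hidden units, random complex parameters
rng = np.random.default_rng(0)
n, m = 3, 2
a = 0.1 * (rng.normal(size=n) + 1j * rng.normal(size=n))
b = 0.1 * (rng.normal(size=m) + 1j * rng.normal(size=m))
W = 0.1 * (rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m)))
Hmat = rng.normal(size=(2**n, 2**n)); Hmat = Hmat + Hmat.T   # placeholder Hermitian matrix
print(variational_energy(Hmat, a, b, W, n))
```

In a realistic calculation the exhaustive sum is replaced by Monte Carlo sampling of |Ψ(v)|², and the parameters are then updated by gradient-based optimization of E(Ω).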
Ising model and the antiferromagnetic Heisenberg model efficiently,[26] many researchers have studied their representation power. We now know that RBMs are capable of representing many different classes of states.[28,29,38,59] Unlike their unrestricted counterparts, RBMs allow efficient sampling, and they are also the most studied case. The DBM states have also been explored in various works.[29,42,43] In this section, we briefly review the progress in this direction.

We first list some known classes of states that can be efficiently represented by an RBM: Z2-toric code states;[28] graph states;[29] stabilizer states with generators of pure type, S_X, S_Y, S_Z, and their arbitrary union;[38] perfect surface code states, as well as surface code states with boundaries, defects, and twists;[38] Kitaev's D(Z_d) quantum double ground states;[38] arbitrary stabilizer code states;[40] ground states of the double semion model and of the twisted quantum double models;[41] states of the Affleck-Lieb-Kennedy-Tasaki model and the 2D CZX model;[41] states of Haah's cubic code model;[41] and the generalized-stabilizer and hypergraph states.[41] An algorithmic way to obtain the RBM parameters of the stabilizer code state for an arbitrary given stabilizer group S has also been developed.[40]

Although many important classes of states may be represented by the RBM, there is a crucial result regarding a limitation:[29] there exist states that can be expressed as PEPS[102] but cannot be efficiently represented by an RBM; moreover, the class of RBM states is not closed under unitary transformations. One way to remedy this defect is to add one more hidden layer, that is, to use the DBM.

The DBM can efficiently represent physical states including:
• Any state that can be efficiently represented by RBMs;[103]
• Any n-qubit quantum state generated by a quantum circuit of depth T; the number of hidden neurons is O(nT);[29]
• Tensor network states consisting of n local tensors with bond dimension D and maximum coordination number d; the number of hidden neurons is O(nD^{2d});[29]
• The ground states of Hamiltonians with gap Δ; the number of hidden neurons is O((m²/Δ)(n - log ε)), where ε is the representational error.[29]

3.2. Tensor Network States

Let us now introduce a closely related representation of quantum many-body states, the tensor network representation, which was originally developed in the context of condensed matter physics based on the idea of the renormalization group. Tensor network states now have applications in many different scientific fields. Arguably the most important property of tensor network states is that entanglement is much easier to read out than in other representations.

Although there are many different types of tensor networks, we focus here on the two simplest and most easily accessible ones, the MPS and the PEPS. For other, more comprehensive reviews, see refs. [3-7,12].

By definition, a rank-n tensor is a complex variable with n indices, for example A_{i_1,i_2,...,i_n}. The number of values that an index i_k can take is called the bond dimension of i_k. The contraction of two tensors is a new tensor, defined as the sum over any number of pairs of indices; for example, C_{i_1,...,i_p,k_1,...,k_q} = Σ_{j_1,...,j_l} A_{i_1,...,i_p,j_1,...,j_l} B_{j_1,...,j_l,k_1,...,k_q}. A tensor network is a set of tensors for which some (or all) of the indices are contracted.

It is quite convenient to represent a tensor network graphically. The corresponding diagram is called a tensor network diagram, in which a rank-n tensor is represented as a vertex with n edges; for example, a scalar is just a vertex, a vector is a vertex with one edge, and a matrix is a vertex with two edges:

(23)

The contraction is graphically represented by connecting two vertices with the same edge label. For two vectors and for matrices, this corresponds to the inner product and the matrix product, respectively. Graphically, they look like

(24)

(25)

How can we use the tensor network to represent a many-body quantum state? The idea is to regard the wavefunction Ψ(v_1, ..., v_n) = ⟨v|Ψ⟩ as a rank-n tensor Ψ_{v_1,...,v_n}. In some cases, the tensor wavefunction can be broken into small pieces, specifically, a contraction of some small tensors, for example, Ψ_{v_1,...,v_n} = Σ_{α_1,...,α_n} A^{[1]}_{i_1;α_n α_1} A^{[2]}_{i_2;α_1 α_2} ··· A^{[n]}_{i_n;α_{n-1} α_n}. Graphically, we have

(26)

where each A^{[k]}_{i_k;α_{k-1} α_k} is a local tensor depending only on some subset of the indices {v_1, ..., v_n}. In this way, physical properties such as entanglement are encoded into the contraction pattern of the tensor network diagram. It turns out that this kind of representation is very powerful in solving many physical problems.

There are several important tensor network structures. We take two prototypical tensor network states used for 1d and 2d systems, the MPS[68-70] and PEPS[14] states, as examples to illustrate the construction of tensor-network states. In Table 2, we list some of the most popular tensor-network structures, including MPS, PEPS, MERA,[16] branching MERA,[71,72] and tree tensor networks,[73] together with their main physical properties, such as correlation length and entanglement entropy. For more examples, see refs. [3-7,12].
Table 2. Some popular tensor-network structures and their main physical properties.
Tensor network structure | Entanglement entropy S(A) | Correlation length ξ | Local observable Ô | Diagram
Branching multiscale entanglement renormalization ansatz (1d) | O(log |∂A|) | Finite/infinite | Exact | (diagram not reproduced)
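As a concrete illustration (ours, not from the paper), the following NumPy sketch contracts the local rank-3 tensors of a periodic-boundary MPS to evaluate an amplitude of the form in Equation (26); shapes and names are illustrative assumptions.

```python
import numpy as np

def mps_amplitude(tensors, config):
    """Contract a periodic-boundary MPS:
    Psi_{v1..vn} = Tr( A^[1][v1] A^[2][v2] ... A^[n][vn] ),
    where tensors[k] has shape (d, D, D): physical index first, then left/right bond indices."""
    d, D, _ = tensors[0].shape
    M = np.eye(D)
    for A, v in zip(tensors, config):
        M = M @ A[v]                  # multiply the D x D matrix selected by v_k
    return np.trace(M)                # close the periodic boundary

# Toy usage: 4 sites, physical dimension 2, bond dimension 3
rng = np.random.default_rng(1)
tensors = [rng.normal(size=(2, 3, 3)) for _ in range(4)]
print(mps_amplitude(tensors, [0, 1, 1, 0]))
```

For open boundary conditions the first and last tensors become rank-2 and the trace is replaced by an ordinary matrix product, as described next.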
A periodic-boundary-condition MPS state is just like the right-hand side of Equation (26), which consists of many local rank-3 tensors. For the open boundary case, the boundary local tensors are replaced with rank-2 tensors, and the inner part remains the same. The MPSs correspond to the low-energy eigenstates of local gapped 1d Hamiltonians.[104,105] The correlation length of the MPS is finite and they obey the entanglement area law; thus they cannot be used for representing quantum states of critical systems that break the area law.[8]

The PEPS state can be regarded as a higher-dimensional generalization of the MPS. Here we give an example of a 2d 3 × 3 PEPS state with open boundary.

regard visible and hidden neurons as tensors. For example, the visible neuron v_i and the hidden neuron h_j are now replaced by

V(i) = [[1, 0], [0, e^{a_i}]]   (28)

H(j) = [[1, 0], [0, e^{b_j}]]   (29)

and the weighted connection between v_i and h_j is also replaced by a tensor

W(ij) = [[1, 1], [1, e^{w_ij}]]   (30)

work trying to use a neural network to solve the Schrödinger equations[34-37] dates back to 2001. Recently, in 2016, Carleo and Troyer made the approach popular for calculating physical quantities of quantum systems.[26] Here we briefly discuss several examples of numerical calculations in many-body physics, including spin systems and bosonic and fermionic systems.

where the first sum runs over all nearest-neighbor pairs. For the 1d case, the system is gapped as long as J ≠ B but gapless when J = B. In ref. [26], Carleo and Troyer demonstrated that the RBM state works very well in finding the ground state of the model. By minimizing the energy E(Ω) = ⟨Ψ(Ω)|H_tIsing|Ψ(Ω)⟩/⟨Ψ(Ω)|Ψ(Ω)⟩ with respect to the network parameters using improved gradient-descent optimization, they showed that the RBM states achieve an arbitrary accuracy for both 1d and 2d systems.

3.3.2. Antiferromagnetic Heisenberg Model

The antiferromagnetic Heisenberg model is of the form

H = J Σ_{⟨ij⟩} S_i · S_j,  (J > 0)   (32)

where the sum runs over all nearest-neighbor pairs. In ref. [26], the calculation for this model is performed for 1d and 2d systems using the RBM states. The accuracy of the neural network ansatz turns out to be much better than the traditional spin-Jastrow ansatz[106] for the 1d system. The 2d system is harder, and more hidden neurons are needed to reach a high accuracy. In ref. [107], a combined approach is presented: the RBM architecture was combined with a conventional variational Monte Carlo method with paired-product (geminal) wave functions to calculate the ground-state energy and ground state. They showed that the combined method has a higher accuracy than that achieved by each method separately.

3.3.3. J_1-J_2 Heisenberg Model

The J_1-J_2 Heisenberg model (also known as the frustrated Heisenberg model) is of the form

this model using the feed-forward neural networks. They used the variational Monte Carlo method to find the ground state for the 1d system and obtained precisions of ≈ O(10^{-3}). Liang and colleagues[53] investigated the model using the convolutional neural network and showed that the precision of the calculation based on the convolutional neural network exceeds the string-bond-state calculation.

H = -t Σ_{⟨ij⟩,σ} ( ĉ†_{i,σ} ĉ_{j,σ} + ĉ†_{j,σ} ĉ_{i,σ} ) + U Σ_i n̂_{i,↑} n̂_{i,↓}   (34)

where the first term accounts for the kinetic energy and the second term for the potential energy; ĉ†_{i,σ} and ĉ_{i,σ} denote the usual creation and annihilation operators, with n̂_{i,σ} = ĉ†_{i,σ} ĉ_{i,σ}. The phase diagrams of the Hubbard model have not been completely determined yet. In ref. [107], Nomura and colleagues numerically analyzed the ground-state energy of the model by combining the RBM and the pair-product-states approach. They showed numerically that the accuracy of the calculation surpasses the many-variable variational Monte Carlo approach when U/t = 4, 8. A modified form of the model, described by the Bose-Hubbard Hamiltonian, was studied in ref. [50] using a feed-forward neural network. The result is in good agreement with the calculations given by exact diagonalization and the Gutzwiller approximation.

Here we have briefly mentioned several important examples of numerical calculations for many-body physical systems. Numerous other numerical works concerning many different physical models have appeared; we refer the interested reader to, for example, refs. [22-29,34-38,49,50,53,107].

4. Density Operators Represented by Neural Network

4.1. Neural Network Density Operator

In realistic applications of quantum technologies, the states that we are concerned with are often mixed because the system is barely isolated from its environment. Mixed states are mathematically characterized by the density operator ρ, which is i) Hermitian, ρ† = ρ; ii) positive semi-definite, ⟨Ψ|ρ|Ψ⟩ ≥ 0 for all |Ψ⟩; and iii) of unit trace, Tr ρ = 1. A pure state |Ψ⟩ provides a representation of the density operator, ρ_Ψ = |Ψ⟩⟨Ψ|.
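As a quick check of these defining properties (our example, not from the paper), the following NumPy sketch builds ρ = |Ψ⟩⟨Ψ| for a random two-qubit pure state and verifies Hermiticity, positive semi-definiteness, and unit trace.

```python
import numpy as np

# Random normalized two-qubit pure state |psi>
rng = np.random.default_rng(2)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

rho = np.outer(psi, psi.conj())                    # rho = |psi><psi|

print(np.allclose(rho, rho.conj().T))              # i) Hermitian
print(np.all(np.linalg.eigvalsh(rho) >= -1e-12))   # ii) positive semi-definite
print(np.isclose(np.trace(rho).real, 1.0))         # iii) trace one
```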
ρ_S = Tr_E |Ψ_SE⟩⟨Ψ_SE|. Every mixed state can be purified in this way.

In ref. [108], Torlai and Melko explored the possibility of representing mixed states ρ_S using the RBM. The idea is the same as that for pure states: we build a neural network with parameters Ω, and for a fixed basis |v⟩ the density operator is given by the matrix entries ρ(Ω, v, v′), which are determined by the neural network. Therefore, we only need to map a given neural network with parameters Ω to a density operator as

ρ(Ω) = Σ_{v,v′} |v⟩ ρ(Ω, v, v′) ⟨v′|   (35)

To this end, the purification method for density operators is used. The environment is represented by some extra hidden neurons e_1, ..., e_m besides the hidden neurons h_1, ..., h_l. The purification |Ψ_SE⟩ of ρ_S is now captured by the parameters of the network, which we still denote as Ω, that is,

|Ψ_SE⟩ = Σ_{v,e} Ψ_SE(Ω, v, e) |v⟩|e⟩   (36)

By tracing out the environment, the density operator is also determined by the network parameters,

ρ_S = Σ_{v,v′} Σ_e Ψ_SE(Ω, v, e) Ψ*_SE(Ω, v′, e) |v⟩⟨v′|   (37)

To represent the density operators, ref. [108] takes the approach of representing the amplitude and phase of the purified state |Ψ_SE⟩ by two separate neural networks. First, the environment units are embedded into the hidden-neuron space; that is, they introduced some new hidden neurons e_1, ..., e_m, which are fully connected to all visible neurons (see Figure 4). The parameters corresponding to the amplitude and phase of the wave function are encoded in the RBM with two different sets of parameters; that is, the state is Ψ_SE(Ω, v, e) = R(Ω_1, a, v) e^{iθ(Ω_2, a, v)} with Ω = Ω_1 ∪ Ω_2. R(Ω_1, a, v) and θ(Ω_2, a, v) are both characterized by the corresponding RBM (this structure is called latent space purification by the authors). In this way, the coefficients of the purified state |Ψ_SE⟩ encoded by the RBM are

Ψ_SE(Ω, v, e) = [ Σ_h e^{-E(Ω_1,v,h,e)} / Z(Ω_1) ]^{1/2} exp[ (i/2) log Σ_h e^{-E(Ω_2,v,h,e)} ]   (38)

where Z(Ω_i) = Σ_h Σ_e Σ_v e^{-E(Ω_i,v,h,e)} is the partition function corresponding to Ω_i. The density operator can now be obtained from Equation (37).

approach circumvents the experimental difficulty and requires only a reasonable number of measurements.[109] The MPS tomography works well for states with low entanglement.[110,111] For general mixed states, the efficiency of the permutationally invariant tomography scheme, which is based on the internal symmetry of the quantum states, is low.[112] Despite all this progress, the general case of quantum state tomography is still very challenging.

The neural network representation of quantum states provides another approach to state tomography. Here we review its basic idea. For clarity (although there will be some overlap), we discuss its application to pure states and to mixed states separately.

Following the work by Torlai and colleagues,[33] for a pure quantum state the neural network tomography works as follows. To reconstruct an unknown state |Ψ⟩, we first perform a collection of measurements {v^(i)}, i = 1, ..., N, and thereby obtain the probabilities p_i(v^(i)) = |⟨v^(i)|Ψ⟩|². The aim of the neural network tomography is to find a set of RBM parameters Ω such that the RBM state Ψ(Ω, v^(i)) mimics the probabilities p_i(v^(i)) as closely as possible in each basis. This can be done in neural network training by minimizing a distance function (total divergence) between |Ψ(Ω, v^(i))|² and p_i(v^(i)). The total divergence is chosen as

D(Ω) = Σ_{i=1}^{N} D_KL[ |Ψ(Ω, v^(i))|² ‖ p_i(v^(i)) ]   (39)

where D_KL[ |Ψ(Ω, v^(i))|² ‖ p_i(v^(i)) ] is the Kullback-Leibler (KL) divergence in the basis {v^(i)}.

Note that to estimate the phase of |Ψ⟩ in the reference basis, a sufficiently large number of measurement bases should be included. Once the training is completed, we get the target state |Ψ(Ω)⟩ in RBM form, which is the reconstructed state for |Ψ⟩. In ref. [33], Torlai and colleagues test the scheme on the W state, the modified W state with local phases, Greenberger-Horne-Zeilinger and Dicke states, and also the ground states of the transverse-field Ising model and the XXZ model. They find that the scheme is very efficient and that the number of measurement bases usually scales only polynomially with the system size.

The mixed-state case is studied in ref. [108] and is based on the RBM representations of density operators. The core idea is the same as for the pure state; that is, to reconstruct an unknown density operator ρ, we need to build an RBM neural network density σ(Ω) with RBM parameter set Ω. Before training the RBM, we must perform a collection of measurements {v^(i)} and obtain the corresponding probability distribution p_i(v^(i)) = ⟨v^(i)|ρ|v^(i)⟩.
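To make Equation (37) concrete, here is a minimal NumPy sketch (our illustration, not the authors' code) that reconstructs ρ_S from purification amplitudes Ψ_SE(v, e) by tracing out the environment index; the array shapes are illustrative.

```python
import numpy as np

def density_from_purification(psi_se):
    """Given amplitudes psi_se[v, e] of a purification |Psi_SE>,
    return rho_S[v, v'] = sum_e psi_se[v, e] * conj(psi_se[v', e]), Eq. (37)."""
    rho = psi_se @ psi_se.conj().T
    return rho / np.trace(rho)          # normalize so that Tr(rho) = 1

# Toy usage: 4 system configurations, 3 environment configurations
rng = np.random.default_rng(3)
psi_se = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))
rho_S = density_from_purification(psi_se)
print(np.allclose(rho_S, rho_S.conj().T), np.isclose(np.trace(rho_S).real, 1.0))
```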
where ρ_A = Tr_{A^c}(|Ψ⟩⟨Ψ|) is the reduced density matrix. If the Rényi entanglement entropy is nonzero, then A and A^c are entangled.

The entanglement property is encoded in the geometry of the contraction patterns of the local tensors for tensor network states. For neural network states, it was shown that the entanglement is encoded in the connection patterns of the neural networks.[27,43,59-61] For RBM states, Deng, Li, and Das Sarma[27] showed that locally connected RBM states obey the entanglement area law; see Figure 5a for an illustration of a local RBM state. Nonlocal connections result in volume-law entanglement of the states.[27] We extended this result to any BM, showing that by cutting the intra-layer connections and adding hidden neurons,

stand what are the practical applications of the quantum computing platforms developed recently in different laboratories. Here we introduce the approach to simulating quantum circuits based on the neural network representation of quantum states.

Following ref. [29], we first discuss how to simulate quantum computing via DBM states, since in the DBM formalism all operations can be written out analytically. A general quantum computing process can be loosely divided into three steps: i) initial state preparation, ii) applying quantum gates, and iii) measuring the output state. For the DBM simulation of quantum computing, the initial state is first represented by a DBM network. We are mainly concerned with how to apply a universal set of quantum gates in the DBM representation. As we shall see, this can be

Ψ_in(v, Ω) = ⟨v|Ψ_in(Ω)⟩. To simulate circuit quantum computing, characterized by a unitary transform U_C, we need to devise strategies so that we can apply all the universal quantum gates to achieve the transform,

⟨v|Ψ_in(Ω)⟩ →_DBM ⟨v|Ψ_out(Ω)⟩ = ⟨v|U_C|Ψ_in(Ω)⟩   (40)

Let us first consider how to construct the Hadamard gate operation

H|0⟩ = (1/√2)(|0⟩ + |1⟩),  H|1⟩ = (1/√2)(|0⟩ - |1⟩)   (41)

If H acts on the ith qubit of the system, we can represent the operation in terms of the coefficients of the state,

Ψ(··· v_i ···) →_H Ψ′(··· v′_i ···) = Σ_{v_i=0,1} (1/√2) (-1)^{v_i v′_i} Ψ(··· v_i ···)   (42)

In the DBM setting, it is now clear that the Hadamard DBM transform of the ith qubit adds a new visible neuron v′_i, which replaces v_i, and another hidden neuron H_i; v_i now becomes a hidden neuron. The connection weight is given by W_H(v, H_i) = iπ/8 - (ln 2)/2 - iπv/2 - iπH_i/4 + iπ v H_i, where v = v_i, v′_i. One easily checks that Σ_{H_i=0,1} e^{W_H(v_i,H_i)+W_H(v′_i,H_i)} = (1/√2)(-1)^{v_i v′_i}, which completes the construction of the Hadamard gate operation.

The Z(θ) gate operation,

Z(θ)|0⟩ = e^{-iθ/2}|0⟩,  Z(θ)|1⟩ = e^{iθ/2}|1⟩   (43)

can be constructed similarly. We again add a new visible neuron v′_i and a hidden neuron Z_i, and v_i becomes a hidden neuron that should be traced out. The connection weight is given by W_{Z(θ)}(v, Z_i) = -(ln 2)/2 + iθv/2 + iπ v Z_i, where v = v_i, v′_i. The DBM transform of the controlled Z(θ) gates is slightly different from that of single-qubit gates because it is a two-qubit operation acting on v_i and v_j. To simplify the calculation, we give here the explicit construction for CZ. This can be done by introducing a new hidden neuron H_ij, which connects both v_i and v_j with the same weights as those given by the Hadamard gate. In summary, we have

(44)

(45)

setting by adding directly a new visible neuron v′_i and connecting it with the (hidden) neuron v_i. Z(θ) can also be realized in the DBM setting by changing the bias of the visible neuron v_i. We chose the method presented above simply to make the construction clearer and more systematic.

The above protocol based on the DBM is an exact simulation but has a drawback in that sampling of the DBM quickly becomes intractable with increasing depth of the circuit, because the gates are realized by adding deep hidden neurons. In contrast, RBMs are easier to train, and a simulation based on the RBM has already been developed.[116] The basic idea is the same as in the DBM approach, the main difference being that the Hadamard gate cannot be exactly simulated in the RBM setting. In ref. [116], the authors developed an approximation method to simulate the Hadamard gate operation. The RBM realizations of Z(θ) and CZ(θ) are achieved by adjusting the bias and by introducing a new hidden neuron with weighted connections, respectively.

7. Concluding Remarks

In this work, we discussed aspects of quantum neural network states. Two important kinds of neural networks, feed-forward and stochastic recurrent, were chosen as examples to illustrate how neural networks can be used as a variational ansatz state of quantum many-body systems. We reviewed the research progress on neural network states. The representational power of these states was discussed, and the entanglement features of the RBM and DBM states were reviewed. Some applications of quantum neural network states, such as quantum state tomography and classical simulations of quantum computing, were also discussed.

In addition to the foregoing, we present some remarks on the main open problems regarding quantum neural network states.

• One crucial problem is to explain why the neural network works so well for some special tasks. There should be deep reasons for this. Understanding the mathematics and physics behind neural networks may help us build many other important classes of quantum neural network states and guide us in applying neural network states to different scientific problems.
• Although the BM states have been studied from various aspects, many other neural networks are less explored in regard to representing quantum states, both numerically and theoretically. This raises the question of whether other networks can also efficiently represent quantum states, and what the differences between these different representations are.
• Developing a representation theorem for complex functions is also a very important topic in quantum neural network states. Because we must build quantum neural network states from complex neural networks, as we have discussed, it is important to understand the expressive power of complex neural networks.
• Having a good understanding of entanglement features is of great importance in understanding quantum phases and the quantum advantage in some information tasks. Therefore, we can also ask whether there is an easy way to read out entanglement properties from specific neural networks, as there is for tensor networks.

We hope that our review of the quantum neural network states inspires more work and exploration of the crucial topics highlighted above.

Acknowledgements
Z.-A.J. thanks Zhenghan Wang and the hospitality of the Department of Mathematics of UCSB. He also acknowledges Liang Kong and Tian Lan for discussions during his stay in the Yau Mathematical Science Center of Tsinghua University, and he also benefited from discussions with Giuseppe Carleo during the first international conference on "Machine Learning and Physics" at IAS, Tsinghua University. The authors thank Richard Haase, from Liwen Bianji, Edanz Group China, for helping to improve the English of a draft of this manuscript. This work was supported by the Anhui Initiative in Quantum Information Technologies (Grant No. AHY080000).

Conflict of Interest
The authors declare no conflict of interest.

Keywords
neural network states, quantum computing, quantum machine learning, quantum tomography

Received: August 31, 2018
Revised: February 27, 2019
Published online:

[1] T. J. Osborne, Rep. Prog. Phys. 2012, 75, 022001.
[2] F. Verstraete, Nat. Phys. 2015, 11, 524.
[3] R. Orús, Ann. Phys. 2014, 349, 117.
[4] Z. Landau, U. Vazirani, T. Vidick, Nat. Phys. 2015, 11, 566.
[5] I. Arad, Z. Landau, U. Vazirani, T. Vidick, Commun. Math. Phys. 2017, 356, 65.
[6] N. Schuch, M. M. Wolf, F. Verstraete, J. I. Cirac, Phys. Rev. Lett. 2007, 98, 140506.
[7] A. Anshu, I. Arad, A. Jain, Phys. Rev. B 2016, 94, 195143.
[8] J. Eisert, M. Cramer, M. B. Plenio, Rev. Mod. Phys. 2010, 82, 277.
[9] L. Amico, R. Fazio, A. Osterloh, V. Vedral, Rev. Mod. Phys. 2008, 80, 517.
[10] M. Friesdorf, A. H. Werner, W. Brown, V. B. Scholz, J. Eisert, Phys. Rev. Lett. 2015, 114, 170505.
[11] F. Verstraete, V. Murg, J. Cirac, Adv. Phys. 2008, 57, 143.
[12] R. Orus, arXiv:1812.04011, 2018.
[13] S. R. White, Phys. Rev. Lett. 1992, 69, 2863.
[14] F. Verstraete, J. I. Cirac, arXiv:cond-mat/0407066, 2004.
[15] M. C. Bañuls, M. B. Hastings, F. Verstraete, J. I. Cirac, Phys. Rev. Lett. 2009, 102, 240603.
[16] G. Vidal, Phys. Rev. Lett. 2007, 99, 220405.
[17] G. Vidal, Phys. Rev. Lett. 2003, 91, 147902.
[18] Y. LeCun, Y. Bengio, G. Hinton, Nature 2015, 521, 436.
[19] G. E. Hinton, R. R. Salakhutdinov, Science 2006, 313, 504.
[20] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, Vol. 1, MIT Press, Cambridge, MA 1998.
[21] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, S. Lloyd, Nature 2017, 549, 195.
[22] P. Rebentrost, M. Mohseni, S. Lloyd, Phys. Rev. Lett. 2014, 113, 130503.
[23] V. Dunjko, J. M. Taylor, H. J. Briegel, Phys. Rev. Lett. 2016, 117, 130501.
[24] A. Monràs, G. Sentís, P. Wittek, Phys. Rev. Lett. 2017, 118, 190503.
[25] J. Carrasquilla, R. G. Melko, Nat. Phys. 2017, 13, 431.
[26] G. Carleo, M. Troyer, Science 2017, 355, 602.
[27] D.-L. Deng, X. Li, S. Das Sarma, Phys. Rev. X 2017, 7, 021021.
[28] D.-L. Deng, X. Li, S. Das Sarma, Phys. Rev. B 2017, 96, 195145.
[29] X. Gao, L.-M. Duan, Nat. Commun. 2017, 8, 662.
[30] M. August, X. Ni, Phys. Rev. A 2017, 95, 012335.
[31] G. Torlai, R. G. Melko, Phys. Rev. Lett. 2017, 119, 030501.
[32] Y. Zhang, E.-A. Kim, Phys. Rev. Lett. 2017, 118, 216401.
[33] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, G. Carleo, Nat. Phys. 2018, 14, 447.
[34] C. Monterola, C. Saloma, Opt. Express 2001, 9, 72.
[35] C. Monterola, C. Saloma, Opt. Commun. 2003, 222, 331.
[36] C. Caetano, J. Reis Jr, J. Amorim, M. R. Lemes, A. D. Pino Jr, Int. J. Quantum Chem. 2011, 111, 2732.
[37] S. Manzhos, T. Carrington, Can. J. Chem. 2009, 87, 864.
[38] Z.-A. Jia, Y.-H. Zhang, Y.-C. Wu, L. Kong, G.-C. Guo, G.-P. Guo, Phys. Rev. A 2019, 99, 012307.
[39] Y. Huang, J. E. Moore, arXiv:1701.06246, 2017.
[40] Y.-H. Zhang, Z.-A. Jia, Y.-C. Wu, G.-C. Guo, arXiv:1809.08631, 2018.
[41] S. Lu, X. Gao, L.-M. Duan, arXiv:1810.02352, 2018.
[42] W.-C. Gan, F.-W. Shu, Int. J. Mod. Phys. D 2017, 26, 1743020.
[43] Z.-A. Jia, Y.-C. Wu, G.-C. Guo, to be published 2018.
[44] T. Kohonen, Neural Networks 1988, 1, 3.
[45] W. S. McCulloch, W. Pitts, Bull. Math. Biophys. 1943, 5, 115.
[46] M. Minsky, S. A. Papert, Perceptrons: An Introduction to Computational Geometry, MIT Press, Cambridge, MA 2017.
[47] M. A. Nielsen, Neural Networks and Deep Learning, Determination Press, San Francisco, CA 2015.
[48] Here, we emphasize the importance of the FANOUT operation, which is usually omitted from the universal set of gates in the classical computation theory. However, the operation is forbidden in quantum computation by the famous no-cloning theorem.
[49] Z. Cai, J. Liu, Phys. Rev. B 2018, 97, 035116.
[50] H. Saito, J. Phys. Soc. Jpn. 2017, 86, 093001.
[51] Y. LeCun, Y. Bengio, The Handbook of Brain Theory and Neural Networks, MIT Press, Cambridge, MA 1995.
[52] A. Krizhevsky, I. Sutskever, G. E. Hinton, in Advances in Neural Information Processing Systems, Proc. of the First 12 Conferences (Eds: M. I. Jordan, Y. LeCun, S. A. Solla), MIT Press, Cambridge, MA 2012, pp. 1097–1105.
[53] X. Liang, W.-Y. Liu, P.-Z. Lin, G.-C. Guo, Y.-S. Zhang, L. He, Phys. Rev. B 2018, 98, 104426.
[54] G. E. Hinton, T. J. Sejnowski, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, New York 1983, pp. 448–453.
[55] D. H. Ackley, G. E. Hinton, T. J. Sejnowski, Cognitive Science 1985, 9, 147.
[56] G. Torlai, R. G. Melko, Phys. Rev. B 2016, 94, 165134.
[57] K.-I. Aoki, T. Kobayashi, Mod. Phys. Lett. B 2016, 30, 1650401.
[58] S. Weinstein, arXiv:1707.03114, 2017.
[59] L. Huang, L. Wang, Phys. Rev. B 2017, 95, 035105.
[60] J. Chen, S. Cheng, H. Xie, L. Wang, T. Xiang, Phys. Rev. B 2018, 97, 085104.
[61] Y.-Z. You, Z. Yang, X.-L. Qi, Phys. Rev. B 2018, 97, 045153.
[62] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, R. Melko, Phys. Rev. X 2018, 8, 021050.
[63] P. Smolensky, Technical Report, Department of Computer Science, University of Colorado Boulder, 1986.
[64] N. L. Roux, Y. Bengio, Neural Comput. 2008, 20, 1631.
[65] G. Montufar, N. Ay, Neural Comput. 2011, 23, 1306.
[66] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford 1995.
[67] L. V. Fausett, Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Vol. 3, Prentice-Hall, Englewood Cliffs, NJ 1994.
[68] M. Fannes, B. Nachtergaele, R. F. Werner, Commun. Math. Phys. 1992, 144, 443.
[69] A. Klümper, A. Schadschneider, J. Zittartz, EPL 1993, 24, 293.
[70] A. Klümper, A. Schadschneider, J. Zittartz, J. Phys. A: Math. Gen. 1991, 24, L955.
[71] G. Evenbly, G. Vidal, Phys. Rev. Lett. 2014, 112, 240502.
[72] G. Evenbly, G. Vidal, Phys. Rev. B 2014, 89, 235113.
[73] Y.-Y. Shi, L.-M. Duan, G. Vidal, Phys. Rev. A 2006, 74, 022320.
[74] M. Zwolak, G. Vidal, Phys. Rev. Lett. 2004, 93, 207205.
[75] J. Cui, J. I. Cirac, M. C. Bañuls, Phys. Rev. Lett. 2015, 114, 220601.
[76] A. A. Gangat, T. I, Y.-J. Kao, Phys. Rev. Lett. 2017, 119, 010501.
[77] B.-B. Chen, L. Chen, Z. Chen, W. Li, A. Weichselbaum, Phys. Rev. X 2018, 8, 031082.
[78] P. Czarnik, J. Dziarmaga, Phys. Rev. B 2015, 92, 035152.
[79] M. M. Parish, J. Levinsen, Phys. Rev. B 2016, 94, 184303.
[80] A. Kshetrimayum, H. Weimer, R. Orús, Nat. Commun. 2017, 8, 1291.
[81] F. Verstraete, J. I. Cirac, Phys. Rev. Lett. 2010, 104, 190405.
[82] J. Haegeman, J. I. Cirac, T. J. Osborne, I. Pizorn, H. Verschelde, F. Verstraete, Phys. Rev. Lett. 2011, 107, 070601.
[83] J. Haegeman, T. J. Osborne, H. Verschelde, F. Verstraete, Phys. Rev. Lett. 2013, 110, 100402.
[84] J. Haegeman, T. J. Osborne, F. Verstraete, Phys. Rev. B 2013, 88, 075133.
[85] Y. Levine, O. Sharir, N. Cohen, A. Shashua, arXiv:1803.09780, 2018.
[86] E. Stoudenmire, D. J. Schwab, in Advances in Neural Information Processing Systems, 2016, pp. 4799–4807.
[87] Z.-Y. Han, J. Wang, H. Fan, L. Wang, P. Zhang, Phys. Rev. X 2018, 8, 031012.
[88] E. M. Stoudenmire, Quantum Sci. Technol. 2018, 3, 034003.
[89] D. Liu, S.-J. Ran, P. Wittek, C. Peng, R. B. García, G. Su, M. Lewenstein, arXiv:1710.04833, 2017.
[90] W. Huggins, P. Patel, K. B. Whaley, E. M. Stoudenmire, arXiv:1803.11537, 2018.
[91] I. Glasser, N. Pancotti, M. August, I. D. Rodriguez, J. I. Cirac, Phys. Rev. X 2018, 8, 011006.
[92] A. N. Kolmogorov, Dokl. Akad. Nauk SSSR 1956, 108, 179.
[93] A. N. Kolmogorov, Dokl. Akad. Nauk 1957, 114, 953.
[94] V. I. Arnold, Collected Works, Volume 1: Representations of Functions, Celestial Mechanics, and KAM Theory 1957–1965 (Eds: A. B. Givental et al.), Springer, New York 2009.
[95] D. Alexeev, J. Math. Sci. 2010, 168, 5.
[96] F. Rosenblatt, Technical Report, Cornell Aeronautical Lab Inc., Buffalo, NY 1961.
[97] J. Słupecki, Studia Logica 1972, 30, 153.
[98] G. Cybenko, Math. Control Signals Syst. 1989, 2, 183.
[99] K.-I. Funahashi, Neural Networks 1989, 2, 183.
[100] K. Hornik, M. Stinchcombe, H. White, Neural Networks 1989, 2, 359.
[101] R. Hecht-Nielsen, in Proc. of the IEEE Int. Conf. on Neural Networks III, IEEE Press, Piscataway, NJ 1987, pp. 11–13.
[102] X. Gao, S.-T. Wang, L.-M. Duan, Phys. Rev. Lett. 2017, 118, 040502.
[103] This can be done by setting all the parameters involved in the deep hidden layer to zero; only the parameters of the shallow hidden layer remain nonzero.
[104] M. B. Hastings, Phys. Rev. B 2006, 73, 085115.
[105] M. B. Hastings, J. Stat. Mech.: Theory Exp. 2007, 2007, P08024.
[106] R. Jastrow, Phys. Rev. 1955, 98, 1479.
[107] Y. Nomura, A. S. Darmawan, Y. Yamaji, M. Imada, Phys. Rev. B 2017, 96, 205152.
[108] G. Torlai, R. G. Melko, Phys. Rev. Lett. 2018, 120, 240503.
[109] D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, J. Eisert, Phys. Rev. Lett. 2010, 105, 150401.
[110] M. Cramer, M. B. Plenio, S. T. Flammia, R. Somma, D. Gross, S. D. Bartlett, O. Landon-Cardinal, D. Poulin, Y.-K. Liu, Nat. Commun. 2010, 1, 149.
[111] B. Lanyon, C. Maier, M. Holzäpfel, T. Baumgratz, C. Hempel, P. Jurcevic, I. Dhand, A. Buyskikh, A. Daley, M. Cramer, M. B. Plenio, R. Blatt, C. F. Roos, Nat. Phys. 2017, 13, 1158.
[112] G. Tóth, W. Wieczorek, D. Gross, R. Krischek, C. Schwemmer, H. Weinfurter, Phys. Rev. Lett. 2010, 105, 250403.
[113] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge 2010.
[114] J. Preskill, arXiv:1203.5813, 2012.
[115] A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. A. Smolin, H. Weinfurter, Phys. Rev. A 1995, 52, 3457.
[116] B. Jónsson, B. Bauer, G. Carleo, arXiv:1808.05232, 2018.