Quantum-Inspired Support Vector Machine

Chen Ding, Tian-Yi Bao, and He-Liang Huang
Abstract—Support vector machine (SVM) is a particularly powerful and flexible supervised learning model that analyzes data for both classification and regression, whose usual algorithmic complexity scales polynomially with the dimension of the data space and the number of data points. To tackle the big-data challenge, a quantum SVM algorithm was proposed, which is claimed to achieve exponential speedup for least squares SVM (LS-SVM). Here, inspired by the quantum SVM algorithm, we present a quantum-inspired classical algorithm for LS-SVM. In our approach, an improved fast sampling technique, namely indirect sampling, is proposed for sampling the kernel matrix and performing classification. We first consider the LS-SVM with a linear kernel, and then discuss the generalization of our method to non-linear kernels. Theoretical analysis shows that our algorithm can classify with arbitrary success probability in runtime logarithmic in both the dimension of the data space and the number of data points, for data matrices of low rank, low condition number, and high dimension, matching the runtime of the quantum SVM.

Index Terms—Quantum-inspired algorithm, machine learning, support vector machine, exponential speedup, matrix sampling.

(This work was supported by the Open Research Fund from State Key Laboratory of High Performance Computing of China (Grant No. 201901-01), National Natural Science Foundation of China under Grant No. 11905294, and China Postdoctoral Science Foundation. Corresponding author: He-Liang Huang. Email: [email protected]. Chen Ding is with CAS Centre for Excellence and Synergetic Innovation Centre in Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026, China. Tian-Yi Bao is with the Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK. He-Liang Huang is with Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei, Anhui 230026, China, and also with CAS Centre for Excellence and Synergetic Innovation Centre in Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026, China.)

I. INTRODUCTION

Since the 1980s, quantum computing has attracted wide attention due to its enormous advantages in solving hard computational problems [1], such as integer factorization [2]–[4], database searching [5], [6], machine learning [7]–[11], and so on [12], [13]. In 1997, Daniel R. Simon offered compelling evidence that the quantum model may have significantly more complexity-theoretic power than the probabilistic Turing machine [14]. However, it remains an interesting question where the border between classical and quantum computing lies. Although many proposed quantum algorithms have exponential speedups over the existing classical algorithms, is there any way we can accelerate such classical algorithms to the same complexity as the quantum ones?

In 2018, inspired by the quantum recommendation system algorithm proposed by Iordanis Kerenidis and Anupam Prakash [15], Ewin Tang designed a classical recommendation algorithm that achieves an exponential improvement over previous classical algorithms [16], a breakthrough that shows how to apply the subsampling strategy of Alan Frieze, Ravi Kannan, and Santosh Vempala's 2004 algorithm [17] for finding a low-rank approximation of a matrix. Subsequently, Tang used the same techniques to dequantize two quantum machine learning algorithms, quantum principal component analysis [18] and quantum supervised clustering [19], and showed that classical algorithms can match the bounds and runtime of the corresponding quantum algorithms with only polynomial slowdown [20]. Later, András Gilyén et al. [21] and Nai-Hui Chia et al. [22] independently and simultaneously proposed a quantum-inspired matrix inversion algorithm with complexity logarithmic in the matrix size, which eliminates the speedup advantage of the famous Harrow-Hassidim-Lloyd (HHL) algorithm [23] under certain conditions. Recently, Juan Miguel Arrazola et al. studied the actual performance of quantum-inspired algorithms and found that they can perform well in practice under the right conditions, although those conditions need to be relaxed further before the algorithms can be applied to practical datasets [24]. All of these works suggest a very promising future for quantum-inspired algorithms in machine learning, where matrix inversion is used universally.

Support vector machine (SVM) is a data classification algorithm that is commonly used in machine learning [25], [26]. Extensive studies have been conducted on SVMs to boost and optimize their performance, such as the sequential minimal optimization algorithm [27], the cascade SVM algorithm [28], and the SVM algorithms based on Markov sampling [29], [30]. These algorithms offer promising speedups either by changing the way a classifier is trained or by reducing the size of the training set. However, the time complexities of current SVM algorithms are all polynomial in the data size. In 2014, Patrick Rebentrost, Masoud Mohseni, and Seth Lloyd proposed the quantum SVM algorithm [31], which achieves an exponential speedup over classical SVMs: its time complexity is polynomial in the logarithm of the data size. Inspired by the quantum SVM algorithm [31], Tang's methods [16], and András Gilyén et al.'s work [21], we propose a quantum-inspired classical SVM algorithm, which also shows exponential speedup over previous classical SVMs for data matrices of low rank, low condition number, and high dimension. Both the quantum SVM algorithm [31] and our quantum-inspired SVM algorithm are least squares SVMs (LS-SVM), which reduce the optimization problem to solving a set of linear equations.

Our algorithm is a dequantization of the quantum SVM algorithm [31]. In the quantum SVM algorithm, the labeled data vectors ($x_j$ for $j = 1, \ldots, m$) are mapped to quantum states $|x_j\rangle = \frac{1}{\|x_j\|}\sum_k (x_j)_k |k\rangle$ via a quantum random access
need to develop the indirect sampling technique to efficiently perform matrix inversion on $X^T X + \gamma^{-1} I$ with only sampling access to $X$.

C. The sampling technique

We now give the definition and idea of our sampling method for obtaining indices, elements, or submatrices, which is the key technique used in our algorithm, as well as in [16], [17], [21].

Definition 1 (Sampling on vectors). Suppose $v \in \mathbb{C}^n$; define $q^{(v)}$ as the probability distribution
$$x \sim q^{(v)}:\quad \mathbb{P}[x = i] = \frac{|v_i|^2}{\|v\|^2}.$$
Picking an index according to the probability distribution $q^{(v)}$ is called a sampling on $v$.

Definition 2 (Sampling the indices from matrices). Suppose $A \in \mathbb{C}^{n \times m}$; define $q^{(A)}$ as the two-dimensional probability distribution
$$(x, y) \sim q^{(A)}:\quad \mathbb{P}[x = i, y = j] = \frac{|A_{ij}|^2}{\|A\|_F^2}.$$
Picking a pair of indices $(i, j)$ according to the probability distribution $q^{(A)}$ is called a sampling on $A$.

Definition 3 (Sampling the submatrices from matrices). Suppose the target is to sample a submatrix $X'' \in \mathbb{C}^{c \times r}$ from $X \in \mathbb{C}^{n \times m}$. First we sample $r$ times on the vector $(\|X_{*,j}\|)_{j=1,\ldots,m}$ and get column indices $j_1, \ldots, j_r$. The columns $X_{*,j_1}, \ldots, X_{*,j_r}$ form the submatrix $X'$. Then we sample $c$ times on the $j$-th column of $X$ and get row indices $i_1, \ldots, i_c$, where in each sampling $j$ is drawn uniformly at random from $j_1, \ldots, j_r$. The rows $X'_{i_1,*}, \ldots, X'_{i_c,*}$ form the submatrix $X''$. The matrices $X'$ and $X''$ are normalized so that $\mathbb{E}[X'X'^T] = XX^T$ and $\mathbb{E}[X''^TX''] = X'^TX'$.

The process of sampling submatrices from matrices (as described in Def. 3) is shown in Fig. 1. To put it simply, we take several rows and columns out of the matrix by a random choice weighted by the "importance" of the elements, and then normalize them so that they are unbiased estimates of the original rows and columns.

To achieve fast sampling, we usually store vectors in an arborescent data structure (such as a binary search tree), as suggested in [16], and store matrices as a list of their row trees or column trees. The sampling is an analog of quantum state measurement: it only reveals a low-dimensional projection of the vectors and matrices in each calculation. Rather than computing with the whole vector or matrix, we choose the most representative elements for the calculation with high probability (elements are chosen with probability proportional to their squared magnitudes, which is also similar to the quantum measurement of quantum states). The sampling technique we use has the advantage of representing the original vector without bias while consuming fewer computing resources.
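As a concrete illustration of Definitions 1 and 2, the following Python sketch draws indices from the length-square (norm-squared) distributions $q^{(v)}$ and $q^{(A)}$. It uses plain cumulative probabilities rather than the tree structure discussed above, and the function names are ours, not part of any library.

```python
import numpy as np

def sample_vector(v, rng):
    """Draw an index i with probability |v_i|^2 / ||v||^2 (Definition 1)."""
    p = np.abs(v) ** 2
    p /= p.sum()
    return rng.choice(len(v), p=p)

def sample_matrix(A, rng):
    """Draw (i, j) with probability |A_ij|^2 / ||A||_F^2 (Definition 2)."""
    p = (np.abs(A) ** 2).ravel()
    p /= p.sum()
    flat = rng.choice(A.size, p=p)
    return np.unravel_index(flat, A.shape)

rng = np.random.default_rng(0)
v = np.array([3.0, -4.0])                    # P[i=0] = 9/25, P[i=1] = 16/25
print(sample_vector(v, rng))
A = np.array([[1.0, 2.0], [0.0, -2.0]])
print(sample_matrix(A, rng))
```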
We note that there are other kinds of sampling methods for SVM, such as Markov sampling [29], [30]. Different sampling methods may work well in different scenarios. Our algorithm is designed for low-rank datasets, while the algorithms based on Markov sampling [29], [30] may work well on datasets whose columns form a uniformly ergodic Markov chain. In our algorithm, to achieve exponential speedup, the sampling technique differs from Markov sampling in three ways: (i) we sample both the rows and the columns of the matrix, rather than only the columns; (ii) we sample each element according to the norm-squared probability distribution; and (iii) in each dot-product calculation (Alg. 1), we use the sampling technique to avoid operations with high complexity.

D. The preliminary algorithms

We invoke two algorithms from [21] that employ sampling techniques to save complexity. They are treated as oracles that output certain outcomes with controlled errors in the main algorithm. Lemma 1 and Lemma 2 show their correctness and efficiency. For the sake of convenience, some minor changes to the algorithms and lemmas have been made.

1) Trace inner product estimation: Alg. 1 estimates trace inner products in time logarithmic in the sizes of the matrices.

Algorithm 1 Trace Inner Product Estimation.
Input: $A \in \mathbb{C}^{m \times n}$ to which we have sampling access in complexity $L(A)$, and $B \in \mathbb{C}^{n \times m}$ to which we have query access in complexity $Q(B)$. Relative error bound $\xi$ and success probability bound $1 - \eta$.
Goal: Estimate $\mathrm{Tr}[AB]$.
1: Repeat Step 2 $\lceil 6\log_2(\frac{2}{\eta})\rceil$ times and take the median of the values $Y$, denoted $Z$.
2: Repeat Step 3 $\lceil \frac{9}{\xi^2}\rceil$ times and calculate the mean of the values $X$, denoted $Y$.
3: Sample $i$ from the row norms of $A$. Sample $j$ from $A_i$. Let $X = \frac{\|A\|_F^2}{A_{ij}}B_{ji}$.
Output: $Z$.

Lemma 1 [21]. Suppose that we have length-square sampling access to $A \in \mathbb{C}^{m \times n}$ and query access to the matrix $B \in \mathbb{C}^{n \times m}$ in complexity $Q(B)$. Then we can estimate $\mathrm{Tr}[AB]$ to precision $\xi\|A\|_F\|B\|_F$ with probability at least $1 - \eta$ in time
$$O\left((L(A) + Q(B))\,\frac{\log(1/\eta)}{\xi^2}\right).$$
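The median-of-means structure of Alg. 1 is easy to mirror in code. The sketch below is a minimal illustration assuming dense real NumPy arrays stand in for the sampling and query oracles; it estimates $\mathrm{Tr}[AB]$ exactly as Steps 1-3 describe, and the helper name is ours.

```python
import numpy as np

def estimate_trace_product(A, B, xi, eta, rng):
    """Median-of-means estimate of Tr[A @ B] following Alg. 1."""
    fro2 = np.sum(np.abs(A) ** 2)                     # ||A||_F^2
    row_p = np.sum(np.abs(A) ** 2, axis=1) / fro2     # row length-square distribution
    medians = []
    for _ in range(int(np.ceil(6 * np.log2(2 / eta)))):    # Step 1: repetitions for the median
        samples = []
        for _ in range(int(np.ceil(9 / xi ** 2))):          # Step 2: repetitions for the mean
            i = rng.choice(A.shape[0], p=row_p)              # Step 3: sample a row index i
            col_p = np.abs(A[i]) ** 2 / np.sum(np.abs(A[i]) ** 2)
            j = rng.choice(A.shape[1], p=col_p)              # then a column index j from A_i
            samples.append(fro2 / A[i, j] * B[j, i])         # X = ||A||_F^2 / A_ij * B_ji
        medians.append(np.mean(samples))
    return np.median(medians)

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 40))
B = rng.standard_normal((40, 50))
print(estimate_trace_product(A, B, xi=0.1, eta=0.1, rng=rng), np.trace(A @ B))
```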
Algorithm 2 Rejection sampling.
Input: $A \in \mathbb{C}^{m \times n}$ to which we have length-square sampling access, $b \in \mathbb{C}^n$ to which we have norm access, and $y = Ab$ to which we have query access.
Goal: Sample from the length-square distribution of $y = Ab$.
1: Take $D \ge \|b\|^2$.
2: Sample a row index $i$ by the row norm squares of $A$.
3: Query $|y_i|^2 = |A_{i,*}b|^2$ and calculate $\frac{|A_{i,*}b|^2}{D\|A_{i,*}\|^2}$.
4: Sample a real number $x$ uniformly distributed in $[0, 1]$. If $x < \frac{|A_{i,*}b|^2}{D\|A_{i,*}\|^2}$, output $i$; else, go to Step 2.
Output: The row index $i$.
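A direct transcription of Alg. 2, again with a dense matrix playing the role of the sampling and query oracles; the acceptance test of Step 4 is exactly $|A_{i,*}b|^2 / (D\|A_{i,*}\|^2)$. This is only a sketch for intuition, not the sublinear data-structure-backed version.

```python
import numpy as np

def rejection_sample_Ab(A, b, rng):
    """Sample a row index i with probability |(Ab)_i|^2 / ||Ab||^2 (Alg. 2)."""
    D = np.dot(b, b)                                    # Step 1: D >= ||b||^2
    row_norm2 = np.sum(A ** 2, axis=1)
    row_p = row_norm2 / row_norm2.sum()
    while True:
        i = rng.choice(A.shape[0], p=row_p)             # Step 2: propose i by row norms of A
        accept = (A[i] @ b) ** 2 / (D * row_norm2[i])   # Step 3: acceptance ratio (<= 1 by Cauchy-Schwarz)
        if rng.uniform() < accept:                      # Step 4: accept or retry
            return i

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 5))
b = rng.standard_normal(5)
counts = np.bincount([rejection_sample_Ab(A, b, rng) for _ in range(2000)], minlength=30)
print(counts / counts.sum())                 # empirically close to the target distribution
print((A @ b) ** 2 / np.sum((A @ b) ** 2))
```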
Fig. 1. A demonstration of sampling submatrices from matrices (the process described in Def. 3, which is also Step 2 and Step 3 in Alg. 3). We sample columns from $X$ to get $X'$ and sample rows from $X'$ to get $X''$. Note that $X'$ and $X''$ are normalized such that $\mathbb{E}[X'X'^T] = XX^T$ and $\mathbb{E}[X''^TX''] = X'^TX'$.
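The two-stage subsampling of Def. 3 (Steps 2-3 of Alg. 3) can be written compactly. The sketch below follows the normalization that makes $X'X'^T$ and $X''^TX''$ unbiased, under the assumption that a dense $X$ is available; in the actual algorithm only sampling access is used.

```python
import numpy as np

def subsample(X, r, c, rng):
    """Def. 3: sample r columns of X into X', then c rows of X' into X''."""
    col_norm2 = np.sum(X ** 2, axis=0)
    fro2 = col_norm2.sum()
    cols = rng.choice(X.shape[1], size=r, p=col_norm2 / fro2)
    # s-th column of X' is  ||X||_F / (sqrt(r) ||X_{*,i_s}||) * X_{*,i_s}
    Xp = X[:, cols] * np.sqrt(fro2) / (np.sqrt(r) * np.linalg.norm(X[:, cols], axis=0))
    rows = []
    for _ in range(c):
        s = rng.integers(r)                                   # pick one kept column uniformly
        p = Xp[:, s] ** 2 / np.sum(Xp[:, s] ** 2)
        rows.append(rng.choice(X.shape[0], p=p))              # sample a row index from it
    # t-th row of X'' is  ||X'||_F / (sqrt(c) ||X'_{j_t,*}||) * X'_{j_t,*}
    Xpp = Xp[rows, :] * (np.linalg.norm(Xp) /
                         (np.sqrt(c) * np.linalg.norm(Xp[rows, :], axis=1)))[:, None]
    return Xp, Xpp

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 300)) / 60
Xp, Xpp = subsample(X, r=80, c=80, rng=rng)
# deviations shrink as r, c grow; the estimators are unbiased in expectation
print(np.linalg.norm(X @ X.T - Xp @ Xp.T), np.linalg.norm(Xp.T @ Xp - Xpp.T @ Xpp))
```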
2) Rejection sampling: Alg. 2 samples from a vector to which we do not have full query access, in time logarithmic in its length.

Lemma 2 [21]. Suppose that we have length-square sampling access to $A \in \mathbb{C}^{m \times n}$ having normalized rows, and we are given $b \in \mathbb{C}^n$. Then we can implement queries to the vector $y := Ab \in \mathbb{C}^n$ with complexity $Q(y) = O(nQ(A))$, and we can length-square sample from $q^{(y)}$ with complexity $L(y)$ such that
$$\mathbb{E}[L(y)] = O\left(\frac{n\|b\|^2}{\|y\|^2}\,\big(L(A) + nQ(A)\big)\right).$$

III. QUANTUM-INSPIRED SVM ALGORITHM

We now present the main algorithm (Alg. 3), which performs classification as classical SVMs do. Note that actual computation happens only where the word "calculate" is used in this algorithm; otherwise the exponential-speedup advantage would be lost on operations over large vectors or matrices. $\gamma$ is temporarily taken as $\infty$. Fig. 2 shows the algorithm process.

Algorithm 3 Quantum-inspired SVM Algorithm.
Input: $m$ training data points and their labels $\{(x_j, y_j): x_j \in \mathbb{R}^n,\ y_j = \pm 1\}_{j=1,\ldots,m}$, where $y_j = \pm 1$ depending on the class to which $x_j$ belongs. Error bound $\epsilon$ and success probability bound $1 - \eta$. $\gamma$ set as $\infty$.
Goal 1: Find $\tilde{\alpha}$ such that $\|\tilde{\alpha} - \alpha\| \le \epsilon\|\alpha\|$ with success probability at least $1 - \eta$, in which $\alpha = (X^TX)^+y$.
Goal 2: For any given $x \in \mathbb{R}^n$, find its class.
1: Init: Set $r$, $c$ as described in (6) and (7).
2: Sample columns: Sample $r$ column indices $i_1, i_2, \ldots, i_r$ according to the column norm squares $\frac{\|X_{*,i}\|^2}{\|X\|_F^2}$. Define $X'$ to be the matrix whose $s$-th column is $\frac{\|X\|_F}{\sqrt{r}}\frac{X_{*,i_s}}{\|X_{*,i_s}\|}$. Define $A' = X'^TX'$.
3: Sample rows: Sample $s \in [r]$ uniformly, then sample a row index $j$ distributed as $\frac{|X'_{js}|^2}{\|X'_{*,s}\|^2}$. Sample a total of $c$ row indices $j_1, j_2, \ldots, j_c$ this way. Define $X''$ whose $t$-th row is $\frac{\|X'\|_F}{\sqrt{c}}\frac{X'_{j_t,*}}{\|X'_{j_t,*}\|}$. Define $A'' = X''^TX''$.
4: Spectral decomposition: Calculate the spectral decomposition of $A''$, denoted $A'' = V''\Sigma^2V''^T$. Denote the calculated eigenvalues by $\sigma_l^2$, $l = 1, \ldots, k$.
5: Approximate eigenvectors: Let $R = X'^TX$. Define $\tilde{V}_l = \frac{R^TV_l''}{\sigma_l^2}$ for $l = 1, \ldots, k$, and $\tilde{V} = (\tilde{V}_l)_{l=1,\ldots,k}$.
6: Estimate matrix elements: Calculate $\tilde{\lambda}_l = \tilde{V}_l^Ty$ to precision $\frac{3\epsilon\sigma_l^2}{16\sqrt{k}}\|y\|$ by Alg. 1, each with success probability $1 - \frac{\eta}{4k}$. Let $u = \sum_{l=1}^k\frac{\tilde{\lambda}_l}{\sigma_l^4}V_l''$.
7: Find query access: Find query access to $\tilde{\alpha} = \tilde{R}^Tu$ by $\tilde{\alpha}_p = u^T\tilde{R}_{*,p}$, in which $\tilde{R}_{ij}$ is calculated to precision $\frac{\epsilon\kappa^2}{4\|X\|_F}$ by Alg. 1, each with success probability $1 - \frac{\eta}{4\lceil 864/\epsilon^2\,\log(8/\eta)\rceil}$.
8: Find sign: Calculate $x^TX\tilde{\alpha}$ to precision $\frac{\epsilon}{4}\|\alpha\|\|x\|$ with success probability $1 - \frac{\eta}{4}$ by Alg. 1. Determine its sign.
Output: The answer class depends on the sign: positive corresponds to $1$ and negative to $-1$.

The following theorem states the accuracy and time complexity of the quantum-inspired support vector machine algorithm, from which we conclude that the time complexity $T$ depends polylogarithmically on $m$, $n$ and polynomially on $k$, $\kappa$, $\epsilon$, $\eta$. It is proved in Section IV and Section V.

Theorem 1. Given parameters $\epsilon > 0$ and $0 < \eta < 1$, and given the data matrix $X$ with size $m \times n$, rank $k$, norm $1$, and condition number $\kappa$, the quantum-inspired SVM algorithm finds the classification expression $x^TX\alpha$ for any vector $x \in \mathbb{C}^n$ with error less than $\epsilon\kappa^2\sqrt{m}\|x\|$, success probability higher than
Fig. 2. The quantum-inspired SVM algorithm. In the algorithm, the subsampling of $A$ is implemented by subsampling the matrix $X$ (Steps 1-3), which is called the indirect sampling technique. After the indirect sampling, we perform the spectral decomposition (Step 4). Then we estimate the approximation of the eigenvectors ($\tilde{V}_l$) of $A$ (Step 5). Finally, we estimate the classification expression (Steps 6-8).
$1 - \eta$, and time complexity
$$T = O\Big(r\log_2 m + cr\log_2 n + r^3 + \frac{\|X\|_F^2k^2}{\epsilon^2}\log_2\big(\tfrac{8k}{\eta}\big)\big(\log_2(mn) + k\big) + \frac{1}{\epsilon^2}\log_2\tfrac{1}{\eta}\Big(\log_2(mn) + r^2k\log_2\big(\tfrac{2}{\eta_1}\big)\frac{\|X\|_F^4}{\epsilon_1^2r}\log_2(mn)\Big)\Big),$$
in which
$$\epsilon_1 = \frac{\epsilon\|x\|}{2\sqrt{r}\,\lceil\frac{36}{\epsilon^2}\rceil\,\lceil 6\log_2(\frac{16}{\eta})\rceil}, \qquad \eta_1 = \frac{\eta}{8r\,\lceil\frac{36}{\epsilon^2}\rceil\,\lceil 6\log_2(\frac{16}{\eta})\rceil}.$$

In Alg. 3, $\gamma$ is set as $\infty$, which makes the coefficient matrix $A = X^TX$. Notice that the eigenvectors of $X^TX + \gamma^{-1}I$ and $X^TX$ are the same, and their eigenvalues differ by $\gamma^{-1}$. Thus the algorithm can easily be extended to the coefficient matrix $X^TX + \gamma^{-1}I$ with arbitrary $\gamma$, by simply adding $\gamma^{-1}$ to the calculated eigenvalues in Step 4.
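To make the flow of Alg. 3 concrete, here is a compact end-to-end sketch in Python. It follows Steps 2-8 with dense arrays standing in for the sampling and query oracles (so its cost is not sublinear), fixes $\gamma = \infty$, and uses our own illustrative choices of $r$, $c$ and data rather than Equations (6)-(7); it is a reference sketch, not the paper's implementation.

```python
import numpy as np

def train_qisvm(X, y, r, c, rng, k=None):
    """Steps 2-7 of Alg. 3 with gamma = infinity; dense reference version."""
    # Steps 2-3: indirect sampling X -> X' -> X'' with the normalization of Def. 3.
    fro = np.linalg.norm(X)
    cols = rng.choice(X.shape[1], size=r, p=np.sum(X ** 2, axis=0) / fro ** 2)
    Xp = X[:, cols] * fro / (np.sqrt(r) * np.linalg.norm(X[:, cols], axis=0))
    rows = []
    for _ in range(c):
        s = rng.integers(r)
        rows.append(rng.choice(X.shape[0], p=Xp[:, s] ** 2 / np.sum(Xp[:, s] ** 2)))
    Xpp = Xp[rows] * (np.linalg.norm(Xp) /
                      (np.sqrt(c) * np.linalg.norm(Xp[rows], axis=1)))[:, None]
    # Step 4: spectral decomposition of A'' = X''^T X''.
    w, Vpp = np.linalg.eigh(Xpp.T @ Xpp)
    order = np.argsort(w)[::-1]
    k = k if k is not None else int(np.sum(w > 1e-8))
    sig2, Vpp = w[order[:k]], Vpp[:, order[:k]]
    # Steps 5-6: R = X'^T X, lambda~_l = V~_l^T y with V~_l = R^T V''_l / sigma_l^2,
    # and u = sum_l lambda~_l / sigma_l^4 * V''_l (computed exactly here).
    R = Xp.T @ X
    lam = ((R.T @ Vpp).T @ y) / sig2
    u = Vpp @ (lam / sig2 ** 2)
    # Step 7: alpha~ = R^T u (formed densely instead of queried entrywise).
    return R.T @ u

rng = np.random.default_rng(4)
n, m, rank = 100, 500, 2
X = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, m))
X /= np.linalg.norm(X, 2)                  # operator norm 1, as assumed in Theorem 1
y = np.sign(rng.uniform(size=n) @ X)       # labels from a random hyperplane (Appendix B)
alpha = train_qisvm(X, y, r=60, c=60, rng=rng)
pred = np.sign(X.T @ (X @ alpha))          # Step 8: sign of x^T X alpha~ for each column x
print("training accuracy:", np.mean(pred == y))
```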
IV. ACCURACY

We prove that the error of computing the classification expression $x^TX\tilde{\alpha}$ in the quantum-inspired SVM algorithm does not exceed $\epsilon\kappa^2\sqrt{m}\|x\|$. We take $\gamma = \infty$ in the analysis because adding $\gamma^{-1}$ to the eigenvalues introduces no error, so the analysis is the same in the case $\gamma \ne \infty$. We first show how to break the total error into several parts, and then analyze each part in the subsections.

Let $\alpha = (X^TX)^+y$ and $\alpha' = \sum_{l=1}^k\frac{\lambda_l}{\sigma_l^2}\tilde{V}_l = \tilde{V}\Sigma^{-2}\tilde{V}^Ty$, in which $\lambda_l = \tilde{V}_l^Ty$, and let $\alpha'' = \sum_{l=1}^k\frac{\tilde{\lambda}_l}{\sigma_l^2}\tilde{V}_l$. Then the total error of the classification expression is¹
$$E = \Delta(x^TX\alpha) \le |x^TX(\tilde{\alpha} - \alpha)| + \Delta(x^TX\tilde{\alpha}) \le \|x\|\big(\|\alpha - \alpha'\| + \|\alpha' - \alpha''\| + \|\alpha'' - \tilde{\alpha}\|\big) + \Delta(x^TX\tilde{\alpha}).$$
Denote $E_1 = \|x\|\,\|\alpha' - \alpha\|$, $E_2 = \|x\|\,\|\alpha'' - \alpha'\|$, $E_3 = \|x\|\,\|\tilde{\alpha} - \alpha''\|$, and $E_4 = \Delta(x^TX\tilde{\alpha})$. Our target is to show that each of them is no more than $\frac{\epsilon}{4}\|\alpha\|\|x\|$ with probability no less than $1 - \frac{\eta}{4}$, so that
$$E \le E_1 + E_2 + E_3 + E_4 \le \epsilon\kappa^2\sqrt{m}\,\|x\|,$$
with success probability no less than $1 - \eta$ (using $\|\alpha\| \le \kappa^2\|y\| = \kappa^2\sqrt{m}$, since $y \in \{\pm 1\}^m$).

$E_1$ represents the error introduced by subsampling and eigenvector approximation (i.e., Steps 1-5 in Alg. 3); the fact that it is less than $\frac{\epsilon}{4}\|\alpha\|\|x\|$ with probability no less than $1 - \frac{\eta}{4}$ is shown in Subsection IV-A. $E_2$ represents the error introduced by the approximation of $\lambda_l$ (i.e., Step 6 in Alg. 3); the fact that it is less than $\frac{\epsilon}{4}\|\alpha\|\|x\|$ with probability no less than $1 - \frac{\eta}{4}$ is shown in Subsection IV-B. $E_3$ represents the error introduced in the queries of $R$ and $\tilde{\alpha}$; the fact that it is less than $\frac{\epsilon}{4}\|\alpha\|\|x\|$ with probability no less than $1 - \frac{\eta}{4}$ is guaranteed by Step 7 of Alg. 3.

¹For any expression $f$, $\Delta(f)$ denotes the difference between the exact value of $f$ and the value calculated by the estimation algorithms Alg. 1 and Alg. 3 (these algorithms cannot produce exact values because randomness is introduced).
A. Proof of $E_1 \le \frac{\epsilon}{4}\|\alpha\|\|x\|$

Notice that
$$E_1 = \|x\|\,\|\alpha - \alpha'\| = \|x\|\,\|\alpha - \tilde{V}\Sigma^{-2}\tilde{V}^TA\alpha\| \le \|\alpha\|\|x\|\,\|\tilde{V}\Sigma^{-2}\tilde{V}^TA - I_m\|.$$
Here we use five theorems (Theorems 2 to 6) to prove $\|\tilde{V}\Sigma^{-2}\tilde{V}^TA - I_m\| \le \frac{\epsilon}{4}$, in which Theorems 2 and 5 are invoked from [21]. We give proofs of Theorems 3, 4, and 6 in Appendix A. The purpose of these theorems is to show that $\tilde{V}\Sigma^{-2}\tilde{V}^T$ is functionally close to the inverse of the matrix $A$, as $\|\tilde{V}\Sigma^{-2}\tilde{V}^TA - I_m\| \le \frac{\epsilon}{4}$ suggests.

Theorem 2 states the norm distance between $A$, $A'$ and $A''$. Using this norm distance, and the fact that the $V_l''$ are the eigenvectors of $A''$, Theorem 3 establishes the relation between $A'$ and $V_l''$. We define $\tilde{V}_l = \frac{1}{\sigma_l^2}R^TV_l''$, and Theorem 6 finally gives the relation between $A$ and $\tilde{V}$. The procedure is shown in Fig. 3.

Theorem 2 [21]. Let $X' \in \mathbb{C}^{n\times r}$ and let $X'' \in \mathbb{C}^{c\times r}$ be the sampling outcome of $X'$. Suppose $X''$ is normalized so that $\mathbb{E}[X''^TX''] = X'^TX'$. Then $\forall\epsilon \in [0, \frac{\|X'\|_F}{\|X'\|}]$ we have
$$\mathbb{P}\Big[\|X'^TX' - X''^TX''\| \ge \epsilon\|X'\|\|X'\|_F\Big] \le 2re^{-\frac{\epsilon^2c}{4}}.$$
Hence, for $c \ge \frac{4\log_2(\frac{2r}{\eta})}{\epsilon^2}$, with probability at least $1 - \eta$ we have
$$\|X'^TX' - X''^TX''\| \le \epsilon\|X'\|\|X'\|_F.$$

Theorem 3. Suppose $\|A' - A''\| \le \beta$. Then $\forall i, j \in \{1, \ldots, r\}$,
$$|V_i''^TA'V_j'' - \delta_{ij}\sigma_i^2| \le \beta,$$
in which $A' = X'^TX'$ and $A = X^TX$.

Theorem 4 points out that if $A''$'s eigenvectors approximately work as eigenvectors for $A'$ and $\|XX^T - X'X'^T\| \le \epsilon'$, then the $\tilde{V}_l$ approximately work as eigenvectors for $A$.

Theorem 5 [21]. If $\mathrm{rank}(B) \le k$ and $\tilde{V}$ has $k$ columns that span the row and column space of $B$, then
$$\|B\| \le \|(\tilde{V}^T\tilde{V})^+\|\,\|\tilde{V}^TB\tilde{V}\|.$$

Under the condition that the $\tilde{V}_l$ approximately work as eigenvectors for $A$, the following Theorem 6 points out that $\tilde{V}\Sigma^{-2}\tilde{V}^T$ is functionally close to the inverse of the matrix $A$.

Theorem 6. If $\forall i, j \in \{1, \ldots, k\}$,
$$|\tilde{V}_i^T\tilde{V}_j - \delta_{ij}| \le \frac{1}{4k}, \qquad (2)$$
$$|\tilde{V}_i^TA\tilde{V}_j - \delta_{ij}\sigma_i^2| \le \zeta,$$
and the condition of Theorem 4 holds, then
$$\|\tilde{V}\Sigma^{-2}\tilde{V}^TA - I_m\| \le \frac{5}{3}\kappa k\zeta.$$

To conclude, for $\mathbb{P}[\|\alpha' - \alpha\| > \frac{\epsilon}{4}\|\alpha\|] \le \frac{\eta}{4}$, we need to pick $\epsilon'$ and $\beta$ such that
$$\kappa^2\beta^2 + 2\kappa\beta + \kappa^2\epsilon'\|X\|_F^2 \le \frac{1}{4k}, \qquad (3)$$
$$r = \Big\lceil\frac{4\log_2(\frac{8n}{\eta})}{\epsilon'^2}\Big\rceil, \qquad (6)$$
$$c = \Big\lceil\frac{4\kappa^2\log_2(\frac{8r}{\eta})}{\beta^2}\Big\rceil. \qquad (7)$$

B. Proof of $E_2 \le \frac{\epsilon}{4}\|\alpha\|\|x\|$

Notice that $E_2 = \|x\|\,\|\alpha'' - \alpha'\|$. For $y = X^TX\alpha$ and $\alpha = X^+X^{+T}y$, we have $\|y\| \le \|\alpha\| \le \kappa^2\|y\|$. For $\|\alpha'' - \alpha'\|$, let $z$ be the vector with $z_l = \frac{\lambda_l - \tilde{\lambda}_l}{\sigma_l^2}$; we have
$$\|\alpha'' - \alpha'\| = \Big\|\sum_{l=1}^k\frac{\lambda_l - \tilde{\lambda}_l}{\sigma_l^2}\tilde{V}_l\Big\| = \|\tilde{V}z\| \le \sqrt{\|\tilde{V}^T\tilde{V}\|}\,\|z\| \le \sqrt{\frac{4}{3}}\cdot\sqrt{k}\cdot\frac{3\epsilon\sigma_l^2\|y\|}{16\sqrt{k}\,\sigma_l^2} \le \frac{1}{4}\epsilon\|\alpha\|,$$
in which $\|\tilde{V}^T\tilde{V}\| \le \frac{4}{3}$ as shown in the proof of Theorem 6.

V. COMPLEXITY

In this section, we analyze the time complexity of each step of the main algorithm. We divide the steps into four parts and analyze each part in its own subsection: Steps 1-3 are considered in Subsection V-A, Step 4 in Subsection V-B, Steps 5-6 in Subsection V-C, and Steps 7-8 in Subsection V-D. Note that in the main algorithm the variables $R$, $\tilde{V}_l$, $\tilde{\alpha}$ are queried rather than calculated; we include the corresponding query complexity in the analysis of the steps in which these variables are queried.

A. Sampling of columns and rows

In Step 1, the values of $r$ and $c$ are determined according to Inequalities (3), (4), (5), (6), (7); the time for solving these inequalities is a constant. In Step 2 we sample $r$ indices, each sampling taking no more than $\log_2 m$ time according to the arborescent vector data structure described in Section II-C. In Step 3 we sample $c$ indices, each sampling taking no more than $r\log_2 n$ time according to the arborescent matrix data structure described in Section II-C. Thus the overall time complexity of Steps 1-3 is $O(r\log_2 m + cr\log_2 n)$.
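The $O(\log_2 m)$ per-sample cost quoted above comes from the arborescent (tree) storage of the squared norms. A minimal way to realize it is a Fenwick (binary indexed) tree over the squared entries, as sketched below; this is our own illustrative structure, one of several that meet the requirements described in [16].

```python
import numpy as np

class LengthSquareSampler:
    """Fenwick (binary indexed) tree over v_i^2: O(log m) sampling and insertion."""
    def __init__(self, v):
        self.m = len(v)
        self.tree = np.zeros(self.m + 1)
        self.total = 0.0
        for i, x in enumerate(v):
            self.add(i, float(x))

    def add(self, i, value):
        # accumulate value^2 at position i (insert-only sketch; a full version also supports decreases)
        delta = value ** 2
        self.total += delta
        j = i + 1
        while j <= self.m:
            self.tree[j] += delta
            j += j & (-j)

    def sample(self, rng):
        """Return index i with probability v_i^2 / sum_j v_j^2 by descending the tree."""
        target = rng.uniform() * self.total
        pos, step = 0, 1
        while step * 2 <= self.m:
            step *= 2
        while step:
            nxt = pos + step
            if nxt <= self.m and self.tree[nxt] < target:
                target -= self.tree[nxt]
                pos = nxt
            step //= 2
        return pos      # 0-based index

rng = np.random.default_rng(5)
sampler = LengthSquareSampler(np.array([1.0, 2.0, 3.0, 4.0]))
draws = np.array([sampler.sample(rng) for _ in range(20000)])
print(np.bincount(draws, minlength=4) / 20000)   # approx. [1, 4, 9, 16] / 30
```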
C. Calculation of $\tilde{\lambda}_l$

In Steps 5-6 we calculate $\tilde{\lambda}_l$ by Alg. 1, using
$$\lambda_l = \frac{1}{\sigma_l^2}V_l''^TRy = \frac{1}{\sigma_l^2}\mathrm{Tr}[V_l''^TX'^TXy] = \frac{1}{\sigma_l^2}\mathrm{Tr}[XyV_l''^TX'^T].$$
Observe that $\|yV_l''^TX'^T\|_F = \|y\|\,\|V_l''^TX'^T\| \le \|y\|$, and that we can query the $(i, j)$ matrix element of $yV_l''^TX'^T$ at cost $O(r)$. According to Lemma 1, the complexity of Step 6 is
$$T_6 = O\Big(\frac{\|X\|_F^2k^2}{\epsilon^2}\log_2\big(\tfrac{8k}{\eta}\big)\big(\log_2(mn) + k\big)\Big).$$

D. Calculation of $x^TX\tilde{\alpha}$

In Steps 7-8 we calculate $x^TX\tilde{\alpha}$. This is the last step of the algorithm and also the most important one for saving time complexity. In Step 8 of Alg. 3 we need to calculate $x^TX\tilde{\alpha}$, which equals $\mathrm{Tr}[X\tilde{\alpha}x^T]$, to precision $\frac{\epsilon}{4}\|\alpha\|\|x\|$ with success probability $1 - \frac{\eta}{4}$ using Alg. 1, letting the $A$ and $B$ in Alg. 1 be $X$ and $\tilde{\alpha}x^T$, respectively. To calculate $\mathrm{Tr}[X\tilde{\alpha}x^T]$, we first establish query access to $\tilde{\alpha}x^T$ (we already have sampling access to $X$), and then use Alg. 1 as an oracle. We first analyze the time complexity of querying $R$ and $\tilde{\alpha}$, and then give the time complexity of calculating $x^TX\tilde{\alpha}$.

1) Query of $R$: First we establish query access to $R = X'^TX$. For any $s = 1, \ldots, r$ and $j = 1, \ldots, m$, $R_{sj} = e_s^TX'^TXe_j = \mathrm{Tr}[Xe_je_s^TX'^T]$; we calculate this trace by Alg. 1 to precision $\epsilon_1$ with success probability $1 - \eta_1$. The time complexity of one query is
$$Q(R) = O\Big(\log_2\big(\tfrac{2}{\eta_1}\big)\frac{\|X\|_F^4}{\epsilon_1^2r}\log_2(mn)\Big).$$

2) Query of $\tilde{\alpha}$: For any $j = 1, \ldots, m$, we have $\tilde{\alpha}_j = \sum_{s=1}^rR_{sj}u_s$. One query of $\tilde{\alpha}$ costs time $rkQ(R)$, with error $\epsilon_1\sum_{s=1}^r|u_s|$ and success probability more than $1 - r\eta_1$.

3) Calculation of $x^TX\tilde{\alpha}$: We use Alg. 1 to calculate $x^TX\tilde{\alpha} = \mathrm{Tr}[X\tilde{\alpha}x^T]$ to precision $\frac{\epsilon}{2}\|\alpha\|\|x\|$ with success probability $1 - \frac{\eta}{8}$. Since the queries of $\tilde{\alpha}$ themselves carry error and failure probability, we only need
$$\epsilon_1\sum_{s=1}^r|u_s|\,\Big\lceil\frac{36}{\epsilon^2}\Big\rceil\Big\lceil 6\log_2\big(\tfrac{16}{\eta}\big)\Big\rceil \le \frac{\epsilon}{2}\|\alpha\|\|x\|,$$
$$r\eta_1\Big\lceil\frac{36}{\epsilon^2}\Big\rceil\Big\lceil 6\log_2\big(\tfrac{16}{\eta}\big)\Big\rceil \le \frac{\eta}{8}$$
to fulfill the overall computing task. Noticing $\sum_{s=1}^r|u_s| \le \sqrt{r}\|u\|$ and $\alpha = R^Tu$, we set
$$\epsilon_1 = \frac{\epsilon\|x\|}{2\sqrt{r}\,\lceil\frac{36}{\epsilon^2}\rceil\,\lceil 6\log_2(\frac{16}{\eta})\rceil},$$
$$\eta_1 = \frac{\eta}{8r\,\lceil\frac{36}{\epsilon^2}\rceil\,\lceil 6\log_2(\frac{16}{\eta})\rceil}.$$
The overall time complexity for computing $x^TX\tilde{\alpha}$ is then
$$T_7 = O\Big(\frac{1}{\epsilon^2}\log_2\tfrac{1}{\eta}\big(\log_2(mn) + rkQ(R)\big)\Big) = O\Big(\frac{1}{\epsilon^2}\log_2\tfrac{1}{\eta}\Big(\log_2(mn) + r^2k\log_2\big(\tfrac{2}{\eta_1}\big)\frac{\|X\|_F^4}{\epsilon_1^2r}\log_2(mn)\Big)\Big).$$

VI. EXPERIMENTS

In this section, we demonstrate the proposed quantum-inspired SVM algorithm in practice by testing it on artificial datasets. The feasibility and efficiency of some other quantum-inspired algorithms (quantum-inspired algorithms for recommendation systems and for linear systems of equations) on large datasets have been benchmarked, and the results indicate that quantum-inspired algorithms can perform well in practice under their specific conditions: low rank, low condition number, and very large dimension of the input matrix [24]. Here we show the feasibility of the quantum-inspired SVM. Firstly, we test the quantum-inspired SVM algorithm on low-rank and low-rank-approximated datasets and compare it to an existing classical SVM implementation. Secondly, we discuss the characteristics of the algorithm by analyzing its dependence on the parameters and datasets. In our experiments, we use the arborescent data structure instead of arrays for storage and sampling [24], so the experiments are conducted in a more realistic scenario than in the previous work [24]. All algorithms are implemented in Julia [32]. The source code and data are available at https://github.com/helloinrm/qisvm.

B. Experiment II: Discussion on algorithm parameters

As analyzed in Section IV and Section V, there are two main parameters for the quantum-inspired algorithm: the relative error $\epsilon$ and the success probability $1 - \eta$. Based on them, we set the subsampling sizes $r$, $c$ and run the algorithm. However, for datasets that are not large enough, setting $r$, $c$ by Equation (6) and Equation (7) is rather time costly. For instance, when the condition number of the data matrix is $1.0$, taking $\eta = 0.1$ and $\epsilon = 5.0$, the theoretical $r$, $c$ for a $10000 \times 10000$ dataset would have to be set as $1656$ and $259973$ to ensure that the algorithm calculates the classification expression with relative error less than $\epsilon$ and success probability higher than $1 - \eta$. For practical applications on datasets that are not too large, we set $r$, $c$ as $r = b\lceil 4\log_2(2n/\eta)/\epsilon^2\rceil$ and $c = b\lceil 4\log_2(2r/\eta)/\epsilon^2\rceil$, in which $b$ is the subsampling size control parameter. When $b = 1$, this practical choice of $r$, $c$ ensures that the relative error of the subsampling (Step 2 and Step 3 in Alg. 3) does not exceed $\epsilon$ (guaranteed by Theorem 2).

In Experiment I, we took this practical setting of $r$, $c$ and already found an advantage compared to LIBSVM; our choice there was $\epsilon = 5$, $\eta = 0.1$, and $b = 1$. Here, we test the algorithm on other choices of $\epsilon$, $\eta$, and $b$ and check the classification rate. We test each parameter choice 50 times. The variation intervals of the parameters are $\epsilon$ from 1 to 10, $\eta$ from 0.1 to 1, and $b$ from 1 to 10. The results are shown in Fig. 4. We find that the average classification rates of the algorithm in each experiment are close. We notice that when using the practical $r$, $c$, which are much smaller than the theoretical ones, the algorithm maintains its performance (classification rate around 0.90). This phenomenon indicates a gap between our theoretical analysis and the actual performance, as [24] reports that "the performance of these algorithms is better than the theoretical complexity bounds would suggest".
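For reference, the practical subsampling sizes quoted above are a one-line computation; the sketch below evaluates them for the parameter choices of Experiment I. (The theoretical sizes from Equations (6)-(7) additionally involve $\epsilon'$, $\beta$, and $\kappa$, and are not reproduced here.)

```python
import math

def practical_sizes(n, eps, eta, b=1):
    """Practical subsampling sizes: r = b*ceil(4*log2(2n/eta)/eps^2), c computed analogously from r."""
    r = b * math.ceil(4 * math.log2(2 * n / eta) / eps ** 2)
    c = b * math.ceil(4 * math.log2(2 * r / eta) / eps ** 2)
    return r, c

print(practical_sizes(n=10000, eps=5.0, eta=0.1, b=1))   # modest r, c even for a 10000-dimensional dataset
```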
TABLE II. The average values and standard deviations of classification rates (%) of QISVM and LIBSVM in five experiments.

Fig. 4. The average classification rate of the quantum-inspired SVM algorithm with different parameters on the dataset with rank 1. Each point represents an average classification rate over 50 trials, and the error bar shows the standard deviation of the 50 trials. (a) Algorithm performance when the parameter $\epsilon$ is taken from 1 to 10. (b) Algorithm performance when the parameter $\eta$ is taken from 0.1 to 1. (c) Algorithm performance when the parameter $b$ is taken from 1 to 10.
logarithmic-complexity algorithm (e.g., Theorem 2 shows the error of matrix row subsampling).

B. Improving sampling for dot product

Recall that with Alg. 1 we can estimate dot products of two vectors. However, it does not work well under all conditions, for example when $\|x\|$ and $\|y\|$ are dominated by a single element. For randomness, [34] implies that we can apply a spherically random rotation $R$ to all $x$, which does not change the kernel matrix $K$ but makes all the elements of the dataset matrix follow the same distribution.

$$= \sum_{p=0}^{N}\Big(-\frac{1}{2\sigma^2}\Big)^p\sum_{q,l=0}^{p}\binom{p}{q}\binom{p}{l}\binom{p}{p-q-l}K_q(x_i, x_i)\,(-2)^lK_l(x_i, x_j)\,K_{p-q-l}(x_j, x_j).$$

Algorithm 4 Polynomial kernel matrices sampling.
Input: Sampling access to $X$ in time logarithmic in $m$ and $n$.
Goal: Sample a column index $j$ from the column norm vector $(\|x_1\|^p, \|x_2\|^p, \ldots, \|x_m\|^p)$ of $Z$, and then sample a row index $i$ from column $x_j^{\otimes p}$ of $Z$.
1: Sample on the column norm vector $(\|x_1\|, \|x_2\|, \ldots, \|x_m\|)$ of $X$ to get an index $j$.
2: Query $\|x_j\|$ from $(\|x_1\|, \|x_2\|, \ldots, \|x_m\|)$. Calculate $\|x_j\|^p$.
3: Sample a real number $a$ uniformly distributed in $[0, 1]$. If $a \ge \|x_j\|^p$, go to Step 1. If not, output index $j$ as the column index and continue.
4: Sample $p$ times on $x_j$ to get indices $i_1, i_2, \ldots, i_p$, and output the row index $i = \sum_{\tau=1}^p(i_\tau - 1)n^{p-\tau} + 1$ (see Appendix C).
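The column stage of Alg. 4 is a rejection-sampling loop: propose $j$ from the ordinary column distribution of $X$ and thin it so that the accepted index follows the column distribution of $Z = (x_1^{\otimes p}, \ldots, x_m^{\otimes p})$, whose column norms are $\|x_j\|^p$. The sketch below is our own rendering, using the acceptance ratio $\|x_j\|^{2p-2}$ (valid when every $\|x_j\| \le 1$); the row stage then draws $p$ independent indices from $x_j$ as in Step 4.

```python
import numpy as np

def sample_tensor_power_column(X, p, rng):
    """Sample column j with probability ||x_j||^(2p) / sum_k ||x_k||^(2p), by rejection."""
    col_norm2 = np.sum(X ** 2, axis=0)                  # ||x_j||^2, assumed <= 1 for every column
    proposal = col_norm2 / col_norm2.sum()
    while True:
        j = rng.choice(X.shape[1], p=proposal)           # Step 1: propose from the columns of X
        if rng.uniform() < col_norm2[j] ** (p - 1):      # Steps 2-3: accept with ||x_j||^(2p-2)
            return j

def sample_tensor_power_row(X, j, p, rng):
    """Step 4: p samplings on x_j give a multi-index, flattened to a row index of x_j^{(x)p}."""
    n = X.shape[0]
    probs = X[:, j] ** 2 / np.sum(X[:, j] ** 2)
    i_taus = rng.choice(n, size=p, p=probs)
    return int(sum(i_tau * n ** (p - 1 - t) for t, i_tau in enumerate(i_taus)))   # 0-based

rng = np.random.default_rng(6)
X = rng.standard_normal((5, 8))
X /= (np.linalg.norm(X, axis=0).max() * 1.1)            # scale so every column norm is below 1
j = sample_tensor_power_column(X, p=3, rng=rng)
print(j, sample_tensor_power_row(X, j, p=3, rng=rng))
```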
D. General LS-SVM

In the preceding sections, we began with an LS-SVM with $b = 0$ and a linear kernel (Section II), and we showed how the method can be extended to nonlinear kernels in Section VII-C. Finally, we deal with the last assumption, $b = 0$, and show how a general LS-SVM can be tackled with techniques like those in Alg. 3.

A general LS-SVM equation [26] is
$$\begin{pmatrix}0 & \mathbf{1}^T\\ \mathbf{1} & K + \gamma^{-1}I\end{pmatrix}\begin{pmatrix}b\\ \alpha\end{pmatrix} = \begin{pmatrix}0\\ y\end{pmatrix}, \qquad (8)$$
in which $K$ is the kernel matrix.

Equation (8) can be solved as follows:
(i) Firstly, by the methods in Section VII-C, we establish sampling access to the kernel matrix $K$. Suppose a sampling outcome of $K$ is $K''$.
(ii) Secondly, take
$$A = \begin{pmatrix}0 & \mathbf{1}^T\\ \mathbf{1} & K + \gamma^{-1}I\end{pmatrix} \quad\text{and}\quad A'' = \begin{pmatrix}0 & \mathbf{1}^T\\ \mathbf{1} & K'' + \gamma^{-1}I\end{pmatrix}.$$
We establish the eigen-relations between $A$ and $A''$ by theorems similar to Theorem 2 and Theorem 4.
(iii) Once $A \in \mathbb{R}^{m\times m}$ is subsampled to $A'' \in \mathbb{R}^{r\times r}$, we can continue with Steps 3-7 of Alg. 3.
(iv) Once Equation (8) is solved in Step 7 of Alg. 3, we can establish query access to $\alpha$. According to Equation (8), $b = y_j - x_j^TX\alpha - \gamma^{-1}\alpha_j$ for any $j$ such that $\alpha_j \ne 0$. We can then evaluate the classification expression $y_j + (x - x_j)^TX\alpha - \gamma^{-1}\alpha_j$ and classify using Alg. 1. There are two ways to find such a $j$: one is to run rejection sampling on $\alpha$ using Alg. 2; the other is to check whether $\alpha_j = 0$ after each sampling of $X$ in Step 3 of Alg. 1.
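As a dense reference for step (iv), the block system (8) can be assembled and solved directly. This is only a sanity-check sketch assuming a linear kernel $K = X^TX$, and the function name is ours; it is not the sublinear procedure described above.

```python
import numpy as np

def solve_general_lssvm(X, y, gamma):
    """Solve Eq. (8) densely: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    m = X.shape[1]
    K = X.T @ X                                    # linear kernel; a nonlinear kernel fits the same way
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(m) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                         # bias b and dual coefficients alpha

rng = np.random.default_rng(7)
X = rng.standard_normal((20, 50))
y = np.sign(rng.standard_normal(50))
b, alpha = solve_general_lssvm(X, y, gamma=10.0)
x = X[:, 0]
print(np.sign(x @ (X @ alpha) + b))                # classification expression x^T X alpha + b
```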
VIII. CONCLUSION

We have proposed a quantum-inspired SVM algorithm that achieves exponential speedup over previous classical algorithms. The feasibility of the proposed algorithm is demonstrated by experiments. Our algorithm works well on low-rank datasets or datasets that can be well approximated by low-rank matrices, similarly to the quantum SVM algorithm [31], which applies "when a low-rank approximation is appropriate". Further investigation of the application of such an algorithm is required to make the quantum-inspired SVM operable in problems like face recognition [25] and signal processing [35]. We hope that the techniques developed in our work can lead to the emergence of more efficient classical algorithms, such as applying our method to support vector machines with more complex kernels [26], [36] or to other machine learning algorithms. The technique of indirect sampling can expand the application area of fast sampling techniques, and it will contribute to the further competition between classical algorithms and quantum ones.

Some improvements on our work could be made in the future, such as reducing the conditions on the data matrix, further reducing the complexity, and tightening the error bounds in the theoretical analysis, which can be achieved through a deeper investigation of the algorithm and the error propagation process. The quantum-inspired non-linear SVMs and general least squares SVM discussed in Section VII also require further theoretical analysis and empirical evaluation.

We note that our work, as well as the previous quantum-inspired algorithms, is not intended to demonstrate that quantum computing is uncompetitive. We want to find out where the boundaries of classical and quantum computing are, and we expect new quantum algorithms to be developed that beat our algorithm.

APPENDIX A
PROOF OF THEOREMS IN IV

A. Proof of Theorem 3

Proof: We break the expression $|V_i''^TA'V_j'' - \delta_{ij}\sigma_i^2|$ into two parts,
$$|V_i''^TA'V_j'' - \delta_{ij}\sigma_i^2| \le |V_i''^T(A' - A'')V_j''| + |V_i''^TA''V_j'' - \delta_{ij}\sigma_i^2|.$$
For the first term, because of the condition $\|A' - A''\| \le \beta$ and the fact that the $V_j''$ are normalized,
$$|V_i''^T(A' - A'')V_j''| \le \|V_i''^T\|\cdot\|(A' - A'')V_j''\| \le \beta.$$
For the second term, because $A'' = \sum_{l=1}^k\sigma_l^2V_l''V_l''^T$,
$$|V_i''^TA''V_j'' - \delta_{ij}\sigma_i^2| = 0.$$
In all, $|V_i''^TA'V_j'' - \delta_{ij}\sigma_i^2| \le \beta$. Written compactly,
$$|V_i''^TA'V_j'' - \delta_{ij}\sigma_i^2| \le |V_i''^T(A' - A'')V_j''| + |V_i''^TA''V_j'' - \delta_{ij}\sigma_i^2| \le \|V_i''^T\|\cdot\|(A' - A'')V_j''\| \le \beta.$$

B. Proof of Theorem 4

Proof: Denote $|\tilde{V}_i^T\tilde{V}_j - \delta_{ij}|$ by $\Delta_1$ and $|\tilde{V}_i^TA\tilde{V}_j - \delta_{ij}\sigma_i^2|$ by $\Delta_2$. By definition, $\tilde{V}_l = \frac{1}{\sigma_l^2}R^TV_l''$. Thus
$$\Delta_1 = \Big|\frac{V_i''^TRR^TV_j'' - \delta_{ij}\sigma_i^4}{\sigma_i^2\sigma_j^2}\Big|.$$
We break it into two parts:
$$\Delta_1 \le \frac{1}{\sigma_i^2\sigma_j^2}\Big(|V_i''^TA'A'V_j'' - \delta_{ij}\sigma_i^4| + |V_i''^T(RR^T - A'A')V_j''|\Big).$$
For the first term, we have
$$\begin{aligned}
|V_i''^TA'A'V_j'' - \delta_{ij}\sigma_i^4| &= |V_i''^T(A' - A'')^2V_j'' + V_i''^T(A' - A'')A''V_j'' + V_i''^TA''(A' - A'')V_j'' + V_i''^TA''A''V_j'' - \delta_{ij}\sigma_i^4|\\
&\le |V_i''^T(A' - A'')^2V_j''| + |V_i''^T(A' - A'')A''V_j''| + |V_i''^TA''(A' - A'')V_j''| + |V_i''^TA''A''V_j'' - \delta_{ij}\sigma_i^4|\\
&\le \beta^2 + \sigma_j^2\beta + \sigma_i^2\beta.
\end{aligned}$$
The last step uses the same technique as in the proof of Theorem 3.

For the second term, we have
$$|V_i''^T(RR^T - A'A')V_j''| \le \|RR^T - A'A'\| = \|X'^TXX^TX' - X'^TX'X'^TX'\| \le \|X'\|^2\,\|XX^T - X'X'^T\|.$$
Because $\|X'\| \le \|X'\|_F = \|X\|_F$, we have
$$|V_i''^T(RR^T - A'A')V_j''| \le \epsilon'\|X\|_F^2.$$
In all, since $\sigma_i^2 \ge \frac{1}{\kappa}$ for all $i \in \{1, \ldots, k\}$,
$$\Delta_1 \le \frac{1}{\sigma_i^2\sigma_j^2}\big(\beta^2 + \sigma_j^2\beta + \sigma_i^2\beta + \epsilon'\|X\|_F^2\big) \le \kappa^2\beta^2 + 2\kappa\beta + \kappa^2\epsilon'\|X\|_F^2.$$

By definition, $\tilde{V}_l = \frac{1}{\sigma_l^2}R^TV_l''$. Thus
$$\Delta_2 = \Big|\frac{V_i''^TRAR^TV_j'' - \delta_{ij}\sigma_i^6}{\sigma_i^2\sigma_j^2}\Big|.$$
We break it into two parts. For the first term, a computation analogous to the one for $\Delta_1$ gives $|V_i''^T(RAR^T - A'A'A')V_j''| \le 2\epsilon'\|X\|_F^2$. For the second term, we have
$$\begin{aligned}
|V_i''^TA'A'A'V_j'' - \delta_{ij}\sigma_i^6| &= |V_i''^T(A' - A'')A'A'V_j'' + V_i''^TA''(A' - A'')A'V_j'' + V_i''^TA''A''(A' - A'')V_j'' + V_i''^TA''A''A''V_j'' - \delta_{ij}\sigma_i^6|\\
&\le |V_i''^T(A' - A'')A'A'V_j''| + |V_i''^TA''(A' - A'')A'V_j''| + |V_i''^TA''A''(A' - A'')V_j''| + |V_i''^TA''A''A''V_j'' - \delta_{ij}\sigma_i^6|\\
&\le \|(A' - A'')A'A'\| + \|A''(A' - A'')A'\| + \|A''A''(A' - A'')\|\\
&\le \|X'\|^4\|A' - A''\| + \|X''\|^2\|X'\|^2\|A' - A''\| + \|X''\|^4\|A' - A''\|\\
&\le \beta\|X\|_F^4.
\end{aligned}$$
In all,
$$\Delta_2 \le \frac{1}{\sigma_i^2\sigma_j^2}\big(2\|X\|_F^2\epsilon' + \beta\|X\|_F^4\big) \le \big(2\epsilon' + \beta\|X\|_F^2\big)\|X\|_F^2\kappa^2.$$

C. Proof of Theorem 6

Proof: Since the $\tilde{V}_i^T\tilde{V}_j - \delta_{ij}$ are the elements of $\tilde{V}^T\tilde{V} - I$ and $|\tilde{V}_i^T\tilde{V}_j - \delta_{ij}| \le \frac{1}{4k}$,
$$\|\tilde{V}^T\tilde{V} - I\| \le k\max_{i,j}|\tilde{V}_i^T\tilde{V}_j - \delta_{ij}| \le \frac{1}{4}.$$
Thus $\tilde{V}^T\tilde{V}$ is invertible and
$$\|(\tilde{V}^T\tilde{V})^{-1}\| \le \frac{1}{1 - \|\tilde{V}^T\tilde{V} - I\|} \le \frac{4}{3}.$$
Take $B = \tilde{V}\Sigma^{-2}\tilde{V}^TA - I_m$; we have
$$|\tilde{V}_i^TB\tilde{V}_j| = \Big|\sum_{l=1}^k\frac{\tilde{V}_i^T\tilde{V}_l\cdot\tilde{V}_l^TA\tilde{V}_j}{\sigma_l^2} - \tilde{V}_i^T\tilde{V}_j\Big|.$$
We break it into two parts:
$$|\tilde{V}_i^TB\tilde{V}_j| \le \Big|\sum_{l=1}^k\frac{\tilde{V}_i^T\tilde{V}_l}{\sigma_l^2}\big(\tilde{V}_l^TA\tilde{V}_j - \delta_{lj}\sigma_l^2\big)\Big| + \Big|\sum_{l=1}^k\tilde{V}_i^T\tilde{V}_l\delta_{lj} - \tilde{V}_i^T\tilde{V}_j\Big|.$$
The second term is zero because
$$\Big|\sum_{l=1}^k\tilde{V}_i^T\tilde{V}_l\delta_{lj} - \tilde{V}_i^T\tilde{V}_j\Big| = |\tilde{V}_i^T\tilde{V}_j - \tilde{V}_i^T\tilde{V}_j|.$$
APPENDIX B
THE CONSTRUCTION METHOD OF DATASETS

In our experiments, we constructed artificial datasets that are low-rank or can be approximated by low-rank matrices. Our construction method is as follows:
1. Firstly, we multiply a random matrix $A$ of size $n \times k$ with another random matrix $B$ of size $k \times m$. The elements of both are uniformly distributed in $[-0.5, 0.5]$. Denote the product by $X$; then the rank of $X$ is at most $k$.
2. We add a perturbation to the matrix $X$ by adding to every element a random number uniformly distributed in $[-0.1\bar{x}, 0.1\bar{x}]$, where $\bar{x}$ is the average of the absolute values of the elements of $X$. After the perturbation, $X$ is no longer low-rank but can still be approximated by a low-rank matrix.
3. We normalize $X$ so that it has operator norm 1.
4. We divide the column vectors of $X$ into two classes by a random hyperplane $w^Tx = 0$ passing through the origin (by a random hyperplane we mean that the elements of $w$ are sampled uniformly at random from $[0, 1]$), while making sure that neither class is empty.
5. We now have $m$ linearly separable labeled vectors, each of length $n$. We choose $m_1$ of them uniformly at random for training and use the remaining $m_2 = m - m_1$ for testing, while making sure that the training set includes vectors of both classes.
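The construction above translates directly into code; the following sketch (our own, with arbitrary sizes) reproduces steps 1-5 for a rank-$k$ dataset.

```python
import numpy as np

def make_dataset(n, m, k, m1, rng):
    """Steps 1-5 of Appendix B: low-rank-approximated data with hyperplane labels."""
    X = rng.uniform(-0.5, 0.5, (n, k)) @ rng.uniform(-0.5, 0.5, (k, m))   # step 1: rank <= k
    xbar = np.mean(np.abs(X))
    X += rng.uniform(-0.1 * xbar, 0.1 * xbar, X.shape)                    # step 2: perturbation
    X /= np.linalg.norm(X, 2)                                             # step 3: operator norm 1
    while True:
        w = rng.uniform(0.0, 1.0, n)                                      # step 4: random hyperplane
        y = np.where(w @ X > 0, 1, -1)
        if len(set(y)) == 2:
            break
    while True:
        train = rng.choice(m, size=m1, replace=False)                     # step 5: train/test split
        if len(set(y[train])) == 2:
            break
    test = np.setdiff1d(np.arange(m), train)
    return X, y, train, test

X, y, train, test = make_dataset(n=100, m=200, k=2, m1=150, rng=np.random.default_rng(8))
print(X.shape, y[:10], len(train), len(test))
```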
APPENDIX C
THE EFFECTIVENESS OF ALG. 4

The goal of Alg. 4 is to sample a column index and a row index from $Z$. We show that it achieves this goal.

Steps 1-3 sample the column index. They are essentially Alg. 2 with $A = \mathrm{Diag}(\|x_1\|^{p-1}, \ldots, \|x_m\|^{p-1})$ and $b = (\|x_1\|, \ldots, \|x_m\|)$, which samples from the column norm vector $(\|x_1\|^p, \ldots, \|x_m\|^p)$ of $Z$ to get the column index $j$. We note that in practical applications, Steps 1-3 can be adjusted for speedup, for instance by the frugal rejection sampling suggested in [37].

Step 4 samples the row index. Suppose $l = \sum_{\tau=1}^p(i_\tau - 1)n^{p-\tau} + 1$. According to the definition of the tensor power, the $l$-th element of $x_j^{\otimes p}$ is
$$(x_j^{\otimes p})_l = \prod_{\tau=1}^px_{i_\tau j}.$$
When Step 4 performs $p$ samplings on $x_j$, the probability of getting the outcome $i_1, i_2, \ldots, i_p$ is proportional to $|\prod_{\tau=1}^px_{i_\tau j}|^2$, which is exactly the probability of sampling out $(x_j^{\otimes p})_l$ from $x_j^{\otimes p}$. Thus we output the index $l = \sum_{\tau=1}^p(i_\tau - 1)n^{p-\tau} + 1$.
ACKNOWLEDGMENT

The authors would like to thank Yi-Fei Lu for helpful discussions.

REFERENCES

[1] H.-L. Huang, D. Wu, D. Fan, and X. Zhu, "Superconducting quantum computing: a review," Science China Information Sciences, vol. 63, no. 180501, 2020.
[2] P. W. Shor, "Algorithms for quantum computation: Discrete logarithms and factoring," in Proc. 35th Annual Symposium Foundations Computer Sci., Santa Fe, NM, USA: IEEE, Nov. 1994, pp. 124–134. [Online]. Available: https://ieeexplore.ieee.org/document/365700
[3] C.-Y. Lu, D. E. Browne, T. Yang, and J.-W. Pan, "Demonstration of a compiled version of Shor's quantum factoring algorithm using photonic qubits," Physical Review Letters, vol. 99, no. 25, p. 250504, 2007. [Online]. Available: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.99.250504
[4] H.-L. Huang, Q. Zhao, X. Ma, C. Liu, Z.-E. Su, X.-L. Wang, L. Li, N.-L. Liu, B. C. Sanders, C.-Y. Lu et al., "Experimental blind quantum computing for a classical client," Physical Review Letters, vol. 119, no. 5, p. 050503, 2017. [Online]. Available: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.119.050503
[5] L. K. Grover, "A fast quantum mechanical algorithm for database search," in Proc. 28th Annual ACM Symposium Theory Computing, Philadelphia, Pennsylvania, USA: ACM, May 1996, pp. 212–219. [Online]. Available: http://doi.acm.org/10.1145/237814.237866
[6] T. Li, W.-S. Bao, H.-L. Huang, F.-G. Li, X.-Q. Fu, S. Zhang, C. Guo, Y.-T. Du, X. Wang, and J. Lin, "Complementary-multiphase quantum search for all numbers of target items," Physical Review A, vol. 98, no. 6, p. 062308, 2018. [Online]. Available: https://journals.aps.org/pra/abstract/10.1103/PhysRevA.98.062308
[7] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, "Quantum machine learning," Nature, vol. 549, no. 7671, pp. 195–202, Sept. 2017. [Online]. Available: https://doi.org/10.1038/nature23474
[8] H.-L. Huang, X.-L. Wang, P. P. Rohde, Y.-H. Luo, Y.-W. Zhao, C. Liu, L. Li, N.-L. Liu, C.-Y. Lu, and J.-W. Pan, "Demonstration of topological data analysis on a quantum processor," Optica, vol. 5, no. 2, pp. 193–198, 2018. [Online]. Available: https://www.osapublishing.org/optica/abstract.cfm?uri=optica-5-2-193
[9] J. Liu, K. H. Lim, K. L. Wood, W. Huang, C. Guo, and H.-L. Huang, "Hybrid quantum-classical convolutional neural networks," arXiv preprint, 2019. [Online]. Available: https://arxiv.org/abs/1911.02998
[10] H.-L. Huang, Y.-W. Zhao, T. Li, F.-G. Li, Y.-T. Du, X.-Q. Fu, S. Zhang, X. Wang, and W.-S. Bao, "Homomorphic encryption experiments on IBM's cloud quantum computing platform," Frontiers of Physics, vol. 12, no. 1, p. 120305, 2017. [Online]. Available: https://link.springer.com/article/10.1007/s11467-016-0643-9
[11] H.-L. Huang, Y. Du, M. Gong, Y. Zhao, Y. Wu, C. Wang, S. Li, F. Liang, J. Lin, Y. Xu et al., "Experimental quantum generative adversarial networks for image generation," arXiv:2010.06201, 2020.
[12] H.-L. Huang, A. K. Goswami, W.-S. Bao, and P. K. Panigrahi, "Demonstration of essentiality of entanglement in a Deutsch-like quantum algorithm," SCIENCE CHINA Physics, Mechanics & Astronomy, vol. 61, no. 060311, 2018.
[13] H.-L. Huang, M. Narożniak, F. Liang, Y. Zhao, A. D. Castellano, M. Gong, Y. Wu, S. Wang, J. Lin, Y. Xu et al., "Emulating quantum teleportation of a Majorana zero mode qubit," Physical Review Letters, vol. 126, no. 9, p. 090502, 2021.
[14] D. R. Simon, "On the power of quantum computation," SIAM J. Comput., vol. 26, no. 5, pp. 1474–1483, July 1997. [Online]. Available: https://doi.org/10.1137/S0097539796298637
[15] I. Kerenidis and A. Prakash, "Quantum recommendation systems," in 8th Innovations Theoretical Computer Sci. Conf., ser. Leibniz International Proceedings in Informatics (LIPIcs), vol. 67, Berkeley, CA, USA, Jan. 2017, pp. 49:1–49:21. [Online]. Available: http://drops.dagstuhl.de/opus/volltexte/2017/8154
[16] E. Tang, "A quantum-inspired classical algorithm for recommendation systems," in Proc. 51st Annual ACM SIGACT Symposium Theory Computing, New York, NY, USA: ACM, June 2019, pp. 217–228. [Online]. Available: https://doi.org/10.1145/3313276.3316310
[17] A. Frieze, R. Kannan, and S. Vempala, "Fast Monte-Carlo algorithms for finding low-rank approximations," J. Assoc. Comput. Mach., vol. 51, no. 6, pp. 1025–1041, Nov. 2004. [Online]. Available: http://doi.acm.org/10.1145/1039488.1039494
[18] S. Lloyd, M. Mohseni, and P. Rebentrost, "Quantum principal component analysis," Nat. Phys., vol. 10, no. 9, pp. 631–633, July 2014. [Online]. Available: https://doi.org/10.1038/nphys3029
[19] S. Lloyd, M. Mohseni, and P. Rebentrost, "Quantum algorithms for supervised and unsupervised machine learning," arXiv preprint, Nov. 2013. [Online]. Available: https://arxiv.org/abs/1307.0411
[20] E. Tang, "Quantum-inspired classical algorithms for principal component analysis and supervised clustering," arXiv preprint, Oct. 2018. [Online]. Available: http://arxiv.org/abs/1811.00414
[21] A. Gilyén, S. Lloyd, and E. Tang, "Quantum-inspired low-rank stochastic regression with logarithmic dependence on the dimension," arXiv preprint, Nov. 2018. [Online]. Available: http://arxiv.org/abs/1811.04909
[22] N.-H. Chia, H.-H. Lin, and C. Wang, "Quantum-inspired sublinear classical algorithms for solving low-rank linear systems," arXiv preprint, Nov. 2018. [Online]. Available: https://arxiv.org/abs/1811.04852
[23] A. W. Harrow, A. Hassidim, and S. Lloyd, "Quantum algorithm for linear systems of equations," Phys. Rev. Lett., vol. 103, no. 15, p. 150502, Oct. 2009.
[24] J. M. Arrazola, A. Delgado, B. R. Bardhan, and S. Lloyd, "Quantum-inspired algorithms in practice," Quantum, vol. 4, p. 307, Aug. 2020. [Online]. Available: https://doi.org/10.22331/q-2020-08-13-307
[25] P. J. Phillips, "Support vector machines applied to face recognition," in Advances Neural Inform. Processing Systems, vol. 48, no. 6241, Gaithersburg, MD, USA, Nov. 1999, pp. 803–809. [Online]. Available: https://doi.org/10.6028/nist.ir.6241
[26] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Process. Lett., vol. 9, no. 3, pp. 293–300, June 1999. [Online]. Available: https://doi.org/10.1023/A:1018628609742
[27] J. Platt, "Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines," Apr. 1998. [Online]. Available: https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/
[28] H. P. Graf, E. Cosatto, L. Bottou, I. Dourdanovic, and V. Vapnik, "Parallel Support Vector Machines: The Cascade SVM," in Advances in Neural Information Processing Systems 17, L. K. Saul, Y. Weiss, and L. Bottou, Eds. MIT Press, 2005, pp. 521–528. [Online]. Available: http://papers.nips.cc/paper/2608-parallel-support-vector-machines-the-cascade-svm.pdf
[29] J. Xu, Y. Y. Tang, B. Zou, Z. Xu, L. Li, Y. Lu, and B. Zhang, "The generalization ability of SVM classification based on Markov sampling," IEEE Transactions on Cybernetics, vol. 45, no. 6, pp. 1169–1179, 2014. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/6881630
[30] B. Zou, C. Xu, Y. Lu, Y. Y. Tang, J. Xu, and X. You, "k-times Markov sampling for SVMC," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 4, pp. 1328–1341, 2017. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/7993056/
[31] P. Rebentrost, M. Mohseni, and S. Lloyd, "Quantum support vector machine for big data classification," Phys. Rev. Lett., vol. 113, p. 130503, Sept. 2014. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.113.130503
[32] J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah, "Julia: A fresh approach to numerical computing," SIAM Review, vol. 59, no. 1, pp. 65–98, 2017. [Online]. Available: https://doi.org/10.1137/141000671
[33] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 27:1–27:27, 2011, software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[34] D. Achlioptas, F. McSherry, and B. Schölkopf, "Sampling techniques for kernel methods," in Advances Neural Inform. Processing Systems, T. G. Dietterich, S. Becker, and Z. Ghahramani, Eds. Vancouver, British Columbia, Canada: MIT Press, Dec. 2002, pp. 335–342. [Online]. Available: https://papers.nips.cc/paper/2072-sampling-techniques-for-kernel-methods
[35] L. Wang, Support Vector Machines for Signal Processing, 1st ed. The Netherlands: Springer, Berlin, Heidelberg, 2005, ch. 15, pp. 321–342. [Online]. Available: https://doi.org/10.1007/b95439
[36] L. Wang, Multiple Model Estimation for Nonlinear Classification, 1st ed. The Netherlands: Springer, Berlin, Heidelberg, 2005, ch. 2, pp. 49–76. [Online]. Available: https://doi.org/10.1007/b95439
[37] I. L. Markov, A. Fatima, S. V. Isakov, and S. Boixo, "Quantum Supremacy Is Both Closer and Farther than It Appears," arXiv preprint, Sep. 2018. [Online]. Available: http://arxiv.org/abs/1807.10749

Chen Ding received the B.S. degree from University of Science and Technology of China, Hefei, China, in 2019. He is currently a graduate student in the CAS Centre for Excellence and Synergetic Innovation Centre in Quantum Information and Quantum Physics. His current research interests include quantum machine learning, quantum-inspired algorithm design, and variational quantum computing.

Tian-Yi Bao received the B.S. degree from the University of Michigan, Ann Arbor, USA, in 2020. She is currently a graduate student at Oxford University. Her current research interests include machine learning and human-computer interaction.

He-Liang Huang received the Ph.D. degree from the University of Science and Technology of China, Hefei, China, in 2018. He is currently an Assistant Professor at the Henan Key Laboratory of Quantum Information and Cryptography, Zhengzhou, China, and a Postdoctoral Fellow at the University of Science and Technology of China, Hefei, China. He has authored or co-authored over 30 papers in refereed international journals and co-authored 1 book. His current research interests include secure cloud quantum computing, big data quantum computing, and the physical implementation of quantum computing architectures, in particular using linear optical and superconducting systems.