Binary Classifiers for Noisy Datasets
We apply quantum machine learning frameworks to improve binary classification models for noisy datasets, which are prevalent in financial markets. The metric we use for assessing the performance of our quantum classifiers is the area under the receiver operating characteristic curve (ROC/AUC). By combining such approaches as hybrid neural networks, parametric circuits and data re-uploading, we create QML-inspired architectures and utilise them for the classification of non-convex 2 and 3-dimensional figures. An extensive benchmarking of our new FULL HYBRID classifiers against existing quantum and classical classifier models reveals that our novel models exhibit better learning characteristics under asymmetrical Gaussian noise in the dataset compared to known quantum classifiers, and perform equally well compared to existing classical classifiers, with a slight improvement over classical results in the high-noise region.
Introduction.— Noisy Intermediate-Scale Quantum (NISQ) [1–3] devices hold a promise to deliver a practical quantum advantage by harnessing the complexity of quantum systems. Despite being several years away from fault-tolerant quantum computing [4–6], researchers have been hopeful of achieving this goal. Perhaps one of the most exciting breakthroughs in this direction was the demonstration of "quantum supremacy" by Google researchers [7], using their programmable superconducting Sycamore chip with 53 qubits, in which average single-qubit gate fidelities of 99.85% and two-qubit gate fidelities of 99.64% were obtained. Here the task of sampling the output of a pseudo-random quantum circuit was successfully achieved. Quantum supremacy would imply that a universal quantum computer has the ability to perform certain tasks exponentially faster than a classical computer [8]. However, it has later been argued that Google's achievement amounted to a demonstration of a quantum advantage but not of a practical advantage; in other words, the performed task was not useful for any real-life applications. Another quantum-advantage breakthrough experiment has been implemented [9] utilising the Jiuzhang photonic quantum computer to perform Gaussian boson sampling (GBS) with 50 indistinguishable single-mode squeezed states. Here, quantum advantage was elucidated in the sampling time complexity of a Torontonian matrix, which scales exponentially with the number of output photon clicks. However, this experiment demonstrates quantum advantage but fails to demonstrate quantum supremacy, as this photonic quantum computer is not programmable.

One of the most promising areas of research for obtaining practical advantage is Quantum Machine Learning (QML) [10–12], which was born as a result of the cross-fertilisation of ideas between Quantum Computing [13, 14] and Classical Machine Learning [15, 16]. QML is similar in spirit to classical machine learning, with the main difference that instead of classical neurons in the layers of a deep neural network, we now have qubits and quantum gates acting on qubits, combined with quantum measurements playing the role of the activation function. The elegant field of QML has been providing a new platform for devising algorithms that exhibit quantum speedups. For instance, it has been demonstrated that such basic linear-algebra subroutines as solving certain types of linear equations (the quantum version is known in the community as HHL), finding eigenvectors and eigenvalues, and principal component analysis (PCA) exhibit exponential speedups compared to their classical counterparts [17–21]. Since we are dealing with a quantum system, one can utilise such quantum resources as coherence, entanglement, negativity and contextuality to leverage towards achieving practical advantage. However, it is still not completely understood what the role of the different types of resources is in harnessing practical advantage from the available 50-100 qubit noisy devices [3]. The three main building blocks of any QML algorithm are data encoding, unitary evolution of the system, and state readout performed through measurement [12]. Uploading classical data into a quantum computer is not a trivial task and can account for most of the complexity of the algorithm, determining what kind of speedups are feasible. This procedure is called quantum embedding, and it can be achieved, for instance, with the help
of "quantum feature maps" [22–26], which take classical data and map it to a high-dimensional Hilbert space, where one hopes to achieve a higher separation between the data classes than in the original coordinate system. Moreover, one can train the quantum embedding to achieve maximal separation between the data clusters in the Hilbert space (an approach that has been coined "quantum metric learning") [25, 26], paving the way towards constructing faithful quantum classifiers.
Binary classification is a ubiquitous task in machine learning. Perhaps the most prominent example is the cat recognition algorithm, which gives a flavour of the power brought by utilising such basic tools as logistic regression combined with deep neural network architectures [15]. Quantum classifiers hold a promise to bring feasible speedups compared to their classical counterparts. Several theoretical proposals, combined with actual experimental runs on commercially available backends, have been put forward for realising faithful quantum classifiers [22, 23, 27–36]. For instance, the approaches in Refs. [35, 36] are inspired by the kernel methods used in classical machine learning. Refs. [22, 27, 28] combine certain types of quantum embeddings to obtain quantum hybrid neural networks, which are promising candidates for building a faithful classifier. Ref. [37] suggests using hypergraph states [38], with the assumption that such states can lower the circuit depth of the classifier. Refs. [31, 32] are based on Grover's quantum search algorithm.

In this manuscript, we take a rather pragmatic approach and try to benefit from the plethora of available QML software packages [39–43], which grant access to running quantum circuits on a quantum simulator or on actual hardware (such as IBM Quantum Experience, Amazon Braket, Rigetti Computing, Strawberry Fields). By utilising these tools we provide new software that is particularly well suited for targeting classification problems on the unbalanced and noisy datasets which are prevalent in the financial industry [44].

In this paper we first briefly outline and review the three necessary building-block QML architectures for our software package: hybrid neural networks [22, 27, 28], parametric quantum circuits [2, 45–47] and data re-uploading [23, 24].

The metric we use for assessing the performance of our quantum classifiers is the area under the receiver operating characteristic curve (ROC/AUC). The ROC is a probability curve and the AUC represents the degree of separability; in general, a good model has an AUC close to 1.
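For concreteness, the AUC can be computed from the true labels and the classifier's class-1 scores with scikit-learn; a minimal sketch with made-up numbers:

from sklearn.metrics import roc_auc_score

# Ground-truth binary labels and (made-up) classifier scores for class 1.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.8, 0.9, 0.6, 0.3, 0.7, 0.2]

# AUC close to 1 indicates good separability; 0.5 is no better than chance.
print("ROC/AUC:", roc_auc_score(y_true, y_score))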
We test our FULL HYBRID models and benchmark them against existing QML classifiers, as well as against the best known classical machine learning counterparts, by running simulations on quantum simulators for three different 2-dimensional non-convex surfaces. It is believed that non-convex boundaries represent more difficult classification problems, as linear regression is bound to fail in such tasks. Then, by introducing asymmetrical Gaussian noise, we study the resilience of our different approaches to the noise. This kind of study sheds light on the learning properties as a function of the amount of noise in the dataset. We also perform systematic hyperparameter tuning by studying how the ROC/AUC changes with the number of repeating units in the data re-uploading approach, the number of qubits, the batch size, the number of epochs and the number of strongly entangling units. We remark that our binary classifiers can be extended to multi-class classification problems using a one-versus-all approach.
The manuscript is organised as follows. In Section I we explain what kind of classification problems for 2 and 3-dimensional non-convex surfaces we tackle in the current study. In Section II we briefly review the three building blocks of the QML frameworks, which we then utilise in Section III for our novel classification circuit that combines the best features of these building blocks. In Section IV we benchmark several known QML approaches (including our QML classifier) along with the best known classical counterparts on binary classification problems. Here we focus, in particular, on prediction grids and ROC/AUC characteristics for assessing the performance of the classifiers. Section V is devoted to the conclusions and future directions. Finally, in Appendix A we show some results for 3-dimensional non-convex boundary classification problems and demonstrate the performance of our FULL HYBRID classifiers.

I. PROBLEM SETTING

We consider a non-trivial classification problem and will train single- and multi-qubit variational quantum circuits to achieve this goal. The data is generated as a set of random points in a plane (x1, x2) and labelled as 1 (blue) or 0 (red) depending on whether they lie inside or outside of a given 2-dimensional non-convex figure. The goal is to train a quantum circuit to predict the label (red or blue) given an input point's coordinates.
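A sketch of this data-generation procedure is given below for a generic non-convex region (a crescent, chosen purely for illustration; the figures used in our experiments differ), together with one possible way of injecting the asymmetric Gaussian noise discussed above, corrupting the two coordinates unequally:

import numpy as np

rng = np.random.default_rng(seed=0)

def inside_crescent(x1, x2):
    # A non-convex region: inside the unit disc but outside an offset disc.
    return (x1**2 + x2**2 < 1.0) & ((x1 - 0.4)**2 + x2**2 > 0.45)

def make_dataset(n_points=500, noise=0.0):
    x = rng.uniform(-1.5, 1.5, size=(n_points, 2))
    y = inside_crescent(x[:, 0], x[:, 1]).astype(int)  # 1 = blue, 0 = red
    # Asymmetric Gaussian noise: the two coordinates are corrupted with
    # different standard deviations (an illustrative choice).
    x[:, 0] += rng.normal(0.0, noise, size=n_points)
    x[:, 1] += rng.normal(0.0, 0.5 * noise, size=n_points)
    return x, y

X, y = make_dataset(n_points=1000, noise=0.2)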
II. REVIEW OF EXISTING QML FRAMEWORKS

In this section we briefly review the three different necessary building-block QML architectures for our software package: hybrid neural networks [22, 27, 28], variational circuits [2, 46] and data re-uploading [23, 24].

A. Hybrid classical-quantum classifier (Hybrid)

Hybrid neural networks are formed by concatenating classical and quantum neural networks, and can bring a great advantage by having a number of features in the initial classical layers that exceeds the number of qubits in the quantum layer. Normally we assume that in each layer we have one qubit for each feature and a sequence of one- and two-qubit gates acting on it.
B. QNode
different number of blocks. The results are shown in Fig. 8. It is apparent from Fig. 8 (first row) that with an increasing number of repeating blocks we obtain a better ROC/AUC at every noise level for the DRC classifier. In the bottom row of Fig. 8 we show results for the VC-DRC, where, compared to DRC, we obtain an even higher ROC/AUC. We remark that no major improvements are seen for a block number greater than six. From now on, in all codes of this section, we set the number of blocks equal to six (B=6). In what follows we specify the number of blocks and layers for each classifier: 1. the single-qubit DRC (B=6); 2. the 2-qubit VC (with 6 layers, L=6); 3. VC-DRC (B=6, L=1); 4. QNode (B=6, L=1); 5. FH: VC-DRC/NN (B=6, L=1); 6. FH: NN/VC-DRC (B=6, L=1). All models have been trained for a maximum of 35 epochs, using the same optimizer and learning rate. The best result achieved during the training process is shown.
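For reference, below is a sketch of a single-qubit data re-uploading circuit with B repeating blocks, following the general scheme of Ref. [23]; gate choices and parameter shapes are illustrative rather than our exact implementation:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def drc(x, params):
    # Each block re-uploads the data point and applies a trainable rotation,
    # so the circuit depth grows linearly with the number of blocks B.
    for theta in params:                                # params: shape (B, 3)
        qml.Rot(x[0], x[1], 0.0, wires=0)               # data-encoding rotation
        qml.Rot(theta[0], theta[1], theta[2], wires=0)  # trainable rotation
    return qml.expval(qml.PauliZ(0))

B = 6  # six repeating blocks, as chosen above
params = np.random.uniform(0, 2 * np.pi, size=(B, 3))
print(drc(np.array([0.1, 0.7]), params))

Increasing B deepens the circuit without adding qubits, which is why the number of blocks acts as the main expressivity hyperparameter in this comparison.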
In the left panel of Fig. 9 we compare all the previously mentioned classifiers. As we can see, VC-DRC outperforms both VC and DRC, while VC-DRC and QNode have almost identical performance. FH: NN/VC-DRC outperforms all classifiers, whilst FH: VC-DRC/NN shows slightly worse behaviour. In the right column of Fig. 9 we can see the prediction grids for all classifiers at different noise levels. For low noise levels (Noise/10=0), DRC and VC struggle to capture the prediction-grid pattern, while VC-DRC and FH almost capture it. For medium noise levels (Noise/10=6), DRC tends to capture the noise (overfitting) while VC looks more stable; VC-DRC still captures the main pattern but also shows signs of overfitting. FH performs very well thanks to the classical preprocessing and the power brought by VC-DRC. For high noise levels (Noise/10=12), FH captures the pattern and shows robustness to the noise, while the rest of the classifiers capture the noise. In order to demonstrate that FULL HYBRID does not perform well only because of the strong classical NN attached to the quantum circuit, we benchmark FH against just its classical part (NN) and just its quantum part (QNode). In Fig. 10 we show results for two NNs: one trained for 35 epochs (the same number of training epochs as in the FH) and one trained for 3000 epochs, to see the best outcome this NN can produce. We conclude that FH outperforms both of its components (NN and QNode), which shows that FH is a more powerful classifier than its isolated parts.
To test the FH classifier even further, we benchmark its performance against a large number of classical counterparts, which are specified in the inset of Fig. 11. Interestingly, this figure shows that in the high-noise region the quantum classifier even outperforms some of the classical ones, while performing equally well in all other noise regions. It also compares well with the classical approaches (QDA, decision tree, KNN and random forest) that are well suited for non-convex classification problems, showing good performance in all noise regimes.
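A self-contained sketch of such a classical benchmarking loop is given below, using scikit-learn's built-in non-convex two-moons dataset as a stand-in for our figures and symmetric Gaussian noise for simplicity; the model list mirrors the classifiers named above:

from sklearn.datasets import make_moons
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

models = {
    "QDA": QuadraticDiscriminantAnalysis(),
    "Decision tree": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(),
    "Random forest": RandomForestClassifier(),
}

for noise in [0.1, 0.3, 0.5]:  # low, medium, high noise levels
    X, y = make_moons(n_samples=1000, noise=noise, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for name, clf in models.items():
        clf.fit(X_tr, y_tr)
        auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
        print(f"noise={noise:.1f}  {name:13s} ROC/AUC = {auc:.3f}")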
In Fig. 12 we show results for a more complicated non-convex classification problem versus noise. In the table on the right we summarize the highest ROC/AUC scores for the respective classifiers, while in the left figure we show prediction grids for the respective quantum classifiers. As in the previous case, VC is more stable to noise while DRC tends to overfit and explores richer prediction grids. That is why VC-DRC, which combines both features, and the more complex FH approach give great results, as is apparent from row number 6. Surprisingly, for this particular dataset FH: NN/VC-DRC fails to capture the pattern of the dataset, while FH: VC-DRC/NN captures the pattern and has the highest ROC/AUC score. It should be noted that the FH models again outperform both of their components (NN and QNode).

V. CONCLUSIONS AND FUTURE DIRECTIONS

In this paper, we applied Quantum Machine Learning frameworks to improve binary classification models for noisy datasets, which are prevalent in financial markets. The metric used for assessing the performance of our quantum classifiers is the area under the receiver operating characteristic curve (ROC/AUC). By combining such approaches as hybrid neural networks, parametric circuits and data re-uploading, we created a new approach called Full Hybrid (FH). We tested our models on the classification of 2 and 3-dimensional non-convex datasets and benchmarked them against each other, as well as against the best known classical machine learning counterparts, by running simulations on quantum simulators. Then, by introducing asymmetrical Gaussian noise into the input datasets, we studied the resilience of our different approaches to noise. This kind of study sheds light on the learning efficacy as a function of the amount of noise in the dataset. In the scope of the manuscript we also performed systematic hyperparameter tuning by studying how the ROC/AUC changes with the number of repeating units in the data re-uploading approach, the number of qubits, the batch size, the number of epochs and the number of strongly entangling units. An extensive benchmarking of our new QML approach against existing quantum and classical classifier models reveals that our novel (FH) models exhibit better learning properties under asymmetric Gaussian noise in the dataset compared to known quantum classifiers, and perform equally well compared to existing classical counterparts. Yet more understanding of the merits of the (FH) classifier has been gained by a detailed analysis and comparison of the prediction grids for the VC, DRC, VC-DRC and QNode binary classifiers. We observed that for low noise levels, DRC and VC struggle to capture the prediction-grid pattern, while VC-DRC and FH almost fully capture it. For medium noise levels, DRC tends to capture the noise (overfitting) while VC looks more stable; VC-DRC still captures the main pattern but also shows signs of overfitting. FH performs very well thanks to the classical preprocessing and the power brought by VC-DRC. For high noise levels, (FH) captures the pattern and shows robustness to the noise.
FIG. 10. ROC/AUC for increasing levels of noise for the classification of the 2-dimensional dataset. Here we benchmark FH versus just the classical part (NN) and versus just the quantum part (QNode).
Acknowledgements.— All the codes used in the manuscript will be provided upon reasonable request. D.A. would like to thank Kishor Bharti for useful discussions on Quantum Machine Learning (QML) and on newly emerging non-VQA algorithms. P. Griffin would like to acknowledge the Monetary Authority of Singapore (MAS) and Tradeteq for their support in this work.
[1] J. Preskill, Quantum 2, 79 (2018).
[2] K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, S. Alperin-Lea, A. Anand, M. Degroote, H. Heimonen, J. S. Kottmann, T. Menke, et al., arXiv preprint arXiv:2101.08448 (2021).
[3] I. H. Deutsch, PRX Quantum 1, 020101 (2020).
[4] J. Preskill, in Introduction to Quantum Computation and Information (World Scientific, 1998) pp. 213–269.
[5] D. Gottesman, Physical Review A 57, 127 (1998).
[6] P. W. Shor, in Proceedings of 37th Conference on Foundations of Computer Science (IEEE, 1996) pp. 56–65.
[7] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. Brandao, D. A. Buell, et al., Nature 574, 505 (2019).
[8] A. W. Harrow and A. Montanaro, Nature 549, 203 (2017).
[9] H.-S. Zhong, H. Wang, Y.-H. Deng, M.-C. Chen, L.-C. Peng, Y.-H. Luo, J. Qin, D. Wu, X. Ding, Y. Hu, et al., Science 370, 1460 (2020).
[10] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Nature 549, 195 (2017).
[11] P. Wittek, Quantum Machine Learning: What Quantum Computing Means to Data Mining (Academic Press, 2014).
[12] M. Schuld, Supervised Learning with Quantum Computers (Springer, 2018).
[13] M. A. Nielsen and I. Chuang, Quantum Computation and Quantum Information (2002).
[14] J. Preskill, California Institute of Technology 16, 10 (1998).
[15] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning 1, 98 (2016).
[16] M. I. Jordan and T. M. Mitchell, Science 349, 255 (2015).
[17] A. W. Harrow, A. Hassidim, and S. Lloyd, Physical Review Letters 103, 150502 (2009).
[18] H.-Y. Huang, K. Bharti, and P. Rebentrost, arXiv preprint arXiv:1909.07344 (2019).
[19] P. Rebentrost, A. Steffens, I. Marvian, and S. Lloyd, Physical Review A 97, 012327 (2018).
[20] S. Lloyd, M. Mohseni, and P. Rebentrost, Nature Physics 10, 631 (2014).
[21] N. Wiebe, D. Braun, and S. Lloyd, Physical Review Letters 109, 050505 (2012).
[22] M. Schuld and N. Killoran, Physical Review Letters 122, 040504 (2019).
[23] A. Pérez-Salinas, A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre, Quantum 4, 226 (2020).
[24] A. Pérez-Salinas, D. López-Núñez, A. García-Sáez, P. Forn-Díaz, and J. I. Latorre, arXiv preprint arXiv:2102.04032 (2021).
[25] S. Lloyd, M. Schuld, A. Ijaz, J. Izaac, and N. Killoran, arXiv preprint arXiv:2001.03622 (2020).
[26] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, Physical Review A 98, 032309 (2018).
[27] M. Schuld, A. Bocharov, K. M. Svore, and N. Wiebe, Physical Review A 101, 032308 (2020).
[28] E. Farhi and H. Neven, arXiv preprint arXiv:1802.06002 (2018).
[29] F. Tacchino, C. Macchiavello, D. Gerace, and D. Bajoni, npj Quantum Information 5, 1 (2019).
[30] W. Cappelletti, R. Erbanni, and J. Keller, in 2020 IEEE International Conference on Quantum Computing and Engineering (QCE) (IEEE, 2020) pp. 22–29.
[31] N. Wiebe, A. Kapoor, and K. M. Svore, arXiv preprint arXiv:1602.04799 (2016).
[32] Y. Liao, D. Ebler, F. Liu, and O. Dahlsten, arXiv preprint arXiv:1810.12948 (2018).
[33] M. Schuld, M. Fingerhuth, and F. Petruccione, EPL (Europhysics Letters) 119, 60002 (2017).
[34] P. Tiwari and M. Melucci, IEEE Access 7, 42354 (2019).
[35] C. Blank, D. K. Park, J.-K. K. Rhee, and F. Petruccione, npj Quantum Information 6, 1 (2020).
[36] D. K. Park, C. Blank, and F. Petruccione, Physics Letters A 384, 126422 (2020).
[37] F. Tacchino, C. Macchiavello, D. Gerace, and D. Bajoni, npj Quantum Information 5, 1 (2019).
[38] M. Rossi, M. Huber, D. Bruß, and C. Macchiavello, New Journal of Physics 15, 113022 (2013).
[39] V. Bergholm, J. Izaac, M. Schuld, C. Gogolin, M. S. Alam, S. Ahmed, J. M. Arrazola, C. Blank, A. Delgado, S. Jahangiri, et al., arXiv preprint arXiv:1811.04968 (2018).
[40] N. Killoran, J. Izaac, N. Quesada, V. Bergholm, M. Amy, and C. Weedbrook, Quantum 3, 129 (2019).
[41] M. Broughton, G. Verdon, T. McCourt, A. J. Martinez, J. H. Yoo, S. V. Isakov, P. Massey, M. Y. Niu, R. Halavati, E. Peters, et al., arXiv preprint arXiv:2003.02989 (2020).
[42] S. Efthymiou, S. Ramos-Calderer, C. Bravo-Prieto, A. Pérez-Salinas, D. García-Martín, A. Garcia-Saez, J. I. Latorre, and S. Carrazza, arXiv preprint arXiv:2009.01845 (2020).
[43] J. Kottmann, S. Alperin-Lea, T. Tamayo-Mendoza, A. Cervera-Lierta, C. Lavigne, T.-C. Yen, V. Verteletskyi, P. Schleich, A. Anand, M. Degroote, et al., Quantum Science and Technology (2021).
[44] R. Orus, S. Mugel, and E. Lizaso, Reviews in Physics 4, 100028 (2019).
[45] M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, Quantum Science and Technology 4, 043001 (2019).
[46] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, et al., arXiv preprint arXiv:2012.09265 (2020).
[47] L. Funcke, T. Hartung, K. Jansen, S. Kühn, and P. Stornati, Quantum 5, 422 (2021).
[48] https://qiskit.org/textbook/ch-machine-learning/machine-learning-qiskit-pytorch.html (2020).
[49] S. Ahmed, https://pennylane.ai/qml/demos/tutorial-data-reuploading-classifier.html (2021).
[50] https://pennylane.ai/qml/demos/tutorial/variational/classifier.html (2021).
[51] S. Lloyd, M. Schuld, A. Ijaz, J. Izaac, and N. Killoran, arXiv preprint arXiv:2001.03622 (2020).
[52] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Nature Communications 9, 1 (2018).
[53] T. Haug and K. Bharti, arXiv preprint arXiv:2011.14737 (2020).
[54] K. Bharti, arXiv preprint arXiv:2009.11001 (2020).
[55] H. Carmichael, An Open Systems Approach to Quantum Optics: Lectures Presented at the Université Libre de Bruxelles October 28 to November 4, 1991, 5 (1993).