Paper 23
Nouhaila Innan ,1, 2, ∗ Muhammad Al-Zafar Khan ,2, 3, † and Mohamed Bennai1, ‡
1
Quantum Physics and Magnetism Team, LPMC,
arXiv:2308.05237v1 [quant-ph] 9 Aug 2023
Abstract
In this research, a comparative study of four Quantum Machine Learning (QML) models was
conducted for fraud detection in finance. We found that the Quantum Support Vector Classifier
model achieved the highest performance, with F1 scores of 0.98 for fraud and non-fraud classes.
Other models like the Variational Quantum Classifier, Estimator Quantum Neural Network (QNN),
and Sampler QNN demonstrate promising results, propelling the potential of QML classification
for financial applications. While they exhibit certain limitations, the insights attained pave the way
for future enhancements and optimisation strategies. However, challenges exist, including the need
for more efficient quantum algorithms and larger and more complex datasets. The article provides
solutions to overcome current limitations and contributes new insights to the field of Quantum
Machine Learning in fraud detection, with important implications for its future development.
Keywords: Quantum Machine Learning, Quantum Neural Networks, Quantum Feature Maps, Fraud
Detection.
I. INTRODUCTION
Fraud is the act of deceiving and misleading a person, or group of people, with the
intention of obtaining some kind of gain (oftentimes financial). It involves presenting
misrepresented information or data to the victim, often an offer that seems “too good to be true”,
or requesting the victim’s private data. Frequently, the targets of these attacks are
elderly people or individuals who are not technologically inclined. Fraudsters play on
the emotions of their victims, usually by creating a sense of urgency around performing a
certain task, such as disclosing confidential information like identity/social
security numbers, PIN codes, One-Time Pins (OTPs), or other information that can render
the victim susceptible. Over the years, fraud schemes have become more sophisticated,
and with Generative Artificial Intelligence (GenAI) becoming ubiquitous,
more polished schemes have emerged, such as phishing scams
and the use of Natural Language Processing (NLP) to imitate the voices of the victim’s family members or
friends in order to gain their trust.
Broadly speaking, fraud can be categorised into the following flavours:
I.1.1. Purloinment of Identity: Also known as “identity theft”, this occurs when the
perpetrator steals personal information from the victim and “assumes their identity”
by using their details with nefarious intent, for example using the victim’s personal
identification number, applying for licenses, or using the victim’s debit/credit card
details to purchase goods or pay for services.
I.1.2. Insurance Claims Fraud: This occurs when the perpetrator intentionally files false
insurance claims or inflates the value of losses that occurred.
I.1.3. Financial Fraud: This falls under the broader category of white collar crimes and
constitutes:
I.1.3.1. Accounting Fraud: Also known as “cooking the books”. This involves the
deliberate manipulation and misrepresentation of figures in financial statements to
mislead investors and interested parties regarding the company’s financial health.
I.1.3.2. Ponzi and Pyramidal Schemes: These constitute schemes whereby victims
outlay some capital with the promise of receiving enormously high returns in
short periods of time. In these schemes, funds are taken from the late investor
“Tom” and given to the earlier investors “Dick” and “Harry”. At the end of these
schemes, the late investors are not paid out the promised return, or any return
whatsoever, and the so called “expert investment manager” disappears.
I.1.3.3. Embezzlement: This type of fraud occurs when an entrusted party in a com-
pany holds fiduciary responsibilities and abuses their power by stealing or mis-
appropriating funds, or assets, to suit their own objectives.
I.1.3.4. Insider Trading: This occurs when a party has access to non-public, privileged
information about the company and trades on the expectation of the company’s stock price
rising or plummeting. This ties into corporate espionage, where spies are deployed
into companies to steal trade secrets and report them to parties of interest, who
use this information to take advantage of the company.
I.1.4. Wire Fraud: Using electronic media such as emails, phone calls, text messages, or
personalised social media messages to hoodwink victims. Typically, scammers act
under false pretences, impersonating an agent at a bank or institution, and ask the victim
to transfer funds from their accounts or disclose sensitive data. In addition, these
scammers play on the victim’s personal troubles, like romance (the famous “Nigerian
Prince scam”), the victim’s financial woes with lottery prize or inheritance
scams, the victim’s philanthropic nature with charity scams, the victim’s need to secure
employment with job offer scams, or use tech support scams.
I.1.5. Credit Fraud: This involves the unauthorised use of the victim’s debit or credit
cards to purchase goods or pay for services. Typically, this would involve
the scammer getting hold of the victim’s 16-digit card number, then phishing for the
card’s expiry date and the 3-digit Card Verification Value (CVV).
I.1.6. Internet Fraud: This is the collective term for online scams and phishing attacks
whereby the scammer uses emails, pop-up messages, websites, and social media to get
the victim to make a payment or disclose their confidential information.
The focus of this paper is concentrated on credit fraud. According to a 2022 study by UK
Finance, fraud resulted in losses of £1.2 bil. (sterling), and 80% of app fraud originates from
online solicitations. In a 2023 study published by the news agency CNBC, it is estimated
that in 2022, fraud cost consumers in the US $8.8 bil. Such high consumer costs can
contribute to economic downturns for countries and, in turn, to wider economic
instability. Thus, an accurate and quick fraud detection system is needed to curb this type of
fraud.
The idea of fraud detection using (Classical) Machine Learning (CML) models is not
novel and oftentimes forms a standard textbook exercise/capstone project in this regard,
and many big corporates across the financial, telecommunications, and consulting industries
have fraud detection models deployed into production. For example, CML
models that utilise Multivariate Logistic Regression (see Alenzi & Aljehane, 2020), Support
Vector Machines (SVMs; see Kumar et al, 2022; Gyamfi & Abdulai, 2018), Random Forest
Classifiers (see Liu et al, 2015; Xuan et al, 2018a; Xuan et al, 2018b), and Gradient Boosting
Machines (see Taha & Malebary, 2020), as well as comparative studies across methods (see Kumar et
al, 2020; Han et al, 2020; Afriyie et al, 2023) and ensembles of models (see Nandi
et al, 2022), show high fidelity, robustness, and ease of implementation.
Additionally, researchers have applied various Deep Learning (DL) approaches, including
Autoencoders and Restricted Boltzmann Machines (RBMs; see Pumsirirat & Yan, 2018) and
Graph Neural Networks (GNNs; see Ma et al, 2021). The only time-consuming aspect of
the model lifecycle is data cleaning and feature engineering.
Quantum Machine Learning (QML) is a developing field in which researchers began
to express interest in the early 2000s, combining the then-emerging field of
Quantum Computing (QC), an idea credited to Feynman, 1982, with CML. The goal is to
leverage the properties of qubits, the fundamental units of QC, and QML algorithms to obtain
a computational advantage over analogous classical approaches.
However, the crystallisation and commercialisation of these ideas began to flourish in the
early 2010s. A pioneering book and a pioneering paper are credited to Wittek, 2014
and Biamonte et al, 2017 respectively, who set the stage for a formalised research track.
Of course, if one looks deep enough, one may find many earlier papers, but it is beyond
the scope of this paper to list research works in chronological order; we mention those with the highest
impact. Potentially, QML can radically transform the paradigm and approach to CML
by facilitating the discovery of novel algorithms that are more efficient than their classical
counterparts. Since this is a rapidly developing field and we are in the Noisy Intermediate-Scale
Quantum (NISQ) era of QC (see Preskill, 2018), there is no single approach. We
discuss these approaches in Tab. I. below.
Tab. I.
Approach | Description
Quantum Approach to CML | This entails the development of novel Quantum algorithms to solve computationally-expensive CML tasks. For example, the Quantum Support Vector Classifier has been shown to train on large datasets faster than the classical Support Vector Machine.
Quantum-supplemented Approach to CML | This involves using Quantum principles to enhance existing CML algorithms. For example, the Quantum Neural Network (QNN) offers several advantages over the classical Neural Network (CNN).
Composite Classical-Quantum Machine Learning | This approach offers a hybrid procedure that combines elements from classical computing and QC to solve CML tasks. For example, a Quantum Computer may be used to preprocess the data, and a CML algorithm may be used to optimise the model’s weights, biases, and additional parameters.
Applications of QML to Other Domains Besides CML | This approach involves developing and modifying existing QML algorithms for applications in areas beyond CML. As an example, QML is used extensively in the field of Computational Chemistry. One such use case is by Innan et al, 2023, in which a Variational Quantum Eigensolver (VQE) was modified to perform electronic structure calculations, and a novel algorithm was presented.
It is important to note that while QML has immense potential, it is still in the early
stages of its development. Breakthroughs in hardware design, computing power, Quantum
cloud technologies, and new approaches to QC will result in the more widespread adoption
of QML to solve daily tasks, much like how CML is a tool that all major companies are
trying to integrate and embed into their organisational processes.
The question arises: “If these CML models are so successful and doing such a fantastic
job in flagging fraudulent use cases, what is the need for QML fraud detection models?”
We advocate for adopting a Quantum approach because we believe it provides the following
advantages over the classical approaches in the post-NISQ era:
I.2.2. Decrease in the Amount of Inessential Data: Fraud detection involves the anal-
yses of large swathes of data, and although fraud accounts for such large losses, it is
rare to detect while it is in progress (usually detected after it occurs), and the training
data has to be specifically fabricated from real-time data; thus, a lot of redundancies
occur. Since Quantum Computers offer the opportunity to analyse data in a reduced
amount of time, the amount of redundant data is thereby minimised.
I.2.3. Scalability through Parallelisation: QML offers the opportunity to work with
larger datasets because of its ability to parallelise algorithms in a streamlined manner
as compared to CML.
In this paper, we apply the Quantum Support Vector Classifier, the Variational Quan-
tum Classifier, the Estimator Quantum Neural Network, and the Sampler Quantum Neural
Network to the BankSim dataset. This paper is divided into the following sections:
In Sec. II., we provide a comprehensive précis of the relevant literature papers pertaining
to anomaly detection and fraud prediction.
In Sec. III., we provide an overview of the theoretical constructs of the methods used.
Namely, the data encoding and the QML methods respectively.
In Sec. IV., we discuss the dataset used and present the results of applying the QML
models. Thereafter, we discuss the results by alluding to the various model heuristic metrics.
In Sec. V., we provide closing remarks on the findings of this paper.
II. LITERATURE REVIEW
Since the launch of IBM’s Qiskit package and Xanadu’s PennyLane, it has become more
common to apply QML methods to fraud detection. However, this is a fairly
new application of QML, and most of the relevant papers are recent. In this regard, we note the
following literature pieces:
Although strictly not a paper that applies the methods to fraud detection in financial data,
anomaly detection forms an integral component of fraud detection. Thus, it is noteworthy
to mention the work of Liu & Rebentrost, 2018, who discuss the potential applications of
anomaly detection to Quantum data and propose a Quantum anomaly detection algorithm
based on autoencoders. This is particularly useful when real-world data is converted to
Quantum states via some feature map embedding. The research highlights the usage of the
Quantum methods (Quantum Principal Component Analysis, Quantum Density Estimation,
Quantum Support Vector Machines, and Quantum k-Nearest Neighbours) and compares
them to their classical counterparts. Lastly, it argues that the
Quantum methods offer advantages over classical methods for anomaly detection, such as faster
processing of the data and enhanced accuracy.
Liang et al, 2019 propose two Quantum anomaly detection algorithms that find applica-
tions in fraud detection. The basis for these algorithms comprises density estimation and
multivariate Gaussian distributions. The goal is to find the probability density function for
the training data. The advantage of this approach over classical approaches is that these
algorithms scale logarithmically with respect to the number of datapoints in the training
data and the dimensionality of the Quantum states, making them more
efficient for handling high-dimensional data. In addition, the authors propose a method
for calculating the determinant of any Hermitian operator, which is particularly useful for
anomalous data with a higher-dimensional normal distribution. The advantages of these al-
gorithms are demonstrated experimentally by illustrating comparable accuracy and precision
in a shorter time.
Kottmann et al, 2021 introduced the unsupervised QML algorithm known as Variational
Quantum Anomaly Detection (VQAD), which takes simulation data and extracts the phase
diagram without a priori knowledge of the system. Importantly, the authors
demonstrated that the algorithm works in realistic scenarios, both in real-noise simulations and
on a real Quantum computer. Further, the anomaly detection
scheme was shown to improve by employing measurement error mitigation and adapting the circuits to
the physical device. Although more oriented towards Physics, the findings of this paper have
potentially important implications for fraud detection.
Kyriienko & Magnusson, 2022 develop a Quantum protocol for anomaly detection and
apply their technique to detecting credit card fraud. After establishing classical benchmarks,
a comparative study is carried out across different types of Quantum kernels (products of
data-dependent rotations with variational circuits, and evolution circuits based on the spin-glass
Hamiltonian or the Heisenberg Hamiltonian), and it is shown that Quantum fraud
detection is superior to classical methods. Specifically, for supervised fraud detection, Quantum
kernels offer higher expressivity and generalisability, outperforming RBF kernels,
$K(x, x') = \exp\left(-\frac{\|x - x'\|_2^2}{2\sigma^2}\right)$, for the free parameter $\sigma$, by over 10% on the average precision
heuristic. For unsupervised fraud detection, Quantum kernels offer a 15% increase in average
precision, an advantage that grows as the system size grows. Lastly, the authors discuss future improvements
in near- and mid-term Quantum hardware.
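For reference, the classical RBF kernel used as the benchmark in that comparison can be evaluated in a few lines. This is a minimal sketch: the function name `rbf_kernel`, the default `sigma`, and the sample points are assumptions made for illustration and are not taken from Kyriienko & Magnusson, 2022.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Classical RBF kernel: exp(-||x - y||_2^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical two-feature points, chosen only to show the behaviour.
k_same = rbf_kernel([0.2, 0.5], [0.2, 0.5])   # identical points
k_far = rbf_kernel([0.2, 0.5], [3.0, -2.0])   # distant points
```

Identical points give the maximal similarity, and the value decays towards zero as the points move apart, with `sigma` controlling how quickly.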
Grossi et al, 2022 use the Qiskit software stack (IBM Safer Payments and IBM Quantum
Computers) to present an end-to-end application of Quantum Support Vector Machines
for classification in financial services and a comparative study of the state-of-the-art QML
methods collated against the classical methods. The paper shows that the hybrid method
outperforms the classical method with respect to accuracy and the false positive rate (FPR)
measures. Feature selection plays a pivotal role in optimising the fraud detection system.
The paper proposes a Quantum Feature Importance Selection Algorithm (QFISA) that
selects the most important features from a dataset to reduce the dimensionality of the dataset
for running the experiment on a real Quantum device. Lastly, the drawbacks and limitations
of the Factorial Analysis of Mixed Data (FAMD) method are highlighted (overlap between
components, and not showing any discrimination power between the reduced variables), and
it is shown how the method proposed is superior in this regard.
Wang et al, 2022 propose a framework using QML for analysing online transaction data
that is time series-based, highly imbalanced, and high-dimensional in order to detect fraud-
ulent records. Using an enhanced-Support Vector Machine with Quantum annealing solvers,
they benchmark this method against CML models. This research highlights the challenges
encountered when dealing with real-time transactional data and how a Quantum approach
potentially provides a better approach and can be more broadly applied to other critical
business applications. While providing a roadmap for further research, the authors caution
that several factors must be accounted for when implementing a fraud detection model on
such data; namely:
• Accuracy: How close to the actual values does one want the predicted values to be?
• Cost of Computing: Whether one, or the company that one works for, has the
financial resources to purchase hardware, and access extra qubits, to perform such
calculations.
Guo et al, 2022 propose an Anomaly Detection based on the Density Estimation (ADDE)
algorithm, which hinges on the estimation of the amplitude of a Quantum state, and they
show that it has an exponential speed-up in the number of training datapoints and dimen-
sions over classical algorithms. Further, the authors show how the proposed algorithm can
be used for anomaly detection based on Kernel Principal Component Analysis (KPCA).
Lastly, it is indicated that the findings in this paper are not limited to fraud detection but
can also be applied to other domains, namely: Military surveillance, intrusion detection,
and healthcare.
Other references are contained in the aforementioned literature pieces. One may
expect a plethora of application-based QML papers for fraud
detection; surprisingly, there are not many.
III. THEORY
We present the theory of the data encoding methods used in the paper, namely
ZZFeatureMap, PauliFeatureMap, and ZFeatureMap, and of the QML models used below: QSVC, VQC, EQNN,
and SQNN. Because the theory is not widely known, presenting it helps to establish the
context, justifies the choice of methods used, guides the analyses and interpretation, and
enhances the overall credibility of this research.
A. Data Encoding Methods
1. ZZFeatureMap
[Figure: ZZFeatureMap circuit on four qubits (q0 to q3). Each qubit receives a Hadamard gate H and a phase gate P(2.0*x[i]), followed by pairwise phase gates P(2.0*(π − x[i])*(π − x[j])) for the qubit pairs (0,1), (0,2), (1,2), (0,3), (1,3), and (2,3).]
2. PauliFeatureMap
The PauliFeatureMap class represents a Quantum circuit that enables a Pauli expansion
of a given data set. The Pauli expansion is a method for representing the data set as a
product of Pauli operators, where each Pauli operator corresponds to a distinct feature
within the data. The expression for the Pauli operator combination is given as:
$$U_{\Phi(x)} = \exp\left( \mathrm{i} \sum_{S \in I} \phi_S(x) \prod_{i \in S} P_i \right),$$
where I is the set of qubit indices describing the connections in the feature map, and ϕS (x)
is the data mapping function. The data mapping function ϕS (x) maps classical input data
x into the Quantum circuit, enhancing the circuit’s representation capabilities. It is defined
as follows:
$$\phi_S(x) = \begin{cases} x_i & \text{if } S = \{i\}, \\ \prod_{j \in S} (\pi - x_j) & \text{if } |S| > 1. \end{cases}$$
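The data mapping function can be written out directly. The following is a minimal sketch of the definition above, not the Qiskit implementation: the function name `phi`, the example feature vector, and the restriction to single indices and pairs are assumptions made for illustration. The factor 2.0 that appears in the gate labels of the circuit drawings is not applied here.

```python
import math
from itertools import combinations

def phi(x, subset):
    """Data mapping: x_i for a single index, product of (pi - x_j) otherwise."""
    if len(subset) == 1:
        return x[subset[0]]
    return math.prod(math.pi - x[j] for j in subset)

# Hypothetical four-feature input, one feature per qubit.
x = [0.1, 0.4, 0.7, 1.0]

# Single-qubit terms S = {i} and two-qubit terms S = {i, j}.
singles = {(i,): phi(x, (i,)) for i in range(len(x))}
pairs = {s: phi(x, s) for s in combinations(range(len(x)), 2)}
```

For four features this yields four single-index values and six pair values, matching the number of single-qubit and two-qubit phase gates in a fully connected four-qubit feature map.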
[Figure: PauliFeatureMap circuit on four qubits (q0 to q3), built from H, P, and RX(π/2) gates, with single-qubit phases P(2.0*x[i]) and two-qubit phases P(2.0*(π − x[i])*(π − x[j])) for every qubit pair.]
3. ZFeatureMap
particularly well-suited for specific applications where a shallow Quantum circuit without
entanglement is desired.
Similar to the ZZFeatureMap, the ZFeatureMap is tailored for a designated number of
qubits, known as the feature dimension, and the user can specify the number of repetitions
to replicate the rotation blocks. The circuit is constructed by applying Hadamard gates to
all qubits, followed by rotation blocks as shown in Fig 3. The rotation blocks are structured
following the same principles employed in the ZZFeatureMap.
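[Fig. 3: ZFeatureMap circuit on four qubits (q0 to q3). Each qubit receives a Hadamard gate H followed by a phase gate P(2.0*x[i]); there are no entangling gates.]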
The ZFeatureMap class also offers essential attributes for inspecting the circuit, including
the feature dimension, the number of repetitions, and the entanglement strategy. In the case
of the ZFeatureMap, the entanglement strategy is null since no entangling gates are present.
The ZFeatureMap class complements the ZZFeatureMap by providing an alternative
Quantum feature map that aligns with specific use cases where entangling gates are to be
avoided. Its customisable nature, and absence of entanglement, allow for efficient Quantum
data encoding and processing.
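To make the encoding concrete, the following is a minimal sketch (not the Qiskit implementation) of the state that a single repetition of the circuit in Fig 3 prepares from the all-zero state: each qubit receives a Hadamard gate followed by a phase gate P(2x_i), giving (|0⟩ + e^{2ix_i}|1⟩)/√2. The function names, the example feature vector, and the amplitude ordering (first feature as the most significant bit) are assumptions made for illustration.

```python
import cmath
import math

def z_feature_qubit(x_i):
    """Single-qubit state after H then P(2 * x_i) acting on |0>."""
    s = 1.0 / math.sqrt(2.0)
    return [s, s * cmath.exp(2j * x_i)]

def z_feature_state(x):
    """Product state over all qubits; no entangling gates are involved."""
    state = [1.0]
    for x_i in x:
        qubit = z_feature_qubit(x_i)
        # Tensor product of the state built so far with the next qubit.
        state = [a * b for a in state for b in qubit]
    return state

x = [0.1, 0.4, 0.7, 1.0]  # hypothetical feature vector, one feature per qubit
state = z_feature_state(x)
norm = sum(abs(a) ** 2 for a in state)
```

Because the data enters only through phases, every amplitude has the same magnitude, and the state factorises qubit by qubit, which is what the absence of entanglement means in practice.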
The Quantum Support Vector Classifier (QSVC) is the Quantum Mechanical analogue of
the classical Support Vector Machine (SVM), as depicted in Fig 4. The SVM model aims to
find the optimal separating hyperplane (planum separans) that categorises the datapoints.
This is achieved by maximal margin classification: maximising the margin, i.e. the distance between
the hyperplane and the closest datapoints from each class; see the excellent texts of Bishop, 2006;
Goodfellow et al, 2016 for a full mathematical elucidation.
The output of a QSVC is given by
$$f(x) = \sum_{j=1}^{n} \alpha_j K(x, x_j) + b,$$
where $\alpha_j$ are the coefficients of the classifier, $b$ is the bias term, and $K$ is the kernel,
which gives a measure of similarity between the datapoint $x$ and the $j$th datapoint $x_j$.
Schuld & Petruccione, 2021 provide an excellent discussion of the various kernel types. We
advocate that the kernel is the most important component of a QSVC and significantly affects
its performance. Thus, in the style of “hyper-parameter tuning”, one should experiment with
various kernels to see which gives the best model performance.
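The role of the kernel as a swappable component can be sketched as follows. This is an illustrative, purely classical stand-in: in a QSVC the kernel would be evaluated from encoded Quantum states, whereas here `rbf_kernel` and `linear_kernel` are ordinary functions. All names, coefficients, and datapoints are hypothetical and hand-picked, not fitted to any dataset.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 * sigma^2)): one possible similarity measure."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def linear_kernel(x, y):
    """Plain inner product, shown only to illustrate swapping kernels."""
    return sum(a * b for a, b in zip(x, y))

def decision_function(x, points, alphas, bias, kernel):
    """f(x) = sum_j alpha_j * K(x, x_j) + b."""
    return sum(a * kernel(x, xj) for a, xj in zip(alphas, points)) + bias

# Hypothetical, hand-picked values (not fitted to any dataset).
points = [[0.0, 0.0], [1.0, 1.0]]
alphas = [-1.0, 1.0]
bias = 0.0

f_near_first = decision_function([0.1, 0.0], points, alphas, bias, rbf_kernel)
f_near_second = decision_function([0.9, 1.0], points, alphas, bias, rbf_kernel)
f_linear = decision_function([0.9, 1.0], points, alphas, bias, linear_kernel)
```

The sign of the output gives the predicted class, and passing a different `kernel` changes the decision boundary without touching the rest of the classifier, which is the sense in which the kernel is tuned like a hyper-parameter.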
[Fig. 4: QSVC workflow. On the QPU, the classical data is encoded and measured to evaluate the kernel; on the CPU, a classical SVM uses the kernel to produce the predicted values.]
III.C.2. Application of a Unitary Transformation: In this part of the circuit, a series of
Quantum gates are applied to the initial states.
Let $G_i \in \{I, X, Y, Z, H, S, T, R_X, R_Y, R_Z, \mathrm{CNOT}, \mathrm{SWAP}, \ldots\}$ be Quantum gates, for
$1 \le i \le m$, which we apply in series to the initial state. We can
be sure that no matter what combination of these Quantum gates we have, they form
a unitary operator, i.e. $U = \bigotimes_{i=1}^{m} G_i$. Mathematically, this part of the circuit is given
Steps III.C.2. and III.C.3. are repeated in order to minimise the loss function, J(θ), and
the process is stopped once an acceptance criterion is met.
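The claim in step III.C.2. that any combination of these gates forms a unitary operator can be checked numerically. The sketch below uses plain nested lists as matrices; the helper names (`matmul`, `dagger`, `kron`, `is_unitary`) and the chosen gates and angle are assumptions made for illustration. It covers both gates acting on different qubits (tensor product) and gates applied one after another on the same qubit (matrix product).

```python
import cmath
import math

def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Tensor (Kronecker) product, for gates acting on different qubits."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def is_unitary(u, tol=1e-12):
    """Check that U^dagger U equals the identity up to a tolerance."""
    prod = matmul(dagger(u), u)
    n = len(u)
    return all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

s = 1.0 / math.sqrt(2.0)
H = [[s, s], [s, -s]]
X = [[0.0, 1.0], [1.0, 0.0]]

def RZ(theta):
    return [[cmath.exp(-1j * theta / 2), 0.0], [0.0, cmath.exp(1j * theta / 2)]]

U_parallel = kron(H, RZ(0.3))                # H on one qubit, RZ on another
U_sequence = matmul(RZ(0.3), matmul(X, H))   # H, then X, then RZ on one qubit
```

Both composites pass `is_unitary`, whereas a non-unitary matrix such as `[[1, 1], [0, 1]]` does not.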
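[Figure: VQC workflow. The (preprocessed) classical data is encoded, the unitary operator is applied, the state is measured, and the parameters are updated in an optimisation loop.]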
III.D.1. State Preparation via Quantum Feature Map: Given classical data x =
(x1 , x2 , . . . , xn ), the Quantum feature map, Φ : x −→ |Ψ0 (θ)⟩, encodes the classi-
cal data into parameterised Quantum states, |Ψ0 (θ)⟩, using the VQC. As is the case
with the VQC, the states |Ψ0 (θ)⟩ are oftentimes just a series of |0⟩ states.
III.D.3. Processing in a Classical Neural Network: The classical features that are
extracted here are fed to a fully-connected classical neural network architecture in order
to produce the predicted values, ŷ.
III.D.4. Model Optimisation and Optimal Parameter Search: In this step, the
architecture is optimised to discover the optimal parameters $\theta^*$ of the VQC, as well as
the weights, $W^*$, and biases, $b^*$, of the classical neural network, such that the loss
function is minimised; i.e. $(\theta^*; W^*; b^*) = \arg\min_{\theta, W, b} J(y; \hat{y})$. Importantly, this optimal
search is carried out in parallel.
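The joint search over the Quantum and classical parameters can be sketched on a toy model. Everything here is an assumption made for illustration: a single qubit whose measured expectation value ⟨Z⟩ after RY(θ)RY(x)|0⟩ equals cos(x + θ), a one-neuron classical layer with a sigmoid output, a mean-squared-error loss, finite-difference gradients, and hand-picked data. It is not the authors' EQNN configuration.

```python
import math

def quantum_feature(x, theta):
    """<Z> after RY(theta) RY(x) |0>, which equals cos(x + theta)."""
    return math.cos(x + theta)

def predict(x, params):
    """Classical layer on top of the quantum feature: sigmoid(w * q + b)."""
    theta, w, b = params
    z = w * quantum_feature(x, theta) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss(params, xs, ys):
    """Mean squared error J(y; y_hat)."""
    return sum((predict(x, params) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(params, xs, ys, eps=1e-6):
    """Central finite differences over theta, w, and b together."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, xs, ys) - loss(down, xs, ys)) / (2 * eps))
    return grads

xs = [0.1, 0.3, 2.8, 3.0]   # hypothetical one-feature inputs
ys = [1, 1, 0, 0]           # hypothetical labels
params = [0.5, 0.5, 0.0]    # initial theta, w, b
initial_loss = loss(params, xs, ys)

for step in range(500):
    if loss(params, xs, ys) < 1e-3:   # acceptance criterion
        break
    g = gradient(params, xs, ys)
    params = [p - 0.5 * gi for p, gi in zip(params, g)]

final_loss = loss(params, xs, ys)
```

All three parameters are updated in the same step, so the circuit parameter and the classical weights are optimised together rather than one after the other.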
[Figure: EQNN workflow. Classical data (data points, features, labels) is encoded, predicted values are produced, and the parameters are updated in an optimisation loop.]
Analogous to the EQNN, the Sampler Quantum Neural Network (SQNN) also contains
a hybrid Classical-Quantum architecture. However, the SQNN is equipped with a Quantum
Sampler, which extracts example Quantum states from the complex probability distributions
associated with the Quantum states. As illustrated in Fig 7, the SQNN operates as follows:
III.E.1. State Preparation via Quantum Feature Map: Exactly the same as the case of
the EQNN; see III.D.1.
III.E.2. Application of the Quantum Sampler: Oftentimes this is taken to be the Quan-
tum Approximate Optimisation Algorithm (QAOA); see Farhi et al, 2014. The pur-
pose is to efficiently extract example Quantum states from the complex probability
distribution corresponding to problem solutions under specific variable configurations.
III.E.3. Sample Extraction: Samples are chosen from the examples generated by the Quan-
tum sampler.
III.E.4. Utilising Classical Methods to Extract the Best Solutions: From the samples,
the best solutions to the given task are chosen using some kind of classical scheme.
III.E.5. Optimal Parameter Search: The optimal parameters are found using a classical
optimisation method in order to minimise the cost function, i.e. $\theta^* = \arg\min_{\theta} J$. The
values of θ are fed back to the VQC, and the process begins once again. The process
is repeated until the optimal values of the parameters are found.
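The sampling and sample-extraction steps above can be illustrated with a classical simulation of measurement. This is a sketch under stated assumptions: the two-qubit amplitudes are hypothetical, sampling follows the Born rule |a_k|², and the parity rule used to turn bitstrings into class labels is one common convention chosen here for illustration, not a detail given in the text.

```python
import math
import random
from collections import Counter

def born_probabilities(amplitudes):
    """Measurement probabilities |a_k|^2 for a normalised state vector."""
    return [abs(a) ** 2 for a in amplitudes]

def sample_bitstrings(amplitudes, shots, rng):
    """Draw measurement outcomes and count them as bitstrings."""
    n_qubits = int(math.log2(len(amplitudes)))
    probs = born_probabilities(amplitudes)
    outcomes = rng.choices(range(len(amplitudes)), weights=probs, k=shots)
    return Counter(format(k, f"0{n_qubits}b") for k in outcomes)

def parity_label(bitstring):
    """Assumed interpretation: class 1 if the number of 1s is odd, else class 0."""
    return bitstring.count("1") % 2

# Hypothetical two-qubit state with probabilities 0.5, 0.2, 0.2, 0.1.
amplitudes = [math.sqrt(0.5), math.sqrt(0.2), math.sqrt(0.2), math.sqrt(0.1)]
rng = random.Random(0)
shots = 1000
counts = sample_bitstrings(amplitudes, shots, rng)

# Estimated probability of class 1 from the extracted samples.
p_class_1 = sum(c for bits, c in counts.items() if parity_label(bits) == 1) / shots
```

The estimate approaches the exact value of 0.4 (the total probability of "01" and "10") as the number of shots grows, which is the quantity a classical optimiser would then use in the cost function.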
[Fig. 7: SQNN workflow. Classical data (data points, features, labels) is encoded, the Quantum sampler generates samples, samples are extracted, a classical method extracts the best solutions to the task, and the parameters are updated in an optimisation loop.]
IV. RESULTS AND DISCUSSION
The dataset used in this research study is derived from BankSim, an agent-based simulator
of bank payments based on aggregated transactional data provided by a prominent bank in
Spain. The primary objective of BankSim is to generate synthetic data tailored explicitly
for fraud detection research. To achieve this goal, statistical analysis and Social Network
Analysis (SNA) were deployed to study the relationships between merchants and customers,
developing a calibrated model – see Lopez-Rojas & Axelsson, 2014.
The BankSim dataset encompasses 594 643 records obtained over 180 steps, simulating
approximately six months of temporal activity. From these records, 587 443 are regular
payments, while 7 200 are classified as fraudulent transactions. It is important to note that
the simulated fraud occurrences were introduced by incorporating thieves aiming to steal an
average of three cards per step and performing around two fraudulent transactions per day.
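The record counts above imply a strong class imbalance, which the following short calculation makes explicit. The variable names are illustrative; the three counts are the ones quoted in the text.

```python
total_records = 594_643
regular = 587_443
fraudulent = 7_200

# The two classes account for every record.
assert regular + fraudulent == total_records

fraud_rate = fraudulent / total_records   # fraction of fraudulent records
imbalance_ratio = regular / fraudulent    # regular payments per fraudulent one
```

Fraudulent transactions make up roughly 1.2% of the records, i.e. there are about 82 regular payments for every fraudulent one.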
The dataset comprises nine feature columns and one target column, each offering essential
insights to discern underlying patterns and characteristics. The features encompassed are
as follows:
• Step: Representing the temporal aspect, this feature denotes the simulation day,
effectively encompassing 180 steps, emulating six months.
• ZipMerchant: This feature denotes the zip code associated with each merchant,
providing further potential for geographic insights.
• Age: Representing the customer’s age, this feature is categorized into discrete age
groups, including “0”: (≤ 18), “1”: (19 − 25), “2” : (26 − 35), “3”: (36 − 45),
“4”: (46 − 55), “5”: (56 − 65), “6”: (> 65), and “U” : (Unknown).
• Gender: Categorizing the gender of each customer, this feature includes values such
as “E” (Enterprise), “F” (Female), “M” (Male), and “U” (Unknown).
• Category: Capturing the category of each purchase transaction, this feature imparts
valuable insights into the nature and type of transactions.
• Amount: Representing the monetary value of each purchase, this feature offers critical
information on transaction volumes.
• Fraud: This binary target variable classifies each transaction as fraudulent (denoted
by “1”) or benign (denoted by “0”). This classification forms the basis for the subsequent
fraud detection analysis.
Graphical analysis played a crucial role in deepening our understanding of the dataset.
We generated several visualisations, including histograms, bar plots, and a heatmap, to gain
valuable insights into the data distribution and uncover potential patterns.
[Fig. 8: Histogram of payment amounts for fraudulent and non-fraudulent transactions (x-axis: Payment Amount, 0 to 1000; y-axis: Count, 0 to 8000).]
Fig. 8 displays a histogram comparing payment amounts for fraudulent and non-
fraudulent transactions. Our analysis reveals that fraudulent transactions involve higher
payment amounts on average (mean = 567.23, std = 128.47) compared to legitimate trans-
actions (mean = 145.68, std = 50.32). This insight highlights the significance of payment
amount as a distinguishing factor between the two transaction categories.
Fig. 9 presents a bar plot depicting fraudulent payments categorized by age and gender.
The visualisation indicates that individuals aged 26 to 35 (45%) and females (56%) consti-
tute more fraudulent transactions. In comparison, males (34%) and individuals aged 36 to
45 years (32%) show a lower incidence of involvement in fraudulent activities. These de-
mographic trends offer valuable guidance for developing targeted fraud detection strategies.
[Fig. 9: Bar plot of fraudulent payments by age and gender (x-axis: Age Category, '0' to '6' and 'U'; y-axis ticks from 0 to 1500).]
Fig. 10 illustrates the distribution of fraudulent payments across different merchant cat-
egories. Specific merchant categories, such as “sports & toys” and “health”, exhibit a dispro-
portionately higher occurrence of fraudulent transactions, representing 20% and 15% of all
fraud cases, respectively. This finding emphasizes the importance of considering merchant
categories as a relevant feature in our fraud detection models.
[Fig. 10: Distribution of Fraudulent Payments by Merchant Category (x-axis: Merchant Category; y-axis: Count of Fraudulent Payments, up to 2000). Categories shown: 'es_barsandrestaurants', 'es_travel', 'es_leisure', 'es_hyper', 'es_sportsandtoys', 'es_health', 'es_tech', 'es_fashion', 'es_wellnessandbeauty', 'es_home', 'es_hotelservices', 'es_otherservices'.]
To identify the most informative features that significantly contribute to our fraud detec-
tion models, we employed Principal Component Analysis (PCA) to reduce the dimensionality
of the dataset while preserving the most valuable information. As shown in Fig. 11, the re-
sults of the PCA analysis indicated the order of importance of the features based on their
corresponding principal components. Notably, the feature “amount” emerged as the most
influential, followed by “merchant,” “category,” “customer,” “step,” “age,” “gender,” “zipMer-
chant,” and “zipcodeOri.” This valuable ranking guided our further feature selection process.
Subsequently, we conducted a logical analysis to investigate the relationships between these
selected features and their potential impact on fraud detection. The logical analysis
confirmed that the features “age,” “gender,” “category,” and “amount” exhibited distinct patterns
in fraudulent and non-fraudulent transactions, making them promising candidates for our
fraud detection models.
[Figure 11: Feature Importance. x-axis: Importance (0.0 to 0.5); y-axis: Feature (amount, merchant, category, customer, step, age, gender, zipcodeOri, zipMerchant).]
We further examined the correlation heatmap to gain deeper insights into the relationships
among the selected features (Fig. 12). The heatmap displayed the pairwise correlations
among “age,” “gender,” “category,” “amount,” and “fraud,” showing the strength and sign of
each relationship. Notably, the feature “amount” exhibited a weak negative correlation with
“fraud,” suggesting a potential, albeit weak, association between transaction amount and
fraudulent transactions. Based on the insights gained from the logical analysis and
confirmed by the correlation heatmap, we concluded that the features “age,” “gender,”
“category,” and “amount” were the most informative variables for our fraud detection
models, and we incorporated them into our fraud detection framework.
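
A pairwise correlation matrix of the kind described above can be computed with pandas. The following is a minimal sketch on synthetic values; the column names follow the text, but the data are random placeholders, so the resulting coefficients do not reproduce the heatmap.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the encoded features and the label
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "age": rng.integers(0, 7, size=200),
    "gender": rng.integers(0, 2, size=200),
    "category": rng.integers(0, 12, size=200),
    "amount": rng.gamma(2.0, 50.0, size=200),
    "fraud": rng.integers(0, 2, size=200),
})

# Pearson correlation between every pair of columns
corr = demo.corr(method="pearson")

# Correlation of each feature with the label, strongest first by magnitude
fraud_corr = corr["fraud"].drop("fraud").sort_values(
    key=lambda s: s.abs(), ascending=False
)
```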
Before conducting the fraud detection analysis, we preprocessed and cleaned the data to
ensure the dataset’s quality and suitability for reliable model training and evaluation.
The original dataset was loaded, and specific subsets were extracted to create a balanced
dataset of 200 records: 100 fraudulent and 100 non-fraudulent transactions. A data
transformation step addressed inconsistencies in the “age” column, which contained
non-numeric characters, by extracting numerical values from the age categories using
regular expressions. Each age was then converted to an integer for the subsequent analysis.
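
The balancing and age-cleaning steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper names are hypothetical, the input frame is synthetic, and mapping labels without digits (such as 'U') to -1 is a choice made only for this example.

```python
import re

import pandas as pd


def parse_age(raw):
    """Extract the first run of digits from an age label such as "'4'".

    Labels without digits (e.g. "'U'") map to -1; that sentinel is an
    assumption of this sketch, not something the text specifies.
    """
    match = re.search(r"\d+", str(raw))
    return int(match.group()) if match else -1


def balanced_subset(df, label_col="fraud", n_per_class=100, seed=0):
    """Sample n_per_class rows from each class, then shuffle the result."""
    parts = [
        df[df[label_col] == label].sample(n=n_per_class, random_state=seed)
        for label in (0, 1)
    ]
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)


# Tiny synthetic frame standing in for the real data
raw = pd.DataFrame({
    "age": ["'4'", "'2'", "'U'", "'3'"] * 100,
    "fraud": [0, 1] * 200,
})
subset = balanced_subset(raw)
subset["age"] = subset["age"].map(parse_age).astype(int)
```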
[Figure 12: Correlation Heatmap of “age,” “gender,” “category,” “amount,” and “fraud.”]
To prepare the dataset for model training, certain categorical features, such as “category”
and “gender,” were transformed into numerical representations using scikit-learn’s
LabelEncoder, so that the models could process them during training. The dataset was then
further prepared by removing unused features and converting the remaining features into
numerical values to ensure homogeneity across the data. Finally, the dataset was split into
training and testing sets using the train_test_split function from scikit-learn.
The training set, denoted X_train and y_train, contained the portion of the data used for
training the model. The testing set, denoted X_test and y_test, was kept separate and served
as unseen data to evaluate the model’s performance.
The feature matrix X encompassed all pertinent features, excluding the “fraud” column,
which served as the target variable. The target variable, denoted y, distinguished between
fraudulent transactions (encoded as 1) and non-fraudulent transactions (encoded as 0).
This distinction was essential for the model to learn patterns and accurately classify new
data. Following these preprocessing steps and the split into training and testing sets, the
data was ready for the subsequent model training and evaluation processes.
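
The encoding and splitting steps described above can be sketched with scikit-learn as follows. The ten rows are made up for illustration, and the 80/20 split ratio, stratification, and random seed are assumptions, since the text does not state them.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Made-up records using the four selected features plus the label
data = pd.DataFrame({
    "age": [4, 2, 3, 1, 5, 2, 3, 4, 1, 0],
    "gender": ["'M'", "'F'", "'F'", "'M'", "'F'", "'M'", "'F'", "'M'", "'F'", "'M'"],
    "category": ["'es_health'", "'es_travel'", "'es_tech'", "'es_health'", "'es_home'",
                 "'es_travel'", "'es_tech'", "'es_home'", "'es_health'", "'es_travel'"],
    "amount": [120.5, 900.0, 35.2, 60.0, 410.3, 15.9, 77.7, 230.0, 18.4, 640.1],
    "fraud": [0, 1, 0, 0, 1, 0, 1, 1, 0, 1],
})

# Map each categorical column to integer codes
for column in ("gender", "category"):
    data[column] = LabelEncoder().fit_transform(data[column])

X = data.drop(columns=["fraud"]).astype(float)  # feature matrix
y = data["fraud"].astype(int)                   # 1 = fraud, 0 = non-fraud

# Assumed settings: 20% held out, class proportions preserved
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```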
We trained our four Quantum Machine Learning models, each tailored to specific
configurations. To optimize these models, we used the COBYLA algorithm from the Qiskit
optimizers with a maximum iteration limit of 200. This choice of optimizer facilitated
convergence towards the optimal solution while keeping the training process resource-efficient.
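
To illustrate what a COBYLA run with a 200-iteration cap looks like, the sketch below calls SciPy's COBYLA implementation directly instead of the Qiskit optimizer class, and minimises a toy periodic function standing in for a circuit-based loss. The objective, the target vector, and the starting point are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

target = np.array([0.3, -0.5, 1.0])  # hypothetical optimum of the toy loss
history = []                         # loss value at every evaluation


def toy_loss(theta):
    """Smooth periodic stand-in for a circuit loss; minimum at theta == target."""
    value = float(np.sum(1.0 - np.cos(theta - target)))
    history.append(value)
    return value


theta0 = np.zeros(3)  # initial parameters
result = minimize(toy_loss, theta0, method="COBYLA", options={"maxiter": 200})
```

COBYLA is gradient-free, so each iteration costs only loss evaluations. Plotting `history` against the evaluation index gives a loss curve of the same kind as those reported for the variational models.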
For training, we utilized the Aer backend with the QasmSimulator, which enabled us to
simulate the Quantum circuits used by the models. Following the training process, we
evaluated the performance of each model using various key metrics. These metrics provided
a comprehensive view of each model’s predictive capabilities and effectiveness.
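
Per-class F1 scores of the kind reported in this study can be computed with scikit-learn. The labels below are illustrative only and are not results from the paper.

```python
from sklearn.metrics import accuracy_score, f1_score

# Illustrative labels: 1 = fraud, 0 = non-fraud
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

# F1 is computed separately for each class by choosing the positive label
f1_non_fraud = f1_score(y_true, y_pred, pos_label=0)
f1_fraud = f1_score(y_true, y_pred, pos_label=1)
accuracy = accuracy_score(y_true, y_pred)
```

Here each class has three true positives, one false positive, and one false negative, so both F1 scores and the accuracy equal 0.75.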
TABLE II: Performance Comparison of Quantum Machine Learning Models
0.90. However, in contrast to QSVC, the VQC model is trained by minimising a loss function.
In Fig. 13, we observe that the VQC model achieved a loss of 0.5 when using the ZFeatureMap,
while losses of 0.95 were observed for the PauliFeatureMap and ZZFeatureMap. The lower
loss with ZFeatureMap indicates that this data encoding strategy leads to better convergence
during the optimisation process, contributing to the higher accuracy achieved by the VQC
model with this feature map.
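
For contrast with loss-based training, a kernel classifier such as QSVC needs only a matrix of state overlaps, which this paper describes as using the Quantum kernel for state similarity measurement. The sketch below simulates such a kernel in NumPy rather than Qiskit. It assumes a single-repetition Z-type product feature map (a Hadamard followed by a phase of 2x on each qubit) and passes the result to a classical SVM with a precomputed kernel. The data and function names are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def z_feature_map_state(x):
    """Statevector of a single-repetition Z-type product feature map.

    Each qubit i is prepared as (|0> + exp(2j * x[i]) |1>) / sqrt(2),
    i.e. a Hadamard followed by a phase rotation of angle 2 * x[i].
    """
    state = np.array([1.0 + 0.0j])
    for xi in x:
        qubit = np.array([1.0, np.exp(2j * xi)]) / np.sqrt(2.0)
        state = np.kron(state, qubit)
    return state


def fidelity_kernel(A, B):
    """Kernel matrix K[i, j] = |<phi(A[i]) | phi(B[j])>|^2."""
    states_a = np.array([z_feature_map_state(a) for a in A])
    states_b = np.array([z_feature_map_state(b) for b in B])
    return np.abs(states_a.conj() @ states_b.T) ** 2


# Synthetic two-feature data with a simple labelling rule
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, np.pi, size=(40, 2))
y_demo = (X_demo[:, 0] > X_demo[:, 1]).astype(int)

K_train = fidelity_kernel(X_demo, X_demo)
clf = SVC(kernel="precomputed").fit(K_train, y_demo)
train_accuracy = clf.score(K_train, y_demo)
```

For this product feature map the kernel has the closed form K(x, y) = prod_i cos^2(x_i - y_i), which makes the overlap computation easy to check by hand. The SVM fit solves a convex problem over the fixed kernel matrix, so no iterative loss curve is produced.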
[Figure 13: VQC model loss for the ZFeatureMap, PauliFeatureMap, and ZZFeatureMap. y-axis: Loss value.]
On the other hand, the EQNN model, using the ZFeatureMap, showed a relatively lower
F1 score of 0.78. Fig. 14 illustrates the corresponding loss values, with the ZFeatureMap
achieving a loss of 0.5, the PauliFeatureMap a loss of 0.96, and the ZZFeatureMap a loss of
0.97. The higher losses with the latter two feature maps suggest that the optimisation pro-
cess encountered difficulties reaching an optimal solution, reducing accuracy for the EQNN
model.
The limited accuracy of the EQNN model might result from the inherent limitations of the
Quantum circuits used for data encoding. The simplicity of the Quantum circuit utilized by
the EQNN model might not adequately capture the complex patterns present in the dataset.
Exploring more expressive Quantum circuits or advanced Quantum architectures could offer
potential improvements. Similarly, the SQNN model demonstrated lower accuracy than the
other models, which was expected.
[Figure 14: Estimator QNN Model loss for the ZZFeatureMap, PauliFeatureMap, and ZFeatureMap. y-axis: Loss value.]
Fig. 15 shows the corresponding loss values, with the ZFeatureMap achieving a loss of
0.458, the PauliFeatureMap a loss of 0.454, and the ZZFeatureMap a loss of 0.455. These
losses are nearly identical across the three feature maps, and the SQNN model struggled to
find an optimal solution, resulting in lower accuracy. This lower accuracy relative to the
other models aligns with our expectations because SQNNs are better suited to combinatorial
optimisation and general constraint-imposing problems, such as scheduling problems, map
colouring, and logic-based number-placement games like Sudoku. The inherent limitations of
SQNNs in handling continuous and high-dimensional data, as encountered in our dataset,
could explain the observed lower accuracy in the context of fraud detection.
[Figure 15: Sampler QNN model loss. x-axis: Iteration (0 to 200); y-axis: Loss value (0.45 to 0.50).]
V. CONCLUSION
In conclusion, our research presents a comparative study of four Quantum Machine Learning
models: QSVC, VQC, EQNN, and SQNN. We assessed their capabilities and limitations by
evaluating their performance on a balanced dataset with three distinct feature maps:
ZZFeatureMap, PauliFeatureMap, and ZFeatureMap.
Among the models evaluated, QSVC was the top performer, with F1 scores of 0.98 for both
fraud and non-fraud classes. Its use of the Quantum kernel for state similarity measurement
proved to be an effective strategy, circumventing the need for conventional loss functions.
VQC also demonstrated remarkable performance, boasting an impressive F1 score of 0.90.
However, we observed a potential area for refinement during its training process, suggesting
avenues for future exploration to harness its power. In contrast, EQNN and SQNN exhibited
comparatively lower F1 scores, hinting at the influence of the Quantum circuits used for data
encoding on their accuracy. Addressing these limitations might be the key to unlocking their
potential in this field.
Our findings reinforce the promise of Quantum computing in revolutionizing machine
learning paradigms. The exceptional performance of QSVC and VQC attests to the vast
potential of Quantum algorithms for solving complex classification problems with unprece-
dented precision.
DECLARATIONS
Conflicts of interest
The authors have no competing interests or other interests that might be perceived to
influence the results and/or discussion reported in this paper.
VI. REFERENCES
[1] Afriyie, J. K., Tawiah, K., Pels, W. A., Addai-Henne, S., Dwamena, H. A., Owiredu, E. O.,
Ayeh, S. A., & Eshun, J. (2023). A supervised Machine Learning Algorithm for Detecting and
Predicting Fraud in Credit Card Transactions. Elsevier – Decision Analytics Journal, 6, pp.
1-12.
[2] Alenzi, H. Z., & Aljehane, N. O. (2020). Fraud Detection in Credit Cards using Logistic Re-
gression. International Journal of Advanced Computer Science and Applications (IJACSA),
11 (12), pp. 540-551.
[3] Bergholm, V., Izaac, J., Schuld, M., Gogolin, C., Ahmed, S., Ajith, V., ... & Killoran, N.
(2018). Pennylane: Automatic differentiation of hybrid quantum-classical computations. arXiv:
https://arxiv.org/abs/1811.04968.
[4] Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., & Lloyd, S. (2017). Quantum
Machine Learning. Nature, 549 (7671), pp. 195-202.
[5] Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer, New York City:
New York, USA.
[6] Farhi, E., Goldstone, J., & Gutmann, S. (2014). A Quantum Approximate Optimization
Algorithm. arXiv: https://arxiv.org/abs/1411.4028.
[7] Feynman, R. P. (1982). Simulating Physics with Computers. International Journal of Theoret-
ical Physics, 21 (6/7), pp. 467-488.
[8] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. The MIT Press, Cambridge:
Massachusetts, USA.
[9] Grossi, M., Ibrahim, N., Radescu, V., Loredo, R., Voigt, K., Von Altrock, C., & Rudnik, A.
(2022). Mixed Quantum-Classical Method for Fraud Detection with Quantum Feature Selection.
arXiv: https://arxiv.org/abs/2208.07963.
[10] Guo, M-C., Liu, H-L., Li, Y-M., Qin, S-J., Wen, Q-Y., & Gao, F. (2022). Quantum Algorithms
for Anomaly Detection using Amplitude Estimation. Physica A: Statistical Mechanics and its
Applications, 604 (127936).
[11] Gyamfi, N. K., & Abdulai, J-D. (2018). Bank Fraud Detection Using Support Vector Machine.
2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Con-
ference (IEMCON), Vancouver, Canada.
[12] Han, Y., Yao, S., Wen, T., Tian, Z., Wang, C., & Gu, Z. (2020). Detection and Analysis
of Credit Card Application Fraud Using Machine Learning Algorithms. Journal of Physics:
Conference Series, 1693 (012064), pp. 1-16.
[13] Innan, N., Khan, M. A. Z., & Bennai, M. (2023). Electronic Structure Calculations using
Quantum Computing. arXiv: https://arxiv.org/abs/2305.07902.
[14] Kottmann, K., Metz, F., Fraxanet, J., & Baldelli, N. (2021). Variational Quantum Anomaly
Detection: Unsupervised Mapping of Phase Diagrams on a Physical Quantum Computer. Phys-
ical Review Research, 3 (4), pp. 043184 1-9.
[15] Kumar, S., Gunjan, V.K., Ansari, M.D., & Pathak, R. (2022). Credit Card Fraud Detection
Using Support Vector Machine. Proceedings of the 2nd International Conference on Recent
Trends in Machine Learning, IoT, Smart Cities and Applications. Lecture Notes in Networks
and Systems, 237, Singapore.
[16] Kumar, Y., Saini, S., & Payal, R. (2020). Comparative Analysis for Fraud Detection Using
Logistic Regression, Random Forest and Support Vector Machine. International Journal of
Research and Analytical Reviews (IJRAR), 7 (4), pp. 726-731.
[17] Kyriienko, O., & Magnusson, E. B. (2022). Unsupervised Quantum Machine Learning for Fraud
Detection. arXiv: https://arxiv.org/abs/2208.01203.
[18] Liang, J-M., Shen, S-Q., Li, M., & Li, L. (2019). Quantum Anomaly Detection with Density
Estimation and Multivariate Gaussian Distribution. Physical Review A, 99 (5), pp. 052310
1-6.
[19] Liu, C., Chan, Y., Kazmi, S. H. A., & Fu, H. (2015). Financial Fraud Detection Model: Based
on Random Forest. International Journal of Economics and Finance, 7 (7), pp. 178-188.
[20] Liu, N., & Rebentrost, P. (2018). Quantum Machine Learning for Quantum Anomaly Detection.
Physical Review A, 97 (4), pp. 042315 1-10.
[21] Lopez-Rojas, E. A., & Axelsson, S. (2014). Banksim: A Bank Payments Simulator for Fraud
Detection Research. Proceedings of the 26th European Modeling and Simulation Symposium
(EMSS), Bordeaux, France.
[22] Ma, X., Wu, J., Xue, S., Yang, J., Zhou, C., Sheng, Q. Z., Xiong, H., & Akoglu, L. (2021). A
Comprehensive Survey on Graph Anomaly Detection with Deep Learning. IEEE Transactions
on Knowledge and Data Engineering, arXiv: https://arxiv.org/abs/2106.07178.
[23] Nandi, A. K., Randhawa, K. K., Chua, H. S., Seera, M., & Lim, C. P. (2022). Credit Card
Fraud Detection using a Hierarchical Behavior-knowledge Space Model. PLoS One, 17 (1), pp.
1-16.
[24] Preskill, J. (2018). Quantum Computing in the NISQ Era and Beyond. Quantum, 2, pp. 79-99.
[25] Pumsirirat, A., & Yan, L. (2018). Credit Card Fraud Detection using Deep Learning based on
Auto-Encoder and Restricted Boltzmann Machine. International Journal of Advanced Com-
puter Science and Applications (IJACSA), 9 (1), pp. 18-25.
[26] Qiskit contributors. (2023). Qiskit: An Open-source Framework for Quantum Computing.
doi:10.5281/zenodo.2573505
[27] Schuld, M., & Petruccione, F. (2021). Machine Learning with Quantum Computers (2nd Ed.).
Springer, New York City, New York, USA.
[28] Taha, A. A., & Malebary, S. J. (2020). An Intelligent Approach to Credit Card Fraud Detection
Using an Optimized Light Gradient Boosting Machine. IEEE Access, 8, pp. 25579-25587.
[29] Wang, H., Wang, W., Liu, Y., & Alidaee, B. (2022). Integrating Machine Learning Algorithms
with Quantum Annealing Solvers for Online Fraud Detection. IEEE Access, 10, pp. 75908-
75917.
[30] Wittek, P. (2014). Quantum Machine Learning: What Quantum Computing Means to Data
Mining. Academic Press, New York City, New York, USA.
[31] Xuan, S., Liu, G., & Li, Z. (2018a). Refined Weighted Random Forest and Its Application to
Credit Card Fraud Detection. Proceedings of the International Conference on Computational
Social Networks, 11280, pp. 343-355.
[32] Xuan, S., Liu, G., Li, Z., Zheng, L., Wang, S., & Jiang, C. (2018b). Random Forest for
Credit Card Fraud Detection. IEEE 15th International Conference on Networking, Sensing and
Control (ICNSC), Zhuhai, China.
[33] https://www.kaggle.com/datasets/ealaxi/banksim1