A Review On Credit Card Fraud Detection Techniques Using ML
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Nowadays, credit cards are easy and attractive targets for fraud. E-commerce and many other online sites have increased the use of online payment modes, increasing the risk of online fraud. As fraud rates have risen, researchers have started using different machine learning methods to detect and analyze fraud in online transactions. Frauds are known to be dynamic and to have no fixed patterns, hence they are not easy to identify. Fraudsters use recent technological advancements to their advantage. They somehow bypass security checks, leading to the loss of millions of dollars. Analyzing and detecting unusual activities using data mining techniques is one way of tracing fraudulent transactions. This paper reviews the systems proposed by different researchers for credit card risk detection using machine learning.

Key Words: credit card, fraud detection, machine learning, deep learning, random forest, k nearest neighbor, support vector machine, auto encoder, restricted Boltzmann machine, deep belief networks, convolutional neural networks

• Device Intrusion
• Application Fraud
• Counterfeit Card
• Telecommunication Fraud

Some of the currently used approaches to detection of such fraud are [2]:

• Artificial Neural Network
• Fuzzy Logic
• Genetic Algorithm
• Logistic Regression
• Decision Tree
• Support Vector Machines
• Bayesian Networks
discovered a lot in predicting diagnosis applying machine learning algorithms. Since the information is scattered across many papers, researchers who want to learn about machine learning algorithms, their accuracy in predicting a diagnosis, and which algorithms perform best across the papers usually get exhausted looking for papers. Since there are many diseases and many algorithms, it is extremely challenging to find the best algorithm for a diagnosis, and most of the time they cannot locate the exact paper. This paper presents a literature survey in which the accuracy of various machine learning algorithms for predicting diseases, as reported in different papers, is gathered in one paper covering commonly used machine learning algorithms. This paper presents as much of the available information as possible. This work also analyzes which algorithms give the most reliable accuracy for the prediction of any disease, based on the existing literature. This work will help practitioners and researchers find the information easily while working in the healthcare sector. The remainder of the article is structured as follows. In the next section, the related work is described, followed by the research methodology and data analysis. The work is then discussed before the conclusion [3].

One important technique is known as "deep learning", which includes a family of machine learning algorithms that attempt to model high-level abstractions in data by utilizing deep architectures composed of multiple non-linear transformations. In contrast to usual machine learning methods, deep learning imitates the human brain, which is arranged in a deep architecture and processes information through multiple stages of transformation and representation. By exploring deep architectures to learn features at multiple levels of abstraction from data automatically, deep learning methods permit a system to learn complex functions that directly map raw sensory input data to the output, without relying on human-crafted features based on domain knowledge [4].

A Deep Neural Network (DNN) is a type of discriminative feature learning technique: a neural network that contains multiple hidden layers. This is a simple conceptual extension of neural networks; however, it provides valuable advances in the capability of these models and new challenges in training them. The structure of deep neural networks makes them more sophisticated in design, and yet more complex in their elements. There are two complexity aspects of a DNN model's architecture. Firstly, how wide or narrow it is, in other words, how many neurons there are in each layer. Secondly, how deep it is, that is, how many layers of neurons there are. When dealing with data that has such a deep structure, deep neural networks can be very beneficial: a deep neural network can fit the data more accurately with fewer parameters than a normal neural network, because more layers can be used for a more efficient and accurate representation. It is clear that shallow neural network models require far more parameters to achieve their tasks [5].

2. LITERATURE SURVEY

The following section describes the contributions of various researchers in the area of credit card risk detection.

Esraa Faisal Malik, Khai Wah Khaw et al. [6] developed and investigated several hybrid machine learning models based on combinations of supervised machine learning techniques as part of a credit card fraud detection study. The hybridization of different models was found to yield a major advantage over the state-of-the-art models. However, not all hybrid models worked well with the given dataset, so several experiments need to be conducted to examine various types of models and determine which works best. Comparing the performance of the hybrid models with the state-of-the-art models and with each other, the authors conclude that Adaboost + LGBM is the champion model for this dataset. The results also illustrate that the use of hybrid methods lowered the error rate. For future work, the hybrid models used in the study will be extended to other datasets in the credit card fraud detection domain. Future work may also focus on different areas, starting by proposing data preprocessing techniques to overcome the drawback of missing values. Additionally, different methods of feature selection and extraction should be investigated in the credit card domain to determine their impact on prediction accuracy. Identifying the most accurate hybrid model among the state-of-the-art machine learning algorithms in this domain should be the main concern of future studies.

Emmanuel Ileberi, Yanxia Sun et al. [7] proposed a genetic algorithm (GA) based feature selection method used in conjunction with the random forest (RF), decision tree (DT), artificial neural network (ANN), Naïve Bayes (NB), and logistic regression (LR) classifiers. The GA was implemented with the RF in its fitness function. The GA was applied to the European cardholders' credit card transactions dataset, and 5 optimal feature vectors were generated. The experimental results achieved using the GA-selected attributes demonstrated that the GA-RF (using v5) achieved an overall optimal accuracy of 99.98%. Furthermore, other classifiers such as the GA-DT achieved a remarkable accuracy of 99.92% using v1. The results obtained in this research were superior to those achieved by existing methods. Moreover, the authors applied the proposed framework to a synthetic credit card fraud dataset to validate the results obtained on the European credit card fraud dataset. The experimental outcomes showed that the GA-DT obtained an AUC of 1 and an accuracy of 100%, followed by the GA-ANN with an AUC of 0.94 and an accuracy of 100%. In future work, the authors intend to use more datasets to validate their framework.
Najadat Hassan, Ola Adnan Altiti et al. [8] applied several machine learning and deep learning models to detect whether an online transaction is legitimate or fraudulent on the IEEE-CIS Fraud Detection dataset, and also built their own model, BiLSTM-MaxPooling-BiGRU-MaxPooling, which is based on bidirectional LSTM and GRU. They also tested several methods for dealing with highly imbalanced datasets, including undersampling, oversampling and SMOTE. A set of evaluation metrics was used to evaluate the performance of the models. The results from the machine learning classifiers show that the best AUC values were 80% and 81%, achieved by hard voting with the undersampling and oversampling techniques. However, the results from the machine learning classifiers were not promising compared with the authors' model, which achieved 91.37% AUC.

Vaishnavi Nath Dornadula, Geetha S [9] developed a novel method for fraud detection in which customers are grouped based on their transactions and behavioral patterns are extracted to develop a profile for every cardholder. Different classifiers are then applied to three different groups, and rating scores are generated for every type of classifier. These dynamic changes in parameters let the system adapt to new cardholders' transaction behaviors in a timely manner. This is followed by a feedback mechanism to solve the problem of concept drift. The authors observed that the Matthews Correlation Coefficient (MCC) was the better parameter for dealing with the imbalanced dataset. MCC was not the only solution: by applying SMOTE to balance the dataset, they found that the classifiers performed better than before. Another way of handling an imbalanced dataset is to use one-class classifiers such as one-class SVM. Finally, they observed that logistic regression, decision tree and random forest are the algorithms that gave better results.

Masoumeh Zareapoor, Pourya Shamsolmoali [10] evaluated the performance of three state-of-the-art data mining techniques together with a bagging ensemble classifier based on the decision tree algorithm, which is a novel technique in the area of credit card fraud detection. A real-life dataset of credit card transactions is used for the evaluation. They found that the bagging classifier based on decision trees works well with this kind of data since it is independent of attribute values. The second feature of this technique is its ability to handle class imbalance. This is incorporated in the model by creating four datasets (Df1, Df2, Df3, Df4) in which the fraud rates were 20%, 15%, 10% and 3% respectively. The performance of the bagging classifier based on the decision tree algorithm is found to be stable during the evaluation. Moreover, the bagging ensemble method takes very little time, which matters for this real-time application, because time is known to be one of the important parameters in the fraud detection domain.

Salvatore J. Stolfo, David W. Fan et al. [11] report experiments using several machine learning algorithms as well as meta-learning strategies on real-world data. Unlike many reported experiments on "standard" data sets, the setup and the evaluation criteria of their experiments attempt to reflect the real-world context and its resultant challenges. The experiments indicate that a 50%/50% distribution of fraud/non-fraud training data generates classifiers with the highest True Positive rate and a low False Positive rate. Other researchers have reported similar findings. Meta-learning with BAYES as a meta-learner, combining the base classifiers with the highest True Positive rates learned from the 50%/50% fraud distribution, is the best method found thus far.

Philip K. Chan, Salvatore J. Stolfo [12] demonstrate that the training class distribution affects the performance of the learned classifiers and that the natural distribution can differ from the desired training distribution that maximizes performance. Moreover, their empirical results indicate that their multi-classifier meta-learning approach, using a 50:50 distribution in the data subsets for training, can significantly reduce the amount of loss due to illegitimate transactions. The subsets are independent and can be processed in parallel. Training time can be further reduced by also using a 50:50 distribution in the validation set without degrading the cost performance. That is, this approach provides a means for efficiently handling learning tasks with skewed class distributions, non-uniform cost per error, and large amounts of data. The method is not only efficient but also scalable to larger amounts of data. Although downscaling instances of the majority class is not new for handling skewed distributions (Breiman et al. 1984), their approach does not discard any data, allows parallelism for processing large amounts of data efficiently, and permits the use of multiple "off-the-shelf" learning algorithms to increase diversity among the learned classifiers. Furthermore, how the data are sampled is based on the cost model, which might dictate down-sampling instances of the minority class instead of the majority class. One limitation of the approach is the need to run preliminary experiments to determine the desired distribution based on a defined cost model. This process can be automated, but it is unavoidable since the desired distribution is highly dependent on the cost model and the learning algorithm. Using four learning algorithms, the approach generates 128 classifiers from a 50:50 class distribution and eight months of data. It might not be necessary to keep all 128 classifiers, since some of them could be highly correlated and hence redundant. Also, more classifiers are generated when the data set is more skewed or additional learning algorithms are incorporated. Metrics for analyzing an ensemble of classifiers (e.g., diversity, correlated error, and coverage) can be used in pruning unnecessary classifiers. Furthermore, the real distribution is more skewed than the 20:80 provided to the authors, who intend to investigate their approach with more skewed distributions. As with a large
overhead, a highly skewed distribution can render fraud detection economically undesirable. More importantly, since thieves also learn and fraud patterns evolve over time, some classifiers are more relevant than others at a particular time. Therefore, an adaptive classifier selection method is essential. Unlike a monolithic approach of learning one classifier using incremental learning, the modular meta-classifier approach facilitates adaptation over time and removal of out-of-date knowledge.

Abhishek Shivanna, Sujan Ray et al. [13] note that the use of credit cards for online transactions is increasing very rapidly, so it is critical to correctly identify fraudulent online credit card transactions. They proposed a credit card fraud detection method that identifies such transactions. The Decision Jungle algorithm showed promising results and could be adapted in any fraud detection system. The authors plan to extend their work by using neural networks to build a more advanced fraud detection system.

Yiheng Wei, Yu Qi et al. [14] applied machine learning techniques to identify potential fraud cases. The first step was data description and data cleaning, in which all frivolous values were located, cleaned and replaced with the record number. The second step was variable creation: more than 600 variables were created using different methods. The third step was to select the most relevant features among all the variables created. Finally, several models were trained using the selected features: logistic regression, support vector machine, boosted trees, random forest, and neural network. The best model turned out to be the boosted tree, with a 54.3% FDR at a 3% cutoff for testing and a 54% FDR at a 3% cutoff for OOT. The article has important research significance and explains how to use different machine learning methods to monitor credit card fraud in real time. Finally, the authors hope to improve the optimization of these methods in dealing with unbalanced data sets in the future.

Mohammed Azhan, Shazli Meraj [15] found that machine learning techniques are more competent in handling the class imbalance problem than a shallow neural network. The distribution of class weights in neural networks makes only a minor contribution towards handling the class imbalance. Additional techniques such as cost-sensitive loss functions, oversampling and undersampling can also be used. It must also be noted that a better-balanced dataset would provide much better insight into the problem.

Anu Maria Babu, Dr. Anju Pratap [16] proposed a new technique based on feature rearrangement, with which the CNN model shows excellent experimental performance and good stability. The model does not require high-dimensional input features or derived variables, and can find a fairly good ordering of the input features within a number of attempts. Compared to most current CNN models, this model saves significant calculation time for the derived variables, making the model's design and adjustment process fast and simple. In an environment where online transactions require rapid response and accurate identification, this gives a higher level of availability. From this model the authors conclude that when the max pooling layer is added to the model the accuracy decreases, since the max pooling layer is a part of the network that does not take other factors into account: the curse of dimensionality, the network size and the problem of overfitting. Max pooling also preserves the most important details, which can be used to create a powerful multilayer fully connected network at the edge. When the max pooling layer is not added to the model, the accuracy increases.

Ruttala Sailusha, V. Gnaneswar et al. [17] compared fraud detection techniques. From their analysis, the authors conclude that the accuracy is the same for both the Random Forest and the Adaboost algorithms. When precision, recall and the F1-score are considered, the Random Forest algorithm has higher values than the Adaboost algorithm. Hence the authors conclude that the Random Forest algorithm works better than the Adaboost algorithm for detecting credit card fraud. From the above analysis, it is clear that many machine learning algorithms are used to detect fraud, but the results are not satisfactory. The authors would therefore like to implement deep learning algorithms to detect credit card fraud accurately.

Anuruddha Thennakoon, Chee Bhagyani et al. [18] address credit card fraud detection, which has been a keen area of research for years and will remain an intriguing one, mainly because of the continuous change of patterns in fraud. The paper proposes a novel credit card fraud detection system that detects four different patterns of fraudulent transactions using the best-suited algorithms and addresses the related problems identified by past researchers. Real-time credit card fraud detection is handled with predictive analytics and an API module, so that the end user is notified over the GUI the second a fraudulent transaction takes place. This part of the system allows the fraud investigation team to decide to move to the next step as soon as a suspicious transaction is detected. Optimal algorithms that address the four main types of fraud were selected through literature review, experimentation and parameter tuning, as shown in the methodology. The authors also assess sampling methods that effectively address the skewed distribution of data, and conclude that resampling techniques have a major impact on obtaining comparatively higher performance from the classifier. The machine learning models that captured the four fraud patterns (Risky MCC, Unknown web address, ISO Response Code, Transaction
above $100) with the highest accuracy rates are LR, NB, LR and SVM, with accuracy rates of 74%, 83%, 72% and 91% respectively. As the developed machine learning models present an average level of accuracy, the authors hope to focus on improving the prediction levels. Future extensions also aim to focus on location-based frauds.

Greeshma N Pai, Kirana R [19] studied credit card risk detection (CCRD). A limitation of the study, however, is that it only deals with detecting fraud in a supervised learning context. Although supervised learning methods such as KNN and Random Forest seem attractive and produce good results, they do not work well in dynamic environments. Fraud patterns typically change over time and would be hard to catch; new data sets would need to be collected and the machine learning models retrained. The paper studied applications of machine learning such as Naïve Bayes, logistic regression and random forest with boosting, and shows that they prove accurate in detecting fraudulent transactions and minimizing the number of false alerts. Supervised learning algorithms are novel in this literature in terms of application domain. If these algorithms are applied in a bank's credit card fraud detection system, the probability of fraudulent transactions can be predicted soon after the credit card transactions occur, and a series of anti-fraud strategies can be adopted to protect banks from great losses and reduce risks. The study reveals that, to detect fraud with larger datasets, the best methods would be SVMs, potentially combined with CNNs to get more reliable performance. For smaller datasets, ensemble approaches of SVM, Random Forest and KNN can provide good enhancements. Convolutional Neural Networks (CNN) usually outperform other deep learning methods such as autoencoders, RBM and DBN.

Dejan Varmedja, Mirjana Karanovic [20] addressed credit card fraud detection, which is a very serious business problem. These frauds can lead to huge losses, both business and personal. Because of that, companies invest more and more money in developing new ideas and ways to help detect and prevent fraud. The main goal of the paper was to compare certain machine learning algorithms for the detection of fraudulent transactions. The comparison established that the Random Forest algorithm gives the best results, i.e. it best classifies whether transactions are fraudulent or not. This was established using different metrics, such as recall, accuracy and precision. For this kind of problem, it is important to have a high recall value. Feature selection and balancing of the dataset were shown to be extremely important in achieving significant results. Further research should focus on different machine learning algorithms, such as genetic algorithms, and different types of stacked classifiers, alongside extensive feature selection, to get better results.

Thulasyammal Ramiah Pillai et al. [21] used the MLP algorithm. In their paper, the highest sensitivity is only 83%, due to the very limited combinations of the number of hidden layers and the number of nodes. Future studies should use multiple hidden layers and various numbers of nodes in the hidden layers to obtain optimum results. The authors conclude that the MLP with the logistic activation function gives the best result, followed by tanh. They also observe that the identity activation function gives the lowest sensitivity value due to its nature: it does not perform any transformation. In their future study, balanced data should be used. In the future, more advanced deep learning algorithms can be used to detect credit card fraud. Moreover, the authors will try new activation functions in their future study.

John O. Awoyemi et al. [22] investigate the comparative performance of Naïve Bayes, K-nearest neighbor and logistic regression models in binary classification of imbalanced credit card fraud data. The rationale for investigating these three techniques is the limited comparison they have attracted in past literature. However, a subsequent study to compare other single and ensemble techniques using the same approach is underway. The contribution of the paper is summarized as follows: 1. Three classifiers based on different machine learning techniques (Naïve Bayes, K-nearest neighbors and logistic regression) are trained on real-life credit card transaction data, and their performance on credit card fraud detection is evaluated and compared based on several relevant metrics. 2. The highly imbalanced dataset is sampled in a hybrid approach where the positive class is oversampled and the negative class undersampled, achieving two sets of data distributions. 3. The performance of the three classifiers is examined on the two sets of data distributions using accuracy, sensitivity, specificity, precision, balanced classification rate and Matthews Correlation Coefficient metrics. The performance of the classifiers varies across the different evaluation metrics. Results from the experiment show that kNN performs significantly well on all metrics evaluated except for accuracy in the 10:90 data distribution. The study shows the effect of hybrid sampling on the performance of binary classification of imbalanced data. Expected future areas of research include examining meta-classifiers and meta-learning approaches for handling highly imbalanced credit card fraud data. The effects of other sampling approaches can also be investigated.

D. Tanouz, R Raja Subramanian et al. [23] applied decision tree, random forest, logistic regression and naive Bayes classification algorithms to credit card risk detection. The results show that the random forest classifier performs best, with 96.7741% accuracy, 100% precision, 91.1111% recall, a 95.3488% F1 score and a 95.5555 ROC-AUC score, although there are still 4 false negatives. When the data are used without random undersampling, the accuracy reaches 99.98%, but this is due to the heavy class imbalance and results in many false outputs. After cleaning data
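The metrics quoted for [22] and [23] can all be derived from the four confusion-matrix counts, and a short sketch shows why high accuracy alone is misleading on imbalanced data. The counts used below are assumptions: [23] does not state its confusion matrix, and tp=41, fp=0, fn=4, tn=79 is simply one set of counts that reproduces the reported accuracy, precision, recall and F1 score. The second example (10,000 transactions with 20 frauds) is entirely hypothetical, and the function name `confusion_metrics` is illustrative.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity, true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # Matthews Correlation Coefficient
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        # Mean of sensitivity and specificity (balanced classification rate).
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": mcc,
    }


# Assumed counts (NOT given in [23]); chosen because they match the reported figures.
undersampled = confusion_metrics(tp=41, fp=0, fn=4, tn=79)

# Hypothetical imbalanced test set: a classifier that labels every one of
# 10,000 transactions as legitimate, although 20 of them are fraudulent.
always_legit = confusion_metrics(tp=0, fp=0, fn=20, tn=9980)

for name, result in (("undersampled", undersampled), ("always_legit", always_legit)):
    print(name, {k: round(v, 6) for k, v in result.items()})
```

The second case reaches 99.8% accuracy while catching no fraud at all, so its recall and MCC are both zero. This is the reason several of the surveyed papers prefer recall, F1, balanced classification rate or MCC over raw accuracy.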
[11] Salvatore J. Stolfo, David W. Fan, Wenke Lee, Andreas L. Prodromidis and Philip K. Chan, "Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results", supported by grants from DARPA (F30602-96-1-0311), NSF (IRI-96-32225 and CDA-96-25374), and NYSSTF (423115-445).

[16] Anu Maria Babu, Dr. Anju Pratap, "Credit Card Fraud Detection Using Deep Learning", 2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS).

and Marjani, "Credit Card Fraud Detection Using Deep Learning Technique", ©2018 IEEE

[22] John O. Awoyemi, Adebayo O. Adetunmbi, Samuel A. Oluwadare, "Credit card fraud detection using Machine Learning Techniques: A Comparative Analysis", ©2017 IEEE