Khatri 2020
Abstract— In today’s economic scenario, credit card use has become extremely commonplace. These cards allow the user to make payments of large sums of money without the need to carry large sums of cash. They have revolutionized cashless payments and made payments of any sort convenient for the buyer. This electronic form of payment is extremely useful but comes with its own set of risks. With the increasing number of users, credit card frauds are also increasing at a similar pace. The credit card information of a particular individual can be collected illegally and used for fraudulent transactions. Machine learning algorithms can be applied to the collected data to tackle this problem. This paper presents a comparison of some established supervised learning algorithms to differentiate between genuine and fraudulent transactions.

Keywords: Credit Card, Credit Card Fraud, Machine Learning, Supervised Learning.

I. INTRODUCTION
A fraud can be described as an intentional deceit perpetrated for some kind of gain, mostly monetary. It is an unfair practice whose occurrences are increasing by the day. There has been a sharp increase in the usage of electronic payment methods like credit and debit cards, and this has in turn led to a rise in credit card frauds. These cards may be used in both online and offline modes to make payments [7]. In the online mode of payment, the card may not have to be physically presented. In such cases the card data is prone to attack by hackers or cyber criminals. These kinds of frauds result in millions being lost every year. To overcome this obstacle, many algorithms have been and are being developed. Various detection approaches are being worked upon to solve this issue most efficiently [8].

Credit card transactions are extremely commonplace now, but fraud detection faces several problems. The process of acceptance or rejection of a transaction happens within a very small time frame, which may range between microseconds and milliseconds. Therefore, the process adopted for the detection of a fraudulent transaction has to be extremely quick and effective. Another problem is that a vast number of similar transactions happen at the same time. This makes it difficult to monitor each and every transaction individually and hence to identify a fraud. Thus, an efficient Fraud Detection System must be put into work to be able to differentiate between a genuine and a fraudulent transaction. Such a system works on the principle of learning user-specific card usage behavior. Thus, existing supervised as well as unsupervised machine learning techniques can be applied to the data.

The objective of this paper is to evaluate an imbalanced dataset with the help of various supervised machine learning models and to determine which one of those is best suited for detecting credit card frauds. We make use of 5 supervised machine learning models to evaluate a dataset on the basis of various predefined criteria.

II. RELATED WORK
Specific algorithms based on artificial intelligence and neural networks are also being proposed and implemented to predict credit card frauds with increased accuracy. The distribution of the datasets used for fraud detection is highly imbalanced. To overcome this obstacle, under-sampling and oversampling techniques are being designed to obtain comparatively balanced data. Data mining techniques are also being implemented in order to create a more efficient Fraud Detection System [9]. Another important area of development is the emergence of new hybrid models. These are derived from preexisting supervised as well as unsupervised machine learning techniques. Hybrid models may be able to produce a more accurate result as they capture the capabilities of both supervised and unsupervised machine learning [15].

It is observed that the performance of all machine learning models is hindered by the skewness of the available data sets, which are usually unbalanced. To overcome this problem, the unbalanced datasets are to be converted to balanced ones. This can be done in mainly two ways, the Intrinsic Features Method and the Network-based Features Method. In the Intrinsic Features Method, a pattern in the customer activity is observed, whereas in the Network-based Features Method, the network of users and card merchants is exploited. These techniques may significantly improve the functioning of certain models as they work on a more balanced dataset [5].

III. MACHINE LEARNING
Machine Learning is an application of Artificial Intelligence techniques that makes systems learn by themselves. This means that the system automatically learns, improves and adapts through experience without being explicitly programmed for a particular operation. This field
10th International Conference on Cloud Computing, Data Science & Engineering (Confluence) 681
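The under-sampling idea mentioned in the Related Work section can be illustrated with a minimal sketch. It assumes plain random under-sampling of the majority class, which is only one of several possible resampling schemes and is not claimed to be the procedure used in the cited works. The helper name `random_undersample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows of the larger classes until every class has as
    many rows as the smallest class (hypothetical illustrative helper)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    sampled = []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            sampled.append((row, label))

    # Shuffle so that the classes are not stored in contiguous blocks.
    rng.shuffle(sampled)
    new_rows = [row for row, _ in sampled]
    new_labels = [label for _, label in sampled]
    return new_rows, new_labels


# Toy data (assumed): 95 genuine (0) and 5 fraudulent (1) transactions.
rows = [[float(i)] for i in range(100)]
labels = [0] * 95 + [1] * 5

balanced_rows, balanced_labels = random_undersample(rows, labels)
class_counts = Counter(balanced_labels)
```

After the call, `class_counts` holds the same number of rows for both classes. The price of this balance is that most genuine transactions are discarded, which is why oversampling is often considered as an alternative.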
VI. MODELS USED

A. Decision Tree
This is one of the most widely used predictive modelling approaches. As the name suggests, the model is built in the form of a tree-like structure [16]. This model may be used in case of a multi-dimensional analysis where multiple classes are present. The past data, also known as the past vector, is used to create a model that can predict the value of the output based on the input provided. There are multiple nodes in a tree and each node corresponds to one or the other vector. The tree terminates at a leaf node, where each such node represents a possible outcome or output.

B. kNN
The k-Nearest Neighbor model is one of the simplest but most effective models. In this model, the class label of a test element is assigned on the basis of the class labels of the neighboring training data elements. The similarity between two elements is measured using Euclidean distance [4][16]. It is also known as an instance-based learning or lazy model. The value of ‘k’ is the number of nearest neighbors that have to be considered.
A suitable value for ‘k’ should be chosen. An appropriate distance metric is also a requirement. Sometimes, the ‘Minkowski’ distance may be used. It is a generalization of the Euclidean and Manhattan distances. Mathematically, it can be represented as:

$$d\left(x^{(i)}, x^{(j)}\right) = \sqrt[p]{\sum_{k} \left| x_k^{(i)} - x_k^{(j)} \right|^{p}} \tag{3}$$

Here the exponent p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance.

C. Logistic Regression
It is a statistical model which makes use of a logistic function to model a binary dependent variable. This model is mainly used where a binary classification problem arises. It works well on linearly separable classes [4]. The odds ratio is one concept using which we can also define the logit function. It expresses the odds in favor of an event occurring.

$$\text{Odds Ratio} = \frac{p}{1 - p} \tag{4}$$

where p = probability of the positive event.
The logit function is the logarithm of the odds ratio. It takes inputs in the range of [0,1] and transforms them to values over the real-number range.
The logit function can be defined as follows:

$$\text{logit}(p) = \log \frac{p}{1 - p} \tag{5}$$

In this model, the sigmoid function, which is the inverse of the logit function, is also used effectively:

$$\phi(z) = \frac{1}{1 + e^{-z}} \tag{6}$$

D. Random Forest
This model is an ensemble classifier, i.e. a combining classifier that uses and combines many decision tree classifiers. The main idea behind using multiple trees is to train the trees well enough that each of them contributes its own model. After the trees are generated, their outputs are combined through majority voting. Each tree is built on its own sample of the data, and all of these samples follow a similar distribution [6]. This model has the quality of efficiently balancing errors in class populations of unbalanced data sets. It can be used to solve both classification and regression problems.

E. Naive Bayes
It is a form of probabilistic classifier, which means that it predicts a probability for each of the possible classes rather than only a single label. It is based on Bayes' Theorem, and the decision is made based on conditional probability. This model is a family of algorithms rather than a single algorithm, but all of them share a common principle. In this model, it is assumed that each feature makes an equal and independent contribution to the output. This model has a certain advantage over other models as it requires only a small amount of training data [4].

VII. OBSERVATIONS

TABLE I
PERFORMANCE EVALUATION OF VARIOUS MODELS AT THRESHOLD VALUE OF 0.5

MODEL                  SENSITIVITY (%)   PRECISION (%)
DECISION TREE          79.21             85.11
KNN                    81.19             91.11
LOGISTIC REGRESSION    63.34             87.67
RANDOM FOREST          75.25             93.83
NAIVE BAYES            85.15             6.56

We use the imbalanced dataset to analyze the 5 supervised learning models and find the values of sensitivity and precision for each of these models. The default threshold value is taken as 0.5, the conventional choice.

TABLE II
PERFORMANCE EVALUATION OF VARIOUS MODELS AT THRESHOLD VALUE OF 0.4

MODEL                  SENSITIVITY (%)   PRECISION (%)
DECISION TREE          79.21             85.11
KNN                    81.19             91.11
LOGISTIC REGRESSION    69.31             87.5
RANDOM FOREST          78.22             89.77
NAIVE BAYES            85.15             6.52

The threshold value is changed from 0.5 to 0.4 for calculating sensitivity and precision for each model. The analysis was performed at different values, but the best output was obtained when the threshold value was taken as 0.4. When the value was changed, an increase was observed in the sensitivity and
precision of Logistic Regression, Naive Bayes and Random Forest.

TABLE III
TIME TAKEN FOR TRAINING AND PREDICTING DATA USING VARIOUS MODELS

MODEL                  TIME TAKEN FOR TRAINING    TIME TAKEN FOR PREDICTING
                       THE MODEL (SECONDS)        THE TEST DATA (SECONDS)
DECISION TREE          5                          0
KNN                    1                          462
LOGISTIC REGRESSION    3                          0
RANDOM FOREST          23                         0
NAIVE BAYES            0                          0

The time taken by all 5 models for training the model and predicting the test data was recorded. These values are not exact measurements but approximate values of the time taken.

VIII. CONCLUSION AND FUTURE WORK

In this study, we used an imbalanced dataset to check the suitability of different supervised machine learning models for predicting the chances of occurrence of a fraudulent transaction. We used sensitivity, precision and time as the deciding parameters to come to a conclusion. Accuracy was not used as a parameter because it is not informative on imbalanced data and does not give a conclusive answer. We analyzed the kNN, Naive Bayes, Decision Tree, Logistic Regression and Random Forest models in this study. We used these models for predicting the chances of occurrence of a fraudulent credit card transaction out of a given number of transactions. Credit card frauds are a modern-day issue, and we came to the conclusion that the best suited model for predicting such frauds is the Decision Tree model. The analysis shows that the sensitivity of the kNN model is greater than that of Decision Tree, but as the time taken by kNN for testing the data is very large, we choose Decision Tree over kNN. In fraud detection, we need to ensure that minimum time is taken for prediction; therefore, Decision Tree is the preferred model.

Future researchers in this field may apply resampling techniques to the respective datasets being used. Resampling helps to reduce the imbalance ratio of datasets, which in turn produces better classification results.

After the comparative analysis of the various supervised learning models, we can infer that the Decision Tree model is the best approach for credit card fraud detection. However, the performance of the Decision Tree model must also be compared with that of unsupervised machine learning models in the future to produce a more conclusive result. This would tell us whether the chosen model is the better option or whether unsupervised machine learning techniques perform better.

REFERENCES

[1] https://www.analyticsvidhya.com/blog/2016/03/practical-guide-deal-imbalanced-classification-problems/. [Accessed: Oct 12, 2019].
[2] https://www.ritchieng.com/machine-learning-evaluate-classification-model/. [Accessed: Oct 12, 2019].
[3] A. Dal Pozzolo, G. Boracchi, O. Caelen, C. Alippi and G. Bontempi, "Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy," in IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 8, pp. 3784-3797, Aug. 2018.
[4] J. O. Awoyemi, A. O. Adetunmbi and S. A. Oluwadare, "Credit card fraud detection using machine learning techniques: A comparative analysis," 2017 International Conference on Computing Networking and Informatics (ICCNI), Lagos, 2017, pp. 1-9.
[5] S. Dhankhad, E. Mohammed and B. Far, "Supervised Machine Learning Algorithms for Credit Card Fraudulent Transaction Detection: A Comparative Study," 2018 IEEE International Conference on Information Reuse and Integration (IRI), Salt Lake City, UT, 2018, pp. 122-125.
[6] S. Xuan, G. Liu, Z. Li, L. Zheng, S. Wang and C. Jiang, "Random forest for credit card fraud detection," 2018 IEEE 15th International Conference on Networking, Sensing and Control (ICNSC), Zhuhai, 2018, pp. 1-6.
[7] S. Bhattacharyya, S. Jha, K. Tharakunnel and J. C. Westland, "Data mining for credit card fraud: A comparative study," Decis. Support Syst., vol. 50, no. 3, pp. 602-613, 2011.
[8] K. Chaudhary, J. Yadav and B. Mallick, "A review of Fraud Detection Techniques: Credit Card," Int. J. Comput. Appl., vol. 45, no. 1, pp. 975-8887, 2012.
[9] F. N. Ogwueleka, "Data Mining Application in Credit Card Fraud Detection System," vol. 6, no. 3, pp. 311-322, 2011.
[10] O. S. Yee, S. Sagadevan, N. Hashimah and A. Hassain, "Credit Card Fraud Detection Using Machine Learning As Data Mining Technique," vol. 10, no. 1, pp. 23-27.
[11] C. Phua, D. Alahakoon and V. Lee, "Minority report in fraud detection," ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, p. 50, 2004.
[12] N. Sethi and A. Gera, "A Revived Survey of Various Credit Card Fraud Detection Techniques," International Journal of Computer Science and Mobile Computing, vol. 3, no. 4, pp. 780-791, 2014.
[13] J. Awoyemi, A. Adetunmbi and S. Oluwadare, "Credit card fraud detection using machine learning techniques: A comparative analysis," 2017 International Conference on Computing Networking and Informatics (ICCNI), 2017.
[14] http://www.ulb.ac.be/di/map/adalpozz/imbalancedatasets.zip. [Accessed: Oct 10, 2019].
[15] S. Mittal and S. Tyagi, "Performance Evaluation of Machine Learning Algorithms for Credit Card Fraud Detection," 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2019.
[16] S. Dutt, A. K. Das and S. Chandramouli, Machine Learning. Pearson Education India, 2018.
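The threshold comparison reported in Tables I and II can be illustrated with a minimal sketch of how sensitivity and precision are obtained from predicted fraud probabilities at a chosen threshold. The function name `sensitivity_and_precision`, the toy labels and the toy scores are assumptions made for illustration. They are not taken from the dataset or the code used in this study.

```python
def sensitivity_and_precision(y_true, fraud_probability, threshold=0.5):
    """Return (sensitivity, precision) when a transaction is flagged as
    fraud whenever its predicted probability reaches the threshold."""
    tp = fp = fn = 0
    for actual, prob in zip(y_true, fraud_probability):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1  # fraud correctly flagged
        elif predicted == 1 and actual == 0:
            fp += 1  # genuine transaction wrongly flagged
        elif predicted == 0 and actual == 1:
            fn += 1  # fraud that was missed

    # Sensitivity: share of actual frauds that were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of flagged transactions that were actual frauds.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Toy example (assumed values): 4 frauds (1) and 6 genuine transactions (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
fraud_probability = [0.9, 0.7, 0.45, 0.2, 0.6, 0.42, 0.3, 0.1, 0.05, 0.01]

at_05 = sensitivity_and_precision(y_true, fraud_probability, threshold=0.5)
at_04 = sensitivity_and_precision(y_true, fraud_probability, threshold=0.4)
```

In this toy example, lowering the threshold from 0.5 to 0.4 flags one more fraud and one more genuine transaction. In general, a lower threshold can only keep or raise sensitivity, while precision may move in either direction.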
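The Minkowski distance of Eq. (3), described in the kNN subsection as a generalization of the Euclidean and Manhattan distances, can be sketched as a small helper. The function name `minkowski_distance` and the sample points are hypothetical and only show how the exponent p selects the metric.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance between two equal-length feature vectors.
    p = 1 gives the Manhattan distance, p = 2 the Euclidean distance."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Sum the p-th powers of the coordinate differences, then take the p-th root.
    total = sum(abs(x - y) ** p for x, y in zip(a, b))
    return total ** (1.0 / p)


# Two sample points (assumed values).
point_a = [0.0, 0.0]
point_b = [3.0, 4.0]

manhattan = minkowski_distance(point_a, point_b, p=1)
euclidean = minkowski_distance(point_a, point_b, p=2)
```

For these two points the Manhattan distance is 7 and the Euclidean distance is 5, so the choice of p can change which training elements count as the nearest neighbors.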