
Comparative Evaluation of Credit Card Fraud Detection Using Machine Learning Techniques

Conference Paper · October 2019


DOI: 10.1109/GCAT47503.2019.8978372



2019 Global Conference for Advancement in Technology (GCAT)

COMPARATIVE EVALUATION OF CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING TECHNIQUES

Olawale Adepoju, Research Scholar, Department of Data Science
Julius Wosowei, Research Scholar, Department of Computer Engineering
Shiwani Lawte, Research Scholar, Department of Data Science
Hemant Jaiman, Research Scholar, Department of Data Science

School of Engineering and Technology, Jain University, Bangalore, India

978-1-7281-3694-3/$31.00 ©2019 IEEE

Abstract—Credit card fraud is a serious and growing problem, driven by the increase in e-commerce and online transactions. Through identity theft and loss of money, such practices can affect millions of people around the world, and this criminal activity is a rising threat to the financial sector with far-reaching implications.
Data mining has assumed a central role in the detection of online payment fraud. The efficiency of fraud detection in credit card purchases is significantly affected by the sampling strategy applied to the data set, the choice of variables and the detection techniques used.
This paper examines the performance of Support Vector Machine, Naive Bayes, Logistic Regression and K-Nearest Neighbor on highly skewed credit card fraud data.
The performance of these techniques is assessed on accuracy, sensitivity, precision and specificity. The results show that the best accuracies for the logistic regression, Naive Bayes, k-nearest neighbor and support vector machine classifiers are 99.07%, 95.98%, 96.91% and 97.53% respectively. The comparative results demonstrate that logistic regression performs better than the other algorithms.

Keywords— Credit card fraud, Data Mining, Machine Learning, Naive Bayes, Logistic Regression, Support Vector Machine, K-Nearest Neighbor

I. INTRODUCTION

Financial fraud is a steadily growing threat with far-reaching consequences for financial services, businesses and government organizations. Fraud can be described as criminal deception with the intention of obtaining financial gain. Reliance on internet technology has significantly increased credit card transactions. As card transactions become the predominant form of payment for both physical and online purchases, the rate of credit card fraud also increases rapidly.

Card fraud may be internal or external. Internal card fraud happens through collusion between cardholders and the financial institution, using a false identity to cheat the system, whereas external fraud involves the use of a credit or debit card to obtain money by questionable or fraudulent means. A good deal of research has focused on identifying external card fraud, which often accounts for much of credit card fraud.

External credit card fraud can be classified into two types: card-not-present fraud and card-present fraud. Card-not-present fraud happens when a customer's card details, including card number, expiry date and card verification code (CVC), are compromised and then used without physically presenting the card to a seller, for example in online transactions. Card-present fraud happens when credit card data is stolen directly from a physical credit card [1]. Detecting fraudulent transactions by traditional manual methods is tedious and uneconomical, and the advent of big data has made manual methods even less feasible. Financial organizations have therefore turned their attention to more recent computational techniques to deal with credit card fraud.

Data mining is one of the best-known and most popular techniques used for fraud detection. The true motive and legitimacy behind any transaction cannot be known with absolute certainty; in practice, the most effective option is to search for possible evidence of fraud in the available data using statistical algorithms. Card fraud detection is the process of classifying transactions into two classes, genuine and fraudulent [2].

Detection of credit card fraud is based on analyzing the spending behavior on a card. Numerous methods have been applied to card fraud detection, including support vector machine [3], genetic algorithm [4], decision tree [5], artificial neural network [6] and naïve Bayes [7].

Credit card companies attempt to predict a purchase's authenticity by evaluating discrepancies in areas such as purchase location, transaction amount and the user's purchase history. Nevertheless, with the current rise in credit card fraud cases, optimizing algorithmic solutions is crucial for credit card companies [8].

Credit card fraud detection faces a number of challenges. Firstly, fraudulent behavior patterns are dynamic, which means that fraudulent transactions will, in general,
look like legitimate ones. Secondly, credit card transaction data sets are rarely available and are highly imbalanced. Thirdly, the efficiency of credit card fraud detection is strongly influenced by the sampling approach, the selection of variables and the detection techniques used.

Last but not least, the data changes constantly over time, so the profiles of normal and fraudulent behavior keep shifting: a transaction pattern that was legitimate in the past may be fraudulent today, or vice versa.

This paper presents a comparative analysis of credit card fraud detection using support vector machine, k-nearest neighbor, naïve Bayes and logistic regression techniques on skewed data. The aim is to identify, on the basis of accuracy, the best method of classifying a credit card transaction as fraudulent or non-fraudulent for the algorithms and combination of factors considered.

II. LITERATURE REVIEW
Fraud detection is traditionally seen as a data mining classification problem, with the aim of correctly classifying credit card transactions as legitimate or fraudulent [9]. Classifying card transactions is primarily a binary classification problem: each credit card transaction is either legitimate or fraudulent.

A. Data Mining
Data mining is the process of identifying valuable trends and patterns in large datasets [10], combining fields of study such as machine learning, informatics and statistics. It requires the ability to analyze and manipulate data [11].

Classification is a data mining function that assigns the items in a collection to target categories or classes. The objective of classification is to predict the target class accurately for each case in the data. A classification task begins with a dataset in which the class assignments are known; these serve as the prediction targets.

The simplest type of classification problem is binary classification, in which the target attribute has just two possible values, for example high risk or low risk of fraud. Binary classifiers are therefore the most appropriate algorithms for credit card fraud detection.

B. Credit Card Fraud
Card fraud can be divided into two types, internal card fraud and external fraud [12, 13], while a finer-grained classification distinguishes traditional card-related fraud (application, stolen card, account takeover, fake and counterfeit), merchant-related fraud (collusion and triangulation) and Internet fraud (site cloning, credit card generators and false merchant sites) [14]. The total losses from fraudulent transactions suffered by financial institutions and organizations across the world are reported in [15]. The report concluded that, with an increase of about $2.6 billion over the previous year's reported losses, losses exceeded $16 billion in 2015, implying that about 6.1 cents of every $100 was fraudulent.

C. Machine Learning
Machine learning is a branch of artificial intelligence in which computers are trained to identify patterns in large datasets and to improve on those patterns automatically without the need for human intervention. Training begins with a basic machine learning algorithm that processes training data to analyze the relationship between different features and a target value. The target value is given explicitly to the algorithm in the training stage. Once trained, the model can be used to predict unknown target values for other data instances.

Depending on whether the training data are labeled, machine learning can be classified as supervised or unsupervised. Supervised learning focuses on finding a relationship between input values and output values in order to predict outputs for new inputs. A supervised learning problem can be further classed as either classification or regression [16]. Classification problems assign the output to a category (such as fraud or non-fraud), while regression problems output a specific value (such as height). Algorithms that are given no target output, and instead analyze the structure of the input data, are referred to as unsupervised because the training data are neither labeled nor classified [17].

This project implements supervised machine learning algorithms to classify credit card transactions as either fraudulent or non-fraudulent [18].

D. Credit Card Fraud Detection
As card payment becomes the most widespread method of payment both over the internet and in person, credit card fraud tends to grow quickly. Detecting fraudulent transactions by conventional manual methods is tedious and inaccurate, and the growth of big data has made conventional methods increasingly impractical. Organizations have therefore switched to intelligent, learning-based methods.

Predictive fraud detection methods fall into two broad categories: supervised and unsupervised [19]. In supervised fraud detection, models are built from the features of fraudulent and legitimate transactions [20] in order to classify new transactions as fraudulent or legitimate, while in unsupervised fraud detection, outlier transactions are flagged as potential instances of fraud. A detailed discussion of supervised and unsupervised techniques can be found in [21].

A variety of studies have applied different methods to the problem of card fraud detection. These methods include, but are not limited to, artificial neural networks, Bayesian models, logistic regression, support vector machines, decision trees and k-nearest neighbor.
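The unsupervised idea mentioned above, in which outlying transactions are flagged as potential fraud, can be illustrated with a very small sketch. This is not a method used in this paper: the function name flag_outliers, the sample amounts and the cutoff of 3.5 are assumptions chosen for illustration, and the score used here is the modified z-score based on the median absolute deviation.

```python
import statistics

def flag_outliers(amounts, cutoff=3.5):
    """Return the indices of amounts whose modified z-score exceeds cutoff."""
    median = statistics.median(amounts)
    # Median absolute deviation: a spread estimate that a single
    # extreme value cannot inflate the way a standard deviation can.
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        return []  # at least half the amounts equal the median; score undefined
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, a in enumerate(amounts)
            if 0.6745 * abs(a - median) / mad > cutoff]

# Hypothetical transaction amounts; the last one is far from the rest.
amounts = [12.5, 40.0, 33.1, 25.0, 18.9, 29.5, 950.0]
print(flag_outliers(amounts))
```

A flagged index is only a candidate for review. Unlike the supervised classifiers compared in this paper, such a rule never sees a fraud label, so it cannot learn which kinds of unusual transactions are actually fraudulent.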

E. Feature Selection
The foundation of credit card fraud detection is the analysis of the cardholder's spending behavior. This spending behavior is assessed using a suitable selection of features that capture the unique behavior of a credit card; the profiles of both legitimate and fraudulent transactions tend to change over time. These features are derived from a combination of past transaction history. They fall into five basic types: all-transaction data, geographic data, merchant-type data, time-based amount data and time-based transaction-count data [22].

Variables of the all-transaction type describe the overall usage pattern of a card. Variables of the geographic type describe the card's spending habits by geographic region. Variables of the merchant type describe the use of the card across different merchant categories. Time-based variables describe the usage history of the card in terms of the amount of use over different time periods.

Most earlier papers focused on the cardholder's data rather than the card's. A person may well hold two or more payment cards for different purposes, and so may show different spending habits on each. This study therefore focuses on the card rather than the cardholder, since a single card exhibits one distinct spending profile whereas a cardholder can exhibit several behaviors across different cards. A total of 30 variables are used in [23] and 25 in [22], while 21 variables are reduced to 16 relevant ones in [9].

III. EXPERIMENTAL METHODOLOGY
This section describes the procedure and methods used in the experiment, the dataset used to train the models and the four machine learning techniques studied: Support Vector Machine, Naïve Bayes, K-Nearest Neighbor and Logistic Regression. The phases of the experiment are data collection, data preprocessing, data analysis, training of the classifier algorithms, and evaluation, which involves splitting the data into a training set and a test set.

In the training phase the classifier models are built and supplied with the preprocessed data; the test results are then evaluated using measures derived from the confusion matrix.

A. Dataset
The dataset comes from the Kaggle machine learning platform [24]. It contains 3075 transactions with 12 features in a CSV file. Due to confidentiality issues, details and background information about the features cannot be presented. The features include the average transaction amount per day, the transaction amount, whether the transaction was declined, whether it is a foreign transaction, whether it is high risk, and the six-month average balance. The 'is fraudulent' feature is the label for binary classification; it takes the value Y for illegal operations (fraud) and N for legal operations (not fraud).

B. Data Preprocessing
The data is preprocessed before training. Most of the feature columns are categorical, so conversion to integers is necessary: where a feature's value is 'Y' we convert it to 1, and where it is 'N' we convert it to 0. 80% of the dataset is to be used as the training set and 20% as the test set.

Because of the high ratio of non-fraudulent to fraudulent transactions in the dataset, and because the two classes are not uniformly distributed across the rows, the split is made by row range. For the fraudulent class we take rows 1 to 400 as training data and rows 401 to 450 as testing data. For the non-fraudulent class we take rows 451 to 2800 as training data and rows 2801 to 3075 as testing data.

    import pandas as pd

    # x holds the feature columns and y the labels; both are assumed to keep
    # the default 0..n-1 row index, so row labels and positions coincide.
    r = list(range(401, 450))  # positions 401..449, held out of training

    # Training set: the first 2800 rows minus the held-out rows.
    # to_numpy() replaces as_matrix(), which was removed in pandas 1.0.
    train_x = x.iloc[:2800, :].drop(r).to_numpy()
    train_y = y.iloc[:2800].drop(r).to_numpy()

    # Test set: the held-out rows followed by every row from 2800 onward.
    test_x = pd.concat([x.iloc[r], x.iloc[2800:, :]])
    test_y = pd.concat([y.iloc[r], y.iloc[2800:]])

List 1. Code snippet for Data Preprocessing

C. Logistic Regression:
The logistic regression algorithm applies the sigmoid (logistic) function to a linear combination of the features in the dataset to carry out binary classification. The sigmoid function is shown below:

$$ y = \frac{1}{1 + e^{-z}} \tag{1} $$

The sigmoid function is used to find the probability for a binary classification. In this equation, $y$ is the predicted probability of the output and $z$ is the log-odds of the example:

$$ z = b + m_1 x_1 + m_2 x_2 + m_3 x_3 + \dots + m_n x_n \tag{2} $$

where $b$ is the intercept (bias) of the linear model, $m_1, \dots, m_n$ are the weights and $x_1, \dots, x_n$ are the feature values. The sigmoid function thus predicts the likelihood of a given outcome.
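Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names, the intercept, the weights and the feature values are all made up for the example, and the decision threshold of 0.5 is passed as a parameter so that it can be changed.

```python
import math

def sigmoid(z):
    """Map a log-odds value z to a probability in (0, 1), as in Equation (1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(b, weights, features):
    """Compute z = b + sum(m_i * x_i) as in Equation (2), then apply the sigmoid."""
    z = b + sum(m * x for m, x in zip(weights, features))
    return sigmoid(z)

def classify(probability, threshold=0.5):
    """Return 1 (fraud) if the probability is above the threshold, else 0."""
    return 1 if probability > threshold else 0

# Hypothetical intercept, weights and feature values, not fitted to any data.
b = -1.0
weights = [0.8, -0.4, 1.5]
features = [1.0, 2.0, 1.0]

p = predict_probability(b, weights, features)
print(round(p, 4), classify(p))
```

In a fitted model the intercept and weights would be learned from the training set; here they only show how a weighted sum of features becomes a probability and then a class label.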

Figure 1. Graph of Logistic regression sigmoid function

Logistic regression is used for binary classification into 1 or 0 with a threshold of 0.5: any value higher than the threshold is considered 1 and any value lower than the threshold is considered 0.

D. K-Nearest Neighbor (KNN):
The k-nearest neighbor algorithm is a classification algorithm that predicts the class of a data point from its position relative to other points. Classification is based on similarity measures such as the Euclidean distance or the Manhattan distance. It is assumed that the training point with the shortest Euclidean distance to the test point shares the test point's unknown attribute. The Euclidean distance measure is used for the KNN classifier in this study. The Euclidean distance (EC) between two point vectors $x_1$ and $x_2$ is given by:

$$ EC = \sqrt{\sum_{k=1}^{n} (x_{1k} - x_{2k})^2} \tag{3} $$

The Manhattan distance between two points $(x_i, y_i)$ and $(x_n, y_n)$ is the sum of the absolute differences of their Cartesian coordinates:

$$ M = |x_i - x_n| + |y_i - y_n| \tag{4} $$

E. Naïve Bayes:
The naïve Bayes classifier is based on Bayes' theorem and selects the decision with the highest probability. Bayesian probability estimates unknown probabilities from known values and known probabilities. It is a supervised machine learning algorithm represented by

$$ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{5} $$

Bayes' theorem provides a way of calculating the posterior probability $P(A \mid B)$, the probability of outcome $A$ given conditions $B$. It obtains the posterior by multiplying the ratio $P(B \mid A) / P(B)$ by the prior probability $P(A)$, the probability of the outcome without any knowledge of the influencing conditions.

The naïve Bayes classifier assumes that each factor affects the outcome independently, and is therefore called naïve. It offers another way to classify fraudulent credit card transactions.

F. Support Vector Machine
Support vector machines are supervised learning algorithms that can be applied to classification and regression problems. A support vector machine determines the best-fitting boundary for classifying the data.

A support vector machine (SVM) is formally defined as a discriminative classifier specified by a separating hyperplane. In other words, given labeled training data, the algorithm produces an optimal hyperplane that categorizes new examples. In two-dimensional space this hyperplane is a line dividing the plane into two parts, with each class lying on either side. Data points to the right of the hyperplane are classified as non-fraudulent, while the others are classified as fraudulent. Both hyperplanes in Figure 2 correctly separate the data points by class, but the most effective hyperplane is the one that achieves a similar level of accuracy when unknown data points need to be classified.

The SVM selects the optimal hyperplane based on its distance to the nearest points of each class, as in Figure 2. This distance is called the margin, and the points that lie on the margin are known as support vectors.

Figure 2. Support Vector Machine

IV. EVALUATION AND RESULT
To evaluate the machine learning models we considered two different methods:
(1) Classification accuracy, which is the ratio of the number of correct predictions to the number of input samples, as seen in Equation 6. This measure is reliable only if there is an equal number of samples in each class.

$$ \text{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions made}} \tag{6} $$

(2) Confusion matrix: this gives a matrix as output and describes the complete performance of the model. Four basic measurements are used in the evaluation, namely the True Positive Rate (TPR), True Negative Rate (TNR), False Positive Rate (FPR) and False Negative Rate (FNR).

Here true positive, true negative, false positive and false negative denote the numbers of test cases that fall into each of those four outcomes.
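The confusion-matrix counts described above can be tallied directly from paired lists of actual and predicted labels. The sketch below is illustrative only: the helper names and the ten sample labels are assumptions, with 1 standing for fraud and 0 for non-fraud, and the four measures follow their standard definitions.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = fraud)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Derive accuracy, sensitivity, specificity and precision from the counts."""
    def ratio(num, den):
        # A measure is undefined when its denominator is zero.
        return num / den if den else float("nan")
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }

# Hypothetical labels for ten transactions (three of them fraudulent).
actual    = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)
print(metrics(tp, tn, fp, fn))
```

Returning NaN for a zero denominator is a design choice of this sketch. It keeps undefined cases visible, such as the precision of a model that never predicts fraud, instead of silently reporting 0 or 1.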

Thus, p and n are the total numbers of positive and negative class cases being tested.

True positives are cases predicted to be positive that are actually positive, and true negatives are cases predicted to be negative that are actually negative. False positives are cases predicted to be positive that are actually negative, and false negatives are cases predicted to be negative that are actually positive.

The performance of the support vector machine, naïve Bayes, k-nearest neighbor and logistic regression classifiers is evaluated on accuracy, sensitivity (recall), specificity and precision.

                 Predicted No      Predicted Yes
Actual No        True Negative     False Positive
Actual Yes       False Negative    True Positive

Table 1. Confusion Matrix Table

Accuracy is the ratio of the sum of true positives and true negatives to the total number of predicted samples, as seen in Equation 7.

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7} $$

Sensitivity, also called recall, is the ratio of true positive predictions to the sum of true positives and false negatives. Recall measures the completeness of the classifier: how many of the actual positives were detected as positive, as seen in Equation 8.

$$ \text{Sensitivity (recall)} = \frac{TP}{TP + FN} \tag{8} $$

Specificity is the ratio of true negatives to the sum of true negatives and false positives.

$$ \text{Specificity} = \frac{TN}{TN + FP} \tag{9} $$

Precision is the ratio of the number of true positives to the sum of true positives and false positives. It measures the quality of the positive predictions. The equation for precision can be seen in Equation 10.

$$ \text{Precision} = \frac{TP}{TP + FP} \tag{10} $$

Four classifier models are developed in this research, based on logistic regression, SVM, naïve Bayes and k-nearest neighbor. 80% of the sample is used for training the models, whereas 20% is set aside for testing. Specificity, precision, accuracy and sensitivity are used to evaluate the performance of the classifiers.

                                   Classifiers (%)
Metrics        Support Vector    K-nearest    Naïve    Logistic
               Machine           Neighbor     Bayes    Regression
Accuracy       97.53             96.91        95.99    99.074
Sensitivity    97.56             89.36        0        1
Specificity    97.53             98.19        1        98.92
Precision      85.1              89.36        1        93.61

Table 2. Comparison Table for the four classifiers

Table 2 shows the performance evaluation of the four models on this data distribution. Logistic regression achieved the highest accuracy of the four classifiers.

V. CONCLUSION
Four classifier models were developed in this study, based on Support Vector Machine, Naïve Bayes, K-Nearest Neighbor and Logistic Regression. 80% of the dataset was used for training and the remaining 20% for validation and testing. Precision, sensitivity, specificity and accuracy were used to assess performance. However, expecting balanced training and testing datasets drawn from the same distribution is unrealistic.

According to Table 2, when tested under realistic conditions, Logistic Regression was the most accurate in detecting credit card fraud.

Based on this study, a credit card company should consider implementing a Logistic Regression algorithm that analyzes the purchase time to determine whether a credit card transaction is fraudulent.

A. FUTURE WORKS
This research on credit card fraud detection has great potential for future work. If a dataset with non-anonymized fields were released to the public, the actual factors that can be tracked for credit card fraud detection could be identified. In addition, the results of this project were limited by the small number of fraudulent cases in the dataset.

With a larger dataset containing more fraudulent cases, the algorithms could be trained to make more accurate predictions. Pursuing these goals may require more computing power. Other strategies for avoiding bias, such as other resampling methods, cost-sensitive learning methods and ensemble learning techniques, could also be tried on future datasets to find the best way of dealing with a skewed dataset.

ACKNOWLEDGMENT
The authors gratefully acknowledge project mentor Bokefode Jayant for his expert guidance in machine learning and data science, and research coordinator Yogesh Kakde for his valuable assistance.

REFERENCES

[1] R. Harrow, Is Your Credit Card Less Secure Than Ever Before? Forbes, 20-Apr-2018. [Online].
[2] Maes, S., Tuyls, K., Vanschoenwinkel, B. and Manderick, B., (2002). Credit card fraud detection using Bayesian and neural networks. Proceedings International NAISO Congress on Neuro Fuzzy Technologies.
[3] Singh, G., Gupta, R., Rastogi, A., Chandel, M. D. S., and Riyaz, A., (2012). A Machine Learning Approach for Detection of Fraud based on SVM, International Journal of Scientific Engineering and Technology, Volume No.1, Issue No.3, pp. 194-198, ISSN: 2277-1581
[4] RamaKalyani, K. and UmaDevi, D., (2012). Fraud Detection of Credit Card Payment System by Genetic Algorithm, International Journal of Scientific & Engineering Research, Vol. 3, Issue 7, pp. 1-6, ISSN 2229-5518
[5] Patil, S., Somavanshi, H., Gaikwad, J., Deshmane, A., and Badgujar, R., (2015). Credit Card Fraud Detection Using Decision Tree Induction Algorithm, International Journal of Computer Science and Mobile Computing (IJCSMC), Vol.4, Issue 4, pp. 92-95, ISSN: 2320-088X
[6] Ogwueleka, F. N., (2011). Data Mining Application in Credit Card Fraud Detection System, Journal of Engineering Science and Technology, Vol. 6, No. 3, pp. 311-322
[7] Bahnsen, A. C., Stojanovic, A., Aouada, D., & Ottersten, B. (2014). Improving credit card fraud detection with calibrated probabilities. In Proceedings of the 2014 SIAM International Conference on Data Mining (pp. 677-685).
[8] J. Steele and J. Gonzalez, Credit card fraud and ID theft statistics, CreditCards.com. [Online].
[9] Seeja, K. R., and Zareapoor, M., (2014). FraudMiner: A Novel Credit Card Fraud Detection Model Based on Frequent Itemset Mining, The Scientific World Journal, Hindawi Publishing Corporation, Volume 2014, Article ID 252797, pp. 1-10.
[10] Data Analytics vs Data Science: Two Separate, but Interconnected Disciplines, Data Scientist Insights, 28-Apr-2018. [Online].
[11] D. T. Larose and C. D. Larose, Discovering knowledge in data: an introduction to data mining. Hoboken, NJ: John Wiley & Sons, 2014.
[12] Shen, A., Tong, R., & Deng, Y. (2007). Application of classification models on credit card fraud detection. In Service Systems and Service Management, 2007 International Conference on (pp. 1-4). IEEE.
[13] Chaudhary, K. and Mallick, B., (2012). Credit Card Fraud: The study of its impact and detection techniques, International Journal of Computer Science and Network (IJCSN), Volume 1, Issue 4, pp. 31-35, ISSN: 2277-5420
[14] Bhatla, T.P.; Prabhu, V.; and Dua, A. (2003). Understanding credit card frauds. Cards Business Review# 2003-1, Tata Consultancy Services
[15] The Nilson Report. (2015). U.S. Credit & Debit Cards 2015. David Robertson.
[16] Supervised and Unsupervised Machine Learning Algorithms, Machine Learning Mastery, 22-Sep-2016. [Online].
[17] What is Machine Learning? A definition, Expert System, 05-Oct-2017. [Online].
[18] C. Donalek, Supervised and Unsupervised Learning.
[19] Bolton, R. J. and Hand, D. J., (2001). Unsupervised profiling methods for fraud detection, Conference on Credit Scoring and Credit Control, Edinburgh.
[20] Bhattacharyya, S., Jha, S., Tharakunnel, K., & Westland, J. C. (2011). Data mining for credit card fraud: A comparative study. Decision Support Systems, 50(3), 602-613.
[21] Kou, Y., Lu, C-T., Sinvongwattana, S. and Huang, Y-P., (2004). Survey of Fraud Detection Techniques, In Proceedings of the 2004 IEEE International Conference on Networking, Sensing & Control, Taipei, Taiwan, March 21-23.
[22] Bahnsen, A. C., Stojanovic, A., Aouada, D., & Ottersten, B. (2013). Cost sensitive credit card fraud detection using Bayes minimum risk. In Machine Learning and Applications (ICMLA), 2013 12th International Conference on (Vol. 1, pp. 333-338). IEEE.
[23] Stolfo, S., Fan, D. W., Lee, W., Prodromidis, A., & Chan, P. (1997). Credit card fraud detection using meta-learning: Issues and initial results. In AAAI-97 Workshop on Fraud Detection and Risk Management.
