Credit Card Fraud Detection
https://doi.org/10.22214/ijraset.2022.47456
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue XI Nov 2022- Available at www.ijraset.com
I. INTRODUCTION
Fraud in credit card transactions is the unauthorized and unwanted use of an account by someone other than its owner. Preventive measures can be taken to stop this abuse, and the behavior behind fraudulent practices can be studied to minimize it and protect against similar occurrences in the future. In other words, credit card fraud is a case in which a person uses someone else's credit card for personal reasons while the owner and the card-issuing authority are unaware that the card is being used. The problem is particularly challenging from a learning perspective because of factors such as class imbalance: valid transactions far outnumber fraudulent ones. In addition, the statistical properties of transaction patterns often change over time.
II. SCOPE
Fraud detection involves monitoring the activities of populations of users in order to estimate, perceive, or avoid objectionable behavior such as fraud, intrusion, and defaulting. This is a highly relevant problem for communities such as machine learning and data science, where the solution can be automated.
A. Software Specifications
1) Google Colaboratory
B. Hardware Specifications
1) Microsoft® Windows® 7/8/10 (32- or 64-bit)
2) 3 GB RAM minimum, 8 GB RAM recommended
3) 2 GB of available disk space minimum
4) Intel Core i3 processor or above
C. Dataset
1) creditcard.csv, which is available on Kaggle (https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud)
D. Packages Required
1) ranger
2) caret
3) data.table
4) caTools
5) rpart.plot
6) neuralnet
7) gbm
8) pROC
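Before running the project, it can help to check which of the packages listed above are already installed. The following is a minimal sketch; the variable names required_pkgs, is_installed, and missing_pkgs are illustrative and do not come from the original project.

```r
# Packages listed above (names as given in the paper).
required_pkgs <- c("ranger", "caret", "data.table", "caTools",
                   "rpart.plot", "neuralnet", "gbm", "pROC")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
is_installed <- vapply(required_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  # Print the install command rather than installing automatically.
  message("Missing packages; install with: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```

Printing the command instead of calling install.packages() directly leaves the choice of mirror and library path to the user.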
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 988
V. IMPLEMENTATION
In the first step of this data science project, we perform data exploration. We import the essential packages required for this task and then read the data. Finally, we go through the input data to gain the necessary insights about it.
A. Data Exploration
First, we imported the dataset, which contains transactions made by credit cards. We then explored the data contained in the creditcard_data data frame. After displaying creditcard_data using the head() and tail() functions, we proceeded to explore the other components of this data frame.
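The exploration steps described above can be sketched as follows. So that the example runs on its own, it builds a small synthetic stand-in for creditcard_data; the generated values and the 2% fraud rate are assumptions made for illustration, not properties of the Kaggle file. Only the column names Time, V1, V2, Amount, and Class follow the real dataset, which has V1 through V28.

```r
# Hypothetical stand-in for creditcard.csv so the example is self-contained.
# With the real file, one would instead use:
#   creditcard_data <- read.csv("creditcard.csv")
set.seed(1)
n <- 1000
creditcard_data <- data.frame(
  Time   = sort(round(runif(n, 0, 172800))),  # seconds since the first transaction
  V1     = rnorm(n),
  V2     = rnorm(n),
  Amount = round(rexp(n, rate = 1 / 90), 2),
  Class  = rbinom(n, 1, 0.02)                 # 1 = fraud, 0 = legitimate
)

dim(creditcard_data)                       # number of rows and columns
head(creditcard_data, 6)                   # first six transactions
tail(creditcard_data, 6)                   # last six transactions
table(creditcard_data$Class)               # count of each class
prop.table(table(creditcard_data$Class))   # share of each class
summary(creditcard_data$Amount)            # distribution of transaction amounts
```

The table() and prop.table() calls make the class imbalance visible before any model is trained.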
B. Data Manipulation
In this section of the project, we scaled the data using the scale() function. We applied it to the Amount column of creditcard_data. Scaling standardizes the column to zero mean and unit standard deviation. It does not remove extreme values, but it expresses them in standard deviation units, so that large raw amounts are less likely to interfere with the functioning of the model.
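To make the effect of scale() concrete, here is a small worked example on made-up amounts. The values in amount are hypothetical; in the project the same call would be applied to the Amount column of creditcard_data.

```r
# Hypothetical transaction amounts, including one very large value.
amount <- c(2.50, 10.00, 25.99, 49.90, 120.00, 2500.00)

# scale() centers to mean 0 and divides by the standard deviation.
# It returns a one-column matrix, so as.numeric() flattens it to a vector.
amount_scaled <- as.numeric(scale(amount))

# The same result computed by hand.
amount_manual <- (amount - mean(amount)) / sd(amount)

mean(amount_scaled)                        # zero up to floating-point error
sd(amount_scaled)                          # one up to floating-point error
all.equal(amount_scaled, amount_manual)    # TRUE
which.max(amount_scaled)                   # the 2500.00 transaction is still the largest
```

The last line shows that the ordering of the values is unchanged: standardization rescales the column but keeps outliers as outliers.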
C. Data Modelling
After standardization, we split the dataset into a training set and a test set with a split ratio of 0.80. This means that 80% of the rows are assigned to train_data and the remaining 20% to test_data. We then checked the dimensions of both sets using the dim() function.
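A minimal sketch of such a split is shown below. It assumes that sample.split() from the caTools package (listed under Packages Required) is used, since the original code for this step is not shown. The stand-in data frame and the name data_sample are illustrative.

```r
library(caTools)

# Hypothetical stand-in data; the real project uses creditcard_data from the CSV file.
set.seed(123)
n <- 1000
creditcard_data <- data.frame(
  V1     = rnorm(n),
  Amount = rnorm(n),
  Class  = c(rep(0, 980), rep(1, 20))   # 2% fraud, chosen only for illustration
)

# sample.split() returns a logical vector marking the training rows.
# It splits within each Class value, so both sets keep a similar fraud share.
data_sample <- sample.split(creditcard_data$Class, SplitRatio = 0.80)
train_data  <- subset(creditcard_data, data_sample == TRUE)
test_data   <- subset(creditcard_data, data_sample == FALSE)

dim(train_data)   # about 80% of the rows
dim(test_data)    # about 20% of the rows
```

Splitting on the Class column matters for imbalanced data: a purely random split could leave the test set with very few fraudulent rows.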
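Code:
# Fit a classification tree on the data and plot it
library(rpart)
library(rpart.plot)
decisionTree_model <- rpart(Class ~ ., data = creditcard_data, method = 'class')
# Predicted class labels and class probabilities for each transaction
predicted_val <- predict(decisionTree_model, creditcard_data, type = 'class')
probability <- predict(decisionTree_model, creditcard_data, type = 'prob')
rpart.plot(decisionTree_model)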
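The listing above fits the tree and predicts on the same data frame. As a complementary sketch, and not the authors' code, the example below fits a tree on training rows only and summarizes predictions on held-out rows with a confusion matrix. The data are synthetic, and the names toy_data, train_toy, test_toy, and conf_mat are hypothetical.

```r
library(rpart)

# Hypothetical data in which large V1 values make fraud more likely.
set.seed(42)
n  <- 2000
V1 <- rnorm(n)
V2 <- rnorm(n)
Class <- factor(as.integer(V1 + rnorm(n, sd = 0.3) > 1.5))
toy_data <- data.frame(V1 = V1, V2 = V2, Class = Class)

# Simple 80/20 split by row index (not stratified).
train_idx <- sample(seq_len(n), size = 0.8 * n)
train_toy <- toy_data[train_idx, ]
test_toy  <- toy_data[-train_idx, ]

# Fit on the training rows only, then predict the held-out rows.
tree_model <- rpart(Class ~ ., data = train_toy, method = "class")
test_pred  <- predict(tree_model, test_toy, type = "class")

# Rows: actual class, columns: predicted class.
conf_mat <- table(actual = test_toy$Class, predicted = test_pred)
conf_mat

# Recall on the fraud class: share of actual frauds that were flagged.
recall <- conf_mat["1", "1"] / sum(conf_mat["1", ])
recall
```

With imbalanced classes, recall on the fraud class is usually more informative than overall accuracy, because a model that never predicts fraud can still be right most of the time.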
D. AUC-ROC Curve
In the last section of the project, we calculated and plotted the ROC curve, which traces the sensitivity and specificity of the model across classification thresholds. The roc() call with plot = TRUE draws the curve, and print() reports the area under the curve (AUC). The AUC summarizes how well the model separates fraudulent from legitimate transactions: 0.5 corresponds to random guessing and 1 to perfect separation.
Code:
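# Plot the ROC curve and calculate the AUC on the test data.
# model_gbm and gbm.iter come from the GBM training step, which is not shown here.
library(gbm)
library(pROC)
gbm_test <- predict(model_gbm, newdata = test_data, n.trees = gbm.iter)
gbm_auc <- roc(test_data$Class, gbm_test, plot = TRUE, col = "red")
print(gbm_auc)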
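The AUC has a direct interpretation: it is the probability that a randomly chosen fraudulent transaction receives a higher score than a randomly chosen legitimate one. The sketch below computes it from ranks in base R on made-up labels and scores. The helper auc_by_rank is hypothetical and is not part of pROC; it is meant only to illustrate what the reported number means.

```r
# AUC computed from ranks (equivalent to the Mann-Whitney U statistic).
# auc_by_rank is a hypothetical helper, not a pROC function.
auc_by_rank <- function(labels, scores) {
  r     <- rank(scores)            # average ranks handle tied scores
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

labels <- c(0, 0, 0, 1, 1)   # 1 = fraud, 0 = legitimate

# Both frauds score above every legitimate transaction.
auc_by_rank(labels, c(0.10, 0.40, 0.35, 0.80, 0.70))   # 1

# One fraud scores low: only 3 of the 6 fraud/legitimate pairs are ordered correctly.
auc_by_rank(labels, c(0.10, 0.90, 0.35, 0.80, 0.30))   # 0.5
```

Because the calculation depends only on the ordering of the scores, it gives the same result whether the model outputs probabilities or unbounded scores.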
ANN Model
IX. CONCLUSION
Concluding our R data science project, we learned how to develop a credit card fraud detection model using machine learning. We used a variety of ML algorithms to implement this model and plotted the respective performance curves. We also learned how data can be analyzed and visualized to distinguish fraudulent transactions from legitimate ones. We hope you enjoyed this credit card fraud detection project using machine learning in R.