191 - 197 - Detection of Transaction Fraud Using Deep Learning
Made by
Name-Adrineel Saha
Enrollment no-12019002003114
Sec-C
Roll no-191
Year-3rd
S. Makki, Z. Assaghir, Y. Taher, R. Haque, M. Hacid and H. Zeineddine report that credit card fraud causes huge
financial losses. Many researchers have worked on innovative ways to reduce these losses, but most of the available
methods are costly, time-consuming and labor-intensive. After many experimental studies, the authors found that
class imbalance in the dataset is the main reason for inaccurate results: a model trained on an unbalanced dataset
predicts inaccurately, which leads to financial loss. They found that LR, the C5.0 decision tree algorithm, SVM
and ANN are the best algorithms based on accuracy, AUCPR and sensitivity, and they used a balanced dataset to
train these models [2].
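The effect of class imbalance on accuracy can be illustrated with a small sketch. The labels below are hypothetical toy data (990 legitimate and 10 fraudulent transactions), not figures from [2]; the point is only that a model which never predicts fraud still scores high accuracy while its sensitivity is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = fraud and 0 = legitimate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all transactions that were classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(y_true, y_pred):
    """Fraction of actual fraud cases that were detected (recall)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced toy labels: 990 legitimate, 10 fraudulent.
y_true = [0] * 990 + [1] * 10
# A degenerate "model" that always predicts "legitimate".
always_legit = [0] * len(y_true)

print("accuracy:", accuracy(y_true, always_legit))
print("sensitivity:", sensitivity(y_true, always_legit))
```

This is why accuracy is reported together with sensitivity and AUCPR when the classes are unbalanced.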
Debachudamani Prusti and Santanu Kumar Rath designed an application that applies machine learning approaches
such as decision tree (DT), k-nearest neighbour (kNN), extreme learning machine (ELM), multilayer perceptron (MLP)
and support vector machine (SVM) to evaluate accuracy in fraud identification. They proposed a model that hybridizes
the DT, SVM and kNN techniques. They used two web-based protocols, simple object access protocol (SOAP)
and representational state transfer (REST), for efficient exchange of data across multiple heterogeneous platforms.
They compared the results of the five machine learning algorithms using the accuracy metric. SVM performed better
than the other individual algorithms with an accuracy of 81.63%, but the proposed hybrid system achieved a higher
accuracy of 82.58% [3].
In Chouiekha and El Haj’s paper [4], Convolutional Neural Networks (CNN) are used for
fraud detection. A database was created with 18000 artificial images of the activity of
300 customers over 60 days. They used Customer Details Records in such a way
that a long conversation or an unusual number of vouchers used would be detected. The CNN
is applied to the images to detect fraudulent activity. 50% of the data set was used for
training, 25% for validation and 25% for testing. The images were rescaled to
improve classifier performance. The proposed Deep CNN (DCNN) contained 7 layers:
3 convolutional layers, 2 pooling layers, 1 fully connected layer and finally 1 SoftMax
regression layer. Results were evaluated using accuracy. The Deep CNN’s performance was
compared against SVM, Random Forest and Gradient Boosting Classifier (GBC). The
results show that DCNN outperforms SVM by 5%, Random Forest by 10% and GBC by
3%. The Deep CNN was also found to train almost twice as fast as the other methods.
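The data set figures above can be checked with a few lines of arithmetic. This is a sketch only: it assumes the 18000 images correspond to one image per customer per day (300 x 60), which is consistent with the numbers quoted but is not stated explicitly here, and `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_samples, fractions):
    """Turn split fractions into whole sample counts; any remainder goes to the first split."""
    sizes = [int(n_samples * f) for f in fractions]
    sizes[0] += n_samples - sum(sizes)
    return sizes


# Assumption: one artificial image per customer per day.
n_images = 300 * 60

# 50% training, 25% validation, 25% testing, as described for [4].
n_train, n_val, n_test = split_sizes(n_images, (0.50, 0.25, 0.25))
print(n_images, n_train, n_val, n_test)
```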
In his bachelor thesis [5], Tom Sweers describes autoencoders as an effective type of
neural network that learns to encode the data and to decode it again. In this
approach the autoencoder is trained on non-anomalous points only. It is then presented
with new points and classifies each as ‘fraud’ or ‘no fraud’ according to its reconstruction
error, which is expected to be high for anomalies that the system has not
been trained on. Any value above an upper bound, or threshold, is
considered an anomaly.
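The thresholding step can be sketched as follows. The autoencoder itself is not shown: `x_hat` stands for whatever reconstruction it produces, all numbers are made up for illustration, and setting the threshold to the mean plus three standard deviations of the errors on normal data is an assumption, not necessarily the rule used in [5].

```python
from statistics import mean, pstdev


def reconstruction_error(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(normal_errors, k=3.0):
    """Upper bound for errors on normal points: mean + k * std (k is an assumption)."""
    return mean(normal_errors) + k * pstdev(normal_errors)


def classify(error, threshold):
    """Label a point by comparing its reconstruction error with the threshold."""
    return "fraud" if error > threshold else "no fraud"


# Hypothetical reconstruction errors measured on normal training transactions.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010]
threshold = fit_threshold(normal_errors)

# Made-up inputs and reconstructions: one reconstructed well, one reconstructed poorly.
err_ok = reconstruction_error([0.2, 0.5, 0.1], [0.25, 0.45, 0.15])
err_bad = reconstruction_error([0.9, 0.1, 0.8], [0.3, 0.5, 0.2])

print(classify(err_ok, threshold), "/", classify(err_bad, threshold))
```

A lower threshold catches more fraud at the cost of more false alarms, so in practice the value would be tuned on validation data.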
Dataset Used
The financial dataset is simulated by PaySim, which generates mobile money
transactions based on a sample of real transactions. These
transactions are drawn from one month of financial logs of a mobile
money service. The dataset consists of 6362620 online transaction
records during COVID-19, and each record is formulated as a collection
of several attributes. The non-numeric data present in the dataset is
transformed into numeric data. Next, all the numeric data is scaled
into the range from 0 to 1. This yields the pre-processed
dataset to which the proposed classifier is applied. The suspicious
transactions are found among the cash-out and transfer transaction types. The attribute
‘isFraud’ is kept as the target variable of the classification procedure.
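The two pre-processing steps described above (non-numeric to numeric, then scaling to the range 0 to 1) can be sketched in plain Python. The rows are made-up examples; the column names `type` and `amount` are assumptions for illustration (only ‘isFraud’ is named above), and integer coding is just one possible way to make the non-numeric data numeric.

```python
def encode_categories(values):
    """Map each distinct non-numeric value to an integer code (sorted for determinism)."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


# Made-up example rows; 'type' and 'amount' are assumed column names.
rows = [
    {"type": "CASH_OUT", "amount": 181.0, "isFraud": 1},
    {"type": "PAYMENT", "amount": 9839.64, "isFraud": 0},
    {"type": "TRANSFER", "amount": 181.0, "isFraud": 1},
    {"type": "PAYMENT", "amount": 1864.28, "isFraud": 0},
]

type_codes, mapping = encode_categories([r["type"] for r in rows])
scaled_type = min_max_scale(type_codes)
scaled_amount = min_max_scale([r["amount"] for r in rows])

features = list(zip(scaled_type, scaled_amount))  # model inputs
labels = [r["isFraud"] for r in rows]             # target variable

print(mapping)
print(features)
print(labels)
```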
Proposed Methodology
The aim of the paper is to detect suspicious money transaction activities
during COVID-19. A classifier model maps input data to output classes
after learning from training data. A stacked RNN-based model is proposed as
the classifier that identifies transactions that may be deceptive.
Multiple RNN layers are stacked to obtain the
proposed model: four simple RNN layers along with four dropout layers are
incorporated into a sequential model. Initially the neural network model is
configured and the training process is started. One full pass of the training
process over the dataset is known as an epoch. During an epoch the dataset is
partitioned into smaller sections called batches, and training iterates over
these batches until the epoch is complete.
Because the task is a binary classification
problem, the binary cross-entropy function is used as the training criterion.
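The forward pass of such a stacked model, the loss and the batch partitioning can be sketched in plain Python. This is an illustration under assumptions, not the trained model: the layer width (4 units), the random weights, the sigmoid output unit and the toy input are all hypothetical, dropout is left out because it is only active during training, and the weight updates themselves are not shown.

```python
import math
import random


def simple_rnn_layer(sequence, w_in, w_rec, bias):
    """Run one simple (Elman) RNN layer over a sequence; return the hidden state at every step."""
    units = len(bias)
    h = [0.0] * units
    outputs = []
    for x in sequence:
        # New state depends on the current input and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(units))
                + bias[j]
            )
            for j in range(units)
        ]
        outputs.append(h)
    return outputs


def random_matrix(n_rows, n_cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_cols)] for _ in range(n_rows)]


def build_stack(n_features, units, n_layers, rng):
    """Create randomly initialised weights for n_layers stacked RNN layers plus one output unit."""
    layers, in_dim = [], n_features
    for _ in range(n_layers):
        layers.append((random_matrix(units, in_dim, rng),
                       random_matrix(units, units, rng),
                       [0.0] * units))
        in_dim = units  # the next layer reads the previous layer's hidden states
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(units)]
    return layers, w_out


def predict(sequence, layers, w_out):
    """Pass the sequence through every layer, then squash the last state to a fraud probability."""
    for w_in, w_rec, bias in layers:
        sequence = simple_rnn_layer(sequence, w_in, w_rec, bias)
    z = sum(w * h for w, h in zip(w_out, sequence[-1]))
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Loss for one sample; the probability is clipped to avoid log(0)."""
    p = min(max(y_prob, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def batches(items, batch_size):
    """Partition the training data into consecutive batches; one pass over all of them is an epoch."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


rng = random.Random(0)
layers, w_out = build_stack(n_features=3, units=4, n_layers=4, rng=rng)
toy_sequence = [[0.1, 0.5, 0.0], [0.3, 0.2, 1.0]]  # two time steps, three scaled features
prob = predict(toy_sequence, layers, w_out)
loss = binary_cross_entropy(1, prob)
print(prob, loss)
```

In Keras, the same structure would typically be expressed with `SimpleRNN` and `Dropout` layers in a `Sequential` model compiled with the `binary_crossentropy` loss; the layer sizes and dropout rates would have to come from the actual implementation.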
Implementation
Import Libraries:-
Numpy
Pandas
Matplotlib
Seaborn
Warnings
Sklearn
Inflection
Joblib
Scipy
Tensorflow
Load Dataset:-
fraud_0.1origbase.csv
Exploring Dataset and Describing Numerical Attributes:-
Check Data Type of the Features and missing values:-
Exploring all types of data by Plotting:-
Univariate Analysis
Numerical Variables
Categorical Variables
Bivariate Analysis
The majority of fraud transactions occur for the same user
All fraud amounts are greater than 10,000
60% of fraud transactions occur using the cash-out-type method
Values greater than 100,000 occur using the transfer-type method
Fraud transactions occur in at least 3 days
Multivariate Analysis
Machine Learning Modeling
Baseline
Logistic Regression
K Nearest Neighbours
Comparing Model Performance
Hyperparameter Fine Tuning
Training Process of RNN
Final Model
Future Scope
1) We will incorporate a website implementation for
model deployment.
Conclusion
Due to the increasing demand for mobile money transfer, it is necessary
to detect fraudulent activity during transactions. This has become
unavoidable during COVID-19. Detecting illegal attempts will protect
customers from being harassed by financial disputes. The study covers
the period from the announcement of COVID-19 to the first unlock period
announced by the Government. The main aim of the study is to minimize
fraud as far as possible. It shows that the method is practical and
highly suitable for implementation in the present scenario. The
proposed method is favourable because it is applicable to large
financial datasets. An efficient, low-error system is required in
the field of mobile transactions, since it will notify customers by
flagging deceptive transactions.
References
[1] A.A. Taha, S.J. Malebary. “An intelligent approach to credit card fraud detection using an optimized light
gradient boosting machine”. IEEE Access, 8 (2020), pp. 25579-25587, 10.1109/ACCESS.2020.2971354
[2] S. Makki, Z. Assaghir, Y. Taher, R. Haque, M. Hacid, H. Zeineddine. “An experimental study with
imbalanced classification approaches for credit card fraud detection”. IEEE Access, 7 (2019), pp. 93010-93022,
10.1109/ACCESS.2019.2927266
[3] D. Prusti, S.K. Rath. “Web service based credit card fraud detection by applying machine learning
techniques”. Proceedings of the TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), Kochi, India
(2019), pp. 492-497, 10.1109/TENCON.2019.8929372
[4] Alae Chouiekha, EL Hassane Ibn EL Haj. “ConvNets for Fraud Detection analysis”. Procedia Computer
Science, 127 (2018), pp. 133-138
[5] Tom Sweers. “Autoencoding Credit Card Fraud”. Bachelor Thesis, Radboud University, June 2018