Credit Card Fraud Detection Report
BY
Shubham Chavan
Roll no: 11
Prajakta Ghuge
Roll no: 19
Guide:
Prof. Rajashree Pawar
UNIVERSITY OF MUMBAI
(2023-2024)
CERTIFICATE
This is to certify that the project entitled “Credit Card Fraud Detection” is a
bonafide work of “Shubham Chavan (Roll no: 11), Prajakta Ghuge (Roll no:
19)” submitted to the University of Mumbai in partial fulfillment of the
requirement for the award of the degree of “B.E.” in “Computer Engineering”.
-------------------------------------------------
(Prof. Rajashree Pawar)
Guide
Department Of Computer Engineering
Abstract
With the rapid growth of e-commerce, the use of credit cards for online purchases has
increased dramatically, and credit card fraud has risen with it. As credit cards have
become the most popular mode of payment for both online and in-person purchases, the
associated fraud cases are also rising. In real life, fraudulent transactions are scattered
among genuine ones, and simple pattern-matching techniques are often not sufficient to
detect them accurately.
The purpose of this project is to detect fraudulent credit card transactions using machine
learning techniques, so that fraudsters can be stopped from making unauthorized use of
customers’ accounts. Credit card fraud is growing rapidly worldwide, which is why action
is needed. Detecting fraud early benefits customers: they are not charged for items or
services they did not purchase, and lost money can be recovered and returned to their
accounts. This is the main goal of the project. Fraudulent transactions are detected using
three machine learning techniques, KNN, SVM and Logistic Regression, applied to a credit
card transaction dataset.
Acknowledgment
We would like to take this opportunity to thank our project guide, Prof. Rajashree Pawar, for
her timely assistance with our queries and for the guidance she gave us, drawing on her many
years of experience in this field. She has indeed been a lighthouse for us on this journey.
INDEX …
Abstract
Introduction
Scope of the project
Analysis
Algorithm
Requirements
Code
Result
Conclusion
INTRODUCTION :
Credit cards are now one of the most preferred ways for customers to pay, both offline and online.
For a number of reasons, consumers are gradually shifting from debit card transactions to
credit cards, especially in developing countries like India.
As more people use credit cards in their daily lives, credit card companies must take special
care of their customers' security and safety.
Credit card usage has increased drastically across the world; many people now prefer going
cashless and depend heavily on online transactions. The credit card has made digital
transactions easier and more accessible. Fraudulent credit card transactions cause large
financial losses every year. Fraud is as old as mankind itself and can take an unlimited
variety of forms.
The rapid and sustained growth of these losses motivated us to approach the problem
analytically, using different machine learning methods to pick out fraudulent credit card
transactions from among the many genuine ones.
Many financial companies and institutions lose massive amounts of money to fraud, and
fraudsters continuously seek new ways to break the rules and commit illegal actions.
Fraud detection systems are therefore essential for all banks that issue credit cards to
reduce their losses.
Fraud detection involves monitoring the activities of populations of users in order to estimate,
perceive or avoid objectionable behaviour such as fraud, intrusion, and defaulting.
Machine learning algorithms are employed to analyse all authorized transactions and report
the suspicious ones. These reports are investigated by professionals, who contact the
cardholders to confirm whether each transaction was genuine or fraudulent.
Scope Of The Project :
The credit card fraud detection feature uses user behavior and location scanning to check for
unusual patterns. These patterns include user characteristics such as spending patterns.
We used supervised machine learning algorithms to detect fraudulent credit card transactions
in a real dataset, using them to build classification models.
We identified key variables that lead to greater accuracy in detecting fraudulent credit card
transactions.
System Analysis :
Data Collection: First, we collect the dataset from sources such as GitHub and Kaggle.
Data Storage: After collecting the dataset, we store it in a specific folder on our device.
Data Processing: The dataset is processed using tools such as VS Code, Spyder and Jupyter
Notebook. The dataset file is added in CSV form and read using the pandas and NumPy
libraries.
Data Visualization: The processed data is displayed through visualizations such as charts,
graphs, and statistics, using libraries such as Matplotlib and Seaborn.
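The processing step described above can be sketched as follows. This is a minimal illustration, not the project's own code: it assumes a dataset with `Amount` and `Class` columns (as used in the Code section), and it reads a tiny made-up in-memory sample instead of the real CSV file so that it runs without the dataset. The helper name `summarize` is hypothetical.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for the real CSV file (illustrative values only).
SAMPLE_CSV = """Amount,Class
12.50,0
250.00,0
3.99,0
1800.00,1
42.10,0
"""


def summarize(csv_source):
    """Read transactions and report row count, fraud count and fraud share."""
    df = pd.read_csv(csv_source)
    counts = df["Class"].value_counts()
    n_fraud = int(counts.get(1, 0))  # Class == 1 marks a fraudulent transaction
    return {
        "rows": len(df),
        "fraud": n_fraud,
        "fraud_share": n_fraud / len(df),
        "mean_amount_by_class": df.groupby("Class")["Amount"].mean().to_dict(),
    }


summary = summarize(io.StringIO(SAMPLE_CSV))
print(summary)
```

With the real file, the same function could be called with the path to the CSV instead of the in-memory sample. Checking the class balance first is useful because fraud datasets are usually heavily imbalanced, which affects how the later plots and accuracy figures should be read.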
Algorithms :
1) Logistic Regression: Logistic Regression is a statistical model that, in its basic form, uses
a logistic function to model a binary dependent variable.
2) Confusion Matrix: In machine learning, and specifically in statistical classification, a
confusion matrix (also known as an error matrix) is a table layout that allows the performance
of an algorithm, typically a supervised learning one, to be visualized.
3) XGBoost: XGBoost is a decision-tree-based ensemble machine learning algorithm that uses
a gradient boosting framework.
4) Decision Trees: A Decision Tree (DT) is a supervised ML approach used to solve regression
and classification tasks. A DT contains the following types of nodes: root node, decision node
and leaf node. The root node is the starting point of the algorithm. A decision node is a point
where a choice is made in order to split the tree. A leaf node represents a final decision.
5) Random Forests: Tree-based models stratify the predictor space into simple regions that can
be used to classify the dependent variable. A random forest combines the predictions of many
such trees.
6) Neural Networks: Neural Networks are a popular set of machine learning algorithms that
are widely used for credit card fraud detection. Conceptually, a neural network is composed of
simple elements called neurons that receive inputs, change their internal state based on that
input, and produce an output based on an activation function.
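Items 1 and 2 above can be made concrete with a short sketch in plain Python. It is an illustration under stated assumptions, not the project's model: the weight and bias are picked by hand rather than fitted, the amounts and labels are made up, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(amount, weight, bias):
    """Probability of fraud from a one-feature logistic model."""
    return sigmoid(weight * amount + bias)


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) counts for binary labels 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1          # fraud correctly flagged
        elif t == 1:
            fn += 1          # fraud missed
        elif p == 1:
            fp += 1          # genuine transaction flagged by mistake
        else:
            tn += 1          # genuine transaction correctly passed
    return tn, fp, fn, tp


# Hypothetical weight and bias, chosen by hand for illustration (not fitted).
WEIGHT, BIAS = 4.0, -2.0

amounts = [0.05, 0.10, 0.45, 0.60, 0.90, 0.95]   # amounts scaled to [0, 1]
y_true = [0, 0, 1, 0, 1, 1]                      # made-up labels
y_pred = [1 if predict_proba(a, WEIGHT, BIAS) >= 0.5 else 0 for a in amounts]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
precision = tp / (tp + fp)   # share of flagged transactions that are fraud
recall = tp / (tp + fn)      # share of fraud that was flagged
print(tn, fp, fn, tp, precision, recall)
```

For fraud detection the four counts are more informative than plain accuracy: when fraud is rare, a model that flags nothing can still score a high accuracy while its recall is zero.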
Requirements :
Software Requirements :
. Operating System = Windows 8/10
. IDE Tool = Visual Studio Code & Jupyter Notebook
. Language = Python
. Libraries = pandas, NumPy, Matplotlib, Seaborn, SciPy, scikit-learn, PyOD
Hardware Requirements :
. Processor = Intel Pentium / Core i3 or higher
. RAM = 4 GB or higher
. Hard Disk Drive = 20 GB (free)
Code :
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pyod.models.abod import ABOD
from pyod.models.cblof import CBLOF
from pyod.models.feature_bagging import FeatureBagging
from pyod.models.hbos import HBOS
from pyod.models.iforest import IForest
from pyod.models.knn import KNN
from pyod.models.lof import LOF
from scipy import stats
import matplotlib
import seaborn as sns
df = pd.read_csv("creditcard.csv")
print(df.columns)
df.plot.scatter('Amount', 'Class')
plt.show()
fig, ax = plt.subplots(figsize=(20,10))
# Scale Amount to the [0, 1] range (min-max normalization)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0,1))
df[['Amount']] = scaler.fit_transform(df[['Amount']])
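X1 = df['Amount'].values.reshape(-1,1)
X2 = df['Class'].values.reshape(-1,1)
# stack the two columns into one (n_samples, 2) feature matrix
X = np.concatenate((X1, X2), axis=1)
random_state = np.random.RandomState(42)
outliers_fraction = 0.05
# ASSUMPTION: the listing does not show how clf, clf_name, xx, yy and threshold are
# defined. The setup below is an assumed minimal version using the KNN detector
# imported above; it is not the original code.
clf_name, clf = 'K Nearest Neighbors (KNN)', KNN(contamination=outliers_fraction)
xx, yy = np.meshgrid(np.linspace(0, 1, 200), np.linspace(0, 1, 200))
clf.fit(X)
# negate the scores so that outliers get the lowest values
scores_pred = clf.decision_function(X) * -1
# score below which the assumed fraction of outliers falls
threshold = stats.scoreatpercentile(scores_pred, 100 * outliers_fraction)
# decision function calculates the raw anomaly score for every point
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]) * -1
Z = Z.reshape(xx.shape)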
# fill blue map colormap from minimum anomaly score to threshold value
plt.contourf(xx, yy, Z, levels=np.linspace(Z.min(), threshold, 7),cmap=plt.cm.Blues_r)
# fill orange contour lines where range of anomaly score is from threshold to maximum
# anomaly score
plt.contourf(xx, yy, Z, levels=[threshold, Z.max()],colors='orange')
plt.axis('tight')
plt.xlim((0, 1))
plt.ylim((0, 1))
plt.title(clf_name)
plt.show()
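The scoring and thresholding idea behind the listing above can be tried without the dataset or PyOD. The sketch below is a plain NumPy stand-in for a k-nearest-neighbour detector, not PyOD's implementation: the data are made up, `k = 5` is an assumed setting, and, unlike the listing, the scores are not negated, so a higher score means a more anomalous point.

```python
import numpy as np

rng = np.random.RandomState(42)

# Made-up data: 99 points clustered near (0.5, 0.5) plus one planted far-away point.
inliers = 0.5 + 0.05 * rng.randn(99, 2)
planted = np.array([[0.95, 0.05]])
X = np.vstack([inliers, planted])

outliers_fraction = 0.05   # same fraction as in the listing
k = 5                      # assumed number of neighbours

# Pairwise Euclidean distances, shape (n, n).
diff = X[:, None, :] - X[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))

# Anomaly score = distance to the k-th nearest neighbour.
# Column 0 of each sorted row is the point's distance to itself (0),
# so column k holds the distance to the k-th neighbour.
scores = np.sort(dist, axis=1)[:, k]

# Flag the highest-scoring fraction of points as outliers.
threshold = np.percentile(scores, 100 * (1 - outliers_fraction))
is_outlier = scores > threshold

print("threshold:", threshold)
print("flagged indices:", np.flatnonzero(is_outlier))
```

The percentile threshold always flags roughly the requested fraction of points, whether or not that many are truly anomalous, so `outliers_fraction` acts as a budget for manual review rather than an estimate of the real fraud rate.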
Results :
. Amount Plot :
. Balanced Plot :
. Scatter Plot :
Conclusion :
Credit card fraud is a persistent problem that can lead to significant financial losses for
individuals and businesses alike. With the increasing reliance on electronic payments, detecting
and preventing fraud has become a crucial task for financial institutions.
In recent years, various techniques and algorithms have been developed to improve the
accuracy and efficiency of credit card fraud detection. These techniques include rule-based
Machine learning algorithms, in particular, have shown promising results in detecting credit
card fraud, as they can learn from large datasets and identify patterns that are difficult for