Bank Fraud Detection Project

This presentation is about the prediction of bank fraud transactions using advanced Machine Learning techniques

Uploaded by

arunkumarr9791


EMPOWERING FINANCIAL SECURITY: DETECTING FRAUDULENT TRANSACTIONS USING ADVANCED MACHINE LEARNING TECHNIQUES AND PREDICTIVE ANALYTICS

PRESENTED BY: ARUN KUMAR R
DATA SCIENCE TRAINEE
LEARNBAY
AGENDA

1. Introduction to the problem

2. Data Collection and Preprocessing
3. Exploratory Data Analysis
4. Model Selection and Evaluation
5. Results and Conclusion
6. Q&A
1. INTRODUCTION
• According to the Market Statsville Group (MSG), the global e-commerce fraud prevention market is expected to grow from USD 38,714.0 million in 2022 to USD 303,870.4 million by 2033, a CAGR of 20.6% from 2023 to 2033.
• Indian banks reported losses of Rs 4.69 lakh crore to fraud between June 1, 2014, and March 31, 2023, across around 65,017 reported cases.
• In FY2023, the banking system recorded 13,530 fraud cases. Almost 49 per cent of them (6,659 cases) were in the digital payment (card/internet) category.
• India lost at least Rs 100 crore every day to bank fraud or scams over the past seven years.
• In financial year 2023, the Reserve Bank of India (RBI) reported more than 13 thousand bank fraud cases across India. The total value of bank frauds decreased from 1.38 trillion Indian rupees to 302 billion Indian rupees.
10 TYPES OF BANKING FRAUDS IN INDIA

1. Phishing: creating fake websites to gather important information.
2. Vishing: fraudsters call customers, posing as banks or institutions, to gather information.
3. Frauds using online sales platforms
4. Frauds due to the use of unknown/unverified mobile apps
5. ATM card skimming
6. Frauds using screen-sharing apps/remote access
7. SIM swap or SIM cloning
8. Frauds by compromising credentials through search engine results
9. Scams through QR code scans
10. Impersonation on social media
PROBLEM STATEMENT

• Develop a machine learning model to detect potentially fraudulent transactions based on the provided features.
• The dataset contains information about various transactions, including account age, payment method, time of transaction, and category.
• The goal is to build a classification model that can accurately classify transactions as either legitimate or potentially fraudulent.
DATA DICTIONARY
• accountAgeDays: The number of days the account has been active.
• numItems: The number of items associated with the account.
• localTime: Some measure of time, possibly in hours or a similar unit.
• paymentMethod: The method used for payment (e.g., PayPal, store credit, credit card).
• paymentMethodAgeDays: The number of days since the current payment method (e.g., PayPal, credit card) was linked to the account.
• isWeekend: A binary indicator of whether the transaction occurred on a weekend (1 for yes, 0 for no).
• Category: The category of the transaction (e.g., electronics, shopping, food).
• label (target column): A binary label (0 for legitimate, 1 for potentially fraudulent).
DATA STRUCTURE
• Number of columns: 8
• Number of rows: 38662

DATA DISTRIBUTION
2. DATA CLEANING AND PREPROCESSING
• Duplicate values
• Treating missing values
• Encoding
• Outlier treatment
• Feature scaling
• Imbalanced data treatment: Random Over Sampler
DUPLICATE VALUES

• 3033 duplicate rows
• 7.73% of the total data
• Built two models, one with and one without the duplicate rows
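
The duplicate check can be illustrated with a dependency-free sketch. The toy rows and the helper names `count_duplicate_rows` and `drop_duplicate_rows` are assumptions made for illustration, not the code behind the slides; a duplicate here means any exact repeat of an earlier row.

```python
from collections import Counter


def count_duplicate_rows(rows):
    """Return (number of repeated rows, their share of all rows)."""
    counts = Counter(tuple(row) for row in rows)
    n_dup = sum(c - 1 for c in counts.values())
    return n_dup, n_dup / len(rows)


def drop_duplicate_rows(rows):
    """Keep the first occurrence of every row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical rows: (accountAgeDays, numItems, paymentMethod, label)
toy_rows = [
    (29, 1, "paypal", 0),
    (29, 1, "paypal", 0),
    (725, 1, "creditcard", 0),
    (1, 5, "creditcard", 1),
    (29, 1, "paypal", 0),
]

n_dup, share = count_duplicate_rows(toy_rows)
deduped = drop_duplicate_rows(toy_rows)
print(n_dup, share, len(deduped))
```

Keeping both the original and the de-duplicated rows makes it possible to train one model on each, as described above.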
MISSING VALUES
• The variables 'isWeekend' and 'Category' have 560 and 95 missing values respectively.
• The missing values of 'isWeekend' all coincide with the 'fraud' class of 'label' (i.e. 1). Filling them with 0 or 1 (weekday or weekend) would mislead the model, so this variable is dropped.
• Missing values of 'Category' are filled with the mode.
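
A minimal sketch of the two steps above, assuming rows are held as dictionaries and missing entries are stored as `None`. The helper names and the toy data are hypothetical.

```python
from collections import Counter


def missing_by_label(rows, column, label="label"):
    """Count missing values of `column` separately for each class label."""
    gaps = Counter()
    for row in rows:
        if row[column] is None:
            gaps[row[label]] += 1
    return dict(gaps)


def fill_with_mode(rows, column):
    """Return copies of the rows with missing `column` values set to the mode."""
    observed = [row[column] for row in rows if row[column] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row[column] is None:
            new_row[column] = mode
        filled.append(new_row)
    return filled


# Hypothetical toy data, not taken from the project dataset
toy = [
    {"isWeekend": 0, "Category": "shopping", "label": 0},
    {"isWeekend": 1, "Category": "food", "label": 0},
    {"isWeekend": None, "Category": "shopping", "label": 1},
    {"isWeekend": None, "Category": None, "label": 1},
]

gaps = missing_by_label(toy, "isWeekend")   # missing only where label == 1
filled = fill_with_mode(toy, "Category")    # mode of Category is "shopping"
cleaned = [{k: v for k, v in row.items() if k != "isWeekend"} for row in filled]
print(gaps, cleaned[3])
```

When every gap in a column sits in one class, as `missing_by_label` shows for the toy data, any imputed value effectively encodes the target, which is the reason given above for dropping the column rather than filling it.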
ENCODING
• The variables Category and paymentMethod hold categorical values.
• Encode them with a One Hot Encoder and drop the redundant (duplicate) column.
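
To show what one-hot encoding with one column dropped looks like, here is a small sketch in plain Python. The function `one_hot`, the `is_` column prefix, and the sample values are assumptions for illustration; the slides do not say which column was dropped, so this sketch drops the first category in sorted order.

```python
def one_hot(values, drop_first=True):
    """Encode a list of category strings as 0/1 indicator columns.

    With drop_first=True the first category (in sorted order) gets no
    column of its own: it is implied when all other indicators are 0.
    """
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    encoded = [{f"is_{c}": int(v == c) for c in kept} for v in values]
    return encoded, kept


# Hypothetical paymentMethod values
methods = ["paypal", "creditcard", "storecredit", "paypal"]
encoded, kept = one_hot(methods)
print(kept)
print(encoded[0], encoded[1])
```

With three payment methods, two indicator columns are enough, because the third can always be inferred from the other two.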
OUTLIER TREATMENT
• The variables numItems and paymentMethodAgeDays have outlier values.
• Since these outliers represent natural variation in the population, they were left as they are.
FEATURE SCALING
• The variables accountAgeDays and paymentMethodAgeDays range up to 2000.
• Since these values have no upper limit, I scaled the dataset using standardization.
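
Standardization rescales each value to z = (x - mean) / standard deviation, so the column ends up with mean 0 and standard deviation 1. A minimal sketch, assuming the population standard deviation is used; the sample ages are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical accountAgeDays values spanning the range mentioned above
ages = [1, 30, 365, 2000]
z = standardize(ages)
print([round(v, 3) for v in z])
```

In practice the mean and standard deviation would usually be computed on the training split only and then reused for the test split, so that no test information leaks into the scaling.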
IMBALANCED DATASET
• The dependent variable 'label' takes the values 0 and 1 in 38661 and 560 rows respectively.
• Huge imbalance (98.57% vs 1.43%)
• Used the SMOTE method to balance the data.
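
The slides name both Random Over Sampler and SMOTE. The sketch below illustrates only the simpler of the two, random oversampling, which repeats minority rows until the classes are the same size; SMOTE instead creates synthetic minority points by interpolating between neighbouring minority rows. The function name and toy data are assumptions, not the project's code.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes up to the majority size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for cls, n in counts.items():
        pool = [x for x, lbl in zip(X, y) if lbl == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        X_out.extend(extra)
        y_out.extend([cls] * len(extra))
    return X_out, y_out


# Hypothetical toy data: 8 legitimate rows, 2 fraudulent rows
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

A common precaution is to resample only the training split, after the train/test split has been made; otherwise copies of the same minority row can land on both sides and inflate the test scores.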
EXPLORATORY DATA ANALYSIS
MODEL SELECTION
• After splitting the data into train and test sets, I built models with almost all the classification algorithms.
• Out of all the classifier models, I chose the one with the highest accuracy, i.e. the RF (random forest) model.
MODEL EVALUATION

Dataset    Accuracy  Precision  Recall  F1 score
Training   1.00      1.00       1.00    1.00
Test       1.00      1.00       1.00    1.00
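
For reference, the four metrics in the table can all be derived from confusion-matrix counts. The sketch below uses hypothetical counts for the first call; the second call reuses the class counts reported earlier (38661 legitimate, 560 fraudulent) to show what a model that flags nothing would score.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only
example = classification_metrics(tp=90, fp=10, fn=30, tn=870)

# A model that labels every transaction as legitimate
baseline = classification_metrics(tp=0, fp=0, fn=560, tn=38661)

print(example)
print(baseline)
```

The baseline reaches high accuracy while catching no fraud at all, which is why precision, recall and F1 are reported alongside accuracy on data this imbalanced.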
MODEL EVALUATION

• Here, our focus should be on the Type II error, i.e. false negatives.
• There are fewer false negatives than false positives.
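
A short sketch of how false negatives and false positives are counted from true and predicted labels. The label lists are hypothetical and are not the project's predictions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) with `positive` treated as the fraud class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1          # Type I error: legitimate flagged as fraud
        elif truth == positive:
            fn += 1          # Type II error: fraud missed
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical labels: 1 = fraudulent, 0 = legitimate
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
false_negative_rate = fn / (fn + tp)   # share of fraud cases that were missed
print(tp, fp, fn, tn, false_negative_rate)
```

The false negative rate equals 1 minus recall, so tracking recall on the fraud class is one way to keep the Type II error in view.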
MODEL EVALUATION
• The ROC-AUC curve also shows scores of 1.00 and 0.99 for training and test respectively.
• The area under the curve is also 0.99.
• To check for overfitting, I ran cross-validation on this RF model.
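
Cross-validation repeatedly holds out one part of the data for testing and trains on the rest, which gives a steadier estimate of how the model generalises than a single split. The slides do not state the number of folds or whether the folds were stratified, so the sketch below is an assumption: it builds stratified folds in plain Python and leaves model fitting out.

```python
import random
from collections import defaultdict


def stratified_kfold_indices(y, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds keep the class mix of y."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(i for f, fold in enumerate(folds)
                           if f != held_out for i in fold)
        yield train_idx, test_idx


# Hypothetical labels: 20 legitimate, 5 fraudulent
y = [0] * 20 + [1] * 5
splits = list(stratified_kfold_indices(y, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx), sum(y[i] for i in test_idx))
```

Each (train, test) pair would be used to fit the model on the training indices and score it on the held-out indices; the spread of those scores indicates how stable the result is.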
RESULTS
• Final accuracy after cross-validation: 99.63% and 99.31%
• Business impact: could help our bank's customers avoid losses worth crores of rupees.
CONCLUSION
• Summary: successfully implemented the bank fraud detection model.
• Future work:
1. Integrate with real-time data by deploying the model in the cloud.
2. Explore anomaly detection models.
THANK YOU
