Fraud Detection in Banking Data Using Machine Learning

SHIVANESH A/L SIVAKUMAR

MCS231014
DR HABIBOLLAH BIN HARON
Problem Background
• Banking industry sees a surge in fraudulent activities
• Traditional rule-based systems inadequate for evolving fraud tactics
• Legitimate transactions far outnumber fraudulent ones, complicating detection
• Machine learning offers promise but faces challenges with large, imbalanced datasets
• Need for a dynamic fraud detection system that adapts to evolving threats in real time
Problem Statement
• What machine learning techniques can be used to detect fraudulent activities in banking data?
• How can we improve the performance of fraud detection systems for unbalanced datasets?
• What are the most predictive features of fraudulent transactions?
Objective
• To improve the overall performance of fraud detection algorithms and increase computational efficiency by applying Bayesian optimization to fine-tune hyperparameters.
• To develop and compare the performance of XGBoost, Random Forest, and Artificial Neural Networks in detecting fraudulent transactions in banking data.
• To investigate the feature importance provided by each algorithm. This can provide insights into which features are most predictive of fraudulent transactions.
Literature Review
Fraud in Banking Industry
• Cressey’s (1953) fraud triangle: Financial pressure, perceived opportunity, and rationalization drive fraud.
• Financial pressure, from diverse sources, motivates fraudulent behavior.
• Hollow’s (2014) study: Financial pressure is a key factor for bank employees, varying by position.
• Hidajat’s (2020) research: Greed is a chief non-financial pressure for higher-ranking individuals in Indonesian rural
banks.
• Weak internal controls provide fraud opportunities (Ilter 2014, Hollow 2014, Asmah 2020).
• Contributing factors to opportunities: Poorly defined duties, lack of documentation, delayed transactions, and
inadequate controls (Kazemian et al. 2019).
• Weak internal control systems in banks facilitate fraud (Sanusi et al. 2015).
• Individuals with low self-control are more prone to fraud (Holtfreter et al. 2010).
Bayesian Optimization

• Invented by Jonas Mockus in the 1970s and 1980s.

• Optimizes algorithm performance through Bayesian statistical modeling.
• Components: Bayesian model for objective function, acquisition function for sampling decisions.
• Begins with space-filling experimental design, often random points.
• Iteratively allocates remaining budget for function evaluations.
• Enhances algorithm performance by optimizing hyperparameters.
• Applicable beyond machine learning: used in robotics, sensor placement, drug discovery, and
engineering design.
• Versatile and adaptable, making it valuable across diverse domains.
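The loop described above (random starting points, a Bayesian model of the objective, an acquisition function that picks the next sample) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: it tunes a single hyperparameter scaled to [0, 1], uses a Gaussian-process surrogate with a fixed squared-exponential kernel, and uses expected improvement as the acquisition function. The function `validation_score` is a hypothetical stand-in for a cross-validated fraud-detection score.

```python
import math
import random

def rbf(a, b, length=0.2):
    """Squared-exponential kernel: similarity of two hyperparameter values."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of the surrogate model at x_new."""
    offset = sum(ys) / len(ys)  # center the observations around their mean
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, [y - offset for y in ys])
    v = solve(K, k_star)
    mean = offset + sum(k * a for k, a in zip(k_star, alpha))
    var = 1.0 - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 1e-12))

def expected_improvement(mean, std, best):
    """Acquisition function for maximization: expected gain over the best score so far."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf

def bayes_opt(objective, n_init=3, n_iter=8, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]  # initial design: random points in [0, 1]
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]        # candidate hyperparameter values
    for _ in range(n_iter):                     # spend the remaining evaluation budget
        best = max(ys)
        x_next = max(grid, key=lambda x: expected_improvement(*gp_posterior(xs, ys, x), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

def validation_score(x):
    """Hypothetical objective; a real one would train and validate a model."""
    return -(x - 0.3) ** 2
```

Calling `bayes_opt(validation_score)` returns the best hyperparameter value found and its score. In practice each objective evaluation is a full model training run, which is why spending the budget on well-chosen points matters.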
XGBoost

• Regularizing gradient boosting library for C++, Java, Python, R, Julia, Perl, Scala.
• Developed by Tianqi Chen as a research project of the DMLC (Distributed (Deep) Machine Learning Community) group.
• Initial version: terminal application configured with a LIBSVM (Library for Support Vector Machines) configuration file.
• Boosting algorithm based on gradient boosted trees.
• Avoids overfitting with regularization term.
• Utilizes parallel and distributed computing for faster model creation (Huang, 2014).
• Employs a sparsity-aware algorithm that handles missing values during split gain computation by learning a default direction for them.
• Applied in finance, healthcare, e-commerce for fraud detection, churn prediction, credit risk modeling.
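To make the regularization term and the split gain computation concrete, the sketch below evaluates the regularized split gain as given in the published XGBoost formulation. It is an illustration, not library code: `G` and `H` are the sums of first and second derivatives of the loss over the samples in a node, and `lam` and `gamma` are assumed here to play the role of the library's `reg_lambda` and `gamma` parameters.

```python
def leaf_score(G, H, lam):
    """Quality of a leaf holding gradient sum G and Hessian sum H."""
    return G * G / (H + lam)

def split_gain(G_left, H_left, G_right, H_right, lam=1.0, gamma=0.0):
    """Loss reduction from splitting a node into left and right children.

    lam shrinks leaf scores (L2 regularization); gamma is a fixed cost per
    added leaf, so a split is only worthwhile when the gain is positive.
    """
    parent = leaf_score(G_left + G_right, H_left + H_right, lam)
    children = leaf_score(G_left, H_left, lam) + leaf_score(G_right, H_right, lam)
    return 0.5 * (children - parent) - gamma
```

With the same gradient statistics, a larger `lam` or `gamma` lowers the gain, so fewer splits are accepted and the trees stay smaller. That is how the regularization term curbs overfitting.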
Random Forest

• Built on decision tree algorithm for regression and classification.

• Effective for high-accuracy predictions, especially with large datasets.
• Combines multiple classifiers to solve complex problems.
• Predicts by aggregating the trees in the forest (averaging for regression, majority vote for classification) for enhanced precision.
• Overcomes decision tree limitations and reduces overfitting to the training dataset.
• Each tree is a weak learner, but together they form a strong learner.
• Fast and effective for large and unbalanced datasets.
• Limitations in training regression problems across different datasets.
• Olena et al. (2020) proposed a system using random forest and isolation tree techniques. The system was tested on finding users' locations during transactions, but the study does not put enough focus on privacy and confidentiality.
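The idea of combining many simple classifiers into a stronger one can be sketched as follows. This is a deliberately simplified illustration: each "tree" is a single-threshold stump rather than a full decision tree, and the two features (transaction amount and transactions in the last hour) are hypothetical. It keeps the two ingredients of a random forest: bootstrap sampling of rows and random feature selection, followed by a majority vote.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Find the single-threshold rule on one feature with the fewest errors."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        for positive_above in (True, False):
            preds = [int((r[feature] > t) == positive_above) for r in rows]
            errors = sum(p != y for p, y in zip(preds, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, t, positive_above)
    return best[1:]

def predict_stump(stump, row):
    feature, t, positive_above = stump
    return int((row[feature] > t) == positive_above)

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        feature = rng.randrange(n_features)          # random feature for this learner
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over all learners (1 = fraud, 0 = legitimate)."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (amount, transactions in the last hour)
rows = [(10, 1), (12, 1), (15, 2), (900, 9), (950, 8), (990, 9)]
labels = [0, 0, 0, 1, 1, 1]
forest = fit_forest(rows, labels)
```

An individual stump trained on an unlucky bootstrap sample can be wrong, but the vote across 25 of them is stable, which is the point of the ensemble.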
Artificial Neural Networks (ANN)
• Inspiration from the human brain's structure led to the creation of Artificial Neural
Networks (ANNs).
• Warren McCulloch and Walter Pitts proposed the first mathematical model of a neuron in
1943.
• Perform tasks like the human brain; categorized as unsupervised and supervised.
• Unsupervised Neural Networks: 95% accuracy for fraud detection, identify patterns in credit
card transactions.
• Resilient to errors; can generate output with corrupted cells.
• Effective for Credit Card Fraud Detection (CCFD) due to high speed and processing capabilities.
• According to Mahji, a combination of ANN and clustering excels in detecting fraudulent transactions.
Related Works
• Halvaiee & Akbari (2014): AIRS-based Fraud Detection Model (AFDM)
⚬ Proposed a model based on AIRS (Artificial Immune Recognition System), an immune-system-inspired algorithm.
⚬ Improved fraud detection by up to 25%.
⚬ Reduced costs by up to 85%, system response time by up to 40%.
• Bahnsen et al. (2016): Transaction Aggregation Strategy
⚬ Developed strategy with von Mises distribution.
⚬ Introduced cost-based criterion.
⚬ Extended strategy for new features.
• Randhawa et al. (2018): Machine Learning Algorithms
⚬ Studied various models.
⚬ Proposed hybrid method with AdaBoost, majority voting for effective fraud detection.
• Porwal and Mukund (2018): Outlier Detection using Clustering
⚬ Proposed clustering for outlier detection.
⚬ Resistant to changing patterns.
⚬ Preferred precision-recall curve over ROC (receiver operating characteristic).
Comparison Between Algorithms
Performance Evaluation
• Accuracy:
⚬ Ratio of correct predictions to total predictions; suitable for balanced classes.
• Precision:
⚬ Ratio of correctly predicted positive observations to total predicted positives; penalizes false alarms.
• Recall (Sensitivity):
⚬ Ratio of correctly predicted positive observations to all actual positives; measures capture
ability.
• F1 Score:
⚬ Harmonic mean of Precision and Recall; balances the two on imbalanced classes.
• ROC Curve (Receiver Operating Characteristics):
⚬ Plot of true positive rate against false positive rate
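The metric definitions above translate directly into code. The sketch below is a minimal illustration assuming binary labels where 1 means fraud and 0 means legitimate. The example data is made up to show why accuracy alone is misleading on imbalanced classes.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) with 1 = fraud, 0 = legitimate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up example: 10 fraudulent transactions among 1000
y_true = [1] * 10 + [0] * 990
always_legit = [0] * 1000                              # never flags anything
flags_some = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 986   # 8 hits, 4 false alarms
```

A model that never flags fraud scores 99% accuracy on this data while its recall is zero, whereas the second model's precision, recall, and F1 show what it actually catches.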
Research Framework
Phase 1 : Data Acquisition

• Proposed cooperation with Maybank's data security and compliance team.

• Suggested signing a non-disclosure agreement for formalizing data use and security
conditions.
• Stressed positive impact on academic goals and banking industry's credit card security.
• Approached Maybank transparently, aiming for a collaborative relationship.
• Demonstrated commitment to responsible and ethical research practices.
Phase 2 : Data Cleaning and Data Exploration
• Python is used for crucial data cleaning, removing or modifying incorrect, incomplete,
irrelevant, or duplicated data.
• Data quality directly affects machine learning model effectiveness.
• Address missing values through imputation or removal.
• Remove duplicate rows to prevent model bias.
• Manage outliers using techniques like scaling and normalization.
• Data exploration is essential for understanding patterns and characteristics.
• Begins with variable identification, recognizing input and target variables, data types, and
categories
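The cleaning steps listed above (duplicate removal, imputation of missing values, scaling) can be sketched as follows. This is a dependency-free illustration; real work of this kind would more likely use a data-frame library. The column names `id`, `amount`, and `is_fraud` are hypothetical, as is the choice of median imputation and min-max scaling.

```python
from statistics import median

def clean(rows):
    """Deduplicate, impute missing amounts, and add a min-max scaled amount."""
    # 1. Remove exact duplicate rows, keeping the first occurrence
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Impute missing amounts with the median of the known values
    fill = median(r["amount"] for r in unique if r["amount"] is not None)
    for r in unique:
        if r["amount"] is None:
            r["amount"] = fill

    # 3. Scale amounts to [0, 1] so large values do not dominate
    lo = min(r["amount"] for r in unique)
    hi = max(r["amount"] for r in unique)
    span = (hi - lo) or 1.0  # avoid division by zero when all amounts are equal
    for r in unique:
        r["amount_scaled"] = (r["amount"] - lo) / span
    return unique

# Hypothetical raw records: one duplicate, one missing amount
raw = [
    {"id": 1, "amount": 10.0, "is_fraud": 0},
    {"id": 1, "amount": 10.0, "is_fraud": 0},
    {"id": 2, "amount": None, "is_fraud": 0},
    {"id": 3, "amount": 30.0, "is_fraud": 1},
]
cleaned = clean(raw)
```

The input list is left unmodified because each row is copied before it is changed, which makes it easy to compare the data before and after cleaning.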
Phase 3 : Bayesian Optimization, Machine Learning and Data Visualization
• Machine learning phase involves Bayesian optimization for hyperparameter tuning.
• Aims for optimal hyperparameters, enhancing fraud detection algorithm performance.
• Results in accurate, efficient models, saving computational resources and time.
• Utilizes XGBoost, Random Forest, and Artificial Neural Network for feature importance
metrics.
• Data visualization using Python libraries presents machine learning results effectively.
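Tree-based models expose their own importance scores, but a neural network does not, so comparing all three models needs a common measure. One option, sketched here as an assumption and not as the method used in this study, is permutation importance: shuffle one feature column and measure how much accuracy drops. The `model` argument is any function that maps a feature tuple to a label. The toy rule and data below are hypothetical.

```python
import random

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical model: flags a transaction as fraud when the amount exceeds 100
def toy_model(row):
    return int(row[0] > 100)

# Hypothetical data: (amount, customer age); only the amount drives the label
rows = [(10, 25), (20, 40), (30, 33), (50, 51), (80, 62),
        (150, 29), (300, 45), (500, 38), (700, 56), (900, 61)]
labels = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, labels)
```

A feature the model ignores gets an importance of exactly zero, while shuffling a feature the model relies on lowers accuracy.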
Phase 4 : Performance Evaluation
• Precision: Gauges accuracy of positive predictions (True Positives / Total Predicted Positives).
• Accuracy: Measures how often the model correctly predicts the outcome (Correct Predictions / Total Predictions); can be misleading on imbalanced classes.
• Recall: Assesses model's ability to identify actual positive instances (True Positives / Total Actual
Positives).
• F1 Score: Harmonic mean of precision and recall, useful for imbalanced classes.
• ROC-AUC Metric: Measures model's ability to differentiate between positive and negative instances.
• Data Visualization: Visualize precision, recall, and F1 scores with bar charts or line graphs for model
comparison; ROC curve illustrates diagnostic ability; Confusion matrices visualized in heatmap or table
format for interpretation and presentation.
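The ROC curve and its area can be computed by sweeping a decision threshold over the model's scores. The sketch below is a minimal illustration assuming binary labels (1 = fraud) and scores where higher means more suspicious. The four-sample example is made up.

```python
def roc_points(y_true, scores):
    """(false positive rate, true positive rate) at each score threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up example: two legitimate and two fraudulent transactions
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
area = auc(curve)  # 0.75 for this example
```

An area of 0.5 corresponds to random ranking and 1.0 to perfect separation. The list returned by `roc_points` can be passed to any plotting library to draw the curve.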
