Final Year Project

The document presents a project on online payment fraud detection using machine learning, outlining its objectives, methodologies, and expected outcomes. It emphasizes the need for advanced detection methods due to the inadequacy of traditional rule-based systems and proposes a system that utilizes various machine learning algorithms to identify fraudulent transactions in real-time. The project aims to enhance security for consumers and businesses while addressing challenges such as imbalanced datasets and evolving fraud tactics.

Online Payment Fraud Detection Using Machine Learning


Initial Phase Presentation on Major Project of the AY: 2024-25

Presented By
Aditya Chug 1NT21EC004
Ruchi Yadav 1NT21EC117
Shreyas Somanache 1NT21EC148

Project Supervisor
Prof. Prajna.K.B
Associate Professor
Dept. of Electronics and Communication Engineering
Nitte Meenakshi Institute of Technology
Yelahanka, Bangalore-560064
Dept. of ECE, NMIT, Bengaluru
Presentation Outline
• Introduction
• Literature Survey
• Project Objectives
• Tools Required
• Architecture
• Mathematical modelling design approach
• Proposed System Methodology
• Expected Outcomes
• References



Introduction
• Detecting fraud in online payments is a critical challenge,
requiring advanced methods to accurately distinguish between
legitimate and fraudulent transactions. Traditional fraud detection
systems, primarily rule-based, are increasingly inadequate due to
the evolving nature of fraud tactics.
• To address these challenges, machine learning offers a powerful
solution. By analyzing large volumes of transaction data,
machine learning models can identify hidden patterns and
anomalies indicative of fraudulent activity. These models can be
retrained as new data becomes available, which helps them keep
pace with emerging fraud tactics.
• This system promises to enhance security in online payment
environments, protecting both consumers and businesses from
financial losses.
Literature Survey
M. Venkatesh, Bhukya Keerthi Bai, Budati Bhargavi, Chilukoti Manasa, and
Dwarasala Mokshitha, "Online Payment Fraud Detection Using Machine
Learning," Vasireddy Venkatadri Institute of Technology, Guntur, Andhra
Pradesh, India.
• Focused on Random Forest: The proposed model uses the Random Forest
Algorithm for detecting online payment fraud, offering a robust method for
classification.
• Real-time Fraud Detection: The system lets users enter transaction details and
immediately flags fraud through a user-friendly interface.
• Handling Imbalanced Data: The model addresses the challenge of imbalanced
datasets, where fraudulent transactions are a small portion of the total data, through
effective classification techniques.
• Preprocessing Importance: Highlights the importance of preprocessing steps like
handling missing data, converting categorical variables, and removing outliers to
ensure the model's accuracy.



Literature Survey
E. Ileberi, Y. Sun, and Z. Wang, "Performance Evaluation of Machine Learning
Methods for Credit Card Fraud Detection Using SMOTE and AdaBoost," IEEE
Access, vol. 9, pp. 165286-165294, 2021.
• Objective: Evaluate the effectiveness of machine learning (ML) models for credit card
fraud detection using imbalanced datasets.
• Key Metrics:
• Accuracy, Recall, Precision, Matthews Correlation Coefficient (MCC), and Area Under the
Curve (AUC) were used to evaluate model performance.
• Results
• AdaBoost significantly improved the performance of all models.
• Using SMOTE helped mitigate the class imbalance problem, improving both recall and precision.
• Conclusion
• Pairing AdaBoost with traditional ML models yielded superior results compared to
existing methods.



Project Objectives
The following are the proposed objectives of the project based on the research gaps:
• To develop an online fraud detection system using machine learning algorithms that identifies fraudulent
transactions with high accuracy in real time.

• To provide a responsive and user-friendly interface with form validation and instant fraud detection feedback.

• To address the challenge of imbalanced datasets by implementing techniques such as oversampling,
undersampling, and anomaly detection algorithms suited for rare-event detection.

• To evaluate the system’s robustness against adversarial attacks and ensure it can withstand attempts by
fraudsters to evade detection.

• To provide real-time alerts and detailed reports on flagged transactions via the interface.



Tools Required: Software Specification
1. User Interface (UI):
• Technology Used: HTML, CSS, JavaScript, Flask
• Key Feature: Simple and intuitive design for reviewing flagged transactions; Flask serves and deploys the web app.
2. Application Logic:
• Technology Used: Python
• Role: Handles data collection, cleaning, and feature selection for the fraud detection model.
3. Storage:
• Technology Used: CSV files
• Role: Temporary storage for transaction data and model-related information.
4. Machine Learning Model:
• Technology Used: Supervised Learning (XGBoost, SVM Classifier, Extra Trees Classifier)
• Role: Core model for predicting and identifying fraudulent transactions based on the processed data.



Tools Required: Software Specification
i. XGBoost Classifier:
• An advanced algorithm that builds multiple decision trees sequentially, each correcting the errors of the previous ones, to improve accuracy.
• Known for speed and performance, especially with large datasets.
• Includes regularization techniques (like L1, L2) to prevent overfitting.
• L1 regularization: It drives some weights (values assigned to different features) to exactly zero, effectively removing irrelevant
features to make the model simpler.
• L2 regularization: It shrinks weights towards zero but doesn’t make them exactly zero, helping to avoid overfitting by
keeping all features with smaller contributions.
ii. SVM Classifier:
• Separates data into 2 categories (fraud, legit) by finding the best boundary (hyperplane).
• Performs effectively with smaller, clean datasets.
• Works well when there’s a clear margin of separation between classes.
iii. Extra Trees Classifier:
• Builds multiple decision trees & averages results for predictions.
• Can rank which features contribute the most to fraud detection.
• Computationally faster compared to some other tree-based models.
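To make the difference between the L1 and L2 penalties listed for XGBoost concrete, here is a minimal sketch. It assumes the closed-form leaf weight used by XGBoost's tree booster, w* = -S(G, alpha) / (H + lambda), where G and H are the sums of gradients and Hessians in a leaf and S is a soft threshold; in the tree booster the penalties act on leaf weights rather than on per-feature weights. The function name and the sample numbers are illustrative assumptions, not values from the project.

```python
def leaf_weight(grad_sum: float, hess_sum: float, l1: float = 0.0, l2: float = 1.0) -> float:
    """Optimal leaf weight under an L1 (alpha) and L2 (lambda) penalty."""
    # L1 acts as a soft threshold on the gradient sum:
    # sums smaller in magnitude than l1 collapse to exactly zero.
    if grad_sum > l1:
        shrunk = grad_sum - l1
    elif grad_sum < -l1:
        shrunk = grad_sum + l1
    else:
        shrunk = 0.0
    # L2 enlarges the denominator, shrinking the weight without zeroing it.
    return -shrunk / (hess_sum + l2)


# Hypothetical leaf statistics: G = -4.0, H = 3.0
only_l2 = leaf_weight(-4.0, 3.0, l1=0.0, l2=1.0)   # shrunk but non-zero
strong_l1 = leaf_weight(-4.0, 3.0, l1=5.0, l2=1.0)  # thresholded to zero
both = leaf_weight(-4.0, 3.0, l1=1.0, l2=1.0)       # both effects combined
```

With these numbers the unpenalized weight would be 4/3; L2 alone shrinks it, while a large enough L1 removes the leaf's contribution entirely.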
Basic Architecture
1. Data collection layer
• Purpose: Collect transaction and behavioral data.
• Details:
• Sources: Transaction logs, user activity monitoring, third-party fraud detection APIs.
• Data Types: Structured (numeric, categorical) and time-series data.
• Technology: APIs, Database Systems (e.g., PostgreSQL, MongoDB).

2. Preprocessing Layer
• Purpose: Clean and prepare data for analysis.
• Technology: Python (Pandas, NumPy, Scikit-learn).
• Details:
• Handle missing values using imputation techniques.
• Scale numeric data for SVM compatibility (e.g., Min-Max Scaling).
• Perform one-hot encoding for categorical variables.
• Balance the dataset to address class imbalance (e.g., SMOTE).
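The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the column values and category names are made up, and plain random oversampling is used as a simpler stand-in for SMOTE (which synthesizes new minority points instead of duplicating existing ones).

```python
import random


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 column per (sorted) category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    fraud = [r for r, y in zip(rows, labels) if y == 1]
    legit = [r for r, y in zip(rows, labels) if y == 0]
    if len(fraud) < len(legit):
        minority, majority, minority_label = fraud, legit, 1
    else:
        minority, majority, minority_label = legit, fraud, 0
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra, labels + [minority_label] * len(extra)


# Hypothetical mini-dataset: one numeric and one categorical column.
amounts = impute_mean([120.0, None, 80.0, 1000.0])
scaled = min_max_scale(amounts)
types = one_hot(["TRANSFER", "PAYMENT", "PAYMENT", "CASH_OUT"])
rows = [[s] + t for s, t in zip(scaled, types)]
labels = [0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

In practice, balancing should be applied only to the training split, after the train/test split, so that duplicated or synthetic fraud samples do not leak into evaluation.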
Basic Architecture
3. Modeling Layer
• Purpose: Apply machine learning models for fraud detection.
• Algorithms Used:
• Extra Trees Classifier: Randomized tree splits for fast computation and reduced overfitting.
• Support Vector Machine (SVM): Separates classes using hyperplanes, effective for small but complex
datasets.
• XGBoost: Gradient boosting framework with regularization for high accuracy.
• Technology: Scikit-learn, XGBoost library

4. Decision Layer
• Purpose: Output fraud probabilities for transactions.
• Details:
• Ensemble Approach: Combines predictions from Extra Tree, SVM, and XGBoost.
• Decision Threshold: Assigns "fraudulent" or "legitimate" labels based on a set confidence level.
• Technology: Python Flask/Django for REST API, model serialization with joblib or pickle.
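One way the decision layer could combine the three models is soft voting followed by a threshold. The sketch below assumes each model outputs a fraud probability; the slide only says predictions are combined, so the averaging rule, the weights, and the 0.7 threshold are illustrative assumptions.

```python
def ensemble_fraud_probability(model_probs, weights=None):
    """Weighted average of per-model fraud probabilities (soft voting)."""
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    return sum(p * w for p, w in zip(model_probs, weights)) / total


def decide(fraud_probability, threshold=0.5):
    """Map a fraud probability to a label using a fixed confidence level."""
    return "fraudulent" if fraud_probability >= threshold else "legitimate"


# Hypothetical outputs of Extra Trees, SVM, and XGBoost for one transaction.
probs = [0.82, 0.40, 0.91]
score = ensemble_fraud_probability(probs)
label = decide(score, threshold=0.7)
```

Lowering the threshold catches more fraud at the cost of more false alarms, so it is usually tuned on a validation set against the precision and recall targets. Note that an SVM does not produce probabilities by default; in scikit-learn, `SVC` needs `probability=True` or a separate calibration step.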

Basic Architecture
5. Feedback Layer
• Purpose: Continuously refine the models.
• Details:
• Use flagged transactions to update training datasets.
• Implement active learning for incorporating feedback.
• Technology: Database for logging misclassified transactions, active learning frameworks.
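A common active learning strategy is uncertainty sampling: send the transactions the model is least sure about to a human reviewer, then add the reviewed labels to the training set. The sketch below assumes that strategy; the function name, transaction ids, and scores are hypothetical.

```python
def select_for_review(scored_transactions, budget=2):
    """Return the ids of the `budget` transactions whose fraud score is closest to 0.5."""
    ranked = sorted(scored_transactions, key=lambda item: abs(item[1] - 0.5))
    return [tx_id for tx_id, _ in ranked[:budget]]


# Hypothetical (transaction id, fraud probability) pairs from the decision layer.
scored = [("tx-1", 0.03), ("tx-2", 0.48), ("tx-3", 0.97), ("tx-4", 0.61)]
to_label = select_for_review(scored, budget=2)
```

Here `tx-2` and `tx-4` are selected because their scores sit nearest the decision boundary, where a confirmed label is most informative for retraining.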

Mathematical Design Approach
1. Problem Formulation
• Objective: Classify transactions as fraudulent (1) or non-fraudulent (0).
• Mathematically:
$f: X \to Y$
where:
• $X$: Feature space (e.g., transaction amount, location, time)
• $Y$: Binary output (0: Non-Fraud, 1: Fraud)

• Evaluation Metrics:
• Accuracy
• Precision
• Recall (Sensitivity)

Mathematical Design Approach
2. Data Preprocessing
• Dataset Representation:
$D = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$
where:
• $x_i$: Feature vector for the $i$-th transaction
• $y_i$: Label (0 or 1)

• Feature Scaling (Standardization):

$x' = \frac{x - \mu}{\sigma}$
where:
• $\mu$: Mean of the feature
• $\sigma$: Standard deviation of the feature
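As a worked example of the standardization step, the sketch below subtracts the mean and divides by the standard deviation using only the standard library. The sample amounts are made up, and the population standard deviation (`pstdev`) is an assumption; the slide does not say which estimator is used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


amounts = [10.0, 20.0, 30.0, 40.0, 50.0]  # hypothetical transaction amounts
z = standardize(amounts)
```

After the transform the feature has mean 0 and standard deviation 1. The mean and standard deviation should be computed on the training data only and then reused for incoming transactions.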

Mathematical Design Approach
Why Standardization is Important:
1. Ensures Uniform Scale:
• Makes features comparable, especially important for algorithms like SVM and Logistic Regression
that are sensitive to feature scales.

2. Speeds Up Convergence:
• Helps optimization algorithms (like gradient descent) converge faster by eliminating bias
introduced by differing feature scales.

3. Improves Model Performance:


• Prevents large-scale features from skewing model coefficients and ensures balanced learning.

Mathematical Design Approach
3. Model Design
• A. Extra Trees Classifier

• Goal: Build multiple decision trees using random splits and aggregate results.
• Mathematical Representation:

$f(x) = \frac{1}{T} \sum_{t=1}^{T} f_t(x)$

where:
• T: Number of decision trees
• ft​(x): Prediction from tree t
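The aggregation over trees can be checked directly with scikit-learn's `ExtraTreesClassifier`, whose predicted probabilities are the average of the per-tree probabilities. This is a minimal sketch on synthetic data; the dataset shape, class ratio, and number of trees are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic, imbalanced stand-in for transaction data (roughly 5% positives).
X, y = make_classification(
    n_samples=500, n_features=6, weights=[0.95, 0.05], random_state=0
)

forest = ExtraTreesClassifier(n_estimators=25, random_state=0).fit(X, y)

# f(x) = (1/T) * sum over t of f_t(x): average the per-tree class probabilities by hand.
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# The ensemble's own output for comparison.
forest_proba = forest.predict_proba(X)
```

Here `per_tree` has one slice per tree (T = 25), and `manual_average` matches `forest_proba` up to floating-point error.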

Mathematical Design Approach
• B. Support Vector Machine (SVM)

• Goal: Find the optimal hyperplane that separates fraudulent and non-fraudulent
transactions.
• Hyperplane Equation:
w⋅x + b = 0
where:
• w: Weight vector
• x: Feature vector
• b: Bias term
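For a linear SVM, classification reduces to checking which side of the hyperplane a transaction falls on, i.e. the sign of w⋅x + b. The sketch below uses a hand-picked weight vector and bias purely for illustration; in a trained model these values come from the optimizer, and the two features are hypothetical scaled inputs.

```python
def decision_value(w, x, b):
    """Signed score w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Label 1 (fraud) on the non-negative side of the hyperplane, 0 otherwise."""
    return 1 if decision_value(w, x, b) >= 0 else 0


# Hypothetical parameters for two scaled features: [amount, account_age].
w = [0.8, -0.5]
b = -0.2

large_new_account = [1.0, 0.2]   # high amount, young account
small_old_account = [0.1, 0.9]   # low amount, old account

flag_a = classify(w, large_new_account, b)
flag_b = classify(w, small_old_account, b)
```

The magnitude of `decision_value` grows with the distance from the hyperplane, so it can also serve as a rough confidence score.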

Mathematical Design Approach
4. Model Evaluation
• Confusion Matrix:

                      Predicted Fraud    Predicted Non-Fraud
  Actual Fraud        TP                 FN
  Actual Non-Fraud    FP                 TN

where:
• TP: True Positive
• FP: False Positive
• FN: False Negative
• TN: True Negative

Mathematical Design Approach
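• Using the Confusion Matrix for Metrics
1. Accuracy: Overall correctness of the model.
$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

2. Precision: Focuses on how many flagged frauds are truly frauds.
$\text{Precision} = \frac{TP}{TP + FP}$

3. Recall (Sensitivity): Measures how many actual frauds were detected.
$\text{Recall} = \frac{TP}{TP + FN}$

4. F1-Score: Balances Precision and Recall.
$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$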
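The four metrics can be computed from the confusion-matrix counts as in the sketch below. The example labels are invented: 100 transactions with 5 frauds, of which the model catches 3 and raises 1 false alarm. A second, trivial model that never flags anything is included to show why accuracy alone is misleading on imbalanced data.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN with fraud (1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1; undefined ratios fall back to 0.0."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 5 frauds followed by 95 legitimate transactions.
y_true = [1] * 5 + [0] * 95
y_model = [1, 1, 1, 0, 0] + [1] + [0] * 94   # catches 3 frauds, 1 false alarm
y_never = [0] * 100                          # never flags anything

model_scores = metrics(*confusion_counts(y_true, y_model))
baseline_scores = metrics(*confusion_counts(y_true, y_never))
```

The never-flag baseline reaches 95% accuracy while detecting no fraud at all (recall 0), which is why precision, recall, and F1 matter more than accuracy here.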

Proposed System Methodology
[Flowchart: User Register/Login → Initiate Transaction → Process with real-time Dataset → Pre-processed Dataset → Attribute Selection → Classification Techniques (SVM Implementation, Extra Tree classifier) → Fraud Detection]



Proposed System Methodology
1. User Registration/Login :
• Users start by registering or logging into the system.
• This is the entry point where user credentials are authenticated for secure access.
2. Transaction Initiation :
• After login, users initiate a transaction (e.g., online payment, money transfer, etc.).
• This step captures key transactional details such as amount, location, and time.
3. Real-time Dataset Processing:
• Incoming transactions are processed with a real-time dataset.
• This involves using real-time data (e.g., user history, location patterns, transaction amounts)
to identify anomalies.



Proposed System Methodology
4. Data Preprocessing:
• The system has a pre-processed dataset available for model training.
• Pre-processing includes:
• Cleaning: Removing null or inconsistent data
• Normalization: Bringing all data to the same scale.
• Encoding: Converting categorical variables into numerical ones.
5. Attribute Selection:
• Important attributes or features are selected from the dataset.
• Feature selection optimizes model efficiency and accuracy by focusing on the most relevant
data.
6. Classification Techniques:
• Various classification algorithms are applied to label transactions as either fraudulent or
legitimate.
• Machine learning algorithms like SVM and Extra Trees Classifier are employed here.
Proposed System Methodology
7. Support Vector Machine (SVM) Implementation:
• SVM is effective for high-dimensional data and works well with smaller datasets.
• It uses a hyperplane to separate classes, maximizing the margin between fraudulent and
non-fraudulent transactions.
• The RBF kernel handles non-linear relationships in complex fraud patterns by implicitly
mapping the data into a higher-dimensional space.
8. Extra Trees Classifier Implementation:
• Extra Trees Classifier reduces overfitting by averaging many randomized decision trees.
• At each split it considers a random subset of features and draws the split thresholds at random, which makes predictions more robust.
• The model also provides Feature Importance Scores, helping identify critical fraud
indicators.
9. Fraud Detection:
• Final stage where fraudulent transactions are flagged.
• Alerts are generated for potentially fraudulent activities, and actions can be taken instantly.
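Steps 7 and 8 can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the project's implementation: the dataset, the `class_weight="balanced"` setting, and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for transactions: about 10% of samples are "fraud" (label 1).
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=3, weights=[0.9, 0.1], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Step 7: RBF-kernel SVM. Scaling lives inside the pipeline so it is fit on training data only.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", class_weight="balanced"))
svm.fit(X_train, y_train)

# Step 8: Extra Trees with feature importance scores.
trees = ExtraTreesClassifier(n_estimators=100, class_weight="balanced", random_state=42)
trees.fit(X_train, y_train)

svm_flags = svm.predict(X_test)            # 0/1 labels per test transaction
tree_flags = trees.predict(X_test)
importances = trees.feature_importances_   # one score per feature, summing to 1
```

Sorting `importances` in descending order gives a ranking of the features that contribute most to the trees' decisions, which corresponds to the Feature Importance Scores mentioned in step 8.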
Expected Outcomes
• Real-time Fraud Detection: The system will be able to detect fraudulent transactions in real
time, preventing financial losses before they occur.
• Improved Accuracy: By using machine learning algorithms such as XGBoost, SVM, and Extra Trees, the system
is expected to identify fraudulent transactions with high accuracy while minimizing false
positives.
• Adaptability to New Fraud Patterns: The system will continuously learn from new data,
making it capable of adapting to emerging and evolving fraud tactics.
• Efficient Handling of Imbalanced Datasets: Techniques like resampling or using effective
classification algorithms will ensure that the system can detect rare fraudulent transactions
without bias.
• User-Friendly Application: The fraud detection model will be integrated into an easy-to-use
web interface, allowing users to quickly check transactions for fraud.

References
[1].Nghia Nguyen, Truc Duong, Tram Chau, Van-Ho Nguyen, Trang Trinh, Duy Tran, and Thanh Ho, "A Proposed Model for Card Fraud Detection Based on CatBoost and
Deep Neural Network," IEEE Access, University of Economics and Law, Vietnam National University, Ho Chi Minh City, Vietnam.
[2].Nadia Boutaher, Amina Elomri, Noreddine Abghour, Khalid Moussaid, and Mohamed Rida, "A Review of Credit Card Fraud Detection Using Machine Learning
Techniques," IEEE Access, Hassan II University Casablanca, Morocco.
[3].Cheng Wang, Songyao Chai, Hangyu Zhu, and Changjun Jiang, "CAeSaR: An Online Payment Anti-Fraud Integration System With Decision Explainability," IEEE
Transactions on Dependable and Secure Computing, Senior Member, IEEE.
[4].Suraya Nurain Kalid, Kok-Chin Khor, Keng-Hoong Ng, and Gee-Kok Tong, "Detecting Frauds and Payment Defaults on Credit Card Data Inherited With Imbalanced Class
Distribution and Overlapping Class Problems: A Systematic Review," IEEE Access, Multimedia University, Malaysia, Universiti Tunku Abdul Rahman, Malaysia.
[5].Antonio Tudisco, Deborah Volpe, Giacomo Ranieri, Gianbiagio Curato, Davide Ricossa, Mariagrazia Graziano, and Davide Corbelletto, "Evaluating the Computational
Advantages of the Variational Quantum Circuit Model in Financial Fraud Detection," IEEE Access, Politecnico di Torino, Italy, Intesa Sanpaolo, Italy.
[6].Seyedeh Khadijeh Hashemi, Seyedeh Leili Mirtaheri, and Sergio Greco, "Fraud Detection in Banking Data by Machine Learning Techniques," IEEE Access, Kharazmi
University, Iran, University of Calabria, Italy.
[7].Darshan Aladaktatti, Gagana P, Ashwini Kodipalli, and Shoaib Kamal, "Fraud Detection in Online Payment Transaction Using Machine Learning Algorithms," IEEE
Access, Global Academy of Technology, Bangalore, India.
[8].Fahdah A. Almarshad, Ghada Abdalaziz Gashgari, and Abdullah I. A. Alzahrani, "Generative Adversarial Networks-Based Novel Approach for Fraud Detection for the
European Cardholders 2013 Dataset," IEEE Access, Prince Sattam Bin Abdulaziz University, Saudi Arabia, University of Jeddah, Saudi Arabia, Shaqra University, Saudi
Arabia.
[9].Abdulwahab Ali Almazroi and Nasir Ayub, "Online Payment Fraud Detection Model Using Machine Learning Techniques," IEEE Access, University of Jeddah, Saudi
Arabia, Air University, Pakistan.
[10].Cheng Wang and Hangyu Zhu, "Representing Fine-Grained Co-Occurrences for Behavior-Based Fraud Detection in Online Payment Services," IEEE Access, Senior
Member, IEEE.
References
[11].Domenig, Thomas & Zvizdic, Ermin & Vanini, Paolo & Rossi, Sebastiano. (2022). Online Payment Fraud: From Anomaly Detection to Risk Management.
[12].Jack Nicholls, Aditya Kuppa, and Nhien-An Le-Khac, “Financial Cybercrime: A Comprehensive Survey of Deep Learning Approaches to Tackle the
Evolving Financial Crime Landscape” IEEE Access, University College Dublin, Ireland.
[13].Ranran Li, Zhaowei Liu, Yuanqing Ma, Dong Yang, and Shuaijie Sun, "Internet Financial Fraud Detection Based on Graph Learning," IEEE Access,
Graduate Student Member, IEEE.
[14].Fuad A. Ghaleb, Faisal Saeed, Mohammed Al-Sarem, Sultan Noman Qasem, and Tawfik Al-Hadhrami, "Ensemble Synthesized Minority Oversampling-
Based Generative Adversarial Networks and Random Forest Algorithm for Credit Card Fraud Detection," IEEE Access, University Teknologi Malaysia,
Birmingham City University, Taibah University, Imam Mohammad Ibn Saud Islamic University, and Nottingham Trent University.
[15].Alarfaj, Fawaz & Malik, Iqra & Khan, Hikmat & Almusallam, Naif & Ramzan, Muhammad & Ahmed, Muzamil. (2022). Credit Card Fraud Detection Using
State-of-the-Art Machine Learning and Deep Learning Algorithms. IEEE Access. 10. 1-1. 10.1109/ACCESS.2022.3166891.
[16].Reem M. Own, Sameh A. Salem, and Amr E. Mohamed, "TCCFD: An Efficient Tree-based Framework for Credit Card Fraud Detection," IEEE Access.
[17].Nashwa Shaker Ragab, Omnia Elrashidy, Omar Adel, et al., “Fraud_Detection_ML: Machine Learning Based on Online Payment Fraud Detection,” Journal
of Computing and Communication, vol. xx, February 2024. DOI: 10.21608/jocc.2024.339929.
[18].Aditya Oza, "Fraud Detection Using Machine Learning," Stanford University.
[19].M. Venkatesh, Bhukya Keerthi Bai, Budati Bhargavi, Chilukoti Manasa, and Dwarasala Mokshitha, "Online Payment Fraud Detection Using Machine
Learning," Vasireddy Venkatadri Institute of Technology, Guntur, Andhra Pradesh, India.
[20].E. Ileberi, Y. Sun, and Z. Wang, "Performance Evaluation of Machine Learning Methods for Credit Card Fraud Detection Using SMOTE and AdaBoost,"
IEEE Access, vol. 9, pp. 165286-165294, 2021.
