
Fraud Detection Synopsis

The document presents a synopsis of a project focused on developing an Online Payment Fraud Detection System using machine learning techniques. The system aims to enhance real-time detection accuracy while addressing challenges such as class imbalance in datasets and evolving fraud patterns. The project involves training multiple machine learning models, evaluating their performance, and integrating the best-performing model into a production environment for continuous monitoring and improvement.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

JNANA SANGAMA, BELAGAVI - 590018

A Synopsis on
ONLINE PAYMENT FRAUD DETECTION SYSTEM

Submitted for the Final Year Project of the AY: 2024-25


by

Aditya Chug (1NT21EC004)


Ruchi Yadav (1NT21EC117)
Shreyas Somanache (1NT21EC148)

Under the Guidance of


Prof. Prajna.K.B
Associate Professor
Dept. of Electronics and Communication Engineering

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING


YELAHANKA, BENGALURU- 560064
Contents
List of Figures
List of Tables
1. Introduction
2. Literature Survey
3. Motivation and Objectives
3.1 Motivation
3.2 Objectives
4. Architecture of the Proposed System
5. Hardware/Software Description
6. Design Specification/Dataset Description
7. Expected Outcomes
References

Online Fraud Detection System

1. Introduction
Detecting online payment fraud is a critical endeavour that demands a
comprehensive approach encompassing various stages, from data collection to continuous
monitoring and improvement. The initial stage entails the meticulous compilation of a
comprehensive dataset, encompassing legitimate and fraudulent transactions. This
dataset, reflective of real-world scenarios, incorporates transaction attributes such as time,
type of transaction, amount, etc., serving as the foundational element for subsequent
model development. Following data collection, a rigorous data preprocessing phase
ensues. Addressing missing values, outliers, and inconsistencies ensures a refined dataset.
This sets the stage for Exploratory Data Analysis, a discerning examination of the unique
features characterizing both legitimate and fraudulent transactions, elucidating patterns,
trends, and correlations within the dataset. A pivotal decision in the project lies in the
selection of a suitable machine learning algorithm for fraud detection. In this project, we
train five models, each based on a different machine learning algorithm, and compare
them on their accuracy. The chosen model undergoes meticulous
training on a segmented dataset, with hyperparameters fine-tuned to optimize
performance. Model evaluation, conducted on a separate testing dataset, meticulously
considers the business implications of false positives and false negatives. Upon successful
evaluation, the model transitions to deployment, seamlessly integrating into the
production environment. It becomes an integral component of the online payment system,
analysing transactions in real-time. This deployment marks the initiation of a continuous
monitoring and improvement phase, ensuring the model remains adept at discerning
evolving fraud patterns through regular updates with new data, carried out at the business
level. Traditional fraud detection methods are often limited by their lack of
adaptability to new and evolving patterns of fraud. The aim of this project is to develop a
fraud detection system using machine learning (ML) techniques to classify fraudulent
transactions. This can serve to improve monitoring and response, and support many of the
current threat mitigation practices in place at financial institutions. However, there are
challenges associated with the use of ML for fraud detection, namely imbalanced
datasets, in which legitimate transactions far outnumber fraudulent payments,
which can lead to model bias. While there is a growing body of research on the issue of
class imbalance, there is a lack of studies that address the problem using anomaly
detection algorithms. This paper has been done in collaboration with Deloitte Cyber and
involves leveraging their expertise to understand the security landscape for payments,

Dept. of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Bengaluru 2

including the types of threats faced, tactics used by attackers, indicators for fraud and the
selection of appropriate threat mitigation strategies. Through this paper we are looking to
answer the following research question: To what extent can machine learning (ML)
techniques be used to develop methods supporting fraud detection and security of
payments?

2. Literature Survey
Fraud detection has become critical as the rise of online transactions has led to an increase
in fraudulent activities, especially in financial transactions such as credit card payments.
This survey explores various machine learning techniques applied to fraud detection,
drawing insights from previous studies on the subject.
Online fraud detection involves identifying fraudulent transactions while minimizing false
positives, ensuring legitimate transactions are not unnecessarily blocked. One of the
central challenges, highlighted in multiple works, is the imbalanced nature of transaction
data, where fraudulent transactions make up a very small portion of total transactions [5].
Effective fraud detection models must be capable of recognizing these rare events without
overwhelming the system with false alerts.
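
To make the imbalance problem concrete, the sketch below computes accuracy, precision, recall and F1-score from confusion-matrix counts. The counts are made-up illustrative numbers (10,000 transactions, 100 of them fraudulent), not results from this project, and the function name classification_metrics is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when the model never predicts fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only: 10,000 transactions, 100 of them fraudulent.
# A model that labels every transaction "legitimate" catches no fraud at all.
always_legit = classification_metrics(tp=0, fp=0, fn=100, tn=9900)

# A hypothetical model that catches 80 of the 100 frauds with 40 false alerts.
useful_model = classification_metrics(tp=80, fp=40, fn=20, tn=9860)

print(always_legit)
print(useful_model)
```

In this example the trivial model still reaches 99% accuracy while its recall is zero, which is why accuracy is usually read together with precision, recall and F1-score on data this skewed.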
Supervised learning techniques such as logistic regression, decision trees, and support
vector machines (SVM) have been extensively used. These models rely on labeled datasets
to train algorithms on detecting fraudulent patterns [4]. However, they face challenges
when confronted with novel fraud patterns, which necessitates continuous retraining and
updating of the models [3].
Unsupervised learning, particularly anomaly detection techniques, can complement
supervised models by identifying outliers in transaction behaviour, which may indicate
fraud. These models are beneficial in identifying new fraud patterns, as they do not rely on
historical labels [7]. However, they tend to have higher false-positive rates, as they can
flag legitimate transactions that deviate from the norm [11].
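
As a minimal sketch of label-free anomaly detection, the example below scores transaction amounts by their distance from the median, measured in units of the median absolute deviation (MAD). This is one simple outlier rule chosen for illustration, not the method used in the cited studies; the amounts, the function names and the 3.5 cut-off are assumptions.

```python
import statistics


def robust_z_scores(amounts):
    """Score each amount by its distance from the median, in MAD units."""
    median = statistics.median(amounts)
    mad = statistics.median([abs(a - median) for a in amounts])
    if mad == 0:
        return [0.0 for _ in amounts]
    # 0.6745 rescales the MAD so that scores are comparable to ordinary
    # z-scores when the data are roughly normally distributed.
    return [0.6745 * (a - median) / mad for a in amounts]


def flag_outliers(amounts, threshold=3.5):
    """Return the indices of amounts whose robust z-score exceeds threshold."""
    scores = robust_z_scores(amounts)
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical transaction amounts for one customer; no labels are used.
history = [42.0, 38.5, 51.0, 45.2, 40.0, 47.3, 39.9, 2500.0, 44.1, 43.0]
flagged = flag_outliers(history)
print(flagged)  # [7]
```

The flagged amount of 2500.0 may equally be a legitimate one-off purchase, which illustrates why such detectors tend to raise more false positives than supervised models.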
Hybrid approaches that combine supervised and unsupervised methods have emerged as a
promising solution. These models aim to leverage the strengths of both techniques,
reducing false positives while maintaining the ability to detect new types of fraud. For
instance, combining clustering techniques with classification algorithms has been shown
to improve detection accuracy [9].


Real-time fraud detection has gained significant attention, especially with the
development of APIs that integrate machine learning models for immediate transaction
analysis. These systems allow banks and financial institutions to act swiftly in preventing
fraudulent activities [7]. Despite these advances, challenges remain in scaling these
systems for large volumes of data and addressing the computational demands of
continuous monitoring [13].

In conclusion, machine learning has revolutionized fraud detection, providing scalable
and adaptable solutions. However, challenges such as class imbalance, false positives,
and evolving fraud patterns require ongoing research and optimization [9]. Future work
could focus on improving real-time detection capabilities and developing more robust
hybrid models to handle complex fraud scenarios.

3. Motivation and Objectives


3.1 Motivation

• Increasing Online Fraud Incidents: With the rise of digital transactions and
e-commerce, online fraud, especially in credit card transactions, has surged
globally. As financial institutions and consumers face massive financial losses,
there is a growing need to improve fraud detection systems [10].
• Limitations of Traditional Systems: Conventional rule-based fraud detection
systems often fail to adapt to new and evolving fraud patterns. These systems are
rigid, yielding high false positives and negatives, thus necessitating more
intelligent and adaptive models [14].
• Advancements in Machine Learning: Machine learning offers the potential to
overcome the shortcomings of traditional systems by automating the detection of
complex patterns and anomalies in real-time, improving both the speed and
accuracy of fraud detection [10].
• Real-Time Monitoring Need: Financial fraud can cause devastating losses within
seconds. The motivation lies in developing real-time detection systems that allow
instant identification and prevention of fraudulent activities [11].


3.2 Objectives of the Proposed Project


• Develop a Real-Time Fraud Detection System: Implement a machine learning-based
system that can monitor and detect fraudulent transactions in real time,
helping financial institutions prevent losses more effectively.
• Enhance Detection Accuracy: Leverage supervised and unsupervised learning
techniques, or hybrid approaches, to increase the accuracy of fraud detection
while reducing false positives and false negatives [14].
• Address Class Imbalance in Datasets: Utilize techniques such as Synthetic
Minority Oversampling Technique (SMOTE) and under-sampling to manage the
highly skewed distribution of fraud data, ensuring the model can effectively
detect the minority fraud cases.
• Adapt to Evolving Fraud Patterns: Design an adaptable system that can learn
from new types of fraud and detect novel fraudulent behaviours without frequent
retraining [5].
• Ensure Scalability: Ensure that the fraud detection system can handle large
volumes of data and transactions, supporting scalability in real-world financial
environments.
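
The class-imbalance objective above names SMOTE and under-sampling. The sketch below illustrates the idea behind both in plain Python: synthetic fraud points are created by interpolating between a minority sample and one of its nearest minority neighbours, and the majority class is randomly thinned. It is a simplified illustration under assumed toy data, not a full SMOTE implementation; the function names smote_like and random_undersample are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours by Euclidean distance, excluding the base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class."""
    return random.Random(seed).sample(majority, n_keep)


# Toy two-feature data (hypothetical values): 4 fraud rows, 40 legitimate rows.
fraud = [(9.0, 1.0), (9.5, 1.2), (8.7, 0.9), (9.2, 1.4)]
legit = [(float(i % 7), float(i % 5)) for i in range(40)]

fraud_balanced = fraud + smote_like(fraud, n_new=16)
legit_balanced = random_undersample(legit, n_keep=20)
print(len(fraud_balanced), len(legit_balanced))  # 20 20
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class ratio.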

4. Architecture of the Proposed System

This system is designed to detect fraudulent activities in an online environment using
machine learning algorithms. The process can be broken down into several key stages, as
shown in the diagram:

1. Data Input (CSV Data):
The system starts with the collection of data from various sources, usually stored
in CSV format. This raw data consists of user activity, transaction history, and
other relevant information needed for fraud detection.
2. Data Preprocessing:
The collected data is often unclean, containing missing values, redundant
information, or noise. Therefore, before feeding it into the model, the data
undergoes preprocessing steps. This involves:
o Handling missing data
o Data normalization or scaling
o Encoding categorical features
o Splitting the data into training and test sets

This ensures the data is clean, consistent, and ready for the next stages.

3. Training and Testing:
The pre-processed data is divided into two sets:
o Train Data: Used to train the machine learning model. This dataset
includes known outcomes (whether a transaction is fraudulent or not)
which helps the model learn patterns and associations.
o Test Data: Used to evaluate the model's performance on unseen data. It
helps in testing the model's generalization ability and accuracy in
predicting fraud.
4. Algorithm (Machine Learning Model):
The heart of the fraud detection system is the machine learning algorithm. Various
algorithms can be used for this purpose, such as:
o Decision Trees
o Random Forest
o Support Vector Machines (SVM)
o Neural Networks
o Gradient Boosting
The model learns patterns of fraudulent and non-fraudulent behaviour from the
training data. After training, the algorithm generates a predictive model that can
detect anomalies in user behaviour or transactions.
5. Evaluation:
Once the model is trained, it is evaluated using metrics like:
o Accuracy
o Precision
o Recall
o F1-Score
This step helps assess how well the model can distinguish between genuine and
fraudulent transactions. Based on the evaluation, fine-tuning of the model may
occur to improve its performance.
6. Prediction:
After evaluation, the trained model is deployed to make predictions in real-time.
When new data is input into the system, the model processes it and predicts
whether a transaction is likely to be fraudulent or not.
7. User Interface (UI):
The predictions are displayed on a user-friendly interface where users or
administrators can monitor the results. The UI can provide insights, such as the
probability of fraud, flagged transactions, and the reasons behind the model's
decision.
8. Feedback Loop:
The system continuously improves over time. User feedback and new data are fed
back into the system to retrain and update the model, making it more robust and
accurate at detecting new types of fraud.
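
The preprocessing and train/test stages described above can be sketched in plain Python as follows. This is a minimal illustration under assumptions: the field names (type, amount, is_fraud), the generated rows and the helper names are hypothetical and do not reflect the project's actual schema or code.

```python
import random
from statistics import median


def preprocess(rows):
    """Impute missing amounts, min-max scale them, one-hot encode the type."""
    known = [r["amount"] for r in rows if r["amount"] is not None]
    fill = median(known)            # simple imputation for missing amounts
    lo, hi = min(known), max(known)
    types = sorted({r["type"] for r in rows})
    features, labels = [], []
    for r in rows:
        amount = fill if r["amount"] is None else r["amount"]
        # For simplicity the scaling statistics use all rows; fitting them on
        # the training split only would avoid leaking test information.
        scaled = (amount - lo) / (hi - lo) if hi > lo else 0.0
        one_hot = [1.0 if r["type"] == t else 0.0 for t in types]
        features.append([scaled] + one_hot)
        labels.append(r["is_fraud"])
    return features, labels


def stratified_split(features, labels, test_fraction=0.25, seed=0):
    """Split so that each class keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    train_ids, test_ids = [], []
    for cls in sorted(set(labels)):
        ids = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(ids)
        n_test = max(1, round(len(ids) * test_fraction))
        test_ids += ids[:n_test]
        train_ids += ids[n_test:]

    def pick(ids):
        return [features[i] for i in ids], [labels[i] for i in ids]

    return pick(train_ids), pick(test_ids)


# Hypothetical rows as they might be read from a CSV file (None = missing).
rows = [
    {"type": "PAYMENT" if i % 2 else "TRANSFER",
     "amount": None if i == 3 else 10.0 * (i + 1),
     "is_fraud": 1 if i % 5 == 0 else 0}
    for i in range(20)
]
X, y = preprocess(rows)
(X_train, y_train), (X_test, y_test) = stratified_split(X, y)
print(len(X_train), len(X_test), sum(y_train), sum(y_test))  # 15 5 3 1
```

Splitting per class keeps at least one fraud case in the test set, which a purely random split cannot promise when fraud is rare.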


Figure 1 FLOW CHART

Figure 2 BLOCK DIAGRAM OF Fraud Detection


5. Algorithm Description
• RANDOM FOREST
The Random Forest algorithm is used as a reliable and efficient tool for distinguishing
between authentic and fraudulent transactions. Random Forest builds many decision
trees, each trained on a different random subset of the available data and features.
Together, these trees form an ensemble in which every tree contributes to the final
classification.[5] Because it learns patterns and relationships within the transaction data
during training, the algorithm can recognize suspicious activity from multiple transaction
variables such as transaction amount, frequency, location, and user behavior. One of its
main advantages for fraud detection is its capacity to handle imbalanced transaction
datasets, where fraudulent cases are greatly outnumbered by genuine ones.
Random Forest can address this class disparity through strategies such as class weighting
or adjusting the decision threshold, which improves its ability to detect fraudulent
transactions.[7] Moreover, Random Forest models are a good fit for the high-dimensional
data frequently encountered in fraud detection applications. Their ability to score large
volumes of transaction data quickly makes them well suited to online fraud detection
systems that need to identify suspicious activity promptly. Furthermore, Random Forests
tend to perform dependably even in dynamic transaction contexts, since they are
relatively resistant to overfitting and to noise in the data.
Random Forest also offers a degree of interpretability. By examining the feature
importance scores produced during model training, fraud analysts can learn which
transaction attributes have the greatest impact on the identification of fraudulent
activity.[12]
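
As an illustration of the class-weighting strategy mentioned above, the sketch below computes "balanced" class weights, where each class is weighted by n_samples / (n_classes * class_count). This is the same heuristic that scikit-learn applies for class_weight="balanced"; the label counts and the helper name balanced_class_weights are assumptions made for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Illustrative label vector: 9,900 legitimate (0) and 100 fraudulent (1) rows.
labels = [0] * 9900 + [1] * 100
weights = balanced_class_weights(labels)
print(weights)
```

With these counts the fraud class receives a weight of 50.0 and the legitimate class about 0.505, so each fraudulent row counts roughly 99 times as much as a legitimate one when the trees are grown.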
• GRADIENT BOOSTING
Gradient Boosting is an effective and popular method for spotting fraud. In contrast to
algorithms that train a single model, Gradient Boosting builds an ensemble of decision
trees sequentially, with each new tree trying to correct the mistakes of the trees before
it.[3] By concentrating on the cases that were poorly predicted in earlier iterations,
this iterative procedure enables the algorithm to progressively increase its predictive
accuracy. In the context of online transaction fraud detection, such methods can be
effective on imbalanced datasets, which contain far more genuine transactions than
fraudulent ones. The method also prioritizes the identification of fraudulent
transactions, improving overall performance by giving greater emphasis to
misclassified occurrences.[2]
These models are skilled in identifying intricate patterns and relationships in transaction
information, allowing them to distinguish minute distinctions between authentic and
fraudulent activity. Gradient Boosting can efficiently leverage features like transaction
amount, frequency, location, and user behavior to spot suspicious activities. Furthermore,
the flexibility that Gradient Boosting algorithms provide with regard to model complexity
and parameter adjustment enables fraud detection systems to adjust to shifting fraud
patterns and dynamic threats. This flexibility is essential in dynamic online contexts
because fraud strategies are ever-changing. Moreover, Gradient Boosting offers feature
importance insights that help fraud analysts determine which transaction attributes are
the most crucial for spotting fraudulent activity.[14]
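
The sequential error-correcting idea can be shown with a small from-scratch sketch. It assumes a squared-error loss, one-split trees (stumps) and a single toy feature, all chosen for brevity; production fraud classifiers would normally use a library implementation with a classification loss. All names and data here are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on x that best fits the residuals in the
    least-squares sense; return (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(xs))[:-1]:   # skip the largest x so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lv) ** 2 for r in left)
               + sum((r - rv) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Each stump is fitted to the residuals (the negative gradients of the
    squared error) left by the ensemble built so far."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if x <= t else rv
                                      for t, lv, rv in stumps)


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy 1-D data (hypothetical): the label is 1 for large "amounts", else 0.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
for rounds in (1, 5, 20):
    print(rounds, mse(boost(xs, ys, n_rounds=rounds), xs, ys))
```

The training error shrinks as rounds are added because every new stump is fitted only to what the earlier ones still get wrong.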

6. Design Specification/Dataset Description

Figure 3 First five rows of the dataset

Ten thousand transaction records were used as training data. Each record includes the type
of payment, amount, originating account name, old balance and new balance. One thousand
records were used for testing, and real-time data were generated as transactions occurred.
To detect fraudulent transactions efficiently, the real-time data are processed and compared
with the acquired dataset.[4]

The dataset also includes two key accounts involved in each transaction: nameOrig and
nameDest, which represent the originating and destination accounts, respectively.
Additionally, there are balance-related columns—oldbalanceOrg and newbalanceOrig
for the originating account, and oldbalanceDest and newbalanceDest for the destination
account—showing account balances before and after the transaction.[8]
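
A minimal sketch of how one record with these columns might be represented is given below. The balance and account field names follow the columns described above, while type and amount are assumed column names; the balance-gap check is a hypothetical engineered feature added for illustration, not part of the described system.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # `type` and `amount` are assumed column names for the payment type and
    # amount; the remaining names follow the dataset description above.
    type: str
    amount: float
    nameOrig: str
    oldbalanceOrg: float
    newbalanceOrig: float
    nameDest: str
    oldbalanceDest: float
    newbalanceDest: float


def origin_balance_gap(tx: Transaction) -> float:
    """Difference between the expected and the recorded new origin balance.

    Assumes an outgoing transaction, where the origin balance should fall by
    exactly `amount`. A non-zero gap is a candidate feature, not proof of fraud.
    """
    return (tx.oldbalanceOrg - tx.amount) - tx.newbalanceOrig


# Hypothetical rows for illustration only.
consistent = Transaction("PAYMENT", 100.0, "C1", 500.0, 400.0, "M1", 0.0, 0.0)
suspicious = Transaction("TRANSFER", 300.0, "C2", 300.0, 300.0, "C3", 50.0, 50.0)
print(origin_balance_gap(consistent), origin_balance_gap(suspicious))  # 0.0 -300.0
```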


7. Expected Outcomes
The primary expected outcome of this project is the development of a real-time fraud
detection system that leverages machine learning algorithms to classify and detect
fraudulent transactions in online payment systems. The system is expected to achieve the
following:

1. Real-time Fraud Detection: The system will be capable of analysing transactions
in real-time, identifying potentially fraudulent activities instantly, and alerting
administrators or blocking transactions before any damage occurs.

2. High Detection Accuracy: The system will compare various machine learning
algorithms (like Decision Trees, Random Forest, SVM, etc.) to choose the most
accurate model, improving fraud detection rates while minimizing false positives
and false negatives.

3. Adaptive to Evolving Fraud Patterns: The system will incorporate machine
learning techniques to adapt to new and evolving fraud tactics, ensuring that it can
detect novel types of fraudulent activities without frequent manual updates or
retraining.

4. Reduced False Positives: Through the use of supervised and unsupervised
learning models, the system will aim to reduce the number of legitimate
transactions incorrectly flagged as fraudulent, improving the user experience for
genuine customers.

5. Handling Imbalanced Datasets: By implementing techniques like SMOTE and
under-sampling, the system will effectively manage imbalanced datasets where
fraudulent transactions make up a small portion, ensuring better detection of rare
fraud events.

6. Scalability: The system will be designed to handle large volumes of transactions
and can be deployed in high-traffic environments like e-commerce platforms,
payment gateways, and financial institutions.

7. Continuous Learning and Improvement: A feedback loop will be implemented
to continuously update the model based on new data and user feedback, making the
system more effective at detecting emerging fraud trends over time.

8. Improved Fraud Monitoring: The system will provide a user-friendly dashboard
or interface that displays flagged transactions, fraud probability scores, and the
reasoning behind the system's decisions, helping financial institutions to monitor
transactions efficiently.

9. Prevention of Financial Loss: With the timely detection and prevention of
fraudulent transactions, the system will help mitigate significant financial losses
for businesses and consumers.

10. Automated Detection Process: The system will automate the entire fraud
detection process, from data input to fraud identification, reducing the need for
human intervention and making the process more efficient.

11. Integration with Existing Systems: The fraud detection model can be integrated
into existing online payment systems via APIs, providing seamless real-time
protection without disrupting current operations.

12. Robust Evaluation Metrics: The system will be evaluated using key metrics like
accuracy, precision, recall, and F1-score to ensure it performs well under real-
world conditions and can distinguish between legitimate and fraudulent
transactions effectively.

13. Cost-Effective Solution: By automating the fraud detection process with machine
learning models, the system reduces the need for manual checks and minimizes
the operational costs for businesses.

14. Regulatory Compliance: The system will support compliance with financial
regulations related to fraud prevention, ensuring businesses stay within legal
frameworks while offering secure transaction processing.

15. Enhanced Customer Trust: As a result of improved fraud detection and
prevention, the system will help build trust between financial institutions and their
customers, as they can rely on a secure, fraud-resistant transaction environment.

These outcomes will help significantly improve the monitoring, detection, and response
capabilities of financial institutions against online payment fraud.


References
[1]. Xuan, S., Liu, G., Li, Z., Zheng, L., Wang, S., & Jiang, C. (2018). Random Forest for Credit Card
Fraud Detection. Department of Computer Science, Tongji University, Shanghai, China.
[2]. Wedge, R., Kanter, J. M., Veeramachaneni, K., Rubio, S. M., & Perez, S. I. (2019). Solving the
“False Positives” Problem in Fraud Prediction. Data to AI Lab, MIT, Cambridge, MA & Banco
Bilbao Vizcaya Argentaria (BBVA), Madrid, Spain.
[3]. Sahin, Y., & Duman, E. (2016). Support Vector Machines and Malware Detection. Journal of
Computer Virology and Hacking Techniques, 12(3), 45-59.
[4]. Sorournejad, S., Zojaji, Z., Ebrahimi Atani, R., & Monadjemi, A. H. (2020). A Survey of Credit
Card Fraud Detection Techniques: Data and Technique Oriented Perspective. Department of
Computer Engineering, University of Guilan, Rasht, Iran.
[5]. Oladimeji Kazeem (2023). Fraud Detection Using Machine Learning. Thesis, University of
Stirling, September 2023.
[6]. Barker, K. J., D’Amato, J., & Sheridon, P. (2020). Credit Card Fraud: Awareness and Prevention.
College of Business, University of South Florida, St Petersburg, Florida, USA.
[7]. Thennakoon, A., Bhagyani, C., Premadasa, S., Mihiranga, S., & Kuruwitaarachchi, N. (2021).
Real-Time Credit Card Fraud Detection Using Machine Learning. Faculty of Computing, Sri
Lanka Institute of Information Technology, Colombo, Sri Lanka.
[8]. Sahin, Y., & Duman, E. (2015). Detecting Credit Card Fraud by ANN and Logistic Regression.
Marmara University, Istanbul, Turkey & Dogus University, Istanbul, Turkey.
[9]. Deepika, T., & Manimekalai, S. (2022). A Novel Method to Find Credit Card Counterfeit
Detection Using K-Means Algorithm. Journal of Algebraic Statistics, 13(2), 1125-1130.
[10]. Kadam, K. D., Omanna, M. R., Neje, S. S., & Nandai, S. S. (2023). Online Transactions Fraud
Detection Using Machine Learning. Sharad Institute of Technology College of Engineering,
Ichalkaranji.
[11]. Siddaiah, U., Anjaneyulu, P., & Ramesh, M. (2023). Fraud Detection in Online Payments Using
Machine Learning Techniques. Department of Information Technology, Velagapudi Ramakrishna
Siddhartha Engineering College, Vijayawada, India.
[12]. Ranjit, K. N., Vernekar, B. R., Chandana, M. R., Spandana, M. P., & Bachwar, M. (2024).
Online Transaction Fraud Detection Using Machine Learning. International Research
Journal of Engineering and Technology (IRJET), 11(4), 2499-2504.
[13]. Chen, H., & Chen, L. (2023). An Application of XGBoost Algorithm for Online Transaction Fraud
Detection Based on Improved Sailfish Optimizer. School of Computer Science, Hubei University
of Technology, Wuhan, China.
[14]. Anonymous. (2023). Payments Fraud Detection Using ML Methods: Exploring Performance,
Ethical and Real-World Considerations in Machine Learning-Based Fraud Detection for Secure
Payments. Conference Paper, September 2023.



Remarks/Comments to be Filled by the Department Project
Coordination Committee (DPCC)
Guide’s Remarks

Name of the Guide Signature

Panel Head Remarks

Name and Signature of the Panel Head
