Credit Card Fraud Detection Using Machine Learning
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - With the rise in online transactions and digital banking, credit card fraud has emerged as a critical threat to financial institutions and customers. Traditional rule-based systems fail to adapt to evolving fraud patterns, resulting in poor detection rates and financial losses. This project presents a credit card fraud detection system using machine learning algorithms: K-Nearest Neighbors (KNN), Logistic Regression, Support Vector Machine (SVM), and Decision Tree. We utilized a publicly available dataset of real credit card transactions, applied preprocessing and balancing techniques to address class imbalance, and trained each model on the transformed data. Model performance was evaluated using accuracy, precision, recall, and F1-score. The Decision Tree and SVM classifiers demonstrated high recall values, suitable for minimizing false negatives in fraud detection. This research contributes to financial fraud prevention by implementing efficient ML techniques to detect fraudulent behavior in real time.

Keywords: Credit card fraud, machine learning, KNN, SVM, Logistic Regression, Decision Tree, anomaly detection, imbalanced data

Common forms of credit card fraud include:

Lost/Stolen Card Fraud: Physical possession of the card is misused.

Phishing and Social Engineering: Users are tricked into revealing sensitive information.

Skimming: Card information is captured using devices at ATMs or terminals.

While many fraud detection systems are already in place, traditional rule-based approaches are insufficient. They operate on pre-defined rules (e.g., "flag all transactions over $5,000 from foreign IPs") which:

Fail to adapt to evolving fraud patterns.

Often result in high false positive rates.

Struggle with scalability and non-linear patterns.

Furthermore, the class imbalance problem poses a substantial challenge: fraudulent transactions are often <1% of total data. Standard classifiers become biased toward the majority (legitimate transactions), leading to high accuracy but poor fraud detection.
Specific Objectives:

To develop, evaluate, and deploy a machine learning-based credit card fraud detection system that is scalable, explainable, and capable of real-time fraud prediction.

Tabulated and visual comparison of models across key metrics.

Confusion matrix and ROC/PR curves for in-depth analysis.

6. Ethical and Legal Framework

A checklist of compliance with GDPR, GLBA, DPDPA, and other relevant laws.

Bias mitigation strategies and fairness metrics.
1.5 Scope of the Project
The scope includes:

Only binary classification (fraud vs genuine).

Use of supervised learning techniques.

Focus on classical ML algorithms (not deep learning).

Use of a publicly available Kaggle dataset.

Real-time prediction via REST API.

Exclusion of financial or legal liability from prediction outcomes.

Limitations:

The dataset is anonymized, limiting feature engineering.

The system architecture of a credit card fraud detection platform must be designed to process large volumes of transaction data rapidly and accurately while being capable of adapting to evolving fraud patterns. The proposed architecture for this project is structured into modular layers, each with a dedicated function to ensure efficiency, scalability, and maintainability.

2.1.1. Data Acquisition Layer

This layer is responsible for collecting transaction data in real-time or batch mode from multiple sources such as:

Bank transaction databases.

Third-party APIs.

Payment gateways.

The system must support both streaming data (real-time transactions) and historical data (for training and evaluation). This layer ensures that the data is captured securely and with minimal latency.

A data preprocessing layer transforms raw data into a clean, consistent, and structured format suitable for modeling. Its operations include:

Encoding: Although the original dataset contains numerical values (PCA-transformed), any additional categorical data can be one-hot encoded.

Data Balancing: Using SMOTE to address the class imbalance by generating synthetic examples of the minority (fraud) class.

The core layer is where the machine learning algorithms are applied. The models implemented in this project include:

K-Nearest Neighbors (KNN).

Logistic Regression.

Support Vector Machine (SVM).

Decision Tree Classifier.

Each model is trained on the preprocessed and balanced dataset. Cross-validation and hyperparameter tuning are applied to optimize performance. The layer also includes version control and model validation mechanisms.

2.1.4. Evaluation Layer

After training, models are evaluated based on multiple metrics:

Accuracy

Precision

A dedicated deployment layer provides dashboards (e.g., Grafana, Power BI) for fraud analysts; it ensures seamless integration with banking systems and enables real-time decision-making.

2.1.6. Monitoring and Feedback Layer

To maintain accuracy over time, models must adapt. This layer:

Monitors prediction accuracy and drift in transaction patterns.

Collects user feedback from analysts for continuous learning.

This feedback loop transforms the system into a self-improving fraud detection platform.
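As an illustration of how such a deployment layer could expose a trained model for real-time scoring over a REST API, the sketch below uses Flask and a serialized model. The file name fraud_model.joblib, the /predict route, and the 0.5 flagging threshold are assumptions for illustration, not details specified in this project.

```python
# Illustrative sketch only: a minimal REST endpoint for real-time scoring.
# Assumes a trained model has been saved as "fraud_model.joblib" during training;
# the artifact name and route are hypothetical.
import joblib
import numpy as np
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("fraud_model.joblib")  # assumed output of the model layer

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                          # e.g. {"features": [0.1, -1.3, ...]}
    features = np.array(payload["features"]).reshape(1, -1)
    proba = float(model.predict_proba(features)[0, 1])    # estimated fraud probability
    return jsonify({"fraud_probability": proba, "flagged": proba >= 0.5})

if __name__ == "__main__":
    app.run(port=5000)
```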
3. METHODOLOGY

This chapter describes the step-by-step methodology followed for developing, training, and evaluating the credit card fraud detection models. It includes dataset details, preprocessing strategies, model selection, hyperparameter tuning, and evaluation criteria.

3.1 Dataset Description

The dataset used in this study is sourced from Kaggle's publicly available Credit Card Fraud Detection repository, originally made available by a European card issuer. This dataset has become a standard benchmark for evaluating fraud detection models due to its real-world origin and the challenges it presents, such as extreme class imbalance and anonymized features. Its fields include:

'Amount': The transaction amount in euros.

'Class': The target variable (0 = genuine, 1 = fraud).
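As a concrete starting point, the dataset can be loaded and its class balance inspected with pandas; the file name creditcard.csv is the usual Kaggle export name and is assumed here.

```python
# Sketch: load the Kaggle credit card fraud dataset and inspect class balance.
# Assumes the file has been downloaded locally as "creditcard.csv".
import pandas as pd

df = pd.read_csv("creditcard.csv")

print(df.shape)                                   # number of transactions and columns
print(df["Class"].value_counts())                 # 0 = genuine, 1 = fraud
print(df["Class"].value_counts(normalize=True))   # fraud fraction (well below 1%)
print(df["Amount"].describe())                    # distribution of transaction amounts
```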
3.2 Data Preprocessing

Preprocessing is a critical step in any machine learning project. It ensures that the data fed into algorithms is clean, consistent, and optimized for learning. For fraud detection, preprocessing must also tackle class imbalance and feature scale issues.

The original dataset contains 492 fraudulent transactions out of 284,807, making it highly imbalanced. A naïve model trained on this data may classify all transactions as genuine and still achieve 99.8% accuracy, yet such a model would be useless in practice.

To address this, SMOTE (Synthetic Minority Over-sampling Technique) creates synthetic fraud samples by:

Selecting a minority-class sample and identifying its nearest minority-class neighbors.

Generating a new sample along the line segment connecting the selected sample and its neighbors.

Advantages:

Reduces class imbalance without duplicating data.

SMOTE was applied only to the training set to avoid data leakage.

The raw, non-PCA features (such as 'Amount') are standardized; the transformation ensures these features have zero mean and unit variance.

Why not normalize all features? The PCA components already have unit variance due to the nature of dimensionality reduction. Re-scaling them could distort their meaning.

For supervised learning, it is essential to evaluate the model's performance on unseen data. Therefore, we split the dataset as follows:

Training Set: 70%

Testing Set: 30%

Stratification prevents the testing set from being dominated by the majority class.

Further Improvements:

For final evaluation, k-fold cross-validation was used during model training to ensure robustness and generalizability.
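A possible implementation of these preprocessing steps with scikit-learn and imbalanced-learn is sketched below. The random_state values are arbitrary, and only the raw 'Amount' column is rescaled, matching the reasoning above; this is a sketch rather than the exact pipeline used for the reported results.

```python
# Sketch of the preprocessing in Section 3.2: standardize 'Amount',
# perform a stratified 70/30 split, and apply SMOTE to the training set only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE

df = pd.read_csv("creditcard.csv")
X = df.drop(columns=["Class"])
y = df["Class"]

# Stratified split keeps the fraud ratio similar in the training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
X_train, X_test = X_train.copy(), X_test.copy()

# Standardize the raw 'Amount' feature; the PCA components are left unchanged.
scaler = StandardScaler()
X_train["Amount"] = scaler.fit_transform(X_train[["Amount"]]).ravel()
X_test["Amount"] = scaler.transform(X_test[["Amount"]]).ravel()

# Oversample the minority (fraud) class in the training data only, to avoid leakage.
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)
print("Before:", y_train.value_counts().to_dict())
print("After: ", pd.Series(y_train_bal).value_counts().to_dict())
```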
We selected four widely recognized classification algorithms to compare their effectiveness in detecting credit card fraud. Each has different strengths and computational characteristics.

Logistic Regression

Description: A linear model that estimates the probability that a given input belongs to the positive class using a logistic function. It is particularly suited for binary classification problems.

Solver: liblinear

Pros:

Works well on linearly separable data.

Cons:

Less effective with non-linear data.

Decision Tree Classifier

Description: A hierarchical structure that splits the data based on feature values, leading to decisions at the leaf nodes.

Hyperparameters:

Max Depth: None (splits until leaves are pure)

Pros:

Easy to interpret and visualize.

Cons:

Prone to overfitting.

Less stable (small data changes can cause large tree structure changes).
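The four classifiers and the k-fold validation described above could be set up as in the following sketch. Apart from the liblinear solver and the unrestricted tree depth, the settings shown (k = 5 neighbors, RBF kernel, 5 folds) are assumptions rather than reported hyperparameters; X_train_bal and y_train_bal refer to the balanced training data from the preprocessing sketch.

```python
# Sketch: instantiate the four models from Section 3.3 and estimate F1
# via stratified k-fold cross-validation on the balanced training data.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),                   # k is an assumption
    "Logistic Regression": LogisticRegression(solver="liblinear"),
    "SVM": SVC(kernel="rbf", probability=True),                   # kernel is an assumption
    "Decision Tree": DecisionTreeClassifier(max_depth=None, random_state=42),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for name, model in models.items():
    # Note: SVC training is slow on the full balanced dataset.
    scores = cross_val_score(model, X_train_bal, y_train_bal, cv=cv, scoring="f1")
    print(f"{name}: mean cross-validated F1 = {scores.mean():.3f}")
```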
Precision ensures that flagged transactions are indeed fraudulent.

Recall measures the proportion of actual fraudulent transactions that are detected.

F1-Score balances both.

AUC-ROC evaluates performance across thresholds.

3.4.1 Confusion Matrix

Used to visualize the counts of true positives, false positives, false negatives, and true negatives produced by each model.

The Precision-Recall Curve is especially useful in fraud detection because it focuses on the positive (fraud) class. A model that can achieve high recall without compromising much on precision is considered optimal.
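These metrics can be computed with scikit-learn as in the sketch below, which reuses the models and the held-out test split from the earlier sketches; it is illustrative rather than the exact evaluation code behind the reported results.

```python
# Sketch: evaluate a fitted model on the untouched test set using the
# metrics of Section 3.4 (confusion matrix, precision/recall/F1, ROC-AUC, PR-AUC).
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_auc_score, precision_recall_curve, auc)

model = models["Decision Tree"].fit(X_train_bal, y_train_bal)  # any model from the sketch above
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]

print(confusion_matrix(y_test, y_pred))             # rows: actual class, columns: predicted class
print(classification_report(y_test, y_pred, digits=3))
print("ROC-AUC:", roc_auc_score(y_test, y_score))

precision, recall, _ = precision_recall_curve(y_test, y_score)
print("PR-AUC:", auc(recall, precision))            # area under the precision-recall curve
```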
After preprocessing and training, the four selected machine learning models were evaluated on the test dataset using accuracy, precision, recall, F1-score, and AUC-ROC metrics. Each model exhibited different strengths and weaknesses, offering trade-offs in terms of detection ability, interpretability, and real-time feasibility.

Below is a summary table of performance metrics:

Model                  Accuracy   Precision   Recall   F1-Score
KNN                    94.1%      92.3%       90.8%    91.5%

Model                  Area Under Precision-Recall Curve (PR-AUC)
KNN                    0.942
Logistic Regression    0.963
SVM showed the highest recall, making it ideal for minimizing false negatives (missing actual frauds).

                  Predicted Genuine   Predicted Fraud
Actual Genuine    >83,000 (TN)        52 (FP)
Actual Fraud      21 (FN)             438 (TP)

True Positives (TP): 438 transactions were correctly flagged as fraud.

False Positives (FP): 52 genuine transactions were incorrectly flagged.

False Negatives (FN): 21 frauds were missed, which is relatively low.

True Negatives (TN): Over 83,000 genuine transactions were correctly classified.

Low FN is critical: Missing frauds can lead to large financial losses.

Moderate FP is manageable: False positives can be reviewed manually or verified through OTPs, ensuring minimal customer inconvenience.

Analysis:

Only 21 fraudulent transactions were missed out of 459 frauds.
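From these reported counts, the corresponding metrics follow directly: precision = TP / (TP + FP) = 438 / 490 ≈ 0.894, recall = TP / (TP + FN) = 438 / 459 ≈ 0.954, and F1 = 2 × precision × recall / (precision + recall) ≈ 0.923.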
4.4 Time and Resource Efficiency

4.5 Training Time

Model            Training Time (s)   Inference Time (avg per 1000 samples)
KNN              <5                  1.2 seconds
Logistic Reg.    <2                  0.01 seconds
SVM              ~30                 0.05 seconds
Decision Tree    <3                  0.01 seconds
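Timings of this kind can be gathered with a simple wall-clock measurement, as in the sketch below; absolute values depend on hardware and on the models and data splits defined in the earlier sketches.

```python
# Sketch: measure training time and average inference time per 1,000 samples,
# mirroring the table above (absolute numbers vary with hardware).
import time

for name, model in models.items():
    t0 = time.perf_counter()
    model.fit(X_train_bal, y_train_bal)
    train_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    model.predict(X_test)
    infer_time = (time.perf_counter() - t0) / len(X_test) * 1000  # seconds per 1,000 samples

    print(f"{name}: train {train_time:.2f}s, inference {infer_time:.3f}s per 1000 samples")
```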
4.6 Limitations Identified

Despite strong performance, some limitations were observed in this study:

1. Data Imbalance:

Even with SMOTE, some fraud patterns may remain underrepresented, especially rare or highly novel schemes.

2. Generalization to Real-World Systems:

The dataset lacks contextual features (e.g., IP address, location, device ID).

Models trained on static datasets may not generalize well unless continuously updated.

3. Evolving Fraud Patterns:

Fraud techniques evolve rapidly. Static models degrade in accuracy over time unless retrained frequently with updated data.

4. Scalability Issues:

KNN is slow at prediction time due to distance calculations.

SVM, although highly accurate, is computationally expensive for very large datasets.

The model evaluation reveals valuable insights into the practical applicability of each algorithm:

Support Vector Machine (SVM):

Best Overall Performance: Achieved highest recall and F1-score.

Use Case: Ideal for high-stakes fraud detection systems where minimizing false negatives is crucial.

Logistic Regression:

Best Interpretability: High precision and low latency.

Use Case: Suitable for deployment in financial institutions where decisions must be explainable and fast.

Decision Tree:

Fast and Transparent: Slight trade-off in precision.

Use Case: Great for rule-based augmentation or as part of ensemble models.

K-Nearest Neighbors:

Effective but Inefficient: High computational cost makes it less practical for real-time fraud detection.

Use Case: Academic benchmarks or systems with small datasets.

4.8 Recommendations Based on Results

Implement SVM with fallback logic to Logistic Regression if latency exceeds a threshold.
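One possible realization of this recommendation is sketched below: the system scores with SVM while tracking its recent latency and switches to Logistic Regression when the average exceeds a budget. The 50 ms budget, the window size, and the function name are illustrative assumptions, not prescribed values.

```python
# Sketch: use SVM as the primary scorer, but fall back to Logistic Regression
# when the SVM's recent average latency exceeds a budget (values are assumptions).
import time
from collections import deque

LATENCY_BUDGET_S = 0.05            # 50 ms per scoring call, illustrative
recent_latencies = deque(maxlen=100)

def score_transaction(features, svm_model, logreg_model):
    # If the SVM has recently been too slow on average, use the faster linear model.
    if recent_latencies and sum(recent_latencies) / len(recent_latencies) > LATENCY_BUDGET_S:
        return logreg_model.predict_proba(features)[:, 1]

    start = time.perf_counter()
    proba = svm_model.predict_proba(features)[:, 1]
    recent_latencies.append(time.perf_counter() - start)
    return proba
```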
Balancing data with SMOTE helps but must be done carefully.

Transparency and deployment readiness are essential.

Regulations guide building trustworthy AI.

5.5 Final Thoughts

Fraud detection is an ongoing battle. This project lays the foundation for strong machine learning solutions, but continuous improvements and ethical considerations are needed to keep systems effective and fair.