
Outlier Detection in Financial Transactions Using Autoencoders

Vineeth Reddy GUDA


Overview of the Dataset

Context:
• Financial transaction datasets are scarce, especially in emerging domains like mobile money.
• A synthetic dataset generated with the PaySim simulator addresses this gap.
Content:
• PaySim simulates mobile money transactions based on real transaction logs from a mobile money service in an African country.
• The dataset is scaled down for Kaggle use while preserving the original transaction patterns.
• Fraudulent transactions that are detected are cancelled in the simulation, so these records require careful handling during analysis.
Solution Overview

Utilizing Autoencoders for Anomaly Detection:
• Introduces a machine learning approach to fraud detection.
• Autoencoders are used for their ability to detect anomalies in transaction data.
Understanding Autoencoders:
• Autoencoders consist of encoder and decoder layers.
• The encoder compresses the input data, while the decoder reconstructs it.
Detection via Reconstruction Error:
• Anomalies are identified based on reconstruction error.
• A higher error indicates potentially fraudulent activity.
Learning to Distinguish Normal vs. Fraudulent Transactions:
• The autoencoder learns patterns that differentiate normal from fraudulent transactions.
• By training on diverse data, the model becomes adept at identifying anomalies.
Libraries
numpy and pandas: Data manipulation and analysis.

keras: Building and training neural networks.

matplotlib and seaborn: Plotting and visualization.

pickle: Serialization of Python objects.

scikit-learn: Evaluation metrics for model performance.

keras.backend: Access to TensorFlow backend operations.
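A representative import cell consistent with this list; the aliases (np, pd, plt, sns) and the specific keras and sklearn symbols pulled in are assumptions, not shown in the original slides.

```python
import numpy as np                    # numerical arrays and math
import pandas as pd                   # tabular data manipulation
import pickle                         # serialization of Python objects

import matplotlib.pyplot as plt      # plotting
import seaborn as sns                # statistical visualization

from keras.models import Model
from keras.layers import Input, Dense
from keras.callbacks import ModelCheckpoint, TensorBoard
from keras import backend as K      # access to backend tensor operations

from sklearn.metrics import confusion_matrix, precision_recall_curve
```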
Data Preprocessing
Fraud Distribution:
• The majority (99.87%) of transactions are non-fraudulent; only 0.13% are fraudulent.
• This indicates significant class imbalance, posing a challenge for model training.
Preprocessing Steps:
• Log transformation: applied to numerical features to handle skewness and improve the distribution.
• Scaling: numerical features normalized to the range [0, 1] for uniformity.
• One-hot encoding: the categorical 'type' variable converted to numerical format for model compatibility.
Sample Preprocessed Data:
• Sample rows of the preprocessed data show the transformed features and encoded variables (see the sketch below).
• The data is now ready for model training and evaluation, enhancing interpretability and model performance.
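A minimal sketch of the class-balance check and the three preprocessing steps above. The column names follow the public PaySim schema ('amount', 'oldbalanceOrg', 'type', 'isFraud', etc.) and the DataFrame name df is assumed; the deck's exact 9-feature selection is not shown.

```python
# Class-balance check: ~99.87% normal vs ~0.13% fraud.
print(df['isFraud'].value_counts(normalize=True))

num_cols = ['amount', 'oldbalanceOrg', 'newbalanceOrig',
            'oldbalanceDest', 'newbalanceDest']

# Log transformation: log1p handles zero-valued amounts/balances safely.
df[num_cols] = np.log1p(df[num_cols])

# Scaling: normalize numerical features to the [0, 1] range.
df[num_cols] = (df[num_cols] - df[num_cols].min()) / \
               (df[num_cols].max() - df[num_cols].min())

# One-hot encoding of the categorical 'type' column.
df = pd.get_dummies(df, columns=['type'])
```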
Autoencoder Model Architecture
Model Architecture:
• Input layer with 9 features representing transaction attributes.
• Encoder layer with 16 neurons, followed by a code layer with 8 neurons, compressing the input into a lower-dimensional representation.
• Decoder layer with 16 neurons reconstructs the compressed representation.
• Output layer reconstructs the original 9 features using sigmoid activation.
Parameter Summary:
• Total trainable parameters: 593.
• Parameters comprise the weights and biases of each layer in the autoencoder architecture.
• No non-trainable parameters are present.
Model Summary:
• Provides an overview of the layers, their output shapes, and the total number of parameters (a sketch of the architecture follows below).
• Highlights the simplicity and efficiency of the autoencoder architecture for fraud detection tasks.
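A sketch of the 9-16-8-16-9 architecture described above, using the Keras functional API. The hidden-layer activations are not stated on the slides, so ReLU is assumed; the sigmoid output layer and the 593-parameter count match the slide.

```python
input_dim = 9  # transaction attributes after preprocessing

input_layer = Input(shape=(input_dim,))
encoded = Dense(16, activation='relu')(input_layer)             # encoder layer
code = Dense(8, activation='relu')(encoded)                     # bottleneck code
decoded = Dense(16, activation='relu')(code)                    # decoder layer
output_layer = Dense(input_dim, activation='sigmoid')(decoded)  # reconstruction

autoencoder = Model(inputs=input_layer, outputs=output_layer)
autoencoder.summary()  # reports 593 trainable parameters
```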
Training the Autoencoder Model
Model Compilation:
• Loss function: binary cross-entropy, used as the reconstruction loss since all features are scaled to the [0, 1] range.
• Optimizer: Adam, known for its efficiency and adaptability.
• Metrics: accuracy is tracked during training for evaluation purposes.
Callbacks:
• ModelCheckpoint: saves the best-performing model weights during training to the "autoencoder_fraud.h5" file.
• TensorBoard: logs training progress for visualization in the './logs' directory.
Training Process:
• The dataset consists of 6,362,620 samples, with 20% used for validation (1,272,524 samples).
• The model is trained for 10 epochs with a batch size of 128, shuffled before each epoch.
• Each epoch reports loss and accuracy metrics for both the training and validation sets.
• Training time: approximately 156 seconds per epoch.
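A sketch of the compile-and-fit step matching the settings above; the variable name train_x for the preprocessed feature matrix is an assumption.

```python
autoencoder.compile(optimizer='adam',
                    loss='binary_crossentropy',
                    metrics=['accuracy'])

checkpoint = ModelCheckpoint('autoencoder_fraud.h5', save_best_only=True)
tensorboard = TensorBoard(log_dir='./logs')

# The autoencoder learns to reconstruct its own input, so train_x is
# both the input and the target.
history = autoencoder.fit(train_x, train_x,
                          epochs=10,
                          batch_size=128,
                          shuffle=True,
                          validation_split=0.2,   # 1,272,524 samples held out
                          callbacks=[checkpoint, tensorboard])
```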
Model Evaluation:
• Loss Curves:
  • Plot training and validation loss against epochs to visualize learning progress (see the sketch below).
• Interpretation:
  • Training loss decreases steadily, indicating improved reconstruction of normal transactions.
  • Validation loss follows a similar trend, indicating good generalization without overfitting.
• Convergence Analysis:
  • Both curves converge towards the end of the 10 epochs, indicating stable model performance.
• Model Performance:
  • Close alignment of the curves demonstrates effective learning and supports accurate identification of fraudulent activity.
  • The overall decreasing trend signifies the effectiveness of the autoencoder architecture for fraud detection.
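The loss curves can be plotted directly from the History object returned by fit(); a minimal sketch:

```python
# Training vs. validation loss across the 10 epochs.
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss (binary cross-entropy)')
plt.legend()
plt.show()
```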
Reconstruction Error Analysis
Reconstruction Error Calculation:
• The autoencoder predicts reconstructed transactions (train_x_predictions).
• The mean squared error (MSE) between the original and reconstructed transactions is calculated.
• An error DataFrame is created with the MSE and the true class labels (see the sketch below).
Error Distribution:
• Mean reconstruction error: 8.35e-06, indicating overall low error on normal transactions.
• Standard deviation: 6.43e-05, showing variability in reconstruction errors.
• Minimum error: 1.14e-11, representing accurately reconstructed transactions.
• Maximum error: 7.68e-02, indicating significant deviation from normal behavior.
Interpretation:
• The majority of transactions have low reconstruction error, reflecting accurate reconstruction of normal transactions.
• Higher errors may indicate anomalies or potentially fraudulent activity.
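A sketch of the error calculation. The name train_x_predictions appears on the slide; train_y (the true isFraud labels) and the DataFrame column names are assumptions.

```python
# Per-transaction reconstruction error: mean squared difference between
# the original features and the autoencoder's reconstruction.
train_x_predictions = autoencoder.predict(train_x)
mse = np.mean(np.power(train_x - train_x_predictions, 2), axis=1)

error_df = pd.DataFrame({'Reconstruction_error': mse,
                         'True_class': train_y})
print(error_df['Reconstruction_error'].describe())  # mean, std, min, max
```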
Precision-Recall Analysis
Precision-Recall Curve:
• Visualizes the trade-off between precision and recall for various threshold values.
• Precision: fraction of true positive predictions among all positive predictions.
• Recall: fraction of true positive predictions among all actual positive instances.
• Demonstrates how precision and recall change with threshold adjustments.
Interpretation:
• Lowering the threshold increases recall (detecting more fraudulent transactions) but decreases precision (increasing false positives).
• The precision-recall curve aids in selecting an optimal threshold for practical deployment of the fraud detection model.
Precision and Recall vs. Threshold:
• Plots precision and recall against different threshold values (see the sketch below).
• Highlights the trade-off between precision and recall as the threshold varies.
• The intersection of the curves assists in selecting a threshold that balances the two metrics.
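A sketch of the threshold plot using scikit-learn's precision_recall_curve on the reconstruction errors; it assumes the error_df built earlier.

```python
precision, recall, thresholds = precision_recall_curve(
    error_df['True_class'], error_df['Reconstruction_error'])

# precision and recall have one more entry than thresholds, hence [1:].
plt.plot(thresholds, precision[1:], label='Precision')
plt.plot(thresholds, recall[1:], label='Recall')
plt.xlabel('Reconstruction error threshold')
plt.ylabel('Score')
plt.legend()
plt.show()
```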
Confusion Matrix Analysis

Confusion Matrix:
• Provides a visual representation of model predictions compared to the actual class labels (see the sketch below).
Interpretation:
• The majority of normal transactions are correctly identified (6,354,406 true negatives).
• The presence of false negatives (8,213) indicates potentially missed fraudulent transactions.
• A single false positive is observed (a normal transaction misclassified as fraudulent).
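A sketch of computing and plotting the confusion matrix at a chosen error threshold; the threshold value 0.001 is illustrative, not taken from the slides, and would be picked from the precision-recall analysis.

```python
threshold = 0.001  # illustrative; select from the precision-recall plot
pred_y = (error_df['Reconstruction_error'] > threshold).astype(int)

cm = confusion_matrix(error_df['True_class'], pred_y)
sns.heatmap(cm, annot=True, fmt='d',
            xticklabels=['Normal', 'Fraud'],
            yticklabels=['Normal', 'Fraud'])
plt.xlabel('Predicted class')
plt.ylabel('True class')
plt.show()
```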
Conclusion

Key Findings:
• ML techniques, particularly the autoencoder model, show promise in fraud detection.
• Achieved commendable accuracy, precision, and recall rates.
Implications:
• ML-driven fraud detection offers cost savings and enhances customer trust.
• Potential for further optimization and exploration of advanced algorithms.
Future Directions:
• Continued refinement and exploration of ML algorithms.
• Application of findings to real-world fraud detection systems.
Thank you…

Vineeth Reddy GUDA
