Machine Learning for Anomaly Detection
December 2024
Agenda
1. Understanding techniques, applications, and best practices
2. Case studies
3. Points to remember
AI vs ML?
Anomaly detection identifies suspicious activity that falls outside your established normal
patterns of behavior. An anomaly detection solution protects your system in real time from incidents
that could result in significant financial losses, data breaches, and other harmful events.
TYPES OF ANOMALIES
Point Anomalies
Data points significantly different from the majority (e.g., a sudden spike in network traffic).
Contextual Anomalies
Data points that are unusual only within a specific context (e.g., high temperature during winter).
Collective Anomalies
A collection of related data points that deviate as a group (e.g., a distributed denial-of-service
attack).
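The difference between point and contextual anomalies can be illustrated with a minimal sketch. It assumes a simple z-score rule, which is only one of many possible scoring methods; the function names, the threshold of 2.0, and the sample data are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def point_anomalies(values, threshold=2.0):
    """Indices of values whose z-score against the whole series exceeds threshold."""
    if len(values) < 2:
        return []  # a spread cannot be estimated from fewer than two values
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def contextual_anomalies(values, contexts, threshold=2.0):
    """Indices of values that are unusual relative to others sharing their context."""
    by_context = defaultdict(list)
    for i, ctx in enumerate(contexts):
        by_context[ctx].append(i)
    flagged = []
    for indices in by_context.values():
        local = [values[i] for i in indices]
        # Map positions within the context group back to global indices.
        flagged.extend(indices[j] for j in point_anomalies(local, threshold))
    return sorted(flagged)


# Hypothetical data: requests per minute with one sudden spike.
traffic = [100, 102, 98, 101, 99, 103, 97, 100, 500, 101]
print(point_anomalies(traffic))  # [8]

# Hypothetical data: temperatures in degrees Celsius, tagged by season.
temps = [1, 2, 0, 3, 1, 2, 1, 28] + [27, 29, 28, 30, 26, 28, 29, 27]
seasons = ["winter"] * 8 + ["summer"] * 8
print(point_anomalies(temps))                # [] (28 is unremarkable overall)
print(contextual_anomalies(temps, seasons))  # [7] (28 is unusual for winter)
```

The 28-degree reading is ordinary across the whole series but stands out once it is compared only with other winter readings, which is what makes it a contextual rather than a point anomaly.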
SUPERVISED ANOMALY DETECTION
• Supervised machine learning builds a predictive model using a labeled training set with normal
and anomalous samples.
• The most common supervised methods include Bayesian networks, k-nearest neighbors, decision
trees, supervised neural networks, and SVMs.
• The advantage of supervised models is that they may offer a higher rate of detection than
unsupervised methods.
UNSUPERVISED ANOMALY DETECTION
• Unsupervised methods do not demand manual labeling of training data. Instead, they operate on
the presumption that collections of frequent, similar instances are normal, and they flag
infrequent data groups as malicious.
• The most popular unsupervised anomaly detection algorithms include autoencoders, K-means, GMMs,
hypothesis-test-based analysis, and PCA.
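The unsupervised assumption (frequent, similar instances are normal; infrequent groups are suspicious) can be sketched with a tiny K-means implementation. This is an illustrative sketch only: the deterministic farthest-point initialisation, the `min_share` cutoff, the function names, and the sample points are assumptions made for the example, not part of any particular library.

```python
import math
from collections import Counter


def kmeans_labels(points, k, iterations=20):
    """Cluster points with plain K-means and return one cluster label per point."""
    # Deterministic farthest-point initialisation (an assumption made so the
    # example is reproducible; real implementations often use random seeding).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def flag_rare_groups(points, k=3, min_share=0.15):
    """Indices of points that fall in clusters holding less than min_share of the data."""
    labels = kmeans_labels(points, k)
    counts = Counter(labels)
    return [i for i, lab in enumerate(labels) if counts[lab] / len(points) < min_share]


# Hypothetical data: two dense groups of "normal" behaviour and one rare group.
cluster_a = [(i % 3, i // 3) for i in range(9)]
cluster_b = [(10 + x, 10 + y) for x, y in cluster_a]
rare = [(50, 50), (51, 49)]
points = cluster_a + cluster_b + rare

print(flag_rare_groups(points))  # [18, 19]
```

No labels are used anywhere: the two rare points are flagged purely because their cluster holds a small share of the data. The choice of `k` and `min_share` would need tuning on real data.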
SEMI-SUPERVISED ANOMALY DETECTION
• The most common semi-supervised methods include linear regression, outlier detection, and
graph-based methods.
• A semi-supervised anomaly detection algorithm might also work with a data set that is partially
flagged. It then builds a classification model on just that flagged subset of the data and uses that
model to predict the status of the remaining data.
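The partially flagged workflow described above can be sketched as follows. The nearest-centroid classifier is a stand-in chosen for brevity, not a method named in the slides; the function names, the use of `None` for unflagged samples, and the data are hypothetical.

```python
import math


def fit_centroids(samples, labels):
    """Compute one mean vector per class, using only the flagged (labelled) samples."""
    grouped = {}
    for x, y in zip(samples, labels):
        if y is not None:
            grouped.setdefault(y, []).append(x)
    return {
        y: tuple(sum(c) / len(xs) for c in zip(*xs)) for y, xs in grouped.items()
    }


def label_remaining(samples, labels):
    """Keep existing flags and predict a status for every unflagged sample."""
    centroids = fit_centroids(samples, labels)
    return [
        y if y is not None
        else min(centroids, key=lambda cls: math.dist(x, centroids[cls]))
        for x, y in zip(samples, labels)
    ]


# Hypothetical two-feature samples; None marks records nobody has flagged yet.
samples = [(1, 1), (2, 1), (1, 2), (9, 9), (10, 8), (2, 2), (9, 10), (1.5, 1.5)]
labels = ["normal", "normal", None, "anomaly", "anomaly", None, None, None]

print(label_remaining(samples, labels))
# ['normal', 'normal', 'normal', 'anomaly', 'anomaly', 'normal', 'anomaly', 'normal']
```

Only four of the eight records carry a flag, yet the model built on that subset assigns a status to the other four. In practice the flagged subset needs to be representative of both classes for this to work well.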
WHY USE MACHINE LEARNING FOR ANOMALY DETECTION?
Advantages of ML
Challenges
Key Inference
1. Dataset
2. Scale
3. Application requirements
PRACTICAL WORKFLOW FOR ANOMALY DETECTION
Workflow
Step 3: Model evaluation using key metrics such as the F1 score.
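As a minimal sketch of the evaluation step, the F1 score is the harmonic mean of precision and recall on the anomaly class. The function name, the 0/1 label encoding, and the sample labels below are assumptions for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class treated as the anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 marks an anomaly, 0 marks normal behaviour.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667

# A model that never raises an alert scores zero, however accurate it looks.
print(precision_recall_f1(y_true, [0] * len(y_true)))  # (0.0, 0.0, 0.0)
```

Because anomalies are rare, a model that labels everything as normal can still reach high accuracy; F1 exposes that failure because it depends only on how the anomaly class is handled.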