Lecture Notes _ Anomaly Detection in Time Series
Introduction to Anomaly Detection in Time Series
Notes
This is a supporting introductory notebook for my LinkedIn series on anomaly
detection.
REFERENCE LINKS for the LinkedIn Series on Time Series
Anomaly Detection in Time Series - Part 1 Level Shift
https://www.linkedin.com/pulse/anomaly-detection-time-series-part-1-level-shift-dr-anish-utcic/
Anomaly Detection Part 2 – Isolation Forest
https://www.linkedin.com/pulse/anomaly-detection-part-2-isolation-forest-roychowdhury-ph-d--qfyac/
Anomaly Detection Part 3 – Local Outlier Factor
https://www.linkedin.com/pulse/anomaly-detection-part-3-local-outlier-factor-roychowdhury-ph-d--khshc/
Anomaly Detection Part 4 - using Auto Encoders
https://www.linkedin.com/posts/activity-7285011475323109376-oD3A?utm_source=share&utm_medium=member_desktop
Section 1 : Background
What is Anomaly Detection
Anomaly detection is the process of identifying data points, patterns, or events that
significantly deviate from the normal pattern within a dataset. These outliers typically
represent rare or unexpected behaviours, making them critical in various domains
such as fraud detection, system performance monitoring, and predictive
maintenance. The primary goal of anomaly detection is to recognize unusual
occurrences that may indicate potential risks, failures, or opportunities, enabling
timely intervention or decision-making.
1) Point Anomalies:
A point anomaly occurs when a single data point deviates significantly from the rest
of the data.
Example:
In sensor data, a sudden spike in temperature could indicate a malfunction.
2) Contextual Anomalies (Seasonal or Context-Based):
A contextual anomaly occurs when a data point is anomalous within a specific
context but may appear normal in another. This type often occurs in time series
where seasonality or trends are present.
Example:
A sudden drop in sales during a holiday season when sales are expected to rise.
3) Collective Anomalies:
A collective anomaly is when a sequence or a group of data points deviate from the
expected pattern, though individual points may not appear anomalous.
Example:
A sensor's readings gradually deviate from the norm over time, indicating system
degradation.
4) Level Shift Anomalies:
A level shift occurs when the mean value of a time series changes abruptly, indicating
an anomaly.
Example:
A sudden change in electricity consumption after a policy update.
1. The Box
The box represents the Interquartile Range (IQR)
Lower edge = First Quartile (Q1, 25th percentile)
Upper edge = Third Quartile (Q3, 75th percentile)
The line inside the box = Median (Q2, 50th percentile)
IQR = Q3 - Q1
2. The Whiskers
Extend from the box to show the rest of the distribution
Lower whisker: Q1 - 1.5 × IQR
Upper whisker: Q3 + 1.5 × IQR
Whiskers stop at the last data point within these bounds
3. Outliers
Points plotted individually beyond the whiskers
Any value below Q1 - 1.5 × IQR
Any value above Q3 + 1.5 × IQR
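The box plot rule above can be sketched numerically. This is a minimal illustration, not the notebook's own plotting code: the helper name `tukey_fences` and the sample readings are hypothetical, and quartiles are taken with NumPy's default linear interpolation.

```python
import numpy as np

def tukey_fences(values, k=1.5):
    """Return (lower, upper) outlier fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical sample: nine typical readings and one extreme value
readings = np.array([10, 11, 12, 12, 13, 13, 14, 15, 16, 40], dtype=float)
lower, upper = tukey_fences(readings)

# Points outside the fences are the ones a box plot draws individually
outliers = readings[(readings < lower) | (readings > upper)]
print(f"fences: [{lower:.3f}, {upper:.3f}], outliers: {outliers}")
```

Only the value 40 falls outside the fences here; the other readings would sit inside the box or along the whiskers.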
def create_figure():
    """Create a figure with two subplots side by side"""
    return plt.subplots(1, 2, figsize=(15, 6))

# Add outliers (excerpt from generate_univariate_data)
outlier_indices = np.random.choice(n_samples, n_outliers, replace=False)
outliers = np.random.normal(loc=25, scale=3, size=n_outliers)
data_with_outliers[outlier_indices] = outliers
def plot_boxplots():
    """Create and plot boxplots for univariate data"""
    normal_data, data_with_outliers = generate_univariate_data()
file:///Users/anishroychowdhury/Desktop/Introduction_to_Anomaly_Detection.html 7/43
04/02/2025, 16:15 Introduction_to_Anomaly_Detection
    plt.tight_layout()
    plt.show()
Plots
In [10]: print("Example 1: Box Plot - Univariate Outliers")
plot_boxplots()
# Compute Z-scores
mean = np.mean(data)
std_dev = np.std(data)
z_scores = (data - mean) / std_dev
# Flag anomalies
threshold = 3
anomalies = np.where(np.abs(z_scores) > threshold)[0]
# Plot
plt.figure(figsize=(10, 6))
plt.plot(data, label="Time Series")
plt.scatter(anomalies, data[anomalies], color="red", label="Anomalies", z
plt.legend()
plt.show()
# Combine features
normal_data = np.column_stack((feature1, feature2))
for i, x in enumerate(data):
    diff = x - mean
    distances[i] = np.sqrt(diff.dot(inv_covmat).dot(diff))
Detect Anomalies
In [7]: def detect_anomalies(distances, significance_level=0.01):
    """
    Detect anomalies using chi-square distribution threshold
    """
    # For Mahalanobis distance squared, use chi-square with p degrees of freedom
    # (p = 2 features here); the square root puts the threshold on the distance scale
    threshold = np.sqrt(chi2.ppf(1 - significance_level, df=2))
    return distances > threshold, threshold
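The distance loop and the chi-square threshold above can be combined into one self-contained sketch. The synthetic data, the injected anomalies, and the vectorized `einsum` form are assumptions made for illustration; they are not the notebook's own data generator.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Hypothetical data: 500 positively correlated 2-D points plus 5 injected anomalies
cov_true = np.array([[1.0, 0.8], [0.8, 1.0]])
normal_points = rng.multivariate_normal([0.0, 0.0], cov_true, size=500)
injected = np.array([[4.0, -4.0], [-4.0, 4.0], [5.0, -3.0], [-5.0, 3.0], [4.5, -4.5]])
data = np.vstack([normal_points, injected])

# Estimate mean and covariance, then compute every Mahalanobis distance at once
mean = data.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(data, rowvar=False))
diff = data - mean
distances = np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))

# Threshold on the distance scale; df equals the number of features
significance_level = 0.01
threshold = np.sqrt(chi2.ppf(1 - significance_level, df=data.shape[1]))
flagged = distances > threshold
print(f"threshold = {threshold:.3f}, points flagged = {flagged.sum()}")
```

Passing `df=data.shape[1]` instead of a hard-coded 2 keeps the threshold correct if the number of features changes.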
Plot Results
In [8]: def plot_results(normal_data, anomalous_data, anomaly_indices, detected_anomalies,
                 distances, threshold, mean, cov, significance_level, title):
    """
    Create side-by-side plots showing normal and anomalous data
    """
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    plt.suptitle(title, y=1.05)
    plt.tight_layout()
    plt.show()
Execute Example
In [9]: # Set random seed for reproducibility
np.random.seed(42)
# Detect anomalies
significance_level = 0.01
detected_anomalies, threshold = detect_anomalies(distances, significance_level)
# Plot results
plot_results(normal_data, anomalous_data, anomaly_indices, detected_anomalies,
             distances, threshold, mean, cov, significance_level,
             'Multivariate Time Series Anomaly Detection using Mahalanobis
print("\nPerformance Metrics:")
print(f"Precision: {precision:.3f}")
print(f"Recall: {recall:.3f}")
print(f"F1-Score: {f1_score:.3f}")
Performance Metrics:
Precision: 0.000
Recall: 0.000
F1-Score: 0.000
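Metrics like the ones printed above can be computed from two boolean masks of equal length. The sketch below is an assumption about how such a calculation might look; the helper `precision_recall_f1` and the toy labels are hypothetical. All three metrics are zero whenever no detected point coincides with a labelled anomaly.

```python
import numpy as np

def precision_recall_f1(true_mask, detected_mask):
    """Compute precision, recall and F1 from two boolean arrays of equal length."""
    true_mask = np.asarray(true_mask, dtype=bool)
    detected_mask = np.asarray(detected_mask, dtype=bool)
    tp = np.sum(true_mask & detected_mask)    # anomalies that were detected
    fp = np.sum(~true_mask & detected_mask)   # normal points flagged by mistake
    fn = np.sum(true_mask & ~detected_mask)   # anomalies that were missed
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return float(precision), float(recall), float(f1)

# Hypothetical labels: points 2 and 7 are true anomalies; the detector flags 2 and 5
truth = np.zeros(10, dtype=bool)
truth[[2, 7]] = True
detected = np.zeros(10, dtype=bool)
detected[[2, 5]] = True

precision, recall, f1_score = precision_recall_f1(truth, detected)
print(f"Precision: {precision:.3f}  Recall: {recall:.3f}  F1-Score: {f1_score:.3f}")
```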
def plot_cluster_outliers():
    """Plot cluster-based outliers"""
    normal_data, data_with_outliers, n_outliers = generate_cluster_data()
    fig, (ax1, ax2) = create_figure()
    plt.tight_layout()
    plt.show()
# Select days for outliers, focusing on winter months (days 0-90 and after day 270)
winter_days = np.where((days < 90) | (days > 270))[0]
outlier_indices = np.random.choice(winter_days, n_outliers, replace=False)
def plot_seasonal_outliers():
    """Plot seasonal ice cream sales with contextual outliers"""
    normal_data, data_with_outliers, outlier_indices = generate_seasonal_
    fig, (ax1, ax2) = create_figure()
    plt.tight_layout()
    plt.show()
Before intervention ($t < t_i$):
$C(t) = C_b + \epsilon_t$
After intervention ($t \ge t_i$):
$C(t) = C_b + \Delta C + \epsilon_t$
where:
$t_i$ is the intervention time point
Statistical Properties
The level shift can be characterized by:
1. Mean Shift:
$\Delta\mu = E[C(t \ge t_i)] - E[C(t < t_i)] = \Delta C$
2. Hypothesis Test:
$H_0: \Delta\mu = 0$ vs $H_1: \Delta\mu < 0$
3. Effect Size:
$d = \frac{|\Delta\mu|}{\sigma}$
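The three quantities above can be estimated from data when the intervention time is known. The sketch below uses illustrative values (a 36-point series with a shift of -1.5 at index 18) and makes two assumptions not stated in the notes: a Welch t-test is used for the one-sided hypothesis, and $\sigma$ in the effect size is replaced by the pooled sample standard deviation. The `alternative` argument requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Illustrative values: baseline C_b, shift Delta C, noise sigma, intervention index t_i
baseline, delta, sigma, t_i = 5.0, -1.5, 0.3, 18
t = np.arange(36)
series = baseline + delta * (t >= t_i) + rng.normal(0.0, sigma, size=t.size)

before, after = series[:t_i], series[t_i:]

# 1. Mean shift: difference of the two segment means
delta_hat = after.mean() - before.mean()

# 2. One-sided Welch t-test for H1: mean(after) < mean(before)
t_stat, p_value = stats.ttest_ind(after, before, equal_var=False, alternative="less")

# 3. Effect size d = |delta_mu| / sigma, with sigma estimated by the pooled sample SD
pooled_sd = np.sqrt((before.var(ddof=1) + after.var(ddof=1)) / 2)
effect_size = abs(delta_hat) / pooled_sd

print(f"mean shift = {delta_hat:.2f}, p-value = {p_value:.2e}, d = {effect_size:.1f}")
```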
Business Context
This model represents a subscription-based streaming service's monthly customer
churn rate over a 36-month period. The level shift occurs at month 18 when a new
customer retention strategy was implemented, including:
1. Enhanced customer support system
2. Personalized content recommendations
3. Improved user interface
4. Loyalty rewards program
The intervention resulted in:
Immediate reduction in baseline churn rate
Sustained improvement in customer retention
More stable month-to-month variations
Practical Implications
The reduction in churn rate from $\mu_1 = 5.0\%$ to $\mu_2 = 3.5\%$ represents:
# Parameters
base_churn_rate = 5.0 # Initial 5% monthly churn rate
noise_level = 0.3 # Random variation in churn
intervention_impact = -1.5 # 1.5 percentage-point reduction in churn after intervention
intervention_point = 18 # Intervention at month 18
# Plotting
plt.figure(figsize=(12, 6))
# Format axes
plt.gcf().autofmt_xdate()
y_min, y_max = plt.ylim()
plt.ylim(0, y_max + 0.5)
# Add annotation
plt.annotate('Strategy Implementation\nReduced Churn',
             xy=(intervention_date, np.mean(churn_after[:6])),
             xytext=(30, 30), textcoords='offset points',
             arrowprops=dict(arrowstyle='->'), fontsize=10)
plt.tight_layout()
plt.show()
Example 7 : A Seasonal Time Series - Level Shift : Hotel Occupancy Rates
Model Specification
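Let $O(t)$ represent the hotel occupancy rate at time $t$. The model includes these
components:
Base occupancy rate: $\mu = 65\%$
Before intervention ($t < t_i$):
$O(t) = \mu + S(t) + \epsilon_t$
After intervention ($t \ge t_i$):
$O(t) = \mu + \Delta + S(t) + \epsilon_t$
Here $S(t)$ is the seasonal component and $\epsilon_t$ is random noise.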
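Because the seasonal term repeats, subtracting the value from one period earlier cancels it and leaves the level shift visible. The sketch below is one way to illustrate this under stated assumptions: the seasonal component is taken to be a 12-month sinusoid, and all numeric values are illustrative rather than taken from a fitted model.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative values: 60 monthly points, shift of +10 starting at month 30
mu, amplitude, delta, sigma, t_i, period = 65.0, 15.0, 10.0, 2.0, 30, 12
t = np.arange(60)
seasonal = amplitude * np.sin(2 * np.pi * t / period)   # assumed form of S(t)
occupancy = mu + delta * (t >= t_i) + seasonal + rng.normal(0.0, sigma, size=t.size)

# Year-over-year difference O(t) - O(t - 12): S(t) cancels because it repeats every 12 months
yoy = occupancy[period:] - occupancy[:-period]
yoy_t = t[period:]

# Only the 12 months right after the intervention compare a shifted value
# against an unshifted one, so only they carry the level shift
straddle = (yoy_t >= t_i) & (yoy_t < t_i + period)
delta_hat = yoy[straddle].mean()
background = yoy[~straddle].mean()

print(f"estimated shift = {delta_hat:.1f}, mean difference elsewhere = {background:.1f}")
```

A direct before/after comparison of means would also work here, but only if both segments cover whole seasonal cycles; the differencing approach avoids that requirement.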
Business Context
This model represents a luxury hotel's monthly occupancy rates from 2020 to 2024,
with a significant change occurring after implementing a dynamic pricing strategy.
The key components are:
1. Seasonal Pattern:
2. Mean Shift:
$E[O(t \ge t_i)] - E[O(t < t_i)] = \Delta$
3. Variance Stability:
$\mathrm{Var}(O(t)) = \sigma^2$ for all $t$
# Parameters
base_occupancy = 65 # Base occupancy rate (%)
seasonal_amplitude = 15 # Seasonal variation amplitude
level_shift = 10 # Increase after new pricing strategy
noise_level = 2 # Random variation
intervention_point = 30 # New strategy implementation (month 30)
# Plotting
plt.figure(figsize=(15, 8))
# Format axes
plt.gcf().autofmt_xdate()
plt.ylim(30, 100)
# Add annotation
plt.annotate('Strategy Implementation\nIncreased Base Occupancy',
             xy=(intervention_date, np.mean(occupancy_after[:6])),
             xytext=(30, 30), textcoords='offset points',
             arrowprops=dict(arrowstyle='->'), fontsize=10)
plt.tight_layout()
plt.show()
Example 8 : Uptrending Time Series with Level Shift : E-commerce Sales
Business Context:
The code simulates an e-commerce company's daily sales data, showing the impact
of a major marketing campaign launch.
Realistic Parameters:
Let $S_b$ be the base daily sales, $r$ the growth rate, and $I_c$ the campaign impact.
Before campaign ($t < t_c$):
$S(t) = S_b + rt + \epsilon_t$
After campaign ($t \ge t_c$):
$S(t) = S_b + rt + I_c + \epsilon_t$
where:
$t$ is the time index
The post-campaign level combines both the immediate campaign impact and the continued
growth trend.
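With an upward trend, a plain before/after difference of means mixes the campaign impact with the growth that would have happened anyway. One way to separate them is ordinary least squares with a trend term and a step indicator, sketched below. All numeric values are hypothetical, and the regression approach is an illustration rather than the method used in the notebook.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical values for illustration only
S_b, r, I_c, sigma, t_c = 1000.0, 5.0, 300.0, 50.0, 100
t = np.arange(200, dtype=float)
step = (t >= t_c).astype(float)                      # 0 before the campaign, 1 after
sales = S_b + r * t + I_c * step + rng.normal(0.0, sigma, size=t.size)

# Design matrix: intercept, linear trend, post-campaign indicator
X = np.column_stack([np.ones_like(t), t, step])
(S_b_hat, r_hat, I_c_hat), *_ = np.linalg.lstsq(X, sales, rcond=None)

# The naive difference of means also absorbs the trend and overstates the impact
naive = sales[t >= t_c].mean() - sales[t < t_c].mean()

print(f"growth rate = {r_hat:.2f}, campaign impact = {I_c_hat:.0f}, naive difference = {naive:.0f}")
```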
In [3]: import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
# Generate time points (200 days with 0.1 day intervals for smooth visualization)
days = np.arange(0, 200, 0.1)
plt.tight_layout()
plt.show()
1. Base Model
$V(t) = \mu + \epsilon_t + L(t)$
where:
$\mu$ is the base trading volume
Business Context
This analysis monitors stock trading volume to detect significant changes and
anomalies. Key components include:
1. Base Parameters:
Daily trading volume ($\mu = 1{,}000{,}000$ shares)
2. Market Events:
Company earnings announcement
Market structure change
Institutional investor activity
3. Volume Characteristics:
Sustained increase in trading activity
New baseline volume level
Maintained volatility pattern
Statistical Properties
1. Pre-Event Distribution:
$V(t) \sim N(\mu, \sigma^2)$ for $t < t_e$
2. Post-Event Distribution:
$V(t) \sim N(\mu + \Delta, \sigma^2)$ for $t \ge t_e$
rolling_std = np.array([np.std(trading_volume[i:i+window_size])
                        for i in range(len(trading_volume)-window_size+1)])
plt.tight_layout()
plt.show()
# Print analysis
pre_event_avg = np.mean(trading_volume[:100])
post_event_avg = np.mean(trading_volume[100:])
volume_increase = (post_event_avg - pre_event_avg) / pre_event_avg * 100
Section 4 : Real World use case - Stock Price Fluctuation Anomaly using
Mahalanobis Distance for NVIDIA Stock
Imports and Installs
In [11]: # Install required libraries if not already installed
# !pip install yfinance matplotlib numpy scipy
import yfinance as yf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.spatial.distance import mahalanobis
from scipy.stats import chi2
Custom Functions
Get Stock Data
In [16]: def get_stock_data(ticker, days, price_column='Close'):
    """
    Fetch stock data from Yahoo Finance.
    Parameters:
    ticker: str - Stock ticker symbol
    days: int - Number of days of historical data to fetch
    price_column: str - Which price column to use ('Open', 'High', 'Low',
    Returns:
    pandas Series - Price data for the specified period
    """
    end_date = pd.Timestamp.today()
    start_date = end_date - pd.Timedelta(days=days)
    # Fetch data
    data = yf.download(ticker, start=start_date, end=end_date)
    if data.empty:
        raise ValueError(f"No data fetched for {ticker}. Check the ticker
    return prices
    Parameters:
    series: pandas Series - The original time series
    window_size: int - Number of lags to create
    Returns:
    pandas DataFrame with columns ordered from most recent to oldest lag
    e.g., for window_size=3:
    t0 (current), t-1 (1 day ago), t-2 (2 days ago)
    """
    # Create list of shifted series in reverse order (from oldest to newest)
    lagged_series = [series.shift(i) for i in range(window_size-1, -1, -1)]
    return lagged_df
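Since only part of the lag-building function survives above, here is a minimal self-contained sketch of the same idea. The function name `make_lag_matrix`, the column labels, and the toy price series are hypothetical. The column order follows the shift order `range(window_size-1, -1, -1)`, that is, oldest lag first; reverse the column list if the opposite order is wanted.

```python
import pandas as pd

def make_lag_matrix(series, window_size):
    """Return a DataFrame whose columns run from the oldest lag to the current value."""
    shifts = range(window_size - 1, -1, -1)           # e.g. 2, 1, 0 for window_size=3
    lagged = pd.concat([series.shift(k) for k in shifts], axis=1)
    lagged.columns = [f"t-{k}" if k else "t0" for k in shifts]
    return lagged.dropna()                            # drop rows without a full window

# Toy daily price series (hypothetical values)
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0],
                   index=pd.date_range("2025-01-01", periods=5, freq="D"))
lag_matrix = make_lag_matrix(prices, window_size=3)
print(lag_matrix)
```

Each row of `lag_matrix` is one multivariate observation (a short window of consecutive prices), which is what lets a univariate price series be scored with the Mahalanobis distance.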
Detect Anomalies
In [26]: def detect_anomalies(ticker="NVDA", days=30, window_size=5, confidence_le
    """
    Detect anomalies in stock price data using Mahalanobis distance.
    Parameters:
    ticker: str - Stock ticker symbol
    days: int - Number of days of historical data to analyze
    window_size: int - Size of the rolling window for lag features
    confidence_level: float - Confidence level for anomaly threshold
    price_column: str - Which price column to use
    """
    # Step 1: Fetch Stock Data
    try:
        prices = get_stock_data(ticker, days, price_column)
    except Exception as e:
    if lagged_data.empty:
        raise ValueError("Insufficient data for the specified window size
    # Highlight anomalies
    anomaly_indices = anomalies[anomalies].index
    anomaly_values = prices.loc[anomaly_indices]
    plt.scatter(anomaly_indices, anomaly_values, color='red',
                label=f'Anomalies (>{confidence_level*100}% confidence)',
                zorder=5, s=100)
        'lagged_data': lagged_data,
        'mahalanobis_distances': mahalanobis_distances,
        'anomalies': anomalies,
        'threshold': threshold
    }
if nvidia_results:
    print("\nAnalysis Results:")
    print(f"Number of anomalies found: {nvidia_results['anomalies'].s
    print(f"Threshold value: {nvidia_results['threshold']:.2f}")
[*********************100%***********************] 1 of 1 completed
[*********************100%***********************] 1 of 1 completed
Tesla stock data shape: (247, 1)
Analysis Results:
Number of anomalies found: 4
Threshold value: 5.57
Appendix A : Interquartile Range and Box Plots in Detail
https://en.wikipedia.org/wiki/Exploratory_data_analysis
Definition of IQR
The Interquartile Range (IQR) is a statistical measure that represents the spread of
the middle 50% of a dataset. It is defined as:
$IQR = Q_3 - Q_1$
where:
Q1 (First Quartile): The 25th percentile (lower quartile) of the data
Q3 (Third Quartile): The 75th percentile (upper quartile) of the data
Why Is the Bulk of Data Found Within This Range?
Captures Central Distribution
Since the IQR focuses on the middle 50% of the data, it excludes extreme values and
provides a robust measure of data variability.
Resistant to Outliers
Unlike the mean and standard deviation, which are sensitive to extreme values, the
IQR is largely unaffected by a small number of outliers and provides a more reliable
representation of data dispersion.
Statistical Distribution Properties
By definition, 50% of the data falls within the IQR; in a normal distribution this is the interval of about ±0.67 standard deviations around the mean
In skewed distributions, the IQR still contains the core of the data, though it
might be asymmetrically distributed
In real-world datasets, most data points are concentrated around the median,
making the IQR a natural boundary for defining expected variation
Using IQR for Outlier Detection
Tukey's rule uses the IQR to define outliers:
Lower Bound = $Q_1 - 1.5 \times IQR$
Upper Bound = $Q_3 + 1.5 \times IQR$
Any data points beyond these bounds are considered potential outliers.
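To get a feel for how strict these bounds are, the sketch below evaluates them for a standard normal distribution, which is an assumption made purely for illustration; real data may be skewed or heavy-tailed and will behave differently.

```python
from scipy.stats import norm

# Quartiles and IQR of the standard normal distribution, in units of sigma
q1, q3 = norm.ppf(0.25), norm.ppf(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Probability mass inside the IQR and beyond the two bounds
inside_iqr = norm.cdf(q3) - norm.cdf(q1)
outside_bounds = norm.cdf(lower) + norm.sf(upper)

print(f"IQR = {iqr:.3f} sigma, bounds at +/-{upper:.3f} sigma")
print(f"share inside IQR = {inside_iqr:.2f}, share beyond bounds = {outside_bounds:.4f}")
```

For normally distributed data the bounds sit at roughly ±2.7 standard deviations, so about 0.7% of perfectly ordinary points are still labelled as potential outliers.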
Applications of IQR
Data Cleaning: Identifying and handling anomalies in datasets
Descriptive Statistics: Summarizing data variability without being affected by
extreme values
Machine Learning: Feature engineering and preprocessing for robust models
Finance & Economics: Measuring stock price variability and income
distributions
Conclusion
The IQR is crucial for understanding data spread while maintaining robustness
against extreme values, making it widely used in statistical analysis and anomaly
detection.
Appendix B : Understanding Z score in Detail
What is a Z-Score?
A Z-score (also called a standard score) measures how many standard deviations
away from the mean a data point is. The formula is:
$Z = \frac{x - \mu}{\sigma}$
where:
$x$ is the data point
$\mu$ is the mean
$\sigma$ is the standard deviation
Properties of Z-Scores
The mean of Z-scores is always 0
$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$
$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2}$
$Z_i = \frac{x_i - \mu}{\sigma}$
Interpreting Z-Scores
Z = 0 : The value equals the mean
Z > 0 : The value is above the mean
Z < 0 : The value is below the mean
|Z| = 1 : The value is one standard deviation from the mean
|Z| = 2 : The value is two standard deviations from the mean
Applications of Z-Scores
1. Standardization: Converting datasets to a common scale
2. Outlier Detection: Values with $|Z| > 3$ are often considered outliers
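A short sketch of standardization and the $|Z| > 3$ rule on synthetic data follows. The series and the injected spike are hypothetical. The `ddof` argument of NumPy's `std` selects between the population formula (divide by $n$) and the sample formula (divide by $n - 1$).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical series: 200 normal readings with one injected spike at index 120
data = rng.normal(50.0, 5.0, size=200)
data[120] = 90.0

z_pop = (data - data.mean()) / data.std(ddof=0)      # population formula (divide by n)
z_sample = (data - data.mean()) / data.std(ddof=1)   # sample formula (divide by n - 1)

# Flag points more than three standard deviations from the mean
anomalies = np.where(np.abs(z_pop) > 3)[0]

print(f"mean of Z = {z_pop.mean():.2e}, std of Z = {z_pop.std():.3f}")
print(f"flagged indices: {anomalies}")
```

The sample formula gives a slightly larger standard deviation, so its Z-scores are slightly smaller in magnitude; with 200 points the difference is negligible, but it matters for short series.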
Important Notes
1. Population vs. Sample:
For population: Use $\sigma = \sqrt{\frac{1}{n}\sum (x_i - \mu)^2}$
For sample: Use $s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$
2. Assumptions:
Appendix C : Understanding Mahalanobis Distance and its Applications in Anomaly
Detection
1. What is Mahalanobis Distance?
The Mahalanobis distance measures how far a point $x$ is from a distribution with
mean $\mu$ and covariance matrix $\Sigma$. Unlike Euclidean distance, it considers the
variance and correlation of the data, making it more robust for multivariate data.
Formula:
The Mahalanobis distance $D_M$ between a point $x$ and a distribution is given by:
$D_M(x) = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)} = \sqrt{d^T \Sigma^{-1} d}$ where $d = (x - \mu)$
Where:
$x$: A data point (vector) in $\mathbb{R}^n$
1. It accounts for the correlation between variables (e.g., multiple features in time
series).
2. It scales the data, so variables with larger variances do not dominate the
distance calculation.
3. It provides a probabilistic interpretation of how "unusual" a data point is.
3. Steps to Use Mahalanobis Distance for Anomaly Detection
Step 1: Preprocess the Time Series Data
Ensure the time series data is clean and normalized.
If the data has multiple features, organize it into a matrix $X$, where each row is one observation $x_i$.
$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
$\Sigma = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu)(x_i - \mu)^T$
$D_M(x_i) = \sqrt{(x_i - \mu)^T \Sigma^{-1} (x_i - \mu)}$
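The formulas above translate almost line for line into NumPy. The sketch below uses hypothetical correlated data and a helper `d_m` introduced only for illustration; it checks the result against `scipy.spatial.distance.mahalanobis`, which takes the inverse covariance matrix as its third argument. The two test points have the same Euclidean norm, yet very different Mahalanobis distances.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(5)

# Hypothetical data matrix X: N = 1000 observations (rows) of 2 positively correlated features
X = rng.multivariate_normal([0.0, 0.0], [[4.0, 3.0], [3.0, 4.0]], size=1000)

mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)            # uses the 1/(N-1) normalisation
Sigma_inv = np.linalg.inv(Sigma)

def d_m(x):
    """Mahalanobis distance of a single point x from (mu, Sigma)."""
    d = x - mu
    return float(np.sqrt(d @ Sigma_inv @ d))

along = np.array([3.0, 3.0])     # lies along the correlation direction
against = np.array([3.0, -3.0])  # same Euclidean norm, but against the correlation

print(f"along: {d_m(along):.2f}, against: {d_m(against):.2f}")
print(f"scipy check: {mahalanobis(along, mu, Sigma_inv):.2f}")
```

The point that contradicts the correlation structure is much further away in Mahalanobis terms, which is exactly the behaviour a Euclidean distance cannot capture.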
4. Mathematical Intuition
Covariance Matrix
The covariance matrix $\Sigma$ captures the relationships between variables. Its inverse
$\Sigma^{-1}$ appears in the distance formula, where it rescales and decorrelates the variables.
Chi-Squared Distribution
Under the assumption that the data follows a multivariate normal distribution, the
squared Mahalanobis distance $D^2$ follows a Chi-squared distribution with $n$ degrees
of freedom:
$D_M^2(x) \sim \chi^2_n$
$X = \begin{bmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ \vdots & \vdots \\ x_{N1} & x_{N2} \end{bmatrix}$
Steps:
1. Compute the mean vector $\mu$ and covariance matrix $\Sigma$.
Appendix D : The Chi Square Distribution - how it is used for the Mahalanobis
method and setting the threshold
Why Use the Chi-Squared Distribution?
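The squared Mahalanobis distance $D^2(x)$ follows a Chi-squared distribution with
$n$ degrees of freedom ($n$ being the number of features) when the data are
multivariate normal:
$D_M^2(x) = (x - \mu)^T \Sigma^{-1} (x - \mu) \sim \chi^2_n$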
This relationship allows us to use the properties of the Chi-squared distribution to set
a threshold for anomaly detection.
Key Properties of the Chi-Squared Distribution
1. Degrees of Freedom:
The degrees of freedom $n$ correspond to the number of features in the data.
For example, if your data has 2 features, the squared Mahalanobis distance
follows $\chi^2_2$.
2. Probabilistic Interpretation:
The Chi-squared distribution provides a probabilistic framework for determining
how "unusual" a data point is.
For a given significance level $\alpha$ (e.g., 0.05 for 95% confidence), you can
find a threshold $\tau$ such that:
$P(D_M^2(x) \le \tau) = 1 - \alpha$
This means that $100(1 - \alpha)\%$ of the data points under normal conditions will
have a squared Mahalanobis distance of at most $\tau$.
3. Threshold Calculation:
The threshold $\tau$ is computed using the percent-point function (PPF) of the
Chi-squared distribution:
$\tau = \chi^2_n(1 - \alpha)$
Where:
$\chi^2_n$ is the Chi-squared distribution with $n$ degrees of freedom.
$MD_i = \sqrt{(x_i - \mu)^T S^{-1} (x_i - \mu)}$
Since $MD_i^2$ follows a Chi-Square distribution with degrees of freedom equal to the
number of features ($p$), we define the threshold as:
$MD_i^2 \le \chi^2_{p,\alpha}$
where:
$p$ is the number of dimensions (features)
$\chi^2_{p,\alpha}$ is the critical value of the Chi-Square distribution at the desired
confidence level
A point whose $MD_i^2$ exceeds this critical value is considered an anomaly
Choosing the Right Confidence Level
The choice of the confidence level ($1 - \alpha$) affects anomaly detection:
For example, with $p = 2$ features and $\alpha = 0.05$:
$\chi^2_{2,0.05} \approx 5.99$
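The effect of the confidence level can be seen by tabulating a few critical values with `scipy.stats.chi2.ppf`. The second half of the sketch is a Monte Carlo check on simulated standard normal data (an assumption made so that $\Sigma = I$ and $\mu = 0$, where the squared distance is simply the sum of squares).

```python
import numpy as np
from scipy.stats import chi2

# Critical values of the squared distance for several feature counts and confidence levels
for p in (2, 5):
    for confidence in (0.95, 0.99):
        print(f"p={p}, confidence={confidence}: threshold={chi2.ppf(confidence, df=p):.2f}")

# Monte Carlo check: for normal data, about alpha of the points exceed the threshold
rng = np.random.default_rng(11)
p, alpha = 2, 0.05
samples = rng.standard_normal(size=(100_000, p))     # zero mean, identity covariance
squared_distance = np.sum(samples**2, axis=1)        # equals D^2 when Sigma = I and mu = 0
tau = chi2.ppf(1 - alpha, df=p)
exceed_fraction = np.mean(squared_distance > tau)

print(f"tau = {tau:.2f}, fraction above tau = {exceed_fraction:.3f}")
```

Raising the confidence level from 95% to 99% raises the threshold, so fewer normal points are flagged, at the cost of missing milder anomalies.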
Conclusion
The Mahalanobis distance threshold is derived from the Chi-Square distribution
The confidence level ($1 - \alpha$) determines the strictness of anomaly detection
A higher confidence level raises the threshold and so reduces false positives
Appendix E : Window Size Selection for Level Shift Detection
Theoretical Guidelines
The choice of window size depends on several factors:
1. General Rules of Thumb:
Minimum window size: $w_{min} = 8$ to $12$ observations
where:
$ARL_0$ is the Average Run Length under control
$w_{min} = \frac{4\sigma^2}{\delta^2}$
Implementation Example
def optimal_window_size(n_observations, expected_shift_magnitude,
                        std_dev):
    """
    Calculate optimal window size for level shift detection
    Parameters:
    n_observations: Total number of observations
    expected_shift_magnitude: Expected magnitude of level shift
    std_dev: Standard deviation of the process
    """
    return w_opt
3. Business Requirements:
Detection speed needs
False alarm tolerance
Computational resources
The most cited references suggest:
For quick detection: $w = 8$ to $15$
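The guidelines in this appendix can be tried out with a small sketch. It rests on several assumptions: $\delta$ is read as the expected shift magnitude, the formula $w_{min} = 4\sigma^2/\delta^2$ is combined with a floor of 8 observations taken from the rule of thumb, and the detector simply compares the means of two adjacent windows. The function names `min_window_size` and `window_mean_gap` are hypothetical and are not the notebook's `optimal_window_size`.

```python
import math
import numpy as np

def min_window_size(std_dev, expected_shift_magnitude, floor=8):
    """w_min = 4*sigma^2/delta^2, rounded up and never below an assumed floor."""
    w = 4.0 * std_dev**2 / expected_shift_magnitude**2
    return max(floor, math.ceil(w))

def window_mean_gap(series, w):
    """For each split point k, mean of the next w points minus mean of the previous w."""
    series = np.asarray(series, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(series)])
    ks = np.arange(w, series.size - w + 1)
    after = (csum[ks + w] - csum[ks]) / w
    before = (csum[ks] - csum[ks - w]) / w
    return ks, after - before

# Synthetic series (hypothetical values): level shift of size delta at index t_shift
rng = np.random.default_rng(21)
sigma, delta, t_shift = 1.0, 3.0, 100
series = rng.normal(0.0, sigma, size=200) + delta * (np.arange(200) >= t_shift)

w = min_window_size(sigma, delta)            # formula gives less than 8, so the floor applies
ks, gap = window_mean_gap(series, w)
estimated_shift = ks[np.argmax(np.abs(gap))]  # split point with the largest mean gap

print(f"window size = {w}, estimated shift location = {estimated_shift}")
```

Smaller windows react faster but give noisier gap estimates, which is the trade-off between detection speed and false alarm tolerance listed under the business requirements.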
References:
1. Basseville, M., & Nikiforov, I. V. (1993). Detection of Abrupt Changes: Theory and
Application.
2. Montgomery, D. C. (2009). Statistical Quality Control.
3. Lucas, J. M., & Saccucci, M. S. (1990). Exponentially weighted moving average
control schemes.
End of Tutorial