Change Point Detection
IES 601 Seminar
Salbani Chakrabortty (18I190011)
Under the guidance of Prof. N. Hemachandra
Contents
Background
Sample time series and change points
Classifications of change point detection algorithms and methods of detecting change points
Maximum Likelihood Estimation Of Change Point Location
Hypothesis test for the Change Point Problem – Single and Multiple Change point detection
Binary Segmentation Method in Multiple change point detection
CUSUM Procedure
Change Finder Method
Performance Measures of CPD Algorithms
References
Background
The change point problem arises because data in reality often do not retain the same statistical properties over time.
Detection of change points is useful in modelling and prediction of time series, medical condition monitoring, climate change detection, speech and image analysis, and human activity analysis.
The most commonly investigated changes in behaviour are:
Change in mean
Change in variance
Change in regression model/parameters
Sample time series and change points
Classifications of Change Point Detection
Change point detection algorithms are classified as “online” or “offline”.
The goal of offline detection is generally to identify all of a sequence’s change points in batch mode.
The goal of online detection is to detect change as soon as possible after it occurs, ideally before the next data point arrives.
Both Parametric and Non-parametric methods are used in change point detection.
Basic Algorithms Used In CPD Problems
The techniques used in CPD include both supervised and unsupervised methods.
Supervised learning algorithms learn a mapping from input data to a target attribute of
the data, usually a class label.
This presentation will mostly focus on unsupervised methods of CPD.
Unsupervised learning algorithms are typically used to discover patterns in unlabeled data.
Unsupervised methods include:
Likelihood Ratio Method
Change Finder Method
CUSUM Method
Maximum Likelihood Estimation Of The Change Point Location
Given a dataset $x_1, x_2, \ldots, x_n$ (normally distributed), the estimated changepoint location $\hat{k}$ is

$$\hat{k} = \arg\max_{2 \le k \le n-1} V_k$$

$$V_k = \sum_{t=1}^{n} (x_t - \hat{\mu})^2 - \left[\sum_{t=1}^{k} (x_t - \hat{\mu}_1)^2 + \sum_{t=k+1}^{n} (x_t - \hat{\mu}_n)^2\right]$$

$$\hat{\mu} = \frac{1}{n}\sum_{t=1}^{n} x_t, \qquad \hat{\mu}_1 = \frac{1}{k}\sum_{t=1}^{k} x_t, \qquad \hat{\mu}_n = \frac{1}{n-k}\sum_{t=k+1}^{n} x_t$$

From this $\hat{k}$ value we get the statistic $U_{\hat{k}} = V_{\hat{k}}$, equivalent to the likelihood ratio test statistic.
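As an illustrative sketch of this estimator (the function name `changepoint_mle` and the simulated data are my own, not from the slides):

```python
import numpy as np

def changepoint_mle(x):
    """Estimate a single change-in-mean location k_hat = argmax V_k.

    V_k compares the total sum of squares around the overall mean with
    the within-segment sums of squares around mu_hat_1 and mu_hat_n.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    total_ss = np.sum((x - x.mean()) ** 2)
    best_k, best_v = None, -np.inf
    for k in range(2, n):  # k = 2, ..., n-1, as in the argmax above
        left, right = x[:k], x[k:]
        within_ss = (np.sum((left - left.mean()) ** 2)
                     + np.sum((right - right.mean()) ** 2))
        if total_ss - within_ss > best_v:
            best_k, best_v = k, total_ss - within_ss
    return best_k, best_v  # best_v is the statistic U_{k_hat}

# Simulated series with a mean shift from 0 to 3 at t = 50
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(3.0, 1.0, 50)])
k_hat, u_stat = changepoint_mle(x)
```

With a clear 3-standard-deviation shift, the argmax lands at or very near the true change location.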
Hypothesis Test for the Change Point Problem
A natural approach to detecting a single changepoint is to perform a hypothesis test.
The hypotheses for a change in the mean of the distribution at the point $k$ are defined as

$$H_0: \mu_1 = \mu_2 = \ldots = \mu_n \qquad H_1: \mu_1 = \ldots = \mu_k \ne \mu_{k+1} = \ldots = \mu_n$$

The assumption here is that the data are normally distributed.
$H_0$ is rejected when $U_{\hat{k}} > c_\alpha$, where $U_{\hat{k}}$ is the test statistic and $c_\alpha$ is the critical value for a chosen significance level $\alpha$.
Likelihood Ratio Test For Single Change
We can view detecting a single changepoint as a hypothesis test
𝑯𝟎 : No changepoint, m = 0
𝑯𝟏 : A single changepoint, m = 1
One approach is to find $\tau$, the position of change which maximises the log likelihood

$$L(\tau) = \log p(y_{1:\tau} \mid \hat{\theta}_1) + \log p(y_{\tau+1:n} \mid \hat{\theta}_2)$$

Then, calculate the test statistic

$$\lambda = 2\left[\max_{\tau} L(\tau) - \log p(y_{1:n} \mid \hat{\theta})\right]$$
We then choose a threshold, c, such that we reject the null hypothesis if λ > c.
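A minimal sketch of this test for a change in mean, assuming unit-variance Gaussian data so that the log-likelihoods have a closed form (function names, data, and the threshold value are illustrative):

```python
import numpy as np

def profile_loglik(y):
    """Gaussian log-likelihood with unit variance and MLE mean (constants dropped)."""
    return -0.5 * np.sum((y - y.mean()) ** 2)

def single_change_lrt(y, c):
    """Return (reject H0, lambda, tau_hat) for the single-changepoint LRT."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    null_ll = profile_loglik(y)              # log p(y_{1:n} | theta_hat)
    best_tau, best_ll = None, -np.inf
    for tau in range(1, n):                  # maximise L(tau) over split points
        ll = profile_loglik(y[:tau]) + profile_loglik(y[tau:])
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    lam = 2.0 * (best_ll - null_ll)          # the test statistic lambda
    return lam > c, lam, best_tau

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(3.0, 1.0, 50)])
reject, lam, tau_hat = single_change_lrt(y, c=20.0)
```

In this unit-variance Gaussian setting, $\lambda$ coincides with the statistic $U_{\hat{k}}$ from the maximum likelihood slide.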
Multiple Change Point Detection
In practice the assumption of only one change may be unrealistic.
The search methods for multiple change points aim to minimise

$$\sum_{i=1}^{m+1} \left[C\!\left(y_{(\tau_{i-1}+1):\tau_i}\right)\right] + \beta f(m)$$

$C$ is a cost function for a segment and $\beta f(m)$ is a penalty to guard against overfitting.
$C$ is often the negative log-likelihood, and $\beta f(m)$ may be $cm$.
An approximate method for minimizing the above is Binary Segmentation.
Binary Segmentation
Input: A time series of the form $\{y_1, y_2, \ldots, y_n\}$
A test statistic $\Lambda(\cdot)$ dependent on the time series
An estimator of changepoint position $\hat{\tau}(\cdot)$
A rejection threshold $C$
Initialise: Let $L = \emptyset$, and $S = \{[1,n]\}$
Iterate: While $S \ne \emptyset$
1. Choose an element of $S$; denote this element as $[s,t]$.
2. If $\Lambda(y_{s:t}) < C$, remove $[s,t]$ from $S$.
3. If $\Lambda(y_{s:t}) \ge C$ then:
a. remove $[s,t]$ from $S$;
b. calculate $r = \hat{\tau}(y_{s:t}) + s - 1$, and add $r$ to $L$;
c. add $[s, r]$ and $[r+1, t]$ to $S$.
Binary segmentation can be inefficient, $O(Qn^2)$ for $Q$ changepoints, but efficiency can be improved through pruning to $O(n)$.
Exact methods include
Segment Neighbourhood Search
Pruned Exact Linear Time Method
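A runnable sketch of the binary segmentation loop, reusing the change-in-mean statistic $V_k$ as $\Lambda$ (the interval bookkeeping, data, and threshold are illustrative choices of mine):

```python
import numpy as np

def max_stat(seg):
    """Max of the change-in-mean statistic over a segment, plus its argmax."""
    total = np.sum((seg - seg.mean()) ** 2)
    best_v, best_k = -np.inf, None
    for k in range(1, len(seg)):
        within = (np.sum((seg[:k] - seg[:k].mean()) ** 2)
                  + np.sum((seg[k:] - seg[k:].mean()) ** 2))
        if total - within > best_v:
            best_v, best_k = total - within, k
    return best_v, best_k

def binary_segmentation(y, c):
    """Split recursively while the test statistic exceeds the threshold c."""
    y = np.asarray(y, dtype=float)
    found = []                    # the list L of changepoints
    segments = [(0, len(y))]      # the set S, as half-open intervals [s, t)
    while segments:
        s, t = segments.pop()
        if t - s < 2:
            continue
        stat, k = max_stat(y[s:t])
        if stat >= c:             # step 3: record the split, recurse on both halves
            found.append(s + k)
            segments.append((s, s + k))
            segments.append((s + k, t))
    return sorted(found)

rng = np.random.default_rng(2)
y = np.concatenate([rng.normal(0, 1, 50), rng.normal(4, 1, 50), rng.normal(0, 1, 50)])
cps = binary_segmentation(y, c=20.0)
```

With two large mean shifts at $t = 50$ and $t = 100$, both splits are detected.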
CUSUM Procedure
It is a method for detecting a change in the distribution of sequentially observed data.
At the $k^{th}$ stage, the likelihood ratio test statistic is

$$T_k = \max\left(\max_{1 \le j \le k} \sum_{i=j}^{k} Z_i,\; 0\right) \quad \text{where} \quad Z_i = \log \frac{f_1(X_i)}{f_0(X_i)}$$
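$T_k$ can be updated recursively as $T_k = \max(T_{k-1} + Z_k, 0)$, raising an alarm when it exceeds a threshold $h$. A minimal sketch for a Gaussian mean shift (the parameter values and deterministic stream are illustrative):

```python
def cusum_detect(xs, mu0, mu1, sigma, h):
    """One-sided CUSUM: alarm at the first k with T_k > h, else None.

    Z_i = log f1(X_i) - log f0(X_i) for f0 = N(mu0, sigma^2), f1 = N(mu1, sigma^2).
    """
    t = 0.0
    for k, x in enumerate(xs, start=1):
        z = ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)
        t = max(t + z, 0.0)       # T_k = max(T_{k-1} + Z_k, 0)
        if t > h:
            return k              # alarm time
    return None

# Mean shifts from 0 to 2 at observation 31; the alarm fires shortly after
alarm = cusum_detect([0.0] * 30 + [2.0] * 30, mu0=0.0, mu1=2.0, sigma=1.0, h=5.0)
```

Here $Z_i = 2$ for each post-change observation, so $T_k$ crosses $h = 5$ on the third post-change point.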
Performance Measures of CPD Algorithms
When the difference in time between the detected changepoint and the actual changepoint represents the measure of performance, the measures used are
MAE
MSE
MSD
RMSE
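For paired detected/actual locations these reduce to standard error formulas (MSD is omitted here since its definition is not spelled out on the slide; the function name and example values are illustrative):

```python
import math

def location_errors(detected, actual):
    """MAE, MSE and RMSE of detected vs. actual changepoint locations (paired)."""
    diffs = [d - a for d, a in zip(detected, actual)]
    mae = sum(abs(e) for e in diffs) / len(diffs)     # mean absolute error
    mse = sum(e * e for e in diffs) / len(diffs)      # mean squared error
    rmse = math.sqrt(mse)                             # root mean squared error
    return mae, mse, rmse

mae, mse, rmse = location_errors(detected=[48, 103], actual=[50, 100])
```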
Measures Of Comparison Between Various Change Point Algorithms
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$
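These confusion-matrix measures compute directly from counts of detections, treating each candidate location as changepoint or not (the counts below are illustrative, not from the slides):

```python
def cpd_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall) and precision from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # share of true changepoints that were detected
    precision = tp / (tp + fp)     # share of detections that were real changepoints
    return accuracy, sensitivity, precision

acc, sens, prec = cpd_metrics(tp=8, tn=80, fp=2, fn=2)
```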