Stationary stochastic processes
Maria Sandsten
Lecture 6
October 4 2022
The drawback of AR-estimators
Often the model order p is difficult to estimate, and errors lead to either
spurious peaks or missing information. We exemplify with data sequences
simulated from an ARMA(6,2)-process, and estimation with an
AR(6)-model and an AR(24)-model. If the chosen AR-model and
corresponding order do not match the structure of data, both bias and
variance could be very large.
Schedule for today
• The periodogram again - with zero-padding
• Expected value - Resolution and leakage
• Variance reduction - Welch method
• An example
• Motivation to cross-spectral estimation
• Multitapers
• The need for time-frequency analysis
• Some examples of research projects
• Master thesis proposals
The periodogram
Let $x_t$, $t = 0, 1, 2, \ldots, N-1$, be a sequence of data and compute the
Fourier transform,
$$X(f) = \sum_{t=0}^{N-1} x_t e^{-i2\pi f t}.$$
The periodogram is defined as
$$\hat{R}_x(f) = \frac{1}{N} |X(f)|^2,$$
and is the most well known estimator of the spectral density.
The FFT and zero-padding
With the Discrete Fourier Transform (DFT),
$$X\left(\frac{k}{\mathrm{NFFT}}\right) = \sum_{t=0}^{N-1} x_t e^{-i2\pi \frac{k}{\mathrm{NFFT}} t},$$
where $k = 0, 1, \ldots, \mathrm{NFFT}-1$, the periodogram can be computed as
$$\hat{R}_x\left(\frac{k}{\mathrm{NFFT}}\right) = \frac{1}{N} \left|X\left(\frac{k}{\mathrm{NFFT}}\right)\right|^2,$$
where $\mathrm{NFFT} \geq N$. For a better view of the periodogram, we use
zero-padding, i.e. $\mathrm{NFFT} > N$. Often $\mathrm{NFFT} = 2^{\text{integer}}$, as
this is a good choice when the calculations are made using the Fast
Fourier Transform (FFT) algorithm.
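A minimal Python/NumPy sketch of the zero-padded periodogram described above. The function name `periodogram` and the white-noise test sequence are illustrative assumptions, not part of the lecture material; the lecture itself refers to Matlab.

```python
import numpy as np

def periodogram(x, nfft=None):
    """Periodogram (1/N)|X(k/NFFT)|^2 on the grid f_k = k/NFFT, k = 0..NFFT-1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if nfft is None:
        nfft = n
    if nfft < n:
        raise ValueError("nfft must be at least the data length N")
    # np.fft.fft(x, n=nfft) appends nfft - N zeros before transforming.
    X = np.fft.fft(x, n=nfft)
    freqs = np.arange(nfft) / nfft
    # Normalize by the number of data samples N, not by NFFT.
    return freqs, np.abs(X) ** 2 / n

# Illustrative data (assumption): 16 samples of Gaussian white noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
f16, R16 = periodogram(x)            # NFFT = N = 16
f64, R64 = periodogram(x, nfft=64)   # zero-padded to NFFT = 64
```

Every fourth value of the zero-padded estimate coincides with the unpadded one; zero-padding only evaluates the same function of $f$ on a denser grid.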
Examples
The periodogram of a discrete-time sequence $x_t = \cos(2\pi f_0 t)$,
$t = 0, 1, \ldots, N-1$, is calculated with frequency $f_0 = 2/16 = 0.125$,
$N = 16$ (left) and $f_0 = 0.15$, $N = 16$ (right).
Zero-padding
The same signals, but now with zero-padding to NFFT = 64, i.e. the
sequence is extended with NFFT − N = 48 zeros,
resulting in periodogram estimates at 64 frequency values instead of 16.
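The two cosine examples can be reproduced with a short sketch along the following lines. The helper name `cosine_periodogram` is an assumption made for illustration; the numbers $N = 16$, $f_0 = 0.125$, $f_0 = 0.15$ and NFFT = 64 are those of the slides.

```python
import numpy as np

N = 16
t = np.arange(N)

def cosine_periodogram(f0, nfft):
    """Periodogram of cos(2*pi*f0*t), t = 0..N-1, evaluated at k/nfft."""
    x = np.cos(2 * np.pi * f0 * t)
    R = np.abs(np.fft.fft(x, n=nfft)) ** 2 / N
    return np.arange(nfft) / nfft, R

# f0 = 2/16 lies exactly on the unpadded frequency grid k/16.
f_a, R_a = cosine_periodogram(0.125, 16)
# f0 = 0.15 lies between grid points, so the power spreads over all bins.
f_b, R_b = cosine_periodogram(0.15, 16)
# Zero-padding to NFFT = 64 gives a denser grid for the same 16 samples.
f_c, R_c = cosine_periodogram(0.15, 64)

# Peak locations on the positive frequency axis [0, 1/2].
peak_b = f_b[np.argmax(R_b[: 16 // 2 + 1])]
peak_c = f_c[np.argmax(R_c[: 64 // 2 + 1])]
```

With $f_0 = 0.125$ and no zero-padding, all power falls in the bins $k = 2$ and $k = 14$, which hides the sidelobes. For $f_0 = 0.15$ the unpadded peak is reported at the nearest grid point, 0.125, while the zero-padded grid places it closer to 0.15.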
Expected value of the periodogram
The expected value of a periodogram is controlled by the function $K_N(f)$
according to
$$E[\hat{R}_x(f)] = \int_{-1/2}^{1/2} K_N(f-u) R_X(u)\,du,$$
where $10\log_{10}|K_N(f)|$ is shown in the figure (with $n = N$).
[Figure: $10\log_{10}|K_n(f)|$ versus $f$ from $-5/n$ to $5/n$, with the mainlobe and the sidelobes marked.]
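For reference, the kernel $K_N(f)$ has a well-known closed form (the Fejér kernel). It is not written out on the slide, so the expression below is added as background, using the same normalization as the periodogram definition $\frac{1}{N}|X(f)|^2$.

```latex
K_N(f) = \frac{1}{N}\left|\sum_{t=0}^{N-1} e^{-i2\pi f t}\right|^2
       = \frac{1}{N}\,\frac{\sin^2(\pi N f)}{\sin^2(\pi f)},
\qquad
K_N(0) = N,
\qquad
\int_{-1/2}^{1/2} K_N(f)\,df = 1 .
```

The zeros are located at $f = \pm 1/N, \pm 2/N, \ldots$, so the mainlobe has width $2/N$ between its first zeros. This is why the resolution improves with more data, while the sidelobes, and thereby the leakage, remain.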
Expected value of the periodogram
Periodograms of an ARMA(4,2)-process for different data lengths
N = 16, 32 and 128. To increase resolution, we need more data.
However, more data cannot solve the problem of leakage.
The modified periodogram
To reduce leakage, data windowing or tapering can be applied,
$$\hat{R}_x(f) = \frac{1}{N} \left|\sum_{t=0}^{N-1} x_t w(t) e^{-i2\pi f t}\right|^2,$$
where $w(t)$ is a data window. The method is often called the modified
periodogram. The expected value is, as before,
$$E[\hat{R}_x(f)] = \int_{-1/2}^{1/2} K_w(f-u) R_X(u)\,du,$$
where now the window spectrum is $K_w(f) = |W(f)|^2$, with $W(f)$ as the
Fourier transform of $w(t)$.
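A hedged sketch of the modified periodogram, comparing the leakage of a rectangular and a Hanning window for a single cosine. The function name, the test frequency 0.1031 and the evaluation band are illustrative assumptions. The window is rescaled so that $\sum_t w(t)^2 = N$; this normalization is an assumption of the sketch, since the slide does not state one.

```python
import numpy as np

def modified_periodogram(x, window, nfft=None):
    """Windowed periodogram (1/N)|sum_t x_t w(t) exp(-i 2 pi f t)|^2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    w = np.asarray(window, dtype=float)
    # Assumed normalization (not stated on the slide): sum(w**2) = N,
    # so that the level matches the rectangular-window periodogram.
    w = w * np.sqrt(n / np.sum(w ** 2))
    nfft = n if nfft is None else nfft
    X = np.fft.fft(x * w, n=nfft)
    return np.arange(nfft) / nfft, np.abs(X) ** 2 / n

# Illustrative signal (assumption): one cosine at an off-grid frequency.
N = 64
t = np.arange(N)
x = np.cos(2 * np.pi * 0.1031 * t)

f, R_rect = modified_periodogram(x, np.ones(N), nfft=1024)
_, R_hann = modified_periodogram(x, np.hanning(N), nfft=1024)

# Leakage far from the peak, relative to each estimate's own maximum.
band = (f > 0.35) & (f < 0.45)
leak_rect = R_rect[band].max() / R_rect.max()
leak_hann = R_hann[band].max() / R_hann.max()
```

In this example the relative leakage of the Hanning window in the band is several orders of magnitude below that of the rectangular window, at the price of a wider mainlobe.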
Comparison
A comparison of the window spectra of the rectangular window function
(blue) and a Hanning window function (black).
Example
Expected value of the spectral estimate using different lengths of the
rectangular window (blue) and the Hanning window (black).
Variance of the periodogram
The variance of the periodogram is $V[\hat{R}_x(f)] \approx R_X^2(f)$. Example with
periodograms (blue) of 5 realizations of an ARMA(2,2)-process (red).
[Figure: Periodogram; $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
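The variance approximation can be checked by simulation. The sketch below swaps the ARMA(2,2)-process of the slide for Gaussian white noise with variance 1 (an assumption made to keep the example short), for which $R_X(f) = 1$, so both the mean and the variance of the periodogram should be close to 1 for any data length.

```python
import numpy as np

rng = np.random.default_rng(1)

def periodogram_samples(n, runs=2000, f=0.25):
    """Periodogram value at frequency f for `runs` white-noise realizations."""
    x = rng.standard_normal((runs, n))   # one realization per row
    k = int(round(f * n))                # index of the frequency k/n = f
    X = np.fft.fft(x, axis=1)
    return np.abs(X[:, k]) ** 2 / n

stats = {}
for n in (64, 1024):
    samples = periodogram_samples(n)
    stats[n] = (samples.mean(), samples.var())
```

The entries of `stats` should be close to (1, 1) for both $N = 64$ and $N = 1024$: more data does not reduce the variance of the periodogram.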
Variance of the modified periodogram
The use of a data window does not decrease the variance. Example with
Hanning periodograms (blue) of 5 realizations of an ARMA(2,2)-process
(red).
[Figure: Periodogram with Hanning window; $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
Averaging of periodograms
We calculate the average of $K$ Hanning-windowed periodogram
estimates, $\hat{R}_{x,j}(f)$, $j = 1, \ldots, K$, where the length of each sequence is
$N/K$. The variance is $V[\hat{R}_{mv}(f)] \approx \frac{1}{K} R_X^2(f)$ (the Bartlett method).
[Figure: Bartlett method (no overlap); $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
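A sketch of the averaging of $K$ non-overlapping Hanning-windowed periodograms, with a small simulation of the variance reduction. White noise replaces the ARMA-process of the slide, and the function name `bartlett` and the window normalization $\sum_t w(t)^2 = N/K$ are assumptions of this sketch.

```python
import numpy as np

def bartlett(x, K, nfft=None):
    """Average of K non-overlapping, Hanning-windowed periodograms."""
    x = np.asarray(x, dtype=float)
    L = x.size // K                      # segment length N/K (remainder dropped)
    w = np.hanning(L)
    w = w * np.sqrt(L / np.sum(w ** 2))  # assumed normalization: sum(w**2) = L
    nfft = L if nfft is None else nfft
    segments = x[: K * L].reshape(K, L)  # one segment per row
    X = np.fft.fft(segments * w, n=nfft, axis=1)
    return np.arange(nfft) / nfft, np.mean(np.abs(X) ** 2 / L, axis=0)

# Simulation (assumption): white noise with variance 1, so R_X(f) = 1.
rng = np.random.default_rng(2)
N, K, runs = 256, 4, 2000
single = np.empty(runs)
averaged = np.empty(runs)
for r in range(runs):
    x = rng.standard_normal(N)
    single[r] = bartlett(x, 1)[1][N // 4]            # estimate at f = 0.25, K = 1
    averaged[r] = bartlett(x, K)[1][(N // K) // 4]   # estimate at f = 0.25, K = 4

variance_ratio = averaged.var() / single.var()
```

The ratio should come out close to $1/K = 0.25$, in line with the approximation above. The price is that each segment is only $N/K$ samples long, so the resolution is reduced.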
The Welch method
A standard estimator is the Welch method using Hanning windows and
50% overlap. Example with $K = 4$. The variance has been shown to be
close to $V[\hat{R}_{mv}(f)] \approx \frac{1}{K} R_X^2(f)$.
[Figure: Welch method with K=4, 50% overlap; $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
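A minimal NumPy sketch of the Welch method with 50% overlap. With $K$ half-overlapping segments of length $L$, the segments cover $(K+1)L/2$ samples, which gives $L = 2N/(K+1)$. The function name `welch`, the normalization $\sum_t w(t)^2 = L$ and the white-noise data are assumptions of this sketch; library routines such as Matlab's `pwelch` use their own conventions.

```python
import numpy as np

def welch(x, K, nfft=None):
    """Average of K Hanning-windowed periodograms from 50% overlapping segments."""
    x = np.asarray(x, dtype=float)
    hop = x.size // (K + 1)              # shift between consecutive segment starts
    L = 2 * hop                          # segment length; K segments span (K+1)*hop samples
    w = np.hanning(L)
    w = w * np.sqrt(L / np.sum(w ** 2))  # assumed normalization: sum(w**2) = L
    nfft = L if nfft is None else nfft
    segments = np.stack([x[j * hop : j * hop + L] for j in range(K)])
    X = np.fft.fft(segments * w, n=nfft, axis=1)
    R = np.mean(np.abs(X) ** 2 / L, axis=0)
    return np.arange(nfft) / nfft, R, L

# Illustrative data (assumption): white noise with variance 1, so R_X(f) = 1.
rng = np.random.default_rng(3)
x = rng.standard_normal(640)
f, R, L = welch(x, K=4)   # L = 2 * 640 / 5 = 256
```

Compared with non-overlapping averaging of the same $K$, the overlap allows longer segments for a given $N$, which gives better resolution.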
Exercise
The expected spectral estimates using three methods are given: 1) the
periodogram with a rectangular window, 2) the modified periodogram with a
Hanning window, 3) the Welch method using K = 3 Hanning windows with
50% overlap. Match each method with the correct estimate. The realizations are
simulated from the same ARMA(2,4)-process.
Exercise, cont’d
Also determine if the following statements are correct.
1) Zero-padding will increase the frequency resolution of a spectral
estimate.
2) The periodogram with a rectangular window has the best resolution
of the three presented methods.
3) The modified periodogram with a Hanning window has less spectral
leakage than the periodogram with a rectangular window.
4) The window lengths of the Hanning windows in the Welch method
are L = 20 when the total number of data samples is N = 60.
5) The standard deviation of the Welch method is 3 times smaller in
comparison to the periodogram.
6) A spectral estimator with a wider main lobe can resolve two close
peaks better than a spectral estimator with a narrower main
lobe.
Cross-spectrum
The cross-spectrum is complex valued,
$$R_{X,Y}(f) = A_{X,Y}(f) e^{i\Phi_{X,Y}(f)},$$
where $A_{X,Y}(f) = |R_{X,Y}(f)|$ is the cross-amplitude spectrum and
$\Phi_{X,Y}(f) = \arg R_{X,Y}(f)$ is the phase spectrum.
Similar to the periodogram for one measured data sequence,
$$\hat{R}_x(f) = \frac{1}{N} X^*(f) X(f),$$
the cross-spectrum estimate for two sequences is
$$\hat{R}_{x,y}(f) = \frac{1}{N} X^*(f) Y(f),$$
where $^*$ denotes the complex conjugate.
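A sketch of the cross-spectrum estimate for two sequences, where the second sequence is a circularly delayed copy of the first. The conjugate is placed on $X(f)$, which is the convention that makes the ratio $\hat{R}_{x,y}(f)/\hat{R}_x(f)$ equal to $Y(f)/X(f)$; the function name and the delay example are assumptions made for illustration.

```python
import numpy as np

def cross_periodogram(x, y, nfft=None):
    """Cross-spectrum estimate (1/N) conj(X(f)) Y(f) on the grid k/nfft."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    nfft = n if nfft is None else nfft
    X = np.fft.fft(x, n=nfft)
    Y = np.fft.fft(y, n=nfft)
    return np.arange(nfft) / nfft, np.conj(X) * Y / n

# Illustrative data (assumption): y is x delayed circularly by d samples.
rng = np.random.default_rng(5)
N, d = 128, 3
x = rng.standard_normal(N)
y = np.roll(x, d)                      # y_t = x_{t-d}, indices modulo N

f, Rxy = cross_periodogram(x, y)
_, Rx = cross_periodogram(x, x)        # ordinary periodogram of x (real valued)

amplitude = np.abs(Rxy)                # cross-amplitude spectrum estimate
phase = np.angle(Rxy)                  # phase spectrum estimate, in (-pi, pi]
```

For a pure delay the cross-amplitude equals the periodogram of $x$, and the phase equals $-2\pi f d$ (modulo $2\pi$), a straight line in $f$ with slope set by the delay.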
Estimation of frequency function
An estimate of the frequency function is then found as
$$\hat{H}(f) = \frac{\hat{R}_{x,y}(f)}{\hat{R}_x(f)} = \frac{\frac{1}{N} X^*(f) Y(f)}{\frac{1}{N} X^*(f) X(f)} = \frac{Y(f)}{X(f)},$$
where $^*$ denotes the complex conjugate.
[Figure: Amplitude estimate (left) and Phase estimate (right), both versus $f$.]
For a more reliable estimate, apply the Welch method or a multitaper
approach.
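The gain from averaging can be illustrated with a simulated input-output pair. In the sketch below, the filter $h = (1, 0.5)$, the noise level, and the use of non-overlapping Hanning-windowed segments are all assumptions for illustration; with $K = 1$ the estimate reduces to $Y(f)/X(f)$ of the windowed record.

```python
import numpy as np

def frequency_function(x, y, K=1):
    """Estimate H(f) = R_xy(f) / R_x(f) from K non-overlapping Hanning-windowed segments."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    L = x.size // K
    w = np.hanning(L)
    X = np.fft.rfft(x[: K * L].reshape(K, L) * w, axis=1)
    Y = np.fft.rfft(y[: K * L].reshape(K, L) * w, axis=1)
    Rxy = np.mean(np.conj(X) * Y, axis=0)   # the common factor 1/L cancels in the ratio
    Rx = np.mean(np.abs(X) ** 2, axis=0)
    return np.fft.rfftfreq(L), Rxy / Rx

def H_true(f):
    """Frequency function of the assumed filter y_t = x_t + 0.5 x_{t-1}."""
    return 1.0 + 0.5 * np.exp(-2j * np.pi * f)

# Simulated data (assumption): white-noise input, filtered output plus noise.
rng = np.random.default_rng(4)
N = 4096
x = rng.standard_normal(N)
y = np.convolve(x, [1.0, 0.5])[:N] + 0.5 * rng.standard_normal(N)

f1, H1 = frequency_function(x, y, K=1)
f16, H16 = frequency_function(x, y, K=16)

err_single = np.median(np.abs(H1 - H_true(f1)))
err_averaged = np.median(np.abs(H16 - H_true(f16)))
```

The median error of the averaged estimate should be clearly smaller than that of the single-record estimate. Amplitude and phase estimates follow as `np.abs(H16)` and `np.angle(H16)`.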
Estimation of coherence spectrum
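The (squared) coherence spectrum is defined as
$$\kappa^2_{X,Y}(f) = \frac{|R_{X,Y}(f)|^2}{R_X(f) R_Y(f)},$$
and $0 \leq \kappa^2_{X,Y} \leq 1$.
If we use the periodogram for estimation of the coherence spectrum,
$$\hat{\kappa}^2_{x,y} = \frac{|\hat{R}_{x,y}(f)|^2}{\hat{R}_x(f) \hat{R}_y(f)} = \frac{\frac{1}{N} X^*(f) Y(f) \cdot \frac{1}{N} X(f) Y^*(f)}{\frac{1}{N} X^*(f) X(f) \cdot \frac{1}{N} Y^*(f) Y(f)} = 1,$$
where $^*$ denotes the complex conjugate. The estimate is identically one,
whatever the data, and therefore carries no information. Averaging, as in the
Welch method or a multitaper approach, is needed.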
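The degenerate case is easy to verify numerically. The sketch below uses two independent white-noise sequences (an assumption for illustration), for which the true coherence is zero, and compares a single-record estimate with an average over $K = 8$ non-overlapping Hanning-windowed segments.

```python
import numpy as np

def coherence(x, y, K):
    """Squared coherence estimate from K non-overlapping Hanning-windowed segments."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    L = x.size // K
    w = np.hanning(L)
    X = np.fft.rfft(x[: K * L].reshape(K, L) * w, axis=1)
    Y = np.fft.rfft(y[: K * L].reshape(K, L) * w, axis=1)
    Rxy = np.mean(np.conj(X) * Y, axis=0)
    Rx = np.mean(np.abs(X) ** 2, axis=0)
    Ry = np.mean(np.abs(Y) ** 2, axis=0)
    return np.fft.rfftfreq(L), np.abs(Rxy) ** 2 / (Rx * Ry)

# Illustrative data (assumption): two independent white-noise sequences.
rng = np.random.default_rng(6)
x = rng.standard_normal(2048)
y = rng.standard_normal(2048)

_, coh_single = coherence(x, y, K=1)     # equals 1 at every frequency
_, coh_averaged = coherence(x, y, K=8)   # well below 1 for independent data
```

With $K = 1$ the estimate is one at every frequency even though the sequences are independent. With averaging it drops towards zero, although a bias of the order $1/K$ remains for independent sequences.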
Multitaper spectral estimates
A multitaper spectral estimate is given as the average of
$$\hat{R}_{j,x}(f) = \frac{1}{N} \left|\sum_{t=0}^{N-1} x(t) w_j(t) e^{-i2\pi f t}\right|^2,$$
where the windows $w_j(t)$, $j = 1, \ldots, K$, are chosen to give uncorrelated
spectra from the same data.
[Figure: left, the 4 first DPSS; right, Thomson multitaper method, K=8, $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
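A sketch of the multitaper average. Note that it does not use the DPSS windows of the Thomson method shown on the slide: to stay within NumPy, the sinusoidal tapers $w_j(t) \propto \sin(\pi j (t+1)/(N+1))$ are swapped in as a simple orthogonal taper family. The function names and the scaling $\sum_t w_j(t)^2 = N$, which matches the factor $1/N$ in the formula above, are assumptions of this sketch.

```python
import numpy as np

def sine_tapers(n, K):
    """K orthonormal sinusoidal tapers of length n, one per row (not DPSS)."""
    t = np.arange(1, n + 1)
    j = np.arange(1, K + 1)[:, None]
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * j * t / (n + 1))

def multitaper(x, K, nfft=None):
    """Average of K tapered periodograms (1/N)|sum_t x(t) w_j(t) exp(-i 2 pi f t)|^2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    nfft = n if nfft is None else nfft
    W = np.sqrt(n) * sine_tapers(n, K)           # rescaled so that sum_t w_j(t)**2 = N
    X = np.fft.fft(W * x, n=nfft, axis=1)        # one tapered transform per row
    Rj = np.abs(X) ** 2 / n
    return np.arange(nfft) / nfft, Rj.mean(axis=0)

# Illustrative data (assumption): white noise with variance 1, so R_X(f) = 1.
rng = np.random.default_rng(7)
x = rng.standard_normal(256)
f, R = multitaper(x, K=8)

T = sine_tapers(128, 8)
gram = T @ T.T                                   # identity matrix for orthonormal tapers
```

All $K$ windows use the full data length, and their orthogonality is what makes the $K$ spectra approximately uncorrelated for a smooth spectral density.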
The Thomson multitapers in Matlab
The Welch method again
The Welch method is also a multitaper estimator.
[Figure: Welch method with K=4, 50% overlap; $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
The Peak Matched Multiple Windows
Multitapers tailored for spectra with peaks.
[Figure: left, the 4 first PMMW; right, PM MW method, K=8, $10\log_{10}(R_x(f))$ versus $f$ for $0 \leq f \leq 0.5$.]
• M. Hansson and G. Salomonsson, "A Multiple Window Method for Estimation of Peaked Spectra", IEEE
Trans. on Signal Processing, Vol. 45, No. 3, March 1997.
• M. Hansson, "Optimized Weighted Averaging of Peak Matched Multiple Window Spectrum Estimates",
IEEE Trans. on Signal Processing, Vol. 47, No. 4, April 1999.
Example: Comparison of methods
The need for time-frequency analysis
[Figure: a) Chirp signal and b) Impulse signal, versus Time; c) Periodogram of chirp and d) Periodogram of impulse, Power versus Frequency.]
Window shape of the spectrogram
The spectrogram window shape should be chosen for the best trade-off
between sidelobe suppression and mainlobe width, as for the
choice of periodogram window.
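A minimal sketch of a spectrogram with a Hanning window, applied to a linear chirp. The window length, hop size, NFFT and chirp parameters are assumptions chosen for illustration and are not taken from the lecture.

```python
import numpy as np

def spectrogram(x, L=64, hop=16, nfft=256):
    """Squared magnitude of the short-time Fourier transform with a Hanning window."""
    x = np.asarray(x, dtype=float)
    w = np.hanning(L)
    starts = np.arange(0, x.size - L + 1, hop)           # first sample of each frame
    frames = np.stack([x[s : s + L] * w for s in starts])
    S = np.abs(np.fft.rfft(frames, n=nfft, axis=1)) ** 2  # rows: time, columns: frequency
    times = starts + L / 2                                # frame centres
    freqs = np.fft.rfftfreq(nfft)                         # 0 ... 0.5
    return times, freqs, S

# Illustrative signal (assumption): linear chirp from f = 0.05 to f = 0.45.
N = 512
t = np.arange(N)
chirp = np.cos(2 * np.pi * (0.05 * t + 0.5 * (0.4 / N) * t ** 2))

times, freqs, S = spectrogram(chirp)
ridge = freqs[np.argmax(S, axis=1)]   # dominant frequency in each time frame
```

The ridge follows the increasing instantaneous frequency of the chirp, information that a single periodogram of the whole signal cannot show. A longer window narrows the mainlobe in frequency but blurs the time localization.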
Cross-terms and resolution
Motivation: Binary classification
An example of binary classification using Matlab's feed-forward pattern
recognition neural network (patternnet). We simulate 2x200 realizations
from two non-stationary random harmonic function models, disturbed
by Gaussian white noise and divided into training (40%), validation
(10%) and test data (50%). Classification is based on different features,
1) raw data samples (128 features),
2) periodogram samples (128 features),
3) spectrogram images (128x128 features).
The Matlab code for this example is found on Canvas together with the
lecture notes of lecture 5.
Two non-stationary random harmonic function models
Feature extraction, SNR=0 dB
Classification results
SNR = 0 dB; the results are averages over 10 runs of the network.
• Raw data Y1-Y2: 54.2%
• Periodogram R1-R2: 70.0%
• Spectrogram S1vect-S2vect: 86.7%
The computational time increases about 10 times for the
spectrogram feature classification, caused by the higher dimension
of the feature vector.
Clustering of bird song syllables/Soundscape analysis
Aim: To estimate the repertoire size of a bird's song from features of
time-frequency representations. eSSENCE, Biology - Lund University.
Aim: To optimize time-frequency features for source classification and
separation. SSF, RISE - Gothenburg. Started February 2022.
[Figure: song signal (Amplitude versus time in s) with a strophe and syllables 1 to 12 marked; time-frequency images of syllables; clustering of time-frequency image features.]
Characterization of electrical brain signals (EEG)
Aim: To optimize the time-frequency feature representation of EEG, in terms
of resolution and reliability, for use in Brain Computer Interfaces (BCIs).
WASP, Automatic control and Psychology - Lund University.
Aim: To learn how the brain processes sound from several sources (the cocktail
party problem). ELLIIT, Linköping University and Oticon A/S (hearing
devices). Started May 2021.
[Figure: Cocktail party problem.]
Matched reassigned spectrogram
Phase difference estimation of VEP
Phase difference estimation of EEG data measured during visual stimulation
with a 9 Hz flickering light, flashed for about 1 s, resulting
in a visual evoked potential (VEP).
• M. Åkesson and M. Sandsten, "Phase Reassignment with Efficient Estimation of Phase Difference",
EUSIPCO 2022, Belgrade.
• O. Keding and M. Sandsten, "Robust Phase Difference Estimation of Transients in High Noise Levels",
EUSIPCO 2022, Belgrade.
Master thesis proposals
For more information, see
www.maths.lth.se/utbildning/matematisk-statistik/examensarbetesforslag/
Speech-to-text and machine learning in insurance fraud detection
A large insurance company receives thousands of insurance claims every day.
Almost all of these are valid and should be settled hastily – however a few are
fraudulent and should not be settled at all. The scarcity of fraudulent cases and their
often complex data patterns make fraud modelling and prediction very challenging.
One way to address the problem is to introduce new data points that improve the
segmentation of cases into fraud/non-fraud. One such new data point could be the
customer's verbal description of his or her insurance claim. This is a proposal for a
master's thesis that includes 1) speech-to-text analysis of recorded conversations, 2)
machine learning on the fraud/non-fraud binary response problem and 3) evaluation
of model performance compared to the existing prediction method.
The master's thesis will be a continuation of a previous thesis, which can be found here:
http://lup.lub.lu.se/student-papers/record/9097486
Contact information: Fredrik Thuring, Head of Operational analytics, Tryg Forsikring
[email protected], +46 701 683 548.
University contact: Maria Sandsten, Professor
[email protected], +46 46 222 49 53

MASTER THESIS
Frequency analysis of peel off force
Carton packages for beverages are made of material that consists of several layers. For example, there
are layers of plastic, carton (of course), and often aluminium. Good adhesion between the layers is
important for the function of the package. The adhesion is usually measured by peeling off a strip of a
layer from the rest of the material, collecting values of displacement and corresponding force.
Figure 1 Sketch of a peel test in 180°
[Figure: Data from a peel off test for adhesion; Force versus Displacement.]
Figure 2 An example of a force–displacement curve
Traditionally, the plateau value of the force curve is the value considered to be of interest, and the
adhesion is expressed in terms of N/m, where m is the width of the strip. However, there is not always
such a plateau value. There can be a variation of the force. The purpose of this thesis is to analyse this
variation using frequency analysis.
The sources of the variation can be of several kinds. It could be a variation of the base material, it could
be vibrations or something else in the Tetra Pak factories during lamination (when the layers are put
together), or it could be the way of testing rather than the adhesion itself that gives the variation.
We want to
• Learn what frequencies there are in the curves for several varieties of material, and if possible,
connect these frequencies to process or material parameters
• Produce some sketches of how we can proceed with estimating the spectral density on a
regular basis
Contact persons:
Magnus Arnér, [email protected], Linda Hartman, [email protected]
Tetra Pak Packaging Solutions AB
Ruben Rausings gata, SE-221 86, Lund, Sweden, Tel: +46 46 36 10 00, www.tetrapak.com
Tetra Pak is a trademark belonging to the Tetra Pak Group