Advanced Signal Processing Linear Stochastic Processes
Aims of this lecture
◦ To introduce linear stochastic models for real world data
Example 1: Assessing the nature of a signal from its ACF
Windowed clean signal, signal in WGN, signal with DC offset (see also Lecture 1)
[Figure: top row – the windowed clean signal, the signal in WGN, and the signal with a DC offset; bottom row – their ACFs over lags −10 to 10]
How can we categorise real–world measurements?
Where would you place a DC level in WGN, $x[n] = A + w[n]$, $w \sim \mathcal{N}(0, \sigma_w^2)$?
(a) Noisy oscillations, (b) Nonlinearity and noisy oscillations, (c) Random nonlinear process
(? left) Route to chaos, (? top) stochastic chaos, (? middle) mixture of sources
[Diagram: a plane with horizontal axis Determinism → Stochasticity and vertical axis Linearity → Nonlinearity; ARMA sits in the linear–stochastic corner, NARMA above it, Chaos in the nonlinear–deterministic corner, with several "?" regions in between]
Our lecture is about ARMA models (linear stochastic)
How about observing the signal through a nonlinear sensor?
Justification → Wold decomposition theorem
(an existence theorem, also mentioned in your coursework)
Wold’s decomposition theorem plays a central role in time series analysis,
and explicitly proves that any covariance–stationary time series can be
decomposed into two different parts: deterministic (such as a sinewave)
and stochastic (filtered WGN).
Therefore, a general process can be written as a sum of two processes
$$x[n] = x_p[n] + x_r[n] = x_p[n] + \sum_{j=1}^{q} b_j\, w[n-j], \qquad w \;\text{a white process}$$
⇒ $x_r[n]$ → regular random process
⇒ $x_p[n]$ → predictable process, with $x_r[n] \perp x_p[n]$, that is, $E\{x_r[m]\, x_p[n]\} = 0$
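As a rough numerical illustration of the two components (a sketch, not from the slides; the sinewave frequency and the MA coefficients b are arbitrary assumptions):

n  = (0:999)';
xp = sin(2*pi*0.05*n);     % predictable (deterministic) part
w  = randn(1000,1);        % white Gaussian driving noise
b  = [1 0.5 0.25];         % assumed MA coefficients b_j
xr = filter(b, 1, w);      % regular part: filtered WGN
x  = xp + xr;              % the observed process
mean(xr .* xp)             % ~0: the two parts are uncorrelated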
Towards linear stochastic processes
Wold’s theorem implies that any purely non-deterministic covariance–stationary
process can be arbitrarily well approximated by an ARMA process
Therefore, the general form for the power spectrum of a WSS process is
$$P_x(e^{j\omega}) = \sum_{k=1}^{N} \alpha_k\, \delta(\omega - \omega_k) + P_{x_r}(e^{j\omega})$$
We are interested in processes generated by filtering white noise with a
linear shift–invariant filter that has a rational system function.
This class of digital filters includes the following system functions:
• Autoregressive (AR) → all pole system → H(z) = 1/A(z)
• Moving Average (MA) → all zero system → H(z) = B(z)
• Autoregressive Moving Average (ARMA) → poles and zeros
→ H(z) = B(z)/A(z)
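A minimal Matlab sketch of the three classes, each obtained by filtering the same white noise (the coefficient values below are illustrative assumptions, not from the lecture):

w = randn(1000,1);                   % common WGN input
a = [1 -0.5 0.3]; b = [1 0.8 0.4];   % assumed A(z) and B(z) coefficients
xAR   = filter(1, a, w);             % AR:   H(z) = 1/A(z)   (all-pole)
xMA   = filter(b, 1, w);             % MA:   H(z) = B(z)     (all-zero)
xARMA = filter(b, a, w);             % ARMA: H(z) = B(z)/A(z)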
Recap: Second-order all–pole transfer functions
p1 = 0.999exp(jπ/4), p2 = 0.9exp(jπ/4), p3 = 0.9exp(j7π/12)
[Figure: pole–zero plot showing p₁, p₂, p₃ and their conjugates; magnitude responses – the closer a pole is to the unit circle, the sharper its resonant peak]

We have two complex conjugate poles, e.g. $p_1$ and $p_1^*$, therefore
$$H(z) = \frac{1}{(z - p_1)(z - p_1^*)} = \frac{z^{-2}}{(1 - p_1 z^{-1})(1 - p_1^* z^{-1})}$$
For the sinewave, $\rho = 1 \;\Rightarrow\; H(z) = \dfrac{1}{1 - 2\cos(\theta)\, z^{-1} + z^{-2}} = \dfrac{1}{1 + a_1 z^{-1} + a_2 z^{-2}}$
⇒ Indeed, a sinewave can be modelled as an autoregressive process
Example 2: Sinewave revisited, is it det. or stoch.?
Is a sinewave best described as nonlinear deterministic or linear stochastic?
[Figure: pole–zero plot of the AR(2) filter; the white noise input; the filtered (near-sinusoidal) output]

Matlab code:

z1 = 0;                           % zero at the origin
p1 = [0.5+0.866i; 0.5-0.866i];    % conjugate pole pair close to the unit circle
[num1, den1] = zp2tf(z1, p1, 1);  % zero-pole description -> transfer function
zplane(num1, den1);               % pole-zero plot
s  = randn(1, 1000);              % white noise
s1 = filter(num1, den1, s);       % filtered noise
figure;
subplot(311), plot(s)
subplot(312), zplane(num1, den1)
subplot(313), plot(s1)

The AR model of a sinewave:
x(k) = a1*x(k-1) + a2*x(k-2) + w(k), with a1 = -1, a2 = 0.98, w ~ N(0,1)
(here a1, a2 are the denominator coefficients [1, a1, a2], cf. the previous slide)
Example 3: Spectra of real–world data

Sunspot numbers and their power spectrum:

[Figure: the sunspot series (signal values vs sample number); its ACF and partial ACF vs correlation lag; the Burg power spectral density estimate (dB/rad/sample) vs normalized frequency (×π rad/sample)]

$$P_x = |H(\omega)|^2\, P_w$$
Spectrum of ARMA models (look also at Recap slides)
recall that two conjugate complex poles of A(z) give one peak in the spectrum
$$P_x(z) = \sigma_w^2\, \frac{B_q(z)\, B_q(z^{-1})}{A_p(z)\, A_p(z^{-1})} \;\Rightarrow\; P_x(e^{j\theta}) = \sigma_w^2\, \frac{|B_q(e^{j\theta})|^2}{|A_p(e^{j\theta})|^2} = \sigma_w^2\, \frac{|B_q(\omega)|^2}{|A_p(\omega)|^2}$$

Notice that "(·)*" in analogue frequency corresponds to "z⁻¹" in digital frequency.
Example 4: Can the shape of power spectrum tell us
about the order of the polynomials B(z) and A(z)?
Plot the power spectrum of an ARMA(2,2) process for which
◦ the zeros of H(z) are z = 0.95e±π/2
Solution: The system function is (poles and zeros – resonance & sink)
1 + 0.9025z −2
H(z) =
1 − 0.5562z −1 + 0.81z −2
7
Power Spectrum [dB]
−1
0 0.5 1 1.5 2 2.5 3 3.5
Frequency
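This spectrum can be reproduced with freqz (a sketch; the plotting details are an assumption):

b = [1 0 0.9025];               % zeros at 0.95exp(±jπ/2)
a = [1 -0.5562 0.81];           % poles at 0.9exp(±j2π/5)
[H, om] = freqz(b, a, 512);     % frequency response on [0, π]
plot(om, 10*log10(abs(H).^2));  % power spectrum in dB
xlabel('Frequency (rad/sample)'), ylabel('Power Spectrum [dB]')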
Difference equation representation → the ACF follows the data model!
Random processes x[n] and w[n] are related by a linear difference equation
with constant coefficients, given by
$$H(z) = \frac{B(z)}{A(z)} = \frac{\sum_{k=0}^{q} b_k z^{-k}}{1 - \sum_{k=1}^{p} a_k z^{-k}} \;\leftrightarrow\; \text{ARMA}(p,q) \;\leftrightarrow\; x[n] = \underbrace{\sum_{l=1}^{p} a_l\, x[n-l]}_{\text{autoregressive}} + \underbrace{\sum_{l=0}^{q} b_l\, w[n-l]}_{\text{moving average}}$$
Since x is WSS, it follows that x[n] and w[n] are jointly WSS
General linear processes: Stationarity and invertibility
Can we tell anything about the process x from the coefficients a, b (cf. h in FIR)?
Autoregressive processes (pole–only)
Example 5: Statistical properties of AR processes
Drive the AR(4) model from Example 6 with two different WGN realisations ∼ N (0, 1)
[Figure: two independent WGN realisations (first column) drive the same AR(4) model; shown are the resulting AR signals, their ACFs, and PSD estimates – the noise realisations differ, but the ACF and PSD shapes are consistent]
ACF and normalised ACF of AR processes
Key: ACF has the same form as the AR process in hand!
To obtain the autocorrelation function of an AR process, multiply the AR(p) difference equation by x[n − k] and take expectations, to obtain (recall that r(−m) = r(m))
$$r_{xx}(0) = a_1 r_{xx}(1) + a_2 r_{xx}(2) + \cdots + a_p r_{xx}(p) + \sigma_w^2, \qquad k = 0$$
$$r_{xx}(k) = a_1 r_{xx}(k-1) + a_2 r_{xx}(k-2) + \cdots + a_p r_{xx}(k-p), \qquad k > 0$$
Variance and spectrum of AR processes
Variance: for k = 0, the contribution from the term $E\{x[n-k]\, w[n]\}$ is $\sigma_w^2$, and
$$r_{xx}(0) = a_1 r_{xx}(-1) + a_2 r_{xx}(-2) + \cdots + a_p r_{xx}(-p) + \sigma_w^2$$

Power spectrum:
$$P_{xx}(f) = \frac{2\sigma_w^2}{\left|1 - a_1 e^{-j2\pi f} - \cdots - a_p e^{-j2\pi p f}\right|^2}, \qquad 0 \le f \le 1/2$$
Example 6a: AR(p) signal generation
Consider an AR(4) process with coeff. a = [2.2137, −2.9403, 2.1697, −0.9606]
◦ Generate x by filtering white noise (1024 points) through the AR filter
◦ Estimate the PSD of x based on a fourth-order AR model

[Figure: the resulting PSD estimate, Power/frequency (dB/rad/sample)]
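A sketch of both steps, using pyulear from the Signal Processing Toolbox:

a = [1 -2.2137 2.9403 -2.1697 0.9606];  % AR(4) denominator A(z)
w = randn(1024,1);                      % 1024 points of unit-variance WGN
x = filter(1, a, w);                    % generate the AR(4) signal
pyulear(x, 4)                           % PSD estimate via a 4th-order AR model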
Example 6b: Alternative AR power spectrum calculation
(an alternative function in Matlab)
Consider the AR(4) system given by
$$x[n] = 2.2137\,x[n-1] - 2.9403\,x[n-2] + 2.1697\,x[n-3] - 0.9606\,x[n-4] + w[n]$$
a = [1 -2.2137 2.9403 -2.1697 0.9606]; % AR filter coefficients
freqz(1,a) % AR filter frequency response
title(’AR System Frequency Response’)
[Figure: "AR System Frequency Response" – magnitude (dB) and phase (degrees) vs normalized frequency (×π rad/sample)]
Key: Finding AR coefficients → the Yule–Walker equations
(there are several similar forms – we follow the most concise one)
The ACF matrix Rxx is positive definite and Toeplitz, which guarantees that it can be inverted.
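A sketch of the corresponding computation, assuming a zero-mean signal x in the workspace and the concise form $R_{xx}\,\mathbf{a} = \mathbf{r}$ (the biased ACF estimate keeps $R_{xx}$ well conditioned):

p = 4;                       % assumed model order
r = xcorr(x, p, 'biased');   % ACF estimates, lags -p..p
r = r(p+1:end);              % keep lags 0..p
Rxx  = toeplitz(r(1:p));     % p x p Toeplitz ACF matrix
aHat = Rxx \ r(2:p+1)        % Yule-Walker estimates of a1..ap
aYW  = aryule(x, p)          % toolbox cross-check: returns [1, -a1, ..., -ap]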
Example 7: Find the parameters of an AR(2) process x[n],
generated by x[n] = 1.2x[n − 1] − 0.8x[n − 2] + w[n]
Coursework: comment on the shape of the ACF for large lags
[Figure: the AR(2) signal x = filter([1], [1, −1.2, 0.8], w); its ACF over lags ±400; the same ACF zoomed over lags ±20]
Example 8: Advantages of model-based analysis
Consider the PSD’s for different realisations of the AR(4) process from Example 5
[Figure: empirical PSDs of different realisations (thin black) overlaid with the theoretical model PSD (thick red), in dB]
◦ The different realisations lead to different Empirical PSD’s (in thin black)
◦ The theoretical PSD from the model is consistent regardless of the data (in thick red)
N = 1024;
w = wgn(N,1,1);
a = [2.2137, -2.9403, 2.1697, -0.9606]; % Coefficients of AR(4) process
a = [1 -a];
x = filter(1,a,w);
xacf = xcorr(x); % Autocorrelation of AR(4) process
dft = fft(xacf);
EmpPSD = abs(dft/length(dft)).^ 2; % Empirical PSD obtained from data
ThePSD = abs(freqz(1,a,N,1)).^ 2 ; % Theoretical PSD obtained from model
Normal equations for the autocorrelation coefficients
Dividing the Yule–Walker equations by $r_{xx}(0)$ and using $\rho_k = r_{xx}(k)/r_{xx}(0)$, we have
$$\rho_1 = a_1 + a_2\rho_1 + \cdots + a_p\rho_{p-1}$$
$$\rho_2 = a_1\rho_1 + a_2 + \cdots + a_p\rho_{p-2}$$
$$\vdots$$
$$\rho_p = a_1\rho_{p-1} + a_2\rho_{p-2} + \cdots + a_p$$
Yule–Walker modelling in Matlab
In Matlab – Power spectral density using Y–W method pyulear
Pxx = pyulear(x,p)
[Pxx,w] = pyulear(x,p,nfft)
[Pxx,f] = pyulear(x,p,nfft,fs)
[Pxx,f] = pyulear(x,p,nfft,fs,’range’)
[Pxx,w] = pyulear(x,p,nfft,’range’)
Description: Pxx = pyulear(x,p) returns the power spectral density estimate of the signal x, obtained by fitting a pth-order autoregressive model to x via the Yule–Walker method.
Stochastic modelling: From data to an ARMA(p, q) model
So far, we have assumed the model (AR, MA, or ARMA) and analysed the
ACF and PSD based on known model coefficients.
In practice: DATA → MODEL
Example 9: Sunspot number estimation
(the ACF is consistent with the properties of a second-order AR process)
[Figure: the sunspot time series, 1700–1900s (left); its ACF over delays of ±100 years (middle); the ACF zoomed over delays of ±10 years (right)]
Special case #1: AR(1) process (Markov)
For Markov processes, instead of the iid condition, we have the first-order conditional dependence, that is
$$p(x[n] \mid x[n-1], x[n-2], \ldots, x[0]) = p(x[n] \mid x[n-1])$$
For an AR(1) process, the normalised ACF is
$$\rho_k = a_1^k, \qquad k > 0$$
Notice the difference in the behaviour of the ACF for a1 positive and negative
Variance and power spectrum of AR(1) process
Both can be calculated directly from the general expression for the
variance and spectrum of AR(p) processes.
$$\sigma_x^2 = \frac{\sigma_w^2}{1 - \rho_1 a_1} = \frac{\sigma_w^2}{1 - a_1^2}$$
$$P_{xx}(f) = \frac{2\sigma_w^2}{\left|1 - a_1 e^{-j2\pi f}\right|^2} = \frac{2\sigma_w^2}{1 + a_1^2 - 2a_1\cos(2\pi f)}$$
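A quick numerical check of the variance formula (sketch; a₁ and the sample size are arbitrary choices):

a1 = 0.8; N = 1e5;
w  = randn(N,1);             % unit-variance WGN
x  = filter(1, [1 -a1], w);  % AR(1): x[n] = a1*x[n-1] + w[n]
var(x)                       % empirical variance
1/(1 - a1^2)                 % theoretical sigma_w^2/(1 - a1^2), sigma_w^2 = 1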
Example 10: ACF and spectrum of AR(1) for a = ±0.8
a < 0 → High Pass a > 0 → Low Pass
x[n] = −0.8*x[n−1] + w[n] x[n] = 0.8*x[n−1] + w[n]
[Figure: for a₁ = −0.8 (left column) and a₁ = +0.8 (right column): the signal realisations; the ACFs – alternating in sign for a₁ < 0, smoothly decaying for a₁ > 0; and the Burg PSD estimates – high-pass and low-pass, respectively]
Special case #2: Second order autoregressive processes,
p = 2, q = 0, hence the notation AR(2)
The input–output functional relationship is given by (w[n] ∼ any white noise)
$$x[n] = a_1 x[n-1] + a_2 x[n-2] + w[n]$$
$$X(z) = \left(a_1 z^{-1} + a_2 z^{-2}\right) X(z) + W(z)$$
$$\Rightarrow\; H(z) = \frac{X(z)}{W(z)} = \frac{1}{1 - a_1 z^{-1} - a_2 z^{-2}}$$
$$H(\omega) = H(e^{j\omega}) = \frac{1}{1 - a_1 e^{-j\omega} - a_2 e^{-j2\omega}}$$
Yule–Walker equations for p = 2:
$$\rho_1 = a_1 + a_2\rho_1, \qquad \rho_2 = a_1\rho_1 + a_2$$
Connecting the a's and the ρ's:
$$\rho_1 = \frac{a_1}{1 - a_2}, \qquad \rho_2 = a_2 + \frac{a_1^2}{1 - a_2}$$
When solved for a₁ and a₂, we have
$$a_1 = \frac{\rho_1(1 - \rho_2)}{1 - \rho_1^2}, \qquad a_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}$$
Since |ρ₁| < ρ₀ = 1 → a stability condition on a₁ and a₂.
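A sketch verifying the a ↔ ρ mapping numerically (the coefficient values are those of Example 7):

a1 = 1.2; a2 = -0.8;
rho1 = a1/(1 - a2);                   % from the Yule-Walker equations
rho2 = a2 + a1^2/(1 - a2);
a1rec = rho1*(1 - rho2)/(1 - rho1^2)  % recovers a1 = 1.2
a2rec = (rho2 - rho1^2)/(1 - rho1^2)  % recovers a2 = -0.8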
Variance and power spectrum
Both readily obtained from the general AR(2) process equation!
Variance:
$$\sigma_x^2 = \frac{\sigma_w^2}{1 - \rho_1 a_1 - \rho_2 a_2} = \left(\frac{1 - a_2}{1 + a_2}\right) \frac{\sigma_w^2}{(1 - a_2)^2 - a_1^2}$$

Power spectrum:
$$P_{xx}(f) = \frac{2\sigma_w^2}{\left|1 - a_1 e^{-j2\pi f} - a_2 e^{-j4\pi f}\right|^2} = \frac{2\sigma_w^2}{1 + a_1^2 + a_2^2 - 2a_1(1 - a_2)\cos(2\pi f) - 2a_2\cos(4\pi f)}, \qquad 0 \le f \le 1/2$$
Stability triangle
[Diagram: the stability triangle in the (a₁, a₂) plane, with vertices at (−2, −1), (2, −1) and (0, 1); regions I and II (real characteristic roots) lie above the parabola a₁² + 4a₂ = 0, regions III and IV (complex roots) below it, and a sketch of the typical ACF shape is shown in each region]
Example 11: Stability triangle and ACFs of AR(2) signals
Left: a = [−0.7, 0.2] (region 2) Right: a = [1.474, −0.586] (region 4)
[Figure: the signal realisations (top row) and their ACFs (bottom row) for the two parameter sets – the region-2 ACF alternates in sign, while the region-4 ACF is a damped oscillation]
Determining regions in the stability triangle
let us examine the autocorrelation function of AR(2) processes
The ACF satisfies
$$\rho_k = a_1\rho_{k-1} + a_2\rho_{k-2}, \qquad k > 0$$
Example 12: AR(2) where a₁ > 0, a₂ < 0 → Region 4
Consider: x[n] = 0.75x[n − 1] − 0.5x[n − 2] + w[n]
[Figure: the ACF of x[n] over lags 0–50 (top) – a damped oscillation; the Yule–Walker PSD estimate (bottom), with a peak at the resonant frequency]

The ACF oscillates at the normalised frequency (here $\cos\theta = a_1/(2\sqrt{-a_2}) = 0.5303$)
$$f_0 = \frac{\cos^{-1}(0.5303)}{2\pi} = \frac{1}{6.2}$$
The fundamental period of the autocorrelation function is therefore T₀ = 6.2.
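The pseudo-period can be cross-checked from the pole angle (sketch):

p  = roots([1 -0.75 0.5]);  % poles of 1 - 0.75z^-1 + 0.5z^-2
f0 = angle(p(1))/(2*pi);    % normalised frequency of the ACF oscillation
T0 = 1/abs(f0)              % approx 6.2 samples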
Model order selection: Partial autocorrelation function
Consider an earlier example using a slightly different notation for AR coefficients
[Figure: as in Example 7 – the AR(2) signal x = filter([1], [1, −1.2, 0.8], w), its ACF over lags ±400, and the ACF zoomed over lags ±20]
Partial autocorrelation function: Motivation
Notice: the ACF of an AR(p) process is infinite in duration, but it can be described in terms of p nonzero functions of the autocorrelations.
Denote by $a_{kj}$ the jth coefficient in an autoregressive representation of order k, so that $a_{kk}$ is the last coefficient. Then
$$\rho_j = a_{k1}\rho_{j-1} + \cdots + a_{kk}\rho_{j-k}, \qquad j = 1, 2, \ldots, k$$
The only difference from the standard Y–W equations is the use of the symbol $a_{kj}$ to denote the AR coefficient $a_j$, with k indicating the model order.
Finding partial ACF coefficients
Solving these equations for k = 1, 2, 3, … successively, we obtain
$$a_{11} = \rho_1, \qquad a_{22} = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}, \qquad a_{33} = \frac{\begin{vmatrix} 1 & \rho_1 & \rho_1 \\ \rho_1 & 1 & \rho_2 \\ \rho_2 & \rho_1 & \rho_3 \end{vmatrix}}{\begin{vmatrix} 1 & \rho_1 & \rho_2 \\ \rho_1 & 1 & \rho_1 \\ \rho_2 & \rho_1 & 1 \end{vmatrix}}, \qquad \text{etc.}$$
◦ For an AR(p) process, the partial autocorrelation $a_{kk}$ is nonzero for k ≤ p and zero for k > p ⇒ the cut-off lag of the PAC indicates the order of an AR(p) process.
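A sketch of this computation for a zero-mean signal x: solve the Yule–Walker equations at each successive order k and keep the last coefficient (parcorr in the Econometrics Toolbox provides the same functionality):

maxLag = 20;
r   = xcorr(x, maxLag, 'biased');
rho = r(maxLag+1:end) / r(maxLag+1);  % rho_0 .. rho_maxLag, rho(1) = 1
pacf = zeros(maxLag,1);
for k = 1:maxLag
    R  = toeplitz(rho(1:k));          % k x k matrix of rho_0..rho_{k-1}
    ak = R \ rho(2:k+1);              % Yule-Walker solution at order k
    pacf(k) = ak(end);                % a_kk, the partial autocorrelation
end
stem(pacf)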
Example 13: Work by Yule → a model of sunspot numbers
Sunspot numbers have been recorded for > 300 years. To study them, in 1927 Yule invented the AR(2) model.
[Figure: the raw sunspot series (signal values vs time)]

We first center the data, as we do not wish to model the DC offset. A fifth-order (AR(5)) fit of the centered data gives the coefficients
a5 = [1.4773, −0.5377, −0.1739, 0.0174, 0.1555]
Example 13 (contd.): Model order for sunspot numbers
After k = 2 the partial correlation function (PAC) is very small, indicating p = 2
[Figure: the sunspot series and its ACF (top row); the partial ACF – negligible beyond lag 2 – and the Burg PSD estimate (bottom row)]
Example 14: Model order for an AR(3) process
An AR(3) process realisation, its ACF, and partial autocorrelation (PAC)
[Figure: an AR(3) signal realisation and its ACF (top row); the partial ACF and the Burg PSD estimate (bottom row)]
After lag k = 3, the PAC becomes very small (broken line conf. int.)
Example 15: Model order selection for a financial time
series (the 'correct' and 'time-reversed' time series)
[Figure: the £/$ exchange rate, 1970–2018, and its time-reversed version (top row); their autocorrelation functions and autoconvolution functions (bottom rows)]

Partial correlations:
AR(1): a = [0.9994]
AR(2): a = [.9994, −.0354]
AR(3): a = [.9994, −.0354, −.0024]
AR(4): a = [.9994, −.0354, −.0024, .0129]
AR model based prediction: Importance of model order
For a zero mean process x[n], the best linear predictor, in the mean
square error sense, of x[n] based on x[n − 1], x[n − 2], . . . is
$$\hat{x}[n] = a_{k-1,1}\, x[n-1] + a_{k-1,2}\, x[n-2] + \cdots + a_{k-1,k-1}\, x[n-k+1]$$
(apply the E{·} operator to the general AR(p) model expression, and
recall that E{w[n]} = 0)
(Hint: $E\{x[n]\} = \hat{x}[n] = E\{a_{k-1,1} x[n-1] + \cdots + a_{k-1,k-1} x[n-k+1] + w[n]\} = a_{k-1,1} x[n-1] + \cdots + a_{k-1,k-1} x[n-k+1]$)
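A sketch of the resulting one-step-ahead predictor for a zero-mean signal x, with coefficients estimated by aryule (the order is an assumed value):

ord  = 2;                    % assumed predictor order
aY   = aryule(x, ord);       % returns [1, -a1, ..., -a_ord]
a    = -aY(2:end);           % predictor coefficients a1..a_ord
xhat = filter([0 a], 1, x);  % xhat[n] = a1*x[n-1] + a2*x[n-2] + ...
var(x - xhat)                % prediction error variance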
Example 16: Under– vs Over–fitting a model →
Estimation of the parameters of an AR(2) process
Original AR(2) process x[n] = −0.2x[n − 1] − 0.9x[n − 2] + w[n],
w[n] ∼ N (0, 1), estimated using AR(1), AR(2) and AR(20) models:
[Figure: left – a segment of the original AR(2) signal overlaid with its AR(1), AR(2) and AR(20) model estimates (errors: 5.2627, 1.0421 and 1.0621, respectively); right – the estimated coefficients vs coefficient index]
The higher order coefficients of the AR(20) model are close to zero and
therefore do not contribute significantly to the estimate, while the AR(1)
does not have sufficient degrees of freedom. (see also Appendix 3)
Effects of over-modelling on autoregressive spectra:
Spectral line splitting
Consider an AR(2) signal x[n] = −0.9x[n − 2] + w[n] with w ∼ N(0, 1),
N = 64 data samples, and model orders p = 4 (solid blue) and p = 12
(broken green). (Script: AR 2 Highpass Circularity.m)
[Figure: magnitude spectra of the two AR model fits vs frequency (units of π)]
Notice that this is an AR(2) model!
Although the true spectrum has a single spectral peak at ω = π/2 (blue), when over-modelling with p = 12 this peak is split into two peaks (green).
Model order selection → practical issues
In practice: the greater the model order, the greater the accuracy, but also the complexity
Q: When do we stop? What is the optimal model order?
Solution: To establish a trade–off between computational complexity and
model accuracy, we introduce a "penalty" for a high model order. Such
criteria for model order selection are:
MDL: The minimum description length criterion (by Rissanen)
AIC: The Akaike information criterion
$$\text{MDL:} \quad p_{opt} = \arg\min_p \left[\log(E) + \frac{p \log(N)}{N}\right] \qquad \text{AIC:} \quad p_{opt} = \arg\min_p \left[\log(E) + \frac{2p}{N}\right]$$
where E is the prediction error power of the order-p model and N is the number of data points.
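A sketch of order selection for a signal x with these criteria, using the prediction error variance E returned by aryule (the maximum order tried is an arbitrary choice):

N = length(x);  maxOrder = 10;
mdl = zeros(maxOrder,1);  aic = zeros(maxOrder,1);
for p = 1:maxOrder
    [~, E] = aryule(x, p);         % E: prediction error variance at order p
    mdl(p) = log(E) + p*log(N)/N;  % minimum description length
    aic(p) = log(E) + 2*p/N;       % Akaike information criterion
end
[~, pMDL] = min(mdl);  [~, pAIC] = min(aic);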
Example 17: Model order selection → MDL vs AIC
MDL and AIC criteria for an AR(2) model with a₁ = 0.5, a₂ = −0.3
[Figure: the MDL and AIC curves vs model order 1–10, both minimised at p = 2]

The curves are convex: the cumulative squared error decreases monotonically with the model order, while the penalty term (the MDL or AIC correction) increases.
Example 18: Third order moving average MA(3) process
An MA(3) process and its autocorrel. (ACF) and partial autocorrel. (PAC) fns.
[Figure: an MA(3) signal and its ACF – which cuts off after lag 3 (top row); the partial ACF – which tails off – and the Burg PSD estimate (bottom row)]
After lag k = 3, the PAC becomes very small (broken line conf. int.)
Analysis of nonstationary signals

◦ Consider a real-world speech signal, analysed over three windows W1, W2, W3

[Figure: the speech signal with analysis windows W1–W3; the ACF within each window; the MDL criterion per window, giving model orders 13, > 50 and 24, respectively]

◦ Different model orders are required for different segments of speech → an opportunity for content analysis!
◦ To deal with …

Some properties of AR and MA processes:
ii) The finite MA(q) process has an ACF that is zero beyond q. For an AR
process, the ACF is infinite in length and consists of a mixture of damped
exponentials and/or damped sine waves.
iii) Finite MA processes are always stable, and there is no requirement on the
coefficients of MA processes for stationarity. However, for invertibility,
the roots of the characteristic equation must lie inside the unit circle.
iv) AR processes produce spectra with sharp peaks (two poles of A(z) per
peak), whereas MA processes cannot produce peaky spectra.
Summary: Wold’s Decomposition Theorem and ARMA
◦ Every stationary time series can be represented as a sum of a perfectly
predictable process and a feasible moving average process
◦ Two time series with the same Wold representations are the same, as
the Wold representation is unique
Recap: Linear systems
[Diagram: input x(k) (known or unknown) → h(k), H(z) → output y(k) (unknown or known), with the transfer function H(z) = Y(z)/X(z)]

Linear systems are described by their impulse response h(n) or the transfer function H(z), that is
$$y[n] = \sum_{r=-\infty}^{\infty} h(r)\, x[n-r] = h * x$$
The next two slides show how to calculate the power of the output, y(k).
Recap: Linear systems – statistical properties → mean and variance
i) Mean
$$E\{y[n]\} = E\left\{\sum_{r=-\infty}^{\infty} h(r)\, x[n-r]\right\} = \sum_{r=-\infty}^{\infty} h(r)\, E\{x[n-r]\}$$
$$\Rightarrow\; \mu_y = \mu_x \sum_{r=-\infty}^{\infty} h(r) = \mu_x H(0)$$
[NB: $H(\theta) = \sum_{r=-\infty}^{\infty} h(r)\, e^{-jr\theta}$. For θ = 0, $H(0) = \sum_{r=-\infty}^{\infty} h(r)$]

ii) Cross–correlation
$$r_{yx}(m) = E\{y[n]\, x[n+m]\} = \sum_{r=-\infty}^{\infty} h(r)\, E\{x[n-r]\, x[n+m]\} = \sum_{r=-\infty}^{\infty} h(r)\, r_{xx}(m+r)$$
– a convolution of the input ACF and {h}
Recap: Linear systems – statistical properties → output ACF and spectrum
These are key properties # used in AR spectrum estimation
From $r_{xy}(m) = r_{yx}(-m)$ we have $r_{xy}(m) = \sum_{r=-\infty}^{\infty} h(r)\, r_{xx}(m-r)$. Now we write
$$r_{yy}(m) = E\{y[n]\, y[n+m]\} = \sum_{r=-\infty}^{\infty} h(r)\, E\{x[n-r]\, y[n+m]\} = \sum_{r=-\infty}^{\infty} h(r)\, r_{xy}(m+r) = \sum_{r=-\infty}^{\infty} h(-r)\, r_{xy}(m-r)$$
or
$$S_{yy}(f) = H(f)\, H(-f)\, S_{xx}(f) = |H(f)|^2\, S_{xx}(f)$$
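These relations can be sanity-checked numerically (a sketch; the filter and estimation settings are arbitrary choices):

b = [1 0.5]; a = [1 -0.9];                % an example H(z)
x = randn(1e5,1);                         % white input, variance 1
y = filter(b, a, x);
[Pyy, f] = pwelch(y, 1024, [], 1024, 1);  % estimate of Syy(f)
H = freqz(b, a, 513, 1);                  % H(f) on the same frequency grid
% Pyy should track abs(H).^2 (up to the one-sided PSD scaling factor of 2)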
More on Wold Decomposition (Representation) Theorem
Example: A "paradox" – can we talk about a deterministic random process?
$$x[n] = \frac{\sin(2)}{\sin(1)}\, x[n-1] - x[n-2]$$
$$E\{x[n] \mid x[n-1], x[n-2], \ldots\} = \frac{\sin(2)}{\sin(1)}\, x[n-1] - x[n-2] = x[n]$$
The process is perfectly predictable from its own past (it is a pure sinewave), so its stochastic part is zero.
Appendix 1: More on sunspot numbers (recorded since 1874)
Top: original sunspots Middle and Bottom: AR(2) representations
[Figure: top row – the original sunspot series, its PSD, and its partial ACF; middle and bottom rows – two Gaussian processes filtered by second-order Yule–Walker (AR(2)) fits, with their PSDs and partial ACFs]
Top: original Middle: first AR(2) model Bottom: second AR(2) model
Appendix 2: Model order for an AR(2) process
An AR(2) signal, its ACF, and its partial autocorrelations (PAC)
[Figure: an AR(2) signal realisation and its ACF (top row); the partial ACF and the Burg PSD estimate (bottom row)]
After lag k = 2, the PAC becomes very small (broken line conf. int.)
Appendix 3: More on over–parametrisation
Consider the linear stochastic process given by
x[n] = x[n − 1] − 0.16x[n − 2] + w[n] − 0.8w[n − 1]
It clearly has an ARMA(2,1) form. Consider now its coefficient vectors
written as polynomials in the z–domain:
$$a(z) = 1 - z + 0.16z^2 = (1 - 0.8z)(1 - 0.2z)$$
$$b(z) = 1 - 0.8z$$
These polynomials clearly have a common factor (1 − 0.8z), and therefore
after cancelling these terms, we have the resulting lower–order polynomials:
a(z) = 1 − 0.2z
b(z) = 1
The above process is therefore an AR(1) process, given by
x[n] = 0.2x[n − 1] + w[n]
and the original ARMA(2,1) version was over–parametrised.
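The common factor can be exposed numerically (sketch):

za = roots([0.16 -1 1])  % zeros of a(z) = 1 - z + 0.16z^2  ->  5 and 1.25
zb = roots([-0.8 1])     % zero  of b(z) = 1 - 0.8z         ->  1.25
% the shared root z = 1.25, i.e. the factor (1 - 0.8z), cancels,
% leaving a(z) = 1 - 0.2z and b(z) = 1: an AR(1) process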
Something to think about ...
◦ What would be the properties of a multivariate (MV) autoregressive,
say MVAR(1), process, where the signals w, x become vectors and the
coefficient a becomes a matrix, that is, $\mathbf{x}[n] = \mathbf{A}\,\mathbf{x}[n-1] + \mathbf{w}[n]$?
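A minimal simulation sketch (the 2×2 matrix A below is an arbitrary stable example, with all eigenvalues inside the unit circle):

A = [0.5 0.2; -0.1 0.6];           % assumed coefficient matrix
N = 1000; x = zeros(2,N); w = randn(2,N);
for n = 2:N
    x(:,n) = A*x(:,n-1) + w(:,n);  % vector AR(1) recursion
end
abs(eig(A))                        % stability check: both < 1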
Consider also: Fourier transform as a filtering operation
We can see FT as a convolution of a complex exponential and the data (under a
mild assumption of a one-sided h sequence, ranging from 0 to ∞)
1) Continuous FT. For a continuous FT, $F(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$.

Let us now swap variables t → τ and multiply by $e^{j\omega t}$, to give
$$e^{j\omega t} \int x(\tau)\, e^{-j\omega\tau}\, d\tau = \int x(\tau)\, \underbrace{e^{j\omega(t-\tau)}}_{h(t-\tau)}\, d\tau = x(t) * e^{j\omega t} \;\;(= x(t) * h(t))$$

with the transfer function (for large N)
$$H(z) = \frac{1}{1 - W z^{-1}} = \frac{1 - W^* z^{-1}}{1 - 2\cos(\theta_k)\, z^{-1} + z^{-2}}, \qquad W = e^{j\theta_k}$$
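The filtering view can be verified numerically: running the data through the one-pole filter 1/(1 − W z⁻¹) and sampling the output at n = N yields the k-th DFT coefficient (a sketch of the first-order complex form; Matlab's goertzel function implements the equivalent second-order real recursion above):

N = 64; k = 5; x = randn(N,1);
W  = exp(1j*2*pi*k/N);      % complex exponential at theta_k = 2*pi*k/N
y  = filter(1, [1 -W], x);  % one-pole filter: y[n] = W*y[n-1] + x[n]
Xk = W * y(N);              % output sampled at n = N
X  = fft(x);
abs(Xk - X(k+1))            % ~1e-14: matches the k-th DFT bin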