DSP II - Up To 28th Nov

The document outlines the course content for Digital Signal Processing II (EEE 431), focusing on advanced topics such as spectral estimation, adaptive filters, and multirate signal processing. It emphasizes the importance of Power Spectral Density (PSD) in various applications including telecommunications, audio processing, and structural analysis. The document also discusses challenges in power spectrum estimation and introduces classical methods like the periodogram for analyzing random signals.


Digital Signal Processing II

(EEE 431)

Prof. Md. Kamrul Hasan


Dept. of Electrical & Electronic Engg.
BUET

28-Nov-23
Why Study DSP-II
In the Continuous-time Signals and Linear Systems and DSP-I courses you have mostly learned mathematical tools, but few solid techniques for solving real-world problems.

You will get some flavor and taste of DSP!
Course Contents
- Spectral Estimation (Hayes, Proakis)
  - Classical methods
  - High-resolution and super-resolution methods
  - Applications, e.g., speech enhancement, ECG/PPG analysis
- Adaptive Filters (S. Haykin)
  - Wiener-Hopf equation
  - LMS filters, VSS-LMS, MC-LMS, VSS-MC-FLMS, c-MC-LMS
  - RLS filters
  - Applications, e.g., echo cancellation, blind channel estimation, HR estimation from noisy PPG
- Multirate Signal Processing with Applications (Mitra, Ifeachor)
  - Sampling rate conversion
  - Perfect reconstruction filter bank
- Non-stationary Signal Decomposition Techniques (class)
  - Filter bank and wavelet transform
  - EMD
  - Applications

EEE 431: Advanced Digital Signal Processing. Spectral estimation of random processes: classical methods, model-based high-resolution methods, super-resolution techniques, spectral estimation in noisy conditions, applications: estimation of sinusoids in noise, speech enhancement, quantitative tissue characterization, ECG/PPG signal analysis; adaptive filters: single and multichannel filters, non-blind and blind techniques, time- and frequency-domain approaches, convergence analysis, variable-step-size single and multichannel LMS filters, constrained adaptive filters, applications in noise and echo cancellation; multi-rate signal processing with applications; non-stationary signal decomposition techniques and their applications.
What is Spectrum Estimation?
What exactly does it mean, and why is it important in different fields?

- Power Spectral Density (PSD) refers to the distribution of a signal's power over frequency.
- PSD tells us how much power is contained in a given frequency band, allowing us to better understand the characteristics of a signal.
- When analyzing a signal, it is often helpful to break it down into its frequency components -> PSD.
- By calculating the PSD of a signal, we can gain insight into the relative strength of different frequency components.

PSD application areas, e.g.:
- Telecommunications: optimize transmission systems, reduce interference/reverberation
- Audio and acoustics: analyze sound waves and design acoustic systems
- Vibration and structural analysis: study the behavior of structures under different loads, pile health
- Speech processing: enhance noisy speech
- Image processing: analyze and manipulate digital images

Overall, PSD is a powerful tool that allows us to better understand the characteristics of a signal and make informed decisions about how to process or analyze it.
Why called power density spectrum?
Wiener-Khintchine theorem: the PSD of a WSS process is the DTFT of its ACF, and the ideal ACF is recovered by the inverse transform:

P_x(e^{j\omega}) = \sum_{k=-\infty}^{\infty} r_x(k)\, e^{-j\omega k}

r_x(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_x(e^{j\omega})\, e^{j\omega k}\, d\omega   (ideal ACF)

For k = 0, Parseval's relation gives

r_x(0) = E\{|x(n)|^2\} = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_x(e^{j\omega})\, d\omega   [unit: W/Hz]

so the area under P_x(e^{j\omega}) is the total signal power. Thus P_x(e^{j\omega}) denotes the distribution of signal power as a function of frequency, hence the name power density spectrum.
Applications of Power Spectral Density
Power spectral density (PSD) is a mathematical tool used to analyze the frequency content of a signal. It is defined as the distribution of power across different frequencies, and it has a wide range of applications across various fields.

Communications and Signal Processing
PSD can be used to determine the optimal bandwidth for a communication system or to analyze the frequency content of a noisy signal. In telecommunications, PSD is used to assess the quality of a signal, and it is a crucial factor in the design of communication systems.
- Examples: channel bandwidth, baseband/passband signals, SNR, THD, speech/Wiener enhancement, HR estimation
Applications of power spectral density

Audio and acoustics
PSD is often utilized in audio and acoustics applications, such as determining the frequency response of a microphone or analyzing the spectral content of music. In the music industry, PSD is used to analyze the frequency content of a song, which can help in the mixing and mastering process. It is also a key player in areas like noise reduction and speech enhancement. In the field of acoustics, PSD is used to analyze the sound pressure level and frequency response of a room or building, which can help in the design of sound systems and the mitigation of noise pollution.

Vibration and structural analysis
PSD is valuable in understanding the behavior of mechanical systems, particularly in the context of vibration analysis. It can be used to identify resonant frequencies, determine the amount of damping in a system, and more. In the automotive industry, PSD is used to analyze the vibration of a car's engine and chassis, which can help in the design of a more comfortable and efficient vehicle. In the field of civil engineering, PSD is used to analyze the vibration of buildings and bridges, which can help in the design of more resilient structures.
Applications of power spectral density

Image processing and computer vision
In image processing and computer vision applications, PSD can be used to analyze the frequency content of an image or to filter out unwanted noise. In the field of medical imaging, PSD is used to analyze the frequency content of an MRI or CT scan, which can help in the diagnosis of diseases. In the field of robotics, PSD is used to analyze the frequency content of sensor data, which can help in the navigation and control of robots.

Power spectrum of a decomposed signal
Wavelet/EMD transform: this method involves decomposing a signal into a set of wavelets, which are functions that oscillate at different frequencies and scales. The power spectrum can then be calculated for each wavelet/IMF, providing a time-frequency representation of the signal.
Review of Linear and Matrix Algebra

Vector differentiation of a scalar c(\mathbf{a}), where \mathbf{a} = [a_1, a_2, \ldots, a_p]^T:

\frac{\partial c}{\partial \mathbf{a}} = \left[ \frac{\partial c}{\partial a_1}, \frac{\partial c}{\partial a_2}, \ldots, \frac{\partial c}{\partial a_p} \right]^T

Vector differentiation of a vector \mathbf{c} = [c_1, c_2, \ldots, c_m]^T:

\frac{\partial \mathbf{c}^T}{\partial \mathbf{a}} =
\begin{bmatrix}
\partial c_1/\partial a_1 & \partial c_2/\partial a_1 & \cdots & \partial c_m/\partial a_1 \\
\vdots & \vdots & & \vdots \\
\partial c_1/\partial a_p & \partial c_2/\partial a_p & \cdots & \partial c_m/\partial a_p
\end{bmatrix}

Differentiation of a matrix-vector product:

\frac{\partial}{\partial \mathbf{a}} (R\mathbf{a}) = R^T, \qquad \frac{\partial}{\partial \mathbf{a}} (\mathbf{a}^T R) = R

\frac{\partial}{\partial \mathbf{a}} (\mathbf{a}^T R \mathbf{a}) = R\mathbf{a} + R^T\mathbf{a} = (R + R^T)\mathbf{a}

For a symmetric matrix R:

\frac{\partial}{\partial \mathbf{a}} (\mathbf{a}^T R \mathbf{a}) = 2R\mathbf{a}
Note: Ch. 2, Modern Spectral Estimation by S. M. Kay
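The quadratic-form gradient identity above can be checked numerically. The sketch below (the test matrix R and vector a are arbitrary values chosen for the demo, not from the slides) compares a central finite-difference gradient of a^T R a against (R + R^T) a.

```python
import numpy as np

# Numerical check of d(a^T R a)/da = (R + R^T) a.
# R and a are arbitrary test values chosen for this demo.
rng = np.random.default_rng(0)
p = 4
R = rng.standard_normal((p, p))   # deliberately non-symmetric
a = rng.standard_normal(p)

def quad(v):
    """Scalar quadratic form c(a) = a^T R a."""
    return v @ R @ v

eps = 1e-6
num_grad = np.array([(quad(a + eps * e) - quad(a - eps * e)) / (2 * eps)
                     for e in np.eye(p)])
analytic_grad = (R + R.T) @ a
```

Because the form is quadratic, the central difference is exact up to rounding, so the two gradients agree to high precision.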


Power Spectrum Estimation

Why is it still a challenge?
- The data is random and its length is finite
  - Often only one member of the ensemble is available for spectral characterization of the random process
  - The data collection process is of short duration, e.g., an earthquake record
- The measurement noise corrupts the data

Broad classification of estimation techniques
- Nonparametric methods (classical methods)
- Parametric methods (model-based methods)
- Super-resolution methods (eigenvalue-based methods)

See books: Lim & Oppenheim Ch. 2, Hayes Ch. 8, Porat Ch. 13, Proakis Ch. 14, Kay.
Non-Parametric Classical Methods
Indirect method: a two-step process
- Estimate the ACF of a WSS process from a set of measured data
  - The ACF of a WSS process is the appropriate statistical average to characterize a random signal in the time domain
- Take the FT of the ACF to obtain an estimate of the PSD

Methods:
- Periodogram
- Modified periodogram
- Bartlett method
- Welch method
- Blackman-Tukey method
Introduction: Review of Statistics
[Slides 12-18: figure and equation content not recoverable from the source.]
Ergodic Signals
- Ensemble averages can be replaced by time averages.

Stationary Signals
- The joint PDF does not change over time, i.e., it is invariant to a time-shift.
The Power Spectrum of Random Signals
Treating random signals in the same way as deterministic signals, the FT of any sample function x(n) is

X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n}

Will it also be a frequency-domain representation of the random process x(n)?

NOT possible, for at least TWO reasons:
1. The FT will be a random variable over the ensemble (for any fixed \omega), since it will have a different value for each member of the ensemble of possible sample functions. Hence it cannot be a frequency representation of the process but only of one member of the process. This can be overcome by taking the expected value (mean) over the ensemble.
2. A function is Fourier transformable if it is absolutely summable, i.e., \sum_n |x(n)| < \infty. If x(n) is a WSS random process, then it exists for all n, so this condition cannot be satisfied by any non-zero sample function from a WSS random process.
Summary: PSD of a Random Process

P_x(e^{j\omega}) = \lim_{N\to\infty} E\left\{ \frac{1}{2N+1} \left| \sum_{n=-N}^{N} x(n)\, e^{-j\omega n} \right|^2 \right\}

- Frequency-domain representation of the random process x(n)

In practice, we have one set of finite-duration data, which raises two practical problems:
- We cannot take the expected value
- We cannot take the limit
The periodogram is a method that ignores them both!
The Periodogram
- Assumes that the observed signal is a truncated version of the originally intended signal x(n)
- Truncates the sample function to a finite duration so that FT techniques can be used:

x_N(n) = x(n), |n| \le N;  x_N(n) = 0, |n| > N   (finite duration)

- Asymptotically unbiased estimator of the PSD
Classical Methods: The Periodogram
The ideal power spectrum of a WSS random process x(n) is the DTFT of the ACF:

P_x(e^{j\omega}) = \sum_{k=-\infty}^{\infty} r_x(k)\, e^{-j\omega k},
r_x(k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_x(e^{j\omega})\, e^{j\omega k}\, d\omega   (ideal ACF)

For an ergodic process,

r_x(k) = E\{x(n+k)\, x^*(n)\} = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n+k)\, x^*(n)

- This is an average computed over the infinite interval.
- A small window is to be used so that the stationarity assumption holds, allowing the ACF and PS to be an FT pair.
Estimate of the Periodogram
Given a finite sequence x(n), n = 0, 1, ..., N-1.

1st definition (DTFT of the estimated ACF):

\hat{P}_{per}(e^{j\omega}) = \sum_{k=-(N-1)}^{N-1} \hat{r}_x(k)\, e^{-j\omega k}

Unbiased estimate of the ACF:

\hat{r}_x(k) = \frac{1}{N-k} \sum_{n=0}^{N-1-k} x(n+k)\, x^*(n), \quad k = 0, 1, \ldots, N-1

(the upper limit N-1-k excludes values of x(n) outside [0, N-1]).

Symmetry: \hat{r}_x(-k) = \hat{r}_x^*(k); \hat{r}_x(k) = 0 for |k| \ge N.
Periodogram (Cont.)
Biased estimate of the ACF:

\hat{r}_x(k) = \frac{1}{N} \sum_{n=0}^{N-1-k} x(n+k)\, x^*(n), \quad k = 0, 1, \ldots, N-1

- More reliable than the unbiased one, because it assigns lower weight to the poorer estimates at long correlation lags (the weights are at least non-increasing)
- Guarantees positive semi-definiteness of the correlation matrix R_x
- The unbiased estimate may lead to negative values of the PS
- Periodogram = correlogram for the biased estimate
Periodogram (Cont.)
Let x_N(n) be a finite sequence related to x(n) through the rectangular window w_R(n):

x_N(n) = w_R(n)\, x(n)

Then the estimated (biased) ACF is

\hat{r}_x(k) = \frac{1}{N} \sum_{n=-\infty}^{\infty} x_N(n+k)\, x_N^*(n) = \frac{1}{N}\, x_N(k) * x_N^*(-k)

Taking the DTFT,

\hat{P}_{per}(e^{j\omega}) = \frac{1}{N} X_N(e^{j\omega})\, X_N^*(e^{j\omega}) = \frac{1}{N} |X_N(e^{j\omega})|^2

where

X_N(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x_N(n)\, e^{-j\omega n} = \sum_{n=0}^{N-1} x(n)\, e^{-j\omega n}

DFT implementation: x_N(n) \to X_N(k), and

\hat{P}_{per}(e^{j 2\pi k/N}) = \frac{1}{N} |X_N(k)|^2
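The equality of the direct form (1/N)|X_N(e^{jω})|² and the DTFT of the biased ACF can be checked numerically. A sketch (the test data and the zero-padded evaluation grid are arbitrary choices):

```python
import numpy as np

# Periodogram two ways: (1/N)|FFT|^2 versus the DTFT of the biased ACF.
rng = np.random.default_rng(1)
N = 64
x = rng.standard_normal(N)

# Direct form, zero-padded to a 2N-point frequency grid
P_direct = np.abs(np.fft.fft(x, 2 * N)) ** 2 / N

# Correlogram form: biased ACF r(k) = (1/N) sum_n x(n+k) x(n), |k| <= N-1
r = np.correlate(x, x, mode="full") / N          # lags -(N-1) .. N-1
k = np.arange(-(N - 1), N)
w = 2 * np.pi * np.arange(2 * N) / (2 * N)
P_corr = np.real(np.exp(-1j * np.outer(w, k)) @ r)
```

The two arrays agree to machine precision at every grid frequency, which is exactly the "periodogram = correlogram" statement for the biased ACF estimate.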
Periodogram (Cont.)
2nd definition: given a finite random sequence x(n), n = 0, 1, ..., N-1,

\hat{P}_{per}(e^{j\omega}) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n)\, e^{-j\omega n} \right|^2

Compare with the definition of the PSD of a random process: drop the "lim" and the "E{.}" to get the periodogram.

- Used by Schuster (~1900) to determine "hidden periodicities" [hence the name].
Statistical Properties of the Periodogram: Performance Analysis
1. An estimator, e.g., the periodogram, is statistically unbiased if E\{\hat{P}(e^{j\omega})\} = P_x(e^{j\omega}) (the true PSD).
2. It is asymptotically unbiased if \lim_{N\to\infty} E\{\hat{P}(e^{j\omega})\} = P_x(e^{j\omega}).
3. It is consistent if it is asymptotically unbiased and \lim_{N\to\infty} \mathrm{Var}\{\hat{P}(e^{j\omega})\} = 0.
4. Resolution of a PSD estimator: the minimum frequency separation at which two narrowband components can still be distinguished.

Check 1-4 for the periodogram.
Statistical Property of the Biased ACF

E\{\hat{r}_x(k)\} = \frac{1}{N} \sum_{n=0}^{N-1-|k|} E\{x(n+k)\, x^*(n)\} = \frac{N-|k|}{N}\, r_x(k)

Now, \lim_{N\to\infty} E\{\hat{r}_x(k)\} = r_x(k).
- So the biased ACF estimator is asymptotically unbiased.
Performance of the Periodogram: Bias

\hat{P}_{per}(e^{j\omega}) = \sum_{k=-(N-1)}^{N-1} \hat{r}_x(k)\, e^{-j\omega k}

E\{\hat{r}_x(k)\} = \frac{1}{N} \sum_{n=0}^{N-1-k} E\{x(n+k)\, x^*(n)\} = \frac{N-k}{N}\, r_x(k), \quad 0 \le k < N

The expected value is zero for |k| \ge N. Then we can write, for all k,

E\{\hat{r}_x(k)\} = w_B(k)\, r_x(k)

where

w_B(k) = \frac{N-|k|}{N} for |k| \le N, and w_B(k) = 0 for |k| > N

is a Bartlett (triangular) window. Therefore \hat{r}_x(k) is a biased estimate of the ACF: E\{\hat{r}_x(k)\} \ne r_x(k).
Performance of the Periodogram: Bias (Cont.)

E\{\hat{P}_{per}(e^{j\omega})\} = E\left\{ \sum_{k=-(N-1)}^{N-1} \hat{r}_x(k)\, e^{-j\omega k} \right\} = \sum_{k=-(N-1)}^{N-1} E\{\hat{r}_x(k)\}\, e^{-j\omega k} = \sum_{k=-\infty}^{\infty} r_x(k)\, w_B(k)\, e^{-j\omega k}

Using the frequency convolution theorem,

E\{\hat{P}_{per}(e^{j\omega})\} = \frac{1}{2\pi}\, P_x(e^{j\omega}) \circledast W_B(e^{j\omega})   (periodic convolution)

where the FT of the Bartlett window is

W_B(e^{j\omega}) = \frac{1}{N} \left[ \frac{\sin(N\omega/2)}{\sin(\omega/2)} \right]^2

Note: the bias vanishes as N \to \infty, because W_B(e^{j\omega}) then becomes an impulse.
Performance of the Periodogram: Bias (Cont.)
The Bartlett window can be obtained from the rectangular window as

w_B(k) = \frac{1}{N}\, w_R(k) * w_R(-k)

and the length of w_B(k) is 2N - 1. So the periodogram is a biased estimator, since the expected value of \hat{P}_{per}(e^{j\omega}) is not equal to P_x(e^{j\omega}). However, since E\{\hat{P}_{per}(e^{j\omega})\} \to P_x(e^{j\omega}) as N \to \infty, the periodogram is asymptotically unbiased.
Performance of the Periodogram: Variance
For a Gaussian white noise process v(n) with variance \sigma_v^2,

\mathrm{Var}\{\hat{P}_{per}(e^{j\omega})\} = \sigma_v^4   [a constant, independent of N]

Thus the variance does not go to zero as N \to \infty, and the periodogram is NOT a consistent estimate of the power spectrum. For sequences other than Gaussian white noise, e.g., a regular Gaussian random process x(n),

\mathrm{Var}\{\hat{P}_{per}(e^{j\omega})\} \approx P_x^2(e^{j\omega}) \left[ 1 + \left( \frac{\sin N\omega}{N \sin\omega} \right)^2 \right]

Example: calculate the periodogram of a unit-variance white noise sequence x(n) and comment on the performance of the estimator.
Example 8.2.1 (Hayes): Periodogram of White Noise
What is the periodogram of unit-variance white noise? (Homework)

Here r_x(k) = \sigma_x^2\, \delta(k), so P_x(e^{j\omega}) = \sigma_x^2 and

E\{\hat{P}_{per}(e^{j\omega})\} = \sum_k w_B(k)\, r_x(k)\, e^{-j\omega k} = w_B(0)\, \sigma_x^2 = \sigma_x^2

- So the periodogram is unbiased for white noise (and asymptotically unbiased in general).

\mathrm{Var}\{\hat{P}_{per}(e^{j\omega})\} = \sigma_x^4

- The variance is constant and independent of N. Plot for different N and take the average.
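A Monte Carlo sketch of this example (the bin choice and trial count are arbitrary): the periodogram of unit-variance white noise has mean near σ² = 1, while its variance stays near 1 whatever the record length N, illustrating the inconsistency.

```python
import numpy as np

# Periodogram of unit-variance white noise: at an interior frequency bin
# the mean stays near 1 and the variance stays near 1, independent of N.
rng = np.random.default_rng(2)
trials = 2000

def periodogram(x):
    return np.abs(np.fft.fft(x)) ** 2 / len(x)

stats = {}
for N in (64, 256):
    P = np.array([periodogram(rng.standard_normal(N))[N // 4]
                  for _ in range(trials)])
    stats[N] = (P.mean(), P.var())
```

Quadrupling N leaves the variance essentially unchanged; only averaging over realizations (or segments, as in the later methods) reduces it.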
Performance of the Periodogram: Resolution

E\{\hat{P}_{per}(e^{j\omega})\} = \frac{1}{2\pi}\, P_x(e^{j\omega}) \circledast W_B(e^{j\omega})   (periodic convolution)

The width of the main lobe of W_B(e^{j\omega}) increases as the data record length decreases. For a data record of length N, there is a limit on how closely two sinusoids or two narrowband processes may be located before they can no longer be resolved.

One way to define this resolution limit is the width of the main lobe of the spectral window W_B(e^{j\omega}) at its half-power (6 dB) point, which is equivalent to the 3-dB bandwidth of the data window:

\mathrm{Res}[\hat{P}_{per}(e^{j\omega})] = \Delta\omega = 0.89 \frac{2\pi}{N}   (rule of thumb, 3 dB / half power)
Assignment: Determine the Mean of the Periodogram
Consider a random-phase sinusoid:

x(n) = A \sin(n\omega_0 + \phi), \quad \phi \sim U[-\pi, \pi]

True ACF:

r_x(k) = \frac{A^2}{2} \cos(k\omega_0)

True PSD:

P_x(e^{j\omega}) = \frac{\pi A^2}{2} \left[ \delta(\omega - \omega_0) + \delta(\omega + \omega_0) \right]

Now,

E\{\hat{P}_{per}(e^{j\omega})\} = \frac{1}{2\pi}\, P_x(e^{j\omega}) \circledast W_B(e^{j\omega}) = \frac{A^2}{4} \left[ W_B(e^{j(\omega-\omega_0)}) + W_B(e^{j(\omega+\omega_0)}) \right]

Note: as N \to \infty, each W_B term tends to an impulse and the mean tends to the true PSD.
Example: What is the Minimum N to Resolve?
Consider a random signal of TWO sinusoids in white noise of power \sigma_v^2:

x(n) = A_1 \sin(n\omega_1 + \phi_1) + A_2 \sin(n\omega_2 + \phi_2) + v(n)

Plot \hat{P}_{per}(e^{j\omega}) for the given values of N and compare with the resolution limit \Delta\omega = 0.89\, (2\pi/N).
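A sketch of the experiment (the frequencies 0.20 and 0.22 cycles/sample are example values, not from the slides): by the 0.89·2π/N rule, a separation Δf = 0.02 needs roughly N ≳ 0.89/0.02 ≈ 45 samples before the periodogram shows two distinct peaks.

```python
import numpy as np

# Two sinusoids separated by df = 0.02 cycles/sample: count periodogram
# peaks above half the maximum to decide whether they are resolved.
f1, f2 = 0.20, 0.22

def resolved(N, nfft=4096):
    n = np.arange(N)
    x = np.sin(2 * np.pi * f1 * n) + np.sin(2 * np.pi * f2 * n)
    P = np.abs(np.fft.fft(x, nfft)) ** 2 / N    # zero-padded periodogram
    band = P[: nfft // 2]                       # positive frequencies only
    peaks = [i for i in range(1, len(band) - 1)
             if band[i - 1] < band[i] > band[i + 1]
             and band[i] > 0.5 * band.max()]
    return len(peaks) >= 2
```

With N well above the rule-of-thumb threshold the two peaks separate; with N well below it, only one merged lobe remains (zero-padding interpolates the spectrum but does not improve the resolution).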
Modified Periodogram
The periodogram implicitly applies a rectangular data window:

\hat{P}_{per}(e^{j\omega}) = \frac{1}{N} \left| \sum_{n} x(n)\, w_R(n)\, e^{-j\omega n} \right|^2

It is BIASED, since E\{\hat{P}_{per}(e^{j\omega})\} = \frac{1}{2\pi} P_x(e^{j\omega}) \circledast W_B(e^{j\omega}).
- We would like an UNBIASED estimate, but with finite data that is IMPOSSIBLE!
- Other window functions, e.g., Bartlett, Hamming, and Hanning, may be used to obtain the modified periodogram and reduce side-lobe leakage.

Assignment: compare the performance of the periodogram and the modified periodogram using a suitable example. Check the resolution and variance for different window functions.
The Modified Periodogram: Bias
With a general data window w(n), the modified periodogram is

\hat{P}_M(e^{j\omega}) = \frac{1}{NU} \left| \sum_{n=0}^{N-1} x(n)\, w(n)\, e^{-j\omega n} \right|^2, \quad U = \frac{1}{N} \sum_{n=0}^{N-1} |w(n)|^2

where U normalizes the window power. Its expected value is

E\{\hat{P}_M(e^{j\omega})\} = \frac{1}{2\pi N U}\, P_x(e^{j\omega}) \circledast |W(e^{j\omega})|^2

where W(e^{j\omega}) is the FT of the data window, and |W(e^{j\omega})|^2 — the PS of the window, i.e., the DTFT of the autocorrelation of the data window — plays the role of the spectral window. The estimate is therefore still biased.

Window property: \frac{1}{2\pi}\int_{-\pi}^{\pi} |W(e^{j\omega})|^2 d\omega = \sum_n |w(n)|^2 = NU, so \frac{1}{NU}|W(e^{j\omega})|^2 behaves like a unit-area impulse as N \to \infty.

- Hence the modified periodogram is asymptotically unbiased.
Modified Periodogram (Cont.)
Variance:

\mathrm{Var}\{\hat{P}_M(e^{j\omega})\} \approx P_x^2(e^{j\omega})

Resolution: window dependent; for the Bartlett data window the main lobe is wider, so the resolution is worse than that of the (rectangular-window) periodogram.

Then what is the benefit?
- Provides a trade-off between spectral resolution (main-lobe width) and spectral masking (side-lobe amplitude)
- Variance of the modified periodogram \approx variance of the periodogram
- Reduces the side-lobe-induced bias, but the estimate is still biased
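A sketch of a modified periodogram with the window-power normalization U = (1/N)Σw²(n) (the Hamming window and the trial count are example choices); with this normalization, the white-noise PSD level is preserved.

```python
import numpy as np

# Modified periodogram: window the data, normalize by U = (1/N) sum w^2
# so that a white-noise PSD level sigma^2 is preserved.
def modified_periodogram(x, window=np.hamming, nfft=None):
    N = len(x)
    w = window(N)
    U = np.sum(w ** 2) / N          # window power normalization
    nfft = nfft or N
    return np.abs(np.fft.fft(x * w, nfft)) ** 2 / (N * U)

# Monte Carlo check on unit-variance white noise at an interior bin
rng = np.random.default_rng(3)
N = 128
P_mean = np.mean([modified_periodogram(rng.standard_normal(N))[N // 4]
                  for _ in range(2000)])
```

Without the 1/U factor the windowed estimate would be biased low by the window's power loss; with it, the average level comes out at σ² = 1.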
PSD of Two Sinusoids in Noise
- Without a data window, the weaker sinusoid is almost masked by the side lobes; with a data window (e.g., Hamming), it becomes visible.

Thumb rule:
- Use a data window if the frequency separation is small (below about ... cycles/sample).
- The rectangular window is optimal if the frequency separation is large (above about ... cycles/sample).
Periodogram – Viewed as Filter Bank
Although we ALWAYS implement the periodogram using the DFT, it is helpful to interpret it as a filter bank. Define the impulse response of an FIR filter as a rectangular window modulated to the centre frequency \omega_i:

h_i(n) = \frac{1}{N}\, w_R(n)\, e^{j n \omega_i}, \quad 0 \le n \le N-1

The frequency response of this filter is

H_i(e^{j\omega}) = \frac{1}{N}\, W_R(e^{j(\omega-\omega_i)}), \quad W_R(e^{j\omega}) = e^{-j\omega(N-1)/2}\, \frac{\sin(N\omega/2)}{\sin(\omega/2)}

Now the output of the i-th (causal, finite) filter is

y_i(n) = \sum_{k=0}^{N-1} h_i(k)\, x(n-k) = \frac{1}{N} \sum_{k=0}^{N-1} x(n-k)\, e^{jk\omega_i}

and the periodogram value at \omega_i is N\, |y_i(N-1)|^2.
Improving the Spectral Estimate: Bartlett's Method
- Reduce the variance of the periodogram by averaging a number of different periodograms
- The variance of the periodogram estimator does not improve with greater lengths of data

Bartlett's spectrum estimator — to reduce the variance of the periodogram:
- Sub-divide the observed data record into K non-overlapping segments
- Find the periodogram of each segment
- Finally, evaluate the average of the periodograms so obtained
Bartlett's Method
Divide the N-point data record into K non-overlapping segments of length L (data samples N = KL):

x_i(n) = x(n + (i-1)L), \quad n = 0, \ldots, L-1, \quad i = 1, 2, \ldots, K

Periodogram of the i-th segment:

\hat{P}_{per}^{(i)}(e^{j\omega}) = \frac{1}{L} \left| \sum_{n=0}^{L-1} x_i(n)\, e^{-j\omega n} \right|^2

The Bartlett estimate is

\hat{P}_B(e^{j\omega}) = \frac{1}{K} \sum_{i=1}^{K} \hat{P}_{per}^{(i)}(e^{j\omega})
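Bartlett's estimator can be sketched as below (the segment length, bin, and trial count are arbitrary example values); averaging K = N/L segment periodograms cuts the white-noise variance by roughly a factor of K.

```python
import numpy as np

# Bartlett's method: average the periodograms of K non-overlapping
# length-L segments of the N-point record.
def bartlett_psd(x, L):
    K = len(x) // L
    segs = x[: K * L].reshape(K, L)
    return (np.abs(np.fft.fft(segs, axis=1)) ** 2 / L).mean(axis=0)

# Monte Carlo on unit-variance white noise: N = 512, L = 64 (K = 8)
rng = np.random.default_rng(4)
vals = np.array([bartlett_psd(rng.standard_normal(512), 64)[16]
                 for _ in range(400)])
```

The plain periodogram of the same data would have variance near 1 at this bin; the K = 8 average brings it down to roughly 1/8 while the mean stays at σ² = 1.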
Bartlett's Method: Statistical Properties
BIAS:

E\{\hat{P}_B(e^{j\omega})\} = \frac{1}{K} \sum_{i=1}^{K} E\{\hat{P}_{per}^{(i)}(e^{j\omega})\} = E\{\hat{P}_{per}^{(i)}(e^{j\omega})\}

(adding K times the same expected value and then dividing by K gives that value back). Now,

E\{\hat{P}_B(e^{j\omega})\} = \frac{1}{2\pi}\, P_x(e^{j\omega}) \circledast W_B(e^{j\omega})

which has the same form as for the periodogram (with a length-L Bartlett window; whether the data length is N or L makes no difference to the asymptotic argument).

- So the Bartlett estimator is asymptotically unbiased.

Variance: the variance is inversely proportional to K. Assuming that the data segments are approximately uncorrelated, and for large N,

\mathrm{Var}\{\hat{P}_B(e^{j\omega})\} \approx \frac{1}{K}\, P_x^2(e^{j\omega})
Bartlett's Method: Statistical Properties (Cont.)
Thus if both K and L are allowed to go to infinity as N \to \infty,

\mathrm{Var}\{\hat{P}_B(e^{j\omega})\} \to 0

- So the Bartlett estimator is a consistent estimator, but the cost is paid in resolution.

Resolution: since the periodograms used in \hat{P}_B are computed from sequences of length L,

\mathrm{Res}[\hat{P}_B(e^{j\omega})] = 0.89 \frac{2\pi}{L} = 0.89\, K\, \frac{2\pi}{N}

Thus the resolution of Bartlett's method is K times larger (worse) than the resolution of the periodogram, 0.89\,(2\pi/N).
Homework
- Consider x(n) to be white noise, with N = 512 points. Find the Bartlett estimate for K = 1 (L = 512), K = 4 (L = 128), and K = 16 (L = 32). Overlay 50 realizations and average. Compare and comment.
- Overlay plots of 50 periodograms for the given signal. Compare and comment.
The Welch Method
In 1967, Welch proposed TWO modifications to Bartlett's method:
- Allow the segments to overlap (usually 50% to 75%)
- Apply a data window to each segment (i.e., use modified periodograms)

The modified periodogram of the i-th segment, offset by D samples (say D = L/2 for 50% overlap), is

\hat{P}_M^{(i)}(e^{j\omega}) = \frac{1}{LU} \left| \sum_{n=0}^{L-1} x(n + iD)\, w(n)\, e^{-j\omega n} \right|^2

The Welch spectrum is

\hat{P}_W(e^{j\omega}) = \frac{1}{K} \sum_{i} \hat{P}_M^{(i)}(e^{j\omega})

Let the length of each segment be L and the offset of successive sequences be D samples. Then

K = \frac{N - L}{D} + 1

is the number of sequences that cover the entire N data points.
- With no overlap, D = L and there are K = N/L sections of length L.
- For 50% overlap, D = L/2 and there are K = 2N/L - 1 sections of length L.
Welch's Method: Statistical Properties
BIAS:

E\{\hat{P}_W(e^{j\omega})\} = E\{\hat{P}_M^{(i)}(e^{j\omega})\} = \frac{1}{2\pi L U}\, P_x(e^{j\omega}) \circledast |W(e^{j\omega})|^2

(adding K times the same expected value and dividing by K gives that value back; the same spectral-window argument as for the modified periodogram applies whether the length is N or L).

- So the Welch estimator is asymptotically unbiased.

Variance: with a Bartlett data window, 50% overlap, and K segments,

\mathrm{Var}\{\hat{P}_W(e^{j\omega})\} \approx \frac{9}{8} \frac{1}{K}\, P_x^2(e^{j\omega})

Gain:
- At the same number of segments K, the variance of Welch's method is 9/8 times that of Bartlett's method, but with 50% overlap each segment can be roughly twice as long (2L), giving better resolution.
- For fixed N and the same resolution (same L), 50% overlap gives K = 2N/L - 1 \approx 2N/L segments, so

\mathrm{Var}\{\hat{P}_W(e^{j\omega})\} \approx \frac{9}{16} \frac{L}{N}\, P_x^2(e^{j\omega})

which is roughly half the variance of Bartlett's method.

Example: repeat the previous example for 50% and 75% OVERLAP.

Resolution: since the modified periodograms used in \hat{P}_W are computed from sequences of length L, the resolution is window dependent and proportional to 2\pi/L (wider for the Bartlett data window than for the rectangular one).
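A sketch of Welch's estimator (the Hann window, 50% overlap, and the test lengths are example choices), using the segment count K = (N − L)/D + 1:

```python
import numpy as np

# Welch's method: windowed, overlapping modified periodograms, averaged.
def welch_psd(x, L, D, window=np.hanning):
    w = window(L)
    U = np.sum(w ** 2) / L                 # window power normalization
    starts = range(0, len(x) - L + 1, D)
    P = [np.abs(np.fft.fft(x[s:s + L] * w)) ** 2 / (L * U) for s in starts]
    return np.mean(P, axis=0), len(P)

# Unit-variance white noise: N = 512, L = 64, 50% overlap (D = 32)
rng = np.random.default_rng(5)
vals = np.array([welch_psd(rng.standard_normal(512), 64, 32)[0][16]
                 for _ in range(300)])
_, K = welch_psd(rng.standard_normal(512), 64, 32)
```

Overlapping segments are correlated, so the variance does not fall by the full factor K, but it is still far below the single-periodogram level.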
Blackman-Tukey Method
Another way to improve the spectral estimate is to apply a lag window to the estimated sample ACF. The periodogram

\hat{P}_{per}(e^{j\omega}) = \sum_{k=-(N-1)}^{N-1} \hat{r}_x(k)\, e^{-j\omega k}

gives the same weight to all lags of the ACF, but:
- the autocorrelations at smaller lags are estimated with higher accuracy (more terms are averaged);
- thus the larger lags should be de-emphasized to increase the accuracy of the PS.
Blackman-Tukey Method (Contd.)
Blackman and Tukey proposed to weight the ACF so that the autocorrelations at higher lags are weighted less:

\hat{P}_{BT}(e^{j\omega}) = \sum_{k=-M}^{M} w(k)\, \hat{r}_x(k)\, e^{-j\omega k}, \quad w(k) = 0 for |k| > M

where w(k) may be, e.g., a Bartlett window. Now the expected value is given by

E\{\hat{P}_{BT}(e^{j\omega})\} = \sum_{k=-M}^{M} w(k)\, w_B(k)\, r_x(k)\, e^{-j\omega k} = \frac{1}{2\pi}\, E\{\hat{P}_{per}(e^{j\omega})\} \circledast W(e^{j\omega})

(Performance comparison: figure.)
Blackman-Tukey Method (Contd.)
Because w_B(k) \approx 1 for |k| \le M \ll N,

E\{\hat{P}_{BT}(e^{j\omega})\} \approx \frac{1}{2\pi}\, P_x(e^{j\omega}) \circledast W(e^{j\omega})

If N and M go to infinity, W(e^{j\omega}) becomes an impulse, so

- the BT estimator is asymptotically unbiased.
Blackman-Tukey Method (Contd.)
Variance:

\mathrm{Var}\{\hat{P}_{BT}(e^{j\omega})\} \approx P_x^2(e^{j\omega})\, \frac{1}{N} \sum_{k=-M}^{M} w^2(k)

For a Bartlett window of length 2M, with N \gg M \gg 1,

\mathrm{Var}\{\hat{P}_{BT}(e^{j\omega})\} \approx \frac{2M}{3N}\, P_x^2(e^{j\omega})

Trade-off between resolution and variance:
- M should be large to minimize the width of the main lobe of W(e^{j\omega});
- M should be small to minimize the sum \frac{1}{N}\sum_k w^2(k).

Resolution: window dependent; for a Bartlett window of length 2M it is proportional to 2\pi/M.
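A sketch of the Blackman-Tukey estimate with a Bartlett lag window (the data length, M, and grid size are example values). A side benefit of the Bartlett lag window is that the estimate stays non-negative, since the window's transform is non-negative.

```python
import numpy as np

# Blackman-Tukey: window the biased ACF with a Bartlett lag window of
# half-length M, then take its DTFT on an nfft-point frequency grid.
def blackman_tukey_psd(x, M, nfft=512):
    N = len(x)
    acf = np.correlate(x, x, mode="full") / N     # biased ACF, lags -(N-1)..N-1
    k = np.arange(-M, M + 1)
    r = acf[N - 1 - M: N + M]                     # keep lags -M..M
    w = 1.0 - np.abs(k) / M                       # Bartlett lag window
    grid = 2 * np.pi * np.arange(nfft) / nfft
    return np.real(np.exp(-1j * np.outer(grid, k)) @ (w * r))

rng = np.random.default_rng(6)
P = blackman_tukey_psd(rng.standard_normal(256), 32)
```

For unit-variance white noise the estimate fluctuates around 1, with much smaller fluctuations than the raw periodogram since only 2M+1 of the 2N−1 lags are kept.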


Minimum Variance Spectral Estimation
(Capon method; a high-resolution method)
- The power spectrum of a random process x(n) is estimated by filtering the process with a bank of narrowband bandpass filters g_i(n).

The power spectrum of the output process y_i(n) is

P_{y_i}(e^{j\omega}) = |G_i(e^{j\omega})|^2\, P_x(e^{j\omega})

and the average power in y_i(n) is

E\{|y_i(n)|^2\} = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_{y_i}(e^{j\omega})\, d\omega = \frac{1}{2\pi} \int_{-\pi}^{\pi} |G_i(e^{j\omega})|^2\, P_x(e^{j\omega})\, d\omega

(MVSE using filter banks: figure.)

If the bandwidth \Delta is small enough so that P_x(e^{j\omega}) is approximately constant over the passband of the filter, the power in y_i(n) is approximately

E\{|y_i(n)|^2\} \approx P_x(e^{j\omega_i})\, \frac{\Delta}{2\pi}   (\Delta/2\pi: normalized filter bandwidth)
Minimum Variance Spectral Estimation (Cont.)
The MV spectrum estimation involves the following steps:
1. Design a bank of bandpass filters g_i(n) with centre frequencies \omega_i so that each filter rejects the maximum amount of out-of-band power while passing the component at frequency \omega_i with no distortion.
2. Filter x(n) and estimate the power in each output process y_i(n).
3. Set \hat{P}_x(e^{j\omega_i}) = (power estimated in step 2) / (normalized filter bandwidth).

Difference between MVSE and the periodogram viewed as a filter bank:
- For the periodogram, the filter bank is independent of the data; the filter is fixed EXCEPT for the change in centre frequency.
- If x(n) has significant out-of-band power, the PS estimate will be erroneous due to leakage through the side lobes.
- The MV filter is data dependent and rejects the out-of-band power.
Design of the Narrowband Filter g_i(n)
1. The filter must be as narrowband as possible.
2. The PS estimate for x(n) at the centre frequency \omega_i is taken to be the average power output of this filter, E\{|y_i(n)|^2\}.
3. Since the average power output of this filter is minimized while the gain at \omega_i is kept equal to 1, the output power is due mostly to the power of the input at the frequency \omega_i.

Define the following vectors to develop the details of the method (a (p+1)-tap FIR filter is assumed):

\mathbf{g}_i = [g_i(0), g_i(1), \ldots, g_i(p)]^T, \quad \mathbf{e}_i = [1, e^{j\omega_i}, \ldots, e^{jp\omega_i}]^T, \quad \mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-p)]^T

The output of the filter is then given by

y_i(n) = \sum_{k=0}^{p} g_i(k)\, x(n-k) = \mathbf{g}_i^T \mathbf{x}(n)

It is desired to minimize the average power E\{|y_i(n)|^2\} = \mathbf{g}_i^H R_x\, \mathbf{g}_i.
Design of the Narrowband Filter g_i(n) (Cont.)
Minimize in matrix form (for each i):

\min_{\mathbf{g}_i}\; \mathbf{g}_i^H R_x\, \mathbf{g}_i

To ensure that the filter does not alter the power in the input process at frequency \omega_i, \mathbf{g}_i is constrained:

\min\; \mathbf{g}_i^H R_x\, \mathbf{g}_i \quad subject to \quad \mathbf{e}_i^H \mathbf{g}_i = 1

- a zero-phase, unity-gain response at \omega_i
- no phase distortion
- no amplitude distortion

The most common way to do constrained optimization is the Lagrange multiplier method. With Lagrange multiplier \lambda, choose \mathbf{g}_i and \lambda to minimize

J = \mathbf{g}_i^H R_x\, \mathbf{g}_i + \lambda\, (1 - \mathbf{e}_i^H \mathbf{g}_i)
Design of the Narrowband Filter g_i(n) (Cont.)
Setting the gradient of J with respect to \mathbf{g}_i to zero gives R_x \mathbf{g}_i \propto \mathbf{e}_i, and enforcing the constraint \mathbf{e}_i^H \mathbf{g}_i = 1 yields the data-dependent solution

\mathbf{g}_i = \frac{R_x^{-1} \mathbf{e}_i}{\mathbf{e}_i^H R_x^{-1} \mathbf{e}_i}

(R_x is a Hermitian symmetric matrix). The minimum output power is then

\min\; E\{|y_i(n)|^2\} = \mathbf{g}_i^H R_x\, \mathbf{g}_i = \frac{1}{\mathbf{e}_i^H R_x^{-1} \mathbf{e}_i}
MVSE (Cont.)
Since the foregoing analysis holds for any choice of the centre frequency \omega_i, the MV spectral estimate is

\hat{P}_{MV}(e^{j\omega}) = \frac{\alpha}{\mathbf{e}^H R_x^{-1} \mathbf{e}}, \quad \mathbf{e} = [1, e^{j\omega}, \ldots, e^{jp\omega}]^T

Now \alpha may be obtained by dividing by the bandwidth of the bandpass filter. We use the value of \alpha that produces the correct PSD for white noise.

Example: white noise, P_x(e^{j\omega}) = \sigma_x^2. Then R_x = \sigma_x^2 I, the MV bandpass filter is \mathbf{g}_i = \mathbf{e}_i/(p+1), and the estimate of the power in y(n) at \omega_i is

\frac{1}{\mathbf{e}_i^H R_x^{-1} \mathbf{e}_i} = \frac{\sigma_x^2}{p+1}   (independent of \omega_i)
MVSE (Cont.)
Thus, choosing \alpha = p + 1, for a random process x(n)

\hat{P}_{MV}(e^{j\omega}) = \frac{p+1}{\mathbf{e}^H R_x^{-1} \mathbf{e}}   (the MV PSD of x(n))

- The ACF matrix R_x is unknown and shall be estimated from the data.
- A large p is better for the optimum filter, but
- for p close to N the errors in the ACF estimate make the estimation difficult, so use p < N.
Data Correlation Matrix R
- R may be estimated directly from a data matrix X built from the observations (for white noise, R is diagonal); the two common definitions of the data matrix are related by a transpose.
Comments on MV Spectral Estimation
Implementation of MVSE:
- Generally done directly on the data matrix X for efficiency
- Even with that, it is more complex than the classical methods

Performance of MVSE:
- Provides better resolution than the classical methods
- Mostly used when spiky spectra are expected; an AR method may then be better
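The final estimator can be sketched as below (the model order p, test signal, and grid size are example choices): for white noise the estimate is flat, while a sinusoid in noise produces a sharp peak at its frequency.

```python
import numpy as np

# Minimum-variance (Capon) spectrum: P_MV(w) = (p+1) / (e^H R^{-1} e),
# with R the (p+1)x(p+1) Toeplitz ACF matrix estimated from the data.
def mv_psd(x, p, nfft=512):
    N = len(x)
    r = np.correlate(x, x, mode="full")[N - 1: N + p] / N    # r(0)..r(p)
    R = np.array([[r[abs(i - j)] for j in range(p + 1)]
                  for i in range(p + 1)])
    Rinv = np.linalg.inv(R)
    w = 2 * np.pi * np.arange(nfft) / nfft
    e = np.exp(1j * np.outer(np.arange(p + 1), w))           # (p+1, nfft)
    denom = np.real(np.einsum("iw,ij,jw->w", e.conj(), Rinv, e))
    return (p + 1) / denom

# Sinusoid at f0 = 0.2 cycles/sample in unit-variance white noise
rng = np.random.default_rng(7)
n = np.arange(1024)
x = 2 * np.sin(2 * np.pi * 0.2 * n) + rng.standard_normal(1024)
P = mv_psd(x, 16)
```

The peak of P over the positive-frequency half of the grid lands at (or within a few bins of) 0.2 cycles/sample, illustrating the resolution advantage over the classical estimators at the same data length.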
Model-based Spectral Estimation
Why parametric methods?
- They don't require an infinite AC sequence as the periodogram does: the infinite AC sequence is modeled by a finite-parameter model.
- When using a finite data sequence, the AC estimates are increasingly poor at higher lags. Parametric methods are therefore more accurate, as they require only the first few lags.
- The noise effect can be compensated in the first few lags, if not for the whole sequence.
Model-based Spectral Estimation (Cont.)
Signal model: white noise u(n) drives the filter H(z) = B(z)/A(z) to produce x(n). Then

r_{xx}(k) = r_{uu}(k) * r_{hh}(k), \quad r_{hh}(k) = h(k) * h(-k)

Taking the z-transform,

P_x(z) = P_u(z)\, P_h(z)

Therefore, the output signal power spectrum is given by

P_x(e^{j\omega}) = P_u(e^{j\omega})\, P_h(e^{j\omega}) = \sigma_u^2\, H(z) H^*(z)\big|_{z=e^{j\omega}} = \sigma_u^2 \left| \frac{B(z)}{A(z)} \right|^2_{z=e^{j\omega}}

since P_u(z) = \sigma_u^2 \cdot 1 for a white input, with z = \exp(j2\pi f).
Parametric Methods
- AR method:

\hat{P}_{AR}(e^{j\omega}) = \frac{\hat{\sigma}_u^2}{|A(e^{j\omega})|^2} = \frac{\hat{\sigma}_u^2}{\left| 1 - \sum_{k=1}^{p} \hat{a}_k\, e^{-j\omega k} \right|^2}

- MA method:

\hat{P}_{MA}(e^{j\omega}) = \hat{\sigma}_u^2\, |B(e^{j\omega})|^2 = \hat{\sigma}_u^2 \left| 1 + \sum_{k=1}^{q} \hat{b}_k\, e^{-j\omega k} \right|^2

- ARMA method:

\hat{P}_{ARMA}(e^{j\omega}) = \hat{\sigma}_u^2\, \frac{|B(e^{j\omega})|^2}{|A(e^{j\omega})|^2} = \hat{\sigma}_u^2\, \frac{\left| 1 + \sum_{k=1}^{q} \hat{b}_k\, e^{-j\omega k} \right|^2}{\left| 1 - \sum_{k=1}^{p} \hat{a}_k\, e^{-j\omega k} \right|^2}

Problem: estimate the model parameters.
Linear System Identification
"System identification deals with the mathematical modeling of physical systems."

- Linear models: AR models, MA models, ARMA models

Input signal → Linear System → (+ additive noise) → Observed signal

Applications: spectral estimation, speech modeling, communications & control.
Blind AR System Identification Techniques
- Yule-Walker / autocorrelation method
- Prediction error method
- Covariance method
- Burg / lattice method
- Maximum likelihood method

ARMA model input-output equation:

x(n) = \sum_{i=1}^{p} a_i\, x(n-i) + \sum_{j=0}^{q} b_j\, u(n-j)

AR model: b_j = 0, j = 1, 2, \ldots, q
MA model: a_i = 0, i = 1, 2, \ldots, p

Problems: parameter estimation, order estimation.
Yule-Walker/Autocorrelation Method
AR model I/O relation:

x(n) = \sum_{i=1}^{p} a_i\, x(n-i) + u(n)

Multiplying by x(n-m) and taking E\{\cdot\}:

R_{xx}(m) = \sum_{i=1}^{p} a_i\, R_{xx}(m-i) + R_{ux}(m), \quad R_{ux}(m) = E[u(n)\, x(n-m)]

Since x(n) = h(n) * u(n), we have x(n-m) = \sum_{k=0}^{\infty} h(k)\, u(n-m-k), so

R_{ux}(m) = \sigma_u^2 \sum_{k=0}^{\infty} h(k)\, \delta(k+m) = \sigma_u^2\, h(-m)

giving R_{ux}(0) = \sigma_u^2 (since h(0) = 1) and R_{ux}(m) = 0 for m > 0.

Normal equation (for m = 1, \ldots, p):

\begin{bmatrix} R_{xx}(0) & R_{xx}(1) & \cdots & R_{xx}(p-1) \\ R_{xx}(1) & R_{xx}(0) & \cdots & R_{xx}(p-2) \\ \vdots & & & \vdots \\ R_{xx}(p-1) & R_{xx}(p-2) & \cdots & R_{xx}(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R_{xx}(1) \\ R_{xx}(2) \\ \vdots \\ R_{xx}(p) \end{bmatrix}

Use the Levinson-Durbin recursion to solve it.
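The Yule-Walker solve can be sketched as below, using the document's convention x(n) = Σ aᵢ x(n−i) + u(n) (the test system, data length, and seed are arbitrary choices; Levinson-Durbin would solve the same Toeplitz system in O(p²)):

```python
import numpy as np

# Yule-Walker: solve the normal equations R a = r from the biased ACF,
# for the model x(n) = sum_i a_i x(n-i) + u(n).
def yule_walker(x, p):
    N = len(x)
    r = np.correlate(x, x, mode="full")[N - 1: N + p] / N    # r(0)..r(p)
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1: p + 1])
    sigma2 = r[0] - a @ r[1: p + 1]                          # driving-noise power
    return a, sigma2

# Simulate a known AR(2): x(n) = 1.5 x(n-1) - 0.7 x(n-2) + u(n)
rng = np.random.default_rng(8)
u = rng.standard_normal(20000)
x = np.zeros_like(u)
for t in range(2, len(u)):
    x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + u[t]
a_hat, s2 = yule_walker(x[500:], 2)   # drop the transient
```

With a long record the estimate recovers the true coefficients (1.5, −0.7) and the unit driving-noise power closely.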
Prediction Error Method
Prediction and prediction error:

\hat{x}(n) = \sum_{i=1}^{p} a_i\, x(n-i) = \mathbf{a}^T \mathbf{x}(n-1), \quad e(n) = x(n) - \hat{x}(n)

Cost function (least squares):

J_{LS} = \frac{1}{N} \sum_{n=0}^{N-1} e^2(n)

Setting \partial J_{LS}/\partial \mathbf{a} = 0 gives

\mathbf{a}_{LS} = \mathbf{R}_{xx}^{-1}\, \mathbf{r}_{xx}

where \mathbf{R}_{xx} is the p \times p Toeplitz matrix with first row [r_{xx}(0), r_{xx}(1), \ldots, r_{xx}(p-1)] and \mathbf{r}_{xx} = [r_{xx}(1), r_{xx}(2), \ldots, r_{xx}(p)]^T.

Equivalently, minimizing the MSE J_{MSE} = E\{[x(n) - \mathbf{a}^T \mathbf{x}(n-1)]^2\} and using \frac{\partial}{\partial \mathbf{a}}[\mathbf{a}^T \mathbf{R}_x \mathbf{a}] = 2\mathbf{R}_x \mathbf{a} leads to the same normal equations.
MA Model Identification
MA model: x(n) = B(z)\, u(n), with B(z) = b_0 + b_1 z^{-1} + \cdots + b_q z^{-q}.

R_{xx}(m) = \sigma_u^2 \sum_{k=0}^{q-m} b_k\, b_{k+m}, \quad 0 \le m \le q; \qquad R_{xx}(m) = 0, \quad m > q

Difficulty: these are nonlinear equations in the b_k.

Example for an MA(3) system:

R_{xx}(0) = \sigma_u^2 [b_0^2 + b_1^2 + b_2^2 + b_3^2]
R_{xx}(1) = \sigma_u^2 [b_0 b_1 + b_1 b_2 + b_2 b_3]
R_{xx}(2) = \sigma_u^2 [b_0 b_2 + b_1 b_3]
R_{xx}(3) = \sigma_u^2 [b_0 b_3]

One remedy (the inverse-filter idea indicated in the slide, u(n) → B(z) → x(n) modeled by 1/C(z)): approximate B(z) by a long AR model with B(z)\, C(z) \approx 1, estimate C(z) by linear methods, and recover B(z) from it.
ARMA Model Identification
ARMA model I/O relation (u(n) → B(z)/A(z) → x(n)):

x(n) = \sum_{i=1}^{p} a_i\, x(n-i) + \sum_{k=0}^{q} b_k\, u(n-k)

R_{xx}(m) = \sum_{i=1}^{p} a_i\, R_{xx}(m-i) + \sum_{k=0}^{q} b_k\, R_{ux}(m-k)

For m > q the MA part no longer contributes:

R_{xx}(m) = \sum_{i=1}^{p} a_i\, R_{xx}(m-i)

AR parameters of the ARMA system (modified Yule-Walker equations, using lags q+1, \ldots, M):

\begin{bmatrix} R_{xx}(q) & R_{xx}(q-1) & \cdots & R_{xx}(q-p+1) \\ R_{xx}(q+1) & R_{xx}(q) & \cdots & R_{xx}(q-p+2) \\ \vdots & & & \vdots \\ R_{xx}(M-1) & R_{xx}(M-2) & \cdots & R_{xx}(M-p) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R_{xx}(q+1) \\ R_{xx}(q+2) \\ \vdots \\ R_{xx}(M) \end{bmatrix}

MA part:
- AR approximation of the MA part: B(z)\, C(z) = 1
- AR approximation of the ARMA model: B(z)\, C(z) = A(z), so B(z) = A(z)/C(z)
- Use polynomial division.
DFT in Spectrum Estimation
Given an N-point data sequence x(n), use an N-point (minimum) DFT.

Direct method:

S_{xx}(f) = |X(f)|^2 = \left| \sum_{n} x(n)\, e^{-j2\pi f n} \right|^2

P_{xx}\!\left(\frac{k}{N}\right) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi k n/N} \right|^2, \quad k = 0, 1, \ldots, N-1

with f_k = k/N, or \omega_k = 2\pi k/N (sparse sampling of the spectrum).

Indirect method:

S_{xx}(f) = \sum_{k} r_{xx}(k)\, e^{-j2\pi f k}, \quad r_{xx}(k) = \sum_{n} x(n)\, x(n+k)

Zero padding for interpolation (L > N):

P_{xx}\!\left(\frac{k}{L}\right) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi k n/L} \right|^2

Note: the frequency resolution is determined by the number of data points N, not by the DFT length L.
AR Spectral Estimation

P_{AR}(z) = \frac{\sigma^2}{|A(z)|^2}, \quad A(z) = 1 - \sum_{k=1}^{p} a_k\, z^{-k}, \quad z = \exp(j2\pi f)

Yule-Walker method:

R_{xx}(k) = \sum_{l=1}^{p} a_l\, R_{xx}(k-l) \text{ for } k > 0, \quad R_{xx}(k) = E\{x(n-k)\, x(n)\}

Estimation pipeline: x(n) → estimated ACF → lattice filter algorithm → lattice coefficients k_i → Levinson-Durbin recursion → estimated AR parameters \{a_1, a_2, \ldots, a_p\}.
AR Spectral Estimation from Noisy Observations
Signal model: white noise u(n) → 1/A(z) → x(n); the observed signal is corrupted by noise v(n) of unknown power:

y(n) = x(n) + v(n)

Assumption: u(n) and v(n) are uncorrelated. The goal is to estimate the AR parameters solely from y(n).

Noise Compensation Scheme (single-point compensation)
Since v(n) is white, R_{yy}(m) = R_{xx}(m) + \sigma_v^2\, \delta(m), so

R_{xx}(m) = R_{yy}(m) - \sigma_v^2 \text{ for } m = 0, \qquad R_{xx}(m) = R_{yy}(m) \text{ for } m \ne 0

and the compensated ACF is used in the Yule-Walker normal equations:

\begin{bmatrix} R_{xx}(0) & R_{xx}(1) & \cdots & R_{xx}(p-1) \\ \vdots & & & \vdots \\ R_{xx}(p-1) & R_{xx}(p-2) & \cdots & R_{xx}(0) \end{bmatrix} \begin{bmatrix} a_1 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R_{xx}(1) \\ \vdots \\ R_{xx}(p) \end{bmatrix}

Pipeline: noisy data y(n) → noise power estimator (\hat{\sigma}_v^2) → Yule-Walker method → AR parameters.
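The single-point compensation can be sketched as below (the true AR system, noise level, and data length are example values): subtracting the noise power from the zero lag removes most of the bias of the uncompensated solve.

```python
import numpy as np

# Single-point noise compensation: R_yy(k) = R_xx(k) + sigma_v^2 delta(k),
# so subtract the (known or estimated) noise power from lag zero before
# solving the Yule-Walker equations R a = r.
def yw_from_noisy(y, p, noise_var=0.0):
    N = len(y)
    r = np.correlate(y, y, mode="full")[N - 1: N + p] / N
    r[0] -= noise_var                                  # compensation step
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1: p + 1])

# AR(2) signal x plus unit-variance white observation noise v
rng = np.random.default_rng(9)
u = rng.standard_normal(30000)
x = np.zeros_like(u)
for t in range(2, len(u)):
    x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + u[t]
y = x + rng.standard_normal(len(x))

a_true = np.array([1.5, -0.7])
a_raw = yw_from_noisy(y[500:], 2)                      # no compensation
a_comp = yw_from_noisy(y[500:], 2, noise_var=1.0)      # sigma_v^2 = 1
```

The uncompensated estimate is pulled strongly toward zero by the inflated zero lag, while the compensated one lands close to (1.5, −0.7).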
Different PSD Estimators
Conventional methods:
- Direct method:

P_{PER}(f) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n)\, \exp(-j2\pi f n) \right|^2

- Indirect method:

\hat{R}_{xx}(m) = \frac{1}{N} \sum_{n=0}^{N-m-1} x(n+m)\, x(n), \qquad \hat{P}_{BT}(f) = \sum_{m=-(N-1)}^{N-1} \hat{R}_{xx}(m)\, \exp(-j2\pi f m)

Parametric methods: u(n) → H(z) = B(z)/A(z) → x(n),

P(z) = \sigma_u^2 \left| \frac{B(z)}{A(z)} \right|^2, \quad z = \exp(j2\pi f), \quad P_u = \sigma_u^2 \cdot 1
Order Selection of AR & ARMA Models
- FPE (Final Prediction Error):

FPE(p) = \hat{\sigma}_u^2(p)\, \frac{N + p + 1}{N - p - 1}

- AIC (Akaike Information Criterion):

AIC(p) = \ln(\hat{\sigma}_u^2(p)) + \frac{2p}{N}

- MDL (Minimum Description Length):

MDL(p) = N \ln(\hat{\sigma}_u^2(p)) + p \ln N

- CAT (Criterion Autoregressive Transfer)
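A sketch computing σ̂ᵤ²(p) by Yule-Walker and the AIC/MDL values for p = 1..6 on a simulated AR(2) (the test system and lengths are arbitrary): the prediction-error power drops sharply up to the true order and then flattens, which is what the criteria penalize against.

```python
import numpy as np

# Order selection: fit AR(p) for a range of p, record sigma_u^2(p),
# then evaluate AIC(p) = ln sigma^2 + 2p/N and MDL(p) = N ln sigma^2 + p ln N.
def ar_error_power(x, p):
    N = len(x)
    r = np.correlate(x, x, mode="full")[N - 1: N + p] / N
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1: p + 1])
    return r[0] - a @ r[1: p + 1]

rng = np.random.default_rng(10)
u = rng.standard_normal(5000)
x = np.zeros_like(u)
for t in range(2, len(u)):                 # true order is 2
    x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + u[t]

N = len(x)
orders = range(1, 7)
sig = [ar_error_power(x, p) for p in orders]
aic = [np.log(s) + 2 * p / N for p, s in zip(orders, sig)]
mdl = [N * np.log(s) + p * np.log(N) for p, s in zip(orders, sig)]
```

Both criteria fall steeply from p = 1 to p = 2 and then essentially stop improving, so the minimum sits at (or very near) the true order.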
Examples on System Identification
- a1 = -1.8, a2 = 0.97
- a1 = -0.55, a2 = -0.155, a3 = 0.5495, a4 = -0.6241
- a1 = -2.7607, a2 = 3.8106, a3 = -2.6535, a4 = 0.9238
- b1 = 0.493, b2 = 0.433
- b1 = -1.8, b2 = 0.97
Model based Spectral Estimation

 N=2048;
 Nd=4000;
 n=0:Nd-1;
 x=2*sin(2*pi/5*n) + 4*sin(2.8*pi/5*n);
 Q1=0;
 p=4;
 X=1/N*abs(fft(x,N));
 k=0:N-1;
 f=k./N;
 NF=max(X);
 plot(f,X/NF,'-or')

 x2=x;
 x1=x;
 sizeR=p+1;
 lenLag=Nd;
 for pos=1:sizeR;
   a = x2(pos:lenLag);
   b = x1(1:lenLag-pos+1);
   r(pos)= a*b';
 end
 r= r./lenLag;
 R= toeplitz(r(1:sizeR-1));
 par_est=-inv(R)*r(2:sizeR)'
 A=[1 par_est'];
 H=freqz(1,A,N,'whole');
 NF=max(abs(H));
 plot(f,abs(H)/NF,'-+k')
114

 k_est=KPAR(x,Nd,p,Q1)';
 [MA,AR] = ktar(k_est,p,Q1);
 H=freqz(1,AR,N,'whole');
 NF=max(abs(H));
 plot(f,abs(H)/NF,'-xg')
 grid
 hold off
115
 p = length(A1)-1;
 overfit=0;%% 1 order over-fit gives better results in MA
 no_of_pole = max(size(A)) - 1+overfit;
 Q1 = max(size(B)) - 1;
 % ----------------------------------
 % -----Generating random signal-----
 % ----------------------------------
 %randn('state', 5);
 u = randn(1, Nd);
 U1 = 2*u/sqrt(cov(u));
 % -----The generated signal is X-----
 x = filter(B, A, U1);
116
 A0 = [1];
 A1 = [1 -0.6500 -0.7200 0.7600];  %%S3
 A2 = [1 -2.299 2.1262 -0.7604];   %%%S4
 A3 = [1 -0.5500 0.1550 -0.5495 0.6241];  %%S1
 A4 = [1 -2.595 3.339 -2.2 .731];
 A5 = [1 -0.86 1.0494 -0.668 0.9592 -0.7563 0.5656];
 A6 = [1 -2.7607 3.8106 -2.6535 0.9238];  %%S2
 A7 = [1 -0.96 1.1494 -0.868 0.9092 -0.5563 0.4656];
 % A8 = [1 -2.265 2.572 -1.837 0.656];
 A9 = [1 0.479 0.086 0.29 0.731];
 A11 = [1 -0.5 -0.61 .585];
 A12 = [1 -.45 -0.68 0.6175];
 A13 = [1 0.6 -0.2975 -0.1927 .6329 .7057];
 A14 = [1 -0.5500 -0.1550 0.5495 -0.6241];
 A15 = [1 -.8484 1.6590 -.7372 1.4044 -.5556 .6236];
 A16 = [1 -0.73 0.127 0.533 -0.81];

 B1 = [1];
 B2 = [1 0.433 0.49];
 B3 = [1 -0.5 0.785];  %%%S1
 B4 = [1 0.76 0.85];   %%%S3
 B5 = [1 1.8 0.97];
 B6 = [1 -1.8 0.97];   %%%S2
 B7 = [1 -0.87 0.92];  %%%S4
 B8 = [1 0.556 0.81];
 B9 = [1 1.5 0.8];
 B10 = [1 0 0.732];
 B11 = [1 0.4 0.2 0.15 0.26];

 %%%%%%%%%%%%%%%% Select system %%%%%%%%%%%%
 A = A1;
 B = B1;
117
Examples:

118
Results

119
120
The roots of the predictor polynomial
 The denominator of the transfer function may be factored:

        1 + Σ_{k=1..p} ak z^(−k) = Π_{k=1..p} (1 − ck z^(−1))

   where {ck} are a set of complex numbers, with each complex conjugate pair
   of poles representing a resonance at frequency

        F̂k = (Fs/2π) tan⁻¹( Im{ck} / Re{ck} )

   and bandwidth

        B̂k = −(Fs/π) ln |ck|

 If the pole is close to the unit circle, i.e. if

        rk = √( Im{ck}² + Re{ck}² ) > 0.7

   then the root represents a formant.
121
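A sketch of the formant computation from the predictor roots (Python/NumPy, illustrative; the test system is a single hand-built pole pair, and atan2 is used in place of tan⁻¹ to handle poles with negative real part):

```python
import numpy as np

def formants(a, Fs):
    """Formant candidates from A(z) = 1 + sum a_k z^-k (a = [a1, ..., ap]).
    Each complex-conjugate pole pair c_k gives a resonance at frequency
    F_k = (Fs/2pi) atan(Im c_k / Re c_k) with bandwidth
    B_k = -(Fs/pi) ln|c_k|; keep only poles with |c_k| > 0.7."""
    roots = np.roots(np.concatenate(([1.0], np.asarray(a, dtype=float))))
    out = []
    for c in roots:
        if c.imag > 0 and abs(c) > 0.7:        # one of each conjugate pair
            F = Fs / (2 * np.pi) * np.arctan2(c.imag, c.real)
            B = -Fs / np.pi * np.log(abs(c))
            out.append((F, B))
    return sorted(out)

# A single resonance: pole pair at radius 0.95, frequency 500 Hz (Fs = 8 kHz)
Fs, F0, r0 = 8000.0, 500.0, 0.95
theta = 2 * np.pi * F0 / Fs
a = [-2 * r0 * np.cos(theta), r0 ** 2]   # from (1 - c z^-1)(1 - c* z^-1)
(F, B), = formants(a, Fs)                # F near 500 Hz, B = -(Fs/pi) ln 0.95
```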
Line spectral pairs
 The LP polynomial can be decomposed into two (p+1)-order polynomials:

        A(z) = ( P(z) + Q(z) ) / 2
122
 Now:
• All the roots of P(z) and Q(z) lie on the unit circle
• P(z) corresponds to the vocal tract with the glottis closed
and Q(z) with the glottis open
• The roots of P(z) and Q(z) are interspersed except with
P having only a real zero at z=-1, and Q a zero at z=1
• The zeros comprise the LSP parameters and the name
derives from the fact that each zero pair corresponds to a
pole pair in the forward model, which lies on the unit
circle
• This pole pair would represent an undamped sinusoid

123
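A minimal sketch of the decomposition (Python/NumPy, illustrative), using the standard construction P(z) = A(z) + z^−(p+1) A(1/z) and Q(z) = A(z) − z^−(p+1) A(1/z) that underlies the slide:

```python
import numpy as np

def lsp_polynomials(a):
    """Split A(z) = 1 + sum a_k z^-k (coeffs a = [1, a1, ..., ap]) into
    P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z)."""
    a = np.asarray(a, dtype=float)
    pad_a = np.concatenate((a, [0.0]))        # A(z), promoted to degree p+1
    rev_a = np.concatenate(([0.0], a[::-1]))  # z^-(p+1) A(1/z)
    return pad_a + rev_a, pad_a - rev_a

a = np.array([1.0, -0.9, 0.2])   # a stable (minimum-phase) A(z)
P, Q = lsp_polynomials(a)
# A(z) is recovered as (P(z) + Q(z)) / 2, and for stable A(z)
# all the roots of P and Q lie on the unit circle.
```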
• Since zeros occur in complex conjugate pairs, only p
unique zeros are needed to specify the model
• the zeros are found through an iterative search along the
unit circle. Further, although the zeros are complex, their
magnitudes are known to be unity, so that only a single
real parameter (the frequency or angle) is needed to
specify each one
• This makes LSP parameters very useful for speech
coding applications

124
Random Variable & Random Process

 RV: Explain with the coin flip experiment (Bernoulli random variable).
 A RV x is said to be Gaussian if its density function is of the form

        fx(α) = ( 1 / (σx √(2π)) ) exp( −(α − mx)² / (2σx²) )

 RP: A random process is an indexed sequence of random variables.
125
 WSS: A random process x(n) is said to be wide-sense stationary if the
   following conditions are satisfied:
   1. The mean of the process is a constant, mx(n) = mx, for all n.
   2. The autocorrelation rx(k,l) depends only on the difference k−l, i.e.,
      rx(k,l) = rx(k−l) = rx(τ), τ = k−l, for all k and l.
   3. The variance of the process is finite, rx(0) < ∞.
126
Power Spectrum

 The PS provides a frequency-domain description of the second-order moment
   of the process.
 Properties:
   P1: Symmetry. The PS of a WSS random process x(n) is real-valued:

        Px(e^jω) = Px*(e^jω),    Px(z) = Px*(1/z*)

   If x(n) is real then the PS is also even:

        Px(e^jω) = Px(e^−jω),    Px(z) = Px*(z*)
127
Power Spectrum
 P2: Positivity. The PS of a WSS random process is nonnegative:

        Px(e^jω) ≥ 0

 P3: Total power. The power in a zero-mean WSS random process is
   proportional to the area under the power spectral density curve:

        E{|x(n)|²} = (1/2π) ∫_{−π..π} Px(e^jω) dω

 Exercise: Find the power spectrum of a zero-mean white noise v(n).
128
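For the exercise: rv(k) = σv² δ(k), so the power spectrum of zero-mean white noise is flat, Pv(e^jω) = σv². A quick numerical sanity check by periodogram averaging (Python/NumPy, illustrative):

```python
import numpy as np

# Average K periodograms of white noise with variance sigma_v^2 = 2;
# the average should be approximately flat at 2 across all bins.
rng = np.random.default_rng(1)
sigma_v2 = 2.0
K, N = 400, 256
P = np.zeros(N)
for _ in range(K):
    v = rng.standard_normal(N) * np.sqrt(sigma_v2)
    P += np.abs(np.fft.fft(v)) ** 2 / N   # one periodogram
P /= K
```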
Parametric Methods

 AR Method:    P̂AR(e^jω) = σ̂u² / |A(e^jω)|² = σ̂u² / |1 + Σ_{k=1..p} âk e^(−jkω)|²

 MA Method:    P̂MA(e^jω) = σ̂u² |B(e^jω)|² = σ̂u² |1 + Σ_{k=1..q} b̂k e^(−jkω)|²

 ARMA Method:  P̂ARMA(e^jω) = σ̂u² |B(e^jω)|² / |A(e^jω)|²
129
Why parametric methods?

 They do not require the infinite AC sequence that the exact PSD implies, as
   the periodogram-based approach does: the infinite AC sequence is instead
   modeled by a finite-parameter model.
 With a finite data sequence, AC estimates become increasingly poor at higher
   lags. A parametric method is therefore more accurate, since it requires
   only the first few lags.
 The noise effect can be compensated in the first few lags even if it cannot
   be compensated for the whole sequence.
130
Some Formulas
        x(n) → h(n) → y(n)

        ryx(k) = E{y(n+k) x(n)} = E{ Σ_{l=−∞..∞} h(l) x(n+k−l) x(n) }

               = Σ_{l=−∞..∞} h(l) E{x(n+k−l) x(n)}

               = Σ_{l=−∞..∞} h(l) rx(k−l)

               = rx(k) * h(k)
131
        ry(k) = E{y(n+k) y(n)}

              = E{ y(n+k) Σ_{l=−∞..∞} x(l) h(n−l) }

              = Σ_{l=−∞..∞} h(n−l) E{y(n+k) x(l)}

              = Σ_{l=−∞..∞} h(n−l) ryx(n+k−l)

Setting m = n−l,

        ry(k) = Σ_{m=−∞..∞} h(m) ryx(k+m)

              = Σ_{m=−∞..∞} h(−m) ryx(k−m)

              = Σ_{m=−∞..∞} g(m) ryx(k−m) = g(k) * ryx(k) = h(−k) * ryx(k)
132
        ry(k) = h(−k) * ryx(k)
              = h(−k) * h(k) * rx(k)
              = rx(k) * rh(k),    where  rh(k) = h(k) * h(−k)

Taking the z-transform,

        Py(z) = Px(z) Ph(z)

Therefore, the output signal power spectrum is given by

        Py(e^jω) = Px(e^jω) Ph(e^jω)

For a white noise input x(n) with variance σx²,

        Py(z) = σx² H(z) H*(1/z*)   ⇒   Py(e^jω) = σx² |H(e^jω)|²
133
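The result Py(e^jω) = σx² |H(e^jω)|² can be checked numerically (Python/NumPy, illustrative) with unit-variance white noise through a hand-picked FIR filter h = [1, 0.5]:

```python
import numpy as np

rng = np.random.default_rng(2)
h = np.array([1.0, 0.5])                     # H(e^jw) = 1 + 0.5 e^(-jw)
K, N = 1000, 256
P = np.zeros(N)
for _ in range(K):                           # average K output periodograms
    x = rng.standard_normal(N + len(h) - 1)  # white input, sigma_x^2 = 1
    y = np.convolve(x, h, mode="valid")      # N steady-state output samples
    P += np.abs(np.fft.fft(y)) ** 2 / N
P /= K

w = 2 * np.pi * np.arange(N) / N
H2 = np.abs(1.0 + 0.5 * np.exp(-1j * w)) ** 2   # sigma_x^2 |H(e^jw)|^2
# P should track H2 = 1.25 + cos(w) across all frequency bins
```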
