Adaptive Signal Processing & Machine Intelligence

Lecture 3 - Spectrum Estimation


Danilo Mandic
room 813, ext: 46271

Department of Electrical and Electronic Engineering


Imperial College London, UK
[email protected], URL: www.commsp.ee.ic.ac.uk/∼mandic

© D. P. Mandic, Adaptive Signal Processing & Machine Intelligence


Outline
Part 1: Background
◦ Some intuition and history
◦ The Discrete Fourier Transform
◦ Practical issues with DFT
∗ Aliasing
∗ Frequency resolution
∗ Incoherent sampling
∗ Leakage
∗ Time-bandwidth product

Part 2: The Periodogram and its modifications


◦ Schuster periodogram
◦ The role of autocorrelation estimation
◦ Windowing
◦ Averaging
◦ Blackman-Tukey Method
◦ Statistical properties of these methods (bias, variance)



Part 1: Background



Problem Statement
From a finite record of a stationary data sequence, estimate how the total
power is distributed over frequency.

Spectrum estimation has found a tremendous number of applications:


◦ Seismology → oil exploration, earthquake

◦ Radar and sonar → location of sources

◦ Speech and audio → recognition

◦ Astronomy → periodicities

◦ Economy → seasonal and periodic components

◦ Medicine → EEG, ECG, fMRI

◦ Circuit theory, control systems



Some examples
Seismic estimation
[Figure: (a) Simplified seismic paths: a pneumatic drill and Sensor 1 and Sensor 2, with a direct path and reflected paths from Layer 1 and Layer 2. (b) Seismic impulse response: amplitude versus time, showing the direct pulse followed by reflected 1 and reflected 2.]

Speech processing
[Figure: spectrogram (frequency versus time) of the spoken word "M aaa t l aaa b", with periodic pulse excitation.]
For every time segment ∆t, the PSD is plotted along the vertical axis. Observe the harmonics in 'a'.
Darker areas: higher magnitude of PSD (magnitude encoded in color).
Use Matlab function 'specgram'.
Historical perspective
1772 Lagrange proposes use of rational functions to identify multiple periodic components;

1840 Buys–Ballot, tabular method;

1860 Thomson, harmonic analyser;

1897 Schuster, periodogram, periods not necessarily known;

1914 Einstein, smoothed periodogram;

1920-1940 Probabilistic theory of time series, Concept of spectrum;

1946 Daniell, smoothed periodogram;

1949 Hamming & Tukey transformed ACF;

1959 Blackman & Tukey, B–T method;

1965 Cooley & Tukey, FFT;

1976 Lomb, periodogram of unevenly spaced data;

1970– Modern spectrum estimation!



Fourier transform & the DFT
Fourier transform:
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt$$

Not really convenient for real–world signals ⇒ need for a signal model.
More natural: Can we estimate the spectrum from N samples of f (t),
that is
[f (0), f (1), . . . , f (N − 1)]

where the spacing in time is T ?


One solution ⇒ perform a rectangular approximation of the above integral.
We have two problems with this approach:-

i) due to the sampling of f (t), aliasing for non–bandlimited signals;


ii) only N samples retained ⇒ resolution?



Some intuition: DFT as a demodulator
Spectrum estimation paradigm: For any general signal x(t), we wish to
establish if it contains a component with frequency ω0.
We cannot perform this just by averaging
$$\int_{-\infty}^{\infty} x(t)\, dt$$
as the oscillatory components are zero-mean.
To answer whether ω0 is in x(t), we can multiply by $e^{-j\omega_0 t}$, to obtain
(recall AM demodulation and for convenience consider one signal period)
$$\int_{-T/2}^{T/2} x(t)\, e^{-j\omega_0 t}\, dt = \text{constant}$$

since for every oscillatory component $A e^{j\omega_0 t}$ we have

$$A e^{j\omega_0 t}\, e^{-j\omega_0 t} = A$$

which is effectively a Fourier coefficient.
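
The demodulation idea can be checked numerically. The following is a minimal sketch, assuming a discrete average over an integer number of periods in place of the integral; the test signal (amplitude 3, 5 cycles in 64 samples) and the function name `demodulate` are illustrative choices, not taken from the slides.

```python
import cmath
import math

def demodulate(x, omega0):
    """Average of x[n] * exp(-j*omega0*n), a discrete stand-in for the integral."""
    n_samples = len(x)
    return sum(x[n] * cmath.exp(-1j * omega0 * n) for n in range(n_samples)) / n_samples

# Hypothetical test signal: a cosine of amplitude 3 with 5 cycles in 64 samples
N = 64
omega_sig = 2 * math.pi * 5 / N
x = [3.0 * math.cos(omega_sig * n) for n in range(N)]

present = demodulate(x, omega_sig)            # probe at a frequency contained in x
absent = demodulate(x, 2 * math.pi * 9 / N)   # probe at a frequency not contained in x
plain_mean = sum(x) / N                       # plain averaging: the oscillation is zero-mean

print(abs(present), abs(absent), abs(plain_mean))
```

For a real cosine the constant is A/2 rather than A, because the cosine splits into two complex exponentials and only one of them is demodulated to zero frequency.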



Some intuition: Fourier transform as a digital filter
We can see the FT as a convolution of a complex exponential and the data (under a
mild assumption of a one-sided h sequence, ranging from 0 to ∞).

1) Continuous FT. For a continuous FT $F(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$.
Let us now swap variables t → τ and multiply by $e^{j\omega t}$, to give
$$e^{j\omega t}\int x(\tau)\, e^{-j\omega\tau}\, d\tau = \int x(\tau)\,\underbrace{e^{j\omega(t-\tau)}}_{h(t-\tau)}\, d\tau = x(t) * e^{j\omega t} \quad (= x(t) * h(t))$$

2) Discrete Fourier transform. For DFT, we have a filtering operation
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j\frac{2\pi}{N}nk} = \underbrace{x(0) + W\big[x(1) + W\big[x(2) + \cdots\big]\big]}_{\text{cumulative add and multiply}}, \qquad W = e^{-j\frac{2\pi}{N}k}$$
with the transfer function (large N)
$$H(z) = \frac{1}{1 - z^{-1}W} = \frac{1 - z^{-1}W^{*}}{1 - 2\cos\theta_k\, z^{-1} + z^{-2}}$$

[Figure: block diagrams. Continuous time case: x(t) is multiplied by exp(jωt). Discrete time case: x[n] enters an adder whose output, the DFT, is fed back through z^{-1}W.]
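
The "cumulative add and multiply" view of a DFT bin can be sketched as follows. This is an illustrative example only: the function names and the sample data are hypothetical, and the nested form is evaluated from the innermost bracket outwards.

```python
import cmath
import math

def dft_bin_direct(x, k):
    """Direct evaluation of X(k) = sum_n x(n) exp(-j*2*pi*n*k/N)."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))

def dft_bin_recursive(x, k):
    """Nested form x(0) + W[x(1) + W[x(2) + ...]] with W = exp(-j*2*pi*k/N)."""
    N = len(x)
    W = cmath.exp(-2j * math.pi * k / N)
    acc = 0j
    for sample in reversed(x):   # start from the innermost bracket
        acc = sample + W * acc   # one multiply and one add per sample
    return acc

# Arbitrary illustrative data
x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
for k in range(len(x)):
    print(k, abs(dft_bin_direct(x, k) - dft_bin_recursive(x, k)))
```

Each bin then costs one complex multiplication and one addition per sample, which is the first-order recursive filter structure the transfer function H(z) describes.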



Rank of the covariance matrix for sinusoidal data
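The difference between $\mathbb{R}^2$ and $\mathbb{C}$
Consider a single complex sinusoid with no noise
$$z_k = A e^{j\omega k} = |A|\cos(\omega k + \phi) + j|A|\sin(\omega k + \phi)$$
where φ is the phase of the complex amplitude A.

There are two possible representations of the signal: a univariate
complex-valued vector or a bivariate real-valued matrix:
1. $\mathbf{z} = [z_0, z_1, \ldots, z_{N-1}]^T = A[1, e^{j\omega}, \ldots, e^{j(N-1)\omega}]^T \stackrel{\text{def}}{=} A\mathbf{e}$
2. $\mathbf{Z} = \big[\mathrm{Re}\{\mathbf{z}\},\ \mathrm{Im}\{\mathbf{z}\}\big] = |A|\begin{bmatrix} \cos\phi & \cos(\omega + \phi) & \cdots & \cos(\omega(N-1) + \phi) \\ \sin\phi & \sin(\omega + \phi) & \cdots & \sin(\omega(N-1) + \phi) \end{bmatrix}^T$
The corresponding covariance matrices exhibit a very interesting property:

◦ $\mathbf{C}_{zz} = E\{\mathbf{z}\mathbf{z}^H\} = |A|^2\mathbf{e}\mathbf{e}^H$ → Rank = 1.

◦ $\mathbf{C}_{ZZ} = E\{\mathbf{Z}\mathbf{Z}^T\}$ → Rank = 2.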

What would happen with p sinusoids?



Discrete Fourier Transform as a Least Squares problem
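Problem: Fitting data x[n] with a linear model with N complex
sinusoids:
$$\hat{x}[n] = \frac{1}{N}\sum_{k=0}^{N-1} w[k]\, e^{j\frac{2\pi}{N}nk} \qquad (1)$$
Eq. (1) can be formulated in vector notation as $\hat{\mathbf{x}} = \frac{1}{N}\mathbf{F}\mathbf{w}$, where
$$\begin{bmatrix} \hat{x}[0] \\ \hat{x}[1] \\ \hat{x}[2] \\ \hat{x}[3] \\ \vdots \\ \hat{x}[N-1] \end{bmatrix} = \frac{1}{N}\underbrace{\begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 1 & \alpha & \alpha^2 & \alpha^3 & \cdots & \alpha^{N-1} \\ 1 & \alpha^2 & \alpha^4 & \alpha^6 & \cdots & \alpha^{2(N-1)} \\ 1 & \alpha^3 & \alpha^6 & \alpha^9 & \cdots & \alpha^{3(N-1)} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha^{N-1} & \alpha^{2(N-1)} & \alpha^{3(N-1)} & \cdots & \alpha^{(N-1)(N-1)} \end{bmatrix}}_{\mathbf{F}} \begin{bmatrix} w[0] \\ w[1] \\ w[2] \\ w[3] \\ \vdots \\ w[N-1] \end{bmatrix}$$

where $\alpha = e^{j\omega} = e^{j\frac{2\pi}{N}}$.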
Each column of F represents a sinusoid with a different frequency.



Discrete Fourier Transform as a Least Squares problem
Properties of the Fourier Matrix

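The least squares solution for w is found by (CW question):

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w}} \left\| \mathbf{x} - \tfrac{1}{N}\mathbf{F}\mathbf{w} \right\|^2 = \mathbf{F}^H\mathbf{x}$$
$$\Longrightarrow \text{DFT coefficient at bin } k \text{ is } w[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\frac{2\pi}{N}nk}$$

What are the properties of the Fourier matrix?

◦ Is it unitary? ($\mathbf{F}^H\mathbf{F} \stackrel{?}{=} \mathbf{I}$)
◦ Is it Hermitian? ($\mathbf{F}^H \stackrel{?}{=} \mathbf{F}$)
→ Can you prove these properties?

What happens if your signal x cannot be represented as a sum of the
uniformly spaced sinusoids?
Example: What if $\mathbf{x} = \big[1,\ \alpha^{\frac{1}{2}},\ \alpha^{2\cdot\frac{1}{2}},\ \ldots,\ \alpha^{(N-1)\frac{1}{2}}\big]^T$?

Incoherent sampling ⟹ a limitation of the DFT for a small N.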
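
These questions can be explored numerically before attempting a proof. The sketch below is illustrative only: it assumes the model x̂ = (1/N)Fw of Eq. (1) with a small N = 8, and the helper names and test vector are hypothetical. It builds F, forms the Gram matrix F^H F, recovers x from w = F^H x, and then applies the same analysis to the half-bin example.

```python
import cmath
import math

def fourier_matrix(N):
    """F[n][k] = alpha**(n*k) with alpha = exp(j*2*pi/N)."""
    alpha = cmath.exp(2j * math.pi / N)
    return [[alpha ** (n * k) for k in range(N)] for n in range(N)]

def gram(F):
    """Entries of F^H F."""
    N = len(F)
    return [[sum(F[n][i].conjugate() * F[n][j] for n in range(N))
             for j in range(N)] for i in range(N)]

def analysis(F, x):
    """w = F^H x, i.e. the DFT coefficients of x."""
    N = len(x)
    return [sum(F[n][k].conjugate() * x[n] for n in range(N)) for k in range(N)]

N = 8
F = fourier_matrix(N)
G = gram(F)                                   # compare with N times the identity

x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]   # arbitrary illustrative data
w = analysis(F, x)
x_hat = [sum(F[n][k] * w[k] for k in range(N)) / N for n in range(N)]

# Half-bin sinusoid x[n] = alpha**(n/2): it matches none of the columns of F
x_half = [cmath.exp(1j * math.pi * n / N) for n in range(N)]
w_half = analysis(F, x_half)
print([round(abs(v), 3) for v in w_half])
```

With this (unnormalised) definition the Gram matrix comes out as N·I, so F is unitary only up to a scaling, and every coefficient of the half-bin example is nonzero, which is the incoherent-sampling effect.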



Spectrum estimation as an eigen-analysis problem
Def: A function which remains unchanged when passed through a system,
apart from a scaling by a constant, is called an eigenfunction, and the
scaling constant is called an eigenvalue.
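For a digital filter with the impulse response $h_k$, the eigenfunction $e_k$ must satisfy
$$\lambda e_k = \sum_{i=-\infty}^{\infty} h_i e_{k-i} \qquad \text{(no general method for deriving } e_k\text{)}$$
Consider a candidate eigenfunction $e_k = \cos(\omega k)$, then
$$y_k = \sum_{i=-\infty}^{\infty} h_i \cos[\omega(k-i)] = \cos(\omega k)\Big[\sum_{i=-\infty}^{\infty} h_i\cos\omega i\Big] + \sin(\omega k)\Big[\sum_{i=-\infty}^{\infty} h_i\sin\omega i\Big]$$
◦ Clearly cos comes close, but is not suitable due to the sin terms.
◦ A sum $a\cos\omega k + b\sin\omega k = c\cos(\omega k + \Phi)$ is therefore not suitable either.
On the other hand, for $e^{j\omega k} = \cos\omega k + j\sin\omega k$, we have
$$y_k = \sum_{i=-\infty}^{\infty} h_i e^{j\omega(k-i)} = e^{j\omega k}\Big[\sum_{i=-\infty}^{\infty} h_i e^{-j\omega i}\Big] = e^{j\omega k}H(\omega) \qquad \text{clearly an eigenfunction}$$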
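
A small numerical check of the eigenfunction property is sketched below, assuming a short causal FIR filter; the coefficients `h` and the frequency `omega` are hypothetical values chosen for illustration. The ratio output/input is constant (equal to H(ω)) for the complex exponential, but varies with k for the cosine.

```python
import cmath
import math

h = [0.5, 0.3, 0.2]   # hypothetical FIR impulse response
omega = 0.7           # hypothetical test frequency in rad/sample

def filter_output(h, signal, k):
    """y_k = sum_i h_i * e_{k-i} for an input given as a function of the index."""
    return sum(h[i] * signal(k - i) for i in range(len(h)))

def complex_exp(m):
    return cmath.exp(1j * omega * m)

def cosine(m):
    return math.cos(omega * m)

# Frequency response H(omega) = sum_i h_i exp(-j*omega*i)
H = sum(h[i] * cmath.exp(-1j * omega * i) for i in range(len(h)))

ratios_exp = [filter_output(h, complex_exp, k) / complex_exp(k) for k in range(10)]
ratios_cos = [filter_output(h, cosine, k) / cosine(k) for k in range(10)]

print(H, ratios_exp[0], ratios_exp[9])
print(min(ratios_cos), max(ratios_cos))
```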



FT basics
Periodic signal ↔ Discrete FT
Discrete signal ↔ Periodic FT
Periodic and Discrete signal ↔ Discrete and Periodic FT
◦ Sampling yields a new signal ($\Omega_0 = \frac{2\pi}{T}$) (poor approximation)
$$g[n] = T f(nT) \quad\Leftrightarrow\quad G(\omega) = \sum_{k=-\infty}^{\infty} F(\omega + k\Omega_0)$$
◦ Limiting the length to N samples effectively introduces rectangular windowing (Leakage)
$$W(\omega) = \frac{\sin(N\omega T/2)}{\sin(\omega T/2)}\, e^{-j\frac{N-1}{2}\omega T}$$
⇒ Estimated spectrum = True spectrum ∗ sinc



Practical Issue #1: Aliasing
Sampling Theorem Revisited
[Figure: top, the original signal x(t) and the sampled original signal x[k]; bottom, the original spectrum X(f), limited to (−f_h, f_h), and the spectrum of the sampled signal, with a replica around f_s occupying (f_s − f_h, f_s + f_h).]


For sampling period T and sampling frequency fs = 1/T ⇒ fs ≥ 2fh



Practical Issue #1: Aliasing
Sampling theorem: An example
◦ Sub-Nyquist sampling causes aliasing
◦ This distorts the physical meaning of information
◦ In signal processing, we require faithful data representation
◦ Problem: the noise model is always all-pass
◦ The easiest and most logical remedy is to low-pass filter the data so that the Nyquist criterion is satisfied.

[Figure: top, original 12 Hz sine wave and samples (legend: 1.2 kHz, 12 Hz, 20 Hz, 48 Hz) over 0 to 2 s; bottom, periodogram power spectral density estimate, Power versus Frequency (Hz), with the same legend.]
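
The effect of sub-Nyquist sampling can be reproduced in a few lines. This is a sketch under stated assumptions: the helper `alias_frequency` is a hypothetical name, and the 12 Hz tone and the sampling rates are read from the figure legend as sampling frequencies.

```python
import math

def alias_frequency(f, fs):
    """Apparent frequency in [0, fs/2] of a real sinusoid of frequency f sampled at fs."""
    f_mod = f % fs
    return min(f_mod, fs - f_mod)

f_sig = 12.0   # Hz, the sine wave of the example
for fs in (1200.0, 48.0, 20.0, 12.0):
    print(fs, alias_frequency(f_sig, fs))

# Sampled at 20 Hz, the 12 Hz sine is sample-by-sample identical to a
# (sign-inverted) 8 Hz sine, so the two cannot be told apart after sampling.
fs = 20.0
x12 = [math.sin(2 * math.pi * 12.0 * n / fs) for n in range(40)]
x8 = [-math.sin(2 * math.pi * 8.0 * n / fs) for n in range(40)]
max_difference = max(abs(a - b) for a, b in zip(x12, x8))
print(max_difference)
```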
Practical Issue #2: Frequency Resolution
Def: Frequency resolution is the minimum separation between two
sinusoids, resolvable in frequency.
Ideally, we want an excellent resolution for very few data samples
(genomic SP).

However,

i) Due to the wide mainlobe of the sinc function (spectrum of the rectangular window), the convolution between the true spectrum and the sinc function smears the spectrum;

ii) For two impulses in frequency to be resolvable, at least one frequency bin must separate them, that is
$$\Delta\omega > 2\,\frac{2\pi}{NT} \quad\Rightarrow\quad T \text{ fixed} \rightarrow N \text{ increase}$$



Practical Issue #2: Frequency Resolution
Time-bandwidth Product
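◦ Suppose we know the maximum frequency in the signal $\omega_{max}$, and the required resolution ∆ω. Then
$$\Delta\omega > 2\,\frac{2\pi}{NT} = 2\,\frac{\omega_s}{N} \quad\Rightarrow\quad N > \frac{4\omega_{max}}{\Delta\omega}$$
◦ For both the prescribed resolution and bandwidth, then
$$\omega_s = \frac{2\pi}{T} > 2\omega_{max} \quad \& \quad 2\omega_s < \Delta\omega\, N$$
hence
$$\frac{\omega_s}{2} = \frac{\pi}{T} > \omega_{max}, \quad\text{that is}\quad T < \frac{\pi}{\omega_{max}} \quad\Leftrightarrow\quad N > \frac{4\omega_{max}}{\Delta\omega}$$
◦ If we know the signal duration $t_{max}$ ($f_s \geq 2f_{max} \Rightarrow T \leq \frac{2\pi}{2\omega_{max}} = \frac{\pi}{\omega_{max}}$)
$$N > \frac{2t_{max}}{T} \quad\Rightarrow\quad N > \frac{2t_{max}\,\omega_{max}}{\pi}$$
$t_{max} \times \omega_{max}$ → time–bandwidth product of a signal.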
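
As a worked example of these bounds, the sketch below evaluates both lower limits on N. The function names are hypothetical, and the numbers (f_max = 21 Hz, t_max = 1 s, a 1 Hz resolution) are illustrative assumptions.

```python
import math

def min_samples_for_resolution(omega_max, delta_omega):
    """Lower bound N > 4*omega_max/delta_omega (resolution and bandwidth prescribed)."""
    return 4.0 * omega_max / delta_omega

def min_samples_for_duration(t_max, omega_max):
    """Lower bound N > 2*t_max*omega_max/pi (signal duration and bandwidth known)."""
    return 2.0 * t_max * omega_max / math.pi

omega_max = 2 * math.pi * 21.0    # rad/s, about 132 rad/s
delta_omega = 2 * math.pi * 1.0   # rad/s, an assumed 1 Hz resolution

n_resolution = min_samples_for_resolution(omega_max, delta_omega)
n_duration = min_samples_for_duration(1.0, omega_max)
print(n_resolution, n_duration)
```

Both bounds coincide here because a 1 s record gives a resolution of the order of 1 Hz, which is the time–bandwidth product at work.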



Example: the time–bandwidth product
Top: AM signals. Bottom: Gaussian signals.
[Figure: AM signals, Amplitude versus Time (sec) and Amplitude Spectrum versus Frequency (Hz). First signal: tmax = 1 sec, N=210, f1 = 19 Hz, fmax = 21 Hz, ωmax = 132 rad/s. Second signal: tmax = 0.836 sec, N=210, f1 = 15 Hz, fmax = 25 Hz, ωmax = 156 rad/s.]
[Figure: time domain Gaussian windows with σ = 0.125 and σ = 0.25, Amplitude versus Sample Index, and the amplitude spectra of the Gaussian windows versus Normalised Frequency.]



Practical Issue #3: Spectral Leakage
Two sines with close frequencies
Top: A 32-point DFT of an N = 32 long sampled (fs = 64 Hz) mixed sinewave

x(k) = sin(2π·11·k/fs) + sin(2π·17·k/fs)

It is difficult to determine how many distinct sinewaves we have.
Bottom: A 3200-point DFT of an N = 32 long sampled (fs = 64 Hz) sine

x(k) = sin(2π·11·k/fs) + sin(2π·17·k/fs)

◦ Both the f = 11 Hz and f = 17 Hz sinewaves appear quite sharp
◦ This is a consequence of a high-resolution (N = 3200) DFT
◦ The overlay plot compares it with the top diagram

[Figure: top, DFT (mixed signal); bottom, High resolution DFT (mixed signal); both plotted against Frequency [Hz] from −20 to 20 Hz.]
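
The relation between the 32-point and the 3200-point DFT can be sketched as follows, assuming the sampled signal x(k) = sin(2π·11·k/fs) + sin(2π·17·k/fs) with fs = 64 Hz. The helper `dtft` is a hypothetical name; evaluating the DTFT of the 32 samples on a grid of 3200 frequencies is what a zero-padded 3200-point DFT computes.

```python
import cmath
import math

def dtft(x, f_norm):
    """DTFT of the finite sequence x at normalised frequency f_norm (cycles/sample)."""
    return sum(x[n] * cmath.exp(-2j * math.pi * f_norm * n) for n in range(len(x)))

fs = 64.0
N = 32
x = [math.sin(2 * math.pi * 11 * n / fs) + math.sin(2 * math.pi * 17 * n / fs)
     for n in range(N)]

# 32-point DFT: bins every fs/N = 2 Hz, so neither 11 Hz nor 17 Hz is on a bin
coarse = [abs(dtft(x, k / N)) for k in range(N)]

# Zero-padded 3200-point DFT (non-negative frequencies only): bins every 0.02 Hz
n_fft = 3200
fine = {k: abs(dtft(x, k / n_fft)) for k in range(n_fft // 2 + 1)}

# Location of the spectral peak between 8 Hz and 14 Hz on the fine grid
peak_hz = max(range(400, 701), key=lambda k: fine[k]) * fs / n_fft
print(peak_hz, abs(dtft(x, 11 / fs)))
```

Every 100th point of the fine grid coincides with a bin of the 32-point DFT: zero-padding interpolates the same underlying spectrum on a denser grid, it does not add information or narrow the mainlobe.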



Example: FFT leakage – EEG power spectrum
We record ≈ 10 µV signals in the presence of external noise.
Problem: estimate the power of the 50 Hz artefact picked up by EEG leads.

• Using the standard periodogram, the resolution is good but the artefact is partially masked
• Remedy: Use a windowing function (e.g. Hanning window).
  – Note that sidelobes are reduced, energy over a narrow frequency range around 50 Hz.
• Window value is zero at the beginning and end of a segment
  – Multiply the signal with a window that has small sidelobes to reduce leakage
• Windows reduce, but do not eliminate leakage completely!

[Figure: Power spectrum of EEG channel (Fp1), power (dB) versus frequency (Hz) between 45 and 55 Hz. Top panel [no window], with the sidelobe marked: periodogram(x,[],N,Fs,'onesided'); bottom panel [Hanning window]: periodogram(x,hann(length(x)),N,Fs,'onesided');]
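
The leakage reduction can be illustrated without EEG data. The following sketch is an assumption-laden stand-in: it uses a synthetic sinusoid placed halfway between two DFT bins (the worst case for leakage), a symmetric Hann window, and hypothetical helper names `hann` and `leakage_db`.

```python
import cmath
import math

def hann(N):
    """Symmetric Hann window: zero at the first and last sample."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def leakage_db(x, window):
    """Largest DFT magnitude in bins 18..30, in dB relative to the spectral peak."""
    N = len(x)
    mags = [abs(sum(window[n] * x[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) for k in range(N // 2)]
    return 20 * math.log10(max(mags[18:31]) / max(mags))

N = 64
# Synthetic tone at bin position 10.5, i.e. incoherently sampled
x = [math.sin(2 * math.pi * 10.5 * n / N) for n in range(N)]

rect_db = leakage_db(x, [1.0] * N)   # no window (rectangular)
hann_db = leakage_db(x, hann(N))     # Hann window
print(rect_db, hann_db)
```

Far from the tone the Hann-windowed spectrum sits well below the rectangular one, at the price of a wider mainlobe around the tone itself.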
Practical Issue #4: Incoherent sampling
Are the signal frequencies $f = k\,\frac{f_s}{N}$?
Top: A 32-point DFT of an N = 32 long sampled (fs = 64 Hz) sinewave of f = 10 Hz
◦ For fs = 64 Hz, the DFT bins will be located in Hz at k/NT = 2k, k = 0, 1, 2, ..., 31
◦ One of these points is at the given signal frequency of 10 Hz
Bottom: A 32-point DFT of an N = 32 long sampled (fs = 64 Hz) sine of f = 11 Hz
◦ Since
$$\frac{f}{f_s/N} = \frac{f \times N}{f_s} = \frac{11 \times 32}{64} = 5.5$$
the impulse at f = 11 Hz appears between the DFT bins k = 5 and k = 6 (10 and 12 Hz)
◦ The impulse at f = −11 Hz appears between DFT bins k = 26 and k = 27 (−12 and −10 Hz)

[Figure: top, DFT (coherent sampling); bottom, DFT (non-coherent sampling); both plotted against Frequency [Hz] from −15 to 15 Hz.]
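
The two cases can be reproduced directly. This is a minimal sketch assuming the parameters of the example (N = 32, fs = 64 Hz, sines at 10 Hz and 11 Hz); the function name `dft_magnitudes` is hypothetical.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the N-point DFT of x."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N)]

fs = 64.0
N = 32

def sampled_sine(f):
    return [math.sin(2 * math.pi * f * n / fs) for n in range(N)]

mags10 = dft_magnitudes(sampled_sine(10.0))   # coherent: 10 Hz is bin 5
mags11 = dft_magnitudes(sampled_sine(11.0))   # incoherent: 11 Hz is "bin 5.5"

nonzero10 = [k for k, m in enumerate(mags10) if m > 1e-6]
nonzero11 = [k for k, m in enumerate(mags11) if m > 1e-6]
print(11.0 * N / fs, nonzero10, len(nonzero11), max(mags10), max(mags11))
```

For the coherent case all the energy sits in bins 5 and 27 with magnitude N/2; for the incoherent case it is smeared over the bins and the peak is lower.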



Practical Issue #4: Incoherent sampling
Visual Representation
[Figure: DFT Spectrum, f = 10 Hz (left) and DFT Spectrum, f = 11 Hz (right), plotted against Frequency (Hz).]



Part 2: The Periodogram



Power Spectrum estimation: Problem statement
Estimate the Power Spectral Density (PSD) of a wide-sense stationary signal.
Recall that PSD = F{ACF}.
Therefore, estimating the power spectrum is equivalent to
estimating the autocorrelation.
Recall that for an autocorrelation ergodic process,
$$\lim_{N\to\infty}\left\{\frac{1}{2N+1}\sum_{n=-N}^{N} x(n+k)\,x(n)\right\} = r_{xx}(k)$$

If x(n) is known for all n, estimating the power spectrum is


straightforward
◦ Difficulty 1: the amount of data is always limited, and may be very
small (genomics, biomedical)
◦ Difficulty 2: real world data is almost invariably corrupted by
noise, or contaminated with an interfering signal



PSD properties

i) $P_{xx}(f)$ is a real function ($P_{xx}(f) = P_{xx}^{*}(f)$).
Proof: Since r(−m) = r(m) and f ∈ (−1/2, 1/2] (ω ∈ (−π, π]), we have
$$P_{xx}(f) = \mathcal{F}\{r_{xx}(m)\} = \sum_{m=-\infty}^{\infty} r_{xx}(m)\, e^{-j2\pi mf} = \sum_{m=-\infty}^{\infty} r_{xx}(-m)\, e^{j2\pi mf}$$
and hence it has no notion of the phase information in data
$$P_{xx}(f) = \sum_{m=-\infty}^{\infty} r_{xx}(m)\cos(2\pi mf) = r_{xx}(0) + 2\sum_{m=1}^{\infty} r_{xx}(m)\cos(2\pi mf)$$

ii) $P_{xx}(f)$ is a symmetric function, $P_{xx}(-f) = P_{xx}(f)$. This follows from the last expression.

iii) $r(0) = \int_{-1/2}^{1/2} P_{xx}(f)\, df = E\{x^2[n]\} \geq 0$.

⇒ the area below the PSD (power spectral density) curve = Signal Power



Spectral estimation techniques
In practice, we only have a finite length of data sequence, therefore it is
only possible to estimate the true PSD.
This is why spectral estimation is a challenging problem: we must
use the available data to form the most accurate estimate of the PSD and
consider the statistical stationarity of the real measurement.
To quantify the error, we consider the statistical properties of the
associated spectral estimation techniques.
◦ Conventional methods
– They only assume F{rxx(k)} = Pxx(f ).

◦ Model–based schemes
– assume that the measurement is generated by some prescribed
parametric form, for instance by a rational transfer function (filter)
driven by white Gaussian noise
WGN ⇒ FILTER ⇒ Measurement



Power spectrum – some insights
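It would be advantageous to obtain the power spectrum directly from the DFT of data.

We shall now show that the PSD can be written in an equivalent form:
$$P_{xx}(f) = \lim_{M\to+\infty} E\left\{\frac{1}{2M+1}\left|\sum_{k=-M}^{+M} x[k]\, e^{-j2\pi fk}\right|^2\right\}$$

Let us begin by expanding
$$P_{xx}(f) = \lim_{M\to+\infty} E\left\{\frac{1}{2M+1}\sum_{k=-M}^{+M}\sum_{l=-M}^{M} x[k]\,x[l]\, e^{-j2\pi f(k-l)}\right\}$$
$$= \lim_{M\to+\infty}\frac{1}{2M+1}\sum_{k=-M}^{+M}\sum_{l=-M}^{M} \underbrace{E\{x[k]\,x[l]\}}_{r_{xx}(k-l)}\, e^{-j2\pi f(k-l)}$$
$$= \lim_{M\to+\infty}\frac{1}{2M+1}\sum_{k=-M}^{+M}\sum_{l=-M}^{M} g(k-l)$$
where $g(k-l) = r_{xx}(k-l)\, e^{-j2\pi f(k-l)}$.

Note that $\left|\sum_i a_i\right|^2 = \sum_j a_j \times \sum_k a_k^{*}$.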



Converting double into a single summation
$$\sum_{k=-M}^{+M}\sum_{l=-M}^{M} g(k-l) = \sum_{\tau=-2M}^{2M} (2M+1-|\tau|)\, g(\tau)$$

[Figure: the (k, l) grid of terms g(k − l), constant along each diagonal, from g(+2M) in one corner through g(0) on the main diagonal to g(−2M) in the opposite corner.]
(2M+1) points → g = g(0)
2M points → g = g(1)
...
1 point → g = g(2M)
etc.

Reminds you of a triangle?
Recall: the autocorrelation of two rectangles of width 2M is a triangle of width 4M!
This underpins our first practical power spectrum estimator.
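
The conversion of the double sum into a single, triangularly weighted sum is easy to verify numerically. In this sketch the function g is an arbitrary hypothetical choice (a decaying cosine); any function of the lag would do.

```python
import math

def double_sum(g, M):
    """Sum of g(k - l) over the (2M+1) x (2M+1) grid of indices k, l."""
    return sum(g(k - l) for k in range(-M, M + 1) for l in range(-M, M + 1))

def single_sum(g, M):
    """Equivalent single sum over the lag tau with triangular weights."""
    return sum((2 * M + 1 - abs(tau)) * g(tau) for tau in range(-2 * M, 2 * M + 1))

def g(tau):
    # Hypothetical lag function, used only to exercise the identity
    return 0.8 ** abs(tau) * math.cos(0.3 * tau)

for M in range(1, 6):
    print(M, double_sum(g, M), single_sum(g, M))
```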



Schuster’s periodogram (1898)
Hence
$$P_{xx}(f) = \lim_{M\to+\infty}\sum_{\tau=-2M}^{2M}\underbrace{\frac{2M+1-|\tau|}{2M+1}}_{=\,1-\frac{|\tau|}{2M+1}}\, r_{xx}(\tau)\, e^{-j2\pi f\tau}$$

Provided the autocorrelation function decays fast enough, we have
$$P_{xx}(f) = \sum_{\tau=-\infty}^{+\infty} r_{xx}(\tau)\, e^{-j2\pi f\tau}$$
Note $r_{xx}(\tau) = r_{xx}(-\tau)$ ⇒ $P_{xx}(f)$ is real!

In practice, we only have access to [x(0), . . . , x(N − 1)] data points (we
drop the expectation), then
$$\hat{P}_{per}(f) = \frac{1}{N}\left|\sum_{k=0}^{N-1} x[k]\, e^{-j2\pi fk}\right|^2$$

Symbol ˆ denotes an estimate, since due to the finite N the ACF is imperfect



Periodogram based estimation of power spectrum
more intuition – connection with DFT
A nonparametric estimator of the power spectrum – the periodogram
$$\hat{P}_{per}(e^{j\omega}) = \sum_{k=-N+1}^{N-1} \hat{r}_{xx}(k)\, e^{-jk\omega}$$

It is, however, more convenient to express the periodogram in terms of the
process x[n] (alternative derivation):

◦ Notice that $\hat{r}_{xx}(k) = \frac{1}{N}\, x(k) * x(-k)$

◦ Apply the FT to obtain
$$\hat{P}_{per}(e^{j\omega}) = \frac{1}{N} X(e^{j\omega})\, X^{*}(e^{j\omega}) = \frac{1}{N}\left|X(e^{j\omega})\right|^2$$
where $X(e^{j\omega}) = \sum_{n=0}^{N-1} x(n)\, e^{-j\omega n}$ (this is the DTFT of x(n)).
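
The equivalence of the two routes to the periodogram can be checked on a short record. This is an illustrative sketch for real-valued data; the data values and function names are hypothetical, and the ACF estimate is the biased one (normalised by N).

```python
import cmath

def biased_acf(x, k):
    """Biased ACF estimate r_hat(k) = (1/N) * sum_n x(n) x(n+|k|) for real x."""
    N = len(x)
    k = abs(k)
    return sum(x[n] * x[n + k] for n in range(N - k)) / N

def periodogram_direct(x, omega):
    """(1/N) |X(e^{j omega})|^2 computed from the data."""
    N = len(x)
    X = sum(x[n] * cmath.exp(-1j * omega * n) for n in range(N))
    return abs(X) ** 2 / N

def periodogram_from_acf(x, omega):
    """Sum over lags -(N-1)..(N-1) of r_hat(k) exp(-j k omega)."""
    N = len(x)
    total = sum(biased_acf(x, k) * cmath.exp(-1j * k * omega)
                for k in range(-(N - 1), N))
    return total.real   # the imaginary part cancels because r_hat is even

x = [1.0, -0.5, 2.0, 0.3, -1.2, 0.7, 0.0, 1.5]   # arbitrary illustrative record
test_frequencies = [0.0, 0.4, 1.1, 2.5, 3.0]
for omega in test_frequencies:
    print(omega, periodogram_direct(x, omega), periodogram_from_acf(x, omega))
```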



What to look for next?
◦ We must examine the statistical properties of the periodogram estimator
◦ For the general case, the statistical analysis of the periodogram is
intractable
◦ We can, however, derive the mean of the periodogram estimator for any
real process
◦ The variance can only be derived for the special case of a real
zero–mean WGN process with Pxx(f ) = σx2
◦ Can this be used as an indication of the variance of the periodogram
estimator for other random signals?
◦ Can we use our knowledge about the analysis of various estimators to
treat the periodogram in the same light (is it an MVU estimator, does it
attain the CRLB)?
◦ Can we make a compromise between the bias and variance in order to
obtain a minimum mean squared error (MSE) estimator of the power spectrum?



Why don't you think a little about ...
~ The resolution for zero-padded spectra is higher, what can we tell about
the variance of such a periodogram?
~ If the samples at the start and end of a finite-length data sequence have
significantly different amplitudes, how does this affect the spectrum?
~ What uncertainties are associated with the concept of “frequency bin”?
~ What happens with high frequencies in tapered periodograms?
~ What would be the ideal properties of a “data window”?
~ How frequently do we experience incoherent sampling in real life
applications and what is a most pragmatic way to deal with the
frequency resolution when calculating spectra of such signals?
~ How can we use the time–bandwidth product to ensure physical
meaning of spectral estimates?
~ The “double summation” formula that uses progressively fewer samples
to estimate the ACF is very elegant, but does it come with some
problems too, especially for larger lags?



Physical intuition: Connecting PSD and ACF
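positive (semi)-definiteness
Recall:
$$\mathbf{R}_{xx} = E\{\mathbf{x}\mathbf{x}^T\} = \begin{bmatrix} r(0) & r(1) & \cdots & r(N-1) \\ r(1) & r(0) & \cdots & r(N-2) \\ \vdots & \vdots & \ddots & \vdots \\ r(N-1) & r(N-2) & \cdots & r(0) \end{bmatrix}$$
Then, for a linear system with input sequence {x}, output {y}, and the
vector of coefficients a, the output has the form
$$y(n) = \sum_{k=0}^{N-1} a(k)\,x(n-k) = \mathbf{x}^T\mathbf{a} = \mathbf{a}^T\mathbf{x} \quad\text{where}\quad \mathbf{a} = [a(0), \ldots, a(N-1)]^T$$
The power $P_y = E\{y^2\}$ is always non-negative, and thus (using $(\mathbf{a}^T\mathbf{b})^T = \mathbf{b}^T\mathbf{a}$)
$$E\{y^2[n]\} = E\{y[n]\,y^T[n]\} = E\{\mathbf{a}^T\mathbf{x}\mathbf{x}^T\mathbf{a}\} = \mathbf{a}^T E\{\mathbf{x}\mathbf{x}^T\}\,\mathbf{a} = \mathbf{a}^T\mathbf{R}_{xx}\mathbf{a}$$

⇒ to maintain non-negative power, the autocorrelation matrix $\mathbf{R}_{xx}$ must be positive semidefinite
In other words: a positive semidefinite $\mathbf{R}_{xx}$ will always produce a
positive power spectrum!
But, is our estimate of ACF always positive definite?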



Two ways to estimate the ACF
For an autocorrelation ergodic process with an unlimited amount of
data, the ACF may be determined:
1) Using the time–average
$$r_{xx}(k) = \lim_{N\to\infty}\frac{1}{2N+1}\sum_{n=-N}^{N} x(n+k)\,x(n)$$
If x(n) is measured over a finite time interval, n = 0, 1, . . . , N − 1, then we
need to estimate the ACF from a finite sum
$$\hat{r}_{xx}(k) = \frac{1}{N}\sum_{n=0}^{N-1} x(n+k)\,x(n)$$
2) In order to ensure that the values of x(n) that fall outside the interval
[0, N − 1] are excluded from the sum, we have (biased estimator)
$$\hat{r}_{xx}(k) = \frac{1}{N}\sum_{n=0}^{N-1-k} x(n+k)\,x(n), \qquad k = 0, 1, \ldots, N-1$$
Cases 1) and 2) are equivalent for small lags and a fast decaying ACF
Case 1) gives positive semidefinite ACF, this is not guaranteed for Case 2)



Periodogram and Matlab
Px = abs(fft(x(n1:n2))).^2/(n2-n1+1)

or the direct command 'periodogram'

◦ Pxx = PERIODOGRAM(X)
returns the PSD estimate of the signal specified by vector X in the
vector Pxx. By default, the signal X is windowed with a BOXCAR
window of the same length as X;

◦ PERIODOGRAM(X,WINDOW)
specifies a window to be applied to X. WINDOW must be a vector of
the same length as X;

◦ [Pxx,W] = PERIODOGRAM(X,WINDOW,NFFT)
specifies the number of FFT points used to calculate the PSD estimate.



Performance of the periodogram
(we desire a minimum variance unbiased (MVU) estimator)
Its performance is analysed in the same way as the performance of any
other estimator:

◦ Bias, that is, whether
$$\lim_{N\to\infty} E\left\{\hat{P}_{per}(f)\right\} = P_x(f)$$

◦ Variance
$$\lim_{N\to\infty} Var\left\{\hat{P}_{per}(f)\right\} = 0$$

◦ Mean square convergence
$$MSE = bias^2 + variance = E\left\{\left[\hat{P}_{per}(f) - P_x(f)\right]^2\right\}$$
⇒ we desire
$$\lim_{N\to\infty} E\left\{\left[\hat{P}_{per}(f) - P_x(f)\right]^2\right\} = 0$$

We need to check whether $\hat{P}_{per}(f)$ is a consistent estimator of the true PSD.



Bias of the periodogram as an estimator
We can calculate this by finding the expected value of
$\hat{r}_{xx}(k) = \frac{1}{N}\sum_{n=0}^{N-1-|k|} x(n)\,x(n+|k|)$. Thus (biased estimate)
$$E\left\{\hat{P}_{per}(f)\right\} = \sum_{k=-(N-1)}^{N-1} E\{\hat{r}_{xx}(k)\}\, e^{-j2\pi fk} = \sum_{k=-(N-1)}^{N-1} \frac{N-|k|}{N}\, r_{xx}(k)\, e^{-j2\pi fk} = \mathcal{F}\{w_B(k) \times r_{xx}(k)\}$$

where $r_{xx}$ is the true ACF and the Bartlett (triangular) window is defined
by
$$w_B(k) = \begin{cases} 1 - \frac{|k|}{N}; & |k| \leq N-1 \\ 0; & |k| > N-1 \end{cases}$$

Notice the maximum at k = 0, and a slow decay towards the end of the sequence.



Inherent windowing in the Periodogram
Issues with finite duration measurements
To analyse the effects of a finite signal duration, consider a rectangular
window
$$\underbrace{[1, \ldots, 1]}_{0,\ldots,N-1} \;\xrightarrow{\ \mathcal{F}\ }\; \sum_{k=0}^{N-1} e^{-j2\pi fk}$$
$$W(f) = \sum_{k=0}^{N-1} e^{-j2\pi fk} = \frac{1 - e^{-j2\pi fN}}{1 - e^{-j2\pi f}} = \frac{e^{-j\frac{2\pi fN}{2}}}{e^{-j\frac{2\pi f}{2}}}\cdot\frac{2j\sin(\pi fN)}{2j\sin(\pi f)}$$
$$= e^{-j\pi f(N-1)} \times \frac{\sin(\pi fN)}{\pi fN}\cdot\frac{\pi fN}{\sin(\pi f)} = e^{-j\pi f(N-1)} \times N\,\frac{\mathrm{sinc}(\pi fN)}{\mathrm{sinc}(\pi f)}$$

If the sampling is coherent, zeroes of the sinc functions all lie at multiples
of 1/N, and hence the outputs of the DFT are all zero except at f = ±1/N.



Effects of the Bartlett window on resolution

Behaves as sinc²



Periodogram bias – continued
From the previous observation, we have
$$E\left\{\hat{P}_{per}(f)\right\} = \sum_{k=-\infty}^{\infty} r_{xx}(k)\, w_B(k)\, e^{-j2\pi kf} \quad\Leftrightarrow\quad W_B(f) * P_{xx}(f)$$
where
$$W_B(f) = \frac{1}{N}\left[\frac{\sin \pi fN}{\sin \pi f}\right]^2.$$
In words, the expected value of the periodogram is the convolution of the
power spectrum $P_{xx}(f)$ with the Fourier transform of the Bartlett window,
and therefore, the periodogram is a biased estimate.
Since $W_B(f) \to \delta(f)$ when N → ∞, the periodogram is asymptotically
unbiased
$$\lim_{N\to\infty} E\left\{\hat{P}_{per}(f)\right\} = P_{xx}(f)$$
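
The closed form of the Bartlett window spectrum, and its link to the rectangular window, can be verified numerically. A minimal sketch follows, with an assumed N = 16 and hypothetical function names; it compares the DTFT of w_B(k) = 1 − |k|/N with (1/N)(sin πfN / sin πf)² and with |W(f)|²/N, where W(f) is the DTFT of the rectangular window.

```python
import cmath
import math

def bartlett_dtft(N, f):
    """DTFT of the Bartlett window w_B(k) = 1 - |k|/N, |k| <= N-1."""
    total = sum((1 - abs(k) / N) * cmath.exp(-2j * math.pi * f * k)
                for k in range(-(N - 1), N))
    return total.real   # the window is even, so its transform is real

def bartlett_closed_form(N, f):
    """(1/N) * (sin(pi f N) / sin(pi f))**2, valid for f not an integer."""
    return (math.sin(math.pi * f * N) / math.sin(math.pi * f)) ** 2 / N

def rectangular_dtft(N, f):
    """DTFT of the length-N rectangular window."""
    return sum(cmath.exp(-2j * math.pi * f * k) for k in range(N))

N = 16
test_frequencies = [0.03, 0.1, 0.27, 0.45]
for f in test_frequencies:
    print(f, bartlett_dtft(N, f), bartlett_closed_form(N, f),
          abs(rectangular_dtft(N, f)) ** 2 / N)
```

The Bartlett spectrum is the squared magnitude of the rectangular window spectrum divided by N, which is why it behaves as a squared sinc and is never negative.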



Example: Sinusoid in WGN
x(n) = A sin(nω0 + Φ) + w(n), A = 5, ω0 = 0.4π

[Figure: overlay of 50 periodograms (left) and the periodogram average (right), Magnitude (dB) versus Frequency (units of pi); top row N = 64, bottom row N = 256.]



Periodogram resolution: Two sinusoids in white noise
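This is a random process ($\Phi_1 \perp \Phi_2$, $w(n) \sim \mathcal{U}(0, \sigma_w^2)$) described by:

x(n) = A1 sin(nω1 + Φ1) + A2 sin(nω2 + Φ2) + w(n)

The true PSD is
$$P_{xx}(\omega) = \sigma_w^2 + \frac{1}{2}\pi A_1^2\left[\delta(\omega - \omega_1) + \delta(\omega + \omega_1)\right] + \frac{1}{2}\pi A_2^2\left[\delta(\omega - \omega_2) + \delta(\omega + \omega_2)\right]$$
The expected PSD $E\{\hat{P}_{per}(\omega)\}$ ($P_x * W_B$) becomes
$$\sigma_w^2 + \frac{1}{4}A_1^2\left[W_B(\omega - \omega_1) + W_B(\omega + \omega_1)\right] + \frac{1}{4}A_2^2\left[W_B(\omega - \omega_2) + W_B(\omega + \omega_2)\right]$$

⇒ there is a limit on how closely two sinusoids or two narrowband
processes may be located before they can no longer be resolved.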



Example: Estimation of two sinusoids in WGN

Based on previous example, try to generate these yourselves

x(n) = A1 sin(nω1 + Φ1) + A2 sin(nω2 + Φ2) + w(n)

where

◦ datalength N = 40, N = 64, N = 256

◦ A1 = A2, ω1 = 0.4π, ω2 = 0.45π

◦ A1 ≠ A2, ω1 = 0.4π, ω2 = 0.45π

◦ produce overlay plots of 50 periodograms and also averaged periodograms
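
One possible starting point for this exercise is sketched below in Python rather than Matlab. It is an illustration under stated assumptions: unit-variance Gaussian noise, A1 = A2 = 5, N = 256, a fixed random seed, and hypothetical helper names. It evaluates the periodogram only at three frequencies of interest (the two tones and a noise-only frequency) and averages over 50 realisations; plotting is left out.

```python
import cmath
import math
import random

def periodogram_at(x, omega):
    """Periodogram (1/N)|X(e^{j omega})|^2 evaluated at a single frequency."""
    N = len(x)
    X = sum(x[n] * cmath.exp(-1j * omega * n) for n in range(N))
    return abs(X) ** 2 / N

def realisation(N, A1, A2, w1, w2, rng):
    """One realisation of two sinusoids with random phases in unit-variance noise."""
    phi1 = rng.uniform(0, 2 * math.pi)
    phi2 = rng.uniform(0, 2 * math.pi)
    return [A1 * math.sin(n * w1 + phi1) + A2 * math.sin(n * w2 + phi2)
            + rng.gauss(0.0, 1.0) for n in range(N)]

rng = random.Random(0)                          # fixed seed for repeatability
N, trials = 256, 50
w1, w2 = 0.4 * math.pi, 0.45 * math.pi
omegas = [w1, w2, 0.7 * math.pi]                # two tones and a noise-only frequency

average = [0.0, 0.0, 0.0]
for _ in range(trials):
    x = realisation(N, 5.0, 5.0, w1, w2, rng)
    for i, omega in enumerate(omegas):
        average[i] += periodogram_at(x, omega) / trials

print([10 * math.log10(p) for p in average])    # averaged periodogram in dB
```

Evaluating the same function on a dense grid of ω for each realisation gives the overlay plots; repeating with N = 40 and N = 64 shows when the two peaks merge.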



Example: Periodogram resolution – two sinusoids
see also Problem 4.6 in your Problem/Answer set

[Figure: overlay of 50 periodograms (left) and the periodogram average (right), Magnitude (dB) versus Frequency (units of pi); top row N = 40, bottom row N = 64.]



Effects of the Window Choice
Recall: The spectrum of the (rectangular) window is a sinc which has a
main lobe and sidelobes
All the other window functions (addressed later) also have the
mainlobe and sidelobes.

◦ The effect of the main lobe (its width) is to smear or smooth the
estimated spectrum shape

◦ From the previous slide: the width of the mainlobe causes the next peak
in the spectrum to be masked if the two peaks are not separated by
1/N - the spectral resolution

◦ The sidelobes cause spectral leakage, transferring power from the
correct frequency bin into the frequency bins which contain no signal
power

These effects are dangerous, e.g. when estimating peaky spectra



Some observations
◦ The Bartlett window biases the periodogram;

◦ It also introduces smoothing, which limits the ability of the periodogram to resolve closely–spaced narrowband components in x(n);

◦ This is due to the width of the main lobe of WB (f );

◦ Periodogram averaging would reduce the variance (remember MVU estimators!)

◦ Resolution of the periodogram
– set ∆ω = width of the main lobe of the spectral window, at its "half power"
– for the Bartlett window ∆ω ∼ 0.89(2π/N ) = periodogram resolution!
– notice that the resolution is inversely proportional to the amount of data N



Variance of the periodogram
◦ Difficulty: it is difficult to evaluate the variance of the periodogram of an
arbitrary process x(n) since the variance depends on the fourth–order
moments of the process.
◦ However: the variance may be evaluated in the special case of WGN →
$$E\left\{\hat{P}_{per}(f_1)\hat{P}_{per}(f_2)\right\} = \left(\frac{1}{N}\right)^2\sum_{k}\sum_{l}\sum_{m}\sum_{n} E\{x(k)x(l)x(m)x(n)\} \times e^{-j2\pi[f_1(k-l) + f_2(m-n)]}$$

For WGN, these fourth–order moments become
$$E\{x(k)x(l)x(m)x(n)\} = E\{x(k)x(l)\}E\{x(m)x(n)\} + E\{x(k)x(m)\}E\{x(l)x(n)\} + E\{x(k)x(n)\}E\{x(l)x(m)\}$$
$$= \sigma_x^4\left[\delta(k-l)\delta(m-n) + \delta(k-m)\delta(l-n) + \delta(k-n)\delta(l-m)\right]$$

This is $\sigma_x^4$ if k = l, m = n, or k = m, l = n, or k = n, l = m, and 0 otherwise.



Variance of the periodogram – contd.

After some simplifications, and recognising
$$\frac{1}{N^2}\sum_{k=0}^{N-1}\sum_{m=0}^{N-1}\sigma_x^4 = \sigma_x^4$$
we have the variance of the periodogram for a given frequency:
$$var\left\{\hat{P}_{per}(f)\right\} = P_{xx}^2(f)\left[1 + \left(\frac{\sin 2\pi Nf}{N\sin 2\pi f}\right)^2\right]$$

For the periodogram to be consistent, $var(\hat{P}_{per}) \to 0$ as N → ∞.

From the above, this is not the case ⇒ the periodogram estimator is
inconsistent. In fact, $var(\hat{P}_{per}(f)) = P_x^2(f)$, which is quite large.
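
Evaluating the variance expression for growing N makes the inconsistency concrete. This is a small sketch with a hypothetical function name, assuming unit-variance WGN (so that P_xx(f) = 1) and an arbitrary test frequency f = 0.2.

```python
import math

def periodogram_variance_wgn(N, f, sigma2=1.0):
    """Variance of the periodogram of WGN with PSD sigma2 at normalised frequency f."""
    ratio = math.sin(2 * math.pi * N * f) / (N * math.sin(2 * math.pi * f))
    return sigma2 ** 2 * (1.0 + ratio ** 2)

f_test = 0.2
variances = {N: periodogram_variance_wgn(N, f_test) for N in (64, 256, 1024, 4096)}
for N, v in variances.items():
    print(N, v)
```

The bracketed term tends to 1 rather than 0, so the variance settles at the square of the PSD regardless of how much data is collected.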



Example: Periodogram of white noise
[Figure: periodograms of white noise for N = 64, N = 128 and N = 256 (two rows of panels), Magnitude (dB) versus Frequency (units of pi).]
$$P_{xx} = 1, \qquad E\{\hat{P}_{per}(e^{j\omega})\} = 1, \qquad var\left[\hat{P}_{per}(e^{j\omega})\right] = 1$$

Although the periodogram is unbiased, the variance is equal to a
constant, that is, independent of the data length N



Bias vs variance
Recall that for any estimator, its mean square error (MSE) is given by:

$$\mathrm{MSE} = \mathrm{bias}^2 + \mathrm{variance}$$

A way to overcome periodogram limitations:

◦ bias performance must be traded for variance performance

◦ the dataset is divided up into independent blocks

◦ the periodograms for every block may be averaged

◦ the resultant estimator is termed the averaged periodogram

$$\hat{P}_{aver,per}(f) = \frac{1}{L}\sum_{m=0}^{L-1}\hat{P}_{per}^{(m)}(f)$$

(here L denotes the number of blocks)

From Estimation Theory: averaging of random trials reduces noise power!


Bias vs variance – recap
◦ Bias pertains to the question: “Does the expected value of the estimate
approach the correct value as N → ∞?”
  - If yes, then the estimator is asymptotically unbiased, else it is biased
  - Notice that the main lobe of the window has a width of 2π/N, and
    hence when N → ∞ we have $\lim_{N\to\infty} E\{\hat{P}_{per}(f)\} = P_{xx}(f)$ ⇒ the
    periodogram is an asymptotically unbiased estimator of the true PSD.
  - For the window to yield an unbiased estimator:
    $\sum_{n=0}^{N-1} w^2(n) = N$ and the mainlobe width ∼ 1/N

◦ Variance refers to the “goodness” of the estimate, that is, whether the
power of the estimation error tends to zero when N → ∞.
  - We have shown that even for a very large window the variance of the
    estimate is as large as the square of the true PSD
  - This means that the periodogram is not a consistent estimator of the
    true PSD.


Properties of the standard periodogram
Functional relationship:

$$\hat{P}_{per}(\omega) = \frac{1}{N}\left|\sum_{n=0}^{N-1} x[n]e^{-jn\omega}\right|^2$$

◦ Bias

$$E\left\{\hat{P}_{per}(\omega)\right\} = \frac{1}{2\pi} P_x(\omega) * W_B(\omega)$$

◦ Resolution

$$\Delta\omega = 0.89\,\frac{2\pi}{N}$$

◦ Variance

$$\mathrm{Var}\left\{\hat{P}_{per}(\omega)\right\} \approx P_x^2(\omega)$$
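The functional relationship can be written out in a few lines of code. This is a minimal sketch in plain Python, using a direct O(N²) DFT on the grid ω_k = 2πk/N rather than an FFT; the test signal (a sinusoid at 0.25π with N = 64) is an assumption chosen so that the frequency falls exactly on a DFT bin.

```python
import cmath
import math

def periodogram(x):
    """Periodogram (1/N)|X(omega_k)|^2 on the N-point DFT grid omega_k = 2*pi*k/N."""
    N = len(x)
    out = []
    for k in range(N):
        omega = 2 * math.pi * k / N
        s = sum(x[n] * cmath.exp(-1j * omega * n) for n in range(N))
        out.append(abs(s) ** 2 / N)
    return out

N = 64
x = [math.sin(0.25 * math.pi * n) for n in range(N)]   # 0.25*pi corresponds to bin 8
P = periodogram(x)
peak_bin = max(range(N // 2 + 1), key=lambda k: P[k])   # strongest bin in [0, pi]
avg_power = sum(v * v for v in x) / N                   # time-domain average power
```

With this normalisation the periodogram values averaged over the N bins equal the average power of the data (Parseval), which is a quick sanity check on the 1/N scaling.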


Part 3: Periodogram Modifications



Periodogram modifications → some intuition
Clearly, we need to reduce the variance of the periodogram, since in
general it is not adequate for precise estimation of the PSD.
We can think of several modifications:

1) averaging over a set of periodograms (we have already seen the
effect of this in some simulations).
Recall from general estimation theory that by averaging M independent
estimates we have the effect of var → var/M.

2) applying different windows → it is possible to choose or design a
window which will have a narrow mainlobe

3) overlapping windowed segments for additional variance reduction →
averaging periodograms along one realisation of a random process
(instead of across the ensemble)


Periodogram Based Methods
Overview of Periodogram Modifications

◦ Windowing → Modified Periodogram
◦ Averaging → Bartlett’s Method
◦ Windowing + Averaging + Overlapping windows → Welch’s Method
Windowing: The Modified Periodogram

Windowing → reduction of the “edge effects”

Windowing mitigates the problem of spurious high frequency components in the spectrum.

The Modified Periodogram
The periodogram of a process that is windowed with a suitable general
window w[n] is called a modified periodogram and is given by:

$$\hat{P}_M(\omega) = \frac{1}{NU}\left|\sum_{n=-\infty}^{\infty} x[n]w[n]e^{-jn\omega}\right|^2$$

where N is the window length and $U = \frac{1}{N}\sum_{n=0}^{N-1}|w[n]|^2$ is a constant,
defined so that $\hat{P}_M(\omega)$ is asymptotically unbiased.
In Matlab:

    xw = x(n1:n2).*w/norm(w);   % window the N samples; norm(w)^2 = N*U
    Pm = N * periodogram(xw);   % periodogram() assumed to return |FFT|.^2/N

where, for different windows

    w = hanning(N);  w = bartlett(N);  w = blackman(N);   % column vectors of length N
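To make the role of the normalisation constant U concrete, here is a small plain-Python sketch of the modified periodogram at a single frequency. The function names are hypothetical, and the Hamming window is taken in the form 0.54 − 0.46 cos(2πn/N) that the lecture uses in its later two-sinusoid example.

```python
import cmath
import math

def hamming(N):
    """Hamming window in the form 0.54 - 0.46 cos(2 pi n / N)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / N) for n in range(N)]

def modified_periodogram(x, w, omega):
    """(1/(N U)) |sum_n x[n] w[n] e^{-j n omega}|^2 with U = (1/N) sum_n |w[n]|^2."""
    N = len(x)
    U = sum(wn * wn for wn in w) / N
    s = sum(x[n] * w[n] * cmath.exp(-1j * omega * n) for n in range(N))
    return abs(s) ** 2 / (N * U)

N = 128
x = [math.sin(0.3 * math.pi * n) for n in range(N)]
w = hamming(N)
U = sum(v * v for v in w) / N                       # window power, about 0.3974
p_rect = modified_periodogram(x, [1.0] * N, 0.3 * math.pi)
p_hamm = modified_periodogram(x, w, 0.3 * math.pi)
```

With a rectangular window U = 1 and the expression reduces to the standard periodogram; for any other window, dividing by U compensates for the power removed by the taper.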


The Modified Periodogram – “Windowing”
Recall that

$$\text{Periodogram} \sim \left|\mathcal{F}\{x[n]w_r[n]\}\right|^2$$

where $w_r[n]$ is the rectangular window.

Therefore: The amount of smoothing in the periodogram is determined
by the window that is applied to the data. For instance, a rectangular
window has a narrow main lobe (and hence the least amount of spectral
smoothing), but its relatively large sidelobes may lead to masking of weak
narrowband components.
Question: Would there be any benefit of using a different data window on
the bias and resolution of the periodogram?
Example: can we differentiate between the following two sinusoids for
ω1 = 0.2π, ω2 = 0.3π, N = 128

$$x[n] = 0.1\sin(n\omega_1 + \Phi_1) + \sin(n\omega_2 + \Phi_2) + w[n]$$


Some common windows for different window lengths:

[Figure: for the rectangular, Bartlett and Hamming windows, four panels each: the 64-sample window in the time domain (Magnitude vs. Time sample) and its spectral leakage for 64-, 128- and 256-sample windows (dB vs. Normalised frequency).]


Example: Estimation of two sinusoids in WGN
Modified periodogram using Hamming window
Problem: Estimate the spectra of the following two sinusoids using: (a) the
standard periodogram; (b) the Hamming-windowed periodogram

$$x[n] = 0.1\sin(0.2\pi n + \Phi_1) + \sin(0.3\pi n + \Phi_2) + w[n], \qquad N = 128$$

Hamming window: $w[n] = 0.54 - 0.46\cos\left(2\pi\frac{n}{N}\right)$

[Figure: two panels, Magnitude (dB) vs. Frequency (units of pi). Left: expected value of the periodogram. Right: periodogram using the Hamming window.]
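The masking effect in this example can be checked with a short computation. The sketch below is a simplified, noise-free version of the problem with both phases set to zero (both simplifications are assumptions made here so that the result is deterministic). It compares the spectral level at the weak sinusoid (0.2π) with the level midway to the strong one (0.25π).

```python
import cmath
import math

def windowed_spectrum_db(x, w, omega):
    """Modified periodogram at one frequency, in dB."""
    N = len(x)
    U = sum(v * v for v in w) / N
    s = sum(x[n] * w[n] * cmath.exp(-1j * omega * n) for n in range(N))
    return 10 * math.log10(abs(s) ** 2 / (N * U))

N = 128
# Noise-free version of the lecture signal, zero phases (assumption)
x = [0.1 * math.sin(0.2 * math.pi * n) + math.sin(0.3 * math.pi * n) for n in range(N)]
rect = [1.0] * N
hamm = [0.54 - 0.46 * math.cos(2 * math.pi * n / N) for n in range(N)]

def contrast_db(w):
    """Level at the weak sinusoid minus the level in the valley at 0.25*pi."""
    return (windowed_spectrum_db(x, w, 0.2 * math.pi)
            - windowed_spectrum_db(x, w, 0.25 * math.pi))

print("rectangular:", contrast_db(rect), "dB")
print("Hamming:    ", contrast_db(hamm), "dB")
```

A larger contrast means the weak component stands out more clearly from the leakage of the strong one; the Hamming window trades a wider main lobe for this improvement.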


Properties of an ideal window function
Consider a window sequence w(n) whose Fourier transform is the squared
magnitude of the Fourier transform of another sequence v(n), that is

$$V(\omega) = \sum_{k=0}^{M-1} v(k)e^{-j\omega k} \;\rightarrow\; W(\omega) = |V(\omega)|^2 \quad \text{(non-negative)}$$

Then

$$\sum_{k=-(M-1)}^{M-1} w(k)e^{-j\omega k} = \sum_{n=0}^{M-1}\sum_{p=0}^{M-1} v(n)v(p)e^{-j\omega(n-p)}$$

$$= \sum_{k=-(M-1)}^{M-1}\left[\sum_{n=0}^{M-1} v(n)v(n-k)\right]e^{-j\omega k}, \qquad \text{for } v(k) = 0,\; k \notin [0, M-1]$$

This gives

$$w(k) = \sum_{n=0}^{M-1} v(n)v(n-k) = v(k) * v(-k) \;\Leftrightarrow\; W(\omega) \ge 0 \quad \text{(positive semidefinite)}$$

A window design should trade-off between smearing and leakage

For instance: weak sinewave + strong narrowband interference → leakage more detrimental than smearing
Homework: can we use optimisation to balance between smearing and leakage?
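The construction above can be verified numerically. In this minimal plain-Python sketch (the names `autocorrelate` and `dtft` are illustrative), v(n) is taken to be a rectangular sequence of length M = 16, so that w(k) is the triangular (Bartlett) window, and the transform of w(k) is checked to be real and non-negative on a frequency grid.

```python
import cmath
import math

def autocorrelate(v):
    """w(k) = sum_n v(n) v(n-k) for k = -(M-1), ..., M-1 (v is zero outside [0, M-1])."""
    M = len(v)
    w = {}
    for k in range(-(M - 1), M):
        w[k] = sum(v[n] * v[n - k] for n in range(M) if 0 <= n - k < M)
    return w

def dtft(w, omega):
    """W(omega) = sum_k w(k) e^{-j omega k} for a sequence stored by lag k."""
    return sum(wk * cmath.exp(-1j * omega * k) for k, wk in w.items())

M = 16
v = [1.0] * M                       # rectangular v(n)
w = autocorrelate(v)                # triangular window, w(k) = M - |k|
grid = [math.pi * i / 512 for i in range(513)]
W = [dtft(w, om) for om in grid]
min_real = min(val.real for val in W)        # should not go below zero
max_imag = max(abs(val.imag) for val in W)   # should vanish, since w(k) = w(-k)
```

Any other real v(n) could be substituted; the non-negativity of W(ω) follows from the autocorrelation structure, not from the particular choice of v.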


Several frequently used “cosine–type windows”
Idea: suppress sidelobes, perhaps sacrifice the width of the mainlobe
(in the Matlab expressions below, N is the window length)

◦ Hann window

    w = 0.5 * (1 - cos(2*pi*(0:N-1)'/(N-1)));

◦ Hamming window

    w = (54 - 46*cos(2*pi*(0:N-1)'/(N-1)))/100;

◦ Blackman window

    w = (42 - 50*cos(2*pi*(0:N-1)/(N-1)) + 8*cos(4*pi*(0:N-1)/(N-1)))'/100;


Performance of the modified periodogram
◦ Bias: Since

$$U = \frac{1}{N}\sum_{n=0}^{N-1}|w[n]|^2 = \frac{1}{2\pi N}\int_{-\pi}^{\pi}|W(e^{j\omega})|^2 d\omega \;\Rightarrow\; \frac{1}{2\pi N U}\int_{-\pi}^{\pi}|W(e^{j\omega})|^2 d\omega = 1$$

for N → ∞ the modified periodogram is asymptotically unbiased.

◦ Variance: Since $\hat{P}_M$ is simply $\hat{P}_{per}$ of a windowed data sequence

$$\mathrm{Var}\left\{\hat{P}_M(\omega)\right\} \approx P_{xx}^2(\omega)$$

⇒ not a consistent estimate of the power spectrum, and the data
window offers no benefit in terms of reducing the variance

◦ Resolution: The data window provides a trade–off between spectral
resolution (main lobe width) and spectral masking (sidelobe amplitude).


Periodogram modifications: Effects of different windows

Properties of several commonly used windows with length N :

◦ Rectangular – Sidelobe level = -13 [dB], 3 dB BW → 0.89(2π/N )

◦ Bartlett – Sidelobe level = -27 [dB], 3 dB BW → 1.28(2π/N )

◦ Hanning – Sidelobe level = -32 [dB], 3 dB BW → 1.44(2π/N )

◦ Hamming – Sidelobe level = -43 [dB], 3 dB BW → 1.30(2π/N )

◦ Blackman – Sidelobe level = -58 [dB], 3 dB BW → 1.68(2π/N )


Notice the relationship between the sidelobe level and bandwidth!

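The sidelobe levels listed above can be estimated numerically from the window spectrum. The following plain-Python sketch (the function `peak_sidelobe_db`, the grid density and the window length N = 64 are assumptions made for illustration) walks down the main lobe to the first null and reports the largest magnitude beyond it, relative to the main-lobe peak.

```python
import cmath
import math

def peak_sidelobe_db(w, grid_points=2048):
    """Peak sidelobe of |W(omega)| on [0, pi], relative to the main-lobe peak, in dB."""
    mags = []
    for i in range(grid_points + 1):
        omega = math.pi * i / grid_points
        s = sum(wn * cmath.exp(-1j * omega * n) for n, wn in enumerate(w))
        mags.append(abs(s))
    # walk down the main lobe until the first local minimum (the first null)
    k = 1
    while k < grid_points and not (mags[k] <= mags[k - 1] and mags[k] <= mags[k + 1]):
        k += 1
    return 20 * math.log10(max(mags[k:]) / mags[0])

N = 64
rect = [1.0] * N
hamm = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
sl_rect = peak_sidelobe_db(rect)
sl_hamm = peak_sidelobe_db(hamm)
print("rectangular:", sl_rect, "dB;  Hamming:", sl_hamm, "dB")
```

For finite N the computed levels differ slightly from the rounded values in the list, but they show the same ordering: lower sidelobes come with a wider main lobe.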


Bartlett’s Method

Averaging → reduction in variance

Trade-off: frequency resolution vs. variance reduction

Partitioning the data set (K segments of length L each)

Partitioning x(n) into K non–overlapping segments


This way, the total length N = K × L



Bartlett’s method: Averaging periodograms
The averaged periodogram can be expressed as:

$$\hat{P}_{aver,per}(f) = \frac{1}{K}\sum_{i=1}^{K}\hat{P}_{per}^{(i)}(f)$$

where for each of the K segments, the segment-wise PSD estimate
$\hat{P}_{per}^{(i)}$, i = 1, . . . , K, is given by

$$\hat{P}_{per}^{(i)}(\omega) = \frac{1}{L}\left|\sum_{n=0}^{L-1} x_i[n]e^{-jn\omega}\right|^2$$

◦ Idea: to reduce the variance by the factor of “K” = total number of
blocks
◦ Therefore: provided that the blocks are statistically independent (not
often the case in practice) we desire to have

$$\mathrm{var}\left\{\hat{P}_{aver,per}(f)\right\} = \frac{1}{K}\mathrm{var}\left\{\hat{P}_{per}(f)\right\}$$
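The 1/K variance reduction can be observed in a small experiment. This is a plain-Python Monte Carlo sketch under assumed settings (unit-variance Gaussian white noise, N = 512, K = 8, 400 trials, evaluation at ω = π/4, fixed seed); the function names are hypothetical.

```python
import cmath
import math
import random

def periodogram_at(x, omega):
    """(1/N)|sum_n x[n] e^{-j n omega}|^2 at a single frequency."""
    s = sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x))
    return abs(s) ** 2 / len(x)

def bartlett_at(x, K, omega):
    """Average of the periodograms of K non-overlapping segments of length L = N // K."""
    L = len(x) // K
    segs = [x[i * L:(i + 1) * L] for i in range(K)]
    return sum(periodogram_at(seg, omega) for seg in segs) / K

def sample_variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / (len(vals) - 1)

rng = random.Random(1)
omega = math.pi / 4
N, K, trials = 512, 8, 400
per_vals, bart_vals = [], []
for _ in range(trials):
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]   # white noise: blocks are independent
    per_vals.append(periodogram_at(x, omega))
    bart_vals.append(bartlett_at(x, K, omega))
var_per = sample_variance(per_vals)     # expected to be close to 1
var_bart = sample_variance(bart_vals)   # expected to be close to 1/K
```

For white noise the blocks are independent, so the reduction is close to the ideal factor K; for correlated data the gain would be smaller.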


Example: Estimation of WGN spectrum using Bartlett’s method

[Figure: Magnitude (dB) vs. Frequency (units of pi), three columns: 50 periodograms with N = 512; 50 Bartlett estimates with K = 4, L = 128; 50 Bartlett estimates with K = 8, L = 64. Top row: overlaid estimates. Bottom row: ensemble averages.]


Performance evaluation of Bartlett’s method
◦ Bias: The expected value of Bartlett’s estimate

$$E\left\{\hat{P}_B(\omega)\right\} = \frac{1}{2\pi}P_x(\omega) * W_B(\omega)$$

⇒ asymptotically unbiased.

◦ Resolution: Because the data are split into K segments of length L, the
resolution of $\hat{P}_B$ is poorer than that of $\hat{P}_{per}$, that is

$$\mathrm{Res}\left[\hat{P}_B(\omega)\right] = 0.89\,\frac{2\pi}{L} = 0.89\,K\,\frac{2\pi}{N}$$

◦ Variance:

$$\mathrm{Var}\left\{\hat{P}_B(\omega)\right\} \approx \frac{1}{K}\mathrm{Var}\left\{\hat{P}_{per}^{(i)}(\omega)\right\} \approx \frac{1}{K}P_x^2(\omega)$$

For non–white data, the variance reduction is not as large as K times!
By changing the values of L and K, Bartlett’s method allows us to:
trade a reduction in spectral resolution for a reduction in variance


Example: Estimation of two sinewaves in white noise

$$x[n] = \sqrt{10}\sin(0.2\pi n + \Phi_1) + \sin(0.25\pi n + \Phi_2) + w[n]$$

[Figure: Magnitude (dB) vs. Frequency (units of pi), three columns: 50 periodograms with N = 512; 50 Bartlett estimates with K = 4, L = 128; 50 Bartlett estimates with K = 8, L = 64. Top row: overlaid estimates. Bottom row: ensemble averages.]

Notice the variance – resolution trade–off!


Welch’s Method

Overlapping windows + averaging

Achieves a good balance between resolution and variance

Welch’s method: Averaging modified periodograms
In 1967, Welch proposed two modifications to Bartlett’s method:

◦ to allow the sequences $x_i[n]$ to overlap

◦ to allow a data window w[n] to be applied to each sequence ⇒ averaging
modified periodograms

This way, successive segments are offset by D points and each segment is
L points long

$$x_i[n] = x[n + iD], \qquad n = 0, 1, \ldots, L-1$$

The amount of overlap between $x_i[n]$ and $x_{i+1}[n]$ is L − D points and

$$N = L + D(K-1)$$

N - total number of points, L - length of segments, D - offset between
successive segments (the overlap is L − D points), K - number of sequences


Variations on the theme
We may vary between no overlap, D = L, and say 50 % overlap, D = L/2,
or anything else.
◦ We can trade a reduction in the variance for a reduction in the
resolution, since

$$\hat{P}_W(\omega) = \frac{1}{KLU}\sum_{i=0}^{K-1}\left|\sum_{n=0}^{L-1}w[n]x[n+iD]e^{-jn\omega}\right|^2$$

or in terms of modified periodograms

$$\hat{P}_W(\omega) = \frac{1}{K}\sum_{i=0}^{K-1}\hat{P}_M^{(i)}(\omega)$$

which is asymptotically unbiased (follows from the bias of the modified
periodogram)


Welch vs. Bartlett

◦ The amount of overlap between $x_i[n]$ and $x_{i+1}[n]$ is L − D points, and if K
sequences cover the entire N data points, then
N = L + D(K − 1)

◦ If there is no overlap (D = L), we have K = N/L sections of length L, as in Bartlett’s
method
◦ If the sequences are overlapping by 50 % (D = L/2), then we may form K = 2N/L − 1
sections of length L, thus maintaining the same resolution as Bartlett’s method while
doubling the number of modified periodograms that are averaged, thereby reducing
the variance.
◦ With 50 % overlap we could also form K = N/L − 1 sequences of length 2L, thus
increasing the resolution while maintaining the same variance as Bartlett’s method.

Therefore, by allowing sequences to overlap, it is possible to increase the
number and/or length of the sequences that are averaged, thereby trading
a reduction in variance for a reduction in resolution.
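The segment bookkeeping and the Welch average can be sketched in a few lines. The plain-Python code below is illustrative only (the names `num_segments` and `welch_at` are assumptions); it counts the segments from N = L + D(K − 1) and averages the modified periodograms of the overlapping, windowed segments at a single frequency.

```python
import cmath
import math

def num_segments(N, L, D):
    """Largest K satisfying L + D*(K-1) <= N."""
    return (N - L) // D + 1

def welch_at(x, w, D, omega):
    """Average of modified periodograms of length-L segments offset by D samples."""
    L = len(w)
    K = num_segments(len(x), L, D)
    U = sum(v * v for v in w) / L
    total = 0.0
    for i in range(K):
        s = sum(w[n] * x[n + i * D] * cmath.exp(-1j * omega * n) for n in range(L))
        total += abs(s) ** 2
    return total / (K * L * U)

N, L = 512, 128
x = [math.sin(0.25 * math.pi * n) for n in range(N)]
k_no_overlap = num_segments(N, L, L)          # D = L: Bartlett-style partition
k_half_overlap = num_segments(N, L, L // 2)   # D = L/2: 50 % overlap
p_no_overlap = welch_at(x, [1.0] * L, L, 0.25 * math.pi)
```

With a rectangular window and D = L the function reduces to Bartlett's method, while D = L/2 nearly doubles the number of averaged segments for the same segment length.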


Properties of Welch’s method
◦ Functional relationship:

$$\hat{P}_W(\omega) = \frac{1}{KLU}\sum_{i=0}^{K-1}\left|\sum_{n=0}^{L-1}w[n]x[n+iD]e^{-jn\omega}\right|^2, \qquad U = \frac{1}{L}\sum_{n=0}^{L-1}|w[n]|^2$$

◦ Bias

$$E\left\{\hat{P}_W(\omega)\right\} = \frac{1}{2\pi LU}P_x(\omega) * |W(\omega)|^2$$

◦ Resolution → window dependent

◦ Variance (assuming 50 % overlap and Bartlett window)

$$\mathrm{Var}\left\{\hat{P}_W(\omega)\right\} \approx \frac{9}{16}\frac{L}{N}P_x^2(\omega)$$


Example: Two sinusoids in noise → Welch estimates
Problem: Estimate the spectra of the following two sinewaves using
Welch’s method

$$x[n] = 10\sin(0.2\pi n + \Phi_1) + \sin(0.3\pi n + \Phi_2) + w[n]$$

Unit noise variance, N = 512, L = 128, 50 % overlap (7 sections)

[Figure: two panels, Magnitude (dB) vs. Frequency (units of pi). Left: overlay of 50 estimates. Right: periodogram using Welch’s method.]


SSVEP in EEG → we look for a 14 Hz stimulus in a 50s recording using Welch’s method
Standard: A 50s EEG from scalp (Oz) and right ear (ITE). Averaged: 27 segments of 12s.

[Figure: power spectra vs. Frequency (Hz), 10 to 35 Hz, for the scalp electrode and the right ITE electrode. Top: no window. Bottom: Hann window.]


Blackman-Tukey Method

The periodogram can also be expressed in terms of the autocorrelation estimates:

$$\hat{P}_{per}(\omega) = \sum_{k=-(N-1)}^{N-1}\hat{r}_x[k]e^{-jk\omega}$$

Autocorrelation estimates at large lags are unreliable → windowing of the lags

Next: Can we extrapolate the autocorrelation estimates to lags beyond those available?

Blackman–Tukey method: Periodogram smoothing
Recall that the methods by Bartlett and Welch are designed to reduce the
variance of the periodogram by averaging periodograms and modified
periodograms, respectively.
Another possibility is “periodogram smoothing”, often called the
Blackman–Tukey method.
Let us identify the problem:

$$\hat{r}_x[N-1] = \frac{1}{N}x[N-1]x[0]$$

⇒ there is little averaging when calculating the estimates of $\hat{r}_x[k]$ for
|k| ≈ N.
These estimates will be unreliable no matter how large N is. We have two
choices:
◦ reduce the variance of those unreliable estimates

◦ reduce the contribution these unreliable estimates make to the
periodogram


Blackman–Tukey Method: Resolution vs. Variance
The variance of the periodogram is decreased by reducing the variance of
the ACF estimate, by calculating more robust ACF estimates over fewer
data points (M < N).
⇒ Apply a window to $\hat{r}_x[k]$ to decrease the contribution of unreliable
estimates and obtain the Blackman–Tukey estimate:

$$\hat{P}_{BT}(\omega) = \sum_{k=-M}^{M}\hat{r}_x[k]w[k]e^{-jk\omega}$$

where w[k] is a lag window applied to the ACF estimate.

$$\hat{P}_{BT}(\omega) = \frac{1}{2\pi}\hat{P}_{per}(\omega) * W(\omega) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\hat{P}_{per}(e^{ju})W(e^{j(\omega-u)})\,du$$

that is, we trade the reduction in the variance for a reduction in the
resolution (smaller number of ACF estimates used to calculate the PSD)


Properties of the Blackman–Tukey method
◦ Functional relationship:

$$\hat{P}_{BT}(\omega) = \sum_{k=-M}^{M}\hat{r}_x[k]w[k]e^{-jk\omega}$$

◦ Bias

$$E\left\{\hat{P}_{BT}(\omega)\right\} \approx \frac{1}{2\pi}P_x(\omega) * W(\omega)$$

◦ Resolution: window dependent (window conjugate symmetric and
with non–negative FT)
◦ Variance: Generally, it is recommended that M < N/5.

$$\mathrm{Var}\left\{\hat{P}_{BT}(\omega)\right\} \approx P_x^2(\omega)\frac{1}{N}\sum_{k=-M}^{M}w^2[k]$$

Trade–off: for a small bias M needs to be large to minimize the width
of the mainlobe of W(ω), whereas M should be small in order to
minimize the variance.
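A direct implementation of the functional relationship helps to see how the lag window enters. The plain-Python sketch below makes several assumptions that are not stated on the slide: real-valued data, the biased ACF estimate, and a Bartlett (triangular) lag window w[k] = 1 − |k|/M, whose Fourier transform is non-negative. The function names and the test data (white noise with a fixed seed) are illustrative.

```python
import math
import random

def biased_acf(x, max_lag):
    """r[k] = (1/N) sum_{n=k}^{N-1} x[n] x[n-k] for k = 0..max_lag (real-valued data)."""
    N = len(x)
    return [sum(x[n] * x[n - k] for n in range(k, N)) / N for k in range(max_lag + 1)]

def blackman_tukey(x, M, omegas):
    """Blackman-Tukey estimate with a Bartlett lag window w[k] = 1 - |k|/M."""
    r = biased_acf(x, M)
    est = []
    for om in omegas:
        # real data: r[-k] = r[k], so the two-sided sum collapses to cosines
        val = r[0] + 2 * sum(r[k] * (1 - k / M) * math.cos(k * om) for k in range(1, M + 1))
        est.append(val)
    return est

rng = random.Random(2)
N, M = 256, 32
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
omegas = [math.pi * i / 64 for i in range(65)]
P_bt = blackman_tukey(x, M, omegas)
```

Because both the biased ACF and the triangular lag window have non-negative transforms, the resulting estimate stays non-negative, which is the reason for the requirement on the window in the resolution bullet.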


Non-negative definiteness of the BT spectrum estimator
see also Problem 4.9 in your Problem/Answer set

The main problem with the periodogram is its high statistical variability. This
arises from:

◦ Poor accuracy of the autocorrelation estimate for large lags m

◦ Accumulation of these errors in the spectrum estimate

These effects can be mitigated by taking fewer points (M instead of N) in
ACF estimation.
Observe that the Blackman–Tukey spectral estimator corresponds to a
locally weighted average of the periodogram.
Roughly speaking:

  - the resolution of the BT estimator is ∼ 1/M

  - the variance of the BT estimator is ∼ M/N


Performance comparison of periodogram–based methods
Let us introduce criteria for performance comparison:

◦ Variability of the estimate

$$\nu = \frac{\mathrm{var}\left\{\hat{P}_x(\omega)\right\}}{E^2\left\{\hat{P}_x(\omega)\right\}}$$

which is effectively the normalised variance

◦ Figure of merit

$$\mathcal{M} = \nu \times \Delta\omega$$

that is, the product of variability and resolution.

$\mathcal{M}$ should be as small as possible.


Performance measures for the Nonparametric methods of Spectrum Estimation

Method            Variability ν    Resolution ∆ω     Figure of merit M
----------------  ---------------  ----------------  -----------------
Periodogram       1                0.89 (2π/N)       0.89 (2π/N)
Bartlett          1/K              0.89 K (2π/N)     0.89 (2π/N)
Welch             (9/8)(1/K)       1.28 (2π/L)       0.72 (2π/N)
Blackman–Tukey    (2/3)(M/N)       0.64 (2π/M)       0.43 (2π/N)

◦ Observe that each method has a Figure of Merit which is approximately
the same

◦ Figures of merit are inversely proportional to N

◦ Although each method differs in its resolution and variance, the overall
performance is fundamentally limited by the amount of data that
is available.
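The last column of the table is simply the product of the two preceding ones, which can be checked with a few lines of arithmetic. In this plain-Python sketch the function name and the example values (N = 512, K = 8, L = 128, M = 64) are assumptions, and the Welch row uses K ≈ 2N/L, that is, 50 % overlap with the "− 1" neglected.

```python
import math

def figures_of_merit(N, K_bartlett, L_welch, M_bt):
    """Figure of merit = variability x resolution for each method in the table."""
    def two_pi_over(n):
        return 2 * math.pi / n

    K_welch = 2 * N / L_welch      # approximate number of segments at 50 % overlap
    rows = {
        "Periodogram": (1.0, 0.89 * two_pi_over(N)),
        "Bartlett": (1.0 / K_bartlett, 0.89 * K_bartlett * two_pi_over(N)),
        "Welch": (9.0 / (8.0 * K_welch), 1.28 * two_pi_over(L_welch)),
        "Blackman-Tukey": (2.0 * M_bt / (3.0 * N), 0.64 * two_pi_over(M_bt)),
    }
    return {name: nu * dw for name, (nu, dw) in rows.items()}

N = 512
fom = figures_of_merit(N, K_bartlett=8, L_welch=128, M_bt=64)
unit = 2 * math.pi / N
for name, value in fom.items():
    print(name, value / unit, "x 2*pi/N")
```

The segment length, the number of segments and the lag-window length all cancel in the products, leaving only a constant times 2π/N, which is why the amount of data is the limiting factor.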


Conclusions
FFT based spectral estimation is limited by:
◦ correlation assumed to be zero beyond N - biased/unbiased estimates

◦ resolution limited by the DFT “baggage”

◦ if two frequencies are separated by ∆f, then we need N ≥ 1/∆f data
points to separate them

◦ limitations for spectra with narrow peaks (resonances, speech, sonar)

◦ limit on the resolution imposed by N also causes bias

◦ variance of the periodogram is almost independent of data length

◦ the derived variance formulae are only illustrative for real–world signals
But also many opportunities: spectral coherency, spectral entropy, TF, ...
Next time: model based spectral estimation for discrete spectral lines


Appendix: Spectral Coherence and LS Periodogram
see also Problem 4.7 in your P/A sets

The spectral coherence shows similarity between two spectra

$$C_{xy}(\omega) = \frac{P_{xy}(\omega)}{\left[P_{xx}(\omega)P_{yy}(\omega)\right]^{1/2}}$$

It is invariant to linear filtering of x and y (even with different filters)
The periodogram $P_{per}(\omega)$ can be seen as a Least Squares solution to

$$P_{per}(\omega) = \|\hat{\beta}(\omega)\|^2, \qquad \hat{\beta}(\omega) = \arg\min_{\beta(\omega)}\sum_{n=1}^{N}\|y(n) - \beta e^{j\omega n}\|^2$$

Periodogram and LS periodogram for a sinewave mixture (100, 400, 410) Hz

[Figure: three panels. Left: the time series (frequencies 100, 400 and 410 Hz). Middle: classic periodogram. Right: LS periodogram. Spectra shown as Power/frequency (dB/Hz) vs. Frequency (Hz).]
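The least squares view can be illustrated on a single complex exponential. In the plain-Python sketch below (function names and test signal are assumptions; samples are indexed n = 0, ..., N − 1 rather than 1, ..., N), the closed-form minimiser of the criterion is the normalised Fourier sum, so that its squared magnitude is proportional to the periodogram, equal to it up to a factor of N under the (1/N)|·|² definition used earlier.

```python
import cmath
import math

def ls_amplitude(y, omega):
    """LS fit of y(n) ~ beta * e^{j omega n}: beta_hat = (1/N) sum_n y(n) e^{-j omega n}."""
    N = len(y)
    return sum(y[n] * cmath.exp(-1j * omega * n) for n in range(N)) / N

def residual(y, omega, beta):
    """Sum of squared errors of the fit y(n) ~ beta * e^{j omega n}."""
    return sum(abs(y[n] - beta * cmath.exp(1j * omega * n)) ** 2 for n in range(len(y)))

N = 64
omega0 = 0.3 * math.pi
amp = 2 - 1j                                        # complex amplitude to be recovered
y = [amp * cmath.exp(1j * omega0 * n) for n in range(N)]
beta_hat = ls_amplitude(y, omega0)
```

Perturbing the estimate in any direction increases the residual, which confirms that the Fourier sum is the least squares solution at that frequency.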


Appendix: Time-Frequency estimation
Time–frequency spectrogram of “Matlab” → ‘specgramdemo‘

[Figure: spectrogram, frequency vs. time, with the sounds “M-aaa-t-l-aaa-b” marked along the time axis.]

For every time instant “t”, the PSD is plotted along the vertical axis
Darker areas: higher magnitude of PSD


Appendix: Time-Frequency (TF) analysis - Principles
Assume x(n) has a Fourier transform X(ω) and power spectrum |X(ω)|².
The function TF(n, ω) determines how the energy is distributed in
time-frequency, and it satisfies the following marginal properties:

$$\sum_{n=-\infty}^{\infty}TF(n,\omega) = |X(\omega)|^2 \qquad \text{energy in the signal at frequency } \omega$$

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}TF(n,\omega)\,d\omega = |x(n)|^2 \qquad \text{energy at time instant } n \text{ due to all } \omega$$

Then

$$\frac{1}{2\pi}\sum_{n=-\infty}^{\infty}\int_{-\pi}^{\pi}TF(n,\omega)\,d\omega = \sum_{n=-\infty}^{\infty}|x(n)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|X(\omega)|^2\,d\omega$$

giving the total energy (all frequencies and samples) of a signal.

[Figure: sketch of the time–frequency plane, with time (k) and frequency (ω) axes.]


Time–frequency spectrogram of a speech signal

[Figure: two spectrograms of the same speech signal (Data = [4001x1], Fs = 7.418 kHz), Frequency (kHz) vs. Time (ms), colour scale in dB, with the waveform amplitude shown underneath. Left: wide band spectrogram (win-len=256, overlap=200, fft-len=32). Right: narrow band spectrogram (win-len=512, overlap=200, fft-len=256).]

Homework: evaluate all the methods from the lecture for this T-F spectrogram


TF spectrogram of a frequency-modulated signal
(check also your coursework)
The time-frequency spectrogram of a frequency modulated (FM) signal

$$y(t) = A\cos\left(\omega_0 t + k_f\int_{-\infty}^{t}x(\alpha)\,d\alpha\right)$$

[Figure: spectrogram of the FM signal, frequency vs. time.]


Opportunities: ARMA spectrum
N = 512 samples, freq. res = 1/500

[Figure: mean (± std) of the Blackman–Tukey estimates (top row) and of the Welch estimates (bottom row) for M = 128, M = 32 and M = 16, plotted against Frequency.]

Signal: ARMA(4,4), b=[1, 0.3544, 0.3508, 0.1736, 0.2401] a=[1, -1.3817, 1.5632, -0.8843, 0.4096]
Sometimes we only desire the correct position of the peaks → ARMA Spectrum Estimation


A note on positive-semidefiniteness of the Rxx
The autocorrelation matrix $\mathbf{R}_{xx} = E\{\mathbf{x}\mathbf{x}^T\}$,
where $\mathbf{x} = [x[0], \ldots, x[N-1]]^T$. It is symmetric and of size N × N.

There are four ways to define positive semidefiniteness: (see also
your Problem-Answer sets)
1. All the eigenvalues of the autocorrelation matrix R are such that
λi ≥ 0, for i = 1, . . . , N
2. For any nonzero vector $\mathbf{a} \in \mathbb{R}^{N\times 1}$ we have $\mathbf{a}^T\mathbf{R}\mathbf{a} \ge 0$. For complex
valued matrices, the condition becomes $\mathbf{a}^H\mathbf{R}\mathbf{a} \ge 0$
3. There exists a matrix U such that $\mathbf{R} = \mathbf{U}\mathbf{U}^T$, where the matrix U is
called a root of R
4. All the principal submatrices of R are positive semidefinite. A principal
submatrix is formed by removing rows of R together with the corresponding columns
For positive definiteness conditions, replace ≥ with >


Opportunities: Spectral Entropy
Spectral entropy can be used to measure the peakiness of the spectrum.
This is achieved via the probability mass function (PMF) (normalised PSD) given by

$$\eta[i] = \frac{P_{per}[i]}{\sum_{l=0}^{N-1}P_{per}[l]} \;\rightarrow\; H_{sp} = -\sum_{i=0}^{N-1}\eta[i]\log_2\eta[i] = \sum_{i=0}^{N-1}\eta[i]\log_2\frac{1}{\eta[i]}$$

Intuition:
- peaky spectrum (e.g. sin(x)) → low spectral entropy
- flat spectrum (e.g. WGN) → high spectral entropy

[Figure, titled ’That is correct’: four panels against Time (s). From top to bottom: a) clean speech, b) spectral entropy, c) speech + noise, d) spectral entropy of (speech + noise).]
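The two extreme cases in the intuition above are easy to check. This is a minimal plain-Python sketch with a hypothetical function name and toy 64-bin spectra; it follows the convention that terms with η[i] = 0 contribute nothing to the sum.

```python
import math

def spectral_entropy(pxx):
    """H = -sum_i eta[i] log2 eta[i], with eta the PSD normalised to sum to one."""
    total = sum(pxx)
    eta = [p / total for p in pxx]
    return sum(-e * math.log2(e) for e in eta if e > 0.0)   # 0*log(0) is taken as 0

flat = [1.0] * 64          # flat, white-noise-like spectrum
peaky = [0.0] * 64
peaky[8] = 5.0             # all power concentrated in a single bin

h_flat = spectral_entropy(flat)     # maximum possible value, log2(64)
h_peaky = spectral_entropy(peaky)   # minimum possible value
print(h_flat, h_peaky)
```

Real spectra fall between these two limits, so the entropy can serve as a single number summarising how concentrated the power is.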


Appendix: Practical issues in correlation and spectrum estimation

[Figure: three rows (rectangle, sinewave, exponentially-decaying sinewave), each showing the signal (vs. Time sample), its ACF (vs. Time delay) and its spectrum (Power vs. Normalised frequency).]


Appendix: Trade-off in window design
Window length → trade-off between spectral resolution and statistical variance

◦ Most windows take non-negative values in both time and frequency

◦ They also peak at the origin in both domains
For this type of window we can define:
◦ An equivalent time width Nw (Nw ≈ 2M for the rectangular and
Nw ≈ M for the triangular window)
◦ An equivalent bandwidth Bw (≈ determined by the window’s length), as

$$N_w = \frac{\sum_{k=-(M-1)}^{M-1}w(k)}{w(0)}, \qquad B_w = \frac{\frac{1}{2\pi}\int_{-\pi}^{\pi}W(\omega)\,d\omega}{W(0)}$$

We also know that

$$W(0) = \sum_{k=-\infty}^{\infty}w(k) = \sum_{k=-(M-1)}^{M-1}w(k) \qquad \text{and} \qquad w(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi}W(\omega)\,d\omega$$

It then follows that Nw × Bw = 1

A window cannot be both time-limited and band-limited, usually M ≤ N/10


Appendix: More on time–bandwidth products
The previous slide assumes that both w(n) and W(ω) peak at the origin →
most energy concentrated in the main lobe, whose width should be ∼ 1/M.
For a general signal: x(n) and X(ω) can be negative or complex.
If x(n) peaks at n0 (cf. X(ω) at ω0) →

$$N_x = \frac{\sum_{n=-\infty}^{\infty}|x(n)|}{|x(n_0)|}, \qquad B_x = \frac{\frac{1}{2\pi}\int_{-\pi}^{\pi}|X(\omega)|\,d\omega}{|X(\omega_0)|}$$

Because x(n) and X(ω) are Fourier transform pairs:

$$|X(\omega_0)| = \left|\sum_{n=-\infty}^{\infty}x(n)e^{-j\omega_0 n}\right| \le \sum_{n=-\infty}^{\infty}|x(n)|$$

$$|x(n_0)| = \left|\frac{1}{2\pi}\int_{-\pi}^{\pi}X(\omega)e^{j\omega n_0}\,d\omega\right| \le \frac{1}{2\pi}\int_{-\pi}^{\pi}|X(\omega)|\,d\omega$$

This implies
Nx × Bx ≥ 1 (a sequence cannot be narrow in both time and frequency)

More precisely: if the sequence is narrow in one domain then it
must be wide in the other domain (uncertainty principle)


Appendix: STFT of a speech signal

[Figure: two spectrograms of the same speech signal (Data = [4001x1], Fs = 7.418 kHz), Frequency (kHz) vs. Time (ms), colour scale in dB, with the waveform amplitude shown underneath. Left: wide band spectrogram (win-len=256, overlap=200, fft-len=32). Right: narrow band spectrogram (win-len=512, overlap=200, fft-len=256).]

