SOLVED EXERCISES
Index
Chapter I
Chapter II
Chapter III
Chapter IV
Chapter V
Chapter VI
CHAPTER I
I.8.1.-
$\mathbf{R}_{n+1} = \beta\,\mathbf{R}_n + \alpha\,\mathbf{X}_n\mathbf{X}_n^H$
a) Each factor $\mathbf{X}_m\mathbf{X}_m^H$ contributes to the estimate with a coefficient equal to $\alpha\beta^{\,n-m}$. This is easy to check just by considering the IIR filter with numerator equal to $\alpha$ and denominator equal to $1-\beta z^{-1}$. The effective length is the lag at which the coefficient decays to a given value, say $1/e$; in consequence the effective length $N$ satisfies:

$\beta^N = 1/e \quad\Rightarrow\quad N = -\dfrac{1}{\ln(\beta)}$

Since $-\ln(\beta) \simeq (1-\beta)$ for $\beta$ close to one,

$N \approx \dfrac{1}{1-\beta}$
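As a quick numerical check of the approximation (a minimal sketch; the value of beta is an arbitrary illustrative choice):

%-------------------------------------------------------------------------
% Effective length of the exponential window R(n+1) = beta*R(n) + alpha*Xn*Xn'
beta = 0.99;                        % forgetting factor (illustrative value)
N_exact  = -1/log(beta);            % lag at which beta^N = 1/e
N_approx = 1/(1-beta);              % first-order approximation for beta near 1
fprintf('exact: %.1f  approx: %.1f\n', N_exact, N_approx);
%-------------------------------------------------------------------------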
b) $H(z) = \dfrac{\alpha}{1-\beta z^{-1}}$, i.e. the estimate is the output of this filter driven by the instantaneous product:

$\mathbf{X}_n\mathbf{X}_n^H \;\longrightarrow\; H(z) \;\longrightarrow\; \mathbf{R}_n$
c) $H(z) = \alpha\sum_{q=0}^{\infty}\beta^q z^{-q}$ and $\mathbf{R}_n = \sum_{q=0}^{\infty}\alpha\beta^q\,\mathbf{X}_{n-q}\mathbf{X}_{n-q}^H$
d) $E(\mathbf{R}_{n+1}) = \beta\,E(\mathbf{R}_n) + \alpha\,E(\mathbf{X}_{n+1}\mathbf{X}_{n+1}^H)$. Thus, in the steady state,
$E(\mathbf{R}_n) = \dfrac{\alpha}{1-\beta}\,E(\mathbf{X}_{n+1}\mathbf{X}_{n+1}^H) = \dfrac{\alpha}{1-\beta}\,\mathbf{R}$

Obviously, to remove bias and ensure that the expected value of the estimate tends to the actual value, $\alpha$ has to be equal to $1-\beta$.
e) Using the second form of the matrix inversion lemma (much easier to use in our case),

$(\mathbf{A} + \delta\,\mathbf{a}\mathbf{a}^H)^{-1} = \mathbf{A}^{-1} - \dfrac{\delta}{1+\delta\,\mathbf{a}^H\mathbf{A}^{-1}\mathbf{a}}\,\mathbf{A}^{-1}\mathbf{a}\mathbf{a}^H\mathbf{A}^{-1}$

we get:

$\mathbf{R}_{n+1}^{-1} = \dfrac{1}{\beta}\,\mathbf{R}_n^{-1} - \dfrac{\alpha}{\beta}\cdot\dfrac{\mathbf{R}_n^{-1}\mathbf{X}_{n+1}\mathbf{X}_{n+1}^H\mathbf{R}_n^{-1}}{\beta + \alpha\,\mathbf{X}_{n+1}^H\mathbf{R}_n^{-1}\mathbf{X}_{n+1}}$
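A minimal numerical check of this rank-one inverse update (sizes and values are arbitrary illustrative choices):

%-------------------------------------------------------------------------
% Verify the recursive inverse against a direct inversion
Q = 4; beta = 0.95; alpha = 1 - beta;
R  = eye(Q); Ri = inv(R);                 % R_n and its inverse
X  = randn(Q,1) + 1j*randn(Q,1);          % new snapshot X_{n+1}
Rnew = beta*R + alpha*(X*X');             % direct update
Ri_lemma = Ri/beta - (alpha/beta)*(Ri*(X*X')*Ri)/(beta + alpha*(X'*Ri*X));
norm(inv(Rnew) - Ri_lemma)                % should be of the order of 1e-15
%-------------------------------------------------------------------------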
I.8.2.-
a) $\mathbf{S}_0\mathbf{S}_0^H$ is a rank-one matrix (all its columns are proportional to each other). Being rank one, it has only one eigenvalue different from zero. The eigenvector associated with this nonzero eigenvalue is precisely $\mathbf{S}_0$ normalized, i.e.

$\left(\mathbf{S}_0\mathbf{S}_0^H\right)\dfrac{\mathbf{S}_0}{\|\mathbf{S}_0\|} = \|\mathbf{S}_0\|^2\,\dfrac{\mathbf{S}_0}{\|\mathbf{S}_0\|}$

The squared magnitude of the vector, $\|\mathbf{S}_0\|^2$, is the eigenvalue. The remaining eigenvalues $\lambda_2, \dots, \lambda_Q$ are zero, and the corresponding eigenvectors are any set of vectors orthonormal to each other and to $\mathbf{S}_0$.
b) For the matrix $\alpha\,\mathbf{S}\mathbf{S}^H + \sigma^2\mathbf{I}$,

$\left(\alpha\,\mathbf{S}\mathbf{S}^H + \sigma^2\mathbf{I}\right)\dfrac{\mathbf{S}}{\|\mathbf{S}\|} = \left(\alpha\|\mathbf{S}\|^2 + \sigma^2\right)\dfrac{\mathbf{S}}{\|\mathbf{S}\|}$

so $\mathbf{S}/\|\mathbf{S}\|$ is still the eigenvector with largest eigenvalue. The rest of the eigenvectors have to be orthogonal to this largest eigenvector; in consequence, when multiplying any of them by the rank-one part, the dot product is zero and:

$\left(\alpha\,\mathbf{S}\mathbf{S}^H + \sigma^2\mathbf{I}\right)\mathbf{e}_q = \sigma^2\mathbf{e}_q \qquad \forall q = 2, \dots, Q$

All these eigenvalues are equal to the contribution of the identity matrix; only the first one has, in addition, the contribution of the rank-one matrix.
I.8.3.-

The d.c. response equal to 0 dB is set as $\mathbf{A}^H\mathbf{1} = 1$, where vector $\mathbf{1}$ contains 1 at every component, i.e. $\sum_{q=0}^{Q-1} a(q) = 1$.

The output filter power is given by $\mathbf{A}^H\mathbf{R}\,\mathbf{A}$, where matrix $\mathbf{R}$ is the autocorrelation matrix of the input signal, i.e. $\sum_{q=0}^{Q-1}\sum_{p=0}^{Q-1} a^*(q)\,a(p)\,r(q,p)$.
In particular, for $Q$ equal to 2, the d.c. constraint is a line in the 2-D space where the two coefficients lie, and the power is described by ellipses centered at the origin, whose size is proportional to the output power of the filter under design.

[Figure: the optimum is the point of minimum power staying on the constraint line.]
If the input is white noise, its autocorrelation matrix is the identity and the design reduces to:

$\min_{\mathbf{A}}\ \mathbf{A}^H\mathbf{A} \quad\text{subject to}\quad \mathbf{A}^H\mathbf{1} = 1$

The Lagrangian (i.e. the objective minus a multiplier per constraint) is $\Upsilon = \mathbf{A}^H\mathbf{A} - \lambda\left(\mathbf{A}^H\mathbf{1} - 1\right)$. Taking derivatives with respect to the real and imaginary parts of the filter coefficients (equivalent to taking derivatives with respect to $\mathbf{A}^H$) and setting them to zero results in
$\mathbf{A} - \lambda\mathbf{1} = \mathbf{0} \quad\Rightarrow\quad \mathbf{A} = \lambda\mathbf{1}, \qquad \lambda = \dfrac{1}{\mathbf{1}^H\mathbf{1}} = 1/Q = 1/2$

Signal power: $A^2$. Noise power: $E\left(\left|\mathbf{A}^H\mathbf{w}_n\right|^2\right) = \mathbf{A}^H\sigma^2\mathbf{I}\,\mathbf{A} = \sigma^2\mathbf{A}^H\mathbf{A} = \sigma^2/2$.

$\text{SNR} = 2\,\dfrac{A^2}{\sigma^2}$, i.e. a gain of 3 dB.
c) When the noise is not white, the constraint is maintained but the objective to be minimized changes ($\mathbf{A}^H\mathbf{X}_n = \mathbf{A}^H(A\,\mathbf{1}+\mathbf{w}_n) = A + \mathbf{A}^H\mathbf{w}_n$ still holds). The solution to the new problem is given by

$\mathbf{A} = \dfrac{\mathbf{R}_w^{-1}\mathbf{1}}{\mathbf{1}^H\mathbf{R}_w^{-1}\mathbf{1}}$
Signal power: $A^2$. Noise power: $E\left(\left|\mathbf{A}^H\mathbf{w}_n\right|^2\right) = \mathbf{A}^H\mathbf{R}_w\mathbf{A} = \dfrac{1}{\mathbf{1}^H\mathbf{R}_w^{-1}\mathbf{1}}$, so that

$\text{SNR} = A^2\left(\mathbf{1}^H\mathbf{R}_w^{-1}\mathbf{1}\right)$
For the given acf matrix $\mathbf{R}_w = \begin{pmatrix}\sigma^2 & \gamma\\ \gamma & \sigma^2\end{pmatrix}$,

$\mathbf{R}_w^{-1} = \dfrac{1}{\sigma^4-\gamma^2}\begin{pmatrix}\sigma^2 & -\gamma\\ -\gamma & \sigma^2\end{pmatrix}; \qquad \mathbf{1}^H\mathbf{R}_w^{-1} = \dfrac{1}{\sigma^2+\gamma}\,\mathbf{1}^H; \qquad \mathbf{1}^H\mathbf{R}_w^{-1}\mathbf{1} = \dfrac{2}{\sigma^2+\gamma}$

and finally

$\text{SNR} = 2\,\dfrac{A^2}{\sigma^2}\cdot\dfrac{1}{1+\gamma/\sigma^2}$

Increasing correlation decreases the gain from the maximum of 3 dB obtained in the white-noise case.
If instead the white-noise (sub-optimum) filter is used, the SNR is:

$\text{SNR} = \dfrac{A^2}{\sigma^2}\cdot\dfrac{1}{1+\gamma/\sigma^2}$

with a permanent loss of 3 dB with respect to the optimum gain. This loss increases when increasing the filter order $Q$.
I.8.4.-

Given $\mathbf{a}$, we can select it as one of the axes of the new orthogonal space. Once this is decided, the next vector, to be orthogonal to it, must lie in the subspace orthogonal to $\mathbf{a}$, which is defined by the projection operator:

$\mathbf{A}_\perp = \mathbf{I} - \dfrac{\mathbf{a}\mathbf{a}^H}{\mathbf{a}^H\mathbf{a}} = \mathbf{I}_{P\times P} - \mathbf{a}\left(\mathbf{a}^H\mathbf{a}\right)^{-1}\mathbf{a}^H$

In consequence, the second vector is given by the projection of any candidate vector onto the subspace orthogonal to $\mathbf{a}$; the procedure is then repeated for the remaining vectors.
I.8.5.-

The IIR equation is $y(n) + \sum_{p=1}^{P} a(p)\,y(n-p) = \sum_{q=0}^{Q} b(q)\,x(n-q)$; it can be written in vector form as $\mathbf{y}_n^H\mathbf{a} = \mathbf{x}_n^H\mathbf{b}$, where the first component of vector $\mathbf{a}$ is equal to one. Now setting $x(n) = \delta(n)$, then $y(n) = h(n)$, and arranging the vector equation in matrix form:

$\begin{pmatrix} h(0) & 0 & 0 & \cdots & 0\\ h(1) & h(0) & 0 & \cdots & 0\\ h(2) & h(1) & h(0) & \cdots & 0\\ \vdots & & & \ddots & \\ h(P) & h(P-1) & h(P-2) & \cdots & h(0)\\ \vdots & & & & \vdots\end{pmatrix}\begin{pmatrix}1\\ a(1)\\ \vdots\\ a(P)\end{pmatrix} = \begin{pmatrix} b(0)\\ b(1)\\ \vdots\\ b(Q)\\ 0\\ \vdots\end{pmatrix}$

From this formula matrix $\mathbf{H}$ is easily identified.
I.8.6.-

We look for coefficients $\mathbf{A}$ such that the response of the filter to the pulse $\mathbf{P}$ is a given level (say 1, i.e. 0 dB), and such that for any other signal $w(t)$, with acf matrix $\mathbf{R}$, the power of the response is minimal. In the case of white noise, this translates into minimizing the norm of the filter vector. In summary, the formulation is:

$\min\ \mathbf{A}^H\mathbf{A} \quad\text{subject to}\quad \mathbf{A}^H\mathbf{P} = 1$

(see I.8.3 for details). The optimum is given by $\mathbf{A} = \dfrac{\mathbf{P}}{\|\mathbf{P}\|^2}$.

The signal-to-noise ratio for any filter design is $\text{SNR} = \dfrac{\left|\mathbf{A}^H\mathbf{P}\right|^2}{\sigma^2\,\mathbf{A}^H\mathbf{A}}$. Using the optimum filter, the resulting SNR is:

$\text{SNR}_{max} = \dfrac{1}{\sigma^2/\|\mathbf{P}\|^2} = \dfrac{\text{Pulse energy}}{\text{Spectral density of the white noise}}$
I.8.7.-

a) Yes, since it is the quotient of the powers at the filter output for signal and noise respectively.

b) Since $\mathbf{R}_s = \mathbf{R} - \mathbf{R}_n$ (whenever signal and noise are uncorrelated), then

$\text{SNR} = \dfrac{\mathbf{a}^H\mathbf{R}\,\mathbf{a}}{\mathbf{a}^H\mathbf{R}_n\mathbf{a}} - 1$

In consequence the use of the full correlation matrix is correct for design purposes.
c) Define $\mathbf{u} = \mathbf{R}_n^{1/2}\mathbf{a}$ and $\mathbf{v} = \mathbf{R}_n^{-1/2}\mathbf{R}\,\mathbf{a}$. Then

$\dfrac{\mathbf{a}^H\mathbf{R}\,\mathbf{a}}{\mathbf{a}^H\mathbf{R}_n\mathbf{a}} = \dfrac{\mathbf{a}^H\mathbf{R}_n^{1/2}\left(\mathbf{R}_n^{-1/2}\mathbf{R}\,\mathbf{a}\right)}{\mathbf{a}^H\mathbf{R}_n\mathbf{a}} = \dfrac{\mathbf{u}^H\mathbf{v}}{\mathbf{u}^H\mathbf{u}} \le \dfrac{\sqrt{\left(\mathbf{u}^H\mathbf{u}\right)\left(\mathbf{v}^H\mathbf{v}\right)}}{\mathbf{u}^H\mathbf{u}}$

The equality holds when both vectors are collinear, i.e. $\mathbf{u} = \rho^{-1}\mathbf{v}$, which provides the design for the maximum SNR. Thus, the design equation is:

$\mathbf{R}_n^{1/2}\mathbf{a} = \dfrac{1}{\rho}\,\mathbf{R}_n^{-1/2}\mathbf{R}\,\mathbf{a} \quad\text{or}\quad \mathbf{R}\,\mathbf{a} = \rho\,\mathbf{R}_n\mathbf{a}$

i.e. a generalized eigenvalue problem.
d) Since $\mathbf{R} = \mathbf{R}_s + \mathbf{R}_n$, the design equation is equivalent to $\mathbf{R}_n^{-1}\mathbf{R}_s\,\mathbf{a} = (\rho-1)\,\mathbf{a}$; the objective is still the maximum eigenvalue.

e) Easily concluded from the previous section.

f) When $\mathbf{R}_s$ is rank one, i.e. $\mathbf{R}_s = \mathbf{c}\,\mathbf{c}^H$, the eigendecomposition reduces to

$\mathbf{R}_s\mathbf{a} = \mathbf{c}\left(\mathbf{c}^H\mathbf{a}\right) = \rho\,\mathbf{R}_n\mathbf{a}$

At this point, note that any constant multiplying the optimum vector does not affect its optimality, because the SNR does not change; therefore

$\mathbf{a} \propto \mathbf{R}_n^{-1}\mathbf{c}$
I.8.8.-

Since $\text{Trace}(\mathbf{A}) = \sum \text{eigenvalues}$, and all the eigenvalues of a positive definite matrix are positive, removing terms from the sum always produces a lower bound of the trace.
I.8.9, I.8.10, I.8.11.-

CHAPTER II

II.13.1.-
The characteristic function is defined as the Fourier transform of the pdf. A good candidate to estimate this function is to use the least committed pdf, i.e. uniform over the available samples, replacing the continuous integral by a sum constrained to those samples. In summary, a first estimate of the characteristic function is:

$\hat{\Phi}(w) = \sum_i \exp\left(-j\,w\,x(i)\right)$

where $i$ covers the set of the $N$ available samples. The expected value of this estimate is

$E\{\hat{\Phi}(w)\} = \sum_i E\{\exp(-j\,w\,x(i))\} = \sum_i\int \Pr(x(i))\exp(-j\,w\,x(i))\,dx(i) = N\,\Phi(w)$

In consequence, to remove bias, the definitive formulation of the estimate is:

$\hat{\Phi}(w) = \dfrac{1}{N}\sum_i \exp\left(-j\,w\,x(i)\right)$
Computing the variance,

$E\left|\hat{\Phi}(w)-\Phi(w)\right|^2 = E\left|\hat{\Phi}(w)\right|^2 - \left|\Phi(w)\right|^2 = \sum_i\sum_j\dfrac{1}{N^2}\,E\left(e^{-jwx(i)}\,e^{jwx(j)}\right) - \left|\Phi(w)\right|^2$

When $x(i)$ and $x(j)$ are independent ($i\ne j$), $E\left(e^{-jwx(i)}\right)E\left(e^{jwx(j)}\right) = |\Phi(w)|^2$, so

$E\left|\hat{\Phi}(w)-\Phi(w)\right|^2 = \dfrac{1}{N} + \dfrac{N^2-N}{N^2}\left|\Phi(w)\right|^2 - \left|\Phi(w)\right|^2 = \dfrac{1-\left|\Phi(w)\right|^2}{N}$

It is easy to check that the magnitude of the actual characteristic function is always below one. At the same time, note that the estimate is consistent, since the variance tends to zero as the length of the data set tends to infinity.
Note also that the direct estimation of the pdf by the histogram with a given weight or window function $p(x)$ is equivalent to the following estimate:

$\hat{\Phi}(w) = P(w)\sum_i\exp\left(-j\,w\,x(i)\right)$

where $P(w)$ is the Fourier transform of $p(x)$.
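A minimal sketch of the unbiased estimator above (the Gaussian test data and the grid of w values are arbitrary choices; uses MATLAB implicit expansion, R2016b or later):

%-------------------------------------------------------------------------
% Unbiased estimate of the characteristic function from N samples
N = 1000; x = randn(N,1);
w = linspace(-5,5,201);
Phi_hat  = mean(exp(-1j*w.*x), 1);   % (1/N)*sum over samples, one value per w
Phi_true = exp(-w.^2/2);             % characteristic function of N(0,1)
plot(w, real(Phi_hat), w, Phi_true, '--');
legend('estimate','true'); xlabel('w');
%-------------------------------------------------------------------------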
II.13.2.-

We look for a non-linear system mapping $x$ to $y$ such that the second pdf is uniform:

$\Pr(x) = \dfrac{\alpha}{2}\exp\left(-\alpha|x|\right) \qquad \Pr(y) = \dfrac{1}{2Y_o}$

$\dfrac{dg}{dx} = \left.\dfrac{\Pr(x)}{\Pr(y)}\right|_{y=g(x)} = 2Y_o\,\dfrac{\alpha}{2}\exp\left(-\alpha|x|\right)$

and, integrating with odd symmetry,

$g(x) = Y_o\,\text{sign}(x)\left(1-\exp\left(-\alpha|x|\right)\right)$
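A quick numerical check that the mapping flattens the pdf (a sketch; the values of alpha and Yo are arbitrary):

%-------------------------------------------------------------------------
% Check that g(.) maps Laplacian samples to a uniform pdf on [-Yo, Yo]
alpha = 2; Yo = 1; N = 1e5;
u = rand(N,1) - 0.5;
x = -sign(u).*log(1-2*abs(u))/alpha;        % Laplacian(alpha) samples
y = Yo*sign(x).*(1 - exp(-alpha*abs(x)));   % the derived mapping g(x)
histogram(y, 50, 'Normalization', 'pdf');   % should be flat at 1/(2*Yo)
%-------------------------------------------------------------------------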
II.13.3.-

$\mathbf{R} = \begin{pmatrix}1 & \alpha\\ \alpha & 1\end{pmatrix} \quad\Rightarrow\quad \mathbf{R}^{-1} = \dfrac{1}{1-\alpha^2}\begin{pmatrix}1 & -\alpha\\ -\alpha & 1\end{pmatrix}$

The ML estimate is $\mu_{ML} = \dfrac{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{X}_n}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}$; performing a few computations, $\mu_{ML} = \dfrac{\mathbf{1}^H\mathbf{X}_n}{2}$.
II.13.4.-

Instantaneous estimate: $\hat{\mathbf{R}}_{xy} = \mathbf{X}_n\mathbf{Y}_n^H$. Averaged estimate: $\hat{\mathbf{R}}_{xy} = \dfrac{1}{N}\sum_n\mathbf{X}_n\mathbf{Y}_n^H$.

The use of the biased or unbiased estimate for $r_{xy}(n)$ is a relevant issue in this case, since the Fourier transform of a cross-correlation is no longer a positive definite function. Data-dependent or data-independent windows can be used for the cross-functions in the same framework as they were defined for auto-functions within this chapter.
II.13.5.-

[Figure: events at times t and t+τ are related for small τ and unrelated for large τ.]
II.13.6.-

Since

$E_\theta\left[\cos(2wt + w\tau + 2\theta)\right] = \text{Re}\left(\exp(j2wt + jw\tau)\,\Phi_\theta(\Omega = 2)\right)$

and the phase distribution is uniform, the characteristic function of the phase evaluated at frequency 2 is zero. Thus, the random process is stationary in its acf whenever the characteristic function of the phase is zero at frequency equal to 2.

In summary, $r_x(\tau) = \dfrac{\sigma_a^2}{2}\cdot\mathcal{F}\left\{\Pr(w)\right\}$, and the power spectral density is $S_x(w) = \dfrac{\sigma_a^2}{2}\Pr(w)$. This explains why the bandwidth of wideband FM coincides with the dynamic range of the instantaneous frequency.
II.13.7.-

The MEM extrapolation of the acf assumes perfect AR modeling and continues ahead with the Y-W equations, i.e.

$\begin{pmatrix}1 & 0.5 & 0.1\\ 0.5 & 1 & 0.5\\ 0.1 & 0.5 & 1\\ r_3 & 0.1 & 0.5\\ \vdots & \vdots & \vdots\end{pmatrix}\begin{pmatrix}1\\ a(1)\\ a(2)\end{pmatrix} = \begin{pmatrix}\sigma_2^2\\ 0\\ 0\\ 0\\ \vdots\end{pmatrix}$

Levinson's algorithm:

$a_1^1 = K_1 = -0.5, \qquad \sigma_1^2 = 1\cdot(1-0.25) = 0.75$

$\Delta_2 = 0.1 + 0.5\,(-0.5) = -0.15, \qquad K_2 = \dfrac{0.15}{0.75} = 0.2$

$a_1^2 = -0.5 + 0.2\,(-0.5) = -0.6, \qquad a_2^2 = K_2 = 0.2, \qquad \sigma_2^2 = 0.75\,(1-0.2^2) = 0.72$

MEM extrapolation for $r_3$:

$r_3^{MEM} + 0.1\,a_1^2 + 0.5\,a_2^2 = 0 \quad\Rightarrow\quad r_3^{MEM} = -(0.1\,(-0.6) + 0.5\cdot 0.2) = -0.04$
Matlab© routine for the Levinson algorithm

%-------------------------------------------------------------------------
% File LEV.M
%
% Given the set of acf values r(.), the function returns in vectors coe(.),
% par(.) and err(.) the predictor coefficients, the nq-1 parcors and the nq
% prediction error powers for each successive predictor order.
%
% M.A. Lagunas
%-------------------------------------------------------------------------
function [coe,par,err,nq] = lev(r)
nq = length(r);
err(1) = r(1);                        % zero-order prediction error power
coe(1) = 1; coe(2) = -r(2)/r(1);      % first-order predictor
par(1) = coe(2);
err(2) = r(1) + coe(2)*r(2);
for ii = 3:nq
    delta = coe*r(ii:-1:2)';          % correlation of the current predictor
    par(ii-1) = -delta/err(ii-1);     % new parcor
    err(ii) = (1 - par(ii-1)*par(ii-1))*err(ii-1);
    coe(ii) = par(ii-1);
    for j = 2:round(ii/2)             % Levinson order-update of the taps
        aus = coe(j); bus = coe(ii-j+1);
        coe(j) = aus + par(ii-1)*bus;
        coe(ii-j+1) = aus*par(ii-1) + bus;
    end
end
%-------------------------------------------------------------------------
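A quick check of the routine against the hand computation above (a minimal sketch):

%-------------------------------------------------------------------------
r = [1 0.5 0.1];              % acf values r(0), r(1), r(2)
[coe,par,err,nq] = lev(r);    % coe = [1 -0.6 0.2], par = [-0.5 0.2], err(3) = 0.72
r3 = -coe(2:3)*[0.1; 0.5];    % MEM extrapolation: r(3) = -(a1*r2 + a2*r1) = -0.04
%-------------------------------------------------------------------------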
II.13.8.-

The observation is $y(n) = x(n) + w(n)$, where $\{x\}$ is AR and $\{w\}$ is white noise independent of $\{x\}$:

$S_y(w) = S_x(w) + S_w(w) = \left.\dfrac{\sigma_x^2}{A(z)A(z^{-1})} + \sigma_w^2\right|_{z=\exp(jwT)}$

In summary,

$S_y(w) = \left.\dfrac{\sigma_x^2 + \sigma_w^2\,A(z)A(z^{-1})}{A(z)A(z^{-1})}\right|_{z=\exp(jwT)}$

which results in an ARMA model. In consequence a speech recording, even assuming speech is pure AR, results in an ARMA model due to the recording noise.
II.13.9.-

a) [Figure: spectral content located between normalized frequencies fT = 0.1 and 0.25.]
b) $s(n) = x(n) - w(n)$; using the recursive equation for the deterministic and completely predictable signal $s(n)$, a special ARMA model results:

$\sum_{q=0}^{Q} a(q)\,x(n-q) = \sum_{q=0}^{Q} a(q)\,w(n-q)$

and

$x(n) = \dfrac{1}{a(0)}\sum_{q=0}^{Q} a(q)\,w(n-q) - \dfrac{1}{a(0)}\sum_{q=1}^{Q} a(q)\,x(n-q)$

$S_x(w) = \left|\dfrac{A(z)}{A(z)}\right|^2\sigma_w^2 = \sigma_w^2 \quad ????$

c) The problem comes from the fact that $s(n)$ is not a random process but a deterministic signal described by a recurrence equation. For this reason it is no longer stationary and does not have a time-invariant power spectral density profile. Modeling $x(n)$ as an ARMA imposes stationarity and, in consequence, only the stationary part of $x(n)$ remains. Of course, for this $s(n)$ the recurrence equation is no longer valid.
II.13.10.-

A modulated signal takes the form of a modulating waveform acting over a carrier waveform, $x(t) = a(t)\cos(wt)$. Usually the modulating wave is assumed to be strict-sense stationary:

$E[a(t)] = 0 \ \forall t \qquad\text{and}\qquad E[a(t)\,a(t+\tau)] = r_a(\tau)\ \forall t$

thus

$E[x(t)\,x(t+\tau)] = r_a(\tau)\cos(wt)\cos(w(t+\tau))$

Note that the carrier, being deterministic, introduces the non-stationarity:

$r_x(t,\tau) = \dfrac{r_a(\tau)}{2}\cos(w\tau) + \dfrac{r_a(\tau)}{2}\cos(w(2t+\tau))$

In accordance with this result, the power of the random process fluctuates at frequency $2w$, as does any other moment of the correlation function. Since in engineering the average power is a key design parameter, it is customary to keep just the average term of this time-varying correlation. Formally, the random process $\{x\}$ is cyclo-stationary (periodic), and we keep just the first coefficient of its Fourier series. In summary, we define the acf of $\{x\}$ as:

$r_x(\tau) \triangleq \dfrac{1}{T_p}\int_{T_p} r_x(t,\tau)\,dt = \dfrac{r_a(\tau)}{2}\cos(w\tau)$
With respect to the discrete or digital modulation case,

$x(t) = \lim_{N\to\infty}\sum_{k=-N}^{N} a(k)\,p(t-k/r) \qquad\text{for } -N/r \le t \le N/r$

with $r$ the symbol rate (bauds), $a(\cdot)$ the information symbols drawn from a discrete constellation, and $p(\cdot)$ the pulse shape (either full or partial response).
Assuming that the symbol sequence is stationary, its acf can be defined in the same manner as in the continuous-wave modulation case seen previously:

$E[x(t)\,x(t+\tau)] = \lim_{N\to\infty}\sum_{k=-N}^{N}\sum_{n=-N}^{N} E[a(k)\,a(n)]\,p(t-kT)\,p(t+\tau-nT)$

$r_x(t,\tau) = \lim_{N\to\infty}\sum_m\sum_l r_a(l)\,p(t-mT)\,p(t+\tau-(m+l)T)$

$r_x(\tau) \triangleq \lim_{N\to\infty}\dfrac{1}{(2N+1)T}\int_{-NT}^{NT} r_x(t,\tau)\,dt$

Since $N$ is large and the pulses are full-response,

$\int_{-NT}^{NT} p(t-mT)\,p(t+\tau-(l+m)T)\,dt \approx r_{pp}(\tau-lT)$, i.e. the acf for finite-energy signals.
[Figure: example for N = 3; the double sum over indexes (k, n) is rearranged by adding along diagonals of constant index l.]
$r_x(\tau) = r\sum_{l=-\infty}^{\infty} r_a(l)\,r_{pp}(\tau-lT)$
Note that the acf defined before is just the convolution of the symbols' acf with the acf of the signaling pulse. In fact, the power spectral density is the product of the energy spectrum of the pulse and the FT of the acf of the information symbol sequence:

$S_x(w) = r\,S_{pp}(w)\sum_{q=-\infty}^{\infty} r_a(q)\exp(-jqw/r)$
II.13.11.-
See II.13.13
II.13.12.-

$\mathbf{X}_n = a\,\mathbf{S} + \mathbf{w}_n$

$\Lambda(a,\mathbf{S},\sigma^2) = K_0 - Q\ln(\sigma^2) - \dfrac{1}{\sigma^2}\left(\mathbf{X}_n - a\mathbf{S}\right)^H\left(\mathbf{X}_n - a\mathbf{S}\right)$

a) $a_{ML} = \arg\max_a\Lambda$; $\dfrac{\partial\Lambda}{\partial a} = 0 \Rightarrow \mathbf{S}^H\left(\mathbf{X}_n - a\mathbf{S}\right) = 0 \Rightarrow a_{ML} = \dfrac{\mathbf{S}^H\mathbf{X}_n}{\mathbf{S}^H\mathbf{S}} = \dfrac{\mathbf{S}^H\mathbf{X}_n}{Q}$

Substituting back,

$\Lambda(\mathbf{S},\sigma^2) = K_0 - Q\ln(\sigma^2) - \dfrac{1}{\sigma^2}\,\mathbf{X}_n^H\left(\mathbf{I} - \dfrac{\mathbf{S}\mathbf{S}^H}{Q}\right)\mathbf{X}_n$
b) Since $\mathbf{P} = \mathbf{I} - \dfrac{\mathbf{S}\mathbf{S}^H}{Q}$ satisfies $\mathbf{P}^H = \mathbf{P}$ and $\mathbf{P}\mathbf{P} = \mathbf{P}$, the likelihood can be written in a more compact form:

$\Lambda(\mathbf{S},\sigma^2) = K_0 - Q\ln(\sigma^2) - \dfrac{1}{\sigma^2}\,\mathbf{X}_n^H\mathbf{P}\,\mathbf{X}_n = K_0 - Q\ln(\sigma^2) - \dfrac{1}{\sigma^2}\,\text{Trace}\left(\mathbf{P}\,\mathbf{X}_n\mathbf{X}_n^H\right)$
c) For $N$ independent sample vectors,

$\Lambda_N(\mathbf{S},\sigma^2) = \ln\prod_{n=0}^{N-1}\Pr\left(\mathbf{X}_n \mid \mathbf{S},\sigma^2\right) = \sum_{n=0}^{N-1}\ln\Pr\left(\mathbf{X}_n \mid \mathbf{S},\sigma^2\right)$

and the log-likelihood results in

$\Lambda_N(\mathbf{S},\sigma^2) = K_1 - NQ\ln(\sigma^2) - \dfrac{N}{\sigma^2}\cdot\dfrac{1}{N}\sum_{n=0}^{N-1}\text{Trace}\left(\mathbf{P}\,\mathbf{X}_n\mathbf{X}_n^H\right) = K_1 - NQ\ln(\sigma^2) - \dfrac{N}{\sigma^2}\,\text{Trace}\left(\mathbf{P}\,\mathbf{R}\right)$

where $\mathbf{R} = \frac{1}{N}\sum_n\mathbf{X}_n\mathbf{X}_n^H$.
d) $E\left[\mathbf{X}_n\mathbf{X}_n^H\right] = |a|^2\,\mathbf{S}\mathbf{S}^H + \sigma^2\mathbf{I}$

e) $\text{Trace}(\mathbf{P}\,\mathbf{R}) = \text{Trace}\left(\mathbf{R} - \mathbf{S}\mathbf{S}^H\mathbf{R}/Q\right) = \text{Trace}(\mathbf{R}) - \dfrac{1}{Q}\,\mathbf{S}^H\mathbf{R}\,\mathbf{S}$
f) $\sigma_{ML}^2 = \arg\max_{\sigma^2}\Lambda_N(\mathbf{S},\sigma^2)$; $\dfrac{d\Lambda_N}{d\sigma^2} = 0$:

$0 = -\dfrac{NQ}{\sigma^2} + \dfrac{N}{\sigma^4}\left(\text{Trace}(\mathbf{R}) - \dfrac{1}{Q}\,\mathbf{S}^H\mathbf{R}\,\mathbf{S}\right)$

and the ML estimate of the noise power is $\sigma_{ML}^2 = \dfrac{1}{Q}\left(\text{Trace}(\mathbf{R}) - \dfrac{1}{Q}\,\mathbf{S}^H\mathbf{R}\,\mathbf{S}\right)$.
g) Using the acf of section (d), $\text{Trace}(\mathbf{R}) = |a|^2 Q + Q\sigma^2$ and $\mathbf{S}^H\mathbf{R}\,\mathbf{S} = |a|^2 Q^2 + \sigma^2 Q$, so the expected value of the noise-power estimate follows:

$E\left[\sigma_{ML}^2\right] = \dfrac{1}{Q}\left(|a|^2 Q + Q\sigma^2 - \dfrac{1}{Q}\left(|a|^2 Q^2 + \sigma^2 Q\right)\right) = \dfrac{Q-1}{Q}\,\sigma^2$
h) Using the estimates of the complex envelope and the noise power, the likelihood reduces to

$\Lambda(\mathbf{S}) = K_1 - NQ\ln\left(\sigma_{ML}^2\right) - 1$

and

$\mathbf{S}_{ML} = \arg\max_{\mathbf{S}}\Lambda(\mathbf{S}) = \arg\min_{\mathbf{S}}\ln\left(\sigma_{ML}^2\right) = \arg\max_{\mathbf{S}}\ \dfrac{1}{Q}\left(\mathbf{S}^H\mathbf{R}\,\mathbf{S}\right)$

which is the maximum of the Periodogram (Welch procedure) of the available data.
II.13.13.-

a) $E\left[\hat{P}_x\right] = \dfrac{1}{2N-1}\,(2N-1)\,P_x = P_x$: unbiased.

b) $\text{var}^2\left(\hat{P}_x\right) = E\left[\hat{P}_x^2\right] - \left(E\left[\hat{P}_x\right]\right)^2$; the second term comes from the previous section, and the first term is:

$E\left[\hat{P}_x^2\right] = \dfrac{1}{(2N-1)^2}\,E\left[\sum_n\sum_m x^2(n)\,x^2(m)\right] = \dfrac{1}{(2N-1)^2}\sum_n\sum_m\left(r_x^2(0) + 2\,r_x^2(n-m)\right) = P_x^2 + \dfrac{1}{(2N-1)^2}\sum_n\sum_m 2\,r_x^2(n-m)$

(using the Gaussian fourth-order moment factorization). In consequence, with the change of variables $q = n-m$,

$\text{var}^2\left(\hat{P}_x\right) = \dfrac{2}{(2N-1)^2}\sum_{q=-N+1}^{N-1} r_x^2(q)\left((2N-1)-|q|\right) = \dfrac{1}{2N-1}\sum_{q=-N+1}^{N-1} 2\,r_x^2(q)\left(1-\dfrac{|q|}{2N-1}\right)$
c) When the number of available data $N$ tends to infinity,

$\text{var}^2\left(\hat{P}_x\right) = \dfrac{1}{2N-1}\sum_{q=-N+1}^{N-1} r_x^2(q) \overset{\text{Parseval}}{=} \dfrac{1}{(2N-1)T}\int_{-B}^{B} S_x^2(f)\,df$

Now, assuming that the spectral density is flat in the bandwidth $[-B,B]$, i.e. $S_x(f) = \dfrac{P_x}{2B}\,\Pi\!\left(\dfrac{f}{2B}\right)$, then

$\text{var}^2\left(\hat{P}_x\right) \Rightarrow \dfrac{P_x^2}{2B\,(2N-1)\,T}$

Note that the denominator is the product of the signal time duration and the signal frequency extent. This product is known as the degrees of freedom of the estimation procedure, and (with a proper estimation procedure) it always reduces the variance of the estimate. A brief explanation of this fact follows: since the random process has a bandwidth of $2B$, the coherence time, i.e. the time we have to wait to collect independent records, is approximately $1/2B$ seconds. In consequence, the number of independent records, which dictates the possible variance reduction, is given by the quotient between the time support of the data and the coherence time. This is just the denominator of the asymptotic expression of the variance depicted above.
e) The recursive estimate $\tilde{P}_x(n)$ is the output of the filter $\dfrac{\alpha}{1-\beta z^{-1}}$ driven by $x^2(n)$:

$E\left[\tilde{P}_x(n)\right] = \beta\,E\left[\tilde{P}_x(n-1)\right] + \alpha\,E\left[x^2(n)\right]$

Assuming that the r.p. is stationary, $E\left[\tilde{P}_x(n)\right] = \dfrac{\alpha}{1-\beta}\,P_x$, from which it is easy to deduce the adequate choice of the filter parameters ($\alpha = 1-\beta$) to obtain an unbiased estimate.
f) When $\hat{P}_x(n) = \dfrac{1}{N}\sum_{m=n-N+1}^{n} x^2(m)$, then $\hat{P}_x(n) = \hat{P}_x(n-1) + \dfrac{x^2(n)}{N} - \dfrac{x^2(n-N)}{N}$. Thus the estimate is similar to the previous one with $\alpha = \dfrac{1}{N}$ and $\beta = 1-\dfrac{1}{N}$; the only difference is the last term, which can be considered of low impact on the estimate of the power at instant $n$, mainly for $N$ large. Also, this term does not exist for $n < N$ when $x(n)$ is zero for negative arguments.
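A minimal sketch comparing the two running power estimates (the test signal and N are arbitrary choices):

%-------------------------------------------------------------------------
% Exponentially weighted vs sliding-window power estimates
N = 100; beta = 1 - 1/N; alpha = 1/N;
x = randn(1, 2000);                          % unit-power test signal
Pexp = filter(alpha, [1 -beta], x.^2);       % P(n) = beta*P(n-1) + alpha*x^2(n)
Pwin = filter(ones(1,N)/N, 1, x.^2);         % (1/N)*sum of the last N samples
plot([Pexp(:) Pwin(:)]); legend('exponential','sliding window');
%-------------------------------------------------------------------------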
II.13.14.-

It is a pdf since $\dfrac{1}{\pi}\int_{-\infty}^{\infty}\dfrac{1}{1+a^2}\,da = \dfrac{1}{\pi}\left.\arctan(a)\right|_{-\infty}^{\infty} = 1$; the mean is zero, but the variance

$\dfrac{1}{\pi}\int_{-\infty}^{\infty}\dfrac{a^2}{1+a^2}\,da = \dfrac{1}{\pi}\int_{-\infty}^{\infty}\left(1-\dfrac{1}{1+a^2}\right)da = \infty$

is infinite! Nevertheless, the Cauchy distribution is preserved when adding Cauchy r.v.'s, since it is invariant under the convolution operator:

$\dfrac{1}{\pi^2}\int_{-\infty}^{\infty}\dfrac{1}{1+a^2}\cdot\dfrac{1}{1+(b-a)^2}\,da \propto \dfrac{1}{\pi}\cdot\dfrac{1}{1+b^2}$

The characteristic function is $\Phi(w) = \dfrac{1}{\pi}\int\dfrac{\cos(wa)}{1+a^2}\,da$. To solve this integral, note that $\Phi(0) = 1$ and, for $w \ne 0$, $\dfrac{\partial^2\Phi}{\partial w^2} = \Phi(w)$, whose bounded solution is $\Phi(w) = e^{-|w|}$.
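A quick illustration of the infinite variance (a sketch): the running sample mean of Cauchy data does not settle, unlike the Gaussian case.

%-------------------------------------------------------------------------
% Running sample mean of Cauchy vs Gaussian data
N = 1e4;
c = tan(pi*(rand(1,N) - 0.5));        % standard Cauchy samples (inverse CDF)
g = randn(1,N);
plot(1:N, cumsum(c)./(1:N), 1:N, cumsum(g)./(1:N));
legend('Cauchy running mean','Gaussian running mean'); xlabel('N');
%-------------------------------------------------------------------------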
II.13.15.-
II.13.16.-
The sentences are correct when applied to a finite record length, but not to describe global features of the random process to which the input and output records belong. Note that there is an implicit assumption of circular convolution in the sentences, which no longer applies to the convolution of the input with the impulse response. In fact, it is easy to prove that the periodogram of a non-white random process is always biased. This bias is introduced by the convolution of the actual spectral density with the lag-window transform. This lag-window is implicit in all the sentences, from (a) to (d).
II.13.17.-

a) $\Pr(x,y) = \Pr(\mathbf{z}) = \dfrac{K}{\det(\mathbf{R}_z)}\,\exp\left(-\left(\mathbf{z}-\mathbf{m}_z\right)^H\mathbf{R}_z^{-1}\left(\mathbf{z}-\mathbf{m}_z\right)\right)$

We are going to use the following diagonalization of the acf matrix:

b) $\begin{pmatrix}x_1\\ x_2\end{pmatrix} = \begin{pmatrix}1 & -r_{yy}^{-1}r_{xy}\\ 0 & 1\end{pmatrix}\begin{pmatrix}x-m_x\\ y-m_y\end{pmatrix}$, i.e. $x_1 = (x-m_x) - r_{yy}^{-1}r_{xy}\,(y-m_y)$ and $x_2 = y-m_y$.

c) $\left(\mathbf{z}-\mathbf{m}_z\right)^H\mathbf{R}_z^{-1}\left(\mathbf{z}-\mathbf{m}_z\right) = \begin{pmatrix}x_1 & x_2\end{pmatrix}\begin{pmatrix}\left(r_{xx}-r_{xy}r_{yy}^{-1}r_{yx}\right)^{-1} & 0\\ 0 & r_{yy}^{-1}\end{pmatrix}\begin{pmatrix}x_1\\ x_2\end{pmatrix} = x_1^2\left(r_{xx}-r_{xy}r_{yy}^{-1}r_{yx}\right)^{-1} + x_2^2\,r_{yy}^{-1}$

d) $\Pr(x\mid y) = K\,\exp\left(-x_1^2\left(r_{xx}-r_{xy}r_{yy}^{-1}r_{yx}\right)^{-1}\right)$

i.e. the conditional pdf is Gaussian, and it can be formulated as

$\left((x-m_x) - r_{yy}^{-1}r_{xy}(y-m_y)\right)^H\left(r_{xx}-r_{xy}r_{yy}^{-1}r_{yx}\right)^{-1}\left((x-m_x) - r_{yy}^{-1}r_{xy}(y-m_y)\right)$

Note that, in a more general approach, in the last formula both $x$ and $y$ may be considered vectors instead of mere scalars; the previous presentation remains valid in all respects for this general case.

e) $\mathbf{C}_{\hat{x}\hat{x}} = \mathbf{r}_{xx} - \mathbf{r}_{xy}\,\mathbf{r}_{yy}^{-1}\,\mathbf{r}_{yx}$
f) $\text{ECM} \equiv \int\left(x-\hat{x}\right)^2\Pr(x\mid y)\,dx$

$\dfrac{\partial\,\text{ECM}}{\partial\hat{x}} = 2\hat{x} - 2\int x\,\Pr(x\mid y)\,dx = 0$

$\hat{x}_{\min\text{ECM}} = \int x\,\Pr(x\mid y)\,dx = E\left[x\mid y\right]$

g)
II.13.18.-

$H(z) = 1 - b(1)\,z^{-1} \quad\Rightarrow\quad r_x(m) = \begin{cases}1+b(1)^2 & m=0\\ -b(1) & m=\pm 1\\ 0 & \text{else}\end{cases}$

then

$S_x(z) = -b(1)\,z + \left(1+b(1)^2\right) - b(1)\,z^{-1} = \left(1-b(1)\,z^{-1}\right)\left(1-b(1)\,z\right)$

where the first factor is minimum phase.
II.13.19.-

Descriptive of the role of non-linear systems in linearizing power amplifiers.
II.13.20.-

$\mathbf{x}_n = \begin{pmatrix}x(n)\\ x(n-1)\\ x(n-2)\end{pmatrix} = \left(a\,e^{j\theta}\right)e^{jw_0 n}\begin{pmatrix}1\\ e^{jw_0}\\ e^{j2w_0}\end{pmatrix} + \mathbf{w}_n = b\,\mathbf{S} + \mathbf{w}_n, \qquad \mathbf{C} = E\left[\mathbf{w}_n\mathbf{w}_n^H\right] = \begin{pmatrix}3 & -1 & 1\\ -1 & 3 & -1\\ 1 & -1 & 3\end{pmatrix}$

a) With $b = a\,e^{j\theta}$, i.e. the low-pass complex envelope,

$\Pr\left(b\mid\mathbf{x}_n\right) = K_0\,\Pr\left(\mathbf{x}_n\mid b\right)\Pr(b)$

$\Pr\left(\mathbf{x}_n\mid b\right) = K_1\exp\left(-\left(\mathbf{x}_n-b\mathbf{S}\right)^H\mathbf{C}^{-1}\left(\mathbf{x}_n-b\mathbf{S}\right)\right)$

$\Pr(b) = K_2\exp\left(-\left|b-m_b\right|^2/\sigma_b^2\right)$
b) Taking logarithms and removing constants irrelevant for the maximization,

$\Lambda(b) = -\left(\mathbf{x}_n-b\mathbf{S}\right)^H\mathbf{C}^{-1}\left(\mathbf{x}_n-b\mathbf{S}\right) - \dfrac{\left|b-m_b\right|^2}{\sigma_b^2}$

$b_{MAP} = \arg\max_b\Lambda(b)$:

$\dfrac{\partial\Lambda}{\partial b^*} = 0 = \mathbf{S}^H\mathbf{C}^{-1}\left(\mathbf{x}_n-b\mathbf{S}\right) - \dfrac{b-m_b}{\sigma_b^2}$

$b_{MAP} = \dfrac{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{x}_n + m_b/\sigma_b^2}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S} + 1/\sigma_b^2}$
c) $\Lambda_{ML}(b) = -\left(\mathbf{x}_n-b\mathbf{S}\right)^H\mathbf{C}^{-1}\left(\mathbf{x}_n-b\mathbf{S}\right)$, $b_{ML} = \arg\max_b\Lambda_{ML}(b)$:

$\dfrac{\partial\Lambda_{ML}}{\partial b^*} = 0 = \mathbf{S}^H\mathbf{C}^{-1}\left(\mathbf{x}_n-b\mathbf{S}\right) \quad\Rightarrow\quad b_{ML} = \dfrac{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{x}_n}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}}$

Note the limits: $\sigma_b\to\infty \Rightarrow b_{MAP}\to b_{ML}$, and $\sigma_b\to 0 \Rightarrow b_{MAP}\to m_b$.
d) $E\left[b_{ML}\right] = \dfrac{\mathbf{S}^H\mathbf{C}^{-1}E\left[\mathbf{x}_n\right]}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}} = \dfrac{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}}\,b = b$: an unbiased estimate.

$\text{var}^2\left(b_{ML}\right) = E\left[\left|b-b_{ML}\right|^2\right] = \dfrac{\mathbf{S}^H\mathbf{C}^{-1}}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}}\,E\left[\left(\mathbf{x}_n-b\mathbf{S}\right)\left(\mathbf{x}_n-b\mathbf{S}\right)^H\right]\dfrac{\mathbf{C}^{-1}\mathbf{S}}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}} = \dfrac{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{C}\,\mathbf{C}^{-1}\mathbf{S}}{\left(\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}\right)^2} = \dfrac{1}{\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}}$

For white noise with $N$ samples,

$\text{var}^2\left(b_{ML}\right) = \dfrac{1}{\left(\sigma^2\right)^{-1}\mathbf{S}^H\mathbf{S}} = \dfrac{\sigma^2}{N} \Rightarrow 0 \text{ when } N\Rightarrow\infty$
When both frequencies are unknown, the use of any spectral estimation procedure to locate their positions is no longer optimum. In fact, the ML estimate of the locations of both frequencies is formulated as follows:

$\mathbf{S}_{ML} = \arg\min_{\mathbf{S}_1,\mathbf{S}_2}\Lambda(\mathbf{S}), \qquad \Lambda(\mathbf{S}) = \mathbf{x}_n^H\left(\mathbf{I}-\mathbf{C}^{-1}\mathbf{S}\,\mathbf{A}\,\mathbf{S}^H\right)^H\mathbf{C}^{-1}\left(\mathbf{I}-\mathbf{S}\,\mathbf{A}\,\mathbf{S}^H\mathbf{C}^{-1}\right)\mathbf{x}_n$

where $\mathbf{A} = \left(\mathbf{S}^H\mathbf{C}^{-1}\mathbf{S}\right)^{-1}$ and $\mathbf{S} = \left[\mathbf{S}_1\ \mathbf{S}_2\right]$.

The expression looks easy, but it entails great complexity in searching for the two frequencies that minimize it. (See the course notes on Arrays for a detailed description of the problem and sub-optimum solutions.)
II.13.21.-

a) $\mu_{ML} = \dfrac{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{x}_n}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}, \qquad \text{var}^2\left(\mu_{ML}\right) = \dfrac{1}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}$

b) $\tilde{\mu} = \dfrac{\mathbf{1}^H\mathbf{x}_n}{\mathbf{1}^H\mathbf{1}}, \qquad \text{var}^2\left(\tilde{\mu}\right) = \dfrac{\mathbf{1}^H\mathbf{R}\,\mathbf{1}}{\left(\mathbf{1}^H\mathbf{1}\right)^2}$

c) $0 \le \dfrac{1}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}} \le \dfrac{\lambda_{max}}{\mathbf{1}^H\mathbf{1}} = \dfrac{\lambda_{max}}{Q}$, which tends to zero when $Q\to\infty$;

$0 \le \dfrac{\mathbf{1}^H\mathbf{R}\,\mathbf{1}}{\left(\mathbf{1}^H\mathbf{1}\right)^2} \le \dfrac{\lambda_{max}}{\mathbf{1}^H\mathbf{1}} = \dfrac{\lambda_{max}}{Q}$, which also tends to zero when $Q\to\infty$.
d) Use the Cauchy-Schwarz inequality with $\mathbf{u} = \mathbf{R}^{1/2}\mathbf{1}$ and $\mathbf{v} = \mathbf{R}^{-1/2}\mathbf{1}$:

$\left(\mathbf{1}^H\mathbf{R}\,\mathbf{1}\right)\left(\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}\right) \ge \left|\mathbf{1}^H\mathbf{1}\right|^2 = Q^2$

and therefore

$\dfrac{1}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}} \le \dfrac{\mathbf{1}^H\mathbf{R}\,\mathbf{1}}{\left(\mathbf{1}^H\mathbf{1}\right)^2}$

They are equal only for $\mathbf{u}$ proportional to $\mathbf{v}$, i.e. when the samples belong to a white random process.

e) Done above.
II.13.22.-

a.- $x(n) = -a(1)\,x(n-1) + v(n)$; multiplying both sides by $x(n-m)$ and taking expectations we get $r_{xx}(m) = -a(1)\,r_{xx}(m-1) + \sigma_v^2\,\delta(m)\ \forall m\ge 0$. Thus, for any $m$ greater than zero,

$a(1) = -\dfrac{r_{xx}(m)}{r_{xx}(m-1)}$
b.- Since $\{x\}$ and $\{w\}$ are statistically independent, $r_{yy}(m) = r_{xx}(m) + \sigma_w^2\,\delta(m)$. Using the lags involved in the proposed estimate, $r_{yy}(0) = r_{xx}(0) + \sigma_w^2$ and $r_{yy}(1) = r_{xx}(1)$, so

$\hat{a}(1) = -\dfrac{r_{yy}(1)}{r_{yy}(0)} = \dfrac{-\dfrac{r_{xx}(1)}{r_{xx}(0)}}{1+\dfrac{\sigma_w^2}{r_{xx}(0)}} = \dfrac{a(1)\,\text{SNR}}{1+\text{SNR}}$
c.- $s_{yy}(w) = s_{xx}(w) + \sigma_w^2$ (just take the FT of the acf formula of section (b)). The model for $\{y\}$ is AR plus a constant; in consequence, the resulting model for the observed r.p. is ARMA(1,1).
d.- Since $x(n) = y(n) - w(n)$ and $x(n) = -a(1)\,x(n-1) + v(n)$,

$y(n) - w(n) = -a(1)\left(y(n-1) - w(n-1)\right) + v(n)$

or $y(n) = -a(1)\,y(n-1) + \left[w(n) + a(1)\,w(n-1)\right] + v(n)$.

Clearly $y(n-2)$ depends on neither $w(n)$ nor $w(n-1)$, resulting in

$r_{yy}(2) = -a(1)\,r_{yy}(1) \quad\Rightarrow\quad a(1) = -\dfrac{r_{yy}(2)}{r_{yy}(1)}$

Also $r_{yy}(1) = r_{xx}(1) = -a(1)\,r_{xx}(0)$, and

$\sigma_w^2 = r_{yy}(0) - r_{xx}(0) = r_{yy}(0) + \dfrac{r_{yy}(1)}{a(1)} = r_{yy}(0) - \dfrac{r_{yy}^2(1)}{r_{yy}(2)}$
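A minimal sketch of this acf-based estimation (the AR coefficient, noise levels and record length are arbitrary test values):

%-------------------------------------------------------------------------
% AR(1) in white noise: estimate a(1) and sigma_w^2 from acf lags 0..2
a1 = -0.8; N = 1e5;
x = filter(1, [1 a1], randn(1,N));         % AR(1) process x(n) = -a1*x(n-1)+v(n)
y = x + 0.5*randn(1,N);                    % observation with white noise
r = zeros(1,3);
for m = 0:2, r(m+1) = mean(y(1:end-m).*y(1+m:end)); end
a1_hat  = -r(3)/r(2)                       % a(1) from the noise-free lags 1 and 2
sw2_hat = r(1) - r(2)^2/r(3)               % sigma_w^2 = r(0) - r(1)^2/r(2)
%-------------------------------------------------------------------------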
II.13.23.-

a.- Just assigning the following vectors:

$\mathbf{1} = [1\ 1\ \dots\ 1]^H$ and $\mathbf{N} = [-N+1\ -N+2\ \dots\ N-2\ N-1]^H$, with $\mathbf{\Phi} = [\mathbf{1}\ \mathbf{N}]$.

The expected value is:

$E\left(\mathbf{a}_{ML}\right) = \left(\mathbf{\Phi}^H\mathbf{\Phi}\right)^{-1}\mathbf{\Phi}^H E\left(\mathbf{X}\right) = \left(\mathbf{\Phi}^H\mathbf{\Phi}\right)^{-1}\mathbf{\Phi}^H\mathbf{\Phi}\,\mathbf{a} = \mathbf{a}$

which proves that the estimate is unbiased. For the covariance,

$E\left[\left(\mathbf{a}_{ML}-\mathbf{a}\right)\left(\mathbf{a}_{ML}-\mathbf{a}\right)^H\right] = \left(\mathbf{\Phi}^H\mathbf{\Phi}\right)^{-1}\mathbf{\Phi}^H\,E\left[\left(\mathbf{X}-\mathbf{\Phi}\mathbf{a}\right)\left(\mathbf{X}-\mathbf{\Phi}\mathbf{a}\right)^H\right]\mathbf{\Phi}\left(\mathbf{\Phi}^H\mathbf{\Phi}\right)^{-1} = \left(\mathbf{\Phi}^H\mathbf{\Phi}\right)^{-1}\sigma^2 = \sigma^2\begin{pmatrix}1/(2N-1) & 0\\ 0 & 1/S\end{pmatrix}$

where $S = \sum_n n^2$ over the record. Both terms tend to zero when the length tends to infinity.
CHAPTER III

III.10.1.-

$S_{xx}(z) = \dfrac{\sigma_0^2}{\left|A(z)\right|^2} + \sigma^2 = \dfrac{\sigma_0^2 + \sigma^2\left|A(z)\right|^2}{\left|A(z)\right|^2}$

$S_{xx}(z)\,A(z) = \dfrac{\sigma_0^2 + \sigma^2\left|A(z)\right|^2}{A(z^{-1})}$

Since the numerator of the right-hand term spans only lags up to $Q$, the inverse transform of the left-hand term verifies:

$r_x(m) * a(m) = 0 \quad \forall m \ge Q+1$

$r_x(m) + \sum_{q=1}^{Q} a(q)\,r_x(m-q) = 0 \qquad m = Q+1, \dots, \infty$

From these extended Y-W equations, given the acf of the random process, the coefficients of the denominator of the model can be obtained.
Filtering the original data with the coefficients obtained before, the output is a pure MA process:

$\{x\} \;\longrightarrow\; A(z) \;\longrightarrow\; \{y\}$

and $\sigma_0^2$ and $\sigma^2$ can be obtained from the first two terms of the acf of the new data record:

$r_y(0) = \sigma_0^2 + \sigma^2\sum_{q=0}^{Q} a^2(q), \qquad r_y(1) = \sigma^2\sum_{q=0}^{Q-1} a(q)\,a(q+1)$
III.10.2.-

From a filter-bank approach to spectral estimation, the filter associated with a given value of the Periodogram is $\mathbf{A} = \dfrac{\mathbf{S}}{Q}$; this filter results from the following minimization problem:

$\mathbf{A}^H\mathbf{S} = 1$ (steers the desired frequency: 0 dB response)

$\min\ \mathbf{A}^H\mathbf{A}$ (lowest response to white noise)

Moving the same arguments to the case when an interference is present at $\mathbf{S}_i$ results in the following problem:
$\mathbf{A}^H\left[\mathbf{S}\ \mathbf{S}_i\right] = \left[1\ 0\right] \quad\text{or}\quad \mathbf{A}^H\overline{\mathbf{S}} = \mathbf{1}^H \qquad\text{with}\quad \min\ \mathbf{A}^H\mathbf{A}$

where $\overline{\mathbf{S}} = \left[\mathbf{S}\ \mathbf{S}_i\right]$ and here $\mathbf{1} \equiv [1\ 0]^H$. The Lagrangian is

$\Im = \mathbf{A}^H\mathbf{A} - \left(\mathbf{A}^H\overline{\mathbf{S}} - \mathbf{1}^H\right)\boldsymbol{\lambda}; \qquad \dfrac{\partial\Im}{\partial\mathbf{A}^H} = 0 \quad\Rightarrow\quad \mathbf{A} = \overline{\mathbf{S}}\,\boldsymbol{\lambda}$

Taking this solution to the constraint equation,

$\boldsymbol{\lambda}^H\left(\overline{\mathbf{S}}^H\overline{\mathbf{S}}\right) = \mathbf{1}^H \quad\text{or}\quad \boldsymbol{\lambda} = \left(\overline{\mathbf{S}}^H\overline{\mathbf{S}}\right)^{-1}\mathbf{1}$

so $\mathbf{A} = \overline{\mathbf{S}}\left(\overline{\mathbf{S}}^H\overline{\mathbf{S}}\right)^{-1}\mathbf{1}$, and since the spectral estimate is $\dfrac{\mathbf{A}^H\mathbf{R}\,\mathbf{A}}{\mathbf{A}^H\mathbf{A}}$, then

$\hat{S}_{xx}(w) = \dfrac{\mathbf{1}^H\left(\overline{\mathbf{S}}^H\overline{\mathbf{S}}\right)^{-1}\overline{\mathbf{S}}^H\mathbf{R}\,\overline{\mathbf{S}}\left(\overline{\mathbf{S}}^H\overline{\mathbf{S}}\right)^{-1}\mathbf{1}}{\mathbf{1}^H\left(\overline{\mathbf{S}}^H\overline{\mathbf{S}}\right)^{-1}\mathbf{1}} \qquad \forall w \ne w_i$
III.10.3.-

For every frequency, denoted by index $l_0$, the estimate is an average of the surrounding DFT samples. [Figure: an averaging filter sliding over the data DFT, centered at $l_0$.] In consequence, the estimate is given by the convolution of the averaging function $w(n)$, moved to frequency $l_0$, with the original sequence:

$\sum_{q=-Q}^{Q} w(q)\,x(n-q)\exp\left(-j2\pi l_0 n/N\right)$

The procedure is equivalent to modulating the original data, in such a way that index $l_0$ moves to zero frequency, and then applying a low-pass filter $W(l)$.
MAJOR DRAWBACKS

- Averaging in the frequency domain does not take into account that the phase of the DFT may produce undesired cancellations. A better procedure is to average the Periodogram samples directly. This implies that the equivalent operation in the time domain is the convolution of the sample autocorrelation with the inverse Fourier transform of the averaging function.

- The equivalence in the time domain is not strict, since a finite support is used in the frequency domain. This implies that the time support of the time window may exceed the duration of the original record.
III.10.4.-

See the solution in A. Papoulis, "The Fourier Integral and its Applications", a pioneering work (his first book) describing the role of the Fourier transform in electrical engineering and communications.
III.10.5.-

The problem of 2-D spectral estimation is fully described in Chapter V of these course notes, and that section is easy to read with the background obtained from this chapter; the exercise is solved here for completeness. [Figure: 2-D prediction mask over indexes (n1, n2).]

The 2-D filter response is $A(w,\psi) = \mathbf{a}^H\mathbf{S}$, where

$\mathbf{S}^H = \left[1, e^{jw}, \dots, e^{jPw}, e^{j\psi}, \dots, e^{j\psi+jPw}, e^{j2\psi}, \dots, e^{jQ\psi+jPw}\right]$

with data vector

$\mathbf{x}^H = \left[x(n,m), x(n-1,m), \dots, x(n-P,m), x(n,m-1), \dots, x(n-P,m-1), \dots, x(n-P,m-Q)\right]$
III.10.6.-

Given the 2-D predictor $\mathbf{a}(n_1,n_2)$, the first problem is to decide which sample is going to be predicted; at least four choices are possible. Note that the concept of causality, usual in 1-D, is lost.

The constraint is $\mathbf{a}^H\mathbf{1} = 1$, where $\mathbf{1}^H = \left[0, \dots, 0, 1\ (\text{position }P,1), 0, \dots, 0\right]$. Now, to minimize the prediction error we need $\min\ \mathbf{a}^H\mathbf{C}\,\mathbf{a}$.

The solution is $\mathbf{a} = \dfrac{\mathbf{C}^{-1}\mathbf{1}}{\mathbf{1}^H\mathbf{C}^{-1}\mathbf{1}}$, the prediction error is $\xi = \dfrac{1}{\mathbf{1}^H\mathbf{C}^{-1}\mathbf{1}}$, and the spectral density estimate, i.e. the prediction error divided by the squared magnitude of the frequency response of the linear predictor, is

$\hat{S}_{xx}^{LP}(w,\psi) = \dfrac{\mathbf{1}^H\mathbf{C}^{-1}\mathbf{1}}{\left|\mathbf{1}^H\mathbf{C}^{-1}\mathbf{S}\right|^2}$

It is NOT possible to say that this estimate is a maximum entropy estimate, as it was in 1-D, since the acf support used for the filter computation exceeds the size of the predictor. To see this more clearly, let us formulate the MEM estimate as in 1-D:
Miguel Angel Lagunas 13/08/2007 32
----------------------------------------------------------------------------------------------------------
$\max\ \iint\ln\left(S_{xx}^{MEM}(w,\psi)\right)dw\,d\psi$

subject to

$\iint S_{xx}^{MEM}(w,\psi)\exp\left(j\left(w\,n_1+\psi\,n_2\right)\right)dw\,d\psi = c(n_1,n_2) \qquad \forall\,n_1,n_2 \le P,Q$

Assuming $P$ and $Q$ equal to 1, the number of values of matrix $\mathbf{C}$ used to derive the MEM estimate is 9 (i.e. $(-1,-1), (-1,0), (-1,1), (0,-1), (0,0), (0,1), (1,-1), (1,0)$ and $(1,1)$). For a linear predictor of 9 coefficients, the matrix involved requires 37 acf values. Clearly the support or basis information is different.
III.10.7.-

Let $\mathbf{a}_0, \mathbf{a}_1, \dots, \mathbf{a}_{Q-1}$ be the set of vectors containing the optimum linear predictors of successive orders for a given random process. Due to their different lengths (in increasing order), it is easy to check that $\mathbf{A} = \left[\mathbf{a}_0\ \mathbf{a}_1\ \dots\ \mathbf{a}_{Q-1}\right]$ is upper triangular. At the same time, since the forward and backward predictors of a stationary random process are equal, this matrix diagonalizes the acf matrix of the process:

$\mathbf{A}^H\mathbf{R}\,\mathbf{A} = \text{diag}\left(\xi_0, \xi_1, \dots, \xi_{Q-1}\right) = \boldsymbol{\xi}$

where the elements of the diagonal are the prediction error powers. In consequence, $\mathbf{R}^{-1} = \mathbf{A}\,\boldsymbol{\xi}^{-1}\mathbf{A}^H$. Since the matrix of prediction errors is diagonal, this last expression can be further developed, showing its explicit dependence on the linear predictors:

$\mathbf{R}^{-1} = \mathbf{A}\,\boldsymbol{\xi}^{-1}\mathbf{A}^H = \sum_{q=0}^{Q-1}\dfrac{\mathbf{a}_q\mathbf{a}_q^H}{\sigma_q^2}$

Multiplying both sides by $\mathbf{S}^H$ and $\mathbf{S}$ and taking the inverse, we obtain

$S_{xx}^{MLM(Q-1)}(w) = \dfrac{1}{\mathbf{S}^H\mathbf{R}^{-1}\mathbf{S}} = \dfrac{1}{\sum_{q=0}^{Q-1}\dfrac{1}{\sigma_q^2}\left|\mathbf{a}_q^H\mathbf{S}\right|^2} = \dfrac{1}{\sum_{q=0}^{Q-1}\dfrac{1}{S_{xx}^{MEM(q)}(w)}}$

or

$\dfrac{1}{S_{xx}^{MLM(Q-1)}(w)} = \sum_{q=0}^{Q-1}\dfrac{1}{S_{xx}^{MEM(q)}(w)}$

In other words, the MLM estimate, being the "parallel union" of successive MEM estimates, will always have poorer resolution than the corresponding MEM estimate for the same order (equal size of MEM predictor or MLM filter). THIS IS NO LONGER TRUE FOR NMLM, WHICH HAS SUPERIOR PERFORMANCE IN TERMS OF LOW SIDELOBES AND RESOLUTION THAN MEM. Only when the process under analysis is actually an AR, the order of the predictor matches perfectly the model order, and the number of data samples is above ten times the order, will MEM be superior to NMLM.
III.10.8.-

$\mathbf{R} = \alpha\,\mathbf{S}_0\mathbf{S}_0^H + \sigma^2\mathbf{I}$

$\mathbf{R}\,\mathbf{S}_0 = \alpha\,\|\mathbf{S}_0\|^2\,\mathbf{S}_0 + \sigma^2\,\mathbf{S}_0 = \left(\sigma^2 + \alpha\,\|\mathbf{S}_0\|^2\right)\mathbf{S}_0$

thus $\dfrac{\mathbf{S}_0}{\|\mathbf{S}_0\|}$ is the eigenvector and $\left(\sigma^2 + \alpha\,\|\mathbf{S}_0\|^2\right)$ is the eigenvalue.
III.10.9.-

$\begin{pmatrix} r(m) & r(m-1) & \cdots & r(m-Q+1)\\ r(m+1) & r(m) & \cdots & r(m-Q)\\ \vdots & \vdots & & \vdots\\ r(m+P-1) & r(m+P-2) & \cdots & r(m+P-Q)\end{pmatrix}\begin{pmatrix}1\\ a(1)\\ \vdots\\ a(Q-1)\end{pmatrix} = \mathbf{0} \qquad \forall P \ge Q$

This system of equations, $\mathbf{R}_e\,\mathbf{a} = \mathbf{0}$, is overdetermined and has no solution unless the estimate is perfect and the order matches the model order; in general, there is no solution for it. One way out is to match the system of equations in the MSE sense, i.e. find the vector $\mathbf{a}$ such that $\left\|\mathbf{R}_e\,\mathbf{a}\right\|^2$ is minimized, of course with the constraint that the first coefficient of the unknown vector is equal to one:

$\min\ \mathbf{a}^H\mathbf{R}_e^H\mathbf{R}_e\,\mathbf{a} \quad\text{s.t.}\quad \mathbf{a}^H\mathbf{1} = 1$

The solution to this problem is

$\mathbf{a} = \dfrac{\left(\mathbf{R}_e^H\mathbf{R}_e\right)^{-1}\mathbf{1}}{\mathbf{1}^H\left(\mathbf{R}_e^H\mathbf{R}_e\right)^{-1}\mathbf{1}}$
III.10.10.-
III.10.11.-

Since we are looking for two frequencies, two equations are enough to find them. These two equations reflect that the signal, without noise, is perfectly predictable, since it is the solution of a deterministic difference equation. To set these two equations we use all the available data and, in order to do this, we select the forward and backward equations for the two border samples $x(0)$ and $x(N-1)$. This is correct since pure sinusoids present the same forward and backward evolution.

$\begin{pmatrix} x(0) & x(1) & \cdots & x(N-3) & x(N-2)\\ x(N-1) & x(N-2) & \cdots & x(2) & x(1)\end{pmatrix}\begin{pmatrix}a(0)\\ a(1)\\ \vdots\\ a(N-2)\end{pmatrix} = \begin{pmatrix}x(N-1)\\ x(0)\end{pmatrix}$

in vector form, $\mathbf{X}\,\mathbf{a} = \mathbf{x}$. Now, since the system is underdetermined, we take into account the presence of white noise by imposing that the solution minimize its response to the noise. This is equivalent to taking the solution of the above system with the minimum norm among all the possible solutions:

$\min\ \mathbf{a}^H\mathbf{a} \quad\text{s.t.}\quad \mathbf{X}\,\mathbf{a} = \mathbf{x}$

The solution is $\mathbf{a} = \mathbf{X}^H\left(\mathbf{X}\,\mathbf{X}^H\right)^{-1}\mathbf{x}$, and the spectral estimate is derived from the fact that the roots of the polynomial formed by the vector lie on a circle inside the unit circle, but two of them must lie at the locations corresponding to the two frequencies contained in the data record. In summary, the two major peaks of

$\hat{S}_{xx}(w) = \dfrac{1}{\left|\mathbf{a}^H\mathbf{S}\right|^2}$

will coincide with the actual frequencies, or be close to them depending on the signal-to-noise ratio.
Since $\left(\mathbf{X}\,\mathbf{X}^H\right)$ is 2x2, its product with vector $\mathbf{x}$ is 2x1:

$\left(\mathbf{X}\,\mathbf{X}^H\right)^{-1}\mathbf{x} = \begin{pmatrix}\beta_1\\ \beta_2\end{pmatrix}$

$\mathbf{a}^H\mathbf{S} = \boldsymbol{\beta}^H\mathbf{X}\,\mathbf{S} = \beta_1^*\cdot\text{DFT}\{\text{signal, forward}\} + \beta_2^*\cdot\text{DFT}\{\text{signal, backward}\}$

Then, the resulting procedure is basically the combination of two periodograms, weighted in such a manner that they null out at the actual frequency locations.
The similarity with MUSIC or, closer, with the pioneering work of Pisarenko, can be easily viewed when, instead of the minimum-norm solution of an underdetermined system of equations, we force unit norm with minimum prediction error (after extending vector $\mathbf{a}$ with a new coefficient $a(N-1)$ in order to leave zero in the second term):

$\min\ \left\|\mathbf{X}\,\mathbf{a}\right\|^2 \quad\text{s.t.}\quad \mathbf{a}^H\mathbf{a} = 1$

The solution to this system is the eigenvector with minimum eigenvalue of matrix $\left(\mathbf{X}^H\mathbf{X}\right)$. In fact, any of the $N-2$ eigenvectors with minimum eigenvalue will be adequate (the reader may extend this idea, using all the noise eigenvectors, to derive MUSIC).
III.10.12.-

For the 2-D case, there are four possible equations that allow the use of the overall data when formulating the exact prediction; these correspond to the four corners of the original 2-D data. Note that we search for four frequencies, i.e. two pairs $(w,\psi)$.
III.10.13.-

$\{e\} \;\longrightarrow\; H(w) \;\longrightarrow\; \{x\}$

a) $\hat{S}_x(w) = \left|H\right|^2\hat{S}_e(w)$

b) $E\left[\hat{S}_x(w)\right] = \left|H\right|^2 E\left[\hat{S}_e(w)\right] = \left|H\right|^2 N_0$: unbiased for white noise.

c) $\text{var}^2\left(\hat{S}_x(w)\right) = E\left[\hat{S}_x^2\right] - \left(E\left[\hat{S}_x\right]\right)^2 = E\left[\hat{S}_x^2\right] - \left|H\right|^4 N_0^2$, and $E\left[\hat{S}_x^2\right] = \left|H\right|^4 E\left[\hat{S}_e^2\right]$; altogether

$\text{var}^2\left(\hat{S}_x(w)\right) = \left(\left|H\right|^2 N_0\right)^2\left(\dfrac{E\left[\hat{S}_e^2\right]}{N_0^2} - 1\right) = S_x^2(w)\left(\dfrac{E\left[\hat{S}_e^2\right]}{N_0^2} - 1\right)$

Because $E\left[\hat{S}_e^2\right] = \text{var}\left(\hat{S}_e\right) + \left(E\left[\hat{S}_e\right]\right)^2 = \text{var}\left(\hat{S}_e\right) + N_0^2$, the desired result follows.
III.10.14.-

Thus $S^{NMLM}(w) \le Q^2\,S^{MLM}(w)$, with equality when both vectors are proportional. This occurs at the actual frequency location, since there the actual steering vector is an eigenvector of the acf matrix.

f) $\Lambda\left(\alpha,\mathbf{R}_0,\mathbf{S}\right) = \left(\mathbf{X}_n-\alpha\mathbf{S}\right)^H\mathbf{R}_0^{-1}\left(\mathbf{X}_n-\alpha\mathbf{S}\right)$; $\dfrac{\partial\Lambda}{\partial\alpha^*} = 0 \Rightarrow \alpha_{ML} = \dfrac{\mathbf{S}^H\mathbf{R}_0^{-1}\mathbf{X}_n}{\mathbf{S}^H\mathbf{R}_0^{-1}\mathbf{S}}$

g) $\text{var}^2_{ML} = \dfrac{1}{\mathbf{S}^H\mathbf{R}_0^{-1}\mathbf{S}}$ and $\text{var}^2_{Capon} = \dfrac{\mathbf{S}^H\mathbf{R}^{-1}\mathbf{R}_0\mathbf{R}^{-1}\mathbf{S}}{\left(\mathbf{S}^H\mathbf{R}^{-1}\mathbf{S}\right)^2}$. Using the Cauchy-Schwarz inequality with $\mathbf{u} = \mathbf{R}_0^{1/2}\mathbf{R}^{-1}\mathbf{S}$ and $\mathbf{v} = \mathbf{R}_0^{-1/2}\mathbf{S}$,

$\left(\mathbf{S}^H\mathbf{R}^{-1}\mathbf{R}_0\mathbf{R}^{-1}\mathbf{S}\right)\left(\mathbf{S}^H\mathbf{R}_0^{-1}\mathbf{S}\right) \ge \left(\mathbf{S}^H\mathbf{R}^{-1}\mathbf{S}\right)^2$

so $\text{var}^2_{Capon} \ge \text{var}^2_{ML}$. Equality holds when $\mathbf{u} = \beta\,\mathbf{v}$, i.e. when $\mathbf{R}\,\mathbf{R}_0^{-1}\mathbf{S} = \beta\,\mathbf{S}$.
a), b) Being so small, the line at 1/6 is severely masked due to the window leakage promoted by the strong line at 1. Assuming that we are looking for the spectral content at frequency $w_0$, the best choice for the data window is to have zero response at $f_1 = 1$, i.e. the window will have a zero at $w_0 - w_1$. Also a zero is necessary to remove the leakage from the line at $-1$, i.e. at $w_0 - w_{-1}$.

[Figure: desired window response $A(w)$ with nulls at $w_{-1}$ and $w_1$ around the analysis frequency $w$.]
d) Assuming that only one line has to be removed, the constraint is $A(w_1) = 0$. The use of a single nulling frequency in the design is for the sake of presentation; without loss of generality, the design included hereafter can be extended to several nulling frequencies. Going back to the window design, we have two constraints: the value at $w=0$ has to be one (no bias) and zero at $w_1$. In consequence:

$\mathbf{a}^H\left[\mathbf{1}\ \mathbf{S}_\Delta\right] = \left[1\ 0\right] \quad\text{or}\quad \mathbf{a}^H\boldsymbol{\Psi} = \mathbf{1}^H$

The design is completed with minimum bandwidth (white-noise equivalent bandwidth), i.e. minimum norm of vector $\mathbf{a}$, which is equivalent to minimum d.c. response of the lag-window corresponding to the data window under design:

$\min\ \mathbf{a}^H\mathbf{a}$

and the solution is $\mathbf{a} = \boldsymbol{\Psi}\left(\boldsymbol{\Psi}^H\boldsymbol{\Psi}\right)^{-1}\mathbf{1}$, with

$\mathbf{a}^H\mathbf{a} = \left[1\ 0\right]\left(\boldsymbol{\Psi}^H\boldsymbol{\Psi}\right)^{-1}\begin{pmatrix}1\\0\end{pmatrix} = \dfrac{Q}{Q^2-\left|\alpha\right|^2}, \qquad \alpha = \mathbf{1}^H\mathbf{S}_\Delta$

and the constraint $\mathbf{a}^H\mathbf{1} = 1$ is verified.
e) The length of the segment, given the desired resolution, is 32 months.
III.10.18.-

Let $x_0$ be an unbiased estimate, $E(x_0) = x$, with minimum variance $\text{var}(x_0) = \sigma_0^2$, and consider the scaled estimate $x_1 = \rho\,x_0$. Minimizing the MSE of $x_1$ with respect to $\rho$ gives, in consequence:

$\rho = \dfrac{x^2}{\sigma_0^2 + x^2}$

In the case of the Periodogram, $\sigma_0^2 = x^2$ and $\rho = \dfrac{1}{2}$.

This exercise shows how the MSE is a trade-off between bias and variance. In any case, the relevance of a scale factor is removed when a logarithmic scale is used to plot the estimates.
III.10.19.-

This last integral is just the DFT of the signal $x(\cdot)$ windowed by $h(\cdot)$, along the duration of the impulse response $T$. In summary,

$\left|y(t)\right|^2 = \left|\text{DFT}(x)\right|^2_{\,w = w(t)}$

[Figure: $|y(t)|^2$ versus $t$; the instantaneous frequency $w(t) = \partial\phi/\partial t$ sweeps between $w_{min}$ and $w_{max}$.]
III.10.21.-

1A.- MEM sets correlation constraints in such a way that the estimate shows the same autocorrelation values as the data record:

$\dfrac{1}{2\pi}\int_{-\pi}^{\pi} s_x(w)\exp(jqw)\,dw = r(q); \qquad q = -Q, \dots, Q$

The objective is the maximum flatness of the estimate that, at the same time, satisfies the correlation constraints. Maximum flatness is equivalent to maximum entropy; thus, the objective is:

$\max\ \dfrac{1}{2\pi}\int_{-\pi}^{\pi}\ln\left(s_x(w)\right)dw$
2A.- When there is only one correlation constraint, the formulation of the estimate is:

$\max\ \dfrac{1}{2\pi}\int_{-\pi}^{\pi}\ln\left(s_x(w)\right)dw \quad\text{s.t.}\quad \dfrac{1}{2\pi}\int_{-\pi}^{\pi}s_x(w)\,dw = r(0)$

Setting the derivative of the Lagrangian integrand to zero, $\dfrac{1}{s_x(w)} - \lambda_0 = 0$, i.e. the solution is $s_x(w) = \dfrac{1}{\lambda_0}\ \forall w$. In consequence, the maximum entropy estimate under a power constraint (just the zero lag of the autocorrelation) is white noise with the same power.
3A.- Solving the general problem, the integrand of the Lagrangian is:

$\ln\left(s_x(w)\right) - \sum_{q=-Q}^{Q}\lambda_q\,s_x(w)\exp(jqw)$

and setting its derivative to zero results in:

$\dfrac{1}{s_x(w)} - \sum_{q=-Q}^{Q}\lambda_q\exp(jqw) = 0 \quad\text{or}\quad s_x(w) = \dfrac{1}{\sum_{q=-Q}^{Q}\lambda_q\exp(jqw)}$

$s_x(w) = \dfrac{\sigma^2}{\left|A(\exp(jw))\right|^2}$

which coincides with the AR model estimate of the same order.
4A.- The Yule-Walker equations, allowing the computation of the coefficients of an AR model of order $Q$, are as follows:

$\mathbf{R}\,\mathbf{A} = \sigma^2\,\mathbf{1}, \qquad \mathbf{1} = \left[1\ 0\ \dots\ 0\right]^T$

where $\mathbf{R}$ is the autocorrelation matrix and $\mathbf{A} = \left[1\ a(1)\ \dots\ a(Q)\right]^T$ is the vector containing the denominator coefficients, with the first coefficient equal to one.

The design of a filter with minimum power at its output, when the input is the signal $\{x\}$, implies minimizing $\mathbf{A}^H\mathbf{R}\,\mathbf{A}$ with the constraint that the first coefficient equal one, which can be set in vector form as $\mathbf{A}^H\mathbf{1} = 1$. The solution to this problem is:

$\mathbf{A} = \dfrac{\mathbf{R}^{-1}\mathbf{1}}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}} \quad\text{or}\quad \mathbf{R}\,\mathbf{A} = \dfrac{1}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}\,\mathbf{1}$

where it is evident that these equations are identical to the Yule-Walker equations. In summary, the solutions of both designs coincide. Clearly, the noise power at the input of the AR model can be computed as:

$\sigma^2 = \dfrac{1}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}$
5A.- When the signal under analysis is pure AR, the white noise at the input of the model is Gaussian and verifies the AR equation:

$w(n) = x(n) + \sum_{q=1}^{Q} a(q)\,x(n-q) = \mathbf{A}^H\mathbf{X}_n$

The likelihood is:

$\Pr_{\mathbf{A}}\left(w(n)\right) = cte\cdot\exp\left\{-\mathbf{A}^H\mathbf{R}\,\mathbf{A}\right\}$

Clearly the maximization of the likelihood implies the minimization of the exponent while preserving the first coefficient equal to one. Thus, the previous design coincides with the maximum likelihood procedure to estimate the AR coefficients.
CHAPTER IV

IV.9.1.-

$\min\ \left\|\mathbf{X}_n^H\mathbf{a} - \mathbf{d}_n\right\|^2 \quad\text{s.t.}\quad \mathbf{a}^H\mathbf{a} = 1$

Solving this problem, the solution is:

$\mathbf{a} = \left(\mathbf{X}_n\mathbf{X}_n^H + \lambda\,\mathbf{I}\right)^{-1}\mathbf{X}_n\,\mathbf{d}_n$

Note that the Lagrange multiplier, always greater than zero, acts as if white noise were added to the original data. This procedure is also known as the diagonal loading method, used to make the resulting Wiener filter more robust to impairments.
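A minimal sketch of the loaded solution (sizes, data and the loading value are arbitrary illustrative choices):

%-------------------------------------------------------------------------
% Wiener/LS solution with diagonal loading
Q = 8; N = 200; lambda = 0.1;
X = randn(Q,N); d = randn(1,N);            % data snapshots and reference
a = (X*X' + lambda*eye(Q)) \ (X*d');       % diagonally loaded weight vector
%-------------------------------------------------------------------------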
IV.9.2.-

$\mathbf{X}_n = \mathbf{d} + \mathbf{w}_n; \qquad \mathbf{R} = \mathbf{d}\,\mathbf{d}^H + \sigma^2\mathbf{I}; \qquad \mathbf{R}^{-1} = \dfrac{1}{\sigma^2}\left(\mathbf{I} - \dfrac{\mathbf{d}\,\mathbf{d}^H}{\sigma^2+\|\mathbf{d}\|^2}\right); \qquad \mathbf{P} = P_d\,\mathbf{d}$

then

$\mathbf{R}^{-1}\mathbf{P} = \dfrac{P_d}{\sigma^2}\,\mathbf{d}\left(1 - \dfrac{\|\mathbf{d}\|^2}{\sigma^2+\|\mathbf{d}\|^2}\right) = \dfrac{P_d}{\sigma^2+\|\mathbf{d}\|^2}\,\mathbf{d}$

which reveals that the solution coincides, within a constant that does not modify the output SNR, with the vector containing the deterministic signal component.
IV.9.3.-

Estimating the frequency-domain Wiener filter from the DFTs of segments of the input and output signals,

$H(l) = \dfrac{\sum Y(l)\,X^*(l)}{\sum \left|X(l)\right|^2}$

faces a problem of spectral estimation. In fact (see Chapter II, where the spectral coherence is described), the actual expression of the optimum filter is the quotient of the cross-spectral density between the output and the input and the spectral density of the input:

$H(w) = \dfrac{S_{xy}(w)}{S_{xx}(w)}$

Windows, number of segments, resolution, etc. are the problems that this way of obtaining the Wiener filter has to overcome. On the other hand, the processing and design are done only with FFTs; this is more convenient than the traditional time-domain design when the filter is long (above 64 samples).
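A minimal sketch of this segment-averaged, frequency-domain design (segment length, number of segments and the test channel are arbitrary choices):

%-------------------------------------------------------------------------
% Frequency-domain Wiener filter estimate from segment DFTs
L = 64; K = 100;                            % segment length, number of segments
x = randn(1, L*K);
y = filter([1 0.5 -0.2], 1, x) + 0.1*randn(1, L*K);   % test channel plus noise
Sxy = zeros(1,L); Sxx = zeros(1,L);
for k = 1:K
    X = fft(x((k-1)*L+1 : k*L));
    Y = fft(y((k-1)*L+1 : k*L));
    Sxy = Sxy + Y.*conj(X);                 % averaged cross-spectrum
    Sxx = Sxx + abs(X).^2;                  % averaged input spectrum
end
H = Sxy./Sxx;                               % estimate of the optimum filter
%-------------------------------------------------------------------------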
IV.9.4.-

$\hat{x}(n+2) = -a(1)\,x(n-1) - a(2)\,x(n-2)$

and the coefficients follow from imposing that the prediction error be orthogonal to the data.
IV.9.5.-

The proper choice is to select those lags where the estimated acf shows the greatest absolute values. The reason is that the prediction is favored when the samples used to make the prediction have high correlation (positive or negative) with the sample to be predicted.
IV.9.6.-

$\mathbf{R} = \begin{pmatrix}1 & 0.5 & a\\ 0.5 & 1 & 0.5\\ a & 0.5 & 1\end{pmatrix}$

Using Levinson for this matrix: clearly the MEM extrapolation for $r(2)$ corresponds to the second Parcor equal to zero, i.e. the value of $a$ that coincides with the MEM extrapolation of $r(0)$ and $r(1)$ is 0.25.

The range of values for $a$ (see IV.35) is $r_{MEM}(2) \pm \sigma_1^2 = \begin{cases}1\\ -0.5\end{cases}$
IV.9.7.-
IV.9.8.-
$k_Q = \dfrac{2\sum F\cdot B}{\sum F^2 + \sum B^2} \le \dfrac{\sum F\cdot B}{\sqrt{\left(\sum F^2\right)\left(\sum B^2\right)}} \le 1$

We used that the arithmetic mean is greater than or equal to the geometric mean and, later on, the Cauchy-Schwarz inequality $\sum u^2\cdot\sum v^2 \ge \left(\sum u\,v\right)^2$.
IV.9.9.-

$e_b^Q(n) = \left(\mathbf{a}_b^Q\right)^H\begin{pmatrix}x(n-Q)\\ x(n-Q+1)\\ \vdots\\ x(n)\end{pmatrix} \qquad\text{and}\qquad e_b^{Q-1}(n) = \left(\mathbf{a}_b^{Q-1}\right)^H\begin{pmatrix}x(n-Q+1)\\ x(n-Q+2)\\ \vdots\\ x(n)\end{pmatrix} = \begin{pmatrix}0\\ \mathbf{a}_b^{Q-1}\end{pmatrix}^H\begin{pmatrix}x(n-Q)\\ x(n-Q+1)\\ \vdots\\ x(n)\end{pmatrix}$

then

$E\left[e_b^Q(n)\,e_b^{Q-1}(n)^*\right] = \left(\mathbf{a}_b^Q\right)^H E\left[\begin{pmatrix}x(n-Q)\\ \vdots\\ x(n)\end{pmatrix}\left(x(n-Q)\ \dots\ x(n)\right)^*\right]\begin{pmatrix}0\\ \mathbf{a}_b^{Q-1}\end{pmatrix} = \left(\mathbf{a}_b^Q\right)^H\mathbf{R}\begin{pmatrix}0\\ \mathbf{a}_b^{Q-1}\end{pmatrix}$

Now, using the expression of the Y-W equations for the length-$Q$ backward predictor,

$\mathbf{R}\,\mathbf{a}_b^Q = \begin{pmatrix}\sigma_{b,Q}^2\\ 0\\ \vdots\\ 0\end{pmatrix}$

we have

$E\left[e_b^Q(n)\,e_b^{Q-1}(n)^*\right] = \left(\sigma_{b,Q}^2\ 0\ \dots\ 0\right)\mathbf{R}^{-1}\mathbf{R}\begin{pmatrix}0\\ \mathbf{a}_b^{Q-1}\end{pmatrix} = 0$
IV.9.10.-

$\mathbf{X}_n = \alpha\,e^{jw_d n}\,\mathbf{S}_d + \mathbf{w}_n$

$\mathbf{h} = \dfrac{\mathbf{R}^{-1}\mathbf{1}}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}, \qquad \xi_{MIN} = \dfrac{1}{\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1}}$

$\mathbf{R} = \alpha^2\,\mathbf{S}_d\mathbf{S}_d^H + \mathbf{R}_0 \quad\Rightarrow\quad \mathbf{R}^{-1} = \mathbf{R}_0^{-1} - \dfrac{\alpha^2}{1+\rho}\,\mathbf{R}_0^{-1}\mathbf{S}_d\mathbf{S}_d^H\mathbf{R}_0^{-1}, \qquad \rho = \alpha^2\,\mathbf{S}_d^H\mathbf{R}_0^{-1}\mathbf{S}_d$

$\mathbf{R}^{-1}\mathbf{1} = \left(\mathbf{I} - \dfrac{\alpha^2}{1+\rho}\,\mathbf{R}_0^{-1}\mathbf{S}_d\mathbf{S}_d^H\right)\mathbf{R}_0^{-1}\mathbf{1}$

$\mathbf{1}^H\mathbf{R}^{-1}\mathbf{1} = \mathbf{1}^H\mathbf{R}_0^{-1}\mathbf{1}\left(1 - \dfrac{\alpha^2}{1+\alpha^2\,\mathbf{S}_d^H\mathbf{R}_0^{-1}\mathbf{S}_d}\cdot\dfrac{\left|\mathbf{1}^H\mathbf{R}_0^{-1}\mathbf{S}_d\right|^2}{\mathbf{1}^H\mathbf{R}_0^{-1}\mathbf{1}}\right)$

Now, using the definitions of the MLM and MEM spectral estimates for the noise covariance matrix,
$\xi_{MIN} = \dfrac{1}{\mathbf{1}^H\mathbf{R}_0^{-1}\mathbf{1}}\left(1 - \dfrac{\alpha^2}{1+\alpha^2/S_0^{MLM}(w_d)}\cdot\dfrac{1}{S_0^{MEM}(w_d)}\right)^{-1}$

or

$\xi_{MIN} = \dfrac{1}{\mathbf{1}^H\mathbf{R}_0^{-1}\mathbf{1}}\cdot\dfrac{1+\dfrac{\alpha^2}{S_0^{MLM}(w_d)}}{1+\alpha^2\left(\dfrac{1}{S_0^{MLM}(w_d)} - \dfrac{1}{S_0^{MEM}(w_d)}\right)}$

Note that, depending on the spectral density of the noise at the frequency of interest, the noise reduction changes. When the MEM estimate of the noise is very high, $\xi_{MIN} = \left(\mathbf{1}^H\mathbf{R}_0^{-1}\mathbf{1}\right)^{-1}$; meanwhile, when the density is very low, i.e. a band-pass filter may take out the line, $\xi_{MIN} \Rightarrow 0$.
IV.9.11.-

With $e = x - y$,

$\xi = E\left[\left|e\right|^2\right] = \mathbf{b}^H\mathbf{R}_{uu}\mathbf{b} + \mathbf{a}^H\mathbf{R}_{vv}\mathbf{a} - \mathbf{b}^H\mathbf{R}_{uv}\mathbf{a} - \mathbf{a}^H\mathbf{R}_{vu}\mathbf{b}$

The minimization of the error entails setting the gradients to zero:

$\dfrac{\partial\xi}{\partial\mathbf{b}^H} = 0 = \mathbf{R}_{uu}\mathbf{b} - \mathbf{R}_{uv}\mathbf{a} \quad\Rightarrow\quad \mathbf{R}_{uu}\mathbf{b} = \mathbf{R}_{uv}\mathbf{a}$

$\dfrac{\partial\xi}{\partial\mathbf{a}^H} = 0 = \mathbf{R}_{vv}\mathbf{a} - \mathbf{R}_{vu}\mathbf{b} \quad\Rightarrow\quad \mathbf{R}_{vv}\mathbf{a} = \mathbf{R}_{vu}\mathbf{b}$

These two equations correspond to the Wiener solution of each filter given the other one. Combining both: $\mathbf{R}_{vv}\mathbf{a} = \mathbf{R}_{vu}\mathbf{R}_{uu}^{-1}\mathbf{R}_{uv}\mathbf{a}$.

Solving this equation directly may not be possible, since it implies the existence of an eigenvalue equal to one. When solved for all the eigenvalues,
$\lambda\,\mathbf{a} = \left(\mathbf{R}_{vv} - \mathbf{R}_{vu}\mathbf{R}_{uu}^{-1}\mathbf{R}_{uv}\right)\mathbf{a}$

and then the error is no longer zero but equals:

$\mathbf{b} = \mathbf{R}_{uu}^{-1}\mathbf{R}_{uv}\mathbf{a}, \qquad \xi_{MIN} = \mathbf{a}^H\left(\mathbf{R}_{vv} - \mathbf{R}_{vu}\mathbf{R}_{uu}^{-1}\mathbf{R}_{uv}\right)\mathbf{a} = \lambda$

which implies that the optimum is the eigenvector associated with the minimum eigenvalue. Note that this technique does not guarantee that the denominator will be a minimum-phase polynomial.
IV.9.12 to IV.9.15.- They can be solved directly from the chapter content and previous exercises.
IV.9.16.-

2A.- $\Pr\left(\mathbf{X}_n \mid i(n)\right) = cte\cdot\exp\left\{-\left(\mathbf{X}_n - \mathbf{h}\,i(n)\right)^H\mathbf{R}_0^{-1}\left(\mathbf{X}_n - \mathbf{h}\,i(n)\right)\right\}$

$\dfrac{d(\cdot)}{di^*(n)} = 0 = \mathbf{h}^H\mathbf{R}_0^{-1}\left(\mathbf{X}_n - \mathbf{h}\,i(n)\right) \quad\Rightarrow\quad i_{ML}(n) = \dfrac{\mathbf{h}^H\mathbf{R}_0^{-1}\mathbf{X}_n}{\mathbf{h}^H\mathbf{R}_0^{-1}\mathbf{h}}$

2B.- $\xi = \left\|\mathbf{X}_n - \mathbf{h}\,i(n)\right\|^2$, which is the same as the exponent of the likelihood when $\mathbf{R}_0$ is a constant times the identity matrix. In other words:

$i_{MSE}(n) = \dfrac{\mathbf{h}^H\mathbf{X}_n}{\mathbf{h}^H\mathbf{h}}$

2C.- $\Psi = E\left[\left(\mathbf{h}_e^H\mathbf{X}_n - i(n)\right)\left(\mathbf{h}_e^H\mathbf{X}_n - i(n)\right)^H\right]$; $\dfrac{d\Psi}{d\mathbf{h}_e} = E\left[\mathbf{X}_n\mathbf{X}_n^H\right]\mathbf{h}_e - E\left[\mathbf{X}_n\,i^*(n)\right]$, and minimizing,

$\mathbf{h}_e = \left(\mathbf{h}\,\mathbf{h}^H + \mathbf{R}_0\right)^{-1}\mathbf{h} = \mathbf{R}_0^{-1}\mathbf{h}\left(1 - \dfrac{\rho}{1+\rho}\right), \qquad \rho = \mathbf{h}^H\mathbf{R}_0^{-1}\mathbf{h}$

thus both differ in a constant.
IV.9.17.-

a.- $\begin{pmatrix}x(n)\\ x(n-1)\end{pmatrix} = \begin{pmatrix}1 & a & 0\\ 0 & 1 & a\end{pmatrix}\begin{pmatrix}d(n)\\ d(n-1)\\ d(n-2)\end{pmatrix}$, i.e. $\mathbf{X}_n = \mathbf{H}\,\mathbf{d}_n$ and $\mathbf{Y}_n = \mathbf{X}_n + \mathbf{w}_n = \mathbf{H}\,\mathbf{d}_n + \mathbf{w}_n$

Since $d(n)$ is uncorrelated with unit power ($E\left[\mathbf{d}_n\mathbf{d}_n^H\right] = \mathbf{I}_3$), and the noise is uncorrelated and independent of $y(n)$, we have

$\mathbf{R} = \mathbf{H}\,\mathbf{H}^H + \sigma^2\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$
b.- Given a reference at the output of the first linear system: since the reference has a delay of one sample (it passes through a FIR filter of two coefficients), it makes no sense to ask the system to recover it by anticipation. Also, of the 3 samples of $d(n)$ that participate in vector $\mathbf{X}_n$, the sample that appears more times than the rest is $d(n-1)$; in consequence, it is intuitive that this lag will be the easiest to handle.
c.-

$\mathbf{R} = \mathbf{H}\,\mathbf{H}^H + \sigma^2\mathbf{I}_2 = \begin{pmatrix}1 & a & 0\\ 0 & 1 & a\end{pmatrix}\begin{pmatrix}1 & 0\\ a & 1\\ 0 & a\end{pmatrix} + \sigma^2\mathbf{I}_2 = \begin{pmatrix}1+a^2+\sigma^2 & a\\ a & 1+a^2+\sigma^2\end{pmatrix}$

$\mathbf{R}^{-1} = \dfrac{1}{\left(1+a^2+\sigma^2\right)^2 - a^2}\begin{pmatrix}1+a^2+\sigma^2 & -a\\ -a & 1+a^2+\sigma^2\end{pmatrix}$

Also, the cross-correlation vector with the reference $d(n-1)$ is $\mathbf{P} = \left[a\ \ 1\right]^H$ and the reference power is equal to one; using the expression of the Wiener filter,

$\xi_{min} = 1 - \mathbf{P}^H\mathbf{A} = 1 - \dfrac{a^2\left(a^2+\sigma^2\right) + 1 + \sigma^2}{\left(1+a^2+\sigma^2\right)^2 - a^2}$
d.- The step size is the misadjustment divided by the trace of the data autocorrelation matrix; in consequence:

$\mu = \dfrac{0.1}{2\left(1+a^2+\sigma^2\right)}$

At the same time, since the number of iterations for convergence is

$n_c = -\dfrac{\ln(10)}{\ln\left(1-\mu\,\lambda_{min}\right)}$

small values of the misadjustment imply that the number of iterations for convergence is bounded by:

$n_c \cong \ln(10)\,\dfrac{\lambda_{max}}{\lambda_{min}} < \ln(10)\,\dfrac{\text{Trace}(\mathbf{R})}{\lambda_{min}} = \ln(10)\,\dfrac{2\left(1+a^2+\sigma^2\right)}{1+\sigma^2}$

Note that the number of iterations is the closest integer to the above expression.
e.- Since $z(n) = y(n) + b\,d(n-1) = d(n) + a\,d(n-1) + w(n) + b\,d(n-1)$, or $z(n) = \left(d(n)+w(n)\right) + (a+b)\,d(n-1)$, the error would be:

$e(n) = z(n) - d(n) = w(n) + (a+b)\,d(n-1)$

Clearly, $w(n)$ and $d(n)$ being uncorrelated, the solution that minimizes the power of the error is obtained when the second coefficient is zero, i.e. $b = -a$.

It can be seen that this second solution is a system that perfectly inverts the FIR channel with an IIR filter. This solution requires the value of $a$ to be strictly less than one in magnitude; otherwise the system will be unstable, precluding the detector from obtaining the desired signal at its output. In fact, the major problem of this filter is that it requires the detector to provide a perfectly regenerated reference. This implies that at any time $z(n)$ has to be as close as possible to $d(n)$, i.e. the signal-to-noise ratio has to be high (or moderately high). Under these two circumstances, $|a|$ lower than one and good SNR, the second system performs better than the first. Nevertheless, in general, the IIR solution is usually disregarded in practical systems.
CHAPTER V

V.8.1.-

a) $\mu = \dfrac{0.1}{\text{Trace}(\mathbf{R})} = \dfrac{0.1}{M\,P_x}$, where $M$ is the filter length.

b) $P_x(n) = \beta\,P_x(n-1) + (1-\beta)\,\dfrac{\mathbf{X}_n^H\mathbf{X}_n}{M}$, with $\beta \ge 0.99$, equivalent to averaging more than 100 samples.

c) $n_c = \dfrac{2.3}{-\ln\left(1-\mu\,\lambda_{min}\right)} \approx \dfrac{2.3}{0.1\,\dfrac{\sigma^2}{\text{Trace}(\mathbf{R})}}$

d) Since $M_{quant.} = \text{trace}\left(\mathbf{R}\,\boldsymbol{\Sigma}\right)$ and $\boldsymbol{\Sigma} = \text{diag}\left(\dfrac{A_{max}^2}{3\cdot 2^{2b}}\right)$, then $M_{quant.} = \dfrac{A_{max}^2\,\text{Trace}(\mathbf{R})}{3\cdot 2^{2b}}$
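A minimal sketch of parts a) and b) together, i.e. a power-normalized LMS step size (the filter length, smoothing factor and dummy data are illustrative assumptions):

%-------------------------------------------------------------------------
% Power-normalized LMS step size
M = 16; beta = 0.99; mu_M = 0.1;           % length, smoothing, misadjustment
Px = 1; w = zeros(M,1);
for n = 1:1000
    Xn = randn(M,1); d = randn;            % snapshot and desired sample (dummy)
    Px = beta*Px + (1-beta)*(Xn'*Xn)/M;    % running estimate of input power
    mu = mu_M/(M*Px);                      % step = misadjustment/Trace(R)
    e  = d - w'*Xn;
    w  = w + mu*Xn*e;                      % LMS update
end
%-------------------------------------------------------------------------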
V.8.2.-

a) The eigenvalue spread is given by how eccentric the plots are. Approximately, the length of the major axis with respect to the minor axis in the plot is 0.8.

b) The spreading around the optimum seems to be 5 times larger in the second case than in the first, and the same holds for the convergence rate. In consequence, µ in the second case looks 5 times greater than in the first case.
V.8.3.-

a) $\mathbf{a}^{Q+1} = \begin{pmatrix}\mathbf{a}^Q\\ 0\end{pmatrix} + k_Q\begin{pmatrix}0\\ \mathbf{a}_r^Q\end{pmatrix}$, where "r" indicates reverse order; start with $\mathbf{a}^0 = [1]$.

b) $k_q(n+1) = k_q(n) - \mu\left(\gamma\,e_b^{Q-1}(n-1)\,e_f^Q(n) + (1-\gamma)\,e_f^{Q-1}(n)\,e_b^Q(n)\right)$, the objective being the weighted combination of forward and backward prediction errors.

c) The data used in this gradient lattice are $\begin{pmatrix}e_b^{Q-1}(n)\\ e_f^{Q-1}(n)\end{pmatrix}$, and the corresponding trace is $F^{Q-1} + B^{Q-1}$. Then the step size will be:

$\mu = \dfrac{0.1}{F^{Q-1} + B^{Q-1}}$
d) It is obvious that the step size will grow, since the denominators at each section are the prediction error powers of successive orders which, in general, strictly decrease. They remain the same only when the adequate order of an AR process is surpassed.
V.8.4.-

b) $E\left[\xi_n\right] = \xi_{MIN} + \text{Trace}\left(E\left[\left(\mathbf{A}_n-\mathbf{A}_{opt}\right)^H\mathbf{R}\left(\mathbf{A}_n-\mathbf{A}_{opt}\right)\right]\right) = \xi_{MIN} + \text{Trace}\left(\mathbf{R}\,E\left[\left(\mathbf{A}_n-\mathbf{A}_{opt}\right)\left(\mathbf{A}_n-\mathbf{A}_{opt}\right)^H\right]\right) = \xi_{MIN} + \text{Trace}\left(\mathbf{R}\,E\left[\tilde{\mathbf{A}}\tilde{\mathbf{A}}^H\right]\right) = \xi_{MIN} + \text{Trace}\left(\mathbf{R}\,\boldsymbol{\Sigma}\right)$
c) $\mathbf{A}_{n+1} = \mathbf{A}_n + \mu\,\mathbf{X}_n\left(d(n)-y(n)\right)$

Subtracting the optimum from both sides,

$\tilde{\mathbf{A}}_{n+1} = \tilde{\mathbf{A}}_n + \mu\,\mathbf{X}_n\,\varepsilon(n)$

Computing the covariance,

$\boldsymbol{\Sigma}_{n+1} = \boldsymbol{\Sigma}_n + \mu^2\,E\left[\varepsilon^2(n)\right]E\left[\mathbf{X}_n\mathbf{X}_n^H\right] - 2\mu\,E\left[\mathbf{X}_n\,\varepsilon(n)\,\tilde{\mathbf{A}}_n^H\right]$

where we have assumed that the error is orthogonal to the data, since we are close to convergence (note that we are dealing with misadjustment). Because, being close to convergence, $\varepsilon(n) \approx \mathbf{X}_n^H\tilde{\mathbf{A}}_n + w(n)$, then $E\left[\mathbf{X}_n\,\varepsilon(n)\,\tilde{\mathbf{A}}_n^H\right] \approx \mathbf{R}\,\boldsymbol{\Sigma}_n$ and $E\left[\varepsilon^2(n)\right] \approx \xi_{min}$. At the steady state,

$\boldsymbol{\Sigma} = \dfrac{\mu}{2}\,\xi_{min}\,\mathbf{I}$
d) From the definition of the misadjustment,

$M = \dfrac{E\left[\xi_n\right]-\xi_{min}}{\xi_{min}} = \dfrac{\text{Trace}\left(\mathbf{R}\,\boldsymbol{\Sigma}\right)}{\xi_{min}} = \dfrac{\mu}{2}\,\text{Trace}(\mathbf{R})$

e) Done in the previous exercise.
f) Since $\text{SNR}_{opt} = \dfrac{P_d}{\xi_{min}}$ and $\text{SNR} = \dfrac{P_d}{E\left[\xi_n\right]}$, then

$\text{SNR} = \dfrac{\text{SNR}_{opt}}{E\left[\xi_n\right]/\xi_{min}} = \dfrac{\text{SNR}_{opt}}{M+1}$
V.8.5.-
a.-
{
P = E y n .x(n − 1) } and {
R = E yn.yn
H
}
−1
h = R .P
−−−−−−−−−−−−−−−−−−−−−−−−−−−
$\left( y(n)\ \cdots\ y(n-M+1) \right) = \underline{c}^H X_n \quad ; \quad X_n = \begin{pmatrix} x(n) & x(n-1) & \cdots \\ x(n-1) & x(n-2) & \cdots \end{pmatrix}$
with dimensions $(1{\times}M) = (1{\times}2)(2{\times}M)$, or
$\begin{pmatrix} y(n) \\ \vdots \\ y(n-M+1) \end{pmatrix} = \begin{pmatrix} c(1) & c(2) & 0 & \cdots & 0 & 0 \\ 0 & c(1) & c(2) & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & \cdots & \cdots & c(1) & c(2) \end{pmatrix} \begin{pmatrix} x(n) \\ x(n-1) \\ \vdots \\ x(n-M) \end{pmatrix}$
$\underline{y}_n = C\, \underline{X}_n$
Thus, the filter design equations are:
$P = C\, E\{\underline{X}_n\, x(n-1)\} = C \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} c(2) \\ c(1) \\ 0 \\ \vdots \\ 0 \end{pmatrix}$ (a vector of M components)
$R = C\, R_{xx}\, C^H + N_0 I = C\, C^H + N_0 I$
b.- For M=2 we have:
$\underline{h} = \dfrac{1/\sqrt{2}}{(1+N_0)^2 - 0.25} \begin{pmatrix} 0.5 + N_0 \\ 0.5 + N_0 \end{pmatrix}$
c.- $\xi = P_x - P^H R^{-1} P = P_x - P^H \underline{h} = 1 - \dfrac{1 + 2N_0}{1.5 + 4N_0 + 2N_0^2} = \dfrac{0.5 + 2N_0 + 2N_0^2}{1.5 + 4N_0 + 2N_0^2}$
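A quick numeric check of parts a)-c) (a sketch assuming c(1) = c(2) = 1/√2 and a white unit-power input, which is what the closed forms above imply; N0 = 0.5 is an arbitrary test value):

import numpy as np

# Build the M=2 design equations and compare with the closed-form results.
c1 = c2 = 1 / np.sqrt(2)
N0 = 0.5
C = np.array([[c1, c2, 0.0],
              [0.0, c1, c2]])                 # channel convolution matrix
R = C @ C.T + N0 * np.eye(2)                  # R = C C^H + N0 I
P = C @ np.array([0.0, 1.0, 0.0])             # P = C e_2 = [c(2), c(1)]^T
h = np.linalg.solve(R, P)                     # h = R^{-1} P

xi = 1.0 - P @ h                              # xi = Px - P^H h, with Px = 1
xi_closed = (0.5 + 2*N0 + 2*N0**2) / (1.5 + 4*N0 + 2*N0**2)
print(h, xi, xi_closed)                       # xi matches xi_closed (0.5 here)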
d.- $\underline{h}_n = \underline{h}_{n-1} + \mu\, \varepsilon(n)\, \underline{y}_n$
e.- $\mu \leq \dfrac{1}{\lambda_{max}} \leq \dfrac{2}{Tr(R)} = \dfrac{2}{2N_0 + 2\left( c(1)^2 + c(2)^2 \right)} = \dfrac{1}{N_0 + 1}$
f.- $n_{con} = k_0\, \dfrac{1}{\mu\, \lambda_{min}} \approx \dfrac{N_0 + 1}{N_0}\, k_0 = k_0 \left( 1 + \dfrac{1}{N_0} \right)$
g.- $M = \alpha$ when $\mu = \dfrac{2\alpha}{Tr(R)}$; if $M = 0.1$ then $\alpha = 0.1$, with $k_0$ taken for a 0.01 criterion of convergence.
VI.8.1.-
a.- Since $A^H A = 4I$, then $B = 0.5 \begin{pmatrix} 1 & \sqrt{3} \\ \sqrt{3} & 1 \end{pmatrix}$
b.- $C_x = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$; the eigenvalues and eigenvectors are
$\lambda_1 = 1 + \rho$, $\underline{e}_1 = \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ ; $\lambda_2 = 1 - \rho$, $\underline{e}_2 = \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}$
thus $\Phi = \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
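A one-line check of this eigen-structure (a sketch; note that numpy's eigh returns the eigenvalues in ascending order and may flip eigenvector signs, so the output is the same basis up to ordering and sign):

import numpy as np

rho = 0.7
Cx = np.array([[1.0, rho], [rho, 1.0]])
lam, Phi = np.linalg.eigh(Cx)
print(lam)    # [1 - rho, 1 + rho]
print(Phi)    # columns proportional to (1, -1)/sqrt(2) and (1, 1)/sqrt(2)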
c.-
$\underline{z} = B\,\underline{x}$ ; $\underline{z}_Q = \underline{z} + \underline{\varepsilon} = B\,\underline{x} + \underline{\varepsilon}$ ; $\hat{\underline{x}} = B^H \underline{z}_Q = \underline{x} + B^H \underline{\varepsilon}$
$\underline{\varsigma} = \underline{x} - \hat{\underline{x}} = -B^H \underline{\varepsilon}$ ; $\underline{\varsigma}^H \underline{\varsigma} = \underline{\varepsilon}^H B\, B^H \underline{\varepsilon} = Tr\left( B^H \underline{\varepsilon}\, \underline{\varepsilon}^H B \right)$
Taking the expected value of the last expression we obtain the error power:
$MSE = Tr\left( B^H E\{\underline{\varepsilon}\,\underline{\varepsilon}^H\} B \right) = Tr\left( B^H \mathbf{E}\, B \right) = Tr\left( \mathbf{E}\, B\, B^H \right) = Tr(\mathbf{E})$
Now, $\mathbf{E} = diag(R_z) \begin{pmatrix} 2^{-2k_1} & 0 \\ 0 & 2^{-2k_2} \end{pmatrix}$, since the quantization error of each component is the power of the signal to be quantized, reflected in the diagonal of the acf matrix, divided by two raised to twice the number of bits used.
being $R_z = B\, R_x\, B^H$, and
$\Phi^H R_x \Phi = \begin{pmatrix} 1+\rho & 0 \\ 0 & 1-\rho \end{pmatrix}$
$MSE(KL) = (1+\rho)\, 2^{-2k_1} + (1-\rho)\, 2^{-2k_2}$
When k2=0
$MSE_{KL} = (1+\rho)\, 2^{-2k}$
$MSE_{proposed} = \dfrac{4 + 2\sqrt{3}\,\rho}{4}\, 2^{-2k} = \left( 1 + \dfrac{\sqrt{3}}{2}\rho \right) 2^{-2k}$
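A small sketch evaluating the expressions of parts c)-e) for sample values (the choices of ρ, k and D are illustrative, not from the exercise):

import numpy as np

rho, k, D = 0.9, 4, 1e-3
print("MSE_KL       =", (1 + rho) * 2.0 ** (-2 * k))
print("MSE_proposed =", (1 + np.sqrt(3) / 2 * rho) * 2.0 ** (-2 * k))
print("MSE_direct   =", 2.0 * 2.0 ** (-2 * k))

# Bits (2k) needed to reach a target distortion D with each scheme:
print(np.log2((1 + rho) / D),
      np.log2((4 + 2 * np.sqrt(3) * rho) / (4 * D)),
      np.log2(2 / D))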
d.-
$MSE_{direct} = 2 \cdot 2^{-2k}$
$2k_1^{KL} = \log_2 \dfrac{1+\rho}{D}$
$2k_1^{proposed} = \log_2 \dfrac{4 + 2\sqrt{3}\,\rho}{4D}$
e.-
$2k^{direct} = \log_2 \dfrac{2}{D}$
f.- Obvious.
VI.8.3.-
a) The given expression $X_n = \sum_{r=1}^{Q} \varphi_n(r)\, \underline{u}_{r,n}\, \underline{v}_{r,n}^H$ can be written as $X_n = U_n\, diag(\underline{\varphi}_n)\, V_n^H$,
which indicates that matrices $U_n$ and $V_n$ diagonalize the given matrix directly. For this
reason, for every sub-image it is necessary to transmit both matrices, which implies a
severe waste of channel capacity.
b) In this case $X_n = \sum_{r=1}^{Q} \sum_{s=1}^{Q} \phi_n(r,s)\, \underline{u}_r\, \underline{u}_s^H$, or $X_n = U\, \phi_n\, U^H$, which does not diagonalize
the original matrix; it is just a mere transformation. At the same time, this transform
does not depend on the particular sub-image under processing and can be computed as
an average over an ensemble of sub-images.
In any case, it is important to set additional criteria in order to select a transform in such
a way that the transmission of $\phi_n$ offers more advantages than the direct transmission
of the original image. Otherwise the transform would be useless.
c) The determinant of a positive definite matrix $R_x$, whose trace is fixed since the
elements of the main diagonal are the energies of the components, is always lower than
or equal to the product of its main-diagonal entries (Hadamard's inequality). Hence B is
maximum when the off-diagonal entries of the transformed matrix are zero; in other
words, when the components are uncorrelated.
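A numeric illustration of this argument (a sketch; the test matrix is randomly generated): the determinant is below the product of the diagonal for a correlated matrix, and decorrelating with the eigenvector transform reaches equality while the determinant itself is preserved.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
R = A @ A.T + 4 * np.eye(4)                      # positive definite matrix
print(np.linalg.det(R), np.prod(np.diag(R)))     # det <= prod(diag) (Hadamard)

lam, Phi = np.linalg.eigh(R)
Rz = Phi.T @ R @ Phi                             # decorrelated: Rz ~ diagonal
print(np.linalg.det(Rz), np.prod(np.diag(Rz)))   # now (numerically) equal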
d) Since the power is $P_y = Trace(R_y) = Trace\left( E\{\underline{Y}_n \underline{Y}_n^H\} \right) = Trace\left( U^H R_x U \right)$ = (the trace
has the circular property, i.e. altering the cyclic order does not change it) = $Trace\left( R_x\, U\, U^H \right)$.
In order for this last expression to be equal to $P_x = Trace(R_x)$, it is clear that the
transform matrix must be orthonormal: $U\, U^H = I$.
e) Clearly B is preserved (it does not increase), since the determinant of the resulting or
transformed correlation matrix is just the product of the eigenvalues, which is the same
as the determinant of the original matrix.
f) The DFT is also an orthonormal transform. The difference is that the autocorrelation
matrix of the transformed data is not diagonal. In other words, viewing the transform as
the set of outputs of a filter bank, the DFT filters are not ideal: they overlap (aliasing),
which precludes the decorrelation of their outputs. Only when the size of the signal
tends to infinity do the filters become ideal and their outputs uncorrelated.
Nevertheless, for reasonable sample sizes the DFT may show good decorrelation among
its components.
Any non-linear processing tends to spread the energy in the frequency domain. Since
the DCT implies an implicit even-symmetric extension of the original signal, it provides
better results than the zero-padding associated with the DFT: the energy leakage due to
non-linearities then occurs between frequency regions of similar energy levels.
In addition, the DCT implies only real operations.
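As an illustration of the decorrelation claim (a sketch; the AR(1)-type correlation model, the size N and the value of ρ are assumptions), one can compare the residual off-diagonal energy after transforming a Toeplitz correlation matrix with the orthonormal DFT and DCT:

import numpy as np

N, rho = 16, 0.95
n = np.arange(N)
R = rho ** np.abs(np.subtract.outer(n, n))       # Toeplitz acf of an AR(1)-like process

F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)    # orthonormal DFT
D = np.sqrt(2.0 / N) * np.cos(np.pi * np.outer(n, 2 * n + 1) / (2 * N))
D[0, :] /= np.sqrt(2.0)                          # orthonormal DCT-II

def offdiag_energy(T):
    Rz = T @ R @ T.conj().T                      # correlation of transformed data
    return np.sum(np.abs(Rz) ** 2) - np.sum(np.abs(np.diag(Rz)) ** 2)

print(offdiag_energy(F), offdiag_energy(D))      # DCT leaves less residual correlation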
VI.8.4.-
a.- With a rectangular sampling matrix $U_R = d \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, the repetition matrix in the
frequency plane, obtained from $V_R^T U_R = 2\pi I$, would be $V_R = \dfrac{2\pi}{d} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. With this matrix
the spectrum plane is:
[Figure: spectral replicas in the (f1, f2) plane. Rectangular sampling places the replicas on a square grid of spacing 1/d (signal bandwidth 2B); in the hexagonal case the closest replicas are at distance 2/(d√3).]
b.- Now the closest spectra are located at a distance equal to $\dfrac{2}{d\sqrt{3}}$, and the constraint for
no aliasing is $\dfrac{2}{d\sqrt{3}} \geq 2B$, i.e. $d \leq \dfrac{1}{B\sqrt{3}}$
c.- The hexagonal sampling is better since it allows a longer sampling distance (i.e. the
sampling frequency that still preserves the anti-aliasing condition is lower for hexagonal
than for rectangular sampling).
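A small geometric sketch of this comparison (assuming a standard hexagonal basis with inter-sample distance d; the construction of V from $V^T U = 2\pi I$ follows the text, everything else is illustrative):

import numpy as np

# Nearest-replica distance in the frequency plane for a sampling matrix U:
# the repetition lattice is generated by V = 2*pi*inv(U).T, and we search
# the shortest nonzero lattice vector over small integer combinations.
def nearest_replica(U):
    V = 2 * np.pi * np.linalg.inv(U).T
    combos = [(i, j) for i in range(-2, 3) for j in range(-2, 3) if (i, j) != (0, 0)]
    return min(np.linalg.norm(V @ np.array(c)) for c in combos)

d = 1.0
U_rect = d * np.eye(2)
U_hex = d * np.array([[1.0, 0.5],
                      [0.0, np.sqrt(3) / 2]])    # hexagonal basis, spacing d
print(nearest_replica(U_rect) / (2 * np.pi))     # 1/d
print(nearest_replica(U_hex) / (2 * np.pi))      # 2/(d*sqrt(3)) ~ 1.155/d

Dividing by 2π expresses the distances in cycles, matching the 1/d and 2/(d√3) spacings quoted above; the hexagonal replicas are farther apart for the same d.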
d.- The inverse Fourier transform for the discrete signal is:
$f(\underline{n}) = cte \int_{\underline{w}} F_d(\underline{w})\, \exp\left( j\, \underline{n}^T U_R\, \underline{w} \right) d\underline{w}$
This is a periodic function in the spatial domain and it has to show repetition when
passing from $\underline{n}$ to $\underline{n} + N\underline{l}$. Clearly, to avoid aliasing, matrix N must be diagonal
(rectangular case) with entries no smaller than M, where $M \cdot d > 2D$. Thus
$N = \begin{pmatrix} M & 0 \\ 0 & M \end{pmatrix}$
Finally, the relationship between the sampling matrix and the sampling matrix in the
frequency domain is derived from setting the factor $N^T U_R^T \Phi$ equal to $2\pi$ times the
identity matrix. In consequence:
$\Phi = V_R\, N^{-T}$
And, because $f_d(\underline{n})$ is the basic function repeated without aliasing, we can write the
following expression:
$f(\underline{n}) = cte \sum_{\underline{m}} F_d(\underline{m})\, \exp\left( 2\pi j\, \underline{n}^T N^{-T} \underline{m} \right)$
e.- The reasons for using the DCT instead of the DFT in image processing are all based
on the real and positive character of the signal in the spatial domain. The major
differences between the two transforms are:
1.- The DCT always uses real numbers (no complex quantities).
2.- The DCT has twice the resolution of the DFT for the same data size.
As a consequence of this second property, the periodic repetition is performed over the
even extension of the original signal. This implies that, for any non-linear processing
(such as the quantization included in all standards for image coding), the energy
leakage caused by the non-linearity at the borders of the data segment does not occur
across different borders (left and right in the figure). This property is essential in
reducing distortion in coding and compression procedures.
[Figure: periodicity of the DFT.]