Week 5 - Gaussian Channels (Chapter 9)

Brief Recap of Signals, Bandwidth, PSD
A deterministic signal is a function of time, x(t) or x(n), which can be continuous or discrete in time.
$P_x = \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} x^2(t)\,dt$ and $E_x = \lim_{T \to \infty} \int_{-T}^{T} x^2(t)\,dt$ are the power and energy of a signal.
The Fourier transform of a signal is defined as $X(f) = \int x(t)\, e^{-j2\pi f t}\,dt$.
$X(f) = |X(f)|\, e^{j\angle X(f)}$. $|X(f)|$ is even symmetric for real signals.
We can show that $E_x = \int |X(f)|^2\,df$ (Parseval's theorem); a numerical check appears at the end of this slide.
Most real-life signals have most of their energy (roughly 90-95%) concentrated in a range of frequencies, which is called the bandwidth.
Nyquist sampling theorem: if you wish to sample a signal of bandwidth W and then reconstruct it later, you need to sample it at at least 2W samples/sec.
Random signals (e.g. noise): a random process X(t) is described as a random variable at each time instant t, i.e. $X(t_1), X(t_2), \ldots$ are all random variables.
For an ergodic process¹, power is defined as $P_X = E[X^2(t)]$.
¹ A process is ergodic if its time averages and statistical averages are the same. Here, we assume ergodicity of 2nd order.
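As a quick numerical check of Parseval's theorem from the recap above, here is a minimal sketch; the sampling rate and test signal are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

# Numerical check of Parseval's theorem: sum x^2 * dt  ==  sum |X(f)|^2 * df
fs = 1000.0                       # assumed sampling rate (Hz)
dt = 1.0 / fs
t = np.arange(0, 1.0, dt)
x = np.exp(-5 * t) * np.cos(2 * np.pi * 50 * t)   # an arbitrary finite-energy test signal

E_time = np.sum(x**2) * dt        # energy computed in the time domain

X = np.fft.fft(x) * dt            # Riemann-sum approximation of the Fourier transform
df = fs / len(x)
E_freq = np.sum(np.abs(X)**2) * df   # energy computed in the frequency domain

print(E_time, E_freq)             # the two values agree to numerical precision
```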
PSD, White Noise and Filtering
The autocorrelation of X(t) is defined as $R_X(t_1, t_2) = E[X(t_1) X(t_2)]$.
If the random process is wide-sense stationary (WSS), i.e. $E[X(t)] = \bar{X}$ and $E[X(t)X(t+\tau)] = R_X(\tau)$, we can compute $R_X(\tau) \overset{\mathcal{F}}{\leftrightarrow} S_X(f)$ and hence $P_X = R_X(0) = \int S_X(f)\,df$.
$S_X(f)$ is called the power spectral density (PSD).
We call a random process white if it has the same PSD at all frequencies (think of white light comprising all wavelengths).
Flat PSD $= \frac{N_0}{2} \Leftrightarrow R_X(\tau) = \frac{N_0}{2}\delta(\tau) \Rightarrow$ all samples are uncorrelated. If additionally Gaussian, they are independent, and $\mathrm{Cov}(\mathbf{N}(t)) = \frac{N_0}{2} I$.
In comms, we assume white Gaussian noise with PSD $S_n(f) = N_0/2$.
If we pass this through an LTI filter with impulse response h(t), the output PSD is $S_n(f)|H(f)|^2$.
At the Rx, we pass the signal through filters of bandwidth W (equal to that of the signal), so the associated noise is also filtered (see the sketch at the end of this slide).
The output noise power is the area under the output PSD¹: $\frac{N_0}{2} \cdot 2W = N_0 W$.
¹ 2W because the ideal filter has a symmetric flat spectrum from −W to W.
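A minimal sketch of this last point, with assumed values for $N_0$, the simulation bandwidth, and the receiver bandwidth W: white Gaussian noise is passed through an ideal low-pass filter and the measured output power is compared with $N_0 W$.

```python
import numpy as np

rng = np.random.default_rng(0)
N0 = 2e-3        # assumed noise PSD level (two-sided PSD is N0/2)
fs = 10_000.0    # assumed simulation bandwidth / sampling rate
W = 1_000.0      # assumed receiver filter bandwidth
n = int(1e6)

# White Gaussian noise with two-sided PSD N0/2 over [-fs/2, fs/2]:
# its per-sample variance is (N0/2) * fs.
noise = rng.normal(scale=np.sqrt(N0 / 2 * fs), size=n)

# Ideal low-pass filter of bandwidth W, applied in the frequency domain.
freqs = np.fft.fftfreq(n, d=1 / fs)
mask = np.abs(freqs) <= W
filtered = np.fft.ifft(np.fft.fft(noise) * mask).real

print(np.mean(filtered**2), N0 * W)   # measured power vs. theoretical N0*W = 2.0
```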
Gaussian Channels

Let i represent a time index.
The transmitted codeword is an n-symbol sequence of the form $(x_1, x_2, \ldots, x_n)$.
$Z_i \sim \mathcal{N}(0, N)$; the $Z_i$ are i.i.d.¹
$Y_i = X_i + Z_i$
$Z_i$ is independent of $X_i$.

Power Constraint
At the Tx, we transmit a codeword $(x_1, x_2, \ldots, x_n)$. Each symbol requires some power to be generated and launched into the channel, called the transmit power; this is the strength of the outgoing signal. Since power is limited:
$$\frac{1}{n}\sum_{i=1}^{n} E[x_i^2] \le P$$
¹ The Gaussian assumption for the noise density is due to the Central Limit Theorem (CLT). N is the noise power/variance. The i.i.d. assumption implies $K = N I$, where I is the identity matrix.
Binary Signaling over Gaussian Channels

Gaussian Channel is continuous in nature i.e X , Y , Z ∈ R. However, in


practice, we usually are able to√work with
√ a discrete version of this.
Consider the case when Xi ∈ { P, − P}2 .
Input can be any one of the two levels, so input is discrete.
Z , however is continuous and so is Y .
But, we need to decode X̂ from Y by passing it through decoding
function which must be a discrete o/p.
Hence, this is a discrete channel, with a continuous transition matrix
What should be the capacity of this (given we have discretized the i/p
and o/p)?
What should be the optimum decoding scheme?
What is the error probability and how does it behave with P?

2
Communication engineers are hopefully
√ familiar
√ with this! It implies binary signaling
(BPSK) with constellation points at + P and − P
5 / 13
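As a rough illustration of the last two questions, here is a minimal simulation sketch; the values of P and N are assumptions, and the sign detector plays the role of the decoding function. The empirical error rate is compared with the Gaussian tail value $Q(\sqrt{P/N})$.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(1)
P, N = 1.0, 0.25                     # assumed transmit power and noise variance
n = 200_000

bits = rng.integers(0, 2, n)
x = np.sqrt(P) * (2 * bits - 1)      # map {0,1} -> {-sqrt(P), +sqrt(P)}  (BPSK)
y = x + rng.normal(scale=np.sqrt(N), size=n)

bits_hat = (y > 0).astype(int)       # decode by sign (optimal for equiprobable inputs)
p_err_sim = np.mean(bits_hat != bits)
p_err_theory = 0.5 * erfc(sqrt(P / N) / sqrt(2))   # Q(sqrt(P/N)) ~ 0.023 here

print(p_err_sim, p_err_theory)       # error probability falls rapidly as P/N grows
```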
Channel Capacity of the Gaussian Channel¹
We consider the input to be a scalar random variable X with an arbitrary distribution f(x). $Z \sim \mathcal{N}(0, N)$ is also scalar, and $Y = X + Z$.
$$C = \max_{f(x):\, E[X^2] \le P} I(X; Y)$$
$$I(X; Y) = h(Y) - h(Y|X) = h(Y) - h(X + Z \mid X) = h(Y) - h(Z \mid X) = h(Y) - h(Z) \;[\because Z \perp\!\!\!\perp X] = h(Y) - \tfrac{1}{2}\log 2\pi e N$$
$E[Y^2] = E[(X+Z)^2] = E[X^2] + 2E[X]E[Z] + E[Z^2] = P + N$ (the cross term vanishes since $E[Z] = 0$). Since Y has a fixed, finite variance, h(Y) is maximized if $Y \sim \mathcal{N}(0, P+N)$. So,
$$I(X; Y) \le \tfrac{1}{2}\log 2\pi e (P + N) - \tfrac{1}{2}\log 2\pi e N = \tfrac{1}{2}\log\left(1 + \frac{P}{N}\right)$$
$Y \sim \mathcal{N}(0, P+N) \Leftrightarrow X \sim \mathcal{N}(0, P)$, and such a capacity is achievable.
Hence we have $C = \frac{1}{2}\log_2\left(1 + \frac{P}{N}\right)$ bits/transmission.
¹ There is a nice sphere-packing argument showing why this expression is plausibly an upper bound on capacity, which was covered in class. Please read it from the book.
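As a worked numerical example (with assumed values): if the received SNR is $P/N = 15$, then $C = \frac{1}{2}\log_2(1 + 15) = 2$ bits per transmission; doubling the power to $P/N = 30$ only raises this to $\frac{1}{2}\log_2 31 \approx 2.48$ bits, illustrating the logarithmic growth of capacity with power.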
Bits/Transmission → Bits/sec and Shannon Limit
From the Nyquist sampling theorem, we need 2W samples in 1 sec ⇒ 2WT samples in [0, T].
In 1 transmission, we need to send at least one sample (half a sample does not mean anything). So, we compute the equivalent signal and noise power of a sample.
Power per sample $= \frac{\text{total energy spent in } [0,T]}{\text{no. of samples sent in } [0,T]} = \frac{PT}{2WT} = \frac{P}{2W}$
Noise variance per sample¹: $\frac{\left(\frac{N_0}{2}\right)(2W)\,T}{2WT} = \frac{N_0}{2}$
C in bits per sample $= \frac{1}{2}\log\left(1 + \frac{P/2W}{N_0/2}\right) = \frac{1}{2}\log\left(1 + \frac{P}{N_0 W}\right)$
C in bits per second $= (2W)\,\frac{1}{2}\log\left(1 + \frac{P}{N_0 W}\right) = W \log(1 + \mathrm{SNR})$
$$\lim_{W \to \infty} W \log_2\left(1 + \frac{P}{N_0 W}\right) = \log_2 e \cdot \lim_{W \to \infty} W \ln\left(1 + \frac{P}{N_0 W}\right) = \frac{P}{N_0}\log_2 e \ \text{bits/sec}$$
There is a limit to capacity (for infinite bandwidth) called the Shannon limit, and it grows linearly with transmit power (see the sketch below).
¹ Why?
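A minimal sketch of this limit, assuming P = 1 W and $N_0 = 10^{-2}$ W/Hz purely for illustration: as W grows, $W\log_2(1 + P/(N_0 W))$ approaches $(P/N_0)\log_2 e \approx 144.3$ bit/s.

```python
import numpy as np

P, N0 = 1.0, 1e-2                    # assumed transmit power (W) and noise PSD (W/Hz)
for W in [1e2, 1e3, 1e4, 1e6]:
    C = W * np.log2(1 + P / (N0 * W))         # capacity in bit/s at bandwidth W
    print(f"W = {W:>9.0f} Hz  ->  C = {C:8.2f} bit/s")

print("Shannon limit:", (P / N0) * np.log2(np.e), "bit/s")   # ~144.27 bit/s
```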
Channel Capacity for Parallel Channels

In comms, we multiplex signals over the same channel using modulation.
In modulation, we translate (shift) the signals in frequency so that they are non-overlapping; then we can transmit them together.

Consider k parallel channels, each corresponding to a different frequency.
Each of them is flat and independent of the others.
$Y_j = X_j + Z_j$, $j = 1, 2, \ldots, k$, with $Z_j \sim \mathcal{N}(0, N_j)$ and $Z_i \perp\!\!\!\perp Z_j$ $(\forall i \ne j)$.
Common power constraint: $\sum_j E[X_j^2] \le P$.
$$C = \max_{f(\mathbf{x}):\, E[\|X\|^2] \le P} I(X_1, \ldots, X_k; Y_1, \ldots, Y_k)$$
Capacity for Parallel Channels
$$I(X_1, \ldots, X_k; Y_1, \ldots, Y_k) = h(Y_1, \ldots, Y_k) - \sum_i h(Z_i) \le \sum_i h(Y_i) - \sum_i h(Z_i) \le \sum_i \frac{1}{2}\log\left(1 + \frac{P_i}{N_i}\right)$$
where $P_i = E[X_i^2]$; the power constraint implies $\sum_i P_i = P$, and equality is achieved for $(X_1, \ldots, X_k) \sim \mathcal{N}(0, \mathrm{diag}(P_1, \ldots, P_k))$.
Hence, we would like to compute the $P_i$ by solving:
$$\max_{P_i} \sum_i \ln\left(1 + \frac{P_i}{N_i}\right) \quad \text{s.t.} \quad \sum_i P_i \le P, \qquad P_i \ge 0$$
Power Allocation for Parallel Channels

The objective function is concave in P = [P1 . . . Pk ]T while the constraints


are convex and hence this is a convex optimization problem. The
allocation Pi = 0 is always a feasible point for the problem (Slater’s
condition is satisfied). So. the KKT conditions1 are necessary and
sufficient for the global maxima.
Using Lagrangian:P Pi P P
L(P, λ, µ) = i ln (1 + N i
) + λ( i Pi − P) − i µi Pi
∂L 1 1 1
∂Pi = Pi . N + λ − µi = 0 ⇒ µi = λ − P +N
1+ N i i i
i
By the slackness conditions, if µi = 0 ⇒ Pi > 0 and then, Pi = λ1 − Ni .
For µi > 0, Pi = 0 ⇒ λ > N1i ⇒ λ1 < Ni ⇒ Pi = [ λ1 − Ni ]+
For λ > 0, we should have i Pi = P and if λ = 0, i Pi < P. Let λ1 = ν
P P

The power allocation


P strategy is: Pi = [ν − Ni ]+ where ν can be
+
calculated using i [ν − Ni ] = P
1
Please check the material shared on optimization if you are not familiar with these.
10 / 13
Water-filling Algorithm

We have the noise levels for each channel.


Compute ν using i [ν − Ni ]+ = P 1
P

Now the algorithm is like filling water with respect to a maximum


limit on the water-level as shown in figure.
Outcome: Allocate more power to channel with less noise.
1
Note that there are multiple possibilities but for any choice of ν > 0, it is optimal!
11 / 13
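A minimal water-filling sketch, assuming example noise levels N = [1, 4, 9] and total power P = 6 (arbitrary values): ν is found by bisection so that $\sum_i [\nu - N_i]^+ = P$.

```python
import numpy as np

def water_filling(noise, P, tol=1e-9):
    """Return P_i = max(nu - N_i, 0) with sum_i P_i = P, finding nu by bisection."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.min() + P        # the water level nu lies in this interval
    while hi - lo > tol:
        nu = (lo + hi) / 2
        if np.sum(np.maximum(nu - noise, 0.0)) > P:
            hi = nu
        else:
            lo = nu
    return np.maximum((lo + hi) / 2 - noise, 0.0)

N = [1.0, 4.0, 9.0]                              # assumed per-channel noise levels
P_alloc = water_filling(N, P=6.0)
print(P_alloc)                                   # ~[4.5, 1.5, 0.0]: the noisiest channel gets nothing
print(0.5 * np.log2(1 + P_alloc / np.array(N)).sum())   # resulting capacity in bits/transmission
```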
Colored Noise/Correlated Channel

Previous case: Considered each channel is independent. Now we take the


more general case with dependent noise.
Let KZ be the covariance matrix of the noise
Let KX be input covariance matrix
Power constraint: n1 i E [Xi2 ] ≤ P = 1
P
n tr (KX ) ≤ P
h(Y ) is maximized for covariance matrix KX + KZ . So, we maximize
h(Y ) = 21 log (2πe)n (|KX + KZ |) subject to tr (KX ) ≤ nP
Since log is a monotonic function, we can equivalently maximize just
|KX + KZ |1
Now, we will use Eigen-value decomposition (EVD) of KZ . As this is a
covariance matrix, it is p.s.d and we can decompose this as KZ = QΛQ T
where Λ is a diagonal matrix of positive eigen values of KZ . Q is an
unitary matrix (QQ T = I ) where each column is an eigen vector of KZ .

1
why?
12 / 13
Power Allocation for Colored/Correlated Channel
$$|K_X + K_Z| = |K_X + Q\Lambda Q^T| = |QQ^T K_X QQ^T + Q\Lambda Q^T| \quad [\because QQ^T = I]$$
$$= |Q(Q^T K_X Q Q^T + \Lambda Q^T)| = |Q(Q^T K_X Q + \Lambda)Q^T| = |Q|\,|Q^T K_X Q + \Lambda|\,|Q^T| = |Q^T K_X Q + \Lambda| = |A + \Lambda|$$
where $A = Q^T K_X Q$ and $\mathrm{tr}(A) = \mathrm{tr}(Q^T K_X Q) = \mathrm{tr}(QQ^T K_X) = \mathrm{tr}(K_X)$. Hence, the problem is the same as $\max |A + \Lambda|$ s.t. $\mathrm{tr}(A) \le nP$. By Hadamard's inequality, $|A + \Lambda| \le \prod_{i=1}^{n} (A_{ii} + \lambda_i)$, with equality iff A is diagonal. Since A is subject to the trace constraint $\frac{1}{n}\sum_i A_{ii} \le P$ and $A_{ii} \ge 0$, by the A.M.-G.M. inequality $\prod_i (A_{ii} + \lambda_i)$ attains its maximum when $A_{ii} + \lambda_i = \nu$. However, it is not guaranteed that $A_{ii} \ge 0$ for this choice, and hence the solution¹ is given as $A_{ii} = (\nu - \lambda_i)^+$, where ν is chosen so that $\sum_i A_{ii} = nP$.
Whitening Filter
What we have effectively done is convert the colored noise into white noise using the EVD, which acts as a whitening filter. The filtering operation is a rotation of the axes using the eigenvectors of the noise covariance, with levels decided by the eigenvalues, thus de-correlating the noise components; a short sketch of the full procedure follows.
¹ We can do this using the KKT conditions as well.
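A minimal sketch of the whole procedure, assuming a small 2×2 noise covariance chosen only for illustration: eigendecompose $K_Z$, water-fill over its eigenvalues to get the diagonal of A, and rotate back to obtain the capacity-achieving input covariance $K_X = Q A Q^T$.

```python
import numpy as np

def colored_water_filling(K_Z, P, iters=200):
    """Water-filling for colored Gaussian noise: returns K_X with tr(K_X) = n*P."""
    lam, Q = np.linalg.eigh(K_Z)                 # K_Z = Q diag(lam) Q^T (eigendecomposition)
    n = len(lam)
    lo, hi = lam.min(), lam.min() + n * P        # bisection interval for the water level nu
    for _ in range(iters):
        nu = (lo + hi) / 2
        if np.sum(np.maximum(nu - lam, 0.0)) > n * P:
            hi = nu
        else:
            lo = nu
    A = np.maximum(nu - lam, 0.0)                # A_ii = (nu - lambda_i)^+, A diagonal
    return Q @ np.diag(A) @ Q.T                  # rotate back: K_X = Q A Q^T

K_Z = np.array([[2.0, 0.9],
                [0.9, 1.0]])                     # assumed correlated noise covariance
K_X = colored_water_filling(K_Z, P=1.0)
print(np.trace(K_X))                             # ~ n*P = 2.0, satisfying the power constraint
```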
