
Indian Institute of Technology Bombay

Department of Electrical Engineering

Handout 2 EE 703 Digital Message Transmission


Lecture Notes 1, Aug 01, 2017

1 Introduction
You have possibly learned something about signals, done some implementation, and had
some mathematics in your undergraduate days. This course essentially explains how they
come together in equal proportions in modern digital communication systems.
While there is a lot to be written about the evolution of digital communication, we will
only make a few brief notes here. Pioneering works by Maxwell, Hertz, Bell and others made
long distance communication a reality; the synergy of theory and practice was crucial
to this evolution. It was Marconi's turn, at the dawn of the past century, to push communication
out of its wired confines. Unlike the analog telephone, Marconi's radio was
about digital communication, transmitting Morse code through wireless telegraphic
links. The less written part of the story is that Marconi's efforts in commercializing communication
techniques led to unprecedented developments, whose footprints still remain in the
modern digital era. In particular, the development of diodes, triodes, transistors, ICs etc.
were all defining points in the race for communication superiority. History repeats itself!
Look around you: all sorts of communication gadgets spring to life, from tiny RFID tags
to sophisticated smartphones, all striving to keep abreast of the latest communication
demands. Furthermore, at the time of this writing, communication is driving the computational
market too, from GPUs to cloud computers. It will be interesting at this point to
read Chapter 1 of Wozencraft and Jacobs; it was written 50 years ago!

[Photographs: R. V. L. Hartley, H. Nyquist, C. E. Shannon]

While Marconi is known as the father of radio, that attribute for digital communication
unequivocally goes to Claude Shannon. Among the most influential figures of the twentieth
century, Shannon went several steps beyond the initial stones laid by Hartley and
Nyquist, and the resulting theory had far reaching consequences. Unlike the other big
development of the twentieth century, which went nuclear, the communication revolution went
ballistic. Above all, this was a giant leap in making communication theory
a mathematical discipline, facilitating analysis and computation even before actual
systems are practically rolled out. That we could be sure of communicating with a Mars
shuttle, modulo mishaps, even before launching a test run is a testament to the resounding
nature of communication theory. Following Shannon's footsteps, in order to study or
devise a communication technique for an environment, the following course of action is
advocated.
• abstract the practical constraints into meaningful mathematical models for the signals
and systems.
• solve the mathematical objective to obtain optimal communication schemes.
• translate the optimal schemes into practical design guidelines, which can be realized
by existing apparatus.
• test the performance in operating environments, and refine the design and implementation.
Most communication systems that you see around have gone through such an evolutionary
path, only to be named revolutionary technologies later.

2 Some Notations
Let us list some notations that we will try to stick to throughout the course. In particular,
random variables will be denoted by capital letters, and their realizations will be written
in lower case. For example, a random variable Y takes values y in the set 𝒴, where 𝒴 is
the sample space of Y.
Unless otherwise specified, a vector will be considered as a column vector. Thus, a
vector u is an n × 1 vector, and u^T will be written as $\vec{u}$. In communication theory, we
often have to deal with complex numbers, so we will define a dot product for vectors over the
complex field. For n-dimensional column vectors u and v, we define the dot product as

$$u \cdot v := u^{\dagger} v := \sum_{i=1}^{n} u_i^* v_i := (u^*)^T v := \langle u, v \rangle := \langle \vec{u}, \vec{v} \rangle,$$

where a^* is the complex conjugate of a, and u^{\dagger} stands for (u^*)^T, which is the same as $\vec{u}^{\,*}$.
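As a quick numerical sketch (Python with numpy, used purely for illustration), note that numpy's vdot implements exactly this convention, conjugating its first argument:

```python
import numpy as np

# Check the convention u . v = u† v = sum of conj(u_i) * v_i.
u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1j])

manual = np.sum(np.conj(u) * v)   # sum of conj(u_i) * v_i
builtin = np.vdot(u, v)           # u† v; vdot conjugates its first argument

assert np.isclose(manual, builtin)
print(manual)  # (-1-2j)
```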


A collection of symbols x_1, ⋯, x_n will be denoted by x^n. Generalizing this, Y^n denotes
a random vector which can take the realization y^n ∈ 𝒴^n.

3 Digital Communication System


Let us demonstrate the first step in our design of a communication scheme. Fortunately, we
can introduce a very generic model here, without even talking about any specific technology,
see Figure 1.

W → Encoder → X_1, ⋯, X_n → Medium → Y_1, ⋯, Y_n → Decoder → Ŵ

Figure 1: A Communication System

In the above figure, W is a random variable which denotes a set of information symbols
(usually bits), known as a message. The sample space of W can be taken as the index set
{1, ⋯, M}, where M is some finite number¹. It is usual to assume that the messages are
uniformly distributed, i.e.

$$P(W = i) = \frac{1}{M}, \quad 1 \leq i \leq M.$$

¹The sigma field is not explicitly specified whenever it is understood from the context; it is the power set in this example.

The encoder translates the message W to an n-dimensional vector X_1, ⋯, X_n in an appropriate
field. We will denote the vector X_1, ⋯, X_n by X_1^n, or sometimes X^n, in this course. In
particular, the X_i ∈ 𝒳 are symbols which are suitable for transmission over the given medium.
Another way to visualize X^n is to consider it as samples of an underlying continuous-time
waveform sent over the medium, typically voltage/current waveforms in the baseband
circuitry. We will say more on this view later. A vector of values Y^n := Y_1, ⋯, Y_n is
received, and the decoder declares Ŵ ∈ {∅, 1, ⋯, M} as the estimate of the transmitted
message. This is a high-level view of the communication systems considered in this course.

3.1 Encoder
The encoder can be described by an M × n matrix, where the rows are indexed by the
messages in {1, ⋯, M}. This matrix is also known as a codebook, and each row is called
a codeword. Thus, each message is mapped to an n-dimensional codeword, as
demonstrated in Figure 2.

W = 1 → x_{11}, x_{12}, ⋯, x_{1n}
W = 2 → x_{21}, x_{22}, ⋯, x_{2n}
⋮
W = M → x_{M1}, x_{M2}, ⋯, x_{Mn}

Figure 2: Encoder Mapping

Often, the j-th codeword x_{j1}, ⋯, x_{jn} will be conveniently written as x^n(j).
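To make the codebook picture concrete, here is a minimal sketch with an assumed toy repetition code (M = 2, n = 3, chosen only for illustration): encoding is just reading out a row of the codebook matrix.

```python
import numpy as np

# The codebook is an M x n matrix; row j holds the codeword x^n(j).
M, n = 2, 3
codebook = np.array([
    [0, 0, 0],   # codeword x^n(1)
    [1, 1, 1],   # codeword x^n(2)
])

def encode(w: int) -> np.ndarray:
    """Map message w in {1, ..., M} to its n-dimensional codeword x^n(w)."""
    return codebook[w - 1]

print(encode(2))  # [1 1 1]
```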

3.2 Decoder
The observed output vector Y^n ∈ 𝒴^n is also random. The randomness here is induced by
two sources. Since W is random, so is the transmitted codeword X^n(W), and hence the resulting
output Y^n. Moreover, even when M = 1, the output can be random due to the randomness
induced by the medium. The decoder is a surjective mapping Ŵ(Y^n) from the space
𝒴^n to {∅, 1, ⋯, M}. The first element asserts the receiver's freedom to say 'I do not know',
or to declare an erasure. Nevertheless, in most cases we demand that the decoder declare one
of the messages.

[Decision regions D_1, D_2, ⋯, D_M]

Figure 3: Decoder is a partition of 𝒴^n

In Figure 3, the region D_i represents the set of y^n ∈ 𝒴^n which are mapped to message i
by the decoder, i.e.

$$D_i = \{y^n \in \mathcal{Y}^n \mid \hat{W}(y^n) = i\}.$$

More generally, a decoding rule is a partition of the space 𝒴^n into M disjoint subsets.
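As a sketch of one such partition (an illustrative choice, not the only possible rule), the decoder below maps each y^n to the nearest codeword of the toy code above in Hamming distance, declaring an erasure on ties:

```python
import numpy as np

# Each y^n falls in the region D_i of its closest codeword; ties map to the
# erasure symbol (None), i.e. the decoder says 'I do not know'.
codebook = np.array([[0, 0, 0], [1, 1, 1]])

def decode(y: np.ndarray):
    """Return the estimate in {1, ..., M}, or None for an erasure."""
    dists = np.sum(codebook != y, axis=1)           # Hamming distances
    winners = np.flatnonzero(dists == dists.min())
    return None if len(winners) > 1 else int(winners[0]) + 1

print(decode(np.array([1, 0, 1])))  # 2, i.e. this y^n lies in D_2
```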
We have left out the medium in the above description; modelling it appropriately is
paramount to the success of our design. Down the line, we will treat this more rigorously.
In fact, the system model for the medium can be very much application dependent. As an
exercise, can you look around and identify three different communication systems that are
used in our day-to-day life? Though long haul optical cables and high speed LAN cables
still adorn our offices and infrastructure, the ubiquitous spread of mobile devices has made
digital communication almost synonymous with wireless communication.

3.3 Modelling the Medium


In spite of having many practical examples around us, we are going to introduce an abstract
representation of a medium, and a peculiar one at that. In fact, the simple representation
shown in Figure 4 may look almost trivial at first sight.

X → p(y|x) → Y

Figure 4: A Memoryless Channel

The above representation is that of a memoryless channel, where the input symbol x ∈ 𝒳
is mapped to one of the output symbols y ∈ 𝒴 with probability p(y|x). Thus, the medium
is specified by a collection of probability laws p(y|x), one for each x ∈ 𝒳. Notice that we
did not insist on X or Y being scalars, or even complex valued. This gives us enough flexibility
as a generic model suitable for communication theoretic analysis. If you are not familiar
with probability, don't worry: we can make the representation even simpler and depict the
communication medium literally as a pipe from X to Y, as in Figure 5.
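For readers who like to simulate, here is a minimal sketch of one concrete collection of laws p(y|x): the binary symmetric channel, with an assumed crossover probability eps (the parameter name is ours).

```python
import numpy as np

# Memoryless channel: each input bit is flipped independently w.p. eps.
rng = np.random.default_rng(0)
eps = 0.1   # assumed crossover probability p(y != x | x)

def bsc(x: np.ndarray) -> np.ndarray:
    """Pass the bit vector x^n through the channel, one symbol at a time."""
    flips = (rng.random(x.shape) < eps).astype(x.dtype)
    return np.bitwise_xor(x, flips)

print(bsc(np.array([1, 1, 1, 0, 0, 0])))  # e.g. [1 0 1 0 0 0]
```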

X → Y

Figure 5: A point to point link

Such a representation fits systems where the output takes the form Y = f(X, Z), where
Z is some randomness (often noise) independent of the transmissions. The function f(⋅, ⋅)
captures the effect of the transmitted symbol at the receiver. In wireless and wireline
systems, it often makes sense to assume that f(⋅, ⋅) is a linear function of its arguments.
Let us now demonstrate the components of a communication system using an example,
albeit a futuristic one, i.e. something that is still in the development phase. The idea of
interference cancellation is among the latest developments in digital communication.

4 Interference Cancellation
Interference is something we all worry about. While it is unclear how to model it, in one
sentence the question is: can we have several simultaneous communications over a shared medium? By
appealing to physical laws, we know that simultaneous transmissions will cause superposition
of the EM waves at the receiving terminal. Since we are yet to introduce the details of
the medium, our immediate approach is to model the superposition of transmissions by an
appropriate graphical representation, which can be visualized as an extension of Figure 5
to several communication links.

X_1 → H_1 → Y_1,  X_2 → H_2 → Y_2 (with cross links between the two pairs)

Figure 6: Cellular Users and Interference Graph

Imagine a cellular infrastructure with a frequency reuse factor of unity. Thus, the
transmitter-receiver pairs in neighboring cells may interfere with each other, as depicted
by the graph in Figure 6. Transmitter i ∈ {1, 2} emits the scalar symbols X_i ∈ ℂ. In
our notation, the transmitted vector for n consecutive transmission instants from user i is
X_{i1}, X_{i2}, ⋯, X_{in} := X_{i1}^n. The receiver observes

$$Y_{1i} = f_1(X_{1i}, X_{2i}) \quad \text{and} \quad Y_{2i} = f_2(X_{1i}, X_{2i}), \quad 1 \leq i \leq n, \tag{1}$$

where the functions f_1(⋅, ⋅) and f_2(⋅, ⋅) model appropriate (linear) superpositions. For example,
we use the popular choice f_k(x, y) = α_{1k} x + α_{2k} y, k = 1, 2, in our illustrations below.
The coefficients α_{ik} are called fading coefficients, and are usually assumed to be complex scalars.
Notice that we excluded any additive noise in (1); this is purely to illustrate the concept
of interference management, and we can add noise later in our discussions. Also, let us somewhat
naively assume that interference results in collision and data loss. How do we communicate
in this situation? Sounds familiar! Imagine so many people talking with their respective
counterparts across a conference table, or in a crowded party room. Such interference is also
the subject of user management within a cell. In such situations, the good old TDMA
(used in GSM) and the more recent CDMA come to our rescue. For instructive purposes,
let us go through some abstract details.

4.0.1 TDMA
The essential idea here is to orthogonalize the communication resources in time, i.e. the
users take turns in transmitting. Suppose d_{ij}, j ≥ 1, is the data available at user i. In a
simple model where the two users are given alternate transmission instants, user 1 transmits
{X_{1j}, j ≥ 1} = {d_{11}, 0, d_{12}, 0, ⋯}, while user 2 sends {X_{2j}, j ≥ 1} = {0, d_{21}, 0, d_{22}, ⋯}, where the zero
symbol stands for no transmission. For simplicity, assume that the data symbols are real,
though our discussion easily extends to complex values.
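A small sketch of this alternating schedule, with assumed data values, may help:

```python
import numpy as np

# User i's data is interleaved with zero symbols (silence) in the slots
# assigned to the other user.
d1 = np.array([1.0, 2.0, 3.0])   # assumed data d_{1j} for user 1
d2 = np.array([4.0, 5.0, 6.0])   # assumed data d_{2j} for user 2

x1 = np.zeros(2 * len(d1)); x1[0::2] = d1   # {d11, 0, d12, 0, ...}
x2 = np.zeros(2 * len(d2)); x2[1::2] = d2   # {0, d21, 0, d22, ...}
print(x1)  # [1. 0. 2. 0. 3. 0.]
print(x2)  # [0. 4. 0. 5. 0. 6.]
```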
Observe that in order to send the data symbol d_{1i}, user 1 multiplies the symbol d_{1i} by
the row vector $\vec{u}_1 = [1, 0]$, and schedules the resulting vector for the next two transmissions.
In particular, we can write

$$[X_{11} \; X_{12}] = d_{11} \vec{u}_1 \quad \text{and} \quad [X_{21} \; X_{22}] = d_{21} \vec{u}_2,$$

where $\vec{u}_2 = [0, 1]$. Thus, the vector $\vec{u}_1$ converts a single data symbol into a beam (vector) of
values; hence it is also known as a beamformer. Similarly, $\vec{u}_2$ is the beamformer for user 2.
The received values at the two users over the two instants are

$$[Y_{11} \; Y_{12}] = \alpha_{11} d_{11} \vec{u}_1 + \alpha_{21} d_{21} \vec{u}_2$$
$$[Y_{21} \; Y_{22}] = \alpha_{12} d_{11} \vec{u}_1 + \alpha_{22} d_{21} \vec{u}_2.$$

In order to get d_{11} back at user 1, we first combine the elements of Y_{11} and Y_{12}. This can
be achieved by taking the dot product with an appropriate weight vector $\vec{v} = [v_{11} \; v_{12}]$. Thus,

$$[Y_{11}^* \; Y_{12}^*] \begin{bmatrix} v_{11} \\ v_{12} \end{bmatrix} = v_{11} Y_{11}^* + v_{12} Y_{12}^*.$$

The vector $\vec{v}$ is also called a combiner. It is called a zero-forcer if all data other
than the intended one is cancelled at the output of the combiner. It is easy to see that the
choice $\vec{u}_1 = \vec{v}_1$ and $\vec{u}_2 = \vec{v}_2$ is sufficient for zero-forcing. In other words,

$$U = \begin{bmatrix} \vec{u}_1 \\ \vec{u}_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad V = \begin{bmatrix} \vec{v}_1 \\ \vec{v}_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{2}$$

will guarantee that the data symbols are transmitted without interference from the other
user.
In general, we call the matrix U with rows $\vec{u}_i$, 1 ≤ i ≤ n, a beamforming matrix.
Similarly, V with rows $\vec{v}_i$, 1 ≤ i ≤ n, is called the zero-forcing matrix. To illustrate, consider
n users operating in TDMA mode. We can collect n symbols at the output of receiver k
as a vector $\vec{y}_k$, given by

$$\vec{y}_k = \sum_{i=1}^{n} \alpha_{ik} d_i \vec{u}_i. \tag{3}$$

With U = V = I_n, we can have interference free operation, since $\langle \vec{y}_k, \vec{v}_k \rangle = \alpha_{kk}^* d_k^*$ (the
conjugates appear because our dot product conjugates its first argument).

Exercise 1. Verify that $\langle \vec{y}_k, \vec{v}_k \rangle = \alpha_{kk}^* d_k^*$ when U = V = I.
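A numerical sketch of Exercise 1 follows; the fading coefficients and data symbols are arbitrary assumed values.

```python
import numpy as np

# With U = V = I in the noiseless model (3), the combiner at receiver k
# returns alpha_kk* d_k*, free of interference from the other users.
rng = np.random.default_rng(1)
n = 4
alpha = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # alpha[i, k]
d = rng.normal(size=n) + 1j * rng.normal(size=n)                # data symbols
U = V = np.eye(n)

for k in range(n):
    y_k = sum(alpha[i, k] * d[i] * U[i] for i in range(n))  # eq. (3)
    out = np.vdot(y_k, V[k])                                # <y_k, v_k>
    assert np.isclose(out, np.conj(alpha[k, k] * d[k]))     # = alpha_kk* d_k*
print("Exercise 1 verified")
```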

4.1 CDMA
In the beamforming and zero-forcing employed in the previous section, the key property
ensuring interference free transmission is that

$$\langle \vec{u}_k, \vec{v}_i \rangle = \delta_{k,i}.$$

Generalizing this, we can pick any orthonormal n × n matrix as U, and then take V = U.
Recall that $\vec{u}_i$ (the i-th row of U) is the beamforming vector at transmitter i, and $\vec{v}_j$ is the
zero-forcer at receiver j. From (3), we have

$$\langle \vec{y}_j, \vec{v}_j \rangle = \sum_{i=1}^{n} \alpha_{ij}^* d_i^* \langle \vec{u}_i, \vec{v}_j \rangle = \alpha_{jj}^* d_j^*.$$

Notice that any unitary U is good enough, and the choice U = I does indeed give TDMA.
A popular technique which designs an orthogonal U using values from the set {−1, +1} is
known as code division multiple access, or plain CDMA.
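As a sketch of such a ±1 design, the Sylvester construction below builds a Hadamard matrix; normalizing by √n (our choice, for tidiness) gives an orthonormal U with V = U:

```python
import numpy as np

# Sylvester construction: H_{2m} = [[H_m, H_m], [H_m, -H_m]], entries in {-1, +1}.
H = np.array([[1]])
for _ in range(2):
    H = np.block([[H, H], [H, -H]])

n = H.shape[0]                 # n = 4
U = H / np.sqrt(n)             # orthonormal rows; take V = U
assert np.allclose(U @ U.T, np.eye(n))   # <u_k, v_i> = delta_{k,i}
print(H)
```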

4.2 Interference Alignment


Techniques like TDMA/CDMA take a pessimistic view of the network, in the sense that
each transmission is expected to create interference at all other receivers. Orthogonalizing
n users requires n transmissions in such cases, yielding a transmission efficiency of 1 symbol
per user in every n transmissions. However, in practice, the network topology is often
not so fully connected as to warrant such an extremely pessimistic treatment. For example,
consider the interference topology depicted in Figure 7.

[Five transmitter-receiver pairs 1-5; dashed lines mark intended links, solid lines the interference]

Figure 7: Interference Graph

In the interference graph shown, there are five transmitter-receiver pairs participating in
communication. Transmitter i intends to communicate with receiver i, shown by the dashed
line. The solid lines represent the additive interference structure. Thus, user 1 causes
additive disturbance at receivers 3 as well as 4. The link coefficients α_{ij} are taken to be
identically unity for simplicity of exposition.
While TDMA/CDMA will achieve an efficiency of 1/5 data symbol per transmission,
better efficiencies are feasible in the above network. For example, users 1 and 2 can transmit
simultaneously without interference. However, what more can be done is slightly unclear
at this point. Let us build a more formal mechanism to analyze this model. To this end,
let U be an n × m beamforming matrix; the beamformer at transmitter i is then $\vec{u}_i$, which
is m-dimensional. Our intention is to get m as small as possible. Collecting m samples
as a vector at receiver i,

$$\vec{y}_i = \sum_{j \in A_i} d_j \vec{u}_j,$$

where A_i is the set of transmitters which are connected to receiver i. After combining or
zero-forcing,

$$\langle \vec{y}_i, \vec{v}_i \rangle = \Big\langle \sum_{j \in A_i} d_j \vec{u}_j, \; \vec{v}_i \Big\rangle = d_i \langle \vec{u}_i, \vec{v}_i \rangle + \sum_{j \in A_i, j \neq i} d_j \langle \vec{u}_j, \vec{v}_i \rangle.$$

Clearly, interference free operation can be achieved if we design beamformers $\vec{u}_i$, 1 ≤ i ≤ n,
and combiners $\vec{v}_j$, 1 ≤ j ≤ n, such that

$$\langle \vec{u}_j, \vec{v}_i \rangle = \delta_{i,j} \quad \text{for } j \in A_i. \tag{4}$$

Notice that (4) will imply

$$U V^{\dagger} = \begin{bmatrix} \vec{u}_1 \\ \vec{u}_2 \\ \vdots \\ \vec{u}_n \end{bmatrix} \begin{bmatrix} v_1^{\dagger} & v_2^{\dagger} & \cdots & v_n^{\dagger} \end{bmatrix} = \begin{bmatrix} 1 & \times & 0 & 0 & \times \\ \times & 1 & \times & \times & 0 \\ 0 & 0 & 1 & \times & \times \\ 0 & 0 & \times & 1 & \times \\ \times & \times & 0 & 0 & 1 \end{bmatrix}, \tag{5}$$

where × inside the matrix stands for a don't care condition. In other words, we are free to
fill those entries with any values we wish, reminiscent of the so-called Matrix Completion
problem. In this particular example, we take an easy way out by filling all the don't care
values with 1 in the first 4 columns. Fortunately, this makes pairs of columns repeat,
and taking the fifth column as the difference of the second and third columns only changes
the don't care values. We can then write
$$\begin{bmatrix} 1 & 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & -1 \\ 0 & 0 & 1 & 1 & -1 \\ 1 & 1 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & -1 \end{bmatrix},$$

which gives a rank-2 decomposition. The interpretation of this from the transmission side
is as follows. Users 1, 2 and 5 send their respective data symbols in the odd time-slots,
while the even time-slots are occupied by users 2, 3 and 4. By combining with $\vec{v}_i$ at
receiver i, the data symbol d_i can be recovered.
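The completion and its factorization can be checked mechanically; in the sketch below, F holds the two-slot beamformers as rows, G plays the role of V†, and the interference sets are read off from the pattern in (5).

```python
import numpy as np

# M = F @ G must carry 1 on the diagonal (intended links) and 0 at every
# (transmitter, receiver) pair joined by a solid interference edge.
F = np.array([[1, 0], [1, 1], [0, 1], [0, 1], [1, 0]])   # beamformers u_i
G = np.array([[1, 1, 0, 0, 1], [0, 0, 1, 1, -1]])        # combiners, i.e. V†
M = F @ G

interferers = {0: [2, 3], 1: [4], 2: [0, 1], 3: [0, 1], 4: [2, 3]}  # 0-indexed
assert np.allclose(np.diag(M), 1)
for tx, rxs in interferers.items():
    assert all(M[tx, rx] == 0 for rx in rxs)
print(M)   # the rank-2 completion of the pattern in (5)
```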

Sources: For details of interference alignment, please refer to S. A. Jafar, "Topological
Interference Management Through Index Coding," IEEE Trans. Information Theory, 2011.
Some of the material is from a recent tutorial conducted at IIT Bombay by Babak Hassibi,
Caltech, 2017.

5 Matrix Decomposition
In the above interference management problem, there were two aspects that we used in
getting the decomposition. The first was the matrix completion problem, while maintaining
a low rank for the matrix. The second was finding a decomposition. We elaborate more
on the latter part now. In particular, we will learn the singular value decomposition (SVD).

5.1 SVD
Theorem 1. An m × n matrix A can be decomposed as A = U∆V†, where U and V are
unitary and ∆ is an m × n diagonal matrix.

Before we embark on the proof, the reader should be reassured that it uses nothing
but elementary techniques from linear algebra. In particular, the eigenvectors and
eigenvalues of the matrix AA† play a key role. Recall that x is an eigenvector corresponding
to the eigenvalue λ of AA† if ||x|| = 1 and

$$AA^{\dagger} x = \lambda x.$$

We recapitulate some essential properties in the following three lemmas.
We recapitulate some essential properties in the following three lemmas.

Lemma 1. All the eigenvalues of AA† are non-negative, and the eigenvectors form an
orthonormal set.

Proof. Notice that x†AA†x = x†λx = λ||x||² = λ. On the other hand, x†AA†x = ||A†x||² ≥ 0,
being the squared norm of a vector. This proves the first assertion. For the second, take λ_1 ≠ λ_2 as
two eigenvalues, with x and y the corresponding eigenvectors. It is easy to see that

$$y^{\dagger} A A^{\dagger} x = (A A^{\dagger} y)^{\dagger} x.$$

The LHS above is nothing but y†λ_1x = λ_1 y†x, whereas the RHS is λ_2 y†x. For equality
we need y†x = 0, which is the intended result.

For the case where λ_1 = λ_2 = ⋯ = λ_l, with the other eigenvalues different, clearly u_1, ⋯, u_l
will be orthogonal to all other eigenvectors u_i with λ_i ≠ λ_1. Thus u_1, ⋯, u_l will span a
subspace, and we choose an orthonormal basis of this subspace as the eigenvectors.
Let us now write [λ_1, λ_2, ⋯, λ_r, 0, ⋯, 0], in descending order, for the ordered eigenvalues
of AA†, and let U = [u_1, u_2, ⋯, u_m] denote the corresponding eigenvectors. Here r is called
the rank of the matrix A, with r ≤ min{m, n}.

It looks like we are being partial to AA†. To change that perception, we can collect the
ordered eigenvalues of A†A as [λ'_1, ⋯, λ'_r, 0, ⋯, 0] and let the corresponding eigenvectors
be V = [v_1, ⋯, v_n]. By the above arguments, V contains an orthonormal set of vectors as
well.

Lemma 2. The matrices AA† and A†A have the same non-zero eigenvalues, and we can
take $v_j = (\sqrt{\lambda_j})^{-1} A^{\dagger} u_j$.

Proof. By the definition of the eigenvector u_i,

$$A^{\dagger} A A^{\dagger} u_i = A^{\dagger} \lambda_i u_i = \lambda_i A^{\dagger} u_i.$$

Thus, by taking $v_i = A^{\dagger} u_i / ||A^{\dagger} u_i||$, we get

$$A^{\dagger} A v_i = \lambda_i v_i,$$

proving both statements, since $||A^{\dagger} u_i||^2 = u_i^{\dagger} A A^{\dagger} u_i = \lambda_i$.
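A quick numerical sketch of Lemma 2, with an arbitrary assumed complex matrix:

```python
import numpy as np

# AA† (m x m) and A†A (n x n) share their non-zero eigenvalues; the larger
# matrix just carries extra zeros.
rng = np.random.default_rng(3)
m, n = 4, 3
A = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))

ev1 = np.sort(np.linalg.eigvalsh(A @ A.conj().T))[::-1]   # m values, descending
ev2 = np.sort(np.linalg.eigvalsh(A.conj().T @ A))[::-1]   # n values, descending
assert np.allclose(ev1[:n], ev2)   # the non-zero eigenvalues agree
assert np.isclose(ev1[-1], 0)      # the extra eigenvalue of AA† is zero
print(ev1, ev2)
```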
The following lemma is now straightforward.

Lemma 3. $u_i^{\dagger} A v_j = \sqrt{\lambda_j}\, \delta_{i,j}$, where δ_{i,j} is the Kronecker delta.

Proof. By Lemma 2, we have $u_i^{\dagger} A v_j = (A^{\dagger} u_i)^{\dagger} v_j = \sqrt{\lambda_i}\, v_i^{\dagger} v_j = \sqrt{\lambda_j}\, \delta_{i,j}$.
Now, in order to complete the proof of the SVD, let us compute U†AV with U and V defined
as above. Assume m ≥ n for simplicity.

$$U^{\dagger} A V = U^{\dagger} A [v_1, \cdots, v_n] \tag{6}$$
$$= U^{\dagger} [A v_1, \cdots, A v_n] \tag{7}$$
$$= \begin{bmatrix} u_1^{\dagger} \\ u_2^{\dagger} \\ \vdots \\ u_m^{\dagger} \end{bmatrix} [A v_1, \cdots, A v_n]. \tag{8}$$

The entry at index (i, j) of this matrix is nothing but $u_i^{\dagger} A v_j = \sqrt{\lambda_j}\, \delta_{i,j}$. Thus

$$U^{\dagger} A V = \Delta,$$

where

$$\Delta = \begin{bmatrix} \sqrt{\lambda_1} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_n} \\ 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \end{bmatrix}$$

is the m × n matrix with $\sqrt{\lambda_j}$ on the diagonal and zero rows below (since m ≥ n).

Notice that U†AV = ∆ implies that A = U∆V†, since U and V are unitary matrices,
i.e. U†U = UU† = I and V†V = VV† = I. Notice, however, that UU† and VV† may be of
different dimensions.
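The whole construction can be sketched numerically: take U from the eigenvectors of AA†, build V via Lemma 2, and check Theorem 1 (a random complex A is assumed, which is full rank almost surely, so r = n here).

```python
import numpy as np

# Follow the proof: U from eigh(A A†), v_j = A† u_j / sqrt(lambda_j), and
# Delta with sqrt(lambda_j) on the diagonal; then A = U Delta V†.
rng = np.random.default_rng(2)
m, n = 4, 3
A = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))

lam, U = np.linalg.eigh(A @ A.conj().T)   # ascending eigenvalues of AA†
lam, U = lam[::-1], U[:, ::-1]            # reorder to descending
V = np.stack([(A.conj().T @ U[:, j]) / np.sqrt(lam[j]) for j in range(n)],
             axis=1)                      # columns v_j from Lemma 2

Delta = np.zeros((m, n))
np.fill_diagonal(Delta, np.sqrt(lam[:n]))
assert np.allclose(U @ Delta @ V.conj().T, A)   # Theorem 1: A = U Delta V†
print(np.round(np.sqrt(lam[:n]), 3))            # the singular values
```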
