White Paper
A Coding Theory Tutorial
Randy Yates
19-Aug-2009
signal processing systems
http://www.digitalsignallabs.com
888-708-3698
Contents
1 Introduction
  1.1 Overview
  1.2 Intended Audience
2 Theory
  2.1 Elements of a Digital Communication System
  2.2 Coding Theory Basics and Block Codes
  2.3 Two Important Families of Codes
  2.4 Other Important Concepts
    2.4.1 Constructing a Decision Scheme
    2.4.2 Quantifying Coding Performance
6 Revision History
7 References
Appendices
List of Figures
1 Digital Communication System Block Diagram
2 Simulation Results
3 Coded BPSK Simulation Matlab M-File
4 Uncoded BPSK Simulation Matlab M-File
5 Uncoded BPSK Theoretical Computation Matlab M-File
List of Tables
1 Revision History
1 Introduction
1.1 Overview
Any real-world communication system is plagued by noise and thus susceptible to errors in the transmission of information.
Theoretical work by pioneers such as Claude Shannon [1] laid the foundation for methods of designing communication systems
such that errors which occur in transmission can be reduced to arbitrarily small probability. These methods are collectively
known as coding theory.
In the terminology of digital communication systems, data is transmitted through a channel, and the components that embody coding theory are called the channel encoder and channel decoder, as shown in Figure 1. This tutorial therefore discusses the design and simulation of these two blocks.
[Figure 1: Digital Communication System Block Diagram. Transmit chain: Data Source (f ∈ F) → Source Encoder (w ∈ W) → Channel Encoder (x ∈ C ⊆ X^n) → Modulator (m(t) ∈ R) → Channel, with Noise added. Receive chain: r(t) ∈ R → Demodulator (y ∈ D ⊆ Y^n) → Channel Decoder (ŵ ∈ W) → Source Decoder (f̂ ∈ F) → Data Sink.]
In this tutorial we define fundamental concepts of coding theory, show an example of a block code, and provide theoretical and simulated results of this code applied to a BPSK system.
1.2 Intended Audience
This tutorial assumes that:
1. The reader has a working knowledge of linear algebra and general vector spaces ([2], [3]).
2. The reader has at least some familiarity with abstract (or modern) algebra ([3], [4]).
3. The reader has a working knowledge of probability theory ([5], [6]).
2 Theory
2.1 Elements of a Digital Communication System
Figure 1 is a block diagram of a digital communication system. The input to the system is a sequence of elements f¹ that we wish to transmit over the channel and reproduce at the output of the receiver as f̂ with as little error as possible within a given set of constraints (power, bandwidth, computational complexity, latency, etc.).
The source encoder removes redundancy from the source samples by the use of compression. This is desirable since it reduces the average data rate required for the system, utilizes the channel more efficiently, and makes the subsequent channel coding more effective. Functions such as A/D conversion and encryption are also sometimes performed in this block.
Source coding, while related to coding theory, is a separate topic and will not be covered in this tutorial. For more information,
see the texts by Cover and Thomas [7] or Roman [8], or search for the following topics: information theory, entropy, Kraft’s
theorem, Huffman codes.
The channel encoder and channel decoder improve the reliability of the transmission of the source-encoded information through the channel. These two blocks jointly embody the topic of this tutorial.
The modulator generates a continuous-time output m(t) from the discrete-time channel encoder output suitable for the channel. It can include functions such as pulse-shaping, constellation mapping, line-coding, RF upconversion, etc.
The channel is the physical medium through which the message must be sent. Examples include an Ethernet cable, the magnetic media and read/write heads of a disk drive, an RF channel, and the optical fiber and transmitter/receiver of a fiber-optic system.
Each block in the receiver performs the inverse function of the corresponding transmitter block. The final output of the receiver,
fˆ, is an estimate of our original input source message f .
2.2 Coding Theory Basics and Block Codes
In order to understand what a block code is, we must first make some preliminary definitions. Note that the notation used in these definitions is consistent with that in Figure 1.
Definition (Encoding Scheme) An encoding scheme is a bijective mapping f : W → C ⊆ X^n.
Definition (Decision Scheme) A decision scheme is a surjective mapping g : D ⊆ Y^n → W.
¹ Note that we omit the sequence index typically used in discrete-time systems, e.g., f[n], with the understanding that all inputs and outputs correspond to a specific index.
Definition (Discrete Memoryless Channel) A discrete memoryless channel² consists of an input alphabet X = {x_1, x_2, ..., x_q} and an output alphabet Y = {y_1, y_2, ..., y_t}, where X ⊆ Y, and a set of channel probabilities, or transition probabilities, p(y_j | x_i), satisfying

    \sum_{j=1}^{t} p(y_j | x_i) = 1,   i = 1, 2, ..., q.    (1)
Think of a channel as sequentially accepting values from C ⊆ X^n and outputting values from Y^n. Since the channel is memoryless, each symbol is independent of the others, and therefore the probability p(d|c) that d is received given that c was sent is

    p(d|c) = \prod_{i=1}^{n} p(d_i | c_i).    (2)
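For example, over a binary symmetric channel (X = Y = {0, 1}) with transition probabilities p(1|0) = p(0|1) = p, a received word d that differs from the transmitted codeword c in exactly k of its n places has probability

    p(d|c) = p^k (1 − p)^{n−k}.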
Definition (Block Code) Let X = {x_1, x_2, ..., x_q} be a finite set, called a code alphabet, and let X^n be the set of all vectors of length n over X. Any nonempty subset C of X^n is called a q-ary block code; q is called the radix of the code. Each vector in C is called a codeword. If C ⊆ X^n contains M codewords, then it is customary to say that C has length n and size M, and is an (n, log_q M)-code. The rate of a q-ary (n, log_q M)-code is

    R = \frac{\log_q M}{n}.    (3)
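For example, a binary code (q = 2) of length n = 7 containing M = 16 codewords, such as the Hamming code constructed later in this tutorial, carries log_2 16 = 4 information bits per codeword and therefore has rate R = 4/7 ≈ 0.57.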
We now know what a block code is. Why is it useful? Shannon's Noisy Coding Theorem provides the answer:
Theorem (Shannon's Noisy Coding Theorem) Consider a discrete memoryless channel with capacity C. For any value R < C, R ∈ R⁺, there exists a sequence C_n of q-ary codes and corresponding decision schemes f_n with the following properties:
1. C_n is an (n, log_q |C_n|)-code; that is, C_n has length n and rate at least (log_q |C_n|)/n;
2. The maximum probability of error of f_n approaches 0 as n approaches infinity,

    \lim_{n \to \infty} p_e^{max}(n) = 0.    (4)
2.3 Two Important Families of Codes
Definition (Systematic Code) A q-ary (n, k)-code is called a systematic code if there are k positions i_1, i_2, ..., i_k with the property that, by restricting the codewords to these positions, we get all of the q^k possible q-ary words of length k. The set {i_1, i_2, ..., i_k} is called the information set.
In other words, a systematic code is one in which the original message can be picked out of the codeword by looking at the appropriate positions. For example, consider the encoding of the message [0 1 2 3] over the alphabet Z_4 given by [0 3 2 0 3 1], where the information set is {4, 6, 3, 2}, and where q = 4, n = 6, and k = 4.
Theorem (Finite Fields) All finite fields have size p^n for some prime number p and n ∈ N. Furthermore, there is exactly one field (up to isomorphism) of size q = p^n, denoted by F_q or GF(q).
The set F_q^n of all n-tuples with components from F_q is a vector space over F_q with dimension n.
Definition (Linear Code) A code L ⊆ F_q^n is a linear code if L is a subspace of F_q^n. If L has dimension k over F_q, we say that L is an [n, k]-code.
² Note that to make this definition consistent with the block diagram in Figure 1, the "channel" defined here includes the modulator, channel, and demodulator in the block diagram.
2.4 Other Important Concepts
2.4.1 Constructing a Decision Scheme
We've now defined the key components of a block code, including the decision scheme, but we haven't shown how a decision scheme can be constructed.
Definition (Weight) Let y be a vector from Y^n. The weight of y, denoted w(y), is the number of non-zero elements in the vector.
For example, let y = [1 0 2 3 0 3 1] ∈ Z_4^7. Then w(y) = 5.
Definition (Hamming Distance) Let y_1 and y_2 be two vectors from Y^n. The Hamming distance between y_1 and y_2, denoted d(y_1, y_2), is w(y_1 − y_2). Thus the Hamming distance can be thought of as the number of places in which two vectors differ.
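Both quantities are one-line computations in Matlab. A minimal sketch (the variable names here are ours, not taken from the M-files in the appendices):

w = sum(y ~= 0);    % weight: count the nonzero elements of y
d = sum(y1 ~= y2);  % Hamming distance: count the places where y1 and y2 differ

For example, with y = [1 0 2 3 0 3 1] as above, sum(y ~= 0) returns 5.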
Definition (Minimum Distance) Let C be a block code. The minimum distance of the code is the minimum Hamming distance between any two distinct codewords in C.
Now consider a block code which has a minimum distance of 3: any two codewords will differ in at least three places. If a received word contains a single error, the Hamming distance between it and the true codeword will be one, while the Hamming distance between it and any other codeword will be at least two. Thus we see that the Hamming distance can be used as a decision scheme mapping received words to codewords in C. Such a decision scheme is called minimum-distance decoding. The final mapping from a codeword in C to a message in W is performed by the inverse function f⁻¹ : C → W, which is guaranteed to exist since f is bijective.
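A minimum-distance decoder is easy to express in Matlab. The following is a sketch under our own naming conventions (C is an M × n matrix whose rows are the codewords, W the corresponding M × k messages; the simulation of Figure 3 uses soft-decision correlation metrics instead):

function w = mindistdecode(r, C, W)
% Map a received word r (1 x n) to the message whose codeword is nearest
% to r in Hamming distance, implementing minimum-distance decoding.
d = sum(C ~= repmat(r, size(C, 1), 1), 2); % distance from r to every codeword
[dmin, idx] = min(d);                      % nearest codeword; ties go to the first
w = W(idx, :);                             % apply f^-1: look up the message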
2.4.2 Quantifying Coding Performance
The performance of an error-correcting code is often quantified in terms of the coding gain:
Definition (Coding Gain) For a given modulation type and bit-error rate, the coding gain is defined as the ratio of the SNR per bit required by the uncoded signal to the SNR per bit required by the coded signal.
Let the source alphabet be X = Z_2 (a field). Consider the following matrix over Z_2:

        [ 1 0 0 0 0 1 1 ]
    G = [ 0 1 0 0 1 0 1 ]    (5)
        [ 0 0 1 0 1 1 0 ]
        [ 0 0 0 1 1 1 1 ]
The rows of this matrix can be thought of as four vectors in Z_2^7. Since these vectors are linearly independent, they span a 4-dimensional subspace of Z_2^7, which we denote C. Hence the k × n matrix G can be thought of as a linear transformation from the vectors in Z_2^4 to C, taking 1 × 4 source data vectors w to 1 × 7 codeword vectors c ∈ C:

    c = wG.    (6)

Hence we see that the code represented by G is a linear code. G is referred to as the generator matrix.
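In Matlab, the encoding of equation (6) is a single modulo-2 matrix product. A minimal sketch (G as in equation (5)):

G = [1 0 0 0 0 1 1
     0 1 0 0 1 0 1
     0 0 1 0 1 1 0
     0 0 0 1 1 1 1];
w = [1 0 1 1];       % a 1 x 4 message
c = mod(w * G, 2);   % the 1 x 7 codeword [1 0 1 1 0 1 0]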
We may partition G into a 4 × 4 identity matrix I and a 4 × 3 matrix G′ as follows:

    G = [ I | G′ ],    (7)

where

         [ 0 1 1 ]
    G′ = [ 1 0 1 ].    (8)
         [ 1 1 0 ]
         [ 1 1 1 ]
Then

    c = wG
      = w [ I | G′ ]
      = [ w | wG′ ].    (9)
This shows that the code defined by this matrix is systematic, the source message being embedded in the left four symbols of
the codeword.
After the codewords are formed from the message source, they are sent through the channel. Since the information may be corrupted on its way through the channel, the received word c′ may no longer be in the subspace C. How could we detect this? If we could somehow come up with another linear transformation which had C as its kernel (or nullspace), then we would know that any received word transforming to the zero vector is a valid codeword (and, if at most one error occurred, was in fact received correctly).
Let H be the matrix

        [ 0 0 0 1 1 1 1 ]
    H = [ 0 1 1 0 0 1 1 ]    (10)
        [ 1 0 1 0 1 0 1 ]
Since the rows of H are linearly independent, they span a 3-dimensional subspace of Z_2^7. It is not hard to show that each row of H is also orthogonal to each row of G, which shows that the space spanned by the rows of G, namely C, is a subset of the nullspace of H. Finally, since, for any linear transformation H,

    rank(H) + nullity(H) = n = 7,    (11)

the nullspace of H has dimension 7 − 3 = 4 and is therefore precisely the space spanned by the rows of G. Thus we may conclude that

    c ∈ C ⟺ cH^T = 0.    (12)
For this reason, H is called the parity check matrix for the code C. The 1 × 3 vector that results from cH^T is referred to as the syndrome.
Suppose that sending the codeword c through the channel results in a single error in the i-th position. Let e_i be the error vector with a 1 in the i-th position, so that

    c′ = c ⊕ e_i.    (13)

The syndrome of the received word is then

    (c ⊕ e_i)H^T = cH^T ⊕ e_i H^T
                 = 0 ⊕ e_i H^T
                 = e_i H^T.    (14)
But notice that the matrix H has been cleverly designed so that its i-th column, read from the top down, is just the binary representation of the number i. In other words, the vector c′H^T tells us the location of the error, where location zero means no error.
An example encoding of the message w = [1 0 1 1]:

                           [ 1 0 0 0 0 1 1 ]
    c = wG = [1 0 1 1]  ·  [ 0 1 0 0 1 0 1 ]  = [1 0 1 1 0 1 0].    (15)
                           [ 0 0 1 0 1 1 0 ]
                           [ 0 0 0 1 1 1 1 ]
Let us introduce an error in the fourth digit, resulting in c′ = [1 0 1 0 0 1 0]. The syndrome is then

    c′H^T = [1 0 1 0 0 1 0] H^T = [1 0 0].    (16)
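Reading the syndrome [1 0 0] from left to right as the binary number 100₂ = 4 locates the error in the fourth position; complementing that digit recovers the codeword of equation (15). In Matlab the whole check-and-correct step is a few lines. A sketch (H as in equation (10); the weights [4; 2; 1] read the syndrome as a binary number, most significant bit first):

H = [0 0 0 1 1 1 1
     0 1 1 0 0 1 1
     1 0 1 0 1 0 1];
cp = [1 0 1 0 0 1 0];      % received word with an error in the fourth digit
s = mod(cp * H', 2);       % the syndrome [1 0 0]
pos = s * [4; 2; 1];       % error location; 0 means no error detected
if pos > 0
    cp(pos) = 1 - cp(pos); % flip the erroneous digit, recovering (15)
end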
3.2 Simulation
In this section we will apply the Hamming code of the previous section to a BPSK system using soft-decision decoding, simulate the system with and without coding, and determine the coding gain.
From the previous section, W = X^k and C ⊆ X^n, where X = Z_2, k = 4, and n = 7. In other words, the source symbols from the source encoder are binary and are blocked into vectors of four symbols each. The encoding scheme then maps each element of Z_2^4 to a codeword in C ⊆ Z_2^7 by multiplying by the generator matrix G as discussed above.³
The BPSK modulator transforms the codewords' alphabet from {0, 1} to {−1, +1} and sends the words over the channel. Since this is BPSK, the elements of each vector are transmitted sequentially, and each is independently corrupted by the noise of the channel.
Soft-decision decoding ([9], section 8.1.4) combines the demodulator and the channel decoder by formulating the so-called
correlation metrics. The simulation code is shown in Figure 3.
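Concretely (a sketch of the metric in our notation), if r is the received 7-vector and s^(j) = 2c^(j) − 1 is the BPSK-modulated image of the j-th codeword, the decoder computes

    CM_j = \sum_{i=1}^{7} r_i s_i^{(j)},   j = 1, 2, ..., 16,

and selects the message whose codeword yields the largest metric.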
The probability of bit error was both simulated and computed theoretically for the uncoded system in order to verify the correctness of the simulation. The well-known result for coherent BPSK is

    P_e = Q( \sqrt{ 2 E_b / N_0 } ).    (17)
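A direct Matlab rendering of equation (17) (a sketch; the function name and the dB input convention are ours, and we use the identity Q(x) = erfc(x/√2)/2 since base Matlab provides erfc rather than Q):

function pe = theoruncoded(ebnodb)
% Theoretical bit-error rate of uncoded coherent BPSK, equation (17)
ebno = 10 .^ (ebnodb / 10);  % convert E_b/N_o from dB to a linear ratio
pe = 0.5 * erfc(sqrt(ebno)); % = Q(sqrt(2 Eb/No))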
From Figure 2, at a BER of approximately 2 × 10⁻⁴, G_c = 8 − 6.4 = 1.6 dB, while at a BER of approximately 1 × 10⁻², G_c = 4.4 − 3.4 = 1 dB. Thus we see that the coding gain is a function of the BER.
To summarize, we find that even a short, simple Hamming code provides a coding gain of about 1.5 dB at moderate bit-error rates. Since 10^{1.5/10} ≈ 1.41, this allows us to use approximately 30 percent less transmit power to achieve the same BER as an uncoded system.
[Figure 2: Simulation Results. Probability of bit error P_e (log scale, 10⁻⁶ to 10⁻¹) versus E_b/N_0 (0 to 8 dB) for three curves: Uncoded (Simulated), Uncoded (Theoretical), and Soft-Decision Decoded Hamming Code (Simulated).]
6 Revision History
Table 1 lists the revision history for this document.
7 References
[1] C. E. Shannon, “Communication in the presence of noise,” Proceedings of the Institute of Radio Engineers, vol. 37, pp. 10–21, 1949.
[2] C. D. Meyer, Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics, 2000.
[3] M. Artin, Algebra. Prentice Hall, 1991.
[4] I. Herstein, Topics in Algebra, 2nd ed. Wiley, 1975.
[5] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. WCB/McGraw-Hill, 1991.
[6] A. Leon-Garcia, Probability and Random Processes for Electrical Engineering. Addison-Wesley, 1989.
[7] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley and Sons, Inc., 1991.
[8] S. Roman, Coding and Information Theory. Springer, 1992.
[9] J. G. Proakis, Digital Communications, 4th ed. McGraw-Hill, 2001.
Appendices
function ber = simcoded(ebno, N)
%ber = simcoded(ebno, N)
% FUNCTION: Simulated Soft-Decision Decoding Bit Error Rate of Hamming-Coded Coherent Binary PSK
% AUTHOR: Randy Yates
% INPUT PARAMETERS:
%    ebno = E_b / N_o (may be vector; a linear ratio, not dB)
%       N = number of data values to simulate. note that N should be a multiple of 4
% OUTPUT PARAMETERS:
%     ber = bit-error rate result of the simulations
% DESCRIPTION:
%    Lines marked "reconstructed" were lost from this listing and have been
%    filled in consistently with the surrounding text; they are assumptions,
%    not the original code.
% CREDITS:
%    From section 8.1.4 of
%    @BOOK{proakiscomm,
%          title = "{Digital Communications}",
%         author = "John~G.~Proakis",
%      publisher = "McGraw-Hill",
%        edition = "fourth",
%           year = "2001"}

% reconstructed: generator matrix of equation (5), the 16 possible messages,
% and the BPSK-modulated codeword table used to form the correlation metrics
G = [1 0 0 0 0 1 1; 0 1 0 0 1 0 1; 0 0 1 0 1 1 0; 0 0 0 1 1 1 1];
W = zeros(16, 4);
for i = 0:15
    W(i+1, :) = bitget(i, 4:-1:1);
end
cwbpsk = 2 * mod(W * G, 2) - 1;   % 16 x 7 matrix with elements in {-1,+1}

M = length(ebno);
ber = zeros(1, M);                % reconstructed
% force N to be a multiple of 4
K = ceil(N / 4);
N = 4 * K;
for m = 1:M
    fprintf(1, ' Eb/No = %f (%d of %d)\n', ebno(m), m, M);
    % reconstructed: each 7-symbol codeword carries 4 information bits, so
    % Eb = 7/4 per unit-energy symbol and the noise variance is No/2 = Eb/(2 ebno)
    var = (7/4) / (2 * ebno(m));
    % produce a string of 1's and 0's
    w = round(rand(K, 4));
    % reconstructed: encode and modulate
    modulated = 2 * mod(w * G, 2) - 1;
    % simulate passing through channel by adding zero-mean Gaussian noise of specified variance:
    ch = modulated + randn(K,7) * sqrt(var);
    % decode
    cm = ch * cwbpsk';            % reconstructed: Kx16 correlation metrics
    cmmax = max(cm, [], 2);       % reconstructed: largest metric in each row
    cmmaxidx = (cm == repmat(cmmax, 1, 16)); % Kx16, with a single 1 in each row indicating the decoded element
    what = cmmaxidx * W;
    ber(m) = sum(sum(what ~= w)) / N; % reconstructed: count message-bit errors
end

Figure 3: Coded BPSK Simulation Matlab M-File
function ber = simuncoded(ebno, N)
%ber = simuncoded(ebno, N)
% FUNCTION: Simulated Bit Error Rate of Uncoded Coherent Binary PSK
% (the file header was lost from this listing; the function name, the
%  Eb = 1 normalization, and the final line are reconstructions)
M = length(ebno);
ber = zeros(1, M);
for m = 1:M
    fprintf(1, ' Eb/No = %f (%d of %d)\n', ebno(m), m, M);
    var = 1 / (2 * ebno(m)); % noise variance No/2, with Eb = 1
    % produce a string of 1's and -1's
    data = 2 * (rand(N, 1) > 0.5) - 1;
    % simulate passing through channel by adding zero-mean Gaussian noise of specified variance:
    chdata = data + randn(N, 1) * sqrt(var);
    ber(m) = sum(sign(chdata) ~= data) / N; % reconstructed: hard decisions, count errors
end

Figure 4: Uncoded BPSK Simulation Matlab M-File