C&C Combined Module Notes
Text Book: Bose, Ranjan. Information Theory, Coding and Cryptography, 3rd Edition, Tata McGraw-Hill Education, 2015, ISBN: 978-9332901257
The entropy of a source is given by

H = Σi pi logb(1/pi)

where pi is the probability of the occurrence of character number i in a given stream of characters and b is the base of the logarithm used. Hence, this is also called Shannon's entropy.
Conditional Entropy: The amount of uncertainty remaining about the channel input after observing the channel output is called conditional entropy. It is denoted by H(X|Y).
Example:
Consider a diskette storing a data file consisting of 100,000 binary digits (binits), i.e., a total of 100,000 "0"s and "1"s. If the binits 0 and 1 occur with probabilities ¼ and ¾ respectively, then binit 0 conveys an amount of information equal to log2(4/1) = 2 bits, while binit 1 conveys information amounting to log2(4/3) = 0.42 bit.
The quantity H is called the entropy of a discrete memory-less source. It is a measure of the average
information content per source symbol. It may be noted that the entropy H depends on the probabilities of
the symbols in the alphabet of the source.
Example
Consider a discrete memory-less source with source alphabet {s0, s1, s2} with probabilities p0 = 1/4, p1 = 1/4 and p2 = 1/2. Find the entropy of the source.
Solution
The entropy of the given source is
H = p0log2(1/p0) + p1log2(1/p1) + p2log2(1/p2)
= ¼log2(4) + ¼log2(4) + ½log2(2)
= 2/4 + 2/4 + 1/2
= 1.5 bits
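The same computation can be reproduced in a few lines of Python (a minimal sketch; the helper name entropy is ours):

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H = sum p*log_b(1/p) of a probability distribution."""
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)

# The three-symbol source from the example above
print(entropy([0.25, 0.25, 0.5]))   # -> 1.5 bits
```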
For a discrete memory-less source with a fixed alphabet:
• H=0, if and only if the probability pk=1 for some k, and the remaining probabilities in
the set are all zero. This lower bound on the entropy corresponds to ‘no uncertainty’.
• H=log2(K), if and only if pk=1/K for all k (i.e. all the symbols in the alphabet are
equiprobable). This upper bound on the entropy corresponds to ‘maximum
uncertainty’.
• In Case I, it is very easy to guess whether the message s0 with probability 0.01 will occur or the message s1 with probability 0.99 will occur (most of the time message s1 will occur). Thus, in this case, the uncertainty is less.
• In Case II, it is somewhat difficult to guess whether s0 or s1 will occur, as their probabilities are nearly equal. Thus, in this case, the uncertainty is more.
• In Case III, it is extremely difficult to guess whether s0 or s1 will occur, as their probabilities are equal. Thus, in this case, the uncertainty is maximum.
Entropy is less when uncertainty is less.
Entropy is more when uncertainty is more.
Thus, we can say that entropy is a measure of uncertainty.
An analog signal is band-limited to B Hz, sampled at the Nyquist rate, and the samples are quantized into 4 levels. The quantization levels Q1, Q2, Q3, and Q4 (messages) are assumed independent and occur with probabilities P1 = P4 = 1/8 and P2 = P3 = 3/8. Find the information rate of the source.
Relation between Entropy and Mutual Information
Mutual Information: quantifies the amount of information that knowing one random variable Y gives about another random variable X. It is a measure of how much the uncertainty in X is reduced by knowing Y: I(X; Y) = H(X) − H(X|Y).
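This reduction in uncertainty can be computed directly from a joint distribution. Below is a small Python sketch; the mutual_information helper and the joint pmf are illustrative assumptions, not taken from the text:

```python
import math

def mutual_information(joint, base=2):
    """I(X;Y) = sum over (x,y) of p(x,y) * log[ p(x,y) / (p(x)p(y)) ]."""
    px = [sum(row) for row in joint]            # marginal of X (rows)
    py = [sum(col) for col in zip(*joint)]      # marginal of Y (columns)
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]), base)
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Hypothetical joint pmf of (X, Y); rows index X, columns index Y
joint = [[0.4, 0.1],
         [0.1, 0.4]]
print(mutual_information(joint))   # ~0.278 bits: knowing Y reduces uncertainty in X
```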
SHANNON-FANO CODING:
Lempel-Ziv-Welch Coding
A drawback of the Huffman code is that it requires knowledge of a probabilistic model of the source; unfortunately, in practice, source statistics are not always known a priori, thereby compromising the efficiency of the code. To overcome these practical limitations, we may use the Lempel-Ziv algorithm, which is intrinsically adaptive and simpler to implement than Huffman coding.
A key to file data compression is to have repetitive patterns of data so that patterns seen once can then be encoded into a compact code symbol, which is then used to represent the pattern whenever it reappears in the file. For example, in images, consecutive scan lines (rows) of the image may be identical. They can then be encoded with a simple code character that represents the lines. In text processing, repetitive words, phrases, and sentences may also be recognized and represented as a code.

A typical file data compression algorithm is known as LZW (Lempel-Ziv-Welch) encoding. Variants of this algorithm are used in many file compression schemes such as GIF files. These are lossless compression algorithms in which no data is lost, and the original file can be entirely reconstructed from the encoded message file. The LZW algorithm is a greedy algorithm in that it tries to recognize increasingly longer phrases that are repetitive, and encode them. Each phrase is defined to have a prefix that is equal to a previously encoded phrase plus one additional character in the alphabet. Note that "alphabet" means the set of legal characters in the file. For a normal text file, this is the ASCII character set. For a gray-level image with 256 gray levels, it is an 8-bit number that represents the pixel's gray level.

In many texts certain sequences of characters occur with high frequency. In English, for example, the word "the" occurs more often than any other sequence of three letters, with "and", "ion", and "ing" close behind. If we include the space character, there are other very common sequences, including longer ones like "of the". Although it is impossible to improve on Huffman encoding with any method that assigns a fixed encoding to each character, we can do better by encoding entire sequences of characters with just a few bits. The method of this section takes advantage of frequently occurring character sequences of any length. It typically produces an even smaller representation than is possible with Huffman trees, and unlike basic Huffman encoding it 1) reads through the text only once and 2) requires no extra space for overhead in the compressed representation.

The algorithm makes use of a dictionary that stores character sequences chosen dynamically from the text. With each character sequence the dictionary associates a number; if s is a character sequence, we use codeword(s) to denote the number assigned to s by the dictionary. The number codeword(s) is called the code or code number of s. All codes have the same length in bits; a typical code size is twelve bits, which permits a maximum dictionary size of 2^12 = 4096 character sequences.
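The greedy longest-match-plus-one-character rule described above can be sketched in a few lines of Python. This is a simplified encoder (the function name and the byte-valued initial dictionary are our assumptions; production LZW implementations also manage the growth of the code size):

```python
def lzw_encode(text):
    """Greedy LZW: emit the code of the longest dictionary match, then add
    that match plus one more character as a new dictionary entry."""
    dictionary = {chr(i): i for i in range(256)}   # single-character alphabet
    next_code = 256
    phrase, output = "", []
    for ch in text:
        if phrase + ch in dictionary:        # keep extending the match
            phrase += ch
        else:
            output.append(dictionary[phrase])      # emit longest match
            dictionary[phrase + ch] = next_code    # new phrase = match + ch
            next_code += 1
            phrase = ch
    if phrase:
        output.append(dictionary[phrase])
    return output

print(lzw_encode("ababababab"))   # repeated patterns collapse into few codes
```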
Module - 1
6) Explain mutual information between two random variables. Illustrate the relationship between
entropy and mutual information.
7) Construct a Huffman code for the following source: Calculate the coding efficiency.
8) A DMS has the following alphabet with probability of occurrence as shown below:
Symbol s0 s1 s2 s3 s4 s5 s6
9) Generate the Huffman code with minimum code variance. Determine the code variance and code
efficiency. Comment on code efficiency.
10) Consider a DMS with three symbols xi (i = 1, 2, 3) and their respective probabilities p1 = 0.5, p2 = 0.3, and p3 = 0.2. Encode the source symbols using the Huffman encoding algorithm and compute the efficiency of the code suggested. Now group together the symbols, two at a time, and again apply the Huffman encoding algorithm to find the codewords. Compute the efficiency of this code. How can the coding efficiency be improved?
11) An information source produces a sequence of independent symbols having the following
probabilities:
Symbols S1 S2 S3 S4 S5 S6 S7
Construct binary code using Huffman encoding procedure and find its efficiency.
12) The source of information A generates the symbols {A1,A2,A3,A4,A5,A6} with the corresponding
probabilities {0.2,0.3,0.11, 0.16,0.18,0.05}. Compute the code for source symbols using Huffman
coding and calculate its efficiency.
15) Consider the message "ABABAB" with the symbol probabilities p(A) = 0.4, p(B) = 0.4 and p(C) = 0.2. Calculate the encoded value of the message using arithmetic encoding.
16) Explain Lempel-Ziv Coding with an example. Also discuss its advantages and its limitations.
17) With the help of an example, describe the notion of a discrete memory-less source.
18) Explain run-length coding with the help of an example.
19) Construct a Lempel-Ziv code for the following bit sequence: 101011010101001011. List the steps involved in Shannon-Fano coding.
20) What is a simple way to shorten a repeated pattern of numbers? If you have a long list of numbers like this: 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, how can we write it in a shorter way?
21) Explain the difference between lossless and lossy compression? Explain Prefix free coding with
examples.
22) Design a Shannon-Fano code for the following symbol probabilities: {A: 0.4, B: 0.3, C: 0.2, D: 0.1}
23) Given the messages x1,x2,x3,x4,x5 and x6 with respective probabilities 0.4,0.2,0.1,0.1,0.06,0.04 (i)
Construct a binary code by applying Shannon-Fano encoding procedure.
25) Explain the properties of Entropy with suitable expressions and Examples.
26) Construct a binary Huffman code of a zero-memory source with probabilities P = {0.4, 0.2, 0.1, 0.1, 0.1, 0.05, 0.05}.
27) Make use of the Lempel-Ziv algorithm to encode the following strings: 101011011010101011 and 01000101110010100101.
29) State the properties of entropy. If a weather forecast model predicts one of 15 possible
weather conditions, each equally likely to occur. Calculate the entropy of this prediction model.
30) The source of information A generates the symbols {A1, A2, A3, A4, A5, A6} with the corresponding probabilities {0.2, 0.3, 0.11, 0.16, 0.18, 0.05}. Compute the code for the source symbols using Huffman and Shannon-Fano encoders and compare their efficiencies.
Module 2
Error-Correcting Codes
Channel models, channel capacity, channel coding, Types of Codes.
Linear Block Codes: matrix description of Linear Block Codes, Error Detection & Correction, Hamming Codes, Low Density Parity Check (LDPC) Codes.
Binary Cyclic Codes: Algebraic Structure of Cyclic Codes, Encoding using an (n-k) Bit Shift register,
Syndrome Calculation, Error Detection and Correction.
Channel Models
Channel models represent how signals propagate from the transmitter to the receiver. They include various factors
such as noise, interference, and physical properties of the transmission medium. Common channel models are:
AWGN (Additive White Gaussian Noise): Simplest model where noise is Gaussian and uncorrelated
with the signal.
Rayleigh Fading: Models multipath propagation where there is no dominant line-of-sight path.
Rician Fading: Similar to Rayleigh but includes a strong line-of-sight component.
Path Loss Models: Account for the reduction in signal power over distance, like the Free Space Path Loss
model.
2. Channel Capacity
This refers to the maximum rate at which information can be transmitted over a communication channel without
error, given by Shannon's Capacity Theorem:

C = B log2(1 + S/N)

Where:
C is the channel capacity (bits per second).
B is the bandwidth of the channel (Hz).
S is the signal power.
N is the noise power.
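As a quick numerical check of the formula, a short Python sketch (the function name is ours):

```python
import math

def channel_capacity(bandwidth_hz, snr_linear):
    """Shannon capacity C = B * log2(1 + S/N) in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# e.g. a 4 kHz telephone channel with S/N = 12 dB (10**(12/10) ~ 15.85)
snr = 10 ** (12 / 10)
print(channel_capacity(4000, snr))   # ~16,300 bits/sec
```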
3. Channel Coding
Channel coding is used to detect and correct errors in transmitted data. It adds redundancy to the transmitted signal
to improve its reliability. There are two main types of channel coding:
Error Detection: Identifies the presence of errors (e.g., Parity Check, CRC).
Error Correction: Detects and corrects errors at the receiver (e.g., Hamming Code, Reed-Solomon Code).
4. Types of Codes
Block Codes: These divide the message into fixed-size blocks and add redundancy to each block
independently.
o Example: Hamming Code, Reed-Solomon Code.
Convolutional Codes: Encode the entire message in a continuous stream, where the output depends on the
current and previous input bits.
o Example: Trellis Codes, Viterbi Algorithm for decoding.
Turbo Codes: Combine two or more convolutional codes with an interleaver, allowing very low error
rates close to Shannon’s limit.
LDPC (Low-Density Parity-Check) Codes: A class of highly efficient linear block codes with sparse
parity-check matrices, used in modern wireless standards like 5G.
Linear Block Codes
Linear block codes are a type of error-correcting code used in digital communication, where a set of information
bits are transformed into a larger set of bits by adding redundant bits (parity bits) to allow for error detection and
correction.
1. Matrix Description of Linear Block Codes
A linear block code can be described by two matrices:
Generator Matrix (G): Used to encode the message bits.
Parity-Check Matrix (H): Used to check for errors in the received message.
Encoding Process: the message vector u is multiplied by the generator matrix to obtain the codeword, v = u·G.
2. Error Detection and Correction
Linear block codes are designed to detect and correct errors by adding redundancy in the form of parity bits.
3. Hamming Codes
Hamming codes are a specific type of linear block code that can detect and correct single-bit errors. They are
widely used due to their simplicity and efficiency.
Parameters:
Hamming codes are typically denoted as (n, k), where:
o n = 2^m − 1 (total bits, including parity bits).
o k = 2^m − 1 − m (number of information bits).
o m is the number of parity bits.
Construction of Hamming Codes:
The parity-check matrix H for Hamming codes is constructed such that each column is a unique binary representation of the numbers from 1 to n.
The generator matrix G can be derived from H by ensuring that it is in systematic form.
Error Detection and Correction:
Hamming codes are capable of:
Detecting 2-bit errors.
Correcting 1-bit errors.
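The construction can be tried out numerically. The sketch below uses one particular systematic (7, 4) Hamming code; the specific parity matrix P is our choice, and any arrangement giving distinct nonzero columns of H works:

```python
import numpy as np

# Systematic (7,4) Hamming code: G = [I4 | P], H = [P^T | I3], arithmetic mod 2
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])

def encode(u):
    return (u @ G) % 2

def correct(r):
    """Single-error correction: the syndrome matches a column of H."""
    s = (r @ H.T) % 2
    if s.any():
        for col in range(7):
            if np.array_equal(H[:, col], s):
                r = r.copy()
                r[col] ^= 1          # flip the erroneous bit
                break
    return r

u = np.array([1, 0, 1, 1])
c = encode(u)
r = c.copy(); r[2] ^= 1               # inject a single-bit error
print(np.array_equal(correct(r), c))  # -> True
```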
4. Low-Density Parity-Check (LDPC) Codes
LDPC codes are a powerful type of linear block code that are used in modern communication systems, including
5G, due to their high error-correcting capability and near-Shannon-limit performance.
LDPC Code Structure:
Low Density: The parity-check matrix H of LDPC codes is sparse, meaning that most of the entries are zero, which reduces complexity and allows efficient decoding.
Parity-Check Matrix: For an LDPC code, the matrix H has far fewer 1s than 0s. This structure makes decoding more efficient using iterative algorithms like the belief propagation (or sum-product) algorithm.
Encoding:
The encoding of LDPC codes is similar to other linear block codes. A generator matrix G is derived from the sparse parity-check matrix H. The message vector u is encoded using c = u·G.
Decoding:
LDPC codes use iterative decoding algorithms, such as the belief propagation or sum-product algorithm. These
algorithms update the likelihood of individual bits being correct based on the received message and the parity-
check constraints, gradually converging on the most likely transmitted codeword.
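Full belief propagation is beyond a short sketch, but the simpler hard-decision "bit-flipping" decoder conveys the iterative idea: bits involved in many failing parity checks get flipped until all checks pass. The parity-check matrix below is a toy example for illustration only, not a real LDPC design:

```python
import numpy as np

def bit_flip_decode(H, r, max_iters=20):
    """Hard-decision bit flipping: repeatedly flip the bit that participates
    in the largest number of unsatisfied parity checks."""
    r = r.copy()
    for _ in range(max_iters):
        syndrome = (H @ r) % 2
        if not syndrome.any():
            return r                   # all parity checks satisfied
        fails = H.T @ syndrome         # per-bit count of failing checks
        r[np.argmax(fails)] ^= 1       # flip the most suspicious bit
    return r

# Toy sparse parity-check matrix (illustrative only)
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])
c = np.zeros(6, dtype=int)             # the all-zero word is always a codeword
r = c.copy(); r[3] = 1                 # single bit error
print(bit_flip_decode(H, r))           # -> [0 0 0 0 0 0]
```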
Applications of LDPC Codes:
LDPC codes are widely used in various communication standards like Wi-Fi (802.11n), 5G, and digital
television broadcasting (DVB-S2), where high data reliability is required.
Summary
Linear Block Codes: Represented by matrices G (generator) and H (parity-check), allowing error detection and correction.
Hamming Codes: Simple block codes that detect 2-bit errors and correct 1-bit errors.
LDPC Codes: Advanced block codes with sparse parity-check matrices, enabling efficient error correction
in modern communication systems.
ERROR CONTROL CODING
INTRODUCTION
The earlier chapters have given you enough background on information theory and source encoding. In this chapter you will be introduced to another important signal-processing operation, namely "channel encoding", which is used to provide 'reliable' transmission of information over the channel. In particular, we present, in this and subsequent chapters, a survey of 'error control coding' techniques that rely on the systematic addition of 'redundant' symbols to the transmitted information so as to facilitate two basic objectives at the receiver: 'error detection' and 'error correction'. We begin with some preliminary discussions highlighting the role of error control coding.
The main task required in digital communication is to construct 'cost-effective systems' for transmitting information from a sender (one end of the system) at a rate and a level of reliability that are acceptable to a user (the other end of the system). The two key parameters available are transmitted signal power and channel bandwidth. These two parameters, along with the power spectral density of noise, determine the signal energy per bit to noise power density ratio, Eb/N0, and this ratio, as seen in Chapter 4, uniquely determines the bit error rate for a particular scheme. We would like to transmit information at a rate up to RMax = 1.443 S/N0 (the wideband Shannon limit). Practical considerations restrict the limit on Eb/N0 that we can assign. Accordingly, we often arrive at modulation schemes that cannot provide acceptable data quality (i.e., low enough error performance). For a fixed Eb/N0, the only practical alternative available for changing data quality from problematic to acceptable is to use "coding".
Another practical motivation for the use of coding is to reduce the required Eb/N0 for a fixed
error rate. This reduction, in turn, may be exploited to reduce the required signal power or reduce the
hardware costs (example: by requiring a smaller antenna size).
The coding methods discussed in Chapter 2 deal with minimizing the average word length of the codes with the objective of achieving the lower bound, viz. H(S)/log r; accordingly, such coding is termed 'entropy coding'. However, such source codes cannot be adopted for direct transmission over the channel. Consider the coding for a source having four symbols with probabilities p(s1) = 1/2, p(s2) = 1/4, p(s3) = p(s4) = 1/8. The resultant binary code using Huffman's procedure is:

s1 …… 0        s3 …… 1 1 0
s2 …… 1 0      s4 …… 1 1 1

Clearly, the code efficiency is 100% and L = 1.75 binits/symbol = H(S). The sequence s3s4s1 will then correspond to 1101110. Suppose a one-bit error occurs so that the received sequence is 0101110. This will be decoded as "s1s2s4s1", which is altogether different from the transmitted sequence. Thus, although the coding provides 100% efficiency in the light of Shannon's theorem, it suffers a major disadvantage. Another disadvantage of a 'variable length' code lies in the fact that output data rates measured over short time periods will fluctuate widely. To avoid this problem, buffers of large length will be needed both at the encoder and at the decoder to store the variable-rate bit stream if a fixed output rate is to be maintained.
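The error-propagation effect just described is easy to reproduce with a minimal prefix-code decoder (a Python sketch; the helper names are ours):

```python
def decode(bits, codebook):
    """Prefix-code decoder: consume bits, emit a symbol on each full match."""
    inverse = {code: sym for sym, code in codebook.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out

codebook = {"s1": "0", "s2": "10", "s3": "110", "s4": "111"}
print(decode("1101110", codebook))  # transmitted: ['s3', 's4', 's1']
print(decode("0101110", codebook))  # one flipped bit: ['s1', 's2', 's4', 's1']
```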
Some of the above difficulties can be resolved by using codes of "fixed length". For example, the codes for the example cited can be modified as 000, 100, 110, and 111. Observe that even if there is a one-bit error, it affects only one "block" and the output data rate will not fluctuate. The encoder/decoder structure using fixed-length code words will be very simple compared to the complexity of those for the variable-length codes.
Hereafter, we shall mean by "block codes" the fixed-length codes only. Since, as discussed above, single bit errors lead to 'single block errors', we can devise means to detect and correct these errors at the receiver. Notice that the price to be paid for the efficient handling and easy manipulation of the codes is reduced efficiency and hence increased redundancy.
In general, whatever the scheme adopted for transmission of digital/analog information, the probability of error is a function of the signal-to-noise power ratio at the input of a receiver and the data rate. However, constraints like maximum signal power and bandwidth of the channel (mainly governmental regulations on public channels) can make it impossible to arrive at a signaling scheme which will yield an acceptable probability of error for a given application. The answer to this problem is then the use of 'error control coding', also known as 'channel coding'. In brief, "error control coding is the calculated addition of redundancy". The block diagram of a typical data transmission system is shown in Fig. 4.1.
The information source can be either a person or a machine (a digital computer). The source output, which is to be communicated to the destination, can be either a continuous waveform or a sequence of discrete symbols. The 'source encoder' transforms the source output into a sequence of binary digits, the information sequence u. If the source output happens to be continuous, this involves A-D conversion as well. The source encoder is ideally designed such that (i) the number of binits per unit time (bit rate, rb) required to represent the source output is minimized, and (ii) the source output can be uniquely reconstructed from the information sequence u.
Fig 4.1: Block diagram of a typical data transmission system
The 'channel encoder' transforms u to the encoded sequence v, in general a binary sequence, although non-binary codes can also be used for some applications. As discrete symbols are not suited for transmission over a physical channel, the code sequences are transformed to waveforms of specified durations. These waveforms, as they enter the channel, get corrupted by noise. Typical channels include telephone lines, high-frequency radio links, telemetry links, microwave links, satellite links and so on. Core and semiconductor memories, tapes, drums, disks, optical memory and so on are typical storage media. Switching impulse noise, thermal noise, cross talk and lightning are some examples of noise disturbance over a physical channel. A surface defect on a magnetic tape is a source of disturbance. The demodulator processes each received waveform and produces an output, which may be either continuous or discrete: the sequence r. The channel decoder transforms r into a binary sequence, û, which gives the estimate of u and ideally should be a replica of u. The source decoder then transforms û into an estimate of the source output and delivers this to the destination.
Error control for data integrity may be exercised by means of 'forward error correction' (FEC), wherein the decoder performs error correction operations on the received information according to the schemes devised for the purpose. There is, however, another major approach known as 'Automatic Repeat Request' (ARQ), in which a re-transmission of the ambiguous information is effected; it is also used for solving error control problems. In ARQ, error correction is not done at all. The redundancy introduced is used only for 'error detection' and, upon detection, the receiver requests a repeat transmission, which necessitates the use of a return path (feedback channel).
We will briefly discuss in this chapter the channel encoder and decoder strategies, our major
interest being in the design and implementation of the channel „encoder/decoder‟ pair to achieve fast
transmission of information over a noisy channel, reliable communication of information and reduction
of the implementation cost of the equipment.
The simplest channel results from the use of binary symbols (both as input and output). When binary coding is used, the modulator has only 0s and 1s as inputs. Similarly, the inputs to the demodulator also consist of 0s and 1s, provided binary quantization is used. If so, we say a 'hard decision' is made on the demodulator output so as to identify which symbol was actually transmitted. In this case we have a 'binary symmetric channel' (BSC). The BSC, when derived from an additive white Gaussian noise (AWGN) channel, is completely described by the transition probability p. The majority of coded digital communication systems employ binary coding with hard-decision decoding due to the simplicity of implementation offered by such an approach.
The use of hard decisions prior to decoding causes an irreversible loss of information in the receiver. To overcome this problem, 'soft-decision' decoding is used. This can be done by including a multilevel quantizer at the demodulator output, as shown in Fig. 4.2(a) for the case of binary PSK signals. The input-output characteristics and the channel transitions are shown in Fig. 4.2(b) and Fig. 4.2(c) respectively. Here the input to the demodulator has only two symbols, 0 and 1. However, the demodulator output has Q symbols. Such a channel is called a 'binary input, Q-ary output DMC'. The form of the channel transitions, and hence the performance of the demodulator, depends on the location of the representation levels of the quantizer, which in turn depends on the signal level and the variance of the noise. Therefore, the demodulator must incorporate automatic gain control if an effective multilevel quantizer is to be realized. Further, soft-decision decoding offers significant improvement in performance over hard-decision decoding.
Fig. 4.2 (a) - Receiver
"It is possible in principle to devise a means whereby a communication system will transmit information with an arbitrarily small probability of error, provided the information rate R (= r·I(X,Y), where r is the symbol rate) is less than or equal to a rate C called the 'channel capacity'." The technique used to achieve this goal is called "coding". For the special case of a BSC, the theorem tells us that if the code rate Rc (defined later) is less than the channel capacity, then it is possible to find a code that achieves error-free transmission over the channel. Conversely, it is not possible to find such a code if the code rate Rc is greater than C.

The channel coding theorem thus specifies the channel capacity as a "fundamental limit" on the rate at which reliable (error-free) transmission can take place over a DMC. Clearly, the issue that matters is not the signal-to-noise ratio (SNR), so long as it is large enough, but how the input is encoded.
The most unsatisfactory feature of Shannon's theorem is that it stresses only the "existence of good codes"; it does not tell us how to find them. So we are still faced with the task of finding a good code that ensures error-free transmission. The error-control coding techniques presented in this and subsequent chapters provide different methods of achieving this important system requirement.
Types of errors:
The errors that arise in a communication system can be viewed as 'independent errors' and 'burst errors'. The first type of error is usually caused by 'Gaussian noise', which is the chief concern in the design and evaluation of modulators and demodulators for data transmission. The possible sources are the thermal noise and shot noise of the transmitting and receiving equipment, thermal noise in the channel, and the radiation picked up by the receiving antenna. Further, in the majority of situations, the power spectral density of the Gaussian noise at the receiver input is white. The transmission errors introduced by this noise are such that an error during a particular signaling interval does not affect the performance of the system during the subsequent intervals. The discrete channel, in this case, can be modeled by a binary symmetric channel. These transmission errors due to Gaussian noise are referred to as 'independent errors' (or random errors).
The second type of error is encountered due to 'impulse noise', which is characterized by long quiet intervals followed by high-amplitude noise bursts (as in switching and lightning). A noise burst usually affects more than one symbol, and there will be dependence of errors in successive transmitted symbols. Thus, errors occur in bursts.
Types of codes:
There are mainly two types of error control coding schemes, block codes and convolutional codes, which can take care of the two types of errors mentioned above.
In a block code, the information sequence is divided into message blocks of k bits each, represented by a binary k-tuple u = (u1, u2, …, uk), and each block is called a message. The symbol u, here, is used to denote a k-bit message rather than the entire information sequence. The encoder then transforms u into an n-tuple v = (v1, v2, …, vn). Here v represents an encoded block rather than the entire encoded sequence. The blocks are independent of each other.

The encoder of a convolutional code also accepts k-bit blocks of the information sequence u and produces an n-symbol block v. Here u and v are used to denote sequences of blocks rather than a single block. Further, each encoded block depends not only on the present k-bit message block but also on m previous blocks. Hence the encoder has a memory of order m. Since the encoder has memory, implementation requires sequential logic circuits.
If the code word with n bits is to be transmitted in no more time than is required for the transmission of the k information bits, and if τb and τc are the bit durations in the uncoded and coded words, i.e., the input and output code words, then it is necessary that

n·τc = k·τb
A better way to understand the important aspects of error control coding is by way of an example. Suppose that we wish to transmit data over a telephone link that has a usable bandwidth of 4 kHz and a maximum SNR at the output of 12 dB, at a rate of 1200 bits/sec with a probability of error less than 10^-3. Further, we have a DPSK modem that can operate at speeds of 1200, 2400 and 3600 bits/sec with error probabilities 2(10^-3), 4(10^-3) and 8(10^-3) respectively. We are asked to design an error control coding scheme that would yield an overall probability of error < 10^-3. We have:

C = B log2(1 + S/N), with S/N = 12 dB (i.e., 15.85) and B = 4 kHz, so C ≈ 16,300 bits/sec.

Since Rc < C, according to Shannon's theorem we should be able to transmit data with an arbitrarily small probability of error. We shall consider two coding schemes for this problem.
(i) Error detection: single-parity-check coding. Consider the (4, 3) even-parity check code:

Message: 000 001 010 011 100 101 110 111
Parity:   0   1   1   0   1   0   0   1

This code is capable of 'detecting' all single and triple error patterns. Data comes out of the channel encoder at a rate of 3600 bits/sec, and at this rate the modem has an error probability of 8(10^-3). The decoder indicates an error only when the parity check fails. This happens for single and triple errors only.
pd = P(1 error) + P(3 errors) = 4C1 p(1 − p)^3 + 4C3 p^3(1 − p)

Expanding, we get pd = 4p − 12p^2 + 16p^3 − 8p^4

pd = 32(10^-3) − 768(10^-6) + 8192(10^-9) − 32768(10^-12) = 0.031240326 >> 10^-3
However, an error results if the decoder does not indicate any error when an error indeed has occurred. This happens when two or four errors occur. Hence the probability of a detection failure, pnd (probability of no detection), is given by:

pnd = P(X = 2) + P(X = 4) = 4C2 p^2(1 − p)^2 + 4C4 p^4(1 − p)^0 = 6p^2 − 12p^3 + 7p^4 ≈ 3.78(10^-4)
(ii) Error correction: the triplets 000 and 111 are transmitted whenever 0 and 1 are inputted. Majority-logic decoding, as shown below, is employed, assuming only single errors:

Received: 000 001 010 100 111 110 101 011
Output:    0   0   0   0   1   1   1   1

The probability of a decoding error is pde = P(2 or 3 bits in error) = 3p^2(1 − p) + p^3 ≈ 1.91(10^-4), while the probability of no detection, pnd = P(all 3 bits in error) = p^3 = 512 x 10^-9 << pde.

In general, observe that the probability of no detection, pnd, is very much smaller than the probability of a decoding error, pde.
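The probabilities quoted for both schemes can be verified with a few lines of Python (a sketch of the arithmetic above):

```python
from math import comb

p = 8e-3   # modem bit error probability at 3600 bits/sec

# (4,3) even-parity code: the parity check fails on odd numbers of errors
pd  = comb(4, 1) * p * (1 - p)**3 + comb(4, 3) * p**3 * (1 - p)   # detected
pnd = comb(4, 2) * p**2 * (1 - p)**2 + comb(4, 4) * p**4          # undetected
print(f"detected: {pd:.6f}, undetected: {pnd:.2e}")   # ~0.031240, ~3.78e-4

# (3,1) repetition code with majority-logic decoding
pde  = comb(3, 2) * p**2 * (1 - p) + p**3    # 2 or 3 bits in error
pnd3 = p**3                                  # all 3 bits in error
print(f"decoding error: {pde:.2e}, no detection: {pnd3:.2e}")  # ~1.91e-4, 5.12e-7
```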
The preceding examples illustrate the following aspects of error control coding. Note that in both examples, without error control coding, the probability of error of the modem is 8(10^-3).
1. It is possible to detect and correct errors by adding extra bits (the check bits) to the message sequence. Because of this, not all sequences will constitute bona fide messages.
3. Addition of check bits reduces the effective data rate through the channel.
4. Since the probability of no detection is always very much smaller than the decoding error probability, it appears that error detection schemes, which do not reduce the rate efficiency as much as error correcting schemes do, are well suited for our application. However, since error detection schemes always go with ARQ techniques, when the speed of communication becomes a major concern, forward error correction (FEC) using error correcting schemes would be desirable.
Block codes:
We shall assume that the output of an information source is a sequence of binary digits. In 'block coding' this information sequence is segmented into 'message' blocks of fixed length, say k. Each message block, denoted by u, then consists of k information digits. The encoder transforms these k-tuples into blocks of code words v, each an n-tuple, 'according to certain rules'. Clearly, corresponding to the 2^k information blocks possible, we would then have 2^k code words of length n > k. This set of 2^k code words is called a 'block code'. For a block code to be useful, these 2^k code words must be distinct, i.e., there should be a one-to-one correspondence between u and v. u and v are also referred to as the 'input vector' and 'code vector' respectively. Notice that the encoding equipment must be capable of storing the 2^k code words of length n > k. Accordingly, the complexity of the equipment would become prohibitive if n and k become large, unless the code words have a special structural property conducive to storage and mechanization. This structural property is 'linearity'.
A block code is said to be a linear (n, k) code if and only if its 2^k code words form a k-dimensional subspace of the vector space of all n-tuples over the field GF(2).
Fields with 2^m symbols are called 'Galois fields' (pronounced 'Galwa fields'), GF(2^m). Their arithmetic involves binary additions and subtractions. For two-valued variables (0, 1), modulo-2 addition and multiplication are defined in Fig. 4.3.
Fig 4.3
The binary alphabet (0, 1) is called a field of two elements (a binary field) and is denoted by GF(2). (Notice that ⊕ represents the EX-OR operation and · represents the AND operation.) Further, in binary arithmetic, −X = X and X − Y = X ⊕ Y. Similarly, for 3-valued variables, modulo-3 arithmetic can be specified, as shown in Fig. 4.4. However, for brevity, while representing polynomials involving binary addition we use + instead of ⊕, and there shall be no confusion about such usage.
Polynomials f(X) with 1 or 0 as the coefficients can be manipulated using the above relations. The arithmetic of GF(2^m) can be derived using a polynomial of degree m with binary coefficients, together with a new variable α called the primitive element, such that p(α) = 0. When p(X) is irreducible (i.e., it does not have a factor of degree less than m and greater than 0; for example, X^3 + X^2 + 1, X^3 + X + 1, X^4 + X^3 + 1, X^5 + X^2 + 1 are irreducible polynomials, whereas f(X) = X^4 + X^3 + X^2 + 1 is not, as f(1) = 0 and hence it has a factor X + 1), then p(X) is said to be a 'primitive polynomial'.
If Vn represents the vector space of all n-tuples, then a subset S of Vn is called a subspace if (i) the all-zero vector is in S, and (ii) the sum of any two vectors in S is also a vector in S. To be more specific, a block code is said to be linear if the following is satisfied: "If v1 and v2 are any two code words of length n of the block code, then v1 ⊕ v2 is also a code word of length n of the block code."

Observe the linearity property: with v3 = (010 101) and v4 = (100 011), v3 ⊕ v4 = (110 110) = v7.
Remember that n represents the word length of the code words and k represents the number
of information digits and hence the block code is represented as (n, k) block code.
Thus, by the definition of a linear block code, it follows that if g1, g2, …, gk are k linearly independent code words, then every code vector v of our code is a linear combination of these code words, i.e.,

v = u1 g1 + u2 g2 + … + uk gk …………… (4.1)

where uj = 0 or 1, 1 ≤ j ≤ k.

Eq. (4.1) can be arranged in matrix form by noting that each gj is an n-tuple, i.e., v = u·G, where G is the k × n matrix whose rows are g1, g2, …, gk.
Notice that any k linearly independent code words of an (n, k) linear code can be used to form
a Generator matrix for the code. Thus it follows that an (n, k) linear code is completely specified by
the k-rows of the generator matrix. Hence the encoder need only to store k rows of G and form linear
combination of these rows based on the input message u.
Example 4.2: The (6, 3) linear code of Example 4.1 has the following generator matrix:

      g1     1 0 0 0 1 1
G  =  g2  =  0 1 0 1 0 1
      g3     0 0 1 1 1 0

For the message u = (0 1 1), v = g2 ⊕ g3 = (0 1 1 0 1 1).
“v can be computed simply by adding those rows of G which correspond to the locations of
1`s of u.”
A desirable property of linear block codes is the 'systematic structure'. Here a code word is divided into two parts: the message part and the redundant part. If either the first k digits or the last k digits of the code word correspond to the message part, then we say that the code is a 'systematic block code'. We shall consider systematic codes as depicted in Fig. 4.5.
The parity digits are linear combinations of the message digits:

v(k+j) = u1 p1j + u2 p2j + … + uk pkj,  j = 1, 2, …, n − k ……… (4.6b)

In matrix form, v = u·G with G = [Ik  P], where P = [pij] is the k × (n − k) matrix of parity coefficients p11, p12, …, pk,n−k.

Accordingly, for any k × n matrix G with k linearly independent rows, there exists an (n − k) × n matrix H with (n − k) linearly independent rows such that any vector in the row space of G is orthogonal to the rows of H, and any vector that is orthogonal to the rows of H is in the row space of G ('T' will denote transposition). Therefore, we can describe an (n, k) linear code generated by G alternatively as follows:
"An n-tuple v is a code word generated by G if and only if v·H^T = O." ……… (4.9a)
(O represents an all-zero row vector.)
This matrix H is called a 'parity check matrix' of the code. Its dimension is (n − k) × n.
If the generator matrix has the systematic format, the parity check matrix takes the following form:

                     p11     p21     ...  pk,1      1 0 0 ... 0
H = [P^T  I(n−k)] =  p12     p22     ...  pk,2      0 1 0 ... 0    ……… (4.10)
                     ⁝       ⁝            ⁝
                     p1,n−k  p2,n−k  ...  pk,n−k    0 0 0 ... 1
The product of the i-th row of G and the j-th row of H is

gi·hj = (0 0 … 1 … 0  pi,1 pi,2 … pi,n−k) · (p1,j p2,j … pk,j  0 0 … 1 … 0)^T = pij + pij = 0

where the left-hand 1 is in the i-th position and the right-hand 1 is in the (k + j)-th position (the pij are either 0 or 1, and in modulo-2 arithmetic X + X = 0).
Further, since the (n − k) rows of the matrix H are linearly independent, the H matrix of Eq. (4.10) is a parity check matrix of the (n, k) linear systematic code generated by G. Notice that the parity check equations of Eq. (4.6b) can also be obtained from the parity check matrix using the fact that v·H^T = O.
where pi = (u1 p1,i + u2 p2,i + … + uk pk,i) are the parity bits found from Eq. (4.6b).

Now

H^T = [   P    ]
      [ I(n−k) ]

so that

v·H^T = [u1 p11 + u2 p21 + … + uk pk1 + p1,  u1 p12 + u2 p22 + … + uk pk2 + p2,  …,  u1 p1,n−k + u2 p2,n−k + … + uk pk,n−k + pn−k]

Each component is pi + pi = 0 in modulo-2 arithmetic; thus v·H^T = O. This statement implies that an n-tuple v is a code word generated by G if and only if

v·H^T = O

If this is to be true for any arbitrary message vector u, then it implies G·H^T = O, the k × (n − k) all-zero matrix.
Example 4.3:
Consider the generator matrix of Example 4.2; the corresponding parity check matrix is

       0 1 1 1 0 0
H  =   1 0 1 0 1 0
       1 1 0 0 0 1
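The orthogonality relation G·H^T = O for this pair of matrices is easy to verify numerically, and enumerating v = u·G lists all eight code words (a Python sketch using numpy):

```python
import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])
H = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])

# Every row of G is orthogonal to every row of H (mod 2), i.e. G.H^T = O
print(not ((G @ H.T) % 2).any())        # -> True

# Enumerate all 2^3 code words v = u.G
for u in product([0, 1], repeat=3):
    print(u, (np.array(u) @ G) % 2)
```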
The implementation of block codes is very simple: we need only combinational logic circuits. An implementation of Eq. (4.6) is shown in the encoding circuit of Fig. 4.6. Notice that pij is either a 0 or a 1, and accordingly pij indicates a connection if pij = 1 (otherwise no connection). The encoding operation is very simple. The message u = (u1, u2, …, uk) to be encoded is shifted into the message register and simultaneously into the channel via the commutator. As soon as the entire message has entered the message register, the parity check digits are formed using modulo-2 adders, which may be serialized using another shift register (the parity register) and shifted into the channel. Notice that the complexity of the encoding circuit is directly proportional to the block length of the code. The encoding circuit for the (6, 3) block code of Example 4.2 is shown in Fig. 4.7.
Fig 4.6 Encoding circuit for systematic block code
Fig 4.7 Encoder for the (6,3) block code of example 4.2
Suppose v = (v1, v2, …, vn) is a code word transmitted over a noisy channel and let r = (r1, r2, …, rn) be the received vector. Clearly, r may be different from v owing to the channel noise. The vector sum

e = r ⊕ v = (e1, e2, …, en) …………… (4.12)

is an n-tuple, where ej = 1 if rj ≠ vj and ej = 0 if rj = vj. This n-tuple is called the 'error vector' or 'error pattern'. The 1s in e are the transmission errors caused by the channel noise. Hence from Eq. (4.12) it follows that r = v ⊕ e. At the receiver we compute

s = r·H^T …………… (4.13)
  = (s1, s2, …, sn−k)

It then follows from Eq. (4.9a) that s = O if and only if r is a code word, and s ≠ O if and only if r is not a code word. This vector s is called 'the syndrome' (a term used in medical science referring to a collection of symptoms characterizing a disease). Thus if s = O, the receiver accepts r as a valid code word. Notice that there are possibilities of undetected errors, which happen when e is identical to a
word. Notice that there are possibilities of errors undetected, which happens when e is identical to a
nonzero code word. In this case r is the sum of two code words which according to our linearity
property is again a code word. This type of error pattern is referred to an “undetectable error pattern”.
Since there are 2k -1 nonzero code words, it follows that there are 2k -1 error patterns as well. Hence
when an undetectable error pattern occurs the decoder makes a “decoding error”.
Eq. (4.13) can be expanded as in Eq. (4.14). A careful examination of Eq. (4.14) reveals the following point: the syndrome is simply the vector sum of the received parity digits (rk+1, rk+2, …, rn) and the parity check digits recomputed from the received information digits (r1, r2, …, rk). Thus, we can form the syndrome by a circuit exactly similar to that of Fig. 4.6; a general syndrome circuit is shown in Fig. 4.8.
Example 4.4:
We shall compute the syndrome for the (6, 3) systematic code of Example 4.2. We have

                                            [ 0 1 1 ]
                                            [ 1 0 1 ]
s = (s1, s2, s3) = (r1, r2, r3, r4, r5, r6) [ 1 1 0 ]
                                            [ 1 0 0 ]
                                            [ 0 1 0 ]
                                            [ 0 0 1 ]

or

s1 = r2 + r3 + r4
s2 = r1 + r3 + r5
s3 = r1 + r2 + r6
Fig 4.8 Syndrome circuit for the (n,k) Linear systematic block code
Fig 4.9 Syndrome circuit for the (6,3) systematic block code
s = r·H^T = (v ⊕ e)·H^T
  = v·H^T ⊕ e·H^T
or s = e·H^T …………… (4.15)

since v·H^T = O. Eq. (4.15) indicates that the syndrome depends only on the error pattern and not on the transmitted code word v. For a linear systematic code, then, we have the following relationship between the syndrome digits and the error digits:
s1 = e1 p11 + e2 p21 + … + ek pk,1 + ek+1
s2 = e1 p12 + e2 p22 + … + ek pk,2 + ek+2
⁝                                            …………… (4.16)
sn−k = e1 p1,n−k + e2 p2,n−k + … + ek pk,n−k + en
Thus, the syndrome digits are linear combinations of error digits. Therefore they must provide
us information about the error digits and help us in error correction.
Notice that Eq. (4.16) represents (n − k) linear equations in n error digits: an under-determined set of equations. Accordingly, it is not possible to have a unique solution for the set. Since the rank of the H matrix is (n − k), there are 2^k non-trivial solutions. In other words, there exist 2^k error patterns that result in the same syndrome. Therefore, determining the true error pattern is not an easy task.
Example 4.5:
For the (6, 3) code considered in Example 4.2, suppose the received vector is r = (0 1 1 1 0 1). Its syndrome is s = (1 1 0), so the error patterns satisfy the following equations:

e2 + e3 + e4 = 1
e1 + e3 + e5 = 1
e1 + e2 + e6 = 0

There are 2^3 = 8 error patterns that satisfy the above equations. They are:

{0 0 1 0 0 0, 1 1 0 0 0 0, 0 0 0 1 1 0, 0 1 0 0 1 1, 1 0 0 1 0 1, 0 1 1 1 0 1, 1 0 1 0 1 1, 1 1 1 1 1 0}
To minimize the decoding error, the 'most probable error pattern' that satisfies Eq. (4.16) is chosen as the true error vector. For a BSC, the most probable error pattern is the one that has the smallest number of nonzero digits. In Example 4.5, notice that the error vector (0 0 1 0 0 0) has the smallest number of nonzero components and hence can be regarded as the most probable error vector. Then, using Eq. (4.12), we have

v̂ = r ⊕ e = (0 1 1 1 0 1) + (0 0 1 0 0 0) = (0 1 0 1 0 1)
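The whole of Example 4.5 can be replayed in a few lines: compute the syndrome, match it against the columns of H (the single-error assumption), and correct (a Python sketch):

```python
import numpy as np

H = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])

r = np.array([0, 1, 1, 1, 0, 1])       # received vector from Example 4.5
s = (r @ H.T) % 2
print("syndrome:", s)                   # -> [1 1 0]

# Single-error assumption: the syndrome equals one column of H
e = np.zeros(6, dtype=int)
for j in range(6):
    if np.array_equal(H[:, j], s):
        e[j] = 1
        break
print("error pattern:", e)              # -> [0 0 1 0 0 0]
print("decoded word :", (r + e) % 2)    # -> [0 1 0 1 0 1]
```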
The concept of distance between code words and single-error-correcting codes was first developed by R. W. Hamming. Let the n-tuples

v = (v1, v2, …, vn) and u = (u1, u2, …, un)

be two code words. The 'Hamming distance' d(v, u) between such a pair of code vectors is defined as the number of positions in which they differ. Alternatively, using modulo-2 arithmetic, we have

d(v, u) = Σ (vj ⊕ uj), summed over j = 1 to n ……………………. (4.17)

(Notice that Σ represents the usual decimal summation and ⊕ is the modulo-2 sum, the EX-OR function.)
The 'Hamming weight' ω(v) of a code vector v is defined as the number of nonzero elements in the code vector. Equivalently, the Hamming weight of a code vector is the distance between the code vector and the all-zero code vector.

For example, take v = (0 1 1 1 0 1) and u = (1 0 1 0 1 1). Notice that the two vectors differ in 4 positions and hence d(v, u) = 4. Using Eq. (4.17) we find

d(v, u) = (0 ⊕ 1) + (1 ⊕ 0) + (1 ⊕ 1) + (1 ⊕ 0) + (0 ⊕ 1) + (1 ⊕ 1) = 1 + 1 + 0 + 1 + 1 + 0 = 4
The 'minimum distance' of a linear block code is defined as the smallest Hamming distance between any pair of code words in the code; equivalently, the minimum distance is the smallest Hamming weight of the difference between any pair of code words. Since in linear block codes the sum or difference of two code vectors is also a code vector, it follows that "the minimum distance of a linear block code is the smallest Hamming weight of the nonzero code vectors in the code".
The Hamming distance is a metric function that satisfies the triangle inequality. Let u, v and w be three code vectors of a linear block code. Then

d(u, v) + d(v, w) ≥ d(u, w)

Similarly, the minimum distance of a linear block code C may be mathematically represented as

dmin = min{ d(u, v) : u, v in C, u ≠ v } = min{ ω(v) : v in C, v ≠ 0 } = ωmin

That is, dmin = ωmin. The parameter ωmin is called the 'minimum weight' of the linear code C. The minimum distance of a code, dmin, is related to the parity check matrix H of the code in a fundamental way. Suppose v is a code word. Then from Eq. (4.9a) we have:
0 = v·H^T = v1h1 ⊕ v2h2 ⊕ … ⊕ vnhn

Here h1, h2, …, hn represent the columns of the H matrix. Let vj1, vj2, …, vjl be the l nonzero components of v, i.e., vj1 = vj2 = … = vjl = 1. Then it follows that

hj1 ⊕ hj2 ⊕ … ⊕ hjl = 0 …………… (4.22)

That is, "if v is a code vector of Hamming weight l, then there exist l columns of H such that the vector sum of these columns is equal to the zero vector". Conversely, suppose we form a binary n-tuple of weight l, viz. x = (x1, x2, …, xn), whose nonzero components are xj1, xj2, …, xjl. Consider the product:

x·H^T = x1h1 ⊕ x2h2 ⊕ … ⊕ xnhn = xj1hj1 ⊕ xj2hj2 ⊕ … ⊕ xjlhjl = hj1 ⊕ hj2 ⊕ … ⊕ hjl

If Eq. (4.22) holds, it follows that x·H^T = O and hence x is a code vector. Therefore, we conclude that "if there are l columns of the H matrix whose vector sum is the zero vector, then there exists a code vector of Hamming weight l".
From the above discussions, it follows that:
i) If no (d − 1) or fewer columns of H add to O^T, the all-zero column vector, the code has a minimum weight of at least d.
ii) The minimum weight (or minimum distance) of a linear block code C is the smallest number of columns of H that sum to the all-zero column vector.
For the H matrix of Example 4.3, i.e.,

       0 1 1 1 0 0
H  =   1 0 1 0 1 0
       1 1 0 0 0 1

notice that all columns of H are nonzero and distinct. Hence no two or fewer columns sum to the zero vector, and so the minimum weight of the code is at least 3. Further, notice that the 1st, 2nd and 3rd columns sum to O^T. Thus the minimum weight of the code is 3. We see that the minimum weight of the code is indeed 3 from the table of Example 4.1.
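Since the minimum distance of a linear code equals the smallest nonzero code-word weight, it can also be found by brute force over the 2^3 messages (a Python sketch):

```python
import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])

# d_min = smallest Hamming weight over all nonzero code words
weights = [int(((np.array(u) @ G) % 2).sum())
           for u in product([0, 1], repeat=3) if any(u)]
print(min(weights))   # -> 3, so the code corrects t = 1 error
```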
The minimum distance, dmin, of a linear block code is an important parameter of the code. To be more specific, it is the one that determines the error-correcting capability of the code. To understand this, we shall consider a simple example. Suppose we consider 3-bit code words plotted at the vertices of the cube, as shown in Fig. 4.10.
Fig 4.10 The distance concept
Clearly, if the code words used are {000, 101, 110, 011}, the Hamming distance between the words is 2. Notice that any single error in the received words locates them on vertices of the cube which are not code words, and hence may be recognized as single errors. The code word pairs with Hamming distance 3 are: (000, 111), (100, 011), (101, 010) and (001, 110). If the code word (000) is received as (100), (010) or (001), observe that these are nearer to (000) than to (111). Hence the decision is made that the transmitted word is (000).
Suppose an (n, k) linear block code is required to detect and correct all error patterns (over a BSC) whose Hamming weight is ω(e) ≤ t. That is, if we transmit a code vector v and the received vector is r = v ⊕ e, we want the decoder output to be v̂ = v subject to the condition ω(e) ≤ t.

Further, assume that the 2^k code vectors are transmitted with equal probability. The best decision for the decoder then is to pick the code vector nearest to the received vector, i.e., the one for which the Hamming distance d(v, r) is smallest. With such a strategy the decoder will be able to detect and correct all error patterns of Hamming weight ω(e) ≤ t provided that the minimum distance of the code is such that dmin ≥ 2t + 1.

dmin is either odd or even. Let t be a positive integer such that

2t + 1 ≤ dmin ≤ 2t + 2

Suppose w is any other code word of the code. Then the Hamming distances among v, r and w satisfy the triangle inequality:

d(v, r) + d(w, r) ≥ d(v, w)

Combining the above with the fact that d(v, r) = ω(e) ≤ t and d(v, w) ≥ dmin ≥ 2t + 1, it follows that

d(w, r) ≥ t + 1 > d(v, r) …………… (4.28)

Eq. (4.28) says that if an error pattern of t or fewer errors occurs, the received vector r is closer (in Hamming distance) to the transmitted code vector v than to any other code vector w of the code. For a BSC, this means P(r|v) > P(r|w) for v ≠ w. Thus, based on the maximum likelihood decoding scheme, r is decoded as v, which indeed is the actual transmitted code word; this results in correct decoding and thus the errors are corrected.
On the contrary, the code is not capable of correcting all error patterns of weight l > t. To show this we proceed as below:

Suppose d(v, w) = dmin, and let e1 and e2 be two error patterns such that:
i) e1 ⊕ e2 = v ⊕ w
ii) e1 and e2 have no nonzero components in common.
Clearly then ω(e1) + ω(e2) = ω(v ⊕ w) = d(v, w) = dmin.

Suppose v is the transmitted code vector and is corrupted by the error pattern e1. Then the received vector is

r = v ⊕ e1 …………… (4.30)

and d(v, r) = ω(e1), while d(w, r) = ω(w ⊕ v ⊕ e1) = ω(e2) = dmin − ω(e1). If ω(e1) = l > t, then since dmin ≤ 2t + 2 we can have d(w, r) ≤ d(v, r). This inequality says that there exists an error pattern of l > t errors which results in a received vector closer to an incorrect code vector; i.e., based on the maximum likelihood decoding scheme, a decoding error will be committed.
To make the point clear, we shall give yet another illustration. The code vectors and the received vectors may be represented as points in an n-dimensional space. Suppose we construct two spheres, each of radius t, around the points that represent the code vectors v and w. Further, let these two spheres be mutually exclusive, or disjoint, as shown in Fig. 4.11(a). For this condition to be satisfied, we require d(v, w) ≥ 2t + 1. In such a case, if d(v, r) ≤ t, it is clear that the decoder will pick v as the transmitted vector.

Fig. 4.11(a)

On the other hand, if d(v, w) ≤ 2t, the two spheres around v and w intersect, and if r is located as in Fig. 4.11(b) and v is the transmitted code vector, it follows that even if d(v, r) ≤ t, r may be as close to w as it is to v. The decoder can then pick w as the transmitted vector, which is wrong. Thus it is evident that "an (n, k) linear block code has the power to correct all error patterns of weight t or less if and only if d(v, w) ≥ 2t + 1 for all v and w". However, since the smallest distance between any pair of code words is the minimum distance of the code, dmin 'guarantees' correction of all error patterns of weight

t ≤ ⌊(dmin − 1)/2⌋ …………… (4.35)

where ⌊(dmin − 1)/2⌋ denotes the largest integer no greater than (dmin − 1)/2. The parameter t = ⌊(dmin − 1)/2⌋ is called the 'random-error-correcting capability' of the code, and the code is referred to as a 't-error-correcting code'. The (6, 3) code of Example 4.1 has a minimum distance of 3, and from Eq. (4.35) it follows that t = 1, which means it is a 'single error correcting' (SEC) code. It is capable of correcting any single-error pattern over a block of six digits.
For an (n, k) linear code, observe that there are 2^(n−k) syndromes, including the all-zero syndrome. Each syndrome corresponds to a specific error pattern. If j is the number of error locations in the n-dimensional error pattern e, we find that, in general, there are nCj = C(n, j) such error patterns. It then follows that the total number of all possible error patterns is

Σ C(n, j), summed over j = 0 to t,

where t is the maximum number of error locations in e. Thus we arrive at an important conclusion: "If an (n, k) linear block code is to be capable of correcting up to t errors, the total number of syndromes shall not be less than the total number of all possible error patterns", i.e.,

2^(n−k) ≥ Σ C(n, j), j = 0 to t ………………………. (4.36)

Eq. (4.36) is usually referred to as the 'Hamming bound'. A binary code for which the Hamming bound turns out to be an equality is called a 'perfect code'.
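Eq. (4.36) is easy to check numerically. The sketch below confirms the bound for the (6, 3) code and shows that the (7, 4) Hamming code meets it with equality, i.e., it is a perfect code:

```python
from math import comb

def hamming_bound_ok(n, k, t):
    """Check 2^(n-k) >= sum over j = 0..t of C(n, j)  -- Eq. (4.36)."""
    return 2 ** (n - k) >= sum(comb(n, j) for j in range(t + 1))

print(hamming_bound_ok(6, 3, 1))    # True: 8 >= 1 + 6 = 7
print(2 ** (7 - 4) == sum(comb(7, j) for j in range(2)))   # True: 8 == 1 + 7
```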
Suppose vj, j = 1, 2, …, 2^k, are the 2^k distinct code vectors of an (n, k) linear block code. Correspondingly, for any error pattern e, define the 2^k distinct vectors ej by

ej = e ⊕ vj, j = 1, 2, …, 2^k ………………………. (4.37)

The set of vectors {ej, j = 1, 2, …, 2^k} so defined is called a 'co-set' of the code. That is, a co-set contains exactly 2^k elements that differ at most by a code vector. It then follows that there are 2^(n−k) co-sets for an (n, k) linear block code. Post-multiplying Eq. (4.37) by H^T, we find

ej·H^T = e·H^T ⊕ vj·H^T = e·H^T ………………………. (4.38)

Notice that the RHS of Eq. (4.38) is independent of the index j, as for any code word the term vj·H^T = O. From Eq. (4.38) it is clear that "all error patterns that differ at most by a code word have the same syndrome". That is, each co-set is characterized by a unique syndrome.
Since the received vector r may be any of the 2^n n-tuples, no matter what the transmitted code word was, observe that we can use Eq. (4.38) to partition the 2^n received vectors into 2^k disjoint subsets and try to identify the received vector. This will be done by preparing what is called the 'standard array'. The steps involved are as below:

Step 1: Place the 2^k code vectors of the code in a row, with the all-zero vector v1 = (0, 0, 0, …, 0) = O as the first (left-most) element.
Step 2: From among the remaining (2^n − 2^k) n-tuples, e2 is chosen and placed below the all-zero vector v1. The second row can now be formed by placing (e2 ⊕ vj), j = 2, 3, …, 2^k, under vj.
Step 3: Now take an unused n-tuple e3 and complete the 3rd row as in Step 2.
Step 4: Continue the process until all the n-tuples are used.
Since the code vectors vj are all distinct, the vectors in any row of the array are also distinct. For, if two n-tuples in the l-th row were identical, say el ⊕ vj = el ⊕ vm with j ≠ m, we would have vj = vm, which is impossible. Thus it follows that "no two n-tuples in the same row of a standard array are identical".

Next, let us suppose that an n-tuple appears in both the l-th row and the m-th row. Then for some j1 and j2 this implies el ⊕ vj1 = em ⊕ vj2, which then implies el = em ⊕ (vj2 ⊕ vj1) (remember that X ⊕ X = 0 in modulo-2 arithmetic), or el = em ⊕ vj3 for some j3. Since, by the property of linear block codes, vj3 is also a code word, this implies, by the construction rules given, that el must appear in the m-th row, which is a contradiction of our steps, as the first element of the m-th row is em, an n-tuple unused in the previous rows. This clearly demonstrates another important property of the array: "every n-tuple appears in one and only one row".
From the above discussion it is clear that there are 2^(n−k) disjoint rows, or co-sets, in the standard array, and each row, or co-set, consists of 2^k distinct entries. The first n-tuple of each co-set (i.e., the entry in the first column) is called the 'co-set leader'. Notice that any element of a co-set can be used as its co-set leader; this does not change the elements of the co-set, it results simply in a permutation.
Suppose Dj^T is the j-th column of the standard array. Then it follows that

Dj^T = {vj, e2 ⊕ vj, e3 ⊕ vj, …, e2^(n−k) ⊕ vj} …………… (4.39)

where vj is a code vector and e2, e3, …, e2^(n−k) are the co-set leaders.

The 2^k disjoint columns D1^T, D2^T, …, D2^k^T can now be used for decoding the code. If vj is the transmitted code word over a noisy channel, it follows from Eq. (4.39) that the received vector r is in Dj^T if the error pattern caused by the channel is a co-set leader. If this is the case, r will be decoded correctly as vj. If not, an erroneous decoding will result: any error pattern ê which is not a co-set leader must be in some co-set and under some nonzero code vector, say in the i-th co-set and under vs ≠ 0. Then r = vj ⊕ ê = ei ⊕ (vj ⊕ vs), so r lies in the column of the code word (vj ⊕ vs) and is decoded as vj ⊕ vs ≠ vj, a decoding error.
So, from the above discussion, it follows that in order to minimize the probability of a decoding error, the "most likely to occur" error patterns should be chosen as co-set leaders. For a BSC, an error pattern of smaller weight is more probable than one of larger weight. Accordingly, when forming a standard array, error patterns of smallest weight should be chosen as co-set leaders. Then the decoding based on the standard array is 'minimum distance decoding' (maximum likelihood decoding). This can be demonstrated as below.

Suppose a received vector r is found in the j-th column and l-th row of the array. Then r will be decoded as vj. We have d(r, vj) = ω(r ⊕ vj) = ω(el), where we have assumed that vj indeed is the transmitted code word. Let vs be any other code word, other than vj. Then d(r, vs) = ω(r ⊕ vs) = ω(el ⊕ vj ⊕ vs) = ω(el ⊕ vi) for some code word vi, and since el is chosen to have the smallest weight in its co-set, d(r, vs) ≥ d(r, vj). Hence r is closest to vj.
Suppose a0, a1, a2, …, an denote the number of co-set leaders with weights 0, 1, 2, …, n. This set of numbers is called the 'weight distribution' of the co-set leaders. Since a decoding error will occur if and only if the error pattern is not a co-set leader, the probability of a decoding error for a BSC with error probability (transition probability) p is given by

P(E) = 1 − Σ ai p^i (1 − p)^(n−i), summed over i = 0 to n …………… (4.40)
Example 4.8:
For the (6, 3) linear block code of Example 4.1 the standard array, along with the syndrome
table, is as below:
The weight distribution of the co-set leaders in the array shown is a0 = 1, a1 = 6, a2 = 1, a3 = a4 = a5 = a6 = 0. From Eq (5.40) it then follows:
P(E) = 1 − [(1 − p)^6 + 6p(1 − p)^5 + p^2(1 − p)^4]
A received vector (010001) will be decoded as (010101), and a received vector (100110) will be decoded as (110110).
Notice that an (n, k) linear code is capable of detecting (2^n − 2^k) error patterns, while it is capable of correcting only 2^(n−k) error patterns. Further, as n becomes large, 2^(n−k)/(2^n − 2^k) becomes smaller, and hence the probability of a decoding error will be much higher than the probability of an undetected error.
Let us turn our attention to Eq (5.35) and arrive at an interpretation. Let x1 and x2 be two n-tuples of weight t or less. Then the weight of their sum satisfies
w(x1 ⊕ x2) ≤ w(x1) + w(x2) ≤ 2t < dmin
Suppose x1 and x2 are in the same co-set; then (x1 ⊕ x2) must be a nonzero code vector of the code. This is impossible, because the weight of (x1 ⊕ x2) is less than the minimum weight of the code. Therefore, "no two n-tuples whose weights are less than or equal to t can be in the same co-set of the code, and all such n-tuples can be used as co-set leaders".
Further, if v is a minimum weight code vector, i.e. w(v) = dmin, and if the n-tuples x1 and x2 satisfy the following two conditions:
i) x1 ⊕ x2 = v
ii) x1 and x2 do not have nonzero components in common positions,
then it follows from the definition that x1 and x2 must be in the same co-set, and w(x1) + w(x2) = w(v) = dmin. Suppose we choose x2 such that w(x2) = t + 1. Since 2t + 1 ≤ dmin ≤ 2t + 2, we have w(x1) = t or (t + 1). If x1 is used as a co-set leader then x2 cannot be a co-set leader.
The above discussions may be summarized by saying: "For an (n, k) linear block code with minimum distance dmin, all n-tuples of weight t = ⌊(dmin − 1)/2⌋ or less can be used as co-set leaders of a standard array. Further, if all the n-tuples of weight ≤ t are used as co-set leaders, there is at least one n-tuple of weight (t + 1) that cannot be used as a co-set leader."
These discussions once again re-confirm the fact that an (n, k) linear code is capable of correcting error patterns of ⌊(dmin − 1)/2⌋ or fewer errors, but is incapable of correcting all the error patterns of weight (t + 1).
We have seen in Eq (4.38) that each co-set is characterized by a unique syndrome; there is a one-to-one correspondence between a co-set leader (a correctable error pattern) and a syndrome. These relationships can be used in preparing a decoding table made up of the 2^(n−k) co-set leaders and their corresponding syndromes. This table is either stored or wired in the receiver. The following are the steps in decoding:
Step 1: Compute the syndrome s = r·H^T.
Step 2: Locate the co-set leader ej whose syndrome is s. Then ej is assumed to be the error pattern caused by the channel.
Step 3: Decode the received vector r as v̂ = r ⊕ ej.
This decoding scheme is called the “Syndrome decoding” or the “Table look up decoding”.
Observe that this decoding scheme is applicable to any linear (n, k) code, i.e., it need not necessarily
be a systematic code. However, as (n-k) becomes large the implementation becomes difficult and
impractical as either a large storage or a complicated logic circuitry will be required.
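For a small code, however, the table is tiny, and the three decoding steps translate directly into a table look-up. The Python sketch below assumes the parity check matrix H = [P^T ⁝ I3] of the (6, 3) code used earlier (an assumption consistent with Example 4.1); with it, the two received vectors of Example 4.8 decode exactly as quoted above. Names are illustrative only.

```python
from itertools import product

# Parity check matrix assumed for the (6, 3) code: H = [P^T : I3]
H = [[0, 1, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]
n, n_k = 6, 3

def syndrome(r):
    # Step 1: s = r.H^T (modulo-2)
    return tuple(sum(r[j] * H[i][j] for j in range(n)) % 2 for i in range(n_k))

# Decoding table: map each syndrome to its lowest-weight error pattern
table = {}
for e in sorted(product([0, 1], repeat=n), key=sum):
    table.setdefault(syndrome(e), e)   # first (lowest-weight) pattern wins

def decode(r):
    e = table[syndrome(r)]                        # Step 2: co-set leader
    return tuple(r[j] ^ e[j] for j in range(n))   # Step 3: v = r + e

print(decode((0, 1, 0, 0, 0, 1)))  # -> (0, 1, 0, 1, 0, 1)
print(decode((1, 0, 0, 1, 1, 0)))  # -> (1, 1, 0, 1, 1, 0)
```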
For implementation of the decoding scheme, one may regard the decoding table as the truth table of n switching functions:
êi = fi(s1, s2, …, sn−k), i = 1, 2, …, n
where s1, s2, …, sn−k are the syndrome digits, regarded as the switching variables, and ê1, ê2, …, ên are the estimated error digits. These functions can be realized by using suitable combinatorial logic circuits as indicated in Fig 4.13.
Fig. 4.13 General Decoding scheme for an (n,k) linear block code
Example 4.9:
From the standard array for the (6, 3) linear block code of Example 4.8, the following truth table can
be constructed.
The two shaded portions of the truth table are to be observed carefully. The top one corresponds to the all-zero error pattern and the bottom one corresponds to a double error pattern, which cannot be corrected by this code. From the table we can now write expressions for the correctable single error patterns as below:
1) Notice that for every correctable single error pattern the syndrome is identical to a column of the H matrix, and it indicates that the received vector is in error in the corresponding column position.
For example, if the received vector is (010001), then the syndrome is (100). This is identical with the 4th column of the H matrix, and hence the 4th position of the received vector is in error; the corrected vector is (010101). Similarly, for a received vector (100110) the syndrome is (101), and this is identical with the second column of the H matrix. Thus the second position of the received vector is in error and the corrected vector is (110110).
2) A table can be prepared relating the error locations and the syndromes. By suitable combinatorial circuits, data recovery can be achieved. For the (6, 3) systematic linear code we have the following table for r = (r1 r2 r3 r4 r5 r6).
Notice that for the systematic encoding considered by us (r1 r2 r3) corresponds to the data digits and
(r4 r5 r6) are the parity digits.
Hence the circuit of Fig 4.13 can be modified to achieve data recovery by removing the connections of the outputs v̂4, v̂5 and v̂6.
Hamming Codes:
Hamming code is the first class of linear block codes devised for error correction. The single error correcting (SEC) Hamming codes are characterized by the following parameters:
Code length: n = 2^m − 1
Number of message symbols: k = 2^m − m − 1
Number of parity check symbols: (n − k) = m, m ≥ 3
Error correcting capability: t = 1 (dmin = 3)
The parity check matrix H of this code consists of all the non-zero m-tuples as its columns. In systematic form, the columns of H are arranged as follows:
H = [Q ⁝ Im]
where the sub-matrix Q consists of the (2^m − m − 1) columns which are the m-tuples of weight 2 or more. As an illustration, for k = 4 we obtain m = 3 from k = 2^m − m − 1. Thus we require 3 parity check symbols, and the length of the code is 2^3 − 1 = 7. This results in the (7, 4) Hamming code.
The parity check matrix for the (7, 4) linear systematic Hamming code is then formed with these columns, and the generator matrix in systematic form is
G = [I(2^m − m − 1) ⁝ Q^T]
And for the (7, 4) systematic code it follows:
A non-systematic Hamming code can be constructed by placing the parity check bits at the positions 2^l, l = 0, 1, 2, …. This was the conventional method of construction in switching and computer applications (refer, for example, 'Switching Circuits and Applications', Marcus). One simple procedure for construction of such a code is as follows:
Step 1: Write the binary representations of the numbers 1 through 2^m − 1, each as an m-tuple.
Step 2: Arrange these m-tuples as the successive rows of a matrix; this gives H^T.
Step 3: The transpose of the matrix obtained in Step 2 gives the parity check matrix H for the code.
The code words are in the form
v = (p1 p2 m1 p3 m2 m3 m4 p4 m5 m6 m7 m8 m9 m10 m11 p5 m12 …)
      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
where p1, p2, p3, … are the parity digits and m1, m2, m3, … are the message digits. For example, let us consider the non-systematic (7, 4) Hamming code.
Step 1: The binary representations of the numbers 1 through 7 are the 3-tuples
(1 0 0), (0 1 0), (1 1 0), (0 0 1), (1 0 1), (0 1 1), (1 1 1)

Step 2:
        1 0 0
        0 1 0
        1 1 0
H^T =   0 0 1
        1 0 1
        0 1 1
        1 1 1

Step 3:
        1 0 1 0 1 0 1
H =     0 1 1 0 0 1 1
        0 0 0 1 1 1 1
Notice that the parity check bits, from the above H matrix, apply to positions 1, 2 and 4. Accordingly, the check bits can be represented as linear combinations of the message bits. For the (7, 4) code under consideration we have
p1 = m1 + m2 +m4
p2 = m1 + m3 +m4
p3 = m2 + m3 + m4
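A minimal Python sketch of this encoding rule is given below; the function name is hypothetical, and the bit ordering (p1 p2 m1 p3 m2 m3 m4) follows the construction just described.

```python
def hamming74_encode(m):
    # m = (m1, m2, m3, m4); parity bits go in positions 1, 2 and 4 (= 2**l)
    m1, m2, m3, m4 = m
    p1 = (m1 + m2 + m4) % 2   # checks positions 1, 3, 5, 7
    p2 = (m1 + m3 + m4) % 2   # checks positions 2, 3, 6, 7
    p3 = (m2 + m3 + m4) % 2   # checks positions 4, 5, 6, 7
    return (p1, p2, m1, p3, m2, m3, m4)

print(hamming74_encode((1, 0, 1, 0)))  # -> (1, 0, 1, 1, 0, 1, 0)
```

The printed code vector (1 0 1 1 0 1 0) is the one used in the decoding illustration further below.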
Notice that the message bits are located at the positions other than the 2^l, l = 0, 1, 2, 3, … locations; i.e., they are located in positions 3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 17, 18, …. The k columns of the identity matrix Ik are distributed successively to these locations. The Q sub-matrix in the H matrix can be identified as containing those columns which have weight more than one. The transpose of this matrix then gives the columns to be filled, in succession, in the G matrix. For the example of the (7, 4) linear code considered, the Q sub-matrix is:
code considered, the Q- sub-matrix is:
1 1 0
1 1 0 1
Q= 1 0 1 1 , and hence QT = 1 0 1
0 1 1
0 1 1 1
1 1 1
The first two columns of this matrix then become the first two columns of the G matrix, and the third column becomes the fourth column of the G matrix. The table below gives the codes generated by this method.
Observe that the procedure outlined for the code construction starts from selecting the H matrix
which is unique and hence the codes are also unique. We shall consider the correctable error patterns
and the corresponding syndromes listed in the table below.
Table: Error patterns and syndromes for the (7, 4) linear non-systematic code
e1 e2 e3 e4 e5 e6 e7   s1 s2 s3
1  0  0  0  0  0  0    1  0  0
0  1  0  0  0  0  0    0  1  0
0  0  1  0  0  0  0    1  1  0
0  0  0  1  0  0  0    0  0  1
0  0  0  0  1  0  0    1  0  1
0  0  0  0  0  1  0    0  1  1
0  0  0  0  0  0  1    1  1  1
If the syndrome is read from right to left, i.e. if the sequence is arranged as 's3 s2 s1', it is interesting to observe that the decimal equivalent of this binary sequence corresponds to the error location. Thus if the code vector (1 0 1 1 0 1 0) is received as (1 0 1 0 0 1 0), the corresponding syndrome is (0 0 1); this is exactly the same as the 4th column of the H matrix, and the re-ordered sequence (1 0 0) corresponds to decimal 4.
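The 'syndrome read as a decimal' rule translates directly into code. The sketch below (hypothetical helper name) recomputes the three parity checks over the received vector and flips the bit whose position equals the decimal value of (s3 s2 s1):

```python
def hamming74_decode(r):
    # r = (r1, ..., r7), parity bits at positions 1, 2 and 4
    r1, r2, r3, r4, r5, r6, r7 = r
    s1 = (r1 + r3 + r5 + r7) % 2   # check over positions 1, 3, 5, 7
    s2 = (r2 + r3 + r6 + r7) % 2   # check over positions 2, 3, 6, 7
    s3 = (r4 + r5 + r6 + r7) % 2   # check over positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3     # decimal value of (s3 s2 s1)
    if pos:                        # nonzero syndrome: flip the bit at 'pos'
        r = list(r)
        r[pos - 1] ^= 1
        r = tuple(r)
    return r

# The received vector from the text, corrupted in position 4:
print(hamming74_decode((1, 0, 1, 0, 0, 1, 0)))  # -> (1, 0, 1, 1, 0, 1, 0)
```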
It can be verified that the (7, 4), (15, 11), (31, 26) and (63, 57) codes are all single error correcting Hamming codes and are regarded as quite useful.
An important property of the Hamming codes is that they satisfy the condition of Eq (4.36) with the equality sign, with t = 1. This means that Hamming codes are "single error correcting binary perfect codes". This can also be verified from Eq (4.35).
We may delete any l columns from the parity check matrix H of a Hamming code, resulting in the reduction of the dimension of the H matrix to m × (2^m − l − 1). Using this new matrix as the parity check matrix we obtain a "shortened" Hamming code with the following parameters:
Code length: n = 2^m − l − 1
Number of message symbols: k = 2^m − m − l − 1
Number of parity check symbols: (n − k) = m
Notice that if the deletion of the columns of the H matrix is done properly, we may obtain a Hamming code with dmin = 4. For example, if we delete from the sub-matrix Q all the columns of even weight, we obtain the m × 2^(m−1) matrix
H′ = [Q′ ⁝ Im]
where Q′ contains the (2^(m−1) − m) columns of odd weight. Clearly no three columns add to zero, as all columns have odd weight. However, for a column in Q′ there exist three columns in Im such that the four columns add to zero. Thus the shortened Hamming code with H′ as the parity check matrix has minimum distance exactly 4. The distance-4 shortened Hamming codes can be used for correcting all single error patterns while simultaneously detecting all double error patterns. Notice that when a single error occurs the syndrome contains an odd number of ones, while for double errors it contains an even number of ones. Accordingly, the decoding can be accomplished in the following manner:
(1) If s = 0, assume that no error has occurred.
(2) If s contains an odd number of ones, a single error has occurred. The single error pattern pertaining to this syndrome is added to the received code vector for error correction.
(3) If s is nonzero and contains an even number of ones, an uncorrectable error pattern has been detected.
Alternatively, the SEC Hamming codes may be made to detect double errors by adding an extra parity check in the (n + 1)-th position. Thus the (8, 4), (16, 11) etc. codes have dmin = 4 and correct single errors while detecting double errors.
BINARY CYCLIC CODES
INTRODUCTION
"Binary cyclic codes” form a sub class of linear block codes. Majority of important linear block
codes that are known to-date are either cyclic codes or closely related to cyclic codes. Cyclic codes are
attractive for two reasons: First, encoding and syndrome calculations can be easily implemented using
simple shift registers with feed back connections. Second, they posses well defined mathematical
structure that permits the design of higher-order error correcting codes.
The second property can be easily understood from Fig 4.1. Instead of writing the code as a row vector, we have represented it along a circle. The direction of traverse may be either clockwise or counter-clockwise (right shift or left shift). For example, if we move in a counter-clockwise direction, then starting at A the code word is 110001100, while if we start at B it would be 011001100. Clearly, the two code words are related in that one is obtained from the other by a cyclic shift.
If v = (v0, v1, v2, …, vn−1) is a code vector, then the code vector read from B in the CW direction, obtained by a one-bit cyclic right shift,
v(1) = (vn−1, v0, v1, v2, …, vn−3, vn−2) …… (4.2)
is also a code vector. In this way, the n-tuples obtained by successive cyclic right shifts,
v(i) = (vn−i, vn−i+1, …, vn−1, v0, v1, …, vn−i−1), i = 1, 2, …, n − 1 …… (4.3)
are all code vectors. This property of cyclic codes enables us to treat the elements of each code vector as the co-efficients of a polynomial of degree (n − 1).
This is the property that is extremely useful in the analysis and implementation of these codes.
Thus we write the "code polynomial" V(X) for the code vector in Eq (4.1) as:
V(X) = v0 + v1X + v2X^2 + v3X^3 + … + vn−3X^(n−3) + vn−2X^(n−2) + vn−1X^(n−1) …… (4.4)
Notice that the co-efficients of the polynomial are either '0' or '1' (binary codes), i.e. they belong to GF(2) as discussed in Sec 5.7.1.
Therefore multiplication of V(X) by X may be viewed as a cyclic shift or rotation to the right, subject to the condition X^n = 1. This condition (i) restores X·V(X) to degree (n − 1) and (ii) implies that the right-most bit is fed back at the left. In general,
V(i)(X) = vn−i + vn−i+1X + vn−i+2X^2 + … + vn−1X^(i−1) + v0X^i + v1X^(i+1) + … + vn−i−2X^(n−2) + vn−i−1X^(n−1) …… (4.7)
An (n, k) cyclic code is specified by the complete set of code polynomials of degree ≤ (n − 1), which contains a polynomial g(X) of degree (n − k), called the "generator polynomial" of the code, as a factor. This polynomial is equivalent to the generator matrix G of block codes. Further, it is the only code polynomial of minimum degree, and it is unique. Thus we have an important theorem.
Theorem 4.1: "If g(X) is a polynomial of degree (n − k) and is a factor of (X^n + 1), then g(X) generates an (n, k) cyclic code in which the code polynomial V(X) for a data vector u = (u0, u1, …, uk−1) is generated by
V(X) = U(X) g(X) …… (4.8)
where U(X) = u0 + u1X + … + uk−1X^(k−1) is the message polynomial."
The theorem can be justified by contradiction: if there were another code polynomial of the same (minimum) degree, then adding the two polynomials would give a code polynomial of degree less than (n − k) (using the linearity property and binary arithmetic), which is not possible because the minimum degree is (n − k). Hence g(X) is unique.
Clearly, there are 2^k code polynomials corresponding to the 2^k data vectors, and the code vectors corresponding to these code polynomials form a linear (n, k) code. We have then, from the theorem,
g(X) = 1 + Σ (i = 1 to n−k−1) gi X^i + X^(n−k) …… (4.10)
Since g(X) is a polynomial of minimum degree, g0 = gn−k = 1 always, and the remaining co-efficients may be either '0' or '1'. Performing the multiplication indicated in Eq (4.8) we have:
In this book we have described cyclic codes with the right-shift operation. The left-shift version can be obtained by simply re-writing the polynomials. Thus, for left-shift operations, the various polynomials take the form
g*(X) = X^(n−k) + Σ (i = 1 to n−k−1) gi X^(n−k−i) + gn−k …… (4.13d)
As a convention, the higher-order co-efficients of a polynomial are transmitted first. This is the reason for the format of polynomials used in this book.
For a sequence (a0, a1, a2, …, an−1) with polynomial A(X) = a0 + a1X + … + an−1X^(n−1), where the ai's are either '0' or '1', the right-most bit in the sequence is transmitted first in any operation. The product of two polynomials A(X) and B(X) yields:
C(X) = A(X)·B(X) = a0b0 + (a0b1 + a1b0)X + … + an−1bm−1X^(n+m−2)
This product may be realized with the circuits of Fig 4.2 (a) or (b), where A(X) is the input and the co-efficients of B(X) are given as weighting-factor connections to the mod-2 adders. A '0' indicates no connection, while a '1' indicates a connection. Since the higher-order co-efficients are sent first, the highest-order co-efficient an−1bm−1 of the product polynomial is obtained first at the output of Fig 4.2(a). Then the co-efficient of X^(n+m−3) is obtained as the sum {an−2bm−1 + an−1bm−2}, the first term directly and the second term through the shift register SR1. Lower-order co-efficients are then generated through the successive SRs and mod-2 adders. After (n + m − 2) shifts, the SRs contain {0, 0, …, 0, a0, a1} and the output is (a0b1 + a1b0), which is the co-efficient of X. After (n + m − 1) shifts, the SRs contain (0, 0, …, 0, a0) and the output is a0b0. The product is now complete and the contents of the SRs become (0, 0, …, 0). Fig 4.2(b) performs the multiplication in a similar way, but the arrangement of the SRs and the ordering of the co-efficients are different (reverse order!). This modification helps to combine two multiplication operations into one, as shown in Fig 4.2(c).
From the above description it is clear that a non-systematic cyclic code may be generated using (n − k) shift registers. The following examples illustrate the concepts described so far.
B(X) = 1 + X + X^3 + X^4 + X^6
The circuits of Fig 4.3 (a) and (b) give the product C(X) = A(X)·B(X).
Fig 4.3: Circuit to perform C(X) = A(X)·B(X)
Example 4.2: Consider the generation of a (7, 4) cyclic code. Here (n − k) = (7 − 4) = 3, and we have to find a generator polynomial of degree 3 which is a factor of X^n + 1 = X^7 + 1.
To find the factors of degree 3, divide X^7 + 1 by X^3 + aX^2 + bX + 1, where 'a' and 'b' are binary numbers, to get the remainder abX^2 + (1 + a + b)X + (a + b + ab + 1). The only condition for the remainder to be zero is a + b = 1, which means either a = 1, b = 0 or a = 0, b = 1. Thus we have two possible polynomials of degree 3, namely
g1(X) = X^3 + X^2 + 1 and g2(X) = X^3 + X + 1
Thus selection of a 'good' generator polynomial seems to be a major problem in the design of cyclic
codes. No clear-cut procedures are available. Usually computer search procedures are followed.
Let us choose g(X) = X^3 + X + 1 as the generator polynomial. The encoding circuits are shown in Fig 4.4(a) and (b). For the message u = (1 0 1 1), U(X) = 1 + X^2 + X^3, and
V(X) = U(X) g(X) = (1 + X^2 + X^3)(1 + X + X^3)
= 1 + X + X^3 + X^2 + X^3 + X^5 + X^3 + X^4 + X^6
= 1 + X + X^2 + X^3 + X^4 + X^5 + X^6
⇒ v = (1 1 1 1 1 1 1)
The multiplication operation performed by the circuit of Fig 4.4(a) is listed step by step in the table below. In shift number 4, '000' is introduced to flush the registers. As seen from the tabulation, the product polynomial is
V(X) = 1 + X + X^2 + X^3 + X^4 + X^5 + X^6
and hence the output code vector is v = (1 1 1 1 1 1 1), as obtained by direct multiplication. The reader can verify the operation of the circuit in Fig 4.4(b) in the same manner. Thus the multiplication circuits of Fig 4.4 can be used for the generation of non-systematic cyclic codes.
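Since the encoder is just a GF(2) polynomial multiplier, the hand computation above can be checked with a few lines of Python (the helper name is illustrative):

```python
def gf2_poly_mul(a, b):
    # a, b: coefficient lists, lowest degree first; product modulo 2
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] ^= ai & bj
    return c

u = [1, 0, 1, 1]           # U(X) = 1 + X^2 + X^3
g = [1, 1, 0, 1]           # g(X) = 1 + X + X^3
print(gf2_poly_mul(u, g))  # -> [1, 1, 1, 1, 1, 1, 1], i.e. v = (1111111)
```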
As in the case of multipliers, the division of A(X) by B(X) can be accomplished by using shift registers and mod-2 adders, as shown in Fig 4.5. In a division circuit, the first co-efficient of the quotient is q1 = (an−1 ÷ bm−1), and q1·B(X) is subtracted from A(X). This subtraction is carried out by the feedback connections shown. The process continues for the second and subsequent terms. However, remember that these coefficients are binary coefficients. After (n − 1) shifts, the entire quotient will appear at the output and the remainder is stored in the shift registers.
We shall understand the operation of one divider circuit through an example. Operation of other
circuits can be understood in a similar manner.
Example 4.3:
Let A(X) = X^3 + X^5 + X^6, i.e. A = (0 0 0 1 0 1 1), and B(X) = 1 + X + X^3. We want to find the quotient and remainder after dividing A(X) by B(X). The circuit to perform this division is shown in Fig 4.7, drawn using the format of Fig 4.5(a). The operation of the divider circuit is listed in the table.
The quotient co-efficients will be available only after the fourth shift, as the first three shifts result in entering the first 3 bits into the shift registers, and in each of these shifts the output of the last register, SR3, is zero.
The quotient co-efficients serially presented at the output are seen to be (1 1 1 1), and hence the quotient polynomial is Q(X) = 1 + X + X^2 + X^3. The remainder co-efficients are (1 0 0), and the remainder polynomial is R(X) = 1.
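The divider circuit computes an ordinary GF(2) polynomial division, so the example can also be checked in software. A small Python sketch (hypothetical function name) reproduces the quotient and remainder:

```python
def gf2_poly_divmod(a, b):
    # Divide A(X) by B(X) over GF(2); coefficients lowest degree first.
    a = a[:]                          # working copy becomes the remainder
    db = len(b) - 1
    q = [0] * max(len(a) - db, 1)
    for i in range(len(a) - db - 1, -1, -1):  # subtract shifted multiples of B
        if a[i + db]:
            q[i] = 1
            for j, bj in enumerate(b):
                a[i + j] ^= bj
    return q, a[:db]                  # quotient, remainder (degree < deg B)

A = [0, 0, 0, 1, 0, 1, 1]  # A(X) = X^3 + X^5 + X^6
B = [1, 1, 0, 1]           # B(X) = 1 + X + X^3
q, r = gf2_poly_divmod(A, B)
print(q)  # -> [1, 1, 1, 1] : Q(X) = 1 + X + X^2 + X^3
print(r)  # -> [1, 0, 0]    : R(X) = 1
```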
Since the code polynomial is a multiple of the generator polynomial, we can write:
V(X) = P(X) + X^(n−k) U(X) = Q(X) g(X) …… (4.18)
X^(n−k) U(X)/g(X) = Q(X) + P(X)/g(X) …… (4.19)
Thus division of X^(n−k) U(X) by g(X) gives us the quotient polynomial Q(X) and the remainder polynomial P(X). Therefore, to obtain a cyclic code in systematic form, we determine the remainder polynomial P(X) after dividing X^(n−k) U(X) by g(X). This division process can be easily achieved by noting that "multiplication by X^(n−k) amounts to shifting the sequence by (n − k) bits". Specifically, in the circuit of Fig 4.5(a), if the input A(X) is applied to the mod-2 adder after the (n − k)-th shift register stage, the result is the division of X^(n−k) A(X) by B(X).
Accordingly, we have the following scheme to generate systematic cyclic codes. The generator polynomial is written as:
g(X) = 1 + g1X + g2X^2 + … + gn−k−1X^(n−k−1) + X^(n−k)
The circuit of Fig 4.8 does the job of dividing X^(n−k) U(X) by g(X). The following steps describe the encoding operation:
Step 1: With GATE ON and switch S in position 1, the k message digits are shifted into the circuit (and simultaneously into the channel); as they enter, the (n − k) parity digits are formed in the shift register.
Step 2: After the k-th shift, GATE is turned OFF, switch S is moved to position 2, and the parity digits in the register are shifted out into the channel.
Fig 4.8 Systematic encoding of cyclic codes using (n − k) shift register stages
Clearly, the encoder is very much simpler than the encoder of an (n, k) linear block code and the
memory requirements are reduced. The following example illustrates the procedure.
Example 4.4:
Let u = (1 0 1 1), and suppose we want a (7, 4) cyclic code in systematic form. The generator polynomial chosen is g(X) = 1 + X + X^3.
We perform the direct division of X^(n−k) U(X) = X^3 + X^5 + X^6 by g(X) as shown below. From the direct division observe that p0 = 1, p1 = p2 = 0. Hence the code word in systematic format is v = (p0 p1 p2 ⁝ u0 u1 u2 u3).
After the fourth shift, GATE is turned OFF, switch S is moved to position 2, and the parity bits contained in the register are shifted to the output. The output code vector is v = (1 0 0 1 0 1 1), which agrees with the direct hand calculation.
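The whole systematic encoding procedure, then, is "append (n − k) zeros, divide by g(X), prepend the remainder". A compact Python sketch of this (with an inlined GF(2) remainder helper; names are illustrative) reproduces Example 4.4:

```python
def gf2_mod(a, b):
    # Remainder of A(X) divided by B(X) over GF(2), lowest degree first
    a, db = list(a), len(b) - 1
    for i in range(len(a) - db - 1, -1, -1):
        if a[i + db]:                 # subtract (XOR) the shifted divisor
            for j, bj in enumerate(b):
                a[i + j] ^= bj
    return a[:db]

def systematic_cyclic_encode(u, g, n):
    # Parity digits = remainder of X^(n-k) U(X) divided by g(X)
    parity = gf2_mod([0] * (n - len(u)) + list(u), g)
    return tuple(parity) + tuple(u)   # v = (parity | message)

print(systematic_cyclic_encode([1, 0, 1, 1], [1, 1, 0, 1], 7))
# -> (1, 0, 0, 1, 0, 1, 1), i.e. v = (100 1011) as computed by hand
```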
The generator polynomial g(X) and the parity check polynomial h(X) uniquely specify the
generator matrix G and the parity check matrix H respectively. We shall consider the construction of
a generator matrix for a (7, 4) code generated by the polynomial g(X) = 1 +X+X3.
We start with the generator polynomial and its three cyclic shifted versions as below:
g(X) = 1 + X + X3
X g(X) = X + X2 + X4
X2g(X) = X2 + X3 + X5
X3g(X) = X3 + X4 + X6
The co-efficients of these polynomials are used as the elements of the rows of a (4 × 7) matrix to get the following generator matrix:
      1 1 0 1 0 0 0
G =   0 1 1 0 1 0 0
      0 0 1 1 0 1 0
      0 0 0 1 1 0 1
Clearly, the generator matrix so constructed is not in a systematic format. We can transform this into
a systematic format using Row manipulations. The manipulations are:
First row = First row; Second row = Second row; Third row = First row + Third row; and Fourth row
= First row + second row + Fourth row.
      1 1 0 1 0 0 0
G =   0 1 1 0 1 0 0    = [P ⁝ I4]
      1 1 1 0 0 1 0
      1 0 1 0 0 0 1
Using this generator matrix, which is in systematic form, the code word for u = (1 0 1 1) is v = (1 0 0 1 0 1 1), obtained as the sum of the 1st, 3rd and 4th rows of the G matrix. The result agrees with the direct hand calculation.
To construct the H matrix directly, we start with the reciprocal of the parity check polynomial, defined by X^k h(X^−1). Observe that the polynomial X^k h(X^−1) is also a factor of X^n + 1. For the polynomial (X^7 + 1) we have three prime factors, namely (X + 1), (X^3 + X + 1) and (X^3 + X^2 + 1). Since we have chosen (X^3 + X + 1) as the generator polynomial, the other two factors give us the parity check polynomial:
h(X) = (X + 1)(X^3 + X^2 + 1) = X^4 + X^2 + X + 1
The reciprocal polynomial and its shifted versions are:
X^4 h(X^−1) = X^4 + X^3 + X^2 + 1
X^5 h(X^−1) = X^5 + X^4 + X^3 + X
X^6 h(X^−1) = X^6 + X^5 + X^4 + X^2
      1 0 1 1 1 0 0
H =   0 1 0 1 1 1 0
      0 0 1 0 1 1 1
Clearly, this matrix is in non-systematic form. It is interesting to check that for the non-systematic matrices obtained, G·H^T = O. We can obtain the H matrix in the systematic format H = [I3 ⁝ P^T] by using row manipulations. The manipulation in this case is simply 'First row = First row + Third row'. The result is
      1 0 0 1 0 1 1
H =   0 1 0 1 1 1 0
      0 0 1 0 1 1 1
Let the received vector be r = (r0, r1, r2, …, rn−1), with the polynomial representation
R(X) = r0 + r1X + r2X^2 + … + rn−1X^(n−1)
Let A(X) be the quotient and S(X) the remainder polynomial resulting from the division of R(X) by g(X), i.e.
R(X)/g(X) = A(X) + S(X)/g(X) …… (4.21)
The remainder S(X) is a polynomial of degree (n − k − 1) or less; it is called the "syndrome polynomial". If E(X) is the polynomial representing the error pattern caused by the channel, then we have:
R(X) = V(X) + E(X) …… (4.22)
Since V(X) = Q(X) g(X) is a multiple of the generator polynomial, R(X) and E(X) leave the same remainder on division by g(X). That is, the syndrome of R(X) is equal to the remainder resulting from dividing the error pattern by the generator polynomial, and the syndrome contains information about the error pattern which can be used for error correction. A "syndrome calculator" is shown in Fig 4.10.
1. The register is first initialized. With GATE 2 ON and GATE 1 OFF, the received vector is entered into the register.
2. After the entire received vector is shifted into the register, the contents of the register will be the syndrome, which can be shifted out of the register by turning GATE 1 ON and GATE 2 OFF. The circuit is then ready for processing the next received vector.
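Computationally, the syndrome calculator is just the remainder computation of Eq (4.21). A short Python sketch (illustrative names) shows a code word giving a zero syndrome and a corrupted word giving the syndrome of its error pattern:

```python
def cyclic_syndrome(r, g):
    # S(X) = remainder of R(X) divided by g(X), Eq (4.21)
    r, db = list(r), len(g) - 1
    for i in range(len(r) - db - 1, -1, -1):
        if r[i + db]:
            for j, gj in enumerate(g):
                r[i + j] ^= gj
    return tuple(r[:db])

g = [1, 1, 0, 1]                    # g(X) = 1 + X + X^3
v = (1, 0, 0, 1, 0, 1, 1)           # a valid code word from Example 4.4
r = (1, 0, 0, 1, 0, 0, 1)           # the same word with an error at X^5
print(cyclic_syndrome(v, g))        # -> (0, 0, 0)
print(cyclic_syndrome(r, g))        # -> (1, 1, 1), remainder of X^5 / g(X)
```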
Cyclic codes are extremely well suited for error detection. They can be designed to detect many combinations of likely errors, and the implementation of error-detecting and error-correcting circuits is practical and simple. Error detection can be achieved by adding an additional R-S flip-flop to the syndrome calculator: if the syndrome is nonzero, the flip-flop sets and provides an indication of error. Because of this ease of implementation, virtually all error detecting codes are cyclic codes. If we are interested in error correction, then the decoder must be capable of determining the error pattern E(X) from the syndrome S(X) and adding it to R(X) to determine the transmitted V(X). The scheme shown in Fig 4.11 may be employed for the purpose. The error correction procedure consists of the following steps:
Step 1. The received data is shifted into the buffer register and the syndrome register with switches SIN closed and SOUT open; error correction is performed with SIN open and SOUT closed.
Step 2. After the syndrome for the received code word is calculated and placed in the syndrome register, the contents are read into the error detector. The detector is a combinatorial circuit designed to output a '1' if and only if the syndrome corresponds to a correctable error pattern with an error at the highest-order position X^(n−1). That is, if the detector output is a '1', then the received digit at the right-most stage of the buffer register is assumed to be in error and is corrected; if the detector output is '0', then the received digit at the right-most stage of the buffer is assumed to be correct. Thus the detector output is the estimated error value for the digit coming out of the buffer register.
Step 3. In the third step, the first received digit is shifted out of the buffer, and at the same time the syndrome register is shifted once. If the first received digit is in error, the detector output will be '1', which is used for error correction. The output of the detector is also fed back to the syndrome register to modify the syndrome. This results in a new syndrome corresponding to the 'altered' received code word shifted to the right by one place.
Step 4. The new syndrome is now used to check whether the second received digit, which is now at the right-most position, is an erroneous digit. If so, it is corrected, a new syndrome is calculated as in Step 3, and the procedure is repeated.
Step 5. The decoder operates on the received data digit by digit until the entire received code word is shifted out of the buffer.
At the end of the decoding operation, that is, after the received code word is shifted out of the buffer, all errors corresponding to correctable error patterns will have been corrected and the syndrome register will contain all zeros. If the syndrome register does not contain all zeros, an uncorrectable error pattern has been detected. The decoding schemes described in Fig 4.10 and Fig 4.11 can be used for any cyclic code. However, their practicality depends on the complexity of the combinational logic circuits of the error detector. In fact, there are special classes of cyclic codes for which the decoder can be realized by simpler circuits; the price paid for such simplicity, though, is a reduction of code efficiency for a given block size.
A decoder of the form described above operates on the received data bit by bit, each bit being tested in turn for error and corrected whenever an error is located. Such a decoder is called a "Meggitt decoder".
For illustration let us consider a decoder for the (7, 4) cyclic code generated by
g(X) = 1 + X + X^3
The circuit implementation of the Meggitt decoder is shown in Fig 4.12. The entire received vector R(X) is entered into the SRs bit by bit, and at the same time it is stored in the buffer memory. The division process starts after the third shift, and after the seventh shift the syndrome is stored in the SRs. If S(X) = (0 0 0), then E(X) = 0 and R(X) is read out of the buffer. Since S(X) is determined by E(X), consider the error patterns with nonzero coefficients: suppose E(X) = (0 0 0 0 0 0 1). Then the successive SR contents are (001, 110, 011, 111, 101), showing that S(X) = (1 0 1) after the seventh shift. At the eighth shift the SR content is (1 0 0), and this may be used through a coincidence circuit to correct the error bit coming out of the buffer at the eighth shift. On the other hand, if the error polynomial were E(X) = (0 0 0 1 0 0 0), then the SR content will be (1 0 0) at the eleventh shift, and the error will be corrected when the buffer delivers the error bit at the eleventh shift. The SR contents at different shifts, for two other error patterns, are shown in the table below:
SR contents for the error patterns (1001010) and (1001111)

Shift     Input   SR content        Input   SR content
Number            for (1001010)             for (1001111)
 1         0      000                1      100
 2         1      100                1      110
 3         0      010                1      111
 4         1      101                1      001
 5         0      100                0      110
 6         0      010                0      011
 7         1      101                1      011
 8         0      100*               0      111
 9         -      -                  0      101
10         -      -                  0      100*
(* the SR content (100) indicates the error bit leaving the buffer at that shift)
Fig 4.12 Meggitt decoder for (7,4) cyclic code
For R(X) = (1001010), the SR content is (100) at the 8th shift and the bit in the X^6 position of R(X) is corrected, giving the correct V(X) = (1001011). On the other hand, if R(X) = (1001111), then it is seen from the table that at the 10th shift the syndrome content will detect the error and correct the X^4 bit of R(X), giving V(X) = (1001011).
The decoder for the (15, 11) cyclic code, using g(X) = 1 + X + X^4, is shown in Fig 4.13. It is easy to check that the SR content at the 16th shift is (1000) for E(X) = X^14. Hence a coincidence circuit gives the correction signal to the buffer output as explained earlier.
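The bit-by-bit Meggitt procedure can also be sketched in software. The following Python sketch for the (7, 4) code with g(X) = 1 + X + X^3 is a simplified analogue of the circuit: it recomputes the syndrome at each shift rather than updating it through the feedback path, and corrects the right-most bit whenever the syndrome matches that of an error in the highest-order position. Names are illustrative.

```python
def meggitt_decode(r, g, n):
    def syndrome(w):
        # remainder of W(X) / g(X) over GF(2)
        w, db = list(w), len(g) - 1
        for i in range(len(w) - db - 1, -1, -1):
            if w[i + db]:
                for j, gj in enumerate(g):
                    w[i + j] ^= gj
        return w[:db]

    target = syndrome([0] * (n - 1) + [1])   # syndrome of E(X) = X^(n-1)
    r = list(r)
    for _ in range(n):                       # n cyclic shifts restore order
        if syndrome(r) == target:
            r[n - 1] ^= 1                    # correct the highest-order bit
        r = [r[n - 1]] + r[:n - 1]           # one cyclic right shift
    return tuple(r)

g = [1, 1, 0, 1]                             # g(X) = 1 + X + X^3
print(meggitt_decode((1, 0, 0, 1, 0, 1, 0), g, 7))  # -> (1, 0, 0, 1, 0, 1, 1)
```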
Although the Meggitt decoders are intended for single error correcting cyclic codes, they may be generalized for multiple error correcting codes as well, for example the (15, 7) BCH code.
An error trapping decoder is a modification of the Meggitt decoder that can be used for certain cyclic codes. The syndrome polynomial is computed as S(X) = remainder of [E(X)/g(X)]. If the error pattern E(X) is confined to the (n − k) parity check positions (1, X, X^2, …, X^(n−k−1)) of R(X), then E(X) = S(X), since the degree of E(X) is then less than that of g(X). Thus error correction can be carried out by simply adding S(X) to R(X). Even if E(X) is not confined to the (n − k) parity check positions of R(X) but has its nonzero values clustered together such that the length of the nonzero span is not more than the syndrome length, the syndrome will exhibit an exact replica of the error pattern after some cyclic shifts of E(X). For each such error pattern, the syndrome content S(X) (after the required shifts) is subtracted from the appropriately shifted R(X), and the corrected V(X) recovered.
“If the syndrome of R(X) is taken to be the remainder after dividing Xn-k R(X) by g(X),
and all errors lie in the highest-order (n-k) symbols of R(X), then the nonzero portion of the error
pattern appears in the corresponding positions of the syndrome”. Fig 4.14 shows an error trapping
decoder for a (15, 7) BCH code based on the principles described above. A total of 45 shifts
are required to correct the double error, 15 shifts to generate the syndrome, 15 shifts to correct the
first error and 15 shifts to correct the second error.
Illustration:
U(X) = X6 + 1; g(X) = X8 + X7 + X6 + X4 + 1
E(X) = X11 + X
r = (110010011011111)
Shift    Syndrome Generator    Shift    Middle       Shift    Bottom
Number   Register              Number   Register     Number   Register
 1       10001011               16      01100011      31      00000100
 2       01000101               17      10111010      32      00000010
 3       00100010               18      01011101      33      00000001
 4       10011010               19      10100101      34      00000000
 5       11000110               20      11011001      35-45   all zeros
 6       01100011               21      11100111
 7       00110001               22      11111000
 8       00011000               23      01111100
 9       00001100               24      00111110
10       00000110               25      00011111
11       10001000               26      10000100
12       01000100               27      01000010
13       00100010               28      00100001
14       10011010               29      00010000
15       11000110               30      00001000
Errors trapped at shift numbers 28 and 33.
Sometimes, when error trapping cannot be used directly for a given code, the test patterns can be modified to include the few troublesome error patterns along with the general test. Such a modified error trapping decoder is possible for the (23, 12) Golay code, in which the error pattern E(X) is of length 23 and weight 3 or less (t ≤ 3). The length of the syndrome register is 11, and if E(X) has a length greater than 11, the error pattern is not trapped by cyclically shifting S(X). In this case, it can be shown that one of the three error bits must have at least five zeros on one side of it and at least six zeros on the other side. Hence all error patterns can be cyclically shifted into one of the following three configurations (numbering the bit positions e0, e1, e2, …, e22):
(i) All three errors occur in the 11 high-order bits.
(ii) One error occurs in position e5 and the other two errors occur in the 11 high-order bits.
(iii) One error occurs in position e6 and the remaining two errors occur in the 11 high-order bits.
In the decoder shown in Fig 4.15, the received code vector R(X) is fed at the right-most stage of the syndrome generator (as was done in Fig 4.14), which is equivalent to multiplying R(X) by X^11. The syndromes corresponding to e5 and e6 are then obtained (using g1(X) as the generator polynomial): the syndrome vectors for the errors e5 and e6 are (01100110110) and (00110011011) respectively. Two more errors occurring in the 11 high-order bit positions will cause two 1s in the appropriate positions of the syndrome vectors, thereby complementing the vector for e5 or e6. Based on the above relations, the decoder operates as follows:
(i) The entire received vector is shifted into the syndrome generator (with switch G1 closed) and the syndrome S(X) corresponding to X^11 R(X) is formed.
(ii) If all (three or fewer) errors are confined to positions X^12, X^13, …, X^22 of R(X), then the syndrome matches the errors in these positions, and the weight of the syndrome is 3 or less. This is checked by a threshold gate, whose output T0 switches G2 ON and G1 OFF. R(X) is now read from the buffer and corrected by the syndrome bits (as they are clocked out bit by bit) through the modulo-2 adder circuit.
(iii) If the test in (ii) fails, it is assumed that one error is either at e5 or at e6, and the other two errors are in the 11 high-order bits of R(X). If the weight of S(X) is more than 3 in test (ii), the weights of [S(X) + S(e5)] and [S(X) + S(e6)] are tested. The decisions are:
1. If the weight of [S(X) + S(e5)] ≤ 2, then the decision (T1 = 1) is that one error is at position e5 and two errors are at the positions where [S(X) + S(e5)] is nonzero.
2. If the weight of [S(X) + S(e6)] ≤ 2, then the decision (T2 = 1) is that one error is at position e6 and two errors are at the positions where [S(X) + S(e6)] is nonzero.
These tests are arranged through combinatorial switching circuits, and the appropriate corrections in R(X) are made as R(X) is read from the buffer.
(iv) If the above tests fail, then with G1 and G3 ON and G2 OFF, the syndrome and buffer contents are shifted by one bit, and tests (ii) and (iii) are repeated. Bit-by-bit shifting of S(X) and R(X) is continued till the errors are located and corrected. A maximum of 23 shifts will be required to complete the process. After correction of R(X), the corrected V(X) is further processed through a divider circuit to obtain the message U(X) = V(X)/g(X).
Assuming that, upon shifting the block of 23 bits (with t ≤ 3) cyclically, 'at most one error will lie outside the 11 high-order bits of R(X)' at some shift, an alternative decoding procedure can be devised for the Golay coder: the systematic search decoder. Here test (ii) is carried out first. If the test fails, the first bit of R(X) is inverted and a check is made to see whether the weight of S(X) is ≤ 2. If this test succeeds, the nonzero positions of S(X) give two error locations (similar to test (iii) above) and the third error is at the first position. If this test fails, the syndrome content is cyclically shifted, each time testing for weight of S(X) ≤ 3; and, failing that, the 2nd, 3rd, … and 12th bits of R(X) are inverted successively, testing each time for weight of S(X) ≤ 2. Since all the errors are not in the parity check section, an error must be detected in one of the shifts. Once one error is located and corrected, the other two errors are easily located and corrected by test (ii). Sometimes the systematic search decoder is simpler in hardware than the error trapping decoder, but the latter is faster in operation. The systematic search decoder can be generalized for decoding other multiple-error-correcting cyclic codes. It is to be observed that the Golay (23, 12) code cannot be decoded by majority-logic decoders.
Module – 2
(i) Find the parity check matrix. (ii) Find the minimum distance of the code.
Explain a typical data transmission system with a suitable block diagram.
(i) Find the parity check matrix. (ii) Find the minimum distance of the code. (iii) Draw the
encoder and syndrome computation circuit.
5) Consider the (7, 4) linear block code whose generator matrix is given below:
For the channel matrix given below, compute the channel capacity
8) Generate a (7, 4) Hamming code for the data [1 0 1 0] and show how a single-bit error can be detected and corrected.
9) Describe the purpose of a syndrome calculation in cyclic codes with example.
10) For the binary symmetric channel shown below, P(x1) = β.
(i) Show that the mutual information I(X;Y) is given by
I(X;Y) = H(Y) + P log2 P + (1 − P) log2(1 − P).
26) A voice grade channel of the telephone network has a bandwidth of 3.4 kHz.
(i) Determine the channel capacity of the telephone channel for a signal-to-noise ratio of 30 dB.
(ii) Obtain the minimum signal-to-noise ratio required to support information transmission through the telephone channel at the rate of 4800 bits/sec.
27) For a systematic (6, 3) linear block code, the parity matrix P is given by
      1 0 1
P =   0 1 1
      1 1 0
Find all possible code vectors.
28) a) Explain the significance of the entropy H(X/Y) of a communication system where X
is the transmitter and Y is the receiver.
b) Derive the relationship between entropy and mutual information.
29) A binary symmetric channel has source probabilities p(x1) = 3/4 and p(x2) = 1/4, and noise matrix
P(Y/X) = [2/3 1/3
          1/3 2/3]
Calculate H(X), H(X,Y) and H(Y/X).
30) Discuss Channel Capacity and its equation.
31) Suppose you have two different messages, "ABABAB" and "AAABBB", with symbol probabilities p(a) = 0.5, p(b) = 0.3 and p(c) = 0.2. Calculate the encoded values for both messages using arithmetic encoding, and examine which message results in a more efficient encoding.
32) Find the generator and parity check matrices of a (7, 4) cyclic code with generator
polynomial g (X) = 1 + X + X3.
Module 3 :
Codes on Graph
Introduction to Convolutional Codes, Tree Codes and Trellis Codes, Description of Convolutional Codes
(Analytical Representation), The Generating Function, Matrix Description of Convolutional Codes.
Viterbi Decoding of Convolutional Codes, Turbo codes, Encoding and decoding of Turbo codes.
INTRODUCTION
In block codes, a block of n digits generated by the encoder depends only on the block of k data digits in a particular time unit; these codes can be generated by combinatorial logic circuits. In a convolutional code, the block of n digits generated by the encoder in a time unit depends not only on the block of k data digits within that time unit, but also on the preceding m input blocks. An (n, k, m) convolutional code can be implemented with a k-input, n-output sequential circuit with input memory m. Generally, k and n are small integers with k < n, but the memory order m must be made large to achieve low error probabilities. In the important special case when k = 1, the information sequence is not divided into blocks and can be processed continuously.
Similar to block codes, convolutional codes can be designed to either detect or correct errors.
However, since the data are usually re-transmitted in blocks, block codes are better suited for error
detection and convolutional codes are mainly used for error correction.
Convolutional codes were first introduced by Elias in 1955 as an alternative to block codes. This was followed later by Wozencraft, Massey, Fano, Viterbi, Omura and others. A detailed discussion and survey of the application of convolutional codes to practical communication channels can be found in Shu Lin & Costello Jr., J. Das et al., and other standard books on error control coding.
Fig 5.2
From Fig 5.2 the encoding procedure can be understood clearly. Initially the registers are in the reset mode, i.e. (0, 0). At the first time unit the input bit is 1. This bit enters the first register and pushes out its previous content, namely '0', as shown; this, in turn, enters the second register and pushes out its previous content. All these bits, as indicated, are passed on to the X-OR gates and the output pair (1, 1) is obtained. The same steps are repeated until time unit 4, where zeros are introduced to clear the register contents, producing two more output pairs. At time unit 6, if an additional '0' is introduced, the encoder is reset and the output pair (0, 0) is obtained. However, this step is not absolutely necessary, as the next bit, whatever it is, will flush out the content of the second register. The '0' and the '1' indicated at the output of the second register at time unit 5 now vanish. Hence after (L + m) = 3 + 2 = 5 time units, the output sequence will read v = (11, 10, 00, 10, 11). (Note: L = length of the input sequence.) This then is the code word produced by the encoder. It is very important to remember that "left-most symbols represent the earliest transmission".
As already mentioned, convolutional codes are intended for error correction. However, they suffer from the 'problem of choosing connections' to yield good distance properties; the selection of connections is indeed very complicated and has not been solved in general. Still, good codes have been developed by computer search techniques for all constraint lengths less than about 20. Another point to be noted is that convolutional codes do not have any particular block size; they can be periodically truncated. The only requirement is that m zeros be appended to the end of the input sequence for the purpose of 'clearing' or 'flushing' the data bits out of the encoding shift registers. These added zeros carry no information but have the effect of reducing the code rate below (k/n). To keep the code rate close to (k/n), the truncation period is generally made as long as practical. The encoding procedure as depicted pictorially in Fig 5.2 is rather tedious. We can instead describe the encoder in terms of its "impulse response" or "generator sequence", which merely represents the response of the encoder to a single '1' bit that moves through it.
The encoder for a (2, 1, 3) code is shown in Fig 5.3. Here the encoder consists of an m = 3 stage shift register, n = 2 modulo-2 adders (X-OR gates) and a multiplexer for serializing the encoder outputs. Notice that modulo-2 addition is a linear operation, and it follows that all convolutional encoders can be implemented using a "linear feed-forward shift register circuit".
The information sequence u = (u1, u2, u3, …) enters the encoder one bit at a time, starting from u1. As the name implies, a convolutional encoder operates by performing convolutions on the information sequence. Specifically, the encoder output sequences, in this case v(1) = {v1(1), v2(1), v3(1), …} and v(2) = {v1(2), v2(2), v3(2), …}, are obtained by the discrete convolution of the information sequence with the encoder "impulse responses". The impulse responses are obtained by determining the output sequences of the encoder produced by the input sequence u = (1, 0, 0, 0, …). The impulse responses so defined are called the 'generator sequences' of the code. Since the encoder has an m-time-unit memory, the impulse responses can last at most (m + 1) time units (that is, a total of (m + 1) shifts are necessary for a message bit to enter the shift register and finally come out) and are written as:
g(j) = (g1(j), g2(j), …, gm+1(j)), j = 1, 2.
By inspection, these can be written as g(1) = (1, 0, 1, 1) and g(2) = (1, 1, 1, 1).
Observe that the generator sequences so found are simply the 'connection vectors' of the encoder: in these sequences a '1' indicates a connection and a '0' indicates no connection to the corresponding X-OR gate. If we group the elements of the generator sequences into pairs, we get the overall impulse response of the encoder. Thus for the encoder of Fig 5.3 the overall impulse response is (11, 01, 11, 11), and the encoding equations can be written as:
vl(j) = Σ (i = 0 to m) ul−i gi+1(j)
      = ul g1(j) + ul−1 g2(j) + ul−2 g3(j) + … + ul−m gm+1(j) …… (5.2)
for j = 1, 2, where ul−i = 0 for l ≤ i, and all operations are modulo-2. Hence, for the encoder of Fig 5.3, we have:
vl(1) = ul + ul – 2 + ul - 3
vl(2) = ul + ul – 1+ ul – 2 + ul - 3
This can be easily verified by direct inspection of the encoding circuit. After encoding, the two output sequences are multiplexed into a single sequence, called the "code word", for transmission over the channel. The code word is given by:
v = {v1(1)v1(2), v2(1)v2(2), v3(1)v3(2), …}
Suppose the information sequence is u = (1 0 1 1 1). Then the output sequences are:
v(1) = (1 0 1 1 1) * (1 0 1 1) = (1 0 0 0 0 0 0 1)
v(2) = (1 0 1 1 1) * (1 1 1 1) = (1 1 0 1 1 1 0 1)
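Eq (5.2) is a pair of discrete convolutions, so the encoder is only a few lines of Python. The sketch below (illustrative names) reproduces v(1), v(2) and the multiplexed code word for u = (1 0 1 1 1):

```python
def conv_encode(u, generators):
    # v_l^(j) = sum over i of u_(l-i) * g_(i+1)^(j) (mod 2), Eq (5.2)
    m = len(generators[0]) - 1
    outputs = []
    for g in generators:
        v = [0] * (len(u) + m)
        for l in range(len(v)):
            for i, gi in enumerate(g):
                if 0 <= l - i < len(u):
                    v[l] ^= u[l - i] & gi
        outputs.append(v)
    # multiplex the n output sequences into a single code word
    return [bit for pair in zip(*outputs) for bit in pair]

u = [1, 0, 1, 1, 1]
g1, g2 = [1, 0, 1, 1], [1, 1, 1, 1]     # (2,1,3) encoder of Fig 5.3
print(conv_encode(u, [g1, g2]))
# -> [1,1, 0,1, 0,0, 0,1, 0,1, 0,1, 0,0, 1,1]
```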
The discrete convolution operation described in Eq (5.2) is merely the addition of shifted impulse responses. Thus, to obtain the encoder output, we need only shift the overall impulse response by one branch word for each successive input bit, multiply it by the corresponding input bit, and then add:

Input 1:       11 01 11 11
Input 0:          00 00 00 00               (shifted one branch word)
Input 1:             11 01 11 11            (shifted two branch words)
Input 1:                11 01 11 11         (shifted three branch words)
Input 1:                   11 01 11 11      (shifted four branch words)
Modulo-2 sum:  11 01 00 01 01 01 00 11
The modulo-2 sum represents the same sequence as obtained before, with no confusion at all with respect to indices and suffixes. This very easy approach, superposition (linear addition) of shifted impulse responses, demonstrates that convolutional codes are linear codes, just like block codes and cyclic codes. The approach also permits us to define a 'generator matrix' for the convolutional encoder. Remember that interlacing the generator sequences gives the overall impulse response, and hence these are used as the rows of the matrix; the number of rows equals the number of information digits, and therefore the matrix that results is "semi-infinite". The second and subsequent rows of the matrix are merely shifted versions of the first row; they are each shifted with respect to the previous one by one branch word. If the information sequence u has a finite length L, then G has L rows and n(m + L) columns (or (m + L) branch-word columns), and v has a length of n(m + L), i.e. a length of (m + L) branch words, each branch word being of length n. Thus the generator matrix G for encoders of the type shown in Fig 5.3 is written as:
      g1(1)g1(2)  g2(1)g2(2)  g3(1)g3(2)  g4(1)g4(2)
G =               g1(1)g1(2)  g2(1)g2(2)  g3(1)g3(2)  g4(1)g4(2)                …… (5.3)
                              g1(1)g1(2)  g2(1)g2(2)  g3(1)g3(2)  g4(1)g4(2)
                                          ⋱
(Blank places are zeros.)
v = u .G …………………. (5.4)
Example 5.2:
For the information sequence of Example 5.1, the G matrix has 5 rows and 2(3 + 5) = 16 columns, and we have
      1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0
      0 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0
G =   0 0 0 0 1 1 0 1 1 1 1 1 0 0 0 0
      0 0 0 0 0 0 1 1 0 1 1 1 1 1 0 0
      0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1
Performing the multiplication v = u·G as per Eq (5.4), we get v = (11, 01, 00, 01, 01, 01, 00, 11), the same as before.
As a second example of a convolutional encoder, consider the (3, 2, 1) encoder shown in Fig 5.4. Here, as k = 2, the encoder consists of two m = 1 stage shift registers together with n = 3 modulo-2 adders and two multiplexers. The information sequence enters the encoder k = 2 bits at a time and can be written as u = {u1(1)u1(2), u2(1)u2(2), u3(1)u3(2), …} or as two separate input sequences:
u(1) = {u1(1), u2(1), u3(1), …} and u(2) = {u1(2), u2(2), u3(2), …}.
The three output sequences, as can be seen from the encoding circuit (and consistent with the transfer function matrix found in Example 5.6 below), are:
vl(1) = ul(1) + ul−1(1) + ul−1(2)
vl(2) = ul(1) + ul(2) + ul−1(2)
vl(3) = ul(1)
After multiplexing, the code word is
v = {v1(1)v1(2)v1(3), v2(1)v2(2)v2(3), v3(1)v3(2)v3(3), …}
Example 5.3:
For the information sequences u(1) = (1 0 1) and u(2) = (1 1 0), the encoder output is
v = (1 0 1, 0 0 0, 0 0 1, 1 0 0).
The generator matrix for a (3, 2, m) code can be written as:
The encoding equations in matrix form are again given by v = u·G. Observe that each set of k = 2 rows of G is identical to the preceding set of rows, but shifted by n = 3 places, or one branch word, to the right.
Example 5.4:
(Remember that the blank places in the matrix are all zeros.)
Performing the matrix multiplication v = u·G, we get v = (101, 000, 001, 100), again agreeing with our previous computation using discrete convolution.
This second example clearly demonstrates the complexities involved in describing the code when the number of input sequences is increased beyond k = 1. In this case, although the encoder contains k shift registers, all of them need not have the same length. If Ki is the length of the i-th shift register, then we define the encoder memory order m by
m = Max (1 ≤ i ≤ k) Ki
An example of a (4, 3, 2) convolutional encoder in which the shift register lengths are 0, 1 and 2 is shown in Fig 5.5.
Since each information bit remains in the encoder for up to (m + 1) time units, and during each time unit it can affect any of the n encoder outputs (depending on the shift register connections), it follows that "the maximum number of encoder outputs that can be affected by a single information bit" is
nA = n(m + 1) …… (5.8)
nA is called the 'constraint length' of the code. For example, the constraint lengths of the encoders of Figures 5.3, 5.4 and 5.5 are 8, 6 and 12 respectively. Some authors (for example, Simon Haykin) define the constraint length as the number of shifts over which a single message bit can influence the encoder output: in an encoder with an m-stage shift register, the "memory" of the encoder equals m message bits, and the constraint length is taken as (m + 1). However, we shall adopt the definition given in Eq (5.8).
The number of shifts over which a single message bit can influence the encoder output is usually denoted by K. The encoders of Fig 5.3, 5.4 and 5.5 have K = 4, 2 and 3 respectively; the encoder in Fig 5.3 would accordingly be labeled a 'rate 1/2, K = 4' convolutional encoder. The term K also signifies the number of branch words in the encoder's impulse response.
Turning back, in the general case of an (n, k, m) code the generator matrix can be put in the form:
      G1 G2 G3 … Gm Gm+1
G =       G1 G2 … Gm−1 Gm Gm+1                        …… (5.9)
              G1 … Gm−2 Gm−1 Gm Gm+1
                  ⋱           ⋱
where each Gi is a k × n sub-matrix whose entries are:
       g1,i(1)  g1,i(2)  …  g1,i(n)
Gi =   g2,i(1)  g2,i(2)  …  g2,i(n)                   …… (5.10)
       ⁝        ⁝            ⁝
       gk,i(1)  gk,i(2)  …  gk,i(n)
Notice that each set of k rows of G is identical to the previous set of rows, but shifted n places to the right. For an information sequence u = (u1, u2, …), where ui = {ui(1), ui(2), …, ui(k)}, the code word is v = (v1, v2, …), where vj = (vj(1), vj(2), …, vj(n)) and v = u·G. Since the code word is a linear combination of the rows of the G matrix, it follows that an (n, k, m) convolutional code is a linear code.
Since the convolutional encoder generates n encoded bits for each k message bits, we define R = k/n as the "code rate". However, an information sequence of finite length L is encoded into a code word of length n(L + m), where the final nm outputs are generated after the last nonzero information block has entered the encoder: that is, the information sequence is terminated with all-zero blocks in order to clear the encoder memory. The terminating sequence of m zeros is called the "tail of the message". Viewing the convolutional code as a linear block code with generator matrix G, the block code rate is given by kL/n(L + m), the ratio of the number of message bits to the length of the code word. If L >> m, then L/(L + m) ≈ 1, and the block code rate of a convolutional code and its rate when viewed as a block code are effectively the same. In fact, this is the normal mode of operation for convolutional codes, and accordingly we shall not distinguish between the rate of a convolutional code and its rate when viewed as a block code. On the contrary, if L were small, the effective rate of transmission would indeed be kL/n(L + m), below the block code rate by a fractional amount
[k/n − kL/n(L + m)] / (k/n) = m/(L + m) …… (5.11)
called the "fractional rate loss". Therefore, in order to keep the fractional rate loss at a minimum (near zero), L is always assumed to be much larger than m. For the information sequence of Example 5.1, we have L = 5, m = 3, and the fractional rate loss is 3/8 = 37.5%. If L is made 1000, the fractional rate loss is only 3/1003 ≈ 0.3%.
In transform-domain terms, the j-th output sequence is v(j)(X) = u(X) g(j)(X), where
g(j)(X) = g1(j) + g2(j)X + g3(j)X^2 + … + gm+1(j)X^m, j = 1, 2, …, n
are the "generator polynomials" of the code, and all operations are modulo-2. After multiplexing, the code word becomes:
v(X) = v(1)(X^n) + X v(2)(X^n) + … + X^(n−1) v(n)(X^n)
The indeterminate X can be regarded as a "unit-delay operator", the power of X defining the number of time units by which the associated bit is delayed with respect to the initial bit in the sequence.
Example 5.5:
For the (2, 1, 3) encoder of Fig 5.3, the impulse responses were g(1) = (1, 0, 1, 1) and g(2) = (1, 1, 1, 1), so that g(1)(X) = 1 + X^2 + X^3 and g(2)(X) = 1 + X + X^2 + X^3. For the information sequence u = (1, 0, 1, 1, 1), the information polynomial is u(X) = 1 + X^2 + X^3 + X^4, and
v(1)(X) = u(X) g(1)(X) = 1 + X^7
v(2)(X) = u(X) g(2)(X) = 1 + X + X^3 + X^4 + X^5 + X^7
Hence v(1)(X^2) = 1 + X^14 and v(2)(X^2) = 1 + X^2 + X^6 + X^8 + X^10 + X^14, so that X v(2)(X^2) = X + X^3 + X^7 + X^9 + X^11 + X^15, and the code polynomial is
v(X) = v(1)(X^2) + X v(2)(X^2) = 1 + X + X^3 + X^7 + X^9 + X^11 + X^14 + X^15
Hence the code word is v = (1 1, 0 1, 0 0, 0 1, 0 1, 0 1, 0 0, 1 1); this is exactly the same as obtained earlier.
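The polynomial description translates directly into code as well: multiply u(X) by each generator polynomial and interleave the results as v(X) = v(1)(X^n) + X v(2)(X^n) + …. The sketch below (re-stating the GF(2) product helper used in the cyclic-codes section; names are illustrative) reproduces the code word of Example 5.5:

```python
def gf2_poly_mul(a, b):
    # polynomial product over GF(2), coefficients lowest degree first
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] ^= ai & bj
    return c

def conv_encode_poly(u, gens):
    # v(X) = v(1)(X^n) + X v(2)(X^n) + ... + X^(n-1) v(n)(X^n)
    n = len(gens)
    vs = [gf2_poly_mul(u, g) for g in gens]     # v(j)(X) = u(X) g(j)(X)
    code = [0] * (n * max(len(v) for v in vs))
    for j, v in enumerate(vs):                  # interleave: X^j v(j)(X^n)
        for i, bit in enumerate(v):
            code[n * i + j] = bit
    return code

u = [1, 0, 1, 1, 1]                             # u(X) = 1 + X^2 + X^3 + X^4
print(conv_encode_poly(u, [[1, 0, 1, 1], [1, 1, 1, 1]]))
# -> [1,1, 0,1, 0,0, 0,1, 0,1, 0,1, 0,0, 1,1], matching Example 5.5
```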
The generator polynomials of an encoder can be determined directly from its circuit diagram. Specifically, the co-efficient of X^l is a '1' if there is a connection from the l-th shift register stage to the input of the adder of interest, and a '0' otherwise. Since the last stage of the shift register in an (n, 1) code must be connected to at least one output, it follows that at least one generator polynomial must have degree equal to the shift register length m, i.e.
m = Max (1 ≤ j ≤ n) [deg g(j)(X)] …… (5.14)
In an (n, k) code where k > 1 there are n generator polynomials for each of the k inputs, each set representing the connections from one of the shift registers to the n outputs. Hence the length Kl of the l-th shift register is given by
Kl = Max (1 ≤ j ≤ n) [deg gl(j)(X)], 1 ≤ l ≤ k …… (5.15)
where gl(j)(X) is the generator polynomial relating the l-th input to the j-th output, and the encoder memory order m is
m = Max (1 ≤ l ≤ k) Kl = Max (1 ≤ l ≤ k, 1 ≤ j ≤ n) [deg gl(j)(X)] …… (5.16)
Since the encoder is a linear system, with u(l)(X) representing the l-th input sequence and v(j)(X) the j-th output sequence, the generator polynomial gl(j)(X) can be regarded as the 'encoder transfer function' relating input l to output j. For the k-input, n-output linear system there are a total of k·n transfer functions, which can be arranged in a (k × n) "transfer function matrix":
         g1(1)(X)  g1(2)(X)  …  g1(n)(X)
G(X) =   g2(1)(X)  g2(2)(X)  …  g2(n)(X)              …… (5.17)
         ⁝         ⁝             ⁝
         gk(1)(X)  gk(2)(X)  …  gk(n)(X)
Using the transfer function matrix, the encoding equations for an (n, k, m) code can be expressed as
V(X) = U(X) G(X) …… (5.18)
where U(X) = [u(1)(X), u(2)(X), …, u(k)(X)] is the k-vector representing the information polynomials, and V(X) = [v(1)(X), v(2)(X), …, v(n)(X)] is the n-vector representing the encoded sequences. After multiplexing, the code word becomes:
v(X) = v(1)(X^n) + X v(2)(X^n) + … + X^(n−1) v(n)(X^n) …… (5.19)
Example 5.6:
For the (3, 2, 1) encoder of Fig 5.4, the transfer function matrix is
G(X) = [1+X   1    1
        X    1+X   0]
For the information sequence u(1) = (1 0 1), u(2) = (1 1 0), the information polynomials are
u(1)(X) = 1 + X^2, u(2)(X) = 1 + X
and the encoded sequences follow from V(X) = U(X) G(X):
[v(1)(X), v(2)(X), v(3)(X)] = [1 + X^2, 1 + X] G(X) = [1 + X^3, 0, 1 + X^2]
Hence the code word is:
v(X) = v(1)(X^3) + X v(2)(X^3) + X^2 v(3)(X^3) = 1 + X^2 + X^8 + X^9
v = (1 0 1, 0 0 0, 0 0 1, 1 0 0)
This is exactly the same as that obtained in Example 5.3. From Eqs (5.17) and (5.18) it follows that:
v(j)(X) = Σ (i = 1 to k) u(i)(X) gi(j)(X)
Example 5.7:
For the input sequences u(1)(X) = 1 + X^2 and u(2)(X) = 1 + X, we have
v(X) = u(1)(X^3) g1(X) + u(2)(X^3) g2(X) = 1 + X^2 + X^8 + X^9
exactly as obtained before; here gl(X) = gl(1)(X^n) + X gl(2)(X^n) + … + X^(n−1) gl(n)(X^n) is the composite generator polynomial relating the l-th input to the multiplexed output.
In a systematic convolutional code, the first k output sequences are exact replicas of the k input sequences, i.e. v(j) = u(j), j = 1, 2, …, k, so that the generator polynomials satisfy
gi(j)(X) = δi,j, j = 1, 2, …, k …… (5.23)
where δi,j is the Kronecker delta, having the values δi,j = 1 if j = i and δi,j = 0 if j ≠ i.
The generator matrix for such codes is given by
      I P1  O P2  O P3  …  O Pm+1
G =       I P1  O P2  …  O Pm  O Pm+1                 …… (5.24)
              I P1  …  O Pm−1  O Pm  O Pm+1
                  ⋱           ⋱
where I is the k × k identity (unit) matrix, O is the k × k all-zero matrix, and Pi is the k × (n − k) matrix given by:
g 1,i ( k 1 ) g1,i ( k 2 ) g1,i ( n )
( k1 ) ( n )
g2 ,i g2 ,i
( k2 )
g2 ,i
Pi ………………… (5.25)
⁝ ⁝ ⁝ ⁝
( k1 ) ( k2 )
gk ,i gk ,i gk ,i (n)
and the transfer function matrix becomes:

G(X) = [ 1 0 … 0   g_1^(k+1)(X)   …   g_1^(n)(X)
         0 1 … 0   g_2^(k+1)(X)   …   g_2^(n)(X)
         ⁝                ⁝                ⁝
         0 0 … 1   g_k^(k+1)(X)   …   g_k^(n)(X) ] …………….. (5.26)
For a systematic code we require only k·(n − k) generator sequences. Thus systematic codes form a
subclass of the set of all possible convolutional codes. Any code not satisfying Eq (5.22) to Eq (5.26)
is said to be "non-systematic".
Example 5.8:
The generator sequences of a (2, 1, 3) systematic code are g^(1) = (1 0 0 0) and g^(2) = (1 1 0 1); the
generator matrix is:

G = [ 1 1  0 1  0 0  0 1
           1 1  0 1  0 0  0 1
                1 1  0 1  0 0  0 1
                     ⋱ ]
One advantage of systematic codes is that encoding is much simpler than for non-systematic
codes, because less hardware is required. For example, the (2, 1, 3) encoder of Fig 5.6 needs only one
modulo-2 adder while that of Fig 5.5 requires three such adders. Notice also the difference in the total
number of adder inputs required. Further, for systematic (n, k, m) codes with K > n − k, encoding
schemes that normally require fewer than K shift registers exist, as illustrated in the following simple
example.
Example 5.9:
Consider the systematic code with transfer function matrix

G(X) = [ 1   0   1+X+X^2
         0   1   1+X^2 ]
The straightforward realization requires a total of K = K1 + K2 = 2 + 2 = 4 shift register stages and is
shown in Fig 5.7(a). However, since the parity sequence is generated by
v^(3)(X) = u^(1)(X) g_1^(3)(X) + u^(2)(X) g_2^(3)(X), an alternative realization, shown in Fig 5.7(b), can be obtained.
Fig 5.7 Realization of encoder for Example 5.9
In the majority of situations the straightforward realization is the most efficient. However, in the
case of systematic codes, simpler realizations usually exist, as shown in Example 5.9.
Another advantage of systematic codes is that no inverting circuit is needed for recovering
information from the code word. For information recovery from a non-systematic code, inversion is
required in the form of an (n × k) matrix G^(-1)(X) such that

G(X) · G^(-1)(X) = I_k X^l ……………… (5.27)

and the information sequence can be recovered with an l-time-unit delay from the code word by
letting V(X) be the input to the n-input, k-output linear sequential circuit whose transfer function
matrix is G^(-1)(X).
For an (n, 1, m) code, the transfer function matrix G(X) will have a "feed forward" inverse
G^(-1)(X) of delay l units if and only if

GCD [g^(1)(X), g^(2)(X), …, g^(n)(X)] = X^l ……………… (5.29)

for some l >= 0, where GCD denotes the greatest common divisor. For an (n, k, m) code with k > 1,
let Δ_i(X), i = 1, 2, …, C(n, k), be the determinants of the C(n, k) distinct k × k sub-matrices of the
transfer function matrix G(X). Then a feed forward inverse of delay l units exists if and only if

GCD [Δ_i(X), i = 1, 2, …, C(n, k)] = X^l ……………… (5.30)
Example 5.10:
For the (2, 1, 3) encoder of Fig 5.3, we have, from Example 5.5, the generator matrix

G(X) = [ 1 + X^2 + X^3 , 1 + X + X^2 + X^3 ]

The GCD of the two generator polynomials is 1 = X^0, so a feed forward inverse with no delay exists.
Example 5.11:
For the (3, 2, 1) encoder of Fig 5.4, the generator matrix as found in Example 5.6 is:

G(X) = [ 1+X    1     1
          X    1+X    0 ]

The determinants of the 2 × 2 sub-matrices are 1 + X + X^2, X and 1 + X. Their GCD is 1.
A feed forward inverse with no delay exists and can be computed as:

G^(-1)(X) = [ 1+X       X
               X       1+X
             X+X^2    1+X^2 ]   , which satisfies G(X) · G^(-1)(X) = I
Fig 5.8 Feed forward inverse of the (2,1,3) code   Fig 5.9 Feed forward inverse of the (3,2,1) code
To understand what happens when a feed forward inverse does not exist, consider an example
of a (2, 1, 2) encoder with generator matrix

G(X) = [ 1 + X , 1 + X^2 ]

Since the GCD of g^(1)(X) and g^(2)(X) is (1 + X), which is not of the form X^l, a feed forward inverse
does not exist. Suppose the input is the all-one sequence, u(X) = 1/(1 + X) = 1 + X + X^2 + X^3 + …
Then the output sequences are: v^(1)(X) = 1, v^(2)(X) = 1 + X.
That is, the code word contains only three nonzero bits even though the information sequence has
infinite weight. If this code word is transmitted over a BSC and the three nonzero bits are changed to
zeros by the channel noise, the received sequence will be all zeros. A maximum likelihood decoder
(MLD) will then produce the all-zero code word as its estimate, since this is a valid code word and it
agrees exactly with the received sequence. Thus, the estimated information sequence will be
û(X) = 0, implying an infinite number of decoding errors caused by a finite number (only three in this
case) of channel errors. Clearly this is a very undesirable situation: the code is said to be subject to
"catastrophic error propagation" and is called a "catastrophic code".
Equations (5.29) and (5.30) can be shown to be necessary and sufficient conditions for a code
to be 'non-catastrophic'. Hence any code for which a feed forward inverse exists is non-catastrophic.
Another advantage of systematic codes is that they are always non-catastrophic.
Since the encoder is a sequential circuit with memory (m represents the memory order, which we have
defined as the maximum length of any shift register), its output depends on the encoder state as well as
on the current inputs. The encoder state at time unit l, when the encoder inputs are
{u_l^(1), u_l^(2), …, u_l^(k)}, is the binary K-tuple of previous inputs:

{u_{l-1}^(1), u_{l-2}^(1), …, u_{l-K1}^(1); u_{l-1}^(2), u_{l-2}^(2), …, u_{l-K2}^(2); … ; u_{l-1}^(k), u_{l-2}^(k), …, u_{l-Kk}^(k)},

and there are a total of 2^K different possible states, where K = K1 + K2 + … + Kk is the total encoder
memory. For an (n, 1, m) code, K = K1 = m and the encoder state at time unit l is simply
{u_{l-1}, u_{l-2}, …, u_{l-m}}.
Each new block of k inputs causes a transition to a new state. Hence there are 2^k branches
leaving each state, one for each possible input block. For an (n, 1, m) code there are only two
branches leaving each state. On the state diagram, each branch is labeled with the k inputs causing the
transition and the n corresponding outputs. The state diagram for the convolutional encoder of Fig 5.3
is shown in Fig 5.10. A state table is often helpful while drawing the state diagram and is as shown.
State:                S0    S1    S2    S3    S4    S5    S6    S7
Binary description:   000   100   010   110   001   101   011   111
Fig 5.10 State diagram of encoder of Fig 5.3
Recall (or observe from Fig 5.3) that the two output sequences are:
v^(1) = u_l + u_{l-2} + u_{l-3} and
v^(2) = u_l + u_{l-1} + u_{l-2} + u_{l-3}
Until the reader gains some experience, it is advisable to first prepare a transition table using the
output equations and then translate the data on to the state diagram. Such a table is as shown below:
Assuming that the shift registers are initially in the state S0 (the all-zero state), the code word
corresponding to any information sequence can be obtained by following the path through the state
diagram determined by the information sequence and noting the corresponding outputs on the branch
labels. Following the last nonzero block, the encoder is returned to state S0 by a sequence of m all-zero
blocks appended to the information sequence. For example, in Fig 5.10, if u = (11101), the code
word is v = (11, 10, 01, 01, 11, 10, 11, 10); the path followed is shown in thin gray lines with arrows,
with the input bits written alongside in thin gray. The m = 3 zeros appended are indicated in a gray
much lighter than that used for the information bits.
Apart from obtaining the output sequence for a given input sequence, the state diagram can be
modified to provide a complete description of the Hamming weights of all nonzero code words. (That
is, the state diagram is useful in determining a weight- distribution for the code).
This is achieved as follows: the state S0 is split into an initial and a final state. The self loop
around S0 is discarded. Each branch is labeled with a 'branch gain' X^i, where i is the weight
(number of ones) of the n encoded bits on that branch. Each path that connects the initial state to the
final state, diverging from and remerging with state S0 exactly once, represents a nonzero code
word.
Those code words that diverge from and remerge with S0 more than once can be regarded as a
sequence of shorter code words. The "path gain" is the product of the branch gains along a path, and
the weight of the associated code word is the power of X in the path gain. As an illustration, let us
consider the modified state diagram for the (2, 1, 3) code of Fig 5.3 as shown in Fig 5.11 and another
version of the same as shown in Fig 5.12.
Example 5.12:
Consider the encoder shown in Fig 5.15. We shall use this example for discussing further graphical
representations, viz. trees and trellises.
Fig 5.15 (2, 1, 2) convolutional encoder
State transition table for the (2, 1, 2) convolutional encoder of Example 5.12
Previous state (binary)   Input u_l   Next state (binary)   u_l u_{l-1} u_{l-2}   Output
S0 (0 0)                  0           S0 (0 0)              0 0 0                 0 0
S0 (0 0)                  1           S1 (1 0)              1 0 0                 1 1
S1 (1 0)                  0           S2 (0 1)              0 1 0                 1 0
S1 (1 0)                  1           S3 (1 1)              1 1 0                 0 1
S2 (0 1)                  0           S0 (0 0)              0 0 1                 1 1
S2 (0 1)                  1           S1 (1 0)              1 0 1                 0 0
S3 (1 1)                  0           S2 (0 1)              0 1 1                 0 1
S3 (1 1)                  1           S3 (1 1)              1 1 1                 1 0
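The table can be regenerated mechanically. A small Python sketch follows; the tap vectors g^(1) = (1 1 1) and g^(2) = (1 0 1) are inferred from the output columns above and are consistent with every row:

g1 = (1, 1, 1)   # taps for v^(1) = u_l + u_{l-1} + u_{l-2}
g2 = (1, 0, 1)   # taps for v^(2) = u_l + u_{l-2}

def step(state, u):
    """One encoder step: state = (u_{l-1}, u_{l-2}); returns (next_state, (v1, v2))."""
    bits = (u,) + state                     # (u_l, u_{l-1}, u_{l-2})
    v1 = sum(g * b for g, b in zip(g1, bits)) % 2
    v2 = sum(g * b for g, b in zip(g2, bits)) % 2
    return (u, state[0]), (v1, v2)

names = {(0, 0): 'S0', (1, 0): 'S1', (0, 1): 'S2', (1, 1): 'S3'}
for state in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    for u in (0, 1):
        nxt, out = step(state, u)
        print(names[state], 'u=%d ->' % u, names[nxt], 'output', out)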
The state diagram and the augmented state diagram for computing the 'complete path
enumerator function' for the encoder are shown in Fig 5.16.
There are three loops: the self loop l1 at S3 with gain DLI, the loop l2: S1 → S2 → S1 with gain
D L^2 I, and the loop l3: S1 → S3 → S2 → S1 with gain D^2 L^3 I^2. The loops l1 and l2 are
non-touching and their gain product is: l1 l2 = D^2 L^3 I^2. Hence

Δ = 1 − (l1 + l2 + l3) + l1 l2 = 1 − DLI (1 + L)

There are two forward paths: F1: S0 → S1 → S2 → S0 with path gain D^5 L^3 I, and
F2: S0 → S1 → S3 → S2 → S0 with path gain D^6 L^4 I^2. The loop l1 does not touch the forward
path F1, so Δ1 = 1 − l1 = 1 − DLI, while F2 touches all the loops, so Δ2 = 1. By Mason's gain
formula, the complete path enumerator function is

T(D, L, I) = (F1 Δ1 + F2 Δ2) / Δ = D^5 L^3 I / [1 − DLI (1 + L)]
           = D^5 L^3 I + D^6 L^4 I^2 (1 + L) + D^7 L^5 I^3 (1 + L)^2 + …
Thus there is one code word of weight 5 that has length 3 branches and an information sequence
of weight 1; two code words of weight 6, of which one has length 4 branches and an information
sequence of weight 2 and the other has length 5 branches and an information sequence of weight 2;
and so on.
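If sympy is available, the series expansion quoted above can be reproduced symbolically; the following sketch is illustrative only:

import sympy as sp

D, L, I = sp.symbols('D L I')
T = D**5 * L**3 * I / (1 - D*L*I*(1 + L))

# Expand the geometric series up to D^7.
expansion = sp.series(T, D, 0, 8).removeO()
print(sp.expand(expansion))
# -> D**5*I*L**3 + D**6*I**2*(L**4 + L**5) + D**7*I**3*(L**5 + 2*L**6 + L**7), expanded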
Following the procedure just described, we find that the encoded sequence for the information
sequence (10011) is (11, 10, 11, 11, 01), which agrees with the first 5 pairs of bits of the actual encoded
sequence. Since the encoder has a memory m = 2, we require two more input blocks to clear and re-set
the encoder. Hence, to obtain the complete code sequence corresponding to an information sequence of
length kL, the tree graph is to be extended by m time units (n·m additional encoded bits); this extended
part is called the "tail of the tree", and the 2^(kL) rightmost nodes are called the "terminal nodes" of
the tree. Thus the extended tree diagram for the (2, 1, 2) encoder, for the information sequence (10011),
is as in Fig 5.19 and the complete encoded sequence is (11, 10, 11, 11, 01, 01, 11).
Beyond the third branch, the nodes labeled S0 are identical, and so are all the other pairs of nodes that
are identically labeled. Since the encoder has a memory m = 2, it follows that when the third
information bit enters the encoder, the first message bit is shifted out of the register. Consequently,
after the third branch the information sequences (0 0 0 u3 u4 …) and (1 0 0 u3 u4 …) generate the same
code symbols, and the pair of nodes labeled S0 may be joined together. The same logic holds for the
other nodes.
Accordingly, we may collapse the tree graph of Fig 5.18 into a new form, shown in Fig 5.21, called a
"trellis". It is so called because a trellis is a tree-like structure with remerging branches (compare the
trusses and trellises used in building construction).
1. There are no fundamental paths at distance 1, 2 or 3 from the all zero path.
2. There is a single fundamental path at distance 5 from the all zero path. It diverges from the all-zero
path three branches back and it differs from the all-zero path in the single input bit.
3. There are two fundamental paths at a distance 6 from the all zero path. One path diverges from the
all zero path four branches back and the other five branches back. Both paths differ from the all zero
path in two input bits. The above observations are depicted in Fig 8.24(a).
4. There are four fundamental paths at a distance 7 from the all-zero path. One path diverges from the
all zero path five branches back, two other paths six branches back and the fourth path diverges seven
branches back as shown in Fig 8.24(b). They all differ from the all zero path in three input bits. This
information can be compared with those obtained from the complete path enumerator function found
earlier.
THE VITERBI ALGORITHM:
The Viterbi algorithm, when applied to the received sequence r from a DMC, finds the path
through the trellis with the largest metric. At each step, it compares the metrics of all paths entering
each state and stores the path with the largest metric, called the "survivor", together with its metric.
The Algorithm:
Step: 1. Starting at level (i.e. time unit) j = m, compute the partial metric for the single path
entering each node (state). Store the path (the survivor) and its metric for each state.
Step: 2. Increment the level j by 1. Compute the partial metric for all the paths entering a state
by adding the branch metric entering that state to the metric of the connecting survivor
at the preceding time unit. For each state, store the path with the largest metric (the
survivor), together with its metric and eliminate all other paths.
Step: 3. If j < (L + m), repeat Step 2. Otherwise stop.
Notice that although we could also use the tree graph for the above decoding, the number of nodes
at any level of the trellis does not continue to grow as the number of incoming message bits increases;
instead it remains constant at 2^K (2^m for an (n, 1, m) code).
There are 2^K survivors from time unit m up to time unit L, one for each of the 2^K states. After
L time units there are fewer survivors, since there are fewer states while the encoder is returning to the
all-zero state. Finally, at time unit (L + m) there is only one state, the all-zero state, and hence only one
survivor, and the algorithm terminates.
The final survivor v̂ satisfies

M(r | v̂) >= M(r | v) , for all v ≠ v̂.

Thus it is clear that the Viterbi algorithm is optimum in the sense that it always finds the maximum
likelihood path through the trellis. From an implementation point of view, however, it would be very
inconvenient to deal with fractional numbers. Accordingly, the bit metric M(r_i | v_i) = ln P(r_i | v_i)
can be replaced by C2 [ln P(r_i | v_i) + C1], where C1 is any real number and C2 is any positive real
number, so that the metric can be expressed as an integer. Notice that a path v which maximizes

M(r | v) = Σ_{i=1}^{N} M(r_i | v_i) = Σ_{i=1}^{N} ln P(r_i | v_i)

also maximizes Σ_{i=1}^{N} C2 [ln P(r_i | v_i) + C1].
Therefore, it is clear that the modified metrics can be used without affecting the performance of the
Viterbi algorithm. Observe that we can always choose C1 to make the smallest metric zero; C2 can
then be chosen so that all other metrics are approximated by nearest integers. Accordingly, there
can be many sets of integer metrics possible for a given DMC, depending on the choice of C2. The
performance of the Viterbi algorithm now becomes slightly sub-optimal due to the use of modified
metrics approximated by nearest integers. However, the degradation in performance is typically very
low.
Example 5.13:
As an illustration let us consider a binary input-quaternary output DMC shown in Fig 5.23(a).
The bit metrics ln P (ri| vi) are shown in Fig 5.23(b). Choosing C1= − 2.3 and C2 = 7.195 yields the
"integer metric table" shown in Fig 5.23(c).
Now suppose that a code word from the (2,1,2) encoder of Fig 5.15, whose trellis diagram is
shown in Fig 5.21, is transmitted over the DMC of Fig 5.23(a) and the quaternary received sequence is:
In the first time unit (j = 1) there are two branches originating from the state S0, with output
vectors (00) terminating at S0 and (11) terminating at S1. The received sequence in this time unit is
(y3 y4), and using the integer metric table of Fig 5.23(c) we have:
M [r1 | v1^(1)] = M (y3 y4 | 00) = M (y3 | 0) + M (y4 | 0) = 5 + 0 = 5, and
These computations are indicated in Fig 5.24(a). The path discarded is shown by a cross. Note that
the branch metrics are also indicated along the branches within brackets, and the state metrics are
indicated at the nodes.
For j = 2 there are single branches entering each state and the received sequence in this time
unit is (y3 y1). The four branch metrics are computed as below.
The metrics at the four states are obtained by adding the branch metrics to the metrics of the previous
states (survivors) and are shown in Fig 5.24(b).
Fig 5.24 Computation for time units j=1, j=2 and j=3
Next, for j = 3, notice that there are two branches entering each state as shown in Fig 5.24(c).
The received sequence in this time unit is (y3, y2) and the branch metrics are computed as below:
Fig 5.25 Application of Viterbi algorithm
Notice that, in the last step, we have ignored the highest metric computed! Indeed, if the
sequence had continued, we would have to take this into account. However, in the last m time units,
remember that the path must remerge with S0.
From the path that has survived, we observe that the transmitted sequence is:
Notice that "the final m branches in any trellis path always correspond to '0' inputs and
hence are not considered part of the information sequence".
As already mentioned, the MLD reduces to a 'minimum distance decoder' for a BSC.
Hence the distances can be reckoned as metrics, and the algorithm must now find the path through
the trellis with the smallest metric (i.e. the path closest to r in Hamming distance). The details of
the algorithm are exactly the same, except that the Hamming distance replaces the log-likelihood
function as the metric, and the survivor at each state is the path with the smallest metric. The following
example illustrates the concept.
Example 5.14:
Suppose the received sequence r = (01, 10, 10, 11, 01, 01, 11) from the encoder of Fig 5.15 is received
through a BSC. The path traced is shown in Fig 5.26 as dark lines.
Fig 5.26 Viterbi algorithm for a BSC
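For concreteness, here is a hedged Python sketch of hard-decision Viterbi decoding for this (2, 1, 2) encoder (taps assumed to be g^(1) = (1 1 1), g^(2) = (1 0 1), consistent with the state table given earlier). It recovers the information sequence (1 0 0 1 1) from the received sequence of Example 5.14:

def encode_step(state, u, g1=(1, 1, 1), g2=(1, 0, 1)):
    """state = (u_{l-1}, u_{l-2}); returns (next_state, branch output)."""
    bits = (u,) + state
    v = tuple(sum(g * b for g, b in zip(gen, bits)) % 2 for gen in (g1, g2))
    return (u, state[0]), v

def viterbi(received, L):
    """received: list of 2-bit tuples of length L + m; returns the decoded info bits."""
    metric = {(0, 0): 0}                       # start in the all-zero state
    paths = {(0, 0): []}
    for j, r in enumerate(received):
        new_metric, new_paths = {}, {}
        inputs = (0, 1) if j < L else (0,)     # last m = 2 inputs must be zeros
        for s in metric:
            for u in inputs:
                nxt, v = encode_step(s, u)
                d = sum(a != b for a, b in zip(r, v))   # Hamming branch metric
                m = metric[s] + d
                if nxt not in new_metric or m < new_metric[nxt]:
                    new_metric[nxt], new_paths[nxt] = m, paths[s] + [u]
        metric, paths = new_metric, new_paths
    return paths[(0, 0)][:L]                   # survivor ending at S0, tail dropped

r = [(0, 1), (1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (1, 1)]
print(viterbi(r, L=5))    # -> [1, 0, 0, 1, 1], with final path metric 2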
Module – 3
1) Consider a (3,1,2) convolutional encoder with g(1) = (110), g(2) = (101) and g(3) = (111)
(i) Draw the encoder block diagram
(ii) Draw state diagram
(iii) Find the encoder output by traversing through the state diagram for the input sequence of
(11101)
(iv) Obtain the output of the convolutional encoder and draw trellis for the message sequence
(10101) where g(1) = (111), g(2) = (110) and g(3) = (101)
2) Examine encoding and decoding process of turbo codes for an input sequence 101110111011
3) Apply the Viterbi algorithm to decode the received sequence 110111011101 for a given
convolutional code with generator polynomials g1= (1 1 0) and g2 = (101)
4) Examine the encoding and decoding process of turbo codes for the input sequence
1101011011011.
5) (i) Draw the block diagram for the turbo encoder.
(ii) Illustrate the interleaving process and show how it affects the encoded sequence.
(iii) Simulate the decoding process using iterative decoding and compute the
likelihood for the sequence.
6) Apply the Viterbi algorithm to decode the received sequence 111101110111111 for a
convolutional code with generator polynomials:
g(1) = (111) and g(2) = (110).
7) (i) Draw the trellis diagram for the convolutional code.
(ii) Traverse the trellis to find the most likely transmitted message sequence.
(iii) Show how the path metrics are calculated at each step.
10) Consider the convolutional code with the generator polynomial matrix:
g(D) = [ 1   1 + D + D^2 ]
Draw the trellis diagram corresponding to the code. For the received sequence 1000001,
perform Viterbi decoding and obtain the corresponding decoded bits.
14) Examine encoding and decoding process of turbo codes for an input sequence 101110111011
15) A convolutional encoder has the following generating sequence, g0=[1 1 1], g1=[1 0 1].
Apply Viterbi algorithm for the decoding of the received sequence 1101110001100011.
16) Consider the transmitter transmits the message sequence 100000 code word in a transmission
medium. In this medium, some errors occurred due to noise. In a receiver side, we receive the error
code word of 01 10 11 10 00 00. Using Viterbi decoding, find and correct the error. Also, get the
message sequence in the receiver side.
Module 4:
Introduction:
This is the age of universal electronic connectivity, where the activities like hacking,
viruses, electronic fraud are very common. Unless security measures are taken, a network
conversation or a distributed application can be compromised easily.
Network Security has been affected by two major developments over the last several
decades. First one is introduction of computers into organizations and the second one being
introduction of distributed systems and the use of networks and communication facilities for
carrying data between users & computers. These two developments led to ‘computer security’
and ‘network security’, where the computer security deals with collection of tools designed to
protect data and to thwart hackers. Network security measures are needed to protect data during
transmission. But keep in mind that, it is the information and our ability to access that
information that we are really trying to protect and not the computers and networks.
Threat Categories
Acts of human error or failure
Compromises to intellectual property
Deliberate acts of espionage or trespass
Deliberate acts of information extortion
Deliberate acts of sabotage or vandalism
Deliberate acts of theft
Deliberate software attack
Forces of nature
Deviations in quality of service
Technical hardware failures or errors
Technical software failures or errors
Technological obsolescence
Definitions
Computer Security - generic name for the collection of tools designed to protect
data and to thwart hackers
Network Security - measures to protect data during their transmission
Internet Security - measures to protect data during their transmission over a
collection of interconnected networks
our focus is on Internet Security
which consists of measures to deter, prevent, detect, and correct security
violations that involve the transmission & storage of information
Aspects Of Security
consider 3 aspects of information security:
Security Attack
Security Mechanism
Security Service
Security Attack
any action that compromises the security of information owned by an
organization
information security is about how to prevent attacks, or failing that, to
detect attacks on information-based systems
often threat & attack used to mean same thing
there is a wide range of attacks
we can focus on generic types of attacks
Passive
Active
Passive Attack
Active Attack
Interruption
An asset of the system is destroyed or becomes unavailable or unusable. It is an
attack on availability.
Examples:
Modification
When an unauthorized party gains access and tampers an asset. Attack is on
Integrity.
Examples:
Changing data file
Altering a program and the contents of a message
Fabrication
An unauthorized party inserts a counterfeit object into the system. Attack on
Authenticity. Also called impersonation
Examples:
Hackers gaining access to a personal email and sending message
Insertion of records in data files
Insertion of spurious messages in a network
Security Services
It is a processing or communication service that is provided by a system to give a
specific kind of protection to system resources. Security services implement security policies
and are implemented by security mechanisms.
Confidentiality
Authentication
Peer entity authentication: Verifies the identities of the peer entities involved in
communication. Provided for use at the time of connection establishment and during data
transmission. Provides confidence against a masquerade or a replay attack.
Data origin authentication: Assures the authenticity of the source of a data unit, but does not
provide protection against duplication or modification of data units. Supports applications like
electronic mail, where no prior interactions take place between the communicating entities.
Integrity
Security Mechanisms
According to X.800, the security mechanisms are divided into those implemented in a
specific protocol layer and those that are not specific to any particular protocol layer or security
service. X.800 also differentiates reversible & irreversible encipherment mechanisms. A
reversible encipherment mechanism is simply an encryption algorithm that allows data to be
encrypted and subsequently decrypted, whereas irreversible encipherment include hash
algorithms and message authentication codes used in digital signature and message
authentication applications
Specific Security Mechanisms
Incorporated into the appropriate protocol layer in order to provide some of the OSI
security services,
Encipherment: It refers to the process of applying mathematical algorithms for converting
data into a form that is not intelligible. This depends on algorithm used and encryption keys.
Digital Signature: Data appended to, or a cryptographic transformation of, a data unit that
allows a recipient to prove the source and integrity of the data unit and protect against forgery.
Access Control: A variety of techniques used for enforcing access permissions to the system
resources.
Data Integrity: A variety of mechanisms used to assure the integrity of a data unit or stream
of data units.
Authentication Exchange: A mechanism intended to ensure the identity of an entity by
means of information exchange.
Traffic Padding: The insertion of bits into gaps in a data stream to frustrate traffic analysis
attempts.
Routing Control: Enables selection of particular physically secure routes for certain data
and allows routing changes once a breach of security is suspected.
Notarization: The use of a trusted third party to assure certain properties of a data exchange.
Pervasive Security Mechanisms
These are not specific to any particular OSI security service or protocol layer.
Trusted Functionality: That which is perceived to be correct with respect to some criteria.
Security Level: The marking bound to a resource (which may be a data unit) that names or
designates the security attributes of that resource.
Event Detection: It is the process of detecting all the events related to network security.
Security Audit Trail: Data collected and potentially used to facilitate a security audit, which
is an independent review and examination of system records and activities. Security
Recovery: It deals with requests from mechanisms, such as event handling and management
functions, and takes recovery actions.
Model For Network Security
Data is transmitted over network between two communicating parties, who must
cooperate for the exchange to take place. A logical information channel is established by
defining a route through the internet from source to destination by use of communication
protocols by the two parties. Whenever an opponent presents a threat to confidentiality,
authenticity of information, security aspects come into play. Two components are present in
almost all the security providing techniques.
A security-related transformation on the information to be sent making it unreadable
by the opponent, and the addition of a code based on the contents of the message, used to
verify the identity of sender.
Some secret information shared by the two principals and, it is hoped, unknown to the
opponent. An example is an encryption key used in conjunction with the transformation to
scramble the message before transmission and unscramble it on reception
A trusted third party may be needed to achieve secure transmission. It is responsible for
distributing the secret information to the two parties, while keeping it away from any opponent.
It also may be needed to settle disputes between the two parties regarding authenticity of a
message transmission. The general model shows that there are four basic tasks in designing a
particular security service:
1. Design an algorithm for performing the security-related transformation. The algorithm
should be such that an opponent cannot defeat its purpose
2. Generate the secret information to be used with the algorithm
3. Develop methods for the distribution and sharing of the secret information
4. Specify a protocol to be used by the two principals that makes use of the security
algorithm and the secret information to achieve a particular security service.
Apart from this model, various other threats to information systems, like unwanted access, still exist.
Information access threats intercept or modify data on behalf of users who should not have
access to that data. Service threats exploit service flaws in computers to inhibit use by legitimate
users. Viruses and worms are two examples of software attacks; they can be inserted into the system
by means of a disk or across the network. The security mechanisms needed to cope with
unwanted access fall into two broad categories.
Some basic terminologies used
1. CIPHER TEXT - the coded message
2. CIPHER - algorithm for transforming plaintext to cipher text
3. KEY - info used in cipher known only to sender/receiver
4. ENCIPHER (ENCRYPT) - converting plaintext to cipher text
5. DECIPHER (DECRYPT) - recovering plaintext from cipher text
6. CRYPTOGRAPHY - study of encryption principles/methods
7. CRYPTANALYSIS (CODEBREAKING) - the study of principles/ methods of
deciphering cipher text without knowing key
8. CRYPTOLOGY - the field of both cryptography and cryptanalysis
Cryptography
Cryptographic systems are generally classified along 3 independent dimensions:
Type of operations used for transforming plain text to cipher text:
All the encryption algorithms are a based on two general principles: substitution, in
which each element in the plaintext is mapped into another element, and transposition, in
which elements in the plaintext are rearranged.
The number of keys used:
If the sender and receiver use the same key then it is said to be symmetric key (or) single
key (or) conventional encryption. If the sender and receiver use different keys then it is said
to be public key encryption.
The way in which the plain text is processed:
A block cipher processes the input one block of elements at a time, producing an output
block for each input block. A stream cipher processes the input elements continuously,
producing output one element at a time, as it goes along.
Cryptanalysis
The process of attempting to discover X or K or both is known as cryptanalysis. The
strategy used by the cryptanalysis depends on the nature of the encryption scheme and the
information available to the cryptanalyst. There are various types of cryptanalytic attacks
based on the amount of information known to the cryptanalyst.
Cipher text only – A copy of cipher text alone is known to the cryptanalyst.
Known plaintext – The cryptanalyst has a copy of the cipher text and the corresponding
plaintext.
Chosen plaintext – The cryptanalysts gains temporary access to the encryption machine. They
cannot open it to find the key, however; they can encrypt a large number of suitably chosen
plaintexts and try to use the resulting cipher texts to deduce the key.
Chosen cipher text – The cryptanalyst obtains temporary access to the decryption machine,
uses it to decrypt several string of symbols, and tries to use the results to deduce the key.
Substitution Techniques
In which each element in the plaintext is mapped into another element.
1. Caesar Cipher
2. Monoalphabetic cipher
3. Playfair Cipher
4. Hill Cipher
5. Polyalphabetic Cipher
6. One Time Pad
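As a minimal illustration of substitution, the Caesar cipher (the first technique in the list above) replaces each letter by the letter k places further down the alphabet. A short Python sketch:

def caesar(text, k):
    """Shift every letter by k positions (mod 26); non-letters pass through."""
    out = []
    for ch in text.upper():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord('A') + k) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)

c = caesar("MEET ME AFTER THE TOGA PARTY", 3)
print(c)                  # PHHW PH DIWHU WKH WRJD SDUWB
print(caesar(c, -3))      # decryption is just a shift back by k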
Steganography
A plaintext message may be hidden in any one of the two ways. The methods of
steganography conceal the existence of the message, whereas the methods of cryptography
render the message unintelligible to outsiders by various transformations of the text. A simple
form of steganography, but one that is time consuming to construct is one in which an
arrangement of words or letters within an apparently innocuous text spells out the real message.
e.g., (i) the sequence of first letters of each word of the overall message spells out the real
(hidden) message. (ii) Subset of the words of the overall message is used to convey the hidden
message. Various other techniques have been used historically, some of them are:
Drawbacks of Steganography
Requires a lot of overhead to hide a relatively few bits of information.
Once the system is discovered, it becomes virtually worthless.
Conventional Encryption Principles
A Conventional/Symmetric encryption scheme has five ingredients:
1. Plain Text: This is the original message or data which is fed into the algorithm as input.
2. Encryption Algorithm: This performs various substitutions and transformations on the plain text.
3. Secret Key: The key is another input to the algorithm. The substitutions and transformations
performed by the algorithm depend on the key.
4. Cipher Text: This is the scrambled (unreadable) message which is output of the encryption
algorithm. This cipher text is dependent on plaintext and secret key. For a given plaintext, two
different keys produce two different cipher texts.
5. Decryption Algorithm: This is the reverse of encryption algorithm. It takes the cipher text and
secret key as inputs and outputs the plain text.
The important point is that the security of conventional encryption depends on the secrecy of the key,
not the secrecy of the algorithm, i.e. it is not necessary to keep the algorithm secret, but only the key.
This feature, that the algorithm need not be kept secret, made it feasible for widespread use and enabled
manufacturers to develop low-cost chip implementations of data encryption algorithms.
With the use of conventional algorithm, the principal security problem is maintaining the secrecy of
the key.
S-DES Key Generation
S-DES derives two 8-bit subkeys from a 10-bit key shared between sender and receiver. First, the
key is permuted using the P10 function:
P10
3 5 2 7 4 10 1 9 8 6
This table is read from left to right; each position in the table gives the identity of the input bit
that produces the output bit in that position. So the first output bit is bit 3 of the input; the
second output bit is bit 5 of the input, and so on. For example, the key (1010000010) is
permuted to (10000 01100). Next, perform a circular left shift (LS-1), or rotation, separately
on the first five bits and the second five bits. In our example, the result is (00001 11000). Next
we apply P8, which picks out and permutes 8 of the 10 bits according to the following rule:
P8
6 3 7 4 8 5 10 9
The result is subkey 1 (K1). In our example, this yields (10100100). We then go back to the
pair of 5-bit strings produced by the two LS-1 functions and performs a circular left shift of 2
bit positions on each string. In our example, the value (00001 11000) becomes (00100
00011). Finally, P8 is applied again to produce K2. In our example, the result is (01000011).
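The subkey derivation can be checked with a few lines of Python. A sketch (illustrative names; the P10 table used is the standard S-DES one quoted above, which agrees with the worked values in the text):

P10 = (3, 5, 2, 7, 4, 10, 1, 9, 8, 6)
P8  = (6, 3, 7, 4, 8, 5, 10, 9)

def permute(bits, table):
    return ''.join(bits[i - 1] for i in table)   # tables are 1-indexed

def rotl(half, n):
    return half[n:] + half[:n]

def subkeys(key10):
    p = permute(key10, P10)
    l, r = rotl(p[:5], 1), rotl(p[5:], 1)        # LS-1 on each 5-bit half
    k1 = permute(l + r, P8)
    l, r = rotl(l, 2), rotl(r, 2)                # further LS-2 on each half
    k2 = permute(l + r, P8)
    return k1, k2

print(subkeys('1010000010'))   # -> ('10100100', '01000011'), matching K1 and K2 above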
S-DES encryption
Encryption involves the sequential application of five functions.
Initial and Final Permutations The input to the algorithm is an 8-bit block of plaintext,
which we first permute using the IP function:
IP
2 6 3 1 4 8 5 7
This retains all 8 bits of the plaintext but mixes them up.
Consider the plaintext to be 11110011.
Permuted output = 10111101
At the end of the algorithm, the inverse permutation is used:
IP –1
4 1 3 5 7 2 8 6
The most complex component of S-DES is the function fK. The rightmost 4 bits (R) of the
permuted input are first expanded and permuted using E/P:
E/P
4 1 2 3 2 3 4 1
For R = 1101, the E/P output = 11101011. It is clearer to depict the result in this fashion:
The 8-bit subkey K1 = (k11, k12, k13, k14, k15, k16, k17, k18) is added to
this value using exclusive-OR:
The first 4 bits (first row of the preceding matrix) are fed into the S-box S0 to produce a 2- bit
output, and the remaining 4 bits (second row) are fed into S1 to produce another 2- bit output.
These two boxes are defined as follows:
The S-boxes operate as follows. The first and fourth input bits are treated as a 2-bit
number that specifies a row of the S-box, and the second and third input bits specify a
column of the S-box. The entry in that row and column, in base 2, is the 2-bit output. For
example, if (p0,0 p0,3) = (00) and (p0,1 p0,2) = (10), then the output is from row 0, column
2 of S0, which is 3, or (11) in binary. Similarly, (p1,0 p1,3) and (p1,1 p1,2) are used to index
into a row and column of S1 to produce an additional 2 bits. Next, the 4 bits produced by S0
and S1 undergo a further permutation as follows:
P4
2 4 3 1
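The S0 and S1 tables themselves are not reproduced in this excerpt, so the sketch below assumes the standard S-DES S-boxes; they agree with the worked values in the text (row 0, column 2 of S0 is 3):

S0 = [[1, 0, 3, 2], [3, 2, 1, 0], [0, 2, 1, 3], [3, 1, 3, 2]]
S1 = [[0, 1, 2, 3], [2, 0, 1, 3], [3, 0, 1, 0], [2, 1, 0, 3]]
P4 = (2, 4, 3, 1)

def sbox(nibble, S):
    """First and fourth bits select the row, second and third the column."""
    row = int(nibble[0] + nibble[3], 2)
    col = int(nibble[1] + nibble[2], 2)
    return format(S[row][col], '02b')

def f_output(byte):                  # byte = E/P output XOR subkey (8 bits)
    s = sbox(byte[:4], S0) + sbox(byte[4:], S1)
    return ''.join(s[i - 1] for i in P4)   # apply P4 to the 4 S-box output bits

print(sbox('0100', S0))       # row 00, column 10 -> S0[0][2] = 3 -> '11', as in the text
print(f_output('01001111'))   # 11101011 XOR K1 = 10100100 from the running example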
1. Initial permutation (IP - defined in table 2.1) rearranging the bits to form the
“permuted input”.
2. Followed by 16 iterations of the same function (substitution and permutation). The output
of the last iteration consists of 64 bits which is a function of the plaintext and key. The left
and right halves are swapped to produce the pre-output.
3. Finally, the pre-output is passed through a permutation (IP−1 - defined in table 2.1) which
is simply the inverse of the initial permutation (IP). The output of IP−1 is the 64- bit cipher
text
As figure shows, the inputs to each round consist of the Li , Ri pair and a 48 bit subkey which
is a shifted and contracted version of the original 56 bit key. The use of the key can be seen in
the right hand portion of figure 2.2: • Initially the key is passed through a permutation function
(PC1 - defined in table 2.2) • For each of the 16 iterations, a subkey (Ki) is produced by a
combination of a left circular shift and a permutation (PC2 - defined in table 2.2) which is the
same for each iteration. However, the resulting subkey is different for each iteration because of repeated
shifts.
34
Details Of Individual Rounds
The main operations on the data are encompassed into what is referred to as the cipher function
and is labeled F. This function accepts two different length inputs of 32 bits and 48 bits and
outputs a single 32 bit number. Both the data and key are operated on in parallel, however the
operations are quite different. The 56 bit key is split into two 28 bit halves Ci and Di (C and D
being chosen so as not to be conf sed with L and R). The value of the key used in any round is
simply a left cyclic shift and a permuted contraction of that used in the previous round.
Mathematically, this can be written as
Ci = LCSi(Ci−1), Di = LCSi(Di−1)
Ki = PC2(Ci, Di)
where LCSi is the left cyclic shift for round i, Ci and Di are the outputs after the shifts, PC2(.)
is a function which permutes and compresses a 56 bit number into a 48 bit number, and Ki is
the actual key used in round i. The number of shifts is either one or two and is determined by
the round number i. For i = {1, 2, 9, 16} the number of shifts is one and for every other round
it is two
Advanced Encryption Algorithm (AES)
AES is a block cipher with a block length of 128 bits.
AES allows for three different key lengths: 128, 192, or 256 bits. Most of our
discussion will assume that the key length is 128 bits.
Encryption consists of 10 rounds of processing for 128-bit keys, 12 rounds for
192-bit keys, and 14 rounds for 256-bit keys.
Except for the last round in each case, all other rounds are identical.
Each round of processing includes one single-byte based substitution step, a row-
wise permutation step, a column-wise mixing step, and the addition of the round
key. The order in which these four steps are executed is different for encryption and
decryption.
To appreciate the processing steps used in single round, it is best to think of a
128-bit block as consisting of a 4 × 4 matrix of bytes, rearranged as follows:
Therefore, the first four bytes of a 128-bit input block occupy the first column in the 4
× 4 matrix of bytes. The next four bytes occupy the second column, and so on.
The 4×4 matrix of bytes shown above is referred to as the state array in AES.
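The column-major mapping is easy to mis-read, so a two-line Python sketch may help (names are illustrative):

def to_state(block16):
    """block16: list of 16 byte values; returns state[row][col] filled column by column."""
    return [[block16[4 * col + row] for col in range(4)] for row in range(4)]

block = list(range(16))          # bytes 0..15 of a 128-bit block
state = to_state(block)
print(state[0])                  # row 0 = [0, 4, 8, 12]: one byte from each column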
The algorithm begins with an Add round key stage followed by 9 rounds of four stages and a
tenth round of three stages.
This applies for both encryption and decryption with the exception that each stage of a round
the decryption algorithm is the inverse of its counterpart in the encryption algorithm.
The four stages are as follows: 1. Substitute bytes 2. Shift rows 3. Mix Columns 4. Add
Round Key
Substitute Bytes
This stage (known as SubBytes) is simply a table lookup using a 16 × 16 matrix of
byte values called an s-box.
This matrix consists of all the possible combinations of an 8-bit sequence (2^8 = 16
× 16 = 256).
However, the s-box is not just a random permutation of these values and there is a
well defined method for creating the s-box tables.
The designers of Rijndael showed how this was done unlike the s-boxes in DES
for which no rationale was given. Our concern will be how state is affected in
each round.
For this particular round each byte is mapped into a new byte in the following way:
the leftmost nibble of the byte is used to specify a particular row of the s-box and
the rightmost nibble specifies a column.
For example, the byte {95} (curly brackets represent hex values in FIPS PUB
197) selects row 9 column 5 which turns out to contain the value {2A}.
This is then used to update the state matrix.
Public-Key Cryptography
The development of public-key cryptography is the greatest and perhaps the only
true revolution in the entire history of cryptography. It is asymmetric, involving the use
of two separate keys, in contrast to symmetric encryption, which uses only one key. Public
key schemes are neither more nor less secure than private key (security depends on the
key size for both). Public-key cryptography complements rather than replaces symmetric
cryptography. Both also have issues with key distribution, requiring the use
of some suitable protocol. The concept of public-key cryptography evolved from an
attempt to attack two of the most difficult problems associated with symmetric
encryption:
1.) key distribution – how to have secure communications in general without having to
trust a KDC with your key
2.) digital signatures – how to verify a message comes intact from the claimed sender
Public-key/two-key/asymmetric cryptography involves the use of two keys:
Public-Key algorithms rely on one key for encryption and different but related key
for decryption. These algorithms have the following important characteristics:
it is computationally infeasible to find decryption key knowing only
algorithm & encryption key
it is computationally easy to en/decrypt messages when the relevant
(en/decrypt) key is known
either of the two related keys can be used for encryption, with the other
used for decryption (for some algorithms like RSA)
The following figure illustrates public-key encryption process and shows that a public-
key encryption scheme has six ingredients: plaintext, encryption algorithm, public &
private keys, cipher text & decryption algorithm.
The essential steps involved in a public-key encryption scheme are given below:
1.) Each user generates a pair of keys to be used for encryption and decryption.
2.) Each user places one of the two keys in a public register and the other key is kept private.
3.) If B wants to send a confidential message to A, B encrypts the message using A’s public
key.
4.) When A receives the message, she decrypts it using her private key. Nobody else can
decrypt the message because that can only be done using A’s private key (Deducing a
private key should be infeasible).
5.) If a user wishes to change his keys –generate another pair of keys and publish the
public one: no interaction with other users is needed. Notations used in Public-key
cryptography:
The public key of user A will be denoted KUA.
The private key of user A will be denoted KRA.
Encryption method will be a function E.
Decryption method will be a function D.
If B wishes to send a plain message X to A, then he sends the
cryptotext Y=E(KUA,X)
The intended receiver A will decrypt the message: D(KRA,Y)=X
For authentication, B can instead encrypt with his own private key, Y = E(KRB, X); A can then
verify that the message comes from B (he is the only one who knows his private key) by
decrypting with B's public key: X = D(KUB, Y).
Applications For Public-Key Cryptosystems:
1.) Encryption/decryption: sender encrypts the message with the receiver’s public key.
2.) Digital signature: sender “signs” the message (or a representative part of the
message) using his private key
3.) Key exchange: two sides cooperate to exchange a secret key for later use in a
secret-key cryptosystem.
RSA is the best known, and by far the most widely used general public key
encryption algorithm, and was first published by Rivest, Shamir & Adleman of MIT in 1978
[RIVE78]. Since that time RSA has reigned supreme as the most widely accepted and
implemented general-purpose approach to public-key encryption. The RSA scheme is a
block cipher in which the plaintext and the ciphertext are integers between 0 and n-1 for
some fixed n and typical size for n is 1024 bits (or 309 decimal digits). It is based on
exponentiation in a finite (Galois) field over integers modulo a prime, using large integers
(eg. 1024 bits). Its security is due to the cost of factoring large numbers. RSA involves a
public-key and a private-key, where the public key is known to all and is used to encrypt
data or messages. The data or message which has been encrypted using a public key can
only be decrypted by using its corresponding private-key. Each user generates a key pair
i.e. public and private key using the following steps:
each user selects two large primes at random - p, q
compute their system modulus n = p·q
calculate ø(n), where ø(n) = (p-1)(q-1)
select at random the encryption key e, where 1 < e < ø(n) and gcd(e, ø(n)) = 1
solve the following equation to find the decryption key d: e·d = 1 mod ø(n) and 0 ≤ d ≤ n
publish their public encryption key: KU = {e, n}
keep secret the private decryption key: KR = {d, n}
Both the sender and receiver must know the values of n and e, and only the receiver
knows the value of d. Encryption and Decryption are done using the following equations.
To encrypt a message M the sender:
– obtains the public key of the recipient KU = {e, n}
– computes: C = M^e mod n, where 0 ≤ M < n
To decrypt the ciphertext C the owner:
– uses their private key KR = {d, n}
– computes: M = C^d mod n = (M^e)^d mod n = M^(ed) mod n
For this algorithm to be satisfactory, the following requirements are to be met:
a) It is possible to find values of e, d, n such that M^(ed) = M mod n for all M < n.
b) It is relatively easy to calculate M^e and C^d for all values of M < n.
c) It is infeasible to determine d given only e and n.
The way RSA works is based on number theory. Fermat's little theorem: if p is
prime and a is a positive integer not divisible by p, then a^(p-1) ≡ 1 mod p. Corollary: for
any positive integer a and prime p, a^p ≡ a mod p.
Fermat's theorem, as useful as it will turn out to be, does not provide us with the integers
d, e we are looking for; Euler's theorem (a refinement of Fermat's) does. Euler's function
associates to any positive integer n a number φ(n): the number of positive integers
smaller than n and relatively prime to n. For example, φ(37) = 36, i.e. φ(p) = p-1 for any
prime p. For any two primes p, q, φ(pq) = (p-1)(q-1). Euler's theorem: for any relatively
prime integers a, n we have a^φ(n) ≡ 1 mod n. Corollary: for any integers a, n we have
a^(φ(n)+1) ≡ a mod n. Corollary: let p, q be two odd primes and n = pq. Then φ(n) = (p-1)(q-1),
and for any integer m with 0 < m < n, m^((p-1)(q-1)+1) ≡ m mod n; for any integers k, m with
0 < m < n, m^(k(p-1)(q-1)+1) ≡ m mod n. Euler's theorem provides us the numbers d, e such
that M^(ed) = M mod n. We have to choose d, e such that ed = kφ(n) + 1, or equivalently,
d ≡ e^(-1) mod φ(n).
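A toy walk-through of the key-generation and encryption steps above, using deliberately small primes (p = 7, q = 17 are arbitrary illustrative choices; real RSA uses moduli of 1024 bits or more):

from math import gcd

p, q = 7, 17
n = p * q                 # n = 119
phi = (p - 1) * (q - 1)   # phi(n) = 96
e = 5                     # 1 < e < phi and gcd(e, phi) = 1
assert gcd(e, phi) == 1
d = pow(e, -1, phi)       # d = 77, since 5 * 77 = 385 = 4 * 96 + 1

M = 19                    # plaintext, 0 <= M < n
C = pow(M, e, n)          # encryption: C = M^e mod n = 66
assert pow(C, d, n) == M  # decryption: M = C^d mod n
print(n, d, C)            # 119 77 66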
Security of RSA
There are three main approaches of attacking RSA algorithm.
Brute force key search (infeasible given size of numbers) As explained before,
involves trying all possible private keys. Best defense is using large keys.
Mathematical attacks (based on difficulty of computing ø(N), by factoring modulus N)
There are several approaches, all equivalent in effect to factoring the product of two
primes. Some of them are given as:
– factor N=p.q, hence find ø(N) and then d
– find d directly
The possible defense would be using large keys and choosing p and q carefully: p and q should
differ in length by only a few digits, both should be on the order of magnitude 10^75
to 10^100, and gcd(p-1, q-1) should be small.
Diffie-Hellman Key Exchange
Diffie-Hellman key exchange (D-H) is a cryptographic protocol that allows two parties
that have no prior knowledge of each other to jointly establish a shared secret key over
an insecure communications channel. This key can then be used to encrypt subsequent
communications
First, a primitive root of a prime number p can be defined as one whose powers generate
all the integers from 1 to p-1. If a is a primitive root of the prime number p, then the
numbers a mod p, a^2 mod p, ..., a^(p-1) mod p are distinct and consist of the integers from
1 through p-1 in some permutation.
For any integer b and a primitive root a of prime number p, we can find a unique exponent
i such that b ≡ a^i mod p, where 0 ≤ i ≤ p-1; the exponent i is referred to as the discrete
logarithm of b for the base a, mod p.
1) Let p = 37 and g = 13. Let Alice pick a = 10. Alice calculates 13^10 (mod 37), which is 4, and
sends that to Bob. Let Bob pick b = 7. Bob calculates 13^7 (mod 37), which is 32, and sends that
to Alice. (Note: 10 and 7 are secret to Alice and Bob, respectively, but both 4 and 32 are known
to all.) Alice calculates 32^10 (mod 37), which is 30, the secret key. Bob calculates 4^7 (mod 37),
which is also 30.
2) Let p = 47 and g = 5. Let Alice pick a = 18. Alice calculates 5^18 (mod 47), which is 2, and
sends that to Bob. Let Bob pick b = 22. Bob calculates 5^22 (mod 47), which is 28, and sends
that to Alice. Alice calculates 28^18 (mod 47), which is 24, the secret key. Bob calculates
2^22 (mod 47), which is also 24.
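Worked example 1 can be reproduced in a few lines of Python:

p, g = 37, 13
a, b = 10, 7                 # Alice's and Bob's secret exponents

YA = pow(g, a, p)            # 13^10 mod 37 = 4, sent to Bob
YB = pow(g, b, p)            # 13^7  mod 37 = 32, sent to Alice

K_alice = pow(YB, a, p)      # 32^10 mod 37 = 30
K_bob   = pow(YA, b, p)      # 4^7   mod 37 = 30
assert K_alice == K_bob == 30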
3. Darth intercepts YA and transmits YD1 to Bob. Darth also calculates K2 = (YA)^XD2 mod q.
6. Darth intercepts YB and transmits YD2 to Alice. Darth calculates K1 = (YB)^XD1 mod q.
3. Darth sends Bob E(K1, M) or E(K1, M'), where M' is any message. In the first case, Darth
simply wants to eavesdrop on the communication without altering it. In the second case,
Darth wants to modify the message going to Bob.
The key exchange protocol is vulnerable to such an attack because it does not authenticate
the participants. This vulnerability can be overcome with the use of digital signatures and
public- key certificates.
Authentication Requirements
In the context of communications across a network, the following eight attacks can be identified:
1. Disclosure
2. Traffic analysis
3. Masquerade
4. Content modification
5. Sequence modification
6. Timing modification
7. Source repudiation
8. Destination repudiation
Message Authentication
MESSAGE ENCRYPTION:
Message encryption by itself can provide a measure of authentication. The analysis differs
for conventional and public-key encryption schemes. The message must have come from
the sender itself, because the ciphertext can be decrypted using his (secret or public) key.
Also, none of the bits in the message have been altered because an opponent does not
know how to manipulate the bits of the ciphertext to induce meaningful changes to the
plaintext. Often one needs alternative authentication schemes than just encrypting the
message.
Sometimes one needs to avoid encryption of full messages due to legal requirements.
Encryption and authentication may be separated in the system architecture.
The data (e.g., message, record, file, or program) to be authenticated are grouped into
contiguous 64-bit blocks: D1, D2,..., DN. If necessary, the final block is padded on the right
with zeroes to form a full 64-bit block. Using the DES encryption algorithm, E, and a secret
key, K, a data authentication code (DAC) is calculated as follows:

O1 = E(K, D1)
O2 = E(K, [D2 ⊕ O1])
O3 = E(K, [D3 ⊕ O2])
⋮
ON = E(K, [DN ⊕ ON−1])
The DAC consists of either the entire block ON or the leftmost M bits of the block, with 16
≤ M ≤ 64
Use of MAC needs a shared secret key between the communicating parties and also MAC
does not provide digital signature. The following table summarizes the confidentiality
and authentication implications of the approaches shown above.
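The chaining structure (and only the structure) is sketched below; a toy keyed mixing function stands in for DES, since the point here is the O_i = E(K, [D_i ⊕ O_{i-1}]) recursion, not the cipher itself:

def toy_E(key, block):
    """Stand-in for DES: NOT a secure cipher, just a keyed 64-bit mixing step."""
    x = (block ^ key) * 0x9E3779B97F4A7C15 & (2**64 - 1)
    return x ^ (x >> 29)

def dac(key, data, m_bits=64):
    """Group data into 64-bit blocks (zero-padded) and chain them CBC-style."""
    blocks = [int.from_bytes(data[i:i + 8].ljust(8, b'\0'), 'big')
              for i in range(0, len(data), 8)]
    o = 0
    for d in blocks:
        o = toy_E(key, d ^ o)                  # O_i = E(K, D_i XOR O_{i-1})
    return o >> (64 - m_bits)                  # keep the leftmost M bits

print(hex(dac(0x0123456789ABCDEF, b'message to authenticate')))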
HASH FUNCTION
Rotated XOR (RXOR) – before each block is XORed in, the running hash value is rotated to the
left by 1 bit.
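A sketch of this rotated-XOR scheme, assuming 16-bit blocks for illustration:

def rxor_hash(data, width=16):
    """Fold the data into a width-bit hash, rotating left by 1 bit before each XOR."""
    h = 0
    mask = (1 << width) - 1
    step = width // 8
    for i in range(0, len(data), step):
        block = int.from_bytes(data[i:i + step].ljust(step, b'\0'), 'big')
        h = ((h << 1) | (h >> (width - 1))) & mask   # rotate left by 1 bit
        h ^= block                                   # then XOR in the block
    return h

print(hex(rxor_hash(b'hello world')))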
Digital signature:
➢ It is an authentication mechanism that allows the sender to attach an electronic code
with the message. This electronic code acts as the signature of the sender and hence, is
named digital signature.
➢ It is done to ensure its authenticity and integrity.
➢ Digital signature uses the public-key cryptography technique. The sender uses his or
her private key and a signing algorithm to create a digital signature and the signed
document can be made public. The receiver, uses the public key of the
sender and a verifying algorithm to verify the digital signature.
➢ A normal message authentication scheme protects the two communicating parties
against attacks from a third party (intruder). However, a secure digital signature
scheme protects the two parties against each other also.
➢ Suppose A wants to send a signed message (message with A's digital signature) to B
through a network. For this, A encrypts the message using his or her private key, which
results in a signed message. The signed message is then sent through the network to B.
➢ Now, B attempts to decrypt the received message using A's public key in order to
verify that the received message has really come from A.
➢ If the message gets decrypted, B can believe that the message is from A. However, if
the message or the digital signature has been modified during transmission, it cannot be
decrypted using A's public key. From this, B can conclude that either
the message transmission has been tampered with, or that the message has not been
generated by A.
Message integrity:
➢ Digital signatures also provide message integrity.
➢ If a message has a digital signature, then any change in the message after the
signature is attached will invalidate the signature.
➢ That is, it is not possible to get the same signature if the message is changed.
Moreover, there is no efficient way to modify a message and its signature such that a
new message with a valid signature is produced.
Non-repudiation:
➢ Digital signatures also ensure non-repudiation.
➢ For example, if A has sent a signed message to B, then in future A cannot deny about
the sending of the message. B can keep a copy of the message along with A's signature.
➢ In case A denies, B can use A’s public key to generate the original message. If the
newly created message is the same as that initially sent by A, it is proved that the
message has been sent by A only.
In the same way, B can never create a forged message bearing A's digital signature,
because only A can create his or her digital signatures with the help of that private key.
Message confidentiality:
➢ Digital signatures do not provide message confidentiality, because anyone knowing
the sender's public key can decrypt the message.
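A toy sign/verify sketch of the flow just described, reusing the small RSA parameters from the earlier sketch (n = 119, e = 5, d = 77); the "hash" here is a placeholder to keep the example short, whereas a real scheme signs a cryptographic hash:

n, e, d = 119, 5, 77

def toy_hash(message):
    return sum(message) % n            # placeholder only, not collision-resistant

def sign(message):
    return pow(toy_hash(message), d, n)        # A signs with the private key d

def verify(message, signature):
    return pow(signature, e, n) == toy_hash(message)   # B checks with the public key e

msg = b'transfer 100 to Bob'
s = sign(msg)
print(verify(msg, s))                      # True
print(verify(b'transfer 900 to Bob', s))   # False: any change invalidates the signature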
4) Describe briefly the digital signature process and describe how it ensures non-repudiation,
with a suitable example.
5) Explain the key components of the Diffie-Hellman algorithm?
6) Write a note on Digital Signature.
7) Explain the working of a message authentication code with an example.
8) In a public-key system using RSA, you intercept the ciphertext C =10 sent to a user whose
public key is e=5, n=35. What is the plaintext M?
9) Explain why HMAC is preferred over simple MAC schemes.
10) Give an overview of digital signature.
11) Explain the main requirements of public key cryptography.
12) Compare the security features of MACs and hash functions.
13) Explain the working of a message authentication code with an example.
14) User Alice & Bob exchange the key using Diffie Hellman algorithm. Assume α=5 q=83 XA=6
XB=10. Find YA, YB, K.
15) Comment on the security of HASH functions.
16) Demonstrate the RSA encryption and decryption process with an example that uses small
values for simplicity.
17) Describe the digital signature creation and verification process, and explain how it ensures
authentication and integrity with an example.
18) Outline the Diffie-Hellman key exchange algorithm, including its key parameters and the
steps involved.
19) Write a detailed note on the role of digital signatures in ensuring data integrity and
authenticity.
20) Illustrate the concept of a message authentication code (MAC) with an example of how it
ensures data security.
21) In an RSA system, the intercepted ciphertext is C = 15, and the public key of the
user is e = 7, n = 33. Determine the plaintext M.
22) Discuss the advantages of HMAC over traditional MAC schemes, focusing on security and
efficiency.
23) Provide an overview of digital signature standards (DSS) and their applications in secure
communication.
24) List and explain the fundamental principles of public key cryptography, such as key pairs and
asymmetric encryption.
25) Compare the design goals and use cases of MACs and cryptographic hash functions,
highlighting their differences.
26) Describe the working of a keyed-hash message authentication code (HMAC) with an example
to demonstrate its use.
27) Explain how the Elliptic Curve Cryptography (ECC) can be an alternative to RSA for public-key
encryption and digital signatures.
28) Discuss the differences between digital signatures and message authentication codes, with
examples of their applications.