C&C Combined Module Notes

Module 1 covers Information Theory and Source Coding, introducing concepts such as entropy, mutual information, and various coding techniques including Huffman, Arithmetic, Lempel-Ziv, and Run Length coding. It emphasizes the quantification of information, uncertainty, and the relationship between entropy and mutual information, along with practical applications in communication systems. The module includes exercises and examples to illustrate these concepts and their applications in coding efficiency and data compression.

Module 1

Information Theory and Source Coding


Syllabus :
Introduction to Information Theory, Uncertainty and Information, Entropy, Mutual information, Relationship
between entropy and mutual information, Shannon Fano coding.
Source Coding Techniques: Huffman Coding, Arithmetic coding, Lempel-Ziv Coding, Run length coding.

Text Book: Bose, Ranjan. Information Theory, Coding and Cryptography, 3rd Edition, Tata McGraw-Hill Education, 2015, ISBN: 978-9332901257

General Introduction to Information Theory


Information Theory is a branch of mathematics that deals with the quantification of information. It
provides a framework for understanding how information is transmitted, stored, and processed.
Uncertainty and Information
 Uncertainty: The degree of unpredictability associated with an event.
 Information: The reduction in uncertainty.
Entropy
 Entropy (H): A measure of the average amount of information contained in a message.
o Higher entropy indicates greater uncertainty.
o Lower entropy indicates less uncertainty.
Formula for Entropy:
H(X) = -∑ P(xi) log₂ P(xi)
where:
 X is a random variable.
 P(xi) is the probability of the i-th outcome.
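As a quick illustration, the entropy formula translates directly into a few lines of Python (a minimal sketch; the function name and example values are our own, not from the textbook):

import math

def entropy(probs, base=2):
    # H(X) = -sum p_i * log_b(p_i); zero-probability outcomes contribute nothing
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# The three-symbol source solved later in this module: H = 1.5 bits
print(entropy([0.25, 0.25, 0.5]))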
Mutual Information
 Mutual Information (I): A measure of the shared information between two random variables.
 It quantifies the reduction in uncertainty about one random variable when the other is known.
Formula for Mutual Information:
I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)
where:
 H(X|Y) is the conditional entropy of X given Y.
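To make the definition concrete, the following sketch computes I(X;Y) from a joint probability table using the equivalent sum form I(X;Y) = ∑ p(x,y) log₂[p(x,y)/(p(x)p(y))] (the function name and test tables are our own illustrations):

import math

def mutual_information(joint):
    # joint[x][y] is p(x, y); the marginals are row and column sums
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # independent: 0.0
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # noiseless channel: 1.0 bit

The two test cases illustrate the properties listed below: independence gives I = 0, and a deterministic relationship gives I equal to the entropy of one variable.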
Relationship between Entropy and Mutual Information
 Mutual information is always non-negative.
 Mutual information is zero if and only if the two random variables are independent.
 Mutual information is equal to the entropy of one variable if the other variable is a deterministic function of the first.
Shannon Fano Coding
 A source coding technique that assigns variable-length codes to symbols based on their
probabilities.
 The codes are constructed such that more probable symbols have shorter codes.
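The splitting idea behind Shannon-Fano coding can be sketched in Python as follows (an illustrative sketch only; textbooks differ on the exact tie-breaking rule, and here the split point is chosen to balance the probabilities of the two halves):

def shannon_fano(symbols):
    # Shannon-Fano codes for (symbol, probability) pairs
    items = sorted(symbols, key=lambda sp: sp[1], reverse=True)
    codes = {s: "" for s, _ in items}

    def split(group):
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):       # most balanced split point
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_i, best_diff = i, diff
        for s, _ in group[:best_i]:
            codes[s] += "0"                  # upper group gets a 0 ...
        for s, _ in group[best_i:]:
            codes[s] += "1"                  # ... lower group gets a 1
        split(group[:best_i])
        split(group[best_i:])

    split(items)
    return codes

# The probabilities of Exercise 22 below; this yields A:0, B:10, C:110, D:111
print(shannon_fano([("A", 0.4), ("B", 0.3), ("C", 0.2), ("D", 0.1)]))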
Source Coding Techniques
Huffman Coding
 A greedy algorithm that constructs optimal prefix-free codes.
 It involves building a binary tree based on the probabilities of the symbols.
 The codes are assigned by traversing the tree from the root to the corresponding leaf.
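A minimal Python sketch of the greedy construction follows (our own illustration; it stores each subtree's partial codes in a dictionary rather than an explicit tree):

import heapq

def huffman(probs):
    # Repeatedly merge the two least probable subtrees;
    # a running counter breaks probability ties in the heap.
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

# The four-symbol source used in the channel-coding discussion of Module 2:
# average length 1.75 bits/symbol = H(S)
print(huffman({"s1": 0.5, "s2": 0.25, "s3": 0.125, "s4": 0.125}))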
Arithmetic Coding
 A source coding technique that represents a sequence of symbols as a single real number.
 It achieves higher compression efficiency than Huffman coding for many sources.
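The interval-narrowing idea can be sketched as below (a float-based illustration of our own; practical arithmetic coders use integer arithmetic with renormalisation to avoid precision loss):

def arithmetic_encode(message, probs):
    # Narrow [low, high) once per symbol; any number inside the final
    # interval identifies the whole message.
    ranges, cum = {}, 0.0
    for s, p in probs.items():        # e.g. A:[0,0.4) B:[0.4,0.8) C:[0.8,1)
        ranges[s] = (cum, cum + p)
        cum += p
    low, high = 0.0, 1.0
    for s in message:
        span = high - low
        low, high = low + span * ranges[s][0], low + span * ranges[s][1]
    return (low + high) / 2           # any tag in [low, high) will do

# The message and model from Exercise 15 below
print(arithmetic_encode("ABABAB", {"A": 0.4, "B": 0.4, "C": 0.2}))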
Lempel-Ziv Coding
 A class of algorithms that exploit repeated patterns in the input data.
 The data is compressed by replacing repeated sequences with pointers to previously seen
occurrences.
 Lempel-Ziv-Welch (LZW) is a popular variant of Lempel-Ziv coding.
Run Length Coding
 A simple compression technique that replaces sequences of identical symbols with a pair of values:
the symbol and the number of consecutive occurrences.
 It is effective for data with long runs of the same symbol.
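A minimal encode/decode pair makes the technique obvious (a sketch; the helper names are ours):

from itertools import groupby

def rle_encode(data):
    # Each run of identical symbols becomes a (symbol, count) pair
    return [(sym, len(list(run))) for sym, run in groupby(data)]

def rle_decode(pairs):
    return "".join(sym * n for sym, n in pairs)

pairs = rle_encode("AAAABBBCCD")
print(pairs)                # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(pairs))    # 'AAAABBBCCD'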
Note: This is just a brief overview of the concepts involved in information theory and source coding. For a deeper understanding, it is recommended to explore textbooks and online resources.
Information Theory:
• Information theory applies the laws of probability theory, and mathematics in general, to study the
collection and processing of information.
• In the context of communication systems, information theory, originally called the mathematical
theory of communication, deals with mathematical modelling and analysis of communication
systems, rather than with physical sources and physical channels.
Can we measure information?
• Consider the two following sentences:
1. There is a traffic jam near New Horizon College of Engineering.
2. There is a traffic jam near New Horizon College of Engineering, near Gate No 3.
Sentence 2 seems to carry more information than sentence 1. From the semantic viewpoint,
sentence 2 provides more useful information.
Information theory is the scientific study of information and of the communication systems designed to handle it
(information).
• Including telegraphy, radio communications, and all other systems concerned with the
processing and/or storage of signals.
• In particular, Information Theory provides answers to the following two fundamental questions:
• What is the minimum number of bits per symbol required to fully represent the source?—
Entropy of the source
• What is the ultimate transmission rate for reliable communication over a noisy channel?—
Capacity of a channel
An information source is an object that produces an event, the outcome of which is random and in
accordance with some probability distribution.
A practical information source in a communication system is a device that produces messages. It can be
either analogue or digital.
Here, we shall deal mainly with the discrete sources, since the analogue sources can be transformed to
discrete sources through the use of sampling and quantisation techniques.
A discrete information source is a source that has only a finite set of symbols as possible outputs. The set
of possible source symbols is called the source alphabet, and the elements of the set are called symbols
ENTROPY:
Conditions of Occurrence of Events
If we consider an event, there are three conditions of occurrence.
 If the event has not occurred, there is a condition of uncertainty.
 If the event has just occurred, there is a condition of surprise.
 If the event has occurred, a time back, there is a condition of having some information.
These three events occur at different times. The differences in these conditions help us gain knowledge on
the probabilities of the occurrence of events.
Entropy: When we observe the possibilities of the occurrence of an event, and how surprising or uncertain it would be, we are trying to get an idea of the average information content from the source of the event.

Entropy can be defined as a measure of the average information content per source symbol:

H = -∑ pi logb(pi)

where pi is the probability of the occurrence of symbol i from a given stream of symbols and b is the base of the logarithm used. Hence, this is also called Shannon's Entropy.

Conditional Entropy: The amount of uncertainty remaining about the channel input after observing the channel output is called the Conditional Entropy. It is denoted by H(X|Y).
Example:
Consider a diskette storing a data file consisting of 100,000 binary digits (binits), i.e., a total of 100,000 "0"s and "1"s. If the binits 0 and 1 occur with probabilities ¼ and ¾ respectively, then binit 0 conveys an amount of information equal to log2(4/1) = 2 bits, while binit 1 conveys information amounting to log2(4/3) ≈ 0.42 bit.
The quantity H is called the entropy of a discrete memory-less source. It is a measure of the average
information content per source symbol. It may be noted that the entropy H depends on the probabilities of
the symbols in the alphabet of the source.
Example
Consider a discrete memory-less source with source alphabet {s0, s1, s2} with probabilities p0 = 1/4, p1 = 1/4
and p2=1/2. Find the entropy of the source.
Solution
The entropy of the given source is
H = p0log2(1/p0) + p1log2(1/p1) + p2log2(1/p2)
= ¼log2(4) + ¼log2(4) + ½log2(2)
= 2/4 + 2/4 + 1/2
= 1.5 bits
For a discrete memory-less source with a fixed alphabet:
• H=0, if and only if the probability pk=1 for some k, and the remaining probabilities in
the set are all zero. This lower bound on the entropy corresponds to ‘no uncertainty’.
• H=log2(K), if and only if pk=1/K for all k (i.e. all the symbols in the alphabet are
equiprobable). This upper bound on the entropy corresponds to ‘maximum
uncertainty’.

• 0 ≤ H ≤ log2(K), where K is the radix (number of symbols) of the alphabet S of the source.

• In Case I, it is very easy to guess whether the message s0 with probability 0.01 will occur or the message s1 with probability 0.99 will occur (most of the time message s1 will occur). Thus, in this case, the uncertainty is less.
• In Case II, it is somewhat difficult to guess whether s0 or s1 will occur, as their probabilities are nearly equal. Thus, in this case, the uncertainty is more.
• In Case III, it is extremely difficult to guess whether s0 or s1 will occur, as their probabilities are equal. Thus, in this case, the uncertainty is maximum.
Entropy is less when uncertainty is less.
Entropy is more when uncertainty is more.
Thus, we can say that entropy is a measure of uncertainty.

An analog signal is band-limited to B Hz, sampled at the Nyquist rate, and the samples are quantized into 4 levels. The quantization levels Q1, Q2, Q3, and Q4 (messages) are assumed independent and occur with probabilities P1 = P4 = 1/8 and P2 = P3 = 3/8. Find the information rate of the source.
Relation between Entropy and Mutual Information
Mutual Information: quanti ies the amount of information that knowing one random variable Y gives about
another random variable X. It is a measure of how much the uncertainty in X is reduced by knowing Y.
SHANNON- FANO CODING:
Lempel Ziv–Welch Coding
A drawback of the Huffman code is that it requires knowledge of a probabilistic model of the source; unfortunately, in practice, source statistics are not always known a priori, thereby compromising the efficiency of the code. To overcome these practical limitations, we may use the Lempel-Ziv algorithm, which is intrinsically adaptive and simpler to implement than Huffman coding.
A key to file data compression is to have repetitive patterns of data so that patterns seen once can then be encoded into a compact code symbol, which is then used to represent the pattern whenever it reappears in the file. For example, in images, consecutive scan lines (rows) of the image may be identical. They can then be encoded with a simple code character that represents the lines. In text processing, repetitive words, phrases, and sentences may also be recognized and represented as a code.

A typical file data compression algorithm is known as LZW: Lempel-Ziv-Welch encoding. Variants of this algorithm are used in many file compression schemes such as GIF files. These are lossless compression algorithms in which no data is lost, and the original file can be entirely reconstructed from the encoded message file.

The LZW algorithm is a greedy algorithm in that it tries to recognize increasingly longer phrases that are repetitive, and encode them. Each phrase is defined to have a prefix that is equal to a previously encoded phrase plus one additional character in the alphabet. Note that "alphabet" means the set of legal characters in the file. For a normal text file, this is the ASCII character set. For a gray-level image with 256 gray levels, it is an 8-bit number that represents the pixel's gray level.

In many texts certain sequences of characters occur with high frequency. In English, for example, the word "the" occurs more often than any other sequence of three letters, with "and", "ion", and "ing" close behind. If we include the space character, there are other very common sequences, including longer ones like "of the". Although it is impossible to improve on Huffman encoding with any method that assigns a fixed encoding to each character, we can do better by encoding entire sequences of characters with just a few bits. The method of this section takes advantage of frequently occurring character sequences of any length. It typically produces an even smaller representation than is possible with Huffman trees, and unlike basic Huffman encoding it (1) reads through the text only once and (2) requires no extra space for overhead in the compressed representation.

The algorithm makes use of a dictionary that stores character sequences chosen dynamically from the text. With each character sequence the dictionary associates a number; if s is a character sequence, we use codeword(s) to denote the number assigned to s by the dictionary. The number codeword(s) is called the code or code number of s. All codes have the same length in bits; a typical code size is twelve bits, which permits a maximum dictionary size of 2^12 = 4096 character sequences.
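The dictionary-growing loop is compact enough to sketch directly in Python (an illustration of our own, using an ASCII starting dictionary and integer code numbers rather than fixed 12-bit codes):

def lzw_encode(text):
    # Start with all single characters; each emitted code is the longest
    # phrase already in the dictionary, and that phrase plus the next
    # character becomes a new dictionary entry.
    dictionary = {chr(i): i for i in range(256)}   # ASCII starting alphabet
    phrase, output = "", []
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                           # keep extending the match
        else:
            output.append(dictionary[phrase])      # emit longest known phrase
            dictionary[phrase + ch] = len(dictionary)
            phrase = ch
    if phrase:
        output.append(dictionary[phrase])
    return output

print(lzw_encode("TOBEORNOTTOBEORTOBEORNOT"))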
Module - 1

1) Define the following with suitable expressions:

(i) Mutual Information (ii) Entropy (iii) Efficiency (iv) Redundancy

2) List out the properties of Entropy

3) List out the properties of Mutual Information

4) Compare and contrast Entropy and mutual information.

5) Derive the relationship between entropy and mutual information.

6) Explain mutual information between two random variables. Illustrate the relationship between
entropy and mutual information.

7) Construct a Huffman code for the following source: Calculate the coding efficiency.

8) A DMS has the following alphabet with probability of occurrence as shown below:

Symbol s0 s1 s2 s3 s4 s5 s6

Probability 0.125 0.0625 0.25 0.0625 0.125 0.125 0.25

9) Generate the Huffman code with minimum code variance. Determine the code variance and code
efficiency. Comment on code efficiency.

10) Consider a DMS with three symbols xi, i = 1, 2, 3, and their respective probabilities p1 = 0.5, p2 = 0.3, and p3 = 0.2. Encode the source symbols using the Huffman encoding algorithm and compute the efficiency of the code suggested. Now group together the symbols, two at a time, and again apply the Huffman encoding algorithm to find the codewords. Compute the efficiency of this code. How can the coding efficiency be improved?

11) An information source produces a sequence of independent symbols having the following
probabilities:

Symbols S1 S2 S3 S4 S5 S6 S7

Probabilities 1/27 1/27 1/3 1/9 1/9 1/27 1/27

Construct binary code using Huffman encoding procedure and find its efficiency.

12) The source of information A generates the symbols {A1,A2,A3,A4,A5,A6} with the corresponding
probabilities {0.2,0.3,0.11, 0.16,0.18,0.05}. Compute the code for source symbols using Huffman
coding and calculate its efficiency.

13) A zero-memory source has

S = {S1, S2, S3, S4, S5, S6} and P = [0.4, 0.2, 0.1, 0.1, 0.05, 0.05].
Construct a binary Huffman code by placing the composite symbol as high as possible, and determine the variance of the word lengths.

14)Discuss the main purpose of source coding techniques in communication systems.

15) Consider the message "ABABAB" with symbol probabilities p(A) = 0.4, p(B) = 0.4 and p(C) = 0.2. Calculate the encoded value of the message using arithmetic encoding.

16) Explain Lempel-Ziv coding with an example. Also discuss its advantages and limitations.

17) With the help of an example, describe the notion of a discrete memory-less source.
18) Explain run-length coding with the help of an example.
19) Construct a Lempel-Ziv code for the following bit sequence: 101011010101001011. List the steps involved in Shannon-Fano coding.

20) What is a simple way to shorten a repeated pattern of numbers? If you have a long list of numbers like this: 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, how can we write it in a shorter way?

21) Explain the difference between lossless and lossy compression. Explain prefix-free coding with examples.

22) Design a Shannon-Fano code for the following symbol probabilities: {A: 0.4, B: 0.3, C: 0.2, D: 0.1}

23) Given the messages x1, x2, x3, x4, x5 and x6 with respective probabilities 0.4, 0.2, 0.1, 0.1, 0.06, 0.04:

(i) Construct a binary code by applying the Shannon-Fano encoding procedure.

(ii) Draw the code tree for the same.

(iii) Calculate the code efficiency and redundancy of the code.

24) Illustrate the information system with a suitable diagram.

25) Explain the properties of Entropy with suitable expressions and Examples.

26) Construct a binary Huffman code for a zero-memory source with probabilities P = {0.4, 0.2, 0.1, 0.1, 0.1, 0.05, 0.05}.

27) Make use of the Lempel-Ziv algorithm and encode the following string: 101011011010101011.

28) Encode the given data sequence by Lempel-Ziv coding:

01000101110010100101.
29) State the properties of entropy. A weather forecast model predicts one of 15 possible weather conditions, each equally likely to occur. Calculate the entropy of this prediction model.
30) The source of information A generates the symbols {A1,A2,A3,A4,A5,A6} with the corresponding
probabilities {0.2,0.3,0.11, 0.16,0.18,0.05}. Compute the code for source symbols using Huffman and
Shannon-Fano encoder and compare its efficiency.
Module 2
Error-Correcting Codes
Channel models, channel capacity, channel coding, Types of Codes.
Linear Block Codes: matrix description of Linear Block Codes, Error detection & Correction, hamming codes,
Low Density Parity Check (LDPC) Codes.
Binary Cyclic Codes: Algebraic Structure of Cyclic Codes, Encoding using an (n-k) Bit Shift register,
Syndrome Calculation, Error Detection and Correction.
Channel Models
Channel models represent how signals propagate from the transmitter to the receiver. They include various factors
such as noise, interference, and physical properties of the transmission medium. Common channel models are:
 AWGN (Additive White Gaussian Noise): Simplest model where noise is Gaussian and uncorrelated
with the signal.
 Rayleigh Fading: Models multipath propagation where there is no dominant line-of-sight path.
 Rician Fading: Similar to Rayleigh but includes a strong line-of-sight component.
 Path Loss Models: Account for the reduction in signal power over distance, like the Free Space Path Loss
model.
2. Channel Capacity
This refers to the maximum rate at which information can be transmitted over a communication channel without
error, given by Shannon's Capacity Theorem:

C = B log2(1 + S/N)

Where:
 C is the channel capacity (bits per second).
 B is the bandwidth of the channel (Hz).
 S is the signal power.
 N is the noise power.
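The formula is a one-liner to evaluate; the sketch below (our own helper, with the SNR given in dB) reproduces the telephone-link figure used in the worked example later in these notes:

import math

def channel_capacity(bandwidth_hz, snr_db):
    # C = B * log2(1 + S/N); the SNR argument is converted from dB
    snr = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr)

print(round(channel_capacity(4000, 12)))    # about 16300 bits/sec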
3. Channel Coding
Channel coding is used to detect and correct errors in transmitted data. It adds redundancy to the transmitted signal
to improve its reliability. There are two main types of channel coding:
 Error Detection: Identifies the presence of errors (e.g., Parity Check, CRC).
 Error Correction: Detects and corrects errors at the receiver (e.g., Hamming Code, Reed-Solomon Code).
4. Types of Codes
 Block Codes: These divide the message into fixed-size blocks and add redundancy to each block
independently.
o Example: Hamming Code, Reed-Solomon Code.
 Convolutional Codes: Encode the entire message in a continuous stream, where the output depends on the
current and previous input bits.
o Example: Trellis Codes, Viterbi Algorithm for decoding.
 Turbo Codes: Combine two or more convolutional codes with an interleaver, allowing very low error
rates close to Shannon’s limit.
 LDPC (Low-Density Parity-Check) Codes: A class of highly efficient linear block codes with sparse
parity-check matrices, used in modern wireless standards like 5G.
Linear Block Codes
Linear block codes are a type of error-correcting code used in digital communication, where a set of information
bits are transformed into a larger set of bits by adding redundant bits (parity bits) to allow for error detection and
correction.
1. Matrix Description of Linear Block Codes
A linear block code can be described by two matrices:
 Generator Matrix (G): Used to encode the message bits.
 Parity-Check Matrix (H): Used to check for errors in the received message.
Encoding Process:
Error Detection and Correction
Linear block codes are designed to detect and correct errors by adding redundancy in the form of parity
bits.

3. Hamming Codes
Hamming codes are a specific type of linear block code that can detect and correct single-bit errors. They are
widely used due to their simplicity and efficiency.
Parameters:
 Hamming codes are typically denoted as (n, k), where:
o n = 2^m − 1 (total bits, including parity bits).
o k = 2^m − 1 − m (number of information bits).
o m is the number of parity bits.
Construction of Hamming Codes:
 The parity-check matrix H for Hamming codes is constructed such that each column is a unique binary representation of the numbers from 1 to n.
 The generator matrix G can be derived from H by ensuring that it is in systematic form.
Error Detection and Correction:
Hamming codes are capable of:
 Detecting 2-bit errors.
 Correcting 1-bit errors.
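As an illustration of this construction, the sketch below builds the (7, 4) Hamming parity-check matrix column by column and uses the syndrome to locate a single flipped bit (a NumPy sketch of our own; the function name and test vector are assumptions, not the textbook's program):

import numpy as np

# Column j (1..7) of H is the binary representation of j,
# least significant bit in the first row.
H = np.array([[(j >> k) & 1 for j in range(1, 8)] for k in range(3)])

def correct_single_error(received):
    # The syndrome, read as a binary number, is the 1-based position of
    # the flipped bit; a zero syndrome means no detectable error.
    r = np.array(received)
    syndrome = (H @ r) % 2
    pos = int(sum(int(s) << k for k, s in enumerate(syndrome)))
    if pos:
        r[pos - 1] ^= 1
    return r

word = np.zeros(7, dtype=int)        # the all-zero word is a codeword
word[4] = 1                          # corrupt bit 5
print(correct_single_error(word))    # error located and removed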
4. Low-Density Parity-Check (LDPC) Codes
LDPC codes are a powerful type of linear block code that are used in modern communication systems, including
5G, due to their high error-correcting capability and near-Shannon-limit performance.
LDPC Code Structure:
 Low Density: The parity-check matrix H of LDPC codes is sparse, meaning that most of the entries are zero, which reduces complexity and allows efficient decoding.
 Parity-Check Matrix: For an LDPC code, the matrix H has far fewer 1s than 0s. This structure makes decoding more efficient using iterative algorithms like the belief propagation (or sum-product) algorithm.
Encoding:
The encoding of LDPC codes is similar to other linear block codes. A generator matrix G is derived from the sparse parity-check matrix H. The message vector u is encoded as c = u·G.
Decoding:
LDPC codes use iterative decoding algorithms, such as the belief propagation or sum-product algorithm. These
algorithms update the likelihood of individual bits being correct based on the received message and the parity-
check constraints, gradually converging on the most likely transmitted codeword.
Applications of LDPC Codes:
 LDPC codes are widely used in various communication standards like Wi-Fi (802.11n), 5G, and digital
television broadcasting (DVB-S2), where high data reliability is required.
Summary
 Linear Block Codes: Represented by the generator matrix G and the parity-check matrix H, allowing error detection and correction.
 Hamming Codes: Simple block codes that detect 2-bit errors and correct 1-bit errors.
 LDPC Codes: Advanced block codes with sparse parity-check matrices, enabling efficient error correction in modern communication systems.
ERROR CONTROL CODING

INTRODUCTION

The earlier chapters have given you enough background on information theory and source encoding. In this chapter you will be introduced to another important signal-processing operation, namely "Channel Encoding", which is used to provide 'reliable' transmission of information over the channel. In particular, we present, in this and subsequent chapters, a survey of 'error control coding' techniques that rely on the systematic addition of 'redundant' symbols to the transmitted information so as to facilitate two basic objectives at the receiver: 'error detection' and 'error correction'. We begin with some preliminary discussions highlighting the role of error control coding.

RATIONALE FOR CODING:

The main task required in digital communication is to construct 'cost-effective systems' for transmitting information from a sender (one end of the system) at a rate and a level of reliability that are acceptable to a user (the other end of the system). The two key parameters available are transmitted signal power and channel bandwidth. These two parameters, along with the power spectral density of noise, determine the signal energy per bit to noise power density ratio, Eb/N0, and this ratio, as seen in Chapter 4, uniquely determines the bit error rate for a particular scheme; we would like to transmit information at a rate up to Rmax = 1.443 S/N0. Practical considerations restrict the limit on Eb/N0 that we can assign. Accordingly, we often arrive at modulation schemes that cannot provide acceptable data quality (i.e., low enough error performance). For a fixed Eb/N0, the only practical alternative available for changing data quality from problematic to acceptable is to use "coding".
Another practical motivation for the use of coding is to reduce the required Eb/N0 for a fixed
error rate. This reduction, in turn, may be exploited to reduce the required signal power or reduce the
hardware costs (example: by requiring a smaller antenna size).

The coding methods discussed in Chapter 2 deal with minimizing the average word length of the codes with the objective of achieving the lower bound H(S)/log r; accordingly, such coding is termed "entropy coding". However, such source codes cannot be adopted for direct transmission over the channel. Consider the coding for a source having four symbols with probabilities p(s1) = 1/2, p(s2) = 1/4, p(s3) = p(s4) = 1/8. The resultant binary code using Huffman's procedure is:

s1 …… 0        s3 …… 1 1 0
s2 …… 1 0      s4 …… 1 1 1

Clearly, the code efficiency is 100% and L = 1.75 binits/symbol = H(S). The sequence s3s4s1 will then correspond to 1101110. Suppose a one-bit error occurs so that the received sequence is 0101110. This will be decoded as "s1s2s4s1", which is altogether different from the transmitted sequence. Thus, although the coding provides 100% efficiency in the light of Shannon's theorem, it suffers a major disadvantage. Another disadvantage of a 'variable length' code lies in the fact that output data rates measured over short time periods will fluctuate widely. To avoid this problem, buffers of large length will be needed both at the encoder and at the decoder to store the variable-rate bit stream if a fixed output rate is to be maintained.

Some of the above difficulties can be resolved by using codes with "fixed length". For example, if the codes for the example cited are modified as 000, 100, 110, and 111, observe that even if there is a one-bit error, it affects only one "block", and the output data rate will not fluctuate. The encoder/decoder structure using 'fixed length' code words will be very simple compared to the complexity of those for variable-length codes.

Hereafter, we shall mean by "block codes" the fixed-length codes only. Since, as discussed above, single-bit errors lead to 'single block errors', we can devise means to detect and correct these errors at the receiver. Notice that the price to be paid for the efficient handling and easy manipulation of the codes is reduced efficiency and hence increased redundancy.

In general, whatever the scheme adopted for transmission of digital/analog information, the probability of error is a function of the signal-to-noise power ratio at the input of a receiver and the data rate. However, constraints like maximum signal power and bandwidth of the channel (mainly governmental regulations on public channels) make it impossible to arrive at a signaling scheme which will yield an acceptable probability of error for a given application. The answer to this problem is then the use of 'error control coding', also known as 'channel coding'. In brief, "error control coding is the calculated addition of redundancy". The block diagram of a typical data transmission system is shown in Fig. 4.1.

The information source can be either a person or a machine (e.g., a digital computer). The source output, which is to be communicated to the destination, can be either a continuous waveform or a sequence of discrete symbols. The 'source encoder' transforms the source output into a sequence of binary digits, the information sequence u. If the source output happens to be continuous, this involves A-D conversion as well. The source encoder is ideally designed such that (i) the number of binits per unit time (bit rate, rb) required to represent the source output is minimized, and (ii) the source output can be uniquely reconstructed from the information sequence u.
Fig 4.1: Block diagram of a typical data transmission

The 'channel encoder' transforms u into the encoded sequence v, in general a binary sequence, although non-binary codes can also be used for some applications. As discrete symbols are not suited for transmission over a physical channel, the code sequences are transformed into waveforms of specified durations. These waveforms, as they enter the channel, get corrupted by noise. Typical channels include telephone lines, high-frequency radio links, telemetry links, microwave links, satellite links and so on. Core and semiconductor memories, tapes, drums, disks, optical memories and so on are typical storage media. Switching impulse noise, thermal noise, crosstalk and lightning are some examples of noise disturbances over a physical channel. A surface defect on a magnetic tape is a source of disturbance. The demodulator processes each received waveform and produces an output, which may be either continuous or discrete: the sequence r. The channel decoder transforms r into a binary sequence û, which gives the estimate of u and ideally should be a replica of u. The source decoder then transforms û into an estimate of the source output and delivers this to the destination.

Error control for data integrity may be exercised by means of 'forward error correction' (FEC), wherein the decoder performs error correction operations on the received information according to the schemes devised for the purpose. There is, however, another major approach known as 'Automatic Repeat Request' (ARQ), in which a re-transmission of the ambiguous information is effected; it is also used for solving error control problems. In ARQ, error correction is not done at all. The redundancy introduced is used only for 'error detection', and upon detection the receiver requests a repeat transmission, which necessitates the use of a return path (feedback channel).

In summary, channel coding refers to a class of signal transformations designed to improve the performance of communication systems by enabling the transmitted signals to better withstand the effects of various channel impairments such as noise, fading and jamming. The main objective of error control coding is to reduce the probability of error, or reduce the Eb/N0, at the cost of expending more bandwidth than would otherwise be necessary. Channel coding is a very popular way of providing performance improvement. Use of VLSI technology has made it possible to provide as much as 8 dB of performance improvement through coding, at much lower cost than through other methods such as high-power transmitters or larger antennas.

We will briefly discuss in this chapter the channel encoder and decoder strategies, our major interest being in the design and implementation of the channel encoder/decoder pair to achieve fast transmission of information over a noisy channel, reliable communication of information, and reduction of the implementation cost of the equipment.

Discrete memoryless channel:

Referring to the block diagram in Fig. 4.2, the channel is said to be memoryless if the demodulator (detector) output in a given interval depends only on the signal transmitted in that interval, and not on any previous transmission. Under this condition, we may model (describe) the combination of the modulator, the channel and the demodulator as a "discrete memoryless channel". Such a channel is completely described by the set of transition probabilities p(yj | xk), where xk is the modulator input symbol.

The simplest channel results from the use of binary symbols (both as input and output). When binary coding is used, the modulator has only 0s and 1s as inputs. Similarly, the inputs to the demodulator also consist of 0s and 1s, provided binary quantization is used. If so, we say a 'hard decision' is made on the demodulator output to identify which symbol was actually transmitted. In this case we have a 'binary symmetric channel' (BSC). The BSC, when derived from an additive white Gaussian noise (AWGN) channel, is completely described by the transition probability p. The majority of coded digital communication systems employ binary coding with hard-decision decoding due to the simplicity of implementation offered by such an approach.

The use of hard decisions prior to decoding causes an irreversible loss of information in the receiver. To overcome this problem, "soft-decision" decoding is used. This can be done by including a multilevel quantizer at the demodulator output, as shown in Fig. 4.2(a) for the case of binary PSK signals. The input-output characteristics and the channel transitions are shown in Fig. 4.2(b) and Fig. 4.2(c) respectively. Here the input to the demodulator has only two symbols, 0 and 1. However, the demodulator output has Q symbols. Such a channel is called a "binary input, Q-ary output DMC". The form of the channel transitions, and hence the performance of the demodulator, depends on the location of the representation levels of the quantizer, which in turn depends on the signal level and the variance of the noise. Therefore, the demodulator must incorporate automatic gain control if an effective multilevel quantizer is to be realized. Further, soft-decision decoding offers significant improvement in performance over hard-decision decoding.
Fig. 4.2 (a) Receiver

Fig 4.2 (b) Transfer characteristics (c) Channel Diagram

Shannon's theorem on channel capacity revisited:

Shannon's theorem on channel capacity is re-stated here; we call it the "Coding Theorem":

"It is possible in principle to devise a means whereby a communication system will transmit information with an arbitrarily small probability of error, provided the information rate R (= r·I(X,Y), where r is the symbol rate) is less than or equal to a rate C called the 'channel capacity'." The technique used to achieve this goal is called "coding". For the special case of a BSC, the theorem tells us that if the code rate Rc (defined later) is less than the channel capacity, then it is possible to find a code that achieves error-free transmission over the channel. Conversely, it is not possible to find such a code if the code rate Rc is greater than C.

The channel coding theorem thus specifies the channel capacity as a "fundamental limit" on the rate at which reliable (error-free) transmission can take place over a DMC. Clearly, the issue that matters is not the signal-to-noise ratio (SNR), so long as it is large enough, but how the input is encoded.

The most unsatisfactory feature of Shannon's theorem is that it stresses only the "existence of good codes"; it does not tell us how to find them. So we are still faced with the task of finding a good code that ensures error-free transmission. The error control coding techniques presented in this and the subsequent chapters provide different methods of achieving this important system requirement.

Types of errors:
The errors that arise in a communication system can be viewed as 'independent errors' and 'burst errors'. The first type of error is usually caused by Gaussian noise, which is the chief concern in the design and evaluation of modulators and demodulators for data transmission. The possible sources are the thermal noise and shot noise of the transmitting and receiving equipment, thermal noise in the channel, and the radiation picked up by the receiving antenna. Further, in the majority of situations, the power spectral density of the Gaussian noise at the receiver input is white. The transmission errors introduced by this noise are such that an error during a particular signaling interval does not affect the performance of the system during the subsequent intervals. The discrete channel, in this case, can be modeled by a binary symmetric channel. These transmission errors due to Gaussian noise are referred to as 'independent errors' (or random errors).

The second type of error is encountered due to 'impulse noise', which is characterized by long quiet intervals followed by high-amplitude noise bursts (as in switching and lightning). A noise burst usually affects more than one symbol, and there will be dependence of errors in successive transmitted symbols. Thus, errors occur in bursts.

Types of codes:

There are mainly two types of error control coding schemes, block codes and convolutional codes, which can take care of either type of error mentioned above.

In a block code, the information sequence is divided into message blocks of k bits each, represented by a binary k-tuple u = (u1, u2, …, uk); each block is called a message. The symbol u, here, is used to denote a k-bit message rather than the entire information sequence. The encoder then transforms u into an n-tuple v = (v1, v2, …, vn). Here v represents an encoded block rather than the entire encoded sequence. The blocks are independent of each other.

The encoder of a convolutional code also accepts k-bit blocks of the information sequence u and produces an n-symbol block v. Here u and v are used to denote sequences of blocks rather than a single block. Further, each encoded block depends not only on the present k-bit message block but also on m previous blocks. Hence the encoder has a memory of order m. Since the encoder has memory, implementation requires sequential logic circuits.

If the code word with n bits is to be transmitted in no more time than is required for the transmission of the k information bits, and if τb and τc are the bit durations in the uncoded and coded words, i.e., the input and output code words, then it is necessary that

n·τc = k·τb

We define the "rate of the code" (also called the rate efficiency) by

Rc = k/n

Accordingly, with fb = 1/τb and fc = 1/τc, we have fb/fc = k/n = Rc.

Example of Error Control Coding:

Better way to understand the important aspects of error control coding is by way of an example.
Suppose that we wish transmit data over a telephone link that has a useable bandwidth of 4 KHZ and
a maximum SNR at the out put of 12 dB, at a rate of 1200 bits/sec with a probability of error less than
10-3. Further, we have DPSK modem that can operate at speeds of 1200, 1400 and 3600 bits/sec with
error probabilities 2(10-3), 4(10-3) and 8(10-3) respectively. We are asked to design an error control
coding scheme that would yield an overall probability of error < 10-3. We have:

C = 16300 bits/sec, Rc = 1200, 2400 or 3600 bits/sec.

S S
[C=Blog2 (1+ ).  12dB or15.85 , B=4KHZ], p = 2(10-3), 4(10-3) and 8(10-3) respectively.
N N
Since Rc < C, according to Shannon‟s theorem, we should be able to transmit data with arbitrarily
small probability of error. We shall consider two coding schemes for this problem.

(i) Error detection: Single parity check-coding. Consider the (4, 3) even parity check code.

Message 000 001 010 011 100 101 110 111

Parity 0 1 1 0 1 0 0 1

Codeword 0000 0011 0101 0110 1001 1010 1100 1111

Parity bit appears at the right most symbol of the codeword.

This code is capable of 'detecting' all single and triple error patterns. Data comes out of the channel encoder at a rate of 3600 bits/sec, and at this rate the modem has an error probability of 8(10⁻³). The decoder indicates an error only when the parity check fails. This happens for single and triple errors only.

pd = probability of error detection
   = P(X = 1) + P(X = 3), where X is the random number of bit errors.

Using the binomial probability law, with p = 8(10⁻³):

P(X = k) = C(n, k) pᵏ (1 − p)ⁿ⁻ᵏ

pd = C(4,1) p(1 − p)³ + C(4,3) p³(1 − p), where C(4,1) = 4C1 = 4 and C(4,3) = 4C3 = 4.

Expanding, we get pd = 4p − 12p² + 16p³ − 8p⁴.

Substituting the value of p we get:

pd = 32(10⁻³) − 768(10⁻⁶) + 8192(10⁻⁹) − 32768(10⁻¹²) = 0.031240326 >> 10⁻³

However, an error results if the decoder does not indicate any error when an error indeed has occurred. This happens when two or four errors occur. Hence the probability of a detection error, pnd (probability of no detection), is given by:

pnd = P(X = 2) + P(X = 4) = C(4,2) p²(1 − p)² + C(4,4) p⁴(1 − p)⁰ = 6p² − 12p³ + 7p⁴

Substituting the value of p, we get pnd = 0.4(10⁻³) < 10⁻³.

Thus the probability of undetected error is less than 10⁻³, as required; a numeric check follows below.
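The binomial computation above is easy to verify directly (a sketch of our own; the helper P is just the binomial probability mass function):

from math import comb

p = 8e-3                             # modem error probability at 3600 bits/sec
P = lambda k, n=4: comb(n, k) * p**k * (1 - p)**(n - k)

pd  = P(1) + P(3)    # odd error counts fail the even-parity check (detected)
pnd = P(2) + P(4)    # even error counts pass unnoticed (undetected)
print(pd, pnd)       # about 3.12e-2 and 3.8e-4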

(ii) Error Correction: The triplets 000 and 111 are transmitted whenever 0 and 1 are inputted.
A majority logic decoding, as shown below, is employed assuming only single errors.

Received triplet    000  001  010  100  011  101  110  111

Output message       0    0    0    0    1    1    1    1

Probability of decoding error, pde = P(two or more bits in error)

pde = C(3,2) p²(1 − p) + C(3,3) p³ = 3p² − 2p³

    ≈ 190.464 × 10⁻⁶ = 0.19 × 10⁻³ < 10⁻³

Probability of no detection, pnd = P(all 3 bits in error) = p³ = 512 × 10⁻⁹ << pde!

In general observe that probability of no detection, pnd < < probability of decoding error, pde.
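Again, the arithmetic for the repetition scheme takes only a few lines to check (our own sketch, using the same p = 8(10⁻³) as above):

p = 8e-3
pde = 3 * p**2 * (1 - p) + p**3      # two or three bits of a triplet in error
pnd = p**3                           # all three bits in error
print(pde, pnd)                      # about 1.9e-4 and 5.12e-7, both < 1e-3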

The preceding examples illustrate the following aspects of error control coding. Note that in both examples, without error control coding, the probability of error is the modem's 8(10⁻³).

1. It is possible to detect and correct errors by adding extra bits (the check bits) to the message sequence. Because of this, not all sequences constitute bona fide messages.

2. It is not possible to detect and correct all errors.

3. Addition of check bits reduces the effective data rate through the channel.

4. Since the probability of no detection is always very much smaller than the decoding error probability, it appears that error detection schemes, which do not reduce the rate efficiency as much as error correcting schemes do, are well suited for our application. Since error detection schemes always go with ARQ techniques, when the speed of communication becomes a major concern, forward error correction (FEC) using error correction schemes would be desirable.

Block codes:
We shall assume that the output of an information source is a sequence of binary digits. In 'block coding' this information sequence is segmented into 'message' blocks of fixed length, say k. Each message block, denoted by u, then consists of k information digits. The encoder transforms these k-tuples into blocks of code words v, each an n-tuple, 'according to certain rules'. Clearly, corresponding to the 2^k possible information blocks, we would then have 2^k code words of length n > k. This set of 2^k code words is called a "block code". For a block code to be useful, these 2^k code words must be distinct, i.e., there should be a one-to-one correspondence between u and v. u and v are also referred to as the 'input vector' and 'code vector' respectively. Notice that the encoding equipment must be capable of storing the 2^k code words of length n > k. Accordingly, the complexity of the equipment would become prohibitive if n and k become large, unless the code words have a special structural property conducive to storage and mechanization. This structural property is 'linearity'.

Linear Block Codes:

A block code is said to be a linear (n, k) code if and only if its 2^k code words form a k-dimensional subspace of the vector space of all n-tuples over the field GF(2).

Fields with 2^m symbols are called 'Galois fields' (pronounced 'Galwa' fields), GF(2^m). Their arithmetic involves binary additions and subtractions. For two-valued variables (0, 1), modulo-2 addition and multiplication are defined in Fig. 4.3.
Fig 4.3

The binary alphabet (0, 1) is called a field of two elements (a binary field) and is denoted by GF(2). (Notice that ⊕ represents the EX-OR operation and · represents the AND operation.) Further, in binary arithmetic, −X = X and X − Y = X ⊕ Y. Similarly, for 3-valued variables, modulo-3 arithmetic can be specified as shown in Fig. 4.4. However, for brevity, while representing polynomials involving binary addition we use + instead of ⊕, and there shall be no confusion about such usage.

Polynomials f(X) with 1 or 0 as the coefficients can be manipulated using the above relations. The arithmetic of GF(2^m) can be derived using a polynomial of degree m with binary coefficients and a new variable α, called the primitive element, such that p(α) = 0. When p(X) is irreducible (i.e., it does not have a factor of degree < m and > 0; for example, X³ + X² + 1, X³ + X + 1, X⁴ + X³ + 1, X⁵ + X² + 1, etc. are irreducible polynomials, whereas f(X) = X⁴ + X³ + X² + 1 is not, as f(1) = 0 and hence it has the factor X + 1), then p(X) is said to be a 'primitive polynomial'.

If Vn represents the vector space of all n-tuples, then a subset S of Vn is called a subspace if (i) the all-zero vector is in S, and (ii) the sum of any two vectors in S is also a vector in S. To be more specific, a block code is said to be linear if the following is satisfied: "If v1 and v2 are any two code words of length n of the block code, then v1 ⊕ v2 is also a code word of length n of the block code."

Example 4.1: Linear block code with k = 3 and n = 6.

Observe the linearity property: with v3 = (010 101) and v4 = (100 011), v3 ⊕ v4 = (110 110) = v7.
Remember that n represents the word length of the code words and k represents the number of information digits; hence the block code is represented as an (n, k) block code.

Thus, by the definition of a linear block code, it follows that if g1, g2, …, gk are k linearly independent code words, then every code vector v of our code is a combination of these code words, i.e.,

v = u1g1 ⊕ u2g2 ⊕ … ⊕ ukgk ……………… (4.1)

where uj = 0 or 1, 1 ≤ j ≤ k.

Eq. (4.1) can be arranged in matrix form by noting that each gj is an n-tuple, i.e.,

gj = (gj1, gj2, …, gjn) …………………… (4.2)

Thus we have v = u·G …………………… (4.3)

where u = (u1, u2, …, uk) …………………… (4.4)


represents the data vector, and

        | g1 |   | g11 g12 … g1n |
    G = | g2 | = | g21 g22 … g2n |   …………………… (4.5)
        | ⁝  |   |  ⁝   ⁝       ⁝ |
        | gk |   | gk1 gk2 … gkn |

is called the "generator matrix".

Notice that any k linearly independent code words of an (n, k) linear code can be used to form a generator matrix for the code. Thus it follows that an (n, k) linear code is completely specified by the k rows of the generator matrix. Hence the encoder need only store the k rows of G and form linear combinations of these rows based on the input message u.

Example 4.2: The (6, 3) linear code of Example 4.1 has the following generator matrix:

        | g1 |   | 1 0 0 0 1 1 |
    G = | g2 | = | 0 1 0 1 0 1 |
        | g3 |   | 0 0 1 1 1 0 |

If u = m5 (say) is the message to be coded, i.e., u = (0 1 1), we have

v = u·G = 0·g1 + 1·g2 + 1·g3
        = (0,0,0,0,0,0) + (0,1,0,1,0,1) + (0,0,1,1,1,0) = (0,1,1,0,1,1)

Thus v = (0 1 1 0 1 1).

"v can be computed simply by adding those rows of G which correspond to the locations of the 1s of u."
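This row-selection view of encoding is easy to check numerically; below is a minimal NumPy sketch (the function name is our own) that reproduces the computation of Example 4.2:

import numpy as np

# Generator matrix of the (6, 3) code from Example 4.2
G = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])

def encode(u):
    # v = u.G over GF(2): add (mod 2) the rows of G selected by the 1s of u
    return (np.array(u) @ G) % 2

print(encode([0, 1, 1]))    # -> [0 1 1 0 1 1], as computed above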

Systematic Block Codes (Group Property):

A desirable property of linear block codes is the "systematic structure". Here a code word is divided into two parts, the message part and the redundant part. If either the first k digits or the last k digits of the code word correspond to the message part, then we say the code is a "systematic block code". We shall consider systematic codes as depicted in Fig. 4.5.

Fig 4.5 Systematic format of code word

In the format of Fig. 4.5 notice that:

v1 = u1, v2 = u2, v3 = u3, …, vk = uk …………… (4.6a)

vk+1 = u1p11 ⊕ u2p21 ⊕ u3p31 ⊕ … ⊕ ukpk1
vk+2 = u1p12 ⊕ u2p22 ⊕ u3p32 ⊕ … ⊕ ukpk2
  ⁝                                              ……………… (4.6b)
vn   = u1p1,n−k ⊕ u2p2,n−k ⊕ u3p3,n−k ⊕ … ⊕ ukpk,n−k

Or, in matrix form, we have

(v1 v2 … vk vk+1 vk+2 … vn) =

                  | 1 0 0 … 0  p11 p12 … p1,n−k |
   (u1 u2 … uk) · | 0 1 0 … 0  p21 p22 … p2,n−k |   ……. (4.7)
                  | ⁝ ⁝ ⁝    ⁝   ⁝   ⁝        ⁝  |
                  | 0 0 0 … 1  pk1 pk2 … pk,n−k |

i.e., v = u·G,

where G = [Ik ⋮ P] ………………. (4.8)

and

        | p11  p12  …  p1,n−k |
    P = | p21  p22  …  p2,n−k |   ………………. (4.9)
        |  ⁝    ⁝         ⁝   |
        | pk1  pk2  …  pk,n−k |
Ik is the k × k identity matrix (unit matrix), P is the k × (n − k) 'parity generator matrix', in which the pi,j are either 0 or 1, and G is a k × n matrix. The (n − k) equations given in Eq. (4.6b) are referred to as parity check equations. Observe that the G matrix of Example 4.2 is in the systematic format. The n-vectors a = (a1, a2, …, an) and b = (b1, b2, …, bn) are said to be orthogonal if their inner product, defined by

a·b = (a1, a2, …, an)(b1, b2, …, bn)ᵀ,

is zero, where 'T' represents transposition. Accordingly, for any k × n matrix G with k linearly independent rows, there exists an (n − k) × n matrix H with (n − k) linearly independent rows such that any vector in the row space of G is orthogonal to the rows of H, and any vector that is orthogonal to the rows of H is in the row space of G. Therefore, we can describe the (n, k) linear code generated by G alternatively as follows:

"An n-tuple v is a code word generated by G if and only if v·Hᵀ = O." ……… (4.9a)
(O represents an all-zero row vector.)

This matrix H is called a "parity check matrix" of the code. Its dimension is (n − k) × n.

If the generator matrix has a systematic format, the parity check matrix takes the following form:

                      | p11    p21    …  pk1     1 0 0 … 0 |
    H = [Pᵀ ⋮ In−k] = | p12    p22    …  pk2     0 1 0 … 0 |   ……… (4.10)
                      |  ⁝      ⁝           ⁝    ⁝ ⁝ ⁝    ⁝ |
                      | p1,n−k p2,n−k …  pk,n−k  0 0 0 … 1 |

The ith row of G is:

gi = (0 0 … 1 … 0 0  pi1 pi2 … pij … pi,n−k),

with the single 1 in the ith position. The jth row of H is:

hj = (p1j p2j … pij … pkj  0 0 … 0 1 0 … 0),

with the single 1 in the (k + j)th position. Accordingly, the inner product of the above two n-vectors is:

gi·hjᵀ = pij + pij = 0

(as the pij are either 0 or 1, and in modulo-2 arithmetic X + X = 0).

This implies simply that:

G·Hᵀ = O, of dimension k × (n − k) …………………………. (4.11)

where O is an all-zero matrix of dimension k × (n − k).

Further, since the (n − k) rows of the matrix H are linearly independent, the H matrix of Eq. (4.10) is a parity check matrix of the (n, k) linear systematic code generated by G. Notice that the parity check equations of Eq. (4.6b) can also be obtained from the parity check matrix using the fact that v·Hᵀ = O.

Alternative method of proving v·Hᵀ = O:

We have v = u·G = u·[Ik ⋮ P] = [u1, u2, …, uk, p1, p2, …, pn−k],

where pi = (u1p1,i + u2p2,i + u3p3,i + … + ukpk,i) are the parity bits found from Eq. (4.6b).

Now

    Hᵀ = | P    |
         | In−k |

∴ v·Hᵀ = [u1p11 + u2p21 + … + ukpk1 + p1,  u1p12 + u2p22 + … + ukpk2 + p2,  …,
          u1p1,n−k + u2p2,n−k + … + ukpk,n−k + pn−k]

       = [p1 + p1, p2 + p2, …, pn−k + pn−k]

       = [0, 0, …, 0]

Thus v·Hᵀ = O. This statement implies that an n-tuple v is a code word generated by G if and only if v·Hᵀ = O. Since v = u·G, this means that u·G·Hᵀ = O. If this is to be true for any arbitrary message vector u, then this implies G·Hᵀ = O, the all-zero k × (n − k) matrix.

Example 4.3:

Consider the generator matrix of Example 4.2; the corresponding parity check matrix is

        | 0 1 1 1 0 0 |
    H = | 1 0 1 0 1 0 |
        | 1 1 0 0 0 1 |

Circuit implementation of Block codes:

The implementation of block codes is very simple; we need only combinational logic circuits. An implementation of Eq. (4.6) is shown in the encoding circuit of Fig. 4.6. Notice that pij is either a 0 or a 1, and accordingly pij indicates a connection if pij = 1 (otherwise no connection). The encoding operation is very simple. The message u = (u1, u2, …, uk) to be encoded is shifted into the message register and simultaneously into the channel via the commutator. As soon as the entire message has entered the message register, the parity check digits are formed using modulo-2 adders, serialized using another shift register (the parity register), and shifted into the channel. Notice that the complexity of the encoding circuit is directly proportional to the block length of the code. The encoding circuit for the (6, 3) block code of Example 4.2 is shown in Fig. 4.7.
Fig 4.6 Encoding circuit for systematic block code

Fig 4.7 Encoder for the (6,3) block code of example 4.2

Syndrome and Error Detection:

Suppose v = (v1, v2, …, vn) is a code word transmitted over a noisy channel, and let r = (r1, r2, …, rn) be the received vector. Clearly, r may be different from v owing to the channel noise. The vector sum

e = r − v = (e1, e2, …, en) …………………… (4.12)

is an n-tuple, where ej = 1 if rj ≠ vj and ej = 0 if rj = vj. This n-tuple is called the "error vector" or "error pattern". The 1s in e are the transmission errors caused by the channel noise. Hence, from Eq. (4.12) it follows that:

r = v ⊕ e ………………………………. (4.12a)


Observe that the receiver does not know either v or e. Accordingly, on reception of r, the decoder must first identify whether there are any transmission errors and then take action to locate these errors and correct them (FEC, forward error correction) or make a request for re-transmission (ARQ). When r is received, the decoder computes the following (n − k)-tuple:

s = r·Hᵀ …………………….. (4.13)
  = (s1, s2, …, sn−k)

It then follows from Eq. (4.9a) that s = 0 if and only if r is a code word, and s ≠ 0 if and only if r is not a code word. This vector s is called "the syndrome" (a term used in medical science referring to the collection of all symptoms characterizing a disease). Thus, if s = 0, the receiver accepts r as a valid code word. Notice that there is a possibility of errors going undetected, which happens when e is identical to a nonzero code word. In this case r is the sum of two code words, which according to our linearity property is again a code word. This type of error pattern is referred to as an "undetectable error pattern". Since there are 2^k − 1 nonzero code words, it follows that there are 2^k − 1 such error patterns as well. Hence when an undetectable error pattern occurs, the decoder makes a "decoding error".
Eq. (4.13) can be expanded as below:

s1   = r1p11 + r2p21 + … + rkpk1 + rk+1
s2   = r1p12 + r2p22 + … + rkpk2 + rk+2
  ⁝                                            ………… (4.14)
sn−k = r1p1,n−k + r2p2,n−k + … + rkpk,n−k + rn

A careful examination of Eq. (4.14) reveals the following point: the syndrome is simply the vector sum of the received parity digits (rk+1, rk+2, …, rn) and the parity check digits recomputed from the received information digits (r1, r2, …, rk). Thus, we can form the syndrome with a circuit exactly similar to that of Fig. 4.6; a general syndrome circuit is shown in Fig. 4.8.

Example 4.4:
We shall compute the syndrome for the (6, 3) systematic code of Example 4.2. We have

                                                | 0 1 1 |
                                                | 1 0 1 |
    s = (s1, s2, s3) = (r1, r2, r3, r4, r5, r6) | 1 1 0 |
                                                | 1 0 0 |
                                                | 0 1 0 |
                                                | 0 0 1 |

or  s1 = r2 + r3 + r4
    s2 = r1 + r3 + r5
    s3 = r1 + r2 + r6

The syndrome circuit for this code is given in Fig.4.9.

Fig 4.8 Syndrome circuit for the (n,k) Linear systematic block code

Fig 4.9 Syndrome circuit for the (6,3) systematic block code
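The syndrome computation for this code is short enough to verify directly (a NumPy sketch of our own; the second test vector reproduces the case worked out in Example 4.5 below):

import numpy as np

# Parity-check matrix of the (6, 3) code from Example 4.3
H = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])

def syndrome(r):
    # s = r.H^T over GF(2); s = 0 exactly when r is a valid code word
    return (np.array(r) @ H.T) % 2

print(syndrome([0, 1, 0, 1, 0, 1]))    # [0 0 0] -- a code word
print(syndrome([0, 1, 1, 1, 0, 1]))    # [1 1 0] -- the case of Example 4.5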

In view of Eq. (4.12a) and Eq. (4.9a), we have

s = r·Hᵀ = (v ⊕ e)·Hᵀ = v·Hᵀ ⊕ e·Hᵀ

or  s = e·Hᵀ …………… (4.15)

as v·Hᵀ = O. Eq. (4.15) indicates that the syndrome depends only on the error pattern and not on the transmitted code word v. For a linear systematic code, then, we have the following relationship between the syndrome digits and the error digits:

s1   = e1p11 + e2p21 + … + ekpk1 + ek+1
s2   = e1p12 + e2p22 + … + ekpk2 + ek+2
  ⁝                                            …………… (4.16)
sn−k = e1p1,n−k + e2p2,n−k + … + ekpk,n−k + en

Thus, the syndrome digits are linear combinations of error digits. Therefore they must provide
us information about the error digits and help us in error correction.

Notice that Eq. (4.16) represents (n − k) linear equations in n error digits: an under-determined set of equations. Accordingly, it is not possible to have a unique solution for the set. Since the rank of the H matrix is (n − k), there are 2^k solutions; in other words, there exist 2^k error patterns that result in the same syndrome. Therefore, determining the true error pattern is not an easy task.

Example 4.5:

For the (6, 3) code considered in Example 4.2, the error patterns satisfy the following equations:

s1 = e2 +e3 +e4 , s2 = e1 +e3 +e5 , s3 = e1 +e2 +e6

Suppose, the transmitted and received code words are v = (0 1 0 1 0 1), r = (0 1 1 1 0 1)

Then s = r·H^T = (1, 1, 0)

Then it follows that:


e2 + e3 +e4 = 1
e1 + e3 +e5 =1
e1 + e2 +e6 = 0

There are 2^3 = 8 error patterns that satisfy the above equations. They are:

{0 0 1 0 0 0,  1 1 0 0 0 0,  0 0 0 1 1 0,  0 1 0 0 1 1,  1 0 0 1 0 1,  0 1 1 1 0 1,  1 0 1 0 1 1,  1 1 1 1 1 0}
To minimize the probability of a decoding error, the “most probable error pattern” that satisfies Eq. (4.16) is
chosen as the true error vector. For a BSC, the most probable error pattern is the one that has the
smallest number of nonzero digits. For Example 4.5, notice that the error vector (0 0 1 0 0 0) has
the smallest number of nonzero components and hence can be regarded as the most probable error
vector. Then, using Eq. (4.12a), we have

vˆ = r  e

= (0 1 1 1 0 1) + (0 0 1 0 0 0) = (0 1 0 1 0 1)

Notice now that vˆ indeed is the actual transmitted code word.
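The decoding rule just used — pick the smallest-weight error pattern with the observed syndrome — can be spelled out in a short Python sketch. It is brute force over all 2^6 n-tuples, intended only to reproduce Example 4.5; most_probable_error is a hypothetical helper name.

from itertools import product
import numpy as np

H = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])

def most_probable_error(s):
    # all 2^k error patterns in the co-set with syndrome s, then the lightest one
    coset = [np.array(e) for e in product([0, 1], repeat=6)
             if tuple(np.dot(H, e) % 2) == tuple(s)]
    return min(coset, key=lambda e: int(e.sum()))

r = np.array([0, 1, 1, 1, 0, 1])
e = most_probable_error(np.dot(H, r) % 2)
print(e, (r + e) % 2)    # e = [0 0 1 0 0 0],  corrected word = [0 1 0 1 0 1]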

Minimum Distance Considerations:

The concept of distance between code words and single error correcting codes was first
developed by R. W. Hamming. Let the n-tuples

α = (α1, α2, …, αn),  β = (β1, β2, …, βn)

be two code words. The “Hamming distance” d(α, β) between such a pair of code vectors is defined
as the number of positions in which they differ. Alternatively, using modulo-2 arithmetic, we have

d(α, β) = ∑_{j=1}^{n} (αj ⊕ βj) ……………………. (4.17)

(Notice that ∑ represents the usual decimal summation and ⊕ is the modulo-2 sum, the EX-OR
function).

The “Hamming weight” ω(α) of a code vector α is defined as the number of nonzero
elements in the code vector. Equivalently, the Hamming weight of a code vector is its distance from
the all-zero code vector.

Example 4.6: Let α = (0 1 1 1 0 1), β = (1 0 1 0 1 1)

Notice that the two vectors differ in 4 positions and hence d(α, β) = 4. Using Eq (4.17) we find

d(α, β) = (0 ⊕ 1) + (1 ⊕ 0) + (1 ⊕ 1) + (1 ⊕ 0) + (0 ⊕ 1) + (1 ⊕ 1)

        = 1 + 1 + 0 + 1 + 1 + 0

        = 4 ….. (Here + is the algebraic plus, not the modulo-2 sum)

Further, ω(α) = 4 and ω(β) = 4.

The “minimum distance” of a linear block code is defined as the smallest Hamming distance
between any pair of code words in the code; equivalently, it is the smallest Hamming weight of the
difference between any pair of code words. Since in linear block
codes the sum (or difference) of two code vectors is also a code vector, it follows that “the
minimum distance of a linear block code is the smallest Hamming weight of the nonzero code
vectors in the code”.

The Hamming distance is a metric function that satisfies the triangle inequality. Let α, β and γ
be three code vectors of a linear block code. Then

d(α, β) + d(β, γ) ≥ d(α, γ) ……………. (4.18)

From the discussions made above, we may write

d(α, β) = ω(α ⊕ β) …………………. (4.19)

Example 4.7: For the vectors α and β of Example 4.6, we have:

α ⊕ β = (0⊕1, 1⊕0, 1⊕1, 1⊕0, 0⊕1, 1⊕1) = (1 1 0 1 1 0)

ω(α ⊕ β) = 4 = d(α, β)

If γ = (1 0 1 0 1 0), we have d(α, β) = 4; d(β, γ) = 1; d(α, γ) = 5.

Notice that the above three distances satisfy the triangle inequality:

d(α, β) + d(β, γ) = 5 = d(α, γ)

d(α, γ) + d(β, γ) = 6 > d(α, β)

d(α, β) + d(α, γ) = 9 > d(β, γ)

Similarly, the minimum distance of a linear block code ‘C’ may be mathematically
represented as below:

dmin = min {d(α, β) : α, β ∈ C, α ≠ β} …………. (4.20)

     = min {ω(α ⊕ β) : α, β ∈ C, α ≠ β}

     = min {ω(v) : v ∈ C, v ≠ 0} ………………. (4.21)

That is, dmin = ωmin. The parameter ωmin is called the “minimum weight” of the linear
code C. The minimum distance of a code, dmin, is related to the parity check matrix H of the code in
a fundamental way. Suppose v is a code word. Then from Eq. (4.9a) we have:

0 = v·H^T = v1h1 ⊕ v2h2 ⊕ …. ⊕ vnhn

Here h1, h2, …, hn represent the columns of the H matrix. Let vj1, vj2, …, vjl be the ‘l’ nonzero
components of v, i.e. vj1 = vj2 = …. = vjl = 1. Then it follows that:

hj1 ⊕ hj2 ⊕ … ⊕ hjl = 0^T …………………. (4.22)

That is, “if v is a code vector of Hamming weight ‘l’, then there exist ‘l’ columns of H such
that the vector sum of these columns is equal to the zero vector”. Suppose, conversely, we form a binary n-tuple
of weight ‘l’, viz. x = (x1, x2, …, xn), whose nonzero components are xj1, xj2, …, xjl. Consider the product:

x·H^T = x1h1 ⊕ x2h2 ⊕ …. ⊕ xnhn = xj1hj1 ⊕ xj2hj2 ⊕ …. ⊕ xjlhjl = hj1 ⊕ hj2 ⊕ … ⊕ hjl

If Eq. (4.22) holds, it follows that x·H^T = 0 and hence x is a code vector. Therefore, we conclude
that “if there are ‘l’ columns of the H matrix whose vector sum is the zero vector, then there exists a
code vector of Hamming weight ‘l’”.
From the above discussions, it follows that:

i) If no (d−1) or fewer columns of H add to 0^T, the all-zero column vector, the code has a
minimum weight of at least ‘d’.

ii) The minimum weight (or the minimum distance) of a linear block code C is the smallest
number of columns of H that sum to the all-zero column vector.

0 1 1 1 0 0
For the H matrix of Example 6.3, i.e. H = 1 0 1 0 1 0 , notice that all columns of H are non
 
1 1 0 0 0 1
zero and distinct. Hence no two or fewer columns sum to zero vector. Hence the minimum weight of
the code is at least 3.Further notice that the 1st, 2nd and 3rd columns sum to OT. Thus the minimum
weight of the code is 3. We see that the minimum weight of the code is indeed 3 from the table of
Example 4.1.
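The same conclusion can be checked numerically: generate all 2^k code words and take the smallest nonzero weight, as in Eq. (4.21). A Python sketch, with the systematic generator matrix G = [I3 ⋮ P] assumed to correspond to the H matrix above:

from itertools import product
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])

codewords = [np.dot(u, G) % 2 for u in product([0, 1], repeat=3)]
d_min = min(int(v.sum()) for v in codewords if v.any())   # smallest nonzero-code-word weight
print(d_min)    # 3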

Error Detecting and Error Correcting Capabilities:

The minimum distance, dmin, of a linear block code is an important parameter of the code. To
be more specific, it is the parameter that determines the error-correcting capability of the code. To understand
this, we shall consider a simple example. Suppose we consider 3-bit code words plotted at the vertices
of the cube as shown in Fig. 4.10.
Fig 4.10 The distance concept

Clearly, if the code words used are {000, 101, 110, 011}, the Hamming distance between the
words is 2. Notice that any single error in a received word locates it on a vertex of the cube which
is not a code word, and hence single errors may be recognized (detected). The code word pairs with
Hamming distance = 3 are: (000, 111), (100, 011), (101, 010) and (001, 110). If the code word (000) is received
as (100, 010 or 001), observe that these are nearer to (000) than to (111). Hence the decision is made that the
transmitted word is (000).

Suppose an (n, k) linear block code is required to detect and correct all error patterns (over a
BSC) whose Hamming weight is ω(e) ≤ t. That is, if we transmit a code vector β and the received vector
is γ = β ⊕ e, we want the decoder output to be β̂ = β subject to the condition ω(e) ≤ t.

Further, assume that the 2^k code vectors are transmitted with equal probability. The best decision
for the decoder then is to pick the code vector nearest to the received vector γ, i.e. the one for which the
Hamming distance d(β, γ) is smallest. With such a strategy the decoder will be able to detect
and correct all error patterns of Hamming weight ω(e) ≤ t, provided that the minimum distance of the
code is such that:

dmin ≥ (2t + 1) …………………. (4.23)

dmin is either odd or even. Let ‘t’ be a positive integer such that

2t + 1 ≤ dmin ≤ 2t + 2 ………………… (4.24)

Suppose α is any other code word of the code. Then the Hamming distances among
α, β and γ satisfy the triangle inequality:

d(α, γ) + d(γ, β) ≥ d(α, β) ………………… (4.25)

Suppose an error pattern of t′ errors occurs during transmission of β. Then the received vector γ
differs from β in t′ places, and hence d(β, γ) = t′. Since α and β are code vectors, it follows from Eq.
(4.24):

d(α, β) ≥ dmin ≥ 2t + 1 ………………. (4.26)

Combining Eq. (4.25) and (4.26) with the fact that d(β, γ) = t′, it follows that:

d(α, γ) ≥ 2t + 1 − t′ ……………… (4.27)

Hence if t′ ≤ t, then: d(α, γ) > t ……………… (4.28)

Eq. (4.28) says that if an error pattern of ‘t’ or fewer errors occurs, the received vector γ is closer
(in Hamming distance) to the transmitted code vector β than to any other code vector α of the code.
For a BSC, this means P(γ|β) > P(γ|α) for α ≠ β. Thus, based on the maximum likelihood decoding
scheme, γ is decoded as β, which indeed is the actual transmitted code word; this results in
correct decoding and thus the errors are corrected.

On the contrary, the code is not capable of correcting error patterns of weight l > t. To show this
we proceed as below:

Suppose d(α, β) = dmin, and let e1 and e2 be two error patterns such that:

i) e1 ⊕ e2 = α ⊕ β

ii) e1 and e2 do not have nonzero components in common places. Clearly,

ω(e1) + ω(e2) = ω(α ⊕ β) = d(α, β) = dmin ………………… (4.29)

Suppose α is the transmitted code vector and is corrupted by the error pattern e1. Then the received
vector is:

γ = α ⊕ e1 ……………………….. (4.30)

and d(α, γ) = ω(α ⊕ γ) = ω(e1) ……………………… (4.31)

d(β, γ) = ω(β ⊕ γ)

        = ω(β ⊕ α ⊕ e1) = ω(e2) ………………………. (4.32)

If the error pattern e1 contains more than ‘t’ errors, i.e. ω(e1) > t, then since 2t + 1 ≤ dmin ≤ 2t + 2,
it follows that

ω(e2) = dmin − ω(e1) ≤ t + 1 ………………………… (4.33)

∴ d(β, γ) ≤ d(α, γ) …………………………. (4.34)

This inequality says that there exists an error pattern of l > t errors which results in a received
vector at least as close to an incorrect code vector as to the transmitted one; i.e., based on the maximum
likelihood decoding scheme, a decoding error will be committed.

To make the point clear, we shall give yet another illustration. The code vectors and the received
vectors may be represented as points in an n-dimensional space. Suppose we construct two spheres
of equal radius ‘t’ around the points that represent the code vectors α and β. Further, let these two
spheres be mutually exclusive or disjoint, as shown in Fig. 4.11(a).

For this condition to be satisfied, we require d(α, β) ≥ 2t + 1. In such a case, if d(α, γ) ≤ t,
it is clear that the decoder will pick α as the transmitted vector.

Fig. 4.11(a)

On the other hand, if d(α, β) ≤ 2t, the two spheres around α and β intersect, and if γ is located as in
Fig. 4.11(b) and α is the transmitted code vector, then even if d(α, γ) ≤ t, γ may be as close to
β as it is to α. The decoder may then pick β as the transmitted vector, which is wrong. Thus it is evident
that “an (n, k) linear block code has the power to correct all error patterns of weight ‘t’ or less if and
only if d(α, β) ≥ 2t + 1 for all α and β”. However, since the smallest distance between any pair of code
words is the minimum distance of the code, dmin ‘guarantees’ the correction of all error patterns of weight

t = ⌊(dmin − 1)/2⌋ …………………………. (4.35)

where ⌊(dmin − 1)/2⌋ denotes the largest integer no greater than (dmin − 1)/2. The
parameter t = ⌊(dmin − 1)/2⌋ is called the “random-error-correcting capability” of the code, and
the code is referred to as a “t-error-correcting code”. The (6, 3) code of Example 4.1 has a
minimum distance of 3, and from Eq. (4.35) it follows that t = 1, which means it is a ‘Single Error
Correcting’ (SEC) code. It is capable of correcting any single error over a block of six digits.

For an (n, k) linear code, observe that there are 2^(n−k) syndromes, including the all-zero
syndrome, and each syndrome corresponds to a specific error pattern. If ‘j’ is the number of error
locations in the n-dimensional error pattern e, we find that in general there are nCj such error
patterns. It then follows that the total number of all possible error patterns is ∑_{j=0}^{t} nCj,
where ‘t’ is the maximum number of error locations in e. Thus we arrive at an important conclusion:
“If an (n, k) linear block code is to be capable of correcting up to ‘t’ errors, the total number of
syndromes shall not be less than the total number of all possible error patterns”, i.e.

2^(n−k) ≥ ∑_{j=0}^{t} nCj ………………………. (4.36)

Eq. (4.36) is usually referred to as the “Hamming bound”. A binary code for which the Hamming
bound holds with equality is called a “Perfect code”.
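A quick numeric check of Eq. (4.36) in Python (a sketch; the function name is our own): the (7, 4) Hamming code with t = 1 meets the bound with equality (2^3 = 1 + 7), i.e. it is a perfect code, whereas the (6, 3) code satisfies the bound only with strict inequality.

from math import comb

def hamming_bound_holds(n, k, t):
    # 2^(n-k) >= sum_{j=0}^{t} C(n, j), Eq. (4.36)
    return 2 ** (n - k) >= sum(comb(n, j) for j in range(t + 1))

print(hamming_bound_holds(7, 4, 1), 2 ** 3 == 1 + 7)   # True True  -> perfect
print(hamming_bound_holds(6, 3, 1), 2 ** 3 == 1 + 6)   # True False -> not perfect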

Standard Array and Syndrome Decoding:


The decoding strategy we are going to discuss is based on an important property of the
syndrome.

Suppose vj, j = 1, 2, …, 2^k, are the 2^k distinct code vectors of an (n, k) linear block code.
Correspondingly, for any error pattern e, let the 2^k distinct error vectors ej be defined by

ej = e ⊕ vj,  j = 1, 2, …, 2^k ………………………. (4.37)

The set of vectors {ej, j = 1, 2, …, 2^k} so defined is called a “co-set” of the code. That is, a
co-set contains exactly 2^k elements that differ at most by a code vector. It then follows that there are
2^(n−k) co-sets for an (n, k) linear block code. Post-multiplying Eq. (4.37) by H^T, we find

ej·H^T = e·H^T ⊕ vj·H^T

       = e·H^T ………………………………………………. (4.38)

Notice that the RHS of Eq. (4.38) is independent of the index j, as for any code word the term
vj·H^T = 0. From Eq. (4.38) it is clear that “all error patterns that differ at most by a code word have
the same syndrome”. That is, each co-set is characterized by a unique syndrome.

Since the received vector r may be any of the 2^n n-tuples, no matter what the transmitted code
word was, observe that we can use Eq. (4.38) to partition the 2^n received vectors into disjoint sets
and try to identify the received vector. This is done by preparing what is called the “standard
array”. The steps involved are as below:

Step 1: Place the 2^k code vectors of the code in a row, with the all-zero vector
v1 = (0, 0, 0, …, 0) = 0 as the first (left-most) element.

Step 2: From among the remaining (2^n − 2^k) n-tuples, an n-tuple e2 is chosen and placed below the all-
zero vector v1. The second row is now formed by placing (e2 ⊕ vj), j = 2, 3, …, 2^k under vj.

Step 3: Now take an unused n-tuple e3 and complete the 3rd row as in Step 2.

Step 4: Continue the process until all the n-tuples are used.

The resultant array is shown in Fig. 4.12.

Fig 4.12: Standard Array for an (n,k) linear block code
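The four construction steps above can be compressed into a few lines of Python. The sketch below chooses each unused n-tuple of smallest weight as the next co-set leader (anticipating the discussion further below); standard_array is a hypothetical helper name.

from itertools import product
import numpy as np

def standard_array(G):
    k, n = G.shape
    codewords = [tuple(np.dot(u, G) % 2) for u in product([0, 1], repeat=k)]  # Step 1
    rows, used = [codewords], set(codewords)
    for e in sorted(product([0, 1], repeat=n), key=sum):   # candidates by increasing weight
        if e not in used:                                  # Steps 2-4: unused n-tuple as leader
            coset = [tuple((np.array(e) + np.array(v)) % 2) for v in codewords]
            rows.append(coset)
            used.update(coset)
    return rows

G = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])
rows = standard_array(G)
print(len(rows), len(rows[0]))   # 8 co-sets (= 2^(n-k)), each with 2^k = 8 entries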

Since the code vectors vj are all distinct, the vectors in any row of the array are also distinct.
For, if two n-tuples in the l-th row were identical, say el ⊕ vj = el ⊕ vm with j ≠ m, we would have
vj = vm, which is impossible. Thus it follows that “no two n-tuples in the same row of a standard array
are identical”.

Next, let us suppose that an n-tuple appears in both the l-th row and the m-th row, with l < m. Then for
some j1 and j2 this implies el ⊕ vj1 = em ⊕ vj2, which in turn implies em = el ⊕ (vj1 ⊕ vj2) (remember
that X ⊕ X = 0 in modulo-2 arithmetic), or em = el ⊕ vj3 for some j3. Since, by the property of linear
block codes, vj3 is also a code word, this implies that em must appear in the l-th row, which contradicts
our construction steps: em, the first element of the m-th row, was chosen as an n-tuple unused in the
previous rows. This clearly demonstrates another important property of the array:
“every n-tuple appears in one and only one row”.
From the above discussions it is clear that there are 2^(n−k) disjoint rows or co-sets in the standard
array, and each row or co-set consists of 2^k distinct entries. The first n-tuple of each co-set (i.e., the
entry in the first column) is called the “co-set leader”. Notice that any element of a co-set can be
used as its co-set leader; this does not change the elements of the co-set — it results simply in a
permutation.

Suppose Dj^T is the j-th column of the standard array. Then it follows that

Dj^T = {vj, e2 ⊕ vj, e3 ⊕ vj, …, e_(2^(n−k)) ⊕ vj} ………………….. (4.39)

where vj is a code vector and e2, e3, …, e_(2^(n−k)) are the co-set leaders.

The 2^k disjoint columns D1^T, D2^T, …, D_(2^k)^T can now be used for decoding of the code. If vj is

the transmitted code word over a noisy channel, it follows from Eq. (4.39) that the received vector r is
in Dj^T if the error pattern caused by the channel is a co-set leader. In this case r will be decoded
correctly as vj. If not, an erroneous decoding will result; for, any error pattern ê which is not a co-set
leader must lie in some co-set under some nonzero code vector — say, in the i-th co-set under vl ≠ 0.
Then it follows that

ê = ei ⊕ vl, and the received vector is r = vj ⊕ ê = ei ⊕ (vj ⊕ vl) = ei ⊕ vm

Thus the received vector is in Dm^T and it will be decoded as vm, and a decoding error has been
committed. Hence it is explicitly clear that “correct decoding is possible if and only if the error
pattern caused by the channel is a co-set leader”. Accordingly, the 2^(n−k) co-set leaders (including
the all-zero vector) are called the “correctable error patterns”, and it follows that “every (n, k) linear
block code is capable of correcting 2^(n−k) error patterns”.

So, from the above discussion, it follows that in order to minimize the probability of a decoding
error, “The most likely to occur” error patterns should be chosen as co-set leaders. For a BSC an error
pattern of smallest weight is more probable than that of a larger weight. Accordingly, when forming a
standard array, error patterns of smallest weight should be chosen as co-set leaders. Then the decoding
based on the standard array would be the „minimum distance decoding‟ (the maximum likelihood
decoding). This can be demonstrated as below.

Suppose a received vector r is found in the j-th column and l-th row of the array. Then r will be
decoded as vj. We have

d(r, vj) = ω(r ⊕ vj) = ω(el ⊕ vj ⊕ vj) = ω(el)

where we have assumed that vj indeed is the transmitted code word. Let vs be any code word other
than vj. Then

d(r, vs) = ω(r ⊕ vs) = ω(el ⊕ vj ⊕ vs) = ω(el ⊕ vi)

since vj and vs are code words, so that vi = vj ⊕ vs is also a code word of the code. As el and (el ⊕ vi)
are in the same co-set, and el has been chosen as the co-set leader with the smallest weight, it follows
that ω(el) ≤ ω(el ⊕ vi) and hence d(r, vj) ≤ d(r, vs). Thus the received vector is decoded into the closest
code vector. Hence, if each co-set leader is chosen to have minimum weight in its co-set, standard array
decoding results in minimum distance decoding, i.e. maximum likelihood decoding.

Suppose a0, a1, a2, …, an denote the numbers of co-set leaders with weights 0, 1, 2, …, n. This
set of numbers is called the “weight distribution” of the co-set leaders. Since a decoding error will
occur if and only if the error pattern is not a co-set leader, the probability of a decoding error for a
BSC with error probability (transition probability) p is given by

P(E) = 1 − ∑_{j=0}^{n} aj p^j (1 − p)^(n−j) …………………… (4.40)

Example 4.8:

For the (6, 3) linear block code of Example 4.1 the standard array, along with the syndrome
table, is as below:

The weight distribution of the co-set leaders in the array shown is a0 = 1, a1 = 6, a2 = 1, a3 = a4 = a5
= a6 = 0. From Eq. (4.40) it then follows:

P(E) = 1 − [(1 − p)^6 + 6p(1 − p)^5 + p^2(1 − p)^4]

With p = 10^−2, we have P(E) = 1.3643879 × 10^−3.

A received vector (0 1 0 0 0 1) will be decoded as (0 1 0 1 0 1), and a received vector (1 0 0 1 1 0) will be
decoded as (1 1 0 1 1 0).
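Eq. (4.40) is easily evaluated numerically. A Python sketch using the weight distribution quoted above (the function name is our own):

def p_decoding_error(a, p, n):
    # P(E) = 1 - sum_{j=0}^{n} a_j p^j (1-p)^(n-j), Eq. (4.40)
    return 1 - sum(a[j] * p**j * (1 - p)**(n - j) for j in range(n + 1))

a = [1, 6, 1, 0, 0, 0, 0]              # a0 ... a6 for the (6, 3) code
print(p_decoding_error(a, 1e-2, 6))    # ~1.3643879e-03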

Notice that an (n, k) linear code is capable of detecting (2^n − 2^k) error patterns, while it is
capable of correcting only 2^(n−k) error patterns. Further, as n becomes large, 2^(n−k)/(2^n − 2^k)
becomes smaller, and hence the probability of a decoding error will be much higher than the probability
of an undetected error.

Let us turn our attention to Eq. (4.35) and arrive at an interpretation. Let x1 and x2 be two n-
tuples of weight ‘t’ or less. Then it follows that

ω(x1 ⊕ x2) ≤ ω(x1) + ω(x2) ≤ 2t < dmin

Suppose x1 and x2 were in the same co-set; then (x1 ⊕ x2) would have to be a nonzero code
vector of the code. This is impossible, because the weight of (x1 ⊕ x2) is less than the minimum weight
of the code. Therefore, “no two n-tuples whose weights are less than or equal to ‘t’ can be in the
same co-set of the code, and all such n-tuples can be used as co-set leaders”.

Further, if v is a minimum-weight code vector, i.e. ω(v) = dmin, and if the n-tuples x1 and x2
satisfy the following two conditions:

i) x1 ⊕ x2 = v

ii) x1 and x2 do not have nonzero components in common places,

it follows from the definitions that x1 and x2 must be in the same co-set and

ω(x1) + ω(x2) = ω(v) = dmin

Suppose we choose x2 such that ω(x2) = t + 1. Since 2t + 1 ≤ dmin ≤ 2t + 2, we have ω(x1) = t or (t + 1).
If x1 is used as a co-set leader, then x2 cannot be a co-set leader.

The above discussions may be summarized by saying: “For an (n, k) linear block code with
minimum distance dmin, all n-tuples of weight t = ⌊(dmin − 1)/2⌋ or less can be used as co-set leaders
of a standard array. Further, if all n-tuples of weight ≤ t are used as co-set leaders, there is at least
one n-tuple of weight (t + 1) that cannot be used as a co-set leader.”

These discussions once again re-confirm the fact that an (n, k) linear code is capable of
correcting error patterns of ⌊(dmin − 1)/2⌋ or fewer errors, but is incapable of correcting all the
error patterns of weight (t + 1).

We have seen in Eq. (4.38) that each co-set is characterized by a unique syndrome, i.e. there is
a one-to-one correspondence between a co-set leader (a correctable error pattern) and a syndrome. This
relationship can be used to prepare a decoding table, made up of the 2^(n−k) co-set leaders and
their corresponding syndromes. This table is either stored or wired into the receiver. The following are
the steps in decoding:
Step 1: Compute the syndrome s = r·H^T.
Step 2: Locate the co-set leader ej whose syndrome is s. Then ej is assumed to be the error
pattern caused by the channel.

Step 3: Decode the received vector r into the code vector v̂ = r ⊕ ej.

This decoding scheme is called “syndrome decoding” or “table look-up decoding”.
Observe that this decoding scheme is applicable to any linear (n, k) code, i.e. the code need not
necessarily be systematic. However, as (n−k) becomes large, the implementation becomes difficult and
impractical, as either a large storage or complicated logic circuitry will be required.
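The table look-up decoder is straightforward to model in Python. The sketch below stores only the syndromes of the all-zero pattern and the six single-error patterns of the (6, 3) SEC code; a syndrome not in the table (the double-error co-set) is left uncorrected here, which is one possible design choice, not the only one.

from itertools import product
import numpy as np

H = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])
n = H.shape[1]

leaders = [np.zeros(n, dtype=int)] + [np.eye(n, dtype=int)[i] for i in range(n)]
table = {tuple(np.dot(H, e) % 2): e for e in leaders}    # syndrome -> co-set leader

def decode(r):
    s = tuple(np.dot(H, r) % 2)                          # Step 1
    e = table.get(s, np.zeros(n, dtype=int))             # Step 2 (unknown syndrome: no correction)
    return (r + e) % 2                                   # Step 3

print(decode(np.array([0, 1, 0, 0, 0, 1])))              # -> [0 1 0 1 0 1]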

For implementation of the decoding scheme, one may regard the decoding table as the truth
table of n switching functions:

e1 = f1(s1, s2, …, sn−k); e2 = f2(s1, s2, …, sn−k); …; en = fn(s1, s2, …, sn−k)

where s1, s2, …, sn−k are the syndrome digits, regarded as the switching variables, and e1, e2, …, en
are the estimated error digits. These functions can be realized by using suitable combinatorial logic
circuits, as indicated in Fig 4.13.

Fig. 4.13 General Decoding scheme for an (n,k) linear block code

Example 4.9:

From the standard array for the (6, 3) linear block code of Example 4.8, the following truth table can
be constructed.
The two shaded portions of the truth table are to be observed carefully. The top shaded one
corresponds to the all-zero error pattern and the bottom one corresponds to a double-error pattern which
cannot be corrected by this code. From the table we can now write expressions for the correctable
single error patterns as below (a prime denotes the logical complement):

e1 = s1′·s2·s3    e2 = s1·s2′·s3    e3 = s1·s2·s3′

e4 = s1·s2′·s3′   e5 = s1′·s2·s3′   e6 = s1′·s2′·s3

The implementation of the decoder is shown in Fig.4.14.

Fig 4.14: Decoding circuit for (6,3) code


Comments:

1) Notice that for all correctable single error patterns the syndrome will be identical to a
column of the H matrix, indicating that the received vector is in error at the corresponding
column position.

For example, if the received vector is (0 1 0 0 0 1), then the syndrome is (1 0 0). This is identical
with the 4th column of the H matrix, and hence the 4th position of the received vector is in error; the
corrected vector is (0 1 0 1 0 1). Similarly, for a received vector (1 0 0 1 1 0), the syndrome is (1 0 1), and
this is identical with the second column of the H matrix. Thus the second position of the received vector
is in error, and the corrected vector is (1 1 0 1 1 0).

2) A table can be prepared relating the error locations and the syndromes. By suitable
combinatorial circuits, data recovery can be achieved. For the (6, 3) systematic linear code we have
the following table for r = (r1 r2 r3 r4 r5 r6).

Notice that for the systematic encoding considered by us, (r1 r2 r3) correspond to the data digits and
(r4 r5 r6) are the parity digits.

Accordingly, the corrections for the data digits would be

v̂1 = r1 ⊕ (s2·s3),  v̂2 = r2 ⊕ (s1·s3),  v̂3 = r3 ⊕ (s1·s2)

Hence the circuit of Fig 4.14 can be modified for data recovery alone by removing the connections
of the outputs v̂4, v̂5 and v̂6.

Hamming Codes:
Hamming codes form the first class of linear block codes devised for error correction. The single-
error-correcting (SEC) Hamming codes are characterized by the following parameters:

Code length: n = (2^m − 1)

Number of information symbols: k = (2^m − m − 1)

Number of parity check symbols: (n − k) = m

Error correcting capability: t = 1 (dmin = 3)

The parity check matrix H of this code consists of all the nonzero m-tuples as its columns. In
systematic form, the columns of H are arranged as follows:

H = [Q ⋮ Im]

where Im is an identity (unit) matrix of order m × m, and the Q matrix consists of
(2^m − m − 1) columns which are the m-tuples of weight 2 or more. As an illustration, for k = 4 we have,
from k = 2^m − m − 1:

m = 1 → k = 0;  m = 2 → k = 1;  m = 3 → k = 4

Thus we require 3 parity check symbols, and the length of the code is 2^3 − 1 = 7. This results in the
(7, 4) Hamming code.

The parity check matrix for the (7, 4) linear systematic Hamming code then follows from this
arrangement, and the generator matrix of the code can be written in the form

G = [I_(2^m − m − 1) ⋮ Q^T]

For the (7, 4) systematic code, the G matrix follows accordingly.

A non-systematic Hamming code can be constructed by placing the parity check bits at the 2^l,
l = 0, 1, 2, … locations. This was the conventional method of construction in switching and computer
applications (refer, for example, ‘Switching Circuits and Applications’, Marcus). One simple procedure
for the construction of such a code is as follows:

Step 1: Write the binary representations of length (n − k) of the decimals from 1 to n.

Step 2: Arrange the sequences in the reverse order, as the rows of a matrix.

Step 3: The transpose of the matrix obtained in Step 2 gives the parity check matrix H for the code.
The code words are of the form

Position: 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17
Digit:    p1 p2 m1 p3 m2 m3 m4 p4 m5 m6  m7  m8  m9  m10 m11 p5  m12

where p1, p2, p3, … are the parity digits and m1, m2, m3, … are the message digits. For example, let us
consider the non-systematic (7, 4) Hamming code.

Step 1: The binary representations of the decimals 1 to 7 are 001, 010, 011, 100, 101, 110, 111.

Step 2 (sequences in reverse order, as rows):

H^T = | 1 0 0 |
      | 0 1 0 |
      | 1 1 0 |
      | 0 0 1 |
      | 1 0 1 |
      | 0 1 1 |
      | 1 1 1 |

Step 3:

H = | 1 0 1 0 1 0 1 |
    | 0 1 1 0 0 1 1 |
    | 0 0 0 1 1 1 1 |

Notice that the parity check bits, from the above H matrix, apply to the following positions:

p1 → 1, 3, 5, 7, 9, 11, 13, 15, …

p2 → 2, 3, 6, 7, 10, 11, 14, 15, …

p3 → 4, 5, 6, 7, 12, 13, 14, 15, …

p4 → 8, 9, 10, 11, 12, 13, 14, 15, and so on.

Accordingly, the check bits can be represented as linear combinations of the message bits.
For the (7, 4) code under consideration we have
p1 = m1 + m2 +m4

p2 = m1 + m3 +m4

p3 = m2 + m3 + m4

Accordingly, the generator matrix can be written down as below.

Notice that the message bits are located at the positions other than the 2^l, l = 0, 1, 2, 3, … locations;
i.e., they are located at positions 3, 5, 6, 7, 9, 10, 11, …. The k columns of the identity
matrix Ik are distributed successively to these locations. The Q sub-matrix of the H matrix can be
identified as containing those columns which have weight more than one. The transpose of this matrix
then gives the columns to be filled, in succession, into the parity positions of the G matrix. For the
example of the (7, 4) linear code considered, the Q sub-matrix is:

Q = | 1 1 0 1 |                     | 1 1 0 |
    | 1 0 1 1 | , and hence Q^T =   | 1 0 1 |
    | 0 1 1 1 |                     | 0 1 1 |
                                    | 1 1 1 |

The first two columns of Q^T then form the first two columns of the G matrix, and the third
column of Q^T forms the fourth column of the G matrix. The table below gives the code words generated
by this method.
Observe that the procedure outlined for the code construction starts from selecting the H matrix,
which is unique, and hence the codes are also unique. We shall consider the correctable error patterns
and the corresponding syndromes listed in the table below.

Table: Error patterns and syndromes for the (7, 4) linear non-systematic code

Error pattern                    Syndrome
e1 e2 e3 e4 e5 e6 e7             s1 s2 s3

1  0  0  0  0  0  0              1  0  0
0  1  0  0  0  0  0              0  1  0
0  0  1  0  0  0  0              1  1  0
0  0  0  1  0  0  0              0  0  1
0  0  0  0  1  0  0              1  0  1
0  0  0  0  0  1  0              0  1  1
0  0  0  0  0  0  1              1  1  1
If the syndrome is read from right to left, i.e. as the sequence ‘s3 s2 s1’, it is interesting to
observe that the decimal equivalent of this binary sequence corresponds to the error location. Thus if
the code vector (1 0 1 1 0 1 0) is received as (1 0 1 0 0 1 0), the corresponding syndrome is (0 0 1), which
is exactly the same as the 4th column of the H matrix; also, the sequence s3 s2 s1 = 100 corresponds to
decimal 4.
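This position read-out property makes the decoder for the non-systematic (7, 4) code almost trivial to express in code. A Python sketch, with H as constructed in Step 3 above; error_position is a hypothetical helper name.

import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def error_position(r):
    s1, s2, s3 = np.dot(H, r) % 2
    return 4 * s3 + 2 * s2 + s1      # (s3 s2 s1) read as a binary number; 0 means no error

r = np.array([1, 0, 1, 0, 0, 1, 0])  # code word 1 0 1 1 0 1 0 with its 4th bit in error
pos = error_position(r)
if pos:
    r[pos - 1] ^= 1                  # flip the erroneous bit
print(pos, r)                        # 4 [1 0 1 1 0 1 0]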

It can be verified that the (7, 4), (15, 11), (31, 26) and (63, 57) codes are all single-error-correcting
Hamming codes and are regarded as quite useful.

An important property of the Hamming codes is that they satisfy the condition of Eq. (4.36) with
the equality sign for t = 1. This means that Hamming codes are “single-error-correcting binary
perfect codes”. This can also be verified from Eq. (4.35).

We may delete any ‘l’ columns from the parity check matrix H of a Hamming code, resulting
in the reduction of the dimension of the H matrix to m × (2^m − l − 1). Using this new matrix as the
parity check matrix, we obtain a “shortened” Hamming code with the following parameters:

Code length: n = 2^m − l − 1

Number of information symbols: k = 2^m − m − l − 1

Number of parity check symbols: n − k = m

Minimum distance: dmin ≥ 3

Notice that if the deletion of the columns of the H matrix is done properly, we may obtain a Hamming
code with dmin = 4. For example, if we delete from the sub-matrix Q all the columns of even weight, we
obtain an m × 2^(m−1) matrix

H′ = [Q′ ⋮ Im]

where Q′ contains (2^(m−1) − m) columns of odd weight. Clearly, no three columns add to zero, as all
columns have odd weight. However, for a column in Q′, there exist three columns in Im such that the
four columns add to zero. Thus the shortened Hamming code with H′ as the parity check matrix has
minimum distance exactly 4. The distance-4 shortened Hamming codes can be used for correcting all
single-error patterns while simultaneously detecting all double-error patterns. Notice that when single
errors occur the syndrome contains an odd number of ones, and for double errors it contains an even
number of ones. Accordingly, the decoding can be accomplished in the following manner.

(1) If s = 0, no error occurred.

(2) If s contains an odd number of ones, a single error has occurred. The single-error pattern
pertaining to this syndrome is added to the received code vector for error correction.
(3) If s contains an even number of ones, an uncorrectable error pattern has been detected.

Alternatively, the SEC Hamming codes may be made to detect double errors by adding an extra
parity check bit in the (n+1)-th position. Thus the (8, 4), (16, 11), etc. codes have dmin = 4 and correct
single errors with detection of double errors.
BINARY CYCLIC CODES

INTRODUCTION

"Binary cyclic codes” form a sub class of linear block codes. Majority of important linear block
codes that are known to-date are either cyclic codes or closely related to cyclic codes. Cyclic codes are
attractive for two reasons: First, encoding and syndrome calculations can be easily implemented using
simple shift registers with feed back connections. Second, they posses well defined mathematical
structure that permits the design of higher-order error correcting codes.

A binary code is said to be "cyclic" if it satisfies:

1. Linearity property – sum of two code words is also a code word.


2. Cyclic property – Any cyclic shift of a code word is also a code word.

The second property can be easily understood from Fig 4.1. Instead of writing the code as a
row vector, we have represented it along a circle. The direction of traverse may be either clockwise or
counter-clockwise (right shift or left shift).

For example, if we move in a counter-clockwise direction, then starting at ‘A’ the code word is
110001100, while if we start at B it would be 011001100. Clearly, the two code words are related in
that one is obtained from the other by a cyclic shift.

Fig 4.1: Illustrating the cyclic property

If the n-tuple, read from ‘A’ in the CW direction in Fig 4.1,

v = (v0, v1, v2, …, vn−2, vn−1) …………………… (4.1)

is a code vector, then the code vector read from B in the CW direction, obtained by a one-bit cyclic
right shift:

v(1) = (vn−1, v0, v1, v2, …, vn−3, vn−2) …………………… (4.2)

is also a code vector. In this way, the n-tuples obtained by successive cyclic right shifts:

v(2) = (vn−2, vn−1, v0, v1, …, vn−3) ………………… (4.3a)

v(3) = (vn−3, vn−2, vn−1, v0, …, vn−4) ………………… (4.3b)

⋮

v(i) = (vn−i, vn−i+1, …, vn−1, v0, v1, …, vn−i−1) …………… (4.3c)

are all code vectors. This property of cyclic codes enables us to treat the elements of each code vector
as the coefficients of a polynomial of degree (n−1).

This is the property that is extremely useful in the analysis and implementation of these codes.
Thus we write the "code polynomial" V(X) for the code in Eq (4.1) as a vector polynomial:

V(X) = v0 + v1X + v2X^2 + v3X^3 + … + vi−1X^(i−1) + … + vn−3X^(n−3) + vn−2X^(n−2) + vn−1X^(n−1) ….. (4.4)

Notice that the coefficients of the polynomial are either ‘0’ or ‘1’ (binary codes), i.e. they belong to
GF(2), as discussed in sec 5.7.1.

. Each power of X in V(X) represents a one-bit cyclic shift in time.

. Therefore, multiplication of V(X) by X may be viewed as a cyclic shift or rotation to the right, subject
to the condition X^n = 1. This condition (i) restores XV(X) to degree ≤ (n−1) and (ii) implies that the
right-most bit is fed back at the left.

. This special form of multiplication is called "multiplication modulo (X^n + 1)".

. Thus, for a single shift, we have

XV(X) = v0X + v1X^2 + v2X^3 + … + vn−2X^(n−1) + vn−1X^n

      (adding vn−1 + vn−1 = 0 … using the binary arithmetic A + A = 0)

      = vn−1 + v0X + v1X^2 + … + vn−2X^(n−1) + vn−1(X^n + 1)

so that V(1)(X) = remainder obtained by dividing XV(X) by (X^n + 1).

(Remember: X mod Y means the remainder obtained after dividing X by Y.)

Thus it turns out that

V(1)(X) = vn−1 + v0X + v1X^2 + … + vn−2X^(n−1) ……………… (4.5)

is the code polynomial for v(1). We can continue in this way to arrive at the general format:

X^i·V(X) = q(X)·(X^n + 1) + V(i)(X) …………………… (4.6)
              (quotient)     (remainder)

where

V(i)(X) = vn−i + vn−i+1X + vn−i+2X^2 + … + vn−1X^(i−1) + v0X^i + v1X^(i+1) + … + vn−i−1X^(n−1) …… (4.7)
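Since multiplication by X^i modulo (X^n + 1) merely rotates the coefficient vector, a cyclic shift costs one line of Python. A sketch, with coefficients stored constant-term first; cyclic_right_shift is our own name:

def cyclic_right_shift(v, i=1):
    # X^i · V(X) mod (X^n + 1): rotate (v0, v1, ..., v_{n-1}) right by i places, Eq. (4.6)
    n = len(v)
    return [v[(j - i) % n] for j in range(n)]

v = [1, 1, 0, 0, 0, 1, 1, 0, 0]      # the code word read from 'A' in Fig 4.1
print(cyclic_right_shift(v))         # [0, 1, 1, 0, 0, 0, 1, 1, 0]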

4.1 GENERATOR POLYNOMIAL FOR CYCLIC CODES:

An (n, k) cyclic code is specified by the complete set of code polynomials of degree ≤ (n−1),
each of which contains a polynomial g(X) of degree (n−k) as a factor; g(X) is called the "generator
polynomial" of the code. This polynomial is equivalent to the generator matrix G of block codes.
Further, it is the only code polynomial of minimum degree and is unique. Thus we have an important
theorem.

Theorem 4.1: If g(X) is a polynomial of degree (n−k) and is a factor of (X^n + 1), then g(X) generates
an (n, k) cyclic code in which the code polynomial V(X) for a data vector u = (u0, u1, …, uk−1) is
generated by

V(X) = U(X)·g(X) ………………… (4.8)

where U(X) = u0 + u1X + u2X^2 + … + uk−1X^(k−1) ……………….. (4.9)

is the data polynomial of degree (k−1).

The theorem can be justified by contradiction: if there were another code polynomial of the same
minimum degree, then adding the two polynomials (using the linearity property and binary arithmetic)
would give a code polynomial of degree less than (n−k), which is not possible because the minimum
degree is (n−k). Hence g(X) is unique.

Clearly, there are 2^k code polynomials corresponding to the 2^k data vectors. The code vectors
corresponding to these code polynomials form a linear (n, k) code. We have then, from the theorem,

g(X) = 1 + ∑_{i=1}^{n−k−1} gi·X^i + X^(n−k) …………………… (4.10)

As g(X) = g0 + g1X + g2X^2 + … + gn−k−1X^(n−k−1) + gn−kX^(n−k) ………… (4.11)

is a polynomial of minimum degree, it follows that g0 = gn−k = 1 always, and the remaining
coefficients may be either ‘0’ or ‘1’. Performing the multiplication indicated in Eq (4.8), we have:

U(X)·g(X) = u0·g(X) + u1·X·g(X) + … + uk−1·X^(k−1)·g(X) …………. (4.12)


Suppose u0 = 1 and u1 = u2 = … = uk−1 = 0. Then from Eq (4.8) it follows that g(X) itself is a code word
polynomial of degree (n−k). It is treated as a ‘basis code polynomial’ (all rows of the G matrix of a block
code, being linearly independent, are also valid code vectors and form the ‘basis vectors’ of the code).
Therefore, from the cyclic property, X^i·g(X) is also a code polynomial. Moreover, from the linearity
property, a linear combination of code polynomials is also a code polynomial. It follows therefore that
any multiple of g(X), as shown in Eq (4.12), is a code polynomial. Conversely, any binary polynomial of
degree ≤ (n−1) is a code polynomial if and only if it is a multiple of g(X). The code words generated
using Eq (4.8) are in non-systematic form. Non-systematic cyclic codes can be generated by simple
binary multiplication circuits using shift registers.

In this book we have described cyclic codes with the right-shift operation. The left-shift version
can be obtained by simply re-writing the polynomials. Thus, for left-shift operations, the various
polynomials take the following form:

U(X) = u0X^(k−1) + u1X^(k−2) + … + uk−2X + uk−1 ……………….. (4.13a)

V(X) = v0X^(n−1) + v1X^(n−2) + … + vn−2X + vn−1 ……………… (4.13b)

g(X) = g0X^(n−k) + g1X^(n−k−1) + … + gn−k−1X + gn−k …………… (4.13c)

     = X^(n−k) + ∑_{i=1}^{n−k−1} gi·X^(n−k−i) + gn−k ………………… (4.13d)

Other manipulations and implementation procedures remain unaltered.

4.2 MULTIPLICATION CIRCUITS


Encoders and decoders for linear block codes are usually constructed with combinational logic
circuits and mod-2 adders. Multiplication of two polynomials A(X) and B(X), and the division of one
by the other, are realized by using sequential logic circuits, mod-2 adders and shift registers. In this
section we shall consider multiplication circuits.

As a convention, the higher-order coefficients of a polynomial are transmitted first. This is the
reason for the format of polynomials used in this book.

For the polynomial: A(X) = a0 + a1X + a2X^2 + … + an−1X^(n−1) ………… (4.14)

where the ai's are either ‘0’ or ‘1’, the right-most bit in the sequence (a0, a1, a2, …, an−1), i.e. an−1,
is transmitted first in any operation. The product of the two polynomials A(X) and B(X) yields:

C(X) = A(X)·B(X)

     = (a0 + a1X + a2X^2 + … + an−1X^(n−1))·(b0 + b1X + b2X^2 + … + bm−1X^(m−1))

     = a0b0 + (a1b0 + a0b1)X + (a0b2 + a2b0 + a1b1)X^2 + … + (an−2bm−1 + an−1bm−2)X^(n+m−3) + an−1bm−1X^(n+m−2)

This product may be realized with the circuits of Fig 4.2 (a) or (b), where A(X) is the input and the
coefficients of B(X) are given as weighting-factor connections to the mod-2 adders. A ‘0’ indicates no
connection while a ‘1’ indicates a connection. Since higher-order coefficients are sent first, the highest-
order coefficient an−1bm−1 of the product polynomial is obtained first at the output of Fig 4.2(a).
Then the coefficient of X^(n+m−3) is obtained as the sum {an−2bm−1 + an−1bm−2}, the first term directly
and the second term through the shift register SR1. Lower-order coefficients are then generated
through the successive SRs and mod-2 adders. After (n + m − 2) shifts, the SRs contain
(0, 0, …, 0, a0, a1) and the output is (a0b1 + a1b0), which is the coefficient of X. After (n + m − 1) shifts,
the SRs contain (0, 0, …, 0, a0) and the output is a0b0. The product is now complete and the contents
of the SRs become (0, 0, …, 0). Fig 4.2(b) performs the multiplication in a similar way, but the
arrangement of the SRs and the ordering of the coefficients are different (reverse order!). This
modification helps to combine two multiplication operations into one, as shown in Fig 4.2(c).

From the above description, it is clear that a non-systematic cyclic code may be generated using
(n−k) shift registers. The following examples illustrate the concepts described so far.

Fig 4.2: Multiplication circuits
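A behavioural model of these circuits is just polynomial multiplication over GF(2). A Python sketch (coefficients listed constant-term first), checked against the (7, 4) generator polynomial used later in Example 4.2; gf2_poly_mul is our own name:

def gf2_poly_mul(a, b):
    # C(X) = A(X)·B(X) with mod-2 coefficient arithmetic
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] ^= ai & bj
    return c

A = [1, 0, 1, 1]              # A(X) = 1 + X^2 + X^3
B = [1, 1, 0, 1]              # B(X) = 1 + X + X^3
print(gf2_poly_mul(A, B))     # [1, 1, 1, 1, 1, 1, 1]  ->  v = (1 1 1 1 1 1 1)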

Example 4.1: Consider that a polynomial A(X) is to be multiplied by

B(X) = 1 + X + X^3 + X^4 + X^6

The circuits of Fig 4.3 (a) and (b) give the product C(X) = A(X)·B(X).

Fig 4.3: Circuits to perform C(X) = A(X)·(1 + X + X^3 + X^4 + X^6)

Example 4.2: Consider the generation of a (7, 4) cyclic code. Here (n−k) = (7−4) = 3, and we have to
find a generator polynomial of degree 3 which is a factor of X^n + 1 = X^7 + 1.

To find the factors of degree 3, divide X^7 + 1 by X^3 + aX^2 + bX + 1, where ‘a’ and ‘b’ are binary
numbers, to get the remainder abX^2 + (1 + a + b)X + (a + b + ab + 1). The only condition for the
remainder to be zero is a + b = 1, which means either a = 1, b = 0 or a = 0, b = 1. Thus we have two
possible polynomials of degree 3, namely

g1(X) = X^3 + X^2 + 1 and g2(X) = X^3 + X + 1

In fact, X^7 + 1 can be factored as:

(X^7 + 1) = (X + 1)(X^3 + X^2 + 1)(X^3 + X + 1)

Thus the selection of a ‘good’ generator polynomial is seen to be a major problem in the design of
cyclic codes. No clear-cut procedures are available; usually, computer search procedures are followed.

Let us choose g(X) = X^3 + X + 1 as the generator polynomial. The encoding circuits are shown
in Fig 4.4 (a) and (b).

Fig 4.4 Generation of Non-systematic cyclic codes


To understand the operation, let us consider u = (1 0 1 1), i.e.

U(X) = 1 + X^2 + X^3.

We have V(X) = (1 + X^2 + X^3)(1 + X + X^3)

             = 1 + X + X^3 + X^2 + X^3 + X^5 + X^3 + X^4 + X^6

             = 1 + X + X^2 + X^3 + X^4 + X^5 + X^6   (the three X^3 terms reduce to one, since X^3 + X^3 = 0)

⇒ v = (1 1 1 1 1 1 1)

The multiplication operation performed by the circuit of Fig 4.4(a) is listed in the table below, step
by step. After shift number 4, ‘000’ is introduced to flush the registers. As seen from the tabulation, the
product polynomial is

V(X) = 1 + X + X^2 + X^3 + X^4 + X^5 + X^6,

and hence the output code vector is v = (1 1 1 1 1 1 1), as obtained by direct multiplication. The reader
can verify the operation of the circuit in Fig 4.4(b) in the same manner. Thus the multiplication circuits
of Fig 4.4 can be used for the generation of non-systematic cyclic codes.

Table showing the sequence of computations

Shift No. | Input queue | Bit shifted IN | SR1 SR2 SR3 | Output | Remarks
    0     |   0001011   |       –        |  0   0   0  |   –    | circuit in reset mode
    1     |   000101    |       1        |  1   0   0  |   1    | coefficient of X^6
    2     |   00010     |       1        |  1   1   0  |   1    | coefficient of X^5
    3     |   0001      |       0        |  0   1   1  |   1    | coefficient of X^4
   *4     |   000       |       1        |  1   0   1  |   1    | coefficient of X^3
    5     |   00        |       0        |  0   1   0  |   1    | coefficient of X^2
    6     |   0         |       0        |  0   0   1  |   1    | coefficient of X^1
    7     |   –         |       0        |  0   0   0  |   1    | coefficient of X^0

4.3 DIVIDING CIRCUITS:

As in the case of multipliers, the division of A(X) by B(X) can be accomplished by using shift
registers and mod-2 adders, as shown in Fig 4.5. In a division circuit, the first coefficient of the
quotient is q1 = (an−1 ÷ bm−1), and q1·B(X) is subtracted from A(X). This subtraction is carried out by
the feedback connections shown. The process continues for the second and subsequent terms.
Remember, however, that these coefficients are binary coefficients. After (n−1) shifts, the entire
quotient will have appeared at the output, and the remainder is stored in the shift registers.

Fig 4.5: Dividing circuit


It is possible to combine a divider circuit with a multiplier circuit to build a “composite
multiplier-divider circuit”, which is useful in various encoding circuits. An arrangement to accomplish
this is shown in Fig 4.6(a), and an illustration is shown in Fig 4.6(b).

We shall understand the operation of one divider circuit through an example. The operation of the
other circuits can be understood in a similar manner.

Example 4.3:

Let A(X) = X^3 + X^5 + X^6, i.e. A = (0 0 0 1 0 1 1), and B(X) = 1 + X + X^3. We want to find the quotient
and remainder after dividing A(X) by B(X). The circuit to perform this division is shown in Fig 4.7,
drawn using the format of Fig 4.5(a). The operation of the divider circuit is listed in the table below:

Fig 4.6 Circuits for Simultaneous Multiplication and division


Fig 4.7 Circuit for dividing A(X) by (1 + X + X^3)

Table showing the sequence of operations of the dividing circuit

Shift No. | Input queue | Bit shifted IN | SR1 SR2 SR3 | Output | Remarks
    0     |   0001011   |       –        |  0   0   0  |   –    | circuit in reset mode
    1     |   000101    |       1        |  1   0   0  |   0    | coefficient of X^6
    2     |   00010     |       1        |  1   1   0  |   0    | coefficient of X^5
    3     |   0001      |       0        |  0   1   1  |   0    | coefficient of X^4
   *4     |   000       |       1        |  0   1   1  |   1    | coefficient of X^3
    5     |   00        |       0        |  1   1   1  |   1    | coefficient of X^2
    6     |   0         |       0        |  1   0   1  |   1    | coefficient of X^1
    7     |   –         |       0        |  1   0   0  |   1    | coefficient of X^0

The quotient coefficients become available only after the fourth shift, as the first three shifts
merely enter the first 3 bits into the shift registers, and in each of these shifts the output of the last
register, SR3, is zero.

The quotient coefficients serially presented at the output are seen to be (1 1 1 1), and hence the
quotient polynomial is Q(X) = 1 + X + X^2 + X^3. The remainder coefficients are (1 0 0), and the
remainder polynomial is R(X) = 1.
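The arithmetic performed by the divider is ordinary long division over GF(2). The following Python sketch returns quotient and remainder coefficient lists (constant term first) and reproduces Example 4.3; gf2_poly_divmod is our own name:

def gf2_poly_divmod(a, b):
    # long division of A(X) by B(X) over GF(2); subtraction = mod-2 addition
    a, db = a[:], len(b) - 1
    q = [0] * max(len(a) - db, 1)
    for i in range(len(a) - db - 1, -1, -1):   # highest-order quotient coefficient first
        if a[i + db]:
            q[i] = 1
            for j, bj in enumerate(b):         # subtract X^i · B(X)
                a[i + j] ^= bj
    return q, a[:db]

A = [0, 0, 0, 1, 0, 1, 1]                      # A(X) = X^3 + X^5 + X^6
B = [1, 1, 0, 1]                               # B(X) = 1 + X + X^3
print(gf2_poly_divmod(A, B))                   # ([1, 1, 1, 1], [1, 0, 0]): Q = 1+X+X^2+X^3, R = 1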

4.4 SYSTEMATIC CYCLIC CODES:


Let us assume a systematic format for the cyclic code as below:

v = (p0, p1, p2, …, pn−k−1, u0, u1, u2, …, uk−1) …………… (4.15)

The code polynomial in the assumed systematic format becomes:

V(X) = p0 + p1X + p2X^2 + … + pn−k−1X^(n−k−1) + u0X^(n−k) + u1X^(n−k+1) + … + uk−1X^(n−1) ………... (4.16)

     = P(X) + X^(n−k)·U(X) …………………… (4.17)

Since the code polynomial is a multiple of the generator polynomial, we can write:

V(X) = P(X) + X^(n−k)·U(X) = Q(X)·g(X) ....................... (4.18)

∴ X^(n−k)·U(X)/g(X) = Q(X) + P(X)/g(X) ………………. (4.19)

Thus division of X^(n−k)·U(X) by g(X) gives us the quotient polynomial Q(X) and the remainder
polynomial P(X). Therefore, to obtain a cyclic code in systematic form, we determine the
remainder polynomial P(X) after dividing X^(n−k)·U(X) by g(X). This division process can be easily
achieved by noting that "multiplication by X^(n−k) amounts to shifting the sequence by (n−k) bits".
Specifically, in the circuit of Fig 4.5(a), if the input A(X) is applied to the mod-2 adder after the
(n−k)-th shift register, the result is the division of X^(n−k)·A(X) by B(X).

Accordingly, we have the following scheme to generate systematic cyclic codes. The
generator polynomial is written as:

g(X) = 1 + g1X + g2X^2 + g3X^3 + … + gn−k−1X^(n−k−1) + X^(n−k) ………… (4.20)

The circuit of Fig 4.8 does the job of dividing X^(n−k)·U(X) by g(X). The following steps describe
the encoding operation.

Fig 4.8 Systematic encoding of cyclic codes using (n−k) shift register stages

1. The switch S is in position 1 to allow transmission of the message bits directly to the
output shift register during the first k shifts.
2. At the same time, the GATE is ON to allow transmission of the message bits into the
(n−k)-stage encoding shift register.
3. After transmission of the k-th message bit, the GATE is turned OFF and the switch S is
moved to position 2.
4. The (n−k) zeroes introduced at "A" after Step 3 clear the encoding register by moving the
parity bits to the output register.
5. The total number of shifts is equal to n, and the contents of the output register form the
code word polynomial V(X) = P(X) + X^(n−k)·U(X).
6. After Step 4, the encoder is ready to take up encoding of the next message input.

Clearly, the encoder is very much simpler than the encoder of an (n, k) linear block code and the
memory requirements are reduced. The following example illustrates the procedure.
Example 4.4:

Let u = (1 0 1 1), and suppose we want a (7, 4) cyclic code in systematic form. The generator
polynomial chosen is g(X) = 1 + X + X^3.

For the given message, U(X) = 1 + X^2 + X^3

X^(n−k)·U(X) = X^3·U(X) = X^3 + X^5 + X^6

We perform the direct division of X^(n−k)·U(X) by g(X) as in Example 4.3. From the direct division
observe that p0 = 1, p1 = p2 = 0. Hence the code word in systematic format is:

v = (p0, p1, p2; u0, u1, u2, u3) = (1, 0, 0, 1, 0, 1, 1)

Fig 4.9 Encoder for the (7,4) cyclic code


The encoder circuit for the problem on hand is shown in Fig 4.9. The operational steps are as follows:

Shift No. | Input queue | Bit shifted IN | Register contents | Output
    0     |    1011     |       –        |        000        |   –
    1     |    101      |       1        |        110        |   1
    2     |    10       |       1        |        101        |   1
    3     |    1        |       0        |        100        |   0
    4     |    –        |       1        |        100        |   1

After the fourth shift, the GATE is turned OFF, the switch S is moved to position 2, and the parity
bits contained in the register are shifted to the output. The output code vector is v = (1 0 0 1 0 1 1), which
agrees with the direct hand calculation.
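The whole systematic encoding procedure reduces to one remainder computation, as Eq. (4.19) promises. A self-contained Python sketch (the helper names gf2_mod and encode_systematic are our own):

def gf2_mod(a, g):
    # remainder of a(X) divided by g(X) over GF(2), coefficients constant-term first
    a, dg = a[:], len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j, gj in enumerate(g):
                a[i - dg + j] ^= gj
    return a[:dg]

def encode_systematic(u, g):
    nk = len(g) - 1                      # n - k = degree of g(X)
    p = gf2_mod([0] * nk + u, g)         # P(X) = X^(n-k)·U(X) mod g(X)
    return p + u                         # v = (p0 ... p_{n-k-1}, u0 ... u_{k-1}), Eq. (4.15)

print(encode_systematic([1, 0, 1, 1], [1, 1, 0, 1]))   # [1, 0, 0, 1, 0, 1, 1]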

4.5 GENERATOR MATRIX FOR CYCLIC CODES:

The generator polynomial g(X) and the parity check polynomial h(X) uniquely specify the
generator matrix G and the parity check matrix H respectively. We shall consider the construction of
a generator matrix for a (7, 4) code generated by the polynomial g(X) = 1 + X + X^3.

We start with the generator polynomial and its three cyclically shifted versions, as below:

g(X)      = 1 + X + X^3
X·g(X)    = X + X^2 + X^4
X^2·g(X)  = X^2 + X^3 + X^5
X^3·g(X)  = X^3 + X^4 + X^6

The coefficients of these polynomials are used as the elements of the rows of a (4 × 7) matrix to get
the following generator matrix:

G = | 1 1 0 1 0 0 0 |
    | 0 1 1 0 1 0 0 |
    | 0 0 1 1 0 1 0 |
    | 0 0 0 1 1 0 1 |

Clearly, the generator matrix so constructed is not in a systematic format. We can transform it into
a systematic format using row manipulations. The manipulations are:

First row = First row; Second row = Second row; Third row = First row + Third row; Fourth row =
First row + Second row + Fourth row.

These operations give the following result:

G = | 1 1 0 1 0 0 0 |
    | 0 1 1 0 1 0 0 |  = [P ⋮ I4]
    | 1 1 1 0 0 1 0 |
    | 1 0 1 0 0 0 1 |

Using this generator matrix, which is in systematic form, the code word for u = (1 0 1 1) is
v = (1 0 0 1 0 1 1) (obtained as the sum of the 1st, 3rd and 4th rows of the G matrix). The result
agrees with the direct hand calculation.
To construct the H matrix directly, we start with the reciprocal of the parity check polynomial,
defined by X^k·h(X^−1). Observe that the polynomial X^k·h(X^−1) is also a factor of the polynomial
X^n + 1. For the polynomial (X^7 + 1) we have three primitive factors, namely (X + 1), (X^3 + X + 1)
and (X^3 + X^2 + 1). Since we have chosen (X^3 + X + 1) as the generator polynomial, the other two
factors give us the parity check polynomial:

h(X) = (X + 1)(X^3 + X^2 + 1) = X^4 + X^2 + X + 1

Therefore, with h(X) = 1 + X + X^2 + X^4, we have

h(X^−1) = 1 + X^−1 + X^−2 + X^−4, and

X^k·h(X^−1) = X^4·h(X^−1) = X^4 + X^3 + X^2 + 1

The two cyclically shifted versions are:

X^5·h(X^−1) = X^5 + X^4 + X^3 + X

X^6·h(X^−1) = X^6 + X^5 + X^4 + X^2

Using the coefficients of these polynomials as rows, we have:

H = | 1 0 1 1 1 0 0 |
    | 0 1 0 1 1 1 0 |
    | 0 0 1 0 1 1 1 |

Clearly, this matrix is in non-systematic form. It is interesting to check that for the non-
systematic matrices obtained, G·H^T = 0. We can obtain the H matrix in the systematic format
H = [I3 ⋮ P^T] by using row manipulations. The manipulation in this case is simply
‘First row = First row + Third row’. The result is

H = | 1 0 0 1 0 1 1 |
    | 0 1 0 1 1 1 0 |
    | 0 0 1 0 1 1 1 |

Observe the systematic format adopted: G = [P ⋮ Ik] and H = [In−k ⋮ P^T].

4.6 SYNDROME CALCULATION – ERROR DETECTION AND ERROR CORRECTION:
Suppose the code vector v = (v0, v1, v2, …, vn−1) is transmitted over a noisy channel. The
received vector may then be a corrupted version of the transmitted code vector. Let the received vector
be r = (r0, r1, r2, …, rn−1). The received vector may not be any one of the 2^k valid code vectors. The
function of the decoder is to determine the transmitted code vector based on the received vector.
The decoder, as in the case of linear block codes, first computes the syndrome to check whether
or not the received vector is a valid code vector. In the case of cyclic codes, if the syndrome is
zero, then the received code word polynomial must be divisible by the generator polynomial. If the
syndrome is non-zero, the received word contains transmission errors and needs
error correction. Let the received vector be represented by the polynomial

R(X) = r0 + r1X + r2X^2 + … + rn−1X^(n−1)

Let A(X) be the quotient and S(X) be the remainder polynomials resulting from the division
of R(X) by g(X), i.e.

R(X)/g(X) = A(X) + S(X)/g(X) ……………….. (4.21)

The remainder S(X) is a polynomial of degree (n−k−1) or less. It is called the "syndrome polynomial".
If E(X) is the polynomial representing the error pattern caused by the channel, then we have:

R(X) = V(X) + E(X) ……………….. (4.22)

And it follows, as V(X) = U(X)·g(X), that:

E(X) = [A(X) + U(X)]·g(X) + S(X) ………………. (4.23)

That is, the syndrome of R(X) is equal to the remainder resulting from dividing the error pattern by the
generator polynomial; the syndrome thus contains information about the error pattern, which can be
used for error correction. A “syndrome calculator” is shown in Fig 4.10.

Fig 4.10 Syndrome calculator using (n-k) shift register

The syndrome calculations are carried out as below:

1. The register is first initialized. With GATE 2 ON and GATE 1 OFF, the received vector is
entered into the register.

2. After the entire received vector has been shifted into the register, the contents of the register
will be the syndrome, which can be shifted out of the register by turning GATE 1 ON and
GATE 2 OFF. The circuit is then ready for processing the next received vector.
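Eqs. (4.21)–(4.23) can be verified numerically: the syndrome of R(X) = V(X) + E(X) equals the syndrome of E(X) alone. A Python sketch, repeating the gf2_mod remainder routine from the encoding sketch above so that it runs stand-alone:

def gf2_mod(a, g):
    # remainder of a(X) divided by g(X) over GF(2), coefficients constant-term first
    a, dg = a[:], len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j, gj in enumerate(g):
                a[i - dg + j] ^= gj
    return a[:dg]

g = [1, 1, 0, 1]                    # g(X) = 1 + X + X^3
v = [1, 0, 0, 1, 0, 1, 1]           # a (7, 4) code word, hence divisible by g(X)
e = [0, 0, 0, 0, 0, 0, 1]           # error in the highest-order position X^6
r = [vi ^ ei for vi, ei in zip(v, e)]
print(gf2_mod(v, g))                # [0, 0, 0] : zero syndrome for a code word
print(gf2_mod(r, g), gf2_mod(e, g)) # [1, 0, 1] [1, 0, 1] : S(R) = S(E)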

Cyclic codes are extremely well suited for error detection. They can be designed to detect
many combinations of likely errors, and the implementation of error-detecting and error-correcting
circuits is practical and simple. Error detection can be achieved by employing (or adding) an additional
R-S flip-flop to the syndrome calculator: if the syndrome is nonzero, the flip-flop sets and provides an
indication of error. Because of this ease of implementation, virtually all error-detecting codes are
cyclic codes. If we are interested in error correction, then the decoder must be capable of
determining the error pattern E(X) from the syndrome S(X) and adding it to R(X) to
determine the transmitted V(X). The scheme shown in Fig 4.11 may be employed for the
purpose. The error correction procedure consists of the following steps:

Step 1. The received data is shifted into the buffer register and the syndrome register with the switches
SIN closed and SOUT open; error correction is performed with SIN open and SOUT
closed.

Step 2. After the syndrome for the received code word is calculated and placed in the syndrome
register, the contents are read into the error detector. The detector is a combinatorial
circuit designed to output a ‘1’ if and only if the syndrome corresponds to a correctable
error pattern with an error at the highest-order position X^(n−1). That is, if the detector
output is a ‘1’, then the received digit at the right-most stage of the buffer register is
assumed to be in error and will be corrected. If the detector output is ‘0’, then the received
digit at the right-most stage of the buffer is assumed to be correct. Thus the detector output
is the estimated error value for the digit coming out of the buffer register.

Fig 4.11 General decoder for cyclic code

Step 3. In the third step, the first received digit is shifted out of the buffer and, at the same time, the
syndrome register is shifted once. If the first received digit is in error, the detector output
will be ‘1’, which is used for error correction. The output of the detector is also fed back
to the syndrome register to modify the syndrome. This results in a new syndrome,
corresponding to the ‘altered’ received code word shifted to the right by one place.

Step 4. The new syndrome is now used to check whether the second received digit, which
is now at the right-most position, is an erroneous digit. If so, it is corrected, a new
syndrome is calculated as in Step 3, and the procedure is repeated.
Step 5. The decoder operates on the received data digit by digit until the entire
received code word is shifted out of the buffer.

At the end of the decoding operation, that is, after the received code word is shifted out of the
buffer, all those errors corresponding to correctable error patterns will have been corrected, and the
syndrome register will contain all zeros. If the syndrome register does not contain all zeros, this means
that an uncorrectable error pattern has been detected. The decoding schemes described in Fig 4.10 and
Fig 4.11 can be used for any cyclic code. However, their practicality depends on the complexity of the
combinational logic circuits of the error detector. In fact, there are special classes of cyclic codes for
which the decoder can be realized by simpler circuits. However, the price paid for such simplicity is a
reduction of code efficiency for a given block size.

A decoder of the form described above operates on the received data bit by bit; each bit is
tested in turn for error and is corrected whenever an error is located. Such a decoder is called a
“Meggitt decoder”.

For illustration, let us consider a decoder for a (7, 4) cyclic code generated by

g(X) = 1 + X + X^3

The circuit implementation of the Meggitt decoder is shown in Fig 4.12. The entire received
vector R(X) is entered into the SRs bit by bit and at the same time is stored in the buffer memory.
The division process starts after the third shift, and after the seventh shift the syndrome is stored
in the SRs. If S(X) = (0 0 0), then E(X) = 0 and R(X) is read out of the buffer. To see how a nonzero
syndrome is used, suppose E(X) = (0 0 0 0 0 0 1), i.e. an error in the highest-order position. Then the
successive SR contents are (001, 110, 011, 111, 101), showing that S(X) = (1 0 1) after the seventh
shift. At the eighth shift, the SR content is (1 0 0), and this may be used through a coincidence circuit
to correct the error bit coming out of the buffer at the eighth shift. On the other hand, if the error
polynomial were E(X) = (0 0 0 1 0 0 0), then the SR content would be (1 0 0) at the eleventh shift, and
the error would be corrected when the buffer delivers the erroneous bit at the eleventh shift. The SR
contents at successive shifts for two other received vectors are shown in the table below:
SR contents for the error patterns (1001010) and (1001111)
Shift No. | Input | SR content for (1001010) | Input | SR content for (1001111)
    1     |   0   |          000             |   1   |          100
    2     |   1   |          100             |   1   |          110
    3     |   0   |          010             |   1   |          111
    4     |   1   |          101             |   1   |          001
    5     |   0   |          100             |   0   |          110
    6     |   0   |          010             |   0   |          011
    7     |   1   |          101             |   1   |          011
    8     |   0   |          100*            |   0   |          111
    9     |   -   |           -              |   0   |          101
   10     |   -   |           -              |   0   |          100*
(* indicates the shift at which the error is detected)
Fig 4.12 Meggitt decoder for (7,4) cyclic code

For R(X) = (1001010), the SR content is (100) at the 8th shift and the bit in the X^6 position of
R(X) is corrected, giving the correct V(X) = (1001011). On the other hand, if R(X) = (1001111), it
is seen from the table that at the 10th shift the syndrome content detects the error and corrects the
X^4 bit of R(X), again giving V(X) = (1001011).
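The behaviour of the syndrome register traced in the table above is easy to reproduce in software. The following sketch (our own illustration, not from the text; the function name and bit conventions are ours) simulates the division register of the (7, 4) Meggitt decoder for g(X) = 1 + X + X^3 and applies the (100)-detector rule; clearing the register after a correction plays the role of the detector feedback of Step 3, which suffices here because only single errors are corrected:

    def meggitt_decode_74(r):
        # Meggitt decoding sketch for the (7,4) cyclic code, g(X) = 1 + X + X^3.
        # r[i] = coefficient of X^i; bits are processed highest-order (X^6) first.
        s = [0, 0, 0]                       # syndrome register (s0, s1, s2)

        def shift(bit):
            fb = s[2]                       # feedback tap from the last stage
            s[2] = s[1]
            s[1] = s[0] ^ fb                # taps follow g(X) = 1 + X + X^3
            s[0] = bit ^ fb

        buf = r[::-1]                       # buffer: highest-order bit first
        for bit in buf:                     # first 7 shifts form the syndrome
            shift(bit)
        out = []
        for bit in buf:                     # 7 more shifts: test and correct
            shift(0)
            if s == [1, 0, 0]:              # detector fires: outgoing bit is in error
                bit ^= 1                    # correct it
                s[:] = [0, 0, 0]            # single error assumed corrected
            out.append(bit)
        return out[::-1]                    # back to (X^0 ... X^6) order

    print(meggitt_decode_74([1, 0, 0, 1, 0, 1, 0]))   # -> [1, 0, 0, 1, 0, 1, 1]
    print(meggitt_decode_74([1, 0, 0, 1, 1, 1, 1]))   # -> [1, 0, 0, 1, 0, 1, 1]

Both calls reproduce the corrected code word V = (1001011) obtained from the table.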

The decoder for the (15, 11) cyclic code, using g(X) = 1 + X + X^4, is shown in Fig 4.13. It is
easy to check that the SR content at the 16th shift is (1000) for E(X) = X^14. Hence a coincidence
circuit gives the correction signal to the buffer output as explained earlier.
Although Meggitt decoders are intended for single-error-correcting cyclic codes, they may
be generalized for multiple-error-correcting codes as well, for example the (15, 7) BCH code.
An error trapping decoder is a modification of a Meggitt decoder that is used for certain cyclic codes.

Fig 4.13 Meggitt decoder for (15,11) cyclic code

The syndrome polynomial is computed as S(X) = Remainder of [E(X)/g(X)]. If the error
E(X) is confined to the (n−k) parity-check positions (1, X, X^2, …, X^(n-k-1)) of R(X), then E(X) = S(X),
since the degree of E(X) is less than that of g(X). Thus error correction can be carried out by simply
adding S(X) to R(X). Even if E(X) is not confined to the (n−k) parity-check positions of R(X) but has
its nonzero values clustered together such that the span of nonzero values is shorter than the syndrome
length, the syndrome will still exhibit an exact replica of the error pattern after some cyclic shifts
of E(X). For each error pattern, the syndrome content S(X) (after the required shifts) is subtracted from
the appropriately shifted R(X), and the corrected V(X) recovered.

"If the syndrome of R(X) is taken to be the remainder after dividing X^(n-k) R(X) by g(X),
and all errors lie in the highest-order (n−k) symbols of R(X), then the nonzero portion of the error
pattern appears in the corresponding positions of the syndrome." Fig 4.14 shows an error trapping
decoder for a (15, 7) BCH code based on the principles described above. A total of 45 shifts
is required to correct a double error: 15 shifts to generate the syndrome, 15 shifts to correct the
first error and 15 shifts to correct the second error.

Fig 4.14 Error trapping decoder for (15,7) BCH code

Illustration:

U(X) = X^6 + 1;  g(X) = X^8 + X^7 + X^6 + X^4 + 1

V(X) = U(X) g(X) = X^14 + X^13 + X^12 + X^10 + X^8 + X^7 + X^4 + 1

E(X) = X^11 + X

R(X) = V(X) + E(X) = X^14 + X^13 + X^12 + X^11 + X^10 + X^8 + X^7 + X^4 + X + 1

r = (110010011011111)
Shift No. | Syndrome Register | Shift No. | Middle Register | Shift No. | Bottom Register
    1     |     10001011      |    16     |    01100011     |    31     |    00000100
    2     |     01000101      |    17     |    10111010     |    32     |    00000010
    3     |     00100010      |    18     |    01011101     |    33     |    00000001
    4     |     10011010      |    19     |    10100101     |    34     |    00000000
    5     |     11000110      |    20     |    11011001     |   35-45   |    all zeros
    6     |     01100011      |    21     |    11100111     |           |
    7     |     00110001      |    22     |    11111000     |           |
    8     |     00011000      |    23     |    01111100     |           |
    9     |     00001100      |    24     |    00111110     |           |
   10     |     00000110      |    25     |    00011111     |           |
   11     |     10001000      |    26     |    10000100     |           |
   12     |     01000100      |    27     |    01000010     |           |
   13     |     00100010      |    28     |    00100001     |           |
   14     |     10011010      |    29     |    00010000     |           |
   15     |     11000110      |    30     |    00001000     |           |
Errors trapped at shift numbers 28 and 33.
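The trapping condition itself can be checked without modelling the shift-register hardware. The sketch below is our own behavioural illustration (it uses plain remainders R(X) mod g(X) rather than the X^(n-k) pre-multiplication of the circuit): it cyclically shifts R(X) until the syndrome has weight ≤ t, at which point the syndrome is the shifted error pattern.

    def gf2_mod(a, g):
        # remainder of a(X) / g(X) over GF(2); polynomials as integer bit masks
        while a.bit_length() >= g.bit_length():
            a ^= g << (a.bit_length() - g.bit_length())
        return a

    def error_trap(r, g, n, t):
        # shift R(X) cyclically until a weight <= t error pattern is trapped
        # in the (n-k) parity positions, then shift the pattern back
        mask = (1 << n) - 1
        for i in range(n):
            ri = ((r << i) | (r >> (n - i))) & mask    # X^i R(X) mod (X^n + 1)
            s = gf2_mod(ri, g)                         # equals E_i(X) when trapped
            if bin(s).count('1') <= t:
                e = ((s >> i) | (s << (n - i))) & mask # undo the cyclic shift
                return r ^ e                           # corrected code word
        return None                                    # error not trappable

    # the (15, 7) BCH illustration above: g(X) = X^8+X^7+X^6+X^4+1, t = 2
    g = sum(1 << i for i in (8, 7, 6, 4, 0))
    r = sum(1 << i for i in (14, 13, 12, 11, 10, 8, 7, 4, 1, 0))
    v = error_trap(r, g, 15, 2)
    print(bin(r ^ v))                                  # recovered E(X) = X^11 + X

For the illustrated R(X), the error X^11 + X is recovered, in agreement with the table.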

Sometimes, when error trapping cannot be used for a given code, the test patterns can be
modified to include the few troublesome error patterns along with the general test. Such a modified
error trapping decoder is possible for the (23, 12) Golay code, in which the error pattern E(X) will be
of length 23 and weight 3 or less (t ≤ 3). The length of the syndrome register is 11, and if E(X) has
a length greater than 11, the error pattern is not trapped by cyclically shifting S(X). In this case, it can
be shown that one of the three error bits must have at least five zeros on one side of it and at least six
zeros on the other side. Hence all error patterns can be cyclically shifted into one of the following three
configurations (numbering the bit positions e0, e1, e2, …, e22):

(i) All errors (t ≤ 3) occur in the 11 high-order bits.

(ii) One error occurs in position e5 and the other two errors occur in the 11 high-order bits.

(iii) One error occurs in position e6 and the remaining two errors occur in the 11 high-order bits.

In the decoder shown in Fig 4.15, the received code vector R(X) is fed in at the rightmost stage of
the syndrome generator (as was done in Fig 4.14), which is equivalent to multiplying R(X) by X^11. The
syndromes corresponding to e5 and e6 are then obtained (using g1(X) as the generator polynomial) as:

    S(e5) = Remainder of [X^16 / g1(X)] = X + X^2 + X^5 + X^6 + X^8 + X^9  and

    S(e6) = Remainder of [X^17 / g1(X)] = X^2 + X^3 + X^6 + X^7 + X^9 + X^10

Fig 4.15 Error trapping decoder for (23,12) Golay code

The syndrome vectors for the errors e5 and e6 are therefore (01100110110) and (00110011011)
respectively. Two more errors occurring in the 11 high-order bit positions will cause two 1's in the
corresponding positions of the syndrome vector, thereby complementing the vector for e5 or e6. Based
on the above relations, the decoder operates as follows:

(i) The entire received vector is shifted into the syndrome generator (with switch G1 closed)
and the syndrome S(X) corresponding to X^11 R(X) is formed.

(ii) If all the (three or fewer) errors are confined to positions X^12, X^13, …, X^22 of R(X), then the
syndrome matches the errors in these positions and its weight is 3 or less. This is
checked by a threshold gate, whose output T0 switches G2 ON and G1 OFF. R(X) is then
read from the buffer and corrected by the syndrome bits (as they are clocked out bit by bit)
through the modulo-2 adder circuit.

(iii) If the test in (ii) fails, it is assumed that one error is either at e5 or at e6 and the other
two errors are in the 11 high-order bits of R(X). If the weight of S(X) exceeded 3 in
test (ii), the weights of [S(X) + S(e5)] and [S(X) + S(e6)] are tested. The decisions are:

1. If the weight of [S(X) + S(e5)] ≤ 2, the decision (T1 = 1) is that one error
is at position e5 and two errors are at the positions where [S(X) + S(e5)] is nonzero.

2. If the weight of [S(X) + S(e6)] ≤ 2, the decision (T2 = 1) is that one error
is at position e6 and two errors are at the positions where [S(X) + S(e6)] is nonzero.

These tests are arranged through combinatorial switching circuits and the appropriate
corrections are made as R(X) is read from the buffer.
(iv) If the above tests fail, then with G1 and G3 ON and G2 OFF, the syndrome and buffer
contents are shifted by one bit, and tests (ii) and (iii) are repeated. Bit-by-bit shifting of S(X)
and R(X) is continued until the errors are located and corrected. A maximum of 23 shifts
is required to complete the process. After correction of R(X), the corrected V(X) is further
processed through a divider circuit to obtain the message U(X) = V(X)/g(X).
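The syndromes S(e5) and S(e6) quoted above are easy to verify numerically. The sketch below assumes the standard generator g1(X) = 1 + X^2 + X^4 + X^5 + X^6 + X^10 + X^11 of the (23, 12) Golay code (not written out in the text above), with polynomials held as integer bit masks (bit i = coefficient of X^i):

    def gf2_mod(a, g):
        # remainder of a(X) / g(X) over GF(2)
        while a.bit_length() >= g.bit_length():
            a ^= g << (a.bit_length() - g.bit_length())
        return a

    g1 = sum(1 << i for i in (0, 2, 4, 5, 6, 10, 11))
    print(bin(gf2_mod(1 << 16, g1)))  # S(e5): nonzero at X, X^2, X^5, X^6, X^8, X^9
    print(bin(gf2_mod(1 << 17, g1)))  # S(e6): nonzero at X^2, X^3, X^6, X^7, X^9, X^10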

Assuming that upon cyclically shifting the block of 23 bits with t ≤ 3, at some shift at most one
error will lie outside the 11 high-order bits of R(X), an alternative decoding procedure can be
devised for the Golay coder: the systematic search decoder. Here, test (ii) is carried out first. If the
test fails, the first bit of R(X) is inverted and a check is made to see whether the weight of S(X) is ≤ 2. If
this test succeeds, the nonzero positions of S(X) give two error locations (similar to test
(iii) above) and the other error is at the first position. If this test fails, the syndrome content is
cyclically shifted, each time testing for weight of S(X) ≤ 3; and, failing that, the 2nd, 3rd, …, and 12th bits
of R(X) are inverted successively, testing each time for weight of S(X) ≤ 2. Since not all errors lie in the
parity-check section, an error must be detected in one of the shifts. Once one error is located and
corrected, the other two errors are easily located and corrected by test (ii). Sometimes the systematic
search decoder is simpler in hardware than the error trapping decoder, but the latter is faster in
operation. The systematic search decoder can be generalized for decoding other multiple-error-correcting
cyclic codes. It is to be observed that the Golay (23, 12) code cannot be decoded by majority-logic
decoders.
Module – 2

1) List the difference between channel coding and source coding.


2) Explain the G and H matrices and show that G · H^T = 0.
3) Consider a (6,3) linear block code whose generator matrix is given by

(i) Find the parity check matrix. (ii) Find the minimum distance of the code.

(iii) Draw the encoder and syndrome computation circuit

4) Write in detail about matrix description of Linear block codes.

Explain a typical data transmission system with a suitable block diagram.

5) Consider the (7, 4) linear block code whose generator matrix is given below:

a) Find all the code vectors.


b) Find parity check matrix (H).
c) Minimum weight of this code.
6) Describe about Cyclic codes with suitable examples.
7) Explain Channel Capacity and its equation.

For the channel matrix given below, compute the channel capacity

with rs=1000 symbols/sec

8) Generate a (7, 4) Hamming Code for the data [1 0 1 0] and calculate how a single-bit error
can be detected and corrected.
9) Describe the purpose of a syndrome calculation in cyclic codes with example.
10) For the binary symmetric channel shown below, P(x1) = β.
(i) Show that the mutual information I(X;Y) is given by:

    I(X;Y) = H(Y) + p log2 p + (1 − p) log2(1 − p)

(ii) Determine I(X;Y) for p = 0.1, β = 0.5.


11) Write the Generator matrix and parity check matrix of (7, 4) Hamming code in the systematic
form.
12) A certain linear block code has a minimum distance dmin=11.
How many errors can it detect? How many errors can it correct? Justify your answer.
13) Obtain the generator and parity-check matrices of a (7,4) cyclic code with generator
polynomial

14) Design a (4, 2) linear block code:


(i) Find the generator matrix for the code vector set.
(ii) Find the parity check matrix.
(iii) Choose the code-vectors to be in systematic form, with the goal of maximizing
d_min.
(iv) Enter the sixteen 4-tuples into a standard array.
(v) What are the error-detecting and error-correcting capabilities of the code?
(vi) Make a syndrome table for the correctable error patterns.
(vii) Draw the encoding circuit.
(viii) Draw the syndrome calculating circuit.
15) Design a linear block code with a minimum distance of 3 and a message block size of 8 bits.
Draw the [G] and [H] matrices.
16) For a (7, 4) cyclic code, the received vector Z(x) is 1110101 and the generator
polynomial is g(x) = 1 + x + x^3. Draw the syndrome calculation circuit and correct the single
error in the received vector.
17) Explain the concept of linear block codes in detail with examples.
Discuss the matrix representation of linear block codes and how it simplifies encoding and
decoding processes.
18) Illustrate a data transmission system using a linear block code.
Explain the working of the system with a block diagram, including encoding, transmission,
and decoding stages.
19) Given a (7,4) linear block code with the generator matrix:
(i) Find the parity-check matrix.
(ii) Determine the minimum distance of the code.
(iii) Draw the encoder and syndrome computation circuit for this code.
20) Design and explain a linear block code for error correction.
Provide a step-by-step procedure for constructing the generator and parity-check matrices
for a (5, 3) code.
21) Discuss the properties of an (n, k) linear block code.
Define parameters such as code rate, error-detecting capability, and error-correcting
capability.
22) For the generator matrix of a (6,4) linear block code:

(i) Write the corresponding parity-check matrix.


(ii) Encode the message vector [1, 0, 1, 0].
(iii) Identify any errors using the syndrome vector.
23) What is the significance of the minimum distance in a linear block code?
Derive its relationship with the error detection and correction capability.
24) Compare systematic and non-systematic generator matrices.
Provide examples of each and discuss their implications on the encoding process.
25) Explain in detail about encoding circuits of Linear Block Codes

26) A voice-grade channel of the telephone network has a bandwidth of 3.4 kHz.
(i) Determine the channel capacity of the telephone channel for a signal-to-noise ratio
of 30 dB.
(ii) Obtain the minimum signal-to-noise ratio required to support information
transmission through the telephone channel at the rate of 4800 bits/sec.
27) For a systematic (6,3) linear block code, the parity matrix P is given by:

    P = | 1 0 1 |
        | 0 1 1 |
        | 1 1 0 |

Find all possible code vectors.
28) a) Explain the significance of the entropy H(X/Y) of a communication system where X
is the transmitter and Y is the receiver.
b) Derive the relationship between entropy and mutual information.
29) A binary symmetric channel has source probabilities p(x1) = 3/4 and p(x2) = 1/4, and the
noise matrix:

    P(Y|X) = | 2/3 1/3 |
             | 1/3 2/3 |

Calculate H(X), H(X,Y) and H(Y|X).
30) Discuss Channel Capacity and its equation.
31) Suppose you have two different messages, "ABABAB" and "AAABBB", with symbol
probabilities p(a) = 0.5, p(b) = 0.3 and p(c) = 0.2. Calculate the encoded values
for both messages using arithmetic encoding. Examine which message results in a more
efficient encoding.
32) Find the generator and parity check matrices of a (7, 4) cyclic code with generator
polynomial g(X) = 1 + X + X^3.
Module 3 :
Codes on Graph

Introduction to Convolutional Codes, Tree Codes and Trellis Codes, Description of Convolutional Codes
(Analytical Representation), The Generating Function, Matrix Description of Convolutional Codes.
Viterbi Decoding of Convolutional Codes, Turbo codes, Encoding and decoding of Turbo codes.
INTRODUCTION

In block codes, a block of n digits generated by the encoder depends only on the block of k
data digits in a particular time unit. These codes can be generated by combinatorial logic circuits. In a
convolutional code, the block of n digits generated by the encoder in a time unit depends not only
on the block of k data digits within that time unit, but also on the preceding m input blocks. An (n,
k, m) convolutional code can be implemented with a k-input, n-output sequential circuit with input
memory m. Generally, k and n are small integers with k < n, but the memory order m must be made
large to achieve low error probabilities. In the important special case when k = 1, the information
sequence is not divided into blocks and can be processed continuously.

Similar to block codes, convolutional codes can be designed to either detect or correct errors.
However, since the data are usually re-transmitted in blocks, block codes are better suited for error
detection and convolutional codes are mainly used for error correction.

Convolutional codes were first introduced by Elias in 1955 as an alternative to block codes.
This was followed later by Wozencraft, Massey, Fano, Viterbi, Omura and others. A detailed
discussion and survey of the application of convolutional codes to practical communication channels
can be found in Shu Lin & Costello Jr., J. Das et al. and other standard books on error control coding.

To facilitate easy understanding, we follow the popular methods of representing convolutional
encoders, starting with a connection pictorial (needed for all descriptions), followed by connection
vectors.

5.1 CONNECTION PICTORIAL REPRESENTATION


The encoder for a (rate 1/2, K = 3) or (2, 1, 2) convolutional code is shown in Fig 5.1. Both
sketches shown are one and the same. While in Fig 5.1(a) we have shown a 3-bit register, by noting
that the content of the third stage is simply the output of the second stage, the circuit can be modified
to use only two shift register stages. This modification then clearly tells us that the memory requirement
is m = 2. For every bit input, the encoder produces two bits at its output. Thus the encoder is labeled
an (n, k, m) = (2, 1, 2) encoder.
Fig 5.1 A (2,1,2) Encoder (a) Representation using 3-bit shift register (b) Equivalent representation
requires only two shift register stages
At each input bit time, one bit is shifted into the leftmost stage and the bits present in the
registers are shifted to the right by one position. An output switch (commutator/MUX) samples the
output of each X-OR gate and forms the code symbol pairs for the bits introduced. The final code is
obtained after flushing the encoder with m zeros, where m is the memory order (in Fig 5.1, m = 2).
The sequence of operations performed by the encoder of Fig 5.1 for an input sequence u = (101) is
illustrated diagrammatically in Fig 5.2.

Fig 5.2
From Fig 5.2, the encoding procedure can be understood clearly. Initially the registers are in the
reset mode, i.e. (0, 0). At the first time unit the input bit is 1. This bit enters the first register and pushes
out its previous content, namely '0', as shown, which now enters the second register and pushes out
its previous content. All these bits, as indicated, are passed on to the X-OR gates and the output pair (1,
1) is obtained. The same steps are repeated until time unit 4, where zeros are introduced to clear the
register contents, producing two more output pairs. At time unit 6, if an additional '0' is introduced, the
encoder is reset and the output pair (0, 0) is obtained. However, this step is not absolutely necessary, as
the next bit, whatever it is, will flush out the content of the second register. The '0' and the '1'
indicated at the output of the second register at time unit 5 now vanish. Hence, after (L + m) = 3 + 2 = 5
time units, the output sequence reads v = (11, 10, 00, 10, 11), where L is the length of the input
sequence. This, then, is the code word produced by the encoder. It is very important to remember that
"leftmost symbols represent the earliest transmission".

As already mentioned, convolutional codes are intended for the purpose of error correction.
However, they suffer from the problem of choosing connections to yield good distance properties. The
selection of connections is indeed very complicated and has not been solved in general. Still, good codes
have been developed by computer search techniques for all constraint lengths less than
20. Another point to be noted is that convolutional codes do not have any particular block size;
they can be periodically truncated. The only requirement is that m zeros be appended to the end
of the input sequence for the purpose of clearing, or flushing, the encoding shift
registers of the data bits. These added zeros carry no information but have the effect of reducing the
code rate below k/n. To keep the code rate close to k/n, the truncation period is generally made as
long as practical. The encoding procedure, as depicted pictorially in Fig 5.2, is rather tedious. We can
instead approach the encoder in terms of its "impulse response" or "generator sequence", which merely
represents the response of the encoder to a single '1' bit that moves through it.

5.2 Convolutional Encoding – Time domain approach:

The encoder for a (2, 1, 3) code is shown in Fig 5.3. Here the encoder consists of an m = 3 stage
shift register, n = 2 modulo-2 adders (X-OR gates) and a multiplexer for serializing the encoder outputs.
Notice that modulo-2 addition is a linear operation, and it follows that all convolutional encoders can be
implemented using a "linear feed-forward shift register circuit".

The information sequence u = (u1, u2, u3, …) enters the encoder one bit at a time, starting from u1.
As the name implies, a convolutional encoder operates by performing convolutions on the information
sequence. Specifically, the encoder output sequences, in this case v(1) = {v1(1), v2(1), v3(1) …}
and v(2) = {v1(2), v2(2), v3(2) …}, are obtained by the discrete convolution of the information sequence
with the encoder 'impulse responses'. The impulse responses are obtained by determining the output
sequences of the encoder produced by the input sequence u = (1, 0, 0, 0, …). The impulse responses so
defined are called the 'generator sequences' of the code. Since the encoder has an m-time-unit memory,
the impulse responses can last at most (m + 1) time units (that is, a total of (m + 1) shifts are necessary for
a message bit to enter the shift register and finally come out) and are written as:

    g(j) = {g1(j), g2(j), g3(j), …, gm+1(j)}

Fig 5.3 (2,1,3) binary encoder

For the encoder of Fig 5.3, we require the two impulse responses:

    g(1) = {g1(1), g2(1), g3(1), g4(1)}  and  g(2) = {g1(2), g2(2), g3(2), g4(2)}

By inspection, these can be written as: g(1) = (1, 0, 1, 1) and g(2) = (1, 1, 1, 1).

Observe that the generator sequences represented here are simply the 'connection vectors' of the
encoder: a '1' indicates a connection and a '0' indicates no connection to the corresponding X-OR gate.
If we group the elements of the generator sequences so found into pairs, we get the overall impulse
response of the encoder. Thus for the encoder of Fig 5.3, the overall impulse response is:

    v = (11, 01, 11, 11)

The encoder outputs are defined by the convolution sums:

    v(1) = u * g(1)   …… (5.1a)
    v(2) = u * g(2)   …… (5.1b)

where * denotes discrete convolution, which implies:

    v_l(j) = Σ (i = 0 to m) u_{l-i} g_{i+1}(j)
           = u_l g1(j) + u_{l-1} g2(j) + u_{l-2} g3(j) + … + u_{l-m} gm+1(j)   …… (5.2)

for j = 1, 2, where u_{l-i} = 0 for all l < i, and all operations are modulo-2. Hence, for the encoder of
Fig 5.3, we have:

    v_l(1) = u_l + u_{l-2} + u_{l-3}
    v_l(2) = u_l + u_{l-1} + u_{l-2} + u_{l-3}
This can be easily verified by direct inspection of the encoding circuit. After encoding, the two
output sequences are multiplexed into a single sequence, called the "code word", for transmission over
the channel. The code word is given by:

    v = {v1(1) v1(2), v2(1) v2(2), v3(1) v3(2), …}


Example 5.1:

Suppose the information sequence is u = (10111). Then the output sequences are:

v(1) = (1 0 1 1 1) * (1 0 1 1) = (1 0 0 0 0 0 0 1),

v(2) = (1 0 1 1 1) * (1 1 1 1) = (1 1 0 1 1 1 0 1),

and the code word is

v = (11, 01, 00, 01, 01, 01, 00, 11)

The discrete convolution operation described in Eq (5.2) is merely the addition of shifted
impulse responses. Thus, to obtain the encoder output, we need only shift the overall impulse response
by one branch word for each successive input bit, multiply it by the corresponding input bit, and add
the results. This is illustrated in the table below:

    Input   Output
      1     11 01 11 11
      0        00 00 00 00                      (shifted one branch word)
      1           11 01 11 11                   (shifted two branch words)
      1              11 01 11 11                (shifted three branch words)
      1                 11 01 11 11             (shifted four branch words)
    Modulo-2 sum: 11 01 00 01 01 01 00 11

The modulo-2 sum represents the same sequence as obtained before, with no confusion at all
with respect to indices and suffixes. This very easy approach (superposition, i.e., linear addition of
shifted impulse responses) demonstrates that convolutional codes are linear codes, just like block codes
and cyclic codes. This approach then permits us to define a 'generator matrix' for the convolutional
encoder. Remember that interlacing the generator sequences gives the overall impulse response, and
hence they are used as the rows of the matrix. The number of rows equals the number of information
digits, so the matrix that results is "semi-infinite". The second and subsequent rows of the matrix
are merely shifted versions of the first row, each shifted with respect to the previous one by one branch
word. If the information sequence u has a finite length L, then G has L rows and n(m + L) columns
(i.e., (m + L) branch-word columns), and v has a length of n(m + L), i.e., (m + L) branch words, each
branch word being of length n. Thus the generator matrix G for encoders of the type shown in Fig 5.3
is written as:

    G = | g1(1) g1(2)   g2(1) g2(2)   g3(1) g3(2)   g4(1) g4(2)                              |
        |               g1(1) g1(2)   g2(1) g2(2)   g3(1) g3(2)   g4(1) g4(2)                |   …… (5.3)
        |                             g1(1) g1(2)   g2(1) g2(2)   g3(1) g3(2)   g4(1) g4(2)  |
        |                                           ⋱                                        |

(Blank places are zeros.)

The encoding equation in matrix form is:

    v = u G   …… (5.4)

Example 5.2:

For the information sequence of Example 5.1, the G matrix has 5 rows and 2(3 + 5) = 16 columns:

    G = | 11 01 11 11 00 00 00 00 |
        | 00 11 01 11 11 00 00 00 |
        | 00 00 11 01 11 11 00 00 |
        | 00 00 00 11 01 11 11 00 |
        | 00 00 00 00 11 01 11 11 |

Performing the multiplication v = u G as per Eq (5.4), we get v = (11, 01, 00, 01, 01, 01, 00, 11), the
same as before.
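Both the convolution view (Eq 5.2) and the matrix view (Eq 5.4) can be captured in a few lines of code. Below is a minimal sketch for rate 1/n encoders (the function name is ours; NumPy's convolve performs the discrete convolution of Eq 5.1):

    import numpy as np

    def conv_encode(u, gens):
        # Time-domain encoding for a rate 1/n convolutional code of memory m.
        # u: information bits; gens: generator sequences, each of length m+1.
        # Returns the multiplexed code word of length n*(L + m).
        L, m = len(u), len(gens[0]) - 1
        streams = [np.convolve(u, g) % 2 for g in gens]   # v(j) = u * g(j)
        return [int(s[t]) for t in range(L + m) for s in streams]

    # Example 5.1: u = (10111), g(1) = (1011), g(2) = (1111)
    print(conv_encode([1, 0, 1, 1, 1], [[1, 0, 1, 1], [1, 1, 1, 1]]))
    # -> [1,1, 0,1, 0,0, 0,1, 0,1, 0,1, 0,0, 1,1]

The output pairs reproduce v = (11, 01, 00, 01, 01, 01, 00, 11) obtained above.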

As a second example of a convolutional encoder, consider the (3, 2, 1) encoder shown in Fig 5.4.
Here, as k = 2, the encoder consists of two m = 1 stage shift registers together with n = 3 modulo-2
adders and two multiplexers. The information sequence enters the encoder k = 2 bits at a time and can
be written as u = {u1(1) u1(2), u2(1) u2(2), u3(1) u3(2), …} or as two separate input sequences:
u(1) = {u1(1), u2(1), u3(1), …} and u(2) = {u1(2), u2(2), u3(2), …}.

Fig 5.4 (3,2,1) Convolution encoder


There are three generator sequences corresponding to each input sequence. Letting
gi(j) = {gi,1(j), gi,2(j), gi,3(j), …, gi,m+1(j)} represent the generator sequence corresponding to
input i and output j, the generator sequences for this encoder are:

g1 (1) = (1, 1), g1 (2) = (1, 0), g1 (3) = (1, 0)

g2 (1) = (0, 1), g2 (2) = (1, 1), g2 (3) = (0, 0)

The encoding equations can be written as:

    v(1) = u(1) * g1(1) + u(2) * g2(1)   …… (5.5a)
    v(2) = u(1) * g1(2) + u(2) * g2(2)   …… (5.5b)
    v(3) = u(1) * g1(3) + u(2) * g2(3)   …… (5.5c)

The convolution operation implies that:

    v_l(1) = u_l(1) + u_{l-1}(1) + u_{l-1}(2)
    v_l(2) = u_l(1) + u_l(2) + u_{l-1}(2)
    v_l(3) = u_l(1)

as can be seen from the encoding circuit.

After multiplexing, the code word is given by:

    v = {v1(1) v1(2) v1(3), v2(1) v2(2) v2(3), v3(1) v3(2) v3(3), …}

Example 5.3:

Suppose u = (1 1 0 1 1 0). Hence u (1) = (1 0 1) and u (2) = (1 1 0). Then

v (1) = (1 0 1) * (1,1) + (1 1 0) *(0,1) = (1 0 0 1)

v (2) = (1 0 1) * (1,0) + (1 1 0) *(1,1) = (0 0 0 0)

v (3) = (1 0 1) * (1,0) + (1 1 0) *(0,0) = (1 0 1 0)

∴ v = (1 0 1, 0 0 0, 0 0 1, 1 0 0).
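For k = 2 inputs, the same computation can be coded directly from the convolution sums above. A small sketch (the function name and flushing convention are ours):

    def encode_321(u1, u2):
        # (3, 2, 1) encoder of Fig 5.4, implementing
        # v(1) = u(1) + u(1)' + u(2)', v(2) = u(1) + u(2) + u(2)', v(3) = u(1),
        # where ' denotes the previous time unit; one extra unit flushes m = 1.
        L = len(u1)
        prev1 = prev2 = 0
        v = []
        for l in range(L + 1):
            a = u1[l] if l < L else 0
            b = u2[l] if l < L else 0
            v += [a ^ prev1 ^ prev2, a ^ b ^ prev2, a]
            prev1, prev2 = a, b
        return v

    # Example 5.3: u(1) = (101), u(2) = (110)
    print(encode_321([1, 0, 1], [1, 1, 0]))   # -> [1,0,1, 0,0,0, 0,0,1, 1,0,0]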
The generator matrix for a (3, 2, m) code can be written as:

    G = | g11(1) g11(2) g11(3)   g12(1) g12(2) g12(3)  …  g1,m+1(1) g1,m+1(2) g1,m+1(3)      |
        | g21(1) g21(2) g21(3)   g22(1) g22(2) g22(3)  …  g2,m+1(1) g2,m+1(2) g2,m+1(3)      |   …… (5.6)
        |                        g11(1) g11(2) g11(3)  …                                      |
        |                        g21(1) g21(2) g21(3)  …                                      |
        |                                              ⋱                                      |

The encoding equations in matrix form are again given by v = u G. Observe that each set of k = 2 rows
of G is identical to the preceding set of rows but shifted by n = 3 places (one branch word) to the right.

Example 5.4:

For the Example 5.3, we have

u = {u1 (1) u1 (2), u2 (1) u2 (2), u3 (1) u3 (2)} = (1 1, 0 1, 1 0)


The generator matrix (truncated for L = 3) is:

    G = | 111 100 000 000 |
        | 010 110 000 000 |
        | 000 111 100 000 |
        | 000 010 110 000 |
        | 000 000 111 100 |
        | 000 000 010 110 |

(Remember that the blank places in the semi-infinite matrix are all zeros.)
Performing the matrix multiplication v = u G, we get v = (101, 000, 001, 100), again agreeing
with our previous computation using discrete convolution.

This second example clearly demonstrates the complexities involved in describing the code when
the number of input sequences is increased beyond k = 1. In this case, although the encoder contains
k shift registers, they need not all have the same length. If Ki is the length of the i-th shift register,
then we define the encoder memory order m by:

    m = max over 1 ≤ i ≤ k of Ki   …… (5.7)

(i.e., the maximum length of all k shift registers).

An example of a (4, 3, 2) convolutional encoder in which the shift register lengths are 0, 1 and 2 is
shown in Fig 5.5.

Fig 5.5 (4,3,2) Binary convolution code encoder

Since each information bit remains in the encoder for up to (m + 1) time units, and during each time
unit it can affect any of the n encoder outputs (depending on the shift register connections), it
follows that the maximum number of encoder outputs that can be affected by a single
information bit is:

    nA = n(m + 1)   …… (5.8)

nA is called the "constraint length" of the code. For example, the constraint lengths of the encoders
of Figures 5.3, 5.4 and 5.5 are 8, 6 and 12 respectively. Some authors (for example, Simon Haykin)
define the constraint length as the number of shifts over which a single message bit can
influence the encoder output: in an encoder with an m-stage shift register, the memory of the encoder
equals m message bits, and the constraint length is then (m + 1). However, we shall adopt the definition
given in Eq (5.8).

The number of shifts over which a single message bit can influence the encoder output is usually
denoted by K. The encoders of Figs 5.3, 5.4 and 5.5 have K = 4, 2 and 3 respectively. The
encoder of Fig 5.3 is accordingly labeled a 'rate 1/2, K = 4' convolutional encoder. The term
K also signifies the number of branch words in the encoder's impulse response.

Turning back, in the general case of an (n, k, m) code, the generator matrix can be put in the
form:
    G = | G1 G2 G3 …  Gm   Gm+1                         |
        |    G1 G2 …  Gm-1 Gm   Gm+1                    |   …… (5.9)
        |       G1 …  Gm-2 Gm-1 Gm   Gm+1               |
        |                  ⋱                             |

where each Gi is a (k × n) sub-matrix with entries:

    Gi = | g1,i(1)  g1,i(2)  …  g1,i(n) |
         | g2,i(1)  g2,i(2)  …  g2,i(n) |   …… (5.10)
         |    ⁝        ⁝           ⁝    |
         | gk,i(1)  gk,i(2)  …  gk,i(n) |

Notice that each set of k rows of G is identical to the previous set but shifted n places to
the right. For an information sequence u = (u1, u2, …), where ui = {ui(1), ui(2), …, ui(k)}, the code word is
v = (v1, v2, …), where vj = (vj(1), vj(2), …, vj(n)) and v = u G. Since the code word is a linear combination
of rows of the G matrix, it follows that an (n, k, m) convolutional code is a linear code.

Since the convolutional encoder generates n encoded bits for each k message bits, we define R
= k/n as the "code rate". However, an information sequence of finite length L is encoded into a code
word of length n(L + m), where the final nm outputs are generated after the last nonzero information
block has entered the encoder; that is, the information sequence is terminated with all-zero blocks in
order to clear the encoder memory. The terminating sequence of m zeros is called the "tail of the
message". Viewing the convolutional code as a linear block code with generator matrix G, the
block code rate is given by kL/n(L + m), the ratio of the number of message bits to the length of the
code word. If L >> m, then L/(L + m) ≈ 1, and the rate of the convolutional code and its rate
when viewed as a block code are effectively the same. In fact, this is the normal mode of operation for
convolutional codes, and accordingly we shall not distinguish between the two. On the contrary, if L
were small, the effective transmission rate is indeed kL/n(L + m), and it falls below the block code rate
by the fractional amount:

    [k/n − kL/n(L + m)] / (k/n) = m/(L + m)   …… (5.11)

called the "fractional rate loss". Therefore, in order to keep the fractional rate loss at a minimum
(near zero), L is always assumed to be much larger than m. For the information sequence of
Example 5.1, we have L = 5, m = 3, and the fractional rate loss is 3/8 = 37.5%. If L is made 1000, the
fractional rate loss is only 3/1003 ≈ 0.3%.

5.3 Encoding of Convolutional Codes; Transform Domain Approach:


In any linear system, we know that the time-domain operation involving the convolution integral
can be replaced by the more convenient transform-domain operation involving polynomial
multiplication. Since a convolutional encoder can be viewed as a linear time-invariant finite-state
machine, we may simplify computation of the adder outputs by applying an appropriate transformation.
As is done in cyclic codes, each sequence in the encoding equations can be replaced by a
corresponding polynomial, and the convolution operation replaced by polynomial multiplication. For
example, for a (2, 1, m) code, the encoding equations become:

    v(1)(X) = u(X) g(1)(X)   …… (5.12a)
    v(2)(X) = u(X) g(2)(X)   …… (5.12b)

where u(X) = u1 + u2 X + u3 X^2 + … is the information polynomial,

    v(1)(X) = v1(1) + v2(1) X + v3(1) X^2 + …  and
    v(2)(X) = v1(2) + v2(2) X + v3(2) X^2 + …

are the encoded polynomials, and

    g(1)(X) = g1(1) + g2(1) X + g3(1) X^2 + …  and
    g(2)(X) = g1(2) + g2(2) X + g3(2) X^2 + …

are the "generator polynomials" of the code; all operations are modulo-2. After multiplexing, the
code word becomes:

    v(X) = v(1)(X^2) + X v(2)(X^2)   …… (5.13)

The indeterminate 'X' can be regarded as a “unit-delay operator”, the power of X defining the
number of time units by which the associated bit is delayed with respect to the initial bit in the sequence.
Example 5.5:

For the (2, 1, 3) encoder of Fig 5.3, the impulse responses were g(1) = (1, 0, 1, 1) and g(2) = (1, 1, 1, 1).
The generator polynomials are:

    g(1)(X) = 1 + X^2 + X^3  and  g(2)(X) = 1 + X + X^2 + X^3

For the information sequence u = (1, 0, 1, 1, 1), the information polynomial is
u(X) = 1 + X^2 + X^3 + X^4, and the two code polynomials are:

    v(1)(X) = u(X) g(1)(X) = (1 + X^2 + X^3 + X^4)(1 + X^2 + X^3) = 1 + X^7
    v(2)(X) = u(X) g(2)(X) = (1 + X^2 + X^3 + X^4)(1 + X + X^2 + X^3) = 1 + X + X^3 + X^4 + X^5 + X^7

From the polynomials so obtained we can immediately write:

v(1) = ( 1 0 0 0 0 0 0 1), and v(2) = (1 1 0 1 1 1 0 1)


Pairing the components we then get the code word v = (11, 01, 00, 01, 01, 01, 00, 11).

We may use the multiplexing technique of Eq (5.13) and write:

    v(1)(X^2) = 1 + X^14;  v(2)(X^2) = 1 + X^2 + X^6 + X^8 + X^10 + X^14;
    X v(2)(X^2) = X + X^3 + X^7 + X^9 + X^11 + X^15

and the code polynomial is:

    v(X) = v(1)(X^2) + X v(2)(X^2) = 1 + X + X^3 + X^7 + X^9 + X^11 + X^14 + X^15

Hence the code word is: v = (1 1, 0 1, 0 0, 0 1, 0 1, 0 1, 0 0, 1 1); this is exactly the same as
obtained earlier.
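These polynomial products are quick to verify by machine. A small sketch with GF(2) polynomials held as integer bit masks (bit i = coefficient of X^i; the helper name is ours):

    def gf2_mul(a, b):
        # carry-less product of two GF(2) polynomials
        p = 0
        while b:
            if b & 1:
                p ^= a
            a <<= 1
            b >>= 1
        return p

    # Example 5.5: u(X) = 1+X^2+X^3+X^4, g(1)(X) = 1+X^2+X^3, g(2)(X) = 1+X+X^2+X^3
    u, g1, g2 = 0b11101, 0b1101, 0b1111
    print(bin(gf2_mul(u, g1)))   # 0b10000001 -> v(1)(X) = 1 + X^7
    print(bin(gf2_mul(u, g2)))   # 0b10111011 -> v(2)(X) = 1+X+X^3+X^4+X^5+X^7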

The generator polynomials of an encoder can be determined directly from its circuit diagram.
Specifically, the coefficient of X^l is a '1' if there is a connection from the l-th shift register stage to
the input of the adder of interest, and a '0' otherwise. Since the last stage of the shift register in an
(n, 1) code must be connected to at least one output, at least one generator polynomial
must have degree equal to the shift register length m, i.e.:

    m = max over 1 ≤ j ≤ n of [deg g(j)(X)]   …… (5.14)

In an (n, k) code with k > 1, there are n generator polynomials for each of the k inputs, each
set representing the connections from one of the shift registers to the n outputs. Hence the length Kl
of the l-th shift register is given by:

    Kl = max over 1 ≤ j ≤ n of [deg gl(j)(X)],   1 ≤ l ≤ k   …… (5.15)

where gl(j)(X) is the generator polynomial relating the l-th input to the j-th output, and the encoder
memory order m is:

    m = max over 1 ≤ l ≤ k of Kl = max over 1 ≤ l ≤ k, 1 ≤ j ≤ n of [deg gl(j)(X)]   …… (5.16)

Since the encoder is a linear system, with u(l)(X) representing the l-th input sequence and
v(j)(X) the j-th output sequence, the generator polynomial gl(j)(X) can be regarded as the
'encoder transfer function' relating input l to output j. For the k-input, n-output linear system
there are a total of kn transfer functions, which can be arranged in a (k × n) "transfer function
matrix":

    G(X) = | g1(1)(X)  g1(2)(X)  …  g1(n)(X) |
           | g2(1)(X)  g2(2)(X)  …  g2(n)(X) |   …… (5.17)
           |    ⁝         ⁝            ⁝     |
           | gk(1)(X)  gk(2)(X)  …  gk(n)(X) |


Using the transfer function matrix, the encoding equations for an (n, k, m) code can be expressed as:

    V(X) = U(X) G(X)   …… (5.18)

where U(X) = [u(1)(X), u(2)(X), …, u(k)(X)] is the k-vector representing the information polynomials and
V(X) = [v(1)(X), v(2)(X), …, v(n)(X)] is the n-vector representing the encoded sequences. After
multiplexing, the code word becomes:

    v(X) = v(1)(X^n) + X v(2)(X^n) + X^2 v(3)(X^n) + … + X^(n-1) v(n)(X^n)   …… (5.19)

Example 5.6:

For the encoder of Fig 5.4, we have:


    g1(1)(X) = 1 + X,   g2(1)(X) = X
    g1(2)(X) = 1,       g2(2)(X) = 1 + X
    g1(3)(X) = 1,       g2(3)(X) = 0

1  X 1 1
 G( X )  
 X 1  X 0
For the information sequences u(1) = (1 0 1) and u(2) = (1 1 0), the information polynomials are
u(1)(X) = 1 + X^2 and u(2)(X) = 1 + X.

Then:

    V(X) = [v(1)(X), v(2)(X), v(3)(X)] = [1 + X^2, 1 + X] G(X) = [1 + X^3, 0, 1 + X^2]
Hence the code word is:

    v(X) = v(1)(X^3) + X v(2)(X^3) + X^2 v(3)(X^3)
         = (1 + X^9) + X·0 + X^2 (1 + X^6)
         = 1 + X^2 + X^8 + X^9

∴ v = (1 0 1, 0 0 0, 0 0 1, 1 0 0).

This is exactly the same as that obtained in Example 5.3. From Eqs (5.17) and (5.18) it follows that:

    v(j)(X) = Σ (i = 1 to k) u(i)(X) gi(j)(X)

and using Eq (5.19) we have:

    v(X) = Σ (j = 1 to n) X^(j-1) v(j)(X^n)
         = Σ (j = 1 to n) X^(j-1) Σ (i = 1 to k) u(i)(X^n) gi(j)(X^n)

    ∴ v(X) = Σ (i = 1 to k) u(i)(X^n) gi(X)   …… (5.20)

where

    gi(X) = Σ (j = 1 to n) X^(j-1) gi(j)(X^n)
          = gi(1)(X^n) + X gi(2)(X^n) + X^2 gi(3)(X^n) + … + X^(n-1) gi(n)(X^n)   …… (5.21)

is called the "composite generator polynomial" relating the i-th input sequence to v(X).

Example 5.7:

From Example 5.6, we have:

    g1(X) = g1(1)(X^3) + X g1(2)(X^3) + X^2 g1(3)(X^3) = 1 + X + X^2 + X^3
    g2(X) = g2(1)(X^3) + X g2(2)(X^3) + X^2 g2(3)(X^3) = X + X^3 + X^4

For the input sequences u(1)(X) = 1 + X^2 and u(2)(X) = 1 + X, we have:

    v(X) = u(1)(X^3) g1(X) + u(2)(X^3) g2(X) = 1 + X^2 + X^8 + X^9

This is exactly the same as obtained before.

5.4 Systematic Convolutional Codes:


In a systematic code, the first k output sequences are exact replicas of the k input sequences, i.e.:

    v(i) = u(i),   i = 1, 2, …, k   …… (5.22)

and the generator sequences satisfy:

    gi(j) = δi,j,   i = 1, 2, …, k   …… (5.23)

where δi,j is the Kronecker delta, having values δi,j = 1 if j = i and δi,j = 0 if j ≠ i.
The generator matrix for such codes is given by

    G = | I P1   O P2   O P3   …  O Pm+1                        |
        |        I P1   O P2   …  O Pm    O Pm+1                |   …… (5.24)
        |               I P1   …  O Pm-1  O Pm    O Pm+1        |
        |                      ⋱                                 |

where I is the k × k identity (unit) matrix, O is the k × k all-zero matrix, and Pi is a k × (n − k) matrix
given by:

    Pi = | g1,i(k+1)  g1,i(k+2)  …  g1,i(n) |
         | g2,i(k+1)  g2,i(k+2)  …  g2,i(n) |   …… (5.25)
         |    ⁝          ⁝             ⁝    |
         | gk,i(k+1)  gk,i(k+2)  …  gk,i(n) |

Further, the transfer function matrix of the code is given by:

    G(X) = | 1 0 0 … 0   g1(k+1)(X)  …  g1(n)(X) |
           | 0 1 0 … 0   g2(k+1)(X)  …  g2(n)(X) |
           | 0 0 1 … 0   g3(k+1)(X)  …  g3(n)(X) |   …… (5.26)
           | ⁝           ⁝                ⁝      |
           | 0 0 0 … 1   gk(k+1)(X)  …  gk(n)(X) |

The first k output sequences are the input (information) sequences, and the last (n − k) sequences
are the parity sequences. The number of generator sequences required to specify a general (n, k, m)
code is kn; a systematic code requires only k(n − k) of them. Thus systematic codes form a subclass of
the set of all possible convolutional codes. Any code not satisfying Eq (5.22) to Eq (5.26) is said to
be "non-systematic".

Example 5.8:

Consider a (2, 1, 3) systematic code whose encoder is shown in Fig 5.6.


Fig 5.6 (2,1,3) systematic encoder

The generator sequences are g(1) = (1 0 0 0) and g(2) = (1 1 0 1), and the generator matrix is:

    G = | 11 01 00 01                |
        |    11 01 00 01             |
        |       11 01 00 01          |
        |            ⋱               |

The transfer function matrix is:

    G(X) = [1, 1 + X + X^3]

For an input sequence u(X) = 1 + X + X^3, the information sequence is:

    v(1)(X) = u(X) g(1)(X) = 1 + X + X^3

and the parity sequence is:

    v(2)(X) = u(X) g(2)(X) = (1 + X + X^3)(1 + X + X^3) = 1 + X^2 + X^6

One advantage of systematic codes is that encoding is much simpler than for non-systematic
codes, because less hardware is required. For example, the (2, 1, 3) encoder of Fig 5.6 needs only one
modulo-2 adder, while that of Fig 5.5 requires three such adders (notice also the total number of adder
inputs required). Further, for systematic (n, k, m) codes with K > n − k, encoding schemes that
require fewer than K shift register stages exist, as illustrated in the following simple example.

Example 5.9:

Consider a systematic (3, 2, 2) code with the transfer function matrix

1 0 1  X  X 2 
G( X )   
0 1 1  X 2


The straightforward realization requires a total of K = K1 + K2 = 2 + 2 = 4 shift register stages and is
shown in Fig 5.7(a). However, since the parity sequence is generated by v(3)(X)
= u(1)(X) g1(3)(X) + u(2)(X) g2(3)(X), an alternative realization, shown in Fig 5.7(b), can be obtained.
Fig 5.7 Realization of encoder for example 5.9

In the majority of situations, the straightforward realization is the most efficient. However, in the
case of systematic codes, simpler realizations usually exist, as shown in Example 5.9.

Another advantage of systematic codes is that no inverting circuit is needed for recovering
information from the code word. For information recovery from a non-systematic code, inversion is
required in the form of an (n × k) matrix G⁻¹(X) such that:

    G(X) G⁻¹(X) = Ik X^l   …… (5.27)

for some l ≥ 0, where Ik is the (k × k) unit matrix. It then follows that:

    V(X) G⁻¹(X) = U(X) G(X) G⁻¹(X) = U(X) X^l   …… (5.28)

and the information sequence can be recovered with an l-time-unit delay from the code word by
letting V(X) be the input to the n-input, k-output linear sequential circuit whose transfer function
matrix is G⁻¹(X).

For an (n, 1, m) code, the transfer function matrix G(X) has a "feed-forward" inverse
G⁻¹(X) of delay l units if and only if:

    G.C.D [g(1)(X), g(2)(X), …, g(n)(X)] = X^l   …… (5.29)

for some l ≥ 0, where G.C.D denotes the greatest common divisor. For an (n, k, m) code with k > 1,
let Δi(X), i = 1, 2, …, (n choose k), be the determinants of the (n choose k) distinct (k × k) sub-matrices
of the transfer function matrix G(X). Then a feed-forward inverse of delay l units exists if and only if:

    G.C.D [Δi(X), i = 1, 2, …, (n choose k)] = X^l   …… (5.30)

for some l ≥ 0.
Example 5.10:

For the (2, 1, 3) encoder of Fig 5.3, we have, from Example 5.5, the generator matrix:

    G(X) = [1 + X^2 + X^3,  1 + X + X^2 + X^3]

Its inverse can be computed as:

    G⁻¹(X) = | 1 + X + X^2 |
             |   X + X^2   |

and the implementation of the inverse is shown in Fig 5.8.

Example 5.11:

For the (3, 2, 1) encoder of Fig 5.4, the generator matrix found in Example 5.6 is:

    G(X) = | 1+X   1    1 |
           |  X   1+X   0 |

The determinants of the (2 × 2) sub-matrices are 1 + X + X^2, X and 1 + X. Their GCD is 1,
so a feed-forward inverse with no delay exists and can be computed as:

    G⁻¹(X) = | 1+X      X     |
             |  X      1+X    |
             | X+X^2   1+X^2  |

so that G(X) G⁻¹(X) = I. The implementation of this inverse is shown in Fig 5.9.

Fig 5.8 Feed-forward inverse for the (2,1,3) code    Fig 5.9 Feed-forward inverse for the (3,2,1) code

To understand what happens when a feed-forward inverse does not exist, consider the example
of a (2, 1, 2) encoder with generator matrix:

    G(X) = [1 + X, 1 + X^2]

Since the GCD of g(1)(X) and g(2)(X) is (1 + X), which is not of the form X^l, a feed-forward inverse
does not exist. Suppose the input sequence is:

    u(X) = 1/(1 + X) = 1 + X + X^2 + X^3 + …

Then the output sequences are v(1)(X) = u(X)(1 + X) = 1 and v(2)(X) = u(X)(1 + X^2) = 1 + X.
That is, the code word contains only three nonzero bits, even though the information sequence has
infinite weight. If this code word is transmitted over a BSC and the three nonzero bits are changed to
zeros by the channel noise, the received sequence will be all zeros. A maximum likelihood decoder
(MLD) will then produce the all-zero code word as its estimate, since this is a valid code word and it
agrees exactly with the received sequence. Thus the estimated information sequence will be û(X) = 0,
implying an infinite number of decoding errors caused by a finite number (only three in this
case) of channel errors. Clearly this is a very undesirable situation; the code is said to be
subject to "catastrophic error propagation" and is called a "catastrophic code".
Equations (5.29) and (5.30) can be shown to be necessary and sufficient conditions for a code
to be non-catastrophic. Hence any code for which a feed-forward inverse exists is non-catastrophic.
Yet another advantage of systematic codes is that they are always non-catastrophic.
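The catastrophic-code test of Eq (5.29) reduces to a polynomial GCD over GF(2), which Euclid's algorithm computes directly. A sketch (our helper; polynomials as integer bit masks, bit i = coefficient of X^i):

    def gf2_polygcd(a, b):
        # Euclid's algorithm for GF(2) polynomials held as integer bit masks
        while b:
            while a.bit_length() >= b.bit_length():   # a := a mod b
                a ^= b << (a.bit_length() - b.bit_length())
            a, b = b, a
        return a

    # g(1) = 1+X, g(2) = 1+X^2: GCD = 1+X, not of the form X^l -> catastrophic
    print(bin(gf2_polygcd(0b11, 0b101)))      # -> 0b11
    # Fig 5.3 code: g(1) = 1+X^2+X^3, g(2) = 1+X+X^2+X^3: GCD = 1 -> non-catastrophic
    print(bin(gf2_polygcd(0b1101, 0b1111)))   # -> 0b1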

5.5 STATE DIAGRAMS:


The state of an encoder is defined as its shift register contents. For an (n, k, m) code with k > 1,
k
i-th shift register contains „Ki’ previous information bits. Defining K   Ki as the total encoder -
i 1

memory (m - represents the memory order which we have defined as the maximum length of any shift
register), the encoder state at time unit T', when the encoder inputs are, {u l (1), u l (2)…u l (k)}, arethe
binary k-tuple of inputs:

{u l-1 (1) u l-2 (1), u l-3 (1)… u l-k (1); u l-1 (2), u l-2(2, u l-3 (2)… u l-k (2); … ; u l-1 (k) u l-2 (k), u l-3 (k)… u l-k (k)},

and there are a total of 2k different possible states. For a (n, 1, m) code, K = K1 = m and the encoder
state at time unit l is simply {ul-1, ul-2 … ul-m}.
Each new block of k inputs causes a transition to a new state. Hence there are 2^k branches
leaving each state, one corresponding to each input block. For an (n, 1, m) code there are only two
branches leaving each state. On the state diagram, each branch is labeled with the k inputs causing the
transition and the n corresponding outputs. The state diagram for the convolutional encoder of Fig 5.3
is shown in Fig 5.10. A state table is often helpful while drawing the state diagram, and is as shown
below.

State table for the (2, 1, 3) encoder of Fig 5.3

State:               S0   S1   S2   S3   S4   S5   S6   S7
Binary description: 000  100  010  110  001  101  011  111
Fig 5.10 State diagram of encoder of Fig 5.3
Recall (or observe from Fig 5.3) that the two output sequences are:

    v_l(1) = u_l + u_{l-2} + u_{l-3}
    v_l(2) = u_l + u_{l-1} + u_{l-2} + u_{l-3}

Until the reader gains some experience, it is advisable to first prepare a transition table using the
output equations and then translate the data onto the state diagram. Such a table is shown below:

State transition table for the encoder of Fig 5.3

Previous state (binary) | Input | Next state (binary) | ul ul-1 ul-2 ul-3 | Output
       S0 (000)         |   0   |      S0 (000)       |    0 0 0 0        |  0 0
                        |   1   |      S1 (100)       |    1 0 0 0        |  1 1
       S1 (100)         |   0   |      S2 (010)       |    0 1 0 0        |  0 1
                        |   1   |      S3 (110)       |    1 1 0 0        |  1 0
       S2 (010)         |   0   |      S4 (001)       |    0 0 1 0        |  1 1
                        |   1   |      S5 (101)       |    1 0 1 0        |  0 0
       S3 (110)         |   0   |      S6 (011)       |    0 1 1 0        |  1 0
                        |   1   |      S7 (111)       |    1 1 1 0        |  0 1
       S4 (001)         |   0   |      S0 (000)       |    0 0 0 1        |  1 1
                        |   1   |      S1 (100)       |    1 0 0 1        |  0 0
       S5 (101)         |   0   |      S2 (010)       |    0 1 0 1        |  1 0
                        |   1   |      S3 (110)       |    1 1 0 1        |  0 1
       S6 (011)         |   0   |      S4 (001)       |    0 0 1 1        |  0 0
                        |   1   |      S5 (101)       |    1 0 1 1        |  1 1
       S7 (111)         |   0   |      S6 (011)       |    0 1 1 1        |  0 1
                        |   1   |      S7 (111)       |    1 1 1 1        |  1 0
For example, if the shift registers were in state S5, whose binary description is 101, an input
'1' causes a transition to the new state S3, whose binary description is 110, while producing the output
(0 1). Observe that the input causing the transition is shown first, followed by the corresponding output
sequence shown within parentheses.
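The full transition table can be generated mechanically from the output equations. A one-row check for the S5 example just described (the helper function is our own illustration):

    def next_state_and_output(state, u):
        # state transition for the (2, 1, 3) encoder of Fig 5.3;
        # state = (u_{l-1}, u_{l-2}, u_{l-3})
        u1, u2, u3 = state
        v1 = u ^ u2 ^ u3          # v(1) = u_l + u_{l-2} + u_{l-3}
        v2 = u ^ u1 ^ u2 ^ u3     # v(2) = u_l + u_{l-1} + u_{l-2} + u_{l-3}
        return (u, u1, u2), (v1, v2)

    # state S5 = (1,0,1), input 1 -> next state S3 = (1,1,0), output (0,1)
    print(next_state_and_output((1, 0, 1), 1))

Looping this function over all eight states and both inputs reproduces the table above.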

Assuming that the shift registers are initially in state S0 (the all-zero state), the code word
corresponding to any information sequence can be obtained by following the path through the state
diagram determined by the information sequence and noting the corresponding outputs on the branch
labels. Following the last nonzero block, the encoder is returned to state S0 by a sequence of m all-zero
blocks appended to the information sequence. For example, in Fig 5.10, if u = (11101), the code
word is v = (11, 10, 01, 01, 11, 10, 11, 11); the path followed is shown in thin gray lines with arrows,
with the input bits written alongside in thin gray. The m = 3 appended zeros are indicated in a gray much
lighter than that of the information bits.

Apart from obtaining the output sequence for a given input sequence, the state diagram can be
modified to provide a complete description of the Hamming weights of all nonzero code words (that
is, the state diagram is useful in determining the weight distribution of the code).

This is achieved as follows. The state S0 is split into an initial and a final state, and the self-loop
around S0 is discarded. Each branch is labeled with a 'branch gain' X^i, where i is the weight
(number of ones) of the n encoded bits on that branch. Each path connecting the initial state to the
final state, which diverges from and remerges with state S0 exactly once, represents a nonzero code
word. Those code words that diverge from and remerge with S0 more than once can be regarded as
sequences of shorter code words. The "path gain" is the product of the branch gains along a path, and
the weight of the associated code word is the power of X in the path gain. As an illustration, consider
the modified state diagram for the (2, 1, 3) code of Fig 5.3, shown in Fig 5.11, and another version of
the same diagram shown in Fig 5.12.

Fig 5.11 Modified state diagram for the (2,1,3) code

Example 5.12: A (2, 1, 2) Convolutional Encoder:

Consider the encoder shown in Fig 5.15. We shall use this example for discussing further graphical
representations, viz. trees and trellises.
Fig 5.15 (2,1,2) convolution encoder

For this encoder we have: v_l(1) = u_l + u_{l-1} + u_{l-2} and v_l(2) = u_l + u_{l-2}.


The state transition table is as follows.

State transition table for the (2, 1, 2) convolutional encoder of Example 5.12

Previous state (binary) | Input | Next state (binary) | ul ul-1 ul-2 | Output
       S0 (00)          |   0   |      S0 (00)        |    0 0 0     |  0 0
                        |   1   |      S1 (10)        |    1 0 0     |  1 1
       S1 (10)          |   0   |      S2 (01)        |    0 1 0     |  1 0
                        |   1   |      S3 (11)        |    1 1 0     |  0 1
       S2 (01)          |   0   |      S0 (00)        |    0 0 1     |  1 1
                        |   1   |      S1 (10)        |    1 0 1     |  0 0
       S3 (11)          |   0   |      S2 (01)        |    0 1 1     |  0 1
                        |   1   |      S3 (11)        |    1 1 1     |  1 0

The state diagram and the augmented state diagram for computing the 'complete path
enumerator function' of this encoder are shown in Fig 5.16.

Fig 5.16 State diagram for the (2,1,2) encoder


There are three loops in the augmented state diagram:

    S3 → S3: l1 = DLI;   S1 → S2 → S1: l2 = D L^2 I;   S1 → S3 → S2 → S1: l3 = D^2 L^3 I^2

The loops l1 and l2 are non-touching, and their gain product is l1 l2 = D^2 L^3 I^2.

    Δ = 1 − (l1 + l2 + l3) + l1 l2 = 1 − DLI(1 + L)

There are two forward paths:

    F1: S0 → S1 → S2 → S0, path gain D^5 L^3 I
    F2: S0 → S1 → S3 → S2 → S0, path gain D^6 L^4 I^2

The loop l1 does not touch the forward path F1, so Δ1 = 1 − l1 = 1 − DLI. All three loops touch the
forward path F2, so Δ2 = 1.

Now use Mason's gain formula to get:

    T(D, L, I) = [D^5 L^3 I (1 − DLI) + D^6 L^4 I^2] / [1 − DLI(1 + L)]
               = D^5 L^3 I / [1 − DLI(1 + L)]
               = D^5 L^3 I + D^6 L^4 I^2 (1 + L) + D^7 L^5 I^3 (1 + L)^2 + …

Thus there is one code word of weight 5 that has a length of 3 branches and an information sequence
of weight 1; there are two code words of weight 6, of which one has a length of 4 branches and an
information sequence of weight 2 while the other has a length of 5 branches and an information
sequence of weight 2; and so on.

TREE AND TRELLIS DIAGRAMS:


Let us now consider other graphical means of portraying convolutional codes. The state
diagram can be redrawn as a 'tree graph'. The convention followed is: if the input is a '0', the
upper path is followed; if the input is a '1', the lower path is followed. A vertical line is called
a 'node' and a horizontal line is called a 'branch'. The output code words for each input bit are shown
on the branches. The encoder output for any information sequence can be traced through the tree paths.
The tree graph for the (2, 1, 2) encoder of Fig 5.15 is shown in Fig 5.18. The state transition table can
be conveniently used in constructing the tree graph.
Fig 5.18: The tree graph for the (2, 1, 2) encoder of Fig 5.15

Following the procedure just described, we find that the encoded sequence for an information
sequence (10011) is (11, 10, 11, 11, 01), which agrees with the first 5 pairs of bits of the actual encoded
sequence. Since the encoder has a memory m = 2, we require two more bits to clear and reset the encoder.
Hence, to obtain the complete code sequence corresponding to an information sequence of length kL,
the tree graph is to be extended by m time units. This extended part is called the
"tail of the tree", and the 2^kL rightmost nodes are called the "terminal nodes" of the tree. Thus
the extended tree diagram for the (2, 1, 2) encoder, for the information sequence (10011), is as in Fig
5.19, and the complete encoded sequence is (11, 10, 11, 11, 01, 01, 11).

Fig 5.19 Illustration of the “Tail of the tree”


At this juncture, a very important clue for drawing tree diagrams neatly and
correctly, without wasting time, appears pertinent. As the length L of the input sequence increases, the
number of rightmost nodes increases as 2^L. Hence, for a specified sequence length L, compute 2^L. Mark
2^L equally spaced points at the rightmost portion of your page, leaving space to complete the m tail
branches. Join two points at a time to obtain 2^(L-1) nodes. Repeat the procedure until you get only one
node at the leftmost portion of your page. The procedure is illustrated diagrammatically in Fig
5.20 for L = 3. Once you get the tree structure, you can fill in the needed information either by looking
back at the state transition table or by working it out logically.

Fig 5.20 Procedure for drawing neat tree diagram


From Fig 5.18, observe that the tree becomes "repetitive" after the first three branches.
Beyond the third branch, the nodes labeled S0 are identical, and so are all the other pairs of nodes that
are identically labeled. Since the encoder has a memory m = 2, it follows that when the third
information bit enters the encoder, the first message bit is shifted out of the register. Consequently,
after the third branch the information sequences (000u3u4…) and (100u3u4…) generate the same code
symbols, and the pair of nodes labeled S0 may be joined together. The same logic holds for the other
nodes.

Accordingly, we may collapse the tree graph of Fig 5.18 into the new form of Fig 5.21, called a
"trellis". It is so called because a trellis is a tree-like structure with remerging branches (you will have
seen trusses and trellises used in building construction).

Fig 5.21 Trellis diagram for encoder of fig 5.15


The trellis diagram contains (L + m + 1) time units or levels (depths), labeled from 0 to (L + m)
(0 to 7 for the case L = 5 for the encoder of Fig 5.15, as shown in Fig 5.21).
The following observations can be made from the trellis diagram:

1. There are no fundamental paths at distance 1, 2 or 3 from the all-zero path.

2. There is a single fundamental path at distance 5 from the all-zero path. It diverges from the all-zero
path three branches back, and it differs from the all-zero path in a single input bit.

3. There are two fundamental paths at distance 6 from the all-zero path. One diverges from the
all-zero path four branches back and the other five branches back. Both differ from the all-zero
path in two input bits. These observations are depicted in Fig 5.24(a).

4. There are four fundamental paths at distance 7 from the all-zero path. One diverges from the
all-zero path five branches back, two others six branches back, and the fourth seven
branches back, as shown in Fig 5.24(b). They all differ from the all-zero path in three input bits. This
information can be compared with that obtained from the complete path enumerator function found
earlier.

THE VITERBI ALGORITHM:

The Viterbi algorithm, when applied to the received sequence r from a DMC, finds the path
through the trellis with the largest metric. At each step, it compares the metrics of all paths entering
each state and stores the path with the largest metric, called the "survivor", together with its metric.

The Algorithm:

Step: 1. Starting at level (i.e. time unit) j = m, compute the partial metric for the single path
entering each node (state). Store the path (the survivor) and its metric for each state.
Step: 2. Increment the level j by 1. Compute the partial metric for all the paths entering a state
by adding the branch metric entering that state to the metric of the connecting survivor
at the preceding time unit. For each state, store the path with the largest metric (the
survivor), together with its metric and eliminate all other paths.
Step: 3. If j < (L + m), repeat Step 2. Otherwise stop.
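
A minimal Python sketch of these three steps for the (2, 1, 2) encoder of Fig 5.15 follows, assuming
the generator sequences g(1) = (111) and g(2) = (101) (this assumption reproduces the encoded
sequences worked out earlier). The branch metric is passed in as a function, so the same routine
serves both the DMC metrics of Example 5.13 and the Hamming-distance metric used later for a BSC;
ties are resolved arbitrarily by keeping the first path found.

    # Sketch of the Viterbi algorithm for a (2, 1, 2) encoder; the state is
    # the last two input bits, and g(1) = (111), g(2) = (101) are assumed.
    def next_state_and_output(state, u):
        s1, s2 = state                       # s1 = most recent input bit
        v1 = u ^ s1 ^ s2                     # g(1) = (1 1 1)
        v2 = u ^ s2                          # g(2) = (1 0 1)
        return (u, s1), (v1, v2)

    def viterbi(received, branch_metric, L, m=2):
        survivors = {(0, 0): (0, [])}        # state -> (path metric, input bits)
        for j, r in enumerate(received):
            tail = j >= L                    # the last m branches carry '0' inputs
            best = {}
            for state, (pm, bits) in survivors.items():
                for u in ((0,) if tail else (0, 1)):
                    ns, v = next_state_and_output(state, u)
                    cand = (pm + branch_metric(r, v), bits + [u])
                    if ns not in best or cand[0] > best[ns][0]:
                        best[ns] = cand      # keep the survivor (largest metric)
            survivors = best
        _, bits = survivors[(0, 0)]          # only the all-zero state remains
        return bits[:L]                      # drop the m tail inputs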

Notice that, although we could use the tree graph for the above decoding, in the trellis the number of
nodes at any level does not continue to grow as the number of incoming message bits increases;
instead, it remains constant at 2^m.

There are 2^(km) survivors from time unit m up to time unit L, one for each of the 2^(km) states. After
L time units there are fewer survivors, since there are fewer states while the encoder is returning to the
all-zero state. Finally, at time unit (L + m) there is only one state, the all-zero state, and hence only one
survivor, and the algorithm terminates.

Fig 5.22 Survivor after time unit 'j'


Suppose that the maximum likelihood path is eliminated by the algorithm at time unit j, as
shown in Fig 5.22. This implies that the partial path metric of the survivor exceeds that of the maximum
likelihood path at this point. Now, if the remaining portion of the maximum likelihood path is
appended onto the survivor at time unit j, then the total metric of this path will exceed the total metric
of the maximum likelihood path. But this contradicts the definition of the maximum likelihood path
as the path with the largest metric. Hence the maximum likelihood path cannot be eliminated by the
algorithm; it must be the final survivor, and it follows that

M(r | v̂) ≥ M(r | v), for all v ≠ v̂.

Thus it is clear that the Viterbi algorithm is optimum in the sense that it always finds the maximum
likelihood path through the trellis. From an implementation point of view, however, it would be very
inconvenient to deal with fractional numbers. Accordingly, the bit metric M(ri | vi) = ln P(ri | vi) can be
replaced by C2 [ln P(ri | vi) + C1], where C1 is any real number and C2 is any positive real number, so
that the metric can be expressed as an integer. Notice that a path v which maximizes

M(r | v) = ∑ (i = 1 to N) M(ri | vi) = ∑ (i = 1 to N) ln P(ri | vi)

also maximizes

∑ (i = 1 to N) C2 [ln P(ri | vi) + C1].

Therefore, it is clear that the modified metrics can be used without affecting the performance of the
Viterbi algorithm. Observe that we can always choose C1 to make the smallest metric zero, and C2 can
then be chosen so that all other metrics can be approximated by the nearest integers. Accordingly, there
can be many sets of integer metrics for a given DMC, depending on the choice of C2. The
performance of the Viterbi algorithm now becomes slightly sub-optimal due to the use of modified
metrics approximated by the nearest integers. However, the degradation in performance is typically very
low.

Example 5.13:

As an illustration, let us consider the binary-input, quaternary-output DMC shown in Fig 5.23(a).
The bit metrics ln P(ri | vi) are shown in Fig 5.23(b). Choosing C1 = 2.3 and C2 = 7.195 yields the
"integer metric table" shown in Fig 5.23(c).

Fig 5.23 Diagram for Example 5.13

Now suppose that a code word from the (2, 1, 2) encoder of Fig 5.15, whose trellis diagram is
shown in Fig 5.21, is transmitted over the DMC of Fig 5.23(a), and the quaternary received sequence is:

r = {y3 y4, y3 y1, y3 y2, y3 y4, y3 y4, y2 y4, y1 y3}

Let us apply Viterbi algorithm to determine the transmitted sequence.

In the first time unit (j = 1) there are two branches originating from state S0, with output
vectors (00) terminating at S0 and (11) terminating at S1. The received pair in this time unit is
(y3 y4), and using the integer metric table of Fig 5.23(c) we have:

M [r1|v1(1)] = M (y3 y4|00) = M (y3|0) + M (y4|0) = 5 + 0 = 5, and

M [r1|v1(2)] = M (y3 y4|11) = M (y3|1) + M (y4|1) = 8 + 10 = 18

These computations are indicated in Fig 5.24(a). The discarded path is shown by a cross. Note that
the branch metrics are also indicated along the branches within brackets, and the state metrics are
indicated at the nodes.

For j = 2 there are single branches entering each state and the received sequence in this time
unit is (y3 y1). The four branch metrics are computed as below.

M1 = M (y3 y1|00) = M (y3|0) + M (y1|0) =5 + 10 =15

M2 = M (y3 y1|11) = M (y3|1) + M (y1|1) =8 + 0 =8

M3= M (y3 y1|10) = M (y3|1) + M (y1|0) =8 + 10 =18

M4 = M (y3 y1|01) = M (y3|0) + M (y1|1) =5 + 0 =5

The metrics at the four states are obtained by adding the branch metrics to the metrics of the previous
states (survivors), and are shown in Fig 5.24(b).

Fig 5.24 Computation for time units j=1, j=2 and j=3

Next, for j = 3, notice that there are two branches entering each state, as shown in Fig 5.24(c).
The received pair in this time unit is (y3 y2), and the branch metrics are computed as below:

M1 = M (y3 y2|00) = M (y3|0) + M (y2|0) =5 + 8 =13

M2 = M (y3 y2|11) = M (y3|1) + M (y2|1) =8 + 5 =13

M3 = M (y3 y2|10) = M (y3|1) + M (y2|0) =8 + 8 =16

M4 = M (y3 y2|01) = M (y3|0) + M (y2|1) =5 + 5 =10

Following the above steps, we arrive at the following diagram.

Fig 5.25 Application of Viterbi algorithm
Notice that, in the last step, we have ignored the highest metric computed! Indeed, if the
sequence had continued, we would have to take this into account. However, in the last m time units,
remember that the path must remerge with S0.

From the path that has survived, we observe that the transmitted sequence is:

vˆ = (11, 10, 11, 11, 01, 01, 11)

and the information sequence at the encoder input is: û = (1 0 0 1 1)

Notice that "the final m branches in any trellis path always correspond to '0' inputs and
hence are not considered part of the information sequence".
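
Reusing the viterbi routine and the METRIC table from the sketches above, the whole of this
example can be reproduced in a few lines:

    def soft_metric(pair, v):   # pair = received symbols, v = branch output bits
        return METRIC[v[0]][pair[0]] + METRIC[v[1]][pair[1]]

    r = [("y3", "y4"), ("y3", "y1"), ("y3", "y2"), ("y3", "y4"),
         ("y3", "y4"), ("y2", "y4"), ("y1", "y3")]
    print(viterbi(r, soft_metric, L=5))      # -> [1, 0, 0, 1, 1], i.e. u = 10011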

As already mentioned, the MLD reduces to a 'minimum distance decoder' for a BSC. Hence the
distances can be reckoned as metrics, and the algorithm must now find the path through the trellis
with the smallest metric (i.e. the path closest to r in Hamming distance). The details of the algorithm
are exactly the same, except that the Hamming distance replaces the log-likelihood function as the
metric, and the survivor at each state is the path with the smallest metric. The following example
illustrates the concept.

Example 5.14:

Suppose the word r = (01, 10, 10, 11, 01, 01, 11) is received through a BSC from the encoder
of Fig 5.15. The path traced is shown in Fig 5.26 as dark lines.

Fig 5.26 Viterbi algorithm for a BSC

The estimate of the transmitted code word is

vˆ = (11, 10, 11, 11, 01, 01, 11)

and the corresponding information sequence is: û = (1 0 0 1 1)


Notice that the distances of the code words of each branch with respect to the corresponding
received words are indicated in brackets. Note also that at some states neither path is crossed out,
indicating a tie in the metric values of the two paths entering that state. If the final survivor goes through
any of these states, there is more than one maximum likelihood path (i.e. there may be more than one
path whose distance from r is a minimum). From an implementation point of view, whenever a tie in
metric values occurs, one path is arbitrarily selected as the survivor, because of the impracticability of
storing a variable number of paths. However, this arbitrary resolution of ties has no effect on the
decoding error probability. Finally, the Viterbi algorithm cannot give fruitful results when more errors
occur in the transmitted code word than are permitted by the dfree of the code. For the example
illustrated, the reader can verify that the algorithm fails if there are three errors. Discussion and details
about performance bounds, convolutional code construction, implementation of the Viterbi algorithm,
etc., are beyond the scope of this book.
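
For the BSC case of Example 5.14, the viterbi sketch given earlier can be reused with the negated
Hamming distance as the branch metric, since maximizing −d(r, v) is the same as minimizing d(r, v);
for the received word above it recovers the same information sequence:

    def bsc_metric(pair, v):                 # negated Hamming distance
        return -sum(a != b for a, b in zip(pair, v))

    r = [(0, 1), (1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (1, 1)]
    print(viterbi(r, bsc_metric, L=5))       # -> [1, 0, 0, 1, 1]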

Module – 3
1) Consider a (3,1,2) convolutional encoder with g(1) = (110), g(2) = (101) and g(3) = (111)
(i) Draw the encoder block diagram
(ii) Draw state diagram
(iii) Find the encoder output by traversing through the state diagram for the input sequence of
(11101)
(iv) Obtain the output of the convolutional encoder and draw trellis for the message sequence
(10101) where g(1) = (111), g(2) = (110) and g(3) = (101)
2) Examine encoding and decoding process of turbo codes for an input sequence 101110111011
3) Apply the Viterbi algorithm to decode the received sequence 110111011101 for a given
convolutional code with generator polynomials g1 = (110) and g2 = (101).
4) Examine the encoding and decoding process of turbo codes for the input sequence
1101011011011.
5) (i) Draw the block diagram for the turbo encoder.
(ii) Illustrate the interleaving process and show how it affects the encoded sequence.
(iii) Simulate the decoding process using iterative decoding and compute the
likelihood for the sequence.
6) Apply the Viterbi algorithm to decode the received sequence 111101110111111 for a
convolutional code with generator polynomials:
g(1) = (111) and g(2) = (110).
7) (i) Draw the trellis diagram for the convolutional code.
(ii) Traverse the trellis to find the most likely transmitted message sequence.
(iii) Show how the path metrics are calculated at each step.

8) Consider a (3, 1, 2) convolutional code with g(1) = (011), g(2) = (110), g(3) = (101).
(a) Draw the encoder block diagram. (b) Find the generator matrix. (c) Find the code
vector corresponding to the information sequence d = 10001.
9) With the help of an example, illustrate the concept of polynomial description of a convolutional
encoder. Also, illustrate the matrix description of the convolutional encoder.

10) Consider the convolutional code with the generator polynomial matrix:
g(D) = [1  1+D+D^2]
Draw the trellis diagram corresponding to the code. For the received sequence 1000001,
perform Viterbi decoding and obtain the corresponding decoded bits.

11) Consider the convolution encoder shown in figure.


(i) Write the impulse response and its polynomial. (ii) Find the output corresponding to the input
message (10111) using the time domain approach.
12) A transmitter transmits the message sequence 100000 as a code word over a transmission
medium in which some errors occur due to noise. At the receiver, the erroneous code word
01 10 11 10 00 00 is received. Using Viterbi decoding, find and correct the errors.
Also recover the message sequence at the receiver side.
13) The encoder shown below generates an all-zero sequence which is sent over a binary
symmetric channel. The received sequence is 0100100, with two errors at the
2nd and 5th positions. Show that this double error can be detected and corrected by the
application of the Viterbi algorithm.

14) Examine encoding and decoding process of turbo codes for an input sequence 101110111011
15) A convolutional encoder has the following generating sequence, g0=[1 1 1], g1=[1 0 1].
Apply Viterbi algorithm for the decoding of the received sequence 1101110001100011.
16) A transmitter transmits the message sequence 100000 as a code word over a transmission
medium in which some errors occur due to noise. At the receiver, the erroneous code word
01 10 11 10 00 00 is received. Using Viterbi decoding, find and correct the errors. Also, recover the
message sequence at the receiver side.
Module 4:

Symmetric (Secret Key) Cryptography

Introduction to Cryptography, An Overview of Encryption Techniques, Operations Used By
Encryption Algorithms. Symmetric (Secret Key) Cryptography: Data Encryption Standard (DES),
AES, Linear Feedback Shift Registers.

Introduction:
This is the age of universal electronic connectivity, in which activities like hacking,
viruses and electronic fraud are very common. Unless security measures are taken, a network
conversation or a distributed application can be compromised easily.

Some simple examples are:


i. Online purchases using a credit/debit card.
ii. A customer unknowingly being directed to a false website.

iii. A hacker sending a message to a person while pretending to be someone else.

Network security has been affected by two major developments over the last several
decades. The first is the introduction of computers into organizations, and the second is the
introduction of distributed systems and the use of networks and communication facilities for
carrying data between users and computers. These two developments led to 'computer security'
and 'network security': computer security deals with the collection of tools designed to
protect data and to thwart hackers, while network security measures are needed to protect data during
transmission. But keep in mind that it is the information, and our ability to access that
information, that we are really trying to protect, and not the computers and networks.

Why We Need Information Security?


Because there are threats:
Threats
A threat is an object, person, or other entity that represents a constant danger to an asset
The 2007 CSI survey
 494 computer security practitioners
 46% suffered security incidents
 29% reported to law enforcement
 Average annual loss $350,424
 1/5 suffered a 'targeted attack'
 The source of the greatest financial losses?
 Most prevalent security problems: insider abuse of network access and email

Threat Categories
 Acts of human error or failure
 Compromises to intellectual property
 Deliberate acts of espionage or trespass
 Deliberate acts of information extortion
 Deliberate acts of sabotage or vandalism
 Deliberate acts of theft
 Deliberate software attack
 Forces of nature
 Deviations in quality of service
 Technical hardware failures or errors
 Technical software failures or errors
 Technological obsolescence

Definitions
 Computer Security - generic name for the collection of tools designed to protect
data and to thwart hackers
 Network Security - measures to protect data during their transmission
 Internet Security - measures to protect data during their transmission over a
collection of interconnected networks
 our focus is on Internet Security
 which consists of measures to deter, prevent, detect, and correct security
violations that involve the transmission & storage of information

Aspects Of Security
consider 3 aspects of information security:

 Security Attack
 Security Mechanism
 Security Service

Security Attack
 any action that compromises the security of information owned by an
organization
 information security is about how to prevent attacks, or failing that, to
detect attacks on information-based systems
 often threat & attack used to mean same thing
 have a wide range of attacks
 can focus of generic types of attacks

 Passive
 Active
Passive Attack

Active Attack

Interruption
An asset of the system is destroyed or becomes unavailable or unusable. It is an
attack on availability.
Examples:

 Destruction of some hardware


 Jamming wireless signals
 Disabling file management systems
Interception
An unauthorized party gains access to an asset. Attack on confidentiality.
Examples:
 Wire tapping to capture data in a network.
 Illicitly copying data or programs
 Eavesdropping
Modification

When an unauthorized party gains access to an asset and tampers with it. The attack is on
integrity.
Examples:
 Changing data file
 Altering a program and the contents of a message

Fabrication
An unauthorized party inserts a counterfeit object into the system. Attack on
Authenticity. Also called impersonation

Examples:
 Hackers gaining access to a personal email and sending message
 Insertion of records in data files
 Insertion of spurious messages in a network

Security Services
It is a processing or communication service that is provided by a system to give a
specific kind of production to system resources. Security services implement security policies
and are implemented by security mechanisms.
Confidentiality

Confidentiality is the protection of transmitted data from passive attacks. It is used to


prevent the disclosure of information to unauthorized individuals or systems. It has been
defined as “ensuring that information is accessible only to those authorized to have access”.
The other aspect of confidentiality is the protection of traffic flow from analysis. Ex: a credit
card number has to be secured during an online transaction.

Authentication

This service assures that a communication is authentic. For a single message


transmission, its function is to assure the recipient that the message is from intended source.
For an ongoing interaction two aspects are involved. First, during connection initiation the
service assures the authenticity of both parties. Second, the connection between the two hosts
is not interfered allowing a third party to masquerade as one of the two parties. Two specific
authentication services defines in X.800 are

Peer entity authentication: Verifies the identities of the peer entities involved in
communication. It is provided for use at the time of connection establishment and during data
transmission, and provides confidence against masquerade or replay attacks.
Data origin authentication: Assures the authenticity of the source of a data unit, but does not
provide protection against duplication or modification of data units. It supports applications like
electronic mail, where no prior interaction takes place between the communicating entities.
Integrity

Integrity means that data cannot be modified without authorization. Like


confidentiality, it can be applied to a stream of messages, a single message or selected fields
within a message. Two types of integrity services are available. They are:

Connection-Oriented Integrity Service: This service deals with a stream of messages,


assures that messages are received as sent, with no duplication, insertion, modification,
reordering or replays. Destruction of data is also covered here. Hence, it attends to both message
stream modification and denial of service.
Connectionless-Oriented Integrity Service: It deals with individual messages
regardless of larger context, providing protection against message modification only.

An integrity service can be applied with or without recovery. Because it is related to

active attacks, the major concern is detection rather than prevention. If a violation is
detected and the service reports it, either human intervention or automated recovery mechanisms
are required to recover.
Non-repudiation

Non-repudiation prevents either sender or receiver from denying a transmitted message.


This capability is crucial to e-commerce. Without it an individual or entity can deny that he,
she or it is responsible for a transaction, therefore not financially liable.
Access Control
This refers to the ability to control the level of access that individuals or entities have
to a network or system and how much information they can receive. It is the ability to limit and
control the access to host systems and applications via communication links. For this, each
entity trying to gain access must first be identified or authenticated, so that access rights can be
tailored to the individuals.
Availability

Availability is defined to be the property of a system or a system resource being accessible
and usable upon demand by an authorized system entity. Availability can be significantly
affected by a variety of attacks, some amenable to automated countermeasures such as
authentication and encryption, while others require some sort of physical action to prevent or recover
from the loss of availability of elements of a distributed system.

Security Mechanisms
According to X.800, the security mechanisms are divided into those implemented in a
specific protocol layer and those that are not specific to any particular protocol layer or security
service. X.800 also differentiates reversible & irreversible encipherment mechanisms. A
reversible encipherment mechanism is simply an encryption algorithm that allows data to be
encrypted and subsequently decrypted, whereas irreversible encipherment include hash
algorithms and message authentication codes used in digital signature and message
authentication applications
Specific Security Mechanisms
Incorporated into the appropriate protocol layer in order to provide some of the OSI
security services,
Encipherment: It refers to the process of applying mathematical algorithms for converting
data into a form that is not intelligible. This depends on algorithm used and encryption keys.

Digital Signature: The appended data or a cryptographic transformation applied to any data
unit allowing to prove the source and integrity of the data unit and protect against forgery.
Access Control: A variety of techniques used for enforcing access permissions to the system
resources.
Data Integrity: A variety of mechanisms used to assure the integrity of a data unit or stream
of data units.
Authentication Exchange: A mechanism intended to ensure the identity of an entity by
means of information exchange.
Traffic Padding: The insertion of bits into gaps in a data stream to frustrate traffic analysis
attempts.
Routing Control: Enables selection of particular physically secure routes for certain data
and allows routing changes once a breach of security is suspected.
Notarization: The use of a trusted third party to assure certain properties of a data exchange.
Pervasive Security Mechanisms
These are not specific to any particular OSI security service or protocol layer.
Trusted Functionality: That which is perceived to be correct with respect to some criteria.
Security Level: The marking bound to a resource (which may be a data unit) that names or
designates the security attributes of that resource.
Event Detection: It is the process of detecting all the events related to network security.
Security Audit Trail: Data collected and potentially used to facilitate a security audit, which
is an independent review and examination of system records and activities.
Security Recovery: Deals with requests from mechanisms, such as event handling and management
functions, and takes recovery actions.
Model For Network Security

Data is transmitted over network between two communicating parties, who must
cooperate for the exchange to take place. A logical information channel is established by
defining a route through the internet from source to destination by use of communication
protocols by the two parties. Whenever an opponent presents a threat to confidentiality,
authenticity of information, security aspects come into play. Two components are present in
almost all the security providing techniques.
A security-related transformation on the information to be sent making it unreadable
by the opponent, and the addition of a code based on the contents of the message, used to
verify the identity of sender.
Some secret information shared by the two principals and, it is hoped, unknown to the
opponent. An example is an encryption key used in conjunction with the transformation to
scramble the message before transmission and unscramble it on reception
A trusted third party may be needed to achieve secure transmission. It is responsible for
distributing the secret information to the two parties, while keeping it away from any opponent.
It also may be needed to settle disputes between the two parties regarding authenticity of a
message transmission. The general model shows that there are four basic tasks in designing a
particular security service:
1. Design an algorithm for performing the security-related transformation. The algorithm
should be such that an opponent cannot defeat its purpose
2. Generate the secret information to be used with the algorithm
3. Develop methods for the distribution and sharing of the secret information
4. Specify a protocol to be used by the two principals that makes use of the security
algorithm and the secret information to achieve a particular security service various other
threats to information system like unwanted access still exist.
Information access threats intercept or modify data on behalf of users who should not have
access to that data Service threats exploit service flaws in computers to inhibit use by legitimate
users Viruses and worms are two examples of software attacks inserted into the system by
means of a disk or also across the network. The security mechanisms needed to cope with
unwanted access fall into two broad categories.
Some basic terminologies used
1. CIPHER TEXT - the coded message
2. CIPHER - algorithm for transforming plaintext to cipher text
3. KEY - info used in cipher known only to sender/receiver
4. ENCIPHER (ENCRYPT) - converting plaintext to cipher text
5. DECIPHER (DECRYPT) - recovering plaintext from cipher text
6. CRYPTOGRAPHY - study of encryption principles/methods
7. CRYPTANALYSIS (CODEBREAKING) - the study of principles/ methods of
deciphering cipher text without knowing key
8. CRYPTOLOGY - the field of both cryptography and cryptanalysis

Cryptography
Cryptographic systems are generally classified along 3 independent dimensions:
Type of operations used for transforming plain text to cipher text:
All the encryption algorithms are a based on two general principles: substitution, in
which each element in the plaintext is mapped into another element, and transposition, in
which elements in the plaintext are rearranged.
The number of keys used:

If the sender and receiver use the same key, the system is said to be symmetric key (or single
key, or conventional) encryption. If the sender and receiver use different keys, the system is said
to be public key encryption.
The way in which the plain text is processed:
A block cipher processes the input one block of elements at a time, producing an output
block for each input block. A stream cipher processes the input elements continuously,
producing output one element at a time, as it goes along.
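
The distinction can be seen in a purely illustrative Python sketch (XOR with a repeating key
stands in for a real cipher here and is not secure):

    from itertools import cycle

    def block_process(data, key8):
        # one 8-byte output block per 8-byte input block
        return [bytes(b ^ k for b, k in zip(data[i:i + 8], key8))
                for i in range(0, len(data), 8)]

    def stream_process(data, keystream):
        # one output byte per input byte, produced as it goes along
        return bytes(b ^ k for b, k in zip(data, keystream))

    print(block_process(b"EIGHTBYTEBLOCKS!", b"SECRETK!"))
    print(stream_process(b"STREAMED", cycle(b"KEY")))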

Cryptanalysis
The process of attempting to discover X or K or both is known as cryptanalysis. The
strategy used by the cryptanalyst depends on the nature of the encryption scheme and the
information available to the cryptanalyst. There are various types of cryptanalytic attacks
based on the amount of information known to the cryptanalyst.
Cipher text only – A copy of cipher text alone is known to the cryptanalyst.

Known plaintext – The cryptanalyst has a copy of the cipher text and the corresponding
plaintext.

Chosen plaintext – The cryptanalysts gains temporary access to the encryption machine. They
cannot open it to find the key, however; they can encrypt a large number of suitably chosen
plaintexts and try to use the resulting cipher texts to deduce the key.
Chosen cipher text – The cryptanalyst obtains temporary access to the decryption machine,
uses it to decrypt several string of symbols, and tries to use the results to deduce the key.

Classical Encryption Techniques


There are two basic building blocks of all encryption techniques: substitution and
transposition.

Substitution Techniques
In which each element in the plaintext is mapped into another element.
1. Caesar Cipher
2. Monoalphabetic cipher
3. Playfair Cipher
4. Hill Cipher
5. Polyalphabetic Cipher
6. One Time Pad
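
As a minimal illustration of the first and simplest of these, a Caesar cipher sketch in Python
(each letter is shifted by a fixed amount):

    def caesar(text, shift):
        out = []
        for ch in text.upper():
            if ch.isalpha():
                # shift within the 26-letter alphabet, wrapping around
                out.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
            else:
                out.append(ch)               # leave spaces/punctuation alone
        return "".join(out)

    print(caesar("ATTACK AT DAWN", 3))       # -> DWWDFN DW GDZQ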

Steganography
A plaintext message may be hidden in one of two ways. The methods of
steganography conceal the existence of the message, whereas the methods of cryptography
render the message unintelligible to outsiders by various transformations of the text. A simple
form of steganography, but one that is time consuming to construct is one in which an
arrangement of words or letters within an apparently innocuous text spells out the real message.
e.g., (i) the sequence of first letters of each word of the overall message spells out the real
(hidden) message. (ii) Subset of the words of the overall message is used to convey the hidden
message. Various other techniques have been used historically, some of them are:

 Character marking – selected letters of printed or typewritten text are overwritten
in pencil. The marks are ordinarily not visible unless the paper is held at an angle
to bright light.
 Invisible ink – a number of substances can be used for writing but leave no visible
trace until heat or some chemical is applied to the paper.
 Pin punctures – small pin punctures on selected letters are ordinarily not visible
unless the paper is held in front of the light.
 Typewritten correction ribbon – used between the lines typed with a black ribbon,
the results of typing with the correction tape are visible only under a strong light.

Drawbacks of Steganography
 Requires a lot of overhead to hide a relatively few bits of information.
 Once the system is discovered, it becomes virtually worthless.
Conventional Encryption Principles
A Conventional/Symmetric encryption scheme has five ingredients:

1. Plain Text: This is the original message or data which is fed into the algorithm as input.

2. Encryption Algorithm: This encryption algorithm performs various substitutions and


transformations on the plain text.

3. Secret Key: The key is another input to the algorithm. The substitutions and transformations
performed by the algorithm depend on the key.

4. Cipher Text: This is the scrambled (unreadable) message which is output of the encryption
algorithm. This cipher text is dependent on plaintext and secret key. For a given plaintext, two
different keys produce two different cipher texts.

5. Decryption Algorithm: This is the reverse of encryption algorithm. It takes the cipher text and
secret key as inputs and outputs the plain text.
The important point is that the security of conventional encryption depends on the secrecy of the key,
not the secrecy of the algorithm; i.e., it is not necessary to keep the algorithm secret, but only the key.
This feature, that the algorithm need not be kept secret, made it feasible for widespread use and
enabled manufacturers to develop low-cost chip implementations of data encryption algorithms.
With the use of conventional algorithms, the principal security problem is maintaining the secrecy of
the key.


Simplified Data Encryption Standard (S-DES)


The figure above illustrates the overall structure of the simplified DES. The S-DES
encryption algorithm takes an 8-bit block of plaintext (example: 10111101) and a 10-bit key as
input and produces an 8-bit block of cipher text as output. The S-DES decryption algorithm
takes an 8-bit block of cipher text and the same 10-bit key used to produce that cipher text as
input and produces the original 8-bit block of plaintext.

The encryption algorithm involves five functions:


 an initial permutation (IP)
 a complex function labeled fk, which involves both permutation and substitution
operations and depends on a key input
 a simple permutation function that switches (SW) the two halves of the data
 the function fk again
 a permutation function that is the inverse of the initial permutation
The function fK takes as input not only the data passing through the encryption algorithm, but
also an 8-bit key. Here a 10-bit key is used, from which two 8-bit subkeys are generated. The key
is first subjected to a permutation (P10). Then a shift operation is performed. The output of the
shift operation then passes through a permutation function that produces an 8-bit output (P8)
for the first subkey (K1). The output of the shift operation also feeds into another shift and
another instance of P8 to produce the second subkey (K2).
The encryption algorithm can be expressed as a composition of functions:
IP-1 ∘ fK2 ∘ SW ∘ fK1 ∘ IP
This can also be written as:
Ciphertext = IP-1 (fK2 (SW (fK1 (IP (plaintext)))))
where
K1 = P8 (Shift (P10 (Key)))
K2 = P8 (Shift (Shift (P10 (Key))))
Decryption can be shown as:
Plaintext = IP-1 (fK1 (SW (fK2 (IP (ciphertext)))))
S-DES depends on the use of a 10-bit key shared between sender and receiver. From
this key, two 8-bit subkeys are produced for use in particular stages of the encryption and
decryption algorithm. First, permute the key in the following fashion. Let the 10-bit key be
designated as (k1, k2, k3, k4, k5, k6, k7, k8, k9, k10).
Then the permutation P10 is defined as:
P10 (k1, k2, k3, k4, k5, k6, k7, k8, k9, k10) = (k3, k5, k2, k7, k4, k10, k1, k9, k8, k6)
P10 can be concisely defined by the display:
P10
3 5 2 7 4 10 1 9 8 6

This table is read from left to right; each position in the table gives the identity of the input bit
that produces the output bit in that position. So the first output bit is bit 3 of the input; the
second output bit is bit 5 of the input, and so on. For example, the key (1010000010) is
permuted to (10000 01100). Next, perform a circular left shift (LS-1), or rotation, separately
on the first five bits and the second five bits. In our example, the result is (00001 11000). Next
we apply P8, which picks out and permutes 8 of the 10 bits according to the following rule:
P8
6 3 7 4 8 5 10 9
The result is subkey 1 (K1). In our example, this yields (10100100). We then go back to the
pair of 5-bit strings produced by the two LS-1 functions and perform a circular left shift of 2
bit positions on each string. In our example, the value (00001 11000) becomes (00100 00011).
Finally, P8 is applied again to produce K2. In our example, the result is (01000011).
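
A minimal Python sketch of this key schedule; it reproduces K1 = (10100100) and
K2 = (01000011) for the key (1010000010):

    P10 = (3, 5, 2, 7, 4, 10, 1, 9, 8, 6)
    P8 = (6, 3, 7, 4, 8, 5, 10, 9)

    def permute(bits, table):
        return [bits[i - 1] for i in table]  # tables are 1-indexed

    def left_shift(half, n):
        return half[n:] + half[:n]           # circular left shift

    def sdes_subkeys(key10):
        k = permute(key10, P10)
        l, r = left_shift(k[:5], 1), left_shift(k[5:], 1)   # LS-1 on each half
        k1 = permute(l + r, P8)
        l, r = left_shift(l, 2), left_shift(r, 2)           # LS-2 on each half
        k2 = permute(l + r, P8)
        return k1, k2

    key = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]
    print(sdes_subkeys(key))   # -> ([1,0,1,0,0,1,0,0], [0,1,0,0,0,0,1,1])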
S-DES encryption
Encryption involves the sequential application of five functions.
Initial and Final Permutations
The input to the algorithm is an 8-bit block of plaintext, which we first permute using the IP function:
IP
2 6 3 1 4 8 5 7
This retains all 8 bits of the plaintext but mixes them up.
Consider the plaintext to be 11110011.
Permuted output = 10111101
At the end of the algorithm, the inverse permutation is used:
IP –1
4 1 3 5 7 2 8 6

The most complex component of S-DES is the function fK, which consists of a
combination of permutation and substitution functions. The functions can be expressed as
follows. Let L and R be the leftmost 4 bits and rightmost 4 bits of the 8-bit input to fK, and
let F be a mapping (not necessarily one to one) from 4-bit strings to 4-bit strings. Then we let
fk(L, R) = ( L (+) F( R, SK), R)
Where SK is a subkey and (+) is the bit-by-bit exclusive-OR function.
e.g., permuted output = 1011 1101, and suppose F(1101, SK) = (1110) for some key SK.
Then fK(1011 1101) = (1011 ⊕ 1110, 1101) = (0101 1101).
We now describe the mapping F. The input is a 4-bit number (n1 n2 n3 n4). The first
operation is an expansion/permutation operation:

E/P
4 1 2 3 2 3 4 1
R = 1101, E/P output = 11101011. It is clearer to depict the result in this fashion:

The 8-bit subkey K1 = (k11, k12, k13, k14, k15, k16, k17, k18) is added to this value using
exclusive-OR:

Let us rename these 8 bits:

The first 4 bits (first row of the preceding matrix) are fed into the S-box S0 to produce a 2-bit
output, and the remaining 4 bits (second row) are fed into S1 to produce another 2-bit output.
These two boxes are defined as follows:
The S-boxes operate as follows. The first and fourth input bits are treated as a 2-bit
number that specifies a row of the S-box, and the second and third input bits specify a
column of the S-box. The entry in that row and column, in base 2, is the 2-bit output. For
example, if (p0,0 p0,3) = (00) and (p0,1 p0,2) = (10), then the output is from row 0, column
2 of S0, which is 3, or (11) in binary. Similarly, (p1,0 p1,3) and (p1,1 p1,2) are used to index
into a row and column of S1 to produce an additional 2 bits. Next, the 4 bits produced by S0
and S1 undergo a further permutation as follows:
P4
2 4 3 1

The output of P4 is the output of the function F.
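
A sketch of the mapping F, reusing the permute helper from the key-schedule sketch above. The
S0 and S1 tables are the commonly published S-DES S-boxes, assumed here since the defining
figure is not reproduced; the sketch agrees with the worked values above (E/P of 1101 gives
11101011, and S0 row 0, column 2 gives 3 = 11).

    EP = (4, 1, 2, 3, 2, 3, 4, 1)
    P4 = (2, 4, 3, 1)
    S0 = [[1, 0, 3, 2], [3, 2, 1, 0], [0, 2, 1, 3], [3, 1, 3, 2]]  # assumed
    S1 = [[0, 1, 2, 3], [2, 0, 1, 3], [3, 0, 1, 0], [2, 1, 0, 3]]  # assumed

    def sbox(box, b):                        # b = 4 bits after keying
        row = 2 * b[0] + b[3]                # first and fourth bits -> row
        col = 2 * b[1] + b[2]                # second and third bits -> column
        val = box[row][col]
        return [val >> 1, val & 1]           # 2-bit output

    def F(r4, subkey):
        x = [a ^ b for a, b in zip(permute(r4, EP), subkey)]   # E/P, then XOR key
        return permute(sbox(S0, x[:4]) + sbox(S1, x[4:]), P4)  # S-boxes, then P4

    print(permute([1, 1, 0, 1], EP))         # -> [1, 1, 1, 0, 1, 0, 1, 1]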


The Switch Function
The function fK only alters the leftmost 4 bits of the input. The switch function (SW)
interchanges the left and right 4 bits so that the second instance of fK operates on a different
4 bits. In this second instance, the E/P, S0, S1, and P4 functions are the same; the key input is K2.
Finally, apply the inverse permutation to get the ciphertext.

Data Encryption Standard (DES)


The main standard for encrypting data was a symmetric algorithm known as the Data
Encryption Standard (DES). However, this has now been replaced by a new standard known as the
Advanced Encryption Standard (AES) which we will look at later. DES is a 64 bit block cipher
which means that it encrypts data 64 bits at a time. This is contrasted to a stream cipher in which
only one bit at a time (or sometimes small groups of bits such as a byte) is encrypted. DES was the
result of a research project set up by International Business Machines (IBM) corporation in the late
1960’s which resulted in a cipher known as LUCIFER. In the early 1970’s it was decided to
commercialize LUCIFER and a number of significant changes were introduced. IBM was not the
only one involved in these changes as they sought technical advice from the National Security
Agency (NSA) (other outside consultants were involved but it is likely that the NSA were the major
contributors from a technical point of view). The altered version of LUCIFER was put forward as a
proposal for the new national encryption standard requested by the National Bureau of Standards
(NBS). It was finally adopted in 1977 as the Data Encryption Standard - DES (FIPS PUB 46).
Some of the changes made to LUCIFER have been the subject of much controversy even to the
present day. The most notable of these was the key size. LUCIFER used a key size of 128 bits
however this was reduced to 56 bits for DES. Even though DES actually accepts a 64 bit key as
input, the remaining eight bits are used for parity checking and have no effect on DES’s security.
Outsiders were convinced that the 56 bit key was an easy target for a brute force attack due to its
extremely small size. The need for the parity checking scheme was also questioned without
satisfying answers. Another controversial issue was that the S-boxes used were designed under
classified conditions and no reasons for their particular design were ever given. This led people to
assume that the NSA had introduced a “trapdoor” through which they could decrypt any data
encrypted by DES even without knowledge of the key. One startling discovery was that the S-boxes
appeared to be secure against an attack known as differential cryptanalysis, which was only
publicly discovered by Biham and Shamir in 1990. This suggests that the NSA were aware of this
attack in 1977, 13 years earlier! In
fact the DES designers claimed that the reason they never made the design specifications for the S-
boxes available was that they knew about a number of attacks that weren’t public knowledge at the
time and they didn’t want them leaking - this is quite a plausible claim as differential cryptanalysis
has shown. However, despite all this controversy, in 1994 NIST reaffirmed DES for government
use for a further five years for use in areas other than “classified”. DES of course isn’t the only
symmetric cipher. There are many others, each with varying levels of complexity. Such ciphers
include: IDEA, RC4, RC5, RC6 and the new Advanced Encryption Standard (AES). AES is an
important algorithm and was originally meant to replace DES (and its more secure variant triple
DES) as the standard algorithm for non-classified material. However as of 2003, AES with key
sizes of 192 and 256 bits has been found to be secure enough to protect information up to top secret.
Since its creation, AES had underdone intense scrutiny as one would expect for an algorithm that
is to be used as the standard. To date it has withstood all attacks but the search is still on and it
remains to be seen Media whetherornotthis will last. We will look at AES later in the course.
DES
DES (and most of the other major symmetric ciphers) is based on a cipher known as the Feistel
block cipher. It consists of a number of rounds where each round contains bit-shuffling, non-linear
substitutions (S-boxes) and exclusive OR operations. As with most encryption schemes,
DES expects two inputs - the plaintext to be encrypted and the secret key. The manner in which
the plaintext is accepted, and the key arrangement used for encryption and decryption, both
determine the type of cipher it is. DES is therefore a symmetric, 64 bit block cipher, as it uses
the same key for both encryption and decryption and only operates on 64 bit blocks of data at
a time (be they plaintext or ciphertext). The key size used is 56 bits; however, a 64 bit (or eight-byte)
key is actually input. The least significant bit of each byte is either used for parity (odd
for DES) or set arbitrarily, and does not increase the security in any way. All blocks are
numbered from left to right, which makes the eighth bit of each byte the parity bit.
Once a plain-text message is received to be encrypted, it is arranged into 64 bit blocks
required for input. If the number of bits in the message is not evenly divisible by 64, then the
last block will be padded. Multiple permutations and substitutions are incorporated
throughout in order to increase the difficulty of performing a cryptanalysis on the cipher
Overall Structure
Figure below shows the sequence of events that occur during an encryption operation. DES
performs an initial permutation on the entire 64 bit block of data. It is then split into two 32 bit
sub-blocks, Li and Ri, which are then passed into what is known as a round (see figure 2.3), of
which there are 16 (the subscript i in Li and Ri indicates the current round). Each of the rounds
is identical, and the effects of increasing their number are twofold: the algorithm's security
is increased and its temporal efficiency decreased. Clearly these are two conflicting outcomes
and a compromise must be made. For DES the number chosen was 16, probably to guarantee the
elimination of any correlation between the cipher text and either the plaintext or key. At the
end of the 16th round, the 32 bit Li and Ri output quantities are swapped to create what is
known as the pre-output. This [R16, L16] concatenation is permuted using a function which is
the exact inverse of the initial permutation. The output of this final permutation is the 64 bit
cipher text.
So, in total, the processing of the plaintext proceeds in three phases, as can be seen from the
left hand side of the figure:

1. Initial permutation (IP - defined in table 2.1) rearranging the bits to form the
“permuted input”.

2. Followed by 16 iterations of the same function (substitution and permutation). The output
of the last iteration consists of 64 bits which is a function of the plaintext and key. The left
and right halves are swapped to produce the pre-output.
3. Finally, the pre-output is passed through a permutation (IP−1 - defined in table 2.1) which
is simply the inverse of the initial permutation (IP). The output of IP−1 is the 64- bit cipher
text
As the figure shows, the inputs to each round consist of the Li, Ri pair and a 48 bit subkey which
is a shifted and contracted version of the original 56 bit key. The use of the key can be seen in
the right hand portion of figure 2.2:
• Initially the key is passed through a permutation function (PC1 - defined in table 2.2)
• For each of the 16 iterations, a subkey (Ki) is produced by a combination of a left circular
shift and a permutation (PC2 - defined in table 2.2) which is the same for each iteration.
However, the resulting subkey is different for each iteration because of the repeated shifts.

Details Of Individual Rounds
The main operations on the data are encompassed into what is referred to as the cipher function
and is labeled F. This function accepts two different length inputs of 32 bits and 48 bits and
outputs a single 32 bit number. Both the data and key are operated on in parallel; however, the
operations are quite different. The 56 bit key is split into two 28 bit halves Ci and Di (C and D
being chosen so as not to be confused with L and R). The value of the key used in any round is
simply a left cyclic shift and a permuted contraction of that used in the previous round.
Mathematically, this can be written as:

Ci = LCSi(Ci−1), Di = LCSi(Di−1)

Ki = PC2(Ci, Di)

where LCSi is the left cyclic shift for round i, Ci and Di are the outputs after the shifts, PC2(·)
is a function which permutes and compresses a 56 bit number into a 48 bit number, and Ki is
the actual key used in round i. The number of shifts is either one or two and is determined by
the round number i. For i = {1, 2, 9, 16} the number of shifts is one, and for every other round
it is two.
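
A small Python sketch of just this shift schedule (the table-driven permutations PC1 and PC2 are
omitted). Note that the shifts sum to 28, a full rotation of each 28 bit half, so the register returns
to its starting position after round 16:

    SHIFTS = [1 if i in (1, 2, 9, 16) else 2 for i in range(1, 17)]

    def rotate_left(half28, n):              # circular left shift of one half
        return half28[n:] + half28[:n]

    print(SHIFTS)       # [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]
    print(sum(SHIFTS))  # 28 -> C16 = C0 and D16 = D0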
Advanced Encryption Algorithm (AES)
 AES is a block cipher with a block length of 128 bits.
 AES allows for three different key lengths: 128, 192, or 256 bits. Most of our
discussion will assume that the key length is 128 bits.
 Encryption consists of 10 rounds of processing for 128-bit keys, 12 rounds for
192-bit keys, and 14 rounds for 256-bit keys.
 Except for the last round in each case, all other rounds are identical.
 Each round of processing includes one single-byte based substitution step, a row-
wise permutation step, a column-wise mixing step, and the addition of the round
key. The order in which these four steps are executed is different for encryption and
decryption.
 To appreciate the processing steps used in single round, it is best to think of a
128-bit block as consisting of a 4 × 4 matrix of bytes, rearranged as follows:

Therefore, the first four bytes of a 128-bit input block occupy the first column in the 4
× 4 matrix of bytes. The next four bytes occupy the second column, and so on.
The 4×4 matrix of bytes shown above is referred to as the state array in AES.
The algorithm begins with an Add Round Key stage, followed by 9 rounds of four stages and a
tenth round of three stages.
This applies for both encryption and decryption, with the exception that each stage of a round of
the decryption algorithm is the inverse of its counterpart in the encryption algorithm.
The four stages are as follows: 1. Substitute bytes 2. Shift rows 3. Mix Columns 4. Add
Round Key

Substitute Bytes
 This stage (known as SubBytes) is simply a table lookup using a 16 × 16 matrix of
byte values called an s-box.
 This matrix consists of all the possible combinations of an 8 bit sequence (2^8 = 16
× 16 = 256).
 However, the s-box is not just a random permutation of these values and there is a
well defined method for creating the s-box tables.
 The designers of Rijndael showed how this was done, unlike the S-boxes in DES,
for which no rationale was given. Our concern will be how the state is affected in
each round.
 For this particular round each byte is mapped into a new byte in the following way:
the leftmost nibble of the byte is used to specify a particular row of the s-box and
the rightmost nibble specifies a column.
 For example, the byte {95} (curly brackets represent hex values in FIPS PUB
197) selects row 9 column 5 which turns out to contain the value {2A}.
 This is then used to update the state matrix.

Shift Row Transformation


 This stage (known as ShiftRows) is shown in figure below.
 A simple permutation and nothing more.
 It works as follows: the first row of state is not altered; the second row is shifted 1
byte to the left in a circular manner; the third row is shifted 2 bytes to the left in a circular
manner; the fourth row is shifted 3 bytes to the left in a circular manner.
Mix Column Transformation
 This stage (known as MixColumn) is basically a substitution
 Each column is operated on individually. Each byte of a column is mapped into a new
value that is a function of all four bytes in the column.
 The transformation can be determined by the following matrix multiplication on state
 Each element of the product matrix is the sum of products of elements of one row
and one column.
 In this case the individual additions and multiplications are performed in GF(2^8).
 The MixColumns transformation of a single column j (0 ≤ j ≤ 3) of state can be
expressed as:

s′0,j = (2 • s0,j) ⊕ (3 • s1,j) ⊕ s2,j ⊕ s3,j
s′1,j = s0,j ⊕ (2 • s1,j) ⊕ (3 • s2,j) ⊕ s3,j
s′2,j = s0,j ⊕ s1,j ⊕ (2 • s2,j) ⊕ (3 • s3,j)
s′3,j = (3 • s0,j) ⊕ s1,j ⊕ s2,j ⊕ (2 • s3,j)
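
The "2 •" and "3 •" products above are GF(2^8) multiplications. A minimal sketch: xtime is
multiplication by 2 with conditional reduction by the AES polynomial x^8 + x^4 + x^3 + x + 1
(0x1B), and 3 • s = xtime(s) ⊕ s. The test column is the widely published FIPS-197 example.

    def xtime(b):                            # multiply by 2 in GF(2^8)
        return ((b << 1) ^ 0x1B) & 0xFF if b & 0x80 else b << 1

    def mul3(b):                             # multiply by 3 in GF(2^8)
        return xtime(b) ^ b

    def mix_column(col):                     # col = [s0, s1, s2, s3]
        s0, s1, s2, s3 = col
        return [xtime(s0) ^ mul3(s1) ^ s2 ^ s3,
                s0 ^ xtime(s1) ^ mul3(s2) ^ s3,
                s0 ^ s1 ^ xtime(s2) ^ mul3(s3),
                mul3(s0) ^ s1 ^ s2 ^ xtime(s3)]

    print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
    # -> ['0x8e', '0x4d', '0xa1', '0xbc']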

Add Round Key Transformation


 In this stage (known as AddRoundKey) the 128 bits of state are bitwise XORed with
the 128 bits of the round key.
 The operation is viewed as a column wise operation between the 4 bytes of a state
column and one word of the round key.
 This transformation is as simple as possible, which helps in efficiency, but it also
affects every bit of state.
 The AES key expansion algorithm takes as input a 4-word key and produces a
linear array of 44 words. Each round uses 4 of these words as shown in figure.
 Each word contains 32 bits, which means each subkey is 128 bits long. Figure 7
shows pseudocode for generating the expanded key from the actual key.
Blowfish Algorithm
a symmetric block cipher designed by Bruce Schneier in 1993/94 •
characteristics:
• fast implementation on 32-bit CPUs
• compact in use of memory
• simple structure for analysis/implementation
• variable security by varying key size
• has been implemented in various products
Blowfish Key Schedule
• uses a 32 to 448 bit key; 32-bit words are stored in a K-array Kj, j from 1 to 14
• used to generate:

• 18 32-bit subkeys stored in the P-array, P1 … P18

• four 8×32 S-boxes stored in Si,j, each with 256 32-bit entries

Subkeys And S-Boxes Generation:


1. Initialize the P-array and then the 4 S-boxes in order, using the fractional part of pi:
P1 (leftmost 32 bits), and so on, up to S4,255.
2. XOR the P-array with the key array (32-bit blocks), reusing the key as needed: assuming we
have up to K10, then P10 XOR K10, P11 XOR K1, …, P18 XOR K8.
3. Encrypt the 64-bit block of zeros, and use the result to update P1 and P2.
4. Encrypt the output of the previous step using the current P & S arrays and replace P3 and P4.
Then keep encrypting the current output and using it to update successive pairs of P.
5. After updating all P's (last: P17, P18), start updating the S values using the encrypted
output from the previous step.
 requires 521 encryptions, hence slow in re-keying
 not suitable for limited-memory applications
Blowfish Encryption
 uses two main operations: addition modulo 2^32, and XOR
 data is divided into two 32-bit halves L0 & R0
for i = 1 to 16 do
    Ri = Li-1 XOR Pi;
    Li = F[Ri] XOR Ri-1;
L17 = R16 XOR P18;
R17 = L16 XOR P17;
• where
F[a, b, c, d] = ((S1,a + S2,b) XOR S3,c) + S4,d
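
Read literally, the round structure above translates into the following Python sketch. The P and S
arrays here are placeholders purely for runnability; real Blowfish derives them from the fractional
part of pi and the key, as described in the key schedule.

    MOD = 1 << 32                            # addition is modulo 2^32

    def F(x, S):
        # split x into four bytes a, b, c, d (most significant first)
        a, b, c, d = (x >> 24) & 0xFF, (x >> 16) & 0xFF, (x >> 8) & 0xFF, x & 0xFF
        return (((S[0][a] + S[1][b]) % MOD ^ S[2][c]) + S[3][d]) % MOD

    def encrypt_block(left, right, P, S):
        for i in range(16):
            r_new = left ^ P[i]                         # Ri = Li-1 XOR Pi
            left, right = F(r_new, S) ^ right, r_new    # Li = F[Ri] XOR Ri-1
        return right ^ P[17], left ^ P[16]  # L17 = R16 XOR P18, R17 = L16 XOR P17

    P = list(range(18))                      # placeholder subkeys (NOT secure)
    S = [list(range(256)) for _ in range(4)] # placeholder S-boxes
    print(encrypt_block(0x01234567, 0x89ABCDEF, P, S))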
Block Cipher Modes Of Operations
 Direct use of a block cipher is inadvisable
 An enemy can build up a "code book" of plaintext/cipher text equivalents
 Beyond that, direct use only works on messages that are a multiple
of the cipher block size in length
 Solution: five standard Modes of Operation: Electronic Code
Book (ECB), Cipher Block Chaining (CBC), Cipher Feedback (CFB),
Output Feedback (OFB), and Counter (CTR).
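
As one illustration, a minimal sketch of CBC chaining, with a toy permutation standing in for the
block cipher (hypothetical names, not a real cipher API): each plaintext block is XORed with the
previous ciphertext block before being enciphered, so identical plaintext blocks no longer produce
identical ciphertext blocks as they would in ECB.

    def xor_bytes(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def cbc_encrypt(blocks, iv, encrypt_block):
        out, prev = [], iv
        for blk in blocks:
            prev = encrypt_block(xor_bytes(blk, prev))   # chain the ciphertext
            out.append(prev)
        return out

    toy_cipher = lambda block: bytes(reversed(block))    # stand-in only
    print(cbc_encrypt([b"SAMEBLK!", b"SAMEBLK!"], bytes(8), toy_cipher))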
Module – 4
1) User Alice & Bob exchange the key using the Diffie-Hellman algorithm. Assume
α = 5, q = 83, XA = 6, XB = 10. Find YA, YB, and K.
2) In an RSA system, the public key is e=17, n=77 and the ciphertext C=29. Compute the
plaintext M.
3) Using the key “KEYWORD,” encrypt the plaintext “WHY DON’T YOU” using the Playfair cipher.
4) Outline the concept of a digital signature.
5) Describe the time domain approach of a convolutional encoder.
6) Explain AES with an example.
7) Explain the Hash Function in detail.
8) Analyze how to encrypt the plaintext using the Data Encryption Standard (DES) with a
suitable example.
9) Compare the roles of symmetric and asymmetric cryptography in achieving
cryptographic goals.
10) Explain AES encryption and decryption with necessary diagram.
11) Explain the Difference between substitution and transposition techniques in
encryption.
12) Explain the steps involved in Designing a file encryption system for a cloud storage
service to so that only authorized users can access the files.
13) Let message = “PARROT”, Key = “COMPUTER”, find ciphertext using Playfair
cipher.
14) For the plain text “NEVERSTOPTRYING”, find the ciphertext using the Hill cipher for the
given key:
[17 17  5]
[21 18 21]
[ 2  2 19]
15) Explain how to encrypt the message 1101100101010110 using the LFSR

Xn = (Xn-1 + Xn-2 + Xn-4 + Xn-5 + Xn-8) mod 2. Use the seed value 11001010.


16) Explain how to Encrypt the plaintext "123456ABCD132536" using a single round of
the Data Encryption Standard (DES) with the key "AABB09182736CCDD". Show
the key schedule and initial permutation.
17) Explain the types of symmetric key cryptography algorithms with examples.
18) Explain the steps in the Data Encryption Standard.
19) Explain the terms cryptosystem, cryptanalysis, attacker, and cracking with the help of an
example.
20) With the help of examples, illustrate the operations used by encryption algorithms.
21) Using the key “SECURITY”, encrypt the plaintext “HOW ARE YOU” using the
Playfair cipher.
22) Explain the importance and applications of digital signatures in modern cryptographic
systems.
23) Describe the frequency domain approach of a convolutional encoder and compare it
with the time domain approach.
24) Illustrate the working of the Advanced Encryption Standard (AES) algorithm with an
example.
25) Explain the concept of hash functions and their role in ensuring data integrity in
cryptographic systems.
26) Demonstrate how to encrypt the plaintext using the Triple Data Encryption Standard
(3DES) with a practical example.
27) Discuss the advantages and limitations of symmetric cryptography compared to
asymmetric cryptography in secure communication.
28) Explain the working of AES encryption and decryption with a block diagram and a
detailed step-by-step explanation.
29) Differentiate between the Vigenère cipher and columnar transposition cipher as
examples of substitution and transposition techniques.
30) Compare and contrast the role of public key cryptography and private key
cryptography in securing data transmission.
Private-Key Cryptography

 traditional private/secret/single key cryptography uses one key


 shared by both sender and receiver
 if this key is disclosed, communications are compromised
 also is symmetric, parties are equal
 hence does not protect the sender from the receiver forging a message and claiming it was
sent by the sender

Public Key Cryptography

The development of public-key cryptography is the greatest and perhaps the only
true revolution in the entire history of cryptography. It is asymmetric, involving the use
of two separate keys, in contrast to symmetric encryption, which uses only one key. Public
key schemes are neither more nor less secure than private key (security depends on the
key size for both). Public-key cryptography complements rather than replaces symmetric
cryptography. Both also have issues with key distribution, requiring the use
of some suitable protocol. The concept of public-key cryptography evolved from an
attempt to attack two of the most difficult problems associated with symmetric
encryption:
1.) key distribution – how to have secure communications in general without having to
trust a KDC with your key
2.) digital signatures – how to verify a message comes intact from the claimed sender
Public-key/two-key/asymmetric cryptography involves the use of two keys:

 a public-key, which may be known by anybody, and can be used to


encrypt messages, and verify signatures
 a private-key, known only to the recipient, used to decrypt messages, and sign
(create) signatures.
 is asymmetric because those who encrypt messages or verify signatures
cannot decrypt messages or create signatures

Public-key algorithms rely on one key for encryption and a different but related key for decryption. These algorithms have the following important characteristics:
 it is computationally infeasible to find the decryption key knowing only the algorithm and the encryption key
 it is computationally easy to encrypt/decrypt messages when the relevant (encryption/decryption) key is known
 either of the two related keys can be used for encryption, with the other used for decryption (for some algorithms, such as RSA)
The following figure illustrates the public-key encryption process and shows that a public-key encryption scheme has six ingredients: plaintext, encryption algorithm, public and private keys, ciphertext and decryption algorithm.
The essential steps involved in a public-key encryption scheme are given below:
1.) Each user generates a pair of keys to be used for encryption and decryption.
2.) Each user places one of the two keys in a public register; the other key is kept private.
3.) If B wants to send a confidential message to A, B encrypts the message using A’s public key.
4.) When A receives the message, she decrypts it using her private key. Nobody else can decrypt the message, because that can only be done using A’s private key (deducing a private key should be infeasible).
5.) If a user wishes to change his keys, he generates another pair of keys and publishes the public one; no interaction with other users is needed.
Notations used in public-key cryptography:
 The public key of user A will be denoted KUA.
 The private key of user A will be denoted KRA.
 The encryption method will be a function E.
 The decryption method will be a function D.
 If B wishes to send a plain message X to A, then he sends the cryptotext Y = E(KUA, X).
 The intended receiver A will decrypt the message: D(KRA, Y) = X.
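To make the notation concrete, here is a minimal Python sketch of this flow, assuming the toy RSA key pair that is worked out later in this module (KUA = {7, 187}, KRA = {23, 187}); a real system would use far larger keys:

# Toy illustration of Y = E(KUA, X) and D(KRA, Y) = X.
# The key pair below is the toy RSA pair derived later in this module:
# public key KUA = {e=7, n=187}, private key KRA = {d=23, n=187}.

KUA = (7, 187)   # A's public key (e, n)
KRA = (23, 187)  # A's private key (d, n)

def E(key, X):
    """Encrypt integer X (0 <= X < n) under key = (exponent, modulus)."""
    exp, n = key
    return pow(X, exp, n)

def D(key, Y):
    """Decrypt integer Y under key = (exponent, modulus)."""
    exp, n = key
    return pow(Y, exp, n)

X = 88                  # B's plain message as an integer, X < n
Y = E(KUA, X)           # B sends the cryptotext Y = E(KUA, X)
assert D(KRA, Y) == X   # A recovers X with her private key
print(Y, D(KRA, Y))     # prints: 11 88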

The first attack on Public-key Cryptography is the attack on Authenticity. An attacker


may impersonate user B: he sends a message E(KUA,X) and claims in the message to be
B –A has no guarantee this is so. To overcome this, B will encrypt the message using his
private key: Y=E(KRB,X). Receiver decrypts using B’s public key KRB. This shows the
authenticity of the sender because (supposedly) he is the only one who knows the private
key. The entire encrypted message serves as a digital signature. This scheme is depicted
in the following figure:
But a drawback still exists: anybody can decrypt the message using B’s public key, so secrecy or confidentiality is compromised. One can provide both authentication and confidentiality by using the public-key scheme twice:
 B encrypts X with his private key: Y = E(KRB, X)
 B encrypts Y with A’s public key: Z = E(KUA, Y)
 A decrypts Z (and she is the only one capable of doing it): Y = D(KRA, Z)
 A can now recover the plaintext and ensure that it comes from B (he is the only one who knows his private key) by decrypting Y using B’s public key: X = D(KUB, Y)
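The double application can be traced with small numbers. The sketch below assumes two toy RSA pairs that are not taken from these notes, chosen so that B's modulus (187) is smaller than A's (247) and the intermediate value Y therefore always fits in A's block range:

# Authentication + confidentiality via two toy RSA pairs (assumed numbers).
# B: p=17, q=11 -> n=187, e=7, d=23.  A: p=13, q=19 -> n=247, e=5, d=173.
# n_B < n_A guarantees the intermediate Y (< 187) fits in A's block range.

KUB, KRB = (7, 187), (23, 187)    # B's public / private key
KUA, KRA = (5, 247), (173, 247)   # A's public / private key

def apply_key(key, m):
    exp, n = key
    return pow(m, exp, n)

X = 42                        # plaintext, X < 187
Y = apply_key(KRB, X)         # B signs:    Y = E(KRB, X)
Z = apply_key(KUA, Y)         # B encrypts: Z = E(KUA, Y)

Y2 = apply_key(KRA, Z)        # A decrypts: Y = D(KRA, Z)
X2 = apply_key(KUB, Y2)       # A verifies: X = D(KUB, Y)
assert (Y2, X2) == (Y, X)
print(X2)                     # prints: 42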
Applications for Public-Key Cryptosystems:
1.) Encryption/decryption: the sender encrypts the message with the receiver’s public key.
2.) Digital signature: the sender “signs” the message (or a representative part of the message) using his private key.
3.) Key exchange: two sides cooperate to exchange a secret key for later use in a secret-key cryptosystem.
The main requirements of public-key cryptography are:
1. Computationally easy for a party B to generate a key pair (public key KUb, private key KRb).
2. Easy for sender A to generate the ciphertext: C = EKUb[M].
3. Easy for the receiver B to decrypt the ciphertext using the private key: M = DKRb[C].
4. Computationally infeasible to determine the private key (KRb) knowing the public key (KUb).
5. Computationally infeasible to recover the message M, knowing KUb and the ciphertext C.
6. Either of the two keys can be used for encryption, with the other used for decryption:
M = DKRb[EKUb(M)] = DKUb[EKRb(M)]
Easy is defined to mean a problem that can be solved in polynomial time as a function of the input length. A problem is infeasible if the effort to solve it grows faster than polynomial time as a function of the input size. Public-key cryptosystems usually rely on difficult mathematical functions rather than on S-P networks, as classical cryptosystems do. A one-way function is one that is easy to calculate in one direction and infeasible to calculate in the other direction (i.e., the inverse is infeasible to compute). A trap-door function is a difficult function that becomes easy if some extra information is known. Our aim is to find a trap-door one-way function, which is easy to calculate in one direction and infeasible to calculate in the other direction unless certain additional information is known.
Security of public-key schemes:
 As with private-key schemes, a brute-force exhaustive search attack is always theoretically possible, but the keys used are too large (>512 bits) for this to be practical.
 Security relies on a large enough difference in difficulty between the easy (encryption/decryption) and hard (cryptanalysis) problems. More generally, the hard problem is known; it is just made too hard to do in practice.
 Public-key schemes require the use of very large numbers and hence are slow compared to private-key schemes.
RSA Algorithm

RSA is the best known, and by far the most widely used, general public-key encryption algorithm; it was first published by Rivest, Shamir and Adleman of MIT in 1978 [RIVE78]. Since that time RSA has reigned supreme as the most widely accepted and implemented general-purpose approach to public-key encryption. The RSA scheme is a block cipher in which the plaintext and the ciphertext are integers between 0 and n-1 for some fixed n; a typical size for n is 1024 bits (or 309 decimal digits). It is based on exponentiation in a finite (Galois) field over integers modulo a prime, using large integers (e.g. 1024 bits). Its security is due to the cost of factoring large numbers. RSA involves a public key and a private key, where the public key is known to all and is used to encrypt data or messages. The data or message which has been encrypted using a public key can only be decrypted by using its corresponding private key. Each user generates a key pair, i.e. a public and a private key, using the following steps:
 each user selects two large primes at random: p, q
 compute the system modulus n = p.q
 calculate ø(n), where ø(n) = (p-1)(q-1)
 select at random the encryption key e, where 1 < e < ø(n) and gcd(e, ø(n)) = 1
 solve the following equation to find the decryption key d: e.d = 1 mod ø(n) and 0 ≤ d ≤ n
 publish the public encryption key: KU = {e, n}
 keep secret the private decryption key: KR = {d, n}
Both the sender and receiver must know the values of n and e, and only the receiver knows the value of d. Encryption and decryption are done using the following equations.
To encrypt a message M the sender:
– obtains the public key of the recipient KU = {e, n}
– computes: C = M^e mod n, where 0 ≤ M < n
To decrypt the ciphertext C the owner:
– uses their private key KR = {d, n}
– computes: M = C^d mod n = (M^e)^d mod n = M^(ed) mod n
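The generation and use of a key pair can be sketched in a few lines of Python. This is a toy sketch assuming tiny hard-coded primes (real RSA uses primes hundreds of digits long); pow(e, -1, ø(n)) computes the modular inverse d (Python 3.8+):

import math

def rsa_keygen(p, q, e):
    """Toy RSA key generation from two given primes (illustrative only)."""
    n = p * q
    phi = (p - 1) * (q - 1)               # phi(n) = (p-1)(q-1)
    assert math.gcd(e, phi) == 1          # e must satisfy gcd(e, phi(n)) = 1
    d = pow(e, -1, phi)                   # solve e.d = 1 mod phi(n)
    return (e, n), (d, n)                 # KU = {e, n}, KR = {d, n}

def encrypt(KU, M):
    e, n = KU
    return pow(M, e, n)                   # C = M^e mod n

def decrypt(KR, C):
    d, n = KR
    return pow(C, d, n)                   # M = C^d mod n

KU, KR = rsa_keygen(17, 11, 7)            # the primes of the example below
C = encrypt(KU, 88)
print(C, decrypt(KR, C))                  # prints: 11 88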
For this algorithm to be satisfactory, the following requirements must be met:
a) It is possible to find values of e, d, n such that M^(ed) = M mod n for all M < n.
b) It is relatively easy to calculate M^e and C^d for all values of M < n.
c) It is computationally infeasible to determine d given e and n.
The way RSA works is based on number theory. Fermat’s little theorem: if p is prime and a is a positive integer not divisible by p, then a^(p-1) ≡ 1 mod p. Corollary: for any positive integer a and prime p, a^p ≡ a mod p.
Fermat’s theorem, as useful as it will turn out to be, does not provide us with the integers d, e we are looking for – Euler’s theorem (a refinement of Fermat’s) does. Euler’s function associates to any positive integer n a number φ(n): the number of positive integers smaller than n and relatively prime to n. For example, φ(37) = 36, i.e. φ(p) = p-1 for any prime p. For any two primes p, q, φ(pq) = (p-1)(q-1). Euler’s theorem: for any relatively prime integers a, n we have a^φ(n) ≡ 1 mod n. Corollary: for any integers a, n we have a^(φ(n)+1) ≡ a mod n. Corollary: let p, q be two odd primes and n = pq; then φ(n) = (p-1)(q-1), and for any integer m with 0 < m < n, m^((p-1)(q-1)+1) ≡ m mod n; likewise, for any integers k, m with 0 < m < n, m^(k(p-1)(q-1)+1) ≡ m mod n. Euler’s theorem provides us with the numbers d, e such that M^(ed) = M mod n. We have to choose d, e such that ed = kφ(n)+1, or equivalently, d ≡ e^(-1) mod φ(n).
An example of RSA can be given as:
Select primes: p = 17 and q = 11
Compute n = pq = 17 × 11 = 187
Compute ø(n) = (p-1)(q-1) = 16 × 10 = 160
Select e: gcd(e, 160) = 1; choose e = 7
Determine d: de = 1 mod 160 and d < 160. The value is d = 23, since 23 × 7 = 161 = 1 × 160 + 1
Publish the public key KU = {7, 187}
Keep secret the private key KR = {23, 187}
Now, given the message M = 88 (note 88 < 187):
encryption: C = 88^7 mod 187 = 11
decryption: M = 11^23 mod 187 = 88
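Both computations can be checked with Python's built-in three-argument pow, which performs fast modular exponentiation:

# Verify the worked example with fast modular exponentiation.
assert pow(88, 7, 187) == 11     # encryption: C = 88^7 mod 187
assert pow(11, 23, 187) == 88    # decryption: M = 11^23 mod 187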
Another example of RSA is given as:
Let p = 11, q = 13, e = 11, m = 7
n = pq, i.e. n = 11 × 13 = 143
ø(n) = (p-1)(q-1), i.e. (11-1)(13-1) = 120
e.d = 1 mod ø(n), i.e. 11d mod 120 = 1, i.e. (11 × 11) mod 120 = 1; so d = 11
public key: {11, 143} and private key: {11, 143} (here e = d, a degenerate choice that a real system would avoid)
C = m^e mod n, so ciphertext C = 7^11 mod 143 = 106
m = C^d mod n, so plaintext m = 106^11 mod 143 = 7
For RSA key generation, users of RSA must:
– determine two primes at random: p, q
– select either e or d and compute the other
The primes p and q must be sufficiently large (so that the modulus cannot be factored), and they are typically found by guessing candidate values and applying a probabilistic primality test.
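The "guess and use a probabilistic test" step is usually done with the Miller-Rabin test. A minimal sketch, assuming Python's random module and an illustrative 20-round setting:

import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test (sketch)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):        # quick trial division
        if n % p == 0:
            return n == p
    r, d = 0, n - 1                       # write n-1 = 2^r * d with d odd
    while d % 2 == 0:
        r += 1
        d //= 2
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                  # a witnesses that n is composite
    return True                           # n is probably prime

def random_prime(bits):
    """Guess odd candidates of the given size until one passes the test."""
    while True:
        cand = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(cand):
            return cand

print(random_prime(64))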
Security of RSA
There are three main approaches to attacking the RSA algorithm.
Brute-force key search (infeasible given the size of the numbers): as explained before, this involves trying all possible private keys. The best defense is using large keys.
Mathematical attacks (based on the difficulty of computing ø(N) by factoring the modulus N): there are several approaches, all equivalent in effect to factoring the product of two primes. Some of them are:
– factor N = p.q, hence find ø(N) and then d
– determine ø(N) directly and find d
– find d directly
The possible defense is using large keys and choosing large numbers for p and q, which should differ in length by only a few digits and should be on the order of magnitude of 10^75 to 10^100. Also, gcd(p-1, q-1) should be small.
Diffie-Hellman Key Exchange

Diffie-Hellman key exchange (D-H) is a cryptographic protocol that allows two parties that have no prior knowledge of each other to jointly establish a shared secret key over an insecure communications channel. This key can then be used to encrypt subsequent communications using a symmetric-key cipher. The D-H algorithm depends for its effectiveness on the difficulty of computing discrete logarithms.
First, a primitive root of a prime number p can be defined as one whose powers generate all the integers from 1 to p-1. If a is a primitive root of the prime number p, then the numbers a mod p, a^2 mod p, ..., a^(p-1) mod p are distinct and consist of the integers from 1 through p-1 in some permutation.
For any integer b and a primitive root a of a prime number p, we can find a unique exponent i such that b ≡ a^i (mod p), where 0 ≤ i ≤ p-1. The exponent i is referred to as the discrete logarithm of b for the base a, mod p. We express this value as dloga,p(b). The algorithm is summarized below:
For this scheme, there are two publicly known numbers: a prime number q and an integer α that is a primitive root of q. Suppose the users A and B wish to exchange a key. User A selects a random integer XA < q and computes YA = α^XA mod q. Similarly, user B independently selects a random integer XB < q and computes YB = α^XB mod q. Each side keeps the X value private and makes the Y value available publicly to the other side. User A computes the key as K = (YB)^XA mod q and user B computes the key as K = (YA)^XB mod q. These two calculations produce identical results.
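This exchange translates directly into Python. The sketch below uses the small parameters of exercise 14 at the end of this module (q = 83, α = 5, XA = 6, XB = 10) as a test case:

# Diffie-Hellman exchange with the parameters of exercise 14 below.
q, alpha = 83, 5

XA, XB = 6, 10                 # private values, kept secret
YA = pow(alpha, XA, q)         # A publishes YA = alpha^XA mod q -> 21
YB = pow(alpha, XB, q)         # B publishes YB = alpha^XB mod q -> 11

K_A = pow(YB, XA, q)           # A computes K = YB^XA mod q
K_B = pow(YA, XB, q)           # B computes K = YA^XB mod q
assert K_A == K_B              # both sides arrive at the same secret
print(YA, YB, K_A)             # prints: 21 11 9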
Discrete Log Problem
The (discrete) exponentiation problem is as follows: given a base a, an exponent b and a modulus p, calculate c such that a^b ≡ c (mod p) and 0 ≤ c < p. It turns out that this problem is fairly easy and can be calculated "quickly" using fast exponentiation. The discrete log problem is the inverse problem: given a base a, a result c (0 ≤ c < p) and a modulus p, calculate the exponent b such that a^b ≡ c (mod p). It turns out that no one has found a quick way to solve this problem. With the DLP, if p had 300 digits and Xa and Xb had more than 100 digits, it would take longer than the life of the universe to crack the method.
Examples for the D-H key distribution scheme:
1) Let p = 37 and g = 13.
Let Alice pick a = 10. Alice calculates 13^10 (mod 37), which is 4, and sends that to Bob. Let Bob pick b = 7. Bob calculates 13^7 (mod 37), which is 32, and sends that to Alice. (Note: 10 and 7 are secret to Alice and Bob, respectively, but both 4 and 32 are known by all.)
Alice computes 32^10 (mod 37), which is 30: the secret key.
Bob computes 4^7 (mod 37), which is 30: the same secret key.
2) Let p = 47 and g = 5. Let Alice pick a = 18. Alice calculates 5^18 (mod 47), which is 2, and sends that to Bob. Let Bob pick b = 22. Bob calculates 5^22 (mod 47), which is 28, and sends that to Alice.
Alice computes 28^18 (mod 47), which is 24: the secret key.
Bob computes 2^22 (mod 47), which is 24: the same secret key.
Man-in-the-Middle Attack on the D-H Protocol

Suppose Alice and Bob wish to exchange keys, and Darth is the adversary. The attack proceeds as follows:
1. Darth prepares for the attack by generating two random private keys XD1 and XD2 and then computing the corresponding public keys YD1 and YD2.
2. Alice transmits YA to Bob.
3. Darth intercepts YA and transmits YD1 to Bob. Darth also calculates K2 = (YA)^XD2 mod q.
4. Bob receives YD1 and calculates K1 = (YD1)^XB mod q.
5. Bob transmits YB to Alice.
6. Darth intercepts YB and transmits YD2 to Alice. Darth calculates K1 = (YB)^XD1 mod q.
7. Alice receives YD2 and calculates K2 = (YD2)^XA mod q.
At this point, Bob and Alice think that they share a secret key, but instead Bob and Darth share secret key K1 and Alice and Darth share secret key K2. All future communication between Bob and Alice is compromised in the following way:
1. Alice sends an encrypted message M: E(K2, M).
2. Darth intercepts the encrypted message and decrypts it to recover M.
3. Darth sends Bob E(K1, M) or E(K1, M'), where M' is any message. In the first case, Darth simply wants to eavesdrop on the communication without altering it. In the second case, Darth wants to modify the message going to Bob.
The key exchange protocol is vulnerable to such an attack because it does not authenticate the participants. This vulnerability can be overcome with the use of digital signatures and public-key certificates.
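The numbered steps can be replayed with small numbers. The sketch below reuses the toy parameters q = 83 and α = 5 from exercise 14 at the end of this module and assumes private values for all three parties; it ends with Darth holding both session keys.

# Replay of the man-in-the-middle steps with small assumed numbers.
q, alpha = 83, 5

XA, XB = 6, 10                    # Alice's and Bob's private values
XD1, XD2 = 9, 14                  # step 1: Darth's two private keys (assumed)
YD1, YD2 = pow(alpha, XD1, q), pow(alpha, XD2, q)

YA = pow(alpha, XA, q)            # step 2: Alice sends YA (intercepted)
K2_darth = pow(YA, XD2, q)        # step 3: Darth forwards YD1, computes K2
K1_bob = pow(YD1, XB, q)          # step 4: Bob keys against Darth's YD1

YB = pow(alpha, XB, q)            # step 5: Bob sends YB (intercepted)
K1_darth = pow(YB, XD1, q)        # step 6: Darth forwards YD2, computes K1
K2_alice = pow(YD2, XA, q)        # step 7: Alice keys against Darth's YD2

assert K1_bob == K1_darth         # Darth shares K1 with Bob
assert K2_alice == K2_darth       # Darth shares K2 with Alice
print(K1_bob, K2_alice)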
Authentication Requirements
In the context of communications across a network, the following eight attacks can be identified:
1. Disclosure
2. Traffic analysis
3. Masquerade
4. Content modification
5. Sequence modification
6. Timing modification
7. Source repudiation
8. Destination repudiation
1-2: message confidentiality
3-6: message authentication (here we are not directly worried about confidentiality; the message may be seen by others)
7: digital signature
8: combination of digital signature and protocol
(Note: a DoS attack is not listed here, as it is not directly related to a message but affects system availability as a whole.)
Message Authentication

Message authentication is a procedure to verify that received messages come from the alleged source and have not been altered. Message authentication may also verify sequencing and timeliness. It is intended to counter attacks such as content modification, sequence modification, timing modification and repudiation. For repudiation, the concept of digital signatures is used as the countermeasure. There are three classes of functions that may be used to produce an authenticator. They are:
 Message encryption: the ciphertext itself serves as the authenticator.
 Message authentication code (MAC): a public function of the message and a secret key producing a fixed-length value that serves as the authenticator. This does not provide a digital signature, because A and B share the same key.
 Hash function: a public function mapping an arbitrary-length message into a fixed-length hash value that serves as the authenticator. This does not provide a digital signature, because there is no key.
MESSAGE ENCRYPTION:
Message encryption by itself can provide a measure of authentication. The analysis differs for conventional and public-key encryption schemes. The message must have come from the sender, because the ciphertext can be decrypted using his (secret or public) key. Also, none of the bits in the message have been altered, because an opponent does not know how to manipulate the bits of the ciphertext to induce meaningful changes to the plaintext. Often one needs alternative authentication schemes beyond just encrypting the message:
 Sometimes one needs to avoid encryption of full messages due to legal requirements.
 Encryption and authentication may be separated in the system architecture.
The different ways in which message encryption can provide authentication and confidentiality in both symmetric and asymmetric encryption techniques are explained in the table below:
MESSAGE AUTHENTICATION CODE

An alternative authentication technique involves the use of a secret key to generate a small fixed-size block of data, known as a cryptographic checksum or MAC, which is appended to the message. This technique assumes that the two communicating parties, say A and B, share a common secret key K. When A has a message to send to B, it calculates the MAC as a function C of the key and the message: MAC = CK(M). The message and the MAC are transmitted to the intended recipient, who, upon receiving them, performs the same calculation on the received message, using the same secret key, to generate a new MAC. The received MAC is compared to the calculated MAC, and only if they match:
1. The receiver is assured that the message has not been altered: had any alterations been made, the MACs would not match.
2. The receiver is assured that the message is from the alleged sender: no one except the sender has the secret key and could prepare a message with a proper MAC.
3. If the message includes a sequence number, then the receiver is assured of proper sequence, as an attacker cannot successfully alter the sequence number.
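As a concrete illustration, Python's standard hmac and hashlib modules provide a keyed MAC of exactly this shape (HMAC, a specific MAC construction, standing in here for the generic CK(M) of the notes):

import hmac, hashlib

key = b"shared-secret-key"                 # K, known only to A and B
message = b"Transfer 100 to account 42"    # M

# Sender A computes MAC = C_K(M) and transmits (message, mac).
mac = hmac.new(key, message, hashlib.sha256).digest()

# Receiver B recomputes the MAC over the received message and compares.
expected = hmac.new(key, message, hashlib.sha256).digest()
print(hmac.compare_digest(mac, expected))  # True; any alteration gives False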
Basic uses of the Message Authentication Code (MAC) are shown in the figure:
There are three different situations where the use of a MAC is desirable:
 If a message is broadcast to several destinations in a network (such as from a military control center), then it is cheaper and more reliable to have just one node responsible for evaluating authenticity; the message is sent in plaintext with an attached authenticator.
 If one side has a heavy load, it cannot afford to decrypt all messages; it will just check the authenticity of some randomly selected messages.
 Authentication of computer programs in plaintext is a very attractive service, as the programs need not be decrypted every time, which would waste processor resources. The integrity of a program can always be checked by the MAC.
MESSAGE AUTHENTICATION CODE BASED ON DES

The Data Authentication Algorithm, based on DES, has been one of the most widely used MACs for a number of years. The algorithm is both a FIPS publication (FIPS PUB 113) and an ANSI standard (X9.17). However, security weaknesses in this algorithm have been discovered, and it is being replaced by newer and stronger algorithms. The algorithm can be defined as using the cipher block chaining (CBC) mode of operation of DES, shown below, with an initialization vector of zero.
The data (e.g., message, record, file, or program) to be authenticated are grouped into contiguous 64-bit blocks: D1, D2, ..., DN. If necessary, the final block is padded on the right with zeroes to form a full 64-bit block. Using the DES encryption algorithm E and a secret key K, a data authentication code (DAC) is calculated as follows: O1 = E(K, D1), then Oi = E(K, Di ⊕ Oi-1) for i = 2, ..., N.
The DAC consists of either the entire block ON or the leftmost M bits of the block, with 16 ≤ M ≤ 64.
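The chaining computation can be sketched structurally. Since a full DES implementation is out of scope here, the sketch below substitutes a keyed, truncated SHA-256 function for the DES encryption E(K, ·), an assumption made purely to show the data flow (XOR each data block with the previous output, then encrypt); the real DAC uses DES at that point.

import hashlib

BLOCK = 8  # 64-bit blocks, as in the DES-based DAC

def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for DES encryption: a keyed function truncated to 64 bits.
    (Illustrative assumption only; the real DAC uses DES here.)"""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def dac(key: bytes, data: bytes, m_bits: int = 64) -> bytes:
    # Pad the final block on the right with zeroes to a full 64-bit block.
    if len(data) % BLOCK:
        data += b"\x00" * (BLOCK - len(data) % BLOCK)
    o = b"\x00" * BLOCK                           # initialization vector of zero
    for i in range(0, len(data), BLOCK):
        d_i = data[i:i + BLOCK]
        x = bytes(a ^ b for a, b in zip(d_i, o))  # D_i XOR O_(i-1)
        o = E(key, x)                             # O_i = E(K, D_i XOR O_(i-1))
    return o[: m_bits // 8]                       # leftmost M bits of O_N

print(dac(b"secret", b"message to authenticate").hex())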
The use of a MAC needs a shared secret key between the communicating parties, and a MAC does not provide a digital signature. The following table summarizes the confidentiality and authentication implications of the approaches shown above.
HASH FUNCTION

A variation on the message authentication code is the one-way hash function. As with the message authentication code, the hash function accepts a variable-size message M as input and produces a fixed-size hash code H(M), sometimes called a message digest, as output. The hash code is a function of all bits of the message and provides an error-detection capability: a change to any bit or bits in the message results in a change to the hash code. A variety of ways in which a hash code can be used to provide message authentication is shown below and explained stepwise in the table.
In cases where confidentiality is not required, methods b and c have an advantage over those that encrypt the entire message, in that less computation is required. The growing interest in techniques that avoid encryption is due to reasons such as: encryption software is quite slow and may be covered by patents; encryption hardware costs are not negligible; and the algorithms are subject to U.S. export control. A fixed-length hash value h is generated by a function H that takes as input a message of arbitrary length: h = H(M).
 A sends M and H(M).
 B authenticates the message by computing H(M) and checking the match.
Requirements for a hash function: the purpose of a hash function is to produce a “fingerprint” of a file, message, or other block of data. To be used for message authentication, the hash function H must have the following properties:
 H can be applied to a message of any size.
 H produces a fixed-length output.
 It is computationally easy to compute H(M) for any given M.
 It is computationally infeasible to find M such that H(M) = h for a given h; this is referred to as the one-way property.
 It is computationally infeasible to find M’ such that H(M’) = H(M) for a given M; this is referred to as weak collision resistance.
 It is computationally infeasible to find any pair M, M’ with H(M) = H(M’) (to resist birthday attacks); this is referred to as strong collision resistance.
Examples of simple hash functions are:
 Bit-by-bit XOR of plaintext blocks: h = D1 ⊕ D2 ⊕ … ⊕ DN
 Rotated XOR: before each block is XOR-ed in, the running hash value is rotated to the left by 1 bit
 The cipher block chaining technique without a secret key
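The first two simple hash functions can be written out directly; the sketch below assumes 8-bit blocks for readability.

def xor_hash(data: bytes) -> int:
    """Bit-by-bit XOR of all blocks: h = D1 ^ D2 ^ ... ^ DN (8-bit blocks)."""
    h = 0
    for byte in data:
        h ^= byte
    return h

def rotated_xor_hash(data: bytes) -> int:
    """Rotate the running hash left by 1 bit before XOR-ing in each block."""
    h = 0
    for byte in data:
        h = ((h << 1) | (h >> 7)) & 0xFF  # 8-bit left rotation
        h ^= byte
    return h

msg = b"hash me"
print(hex(xor_hash(msg)), hex(rotated_xor_hash(msg)))
# Note how weak these are: xor_hash(b"ab") == xor_hash(b"ba"),
# which is exactly why real systems use cryptographic hash functions.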
Digital signature:
➢ It is an authentication mechanism that allows the sender to attach an electronic code
with the message. This electronic code acts as the signature of the sender and hence, is
named digital signature.
➢ It is done to ensure its authenticity and integrity.
➢ Digital signatures use the public-key cryptography technique. The sender uses his or her private key and a signing algorithm to create a digital signature, and the signed document can be made public. The receiver uses the public key of the sender and a verifying algorithm to verify the digital signature.
➢ A normal message authentication scheme protects the two communicating parties
against attacks from a third party (intruder). However, a secure digital signature
scheme protects the two parties against each other also.
➢ Suppose A wants to send a signed message (message with A's digital signature) to B
through a network. For this, A encrypts the message using his or her private key, which
results in a signed message. The signed message is then sent through the network to B.
➢ Now, B attempts to decrypt the received message using A's public key in order to
verify that the received message has really come from A.
➢ If the message gets decrypted, B can believe that the message is from A. However, if the message or the digital signature has been modified during transmission, it cannot be decrypted using A's public key. From this, B can conclude that either the message transmission has been tampered with, or the message has not been generated by A.
Message integrity:
➢ Digital signatures also provide message integrity.
➢ If a message has a digital signature, then any change in the message after the
signature is attached will invalidate the signature.
➢ That is, it is not possible to get the same signature if the message is changed.
Moreover, there is no efficient way to modify a message and its signature such that a
new message with a valid signature is produced.
Non-repudiation:
➢ Digital signatures also ensure non-repudiation.
➢ For example, if A has sent a signed message to B, then in future A cannot deny about
the sending of the message. B can keep a copy of the message along with A's signature.
➢ In case A denies it, B can use A’s public key to recover the original message. If the newly recovered message is the same as that initially sent by A, it is proved that the message has been sent by A only.
In the same way, B can never create a forged message bearing A's digital signature,
because only A can create his or her digital signatures with the help of that private key.
Message confidentiality:
➢ Digital signatures do not provide message confidentiality, because anyone knowing
the sender's public key can decrypt the message.
Digital signature process:

The digital signature process is shown in the figure. Suppose user A wants to send a signed message to B through a network. To achieve this communication, these steps are followed:
➢ A uses his private key (EA), applied to a signing algorithm, to sign the message (M).
➢ The message (M), along with A's digital signature (S), is sent to B.
➢ On receiving the message (M) and the signature (S), B uses A's public key (DA), applied to the verifying algorithm, to verify the authenticity of the message. If the message is authentic, B accepts it; otherwise it is rejected.
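A hash-then-sign version of these steps can be sketched with the toy RSA pair computed earlier (KU = {7, 187}, KR = {23, 187}). Reducing the SHA-256 digest mod n so that it fits one toy RSA block is an assumption for illustration only; a real scheme uses proper padding and full-size keys.

import hashlib

KU = (7, 187)    # A's public key (used by B to verify)
KR = (23, 187)   # A's private key (used by A to sign)

def digest(message: bytes, n: int) -> int:
    # Toy digest: reduce SHA-256 mod n so it fits one RSA block (assumption).
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(KR, message: bytes) -> int:
    d, n = KR
    return pow(digest(message, n), d, n)       # S = H(M)^d mod n

def verify(KU, message: bytes, S: int) -> bool:
    e, n = KU
    return pow(S, e, n) == digest(message, n)  # does S^e mod n equal H(M)?

M = b"I, A, owe B 100 rupees"
S = sign(KR, M)
print(verify(KU, M, S))          # True: signature checks out
print(verify(KU, M + b"0", S))   # False (with overwhelming probability):
                                 # any change to M invalidates the signature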
Module – 5
1) Compare symmetric and asymmetric keys in Cryptography.
2) Suppose Alice and Bob wish to exchange keys and Darth is the adversary; illustrate the man-in-the-middle attack for this case.
3) Illustrate the RSA encryption algorithm with the help of an example.
4) Describe briefly the digital signature process and describe how it ensures non-repudiation with a suitable example.
5) Explain the key components of the Diffie-Hellman algorithm.
6) Write a note on Digital Signature.
7) Explain the working of a message authentication code with an example.
8) In a public-key system using RSA, you intercept the ciphertext C =10 sent to a user whose
public key is e=5, n=35. What is the plaintext M?
9) Explain why HMAC is preferred over simple MAC schemes.
10) Give an overview of digital signature.
11) Explain the main requirements of public key cryptography.
12) Compare the security features of MACs and hash functions.
13) Explain the working of a message authentication code with an example.
14) User Alice & Bob exchange the key using Diffie Hellman algorithm. Assume α=5 q=83 XA=6
XB=10. Find YA, YB, K.
15) Comment on the security of HASH functions.
16) Demonstrate the RSA encryption and decryption process with an example that uses small
values for simplicity.
17) Describe the digital signature creation and verification process, and explain how it ensures
authentication and integrity with an example.
18) Outline the Diffie-Hellman key exchange algorithm, including its key parameters and the
steps involved.
19) Write a detailed note on the role of digital signatures in ensuring data integrity and
authenticity.
20) Illustrate the concept of a message authentication code (MAC) with an example of how it
ensures data security.
21) In an RSA system, the intercepted ciphertext is C = 15, and the public key of the user is e = 7, n = 33. Determine the plaintext M.
22) Discuss the advantages of HMAC over traditional MAC schemes, focusing on security and
efficiency.
23) Provide an overview of digital signature standards (DSS) and their applications in secure
communication.
24) List and explain the fundamental principles of public key cryptography, such as key pairs and
asymmetric encryption.
25) Compare the design goals and use cases of MACs and cryptographic hash functions,
highlighting their differences.
26) Describe the working of a keyed-hash message authentication code (HMAC) with an example
to demonstrate its use.
27) Explain how the Elliptic Curve Cryptography (ECC) can be an alternative to RSA for public-key
encryption and digital signatures.
28) Discuss the differences between digital signatures and message authentication codes, with
examples of their applications.