
Error Detection and

Correction

By,
B. R. Chandavarkar,
CSE Dept., NITK, Surathkal

Ref: B. A. Forouzan, 5th Edition


This chapter is divided into five sections.

• The first section introduces types of errors, the concept of


redundancy, and distinguishes between error detection and
correction.
• The second section discusses block coding. It shows how
errors can be detected using block coding and also introduces
the concept of Hamming distance.
• The third section discusses cyclic codes. It discusses a subset
of cyclic codes, CRC, that is very common in the data-link
layer.
• The fourth section discusses checksums. It shows how a
checksum is calculated for a set of data words.
• The fifth section discusses forward error correction. It shows
how Hamming distance can also be used for this purpose.
1. Introduction
1.1 Types of Errors
• Whenever bits flow from one point to another, they are
subject to unpredictable changes because of interference.
This interference can change the shape of the signal.
• The term single-bit error means that only 1 bit of a given
data unit (such as a byte, character, or packet) is changed
from 1 to 0 or from 0 to 1.
• The term burst error means that 2 or more bits in the data
unit have changed from 1 to 0 or from 0 to 1.
• A burst error is more likely to occur than a single-bit error
because the duration of the noise signal is normally longer
than the duration of 1 bit, which means that when noise
affects data, it affects a set of bits.
• The number of bits affected depends on the data rate and
duration of noise.
• For example, if we are sending data at 1 kbps, a noise of
1/100 second can affect 10 bits; if we are sending data at 1
Mbps, the same noise can affect 10,000 bits.
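The arithmetic above is simply data rate times noise duration; a minimal sketch:

```python
def bits_affected(data_rate_bps: float, noise_duration_s: float) -> float:
    """Number of bit times covered by a noise burst of the given duration."""
    return data_rate_bps * noise_duration_s

print(bits_affected(1_000, 1 / 100))      # 1 kbps, 1/100 s of noise -> 10.0 bits
print(bits_affected(1_000_000, 1 / 100))  # 1 Mbps, same noise -> 10000.0 bits
```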
1.2 Redundancy
• The central concept in detecting or correcting errors is
redundancy.
• To be able to detect or correct errors, we need to send some
extra bits with our data. These redundant bits are added by
the sender and removed by the receiver.
• Their presence allows the receiver to detect or correct
corrupted bits.
Redundancy
1.3 Detection versus Correction
• The correction of errors is more difficult than detection.
• In error detection, we are only looking to see if any error has
occurred. The answer is a simple yes or no. We are not even
interested in the number of corrupted bits. A single-bit error
is the same for us as a burst error.
• In error correction, we need to know the exact number of bits
that are corrupted and, more importantly, their location in
the message.
• The number of errors and the size of the message are
important factors. If we need to correct a single error in an 8-
bit data unit, we need to consider eight possible error
locations; if we need to correct two errors in a data unit of
the same size, we need to consider C(8, 2) = 28 (combinations
of 8 positions taken 2 at a time) possibilities.
1.4 Forward Error Correction Versus Retransmission
• There are two main methods of error correction.
• Forward error correction is the process in which the receiver
tries to guess the message by using redundant bits. This is
possible, if the number of errors is small.
• Correction by retransmission is a technique in which the
receiver detects the occurrence of an error and asks the
sender to resend the message. Resending is repeated until a
message arrives that the receiver believes is error-free
(usually, not all errors can be detected).
1.5 Coding
• Redundancy is achieved through various coding schemes.
• The sender adds redundant bits through a process that
creates a relationship between the redundant bits and the
actual data bits.
• The receiver checks the relationships between the two sets of
bits to detect errors.
• The ratio of redundant bits to data bits and the robustness of
the process are important factors in any coding scheme.
• We can divide coding schemes into two broad categories:
block coding and convolution coding.
2. BLOCK CODING
• In block coding, we divide our message into blocks, each of k
bits, called datawords. We add r redundant bits to each block
to make the length n = k + r. The resulting n-bit blocks are
called codewords.
• With k bits, we can create 2^k datawords; with n bits, we can
create 2^n codewords. Since n > k, the number of possible
codewords is larger than the number of possible datawords.
• The block coding process is one-to-one; the same dataword is
always encoded as the same codeword. This means that we
have 2^n − 2^k codewords that are not used. We call these
codewords invalid or illegal.
• The trick in error detection is the existence of these invalid
codes. If the receiver receives an invalid codeword, this
indicates that the data was corrupted during transmission.
Error Detection

Error Correction
How can errors be detected by using block coding?
If the following two conditions are met, the receiver can detect a change
in the original codeword.
1. The receiver has (or can find) a list of valid codewords.
2. The original codeword has changed to an invalid one.
Example:

Assume the sender encodes the dataword 01 as 011 and sends it to the
receiver. Consider the following cases:
1. The receiver receives 011. It is a valid codeword. The receiver extracts
the dataword 01 from it.
2. The codeword is corrupted during transmission, and 111 is received
(the leftmost bit is corrupted). This is not a valid codeword and is
discarded.
3. The codeword is corrupted during transmission, and 000 is received
(the right two bits are corrupted). This is a valid codeword. The
receiver incorrectly extracts the dataword 00. Two corrupted bits
have made the error undetectable.
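The three cases above can be sketched with a lookup table, assuming the even-parity codebook consistent with the example (dataword 01 encoded as 011); the full table here is an assumption:

```python
# Assumed C(3, 2) codebook: each 2-bit dataword gets an even-parity bit.
CODEBOOK = {"00": "000", "01": "011", "10": "101", "11": "110"}
VALID = {cw: dw for dw, cw in CODEBOOK.items()}  # reverse lookup at the receiver

def receive(codeword: str):
    """Return the extracted dataword, or None if the codeword is invalid."""
    return VALID.get(codeword)  # None -> discard

print(receive("011"))  # case 1: valid, extracts '01'
print(receive("111"))  # case 2: invalid -> None (discarded)
print(receive("000"))  # case 3: valid but WRONG dataword '00' -- error undetected
```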
Hamming Distance
• One of the central concepts in coding for error control is the
idea of the Hamming distance.
• The Hamming distance between two words (of the same size)
is the number of differences between the corresponding bits.
• We show the Hamming distance between two words x and y
as d(x, y).
• We may wonder why Hamming distance is important for
error detection. The reason is that the Hamming distance
between the received codeword and the sent codeword is the
number of bits that are corrupted during transmission. For
example, if the codeword 00000 is sent and 01101 is received,
3 bits are in error and the Hamming distance between the
two is d(00000, 01101) = 3.
• In other words, if the Hamming distance between the sent
and the received codeword is not zero, the codeword has been
corrupted during transmission.
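A one-line sketch of d(x, y) as defined above:

```python
def hamming(x: str, y: str) -> int:
    """Hamming distance d(x, y): number of positions where the bits differ."""
    assert len(x) == len(y), "Hamming distance is defined for words of the same size"
    return sum(a != b for a, b in zip(x, y))

print(hamming("00000", "01101"))  # d(00000, 01101) = 3
```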
Minimum Hamming Distance
• Although the concept of the Hamming distance is the central
point in dealing with error detection and correction codes,
the measurement that is used for designing a code is the
minimum Hamming distance.
• In a set of words, the minimum Hamming distance is the
smallest Hamming distance between all possible pairs.
• We use dmin to define the minimum Hamming distance in a
coding scheme. To find this value, we find the Hamming
distances between all words and select the smallest one.
• Example: Consider the code with codewords 000, 011, 101,
and 110 (the 2-bit datawords with an even-parity bit
appended). The dmin in this case is 2.
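Finding dmin by checking all pairs, as described above, can be sketched as (the four codewords below are the assumed even-parity example):

```python
from itertools import combinations

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def d_min(codewords):
    """Smallest Hamming distance over all possible pairs of codewords."""
    return min(hamming(x, y) for x, y in combinations(codewords, 2))

print(d_min(["000", "011", "101", "110"]))  # -> 2
```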


Three Parameters of Block Coding Scheme
• Before we continue with our discussion, we need to mention
that any coding scheme needs to have at least three
parameters: the codeword size n, the dataword size k, and
the minimum Hamming distance dmin.
• A coding scheme C is written as C(n, k) with a separate
expression for dmin.

• For example, we can call our first coding scheme C(3, 2) with
dmin = 2 and our second coding scheme C(5, 2) with
dmin = 3.
Minimum Distance for Error Detection
• Now let us find the minimum Hamming distance in a code if
we want to be able to detect up to s errors.
• If s errors occur during transmission, the Hamming distance
between the sent codeword and received codeword is s.
• If our system is to detect up to s errors, the minimum
distance between the valid codes must be (s + 1), so that the
received codeword does not match a valid codeword.
• In other words, if the minimum distance between all valid
codewords is (s + 1), the received codeword cannot be
erroneously mistaken for another codeword. The error will
be detected.
• We need to clarify a point here: Although a code with dmin =
s + 1 may be able to detect more than s errors in some
special cases, only s or fewer errors are guaranteed to be
detected.
Minimum Distance for Error Correction
• Error correction is more complex than error detection; a decision is
involved.
• When a received codeword is not a valid codeword, the receiver
needs to decide which valid codeword was actually sent.
• The decision is based on the concept of territory, an exclusive area
surrounding the codeword. Each valid codeword has its own
territory.
• We use a geometric approach to define each territory. We assume
that each valid codeword has a circular territory with a radius of t
and that the valid codeword is at the center.
• For example, suppose a codeword x is corrupted by t bits or less.
Then this corrupted codeword is located either inside or on the
perimeter of this circle.
• If the receiver receives a codeword that belongs to this territory, it
decides that the original codeword is the one at the center.
• Note that we assume that only up to t errors have occurred;
otherwise, the decision is wrong.
It can be shown that to correct t errors, we need to have dmin = 2t + 1.


2.1 Linear Block Codes
• Almost all block codes used today belong to a subset of block
codes called linear block codes.
• A linear block code is a code in which the exclusive OR
(addition modulo-2) of two valid codewords creates another
valid codeword.
• It is simple to find the minimum Hamming distance for a
linear block code. The minimum Hamming distance is the
number of 1s in the nonzero valid codeword with the
smallest number of 1s.
• Example:

• Examples of linear block codes: the simple parity-check code
and the Hamming code.
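Both properties above can be checked directly on the assumed even-parity example code (000, 011, 101, 110): closure under XOR, and dmin equal to the smallest nonzero weight.

```python
def weight(cw: str) -> int:
    """Number of 1s in a codeword."""
    return cw.count("1")

def xor(x: str, y: str) -> str:
    """Bitwise exclusive OR (addition modulo 2) of two codewords."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

code = ["000", "011", "101", "110"]  # assumed even-parity example code

# Linearity: the XOR of any two valid codewords is another valid codeword.
assert all(xor(x, y) in code for x in code for y in code)

# For a linear code, dmin is the smallest weight among nonzero codewords.
print(min(weight(c) for c in code if c != "000"))  # -> 2
```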
Parity-Check Code
• Perhaps the most familiar error-detecting code is the parity-
check code.
• This code is a linear block code.
• In this code, a k-bit dataword is changed to an n-bit
codeword where n = k + 1. The extra bit, called the parity bit,
is selected to make the total number of 1s in the codeword
even.
• The minimum Hamming distance for this category is dmin =
2, which means that the code is a single-bit error-detecting
code.
• Example:

• The simple parity check, guaranteed to detect a single
error, can also detect any odd number of errors.
• At the sender: r0 = a3 + a2 + a1 + a0 (modulo-2)
• At the receiver, the syndrome: s0 = b3 + b2 + b1 + b0 + q0 (modulo-2);
s0 = 0 means no error is detected.
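The two equations above can be sketched as an encoder/checker pair for a 4-bit dataword:

```python
def parity_encode(dataword: str) -> str:
    """Append even-parity bit r0 = a3 + a2 + a1 + a0 (modulo 2)."""
    r0 = sum(map(int, dataword)) % 2
    return dataword + str(r0)

def parity_check(codeword: str) -> bool:
    """Syndrome s0 = b3 + b2 + b1 + b0 + q0 (modulo 2); accept if s0 == 0."""
    return sum(map(int, codeword)) % 2 == 0

cw = parity_encode("1011")    # three 1s -> parity bit 1 -> '10111'
print(cw, parity_check(cw))   # accepted
print(parity_check("10101"))  # single-bit error -> odd parity -> rejected
```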
• A better approach is the two-dimensional parity check.
• The two-dimensional parity check can detect up to three
errors that occur anywhere in the table (arrows point to the
locations of the created nonzero syndromes).
• However, errors affecting 4 bits may not be detected.
Hamming Codes
• Now let us discuss a category of error-correcting codes called
Hamming codes. These codes were originally designed with
dmin = 3, which means that they can detect up to two errors
or correct one single error, although there are some
Hamming codes that can correct more than one error.
• First let us find the relationship between n and k in a
Hamming code. We need to choose an integer m >= 3. The
values of n and k are then calculated from m as n = 2^m − 1
and k = n − m. The number of check bits r = m.
• Let us trace the path of three datawords from the sender to
the destination:
• 1. The dataword 0100 becomes the codeword 0100011.
The codeword 0100011 is received. The syndrome is
000, the final dataword is 0100.
• 2. The dataword 0111 becomes the codeword 0111001.
The syndrome is 011. After flipping b2 (changing the 1
to 0), the final dataword is 0111.
• 3. The dataword 1101 becomes the codeword 1101000.
The syndrome is 101. After flipping b0, we get 0000,
the wrong dataword. This shows that our code cannot
correct two errors.
The key to the Hamming Code is the use of extra parity bits to allow the
identification of a single error. Create the code word as follows:
• Mark all bit positions that are powers of two as parity bits. (positions 1,
2, 4, 8, 16, 32, 64, etc.)
• All other bit positions are for the data to be encoded. (positions 3, 5, 6, 7,
9, 10, 11, 12, 13, 14, 15, 17, etc.)
• Each parity bit calculates the parity for some of the bits in the code word.
The position of the parity bit determines the sequence of bits that it
alternately checks and skips.
Position 1: check 1 bit, skip 1 bit, check 1 bit, skip 1 bit, etc.
(1,3,5,7,9,11,13,15,...)
Position 2: check 2 bits, skip 2 bits, check 2 bits, skip 2 bits, etc.
(2,3,6,7,10,11,14,15,...)
Position 4: check 4 bits, skip 4 bits, check 4 bits, skip 4 bits, etc.
(4,5,6,7,12,13,14,15,20,21,22,23,...)
Position 8: check 8 bits, skip 8 bits, check 8 bits, skip 8 bits, etc. (8-
15,24-31,40-47,...)
Position 16: check 16 bits, skip 16 bits, check 16 bits, skip 16 bits, etc.
(16-31,48-63,80-95,...)
Position 32: check 32 bits, skip 32 bits, check 32 bits, skip 32 bits, etc.
(32-63,96-127,160-191,...)
etc.
• Set a parity bit to 1 if the total number of ones in the positions it checks
is odd. Set a parity bit to 0 if the total number of ones in the positions it
checks is even.
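The power-of-two construction above can be sketched as follows. Note this positional layout differs from the bit ordering used in the earlier Forouzan-style examples (where the codeword for 0100 was 0100011), so the outputs will not match those slides; the sketch only illustrates the check/skip rule.

```python
def hamming_encode(databits: str) -> str:
    """Place data bits in non-power-of-two positions (1-indexed) and set each
    parity bit at position 2^i to the even parity of all positions whose
    index has bit 2^i set -- the check/skip pattern described above."""
    n_parity = 0
    while (len(databits) + n_parity + 1) > (1 << n_parity):
        n_parity += 1
    n = len(databits) + n_parity
    code = [0] * (n + 1)            # 1-indexed; slot 0 unused
    it = iter(databits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):         # not a power of two -> data position
            code[pos] = int(next(it))
    for i in range(n_parity):
        p = 1 << i
        code[p] = sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2
    return "".join(map(str, code[1:]))

def syndrome(codeword: str) -> int:
    """Sum of the positions of failed parity checks: 0 if clean, otherwise
    the 1-indexed position of a single-bit error."""
    bits = [0] + [int(b) for b in codeword]
    n, s, i = len(codeword), 0, 0
    while (1 << i) <= n:
        p = 1 << i
        if sum(bits[pos] for pos in range(1, n + 1) if pos & p) % 2:
            s += p
        i += 1
    return s

cw = hamming_encode("1011")   # a (7, 4) codeword
print(cw, syndrome(cw))       # syndrome 0: no error
print(syndrome(cw[:4] + "1" + cw[5:]))  # flip position 5 -> syndrome 5
```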
Burst error correction using Hamming code

CYCLIC CODES
• Cyclic codes are special linear block codes with one extra
property. In a cyclic code, if a codeword is cyclically shifted
(rotated), the result is another codeword.
• For example, if 1011000 is a codeword and we cyclically left-
shift, then 0110001 is also a codeword.
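The cyclic-shift property above in one line:

```python
def rotl(cw: str) -> str:
    """One cyclic left shift (rotation) of a codeword."""
    return cw[1:] + cw[0]

print(rotl("1011000"))  # -> '0110001', also a codeword in a cyclic code
```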
Cyclic Redundancy Check
• We can create cyclic codes to correct errors.
• A subset of cyclic codes, called the cyclic redundancy check
(CRC), is used in networks such as LANs and WANs.
Advantages of Cyclic Codes
• Cyclic codes have a very good performance in detecting
single-bit errors, double errors, an odd number of errors, and
burst errors.
• They can easily be implemented in hardware and software.
• They are especially fast when implemented in hardware.
This has made cyclic codes a good candidate for many
networks.
Figure 10.22 CRC division using polynomials
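The polynomial division behind CRC can be sketched as modulo-2 (XOR) long division; the generator 1011 (x^3 + x + 1) below is an assumed small example:

```python
def crc_remainder(dataword: str, generator: str) -> str:
    """Append len(generator)-1 zeros to the dataword, divide by the generator
    using modulo-2 (XOR) long division, and return the remainder (the CRC)."""
    r = len(generator) - 1
    bits = list(map(int, dataword + "0" * r))
    gen = list(map(int, generator))
    for i in range(len(dataword)):
        if bits[i]:  # leading bit is 1 -> subtract (XOR) the generator here
            for j, g in enumerate(gen):
                bits[i + j] ^= g
    return "".join(map(str, bits[-r:]))

crc = crc_remainder("1001", "1011")
print("1001" + crc)  # transmitted codeword: dataword followed by CRC '110'
```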

Note

A good polynomial generator needs to have the following
characteristics:
1. It should have at least two terms.
2. The coefficient of the term x^0 should be 1.
3. It should not divide x^t + 1, for t between 2 and n − 1.
4. It should have the factor x + 1.

• Which of the following g(x) values guarantees that a single-
bit error is caught ?
(i) x + 1 (ii) x^3 (iii) 1

• Find the status of the following generators related to two
isolated, single-bit errors.
(a) x + 1 (b) x^4 + 1 (c) x^7 + x^6 + 1

(a) This is a very poor choice for a generator. Any two errors next to
each other cannot be detected.
(b) This generator cannot detect two errors that are four positions
apart. The two errors can be anywhere, but if their distance is 4,
they remain undetected.
(c) This is a good choice for this purpose.
CHECKSUM
• Checksum is an error-detecting technique that can be
applied to a message of any length.
• In the Internet, the checksum technique is mostly used at
the network and transport layer rather than the data-link
layer.
• At the source, the message is first divided into m-bit units.
• The generator then creates an extra m-bit unit called the
checksum, which is sent with the message.
• At the destination, the checker creates a new checksum from
the combination of the message and sent checksum. If the
new checksum is all 0s, the message is accepted; otherwise,
the message is discarded.
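The generator/checker procedure above can be sketched with the classic 16-bit one's-complement checksum; the three message words are an assumed example:

```python
def ones_complement_sum(words):
    """One's-complement sum of 16-bit words: carries wrap back into the sum."""
    s = 0
    for w in words:
        s += w
        s = (s & 0xFFFF) + (s >> 16)
    return s

def checksum16(words):
    """The checksum is the complement of the one's-complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF

msg = [0x4500, 0x0030, 0x4422]   # assumed example 16-bit units
cs = checksum16(msg)
check = checksum16(msg + [cs])   # checker recomputes over message + checksum
print(hex(cs), check)            # check == 0 -> message accepted
```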
Convolution Coding

• Convolutional codes work in a fundamentally different


manner in that they operate on data continuously as it is
received (or transmitted) by a network node.
• Consequently, convolutional encoders and decoders can be
constructed with a minimum of circuitry.
• This has meant that they have proved particularly popular
in mobile communications systems where a minimum of
circuitry and consequent light weight are a great advantage.
• How a convolutional code treats a particular group of bits
depends on what has happened to previous groups of bits.
• Convolutional codes are commonly specified by three
parameters (n, k, m):
– n = number of output bits
– k = number of input bits
– m = number of memory registers
• The quantity k/n, called the code rate, is a measure of the
efficiency of the code.
• Commonly, k and n range from 1 to 8, m from 2 to 10, and the
code rate from 1/8 to 7/8, except for deep-space applications,
where code rates as low as 1/100 or even lower have been
employed.
• Often the manufacturers of convolutional code chips specify
the code by the parameters (n, k, L). The quantity L is called
the constraint length of the code and is defined by
– Constraint Length, L = k (m − 1)
• The constraint length L represents the number of bits in the
encoder memory that affect the generation of the n output
bits.
• The convolutional code structure is easy to draw from its
parameters.
• First draw m boxes representing the m memory registers.
• Then draw n modulo-2 adders to represent the n output bits.
• Now connect the memory registers to the adders using the
generator polynomial.
• There are many choices of polynomials for any m-order
code.
• They do not all result in output sequences that have good
error protection properties.
• Peterson and Weldon’s book contains a complete list of these
polynomials. Good polynomials are found from this list
usually by computer simulation.
• A list of good polynomials for rate ½ codes is given below.
• The (2,1,4) code has a constraint length of 3. The shaded
registers below hold these bits.
• The unshaded register holds the incoming bit. This means
that 3 bits, or 8 different combinations of these bits, can be
present in these memory registers. These 8 different
combinations determine what output we will get for v1 and
v2, the coded sequence.
• The number of combinations of bits in the shaded registers
is called the states of the code and is defined by
– Number of states = 2^L
– where L = the constraint length of the code and is equal to k (m − 1).
• Let’s say that we have an input sequence of 1011 and we want
to know what the coded sequence would be. We can calculate
the output by just adding the shifted versions of the
individual impulse responses.
Input bit   Impulse response (shifted to its time slot)
1           11 11 10 11
0              00 00 00 00
1                 11 11 10 11
1                    11 11 10 11
Add (modulo-2) to obtain the response:
________________________________
1011  →  11 11 01 11 01 01 11
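The same computation can be sketched as a sliding-window encoder. The tap sets g1 and g2 below are inferred from the impulse response 11 11 10 11 shown above (an assumption, not a stated specification):

```python
def conv_encode(bits, g1=(1, 1, 1, 1), g2=(1, 1, 0, 1)):
    """Rate-1/2 convolutional encoder: for each input bit, shift it into the
    window and emit two modulo-2 tap sums v1, v2."""
    m = len(g1)
    state = [0] * m                       # window: current bit + memory bits
    out = []
    for b in list(bits) + [0] * (m - 1):  # trailing zeros flush the memory
        state = [int(b)] + state[:-1]
        v1 = sum(s * g for s, g in zip(state, g1)) % 2
        v2 = sum(s * g for s, g in zip(state, g2)) % 2
        out.append(f"{v1}{v2}")
    return " ".join(out)

print(conv_encode("1011"))  # -> '11 11 01 11 01 01 11'
```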
Encoding (Sender)
• Graphically, there are three ways in which we can look at
the encoder to gain better understanding of its operation.
These are:
– State Diagram
– Tree Diagram
– Trellis Diagram
State Diagram
Tree Diagram
Trellis Diagram
• Trellis diagrams are messy but generally preferred over both
the tree and the state diagrams because they represent
linear time sequencing of events.
• The x-axis is discrete time and all possible states are shown
on the y-axis.
• We move horizontally through the trellis with the passage of
time.
• Each transition means new bits have arrived.
• Encoding 1010 using (2, 1, 4)
Decoding (Receiver)

• There are several different approaches to decoding of


convolutional codes. These are grouped in two basic
categories.
– Sequential Decoding
• Fano algorithm
– Maximum likelihood decoding
• Viterbi decoding
A. Sequential Decoding

• Decoding of 11 11 01 11 01 01 11 received as 01 11 01 11 01
01 11 using sequential decoding

B. Maximum likelihood decoding
• The original slides walk through the decoding of the received
sequence over the trellis in seven steps (Steps 1–7); the step
figures are omitted here.
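The idea of maximum likelihood decoding can be sketched by brute force: encode every possible input and pick the one whose codeword is closest to the received sequence (this is what Viterbi computes efficiently over the trellis). The encoder taps are the same assumed (g1, g2) inferred from the impulse response earlier:

```python
from itertools import product

def conv_encode(bits, g1=(1, 1, 1, 1), g2=(1, 1, 0, 1)):
    """Same assumed rate-1/2 encoder as before, returning a flat bit list."""
    m = len(g1)
    state = [0] * m
    out = []
    for b in list(bits) + [0] * (m - 1):
        state = [int(b)] + state[:-1]
        out.append(sum(s * g for s, g in zip(state, g1)) % 2)
        out.append(sum(s * g for s, g in zip(state, g2)) % 2)
    return out

def ml_decode(received, k=4):
    """Exhaustive maximum likelihood decoding: choose the k-bit input whose
    codeword has the smallest Hamming distance to the received sequence."""
    best = min(product("01", repeat=k),
               key=lambda u: sum(a != b for a, b in zip(conv_encode(u), received)))
    return "".join(best)

# 11 11 01 11 01 01 11 was sent; the first bit was corrupted in transit:
received = [int(b) for b in "01110111010111"]
print(ml_decode(received))  # recovers '1011'
```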
