
Error Correction

The document discusses error control techniques used in digital communications and data storage. It introduces several concepts:
- Error detection detects errors but does not correct them, while error correction detects and fixes errors.
- Parity bits provide simple error detection by checking whether the total number of 1s in a data word is even or odd. Row-column parity interleaves data bits to improve detection of burst errors.
- The Hamming distance and minimum distance of an error-correcting code determine its ability to detect and correct errors. Hamming codes provide an efficient way to add the minimum number of check bits needed to correct single-bit errors.

Uploaded by

Srini Vasulu
Copyright
© Attribution Non-Commercial (BY-NC)

ELE4607 Advanced Digital Communications

Module 9: Error Control

Error Control
Used in communications links, error-correcting memories, magnetic disks (RAID disk arrays), CDs, spacecraft, real-time video/audio (VIP, VoIP), tape backup.

Error detection: merely detect when an error has occurred.
Error correction: detect and correct the error.
Correction is much harder!

RAID = Redundant Array of Inexpensive Disks
VIP = Video over Internet Protocol
VoIP = Voice over Internet Protocol
DVD = Digital Video Disk
ECM = Error Correcting Memory

Error Control
If we can detect an error in a data communications link, it may be possible to request re-transmission. ARQ = Automatic Retransmission reQuest.

ARQ is not suitable in some applications: isochronous (real-time) transmission, e.g. speech/video, and storage applications.

Error Detection
Idea: from a received message, we can tell if part of the information has been corrupted.

Analogy 1: Each lighthouse on the coastline has a unique rotation speed, so ships can tell where they are. Lighthouse rotation speeds are allocated so that nearby ones have quite different speeds (in case of timing errors).

Analogy 2: Stating a date such as "Friday, Feb 22" is inherently error-checking, since (for a given year) the day/month/date must be consistent with the calendar. It can also be error-correcting, if we assume the sender meant a nearby month.

Error Control
Basic approach (for storage or transmission):
1. Split data into chunks (8-bit bytes, 16-bit words, or longer data frames).
2. Append check bits to the frame.

Check bits are redundant in that they do not convey new information.
In general, error correction requires a more sophisticated algorithm and more check bits.

Error Control
Obviously want to:
Minimize the number of check bits, and
Maximize the probability of detecting an error if one occurs.

Need to create check bits in such a way that they are error-protected themselves ("check the check bits").
Note that most errors occur in bursts (not single-bit errors).

Error Rate

Need a high probability of detection of errors.

Example: An Ethernet data link transmits at 10 Mbps = $10 \times 10^{6}$ bps. Suppose the probability of a single-bit error is 1 in 100 million, i.e. $\frac{1}{10^{8}}$. On average we would expect

$$\frac{1\ \text{error}}{10^{8}\ \text{bits}} \times \frac{10 \times 10^{6}\ \text{bits}}{\text{sec}} = 0.1\ \text{error/second}$$

This is equivalent to 1 error every 10 seconds.
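The arithmetic above can be sketched in C as follows. This is only an illustration; the function names are hypothetical and not part of the course material.

```c
/* Expected errors per second = bit rate (bits/s) x bit error probability. */
double expected_errors_per_second(double bit_rate_bps, double bit_error_prob) {
    return bit_rate_bps * bit_error_prob;
}

/* Mean time between errors in seconds (the inverse of the rate above). */
double seconds_per_error(double bit_rate_bps, double bit_error_prob) {
    return 1.0 / (bit_rate_bps * bit_error_prob);
}
```

For the Ethernet example, `expected_errors_per_second(10e6, 1e-8)` gives approximately 0.1 and `seconds_per_error(10e6, 1e-8)` approximately 10.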

Error Tolerance
Error tolerance: file transfer vs analog coded data. Files must be bit-exact. Analog data may accept some (small) error rate.

Error propagation is a problem if using compression. A single error can affect subsequent bits.

Parity Bits
Parity: exclusive-or gates (modulo-2 arithmetic).
XOR is a programmable inverter, i.e. A controls "invert B" or "don't invert B".
XOR is a difference detector, i.e. if A and B are different, the output is 1 (true).

$$A \oplus B = A\overline{B} + \overline{A}B \qquad (1)$$

A  B  A ⊕ B
0  0    0
0  1    1
1  0    1
1  1    0

[Figure: XOR parity circuit over data bits D3, D2, D1, D0, with an "invert" input and "even"/"odd" labels]

XOR is also used in encryption (DES etc).

Parity can detect single-bit errors.
Parity can't correct the error, as there is no way of knowing which bit was in error.
If two bits are flipped, the error is undetectable.
Prob(error not detected) = 0.5. Poor; not much good for burst errors.
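A minimal sketch of parity generation and checking on an 8-bit word, using XOR exactly as described above. The function names are illustrative assumptions, not from the slides.

```c
#include <stdint.h>

/* XOR of all bits in w: 1 if the number of 1s is odd, 0 if it is even. */
unsigned xor_of_bits(uint8_t w) {
    unsigned p = 0;
    while (w) {
        p ^= (w & 1u);
        w >>= 1;
    }
    return p;
}

/* Even parity: the check bit makes the total number of 1s even. */
unsigned even_parity_bit(uint8_t w) { return xor_of_bits(w); }

/* Odd parity: the check bit makes the total number of 1s odd. */
unsigned odd_parity_bit(uint8_t w) { return xor_of_bits(w) ^ 1u; }

/* Receiver check for even parity: returns 1 if word plus check bit look valid. */
int even_parity_ok(uint8_t w, unsigned check_bit) {
    return (xor_of_bits(w) ^ (check_bit & 1u)) == 0;
}
```

With the word 1011 and its even-parity bit 1, flipping one data bit makes `even_parity_ok` return 0, while flipping two data bits makes it return 1 again, which is the undetectable case noted above.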

Row-Column Parity (Interleaving)


Two-dimensional parity.
Interleave: write data bits in rows, to form a block or matrix.
Calculate parity across each row and down each column.

Row-Column Parity

Example using odd parity. Sending 16 data bits in a block, plus 4 + 4 = 8 check bits (HP = horizontal parity, VP = vertical parity).

      data      HP
    1 1 1 0     0
    0 0 0 0     1
    0 1 1 1     0
    0 0 0 0     1
VP  0 1 1 0

If a single data bit is flipped, the corresponding row and column parity bits will be wrong.

Row-Column Parity
Could correct single-bit errors by intersecting the failing row and column. Since the bit at that intersection is in error, we just need to invert it to restore it.
What if several bits get corrupted (e.g. along the same row)? Could still detect, but not correct.
Efficiency of this scheme is not good, as it has 8/16 or 50% overhead.
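The row/column scheme can be sketched in C for a 4 x 4 block with odd parity. This is an illustration under stated assumptions: each row is stored as a 4-bit value with the leftmost column in bit 3, and all names (`rc_parity`, `rc_correct`, `ROWS`, `COLS`) are hypothetical.

```c
#include <stdint.h>

#define ROWS 4
#define COLS 4

/* Odd-parity check bit: chosen so that data bits plus check bit hold an odd number of 1s. */
static unsigned odd_parity(unsigned bits) {
    unsigned p = 1;
    while (bits) {
        p ^= bits & 1u;
        bits >>= 1;
    }
    return p;
}

/* Compute horizontal (per-row) and vertical (per-column) parity bits. */
void rc_parity(const uint8_t rows[ROWS], uint8_t hp[ROWS], uint8_t vp[COLS]) {
    for (int r = 0; r < ROWS; r++)
        hp[r] = (uint8_t)odd_parity(rows[r]);
    for (int c = 0; c < COLS; c++) {
        unsigned col = 0;
        for (int r = 0; r < ROWS; r++)
            col = (col << 1) | ((rows[r] >> (COLS - 1 - c)) & 1u);
        vp[c] = (uint8_t)odd_parity(col);
    }
}

/* Returns 0 if all parities agree, 1 if a single data bit was located and
   inverted, and -1 if an error is detected but cannot be located. */
int rc_correct(uint8_t rows[ROWS], const uint8_t hp[ROWS], const uint8_t vp[COLS]) {
    uint8_t h[ROWS], v[COLS];
    int bad_r = -1, bad_c = -1, nr = 0, nc = 0;
    rc_parity(rows, h, v);
    for (int r = 0; r < ROWS; r++)
        if (h[r] != hp[r]) { bad_r = r; nr++; }
    for (int c = 0; c < COLS; c++)
        if (v[c] != vp[c]) { bad_c = c; nc++; }
    if (nr == 0 && nc == 0) return 0;
    if (nr == 1 && nc == 1) {
        rows[bad_r] ^= (uint8_t)(1u << (COLS - 1 - bad_c));
        return 1;
    }
    return -1;
}
```

Using the data rows 1110, 0000, 0111, 0000 from the example reproduces HP = 0 1 0 1 and VP = 0 1 1 0. Two flipped bits in the same row leave the row parity unchanged but upset two columns, so the sketch reports -1 (detected, not correctable).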


Concept of Error Distance


Suppose we use a simple repetition code that maps each pair of original bits to a 9-bit codeword, for example:

original bits    coded bit stream
0 0              000 000 000
0 1              000 000 111

Error Distance
Say we want to send 00. Encode as 000 000 000. Suppose this
codeword gets corrupted to 000 000 010 due to a single-bit error.

To the receiver, the obvious assumption is that 000 000 010 should have been 000 000 000, which corresponds to 00 originally.
Therefore the error is both detected and corrected.


Error Distance
Suppose the bit pattern gets corrupted to 000 000 011 due to a 2-bit error burst. The receiver knows an error has occurred, as this is an invalid codeword in our error-control coding system.

However, the receiver's assumption is that this codeword should have been the smallest-distance valid codeword, 000 000 111, that is, the bits 01 in the first place. This is incorrect!

Hamming Distance

This leads to the concept of error distance, commonly called the Hamming distance. $d(x, y)$ is the number of locations in which codewords $x$ and $y$ differ. Example:

x = 0 1 1 0 1 1 0 1
y = 0 1 1 1 0 1 1 1

Hence $d(x, y) = 3$.

The minimum distance $d_{min}$ of a codevector set is the smallest Hamming distance between any two codevectors in the codeset.

Hamming Distance

Example. Suppose the codevectors are

x = 0110
y = 0100
z = 1001

then

$d(x, y) = 1$
$d(x, z) = 4$
$d(y, z) = 3$

Hence $d_{min} = 1$.
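A short C sketch of the two definitions above, assuming codevectors of at most 32 bits are packed into unsigned integers. The function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hamming distance: the number of bit positions in which x and y differ. */
unsigned hamming_distance(uint32_t x, uint32_t y) {
    uint32_t diff = x ^ y;   /* XOR marks the differing positions */
    unsigned d = 0;
    while (diff) {
        d += diff & 1u;
        diff >>= 1;
    }
    return d;
}

/* Minimum distance of a code set: the smallest distance over all pairs.
   Returns 33 (larger than any 32-bit distance) if count < 2. */
unsigned min_distance(const uint32_t *code, size_t count) {
    unsigned best = 33;
    for (size_t i = 0; i < count; i++)
        for (size_t j = i + 1; j < count; j++) {
            unsigned d = hamming_distance(code[i], code[j]);
            if (d < best) best = d;
        }
    return best;
}
```

For x = 0110, y = 0100, z = 1001 this reproduces the distances 1, 4 and 3, and a minimum distance of 1.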


Error detecting/correcting capabilities of a coding system depend on the code's $d_{min}$. If two codewords are a distance $d$ apart, it will require $d$ single-bit errors to convert one into the other (and thus cause false decoding).

To detect $d$ errors, a distance $d + 1$ code is required, because $d$ single-bit errors cannot change a valid codeword into another valid codeword (only into an erroneous codeword, which will be picked up).

To correct $d$ errors, a distance $2d + 1$ code is required, because all legal codewords are now spaced so that even if $d$ changes occur, the closest original codeword can still be deduced.

Put another way, a coding system with a minimum Hamming distance $d_{min}$ can correct up to $d$ errors, provided

$$d_{min} \ge 2d + 1 \qquad (2)$$
$$d \le \tfrac{1}{2}(d_{min} - 1) \qquad (3)$$

The repetition code example had $d_{min} = 3$, the minimum number of bit positions by which codewords differed ($3 = 2d + 1$). So we can correct up to $\tfrac{1}{2}(d_{min} - 1) = \tfrac{1}{2}(3 - 1) = 1$ error, but can detect 2 errors ($3 = d + 1$).
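The two capability rules can be written as a small C sketch, with integer division giving the floor in equation (3). The function names are hypothetical.

```c
/* Largest number of single-bit errors guaranteed to be detected: dmin - 1. */
unsigned max_detectable_errors(unsigned dmin) {
    return dmin > 0 ? dmin - 1 : 0;
}

/* Largest number of single-bit errors guaranteed to be corrected:
   floor((dmin - 1) / 2), from dmin >= 2d + 1. */
unsigned max_correctable_errors(unsigned dmin) {
    return dmin > 0 ? (dmin - 1) / 2 : 0;
}
```

For the repetition code with a minimum distance of 3, these give 2 detectable errors and 1 correctable error.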

Check Bits

For an ideal single-bit error-correcting code, how many check bits do we need? Using the earlier code example to demonstrate, define

m = the number of message or data bits to begin with (m = 2 in the example)
n = the total number of bits in the codeword (n = 9)
c = the number of redundant (or check) bits

Hence $n = c + m$, so $c = n - m = 9 - 2 = 7$.

Check Bits
Number of valid states is $2^{m}$ (here $2^{2} = 4$).
Number of possible codewords is $2^{n}$ (here $2^{9} = 512$).
Number of erroneous states is $2^{n} - 2^{m}$ (here $2^{9} - 2^{2}$).

Substituting $n = m + c$ (total codeword bits = message bits + check bits), the number of erroneous codewords is

$$\begin{aligned}
N_e &= 2^{m+c} - 2^{m} && (4) \\
    &= 2^{m}\,2^{c} - 2^{m} && (5) \\
    &= 2^{m}(2^{c} - 1) && (6)
\end{aligned}$$

Hamming Distance
There are $N_v = 2^{m}$ valid codewords. The ratio of erroneous codewords to valid codewords is

$$\begin{aligned}
\frac{N_e}{N_v} &= \frac{2^{m}(2^{c} - 1)}{2^{m}} && (7) \\
                &= 2^{c} - 1 && (8)
\end{aligned}$$

There are $2^{n}$ possible codeword patterns ($2^{9}$). For single-bit errors, there are $n \cdot 2^{m} = 9 \cdot 2^{2}$ possible error patterns, because a single-bit error is possible in any of the $n$ (= 9) bit positions.

Don't want two single-bit error patterns to be equally close in Hamming distance, otherwise decoding would be ambiguous.

The number of erroneous codewords must be greater than, or equal to, the number of one-bit-incorrect codewords in order to be able to correct one-bit errors.

Each of the $2^{m}$ legal messages has $n$ codewords at a distance 1 from it, i.e. $n \cdot 2^{m}$ in total. So,

$$2^{m}(2^{c} - 1) \ge n \cdot 2^{m} \qquad (9)$$
$$2^{c} - 1 \ge n \qquad (10)$$
$$2^{c} - 1 \ge m + c \qquad (11)$$

Given $m$, the number of data (message) bits, what is the smallest number of redundant check bits required to satisfy this inequality?

# of data bits m    # of check bits c
4                   3
8                   4
16                  5
32                  6

Note the small increase in the number of check bits required as the number of data bits goes up.
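The table can be reproduced by searching for the smallest c that satisfies inequality (11). A minimal sketch, with a hypothetical function name:

```c
/* Smallest number of check bits c with 2^c - 1 >= m + c, as in inequality (11). */
unsigned min_check_bits(unsigned m) {
    unsigned c = 1;
    while (((1ULL << c) - 1ULL) < (unsigned long long)m + c)
        c++;
    return c;
}
```

Calling it with m = 4, 8, 16 and 32 gives 3, 4, 5 and 6 check bits, matching the table.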

Hamming Codes
We know the requirements on such a code (the number of check bits needed), so how do we design this efficient code?

How to generate the code? That is, does a coding system exist with this efficiency, or do we need more check bits than the theoretical minimum? (Recall the earlier module on Entropy & Huffman codes.)
Called the Hamming code (Hamming, 1950).
Incorporated into chips: see data sheets for the Intel 8206, for example.

Hamming Codes
Hamming H(n, m) codes:
n = total length of codevector
m = number of data (message) bits
c = n − m = redundant check bits

It is:
single-error correcting, or
double-error detecting (when used for detection only; a $d_{min} = 3$ code cannot do both at once).

Hamming Codes

Hamming H(7, 4) code. H(n, m) has:

n = 7 = codevector length
m = 4 = message bits
c = n − m = 7 − 4 = 3 = check bits

Hamming Code Design


Define:

Data bits as m3 m2 m1 m0
Check bits as c2 c1 c0

Procedure:
1. Number the bit positions 1 to 7.
2. Put the c check bits in the power-of-two positions (1, 2, 4).

position   7   6   5   4   3   2   1
bit                    c2      c1  c0

Hamming Code Design


3. Put the m data bits in the remaining positions, to fill up n positions in total.

position   7   6   5   4   3   2   1
bit        m3  m2  m1  c2  m0  c1  c0

4. Write below each bit position the index's binary code:

position   7   6   5   4   3   2   1
MSB        1   1   1   1   0   0   0
           1   1   0   0   1   1   0
LSB        1   0   1   0   1   0   1
bit        m3  m2  m1  c2  m0  c1  c0

Note that each check bit has a single 1 below it.

Hamming Code Design

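Write check equations by XOR-ing along the corresponding row:

$$c_0 = m_3 \oplus m_1 \oplus m_0 \qquad (12)$$
$$c_1 = m_3 \oplus m_2 \oplus m_0 \qquad (13)$$
$$c_2 = m_3 \oplus m_2 \oplus m_1 \qquad (14)$$

For example, $c_0$ checks $m_0$, $m_1$, $m_3$.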

Hamming Code Design

Take each check equation in turn, and XOR both sides with the check bit to get the error syndrome bit $s$,

$$\begin{aligned}
s_0 &= c_0 \oplus c_0 && (15) \\
    &= c_0 \oplus m_3 \oplus m_1 \oplus m_0 && (16)
\end{aligned}$$

and similarly for $s_1$ and $s_2$. XORing a number with itself will always yield 0, so each syndrome bit ($s_0 = c_0 \oplus c_0$ etc.) should always equal 0. If not, there is an error. Furthermore, the binary value of the syndrome points to the bit error position.

Hamming Code in Practice


The transmitter calculates and sends (or stores) the check bits c0 c1 c2.
The receiver calculates the syndrome from the m and c bits.
The binary value of the syndrome points to the bit error position.

s2 s1 s0 ≠ 0: error
s2 s1 s0 = 0: no error

Hamming Code Example


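Data = m3 m2 m1 m0 = 1011, say. Check bits:

$$c_0 = 1 \oplus 1 \oplus 1 = 1 \qquad (17)$$
$$c_1 = 1 \oplus 0 \oplus 1 = 0 \qquad (18)$$
$$c_2 = 1 \oplus 0 \oplus 1 = 0 \qquad (19)$$

With no errors, the syndrome is

$$s_0 = 1 \oplus 1 \oplus 1 \oplus 1 = 0 \qquad (20)$$
$$s_1 = 0 \qquad (21)$$
$$s_2 = 0 \qquad (22)$$

i.e. no errors, all zero.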

Hamming Code Example

Suppose there is an error in bit $m_1$, and hence $m_1$ gets flipped from 1 to 0. Repeating the calculations,

$$s_0 = 1 \oplus 1 \oplus 0 \oplus 1 = 1 \qquad (23)$$
$$s_1 = 0 \qquad (24)$$
$$s_2 = 1 \qquad (25)$$

Hence $s_2 s_1 s_0$ is 101, or 5. Therefore position 5 is in error, which points to $m_1$. Since we are using the binary system, if we know which bit is incorrect, it is simply a matter of inverting it to correct it.
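The worked example can be checked with a short C sketch of the H(7, 4) code. It assumes the codeword is packed into one byte with bit position p (1 to 7) stored in bit p − 1; the function names are illustrative and not taken from the slides or from any chip data sheet.

```c
#include <stdint.h>

/* Encode data bits m3 m2 m1 m0 (low nibble of data) into the 7-bit
   codeword m3 m2 m1 c2 m0 c1 c0 (positions 7 down to 1). */
uint8_t hamming74_encode(uint8_t data) {
    unsigned m0 = data & 1u, m1 = (data >> 1) & 1u;
    unsigned m2 = (data >> 2) & 1u, m3 = (data >> 3) & 1u;
    unsigned c0 = m3 ^ m1 ^ m0;   /* equation (12) */
    unsigned c1 = m3 ^ m2 ^ m0;   /* equation (13) */
    unsigned c2 = m3 ^ m2 ^ m1;   /* equation (14) */
    return (uint8_t)((m3 << 6) | (m2 << 5) | (m1 << 4) |
                     (c2 << 3) | (m0 << 2) | (c1 << 1) | c0);
}

/* Syndrome s2 s1 s0: zero means no error, otherwise the error position. */
uint8_t hamming74_syndrome(uint8_t cw) {
    unsigned b[8] = {0};
    for (int p = 1; p <= 7; p++)
        b[p] = (cw >> (p - 1)) & 1u;
    unsigned s0 = b[1] ^ b[3] ^ b[5] ^ b[7];   /* c0, m0, m1, m3 */
    unsigned s1 = b[2] ^ b[3] ^ b[6] ^ b[7];   /* c1, m0, m2, m3 */
    unsigned s2 = b[4] ^ b[5] ^ b[6] ^ b[7];   /* c2, m1, m2, m3 */
    return (uint8_t)((s2 << 2) | (s1 << 1) | s0);
}

/* Invert the bit that the syndrome points to (if any). */
uint8_t hamming74_correct(uint8_t cw) {
    uint8_t s = hamming74_syndrome(cw);
    if (s != 0)
        cw ^= (uint8_t)(1u << (s - 1));
    return cw;
}
```

Encoding 1011 gives the codeword 1010101. Flipping m1 (position 5) gives 1000101, whose syndrome is 5, and `hamming74_correct` restores the original codeword.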


Hamming Code & Burst Errors


Hamming code as described is OK to correct single-bit errors.
However, most errors in communications systems occur in bursts (noise bursts etc).
Do we need an entirely different procedure?
Can extend basic single-bit-correcting codes using a buffering approach.

Hamming Code & Burst Errors


Write each codevector

m3 m2 m1 c2 m0 c1 c0

and then concatenate several to form a block:

m3(0)    m2(0)    m1(0)    c2(0)    m0(0)    c1(0)    c0(0)
m3(1)    m2(1)    m1(1)    c2(1)    m0(1)    c1(1)    c0(1)
m3(2)    m2(2)    m1(2)    c2(2)    m0(2)    c1(2)    c0(2)
...
m3(k−1)  m2(k−1)  m1(k−1)  c2(k−1)  m0(k−1)  c1(k−1)  c0(k−1)

That is, code each group of 4 bits using a standard (7,4) code. Repeat for the next 4 bits, etc., until a block is formed (say, 256 × 4 bits). Then transmit the block vertically, or column-wise.

An error burst has to last longer than one column of the block (256 bits in the previous example) before it cannot be corrected, because any shorter burst corrupts at most one bit of each codevector.
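The column-wise transmission can be sketched in C as below. This assumes each 7-bit codevector is packed into one byte (position p in bit p − 1) and that the serial stream is held as one 0/1 value per array entry; `interleave` and `deinterleave` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Write k codevectors as rows, then emit the bits column by column:
   position 7 of every row first, then position 6, and so on.
   out must have room for 7 * k entries. */
void interleave(const uint8_t *rows, size_t k, uint8_t *out) {
    size_t n = 0;
    for (int pos = 7; pos >= 1; pos--)
        for (size_t r = 0; r < k; r++)
            out[n++] = (uint8_t)((rows[r] >> (pos - 1)) & 1u);
}

/* Inverse operation at the receiver: rebuild the k rows from the stream. */
void deinterleave(const uint8_t *in, size_t k, uint8_t *rows) {
    size_t n = 0;
    for (size_t r = 0; r < k; r++)
        rows[r] = 0;
    for (int pos = 7; pos >= 1; pos--)
        for (size_t r = 0; r < k; r++)
            rows[r] |= (uint8_t)((in[n++] & 1u) << (pos - 1));
}
```

With k = 3 rows, a burst that inverts three consecutive stream bits lands on three different rows, so each codevector sees only a single-bit error that the Hamming decoder can correct.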

Error-Detecting Codes
Have seen error correcting codes. What about error detecting codes, where a retransmission request is possible?

Generally can have more powerful error protection capability with far fewer check bits.

Suitable where ARQ (automatic retransmission request) is feasible, such as communications links.
Fewer check bits means better bandwidth utilization.
Suitable on links where the error rate is low.

Error-Detecting Codes

Two main categories: Checksums and Cyclic Redundancy Checks (CRCs).

Checksums are more suitable for software implementation (use only addition and shift).
CRCs are more suitable for hardware implementation (use shifting and XOR-ing).
CRCs are sometimes called FCS, or Frame Check Sequence.

Checksum
Idea: add up the bytes (or words), each treated as an unsigned 8-bit (or 16-bit) number.

Using byte calculations and $2^{k}$ bytes, the largest number required is $2^{8} - 1 = 255$ for each byte, and approximately $2^{k} \times 2^{8}$ in total. For example, for $2^{10}$ bytes the largest number is just less than $2^{10} \times 2^{8} = 2^{18}$.

A 16-bit checksum is used in the TCP header and the IP header. It is easy for routers to re-calculate (update) the checksum as datagrams are forwarded.

Checksum and Modulo Arithmetic


      decimal   binary
        234         1 1 1 0 1 0 1 0
        183         1 0 1 1 0 1 1 1
        121         0 1 1 1 1 0 0 1
sum     538     1 0 0 0 0 1 1 0 1 0

Modulo-256, the carry bits 10 would be ignored.
Modulo-255 is similar: the accumulator (running sum) must be large enough to hold the sum, which is then divided by 255 and the remainder kept. This is equivalent to one's complement addition with end-around carry: the carry out of the MSB is added to the LSB (simple test in software).
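The two 8-bit sums can be compared with a small C sketch. The function names are assumptions for illustration; the end-around version folds the carry back in after every addition.

```c
#include <stddef.h>
#include <stdint.h>

/* Modulo-256 sum: carries out of bit 7 are simply discarded. */
uint8_t sum_mod256(const uint8_t *buf, size_t len) {
    uint8_t s = 0;
    for (size_t i = 0; i < len; i++)
        s = (uint8_t)(s + buf[i]);
    return s;
}

/* One's complement sum: the carry out of the MSB is added back into the LSB.
   The result is congruent to the true sum modulo 255 (255 itself stands for 0). */
uint8_t sum_end_around(const uint8_t *buf, size_t len) {
    uint16_t s = 0;
    for (size_t i = 0; i < len; i++) {
        s = (uint16_t)(s + buf[i]);
        s = (uint16_t)((s & 0xFFu) + (s >> 8));   /* end-around carry */
    }
    return (uint8_t)s;
}
```

For the bytes 234, 183, 121 (sum 538) the modulo-256 result is 26 and the end-around result is 28, which equals 538 mod 255.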


Fletcher Checksum
The Fletcher Checksum gives two octets, calculated as:

The modulo-255 sum of each message octet.
The modulo-255 sum of each message octet weighted in reverse by its position in the message.

The checksum octets are modified so that the receiver's checksum calculation yields zero.

Fletcher Checksum

Checksums $c_0$ and $c_1$ from a length-$L$ message with value $b_i$ in each message octet are (both taken modulo 255):

$$c_0 = \sum_{i=0}^{L-1} b_i \qquad (26)$$
$$c_1 = \sum_{i=0}^{L-1} (L - i)\, b_i \qquad (27)$$

The checksum octets are modified so that the receiver calculation yields zero.
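A minimal C sketch of equations (26) and (27). It uses the fact that a running sum of the running sum weights octet i by (L − i). The function name and the packing of the two octets into one 16-bit return value are assumptions, and the final adjustment that makes the receiver's calculation yield zero is not included.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns (c1 << 8) | c0, with both sums reduced modulo 255. */
uint16_t fletcher_sums(const uint8_t *b, size_t len) {
    uint32_t c0 = 0, c1 = 0;
    for (size_t i = 0; i < len; i++) {
        c0 = (c0 + b[i]) % 255u;   /* equation (26) */
        c1 = (c1 + c0) % 255u;     /* equation (27), accumulated incrementally */
    }
    return (uint16_t)((c1 << 8) | c0);
}
```

For the octets 1, 2, 3 this gives c0 = 6 and c1 = 3·1 + 2·2 + 1·3 = 10.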


Error detection performance (Stallings, p 537):

Detects all single-bit errors.
Detects all double-bit errors.
Detects 99.999981% of all bursts not exceeding 16 bits.
Detects 99.9985% of all longer bursts.

TCP & IP Checksum


IP, TCP & UDP checksum header format: see http://www.kohala.com/start/pocketguide1.ps

IP performs a checksum on the header only.
TCP performs a checksum on:
The data;
The header;
Parts of the IP header (termed the pseudo-header).

Set the header checksum to 0 to calculate. Checking is done with the checksum in place, and should yield zero.

TCP & IP Checksum References


Unix Network Programming, W. Richard Stevens, Prentice-Hall, 1990, pp 454-455, which in turn references

Computing the Internet Checksum, R. Braden, D. Borman, C. Partridge, Computer Communication Review, Vol 19, No 2, April 1989, pp 86-101

Description: TCP/IP Illustrated, Volume 1 - The Protocols, W. Richard Stevens, Addison-Wesley, 1994, pp 36-37 (description), p 145 (pseudo-header UDP), p 227 (pseudo-header TCP)

TCP & IP Checksum Implementation


Computed on packet send & receive.
Updated on packet forward (TTL field change).
16-bit end-around carry is the same for any architecture. If the additions are done in native endian-ordering, the result must be stored in native endian-ordering.
See RFC 1071 for implementation techniques.
See RFC 1141 for incremental update techniques.
See handout.

C Code (Stevens Book & RFC1071)


unsigned short IPChecksum(unsigned char *pPacket, int nbytes)
{
    unsigned short *p;
    long sum;
    unsigned short oddbyte;
    unsigned short cksm;                     // 16 bits

    p = (unsigned short *)pPacket;           // for 16-bit addition
    sum = 0L;
    while( nbytes > 1 ) {
        sum += *p++;
        nbytes -= 2;
    }
    if( nbytes == 1 ) {                      // odd length: pad the last byte with zero
        oddbyte = 0;
        *( (unsigned char *)&oddbyte ) = *( (unsigned char *)p );
        sum += oddbyte;
    }
    sum = (sum >> 16) + (sum & 0x0ffff);     // add carry from top 16 bits
    sum += (sum >> 16);                      // to low 16 bits
    cksm = (unsigned short)~sum;             // one's complement, so a valid packet checks to zero
    return cksm;
}

Note: RFC 1071 has a bug; see handout.

Cyclic Redundancy Checks


CRC or Cyclic Redundancy Check is extensively used for error detection in communication links.
Provides extremely good error detection.
Simple to compute using digital hardware.
Check bits are computed over a frame of data (typically 128 to 1500 bytes).
Also called a Frame Check Sequence (FCS).
The checksum is quite suited to software implementation, whereas the CRC is more suited to hardware implementation.

CRC Calculation

The following introduces the basic concepts behind the CRC.
1. Define a generator (or generator polynomial) of N bits.
2. Treat the data frame to be transmitted as a number, and divide this number by the generator.
3. Transmit the frame followed by the remainder.
4. At the receiver, perform the same calculation and check the remainder.

CRC Calculation
Although it seems that division is required (and hence floating-point computations), the operation is carried out using modulo-2 arithmetic and the exclusive-or (XOR, written as ⊕) digital logic function.

The check bits are appended to the message and the receiver simply checks for a zero remainder.

CRC Calculation
Define a generator (polynomial) of N bits.
Calculate the N − 1 checkbits at the sender, using the message with N − 1 zeros appended.
Divide, modulo-2, the message by the generator. Ignore the quotient.
Transmit the remainder immediately after the bits of the message (the augmented message).
At the receiver, divide the generator into the augmented message.
Zero remainder indicates that no errors have occurred.
Non-zero remainder indicates that one or more errors occurred and the message must be resent.

CRC Calculation
The computation may be understood by recalling the long-division manual calculation method. Suppose the sum is 3421 ÷ 8. Recall how this would be laid out with a quotient above and a remainder at the end.

CRC Example
Generator = 1001, length N = 4:

$$g(X) = X^{3} + 1$$

The sender calculates the N − 1 CRC checkbits from the message with N − 1 zeros appended.

CRC Example
                1 0 1 1 1 0 0 1          quotient (ignored)
1 0 0 1 ) 1 0 1 0 1 1 1 0 0 0 0          message + 3 zero bits
          1 0 0 1                        generator polynomial
            0 1 1 1
            0 0 0 0
              1 1 1 1
              1 0 0 1
                1 1 0 1
                1 0 0 1
                  1 0 0 0
                  1 0 0 1
                    0 0 1 0
                    0 0 0 0
                      0 1 0 0
                      0 0 0 0
                        1 0 0 0
                        1 0 0 1
                          0 0 1          CRC result

CRC Example
The receiver calculates the N − 1 remainder bits from the message with the N − 1 CRC checkbits appended. The remainder should be zero.

CRC Example
                1 0 1 1 1 0 0 1          quotient (ignored)
1 0 0 1 ) 1 0 1 0 1 1 1 0 0 0 1          message + 3 CRC bits
          1 0 0 1                        generator polynomial
            0 1 1 1
            0 0 0 0
              1 1 1 1
              1 0 0 1
                1 1 0 1
                1 0 0 1
                  1 0 0 0
                  1 0 0 1
                    0 0 1 0
                    0 0 0 0
                      0 1 0 0
                      0 0 0 0
                        1 0 0 1
                        1 0 0 1
                          0 0 0          no errors

CRC Example
                1 0 1 1 1 1 1 0          quotient (ignored)
1 0 0 1 ) 1 0 1 0 1 0 0 1 0 0 1          message + 3 CRC bits
          1 0 0 1                        generator polynomial
            0 1 1 1
            0 0 0 0
              1 1 1 0
              1 0 0 1
                1 1 1 0
                1 0 0 1
                  1 1 1 1
                  1 0 0 1
                    1 1 0 0
                    1 0 0 1
                      1 0 1 0
                      1 0 0 1
                        0 1 1 1
                        0 0 0 0
                          1 1 1          error

CRC Example

If the error burst is identical to the generator polynomial (or is any multiple of it), we fail to detect the error!

CRC Example
                1 0 1 1 0 0 0 1          quotient (ignored)
1 0 0 1 ) 1 0 1 0 0 1 1 1 0 0 1          message + 3 CRC bits
          1 0 0 1                        generator polynomial
            0 1 1 0
            0 0 0 0
              1 1 0 1
              1 0 0 1
                1 0 0 1
                1 0 0 1
                  0 0 0 1
                  0 0 0 0
                    0 0 1 0
                    0 0 0 0
                      0 1 0 0
                      0 0 0 0
                        1 0 0 1
                        1 0 0 1
                          0 0 0          no errors
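The long divisions in these examples can be reproduced with a bit-serial C sketch of modulo-2 division. It is meant only as an illustration for short values (at most 32 bits in total); the function names are hypothetical, and real CRC implementations normally use shift registers or lookup tables instead.

```c
#include <stdint.h>

/* Remainder of the modulo-2 division of an nbits-wide value by a generator
   that is gen_bits wide (N in the slides). Each step XORs the generator in
   under the current leading 1, exactly as in the long-division layout. */
uint32_t mod2_remainder(uint32_t value, int nbits, uint32_t gen, int gen_bits) {
    for (int i = nbits - 1; i >= gen_bits - 1; i--) {
        if (value & (1u << i))
            value ^= gen << (i - (gen_bits - 1));
    }
    return value;   /* only the low N - 1 bits can still be set */
}

/* Sender side: append N - 1 zeros to the message and keep the remainder. */
uint32_t crc_checkbits(uint32_t msg, int msg_bits, uint32_t gen, int gen_bits) {
    return mod2_remainder(msg << (gen_bits - 1),
                          msg_bits + gen_bits - 1, gen, gen_bits);
}
```

With the message 10101110 and generator 1001, `crc_checkbits` returns 001. Dividing the received frames 10101110001, 10101001001 and 10100111001 gives remainders 000, 111 and 000 respectively; the last frame differs from the transmitted one by exactly the generator pattern, so the error goes undetected.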


CRC Error Detection


Normally 16 or 32 bits are used for the CRC.
Common CRC polynomials:

CRC-CCITT: $X^{16} + X^{12} + X^{5} + 1$

CRC-Ethernet: $X^{32} + X^{26} + X^{23} + X^{22} + X^{16} + X^{12} + X^{11} + X^{10} + X^{8} + X^{7} + X^{5} + X^{4} + X^{2} + X + 1$

CRC Performance
CRC-CCITT can catch:

all single-bit errors
all double-bit errors
all bursts of length 16 bits or less
99.997% of 17-bit error bursts
99.998% of 18-bit or longer bursts

Error Control References

Simon Haykin, Digital Communications, Chapter 8, Error-Control Coding, John Wiley & Sons, 1988
Andrew Tanenbaum, Computer Networks, Prentice-Hall, 3rd ed, 1996
William Stallings, Data and Computer Communications, Appendix 11A, The ISO Checksum, MacMillan, 4th ed, 1991
RFC 1071: ftp://ftp.rfc-editor.org/in-notes/rfc1071.txt

Module Summary Important Points


1. Concepts in error detection & correction
2. Parity, interleaving, Hamming codes
3. Derive & use Hamming equations
4. Checksum algorithm
5. CRC algorithm