LECTURE 7
Detecting Bit Errors
These lecture notes discuss some techniques for error detection. Error detection is
important because no practical error correction scheme can perfectly correct all errors in a
message. For example, any reasonable error correction scheme that can correct all patterns
of t or fewer errors will have some error pattern of more than t errors that cannot be
corrected. Our goal is not to eliminate all errors, but to reduce the bit error rate to a low
enough value that the occasional corrupted coded message is not a problem: the receiver
can just discard such messages and perhaps request a retransmission from the sender (we
will study such retransmission protocols later in the term). To decide whether to keep or
discard a message, the receiver needs a way to detect any errors that might remain after
the error correction and decoding schemes have done their job. This task is done by an
error detection scheme.
An error detection scheme works as follows. The sender takes the message and produces
a compact hash or digest of the message, with the idea that commonly occurring
corruptions of the message will cause the hash to be different from the correct value. The
sender includes the hash with the message, and then passes that over to the error correcting
mechanisms, which code the message. The receiver gets the coded bits, runs the error
correction decoding steps, and then obtains the presumptive set of original message bits
and the hash. The receiver computes the same hash over the presumptive message bits
and compares the result with the presumptive hash it has decoded. If the results disagree,
then clearly there has been some unrecoverable error, and the message is discarded. If
the results agree, then the receiver believes the message to be correct. Note that if the
results agree, the receiver can only believe the message to be correct; it is certainly possible
(though, for good detection schemes, unlikely) for two different message bit sequences to
have the same hash.
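As a rough illustration of the flow just described, here is a minimal Python sketch. It assumes zlib.crc32 from the Python standard library as a stand-in hash and leaves out the error correction coding and decoding stages entirely. The names attach_digest and check_digest are made up for this example.

```python
import zlib

DIGEST_BYTES = 4  # zlib.crc32 returns a 32-bit value


def attach_digest(message: bytes) -> bytes:
    """Sender side: append the digest of the message to the message."""
    digest = zlib.crc32(message).to_bytes(DIGEST_BYTES, "big")
    return message + digest


def check_digest(received: bytes):
    """Receiver side: recompute the digest and compare.

    Returns the presumptive message if the digests agree, else None
    (meaning the message should be discarded).
    """
    message, digest = received[:-DIGEST_BYTES], received[-DIGEST_BYTES:]
    recomputed = zlib.crc32(message).to_bytes(DIGEST_BYTES, "big")
    return message if recomputed == digest else None


frame = attach_digest(b"hello")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit
print(check_digest(frame))      # digests agree: message is accepted
print(check_digest(corrupted))  # digests disagree: message is discarded
```

Agreement of the digests only means the receiver believes the message; a different message with the same digest would also pass this check.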
The topic of this lecture is the design of appropriate error detection hash functions. The
design depends on the errors we anticipate. If the errors are adversarial in nature, e.g.,
from a malicious party who can change the bits as they are sent over the channel, then
the hash function must guard against as many as possible of the enormous number of
different error patterns that might occur. This task requires cryptographic protection, and is
done in practice using schemes like SHA-1, the secure hash algorithm. We won't study these
in 6.02, focusing instead on non-malicious, random errors introduced when bits are sent over
communication channels. The error detection hash functions in this case are typically called
checksums: they protect against certain random forms of bit errors, but are by no means the
method to use when communicating over an insecure channel. We will study two simple
checksum algorithms: the Adler-32 checksum and the Cyclic Redundancy Check (CRC).[1]
After the modulo operation the A and B values can be represented as 16-bit quantities.
The Adler-32 checksum is the 32-bit quantity (B << 16) + A.
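The definitions of A and B do not appear in the text above, so the following Python sketch assumes the standard Adler-32 definition: A starts at 1 and accumulates the message bytes, B accumulates the successive values of A, and both are reduced modulo 65521. The comparison against zlib.adler32 (Python standard library) is only a sanity check.

```python
import zlib

MOD_ADLER = 65521  # largest prime smaller than 2**16


def adler32(data: bytes) -> int:
    """Adler-32 under the standard definition (an assumption here)."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER  # running sum of bytes, plus 1
        b = (b + a) % MOD_ADLER     # running sum of the A values
    # After the modulo, A and B each fit in 16 bits.
    return (b << 16) + a


data = b"Wikipedia"
print(hex(adler32(data)))
print(adler32(data) == zlib.adler32(data))
```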
The Adler-32 checksum requires messages that are several hundred bytes long before
it reaches its full effectiveness, i.e., enough bytes so that A exceeds 65521.[2] Methods like
the Adler-32 checksum are used to check whether large files being transferred have errors;
Adler-32 itself is used in the popular zlib compression library and (in rolling window form)
in the rsync file synchronization program.
For network packet transmissions, typical sizes range between 40 bytes and perhaps
10000 bytes, and often packets are on the order of 1000 bytes. For such sizes, a more
effective error detection method is the cyclic redundancy check (CRC). CRCs work well
over shorter messages and are easy to implement in hardware using shift registers. For
these reasons, they are extremely popular.
[1] Sometimes, the literature uses "checksum" to mean something different from a CRC, using checksums
for methods that involve the addition of groups of bits to produce the result, and CRCs for methods that
involve polynomial division. We use the term "checksum" to include both kinds of functions, which are both
applicable to random errors and not to insecure channels (unlike secure hash functions).
[2] 65521 is the largest prime smaller than 2^16. It is not clear to what extent the primality of the modulus
matters, and some studies have shown that it doesn't seem to matter much.
SECTION 7.2. CYCLIC REDUNDANCY CHECK
For example, the code word 11000101 may be represented as the polynomial x^7 + x^6 +
x^2 + 1, plugging the bits into Eq. (7.1).
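The mapping from code word to polynomial can be sketched in a few lines of Python. This assumes, consistent with the example above, that the leftmost bit of an n-bit code word is the coefficient of x^(n-1) and the rightmost bit is the constant term. The function names are made up for this example.

```python
def code_word_to_exponents(bits):
    """Return the exponents with coefficient 1, highest first.

    `bits` is a string such as "11000101"; the leftmost character is
    taken as the coefficient of x^(n-1).
    """
    n = len(bits)
    return [n - 1 - i for i, bit in enumerate(bits) if bit == "1"]


def format_polynomial(exponents):
    """Render a list of exponents as a polynomial in x."""
    if not exponents:
        return "0"
    terms = []
    for e in exponents:
        if e == 0:
            terms.append("1")
        elif e == 1:
            terms.append("x")
        else:
            terms.append(f"x^{e}")
    return " + ".join(terms)


print(format_polynomial(code_word_to_exponents("11000101")))
# prints: x^7 + x^6 + x^2 + 1
```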
We use the term code polynomial to refer to the polynomial corresponding to a code word.
The key idea in a CRC (and, indeed, in any cyclic code) is to ensure that every valid code
polynomial is a multiple of a generator polynomial, g(x). We will look at the properties of good
generator polynomials in a bit, but for now let's look at some properties of codes built with
this property.
All arithmetic in our CRC will be done in F_2. The normal rules of polynomial addition,
subtraction, multiplication, and division apply, except that all coefficients are either 0 or 1 and
the coefficients add and multiply using the F_2 rules (addition is XOR, so 1 + 1 = 0;
multiplication is AND). In particular, note that all minus signs can be replaced with + signs,
making life quite convenient.
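One convenient way to experiment with this arithmetic is to store a polynomial over F_2 as a Python integer whose bit i is the coefficient of x^i. That representation is a choice made for this sketch, not something the notes prescribe. Addition and subtraction both become bitwise XOR, and multiplication is shift-and-XOR with no carries.

```python
def poly_add(a, b):
    """Add (or, equivalently, subtract) two polynomials over F_2."""
    return a ^ b


def poly_mul(a, b):
    """Multiply two polynomials over F_2 (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:        # current coefficient of b is 1
            result ^= a  # add the shifted copy of a
        a <<= 1          # multiply a by x
        b >>= 1          # move to the next coefficient of b
    return result


x_plus_1 = 0b11
print(bin(poly_add(x_plus_1, x_plus_1)))  # (x + 1) + (x + 1) = 0
print(bin(poly_mul(x_plus_1, x_plus_1)))  # (x + 1)^2 = x^2 + 1
```

Note that (x + 1)^2 comes out as x^2 + 1, because the cross term 2x vanishes in F_2.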
where the notation R{a(x)/b(x)} stands for the remainder when a(x) is divided by b(x).
The encoder is now straightforward to define. Take the message, construct the message
polynomial, multiply by x^{n-k}, and then divide that by g(x). The remainder forms the
check bits, acting as the digest for the entire message. Send these bits appended to the
message.
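The encoder steps can be sketched as follows in Python, storing each polynomial over F_2 as an integer whose bit i is the coefficient of x^i (a representation chosen for this sketch). The generator x^3 + x + 1 and the 4-bit message are small illustrative values, not ones taken from the notes, and poly_mod and crc_encode are made-up names.

```python
def poly_mod(a, g):
    """Remainder when a(x) is divided by g(x) over F_2."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        # Cancel the leading term of a with a shifted copy of g.
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def crc_encode(message, g):
    """Append the CRC check bits to the message bits."""
    num_check_bits = g.bit_length() - 1   # n - k, the degree of g(x)
    shifted = message << num_check_bits   # multiply by x^(n-k)
    remainder = poly_mod(shifted, g)      # the check bits
    return shifted | remainder            # message followed by check bits


g = 0b1011                     # g(x) = x^3 + x + 1 (illustrative)
code_word = crc_encode(0b1101, g)
print(bin(code_word))          # 0b1101001
print(poly_mod(code_word, g))  # 0: the code word is a multiple of g(x)
```

Because subtraction and addition coincide in F_2, appending the remainder makes the whole code polynomial a multiple of g(x), which is what the receiver checks.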
starting s bits to the left from the end of the packet. If we pick g(x) to be a polynomial
of degree b, and if g(x) does not have x as a factor, then any error pattern of length
at most b is guaranteed to be detected, because g(x) will not divide a polynomial of degree
smaller than its own. Moreover, there is exactly one error pattern of length b + 1
(corresponding to the case when the burst error pattern matches the coefficients of
g(x) itself) that will not be detected. All other error patterns of length b + 1 will be
detected by this CRC.
Figure 7-2: Commonly used CRC generator polynomials, g(x). From Wikipedia.
In fact, such a CRC is quite good at detecting longer burst errors as well, though it
cannot detect all of them.
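The burst-error claim can be checked by brute force for a small generator. The Python sketch below assumes that an error goes undetected exactly when its error polynomial is a multiple of g(x), and it treats a burst of length L as a pattern whose first and last bits are 1 with arbitrary bits in between. The generator x^3 + x + 1 (so b = 3) and the 16-bit word length are illustrative choices, not values from the notes.

```python
def poly_mod(a, g):
    """Remainder when a(x) is divided by g(x) over F_2 (ints as bit vectors)."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def undetected_bursts(g, n, length):
    """All bursts of exactly `length` bits in an n-bit word that g(x) divides."""
    missed = []
    interior_bits = max(length - 2, 0)
    for start in range(n - length + 1):
        for interior in range(1 << interior_bits):
            if length == 1:
                burst = 1
            else:
                # 1 at both ends, every combination of bits in between
                burst = (1 << (length - 1)) | (interior << 1) | 1
            error = burst << start
            if poly_mod(error, g) == 0:
                missed.append(error)
    return missed


g, n = 0b1011, 16
for length in (1, 2, 3):
    print(length, undetected_bursts(g, n, length))  # nothing is missed
print([bin(e) for e in undetected_bursts(g, n, 4)])  # shifted copies of g
```

For bursts of length b + 1 = 4, the only pattern that slips through at each starting position is the one whose bits match the coefficients of g(x).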
CRCs are examples of cyclic codes, which have the property that if c is a code word,
then any cyclic shift (rotation) of c is another valid code word. Hence, referring to Eq. (7.1),
we find that one can represent the polynomial corresponding to one cyclic left shift of w as

    w^{(1)}(x) = w_{n-1} + w_0 x + w_1 x^2 + ... + w_{n-2} x^{n-1} = x w(x) + w_{n-1} (1 + x^n).
Now, because w^{(1)}(x) must also be a valid code word, it must be a multiple of g(x),
which means that g(x) must divide 1 + x^n. Note that 1 + x^n corresponds to a double error
pattern; what this observation implies is that the CRC scheme using cyclic code polynomials
can detect the errors we want to detect (such as all double bit errors) as long as g(x)
is picked so that the smallest n for which 1 + x^n is a multiple of g(x) is quite large. For
example, in practice, a common 16-bit CRC has a g(x) for which the smallest such value of
n is 2^15 - 1 = 32767, which means that it's quite effective for all messages of length smaller
than that.
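The smallest n for which g(x) divides 1 + x^n can be found by direct search. The Python sketch below stores polynomials over F_2 as integers (bit i is the coefficient of x^i) and repeatedly multiplies by x modulo g(x) until the result is 1. The notes do not say which 16-bit CRC they have in mind; the generator x^16 + x^15 + x^2 + 1 used here is an assumption made for illustration, and smallest_n is a made-up name.

```python
def smallest_n(g):
    """Smallest n >= 1 such that g(x) divides 1 + x^n over F_2.

    Requires that g(x) has degree at least 1 and a nonzero constant
    term (i.e., x is not a factor of g(x)); otherwise no such n exists.
    """
    deg_g = g.bit_length() - 1
    if deg_g < 1 or g & 1 == 0:
        raise ValueError("g(x) needs degree >= 1 and constant term 1")
    r = 1  # x^0 mod g(x)
    n = 0
    while True:
        r <<= 1                 # multiply by x
        n += 1
        if (r >> deg_g) & 1:    # degree reached deg_g: reduce modulo g(x)
            r ^= g
        if r == 1:              # x^n = 1 (mod g), so g divides 1 + x^n
            return n


print(smallest_n(0b1011))   # x^3 + x + 1 gives 7
print(smallest_n(0x18005))  # x^16 + x^15 + x^2 + 1 (assumed generator)
```

With this generator, any two bit errors fewer than smallest_n(g) positions apart produce an error polynomial that g(x) does not divide, so they are detected.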