CHAPTER 15
ERROR CORRECTING CODES
Insert this material after Chapter 13 (Gray Code) and after the new material on the
Cyclic Redundancy Check.
15–1 Introduction
This section is a brief introduction to the theory and practice of error correcting
codes (ECCs). We limit our attention to binary forward error correcting (FEC)
block codes. This means that the symbol alphabet consists of just two symbols
(which we denote 0 and 1), that the receiver can correct a transmission error with-
out asking the sender for more information or for a retransmission, and that the
transmissions consist of a sequence of fixed length blocks, called code words.
Section 15–2 describes the code independently discovered by R. W. Ham-
ming and M. J. E. Golay before 1950 [Ham]. This code is single error correcting
(SEC), and a simple extension of it, also discovered by Hamming, is single error
correcting and, simultaneously, double error detecting (SEC-DED).
Section 15–4 steps back and asks what is possible in the area of forward error
correction. Still sticking to binary FEC block codes, the basic question addressed
is: for a given block length (or code length) and level of error detection and cor-
rection capability, how many different code words can be encoded?
Section 15–2 is for readers who are primarily interested in learning the basics
of how ECC works in computer memories. Section 15–4 is for those who are
interested in the mathematics of the subject, and who might be interested in the
challenge of an unsolved mathematical problem.
The reader is cautioned that over the past 50 years ECC has become a very
big subject. Many books have been published on it and closely related subjects
[Hill, LC, MS, and Roman, to mention only a few]. Here we just scratch the sur-
face and introduce the reader to two important topics and to some of the terminol-
ogy used in this field. Although much of the subject of error correcting codes
relies very heavily on the notations and results of linear algebra, and in fact is a
very nice application of that abstract theory, we avoid it here for the benefit of
those who are not familiar with that theory.
The following notation is used throughout this chapter. It is close to that used
in [LC]. The terms are defined in subsequent sections.
k Number of “information” or “message” bits.
m Number of parity-check bits (“check bits,” for short).
n Code length, n = m + k.
15–2 The Hamming Code

Hamming's single error correcting code appends m check bits to k information
bits, forming a code word of n = m + k bits. From the received block the receiver
computes an m-bit quantity p that must distinguish m + k + 1 cases: no error
occurred, or an error occurred in any one of the m + k transmitted positions.
Hence m must be large enough that

    2^m ≥ m + k + 1.                                              (1)
This is known as the Hamming rule. It applies to any single error correcting (SEC)
binary FEC block code in which all of the transmitted bits must be checked. The
check bits will be interspersed among the information bits in a manner described
below.
Because p indexes the bit (if any) that is in error, the least significant bit of p
must be 1 if the erroneous bit is in an odd position, and 0 if it is in an even position
or if there is no error. A simple way to achieve this is to let the least significant bit
of p, p0, be an even parity check on the odd positions of the block, and to put p0 in
an odd position. The receiver then checks the parity of the odd positions (includ-
ing that of p0). If the result is 1, an error has occurred in an odd position, and if the
result is 0, either no error occurred or an error occurred in an even position. This
satisfies the condition that p should be the index of the erroneous bit, or be 0 if no
error occurred.
Similarly, let the next from least significant bit of p, p1, be an even parity
check of positions 2, 3, 6, 7, 10, 11, … (in binary, 10, 11, 110, 111, 1010, 1011,
…), and put p1 in one of these positions. Those positions have a 1 in their second
from least significant binary position number. The receiver checks the parity of
these positions (including the position of p1). If the result is 1, an error occurred in
one of those positions, and if the result is 0, either no error occurred or an error
occurred in some other position.
Continuing, the third from least significant check bit, p2, is made an even par-
ity check on those positions that have a 1 in their third from least significant posi-
tion number, namely positions 4, 5, 6, 7, 12, 13, 14, 15, 20, …, and p2 is put in one
of those positions.
15–2 THE HAMMING CODE 3
Putting the check bits in power-of-two positions (1, 2, 4, 8, …) has the advan-
tage that they are independent. That is, the sender can compute p0 independently
of p1, p2, … and, more generally, it can compute each check bit independently of
the others.
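In other words, whether check bit i covers a given position can be read directly
from the binary representation of the position number. A minimal sketch in C
(the function name is illustrative, not from the text):

    /* Position pos (numbered 1, 2, 3, ... from the left) is covered by
       check bit i exactly when bit i of the position number is 1.
       Check bit i itself is placed at the power-of-two position 1 << i. */
    int covered_by(int pos, int i) {
       return (pos >> i) & 1;
    }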
As an example, let us develop a single error correcting code for k = 4. Solv-
ing (1) for m gives m = 3, with equality holding. This means that all 2^m possible
values of the m check bits are used, so it is particularly efficient. A code with this
property is called a perfect code.1
This code is called the (7,4) Hamming code, which signifies that the code
length is 7 and the number of information bits is 4. The positions of the check bits
pi and the information bits ui are shown below.
p0 p1 u3 p2 u2 u1 u0
1 2 3 4 5 6 7
Table 15–1 shows the entire code. The 16 rows show all 16 possible informa-
tion bit configurations and the check bits calculated by Hamming’s method.
1. A perfect code exists for k = 2^m – m – 1, m an integer—that is, k = 1, 4, 11, 26, 57, 120,
….
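To make the construction concrete, here is a sketch in C of how a sender might
compute the check bits of this (7,4) code. The function name, the packing of the
code word into the low seven bits of an integer (position 1 as the leftmost of
those seven bits), and the variable names are illustrative choices, not taken from
the text.

    unsigned hamming74_encode(unsigned u) {   // u = information bits u3..u0.
       unsigned u3 = (u >> 3) & 1, u2 = (u >> 2) & 1,
                u1 = (u >> 1) & 1, u0 = u & 1;
       unsigned p0 = u3 ^ u2 ^ u0;            // Even parity of odd positions 3, 5, 7.
       unsigned p1 = u3 ^ u1 ^ u0;            // Even parity of positions 3, 6, 7.
       unsigned p2 = u2 ^ u1 ^ u0;            // Even parity of positions 5, 6, 7.
       // Assemble positions 1..7 = p0 p1 u3 p2 u2 u1 u0.
       return (p0 << 6) | (p1 << 5) | (u3 << 4) | (p2 << 3) |
              (u2 << 2) | (u1 << 1) | u0;
    }

For example, hamming74_encode(4) (information bits 0100) yields the code word
1001100, which agrees with the worked correction example below (row 4 of
Table 15–1).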
To illustrate how the receiver corrects a single-bit error, suppose the code
word
1001110
is received. This is row 4 in Table 15–1 with bit 6 flipped. The receiver calculates
the exclusive or of the bits in odd positions and gets 0. It calculates the exclusive
or of bits 2, 3, 6, and 7 and gets 1. Lastly it calculates the exclusive or of bits 4, 5,
6, and 7 and gets 1. Thus the error indicator, which is called the syndrome, is
binary 110, or 6. The receiver flips the bit at position 6 to correct the block.
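The receiver's side can be sketched the same way; again the function name and
the packing of the block into the low seven bits are illustrative assumptions.

    /* Returns the corrected block; *err is set to the position (1..7)
       of the bit that was flipped, or to 0 if no error was detected. */
    unsigned hamming74_correct(unsigned c, int *err) {
       unsigned b[8];                             // b[i] = bit at position i.
       for (int i = 1; i <= 7; i++)
          b[i] = (c >> (7 - i)) & 1;
       unsigned s0 = b[1] ^ b[3] ^ b[5] ^ b[7];   // Odd positions.
       unsigned s1 = b[2] ^ b[3] ^ b[6] ^ b[7];   // Positions 2, 3, 6, 7.
       unsigned s2 = b[4] ^ b[5] ^ b[6] ^ b[7];   // Positions 4, 5, 6, 7.
       unsigned syndrome = (s2 << 2) | (s1 << 1) | s0;
       *err = syndrome;
       if (syndrome != 0)
          c ^= 1u << (7 - syndrome);              // Flip the erroneous bit.
       return c;
    }

Applied to the received block 1001110 above, the parity checks give s0 = 0,
s1 = 1, s2 = 1, so the syndrome is 6 and the function flips bit 6, restoring
1001100.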
A SEC-DED Code
For many applications a single error correcting code would be considered unsat-
isfactory, because it accepts all blocks received; a block containing two errors is
silently miscorrected, yielding wrong information bits. A SEC-DED code seems
safer, and it is the level of correction and detection most often used in computer
memories.
The Hamming code can be converted to a SEC-DED code by adding one
check bit, which is a parity bit (let us assume even parity) on all the bits in the SEC
code word. This code is called an extended Hamming code [Hill, MS]. It is not
obvious that it is SEC-DED. To see that it is, consider Table 15–2, which shows
how the receiver interprets the overall parity of the received block together with
the syndrome of its SEC portion. It is assumed that at most two bits are in error.
If no bits are in error, the overall parity is even and the syndrome is 0. If one bit
is in error, the overall parity is odd, and the syndrome either is 0 (the error is in
the overall parity bit itself) or identifies the erroneous bit, which can then be cor-
rected. If two bits are in error, the overall parity is even, and the errors are
detected provided the syndrome is nonzero. But why must the syndrome be non-
zero—that is, why must there be a check bit
that checks one of the erroneous bits but not the other one? To see this, first sup-
pose one of the erroneous bits is in an even position and the other is in an odd
position. Then because one of the check bits (p0) checks all the odd positions and
none of the even positions, the parity of the bits at the odd positions will be odd,
resulting in a nonzero syndrome. More generally, suppose the erroneous bits are
in positions i and j (with i ≠ j ). Then because the binary representations of i and j
must differ in some bit position, one of them has a 1 at that position and the other
has a 0 at that position. The check bit corresponding to this position in the binary
integers checks the bits at positions in the code word that have a 1 in their position
number, but not the positions that have a 0 in their position number. The bits cov-
ered by that check bit will have odd parity, and thus the syndrome will be nonzero.
As an example, suppose the erroneous bits are in positions 3 and 7. In binary, the
position numbers are 0…0011 and 0…0111. These numbers differ in the third
position from the right, and at that position, the number 7 has a 1 and the number
3 has a 0. Therefore the bits checked by the third check bit (these are bits 4, 5, 6,
7, 12, 13, 14, 15, …) will have odd parity.
Thus, referring to Table 15–2, the overall parity and the syndrome together
uniquely identify whether 0, 1, or 2 errors occurred. In the case of one error, the
receiver can correct it. In the case of two errors, the receiver cannot tell whether
just one of the errors is in the SEC portion (in which case it could correct it) or
both errors are in the SEC portion (in which case an attempt to correct it would
result in incorrect information bits).
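The receiver's decision can be summarized in a few lines of C. This is only a
sketch of the logic just described (the names are illustrative); parity is the overall
parity of the received block (0 meaning even) and syndrome is the syndrome of
the SEC portion.

    enum ecc_result {NO_ERROR, CORRECTED, UNCORRECTABLE};

    enum ecc_result secded_classify(unsigned parity, unsigned syndrome) {
       if (parity == 0 && syndrome == 0)
          return NO_ERROR;
       if (parity == 1)          // Odd overall parity: a single error.
          return CORRECTED;      // Flip bit `syndrome`, or the overall
                                 // parity bit itself if syndrome is 0.
       return UNCORRECTABLE;     // Even parity but nonzero syndrome:
                                 // a double error was detected.
    }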
The overall parity bit could as well be a parity check on only the even posi-
tions, because the overall parity bit is easily calculated from that and the parity of
the odd positions (which is the least significant check bit). More generally, the
overall parity bit could as well be a parity check on the complement set of bits
checked by any one of the SEC parity bits. This observation might save some
gates in hardware.
It should be clear that the Hamming SEC code is of minimum redundancy.
That is, for a given number of information bits, it adds a minimum number of
check bits that permit single error correction. This is true because it was con-
structed that way. Hamming shows that the SEC-DED code constructed from a
SEC code by adding one overall parity bit is also of minimum redundancy. His
argument is to assume that a SEC-DED code exists that has fewer check bits, and
he derives from this a contradiction to the fact that the starting SEC code had min-
imum redundancy.
Concluding Remarks
In the more mathematically oriented ECC literature, the term “Hamming code” is
reserved for the perfect codes described above—that is, those with (n, k) = (3, 1),
(7, 4), (15, 11), (31, 26), and so on. Similarly, the extended Hamming codes are
those obtained from these perfect codes by appending the overall parity bit, as
described above. However, computer architects and
engineers often use the term to denote any of the codes that Hamming described,
and some variations. The term “extended” is often understood.
The first IBM computer to use Hamming codes was the IBM Stretch com-
puter (model 7030), built in 1961 [LC]. It used a (72, 64) SEC-DED code (not a
perfect code). A follow-on machine known as Harvest (model 7950), built in
1962, was equipped with 22-track tape drives that employed a (22, 16) SEC-DED
code. The ECCs found on modern machines are usually not Hamming codes, but
rather are codes devised for some logical or electrical property such as minimiz-
ing the depth of the parity check trees, and making them all the same length. Such
codes give up Hamming’s simple method of determining which bit is in error, and
instead use a hardware table lookup.
At the time of this writing (2005), most notebook PCs (personal computers)
have no error checking in their memory systems. Desktop PCs may have none, or
they may have a simple parity check. Server-class computers generally have ECC
at the SEC-DED level.
In the early solid state computers equipped with ECC memory, the memory
was usually in the form of eight check bits and 64 information bits. A memory
module (group of chips) might be built from, typically, nine eight-bit wide chips.
A word access (72 bits, including check bits) fetches eight bits from each of these
nine chips. Each chip is laid out in such a way that the eight bits accessed for a sin-
gle word are physically far apart. Thus, a word access references 72 bits that are
physically somewhat separated. With bits interleaved in that way, if a few close-
together bits in the same chip are altered, as for example by an alpha particle or
cosmic ray hit, a few words will have single-bit errors, which can be corrected.
15–3 Software for SEC-DED on 32 Information Bits
The positions checked by each check bit are shown in Table 15–4. In this
table, bits are numbered in the usual little-endian way, with position 0 being the
least significant bit (unlike Hamming’s numbering).
Each check bit can be computed with the familiar parity cascade of assignments

    0.  x = u ⊕ (u >> 1)
    1.  x = x ⊕ (x >> 2)
    2.  x = x ⊕ (x >> 4)
    3.  x = x ⊕ (x >> 8)
    4.  x = x ⊕ (x >> 16)

(the shifts are unsigned, logical shifts),
except omitting line i when computing check bit pi, for 0 ≤ i ≤ 4. For p5, all the
above assignments are used. This is where the regularity of the pattern of bits
checked by each check bit pays off; a lot of code commoning can be done. This
reduces what would be 4×5 + 5 = 25 such assignments to 15, as shown in
Figure 15–1.
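As an illustration of the sharing (this is only a sketch; Figure 15–1 gives the
complete 15-assignment version), check bits p3, p4, and p5 can share the first
three lines of the cascade:

    unsigned t, p3, p4, p5;      // u holds the 32 information bits.
    t = u ^ (u >> 1);            // Lines 0, 1, and 2 of the cascade,
    t = t ^ (t >> 2);            // shared by p3, p4, and p5.
    t = t ^ (t >> 4);
    p3 = t ^ (t >> 16);          // Omits line 3 (the shift by 8).
    p4 = t ^ (t >> 8);           // Omits line 4 (the shift by 16).
    p5 = p4 ^ (p4 >> 16);        // Uses all five lines.

(The desired check bit must then be extracted from a particular bit position of
each result, as with p1 in the fragment below.)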
Incidentally, if the computer has an instruction for computing the parity of a
word, or has the population count instruction (which puts the word parity in the
least significant bit of the target register), then the regular pattern is not needed.
On such a machine, each check bit can be computed as the parity of u ANDed
with a mask that selects the positions checked by that bit (Table 15–4), and so
forth.
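A sketch of that approach in C, where check_mask[i] stands for the mask with
1's in the positions checked by pi (its six values would come from Table 15–4,
which is not reproduced here), and __builtin_popcount is the GCC/Clang popu-
lation count intrinsic:

    extern const unsigned check_mask[6];     // Masks taken from Table 15-4.

    unsigned check_bit(unsigned u, int i) {
       // Parity of the positions selected by the mask.
       return __builtin_popcount(u & check_mask[i]) & 1;
    }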
After packing the six check bits into a single quantity p, the checkbits
function accounts for information bit u0 by complementing all six check bits if
u0 = 1. (C.f. Table 15–4; p5 must be complemented because u0 was erroneously
included in the calculation of p5 up to this point.)
For example, omitting the shift-by-2 line gives check bit p1:
t1 = u ^ (u >> 1);
p1 = t1 ^ (t1 >> 4);
p1 = p1 ^ (p1 >> 8);
p1 = p1 ^ (p1 >> 16); // p1 is in posn 2.
15–4 Error Correction Considered More Generally

Another example is the two-out-of-five code, in which each code word has
exactly two 1-bits:
{00011, 00101, 00110, 01001, 01010, 01100, 10001, 10010, 10100, 11000}.
The code size is 10, and thus it is suitable for representing decimal digits. Notice
that if codeword 00110 is considered to represent decimal 0, then the remaining
values can be decoded into digits 1 through 9 by giving the bits weights of 6, 3, 2,
1, and 0, in left-to-right order.
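That decoding rule might be coded as follows (a sketch; the function name is
illustrative, and the code word is assumed to be in the low five bits of the
argument):

    int two_of_five_decode(unsigned c) {
       static const int weight[5] = {6, 3, 2, 1, 0};   // Left-to-right weights.
       if (c == 0x06) return 0;                        // 00110 represents 0.
       int digit = 0;
       for (int i = 0; i < 5; i++)                     // i counts from the left.
          if (c & (1 << (4 - i)))
             digit += weight[i];
       return digit;
    }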
The code rate is a measure of the efficiency of a code. For a code like Ham-
ming’s, this can be defined as the number of information bits divided by the code
length. For the Hamming code discussed above it is 4/7 ≈ 0.57. More generally,
the code rate is defined as the log base 2 of the code size divided by the code
length. The simple codes above have rates of log2(8)/9 ≈ 0.33 and
log2(10)/5 ≈ 0.66, respectively.
Hamming Distance
The central concept in the theory of ECC is that of Hamming distance. The Ham-
ming distance between two words (of equal length) is the number of bit positions
in which they differ. Put another way, it is the population count of the exclusive or
of the two words. It is appropriate to call this a distance function because it satis-
fies the definition of a distance function used in linear algebra:
    d(x, y) = d(y, x),
    d(x, y) ≥ 0,
    d(x, y) = 0 iff x = y, and
    d(x, y) + d(y, z) ≥ d(x, z)   (triangle inequality).

Here d(x, y) denotes the Hamming distance between code words x and y, which
for brevity we will call simply the distance between x and y.
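Computing the distance this way is a one-liner given a population count routine.
A sketch in C, assuming a 32-bit unsigned type (pop32 is the usual divide-and-
conquer population count; a machine instruction or compiler builtin could be
used instead):

    unsigned pop32(unsigned x) {              // Population count of a 32-bit word.
       x = x - ((x >> 1) & 0x55555555);
       x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
       x = (x + (x >> 4)) & 0x0F0F0F0F;
       return (x * 0x01010101) >> 24;
    }

    unsigned hamming_distance(unsigned x, unsigned y) {
       return pop32(x ^ y);                   // Bits in which x and y differ.
    }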
Suppose a code has a minimum distance of 1. That is, there are two words x
and y in the set that differ in only one bit. Clearly, if x were transmitted and the bit
that makes it distinct from y were flipped due to a transmission error, then the
receiver could not distinguish between receiving x with a certain bit in error, and
receiving y with no errors. Hence in such a code it is impossible to detect even a
1-bit error, in general.
Suppose now that a code has a minimum distance of 2. Then if just one bit is
flipped in transmission, an invalid code word is produced, and thus the receiver
can (in principle) detect the error. But if two bits are flipped, a valid code word
might be transformed into another valid code word. Thus double-bit errors cannot
be detected. Furthermore, single-bit errors cannot be corrected. This is because if
a received word has one bit in error, then there may be two code words that are
one bit-change away from the received word, and the receiver has no basis for
deciding which is the original code word.
The code obtained by appending a single parity bit is in this category. It is
shown below for the case of three information bits (k = 3). The rightmost bit is the
parity bit, chosen to make even parity on all four bits. The reader may verify that
the minimum distance between code words is 2.
0000
0011
0101
0110
1001
1010
1100
1111
Actually, adding a single parity bit permits detecting any odd number of
errors, but when we say that a code permits detecting m-bit errors, we mean all
errors up to m bits.
Now consider the case in which the minimum distance between code words
is 3. If any one or two bits is flipped in transmission, an invalid code word results.
If just one bit is flipped, the receiver can (we imagine) try flipping each of the
received bits one at a time, and in only one case will a code word result. Hence in
such a code the receiver can detect and correct a single-bit error. However, a dou-
ble-bit error might appear to be a single-bit error from another code word, and
thus the receiver cannot detect double-bit errors.
Error correction capability can be traded for error detection. For example, if
the minimum distance of a code is 3, that redundancy can be used to correct no
errors but to detect single- or double-bit errors. If the minimum distance is 5, the
code can be used to correct single-bit errors and detect 3-bit errors, or to correct no
errors but to detect 4-bit errors, and so forth. Whatever is subtracted from the
“Correct” column of Table 15–6 can be added to the “Detect” column.
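A compact way to state the trade-off (this relation summarizes the preceding
paragraph rather than quoting Table 15–6): a code with minimum distance d can
simultaneously correct c errors and detect e errors, with e ≥ c, provided

    c + e + 1 ≤ d.

For d = 5, for example, the choices (c, e) = (2, 2), (1, 3), and (0, 4) are all avail-
able, matching the cases mentioned above.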
Let A(n, d) denote the largest possible number of code words in a binary code of
length n with minimum distance d between code words. A code with minimum
distance 1 is unconstrained, so

    A(n, 1) = 2^n,                                                (2)

and, because the words of even parity form a code of length n and minimum dis-
tance 2, and no such code can be larger (deleting any one column must leave its
words distinct),

    A(n, 2) = 2^(n−1).
That was not difficult. What about A(n, 3)? That is an unsolved problem, in
the sense that no formula or reasonably easy means of calculating it is known. Of
course many specific values of A(n, 3) are known, and some bounds are known,
but the exact value is unknown in most cases.
When equality holds in (1), it represents the solution to this problem for the
case d = 3. Letting n = m + k, (1) may be rewritten

    2^k ≤ 2^n/(n + 1),                                            (3)

so that

    A(n, 3) ≤ 2^n/(n + 1),

with equality holding when n + 1 is a power of 2—that is, for the perfect
Hamming codes.
A simple general relation is

    A(n, d) ≤ 2A(n − 1, d).                                       (4)

Thus adding 1 to the code length at most doubles the number of code words pos-
sible, for the same minimum distance d. To see this, suppose you have a code of
length n, distance d, and size A(n, d). Choose an arbitrary column of the code.
Either half or more of the code words have a 0 in the selected column, or half or
more have a 1 in that position. Of these two subsets, choose one that has at least
A(n, d)/2 code words, form a new code consisting of this subset, and delete the
selected column (which is either all 0’s or all 1’s). The resulting set of code words
has n reduced by 1, has the same distance d, and has at least A(n, d)/2 code words.
Thus A(n – 1, d) ≥ A(n, d) ⁄ 2, from which inequality (4) follows.
A useful relation is that if d is even, then

    A(n, d) = A(n − 1, d − 1).                                    (5)
To see this, suppose you have a code C of length n and minimum distance d, with
d odd. Form a new code by appending to each word of C a parity bit, let us say to
make the parity of each word even. The new code has length n + 1, and has the
same number of code words as does C. It has minimum distance d + 1. For if two
words of C are a distance x apart, with x odd, then one word must have even parity
and the other must have odd parity. Thus we append a 0 in the first case and a 1 in
the second case, which increases the distance between the words to x + 1. If x is
even, we append a 0 to both words, which does not change the distance between
them. Because d is odd, all pairs of words that are a distance d apart become dis-
tance d + 1 apart. The distance between two words more than d apart either does
not change or increases. Therefore the new code has minimum distance d + 1.
This shows that if d is odd, then A(n + 1, d + 1) ≥ A(n, d), or equivalently
A(n, d) ≥ A(n – 1, d – 1) for even d ≥ 2.
Now suppose you have a code of length n and minimum distance d ≥ 2 (d
can be odd or even). Form a new code by eliminating any one column. The new
code has length n – 1, minimum distance at least d – 1, and is the same size as
the original code (all the code words of the new code are distinct because the new
code has minimum distance at least 1). Therefore A(n – 1, d – 1) ≥ A(n, d). This
establishes equation (5).
Spheres
Upper and lower bounds on A(n, d), for any d ≥ 1, can be derived by thinking in
terms of n-dimensional spheres. Given a code word, think of it as being at the cen-
ter of a “sphere” of radius r, consisting of all words at a Hamming distance r or
less from it.
How many points (words) are in a sphere of radius r? First consider how
many points are in the shell at distance exactly r from the central code word. This
is given by the number of ways to choose r different items from n, ignoring the
order of choice. We imagine the r chosen bits as being complemented, to form a
word at distance exactly r from the central point. This “choice” function, often
written with n above r in parentheses and written here as C(n, r), may be calcu-
lated from²

    C(n, r) = n!/(r!(n − r)!).

2. It is also called the “binomial coefficient” because C(n, r) is the coefficient of the term
x^r y^(n−r) in the expansion of the binomial (x + y)^n.
Thus C(n, 0) = 1, C(n, 1) = n, C(n, 2) = n(n − 1)/2, C(n, 3) = n(n − 1)(n − 2)/6,
and so forth.
The total number of points in a sphere of radius r is the sum of the points in
the shells from radius 0 to r:
    C(n, 0) + C(n, 1) + ⋯ + C(n, r).
Now suppose a code has minimum distance d, and let r = ⌊(d − 1)/2⌋. Spheres of
radius r centered on distinct code words cannot intersect, because a word lying in
two such spheres would imply two code words within distance 2r < d of each
other. The disjoint spheres contain A(n, d)·(C(n, 0) + ⋯ + C(n, r)) points in all,
and the whole space has only 2^n points, so

    A(n, d) ≤ 2^n/(C(n, 0) + C(n, 1) + ⋯ + C(n, ⌊(d − 1)/2⌋)).

Figure 15–3 illustrates the maximum such radius for d = 5 and d = 6.
Each large dot represents a code word, and each small dot represents a non-code
word a unit distance away from its neighbors.
[Figure omitted; panels: d = 5, r = 2 and d = 6, r = 2.]
FIGURE 15–3. Maximum radius that allows correcting points within a sphere.
The sphere idea also easily gives a lower bound on A(n, d). Assume again
that you have a code of length n and minimum distance d, and it has the maximum
possible number of code words—that is, it has A(n, d) code words. Surround each
code word with a sphere of radius d – 1. Then these spheres must cover all 2 n
points in the space (possibly overlapping). For if not, there would be a point that
is at a distance d or more from all code words, and that is impossible because such
a point would be a code word. Thus we have a weak form of the Gilbert-Varsha-
mov bound:
    A(n, d)·(C(n, 0) + C(n, 1) + ⋯ + C(n, d − 1)) ≥ 2^n.
There is the strong form of the G-V bound, which applies to linear codes. Its
derivation relies on methods of linear algebra which, important as they are to the
subject of linear codes, are not covered in this short introduction to error correct-
ing codes. Suffice it to say that a linear code is one in which the sum (exclusive or)
of any two code words is also a code word. The Hamming code of Table 15–1 is
a linear code. Because the G-V bound is a lower bound on linear codes, it is also
a lower bound on the unrestricted codes considered here. For large n, it is the best
known lower bound on both linear and unrestricted codes.
The strong G-V bound states that A(n, d) ≥ 2 k , where k is the largest integer
such that
    2^k < 2^n/(C(n − 1, 0) + C(n − 1, 1) + ⋯ + C(n − 1, d − 2)).
That is, it is the value of the right-hand side of this inequality rounded down to the
next strictly smaller integral power of 2. The “strictness” is important for cases
such as (n, d) = (8, 3), (16, 3) and (the degenerate case) (6, 7).
Combining these results:
    GP2LT(2^n/(C(n − 1, 0) + ⋯ + C(n − 1, d − 2)))  ≤  A(n, d)  ≤  2^n/(C(n, 0) + ⋯ + C(n, ⌊(d − 1)/2⌋)),      (6)
where GP2LT denotes the greatest integral power of 2 (strictly) less than its argu-
ment.
Table 15–7 gives the values of these bounds for some small values of n and d.
A single number in an entry means the lower and upper bounds given by (6) are
equal.
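For small n and d, both bounds in (6) are easy to compute directly. A sketch in C
(64-bit arithmetic suffices for roughly n ≤ 62; the function names are illustra-
tive):

    #include <stdint.h>

    static uint64_t choose(int n, int r) {     // C(n, r).
       uint64_t c = 1;
       for (int i = 1; i <= r; i++)
          c = c * (n - r + i) / i;             // Exact at every step.
       return c;
    }

    static uint64_t sphere(int n, int r) {     // C(n,0) + C(n,1) + ... + C(n,r).
       uint64_t s = 0;
       for (int i = 0; i <= r; i++)
          s += choose(n, i);
       return s;
    }

    uint64_t upper_bound(int n, int d) {       // Right-hand side of (6).
       return ((uint64_t)1 << n) / sphere(n, (d - 1)/2);
    }

    uint64_t lower_bound(int n, int d) {       // Left-hand side of (6), d >= 2.
       uint64_t s = sphere(n - 1, d - 2);
       uint64_t p = 1;
       while (p*2*s < ((uint64_t)1 << n))      // Greatest power of 2 strictly
          p *= 2;                              // less than 2^n/s.
       return p;
    }

For example, upper_bound(7, 3) and lower_bound(7, 3) both return 16, reflect-
ing the perfect (7,4) Hamming code, and lower_bound(8, 3) returns 16 rather
than 32, illustrating why the strictness of the inequality matters.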
If d is even, bounds can be computed directly from (6) or, making use of
equation (5), they can be computed from (6) with d replaced with d – 1 and n
replaced with n – 1 in the two bounds expressions. It turns out that the latter
method always results in tighter or equal bounds. Therefore, the entries in
Table 15–7 were calculated only for odd d. To access the table for even d, use the
values of d shown in the footing and the values of n shown at the right (shaded
regions).
The bounds given by (6) can be seen to be rather loose, especially for large d.
The ratio of the upper bound to the lower bound diverges to infinity with increas-
ing n. The lower bound is particularly loose. Over a thousand papers have been
written describing methods to improve these bounds, and the results as of this
writing are shown in Table 15–8 [Agrell, Brou; where they differ, the table shows
the tighter bounds].
The cases of (n, d) = (7, 3), (15, 3), and (23, 7) are perfect codes, meaning
that they achieve the upper bound given by (6). This definition is a generalization
of that given in the footnote in Section 15–2. The codes for which n is odd and n = d are also perfect;
see exercise 6.
We conclude this section by pointing out that the idea of minimum distance
over an entire code, which leads to the ideas of p-bit error detection and q-bit error
correction for some p and q, is not the only criterion for the “power” of a binary
FEC block code. For example, work has been done on codes aimed at correcting
burst errors. [Etzion] has demonstrated a (16, 11) code, and others, that can cor-
rect any single-bit error and any error in two consecutive bits, and is perfect, in a
sense not discussed here. It is not capable of general double-bit error detection.
The (16, 11) extended Hamming code is SEC-DED and is perfect. Thus his code
gives up general double-bit error detection in return for double-bit error correc-
tion of consecutive bits. This is of course interesting because in many applications
errors are likely to occur in short bursts.
[Table 15–8, which lists the best known lower and upper bounds on A(n, d), is not reproduced here.]
Exercises
1. Show a Hamming code for k = 3 (make a table similar to Table 15–1).
2. In a certain application of a SEC code, there is no need to correct the
check bits. Thus the m check bits need only check the information bits, but not
themselves. For k information bits, m must be large enough so that the receiver
can distinguish k + 1 cases: which of the k bits is in error, or no error occurred.
Thus the number of check bits required is given by 2^m ≥ k + 1. This is a weaker
restriction on m than is the Hamming rule, so it should be possible to construct,
for some values of k, a SEC code that has fewer check bits than those required by
the Hamming rule. Alternatively, one could have just one value to signify that an
error occurred somewhere in the check bits, without specifying where. This
would lead to the rule 2^m ≥ k + 2.
What is wrong with this reasoning?
3. Prove: A(2n, 2d) ≥ A(n, d).
4. Prove the “singleton bound”: A(n, d) ≤ 2^(n−d+1).
5. Show that the notion of a perfect code as equality in the right-hand por-
tion of inequality (6) is a generalization of the Hamming rule.
6. What is the value of A(n, d) if n = d? Show that for odd n, these codes are
perfect.
7. Show that if n is a multiple of 3 and d = 2n/3, then A(n, d) = 4.
8. Show that if d > 2n/3, then A(n, d) = 2.
9. (Brain teaser) How would you find, numerically, the minimum m that
satisfies (1), as a function of k?
References
[Ham] Hamming, Richard W., “Error Detecting and Error Correcting Codes,”
The Bell System Technical Journal 29, 2 (April 1950), 147–160. NB:
We interchange the roles of the variables k and m relative to how they
are used by Hamming. We follow the more modern usage found in [LC]
and [MS], for example.
[LC] Lin, Shu and Costello, Daniel J., Jr. Error Control Coding: Fundamen-
tals and Applications. Prentice-Hall, 1983.
Answers to Exercises
1. Your table should look like Table 15–1 with the rightmost column and
the odd numbered rows deleted.
2. In the first case, if an error occurs in a check bit, the receiver cannot
know that, and it will make an erroneous “correction” to the information bits.
In the second case, if an error occurs in a check bit, the syndrome will be one
of 100…0, 010…0, 001…0, …, 000…1 (m distinct values). Therefore m must be
large enough to encode these m values, as well as the k values to encode a single
error in one of the k information bits, and a value for “no errors.” So the Hamming
rule stands.
One thing along these lines that could be done is to have a single parity bit for
the m check bits, and have the m check bits encode values of one error in an infor-
mation bit (and where it is), or no errors occurred. For this code, m could be cho-
sen as the smallest value for which 2^m ≥ k + 1. The code length would be
k + m + 1, where the “+1” is for the parity bit on the check bits. But this code
length is nowhere better than that given by the Hamming rule, and is sometimes
worse.
3. Given a code of length n and minimum distance d, simply double-up
each 1 and each 0 in each code word. The resulting code is of length 2n, mini-
mum distance 2d, and is the same size.
4. Given a code of length n, minimum distance d, and size A(n, d), think of
it as being displayed as in Table 15–1. Remove an arbitrary d – 1 columns. The
resulting code words, of length n − (d − 1), have minimum distance at least 1.
That is, they are all distinct. Hence their number cannot be more than 2^(n−(d−1)).
Since deleting columns did not change the code size, the original code's size is at
most 2^(n−(d−1)), so that A(n, d) ≤ 2^(n−d+1).
5. The Hamming rule applies to the case that d = 3 and the code has 2^k code
words, where k is the number of information bits. The right-hand part of inequal-
ity (6), with A(n, d) = 2^k and d = 3, is

    2^k ≤ 2^n/(C(n, 0) + C(n, 1)) = 2^n/(1 + n).

Letting n = m + k, this becomes

    2^k ≤ 2^(m+k)/(1 + m + k),

which is equivalent to 2^m ≥ m + k + 1, the Hamming rule.

6. If n = d, then A(n, d) = 2. Two code words at distance n must be comple-
ments of each other (for example, all 0's and all 1's), and the distances from any
other word to the two complementary words sum to n, so no third word can be at
distance n from both. For odd n, these codes are perfect—that is, they achieve the
upper bound in inequality (6). Proof sketch: An n-bit binary integer may be
thought of as representing uniquely a choice from n objects, with a 1-bit meaning
to choose and a 0-bit meaning not to choose the corresponding object. Therefore
there are 2^n ways to choose from 0 to n objects from n objects—that is,

    C(n, 0) + C(n, 1) + ⋯ + C(n, n) = 2^n.

If n is odd, i ranging from 0 to (n − 1)/2 covers half the terms of this sum, and
because of the symmetry C(n, i) = C(n, n − i), it accounts for half the sum.
Therefore

    C(n, 0) + C(n, 1) + ⋯ + C(n, (n − 1)/2) = 2^(n−1),

so that the upper bound in (6) is 2^n/2^(n−1) = 2. Thus the code achieves the
upper bound of (6).
7. For ease of exposition, this proof will make use of the notion of equiva-
lence of codes. Clearly a code is not changed in any substantial way by rearrang-
ing its columns (as depicted in Figure 15–1), or by complementing any column.
If one code can be derived from another by such transformations, they are called
equivalent. Because a code is an unordered set of code words, the order of a dis-
play of its code words is immaterial. By complementing columns, any code can
be transformed into an equivalent code that has a code word that is all 0’s.
Also for ease of exposition, we carry out this proof for the case n = 9 and
d = 6.
Wlog (without loss of generality), let code word 0 (the first, which we will
call cw0) be 000 000 000. Then all other code words must have at least six 1's, to
differ from cw0 in at least six places.
Assume (which will be shown) that the code has at least three code words.
Then no code word can have seven or more 1's. For if one did, then another code
word (which necessarily has six or more 1's) would have at least four of its 1's in
the same columns as the word with seven or more 1's. This means the code words
would be equal in four or more positions, and so could differ in five or fewer posi-
tions (9 – 4), violating the requirement that d = 6. Thus all code words other than
the first must have exactly six 1's.
Wlog, rearrange the columns so that the first two code words are:
cw0: 000 000 000
cw1: 111 111 000
The next code word, cw2, cannot have four or more of its 1's in the left six col-
umns, because then it would be the same as cw1 in four or more positions, and so
would differ from cw1 in five or fewer positions. Therefore it has three or fewer of
its 1's in the left six columns, so that three of its 1's must be in the right three posi-
tions. Therefore exactly three of its 1's are in the left six columns. Rearrange the
left six columns (of all three code words) so that cw2 looks like this:
cw2: 111 000 111
By similar reasoning, the next code word (cw3) cannot have four of its 1's in
the left three and right three positions, because it would then equal cw2 in four
positions. Therefore it has three or fewer 1's in the left three and right three posi-
tions, so that three of its 1's must be in the middle three positions. By similarly
comparing it to cw1, we conclude that three of its 1's must be in the right three
positions. Therefore cw3 is:
cw3: 000 111 111
By comparing the next code word, if one is possible, with cw1, cw2, and cw3,
we conclude that it must have three 1's in the right three positions, in the middle
three positions, and in the left three positions, respectively. This is impossible. By
inspection, the above four code words satisfy d = 6, so A(9, 6) = 4.
8. Obviously A(n, d) is at least 2, because the two code words can be all 0’s
and all 1’s. Reasoning similarly as in the previous exercise, let one code word,
cw0, be all 0’s. Then all other code words must have more than 2n/3 1’s. If the
code has three or more code words, then any two code words other than cw0 must
have 1's in the same positions for more than 2n/3 − n/3 = n/3 positions, as
suggested by the figure below.

    1111…11110………0
    (more than 2n/3 1's followed by fewer than n/3 0's)

(The figure represents cw1 with its 1's pushed to the left. Imagine placing the
more than 2n/3 1's of cw2 to minimize the overlap of the 1's.) Since cw1 and
cw2 overlap in more than n/3 positions, they can differ in fewer than n − n/3 =
2n/3 positions, resulting in a minimum distance less than 2n/3.
9. Treating m and k as real numbers, the following iteration converges from
below quite rapidly:
    m_0 = 0,
    m_(i+1) = lg(m_i + k + 1),   i = 0, 1, …,

where lg(x) is the log base 2 of x. The correct result is given by ceil(m_2)—that is,
only two iterations are required for all k ≥ 0.
Taking another tack, it is not difficult to prove that for k ≥ 0,
bitsize(k) ≤ m ≤ bitsize(k) + 1.
so that m may be computed as

    m ← W − nlz(k),
    m ← m + ((1 << m) − 1 − m < k),

where W is the machine's word size in bits, nlz(k) is the number of leading zeros
of k, and the comparison is unsigned, yielding 1 if true and 0 if false.
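In C, with a 32-bit k, the same computation might read as follows (a sketch;
__builtin_clz is the GCC/Clang number-of-leading-zeros intrinsic and is unde-
fined for 0, so k = 0 is handled separately):

    #include <stdint.h>

    int check_bits_needed(uint32_t k) {
       int m = (k == 0) ? 0 : 32 - __builtin_clz(k);   // m = bitsize(k).
       if (((uint64_t)1 << m) - 1 - m < k)             // If 2^m < m + k + 1,
          m += 1;                                      // one more bit is needed.
       return m;
    }

For example, check_bits_needed(4) returns 3 and check_bits_needed(11)
returns 4, agreeing with the perfect codes for k = 4 and k = 11 mentioned earlier.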