
Low Density Parity Check Codes for Erasure Protection

Alexander Sennhauser April 22, 2005

Abstract

This document describes my semester project in the field of LDPC codes for erasure protection at the Swiss Federal Institute of Technology Lausanne, supervised by Bertrand Nzdana Nzdana and Prof. Amin Shokrollahi. At the beginning some basic coding theory topics are quickly reviewed in order to later understand the principles of linear codes and especially of LDPC codes. In the second part the concept behind LDPC codes is presented and illustrated with some simple examples. The third part discusses the implementation of one or several encoders and decoders. The implementations are entirely written in C++.

Contents
1 Basic Coding Theory
  1.1 General communication system
  1.2 Channel
      1.2.1 Binary Symmetric Channel
      1.2.2 Binary Erasure Channel
  1.3 Decoding
  1.4 Entropy and Information
  1.5 Capacity of a Channel
      1.5.1 Shannon's Noisy Coding Theorem

2 Linear Codes
  2.1 Basics
  2.2 Encoding Information
  2.3 Decoding

3 Low Density Parity Check (LDPC) Codes
  3.1 Encoding
      3.1.1 Standard Encoding Method
      3.1.2 Advanced Encoding Method
  3.2 Decoding

4 Implementation
  4.1 LDPC Encoder
  4.2 LDPC Decoder

5 Conclusion

Chapter 1

Basic Coding Theory


The main goal of Coding Theory is efficient and reliable data transmission in an uncooperative environment. Claude Shannon's paper "A Mathematical Theory of Communication" showed by mathematical means that such reliable data transmission is indeed possible. This will become much clearer when we look at Shannon's Noisy Coding Theorem. In general a communication system looks like the one Shannon described in his paper, shown in Figure 1.1. Basically it consists of a source that transmits a message across a channel, where noise can or will perturb the initial message, and a receiver that observes the perturbed message at the other end. In order to make the transmission across the noisy channel more reliable we add some additional information to the object we send, i.e. redundancy. It is clear that the amount of data to transmit grows with the amount of redundancy: the more redundancy we add, the longer it takes to transmit the message, but the more reliability we gain. High information content per transmitted symbol will therefore always be in conflict with efficient and fast data transmission.

Figure 1.1: Shannon's model of a communication system

1.1

General communication system

A more general communication system than the one described by Claude Shannon is depicted in Figure 1.2. The message is a k-tuple chosen from a set of possible message words. This k-tuple is transformed by the encoder into a codeword, an n-tuple over an alphabet A, and sent over the channel. Notice that we are describing a memoryless transmission system, which means that at any given time the message does not depend on previous messages. Moreover the encoder transmits blocks of symbols of fixed size n (block coding). The decoder receives from the channel an n-tuple of symbols, which it transforms either into an estimate of the message k-tuple over the encoder's alphabet A or into an estimate of the codeword n-tuple. The decoding procedure is entirely specified by our decoding scheme.

1.2

Channel

In an ideal world the transmission channel is perfect and no noise is added to the codeword. In practice this will never be the case. We shall concentrate on coding for a discrete memoryless channel or DMC. The channel is called discrete because we only consider finite input and output alphabets, and memoryless because an error on one symbol does not affect the reliability of neighbouring symbols. An important type of channel is the m-ary symmetric channel, which has an input alphabet Σ1 = {x1, x2, ..., xm} and an output alphabet Σ2 = {x1, x2, ..., xm} of m symbols and is completely characterized by its channel matrix, whose entry p_ij is the probability that the symbol x_j is received after the symbol x_i has been transmitted.

1.2.1

Binary Symmetric Channel

We are especially interested in the 2-ary symmetric channel, also called the binary symmetric channel or BSC(p), where p is the parameter of the channel matrix (Figure 1.3a). The input and output alphabets both consist of the set Σ = {0, 1}.

1.2.2

Binary Erasure Channel

Another important type is the binary erasure channel or BEC(ε), which is depicted in Figure 1.3. The BEC has the same input alphabet as the BSC, namely Σ1 = {0, 1}. The output alphabet however is Σ2 = {0, 1, ?}. The channel itself is characterized by a single parameter ε which describes the probability that a symbol has been erased by the channel. Note that bits cannot be flipped anymore. This channel can be used to model a system where messages are either received correctly or lost for some reason. The decoding problem is then to find the correct bits given the erasure locations.
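To make the channel model concrete, a small C++ sketch of a BEC(ε) simulator (purely illustrative; the names and the representation of the erasure symbol are my own choices, not part of the project code) could look as follows.

#include <cstdint>
#include <random>
#include <vector>

// A received symbol is 0, 1 or 2, where 2 plays the role of the
// erasure symbol '?'.
std::vector<uint8_t> transmit_bec(const std::vector<uint8_t>& codeword,
                                  double epsilon, std::mt19937& rng) {
    std::bernoulli_distribution erase(epsilon);
    std::vector<uint8_t> received(codeword);
    for (uint8_t& symbol : received)
        if (erase(rng))
            symbol = 2;   // the symbol is lost, but its location is known
    return received;
}

Note that a bit is never mapped to its complement: the channel only erases symbols, it never flips them.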

1.3

Decoding

After the n-tuple is received from the channel we can distinguish between two possible decoding schemes: hard and soft decoding. With the hard decoding scheme the decoder always decodes the received n-tuple into a message k-tuple. When performing soft decoding the decoder has the choice to decode the n-tuple either into a message k-tuple or into an additional symbol ?, which indicates the inability to make an educated guess. In the second case we speak of a channel erasure, which is best described as a symbol error whose location is known. Suppose, when designing our decoder, that for each received n-tuple y we know the probability p(y | x) of receiving y given that the n-tuple x was sent. The basis of our hard quantization decoding scheme is then: when y is received, decode it to a codeword x that maximizes p(y | x). This is called Maximum Likelihood Decoding, abbreviated MLD. If we take the same scheme as a basis but also allow the decoder to decode the received y to the symbol ? (soft quantization scheme), meaning error detected but not corrected, we speak of Incomplete Maximum Likelihood Decoding (IMLD). It is clear that when our goal is to maximize the probability of correct decoding, MLD should be used, because in this situation any guess is better than none.

Figure 1.2: General communication system

When we consider a code C in A^n and a decoding algorithm A, we are interested in the average error expectation for decoding C using the algorithm A, defined as

P_C(A) = |C|^{-1} \sum_{x \in C} P_x(A).    (1.1)

This quantity tells us nothing about the performance of A in general, but only how good A is as an algorithm for decoding C. We finally define the error expectation P_C for C by

P_C = \min_{A} P_C(A).    (1.2)

It is clear that if P_C(A) is large then the decoding algorithm A is not good. However, if P_C is large, then no decoding algorithm is good for C and we should consider another code. In Section 1.3 of [1] some basic code examples such as repetition codes, parity check codes and Hamming codes are reviewed.

Figure 1.3: (a) Binary Symmetric Channel; (b) Binary Erasure Channel; (c) BSC channel matrix; (d) BEC channel matrix. The channel matrices are

P_{BSC} = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}, \qquad P_{BEC} = \begin{pmatrix} 1-\varepsilon & 0 & \varepsilon \\ 0 & 1-\varepsilon & \varepsilon \end{pmatrix},

where the rows are indexed by the input symbols 0 and 1, the columns of P_{BSC} by the output symbols 0 and 1, and the columns of P_{BEC} by the output symbols 0, 1 and ?.

1.4

Entropy and Information

Entropy is best described as a measure of the uncertainty of some event. If we consider a discrete random variable X we define its entropy as

H(X) = -\sum_{k} p_k \log_2 p_k.    (1.3)

For a given probability distribution (p_1, p_2, ..., p_n) of the random variable X there exists an upper bound for the entropy. It can be shown that H(X) = H(p_1, p_2, ..., p_n) \le \log_2 n (for a proof see [2]), with equality if and only if p_1 = p_2 = ... = p_n = 1/n. If X and Y are two random variables we have H(X, Y) \le H(X) + H(Y), with equality if X and Y are independent. Moreover we define the conditional entropy as

H(X | Y) = \sum_{j} H(X | Y = b_j) P(Y = b_j).    (1.4)

The information of a certain event E is defined as

I(E) = -\log_2 P(E).    (1.5)

Notice that entropy is the mean value of information. As with entropy we define the conditional information as

I(U | V) = H(U) - H(U | V).    (1.6)

This equation directly relates the concepts of entropy and information: by knowing V we lose some of the uncertainty about U.
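As a small worked example of definition (1.3): a fair coin with p_1 = p_2 = 1/2 has

H(X) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit},

which attains the upper bound \log_2 2 = 1, whereas a biased coin with p_1 = 0.9 and p_2 = 0.1 has H(X) = -0.9\log_2 0.9 - 0.1\log_2 0.1 \approx 0.47 bits: the less uncertain the outcome, the less information it carries on average.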

1.5

Capacity of a Channel

As we have seen in section 1.2, a channel is entirely characterized by the input alphabet Σ1, the output alphabet Σ2 and the channel matrix. If we attach this channel to a memoryless source A which emits symbols from the set Σ1 with a probability distribution (p_1, p_2, ..., p_n), then the output of the channel can be regarded as another memoryless source B which outputs symbols from the set Σ2 with probability distribution (q_1, q_2, ..., q_m). Notice that

q_j = \sum_{i=1}^{n} P(b_j \text{ received} \mid a_i \text{ sent}) P(a_i \text{ sent}) = \sum_{i=1}^{n} p_i p_{ij}.

Therefore we can define the capacity of the channel as

Cap = \max\{I(A | B)\} = \max\{H(A) + H(B) - H(A, B)\},    (1.7)

where the maximum is taken over the input distribution (p_1, ..., p_n). Notice that the capacity of a channel is entirely determined by the channel matrix. Looking at the channel matrix of the binary symmetric channel and calculating the information I(A | B) we find the capacity to be

Cap = \max\{I(A | B)\} = 1 + p\log_2(p) + (1-p)\log_2(1-p).    (1.8)

Doing the same calculation for the binary erasure channel we find that

Cap = \max\{I(A | B)\} = 1 - \varepsilon.    (1.9)

In general the capacity calculation for a channel is nontrivial and often involves some kind of optimization problem. It can often be useful to relate the new channel to a channel whose capacity has already been calculated. An important relation is that the r-th extension of a memoryless channel with capacity C has capacity rC.
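For example, plugging concrete numbers into (1.8) and (1.9): a BSC with crossover probability p = 0.1 has Cap = 1 + 0.1\log_2 0.1 + 0.9\log_2 0.9 \approx 0.53 bits per channel use, while a BEC with erasure probability ε = 0.1 has Cap = 0.9 bits per channel use. For the same parameter value the BEC has the larger capacity, because the locations of the corrupted symbols are known to the receiver.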

1.5.1

Shannon's Noisy Coding Theorem

This marvellous theorem shows us that as long as we keep the transmission rate below the channel capacity we can achieve arbitrarily high transmission reliability. Take for example the binary symmetric channel with capacity C and a transmission rate R such that 0 < R < C. We then know that for sufficiently large n there exists a set of 2^{Rn} codewords of length n with error probability smaller than any given threshold. The beauty of Shannon's Theorem is that it assures us that such good codes exist; however, it does not tell us how to find them.

Chapter 2

Linear Codes
In order to be able to encode and decode efficiently it is convenient to add some structure to the code space. The mathematical framework of vector spaces provides exactly such a structure. Thus a linear code of length n is a subspace of the vector space F^n, where words are vectors.

2.1

Basics

If a linear code C is a k-dimensional subspace of F^n we say that C is an [n,k]-code. Once a code is found we can define its rate as k/n bits per channel use and its redundancy as n - k. When in addition to the dimensions we also know the minimum distance d we call C an [n,k,d]-code. Because we are describing the code in terms of vector subspaces, such a code can be specified by k codewords of length n, even though the code itself contains |F|^k codewords. This is true because once we have found k linearly independent codewords the subspace of dimension k they span is completely specified. The space savings can be enormous. The Hamming weight (or simply weight) of a vector v is the number of its nonzero entries and is denoted w_H(v). The minimum weight of the code C is the minimum nonzero weight among all codewords,

w_{min}(C) = w(C) = \min_{0 \neq x \in C} w_H(x).    (2.1)

The Hamming distance between two vectors x and y is the number of places in which those two vectors differ. We define the minimum distance of a code C as

d_{min}(C) = \min_{c_i, c_j \in C,\, c_i \neq c_j} d(c_i, c_j).    (2.2)

The Singleton Bound theorem shows us that there exists an upper bound on the minimum distance, namely d_{min}(C) \le n - k + 1. A linear code that meets the Singleton Bound with equality is called maximum distance separable. Notice that for linear codes we have d_{min}(C) = w_{min}(C).

The matrix G is a spanning matrix for the linear code C if the rowspace of G is equal to C. A generator matrix G of the [n,k]-code C is a k × n matrix whose rowspace equals C. It is clear that a generator matrix is a spanning matrix whose rows are linearly independent. For any code C we can define its dual code C^⊥ as

C^⊥ = \{x \in F^n \mid x \cdot c = 0 \ \forall c \in C\}.    (2.3)

If C is a linear [n,k]-code then we know that the dual C^⊥ is a linear [n, n-k]-code and (C^⊥)^⊥ = C. A generator matrix H for the dual code C^⊥ is called a check matrix for the code C. In general we can calculate the check matrix H from the generator matrix G of C. In order to simplify the calculations we bring G into its reduced row echelon form

G = ( I_{k \times k} \mid A_{k \times (n-k)} )    (2.4)

and calculate H as

H = ( -A^T_{(n-k) \times k} \mid I_{(n-k) \times (n-k)} ),    (2.5)

where over F_2 the minus sign can of course be dropped.

2.2

Encoding Information

For the encoding scheme we consider a linear [n,k]-code. The message k-tuples x are mapped to the codewords xG. We say that G is a standard generator matrix if the first k columns form a k × k identity matrix, and that G is a systematic generator matrix if k columns of a k × k identity matrix appear among the columns of G. A codeword always consists of the information set and the additional redundancy. This is best described with a little example. Consider the following systematic generator matrix G for the binary parity check [3,2,2]-code over F_2:

G = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}.

If we calculate the codeword by multiplying matrices,

xG = (x_1 \; x_2) \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} = (x_1 \; x_2 \; x_1 + x_2),

we see that the first two columns of the codeword hold the information set while the third column consists of the redundancy.
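A minimal C++ sketch of this encoding step (an illustration of the matrix multiplication over F_2, not the project's encoder) computes xG with XOR arithmetic and checks the result against the check matrix H = (1 1 1) of this code.

#include <array>
#include <cassert>
#include <cstdint>

// Encode a 2-bit message with G = (1 0 1; 0 1 1) over F_2.
std::array<uint8_t, 3> encode(const std::array<uint8_t, 2>& x) {
    static const uint8_t G[2][3] = {{1, 0, 1}, {0, 1, 1}};
    std::array<uint8_t, 3> c{};
    for (int j = 0; j < 3; ++j)
        for (int i = 0; i < 2; ++i)
            c[j] ^= x[i] & G[i][j];   // addition mod 2 is XOR
    return c;
}

int main() {
    std::array<uint8_t, 2> x = {1, 0};
    std::array<uint8_t, 3> c = encode(x);        // yields (1, 0, 1)
    // Check matrix H = (1 1 1): every codeword has even weight.
    assert((c[0] ^ c[1] ^ c[2]) == 0);
    return 0;
}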

2.3

Decoding

The simplest form of decoding that is always available is dictionary decoding, where we keep a list of all possible received words together with the codeword to which each would be decoded under MLD. It is obvious that this is not a practical solution because we would need a lot of storage for the dictionary. A more sophisticated approach is to think of the channel as an additional source which adds to the codeword c the error vector e = x - c ∈ F^n. The decoding problem given x then consists in estimating either e or c. From the definition of e we know that the received vector x and the error pattern e belong to the same coset x + C = e + C. Notice that we can calculate the cosets of C in advance. Actually we are looking for a vector of minimal weight in each coset, also known as a coset leader. If there is more than one coset leader we can make an arbitrary choice. When the word x is received, we know that the introduced error belongs to the coset x + C. Thus the most likely error pattern introduced by the channel is the coset leader ê chosen for this coset, and we decode x to the codeword ĉ = x - ê. We see that with coset leader decoding the errors actually corrected are the chosen coset leaders and that coset leader decoding is a Minimum Distance Decoding or MDD algorithm. One implementation of coset leader decoding is standard array decoding. This method takes advantage of the structure of the linear code and is therefore more efficient than dictionary decoding; see [1] for a detailed description. But again this method is not very practical because it also demands a great amount of storage. The second method of coset leader decoding is syndrome decoding, where we use the dual code C^⊥ and the check matrix H. The syndrome of a received word x is defined as s = Hx and is, for every received n-tuple x, a measure of whether or not it belongs to the code. As the syndrome of a codeword c is zero we have

Hx = H(c + e) = 0 + He = He.    (2.6)

This equation shows us that the received vector x and the corresponding error vector e introduced by the channel have the same syndrome. This means that the only thing we need to store is a dictionary which contains all syndromes s_j together with coset leaders e_j such that He_j = s_j. The decoding scheme then consists in first calculating the syndrome s_r of the received vector x. By looking it up in the syndrome dictionary we can decode the received vector x to the codeword ĉ = x - e_r. Because we know that C has minimum distance d, we can decode up to d - 1 erasures correctly with syndrome decoding. The following example illustrates the concept of syndrome decoding. The code C we will use is defined by the generator matrix

G = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix} = ( I_{2 \times 2} \mid A_{2 \times 2} ), \quad \text{with} \quad A_{2 \times 2} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.

The check matrix is then calculated as

H = ( A^T_{2 \times 2} \mid I_{2 \times 2} ) = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}.

By looking at the equation Hc = 0 we find the codewords of the code C to be

c_1 = (0, 0, 0, 0)^T, \quad c_2 = (1, 1, 0, 1)^T, \quad c_3 = (1, 0, 1, 0)^T, \quad c_4 = (0, 1, 1, 1)^T.

Considering all possible syndromes we can calculate the corresponding coset leaders and construct our syndrome lookup table:

syndrome (0, 0)^T  ->  coset leader (0, 0, 0, 0)^T
syndrome (0, 1)^T  ->  coset leader (0, 0, 0, 1)^T
syndrome (1, 0)^T  ->  coset leader (1, 0, 0, 0)^T
syndrome (1, 1)^T  ->  coset leader (0, 1, 0, 0)^T

If we now receive for example the vector y = (1, 1, 1, 1)^T, we calculate its syndrome

Hy = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix} (1, 1, 1, 1)^T = (1, 0)^T,

look up the coset leader (1, 0, 0, 0)^T and decode y to the codeword ĉ = y + (1, 0, 0, 0)^T = (0, 1, 1, 1)^T (over F_2 addition and subtraction coincide).
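The example translates into a few lines of C++ (again purely illustrative, with hypothetical names): the syndrome of the received word is computed with the check matrix H above and the coset leader found in the lookup table is added back onto the received word.

#include <array>
#include <cstdint>
#include <iostream>

using Word = std::array<uint8_t, 4>;

static const uint8_t H[2][4] = {{1, 1, 1, 0}, {0, 1, 0, 1}};

// Coset leaders indexed by the syndrome (s0, s1) read as the number 2*s0 + s1.
static const Word kCosetLeader[4] = {
    {0, 0, 0, 0},   // syndrome (0,0)
    {0, 0, 0, 1},   // syndrome (0,1)
    {1, 0, 0, 0},   // syndrome (1,0)
    {0, 1, 0, 0}};  // syndrome (1,1)

Word decode(const Word& y) {
    // Syndrome s = H y over F_2.
    uint8_t s0 = 0, s1 = 0;
    for (int j = 0; j < 4; ++j) {
        s0 ^= H[0][j] & y[j];
        s1 ^= H[1][j] & y[j];
    }
    const Word& e = kCosetLeader[2 * s0 + s1];
    Word c;
    for (int j = 0; j < 4; ++j) c[j] = y[j] ^ e[j];   // c = y - e over F_2
    return c;
}

int main() {
    Word y = {1, 1, 1, 1};
    Word c = decode(y);
    for (uint8_t bit : y) std::cout << int(bit);      // received: 1111
    std::cout << " -> ";
    for (uint8_t bit : c) std::cout << int(bit);      // decoded:  0111
    std::cout << '\n';
    return 0;
}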

Chapter 3

Low Density Parity Check (LDPC) Codes


3.1 Encoding

3.1.1 Standard Encoding Method

The standard method for encoding consists in calculating the product of the message vector x and the generator matrix G. At first glance this looks trivial; however, there are two problems with this encoding scheme. First of all the procedure often requires O(n^3) operations because the matrix G is not sparse. Another problem arises when we are not working with a particular code but with an ensemble of codes constructed from a degree distribution. These ensembles are usually defined in terms of ensembles of bipartite graphs, and with such a code definition we never consider the generator matrix G at all. We would therefore have to find an efficient algorithm to construct the generator matrix G out of the parity check matrix H, which is clearly non-trivial. All these problems lead us to a second approach, discussed below, which merely uses the parity check matrix H.

3.1.2

Advanced Encoding Method

As mentioned above we no longer look at a particular code but consider an ensemble of codes. The advantage of such an ensemble is that the sparseness of the parity check matrix H enables efficient encoding while the randomness ensures a robust code. The actual encoding scheme now consists of two steps. First comes the preprocessing step, which has to be done only once for a given code and which transforms the parity check matrix H into a certain shape. Then the actual encoding step uses the precalculated matrix. This second step can be accomplished efficiently if the matrix H is sparse, i.e. does not contain a lot of non-zero entries. The most straightforward way of doing the preprocessing step would probably be to bring the matrix H into lower triangular form by Gaussian elimination, as depicted in Fig. 3.1a. The codeword c can then be split up into the information part s, which is filled with the symbols of the message vector x, and the parity part p, which can be calculated by back-substitution. The problem with this approach is that the preprocessing step requires O(n^3) operations and that after this step the matrix H is not necessarily sparse anymore; the actual encoding calculations of the parity part p then take O(n^2) operations. A more sophisticated solution is to bring the matrix H only into an approximate lower triangular form, shown in Fig. 3.1b, by using column and row permutations alone. This ensures that even after the preprocessing step the matrix H is sparse. More precisely, after this step the matrix has the form

H = \begin{pmatrix} A & B & T \\ C & D & E \end{pmatrix}

with A of size (k-g) × (n-k), B of size (k-g) × g, T of size (k-g) × (k-g), C of size g × (n-k), D of size g × g and E of size g × (k-g), all of them sparse, and T lower triangular with ones along the diagonal. The constant g describes the gap between the lower triangular matrix T and the original matrix H of dimension k × n. The codeword is now split into three parts, namely the information part s, the first parity part p1 and the second parity part p2, which leads to c = (s, p1, p2). By multiplying the matrix H from the left with

\begin{pmatrix} I & 0 \\ ET^{-1} & I \end{pmatrix}

we get

H = \begin{pmatrix} A & B & T \\ ET^{-1}A + C & ET^{-1}B + D & 0 \end{pmatrix}.

The equation Hc = 0 then splits into the two equations

As + Bp_1 + Tp_2 = 0    (3.1)

and

(ET^{-1}A + C)s + (ET^{-1}B + D)p_1 = 0.    (3.2)

Defining φ := ET^{-1}B + D and assuming that φ is non-singular, the calculations of p_1 and p_2 are straightforward and given by

p_1 = φ^{-1}(ET^{-1}A + C)s    (3.3)

and

p_2 = T^{-1}(As + Bp_1).    (3.4)
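Equation (3.4) amounts to solving the sparse lower triangular system Tp_2 = b with b = As + Bp_1. A short C++ sketch of that substitution step over F_2 (illustrative only; the sparse row representation and the names are my own) shows why it is cheap: each parity bit costs one XOR per non-zero entry in its row of T.

#include <cstdint>
#include <vector>

// Sparse row representation: below_diag[i] holds the column indices j < i
// of the non-zero entries of T in row i (the unit diagonal is implicit).
using SparseRows = std::vector<std::vector<int>>;

// Solve T * p2 = b over F_2, where T is lower triangular with a unit
// diagonal. Cost is proportional to the number of non-zero entries of T.
std::vector<uint8_t> solve_lower_unit(const SparseRows& below_diag,
                                      const std::vector<uint8_t>& b) {
    std::vector<uint8_t> p2(b);
    for (std::size_t i = 0; i < b.size(); ++i)
        for (int j : below_diag[i])   // entries T[i][j] = 1 with j < i
            p2[i] ^= p2[j];           // p2[i] = b[i] + sum_j T[i][j] * p2[j]
    return p2;
}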

If after clearing E the matrix φ is singular, we can always perform some additional column permutations in order to remove the singularity. The authors of [4] show that the overall complexity of determining p1 is O(n + g^2), while determining p2 is O(n). From the same authors there exists another approach which produces exactly the same encoding but differs in terms of implementation. In this second scheme the codeword c has the form c = (p1, p2, s) and we try to find a matrix H that has approximate upper triangular form, depicted in Fig. 3.1c. This is again accomplished by row and column permutations only, in order to preserve the sparseness of the matrix H. The matrix H of the form

H = \begin{pmatrix} T & A & B \\ E & C & D \end{pmatrix}

will again be premultiplied from the left by another matrix in order to clear out E. We calculate

H = \begin{pmatrix} I & 0 \\ ET^{-1} & I \end{pmatrix} \begin{pmatrix} T & A & B \\ E & C & D \end{pmatrix} = \begin{pmatrix} T & A & B \\ 0 & ET^{-1}A + C & ET^{-1}B + D \end{pmatrix}.

This time we define φ := ET^{-1}A + C (the matrix multiplying p2) and assume that it is non-singular; otherwise the procedure to make φ non-singular is the same as above. The calculations of p1 and p2 can be done by solving the equations

Tp_1 + Ap_2 + Bs = 0    (3.5)

and

Ep_1 + Cp_2 + Ds = 0.    (3.6)

This actually consists of the following steps.

1. Set p2 = 0 and solve (3.5) for p1.
2. Evaluate y := Ep1 + Cp2 + Ds = Ep1 + Ds.
3. Set p2 = φ^{-1}y.
4. Solve (3.5) for p1 again, now using the p2 just computed.

A performance analysis shows that steps (1), (2) and (4) are O(n) and step (3) is O(g^2). The question is now whether there exists an algorithm that can efficiently transform any given parity check matrix H into the desired approximate lower or upper triangular form with a gap g as small as possible. Such an algorithm is given in [4] and operates on H by permuting rows and columns. A performance analysis shows that the algorithm is of complexity O(n^2); in many cases however we can do a lot better, namely on the order of O(n).
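To check that these four steps really produce a codeword (a short verification over F_2): step 1 yields the tentative value p_1 = T^{-1}Bs, so step 2 computes y = (ET^{-1}B + D)s. Step 3 then gives p_2 = φ^{-1}y, and step 4 replaces p_1 by T^{-1}(Ap_2 + Bs), so that (3.5) holds by construction. Substituting into (3.6) gives Ep_1 + Cp_2 + Ds = (ET^{-1}A + C)p_2 + (ET^{-1}B + D)s = y + y = 0.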

Figure 3.1: (a) H in lower triangular form; (b) H in approximate lower triangular form; (c) H in approximate upper triangular form.

3.2

Decoding


Chapter 4

Implementation
4.1 LDPC Encoder

4.2 LDPC Decoder


Chapter 5

Conclusion


Bibliography
[1] J. I. Hall, Notes on Coding Theory, Department of Mathematics, Michigan State University, 2003.

[2] Dominic Welsh, Codes and Cryptography, Clarendon Press, Oxford, 1998.

[3] Amin Shokrollahi, LDPC Codes: An Introduction, Digital Fountain Inc., Fremont, 2002.

[4] Thomas J. Richardson and Rüdiger L. Urbanke, "Efficient Encoding of Low-Density Parity-Check Codes", IEEE Transactions on Information Theory, vol. 47, no. 2, February 2001.
