IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 11, NOVEMBER 2012

Transactions Briefs
A Nonbinary LDPC Decoder Architecture With Adaptive Message Control
Weiguo Tang, Jie Huang, Lei Wang, and Shengli Zhou

Abstract: A new decoder architecture for nonbinary low-density parity-check (LDPC) codes is presented in this paper to reduce the hardware operational complexity in VLSI implementations. The low decoding complexity is achieved by employing adaptive message control (AMC), which dynamically trims the message length of belief information to reduce the number of memory accesses and arithmetic operations. To implement the proposed AMC, we develop the architecture of a horizontal sequential nonbinary LDPC decoder. Key components in the architecture have been designed with variable message lengths in mind to leverage the benefit of the proposed AMC. Simulation results demonstrate that the proposed nonbinary LDPC decoder architecture can significantly reduce hardware operations and power consumption compared with existing work, with negligible performance degradation.
Index Terms: Adaptive control, decoding, Galois field, Min-Sum, nonbinary low-density parity-check (LDPC) codes, VLSI architecture.

I. INTRODUCTION
Low-density parity-check (LDPC) codes [1], [2] are considered among the most powerful capacity-approaching codes. LDPC codes can be constructed in both the binary domain and Galois fields (i.e., $\mathrm{GF}(2^m)$, where $m > 1$). Binary LDPC codes have been studied extensively [3]–[5] and adopted in many communication protocols, such as DVB-T2 and WiMax. In general, a very long code length is required for binary LDPC codes to approach the channel capacity. Nonbinary LDPC codes constructed in Galois fields [6] offer improved performance at a moderate code length. In addition, nonbinary LDPC codes can be combined with high-order modulations [7], [17] to increase the bandwidth efficiency. Due to these features, the design and implementation of nonbinary LDPC codes have become critical for many emerging applications such as underwater acoustic communications [17].
A key challenge in the application of nonbinary LDPC codes is their high decoding complexity, as each symbol in the codeword is decoded using a long message (e.g., of length $2^m$ in $\mathrm{GF}(2^m)$). Much research effort aims at reducing the decoding complexity of nonbinary LDPC codes at the algorithm level [7]–[9], [11]. To deal with the problem that computational complexity increases exponentially with $m$, the extended Min-Sum (EMS) algorithm was proposed in [10], where only the $n_m$ most significant entries in a message are used in the decoding. A decoding technique developed in [12] conducts the EMS with a reduced complexity of $O(n_m \log_2 n_m)$ with minor performance degradation. It should be noted that these algorithm-level techniques do not explicitly consider the complexity of implementing nonbinary LDPC decoders. While many hardware-efficient VLSI architectures have been proposed for binary LDPC decoders [3]–[5], few results [14]–[16] exist for nonbinary LDPC decoders.
Unlike these existing works, which target hardware implementation cost, the focus of this paper is to reduce the hardware operational complexity in nonbinary LDPC decoder architectures. This enables efficient decoding suitable for emerging applications such as underwater acoustic sensor networks [17] that operate under severe resource (e.g., energy) constraints. It was reported [4] that memory accesses and arithmetic operations are the two major contributors to the operating cost in LDPC decoders. As the number of memory accesses and arithmetic operations is largely determined by the message length, reducing the message length is an effective way to achieve efficient decoding. Based on this fact, our past work [18] proposed adaptive message control (AMC) to reduce the decoding complexity. Unlike the EMS, which maintains a constant message length for every symbol, the proposed AMC adjusts the message length adaptively, which can reduce the message length at the required performance.
In this paper, we develop a horizontal sequential VLSI architecture for the nonbinary LDPC decoder employing the AMC. The design of the key components in this architecture, such as the variable node and check node update units, is optimized by exploiting variable length sorters, which can be dynamically configured in different functional units to accommodate variable message lengths. The AMC is implemented by a low-complexity approximation method to avoid hardware overheads and performance impact. A mapping table based approach is proposed to conduct searching operations with low complexity. We apply AMC to EMS to address the memory and throughput issues caused by the worst case message length. Note that AMC can also be employed in other decoding update rules such as the Min-Max algorithm. In addition, the proposed AMC can be generally applied to most existing decoder architectures (sequential, partially parallel, and fully parallel).

Manuscript received December 05, 2010; revised March 04, 2011 and June 14, 2011; accepted August 12, 2011. Date of publication September 15, 2011; date of current version July 27, 2012.
The authors are with the Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269 USA (e-mail: weiguo.tang@engr.uconn.edu).
Digital Object Identifier 10.1109/TVLSI.2011.2165346

II. ADAPTIVE MESSAGE CONTROL
A nonbinary LDPC code is defined by its parity check matrix (PCM) $H = [h_{ij}]$, which is an $M \times N$ sparse matrix with a low density of nonzero entries. The nonzero entries of $H$ take values from a Galois field $\mathrm{GF}(2^m)$. A length-$N$ vector $x$ with entries having values from $\mathrm{GF}(2^m)$ is a codeword if and only if $Hx = 0$. Each entry in the codeword is called a symbol, which is represented by a length-$2^m$ message recording the belief information, i.e., the probabilities of this noisy symbol being each of the $2^m$ elements in $\mathrm{GF}(2^m)$. An LDPC code with PCM $H$ can be represented by a bipartite graph called a Tanner graph, which consists of two groups of nodes: $N$ variable nodes $v_i$, $1 \le i \le N$, and $M$ check nodes $c_j$, $1 \le j \le M$. A variable node $v_i$ is connected to a check node $c_j$ if and only if $h_{ji}$ in the PCM is nonzero.
The major decoding operation at a check node is to refine the estimated message of a symbol based on the messages of other symbols that are correlated by the PCM $H$. The elementary check node operation in the Min-Sum (MS) decoding algorithm [10], [12] can be expressed as

$$r = q_1 \boxplus q_2 \tag{1}$$

where $q_1$ and $q_2$ are the length-$2^m$ variable node messages, and the basic operation $\boxplus$ is defined as $r_\gamma = \max_{\alpha + \beta = \gamma}(q_{1,\alpha} + q_{2,\beta})$, where $r_\gamma$, $q_{1,\alpha}$, and $q_{2,\beta}$ are the entries in messages $r$, $q_1$, and $q_2$ (in log domain) corresponding to $\gamma, \alpha, \beta \in \mathrm{GF}(2^m)$, respectively.
U.S. Government work not protected by U.S. copyright.


On the other hand, each variable node improves the fidelity of the belief information based on the messages received from multiple check nodes connected by the PCM. The elementary operation at a variable node can be expressed as [10], [12]

$$q = r_1 + r_2 \tag{2}$$

which sums the belief information associated with the same finite-field element.
The hardware complexity of implementing (1) and (2) is determined by the lengths of the variable node messages (VNMs) and check node messages (CNMs). Intuitively, when the distribution of belief information is more concentrated, a shorter message might be sufficient to retain most of the belief information.
The basic idea of AMC is to keep as few entries in a message as possible without incurring much information loss. It has been demonstrated [18] that message truncation can be implemented by finding the minimal $n$ that satisfies

$$q(n+1) \le q(1) + \ln\frac{1-\eta}{\eta} \tag{3}$$

where $\eta \in [0, 1]$ is the confidence factor that determines the tradeoff between performance and operational complexity [18], and $q(k)$ indicates the $k$th entry in the log-domain representation of the message $q$, sorted in descending order. Note that messages are usually normalized with respect to the largest entry to maintain numerical stability [12]. In this case, $q(1) = 0$ and the truncation criterion can be recast as

$$q(n+1) \le \ln\frac{1-\eta}{\eta} \tag{4}$$

where the threshold $\ln((1-\eta)/\eta)$ is used to truncate messages.
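The truncation rule in (4) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: a message is a list of hypothetical `(element, belief)` pairs already sorted in descending order and normalized so that the first belief is 0, and the endpoints $\eta = 0$ and $\eta = 1$ are excluded because the logarithm is undefined there.

```python
import math


def amc_truncate(message, eta):
    """Keep the shortest prefix whose next entry falls at or below ln((1-eta)/eta)."""
    if not 0.0 < eta < 1.0:
        raise ValueError("this sketch requires 0 < eta < 1")
    threshold = math.log((1.0 - eta) / eta)
    kept = []
    for element, belief in message:
        # The largest entry is always kept; stop at the first entry that
        # satisfies the criterion q(n + 1) <= threshold.
        if kept and belief <= threshold:
            break
        kept.append((element, belief))
    return kept


message = [(3, 0.0), (1, -2.5), (0, -7.0), (2, -9.5), (5, -12.0)]
loose = amc_truncate(message, 0.9999)  # threshold is about -9.21
tight = amc_truncate(message, 0.999)   # threshold is about -6.91
```

With these toy values, the larger confidence factor retains three entries and the smaller one retains two, which shows how the confidence factor trades message length against retained belief.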


The operation of AMC in a nonbinary LDPC decoder can be summarized as follows.
Initialization
Channel message truncation: $\hat{f}_j = \mathrm{AMC}(f_j)$, where $f_j$ is the received channel message.
Variable node message: $q_{ij} = \hat{f}_j$, where $q_{ij}$ is the variable node message from the variable node $v_j$ to the check node $c_i$.
Iterations
Permutation: $q_{ij} \rightarrow q_{ij} \ast h_{ij}$, where $h_{ij}$ is the nonzero PCM element, and the multiplication is conducted in $\mathrm{GF}(2^m)$.
Check node update:

$$r_{ij} = \mathop{\boxplus}_{k \in M(i) \setminus j} q_{ik} \tag{5}$$

where $M(i) \setminus j$ is the set of neighboring variable nodes of the check node $c_i$ excluding the variable node $v_j$.
Inverse permutation: $r_{ij} \rightarrow r_{ij} / h_{ij}$, where $h_{ij}$ is the nonzero PCM element, and the division is conducted in $\mathrm{GF}(2^m)$.
Variable node update:

$$q_{ij} = \mathrm{AMC}\Big(\hat{f}_j + \sum^{\mathrm{AMC}}_{k \in N(j) \setminus i} r_{kj}\Big) \tag{6}$$

where $N(j) \setminus i$ is the set of neighboring check nodes of the variable node $v_j$ excluding the check node $c_i$, and $\sum^{\mathrm{AMC}}$ means that the AMC is applied to all the intermediate results.
Tentative decoding:

$$\hat{c}_j = \arg\max\Big(f_j + \sum_{k \in N(j)} r_{kj}\Big). \tag{7}$$
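The permutation and inverse permutation steps above can be illustrated with a short Python sketch. It rests on assumptions that the text does not state: the field is $\mathrm{GF}(16)$ built from the primitive polynomial $x^4 + x + 1$, messages are lists of hypothetical `(element, belief)` pairs, and permutation is read in the usual way, as relabeling each entry by multiplying its field element by $h_{ij}$. All function names are illustrative.

```python
def gf_mul(a, b, m=4, poly=0b10011):
    """Multiply two elements of GF(2^m); poly is an assumed primitive polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly  # reduce modulo the field polynomial
    return result


def gf_inv(a, m=4, poly=0b10011):
    """Multiplicative inverse by exhaustive search, adequate for small fields."""
    for x in range(1, 1 << m):
        if gf_mul(a, x, m, poly) == 1:
            return x
    raise ZeroDivisionError("0 has no multiplicative inverse")


def permute(message, h):
    """Relabel each entry: element -> element * h (beliefs are unchanged)."""
    return [(gf_mul(element, h), belief) for element, belief in message]


def inverse_permute(message, h):
    """Undo permute: element -> element / h."""
    h_inv = gf_inv(h)
    return [(gf_mul(element, h_inv), belief) for element, belief in message]


message = [(1, 0.0), (6, -1.5), (9, -3.0)]
round_trip = inverse_permute(permute(message, 7), 7)
```

Because only the element labels change, the beliefs keep their sorted order through both steps.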

III. VLSI ARCHITECTURE FOR SEQUENTIAL AMC-BASED DECODER


In this section, we present a sequential nonbinary LDPC decoder architecture for the proposed AMC. We first discuss the top level architecture and then detail the design of several key components such as the variable length sorter, variable node update unit (VNU), and check node update unit (CNU). The proposed AMC can be applied to different nonbinary LDPC decoding schedules, such as sequential, partially parallel, and fully parallel architectures. We adopt the sequential schedule in this paper due to its high convergence speed [13].
A. Top Level Decoder Architecture
Fig. 1. Top-level block diagram of the horizontal nonbinary LDPC decoder employing the AMC.
Fig. 1 illustrates the proposed sequential nonbinary LDPC decoder architecture that performs the zigzag decoding schedule [13]. At the beginning of decoding, the truncated channel messages are loaded into the memory RAMb. Tentative decoding is conducted with the channel messages and the results are stored in the memory RAMc. If tentative decoding succeeds, decoding terminates and RAMc outputs the final result. Otherwise, the decoder initializes the intermediate check node messages (ICNMs) in the memory RAMd with the channel messages.
After the initialization, the CNU reads ICNMs from RAMd to perform the check node update described in (5). The CNMs from the CNU are inversely permuted, and then passed to the VNU along with the associated channel message to perform the variable node update described by (6). The VNU also generates the messages according to (7) for the tentative decoding unit (TDU) to conduct tentative decoding.
If tentative decoding does not succeed, the VNMs from the VNU are permuted and sent back to the CNU to update the corresponding ICNMs, which are then stored in RAMd. After this, the decoder proceeds to another variable node. This process continues until the checksum unit (CSU) decides to terminate because of either successful decoding or reaching the limit of iterations.
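The success test behind the termination decision can be sketched as a syndrome check of the codeword condition $Hx = 0$ from Section II. The paper does not detail the CSU, so this Python fragment is only a sketch under assumptions: arithmetic is in $\mathrm{GF}(16)$ with the assumed primitive polynomial $x^4 + x + 1$, and the small matrix `H` and the function names are invented for illustration.

```python
def gf_mul(a, b, m=4, poly=0b10011):
    """Multiply two elements of GF(2^m); poly is an assumed primitive polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly
    return result


def syndrome_is_zero(H, x, m=4, poly=0b10011):
    """Return True when every parity check of H is satisfied by the symbols in x."""
    for row in H:
        acc = 0
        for h, symbol in zip(row, x):
            if h:  # only nonzero PCM entries contribute
                acc ^= gf_mul(h, symbol, m, poly)  # field addition is XOR
        if acc:
            return False
    return True


# Toy 2 x 3 parity check matrix, not taken from the paper.
H = [[1, 2, 0],
     [0, 3, 1]]
ok = syndrome_is_zero(H, [2, 1, 3])
bad = syndrome_is_zero(H, [2, 1, 1])
```

In a decoder loop, a true result would correspond to successful decoding; otherwise iterations continue until the iteration limit is reached.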
B. Variable Length Sorter
Sorting is an important operation in the CNU and VNU. In the CNU, the sorter chooses the $n_m$ elements with the largest belief information, as required in (5). In the VNU, truncation as shown in (6) discards the smaller elements, and the remaining elements are sorted in order as the input of the CNU. The basic sorting operation is to insert a new element into a vector according to its magnitude. We take the sorter in the CNU as an example to explain its design.
The proposed variable length sorter is illustrated in Fig. 2, where only the data path of belief information is shown, as it determines the shifting operation. Other data, such as the Galois field element and vector index, are associated with the belief information and move along with it. The sorter consists of $n_m^{*}$ stages to accommodate the longest message length. Each stage contains a comparator and a register. Each time a new belief information value comes in, it is compared with all the $n_m$ belief information values in the active stages in parallel. The content of the first stage having a larger belief information will be replaced by the new input, whereas its original belief information will be shifted to the right stage, and so on. With AMC, the message length reduces gradually. Thus only the last $n_m$ stages are enabled. This is controlled by the longer one of the two messages in the CNU or VNU.
As indicated, the major operations in the sorter are comparisons and shifts. The number of these operations is mainly determined by the length of the message to be processed. The proposed AMC dynamically adjusts the message length during the iterations, thereby reducing the complexity of sorting.
Fig. 2. Implementation of the variable length sorter, where only the data path of the belief information is shown for simplicity.
Fig. 3. Implementation of the VNU.
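A behavioral model helps make the insert-and-shift operation concrete. The Python class below is a software sketch only, not a description of the circuit in Fig. 2. It assumes the stages hold beliefs in descending order, that a new value takes the position of the first stage holding a smaller belief, and that values shifted past the last active stage are dropped. The class and method names are hypothetical.

```python
class VariableLengthSorter:
    """Software model of an insertion sorter with a configurable active length."""

    def __init__(self, max_stages):
        self.max_stages = max_stages
        self.active = max_stages
        self.stages = []  # (belief, element) pairs, largest belief first

    def configure(self, active):
        """Enable only `active` stages, as when AMC shortens the message."""
        if not 1 <= active <= self.max_stages:
            raise ValueError("active length out of range")
        self.active = active
        del self.stages[active:]

    def insert(self, belief, element):
        """Insert one value; return False if it does not fit in the active stages."""
        # Hardware compares against all active stages in parallel; the loop
        # below models the same decision sequentially.
        position = len(self.stages)
        for i, (stored, _) in enumerate(self.stages):
            if belief > stored:
                position = i
                break
        if position >= self.active:
            return False
        self.stages.insert(position, (belief, element))  # shift right
        del self.stages[self.active:]                    # drop the overflow
        return True


sorter = VariableLengthSorter(max_stages=4)
sorter.configure(3)
for belief, element in [(-2.0, 5), (0.0, 1), (-5.0, 7), (-1.0, 2)]:
    sorter.insert(belief, element)
```

Each insertion touches at most `active` stages, so shortening the active length reduces the number of comparisons and shifts per insertion.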
C. Searching
Searching is another major operation in the CNU and VNU. In the VNU, the belief information in one message needs to search for its counterpart in another message to perform (2). Thus, the location (e.g., index) of a Galois field element needs to be determined. The mapping information between the Galois field elements and their indexes in a message is maintained by a mapping table of $2^m$ words with a word length of $m$ bits. For example, if the $i$th entry in the message $q$ is associated with the Galois field element $\alpha$, then $i$ is written into the mapping table at the address $\alpha$. When an element (e.g., $\alpha$) needs to be searched in message $q$, the content at the address $\alpha$ of the mapping table is read out, which provides the location of $\alpha$ in $q$. The table is initialized with $-1$. Thus a negative value indicates that $\alpha$ is not in the message $q$.
In the CNU, $n_m$ different elements need to be generated according to (1). The output of the sorter is considered valid if and only if the corresponding Galois field element has not been generated previously. A searching operation has to be conducted on the current output to compare it with the previous outputs of the sorter. As only the existence of the element needs to be determined, a mapping table with only $2^m$ bits can be constructed. Since this table mainly records the status (i.e., existence or not) of Galois field elements, it is referred to as the status table in the CNU.
Note that the proposed mapping-based searching scheme has naturally low complexity in comparison with direct searching of the message, especially when the message is very long.
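The mapping table and the status table can be sketched as follows. This is an illustrative Python model under assumptions: a message is a list of hypothetical `(element, belief)` pairs, the mapping table is a plain list with one word per field element, and the status table is packed into a single integer with one bit per element.

```python
def build_mapping_table(message, m):
    """Return a table of 2^m words; word[element] is the index of element in message."""
    table = [-1] * (1 << m)  # -1 marks an element that is not in the message
    for index, (element, _belief) in enumerate(message):
        table[element] = index
    return table


def find(table, element):
    """Single table read replacing a linear search; None when the element is absent."""
    index = table[element]
    return index if index >= 0 else None


def mark_generated(status, element):
    """Set the status bit of element in a 2^m-bit status table."""
    return status | (1 << element)


def already_generated(status, element):
    """Check the status bit of element."""
    return bool((status >> element) & 1)


message = [(5, 0.0), (2, -1.5), (7, -4.0)]
table = build_mapping_table(message, m=3)
status = mark_generated(0, 5)
```

A lookup costs one table read regardless of message length, whereas a direct search would scan up to the full message.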
D. Variable Node Update Unit
The function of the VNU is to compute (2), where the belief information associated with the same Galois field element in two messages is summed. Fig. 3 shows the block diagram of the VNU. In general, the two messages $r_1$ and $r_2$ may have different lengths due to the AMC. The mapping table (see Section III-C) is utilized to sum the entries associated with the same finite field element. The final results are sent to the variable length sorter with the associated Galois field elements, and AMC is then applied to the outputs of the sorter; i.e., starting from the largest entry in the sorter, when an entry is smaller than the threshold [see (4)], the following entries are discarded. This reduces the hardware operations in the VNU such as real number additions/subtractions (for belief information), comparisons, and sorting operations.
Fig. 4. Implementation of the CNU.
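The VNU data flow described above (table lookup, summation, sorting, truncation) can be sketched in Python. Several choices here are assumptions rather than details from the paper: messages are lists of hypothetical `(element, belief)` pairs, Python's built-in sort stands in for the hardware sorter, and an element missing from one truncated message is given a fixed offset `missing`, which is a placeholder because the text does not specify how absent entries are handled.

```python
import math


def vnu_update(r1, r2, m, eta, missing=-20.0):
    """Sum two variable-length messages per element, then sort and truncate."""
    table = [-1] * (1 << m)  # mapping table for r2, -1 means absent
    for index, (element, _belief) in enumerate(r2):
        table[element] = index

    sums = {}
    for element, belief in r1:
        j = table[element]
        sums[element] = belief + (r2[j][1] if j >= 0 else missing)
    for element, belief in r2:
        if element not in sums:  # present in r2 only
            sums[element] = belief + missing

    ordered = sorted(sums.items(), key=lambda item: item[1], reverse=True)
    top = ordered[0][1]
    threshold = math.log((1.0 - eta) / eta)
    output = []
    for element, belief in ordered:
        normalized = belief - top  # largest entry becomes 0
        if output and normalized <= threshold:
            break  # criterion (4): discard this entry and all that follow
        output.append((element, normalized))
    return output


r1 = [(3, 0.0), (1, -2.0), (7, -6.0)]
r2 = [(1, 0.0), (3, -1.0)]
q = vnu_update(r1, r2, m=3, eta=0.999)
```

In this toy case the element that appears in only one input falls below the threshold and is discarded, so the output is shorter than the longer input.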
E. Check Node Update Unit

The CNU produces the $n_2$ largest elements among all the $n_1 \times n_2$ combinations from two input messages with lengths $n_1$ and $n_2$ (assume $n_1 \le n_2$), as described in (1). It was shown [12] that the complexity of the CNU can be reduced to $2n_2$ additions and insertion operations if the two input messages are sorted in descending order. We adopt this method in the CNU design.
The complexity of the CNU depends on two factors. The first factor is the number of outputs (i.e., $n_2$), which determines how many insertion operations are needed. The second factor is the length of the sorter, which determines how many comparisons and shifting operations are involved in each insertion operation. With the proposed AMC, the message length decreases, and thus $n_2$ becomes smaller during the iterations. This reduces the number of arithmetic operations and insertions. To address the second factor, the sorter is configured to the same length as the shorter message ($n_1$ in the above example), thereby reducing the complexity of each insertion operation. Note that the CNU also performs additions in the Galois field. However, additions in the Galois field are essentially bit-wise XOR operations, so their complexity is much lower than that of the real number additions of belief information.
The proposed design of the CNU is shown in Fig. 4. A variable length sorter as described in Section III-B is employed to perform shifting and comparison operations. Searching operations are based on the method discussed in Section III-C.
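The selection of the largest distinct-element sums from two sorted messages can be sketched with a best-first search. Note the substitution: this Python fragment uses a binary heap from the standard library in place of the hardware sorter, so it is not the method of [12] or the circuit in Fig. 4. It only reproduces the same selection rule, including the status-table check that rejects an element generated before. Messages are assumed to be non-empty lists of hypothetical `(element, belief)` pairs in descending belief order.

```python
import heapq


def cnu_elementary(q1, q2, n_out):
    """Return up to n_out (element, belief) pairs with the largest sums, one per element."""
    status = 0      # status table: bit e is set once element e has been output
    output = []
    visited = {(0, 0)}
    heap = [(-(q1[0][1] + q2[0][1]), 0, 0)]  # max-heap via negated sums
    while heap and len(output) < n_out:
        negated, i, j = heapq.heappop(heap)
        element = q1[i][0] ^ q2[j][0]        # GF(2^m) addition is XOR
        if not (status >> element) & 1:      # valid only if not generated before
            status |= 1 << element
            output.append((element, -negated))
        # Because both inputs are sorted, the next-largest candidates are the
        # neighbors to the right of and below the popped position.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(q1) and nj < len(q2) and (ni, nj) not in visited:
                visited.add((ni, nj))
                heapq.heappush(heap, (-(q1[ni][1] + q2[nj][1]), ni, nj))
    return output


q1 = [(0, 0.0), (1, -1.0), (2, -4.0)]
q2 = [(0, 0.0), (1, -2.0), (2, -3.0)]
r = cnu_elementary(q1, q2, n_out=3)
```

The loop stops as soon as `n_out` valid outputs exist, so shorter messages translate directly into fewer additions and insertions.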
IV. SIMULATION RESULTS AND DISCUSSIONS
In this section, we evaluate the proposed AMC-based decoder using
an irregular GF(16) nonbinary LDPC code that has been used in a multicarrier underwater acoustic system [17]. We will compare key operations such as memory access, real number addition (for belief information), comparison, shifting, Galois field multiplication and division
(for permutation and inverse permutation) with the existing decoder
architectures. The hardware-related measures are also studied with the
overheads of the AMC being considered. In all simulations the codewords are transmitted through the binary AWGN channel. The maximum number of decoding iterations is set to 10.

Fig. 5. Performance of joint AMC and EMS with different $\eta$ and $n_m$.
A. Decoding Performance
Our past work [18] has shown that applying AMC to MS incurs less than 0.1 dB loss at the block-error-rate (BLER) of $10^{-4}$ with $\eta = 0.9999$, and the performance loss increases to 0.35 dB when $\eta = 0.999$. In comparison, reducing $n_m$ in EMS from 16 to 12 and 10 results in performance losses of 0.15 and 0.3 dB, respectively, at the same BLER level. One issue of AMC-MS is that the hardware complexity and throughput are determined by the worst case message length, even though the operational complexity can be significantly reduced. To address this issue, we propose to apply AMC with EMS in this paper. Fig. 5 shows the BLER performance of AMC with EMS initialization. The AMC with $\eta = 0.9999$ and $\eta = 0.999$ incurs about 0.03 and 0.1 dB performance loss, respectively, relative to the conventional EMS. However, AMC can further reduce the hardware operations of EMS under the same throughput and memory size.
B. Complexity Reduction by the AMC
We choose three cases: EMS with $n_m = 10$, AMC-MS with $\eta = 0.999$, and AMC-EMS with $\eta = 0.9999$ and $n_m = 10$, to evaluate the reduction in hardware operational complexity. The results are listed in Table I as compared with MS at an SNR of 2.8 dB, where the BLER can reach $10^{-5}$. Clearly, AMC-MS is more effective in reducing hardware operations than EMS; i.e., the number of major operations in AMC-MS is less than 50% of that of EMS. By applying AMC to EMS, the hardware operations can be further reduced at the expense of negligible performance loss (see Fig. 5). It is expected that by reducing hardware operations, the proposed AMC will also enable a significant reduction in energy consumption.

TABLE I
HARDWARE OPERATION REDUCTION OF EMS, AMC-EMS, AND AMC-MS COMPARED WITH MS

TABLE II
ESTIMATED HARDWARE MEASURES OF MS, EMS, AMC-EMS ($m = 4$, $n_m = 10$, $K = 5$, $\eta = 0.9999$)

C. Comparison of Hardware Implementations
Although the physical implementation of the proposed decoder architecture is beyond the scope of this paper, hardware-related measures are necessary to evaluate the decoder efficiency. In this subsection, we study the hardware cost, throughput (latency), and power savings of the proposed decoder architecture. Similar to the existing work [14], [15], these results are estimated and compared with existing decoders.
As discussed in Section III, the AMC-based decoder needs sorters for truncation operations. However, the number of hardware operations is reduced at the CNU and VNU due to AMC, which can offset the overhead of the sorters. Note that the CNU dominates the decoder hardware complexity. Thus, we focus on this unit for comparing the hardware cost. The major components in the CNU are: 1) status table and message RAM; 2) registers; 3) MUX; 4) comparators; 5) adders; and 6) decoder for sorter control as shown in Fig. 2. Compared with EMS, AMC-EMS needs an additional decoder that generates the control signal for the variable sorter. As shown in Table II, this decoder takes about 2.2% of the total transistor count in the CNU. Also, the slightly larger memory size (3.76%) in AMC-EMS compared with EMS comes from the overhead of recording the length of each variable message, while EMS keeps messages at a constant length. The overhead in the VNU for AMC is also very small, consisting only of a decoder and a comparator. Both AMC-EMS and EMS ($n_m = 10$ and $K = 5$ bits quantization) save about 30% of the hardware cost as compared with MS. Note that since the proposed AMC is applied to the CNU and VNU only, the hardware cost of the other units in Fig. 1 is more or less the same across the different implementations, and thus we do not include them in the comparison.
For all three decoders in Table II, the critical path is in the sorter, which comprises a comparator, an XOR gate, and a 2-to-1 MUX. The critical path consists of twelve 2-input XOR gates for MS and EMS, assuming a 5-bit ripple-carry comparator implementation. For AMC-EMS, the cell enable (CE) signal of the sorter is generated by the outputs from both the decoder and comparator, thereby adding one more 2-input AND gate on the critical path. In a fully parallel decoder implementation, all the messages are updated simultaneously. Thus the worst case message length determines the number of clock cycles to finish one decoding iteration, although the message length for AMC-EMS will be reduced during the iterations. The AMC-EMS has slightly lower throughput (about 4%) than EMS due to the longer critical path in the sorter, while it takes the same number of clock cycles to finish one iteration. As MS needs to compute 32 messages while AMC-EMS (and EMS) only needs to compute 20 in the CNU, the throughput can actually be improved by $1.5\times$ by truncation as compared with MS. On the other hand, in a sequential implementation the messages are updated in series. As the message length varies considerably in AMC-EMS, so does the number of clock cycles spent on updating the messages in each iteration. Thus, in a sequential architecture, the total number of clock cycles for finishing decoding iterations is determined by the average message length. The average AMC-EMS message length is about 60% of that of EMS, indicating about $1.6\times$ higher throughput than EMS due to a smaller amount of message computation. Compared with MS, sequential AMC-EMS can improve the throughput by about $2.4\times$.
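The throughput ratios quoted above can be approximately reproduced with simple arithmetic. The Python sketch below is one possible reading, not the authors' calculation: it assumes throughput scales with the clock rate divided by the number of messages computed (or by the average message length in the sequential case), and that the 4% critical-path penalty applies in both cases. The variable names are illustrative.

```python
# Figures quoted in the text.
ms_messages = 32         # messages computed in the CNU by MS
ems_messages = 20        # messages computed by EMS and AMC-EMS
clock_penalty = 0.04     # AMC-EMS slowdown from the longer critical path
avg_length_ratio = 0.60  # average AMC-EMS message length relative to EMS

# Assumption: throughput is proportional to clock rate / work per iteration.
parallel_gain_vs_ms = (ms_messages / ems_messages) * (1 - clock_penalty)
sequential_gain_vs_ems = (1 / avg_length_ratio) * (1 - clock_penalty)

print(f"fully parallel, AMC-EMS vs MS: {parallel_gain_vs_ms:.2f}x")
print(f"sequential, AMC-EMS vs EMS: {sequential_gain_vs_ems:.2f}x")
```

Under these assumptions the two gains come out close to the reported 1.5 and 1.6, and the product of those two rounded figures equals the reported 2.4 for sequential AMC-EMS relative to MS.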
Although AMC-EMS has slightly higher hardware cost and lower throughput (in a fully parallel implementation) than EMS, it enables significant power savings critical to emerging applications such as underwater acoustic communications [17], where energy is one of the scarcest resources. Note that so far no measured power consumption results from implementations of nonbinary LDPC decoders can be found in the literature. However, the power consumption can be reasonably estimated by considering the major hardware operations that dominate the power consumption, such as memory access, addition, comparison, shifting, and Galois field multiplication and division, as shown in Table I. Assume that each kind of operation consumes about the same power for MS, EMS, and AMC-EMS. Note that this assumption is conservative because EMS and AMC-EMS use a smaller memory size. As shown in Table II, among all the major operations, the AMC-EMS shows the smallest reduction in memory accesses and real additions, which indicates that AMC-EMS can potentially reduce the power consumption in memory accesses and real additions by about 65% and 50% as compared with MS and EMS, respectively, under the same clock frequency. The power reduction in other operations, such as comparison and shifting, is even larger. Thus, we expect the overall power reduction of AMC-EMS to reach about 65% and 50% compared with MS and EMS, respectively, assuming an equivalent implementation.
V. CONCLUSION
In this paper, we proposed a new nonbinary LDPC decoding architecture based on AMC, which can significantly reduce the hardware operations and power consumption in VLSI implementations by adaptively adjusting the message length of belief information while maintaining the required performance. A truncation scheme was developed to implement the proposed AMC. The architecture design was optimized to fully exploit the benefit of the proposed AMC. Further work is being directed towards a full-fledged ASIC implementation for underwater acoustic sensor networks subject to stringent energy constraints.

REFERENCES
[1] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA: MIT Press, 1963.
[2] D. J. C. MacKay and R. Neal, "Good codes based on very sparse matrices," in Proc. Cryptography Coding, 5th IMA Conf., 1995, pp. 100–111.
[3] T. Zhang and K. K. Parhi, "VLSI implementation-oriented (3,k)-regular low-density parity-check codes," in Proc. IEEE Workshop Signal Process. Syst. (SIPS), 2001, pp. 25–36.
[4] M. M. Mansour and N. R. Shanbhag, "High-throughput LDPC decoders," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 11, no. 6, pp. 976–996, Dec. 2003.
[5] J. Sha, Z. Wang, M. Gao, and L. Li, "Multi-Gb/s LDPC code design and implementation," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 2, pp. 262–268, Feb. 2009.
[6] M. Davey and D. J. C. MacKay, "Low density parity check codes over GF(q)," IEEE Commun. Lett., vol. 2, no. 6, pp. 165–167, Jun. 1998.
[7] R. Peng and R. Chen, "Application of nonbinary LDPC codes for communication over fading channels using higher order modulations," in Proc. IEEE Globecom, 2006, pp. 1–5.
[8] V. Savin, "Min-Max decoding for nonbinary LDPC codes," in Proc. IEEE ISIT, 2008, pp. 960–964.
[9] H. Song and J. R. Cruz, "Reduced-complexity decoding of q-ary LDPC codes for magnetic recording," IEEE Trans. Magn., vol. 39, no. 2, pp. 1081–1087, Mar. 2003.
[10] D. Declercq and M. Fossorier, "Decoding algorithms for nonbinary LDPC codes over GF(q)," IEEE Trans. Commun., vol. 55, no. 4, pp. 633–643, Apr. 2007.
[11] H. Wymeersch, H. Steendam, and M. Moeneclaey, "Log-domain decoding of LDPC codes over GF(q)," in Proc. IEEE ICC, 2004, pp. 772–776.
[12] A. Voicila, D. Declercq, F. Verdier, M. Fossorier, and P. Urard, "Low-complexity, low-memory EMS algorithm for non-binary LDPC codes," in Proc. ICC, 2007, pp. 671–676.
[13] Y. Chang, A. Vila Casado, M. Chang, and R. D. Wesel, "Lower-complexity layered belief-propagation decoding of LDPC codes," in Proc. ICC, 2008, pp. 1155–1160.
[14] X. Zhang and F. Cai, "Efficient partial-parallel decoder architecture for quasi-cyclic nonbinary LDPC codes," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 58, no. 2, pp. 402–414, Feb. 2010.
[15] J. Lin, J. Sha, and Z. Wang, "An efficient VLSI architecture for nonbinary LDPC decoders," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 1, pp. 51–55, Jan. 2010.
[16] A. Voicila, F. Verdier, D. Declercq, M. Fossorier, and P. Urard, "Architecture of a low-complexity non-binary LDPC decoder for high order fields," in Proc. ISCIT, 2007, pp. 1201–1206.
[17] J. Huang, S. Zhou, and P. Willett, "Nonbinary LDPC coding for multicarrier underwater acoustic communication," IEEE J. Sel. Areas Commun., vol. 26, no. 9, pp. 1684–1696, Sep. 2008.
[18] W. Tang, J. Huang, L. Wang, and S. Zhou, "Nonbinary LDPC decoding by Min-Sum with adaptive message control," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2011, pp. 3164–3167.
