Low Power Viterbi Decoder For Trellis Coded Modulation Using T-Algorithm
ISSN 2229-5518
Abstract: This paper presents a low-power Viterbi decoder, paired with a convolutional encoder, for trellis coded modulation. Convolutional encoding with Viterbi decoding is a good forward error correction technique suitable for channels affected by noise degradation. The Viterbi decoder architecture uses the proposed precomputation T-algorithm, which effectively reduces the power consumption with a negligible decrease in speed. Implementation results are given for a rate-3/4 convolutional code with constraint length 7 used for trellis coded modulation. This architecture reduces the power consumption by up to 70% without any performance loss, while the degradation in clock speed is negligible.
Key words: Convolutional code, T-algorithm, trellis coded modulation (TCM), Viterbi decoder, VLSI.
$$= \min\{\,P_{0,0}(n-1)+B_{0,0}(n),\ P_{0,1}(n-1)+B_{0,1}(n),\ \ldots,\ P_{0,p}(n-1)+B_{0,p}(n),\ P_{1,0}(n-1)+B_{1,0}(n),\ \ldots$$
IJSER © 2012
http://www.ijser.org
International Journal of Scientific & Engineering Research Volume 3, Issue 9, September-2012 3
$$\ldots,\ \min(\text{Ps}(n-1)\ \text{in cluster } 2) + \min(\text{Bs}(n)\ \text{for cluster } 2),\ \ldots,\ \min(\text{Ps}(n-1)\ \text{in cluster } m) + \min(\text{Bs}(n)\ \text{for cluster } m)\}.$$
The minimum Bs for each cluster can be easily obtained from the BMU or TMU, and the minimum Ps at time n-1 in each cluster can be precalculated at the same time as the ACSU is updating the new Ps for time n. Theoretically, when we continuously decompose Ps(n-1), Ps(n-2), ..., the precomputation scheme can be extended to q steps, where q is any positive integer less than n. Hence Popt(n) can be calculated directly from Ps(n-q) in q cycles.
... determine the survivor path (the path with the minimum metric) for each state. If the T-algorithm is employed in the VD, the iteration bound is slightly longer than $T_{ACSU}$ because there will be another two-input comparator in the loop to compare the new Ps with a threshold value obtained from the optimal path metric and a preset T, as shown in (3).
$$T_{bound} = T_{adder} + T_{p\text{-in\_comp}} + T_{2\text{-in\_comp}}. \qquad (3)$$
To achieve the iteration bound expressed in (3), for the precomputation in each pipelining stage, we limit the comparison to be among only p or 2p metrics. To simplify our evaluation, we assume that each stage reduces the number of metrics to 1/p (or $2^{-R}$) of its input metrics. The number of precomputation steps $q_b$ meeting the theoretical iteration bound should satisfy $(2^R)^{q_b} \ge 2^{k-1}$. Therefore $q_b \ge (k-1)/R$, and $q_b$ is expressed as (4), with a ceiling function.
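Equation (4) itself is not reproduced in the text. As a rough check, assuming it is the smallest integer that satisfies the inequality, that is $q_b = \lceil (k-1)/R \rceil$, the following Python sketch evaluates it. The function name `min_precomputation_steps` is hypothetical, and taking R = 3 for the rate-3/4 code of the abstract assumes a code rate of the form R/(R+1).

```python
def min_precomputation_steps(k: int, r: int) -> int:
    """Smallest integer q_b with (2**r)**q_b >= 2**(k - 1).

    Assumption: equation (4) is q_b = ceil((k - 1) / r), as implied by
    the inequality q_b >= (k - 1) / r in the text.
    """
    if k < 1 or r < 1:
        raise ValueError("k and r must be positive integers")
    return -(-(k - 1) // r)  # ceiling division using integers only


# Constraint length 7 and rate 3/4 (R = 3 assumed), as in the abstract.
q_b = min_precomputation_steps(k=7, r=3)

# q_b satisfies the inequality, and q_b - 1 does not.
assert (2 ** 3) ** q_b >= 2 ** (7 - 1)
assert (2 ** 3) ** (q_b - 1) < 2 ** (7 - 1)
print(q_b)  # 2
```

Under these assumptions the result is 2, which agrees with the 2-step precomputation design chosen for the design case.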
$$N_{overhead} = 2^{0} + 2^{(k-1)/q} + 2^{2(k-1)/q} + \cdots + 2^{(q-1)(k-1)/q} = \frac{2^{q\cdot(k-1)/q} - 1}{2^{(k-1)/q} - 1} = \frac{2^{k-1} - 1}{2^{(k-1)/q} - 1} \qquad (5)$$
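The closed form in (5) is a geometric series with ratio $2^{(k-1)/q}$. A minimal Python sketch comparing the term-by-term sum with the closed form is given here; the function names are hypothetical, and the sketch assumes that q divides k-1 so that every exponent is an integer.

```python
def overhead_series(k: int, q: int) -> int:
    """Sum 2**0 + 2**((k-1)/q) + ... + 2**((q-1)(k-1)/q) term by term."""
    if (k - 1) % q != 0:
        raise ValueError("this sketch assumes q divides k - 1")
    step = (k - 1) // q
    return sum(2 ** (i * step) for i in range(q))


def overhead_closed_form(k: int, q: int) -> int:
    """Closed form (2**(k-1) - 1) / (2**((k-1)/q) - 1) from (5)."""
    if (k - 1) % q != 0:
        raise ValueError("this sketch assumes q divides k - 1")
    step = (k - 1) // q
    return (2 ** (k - 1) - 1) // (2 ** step - 1)


# Constraint length k = 7, for every q that divides k - 1 = 6.
for q in (1, 2, 3, 6):
    assert overhead_series(7, q) == overhead_closed_form(7, q)
    print(q, overhead_series(7, q))
```

For k = 7 the overhead count grows from 1 at q = 1 to 9 at q = 2, 21 at q = 3, and 63 at q = 6, so the overhead rises quickly with the number of precomputation steps.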
$$P_{opt}(n) = \min\{\min(\text{even Ps}(n-1)) + \min(\text{even Bs}(n)),\ \min(\text{odd Ps}(n-1)) + \min(\text{odd Bs}(n))\}.$$
The functional diagram of the 1-step pre-computation scheme is shown in Fig. 5. In general, the PPAU (path metric purge algorithm) has to wait for the new Ps from the ACSU to calculate the optimal path metric [12], while in Fig. 5 the optimal path metric is calculated directly from the Ps in the previous cycles at the same time as the ACSU is calculating the new Ps. The details of the PPAU are shown in Fig. 6.
The critical path of the 1-step pre-computation scheme is
$$T_{1\text{-step-pre-T}} = 2T_{Adder} + 2T_{4\text{-in\_comp}} + 3T_{2\text{-in\_comp}} \qquad (6)$$
The hardware overhead of the 1-step pre-computation scheme is about 4 adders, which is negligible. Compared with the SEPC-T algorithm, however, the critical path of the 1-step pre-computation scheme is still long [12]. In order to further shorten the critical path, we explore the 2-step pre-computation design next.
B. Two step precomputation
a. ACSU design
... while the odd states extend to states with lower indices (the MSB is '0' in Fig. 3). This information allows us to obtain the 2-step pre-computation data path. This process is straightforward, although the mathematical details are tedious. For clarity, we only provide the main conclusion here.
The states are further grouped into 4 clusters as described by (7). The BMs are categorized in the same way and are described by (8).
$$\text{cluster2} = \{P_m \mid 0 \le m \le 63,\ m \bmod 4 = 1\}$$
$$\text{cluster3} = \{P_m \mid 0 \le m \le 63,\ m \bmod 4 = 3\}$$
$$\ldots + \min(\text{BMG1}(n-1)),\ \min(\text{cluster2}(n-2)) + \min(\text{BMG3}(n-1)),\ \min(\text{cluster3}(n-2)) + \min(\text{BMG2}(n-1))\} + \min(\text{even Bs}(n)),$$
$$\min\{\min(\text{cluster0}(n-2)) + \min(\text{BMG1}(n-1)),\ \min(\text{cluster1}(n-2)) + \min(\text{BMG0}(n-1)),\ \min(\text{cluster2}(n-2)) + \min(\text{BMG2}(n-1)),\ \min(\text{cluster3}(n-2)) + \min(\text{BMG3}(n-1))\} + \min(\text{odd Bs}(n)) \qquad (9)$$
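The pre-computation expressions for Popt(n) in this section replace a minimum over sums with a sum of minima inside each cluster. A minimal Python sketch of why this gives the same value is shown here. It assumes that every state in a cluster is extended by the same group of branch metrics, which is an assumption of the sketch and is not stated in the text. The function names and all metric values are hypothetical.

```python
from itertools import product


def exact_min(cluster_pms, cluster_bms):
    """Brute force: minimum of P + B over every state/branch pair of every cluster."""
    return min(
        p + b
        for pms, bms in zip(cluster_pms, cluster_bms)
        for p, b in product(pms, bms)
    )


def precomputed_min(cluster_pms, cluster_bms):
    """Cluster-wise form: min over clusters of (min of Ps) + (min of Bs)."""
    return min(min(pms) + min(bms) for pms, bms in zip(cluster_pms, cluster_bms))


# Hypothetical path metrics for 64 states, split into even and odd states
# as in the 1-step expression for Popt(n).
pms = [(7 * m + 3) % 23 for m in range(64)]
even_pms, odd_pms = pms[0::2], pms[1::2]

# Hypothetical branch metrics for the even and odd groups.
even_bms = [4, 9, 2, 6]
odd_bms = [5, 1, 8, 3]

clusters = [even_pms, odd_pms]
groups = [even_bms, odd_bms]
assert exact_min(clusters, groups) == precomputed_min(clusters, groups)
```

The cluster-wise form needs only the two minima per cluster, and the minimum Ps can be found one cycle earlier, which is what allows the optimal path metric to be computed in parallel with the ACSU update.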
b. SMU design
consumption of the proposed VD is reduced accordingly. In order to achieve the same BER performance, the proposed VD consumes only 30.8% of the power of the full-trellis VD.
VI. CONCLUSION
We have proposed a low-power VD design for TCM systems. The precomputation architecture that incorporates the T-algorithm efficiently reduces the power consumption of VDs without reducing the decoding speed appreciably. We have also analyzed the precomputation algorithm, where the optimal precomputation steps are calculated and discussed. This algorithm is suitable for TCM systems, which always employ high-rate convolutional codes. Finally, we presented a design case. Both the ACSU and SMU are modified to correctly decode the signal. ASIC synthesis and power estimation results show that, compared with the full-trellis VD without a low-power scheme, the precomputation VD could reduce the power consumption by 70% with only an 11% reduction of the maximum decoding speed.
VII. REFERENCES
[1] F. Chan and D. Haccoun, "Adaptive viterbi decoding of convolutional codes over memoryless channels," IEEE Trans. Commun., vol. no. 45,
... transition," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 11, pp. 1172–1176, Oct. 2007.
[8] F. Sun and T. Zhang, "Low power state-parallel relaxed adaptive viterbi decoder design and implementation," in Proc. IEEE ISCAS, May 2006, pp. 4811–4814.
[9] J. He, H. Liu, and Z. Wang, "A fast ACSU architecture for viterbi decoder using T-algorithm," in Proc. 43rd IEEE Asilomar Conf. Signals, Syst. Comput., Nov. 2009, pp. 231–235.
[10] K. S. Arunlal and S. A. Hariprasad, "An efficient viterbi decoder," International Journal of Advanced Information Technology (IJAIT), Vol. 2, No. 1, February 2012.
[11] J. He, Z. Wang, and H. Liu, "An efficient 4-D 8PSK TCM decoder architecture," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 5, pp. 808–817, May 2010.
[12] A. A. Peshattiwar and Tejaswini G. Panse, "High Speed ACSU Architecture for Viterbi Decoder Using T-Algorithm," International Journal of Electrical and Electronics Engineering (IJEEE), ISSN (PRINT): 2231–5284, Vol-1, Iss-3, 2012.