Optimizing Power-Accuracy Trade-Off in Approximate Adders: Celia D, Vinita Vasudevan and Nitin Chandrachoodan

This document discusses optimizing the power-accuracy trade-off in approximate adders. It compares various existing approximate adders that split the output sum into an accurate upper part and approximate lower part. The document proposes setting the approximate lower part to a fixed value that minimizes the mean error distance, allowing power savings similar to a truncation adder. Simulation results show the proposed adder provides better trade-off between power savings and accuracy than existing approaches.


Optimizing Power-Accuracy Trade-off in Approximate Adders
Celia D, Vinita Vasudevan and Nitin Chandrachoodan
Department of Electrical Engineering, Indian Institute of Technology Madras
Chennai, India 600036
Email: {ee13d003,vinita,nitin}@ee.iitm.ac.in

Abstract: Approximate circuit design has gained significance in recent years, targeting applications like media processing where full accuracy is not required. In this paper, we propose an approximate adder in which the approximate part of the sum is obtained by finding a single optimal level that minimises the mean error distance. Therefore, the hardware needed for the approximate part computation can be removed, which effectively results in very low power consumption. We compare the proposed adder with various approximate adders in the literature in terms of power and accuracy metrics. The power savings of our adder are shown to be 17% to 55% more than the power savings of the existing approximate adders over a significant range of accuracy values. Further, in an image addition application, this adder is shown to provide the best trade-off between PSNR and power.

Index Terms: approximate adder, low power, accuracy, architectural approximation

I. INTRODUCTION

Many signal processing blocks, especially those meant for video and speech, are error tolerant, which makes it possible to use inaccurate arithmetic units. This is exploited in systems to save power and area as well as to reduce the delay. Approximation is mainly done using voltage over-scaling and architectural approximation [1], [2]. In voltage over-scaling, the supply voltage is scaled down, leading to power savings but causing increased delays. This results in possible timing violations and hence inaccurate results. In architectural approximation, the functionality of the circuit is approximated and simplified so that even at nominal supply voltages, the results are not accurate. The simplification in functionality results in reduced logic density and achieves savings in power, delay and area.

Several approximate adders have been proposed and studied in the literature, each representing a trade-off between power and accuracy. One possibility is to segment the adder into two parts. The upper part of the sum, containing its most significant bits (MSBs), is obtained using accurate adders. Approximate logic is used to compute the lower part of the sum, containing the remaining least significant bits (LSBs). In such approximate adders, for a given number of lower-order bits being approximated, the power consumed by the accurate upper part is almost the same. Power savings in the lower portion are typically due to a reduction in the switching activity owing to the use of simpler gates.

The simplest of these adders is the truncation adder, where the lower part is set to zero. Since there is no hardware requirement for the lower part, it has the largest savings in both power and area. However, it also results in significant errors [3]. The other two-part segmented approximate adders, namely the approximate mirror adder [3], lower-part OR adder (LOA) [4], error tolerant adder (ETA) [5] and inexact adder [6], obtain better accuracy by introducing limited computations for approximating the less significant part of the output. However, this results in additional costs in area and power.

There are other approximate adders, such as [7], [8], which do not split the output into an approximate lower part and an accurate upper part. Instead, the adder is divided into many subadders and the carry is predicted. Here we can trade off power and accuracy by varying the size of the subadders. However, these adders perform worse than the two-part segmented adders in terms of power-accuracy trade-off [9]. In this paper, therefore, we focus on segmented adders with two parts: an accurate upper part and an inaccurate lower part.

It is possible to match the power savings of the truncation adder if the approximate bits are set to a fixed value. In this paper, we propose to set them to a fixed value $L$ that minimizes the mean error distance (MED). The value is chosen so that it is optimal for all inputs that have a symmetric probability mass function (PMF). If the input PMF is not symmetric, we show that it is close to optimal as long as the number of approximate bits is not too large. We quantify the power savings of various two-part adders in terms of the power savings per bit and the MED. We have used the proposed adder in an image addition application to demonstrate its effectiveness.

Section II contains a discussion of the power consumed by various approximate adders proposed in the literature. This is followed by a theoretical analysis leading to the choice of the fixed value $L$. Simulation results and the power-accuracy trade-off for various adders are discussed in detail in Section IV. Section V has the conclusions.

II. COMPARISON OF PREVIOUSLY PROPOSED ADDERS

A. Notation and Previously proposed adders

Consider an approximate adder with $N$-bit inputs $A$, $B$ and an $(N+1)$-bit sum $S$. Let $k$ be the number of bits in the lower part of the sum which are approximated. The input $A$, with binary representation $a_{N-1} a_{N-2} \ldots a_k a_{k-1} \ldots a_0$, is denoted as the concatenation $A_H A_L$, where $A_H = a_{N-1} a_{N-2} \ldots a_k$ is the upper part and $A_L = a_{k-1} \ldots a_0$ is the lower part. The input $B$ is denoted as $B_H B_L$ in a similar way. The adder output $S$ has

978-3-9819263-0-9/DATE18/© 2018 EDAA
binary representation $s_N s_{N-1} \ldots s_k s_{k-1} \ldots s_0$ and is denoted as $S_H S_L$, where $S_H = s_N s_{N-1} \ldots s_k$ is the upper part and $S_L = s_{k-1} \ldots s_0$ is the lower part. $c_{k-1} s_{k-1} s_{k-2} \ldots s_0$ denotes the approximate sum of $A_L$ and $B_L$. Here, $c_{k-1}$ denotes the carry bit to the upper part.

In a simple truncation adder, $S_L = 0$ and $c_{k-1} = 0$. In this adder there is no circuit to calculate the lower part of the output sum, and the power consumption is only due to the upper part. Consequently, it has the lowest power consumption amongst all the approximate adders in the literature. In the approximate mirror adder 5 (AMA5) proposed in [3], the lower part of the result is set as $S_L = A_L$ and the carry is set as $c_{k-1} = b_{k-1}$. The AMA5 adder has higher power consumption than the truncation adder, due to the toggles in the lower part that come from one of the inputs and also because of the carry propagating to the upper part of the sum. In the LOA approximate adder [4], $S_L = A_L$ OR $B_L$ and $c_{k-1} = a_{k-1}$ AND $b_{k-1}$. The power consumed by the LOA adder is more than that of the truncation and AMA5 adders due to the OR gates used in the computation of $S_L$ and the AND gate for $c_{k-1}$. In the ETA approximate adder proposed in [5], the lower parts of the inputs are added from left to right until the point at which both input bits are logic 1. Beyond this point all the sum bits are set to logic 1. Here $c_{k-1} = 0$. The hardware required, and hence the power consumed, is higher than for the LOA adder because it needs extra gates for the detection logic and for setting the sum bits to the appropriate value. In the InXA2 adder proposed in [6], the $i$-th bit of the lower part sum is set as $s_i = (a_i$ XOR $b_i)$ OR $c_{i-1}$, where $c_{i-1}$ is the carry obtained on adding the $i$ LSBs of the two inputs. This adder has a relatively large power consumption owing to the hardware needed for computation.

B. Accuracy Metrics

To measure the accuracy and quality of approximate arithmetic circuits, the metrics proposed in the literature are Mean Error Distance (MED), Normalized Mean Error Distance (NMED and NED) and Mean Relative Error Distance (MRED) [8], [10], [11].

MED is the average absolute error between the accurate and approximate outputs. NMED is the normalization of MED by $2^{N+1} - 2$, which is the maximum sum possible in an $N$-bit adder. It is an indication of the significance of the MED in adders of various lengths. NED is obtained by normalizing the MED by $2^k$ and is an indication of how rapidly the MED grows with every additional approximate bit. MRED is a relative error metric and is an indicator of the percentage error across all values of the sum. The rankings of adders in terms of the MRED and NMED show the same trend [9]. As NMED is just the MED divided by a constant value, optimization with respect to either MED or NMED will give the same result. Since the primary error metric is the MED, we use it for further analysis and optimization.

III. PROPOSED MEDIAN APPROXIMATE ADDER (MA)

Our focus in this paper is to try and minimise the MED, while aiming to achieve the low power consumption of the truncation adder. To do this, we need to have fixed values of $S_L$ and $c_{k-1}$ so that, like the truncation adder, there is no hardware to find the lower part of the output. The values of $S_L$ and $c_{k-1}$ are to be chosen such that the MED is minimized.

For purposes of analysis, in this section, $A_H$, $A_L$, $B_H$ and $B_L$ refer to the corresponding decimal representations. In all cases, we assume that both inputs have $N$ bits and $k$ bits of the sum are approximated. The accurate sum, therefore, is given by $(A_H + B_H) 2^k + (A_L + B_L)$.

Assume that $Z = A_L + B_L$ is approximated by a fixed value $L$. Therefore, the approximate sum is $(A_H + B_H) 2^k + L$. The error distance (ED), which is the absolute value of the error, is therefore given by $|Z - L|$. The goal is to find $L$ so that $E\{|Z - L|\}$ is minimised. The solution to this minimization problem is well known, namely, the value of $L$ that minimises $E\{|Z - L|\}$ is the median of the distribution of $Z$ [12]. For a discrete random variable, the median is defined as follows.

Definition: The median of the random variable $Z$ is defined to be any number $M_Z$ that satisfies the relationship $P(Z \le M_Z) \ge 1/2$ and $P(Z \ge M_Z) \ge 1/2$.

Hence $L = M_Z$. However, the value of $M_Z$ depends on the probability mass function (PMF) of $Z$. We now consider various cases of input distributions.

A. Uniformly distributed inputs

This is the most common assumption for the PMF of the inputs. Assume that $A$ and $B$ have a uniform PMF with values in the range $[0, 2^N - 1]$. It follows that $A_L$ and $B_L$ also have a uniform PMF, with values in the range $[0, 2^k - 1]$. Since the PMF of $Z = A_L + B_L$ is the convolution of the PMFs of $A_L$ and $B_L$, it is a triangular distribution with values between 0 and $2^{k+1} - 2$. Since it is a symmetric PMF, the median value is the midway point, namely, $2^k - 1$. If the binary representation of $L$ is $c_{k-1} s_{k-1} s_{k-2} \cdots s_0$, then $c_{k-1} = 0$ and $s_{k-1} s_{k-2} \cdots s_0 = 11 \cdots 1$.

B. Inputs have a symmetric distribution

Here, it is assumed that the PMFs of both inputs $A$ and $B$ are symmetric, i.e. $P(A = Q) = P(A = 2^N - 1 - Q)$ and $P(B = Q) = P(B = 2^N - 1 - Q)$, where $0 \le Q \le 2^N - 1$. In such a setting, we claim that the distributions of $A_L$ and $B_L$ are also symmetric, i.e. $P(A_L = Q) = P(A_L = 2^k - 1 - Q)$ and $P(B_L = Q) = P(B_L = 2^k - 1 - Q)$, where $0 \le Q \le 2^k - 1$. A proof for this claim is as follows.

Divide the range $[0, 2^N - 1]$ into $2^{N-k}$ bins, each of size $2^k$. The $i$-th bin is given by $[i 2^k, (i+1) 2^k - 1]$, where $0 \le i \le 2^{N-k} - 1$. In each bin, the most significant $N - k$ bits have a fixed value. For $0 \le Q \le 2^k - 1$, we have

$$P(A_L = Q) = \sum_{i=0}^{2^{N-k}-1} P(A = 2^k i + Q).$$

Hence,

$$P(A_L = 2^k - 1 - Q) = \sum_{i=0}^{2^{N-k}-1} P(A = 2^k i + 2^k - 1 - Q).$$
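The two marginalisation sums above can be spot-checked numerically. The Python sketch below is illustrative only: the helper names `lower_part_pmf` and `random_symmetric_pmf`, the sizes N = 6 and k = 3, and the randomly generated symmetric PMF are assumptions made for this example and do not come from the paper. It folds a symmetric PMF of A into the PMF of A_L using the first sum and then compares P(A_L = Q) with P(A_L = 2^k - 1 - Q), using exact rational arithmetic so that equality can be tested directly.

```python
import random
from fractions import Fraction

def lower_part_pmf(pmf_a, n_bits, k):
    """Fold the PMF of an n_bits-wide input A into the PMF of its k LSBs.

    Implements P(A_L = Q) = sum over bins i of P(A = 2^k * i + Q).
    """
    bins = 1 << (n_bits - k)
    size = 1 << k
    return [sum(pmf_a[size * i + q] for i in range(bins)) for q in range(size)]

def random_symmetric_pmf(n_bits, seed=0):
    """Hypothetical test input: a PMF with P(A = Q) = P(A = 2^N - 1 - Q)."""
    rng = random.Random(seed)
    top = (1 << n_bits) - 1
    weights = [0] * (top + 1)
    for q in range((top + 1) // 2):
        w = rng.randint(1, 100)
        weights[q] = w          # mirror each weight around the midpoint
        weights[top - q] = w
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

N, K = 6, 3  # assumed sizes, small enough to enumerate
pmf_a = random_symmetric_pmf(N)
pmf_al = lower_part_pmf(pmf_a, N, K)
is_symmetric = all(pmf_al[q] == pmf_al[(1 << K) - 1 - q] for q in range(1 << K))
print("lower-part PMF symmetric:", is_symmetric)
```

A check of this kind only samples one PMF and one (N, k) pair, so it complements the proof rather than replacing it.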

Design, Automation And Test in Europe (DATE 2018) 1489


Since the PMF of $A$ is symmetric, we have

$$P(A_L = 2^k - 1 - Q) = \sum_{i=0}^{2^{N-k}-1} P\big(A = 2^N - 1 - (2^k i + 2^k - 1 - Q)\big) = \sum_{i=0}^{2^{N-k}-1} P\big(A = 2^k (2^{N-k} - 1 - i) + Q\big) = P(A_L = Q).$$

The proof for the symmetry of $B_L$ is the same as the above.

If the PMFs of $A_L$ and $B_L$ are symmetric, the PMF of their sum (which is the convolution of the PMFs of the two inputs) is also symmetric. Since the range of $Z$ is 0 to $2^{k+1} - 2$ and the PMF of $Z$ is symmetric, the median of $Z$ is $2^k - 1$.

C. PMFs of the inputs are arbitrary

In most of the applications seen in the literature, the number of approximate bits rarely exceeds $N/2$. As before, consider the division of the range $[0, 2^N - 1]$ into $2^{N-k}$ bins, each of size $2^k$. For small $k$ values, if the distribution of the $2^k$ values within each bin is approximately uniform, then the distributions of $A_L$ and $B_L$ are also approximately uniform. This means that the PMF of $Z = A_L + B_L$ is approximately triangular. Therefore, in the setting of small $k$, the median of $Z$ is closely approximated by $2^k - 1$.

Since most applications are likely to satisfy one of the three cases, the proposed median adder (MA) uses $L = 2^k - 1$. However, if the PMFs of the inputs are known, it is possible to derive the PMFs of the lower $k$ bits, which can then be convolved to obtain the PMF of the lower part sum. In this case, the median can be obtained exactly and $L$ can be set to this value.

D. Accuracy metrics for Median adder for uniformly distributed inputs

An expression for the MED of the median adder ($E\{|Z - L|\}$) with uniformly distributed inputs can be derived as follows.

$$MED = \sum_{i=0}^{2^k-1} i \, P(|Z - L| = i) = 2 \sum_{i=0}^{2^k-1} i \, \frac{2^k - i}{2^{2k}} = \frac{2^k}{3} - \frac{2^{-k}}{3}. \tag{1}$$

The expressions for NMED and NED are obtained using suitable normalization factors as follows.

$$NMED = \frac{2^k - 2^{-k}}{3 \cdot (2^{N+1} - 2)} \quad \text{and} \quad NED = \frac{1 - 2^{-2k}}{3}.$$

The expressions for the MED and NED of the truncation adder are $2^k - 1$ and $1 - 2^{-k}$ respectively. Clearly, MA is much better than truncation in terms of both metrics, while having the same power savings.

IV. EXPERIMENTAL RESULTS

In this section, we compare the proposed adder with other approximate adders in terms of power savings, MED and the peak signal-to-noise ratio (PSNR). All the approximate circuits are designed using Verilog and synthesized using Cadence Genus for 55nm technology. The synthesized netlist, along with the Standard Delay Format (SDF) file generated by Genus, is simulated with $10^6$ uniform random input pairs at the frequency of operation of an accurate ripple carry adder. A full adder's input pin capacitance is set as the output load capacitance.

A. Power savings and MED

Fig. 1a shows the variation in power savings, normalized by the power of an accurate 1-bit full adder, as a function of the number of approximate bits ($k$) in various approximate adders. For $k$ approximate bits, both the truncation and MA adders discard $k$ full adders and set the lower part to a constant. Therefore, the normalized power savings for those two adders is $k$. For any other adder $A$, the normalized power savings can be roughly expressed as $P_{sb,A} \cdot k$, where $P_{sb,A}$ is the normalized power saving per approximate bit. Table I shows the value of $P_{sb,A}$ for various approximate adders. These values are obtained from the slopes of the curves in Fig. 1a. This is the same as the power saving per bit described in [11]. We note that the value of $P_{sb,A}$ depends on the hardware complexity involved in the approximation of the lower part of the adder. Hence these values are low for the ETA and InXA2 adders, which have more hardware.

TABLE I: Power saving per approximate bit of various adders.
Adder (A)    Trunc/MA    AMA5    LOA     ETA     InXA2
P_{sb,A}     1           0.88    0.72    0.57    0.01

Fig. 1b shows the variation in MED with the number of approximate bits for all the adders. As seen from the figure, for all adders, $\log_2 MED$ is approximately linear in $k$ with a slope that approaches one as $k$ becomes larger ($k \ge 3$). This means $MED \approx c \, 2^k$ in this range. Further, as seen from the figure, $c = 1$ for the truncation adder. For all other adders, $c < 1$. Therefore, for a given MED, the truncation adder has the least number of approximate bits. MA, for example, can have $\log_2 3$ more approximate bits than truncation for the same MED. We also note that for a given $k$, the MED of the MA is only marginally higher than that of the other inexact adders.

B. Power-accuracy tradeoff

The important trade-off in approximate hardware systems is the power-accuracy trade-off, i.e. which of the adders meets an accuracy constraint with the highest possible power saving. Using the plots in Figs. 1a and 1b, we obtain Fig. 1c, which is a plot of the normalized power savings as a function of $\log_2(MED)$. For a considerable range of accuracies, the MA meets the accuracy constraint of a given MED with the highest power saving and hence offers the best trade-off between power and accuracy as compared to all the other approximate adders.

We analyse the power savings of all the approximate adders with respect to the truncation adder. For the truncation adder, $P_{sb,trunc} = 1$ and the number of approximate bits for a given MED is $k_{trunc} \approx \log_2 MED$. Therefore, the normalized power savings $P_{s,trunc} \approx \log_2 MED$. Let $c_A$ denote the number of additional approximate bits possible for the same MED in other adders. Therefore, the power savings for a given MED



Fig. 1: (a)-(b) Variation in normalized power savings and mean error distance, plotted as log2(MED), with the number of approximate bits (k) in various 16-bit approximate adders; (c) normalized power savings vs. log2(MED) for 16-bit approximate adders; (d) normalized dynamic power vs. peak signal-to-noise ratio (dB) for image addition using approximate adders, with the number of approximate bits varying from 1 to 7. Curves in all panels: Trunc, AMA5, LOA, ETA, InXA2, MA.

can be written as $P_{s,A} = P_{sb,A} (k_{trunc} + c_A)$. For the MA, $P_{sb,MA} = 1$ and $c_{MA} \approx \log_2 3$ as seen from equation (1). Therefore, MA always performs better than truncation. For other approximate adders,

$$P_{s,A} = P_{sb,A} \cdot (k_{trunc} + c_A) = P_{s,trunc} + P_{sb,A} \cdot c_A - (1 - P_{sb,A}) \log_2 MED. \tag{2}$$

As seen in Table I, $P_{sb,A} < 1$ for all adders other than truncation and MA. As MED increases, the third term in (2) becomes greater than the second, so that $P_{s,A}$ becomes less than $P_{s,trunc}$. Fig. 1c shows that the truncation adder performs better than (a) ETA beyond $\log_2 MED \approx 3$, (b) LOA beyond $\log_2 MED \approx 4$ and (c) AMA5 beyond $\log_2 MED \approx 10$. Although ETA and LOA have a larger $c_A$ for a given MED, the difference is not large enough to result in a reduction of power. This is because both of them have a significant amount of hardware to compute the lower part sum, resulting in a lower $P_{sb,A}$. AMA5 effects a power saving compared to truncation for most cases because its $P_{sb,A}$ is high, as seen in Table I. Since the MA adder has no additional hardware, the power savings due to the increase in $c_A$ are fully realised.

Both the power savings and MED depend only upon the number of approximate bits $k$. So, for any adder size $N$, the power savings versus MED trade-off remains the same.

C. Application: Image addition

The approximate adders are used in a simple image processing application, namely image addition. The objective is to compare the power consumed by various adders for a given peak signal-to-noise ratio (PSNR). Two images (Cameraman and Rice), each of size 256x256 with each pixel represented using 8 bits, are chosen as input images. Fig. 1d shows the dynamic power consumption as a function of PSNR for various approximate adders, with the number of approximate bits varying from 1 to 7. From the figure, we see that MA once again gives the best trade-off between dynamic power and PSNR as compared to all the other approximate adders.

V. CONCLUSION

We have proposed the median adder, which approximates the lower $k$ bits of the output to the fixed value $2^k - 1$. The median adder results in power consumption as low as that of a truncation adder, but provides significantly better accuracy. When compared to other similar adders, simulation results show that this adder gives the best trade-off between power and accuracy. We also used this adder in a simple image processing application and showed that the median adder gives better PSNR and power savings when compared to other similar adders in the literature.

REFERENCES

[1] J. Han and M. Orshansky, “Approximate computing: An emerging paradigm for energy-efficient design,” in Proceedings of the 18th IEEE European Test Symposium (ETS), 2013.
[2] R. Venkatesan, A. Agarwal, K. Roy, and A. Raghunathan, “MACACO: Modeling and analysis of circuits for approximate computing,” in IEEE/ACM ICCAD, pp. 667–673, 2011.
[3] V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, “Low-power digital signal processing using approximate adders,” IEEE Trans. on Comp.-Aided Design of Integrated Circuits and Systems, 2013.
[4] H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas, “Bio-inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications,” IEEE Trans. on Circuits and Systems I: Regular Papers, vol. 57, pp. 850–862, Apr. 2010.
[5] N. Zhu, W. L. Goh, W. Zhang, K. S. Yeo, and Z. H. Kong, “Design of low-power high-speed truncation-error-tolerant adder and its application in digital signal processing,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, pp. 1225–1229, Aug. 2010.
[6] H. A. F. Almurib, T. N. Kumar, and F. Lombardi, “Inexact designs for approximate low power addition by cell replacement,” in Design, Automation and Test in Europe (DATE), 2016.
[7] D. Mohapatra, V. K. Chippa, A. Raghunathan, and K. Roy, “Design of voltage-scalable meta-functions for approximate computing,” in Design, Automation and Test in Europe (DATE), 2011.
[8] A. B. Kahng and S. Kang, “Accuracy-configurable adder for approximate arithmetic designs,” in Design Automation Conference (DAC), 2012.
[9] H. Jiang, J. Han, and F. Lombardi, “A comparative review and evaluation of approximate adders,” in Proc. of the Great Lakes Symposium on VLSI (GLSVLSI), pp. 343–348, 2015.
[10] C. Liu, J. Han, and F. Lombardi, “A low-power, high-performance approximate multiplier with configurable partial error recovery,” in Design, Automation and Test in Europe (DATE), 2014.
[11] J. Liang, J. Han, and F. Lombardi, “New metrics for the reliability of approximate and probabilistic adders,” IEEE Transactions on Computers, vol. 62, pp. 1760–1771, Sep. 2013.
[12] S. K. Bar-Lev, B. Boukai, and P. Enis, “On the mean squared error, the mean absolute error and the like,” Communications in Statistics - Theory and Methods, vol. 28, no. 8, pp. 1813–1822, 1999.
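As a supplementary sketch, the closed-form MED of equation (1) and the truncation-adder MED quoted in Section III-D can be checked by exhaustive enumeration over the lower k bits. The Python below is a behavioural model written for illustration, not the authors' Verilog: the function names and the `(carry, S_L)` return convention are assumptions, inputs are taken to be uniform, and only the four adders whose lower-part rules are fully specified in Section II-A (truncation, AMA5, LOA) and Section III (MA) are modelled. ETA and InXA2 are left out because their bit-level details would have to be guessed.

```python
from fractions import Fraction

# Each model returns (carry into the upper part, approximate lower part S_L).

def trunc(al, bl, k):
    return 0, 0                          # S_L = 0, c_{k-1} = 0

def ama5(al, bl, k):
    return (bl >> (k - 1)) & 1, al       # S_L = A_L, c_{k-1} = b_{k-1}

def loa(al, bl, k):
    carry = (al >> (k - 1)) & (bl >> (k - 1)) & 1   # a_{k-1} AND b_{k-1}
    return carry, al | bl                # S_L = A_L OR B_L

def median_adder(al, bl, k):
    return 0, (1 << k) - 1               # L = 2^k - 1: carry 0, all ones

def med(adder, k):
    """Exact MED over all equally likely k-bit pairs (A_L, B_L).

    The upper part is added exactly, so the error distance is
    |(A_L + B_L) - (carry * 2^k + S_L)|.
    """
    size = 1 << k
    total = 0
    for al in range(size):
        for bl in range(size):
            carry, sl = adder(al, bl, k)
            total += abs((al + bl) - (carry * size + sl))
    return Fraction(total, size * size)

def med_ma_closed_form(k):
    """Equation (1): 2^k / 3 - 2^(-k) / 3."""
    return (Fraction(2 ** k) - Fraction(1, 2 ** k)) / 3

for k in range(1, 7):
    print(k, *(float(med(f, k)) for f in (trunc, ama5, loa, median_adder)))
```

Exact fractions are used so that the enumerated MED of the MA can be compared with equation (1) by equality rather than within a tolerance.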

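The image addition experiment of Section IV-C can be imitated in a few lines. The sketch below is a rough illustration under several assumptions: the Cameraman and Rice images are not available here, so two synthetic 64x64 images of uniformly random 8-bit pixels stand in for them; the PSNR peak is taken as 510, the largest exact sum of two 8-bit pixels, which the paper does not specify; and `ma_add` and `psnr` are hypothetical helper names. Only the median adder is modelled, and no power figures are produced.

```python
import math
import random

def ma_add(a, b, k):
    """Median adder: exact upper part, lower k bits fixed to 2^k - 1, no carry."""
    upper = ((a >> k) + (b >> k)) << k
    return upper | ((1 << k) - 1)

def psnr(reference, approx, peak):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, approx)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

rng = random.Random(1)
pixels = 64 * 64                      # assumed synthetic image size, flattened
img1 = [rng.randrange(256) for _ in range(pixels)]
img2 = [rng.randrange(256) for _ in range(pixels)]
exact = [p + q for p, q in zip(img1, img2)]

psnr_by_k = {}
for k in range(1, 8):                 # 1 to 7 approximate bits, as in Fig. 1d
    approx = [ma_add(p, q, k) for p, q in zip(img1, img2)]
    psnr_by_k[k] = psnr(exact, approx, peak=510)
    print(k, round(psnr_by_k[k], 2))
```

Because natural images are not uniformly distributed, PSNR values obtained this way should not be expected to match those in Fig. 1d; the sketch only shows how the quality metric is formed from the exact and approximate sums.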

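Equation (2) of Section IV-B also yields a break-even accuracy directly. The short derivation below is a worked consequence of that equation, offered as a sketch rather than a result stated by the authors; it uses only the symbols $P_{s,A}$, $P_{s,trunc}$, $P_{sb,A}$ and $c_A$ defined there and assumes $P_{sb,A} < 1$.

```latex
\begin{align*}
P_{s,A} < P_{s,trunc}
  &\iff P_{sb,A}\, c_A - (1 - P_{sb,A}) \log_2 MED < 0 \\
  &\iff \log_2 MED > \frac{P_{sb,A}\, c_A}{1 - P_{sb,A}} .
\end{align*}
% For the MA, P_{sb,MA} = 1, so the third term of (2) vanishes:
\begin{equation*}
P_{s,MA} = P_{s,trunc} + c_{MA} \approx P_{s,trunc} + \log_2 3 .
\end{equation*}
```

Read this way, an adder with a lower per-bit saving $P_{sb,A}$ falls behind truncation at a smaller MED unless its $c_A$ is correspondingly larger, which agrees with the ordering of the crossover points reported for ETA, LOA and AMA5 in Fig. 1c.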