Example of Multiplier
Abstract: Approximate computing circuits are considered a promising solution to reduce the power consumption in embedded data processing. This paper proposes an FPGA implementation of an approximate multiplier based on inexact adder circuits. The performance of the proposed multiplier is evaluated by comparing its power consumption, accuracy of computation, and time delay with those of an approximate multiplier based on an exact adder presented in the literature. Results report a power saving of up to 17.39% with an improvement in time delay of 13.49%, at the cost of less than 5% accuracy loss.

Keywords: Digital Multipliers; Approximate Computing; Accuracy; Low Power Consumption; Error Tolerant.

I. INTRODUCTION

Low power consumption has become the most important design goal in a wide range of electronic systems, especially when dealing with smart self-powered sensing systems for application domains such as Internet of Things (IoT), Wearable Devices and Robotics. The ever-increasing demand for higher computing power represents a driving force toward ultra-low power design strategies. Seeking to improve energy efficiency, designers have turned to optimization methods at several levels, from system level down to transistor device level.

In recent years, approximate computing has appeared as an effective approach to improve energy efficiency [1], [2]. Approximate results are usually sufficient for many applications such as tactile data processing [3], image processing, and data mining. Thus, it is highly recommended to take advantage of energy reduction with a minimal variation in performance [4]. Recently, approximations have been adopted in computing units of embedded systems, especially for graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) [5]. Computing units, e.g. embedded digital signal processing (DSP) systems, are considered key components of modern electronic embedded devices [6]. Among the arithmetic DSP operations, the multiplication block has always been considered a complex block that increases the complexity of DSP systems. Therefore, decreasing the complexity of multipliers may reduce the power consumption of the overall system. In this perspective, the proposed work applies approximate computing techniques to the arithmetic units, i.e. adders and multipliers, to reduce power consumption. The main goal is to implement an efficient hardware architecture of an approximate multiplier providing low power consumption. This goal stems from the need to reduce the complexity of the singular value decomposition implementation proposed in [7] for tactile data processing.

The rest of this paper is organized as follows: Section II reviews the existing approximate multipliers in the literature. In Section III, the architecture of the proposed approximate multiplier is described. Section IV analyzes and evaluates the simulation results of the proposed multiplier in terms of accuracy and power consumption. Finally, the conclusion is reported in Section V.

II. PRIOR WORKS

Efficient implementations of approximate multipliers based on different approaches have recently been reported in the literature. Kulkarni et al. [8] predicted the least significant columns of the partial product as a constant by using a truncated multiplication method. They presented a simplified inaccurate 2 × 2 multiplier cell to be used as the basic block for constructing larger multiplier architectures. The power consumption has been reduced by an average of 31.78% to 45.4% compared to previous accurate multiplier designs, with an average error of 1.39% to 3.32%. Two approximate 4:2 compressors have been proposed in [9], providing efficient reductions in power consumption, hardware resources and delay with respect to exact designs. The authors in [10] proposed an approximate multiplier design with an error distribution reducing the propagation delay and improving the energy efficiency. Recently, a high-speed energy-efficient multiplier (RoBA) based on rounding of the inputs to the form of 2^n has been proposed in [11]. This approach dramatically improved the speed and the energy consumption (up to 65%) since the computationally intensive part of the multiplication was omitted.

In this paper, we present the FPGA implementation of two approximate multipliers: 1) the first is adopted from [11] since it provides high power reduction compared with exact multipliers, and 2) the second proposes a new architecture that modifies the first one by employing an inexact adder circuit in place of the accurate one. Based on FPGA implementation results, the performance of the proposed architecture is evaluated, showing a good improvement in terms of power consumption and computation delay.

III. PROPOSED APPROXIMATE MULTIPLIER

The proposed architecture has been adopted from [11]: it is based on rounding signed and unsigned numbers to the form of 2^n. The main idea is to make use of an approximate adder in
generated after eliminating the term ( – M) × ( – ) from
the initial accurate multiplication. Accurate results are obtained
only when the two inputs are respectively equal to 2^n and 2^m.
When both inputs are equal to 3 × 2^n and 3 × 2^m respectively,
the error is maximum. The terms used are explained as
follows:
• Error (E): E = |Re – Ri|, where Re is the exact multiplication result and Ri is the inexact result obtained by the approximate multiplier simulation.
• Accuracy (ACC): ACC = (1 – E/Re) × 100. It indicates how accurate the output of the multiplier is with respect to the exact multiplication. Its value lies between 0% and 100%.
• Minimum Acceptable Accuracy (MAA): the threshold value; to respect the constraints of the system, the obtained accuracy must be higher than this threshold.

Fig. 4. Comparison of META for different bit sizes.
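As a minimal sketch of the definitions above, the following Python model computes E and ACC for a rounding-based approximate product and compares the accuracy with a threshold. It rests on several assumptions that are not taken from the paper: inputs are positive unsigned integers, each operand is rounded to its nearest power of two (ties rounded up, an arbitrary choice), and the eliminated term is taken to be the product of the two rounding differences. The function names and the 95% MAA value are hypothetical. The model uses exact additions, so it does not capture the inexact adder of the proposed design.

```python
def round_to_power_of_two(x: int) -> int:
    """Round a positive integer to the nearest power of two (ties go up)."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    lower = 1 << (x.bit_length() - 1)  # largest power of two <= x
    upper = lower << 1                 # smallest power of two > x
    return lower if (x - lower) < (upper - x) else upper


def approx_multiply(a: int, b: int) -> int:
    """Rounding-based approximate product of two positive integers.

    Exact identity: a*b = ar*b + br*a - ar*br + (ar - a)*(br - b).
    The last term is dropped, so only products with powers of two
    (i.e. shifts in hardware) remain.
    """
    ar = round_to_power_of_two(a)
    br = round_to_power_of_two(b)
    return ar * b + br * a - ar * br


def error(exact: int, inexact: int) -> int:
    """E = |Re - Ri|."""
    return abs(exact - inexact)


def accuracy(exact: int, inexact: int) -> float:
    """ACC = (1 - E/Re) * 100, in percent. Requires exact != 0."""
    return (1 - error(exact, inexact) / exact) * 100


def meets_maa(acc: float, maa: float) -> bool:
    """True when the accuracy is strictly higher than the MAA threshold."""
    return acc > maa


if __name__ == "__main__":
    MAA = 95.0  # hypothetical threshold, chosen only for illustration
    for a, b in [(4, 8), (3, 3), (0x1F, 0x7C)]:
        exact = a * b
        approx = approx_multiply(a, b)
        acc = accuracy(exact, approx)
        print(a, b, exact, approx, round(acc, 2), meets_maa(acc, MAA))
```

Under these assumptions, inputs that are already powers of two (4 and 8) give the exact product, while 3 × 3 gives 8 instead of 9, an accuracy of about 88.9%, consistent with the statement that inputs of the form 3 × 2^n and 3 × 2^m produce the largest error.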
Parameters of the approximate multipliers (column headers: Multipliers, LUT, Power (mW), Delay (ns)); surviving row: MRCA 8-bit, 69, 9.71, 0.04%.
Fig. 6. Variation of power consumption and delay with the size of META multiplier.

Fig. 7. Instantaneous dynamic power comparison of different selected inputs.

inputs as an example to assess the instantaneous power consumption of the 8-bit META and MRCA multipliers. Each input has been simulated for a period of 20 ns to determine its instantaneous dynamic power. The comparison of the obtained results presented in Fig. 7 indicates a saving in dynamic power ranging from 9.21% up to 50%. For instance, for the product 1F × 7C the power drops from 82 mW to 41 mW while the accuracy of the results remains approximately unchanged.

V. CONCLUSION

In this paper, an FPGA implementation of a new approximate multiplier circuit, called META, has been proposed. The new architecture provided a noticeable improvement in latency and power consumption at the price of a small error, which is acceptable for our application [13], [14]. Two hardware implementations of the approximate multiplier were compared: the first one employs an exact adder while the second one is based on an inexact adder. The results revealed that the accuracy of the META multiplier decreased slightly, by around 4.5%, which is considered an acceptable variation, while the power consumption and the delay have been reduced by 17.39% and 13.49% respectively. Therefore, the proposed architecture provides a tradeoff between accuracy and power saving for improving performance and energy efficiency. In conclusion, implementation results have demonstrated that this new design can be integrated into FPGA applications, especially for digital signal processing (DSP). Future work will consist of using the proposed architecture for the singular value decomposition to reduce the power consumption of the overall system for embedded tactile data decoding [3].

REFERENCES

[1] S. Mittal, "A survey of techniques for approximate computing," ACM Computing Surveys (CSUR), vol. 48, no. 4, p. 62, 2016.
[2] J. Han and M. Orshansky, "Approximate computing: An emerging paradigm for energy-efficient design," 2013 18th IEEE European Test Symposium (ETS), Avignon, 2013, pp. 1-6. doi: 10.1109/ETS.2013.6569370
[3] A. Ibrahim, P. Gastaldo, H. Chible and M. Valle, "Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation," Sensors, vol. 17, no. 3, p. 558, 2017.
[4] D. Mohapatra, G. Karakonstantis and K. Roy, "Significance driven computation: a voltage-scalable, variation-aware, quality-tuning motion estimator," in Proceedings of the 2009 ACM/IEEE International Symposium on Low Power Electronics and Design, ACM, Aug. 2009, pp. 195-200.
[5] L. Sekanina, "Introduction to approximate computing: Embedded tutorial," 2016 IEEE 19th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Kosice, 2016, pp. 1-6. doi: 10.1109/DDECS.2016.7482460
[6] H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie and C. Lucas, "Bio-Inspired Imprecise Computational Blocks for Efficient VLSI Implementation of Soft-Computing Applications," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 4, pp. 850-862, April 2010.
[7] A. Ibrahim, M. Valle, L. Noli and H. Chible, "Assessment of FPGA Implementations of One Sided Jacobi Algorithm for Singular Value Decomposition," 2015 IEEE Computer Society Annual Symposium on VLSI, Montpellier, 2015, pp. 56-61.
[8] P. Kulkarni, P. Gupta and M. Ercegovac, "Trading accuracy for power in a multiplier architecture," Journal of Low Power Electronics, vol. 7, no. 4, pp. 490-501, 2011.
[9] A. Momeni, J. Han, P. Montuschi and F. Lombardi, "Design and Analysis of Approximate Compressors for Multiplication," in IEEE Transactions on Computers, vol. 64, no. 4, pp. 984-994, April 2015. doi: 10.1109/TC.2014.2308214
[10] S. Hashemi, R. I. Bahar and S. Reda, "DRUM: A Dynamic Range Unbiased Multiplier for approximate applications," 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Austin, TX, 2015, pp. 418-425. doi: 10.1109/ICCAD.2015.7372600
[11] R. Zendegani, M. Kamal, M. Bahadori, A. Afzali-Kusha and M. Pedram, "RoBA Multiplier: A Rounding-Based Approximate Multiplier for High-Speed yet Energy-Efficient Digital Signal Processing," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 2, pp. 393-401, Feb. 2017.
[12] N. Zhu, L. Goh, W. Zhang, K. S. Yeo and Z. H. Kong, "Design of Low-Power High-Speed Truncation-Error-Tolerant Adder and Its Application in Digital Signal Processing," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, no. 8, pp. 1225-1229, Aug. 2010. doi: 10.1109/TVLSI.2009.2020591
[13] M. Franceschi, L. Seminara, S. Dosen, M. Strbac, M. Valle and D. Farina, "A system for electrotactile feedback using electronic skin and flexible matrix electrodes: Experimental evaluation," in IEEE Transactions on Haptics, vol. PP, no. 99, pp. 1-1. doi: 10.1109/TOH.2016.2618377
[14] M. Franceschi, L. Seminara, L. Pinna, M. Valle, A. Ibrahim and S. Dosen, "Towards the integration of e-skin into prosthetic devices," 2016 12th Conference on Ph.D. Research in Microelectronics and Electronics (PRIME), Lisbon, 2016, pp. 1-4.