
2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS)
DOI: 10.1109/DDECS57882.2023.10139484 | 979-8-3503-3277-3/23/$31.00 ©2023 IEEE

Approximation of Hardware Accelerators driven by Machine-Learning Models
(Embedded Tutorial)
Vojtech Mrazek
Faculty of Information Technology, Brno University of Technology, Brno, Czechia
[email protected]

Abstract—The goal of this tutorial is to introduce functional hardware approximation techniques employing machine-learning methods. Functional approximation slightly changes the function of a circuit in order to reduce its power consumption. Machine-learning models can help to estimate the error and the resulting circuit power consumption. The use of these techniques will be presented at multiple levels: at the level of individual components and at the higher level of HW accelerator synthesis.

Index Terms—Approximate computing, Machine Learning, Estimation, Prediction

I. INTRODUCTION

Approximate computing has been appearing in hardware designs for several years. A significant part of the research has focused on functional approximation, which involves introducing small errors into the computation for the benefit of more energy-efficient or faster processing of the input data. Functional approximation at the circuit level can be divided into two basic tasks [1]: (i) the design of approximate components, in particular adders and multipliers, and (ii) the high-level approximate synthesis of complex hardware accelerators. Manual approaches have often been used for approximate component design. In the manual methodologies, for example, elementary units such as full adders or 2x2 multipliers were replaced by approximate implementations [2], [3], the structure of circuits was modified [4], the longest computational paths of the circuit were cut [5], or other mathematical properties of circuits were exploited [6], [7]. Similarly, the application of these components in accelerators was guided by the expert knowledge of designers.

Automated design systems help designers with approximation at multiple levels. In automated design systems, machine-learning (ML) methods often find applications in both component and accelerator approximation. They can also be useful for supporting synthesis methods in parameter estimation.

Overall, the application of ML algorithms in functional approximation can be divided into the following groups:

1) Automated design of approximate arithmetic circuits is, in fact, a search for a circuit representing a new logic function with respect to accuracy and hardware parameters. In contrast to data-driven regression, the advantage is that the algorithms can use the exact solution as a starting point. Examples of the design methodologies [1] include systematic pruning [8] and netlist rewriting [9] methods using greedy algorithms. More advanced transformations are offered by algorithms based on genetic programming [10], [11]. In addition, other ML methods, such as matrix factorization [12], can be used in logic rewriting.

2) Cross-technology library adoption can be helpful when users want to use a highly optimized library of approximate components (e.g., for a 45 nm ASIC technology, such as [13]) in a different technology such as FPGA. The ASIC parameters are not correlated with the FPGA parameters. Moreover, there are thousands of components in the libraries, which makes an exhaustive search infeasible. Therefore, selecting the Pareto-optimal components w.r.t. error and the target-technology hardware parameters is complicated. However, ML models can help us estimate the hardware parameters in the new technology. The models can be based on traditional ML models [14], [15], convolutional neural networks [16], [17], or graph neural networks [18]. ML-based estimation has been successfully used to transform ASIC libraries to Virtex-7 FPGAs [14].

3) Automated approximation of entire accelerators can be done globally by modifying the syntax tree describing the accelerator design [19]. These methods have scalability issues that need to be addressed. Other approaches assign approximate components from a library in place of exact components [20]–[23]. They try to find a close-to-optimal assignment guided by a multi-objective heuristic algorithm (e.g., NSGA-II [24]). The key part of these algorithms is the evaluation of the overall quality of results (QoR) and the HW parameters of the approximate accelerator. Some approaches use fast simulation of a subset of the test dataset to obtain the QoR [21], [22]. Other methods use a fast ML model to estimate the overall parameters of a candidate approximate accelerator based on the parameters of the particular components [20], [23]. In another study, ML methods were employed to predict how well an approximate multiplier is likely to work in an approximate neural network [25].

This work was supported by the Czech Science Foundation project 21-13001S. The author thanks all his collaborators on the presented works, namely L. Sekanina, Z. Vasicek, M. Shafique, M. A. Hanif, and B. S. Prabakaran.

II. EXAMPLES OF APPLICATIONS OF ML TECHNIQUES

To demonstrate the application of ML in approximate hardware design, three approaches were selected and are described below.
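As a concrete illustration of the component-level approximation and error metrics discussed above, the following minimal sketch (illustrative only; the truncating adder and bit-widths are hypothetical choices, not taken from the tutorial) implements a common functional-approximation pattern and computes its mean error distance (MED). Weighting each input vector by its observed probability, as measured in the target application, turns MED into the WMED metric of [11].

```python
# Illustrative sketch: an approximate adder that truncates the k
# least-significant bits of both operands, and the MED error metric
# that typically steers such approximation searches.

def exact_add(a, b):
    return a + b

def trunc_add(a, b, k=2):
    # Approximate adder: ignore the k low bits of both operands.
    mask = ~((1 << k) - 1)
    return (a & mask) + (b & mask)

def mean_error_distance(k=2, n_bits=4):
    # MED: average absolute error over all input vectors.
    total, count = 0, 0
    for a in range(1 << n_bits):
        for b in range(1 << n_bits):
            total += abs(exact_add(a, b) - trunc_add(a, b, k))
            count += 1
    return total / count

print(mean_error_distance())  # 3.0 for k=2, n_bits=4
```

Exhaustive evaluation like this is feasible only for small bit-widths, which is one reason the fast ML-based estimators discussed above become attractive for larger operators.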

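The cross-technology estimation idea from group 2 can be sketched as a regression from characterized ASIC parameters to an FPGA cost. The sketch below is a minimal illustration with synthetic numbers and a hand-rolled single-feature least-squares fit; real frameworks such as ApproxFPGAs [14] use richer statistical and ML models over many features.

```python
# Illustrative sketch of cross-technology estimation: fit a linear
# model predicting an FPGA cost (LUTs) from an ASIC parameter (area).
# All numbers are synthetic.

def fit_linear(xs, ys):
    # Ordinary least squares for y = w*x + b (single feature).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return w, my - w * mx

# Synthetic library: (ASIC area in um^2, measured FPGA LUTs).
asic_area = [120.0, 250.0, 400.0, 610.0]
fpga_luts = [14.0, 29.0, 47.0, 71.0]

w, b = fit_linear(asic_area, fpga_luts)
# Estimate LUTs for an unsynthesized component from ASIC area alone.
print(round(w * 500.0 + b))  # 58
```

Such a model replaces a full FPGA synthesis run per library component with a single prediction, which is what makes Pareto-filtering thousands of candidates tractable.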
A. Automated circuit approximation driven by data

Libraries of approximate circuits are composed of fully characterized digital circuits that can be used as building blocks of energy-efficient implementations of hardware accelerators. They can be employed not only to speed up accelerator development but also to analyze how an accelerator responds to the introduction of various approximate operations. In paper [11], an application-tailored, data-driven, fully automated method for the functional approximation of combinational circuits is proposed. It is demonstrated how an application-level error metric, such as classification accuracy, can be translated to the component-level error metric needed for an efficient and fast search in the space of approximate low-level components used in the application. This is possible by employing a weighted mean error distance (WMED) metric to steer the circuit approximation process, which is conducted by means of genetic programming. WMED introduces a set of weights (calculated from the data distribution measured on a selected signal in a given application) determining the importance of each input vector for the approximation process.

B. ApproxFPGAs: adoption of an ASIC library for FPGAs

Existing approximation techniques have predominantly focused on ASICs and do not achieve similar gains when deployed in FPGA-based accelerator systems, due to the inherent architectural differences between the two. In the ApproxFPGAs work [14], a framework was proposed that leverages statistical or machine-learning models to effectively explore the architecture space of state-of-the-art ASIC-based approximate circuits and tailor them for FPGA-based systems, given a simple RTL description of the target application.

C. AutoAx: automated approximation of accelerators

Because libraries of approximate components contain from tens to thousands of approximate implementations of a single arithmetic operation, it is intractable to find an optimal combination of approximate circuits from a library, even for an application consisting of a few operations. An open problem is how to effectively combine circuits from these libraries to construct complex approximate accelerators. The AutoAx algorithm [23] represents a methodology for searching, selecting, and combining the most suitable approximate circuits from a set of available libraries to generate an approximate accelerator for a given application. To enable fast design-space generation and exploration, the methodology utilizes machine-learning techniques to create computational models that estimate the overall quality of processing and the hardware cost without performing full synthesis at the accelerator level. Using this methodology, hundreds of approximate accelerators (for a Sobel edge detector) were constructed. The accelerators show different but relevant trade-offs between the quality of processing and the hardware cost, and a corresponding Pareto frontier was identified. Furthermore, when searching for approximate implementations of a generic Gaussian filter consisting of 17 arithmetic operations, the AutoAx approach identified approximately 10³ highly important implementations out of 10²³ possible solutions in a few hours, whereas an exhaustive search would take four months on a high-end processor.

III. CONCLUSION

As shown, ML models can significantly help in the design of libraries of approximate components and of approximate accelerators. Other approaches in the literature combine different techniques, e.g., creating specialized libraries and then building entire accelerators from them [20], or using partial evaluation to determine the QoR [1]. However, by using a suitable ML model, we can reach good-quality results with lower effort.

REFERENCES

[1] I. Scarabottolo et al., “Approximate logic synthesis: A survey,” Proc. IEEE, vol. 108, no. 12, pp. 2195–2213, 2020.
[2] P. Kulkarni et al., “Trading accuracy for power with an underdesigned multiplier architecture,” in Int. Conf. VLSI Design, 2011, pp. 346–351.
[3] M. Shafique et al., “A low latency generic accuracy configurable adder,” in DAC ’15, 2015, pp. 1–6.
[4] H. R. Mahdiani et al., “Bio-inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, 2010.
[5] M. A. Hanif et al., “QuAd: Design and analysis of quality-area optimal low-latency approximate adders,” in DAC ’17, 2017, pp. 1–6.
[6] J. N. Mitchell, “Computer multiplication and division using binary logarithms,” IRE Trans. Electronic Computers, vol. EC-11, no. 4, 1962.
[7] M. S. Ansari et al., “A hardware-efficient logarithmic multiplier with improved accuracy,” in DATE ’19, 2019, pp. 928–931.
[8] D. Shin et al., “A new circuit simplification method for error tolerant applications,” in DATE ’11, 2011, pp. 1–6.
[9] S. Venkataramani et al., “Substitute-and-simplify: A unified design paradigm for approximate and quality configurable circuits,” in DATE ’13, 2013, pp. 1367–1372.
[10] Z. Vasicek et al., “Evolutionary approach to approximate digital circuits design,” IEEE Trans. Evol. Comput., vol. 19, no. 3, 2015.
[11] Z. Vasicek et al., “Automated circuit approximation method driven by data distribution,” in DATE ’19, 2019, pp. 96–101.
[12] S. Hashemi et al., “BLASYS: Approximate logic synthesis using Boolean matrix factorization,” in DAC ’18, 2018, pp. 1–6.
[13] V. Mrazek et al., “EvoApprox8b: Library of approximate adders and multipliers for circuit design and benchmarking of approximation methods,” in DATE ’17, 2017, pp. 258–261.
[14] B. S. Prabakaran et al., “ApproxFPGAs: Embracing ASIC-based approximate arithmetic components for FPGA-based systems,” in DAC ’20, 2020.
[15] C. Xu et al., “SNS’s not a synthesizer: A deep-learning-based synthesis predictor,” in ISCA ’22, 2022.
[16] Y. Zhou et al., “PRIMAL: Power inference using machine learning,” in DAC ’19, 2019.
[17] Z. Xie et al., “Fast IR drop estimation with machine learning,” in ICCAD ’20, 2020.
[18] Y. Zhang et al., “GRANNITE: Graph neural network inference for transferable power estimation,” in DAC ’20, 2020.
[19] K. Nepal et al., “Automated high-level generation of low-power approximate computing circuits,” IEEE Trans. Emerging Topics Comput., 2017.
[20] S. Ullah et al., “AppAxO: Designing application-specific approximate operators for FPGA-based embedded systems,” ACM Trans. Embed. Comput. Syst., vol. 21, no. 3, 2022.
[21] V. Mrazek et al., “ALWANN: Automatic layer-wise approximation of deep neural network accelerators without retraining,” in ICCAD ’19, 2019.
[22] S. Barone et al., “Multi-objective application-driven approximate design method,” IEEE Access, vol. 9, pp. 86975–86993, 2021.
[23] V. Mrazek et al., “AutoAx: An automatic design space exploration and circuit building methodology utilizing libraries of approximate components,” in DAC ’19, 2019.
[24] K. Deb et al., “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evol. Comput., vol. 6, no. 2, pp. 182–197, Apr. 2002.
[25] M. S. Ansari et al., “Improving the accuracy and hardware efficiency of neural networks using approximate multipliers,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2020.

