
Logarithm-Approximate Floating-Point Multiplier

for Hardware-efficient Inference in Probabilistic Circuits

Lingyun Yao1 Martin Trapp2 Karthekeyan Periasamy1 Jelin Leslin1 Gaurav Singh1 Martin Andraud1

1 Electrical Engineering Dept., Aalto University, Espoo, Finland
2 Computer Science Dept., Aalto University, Espoo, Finland

Abstract

Machine learning models are increasingly being deployed onto edge devices, for example, for smart sensing, reinforcing the need for reliable and efficient modeling families that can perform a variety of tasks in an uncertain world (e.g., classification, outlier detection) without re-deploying the model. Probabilistic circuits (PCs) offer a promising avenue for such scenarios as they support efficient and exact computation of various probabilistic inference tasks by design, in addition to having a sparse structure. A critical challenge towards hardware acceleration of PCs on edge devices is the high computational cost associated with multiplications in the model. In this work, we propose the first approximate computing framework for energy-efficient PC computation. For this, we leverage addition-as-int approximate multipliers, which are significantly more energy-efficient than regular floating-point multipliers, while preserving computation accuracy. We analyze the expected approximation error and show through hardware simulation results that our approach leads to a significant reduction in energy consumption with low approximation error and provides a remedy for hardware acceleration of general-purpose probabilistic models.

1 INTRODUCTION

The development of smart sensing and Internet-of-Things applications based on embedded artificial intelligence (AI), such as smartphones, wearables, or other sensor networks, is pushing the computation of machine learning methods directly onto edge devices. Recent innovations (e.g., [12, 26, 17]) have pushed up the computational efficiency of deep feedforward neural networks (NNs) and improved the energy efficiency of dedicated AI processors by 10x to 100x compared to Graphical Processing Units [17]. However, NNs that have been adopted into real-world use often raise concerns related to their reliability, fairness, and interpretability [9, 7], alongside their high inference costs [27, 23].

Consequently, to be suitable for the challenges associated with edge AI, there is an urgent need to develop effective hardware acceleration of machine learning models that are probabilistic, i.e., they enable reasoning in an uncertain world [6], and tractable, i.e., they can reliably answer many probabilistic queries without re-deployment. Recent work on tractable probabilistic models, specifically on probabilistic circuits (PCs) [2], poses a promising avenue as these models (i) exhibit high expressive efficiency (representational power), (ii) enable reliable [25, 13] and fair [1] reasoning, and (iii) allow many probabilistic queries to be computed tractably by design. Yet, while pioneering works have explored acceleration of PCs on Field-Programmable Gate Arrays (FPGAs) [3, 21, 22] and Application-Specific Integrated Circuits (ASICs) [18, 19], the hardware acceleration of PCs poses many open challenges. In particular, their irregularity (i.e., PCs are sparsely connected, making parallelism more challenging [20]) and their high computation resolution (i.e., probabilistic inference with PCs typically requires 30 to 40 floating-point bits [22, 20], as arithmetic is performed on probabilities) hinder their deployment on edge devices, where efficiency and reduced resolution are key due to the limited energy resources.

In this work, we propose to approximate floating-point multipliers through Addition-as-Int [10], suggesting high potential gains in computational efficiency (Addition-as-Int can reduce the hardware cost of multiplication by a factor of up to 112x) with little impact on the accuracy of the computations. In addition, we carry out a theoretical analysis of the expected error and show that our approach can yield accurate computations for maximum-a-posteriori (MAP) and marginal queries while offering a concise way to trade off accuracy and computational efficiency.

Accepted for the 6th Workshop on Tractable Probabilistic Modeling at UAI (TPM 2023).
Figure 1: Illustration of a PC (a) over discrete RVs (X1, X2, X3) and the corresponding hardware representation of MAP inference (b). For this, sum nodes are replaced by max operators, and an additional propagation path for information bits is added to back-track the most probable path (MAP result).
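To make the caption's max-and-backtrack scheme concrete ahead of the formal definitions in Section 2, the following minimal Python sketch evaluates a toy circuit bottom-up and swaps sums for max operators; the structure, weights, and helper names are illustrative placeholders of ours, not the exact PC of Fig. 1(a) or code from this paper.

```python
# Toy sketch of PC evaluation: sums for marginal (MAR) queries, max for MAP.
# Structure and weights are made up for illustration.

def leaf(var, value):
    # Indicator leaf over a single discrete variable.
    return ("leaf", var, value)

def product(*children):
    return ("prod", children)

def weighted_sum(weights, children):
    return ("sum", weights, children)

def evaluate(node, evidence, use_max=False):
    """Bottom-up pass; use_max=True replaces every sum node by a max operator."""
    kind = node[0]
    if kind == "leaf":
        _, var, value = node
        # Unobserved variables are marginalized out (indicator evaluates to 1).
        return 1.0 if evidence.get(var, value) == value else 0.0
    if kind == "prod":
        out = 1.0
        for child in node[1]:
            out *= evaluate(child, evidence, use_max)
        return out
    _, weights, children = node
    scores = [w * evaluate(c, evidence, use_max) for w, c in zip(weights, children)]
    return max(scores) if use_max else sum(scores)

# A tiny smooth and decomposable circuit over X2 and X3.
pc = weighted_sum(
    [0.6, 0.4],
    [product(leaf("X2", 0), leaf("X3", 0)),
     product(leaf("X2", 0), leaf("X3", 1))],
)

print(evaluate(pc, {"X2": 0}))                # MAR: X3 summed out -> 1.0
print(evaluate(pc, {"X2": 0}, use_max=True))  # MAP: best X3 completion -> 0.6
```

The additional information-bit path in Fig. 1(b) corresponds to recording, at every max operator, which child attained the maximum, so that the most probable assignment can be read back along the winning path.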

2 BACKGROUND: PROBABILISTIC CIRCUITS

Probabilistic circuits (PCs) have recently been introduced as an umbrella term to unify a variety of existing tractable probabilistic models (e.g., [4, 14, 15, 8]). They represent the (possibly unnormalized) distribution function (density or mass function) of a multivariate probability distribution over random variables (RVs) X = {X_i}_{i=1}^d through a directed acyclic graph G. The computational graph G constitutes weighted sums S(x) = Σ_{C∈ch(S)} w_{S,C} C(x) with Σ_{C∈ch(S)} w_{S,C} = 1, products P(x) = Π_{C∈ch(P)} C(x), and leaf nodes associated with parametric functions, typically assumed to be density/mass functions of univariate probability distributions L(x) = p(x | θ_L). We use ch(N) to denote the set of children of a node N, and θ denotes the parameters of the parametric leaves. In addition, each node N ∈ G is associated with a scope ψ(N) ⊆ X provided by a scope function ψ: N → P(X) [24], where P(X) denotes the power set of X, specifying the set of RVs the node represents a joint distribution over. Fig. 1(a) illustrates a PC over three discrete RVs using indicator functions at the leaves, where we use ⊕ to illustrate sum nodes and ⊗ for product nodes. Fig. 1(b) illustrates our proposed hardware realization of MAP inference for a PC. A particularly relevant class of PCs are those that are smooth and decomposable, as both properties are requirements for many probabilistic queries to be computable exactly and in time linear in the number of nodes of G. Henceforth, we briefly review smoothness and decomposability.

Definition 2.1 (Smoothness & Decomposability). A sum node S is smooth if all children have the same scope, i.e., ψ(C) = ψ(C′), ∀C, C′ ∈ ch(S). Further, a product node P is decomposable if all children have pairwise disjoint scopes, i.e., ψ(C) ∩ ψ(C′) = ∅, ∀C, C′ ∈ ch(P). A PC is smooth if all sum nodes are smooth and decomposable if all product nodes are decomposable.

Definition 2.2 (Determinism). A sum node S is deterministic if for every complete evidence x at most one child has a positive value. Consequently, a PC is deterministic if all sum nodes are deterministic.

We refer the reader to [2] for further details on the structural properties of PCs.

3 APPROXIMATE COMPUTING FOR PROBABILISTIC CIRCUITS

Assuming positive numbers in floating-point representation, two operands x and y can be written as x = 2^{E_x}(1 + M_x) and y = 2^{E_y}(1 + M_y). Note that we can omit the sign bit and only have to consider their exponent (E) and mantissa (M) values. Therefore, the exact product x × y is given as:

x × y = 2^{E_x + E_y} (1 + M_x)(1 + M_y)    (1)

This product can be conveniently expressed in log-space, i.e.,

log2(x × y) = E_x + E_y + log2(1 + M_x) + log2(1 + M_y).    (2)

A popular approximate solution is based on Mitchell's method [10]. To approximate the logarithm, Mitchell's method uses log2(1 + F) ≈ F, which is the first-order Taylor series expansion of log2(1 + F). Using this approximation, Eq. (2) becomes:

log2(x × y) ≈ E_x + E_y + M_x + M_y.    (3)

Previous work pointed out that adding two IEEE 754 floating-point numbers with an integer addition instruction results in Mitchell's approximate multiplication, and called this operation Addition-As-Int (AAI) [11]. By doing so, we can directly obtain an approximation of Eq. (2) in the form of Eq. (3). Denoting the approximate multiplication by ×̃, we obtain:

x ×̃ y = FLOAT(INT(x) + INT(y)),    (4)

where INT(·) interprets the binary string of the IEEE 754 floating-point representation as an integer string and FLOAT(·) interprets the resulting integer string back as an IEEE 754 floating-point representation. Therefore, performing AAI in hardware only requires integer addition operators.
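As a software illustration of Eq. (4), the sketch below (helper names are ours, not from the paper) reinterprets IEEE 754 float32 bit patterns as integers and adds them. With the standard biased exponent encoding, the integer addition adds the exponent bias twice, so the sketch subtracts the bit pattern of 1.0f once to compensate; the compact form of Eq. (4) does not show this constant explicitly, and Mogami [11] folds a comparable constant into the addition.

```python
import struct

def float_to_bits(x: float) -> int:
    """IEEE 754 single-precision bit pattern of x, as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

ONE_BITS = 0x3F800000  # bit pattern of 1.0f, i.e. the exponent bias in place

def aai_multiply(x: float, y: float) -> float:
    """Addition-as-Int (Mitchell) approximate product of two positive floats."""
    return bits_to_float(float_to_bits(x) + float_to_bits(y) - ONE_BITS)

if __name__ == "__main__":
    for x, y in [(0.3, 0.7), (0.125, 0.5), (0.9, 0.9)]:
        exact, approx = x * y, aai_multiply(x, y)
        print(f"{x} * {y}: exact={exact:.6f}  AAI={approx:.6f}  "
              f"rel. err={abs(approx - exact) / exact:.2%}")
```

In hardware, only the integer adder remains, which is why the cost of AAI grows only linearly with the number of mantissa bits (cf. Fig. 2).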
Figure 2: Power cost of exact and AAI multipliers on 65nm CMOS using 8 exponent bits and a varying number of mantissa bits.

4 EXPERIMENTS

We evaluated our approach on four benchmark data sets: NLTCS, Jester, DNA, and Book, which are a subset of frequently used data sets in the community (e.g., [16, 5, 24]). We generated PC structures and parameters using LearnSPN [5], a popular method for structure learning, resulting in smooth and decomposable PCs. All evaluations are performed on the test set.

4.1 POWER CONSUMPTION COMPARISON BETWEEN EXACT MULTIPLICATIONS AND AAI

Floating-point and AAI multipliers have been designed and simulated for various resolutions in a 65nm CMOS technology, and models have been fitted to the simulation results. Fig. 2 shows the resulting model for 8 exponent bits and a varying number of mantissa bits. We see that the hardware cost of exact multiplication is dominated by mantissa processing, and the hardware complexity grows significantly with the number of mantissa bits. As AAI uses much simpler addition hardware, its complexity and power grow only linearly with the number of bits.

4.2 ENERGY SAVING WITH DIFFERENT NUMBERS OF BITS

We replaced all multipliers with AAI to assess the error and the power savings for MAR and MAP queries under varying resolutions. For MAR queries, we computed the squared error with respect to a software baseline (64 bits), i.e., Σ_x (p(x) − q(x))², where q(·) denotes the model with lower-resolution multipliers and p(·) the PC in software. In addition, we calculated the maximum and minimum obtainable errors. For MAP queries, we calculated the MAP inference accuracy over the latent variables (assuming complete evidence) with respect to the baseline. We collected the optimized bit settings in Table 1, where Nb represents 32 bits, and Nbe and Nba are the numbers of bits yielding the smallest error for the exact and the approximate multiplier, respectively.

MAR queries. With AAI, the error varies across benchmarks but generally requires more exponent bits E, cf. Fig. 3. In practice, exact multipliers produce a small error at the tested resolutions, as seen in Table 1. Indeed, E determines the minimum representable value, and M determines the quantization within every exponent range, which only depends on the representation error. Going from a 32-bit resolution to Nbe enables saving around 2x power. We find that using AAI can allow for 24x to 40x extra savings if the tolerated error is a few percent. The total power savings from 32-bit to the optimal AAI configuration are between 56x and 88x, cf. Table 1.

MAP queries. We find that the resolution of MAP computation can be drastically reduced while introducing no error, since MAP stays correct as long as the argmax at sum nodes stays the same. Further, AAI multipliers can achieve higher accuracy with fewer bits, cf. Fig. 4. In contrast to exact floating-point multiplication, where mantissa values are normalized (see Appendix A) and successive multiplications result in smaller mantissa values, AAI handles normalization by using a carry, hence requiring fewer bits. Most power savings are obtained from Nb to Nbe, i.e., 18.6x. Switching to AAI increases savings by up to another 11x. Total power savings can reach 206x, cf. Table 1.

5 CONCLUSION AND DISCUSSION

We introduced approximate computing in PCs to increase their energy efficiency for deployment on edge devices and provided a theoretical and empirical analysis of the introduced error. Specifically, we investigated the energy efficiency and approximation error of Addition-as-Int multipliers in PCs for different benchmarks and query types (marginals and MAP). Our results show that maximum power savings of 88x and 206x can be achieved for MAR and MAP queries, respectively.

Table 1: Overview of optimal configurations and performance over several data sets. Nbe and Nba correspond to the settings with the smallest error, and the loss is the error relative to the maximum error.

Data set | Query | Power (µW) @Nb=32 | Exact ⊗ Nbe (E,M) | Power (µW) @Nbe | AAI ⊗ Nba (E,M) | Power (µW) @Nba | Loss Exact (%) | Loss AAI (%)
NLTCS    | MAP   | 85482   | 5,3   | 4594   | 5,1   | 414   | 0    | 0
NLTCS    | MAR   | 85482   | 8,15  | 36699  | 8,7   | 1035  | 3e-7 | 0.8
Jester   | MAP   | 660408  | 5,3   | 35492  | 5,1   | 3199  | 0    | 0
Jester   | MAR   | 660408  | 8,15  | 283530 | 11,11 | 11731 | 4e-7 | 5.9
DNA      | MAP   | 674902  | 5,3   | 36271  | 5,1   | 3269  | 0    | 0
DNA      | MAR   | 674902  | 11,15 | 306942 | 11,3  | 7629  | 3e-6 | 3.3
Book     | MAP   | 1272053 | 5,3   | 68364  | 5,1   | 6162  | 0    | 0
Book     | MAR   | 1272053 | 8,15  | 546124 | 11,7  | 18488 | 7e-6 | 0.4
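For reference, the MAR error reported in Table 1 and Fig. 3 is the squared deviation from the 64-bit software baseline defined in Section 4.2. A minimal sketch of that metric is given below; the two evaluator callables are placeholders of ours, not the paper's implementation.

```python
def mar_squared_error(test_set, p_baseline, q_lowres):
    """Sum of (p(x) - q(x))^2 over the test set, following Section 4.2."""
    return sum((p_baseline(x) - q_lowres(x)) ** 2 for x in test_set)

# Dummy stand-ins: p is the 64-bit software PC, q the same PC evaluated with
# AAI / reduced-resolution multipliers (both would be real evaluators in practice).
p = lambda x: 0.25
q = lambda x: 0.25 * 1.01
print(mar_squared_error([(0, 0), (0, 1), (1, 0), (1, 1)], p, q))
```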

Figure 3: Approximation error for AAI (first row) and exact (second row) multipliers on NLTCS, Jester, DNA, and Book, using a varying number of exponent bits (E = 8, E = 11) and mantissa bits. The maximum possible error is shown for reference.

Figure 4: MAP accuracy (ACC) results for AAI (first row) and exact (second row) multipliers on NLTCS, Jester, DNA, and Book, using a varying number of exponent and mantissa bits (m = 1, m = 3, m = 5).

Acknowledgements

MT acknowledges funding from the Academy of Finland (grant number 347279). MA acknowledges partial funding from the Academy of Finland through the project WHISTLE (grant number 332218). This work has also been partially funded by the European Union through the SUSTAIN project. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or EISMEA. Neither the European Union nor the granting authority can be held responsible for them.

References

[1] YooJung Choi. Probabilistic Reasoning for Fair and Robust Decision Making. PhD thesis, 2022.

[2] YooJung Choi, Antonio Vergari, and Guy Van den Broeck. Probabilistic circuits: A unifying framework for tractable probabilistic models. October 2020.

[3] Young-kyu Choi, Carlos Santillana, Yujia Shen, Adnan Darwiche, and Jason Cong. FPGA acceleration of probabilistic sentential decision diagrams with high-level synthesis. ACM Trans. Reconfigurable Technol. Syst., September 2022. ISSN 1936-7406. doi: 10.1145/3561514.

[4] Adnan Darwiche. A differential approach to inference in Bayesian networks. J. ACM, 50(3):280–305, 2003. doi: 10.1145/765568.765570.

[5] Robert Gens and Pedro Domingos. Learning the structure of sum-product networks. In International Conference on Machine Learning, pages 873–880. PMLR, 2013.

[6] Zoubin Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521(7553):452–459, May 2015. ISSN 1476-4687. doi: 10.1038/nature14541.

[7] Ari Heljakka, Martin Trapp, Juho Kannala, and Arno Solin. Disentangling model multiplicity in deep learning. arXiv preprint arXiv:2206.08890, 2023.

[8] Doga Kisa, Guy Van den Broeck, Arthur Choi, and Adnan Darwiche. Probabilistic sentential decision diagrams. In Chitta Baral, Giuseppe De Giacomo, and Thomas Eiter, editors, 14th International Conference on Principles of Knowledge Representation and Reasoning (KR). AAAI Press, 2014.

[9] Gary Marcus. The next decade in AI: Four steps towards robust artificial intelligence. arXiv preprint arXiv:2002.06177, 2020.

[10] John N. Mitchell. Computer multiplication and division using binary logarithms. IRE Transactions on Electronic Computers, (4):512–517, 1962.

[11] Tsuguo Mogami. Deep neural network training without multiplications. arXiv preprint arXiv:2012.03458, 2020.

[12] B. Moons and M. Verhelst. Energy-efficiency and accuracy of stochastic computing circuits in emerging technologies. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 4(4):475–486, 2014. ISSN 2156-3357. doi: 10.1109/JETCAS.2014.2361070.

[13] Robert Peharz, Antonio Vergari, Karl Stelzner, Alejandro Molina, Martin Trapp, Xiaoting Shao, Kristian Kersting, and Zoubin Ghahramani. Random sum-product networks: A simple and effective approach to probabilistic deep learning. In Amir Globerson and Ricardo Silva, editors, 35th Conference on Uncertainty in Artificial Intelligence (UAI), volume 115 of Proceedings of Machine Learning Research, pages 334–344. AUAI Press, 2019.

[14] Hoifung Poon and Pedro M. Domingos. Sum-product networks: A new deep architecture. In Fábio Gagliardi Cozman and Avi Pfeffer, editors, 27th Conference on Uncertainty in Artificial Intelligence (UAI), pages 337–346. AUAI Press, 2011.

[15] Tahrima Rahman, Prasanna V. Kothalkar, and Vibhav Gogate. Cutset networks: A simple, tractable, and scalable approach for improving the accuracy of Chow-Liu trees. In Toon Calders, Floriana Esposito, Eyke Hüllermeier, and Rosa Meo, editors, European Conference on Machine Learning and Knowledge Discovery in Databases (ECML), volume 8725 of Lecture Notes in Computer Science, pages 630–645. Springer, 2014.

[16] Amirmohammad Rooshenas and Daniel Lowd. Learning sum-product networks with direct and indirect variable interactions. In International Conference on Machine Learning, pages 710–718. PMLR, 2014.

[17] Jae-sun Seo, Jyotishman Saikia, Jian Meng, Wangxin He, Han-sok Suh, Anupreetham, Yuan Liao, Ahmed Hasssan, and Injune Yeo. Digital versus analog artificial intelligence accelerators: Advances, trends, and emerging designs. IEEE Solid-State Circuits Magazine, 14(3):65–79, 2022. doi: 10.1109/MSSC.2022.3182935.

[18] N. Shah, L. I. G. Olascoaga, S. Zhao, W. Meert, and M. Verhelst. 9.4 PIU: A 248GOPS/W stream-based processor for irregular probabilistic inference networks using precision-scalable posit arithmetic in 28nm. In 2021 IEEE International Solid-State Circuits Conference (ISSCC), volume 64, pages 150–152, 2021. doi: 10.1109/ISSCC42613.2021.9366061.

[19] N. Shah, W. Meert, and M. Verhelst. DPU-v2: Energy-efficient execution of irregular directed acyclic graphs. In 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 1288–1307, Los Alamitos, CA, USA, October 2022. IEEE Computer Society.

[20] Nimish Shah, Laura I. Galindez Olascoaga, Wannes Meert, and Marian Verhelst. ProbLP: A framework for low-precision probabilistic inference. In Proceedings of the 56th Annual Design Automation Conference 2019, pages 1–6, 2019.

[21] L. Sommer, J. Oppermann, A. Molina, C. Binnig, K. Kersting, and A. Koch. Automatic mapping of the sum-product network inference problem to FPGA-based accelerators. In 2018 IEEE 36th International Conference on Computer Design (ICCD), pages 350–357, 2018. doi: 10.1109/ICCD.2018.00060.

[22] Lukas Sommer, Lukas Weber, Martin Kumm, and Andreas Koch. Comparison of arithmetic number formats for inference in sum-product networks on FPGAs. In 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pages 75–83. IEEE, 2020.

[23] Emma Strubell, Ananya Ganesh, and Andrew McCallum. Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Conference of the Association for Computational Linguistics (ACL), pages 3645–3650. Association for Computational Linguistics, 2019.

[24] Martin Trapp, Robert Peharz, Hong Ge, Franz Pernkopf, and Zoubin Ghahramani. Bayesian learning of sum-product networks. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett, editors, 32nd Conference on Neural Information Processing Systems (NeurIPS), pages 6344–6355, 2019.

[25] Fabrizio Ventola, Steven Braun, Zhongjie Yu, Martin Mundt, and Kristian Kersting. Probabilistic circuits that know what they don't know. arXiv preprint arXiv:2302.06544, 2023.

[26] N. Verma, H. Jia, H. Valavi, Y. Tang, M. Ozatay, L. Chen, B. Zhang, and P. Deaville. In-memory computing: Advances and prospects. IEEE Solid-State Circuits Magazine, 11(3):43–55, Summer 2019. ISSN 1943-0590. doi: 10.1109/MSSC.2019.2922889.

[27] Xiaowei Xu, Yukun Ding, Sharon Xiaobo Hu, Michael Niemier, Jason Cong, Yu Hu, and Yiyu Shi. Scaling for edge inference of deep neural networks. Nature Electronics, 1(4):216–222, 2018.
