
Floating-Point Hardware Design: A Test Perspective

T.K.R Arvind, Ashish Reddy Bommana, Srinivas Boppu


{atk10, bar12, srinivas}@iitbbs.ac.in
School of Electrical Sciences, Indian Institute of Technology Bhubaneswar (IITBBS), India

Abstract—The growing field of Artificial Intelligence research necessitates the development of non-standard bit-width number-format arithmetic hardware units to improve the energy efficiency of the underlying hardware. However, building these hardware units using a hardware description language is error-prone, and it is difficult to catch these errors in the early design stage without proper tools to cross-check the results. Furthermore, floating-point hardware designs calculate the final result through many stages; it is therefore essential to identify the erroneous stage for debugging. This paper proposes an easy-to-use Python library for IEEE-754-based floating-point numbers with arbitrary exponent and mantissa widths. The library provides not only the final result for cross-checking HDL results but also the hardware's intermediate stage results, for easier and faster development. Its support for converting numbers back and forth between decimal and binary makes it ideal both as a full-fledged calculator for performing complex arithmetic in the required format and as a debugger, working in binary form, for developing the hardware that performs these computations.

Index Terms—Floating-point formats, Floating-point Hardware, Artificial Intelligence, Energy Efficiency, Simulators, Design and Test, Software Floating-point Libraries

I. INTRODUCTION

Due to the cheap and abundant availability of computing power, Machine Learning (ML) and Deep Learning (DL) technologies, once considered computationally intensive, gained much traction. These technologies are being deployed in many application domains, such as face recognition, computer vision, natural language processing, medical imaging, etc., leading to industrial and societal transformation. There is also an increasing trend toward building and developing custom hardware accelerators to achieve higher energy efficiency, which is the new fundamental limit of performance in deep sub-micron technologies [1], [2]. For instance, Google designed different Tensor Processing Units (TPUs) [3], [4] suited for training, inference, and edge devices. One of the most widely used techniques to reduce power consumption is to use custom (non-standard, reduced-precision, reduced bit-width) and short floating-point (FP) formats, as opposed to the standard IEEE-754 FP formats, without compromising the accuracy of the application [3], [5], [6]. For instance, Google's TPU and Nvidia's Tensor Cores [7] use the reduced-precision formats bfloat16 and TensorFloat-32, respectively: bfloat16 is a 16-bit format with an 8-bit exponent and a 7-bit mantissa, while TensorFloat-32 keeps the 8-bit exponent and widens the mantissa to 10 bits [3], [7]. Furthermore, even shorter 8-bit FP formats are being tested and used, further reducing the power budget [8], [9].

In [5], [10], the authors conclude that different applications require different precision to get accurate results. In [9], the authors use an 8-bit FP format for inputs, a 16-bit FP format for storing weights, and a 32-bit FP format for the convolution output, demonstrating the use of mixed FP formats to get the best performance out of the hardware. Therefore, defining custom FP formats with arbitrary exponent and mantissa widths and designing arithmetic hardware supporting these custom formats has practical relevance. However, designing such hardware is often a daunting task.

In general, Hardware Description Languages (HDLs) such as Verilog or VHDL are used for designing the hardware. Subsequently, these designs are simulated using hardware simulators, and their functionality is checked. While designing the hardware, it is essential to catch functional errors early in the design cycle and fix them to keep the turn-around time, or time to market, low. A typical FP arithmetic hardware consists of many stages: alignment, addition, subtraction, normalization, and rounding. While designing custom FP hardware, it is essential to check the final result for given input stimuli and, in case of functional errors, identify the appropriate stage for further debugging. Once the problematic stage is identified, it can be debugged further, provided the intermediate stages' results are also available. Therefore, this paper proposes an intelligent debugger and calculator for non-standard floating-point formats. The main contribution of the paper is a Python package with the following features:

• An easy-to-use Python header, supporting custom FP formats with arbitrary widths for exponent and mantissa.
• Serves as an expected-output calculator for two given FP numbers and the operation to be performed.
• Dumps all the intermediate stages' results, including the final result, by which the hardware can be easily debugged.
• Implicit number conversion from decimal to the required format and back to decimal, which helps emulate complex arithmetic and compare it with the produced output.

The rest of the paper is organized as follows. Section II describes the related work. Section III briefly highlights the importance of the debugger. The main features of the library are presented in Section IV. The use cases of the library are discussed in Section V. Concluding remarks are given in Section VI.

II. RELATED WORK

The library presented in this paper is not the very first open-source simulator for computing standard IEEE-754 and non-standard floating-point arithmetic operations. However, it could be the first library to aid the development of such computational hardware. Other publicly available floating-point simulator libraries focus mainly on computation speed and raw power, but no library provides any aid for developing such hardware units. We give a brief overview of some of these libraries in the following.

One of the most popular is the MPFR class from GMP [11] for arbitrary floating-point computation. It is implemented in C and designed to be as fast as possible for both small and huge operands. There are wrappers that make this library usable from languages like C++ and Python and that provide more useful functions [12] than the MPFR class itself offers. Though these are computationally efficient, they are not lightweight packages and take significant time to set up; they also have a significant learning curve before they can be utilized efficiently.

QPyTorch [12], an interface over PyTorch, is built to support experiments on low-precision number formats and can be used to simulate non-standard floating-point numbers. However, the underlying computation is still done in single-precision representation, so it may not always convert from one arbitrary floating-point format to another without affecting the value at a specific bit position. This library therefore cannot be relied on when comparing the results produced by hardware designed for a particular numeric precision.

emFloat [13] is an IEEE-754 compliant floating-point library for embedded systems that tries to deliver FPU-like performance in pure software. GoFast [14] was also developed for embedded applications and is fast, as it is written in assembly language and optimized explicitly for each processor. Though these libraries support various CPU architectures, they only support the single- and double-precision floating-point formats.

Though numerous such libraries exist to simulate arbitrary-precision floating-point arithmetic, they focus mainly on computation speed and correct rounding. Few of them are user-friendly simulators, they can hardly help with debugging, and their outputs are in decimal representation, whereas the output from the generated hardware is in binary representation. Conversion from decimal to binary is necessary to compare simulator outputs to hardware outputs, and this conversion might introduce errors when not properly handled. This is where the proposed library stands out: it computes the required-precision floating-point arithmetic and aids the development of such computational cores with an easy-to-use Python header with no installation overhead. Furthermore, both final outputs and intermediate stages' outputs are dumped to help debug the hardware. In other words, the library mimics the exact hardware in software.
III. NEED FOR DEBUGGER

With the growth of technology and the need for standards for mathematical computation across architectures from different vendors, to avoid surprises in the results and sometimes disasters [15] costing millions of dollars and even human lives, the IEEE-754 standard floating-point format was born. An IEEE-754 [16] compliant N-bit floating-point number consists of a sign bit, an E-bit exponent, and an M-bit mantissa, see Fig. 1.

Fig. 1: IEEE-754 standard floating-point formats. An N-bit format comprises a sign bit, an E-bit exponent, and an M-bit mantissa, with N = E + M + 1: double precision (E=11, M=52), single precision (E=8, M=23), half precision (E=5, M=10).

This scientific format of <E,N> floating-point representation introduces complex computations, unlike integers, with much overhead. With standardization across vendors and its computational abilities, it gained popularity, and many FP arithmetic hardware designs for the standard IEEE-754 formats exist. However, FP designs still pose many difficulties when creating or modifying them to support custom FP formats. As aforementioned, these custom FP formats, i.e., non-standard IEEE-754 formats with arbitrary exponent and mantissa widths, have shown significant power savings in the design of ML/DL hardware accelerators. Therefore, designing such custom FP arithmetic hardware becomes much easier if there exist software libraries that act not only as simulators but also as debuggers.
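To make the sign/exponent/mantissa layout concrete, the following minimal sketch (independent of the proposed library, assuming round-to-nearest-even and normalized nonzero values only, i.e., no denormals, infinities, or NaNs) packs a Python float into an arbitrary E-bit-exponent, M-bit-mantissa format:

import math

def encode(x, E, M):
    # pack a nonzero float into sign | E-bit exponent | M-bit mantissa
    assert x != 0, "normalized values only in this sketch"
    bias = (1 << (E - 1)) - 1
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))            # abs(x) = m * 2**e, 0.5 <= m < 1
    e -= 1                               # rescale so the mantissa is 1.xxx
    frac = round((m * 2 - 1) * (1 << M)) # M mantissa bits, ties-to-even
    if frac == 1 << M:                   # rounding overflowed the mantissa
        frac, e = 0, e + 1
    assert 0 < e + bias < (1 << E) - 1, "exponent out of range"
    return (sign << (E + M)) | ((e + bias) << M) | frac

print(hex(encode(5.625, 5, 10)))         # 0x45a0 in half precision

Running encode(5.625, 5, 10) reproduces 0x45a0, the half-precision operand used later in Listing 2.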
A typical FP hardware unit has two mandatory stages, alignment and normalization, and an optional rounding stage, apart from the actual arithmetic stage such as addition, subtraction, etc., see Fig. 2. Though these are typical stages in binary operations, they can be perplexing when creating new or modifying existing floating-point units: when some uncaught or new cases of the new custom format are missed at any stage, the error propagates in a domino effect all the way to the result, making debugging more difficult. Therefore, the proposed Python library functions as a calculator as well as a debugger, providing the most useful information on where mistakes are most likely to occur for any operator at the various stages, as described below.

Fig. 2: Typical stages in a floating-point adder/subtractor hardware: alignment (shift by 'm'), arithmetic on the aligned and maximum mantissas, normalization of the result, and rounding using the GRS bits and the tie-breaking rule (even/odd, etc.).

Alignment: The alignment stage is the first stage in any floating-point arithmetic operation. The smaller exponent of the two inputs is made equal to the larger one, and the mantissa of the number with the smaller exponent is shifted right by 'm' bits, where 'm' is the difference between the two exponent values. Probable errors in this stage lie in computing the value of 'm' and in the shifting logic for the mantissa.
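As an illustration of this stage (the function and variable names here are illustrative, not the library's API), the following sketch shifts the mantissa of the operand with the smaller exponent and folds the shifted-out bits into a sticky bit:

def align(exp_a, man_a, exp_b, man_b, grs_bits=3):
    # mantissas include the hidden bit; extend them by the GRS positions
    man_a <<= grs_bits
    man_b <<= grs_bits
    m = abs(exp_a - exp_b)               # the 'm' from the text above
    if exp_a < exp_b:
        sticky = int(man_a & ((1 << m) - 1) != 0)
        man_a = (man_a >> m) | sticky    # fold lost bits into the sticky bit
    else:
        sticky = int(man_b & ((1 << m) - 1) != 0)
        man_b = (man_b >> m) | sticky
    return max(exp_a, exp_b), man_a, man_b

# Listing 3 example: exponents 17 and 25, so m = 8
print(align(17, 0b1_0110100000, 25, 0b1_0001111011))
# -> (25, 0b101101, 0b1_0001111011_000), i.e. aligned mantissa 0_0000000101_101

The printed aligned mantissa matches the alignment section of the debug dump shown later in Listing 3.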

Normalization: Once the arithmetic operation is performed on the mantissas, the output of the arithmetic unit is shifted right or left by 's' bits, and the larger of the two input exponents is incremented or decremented by 's'. The value of 's' is the number of bits the computed mantissa from the arithmetic stage has to be shifted left or right to set the hidden bit according to the IEEE-754 standard. The range of values that 's' can take for an operation depends on the largest exponent: adding or subtracting 's' must not make the exponent overflow or underflow. Most errors in the design of this stage can be caught if the value of 's' for the given arithmetic operation is available for cross-referencing, as almost all the changes that occur during this stage depend on the value of 's'.
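The following sketch illustrates this stage under the same assumptions as the alignment sketch above (illustrative names, a nonzero result, and a mantissa carrying the hidden bit plus three GRS positions):

def normalize(exp, man, M, grs_bits=3):
    # bring the hidden '1' to bit position M + grs_bits, adjusting exp by 's'
    top = M + grs_bits
    s = man.bit_length() - 1 - top       # s > 0: shift right, s < 0: left
    if s > 0:
        sticky = int(man & ((1 << s) - 1) != 0)
        man = (man >> s) | sticky        # keep lost bits in the sticky bit
    elif s < 0:
        man <<= -s
    exp += s                             # must stay inside the exponent range
    return exp, man

# Listing 3 example: the hidden bit is already in place, so s = 0
print(normalize(25, 0b1_0010000000_101, 10))  # (25, 0b1_0010000000_101)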
Rounding: The very last and optional stage is rounding. It is beneficial to have a library supporting arithmetic with an optional rounding stage: in some applications where multiple back-to-back floating-point arithmetic operations are performed, rounding is performed only once at the very end of all the operations, while in others rounding is computed after each operation, depending on the value of flags set in previous stages.
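A minimal sketch of the round-to-nearest-even decision on a mantissa carrying three GRS bits (the scheme visible in Listing 3; other tie-breaking rules are possible) looks as follows:

def round_rne(man, grs_bits=3):
    # drop the GRS bits, adding one ulp on round-up (round-to-nearest-even)
    grs = man & ((1 << grs_bits) - 1)    # Guard, Round, Sticky
    man >>= grs_bits
    halfway = 1 << (grs_bits - 1)        # GRS = 100 is exactly halfway
    if grs > halfway or (grs == halfway and man & 1):
        man += 1                         # ulp added; may need renormalizing
    return man

# Listing 3 example: GRS = 101 is above halfway, so one ulp is added
print(bin(round_rne(0b1_0010000000_101)))  # 0b10010000001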
For the hardware simulation, the inputs to the hardware are accepted in binary representation, while the majority of the floating-point inputs are generated in human-readable decimal representation for the development of test cases. More importantly, the same decimal value has different floating-point binary representations depending on the required precision. To reduce the number-conversion effort, this library also supports built-in conversion of the decimal representation to the required-precision binary format and vice-versa. With all these design decisions, this Python library therefore focuses on reducing debugging effort when implementing completely new hardware logic for an arbitrary-precision floating-point number format. Once the required experimental hardware is built, this library can also be used as a full-fledged simulator in a fully-automated verification flow for calculating expected values.

IV. SUPPORTING FUNCTIONS

This library provides all the essential functions, including support for the denormal number format, which any full-fledged functional library should offer, along with additional switches like verbose. This verbose switch can be a convenient and useful feature when modifying existing floating-point hardware to support a different precision or when designing new floating-point hardware [17] altogether.

The first and foremost feature is that the inputs used for computation can be passed as hexadecimal/binary as well as decimal. This is very helpful when trying different ranges of values, as the hardware can sometimes compute perfectly when inputs are within a range but malfunction when the result overflows or underflows. The main functions available from this Python library are shown in Listing 1.

Listing 1: Arithmetic and Conversion functions from the library

#1 Arithmetic functions with various datatypes
[Operation][type](a,b,ewidth,nwidth,debug_on=0)
operation=[ADD,SUB,MUL,DIV]
type=[B,H,D] # binary, hexadecimal, decimal

#2 Conversion functions on different formats
[format]2[format](a,ewidth,nwidth)
format=[DEC,FLP] # decimal, floating-point

It not only supports various input datatypes but also supports converting a decimal to the <E,N> floating-point hex format and back to decimal. This conversion function is very useful for calculating error gradients, where conversion from the <E,N> floating-point format to decimal is needed, without much change to existing frameworks. With this flexible datatype option, it is easier to compute the same equations in decimal and in hexadecimal/binary formats.
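A hypothetical round trip with the conversion and decimal-input functions named in Listing 1 could look as follows; the exact return types and formatting are assumptions, as only the function signatures are documented above:

import FLP as flp

Ewidth = 5
Nwidth = 16                              # IEEE-754 half precision <5,10>

# decimal -> <E,N> representation and back (names from Listing 1)
flp_a = flp.DEC2FLP(5.625, Ewidth, Nwidth)
dec_a = flp.FLP2DEC(flp_a, Ewidth, Nwidth)

# the same addition via the decimal-input variant of Listing 1
dec_c = flp.ADDD(5.625, 1147.0, Ewidth, Nwidth)
print(flp_a, dec_a, dec_c)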
On the support front, this library also helps with quick debugging when an optional debug switch is enabled. An example Python code snippet showing a hexadecimal computation on the IEEE-754 standard 16-bit floating-point format, with and without the debug option enabled, is given in Listing 2.

Listing 2: Code snippet showing the usage as calculator and debugger

import FLP as flp

# setting precision variables
Ewidth = 5
Nwidth = 16

# hexadecimal calculation in required precision
hex_A = 0x45a0
hex_B = 0x647B

#--------simple computation------------#
hex_C = flp.ADDH(hex_A, hex_B, Ewidth, Nwidth)
print("1.Simple computation on Hex numbers\n")
print(hex_A, "+", hex_B, "=", hex_C, "\n\n")

#----computation with debug options set----#
print("2.Computation with debug option\n")
debug_on = 1  # setting for debug help
hex_C = flp.ADDH(hex_A, hex_B, Ewidth, Nwidth, debug_on)

The outputs produced by the above Python snippet are shown in Listing 3.

Listing 3: Output for the code shown in Listing 2

1.Simple computation on Hex numbers
0x45a0+0x647B=0x6481

2.Computation with debug option
#--------- Inputs ------------#
Floating point A : 0_10001_0110100000
Floating point B : 0_11001_0001111011
Is subtraction   : 0

#------- Alignment -----------#
is subtraction?  : 0
Rshift Mantissa  : 8
mantissa to shift: 1_0110100000_000
Aligned mantissa : 0_0000000101_101
Maximum mantissa : 1_0001111011_000

#------- Arithmetic ----------#
lsb cin          : 0
Mantissa Min Val : 0_0000000101_101
Complemented Val : 0_0000000101_101
Mantissa Max Val : 1_0001111011_000
Arithmetic Val   : 01_0010000000_101

#----- Normalization ---------#
Max-Exp Value    : 25
preshift value   : 0
NormalizedExp is : 11001
NormalizedMan is : 0010000000101
Pre Rounding is  : 0_11001_0010000000_101

#-------- Round --------------#
PreRounding is   : 0110010010000000_101
Round decision   : 101
is ulp added     : 1
rounded value is : 00110010010000001
rounded Hex is   : 0x6481

In the first case, the function returns the computed value in the same format as the input. In the second case, with the debug option enabled, the function produces valuable information to help debug the hardware results. Under the alignment section, the output shows that the exponent difference for the given inputs is 8. The smaller mantissa before and after shifting by 8 is shown, along with the hidden bit at the start and the extra three GRS (Guard, Round, Sticky) bits used for the rounding decision.

The arithmetic section of the output shows the result of adding the mantissas of both inputs. The displayed output has an extra bit before the hidden bit, as adding two n-bit values may produce n + 1 output bits. The normalization section displays the final exponent value, the number of bits the mantissa is shifted, and the final values of the exponent and mantissa. The very last section shows the value of the GRS bits, the decision taken (whether an ulp is added or not), and the final rounded value of the result.
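Because <5,10> with Nwidth=16 is exactly IEEE-754 binary16, the example in Listing 3 can be cross-checked against NumPy's half-precision arithmetic (an external sanity check, not part of the proposed library):

import numpy as np

a = np.array([0x45A0], dtype=np.uint16).view(np.float16)[0]  # 5.625
b = np.array([0x647B], dtype=np.uint16).view(np.float16)[0]  # 1147.0
c = np.float16(a + b)                      # 1152.625 rounds up to 1153.0
print(hex(np.array([c]).view(np.uint16)[0]))  # 0x6481, matching Listing 3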

It should be noted that this library mainly supports functional correctness and debugging of the logic; it does not emphasize execution speed or any optimization for the architecture it runs on.

V. USE CASES & IMPLEMENTATION

Applications in [5], [10] need different precision to get the best results; once these formats with the required precision are known, hardware can be designed effectively using the proposed Python library. In other applications like [9], where mixed-precision formats are used, one layer's outputs can be converted to the required-precision format using this library. The proposed library can also be seamlessly integrated into existing frameworks such as TensorFlow, PyTorch, etc. In [17], the authors proposed vectorized floating-point hardware, and this library can also be made to support such vectorized hardware with little effort. The proposed Python library has been used to automatically verify in-house developed floating-point arithmetic hardware, as shown in Fig. 3. The random input stimuli for the FP hardware design are generated automatically, and the expected results are calculated using the proposed software library. A testbench is also generated with the same random stimuli, which can be used with the other design files to simulate the hardware in QuestaSim [18]. Subsequently, the results produced by the simulated hardware and the expected results generated by the Python library are compared to check the correctness of the hardware. In the event of a functional mismatch, the expected and simulated results from each stage can be compared bit-wise, and the hardware can be easily debugged. The Python library is available at https://github.com/TKRArvind/FloatingPoint_Calculator. Currently, this library supports the addition and subtraction operations; other operations such as multiplication and division are under development and not yet published. However, the idea of developing a software library that replicates the same data flow as the hardware is always helpful for both testing and debugging.

Fig. 3: Use of the proposed library for FP hardware verification: stimuli generation feeds both the expected-value computation (Python library) and the testbench generation for the Verilog design files; hardware simulation results are matched against the expected values, mismatches are examined using the stages' debug results, and reports are generated.
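A driver for this flow could look like the following sketch; it is illustrative and not shipped with the library, it assumes the arithmetic functions accept and return integers for hexadecimal inputs, and it assumes the testbench writes one hexadecimal result per line to a file (here hypothetically named sim_results.txt):

import random
import FLP as flp

Ewidth, Nwidth = 5, 16
random.seed(0)                             # reproducible stimuli

stimuli = [(random.getrandbits(Nwidth), random.getrandbits(Nwidth))
           for _ in range(1000)]
expected = [flp.ADDH(a, b, Ewidth, Nwidth) for a, b in stimuli]

with open("sim_results.txt") as f:         # dumped by the HDL testbench
    simulated = [int(line, 16) for line in f]

for (a, b), want, got in zip(stimuli, expected, simulated):
    if want != got:
        print(f"mismatch: {a:#06x} + {b:#06x} -> {got:#06x}, "
              f"expected {want:#06x}")
        flp.ADDH(a, b, Ewidth, Nwidth, 1)  # re-run with stage dumps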

VI. CONCLUSION

We have briefly discussed the importance of reduced-precision, reduced bit-width, non-standard IEEE-754 floating-point arithmetic hardware for the energy efficiency of hardware accelerators built for artificial intelligence applications. We have presented a Python library that can be used as a calculator and debugger when developing IEEE-754 compliant floating-point hardware units. The library is a simple and easy-to-use header-only library that can even be used with any existing Python-based AI framework. It also provides functions for easy manipulation of numbers in the required or chosen number format. Other useful built-in functions, like hex-to-bin conversion and vice-versa, and optional features, like rounding for the arithmetic or the rounding type to be used in hardware, make it an ideal library for both the development and testing of floating-point arithmetic hardware. Furthermore, this library mimics the hardware in software, i.e., not only the final result of the arithmetic operation but also the intermediate stages' outputs are available, making hardware debugging an easier task. A few use cases of the library and implementation details were also discussed.

REFERENCES

[1] M. B. Taylor, "A landscape of the new dark silicon design regime," IEEE Micro, vol. 33, no. 5, pp. 8–19, 2013.
[2] H. Esmaeilzadeh, E. Blem, R. S. Amant, K. Sankaralingam, and D. Burger, "Power challenges may end the multicore era," Communications of the ACM, vol. 56, no. 2, pp. 93–102, 2013.
[3] N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers et al., "In-datacenter performance analysis of a tensor processing unit," in Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017, pp. 1–12.
[4] S. Cass, "Taking AI to the edge: Google's TPU now comes in a maker-friendly package," IEEE Spectrum, vol. 56, no. 5, pp. 16–17, 2019.
[5] J. Y. F. Tong, D. Nagle, and R. A. Rutenbar, "Reducing power by optimizing the necessary precision/range of floating-point arithmetic," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 8, no. 3, pp. 273–286, 2000.
[6] A. Agrawal, S. M. Mueller, B. M. Fleischer, X. Sun, N. Wang, J. Choi, and K. Gopalakrishnan, "DLFloat: A 16-b floating point format designed for deep learning training and inference," in 2019 IEEE 26th Symposium on Computer Arithmetic (ARITH). IEEE, 2019, pp. 92–95.
[7] S. Xie, S. Davidson, I. Magaki, M. Khazraee, L. Vega, L. Zhang, and M. B. Taylor, "Extreme datacenter specialization for planet-scale computing: ASIC clouds," ACM SIGOPS Operating Systems Review, vol. 52, no. 1, pp. 96–108, 2018.
[8] N. Wang, J. Choi, D. Brand, C.-Y. Chen, and K. Gopalakrishnan, "Training deep neural networks with 8-bit floating point numbers," Advances in Neural Information Processing Systems, vol. 31, 2018.
[9] N. Mellempudi, S. Srinivasan, D. Das, and B. Kaul, "Mixed precision training with 8-bit floating point," arXiv preprint arXiv:1905.12334, 2019.
[10] A. Buttari, J. J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, "Exploiting mixed precision floating point hardware in scientific computations," in High Performance Computing Workshop, 2006, pp. 19–36.
[11] T. Granlund, "GNU multiple precision arithmetic library," http://gmplib.org/, 2010.
[12] T. Zhang, Z. Lin, G. Yang, and C. De Sa, "QPyTorch: A low-precision arithmetic simulation framework," in 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS). IEEE, 2019, pp. 10–13.
[13] SEGGER, "emFloat - The SEGGER Floating-Point Library," 2022, (accessed on 05/02/2022). [Online]. Available: https://www.segger.com/products/development-tools/runtime-library/technology/floating-point-library
[14] Micro Digital, "GoFast® Floating Point Library," 2022. [Online]. Available: https://www.smxrtos.com/ussw/gofast.htm
[15] Intel, "Intel and Floating-Point: Updating One of the Industry's Most Successful Standards," 2022. [Online]. Available: https://www.intel.com/content/dam/www/public/us/en/documents/case-studies/floating-point-case-study.pdf
[16] V. Rajaraman, "IEEE standard for floating point numbers," Resonance, vol. 21, no. 1, pp. 11–30, 2016.
[17] S. Mach, F. Schuiki, F. Zaruba, and L. Benini, "FPnew: An Open-source Multiformat Floating-point Unit Architecture for Energy-Proportional Transprecision Computing," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 4, pp. 774–787, 2020.
[18] Siemens, "Questa advanced simulator," 2021. [Online]. Available: https://eda.sw.siemens.com/en-US/ic/questa/simulation/advanced-simulator/

