Design of FPGA Based 32-Bit Floating Point Arithmetic Unit and Verification of Its VHDL Code Using MATLAB
Abstract — Most of the algorithms implemented in FPGAs used to be fixed-point. Floating-point operations are useful for computations involving a large dynamic range, but they require significantly more resources than integer operations. With the current trends in system requirements and available FPGAs, floating-point implementations are becoming more common and designers are increasingly taking advantage of FPGAs as a platform for floating-point implementations. The rapid advance in Field-Programmable Gate Array (FPGA) technology makes such devices increasingly attractive for implementing floating-point arithmetic. Compared to Application Specific Integrated Circuits, FPGAs offer reduced development time and costs. Moreover, their flexibility enables field upgrade and adaptation of hardware to run-time conditions. A 32-bit floating point arithmetic unit conforming to the IEEE 754 standard has been designed using VHDL code, and all operations of addition, subtraction, multiplication and division have been tested on Xilinx. Thereafter, a Simulink model in MATLAB has been created for verification of the VHDL code of that floating point arithmetic unit in Modelsim.

Index Terms — Floating Point, Arithmetic Unit, VHDL, Modelsim, Simulink.

1. Introduction

Floating point operations have found intensive application in various fields that require high-precision operation, owing to their great dynamic range, high precision and easy operation rules. Much attention has been paid to the design and research of floating point processing units. With the increasing demand for floating point operations in high-speed signal processing and scientific computation, the need for high-speed hardware floating point arithmetic units has become more and more pressing. Implementing floating point arithmetic is easy and convenient in high level languages that support floating point, but implementing the arithmetic in hardware has been very difficult. With the development of very large scale integration (VLSI) technology, devices such as Field Programmable Gate Arrays (FPGAs) have become the best options for implementing floating point hardware arithmetic units because of their high integration density, low price, high performance and flexibility for applications that require high-precision operation.

Floating-point implementation on FPGAs has been of interest to many researchers. The use of custom floating-point formats in FPGAs has been investigated in a long series of work [1, 2, 3, 4, 5]. In most cases, these formats are shown to be adequate for some applications, to require significantly less area to implement than IEEE formats [6], and to run significantly faster than IEEE formats. Moreover, these efforts demonstrate that such customized formats enable significant speedups for certain chosen applications. The earliest work on IEEE floating-point [7] focused on single precision; although it was found to be feasible, it was extremely slow. Eventually, it was demonstrated [8] that while FPGAs were uncompetitive with CPUs in terms of peak FLOPs, they could provide competitive sustained floating-point performance. Since then, a variety of work [2, 5, 9, 10] has demonstrated the growing feasibility of IEEE compliant, single precision floating point arithmetic and other floating-point formats of approximately the same complexity. In [2, 5], the details of the floating-point format are varied to optimize performance. The specific issues of implementing floating-point division in FPGAs have been studied [10]. Early implementations either involved multiple FPGAs for implementing IEEE 754 single precision floating-point arithmetic, or they adopted custom data formats to enable a single-FPGA solution. To overcome device size restrictions, subsequent single-FPGA implementations of the IEEE 754 standard employed serial arithmetic or avoided features, such as gradual underflow, which are expensive to implement.

In this paper, a high-speed IEEE 754-compliant 32-bit floating point arithmetic unit designed using VHDL code is presented; all operations of addition, subtraction, multiplication and division were tested on Xilinx and verified successfully. Thereafter, the new feature of creating a Simulink model using MATLAB for verification of the VHDL code of that 32-bit floating point arithmetic unit in Modelsim is explained. The simulation results of addition, subtraction, multiplication and division in the Modelsim wave window
Copyright © 2014 MECS I.J. Information Engineering and Electronic Business, 2014, 1, 1-14
have been demonstrated.

The rest of the paper is organized as follows. Section 2 presents the general floating point architecture. Section 3 explains the algorithms used to write VHDL codes for implementing 32-bit floating point arithmetic operations: addition/subtraction, multiplication and division. Section 4 details the VHDL code and behaviour model for all the above arithmetic operations. Section 5 explains the design steps along with the experimental method to create the Simulink model in MATLAB for verification of the VHDL code in Modelsim. The results are shown and discussed in Section 6, while Section 7 concludes the paper with further scope of work.

2. Floating Point Architecture

Floating point numbers are one possible way of representing real numbers in binary format; the IEEE 754 [11] standard presents two different floating point formats, binary interchange format and decimal interchange format. This paper focuses only on the single precision normalized binary interchange format. Figure 1 shows the IEEE 754 single precision binary format representation; it consists of a one-bit sign (S), an eight-bit exponent (E), and a twenty-three-bit fraction (M), or mantissa.

32-bit single precision floating point numbers in the IEEE standard are stored as:
S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM
S: Sign – 1 bit
E: Exponent – 8 bits
M: Mantissa – 23 bits Fraction

Figure.1: IEEE 754 single precision binary format representation (sign bit: 1 bit; biased exponent: 8 bits; significand: 23 bits; 32 bits in total)

The value of the number V, where F denotes the fraction field M:

If E=255 and F is nonzero, then V = NaN ("Not a Number")
If E=255 and F is zero and S is 1, then V = -Infinity
If E=255 and F is zero and S is 0, then V = Infinity
If 0<E<255, then V = (-1)**S * 2**(E-127) * (1.F) (exponent range = -126 to +127)
If E=0 and F is nonzero, then V = (-1)**S * 2**(-126) * (0.F) ("un-normalized" values)
If E=0 and F is zero and S is 1, then V = -0
If E=0 and F is zero and S is 0, then V = 0

An extra bit is added to the mantissa to form what is called the significand. If the exponent is greater than 0 and smaller than 255, and there is a 1 in the MSB of the significand, then the number is said to be a normalized number; in this case the real number is represented by (1):

$V = (-1)^{S} \times 2^{(E - \mathrm{Bias})} \times (1.M)$ (1)

where $M = m_{22}2^{-1} + m_{21}2^{-2} + m_{20}2^{-3} + \dots + m_{1}2^{-22} + m_{0}2^{-23}$ and Bias = 127.

3. Algorithms for Floating Point Arithmetic Unit

The algorithms for floating point addition/subtraction, multiplication and division are described in this section using flow charts; they form the basis for writing the VHDL codes for implementation of the 32-bit floating point arithmetic unit.

3.1 Floating Point Addition / Subtraction

The algorithm for floating point addition is explained through the flow chart in Figure 2. While adding two floating point numbers, two cases may arise. Case I: both numbers have the same sign, i.e. both are either +ve or –ve; in this case the MSBs of both numbers are either 1 or 0. Case II: the numbers have different signs, i.e. one number is +ve and the other is –ve; in this case the MSB of one number is 1 and that of the other is 0.

Case I: - When both numbers are of same sign

Step 1:- Enter two numbers N1 and N2. E1, S1 and E2, S2 represent the exponent and significand of N1 and N2 respectively.
Step 2:- Check whether E1 or E2 = ‘0’. If yes, set the hidden bit of N1 or N2 to zero. If not, check whether E2 > E1; if yes, swap N1 and N2; if E1 > E2, the contents of N1 and N2 need not be swapped.
Step 3:- Calculate the difference in exponents d = E1 - E2. If d = ‘0’ then there is no need to shift the significand. If d is more than ‘0’, say ‘y’, then shift S2 to the right by an amount ‘y’ and fill the leftmost bits with zeros. Shifting is done with the hidden bit.
Step 4:- The amount of shifting, i.e. ‘y’, is added to the exponent of N2. New exponent value E2 = (previous E2) + ‘y’. The operands are now aligned because E1 = E2.
Step 5:- Check whether N1 and N2 have different signs; if ‘no’:
Step 6:- Add the significands of 24 bits each, including the hidden bit: S = S1 + S2.
Step 7:- Check if there is a carry out in the significand addition. If yes, add ‘1’ to the exponent value of either E1 or the new E2. After addition, shift the overall result of the significand addition to the right by one, making the MSB of S ‘1’ and dropping the LSB of the significand.
Step 8:- If there is no carry out in step 6, then the previous exponent is the real exponent.
Step 9:- Sign of the result, i.e. MSB = MSB of either N1 or N2.
Step 10:- Assemble the result into 32-bit format, excluding the 24th bit of the significand, i.e. the hidden bit.

Case II: - When both numbers are of different sign

Steps 1, 2, 3 and 4 are the same as in Case I.
Step 5:- Check whether N1 and N2 have different signs; if ‘yes’:
Step 6:- Take the 2’s complement of S2 and then add it to S1, i.e. S = S1 + (2’s complement of S2).
Step 7:- Check if there is a carry out in the significand addition. If yes, discard the carry and shift the result to the left until there is a ‘1’ in the MSB, counting the amount of shifting, say ‘z’.
Step 8:- Subtract ‘z’ from the exponent value, either E1 or E2. Now the original exponent is E1 - ‘z’. Also append ‘z’ zeros at the LSB.
Step 9:- If there is no carry out in step 6 then the MSB must be ‘1’, and in this case simply replace ‘S’ by its 2’s complement.
Step 10:- Sign of the result, i.e. MSB = sign of the larger number, either the MSB of N1 or the MSB of N2.
Step 11:- Assemble the result into 32-bit format, excluding the 24th bit of the significand, i.e. the hidden bit.
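The Case I and Case II steps can be illustrated with a small software model. The following is a minimal Python sketch, not the paper's VHDL: the names `f2b`, `unpack`, `fp_add` and `fp_sub` are hypothetical, inputs are assumed to be normalized numbers or zero, the result is assumed to stay in the normalized range, and shifted-out bits are simply dropped (no rounding). Instead of forming the 2's complement explicitly, the sketch subtracts the smaller aligned significand from the larger one, which gives the same magnitude and sign as Steps 6 to 10 of Case II.

```python
import struct

def f2b(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def unpack(word):
    """Step 1: split a word into sign, biased exponent and 24-bit significand."""
    sign = (word >> 31) & 0x1
    exp = (word >> 23) & 0xFF
    frac = word & 0x7FFFFF
    # Step 2: the hidden bit is 0 when the exponent field is 0, otherwise 1.
    sig = frac if exp == 0 else frac | (1 << 23)
    return sign, exp, sig

def fp_add(a, b):
    """Add two single-precision words (normalized or zero), truncating."""
    s1, e1, m1 = unpack(a)
    s2, e2, m2 = unpack(b)
    # Step 2: swap so that operand 1 carries the larger exponent.
    if e2 > e1:
        s1, e1, m1, s2, e2, m2 = s2, e2, m2, s1, e1, m1
    # Steps 3-4: shift the smaller operand right by d = E1 - E2 (with hidden bit).
    m2 >>= e1 - e2
    exp = e1
    if s1 == s2:
        # Case I: add significands; on carry out shift right and bump exponent.
        sig, sign = m1 + m2, s1
        if sig >> 24:
            sig >>= 1
            exp += 1
    else:
        # Case II: subtract magnitudes; the sign follows the larger operand.
        if m1 >= m2:
            sig, sign = m1 - m2, s1
        else:
            sig, sign = m2 - m1, s2
        if sig == 0:
            return 0  # exact cancellation
        # Shift left until the MSB (bit 23) is 1, decrementing the exponent.
        while not (sig >> 23) & 1:
            sig <<= 1
            exp -= 1
    # Final step: assemble the 32-bit word without the hidden bit.
    return (sign << 31) | (exp << 23) | (sig & 0x7FFFFF)

def fp_sub(a, b):
    """Subtraction is addition with the sign bit of the second operand flipped."""
    return fp_add(a, b ^ 0x80000000)
```

For example, `hex(fp_add(f2b(2.5), f2b(4.75)))` gives the bit pattern of 7.25. Because the sketch truncates, inexact results can differ in the last bit from a unit that rounds.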
[Figure 2: Flow chart for floating point addition/subtraction]
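As a worked instance of the same-sign case of Section 3.1, the decimal inputs 2.5 and 4.75 (one of the input pairs used in the simulations of Section 6) can be followed by hand. The derivation below is an illustration added here, not taken from the paper's figures.

```latex
\begin{align*}
2.5  &= 1.01_2 \times 2^{1}   &&\Rightarrow\ E = 1 + 127 = 128 \\
4.75 &= 1.0011_2 \times 2^{2} &&\Rightarrow\ E = 2 + 127 = 129 \\
\intertext{The exponent of $4.75$ is larger, so the operands are swapped: $N_1 = 4.75$, $N_2 = 2.5$.}
d &= E_1 - E_2 = 129 - 128 = 1 \\
S_2 &= 1.01_2 \gg 1 = 0.101_2 \\
S &= S_1 + S_2 = 1.0011_2 + 0.1010_2 = 1.1101_2 \quad \text{(no carry out)} \\
V &= 1.1101_2 \times 2^{129 - 127} = 1.8125 \times 4 = 7.25
\end{align*}
```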
In this algorithm three 8-bit comparators, one 24-bit and two 8-bit adders, two 8-bit subtractors, two shift units and one swap unit are required in the design.

The first 8-bit comparator is used to compare the exponents of the two numbers. If the exponents are equal then there is no need for shifting. The second 8-bit comparator compares each exponent with zero. If the exponent of either number is zero, the hidden bit of that number is set to zero. The third comparator is required to check whether the exponent of number 2 is greater than that of number 1. If it is, the numbers are swapped.

One subtractor is required to compute the difference between the 8-bit exponents of the two numbers. The second subtractor is used when the numbers are of different sign: if a carry appears after addition of the significands, this carry is subtracted from the exponent using the 8-bit subtractor.

One 24-bit adder is required to add the 24-bit significands of the two numbers. One 8-bit adder is required when both numbers are of the same sign: if a carry appears after addition of the significands, this carry is added to the exponent using the 8-bit adder. The second 8-bit adder is used to add the amount of shifting to the exponent of the smaller number.

One swap unit is required to swap the numbers if N2 is greater than N1. Swapping is normally done by using a third variable. Two shift units are required: one shifts left and the other shifts right.

3.2 Floating Point Multiplication

The algorithm for floating point multiplication is explained through the flow chart in Figure 3. Let N1 and N2 be normalized operands represented by S1, M1, E1 and S2, M2, E2 as their respective sign bit, mantissa (significand) and exponent. The following four steps are used for floating point multiplication.

1. Multiply significands, add exponents, and determine sign
M = M1 * M2
E = E1 + E2 - Bias
S = S1 XOR S2
2. Normalize mantissa M (shift left or right by 1) and update exponent E
3. Round the result to fit in the available bits
4. Determine exception flags and special values for overflow and underflow.
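The four multiplication steps can be sketched in software as follows. This is a minimal Python model under stated assumptions, not the paper's VHDL: the names `f2b`, `unpack` and `fp_mul` are hypothetical, operands are assumed to be normalized or zero, and step 3 is implemented as plain truncation because the rounding mode is not specified in the text.

```python
import struct

BIAS = 127

def f2b(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def unpack(word):
    """Split a word into sign, biased exponent and 24-bit significand."""
    sign = (word >> 31) & 0x1
    exp = (word >> 23) & 0xFF
    frac = word & 0x7FFFFF
    sig = frac if exp == 0 else frac | (1 << 23)  # hidden bit
    return sign, exp, sig

def fp_mul(a, b):
    """Multiply two single-precision words (normalized or zero), truncating."""
    s1, e1, m1 = unpack(a)
    s2, e2, m2 = unpack(b)
    # Step 1: sign, exponent and 48-bit significand product.
    sign = s1 ^ s2
    if m1 == 0 or m2 == 0:
        return sign << 31  # signed zero
    exp = e1 + e2 - BIAS
    prod = m1 * m2  # leading one is at bit 46 or bit 47
    # Step 2: normalize so that the leading one sits at bit 46.
    if prod >> 47:
        prod >>= 1
        exp += 1
    # Step 3 (assumption): truncate to the upper 24 bits, bits 46..23.
    sig = prod >> 23
    # Step 4: overflow gives signed infinity, underflow gives signed zero.
    if exp >= 255:
        return (sign << 31) | 0x7F800000
    if exp <= 0:
        return sign << 31
    return (sign << 31) | (exp << 23) | (sig & 0x7FFFFF)
```

For example, `hex(fp_mul(f2b(2.5), f2b(4.75)))` gives the bit pattern of 11.875.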
[Figure 3: Flow chart for floating point multiplication]
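A worked instance of the multiplication steps in which normalization is needed is $1.5 \times 1.5$. The operands are chosen here for illustration and are not from the paper's test set.

```latex
\begin{align*}
N_1 = N_2 &= 1.5 = 1.1_2 \times 2^{0}, \qquad E_1 = E_2 = 127,\quad S_1 = S_2 = 0 \\
S &= S_1 \oplus S_2 = 0 \\
E &= E_1 + E_2 - \mathrm{Bias} = 127 + 127 - 127 = 127 \\
M &= 1.1_2 \times 1.1_2 = 10.01_2 \quad \text{(leading one at bit 47 of the 48-bit product)} \\
M &\rightarrow 1.001_2, \qquad E \rightarrow 128 \quad \text{(shift right by one, increment exponent)} \\
V &= (-1)^{0} \times 2^{128 - 127} \times 1.001_2 = 2 \times 1.125 = 2.25
\end{align*}
```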
Sign Bit Calculation: The result of the multiplication is negative if exactly one of the multiplied numbers is negative; this is obtained by XORing the signs of the two inputs.

Exponent Addition is done through an unsigned adder that adds the exponent of the first input to the exponent of the second input and then subtracts the Bias (127) from the addition result (i.e. E1 + E2 - Bias). The result of this stage is called the intermediate exponent.

Significand Multiplication multiplies the unsigned significands and places the binary point in the multiplication product. The result of the significand multiplication is called the intermediate product (IP). The unsigned significand multiplication is done on 24 bits.

The result of the significand multiplication (intermediate product) must be normalized to have a leading ‘1’ just to the left of the binary point (i.e. in bit 46 of the intermediate product). Since the inputs are normalized numbers, the intermediate product has its leading one at bit 46 or 47. If the leading one is at bit 46 (i.e. to the left of the binary point) then the intermediate product is already a normalized number and no shift is needed. If the leading one is at bit 47 then the intermediate product is shifted to the right and the exponent is incremented by 1.

Overflow/underflow means that the result’s exponent is too large/small to be represented in the exponent field. The exponent of the result must be 8 bits in size, and must be between 1 and 254, otherwise the value is not a normalized one. An overflow may occur while adding the two exponents or during normalization. Overflow due to exponent addition can be compensated during subtraction of the bias, resulting in a normal output value (normal operation). An underflow may occur while subtracting the bias to form the intermediate exponent. If the intermediate exponent < 0 then it is an underflow that can never be compensated; if the intermediate exponent = 0 then it is an underflow that may be compensated during normalization by adding 1 to it. When an overflow occurs, an overflow flag signal goes high and the result turns to ±Infinity (sign determined according to the signs of the floating point multiplier inputs). When an underflow occurs, an underflow flag signal goes high and the result turns to ±Zero (sign determined according to the signs of the floating point multiplier inputs).

3.3 Floating Point Division

The algorithm for floating point division is explained through the flow chart in Figure 4. Let N1 and N2 be normalized operands represented by S1, M1, E1 and S2, M2, E2 as their respective sign bit, mantissa (significand) and exponent. Let x = N1 and d = N2, with the final result q taken as “x/d”. Again the following four steps are used for floating point division.
Figure. 4: Flow Chart for floating point Division (q = x/d; N1=x and N2=d)
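The individual division steps are not listed in the text above, so the following Python sketch is built by analogy with the multiplication steps and should be read as an assumption rather than as the paper's method: significands are divided (M = M1/M2), the exponent is taken as E = E1 - E2 + Bias, and the sign as S = S1 XOR S2. The names `f2b`, `unpack` and `fp_div` are hypothetical, operands are assumed to be normalized or zero, the quotient is truncated, a zero divisor is assumed to give ±Infinity, and 0/0 is not treated separately.

```python
import struct

BIAS = 127

def f2b(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def unpack(word):
    """Split a word into sign, biased exponent and 24-bit significand."""
    sign = (word >> 31) & 0x1
    exp = (word >> 23) & 0xFF
    frac = word & 0x7FFFFF
    sig = frac if exp == 0 else frac | (1 << 23)  # hidden bit
    return sign, exp, sig

def fp_div(x, d):
    """Compute q = x / d on single-precision words, truncating the quotient."""
    s1, e1, m1 = unpack(x)
    s2, e2, m2 = unpack(d)
    sign = s1 ^ s2
    if m2 == 0:
        return (sign << 31) | 0x7F800000  # assumption: divide by zero -> infinity
    if m1 == 0:
        return sign << 31  # signed zero
    # Assumed exponent rule, mirroring E1 + E2 - Bias for multiplication.
    exp = e1 - e2 + BIAS
    # Integer quotient scaled by 2^24; it lies between 2^23 and 2^25.
    q = (m1 << 24) // m2
    if q >> 24:
        q >>= 1      # quotient >= 1.0: drop the extra low bit
    else:
        exp -= 1     # quotient in (0.5, 1): already shifted left by one
    # q now has its leading one at bit 23.
    if exp >= 255:
        return (sign << 31) | 0x7F800000  # overflow
    if exp <= 0:
        return sign << 31                 # underflow
    return (sign << 31) | (exp << 23) | (q & 0x7FFFFF)
```

For example, `hex(fp_div(f2b(11.875), f2b(4.75)))` gives the bit pattern of 2.5.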
Figure. 6: Design steps to create Simulink model for verification of VHDL code in Modelsim

The Simulink model created to generate and verify the floating point arithmetic is shown in Figure 7. Input 1 and Input 2 are the two 32-bit floating point inputs to the model, and ‘Select’ is set to ‘01’ for Adder, ‘11’ for Divider and ‘10’ for Multiplier. It also has a scope to view the output. A sub-system is created to launch the Modelsim simulator from Simulink, as shown in Fig. 8.

6. Results

Double clicking ‘Launch HDL Simulator’ in the Simulink model loads the test bench for simulation. The ModelSim simulator opens a display window for monitoring the simulation as the test bench runs. The wave window in Figure 9 shows the simulation of two exponential inputs with Select set to ‘01’ for the ‘adder’ result as an HDL waveform. Figure 10 shows the simulation of two decimal inputs for the ‘adder’. Figures 11 and 12 show the simulation of two decimal inputs for the ‘divider’. Figures 13 and 14 show the simulation of two decimal inputs for the ‘multiplier’.
Figure. 9: Simulation result of decimal inputs 1.1 & 1.1 for ‘adder’ in Modelsim wave window
Figure. 10: Simulation result of decimal inputs 2.5 & 4.75 for ‘adder’ in Modelsim wave window
Figure. 11: Simulation result of decimal inputs 1.1 & 1.1 for ‘divider’ in Modelsim wave window
Figure. 12: Simulation result of decimal inputs 2.5 & 4.75 for ‘divider’ in Modelsim wave window
Figure. 13: Simulation result of decimal inputs 1.1 & 1.1 for ‘multiplier’ in Modelsim wave window
Figure. 14: Simulation result of decimal inputs 2.5 & 4.75 for ‘multiplier’ in Modelsim wave window
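When reading the wave windows, it can help to have the expected 32-bit patterns for the listed inputs at hand. The following Python sketch is a hypothetical reference model, not part of the paper's Simulink or Modelsim setup. It uses only the Select codes stated in the text (‘01’, ‘11’, ‘10’) and relies on the host's IEEE 754 arithmetic with round-to-nearest, so the last bit may differ from a unit that truncates. The names `SELECT`, `to_bits`, `to_float` and `reference` are assumptions, and division by zero is not handled.

```python
import struct

# Select codes as stated for the Simulink model.
SELECT = {"01": "adder", "11": "divider", "10": "multiplier"}

def to_bits(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def to_float(word):
    """Interpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

def reference(select, in1, in2):
    """Expected output word for two 32-bit input words and a Select code."""
    a, b = to_float(in1), to_float(in2)
    op = SELECT[select]
    if op == "adder":
        result = a + b
    elif op == "multiplier":
        result = a * b
    else:
        result = a / b
    # Packing as 'f' rounds the double result to single precision.
    return to_bits(result)

if __name__ == "__main__":
    # The two input pairs named in the captions of Figures 9 to 14.
    for x, y in [(1.1, 1.1), (2.5, 4.75)]:
        for code, name in SELECT.items():
            out = reference(code, to_bits(x), to_bits(y))
            print(f"{name:10s} {x} , {y} -> 0x{out:08X}")
```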
Table I below shows the input and output details of the floating point arithmetic architecture designed and linked using Simulink and Modelsim.
Table I
Wave | Select | Input 1 | Input 2 | Output