
International Conference on Communication and Signal Processing, April 6-8, 2016, India

Design and Synthesis of Goldschmidt Algorithm


based Floating Point Divider on FPGA
Naginder Singh and Trailokya Nath Sasamal


Abstract—This paper presents a single precision floating point division based on the Goldschmidt computational division algorithm. The Goldschmidt computational algorithm is implemented using a 32-bit floating point multiplier and subtractor. The salient feature of the proposed design is that the module for computing the mantissa in the 32-bit floating point multiplier is designed using a 24-bit Vedic multiplication (Urdhva-Tiryagbhyam sutra) technique. The 32-bit floating point multiplier, designed using the Vedic multiplication technique, yields a higher computational speed and is used to increase the performance of the floating point divider. The main objective is to synthesize the proposed floating point divider on FPGA using the Verilog hardware description language (HDL). The proposed floating point divider can be used in the design of floating point divide-add fused (DAF) architecture.

Index Terms—Floating point divider, FPGA, Goldschmidt computational algorithm, Urdhva Tiryagbhyam sutra.

I. INTRODUCTION

Reconfigurable computing provides very versatile high-speed processing. Due to the advancement of field programmable gate arrays (FPGA), we have reached a point where the architecture of processors can be modified instantaneously, and the enhanced features of the Spartan-6 notably reduce the cost per logic cell. Floating point dividers are used in many arithmetic and logic units and in many digital systems. Power consumption and throughput are the major factors that play a role when designing a floating point divider. The design of the floating point divider in this paper is based on a computational algorithm that makes use of basic mathematical operations such as multiplication and subtraction. This computational algorithm is implemented using a 32-bit floating point multiplier and subtractor. In the proposed design, the module used for computing the mantissa in the 32-bit floating point multiplier is designed using Vedic mathematics (Urdhva-Tiryagbhyam sutra).

In this computational algorithm, the numerator and denominator are scaled using a common factor; as a result, the denominator converges to one and the numerator converges directly to the quotient. This computation of the quotient is done using the Goldschmidt computational division algorithm, an iterative process in which the denominator is scaled to one to obtain the final quotient. The iterative process involves several complex operations for division, where precision must be maintained not only over very large data intervals but must also be high for accurate operation.

Division is a mathematical operation used in various signal processing algorithms, and different types of division algorithms are described in [1-3]. The computational division algorithm described in [4] converges much faster, within one iteration, and provides high computational speed and throughput. To design and implement 32-bit floating point division based on the Goldschmidt computational algorithm, 32-bit floating point multiplier and 32-bit floating point subtraction modules are used [5-6]. For efficient implementation of the floating point multiplier, Vedic multiplication is used for calculating the mantissa part [7-8]. The floating point divide-add fused (DAF) architecture presents a dedicated unit for the combined operation of floating point division followed by addition or subtraction [9-10].

The formats for representing 32-bit and 64-bit floating point numbers are provided by the IEEE 754 standard [11-12]. IEEE 754 uses a fixed number of bits for representing a 32-bit floating point number. The representation format comprises three parts, i.e., the sign (s), the exponent (e) and the mantissa (m). Table I shows the structure of the IEEE 754 single and double precision formats. In the IEEE 754 single precision format, the mantissa is represented by 23 bits, the exponent by 8 bits, and the MSB corresponds to the sign bit. The sign of the floating point number depends on the sign bit (MSB): the number is positive when the MSB is 0 and negative when the MSB is 1.

TABLE I
IEEE 754 STANDARD FORMAT FOR SINGLE AND DOUBLE PRECISION

Format   Sign (s)   Exponent (e)   Mantissa (m)
32-bit   1-bit      8-bit          23-bit
64-bit   1-bit      11-bit         52-bit

Naginder Singh is with the School of VLSI Design and Embedded Systems Department, National Institute of Technology, Kurukshetra, Haryana, India. (E-mail: [email protected]).
Trailokya Nath Sasamal is with the Electronics and Communications Engineering Department, National Institute of Technology, Kurukshetra, Haryana, India. (E-mail: [email protected]).

The formulation of the paper is as follows. Section II describes the architecture of the floating point multiplier using Vedic multiplication. Section III presents the description of the 32-bit floating point subtractor. Section IV explains the

978-1-5090-0396-9/16/$31.00 ©2016 IEEE


Goldschmidt computational algorithm. Section V presents the simulation results of the floating point division. The conclusion and references are presented in the final sections.

II. FLOATING POINT MULTIPLIER

Fig. 1 shows the complete architecture of the proposed 32-bit floating point multiplier. This multiplier module is designed using a Vedic multiplication technique, where the mantissa calculation is done using a 24x24-bit Vedic multiplier. The main purpose of using the Vedic multiplier is to improve the overall performance of the 32-bit floating point multiplier. The IEEE 754 format specifies a fixed number of bits for representing the sign, exponent and mantissa. The inputs given to the floating point multiplier are A[31-0] and B[31-0] in IEEE 754 format. The 32-bit floating point multiplication unit is divided into three parts: the sign unit, the exponent unit and the mantissa unit.

A. Mantissa unit
In the mantissa unit, a 24x24-bit Vedic multiplier is used for higher throughput and faster computation of the mantissa. The mantissas m1 (A[22-0]) and m2 (B[22-0]) are given to the 24-bit Vedic multiplier, which produces a normalized 24-bit output with a leading one as its MSB.

B. Exponent unit
In the exponent unit, the exponent calculation is done using a ripple carry adder. The exponent is computed by providing the inputs e1 (A[30-23]) and e2 (B[30-23]) to the 8-bit ripple carry adder unit, and the result is adjusted by the bias of 127. The overflow and underflow cases are carefully handled.

C. Sign unit
In the sign unit, the sign bit is computed by XORing the 31st bits, s1 (A[31]) and s2 (B[31]), of the floating point inputs. The output of the XOR gate represents the sign of the product. The Vedic multiplication technique is used for high computational speed and throughput, and this multiplier module is used in the Goldschmidt computational algorithm for performing the iteration process.

Fig. 1. The proposed architecture for the 32-bit floating point multiplier

III. FLOATING POINT SUBTRACTOR

In the subtractor module, X[31-0] and Y[31-0] are given as inputs to the floating point subtractor. The sign (s), exponent (e) and mantissa (m) are represented in IEEE 754 format. The 32-bit floating point subtraction operation is performed stepwise as follows. First, the floating point numbers are unpacked, and the sign, exponent and mantissa are identified for performing the subtraction operation. Next, the exponents are equalized to align and normalize the mantissa parts. If neither of the operands is infinity, the relation between e1 (X[30-23]) and e2 (Y[30-23]) is determined, and the mantissa of the operand with the smaller exponent is shifted right until the exponents become equal, i.e., e1 (X[30-23]) = e2 (Y[30-23]). After alignment and normalization, the mantissas m1 (X[23-0]) and m2 (Y[23-0]) are subtracted, and the mantissa of the result is rounded off. Finally, the sign, exponent and mantissa parts are concatenated. Fig. 2 shows the complete architecture of the 32-bit floating point subtractor. This 32-bit floating point subtraction module is used in the Goldschmidt computational algorithm for performing the iterative process.

Fig. 2. The architecture of the 32-bit floating point subtractor
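The dataflow of Section II can be sketched behaviorally in Python. This is a minimal model and not the paper's Verilog: ordinary integer arithmetic stands in for the hardware, the Urdhva Tiryagbhyam ("vertically and crosswise") pattern is shown over 8-bit digit groups, the result is truncated rather than rounded, only normal numbers are handled, and all function names are ours.

```python
import struct

def f2b(x):
    """Raw IEEE 754 single precision bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def urdhva_multiply(a, b, digits=3):
    """24x24-bit multiply in the Urdhva Tiryagbhyam pattern over 8-bit groups:
    column k of the result collects every cross product a_i * b_j with
    i + j == k, after which the carries are propagated across columns."""
    xa = [(a >> (8 * i)) & 0xFF for i in range(digits)]
    xb = [(b >> (8 * i)) & 0xFF for i in range(digits)]
    cols = [0] * (2 * digits - 1)
    for i in range(digits):            # vertically and crosswise products
        for j in range(digits):
            cols[i + j] += xa[i] * xb[j]
    result, carry = 0, 0
    for k, c in enumerate(cols):       # carry propagation
        total = c + carry
        result |= (total & 0xFF) << (8 * k)
        carry = total >> 8
    return result | (carry << (8 * len(cols)))

def fp32_multiply(a_bits, b_bits):
    """Behavioral model of the three units of the 32-bit multiplier."""
    # Sign unit: XOR of the two sign bits (bit 31).
    sign = ((a_bits >> 31) ^ (b_bits >> 31)) & 0x1

    # Exponent unit: add the biased exponents and remove one bias of 127
    # (done in hardware by the 8-bit ripple carry adder; overflow and
    # underflow are ignored in this sketch).
    exponent = ((a_bits >> 23) & 0xFF) + ((b_bits >> 23) & 0xFF) - 127

    # Mantissa unit: 24x24-bit multiply of the mantissas extended with
    # the implicit leading one, giving a 48-bit product.
    m1 = (a_bits & 0x7FFFFF) | 0x800000
    m2 = (b_bits & 0x7FFFFF) | 0x800000
    product = urdhva_multiply(m1, m2)

    # Normalize so a leading one sits at the top, then keep 23 fraction bits.
    if product & (1 << 47):            # product in [2, 4): shift right once
        exponent += 1
        mantissa = (product >> 24) & 0x7FFFFF
    else:                              # product already in [1, 2)
        mantissa = (product >> 23) & 0x7FFFFF

    return (sign << 31) | ((exponent & 0xFF) << 23) | mantissa

print(fp32_multiply(f2b(2.0), f2b(3.0)) == f2b(6.0))   # True
```

The 8-bit digit grouping is an illustrative choice; the same vertically-and-crosswise column pattern applies at the bit level in the hardware multiplier.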

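Similarly, the align-subtract-normalize sequence of Section III can be sketched as follows. The sketch is ours and deliberately narrow: it assumes both operands are positive normal numbers with X >= Y and truncates instead of rounding, which is enough to show the exponent equalization and mantissa shifting described above.

```python
def fp32_subtract(x_bits, y_bits):
    """Behavioral sketch of the subtractor for positive normal X >= Y."""
    # Unpack: biased exponents and mantissas with the implicit leading one.
    e1 = (x_bits >> 23) & 0xFF
    e2 = (y_bits >> 23) & 0xFF
    m1 = (x_bits & 0x7FFFFF) | 0x800000
    m2 = (y_bits & 0x7FFFFF) | 0x800000

    # Equalize exponents: shift the mantissa belonging to the smaller
    # exponent right until e1 == e2.
    if e1 >= e2:
        m2 >>= e1 - e2
        e = e1
    else:
        m1 >>= e2 - e1
        e = e2

    # Subtract the aligned mantissas (non-negative under the X >= Y assumption).
    diff = m1 - m2
    if diff == 0:
        return 0                       # exact zero result

    # Normalize: shift left until the leading one returns to bit 23.
    while diff < 0x800000:
        diff <<= 1
        e -= 1

    # Concatenate sign (0 here), exponent and mantissa.
    return (e << 23) | (diff & 0x7FFFFF)

print(fp32_subtract(0x40F00000, 0x40200000) == 0x40A00000)  # 7.5 - 2.5 == 5.0: True
```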
IV. GOLDSCHMIDT ALGORITHM

The Goldschmidt division algorithm uses a complex initialization for computing continual iterations. The algorithm uses an iterative process that converges the divisor (D) to one by continually multiplying both the dividend (N) and the divisor by a common factor Gi [13]. In the proposed design, the Goldschmidt computational division algorithm is built from the 32-bit floating point multiplier module and subtractor module. In this division algorithm, the numerator and denominator are first scaled so that the divisor lies in the interval (0, 1); this scaling of the divisor is done by a shifting operation. To produce a precise result, more iterations are required, and for this purpose fast division algorithms have been developed. The Goldschmidt algorithm converges much faster per iteration. Thus, several multiplication and subtraction operations are needed to perform the continual iteration process.

Fig. 3. Flowchart of the 32-bit floating point division using the Goldschmidt method

Fig. 3 shows the flowchart of the floating point division using the Goldschmidt computational algorithm, where N and D are the numerator and denominator given to the multiplier. In this algorithm, one multiplier and one subtraction module are used to perform one iteration. In four iterations, the computational algorithm produces the quotient directly by scaling the numerator and denominator by the common factor Gi; hence, the denominator converges to one and the numerator converges directly to the quotient. More iterations can be performed to refine the result. This computational algorithm produces optimized results by computing one iteration in one cycle. The quotient is given by "equation (1)," where Q is the quotient, N is the numerator, D is the denominator and Gi is the factor at iteration i.

Q = (N · ∏ Gi) / (D · ∏ Gi),  i = 1, …, n                    (1)

The factor Gi+1 for the next iteration is calculated by "equation (2)", and the numerator and denominator for the next iteration are calculated by "equation (3)".

Gi+1 = 2 − Di                                                 (2)

Ni+1 = Ni · Gi+1,   Di+1 = Di · Gi+1                          (3)

The proposed floating point divider using the Goldschmidt algorithm can be used in the design of the DAF architecture. F1, F2 and F3 are the three floating point input numbers on which the DAF operation is carried out: the result is the division of F1 by F2 followed by the addition or subtraction of F3.

V. SIMULATION RESULTS

The design is synthesized using Xilinx ISE 14.7 and simulations are performed on ISim. Fig. 4 shows the simulation result of the proposed 32-bit floating point division of two numbers using the Goldschmidt method. Table III shows the device utilization of the Xilinx Spartan-6 SP605 Evaluation Platform FPGA. Table IV shows the Xilinx Power Estimator (XPE) 14.3 device summary report for the proposed 32-bit floating point division carried out on the Spartan-6 SP605 Evaluation Platform FPGA. Table V shows a comparative analysis of the proposed DAF with existing ones. In Fig. 4, N represents the numerator, D the denominator, G1 through G3 the first three iteration results, G4 the final iteration result, and Q the quotient. The two inputs N and D are given to the divider in IEEE 754 standard format as shown in Table II.

TABLE II
SAMPLE INPUT AND ITS OUTPUT FOR SIMULATION

      Decimal      Sign   Exponent   Mantissa
N     7.73         0      10000001   11101110101110000101001
D     -6.04        1      10000001   10000010100011110101110
Q     -1.279801    1      01111111   01000111101000010000111

TABLE III
DEVICE UTILIZATION OF THE XILINX SPARTAN-6 SP605 EVALUATION PLATFORM

Logic Utilization             Used    Available   Utilization
Number of Slice Registers     3,025   54,576      5%
Number of Slice LUTs          9,281   27,288      34%
Number of occupied Slices     2,961   6,822       43%
Number of bonded IOBs         161     296         54%
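The iteration of equations (2) and (3) can be reproduced in a few lines of Python. This is a double precision software sketch (names ours), not the hardware: the divisor is pre-scaled into [0.5, 1) by halving or doubling, which plays the role of the shifting step, and four iterations are used as in the simulation.

```python
def goldschmidt_divide(n, d, iterations=4):
    """Compute n / d by Goldschmidt iteration: G = 2 - D, N *= G, D *= G."""
    sign = -1.0 if (n < 0) != (d < 0) else 1.0
    N, D = abs(n), abs(d)

    # Pre-scale both by the same power of two so the divisor D lies in
    # [0.5, 1); this mirrors the shifting step that places D in (0, 1).
    while D >= 1.0:
        N, D = N * 0.5, D * 0.5
    while D < 0.5:
        N, D = N * 2.0, D * 2.0

    # Equations (2) and (3): D converges quadratically to one, so N
    # converges to the quotient of equation (1).
    for _ in range(iterations):
        G = 2.0 - D
        N, D = N * G, D * G
    return sign * N

# Table II sample inputs: N = 7.73, D = -6.04.
print(round(goldschmidt_divide(7.73, -6.04), 6))  # -1.279801, as in Table II
```

Because the divisor error squares on every pass, four iterations already leave the quotient accurate to roughly ten decimal digits for this operand range.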

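The IEEE 754 bit patterns in Table II can be checked with Python's standard struct module; a small helper (name ours) splits a float into the sign, exponent and mantissa fields of Table I:

```python
import struct

def unpack_fp32(x):
    """Split a float into IEEE 754 single precision (sign, exponent, mantissa)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]   # raw 32-bit pattern
    sign = (bits >> 31) & 0x1            # MSB: 0 = positive, 1 = negative
    exponent = (bits >> 23) & 0xFF       # 8-bit exponent, biased by 127
    mantissa = bits & 0x7FFFFF           # 23-bit fraction field
    return sign, exponent, mantissa

for value in (7.73, -6.04):              # rows N and D of Table II
    s, e, m = unpack_fp32(value)
    print(value, s, format(e, "08b"), format(m, "023b"))
```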
TABLE IV
XILINX POWER ESTIMATOR (XPE) 14.3 DEVICE SUMMARY REPORT OF THE SPARTAN-6 SP605 EVALUATION PLATFORM FPGA

Specifications          Values
Junction Temperature    25.5 °C
Total On-Chip Power     0.037 W
Thermal Margin          59.5 °C (4.0 W)
Effective ΘJA           14.3 °C/W

TABLE V
COMPARATIVE ANALYSIS OF THE DAF UNIT IN TERMS OF POWER AND LATENCY TIME

Sr. no.   Parameter      Ref [9]     Ref [10]   Proposed DAF
1         Latency Time   175.49 ns   130.8 ns   75 ns
2         Power          -           0.050 W    0.037 W

Fig. 4. Simulation result of the floating point division

VI. CONCLUSION

The single precision floating point division using the Goldschmidt computational algorithm was synthesized for the Xilinx Spartan-6 SP605 Evaluation Platform FPGA, and simulations were done on the ISim simulator. In this paper the multiplier module is designed using the Vedic algorithm, which improves the performance of the floating point divider. The device utilization parameters were optimized; thereby, the power consumption and latency time are reduced by 26% and 42.6%, respectively, as compared to the existing DAF designs. The design presented in this paper is useful in applications demanding high computational performance.

REFERENCES
[1] Muller, Jean-Michel. "Avoiding double roundings in scaled Newton-Raphson division." In Asilomar Conference on Signals, Systems, and Computers, 2013.
[2] Kong, Inwook, and Earl E. Swartzlander Jr. "A rounding method to reduce the required multiplier precision for Goldschmidt division." IEEE Transactions on Computers 59, no. 12, pp. 1703-1708, 2010.
[3] Rodriguez-Garcia, A., L. Pizano-Escalante, Ramon Parra-Michel, O. Longoria-Gandara, and J. Cortez. "Fast fixed-point divider based on Newton-Raphson method and piecewise polynomial approximation." In Reconfigurable Computing and FPGAs (ReConFig), 2013 International Conference on, pp. 1-6. IEEE, 2013.
[4] Malik, Peter. "High throughput floating-point dividers implemented in FPGA." In Design and Diagnostics of Electronic Circuits & Systems (DDECS), 2015 IEEE 18th International Symposium on, pp. 291-294. IEEE, 2015.
[5] Rathor, Ajay, and Lalit Bandil. "Design of 32 bit floating point addition and subtraction units based on IEEE 754 standard." International Journal of Engineering Research and Technology, vol. 2, no. 6, June 2013. ESRSA Publications, 2013.
[6] Tirthaji, Jagadguru Swami Sri Bharati Krisna Maharaja. Vedic Mathematics or Sixteen Simple Mathematical Formulae from the Vedas. Motilal Banarsidass, Delhi, 1965.
[7] Kanhe, Aniruddha, Shishir Kumar Das, and Ankit Kumar Singh. "Design and implementation of low power multiplier using Vedic multiplication technique." (IJCSC) International Journal of Computer Science and Communication 3, no. 1, pp. 131-132, 2012.
[8] Al-Ashrafy, Mohamed, Ashraf Salem, and Wagdy Anis. "An efficient implementation of floating point multiplier." In Electronics, Communications and Photonics Conference (SIECPC), 2011 Saudi International, pp. 1-5. IEEE, 2011.
[9] Amaricai, Alexandru, Mircea Vladutiu, and Oana Boncalo. "Design issues and implementations for floating-point divide-add fused." IEEE Transactions on Circuits and Systems II, vol. 57, no. 4, 2010.
[10] Pande, Kuldeep, Abhinav Parkhi, Shashant Jaykar, and Atish Peshattiwar. "Design and implementation of floating point divide-add fused architecture." In Communication Systems and Network Technologies (CSNT), 2015 Fifth International Conference on, pp. 797-800. IEEE, 2015.
[11] Mahakalkar, Sushma S., and Sanjay L. Haridas. "Design of high performance IEEE 754 floating point multiplier using Vedic mathematics." In Computational Intelligence and Communication Networks (CICN), 2014 International Conference on, pp. 985-988. IEEE, 2014.
[12] IEEE 754-2008, IEEE Standard for Floating-Point Arithmetic, 2008.
[13] Kong, Inwook, and Earl E. Swartzlander Jr. "A Goldschmidt division method with faster than quadratic convergence." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 19, no. 4, pp. 696-700, 2011.

