FPGA Implementation Of Modified Radix 2 SRT Division Algorithm

Attif A. Ibrahem¹, Hamed Elsimary¹, Aly E. Salama²

¹ Electronics Research Institute, Cairo, Egypt
² Cairo University, Cairo, Egypt

Abstract - The flexibility of field programmable gate arrays (FPGAs) can provide arithmetic-intensive applications with the benefits of custom hardware but without the high cost of custom silicon implementations. In this paper, we present the adaptation of the modified radix 2 division algorithm [1] for lookup table based FPGA implementation. For this modified scheme, the result digits and the residuals are computed concurrently and the computations in adjacent rows are overlapped. The implementation has been done with Xilinx technology and FPGA-Advantage CAD tools.

Keywords: Field programmable gate arrays (FPGAs), division, SRT division, two-digit quotient selection.

0-7803-8294-3/04/$20.00 ©2004 IEEE

1. Introduction

The flexibility of field programmable gate arrays (FPGAs) allows the rapid development of high performance custom hardware. By selecting arithmetic algorithms suited to the FPGA technology and subsequently applying optimal mapping strategies, high performance FPGA implementations can be developed [2]. In designing fast, hardware-oriented arithmetic algorithms, VLSI architectural design issues such as regularity, modularity and locality of interconnections must be addressed. This facilitates the mapping of algorithms onto an architecture which is amenable to a VLSI implementation and assists in the test and complexity management of designs [1]. The purpose of this paper is to briefly describe the modified division scheme and its adaptation for lookup table based FPGA implementation.

The paper is structured as follows. Section 2 gives a brief description of the standard SRT division method and identifies the factors which inhibit performance. In Section 3, we give a brief description of the modified radix 2 SRT division algorithm [1] and its array architecture. Section 4 presents the critical factors in efficiently matching division to a given set of FPGA characteristics [2]. Section 5 presents simulation results. Section 6 presents implementation results. We provide our conclusions in Section 7.

2. Standard SRT Division Algorithm

SRT division [3] is a digit-recurrence algorithm which utilises arithmetic redundancy [4] to reduce the required precision of comparisons between the divisor and the residual. The radix r recurrence for computing successive residuals is:

$$R_j = r R_{j-1} - q_j d, \quad j = 1, 2, 3, \ldots \qquad (1)$$

where $R_j$ is the residual, $d \in [1/2, 1)$ is the normalized divisor and $q_j \in \{-a, \ldots, a\}$ is the jth quotient digit. The residual at the jth step must satisfy $|R_j| < a d / (r - 1)$. The quotient is accumulated by appending successive quotient digits to the partial quotient $Q_j$, i.e.,

$$Q_j = Q_{j-1} + q_j r^{-j}.$$

There are two main factors which limit the performance of the SRT method. Firstly, a serial dependency exists among the iterations. This is fundamental to the successive approximation method of computing the quotient. Secondly, the computations of the residual and the quotient digit are performed sequentially.

3. Modified Radix 2 SRT Division

The key to this new radix 2 method of division [1] is the reformulation of the recurrence in (1) as

$$R_j^* = 2R_{j-1} + \phi, \qquad \phi = \begin{cases} d & \text{if } q_j \in \{-1, 0\} \\ -d & \text{if } q_j \in \{0, 1\} \end{cases}$$

$$R_j = \begin{cases} R_j^* & \text{if } q_j = -1 \text{ or } 1 \\ 2R_{j-1} & \text{if } q_j = 0 \end{cases} \qquad (2)$$

where $d \in [d_{min}, d_{max})$ and $q_j$ is chosen from the signed binary number representation (SBNR) digit set. Clearly, the computation of the tentative residual, $R_j^*$, can proceed before the full quotient digit has been computed. All that is required is that the quotient digit be located in either the {0, 1} or the {-1, 0} subset. The quotient digit can then be computed concurrently and in a separate path to that of the tentative residual. To facilitate this, it must be possible to compute $\phi$ as quickly as possible. Therefore, $\phi$ should be dependent upon the MSD only of the residual, and redundancy overflow in the residual should be avoided. This can be achieved by introducing a more stringent bound on the residual, namely $|R| < d_{min}$.

Fig. 1 shows the selection regions which are used to design the quotient digit selection function for the division algorithm [1].

Fig. 1 Selection regions for division

Table 1 Quotient digit and signal generation for modified SRT division

R      q    a/s   restore   compress   R - q·d2
0 0    0    x     1         0          0 0
0 1    0    x     1         0          0 1
0 1̄    0    x     1         0          0 1̄
1 0    1    0     0         x          0 0
1 1    1    0     0         x          0 1
1 1̄    0    x     1         1          0 1
1̄ 0    1̄    1     0         x          0 0
1̄ 1    0    x     1         1          0 1̄
1̄ 1̄    1̄    1     0         x          0 1̄

x = don't care; 1̄ denotes the digit -1.
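The reformulated recurrence (2) can be exercised in a few lines of Python. The sketch below is illustrative only: the selection thresholds of ±1/2 on the shifted residual are the textbook radix 2 SRT choice and are an assumption here, not the selection regions of Fig. 1, and the names `modified_step` and `divide` are hypothetical. It forms the tentative residual from the sign of the residual alone, resolves the quotient digit separately, and checks every step against recurrence (1) with r = 2.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def modified_step(r_prev, d):
    """One step of recurrence (2); returns (q_j, R_j)."""
    shifted = 2 * r_prev
    # Only the sign of the residual decides whether q_j lies in {0, 1} or {-1, 0},
    # so phi and the tentative residual do not wait for the final digit.
    phi = -d if shifted >= 0 else d
    tentative = shifted + phi                      # R*_j
    # Digit selection (assumed thresholds), conceptually a separate path.
    if shifted >= HALF:
        q = 1
    elif shifted < -HALF:
        q = -1
    else:
        q = 0
    r_next = shifted if q == 0 else tentative      # q_j = 0 keeps 2*R_{j-1}
    assert r_next == shifted - q * d               # agrees with recurrence (1), r = 2
    return q, r_next

def divide(x, d, steps):
    """Accumulate Q_j = Q_{j-1} + q_j * 2**-j over the given number of steps."""
    residual, quotient = Fraction(x), Fraction(0)
    for j in range(1, steps + 1):
        q, residual = modified_step(residual, Fraction(d))
        quotient += Fraction(q, 2 ** j)
    return quotient, residual

quotient, residual = divide(Fraction(1, 4), Fraction(1, 2), 6)
print(quotient, residual)
```

For x = 0.25 and d = 0.5, the operands used in the simulation of Section 5, the sketch yields the quotient 0.5 with a zero residual. Exact rational arithmetic is used so that the equality with recurrence (1) can be asserted without rounding concerns; a hardware design works on signed binary digits instead.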

Table 1 details the necessary control signals required by the algorithm for each digit-pair comprising the residual estimate. The restore signal is required when a zero is selected as a quotient digit in order to select the previous residual, rather than the tentative residual, as the new residual. The compress signal is required when the quotient digit is a zero and the MSD of the residual is non-zero, i.e. when $R = 0.1\bar{1}$, where $R$ is the estimated residual of $2R_{j-1}$, or $R = 0.\bar{1}1$. The signal which determines whether the divisor multiple is $d$ or $-d$ is the add/subtract signal, a/s for brevity, and can be determined from the MSD of the residual as required. It is important to note that the first digit of the term $R - q d_2$, where $d_2$ is the two MSDs of the divisor, is always zero such that, when the new residual is scaled, no redundancy overflow occurs.

3.1 Division Array Architecture

An architecture to implement the modified SRT division algorithm is illustrated in Fig. 2. The circuit comprises a regular array of type 1 and type 2 cells, with quotient digits and control signals being determined by the S cells on the periphery of the array. The functional and gate level descriptions of the basic cells are given in Fig. 3. The divisor digits $d_3 d_4 d_5 \ldots$ and the dividend $x = 0.0x_2x_3\ldots$ enter the array in a bit-parallel manner as shown. Since the two MSDs of the divisor are known implicitly [1], they are not input to the array. Each signed binary digit, $R$, comprising the residual is composed of two digits, namely $R^+ \in \{0, 1\}$ and $R^- \in \{-1, 0\}$, which are encoded as shown in Table 2. This coding enables conventional full-adders to be employed in adding/subtracting a signed binary operand and a binary operand.

Table 2 Coding of R+ and R-
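The compress case described in this section rests on a simple identity of the signed binary representation: the digit pairs 1 1̄ and 0 1 have the same value, as do 1̄ 1 and 0 1̄. The following Python sketch, with hypothetical helper names `sd_value` and `compress`, checks that identity. It only illustrates why a non-zero leading digit can be cleared without changing the residual estimate; it is not a model of the array cells or of the coding in Table 2.

```python
from fractions import Fraction

def sd_value(digits):
    """Value of the signed binary fraction 0.d1 d2 ... with digits in {-1, 0, 1}."""
    return sum(Fraction(digit, 2 ** (i + 1)) for i, digit in enumerate(digits))

def compress(digits):
    """Clear a non-zero leading digit when the leading pair is (1, -1) or (-1, 1)."""
    head = tuple(digits[:2])
    if head == (1, -1):
        return [0, 1] + list(digits[2:])
    if head == (-1, 1):
        return [0, -1] + list(digits[2:])
    return list(digits)

for estimate in ([1, -1], [-1, 1], [1, -1, 0, 1]):
    packed = compress(estimate)
    # The rewritten digit string must represent exactly the same number.
    assert sd_value(packed) == sd_value(estimate)
    print(estimate, "->", packed, "value", sd_value(estimate))
```

With the leading digit cleared, doubling the residual in the next step shifts a zero out of the most significant position, which is the property the text relies on to avoid redundancy overflow.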

Fig. 2 Radix 2 division array

4. Division and FPGAs

The proper selection of radix and algorithm [2] is critical in efficiently matching division to a given set of FPGA characteristics. A higher radix will have greater combinational logic depth but fewer required iterations. The digit-recurrence division algorithms differ primarily in the number of bits required to compute the following basic steps:

(1) Select a quotient digit, $q_j$,
(2) Form a multiple of the divisor, $q_j d$, and
(3) Compute the next residual, $R_j$.
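The trade-off discussed in this section can be made concrete with a small counting sketch. The Python below is an illustration under stated assumptions: the radix is a power of two, every lookup table has k inputs and one output, and `min_luts` is only a lower bound obtained by counting input pins, not a technology mapping result. The helper names are hypothetical; the input-bit counts are the ones quoted in this section for the quotient digit selection and the divisor multiple.

```python
import math

def iterations(quotient_bits, radix):
    """Iterations needed when each step retires log2(radix) quotient bits."""
    bits_per_step = radix.bit_length() - 1      # assumes radix is a power of two
    return math.ceil(quotient_bits / bits_per_step)

def min_luts(n_inputs, k):
    """Lower bound on the k-input lookup tables needed for one n_inputs-input function."""
    if n_inputs <= k:
        return 1
    # m tables offer m*k input pins, which must take n_inputs signals
    # plus the m - 1 internal outputs, so m >= (n_inputs - 1) / (k - 1).
    return math.ceil((n_inputs - 1) / (k - 1))

K = 5  # lookup table size assumed for the example
# Input-bit counts quoted in this section; the second row uses the stated minimums.
step_bits = {
    "radix 2": {"selection": 4, "multiple": 3},
    "higher radix (at least)": {"selection": 6, "multiple": 5},
}
for name, bits in step_bits.items():
    print(name, {step: min_luts(n, K) for step, n in bits.items()})
print(iterations(16, 2), iterations(16, 4))
```

Under these assumptions a step with at most k input bits fits in a single lookup table, while a step with more than k input bits needs at least two, and a higher radix halves or better the iteration count at the cost of wider step functions.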

The choice of the best radix for a lookup table based FPGA [5] is determined by comparing the maximum function size of the division algorithm (based on the number of input bits) with the function size of a single logic block. A composition of switching functions maps most efficiently into k-input lookup tables when each function has no more than k inputs. With more than k inputs a function requires at least two lookup tables, and a composition of n such functions may require more than 2n logic blocks. By this rationale, the best division algorithm for k-input logic blocks is the one with the largest radix having steps of a maximum of k input bits. For example, SRT (radix 2) division is the best choice for the XC4010 under this criterion. SRT requires only four bits for the quotient digit selection function whereas higher radices need at least six bits. SRT also uses three bits for the generation of each bit of the divisor multiple while higher radices use at least five. The computation of $rR_{j-1}$, the shifted estimated residual of $R_{j-1}$, requires more input bits for all radices than the general 5-input lookup table of an XC4010 CLB provides, but this computation has the fewest input bits with radix 2.

Fig. 3 Description of the basic cells

5. Simulation Results

The proposed radix 2 division algorithm was described in VHDL, compiled and simulated using Modelsim. A simulation result of the proposed divider is shown in Fig. 4 for the input operands $x = (0.0100000)_2 = 0.25$ (in binary signed digit, BSD, form) and $d = (0.100000)_2 = 0.5$ (in regular binary form), which results in $q = (0.10000)_2 = 0.5$ (in BSD form).

Fig. 4 Functional simulation results

6. Implementation Results

The VHDL code for the proposed divider was processed by the Leonardo synthesis tool for the Xilinx XC4010 FPGA (4010ePQ160). The result of the synthesis process is shown in Table 3.

Table 3 Synthesis results for the Xilinx XC4010 FPGA

Speed grade   Max. estimated clock   FG   HFG   CLB   IOs
-4            4.75 MHz               90   24    51    36
-3            7.0 MHz                90   24    51    36

7. Conclusions

Generating an efficient lookup table based FPGA implementation for arithmetic requires (1) the selection of an algorithm suited to the target technology, (2) the creation of a suitable variation of that algorithm for the target characteristics, and (3) an efficient mapping approach. A well-matched algorithm is recognized by the simple decomposition of its intermediate steps into expressions of k variables or less, where k is the number of inputs in the lookup tables. This paper has also briefly described the modified radix 2 SRT division algorithm. The premise of the approach is that concurrently computing the residual and the result digit at each step leads to an increase in the performance of the circuits compared with the standard SRT method. The penalty of the modified method appears in the reduced range of the operands.

8. References

[1] S.E. McQuillan, J.V. McCanny, and R. Hamill, "New Algorithms and VLSI Architectures for SRT Division and Square Root," Proc. 11th Symp. Computer Arithmetic, pp. 80-86, Windsor, Ontario, Canada, June 29-July 2, 1993.
[2] M.E. Louie and M.D. Ercegovac, "On Digit-Recurrence Division Implementations for Field Programmable Gate Arrays," Proc. 11th Symp. Computer Arithmetic, pp. 202-209, Canada, June 29-July 2, 1993.
[3] J.E. Robertson, "A New Class of Digital Division Methods," IRE Trans. Electronic Computers, Vol. 7, pp. 218-222, Sept. 1958.
[4] D.E. Atkins, "Introduction to the Role of Redundancy in Computer Arithmetic," IEEE Computer, 1975, pp. 74-77.
[5] Xilinx, "XC4000 Logic Cell Array Family - Technical Data," San Jose, 1990.
