Radix-4 Modified Booth's Multiplier Using Verilog RTL


Article · May 2018

© 2018 JETIR May 2018, Volume 5, Issue 5 www.jetir.org (ISSN-2349-5162)

RADIX-4 MODIFIED BOOTH’S MULTIPLIER USING VERILOG RTL

Aamir Bin Hamid (1), Nadeem Tariq Beigh (2), Ritu Singh (3)
(1) Student, Sharda University; (2) Student, Sharda University; (3) Assistant Professor, Sharda University
(1, 2, 3) Department of Electronics & Communication, Sharda University, Greater Noida, UP, India

Abstract: This paper describes the modified Booth’s algorithm for multiplying two signed binary numbers. The radix-2 Booth’s algorithm is explained first, and the addition of partial products is identified as the main speed bottleneck of the multiplier. The radix-4 Booth’s algorithm is presented as an alternative that reduces the number of partial products by a factor of 2. The Booth’s multiplier is then coded in Verilog HDL, and area and timing analysis is performed on it. The radix-4 Booth’s multiplier is then modified to use a configuration register for range detection, which reduces the number of partial-product additions. The results tables report device utilization and timing for the two multipliers, i.e. the radix-4 Booth’s multiplier and the modified radix-4 Booth’s multiplier with configuration register.

Index Terms - Booth’s multiplier, Radix-4, Xilinx, Multiplier, Verilog, Configuration Register, Optimization.

I. INTRODUCTION
1.1 Radix-2 Booth’s Algorithm
The radix-2 Booth algorithm is explained first and is then used to explain radix-4. In a radix-2 Booth’s multiplier, a dummy zero is appended below the least significant bit of the multiplier, and the most significant bit is sign-extended. Initially the partial product is set to 0. The least significant 3 bits of the multiplier are then considered [3]:
If 000, do nothing (no-op)
If 001, add the multiplicand
If 010, do an addition, and replace the 1 with a 0
If 011, do nothing (no-op)
If 100, do nothing (no-op)
If 101, do a subtraction, and replace the 0 with a 1
If 110, do a subtraction
If 111, do nothing (no-op)
Multiply the multiplicand by 2 (or, shift the multiplicand left by 1) and shift the multiplier right by 1.
Look at the next 3 bits of the multiplier, and repeat the steps. When we run out of bits, return the partial product, which is now the full product. An example of this algorithm in practice is here:
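As a rough illustration of the steps above, the following behavioral (simulation-only) Verilog sketch walks a 16-bit multiplier one bit at a time using the 3-bit window rules. The module name booth_radix2_sketch, the function name booth2, the 16-bit operand width and the two test vectors are assumptions made for this illustration; they do not come from the paper.

```verilog
// Behavioral sketch of the radix-2 procedure described above (simulation only).
// Names, widths and test values are illustrative assumptions.
module booth_radix2_sketch;

  function signed [31:0] booth2;
    input signed [15:0] m;      // multiplicand
    input signed [15:0] q;      // multiplier
    reg        [17:0] w;        // {sign extension, multiplier, dummy zero}
    reg signed [31:0] acc;      // running partial product
    reg signed [31:0] mm;       // multiplicand, doubled after every step
    integer i;
    begin
      w   = {q[15], q, 1'b0};
      acc = 0;
      mm  = m;                  // sign-extended to 32 bits
      for (i = 0; i < 16; i = i + 1) begin
        case (w[2:0])
          3'b001: acc = acc + mm;
          3'b010: begin acc = acc + mm; w[1] = 1'b0; end  // replace the 1 with a 0
          3'b101: begin acc = acc - mm; w[1] = 1'b1; end  // replace the 0 with a 1
          3'b110: acc = acc - mm;
          default: ;            // 000, 011, 100, 111: no-op
        endcase
        mm = mm <<< 1;          // multiply the multiplicand by 2
        w  = w >> 1;            // move on to the next 3 bits
      end
      booth2 = acc;
    end
  endfunction

  reg signed [15:0] a, b;
  reg signed [31:0] expected;   // reference product from the simulator

  initial begin
    a = 16'sd9;   b = 16'sd14;
    expected = a * b;
    $display("%0d * %0d = %0d (expected %0d)", a, b, booth2(a, b), expected);
    a = -16'sd7;  b = 16'sd3;
    expected = a * b;
    $display("%0d * %0d = %0d (expected %0d)", a, b, booth2(a, b), expected);
  end
endmodule
```

The loop still runs once per multiplier bit, which is the point made in the text that follows: the number of additions drops, but the number of shift steps does not.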

JETIR1805369 Journal of Emerging Technologies and Innovative Research (JETIR) www.jetir.org 1072

This is a mechanism for doing radix-2 Booth’s multiplication that ensures that we do at most n/2 addition/subtraction operations for an n-bit multiplier. Unfortunately, even though the number of addition/subtraction operations is bounded by n/2, we do not actually see any performance benefit, because we still shift the multiplicand and multiplier by 1 at every step and therefore still need n steps. The fact that we are shifting by 1 (multiplying by 2) is why it is called a radix-2 multiplication.

1.2 Radix-4 Booth’s Multiplier

Theoretically, if we ensure at most n/2 addition/subtraction operations, we should be able to shift the multiplicand and multiplier by 2 (multiply by 4), because every addition/subtraction operation is always followed by a no-op. An algorithm that shifts by 2 is what is known as radix-4 multiplication. However, with the Booth’s algorithm described above, there are 2 classes of corner cases that preclude us from shifting by 2. The first is a multiplier string like 00011. For the least significant 3 bits, we see 011 and do nothing, as per the algorithm. Then, if we shift the multiplier by 2, the next 3 bits are 000, which also means that we do nothing. By shifting by 2 we have skipped over a 001, so we are missing an addition. The other type of corner case is a multiplier string like 11100, which has the same problem. The least significant 3 bits are 100, necessitating a no-op, and if we then shift by 2, the next 3 bits are 111, which is also a no-op. In this case we have skipped over a 110, so we are missing a subtraction. [2]
To fix this, we need to adjust the 100 and 011 cases so that they actually perform an operation instead of a no-op. What should the operation be? For 011, if we were shifting by 1, we would do a no-op followed by an addition. Likewise, for 100, we would do a no-op followed by a subtraction if we shifted by 1. So, if we shift by 2 instead of 1, we need a single operation that is functionally equivalent. Because the skipped addition or subtraction would have used the multiplicand after it had been doubled, the single operation for the 011 case is to double the multiplicand and add it. Likewise, for the 100 case, we must double the multiplicand and subtract it. So, our updated radix-4 algorithm will be:
Add a dummy zero at the least significant bit of the multiplier.
Note: we no longer need to sign-extend the MSB for even-bit multipliers. For odd-bit multipliers, we need to sign-extend by 1 bit.
Initially the partial product is set to 0. The least significant 3 bits of the multiplier are considered.
If 000, do nothing (no-op)
If 001, add the multiplicand
If 010, do an addition (Note: since we are shifting by 2, we do not need to replace the 1 with a 0 or the 0 with a 1.)
If 011, add 2*multiplicand
If 100, subtract 2*multiplicand
If 101, do a subtraction (as for 010, no bit replacement is needed)
If 110, do a subtraction
If 111, do nothing (no-op)
Multiply the multiplicand by 4 (or, shift the multiplicand left by 2).
Shift the multiplier right by 2.
Look at the next 3 bits of the multiplier, and repeat the steps.
When we run out of bits, return the partial product, which is now the full product.
An example of this is shown below:
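As a small worked illustration of these rules (the operand values are chosen here and are not taken from the original figure), take the multiplicand M = 9 (001001) and the 6-bit multiplier 14 (001110). With the dummy zero b_{-1} appended, the three overlapping windows and the resulting signed digits d_k are:

```latex
\begin{aligned}
b_5 b_4 b_3 b_2 b_1 b_0\; b_{-1} &= 0\,0\,1\,1\,1\,0\;\,0 \\
(b_1\, b_0\, b_{-1}) = 100 &\;\Rightarrow\; d_0 = -2 \quad (\text{subtract } 2M) \\
(b_3\, b_2\, b_1) = 111 &\;\Rightarrow\; d_1 = 0 \quad (\text{no-op}) \\
(b_5\, b_4\, b_3) = 001 &\;\Rightarrow\; d_2 = +1 \quad (\text{add } M) \\
P = M \sum_{k=0}^{2} d_k\, 4^{k} &= 9\,(-2 + 0 + 16) = 126 = 9 \times 14
\end{aligned}
```

Three windows are processed for a 6-bit multiplier, where a shift-by-1 scheme would need six steps.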


As you can see, by shifting by 2 instead of 1, we reduce the number of steps in our algorithm by a factor of 2.

II. MULTIPLIER DESIGN


2.1 Design of Radix-4 Booth’s Multiplier

// Booth-encoded partial product for one radix-4 segment of the multiplier.
module PARTIALPRODUCT (INPUT1, SEGMENT, OUTPUT1);
  input      [15:0] INPUT1;   // multiplicand (two's complement)
  input      [2:0]  SEGMENT;  // overlapping 3-bit window of the multiplier
  output reg [31:0] OUTPUT1;  // sign-extended partial product
  always @(*) begin
    case (SEGMENT)
      3'b000: OUTPUT1 = 32'b0;              // no-op
      3'b011: begin                         // +2 * multiplicand
        OUTPUT1 = $signed(INPUT1);
        OUTPUT1 = $signed(OUTPUT1) <<< 1;
      end
      3'b100: begin                         // -2 * multiplicand
        OUTPUT1 = $signed(INPUT1);
        OUTPUT1 = ~OUTPUT1 + 1'b1;
        OUTPUT1 = $signed(OUTPUT1) <<< 1;
      end
      3'b101, 3'b110: begin                 // -multiplicand
        OUTPUT1 = $signed(INPUT1);
        OUTPUT1 = ~OUTPUT1 + 1'b1;
      end
      3'b111: OUTPUT1 = 32'b0;              // no-op
      default: OUTPUT1 = $signed(INPUT1);   // 001, 010: +multiplicand
    endcase
  end
endmodule

// 16 x 16 signed multiplier built from eight radix-4 partial products.
module BOOTH_16_BIT (A, B, C);
  input      [15:0] A;        // multiplicand
  input      [15:0] B;        // multiplier
  output reg [31:0] C;        // product
  reg  [31:0] D;
  wire [31:0] TEMP [7:0];     // one partial product per multiplier segment
  PARTIALPRODUCT P0 (A, {B[1:0], 1'b0}, TEMP[0]);  // dummy zero below the LSB
  PARTIALPRODUCT P1 (A, B[3:1],   TEMP[1]);
  PARTIALPRODUCT P2 (A, B[5:3],   TEMP[2]);
  PARTIALPRODUCT P3 (A, B[7:5],   TEMP[3]);
  PARTIALPRODUCT P4 (A, B[9:7],   TEMP[4]);
  PARTIALPRODUCT P5 (A, B[11:9],  TEMP[5]);
  PARTIALPRODUCT P6 (A, B[13:11], TEMP[6]);
  PARTIALPRODUCT P7 (A, B[15:13], TEMP[7]);
  always @(*) begin
    // Each partial product is weighted by 4^k, i.e. shifted left by 2k.
    D = TEMP[0]        + (TEMP[1] << 2)  + (TEMP[2] << 4)  +
        (TEMP[3] << 6) + (TEMP[4] << 8)  + (TEMP[5] << 10) +
        (TEMP[6] << 12) + (TEMP[7] << 14);
    // The original listing also formed E = 3*D and C = (D + E/3)/2. That
    // equals D only while 3*D fits in 32 unsigned bits, so it would corrupt
    // negative products; the sum is used directly here.
    C = D;
  end
endmodule
The above RTL code implements the radix-4 Booth’s algorithm. The simulation of this Booth’s multiplier gave correct results, which also supports the correctness of the algorithm given in this report.
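One way to check a claim like this is a self-checking testbench that compares the multiplier output with the simulator's own signed product. The sketch below is not the authors' testbench; the module name tb_booth_16_bit, the number of random vectors and the 1 ns settling delay are assumptions. It must be compiled together with the listing above, assuming that listing is entered as standard Verilog (lowercase keywords) with the module name BOOTH_16_BIT and the port order (A, B, C) unchanged.

```verilog
`timescale 1ns/1ps
// Self-checking testbench sketch; compile together with BOOTH_16_BIT.
module tb_booth_16_bit;
  reg  signed [15:0] a, b;      // operands, treated as two's complement
  wire        [31:0] c;         // product from the design under test
  reg  signed [31:0] expected;  // reference product from the simulator
  integer i, errors;

  BOOTH_16_BIT dut (a, b, c);   // positional ports: A, B, C

  initial begin
    errors = 0;
    for (i = 0; i < 1000; i = i + 1) begin
      a = $random;              // low 16 bits of a random 32-bit value
      b = $random;
      #1;                       // let the combinational logic settle
      expected = a * b;         // operands are sign-extended to 32 bits
      if (c !== expected) begin
        errors = errors + 1;
        $display("MISMATCH: %0d * %0d gave %0d, expected %0d",
                 a, b, $signed(c), expected);
      end
    end
    $display("%0d mismatches in %0d random vectors", errors, i);
    $finish;
  end
endmodule
```

Random vectors cover negative operands as well as positive ones, which matters for a signed multiplier; directed corner cases such as the most negative value could be added in the same way.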

Fig. 1: Technology schematics of Radix-4 Booth’s Multiplier.

Fig. 2: Simulation output of Radix-4 Booth’s Multiplier.


2.2 Design of modified Booth’s Multiplier


module MODIFIED_BOOTH (A, B, C);
  input      [15:0] A;        // multiplier; its range is detected
  input      [15:0] B;        // multiplicand
  output reg [31:0] C;        // product
  reg [31:0] D;               // accumulator
  reg [3:0]  RANGEA;          // top bit index of the highest non-zero 4-bit group of A
  reg [31:0] TEMP [7:0];      // partial products
  integer I, LAST;

  always @(*) begin
    // Clear everything first (this was commented out in the original listing)
    // so that D starts from 0 and no latches are inferred.
    for (I = 0; I < 8; I = I + 1)
      TEMP[I] = 0;
    C = 0;
    D = 0;
    ////// CONFIGURATION //////
    if      (A[15:12] != 4'b0000) RANGEA = 15;
    else if (A[11:8]  != 4'b0000) RANGEA = 11;
    else if (A[7:4]   != 4'b0000) RANGEA = 7;
    else if (A[3:0]   != 4'b0000) RANGEA = 3;
    else                          RANGEA = 0;
    ////// RANGE OBTAINED //////
    // Index of the last Booth digit to evaluate. Below full width one extra
    // digit is taken so that a 1 in bit RANGEA is not read as a sign bit.
    // This bound is a correction: the loops in the original listing used
    // RANGEA/2 with mismatched limits and stopped short of this digit.
    LAST = (RANGEA == 15) ? 7 : (RANGEA / 2) + 1;
    PARTIAL_PRODUCT(B, {A[1:0], 1'b0}, TEMP[0]);
    for (I = 1; I <= LAST; I = I + 1)
      PARTIAL_PRODUCT(B, A[(2*I+1)-:3], TEMP[I]);
    for (I = 0; I <= LAST; I = I + 1)
      D = D + (TEMP[I] << (2*I));
    C = D;
  end

  task PARTIAL_PRODUCT;
    input  [15:0] INPUT1;     // multiplicand
    input  [2:0]  SEGMENT;    // radix-4 window
    output [31:0] OUTPUT1;    // sign-extended partial product
    reg signed [31:0] EXT;    // multiplicand extended to 32 bits, so that
                              // negating -32768 does not overflow
    begin
      EXT = $signed(INPUT1);
      case (SEGMENT)
        3'b000, 3'b111: OUTPUT1 = 32'b0;        // no-op
        3'b011:         OUTPUT1 = EXT <<< 1;    // +2 * multiplicand
        3'b100:         OUTPUT1 = -(EXT <<< 1); // -2 * multiplicand
        3'b101, 3'b110: OUTPUT1 = -EXT;         // -multiplicand
        default:        OUTPUT1 = EXT;          // 001, 010: +multiplicand
      endcase
    end
  endtask
endmodule

Here an attempt is made to design a high-speed and power-efficient modified Booth’s multiplier. A modified Booth’s multiplier can be at least twice as fast as the basic Booth’s algorithm, and it is an efficient way to reduce the number of partial products. The main concerns are speed, power efficiency and structural flexibility. From the basics of Booth’s multiplication, the number of passes or cycles needed to obtain the final product depends on the operand width. If the input data is of the form 001110 (multiplier) and 001001 (multiplicand), six passes are needed to obtain the result. Since the significant data is contained in the 4 least significant bits, the result can be obtained after only four passes by suppressing the most significant bits, which are zero [1]. This not only reduces the delay but also reduces the switching to a great extent. So, an attempt has been made to modify the Booth’s multiplier to reduce delay and power. In this multiplication technique, the range of both operands A and B is first detected by the configuration register, which is configured through the input ports. The configuration register detects whether the computation will be done on 4, 8, 12 or 16 bits, and the further computation is done accordingly. A register RangeA is taken that stores the concatenation of the accumulator (4, 8, 12 or 16 bits, initially all 0s), the multiplier A (4, 8, 12 or 16 bits) and an extra least significant bit (initially 0). Figure 5 describes the working of the modified Booth multiplier [4].
In this figure the multiplier and multiplicand are stored in 16-bit registers. The configuration register detects the range of the input data by performing a bitwise OR operation. The Booth controller provides the necessary control signals, and addition, subtraction and shifting are done according to Booth’s algorithm. The final result is stored in the accumulator. Detection starts from the most significant bits and examines each four-bit group. In the range detection technique both 16-bit input operands A[15:0] and B[15:0] are divided into four parts or sub-expressions: A[15:12], A[11:8], A[7:4], A[3:0] and B[15:12], B[11:8], B[7:4], B[3:0]. For example, if the input data is the 16-bit value A = 1010111111001010, then the required sub-expressions are [6]:
A[15:12] = 1010, A[11:8] = 1111, A[7:4] = 1100, A[3:0] = 1010.
The sizes of both operands are checked separately and simultaneously to determine whether they fall in the range of 4, 8, 12 or 16 bits.
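The group-by-group check described above can be sketched as a small combinational block that OR-reduces each 4-bit group, starting from the most significant one. This is an illustration under assumptions, not the authors' configuration register: the module names, the port names x and nbits, and the choice to report 4 for an all-zero operand are all hypothetical.

```verilog
// Range detection sketch: OR-reduce each 4-bit group, starting at the MSBs.
module range_detect_sketch (
  input  wire [15:0] x,       // operand to classify
  output reg  [4:0]  nbits    // 4, 8, 12 or 16
);
  always @(*) begin
    if      (|x[15:12]) nbits = 5'd16;
    else if (|x[11:8])  nbits = 5'd12;
    else if (|x[7:4])   nbits = 5'd8;
    else                nbits = 5'd4;   // includes x == 0 (assumption)
  end
endmodule

// Small testbench using the operand from the text and one 4-bit value.
module tb_range_detect_sketch;
  reg  [15:0] x;
  wire [4:0]  nbits;

  range_detect_sketch dut (.x(x), .nbits(nbits));

  initial begin
    x = 16'b1010_1111_1100_1010; #1 $display("%b -> %0d bits", x, nbits);
    x = 16'b0000_0000_0000_1001; #1 $display("%b -> %0d bits", x, nbits);
    $finish;
  end
endmodule
```

Because the operands are in two's complement, a negative value always has a non-zero top group and is therefore classified as 16 bits; only small non-negative operands benefit from the shortened computation.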

Fig. 3: Technology schematics of Modified Radix-4 Booth’s Multiplier.


Fig. 4: Technology schematics of Modified Radix-4 Booth’s Multiplier.

III. RESULTS & DISCUSSIONS


A timing and device utilization summary is required for the two types of multipliers, i.e. the radix-4 Booth’s multiplier and the modified radix-4 Booth’s multiplier, so that the need for a Booth’s multiplier can be appreciated. For this purpose, both multipliers were designed in Verilog HDL and synthesized in Xilinx ISE, targeting the XC7A100T device from the Artix-7 FPGA family. Each multiplier was optimized for timing and device utilization separately. Timing is the main criterion and constraint; the results are given in the tables below.

Table 1: Advanced HDL Synthesis Report of Radix-4 Booth’s Multiplier.

Component                          Utilization
32x3-bit multiplier                1
32-bit adder                       41
32-bit / 8-inputs adder tree       1
32-bit comparator                  31
33-bit comparator                  1
34-bit comparator                  1
1-bit 2-to-1 multiplexer           928
32-bit 2-to-1 multiplexer          3
32-bit 8-to-1 multiplexer          8
Maximum combinational path delay   11.287 ns (5.19 ns logic, 6.09 ns route; 46.0% logic, 54.0% route)

Table 2: Advanced HDL Synthesis Report of Modified Radix-4 Booth’s Multiplier.

Component                          Utilization
16-bit adder                       1
32-bit adder                       6
3-bit comparator                   7
3-bit comparator                   7
1-bit 2-to-1 multiplexer           531
1-bit 8-to-1 multiplexer           90
32-bit 8-to-1 multiplexer          1
32-bit 2-to-1 multiplexer          7
4-bit 2-to-1 multiplexer           3
Maximum combinational path delay   10.800 ns (5.0328 ns logic, 5.7672 ns route; 46.6% logic, 53.4% route)

Table3: Device Utilization summary of Radix - 4 Booth’s Multiplier.


Table3: Device Utilization summary of Modified Radix - 4 Booth’s Multiplier.

From the above tables, it is seen that when the designs were optimized for timing, the most important constraint in this exercise, the radix-4 Booth’s multiplier has a path delay of 11.287 ns and the modified radix-4 Booth’s multiplier has a path delay of 10.800 ns. The modified radix-4 Booth’s multiplier is therefore the better of the two as far as timing is concerned.
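The size of the timing gain follows directly from the two reported path delays. The short calculation below uses only those two figures; the percentage depends on which delay is taken as the reference, so both are shown.

```latex
\begin{aligned}
\Delta t &= 11.287\ \text{ns} - 10.800\ \text{ns} = 0.487\ \text{ns} \\
\frac{\Delta t}{11.287\ \text{ns}} &\approx 4.3\% \quad \text{(relative to the radix-4 multiplier)} \\
\frac{\Delta t}{10.800\ \text{ns}} &\approx 4.5\% \quad \text{(relative to the modified multiplier)}
\end{aligned}
```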

IV. CONCLUSION

1). Use the modified radix-4 Booth’s multiplier if area is not critical.

2). Use the ordinary radix-4 Booth’s multiplier if area is critical and a small compromise on timing can be made. In the above example, 646 units of area are used to improve the timing by ~0.49 ns (11.287 ns versus 10.800 ns). In percentages, area is increased by 39% to get a 4.5% reduction in timing. So the best choice is the modified radix-4 Booth’s multiplier. This design can also be parameterized, giving a high degree of re-use.

REFERENCES

[1] Deepali Chandel, Gagan Kumawat, Pranay Lahoty, Vidhi Vart Chandrodaya, Shailendra Sharma, “Booth Multiplier: Ease of multiplication,” International Journal of Emerging Technology and Advanced Engineering, Volume 3, Issue 3, March 201
[2] Tam Anh Chu, “Booth Multiplier with Low Power High Performance Input Circuitry,” US Patent 6,393,454 B1, May 21, 2002.
[3] Sumit Vaidya and Deepak Dandekar, “Delay-power performance Comparison of multipliers in VLSI Circuit design,” International Journal of Computer Networks & Communications (IJCNC), Vol. 2, No. 4, July 2010.
[4] Neha Goyal, Khushboo Gupta and Renu Singla, “Study of combinational and booths multiplier,” International Journal of Scientific and Research Publications, Volume 4, Issue 5, May 2014.
[5] K. Tsoumanis et al., “An optimized modified booth recoder for efficient design of the add-multiply operator,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 4, pp. 1133–1143, Apr. 2014.
[6] Cilardo et al., “High speed speculative multipliers based on speculative carry-save tree,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 12, pp. 3426–3435, Dec. 2014.
[7] Vazquez and E. Antelo, “Area and Delay Evaluation Model for CMOS Circuits,” Internal Report, Univ. Santiago de Compostela, Jun. 2012.
[8] Galal et al., “FPU generator for design space exploration,” in Proc. 21st IEEE Symp. Comput. Arithmetic (ARITH), pp. 25–34, Apr. 2013.
[9] M. Zamin Ali et al., “Optimization of Power Consumption in VLSI Circuit,” IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 2, pp. 648–654, 2011.
[10] S. S. Kerur, Prakash Narchi, Jayashree C N, Harish M Kittur and Girish V A, “Implementation of Vedic Multiplier for Digital Signal Processing,” International Conference on VLSI Communication & Instrumentation (ICVCI), 2011.
[11] Virendra Babanrao Magar, “Intelligent and Superior Vedic Multiplier for FPGA Based Arithmetic Circuits,” International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume 3, Issue 3, July 2013.
