Array Multiplier Using
Head of the Department, Department of ECE, ASCET, Gudur, Andhra Pradesh, India
ABSTRACT
The increasing demand for high fidelity portable devices has placed emphasis on the development of low power
and high performance systems. In next generation processors, low power design has to be incorporated into
fundamental computation units, such as multipliers. The characterization and optimization of such low power multipliers
will aid in the comparison and choice of multiplier modules in system design. In this paper we perform a comparative
analysis of the power, delay, and power delay product (PDP) characteristics of five parallel digital multipliers
implemented using a low power 10 transistor (10T) adder. The multipliers were realized using the 10T adder circuit because
the 10T adder consumed considerably less power than the Static Energy Recovery Full adder (SERF) and the static CMOS adder
for all the configurations studied. The designs of five different array multipliers are presented. The multipliers presented in this
paper were all modeled using Cadence tools (Virtuoso simulator). The comparison is done on the basis of three
performance parameters, i.e. area, speed and power consumption. The difference between the power consumption of the
10 transistor based multipliers is significant at 45 nm. For smaller geometry sizes down to 45 nm, the propagation delay of
the multipliers implemented with 10 transistors translates to a better performance measure. This can be attributed to the fast
computational capability of the CSA multiplier and to the 10T adder logic saving more power at deep sub-micron sizes. The
proposed SERF-10T hybrid adder model multipliers consumed the least power of all the hybrid and regular models with
no deterioration in performance. Taken together, these results suggest that SERF-10T hybrid model based multipliers are
suited for ultra low power design and fast computation at smaller geometry sizes.
multipliers. While several of these techniques reduce power dissipation by eliminating spurious transitions, others have
focused on developing novel multiplier architectures.
This paper seeks the best solution for achieving low power consumption, small area and high
speed in multiplier operation. The research uses an approach that significantly reduces the power consumption and the chip
area of parallel multipliers without sacrificing performance. The approach is based on using low power, minimal
transistor count adders, which are the determining factor in the performance of the multiplier. The operation of a parallel multiplier
can be divided into two parts: (a) formation of the partial products, and (b) summation of these products to form the final
product of the multiplication.
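The two-part view of a parallel multiplier can be illustrated with a short Python sketch. The function names `partial_products` and `multiply` and the 4-bit default width are assumptions made for illustration; the sketch models unsigned operands at the arithmetic level only and says nothing about the transistor-level circuits studied here.

```python
def partial_products(x, y, n=4):
    """Part (a): one partial product per multiplier bit, shifted to its bit weight."""
    rows = []
    for j in range(n):
        y_j = (y >> j) & 1              # j-th bit of the multiplier
        rows.append((x if y_j else 0) << j)
    return rows


def multiply(x, y, n=4):
    """Part (b): sum the partial products to form the final product."""
    return sum(partial_products(x, y, n))


if __name__ == "__main__":
    print(partial_products(13, 11))     # rows for 13 x 11
    print(multiply(13, 11))
```

In hardware, part (a) is a plane of AND gates and part (b) is the adder array, which is why the choice of full adder cell dominates the power and delay of the multiplier.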
In this study, we investigated the power and delay performance characteristics of three different adder circuits:
the SERF, 10T and CMOS 28T adders. From the comparative study, we found that the 10T adder has the better performance. Using
the 10T adder, five different multipliers (Bit Array, Carry-Save, Braun Array, Vedic and Baugh Wooley) were realized.
The tradeoffs between speed and power of these multipliers were compared. In Section 2, we describe the CMOS 28T
adder, SERF, and 10T adder circuits used in our design. Section 3 describes the multiplier architectures. Section 4
describes the simulation methodology used. In Section 5 the results of the simulation study are discussed, and Section 6
presents a summary of the paper and the concluding remarks.
ADDER MODULES
Adders are the fundamental building blocks in all the multiplier modules. Hence employing fast and efficient full
adders plays a key role in the performance of the entire system. In the following section we briefly describe the adder
modules used in our design.
Conventional CMOS 28 Transistor (28T) Full Adder
The 28 transistor full adder is the traditional CMOS adder circuit. The schematic of this adder is shown
in Figure 1. This adder cell is built using an equal number of NMOS and PMOS transistors. The complementary MOS
logic was realized using Eqs. (1) and (2):
$C_{out} = AB + BC_{in} + AC_{in}$    (1)
$Sum = ABC_{in} + (A + B + C_{in})\overline{C_{out}}$    (2)
The first 12 transistors of the circuit produce Cout and the remaining transistors produce the Sum output.
Therefore the delay for computing Cout is added to the total propagation delay of the Sum output. The structure of this adder
circuit is large and therefore consumes a large on-chip area.
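As a quick check of the adder logic, the following Python sketch evaluates the carry expression of Eq. (1) together with the Sum expression usually given for this static CMOS cell, in which Sum reuses the complement of Cout. That Sum form and the function name `full_adder_28t` are assumptions for illustration; the sketch checks Boolean behaviour only, not timing or power.

```python
from itertools import product


def full_adder_28t(a, b, cin):
    """Logic-level model of the static CMOS full adder (inputs are 0 or 1)."""
    cout = (a & b) | (b & cin) | (a & cin)             # Eq. (1): majority function
    not_cout = cout ^ 1                                 # complement of Cout
    s = (a & b & cin) | ((a | b | cin) & not_cout)      # assumed Sum form, reuses Cout
    return s, cout


if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder_28t(a, b, cin)
        print(a, b, cin, "->", "Sum =", s, "Cout =", cout)
```

Because Sum is computed from the complement of Cout, the carry stage sits on the Sum path, which is consistent with the delay remark above.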
Figure 1: Conventional CMOS Adder with 28 Transistors. Reprinted with Permission from [2],
J. M. Rabaey et al., Digital Integrated Circuits, Prentice Hall Publications (2003)
Analysis and Modeling of Low Power Array Multipliers Using Cadence Virtuoso Tool in 45 NM Technology
SERF Adder
The Static Energy Recovery Full Adder (SERF adder) circuit was developed using energy recovery logic
and a reduced number of transistors.
Figure 2: Static Energy-Recovery Full (SERF) Adder. Reprinted with Permission from [8],
R. Shalem et al., A Novel Low Power Energy Recovery Full Adder Cell, Proceedings of the
Great Lakes Symposium on VLSI (1999), pp. 380-383
The schematic of the 10 transistor SERF adder is shown in Figure 2. The basic idea in the SERF adder is the reuse
of the charge stored in the load capacitance during a high output to drive the control logic. In regular non-energy recovery
adder designs, the charge applied at logic high is drained off during logic low. The reuse is achieved by using
only one voltage source (VDD) in the circuit. As an added advantage, there is no direct path from one voltage level (VDD)
to the other (GND). The elimination of the direct path to ground removes the short circuit power component of the
adder module. This reduces the total energy consumed in the circuit, making it an energy efficient design. The SERF
adder is not only energy efficient but also area efficient due to its low transistor count. The main drawback of the SERF
adder is the threshold voltage drop at the output for certain input combinations. A detailed comparative study of the
SERF adder with other low power adders can be found in Ref. [4].
Figure 4: 10 Transistor (10T) Adder. Reprinted with Permission from [5], L. Junming et al., A Novel 10-Transistor
Low-Power High-Speed Full Adder Cell, Proceedings of 6th International Conference on Solid-State and
Integrated-Circuit Technology (2001), pp. 1155-1158
The capacitance at the outputs of the XOR and XNOR gates is also reduced, as they are not loaded with an inverter. If the
signal degradation at Sum and Cout is significant for deep sub-micron circuits, drivers can be used to reduce the
degradation. A driver helps in generating outputs with equal rise and fall times. This results in better performance
in terms of speed, power dissipation and driving capability. The output voltage swing will be equal to VDD if a
driver is used at the output. Figure 4 gives the circuit level diagram of the 10T adder.
Adder power consumption at 0.8 V (adder row labels not preserved in the source)

Frequency (MHz)    100      200      250      400      500
Power (nW)         50.23    101.1    122.8    201.1    250.3
                   42.5     83.12    98.04    153.2    189.4
                   142.8    282.4    355.1    558.4    695.4

Adder power consumption at 1 V (adder row labels not preserved in the source)

Frequency (MHz)    100      200      250      400      500
Power (nW)         77.29    157.3    191.0    313.6    390.0
                   78.52    163.4    196.2    316.7    392.0
                   236.2    470.0    591.3    933.1    1163

Adder power consumption at 1.2 V (adder row labels not preserved in the source)

Frequency (MHz)    100      200      250      400      500
Power (nW)         109.3    224.0    271.7    446.3    554.7
                   115.8    242.4    289.9    467.71   568.6
                   354.8    709.2    891.6    1411     1759
Maximum Delay (ps)

Supply     10T               SERF              28T
Voltage    Sum      Cout     Sum      Cout     Sum      Cout
0.8 V      47.35    86.41    35.05    150.5    176.9    107.1
1.0 V      32.91    38.22    32.26    56.24    81.39    48.07
1.2 V      11.58    24.86    29.44    5.738    54.12    31.68
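Adder power delay product at 1 V, in units of 10^-18 (atto) (adder row labels not preserved in the source)

Frequency (MHz)    100      200      250      400      500
PDP                4.346    8.846    10.741   17.636   21.933
                   2.999    6.241    7.494    12.097   14.974
                   12.783   25.43    32.00    50.49    62.94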
MULTIPLIER ARCHITECTURES
Multipliers are in fact complex adder arrays. Multiplication is an operation common to a large number of applications, and
the complexity of this function has led to a large amount of research directed at speeding up its execution. Multipliers can
be implemented using different algorithms. Depending on the algorithm used, the performance characteristics of the
multipliers vary. Binary adders are an essential component in the implementation of digital multipliers. With the emergence
of power as a design consideration, speed is not the only criterion by which various implementations are judged. Designing
multipliers with low power, energy efficient adders reduces the power consumption and improves the efficiency of the multipliers. In this
paper we have concentrated on the design and characterization of five popular multipliers: the Carry-Save Multiplier, the
Bit-Array Multiplier, the Braun Array Multiplier, the Baugh Wooley Multiplier and the Vedic Multiplier. For the study and performance evaluation of
these five parallel digital multipliers, we implemented 4×4 versions of them using the 10T adder.
In this algorithm the secret image is divided into four parts, and these four parts of the secret image are then
embedded into the cover image. The four processed images are then transmitted to the intended receiver. The receiver receives
four dummy images, which contain the four parts of the secret image. These four parts of the image are then extracted using an
algorithm and an access that again works on bit operations. The extracted images are then combined to reconstruct the
secret image.
Carry-Save Array Multiplier[3]
The Carry Save Array Multiplier has a very regular structure, which makes it amenable to automation. The algorithm
is based on the fact that the multiplication result does not change when the output carry bits are passed diagonally
downwards instead of only to the right. An extra adder, known as the vector-merging adder, is added
so that the final result is obtained. This is called the carry-save multiplier because the carry bits are not
immediately added but are rather saved for the next addition stage. In the final stage the carries and the sums are merged in
a fast carry-propagate adder stage, usually a carry-look ahead adder. Due to the additional adder
there is a slight increase in the area cost. However, it uses only short wires to the nearest neighboring cells. It can also be
easily pipelined. Another advantage is that there is only one critical path rather than the several identical critical
paths found in the generic array multiplier. The general structure of a Carry-Save Multiplier is shown in Figure 7. The delay
of this multiplier can be expressed [10] as
$T = T_{and} + T_{final} + (X - 1)\,T_{carry}$    (3)
where Tand is the delay of the pre-product generating AND gates, Tfinal is the delay of the final stage carry-look
ahead adder, X is the number of partial product stages, and Tcarry is the propagation delay between input and output carry.
This equation is based on the assumption that the delay for sum generation is equal to that of the carry generation.
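The carry-save idea described above can be sketched at the bit level in Python. This is an illustrative model for unsigned operands, not the circuit simulated in this paper: the names `full_adder` and `carry_save_multiply` are hypothetical, each stage is modeled as a full row of independent full adders, and the vector-merging adder is modeled as a simple ripple adder rather than a carry-look ahead adder.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def carry_save_multiply(x, y, n=4):
    """Unsigned n x n multiply that saves carries between stages."""
    width = 2 * n
    sums = [0] * width       # saved sum bits, indexed by bit weight
    carries = [0] * width    # saved carry bits, indexed by bit weight
    for j in range(n):
        y_j = (y >> j) & 1
        # Partial product row j, already aligned to its bit weight.
        row = [0] * width
        for i in range(n):
            row[i + j] = ((x >> i) & 1) & y_j
        next_sums = [0] * width
        next_carries = [0] * width
        for k in range(width):
            s, c = full_adder(sums[k], carries[k], row[k])
            next_sums[k] = s
            if k + 1 < width:
                next_carries[k + 1] = c   # carry is saved, not propagated
        sums, carries = next_sums, next_carries
    # Vector-merging stage: the only place where carries ripple.
    result, carry = 0, 0
    for k in range(width):
        s, carry = full_adder(sums[k], carries[k], carry)
        result |= s << k
    return result


if __name__ == "__main__":
    print(carry_save_multiply(13, 11))
```

Within a stage no adder waits for its neighbour, so the carry chain is confined to the final merging adder, which is the Tfinal term in Eq. (3).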
Figure 7: 4×4 Carry-Save Multiplier. Reprinted with Permission from [2], J. M. Rabaey et al.,
Digital Integrated Circuits, Prentice Hall Publications (2003)
Bit Array Multiplier
Bit Array Multipliers are essentially regular structures and are simple to expand. The structure is similar to the
previously discussed Carry-Save multiplier but propagates the carry bits from the full adders in a different fashion. A
simple diagram of a 4×4 multiplier is shown in Figure 8. Each partial product is generated by the multiplication of the
multiplicand with one multiplier bit.
The partial products are shifted according to their bit orders and then added. In array multiplication we need to
add as many partial products as there are multiplier bits. In order to perform signed multiplication, the 2's complement number
system is used to represent the multiplicand and the multiplier. This implies that all the adders in a particular stage should
be of equal bit length.
Figure 8: 4×4 Bit-Array Multiplier. Reprinted with Permission from [2], J. M. Rabaey et al.,
Digital Integrated Circuits, Prentice Hall Publications (2003)
To achieve this, the sign bits of the partial products in the initial row and the sum and carry signals of each adder
stage are extended. The extension is carried out until the signal's width matches the width of the largest absolute value
signal in that stage. Also, the generation of X partial products requires X×Y two-bit AND gates. A large area of the multiplier
is devoted to performing the addition of N partial products, which requires (N - 1) M-bit adders [1, 10]. The shifting of the partial
products for proper alignment is performed by simple routing and does not require any logic. The array structure makes it
difficult to measure the propagation delay, because there is more than one critical timing path of identical length in
the circuit. An approximate equation for the propagation delay, shown in Eq. (4) [10], can be obtained by a detailed study of
these paths.
$T = T_{and} + T_{sum} + [(Y - 1) + (X - 2)]\,T_{carry}$    (4)
where Tand is the delay of the pre-product generating AND gates, Tsum is the delay between the input carry and the
sum bit of the full adder, Y is the width of the multiplicand, X is the width of the multiplier, and Tcarry is the propagation
delay between input and output carry.
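A small Python sketch shows how Eq. (4) scales with operand width. The gate delays used in the example (10, 30 and 20 ps) are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not values measured in this work, and the function name `bit_array_delay` is likewise an assumption.

```python
def bit_array_delay(t_and, t_sum, t_carry, x, y):
    """Approximate bit-array multiplier delay from Eq. (4).

    t_and, t_sum, t_carry: gate delays in any consistent time unit.
    x: width of the multiplier, y: width of the multiplicand.
    """
    return t_and + t_sum + ((y - 1) + (x - 2)) * t_carry


if __name__ == "__main__":
    # Hypothetical delays in ps for a 4x4 array: 10 + 30 + (3 + 2) * 20
    print(bit_array_delay(10, 30, 20, x=4, y=4))   # prints 140
    # Doubling both widths adds eight more carry delays.
    print(bit_array_delay(10, 30, 20, x=8, y=8))   # prints 300
```

The carry term grows linearly with both operand widths, so for wider operands the carry delay of the chosen full adder cell dominates the estimate.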
Baugh Wooley Multiplier
The Baugh Wooley Multiplier is used for 2's complement multiplication. It adjusts the partial products to maximize the
regularity of the multiplication array. It moves the partial products with negative signs to the last steps and adds the
negation of those partial products rather than subtracting them. This technique was developed in order to design regular multipliers
suited for 2's complement numbers.
(5)
A gate-level diagram of a 4-bit Baugh Wooley multiplier is shown in Figure 7. The equation of the Baugh-Wooley
algorithm for an N×N multiplication is given by Eq. (5).
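The idea of adding negated partial products instead of subtracting them can be sketched in Python. The sketch below uses one common formulation of the scheme (complementing the partial products that involve exactly one sign bit and adding two correction constants); it is an illustrative assumption and not necessarily the exact form of Eq. (5). The function name `baugh_wooley` is hypothetical.

```python
def baugh_wooley(a, b, n=4):
    """Multiply two n-bit two's complement integers using additions only."""
    a_bits = [(a >> i) & 1 for i in range(n)]   # Python shifts sign-extend
    b_bits = [(b >> j) & 1 for j in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]
            # A product of exactly one sign bit has negative weight:
            # add its complement instead of subtracting it.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            total += pp << (i + j)
    # Correction constants that compensate for the complemented terms.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1                 # keep 2n result bits
    if total >> (2 * n - 1):                    # reinterpret as signed
        total -= 1 << (2 * n)
    return total


if __name__ == "__main__":
    print(baugh_wooley(-3, 5))
    print(baugh_wooley(-8, -8))
```

Every term in the array is now added, so the same full adder cell can be used throughout, which is the regularity the text refers to.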
Figure 10: Braun Array Multiplier Structure. Anitha R, Bagyaveereswaran V, Braun's Multiplier
Implementation Using FPGA with Bypassing Techniques, International Journal of VLSI Design &
Communication Systems (VLSICS), Vol. 2, No. 3, September 2011, Ref. [6]
Vedic 4×4 Multiplier
The method is explained below for two 2-bit numbers A and B, where A = a1a0 and B = b1b0. First, the least
significant bits are multiplied, which gives the least significant bit of the final product (vertical). Then the LSB of the
multiplicand is multiplied with the next higher bit of the multiplier and added to the product of the LSB of the multiplier and the
next higher bit of the multiplicand (crosswise). The sum gives the second bit of the final product, and the carry is added to
the partial product obtained by multiplying the most significant bits to give a sum and a carry. The sum is the third
bit and the carry becomes the fourth bit of the final product:
s0 = a0b0
c1s1 = a1b0 + a0b1
c2s2 = c1 + a1b1
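The vertical and crosswise steps for two 2-bit operands can be written out as a short Python sketch. The function name `vedic_2x2` is an assumption for illustration; each two-term addition is modeled as a half adder (XOR for the sum bit, AND for the carry bit).

```python
def vedic_2x2(a, b):
    """2-bit x 2-bit multiply using the vertical and crosswise steps."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    s0 = a0 & b0                      # vertical: s0 = a0b0

    x, y = a1 & b0, a0 & b1           # crosswise products
    s1, c1 = x ^ y, x & y             # c1s1 = a1b0 + a0b1

    z = a1 & b1                       # vertical on the most significant bits
    s2, c2 = c1 ^ z, c1 & z           # c2s2 = c1 + a1b1

    return (c2 << 3) | (s2 << 2) | (s1 << 1) | s0


if __name__ == "__main__":
    print(vedic_2x2(3, 3))            # prints 9
    print(vedic_2x2(2, 3))            # prints 6
```

The whole 2x2 block therefore needs four AND gates and two half adders, with no full adder on its path.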
The 4x4 bit Vedic multiplier module is implemented using four 2x2 bit Vedic multiplier modules. Let us analyze a
4x4 multiplication, say A = A3 A2 A1 A0 and B = B3 B2 B1 B0. The output line for the multiplication result is S7
S6 S5 S4 S3 S2 S1 S0. Let us divide A and B into two parts, say A3 A2 and A1 A0 for A, and B3 B2 and B1 B0 for B. Using the
fundamentals of Vedic multiplication, two bits are taken at a time and a 2-bit multiplier block is used for each pair.
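The decomposition of a 4x4 multiplication into four 2x2 blocks can be sketched as follows. The function names are hypothetical, and the adder stages that combine the four block outputs are modeled with ordinary integer addition; in the circuit they would be built from the full adder cells discussed earlier.

```python
def vedic_2x2(a, b):
    """2-bit x 2-bit multiply (vertical and crosswise steps)."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    s0 = a0 & b0
    x, y = a1 & b0, a0 & b1
    s1, c1 = x ^ y, x & y
    z = a1 & b1
    s2, c2 = c1 ^ z, c1 & z
    return (c2 << 3) | (s2 << 2) | (s1 << 1) | s0


def vedic_4x4(a, b):
    """4-bit x 4-bit multiply built from four 2x2 blocks."""
    a_hi, a_lo = (a >> 2) & 0b11, a & 0b11      # A3A2 and A1A0
    b_hi, b_lo = (b >> 2) & 0b11, b & 0b11      # B3B2 and B1B0

    low = vedic_2x2(a_lo, b_lo)                 # weight 2^0
    cross = vedic_2x2(a_hi, b_lo) + vedic_2x2(a_lo, b_hi)   # weight 2^2
    high = vedic_2x2(a_hi, b_hi)                # weight 2^4

    return low + (cross << 2) + (high << 4)     # S7 ... S0


if __name__ == "__main__":
    print(vedic_4x4(13, 11))                    # prints 143
```

The four 2x2 products are independent of each other and can be formed in parallel; only the final additions depend on all of them.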
Sample Representation of 4X4 Multiplication
Figure 11: 4×4 Vedic Multiplier. Reprinted with Permission from Sowmiya. M, Nirmal Kumar. R, Dr.
S. Valarmathy, Karthick. S, Design of Efficient Vedic Multiplier by the Analysis of Adders, International Journal
for Emerging Technology and Advanced Engineering, Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008
Certified Journal, Volume 3, Issue 1, January 2013), Ref. [7]
SIMULATION RESULTS
In this section, the measured performance of all five multipliers using the 10T adder is compared. These
results were obtained from Cadence Virtuoso simulations. Although low power is the objective of our design, we also wanted to
measure the delay and area of these circuits, as they are indicators of good performance.
Power
The energy consumption of all the multipliers investigated is presented in the tables below for a 45 nm technology
size. For all the operand sizes, the SERF adder based multipliers consumed considerably less energy than the
CMOS adder based multipliers. In fact, the SERF based multipliers performed at least thirty-two percent better than any
CMOS based version. The 10T based 4×4 Bit-Array multiplier proved to have the greatest advantage over its CMOS counterpart, with
a sixty percent improvement.
The power gain of 10T is less than that of the SERF based multipliers, and hence it can be used where pass transistor
logic is used. The power consumed by the array multiplier is higher than that of the Braun Array and Carry Save Array multipliers in the 4-bit case.
Power comparisons at 0.8 V for the multipliers are shown in Table 6.
Table 6: Power Consumption Comparison of Multipliers at 0.8V

Supply voltage: 0.8 V. Power in µW.

Multiplier      Frequency (MHz)
                100      200      250      400      500
Baugh Wooley    0.852    1.559    1.948    2.985    3.643
Braun Array     0.783    1.513    1.918    2.87     3.498
Bit Array       1.082    1.903    2.198    3.121    3.698
Vedic           1.332    2.337    2.775    4.093    5.172
CSA             0.736    1.374    1.786    2.768    3.404
In this table, the Braun Array and Carry Save Array multipliers are the most efficient at all the frequencies
compared above.
Figure 12: Power Consumption Comparison Chart for Multipliers at 0.8 Volts
Table 7: Power Consumption Comparison of Multipliers at 1V

Supply voltage: 1 V. Power in µW.

Multiplier      Frequency (MHz)
                100      200      250      400      500
Baugh Wooley    1.929    3.262    4.016    6.019    7.337
Braun Array     1.678    3.131    3.925    5.905    7.243
Bit Array       1.723    3.116    4.036    6.001    7.095
Vedic           2.797    4.808    5.809    8.551    10.65
CSA             2.301    3.473    4.002    5.683    6.892
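Table 8: Power Consumption Comparison of Multipliers at 1.2V

Supply voltage: 1.2 V. Power in µW.

Multiplier      Frequency (MHz)
                100      200      250      400      500
Baugh Wooley    3.875    6.368    7.848    11.37    13.20
Braun Array     2.591    4.54     5.663    8.547    10.36
Bit Array       6.541    10.53    10.72    13.79    15.28
Vedic           7.91     10.79    12.36    16.38    19.99
CSA             4.23     5.286    6.297    9.107    10.07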
Table 9: Minimum and Maximum Delay of Multipliers (ps) at Each Supply Voltage

Multiplier      0.8 V              1 V                1.2 V
                Min      Max       Min      Max       Min      Max
Baugh Wooley    43.17    960.0     22.02    659.6     21.93    657.9
Braun Array     43.53    625.6     22.25    536.2     16.01    456.4
Bit Array       58.23    904.6     34.04    795.0     25.87    625.9
Vedic           29.93    552.7     20.2     426.2     19.33    379.9
CSA             65.08    502.9     37.52    481.2     27.26    412.0
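Table 10: Power Delay Product Comparison of Multipliers at 0.8V

Supply voltage: 0.8 V. Power delay product in femto units (µW × ps, scaled to 10^-15).

Multiplier      Frequency (MHz)
                100      200      250      400      500
Baugh Wooley    0.817    1.496    1.870    2.865    3.497
Braun Array     0.489    0.945    1.198    1.793    2.186
Bit Array       0.978    1.721    1.988    2.823    3.345
Vedic           0.736    1.291    1.533    2.262    2.858
CSA             0.370    0.690    0.898    1.392    1.711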
Figure 16: Power Delay Product Comparison Chart for Multipliers at 0.8 V
Table 11: Power Delay Product Comparison of Multipliers at 1V

Supply voltage: 1 V. Power delay product in femto units (µW × ps, scaled to 10^-15).
Entries marked "-" were not preserved in the source.

Multiplier      Frequency (MHz)
                100      200      250      400      500
Baugh Wooley    1.272    2.151    2.648    3.970    4.839
Braun Array     -        -        -        3.166    3.883
Bit Array       -        -        -        4.770    5.640
Vedic           -        -        -        3.694    4.539
CSA             -        -        -        2.734    3.316
Figure 17: Power Delay Product Comparison Chart for Multipliers at 1 V
Table 12: Power Delay Product Comparison of Multipliers at 1.2V

Supply voltage: 1.2 V. Power delay product in femto units (µW × ps, scaled to 10^-15).

Multiplier      Frequency (MHz)
                100      200      250      400      500
Baugh Wooley    2.549    4.189    5.163    7.480    8.684
Braun Array     1.182    2.072    2.584    3.900    4.728
Bit Array       4.094    6.590    6.709    8.631    9.563
Vedic           3.005    4.099    4.695    6.222    7.594
CSA             1.742    2.177    2.594    3.752    4.148
Figure 18: Power Delay Product Comparison Chart for Multipliers at 1.2 V
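The PDP figures can be cross-checked and compared with a short Python sketch. It assumes that the PDP is the product of power and maximum delay, with power in µW and delay in ps; under that assumption the Baugh Wooley power at 1.2 V and 100 MHz (3.875) multiplied by its maximum delay at 1.2 V (657.9) reproduces the Table 12 entry of 2.549. The dictionary below copies the Table 12 values, and the names `pdp` and `best_per_frequency` are hypothetical.

```python
FREQS_MHZ = (100, 200, 250, 400, 500)

# Power delay product at 1.2 V, copied from Table 12.
PDP_1V2 = {
    "Baugh Wooley": (2.549, 4.189, 5.163, 7.480, 8.684),
    "Braun Array":  (1.182, 2.072, 2.584, 3.900, 4.728),
    "Bit Array":    (4.094, 6.590, 6.709, 8.631, 9.563),
    "Vedic":        (3.005, 4.099, 4.695, 6.222, 7.594),
    "CSA":          (1.742, 2.177, 2.594, 3.752, 4.148),
}


def pdp(power_w, delay_s):
    """Power delay product in joules (assumed definition: power x delay)."""
    return power_w * delay_s


def best_per_frequency(table, freqs):
    """Name of the multiplier with the lowest PDP at each frequency."""
    return {
        f: min(table, key=lambda name: table[name][i])
        for i, f in enumerate(freqs)
    }


if __name__ == "__main__":
    # 3.875 uW x 657.9 ps, expressed in SI units.
    print(pdp(3.875e-6, 657.9e-12))
    for freq, name in best_per_frequency(PDP_1V2, FREQS_MHZ).items():
        print(freq, "MHz:", name)
```

At 1.2 V the lowest PDP belongs to the Braun Array multiplier up to 250 MHz and to the CSA multiplier at 400 and 500 MHz.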
CONCLUSIONS
In this paper, we have presented the power and speed performance characteristics of five different multipliers
realized using the 10T adder, which was selected after comparing the PDP of three adders: the 10T, SERF and static CMOS 28T adders. We found that the
4×4 Carry-Save multiplier has a lower PDP than all the other multipliers (Baugh Wooley, Bit Array, Vedic
and Braun Array) at higher frequencies, whereas at lower frequencies the Braun Array
multiplier has the lowest PDP. Thus, when a multiplier is to be used at lower frequencies the Braun Array
multiplier is efficient, while the CSA multiplier is efficient from 400 MHz onwards. From the above comparisons we conclude that,
depending on the frequency, the Braun Array multiplier and the Carry-Save Array multiplier are efficient at low frequency and
high frequency respectively.
REFERENCES
1. Dhireesha Kudithipudi and Eugene John, Implementation of Low Power Digital Multipliers Using 10 Transistor
Adder Blocks, Journal of Low Power Electronics, Vol. 1, pp. 1-11, 2005.
2.
3. J. M. Rabaey, A. Chandrakasan, and B. Nikolic (Eds.), Digital Integrated Circuits, Prentice Hall Publications
(2003).
4. R. Shalem, E. John, and L. K. John, A novel low power energy recovery full adder cell, Proceedings of the Great
Lakes Symposium.
5. L. Junming et al., A novel 10-transistor low-power high-speed full adder cell, Proceedings of 6th International
Conference on Solid-State and Integrated-Circuit Technology (2001), pp. 1155-1158.
6. Anitha R, Bagyaveereswaran V, Braun's Multiplier Implementation using FPGA with Bypassing Techniques,
International Journal of VLSI Design & Communication Systems (VLSICS), Vol. 2, No. 3, September 2011.
7. Sowmiya. M, Nirmal Kumar. R, Dr. S. Valarmathy, Karthick. S, Design of Efficient Vedic Multiplier by the
Analysis of Adders, International Journal of Emerging Technology and Advanced Engineering, Website:
www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 1, January 2013).
8. R. Shalem et al., A novel low power energy recovery full adder cell, Proceedings of the Great Lakes Symposium
on VLSI (1999), pp. 380-383.
9. Muhammad H. Rais, Hardware Implementation of Truncated Multipliers Using Spartan-3AN, Virtex-4 and
Virtex-5 FPGA Devices, Am. J. Engg. & Applied Sci., 3 (1): 201-206, 2010.
10. S. Shah, A. J. Al-Khalili, and D. Al-Khalili, Comparison of 32-bit multipliers for various performance measures,
Proceedings of the 12th International Conference on Microelectronics (2000), pp. 75-80.
11. I. S. Abu-Khater, A. Bellaouar, and M. I. Elmasry, Circuit techniques for CMOS low power high-performance
multipliers, IEEE Journal of Solid State Circuits (1996), Vol. 31, pp. 1535-1546.
12. G.-K. Ma and F. J. Taylor, Multiplier policies for digital signal processing, IEEE ASSP Magazine (1990), pp. 6-19.
13. T. K. Callaway and E. E. Swartzlander, Jr., Power delay characteristics of CMOS multipliers, Proceedings of the
13th International Symposium on Computer Arithmetic (1997), pp. 26-32.
14. G. Goto, A. Inoue, R. Ohe, S. Kashiwahra, S. Mitarai, T. Tsuru, and T. Izawa, A 4.1-ns compact 54×54 multiplier
utilizing sign-select booth encoders, IEEE Journal of Solid-State Circuits (1997), Vol. 32, pp. 1676-82.
15. K. Z. Pekmestzi, Multiplexer-based array multipliers, IEEE Trans. on Computers (1999), Vol. 48, pp. 15-23.
16. D. Radhakrishnan, Low Voltage CMOS Full Adder Cells, Electronics Letters (1999), Vol. 35, pp. 1792-1794.
17. L. Junming, S. Yan, L. Zhenghui, and W. Ling, A novel 10-transistor low-power high-speed full adder cell,
Proceedings of 6th International Conference on Solid-State and Integrated-Circuit Technology (2001), pp. 1155-1158.
18. http://public.itrs.net, International technology roadmap for semiconductors.
19. H. T. Bui, Y. Wang, and Y. Jiang, Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR
gates, IEEE Trans. on Circuits and Systems-II: Analog and Digital Signal Processing (2002), Vol. 49,
pp. 25-30.
20. A. M. Shams, T. K. Darwish, and M. A. Bayoumi, Performance analysis of low-power 1-bit CMOS full adder cells,
IEEE Trans. on VLSI Systems (2002), Vol. 10, pp. 20-29.
AUTHORS DETAILS
Bynaboina Vamsi Krishna received his B.Tech. degree in Electronics & Communication Engineering from Mekhapati
Rajamohan Reddy College of Engineering and Technology, Udayagiri, Nellore (Dist.), in 2010. He is currently pursuing an M.Tech.
in VLSI at Audisankara College of Engineering and Technology, affiliated to JNTUA, Gudur, Nellore (Dist.), A.P. 524101, India.
K Dhanunjaya received his B.Tech. degree in Electronics & Communication Engineering from G. Pulla Reddy
Engineering College, Kurnool, AP, in 1998 and his M.Tech. in ECE from Jawaharlal Nehru Technological University Kakinada in
2001. He is currently doing research in low power VLSI design at Jawaharlal Nehru Technological University
Anantapur. He has 13 years of teaching experience and is presently working as Professor & Head of the Department of ECE,
Audisankara College of Engineering and Technology (Autonomous), affiliated to JNTUA, Gudur. He is a life
member of IETE and ISTE and a member of IEEE.