Switching Characteristics of Generalized Array Multiplier Architectures and Their Applications To Low Power Design

This document discusses new array multiplier architectures that can reduce switching activity and power dissipation in digital signal processing applications. It presents a generalized cellular structure that can model different array multipliers. The switching characteristics of this structure are analyzed and compared to a tree multiplier and a commonly used least-significant-bit-first array multiplier. The analysis shows that the proposed hybrid architectures that combine least-significant-bit-first and most-significant-bit-first approaches can significantly reduce switching activity and power dissipation depending on the statistical properties of the input signals. This has applications in low power design and reconfigurable computing.


Purdue University

Purdue e-Pubs
ECE Technical Reports Electrical and Computer Engineering

3-1-1999

Switching Characteristics of Generalized Array Multiplier Architectures and their Applications to Low Power Design
Khurram Muhammad
Purdue University School of Electrical and Computer Engineering

Dinesh Somasekhar
Purdue University School of Electrical and Computer Engineering

Kaushik Roy
Purdue University School of Electrical and Computer Engineering

Follow this and additional works at: http://docs.lib.purdue.edu/ecetr

Muhammad, Khurram; Somasekhar, Dinesh; and Roy, Kaushik, "Switching Characteristics of Generalized Array Multiplier Architectures and their Applications to Low Power Design" (1999). ECE Technical Reports. Paper 37. http://docs.lib.purdue.edu/ecetr/37

This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact [email protected] for additional information.

SWITCHING CHARACTERISTICS OF GENERALIZED ARRAY MULTIPLIER ARCHITECTURES AND THEIR APPLICATIONS TO LOW POWER DESIGN

TR-ECE 99-4 MARCH 1999

Switching Characteristics of Generalized Array Multiplier Architectures and their Applications to Low Power Design¹
Khurram Muhammad, Dinesh Somasekhar and Kaushik Roy
Email: [email protected], [email protected] and [email protected]
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907
February 22, 1999.

Abstract

This paper presents several new array multiplier architectures for reducing the switching activity in general digital signal processing applications. A general cellular structure is described which can be used to obtain any array multiplier suitable for a given application. The switching activity at the output nodes of the cells in this structure is analyzed and compared with a tree multiplier based on 4:2 compressors. It is shown that the relative improvement in power is a function of the statistical properties of the signal. It is also shown that selecting an appropriate array architecture can give up to 40% reduction in switching activity compared to a tree multiplier, and more than 3 times less switching activity compared to the widely used least-significant-bit-first array multiplier for commonly occurring situations. We also outline applications of the proposed multipliers to the areas of low power quantization, reconfigurable computing and high-level synthesis for low power.

¹This work was supported in part by DARPA (F33615-95-C-1625), NSF CAREER award (9501869-MIP), Rockwell, AT&T and Lucent foundation.

With the recent trend of increasing mobility and performance in small hand-held mobile communication and portable computing equipment, low power has become an important design factor. New features are continually provided using DSP algorithms which are dominated by three basic operations: add, shift and multiply. Many DSP algorithms can be implemented such that the data is processed in carry save (CS) format, as this format yields zero cost of accumulation [1] in the multiply-and-accumulate (MAC) operation. The conversion of the result to normal binary form can be delayed for as long as possible for the given algorithm since this results in a significantly faster implementation. Consider, for example, a digital filter implementation. In such an application, the intermediate result, which is the accumulation of a given inner product of data and coefficients, can be kept stored in CS format, with the CS to binary conversion taking place only after the final result is computed in CS form. Consequently, multiplier architectures processing data in CS format are of particular interest.

Multiplication operations are considered to be the dominant computation in DSP algorithms [2], [3]. Since computation directly results in dynamic power consumption [4], multiplication is an equally important factor when considering the dynamic power dissipation of such algorithms. In general, high-performance DSP architectures are required in mobile units which process data at high transmission rates, or in a portable computer providing advanced multimedia features. For this reason, such units are generally constructed with pipelined array multipliers. If the latency of the pipelined architecture is an important consideration, a pipelined tree multiplier can be used. Both types of multipliers can be easily pipelined using the conventional register based approach, or by using wave pipelining.

Over the past few years, a number of papers have addressed multiplier topologies for a variety of applications [1], [6], [7]. In particular, the array structures proposed in [6] address pipelining of recursive digital filters using most significant bit (MSB) first digit serial arithmetic. However, to the best of our knowledge, no work has been reported in the literature which addresses dynamic switching activity trade-offs between popular multiplier architectures as a function of the statistical properties of the inputs.
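The carry save accumulation described above can be illustrated with a minimal sketch. It assumes non-negative integer operands, and the function names (`carry_save_add`, `mac_carry_save`, `to_binary`) are hypothetical. As a simplification, each product is formed with ordinary integer multiplication; only the accumulation is kept in CS form.

```python
def carry_save_add(s, c, x):
    """One 3:2 compression step: fold x into the (sum, carry) pair.

    No carry is propagated along the word; the invariant is
    s + c + x == new_sum + new_carry (non-negative integers assumed).
    """
    new_sum = s ^ c ^ x
    new_carry = ((s & c) | (s & x) | (c & x)) << 1
    return new_sum, new_carry


def mac_carry_save(data, coeffs):
    """Accumulate an inner product while keeping the result in CS form."""
    s, c = 0, 0
    for d, h in zip(data, coeffs):
        s, c = carry_save_add(s, c, d * h)
    return s, c


def to_binary(s, c):
    """Deferred CS-to-binary conversion: the only carry-propagating add."""
    return s + c


if __name__ == "__main__":
    s, c = mac_carry_save([3, 5, 7], [2, 4, 6])
    print(to_binary(s, c))
```

The single carry-propagating addition happens once, in `to_binary`, after all terms have been accumulated.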
In this paper, we explore array structures from the point of view of dynamic power dissipation. Contrary to the expectation that any ordering of an array multiplier would yield similar dynamic power dissipation, we will show that more than 3 times reduction in switching activity may be possible compared to the commonly used least significant bit (LSB) first array multipliers (also known as right-left multipliers), depending on the characteristics of the input signals. This is because a salient feature of DSP algorithms is that the computations are governed by the statistical properties of the underlying process generating the data. In general, data signals are correlated and, consequently, rapidly changing data is seldom processed. Hence, we will explore the effects of signal statistics on the output switching activity in various array structures in order to assess the feasibility of using a given structure under the condition of known or predictable signal statistics. We will show that re-ordering of partial product addition can result in significant reduction in switching activity (hence, dynamic power) if the signal statistics are known a priori. This observation leads to new array multiplier architectures which form hybrids of MSB-first and LSB-first structures. We also discuss the application of such multipliers to low power implementation of DSP algorithms and to the general area of reconfigurable computing. The main objective of this work is to identify which types of architectures are best suited for processing signals with known statistical properties for reduced dynamic power dissipation. There are three major contributions of this work:

We propose hybrid-array structures which combine LSB-first and MSB-first types of array multipliers. For appropriate signal conditions, these structures are shown to significantly reduce dynamic power dissipation.

The switching characteristics of array multipliers are compared with a tree multiplier based on 4:2 compressors as well as the most commonly used LSB-first multiplier to show the region of strength of each architecture. Hence, this work can be used to formulate an appropriate strategy for selecting the best order of partial product addition for reducing power dissipation in a given DSP task. Alternatively, when processing signals with known statistical properties, one can formulate a strategy for applying signals to the multiplier inputs in an order which most effectively reduces dynamic power dissipation.

The architectures presented in this paper provide new insights to the general area of low power design and reconfigurable computing.

This paper is organized into five sections. Section II describes the array multiplier architectures considered in this work. Section III presents a simulation based study of the switching characteristics of output nodes in the architectures considered. The signal models used to compute the performance of these multipliers are also explained in this section. Section IV discusses the applications of these structures to general signal processing algorithms. Finally, section V concludes this paper.

We will first present a simple approach for obtaining various types of array multipliers. Figure 1 shows a template for a cellular array structure which serves as the basis for generating different types of 8-bit array multipliers. Each location in this matrix can be occupied by a cell which can be an AND gate (AND), a half adder (HA) or a full adder (FA). In the sequel, the cell at location $i, j$ will be referred to as $c_{i,j}$. As an example, the cells on the four corners are shown labeled in the figure. Let $A = a_0, a_1, \ldots, a_{N-1}$ and $B = b_0, b_1, \ldots, b_{N-1}$ represent the input vectors applied at right and top, respectively. The output is represented by $P = p_0, p_1, \ldots, p_{2N-1}$. Then each partial product $a_i \cdot b_j$, where $i, j = 0, 1, \ldots, N-1$, must be added in the appropriate relative position to obtain the correct value of $P$. In figure 1 we have shown the structure of the LSB-first type array multiplier by the colored cells comprising a parallelogram. In this figure, the continuous lines show the presence of connections, while the dashed lines show their absence.

Hence, the active connections in a CS type of array multiplier are shown using the continuous lines. The connections from primary inputs to the appropriate cells are not shown explicitly, and are assumed implicit to reduce clutter. By counting the number of active inputs, one can determine the type of cell. Hence, the cells in row #0 are all AND gates, whereas the seven rightmost cells in row #1 are HAs. The cells accepting three active inputs are FAs. Note that the inputs are counted by considering the implicit input $a_i \cdot b_j$ which is not shown. The resulting CS array multiplier structure is shown on the right in figure 1 for clarity.

Fig. 1. Basic template for constructing array multipliers.

Now, the goal of an array multiplier is to add the partial products from cells which occupy the same column in the cellular array structure shown in figure 1. The order in which these partial products are added is not important; we only need to ensure that only the products in the same column are added (in addition to the carries generated from the cells in the adjacent column on the right). Hence, one can exchange rows #3 and #7 as shown in figure 1. Cells in row #3 after moving to row #7 are shown by cells shaded by circles. The cells in row #7 after moving to row #3 are shown by dark colored cells. Now, we only need to ensure that carries generated from the next rows are correctly added, which may require extra cells. Let $R = (r_0, r_1, r_2, \ldots, r_{N-1})$ be the set of indices which represents an ordering of successive additions of rows of partial products. Then, the ordering given by $r_i = i$ for $i = 0, 1, \ldots, N-1$ expresses the LSB-first multiplier shown in figure 1. The MSB-first multiplier can also be expressed similarly by the ordering $r_i = N - 1 - i$, for $i = 0, 1, \ldots, N-1$. Clearly, there are $N!$ ways to construct carry save array multipliers.

Each of these multipliers may be constructed using propagation of carry in either ripple form or CS form or a combination of these. This formulation is the basis for generating the various architectures of interest which are evaluated for their switching activity performance in this paper.
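The row-ordering formulation can be checked with a small arithmetic sketch. It models only the order in which partial-product rows are summed, not the cell-level wiring or carry handling of figure 1; the function names are hypothetical and operands are assumed to be unsigned $N$-bit integers.

```python
from itertools import permutations


def partial_product_rows(a, b, n):
    """Row i holds a_i * B, shifted left by i to its column position."""
    return [((a >> i) & 1) * (b << i) for i in range(n)]


def multiply_with_ordering(a, b, order):
    """Add the rows of partial products in the order given by R = (r_0, ..., r_{N-1})."""
    rows = partial_product_rows(a, b, len(order))
    acc = 0
    for r in order:
        acc += rows[r]
    return acc


def lsb_first(n):
    """r_i = i."""
    return list(range(n))


def msb_first(n):
    """r_i = N - 1 - i."""
    return [n - 1 - i for i in range(n)]


if __name__ == "__main__":
    a, b, n = 181, 110, 8
    print(multiply_with_ordering(a, b, lsb_first(n)) == a * b)
    print(multiply_with_ordering(a, b, msb_first(n)) == a * b)
    # All N! orderings give the same product (checked here for N = 4).
    print(all(multiply_with_ordering(11, 13, list(p)) == 11 * 13
              for p in permutations(range(4))))
```

The product is the same for every ordering; what differs between orderings in hardware is which internal nodes switch, which is the subject of the sections that follow.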

A. LSB-First Multipliers
The LSB-first multiplier can be constructed either using the CS format shown in figure 1, or by using a ripple carry structure. We will refer to the former as the LSB-first CS multiplier and to the latter as the LSB-first RP multiplier. The LSB-first RP multiplier is the most well-known and widely used array structure for multiplication and is obtained from the cellular array of figure 1 by turning off the diagonal lines (by making them dashed) and turning on the horizontal dashed lines (by making them continuous) which connect cell $c_{i,j}$ to $c_{i,j+1}$ for all cells $c_{i,j}$, $i = 0, 1, \ldots, N-1$ and $j = N-i, N-i+1, \ldots, 2N-i-2$ (right-most cell excepted) in the LSB-first CS multiplier of figure 1. The vector merge row (row #$N+1$) is no longer required. The advantage of using the CS format is the reduction in propagation delay through the multiplier. The LSB-first RP multiplier has a 30% longer critical path as compared to the LSB-first CS multiplier. In this work, we consider both since our objective is to highlight the switching characteristics of various array multipliers.

B. MSB-First Multipliers

An MSB-first multiplier places the MSBs of the A input at the top row positions as shown in figure 2. The main idea is to flip the cells in the cellular array of figure 1 along a horizontal axis such that row #$i$ is moved to row #$(N - 1 - i)$, for $i = 0, 1, \ldots, N-1$. This results in an MSB-first multiplier [6]. The multiplier can be constructed by propagating the carry in CS form, or the carry can ripple in a fashion identical to the LSB-first RP multiplier. The multiplier using CS format has been presented in [6] for pipelining recursive digital filters. A major advantage of the MSB-first CS multiplier is that the delay through the vector merge stage can be reduced by taking advantage of the fact that the MSB-first array produces the MSBs before the LSBs. Hence, a carry-select structure can be constructed in the region occupied by cells $c_{i,j}$ for $i$ [...] $j$ to improve the vector merge delay. Consequently, the MSB-first CS array multiplier can improve the speed of multiplication [6]. The observation that the MSBs of the product are available before the LSBs is fundamental to the construction of the MSB-first RP multiplier shown in figure 2(b). In contrast to an LSB-first RP multiplier, it has the same propagation delay as the LSB-first CS multiplier and offers an attractive alternative to it.

Fig. 2. Structures for MSB-first multipliers; (left) MSB-first CS multiplier, and, (right) MSB-first RP multiplier.

C. Hybrid Multipliers

A hybrid multiplier is obtained by any ordering of the elements of $R$ which is not monotone. Note that there is only one monotonically increasing ordering of the elements of $R$ and it leads to the LSB-first structure. Similarly, the only monotonically decreasing ordering leads to the MSB-first structure. Any ordering other than these two leads to a hybrid array multiplier. In this paper, we consider only two types of hybrid structures. The first structure places $L$ consecutive LSB bits of operand A as the $L$ top most rows. This structure is shown on the left in figure 3. The second structure places $L$ consecutive MSB bits of operand A as the $L$ top most rows and is shown on the right in figure 3. We will refer to the former as the hybrid LSB-first multiplier and to the latter as the hybrid MSB-first multiplier. Both of these can be constructed either by using ripple carry or by using CS format. Hence, there are four ways to implement a hybrid multiplier which puts the $L$ top most rows of one type of multiplier above the other (i.e. LSB-first over MSB-first or vice versa).

The multiplier on the left in figure 3 puts the $L = 3$ top rows of the LSB-first CS multiplier over the $N - L = 5$ top rows of the MSB-first CS multiplier. We will refer to such a multiplier as a hybrid LSB-first CS/CS multiplier with $L = 3$. Similarly, the multiplier on the right in figure 3 puts the $L = 3$ top most rows of the MSB-first RP multiplier over the $N - L$ top most rows of the LSB-first CS multiplier. This multiplier will be referred to as a hybrid MSB-first RP/CS multiplier with $L = 3$. We can obtain three more types of $L = 3$ hybrid multipliers for each of these cases by considering the remaining three combinations of adding carries in the two parts of the multiplier. Each type of hybrid multiplier implementation requires a different overhead and has a different critical path length.

Since the goal of this work is to develop an understanding of the switching trade-offs in various multipliers, we will only consider implementations which place $L$ consecutive rows of one type of multiplier over the other. The reason for focusing on such architectures is that DSP applications, in general, process data streams whose properties can only be predicted or controlled over a part of the word-length. For example, if the signal strength reduces, consecutive MSBs of the data stream become zeros (assuming a sign-magnitude representation). Similarly, "less important" data values may be further quantized by truncating some LSBs, thereby resulting in the data stream having zeros at the corresponding locations.

It will be shown that the proposed hybrid multipliers yield a substantial reduction in switching activity compared to a tree multiplier (constructed using 4:2 compressors) as well as the simple LSB-first or MSB-first multipliers under appropriate signal conditions. The multiplier structure shown on the left in figure 3 is entirely a CS structure, and its speed can be increased by using a carry select structure similar to the one proposed in [6]. The multiplier on the right in figure 3 has the same delay as an LSB-first CS array multiplier despite the fact that the MSB-first part ripples the carry. The reason for considering this structure is that it requires fewer overhead cells to ensure that all partial product sums and carries are added at the appropriate locations.
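The two hybrid row orderings considered here can be written down directly. The following is a sketch of the orderings only, for $N$-bit operands with $L$ rows moved to the top; it says nothing about carry propagation (RP or CS) or overhead cells, and the function names are hypothetical.

```python
def hybrid_lsb_first(n, l):
    """L LSB rows on top in LSB-first order, remaining rows in MSB-first order."""
    return list(range(l)) + list(range(n - 1, l - 1, -1))


def hybrid_msb_first(n, l):
    """L MSB rows on top in MSB-first order, remaining rows in LSB-first order."""
    return list(range(n - 1, n - 1 - l, -1)) + list(range(n - l))


def is_monotone(order):
    """True only for the pure LSB-first (increasing) or MSB-first (decreasing) ordering."""
    increasing = all(x < y for x, y in zip(order, order[1:]))
    decreasing = all(x > y for x, y in zip(order, order[1:]))
    return increasing or decreasing


if __name__ == "__main__":
    print(hybrid_lsb_first(8, 3))  # [0, 1, 2, 7, 6, 5, 4, 3]
    print(hybrid_msb_first(8, 3))  # [7, 6, 5, 0, 1, 2, 3, 4]
    print(is_monotone(hybrid_lsb_first(8, 3)))  # False
```

For $N = 8$ and $L = 3$ these match the description of figure 3: three rows of one type placed above five rows of the other.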

Fig. 3. Structures for hybrid multipliers; (left) hybrid LSB-first CS/CS multiplier, and, (right) hybrid MSB-first RP/CS multiplier.

We first investigate the switching characteristics of the multipliers presented in the previous section qualitatively. Let us first consider the LSB-first multipliers. A close observation of the multiplier in figure 1 shows that if successive inputs are applied such that their LSBs are zeros in operand A, the corresponding top rows of the multiplier will be turned off as the evaluated partial products would all be zeros. Any input which has $a_0 = 1$ will place the vector B at the output of the first row of partial product outputs. These values will propagate downwards even if the next LSBs in A are all zeros. Hence, switching activity can only be reduced if successive inputs applied at the input A ensure that when a bit $a_j$ is 1, all $a_i$'s are zeros for $i < j$. Similarly, we notice that if the successive inputs applied at the B inputs are such that M MSB bits are zeros, then the cells $c_{i,j}$ such that $j = i + k$ for $k = 0, 1, \ldots, L-1$ along the diagonal (columns of partial product generators) in the cellular array are all turned off. Hence, no sum or carry output transitions occur in these cells. Low overall switching activity can therefore be ensured if the inputs applied to this multiplier are ordered so that they cause smaller switching activity. Similar observations are made for the MSB-first and hybrid multipliers. The "best" input conditions for these multipliers are summarized in table I and can be verified by a careful study of figures 1 - 3.
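The qualitative argument can be illustrated with a simplified sketch that counts only toggles of the partial-product bits $a_i \cdot b_j$ between two successive input pairs. This is an assumption-laden simplification: it ignores sum and carry propagation through the adder cells, so it shows which rows and columns of partial-product generators stay quiet, not the full switching activity. Function names are hypothetical.

```python
def partial_products(a, b, n):
    """n x n grid of partial-product bits; entry [i][j] is a_i AND b_j."""
    return [[((a >> i) & 1) & ((b >> j) & 1) for j in range(n)]
            for i in range(n)]


def row_toggles(prev, curr, n):
    """Per-row count of partial-product bits that change between two (A, B) pairs."""
    p0 = partial_products(prev[0], prev[1], n)
    p1 = partial_products(curr[0], curr[1], n)
    return [sum(x != y for x, y in zip(r0, r1)) for r0, r1 in zip(p0, p1)]


def column_toggles(prev, curr, n):
    """Per-column (index j of B) count of partial-product bits that change."""
    p0 = partial_products(prev[0], prev[1], n)
    p1 = partial_products(curr[0], curr[1], n)
    return [sum(p0[i][j] != p1[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    n = 8
    # Operand A has its 3 LSBs at zero; operand B has its 4 MSBs at zero.
    prev = (0b10101000, 0b00001101)
    curr = (0b01011000, 0b00000111)
    print(row_toggles(prev, curr, n))     # rows 0..2 stay at zero
    print(column_toggles(prev, curr, n))  # columns 4..7 stay at zero
```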

Next, in order to obtain a quantitative behavior of these multipliers, we will use two signal models which are described next.

A. Signal Models
In the first model we only vary the signal strength to determine the switching characteristics. Hence, successive samples of signals are assumed to be uncorrelated and drawn from a uniform distribution. It has been shown in [10] that the switching activity in the LSB-first RP multiplier primarily depends on the input signal strength. Hence, we apply all possible combinations of fixed signal strengths in an N-bit multiplier by sweeping the space of possible signal strengths at the two inputs. We obtain data for these points by generating samples comprising $i$ bits from a uniform distribution, where $i$ is varied from 1 to $N$. The $N \times N$ possible combinations of signal strengths of the two operands are obtained by applying a signal of strength $i$ bits as operand A and $j$ bits as operand B, where $i, j = 1, 2, \ldots, N$. This model will be referred to as the U model and it can be used to assess the merits of using the presented multipliers for signals which can be represented by $N$ or fewer bits and/or which can be re-quantized by discarding some LSBs without significantly degrading the system performance.

The second model generates correlated signals from a zero mean Gaussian distribution. These samples are represented using sign-magnitude (SM) number representation and only the magnitude of the number is applied at the inputs of the multiplier. The signal correlation in operand A is represented by $\rho_A$ and the correlation in B is represented by $\rho_B$. Four situations arise by considering all possible combinations of high and low correlations in the signals at the two inputs. The high correlation value is considered to be 0.95, and the low correlation equal to 0. This model will be referred to as the Q model.

TABLE I
Conditions causing low switching activity and overheads in the multipliers presented. Hybrid multipliers assume that L rows are moved to top.

Multiplier | Favorable condition at input A | Favorable condition at input B
LSB-first CS | LSBs zeros | MSBs zeros
LSB-first RP | LSBs zeros | MSBs zeros
MSB-first CS | MSBs zeros | MSBs zeros
MSB-first RP | MSBs zeros | MSBs zeros
Hybrid LSB-first CS/CS | MSBs & LSBs zeros | MSBs zeros
Hybrid MSB-first RP/CS | MSBs & LSBs zeros | MSBs zeros

Overhead entries (vector merge, cells, wiring): N - 1 cells; Not Required; 3(N - 1) cells; N - 1 cells; 3N - 2L - 1; L - 1 cells; Wiring; None; None.
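The two signal models can be sketched as sample generators. Several details below are assumptions rather than the authors' procedure: an $i$-bit U-model sample is taken to be uniform on $[0, 2^i - 1]$, the correlated Gaussian sequence is produced with a first-order autoregressive recursion (one common way to obtain a lag-one correlation $\rho$), and the standard deviation is set arbitrarily to a quarter of full scale. The function names are hypothetical.

```python
import math
import random


def u_model_samples(bits, count, rng):
    """Uncorrelated samples, uniform on [0, 2**bits - 1] (assumed meaning of 'i-bit')."""
    return [rng.randrange(1 << bits) for _ in range(count)]


def correlated_gaussian_magnitudes(n_bits, rho, count, rng, sigma=None):
    """Zero-mean Gaussian AR(1) sequence with lag-one correlation rho.

    Only the magnitude is returned, as it would be applied to the multiplier
    under a sign-magnitude representation. Values are clipped to n_bits.
    """
    max_mag = (1 << n_bits) - 1
    if sigma is None:
        sigma = max_mag / 4.0  # assumption: arbitrary scaling, not from the paper
    x = rng.gauss(0.0, sigma)
    out = []
    for _ in range(count):
        out.append(min(max_mag, int(abs(x))))
        # AR(1) update keeps the variance at sigma**2 and the correlation at rho.
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, sigma)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    n = 8
    # U model: sweep all N x N combinations of signal strengths.
    grid = {(i, j): (u_model_samples(i, 1000, rng), u_model_samples(j, 1000, rng))
            for i in range(1, n + 1) for j in range(1, n + 1)}
    print(len(grid))
    # Correlated Gaussian model: high (0.95) and low (0.0) correlation.
    high = correlated_gaussian_magnitudes(n, 0.95, 1000, rng)
    low = correlated_gaussian_magnitudes(n, 0.0, 1000, rng)
    print(max(high) <= 255, max(low) <= 255)
```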

B. Numerical Results on Power Dissipation

We now turn our attention to the switching activity performance of the presented multipliers. The switching activity of each multiplier was evaluated by counting the number of switches at each output of every module in the multiplier. Let $S_c$ denote the switching count of cell $c$. The possible cells in a multiplier are an AND gate, a HA, a FA and a 4:2 compressor (the 4:2 compressor appears in the tree multiplier). The corresponding switching metrics which express the switch counts in these cells will be represented by $S_{AND}$, $S_{HA}$, $S_{FA}$ and $S_{4:2}$, respectively. The total switching metric was obtained using the following weighting: 2 for $S_{AND}$, 3 for $S_{HA}$, $S_{FA}$ and $S_{4:2}$ (the weight reflects the output load capacitance driven by the gate output). These relative weighting factors were obtained by considering the pin loading of a typical module in the array configuration. In addition, the switches at the input pins were counted separately for the given simulation and multiplied by $N$ to account for input buffer drivers. The total switch counts at all outputs (including input pins), weighted by the corresponding factor, were summed to obtain the switching metric for the multiplier. These weightings yield a metric which expresses the total switched capacitance in the multiplier for the given input conditions.
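The weighting described above amounts to a weighted sum of toggle counts. The sketch below uses the weights stated in the text; the dictionary layout, the function name and the toggle counts in the example are hypothetical and only illustrate the arithmetic.

```python
# Weights from the text: 2 for AND outputs, 3 for HA, FA and 4:2 compressor outputs.
WEIGHTS = {"AND": 2, "HA": 3, "FA": 3, "4:2": 3}


def switching_metric(output_toggles, input_toggles, n):
    """Weighted switched-capacitance metric for one multiplier.

    output_toggles maps a cell type to its total output toggle count;
    input_toggles is the toggle count at the input pins, scaled by N
    to account for the input buffer drivers.
    """
    weighted_outputs = sum(WEIGHTS[kind] * count
                           for kind, count in output_toggles.items())
    return weighted_outputs + n * input_toggles


if __name__ == "__main__":
    # Made-up toggle counts for an 8-bit array multiplier.
    print(switching_metric({"AND": 10, "HA": 4, "FA": 20}, input_toggles=5, n=8))
```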

A similar metric was obtained for the tree multiplier by using the same input signals. We will let $S_{Array}$ and $S_{Tree}$ denote the switching metrics for the array and tree multipliers, respectively, for the given input signal conditions. Then the relative advantage of using the array multiplier is defined as

$$\eta_{Tree} = \frac{S_{Tree} - S_{Array}}{S_{Array}} \times 100. \qquad (1)$$

The above quantity is expressed as a percentage and shows the advantage of using the array multiplier over a tree for the given signal condition. We will refer to this quantity as percentage switching reduction. The rationale behind this normalization is to clearly indicate the relative performance of each type of array multiplier with respect to the tree structure and to quantify the percentage reduction in switching activity for a given signal condition. A similar quantity can be obtained for comparing the relative performance of any two multipliers. Figure 4 shows one such metric computed using the LSB-first CS multiplier as the reference for normalization. The figure shows the relative advantage of using the indicated hybrid multipliers in comparison to the LSB-first CS multiplier by using $S_{LSB\text{-}First\ CS}$ in place of $S_{Tree}$ in equation 1. This quantity will be represented by $\eta_{Array}$, as we set the LSB-first CS multiplier as the base-line for comparison in array multipliers. It is noted that switching reduction of up to 200% (3X smaller) is possible when using a hybrid multiplier in comparison to the LSB-first CS multiplier, under appropriate signal conditions.

The results presented in this section were obtained by using 1000 randomly generated vectors using the U model. These results give rise to a surface as a function of the number of bits in the applied inputs. This surface is best shown by slicing it into different regions and showing every slice individually as in figure 4. An even better representation is to place the slices alongside each other as a bar chart, as shown in the remaining figures. The abscissa in these figures shows the number of bits in the samples (drawn from a uniform distribution) applied at the A input. The data samples were applied at the multiplier inputs by aligning their LSB with the zeroth indexed row/column. Hence, the successive simulations computed the switching metrics for inputs with increasing widths until the metrics for all the grid points of the switching metric surface were computed. The metrics were normalized to obtain the relative switching reduction shown in the figures. The bars in each figure are composed of $N$ groups. The position of a group corresponds to the number of bits in the samples applied at A. Each group, in turn, is composed of $N$ bars. The position of a bar inside a group indicates the number of bits in the samples applied at the B input. Hence, as we scan a figure from left to right, the strength of the input signal at the B input repeatedly increases and falls, while the strength of the signal applied at A continually increases.

Fig. 4. $\eta_{Array}$ for 16-bit hybrid array multipliers. Figure above: shows the percentage switching reduction for Hybrid LSB-first CS/CS with L=1, and, figure below: shows this surface for Hybrid MSB-first RP/CS multiplier with L=1 (normalization is performed with respect to LSB-first CS multiplier). Axes: signal strength of A, signal strength of B.

B.1 Results Using the U Signal Model

Figures 5 - 7 show $\eta_{Tree}$ as a function of signal strength in the LSB-first and MSB-first multipliers for $N = 8$, 16 and 32, respectively. We observe a consistent trend in the relative performance of each of these multipliers. Each of these multipliers gives gains in switching reduction for different operating conditions. As pointed out in table I, LSB-first multipliers would give an improvement when the LSBs of the A input, or the MSBs of the B input, are zeros. The first situation does not arise with this signal model, because it would require MSBs to be 1s and LSBs to be 0s. Such a signal can only be generated by quantizing (rounding/truncating) the LSBs. However, the second condition is more realistic and we note that up to 25% reduction in switching activity is possible over the tree multiplier when the signal strength of A is high and that of B is small, as this results in the left-most columns of the multiplier turning off. Despite the overhead of the vector merge stage, the CS multiplier out-performs the RP multiplier, as is evident from a close inspection of these figures.

The MSB-first multiplier shows gains in switching activity reduction when the signal strength at the A input is low. Hence, the top most rows do not switch. Larger gains are observed when the signal strength at the B input is large. The RP type multiplier clearly outperforms the CS multiplier because of fewer overhead cells. Further, the relative gains under favorable signal conditions are higher as compared to the LSB-first multipliers. Finally, both favorable situations appear at the inputs in the U signal model because the MSB-first multipliers reduce switching when the MSBs of both inputs are 0s (a situation which frequently arises in DSP applications). It is seen that close to 40% reduction in switching activity is possible in the MSB-first RP multiplier when the signal strength of A is very small and B is very strong. The savings are consistent across 8, 16 and 32 bit multipliers.

We can also compare the relative performance of MSB-first and LSB-first multipliers. Figure 4 shown earlier indicates that the LSB-first multiplier out-performs the MSB-first multiplier by up to 30% when the signal strengths at both A and B inputs are very small. However, the MSB-first multiplier out-performs the LSB-first multiplier for most situations, giving a larger relative advantage in switching reduction. Note that these results favor the MSB-first type multiplier from the switching activity point of view for most common signal conditions. One may notice that many multipliers used in DSP do not need all $2N$ product bits (especially in floating point units) and the MSB-first multiplier is an attractive choice since by construction it also furnishes the MSB part of the product very quickly.

Fig. 5. $\eta_{Tree}$ for (left) 8-bit LSB-first array multiplier, and, (right) 8-bit MSB-first array multiplier as a function of the signal strength in the operands. Panels: LSB-first CS, MSB-first CS, LSB-first RP and MSB-first RP multipliers; abscissa: signal strength of A.

Fig. 6. $\eta_{Tree}$ for (left) 16-bit LSB-first array multiplier, and, (right) 16-bit MSB-first array multiplier as a function of the signal strength in the operands.

Fig. 7. $\eta_{Tree}$ for (left) 32-bit LSB-first array multiplier, and, (right) 32-bit MSB-first array multiplier as a function of the signal strength in the operands. Panels: LSB-first CS, MSB-first CS, LSB-first RP and MSB-first RP multipliers; abscissa: signal strength of A.

We next consider hybrid multipliers. The main objective of employing the hybrid multipliers presented in this paper is to take advantage of a signal whose L LSB bits are zeros. Such signal values may arise in many ways in typical DSP applications. As an example, computations may be organized as floating-point type operations in which the normalized mantissas of the operands are multiplied using an array multiplier and values are expressed using only a few MSB bits of the mantissa, depending on the accuracy required (variable precision arithmetic). An example of such a system is a digital filter implementation employing scaled coefficients to reduce the performance degradation due to coefficient quantization [3]. Another example of truncating a signal's L LSB bits is a situation where the resulting degradation in accuracy can be tolerated for the application at hand. An example of such a system is an FIR filter whose objective is to meet given filter specifications but whose implementation uses a multiplier that is bigger than the least number of bits required to meet these specifications [8], [9]. This situation can easily arise in general DSP implementations where shared multipliers are used for more than one application and resources are not exclusively dedicated to one task. For these reasons, the switching performance of the hybrid multipliers was computed by truncating the L LSB bits of the signal and setting them to zeros. If the L LSB bits are not set to zeros, the hybrid multiplier's switching performance will lie between that of the LSB-first and MSB-first multipliers.

Next, we analyze the results shown in figures 8 - 10. These figures show $\eta_{Tree}$ for 8, 16 and 32-bit multipliers, respectively. The figures on the left show the results for multipliers with L = 1, and the figures on the right show the results obtained for multipliers with L = 2. It is seen that hybrid MSB-first and hybrid LSB-first multipliers show improvement in performance for different signal conditions. The former shows the most improvement when the signal strength is small for A and large for B. The latter shows the most gain when the converse is true. The reduction in switching activity is more pronounced in the hybrid LSB-first multiplier despite the overhead of cells required to ensure correct operation. This is due to the fact that

[Figure panels: Hybrid LSB-first and Hybrid MSB-first multipliers with L = 1 and L = 2; x-axis: signal strength of A.]

Fig. 8. $\eta_{Tree}$ for 8-bit hybrid array multipliers as a function of the signal strength in the operands. (left) L = 1, and, (right) L = 2.

[Figure panels: Hybrid LSB-first and Hybrid MSB-first multipliers with L = 1 and L = 2; x-axis: signal strength of A.]

Fig. 9. $\eta_{Tree}$ for 16-bit hybrid array multipliers as a function of the signal strength in the operands. (left) L = 1, and, (right) L = 2.

the L LSB bit truncated signals obtained through the 24 model are more effective in turning off a larger part of the multiplier, since the LSB-first part of the multiplier precedes the MSB-first part in the former case. Significant reduction in switching activity is achieved in both cases. Further, the trends are consistent for all sizes of multiplier. Figure 11 shows $\eta_{Tree}$ for 8 and 16-bit multipliers, respectively, with L = 3. Figure 12 shows $\eta_{Tree}$ for 16 and 32-bit multipliers, respectively, for hybrid multipliers with L = 4. The missing bars indicate that the A operands under the indicated signal conditions were zeros (small power, large truncation). Hence, no operations are necessary. However, the region of switching reduction moves to the mid-region of A signal power. The relative switching activity reduction becomes larger as L increases. The trends are consistent for all hybrid LSB-first CS/CS and hybrid MSB-first RP/CS multipliers for all sizes and values

[Figure panels: Hybrid LSB-first multiplier with L = 1 and L = 2; x-axis: signal strength of A.]

Fig. 10. $\eta_{Tree}$ for 32-bit hybrid array multipliers as a function of the signal strength in the operands. (left) L = 1, and, (right) L = 2.

[Figure panels: Hybrid LSB-first and Hybrid MSB-first multipliers with L = 3; x-axis: signal strength of A.]

Fig. 11. $\eta_{Tree}$ for 8 (left) and 16-bit (right) hybrid array multipliers with L = 3 as a function of the signal strength in the operands.

of L. The reduction in switching activity under favorable signal conditions is as large as 40%. The relative performance of a large multiplier for small and large values of L is shown in figure 13. This figure shows $\eta_{Tree}$ for 32-bit hybrid multipliers for L = 1 (figures on left) and L = 8 (figures on right), respectively. Since the indicated truncation for small signal strength completely annihilates its value, the first seven groups of bars are missing in the figures on the right. No operations are necessary in this region of operation and no switching activity results in the hybrid multiplier if such operands are applied. Switching reductions of up to 35% are achievable in the L = 8 case, in comparison to about 30% for the L = 1 case. Although the results shown in figures 8 - 13, in general, suggest superior performance of the hybrid LSB-first multiplier over the hybrid MSB-first multiplier, one must remember that the decision to choose the best multiplication scheme is dependent on the input signal conditions. The relative switching

[Figure panels: Hybrid LSB-first and Hybrid MSB-first multipliers with L = 4; x-axis: signal strength of A.]

Fig. 12. $\eta_{Tree}$ for 16 (left) and 32-bit (right) hybrid array multipliers with L = 4 as a function of the signal strength in the operands.

[Figure panels: Hybrid LSB-first and Hybrid MSB-first multipliers with L = 1 and L = 8; x-axis: signal strength of A.]

Fig. 13. $\eta_{Tree}$ for 32-bit hybrid array multipliers with L = 1 (left) and L = 8 (right) as a function of the signal strength in the operands.

activity reduction also depends on the choice of multiplier used in normalization in equation 1. Figure 14 demonstrates this point by showing $\eta$ for 32-bit hybrid array multipliers with L = 2 and L = 4, respectively. Notice that the switching activity reduction in the hybrid MSB-first multiplier, although smaller than that of its counterpart, is more consistent as the signal strength of A varies. Hence, for a given application, the hybrid MSB-first multiplier may be preferred despite its generally inferior performance relative to the hybrid LSB-first multiplier.

B.2 Switching Activity for Correlated Signals

We now consider the performance of the presented multipliers using the model. For this purpose we applied data samples obtained from a Gaussian distribution for different signal strengths varying from 1 to - 1 bits. Four situations were chosen to reflect the effect of correlation in the signal by considering

[Figure panels: x-axis: signal strength of A.]

Fig. 16. $\eta_{Tree}$ for (left) 8-bit MSB-first CS, and, (right) 8-bit MSB-first RP, array multipliers as a function of the signal strength in the operands. Values of $\rho_A$ and $\rho_B$ are shown with each plot.

strength of A increases. The effect of increasing $\rho$ is an "equalization" of $\eta_{Tree}$ for small signal strengths at A. However, the differences are very small.

In the case of MSB-first multipliers shown in figure 16, we notice that a higher $\rho$ for one operand causes an "equalized" $\eta_{Tree}$ for small signal strengths of A. Hence, better gains are obtained as the signal strength of A increases, and these gains drop quickly as A becomes stronger. The effect of $\rho$ for the other operand is not discernible. Similar results are seen in figures 17 - 18, which show the effect of correlated signals on the performance of hybrid multipliers. In all these examples, the effect of one correlation coefficient is negligible; however, a high value of the other causes the gains to equalize in the region where the hybrid multiplier outperforms the tree multiplier.

It is noted that consideration of extremely high correlations does not make much sense, because a better approach in this case is to difference the data and reduce its dynamic range. Hence, by adding the overhead of an add operation one can significantly reduce the size of the operands in multiplication. The results shown in this section clearly indicate that signal correlations have a small effect on the switching activity for all multipliers. It is actually the signal strength at the inputs which almost completely determines the switching in the multiplier. This is confirmed by a similar observation made for LSB-first RP multipliers in [10].

B.3 Area Comparison

The LSB-first CS and MSB-first CS multipliers were implemented in CMOS using 0.6 μm technology. Both of these structures were implemented after inverter elimination simplifications for the partial product generator rows [4]. Cells were implemented for both non-inverted and inverted outputs [4], and the bottom-most row constituted a vector merge adder for converting CS format to regular representation. The layout areas of the two multipliers are shown in table II for purposes of comparison. MSB-first CS adds a wiring overhead which results in an increased area. This is because the carry signal must be propagated one cell

[Figure panels: x-axis: signal strength of A.]

Fig. 17. $\eta_{Tree}$ for 8-bit hybrid array multipliers as a function of the signal strength in the operands. (left) Hybrid LSB-first CS/CS with L = 1, and, (right) Hybrid MSB-first RP/CS with L = 1. Values of $\rho_A$ and $\rho_B$ are shown with each plot.

[Figure panels: x-axis: signal strength of A.]

Fig. 18. $\eta_{Tree}$ for 8-bit hybrid array multipliers as a function of the signal strength in the operands. (left) Hybrid LSB-first CS/CS with L = 2, and, (right) Hybrid MSB-first RP/CS with L = 2. Values of $\rho_A$ and $\rho_B$ are shown with each plot.

further in a rectangular layout. These values can be used to approximately estimate the area overhead of using hybrid multipliers.

IV. APPLICATION TO LOW POWER DESIGN

In the previous sections, we have provided a qualitative as well as quantitative assessment of the switching activity reduction which can be obtained by using the proposed multiplier structures for various signal conditions. These results can assist in design for low power, as they show the relative strengths and weaknesses of different multiplier architectures. In this section, we will briefly discuss the application of this work to low power quantization, reconfigurable computing and high-level synthesis for low power.

TABLE II
LAYOUT AREA IN (μm)² OF LSB-FIRST AND MSB-FIRST CS MULTIPLIERS FOR VARIOUS N × N-BIT MULTIPLIERS IN 0.6 μm TECHNOLOGY.

Multiplier       N = 8     N = 12     N = 16     N = 24     N = 32
LSB-first CS     63,508    138,040    241,073    532,982    939,007
MSB-first CS     84,948    178,406    306,009    663,954    1,158,672
Area Overhead    33.8%     29.3%      26.9%      24.6%      23.3%
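The overhead row of Table II can be checked with a short calculation. The sketch below assumes that "Area Overhead" means the relative area increase of the MSB-first CS layout over the LSB-first CS layout; that definition is an assumption, as the report does not state the formula. Under it, the computed values agree with the tabulated ones to within about 0.1 percentage point.

```python
# Layout areas in square micrometres, copied from Table II.
AREA_LSB_FIRST_CS = {8: 63_508, 12: 138_040, 16: 241_073, 24: 532_982, 32: 939_007}
AREA_MSB_FIRST_CS = {8: 84_948, 12: 178_406, 16: 306_009, 24: 663_954, 32: 1_158_672}
# Overhead percentages as printed in Table II.
TABLE_OVERHEAD_PERCENT = {8: 33.8, 12: 29.3, 16: 26.9, 24: 24.6, 32: 23.3}


def area_overhead_percent(n):
    """Relative area increase of MSB-first CS over LSB-first CS.

    Assumed definition: 100 * (MSB area - LSB area) / LSB area.
    """
    base = AREA_LSB_FIRST_CS[n]
    return 100.0 * (AREA_MSB_FIRST_CS[n] - base) / base


for n in sorted(AREA_LSB_FIRST_CS):
    computed = area_overhead_percent(n)
    print(f"N = {n:2d}: computed {computed:.2f}%, table {TABLE_OVERHEAD_PERCENT[n]}%")
```

The overhead shrinks as N grows, which is consistent with the wiring overhead described in the area comparison.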

A. Low Power Quantization

As discussed earlier, non-dedicated DSP systems generally employ multipliers whose size is determined by the performance requirements of the most computationally expensive intended application. An application in a general DSP system with fixed resources may not require the full precision offered by the resource. In such a situation, the power dissipation of the computational unit can be significantly reduced by appropriate use of the resource. Such quantizations have been proposed in [8], [9] without considering supporting multiplier architectures. We further note that these results are also useful in formulating a strategy for employing variable word-length computing, in which different tasks of a DSP algorithm are computed with different precisions without significantly degrading the overall system performance. As evident from the results presented in previous sections, the following two conditions must be met: first, an appropriate multiplier architecture should be selected, and second, correct input conditions must be provided such that reduced switching activity is guaranteed. Quite clearly, it is not enough to ensure only one of these conditions. For example, if we truncate the LSB bits of the B input in an LSB-first multiplier, it will not help reduce switching activity. Further, it is also important to ensure that favorable signal conditions are maintained at the inputs consistently. For example, if successive A inputs in the LSB-first multiplier have a toggling $a_0$ bit, the reduction in switching activity will be entirely lost. Reduction in switching activity is possible only if the data stream applied at the A input of this multiplier ensures that successive samples have all L LSB bits turned off.
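The requirement that successive A samples keep all L LSB bits turned off can be illustrated with a small sketch. It assumes unsigned integer samples; the helper names `truncate_lsbs` and `lsb_toggles` and the sample values are hypothetical and not taken from the report. The sketch only counts bit flips on the low-order input lines; it does not model switching inside the multiplier cells.

```python
def truncate_lsbs(sample, num_lsbs):
    """Force the num_lsbs least significant bits of a non-negative integer to zero."""
    mask = (1 << num_lsbs) - 1
    return sample & ~mask


def lsb_toggles(samples, num_lsbs):
    """Count bit flips in the num_lsbs low-order positions between successive samples."""
    mask = (1 << num_lsbs) - 1
    return sum(
        bin((prev ^ cur) & mask).count("1")
        for prev, cur in zip(samples, samples[1:])
    )


raw_a = [5, 6, 11, 12]  # hypothetical A-input samples
trunc_a = [truncate_lsbs(x, 2) for x in raw_a]

print("truncated samples:", trunc_a)
print("low-bit toggles before truncation:", lsb_toggles(raw_a, 2))
print("low-bit toggles after truncation:", lsb_toggles(trunc_a, 2))
```

After truncation the two low-order bits never change between samples, which is the consistent input condition the text calls for.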

B. Reconfigurable Computing
The cellular array structure presented in section II is the most general template with which any array multiplier can be formed. In applications where reconfigurability is sought for the application at hand, one may use the underlying structure proposed in this paper to form any of the N! possible multiplier architectures. It is noted that reconfigurability desired specifically for reduction of switching activity may not achieve that goal because of the overheads involved. In general, these overheads reduce the speed of the application as well as increase the overhead power. However, for specific applications where the structure of the data stream is well known, a reconfigurable multiplier may be employed which eliminates the undesired rows of the multiplier (to form an appropriate hybrid multiplier) in order to increase the speed of multiplication. In such a case, the interpretation of array multipliers presented in section II and the template described in figure 1 can prove to be extremely useful.
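The count of N! architectures corresponds to the N! orders in which the N rows of partial products can be added. The following sketch is an arithmetic illustration only, under the assumption that row i is the multiplicand B shifted by i and gated by bit $a_i$ of A. The function name `multiply_with_row_order` is hypothetical, and the code does not model carry-save cells, vector merging, or switching activity.

```python
from itertools import permutations
from math import factorial


def multiply_with_row_order(a, b, order):
    """Accumulate partial-product rows of a * b in the given row order.

    Row i contributes (b << i) when bit i of a is set (assumed row definition).
    """
    acc = 0
    for i in order:
        if (a >> i) & 1:
            acc += b << i
    return acc


N = 4
orders = list(permutations(range(N)))  # every ordering of the N rows
a, b = 0b1011, 0b0110  # hypothetical 4-bit operands

products = {multiply_with_row_order(a, b, order) for order in orders}
print("number of row orderings:", len(orders))
print("distinct products over all orderings:", products)
```

Every ordering yields the same product, so the choice among the N! orderings can be made purely on switching, speed, or area grounds. The ordering `(0, 1, ..., N-1)` corresponds to the LSB-first case.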

C. High Level Synthesis Based on Knowledge of Signal Characteristics

The results presented in this paper clearly indicate that each array multiplier offers advantages for specific signal conditions. Further, large improvements are possible in reduction of switching activity by appropriate choice of multipliers. Hence, maximum reduction in switching activity can be achieved by scheduling and allocating operations such that favorable input conditions are ensured at the inputs of the multipliers employed in the implementation. Existing high-level synthesis tools can therefore be improved such that they consider the expected signal behavior at various points of the algorithm while arriving at an implementation. Note that the condition of ensuring favorable signal conditions at the multiplier inputs also reduces bus power, since these conditions must be met consistently between successive data samples. This work shows that an appropriate choice of array multiplier assures that a reduction in switching activity in the input bus to the multiplier is reflected as reduced switching activity in the multiplier. Hence, one can reduce the power dissipation in a data-path by careful scheduling and allocation of instructions based on the expected statistical properties of the data being processed.
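As a rough illustration of allocation driven by signal characteristics, the sketch below picks which of two operand streams to route to the A input by checking how many low-order bits are zero in every sample. This is a hypothetical heuristic written for illustration, not a procedure from the report. The names `common_zero_lsbs` and `pick_a_input`, the stream values, and the 8-bit width are all assumptions.

```python
def common_zero_lsbs(samples, width):
    """Number of low-order bit positions that are zero in every sample."""
    combined = 0
    for s in samples:
        combined |= s  # a bit is set here if it is set in any sample
    count = 0
    while count < width and not (combined >> count) & 1:
        count += 1
    return count


def pick_a_input(stream_x, stream_y, width):
    """Return the label of the stream with more consistently zero LSBs, and that count."""
    zx = common_zero_lsbs(stream_x, width)
    zy = common_zero_lsbs(stream_y, width)
    return ("x", zx) if zx >= zy else ("y", zy)


stream_x = [8, 12, 4]  # hypothetical samples with two LSBs always zero
stream_y = [3, 5, 6]   # hypothetical samples with active LSBs

print(pick_a_input(stream_x, stream_y, width=8))
```

A real synthesis flow would work from expected signal statistics rather than a fixed sample list, and would also weigh the overheads of each multiplier type.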

We presented several new array multiplier architectures for reducing switching activity in general digital signal processing applications. A general cellular structure was presented which can be used to obtain any array multiplier suitable for the given application. This structure provides a unified view of all N! possible N-bit array multipliers. The switching activity at the output nodes of the cells in various multiplier structures was analyzed and compared with a tree multiplier based on 4:2 compressors as well as an LSB-first CS array multiplier. It was shown that the relative improvement in power is a function of the statistical properties of the input signals. It was also shown that selection of an appropriate array architecture can give up to 40% reduction in switching activity compared to a tree multiplier, and more than 3 times reduction in switching activity compared to the widely used LSB-first array multiplier for commonly occurring situations. We also outlined applications of the proposed multipliers and the presented results to the areas of low power quantization, reconfigurable computing and high-level synthesis for low power. Hence, the proposed architectures can prove to be extremely useful structures for low power DSP system design.

[1] E. E. Swartzlander, "Computer Arithmetic," IEEE Computer Society Press, 1990.

[2] S. Haykin, "Adaptive Filter Theory," Prentice Hall, NJ, 1996.

[3] J. G. Proakis and D. G. Manolakis, "Digital Signal Processing: Principles, Algorithms, and Applications," McMillan Publishing Company, New York, 1992.

[4] J. M. Rabaey, "Digital Integrated Circuits: A Design Perspective," Prentice Hall, New Jersey, 1996.

[5] N. H. E. Weste and K. Eshraghian, "Principles of CMOS VLSI Design: A Systems Perspective," 2nd Edition, Addison Wesley, 1994.

[6] S. E. McQuillan and J. V. McCanny, "A Systematic Methodology for the Design of High Performance Recursive Digital Filters," IEEE Trans. on Computers, Vol. 44, No. 8, pp. 971-982, Aug. 1995.

[7] J. K. Jain, L. Song and K. K. Parhi, "Efficient Semisystolic Architectures for Finite-Field Arithmetic," IEEE Trans. VLSI Systems, Vol. 6, No. 1, pp. 101-113, Mar. 1998.

[8] K. Muhammad and K. Roy, "On Complexity Reduction of FIR Digital Filters Using Constrained Least Squares Solution," In Proc. of 1997 IEEE International Conference on Computer Design (ICCD '97), pp. 196-201, Austin, Texas.

[9] K. Muhammad and K. Roy, "Low Power Digital Filters Based On Constrained Least Squares Solution," In Proc. of the 31st Asilomar Conference on Signals, Systems and Computers, 1997, Monterey, California. Invited Paper.

[10] M. Lundberg, K. Muhammad, K. Roy and S. K. Wilson, "High-level Modeling of Switching Activity With Application to Low-power DSP System Synthesis," To appear in the 1999 Proc. IEEE International Conference On Acoustics, Speech, and Signal Processing (ICASSP'99).

[11] T. H. Cormen, C. E. Leiserson and R. L. Rivest, "Introduction to Algorithms," The MIT Press, 1990.


TR-ECE 99-4 MARCH 1999
