
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 2, NO. 4, DECEMBER 1994

A Survey of Power Estimation Techniques in VLSI Circuits

Farid N. Najm, Member, IEEE

(Invited Paper)

Manuscript received August 31, 1994. The author is with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA. IEEE Log Number 9406369.

Abstract

With the advent of portable and high-density microelectronic devices, the power dissipation of very large scale integrated (VLSI) circuits is becoming a critical concern. Accurate and efficient power estimation during the design phase is required in order to meet the power specifications without a costly redesign process. In this paper, we present a review/tutorial of the power estimation techniques that have recently been proposed.

I. INTRODUCTION

THE CONTINUING DECREASE in feature size and the corresponding increase in chip density and operating frequency have made power consumption a major concern in VLSI design [1], [2]. Modern microprocessors are indeed hot: the Power PC chip from Motorola consumes 8.5 W, the Pentium chip from Intel consumes 16 W, and DEC's alpha chip consumes 30 W. Excessive power dissipation in integrated circuits not only discourages their use in a portable environment, but also causes overheating, which degrades performance and reduces chip life-time. To control their temperature levels, high power chips require specialized and costly packaging and heat-sink arrangements. This, combined with the recently growing demand for low-power portable communications and computing systems, has created a need to limit the power consumption in many chip designs. Indeed, the Semiconductor Industry Association has identified low-power design techniques as a critical technological need [3].

Managing the power of an IC design adds to a growing list of problems that IC designers and design managers have to contend with. Computer Aided Design (CAD) tools are needed to help with the power management tasks. Specifically, there is a need for CAD tools to estimate power dissipation during the design phase in order to meet the power specifications without a costly redesign process.

In CMOS and BiCMOS technologies, the chip components (gates, cells) draw power supply current only during a logic transition (if we ignore the small leakage current). While this is considered an attractive low-power feature of these technologies, it makes the power dissipation highly dependent on the switching activity inside these circuits. Simply put, a more active circuit will consume more power. This complicates the power estimation problem because the power becomes a moving target: it is input pattern-dependent.

Thus the simple and straight-forward solution of estimating power by using a simulator is severely complicated by this pattern-dependence problem. Input signals are generally unknown during the design phase because they depend on the system (or chip) in which the chip (or functional block) will eventually be used. Furthermore, it is practically impossible to estimate the power by simulating the circuit for all possible inputs. Recently, several techniques have been proposed to overcome this problem by using probabilities to describe the set of all possible logic signals, and then studying the power resulting from the collective influence of all these signals. This formulation achieves a certain degree of pattern-independence that allows one to efficiently estimate and manipulate the power dissipation.

The rest of this paper is organized as follows. In the next section, we explain the power estimation problem in more detail and introduce a number of probabilistic measures that have been used to estimate power. Section III contains a literature survey of power estimation techniques and the following two sections focus in more detail on the two main types of approaches. We discuss sequential circuit power estimation in Section VI, and provide a summary and conclusions in Section VII.

II. DETAILED PROBLEM DESCRIPTION

By power estimation we generally refer to the problem of estimating the average power dissipation of a digital circuit. This is different from estimating the worst case instantaneous power, often referred to as the voltage drop problem [4]-[6]. Chip heating and temperature are directly related to the average power.

We have already alluded to a most straight-forward method of power estimation, namely by simulation: perform a circuit simulation of the design and monitor the power supply current waveform. Subsequently, the average of the current waveform is computed and used to provide the average power. The advantages of this technique are mainly its accuracy and generality. It can be used to estimate the power of any circuit, regardless of technology, design style, functionality, architecture, etc. The simulation results, however, are directly related to the specific input signals used to drive the simulator. Furthermore, complete and specific information about the

input signals is required, in the form of voltage waveforms. Hence we describe these simulation-based techniques as being strongly pattern-dependent.

The pattern-dependence problem is serious. Often, the power of a functional block needs to be estimated when the rest of the chip has not yet been designed, or even completely specified. In such a case, very little may be known about the inputs to this functional block, and complete and specific information about its inputs would be impossible to obtain. Even if one is willing to guess at specific input waveforms, it may be impossible to assess if such inputs are typical. Large numbers of input patterns would have to be simulated, and this can become computationally very expensive, practically impossible for large circuits.

Most other (more efficient) power estimation techniques that have been proposed start out by simplifying the problem in three ways. First, it is assumed that the power supply and ground voltage levels throughout the chip are fixed, so that it becomes simpler to compute the power by estimating the current drawn by every sub-circuit assuming a given fixed power supply voltage. Second, it is assumed that the circuit is built of logic gates and latches, and has the popular and well-structured design style of a synchronous sequential circuit, as shown in Fig. 1. In other words, it consists of latches driven by a common clock and combinational logic blocks whose inputs (outputs) are latch outputs (inputs). It is also assumed that the latches are edge-triggered and, with the use of a CMOS or BiCMOS design technology, the circuit draws no steady-state supply current. Therefore, the average power dissipation of the circuit can be broken down into 1) the power consumed by the latches and 2) that consumed by the combinational logic blocks. This provides a convenient way to decouple the problem and simplify the analysis. And, finally, it is commonly accepted that, in accordance with the results of [7], it is enough to consider only the charging/discharging current drawn by a logic gate, so that the short-circuit current during switching is neglected.

Fig. 1. A combinational circuit embedded in a synchronous sequential design.

Whenever the clock triggers the latches, some of them will make a transition and will draw power. Thus latch power is drawn in synchrony with the clock. The same is not true for gates inside the combinational logic. Even though the inputs to a combinational logic block are updated by the latches (in synchrony with the clock), the internal gates of the block may make several transitions before settling to their steady state values for that clock period. These additional transitions have been called hazards or glitches. Although unplanned for by the designer, they are not necessarily design errors. Only in the context of low-power design do they become a nuisance, because of the additional power that they dissipate. It has been observed [8] that this additional power dissipation is typically 20% of the total power, but can be as high as 70% of the total power in some cases such as combinational adders. We have observed that in a 16-bit multiplier circuit, some nodes make as many as 20 transitions before reaching steady state. This component of the power dissipation is computationally expensive to estimate, because it depends on the timing relationships between signals inside the circuit. Consequently, many proposed power estimation techniques have ignored this issue. We will refer to this elusive component of power as the toggle power. Computing the toggle power is one main challenge in power estimation.

Another challenge has to do with independence when signals are represented with probabilities. The reason for introducing probabilities is to solve the pattern-dependence problem, as follows. Instead of simulating the circuit for a large number of patterns and then averaging the result, one can simply compute (from the input pattern set, for instance) the fraction of cycles in which an input signal makes a transition (a probability measure) and use that information to estimate (somehow) how often internal nodes transition and, consequently, the power drawn by the circuit. Conceptually, this idea is shown in Fig. 2, which depicts both the conventional path of using circuit simulation and the alternative path of using probabilities. In a sense, one performs the averaging before, instead of after, running the analysis. Thus, a single run of a probabilistic analysis tool replaces a large number of circuit simulation runs, provided some loss of accuracy can be tolerated. The issues are exactly what probabilities are required, how they are to be obtained and, most importantly, what sort of analysis should be performed.

Fig. 2. An alternative flow for power estimation.

In practice, one can directly provide the required input probabilities, eliminating the need for a large set of specific input patterns. The results of the analysis will depend on the supplied probabilities. Thus, to some extent the process is still pattern-dependent and the user must supply information about the typical behavior at the circuit inputs, in terms of probabilities. However, since one is not required to provide

complete and specific information about the input signals, we call these approaches weakly pattern-dependent.

There are many ways of defining probability measures associated with the transitions made by a logic signal, be it at the primary inputs of the combinational block or at an internal node. We start with the following two:

Definition 1 (signal probability): The signal probability P_s(x) at a node x is defined as the average fraction of clock cycles in which the steady state value of x is a logic high.

Definition 2 (transition probability): The transition probability P_t(x) at a node x is defined as the average fraction of clock cycles in which the steady state value of x is different from its initial value.

The signal probability is a relatively old concept that was first introduced to study circuit testability [9]. It is important to note that both these probability measures are unaffected by the circuit internal delays. Indeed, they remain the same even if a zero-delay timing model is used. When this is done, however, the toggle power is automatically excluded from the analysis. This is a serious shortcoming of techniques that are based on these measures, as we will point out below.

If a zero-delay model is assumed and the transition probabilities are computed, then the power can be computed as:

$$P_{av} = \frac{V_{dd}^2}{2T_c} \sum_{i=1}^{n} C_i\,P_t(x_i) \qquad (1)$$

where T_c is the clock period, C_i is the total capacitance at node x_i, and n is the total number of circuit nodes that are outputs of logic gates or cells. Since this assumes at most a single transition per clock cycle, this is actually a lower bound on the true average power.

We can now discuss the signal independence issue. In practice, signals may be correlated so that, for instance, two of them may never be simultaneously high. It is computationally too expensive to compute these correlations, so that the circuit input and internal nodes are usually assumed to be independent. We refer to this as a spatial independence assumption. Another independence issue is whether the values of the same signal in two consecutive clock cycles are independent or not. If assumed independent, then the transition probability can be easily obtained from the signal probability according to:

$$P_t(x) = 2P_s(x)\,(1 - P_s(x)) \qquad (2)$$

We refer to this as a temporal independence assumption.
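As a concrete illustration of (1) and (2), the short sketch below computes a zero-delay power estimate from per-node signal probabilities and capacitances. The node list, supply voltage, and clock period are hypothetical values chosen only for the example.

# Sketch: zero-delay average power from signal probabilities, per (1) and (2).
# All node data below are hypothetical illustration values.

VDD = 3.3        # supply voltage (V), assumed fixed
T_CLK = 20e-9    # clock period Tc (s)

# (node name, signal probability Ps, total node capacitance in farads)
nodes = [
    ("n1", 0.5, 50e-15),
    ("n2", 0.25, 80e-15),
    ("n3", 0.9, 30e-15),
]

def transition_probability(ps):
    """Eq. (2): Pt = 2 Ps (1 - Ps), valid under temporal independence."""
    return 2.0 * ps * (1.0 - ps)

def average_power(nodes, vdd, t_clk):
    """Eq. (1): Pav = (Vdd^2 / 2Tc) * sum of Ci * Pt(xi)."""
    return (vdd ** 2) / (2.0 * t_clk) * sum(
        c * transition_probability(ps) for _, ps, c in nodes)

print(f"Pav = {average_power(nodes, VDD, T_CLK) * 1e6:.2f} uW")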
Other recent power measures are based on the transition density formulation [10], [25]. The transition density at node x is the average number of transitions per second at node x, denoted D(x). Formally:

Definition 3 (transition density): If a logic signal x(t) makes n_x(T) transitions in a time interval of length T, then the transition density of x(t) is defined as:

$$D(x) = \lim_{T \to \infty} \frac{n_x(T)}{T} \qquad (3)$$

The density provides an effective measure of switching activity in logic circuits. If the density at every circuit node is made available, the overall average power dissipation in the circuit can be computed as:

$$P_{av} = \frac{V_{dd}^2}{2} \sum_{i=1}^{n} C_i\,D(x_i) \qquad (4)$$

In a synchronous circuit, with a clock period T_c, the relationship between transition density and transition probability is:

$$D(x) \ge \frac{P_t(x)}{T_c} \qquad (5)$$

where equality occurs in the zero-delay case. Thus the transition probability gives a lower bound on the transition density.

Let P(x) denote the equilibrium probability [25] of a logic signal x(t), defined as the average fraction of time that the signal is high. Formally:

Definition 4 (equilibrium probability): If x(t) is a logic signal (switching between 0 and 1), then its equilibrium probability is defined as:

$$P(x) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{+T/2} x(t)\,dt \qquad (6)$$

In contrast to the signal probability, the equilibrium probability depends on the circuit internal delays since it describes the signal behavior over time and not only its steady state behavior per clock cycle. In the zero-delay case, the equilibrium probability reduces to the signal probability.
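Definitions 3 and 4 can both be evaluated directly on a recorded logic waveform over a finite window. The sketch below does this for a made-up event list; it is only a finite-window approximation of the limits in (3) and (6).

# Sketch: measuring transition density D(x) and equilibrium probability P(x)
# over a finite window [0, T], per Definitions 3 and 4. Example data only.

def density_and_probability(events, T):
    """events: list of (time, new_value) pairs, sorted by time, with the
    initial value given as an event at time 0. Returns (D, P) over [0, T]."""
    transitions = len(events) - 1          # the t=0 entry only sets the initial value
    high_time = 0.0
    for (t0, v0), (t1, _) in zip(events, events[1:]):
        if v0 == 1:
            high_time += t1 - t0
    if events[-1][1] == 1:                 # tail segment up to T
        high_time += T - events[-1][0]
    return transitions / T, high_time / T

# x(t): starts low, rises at 2 ns, falls at 7 ns, rises at 8 ns; window T = 10 ns
events = [(0.0, 0), (2e-9, 1), (7e-9, 0), (8e-9, 1)]
D, P = density_and_probability(events, 10e-9)
print(f"D(x) = {D:.3g} transitions/s, P(x) = {P:.2f}")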
In the remainder of this paper, we will make use of the probability measures defined above in discussing the various recently proposed power estimation techniques.

III. BRIEF OVERVIEW

The earliest proposed techniques of estimating power dissipation were strongly pattern-dependent circuit simulation based [11], [12]. One would simulate the circuit while monitoring the supply voltage and current waveforms, which are subsequently used to compute the average power. Besides being strongly pattern dependent, these techniques are too slow to be used on large circuits, for which high power dissipation is a problem.

In order to improve computational efficiency, other simulation based techniques were also proposed using various kinds of timing, switch-level, and logic simulation [13]-[18]. These techniques generally assume that the power supply and ground voltages are fixed, and only the supply current waveform is estimated. While they are indeed more efficient than traditional circuit simulation, at the cost of some loss in accuracy, they remain strongly pattern-dependent.

In order to overcome the shortcomings of simulation-based techniques, other specialized approaches have been proposed with a focus on combinational digital CMOS circuits embedded
in a synchronous design environment, as described above. For the rest of this section, therefore, we will be concerned with the power consumed in a combinational circuit whose inputs switch in synchrony.

The use of probabilities to estimate power was first proposed in [19]. In this work, a zero-delay model was assumed and a temporal independence assumption was made so that the transition probabilities could be estimated using signal probabilities based on (2). Signal probabilities supplied by the user at the primary inputs are propagated into the circuit assuming spatial independence and the power was computed based on (1). Since a zero-delay model was used, the toggle power was ignored.

A probabilistic power estimation approach that does compute the toggle power and does not make the zero-delay or temporal independence assumptions, called probabilistic simulation, was proposed in [20]-[22]. In this technique, the use of probabilities was expanded to allow the specification of probability waveforms, as described in more detail in the next section. This approach assumed spatial independence, and was not restricted to only synchronous circuits. Improvements on this technique were proposed in [23] and [24], where the accuracy and the correlation handling were improved upon.

Another probabilistic approach was proposed in [25]-[27], where the transition density measure of circuit activity was introduced. An algorithm was also presented for propagating the transition density into the circuit. This approach does not make a zero-delay assumption and makes only the spatial independence assumption, as will be discussed in more detail in Section IV. Nevertheless, the result of this independence assumption is to make the computed density values insensitive to the internal circuit delays.
Yet another probabilistic approach was presented in [28], where Binary Decision Diagrams (BDDs) [35] were used to take into account internal node correlations and toggle power, at the cost of increased computation. This approach can become computationally expensive, especially for circuits where toggle power is dominant. It will be reviewed in more detail in Section IV.

We refer to the above approaches as probabilistic because probabilistic information is directly propagated into the circuit. To perform this, special models for circuit blocks (gates) must be developed and stored in the cell library. In contrast, other techniques, which we will refer to as statistical, do not require specialized circuit models. Instead, they use traditional simulation models and simulate the circuit for a limited number of randomly generated input vectors while monitoring the power. These vectors are generated from user-specified probabilistic information about the circuit inputs. Using statistical estimation techniques, one can determine when to stop the simulation in order to obtain a certain specified error bound. Details of these techniques can be found in [29]-[32], and will be summarized below.

All of the above probabilistic and statistical techniques are applicable only to combinational circuits. They require the user to specify information on the activity at the latch outputs. Power estimation in sequential circuits will be discussed in Section VI.

IV. PROBABILISTIC TECHNIQUES

Recently, several power estimation approaches have been proposed that use probabilities in order to solve the pattern-dependence problem. In practice, all are applicable only to combinational circuits and require the user to specify typical behavior at the combinational circuit inputs. We will compare and contrast these techniques based on the six criteria of 1) whether they include the toggle power, 2) whether they handle temporal correlation, 3) the complexity of the required input specification, 4) whether they provide the power consumed by individual gates, 5) whether they handle spatial correlation, and 6) speed. We will discuss five different approaches, for which the comparisons are shown in Table I.

TABLE I. PROBABILISTIC TECHNIQUES

These techniques all use simplified delay models for the circuit components and require user-supplied information about typical input behavior. Thus, their accuracy is limited by the quality of the delay models and the input specification. Nevertheless, some are more accurate than others, and this may be gauged by looking at criteria 1), 2), and 5) in the table.

A. Using Signal Probability

In [19], a zero-delay model is used and temporal as well as spatial independence is assumed. The user is expected to provide signal probabilities at the primary inputs. These are then propagated into the circuit to provide the probabilities at every node. In the paper, the propagation of probabilities is performed at the switch-level, but this is not essential to the approach. The simplest way to propagate probabilities is to work with a gate-level description of the circuit. Thus if y = AND(x1, x2), then it follows from basic probability theory [34] that P_s(y) = P_s(x1)P_s(x2), provided x1 and x2 are (spatially) independent. Similarly, other simple expressions can be derived for other gate types. Once the signal probabilities are computed at every node in the circuit, the power is computed by making use of (1) and (2), based on the temporal independence assumption.
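A minimal version of this gate-level propagation is sketched below, assuming spatially independent inputs and a topologically ordered netlist; the two-gate circuit and the probability values are hypothetical.

# Sketch: propagating signal probabilities through a gate-level netlist,
# assuming spatially independent inputs. Netlist and values are hypothetical.

from functools import reduce

def p_and(ps):  # Ps of AND under independence: product of input probabilities
    return reduce(lambda a, b: a * b, ps)

def p_or(ps):   # Ps of OR: 1 - product of (1 - Ps)
    return 1.0 - reduce(lambda a, b: a * b, (1.0 - p for p in ps))

def p_not(ps):
    return 1.0 - ps[0]

GATES = {"AND": p_and, "OR": p_or, "NOT": p_not}

def propagate(inputs, netlist):
    """inputs: {name: Ps}; netlist: [(output, gate_type, [input names])]
    listed in topological order."""
    probs = dict(inputs)
    for out, gate, ins in netlist:
        probs[out] = GATES[gate]([probs[i] for i in ins])
    return probs

# y = AND(x1, x2), z = OR(y, x3)
probs = propagate({"x1": 0.5, "x2": 0.5, "x3": 0.1},
                  [("y", "AND", ["x1", "x2"]), ("z", "OR", ["y", "x3"])])
print(probs)  # Ps(y) = 0.25, Ps(z) = 1 - 0.75*0.9 = 0.325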
In general, if the circuit is built from Boolean components that are not part of a pre-defined gate library, the signal probability can be computed by using a BDD [35] to represent the Boolean functions, as proposed in [10] and [37]. As an example to illustrate the BDD representation, consider the Boolean function y = x1 x2 + x3, which can be represented by the BDD shown in Fig. 3. The Boolean variables x_i are ordered, and each level in the BDD corresponds to a single variable. Each level may contain one or more BDD nodes at which one can branch in one of two directions, depending on
the value of the relevant variable. For example, suppose that x1 = 1, x2 = 0, and x3 = 1. To evaluate y, we start at the top node and branch to the right since x1 = 1, then branch to the left since x2 = 0, and finally branch to the right since x3 = 1 to reach the terminal node "1". Thus the corresponding value of y is 1.

Fig. 3. Example BDD representation.

In general, let y = f(x1, ..., xn) be a Boolean function. If the inputs x_i are independent, then the signal probability of f can be obtained in linear time (in the size of its BDD representation), as follows. If f_{x1} = f(1, x2, ..., xn) and f_{x̄1} = f(0, x2, ..., xn) are the cofactors of f with respect to x1, then:

$$P(f) = P(x_1)\,P(f_{x_1}) + (1 - P(x_1))\,P(f_{\bar{x}_1}) \qquad (7)$$

This equation shows how the BDD can be used to evaluate P(y). The two nodes that are descendants of y in the BDD correspond to the cofactors of f. The probability of the cofactors can then be expressed in the same way, in terms of their descendants. Thus a depth-first traversal of the BDD, with a post-order evaluation of P(·) at every node, is all that is required. This can be implemented using the "scan" function of the BDD package [36].
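The recursion (7) can be illustrated without a full BDD package by expanding one variable at a time, as sketched below. An actual BDD evaluation would share subgraphs instead of enumerating cofactors, so this is only a sketch of the arithmetic in (7); the example function is the one of Fig. 3.

# Sketch: evaluating P(f) by the Shannon-cofactor recursion of Eq. (7),
# P(f) = P(x1) P(f_x1) + (1 - P(x1)) P(f_x1bar), assuming independent inputs.

def signal_probability(f, var_probs, assignment=()):
    """f: callable taking a tuple of 0/1 values; var_probs: list of P(xi).
    Recursion depth equals the number of variables; a BDD would share
    subgraphs, while this illustrative version enumerates cofactors."""
    i = len(assignment)
    if i == len(var_probs):
        return float(f(assignment))
    p = var_probs[i]
    hi = signal_probability(f, var_probs, assignment + (1,))
    lo = signal_probability(f, var_probs, assignment + (0,))
    return p * hi + (1.0 - p) * lo

# y = x1*x2 + x3, the example function of Fig. 3
f = lambda v: (v[0] and v[1]) or v[2]
print(signal_probability(f, [0.5, 0.5, 0.5]))  # prints 0.625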
B. Probabilistic Simulation (CREST)

This approach [20]-[22] requires the user to specify typical signal behavior at the circuit inputs using probability waveforms. A probability waveform is a sequence of values indicating the probability that the signal is high for certain time intervals, and the probability that it makes low-to-high transitions at specific time points. The transition times themselves are not random. This allows the computation of the average, as well as the variance, of the current waveforms drawn by the individual gates in the design in one simulation run. The average current waveforms can then be used to compute the average power dissipated in each gate and the total average power of the circuit.

Fig. 4. Example probability waveform.

An example of a probability waveform is shown in Fig. 4. In this example, the signal is high with probability 0.5, to begin with. It then transitions low-to-high with probability 0.2 at t1, to become high with probability 0.25 between t1 and t2, etc. At every transition time point, the signal may also make a high-to-low transition, the probabilities of which can be computed from the other probabilities specified in the waveform. Notice that, at t1, 0.2 ≠ (1.0 − 0.5) × 0.25, which illustrates that temporal independence is not assumed. Given such waveforms at the primary inputs, they are propagated into the circuit to compute the corresponding probability waveforms at all the nodes.

The propagation algorithm is very similar to event driven logic simulation with an assignable delay model. The only difference is that the simulation algorithm and simulation model for each gate deal with the probability of making a transition rather than the definite occurrence of a transition. The events are propagated one at a time, using an event queue based mechanism. Whenever an event occurs at the input to a gate, the gate makes a contribution to the overall average current that is being estimated, and generates an output event that is scheduled after some time delay. In the original implementation of CREST, a transistor level netlist was used to compute the average current pulse and delay of every gate. The same can be achieved using gate level models, provided they are pre-characterized to estimate the current pulse and delay.

C. Transition Density (DENSIM)

The average number of transitions per second at a node in the circuit has been called the transition density in [25]-[27], where an efficient algorithm is presented to propagate the density values from the inputs throughout the circuit. This was implemented in the program DENSIM, for which the required input specification is a pair of numbers for every input node, namely the equilibrium probability and transition density. In this case, both signal values and signal transition times are random.

To see how the propagation algorithm works, recall the concept of Boolean difference: if y is a Boolean function that depends on x, then the Boolean difference of y with respect to x is defined as:

$$\frac{\partial y}{\partial x} = y|_{x=1} \oplus y|_{x=0} \qquad (8)$$
where ⊕ denotes the exclusive-or operation. It was shown in [25] that, if the inputs x_i to a Boolean module are (spatially) independent, then the density of its output y is given by:

$$D(y) = \sum_{i=1}^{n} P\!\left(\frac{\partial y}{\partial x_i}\right) D(x_i) \qquad (9)$$

The simplicity of this expression allows very efficient CAD implementations. Given the probability and density values at the primary inputs of a logic circuit, a single pass over the circuit, using (9), gives the density at every node. In order to compute the Boolean difference probabilities, one must also propagate the equilibrium probabilities P(x) from the primary inputs throughout the circuit, using the same BDD algorithm for signal probability propagation described above.

As an example, consider the simple case of a 2-input logic AND gate: y = x1 x2. In this case, ∂y/∂x1 = x2 and ∂y/∂x2 = x1, so that:

$$D(y) = P(x_2)D(x_1) + P(x_1)D(x_2) \qquad (10)$$

In more complex cases, where f is a general Boolean function, Binary Decision Diagrams can be used [25] to compute the Boolean difference probabilities. Recently, specialized BDD-based techniques have been proposed to facilitate this [39].
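For simple gates the Boolean difference probabilities have closed forms, so one propagation step of (9) can be sketched directly; the input probability and density values below are hypothetical, and a tool like DENSIM would obtain P(∂y/∂x_i) from BDDs for general functions.

# Sketch: one density-propagation step per Eq. (9)/(10) for 2-input gates,
# assuming spatially independent inputs. Input values are hypothetical.

def and_gate(p1, d1, p2, d2):
    """AND: dy/dx1 = x2 and dy/dx2 = x1, so D(y) = P(x2)D(x1) + P(x1)D(x2);
    the equilibrium probability propagates as P(y) = P(x1)P(x2)."""
    return p1 * p2, p2 * d1 + p1 * d2

def or_gate(p1, d1, p2, d2):
    """OR: dy/dx1 = NOT(x2), so D(y) = (1-P(x2))D(x1) + (1-P(x1))D(x2)."""
    return p1 + p2 - p1 * p2, (1.0 - p2) * d1 + (1.0 - p1) * d2

# primary inputs: equilibrium probability P and density D (transitions/s)
P1, D1 = 0.5, 1e6
P2, D2 = 0.5, 2e6

Py, Dy = and_gate(P1, D1, P2, D2)
Pz, Dz = or_gate(Py, Dy, 0.2, 5e5)   # OR the AND output with a third input
print(f"P(y)={Py}, D(y)={Dy:.3g}; P(z)={Pz}, D(z)={Dz:.3g}")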
D. Using a BDD

The technique proposed in [28] attempts to handle both spatial and temporal correlations by using a BDD to represent the successive Boolean functions at every node in terms of the primary inputs, as follows. The circuit topology defines a Boolean function corresponding to every node that gives the steady state value of that node in terms of the primary inputs. The intermediate values that the node takes before reaching steady state are not represented by this function. Nevertheless, one can construct Boolean functions for them by making use of the circuit delay information, assuming the delay of every gate is a specified fixed constant.

Fig. 5. A simple test case circuit.

To illustrate, consider the circuit in Fig. 5, and let the values of x1 and x2 in two consecutive clock cycles be denoted x1(1), x1(2), and x2(1), x2(2). Assuming the AND gate and the inverter have comparable delays, a typical timing diagram is shown in Fig. 6, where it can be seen that node z may make two transitions before reaching steady state.
The intermediate and steady-state values of y and z can be expressed as follows:

$$y(1) = \bar{x}_1(1), \quad y(2) = \bar{x}_1(2) \qquad (11)$$

$$z(1) = \bar{x}_1(1)\,x_2(1), \quad z(2) = \bar{x}_1(1)\,x_2(2), \quad z(3) = \bar{x}_1(2)\,x_2(2) \qquad (12)$$

Fig. 6. Timing diagram.

In this way, one can express the intermediate values of every node in terms of the two sets of values at the primary inputs. If a BDD is built for these functions, then the intermediate state probabilities can be accurately computed. In order to compute the probabilities of internal transitions, one can use the BDD to construct the exclusive-OR function of two consecutive intermediate states. Thus, in the above example, the probability that the first transition of z occurs is P(z(1) ⊕ z(2)) and the probability that the second transition occurs is P(z(2) ⊕ z(3)). Once these XOR functions have been constructed, both of these probabilities can be computed from the BDD. The expected number of transitions at z in a clock cycle is, therefore, E[n_z(T_c)] = P(z(1) ⊕ z(2)) + P(z(2) ⊕ z(3)), and the transition density at z is D(z) = E[n_z(T_c)]/T_c.

Using a BDD to perform these tasks implicitly means that the BDD variables are assumed independent. In the above example, this means that x1(1), x1(2), x2(1), and x2(2) are independent. Thus, while some temporal correlation between z(1) and z(2) is taken care of (through the shared x̄1(1) term), no temporal correlation between y(1) and y(2) is possible. The reason is that temporal and spatial independence are effectively assumed at the primary inputs. Hence the qualification "internally" in Table I.

One disadvantage of this technique is that it is computationally expensive. Since the BDD is built for the whole circuit, there will be cases where the technique breaks down because the required BDD may be too big. As a result, this approach is limited to moderate sized circuits. The situation is actually potentially worse than this, because a BDD function must be built for every intermediate state and for their pairwise XOR functions. In cases where many intermediate transitions occur, even moderate sized circuits may be too big to handle. In absolute terms, and by way of comparison, the previous three techniques can run on circuits with a few thousand gates in a matter of seconds, while the one large circuit (with 2779 gates) reported for this approach takes over half an hour (on a DECstation 5900). Nevertheless, the technique has many desirable and interesting features.
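Returning to the example of (11) and (12), the XOR probabilities can be computed exactly by weighted enumeration over the four independent input variables, which is what the BDD evaluation accomplishes symbolically. The sketch below reproduces E[n_z(T_c)] for hypothetical input probabilities.

# Sketch: expected transitions per cycle at node z for the Fig. 5 example,
# E[nz(Tc)] = P(z1 xor z2) + P(z2 xor z3), by weighted enumeration over the
# independent variables x1(1), x1(2), x2(1), x2(2). Probabilities are made up.

from itertools import product

P = {"x1_1": 0.5, "x1_2": 0.5, "x2_1": 0.5, "x2_2": 0.5}

def expected_transitions(P):
    e = 0.0
    for bits in product((0, 1), repeat=4):
        x1_1, x1_2, x2_1, x2_2 = bits
        w = 1.0
        for name, b in zip(("x1_1", "x1_2", "x2_1", "x2_2"), bits):
            w *= P[name] if b else (1.0 - P[name])
        z1 = (1 - x1_1) & x2_1        # z(1) = x1(1)' x2(1), per (12)
        z2 = (1 - x1_1) & x2_2        # z(2) = x1(1)' x2(2)
        z3 = (1 - x1_2) & x2_2        # z(3) = x1(2)' x2(2)
        e += w * ((z1 ^ z2) + (z2 ^ z3))
    return e

Tc = 20e-9
E = expected_transitions(P)
print(f"E[nz] = {E}, D(z) = {E / Tc:.3g} transitions/s")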
E. Correlation Coefficients

Another probabilistic approach that is similar to probabilistic simulation was proposed in [24], whereby the correlation
coefficients between steady state signal values are used as approximations to the correlation coefficients between the intermediate signal values. This allows spatial correlation to be handled approximately, and is much more efficient than trying to estimate the dynamic correlations between intermediate states. The steady state correlations are estimated from the BDD by constructing the function for the AND of two signals. The reported results have good accuracy, but the technique does require building the BDD for the whole circuit, which may not always be feasible.

V. STATISTICAL TECHNIQUES

The idea behind these techniques is quite simple and appealing: simulate the circuit repeatedly, using some timing or logic simulator, while monitoring the power being consumed. Eventually, the power will converge to the average power, based on (3) and (4). The issues are how to select the input patterns to be applied in the simulations and how to decide when the measured power has converged close enough to the true average power. Normally, the inputs are randomly generated and statistical mean estimation techniques [38] are used to decide when to stop, essentially a Monte Carlo method. We will review the two main approaches that have been proposed, whose characteristics are compared in Table II.

TABLE II. STATISTICAL TECHNIQUES

A. Total Power (McPower)

This approach [29]-[31] uses Monte Carlo simulation to estimate the total average power of the circuit. It consists of applying randomly generated input patterns at the primary inputs and monitoring the energy dissipated per clock cycle using a simulator. If the successive input patterns are independently generated, a number N of such measurements is called a random sample whose average (divided by T_c) approaches the desired average power for large N. In order to stop the simulation when one is close enough to the average power, we need a so-called stopping criterion.

It was found experimentally [31] that the power consumed by a circuit over a period T has a distribution that is very close to normal. This allows one to use the following stopping criterion. Let p̄ and s be the measured average and standard deviation of the random sample of the power, measured over a period T. Then we have (1 − α) × 100% confidence that |p̄ − P_av| < t_{α/2} s/√N, where t_{α/2} is obtained from the t-distribution [38] with (N − 1) degrees of freedom. This result can be rewritten as:

$$\frac{|\bar{p} - P_{av}|}{\bar{p}} < \frac{t_{\alpha/2}\,s}{\bar{p}\sqrt{N}} \qquad (13)$$

Therefore, for a desired percentage error ε in the power estimate, and for a given confidence level (1 − α), we must simulate the circuit until:

$$\frac{t_{\alpha/2}\,s}{\bar{p}\sqrt{N}} < \varepsilon \qquad (14)$$

which means that the number of required simulations is:

$$N > \left(\frac{t_{\alpha/2}\,s}{\varepsilon\,\bar{p}}\right)^{2} \qquad (15)$$

In practice, this technique was found to be very efficient. Typically, as few as 10 vectors may be enough to estimate the power of a large circuit with thousands of gates. But perhaps the most useful feature of this technique is that the user can specify the required accuracy and confidence level up-front. Thus, it retains the accuracy of deterministic simulation-based approaches, while achieving speeds comparable to probabilistic techniques. It also does not require an independence assumption for internal nodes; it only requires the primary inputs to be independent. The approach can be extended to take into account the correlations between input nodes.

Perhaps the only disadvantage of this approach is that, while it provides an accurate estimate of the total power, it does not provide the power consumed by individual gates or small groups of gates. It would take many more transitions to estimate (with the same accuracy) the power of individual gates, because some gates may switch very infrequently. This point will be further clarified below.
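A skeleton of this Monte Carlo loop with stopping rule (14) is sketched below. The simulate_cycle_energy function is a hypothetical stand-in for one simulator run per random input vector, and t_{α/2} is taken from scipy's t-distribution.

# Sketch: Monte Carlo total-power estimation with the stopping criterion of
# (13)-(15). simulate_cycle_energy() is a stand-in for a real simulator run.

import random
import statistics
from scipy.stats import t as t_dist

T_CLK = 20e-9

def simulate_cycle_energy():
    """Hypothetical stand-in: energy (J) dissipated in one clock cycle for a
    random input vector; a real tool would run a logic/circuit simulation."""
    return random.gauss(1.0e-12, 0.2e-12)

def monte_carlo_power(eps=0.05, alpha=0.01, n_min=10, n_max=100000):
    samples = []
    while len(samples) < n_max:
        samples.append(simulate_cycle_energy() / T_CLK)   # power sample
        n = len(samples)
        if n < n_min:
            continue
        mean = statistics.fmean(samples)
        s = statistics.stdev(samples)
        t_half = t_dist.ppf(1.0 - alpha / 2.0, n - 1)
        if t_half * s / (mean * n ** 0.5) < eps:          # criterion (14)
            break
    return statistics.fmean(samples), len(samples)

p_av, n = monte_carlo_power()
print(f"Pav ~= {p_av * 1e6:.2f} uW after {n} samples")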
B. Power of Individual Gates (MED)

This recent technique [32] is a modification of the McPower approach that provides both the total and individual-gate power estimates, with user-specified accuracy and confidence. One reason why one may want to estimate the power consumed by individual gates is to be able to diagnose a high power problem, and find out which part of the circuit consumes the most power. Other reasons have to do with the fact that estimating gate power is essentially equivalent to estimating the transition density at every node. Indeed, the implementation of this technique in the program MED provides the transition density at every gate output node, in addition to the total power. These density values can then be used to estimate circuit reliability [25].

The main difference between this and the above approach is in the stopping criterion. Suppose we simulate the circuit for a time interval T, N times, and measure the number of transitions at a node every time, call this n_i. Then, according to the Central Limit Theorem [34], the average n̄ = Σ_{i=1}^{N} n_i/N has a distribution which is close to normal for large N. If η is the true expected number of transitions in T, and s is the measured standard deviation of the N values n_i, then it can be shown that with confidence (1 − α) × 100%:

$$\frac{|\bar{n} - \eta|}{\bar{n}} < \frac{z_{\alpha/2}\,s}{\bar{n}\sqrt{N}} \qquad (16)$$

provided N is larger than about 30 transitions, where z_{α/2} is obtained from the normal distribution [38]. The ratio n̄/T approaches the transition density D = η/T. Thus if a percentage error ε is tolerated in the density, then the number of required simulations is:

$$N > \left(\frac{z_{\alpha/2}\,s}{\varepsilon\,\bar{n}}\right)^{2} \qquad (17)$$

It should be clear from (17) that for small values of ε the number of samples required can become too large. It thus becomes too expensive to guarantee a percentage accuracy for low-density nodes. This is why the McPower approach cannot be used as is to measure node densities. The modification proposed in [32] is to use an absolute, rather than percentage, error bound for low-density nodes, as follows. A node is classified as a low-density node if it has n̄ < η_min, where η_min is user-specified. For these nodes, if we use the modified stopping criterion:

$$N > \left(\frac{z_{\alpha/2}\,s}{\varepsilon\,\eta_{min}}\right)^{2} \qquad (18)$$

then with (1 − α) confidence:

$$|\bar{n} - \eta| < \varepsilon\,\eta_{min} \qquad (19)$$

Thus ε η_min becomes an absolute error bound that characterizes the accuracy for low-density nodes. Although the percentage error for low-density nodes sharply increases as n̄ → 0, the absolute error remains relatively fixed. In fact, it can be shown that the absolute error bounds for low-density nodes are always less than the absolute error bounds for other nodes. Although these nodes require the longest time to converge, they have the least effect on circuit power and reliability. Therefore the above strategy reduces the execution time, with little or no penalty.

A weakness of this approach may be its speed (currently, a circuit with 16 000 gates requires about two hours on a Sun SPARC ELC). Further development may improve this performance.
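The per-node criterion can be grafted onto the same Monte Carlo loop: each node keeps its own sample mean and deviation, and a node whose mean count falls below the user-specified η_min switches to the absolute bound (18). The sketch below checks convergence for a single node; count_transitions is a hypothetical stand-in for counting transitions in one simulated interval.

# Sketch: MED-style stopping check for one node, per (17)-(19). A node with
# mean transition count below eta_min uses the absolute-error bound (18).

import random
import statistics
from scipy.stats import norm

def count_transitions():
    """Hypothetical stand-in: transitions at the node during one simulated
    interval of length T (a real tool would count them in a logic simulator)."""
    return random.randint(0, 4)

def node_converged(counts, eps=0.05, alpha=0.01, eta_min=1.0):
    n = len(counts)
    if n < 30:                      # normal approximation needs ~30 samples
        return False
    mean = statistics.fmean(counts)
    s = statistics.stdev(counts)
    z_half = norm.ppf(1.0 - alpha / 2.0)
    bound = max(mean, eta_min)      # percentage bound, or absolute for low density
    return z_half * s / (bound * n ** 0.5) < eps

counts = []
while not node_converged(counts):
    counts.append(count_transitions())
print(f"n = {len(counts)}, mean transitions = {statistics.fmean(counts):.2f}")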
VI. SEQUENTIAL CIRCUITS

The above techniques do not apply directly to sequential circuits. While the CREST approach can be used to simulate a circuit with feedback, the resulting loss of accuracy due to the independence assumption, especially when recursively applied in a feedback loop, renders the results somewhat suspect. As for [28], although the title includes "sequential circuits," it is assumed that all states are equally probable, which is not true in practice.

To simplify the discussion, we will assume that the sequential circuit implements a finite state machine (FSM) with a connected state space. Another simplifying assumption that has been made by most researchers is to say that the FSM is Markov [34] (so that its future is independent of its past once its present state is specified). If the signal and transition probabilities at the present state inputs of the FSM (i.e., the latch outputs) are known, then, with some approximation, any of the above combinational circuit techniques can be used to compute the power.

Several approaches [40]-[43] have been proposed for sequential circuits, all of which make use of the above Markov assumption. Some of these compute only the probabilities (signal and transition) at the latch outputs, while others also compute the power. The approach in [40] solves directly for the transition probabilities on the present state lines using the Chapman-Kolmogorov equations [33], [34], which is computationally too expensive. Another approach that also attempts a direct solution of the Chapman-Kolmogorov equations was given in [41]. While it is more efficient, it remains quite expensive, so that the largest test case presented contains less than 30 latches.

Better solutions are offered by two recent papers [42], [43], which are based on solving a nonlinear system that gives the present state line probabilities, as follows. Let a vector of input probabilities P_in = [p1, p2, ..., pn] be applied to the combinational logic block and let the n input signals be independent. At the outputs of the logic, let the corresponding output node probability vector be P_out. The mapping from P_in to P_out is some non-linear function that is determined by the Boolean function implemented by the logic. We denote this vector-valued function by F(·), so that P_out = F(P_in).

If we now assume that P_in = P is the vector of present state probabilities, then we should also have P_out = P, because the steady-state state line probabilities are constant. If we assume that the state lines are independent, this translates to P = F(P). The solution of this non-linear system gives the required state line probability vector P. It is solved using the Newton-Raphson method in [42], and using the Picard-Peano iteration method in [43].

Both techniques also try to correct for the state line independence assumption. In [42], this is done by accounting for m-wise correlations between state bits when computing their probabilities. This requires 2^m additional gates and can get very expensive. Nevertheless, they show good experimental results. The approach in [43] is to unroll the combinational logic block k times. This is less expensive than [42], and the authors observe that with k = 3 or so, good results can be obtained.
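Under the state-line independence assumption, the fixed point P = F(P) can be found by the Picard-Peano iteration of [43]: start from any probability vector and repeatedly apply the next-state mapping until it stops changing. The two-bit next-state logic below is a hypothetical example, with F propagating probabilities under the independence assumption.

# Sketch: Picard-Peano iteration for the state-line probability vector,
# solving P = F(P). The next-state logic here is a hypothetical 2-bit FSM:
#   s1' = s1 XOR s2,  s2' = NOT s1, with independent state bits assumed.

def F(p):
    p1, p2 = p
    p1_next = p1 * (1 - p2) + (1 - p1) * p2   # P(s1 xor s2) under independence
    p2_next = 1 - p1                           # P(not s1)
    return (p1_next, p2_next)

def picard_peano(F, p0, tol=1e-12, max_iter=10000):
    p = p0
    for _ in range(max_iter):
        q = F(p)
        if max(abs(a - b) for a, b in zip(p, q)) < tol:
            return q
        p = q
    raise RuntimeError("did not converge")

print(picard_peano(F, (0.2, 0.7)))  # converges to (0.5, 0.5) for this example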
VII. SUMMARY AND CONCLUSION

Power estimation tools are required to manage the power consumption of modern VLSI designs during the design phase, so as to avoid a costly redesign process. Since average power dissipation is directly related to the average switching activity inside a circuit, it would not make sense to expect to estimate power without some information about the circuit input patterns. Yet this is what one would like to do in order to qualify a chip with a certain power rating that is expected to hold irrespective of the application. We have presented a

number of power estimation techniques that are designed to alleviate this strong pattern-dependence problem.

The presented techniques are weakly pattern dependent, since the user is expected to supply some information on the typical behavior at the circuit inputs. This information is usually in the form of probability (average fraction of time that a signal is high) and density (average number of transitions per second). This information is usually much more readily available to designers than specific input patterns are. For instance, it is relatively easy for a designer to estimate average input switching frequencies, say by looking at test vector sets, or simply by assuming some nominal average switching frequency based on the known clock frequency. The proposed techniques are effective ways of using this information to find the circuit power.

All these techniques use simplified delay models, so that they do not provide the same accuracy as, say, circuit simulation. But they are fast, which is very important because VLSI designers are interested in the power dissipation of large designs. Within the limitations of the simplified delay models, some of these techniques, e.g., the statistical techniques, can be very accurate. In fact the desired accuracy can be specified up-front. The other class of techniques, i.e., the probabilistic techniques, are not as accurate but can be faster. Two of the proposed probabilistic techniques use BDDs and achieve very good accuracy, but they can be slow and may not be feasible for larger circuits.

From an implementation standpoint, one major difference between probabilistic and statistical techniques is that statistical techniques can be built around existing simulation tools and libraries, while probabilistic techniques cannot. Typically, probabilistic techniques require specialized simulation models. In general, it is not clear that any one approach is best in all cases, but we feel that the second statistical approach (MED) offers a good mix of accuracy, speed, and ease of implementation. It may be that a combination of the different techniques can be used for different circuit blocks. Tables I and II compare the different characteristics of these techniques.

REFERENCES

[1] R. W. Brodersen, A. Chandrakasan, and S. Sheng, "Technologies for personal communications," 1991 Symp. VLSI Circ., Tokyo, Japan, pp. 5-9, 1991.
[2] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circ., vol. 27, no. 4, pp. 473-484, April 1992.
[3] Workshop Working Group Reports, Semiconductor Industry Association, pp. 22-23, Nov. 17-19, 1992, Irving, TX.
[4] S. Chowdhury and J. S. Barkatullah, "Estimation of maximum currents in MOS IC logic circuits," IEEE Trans. Computer-Aided Design, vol. 9, no. 6, pp. 642-654, June 1990.
[5] S. Devadas, K. Keutzer, and J. White, "Estimation of power dissipation in CMOS combinational circuits using Boolean function manipulation," IEEE Trans. Computer-Aided Design, vol. 11, no. 3, pp. 373-383, March 1992.
[6] H. Kriplani, F. Najm, and I. Hajj, "Maximum current estimation in CMOS circuits," in 29th ACM/IEEE Design Automation Conf., Anaheim, CA, June 8-12, 1992, pp. 2-7.
[7] H. J. M. Veendrick, "Short-circuit dissipation of static CMOS circuitry and its impact on the design of buffer circuits," IEEE J. Solid-State Circ., vol. SC-19, no. 4, pp. 468-473, Aug. 1984.
[8] A. Shen, A. Ghosh, S. Devadas, and K. Keutzer, "On average power dissipation and random pattern testability of CMOS combinational logic networks," in IEEE/ACM Int. Conf. Computer-Aided Design, Santa Clara, CA, Nov. 8-12, 1992, pp. 402-407.
[9] K. P. Parker and E. J. McCluskey, "Probabilistic treatment of general combinational networks," IEEE Trans. Comput., vol. C-24, pp. 668-670, June 1975.
[10] F. Najm, "Transition density, a stochastic measure of activity in digital circuits," in 28th ACM/IEEE Design Automation Conf., San Francisco, CA, June 17-21, 1991, pp. 644-649.
[11] S. M. Kang, "Accurate simulation of power dissipation in VLSI circuits," IEEE J. Solid-State Circ., vol. SC-21, no. 5, pp. 889-891, Oct. 1986.
[12] G. Y. Yacoub and W. H. Ku, "An accurate simulation technique for short-circuit power dissipation based on current component isolation," IEEE Int. Symp. Circuits and Syst., pp. 1157-1161, 1989.
[13] A.-C. Deng, Y.-C. Shiau, and H.-H. Loh, "Time domain current waveform simulation of CMOS circuits," in IEEE Int. Conf. Computer-Aided Design, Santa Clara, CA, Nov. 7-10, 1988, pp. 208-211.
[14] R. Tjarnstrom, "Power dissipation estimate by switch level simulation," IEEE Int. Symp. Circuits and Syst., pp. 881-884, May 1989.
[15] U. Jagau, "SIMCURRENT: an efficient program for the estimation of the current flow of complex CMOS circuits," IEEE Int. Conf. Computer-Aided Design, Santa Clara, CA, Nov. 11-15, 1990, pp. 396-399.
[16] T. H. Krodel, "PowerPlay: fast dynamic power estimation based on logic simulation," IEEE Int. Conf. Computer Design, pp. 96-100, Oct. 1991.
[17] L. Benini, M. Favalli, P. Olivo, and B. Ricco, "A novel approach to cost-effective estimate of power dissipation in CMOS ICs," Europ. Design Automation Conf., pp. 354-360, 1993.
[18] F. Dresig, Ph. Lanches, O. Rettig, and U. G. Baitinger, "Simulation and reduction of CMOS power dissipation at logic level," European Design Automation Conf., pp. 341-346, 1993.
[19] M. A. Cirit, "Estimating dynamic power consumption of CMOS circuits," IEEE Int. Conf. Computer-Aided Design, pp. 534-537, Nov. 9-12, 1987.
[20] R. Burch, F. Najm, P. Yang, and D. Hocevar, "Pattern-independent current estimation for reliability analysis of CMOS circuits," in 25th ACM/IEEE Design Automation Conf., Anaheim, CA, June 12-15, 1988, pp. 294-299.
[21] F. Najm, R. Burch, P. Yang, and I. Hajj, "CREST: a current estimator for CMOS circuits," IEEE Int. Conf. Computer-Aided Design, Santa Clara, CA, Nov. 7-10, 1988, pp. 206-207.
[22] ---, "Probabilistic simulation for reliability analysis of CMOS VLSI circuits," IEEE Trans. Computer-Aided Design, vol. 9, no. 4, pp. 439-450, April 1990 (Errata in July 1990).
[23] G. I. Stamoulis and I. N. Hajj, "Improved techniques for probabilistic simulation including signal correlation effects," in 30th ACM/IEEE Design Automation Conf., 1993, pp. 379-383.
[24] C.-Y. Tsui, M. Pedram, and A. M. Despain, "Efficient estimation of dynamic power consumption under a real delay model," IEEE Int. Conf. Computer-Aided Design, Santa Clara, CA, Nov. 7-11, 1993, pp. 224-228.
[25] F. Najm, "Transition density: a new measure of activity in digital circuits," IEEE Trans. Computer-Aided Design, vol. 12, no. 2, pp. 310-323, Feb. 1993.
[26] ---, "Improved estimation of the switching activity for reliability prediction in VLSI circuits," IEEE Custom Integrated Circuits Conf., San Diego, CA, May 1-4, 1994, pp. 17.7.1-17.7.4.
[27] ---, "Low-pass filter for computing the transition density in digital circuits," IEEE Trans. Computer-Aided Design, vol. 13, no. 9, pp. 1123-1131, September 1994.
[28] A. Ghosh, S. Devadas, K. Keutzer, and J. White, "Estimation of average switching activity in combinational and sequential circuits," 29th ACM/IEEE Design Automation Conf., Anaheim, CA, June 8-12, 1992, pp. 253-259.
[29] C. M. Huizer, "Power dissipation analysis of CMOS VLSI circuits by means of switch-level simulation," in IEEE Europ. Solid-State Circ. Conf., Grenoble, France, 1990, pp. 61-64.
[30] R. Burch, F. Najm, P. Yang, and T. Trick, "McPOWER: a Monte Carlo approach to power estimation," IEEE/ACM Int. Conf. Computer-Aided Design, Santa Clara, CA, Nov. 8-12, 1992, pp. 90-97.
[31] ---, "A Monte Carlo approach for power estimation," IEEE Trans. VLSI Syst., vol. 1, no. 1, pp. 63-71, Mar. 1993.
[32] M. Xakellis and F. Najm, "Statistical estimation of the switching activity in digital circuits," in 31st ACM/IEEE Design Automation Conf., San Diego, CA, 1994, pp. 728-733.
[33] S. M. Ross, Stochastic Processes. New York: Wiley, 1983.
[34] A. Papoulis, Probability, Random Variables and Stochastic Processes, 2nd ed. New York: McGraw-Hill, 1984.
[35] R. E. Bryant, "Graph-based algorithms for Boolean function manipulation," IEEE Trans. Computer-Aided Design, pp. 677-691, Aug. 1986.

[36] K. S. Brace, R. L. Rudell, and R. E. Bryant, "Efficient implementation of a BDD package," in 27th ACM/IEEE Design Automation Conf., June 1990, pp. 40-45.
[37] S. Chakravarty, "On the complexity of using BDDs for the synthesis and analysis of Boolean circuits," in 27th Annual Allerton Conf. on Communication, Control, and Computing, Monticello, IL, Sept. 27-29, 1989, pp. 730-739.
[38] I. Miller and J. Freund, Probability and Statistics for Engineers, 3rd ed. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[39] B. Kapoor, "Improving the accuracy of circuit activity measurement," in 31st ACM/IEEE Design Automation Conf., San Diego, CA, June 6-10, 1994, pp. 734-739.
[40] A. A. Ismaeel and M. A. Breuer, "The probability of error detection in sequential circuits using random test vectors," J. Electronic Testing, vol. 1, pp. 245-256, Jan. 1991.
[41] G. D. Hachtel, E. Macii, A. Pardo, and F. Somenzi, "Probabilistic analysis of large finite state machines," in 31st ACM/IEEE Design Automation Conf., San Diego, CA, June 6-10, 1994, pp. 270-275.
[42] J. Monteiro and S. Devadas, "A methodology for efficient estimation of switching activity in sequential logic circuits," in ACM/IEEE 31st Design Automation Conf., San Diego, CA, June 6-10, 1994, pp. 12-17.
[43] C.-Y. Tsui, M. Pedram, and A. M. Despain, "Exact and approximate methods for calculating signal and transition probabilities in FSMs," in ACM/IEEE 31st Design Automation Conf., San Diego, CA, June 6-10, 1994, pp. 18-23.

Farid N. Najm (S'85-M'89) received the B.E. degree (with distinction) in electrical engineering from the American University of Beirut (AUB) in 1983, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 1986 and 1989, respectively.

He worked with the General Swedish Electric Company (ASEA) in Vasteras, Sweden, in 1982, and was a teaching assistant at AUB in 1983. He later worked as electronics engineer with AUB from 1983 to 1984 and held a visiting position with the University of Warwick, England, in 1984. While at the University of Illinois, 1985-1989, he was a research assistant with the Coordinated Science Laboratory, and worked for a year with the VLSI Design Laboratory at Texas Instruments Inc., Dallas, TX. In July 1989, he joined Texas Instruments as Member of Technical Staff with the Semiconductor Process and Design Center. In August 1992, he became an Assistant Professor with the Electrical and Computer Engineering Department at the University of Illinois at Urbana-Champaign. His research interests are in the general area of CAD tool development for VLSI circuits, including power estimation, reliability prediction, synthesis of low-power and reliable VLSI, timing analysis, test generation, and circuit and timing simulation.

Dr. Najm received the IEEE TRANSACTIONS ON CIRCUITS AND DEVICES Best Paper Award in 1992 and the NSF Research Initiation Award in 1993.
