A Fanout Optimization Algorithm Based On The Effort Delay Model

This paper presents the LEOPARD algorithm for fanout optimization in VLSI design, focusing on minimizing total buffer area while adhering to timing and capacitance constraints. The algorithm decomposes the problem into subproblems for each sink and merges the solutions, achieving significant reductions in buffer area compared to previous methods. The paper also discusses the delay model used and provides theoretical proofs for the optimization process.


IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 22, NO. 12, DECEMBER 2003 1671

A Fanout Optimization Algorithm Based on the Effort Delay Model

Peyman Rezvani and Massoud Pedram

Abstract: This paper presents a Logical Effort-based fanout OPtimizer for ARea and Delay (LEOPARD), which relies on the availability of a (near) continuous size buffer library. Based on the concept of logical effort in very large scale integrated circuits, the proposed algorithm attempts to minimize the total buffer area under the required time and input capacitance constraints by constructing the fanout tree topology and assigning the buffer sizes. More precisely, the proposed algorithm produces the optimum fanout tree solution if the fanout tree topology is restricted to a chain of buffers. For the case where a discrete size library of buffers is available, this paper also presents a postprocessing (buffer merging) step that transforms the continuous buffer-sizing solution to a discrete one while minimizing the round-off error. Experimental results show that compared with previous approaches, both for continuous and discrete buffer libraries, LEOPARD achieves a significant reduction in the total buffer area subject to the required time constraints.

Index Terms: Buffer insertion, fanout optimization, gate sizing, logic design, logical effort.

Manuscript received October 7, 2003; revised March 16, 2003. A preliminary version of this work appeared in Proc. Int. Conf. Computer-Aided Design, San Jose, CA, pp. 516–519, Nov. 1999. This paper was recommended by Associate Editor M. D. F. Wong.
The authors are with the Department of Electrical Engineering-Systems, University of Southern California, Los Angeles, CA 90089 (e-mail: peyman@usc.edu; [email protected]).
Digital Object Identifier 10.1109/TCAD.2003.819423

I. INTRODUCTION

Quite often in a very large scale integrated (VLSI) design, a signal needs to be distributed to several destinations under a required timing constraint at each destination. Furthermore, in practice, there may also be a limitation on the load that can be driven by the source signal. Fanout optimization is the problem of finding a buffer-tree topology and sizing the buffers in this topology so as to satisfy the constraints. Since these buffers must be picked from the sizes that are available in a given cell library, the more realistic problem is to find the optimum sizes for the buffers from the set of sizes available in the library. This problem has been proved to be NP-complete [1]. While several approaches exist for tackling the fanout optimization problem using simplified delay models [9], [10], new techniques [12] have also been proposed which use more accurate delay models or even take interconnect delay into account [11]. More recently, however, researchers [3] have started to use continuous, as opposed to discrete, size libraries, in the sense that the optimum fanout tree is calculated with the assumption that buffers are available in all sizes. This greatly simplifies the problem and allows the application of more powerful optimization techniques. At the same time, the number of discrete sizes for inverters in a typical application-specific integrated circuit (ASIC) library has increased to the extent that a "near-continuous inverter sizing" model has become a valid and fairly accurate model.

In [2], the authors simplified the fanout optimization problem by restricting the search space to a subset of trees and showed that the results still compare very favorably with the algorithms that consider a larger set of topologies. The authors used a dynamic programming approach to implicitly enumerate the set of so-called LT-trees and find the optimal LT-tree topology and sizing. An LT-tree is either a 2-level buffer type or a chain of buffers with intermediate fanouts to sinks that ends up to sinks or to a 2-level tree. Reference [3] also restricted the search space to a certain class of trees, called fanout-free trees, and showed that there still exists an optimal solution in this search space under a gain-based delay model. Fanout-free trees are trees in which a buffer can drive at most one other buffer.

In this paper, an algorithm is presented that finds the fanout tree topology and sizes of the buffers on the tree by decomposing the whole problem into subproblems and solving each subproblem separately for each sink. The solutions to the subproblems are then merged to form the solution to the whole problem. Our derivation relies on the notions of logical and electrical effort first proposed in [4].

Sutherland and Sproull [4] minimized the delay along any single path by assigning equal delay budgets to each stage on that path. While this approach was proven to minimize the delay, it did not necessarily result in an optimal solution in terms of the total buffer area. Kung [3], on the other hand, solved the fanout-optimization problem to minimize the input capacitance seen at the source gate subject to timing constraints for the sinks and without any consideration of the buffer area. In contrast, the approach presented in this paper minimizes the total buffer area subject to a capacitance constraint for the driver. This is an important distinction because it allows one to trade off the propagation delay through the source driver and through the rest of the buffer tree to reduce the total buffer area without too high of an increase in the overall delay.

The remainder of this paper is organized as follows. In Section II, the effort delay model that is used throughout this paper is explained. Section III explains the details of the algorithm. In Section IV, experimental results are shown, and in Section V, we conclude the paper.

II. DELAY MODEL

The delay model used in this paper is based on the concept of logical and electrical efforts presented in [4]. The effort-based model is basically a reformulation of the conventional RC model of CMOS gate delay. Using the same terminology as in [4], the delay of a gate is defined to be

$$d = \tau (p + g h) \tag{1}$$

where $\tau$ is a time unit that characterizes the semiconductor process being used. It is only used to convert the unitless part of $(p + gh)$ to a time unit. For simplicity, $\tau$ is not considered from now on. Parameter $p$ is the parasitic delay of the gate. The major contribution to the parasitic delay is the capacitance of the source/drain regions of the transistors that drive the output. Throughout this paper $p_{inv}$ is used as the parasitic delay for an inverter. Parameter $g$ is called the logical effort of the gate and depends only on the topology of the gate and the ability to produce output current. The logical effort for an inverter is assumed to be 1 and, for other gates, calculated based on their internal topologies. The logical effort of a logic gate tells how much worse it is at producing output current than is an inverter, given that each of its inputs may have only the same input capacitance as the inverter. Parameter $h$ (specified for each input pin of the gate) is called the electrical effort (also called gain) of the gate and is defined to be the ratio of the capacitive load driven by the gate to the input capacitance at the corresponding input pin. The electrical effort describes how the electrical environment of the logic gate affects performance and how the size of the transistors in the gate determines its load-driving capability.

The important point is that $p$ and $g$ are independent of the size of the gate, and the only factor that is affected by sizing is the electrical effort $h$. Reference [4] shows how $p$ and $g$ are independent of sizing by doing the reformulation to define the four factors $\tau$, $p$, $g$, and $h$ in terms of the resistance and capacitance of a minimum size inverter and a template gate representing the topology of the gate. For details, refer to [4].
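As a minimal sketch of the effort delay model in (1), the following Python snippet evaluates the delay of a single gate. The function names `electrical_effort` and `gate_delay` and the numeric values are illustrative assumptions rather than part of the original formulation; the delay is expressed in units of $\tau$.

```python
def electrical_effort(c_load, c_in):
    """Electrical effort h: ratio of the driven load to the gate's input capacitance."""
    return c_load / c_in


def gate_delay(p, g, h, tau=1.0):
    """Effort delay d = tau * (p + g * h), following (1)."""
    return tau * (p + g * h)


# Hypothetical inverter: g = 1, p_inv = 0.6, driving a load 4x its input capacitance.
h = electrical_effort(c_load=4.0, c_in=1.0)
d = gate_delay(p=0.6, g=1.0, h=h)
print(d)  # delay in units of tau
```

Because $p$ and $g$ do not change with sizing, resizing the gate only changes the `h` argument, which is the property the algorithm in Section III relies on.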

0278-0070/03$17.00 © 2003 IEEE

Fig. 1. Buffer chain.

III. ALGORITHM

In this section, the fanout optimization problem is stated as two separate problems, and each one is solved separately.

One-sink fanout optimization (1FO) problem: Given the source of a signal $Q$ with maximum driving capability $C_{in}$ and a sink $S$ with capacitive load $C_L$, required polarity $P$, and required arrival time $T_R$, find the optimum number of buffers for a buffer chain and the appropriate sizing for them to minimize the total buffer area such that the delay from $Q$ to $S$ is less than or equal to $T_R$, the required polarity $P$ is achieved, and the capacitive load imposed on $Q$ is no more than $C_{in}$.

Multiple-sink fanout optimization (mFO) problem: Given the source of a signal $Q$ with maximum driving capability $C_{in}$ along with a set of $m$ sinks $S_i$ each of which is assigned a triplet $(C_{L_i}, T_{R_i}, P_i)$ where $C_{L_i}$ is the capacitive load, $T_{R_i}$ is the required arrival time, and $P_i$ is the required polarity for the sink $S_i$, find a fanout tree of buffers and the appropriate sizing for them to minimize the total buffer area such that the timing constraint and the polarity required at each sink are satisfied and the capacitive load imposed on $Q$ is no more than $C_{in}$.

Note that the only difference between the two problems is the number of sinks to be driven. Area, the objective function in both of these problems, is considered to be the summation of input capacitances of all the buffers, which is reasonable with the assumption of continuous sizing for the gates.

The rest of this section is organized as follows. The 1FO problem is solved in Section III-A, and in Section III-B, the mFO problem is solved based on the solution derived for the 1FO problem.

A. Buffer Chain

For the 1FO problem, the solution is a chain of buffers between the source and the sink (Fig. 1). The variables of the problem are defined to be the number of buffers $n$ and the electrical efforts of these buffers $h_1, h_2, \ldots, h_n$.

Since the logical effort for an inverter is 1, the delay through the buffer chain can be expressed in terms of $n$ and the $h_i$ as follows:

$$\text{delay} = n p_{inv} + \sum_{i=1}^{n} h_i \tag{2}$$

The overall area, which is calculated as the summation of the input capacitances of all buffers on the buffer chain, may subsequently be expressed as

$$\text{area} = \sum_{i=1}^{n} C_i = \sum_{i=1}^{n} \frac{C_L}{\prod_{j=i}^{n} h_j} \tag{3}$$

The goal would be to find $n$ and all $h_i$ to minimize area while both timing and input capacitance constraints are satisfied. That is

$$\text{Min area} \quad \text{s.t.} \quad \text{delay} \le T_R, \quad C_1 \le C_{in}.$$

Theorem 1: In the 1FO problem, delay through the optimum buffer chain is exactly equal to the specified required time $T_R$, i.e., delay $= T_R$.

Proof: According to (3), area is a monotonically decreasing function of all $h_i$ ($i = 1, \ldots, n$). In other words, increasing any $h_i$ will always result in a buffer chain with smaller area. The delay, on the other hand, is an increasing function of all $h_i$ according to (2). This means that by increasing any arbitrary $h_i$, area can be decreased and delay can be increased up to the point that delay becomes no larger than the given constraint $T_R$; therefore, the optimum buffer chain has delay $= T_R$.

Lemma 1: In the 1FO problem, for a fixed number of buffers $n$ in the chain, the optimum buffer chain has $\sum h_i$ equal to a constant $T_R - n p_{inv}$.

Proof: According to Theorem 1 and (2)

$$n p_{inv} + \sum_{i=1}^{n} h_i = T_R.$$

The first term on the left hand side, $n p_{inv}$, is constant for a given $n$. Therefore, $\sum h_i$ for the optimum buffer chain with $n$ buffers is also constant and equal to

$$\sum_{i=1}^{n} h_i = T_R - n p_{inv}. \tag{4}$$

Hence, the claim is proved.

To find the optimum number of buffers $n$ the maximum input capacitance constraint $C_1 \le C_{in}$ is used, where $C_1$ is the input capacitance of the first buffer in the chain being driven by the source signal and $C_{in}$ is the given constraint on the input capacitance.

The input capacitance for the first buffer is computed as follows:

$$C_1 = \frac{C_L}{\prod_{i=1}^{n} h_i}. \tag{5}$$

Let the electrical effort of the chain be defined as the product of electrical efforts of all the buffers, and let it be shown by $H$. Using the above equation, the input capacitance constraint can be restated as follows:

$$H = \prod_{i=1}^{n} h_i = \frac{C_L}{C_1} \ge \frac{C_L}{C_{in}}. \tag{6}$$

Theorem 2: In the 1FO problem, for a fixed number of buffers $n$ in the chain, the electrical effort of the buffer chain $H$ achieves its maximum value when all $h_i$ are equal.

Proof: According to Lemma 1, the summation of all $h_i$ is constant for any given number of buffers. Since the product of some variables with a constant summation is maximum when all those variables are equal, all $h_i$ have to be equal to maximize $H$.

The electrical effort of each buffer for the buffer chain that maximizes $H$, according to Theorem 2 and (4), would then be

$$\hat{h}_i = \hat{h} = \frac{T_R - n p_{inv}}{n} \quad \forall i = 1, \ldots, n. \tag{7}$$

So, the maximum of $H$, named $H^*$, as a function of $n$ would be

$$H^* = \left( \frac{T_R - n p_{inv}}{n} \right)^{n}. \tag{8}$$

$H^*$ is drawn in Fig. 2 for $T_R = 14$ and $p_{inv} = 0.6$.

According to Theorem 2, there is a maximum value that $H$ can achieve for any given buffer count. Therefore, the only buffer counts that are feasible are those for which the maximum value that $H$ achieves is not less than the ratio $C_L/C_{in}$ (6) and those correspond to the buffer counts between the points of intersection of $H^*$ and line $C_L/C_{in}$ (Fig. 2). As an example, for Case I in Fig. 2, there is no feasible solution because there are no intersection points and $H^*$ lies below $C_L/C_{in}$ for all buffer counts. For Case III, on the other hand,
there are two points of intersection $\tilde{n}_1$ and $\tilde{n}_2$; therefore, the only feasible buffer counts are between $\tilde{n}_1$ and $\tilde{n}_2$.

Fig. 2. Plot of $H^* = \mathrm{Max}(\prod h_i)$ versus $n$.

Fig. 3. Algorithm OptN.

With these observations, algorithm OptN in Fig. 3 is proposed for finding the optimum number of buffers and their sizes.

To find the optimum number of buffers, the line $C_L/C_{in}$ is intersected with the graph $H^*$ (line 2 of Fig. 3 and Case III in Fig. 2) which results in $\tilde{n}_1$ and $\tilde{n}_2$. Note that

$$\lim_{n \to 0} H^* = 1. \tag{9}$$

Therefore, there always exists an $\tilde{n}_1$ unless the line $C_L/C_{in}$ is passing below unity, which means that $C_L$ is less than or equal to $C_{in}$, in which case, no buffers need to be used at all. On the other hand, there exists an upper bound on the number of buffers because of the intrinsic buffer delay. According to (4), for the electrical efforts of buffers to have a meaningful physical interpretation, $T_R - n p_{inv}$ has to be positive, which means (line 4 of Fig. 3)

$$n < \frac{T_R}{p_{inv}}. \tag{10}$$

In short, the buffer count is limited by $\tilde{n}_1$ on one side and by $\tilde{n}_2$ and $T_R/p_{inv}$ on the other side. Therefore, the optimum buffer count, $n$, lies between $n_1$ and $n_2$ (lines 3 and 4 of Fig. 3).

There is a possibility that the line $C_L/C_{in}$ could intersect the graph where there is no integer $n$ between the points of intersection to satisfy the polarity constraint. This only happens when the line crosses the $H^*$ curve very close to the peak of the graph (Case II in Fig. 2). In lines 5 and 6, the optimum sizing for the buffers on the chain is found by solving a convex optimization problem as follows:

$$\text{Min} \quad \frac{C_L}{h_1 h_2 \cdots h_n} + \frac{C_L}{h_2 \cdots h_n} + \cdots + \frac{C_L}{h_n}$$
$$\text{s.t.} \quad h_1 + \cdots + h_n \le T_R - n p_{inv}, \qquad h_1 \cdots h_n \ge \frac{C_L}{C_{in}}. \tag{11}$$

This is a minimization of a posynomial function with posynomial inequality constraints that can be easily solved in polynomial time [6]. Finally, among all of the solutions, the one with the minimum area is selected as the optimum solution.

It is interesting to note that by taking the derivative of $H^*$ and setting it equal to zero, its maximum value is found to be at

$$\hat{n} = T_R \times \alpha(p_{inv}) \tag{12}$$

where

$$\alpha(p_{inv}) = \frac{\mathrm{Lambert}(p_{inv}/e)}{p_{inv} \left[ \mathrm{Lambert}(p_{inv}/e) + 1 \right]}. \tag{13}$$

The function $\mathrm{Lambert}(\omega)$ is the solution to the nonlinear equation $x e^{x} = \omega$. For further information about the Lambert function, refer to [5]. As $p_{inv}$ tends toward zero

$$\lim_{p_{inv} \to 0} \alpha(p_{inv}) = \frac{1}{e} \tag{14}$$

and this corresponds to allocating the well-known electrical effort of $e$ to each buffer with the assumption of $p_{inv} = 0$.

Theorem 3: Algorithm OptN finds the optimum solution for the 1FO problem.

Proof: Since all of the feasible solutions are explicitly considered, the algorithm is guaranteed to find the optimum solution.

B. Buffer Tree

In this section, the more general case of the fanout-optimization problem is considered, where the source signal is driving more than one sink.

Reference [3] introduced two transformations that can be performed on a fanout tree, namely merging and splitting (Fig. 4). It is shown here that these transformations maintain the same area, delay, and capacitance.

Fig. 4. Split/merge transformations.

Theorem 4: The split/merge transformations applied to a fanout tree preserve the input capacitance (thus, area) and the delay.

Proof: The proof for split transformation is as follows. Suppose the electrical effort of the original buffer before splitting is $h$. Thus, the delay through the buffer for both of the branches is $h + p_{inv}$, and the input capacitance is $(C_1 + C_2)/h$, which is also the area of the buffer. After splitting the original buffer to two buffers with equal electrical efforts of $h$, the delay for both branches would still be $h + p_{inv}$ and the input capacitance would be $C_1/h + C_2/h$, thus, the same input capacitance and, hence, the same area. For merge transformation, one can easily verify the same provided that the electrical efforts of the buffers to be merged are equal.
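The feasibility argument of Section III-A can be sketched numerically. The Python snippet below evaluates the maximum chain effort of (8), lists the integer buffer counts that satisfy (6) under the bound (10), and locates the peak from (12) and (13). It is a sketch under stated assumptions, not the OptN algorithm of Fig. 3: the function names are hypothetical, the Lambert function is approximated with a hand-written Newton iteration, the polarity constraint is ignored, and the per-count sizing problem (11) is not solved.

```python
import math


def h_max(n, t_req, p_inv):
    """Largest chain effort for n buffers, as in (8): every buffer gets (T_R - n*p_inv)/n."""
    return ((t_req - n * p_inv) / n) ** n


def feasible_counts(c_load, c_in, t_req, p_inv):
    """Integer buffer counts whose maximum chain effort reaches C_L / C_in, as in (6)."""
    target = c_load / c_in
    counts = []
    n = 1
    while t_req - n * p_inv > 0:  # upper bound on n from (10)
        if h_max(n, t_req, p_inv) >= target:
            counts.append(n)
        n += 1
    return counts


def lambert_w0(x, iterations=50):
    """Principal branch of the Lambert function for x >= 0 (Newton iteration, assumed helper)."""
    w = math.log1p(x)  # starting guess
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w


def peak_count(t_req, p_inv):
    """Real-valued buffer count at which the maximum chain effort peaks, per (12) and (13)."""
    w = lambert_w0(p_inv / math.e)
    return t_req * w / (p_inv * (w + 1.0))


# Values used for Fig. 2 (T_R = 14, p_inv = 0.6) with an assumed ratio C_L / C_in = 50.
print(feasible_counts(c_load=50.0, c_in=1.0, t_req=14.0, p_inv=0.6))
print(peak_count(14.0, 0.6))
```

With these inputs only a few buffer counts around the peak remain feasible, which is the situation labeled Case III in Fig. 2; raising the assumed ratio far enough empties the list, which corresponds to Case I.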
Fig. 5. Input capacitance allocation for a fanout-free buffer tree.

Therefore, if $T^*$ is the optimal fanout tree with the proper sizing of buffers, it can be split to a fanout-free tree consisting of a set of buffer chains $T$, which has the same area as $T^*$, according to Theorem 4, and also satisfies the timing and input capacitance constraints (Fig. 5). First, $T$ will be found by using the optimal algorithm presented in Section III-A. The method used to transform $T$ into $T^*$ will be discussed later.

The 1FO problem was stated such that the maximum input capacitance allowed was given. Therefore, before the mFO problem can be broken down into 1FO problems, different portions of $C_{in}$ need to be allocated to each branch (Fig. 5).

Input capacitance allocation (ICA) problem: Given a number of sinks, each with a required time, capacitive load, and required polarity, and a total budget on input capacitance $C_{in}$, allocate portions of $C_{in}$ to each branch, such that the total area is minimized while the given constraints for all sinks are satisfied.

In this section, it is first proven that the ICA problem is NP-complete and then a heuristic is proposed for solving this problem.

Intuitively speaking, the input capacitance allocation problem is similar to the Knapsack problem, where objects of the Knapsack problem correspond to the capacitance budgets of each branch and the total capacitance is limited by the input capacitance constraint $C_{in}$, which corresponds to the Knapsack volume.

Before it can be formally proven that this problem is NP-complete, the behavior of area must be studied as a function of input capacitance for each branch. The valid range for the buffer count on branch $i$ is $[1, \lfloor T_R/p_{inv} \rfloor]$, according to (10). For each buffer count $n$ in this range, there exists a maximum electrical effort for the buffer chain, according to (8). Therefore, because of the capacitance constraint in (6), there exists a minimum required input capacitance as follows:

$$\underline{C}_i = \frac{C_L}{\left( \frac{T_R - n p_{inv}}{n} \right)^{n}} \tag{15}$$

where the denominator is the maximum value that can be achieved by $\prod h$, according to (8). On the other hand, there exists a maximum beneficial input capacitance, $\overline{C}_i$, for each buffer count, which means that allocating an input capacitance larger than $\overline{C}_i$ will not improve area any further. This value can be calculated using the same optimization problem as in (11) but with dropping the capacitance constraint. That is

$$\{h\} = \arg\min \ \text{area}_n \quad \text{s.t.} \quad \text{delay}_n \le T_R$$

and then calculating $\overline{C}_i$ as follows:

$$\overline{C}_i = \frac{C_L}{\prod h}.$$

Obviously, any input capacitance larger than $\overline{C}_i$ will not improve area any further because allocating $\overline{C}_i$ already results in the same solution as when the capacitance constraint is dropped.

Now that there exists a range for input capacitance for each buffer count, it can be proven that area is a decreasing function of input capacitance in this range.

Theorem 5: For a fixed number of buffers in a buffer chain, the area cost is a decreasing function of input capacitance for $\underline{C}_i \le C_{in} \le \overline{C}_i$.

Proof: Increasing input capacitance $C_{in}$ for a branch will decrease the ratio $C_L/C_{in}$ in the capacitive constraint of the optimization problem in (11). Therefore, there either exists a better solution with smaller area or, if not, the same solution with the same area is still achievable. Hence, increasing input capacitance will not increase area and, therefore, area is a decreasing function of input capacitance and the claim is proven.

Area versus input capacitance for some buffer count will, therefore, look something like the graph shown in Fig. 6(a). As shown in Fig. 6(a), no feasible solution exists for input capacitances smaller than $\underline{C}_i$ and the area stays the same for input capacitances larger than $\overline{C}_i$. Different buffer counts in the range $[1, \lfloor T_R/p_{inv} \rfloor]$ result in the graphs shown in Fig. 6(b). The minimum area over all buffer counts will, therefore, look like the graph shown in Fig. 6(c). This piecewise nature of area versus input capacitance, which is due to different buffer counts, causes the ICA problem to be NP-complete.

Theorem 6: ICA problem is NP-complete.

Proof: To perform the proof, the 0-1 Knapsack problem will be reduced to the ICA problem. In the conventional version of the Knapsack problem, each item has a size and a value and the objective is to maximize the total value. In the ICA problem, however, the objective is to minimize area. Therefore, we will consider the negative of area, rather than the area itself, so as to make the problem a maximization problem rather than a minimization one [Fig. 7(a)].

The value versus size curve for some item of the 0-1 Knapsack problem is shown in Fig. 7(b). The point about this graph is that it is not a continuous one. For sizes below $s_i$, the value is zero, and for sizes greater than $s_i$, the value is $v_i$. Assuming $\epsilon$ to be the accuracy of the machine, the graph can be modified to the one shown in Fig. 7(c) to make it a continuous one. Note that the graph may have any arbitrary behavior in the range between $s_i$ and $s_i + \epsilon$. This new graph is a special case of the graph shown in Fig. 7(a), in which the curve has become linear. Since the 0-1 Knapsack problem is NP-complete, the ICA problem is NP-hard as well, otherwise one could formulate the 0-1 Knapsack problem as an ICA problem and solve it in polynomial time. Note that the NP-hardness of ICA is because of the piecewise nature of the area versus input capacitance curve and that, in turn, is because area is represented by different functions for different buffer counts. Now that it has been proven that ICA is NP-hard, it must be shown that the decision version of ICA can be tested in polynomial time. This is obviously true because one can easily add up the input capacitances of each branch and compare it with the input capacitance budget $C_{in}$. This can be done in linear time, meaning ICA is in NP, and since it was proven that ICA is NP-hard, the ICA problem is NP-complete.

After proving that ICA is an NP-complete problem, this section proceeds by proposing a heuristic method for allocating input capacitances to each branch.

Let $m$ denote the number of sinks and, thus, the number of branches. Consider the $k$th branch ($1 \le k \le m$). $H^*_k$, the maximum of electrical effort of the $k$th branch, has its minimal value of 1 at $n_k = 0$ (the limit of $H^*$ as $n$ tends toward 0). On the other hand, $H^*_k$ cannot be any larger than $\beta(T_R, p_{inv})$, the value of $H^*_k(n_k)$ when $n_k$ is calculated from (12). According to (5), the maximum value of $H^*_k$ corresponds to the minimum value of $C_{in_k}$. Therefore, the minimum acceptable input capacitance would be

$$\underline{C}_k = \frac{C_{L_k}}{\beta(T_R, p_{inv})}. \tag{16}$$
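The lower bound in (15) is straightforward to evaluate for each buffer count. The Python sketch below does so and then picks the integer count that needs the least input capacitance. The helper names and sample values are assumptions made for illustration. Because the count is restricted to integers here, the smallest value it reports can be slightly larger than the bound in (16), which uses the real-valued count from (12).

```python
import math


def max_chain_effort(n, t_req, p_inv):
    """Maximum chain effort from (8): all n buffers share the effort (T_R - n*p_inv)/n."""
    return ((t_req - n * p_inv) / n) ** n


def min_required_cap(c_load, n, t_req, p_inv):
    """Smallest feasible input capacitance for an n-buffer chain, as in (15)."""
    return c_load / max_chain_effort(n, t_req, p_inv)


def cheapest_count(c_load, t_req, p_inv):
    """Integer buffer count needing the least input capacitance (hypothetical helper)."""
    n_max = math.floor(t_req / p_inv)
    counts = [n for n in range(1, n_max + 1) if t_req - n * p_inv > 0]
    return min(counts, key=lambda n: min_required_cap(c_load, n, t_req, p_inv))


# Assumed branch: C_L = 50, T_R = 14, p_inv = 0.6.
best_n = cheapest_count(50.0, 14.0, 0.6)
print(best_n, min_required_cap(50.0, best_n, 14.0, 0.6))
```

Any allocation below the printed capacitance makes the branch infeasible for every integer buffer count, which is why the heuristic that follows reserves at least the minimum for each branch before distributing the remaining budget.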
Fig. 6. (a) Area versus input cap for some buffer count n. (b) Area versus input cap for different buffer counts. (c) Minimum area versus input cap.

Fig. 7. (a) Area versus input cap. (b) Value versus size for an item of knapsack problem. (c) Modified value versus size graph.

Allocating any capacitance less than $\underline{C}_k$ to any branch will make that branch infeasible. Hence, $m$ new positive variables $x_k$ for $k = 1, \ldots, m$ are introduced such that

$$C_{in_k} = \underline{C}_k + x_k. \tag{17}$$

This way, one can be sure that the minimum required capacitance is allocated to each branch. The heuristic is to find the $x_k$ in such a way that their ratio is proportional to the positive slope of the $H^*$ graph in Fig. 2. The motivation behind this heuristic is the fact that for two different branches to have the same change in buffer count, the branch with smaller slope would need a smaller change in $C_L/C_{in}$. When a branch is given a wider range of buffer counts to explore, a better solution will likely be found. For an example, refer to Fig. 8. Branch 1 has a larger slope compared to branch 2; therefore, a larger change in $C_L/C_{in}$ for branch 1 is required to have the same buffer count range as branch 2. Since $C_L$ is given and fixed for each branch, changing $C_L/C_{in}$ corresponds to changing the $C_{in}$ allocated to that branch.

The proposed heuristic is shown in Fig. 9. Line 4 finds the $x_k$ such that the desired ratio between them, as discussed above, is fulfilled. The slope for each branch is estimated as follows:

$$\text{slope}_k = \frac{y_{max} - y_{min}}{x_{max} - x_{min}} = \frac{\beta(T_R, p_{inv}) - 1}{T_R \, \alpha(p_{inv}) - 0}. \tag{18}$$

After finding the allocated input capacitances, $m$ instances of the 1FO problem will be generated that can be optimally solved by the algorithm presented in Section III-A.

C. Merging Buffer Chains

So far, a continuous-sized buffer library has been assumed. In reality the ASIC library has a finite (and hopefully large) number of inverter sizes. So the solution needs to be mapped to one consistent with the library. The main problem when rounding the inverter sizes is that it may result in significant errors. To alleviate this problem, the merging
transformation, which is the opposite of the split transformation introduced in Fig. 4, is used.

Fig. 8. Different slopes corresponding to different branches.

Fig. 9. Algorithm InCapAlloc.

Fig. 10. Algorithm merge.

To show how this works, recall Theorem 4. If the electrical efforts of the buffers on two branches are equal, one can merge them and replace them with a single buffer with the same electrical effort. Note that simply because the electrical efforts of the buffers are the same, one cannot conclude that the buffer sizes are also the same. As shown in Fig. 4, the sizes of each of the buffers before merging are $C_1/h$ and $C_2/h$, respectively, and the size of the buffer after merging is $(C_1 + C_2)/h$. Therefore, the size of the buffer after merging is equal to the summation of buffer sizes before merging. This fact can be used to reduce the rounding error. As an example, consider a buffer size of 0.35 that has to be mapped to a buffer size of 1 in the ASIC library. Now, if two buffers of size 0.35 could be merged to a single buffer, the size would be 0.7, and rounding to a buffer size of 1 would result in smaller error.

Clearly, one has to be concerned about satisfying the required time and input capacitance constraints when performing this transformation. The merging should be performed in such a way that all timing constraints are satisfied and the area (as well as the input capacitance of the very first stage) is the same. As noted in the proof of Theorem 4, for the merging transformation to produce the exact same area and delay, the electrical efforts of the buffers to be merged must be equal. However, because each branch of the fanout tree is optimized separately with respect to the corresponding sink, the electrical efforts of the buffers may not necessarily be equal. Thus, a constant $\varepsilon$ is defined and two buffers are merged if the difference between their electrical efforts is less than or equal to $\varepsilon$ percent. In addition, two buffers are merged if the rounding error after merging the two is smaller than the summation of rounding errors of each buffer before the merge operation. Obviously, the efficiency of this approach is dependent on the order in which the buffers are selected to be merged. The approach presented here is to cluster the buffers into groups of nearly equal electrical efforts and check for the merging possibilities inside each group. Merging is performed starting at the source of the signal, and proceeding toward the sinks, while at the same time preserving the area so as not to increase the capacitive load imposed on the previous stage. The pseudocode for a recursive merging algorithm is shown in Fig. 10.

IV. EXPERIMENTAL RESULTS

Three different sets of experiments were performed. In the first set, the LEOPARD algorithm of Section III was compared with an implementation of the Sutherland algorithm [4], which minimizes delay through a path. The results are reported in Table I.

TABLE I
COMPARISON WITH SUTHERLAND
TABLE II TABLE III

COMPARISON WITH KUNG COMPARISON WITH SIS

For all of the experiments, the minimization problems within the LEOPARD algorithm were solved using the Matlab Optimization Toolbox v. 2.0. Furthermore, $p_{inv}$ was assumed to be 0.6. For each circuit, the capacitive load of the sink and the maximum capacitance that the source can drive were given. First, the path delay was minimized using Sutherland’s method. The delay and area of the minimum-delay buffer chain are reported in columns 2 and 3. Next, the resulting delay and polarity were used as the constraints for the area minimization problem in LEOPARD. In the 4th column, the minimum area generated by LEOPARD, subject to the given constraints, is shown. As expected, the area is almost the same because delay has been minimized and, hence, the timing constraint is so tight that there is little room for reducing area. However, when LEOPARD was given a 5% additional slack, it reduced area by an average of 29%, as shown in columns 6 and 7. This shows how delay can be traded off for area to significantly reduce area using LEOPARD if a slight increase in delay can be afforded. Note that neither merging nor rounding is applied during this set of experiments, and the area reported is the summation of input capacitances of all inverters.

In the next set of experiments, the results from LEOPARD are compared with the results of an implementation of Kung’s algorithm [3]. For each circuit, a number of sinks with capacitive load, required time, and required polarity are given. The number of sinks for each circuit is shown in column 2. Kung’s algorithm was first used to minimize capacitive load on the source. The resulting capacitance and area are reported in columns 3 and 4. The capacitance calculated by Kung’s algorithm was then used as the capacitive constraint for area optimization in LEOPARD. The resulting area is reported in column 5. Finally, an additional 5% input capacitance was allowed for each circuit to further reduce area, and the resulting input capacitance and area are shown in columns 6 and 7. An average of 19% improvement in area is achieved at the expense of 5% additional input capacitance. Note that in this set of experiments, neither merging nor rounding was performed for Kung’s algorithm or LEOPARD, and the area reported in Table II is the total capacitance of inverters calculated by the algorithms rather than extracted from the library. $p_{inv}$ is assumed to be 0.6.

Finally, the last set of experiments compares LEOPARD with the sequential interactive synthesis (SIS) fanout optimization program. SIS runs different fanout optimization programs, namely LT-Tree, Two-Level, Bottom-Up, and Balanced, and the best one is reported [14]. In this set of experiments, a standard cell library consisting of ten different inverters was used. For each inverter, the intrinsic delay and $R_{out}$ were specified for the SIS library delay model, and $p_{inv}$ and [...] were specified for the Sutherland delay model. A very good match between the SIS delay and logical effort delay model values was enforced.

The fanout optimization programs of SIS were first used to perform fanout optimization. The results are reported in column 6 of Table III. Then, the delay and input capacitance resulting from SIS were used as constraints for LEOPARD. The results, assuming a continuous-size buffer library, are reported in column 3. Then, merging and mapping to the real buffers in the ASIC library were performed, and the results are shown in columns 4 and 5. As shown in the table, in the case of continuous sizing the area is expressed in terms of capacitance, whereas for the discrete-sized buffers it is the actual buffer area extracted from the library. Results show an average of 38% area improvement for LEOPARD.

V. CONCLUSION

This paper presented an optimal algorithm for buffer chains to minimize area with the assumption of continuous sizing for the buffers. The algorithm finds the optimum number of buffers and the optimum sizing for them by solving a posynomial minimization problem subject to posynomial inequality constraints, which can be easily and quickly solved by a convex program solver. Based on this algorithm, a heuristic method was presented for the general case of buffer trees. Considering that the number of discrete buffer sizes in typical libraries has increased considerably, the assumption of a near-continuous buffer library is fairly accurate.

REFERENCES

[1] C. L. Berman, J. L. Carter, and K. F. Day, “The fanout problem: from theory to practice,” in Advanced Research in VLSI: Proceedings of the 1989 Decennial Caltech Conferences, C. L. Seitz, Ed. Cambridge, MA: MIT Press, 1989, pp. 69–99.
[2] K. Kodandapani, J. Grodstein, A. Domic, and H. Touati, “A simple algorithm for fanout optimization using high-performance buffer libraries,” in Proc. Int. Conf. Computer-Aided Design, 1993, pp. 466–471.
[3] D. S. Kung, “A fast fanout optimization algorithm for near-continuous buffer libraries,” in Proc. 35th Design Automation Conf., 1998, pp. 352–355.
[4] I. E. Sutherland and R. F. Sproull, “Logical effort: Designing for speed on the back of an envelope,” in Advanced Research in VLSI. Santa Cruz, CA: Univ. of Calif., 1991.
[5] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, “On the Lambert W function,” Adv. Computat. Math., vol. 5, pp. 329–359, 1996.
[6] P. M. Vaidya, “A new algorithm for minimizing convex functions over convex sets,” in Proc. IEEE Foundations Comput. Sci., Oct. 1989, pp. 332–337.

[7] C. Mead and L. Conway, Introduction to VLSI Systems. Reading, MA: Addison Wesley, 1980.
[8] P. Rezvani, A. Ajami, M. Pedram, and H. Savoj, “LEOPARD: A logical effort-based fanout optimizer for area and delay,” in Proc. Int. Conf. Computer-Aided Design, Nov. 1999, pp. 516–519.
[9] M. C. Golumbic, “Combinational merging,” IEEE Trans. Comput., vol. 25, pp. 1164–1167, Nov. 1976.
[10] K. J. Singh and A. Sangiovanni-Vincentelli, “A heuristic algorithm for the fanout problem,” in Proc. 27th Design Automation Conf., June 1990, pp. 357–360.
[11] A. Salek, J. Lou, and M. Pedram, “A simultaneous routing tree construction and fanout optimization algorithm,” in Proc. Int. Conf. Computer-Aided Design, Nov. 1998, pp. 625–630.
[12] P. Cocchini, M. Pedram, G. Piccinini, and M. Zamboni, “Fanout optimization under a submicron transistor-level delay model,” IEEE Trans. Computer-Aided Design, vol. 9, pp. 339–349, Mar. 1990.
[13] Y.Yu Nesterov and A. Nemirovsky, Interior point polynomial methods in convex programming. Philadelphia, PA: SIAM, 1994.
[14] H. J. Touati, “Performance-Oriented Technology Mapping,” Ph.D. dissertation, Univ. California, Berkeley, 1990.

Reliability-Constrained Area Optimization of VLSI Power/Ground Networks Via Sequence of Linear Programmings

Sheldon X.-D. Tan, C.-J. Richard Shi, and Jyh-Chwen Lee

Abstract—This paper presents a new method of sizing the widths of the power and ground routes in integrated circuits so that the chip area required by the routes is minimized subject to electromigration and IR voltage drop constraints. The basic idea is to transform the underlying constrained nonlinear programming problem into a sequence of linear programs. Theoretically, we show that the sequence of linear programs always converges to the optimum solution of the relaxed convex optimization problem. Experimental results demonstrate that the proposed sequence-of-linear-program method is orders of magnitude faster than the best-known method based on conjugate gradients with constantly better solution qualities.

Index Terms—Circuit modeling, linear programming, power distribution network, simulation and optimization.

I. INTRODUCTION

Power/ground (P/G) networks connect the P/G supplies in the circuit modules to the P/G pads on a chip. An important problem in P/G network design is to use the minimum amount of chip area for wiring P/G networks, while avoiding potential reliability failures due to electromigration and excessive IR drops. Specifically, we are concerned with the problem of P/G-network optimization where the topologies of P/G networks are assumed to be fixed, and only the widths of wire segments are to be determined. Several methods have been developed to solve this problem [6]–[9]. However, to the best of our knowledge, none of these methods have been incorporated into commercial computer-aided design (CAD) tools and used by industry.

One major obstacle is that these methods are based on constrained nonlinear programming, a process known to be computationally intensive (NP-hard) [12]. These methods are applicable only to small size problems, while P/G networks in today’s very large scale integration (VLSI) design may contain millions of wire segments (therefore, millions of variables). On the other hand, with the continuous shrinking of the chip feature size, P/G network optimization is becoming increasingly important, since more and more portions of the chip area are dedicated to P/G routings, and the problems of IR drop and electromigration deteriorate.

In this paper, we present a new method capable of solving the P/G optimization problem orders of magnitude faster than the best known method. Our method is inspired by a key observation made by Chowdhury that if currents in wire segments are fixed, and voltages are used as variables, then the resulting optimization problem is convex [8]. However, instead of using the conjugate gradient method as in [8], we show that the problem can be solved elegantly by a sequence of linear programs. We prove that there always exists a sequence of linear programs that converge to the optimal solution of the original convex optimization problem. Experimental results have demonstrated that usually a few linear programs are required to reach the optimal solution. The complexity of the proposed method is proportional to the complexity of linear programming (which can be solved in polynomial time [5], [12]). Therefore, our method is scalable, i.e., the CPU time increases approximately polynomially with the size of a network. In practice, we have observed that the new method is orders of magnitude faster than the conjugate gradient method with constantly better optimization results.

This paper is organized as follows. Section II reviews some previous work. Section III describes the formulation of the P/G network optimization problem. The new method is presented in Section IV. Some practical considerations are described in Section V. Experimental results from some large P/G networks are summarized in Section VI. Section VII concludes the paper.

II. PREVIOUS WORK

It is generally assumed that the average current drawn by each module is known and is modeled as an independent current source (we do not consider the temporal correlations of current sources). The constraints from reliability and design rules include: 1) IR voltage drop constraints; 2) metal-migration constraints; 3) minimum width constraints; and 4) equal width constraints. The problem of determining the widths of wire segments of a P/G network to minimize the total P/G routing area subject to all these constraints is a constrained nonlinear optimization problem [6], [7].

In the method of Chowdhury and Breuer [6], resistance values and branch currents are selected as independent variables. Both the objective function and the IR voltage drop constraints become nonlinear. The augmented Lagrangian method combined with the steepest descent algorithm [1] is used to solve the resulting problem.

Dutta and Marek-Sadowska [9] used only resistance values as variables. All of the constraints expressed in terms of nodal (terminal) voltages and branch currents, which have to be obtained by explicitly solving an electrical network, become nonlinear. The feasible direction method [4] is employed to solve the nonlinear optimization problem. At each iteration step, extra effort is required to solve the electrical network for nodal voltages and branch currents, as well as their gradients by numerical differentiation.

Manuscript received August 17, 2002; revised February 3, 2003. Some preliminary results of this paper were presented at the ACM/IEEE 38th Design Automation Conference, New Orleans, LA, June 1999. This paper was recommended by Associate Editor M. Sarrafzadeh.
S. X.-D. Tan is with the Department of Electrical Engineering, University of California, Riverside, CA 92521 USA (e-mail: [email protected]).
C.-J. R. Shi is with the Department of Electrical Engineering, University of Washington, Seattle, WA 98195 USA.
J.-C. Lee is with Synopsys Inc., Mountain View, CA 94043 USA.
Digital Object Identifier 10.1109/TCAD.2003.819429
