
A Multi-Terminal Net Router for Field-Programmable Gate Arrays

This document presents a multi-terminal net router for field-programmable gate arrays (FPGAs). The router directly routes multi-terminal nets after placement, bypassing the typical two-stage routing process of global routing followed by detailed routing. Direct routing of multi-terminal nets improves routability compared to decomposing them into two-terminal nets. The router aims to minimize total wire length and number of wiring segments used while meeting timing constraints. Preliminary results show the router routes large industrial circuits with fewer segments than other approaches.


VLSI DESIGN, 1996, Vol. 4, No. 1, pp. 1-10
Reprints available directly from the publisher
Photocopying permitted by license only
(C) 1996 OPA (Overseas Publishers Association) Amsterdam B.V.
Published in The Netherlands under license by Gordon and Breach Science Publishers SA
Printed in Malaysia

A Multi-Terminal Net Router for Field-Programmable Gate Arrays
DINESH BHATIA*
Design Automation Laboratory, Department of Electrical & Computer Engineering and Computer Science, P.O. Box
210030, University of Cincinnati, Cincinnati, OH 45221-0030 (513)-556-2570 (voice) (513)-556-7326 (fax)
[email protected]
AMIT CHOWDHARY†
Department of Electrical Engineering and Computer Science, University of Michigan, 1301 Beal
Avenue, Ann Arbor, MI 48109-2122 amitc@eecs.umich.edu

(Received December 22, 1993, Revised October 18, 1994)

This paper presents a router for multi-terminal nets in field-programmable gate arrays (FPGAs). The router does not
require pre-assignment of routing channels, a phase that is normally accomplished during global routing. This direct routing
approach greatly enhances the probability of routing (routability). Multi-terminal routing also reduces the total wire
length because each route approximates a Steiner tree. The total number of segments required to route the circuits is usually
smaller than with other routing approaches. The router has generated excellent routing results for some industrial circuits.
Its memory requirements are very low, and the routing time is linear in the size of the circuit.

Key Words: Routing, Field-Programmable Gate Arrays, VLSI

*Partially supported by the University Research Council of the University of Cincinnati and by the Solid State Electronics Directorate, Wright Lab of the US Air Force under contract no. F33615-91-C-1811.
†Partially supported by MTL Systems, Dayton, Ohio.

1 INTRODUCTION

Recently, field-programmable gate arrays (FPGAs) have become very popular for implementing Application Specific Integrated Circuits (ASICs). As the technology evolves, the low and medium end ASICs are being implemented using FPGAs, primarily because FPGAs support very rapid prototyping with design realization times on the order of a few hours. Due to a restricted architecture, FPGAs have to rely on special purpose CAD tools for design realization. Performance and logic utilization are of primary concern in FPGA based designs. In this paper we address the problem of routing nets in an FPGA. As we discuss later, the problem of layout synthesis has not been addressed adequately for FPGAs. We have designed and implemented a performance driven multi-terminal FPGA router for an architecture called logic cell arrays (LCAs). The architecture is very popular and was pioneered by Xilinx [9]. We begin by describing the LCA architecture in the next section.

2 ARCHITECTURE OF AN LCA

The architecture of the logic cell array is depicted in Figure 1. An LCA is a two-dimensional array of logic blocks marked as L. The LCA also consists of horizontal and vertical routing channels. Each channel consists of sections that span the length of one logic block. The wiring segments lie within the channel sections. A channel section (and the entire channel) has $W$ wiring segments arranged in parallel tracks. The routing switches are present in the connection boxes and switch boxes.

A connection box or C box consists of switches that connect the logic block pins to the wiring segments. The switch box or S box switches connect one wiring segment to another. The S boxes are present on the
2 D. BHATIA AND A. CHOWDHARY

FIGURE 1 Architecture of Logic Cell Array. (Labels in the figure: channel section, connection block, input block, switch block.)
intersection of the horizontal and vertical routing channels. The flexibility of a C box, $F_c$, is defined as the number of segments to which each logic block pin can be connected. The flexibility of an S box, $F_s$, is defined as the number of segments to which a wiring segment entering an S box can be connected. Figure 2 illustrates a switch box and a connection box along with their flexibilities. From now on, we will use the terms FPGA and LCA interchangeably.

Brown et al. [3] have developed a detailed router for LCAs that routes the two-terminal nets within their assigned global routes. The router, known as the Coarse Graph Expansion (CGE) router, expands the global route (coarse graph) of each net to find a detailed route. While routing, the CGE router considers the side-effect of one connection on another. One drawback of the CGE router is that each multi-terminal net is decomposed into a group of two-terminal nets before routing. By doing so the detailed router effectively constructs a spanning tree and uses an excessive number of wiring segments. In general, a two-terminal net router can produce a wiring where two two-terminal nets belonging to the same multi-terminal net occupy different segments (tracks) in the same channel. This happens because once the two-terminal nets are formed they are all treated as independent nets. Figure 3 shows an example where two-terminal decomposition of a three-terminal net results in two routes that use disjoint sets of segments.

FIGURE 3 An example where two-terminal decomposition of a three-terminal net results in routes that use an excessive number of wiring segments. {a, b, c} is a three-terminal net whose two-terminal decomposition results in two two-terminal nets {a, b} and {a, c} that are routed using disjoint sets of segments in common wiring channels.

This type of wiring can result in excessive routing delay, especially when the spanning tree is like a path (a very skewed tree) with an output pin at one end. In this case the maximum delay would depend on the number of input pins in the net. We have observed a few nets in the industrial circuits to have as many as 20 to 30 input pins.

Our router addresses the problem of excessive delays by placing an upper bound on the length of the route from the output pin of a multi-terminal net to any input pin on the same net. This bound is the minimum possible and is obtained from the placement of logic modules. It is easy to see that this bound is the distance between the output pin and the input pin belonging to the same net and farthest from the output pin. We bypass global routing and directly perform the detailed routing after placement. The two stage routing, i.e., global routing followed by the detailed routing, is not very efficient. In fact, even in the presence of a pre-defined routing architecture, the detailed routing problem in the presence of assigned global routes is NP-complete [8]. The results of Wu and Marek-Sadowska [8] are due to a graph based analysis of the routing architecture of logic cell arrays.
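The delay argument above (a path-like spanning tree versus the farthest-pin bound) can be illustrated with a small sketch. It builds a minimum spanning tree over the pins of one net using Manhattan distances and compares the longest output-to-input path in that tree with the farthest-pin bound. Prim's algorithm is used here only for brevity and is not part of the router; the pin coordinates and function names are illustrative assumptions.

```python
def manhattan(a, b):
    """Manhattan distance between two placed pins given as (x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def spanning_tree_path_lengths(pins):
    """Path length from pins[0] (the output pin) to every pin, measured along
    a minimum spanning tree built with Prim's algorithm."""
    n = len(pins)
    in_tree = [False] * n
    key = [float("inf")] * n   # cheapest known edge connecting a pin to the tree
    parent = [None] * n
    path_len = [0] * n         # tree distance from the output pin
    key[0] = 0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: key[i])
        in_tree[u] = True
        if parent[u] is not None:
            path_len[u] = path_len[parent[u]] + key[u]
        for v in range(n):
            if not in_tree[v]:
                d = manhattan(pins[u], pins[v])
                if key[v] > d:
                    key[v], parent[v] = d, u
    return path_len


def farthest_pin_bound(pins):
    """Distance from the output pin to the farthest input pin of the net."""
    return max(manhattan(pins[0], p) for p in pins[1:])


# Hypothetical placement: output pin first, then three input pins.
pins = [(0, 0), (0, 2), (2, 2), (2, 1)]
print("longest path in spanning tree:", max(spanning_tree_path_lengths(pins)))
print("farthest-pin bound:", farthest_pin_bound(pins))
```

For this placement the spanning tree is a path, so the last pin is reached after 5 units of wire although the farthest pin is only 4 units from the output pin.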
FIGURE 2 The Flexibility of Switch and Connection Boxes.

We route multi-terminal nets directly after module placement. By doing so, we not only bypass the global routing but also enhance the probability of routing. As the results will show, our router is capable of routing large industrial circuits with a significantly smaller number of wiring segments. The channel width requirement is almost the same as that for the CGE router by Brown et al. [2]. In most cases we have observed that 95-99% routing with channel width much less than that required by the
MULTI-TERMINAL NET ROUTER 3

CGE router is easily obtained by our router. Very few nets that are difficult to route in the later stages of routing increase the channel width locally. A manual routing of the remaining nets would give significantly improved results.

The multi-terminal router searches the entire space around the pin within the associated bounding box. Also, instead of decomposing the multi-terminal net into two-terminal nets, one input pin of the multi-terminal net is routed to the already routed portion of the net. Thus, the probability of the pin getting routed increases as the routed portion of the net becomes large. Actually, the routing becomes difficult only in the later stages when the majority of the routing resources have been used up, but the increase in the routed portion of the nets more or less compensates for the difficulty in routing the subsequent input pins. The final routing of any one net is an approximation of a Steiner tree¹. It usually requires fewer segments than the route obtained after decomposing multi-terminal nets into two-terminal nets. The performance of the router depends on the topology of the connection box and the switch box (i.e. the pattern of the switches in these boxes). Thus, a switch box or a connection box with an efficient topology might result in a smaller channel width.

¹In practice the multi-terminal router can save at most 33% routing resources. This is due to the fact that Steiner trees are at most 33% better than spanning trees over the same set of vertices [5].

3 MAIN FEATURES OF THE MULTI-TERMINAL ROUTER FOR FPGAS

The input to the router is the netlist obtained after the placement of modules on an FPGA. Each multi-terminal net $n_k$ is a set of one output pin $o_k$ and one or more input pins. Thus $n_k = \{o_k, i_k^1, i_k^2, \ldots, i_k^{p_k}\}$, where $p_k$ is the total number of input pins associated with $n_k$. We distinguish output and input pins from each other. A net will refer to a multi-terminal net, unless specified otherwise. The aim for routing the nets is to achieve 100% routing with the channel width and the routing delay as low as possible. Currently we bound the routing delay for each net $n_k$ by $\max\{l(o_k, i_k^j)\}$, $1 \le j \le p_k$, where $l(a, b)$ is the Manhattan distance between $a$ and $b$. This metric is less than or equal to the half perimeter of the bounding box for net $n_k$. Also, due to the fact that all segments are of unit length, the wiring delay due to programmable switches is independent of the track assignment. This is consistent with the Xilinx XC2000 and XC3000 families of FPGAs [9]. In the XC4000 family, segments of different lengths are also available [9]. Thus the wiring delay will have to take both Manhattan distance and track assignment into account. It should be noted that the cost function used for our router can readily accept segments of arbitrary lengths. Besides the primary requirement of routability and small wiring delays, we also address the memory requirements and execution time of our router. It will be evident from the results that the memory and the execution time requirements of the router are extremely small.

The approach normally followed for routing is to decompose each multi-terminal net into a group of two-terminal nets, perform the global routing of two-terminal nets, and then route the circuit within the assigned global routes using a detailed router. A global route for a two-terminal net is a sequence of channel sections from one terminal to another. The search space for finding a detailed route is restricted to the assigned global route. The problem of finding detailed routes within the assigned global routes for FPGAs is NP-complete [8]. We have taken a different approach. In order to make full use of the available routing resources, nets are routed from the netlist description obtained after the placement of the modules, i.e., the global routing is bypassed. We route one input pin of a multi-terminal net at a time to the already routed portion of that net. In other words, the router decides a Steiner point for connecting an input pin to the routed portion of the net. Instead of bounding the search space by assigning global routes, the complete feasible space is searched for the routing of each terminal. This results in a very high success probability of routing. In our implementation, the input pins, $i_k^j$, $1 \le j \le p_k$, are first ordered in the non-increasing order of their Manhattan distance from the output pin $o_k$ belonging to the same net. Let this order be $\{i_k^{j_1}, i_k^{j_2}, \ldots, i_k^{j_{p_k}}\}$. Following this ordering, $i_k^{j_1}$ is routed within distance $l(o_k, i_k^{j_1})$. Subsequently input pins $i_k^{j_2}$ through $i_k^{j_{p_k}}$ are routed to the routed portions of net $n_k$. For input pins belonging to a net and lying on the timing critical path(s), the net ordering is slightly different. We first route the input pin belonging to a net and lying on a timing critical path to the output pin belonging to the same net. Subsequently, the remaining pins are routed as determined by the net ordering described above. In doing so we gain the following.

1. Each of the $p_k$ input pins of a net $n_k$ is routed in such a way that the length of the path from the output pin $o_k \in n_k$ to an input pin is at most $\max\{l(o_k, i_k^j)\}$, $1 \le j \le p_k$. This places an upper bound on the length of the path between the output pin of a net and any of its input pins. This results in a final routed circuit with a low routing delay. In case of two-terminal routing, there is no upper bound on the length of the path from an output pin to any input pin. In the worst case the path length can be as much as the cost of minimum
spanning tree obtained for a net $n_k$ using Kruskal's method [6]. In addition, the two-terminal decomposition itself is time consuming as it takes $O(p^2)$ time for a net with $p$ terminals.
2. The input pins of a net are routed in the decreasing order of their Manhattan distances from the output pin of the net. Typically the longer paths are difficult to route, and initially, in the presence of almost all routing resources, they get routed easily. Subsequently, as more and more pins belonging to the same net get routed, the routed portion of the net increases and the probability of routing the subsequent input pins also increases. This is because our router, while routing an input pin, searches for the already routed portions of the net. This compensates for the increase in the difficulty of routing in the later stages of routing.

Figure 4 illustrates a net with an output pin numbered 0 and five input pins numbered 1 to 5. The possible final routes obtained by our router and a two-terminal routing algorithm are also illustrated in Figure 4. In a two-terminal routing approach, the net is decomposed into 5 two-terminal nets that are routed in the shortest possible distance. In doing so, the length of the path between the output pin 0 and the input pin 5 becomes very large, resulting in a considerable routing delay. The final route obtained by our router as shown in Figure 4 results in a Steiner tree type of structure. Initially, the input pin 5 is routed to the output pin 0 with the shortest possible distance and then the other pins are routed to the already existing route of that net. The final route thus obtained is shorter than that obtained by the two-terminal routing. The length of a path from the output pin 0 to any input pin is no greater than the distance between the output pin 0 and the input pin 5. In the example illustrated in Figure 4 the total length of wire for two-terminal wiring is 16 while it is 14 for the multi-terminal wiring. Typical industrial circuits can have as many as 40-50 pins, thus two-terminal decomposition can cause significant routing delay.

Our router takes in the topology of the switch box and the connection box (i.e. the pattern of the switches) as an input. The topology of a switch box specifies all the pairs of segments on different sides of the switch box that can be connected by programmable switches. Similarly, the connection box topology gives the location of the switches inside the connection box.
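The pin ordering described in this section can be sketched as follows. This is a minimal illustration, assuming pins are (x, y) grid coordinates and that timing-critical pins are supplied as a set. The function names are hypothetical, and sorting several critical pins among themselves by distance is an assumption, since the text only says that critical pins are routed first.

```python
def manhattan(a, b):
    """Manhattan distance between two placed pins given as (x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def order_input_pins(output_pin, input_pins, critical=frozenset()):
    """Return the input pins in routing order: timing-critical pins first,
    then the remaining pins, each group in non-increasing Manhattan
    distance from the output pin."""
    by_distance = sorted(
        input_pins, key=lambda p: manhattan(output_pin, p), reverse=True
    )
    first = [p for p in by_distance if p in critical]
    rest = [p for p in by_distance if p not in critical]
    return first + rest


# Hypothetical net: one output pin and four input pins.
output_pin = (0, 0)
input_pins = [(1, 0), (3, 2), (0, 2), (4, 4)]

print(order_input_pins(output_pin, input_pins))
print(order_input_pins(output_pin, input_pins, critical={(0, 2)}))
```

Without critical pins the farthest pin (4, 4) comes first, and its distance of 8 is the bound on every output-to-input path of this net. Marking (0, 2) as critical moves it to the front and leaves the rest of the order unchanged.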

4 MULTI-TERMINAL NET ROUTING

4.1 Terminology

Let $N$ be the total number of nets in a circuit and $p_k$ be the number of input pins in each net $n_k$, $1 \le k \le N$. We denote by $ML(n_k)$, $1 \le k \le N$, the distance between $o_k$ and an input pin $i_k^q \in n_k$ such that $l(o_k, i_k^q) = \max\{l(o_k, i_k^j)\}$, $1 \le j \le p_k$. We denote by $CR(n_k)$, $1 \le k \le N$, the partial route of $n_k$. $CR(n_k)$ includes $o_k$, a subset of input pins $i_k^j$, $1 \le j \le p_k$, and the wiring segments used to route $o_k$ and the subset of input pins.
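As a rough illustration of these two definitions, the sketch below keeps the state of one net: its pins, the input pins already routed, and the wiring segments used so far (together standing in for $CR(n_k)$), with $ML(n_k)$ computed from the placement. The class, its field names, and the segment labels are assumptions made for this example and do not come from the paper.

```python
from dataclasses import dataclass, field


def manhattan(a, b):
    """Manhattan distance between two placed pins given as (x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


@dataclass
class Net:
    output_pin: tuple
    input_pins: list
    routed_pins: set = field(default_factory=set)  # input pins already in CR(n_k)
    segments: set = field(default_factory=set)     # wiring segments in CR(n_k)

    def ml(self):
        """ML(n_k): distance from the output pin to its farthest input pin."""
        return max(manhattan(self.output_pin, p) for p in self.input_pins)

    def add_route(self, pin, new_segments):
        """Record that `pin` was connected to the partial route via `new_segments`."""
        self.routed_pins.add(pin)
        self.segments.update(new_segments)

    def unrouted_pins(self):
        """Input pins that still have to be routed, in their given order."""
        return [p for p in self.input_pins if p not in self.routed_pins]


net = Net(output_pin=(0, 0), input_pins=[(4, 4), (3, 2), (1, 0)])
net.add_route((4, 4), {"h0_t1", "v3_t1"})  # made-up segment labels
print(net.ml(), net.unrouted_pins())
```

Keeping the routed pins and the used segments in one object mirrors the way the text treats the partial route as the target for every later pin of the same net.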
FIGURE 4 Worst case two-terminal routing compared against a multi-terminal routing.

A channel section is defined as the set of wiring segments between two successive switch-boxes in a horizontal row/vertical column. Two channel sections $i$ and $j$ are said to be adjacent iff they share a common switch box. A Global-Graph is a directed acyclic graph $G(V_G, E_G)$ rooted at vertex $v_0$. A vertex $v_i \in V_G$ represents a channel section $i$. There exists an edge $v_i v_j$ between two vertices $v_i$ and $v_j$ iff $i$ and $j$ are adjacent channel sections and if $v_i$ belongs to level $l$ in $G$ then $v_j$ belongs to level $l + 1$ in $G$. Suppose an input pin $i_k^j \in n_k$ is to be routed to $CR(n_k)$. Then $G$ is constructed as follows. $v_0$ represents the channel section into which $i_k^j$ is brought using a connection box. Clearly, the in-degree of $v_0$ is equal to zero. $V_G$ is a collection of vertices such that:

1. if there exists a wiring segment in channel section
$i$ that belongs to $CR(n_k)$ then $v_i$ belongs to $V_G$. The vertex $v_i$ is also called a leaf vertex in the Global-Graph.
2. if there exists a path from the root channel section through a sequence of channel sections $j_1, j_2, \ldots, j_m$, for some $m \le ML(n_k)$, such that $j_m$ contains a wiring segment that belongs to $CR(n_k)$ then $v_{j_1}, v_{j_2}, \ldots, v_{j_{m-1}}$ belong to $V_G$. Note that $v_{j_m}$ will belong to $V_G$ due to condition (1) above. Also $v_{j_l}$, $1 \le l \le m$, belongs to level $l$ in $G$.

The edges of $G$ are constructed as defined above. The Global-Graph for net $n_k$, $1 \le k \le N$, consists of $ML(n_k)$ levels, and each level $l$ consists of vertices that represent channel sections that are at distance $l$ from the root vertex representing the channel section where the input pin to be routed exists. The Global-Graph associated with any pin belonging to a net lists all the possibilities of routing (all possible global paths). We search for the minimum cost wiring in a Detailed-Graph which is obtained after we expand the Global-Graph into a Detailed-Graph. The Detailed-Graph is described below.

To route an input pin $i_k^j \in n_k$, a Detailed-Graph $D(V_D, E_D)$ is obtained by expanding each channel section represented in the Global-Graph into individual wiring segments. $D$ is also a directed acyclic graph where each vertex $v_i \in V_D$ represents a wiring segment within a channel section. $V_D$ is constructed as follows. Each vertex $v_j \in V_G$ is replaced by a set $\{V_j\}$ of vertices where each vertex represents a feasible wiring segment in channel section $j$. By feasible wiring segment we imply an unoccupied wiring segment. For leaf vertices in the Global-Graph the feasible wiring segment set also includes the occupied wiring segments that belong to the net $n_k$. Also, if $v_j \in V_G$ belongs to level $l$ in $G$ then the corresponding set of vertices $\{V_j\} \subseteq V_D$ also belongs to level $l$ in $D$. In addition, if $v_j$ is a leaf vertex in the Global-Graph, then $\{V_j\} \subseteq V_D$ represents a set of leaf vertices. $E_D$ is constructed as follows. For each $v_i, v_j \in V_G$, if there exists an edge $v_i v_j$ then there exists a directed edge from members of $\{V_i\}$ to members of $\{V_j\}$ if and only if two members share a programmable switch between them in the switch-box between channel sections $i$ and $j$. Figure 5 illustrates an example. In Figure 5, the Global-Graph and the corresponding Detailed-Graph are shown. The vertices $v_0 \ldots v_9$ in the Global-Graph represent the channel sections. The vertices in the Detailed-Graph represent the feasible wiring segments within a channel section. Finally, we define routability as the ratio, in percent, of the number of nets completely routed to the total number of nets in a given circuit.

FIGURE 5 The global and detailed graphs (left: Global-Graph; right: Detailed-Graph).

4.2 Pre-processing of Nets

Each net $n_k$ has $p_k \ge 1$ input pins. For each net $n_k$, $1 \le k \le N$, the input pins are arranged in a certain pre-determined order. As stated earlier, in our implementation, the input pins, $i_k^j$, $1 \le j \le p_k$, are first ordered in the non-increasing order of their Manhattan distance from the output pin $o_k$ belonging to the same net. Let this order be $\{i_k^{j_1}, i_k^{j_2}, \ldots, i_k^{j_{p_k}}\}$. Following this ordering, $i_k^{j_1}$ is routed within distance $l(o_k, i_k^{j_1})$. Subsequently input pins $i_k^{j_2}$ through $i_k^{j_{p_k}}$ are routed to $CR(n_k)$. These pins, i.e. $i_k^{j_2}$ through $i_k^{j_{p_k}}$, are not constrained to their bounding boxes with respect to the output pin $o_k$. Instead pins $i_k^{j_2}$ through $i_k^{j_{p_k}}$ are routed by expansion of the global graph into the detailed graph when no more than $ML(n_k)$ levels are permitted in the global graph. The ordering of pins should not be mistaken as two-terminal decomposition of a multi-terminal net. Here we are arranging input pins in the descending order of their Manhattan distance from the


output pin. Thus for $N$ nets in a circuit, we will have $N$ distinct ordered sequences.

4.3 Routing of nets

We now present the routing algorithm. In each iteration an input pin belonging to a net $n_k$ from each of the ordered sequences is routed to its respective $CR(n_k)$. The algorithm executes $\max\{p_k\}$, $1 \le k \le N$, iterations. The routed portion $CR(n_k)$ of each net $n_k$ is small in the very early stages of routing. It is likely that a few of the input pins might not get routed in the first execution of the algorithm. The algorithm can take multiple passes to improve the routability. Thus if each subsequent pass after the first execution takes as input the ordered sequences of unrouted pins, the probability of successful routing would increase due to the substantially larger $CR(n_k)$.

Procedure Route
Input: A logic cell array, a set of netlists obtained after placement and pin-assignment, a width of routing channel $W$, and the flexibilities, $F_s$ and $F_c$, and topologies of switch box and connection box respectively.
Output: A multi-terminal routing if it is feasible with channel width $W$.
Step 1: For each net $n_k$, $1 \le k \le N$, form the ordered sequence $\{i_k^{j_1}, i_k^{j_2}, \ldots, i_k^{j_{p_k}}\}$ of input pins such that $l(o_k, i_k^{j_{r+1}}) \le l(o_k, i_k^{j_r})$, $1 \le r \le p_k - 1$.
Step 2: For each net $n_k$ such that its ordered sequence of input pins is not empty do Steps 3-7.
Step 3: POP $i_k^j$ from the top of the ordered sequence of input pins.
Step 4: Construct the Global-Graph $G(V_G, E_G)$ with root node representing the channel section to which $i_k^j$ belongs.
Step 5: Expand the Global-Graph $G$ into a detailed graph $D(V_D, E_D)$.
Step 6: Select the minimum cost leaf node (one with out-degree equal to zero) and trace back a minimum cost path from leaf level to level 1 in $D$. Add the members of the minimum cost path to $CR(n_k)$.
Step 7: Update the cost of all segments and remove the routed input pin $i_k^j$ from the ordered sequence of input pins for net $n_k$.

4.4 Cost Function

Each wiring segment in the FPGA has a certain cost assigned to it. After each pin is routed, the cost associated with the wiring segments is updated. The cost assigned to a segment reflects the demand on that segment. The cost of any route is the sum of the costs associated with all the wiring segments that belong to the same route.

We describe the following terms before discussing the cost associated with each wiring segment. Figure 6 illustrates the evaluation of the cost function.

1. Pins-Connectivity, denoted by $PC(i)$ for some pin $i$, is the number of unused wiring segments to which the pin $i$ can be connected. $PC(i)$ can at most be equal to the connection box flexibility $F_c$.
2. Segments-Connectivity, denoted by $SC(i, j)$ for any segment $i$ and $1 \le j \le 6$, is the number of segments that segment $i$ can connect to in adjacent channel section $j$, one of the six adjacent channel sections. As seen from Figure 6, a segment in a channel section can connect to segments in six different channel sections through two switch boxes. Initially, the value of this parameter depends solely on the topology of the switch box; i.e., for cases when the switch box flexibility, $F_s$, is a multiple of three, and each incoming segment connects uniformly on all three outgoing sides, then $SC(i, j) = F_s/3$, $1 \le j \le 6$. For each segment used in routing, the connectivity of all the segments in the six adjacent channel sections shown in Figure 6 will have to be updated.

FIGURE 6 An example for evaluating cost function. The cost associated with each segment is dependent upon connectivity of segment in six adjacent channel sections. (Figure legend: fuse site, used segment. Values shown in the figure: PC(h) = 2; SC(x,1) = 0; SC(x,5) = 1; SC(x,2) = 2.)

The cost of a segment depends on the number of segments in the six adjacent channel sections and the number of pins in the same channel section that can be connected to the segment, along with their corresponding Segments-Connectivity and Pins-Connectivity. If a segment in a channel section connects to a large number of segments in the six adjacent channel sections or a large number of pins in the same channel section, then the cost of the segment must be high as it is in great demand. In another case it is likely that a segment in a channel
section $j$ can connect to a set $S$ of segments in the adjacent channel sections, however the segments belonging to set $S$ can connect to only a few segments in channel section $j$. In such a case, the segment is in high demand. A similar high cost scenario can exist when a segment can connect to a set $P$ of pins but for each pin $p \in P$ the $PC(p)$ is very small. In such a case the segment is in high demand. All these facts are taken into account while designing the cost function given below.

$$\text{cost}[\text{segment-no}] = \sum_{i} \frac{1}{SC(i, j)} + \alpha \sum_{k} \frac{1}{PC(k)}$$

Here, the summation over $i$ is done for all the segments in the adjoining channel sections that can be connected to the segment segment-no. Similarly, the summation over $k$ is done for all pins in the same channel section that can be connected to the segment segment-no.

The weight assigned to the cost of a segment due to the pins is taken to be $\alpha$ times the weight of the cost due to the segments in the adjoining channel sections. This is motivated by the fact that a pin can be connected in only one channel section, while a segment in a channel section can be connected to segments in six different channel sections. In our experiments, we found that $\alpha = 4$ is a good weighting factor.

4.5 Time Complexity and Memory Requirements of the Router

Lemma 1 The overall time complexity of the routing is $O(k m^2 W F_s)$ for an array of size $m \times m$, channel width equal to $W$, and switch box flexibility of $F_s$. $k$ is the total number of input pins for the input netlist.

The time complexity of the routing algorithm is governed by the expansion of the global graph into a detailed graph. For an FPGA of size $m \times m$ and channel width equal to $W$, the global graph has $O(m)$ levels and each level has at most $m$ channels. Each channel has $W$ segments, where each segment can connect to at most $F_s$ segments in the next level. Thus, the expansion of the global graph into a detailed graph takes $O(m^2 W F_s)$ time. We expand a small portion of the worst case global graph into a detailed graph for each of the $k$ input pins belonging to the netlist. Thus the overall worst case time complexity is $O(k m^2 W F_s)$.

Lemma 2 The worst case space complexity for the multi-terminal router is $O(m^2 W)$.

The memory requirements for our router depend on
1. storing the detailed graph for an input pin, and
2. storing the assignment for the routed nets.

The detailed graph has $O(m)$ levels and each level has $O(m)$ channels with each channel having $W$ segments. Thus, the worst case memory required for storing a detailed graph is $O(m^2 W)$. For storing the routes of all the nets, the memory required is $O(m^2 W)$. This is true because each segment can be assigned to at most one net and the total number of segments is $O(m^2 W)$. The worst case memory requirements are independent of the number of nets.

5 IMPLEMENTATION DETAILS AND RESULTS

Our router has been used to route several industrial circuits on FPGAs. The FPGA architecture that we used for experiments assumes 4-input lookup table type logic blocks. Each logic block has a D flip-flop. Thus, each logic block has 7 pins: pins 0-3 are the input pins, 4 is the clock pin, 5 is the tri-state pin and 6 is the output pin. Each pin appears on only one side of the logic block. Table 1 shows the number of multi-terminal nets, the number of input pins used and the number of logic blocks used in some of these circuits. These circuits are from different sources: Bell Northern Research (BNR), Zymos and two different designers at University of Toronto (UTD1 & UTD2). In the subsequent subsections, the routing results are presented and compared.

TABLE 1 Characteristics of Experimental Circuits

Circuit   No. of nets   No. of input pins used   No. of logic blocks used   Source   Type
BUSC      151           392                      109                        UTD1     Bus Cntl
DMA       213           771                      224                        UTD2     DMA Cntl
BNRE      352           1257                     362                        BNR      Logic/Data
DFSM      420           1422                     401                        UTD      State Mach.
Z03       608           2135                     586                        Zymos    8-bit Mult

5.1 Channel Density after Global Routing

Before discussing the results obtained after detailed routing for the circuits stated in Table 1, we want to demonstrate the capability of our router for estimating the wiring requirements. Our router can be used as a global router for FPGAs. This is done by keeping the switch box flexibility $F_s$ and the connection box flexibility $F_c$ as the maximum possible, i.e., $F_s = 3W$ and $F_c = W$. By doing so, the maximum channel density over all the channels for a circuit, or simply the channel density, will be the lower bound on the number of segments required to complete the detailed routing of a circuit. The
TABLE 2 Channel Density After Global Routing

Circuit   Density from Modified LocusRoute router   Density from Our Router
BUSC      9                                         8
DMA       10                                        10
BNRE      11                                        12
DFSM      10                                        10
Z03       11                                        12

FIGURE 7 The effect of channel width on routability. (Plot with vertical-axis ticks from 80 to 100 and channel width on the horizontal axis; curves are labeled BUSC and DFSM.)

channel density for a circuit also reflects the quality of the router used. The channel density for the industrial circuits using our router is shown in Table 2. This channel density is compared with the density obtained
after global routing by a modified version of the LocusRoute global router [7]. The LocusRoute global router performs global routing for standard cell designs. It has been modified to suit the FPGA routing architecture.

5.2 Channel Width after Detailed Routing

The industrial circuits are routed using our router with the switch box flexibility, $F_s$, equal to 6 and the connection box flexibility, $F_c$, equal to $0.6W$. The effect on routability of varying switch box and connection box flexibility has been studied and reported in [1] [4]. The channel width $W$ obtained after routing the circuits using the sequential router is compared with the CGE router [3] in Table 3.

It should be noted from Figures 7 and 8 that the router is capable of routing almost all nets with very small channel widths. In fact, in most cases routability as high as 98% is obtained for substantially smaller channel widths. If the remaining nets, as illustrated in Figure 8, are manually routed then we believe that our router can be used for very tight routing. Manual interaction is possible and in most cases we have observed that routing resources are available that can be used for manual routing.

The channel width obtained after routing using our router depends on the topology of the switch box and the connection box. The topology of these boxes specifies the pairs of segments in the switch box or pin-segment

channel width. We have observed the effect of switch box topology on the quality of routing. For example, initially the channel width required for routing the DMA circuit was equal to 12. After trying a few different switch box and connection box topologies, the channel width was reduced to 11 which is the minimum possible as obtained by the global routing.

Let $W_{CGE}$ be the channel width required for 100% routing using the CGE router. Let $R(W)$ be the ratio of the number of nets routed for channel width $W$ to the total number of nets in a circuit using our router. Table 4 shows $R(W_{CGE})$ for various circuits. About 97-100% of the nets get routed by our router, if the channel width is the minimum required by the CGE router for 100% routing.

5.3 Wiring Segment Utilization

Table 5 shows the total number of wiring segments used for routing the circuits. For $F_s = 6$ and $F_c = 0.6W$, the total wire length in terms of the number of the used segments obtained by our router is always less than that
pairs in the connection box that can be connected by a 19 Numbers indicated
switch. An efficient topology will surely lead to a lower unrouted nets

TABLE 3
Channel Width W Required (Fs 6; F 0.6W)
Circuit W for CGE Router W for Our Router 7
4
BUSC 10 8
DMA 10 11
BNRE 12 12 7 8 9 10
DFSM 10 11
Z03
FIGURE 8 Number of unrouted nets for BUSC and DRAM circuits
13 13
for changing channel widths.
MULTI-TERMINAL NET ROUTER 9

TABLE 4 TABLE 6
R(WcE) for Various Circuits. Execution Time Requirements for Our Router
Circuit R(wc) Circuit CPU Time (Secs.)
BUSC 1.00, W for our router is 8 and WcE 10. BUSC 8
DMA .98 DMA 32
BNRE 1.00 BNRE 116
DFSM .9952 DFSM 141
Z03 1.00 Z03 231

obtained by the CGE router [3]. This is due to the fact that our router results in a Steiner tree type of final route for each multi-terminal net. The length of such a route is usually less than the total length of the net if the net is decomposed into two-terminal nets.

Table 5 also shows the total wire length obtained by using our router as a global router. As mentioned earlier, this is done by keeping the flexibility of the connection box and the switch box at the maximum. It should be noted that the two-terminal detailed routes obtained after the global routing always lie within the channels assigned by the global router. Thus, the total wire length estimated after global routing is the same as the one obtained after the detailed routing. Thus, the total wire length obtained for $F_s = 3$ and $F_c = 0.6W$ using the CGE router is also the estimated total wire length obtained after global routing.

TABLE 5
Total Number of Wiring Segments Used for 100% Routing

Circuit  Our Router (Full Flexibility)  Our Router (Fs = 3; Fc = 0.6W)  CGE Router (Fs = 3; Fc = 0.6W)
BUSC     1142                           1216                            1360
DMA      2507                           2606                            2862
BNRE     4494                           4622                            4896
DFSM     4772                           4896                            5211
Z03      7866                           8070                            8668

5.4 Summary of Results

The results stated in the previous subsections demonstrate the efficiency of our router. For the sake of comparison, we have also stated results obtained for the CGE router. A summary of the comparison is given below.

• The CGE router performs two-terminal routing after global routing. Instead, our router performs multi-terminal net routing in one stage, i.e., no global routing is performed for channel assignment. Input to the router is the placement of a circuit on FPGAs.
• Efficient use of wiring segments during multi-terminal routing. Experimentally, we have found that channel width requirements for the two routers are about the same, but the total number of wiring segments used by our router is about 8-10% less than that used by the CGE router. Thus, for FPGAs that have a restricted number of routing resources, our router performs better in terms of resource utilization.
• The time taken by our router lies between 8 and 231 CPU seconds, as shown in Table 6. The execution time needed by the router is the CPU time taken on a Sun SPARC-2 workstation. This is almost the same as the time taken for just the detailed routing of circuits using the CGE router. The maximum memory requirements for our router never exceeded 500 Kbytes. This is primarily the space required to store all the multi-terminal routes. The reported memory requirements for the CGE router lie between 1.5 and 7.5 megabytes.

6 CONCLUDING REMARKS

We have designed a router for routing multi-terminal nets in field-programmable gate arrays. Our router eliminates the need for global routing. The multi-terminal routing also bounds the worst-case length that a signal has to traverse. Since the detailed routes of the nets have a Steiner tree type of configuration, the total number of segments required to route the complete circuit is usually less than with any two-terminal routing approach. Our results are the average case of topologies for the switch box and connection box and compare well with the ones reported in [3]. It should be noted that the channel width required for 100% routing using our router is for a fully automated design. In practice, we have observed that as many as 98% of the nets get routed when the channel width is substantially less than that needed for 100% routing.

Acknowledgement

We would like to thank Jonathan Rose and Stephen Brown of the University of Toronto for providing us with the code of the CGE router and the benchmark circuits. We also thank Ms. Akila Subramaniam for helping us in executing the router on the benchmark circuits and Rajasekhar Medicherla for carefully pointing out errors in the manuscript.

References

[1] Dinesh Bhatia, Amit Chowdhary, and Spyros Tragoudas. Mathematical Model for Routability Analysis of FPGAs. In Proceedings of 4th Great Lakes Symposium on VLSI, IEEE Computer Society Press, pages 76-79, 1994.
[2] Stephen Brown. Routing Algorithms and Architectures for Field-Programmable Gate Arrays. PhD thesis, University of Toronto, 1992.
[3] Stephen Brown, Jonathan Rose, and Zvonko Vranesic. A Detailed Router for Field-Programmable Gate Arrays. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 11(5):620-627, May 1992.
[4] Stephen D. Brown, Jonathan Rose, and Zvonko Vranesic. A Stochastic Model to Predict the Routability of Field-Programmable Gate Arrays. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, 12(12):1827-1838, December 1993.
[5] F. K. Hwang. On Steiner Minimal Trees with Rectilinear Distance. SIAM Journal on Applied Mathematics, 30:104-114, 1976.
[6] J.B. Kruskal. On the Shortest Spanning Subtree of a Graph and the Travelling Salesman Problem. Proceedings of American Mathematical Society, 7:48-50, 1956.
[7] J. Rose. Parallel global routing for standard cells. IEEE Transactions on Computer-Aided Design, 9:1085-1095, October 1990.
[8] Yu-Liang Wu and M. Marek-Sadowska. Graph Based Analysis of FPGA Routing. In Proceedings of European Design Automation Conference, EURODAC-93, pages 104-109, 1993.
[9] Xilinx Inc., San Jose, California. The Programmable Gate Array Data Book, 1994.

Biographies

DINESH BHATIA is an Assistant Professor in the department of Electrical and Computer Engineering and Computer Science at the University of Cincinnati. He also directs the Design Automation Laboratory within the same department. Prior to his current position he was visiting Assistant Professor of Computer Science and Engineering at the Southern Methodist University in Dallas. His research interests include the architecture and CAD for field-programmable gate arrays, interconnection problems in VLSI, physical design of MCMs and large ICs, and graph theory and its application in VLSI design.

AMIT CHOWDHARY received the Bachelor of Technology degree in Electrical Engineering from Indian Institute of Technology, Kanpur, India in 1991, and the M.S. degree in Computer Science and Engineering from the University of Cincinnati, Cincinnati, Ohio in 1993. He is currently working towards the Ph.D. degree in Computer Science and Engineering at the University of Michigan, Ann Arbor, Michigan. His main research interests include logic synthesis and architecture of field programmable gate arrays.
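
Returning to the wiring-segment totals of Table 5: the per-circuit saving of the multi-terminal router relative to the CGE router can be recomputed directly from the tabulated counts. The short Python sketch below is an illustration only and is not part of the original experiments. The dictionary transcribes the Table 5 entries, the column interpretation (full flexibility, detailed routing with our router, CGE router) is an assumption based on the table header, and the names `TABLE_5`, `relative_saving` and `savings_table` are hypothetical.

```python
"""Recompute per-circuit wiring-segment savings from the Table 5 entries."""

# Table 5 transcribed as circuit -> (our router with full flexibility,
#                                    our router with Fs = 3, Fc = 0.6W,
#                                    CGE router with Fs = 3, Fc = 0.6W).
# The column assignment is an assumption read off the table header.
TABLE_5 = {
    "BUSC": (1142, 1216, 1360),
    "DMA": (2507, 2606, 2862),
    "BNRE": (4494, 4622, 4896),
    "DFSM": (4772, 4896, 5211),
    "Z03": (7866, 8070, 8668),
}


def relative_saving(ours: int, reference: int) -> float:
    """Return the fraction of segments saved by `ours` relative to `reference`."""
    if reference <= 0:
        raise ValueError("reference segment count must be positive")
    return (reference - ours) / reference


def savings_table(table=TABLE_5):
    """Map each circuit to the saving of our detailed routing over the CGE router."""
    return {
        name: relative_saving(ours, cge)
        for name, (_full_flex, ours, cge) in table.items()
    }


if __name__ == "__main__":
    for name, saving in savings_table().items():
        print(f"{name}: {100 * saving:.1f}% fewer segments than the CGE router")
```

The same helper can be applied to the full-flexibility column by passing the first entry of each tuple as `ours`, which gives the saving of the global-routing estimate over the CGE result.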