Synthesis For Advanced ASIC Design
Sanjiv Kaul, Cadence Design Systems, 2655 Seely Road, Bldg 7, San Jose, CA 95134
Introduction
As ASIC device sizes continue to shrink to meet the need for ever-faster clock speeds, it has become apparent that traditional design approaches, and not necessarily the capabilities and performance of the individual design tools, must be re-examined. The fastest simulators and the highest-performing logic synthesis engines, while critical for mega-gate, sub-micron designs, are no longer enough to meet the challenges of today's most advanced technologies. What is required as devices slip down into the sub-0.5µm range is a more synergistic and interactive relationship between the heretofore separate logical and physical worlds. Specifically, in a top-down design methodology, unless synthesis tools gain a more thorough understanding of the effects their results have on place-and-route, and unless the two worlds' tool sets interact more closely, ASIC designers and their foundries will continue to be frustrated by unroutable netlists, multiple design iterations and sub-optimally performing designs.
This paper describes a new synthesis capability required for the design of advanced ASICs. Called placement-based synthesis (PBS), it specifically addresses the challenges of 0.5µm design. It allows placement data to be read by logic design tools in order to accurately assess interconnect delays (which account for up to 80% of the overall delay through a path in devices of this size). It then directly and incrementally adjusts the placement in order to correct timing problems and meet performance objectives.
This approach offers significant improvements over earlier techniques, which could only account for interconnect delays after placement had been performed. In addition, existing commercial tools have only been able to adjust netlists, not the physical location (placement) of the cells contained in the netlist. The ability of placement-based synthesis to read placement affords great control over whether timing requirements are met after place and route: once the placement is known to the synthesis tool, it can determine how to alter that placement to meet timing requirements.
In addition, an enhanced optimization capability called layout-intelligent optimization will be described. This feature is required to automatically select cells and cell combinations that lead to more routable designs while still meeting timing and area goals.
Synthesis requirements for 0.5µm
Device sizes are shrinking to meet the need for ever-faster clock speeds. ASIC design kits for 0.7-0.8µm technologies are not uncommon, and several 0.5µm ASIC design kits emerged in 1993. At the 0.5µm level, interconnect delays dominate intrinsic cell delays. In fact, interconnect delays can contribute as much as 80% of the overall delay through a path.
For 0.5µm technologies, interconnect delays can be assessed more accurately, and critical paths thus identified more accurately, when the physical locations of the pins that lie on a path are known. However:
- These pin locations are known only after the design has been placed.
- The delay calculator used must allow the user to accurately model interconnect delays for the particular ASIC library being used by the design.
To date, synthesis tools have simply generated design netlists using fanout-based interconnect delay estimates as a basis for optimizing the design to be fast and to meet timing requirements. The netlist was then fed to a place-and-route (P&R) tool with some indication of the timing goals to be met, in the hope that those goals would indeed be met after P&R. The problem with this approach is that the timing goals for the netlist provided by the synthesis tool cannot always be met after P&R, and the P&R tool has no knowledge of how to adjust the netlist.
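The gap between a fanout-based estimate and a placement-based one can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the wire-load table, the electrical constants, the lumped-RC delay formula, and the use of half-perimeter bounding-box length as a stand-in for whatever a vendor's delay calculator would compute from real pin locations.

```python
# Hypothetical wire-load table: (fanout, estimated net length in um).
WIRE_LOAD_TABLE = [(1, 50.0), (2, 120.0), (4, 300.0), (8, 700.0)]

# Hypothetical electrical constants for a lumped-RC delay estimate.
DRIVER_RES_KOHM = 1.0       # driver output resistance
WIRE_CAP_FF_PER_UM = 0.25   # wire capacitance per micron


def fanout_length(fanout):
    """Estimate net length from fanout alone by piecewise-linear interpolation."""
    pts = WIRE_LOAD_TABLE
    if fanout <= pts[0][0]:
        return pts[0][1]
    for (f0, l0), (f1, l1) in zip(pts, pts[1:]):
        if fanout <= f1:
            return l0 + (l1 - l0) * (fanout - f0) / (f1 - f0)
    # Beyond the table: extrapolate with the slope of the last segment.
    (f0, l0), (f1, l1) = pts[-2], pts[-1]
    return l1 + (l1 - l0) * (fanout - f1) / (f1 - f0)


def placed_length(pins):
    """Half-perimeter of the bounding box around placed pin locations (um)."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def wire_delay_ps(length_um):
    """Lumped-RC delay: kOhm times fF gives picoseconds."""
    return DRIVER_RES_KOHM * WIRE_CAP_FF_PER_UM * length_um


# Two nets with the same fanout (one driver, three sinks) but different placements.
near_net = [(0, 0), (40, 10), (30, 60), (10, 50)]
far_net = [(0, 0), (900, 100), (300, 700), (50, 400)]

fanout_estimate = wire_delay_ps(fanout_length(3))   # identical for both nets
near_delay = wire_delay_ps(placed_length(near_net))
far_delay = wire_delay_ps(placed_length(far_net))
print(fanout_estimate, near_delay, far_delay)
```

Both nets receive the same fanout-based estimate, yet their placed lengths differ by more than an order of magnitude, which is why a critical path cannot be identified reliably before pin locations are known.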
Also, synthesis tools have traditionally ignored the impact of their choices on the ease and speed with which the resulting netlist can be routed by the P&R tool. Such routability problems are a growing issue.
For example, commercial synthesis tools produce designs with high-fanout nets (the average number of pins per net is about 4 for automatically synthesized designs versus 3 for manually generated designs). This leads to congestion, which in turn makes the design difficult to route and can result in the creation of long, slow nets. In an attempt to reduce "area," synthesis tools also typically select cells that are difficult to route over, choosing these cells because their cell area is small. Such small cells often have inadequate feedthroughs (i.e. they are "non-porous") and therefore cause routing blockages, which lead to long (and thus slow) nets. Also, groups of small cells lead to high pin densities in a localized area and thus to a pocket of congestion that proves difficult to route. The problem is illustrated in Fig. 1, in which a larger but more porous cell might be chosen because it is easier to route over.
Design ECOs are very common, and both laborious and expensive to perform manually. The most common cause is a timing violation. It is not unusual for there to be more than five iterations between P&R and synthesis to meet design goals. Fortunately, synthesis is in a unique position to automatically drive changes to an already placed and routed design. This is because the synthesis tool understands the designer's original goals (e.g. desired clock speed, which paths are critical paths, functional operation of the design, maximum tolerable clock skew). It also has the ability to translate one network into a logically equivalent network to meet such goals. The latter is the essence of current synthesis/optimization technology. However, to truly meet timing goals, a synthesis tool must have accurate delay information at its disposal. This information can only be obtained after placement and is thus not available to an isolated synthesis tool.
For 0.5µm designs, ASIC foundries confirm that a paradigm change is needed in which synthesis tools become more intelligent about layout. Obtaining the most accurate delay information, and being able to reduce interconnect delays, requires that the synthesis tool have knowledge of actual cell locations (placement). With placement information, the tool can precisely direct the P&R tool (by specifying exact placement changes) so that the placed circuit is incrementally changed to meet timing goals.
Traditional approaches
Traditional approaches to synthesis-based top-down design have taken a point-tool approach. That is, synthesis tools and P&R tools work in relative isolation from each other. The synthesis tool generates netlists "optimized" to meet timing and cell-area or gate-count requirements. It judges whether these timing and area requirements are met from the characteristics of the individual cells that make up the design and, for timing, uses fanout information to predict interconnect delays. The generated netlist is then handed to the P&R tool in the hope that timing and die-size requirements will truly be met after P&R.
Specifically, the approach taken by "point tool" vendors is:
1. Synthesize using cell-by-cell information available in a synthesis library. The problem with this approach is that interconnect delays are not accurately computed. They are estimated from a piecewise-linear, fanout-based approximation (estimate interconnect length based on fanout, then compute delay from the estimated length). Also, since the synthesis tool ignores interconnect, it might generate designs that contain high-fanout nets and small non-porous cells, both of which make the design difficult to route. Quite often the synthesis tool chooses small non-porous cells because these cells, due to their lack of feedthroughs, have a seemingly smaller area. But overuse of such cells can lead to high pin concentrations and, with few feedthroughs present, pockets of congestion that result in:
- large area occupied by interconnect in channel-routed designs
- designs that are difficult to route
- designs that contain long nets and are thus slow due to excessive interconnect delays
2. Pass a netlist and timing goals to the P&R tool. The disconnect between synthesis and P&R is further evidenced here. The netlist may contain cells of inappropriate drive power for the interconnect that results after P&R. Also, the proximity of a cell to the cells that it drives cannot be controlled.
3. Read delay information back from the P&R tool. This step is used to determine whether timing requirements were met. Usually they were not.
4. Iteratively implement design changes. This is typically done in one of two ways: by manual techniques or by resynthesizing small portions of the design. Each iteration can take 26 weeks to perform. The synthesis tool may suggest minor netlist changes to the P&R tool that it hopes will correct the problem. However, the change may not fix the problem since, without placement information, the synthesis tool's timing estimates are inaccurate. Multiple design/P&R iterations are typically necessary, resulting in months of delay in time-to-market and high engineering costs.
A paradigm shift
The approach described here offers designers techniques that are both P&R-independent (called layout-intelligent optimization) and layout-dependent (called placement-based synthesis).
Synthesis tools have gained a reputation for generating netlists that are difficult to route. As design complexities increase, routing times also increase and are becoming unacceptable (e.g. a 320K gate design might take 7 days to place flat). Optimization technology has to be enhanced so that it performs layout-intelligent optimization. This enhancement results in designs that are inherently more routable by the P&R tool (due to fewer pins and less congestion). The tool considers interconnect, rather than taking the traditional approach of blindly choosing a combination of cells because it is the smallest in terms of cell area or gate count. The technology accounts for interconnect in the following ways:
- It reduces pins, thus resulting in fewer wires to route. It does so transparently, using existing synthesis library information.
- It ensures that standard cell designs are sufficiently porous (i.e. that there are sufficient routing resources over the cells). It can optionally do so using information in the Cadence floorplanning tool's library.
Interconnect considerations affect not only the routability of the design but also its actual delay characteristics. For standard cell designs with channel routing, consideration of interconnect results in smaller die sizes. Synthesis tools have traditionally attempted to optimize designs to meet "area" goals, that "area" being taken to be the total cell area. For standard cell designs, the percentage of the die occupied by cells may be as little as 30%. The layout-intelligent optimization capability recognizes that die size is a function of interconnect as well as cell area.
Placement-based synthesis technology can correct timing problems and reduce critical path delays using a combination of the following:
- functional knowledge (what function each piece of the design performs), which is knowledge only available to synthesis tools. Synthesis tools understand logical equivalencies, e.g. they know how to substitute one cell (or buffer) for a logically equivalent cell (or buffers) in order to correct or improve timing.
- physical knowledge (where cells are located with respect to the other cells or gates that they drive; actual interconnect delays and loading). This knowledge has traditionally been available only to P&R tools and floorplanners. With this advancement, it is now available to synthesis tools (for example, Cadence's Synergy toolset), since they can read placement.
With placement information, the synthesis tool is in a position to direct the P&R tool to perform exact incremental changes to placement (cell substitutions, cell movement and rebuffering), changes that will ensure that timing problems are resolved. Moreover, it can drive incremental placement changes that actually improve the delay characteristics of the circuit's critical paths. The ability to adjust placement rather than netlists provides more control over how timing problems are corrected.
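How functional and physical knowledge combine can be sketched with a small cell-substitution example. This is a minimal illustration under stated assumptions, not the tool's actual algorithm: the cell names, the linear delay model (intrinsic delay plus a per-fF drive term), and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    function: str           # logical function, e.g. "AND4"
    intrinsic_ps: float     # intrinsic cell delay
    drive_ps_per_ff: float  # extra delay per fF of load (weaker drive = larger)
    area: float


def stage_delay(cell, load_ff):
    """Hypothetical linear delay model: intrinsic delay plus load-dependent term."""
    return cell.intrinsic_ps + cell.drive_ps_per_ff * load_ff


def pick_equivalent(library, current, load_ff, budget_ps):
    """Choose the smallest logically equivalent cell that meets the delay budget.

    Falls back to the fastest equivalent cell if none meets the budget.
    """
    equivalents = [c for c in library if c.function == current.function]
    meeting = [c for c in equivalents if stage_delay(c, load_ff) <= budget_ps]
    if meeting:
        return min(meeting, key=lambda c: c.area)
    return min(equivalents, key=lambda c: stage_delay(c, load_ff))


library = [
    Cell("AND4_X1", "AND4", 100.0, 4.0, 4.0),
    Cell("AND4_X2", "AND4", 110.0, 2.0, 6.0),
    Cell("AND4_X4", "AND4", 120.0, 1.0, 10.0),
    Cell("OR2_X1", "OR2", 80.0, 4.0, 3.0),
]
current = library[0]

# Functional knowledge alone (load guessed from fanout): the small cell looks fine.
choice_from_fanout = pick_equivalent(library, current, load_ff=25.0, budget_ps=200.0)
# Adding physical knowledge (load derived from placement): a stronger cell is needed.
choice_from_placement = pick_equivalent(library, current, load_ff=50.0, budget_ps=200.0)
print(choice_from_fanout.name, choice_from_placement.name)
```

The function filter stands in for functional knowledge (only logically equivalent cells are candidates), while the load argument stands in for physical knowledge read from placement.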
Using this placement-based synthesis approach, the synthesis tool can read placement information, allowing it to automatically compute accurate interconnect delays using the same delay calculator used by leading simulators, timing analyzers and floorplanners. Placement data read and written by the synthesis tool can be in the form of:
- exact placement information produced and understood by P&R tools
- estimated (region-based) placement from advanced floorplanners
Timing corrections are made (by Synergy-PBS, for example) to the placed design to ensure that, after P&R, setup/hold violations are removed, fanout/load requirements are met, and clock speed specifications are met. Techniques used by the tool include:
- optimizing buffers
- swapping cells for logically equivalent cells that have more appropriate drive capabilities or that would lead to less congestion
- avoiding cell combinations that would result in long wires along critical paths
Using knowledge of the existing placement, the tool specifies to the P&R tool the desired changes to cell/buffer locations in terms of the regions where the cell/buffer should be placed. This specification of location is computed based on interconnect load and ensures that, once placed, the circuit will:
- meet timing requirements
- encounter less interconnect congestion
- contain shorter (and thus faster) interconnect along critical paths
For portions of the design that already meet spec, little or no change occurs to the placement.
Specifying the desired changes in terms of placement regions allows the P&R tool to implement the desired change without being over-constrained. Placement time for the incremental placement is typically about 10% of the time taken to place the original design.
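The idea of specifying changes as regions, and only for the parts of the design that fail timing, can be sketched as follows. This is an illustrative assumption rather than the method used by any actual tool: the `Region` type, the fixed window size, and the geometric heuristic (a window centered midway between the driver and the centroid of its sinks) are all hypothetical, whereas the paper states that the real computation is based on interconnect load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def target_region(driver, sinks, half_width=20.0):
    """Hypothetical heuristic: a square window centered midway between the
    driver and the centroid of the sinks, where a buffer could be placed."""
    cx = sum(x for x, _ in sinks) / len(sinks)
    cy = sum(y for _, y in sinks) / len(sinks)
    mx, my = (driver[0] + cx) / 2, (driver[1] + cy) / 2
    return Region(mx - half_width, my - half_width, mx + half_width, my + half_width)


def placement_directives(nets, slack_ps):
    """Emit a region directive only for nets with negative slack.

    Nets that already meet timing produce no directive, so their
    placement is left untouched.
    """
    return {
        name: target_region(driver, sinks)
        for name, (driver, sinks) in nets.items()
        if slack_ps[name] < 0
    }


# Each net maps to (driver location, list of sink locations), in um.
nets = {
    "n1": ((0.0, 0.0), [(100.0, 0.0), (100.0, 200.0)]),
    "n2": ((10.0, 10.0), [(20.0, 15.0)]),
}
slack_ps = {"n1": -120.0, "n2": 35.0}

directives = placement_directives(nets, slack_ps)
print(directives)
```

Handing the P&R tool a window rather than an exact coordinate leaves it free to legalize the cell anywhere inside the region, which is the sense in which a region-based specification avoids over-constraining the placer.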
Summary
The pace of ASIC design technology dictates more than just incremental improvements in design automation tools. Methodologies that incorporate complete front-to-back thinking are required to deal with the complexities and challenges of today's most advanced devices. Therefore, to avoid needless design iterations, sub-optimal device performance and even unroutable netlists, a synergistic relationship between the logical and physical design worlds must exist.
Specifically, with designs of 0.5µm, interconnect delays can constitute as much as 80% of the path delay. Accurate interconnect delay estimates can only be made once placement information is available, using an ASIC vendor's delay models. A breakthrough technology called placement-based synthesis (PBS) can perform the delay estimate accurately and also correct the placement of the design so as to meet timing requirements and improve clock speeds. Traditional synthesis tools have only been able to alter netlists. However, after the netlist is altered, there is no guarantee that timing requirements will be met after incremental P&R. The P&R tool itself is inherently constrained in that it cannot generally alter the netlist to be placed and routed. For example, it cannot substitute one 4-input AND gate for another (logically equivalent) 4-input AND gate. However, synthesis tools understand logical equivalence and are therefore in a position to make such substitutions. Hence, such changes can be performed by synthesis tools in conjunction with P&R tools. Automation of this process is critical if designers hope to reduce turnaround times and the number of NREs typically associated with synthesis-based top-down design. The approach described in this paper allows the synthesis tool to automatically read placement information from industry-standard P&R tools and drive incremental changes to the design placement, cells and connectivity.
In the past, commercial synthesis tools have attempted to patch the timing problems that inevitably arise after P&R. The approach taken is to read (back-annotate) delay information derived after P&R and use this information as a basis for correcting timing problems. The synthesis tool alters the netlist and then feeds those netlist changes to the P&R tool in the hope that the P&R tool will be able to remove the problem by incrementally changing the netlist. With the placement-based synthesis approach, the synthesis tool adjusts the actual placement to ensure that timing goals are met and timing problems removed. It does so with minimal or no changes to portions of the design already known to meet their specification.
In addition, optimization technology needs to be enhanced to consider the impact of its decisions on interconnect (die size, routability and routing runtime costs). To improve routability, it needs to avoid making decisions that would lead to high pin densities and local pockets of congestion. It should consider cell porosity and interconnect area, where appropriate, when choosing which cells and cell combinations to use to implement the design. It should also consider where to place such cells so as to reduce congestion and the likelihood of long nets.
With these types of advancements, ASIC designers and their foundries can dramatically reduce overall design times and improve the performance of designs. Removing the obstacles that have plagued first-generation approaches to top-down design, by focusing on the barriers between the logical and physical design worlds, will ensure more rapid success and proliferation of design technologies such as 0.5µm device sizes.