

Machine Learning Applications in Physical Design: Recent Results and Directions

Andrew B. Kahng
CSE and ECE Departments, UC San Diego, La Jolla, CA 92093
[email protected]
ABSTRACT
In the late-CMOS era, semiconductor and electronics companies face severe product schedule and other competitive pressures. In this context, electronic design automation (EDA) must deliver “design-based equivalent scaling” to help continue essential industry trajectories. A powerful lever for this will be the use of machine learning techniques, both inside and “around” design tools and flows. This paper reviews opportunities for machine learning with a focus on IC physical implementation. Example applications include (1) removing unnecessary design and modeling margins through correlation mechanisms, (2) achieving faster design convergence through predictors of downstream flow outcomes that comprehend both tools and design instances, and (3) corollaries such as optimizing the usage of design resource licenses and available schedule. The paper concludes with open challenges for machine learning in IC physical design.

ACM Reference Format:
Andrew B. Kahng. 2018. Machine Learning Applications in Physical Design: Recent Results and Directions. In ISPD’18: 2018 International Symposium on Physical Design, March 25–28, 2018, Monterey, CA, USA. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3177540.3177554

1 CONTEXT: THE LAST SCALING LEVERS
Semiconductor technology scaling is challenged on many fronts that include pitch scaling, patterning flexibility, wafer processing cost, interconnect resistance, and variability. The difficulty of continuing Moore’s-Law lateral scaling beyond the foundry 5nm node has been widely lamented. Scaling boosters (buried interconnects, backside power delivery, supervias), next device architectures (VGAA FETs), ever-improving design-technology co-optimizations, and use of the vertical dimension (heterogeneous multi-die integration, monolithic 3D VLSI) all offer potential extensions of the industry’s scaling trajectory. In addition, various “rebooting computing” paradigms – quantum, approximate, stochastic, adiabatic, neuromorphic, etc. – are being actively explored.

No matter how future extensions of semiconductor scaling materialize, the industry already faces a crisis: design of new products in advanced nodes costs too much.1 Cost pressures rise when incremental technology and product benefits fall. Transitioning from 40nm to 28nm brought as little as 20% power, performance or area (PPA) benefit. Today, going from foundry 10nm to 7nm, or from 7nm to 5nm, the benefit is significantly less, and products may well realize only one – possibly two – of these PPA wins. The 2013 ITRS roadmap [40] highlighted a gap between scaling of available transistor density and scaling of realizable transistor density. This design capability gap, which adds to the spotlight on design cost, is illustrated in Figure 1 [20]. The recent DARPA Intelligent Design of Electronic Assets (IDEA) [38] program directly calls out today’s design cost crisis, and seeks a “no human in the loop,” 24-hour design framework for RTL-to-GDSII layout implementation.

1 The 2001 International Technology Roadmap for Semiconductors [40] noted that “cost of design is the greatest threat to continuation of the semiconductor roadmap”.

Figure 1: Design Capability Gap [40] [20].

More broadly, the industry faces three intertwined challenges: cost, quality and predictability. Cost corresponds to engineering effort, compute effort, and schedule. Quality corresponds to traditional power, performance and area (PPA) competitive metrics along with other criteria such as reliability and yield (which also determines cost). Predictability corresponds to the reliability of the design schedule, e.g., whether there will be unforeseen floorplan ECO iterations, whether detailed routing or timing closure flow stages will have larger than anticipated turnaround time, etc. Product quality of results (QOR) must also be predictable. Each of the three challenges implies a corresponding “last lever” for scaling. In other words, reduction of design cost, improvement of design quality, and reduction of design schedule (which is the flip side of predictability; recall that Moore’s Law is “one week equals one percent”) are all forms of design-based equivalent scaling [19] [20] that can extend availability of leading-edge technology to designers and new products. A powerful lever for this will be the use of machine learning (ML) techniques, both inside and “around” electronic design automation (EDA) tools.

The remainder of this paper reviews opportunities for machine learning in IC physical implementation. Section 2 reviews example ML applications aimed at removing unnecessary design and modeling margins through new correlation mechanisms. Section 3 reviews applications that seek faster design convergence through predictors of downstream flow outcomes. Section 4 gives a broader vision of how ML can help the IC design and EDA fields escape the current “local minimum” of coevolution in design methodology and design tools. Section 5 concludes with open challenges for ML in IC physical design. Since this paper shares its subject matter and was written contemporaneously with [23], readers are referred to [23] for additional context.


2 IMPROVING ANALYSIS CORRELATION
Analysis miscorrelation exists when two different tools return different results for the same analysis task (parasitic extraction, static timing analysis (STA), etc.) even as they apply the same “laws of physics” to the same input data. As illustrated in Figure 2, better accuracy always comes at the cost of more computation.2 Thus, miscorrelation between two analysis reports is often the inevitable consequence of runtime efficiency requirements. For example, signoff timing is too expensive (tool licenses, incremental analysis speed, loops of timing window convergence, query speed, number of corners, etc.) to be used within tight optimization loops.

2 The figure’s y-axis shows that the error of the simplest estimates (e.g., “Elmore delay”) can be viewed as having accuracy of (100 − x)%. The return on investment for new ML applications would be higher when x is larger.

Figure 2: Accuracy-cost tradeoff in analysis.

Miscorrelation forces introduction of design guardbands and/or pessimism into the flow. For example, if the place-and-route (P&R) tool’s STA report determines that an endpoint has positive worst setup slack, while the signoff STA tool determines that the same endpoint has negative worst slack, an iteration (ECO fixing step) will be required. On the other hand, if the P&R tool applies pessimism to guardband its miscorrelation to the signoff tool, this will cause unneeded sizing, shielding or VT-swapping operations that cost area, power and design schedule. Miscorrelation of timing analyses is particularly harmful: (i) timing closure can consume up to 60% of design time [12], and (ii) added guardbands not only worsen power-speed-area tradeoffs [3, 9, 12], but can also lead to non-convergence of the design.

Signoff Timer Correlation. Correlation to signoff timing is the most valuable target for ML in back-end design. Improved correlation can give “better accuracy for free” that shifts the cost-accuracy tradeoff (i.e., achieving the ML impact in Figure 2) and reduces iterations, turnaround time, overdesign, and tool license usage along the entire path to final design signoff.3 [27] uses a learning-based approach to fit analytical models of wire slew and delay to estimates from a signoff STA tool. These models improve accuracy of delay and slew estimations along with overall timer correlation, such that fewer invocations of signoff STA are needed during incremental gate sizing optimization [34]. [16] applies deep learning to model and correct divergence between different STA tools with respect to flip-flop setup time, cell arc delay, wire delay, stage delay, and path slack at timing endpoints. The approach achieves substantial (multiple stage delays) reductions in miscorrelation. Both a one-time training methodology using artificial and real circuit topologies, and an incremental training flow during production usage, are described (Figure 3(a)). [30] achieves accurate (sub-10ps worst-case error in a foundry 28nm FDSOI technology) prediction of SI-mode timing slacks based on “cheaper, faster” non-SI mode reports. A combination of electrical, functional and topological parameters is used to predict the incremental transition times and arc/path delays due to SI effects. From this and other works, an apparent “no-brainer” is to use Hybrid Surrogate Modeling (HSM) [28] to combine predicted values from multiple ML models into final predictions (Figure 3(b)).

3 Given that miscorrelation equates with margin, it is useful to note [18].

Figure 3: Flow and results for machine learning of STA tool miscorrelation: (a) [16]; (b) [30]. HSM approaches are described in [28] [29].
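As an illustration of the flavor of such correlation models (and not a reproduction of the methodology of [16], [27] or [30]), the following minimal Python sketch fits a gradient-boosted regression that maps per-path features reported by a fast, optimization-embedded timer to the slack reported by a signoff timer. The feature set and the synthetic “signoff” values are purely hypothetical placeholders for quantities that would be parsed from real timer reports.

# Hypothetical sketch: learn a correction from fast-timer reports to signoff slack.
# Feature names and data are illustrative; a real flow would parse timer reports.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_paths = 5000

# Per-path features as might be extracted from the fast (optimization) timer:
# [fast-timer slack (ns), #stages, total wire cap (pF), worst input slew (ns), #fanouts]
X = np.column_stack([
    rng.normal(0.05, 0.10, n_paths),
    rng.integers(3, 40, n_paths),
    rng.gamma(2.0, 0.05, n_paths),
    rng.gamma(2.0, 0.02, n_paths),
    rng.integers(1, 8, n_paths),
])

# Surrogate "signoff" slack: fast-timer slack plus a feature-dependent divergence.
y = X[:, 0] - 0.002 * X[:, 1] - 0.05 * X[:, 2] + rng.normal(0, 0.005, n_paths)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
print("mean |error| vs. signoff (ns):", np.abs(pred - y_te).mean())
print("mean |error| of raw fast timer (ns):", np.abs(X_te[:, 0] - y_te).mean())

A model of this kind would be retrained incrementally as production reports accumulate, in the spirit of the incremental training flow of Figure 3(a).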
Next Targets. [23] identifies two near-term extensions in the realm of timer analysis correlation. (1) PBA from GBA. Timing analysis pessimism is reduced with path-based analysis (PBA), at the cost of significantly greater runtime than traditional graph-based analysis (GBA). In GBA, worst (resp. best) transitions (for max (resp. min) delay analyses) are propagated at each pin along a timing path, leading to conservative arrival time estimates. PBA calculates path-specific transition and arrival times at each pin, reducing pessimism that can easily exceed a stage delay. Figure 4 shows the frequency distribution of endpoint slack pessimism in GBA. This pessimism harms the design flow: e.g., when GBA reports negative slack while PBA slack is positive, schedule and chip resources are wasted to fix false timing violations; when both GBA and PBA report negative slack, there is waste from over-fixing per the GBA report; etc. Similar considerations apply to accuracy requirements for prediction of PBA slack itself. (2) Prediction of timing at “missing corners”. Today’s signoff timing analysis is performed at 200+ corners, and even P&R and optimization steps of physical design must satisfy constraints at dozens of corners. [23] [24] note that prediction of STA results for one or more “missing” corners that are not analyzed, based on the STA reports for corners that are analyzed, corresponds to matrix completion in ML [6] – and that the outlook for this ML application is promising. An implicit challenge is to identify or synthesize the K timing corners that will enable the most accurate prediction of timing at all N production timing corners. Product teams can also inform foundries and library teams of these K corners, so that the corresponding timing libraries can be the first to be characterized.

Figure 4: Frequency distribution of ((PBA slack) − (GBA slack)) at endpoints of netcard, 28FDSOI.
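To make the matrix-completion analogy concrete, the following is a minimal, hypothetical sketch (not the formulation of [23] [24] or the algorithm of [6]): endpoint slacks form an (endpoints × corners) matrix that is observed only at analyzed corners, and an iterative truncated-SVD imputation recovers the unobserved entries under a low-rank assumption. The synthetic data, rank and observation pattern are illustrative only.

# Hypothetical sketch of "missing corner" prediction as low-rank matrix completion.
# Rows = timing endpoints, columns = corners; entries = slack. Synthetic data only.
import numpy as np

rng = np.random.default_rng(1)
n_endpoints, n_corners, rank = 400, 12, 3

# Low-rank ground truth: endpoint sensitivities x corner conditions.
truth = rng.normal(size=(n_endpoints, rank)) @ rng.normal(size=(rank, n_corners))

# Suppose only some corners are analyzed for each endpoint: mask out the rest.
observed = rng.random((n_endpoints, n_corners)) < 0.5
M = np.where(observed, truth, np.nan)

# Iterative SVD imputation (soft-impute flavor): fill, truncate to low rank, repeat.
filled = np.where(observed, M, 0.0)
for _ in range(100):
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    s[rank:] = 0.0                             # keep a rank-`rank` approximation
    low_rank = (U * s) @ Vt
    filled = np.where(observed, M, low_rank)   # keep observed entries fixed

err = np.abs(filled[~observed] - truth[~observed])
print("mean abs error on unobserved (endpoint, corner) slacks:", err.mean())

The practical questions noted above – which K corners to analyze, and with what accuracy guarantee – correspond to choosing the observation mask in such a formulation.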


(3) Other analysis correlations. There are numerous other analysis correlation opportunities for ML. Often, these are linked with the prediction of tool and flow outcomes discussed below. Examples include correlation across various “multiphysics” analysis trajectories or loops [7] [22], such as those involving voltage droop or temperature effects in combination with normal signal integrity-aware timing. And, prominent among many parasitic estimation challenges is the prediction of bump inductance as early as possible in the die-package codesign process [22].

3 MODELS OF TOOLS AND DESIGNS
Convergent, high-quality design requires accurate modeling and prediction of downstream flow steps and outcomes. Predictive models (e.g., of wirelength, congestion, timing, etc.) become objectives or guides for optimizations, via a “modeling stack” that reaches up to system, architecture, and even project and enterprise levels. There is an urgent, complementary need for improved methods to (i) identify structural attributes of design instances that determine flow outcomes, (ii) identify “natural structure” in netlists (cf. [37]), and (iii) construct synthetic design proxies (“eye charts”) [13] [25] [39] to help develop models of tools and flows. More broadly, tool and flow predictions are needed with increasing “span” across multiple design steps: the analogy is that we must predict what happens at the end of a longer and longer rope when the rope is wiggled.

Several examples of predictive models for tools and flows are reviewed in [23]. [8] demonstrates that learning-based models can accurately identify routing hotspots in detailed placement, and enable model-guided optimization whereby predicted routing hotspots are taken into account during physical synthesis with predictor-guided cell spreading. This addresses today’s horrific divergence between global routing and final detailed routing, which stems from constraints on placement and pin access. Figure 5 [8] illustrates the discrepancy between routing hotspots (DRCs) predicted from global routing congestion, versus actual post-detailed routing DRCs. False positives in the former mislead routability optimizations and cause unnecessary iterations back to placement, while false negatives lead to doomed detailed routing runs. As with all other PD-related ML efforts thus far, the model of [8] incorporates parameters identified through domain expertise and multiple phases of model development. (Reducing this dependence could be a long-term goal for the field.) The work of [15] combines several simple predictions of layout and timing changes to predict clock buffer placement ECOs that will best improve clock skew variation across multiple timing corners. The work of [7] uses model parameters extracted from netlist, netlist sequential graph, floorplan, and constraints to predict post-P&R timing slacks at embedded memory instance endpoints.

Figure 5: Post-route design rule violations (DRCs) predicted from global routing overflows (left); actual post-route DRCs (middle); overlay (right).
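The following toy sketch conveys the general shape of such a predictor (it is not the model of [8], whose features and training methodology are far more elaborate): a classifier labels global-routing cells as likely post-route DRC hotspots from a few early-stage features. Feature names, labels and data are synthetic stand-ins.

# Hypothetical sketch of hotspot prediction in the spirit of (but not reproducing) [8]:
# classify global-routing cells (GCells) as future DRC hotspots from early features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n_gcells = 20000

# Illustrative per-GCell features: routing overflow, pin density, cell density,
# local net count. A real model would use many more, domain-curated parameters.
X = np.column_stack([
    rng.poisson(1.0, n_gcells),
    rng.random(n_gcells),
    rng.random(n_gcells),
    rng.poisson(5.0, n_gcells),
])

# Surrogate label: a GCell becomes a post-route DRC hotspot when overflow and
# pin density are jointly high (plus noise) -- purely synthetic ground truth.
score = 0.8 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 0.5, n_gcells)
y = (score > 2.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)

pred = clf.predict(X_te)
# False positives trigger needless spreading; false negatives leave doomed routing runs.
print("precision:", precision_score(y_te, pred), "recall:", recall_score(y_te, pred))

The precision/recall balance matters here because, as noted above, false positives and false negatives carry different costs in the flow.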
There are two clear takeaways from these experiences. First, there has been no escape from the need for deep domain knowledge and multiple, “highly curated” phases of model development. Second, results provide some optimism for the prospect of tool and flow prediction, based on models of both tools and design instances. The three reviewed works give a progression of “longer ropes”: (i) from global/trial routing through detailed routing (and from ECO placement through incremental global/trial routing); (ii) from clock buffer and topology change through automated placement and routing ECOs, extraction, and timing analysis; and (iii) from netlist and floorplan information through placement, routing, optimization and IR drop-aware timing analysis.

Next Targets. [23] identifies two near-term targets for modeling of tools, flows and designs. (1) Predicting doomed runs. Substantial effort and schedule can be saved if a “doomed run” is avoided. Figure 6 shows four example progressions of the number of design rule violations during the (default) 20 iterations of a commercial router. Unsuccessful runs are those that end up with too many violations for manual fixing (e.g., the red and orange traces); these should be identified and terminated after as few iterations as possible. However, ultimately successful runs (e.g., the green trace) should be run to completion. Tool logfile data can be viewed as time series to which hidden Markov models [35] or policy iteration in Markov decision processes (MDPs) [4] may be applied. For the latter, collected logfiles from previous successful and unsuccessful tool runs can serve as the basis for automated extraction of a “blackjack strategy card” for a given tool, where “hit” analogizes to continuing the tool run for another iteration, and “stay” analogizes to terminating the tool run.4 (2) Feeding higher-level optimizations. As noted above, predictive models must provide new objectives and guidance for higher-level optimizations. [1] points out that the scope for application extends up to project- and enterprise-level schedule and resource optimizations, with substantial returns possible.

4 In the MDP paradigm, the state space used could consist of binned violation count and change in DRVs since a previous iteration; actions could be “go” or “stop”, and rewards at each state used to derive the policy could include a small negative reward for a non-stop state, a large positive reward for termination with low number of DRVs, etc.

Figure 6: Four example progressions of the number of design rule violations (shown as a base-2 logarithm) with iterations of a commercial detailed router.
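As a concrete (and purely illustrative) rendering of the MDP idea in footnote 4, the sketch below bins router iterations into (violation-count, trend) states, estimates “go” transitions from synthetic historical traces, and derives a go/stop card by value iteration. The state bins, rewards and fake traces are assumptions, not data from any actual router.

# Hypothetical sketch of the "blackjack strategy card" idea in footnote 4:
# derive a go/stop policy over binned router states via value iteration.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(3)

def drv_bin(drvs):              # bin violation count on a log2 scale, capped
    return min(int(np.log2(drvs + 1)), 10)

def trend_bin(delta):           # falling / flat / rising violation count
    return 0 if delta < -0.1 else (1 if delta <= 0.1 else 2)

def fake_trace():               # synthetic "historical logfile": DRVs per iteration
    drvs, decay = 2 ** rng.integers(8, 14), rng.uniform(0.4, 1.05)
    return [int(drvs * decay ** i) for i in range(20)]

traces = [fake_trace() for _ in range(500)]

# Empirical transition counts under action "go".
counts = defaultdict(lambda: defaultdict(int))
for t in traces:
    for i in range(1, len(t) - 1):
        s = (drv_bin(t[i]), trend_bin((t[i] - t[i - 1]) / max(t[i - 1], 1)))
        s2 = (drv_bin(t[i + 1]), trend_bin((t[i + 1] - t[i]) / max(t[i], 1)))
        counts[s][s2] += 1

states = sorted(set(counts) | {s2 for v in counts.values() for s2 in v})

def stop_reward(s):             # big reward for stopping with few DRVs, else nothing
    return 10.0 if s[0] <= 3 else 0.0

V = {s: 0.0 for s in states}

def q_go(s):                    # expected value of running one more iteration
    total = sum(counts[s].values())
    if total == 0:
        return -1.0             # no data: only the per-iteration cost
    return -1.0 + sum(n / total * V[s2] for s2, n in counts[s].items())

for _ in range(200):            # value iteration
    V = {s: max(stop_reward(s), q_go(s)) for s in states}

card = {s: ("stop" if stop_reward(s) >= q_go(s) else "go") for s in states}
for s in sorted(card):
    print(f"DRV bin {s[0]:2d}, trend {s[1]}: {card[s]}")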


(3) Other Modeling and Prediction Needs. A first direction for future tool and flow modeling is to add confidence levels and probabilities to predictions. There is a trajectory of prediction from “can be achieved” to “will achieve” to “will achieve within X resources with Y probability distribution”. A second direction is to improve the link between generation of data for model creation, and the model validation process. While physical design tools are not embedded in real-time, safety-critical contexts (i.e., impacts of poor modeling are likely limited to quality, cost and schedule), model accuracy must be as high as possible, as early as possible. Third, ML opportunities in physical design are clustered around “linchpin” flow steps: floorplan definition, logic synthesis, and handoff from placement to routing. For the logic synthesis step alone: since there is exactly one netlist handed off to implementation, what are the “magic” corners and constraints (including per-endpoint constraints [10]) that will induce the post-synthesis netlist that leads to the best final implementation? Fourth, additional opportunities lie in finding the fixed point of a chicken-egg loop, as noted in [7] [22]. An example challenge today is to predict the fixed point for (non-uniform) power distribution and post-P&R layout that meets signoff constraints with maximum utilization.

4 SOC IMPLEMENTATION: A VISION
Physical design tools and flows today are unpredictable. A root cause is that many complex heuristics have been accreted upon previous complex heuristics. Thus, tools have become unpredictable, particularly when they are forced to try hard. Figure 7 (left), from implementation of the PULPino low-power RISC-V core in a foundry 14nm enablement, shows that post-P&R area can change by 6% when target frequency changes by just 10MHz near the maximum achievable frequency. Figure 7 (right) illustrates that the statistics of this noisy tool behavior are Gaussian [32] [17]. Unpredictability of design implementation results in unpredictability of the design schedule. However, since product companies must strictly meet design and tapeout schedules, the design target (PPA) must be guardbanded, impacting product quality and profitability. Put another way: (i) our heuristics and tools are chaotic when designers demand best-quality results; and (ii) when designers want predictable results, they must aim low.

Figure 7: Left: SP&R implementation noise increases with target design quality. Right: Observed noise is essentially Gaussian.

SOC Design: Today. From Figure 7, a genesis of today’s SOC physical implementation methodology can be seen, as illustrated in Figure 8(a). The figure illustrates that with unpredictable optimizers, as well as the perceived loss of “global optimization” of solution quality when the design problem is partitioned, designers demand as close to flat methodologies as possible. Hence, today’s prevailing SOC methodology entails having as few large hard macros as possible. To satisfy this customer requirement in the face of Moore’s-Law scaling of design complexity, EDA tools must add more heuristics so as to turn around ever-larger blocks in the same turnaround time. To recover design quality (e.g., in light of “aim low”) designers seek as much flexibility as possible in their implementation tools.5 This leads to poor predictability in design, which then leads to more iterations, and turnaround times become longer. Further, the lack of predictability induces larger design guardbands. As a result of these cause-effect relationships, the achieved design quality worsens, and the design capability gap grows. This is the unfortunate tale of coevolution between physical design tools and physical implementation methodology.

5 A modern P&R tool has thousands of, and even more than ten thousand, command-option combinations.

SOC Design: Future. To close the design capability gap, EDA and IC design together must “flip the arrows” of Figure 8(a). A vision for future SOC design is suggested in Figure 8(b). The physical implementation challenge is decomposed into many more small subproblems, by hyperpartitioning or “extreme partitioning”; this reduces the time needed to solve any given subproblem, and smaller subproblems can be better-solved (see [33]). At the same time, increasing the number of design partitions without undue loss of global solution quality demands new placement, global routing and optimization algorithms, as well as fundamentally new RTL partition and floorplan co-optimization capabilities. Further, reducing design flexibility by giving designers “freedoms from choice” with respect to RTL constructs, power distribution, clock distribution, global buffering, non-default wiring rules, etc. would increase predictability, leading to fewer iterations (ideally, single-pass design). Turnaround time is then minimized. Improved predictability and fewer iterations result in smaller design guardbands. The end result: improvement of achieved design quality, which shrinks the design capability gap. As pointed out in [24], achieving this vision of future SOC design methodology would improve quality, schedule and cost – i.e., “the last scaling levers”. A number of new mindsets for tool developers and design flow engineers are implicit: (i) tools and flows should never return unexpected results; (ii) designers should see predictability, not chaos, in their tools and flows; (iii) cloud deployment and parallel search can help to preserve or improve achieved quality of results; and (iv) the focus of design-based equivalent scaling is on sustained reduction of design time and design effort.

Figure 8: SOC design (a) today, and (b) in the future.

5 A ROADMAP FOR ML IN PD
This section describes a “roadmap” for the insertion of ML within and around physical design flow steps. Four high-level stages of insertion are described. Then, a list of specific, actionable challenges is given.

Four Stages of ML Insertion

Insertion of ML into and around physical design algorithms, tools and flows could be divided into four qualitatively distinct stages. Figure 9(a) conveys why IC implementation and design resource requirements are so challenging: there are thousands of potential options at each flow step (don’t-use cells, timing constraints, pin placements, density screens, allowed netlist transforms, alternate commands-options and environment variables, ...), resulting in an enormous tree of possible flow trajectories. Today, even identifying a “best” among alternative post-synthesis netlists or physical floorplans to carry forward in the flow is beyond the grasp of human engineers. Thus, the likely first stage of ML insertion into IC will entail creating robots: mechanizing and automating (via expert systems, perhaps) 24/7 replacements for human engineers that reliably execute a given flow to completion.6 Figure 10 shows how primitive “multi-armed bandit” (MAB) sampling can achieve resource-adaptive commercial synthesis, place and route with no human involvement – in a “robotic” manner that is distinct from expert systems approaches. Past tool run outcomes are used by the MAB to estimate the probability of meeting constraints at different parameter settings; future runs are then scheduled that are most likely to yield the best outcomes within the given (licenses × schedule) design resource budget. The figure shows the evolution of sampled frequencies versus iterations in the MAB’s “robotic” execution.

6 This goes beyond today’s typical ‘make chip’ flow automation in that real expertise and human-seeming smarts are captured within the robot engineer. As discussed below, robots will likely also fill in last-mile or small-market tasks that are unserved by available tools.

Figure 9: (a) Tree of options at flow steps. (b) Phases of ML insertion into production IC implementation.

Figure 10: Trajectory of “no-human-in-the-loop” multi-armed bandit sampling of a commercial SP&R flow, with 40 iterations and 5 concurrent samples (tool runs) per iteration. Testcase: PULPino core in 14nm foundry technology, with given power and area constraints. Adapted from [21].
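The sketch below shows the kind of multi-armed bandit loop this describes, reduced to its simplest form (it is not the system of [21]): Thompson sampling over a handful of candidate target frequencies, with a Bernoulli “constraints met” outcome standing in for a full SP&R run. The arms, success probabilities and run_tool stub are assumptions for illustration.

# Hypothetical sketch of multi-armed bandit sampling over tool settings, in the
# spirit of the no-human-in-the-loop flow described above (not the system of [21]).
import numpy as np

rng = np.random.default_rng(4)

# Arms: candidate target frequencies (MHz) for the SP&R flow.
arms = [500, 520, 540, 560, 580]
# Hidden, synthetic probability that a run at each setting meets all constraints.
true_p = [0.95, 0.80, 0.50, 0.20, 0.05]

def run_tool(arm_idx):
    """Stand-in for launching one SP&R run; returns True if constraints are met."""
    return rng.random() < true_p[arm_idx]

# Thompson sampling with Beta posteriors over each arm's success probability.
successes = np.ones(len(arms))   # Beta(1, 1) priors
failures = np.ones(len(arms))

n_iterations, runs_per_iteration = 40, 5   # budget: licenses x schedule
for it in range(n_iterations):
    for _ in range(runs_per_iteration):
        samples = rng.beta(successes, failures)   # sample a belief per arm
        arm = int(np.argmax(samples))             # schedule the most promising run
        if run_tool(arm):
            successes[arm] += 1
        else:
            failures[arm] += 1

est = successes / (successes + failures)
for f, e in zip(arms, est):
    print(f"{f} MHz: estimated P(meet constraints) = {e:.2f}")

A production version would of course reward meeting constraints at the most aggressive feasible settings, rather than mere feasibility, and would respect actual license and schedule limits.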
Once a robot engineer exists, the second stage of ML insertion is to optimally orchestrate N robot engineers that concurrently search multiple flow trajectories, where N can range from tens to thousands and is constrained chiefly by compute and license resources. Here, simple multistart, or depth-first or breadth-first traversal of the tree of flow options, is hopeless. Rather, it seems likely that strategies such as “go-with-the-winners” (GWTW) [2] will be applied. GWTW launches multiple optimization threads, and periodically identifies and clones the most promising thread while terminating other threads; see Figure 11(a). The GWTW method has been applied successfully in, e.g., [26]. Another promising direction may be adaptive multistart [5] [14], which exploits an inherent “big valley” structure in optimization cost landscapes to adaptively identify promising start configurations for iterative optimization. This is illustrated in Figure 11(b), where better start points for optimization are identified based on the structure of (locally-minimal) solutions found from previous start points.

Figure 11: (a) Go-with-the-winners [2]. (b) Adaptive multistart in a “big valley” optimization landscape [5] [14].
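A minimal, generic rendering of the GWTW orchestration idea (not the algorithm of [2] or the application in [26]) is sketched below on a toy cost landscape: several optimization threads run concurrently, and at each synchronization point the most promising half are cloned while the rest are terminated. The cost function, move operator and thread counts are placeholders for real flow runs and QOR metrics.

# Hypothetical sketch of go-with-the-winners (GWTW) orchestration on a toy
# optimization problem; the cost function and move set stand in for full flow runs.
import numpy as np

rng = np.random.default_rng(5)
DIM = 20

def cost(x):
    """Toy 'big valley' landscape: quadratic bowl plus local ripple."""
    return float(np.sum(x ** 2) + 2.0 * np.sum(np.sin(3.0 * x) ** 2))

def local_step(x):
    """One iteration of a thread: a small random perturbation, kept if better."""
    cand = x + rng.normal(0, 0.1, size=DIM)
    return cand if cost(cand) < cost(x) else x

n_threads, n_rounds, iters_per_round = 8, 10, 50
threads = [rng.uniform(-3, 3, size=DIM) for _ in range(n_threads)]

for rnd in range(n_rounds):
    # Let every thread run for a while.
    for _ in range(iters_per_round):
        threads = [local_step(x) for x in threads]
    # GWTW step: rank threads, clone the best half over the worst half.
    order = np.argsort([cost(x) for x in threads])
    winners = [threads[i] for i in order[: n_threads // 2]]
    threads = winners + [w.copy() + rng.normal(0, 0.05, size=DIM) for w in winners]
    print(f"round {rnd}: best cost = {cost(threads[0]):.3f}")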


The third stage will integrate prediction of tool- and design-specific outcomes over longer and longer subflows, so as to more surgically prune, terminate, or otherwise not waste design resources on less-promising flow trajectories. Implicit in the third stage is the improvement of predictability and modelability for PD heuristics and EDA tools. Finally, the fourth stage will span from reinforcement learning to “intelligence”. At this stage, there are many obstacles. For example, the latency and unpredictability of IC design tool runs (we can’t play the IC design game hundreds of millions of times in a few days, as we would the game of chess), the sparsity of data (there are millions of cat and dog faces on the web, but not many 10nm layouts), the lack of good evaluation functions, and the huge space of trajectories for design all look to be difficult challenges. Hopefully, aspects of the vision for future SOC design given above, and solutions to the initial challenges given below, will provide help toward realization of the fourth stage.

Specific Initial Challenges
Following are several specific “initial challenges” for machine learning in physical design.

“Last-Mile” Robots. A number of today’s time-consuming, error-prone and even trial-and-error steps in IC implementation should be automated by systems that systematically search for tool command sequences, and/or observe and learn from humans. (1) Automation of manual DRC violation fixing. After routing and optimization, P&R tools leave DRC violations due to inability to handle the latest foundry rules, unavoidable lack of routing resource in a high-utilization block, etc. PD engineers today must spread cells and perform rip-up and reroute manually. (2) Automation of manual timing closure steps. After routing and optimization, several thousand violations of maxtrans, setup and hold constraints may exist. PD engineers today fix these manually at the rate of several hundred per day per engineer. (3) Placement of memory instances in a P&R block. (4) Package layout automation. The ML challenge is to be able to assess the post-routed quality (e.g., with respect to bump inductances) of floorplan and pin map in die-package codesign. From this will flow bump/ball placement and placement improvement; a possible prerequisite is the automation of manual package routing.

Improving Analysis Correlation. (1) Prediction of the worst PBA path. For a given endpoint, the worst PBA path is not necessarily among the top k GBA paths: CCS loads on side fanouts, path topology and composition, GBA common path pessimism removal, etc. all affect the rank correlation between GBA and PBA results of timing paths. (2) Prediction of the worst PBA slack per endpoint, from GBA analysis, e.g., from all GBA endpoint slacks. (3) Prediction of timing at “missing corners”. Given timing analysis reports at k corners, predict reports at N − k corners, where k << N. Similarly: given a prediction accuracy requirement, find k << N corners, with k as small as possible, that enable prediction of remaining corners with the required accuracy. (4) Closing of multiphysics analysis loops. I.e., as in [22] [7], with early priorities being vectorless dynamic IR drop and power-temperature loops. (5) Continued improvement of timing correlation and estimation as in [16] [30]. Matching the golden tool earlier in the flow will more accurately drive optimizations and reduce ECO iterations.

Predictive Models of Tools and Designs. (1) Prediction of the convergent point for non-uniform PDN and P&R. The PDN is defined before placement, but power analysis and routability impact can be assessed only after routing. (2) Estimation of the PPA response of a given block in response to floorplan optimizations. Final PPA impacts of feedthroughs, shape, utilization, memory placement, etc. must be comprehended to enable floorplan assessment and optimization (within a higher-level exploration of design partitioning/floorplanning solutions). (3) Estimation of useful skew impact on post-route WNS, TNS metrics. See, e.g., [10]. A low-level related challenge: predicting buffer locations to optimize both common paths and useful skew. (4) “Auto-magic” determination of constraints for a given netlist, for given performance and power targets – i.e., best settings for maxtrans, maxcap, clock uncertainty, etc. at each flow stage. More generally, determine “magic” corners and constraints that will produce the best netlist to send into P&R. (5) Prediction of the best “target sequence” of constraints through layout optimization phases. I.e., timing and power targets at synthesis, placement, etc. such that best final PPA metrics are achieved. (6) Prediction of impacts (setup, hold slack, max transition, power) of an ECO, across MCMM scenarios. (7) Prediction of the “most-optimizable” cells during design closure. Many optimization steps are wasted on instances that cannot be perturbed due to placement, timing, power and other context. (8) Prediction of divergence (detouring, timing/slew violations) between trial/global route and final detailed route. (9) Prediction of “doomed runs” at all steps of the physical design flow.

And More. (1) Infrastructure for ML in IC design. Standards for model encapsulation, model application, IP-preserving model sharing, etc. are yet to be developed. (2) Standard ML platform for EDA modeling. Enablement of design metrics collection, tool and flow model generation, design-adaptive tool and flow configuration, prediction of tool and flow outcomes, etc. would realize the original vision of METRICS [36] [11] [31]. (3) Development of more modelable algorithms and tools with smoother, less-chaotic outcomes than present methods. (4) Development of datasets to support ML. This spans new classes of artificial circuits and “eyecharts”, as well as sharing of training data and the data generation task across different design organizations.

6 ACKNOWLEDGMENTS
Many thanks are due to Dr. Tuck-Boon Chan, Dr. Jiajia Li, Dr. Siddhartha Nath, Dr. Stefanus Mantik, Dr. Kambiz Samadi, Dr. Kwangok Jeong, Ms. Hyein Lee and Mr. Wei-Ting Jonas Chan who, along with current ABKGroup students and collaborators, performed much of the research cited in this paper. I thank Professor Lawrence Saul for ongoing discussions and collaborations. Permission of coauthors to reproduce figures from works referenced here is gratefully acknowledged. Research at UCSD is supported by NSF, Qualcomm, Samsung, NXP, Mentor Graphics and the C-DEN center.

REFERENCES
[1] P. Agrawal, M. Broxterman, B. Chatterjee, P. Cuevas, K. H. Hayashi, A. B. Kahng, P. K. Myana and S. Nath, “Optimal Scheduling and Allocation for IC Design Management and Cost Reduction”, ACM TODAES 22(4) (2017), pp. 60:1-60:30.
[2] D. Aldous and U. Vazirani, “Go With the Winners”, Proc. IEEE Symp. on Foundations of Computer Science, 1994, pp. 492-501.
[3] S. Bansal and R. Goering, “Making 20nm Design Challenges Manageable”, http://www.chipdesignmag.com/pdfs/chip_design_special_DAC_issue_2012.pdf
[4] D. Bertsekas, Dynamic Programming and Optimal Control, Athena, 1995.
[5] K. D. Boese, A. B. Kahng and S. Muddu, “New Adaptive Multistart Techniques for Combinatorial Global Optimizations”, Operations Research Letters 16(2) (1994), pp. 101-113.
[6] E. J. Candes and B. Recht, “Exact Matrix Completion via Convex Optimization”, Foundations of Computational Mathematics 9 (2009), pp. 717-772.
[7] W.-T. J. Chan, K. Y. Chung, A. B. Kahng, N. D. MacDonald and S. Nath, “Learning-Based Prediction of Embedded Memory Timing Failures During Initial Floorplan Design”, Proc. ASP-DAC, 2016, pp. 178-185.
[8] W.-T. J. Chan, P.-H. Ho, A. B. Kahng and P. Saxena, “Routability Optimization for Industrial Designs at Sub-14nm Process Nodes Using Machine Learning”, Proc. ISPD, 2017, pp. 15-21.
[9] T.-B. Chan, A. B. Kahng, J. Li and S. Nath, “Optimization of Overdrive Signoff”, Proc. ASP-DAC, 2013, pp. 344-349.
[10] T.-B. Chan, A. B. Kahng and J. Li, “NOLO: A No-Loop, Predictive Useful Skew Methodology for Improved Timing in IC Implementation”, Proc. ISQED, 2014, pp. 504-509.
[11] S. Fenstermaker, D. George, A. B. Kahng, S. Mantik and B. Thielges, “METRICS: A System Architecture for Design Process Optimization”, Proc. DAC, 2000, pp. 705-710.
[12] R. Goering, “What’s Needed to “Fix” Timing Signoff?”, DAC Panel, 2013.
[13] P. Gupta, A. B. Kahng, A. Kasibhatla and P. Sharma, “Eyecharts: Constructive Benchmarking of Gate Sizing Heuristics”, Proc. DAC, 2010, pp. 597-602.
[14] L. Hagen and A. B. Kahng, “Combining Problem Reduction and Adaptive Multi-Start: A New Technique for Superior Iterative Partitioning”, IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems 16(7) (1997), pp. 709-717.
[15] K. Han, A. B. Kahng, J. Lee, J. Li and S. Nath, “A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Skew Variation Reduction”, Proc. DAC, 2015, pp. 26:1-26:6.
[16] S. S. Han, A. B. Kahng, S. Nath and A. Vydyanathan, “A Deep Learning Methodology to Proliferate Golden Signoff Timing”, Proc. DATE, 2014, pp. 260:1-260:6.
[17] K. Jeong and A. B. Kahng, “Methodology From Chaos in IC Implementation”, Proc. ISQED, 2010, pp. 885-892.
[18] K. Jeong, A. B. Kahng and K. Samadi, “Impacts of Guardband Reduction on Design Process Outcomes: A Quantitative Approach”, IEEE Trans. Semiconductor Manufacturing 22(4) (2009), pp. 552-565.
[19] A. B. Kahng, “The Cost of Design”, IEEE Design & Test of Computers, 2002.
[20] A. B. Kahng, “The ITRS Design Technology and System Drivers Roadmap: Process and Status”, Proc. DAC, 2013, pp. 34-39.
[21] A. B. Kahng, DARPA IDEA Workshop presentation, Arlington, April 2017.
[22] A. B. Kahng, ANSYS Executive Breakfast keynote talk, June 2017. http://vlsicad.ucsd.edu/Presentations/talk/Kahng-ANSYS-DACBreakfast_talk_DISTRIBUTED2.pdf
[23] A. B. Kahng, “New Directions for Learning-Based IC Design Tools and Methodologies”, Proc. ASP-DAC, 2018, pp. 405-410.
[24] A. B. Kahng, “Quality, Schedule, and Cost: Design Technology and the Last Semiconductor Scaling Levers”, keynote talk, ASP-DAC, 2018. http://vlsicad.ucsd.edu/ASPDAC18/ASP-DAC-2018-Keynote-Kahng-POSTED.pptx
[25] A. B. Kahng and S. Kang, “Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions”, Proc. ISPD, 2012, pp. 153-160.
[26] A. B. Kahng, S. Kang, H. Lee, I. L. Markov and P. Thapar, “High-Performance Gate Sizing with a Signoff Timer”, Proc. ICCAD, 2013, pp. 450-457.
[27] A. B. Kahng, S. Kang, H. Lee, S. Nath and J. Wadhwani, “Learning-Based Approximation of Interconnect Delay and Slew in Signoff Timing Tools”, Proc. SLIP, 2013, pp. 1-8.
[28] A. B. Kahng, B. Lin and S. Nath, “Enhanced Metamodeling Techniques for High-Dimensional IC Design Estimation Problems”, Proc. DATE, 2013, pp. 1861-1866.
[29] A. B. Kahng, B. Lin and S. Nath, “High-Dimensional Metamodeling for Prediction of Clock Tree Synthesis Outcomes”, Proc. SLIP, 2013, pp. 1-7.
[30] A. B. Kahng, M. Luo and S. Nath, “SI for Free: Machine Learning of Interconnect Coupling Delay and Transition Effects”, Proc. SLIP, 2015, pp. 1-8.
[31] A. B. Kahng and S. Mantik, “A System for Automatic Recording and Prediction of Design Quality Metrics”, Proc. ISQED, 2001, pp. 81-86.
[32] A. Kahng and S. Mantik, “Measurement of Inherent Noise in EDA Tools”, Proc. ISQED, 2002, pp. 206-211.
[33] A. Katsioulas, S. Chow, J. Avidan and D. Fotakis, “Integrated Circuit Architecture with Standard Blocks”, U.S. Patent 6,467,074, 2002.
[34] C. W. Moon, P. Gupta, P. J. Donehue and A. B. Kahng, “Method of Designing a Digital Circuit by Correlating Different Static Timing Analyzers”, U.S. Patent 7,823,098, 2010.
[35] L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition”, Proc. IEEE 77 (1989), pp. 257-286.
[36] The GSRC METRICS Initiative. http://vlsicad.ucsd.edu/GSRC/metrics/
[37] Partitioning- and Placement-based Intrinsic Rent Parameter Evaluation. http://vlsicad.ucsd.edu/WLD/RentCon.pdf
[38] “DARPA Rolls Out Electronics Resurgence Initiative”, https://www.darpa.mil/news-events/2017-09-13
[39] Gate Sizing Benchmarks With Known Optimal Solution. http://vlsicad.ucsd.edu/SIZING/bench/artificial.html
[40] International Technology Roadmap for Semiconductors. http://www.itrs2.net/itrs-reports.html

