Ma et al. (2020), Understanding Graphs in EDA: From Shallow to Deep Learning
Figure 2: Examples of routing graphs. (a) Channel model; (b) Channel graph; (c) Grid model; (d) Grid graph.

Figure 3: An example of a decomposition graph in the layout decomposition problem. (a) Layout pattern; (b) Decomposition graph; (c) Decomposition graph with stitch edges.
Figure 4: Graph neural network with k layers for embedding generation. The obtained embedding is fed to downstream tasks.

A Gaussian process regression-based active learning flow is proposed for high-performance adder design space exploration based on a prefix graph representation [16]. A timing failure prediction technique is proposed in [53], given the information of the netlist, timing constraints, and floorplan. In the place-and-route (P&R) stage, routability is a critical issue with a large impact on the final quality. Some previous works investigate routability estimation with a particular routing graph model based on manually extracted features. Qi et al. [54] and Zhou et al. [55] applied multivariate adaptive regression splines (MARS) to detailed routing congestion estimation. Pui et al. [17] proposed a hierarchical hybrid model for congestion estimation in FPGA placement, which consists of linear regression and support vector regression.

3 GRAPH REPRESENTATION WITH DEEP LEARNING METHODOLOGIES

The main challenge of conducting learning algorithms on graphs is how to encode the structural information of graphs, which has been intensively investigated in the machine learning community. In this section, we introduce a range of graph representation methods based on neural networks and highlight some challenges in applying them to EDA applications.

3.1 Graph Learning with Neural Networks

Graph-based learning is a new approach to machine learning with a wide range of applications [56]. Before performing a certain task, a representation of each node or of the whole graph should be obtained first; this representation is known as an embedding and can be fed to downstream models, as shown in Figure 4.
The blossoms of deep learning in various disciplines have promoted the application of neural networks in graph learning. The seminal works on graph neural networks (GNNs) are mostly inspired by recurrent neural networks, where nodes recurrently exchange information with adjacent nodes until a stable state is reached. Formally, the hidden state of a node v is updated recurrently by

    h_v^t = Σ_{u ∈ N_v} f(x_v, x_uv, h_u^{t−1}),    (1)

where h_v^t is the hidden state of node v at time step t, N_v the set of adjacent vertices of v, x_v the feature of node v, x_uv the feature of edge uv, and f the parametric function for local state transition, which should be carefully designed (specifically, as a contraction map [57]) to ensure convergence. Despite their conceptual significance, public interest in recurrent GNNs remained limited due to the restricted expressive power of a contractive operation and the heavy computational burden of iterating to equilibrium.
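As a concrete illustration of this fixed-point view (our sketch, not code from the paper), the update of Equation (1) can be iterated in a few lines of NumPy. The tanh-affine transition and the weight names W_n, W_e, and W_h are placeholder assumptions standing in for the parametric function f:

```python
import numpy as np

def recurrent_gnn_states(adj, x_node, x_edge, W_n, W_e, W_h,
                         tol=1e-6, max_iter=200):
    """Iterate h_v = sum_{u in N_v} f(x_v, x_uv, h_u) to a fixed point.
    Here f is a toy tanh-affine map; it is contractive (so the loop
    provably converges) only when ||W_h|| times the maximum degree
    stays below 1, mirroring the contraction-map requirement on f."""
    n, d = adj.shape[0], W_h.shape[0]
    h = np.zeros((n, d))
    for _ in range(max_iter):
        h_new = np.zeros_like(h)
        for v in range(n):
            for u in np.nonzero(adj[v])[0]:   # u ranges over N_v
                h_new[v] += np.tanh(W_n @ x_node[v]
                                    + W_e @ x_edge[v, u]
                                    + W_h @ h[u])
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    return h
```

The nested loop over all edges at every iteration is exactly the heavy computational burden noted above: each step costs O(|E| · d), and many steps may be needed to approach the equilibrium.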
Given the drawbacks of recurrent GNNs, the emergence of graph convolutional networks (GCNs) is no surprise. Taxonomically, GCNs fall into two categories, viz., spectral-based and spatial-based: the former builds on graph spectral analysis, while the latter inherits the message-passing paradigm of recurrent GNNs. We introduce the two approaches in the following paragraphs.

In the spectral approaches, graph convolution is defined [58] in the Fourier domain, where the eigendecomposition of the graph Laplacian is computed. Specifically, the graph Laplacian is defined as L = I − D^{−1/2} A D^{−1/2} = V Λ V^⊤, where I is the identity matrix, and D and A are the degree matrix and adjacency matrix of the graph, respectively. Let g_θ : R → R be a filter defined on the graph spectrum Λ and f : V → R be the node features; graph convolution is then given by

    g_θ ∗ f = V (g_θ(Λ) ⊙ V^⊤ f),    (2)

which respects the Convolution Theorem. Equivalently, we write g = diag(g_θ(Λ)), and a convolutional layer with multiple (f_l) channels is defined by

    H^{(l+1)} = σ( Σ_{i=1}^{f_l} V (g_i V^⊤ H^{(l)}) ),    (3)

where H^{(l)} is the output of the previous convolutional layer, with H^{(0)} := X the collection of node features, g_i the i-th trainable filter, and σ(·) a nonlinear activation function. Note that filters defined in the spectral domain may not be localized, which can be alleviated with smoothing techniques [58]. Further, the computational complexity of this line of methods has been reduced through approximation and simplification [22, 59].
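To make Equations (2) and (3) concrete, the following NumPy sketch (ours; names such as spectral_conv and g_filters are assumptions) performs one spectral convolutional layer by explicitly eigendecomposing the normalized Laplacian:

```python
import numpy as np

def spectral_conv(A, H, g_filters, sigma=np.tanh):
    """One spectral layer per Equations (2)-(3). Each g in g_filters is
    a callable on the eigenvalue vector, playing the role of a trainable
    filter g_i = diag(g_theta(Lambda))."""
    deg = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    L = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt  # L = I - D^-1/2 A D^-1/2
    lam, V = np.linalg.eigh(L)                            # L = V diag(lam) V^T
    # Filter in the Fourier basis and sum over channels (Equation (3)).
    out = sum(V @ np.diag(g(lam)) @ V.T @ H for g in g_filters)
    return sigma(out)
```

The O(n^3) eigendecomposition and the dense multiplications by V are precisely the costs that the approximation and simplification techniques of [22, 59] are designed to avoid.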
Figure 5: Example of a 2-layer aggregation-and-encoding network generating the embedding of node 1, with encoding dimensions d1 at layer l = 1 and d2 at layer l = 2.
Spatial-based graph convolutions are defined based on the spatial relationship of nodes, where information is propagated and aggregated in a message-passing scheme. A representative work is GraphSAGE [23], which can generate node embeddings by leveraging node feature information from the neighborhood. The fundamental procedure consists of two steps, i.e., aggregation and encoding, which can be formulated as

    h_{N(v)}^{(l)} ← AGGREGATE_l({h_u^{(l−1)}, ∀u ∈ N(v)}),    (4)

    h_v^{(l)} ← σ(W^{(l)} · h_{N(v)}^{(l)}),    (5)

where AGGREGATE_l in Equation (4) is the aggregation function applied to node v and its neighborhood N(v), and Equation (5) is the encoding operation, consisting of an embedding projection and a nonlinear activation. An example illustrating a 2-layer network for generating the embedding of node 1 is depicted in Figure 5, with encoding dimensions d1 and d2 in layers l1 and l2, respectively. Specifically, if the mean function is selected as the aggregation function, the aggregation in layer l is equivalent to Equation (6), with rows and columns indexed by nodes 1 to 5 of Figure 5:

    H_{N(v)}^{(l)} = A · H^{(l−1)}

        ⎡ 1    w1   w1   w1   0  ⎤   ⎡ h_1^{(l−1)} ⎤
        ⎢ w2   1    0    0    w1 ⎥   ⎢ h_2^{(l−1)} ⎥
      = ⎢ w2   0    1    0    0  ⎥ × ⎢ h_3^{(l−1)} ⎥ ,    (6)
        ⎢ w2   0    0    1    0  ⎥   ⎢ h_4^{(l−1)} ⎥
        ⎣ 0    w2   0    0    1  ⎦   ⎣ h_5^{(l−1)} ⎦

where A is the adjacency matrix of the graph, H^{(l)} contains the embeddings of every node in the graph, and w1 and w2 are the weights for input edges and output edges, respectively. The two-step process can then be carried out with a single matrix computation:

    H^{(l)} = σ((A · H^{(l−1)}) · W^{(l)}).    (7)

Note that random walk is often introduced as a sampling technique. For more representative work, see [60–62]. We refer readers to [57] for a more comprehensive survey of graph learning.
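As a minimal sketch of Equations (4) to (7) (our code, not the paper's; the w1/w2 edge weights of Equation (6) are replaced by a uniform neighborhood mean for simplicity):

```python
import numpy as np

def sage_layer(A, H, W, sigma=np.tanh):
    """One aggregation-and-encoding step in matrix form, Equation (7):
    H_next = sigma((A_hat @ H) @ W), where A_hat is the adjacency matrix
    with self-loops, row-normalized so that aggregation is a mean over
    each node's neighborhood (including the node itself)."""
    A_hat = A + np.eye(A.shape[0])
    A_hat = A_hat / A_hat.sum(axis=1, keepdims=True)
    return sigma(A_hat @ H @ W)

# Two stacked layers reproduce the Figure 5 pipeline for all nodes at once:
#   H1 = sage_layer(A, X, W1)    # d0 -> d1  (layer l = 1)
#   H2 = sage_layer(A, H1, W2)   # d1 -> d2  (layer l = 2); row v = embedding of v
```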
3.2 Challenges in EDA Applications

3.2.1 Scalability. Unlike conventional graph learning tasks, graph learning for EDA problems is prone to runtime overhead, considering that the scale of circuits keeps soaring. Similar to conventional CNNs, the most time-consuming process in the computation of a GCN is the embedding generation. To tackle the issue of scalability, several attempts have been made toward efficient graph representation learning. In [63], a forward computation method with personalized PageRank is investigated to incorporate neighborhood features without an aggregation procedure. Besides, it has been pointed out that the inefficiency may be caused by duplicated computation under the GraphSAGE-like framework [64]. To address this, PinSAGE [64] is proposed to select important neighbors by random walk instead of aggregating all the neighbors, and a MapReduce pipeline is leveraged to maximize the inference throughput of a trained model. Recently, GraphZoom [65], a multi-level spectral framework, is proposed to improve both the accuracy and the scalability of unsupervised graph embedding algorithms. In addition to designing specific algorithms and models, there are also third-party libraries like DGL [66] that help users make their networks scalable.
ing that the scale of circuits keeps soaring. Similar to conventional
that the graph structure is fixed regardless of potential change on
CNNs, the most time-consuming process in the computation of a
node representations during training, which requires a constant
GCN is the embedding generation. To tackle the issue of scalability,
number of edges for approximation.
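A direct NumPy transcription of Equation (8) (our sketch; the name hypergraph_conv and the default uniform hyperedge weights are assumptions) looks as follows:

```python
import numpy as np

def hypergraph_conv(Q, H, P, w=None, sigma=np.tanh):
    """One hypergraph convolution layer per Equation (8):
    H_next = sigma(D^-1/2 Q W B^-1 Q^T D^-1/2 H P), with Q the |V| x |E|
    incidence matrix, W the diagonal hyperedge weights, B the hyperedge
    degree matrix, and D the (weighted) vertex degree matrix."""
    n_v, n_e = Q.shape
    w = np.ones(n_e) if w is None else w
    W = np.diag(w)
    B_inv = np.diag(1.0 / Q.sum(axis=0))   # assumes no empty hyperedge
    d = Q @ w                              # weighted vertex degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    return sigma(D_inv_sqrt @ Q @ W @ B_inv @ Q.T @ D_inv_sqrt @ H @ P)
```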
To reduce the complexity and improve efficiency, the transformation procedure can be shrunk by selection: instead of keeping all the edges generated by clique expansion, only a few normal edges are selected. The selection criterion is based on the assumption that nodes in the same hyperedge share similar features. Intuitively, an edge can be omitted if the representations of its two endpoints are already similar during training, while nodes with relatively distinct representations should remain connected. Accordingly, a criterion proposed in [69] selects, within each hyperedge e, the node pair with the maximal feature difference:

    (i, j) = arg max_{i,j ∈ e} ||h_i − h_j||_2,    (9)

where h_i and h_j are the features of nodes i and j, respectively. By connecting (i, j) as well as {(i, u), (j, u) : u ∈ S_e}, where each u ∈ S_e is called a "mediator", the complexity is reduced to O(k). Under this scheme, the graph structure changes dynamically during training, since the node representations keep being updated. [70] also proposes a fast selection criterion that compares the input features of the nodes and selects the edges once at the beginning, so that the graph structure is fixed regardless of changes in node representations during training; it likewise requires only a constant number of edges per hyperedge for the approximation.
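The criterion of Equation (9) amounts to one argmax per hyperedge; a short sketch (ours) of the selection step:

```python
import numpy as np
from itertools import combinations

def max_disparity_pair(hyperedge, H):
    """Equation (9): return the node pair of a hyperedge whose current
    representations differ the most; in [69] this pair is connected and
    the remaining nodes of the hyperedge act as mediators."""
    return max(combinations(hyperedge, 2),
               key=lambda ij: np.linalg.norm(H[ij[0]] - H[ij[1]]))
```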
3.2.3 Heterogeneous Graphs. Most of the existing works on graph representation focus on homogeneous graphs, in which all vertices are of the same type and all edges represent only one kind of relation. However, graphs in EDA can be constructed in a heterogeneous manner, in which there may exist different types of nodes and edges. For example, in the multiple patterning lithography problem, a typical graph contains two types of edges, stitch edges and conflict edges, as shown in Figure 3(c). A conflict edge implies that the two patterns it connects cannot be assigned to the same mask.
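A common recipe for such two-relation graphs, in the spirit of relational GCNs [72], keeps a separate adjacency matrix and projection per edge type and sums the per-type messages. The sketch below is our illustration under that assumption, not a method from the paper:

```python
import numpy as np

def hetero_layer(adj_by_type, H, W_by_type, sigma=np.tanh):
    """Relation-aware aggregation: adj_by_type maps an edge type (e.g.
    'conflict', 'stitch') to its adjacency matrix, and W_by_type maps the
    same type to its own projection; typed messages are mean-aggregated
    per relation and summed before the nonlinearity."""
    d_out = next(iter(W_by_type.values())).shape[1]
    out = np.zeros((H.shape[0], d_out))
    for etype, A in adj_by_type.items():
        deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid div by 0
        out += (A / deg) @ H @ W_by_type[etype]
    return sigma(out)
```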
Figure 8: Visualization of node embeddings (negative vs. positive) with different search depths L. (a) L = 1; (b) L = 2; (c) L = 3.
[15] W.-H. Chang, L.-D. Chen, C.-H. Lin, S.-P. Mu, M. C.-T. Chao, C.-H. Tsai, and Y.-C. Chiu, "Generating routing-driven power distribution networks with machine-learning technique," in Proc. ISPD, 2016, pp. 145–152.
[16] Y. Ma, S. Roy, J. Miao, J. Chen, and B. Yu, "Cross-layer optimization for high speed adders: A Pareto driven machine learning approach," IEEE TCAD, vol. 38, no. 12, pp. 2298–2311, 2018.
[17] C.-W. Pui, G. Chen, Y. Ma, E. F. Y. Young, and B. Yu, "Clock-aware ultrascale FPGA placement with machine learning routability prediction," in Proc. ICCAD, 2017, pp. 929–936.
[18] H. Yang, S. Li, Y. Ma, B. Yu, and E. F. Y. Young, "GAN-OPC: Mask optimization with lithography-guided generative adversarial nets," in Proc. DAC, 2018, pp. 131:1–131:6.
[19] H. Yang, J. Su, Y. Zou, Y. Ma, B. Yu, and E. F. Y. Young, "Layout hotspot detection with feature tensor generation and deep biased learning," IEEE TCAD, vol. 38, no. 6, pp. 1175–1187, 2019.
[20] Z. Xie, Y.-H. Huang, G.-Q. Fang, H. Ren, S.-Y. Fang, Y. Chen, and J. Hu, "RouteNet: Routability prediction for mixed-size designs using convolutional neural network," in Proc. ICCAD, 2018, pp. 80:1–80:8.
[21] H. Yang, P. Pathak, F. Gennari, Y.-C. Lai, and B. Yu, "DeePattern: Layout pattern generation with transforming convolutional auto-encoder," in Proc. DAC, 2019, pp. 148:1–148:6.
[22] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," in Proc. ICLR, 2016.
[23] W. Hamilton, Z. Ying, and J. Leskovec, "Inductive representation learning on large graphs," in Proc. NIPS, 2017, pp. 1024–1034.
[24] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, "Graph convolutional neural networks for web-scale recommender systems," in Proc. KDD, 2018, pp. 974–983.
[25] D. Xu, Y. Zhu, C. B. Choy, and L. Fei-Fei, "Scene graph generation by iterative message passing," in Proc. CVPR, 2017, pp. 5410–5419.
[26] J. You, B. Liu, Z. Ying, V. Pande, and J. Leskovec, "Graph convolutional policy network for goal-directed molecular graph generation," in Proc. NIPS, 2018, pp. 6410–6421.
[27] J. You, R. Ying, X. Ren, W. Hamilton, and J. Leskovec, "GraphRNN: Generating realistic graphs with deep auto-regressive models," in Proc. ICML, 2018, pp. 5694–5703.
[28] H. Dai, H. Li, T. Tian, X. Huang, L. Wang, J. Zhu, and L. Song, "Adversarial attack on graph structured data," in Proc. ICML, 2018, pp. 1123–1132.
[29] Y. Ma, H. Ren, B. Khailany, H. Sikka, L. Luo, K. Natarajan, and B. Yu, "High performance graph convolutional networks with applications in testability analysis," in Proc. DAC, 2019, p. 18.
[30] R. J. Francis, J. Rose, and K. Chung, "Chortle: A technology mapping program for lookup table-based field programmable gate arrays," in Proc. DAC, 1990, pp. 613–619.
[31] R. Brayton and A. Mishchenko, "ABC: An academic industrial-strength verification tool," in Proc. CAV, 2010, pp. 24–40.
[32] C. J. Alpert, A. E. Caldwell, A. B. Kahng, and I. L. Markov, "Hypergraph partitioning with fixed vertices," IEEE TCAD, vol. 19, no. 2, pp. 267–272, 2000.
[33] B. Yu, X. Xu, J.-R. Gao, Y. Lin, Z. Li, C. Alpert, and D. Z. Pan, "Methodology for standard cell compliance and detailed placement for triple patterning lithography," IEEE TCAD, vol. 34, no. 5, pp. 726–739, May 2015.
[34] R. E. Bryant, "Graph-based algorithms for boolean function manipulation," IEEE TC, vol. 100, no. 8, pp. 677–691, 1986.
[35] J. Cong and P. H. Madden, "Performance driven global routing for standard cell design," in Proc. ISPD, 1997, pp. 73–80.
[36] C. Albrecht, "Provably good global routing by a new approximation algorithm for multicommodity flow," in Proc. ISPD, 2000, pp. 19–25.
[37] T. Yoshimura and E. S. Kuh, "Efficient algorithms for channel routing," IEEE TCAD, vol. 1, no. 1, pp. 25–35, 1982.
[38] K. Yuan, J.-S. Yang, and D. Z. Pan, "Double patterning layout decomposition for simultaneous conflict and stitch minimization," IEEE TCAD, vol. 29, no. 2, pp. 185–196, Feb. 2010.
[39] H.-Y. Chang and I. H.-R. Jiang, "Multiple patterning layout decomposition considering complex coloring rules," in Proc. DAC, 2016, pp. 40:1–40:6.
[40] Y. Ma, J.-R. Gao, J. Kuang, J. Miao, and B. Yu, "A unified framework for simultaneous layout decomposition and mask optimization," in Proc. ICCAD, 2017, pp. 81–88.
[41] R. Bellman, "On a routing problem," Quarterly of Applied Mathematics, vol. 16, no. 1, pp. 87–90, 1958.
[42] L.-T. Wang, Y.-W. Chang, and K.-T. T. Cheng, Electronic Design Automation: Synthesis, Verification, and Test. Morgan Kaufmann, 2009.
[43] B. Yu and D. Z. Pan, "Layout decomposition for quadruple patterning lithography and beyond," in Proc. DAC, 2014, pp. 53:1–53:6.
[44] B. Yu, Y.-H. Lin, G. Luk-Pat, D. Ding, K. Lucas, and D. Z. Pan, "A high-performance triple patterning layout decomposer with balanced density," in Proc. ICCAD, 2013, pp. 163–169.
[45] C. K. Cheng, S. Z. Yao, and T. C. Hu, "The orientation of modules based on graph decomposition," IEEE TC, vol. 40, pp. 774–780, June 1991.
[46] C.-W. Sham, F. Y. Young, and C. Chu, "Optimal cell flipping in placement and floorplanning," in Proc. DAC, 2006, pp. 1109–1114.
[47] J. Kuang and E. F. Y. Young, "An efficient layout decomposition approach for triple patterning lithography," in Proc. DAC, 2013, pp. 69:1–69:6.
[48] Y. Yang, W.-S. Luk, D. Z. Pan, H. Zhou, C. Yan, D. Zhou, and X. Zeng, "Layout decomposition co-optimization for hybrid e-beam and multiple patterning lithography," IEEE TCAD, vol. 35, no. 9, pp. 1532–1545, 2016.
[49] M. Cho and D. Z. Pan, "BoxRouter: A new global router based on box expansion and progressive ILP," IEEE TCAD, vol. 26, no. 12, pp. 2130–2143, 2007.
[50] Y. Lin, X. Xu, B. Yu, R. Baldick, and D. Z. Pan, "Triple/quadruple patterning layout decomposition via linear programming and iterative rounding," JM3, vol. 16, no. 2, 2017.
[51] R. Samanta, J. Hu, and P. Li, "Discrete buffer and wire sizing for link-based non-tree clock networks," IEEE TVLSI, vol. 18, no. 7, pp. 1025–1035, 2009.
[52] A. B. Kahng, S. Kang, H. Lee, S. Nath, and J. Wadhwani, "Learning-based approximation of interconnect delay and slew in signoff timing tools," in Proc. SLIP, 2013, pp. 1–8.
[53] W.-T. J. Chan, K. Y. Chung, A. B. Kahng, N. D. MacDonald, and S. Nath, "Learning-based prediction of embedded memory timing failures during initial floorplan design," in Proc. ASPDAC, 2016, pp. 178–185.
[54] Z. Qi, Y. Cai, and Q. Zhou, "Accurate prediction of detailed routing congestion using supervised data learning," in Proc. ICCD, 2014, pp. 97–103.
[55] Q. Zhou, X. Wang, Z. Qi, Z. Chen, Q. Zhou, and Y. Cai, "An accurate detailed routing routability prediction model in placement," in Proc. ASQED, 2015, pp. 119–122.
[56] H. Cai, V. W. Zheng, and K. Chang, "A comprehensive survey of graph embedding: Problems, techniques and applications," IEEE TKDE, vol. 30, no. 9, pp. 1616–1637, 2018.
[57] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and P. S. Yu, "A comprehensive survey on graph neural networks," arXiv preprint arXiv:1901.00596, 2019.
[58] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, "Spectral networks and locally connected networks on graphs," arXiv preprint arXiv:1312.6203, 2013.
[59] M. Defferrard, X. Bresson, and P. Vandergheynst, "Convolutional neural networks on graphs with fast localized spectral filtering," in Proc. NIPS, 2016, pp. 3844–3852.
[60] A. Micheli, "Neural network for graphs: A contextual constructive approach," IEEE Transactions on Neural Networks, vol. 20, no. 3, pp. 498–511, 2009.
[61] J. Atwood and D. Towsley, "Diffusion-convolutional neural networks," in Proc. NIPS, 2016, pp. 1993–2001.
[62] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, "Graph attention networks," arXiv preprint arXiv:1710.10903, 2017.
[63] A. Bojchevski, J. Klicpera, B. Perozzi, M. Blais, A. Kapoor, M. Lukasik, and S. Günnemann, "Is PageRank all you need for scalable graph neural networks?" 2019.
[64] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, "Graph convolutional neural networks for web-scale recommender systems," in Proc. KDD, 2018, pp. 974–983.
[65] C. Deng, Z. Zhao, Y. Wang, Z. Zhang, and Z. Feng, "GraphZoom: A multi-level spectral approach for accurate and scalable graph embedding," arXiv preprint arXiv:1910.02370, 2019.
[66] M. Wang, L. Yu, D. Zheng, Q. Gan, Y. Gai, Z. Ye, M. Li, J. Zhou, Q. Huang, C. Ma et al., "Deep Graph Library: Towards efficient and scalable deep learning on graphs," arXiv preprint arXiv:1909.01315, 2019.
[67] Y. Feng, H. You, Z. Zhang, R. Ji, and Y. Gao, "Hypergraph neural networks," in Proc. AAAI, vol. 33, 2019, pp. 3558–3565.
[68] S. Bai, F. Zhang, and P. H. Torr, "Hypergraph convolution and hypergraph attention," arXiv preprint arXiv:1901.08150, 2019.
[69] N. Yadati, M. Nimishakavi, P. Yadav, V. Nitin, A. Louis, and P. Talukdar, "HyperGCN: A new method for training graph convolutional networks on hypergraphs," in Proc. NIPS, 2019, pp. 1509–1520.
[70] T.-H. H. Chan and Z. Liang, "Generalizing the hypergraph Laplacian via a diffusion process with mediators," Theoretical Computer Science, 2019.
[71] C. Zhang, D. Song, C. Huang, A. Swami, and N. V. Chawla, "Heterogeneous graph neural network," in Proc. KDD, 2019, pp. 793–803.
[72] M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling, "Modeling relational data with graph convolutional networks," in Proc. ESWC, 2018, pp. 593–607.
[73] X. Wang, H. Ji, C. Shi, B. Wang, Y. Ye, P. Cui, and P. S. Yu, "Heterogeneous graph attention network," in Proc. WWW, 2019, pp. 2022–2032.
[74] S. Yun, M. Jeong, R. Kim, J. Kang, and H. J. Kim, "Graph transformer networks," in Proc. NIPS, 2019, pp. 11960–11970.
[75] L. H. Goldstein and E. L. Thigpen, "SCOAP: Sandia controllability/observability analysis program," in Proc. DAC, 1980, pp. 190–196.
[76] L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, pp. 2579–2605, 2008.