


A General Layout Pattern Clustering Using Geometric Matching-based Clip Relocation and Lower-bound Aided Optimization

XU HE, Hunan University, China


YAO WANG, National University of Defense Technology, China
ZHIYONG FU and YIPEI WANG, Hunan University, China
YANG GUO, National University of Defense Technology, China

With the continuous shrinking of feature size, detection of lithography hotspots has become one of
the major concerns in Design-for-Manufacturability (DFM) of semiconductor processing. Hotspot detection,
along with other DFM measures, trades off turnaround time for the yield of IC manufacturing, and thus a
simplified but wide-ranging pattern definition is key to the problem. Layout pattern clustering methods, which
group geometrically similar layout clips into clusters, have been widely proposed to identify layout patterns
efficiently. To minimize the cluster number for subsequent DFM processing, in this article, we propose a
geometric-matching-based clip relocation technique to increase the opportunity of pattern clustering.
In particular, we formulate the lower bound of the cluster number as a maximum-clique problem, and we
prove that the clustering problem can be solved very efficiently using the maximum-clique result.
Compared with the experimental results of the state-of-the-art approaches on ICCAD 2016 Contest
benchmarks, the proposed method achieves the optimal solutions for all benchmarks with very competitive
runtime. To evaluate the scalability, the ICCAD 2016 Contest benchmarks are extended and evaluated.
Experimental results on the extended benchmarks demonstrate that our method reduces the cluster
number by 16.59% on average, while the runtime is 74.11% faster on large-scale benchmarks compared with
previous works.
CCS Concepts: • Hardware → VLSI design manufacturing considerations;
Additional Key Words and Phrases: Design for manufacturability, lithography hotspot, clip pattern clustering,
clip relocation, cluster number minimization
ACM Reference format:
Xu He, Yao Wang, Zhiyong Fu, Yipei Wang, and Yang Guo. 2023. A General Layout Pattern Clustering Using
Geometric Matching-based Clip Relocation and Lower-bound Aided Optimization. ACM Trans. Des. Autom.
Electron. Syst. 28, 6, Article 90 (October 2023), 23 pages.
https://doi.org/10.1145/3610293

This work is supported by the National Natural Science Foundation of China, under grant U19A2062.
Authors’ addresses: X. He, Z. Fu, and Y. Wang, Hunan University, Changsha, Hunan, China; emails: [email protected],
[email protected], [email protected]; Y. Wang and Y. Guo, National University of Defense Technology, Chang-
sha, Hunan, China; emails: [email protected], [email protected].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be
honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee. Request permissions from [email protected].
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
1084-4309/2023/10-ART90 $15.00
https://doi.org/10.1145/3610293

ACM Transactions on Design Automation of Electronic Systems, Vol. 28, No. 6, Article 90. Pub. date: October 2023.

1 INTRODUCTION
As the feature size continuously shrinks, Design-for-Manufacturability (DFM) has become a
crucial concern in semiconductor processing. Due to the limitations in lithographic wavelength,
the diffraction phenomenon causes unwanted shape distortions on the printed layout patterns.
One of the most critical DFM tasks is called “hotspot detection,” which involves multiple flows and
steps to detect defective patterns (i.e., “hotspots”) in the layout with a proprietary library of yield
detractor patterns that may lead to manufacturing defects or performance degradation.
Traditionally, lithography simulation is the most accurate technique to capture these defective
patterns. Based on physical simulation of the lithography process, hotspot detection can be
performed accurately but is also extremely time-consuming. As the number of transistors
integrated into a chip keeps growing exponentially, simulation-based hotspot-detection methods
are becoming impractical, especially for full-chip simulations [16].
To provide quick feedback on the layout printability, pattern-classification-based methods
have been proposed for hotspot detection, which can be roughly divided into two categories,
namely pattern-matching-based techniques and machine learning–based techniques. For pattern-
matching-based methods, the process hotspots are detected in a given layout with a pre-defined
hotspot pattern library [14, 20, 23, 26]. Since a pattern-matching-based method cannot predict
hotspots beyond its pre-defined library, it is case sensitive and therefore suffers from low predic-
tion accuracy for unknown patterns [25]. For machine learning–based methods, a hotspot classifi-
cation model is trained first utilizing existing patterns, and the trained model is then used to detect
hotspots [4, 6, 9, 25].
Pattern clustering is a procedure that extracts clips from a layout and groups geometrically
similar clips into clusters. The major objective of the pattern clustering problem is to find the
minimum number of clusters of layout patterns that are representative of the features of interest. For
both pattern-matching-based and machine learning–based methods, pattern clustering is a key step
to address the accuracy and efficiency of hotspot detection. Pattern clustering helps to eliminate
overestimation or over-optimization and boosts the performance of the pattern-matching
process [6]. Besides hotspot detection, pattern clustering has also been widely utilized in design-space
analysis, design-rule generation, and manufacturing yield optimization [19].
For a pattern clustering problem, finding the minimum cluster number is computationally hard.
Previous pattern clustering methods involve k-means, hierarchical, and incremental clustering.
K-means clustering [18], given a pre-defined cluster count k, first randomly selects k cluster
representatives and then reassigns each clip to a similar cluster representative until convergence.
In hierarchical clustering [15], each clip is initialized as a cluster, and then similar pairs of
clusters are hierarchically merged. To give a quantitative measurement of the difference
between layout clips, the ICCAD 2016 Contest [19] applies two metrics indicating the similarity between
layout patterns, namely the edge displacement and the area difference. Based on these
metrics, some new clustering methods have been proposed recently. Woo et al. [21] convert the
pattern clustering problem into a Set-Covering Problem (SCP) and apply a Greedy Randomized
Adaptive Search Procedure (GRASP) to solve it. Chang et al. [1] propose a fast method based on
Markov clustering to improve the matching performance. Wu et al. [22] introduce a MapReduce
method to reduce the clustering problem size, which minimizes the cluster number and maximizes
the cluster size simultaneously. Reference [17] applies an Integer Linear Programming (ILP)
method to minimize the cluster count in representative-clip generation for clusters. Chen et al.
[3] adopt a clip shifting technique to compensate for inaccurate markers during pattern clustering.
Chang et al. [2] propose a general layout pattern clustering method called “iClaire” with a clip
shifting and centroid recreation technique. For the generation of clip candidates, enumerating all the
possibilities is not practical. To avoid the problem of space explosion, Reference [3] first
introduced a clip-shifting technique in which candidates are searched for each clip with different
topological encodings. In Reference [2], clip candidates are generated based on an identical core.

Fig. 1. An example of pattern clustering. Given a layout in GDSII format shown in (a), hotspots are marked
in red. Clips are then created for the markers, as illustrated in (b). Aligning the clips’ centers to their
corresponding markers’ centers, the clips can be grouped into three clusters, as shown in (c); the cluster number
can be further reduced if the clip relocation (the blue solid boxes) technique is applied ((b) and (c)).

Some of the aforementioned methods [1, 21, 22] take only the tolerance of pattern shape variations into
consideration. Usually, the shape variation tolerance is set to be very small to preserve the topology,
so the chance of clustering is limited. Figure 1 presents an example of clip clustering. Markers
in red indicate the hotspot locations in the layout. Clips created for the markers are shown in
Figure 1(b). By default, the center of a clip is set to the center of its corresponding marker. By
clustering similar clips together, three clusters are obtained, as shown in Figure 1(c). However, if
the clip centers can be moved properly within their marker ranges, which is called “clip relocation,”
then the final cluster number can be further reduced (from 3 to 1 for the case in Figure 1(c)). Hence,
applying the clip relocation technique effectively increases the possibility of pattern clustering.
In this article, the clip clustering problem given in ICCAD 2016 Contest [19] is investigated.
Compared with the earlier version of this work [10], major steps are added or modified
for a more general clip clustering, including the candidate generation, the lower-bound
computation, and the clustering optimization. For the candidate generation, as illustrated in
Reference [19], clips are often similar to each other and replicated throughout the layout. Therefore, it
is natural and practical to exploit clip similarity in candidate generation.
Different from prior works [2, 3], our candidates are located based on the matched regions
within entire clips. The clip-matching problem is analogous to the image-matching problem in the
computer vision community, and the geometrical information of a clip pattern is akin to the
feature descriptor of an image. Comparing each pair of the entire clip set is very time-consuming.
To achieve a competitive runtime, instead of using pixel-based matching techniques, we
propose a polygon-based clip-matching method, as a clip pattern is composed of multiple rectilinear
polygons. Given the generated candidates, the lower bound of the cluster number can be obtained
by constructing a general clip mutex graph and calculating the maximum clique on the graph. Last,
clip clustering is formulated as an SCP, which is an NP-hard problem [7]. Different from prior
works [3, 21], our lower-bound result can be utilized to optimize the SCP solving process. In our
method, not only is the lower-bound number applied as an early termination condition of the solver,
but the selection of mutex clip candidates is also optimized for clustering initialization.
Compared with previously published works, our method has achieved the smallest clustering
number for almost all the benchmarks with a very competitive runtime as well. In addition, we
have also studied the scalability on enlarged benchmarks. It is shown that our proposed method
can achieve tight lower bound in experiments and can be used to prove our solution is optimal for
the original problem in ICCAD 2016 Contest.
The main contributions of this work can be summarized as follows:
● We introduce a systematic clip relocation method that searches clip candidates based on
the geometrically matching regions between clips.
● We prove, for the first time to the best of our knowledge, that for given clips
the lower bound of the optimal clip clustering can be formulated as a maximum clique on the clip
mutex graph.
● We propose a heuristic SCP solving algorithm that is optimized by the lower-bound
solution for quick convergence.
● Experimental results demonstrate that our method achieves the optimal clustering
number for all benchmarks in the ICCAD 2016 Contest benchmark suite. Moreover, for
an extended benchmark suite that we generated from the ICCAD 2016 Contest benchmarks,
our method outperforms the previous works on almost all clustering metrics with very
competitive runtime.
The remainder of this article is organized as follows. Section 2 gives the problem formulation.
Section 3 gives an overview of our pattern clustering framework. Section 4 provides the prelimi-
nary of clip preparation. In Section 5, we propose our candidate generation method. In Section 6,
we propose the lower-bound aided clip clustering. Experimental results and conclusions are pre-
sented in Section 7 and Section 8, respectively.

2 PROBLEM FORMULATION
We first define the terminology used in the problem formulation as follows:
● Marker is a polygonal region given in advance that indicates a potential hotspot location
found by scanning each layout layer. Markers do not necessarily intersect with any layout pattern, and
they can be placed on the space among features. Generally, the sizes and shapes of markers
may vary. In most practical cases, the marker size is much smaller than the clip
size.
● Clip is a rectangular window composed of a set of rectilinear polygons, with a marker
contained in it. The input clip size is w_c × h_c, where w_c and h_c are the width and height of the
clip, respectively. For each clip, its center must be within or on the edges of its corresponding
marker. Figure 1(b) gives an example of markers and clips extracted from a layout.
● Candidate: When creating a clip for a marker, we can generate many clip candidates whose
centers fall into the marker’s region. One of these candidates will be chosen as the clip
referring to the marker.
● Cluster is a set of clips that are grouped together based on a specified similarity requirement.
● Representative Clip is a clip in a cluster that meets a given similarity requirement between
itself and any other clip in the same cluster.

Fig. 2. Examples of clip XOR operation and edge shifting operation.

A layout similarity metric in real applications of clip clustering is not unique [5]. In our work, to
determine the clip similarity, Distance(c_i, c_j) between two clips c_i and c_j is evaluated using either
the Area Constrained Clustering (ACC) or the Edge Constrained Clustering (ECC) metric [19]. The
ACC metric constrains the total area by which the patterns differ, while the ECC metric focuses on the
tolerance of small variations or shifts of the shapes. Denoting the area match constraint for ACC as a and
the edge displacement constraint for ECC as e, the ACC and ECC criteria are defined
as follows:
● ACC checks the difference between two clips c_i and c_j by the ratio of their area
difference to the clip area. The calculation is given in Equation (1) as follows:

$$\mathrm{Distance}(c_i, c_j) = \frac{\mathrm{Area}(\mathrm{XOR}(c_i, c_j))}{w_c \times h_c} \le 1 - a, \qquad (1)$$

where Area(XOR(c_i, c_j)) is the area difference between clips c_i and c_j, w_c × h_c denotes the
area of a clip, and the parameter a indicates the least area match ratio that should be achieved
between clips c_i and c_j. Figure 2 shows an example of the XOR result for two clips.
● ECC checks, for every edge of two clips c_i and c_j, whether the displacement of the edge exceeds
a given parameter e ≥ 0. The edges of the polygons of clip c_i or c_j can shift inward
or outward with respect to the original edge position. Figure 2 gives an example of edge
shifting for a clip. The parameter e is the maximum allowable distance of edge shifting. The
calculation is given in Equation (2),

$$\exists \phi \in \Phi, \quad \mathrm{Distance}(c_i, c_j) = \max\{\mathrm{MSD}_\phi(c_i), \mathrm{MSD}_\phi(c_j)\} \le e, \qquad (2)$$

where Φ is the set of all edge-shifting solutions making c_i and c_j exactly identical, and
for an edge-shifting solution ϕ ∈ Φ, MSD_ϕ(c_i) and MSD_ϕ(c_j) are the Maximum-Shifting-Displacement
of edges in clip c_i and clip c_j, respectively.
It is easy to deduce that the default parameters of a = 1 and e = 0 (nm) correspond to an exact
match.
Given a set of clips, the clustering process seeks an optimal clip clustering result under the
default mode (a = 1, e = 0), ACC mode (a < 1), or ECC mode (e > 0). To simplify the discussion, we
use a threshold d to unify the ACC and ECC expressions. For the ACC criterion, d = 1 − a, while for the
ECC criterion, d = e. In this way, for both the ACC criterion and the ECC criterion, the constraint
can be written in a unified way as Distance(c_i, c_j) ≤ d.
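
As an illustration of the ACC criterion and the unified threshold d, the following is a minimal Python sketch. It assumes that each clip is given as a list of axis-aligned rectangles in clip-relative integer coordinates; the names `Rect`, `xor_area`, `acc_distance`, and `similar` are hypothetical and do not come from the original work. The XOR area is computed by coordinate compression, which is one possible way to evaluate Equation (1), not necessarily the one used by the authors.

```python
from itertools import product

# Assumption: a rectangle is (xl, yl, xr, yr) in clip-relative coordinates.
Rect = tuple[int, int, int, int]


def covered(rects: list[Rect], x: float, y: float) -> bool:
    """Return True if the point (x, y) lies inside any rectangle of the clip."""
    return any(xl <= x < xr and yl <= y < yr for xl, yl, xr, yr in rects)


def xor_area(a: list[Rect], b: list[Rect]) -> int:
    """Area covered by exactly one of the two clips (coordinate compression)."""
    xs = sorted({v for r in a + b for v in (r[0], r[2])})
    ys = sorted({v for r in a + b for v in (r[1], r[3])})
    area = 0
    for (x0, x1), (y0, y1) in product(zip(xs, xs[1:]), zip(ys, ys[1:])):
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2  # the cell centre decides occupancy
        if covered(a, cx, cy) != covered(b, cx, cy):
            area += (x1 - x0) * (y1 - y0)
    return area


def acc_distance(a: list[Rect], b: list[Rect], wc: int, hc: int) -> float:
    """Equation (1): XOR area normalised by the clip area wc * hc."""
    return xor_area(a, b) / (wc * hc)


def similar(distance: float, d: float) -> bool:
    """Unified test Distance(c_i, c_j) <= d, with d = 1 - a (ACC) or d = e (ECC)."""
    return distance <= d
```

For example, two 100 × 100 clips whose single rectangles differ by a 10 × 100 strip have an ACC distance of about 0.1, so they satisfy the constraint for d = 0.15 but not for d = 0.05. When d is derived as 1 − a in floating point, a small tolerance may be needed for boundary cases.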
Here, we give the problem formulation of pattern clustering as follows:

Problem 1. Given an input layout in GDSII format containing layout polygon patterns and markers,
clip size w_c × h_c, and the similarity distance requirements a and e, the objective is to
minimize the number of clusters that group the clips under the specified constraints. Either ACC or ECC
is adopted for each operation. Every clip center must be within or on the edges of its corresponding
marker.


Fig. 3. Our pattern clustering framework. (a) The process without clip relocation. (b) The modified steps
when considering clip relocation.

3 OVERVIEW OF OUR PATTERN CLUSTERING


The overview of our pattern clustering method is illustrated in Figure 3. It consists of three parts:
(1) data preparation, (2) clip candidate generation, and (3) lower-bound aided clustering. Figure 3(a)
represents the basic clustering process without clip relocation, while Figure 3(b) gives the modi-
fied steps compared to Figure 3(a) that take into account the general clustering problem with clip
relocation.
In the data preparation part, we obtain all the clip information. Given an input layout, all rec-
tilinear polygons in the layout are first partitioned into rectangles. Then, the relevant rectangles
overlapping with clip regions are extracted. Additional rectangle merging is performed to preserve
the homogeneity in the metric calculation. For the basic version of this work [10], a clip region
is obtained for each marker, and the centers of clips are set to the centers of their corresponding
markers by default. When considering clip relocation, an expanded-clip region is computed in-
stead of original clip region. After obtaining the clip information, we merge the identical clips to
reduce the problem size.
In the basic version, following the preparation of clip data, we proceed with the lower-bound aided
clustering. We first determine whether the given ACC or ECC criterion is satisfied between each
pair of clips and then establish a clip association graph. Based on the similarity relationship in the
clip association graph, we calculate the lower bound of the cluster number. Finally, we transform
the clip clustering problem into an SCP. Since SCP is an NP-hard problem, to solve it effectively,
we utilize the lower-bound result as the initial solution to SCP and also as one of the termination
conditions for SCP.

Fig. 4. The process of expanded-clip preparation.

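
The use of the lower bound as a termination condition can be sketched as follows. This is only an illustrative randomized greedy set-cover loop under stated assumptions, not the authors' solver: the universe is the set of clips, each named subset is assumed to hold the clips that one candidate representative can cover under Distance ≤ d, and `lower_bound` is assumed to come from the maximum-clique computation. The function names `greedy_cover` and `solve_scp` are hypothetical, and the use of the lower-bound result as the initial solution is not modeled here.

```python
import random


def greedy_cover(universe: set[int], subsets: dict[str, set[int]],
                 rng: random.Random) -> list[str]:
    """One randomized greedy pass: keep picking a subset that covers the most
    still-uncovered items, breaking ties at random."""
    uncovered, chosen = set(universe), []
    while uncovered:
        best = max(len(s & uncovered) for s in subsets.values())
        if best == 0:
            raise ValueError("the universe cannot be covered by the given subsets")
        ties = sorted(k for k, s in subsets.items() if len(s & uncovered) == best)
        pick = rng.choice(ties)
        chosen.append(pick)
        uncovered -= subsets[pick]
    return chosen


def solve_scp(universe: set[int], subsets: dict[str, set[int]],
              lower_bound: int, max_iters: int = 100, seed: int = 0) -> list[str]:
    """Repeat the greedy pass and stop as soon as a cover matches the lower bound,
    because such a cover is provably optimal."""
    rng = random.Random(seed)
    best: list[str] | None = None
    for _ in range(max_iters):
        solution = greedy_cover(universe, subsets, rng)
        if best is None or len(solution) < len(best):
            best = solution
        if len(best) <= lower_bound:  # early termination on the lower bound
            break
    return best
```

The point of the sketch is the early exit: without a lower bound, an iterative heuristic has no way to know that its current cover cannot be improved and must exhaust its iteration budget.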
In the general clustering process with clip relocation, we introduce the candidate generation
process after data preparation. Our candidate generation method mainly involves analyzing poly-
gon similarity and computing identical regions between pairs of expanded clips. Additionally, we
modify the lower-bound aided clustering approach to accommodate the general case. We expand
the lower-bound computation method and the SCP formulation, since each marker can correspond
to multiple clip candidates instead of just one clip as in the basic version.
The detailed discussion of the data preparation, the candidate generation, and the lower-bound
aided clip clustering will be given in Section 4, Section 5, and Section 6, respectively.

4 PRELIMINARY OF CLIP PREPARATION


For clip preparation without clip relocation, the center of each clip is set to the center of its
corresponding marker, and the clip can be extracted according to its marker center and the given clip
size. When considering clip relocation, many candidate clips could be generated for each marker.
To bound the maximum range of locations of all the candidate clips, we create an expanded-clip
for the marker. It should be mentioned that, apart from the clip size, the clip preparation method
without clip relocation is similar to the expanded-clip preparation method.
Figure 4 gives an example of expanded-clip extraction. Since every clip’s center must be within
the scope of its corresponding marker, for a marker with the size of w_m × h_m, the center of its
expanded-clip is set to the marker’s center, and the size of the expanded-clip is assumed to be
(w_c + w_m, h_c + h_m) [3]. Figure 4(a) presents an illustration of the expanded-clip size.
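
The expanded-clip window can be illustrated with a short Python sketch. It assumes a rectangular marker given by its bounding box (xl, yl, xr, yr); the function names `expanded_clip`, `candidate_clip`, and `contains` are hypothetical. The sketch shows why the size (w_c + w_m, h_c + h_m) is sufficient: any clip whose center lies inside the marker fits inside the expanded window.

```python
# Assumption: boxes are (xl, yl, xr, yr) in layout coordinates.
Box = tuple[float, float, float, float]


def expanded_clip(marker: Box, wc: float, hc: float) -> Box:
    """Window centred on the marker centre with size (wc + wm, hc + hm)."""
    xl, yl, xr, yr = marker
    return (xl - wc / 2, yl - hc / 2, xr + wc / 2, yr + hc / 2)


def candidate_clip(center: tuple[float, float], wc: float, hc: float) -> Box:
    """Clip window of size wc x hc centred at a candidate centre."""
    cx, cy = center
    return (cx - wc / 2, cy - hc / 2, cx + wc / 2, cy + hc / 2)


def contains(outer: Box, inner: Box) -> bool:
    """True if the inner box lies completely inside the outer box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])
```

For a 4 × 2 marker at (10, 10, 14, 12) and a 100 × 80 clip, the expanded window is (−40, −30, 64, 52), and the candidate clips centered at the marker corners touch but never leave this window.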
The process of expanded-clip preparation can be briefly introduced as follows: Given an input
layout, all rectilinear polygons in the layout are first partitioned into rectangles (Figure 4(b)). Then,
the relevant rectangles overlapping the expanded-clip are extracted and further combined
to preserve the homogeneity in the metric calculation (Figure 4(c)). The R-tree data structure is
adopted to retrieve rectangles from the layout. Since some polygons in the benchmark suite may
have arbitrary cuts, a combining step is required for neighboring rectangles with the same width or
height. Otherwise, the arbitrarily divided rectilinear polygons can cause a mismatch in the metric
calculation of the rectangle matching.
The clip representation method proposed in Reference [1] is applied in our work and is briefly
introduced with an example shown in Figure 4(d). For each expanded-clip, the layout inside it
is divided into non-uniform grids by the horizontal and vertical lines through the corner
coordinates of the rectangles within the expanded-clip. Each grid is filled with either a 1 or
a 0, depending on whether it is occupied by a rectangle or not. Therefore, the expanded-clip can
be represented as a (0, 1)-matrix together with sorted lists of the horizontal and vertical grid
coordinates. After obtaining the expanded-clips, we merge identical expanded-clips to reduce the
problem size. Note that mirror-symmetrical expanded-clips, including x-axis flipping, y-axis flipping,
or both, are regarded as identical when merging expanded-clips.

Fig. 5. The process of candidate generation.
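
The grid representation and the mirror-aware merging described above can be sketched in Python as follows. The sketch assumes rectangles and the window are given as (xl, yl, xr, yr) tuples; `grid_matrix` and `mirror_key` are hypothetical names, and the canonical key (the minimum over the four flip variants) is one simple way to detect mirror-identical clips, not necessarily the method of Reference [1] or of the authors.

```python
# Assumption: rectangles and the window are (xl, yl, xr, yr).
Rect = tuple[int, int, int, int]


def grid_matrix(rects: list[Rect], window: Rect):
    """Build the non-uniform grid: sorted x/y grid lines plus a 0/1 matrix whose
    rows follow the y intervals and whose columns follow the x intervals."""
    wxl, wyl, wxr, wyr = window
    xs = sorted({wxl, wxr} | {x for r in rects for x in (r[0], r[2]) if wxl < x < wxr})
    ys = sorted({wyl, wyr} | {y for r in rects for y in (r[1], r[3]) if wyl < y < wyr})
    matrix = [
        [int(any(r[0] <= x0 and x1 <= r[2] and r[1] <= y0 and y1 <= r[3] for r in rects))
         for x0, x1 in zip(xs, xs[1:])]
        for y0, y1 in zip(ys, ys[1:])
    ]
    return xs, ys, matrix


def mirror_key(xs: list[int], ys: list[int], matrix: list[list[int]]):
    """Canonical key that is equal for clips identical up to mirroring."""
    cols = tuple(b - a for a, b in zip(xs, xs[1:]))  # column widths
    rows = tuple(b - a for a, b in zip(ys, ys[1:]))  # row heights
    cells = tuple(map(tuple, matrix))
    variants = []
    for flip_cols in (False, True):
        for flip_rows in (False, True):
            c = cols[::-1] if flip_cols else cols
            r = rows[::-1] if flip_rows else rows
            g = cells[::-1] if flip_rows else cells
            g = tuple(row[::-1] for row in g) if flip_cols else g
            variants.append((c, r, g))
    return min(variants)
```

Expanded-clips with equal keys can then be grouped in a dictionary, so that only one representative per group enters the later matching steps.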

5 CLIP-CANDIDATE GENERATION
In this section, we introduce the method of clip-candidate generation. Enumerating all possible
candidates of each clip is impractical because the problem size explodes. Since clips are often
similar to each other and replicated throughout the layout [19], it is more practical to keep the
problem size small by searching only for highly similar candidates for clustering, rather than
covering all possible candidates.
Our candidate generation is based on the pattern-matching regions between pairs of expanded-
clips. Our method mainly includes two steps: (1) polygon similarity analysis and (2) matching
region computation. Figure 5 presents the process of candidate generation. In our method, we first
perform polygon similarity analysis for each pair of expanded-clips and obtain the common part
of polygons with the maximum area (Section 5.1). We then compute the matching region between
expanded-clips based on polygon common part (Section 5.2). The candidates are finally extracted
according to the maximum matching region of expanded-clips. Both steps can be formulated into
classical problems to be solved, and Table 1 gives the notations used in this section.

5.1 Polygon Similarity Analysis


To compute the matching region between two expanded-clips, we first analyze the similarity of
the rectilinear polygons contained in the expanded-clips.
Every rectilinear polygon of an expanded-clip is first partitioned into a rectangle list. The rectangle
list area is set to the total area of the rectangles in the list. Each rectangle is represented by its
lower-left and upper-right vertex coordinates, which are relative coordinates within its
expanded-clip. Rectangles from a polygon are sorted by their lower-left
coordinates. As shown in Figure 6(a), a polygon p_1 of an expanded-clip ec_1 is partitioned into
a rectangle list (r_1, r_2, r_3, r_4, r_5), while a polygon p′_1 of an expanded-clip ec_2 is partitioned into a
rectangle list (r′_1, r′_2, r′_3, r′_4, r′_5).

Table 1. Notations in Section 5

Notation            Description
(w_m, h_m)          Marker width and height.
(w_c, h_c)          Clip or candidate width and height.
(xl, xr, yl, yr)    Lower and upper x/y coordinates of a rectangular region.
m_i                 The ith marker.
c_i^k               The kth candidate generated for marker m_i.
c_i                 The chosen candidate as the clip for marker m_i.
ec_i                The expanded-clip for marker m_i.
p_i                 A rectilinear polygon.
r_i                 A rectangle.
rn_i                Rectangle number of polygon p_i after partition.
aw_i                Alignment window for expanded-clip ec_i.
mmr_ij              Maximum matching region of two expanded-clips ec_i and ec_j.

Fig. 6. An example of matching region computation between two expanded-clips ec_1 and ec_2. Each rectilinear
polygon contained in the expanded-clips is first partitioned into a rectangle list (b). Based on the maximum
common sublist of polygons, we do the alignment for the expanded-clips and obtain the maximum alignment
windows aw_1 and aw_2 (c). Finally, the maximum matching region is computed by searching the maximum
rectangle filled with “0” in XOR(aw_1, aw_2) (d).

Given two polygons, as each polygon can be partitioned into a rectangle list, the objective of the
Polygon Similarity (PS) problem is to find the common rectangle sublist with the maximum
area. An example is shown in Figure 6(b). For two polygons p_1 and p′_1, the common sublist with
the maximum area in p_1 is (r_1, r_2, r_3), while the corresponding common sublist in p′_1 is (r′_3, r′_4, r′_5).
A dynamic programming algorithm is applied to solve the PS problem. The PS problem is related to
the Longest Common Substring (LCS) problem [8], but it differs from the LCS problem in the
following respects:
● Different Objective: In the PS problem, the obtained common rectangle sublist is the sublist
with the maximum area instead of the longest substring as in the LCS problem.
● Different Comparison Condition: In the PS problem, each rectangle in the common
sublist should meet the following requirements: (1) same sequence order, which is also
required in the classic LCS problem; (2) same rectangle size; and (3) same relative position
to the sublist.

ALGORITHM 1: Maximum Common Sublist

Require: p, p′
 1: Initialize rn ← |Rectlist(p)|, rn′ ← |Rectlist(p′)|
 2: Initialize 2D-array dp[0..rn, 0..rn′] ← 0, maxpos ← (0, 0)
 3: for each r_i ∈ Rectlist(p) do
 4:   for each r_j ∈ Rectlist(p′) do
 5:     if i > 0 and j > 0 then
 6:       fr[i,j] ← fr[i−1,j−1], fr′[i,j] ← fr′[i−1,j−1]
 7:     else
 8:       fr[i,j] ← r_i, fr′[i,j] ← r_j
 9:     end if
10:     if RelCoord(r_i, fr[i,j]) = RelCoord(r_j, fr′[i,j]) then
11:       if i > 0 and j > 0 then
12:         dp[i,j] ← dp[i−1,j−1] + Area(r_i)
13:       else
14:         dp[i,j] ← Area(r_i)
15:       end if
16:     else if Size(r_i) = Size(r_j) then
17:       dp[i,j] ← Area(r_i)
18:       fr[i,j] ← r_i, fr′[i,j] ← r_j
19:     end if
20:     if dp[maxpos] < dp[i,j] then
21:       maxpos ← (i, j)
22:     end if
23:   end for
24: end for
25: commlist ← Sublist(p, fr[maxpos], dp[maxpos])
26: commlist′ ← Sublist(p′, fr′[maxpos], dp[maxpos])

Given two polygons $p$ and $p'$ from two different expanded-clips, let $rn$ and $rn'$ denote their respective numbers of rectangles. For each polygon, the rectangle list is sorted by lower-left coordinates. The dynamic programming algorithm shown in Algorithm 1 is applied to the rectangle lists of $p$ and $p'$ to compute the common rectangle sublist with the maximum area.
The time complexity of Algorithm 1 is $O(rn \times rn')$. The two-dimensional array $dp$ stores the maximum area of the common sublist for the prefixes $p[0..i]$ and $p'[0..j]$ that ends at rectangles $p[i]$ (i.e., $r_i$) and $p'[j]$ (i.e., $r_j$), respectively. The array $dp$ is initialized to zero (line 2). For the common sublist ending at $p[i]$ and $p'[j]$, the variables $fr_{i,j}$ and $fr'_{i,j}$ hold the first rectangle of the common sublist in $p$ and $p'$, respectively. The function RelCoord() returns the coordinates of the lower-left and upper-right vertices of rectangle $r_i$/$r_j$ relative to the first rectangle $fr_{i,j}$/$fr'_{i,j}$. For each pair of rectangles $r_i$ and $r_j$ from $p$ and $p'$, if the relative coordinates of their vertices in the common sublist are the same, then they have the same size and relative position in the common sublist and are appended to it (lines 11–15). If their relative positions in the common sublist differ but the sizes of $r_i$ and $r_j$ are identical, then a new common sublist is created by setting $dp_{i,j}$ to Area($r_i$), and the first common rectangles $fr_{i,j}$ and $fr'_{i,j}$ of this new sublist are set to $r_i$ and $r_j$, respectively (lines 16–19). The variable maxpos keeps the last rectangle position in $p$ and $p'$ of the maximum common sublist, and $dp_{maxpos}$ stores the area of the maximum common

ACM Transactions on Design Automation of Electronic Systems, Vol. 28, No. 6, Article 90. Pub. date: October 2023.
General Layout Pattern Clustering 90:11

sublist (lines 20–22). The maximum common sublist can be recovered efficiently by retrieving the rectangles from the first rectangle $fr_{maxpos}$/$fr'_{maxpos}$ to the last rectangle at maxpos (lines 25 and 26).
During polygon comparison, the mirror-symmetric orientations, including x-axis flipping, y-axis flipping, and both, are also considered. The common sublist with the maximum area is used for the following matching region computation.
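The dynamic program described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' C++ implementation: rectangles are assumed to be tuples `(x1, y1, x2, y2)` already sorted by lower-left corner, the helper names (`rel_coord`, `max_common_sublist`) are hypothetical, and an empty predecessor cell is assumed to mean that there is no sublist to extend. Mirror-symmetric orientations are not handled here.

```python
def area(r):
    """Area of an axis-aligned rectangle (x1, y1, x2, y2)."""
    return (r[2] - r[0]) * (r[3] - r[1])


def size(r):
    """(width, height) of a rectangle."""
    return (r[2] - r[0], r[3] - r[1])


def rel_coord(r, first):
    """Corner coordinates of r relative to the lower-left corner of `first`."""
    fx, fy = first[0], first[1]
    return (r[0] - fx, r[1] - fy, r[2] - fx, r[3] - fy)


def max_common_sublist(p, q):
    """Return (sublist of p, sublist of q, common area) for two rectangle lists."""
    dp = [[0] * len(q) for _ in p]
    first = [[None] * len(q) for _ in p]  # (start index in p, start index in q)
    best_area, best_pos = 0, None
    for i, ri in enumerate(p):
        for j, rj in enumerate(q):
            # Assumption: an empty predecessor means no sublist can be extended.
            prev = first[i - 1][j - 1] if i > 0 and j > 0 else None
            fi, fj = prev if prev is not None else (i, j)
            base = dp[i - 1][j - 1] if prev is not None else 0
            if rel_coord(ri, p[fi]) == rel_coord(rj, q[fj]):
                dp[i][j] = base + area(ri)      # extend the current sublist
                first[i][j] = (fi, fj)
            elif size(ri) == size(rj):
                dp[i][j] = area(ri)             # start a new sublist here
                first[i][j] = (i, j)
            if dp[i][j] > best_area:
                best_area, best_pos = dp[i][j], (i, j)
    if best_pos is None:
        return [], [], 0
    i, j = best_pos
    fi, fj = first[i][j]
    return p[fi:i + 1], q[fj:j + 1], best_area


p = [(0, 0, 2, 1), (2, 0, 3, 3), (3, 0, 5, 1)]
q = [(10, 5, 11, 6), (20, 5, 22, 6), (22, 5, 23, 8), (23, 5, 25, 6)]
common_p, common_q, common_area = max_common_sublist(p, q)
```

In this toy input, the three rectangles of `p` reappear in `q` shifted by (20, 5), so the whole of `p` is matched against the last three rectangles of `q`.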

5.2 Matching Region Computation


Given two expanded-clips, we can find the maximum common part among all polygon pairs. As shown in Figure 6(c), given two expanded-clips $ec_1$ and $ec_2$, the polygon pair ($p_1$, $p_1'$) has the largest common area. The common rectangle sublists of $p_1$ and $p_1'$ are ($r_1$, $r_2$, $r_3$) and ($r_3'$, $r_4'$, $r_5'$), respectively. Since the largest common parts of these two expanded-clips may lie at different locations, to search for clip candidates within $ec_1$ and $ec_2$, we align the expanded-clips and obtain the maximum alignment window according to the boundaries of $ec_1$ and $ec_2$. Figure 6(c) shows examples of the alignment windows $aw_1$ in $ec_1$ and $aw_2$ in $ec_2$.
For the clip representation, we adopt the method proposed in Reference [1]. A representation
example of alignment window is shown in Figure 6(d). The layout inside the alignment window
can be divided into non-uniform grids, aligning the horizontal and vertical lines contributed by
the coordinates of rectangles’ corners within the alignment window. Each grid is then filled by 1
or 0, depending on whether it is occupied by a rectangle. Therefore, the alignment window can be
represented as a (0, 1)-matrix with sorted horizontal and vertical grid coordinates, which can be
denoted as h_list and v_list. Each grid area can be computed based on the x/y coordinates of its
grid lines.
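The non-uniform grid representation can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions rather than the method of Reference [1]: rectangles and the window are tuples `(x1, y1, x2, y2)`, the function name `to_grid_matrix` is hypothetical, and the returned lists `xs` and `ys` hold the sorted x and y grid coordinates (playing the role of h_list and v_list).

```python
def to_grid_matrix(rects, window):
    """Rasterize rectangles inside `window` onto the grid induced by their corners."""
    wx1, wy1, wx2, wy2 = window
    # Clip every rectangle to the window and drop those that fall outside.
    clipped = [(max(x1, wx1), max(y1, wy1), min(x2, wx2), min(y2, wy2))
               for x1, y1, x2, y2 in rects]
    clipped = [r for r in clipped if r[0] < r[2] and r[1] < r[3]]
    # Grid lines come from the window boundary and all rectangle corners.
    xs = sorted({wx1, wx2} | {x for r in clipped for x in (r[0], r[2])})
    ys = sorted({wy1, wy2} | {y for r in clipped for y in (r[1], r[3])})
    matrix = [[0] * (len(xs) - 1) for _ in range(len(ys) - 1)]
    for row, (ylo, yhi) in enumerate(zip(ys, ys[1:])):
        for col, (xlo, xhi) in enumerate(zip(xs, xs[1:])):
            # A grid cell is 1 if some rectangle fully contains it.
            if any(x1 <= xlo and xhi <= x2 and y1 <= ylo and yhi <= y2
                   for x1, y1, x2, y2 in clipped):
                matrix[row][col] = 1
    return xs, ys, matrix


xs, ys, matrix = to_grid_matrix([(0, 0, 2, 2), (3, 0, 6, 1)], (0, 0, 4, 2))
```

Because every rectangle edge is a grid line, each cell is either fully covered or fully empty, so a single 0/1 value per cell is sufficient.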
We can now find the maximum matching region between two alignment windows. Figure 6(d) gives an example showing the computation of the maximum matching region $mmr_{12}$ of two different alignment windows $aw_1$ and $aw_2$. Given two alignment windows $aw_1$ and $aw_2$ from expanded-clips $ec_1$ and $ec_2$, respectively, the XOR($aw_1$, $aw_2$) operation first merges the sorted h_list and v_list of $aw_1$ and $aw_2$. As shown in Figure 6(d), the merged lists are h_list = ($x_1$, $x_2$, $x_3$, $x_4$, $x_5'$, $x_6'$, $x_5$) and v_list = ($y_1$, $y_2$). The corresponding (0, 1)-matrix is then constructed for XOR($aw_1$, $aw_2$). In this matrix, the elements equal to “0” are the identical parts of $aw_1$ and $aw_2$. Finding the maximum matching region between two expanded-clips is thus transformed into finding the maximum rectangular area filled with “0”s in XOR($aw_1$, $aw_2$).
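A minimal Python sketch of the XOR step on two grid representations is given below. It assumes that both windows have already been aligned to a common coordinate frame and cover the same extent, and that each window is a triple `(xs, ys, matrix)` of sorted grid coordinates and a 0/1 matrix; the names `cell_value` and `xor_windows` are hypothetical.

```python
from bisect import bisect_right


def cell_value(window, x, y):
    """Value of the grid cell of `window` that contains the point (x, y)."""
    xs, ys, matrix = window
    col = bisect_right(xs, x) - 1
    row = bisect_right(ys, y) - 1
    return matrix[row][col]


def xor_windows(a, b):
    """XOR two aligned windows on the grid obtained by merging their grid lines."""
    xs = sorted(set(a[0]) | set(b[0]))
    ys = sorted(set(a[1]) | set(b[1]))
    out = []
    for ylo, yhi in zip(ys, ys[1:]):
        cy = (ylo + yhi) / 2            # sample each merged cell at its centre
        row = []
        for xlo, xhi in zip(xs, xs[1:]):
            cx = (xlo + xhi) / 2
            row.append(cell_value(a, cx, cy) ^ cell_value(b, cx, cy))
        out.append(row)
    return xs, ys, out


aw1 = ([0, 2, 4], [0, 1], [[1, 0]])
aw2 = ([0, 3, 4], [0, 1], [[1, 0]])
merged_xs, merged_ys, xor_matrix = xor_windows(aw1, aw2)
```

In this example the two windows differ only in the strip between x = 2 and x = 3, which is the single "1" entry of the XOR matrix.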
The problem of finding the maximum matching region can be transformed into the problem of finding the largest rectangle in a (0, 1)-matrix, which can be solved optimally by a histogram algorithm [13]. In our case, starting from the top or the bottom, the histogram value of each element in the (0, 1)-matrix is recorded as the height of the column filled with “0” (i.e., the number of consecutive 0’s in the column from the top or bottom direction), and a histogram matrix is obtained. With the histogram data, we traverse each row of the histogram matrix and compute the maximum rectangular area filled with “0” using the algorithm described in Reference [13]. If the area calculated for the current row is greater than the areas calculated earlier, then the maximum rectangle is updated to the newly found rectangle. After all rows of the matrix have been traversed, the maximum matching region of the matrix is obtained.
In our problem context, the maximum matching region $mmr$ is not the region with the maximum number of elements filled by “0”s in the (0, 1)-matrix, but the region with the largest area. Since it traverses all elements in the (0, 1)-matrix, the time complexity of the histogram algorithm is $O(|h\_list| \times |v\_list|)$, where $|h\_list|$ and $|v\_list|$ are the numbers of elements in h_list and v_list, respectively.
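The distinction between the number of "0" cells and the physical area can be made concrete with a stack-based histogram sketch in Python. This is a generic largest-rectangle routine adapted to non-uniform grids, not necessarily the exact procedure of Reference [13]; the function name `max_zero_rectangle` and the `(xs, ys, matrix)` layout are assumptions made for illustration.

```python
def max_zero_rectangle(xs, ys, matrix):
    """Largest-area all-zero rectangle; cell (r, c) spans xs[c]..xs[c+1], ys[r]..ys[r+1]."""
    ncols = len(xs) - 1
    heights = [0] * ncols            # physical height of the zero run ending at this row
    best_area, best_box = 0, None
    for r in range(len(ys) - 1):
        row_height = ys[r + 1] - ys[r]
        for c in range(ncols):
            heights[c] = heights[c] + row_height if matrix[r][c] == 0 else 0
        stack = []                   # (start column, height), heights increasing
        for c in range(ncols + 1):
            h = heights[c] if c < ncols else 0   # sentinel flushes the stack
            start = c
            while stack and stack[-1][1] >= h:
                s, sh = stack.pop()
                a = sh * (xs[c] - xs[s])         # width measured in coordinates
                if a > best_area:
                    best_area = a
                    best_box = (xs[s], ys[r + 1] - sh, xs[c], ys[r + 1])
                start = s
            stack.append((start, h))
    return best_area, best_box


xs = [0, 1, 2, 10]
ys = [0, 1, 2, 6]
matrix = [[0, 0, 1],
          [0, 0, 1],
          [1, 1, 0]]
result = max_zero_rectangle(xs, ys, matrix)
```

Here the 2 × 2 block of zeros contains four cells but covers an area of only 4, whereas the single zero cell in the last row and column covers 8 × 4 = 32 and is therefore returned.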

90:12 X. He et al.

Fig. 7. An example of the ECC computation method. The results of the clip-minus operations ($c_i − c_j$) and ($c_j − c_i$) are the extra parts in $c_i$ and $c_j$, respectively. To check whether these two parts can be removed by edge shifting within ±$d$ ($d = e$), we apply enlarge and shrink operations to both $c_i$ and $c_j$. If both parts ($c_i − c_j$) and ($c_j − c_i$) can be absorbed, then $c_i$ and $c_j$ meet the ECC criteria.

The number of generated clip candidates directly affects the subsequent clustering runtime. To trade off between the runtime and the number of clusters, we control the candidate number by generating candidates only between pairs of expanded-clips that are sufficiently similar. Given two expanded-clips $ec_1$ and $ec_2$, if their maximum matching region $mmr_{12}$ is large enough, i.e., Area($mmr_{12}$) > ($w_c \times h_c$) in our implementation, then clip candidates are generated within $mmr_{12}$; otherwise, no clip candidates are created. To reduce the clustering problem size, we enumerate at most three positions in each of the x and y directions as optional candidate positions. For example, the three optional positions in the x direction are the left boundary, the center, and the right boundary of $mmr_{12}$. The optional positions in the y direction are set similarly. The candidates are generated by combining the optional positions in the x and y directions.
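One possible reading of this enumeration is sketched below in Python. The interpretation is an assumption: each returned position is taken to be the lower-left corner of a $w_c \times h_c$ clip placed flush left/bottom, centered, or flush right/top inside the matching region, and an extra per-axis fit check is added that the text does not state. The name `candidate_positions` is hypothetical.

```python
def candidate_positions(mmr, wc, hc):
    """Up to 3 x 3 lower-left corners for a wc-by-hc clip inside mmr = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = mmr
    # Assumed extra check: the clip must fit inside the region along both axes.
    if x2 - x1 < wc or y2 - y1 < hc:
        return []

    def options(lo, hi, length):
        # Flush to the low side, centred, and flush to the high side (duplicates merged).
        return sorted({lo, (lo + hi - length) / 2, hi - length})

    return [(x, y) for x in options(x1, x2, wc) for y in options(y1, y2, hc)]


wide = candidate_positions((0, 0, 300, 200), 200, 200)
square = candidate_positions((0, 0, 400, 400), 200, 200)
```

When the region is exactly as tall as the clip, the three y options coincide, so only three candidates remain; a region with slack in both directions yields the full nine.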

6 LOWER-BOUND AIDED CLIP CLUSTERING


In this section, we first introduce the technique of using maximum-clique to determine the lower
bound of cluster numbers. We then illustrate the method of utilizing the maximum-clique set to
help solve clip clustering. For each technique, the basic algorithm without clip relocation is pro-
vided first, followed by the general method with clip relocation.

6.1 Clip Similarity Check


Clip clustering is based on the similarity relationships between clips. The clip distance under the ACC and ECC criteria is computed for the similarity check as follows:
● ACC distance computation: For the ACC criterion with two clip candidates $c_i$ and $c_j$ from two different markers $m_i$ and $m_j$, Distance($c_i$, $c_j$) can be computed by Equation (1), where the XOR($c_i$, $c_j$) operation is the same as the XOR operation on alignment windows used for candidate generation (Section 5.2).
● ECC distance computation: For the ECC criterion, we propose a new and simple method to check whether Distance($c_i$, $c_j$) ≤ $d$ ($d = e$). An example is shown in Figure 7.
First, we apply the clip-minus operations ($c_i − c_j$) and ($c_j − c_i$) to $c_i$ and $c_j$. This operation is similar to the XOR operation, except for how the (0, 1)-matrix values are filled. Figure 7 gives the results for ($c_i − c_j$)


and ($c_j − c_i$). For the result ($c_i − c_j$), the elements with “1”s in its (0, 1)-matrix are the extra parts in $c_i$ compared to $c_j$. Likewise, for the result ($c_j − c_i$), the elements with “1”s in its (0, 1)-matrix are the extra parts in $c_j$ compared to $c_i$. When $c_i$ and $c_j$ match exactly, both ($c_i − c_j$) and ($c_j − c_i$) are empty clips whose (0, 1)-matrices are filled with “0”s. If both ($c_i − c_j$) and ($c_j − c_i$) can become empty clips by shifting polygon edges of $c_i$ and $c_j$ within a displacement of $d$, then Distance($c_i$, $c_j$) ≤ $d$, and $c_i$ and $c_j$ meet the ECC criteria.
To obtain the deformation range after shifting polygon edges inward and outward, we apply two kinds of operations to the polygons of a clip: shrinking and enlarging. These operations shift the x/y coordinates of polygon corners by ±$d$. As shown in Figure 7, the shrunk and enlarged results of $c_i$ are $c_{i\_Shrink}$ and $c_{i\_Enlarge}$, respectively, and the shrunk and enlarged results of $c_j$ are $c_{j\_Shrink}$ and $c_{j\_Enlarge}$, respectively.
Equations (3) and (4) give the method to check whether $c_i$ and $c_j$ meet the ECC criteria. If the results of both Equations (3) and (4) are empty clips, then Distance($c_i$, $c_j$) ≤ $d$, and $c_i$ and $c_j$ meet the ECC criteria,
$$((c_i - c_j) \cap c_{i\_Shrink}) - c_{j\_Enlarge}, \tag{3}$$

$$((c_j - c_i) \cap c_{j\_Shrink}) - c_{i\_Enlarge}. \tag{4}$$


Proof. The result ($c_i − c_j$) is the extra part in $c_i$ compared to $c_j$. To check whether ($c_i − c_j$) can become an empty clip by edge shifting, we first shift the edges of $c_i$ inward, i.e., shrink the polygons in $c_i$. The intersection between ($c_i − c_j$) and $c_{i\_Shrink}$ is the extra part that remains after shifting the edges of $c_i$ inward. If the remaining part is not empty, then we shift the edges of $c_j$ outward, i.e., enlarge the polygons in $c_j$. If the remaining extra part does not exceed the range of $c_{j\_Enlarge}$, then the ($c_i − c_j$) part meets the ECC criteria. This computation is given in Equation (3). In the same way, the result ($c_j − c_i$) can be checked by Equation (4). □
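Equations (3) and (4) can be tried out on a rasterized approximation. The Python sketch below is a rough illustration and not the authors' geometry engine: a clip is assumed to be a set of unit grid cells, shrinking and enlarging are approximated by square erosion and dilation of radius `d`, and all function names are hypothetical.

```python
def rect_cells(x1, y1, x2, y2):
    """Unit cells covered by the rectangle [x1, x2) x [y1, y2)."""
    return {(x, y) for x in range(x1, x2) for y in range(y1, y2)}


def enlarge(cells, d):
    """Shift every edge outward by d (square dilation)."""
    return {(x + dx, y + dy) for x, y in cells
            for dx in range(-d, d + 1) for dy in range(-d, d + 1)}


def shrink(cells, d):
    """Shift every edge inward by d (square erosion)."""
    return {(x, y) for x, y in cells
            if all((x + dx, y + dy) in cells
                   for dx in range(-d, d + 1) for dy in range(-d, d + 1))}


def meets_ecc(ci, cj, d):
    """True if the residues of Equations (3) and (4) are both empty."""
    residue_i = ((ci - cj) & shrink(ci, d)) - enlarge(cj, d)   # Equation (3)
    residue_j = ((cj - ci) & shrink(cj, d)) - enlarge(ci, d)   # Equation (4)
    return not residue_i and not residue_j


base = rect_cells(0, 0, 10, 10)
shifted = rect_cells(0, 0, 12, 10)    # right edge moved by 2
much_wider = rect_cells(0, 0, 20, 10)  # right edge moved by 10
```

With `d = 2`, the clip whose edge moved by 2 passes the check, while the clip whose edge moved by 10 leaves a non-empty residue and fails.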

6.2 Lower-bound Computation


Based on the similarity relationships between clips, an association graph can be constructed. We first introduce the basic association graph without clip relocation and then extend it to the general association graph with clip relocation.
● Basic association graph: To record whether the clips of two markers can be assigned to the same cluster, we use the basic association graph $G = (M, E)$ to represent the relationship between markers, where $M = \{m_1, m_2, \dots, m_n\}$ is the marker set, $n$ is the number of markers, and $E$ is the edge set. For two markers $m_i$ and $m_j$ ($i \ne j$), their clips $c_i$ and $c_j$ can be assigned to the same cluster, i.e., $\exists e_{i,j} \in E$, if they meet the requirement in Equation (5). If Distance($c_i$, $c_j$) ≤ $d$, then the requirement is satisfied by setting $c_k = c_i$. Figure 8(a) gives an example of the construction of the basic association graph according to the clip similarity relationship,
$$\exists c_k:\ \mathrm{Distance}(c_i, c_k) \le d,\ \mathrm{Distance}(c_j, c_k) \le d. \tag{5}$$
● General association graph: When considering clip relocation, suppose each marker $m_i$ has $n_i$ candidates, i.e., $c_i^0, c_i^1, \dots, c_i^{n_i}$. When constructing an association graph, the relationships between different clip candidates of markers are more complex than in the basic case.
For two clip candidates $c_i^{i'}$ from $m_i$ and $c_j^{j'}$ from $m_j$ ($i \ne j$), if the ACC/ECC distance between $c_i^{i'}$ and $c_j^{j'}$ does not exceed the given requirement $d$, then these two clip candidates can be assigned to the same cluster. If the distance is larger than $d$, but there exists a candidate $c_k^{k'}$ from $m_k$ that meets the similarity requirement with both $c_i^{i'}$ and $c_j^{j'}$, then these three clip candidates can be assigned to the same cluster by setting $c_k^{k'}$ as the representative clip of the cluster. In other

Fig. 8. The marker association graphs for the basic and general clustering problems, created based on clip
connections that meet the similarity requirement, are presented as (a) and (b), respectively. The marker
mutex graph, which is the complement of the marker association graph, illustrates that clips from connected
markers cannot be assigned to the same cluster. The marker mutex graphs and their maximum-clique results
for the basic and general clustering problems are provided as (c) and (d), respectively.

words, two clip candidates $c_i^{i'}$ and $c_j^{j'}$, $i \ne j$, can be assigned to the same cluster iff there exists a candidate $c_k^{k'}$ connected to both $c_i^{i'}$ and $c_j^{j'}$. This requirement is given in Equation (6). If Distance($c_i^{i'}$, $c_j^{j'}$) ≤ $d$, then Equation (6) remains valid by setting $c_k^{k'} = c_i^{i'}$. Figure 8(b) gives an example of the construction of the general association graph according to the clip-candidate similarity relationship.
$$\exists c_i^{i'}, c_j^{j'}, c_k^{k'}:\ \mathrm{Distance}(c_i^{i'}, c_k^{k'}) \le d,\ \mathrm{Distance}(c_j^{j'}, c_k^{k'}) \le d. \tag{6}$$

As shown in Figure 8(b), clips from $m_1$ and $m_2$ can be assigned to the same cluster, since their clip candidates $c_1^1$ and $c_2^2$ are connected, i.e., Distance($c_1^1$, $c_2^2$) ≤ $d$. Clips from $m_1$ and $m_4$ can also be assigned to the same cluster, since there exist $c_1^1$, $c_4^2$, and $c_6^1$ such that Distance($c_1^1$, $c_6^1$) ≤ $d$ and Distance($c_4^2$, $c_6^1$) ≤ $d$.
Note that the basic association graph can be seen as a special case of the general association graph with only one candidate per marker. Therefore, the general association graph construction also applies to the basic case.
If no candidates from markers $m_i$ and $m_j$ satisfy the requirement in Equation (6), then their clips cannot be assigned to the same cluster. We therefore derive a marker mutex graph $G' = (M, E')$ in which each connection $e'_{i,j} \in E'$ between two markers $m_i$ and $m_j$ denotes that their


clips cannot be grouped into the same cluster. Figure 8(c) and (d) show the basic and general marker mutex graphs $G'$ derived from their corresponding association graphs $G$, respectively.
Here we give the computation of the lower bound of the cluster number.
Theorem 1. Let MaxClique($M$, $E'$) represent the maximum-clique calculation for the marker mutex graph $G'$, and let its maximum-clique result be $LB$. The maximum-clique size $|LB|$ is a lower bound of the optimal cluster number $optC_{num}$.
Proof. For two markers $m_i$ and $m_j$ connected by $e'_{i,j} \in E'$, no candidates from $m_i$ and $m_j$ can be assigned to the same cluster. Since all markers in the maximum clique $LB$ are connected to each other, no two of them can share a cluster, so each of them requires its own cluster. Therefore, the number of markers in the maximum-clique set, $|LB|$, is less than or equal to the cluster number. In other words, $|LB|$ is a lower bound of the optimal cluster number $optC_{num}$. □
Taking Figure 8(c) and (d) as examples, the mutex relationships between markers are clearly reduced when clip relocation is considered, resulting in a smaller maximum-clique size in the general mutex graph, i.e., $optC_{num}$ is smaller in the general case.
Since the maximum-clique problem (MCP) is NP-hard, a heuristic algorithm may be applied to solve it; its result is still valid as a lower bound of the optimal solution.
Theorem 2. Suppose MaxClique($M$, $E'$) is solved by a heuristic algorithm, and its maximum-clique result is $LB'$. The maximum-clique size $|LB'|$ from the heuristic algorithm is still a valid lower bound of the cluster number.
Proof. The maximum-clique size $|LB'|$ from a heuristic algorithm cannot be larger than the optimal maximum-clique size. We have $|LB'| \le |LB| \le optC_{num}$. Therefore, $|LB'| \le optC_{num}$, and $|LB'|$ is a valid lower bound of the cluster number. □
In our implementation, an efficient algorithm [11] is used to find the maximum clique. It uses a branch-and-bound approach, searching systematically through possible solutions and applying approximate bounds to limit the search space.
Note that the MCP cannot be applied to clip clustering directly, because all nodes in a clique must be connected to each other. In the clip clustering problem, however, clips in the same cluster are only required to be connected to the representative clip, not to each other. Section 6.3 gives the clip clustering solution with optimization aided by the lower-bound result.
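The lower-bound computation can be sketched end to end in Python. The sketch assumes a user-supplied, reflexive and symmetric predicate `similar(a, b)` standing in for the ACC/ECC check, uses a plain branch-and-bound clique search rather than the algorithm of Reference [11], and all function names are hypothetical.

```python
from itertools import combinations


def association_edges(cands, similar):
    """Marker pairs satisfying Equation (6); cands maps marker -> list of candidates."""
    all_cands = [c for cs in cands.values() for c in cs]
    edges = set()
    for mi, mj in combinations(cands, 2):
        # Some candidate ck must be similar to a candidate of mi and to one of mj.
        if any(any(similar(ci, ck) for ci in cands[mi])
               and any(similar(cj, ck) for cj in cands[mj])
               for ck in all_cands):
            edges.add(frozenset((mi, mj)))
    return edges


def mutex_adjacency(markers, edges):
    """Complement of the association graph as an adjacency dict."""
    return {m: {n for n in markers if n != m and frozenset((m, n)) not in edges}
            for m in markers}


def max_clique(adj):
    """Exact maximum clique by simple branch and bound."""
    best = []

    def expand(clique, pool):
        nonlocal best
        if len(clique) > len(best):
            best = clique
        for idx, v in enumerate(pool):
            if len(clique) + len(pool) - idx <= len(best):
                return                      # cannot beat the incumbent
            expand(clique + [v], [u for u in pool[idx + 1:] if u in adj[v]])

    expand([], list(adj))
    return best


# Toy "clips" are numbers; two clips are similar if they differ by at most 1.
cands = {"m1": [0], "m2": [2], "m3": [1], "m4": [10]}
edges = association_edges(cands, lambda a, b: abs(a - b) <= 1)
lower_bound = len(max_clique(mutex_adjacency(list(cands), edges)))
```

In the toy data, `m1` and `m2` are not directly similar but are associated through the clip of `m3`, while `m4` is mutually exclusive with all other markers, which gives a lower bound of two clusters.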

6.3 Lower-bound Aided Clustering


The cluster number minimization problem has been formulated as an SCP in previous works [3, 21]. We first briefly provide the SCP formulations for the basic and general clustering problems. Our main focus is on optimizing the SCP solution using the lower-bound result.
● Basic clip clustering formulation: The SCP is expressed in Equation (7),
$$
\begin{aligned}
\min\ & \sum_{c_i \in C} x_i \\
\text{s.t.}\ & \sum_{c_i \in C} a_{ij} x_i \ge 1, && \forall c_j \in C \\
& x_i \in \{0, 1\}, && \forall c_i \in C \\
& a_{ij} = \begin{cases} 1 & \text{if } \mathrm{Distance}(c_i, c_j) \le d \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}
\tag{7}
$$


For the basic clustering problem, each marker $m_i$ corresponds uniquely to a clip $c_i$. The value of $a_{ij}$ indicates whether clips $c_i$ and $c_j$ meet the ACC or ECC criteria. For each cluster, a clip $c_i$ is chosen as the representative clip, i.e., $x_i = 1$, if it meets the similarity criteria with all other clips in the same cluster. The objective of the SCP is to minimize the number of representative clips, which is equivalent to minimizing the number of clusters.
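● General clip clustering formulation: When considering clip relocation, Equation (7) is expanded to Equation (8) for the general clip clustering problem,
$$
\begin{aligned}
\min\ & \sum_{c_i^{i'} \in C} x_i^{i'} \\
\text{s.t.}\ & \sum_{c_i^{i'} \in C} a_{ij}^{i'} x_i^{i'} \ge 1, && \forall m_j \in M \\
& \sum_{c_i^{i'} \in Cand(m_i)} x_i^{i'} \le 1, && \forall m_i \in M \\
& x_i^{i'} \in \{0, 1\}, && \forall c_i^{i'} \in C \\
& a_{ij}^{i'} = \begin{cases} 1 & \text{if } \exists c_j^{j'} \in Cand(m_j) \text{ and } \mathrm{Distance}(c_i^{i'}, c_j^{j'}) \le d \\ 0 & \text{otherwise,} \end{cases}
\end{aligned}
\tag{8}
$$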
where $M$ is the marker set and $C$ is the candidate set of all markers. In the general problem with clip relocation, each marker can correspond to multiple clip candidates. The similarity relationship $a_{ij}$ between $c_i$ and $c_j$ in Equation (7) is changed to $a_{ij}^{i'}$, representing the similarity relationship between candidate $c_i^{i'}$ and marker $m_j$: if there exists $c_j^{j'} \in Cand(m_j)$ such that Distance($c_i^{i'}$, $c_j^{j'}$) ≤ $d$, then $a_{ij}^{i'} = 1$. Every candidate $c_i^{i'}$ is connected to its own marker $m_i$, so $a_{ii}^{i'} = 1$ by default. In addition, for each marker $m_i$, at most one of its clip candidates $c_i^{i'} \in Cand(m_i)$ can be chosen as the representative clip of a cluster, i.e., $x_i^{i'} = 1$.
The basic problem in Equation (7) can be seen as a special case of the general problem in Equation (8), in which each marker corresponds to one candidate. Therefore, we discuss the SCP solution for the general problem in Equation (8) in the following.
There are many methods to solve SCP. Instead of directly applying an ILP tool, in our implemen-
tation, we use the method proposed in Reference [12] to solve SCP. It applies the Meta-heuristic
for Randomized Priority Search (Meta-RaPS). By considering an option list, it first constructs
a feasible solution with a random selection between the best choice and a member in the option
list. After that, in the local search phase, feasible solutions can be improved by an improvement
heuristic. In addition to applying the basic Meta-RaPS, the heuristic developed herein integrates
the elements of randomizing the selection of priority rules, penalizing the worst columns when
the searching space is highly condensed and defining the core problem to speed up the algorithm.
In Section 6.2, we obtained the maximum-clique set $LB$. Since the clips of all markers $m_i \in LB$ have to be assigned to different clusters, instead of the original initialization in Reference [12], we initialize the cluster set in the SCP according to $LB$. For each $m_j \in LB$, we randomly select one of its associated candidates $c_i^{i'}$ with $a_{ij}^{i'} = 1$ as the representative clip. The reason for not directly selecting $c_j^{j'} \in Cand(m_j)$ is that a candidate $c_j^{j'} \in Cand(m_j)$ may not always be suitable as a representative clip because of its poor similarity with other candidates.
Further, we adopt the maximum-clique size $|LB|$ as an early termination condition of the SCP search, since $|LB|$ has been proved to be a lower bound of the cluster number in Section 6.2. If the search reaches a solution whose cluster number equals the lower bound, then the SCP process can be terminated early.
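The early-termination idea can be illustrated with a small Python sketch. Note that it swaps in a plain randomized greedy set-cover heuristic with restarts instead of the Meta-RaPS solver of Reference [12], and it omits the initialization from $LB$; `covers`, `owner`, and the function names are hypothetical. `covers[c]` is the set of markers $m_j$ with $a = 1$ for candidate `c`, and `owner[c]` is the marker that candidate `c` belongs to.

```python
import random


def greedy_cover(markers, covers, owner, rng=None, top_k=2):
    """One greedy pass (randomized among the top_k gains if rng is given)."""
    uncovered, used_markers, chosen = set(markers), set(), []
    while uncovered:
        # Respect the one-representative-per-marker constraint of Equation (8).
        ranked = sorted(
            (c for c in covers
             if owner[c] not in used_markers and covers[c] & uncovered),
            key=lambda c: -len(covers[c] & uncovered),
        )
        if not ranked:
            return None                     # dead end under that constraint
        pick = ranked[0] if rng is None else rng.choice(ranked[:top_k])
        chosen.append(pick)
        used_markers.add(owner[pick])
        uncovered -= covers[pick]
    return chosen


def solve_with_lower_bound(markers, covers, owner, lower_bound, restarts=50, seed=0):
    """Keep the best cover found; stop as soon as it matches the lower bound."""
    rng = random.Random(seed)
    best = greedy_cover(markers, covers, owner)      # deterministic first pass
    for _ in range(restarts):
        if best is not None and len(best) <= lower_bound:
            break                                    # provably optimal, stop early
        trial = greedy_cover(markers, covers, owner, rng)
        if trial is not None and (best is None or len(trial) < len(best)):
            best = trial
    return best


covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5}, "d": {1}, "e": {5}}
owner = {"a": 1, "b": 3, "c": 4, "d": 1, "e": 5}
representatives = solve_with_lower_bound({1, 2, 3, 4, 5}, covers, owner, lower_bound=2)
```

In this example the first greedy pass already reaches the lower bound of two representatives, so no randomized restart is executed.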


Table 2. Benchmark Suite Information

| Benchmark | Width (nm) | Height (nm) | #Poly | #Corner | #Clip | CSize (nm) |
|---|---|---|---|---|---|---|
| Case1 | 3,750 | 3,000 | 77 | 385 | 16 | 200 |
| Case2 | 15,000 | 13,904 | 845 | 4,225 | 200 | 200 |
| Case3 | 40,288 | 25,830 | 9,779 | 64,031 | 5,068 | 200 |
| Case4 | 99,940 | 105,333 | 147,764 | 2,456,620 | 264,824 | 250 |
| Extend1 | 3,750 | 3,000 | 77 | 385 | 72 | 200 |
| Extend2 | 15,000 | 13,904 | 845 | 4,225 | 868 | 200 |
| Extend3 | 40,288 | 25,830 | 9,779 | 64,031 | 25,132 | 200 |
| Extend4 | 99,940 | 105,333 | 147,764 | 2,456,620 | 1,323,908 | 250 |

According to the SCP formulation in Equation (8), the minimum cluster number and the representative clips of the clusters can be obtained by solving the SCP. However, the cluster membership of non-representative clips is not determined by the SCP. Since the maximum cluster size is one of the evaluation metrics in the experiments, to maximize the cluster size, we apply a greedy algorithm that assigns each non-representative clip to the connected representative clip with the largest cluster size.
Further experimental comparisons on the clustering results are given in Section 7.
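The greedy assignment step can be sketched as follows. This is a minimal illustration that assumes a valid cover (every marker is connected to at least one representative), processes markers in the given order, and breaks ties by representative order; `assign_to_clusters` and `covers` are hypothetical names.

```python
def assign_to_clusters(markers, reps, covers):
    """Assign each marker to its connected representative with the largest cluster."""
    clusters = {r: [] for r in reps}
    for m in markers:
        connected = [r for r in reps if m in covers[r]]
        # Prefer the representative whose cluster is currently the largest.
        target = max(connected, key=lambda r: len(clusters[r]))
        clusters[target].append(m)
    return clusters


clusters = assign_to_clusters([1, 2, 3, 4], ["A", "B"], {"A": {1, 2, 3}, "B": {3, 4}})
```

Marker 3 is connected to both representatives and joins the larger cluster of "A", which increases the maximum cluster size.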

7 EXPERIMENTAL RESULTS
The proposed method is implemented in C++ with the Standard Template Library and evaluated on a PC with an Intel Core i7 (2.30 GHz) and 16 GB of DDR3 memory. For a fair comparison, the program runs in single-thread mode. The benchmark suite Case1–Case4 from the ICCAD 2016 Contest [19] is used in the experiments.
To evaluate the scalability of the proposed method, we extend Case1–Case4 by generating new markers abutting each marker in four directions, i.e., left, right, top, and bottom. Table 2 gives the specific information of Case1–Case4 and the extended benchmarks Extend1–Extend4. The columns “Width” and “Height” denote the layout width (nm) and height (nm), respectively. The columns “#Poly” and “#Corner” are the numbers of polygons and polygon vertices in each layout, respectively. The column “#Clip” represents the number of clips after exact clip matching. The column “CSize” gives the clip size.
As described in Reference [24], in clip-based scanning, the clip size and scanning stride are empirically determined according to the optical diameter under the given lithography specifications. All benchmarks shown in Table 2 are derived from industry, and the clip size is set to 200–250 nm. Note also that clip clustering has various applications [1], such as hotspot library generation, hierarchical data storage, and yield optimization speedup. However, the performance of a specific clip clustering application is mainly determined by its similarity criterion settings, whereas clip clustering itself focuses on clustering-related metrics. In our experiments, both the similarity criterion settings and the clustering metrics are taken from the ICCAD 2016 Contest [19], which are also derived from industry.
The results and performance analysis of our method are presented first, followed by a comparison between our experimental results and recent existing works.

7.1 Analysis of Our Method


We first give the results of clip clustering in Table 3. For the method “Ours Basic [10],” the clip centers are set to their markers’ centers without clip relocation, while for the method “Ours General,” the clip centers can be moved within their markers’ ranges. In Table 3, the column “Parameters” lists the similarity constraint; the columns “$C_{num}$” and “$C_{max}$” present the number of clusters and the number of clips in the largest cluster, respectively; the column “LB” presents the lower bound


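Table 3. Analysis on Clip Relocation Technique
Columns 3–7: Ours Basic [10]; columns 8–13: Ours General.

| Benchmark | Parameters | $C_{num}$ | $C_{max}$ | LB | Opt? | CPU(s) | Cand | $C_{num}$ | $C_{max}$ | LB | Opt? | CPU(s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Case1 | Default | 8 | 5 | — | — | 0.001 | 15 | 8 | 5 | 8 | Y | 0.007 |
| Case1 | a = 0.95 | 3 | 9 | 3 | Y | 0.001 | 37 | 3 | 9 | 3 | Y | 0.015 |
| Case1 | e = 4 | 5 | 5 | 5 | Y | 0.003 | 16 | 5 | 5 | 5 | Y | 0.027 |
| Case2 | Default | 26 | 104 | — | — | 0.003 | 230 | 18 | 122 | 18 | Y | 0.089 |
| Case2 | a = 0.95 | 11 | 106 | 11 | Y | 0.006 | 321 | 5 | 125 | 5 | Y | 0.578 |
| Case2 | a = 0.9 | 7 | 115 | 7 | Y | 0.006 | 336 | 3 | 130 | 3 | Y | 0.693 |
| Case2 | e = 4 | 18 | 104 | 18 | Y | 0.016 | 185 | 10 | 128 | 10 | Y | 1.054 |
| Case3 | Default | 70 | 792 | — | — | 0.047 | 137 | 51 | 1,048 | 51 | Y | 0.091 |
| Case3 | a = 0.85 | 13 | 2,672 | 13 | Y | 0.182 | 365 | 8 | 2,816 | 8 | Y | 0.450 |
| Case3 | e = 8 | 37 | 1,104 | 37 | Y | 0.138 | 182 | 23 | 1,392 | 23 | Y | 1.372 |
| Case4 | Default | 72 | 193,370 | — | — | 3.108 | 191 | 51 | 198,340 | 51 | Y | 4.883 |
| Case4 | a = 0.99 | 22 | 198,000 | 22 | Y | 3.157 | 295 | 16 | 198,340 | 16 | Y | 6.850 |
| Case4 | e = 2 | 46 | 198,000 | 46 | Y | 3.510 | 195 | 35 | 198,340 | 35 | Y | 10.343 |
| Extend1 | Default | 24 | 11 | — | — | 0.001 | 79 | 22 | 12 | 22 | Y | 0.026 |
| Extend1 | a = 0.95 | 14 | 12 | 14 | Y | 0.002 | 162 | 10 | 16 | 10 | Y | 0.109 |
| Extend1 | e = 4 | 20 | 11 | 20 | Y | 0.012 | 70 | 15 | 13 | 15 | Y | 0.173 |
| Extend2 | Default | 108 | 201 | — | — | 0.006 | 753 | 73 | 398 | 73 | Y | 0.505 |
| Extend2 | a = 0.95 | 47 | 201 | 47 | Y | 0.060 | 1,249 | 22 | 393 | 22 | Y | 2.231 |
| Extend2 | a = 0.9 | 29 | 206 | 28 | N | 0.259 | 1,407 | 12 | 443 | 10 | N | 3.494 |
| Extend2 | e = 4 | 81 | 205 | 81 | Y | 0.189 | 728 | 39 | 452 | 39 | Y | 5.090 |
| Extend3 | Default | 762 | 2,720 | — | — | 0.152 | 2,148 | 436 | 3,990 | 426 | N | 6.684 |
| Extend3 | a = 0.85 | 73 | 6,550 | 53 | N | 16.816 | 5,079 | 35 | 6,889 | 19 | N | 34.516 |
| Extend3 | e = 8 | 212 | 4,355 | 188 | N | 17.567 | 3,615 | 101 | 4,688 | 75 | N | 75.510 |
| Extend4 | Default | 1,064 | 1,033,058 | — | — | 4.124 | 3,807 | 531 | 1,036,596 | 493 | N | 40.427 |
| Extend4 | a = 0.99 | 486 | 1,034,506 | 456 | N | 31.399 | 5,949 | 257 | 1,036,827 | 215 | N | 55.671 |
| Extend4 | e = 2 | 650 | 1,035,387 | 639 | N | 50.139 | 4,363 | 334 | 1,038,905 | 272 | N | 75.634 |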

of the optimal clustering result; and the column “CPU” presents the runtime in seconds. For “Ours
General” method, the column “Cand” gives the number of candidates generated by the clip reloca-
tion technique.
In Table 3, for both the “Ours Basic [10]” and “Ours General” methods, our cluster number “$C_{num}$” is consistent with the clustering lower bound “LB” on the ICCAD 2016 Contest benchmarks Case1–Case4, that is, the obtained clustering solution is optimal. However, as the problem scale grows, the cluster number “$C_{num}$” becomes larger than the “LB” provided by the maximum clique, e.g., in Extend3 and Extend4. Note also that, for the “Ours General” method, the “LB” is calculated only from our generated candidates, and it is not the exact lower bound, as it is not practical to enumerate all possibilities. In general, compared with the “Ours Basic [10]” method, the average cluster number “$C_{num}$” of the “Ours General” method is reduced by 35.45%, while its largest cluster size “$C_{max}$” is increased by 26.47%.
Figure 9 presents the average runtime breakdown for clustering with and without clip relocation. For the basic clustering without clip relocation, as shown in Figure 9(a), the largest share of the runtime, 43.36%, is spent on the SCP clustering process. For the general clustering with clip relocation, more than half of the runtime is spent on building the association graph. This is because more candidates are generated when the clip relocation technique is applied, and thus more time is required for similarity checks among candidates. In our implementation, two methods are used to reduce the runtime of graph construction. The first is to reduce the number of candidates by setting a rigid similarity threshold, i.e., the generated candidates between two clips should be identical. The second is to shorten the comparison time of each pair of candidates. We found that some of the generated candidates are identical. If the polygon areas of two candidates are equal, then we first check clip consistency, which is much faster than checking the ACC/ECC metrics.
Figure 10 gives an example from Case2 to show that the clip relocation technique can effectively increase the opportunity of clip clustering. Given two expanded-clips $ec_1$ and $ec_2$, if their clip


Fig. 9. Average runtime breakdown of the methods “Ours Basic [10]” (a) and “Ours General” (b).

Fig. 10. An example from Case2 (ACC = 0.95) to show that the general clustering with clip relocation can
effectively increase the opportunity of clip clustering.


Fig. 11. For Case2 (ACC = 0.95), the clustering process in basic clustering [10] without clip relocation is given
in (a)–(d), while the general clustering process with clip relocation is shown in (e)–(h). By applying the clip
relocation technique, both the clustering lower bound and the final clustering number are much smaller,
which are equal to 5 clusters in (h) instead of 11 clusters in (d).

centers are fixed at their marker centers, then the clip candidates $c_1$ and $c_2$ are obtained from $ec_1$ and $ec_2$, respectively. When ACC = 0.95, these two clips do not satisfy the ACC requirement, since Distance($c_1$, $c_2$) > 0.05. However, when the clip relocation technique is applied, the clips can be moved within their expanded-clip regions, and $c_1'$ and $c_2'$ are chosen within $ec_1$ and $ec_2$, respectively. Since Distance($c_1'$, $c_2'$) ≤ 0.05, these two clips can be clustered together.
Figure 11 gives an example of the clustering process for Case2 under the constraint ACC = 0.95. Due to limited space, only marker IDs are plotted in Figure 11 instead of all candidates. For clustering without clip relocation, the clustering process is given in Figure 11(a)–(d). There are 26 remaining clips of markers after center-clip matching, and the association graph $G$ is shown in Figure 11(a). Figure 11(b) presents the mutex graph $G'$, in which clips from connected markers cannot be assigned to the same cluster. Figure 11(c) shows the corresponding maximum-clique result as dark nodes. Finally, 11 clusters are obtained after clustering, as shown in Figure 11(d), in which the marker IDs of the representative clips of all clusters are drawn in dark color. According to Theorem 1 and Theorem 2, since the number of clusters is equal to the maximum-clique size, the obtained clustering solution is optimal.
For clustering with the clip relocation technique, the clustering process is given in Figure 11(e)–(h). The association graph $G$, which has more nodes, is shown in Figure 11(e), since in the preparation process we perform expanded-clip matching instead of the previous center-clip matching. The marker mutex graph $G'$ is given in Figure 11(f). The maximum-clique result and the clustering result are shown in Figure 11(g) and (h), respectively. Compared with clustering without clip relocation, when the clip relocation technique is applied, both the clustering lower bound and the final cluster number in Figure 11(g) and (h) are much smaller: 5 clusters instead of the previous 11 clusters.
To quantify the impact of clip size on clip clustering, the clip size is set to 300 and 400 nm for further investigation. We find that the number of clusters increases significantly with the clip size. For example, compared with the results in Table 3, the number of clusters increases by 1.14–2.6 times under the 300-nm clip size and by 1.63–5.34 times under the 400-nm clip size. This is because larger clips usually contain more pattern features, so the


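Table 4. Comparison of Basic Clustering Results without Clip Relocation on ICCAD 2016 Contest Benchmark
Columns 3–5: Ref. [19]; columns 6–8: iClaire [2]; columns 9–11: GRASP [21]; columns 12–15: Ours Basic [10].

| Benchmark | Parameter | $C_{num}$ | $C_{max}$ | CPU(s) | $C_{num}$ | $C_{max}$ | CPU(s) | $C_{num}$ | $C_{max}$ | CPU(s) | $C_{num}$ | $C_{max}$ | LB | CPU(s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Case1 | Default | 8 | 5 | 0.903 | 8 | 5 | <0.001 | 8 | 5 | 0.001 | 8 | 5 | — | 0.001 |
| Case1 | a = 0.95 | 4 | 6 | 1.808 | 3 | 9 | 0.01 | 3 | 9 | 0.001 | 3 | 9 | 3 | 0.001 |
| Case1 | e = 4 | 5 | 5 | 1.324 | 5 | 5 | <0.001 | 5 | 5 | 0.001 | 5 | 5 | 5 | 0.003 |
| Case2 | Default | 26 | 104 | 1.167 | 26 | 104 | <0.001 | 26 | 104 | 0.017 | 26 | 104 | — | 0.003 |
| Case2 | a = 0.95 | 13 | 106 | 0.854 | 11 | 106 | 0.01 | 11 | 106 | 0.014 | 11 | 106 | 11 | 0.006 |
| Case2 | a = 0.9 | 10 | 114 | 1.168 | 7 | 112 | 0.01 | 7 | 112 | 0.015 | 7 | 115 | 7 | 0.006 |
| Case2 | e = 4 | 18 | 104 | 0.874 | 18 | 104 | 0.01 | 18 | 104 | 0.017 | 18 | 104 | 18 | 0.016 |
| Case3 | Default | 70 | 792 | 1.314 | 70 | 792 | 0.06 | 70 | 792 | 0.149 | 70 | 792 | — | 0.047 |
| Case3 | a = 0.85 | 26 | 1,344 | 1.232 | 13 | 2,608 | 0.11 | 13 | 2,672 | 0.159 | 13 | 2,672 | 13 | 0.182 |
| Case3 | e = 8 | 52 | 1,048 | 1.311 | 37 | 1,056 | 0.11 | 47 | 1,048 | 0.161 | 37 | 1,104 | 37 | 0.138 |
| Case4 | Default | 72 | 193,370 | 5.279 | 72 | 193,370 | 3.27 | 72 | 193,370 | 2.298 | 72 | 193,370 | — | 3.108 |
| Case4 | a = 0.99 | 31 | 197,660 | 4.740 | 24 | 197,830 | 3.34 | 24 | 197,830 | 2.352 | 22 | 198,000 | 22 | 3.157 |
| Case4 | e = 2 | 57 | 193,540 | 4.364 | 46 | 193,710 | 3.53 | 52 | 193,540 | 2.428 | 46 | 198,000 | 46 | 3.510 |

Since different machines are used, the runtime is reported for reference only.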

clustering opportunities will be reduced. In addition, the cluster number is decreased by 28.5%
when applying our clip relocation technology with the current clip size of 200 nm, while the cluster
number is decreased by 24.1% and 21.9% with clip sizes of 300 and 400 nm, respectively. As the
marker size is not enlarged with the clip size, the advantage of applying clip relocation technology
become smaller.

7.2 Comparison with Previous Methods


To evaluate our performance, we first compare the results of “Ours Basic [10]” with their
counterparts in recent works, namely “Ref.” [19], iClaire [2], and GRASP [21], in Table 4 (the
results for “Ref.” [19] are taken from the contest website [19]). For fairness, all methods are
run under the same settings, i.e., every clip center is set to its marker center.
Since no binaries have been released for “Ref.,” “iClaire,” and “GRASP,” we can only compare
against their reported results on the ICCAD 2016 Contest benchmarks, not on the extended
benchmarks in Table 6. As Table 4 shows, our method obtains the minimum cluster count in
column “Cnum” and the largest cluster size in column “Cmax.” Our “Cnum” results match or
improve on those of every counterpart on all benchmarks. Note that iClaire also achieves nearly
optimal results in this setting without relocation. Because iClaire and our method perform almost
identically here, they should be compared further under other conditions.
For that reason, we further evaluate the clustering performance with our clip relocation
technique and compare the “Ours General” method with the “Clip Shift” method [3] and the
“iClaire” method [2] in Table 5. For the ICCAD 2016 Contest benchmarks, the cluster number
“Cnum” of the “Ours General” method is the smallest on most benchmarks. Compared with
“Clip Shift” and “iClaire,” our clustering method reduces the cluster number “Cnum” by 14.49%
and 8.17% on average, respectively. In a few cases our “Cnum” value is not the lowest, because
our optimal results are computed over the candidates we generate rather than over an
enumeration of all possible candidates. As shown in Table 5, the runtime of our method is
comparable to the minimum runtime on all benchmarks.
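
As a quick check of the averages quoted above, the sketch below recomputes them from the “Cnum” columns of Table 5. It assumes the average reduction is the relative difference between column totals over the 13 benchmark settings; the text does not state the averaging rule, and the helper name `total_reduction_pct` is illustrative.

```python
def total_reduction_pct(baseline, ours):
    """Percentage by which sum(ours) is below sum(baseline)."""
    if len(baseline) != len(ours):
        raise ValueError("both columns must list the same benchmark settings")
    base_total = sum(baseline)
    return 100.0 * (base_total - sum(ours)) / base_total


# "Cnum" columns of Table 5, in row order (Case1 to Case4).
clip_shift = [8, 3, 5, 20, 6, 4, 13, 55, 8, 34, 59, 17, 44]
iclaire = [8, 3, 5, 20, 8, 5, 12, 47, 14, 27, 53, 22, 33]
ours_general = [8, 3, 5, 18, 5, 3, 10, 51, 8, 23, 51, 16, 35]

print(f"{total_reduction_pct(clip_shift, ours_general):.2f}")  # 14.49
print(f"{total_reduction_pct(iclaire, ours_general):.2f}")     # 8.17
```

Under this assumption the totals are 276, 257, and 236 clusters, which reproduces the reported 14.49% and 8.17%.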
The scalability analysis of the clip relocation method is given in Table 6. The “Clip Shift”
results are obtained from its released binary. We mainly focus on the large benchmarks Extend3
and Extend4. On these large-scale benchmarks, our average cluster count “Cnum” is 16.59%
smaller than that of “Clip Shift,” with a 74.11% shorter runtime.
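
The two percentages can be checked by hand from Table 6. The worked arithmetic below assumes that both averages are relative differences between column totals over the six Extend3 and Extend4 rows, with “Clip Shift” as the baseline; the text does not state the averaging rule, and the symbols $\Delta C_{\mathrm{num}}$ and $\Delta T_{\mathrm{CPU}}$ are introduced here only for illustration.

```latex
\begin{align*}
\Delta C_{\mathrm{num}}
  &= \frac{(436+31+163+614+274+513) - (436+35+101+531+257+334)}{436+31+163+614+274+513}
   = \frac{2031 - 1694}{2031} \approx 16.59\%, \\
\Delta T_{\mathrm{CPU}}
  &= \frac{(12+53+111+179+254+505) - (6.684+34.516+75.510+40.427+55.671+75.634)}{12+53+111+179+254+505}
   = \frac{1114 - 288.442}{1114} \approx 74.11\%.
\end{align*}
```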


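Table 5. Comparison of General Clustering Results with Clip Relocation on ICCAD 2016 Contest Benchmark

| Benchmark | Parameters | Clip Shift [3] Cnum | Clip Shift [3] Cmax | Clip Shift [3] CPU(s) | iClaire [2] Cnum | iClaire [2] CPU(s) | Ours General Cnum | Ours General Cmax | Ours General LB | Ours General CPU(s) |
|---|---|---|---|---|---|---|---|---|---|---|
| Case1 | Default | 8 | 5 | <0.5 | 8 | <0.001 | 8 | 5 | 8 | 0.007 |
| Case1 | a = 0.95 | 3 | 9 | <0.5 | 3 | 0.01 | 3 | 9 | 3 | 0.015 |
| Case1 | e = 4 | 5 | 5 | <0.5 | 5 | 0.01 | 5 | 5 | 5 | 0.027 |
| Case2 | Default | 20 | 106 | <0.5 | 20 | 0.01 | 18 | 122 | 18 | 0.089 |
| Case2 | a = 0.95 | 6 | 122 | <0.5 | 8 | 0.02 | 5 | 125 | 5 | 0.578 |
| Case2 | a = 0.9 | 4 | 120 | <0.5 | 5 | 0.02 | 3 | 130 | 3 | 0.693 |
| Case2 | e = 4 | 13 | 106 | 1 | 12 | 0.02 | 10 | 128 | 10 | 1.054 |
| Case3 | Default | 55 | 1,048 | 2 | 47 | 0.07 | 51 | 1,048 | 51 | 0.091 |
| Case3 | a = 0.85 | 8 | 3,335 | 2 | 14 | 0.12 | 8 | 2,816 | 8 | 0.450 |
| Case3 | e = 8 | 34 | 1,047 | 3 | 27 | 0.12 | 23 | 1,392 | 23 | 1.372 |
| Case4 | Default | 59 | 193,370 | 176 | 53 | 3.97 | 51 | 198,340 | 51 | 4.883 |
| Case4 | a = 0.99 | 17 | 198,340 | 198 | 22 | 4.3 | 16 | 198,340 | 16 | 6.850 |
| Case4 | e = 2 | 44 | 197,010 | 257 | 33 | 4.39 | 35 | 198,340 | 35 | 10.343 |

Since a different machine was used for the iClaire results, their runtime is reported for reference only.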

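Table 6. Comparison of Clustering Results with Clip Relocation on Extended Benchmark

| Benchmark | Parameters | Clip Shift [3] Cnum | Clip Shift [3] Cmax | Clip Shift [3] CPU(s) | Ours General Cnum | Ours General Cmax | Ours General LB | Ours General CPU(s) |
|---|---|---|---|---|---|---|---|---|
| Extend1 | Default | 22 | 12 | <0.5 | 22 | 12 | 22 | 0.026 |
| Extend1 | a = 0.95 | 10 | 16 | <0.5 | 10 | 16 | 10 | 0.109 |
| Extend1 | e = 4 | 16 | 12 | <0.5 | 15 | 13 | 15 | 0.173 |
| Extend2 | Default | 75 | 201 | 1 | 73 | 398 | 73 | 0.505 |
| Extend2 | a = 0.95 | 28 | 404 | 2 | 22 | 393 | 22 | 2.231 |
| Extend2 | a = 0.9 | 11 | 461 | 2 | 12 | 443 | 10 | 3.494 |
| Extend2 | e = 4 | 46 | 436 | 9 | 39 | 452 | 39 | 5.090 |
| Extend3 | Default | 436 | 5,102 | 12 | 436 | 3,990 | 426 | 6.684 |
| Extend3 | a = 0.85 | 31 | 8,737 | 53 | 35 | 6,889 | 19 | 34.516 |
| Extend3 | e = 8 | 163 | 4,740 | 111 | 101 | 4,688 | 75 | 75.510 |
| Extend4 | Default | 614 | 1,035,941 | 179 | 531 | 1,036,596 | 493 | 40.427 |
| Extend4 | a = 0.99 | 274 | 1,037,994 | 254 | 257 | 1,036,827 | 215 | 55.671 |
| Extend4 | e = 2 | 513 | 1,032,848 | 505 | 334 | 1,038,905 | 272 | 75.634 |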

In summary, compared with other reported clustering methods, our clip clustering method has
advantages in performance, scalability, and solution quality.

8 CONCLUSION
In this article, we propose a maximum-clique-based method with a clip relocation technique for
pattern clustering. First, we adopt a systematic clip relocation technique to increase the
opportunities for pattern clustering and thus reduce the cluster number. Then, we formulate
pattern clustering as an SCP. Since the SCP is NP-hard, we convert the lower-bound calculation
of the optimal clustering into an MCP and use the MCP result to guide the SCP solving process.
Experiments have been performed on the ICCAD 2016 Contest benchmarks. To analyze
scalability, we also enlarge the benchmarks with many more clips. Compared with previously
published works, our method achieves the smallest cluster number on almost all benchmarks
with very competitive runtime.
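
The interplay between the SCP and the MCP lower bound summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' solver: it swaps in a plain greedy set-cover heuristic, treats each candidate representative clip as the set of markers it can cover, and uses the lower bound only as an optimality certificate. The names `greedy_cover`, `candidates`, and the toy data are hypothetical.

```python
def greedy_cover(universe, candidates, lower_bound):
    """Greedy set cover with a lower-bound optimality check.

    candidates maps a candidate name to the set of markers it covers.
    Returns (chosen_candidates, proven_optimal).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        if not candidates:
            raise ValueError("no candidates available")
        # Pick the candidate covering the most still-uncovered markers.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("candidates do not cover every marker")
        chosen.append(best)
        uncovered -= gain
    # Matching the lower bound certifies that no smaller cover exists.
    return chosen, len(chosen) == lower_bound


# Hypothetical instance: five markers and four candidate representative clips.
universe = {0, 1, 2, 3, 4}
candidates = {
    "c0": {0, 3},
    "c1": {1, 4},
    "c2": {2},
    "c3": {3, 4},
}
# Assumed lower bound of 3, e.g. from a maximum clique {0, 1, 2} in a mutex graph.
chosen, proven = greedy_cover(universe, candidates, lower_bound=3)
print(chosen, proven)  # ['c0', 'c1', 'c2'] True
```

When the greedy cover is larger than the lower bound, the flag is False and a stronger search would be needed to close the gap, which is consistent with the rows of Table 6 where “Cnum” exceeds “LB.”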

REFERENCES
[1] Wei-Chun Chang et al. 2017. iClaire: A fast and general layout pattern classification algorithm. In Proceedings of the
54th Annual Design Automation Conference (DAC’17). 1–6.
[2] Wei-Chun Chang and Iris Hui-Ru Jiang. 2020. iClaire: A fast and general layout pattern classification algorithm with
clip shifting and centroid recreation. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 39, 8 (2020), 1662–1673.
[3] Kuan-Jung Chen, Yu-Kai Chuang, Bo-Yi Yu, and Shao-Yun Fang. 2017. Minimizing cluster number with clip shifting
in hotspot pattern classification. In Proceedings of the 54th Annual Design Automation Conference (DAC’17). 1–6.


[4] Duo Ding, Andres J. Torres, Fedor G. Pikus, and David Z. Pan. 2011. High performance lithographic hotspot detection
using hierarchically refined machine learning. In Proceedings of the 16th Asia and South Pacific Design Automation
Conference. IEEE Press, 775–780.
[5] Fan Yang, Subarna Sinha, Charles C. Chiang, Xuan Zeng, and Dian Zhou. 2017. Improved tangent space-based distance
metric for lithographic hotspot classification. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 36, 9 (2017), 1545–1556.
[6] Jhih-Rong Gao, Bei Yu, and David Z. Pan. 2014. Accurate lithography hotspot detection based on PCA-SVM classifier
with hierarchical data clustering. In Design-Process-Technology Co-optimization for Manufacturability VIII, Vol. 9053.
International Society for Optics and Photonics, 90530E.
[7] Michael R. Garey and David S. Johnson. 1979. Computers and Intractability; A Guide to the Theory of NP-Completeness.
W. H. Freeman & Co., New York, NY.
[8] Dan Gusfield. 1997. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology.
Cambridge University Press.
[9] Xu He, Yu Deng, Shizhe Zhou, Rui Li, Yao Wang, and Yang Guo. 2019. Lithography hotspot detection with FFT-based
feature extraction and imbalanced learning rate. ACM Trans. Des. Autom. Electr. Syst. (2019).
[10] Xu He, Yipei Wang, Zhiyong Fu, Yao Wang, and Yang Guo. 2020. Maximum clique based method for optimal solution of
pattern classification. In Proceedings of the IEEE 38th International Conference on Computer Design (ICCD’20). 304–311.
[11] Janez Konc et al. 2007. An improved branch and bound algorithm for the maximum clique problem. MATCH Commun.
Math. Comput. Chem. 58 (2007), 569–590.
[12] Guanghui Lan, Gail W. DePuy, and Gary E. Whitehouse. 2007. An effective and simple heuristic for the set covering
problem. Eur. J. Operat. Res. 176, 3 (2007), 1387–1403.
[13] Daniel Leskosky. Largest Rectangle in Histogram. Retrieved from
https://www.danielleskosky.com/largest-rectangle-in-histogram/.
[14] Shengyuan Lin, Jingyi Chen, and Jincheng Li. 2013. A novel fuzzy matching model for lithography hotspot detection.
In Proceedings of the 50th Annual Design Automation Conference (DAC’13).
[15] Ning Ma. 2009. Automatic IC Hotspot Classification and Detection Using Pattern-based Clustering.
University of California, Berkeley.
[16] Moojoon Shin and Jee Hyong Lee. 2016. Accurate lithography hotspot detection using deep convolutional neural
networks. J. Micro/Nanolithogr. MEMS MOEMS 15, 4 (2016), 043507.
[17] Rohit Reddy Takkala et al. 2019. CHIP: Clustering hotspots in layout using integer programming. J. Electr. Comput.
Eng. 2019 (2019).
[18] Wing Chiu Jason Tam and Ronald D. Shawn Blanton. 2015. LASIC: Layout analysis for systematic IC-defect
identification using clustering. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 34, 8 (2015), 1278–1290.
[19] Rasit O. Topaloglu. 2016. ICCAD-2016 CAD contest in pattern classification for integrated circuit design space analysis
and benchmark suite. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD). 41.
[20] Wan Yu Wen, Jin Cheng Li, Sheng Yuan Lin, Jing Yi Chen, and Shih Chieh Chang. 2014. A fuzzy-matching model
with grid reduction for lithography hotspot detection. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 33, 11 (2014),
1671–1680.
[21] Mingyu Woo et al. 2017. GRASP based metaheuristics for layout pattern classification. In Proceedings of the IEEE/ACM
International Conference on Computer-Aided Design (ICCAD’17). 512–518.
[22] Yan-Shiun Wu et al. 2018. MapReduce-based pattern classification for design space analysis. In Proceedings of the
International Symposium on VLSI Design, Automation and Test (VLSI-DAT’18). 1–4.
[23] Jingyu Xu, Subarna Sinha, and Charles C. Chiang. 2007. Accurate detection for process-hotspots with vias and
incomplete specification. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’07).
[24] Haoyu Yang, Shuhe Li, Cyrus Tabery, Bingqing Lin, and Bei Yu. 2021. Bridging the gap between layout pattern
sampling and hotspot detection via batch active learning. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 40, 7 (2021),
1464–1475.
[25] Haoyu Yang, Jing Su, Yi Zou, Yuzhe Ma, Bei Yu, and Evangeline F. Y. Young. 2019. Layout hotspot detection with feature
tensor generation and deep biased learning. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 38, 6 (2019), 1175–1187.
[26] Yen-Ting Yu, Ya-Chung Chan, Subarna Sinha, Iris Hui-Ru Jiang, and Charles Chiang. 2012. Accurate process-hotspot
detection using critical design rule extraction. In Proceedings of the 49th Annual Design Automation Conference
(DAC’12). ACM, 1167–1172.

Received 26 September 2022; revised 29 June 2023; accepted 15 July 2023

ACM Transactions on Design Automation of Electronic Systems, Vol. 28, No. 6, Article 90. Pub. date: October 2023.
