Wafer Map Defect Pattern Classification and Image Retrieval Using Convolutional Neural Network
Abstract—Wafer maps provide important information for engineers in identifying root causes of die failures during semiconductor manufacturing processes. We present a method for wafer map defect pattern classification and image retrieval using convolutional neural networks (CNNs). A total of 28 600 synthetic wafer maps for 22 defect classes are generated theoretically and used for CNN training, validation, and testing. The overall classification accuracy on the 6600 test wafer maps is 98.2%. In addition, 1191 real wafer maps are used to evaluate the same model trained on synthetic wafer maps. We demonstrate that by using only synthetic data for network training, real wafer maps can be classified with high accuracy. For image retrieval, a binary code for each wafer map is generated from the output of a fully connected layer with sigmoid activation. The retrieval error rate is 0.36% for the test dataset and 3.7% for the real wafers. Image retrieval takes 0.13 s per wafer map from the 18 000 wafer map library.

Index Terms—Deep learning, convolutional neural network, information retrieval, semiconductor defects.

I. INTRODUCTION

IN semiconductor manufacturing, wafer maps are used to visualize defect patterns and identify potential process issues. Inline metrology tools perform inspection after a certain process step and monitor abnormalities on dies. Then a wafer map is created based on the detected abnormal locations. One of the main purposes of wafer map visualization is to monitor any abnormal defect signatures and respond to process problems quickly. Once wafer map libraries are created with corresponding root causes, defect pattern similarities between wafers could be a good indication of common root causes, and this knowledge base can be used to solve problems. In order to have an effective knowledge base, two components are required: 1) wafer map defect pattern classification and 2) wafer map image retrieval from historical wafer map libraries. The wafer map defect pattern classification can provide information about the defect occurrence rate for each defect class, so that engineers can focus on the most important issue. The wafer map image retrieval helps to identify a root cause by querying historical wafer maps with known root causes.

There are a number of studies on wafer map pattern recognition [1]–[4]. Their classification approaches can be divided into two main groups: 1) model-based pattern recognition and 2) feature-extraction-based pattern recognition. The model-based pattern recognition uses a predefined probability distribution function for each defect pattern and selects the best matching model using an information criterion such as the Akaike information criterion (AIC) or the Bayesian information criterion (BIC). The feature-extraction-based pattern recognition extracts pattern features using techniques such as the correlogram and the Radon transform. Once the pattern features are extracted, common pattern classification algorithms such as support vector machines, neural networks, and nearest neighbors are applied to the classification task.

Deep convolutional neural networks (CNNs) [5] have recently advanced the state of the art in image classification performance and become the standard approach for image classification tasks. A CNN is an end-to-end model and does not require any task-specific feature engineering. This end-to-end approach is beneficial since we do not need to develop task-specific feature extractors, and domain-specific expert knowledge is not required. Another aspect of image classification is the problem of image retrieval [6], [7]. Image retrieval is the task of finding images containing similar objects or scenes, given a query image, and has been used in security and surveillance, medical imaging, and many other areas. Traditionally, image retrieval requires feature extraction using object color and shapes. Since a deep CNN can learn rich features at each layer, these intermediate features are used as good descriptors for image retrieval [8], [9].

In this paper, we employ a CNN for the defect pattern classification and wafer map retrieval tasks. As a dataset, we use wafer maps from simulation and from real wafers. For CNN training and validation, we use only the simulated wafer maps because the real data available for each class from the manufacturing process is highly imbalanced. In this case, it is beneficial to train the CNN using theoretically generated data so that we can also include rare defect patterns in the model and yet achieve reasonable classification accuracy. To verify the performance of the proposed method, we generated a dataset of 28,600 wafer maps by simulation. Data from 1,191 real wafers are also used to evaluate the performance of the trained CNN.

Our paper is organized as follows. In Section II, methods for wafer map pattern generation, the CNN configuration, and CNN-based image retrieval are described. In Section III, we present the results of defective wafer map pattern generation, the CNN training/validation/test results using theoretically

Manuscript received November 2, 2017; revised December 31, 2017; accepted January 15, 2018. Date of publication January 18, 2018; date of current version May 8, 2018. (Corresponding author: Takeshi Nakazawa.)
The authors are with Intel Corporation, Chandler, AZ 85226 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSM.2018.2795466
0894-6507 © 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
310 IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 31, NO. 2, MAY 2018
TABLE I
CNN CONFIGURATION
Fig. 1. Defect density wafer map with the random defects (left) and with
the random and the non-random defects (right).
Fig. 2. Examples of the generated wafer maps for each class.
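The two kinds of defect shown in Fig. 1, a random background and a non-random cluster, can be illustrated with a short simulation. The following is only a minimal sketch under stated assumptions and is not the generator used in this paper: the per-die Poisson model, the function names `poisson` and `make_wafer_map`, the quadrant convention, and every parameter value are hypothetical choices made for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def make_wafer_map(size=32, background=0.05, cluster=2.0,
                   quadrant=(1, 1), cluster_radius=0.25, seed=0):
    """Return a size x size grid of per-die defect counts (None = off-wafer).

    Assumptions (illustrative only): the wafer is the unit disc, the random
    background has a uniform Poisson mean per die, and the non-random cluster
    is a disc of elevated mean centred halfway out along the diagonal of the
    chosen quadrant, given as (sign_x, sign_y) with each sign in {-1, +1}.
    """
    rng = random.Random(seed)
    cx, cy = 0.5 * quadrant[0], 0.5 * quadrant[1]
    grid = []
    for row in range(size):
        y = 1.0 - 2.0 * (row + 0.5) / size   # row 0 is the top of the wafer
        line = []
        for col in range(size):
            x = 2.0 * (col + 0.5) / size - 1.0
            if x * x + y * y > 1.0:
                line.append(None)            # die centre lies outside the wafer
                continue
            lam = background                 # random defects everywhere
            if math.hypot(x - cx, y - cy) <= cluster_radius:
                lam += cluster               # extra density inside the cluster
            line.append(poisson(rng, lam))
        grid.append(line)
    return grid


# Random defects only versus random plus an upper-right cluster.
random_only = make_wafer_map(cluster=0.0)
with_cluster = make_wafer_map(quadrant=(1, 1))
```

Changing `quadrant` moves the same cluster shape to a different area of the map, which is one simple way to obtain classes that share a pattern but differ in location.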
defect and non-random cluster defect at each quadrant. The simulated defect classes contain similar defect patterns, but their locations are in different areas of the wafer map. For example, the non-random cluster defects for each quadrant are considered as different classes. The reason is that sometimes the defect location provides locational commonality
TABLE II
LIST OF WAFER MAP DEFECT CLASS
Fig. 3. Accuracy confusion matrix in percentage for the simulated test wafer
maps.
TABLE III
BATCH SIZE AND MEMORY USAGE
Fig. 6. Accuracy confusion matrix in percentage for the real wafer maps.
Fig. 8. Query wafer map (1st column) and the corresponding top 3 retrieved
wafer map images for the selected defect patterns. (a) Query wafer map is
from simulation. (b) Query wafer map is from the real wafer.
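The retrieval shown in Fig. 8 relies on a binary code obtained from the sigmoid output of a fully connected layer. The following is a minimal sketch of how such codes could be formed and compared, not the implementation used in this paper: the 0.5 threshold, the Hamming-distance ranking, the function names, and the toy activation values are all assumptions made for illustration.

```python
def binary_code(activations, threshold=0.5):
    """Pack sigmoid outputs (floats in [0, 1]) into one integer bit string.

    The 0.5 threshold is an assumed binarization rule.
    """
    code = 0
    for a in activations:
        code = (code << 1) | (1 if a >= threshold else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code, library, k=3):
    """Return the k library entries closest to the query in Hamming distance.

    library is a list of (wafer_id, code) pairs; ties keep library order
    because Python's sort is stable.
    """
    ranked = sorted(library, key=lambda item: hamming(query_code, item[1]))
    return ranked[:k]


# Toy example with made-up 8-unit sigmoid outputs (not data from the paper).
library = [
    ("sim_0001", binary_code([0.9, 0.1, 0.8, 0.2, 0.7, 0.1, 0.9, 0.3])),
    ("sim_0002", binary_code([0.1, 0.9, 0.2, 0.8, 0.1, 0.7, 0.2, 0.9])),
    ("sim_0003", binary_code([0.8, 0.2, 0.9, 0.4, 0.6, 0.6, 0.9, 0.1])),
]
query = binary_code([0.95, 0.05, 0.7, 0.3, 0.8, 0.2, 0.6, 0.4])
top = retrieve(query, library, k=2)
print([wafer_id for wafer_id, _ in top])  # ['sim_0001', 'sim_0003']
```

Packing the bits into integers keeps each comparison to a single XOR and a bit count, which is why binary codes suit searches over large wafer map libraries.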
TABLE IV
IMAGE RETRIEVAL ERROR RATE
Fig. 7. The misclassified wafer map from the real wafer (left) and the top
5 class probability (right).
the real wafers. Without a sufficiently large dataset, a CNN cannot be trained well, and in some cases it is difficult to collect enough data because the defect patterns occur rarely. Our model enables rare event detection without real data, which is particularly beneficial during the technology development phase.

We also demonstrate the efficiency and performance of CNN-based image retrieval using the binary code generated by the FC layer of our CNN model. Once the root causes and solutions of a particular defect mode are associated with its wafer map pattern(s), wafer map image retrieval can be used to trigger actions for problematic processes.

REFERENCES

[1] J. Y. Hwang and W. Kuo, “Model-based clustering for integrated circuit yield enhancement,” Eur. J. Oper. Res., vol. 178, no. 1, pp. 143–153, Apr. 2007.
[2] Y.-S. Jeong, S.-J. Kim, and M. K. Jeong, “Automatic identification of defect patterns in semiconductor wafer maps using spatial correlogram and dynamic time warping,” IEEE Trans. Semicond. Manuf., vol. 21, no. 4, pp. 625–637, Nov. 2008.
[3] T. Yuan, W. Kuo, and S. J. Bae, “Detection of spatial defect patterns generated in semiconductor fabrication processes,” IEEE Trans. Semicond. Manuf., vol. 24, no. 3, pp. 392–403, Aug. 2011.
[4] M.-J. Wu, J.-S. R. Jang, and J.-L. Chen, “Wafer map failure pattern recognition and similarity ranking for large-scale data sets,” IEEE Trans. Semicond. Manuf., vol. 28, no. 1, pp. 1–12, Feb. 2015.
[5] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[6] Y. Rui, T. S. Huang, and S.-F. Chang, “Image retrieval: Current techniques, promising directions, and open issues,” J. Vis. Commun. Image Represent., vol. 10, no. 1, pp. 39–62, Mar. 1999.
[7] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early years,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349–1380, Dec. 2000.
[8] A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, “Neural codes for image retrieval,” in Proc. Eur. Conf. Comput. Vis. (ECCV), Zürich, Switzerland, 2014, pp. 584–599.
[9] K. Lin, H.-F. Yang, J.-H. Hsiao, and C.-S. Chen, “Deep learning of binary hash codes for fast image retrieval,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Boston, MA, USA, 2015, pp. 27–35.
[10] R. Pasupathy, “Generating homogeneous Poisson processes,” in Wiley Encyclopedia of Operations Research and Management Science. Hoboken, NJ, USA: Wiley, Jan. 2011.

Takeshi Nakazawa received the Ph.D. degree in optical sciences from the College of Optical Sciences, University of Arizona in 2011. He is currently working with Intel Corporation, Chandler, AZ, USA, as a Yield Engineer/Data Scientist for developing image and data analysis systems and yield prediction models using machine learning. He was the recipient of several Intel divisional and department awards, the Best Paper Award for Intel Technology Journal, and several distinguished invention awards.

Deepak V. Kulkarni received the Ph.D. degree in mechanical engineering from the University of Illinois at Urbana-Champaign in 2005. He currently serves as an Engineering Technology Development Manager with the Assembly and Test Technology Development Group, Intel Corporation, Chandler, AZ, USA. His interests are in applying big data analysis techniques to improve manufacturing yield.