2017 Supervised Machine Learning Based Surface Inspection by Synthetizing Artificial Defects
Abstract: The preparation of labeled training data for supervised machine learning methods involves a lot of effort. For surface inspection tasks, this effort is often not economically reasonable. In this paper, an artificial defect synthetization algorithm based on a multistep stochastic process is proposed. It adds defects to fault-free surface images, which can then be used for supervised machine learning. By this means, a deep convolutional neural network has been trained, achieving a detection rate of 94% of occurring real defects on the presented test surface.
Keywords: inspection; surface images; image synthetization; artificial defects; supervised machine learning; convolutional neural nets; deep learning
I. INTRODUCTION
A. Motivation and Related Work
Whether or not a product surface is flawless has a large impact on its perceived quality and therefore contributes crucially to a buying decision. For consumer products like cars, cell phones, etc., and especially for the respective premium variants, a flawless surface is an absolute must and necessitates an inspection of all visible surfaces of all fabricated units. However, such an endeavor is a monotonous and exhausting task for human raters. It is not possible to maintain concentration over a prolonged time period, which frequently leads to varying and inconsistent quality assessments. Consequently, automated quality inspection systems are of great interest from an industrial point of view.
While for plain and homogeneous surfaces an inspection algorithm can be implemented comparatively easily on a hard-coded rule base [1], far more effort is needed for patterned, structured and/or complexly shaped surfaces. Distinguishing patterns and structures, including allowed part-to-part variations, from visually perceived imperfections is a major challenge. Therefore, over the past few years machine learning techniques have increasingly become the means of choice for complex inspection tasks [2,3,4].
One major drawback of machine learning based approaches is their demand for a sufficiently large amount of training data to achieve a good generalization on the problem [5,6]. For the supervised case, a manual labeling of the defects occurring in the training data is also needed, and these defects should in addition appear in sufficiently large numbers in the training data. For surface inspection tasks where some rare defects occur arbitrarily and are not reproducible, these requirements are difficult to fulfill at a reasonable expense.
B. Approach
In this paper, an artificial defect synthetization algorithm is proposed with the aim of enabling supervised machine learning for surface inspection tasks where no labeled defects can be provided for the model training. The presented approach is based on a multistep stochastic process that generates defect textures that are added to fault-free surface image patches. Therefore, in the manual selection process of training images, only surface images that are mainly fault-free have to be included. Consequently, with the exception of a few outliers, all extracted image patches can be considered and labeled as fault-free. Image patches labeled as defective are then provided only by carefully manipulating their fault-free counterparts by means of the proposed algorithm. For this reason, manual labeling is not necessary.
In order to demonstrate the approach on a real-world inspection task, the proposed synthetization algorithm is applied to images of the automotive interior parts described in Section II. Image patches with synthetic defects, together with non-manipulated fault-free image patches, are fed to a deep convolutional neural network (DCNN) that is finally evaluated on real defects. The obtained results suggest a good generalization of the DCNN model between fault-free and defective looking image patches. The presented approach is demonstrated on one example surface, though the principle is expected to be applicable to other surface types with similar defect morphologies as well.
II. TEST PARTS AND SURFACE DEFECTS
The example objects are complexly shaped decorative plastic parts that are manufactured with the so-called foil-insert-molding (FIM) process [7]. This is a special type of polymer injection-molding process, where a preformed decorated foil is inserted into the cavity of the injection molding machine. The polymer mass is injected afterwards. Apart from the surface geometry, the surface design is
Authorized licensed use limited to: Central South University. Downloaded on June 03,2021 at 15:30:56 UTC from IEEE Xplore. Restrictions apply.
exclusively determined by the imprinting process of the foil, which takes place before the actual injection molding process. It is therefore easily interchangeable without expensive mechanical modifications to the fabrication tools. Furthermore, FIM parts show a higher surface robustness and scratch resistance than usual injection-molded products. Overall, the FIM process allows the fabrication of high-quality plastic parts that are frequently used for premium products like high-priced cars.
One significant characteristic of FIM products is the slightly varying foil distortion caused by the thermoforming process, in which the geometry of the inlay foil is preformed with regard to the final surface geometry of the plastic parts. In case of an imprinted pattern this leads to slight random pattern distortions and shifts, especially for complexly shaped surface geometries. Furthermore, when the foil consists of multiple imprinted layers, even the visual appearance of pattern primitives can vary from part to part. In fact, all manufactured parts are unique in their surface texture details (compare Fig. 1). For humans those part-to-part variations do not stand out at all and can only be noticed by comparing two different parts. However, from the perspective of computer-aided inspection those characteristics are challenging, since there is no constant reference part (golden sample) available for comparison with the parts to be inspected. An inspection algorithm has to reliably distinguish between allowed part-to-part variations and visually perceived imperfections [8].
Figure 1. Surface pictures of two different Foil-Insert-Molding (FIM) parts that show the same region. Note the significant appearance variations of the surface pattern.
The focus of this paper is mainly on tiny surface imperfections that are quite difficult for a human to spot. They usually extend over only a few pixels in a surface picture and stand out weakly in terms of contrast (compare Fig. 2). Depending on the defect type and the measurement geometry used, they can appear bright or dark. The example surface shown is photographed in a bright-field setup. The majority of the light originating from the illumination unit is reflected by the part’s surface directly towards the camera. Therefore, the majority of occurring surface disturbances appear as dark structures.
Figure 2. Image patches of size 51x51 pixels that show real defects. Since these defects are very small and often weakly contrasted, they are difficult for a human to find on a large surface region.
Since the presented test parts are complexly shaped, they were photographed in four different positions to cover a large area of the surface. The training data contains pictures of 31 different parts, where occasionally some positions that showed obvious defects were excluded. The region of interest of each of the four positions was determined manually by drawing four masks. Image patches of 51x51 pixels were extracted in such a way that no corner was visible in the extracted patches. In this way, roughly 21 million different image patches could be extracted. It must be noted that this set of image patches is highly correlated, because patches can differ from each other merely by a shift of the extraction window by one pixel. Hence, this patch set roughly corresponds to 8000 uncorrelated image patches.
The validation data comprises 9 parts, again excluding images that show regions with defects. The test data consists of 19 parts. In this data set, occurring defects were manually labeled for the evaluation of the trained DCNN on real defects.
III. SYNTHETIZING ARTIFICIAL DEFECTS ON SURFACE IMAGES
The generation of artificial image data as an approach for enabling machine learning has already been used in cases where the acquisition and labeling of a large set of real images comes at great expense and/or is not possible for various reasons. One example is the automated recognition of number plates [9], where artificially generated images of number plates, including random affine transformations, are inserted into an arbitrarily selected image scene. The result is an image generator that can provide any number of unique images, which is helpful for training a complex machine learning model like a deep convolutional neural network (DCNN).
Regarding the synthetization of surface defects that cover only a few pixels, our approach can be roughly divided into four steps. The first step concerns the generation of a morphological defect skeleton by means of a stochastic process that can be considered a random walk with direction momentum. In the second step, a texture of the skeleton is randomly generated and widened afterwards. In the third step, the synthetized image patch of the defect texture is used to modify a randomly picked fault-free image patch that shows the type of surface to be inspected. In the final step, the defect visibility in the modified image is analyzed.
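To make the four synthetization steps more concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. All function names, the parameter values, the random initial direction, the 3x3 mean filter used in place of the widening and blurring, the choice of the patch minimum as gray level cutoff, and the visibility threshold are assumptions made purely for illustration. A random array stands in for a real fault-free surface patch.

```python
import numpy as np


def random_walk_skeleton(rng, size=51, margin=5, steps=40, step_size=1.0,
                         p_phi=0.3, p_theta=0.05, phi_max=0.5, theta_max=0.05):
    """Step 1: binary skeleton from a random walk with direction momentum."""
    x, y = rng.uniform(margin, size - margin - 1, size=2)  # random start point
    phi = rng.uniform(0.0, 2.0 * np.pi)  # initial direction (assumption)
    theta = 0.0                          # persistent polar angle bias
    skeleton = np.zeros((size, size), dtype=bool)
    for _ in range(steps):
        if rng.random() < p_theta:       # Bernoulli switch: change the bias
            theta += rng.uniform(-theta_max, theta_max)
        phi += theta
        if rng.random() < p_phi:         # Bernoulli switch: change the direction
            phi += rng.uniform(-phi_max, phi_max)
        x += step_size * np.cos(phi)
        y += step_size * np.sin(phi)
        col, row = int(round(x)), int(round(y))
        if 0 <= row < size and 0 <= col < size:  # drop points outside the patch
            skeleton[row, col] = True
    return skeleton


def box_blur(image):
    """3x3 mean filter with zero padding (stand-in for widening and blurring)."""
    padded = np.pad(image, 1)
    h, w = image.shape
    return sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0


def defect_texture(rng, skeleton, mean_contrast=40.0, spread=10.0):
    """Step 2: random gray-level texture on the skeleton, then smoothed."""
    texture = np.zeros(skeleton.shape)
    count = int(skeleton.sum())
    texture[skeleton] = mean_contrast + rng.uniform(-spread, spread, size=count)
    return box_blur(texture)


def add_defect(patch, texture):
    """Step 3: subtract the texture, limited by a gray level cutoff."""
    cutoff = patch.min()  # assumed cutoff: darkest gray level of the patch
    return np.maximum(patch - texture, cutoff)


def visibility(patch, modified):
    """Step 4: sum of the squared residual image."""
    return float(np.sum((modified - patch) ** 2))


rng = np.random.default_rng(0)
patch = rng.uniform(100.0, 150.0, size=(51, 51))  # stand-in for a real patch
skeleton = random_walk_skeleton(rng)
texture = defect_texture(rng, skeleton)
modified = add_defect(patch, texture)
score = visibility(patch, modified)
keep = score > 500.0  # hypothetical visibility threshold
```

In such a setup, `modified` would be labeled as defective and `patch` as fault-free, and samples with `keep == False` would be discarded before training. Bright defects could be produced with the same functions by inverting the patch before `add_defect` and inverting the result back.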
A. Skeleton generation
The skeleton generation algorithm resembles a random walker with momentum in a two-dimensional space [10]. The random walker starts from a random point $(x_0, y_0) \in \mathbb{R}^2$ with $a < x_0 < d_x - a - 1$ and $a < y_0 < d_y - a - 1$, and moves with a step size $s$ for a total number of steps $l$, where $d_x$ and $d_y$ are the sizes of the image patches and $a$ is the minimum distance between the starting point and the image edges. The coordinates $(x(i), y(i))$ of the $i$th point resulting from the subsequent stochastic process are obtained by
$\theta(i) = \theta(i-1) + X_\theta(i)\,\Delta\theta(i)$ (1)
$\varphi(i) = \varphi(i-1) + X_\varphi(i)\,\Delta\varphi(i) + \theta(i)$ (2)
$x(i) = x(i-1) + s\cos\varphi(i), \quad y(i) = y(i-1) + s\sin\varphi(i)$ (3)
where $\varphi(i)$ is the polar angle determining the movement direction of the random walker at the $i$th step. The random variable $X_\varphi(i)$, which follows a Bernoulli distribution with probability $p_\varphi$, determines whether $\varphi(i-1)$ is changed by $\Delta\varphi(i)$. In addition, a polar angle bias $\theta$ is introduced that causes a persistent change of the polar angle over several steps. $\theta$ itself is changed by $\Delta\theta(i)$ depending on the random variable $X_\theta(i)$, which in turn follows a Bernoulli distribution with probability $p_\theta$. $\Delta\varphi(i)$ and $\Delta\theta(i)$ are uniformly distributed random variables in the intervals $[-\Phi, \Phi]$ and $[-\Theta, \Theta]$, respectively. In addition, the parameters $s$, $p_\varphi$, $p_\theta$, $\Phi$, $\Theta$ and $l$ are also uniformly distributed random variables, which, however, are drawn only once at the start of every stochastic defect generation process. The corresponding intervals are hyperparameters and have to be chosen manually.
The probability $p_\theta$ has a large effect on how frequently bulky defects or defects with a large curvature are generated. In the presented experiments its range is chosen low in comparison to $p_\varphi$, so that thin, scratch-like defects are generated as well. In general, all hyperparameters should be chosen with the goal of generating the greatest possible diversity of artificially synthetized defects.
For every stochastic skeleton generation process carried out, a binary skeleton image $S$ is obtained by rounding all $l$ points $(x(i), y(i)) \in \mathbb{R}^2$ to the nearest integer values $(m(i), n(i)) \in \mathbb{Z}^2$. These determine the indices of the non-zero pixels. Indices that point to non-existent image regions are discarded.
B. Texture generation
In this step, a texture image of the defect is generated on the basis of the binary skeleton image. For all non-zero pixels $S_{m(i),n(i)}$ in the skeleton image, the corresponding non-zero pixels in the preliminary texture image are obtained by
$T_{m(i),n(i)} = \tau_\mu + \tau(i)$ (4)
with $\tau_\mu$ indicating the mean contrast of the defect texture and $\tau(i)$ indicating a uniformly distributed random variable within the interval $[-\tau_\sigma, \tau_\sigma]$. The parameters $\tau_\mu$ and $\tau_\sigma$ are also uniformly distributed random variables that are drawn once for every defect synthetization process. The preliminary texture image defined in (4) is finally blurred to obtain a smoother, more natural looking defect.
C. Modification of a fault-free image patch
In the last synthetization step, a fault-free surface image patch is manipulated by adding the previously generated defect texture. In this step a real surface image is required. Since our focus has so far been only on the surface images introduced in Section II, it must be expected that other surface types may require a modified version of the last two synthetization steps.
The fault-free image patch is manipulated by subtracting the texture image. In order to generate synthetized defects that appear as individual structures and do not strictly follow the gray levels of the regular surface image structure, a gray level cutoff for the defective image region was used that results from the gray levels in the corresponding fault-free image region. In addition, a further random variable maintains the regular image noise in case a large manipulated image region runs into the gray level cutoff.
The described modification step only leads to dark defect structures, which, however, represent the majority of real defect occurrences in our case study. In order to generate bright defect structures, which may be necessary for other illumination setups or other surface types, the fault-free image patch is simply inverted before this synthetization step and inverted back afterwards. Of course, dark and bright defects can be mixed by introducing a binary random variable that decides between the two variants.
D. Analysis of the defect visibility
This analysis of the defect visibility is actually not part of the synthetization procedure. Nevertheless, it is necessary in order to discard synthetized defect images in which the added defect is so weak that it is simply not visible. If those images are kept, they negatively affect the training process of a machine learning algorithm, leading to a significantly reduced classification accuracy.
Since a fault-free clone exists for every synthetized defect image, the analysis of the defect visibility was done by subtracting both images. The sum of the squared residual image provides an acceptable estimator for the defect visibility that takes the defect size into account. Although this approach helps to discard the majority of badly synthetized defect images, some exceptions occur. For example, in our case study, when a very thin defect is added exactly to the edge of a diamond-shaped surface element, the element only gets thicker by one or two pixels, which does not change its character. For a human such an image still appears fault-free, although the residual image shows a sufficiently large signal for the synthetized defect image to be accepted. Although this problem could be attenuated by slightly blurring the fault-free image before the image
subtraction, it could not be completely ensured that badly synthetized defect images are discarded without exception.
Conversely, there exists a reverse case where image patches falsely labeled as fault-free are fed to a machine learning algorithm. This happens particularly when some of the tiny defects are overlooked in the manual selection of the surface images.
Figure 3. Stepwise synthetization of artificial defects on fault-free image patches that show the surface to be inspected. The first row shows the fault-free image patches to be manipulated. The second row shows the randomly generated defect skeletons. The third row shows the generated defect textures. The fourth row shows the artificial defects added to the fault-free source images. The fifth row shows the residual images used for analyzing the defect visibility.
IV. MACHINE LEARNING MODEL
A. Architecture
A deep convolutional neural network (DCNN) was chosen as the machine learning method. The corresponding network architecture was designed with the case study in mind. Since the surface images are processed as gray-scale images due to the defect synthetization algorithm, the input to the first convolutional layer is an l x l x 1 image, where l is the height and the width of the image. For the presented experiments the input image size was fixed to l = 51. Correspondingly, the surface images presented in Section II were split into image patches of the same size.
The applied DCNN consists of seven layers, which are (64C2)-(128C3-MP2)-(128C2)-(128C3-MP2)-(128C2)-(128C3-MP2)-(2N). This corresponds to six convolutional layers with max-pooling on every second layer and one fully-connected output layer with two units. Neither additional convolutional layers nor an additional fully connected hidden layer increased the performance of the DCNN for the test parts.
Rectified Linear Units (ReLUs) were used as activation functions in preference to tanh units, except for the output layer, since they are generally known to show a shorter training time due to their non-saturating behavior [11]. As the output layer, two softmax units were used, one each for the prediction of fault-free and defective image labels. This configuration is approximately analogous to an output layer with a single sigmoid unit, which would predict a binary variable linked to the binary classification task. However, the output layer with softmax units is easily extendable by additional units in case a distinction between several defect classes is required.
In order to accelerate the training of the DCNN, batch normalization was applied in all convolutional layers before the activation functions. Batch normalization is known to reduce the internal covariate shift. It generally facilitates the training process by allowing higher learning rates, making training less prone to bad initializations, and acting as a regularizer [12].
B. Training
The DCNN model was trained from scratch using the Adam optimizer [13] with the hyperparameters $\alpha = 0.0005$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$ and a batch size of 128. The weights in each layer were initialized from a zero-mean Gaussian distribution with a standard deviation of 1. The biases were all initialized with the constant 0. Initializing the biases in the same way as the weights led to considerably worse training behavior. In order to enhance the model generalization, comprehensive data augmentation was implemented in the preprocessing step after the optional synthetization of defect images. It included random rotations, random shears, random zooms and random
flips of the image patch. By using larger raw patches as input to the preprocessing pipeline, no image extrapolation at the corners was necessary in the end. Image artefacts generated by extrapolation techniques would have a serious impact on the model’s performance, since they often show characteristics similar to defects.
The network was trained for roughly 650k iterations, which corresponds to a total of 83.2 million image patches (compare Fig. 4). Given a total of 21 million raw image patches, this corresponds to roughly 4 passes over the data. The training took roughly 25 hours on a GTX1080. TensorFlow [14] was used as the deep learning framework.
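The layer notation and the training figures of Section IV can be checked with a few lines of arithmetic. The sketch below is only illustrative. It assumes that "64C2" denotes 64 filters with a 2x2 kernel, that convolutions are unpadded with stride 1, and that "MP2" is non-overlapping 2x2 max-pooling that discards incomplete windows. None of these details are stated in the text, so the resulting feature map sizes are hypothetical. The patch count, in contrast, follows directly from the stated iteration count and batch size.

```python
# Each tuple: (number of filters, kernel size, pooling size or None),
# following the notation (64C2)-(128C3-MP2)-... from the architecture text.
LAYERS = [
    (64, 2, None), (128, 3, 2),
    (128, 2, None), (128, 3, 2),
    (128, 2, None), (128, 3, 2),
]


def feature_map_sizes(input_size=51):
    """Return (height, width, channels) after each convolutional layer."""
    sizes = []
    size = input_size
    for filters, kernel, pool in LAYERS:
        size = size - kernel + 1   # unpadded convolution, stride 1 (assumption)
        if pool is not None:
            size //= pool          # non-overlapping max-pooling (assumption)
        sizes.append((size, size, filters))
    return sizes


sizes = feature_map_sizes()
height, width, channels = sizes[-1]
flattened = height * width * channels  # inputs to the two-unit output layer

# Training volume as stated in the text.
iterations, batch_size, raw_patches = 650_000, 128, 21_000_000
patches_seen = iterations * batch_size   # 83.2 million patches
passes = patches_seen / raw_patches      # roughly 4 passes over the raw patches

for layer_index, shape in enumerate(sizes, start=1):
    print(f"conv layer {layer_index}: {shape}")
print(f"patches seen: {patches_seen}, passes over data: {passes:.2f}")
```

Under these assumptions the 51x51 input shrinks to a 3x3x128 feature map before the output layer. With zero-padded convolutions the spatial sizes would be larger, so the numbers should be read as one possible interpretation of the notation.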
V. EXPERIMENTAL RESULTS
A. Classification result for synthetically generated defects
The deep convolutional neural network described in Section IV achieved a maximum accuracy of 99.7% on the validation data set, which includes unseen surface images but only labeled defects that originate from the same defect synthetization algorithm used in the training stage. Fig. 4 indicates that the network overfits only slightly at the very end of the training process.
Figure 4. Training of the DCNN on surface images with artificially synthetized defects.
From these results it can be concluded that the model generalizes well to varying surface images. It is at least capable of reliably distinguishing between unseen variations of surface patterns and synthetically generated defects. The question of whether the model is also capable of detecting non-synthetized defects has to be answered on the test set.
B. Result on the test set with real defects
In contrast to the validation set, the test set includes real defects that were labeled manually. Since there is only a total of 35 labeled defects in the test set, the model performance was evaluated separately for defective and fault-free image patches. Fig. 5 illustrates the trained DCNN applied to the test set and also provides an idea of how large the image patches are in relation to one of the inspected surface regions.
In order to perform the inference on the test set in a practical manner, a sliding window with a jump distance of 20 pixels was used. A defect was counted as a “detected defect” (true positive) when at least one window overlapping it responded accordingly. If no overlapping window showed the proper response, it was counted as a “not detected defect”. On the other hand, if a fault-free image patch was predicted as defective, it was counted as an “erroneously detected defect”, otherwise as “correctly classified as fault-free”.
The proportion of image patches “correctly classified as fault-free” is 99.8% on the test set, which is comparable to the maximum accuracy of the network on the validation set, as expected. Of the 35 labeled defects in the test set, 33 were successfully detected by the DCNN, which corresponds to a detection rate of roughly 94%. Examples of “detected defects”, “not detected defects” and “erroneously detected defects” are shown in Fig. 6-8.
C. Discussion and future directions
The detection rate for real defects implies that the variety of synthetized defects produced by the algorithm proposed in Section III leads to a solid generalization of the trained 7-layer DCNN between flawless and faulty looking images of the presented surface. However, there are two defects that were not properly detected by the network (see Fig. 8). On the other hand, some properly detected defects deviate significantly in their appearance from the majority of synthetized ones (compare the two image patches in the lower row of Fig. 6).
The results demonstrate the suitability of supervised machine learning methods for surface inspection tasks that focus on tiny and weakly contrasted surface defects, despite the absence of real defects in the training data. Consequently, the total number of surface images to be taken is much lower, since real defects have to be provided only for model testing. Especially for rare and randomly occurring defects, the effort required for producing a large number of test parts, the time-consuming examination of the surface images and the manual drawing of corresponding ground truths for each image would be too large in an industrial environment.
While the initial results are very encouraging, the topic deserves further investigation, especially for other surface types or surface regions that show edges or other topographical structures. The inspection of defects that cover a larger surface region can probably be solved by a multiscale approach. Furthermore, a segmentation of detected defects would be useful for an automated inspection system, since it would allow a subsequent analysis of the defect morphology and even the visual perceptibility [15]. The former could be done with an accordingly modified DCNN and the same defect synthetization algorithm, since a detailed ground truth is automatically provided as a by-product.
VI. CONCLUSION
In this paper, an artificial defect synthetization algorithm is proposed to enable supervised machine learning for surface inspection tasks where no or not enough pictures
with labeled defects can be provided. The focus is on the detection of small and weakly contrasted defects, which cover only a few pixels in the surface image. The photographed surfaces show a regular pattern that varies in its alignment and distortion from part to part.
The proposed artificial defect synthetization algorithm makes use of a stochastic process to generate a wide variety of defect skeletons. A texture of the skeleton is randomly generated and widened afterwards. In the final step, the defect texture is added to a fault-free surface image patch and fed to a supervised machine learning model together with unmodified surface image patches. In this paper, a deep convolutional neural network with 7 layers was successfully trained with an accuracy of 99.7% on images with artificially synthetized defects. In comparison, 33 of 35 real defects could be detected in the test set.
Figure 5. One out of four inspected regions of two different parts of the test set. The blue rectangle indicates a successfully detected defect. The yellow one indicates an erroneous defect detection event.
Figure 6. Some of the real defects in the test set that were successfully detected by the DCNN.
Figure 8. The two defects in the test set that were not properly detected by the DCNN.
ACKNOWLEDGMENT
The research work of this paper was performed at the Polymer Competence Center Leoben GmbH (PCCL, Austria) within the framework of the COMET-program of the Federal Ministry for Transport, Innovation and Technology and the Federal Ministry of Economy, Family and Youth with contributions by Schöfer GmbH. The PCCL is funded by the Austrian Government and the State Governments of Styria and Upper Austria.
REFERENCES
[1] C. Demant, B. Streicher-Abel and P. Waszkewitz, “Industrial Image Processing: Visual Quality Control in Manufacturing,” Springer: Berlin, Heidelberg, 1999.
[2] S. Ravikumara, K.I. Ramachandran and V. Sugumaran, “Machine learning approach for automated visual inspection of machine components,” Expert Syst. Appl. 38(4), 2011, pp. 3260-3266.
[3] M.A.F. Pimentel, D.A. Clifton, L. Clifton and L. Tarassenko, “A review of novelty detection,” Signal Processing 99, 2014, pp. 215-249.
[4] E. Weigl, W. Heidl, E. Lughofer, T. Radauer and C. Eitzinger, “On improving performance of surface inspection systems by online active learning and flexible classifier updates,” Machine Vision and Applications, vol. 27, issue 1, pp. 103-127.
[5] F. Pereira, P. Norvig and A. Halevy, “The Unreasonable Effectiveness of Data,” IEEE Intelligent Systems, vol. 24, issue 2, March/April 2009, pp. 8-12.
[6] C. Sun, A. Shrivastava, S. Singh and A. Gupta, “Revisiting Unreasonable Effectiveness of Data in Deep Learning Era,” arXiv:1707.02968v1, 10 Jul 2017.
[7] A. Martinez, J. Castany and J. Aisa, “Characterization of In-Mold Decoration Process and Influence of the Fabric Characteristics in This Process,” Materials and Manufacturing Processes, vol. 26, issue 9, 2011, pp. 1164-1172.
[8] M. Haselmann and D.P. Gruber, “Anomaly detection on arbitrarily distorted 2D patterns by computation of a virtual golden sample,” Image Processing (ICIP), 2016, pp. 4398-4402.
[9] M. Earl, https://github.com/matthewearl/deep-anpr, 28.02.2017.
[10] M. Haselmann and D.P. Gruber, “Machine Learning im Bereich der automatisierten Qualitätsinspektion von Dekormuster,” 26. LKK, 2017.
[11] V. Nair and G.E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proc. 27th International Conference on Machine Learning, 2010.
[12] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” arXiv:1502.03167, 2015.
[13] D.P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv:1412.6980, 2014.
[14] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. Software available from tensorflow.org.
[15] D.P. Gruber, J. Macher, D. Haba, G.R. Berger, G. Pacher and W. Friesenbichler, “Measurement of the visual perceptibility of sink marks on injection molding parts by a new fast processing model,” Polymer Testing, vol. 33, 2014, pp. 7-12.