Defect Tracking Using GAI
Procedia CIRP 107 (2022) 1101–1106
Abstract
Supervised machine learning methods are increasingly used for detecting defects in automated visual inspection systems. However, these methods
require large quantities of annotated image data of the surface being inspected, including images of defective surfaces. In industrial contexts, the
latter are difficult to collect, since acquiring sufficient image data of defective surfaces is costly and time-consuming. Additionally, gathered
datasets tend to contain selection bias, e.g. the underrepresentation of certain defect classes, and therefore offer insufficient training data quality.
Synthetic training data is a promising alternative, as it can be generated without bias and in large quantities. In this work, we present a
procedural pipeline for generating training data based on physically based renderings of the object under inspection. Defects are introduced
as 3D models on the surface of the object. The generator can randomize object and camera parameters within given intervals, allowing the user
to apply domain randomization to bridge the domain gap between the synthetic data and the real world. Experiments suggest that data generated
in this way can be beneficial for training defect detection models.
© 2022 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the International Programme committee of the 55th CIRP Conference on Manufacturing Systems
Keywords: Synthetic training data; machine learning; surface inspection; industrial quality control; domain randomization
challenging and not ubiquitous. The main obstacle is the transferability from the synthetic domain to the real domain [7].

In this work, a novel approach for synthesizing training data for CNN-assisted defect detection is proposed. Existing approaches for synthetic defect generation are based on 2D merging of images [8] and 2D defect maps [2, 9–11]. Despite initial success and the principal applicability of synthetic data generation for defect detection, those approaches are limited to special use cases, complex to implement, or unable to completely close the domain gap. Therefore, we aim to introduce a novel approach for synthesizing 3D objects and defect data.
For this approach, a 3D model of the object under inspection is rendered. Defects are introduced into the model using negatives of the defect geometry. To bridge the domain gap to the real world, the principle of domain randomization is used [12]. Process parameters like textures, illumination, and defect shape are randomized within intervals. The goal is to extend the synthetic domain to such an extent that the real domain becomes a subset of the synthetic domain, enabling models trained with synthetic data to be applied to real-world data. In addition, less sensor realism is needed, which reduces the modeling complexity of the rendering pipeline. To our knowledge, this is the first extensive utilization of the principle of domain randomization for the generation of synthetic data for visual defect detection.
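To make the randomization principle concrete, the following minimal Python sketch draws process parameters uniformly from user-defined intervals; the parameter names and interval bounds are illustrative placeholders, not values from our pipeline.

```python
import random

# Illustrative randomization intervals (hypothetical names and bounds);
# each rendered image gets a fresh draw from every interval.
RANDOMIZATION_INTERVALS = {
    "light_intensity_w": (10.0, 500.0),   # lamp power
    "surface_roughness": (0.2, 0.9),      # material roughness
    "defect_radius_mm": (0.5, 3.0),       # defect tool size
}

def sample_scene_parameters(rng: random.Random) -> dict:
    """Draw one random scene configuration from the given intervals."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in RANDOMIZATION_INTERVALS.items()}

print(sample_scene_parameters(random.Random(42)))
```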
2. Related work

CNNs have improved visual object recognition and object detection significantly [13]. However, when trained only with small datasets, models tend to overfit and produce poor results. Popular strategies for dealing with small datasets are data augmentation, transfer learning, the use of pre-trained models, or synthetic training data.

For the detection of everyday objects, extensive datasets are available, e.g. ImageNet [14], KITTI [15], or MS COCO [16], which can be used to pre-train models. In contrast, publicly available datasets for surface defect detection, e.g. NEU-DET [3] or GC10-DET [4], are not extensive enough to be used for pre-training and are limited to specialized use cases, thus lacking the ability to transfer to new inspection tasks. The limited availability of suitable training data inhibits the further spread of CNNs in industrial inspection.

2.1. Synthetic training data

Multiple approaches such as [12, 17–20] have shown that synthetic data can be used to solve the data availability problem of small or biased datasets. Nikolenko gives an extensive survey of research on synthetic training data [21]. Machine learning models usually assume that training and test data distributions are similar. However, the distributions of synthesized training data and real-world test data differ significantly, making the domain transfer from source to target domain challenging. Several strategies have been proposed to tackle this challenge:

• Domain adaptation
• Sensor-realistic rendering
• Domain randomization

The goal of domain adaptation techniques is to reduce the statistical deviation between source and target domain. Several strategies, e.g. adversarial or generative approaches, have been suggested to adapt the domains. Toldo et al. [22] summarize state-of-the-art domain adaptation methods.

As stated by Hodan et al. [23], a high degree of visual realism can be achieved by modeling the geometry, textures, and materials to a high level of detail and by simulating the lighting as physically correctly as possible. A technique known as physically based rendering (PBR) [24] has been shown to help in bridging the domain gap and may even be necessary for rendering usable images featuring complex reflections.

The goal of domain randomization (DR) as introduced by Tobin et al. [12] is to extend the synthetic domain to the point where the real domain becomes a subset of the synthetic domain. They showed that DR enables the use of lower-quality renderers which are optimized for speed. Tremblay et al. [25] demonstrated that parts of the domain which are not the feature to be detected can be randomized in a non-realistic way. Prakash et al. [18] further demonstrate that generating images which preserve the structure of a scene can increase the performance of DR.

Several toolboxes like BlenderProc [26], NDDS [27], and Perception [28] have been published to assist researchers in rendering images and generating annotations. These toolboxes assume that 3D models and texture maps of the objects to be rendered are available. For surface defects, these must first be created; thus the toolboxes cannot be directly deployed for industrial inspection tasks.

2.2. Synthetic data generation for visual quality inspection

Successful automated visual defect detection with CNNs has been demonstrated extensively [29–33]. Typically, these authors approach the data problem by tediously collecting data for their use case or by relying on existing data collected with vision systems in the field. Each of the aforementioned researchers created a dataset specific to their inspection task that cannot be generalized to other inspection tasks.

There exists a variety of approaches for the generation of synthetic data for visual inspection that can be categorized into rendering-based and generative-based methods. Generative approaches use generative adversarial networks (GANs) to create or augment training data [10, 11, 34–36], making use of their intrinsic domain adaptation capability, while rendering approaches [1, 2, 7, 9, 37] aim for high sensor realism.

Retzlaff et al. [7] use procedural modeling and PBR to generate images for glass shade classification. Lee et al. [37] combine DR and sensor realism to render images for industrial object detection. [2, 8, 9] create synthetic images for defect detection. Haselmann and Gruber [8] use real images and overlay them with randomly generated defect textures. [2, 9]
3.3. Implementation

The TDG was implemented using the 3D computer graphics software toolset Blender. Blender features a Python API which allows us to automate the training data generation by executing scripts. Furthermore, Blender allows for advanced 3D object manipulation, which is used to execute the Boolean operations that integrate the defect tool shape into the object. For physically based rendering we make use of Blender's ray-tracing engine Cycles. In addition, we use BlenderProc [26] to create the segmentation maps showing the defect position in the rendered images, see Fig. 4h.
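A minimal bpy sketch of such a Boolean defect insertion is given below; the object and tool names are hypothetical, as the generator scripts themselves are not reproduced here.

```python
import bpy

# Hypothetical scene objects: the part under inspection and a defect tool
# whose geometry is the negative of the desired defect.
part = bpy.data.objects["inspection_part"]
tool = bpy.data.objects["defect_tool_sphere"]

# Subtract the defect tool from the part with a Boolean modifier.
mod = part.modifiers.new(name="defect", type='BOOLEAN')
mod.operation = 'DIFFERENCE'
mod.object = tool

# Apply the modifier so the defect becomes part of the mesh before rendering.
bpy.context.view_layer.objects.active = part
bpy.ops.object.modifier_apply(modifier=mod.name)
```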
Fig. 3 shows the general structure of the TDG. For each defect object, a script to generate the defects is executed, which contains the object's model with the material applied to it. The defect tool is used to generate all defects. Next, the supplied camera poses are randomized, and for each pose an arbitrary number of additional randomized poses can be added. The middle for-loop iterates over all of these randomized poses. Then the corresponding light pose is computed; in the same step, the light intensity is randomized too. Finally, the innermost for-loop randomizes the texture. Afterwards, all elements needed to execute BlenderProc have been assembled and the first image along with its segmentation map can be generated. When the textures have been changed the specified number of times, the next camera pose is chosen, and after all camera poses have been used, the next defect object is generated. The size of the dataset is thus the product of the iterations of each for-loop, as sketched below.
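The loop structure can be summarized in the following schematic Python sketch, in which all Blender-specific steps are reduced to counting; the default counts are placeholders.

```python
# Schematic of the TDG loop structure from Fig. 3. Rendering and
# randomization are abstracted away; only the nesting is shown.
def dataset_size(n_defect_objects=1, n_camera_poses=10,
                 extra_poses_per_view=0, textures_per_pose=1):
    n_images = 0
    for _defect_object in range(n_defect_objects):         # outer loop
        for _base_pose in range(n_camera_poses):           # supplied camera poses
            for _pose in range(1 + extra_poses_per_view):  # randomized poses
                # here: compute light pose, randomize light intensity
                for _texture in range(textures_per_pose):  # innermost loop
                    # here: randomize texture, render image + segmentation map
                    n_images += 1
    return n_images  # product of the iterations of each for-loop

assert dataset_size(2, 10, 1, 3) == 2 * 10 * 2 * 3
```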
4. Demonstration

To demonstrate our training data generation pipeline, a part from the manufacturing industry was chosen: a turbocharger housing for the automotive industry. The cast iron component has to be visually inspected after demolding. We chose the endoscopic inspection of the part's cavities for this demonstration. Difficult illumination conditions, varying relative positioning of the camera to the part's surface, as well as freeform surfaces make the visual inspection of the part challenging. The concept of using synthetic data for endoscopic inspection was first published in [39].

The cast iron component exhibits only one surface texture, therefore only one texture needs to be modeled. In addition, the cavity makes it unnecessary to model the background, since it will not be visible in the images, which reduces the modeling complexity. This makes the part well suited for demonstration, since it combines a challenging image processing task with relatively limited modeling effort.

4.1. Synthetic dataset generation for use case

To approximate the look of the real-world part, see Fig. 4a, we combined noise textures of different scales to generate a bump map. The result is shown in Fig. 4c. For this experiment we focus on the defect type 'blowhole' as seen in Fig. 4b. We created a spherical defect tool for our TDG. Fig. 4d shows a synthetic defect generated with the defect tool. We randomized image features that are not relevant for detecting the defect. Fig. 4e-g show exemplary variations in roughness (e), color (f), and defect size (g).
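A possible realization of such a bump map in Blender's shader nodes is sketched below; the material name and noise scales are assumptions, since the original modeling files are not published.

```python
import bpy

# Hypothetical material approximating the cast iron surface.
mat = bpy.data.materials.new(name="cast_iron_approx")
mat.use_nodes = True
nodes, links = mat.node_tree.nodes, mat.node_tree.links
bsdf = nodes["Principled BSDF"]

# Two noise textures of different scale, mixed into one height signal.
coarse = nodes.new("ShaderNodeTexNoise")
coarse.inputs["Scale"].default_value = 5.0
fine = nodes.new("ShaderNodeTexNoise")
fine.inputs["Scale"].default_value = 60.0
mix = nodes.new("ShaderNodeMixRGB")
links.new(coarse.outputs["Fac"], mix.inputs["Color1"])
links.new(fine.outputs["Fac"], mix.inputs["Color2"])

# Feed the combined signal into a bump node to perturb the surface normal.
bump = nodes.new("ShaderNodeBump")
links.new(mix.outputs["Color"], bump.inputs["Height"])
links.new(bump.outputs["Normal"], bsdf.inputs["Normal"])
```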
Fig. 4. (a) real-world endoscopic image of turbocharger housing cavity; (b) real-world endoscopic image of blowhole; (c) synthetic image from modeling; (d) synthetic image with generated defect from TDG; (e) example of variation of roughness; (f) example of variation of color; (g) example of variation of defect size; (h) segmentation map.
We manually chose a set of 137 inspection poses as input for the TDG. We generated 100 defects in a part and rendered 822 images per part. We repeated this process six times. The generated images were filtered depending on the defect size in the image: defects larger than 50 px became part of the defective class; images without visible defects were assigned to the defect-free class; images with very small defects were not considered further. In this way a synthetic dataset of 4,906 images was created.
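This filtering step can be expressed as a small classification rule over the segmentation maps. The sketch below assumes, as an interpretation, that non-zero segmentation pixels mark the defect and that the 50 px threshold refers to the defect's pixel count.

```python
from typing import Optional
import numpy as np

MIN_DEFECT_PIXELS = 50  # threshold from the text; exact meaning assumed

def classify(segmentation_map: np.ndarray) -> Optional[str]:
    """Assign an image to a class based on its rendered defect size."""
    defect_pixels = int(np.count_nonzero(segmentation_map))
    if defect_pixels == 0:
        return "defect-free"           # no visible defect
    if defect_pixels > MIN_DEFECT_PIXELS:
        return "defective"             # defect large enough to keep
    return None                        # very small defect: discard image
```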
4.2. Model training

To evaluate the suitability of our approach, we created two real-world datasets of endoscopic images. The training and the test dataset both show defective and defect-free textures, see Fig. 4a-b, and each consist of 110 images. The real-world training dataset was split into equally sized training and validation datasets.

We used an 18-layer ResNet [40] architecture which was pre-trained on the ImageNet dataset. Two experiments were conducted. First, we trained the model with the synthesized dataset: we split the synthetic dataset into training and validation datasets in an 80/20 ratio and pre-trained with a learning rate of 1e-4 and a batch size of 64 for 45 epochs. Then, we fine-tuned the model with the real-world training dataset. For the second experiment, we trained the model directly with the real-world training dataset without using our synthetic data.
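A minimal PyTorch sketch of this two-stage procedure is given below; the data loaders are assumed to exist, and the choice of SGD for the pre-training stage is an assumption carried over from the hyperparameter search described next.

```python
import torch
import torch.nn as nn
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"

# ImageNet-pre-trained ResNet-18 with a two-class head (defective / defect-free).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)
model = model.to(device)

def train(model, loader, epochs, lr):
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            criterion(model(images), labels).backward()
            optimizer.step()

# Stage 1: pre-train on the synthetic dataset (batch size 64, 45 epochs):
#   train(model, synthetic_train_loader, epochs=45, lr=1e-4)
# Stage 2: fine-tune on the small real-world training dataset
# (lr is one of {1e-2, 1e-3, 1e-4}, chosen via the search below):
#   train(model, real_train_loader, epochs=25, lr=1e-3)
```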
For the two experiments we conducted a hyperparameter search and evaluated combinations of three learning rates (1e-2, 1e-3, and 1e-4) and three batch sizes (16, 32, and 64). We trained for 25 epochs with an SGD optimizer. In both experiments, the best-performing network based on validation accuracy was tested on the real-world test dataset.
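The search itself is a plain grid over the nine combinations; in the sketch below, train_and_validate is a stub standing in for the training code above.

```python
import itertools

def train_and_validate(lr: float, batch_size: int, epochs: int = 25) -> float:
    """Stub: train the ResNet-18 with SGD and return validation accuracy."""
    return 0.0  # replace with actual training/validation

best_acc, best_config = -1.0, None
for lr, bs in itertools.product([1e-2, 1e-3, 1e-4], [16, 32, 64]):
    acc = train_and_validate(lr=lr, batch_size=bs)
    if acc > best_acc:
        best_acc, best_config = acc, (lr, bs)
# The best configuration by validation accuracy is evaluated once
# on the real-world test dataset.
```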
The model that was pre-trained with our synthetic data (accuracy 98.2 %, recall 96.2 %, specificity 100 %) outperformed the model trained solely on real-world data (accuracy 93.6 %, recall 94.2 %, specificity 93.1 %). The results of our investigation indicate that synthetic training data generated with our approach can be beneficial for training defect detection models when few real-world images are available.

5. Conclusion and outlook

We introduced and implemented a training data generator (TDG) that makes extensive use of the principle of domain randomization and can be used to generate synthetic training data for visual defect detection tasks. Our concept radically reduces the overall complexity and effort needed to construct a synthetic dataset for a given inspection task.

The TDG allows researchers to explore what makes good synthetic training data for defect detection, which parameters to randomize, and how to set appropriate randomization intervals for the process parameters. Researchers can use the TDG to investigate whether synthetic data is useful for their inspection task at hand.

Our use case demonstrates the usability of our approach. Future work will apply models trained with our synthesized data directly to real-world data without fine-tuning, taking advantage of the scalability of our TDG to create very large datasets with high variability. Additionally, we will leverage the pixelwise ground truth information that our TDG provides for object detection and segmentation tasks.

CRediT author statement

Ole Schmedemann: Conceptualization, Methodology, Software, Investigation, Writing - Original Draft, Writing - Review & Editing, Visualization. Melvin Baaß: Conceptualization, Methodology, Software, Writing - Review & Editing. Daniel Schoepflin: Writing - Review & Editing. Thorsten Schüppstuhl: Writing - Original Draft, Supervision, Resources, Funding acquisition, Project administration.
Acknowledgements

This research was funded by the German Federal Ministry for Economic Affairs and Climate Action under grant number ZF4736301LP9.

References

[1] Peres, R.S., Guedes, M., Miranda, F., Barata, J., 2021. Simulation-Based Data Augmentation for the Quality Inspection of Structural Adhesive With Deep Learning. IEEE Access 9, p. 76532.
[2] Gutierrez, P., Luschkova, M., Cordier, A., Shukor, M., Schappert, M., Dahmen, T., 2021. Synthetic Training Data Generation for Deep Learning Based Quality Inspection.
[3] He, Y., Song, K., Meng, Q., Yan, Y., 2020. An End-to-End Steel Surface Defect Detection Approach via Fusing Multiple Hierarchical Features. IEEE Transactions on Instrumentation and Measurement 69, p. 1493.
[4] Lv, X., Duan, F., Jiang, J.-J., Fu, X., Gan, L., 2020. Deep Metallic Surface Defect Detection: The New Benchmark and Detection Network. Sensors (Basel, Switzerland) 20.
[5] Peng, X., Sun, B., Ali, K., Saenko, K., 2014. Learning Deep Object Detectors from 3D Models.
[6] Su, H., Qi, C.R., Li, Y., Guibas, L.J., 2015. Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views, in 2015 IEEE International Conference on Computer Vision (ICCV), IEEE, p. 2686.
[7] Retzlaff, M.-G., Richter, M., Längle, T., Beyerer, J., Dachsbacher, C., 2016. Combining synthetic image acquisition and machine learning: accelerated design and deployment of sorting systems. Forum Bildverarbeitung 2016, p. 49.
[8] Haselmann, M., Gruber, D.P., 2019. Pixel-Wise Defect Detection by CNNs without Manually Labeled Training Data. Applied Artificial Intelligence 33, p. 548.
[9] Boikov, A., Payor, V., Savelev, R., Kolesnikov, A., 2021. Synthetic Data Generation for Steel Defect Detection and Classification Using Deep Learning. Symmetry 13, p. 1176.
[10] Niu, S., Li, B., Wang, X., Lin, H., 2020. Defect Image Sample Generation With GAN for Improving Defect Recognition. IEEE Transactions on Automation Science and Engineering, p. 1.
[11] Li, B., Yuan, X., Shi, M., 2020. Synthetic data generation based on local-foreground generative adversarial networks for surface defect detection. Journal of Electronic Imaging 29, p. 1.
[12] Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P., 2017. Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World, in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, p. 23.
[13] LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, p. 436.
[14] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L., 2009. ImageNet: A large-scale hierarchical image database, in 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, p. 248.
[15] Geiger, A., Lenz, P., Stiller, C., Urtasun, R., 2013. Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research 32, p. 1231.
[16] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L., 2014. Microsoft COCO: Common Objects in Context, in Computer Vision – ECCV 2014, Springer International Publishing, Cham, p. 740.
[17] Hinterstoisser, S., Lepetit, V., Wohlhart, P., Konolige, K., 2017. On Pre-Trained Image Features and Synthetic Images for Deep Learning.
[18] Prakash, A., Boochoon, S., Brophy, M., Acuna, D., Cameracci, E., State, G., Shapira, O., Birchfield, S., 2019. Structured Domain Randomization: Bridging the Reality Gap by Context-Aware Synthetic Data, in 2019 International Conference on Robotics and Automation (ICRA), IEEE, p. 7249.
[19] Schoepflin, D., Holst, D., Gomse, M., Schüppstuhl, T., 2021. Synthetic Training Data Generation for Visual Object Identification on Load Carriers. Procedia CIRP 104, p. 1257.
[20] Magana, A., Wu, H., Bauer, P., Reinhart, G., 2020. PoseNetwork: Pipeline for the Automated Generation of Synthetic Training Data and CNN for Object Detection, Segmentation, and Orientation Estimation, in 2020 25th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), IEEE, p. 587.
[21] Nikolenko, S.I., 2021. Synthetic Data for Deep Learning, 1st edn. Springer International Publishing, Cham.
[22] Toldo, M., Maracani, A., Michieli, U., Zanuttigh, P., 2020. Unsupervised Domain Adaptation in Semantic Segmentation: A Review. Technologies 8, p. 35.
[23] Hodan, T., Vineet, V., Gal, R., Shalev, E., Hanzelka, J., Connell, T., Urbina, P., Sinha, S.N., Guenter, B., 2019. Photorealistic Image Synthesis for Object Instance Detection.
[24] Pharr, M., Jakob, W., Humphreys, G., 2017. Physically Based Rendering, 3rd edn. Elsevier.
[25] Tremblay, J., Prakash, A., Acuna, D., Brophy, M., Jampani, V., Anil, C., To, T., Cameracci, E., Boochoon, S., Birchfield, S., 2018. Training Deep Networks With Synthetic Data: Bridging the Reality Gap by Domain Randomization, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, p. 969.
[26] Denninger, M., Sundermeyer, M., Winkelbauer, D., Zidan, Y., Olefir, D., Elbadrawy, M., Lodhi, A., Katam, H., 2019. BlenderProc.
[27] To, T., Tremblay, J., McKay, D., Yamaguchi, Y., Leung, K., Balanon, A., Cheng, J., Hodge, W., Birchfield, S., 2018. NDDS: NVIDIA Deep Learning Dataset Synthesizer.
[28] Unity Technologies, 2020. Unity Perception Package.
[29] Staar, B., Lütjen, M., Freitag, M., 2019. Anomaly detection with convolutional neural networks for industrial surface inspection, in Procedia CIRP 79, Elsevier, p. 484.
[30] Soukup, D., Huber-Mörk, R., 2014. Convolutional Neural Networks for Steel Surface Defect Detection from Photometric Stereo Images, in Advances in Visual Computing, Springer International Publishing, Cham, p. 668.
[31] Weimer, D., Scholz-Reiter, B., Shpitalni, M., 2016. Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. CIRP Annals 65, p. 417.
[32] Kim, S., Kim, W., Noh, Y.-K., Park, F.C., 2017. Transfer learning for automated optical inspection, in 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, p. 2517.
[33] Faghih-Roohi, S., Hajizadeh, S., Nunez, A., Babuska, R., Schutter, B. de, 2016. Deep convolutional neural networks for detection of rail surface defects, in 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, p. 2584.
[34] Mundt, M., Majumder, S., Murali, S., Panetsos, P., Ramesh, V., 2019. CODEBRIM: COncrete DEfect BRidge IMage Dataset. Zenodo.
[35] Mery, D., 2020. Aluminum Casting Inspection Using Deep Learning: A Method Based on Convolutional Neural Networks. Journal of Nondestructive Evaluation 39.
[36] Jain, S., Seth, G., Paruthi, A., Soni, U., Kumar, G., 2020. Synthetic data augmentation for surface defect detection and classification using deep learning. Journal of Intelligent Manufacturing.
[37] Lee, Y.-H., Chuang, C.-C., Lai, S.-H., Jhang, Z.-J., 2019. Automatic Generation of Photorealistic Training Data for Detection of Industrial Components, in 2019 IEEE International Conference on Image Processing (ICIP), IEEE, p. 2751.
[38] Bosnar, L., Saric, D., Dutta, S., Weibel, T., Rauhut, M., Hagen, H., Gospodnetic, P., 2020. Image Synthesis Pipeline for Surface Inspection.
[39] Bath, L., Schmedemann, O., Schüppstuhl, T., 2021. Development of new means regarding sensor positioning and measurement data evaluation – automation of industrial endoscopy. wt Werkstattstechnik online, p. 644.
[40] He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep Residual Learning for Image Recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p. 770.