Thesis - Anomaly Detection in Manufacturing

MASTER THESIS

Author: Ashish DAHAL
Supervisor: Dr. Phuong T. NGUYEN
Acknowledgements
I would like to express my deep gratitude to my academic supervisor, Dr. Phuong
T. Nguyen, for his invaluable guidance, support, and expertise throughout the thesis
process. His constant encouragement and feedback significantly contributed to the
success of this project.
I extend my immense appreciation to my supervisor Johan Westö from Novia
University of Applied Sciences, Finland, and to Mika Adler, my supervisor from
Mirka Oy, for their industry insights, practical suggestions, and mentorship.
Their contributions were crucial in shaping this research.
I am also grateful to my friends and family for their continuous support and
encouragement throughout my academic journey. Their unwavering belief in me
motivated me to overcome challenges and pursue my goals with enthusiasm.
Furthermore, I would like to acknowledge the many researchers and authors
whose work has provided the foundational knowledge for this study. Their dedica-
tion to advancing knowledge in the field of computer vision and machine learning
has been a constant source of inspiration and motivation.
Contents

Acknowledgements

1 Introduction
  1.1 Background and Motivation
  1.2 Problem Statement
  1.3 Scope and Limitations
  1.4 Thesis Structure and Overview

2 Literature Review
  2.1 Defect Detection in Manufacturing
  2.2 Image Pre-selection for Machine Learning
  2.3 Image Embeddings and Pre-trained Models
  2.4 Dimensionality Reduction Techniques: UMAP and Alternatives
  2.5 Anomaly Detection: LOF and Alternatives

3 Methodology
  3.1 The Proposed Framework
  3.2 Dataset Acquisition and Pre-processing
    3.2.1 Image Collection
    3.2.2 Image Pre-processing
  3.3 Feature Extraction using ResNet Model
    3.3.1 Pre-trained ResNet Model Selection
    3.3.2 Image Embeddings Calculation
  3.4 Unsupervised Anomaly Detection using LOF
    3.4.1 LOF Algorithm
    3.4.2 Hyperparameter Selection and Tuning
  3.5 Dimensionality Reduction using UMAP
    3.5.1 UMAP Algorithm
    3.5.2 Parameter Selection and Tuning
  3.6 Image Pre-selection for Annotation
    3.6.1 Threshold Determination for Anomaly Detection
    3.6.2 Image Exploration and Pre-selection
  3.7 Evaluation Metrics

5 Discussion
  5.1 Interpretation of the Results
  5.2 Potential Improvements and Extensions
    5.2.1 Automatic Threshold Selection
    5.2.2 Image Selection Based on Vector Quantization
    5.2.3 Contrastive Learning for Feature Extraction
    5.2.4 Active Learning Approach
  5.3 Applicability to Other Industries

6 Conclusions
  6.1 Summary of Findings
  6.2 Future Work

Bibliography
List of Figures
List of Tables
Chapter 1
Introduction
To address these research questions, this thesis introduces a methodology that har-
nesses image embeddings extracted from a pre-trained ResNet model, the Local Out-
lier Factor for anomaly identification, and UMAP for reducing dimensionality and
facilitating visualization. This approach aims to expedite the process of image pre-
selection and annotation, thereby optimizing the labelling process and improving
the overall quality of the training dataset.
• Chapter 3 describes the methodology used in the study, including dataset ac-
quisition and pre-processing, feature extraction using a pre-trained ResNet
model, unsupervised anomaly detection using Local Outlier Factor, image pre-
selection for annotation, dimensionality reduction using UMAP, and evalua-
tion metrics.
Chapter 2
Literature Review
UMAP has been used in various applications, including the analysis of genomic
data. For instance, UMAP was applied to biobank-derived genomic data of a Japanese
population, revealing fine-scale population structure and differentiating adjacent in-
sular subpopulations [36]. This study demonstrated that UMAP, in combination
with PCA (PCA-UMAP), was able to clearly distinguish neighbouring clusters while
retaining the global structure, making it a powerful tool for visualizing and under-
standing complex genomic data [36].
In the context of image data, UMAP has been used to visualize high-dimensional
image embeddings. For instance, in the work of Zhu et al. [37], UMAP was used
to visualize the embeddings of an image-to-image translation model. The authors
found that UMAP was able to capture meaningful semantic relationships in the im-
age data. Similarly, in the work of Wang et al. [38], UMAP was used to visualize the
embeddings of a deep learning model trained for image-text matching. The authors
found that UMAP was able to effectively capture the semantic relationships between
images and their associated text descriptions.
Likewise, Principal Component Analysis (PCA) is a classical linear dimensional-
ity reduction method that has been widely used to uncover large population struc-
tures [36]. PCA identifies the directions (principal components) in which the data
varies the most and projects the data onto these directions to reduce its dimensional-
ity [35]. However, PCA’s linear nature may not capture the fine and subtle structure,
and it may not maintain the global structure of the data as effectively as UMAP [36].
t-Distributed Stochastic Neighbor Embedding (t-SNE) is another non-linear di-
mensionality reduction method that has been used to interpret complex population
structures and disease biology [36]. t-SNE converts similarities between data points
to joint probabilities and minimizes the Kullback-Leibler divergence between the
joint probabilities of the low-dimensional embedding and the high-dimensional data
[35]. However, t-SNE is more focused on preserving local structures, and it may not
maintain the global structure of the data as effectively as UMAP [35].
In comparison, UMAP exhibits high stability and moderate accuracy, with the
second highest computing cost after t-SNE [35]. UMAP is also computationally
fast and scalable for application to large datasets [36]. Moreover, UMAP is capable
of clearly distinguishing neighbouring clusters while retaining the global structure,
making it a powerful tool for visualizing and understanding complex data [36].
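To make this comparison concrete, the following minimal sketch (our illustration, not drawn from the cited studies) applies all three reduction methods to the same matrix of image embeddings; the random input and parameter values are placeholders:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap  # umap-learn package

X = np.random.rand(1000, 512)  # stand-in for 512-dimensional image embeddings

X_pca = PCA(n_components=2).fit_transform(X)    # linear projection onto top components
X_tsne = TSNE(n_components=2, perplexity=30).fit_transform(X)  # preserves local structure
X_umap = umap.UMAP(n_components=2).fit_transform(X)  # balances local and global structure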
Chapter 3
Methodology
In this chapter, we present in detail the proposed framework for the assessment of
the efficacy of unsupervised learning approaches in streamlining the annotation pro-
cess in sandpaper manufacturing by identifying images of potential defects prior to
labelling. We conceived a methodology that harnesses image embeddings extracted
from a pre-trained ResNet model, the Local Outlier Factor for anomaly identifica-
tion, and UMAP for reducing dimensionality and facilitating visualization. We aim
to expedite the process of image pre-selection and annotation, thus optimizing the
labelling process and improving the overall quality of the training dataset.
The next challenge is to find a way to quickly view the images in the scatter plot
in areas of interest and tag the images as defective or non-defective. Several tools
can perform these tasks; however, we opted for Plotly Dash and FiftyOne for our
research. Using these two Python packages, we can select data points on the graph
to preview images quickly, filter images based on anomaly scores, compare image
intensities between two or more selected regions on the graph, tag images, etc.
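As an illustration of this exploration step, the sketch below (a hypothetical example, not the thesis code) renders the two-dimensional UMAP coordinates as an interactive Plotly scatter plot coloured by anomaly score; the Parquet file and the column names ("umap_x", "umap_y", "lof_score", "path") are our assumptions:

import pandas as pd
import plotly.express as px

# Assumed table of precomputed results: one row per image
df = pd.read_parquet("embeddings.parquet")

fig = px.scatter(
    df, x="umap_x", y="umap_y",
    color="lof_score",        # highlight likely anomalies
    hover_data=["path"],      # identify which image a point belongs to
    title="UMAP projection of image embeddings",
)
fig.show()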
The output of pre-selection is a curated dataset of pre-selected images that can
then be sent to human labellers to create a training dataset for downstream tasks like
defect detection. This process means that only a small fraction of images are sent for
labelling instead of the hundreds of thousands of images that arrive daily.
moving objects is essential. This approach is ideal for the task at hand, consider-
ing that the defects on sandpaper are visually identifiable, thus allowing computer
vision to be a viable solution for determining sandpaper quality and classification.
For the scope of this study, we selected five datasets representing five batches of
the same product family, collectively comprising 122,487 images as shown in Figure
3.1. These images, captured in the Bitmap (.bmp) format, originally had dimen-
sions of 1948 × 550 pixels, with file sizes ranging from 890 KB to 920 KB each, and
were recorded in RGB colour.
TABLE 3.1: Dataset Image Count
from PIL import Image
from torchvision import transforms

# Resize to ResNet's expected input, convert to a tensor, and normalize
# with the ImageNet channel statistics
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

img = Image.open(path).convert('RGB')  # path: location of a raw .bmp image
img_t = transform(img)
This pre-processing step prepared the dataset for subsequent unsupervised learn-
ing and defect detection tasks. By standardizing the size, format, and colour distri-
bution of the images, we ensured that the downstream models can focus on the
critical task of identifying potential defects, rather than handling image variety.
such as sandpaper manufacturing, where quick and reliable results are crucial.
ResNet-18 is a deep convolutional neural network consisting of 18 layers, which
include several types of layers: convolutional, batch normalization, ReLU activation,
pooling, and fully connected layers. Initially trained on millions of images from
the ImageNet database, it has learned to extract complex and detailed patterns and
features from images [47].
Opting for a pre-trained model such as ResNet-18 brings several benefits. Firstly,
it capitalizes on the power of transfer learning [47]. This is a process in which knowl-
edge gained while solving one problem is applied to another related problem. In
our context, the general object recognition capabilities learned by the model from
the ImageNet dataset can be reused for sandpaper defect detection. Secondly, using
ResNet-18 saves a substantial amount of time and computational resources, which
would otherwise be needed to train such a deep model from scratch. The reduced
number of layers compared to larger models like ResNet-50 allows for faster infer-
ence time, making ResNet-18 an excellent choice for applications where rapid re-
sponse times are essential.
When an image is passed through the ResNet-18 model, the output of the penul-
timate layer is a 512-dimensional feature vector or an "embedding". This embedding
encapsulates the essential visual features of the input image as understood by the
model.
In Python, using PyTorch, the procedure for calculating image embeddings is
illustrated as follows:
import torch
from torchvision import models

# Load pre-trained ResNet-18 and drop its final classification layer
model = torch.nn.Sequential(*list(models.resnet18(pretrained=True).children())[:-1]).eval()
with torch.no_grad():
    embedding = model(img_t.unsqueeze(0))  # img_t from the pre-processing step
# Convert to 1D array
embedding = torch.flatten(embedding).numpy()
These 512-dimensional embeddings were then utilized as inputs to the LOF al-
gorithm for anomaly detection and were also reduced to a lower dimensionality for
visualization using UMAP.
This approach using a pre-trained ResNet-18 model as a feature extractor pro-
vides an efficient way to convert raw images into a form suitable for machine learn-
ing tasks. It allows us to benefit from the potent feature learning capability of deep
neural networks while avoiding the need for extensive re-training of the model.
Calculating image embeddings from the ResNet-18 model was executed without
major challenges. It is worth noting that the model’s ability to handle diverse and
complex image content makes it a reliable tool for extracting relevant features from
our sandpaper images, potentially leading to more accurate anomaly detection.
from sklearn.preprocessing import MinMaxScaler

# Initialize MinMaxScaler to rescale LOF scores to the [0, 1] range
scaler = MinMaxScaler()
LOF scores typically range from 1.0 (indicating a data point closely resembling its
neighbours) to an upper limit determined by the level of deviation from the neigh-
bouring points. We normalized these scores using the MinMaxScaler from sklearn
[48] to bring them onto a comparable scale, which helps interpret the scores more
intuitively.
manufacturing, occurring in less than 0.05% of the products. Therefore, we set the
contamination parameter to 0.05 to reflect this imbalance. It is noteworthy that,
although not all anomalies are defects, as a general guideline we used the estimated
defect rate from the manufacturer as the value for the contamination parameter.
Tuning these hyperparameters effectively allowed us to calibrate the sensitivity
of the LOF algorithm, ensuring it is well-suited to the task of detecting rare defects
in the context of sandpaper manufacturing. These adjustments, in conjunction with
the ResNet-18 derived image embeddings, provided a comprehensive approach to
unsupervised anomaly detection within the dataset.
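A minimal sketch of this step, assuming scikit-learn's LocalOutlierFactor with its default n_neighbors of 20 (the exact value used is not reported here) and the contamination level discussed above; the input file name is illustrative:

import numpy as np
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import MinMaxScaler

embeddings = np.load("embeddings.npy")  # assumed (N, 512) array of ResNet features

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(embeddings)        # -1 for anomalies, 1 for inliers
raw_scores = -lof.negative_outlier_factor_  # higher means more anomalous

# Rescale the scores to [0, 1] for easier interpretation and thresholding
scores = MinMaxScaler().fit_transform(raw_scores.reshape(-1, 1)).ravel()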
The min_dist parameter controls how tightly UMAP is allowed to pack points
together: a larger value forces points to spread out, while a smaller value allows
them to cluster more tightly. Based on the distribution and the density of our data,
a min_dist of 0.1 was found to be appropriate.
The number of components corresponds to the number of dimensions in the low-
dimensional space we are mapping to. For visualization purposes, this was set to 2.
These UMAP hyperparameters were carefully tuned to optimize the quality of
the low-dimensional representations of our high-dimensional image embeddings.
The two-dimensional embeddings generated from the sandpaper images offered sig-
nificant insights into their distribution, which included both regular and anomalous
instances. These insights greatly facilitated the overarching aim of potential defect
detection.
1. Finding local anomalies: Using the thresholding method outlined above, local
anomalies were segregated and tagged as defective.
2. Missed Defect Fraction (MDF): The Missed Defect Fraction quantifies the pro-
portion of defective images missed during pre-selection but later recognized
by the downstream defect detection model. This can be computed as:

MDF = (Number of defective images found among the residual images) / (Total number of residual images)
Here, the residual images are those that were not pre-selected. A lower MDF
denotes that the pre-selection process captured the majority of the defective
images, whereas a higher MDF points towards potential gaps in the pre-selection
process, necessitating a more robust method.
These evaluation metrics, thus, offer significant insight into the pre-selection pro-
cess, its strengths, and areas for potential improvement.
Chapter 4
Experiments and Results
This chapter presents the experiments conducted and the results obtained to evaluate
the performance of our proposed framework. All experiments for the image
pre-selection workflow were conducted on the Databricks platform, with the
following specifications:
• Runtime: 12.2.x-gpu-ml-scala2.12.
[FIGURE 4.1: ResNet-18 embedding computation time per image (seconds) for Batches A–E; values range from 0.0263 to 0.0338 seconds.]
[FIGURE 4.2: Total ResNet-18 computation time (seconds) for Batches A–E; values range from 358.2 to 1,273.8 seconds.]
Per-image ResNet computation time was about 0.033 seconds for Batch C. The total
computational time appears to increase with larger batch size, as seen in Figure 4.2.
Given this architecture, more complex images with intricate patterns might take
longer to process, as they require more computational resources to accurately
capture their intricacies.
[FIGURE 4.3: UMAP computation time per image (seconds) for Batches A–E; values range from 0.000917 to 0.00377 seconds.]
Figure 4.3 shows that the UMAP performance exhibited a clear downward trend
in per-image computation time as the batch size increased. The per-image time for
UMAP varies more noticeably across batches, ranging from approximately 0.0009
to 0.0037 seconds. This indicates that UMAP may have a significant constant-time
component in its computation. This fixed cost gets amortized over a larger number
of images, leading to a lower per-image computation time for larger batches. This
means UMAP benefits from a larger batch size. However, similar to ResNet, the total
computation time increases with the size of the batch as shown in Figure 4.4.
[FIGURE 4.4: Total UMAP computation time (seconds) for Batches A–E; values range from 35.48 to 82.2 seconds.]
[FIGURE 4.5: LOF computation time per image (seconds) for Batches A–E; values up to 0.00107 seconds.]
LOF showed an upward trend in the per-image computation time as the batch
size increased as seen in Figure 4.5, ranging from approximately 0.0002 seconds to
0.0011 seconds, with Batch A being the most efficient (about 0.0002 seconds per im-
age) and Batch D the least efficient (about 0.0011 seconds per image). This trend
might be because LOF, being a density-based outlier detection method, has to com-
pute the local density for each data point, a task that may become increasingly com-
plex as the number of data points (images) increases. Despite this increase, the LOF
times are still significantly lower than those of ResNet and UMAP, which suggests
that LOF is quite efficient on a per-image basis.
[FIGURE: Total LOF computation time (seconds) for Batches A–E; values range from 3.6 to 51.94 seconds.]
The computed embeddings, anomaly scores, and UMAP coordinates were persisted to a
Parquet file, ready to be served for subsequent exploratory analysis and visualiza-
tion tasks. This approach allows us to capitalize on the time spent computing these
values by preserving the results for reuse, hence maximizing our computational ef-
ficiency.
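A sketch of this caching step, reusing the hypothetical names from the earlier sketches; the column names and file path are illustrative:

import pandas as pd

results = pd.DataFrame({
    "path": image_paths,      # one row per image
    "lof_score": scores,      # normalized LOF scores
    "umap_x": coords[:, 0],   # 2-D UMAP coordinates
    "umap_y": coords[:, 1],
})
results.to_parquet("embeddings.parquet")         # persist once
results = pd.read_parquet("embeddings.parquet")  # reuse in later sessions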
FIGURE 4.9: (a) Images associated with patterns in the dataset from
Figure 4.8b. (b) Images showing global anomalies in the dataset
from Figure 4.8b. These figures provide visualization of patterns and
anomalies identified within the specified UMAP embeddings.
FIGURE 4.10: (a) Batch A with LOF threshold = 1. (b) Batch A with
LOF threshold = 0.97. (c) Batch C with LOF threshold = 1. (d) Batch
C with LOF threshold = 0.97. The figure presents the LOF scores
visualization for the specified batches and thresholds.
[FIGURE: panels (A) Batch A and (B) Batch C.]
[FIGURE: panels (A) Batch A, (B) Batch B, (C) Batch D, and (D) Batch E.]
Parameter   Value
epochs      100
batch       20
imgsz       1952
cache       ram
rect        true
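The following sketch shows how such a training run could look with the Ultralytics YOLOv8 API, using the parameters from the table above; the checkpoint variant and dataset YAML path are our assumptions:

from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # assumed pre-trained checkpoint variant
model.train(
    data="sandpaper.yaml",  # hypothetical dataset configuration
    epochs=100,
    batch=20,
    imgsz=1952,
    cache="ram",  # cache images in RAM to speed up epochs
    rect=True,    # rectangular batches suit the wide 1948 x 550 images
)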
The YOLOv8 model achieved an accuracy of 95.5% as seen in Figure 4.14. This
indicates that the model correctly identified 95.5% of the validation data, either by
correctly identifying defective images as defective, or by correctly identifying nor-
mal images as normal. This high accuracy demonstrates the effectiveness of our pre-
selection process and the ability to train a successful downstream model for defect
detection. However, our focus is on the effectiveness of the pre-selection workflow,
which is evaluated in what follows.
The misclassification rate was notably higher for defects, likely due to the com-
plexities and variations in defect manifestations as well as the limitations of the
thresholding strategy. The results can be presented in the form of a confusion matrix
as shown in Table 4.2.
TABLE 4.2: Confusion Matrix of the Pre-selection Result

                          Actual
Pre-selected     Defect     Non-Defect
Defect           1218       72
Non-Defect       2          1288
PAcc = (1218 + 1288) / (1290 + 1290) = 0.9713    (4.2)
This demonstrates a high degree of accuracy (approximately 97.1%) in our pre-
selection process, indicating its efficacy in correctly identifying defective and non-
defective samples.
Given the absence of these labels, the downstream defect detection model provides
tentative labels which can be used for evaluation purposes.
To gauge the effectiveness of the pre-selection process, it is necessary to examine
the proportion of defects missed. This involves executing the downstream defect de-
tection model on the pool of non-selected images and analyzing the resulting output.
This method, however, hinges on the assumption that the defect detection model is
accurate and reliable in identifying defects in images. The fraction of defects missed
by the pre-selection process is referred to as the ’Missed Defect Fraction’ (MDF) and
serves as a crucial metric for assessing the performance of the pre-selection process.
The calculation of MDF can be represented as:
MDF = 1513 / 119907 ≈ 0.0126    (4.4)
This result implies that around 1.26% of defective images were overlooked dur-
ing the pre-selection process but were later identified by the downstream inference
model.
The MDF result is summarized in Table 4.3:
TABLE 4.3: Calculation of Missed Defect Fraction (MDF)

Parameter                                      Value
Number of defects detected during inference    1,513
Total number of residual images                119,907
Missed Defect Fraction (MDF)                   0.0126
Chapter 5
Discussion
Chapter 6
Conclusions
We believe these directions could open up new ways to refine the image selection
process for downstream tasks, facilitating more efficient and accurate quality control
in manufacturing industries.
Bibliography
[1] Fukuo Hashimoto et al. "Abrasive fine-finishing technology". In: CIRP Annals 65.2 (2016), pp. 597–620. ISSN: 0007-8506. URL: https://www.sciencedirect.com/science/article/pii/S0007850616301950.
[2] Tian Wang et al. "A fast and robust convolutional neural network-based defect detection model in product quality control". In: The International Journal of Advanced Manufacturing Technology 94 (Feb. 2018). DOI: 10.1007/s00170-017-0882-0.
[3] Michela Prunella et al. "Deep Learning for Automatic Vision-Based Recognition of Industrial Surface Defects: A Survey". In: IEEE Access 11 (2023), pp. 43370–43423. DOI: 10.1109/ACCESS.2023.3271748.
[4] JK Park, BK Kwon, JH Park, et al. "Machine learning-based imaging system for surface defect inspection". In: International Journal of Precision Engineering and Manufacturing-Green Technology 3.3 (July 2016), pp. 303–310. DOI: 10.1007/s40684-016-0039-x.
[5] Haibo He and Edwardo A. Garcia. "Learning from Imbalanced Data". In: IEEE Transactions on Knowledge and Data Engineering 21.9 (2009), pp. 1263–1284. DOI: 10.1109/TKDE.2008.239.
[6] N. V. Chawla et al. "SMOTE: Synthetic Minority Over-sampling Technique". In: Journal of Artificial Intelligence Research (2002). URL: https://doi.org/10.1613/jair.953.
[7] Jinlei Hou et al. "Divide-and-Assemble: Learning Block-wise Memory for Unsupervised Anomaly Detection". In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV). 2021, pp. 8771–8780. DOI: 10.1109/ICCV48922.2021.00867.
[8] Xian Tao et al. "Unsupervised Anomaly Detection for Surface Defects With Dual-Siamese Network". In: IEEE Transactions on Industrial Informatics 18.11 (2022), pp. 7707–7717. DOI: 10.1109/TII.2022.3142326.
[9] M. Zaheer et al. "Generative Cooperative Learning for Unsupervised Video Anomaly Detection". In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, 2022, pp. 14724–14734. URL: https://doi.ieeecomputersociety.org/10.1109/CVPR52688.2022.01433.
[10] Samet Akcay et al. "Anomalib: A Deep Learning Library for Anomaly Detection". In: 2022 IEEE International Conference on Image Processing (ICIP). 2022, pp. 1706–1710. DOI: 10.1109/ICIP46576.2022.9897283.
[11] R. Strudel et al. "Segmenter: Transformer for Semantic Segmentation". In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE Computer Society, 2021, pp. 7242–7252. URL: https://doi.ieeecomputersociety.org/10.1109/ICCV48922.2021.00717.
[12] Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, and Lei Li. "On the Sentence Embeddings from Pre-trained Language Models". In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2020, pp. 9119–9130. URL: https://aclanthology.org/2020.emnlp-main.733.
[13] Jize Cao et al. "Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models". In: Computer Vision – ECCV 2020. Cham: Springer International Publishing, 2020, pp. 565–580. URL: https://doi.org/10.1007/978-3-030-58539-6_34.
[14] Nasir Mohammad Khalid et al. "CLIP-Mesh: Generating Textured Meshes from Text Using Pretrained Image-Text Models". In: SIGGRAPH Asia 2022 Conference Papers. Association for Computing Machinery, 2022. ISBN: 9781450394703. DOI: 10.1145/3550469.3555392. URL: https://doi.org/10.1145/3550469.3555392.
[15] Domen Tabernik et al. "Segmentation-based deep-learning approach for surface-defect detection". In: Journal of Intelligent Manufacturing (2019). URL: https://dx.doi.org/10.1007/s10845-019-01476-x.
[16] Tang Tang et al. "Anomaly Detection Neural Network with Dual Auto-Encoders GAN and Its Industrial Inspection Applications". In: Sensors 20.12 (2020), p. 3336. URL: https://www.mdpi.com/1424-8220/20/12/3336.
[17] Jungsuk Kim et al. "Printed Circuit Board Defect Detection Using Deep Learning via A Skip-Connected Convolutional Autoencoder". In: Sensors 21.15 (2021), p. 4968. URL: https://www.mdpi.com/1424-8220/21/15/4968.
[18] Liang Xu et al. "A Weakly Supervised Surface Defect Detection Based on Convolutional Neural Network". In: IEEE Access 8 (2020), pp. 44200–44212. DOI: 10.1109/ACCESS.2020.2977821.
[19] Paul Bergmann et al. "The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection". In: International Journal of Computer Vision (2021). URL: https://dx.doi.org/10.1007/s11263-020-01400-4.
[20] Yuan-Hong Liao, Amlan Kar, and Sanja Fidler. "Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets". In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). June 2021, pp. 4348–4357. DOI: 10.1109/CVPR46437.2021.00433.
[21] Lihai Nie, Laiping Zhao, and Keqiu Li. "Glad: Global And Local Anomaly Detection". In: 2020 IEEE International Conference on Multimedia and Expo (ICME). 2020, pp. 1–6. DOI: 10.1109/ICME46284.2020.9102818.
[22] Chun-Liang Li et al. "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization". In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021. URL: https://dx.doi.org/10.1109/CVPR46437.2021.00954.
[23] Eric Wu et al. "Conditional Infilling GANs for Data Augmentation in Mammogram Classification". In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2018. 2018. URL: https://dx.doi.org/10.1007/978-3-030-00946-5_11.
[24] Ben Sorscher et al. "Beyond neural scaling laws: beating power law scaling via data pruning". In: Advances in Neural Information Processing Systems. Ed. by S. Koyejo et al. Vol. 35. Curran Associates, Inc., 2022, pp. 19523–19536. URL: https://proceedings.neurips.cc/paper_files/paper/2022/file/7b75da9b61eda40fa35453ee5d077df6-Paper-Conference.pdf.
[25] Angelos Katharopoulos and Francois Fleuret. "Not All Samples Are Created Equal: Deep Learning with Importance Sampling". In: (Mar. 2018).
[26] J. Yosinski et al. "How transferable are features in deep neural networks?" In: Advances in Neural Information Processing Systems (2014), pp. 3320–3328.
[27] C. Tan et al. "A survey on deep transfer learning". In: International Conference on Artificial Neural Networks. Springer, 2018, pp. 270–279.
[28] K. He et al. "Deep residual learning for image recognition". In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, pp. 770–778.
[29] O. Russakovsky et al. "ImageNet large scale visual recognition challenge". In: International Journal of Computer Vision 115.3 (2015), pp. 211–252.
[30] M.D. Zeiler and R. Fergus. "Visualizing and understanding convolutional networks". In: European Conference on Computer Vision. Springer, 2014, pp. 818–833.
[31] A. Gordo et al. "Deep image retrieval: Learning global representations for image search". In: Proceedings of the European Conference on Computer Vision (ECCV). 2016, pp. 241–257.
[32] L.v.d. Maaten and G. Hinton. "Visualizing data using t-SNE". In: Journal of Machine Learning Research 9.Nov (2008), pp. 2579–2605.
[33] Leland McInnes, John Healy, and James Melville. "UMAP: Uniform manifold approximation and projection for dimension reduction". In: arXiv preprint arXiv:1802.03426 (2018). URL: https://arxiv.org/abs/1802.03426.
[34] R. Chalapathy, A.K. Menon, and S. Chawla. "Deep learning for anomaly detection: A survey". In: arXiv preprint arXiv:1901.03407 (2019).
[35] Yufei Liu, Jing Zhou, and Kevin P White. "A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data". In: Frontiers in Genetics (2021). URL: https://www.frontiersin.org/articles/10.3389/fgene.2021.646936/full.
[36] Rui Yamaguchi et al. "Dimensionality reduction and visualization of genomic data using UMAP". In: Nature Communications (2020). URL: https://www.nature.com/articles/s41467-020-15194-z.
[37] Jun-Yan Zhu et al. "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks". In: Proceedings of the IEEE International Conference on Computer Vision. 2017, pp. 2223–2232. URL: https://openaccess.thecvf.com/content_ICCV_2017/html/Jun-Yan_Zhu_Unpaired_Image-To-Image_Translation_ICCV_2017_paper.html.
[38] Xun Wang et al. "Ranked List Loss for Deep Metric Learning". In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019, pp. 5207–5216. URL: https://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Ranked_List_Loss_for_Deep_Metric_Learning_CVPR_2019_paper.html.
[39] M. M. Breunig et al. "LOF: Identifying Density-Based Local Outliers". In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (2000), pp. 93–104.