Fabric Anomaly Detection Automation Process
Simon Thomine
University of Technology of Troyes / AQUILAE, Troyes, France
[email protected]

Hichem Snoussi
University of Technology of Troyes, Troyes, France
[email protected]

Mahmoud Soua
AQUILAE, Troyes, France
[email protected]
Abstract—Unsupervised anomaly detection in industry has been a concerning topic and a stepping stone for high-performance industrial automation processes. The vast majority of industry-oriented methods focus on learning from good samples to detect anomalies, yet some industrial scenarios require even less specific training and therefore a capacity to generalize anomaly detection. The obvious use case is fabric anomaly detection, where a really wide range of colors and types of textile must be handled and a stoppage of the production line for training cannot be considered. In this paper, we propose an automation process for industrial fabric texture defect detection with a specificity-learning process running during the domain-generalized anomaly detection. Combining the ability to generalize with this learning process offers fast and precise anomaly detection and segmentation. The main contributions of this paper are the following: a domain-generalized texture anomaly detection method achieving state-of-the-art performance, a fast specific training on good samples extracted by the proposed method, a self-evaluation method based on custom defect creation, and an automatic detection of already-seen fabric to prevent re-training.

Index Terms—Domain generalization, unsupervised, anomaly, unseen, knowledge distillation, student-teacher, memory banks, fabric, automation.

I. INTRODUCTION

Unsupervised anomaly detection in industry is a vast topic, since there are a lot of possible applications. In this paper, we focus on fabric anomaly detection, which is a concerning topic for industry. The specificity of fabric is the pattern in its structure: if we manage to understand that pattern, we can extract anomalies. Several methods have been introduced for industrial anomaly detection using MVTEC AD [1], the dataset that gathers textures (carpet, leather, grid, wood, and tile) and objects (bottle, cable, capsule, hazelnut, metal nut, pill, screw, toothbrush, transistor, and zipper). These methods achieve high performance; however, they rely on object/texture-specific unsupervised learning without generalization capacity. Recently, knowledge-distillation-based methods have been introduced for the unsupervised anomaly detection task [2]. They consist of a student-teacher model focusing on the bottom layers of the network, as these layers carry edge, color, and shape information. We use the same approach to design a domain-generalized texture anomaly detection method able to detect defects on unseen textures and to select good samples for a texture-specific unsupervised anomaly detection model. In the fabric industry, many types and colors of fabric are analyzed, and it would be impossible to rely on a specific training on good samples for each type of fabric without slowing the industrial process.

Therefore, we propose a complete data processing chain for robust, fast, and adaptive texture-specific anomaly detection and localization. Our method is based on four main modules: a domain-generalized texture anomaly detector, a fast texture-specific training/inference, an auto-evaluation process of the specific model, and an automatic already-seen fabric detection that avoids retraining an existing model.

The paper is organized as follows. In Section II, we review the related work, especially on the MVTEC dataset, and present the different approaches proposed in the literature for domain-generalized and classic unsupervised anomaly detection. In Section III, we present an enhanced domain-generalized texture defect detection method. In Section IV, we present the specific learning method, the auto-evaluation process, and the already-seen texture recognition. Section V is dedicated to the analysis of the results. Section VI concludes the paper.

II. RELATED WORKS

As our proposed method addresses two specific tasks, we first present the state of the art on domain-generalized texture anomaly detection, followed by the state of the art on unsupervised defect detection of known objects.

A. Domain-generalized texture anomaly detection

Domain-generalized anomaly detection is an important topic for an optimal industrial process, since in specific industrial fields the type of texture often changes. The most obvious example is certainly fabric anomaly detection, where fabric can have different colors (red, blue, striped) and types (cotton, polyester, silk, etc.). The main objective is to detect defects on any type of fabric without resorting to a time-consuming training. Feature extraction from a pretrained classifier offers the most promising results, with approaches such as episodic training [3], the use of extrinsic and intrinsic aspects [4], and a multiscale feature extractor with co-attention modules [5].
B. Unsupervised anomaly detection on known objects

More commonly, unsupervised anomaly detection deals with the problem of detecting defects on an object or texture based on good samples only. In industry or security scenarios, we often have a low rate of defects spread over a vast number of different defect types, which would lead to a time-consuming annotation and a possibly non-pertinent classification if all the anomaly types are not considered [6]. To tackle this question, several methods emerged proposing different types of algorithms, such as autoencoders [7] and variational autoencoder variants [8] [9]. Another common way of detecting anomalies is Generative Adversarial Networks (GAN), introduced by [10] and adapted to unsupervised anomaly detection in AnoGAN [11], G2D [12], and OCR-GAN [13]. More recently, approaches using a pretrained classifier have been at the heart of research in industrial anomaly detection and offer outstanding performance. There are three main feature-extraction-based approaches: normalizing flow, knowledge distillation, and memory banks. The normalizing flow approach consists of training a flow on relevant features of good samples extracted from a pretrained network such as AlexNet [14], ResNet [15], or EfficientNet [16] trained on ImageNet. Different strategies were used to enhance performance, such as a 2D flow [17] or a cross-scale flow [18]. Another interesting approach is the use of a memory bank to store relevant information from different good samples and to compare against this memory bank to detect whether there is an anomaly [19]. Finally, the concept of knowledge distillation was adapted for unsupervised anomaly detection and localization [2]. The idea is to train a student network on good samples to reproduce the output features of a teacher already pretrained for classification. The student will be able to reproduce teacher features on a good sample, but will not be as precise on a defective sample. Several methods used this principle with different strategies, such as multi-layer feature selection [2], an asymmetric student-teacher [20], coupled-hypersphere-based feature adaptation [21], and a mixed-teacher approach [22].

III. KNOWLEDGE DISTILLATION GENERALIZATION

The proposed model is based on the knowledge distillation framework, where a pretrained network is used as a teacher and a student network is trained to reproduce the teacher output on good samples. The student network is then expected not to be able to reproduce teacher features on defective samples, a property which is used to detect abnormal samples. For domain generalization, we propose to train the student on different types of textures and to use several teachers to guarantee generalization. To achieve this objective, we first constitute a new dataset based on the fabrics dataset [23], which regroups different categories of textures of varying quality and homogeneity.

Fig. 1: Samples employed for the custom fabric dataset (extracted from the fabrics dataset [23]).

Then, to tackle the problem of texture domain generalization, we use a specific student-teacher architecture with different branches, based on the paradigm that each pretrained classifier has a different bias towards classification. In terms of layer selection, the deeper a layer, the more its information relates to context; conversely, the shallower a layer, the more information it contains on contours, edges, and colors. Based on different layer configurations, we show that for the purpose of texture domain generalization, mid-level features are the best choice: they combine texture-specific information such as contours and edges with a general vision of what a texture is.

At least two classifiers are needed to attenuate each bias. We use ResNet18 and EfficientNet-B0 for their computation speed and meaningful features. To fully exploit the information of each classifier, we use a parallel architecture, which can be seen as a multiple-teachers/multiple-students architecture where training happens independently for each classifier; only the anomaly score is calculated from the outputs of the two networks. Our framework is an adaptation of MixedTeacher [22] with a different layer selection strategy. The first ResNet layer is not used, as its output features are too specific to the training dataset textures. We use the features of the first three residual blocks of ResNet18 and the last two convolutional blocks of EfficientNet-B0. As in [22], we use a reduced version of the ResNet18 model, with a reduction of the block size and a reduction of the dimension of each layer through adaptive average pooling, while we keep the same architecture for the EfficientNet part.

Fig. 2: Architecture of the ResNet student-teacher (left) and the EfficientNet student-teacher (right).

Given a training dataset of images without anomaly $D = [I_1, I_2, ..., I_n]$, our goal is to extract the information of $L$ mid-level layers. For an image $I_k \in \mathbb{R}^{w \times h \times c}$, where $w$ is the width, $h$ the height, and $c$ the number of channels, the teacher outputs features $F_t^l(I_k) \in \mathbb{R}^{w_l \times h_l \times c_l}$, while the student outputs $F_s^l(I_k) \in \mathbb{R}^{w_l/2 \times h_l/2 \times c_l/2}$ for $l > 1$ and $F_s^l(I_k) \in \mathbb{R}^{w_l \times h_l \times c_l}$ for $l = 1$. The loss is obtained by applying the $\ell_2$ distance to the normalized feature vectors at each pixel of the feature map and summing them. For the ResNet student part, we apply an adaptive average pooling layer to the teacher features. The used layers are $l = \{1, 2, 3\}$ for the ResNet part and $l = \{5, 6\}$ for the EfficientNet part.
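To make this layer selection concrete, the following is a minimal PyTorch sketch of the two frozen teachers, tapping the first three residual stages of torchvision's ResNet18 and blocks 5 and 6 of EfficientNet-B0. The class name and the exact slicing points are our assumptions for illustration; the paper does not publish its implementation.

```python
import torch
import torch.nn as nn
from torchvision import models

class TeacherFeatures(nn.Module):
    """Frozen pretrained teachers returning the mid-level feature maps used
    for distillation: ResNet18 residual stages 1-3 and EfficientNet-B0
    feature blocks 5-6 (our reading of the paper's layer choice)."""
    def __init__(self):
        super().__init__()
        resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        effnet = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)
        # Stem plus the first three residual stages; layer4 is skipped as
        # its features are too context-specific for texture generalization.
        self.res_stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
        self.res_blocks = nn.ModuleList([resnet.layer1, resnet.layer2, resnet.layer3])
        # EfficientNet-B0 backbone split so blocks 5 and 6 can be tapped.
        self.eff_pre = effnet.features[:5]
        self.eff_blocks = nn.ModuleList([effnet.features[5], effnet.features[6]])
        for p in self.parameters():
            p.requires_grad_(False)   # teachers stay frozen

    @torch.no_grad()
    def forward(self, x):
        res_feats, eff_feats = [], []
        h = self.res_stem(x)
        for block in self.res_blocks:
            h = block(h)
            res_feats.append(h)       # F_t^l for l = 1, 2, 3
        g = self.eff_pre(x)
        for block in self.eff_blocks:
            g = block(g)
            eff_feats.append(g)       # F_t^l for l = 5, 6
        return res_feats, eff_feats
```

The student branches would mirror these output shapes, with the reduced ResNet student halving the dimensions for $l > 1$ as described above.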
Pixel loss for the ResNet part is defined in Eq. (1):

$$loss_l(I_k)_{ij} = \frac{1}{2}\left\| norm(AAP(F_{ResNet18}^{l}(I_k))_{ij}) - norm(F_s^{l}(I_k)_{ij}) \right\|_2^2 \quad (1)$$

where AAP refers to Adaptive Average Pooling. For the EfficientNet part, the pixel loss is defined in Eq. (2):

$$loss_l(I_k)_{ij} = \frac{1}{2}\left\| norm(F_{EffNetb0}^{l}(I_k)_{ij}) - norm(F_s^{l}(I_k)_{ij}) \right\|_2^2 \quad (2)$$

For the layer $l$, the loss is defined as:

$$loss_l(I_k) = \frac{1}{w_l h_l} \sum_{i=1}^{w_l} \sum_{j=1}^{h_l} loss_l(I_k)_{ij} \quad (3)$$

and finally, the total loss is written as:

$$loss(I_k) = \sum_{l} loss_l(I_k) \quad (4)$$
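A PyTorch sketch of Eqs. (1)-(4) follows. The placement of the adaptive average pooling, shrinking the ResNet teacher map to the reduced student's channel and spatial sizes, is our interpretation of Eq. (1); the feature lists are assumed to come from paired teacher/student forward passes as in the previous sketch.

```python
import torch
import torch.nn.functional as F

def layer_loss(t_feat, s_feat):
    """Eqs. (1)-(3): half squared L2 distance between the L2-normalized
    feature vectors at each pixel, averaged over the w_l x h_l map."""
    t = F.normalize(t_feat, dim=1)   # unit-norm channel vector per pixel
    s = F.normalize(s_feat, dim=1)
    return 0.5 * (t - s).pow(2).sum(dim=1).mean(dim=(1, 2))   # shape (B,)

def pool_teacher(t_feat, s_feat):
    """'AAP' in Eq. (1): adaptive average pooling of the teacher map down
    to the reduced student's (c, h, w) size. Pooling jointly over channel
    and spatial dimensions is our assumption; it also covers the l = 1
    case, where the student keeps the full teacher dimensions."""
    _, c, h, w = s_feat.shape
    return F.adaptive_avg_pool3d(t_feat.unsqueeze(1), (c, h, w)).squeeze(1)

def total_loss(res_t, res_s, eff_t, eff_s):
    """Eq. (4): sum of per-layer losses over ResNet layers {1, 2, 3} and
    EfficientNet blocks {5, 6}, then mean over the batch."""
    loss = 0.0
    for t, s in zip(res_t, res_s):   # ResNet branch: teacher pooled first
        loss = loss + layer_loss(pool_teacher(t, s), s)
    for t, s in zip(eff_t, eff_s):   # EfficientNet branch: matching sizes
        loss = loss + layer_loss(t, s)
    return loss.mean()
```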
IV. AUTO-LEARNING PROCESS FOR INDUSTRIAL DEPLOYMENT

The previous part was presented in the context of industrial efficiency, where retraining for every new type or color of texture/fabric is not allowed. The objective of this section is to propose a general classifier that handles the anomaly detection role while we gather enough images and train a specific model for increased efficiency. This section is divided into three parts: training and self-evaluation, recognition of an already-trained type of fabric, and a typical industrial use case in the fabric industry.
A. Training and self-evaluation

Given the deployment constraints, we considered different criteria for the choice of the student-teacher network architecture: (i) the inference and training time, (ii) the performance, and (iii) the robustness to defective samples in the training set. We also considered the possibility of running the process on several asynchronous defect detectors. The Reduced Student model proposed in [22] is a good candidate: thanks to its reduced architecture, we can train a specific model in an acceptable time. To minimize the number of potential defective samples in the training set, we gathered the samples with an acceptable anomaly score from the domain-generalized model, i.e., samples classified as good samples. Based on a test-error approach, we determined the optimal number of epochs (during specific training) at which the specific model becomes better than the domain-generalized one, so that we can start using the best model even if the complete training is not finished.

The self-evaluation part is based on two types of data: (i) defective samples detected by the domain-generalized anomaly detector and (ii) data generated with a procedure inspired by DRAEM [9], using Perlin noise and the DTD texture database [24]. We used the same approach to generate non-absurd defects to self-evaluate our model.

Fig. 3: Custom defective samples generated with Perlin noise.
finished. (7)
The self-evaluation part is based on two types of data: (i) The proximity score is :
the first type is defective samples detected by the domain-
generalized anomaly detector and (ii) the second type is proxScore(S, M odel) = abs(proxsm (S, M odel)−proxic (M odel))
generated data with a procedure inspired by DRAEM [9]: (8)
Perin noise and the texture database dtd [24]. We used the And the already-seen decision is described as :
same approach to generate non-absurd defects to self-evaluate
our model. max(proxScore(S, M odeli )) > similarityT hreshold (9)
i∈N
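In PyTorch, Eqs. (5)-(9) can be sketched as follows, with `feat_s` a feature vector of the new sample and one `(x, d)` feature bank per stored model. The tensor shapes are our assumptions; in the paper, the sample features are re-extracted with each specific model, and the intra-class term can of course be precomputed once per model.

```python
import torch
import torch.nn.functional as F

def sample_model_proximity(feat_s, bank):
    """Eq. (6): mean cosine similarity between the sample feature (d,)
    and the x coreset features (x, d) stored for one specific model."""
    return F.cosine_similarity(feat_s.unsqueeze(0), bank, dim=1).mean()

def intra_class_proximity(bank):
    """Eq. (7): mean pairwise cosine similarity inside one feature bank
    (cacheable once per model)."""
    x = bank.shape[0]
    normed = F.normalize(bank, dim=1)
    sims = normed @ normed.t()                  # (x, x) cosine similarities
    return (sims.sum() - sims.diagonal().sum()) / (x * (x - 1))

def already_seen(feat_s, feature_banks, threshold):
    """Eqs. (8)-(9): proximity score per stored model, then threshold the
    maximum score, following the decision rule as stated in Eq. (9)."""
    scores = torch.stack([
        (sample_model_proximity(feat_s, bank) - intra_class_proximity(bank)).abs()
        for bank in feature_banks               # one (x, d) tensor per model
    ])
    best = int(scores.argmax())
    return bool(scores[best] > threshold), best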
C. Typical use case: fabric industry

To demonstrate the effectiveness of our method, we describe a typical real defect analysis use case. In a vast majority of the mid-range clothing industry, the fabric is analyzed several times during the whole fabrication process by operators who scroll the fabric and look for defects. This is a laborious job, and distraction often occurs, resulting in a globally low defect detection percentage, not to mention the difficulty for the eyes of inspecting certain fabric categories such as striped fabrics. Our automated process aims at assisting the operator in the classification task and at speeding up the scrolling of the fabric. The operator is still needed, since he has to install the fabric roll on the machine and to verify the defect classification done by the domain-generalized model, whose accuracy is still too low for a fully automated process.

For every fabric roll, the process starts with an identification of the fabric to control. Two different cases may happen:

- If this type of fabric has never been seen, a specific training is started while the anomaly detection is still performed with the domain-generalized model; we may have to slow the scrolling of the fabric during the training, depending on the computational power. When the trained model becomes better than the domain-generalized model, we use it instead, even if the training is not completely finished. When the training is finished, we use the completely trained model for anomaly detection while keeping some features of defective samples.
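A sketch of this per-roll control flow follows, assuming the already-seen branch simply loads the stored specific model via the recognition step of Section IV-B. Every helper used here (`extract_features`, `start_specific_training`, `report`, the `model_bank` and `trainer` objects) is hypothetical glue for illustration, not an API from the paper.

```python
def process_roll(roll_stream, dg_model, model_bank, threshold):
    """Per-roll control flow sketched from Section IV-C; all helper names
    are hypothetical."""
    feat = extract_features(roll_stream.first_frames())
    seen, idx = already_seen(feat, model_bank.feature_banks, threshold)
    detector = model_bank.load(idx) if seen else dg_model
    # Unseen fabric: launch an asynchronous specific training (Section IV-A).
    trainer = None if seen else start_specific_training(roll_stream)
    for frame in roll_stream:
        report(detector.anomaly_score(frame))
        # Swap in the specific model once it beats the domain-generalized
        # one (epoch threshold estimated offline), before training ends.
        if trainer is not None and trainer.better_than_dg():
            detector = trainer.current_model()
    if trainer is not None:
        model_bank.save(trainer.final_model())
```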
V. EXPERIMENTS

This section is divided into three parts: the analysis of the domain-generalization model compared with the state of the art for different training configurations, the analysis of the training and inference speed of our model, and finally the estimation of the number of epochs required for a specific training to outperform the domain-generalization algorithm.

A. State-of-the-art comparison

For the evaluation of our model, we used two different databases for training. For the "MVTEC" configuration, we trained the domain-generalized (DG) model on all good samples of the MVTEC AD textures except the one we are testing on, to reproduce the evaluation protocol of the other state-of-the-art papers. The "cotton" configuration is trained on the custom fabric dataset presented in Section III, which was created for fabric anomalies; this explains the state-of-the-art performances on carpet and leather. The results are presented in Table I.

For the training, we used stochastic gradient descent with a learning rate of 0.4 for 200 epochs and a batch size of 16. Both networks are pretrained on ImageNet. We resized all the images to 256x256, keeping 80% for training and 20% for validation, and kept the checkpoint with the lowest validation loss.
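The training configuration above can be sketched as follows, reusing the teacher/student modules and `total_loss` from the earlier sketches. The optimizer, learning rate, epoch count, batch size, image size, split, and checkpointing rule are as stated in the paper; `GoodSamplesDataset` and `evaluate` are hypothetical placeholders.

```python
import torch
from torch.utils.data import DataLoader, random_split

EPOCHS, BATCH, LR, SIZE = 200, 16, 0.4, 256   # settings from Section V-A

dataset = GoodSamplesDataset(root="fabrics", size=SIZE)   # hypothetical
n_train = int(0.8 * len(dataset))                          # 80/20 split
train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])
opt = torch.optim.SGD(student.parameters(), lr=LR)

best_val = float("inf")
for epoch in range(EPOCHS):
    for batch in DataLoader(train_set, batch_size=BATCH, shuffle=True):
        res_t, eff_t = teacher(batch)          # frozen teachers
        res_s, eff_s = student(batch)          # assumed to mirror the teachers
        loss = total_loss(res_t, res_s, eff_t, eff_s)
        opt.zero_grad()
        loss.backward()
        opt.step()
    val = evaluate(student, val_set)           # hypothetical validation loss
    if val < best_val:                         # keep lowest-validation checkpoint
        best_val = val
        torch.save(student.state_dict(), "best.pt")
```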
TABLE I: AUC comparison between our method and existing ones on MVTEC AD

textures | Epi-FRC+ [3] | EISNet+ [4] | DGTSAD [5] | Ours (MVTEC) | Ours (Cotton)
carpet   | 0.916        | 0.982       | 0.943      | 0.985        | 0.996
leather  | 1.000        | 1.000       | 1.000      | 0.991        | 0.996
wood     | 0.941        | 0.979       | 0.962      | 0.999        | 0.948
tile     | 0.951        | 0.851       | 0.994      | 0.965        | 0.964
grid     | 0.725        | 0.728       | 0.730      | 0.937        | 0.944
mean     | 0.907        | 0.909       | 0.918      | 0.975        | 0.969
Fig. 5: Outputs of detected anomalies with our domain-generalization model.

[...] on both carpet, leather, and grid, which contain patterns [25].

TABLE III: Epoch performance

textures | 10 epochs | 20 epochs | 30 epochs | 100 epochs | DG model
carpet   | 0.915     | 0.920     | 0.963     | 1.000      | 0.996
leather  | 0.949     | 0.974     | 0.990     | 0.997      | 0.996
wood     | 0.988     | 0.989     | 0.995     | 0.996      | 0.948
tile     | 0.986     | 0.987     | 0.991     | 0.987      | 0.964
mean     | 0.959     | 0.967     | 0.984     | 0.995      | 0.969