Fabric Defect Detection Based On Multi-Input Neural Network
Jingxin Lin, Nianfeng Wang, Hao Zhu, Xianmin Zhang and Xuewei Zheng
Jingxin Lin, Nianfeng Wang, Hao Zhu, and Xianmin Zhang are with the Guangdong Province Key Laboratory of Precision Equipment and Manufacturing Technology, South China University of Technology, Guangzhou, Guangdong 510640, PR China. (Nianfeng Wang is the corresponding author, e-mail: menfwang@scut.edu.cn) Xuewei Zheng is with the Faculty of Science and Technology, Uppsala University, Uppsala, Sweden.

Abstract— Fabric defect detection is an important part of the textile production process, and fabric defects are an important factor affecting fabric quality. In the fabric industry, automatic defect recognition technology has become increasingly mature. In this paper, the gray level co-occurrence matrix (GLCM) and the redundant contourlet transform (RCT) are used to extract texture features from segmented sub-images, and these features are combined with a convolutional neural network classifier to improve the recognition rate of fabric defects. Experiments are designed to prove the effectiveness of adding manually extracted features to a convolutional neural network.

Index Terms— Fabric defect detection, Texture feature extraction, Neural network

I. INTRODUCTION

Fabric defect detection is significant for textile quality control. In most cases, this work is still completed by inspectors. However, humans are easily affected by subjective factors such as emotion, fatigue, vision and lighting, so the reliability and accuracy of the inspection results cannot be guaranteed. In recent years, the automatic monitoring of textile quality has attracted the attention of many researchers, and the characterization and analysis of fabric texture is one of the more important research hotspots.

With the rapid development of image processing and recognition technology, many fabric defect detection methods have been proposed. From the perspective of texture feature description, these methods can be roughly divided into four categories: statistical, spectral, model-based and learning-based. Statistical methods mainly use the gray value and spatial distribution of each pixel to describe the image texture features; classical examples include the gray level co-occurrence matrix, morphology and histogram analysis. Biradar et al. [1] proposed an improved distance matching function to calculate the horizontal and vertical periods of repeating units of printed fabrics; this method has high accuracy for defects such as holes, broken warp and coarse warp knots. Sadaghiyanfam et al. [2] proposed an automatic fabric inspection scheme based on the gray level co-occurrence matrix and compared it with a wavelet transform method. Spectral methods transform the fabric image from the spatial domain to the frequency domain and identify defects by comparing differences in the frequency domain; typical spectral methods include the Fourier transform, contourlet transform, wavelet transform and Gabor transform. Yildiz et al. [3] proposed using the Fourier transform to detect defects on the fabric surface, exploiting the thermal difference between the defective part and the normal texture to enhance the defective part. Zhang et al. [4] proposed a fabric defect detection algorithm based on Gabor filters: using two Gabor filters to extract the mean and variance of the image, a feature image with higher resolution can be obtained. Model-based methods estimate model parameters by learning the fabric texture and then use hypothesis testing to judge whether the tested fabric image fits the model; if it does not conform to the model, the image contains defects. Commonly used models are the Markov random field and autoregression, but model-based methods have rarely been used in recent years. Learning-based methods mainly rely on deep learning, where the convolutional neural network (CNN) [5] is the most commonly used tool for feature extraction and classification. Zhang et al. [6] proposed an automatic localization and classification method for colored fabric defects based on YOLOv2. Guan et al. [7] proposed detecting and classifying filtered images based on the VGG neural network model. Rong et al. [8] proposed a network model based on an improved U-Net with an added attention mechanism to detect fabric defects, which greatly improved detection accuracy.

In this paper, traditional methods and a CNN are combined to extract features from fabric images, and a multi-input neural network model is proposed. The inputs are the segmented small-scale image and the corresponding features extracted by traditional methods. Experiments show that adding these manually extracted features to the neural network can improve its performance to a certain extent.

In Section II, the image feature extraction methods used in this paper are introduced. In Section III, the neural network model is described in detail, and the network models compared with it are also introduced. In Section IV, the data set is introduced, and the benefit of adding the features extracted by traditional methods is demonstrated by experiments.

II. FEATURE EXTRACTION METHODS

In this paper, the extracted features come from the first two of the four categories of fabric defect detection methods: statistical features and spectral features. The features are based on the image gray level co-occurrence matrix [9] and the redundant contourlet transform [10], respectively.
A. Feature Extraction Based on Gray Level Co-occurrence Matrix

The gray level co-occurrence matrix (GLCM) describes the texture features of an image by counting specific gray-value pairs of two pixels in a specific positional relationship. The GLCM is a two-dimensional matrix whose size is determined by the number of gray levels in the image. An L_y × L_x eight-bit gray-scale image generally has 256 gray levels, so the corresponding GLCM is 256 × 256.

As shown in Fig. 1, the positional relationship of two pixels in an image can be represented by (θ, d), where θ is the direction of the vector from the reference point to the other point, selected from {0°, 45°, 90°, 135°}, and d is the number of pixels separating the two points. Take θ = 45° and d = 1 as an example: if the ordered pairs (m, n) (m, n = 0, 1, 2, ..., 255) represent the gray values of the two points, the reference point has coordinates (x_0, y_0) and the other point has coordinates (x_1, y_1), then the statistic N_{45°,1}(m, n) can be expressed by the following formula:

N_{45^\circ,1}(m,n) = \#\{((x_0,y_0),(x_1,y_1)) \in (L_y \times L_x) \times (L_y \times L_x) \mid
    (y_0-y_1=1,\; x_0-x_1=1) \text{ or } (y_0-y_1=-1,\; x_0-x_1=-1),
    I(x_0,y_0)=m,\; I(x_1,y_1)=n\}   (1)

where \#\{\cdot\} denotes the number of times the gray-value combination (m, n) of a pixel pair with direction θ and distance d appears in the whole image, denoted N_{\theta,d}(m,n). Then P_{\theta,d}(m,n) is used to express the probability of the pixel pair.

Contrast is a difference moment of the GLCM; its calculation formula is shown in (5). It measures the local variation of the image: the larger the Con value, the clearer the local texture contrast of the image.

Con = \sum_{m,n} (m-n)^2 P_{\theta,d}(m,n)   (5)

Correlation is a measure of related texture in a certain direction of an image. As shown in (6), the expression describes the correlation of the GLCM in the row or column direction. If the GLCM has a high Cor value, the image has related texture in the direction used to generate the GLCM.

Cor = \frac{\sum_{m,n} (mn)\, P_{\theta,d}(m,n) - \mu_x \mu_y}{\sigma_x \sigma_y}   (6)

where \mu_x, \mu_y, \sigma_x, \sigma_y are obtained by:

\mu_x = \sum_{m,n} m\, P_{\theta,d}(m,n)   (7)

\mu_y = \sum_{m,n} n\, P_{\theta,d}(m,n)   (8)

\sigma_x = \sum_{m,n} (m-\mu_x)^2 P_{\theta,d}(m,n)   (9)

\sigma_y = \sum_{m,n} (n-\mu_y)^2 P_{\theta,d}(m,n)   (10)

The inverse difference moment (IDM) measures the degree of image homogeneity; its calculation formula is shown in (11). When the proportion of pixel pairs with fixed relative position and small gray-value difference (m, n) is relatively high, the probability values near the main diagonal of the GLCM are higher and IDM takes a larger value, indicating a more homogeneous image.
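The paper gives no code, but the construction of P_{\theta,d} and the statistics in (5)-(10) are straightforward to express. Below is a minimal NumPy sketch under two stated assumptions: the offset (dx, dy) encoding of (θ, d) follows one common image-coordinate convention, and IDM, whose formula (11) is referenced above, is written in its standard form \sum_{m,n} P/(1+(m-n)^2).

```python
import numpy as np

def glcm(img, dx, dy, levels=256):
    """Normalized GLCM P_{theta,d} for an integer gray-scale image.
    (dx, dy) encodes (theta, d); e.g. theta = 45 deg, d = 1 might map to
    (1, -1), depending on the image-coordinate convention chosen."""
    h, w = img.shape
    ys, ye = max(0, -dy), min(h, h - dy)   # valid reference rows
    xs, xe = max(0, -dx), min(w, w - dx)   # valid reference columns
    ref = img[ys:ye, xs:xe]
    nbr = img[ys + dy:ye + dy, xs + dx:xe + dx]
    counts = np.zeros((levels, levels))
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1)
    counts = counts + counts.T              # count both orderings, as in Eq. (1)
    return counts / counts.sum()            # P_{theta,d}(m, n)

def glcm_stats(P):
    """Contrast (5), correlation (6)-(10), and IDM in its standard form."""
    m, n = np.indices(P.shape)
    con = np.sum((m - n) ** 2 * P)                             # Eq. (5)
    mu_x, mu_y = np.sum(m * P), np.sum(n * P)                  # Eqs. (7), (8)
    sg_x = np.sum((m - mu_x) ** 2 * P)                         # Eq. (9)
    sg_y = np.sum((n - mu_y) ** 2 * P)                         # Eq. (10)
    cor = (np.sum(m * n * P) - mu_x * mu_y) / (sg_x * sg_y)    # Eq. (6)
    idm = np.sum(P / (1 + (m - n) ** 2))                       # standard IDM form
    return con, cor, idm
```

Only three of the five GLCM statistics that form the GLCM-5 input of Section III are defined above; the remaining two would be computed from the same matrix P.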
Here, k_i in (19) and w_j in (20) are weights. Then the Kullback-Leibler divergence (KLD) between sub-image A and the reference sub-image on subband C_{11} can be expressed as:

KLD_{11}(p_{11} \,\|\, q_{11}) = \int p_{11}(z) \log \frac{p_{11}(z)}{q_{11}(z)} \, dz   (21)

Similarly, the corresponding KLD values KLD_{i,dir} (i = 1, ..., J; dir = 1, ..., D) can be obtained on the other subbands. In this paper, the sub-image size is 32 × 32, the number of decomposition scales J is 2, and the number of decomposition directions D is 4, giving 8 subbands and thus 8 KLD values as feature components. The combined feature vector is as follows:

KLD = \{KLD_{11}, KLD_{12}, KLD_{13}, KLD_{14}, KLD_{21}, KLD_{22}, KLD_{23}, KLD_{24}\}   (22)

It is also used as a classification basis in Section III.
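Equation (21) can be approximated without the parametric densities of [10] by comparing empirical coefficient histograms per subband; the sketch below uses that simplification, so it is a stand-in for the paper's density-based computation rather than a reproduction of it. The names subbands_img and subbands_ref are hypothetical.

```python
import numpy as np

def subband_kld(coeffs_a, coeffs_ref, bins=64, eps=1e-12):
    """Histogram estimate of KLD(p_A || q_ref) in Eq. (21) for one subband.
    Simplified stand-in: the paper models the densities parametrically."""
    lo = min(coeffs_a.min(), coeffs_ref.min())
    hi = max(coeffs_a.max(), coeffs_ref.max())
    p, _ = np.histogram(coeffs_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(coeffs_ref, bins=bins, range=(lo, hi))
    p = p / p.sum() + eps   # normalize; eps avoids log(0) and division by 0
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

# J = 2 scales x D = 4 directions -> the 8-dimensional feature of Eq. (22).
# subbands_img / subbands_ref are hypothetical lists of RCT coefficient arrays:
# kld_vec = [subband_kld(a, r) for a, r in zip(subbands_img, subbands_ref)]
```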
III. FABRIC IMAGE DEFECT CLASSIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORKS

The convolutional neural network (CNN) has achieved great success in the field of image processing. A neural network can fit very complex functions, but it is difficult for a CNN to capture the statistical information in an image unless a very deep network is used. In this section, a multi-input CNN is proposed that combines the features extracted by traditional methods with the original image: the statistical and spectral features of the image are added, and a relatively simple neural network model is used. The prototype of the proposed network is VGG [13], proposed by the Visual Geometry Group of Oxford and a commonly used neural network for image processing.

A. Multi-input Neural Network

The VGG model consists of convolution layers, max pooling layers and fully-connected layers. Its basic building blocks are relatively simple, and the convolution kernels in the convolution layers are all 3 × 3, so it was chosen as the prototype of the network structure.

The VGG structure can be regarded as the combination of a feature extractor and a classifier: the convolution and max pooling layers act as the feature extractor, and the fully-connected layers classify the extracted features. On this basis, a multi-input neural network structure is proposed. The original image is input and enough features are extracted in the convolution and max pooling layers; then the features obtained by the extraction methods of Section II are spliced onto the features obtained by the convolutional layers, and together they are input to the fully-connected layers.

Fig. 4. Multi-input network structure

The image size in this paper is 32 × 32, much smaller than the input size of VGG, so the network structure is further simplified. Another advantage of segmenting into sub-images is that, by judging whether each sub-image has defects, the approximate location of the defects in the original image can also be obtained. For processing large images, another network structure to mark defect locations would be needed. The network used in this paper has 6 convolution layers, 3 max pooling layers and 4 fully-connected layers. The detailed structure is shown as a simplified diagram in Fig. 4. In the figure, conv3-32 denotes a convolution layer with kernel size 3 × 3 and 32 output channels. Similarly, FC-512 stands for a fully-connected layer with 512 output neurons, Maxpool denotes a max pooling layer with pooling size 2 × 2, and softmax is short for the softmax regression function. GLCM-5 and RCT-8 represent the five and eight features extracted by the two methods of Section II. A batch normalization [14] operation is added before the activation function of each layer, and a dropout [15] operation with dropout ratio 0.3 is added between the fully-connected layers; these enhance the generalization ability of the network and reduce overfitting. The loss function of the network is the cross-entropy loss. To make the network converge faster, the activation functions of all layers except the last are ELU functions [16], and softmax regression is used for classification in the output layer.

B. Comparison of Network Structures

To show the advantages of the multi-input neural network, several other network models are compared with it. The first network uses only the features obtained in Section II as the classification basis and inputs them directly into a multi-layer perceptron (MLP) [5] classifier. The second network uses only the original image as input, with the other parts basically the same as in Fig. 4. The last network adds an additional convolution layer to the second one.
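As a concrete illustration, here is one possible PyTorch sketch of the multi-input network described above. The paper fixes conv3-32 for the first layer, FC-512, the layer counts, and the GLCM-5/RCT-8 inputs; the remaining channel widths, FC sizes, and the binary output are assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class MultiInputNet(nn.Module):
    """Sketch of the Section III-A network: 6 conv layers, 3 max pools,
    4 FC layers; the 13 manual features (GLCM-5 + RCT-8) are spliced onto
    the flattened conv features. Widths beyond conv3-32/FC-512 are assumed."""
    def __init__(self, n_hand_feats=13, n_classes=2):
        super().__init__()
        def block(cin, cout):  # conv3 -> batch norm -> ELU, as in the paper
            return [nn.Conv2d(cin, cout, 3, padding=1),
                    nn.BatchNorm2d(cout), nn.ELU()]
        self.features = nn.Sequential(
            *block(1, 32), *block(32, 32), nn.MaxPool2d(2),      # 32x32 -> 16x16
            *block(32, 64), *block(64, 64), nn.MaxPool2d(2),     # -> 8x8
            *block(64, 128), *block(128, 128), nn.MaxPool2d(2),  # -> 4x4
            nn.Flatten())
        fc_in = 128 * 4 * 4 + n_hand_feats   # splice in the manual features
        self.classifier = nn.Sequential(
            nn.Linear(fc_in, 512), nn.BatchNorm1d(512), nn.ELU(), nn.Dropout(0.3),
            nn.Linear(512, 512), nn.BatchNorm1d(512), nn.ELU(), nn.Dropout(0.3),
            nn.Linear(512, 128), nn.BatchNorm1d(128), nn.ELU(), nn.Dropout(0.3),
            nn.Linear(128, n_classes))  # raw logits; loss applies softmax

    def forward(self, img, hand_feats):
        x = self.features(img)
        x = torch.cat([x, hand_feats], dim=1)
        return self.classifier(x)
```

Usage would be logits = MultiInputNet()(img_batch, feat_batch) with img_batch of shape (B, 1, 32, 32) and feat_batch of shape (B, 13); training with nn.CrossEntropyLoss, which applies log-softmax internally, matches the paper's softmax-plus-cross-entropy setup.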
[Figure: examples of defect images — (d) oil image, (e) scratch image, (f) crack sub-image]
TABLE I
COMPARISON OF EXPERIMENTAL RESULTS OF THE NEURAL NETWORKS

Models   | Accuracy/% | Recall/%
Fea+MLP  | 98.5272    | 97.7789
CNN1     | 99.0735    | 98.2505
CNN2     | 99.2447    | 98.5903
Fea+CNN  | 99.2170    | 98.5193

The accuracy obtained by the multi-input CNN is better than that obtained by using a CNN with the same structure alone. When only the features are used, most of the information is lost for lack of the original image, so the performance is the worst. The performance of the CNN combined with the GLCM and RCT features is similar to that of the CNN with one additional convolution layer. The experimental results show that the performance of the CNN can be further improved by combining the extracted features. For whole images, the recognition results of the CNN are shown in Fig. 7: sub-images classified as defective are marked with red boxes, while blue boxes are false positives (FPs) and green boxes are false negatives (FNs). For an original image with a resolution of 768 × 512, the processing time of the CNN is about 220-230 ms on a computer equipped with an i7-9700K CPU and a GTX 1060 GPU.

Fig. 7. Examples of recognition results
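The whole-image procedure just described (32 × 32 sub-images classified one by one, defective tiles boxed) might look like the following sketch. It assumes non-overlapping tiles, that class 1 means "defect", and that feats_fn returns the 13 manual features of one tile; none of these details are spelled out in the paper. model stands for the multi-input network sketched in Section III.

```python
import torch

def detect_defects(img, model, feats_fn, tile=32):
    """Tile a gray-scale image (H, W) into 32x32 sub-images, classify each
    with the multi-input network, and return top-left corners of defective
    tiles. Non-overlapping tiles and 'class 1 = defect' are assumptions."""
    h, w = img.shape
    tiles, feats, corners = [], [], []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            sub = img[y:y + tile, x:x + tile]
            tiles.append(torch.tensor(sub, dtype=torch.float32).unsqueeze(0))
            feats.append(torch.tensor(feats_fn(sub), dtype=torch.float32))
            corners.append((x, y))
    model.eval()  # fix batch-norm statistics and disable dropout
    with torch.no_grad():
        logits = model(torch.stack(tiles), torch.stack(feats))
        defective = (logits.argmax(dim=1) == 1).tolist()
    return [c for c, d in zip(corners, defective) if d]
```

For a 768 × 512 image this yields 24 × 16 = 384 tiles, small enough to classify in a single batch.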
V. CONCLUSION AND FUTURE WORK

In this paper, the fabric image is segmented into small-scale sub-images, which are judged for the presence of defects. First, features are extracted from the sub-images based on GLCM and RCT. Then a multi-input neural network is used: image features are extracted by multiple convolution and pooling layers and input into the MLP classifier together with the manually extracted features, and the network outputs whether the image contains a defect. The classification task on this dataset is relatively simple, so the classification accuracies of all experiments are very high. With the addition of the manually extracted features, the overall classification accuracy of the neural network is improved by about 0.15%, similar to the improvement brought by adding a convolution layer. Compared with other studies on fabric defect detection using neural networks, this study mainly demonstrates the benefit of manually extracted features for a neural network. In future work, a dataset with higher classification difficulty, or an image dataset captured in an actual production process, will be used to verify the effectiveness of adding manually extracted features to the neural network. At the same time, the depth of the network that takes only the original image as input will be increased until its performance no longer improves, and then the manually extracted features will be added to the network to verify whether they can bring a further breakthrough.

ACKNOWLEDGMENT

The authors would like to gratefully acknowledge the reviewers' comments. This work is supported by the National Key R&D Program of China (Grant No. 2019YFB1310200), the National Natural Science Foundation of China (Grant No. U1713207) and the Science and Technology Program of Guangzhou (Grant No. 201904020020).

REFERENCES

[1] M. S. Biradar, B. Sheeparmatti, P. Patil, and S. G. Naik, "Patterned fabric defect detection using regular band and distance matching function," in 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA). IEEE, 2017, pp. 1-6.
[2] S. Sadaghiyanfam, "Using gray-level-co-occurrence matrix and wavelet transform for textural fabric defect detection: A comparison study," in 2018 Electric Electronics, Computer Science, Biomedical Engineerings' Meeting (EBBT). IEEE, 2018, pp. 1-5.
[3] K. Yildiz, A. Buldu, M. Demetgul, and Z. Yildiz, "A novel thermal-based fabric defect detection technique," The Journal of The Textile Institute, vol. 106, no. 3, pp. 275-283, 2015.
[4] Y. Zhang, X. Ruan, S. Pan, L. Shi, and B. Zong, "A new fabric defect detection model based on summed-up distance matching function and Gabor filter bank," in Proceedings of the 2018 3rd Joint International Information Technology, Mechanical and Electronic Engineering Conference (JIMEC 2018), 2018.
[5] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[6] H.-w. Zhang, L.-j. Zhang, P.-f. Li, and D. Gu, "Yarn-dyed fabric defect detection with YOLOv2 based on deep convolution neural networks," in 2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS). IEEE, 2018, pp. 170-174.
[7] M. Guan, Z. Zhong, Y. Rui, H. Zheng, and X. Wu, "Defect detection and classification for plain woven fabric based on deep learning," in 2019 Seventh International Conference on Advanced Cloud and Big Data (CBD). IEEE, 2019, pp. 297-302.
[8] L. Rong-qiang, L. Ming-hui, S. Jia-chen, and L. Yi-bin, "Fabric defect detection method based on improved U-Net," in Journal of Physics: Conference Series, vol. 1948, no. 1. IOP Publishing, 2021, p. 012160.
[9] R. M. Haralick, K. Shanmugam, and I. H. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man, and Cybernetics, no. 6, pp. 610-621, 1973.
[10] M. S. Allili and N. Baaziz, "Contourlet-based texture retrieval using a mixture of generalized Gaussian distributions," in International Conference on Computer Analysis of Images and Patterns. Springer, 2011, pp. 446-454.
[11] F. T. Ulaby, F. Kouyate, B. Brisco, and T. L. Williams, "Textural information in SAR images," IEEE Transactions on Geoscience and Remote Sensing, no. 2, pp. 235-245, 1986.
[12] M. N. Do and M. Vetterli, "The contourlet transform: an efficient directional multiresolution image representation," IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2091-2106, 2005.
[13] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[14] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning. PMLR, 2015, pp. 448-456.
[15] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
[16] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, "Fast and accurate deep network learning by exponential linear units (ELUs)," arXiv preprint arXiv:1511.07289, 2015.
[17] P. S. Stanimirović, B. Ivanov, H. Ma, and D. Mosić, "A survey of gradient methods for solving nonlinear optimization," Electronic Research Archive, vol. 28, no. 4, p. 1573, 2020.