
2021 27th International Conference on Mechatronics and Machine Vision in Practice (M2VIP)
978-1-6654-3153-8/21/$31.00 ©2021 IEEE | DOI: 10.1109/M2VIP49856.2021.9665111

Fabric Defect Detection Based on Multi-Input Neural Network

Jingxin Lin, Nianfeng Wang, Hao Zhu, Xianmin Zhang and Xuewei Zheng

Jingxin Lin, Nianfeng Wang, Hao Zhu, and Xianmin Zhang are with the Guangdong Province Key Laboratory of Precision Equipment and Manufacturing Technology, South China University of Technology, Guangzhou, Guangdong 510640, PR China (Nianfeng Wang is the corresponding author, e-mail: menfwang@scut.edu.cn). Xuewei Zheng is with the Faculty of Science and Technology, Uppsala University, Uppsala, Sweden.

Abstract— Fabric defect detection is an important part of the textile production process, and fabric defects are an important factor affecting fabric quality. At present, automatic defect recognition technology in the fabric industry has become increasingly mature. In this paper, the gray level co-occurrence matrix (GLCM) and the redundant contourlet transform (RCT) are used to extract texture features from segmented sub images, and these features are combined with a convolutional neural network classifier to improve the recognition rate of fabric defects. Experiments are designed to prove the effectiveness of adding manually extracted features to a convolutional neural network.

Index Terms— Fabric defect detection, Texture feature extraction, Neural network

I. INTRODUCTION

Fabric defect detection is significant for textile quality control. In most cases, this work is still completed by inspectors. However, humans are easily affected by subjective factors such as emotion, fatigue, vision and lighting, so the reliability and accuracy of the inspection results cannot be guaranteed. In recent years, the automatic monitoring of textile quality has attracted the attention of many researchers, and the characterization and analysis of fabric texture is one of the more important research hotspots.

With the rapid development of image processing and recognition technology, many fabric defect detection methods have been proposed. From the perspective of texture feature description, these methods can be roughly divided into four categories: statistical, spectral, model-based and learning-based. Statistical methods mainly use the gray value and spatial distribution of each pixel to describe the image texture; classical examples include the gray level co-occurrence matrix, morphology and histogram analysis. Biradar et al. [1] proposed an improved distance matching function to calculate the horizontal and vertical periods of the repeating units of printed fabrics; this method has high accuracy for defects such as holes, broken warp and coarse warp knots. Sadaghiyanfam et al. [2] proposed an automatic fabric inspection scheme based on the gray level co-occurrence matrix and compared it with a wavelet transform method. Spectral methods transform the fabric image from the spatial domain to the frequency domain and identify defects by comparing differences in the frequency domain; typical examples include the Fourier transform, contourlet transform, wavelet transform and Gabor transform. Yildiz et al. [3] used the Fourier transform to detect defects on the fabric surface, exploiting the thermal difference between the defective part and the normal texture to enhance the defective region. Zhang et al. [4] proposed a fabric defect detection algorithm based on Gabor filters: using two Gabor filters to extract the mean and variance of the image, a feature image with higher resolution can be obtained. Model-based methods estimate the parameters of a model by learning the fabric texture, and then use hypothesis testing to judge whether the tested fabric image fits the model; if it does not, the image contains defects. The commonly used models are Markov random fields and autoregression, but model-based methods have rarely been used in recent years. Learning-based methods are mainly built on deep learning, where the convolutional neural network (CNN) [5] is the most commonly used tool for feature extraction and classification. Zhang et al. [6] proposed an automatic localization and classification method for colored fabric defects based on YOLOv2. Guan et al. [7] proposed detecting and classifying filtered images with a VGG neural network model. Rong et al. [8] proposed a network based on an improved U-Net with an added attention mechanism to detect fabric defects, which greatly improved detection accuracy.

In this paper, traditional methods and a CNN are combined to extract features from fabric images, and a multi-input neural network model is proposed. The inputs are the segmented small-scale image and the corresponding features extracted by traditional methods. Experiments show that adding these manually extracted features to the neural network can improve its performance to a certain extent.

In Section II, the image feature extraction methods used in this paper are introduced. In Section III, the neural network model is described in detail, and the network models compared with it are also introduced. In Section IV, the dataset is introduced, and the feasibility of adding the features extracted by traditional methods is proved by experiments.

II. FEATURE EXTRACTION METHODS

In this paper, the extracted features come from the first two of the four categories of fabric defect detection methods: statistical features and spectral features. They are based on the image gray level co-occurrence matrix [9] and the redundant contourlet transform [10] respectively.


A. Feature Extraction Based on Gray Level Co-occurrence Matrix

The gray level co-occurrence matrix (GLCM) describes the texture of an image by counting pairs of gray values of two pixels in a specific position relationship. The GLCM is a two-dimensional matrix whose size is determined by the number of gray levels in the image. Generally, an L_y × L_x eight-bit gray scale image has 256 gray levels, so the corresponding GLCM is 256 × 256.

Fig. 1. The position relationship of two pixel points

As shown in Fig. 1, the position relationship of two pixels in an image can be represented by (θ, d), where θ is the direction of the vector from the reference point to the other point, selected from {0°, 45°, 90°, 135°}, and d is the number of pixels separating the two points. Take θ = 45° and d = 1 as an example. If the ordered pairs (m, n) (m, n = 0, 1, 2, ..., 255) represent the gray values of the two points, the coordinates of the reference point are (x_0, y_0) and the coordinates of the other point are (x_1, y_1), then the statistic N_{45°,1}(m, n) can be expressed as:

N_{45^\circ,1}(m, n) = \#\{((x_0, y_0), (x_1, y_1)) \in (L_y \times L_x) \times (L_y \times L_x) \mid (y_0 - y_1 = 1,\ x_0 - x_1 = 1) \text{ or } (y_0 - y_1 = -1,\ x_0 - x_1 = -1),\ I(x_0, y_0) = m,\ I(x_1, y_1) = n\} \qquad (1)

where \#\{\cdot\} denotes the number of times the gray value combination (m, n) of a pixel pair with direction θ and distance d appears in the whole image, denoted N_{\theta,d}(m, n). Then P_{\theta,d}(m, n) expresses the probability of the pixel pair:

P_{\theta,d}(m, n) = \frac{N_{\theta,d}(m, n)}{N} \qquad (2)

where N = \sum_{m=0}^{255} \sum_{n=0}^{255} N_{\theta,d}(m, n). The GLCM P_{\theta,d} can be expressed as:

P_{\theta,d} =
\begin{pmatrix}
P_{\theta,d}(0, 0) & P_{\theta,d}(0, 1) & \cdots & P_{\theta,d}(0, 255) \\
P_{\theta,d}(1, 0) & P_{\theta,d}(1, 1) & \cdots & P_{\theta,d}(1, 255) \\
\vdots & \vdots & \ddots & \vdots \\
P_{\theta,d}(255, 0) & P_{\theta,d}(255, 1) & \cdots & P_{\theta,d}(255, 255)
\end{pmatrix} \qquad (3)

Through the above operations, the GLCM of an image is obtained. Reference [11] proposes five feature quantities to describe the texture information of the image: angular second moment (ASM), contrast (Con), correlation (Cor), inverse difference moment (IDM) and entropy (Ent).

ASM, also called energy, is a measure of the uniformity of the image texture. As shown in (4), the more uniform the image texture is, the larger the ASM:

Asm = \sum_{m,n} P_{\theta,d}(m, n)^2 \qquad (4)

Contrast is a difference moment of the GLCM, calculated by (5). It measures the local variation of the image: the larger the Con value, the clearer the local texture contrast.

Con = \sum_{m,n} (m - n)^2 P_{\theta,d}(m, n) \qquad (5)

Correlation measures related texture in a certain direction of the image. As shown in (6), the expression describes the correlation of the GLCM in the row or column direction; a high Cor value reflects that the image has related texture in the direction used to generate the GLCM.

Cor = \frac{\sum_{m,n} (mn) P_{\theta,d}(m, n) - \mu_x \mu_y}{\sigma_x \sigma_y} \qquad (6)

where \mu_x, \mu_y, \sigma_x, \sigma_y are obtained by:

\mu_x = \sum_{m,n} m\, P_{\theta,d}(m, n) \qquad (7)

\mu_y = \sum_{m,n} n\, P_{\theta,d}(m, n) \qquad (8)

\sigma_x = \sum_{m,n} (m - \mu_x)^2 P_{\theta,d}(m, n) \qquad (9)

\sigma_y = \sum_{m,n} (n - \mu_y)^2 P_{\theta,d}(m, n) \qquad (10)

IDM measures the degree of image homogeneity, calculated by (11). When pixel pairs with a fixed relative position mostly have a small gray difference (m − n), the probability mass near the main diagonal of the GLCM is high and the IDM is large; the image is then more homogeneous.

Idm = \sum_{m,n} \frac{P_{\theta,d}(m, n)}{1 + (m - n)^2} \qquad (11)

Entropy measures the disorder of the image, calculated by (12). Generally, when the texture of the image is more orderly, the elements of the GLCM are larger and the Ent is smaller.

Ent = -\sum_{m,n} P_{\theta,d}(m, n) \log P_{\theta,d}(m, n) \qquad (12)

The above five feature components extracted from the GLCM are combined into a feature vector:

F = \{Asm, Con, Cor, Idm, Ent\} \qquad (13)


It is used as a classification basis in Section III. In addition, in order to reduce the computational complexity, the number of gray levels is set to 32 in this paper: the gray values of the image are divided by 8 and rounded before the above feature values are calculated.
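For concreteness, a minimal NumPy sketch of this subsection (not the authors' code) is given below. It builds the GLCM for θ = 45°, d = 1 at 32 gray levels and evaluates (4)-(13) as printed, including the variance form of (9)-(10); floor division is used where the text says "divided by 8 and rounded".

```python
import numpy as np

def glcm_features(img, levels=32):
    """GLCM features of Section II-A for theta = 45 deg, d = 1 (eqs. (1)-(13)).

    img: 2-D uint8 array; gray values are quantized to `levels` bins
    (floor division by 8 for an 8-bit image, approximating the paper's
    "divided by 8 and rounded").
    """
    q = (img // (256 // levels)).astype(np.int64)
    h, w = q.shape
    glcm = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - 1):
        for x in range(w - 1):
            glcm[q[y, x], q[y + 1, x + 1]] += 1   # one orientation of the diagonal pair
    glcm += glcm.T                                # eq. (1) counts both orientations
    p = glcm / glcm.sum()                         # P_{theta,d}(m, n), eq. (2)

    m, n = np.indices((levels, levels))
    asm = np.sum(p ** 2)                          # eq. (4)
    con = np.sum((m - n) ** 2 * p)                # eq. (5)
    mu_x, mu_y = np.sum(m * p), np.sum(n * p)     # eqs. (7), (8)
    sig_x = np.sum((m - mu_x) ** 2 * p)           # eq. (9), variance form as printed
    sig_y = np.sum((n - mu_y) ** 2 * p)           # eq. (10)
    cor = (np.sum(m * n * p) - mu_x * mu_y) / (sig_x * sig_y)  # eq. (6)
    idm = np.sum(p / (1 + (m - n) ** 2))          # eq. (11)
    ent = -np.sum(p[p > 0] * np.log(p[p > 0]))    # eq. (12)
    return np.array([asm, con, cor, idm, ent])    # feature vector F, eq. (13)

# Example on a random 32x32 sub image:
F = glcm_features(np.random.randint(0, 256, (32, 32), dtype=np.uint8))
```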
The difference between the two images can be expressed by
B. Feature Extraction Based on Redundant Contourlet Transform

The redundant contourlet transform (RCT) is an improved version of the contourlet transform (CT) [12]. The RCT uses a new generalized Gaussian low-pass filter to perform the multiscale decomposition of the image. The impulse response of the generalized Gaussian low-pass filter is

g_b(x, y) = e^{-2\frac{x+y}{b}} - e^{-2}\, e^{-2\left(\frac{x+y-b}{b}\right)^2} e^{-2\left(\frac{x+y+b}{b}\right)^2} \qquad (14)

where b is a parameter that affects the frequency bandwidth. By applying multiple generalized Gaussian filters to the original image, different filtered images are obtained, forming a redundant Laplacian pyramid (RLP). Each layer of the RLP is given by:

I_i(m, n) = I_0(m, n) * g_{2^i}(m, n) \qquad (15)

RLP_i(m, n) = I_{i-1}(m, n) - I_i(m, n), \quad i = 1, \ldots, J \qquad (16)

where * denotes convolution, I_0 is the original image, and I_i and RLP_i are the low-pass rough image at scale level i and the band-pass image of the redundant Laplacian subband respectively. An RLP with J + 1 subbands is generated by J generalized Gaussian filters g_b (where b = 2^i): a low-pass rough approximation image I_a and J band-pass images decomposed into RLP subbands. Each decomposed image has the same size as I_0. Then the J band-pass images are critically sampled by the directional filter banks of the CT with four directions, generating J × D directional subbands C_{i,dir} of the same size, where i = 1, ..., J, dir = 1, ..., 4 and D stands for the number of directions.

Fig. 2. Redundant contourlet transform process: (a) frequency division; (b) image decomposition example

Fig. 2(a) shows the multi-scale representation and multi-directional division of the RLP and RCT in the frequency domain. The top of (a) shows a two-level RLP decomposition, and the bottom shows the further decomposition of each level in four directions. Fig. 2(b) shows the decomposition result of an example image by the RCT; to display the images better, the coefficients of each subband image are taken as absolute values and normalized to [0, 255].

A fabric sub image is decomposed into several decomposed images by the RCT. In order to express the gray distribution of each decomposed image through a mathematical model, the gray histograms of these decomposed images are counted. For example, Fig. 3 shows the gray histograms of the decomposed images of a flawless reference image. The difference between two images can then be expressed by comparing the gray histograms of the corresponding decomposed images obtained under the same operation.

Fig. 3. Direction decomposed image gray histograms

The gray histogram of a decomposed image after the RCT can be accurately described by a generalized Gaussian mixture model. For a single random variable z ∈ R, the generalized Gaussian distribution is defined as:

p(z \mid \mu, \sigma, \beta) = K(\beta, \sigma)\, \exp\!\left(-A(\beta) \left|\frac{z - \mu}{\sigma}\right|^{\beta}\right) \qquad (17)

where K(\beta, \sigma) = \beta \sqrt{\Gamma(3/\beta)/\Gamma(1/\beta)} \,\big/\, \big(2\sigma\Gamma(1/\beta)\big) and A(\beta) = \left[\Gamma(3/\beta)/\Gamma(1/\beta)\right]^{\beta/2}, with Γ(·) the gamma function. Parameters μ and σ are the expectation and standard deviation of the distribution, representing its location and dispersion. Parameter β controls the steepness of the probability density curve, determining whether it has a sharp peak or a flat top; here β = 2. When a single generalized Gaussian is not sufficient, a generalized Gaussian mixture distribution is used:

p(z \mid \theta) = \sum_{i=1}^{K} \pi_i\, p(z \mid \mu_i, \sigma_i, \beta_i) \qquad (18)

where p(z | μ_i, σ_i, β_i) is the probability density of the i-th generalized Gaussian component, π_i is its weight with 0 < π_i ≤ 1 and \sum_{i=1}^{K} \pi_i = 1, and θ = {π_i, μ_i, σ_i, β_i} is the parameter set of the model.

After the RCT and the generalized Gaussian mixture fitting, the generalized Gaussian mixture distribution of an image is obtained. In order to characterize the features of different images, a flawless image is selected as the reference image, and the difference between other images and the reference image is taken as a feature component. The Kullback-Leibler divergence (KLD) is used to express the difference between images.


For example, a flawless sub image block is selected as the reference image. When the reference sub image is decomposed on C_{11}, the probability distribution of its gray value histogram is:

p_{11}(z) = \sum_{i=1}^{K} k_i\, p(z \mid \mu_i, \sigma_i) \qquad (19)

Assuming that the gray histogram of the decomposed image of a sub image A on C_{11} has been obtained, its generalized Gaussian mixture fit is:

q_{11}(z) = \sum_{j=1}^{K} w_j\, p(z \mid \mu_j, \sigma_j) \qquad (20)

Here, k_i in (19) and w_j in (20) are weights. The KLD between sub image A and the reference sub image on subband C_{11} can then be expressed as:

KLD_{11}(p_{11} \parallel q_{11}) = \int p_{11}(z) \log\frac{p_{11}(z)}{q_{11}(z)}\, dz \qquad (21)

Similarly, the corresponding KLD values KLD_{i,dir} (i = 1, ..., J; dir = 1, ..., D) can be obtained on the other subbands. In this paper, the sub image size is 32 × 32, the number of decomposition scales J is 2 and the number of decomposition directions D is 4, so there are 8 subbands, corresponding to 8 KLD values as feature components. The combined feature vector is:

KLD = \{KLD_{11}, KLD_{12}, KLD_{13}, KLD_{14}, KLD_{21}, KLD_{22}, KLD_{23}, KLD_{24}\} \qquad (22)

It is also used as a classification basis in Section III.
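For illustration, the following sketch (not from the paper) evaluates the densities of (17)-(18) and approximates the KLD integral of (21) on a grid. The mixture parameters in the example are invented placeholders; the actual fitting of the mixtures to subband histograms (e.g., by expectation-maximization) is not shown.

```python
import numpy as np
from math import gamma

def ggd_pdf(z, mu, sigma, beta=2.0):
    """Generalized Gaussian density of eq. (17); beta = 2 gives a Gaussian shape."""
    K = beta * np.sqrt(gamma(3 / beta) / gamma(1 / beta)) / (2 * sigma * gamma(1 / beta))
    A = (gamma(3 / beta) / gamma(1 / beta)) ** (beta / 2)
    return K * np.exp(-A * np.abs((z - mu) / sigma) ** beta)

def mixture_pdf(z, weights, mus, sigmas, beta=2.0):
    """Generalized Gaussian mixture of eqs. (18)-(20)."""
    return sum(w * ggd_pdf(z, m, s, beta) for w, m, s in zip(weights, mus, sigmas))

def kld(p, q, z):
    """Riemann-sum approximation of the KLD integral, eq. (21)."""
    eps = 1e-12                       # avoids log(0) where a density vanishes
    dz = z[1] - z[0]
    return float(np.sum(p * np.log((p + eps) / (q + eps))) * dz)

# Placeholder mixture parameters standing in for two fitted subband histograms:
z = np.linspace(-5, 5, 2001)
p11 = mixture_pdf(z, [0.6, 0.4], [0.0, 1.0], [0.5, 1.2])   # reference fit, eq. (19)
q11 = mixture_pdf(z, [0.5, 0.5], [0.1, 1.5], [0.6, 1.0])   # sub image A fit, eq. (20)
print(kld(p11, q11, z))                                     # KLD_11, eq. (21)
```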
III. FABRIC IMAGE DEFECT CLASSIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORK

The convolutional neural network (CNN) has made great achievements in the field of image processing. A neural network can fit very complex functions, but it is very difficult to obtain the statistical information of an image with a CNN unless a very deep network is used. In this section, a multi-input CNN is proposed which combines the features extracted by traditional methods with the original image. The statistical and spectral features of the image are added, and a relatively simple neural network model is used. The prototype of the proposed network is VGG [13], proposed by the Visual Geometry Group of Oxford and a commonly used image processing neural network.

A. Multi-input Neural Network

The VGG model consists of convolution layers, max pooling layers and fully-connected layers. Its basic building blocks are relatively simple, and the convolution kernels are all 3 × 3, so it was chosen as the prototype of the network structure.

A VGG network can be regarded as the combination of a feature extractor and a classifier: the convolution and max pooling layers act as the feature extractor, and the fully-connected layers classify the extracted features. On this basis, a multi-input network structure is proposed. The original image is input and sufficient features are extracted in the convolution and max pooling layers; the features obtained by the feature extraction methods of Section II are then spliced onto the features obtained by the convolution layers, and together they are input to the fully-connected layers. The image size in this paper is 32 × 32, much smaller than the input size of VGG, so the network structure is further simplified. Another advantage of segmenting into sub images is that, by judging whether each sub image has defects, the approximate location of the defects in the original image is also obtained; when processing large images directly, another network structure would be needed to mark the defect locations. The network used in this paper has 6 convolution layers, 3 max pooling layers and 4 fully-connected layers.

Fig. 4. Multi-input network structure

The detailed structure is shown in Fig. 4 as a simplified diagram. In the figure, conv3-32 denotes a convolution layer with a 3 × 3 kernel and 32 output channels; similarly, FC-512 stands for a fully-connected layer with 512 output neurons; maxpool denotes a max pooling layer with pooling size 2 × 2; and softmax is short for the softmax regression function. GLCM-5 and RCT-8 represent the five and eight features extracted by the two methods of Section II. Before the activation function of each layer, a batch normalization [14] operation is added, and a dropout [15] operation with a dropout ratio of 0.3 is added between the fully-connected layers; these enhance the generalization ability of the network and reduce overfitting. The loss function of the network is the cross entropy loss. In order to make the network converge faster, the activation functions of all layers except the last are ELU functions [16], and softmax regression is used for classification in the output layer.
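To make the structure concrete, the following PyTorch sketch implements a 6-convolution/3-maxpool/4-FC multi-input network of the kind described above. It is not the authors' code: Fig. 4 is not reproduced here, so apart from conv3-32, FC-512, the 2 × 2 pooling and the 13 concatenated handcrafted features, the channel widths and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class MultiInputNet(nn.Module):
    """Sketch of a Fig. 4-style multi-input CNN (channel widths partly assumed).

    A 32x32 gray sub image passes through 6 conv + 3 maxpool layers; the 13
    handcrafted features (GLCM-5 + RCT-8) are concatenated before the
    fully-connected classifier, as described in Section III-A.
    """
    def __init__(self, n_handcrafted=13, n_classes=2):
        super().__init__()
        def block(cin, cout):
            # conv3-cout with batch normalization before the ELU activation
            return [nn.Conv2d(cin, cout, 3, padding=1),
                    nn.BatchNorm2d(cout), nn.ELU()]
        self.features = nn.Sequential(
            *block(1, 32), *block(32, 32), nn.MaxPool2d(2),      # 32x32 -> 16x16
            *block(32, 64), *block(64, 64), nn.MaxPool2d(2),     # 16x16 -> 8x8
            *block(64, 128), *block(128, 128), nn.MaxPool2d(2),  # 8x8 -> 4x4
        )
        self.classifier = nn.Sequential(
            nn.Linear(128 * 4 * 4 + n_handcrafted, 512),
            nn.BatchNorm1d(512), nn.ELU(), nn.Dropout(0.3),
            nn.Linear(512, 512), nn.BatchNorm1d(512), nn.ELU(), nn.Dropout(0.3),
            nn.Linear(512, 128), nn.BatchNorm1d(128), nn.ELU(), nn.Dropout(0.3),
            nn.Linear(128, n_classes),   # softmax is applied inside the loss
        )

    def forward(self, image, handcrafted):
        x = self.features(image).flatten(1)
        x = torch.cat([x, handcrafted], dim=1)   # splice GLCM-5 + RCT-8 features in
        return self.classifier(x)

# CrossEntropyLoss applies log-softmax, matching the softmax output described above.
model = MultiInputNet()
logits = model(torch.randn(4, 1, 32, 32), torch.randn(4, 13))
```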


B. Comparison of Network Structures

In order to show the advantages of the multi-input neural network, several other network models are used for comparison. The first network only uses the features obtained in Section II as the classification basis and inputs them directly into a multi-layer perceptron (MLP) [5] classifier. The second network only uses the original image as input; its other parts are basically the same as in Fig. 4. The last network adds an additional convolution layer to the second network; this structure is used to compare the performance gain from increasing the depth of the network with that from adding the manually extracted features. The three detailed structures are shown in Fig. 5. Similarly, batch normalization and dropout operations are added to these network models, and the loss functions and activation functions are the same as in the previous subsection.

Fig. 5. Structure comparison

IV. EXPERIMENT

In this section, the above neural network models are trained on fabric images from the TILDA fabric dataset, and the test results are compared.

A. Dataset Generation

The c1r1 images of the TILDA dataset are used in this experiment. The size of these images is 768 × 512. There are 250 images in this dataset, divided into five categories with 50 images each: normal images and four defect categories, namely cracks, holes, oil and scratches. In this experiment, these images are segmented into 32 × 32 sub images. Typical examples of the various image types are shown in Fig. 6, and Fig. 6(f) shows an enlarged crack sub image. A label is provided for each sub image based on whether it contains a defect: the label of a normal sub image is 0, and the label of a defective sub image is 1. The TILDA c1r1 dataset was captured by a camera with a stained lens, so there are some black spots in the images as interference, such as those in the lower left corner of the images, marked with a red box in Fig. 6(a). Sub images with these black spots are still regarded as normal images in the data annotation.

Fig. 6. Examples in dataset: (a) normal image; (b) crack image; (c) hole image; (d) oil image; (e) scratch image; (f) crack sub image

There are 15889 sub images in the dataset. In order to show the contrast between the models more clearly, a larger share of images is assigned to the test set than in normal neural network training: 7945 images are used for training, and the remaining 7944 are used for testing. The training set contains 4000 normal images and 3945 defective images; the test set contains 4000 normal images and 3944 defective images.
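For reference, the segmentation step can be sketched as below, assuming non-overlapping tiles (the paper does not state whether tiles overlap). Each 768 × 512 image yields 384 tiles, and 250 × 384 exceeds the reported 15889 sub images, so presumably only a subset of tiles was kept.

```python
import numpy as np

def tile_image(img, size=32):
    """Split a gray image into non-overlapping size x size sub images.

    Returns an array of shape (num_tiles, size, size); a 512 x 768 image
    yields 16 x 24 = 384 tiles.
    """
    h, w = img.shape
    return (img[:h - h % size, :w - w % size]
            .reshape(h // size, size, w // size, size)
            .swapaxes(1, 2)
            .reshape(-1, size, size))

tiles = tile_image(np.zeros((512, 768), dtype=np.uint8))
print(tiles.shape)   # (384, 32, 32)
```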
B. Performance Comparison of Neural Networks

The experimental environment for neural network training is as follows: Python 3.6 and PyTorch 1.7.1 on Windows 10. The training parameters differ between the models. The learning method is the Adam optimization algorithm [17]. In the MLP-only model, the learning rate decreases with the training epoch according to (23) with LRInit = 0.01 and decay = 0.001, and the training batch size is 128. In the other CNN models, the learning rate decreases according to (23) with LRInit = 0.0001 and decay = 0.01, and the training batch size is 16. Training ends when the loss on the test set has not decreased for a certain number of epochs; the patience is 500 epochs for the MLP model and 45 for the CNN models. These are the parameters that achieved the best performance after many tests.

LearningRate = \frac{LRInit}{1 + decay \cdot epoch} \qquad (23)

Since the parameters are randomly initialized each time a model is trained, the performance of the same network may differ between training runs, so each model is trained five times. The results, shown in TABLE I, are the averages of the five repeated experiments. In the table, Accuracy and Recall are obtained by (24) and (25):

Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \qquad (24)

Recall = \frac{TP}{TP + FN} \qquad (25)

where TP (true positive) means that the image is predicted to be defective and the prediction is correct; similarly, FP, TN and FN denote false positive, true negative and false negative respectively. For overall network performance, Accuracy is a good measure; but for fabric defect detection, false detection is generally considered preferable to missed detection, so Recall is also useful.
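As a side note on reproducing the schedule of (23): PyTorch's LambdaLR multiplies the initial learning rate by a per-epoch factor, so the factor 1/(1 + decay · epoch) realizes (23) exactly. A sketch with the CNN settings follows, using a stand-in model for brevity.

```python
import torch

# Eq. (23) via LambdaLR: the returned factor scales the initial lr,
# giving lr(epoch) = LRInit / (1 + decay * epoch).
model = torch.nn.Linear(13, 2)                              # stand-in for the CNN
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # LRInit = 0.0001 (CNN case)
decay = 0.01
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: 1.0 / (1.0 + decay * epoch))

for epoch in range(3):
    # ... run one epoch of training here (batch size 16 for the CNN models) ...
    optimizer.step()     # placeholder so the scheduler ordering is valid
    scheduler.step()     # apply the eq. (23) decay for the next epoch
```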


TABLE I
COMPARISON OF EXPERIMENTAL RESULTS OF NEURAL NETWORKS

Models     Accuracy/%   Recall/%
Fea+MLP    98.5272      97.7789
CNN1       99.0735      98.2505
CNN2       99.2447      98.5903
Fea+CNN    99.2170      98.5193

It can be concluded from the table that the performance obtained by combining the GLCM and RCT features with the CNN is better than that of a CNN with the same structure alone. When only the features are used, most of the information is lost along with the original image, so the performance is the worst. The performance of the CNN combined with the GLCM and RCT features is similar to that of the CNN with one additional convolution layer. The experimental results show that the performance of the CNN can be further improved by combining the extracted features.

Fig. 7. Examples of recognition results

For whole images, the recognition results of the CNN are as shown in Fig. 7: sub images classified as defective are marked with red boxes, while the blue boxes are FPs and the green boxes are FNs. For a 768 × 512 original image, the processing time of the CNN is about 220-230 ms on a computer equipped with an i7-9700K CPU and a GTX 1060 GPU.

V. CONCLUSION AND FUTURE WORK

In this paper, the fabric image is segmented into small-scale sub images, which are then judged for defects. Firstly, features are extracted from the sub images based on the GLCM and RCT. Then, a multi-input neural network is used: image features are extracted by multiple convolution and pooling layers and input into the MLP classifier together with the manually extracted features, and the network outputs whether there is a defect in the image. The classification task on this dataset is relatively simple, so the classification accuracies of all experiments are very high. With the addition of the manually extracted features, the overall classification accuracy of the neural network is improved by about 0.15%, similar to the improvement brought by adding a convolution layer. Compared with other studies on fabric defect detection using neural networks, this study mainly demonstrates the promoting effect of manually extracted features on a neural network. In future work, datasets with higher classification difficulty, or image datasets captured in actual production, will be used to verify the effectiveness of adding the manually extracted features to the neural network. At the same time, the depth of the image-only network will be increased until its performance no longer improves, and the manually extracted features will then be added to verify whether a further breakthrough can be achieved.

ACKNOWLEDGMENT

The authors would like to gratefully acknowledge the reviewers' comments. This work is supported by the National Key R&D Program of China (Grant No. 2019YFB1310200), the National Natural Science Foundation of China (Grant No. U1713207) and the Science and Technology Program of Guangzhou (Grant No. 201904020020).

REFERENCES

[1] M. S. Biradar, B. Sheeparmatti, P. Patil, and S. G. Naik, “Patterned fabric defect detection using regular band and distance matching function,” in 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA). IEEE, 2017, pp. 1-6.
[2] S. Sadaghiyanfam, “Using gray-level-co-occurrence matrix and wavelet transform for textural fabric defect detection: A comparison study,” in 2018 Electric Electronics, Computer Science, Biomedical Engineerings' Meeting (EBBT). IEEE, 2018, pp. 1-5.
[3] K. Yildiz, A. Buldu, M. Demetgul, and Z. Yildiz, “A novel thermal-based fabric defect detection technique,” The Journal of The Textile Institute, vol. 106, no. 3, pp. 275-283, 2015.
[4] Y. Zhang, X. Ruan, S. Pan, L. Shi, and B. Zong, “A new fabric defect detection model based on summed-up distance matching function and Gabor filter bank,” in Proceedings of the 2018 3rd Joint International Information Technology, Mechanical and Electronic Engineering Conference (JIMEC 2018), 2018.
[5] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[6] H.-w. Zhang, L.-j. Zhang, P.-f. Li, and D. Gu, “Yarn-dyed fabric defect detection with YOLOv2 based on deep convolution neural networks,” in 2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS). IEEE, 2018, pp. 170-174.
[7] M. Guan, Z. Zhong, Y. Rui, H. Zheng, and X. Wu, “Defect detection and classification for plain woven fabric based on deep learning,” in 2019 Seventh International Conference on Advanced Cloud and Big Data (CBD). IEEE, 2019, pp. 297-302.
[8] L. Rong-qiang, L. Ming-hui, S. Jia-chen, and L. Yi-bin, “Fabric defect detection method based on improved U-Net,” in Journal of Physics: Conference Series, vol. 1948, no. 1. IOP Publishing, 2021, p. 012160.
[9] R. M. Haralick, K. Shanmugam, and I. H. Dinstein, “Textural features for image classification,” IEEE Transactions on Systems, Man, and Cybernetics, no. 6, pp. 610-621, 1973.
[10] M. S. Allili and N. Baaziz, “Contourlet-based texture retrieval using a mixture of generalized Gaussian distributions,” in International Conference on Computer Analysis of Images and Patterns. Springer, 2011, pp. 446-454.
[11] F. T. Ulaby, F. Kouyate, B. Brisco, and T. L. Williams, “Textural information in SAR images,” IEEE Transactions on Geoscience and Remote Sensing, no. 2, pp. 235-245, 1986.
[12] M. N. Do and M. Vetterli, “The contourlet transform: an efficient directional multiresolution image representation,” IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2091-2106, 2005.
[13] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[14] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in International Conference on Machine Learning. PMLR, 2015, pp. 448-456.
[15] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
[16] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and accurate deep network learning by exponential linear units (ELUs),” arXiv preprint arXiv:1511.07289, 2015.
[17] P. S. Stanimirović, B. Ivanov, H. Ma, and D. Mosić, “A survey of gradient methods for solving nonlinear optimization,” Electronic Research Archive, vol. 28, no. 4, p. 1573, 2020.
