Sensors 23 06558 v2
Article
LSD-YOLOv5: A Steel Strip Surface Defect Detection
Algorithm Based on Lightweight Network and Enhanced
Feature Fusion Mode
Huan Zhao, Fang Wan, Guangbo Lei *, Ying Xiong, Li Xu, Chengzhi Xu and Wen Zhou
Abstract: In the field of metallurgy, the timely and accurate detection of surface defects on metallic
materials is a crucial quality control task. However, current defect detection approaches face chal-
lenges with large model parameters and low detection rates. To address these issues, this paper
proposes a lightweight recognition model for surface damage on steel strips, named LSD-YOLOv5.
First, we design a shallow feature enhancement module to replace the first Conv structure in the
backbone network. Second, the Coordinate Attention mechanism is introduced into the MobileNetV2
bottleneck structure to maintain the lightweight nature of the model. Then, we propose a smaller
bidirectional feature pyramid network (BiFPN-S) and combine it with Concat operation for efficient
bidirectional cross-scale connectivity and weighted feature fusion. Finally, the Soft-DIoU-NMS
algorithm is employed to enhance the recognition efficiency in scenarios where targets overlap. Com-
pared with the original YOLOv5s, the LSD-YOLOv5 model achieves a reduction of 61.5% in model
parameters and a 28.7% improvement in detection speed, while improving recognition accuracy by
2.4%. This demonstrates that the model achieves an optimal balance between detection accuracy and
speed, while maintaining a lightweight structure.
Keywords: surface defect detection; YOLOv5s; Stem block; MobileNetV2 bottleneck; multi-scale
feature fusion
Citation: Zhao, H.; Wan, F.; Lei, G.; Xiong, Y.; Xu, L.; Xu, C.; Zhou, W. LSD-YOLOv5: A Steel Strip
Surface Defect Detection Algorithm Based on Lightweight Network and Enhanced Feature Fusion Mode.
Sensors 2023, 23, 6558. https://doi.org/10.3390/s23146558
Academic Editor: Gilbert-Rainer Gillich
Received: 15 June 2023
Revised: 6 July 2023
Accepted: 19 July 2023
Published: 20 July 2023
1. Introduction
Steel strips are a crucial product in the steel industry and serve as a foundational
material in areas such as bridge engineering, shipbuilding, and automobile manufacturing [1].
The quality of steel strips directly impacts the performance and lifespan of various
infrastructure systems. However, during production and transportation, the surface of
steel strips may present multiple defects. These defects not only compromise the quality
of the steel strips, but also contribute to inaccuracies in subsequent processing steps [2,3].
The timely and accurate identification of defects is an effective way to improve the quality
and efficiency of steel strip production. Therefore, defect detection in industrial production
carries significant practical value. Steel strip surface damage types are complex and diverse,
and these defects may not be readily discernible [4], thereby posing certain challenges
to detection.
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open
access article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (https://creativecommons.org/licenses/by/4.0/).
The techniques employed for identifying surface damage on steel strips can be categorized
into three groups: manual detection, automated detection, and artificial-intelligence-based
detection. Initially, defect detection was mainly accomplished through manual labor,
which necessitated prolonged periods of high-intensity labor for workers on the production
line. This not only resulted in inefficiencies, but also increased the likelihood of wrong
and missed detections [5]. As a result, the quality of the steel strips could not be
well-guaranteed. With the development of automatic detection technology, eddy current testing,
infrared thermography, magnetic flux leakage testing, ultrasonic testing, etc., have become
commonly used in industrial production. Ghanei et al. [6] utilized eddy current testing to
determine the martensite percentage of dual-phase steels and evaluate their mechanical
properties. Keo et al. [7] employed a microwave excitation system coupled with infrared
thermography for detecting vertical reinforcements in concrete. Zhang et al. [8] utilized an
open-ended rectangular waveguide based on microwave non-destructive testing for detecting
defects in thick-coated steel plates. However, these methods still have the limitations
of material and the inability to accurately classify defects, making it difficult to recognize
defects accurately and efficiently.
In recent years, artificial intelligence techniques have been widely used in defect
detection. Current methods for detecting damage can be generally classified into two main
categories: traditional machine-learning-based object detection [9] and deep-learning-based
object detection [10]. For instance, Hussain et al. [11] proposed an object recognition model
based on intelligent deep learning and an improved whale optimization algorithm (WOA).
In their study, a data augmentation approach was first employed to address the imbalance
in object classes. Then, the DenseNet201 network was enhanced, and an improved WOA
was proposed to select the best features. The application of these methodologies has led to a
substantial enhancement in the accuracy of the model. The traditional methods extract the
damage features manually [12,13], such as Local Binary Pattern (LBP), Histogram of Oriented
Gradients (HOG), and Gray-level Co-occurrence Matrix (GLCM), followed by classification,
which improves the efficiency of damage identification to some extent. Gola et al. [14]
extracted textural features and morphological parameters from steel structures, utilizing
a support vector machine algorithm to classify the microstructure of low-carbon steels.
Luo et al. [15] proposed a new generalized completed local binary pattern framework for
feature extraction. Furthermore, Ashour et al. [16] extracted multidirectional shearlet features
from hot-rolled steel strip images, followed by Gray-level Co-occurrence Matrix (GLCM)
calculations. However, these methods are susceptible to external environmental influences,
such as lighting conditions and backgrounds. Moreover, the accuracy of recognition heavily
depends on feature engineering, resulting in poor robustness and model generalization.
Compared with traditional methods, deep learning detection methods can automat-
ically learn features from raw data with higher accuracy and efficiency [17,18], and they
exhibit increased resistance to external interference. Deep learning detection algorithms
can be broadly categorized into two main categories. One is a two-stage algorithm based on
candidate regions. For example, Zhou et al. [19] improved the Fast R-CNN model’s ability
to detect surface defects on steel strips, combining a novel residual atrous spatial pyramid
pooling module with the feature pyramid network to enhance multiscale feature fusion.
Akhyar et al. [20] introduced deformable convolution, deformable RoI pooling, and guided
anchoring RPN to optimize the cascade R-CNN algorithm. Selamet et al. [21] proposed a
metal surface defect detection model that combines the Faster R-CNN algorithm with the
shape-from-shading method. The other is a regression-based single-stage algorithm.
For instance, Guo et al. [22] introduced the TRANS structure into the backbone network
and detection head of YOLOv5, aiming to combine features and global information effec-
tively. Zhang et al. [23] detected and classified damaged hot-rolled steel strips using a
network model based on YOLOv5s. Li et al. [24] improved the performance of the YOLOv5
algorithm by incorporating dense multiscale weighted feature fusion and ASPF mod-
ules. The above studies illustrate the significant advancements in deep learning detection
algorithms and their broad applicability in the industrial production of steel strips.
Surface defect detection on steel strips is a critical application of object detection.
However, current research primarily concentrates on enhancing the precision of surface
defect classification [25,26], and it remains challenging to account for recognition speed
and a lightweight model design at the same time. Liu et al. [27] developed an end-to-end
multiscale contextual detection model for identifying steel strip damage with multiple
scales and complex backgrounds. Although this model can achieve real-time detection of
steel strip damage, its effectiveness in detecting defects with irregular shapes and unclear
boundaries is limited. Li et al. [28] proposed a surface defect detection approach based on
YOLOv4. However, the detection model is complex and not ideal for implementation in
devices with limited resources. Tian et al. [29] utilized key point estimation to determine the
defect centers, which optimized the detection speed of the model. Nonetheless, the model’s
performance is suboptimal when detecting ambiguous defects. Zhou et al. [30] proposed a
lightweight detection model, which performs well in terms of real-time performance and model size.
However, this structure is prone to wrong and missed detections when detecting steel strip
images with overlapping targets. Liu et al. [31] designed a TruingDet algorithm based on
Fast R-CNN. They strengthened the ability of the detection model to identify and localize
damage, but failed to address the issue of balancing classification and regression effectively.
Although the current methods for object detection have achieved some success, they
still face the following challenges: (1) The speed of classification is limited by the complex
network structure and large computational volume, leading to high latency. Furthermore,
powerful computer hardware is required for model training, making it challenging to
deploy on devices with limited memory and computation resources. (2) The detection
accuracy is generally low for poorly characterized defects and small-scale targets. There
are instances of missed detections for defects that are closely spaced and mutually oc-
cluded. To address the above challenges, we designed a practical defect detection method
with fewer parameters to achieve a good balance between detection accuracy and speed.
YOLOv5 is an open-source object detection model that addresses the issue of recognition
and detection in industrial scenarios, offering both speed and compactness. This study aims
to enhance the YOLOv5 model by reducing its complexity, boosting detection speed, and
simultaneously improving accuracy. In this work, the major contributions are as follows:
• We developed a lightweight steel strip surface defect detection model, LSD-YOLOv5.
• We proposed a new, efficient feature extraction network by integrating the R-Stem
module and CA-MbV2 module into the backbone network. This has led to a significant
reduction in model parameters, while also improving the speed of detection.
• A smaller bidirectional feature pyramid network (BiFPN-S) was implemented in the
model to effectively integrate feature information at multiple scales.
• We improved the recognition efficiency of overlapping targets by employing the
Soft-DIoU-NMS prediction frame screening algorithm.
The rest of this paper is organized as follows. Section 2 details the proposed model LSD-
YOLOv5 for steel strip surface defect detection. Section 3 evaluates the experimental results
and compares our proposed model with state-of-the-art methods. Section 4 concludes this
paper and describes the directions for future work.
[Figure: sample images of the ten defect classes. (a) Punching; (b) Welding line; (c) Crescent gap;
(d) Water spot; (e) Oil spot; (f) Silk spot; (g) Inclusion; (h) Rolled pit; (i) Crease; (j) Waist folding.]
utilizing the generated feature maps. The post-processing phase of the model utilizes the
Soft-DIoU-NMS algorithm for refined prediction boxes.
[Figure: LSD-YOLOv5 network structure. Backbone (layers 0 to 9): 640 × 640 input, R-Stem, four
CBS + CA-MbV2 stages, and SPPF. Neck (layers 10 to 23): CBS, Upsample, Concat_BiFPN-S, and C3 blocks.
Head: three Conv outputs at 80 × 80, 40 × 40, and 20 × 20. Legend: CBS = Conv + BN + SiLU;
Bottleneck = CBS, CBS, Add; C3 = CBS and Bottleneck branches joined by Concat and CBS;
SPPF = CBS, three MaxPool layers, Concat, CBS.]
[Figure: R-Stem module. Input (640 × 640 × 3); Output (160 × 160 × 32). Block labels:
CBS (3 × 3, 32, 2), CBS (1 × 1, 32, 1), CBS (1 × 1, 16, 1), CBS (3 × 3, 32, 2), and Bottleneck.]
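The input and output sizes in the R-Stem figure labels (640 × 640 × 3 in, 160 × 160 × 32 out) can be checked with the standard convolution output-size formula. The sketch below is only a shape calculation under stated assumptions: the two CBS blocks labeled (3 × 3, 32, 2) are read as kernel 3, 32 channels, stride 2, and the padding of `kernel // 2` is an assumption, not something the labels state. The helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

size = 640
# First stride-2 3x3 CBS block (padding = kernel // 2 is assumed).
size = conv_out(size, kernel=3, stride=2, padding=1)
after_first = size
# Second stride-2 3x3 CBS block.
size = conv_out(size, kernel=3, stride=2, padding=1)
after_second = size
# A 1x1 CBS block with stride 1 and no padding leaves the size unchanged.
after_pointwise = conv_out(after_second, kernel=1, stride=1, padding=0)

print(after_first, after_second, after_pointwise)  # 320 160 160
```

Under these assumptions, two stride-2 blocks account for the 4× spatial downsampling shown in the figure labels.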
$$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w), \tag{2}$$
to generate coordinate attention, the feature maps aggregated through the aforementioned
equations are combined, and then a shared 1 × 1 convolutional transform function $F_1$ is
applied to the concatenated feature maps to obtain feature representation $f$, as shown in
Equation (3).
$$f = \delta\left(F_1\left(\left[z^h, z^w\right]\right)\right), \tag{3}$$
where $f \in \mathbb{R}^{C/r \times (W+H)}$ is the intermediate feature map that captures the spatial information
along the horizontal and vertical coordinates, $r$ denotes the scale of downsampling, and $\delta$
refers to the Sigmoid activation function. The feature map $f$ is separated into two tensors
along the spatial dimension and the feature maps $f^h$ and $f^w$ are converted into tensors with
the same number of channels as the input X through two 1 × 1 convolutions, respectively,
to obtain Equations (4) and (5).
$$g^h = \sigma\left(F_h\left(f^h\right)\right), \tag{4}$$
$$g^w = \sigma\left(F_w\left(f^w\right)\right), \tag{5}$$
where $g^h$ and $g^w$ are the attention weights in the height and width directions of the feature
map. The final output of the CA module can be expressed by Equation (6):
[Figure: Coordinate Attention module. Input (C × H × W); X Avg Pool and Y Avg Pool; Concat + Conv2d;
BatchNorm + Non-linear; two Conv2d + Sigmoid branches; Output (C × H × W).]
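The directional pooling and gating steps of Equations (2) to (5) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The weight matrices `w1`, `wh`, and `ww` are random placeholders standing in for the learned 1 × 1 convolutions, batch normalization is omitted, δ is taken as the Sigmoid function as stated in the text, and the final reweighting of the input follows the original Coordinate Attention formulation [32], which is an assumption here because Equation (6) is not reproduced in this text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x, w1, wh, ww):
    """x: (C, H, W); w1: (C // r, C); wh and ww: (C, C // r)."""
    c, h, w = x.shape
    z_h = x.mean(axis=2)                     # (C, H): average over the width
    z_w = x.mean(axis=1)                     # (C, W): average over the height, Eq. (2)
    z = np.concatenate([z_h, z_w], axis=1)   # (C, H + W)
    # A 1x1 convolution over channels is a matrix product on the channel axis.
    f = sigmoid(w1 @ z)                      # (C / r, H + W), Eq. (3)
    f_h, f_w = f[:, :h], f[:, h:]            # split along the spatial dimension
    g_h = sigmoid(wh @ f_h)                  # (C, H), Eq. (4)
    g_w = sigmoid(ww @ f_w)                  # (C, W), Eq. (5)
    # Assumed final step: reweight the input along both directions.
    return x * g_h[:, :, None] * g_w[:, None, :]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 6, 4                      # illustrative sizes only
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
wh = rng.standard_normal((C, C // r))
ww = rng.standard_normal((C, C // r))
y = coordinate_attention(x, w1, wh, ww)
print(y.shape)  # (8, 4, 6)
```

Because both gates lie in (0, 1), the output keeps the input shape and never exceeds the input in magnitude, which is the sense in which the module reweights rather than replaces features.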
improve the accuracy of detection. Therefore, to attain a better balance of surface defect
information across different scales of steel strips, we designed an augmented feature fusion
network based on BiFPN, called BiFPN-S. The structure of BiFPN-S is shown in Figure 7.
We reduced the five input levels of BiFPN to three so that it integrates with the YOLOv5
framework. BiFPN shares one weight for all channels in each level of the feature map,
which impedes the network’s ability to acquire multi-scale encoding.
To address this, we introduced a separate CA attention mechanism in each prediction
branch to differentiate the importance of various channels within the same feature level.
We replaced PANet with BiFPN-S and combined Concat with BiFPN-S at layers 16, 20, 24,
and 28 in the network architecture, which we referred to as Concat_BiFPN-S. The utilization
of BiFPN-S improves the model’s recognition ability for multi-scale targets and its
recognition rate for small surface defects on steel strips.
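[Figure 7. Structure of BiFPN-S. Node labels: inputs P3in, P4in, and P5in; intermediate node P4td;
outputs P3out, P4out, and P5out, with Coordinate Attention and Predict blocks on the output branches.]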
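Weighted feature fusion in a BiFPN-style neck can be illustrated with the fast normalized fusion rule from the original BiFPN [37]. The sketch below assumes BiFPN-S keeps that rule, which the text here does not state explicitly. The function name, the `eps` value, and the toy feature vectors are hypothetical, and real feature maps would be tensors rather than flat lists.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally shaped features: O = sum_i (w_i / (eps + sum_j w_j)) * I_i."""
    # ReLU keeps every learnable weight non-negative before normalization.
    relu = [max(0.0, w) for w in weights]
    total = sum(relu) + eps
    fused = [0.0] * len(features[0])
    for w, feat in zip(relu, features):
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused

p4_in = [1.0, 2.0, 3.0]        # toy stand-in for one input feature map
p5_up = [3.0, 2.0, 1.0]        # toy stand-in for an upsampled deeper feature map

fused_equal = fast_normalized_fusion([p4_in, p5_up], [1.0, 1.0])
fused_gated = fast_normalized_fusion([p4_in, p5_up], [-1.0, 1.0])
print(fused_equal)
print(fused_gated)
```

With equal weights the result is close to the element-wise mean, and a negative weight is clipped to zero so that input contributes nothing to the fused output.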
$$R_{DIoU} = \frac{\rho^2\left(b, b^{gt}\right)}{c^2}, \tag{8}$$
where $S_i$ represents the classification score, $R_{DIoU}$ denotes the penalty term as shown in
Figure 8, $M$ signifies the highest scoring prediction box, $B_i$ represents other prediction boxes,
and $\xi$ indicates the threshold value for NMS. Equation (8) defines $R_{DIoU}$, and $d = \rho(b, b^{gt})$
denotes the Euclidean distance between the central points of $M$ and $B_i$. According to
Equation (7), when the IoU value between the highest confidence prediction box and other
prediction boxes is below the threshold value, the other prediction boxes remain unaltered;
otherwise, the confidence of other prediction boxes decays until it falls below the threshold
value, at which point the detection box is removed.
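The screening procedure described above can be sketched in plain Python. This is a hedged illustration, not the authors' code. Equation (7) is not reproduced in this text, so the linear decay `score * (1 - (IoU - R_DIoU))` is an assumption, as are the function names, the default thresholds, and the `(x1, y1, x2, y2)` box format. The term c is taken as the diagonal of the smallest box enclosing both boxes, following the standard DIoU definition.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def r_diou(a, b):
    """Penalty of Eq. (8): squared center distance over squared enclosing diagonal."""
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
       + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return d2 / c2

def soft_diou_nms(boxes, scores, xi=0.5, score_min=0.3):
    """Keep the top box M, decay overlapping boxes, drop boxes whose score gets too low."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        remaining.sort(key=lambda item: item[1], reverse=True)
        m_box, m_score = remaining.pop(0)
        kept.append((m_box, m_score))
        updated = []
        for box, score in remaining:
            overlap = iou(m_box, box) - r_diou(m_box, box)
            if overlap >= xi:
                score *= 1.0 - overlap       # assumed linear decay
            if score >= score_min:
                updated.append((box, score))
        remaining = updated
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_diou_nms(boxes, scores)
print([box for box, _ in kept])  # [(0, 0, 10, 10), (20, 20, 30, 30)]
```

In this toy case the second box overlaps the first heavily, so its score decays below `score_min` and it is removed, while the distant third box keeps its score unchanged.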
$$\text{Precision} = \frac{TP}{TP + FP}, \tag{9}$$
$$\text{Recall} = \frac{TP}{TP + FN}, \tag{10}$$
where TP represents the number of positive samples predicted as positive samples, TN is
the number of negative samples predicted as negative samples, FP denotes the number of
negative samples predicted as positive samples, and FN indicates the number of positive
samples predicted as negative samples.
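Equations (9) and (10) translate directly into code. The counts below are made-up numbers for illustration only, and the zero-denominator guard is an added convention rather than part of the definitions.

```python
def precision_recall(tp, fp, fn):
    """Precision and Recall from Eqs. (9) and (10)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 80 correct detections, 20 false alarms, 40 missed defects.
p, r = precision_recall(tp=80, fp=20, fn=40)
print(p, r)
```

Note that TN does not enter either formula, which is why these two metrics suit detection tasks where true negatives are not well defined.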
AP denotes the mean detection accuracy for each defect class, while mAP represents
the mean detection accuracy across all defect classes. The equations for calculating AP and
mAP are expressed in Equations (11) and (12), respectively. The increase in the number of
model parameters can lead to a corresponding increase in both the size of the model file
and memory usage. The FPS metric indicates the number of images that can be processed
by the model per second and is used to evaluate whether the model meets the real-time
detection requirements.
$$AP = \int_0^1 P(r)\,dr, \tag{11}$$
$$mAP = \frac{1}{n} \sum_{i=1}^{n} AP_i. \tag{12}$$
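The integral in Equation (11) is evaluated numerically in practice. The sketch below approximates it with the trapezoidal rule over sampled precision-recall points and then averages per-class values as in Equation (12). The trapezoidal rule is an assumption chosen for brevity (evaluation toolkits commonly use an interpolated precision curve instead), and the sample points are made up.

```python
def average_precision(recalls, precisions):
    """Approximate AP = integral of P(r) dr with the trapezoidal rule."""
    points = sorted(zip(recalls, precisions))
    ap = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        ap += (r1 - r0) * (p0 + p1) / 2.0
    return ap

def mean_average_precision(ap_per_class):
    """mAP as the mean of the per-class AP values, Eq. (12)."""
    return sum(ap_per_class) / len(ap_per_class)

# Hypothetical precision-recall samples for one class.
ap_a = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.6])
# Hypothetical AP of a second class, used only to show the averaging step.
map_value = mean_average_precision([ap_a, 0.6])
print(ap_a, map_value)
```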
Model                      Precision (%)  mAP (%)  Recall (%)  Params (M)  Inference Time (ms)
YOLOv5s (baseline)         67.6           65.5     62.3        7.04        14.2
+MobileNetV2               64.1           63.8     61.4        1.25        9.3
+CA-MbV2                   66.3           64.7     62.6        1.83        9.9
+R-Stem+CA-MbV2            67.7           65.9     63.5        1.98        10.1
+R-Stem+CA-MbV2+BiFPN-S    68.9           67.2     64.9        2.71        10.7
LSD-YOLOv5                 69.8           67.9     66.8        2.71        11.1
First, the original YOLOv5 backbone network was replaced by a lightweight MobileNetV2
structure. As shown in Table 1, the mAP of the network model was 63.8%,
representing a slight decrease of 1.7% compared with the original model. However, the
model parameters and inference time were significantly reduced by 82.2% and 34.5%,
respectively. This indicates that the network has a more lightweight architecture and faster
detection capabilities, making it feasible for deployment on devices with limited memory
and computation resources. Second, a new feature extraction network was constructed
by incorporating the CA mechanism into the lightweight MobileNetV2 architecture. Com-
pared with using the MobileNetV2 architecture as the backbone network, the mAP of the
model was improved by 0.9%. This demonstrates that incorporating the CA attention mech-
anism makes the backbone network more focused on extracting useful information from
the features. Subsequently, the R-Stem was implemented on the backbone network with
a 1.2% improvement in mAP, while maintaining a minimal increase in model parameters
and inference time. Moreover, the CA mechanism was placed in each prediction branch of
the BiFPN, and this new module was combined with the Concat operation in the network. As
presented in Table 1, the mAP of the model improved by 1.3%, demonstrating that BiFPN-S
can better merge feature information for identifying steel strip surface damage. Finally, we
utilized Soft-DIoU-NMS to refine the prediction boxes in the post-processing process of
defect detection. This resulted in an improvement of 0.7% and 1.9% in mAP and Recall of
the model, respectively, which reduced the missed detection rate of overlapping targets
and improved the detection accuracy.
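The relative changes quoted above can be reproduced from the Table 1 values with a few lines of arithmetic. The helper name is hypothetical, and the FPS figures are derived under the assumption that FPS is the reciprocal of the per-image inference time.

```python
def relative_reduction(baseline, value):
    """Percentage reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline * 100.0

# YOLOv5s baseline versus the MobileNetV2 backbone (values from Table 1).
params_drop = relative_reduction(7.04, 1.25)    # parameters in millions
time_drop = relative_reduction(14.2, 9.3)       # inference time in ms

# Assumed conversion from inference time (ms per image) to frames per second.
fps_baseline = 1000.0 / 14.2
fps_final = 1000.0 / 11.1

print(round(params_drop, 1), round(time_drop, 1))    # 82.2 34.5
print(round(fps_baseline, 1), round(fps_final, 1))   # 70.4 90.1
```

The two reductions match the 82.2% and 34.5% stated in the text.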
Figure 9. Detection results of inconspicuous defects. (a) Original image sample; (b) Model 1 detection
results; (c) Model 2 detection results; (d) Model 3 detection results; (e) Model 4 detection results.
From Model 1 to Model 4, there was a gradual increase in the number of detectable
defects, and Model 4 produced almost no missed detections. As shown in Figure 9,
compared with Model 1 detection, Model 2 was optimized by adding the R-Stem and
CA-MbV2 modules to improve the detection accuracy for relatively faint, inconspicuous
defects. As indicated in Figure 10, we performed a series of experiments to evaluate
the performance of the four models in the scenario of small damage detection. Model 3
exhibited a substantial enhancement in the detection performance compared to the two
preceding models. As depicted in Figure 10d, it accurately identified relatively small
defects in the images, effectively improving the detection performance for multi-scale
targets. However, Model 3 was susceptible to confusion when there were overlapping or
closely located defects as shown in Figure 11. By using the Soft-DIoU-NMS algorithm in
Model 4, as indicated in Figure 11e, the model could identify and accurately locate targets
in proximity or overlap, thereby reducing the rate of missed detections.
Figure 10. Detection results of small defects. (a) Original image sample; (b) Model 1 detection results;
(c) Model 2 detection results; (d) Model 3 detection results; (e) Model 4 detection results.
Figure 11. Detection results of overlapping defects. (a) Original image sample; (b) Model 1 detection
results; (c) Model 2 detection results; (d) Model 3 detection results; (e) Model 4 detection results.
Model          Precision (%)  mAP (%)  Recall (%)  Params (M)  FLOPs (G)  FPS
Faster R-CNN   68.6           68.2     67.8        63.57       263.5      12.6
SSD            53.3           51.6     52.1        27.32       39.2       54.7
YOLOv4         60.2           57.5     56.3        52.36       121.3      41.5
YOLOv5s        67.6           65.5     62.3        7.04        15.9       70
YOLOv7         62.0           61.3     60.6        37.21       103.6      56.3
LSD-YOLOv5     69.8           67.9     66.8        2.71        9.1        90.1
Figure 12 shows the visualization of the recognition effects, where the labels on the
anchor boxes indicate the confidence scores of the model detection. The visualization
shows that our proposed model has a high confidence score in the damaged area and
the recognition result is accurate. Furthermore, the model exhibits excellent detection
performance for inconspicuous defects, such as “Rolled pit” (marked as “8_yahen”), and
small-scale defects, such as “Oil spot” (marked as “5_youban”). As shown in Table 2, the
Faster R-CNN algorithm had the highest mAP and Recall of 68.2% and 67.8%. However, its
model parameters and computational demands of 63.57 M and 263.5 G were considerably
larger than other detection models. Additionally, the Faster R-CNN model had an FPS of
12.6, which falls short of real-time detection requirements. The mAP achieved by YOLOv7
was inferior to that of YOLOv5s, and the model entailed higher parameters and compu-
Figure 13. Comparison of evaluation indexes under different models. (a) mAP curve. (b) Bounding
box loss curve. (c) Confidence loss curve. (d) Classification loss curve.
4. Conclusions
In this paper, we have proposed an optimal LSD-YOLOv5 model for recognizing
surface damage on steel strips in real-world scenarios, which is characterized by fewer
parameters and lower computational requirements. The model strikes an ideal balance
between detection accuracy and speed, while maintaining a lightweight structure. First,
to improve the feature extraction capability of shallow networks, the R-Stem module was
proposed to replace the first Conv structure in the backbone network. We also designed a
new feature extraction network called CA-MbV2, which integrates the Coordinate Attention
mechanism into the bottleneck structure of MobileNetV2. CA-MbV2 has significantly
reduced the model parameters and improved the detection speed. Second, we introduced
BiFPN-S into the neck layer and placed a lightweight attention module in each prediction
branch, enhancing the model’s ability to modulate various scales of damage. Finally, Soft-
DIoU-NMS was utilized as a prediction frame screening algorithm to minimize the number
of missed detections of overlapping targets.
The effectiveness of the proposed method has been demonstrated by conducting
extensive evaluations and ablation studies on the GC10-DET dataset. In comparison with
the original YOLOv5s model, the proposed LSD-YOLOv5 model has achieved a reduction
of 61.5% in parameters, a decrease of 42.8% in computation, and a 2.4% improvement in
accuracy, making it more suitable for meeting the lightweight requirements of steel strip
surface defect detection. This study has the potential to provide insights into lightweight
and real-time methods for detecting metal surface defects in industrial settings, and may
establish a basis for the implementation of industrial automation. In the future, we will
consider exploring the adoption of more efficient data augmentation techniques, such as
Generative Adversarial Networks (GAN), to enhance the recognition and generalization
capabilities of our models.
Author Contributions: Conceptualization, H.Z. and F.W.; methodology, H.Z.; software, H.Z. and
G.L.; validation, H.Z.; formal analysis, Y.X. and L.X.; investigation, F.W., G.L. and C.X.; resources, L.X.
and W.Z.; data curation, H.Z. and G.L.; writing—original draft preparation, H.Z.; writing—review
and editing, F.W. and G.L.; visualization, H.Z.; supervision, Y.X. and C.X.; project administration,
W.Z.; funding acquisition, L.X. and W.Z. All authors have read and agreed to the published version
of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (Grant
No. 62202147) and the Science and Technology Research Project of Education Department of Hubei
Province (Grant No. B2021070).
Institutional Review Board Statement: Not applicable.
References
1. Wen, X.; Shan, J.; He, Y.; Song, K. Steel Surface Defect Recognition: A Survey. Coatings 2022, 13, 17. [CrossRef]
2. Luo, Q.; Fang, X.; Su, J.; Zhou, J.; Zhou, B.; Yang, C.; Liu, L.; Gui, W.; Tian, L. Automated visual defect classification for flat steel
surface: A survey. IEEE Trans. Instrum. Meas. 2020, 69, 9329–9349. [CrossRef]
3. Wang, H.; Li, Z.; Wang, H. Few-shot steel surface defect detection. IEEE Trans. Instrum. Meas. 2021, 71, 1–12. [CrossRef]
4. Tang, B.; Chen, L.; Sun, W.; Lin, Z.K. Review of surface defect detection of steel products based on machine vision. IET Image
Process. 2023, 17, 303–322. [CrossRef]
5. Zhao, W.; Song, K.; Wang, Y.; Liang, S.; Yan, Y. FaNet: Feature-aware Network for Few Shot Classification of Strip Steel Surface
Defects. Measurement 2023, 208, 112446. [CrossRef]
6. Ghanei, S.; Kashefi, M.; Mazinani, M. Eddy current nondestructive evaluation of dual phase steel. Mater. Des. 2013, 50, 491–496.
[CrossRef]
7. Keo, S.A.; Brachelet, F.; Breaban, F.; Defer, D. Steel detection in reinforced concrete wall by microwave infrared thermography.
NDT E Int. 2014, 62, 172–177. [CrossRef]
8. Zhang, H.; Gao, B.; Tian, G.Y.; Woo, W.L.; Bai, L. Metal defects sizing and detection under thick coating using microwave NDT.
NDT E Int. 2013, 60, 52–61. [CrossRef]
9. Wang, A.; Sha, M.; Liu, L.; Chu, M. A new process industry fault diagnosis algorithm based on ensemble improved binary-tree
SVM. Chin. J. Electron. 2015, 24, 258–262. [CrossRef]
10. Hussain, N.; Khan, M.A.; Tariq, U.; Kadry, S.; Yar, M.A.E.; Mostafa, A.M.; Alnuaim, A.A.; Ahmad, S. Multiclass Cucumber Leaf
Diseases Recognition Using Best Feature Selection. Comput. Mater. Contin. 2022, 70, 3281–3294. [CrossRef]
11. Hussain, N.; Khan, M.A.; Kadry, S.; Tariq, U.; Mostafa, R.R.; Choi, J.I.; Nam, Y. Intelligent deep learning and improved whale
optimization algorithm based framework for object recognition. Hum. Cent. Comput. Inf. Sci. 2021, 11, 2021.
12. Chu, M.; Gong, R. Invariant feature extraction method based on smoothed local binary pattern for strip steel surface defect. ISIJ
Int. 2015, 55, 1956–1962. [CrossRef]
13. Wang, Y.; Xia, H.; Yuan, X.; Li, L.; Sun, B. Distributed defect recognition on steel surfaces using an improved random forest
algorithm with optimal multi-feature-set fusion. Multimed. Tools Appl. 2018, 77, 16741–16770. [CrossRef]
14. Gola, J.; Webel, J.; Britz, D.; Guitar, A.; Staudt, T.; Winter, M.; Mücklich, F. Objective microstructure classification by support
vector machine (SVM) using a combination of morphological parameters and textural features for low carbon steels. Comput.
Mater. Sci. 2019, 160, 186–196. [CrossRef]
15. Luo, Q.; Sun, Y.; Li, P.; Simpson, O.; Tian, L.; He, Y. Generalized completed local binary patterns for time-efficient steel surface
defect classification. IEEE Trans. Instrum. Meas. 2018, 68, 667–679. [CrossRef]
16. Ashour, M.W.; Khalid, F.; Abdul Halin, A.; Abdullah, L.N.; Darwish, S.H. Surface defects classification of hot-rolled steel strips
using multi-directional shearlet features. Arab. J. Sci. Eng. 2019, 44, 2925–2932. [CrossRef]
17. Zhang, S.; Zhang, Q.; Gu, J.; Su, L.; Li, K.; Pecht, M. Visual inspection of steel surface defects based on domain adaptation and
adaptive convolutional neural network. Mech. Syst. Signal Process. 2021, 153, 107541. [CrossRef]
18. Chen, X.; Lv, J.; Fang, Y.; Du, S. Online detection of surface defects based on improved YOLOV3. Sensors 2022, 22, 817. [CrossRef]
[PubMed]
19. Zhou, X.; Wei, M.; Li, Q.; Fu, Y.; Gan, Y.; Liu, H.; Ruan, J.; Liang, J. Surface Defect Detection of Steel Strip with Double Pyramid
Network. Appl. Sci. 2023, 13, 1054. [CrossRef]
20. Akhyar, F.; Liu, Y.; Hsu, C.Y.; Shih, T.K.; Lin, C.Y. FDD: A deep learning–based steel defect detectors. Int. J. Adv. Manuf. Technol.
2023, 126, 1093–1107. [CrossRef] [PubMed]
21. Selamet, F.; Cakar, S.; Kotan, M. Automatic detection and classification of defective areas on metal parts by using adaptive fusion
of faster R-CNN and shape from shading. IEEE Access 2022, 10, 126030–126038. [CrossRef]
22. Guo, Z.; Wang, C.; Yang, G.; Huang, Z.; Li, G. Msft-yolo: Improved yolov5 based on transformer for detecting defects of steel
surface. Sensors 2022, 22, 3467. [CrossRef] [PubMed]
23. Zhang, Y.; Wang, W.; Li, Z.; Shu, S.; Lang, X.; Zhang, T.; Dong, J. Development of a cross-scale weighted feature fusion network
for hot-rolled steel surface defect detection. Eng. Appl. Artif. Intell. 2023, 117, 105628. [CrossRef]
24. Li, G.; Zhao, S.; Zhou, M.; Li, M.; Shao, R.; Zhang, Z.; Han, D. YOLO-RFF: An Industrial Defect Detection Method Based on
Expanded Field of Feeling and Feature Fusion. Electronics 2022, 11, 4211. [CrossRef]
25. Kou, X.; Liu, S.; Cheng, K.; Qian, Y. Development of a YOLO-V3-based model for detecting defects on steel strip surface.
Measurement 2021, 182, 109454. [CrossRef]
26. Zhao, W.; Chen, F.; Huang, H.; Li, D.; Cheng, W. A new steel defect detection algorithm based on deep learning. Comput. Intell.
Neurosci. 2021, 2021, 5592878. [CrossRef]
27. Liu, R.; Huang, M.; Gao, Z.; Cao, Z.; Cao, P. MSC-DNet: An efficient detector with multi-scale context for defect detection on strip
steel surface. Measurement 2023, 209, 112467. [CrossRef]
28. Li, M.; Wang, H.; Wan, Z. Surface defect detection of steel strips based on improved YOLOv4. Comput. Electr. Eng. 2022,
102, 108208. [CrossRef]
29. Tian, R.; Jia, M. DCC-CenterNet: A rapid detection method for steel surface defects. Measurement 2022, 187, 110211. [CrossRef]
30. Zhou, W.; Hong, J. FHENet: Lightweight Feature Hierarchical Exploration Network for Real-Time Rail Surface Defect Inspection
in RGB-D Images. IEEE Trans. Instrum. Meas. 2023, 72, 1–8. [CrossRef]
31. Liu, Z.; Tang, R.; Duan, G.; Tan, J. TruingDet: Towards high-quality visual automatic defect inspection for mental surface. Opt.
Lasers Eng. 2021, 138, 106423. [CrossRef]
32. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 13713–13722.
33. Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
34. Lv, X.; Duan, F.; Jiang, J.J.; Fu, X.; Gan, L. Deep metallic surface defect detection: The new benchmark and detection network.
Sensors 2020, 20, 1562. [CrossRef] [PubMed]
35. Wang, R.J.; Li, X.; Ling, C.X. Pelee: A real-time object detection system on mobile devices. Adv. Neural Inf. Process. Syst. 2018, 31,
5278.
36. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Pro-
ceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 4510–4520.
37. Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
38. Zhang, D.Y.; Luo, H.S.; Wang, D.Y.; Zhou, X.G.; Li, W.F.; Gu, C.Y.; Zhang, G.; He, F.M. Assessment of the levels of damage caused
by Fusarium head blight in wheat using an improved YoloV5 method. Comput. Electron. Agric. 2022, 198, 107086. [CrossRef]
39. Farady, I.; Kuo, C.C.; Ng, H.F.; Lin, C.Y. Hierarchical Image Transformation and Multi-Level Features for Anomaly Defect
Detection. Sensors 2023, 23, 988. [CrossRef] [PubMed]
40. He, Y.; Su, Y.; Wang, X.; Yu, J.; Luo, Y. An improved method MSS-YOLOv5 for object detection with balancing speed-accuracy.
Front. Phys. 2023, 10, 1349. [CrossRef]
41. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of
the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings,
Part I 14; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37.
42. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural
Inf. Process. Syst. 2015, 28, 91–99. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.