Automatic Defogging for Sewer Pipes
Automation in Construction
journal homepage: www.elsevier.com/locate/autcon
A R T I C L E  I N F O

Keywords:
Sewer pipeline
Generative adversarial network
Real-time segmentation
Defogging and deblurring

A B S T R A C T

Conventional deep-learning-based inspection methods for sewer pipeline defects neglect the complex inner environment of pipelines (e.g., fog and motion blur) and real-time segmentation despite their high accuracy for clear images. To solve the problem of low accuracy and slow speed of fuzzy image inspection, a novel defogging, deblurring, and real-time segmentation system for sewer pipeline defects is proposed. First, an attention-based algorithm for defogging and a generative adversarial network (GAN) for deblurring are created to improve the sharpness of pipeline images. Second, a real-time segmentation network called Pipe-Yolact-Edge is proposed to detect the defects in pipeline images at a pixel level, which achieves the highest mean average precision (mAP) of 92.65% and the fastest speed of 41.23 frames per second (fps) among the state-of-the-art segmentation networks. Comparison experiments show that the mAP of the proposed segmentation model is improved by 7.93% and 15.43% after defogging and deblurring of pipeline images, respectively, thus revealing the impact of Pipe-Defog-Net and Pipe-Deblur-GAN. In particular, the proposed defogging and deblurring methods for pipeline images can reduce the effects of contrast reduction, boundary enlargement, and missing inspection of small defects caused by fog and motion blur. Finally, the trained model is transferred into a small development device to segment the images of pipeline defects on site.
1. Introduction

Recently, natural disasters have become complex and grim, with extreme weather and climatic events occurring frequently. One such event was a historically rare rainstorm in Henan province, China, which caused a wide range of disasters and many casualties [1]. Urban waterlogging and road collapse caused by deposition, staggered joints, fractures, and other pipeline defects have significantly influenced the safety of people's lives and property. Preventive maintenance can reduce the deterioration of pipeline performance and construction costs [2]. Hence, regular inspection of pipeline defects is indispensable to guide preventative maintenance [3].

With increasing urbanization in China, the total length of sewer pipelines is growing at a rate of 5.5–10% annually, reaching 802 thousand kilometers [4]. Regular detection of defects in pipelines is a large undertaking. Currently, owing to the low cost and ease of operation, closed-circuit television (CCTV) is the most widely used defect image collection method worldwide [5]. However, CCTV has three shortcomings. First, the pipeline environment is moist, and there is considerable vapor that reduces the sharpness of the images. Second, the images are vague when the camera lens moves rapidly. Third, the collected images must be manually classified into fixed categories, which is time-consuming and subjective.

With the development of computer vision, there are many computer-based inspection algorithms that can be divided into three categories: image processing (IP), conventional machine learning (ML), and deep learning (DL) [6,7]. IP methods are based on differential and first derivatives of image intensity, including histogram and threshold segmentation [8], scale-invariant feature transform (SIFT) [9], statistical adaptive filtering [10], and edge detection [11].
* Corresponding author at: Yellow River Laboratory, Zhengzhou University, Zhengzhou 450001, China.
E-mail address: [email protected] (H. Fang).
https://doi.org/10.1016/j.autcon.2022.104595
Received 24 May 2022; Received in revised form 17 September 2022; Accepted 21 September 2022
Available online 27 September 2022
IP methods are simple and understandable, but only analyze a few features, such as the intensity and gradient. Compared with IP algorithms, ML algorithms have better performance in terms of accuracy and inference efficiency with the learning mechanism and include artificial neural networks (ANN) [12,13], support vector machine (SVM) [14,15], and random forest (RF) [15,16]. However, conventional ML methods encounter bottlenecks when handling big data. Deep learning methods are high-level abstraction algorithms based on multiple processing layers composed of multiple nonlinear transformations. Compared to conventional ML methods, DL algorithms have a deeper and wider network to extract fine features and achieve higher accuracy, stronger robustness, and faster operation [17]. However, DL-based convolutional neural networks (CNN) are data-dependent and require high-end hardware, such as graphics processing units (GPUs) and random-access memory (RAM), to train the parameters and learn the internal relations between inputs and outputs.

To solve the problems of fog, motion blur, and real-time segmentation, an automatic system is proposed to preprocess and segment images of sewer pipeline defects in real time. The main contributions of this study are summarized as follows. First, a dataset containing real and synthetic image pairs is created using the atmospheric scattering model and motion-blurring kernel to train the inspection models. Second, an image defogging and deblurring method is developed based on an attention-based algorithm and generative adversarial network (GAN) to improve the quality of pipeline images. Third, a real-time instance segmentation method for pipeline defects is proposed. To further improve the inspection speed, an acceleration module called Tensor-RT is applied to the segmentation model. Finally, to address the limitation of high-end hardware, the model is transplanted into a small development device, a Jetson TX2, to realize on-site data processing with a fast execution time.

The remainder of this paper is organized as follows. Section 2 reviews related studies on dataset creation, image preprocessing, and deep learning networks for the inspection of pipeline defects. Section 3 introduces the proposed system, which includes defogging, deblurring, and real-time segmentation. Section 4 describes the experimental study of training and evaluation of the image preprocessing and segmentation networks. In Section 5, the effects of defogging, deblurring, and Tensor-RT on segmentation are discussed. Section 6 summarizes the findings and discusses future work.

2. Related works

The study of automatic inspection of pipeline defects has a long history. CNN is a representative DL algorithm that is an important subbranch of computer vision [18]. Recently, CNN-based algorithms have been increasingly used in civil engineering, such as in pavements [17,19], tunnels [20], and buildings [21], and have achieved state-of-the-art performance. There are three main processes for CNN-based methods: dataset creation, image preprocessing, and pipeline defect inspection.

2.1. Dataset creation for CNN-based defects inspection

Dataset creation is an important preparation for model training and can strongly impact prediction performance, including image collection and image annotation [21]. Existing public datasets, such as COCO [22], PASCAL VOC [23], and ImageNet [24], are used for animal and vehicle inspections. Many researchers have created special datasets for CNN-based defect inspections according to public standards. Kumar et al. [5] collected 12,000 images with resolutions between 1440 × 720 and 320 × 256 pixels using an autonomous CCTV crawler. Wang et al. [25] trained an object detection model with 3600 images which were manually identified using rectangular bounding boxes. In previous studies, images were mainly collected using CCTV cameras. An open-source tool called Labelimg was used to annotate the geometrical information of defects [3,20,26]. However, because the task of these studies was defect inspection, there are few datasets for image defogging and deblurring.

2.2. Image preprocessing

Image preprocessing is a method to eliminate irrelevant information, recover useful real information, enhance the detectability of relevant information, and improve the reliability of feature extraction. Wang et al. [27] enhanced the image quality by adjusting image intensity to enhance image contrast and by converting the color images into grayscale images. This method can remove noise but loses information with only one channel. Wang and Cheng [3] enriched the data space by affine-transformation-based data augmentation (e.g., flipping, scaling, cropping, translation, and Gaussian noise) to avoid overfitting. However, new images generated by affine transformation are highly related to the original images, and a generative adversarial network (GAN) is a better choice for data augmentation [28].

Image hazing is a common problem in pipeline defect inspection because of the humid environment. Hazy image input increases the difficulty of visual tasks such as object detection and segmentation [29]. Traditional defogging methods can be divided into two parts. First, contrast enhancement was used to highlight the information in foggy images, through methods including histogram equalization, homomorphic filtering, and the Retinex algorithm [30]. Bao et al. [31] designed the Dark-Retinex (DR) algorithm to defog images for underwater structural damage, which showed specific advantages for visual and objective evaluations. Dai et al. [32] preprocessed the collected images according to the characteristics of tunnel images using an improved homomorphic filtering algorithm for defogging, and used an adaptive median filter to denoise the grayscale image. Contrast enhancement algorithms can improve the quality of hazy images by enhancing the global or local characteristics of foggy images. However, they only attenuate the influence of haze on images, and the reason for image quality reduction in fog is not considered. Second, the atmospheric scattering model was used to estimate the transmittance and atmospheric illumination values of the foggy images. Yang et al. [33] proposed a background light estimation method based on a dark channel prior algorithm to enhance underwater images. However, when the scene color in the images is close to atmospheric light, the estimated atmospheric illumination value may be high, resulting in overexposure of the defogged images. A defogging model based on deep learning could automatically learn haze features without manual feature extraction, which can be regarded as a new method for image dehazing. The defogging model was trained using many foggy images to estimate the transmittance maps or defogged images. Li et al. [6] proposed a novel instance segmentation framework for sewer defect detection and a gated context aggregation network (GCANet) for image defogging. However, this model performed poorly, with an accuracy of only 59.3% and a speed of 15 frames per second (fps). Moreover, the defogging model was not trained for the pipeline context.

Motion blurring is another common problem in pipeline defect inspections because of the fast-moving camera. Motion blur reduces the clarity of the images, resulting in a low accuracy of defect segmentation. There are three image deblurring methods: Wiener filtering, the total variation (TV) algorithm, and the dark channel prior algorithm. Tong et al. [34] presented a deblurring method based on Wiener deconvolution filtering for lidar images of streak tubes to improve the spatial resolution and reduce the edge blurring effect. However, this method is not suitable for images with low signal-to-noise ratios. Wen et al. [35] proposed a deblurring method based on an adaptive weighted total variation algorithm that can suppress ringing artifacts and achieve a better effect than conventional algorithms. Pan et al. [36] proposed a blind image-deblurring algorithm based on the dark channel prior algorithm. However, conventional deblurring algorithms estimate the fuzzy kernel through manual experience, which can only work well in a small subset of images [37].
Fig. 1. System framework diagram of the generation, detection, and counting system.
GAN is an unsupervised learning method composed of a generator and a discriminator, which can enhance the quality of the original datasets and prevent overfitting [17,38]. In previous studies, GAN was used for data augmentation [39,40]. Hu et al. [41] proposed a minor class-based status detection method using enhanced GANs to address the imbalanced dataset problem caused by a few actual leak samples. Zhang et al. [42] presented mixed GANs to provide additional effective leak data to train a high-accuracy model. Ma et al. [7] proposed an image generation algorithm for pipeline defects called StyleGAN-SDM to enrich the data volume, which achieves good performance in evaluation indexes. GAN can also be used for image defogging and deblurring.

The attention mechanism is a method to make the model focus on critical information and filter out irrelevant information. This method can solve the problem of information overload and improve the efficiency and accuracy of task processing. Li et al. [43] proposed an attention-based anomaly detection method for oil pipelines that had a higher accuracy than the original CNN without an attention mechanism. Liu et al. [44] proposed a signal identification method based on a residual CNN and an attention mechanism for inspecting magnetic flux leakage (MFL). The attention mechanism was used for the defect detection algorithms. Furthermore, it can improve the image defogging performance.

2.3. Deep learning networks for pipeline defects inspection

Deep-learning-based defect inspection methods have shown notable achievements in fetching information such as type, location, and geometric boundary. According to the level of the extracted information, deep learning networks can be divided into three categories: classification, object detection, and segmentation [6,25,45,46].

Classification is a simple image-level task that can only obtain the defect category of images. Hassan et al. [45] used AlexNet, a classical network, to evaluate the conditions of sewer pipeline systems automatically. They also developed an AlexNet-based model to detect leakages in scalogram images converted from acceleration signal data [47]. Kumar et al. [5] presented a CNN with two convolutional layers and two fully connected layers to classify multiple defects in sewer CCTV images. Li et al. [48] improved a deep residual network, called ResNet, with a hierarchical classification strategy to handle imbalanced sewer defects. CNN-based image classification models for pipeline defects were the first attempt at deep-learning-based intelligent inspection. However, they cannot provide the location and geometric information which is important for damage severity assessment.

Object detection is a target-level task that locates the defect using a bounding box. Many classical detection algorithms, such as the faster region-based CNN (Faster RCNN) [25,26,49], single-shot detection (SSD), and You Only Look Once (YOLO) [20,46], are used to inspect pipeline defects. Wang et al. [27] compared these three deep-learning-based defect detection models and concluded that Faster R-CNN obtained the highest accuracy, SSD was faster than the others in detection speed but demonstrated low accuracy, and YOLO version 3 (YOLO v3) was suitable for on-site inspection owing to its trade-off in accuracy and speed. However, owing to the nature of object detection, the geometrical boundary, which is useful for defect quantification, was not provided.

Segmentation is the most complex pixel-level task and can provide the category and the geometrical boundary for defects. Wang and Cheng [3] designed a unified neural network, called DilaSeg-CRF, integrated with a deep convolutional neural network (CNN) and dense conditional random field (CRF), which improved the segmentation accuracy for pipeline defects. The images were tested on a GeForce GTX 1080 GPU with a speed of 0.107 s per image (s/img). Zhou et al. [50] proposed an automated pixel-level segmentation and severity quantification method for sewer defects based on DeepLabv3+, which achieved the best performance among three state-of-the-art segmentation methods (SegNet, FCN, and U-Net). Pan et al. [51] presented a sewer defect segmentation network, called PipeUNet, which was integrated with feature reuse and attention mechanism blocks to enhance feature extraction capability. For prediction, CCTV images were processed at a speed of 32 images per second (fps) using a device powered by an NVIDIA GeForce GTX 1080 GPU with 32 GB of RAM.

Existing CNN-based methods have achieved state-of-the-art performance in defect inspection. However, these methods have three main shortcomings. First, segmentation algorithms can obtain rich information in images, but they take up a lot of storage space and cannot be implemented in real time. Second, CCTV images must be processed using a high-performance computer. Data transmission is vulnerable, which reduces the efficiency of field detection. Third, these methods have poor anti-interference abilities to counter fogging and motion blurring.

3. Methodology

To improve the quality of images and the speed of inspection, an automatic defogging, deblurring, and real-time segmentation system for sewer pipeline defects is proposed. As shown in Fig. 1, the framework consists of three parts: (1) dataset construction, (2) image preprocessing, and (3) real-time segmentation.
First, the collected images are fogged and blurred using the atmospheric scattering model [52] and a motion-blur generation method to create a pipeline defect library (PDL). Second, an attention-based defogging algorithm and a deblurring network based on a feature pyramid network (FPN) are established. During training, the image pairs, including real images and synthetic images in the PDL, are fed into the preprocessing networks to learn the features. During testing, related indexes called structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) are used to evaluate the deblurring and defogging performance. Third, a real-time segmentation network called Pipe-Yolact-Edge is established. This is migrated into a small development device for on-site inspection. Furthermore, the segmentation results are used to evaluate the preprocessing performance.

3.1. Dataset construction

3.1.1. Image hazing

Deep-learning-based computer vision methods require a large number of images [45]. An open defogging dataset, called Realistic Single Image Dehazing (RESIDE), contains 13,990 synthetic and real images taken from indoor and outdoor scenes [53]. However, there are no datasets for defogging and deblurring training in sewer pipelines. Furthermore, because the inner environment of pipelines is complex, it is difficult to obtain image pairs from the same view. A recent method involves generating training sets by creating synthetic blurred images from clear images [53,54]. The atmospheric scattering model is a classical description of fogging image generation [29,52,53,55]. As shown in Eq. (1), Narasimhan and Nayar [52] divided the effect of the atmosphere on light into attenuation and ambient light superposition.

I(x) = J(x)t(x) + α(1 − t(x))   (1)

In this equation, I(x) denotes the image with fog and J(x) denotes the real image without fog. t(x) is the transmission coefficient, described as t(x) = exp[−βd(x)], which represents the light transmission ability in fog. β is the atmospheric scattering coefficient and α is the characteristic constant of the external light. The detailed hazing process is summarized in Algorithm 1.
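A minimal Python sketch of the hazing step in Eq. (1) is given below. The function name, the radial depth surrogate d(x), and the parameter values are illustrative assumptions, not the exact settings of Algorithm 1.

import numpy as np

def synthesize_fog(clear_img, beta=1.0, alpha=0.9):
    """Fog a clear image with the atmospheric scattering model of Eq. (1).

    clear_img: uint8 array of shape (H, W, 3).
    beta:      assumed atmospheric scattering coefficient.
    alpha:     assumed ambient-light constant.
    """
    j = clear_img.astype(np.float32) / 255.0
    h, w = j.shape[:2]
    # Assumed depth surrogate d(x): the image centre (the far end of the pipe)
    # is treated as the deepest point, so it receives the most fog.
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.hypot(ys - h / 2.0, xs - w / 2.0)
    d = 1.0 - d / d.max()
    t = np.exp(-beta * d)[..., None]        # t(x) = exp(-beta * d(x))
    hazy = j * t + alpha * (1.0 - t)        # I(x) = J(x)t(x) + alpha(1 - t(x))
    return (np.clip(hazy, 0.0, 1.0) * 255.0).astype(np.uint8)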
3.1.2. Image blurring

GAN is an unsupervised-learning method that includes a generator and discriminator, which requires many image pairs to learn the correlation function between real and fake images. The common method for obtaining image pairs for training is to use a motion blurring kernel, which is defined as follows:

B = Convolution(K(M), S)   (2)

Here, B is the blurred image, S is the sharp image, and K(M) is the blurring kernel based on the motion field M. In this study, the blurring kernel is the product of a matrix with a diagonal of 1 and a rotation matrix. The blurred image is obtained by convolving the sharp image with the blurring kernel. Motion-blur processing is proposed as shown in Algorithm 2. The size and angle of the motion-blurring kernel are randomly set to simulate the real motion caused by the camera shaking.
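The sketch below shows one way to build and apply such a kernel following the description above (a matrix with a diagonal of 1, rotated to a random angle). The function name, the kernel-size range, and the use of OpenCV are assumptions rather than the exact implementation of Algorithm 2.

import cv2
import numpy as np

def motion_blur(sharp, size=None, angle=None, rng=np.random):
    """Blur a sharp image with a linear motion kernel K(M), as in Eq. (2).

    size and angle are drawn at random when not given, mimicking the random
    kernel settings described for Algorithm 2 (the exact ranges are assumed).
    """
    size = size or int(rng.randint(7, 21))                 # assumed kernel-size range
    angle = angle if angle is not None else float(rng.uniform(0.0, 180.0))
    kernel = np.eye(size, dtype=np.float32)                # matrix with a diagonal of 1
    centre = ((size - 1) / 2.0, (size - 1) / 2.0)
    rot = cv2.getRotationMatrix2D(centre, angle, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (size, size))     # rotate to the motion direction
    kernel /= kernel.sum()                                 # normalize to preserve brightness
    return cv2.filter2D(sharp, -1, kernel)                 # B = Convolution(K(M), S)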
3.2. Image preprocessing based on Pipe-Defog-Net and Pipe-Deblur-GAN

3.2.1. Attention-based Pipe-Defog-Net

Most defogging networks normalize the channel direction and pixel direction features. However, in practice, haze is unevenly distributed in the image pixels and channels. An attention-based algorithm called Pipe-Defog-Net is proposed to improve the quality of the pipeline images and the defect segmentation performance. As shown in Fig. 2, a block contains a local residual module and a feature attention (FA) module. A channel attention (CA) module is proposed considering the uneven distribution of fog in the channels. Global average pooling is used to bring the global channel spatial information of H × W × C into the channel descriptor of 1 × 1 × C. Here, H, W, and C represent the height, width, and channel number of the images, respectively. The attention weights of the channels are obtained using convolutional layers and activation functions. Similar to channel attention, a pixel attention (PA) module is proposed to solve the uneven distribution of fog in pixels. The input is transformed from H × W × C to H × W × 1 by convolution layers and activation functions to obtain the attention weights of the pixels. There are 57 attention-based blocks stacked in the network, and a short connection is used to increase the depth of the network and to avoid gradient vanishing. Finally, the inputs and attention weights are multiplied to obtain the defogged images. The detailed convolution parameters of Pipe-Defog-Net in each layer are listed in Table 1. The sigmoid function, a commonly used activation function, is defined in Eq. (3).

Sigmoid(x) = 1/(1 + e^(−x))   (3)

Table 1
Parameters of Pipe-Defog-Net.

Layer                                 Size     Number   Output
The first convolution in the block    3 × 3    64       256 × 256 × 64
The second convolution in the block   3 × 3    64       256 × 256 × 64
The first convolution in CA           1 × 1    8        1 × 1 × 8
The second convolution in CA          1 × 1    64       1 × 1 × 64
The first convolution in PA           1 × 1    8        256 × 256 × 8
The second convolution in PA          1 × 1    1        256 × 256 × 1
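As a concrete reading of the CA and PA modules and the 1 × 1 convolution sizes in Table 1, a minimal PyTorch sketch is shown below. The module and argument names are assumptions; it follows the description (global average pooling, two 1 × 1 convolutions, sigmoid gating, element-wise multiplication) rather than the authors' released code.

import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    """Channel attention (CA) followed by pixel attention (PA), as described above."""

    def __init__(self, channels=64, reduced=8):
        super().__init__()
        # CA: H x W x C -> 1 x 1 x C descriptor, then two 1x1 convolutions (64 -> 8 -> 64).
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, reduced, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, channels, kernel_size=1), nn.Sigmoid(),
        )
        # PA: H x W x C -> H x W x 1 attention map (64 -> 8 -> 1).
        self.pa = nn.Sequential(
            nn.Conv2d(channels, reduced, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, 1, kernel_size=1), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.ca(x)   # re-weight channels
        x = x * self.pa(x)   # re-weight pixels
        return x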
3.2.2. FPN-based Pipe-Deblur-GAN

For image deblurring, a generative adversarial network (GAN) based on a feature pyramid network (FPN), called Pipe-Deblur-GAN, is proposed. As shown in Fig. 3, the generator contains an encoder, a decoder, and an FPN. First, the encoder backbone combines the advantages of the Inception network and residual network (ResNet). The Inception network replaces the local sparse structure with a dense network structure to increase the network width [56], while the residual network uses the skip connection to solve the problem of gradient explosion and to increase network depth [57]. A fusion network called Inception-ResNet can improve the feature extraction performance by deepening and widening the network [7]. Second, the decoder backbone uses the nearest-neighbor upsampling algorithm to increase the resolution of feature maps. According to the U-Net method [58], the corresponding feature maps with the same resolution in the encoder are concatenated with the feature maps in the decoder to supplement the missing information caused by downsampling. Third, the FPN is used to fuse multi-scale features. The deep features are abstract but fuzzy for small objects, while the shallow features are specific and have detailed features of small objects, such as defects at a distance. Deep feature maps are enlarged to shallow feature maps by up-sampling to supplement the shallow information. The high-resolution information is used to improve deblurring performance [59]. Finally, a discriminator with multiple down-sampling layers is used to identify the similarity between the real images and deblurred images. Here, the white rectangles represent the convolution kernels. The first two numbers represent the size of the convolution kernel, the third number represents the number of channels, and the fourth number represents the number of convolution kernels in this layer.

3.3. Instance segmentation for pipeline defects based on Pipe-Yolact-Edge

Recently, deep-learning-based segmentation algorithms have been divided into two categories. One is a two-stage segmentation method, such as Mask R-CNN, that first locates the bounding boxes of defects and then segments the defects with geometrical boundaries. This method can provide accurate results; however, its inference speed is low, and it cannot satisfy the requirements of real-time segmentation. The other is the one-stage method, which predicts the categories and geometrical boundaries directly, which can improve the segmentation speed sharply. As shown in Fig. 4, a one-stage segmentation network including a backbone, FPN, prototype network, and prediction head is proposed for pipeline defects. The backbone is the residual network (ResNet), which makes the model deeper by short connections [57]. The FPN makes the learned features richer and is more conducive to the segmentation of defects of different sizes [59]. Large feature maps contain rich details for inspecting small defects, and small feature maps contain abstract features for detecting large defects. A prototype network is formed using four convolutional layers to produce prototype masks. The prediction head has three sub-branches that can obtain the class confidence, bounding box, and mask coefficients. The prototype masks and mask coefficients are multiplied to obtain the geometrical boundaries of the defects in the pictures. Furthermore, the segmentation speed is accelerated by Tensor-RT, which is an algorithm library for optimizing the performance of trained CNN models on GPUs [60]. Tensor-RT has many advantages: (1) computational cost reduction, which can maximize throughput by transforming the model parameters from floating-point 32 to int 8; (2) precision calibration, which reduces the performance loss caused by the conversion of model parameters; (3) layer and tensor fusion, which optimizes the use of GPU memory; (4) kernel auto-tuning, which can select the best algorithms based on the target GPU platform; and (5) dynamic tensor memory, which minimizes memory footprint and improves reuse efficiency [61,62].
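The step in which prototype masks and mask coefficients are combined can be written compactly as a matrix product followed by a sigmoid, in the style of Yolact-type heads. The sketch below is an illustrative reading of that step only; the tensor names and the thresholding detail are assumptions.

import torch

def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Combine prototype masks with per-instance mask coefficients.

    prototypes:   (H, W, K) prototype masks from the prototype network.
    coefficients: (N, K) mask coefficients from the prediction head
                  (one row per detected defect).
    Returns an (N, H, W) boolean tensor of instance masks.
    """
    h, w, k = prototypes.shape
    flat = prototypes.reshape(h * w, k)                 # (H*W, K)
    masks = torch.sigmoid(flat @ coefficients.t())      # (H*W, N): linear combination + sigmoid
    masks = masks.t().reshape(-1, h, w)                 # (N, H, W)
    return masks > threshold                            # binary geometrical boundaries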
4. Experiment

4.1. Pipeline defect library with hazy and blurry image pairs

In this study, the defect images of sewer pipelines were collected by CCTV robots in Zhengzhou and Tianjin, China. The original images were strictly selected by specialized inspectors to improve the definition of the images and to ensure that the number of defects in each category was balanced. Furthermore, the angle of view was varied, including front and side views. A total of 1403 original images containing three types of defects (misalignment, obstacle, and fracture) were used for defogging, deblurring, and real-time segmentation. As shown in Fig. 5, the misalignment is the transverse deviation in the pipe joints in a crescent shape, while the fracture is the breakage caused by the external pressure of the pipe including cracks and potholes. Next, the selected images were hazed and blurred using Algorithms 1 and 2 to create image pairs for the training of Pipe-Defog-Net and Pipe-Deblur-GAN. LabelMe [63] was used to annotate the labels and geometrical boundaries of defects at the pixel level. All types of defects were numbered with the background pixel of 0, the misalignment pixel of 1, the fracture pixel of 2, and the obstacle pixel of 3. As shown in Table 2, the pairs of images and
(CPU) and an NVIDIA Titan V graphics processing unit (GPU). As shown in Table 5, some state-of-the-art segmentation algorithms, such as the fully convolutional neural network (FCN), mask region-based convolutional neural network (Mask R-CNN), segmenting objects by locations (SOLO), and segmenting objects by location version 2 (SOLO v2), were compared with the proposed model. Our model using Tensor RT achieved the highest mAP of 92.65%. With the acceleration of Tensor-RT, the segmentation speed improved by 39.8%. The playback speed required for a normal video is 30 fps, while the speed of a continuous video seen by the human eye is 24 fps. The speed of the traditional method was lower than the threshold value, and our model with Tensor-RT achieved the fastest speed of 41.23 fps, which meets the requirements of real-time segmentation.

Table 5
Comparison of different instance segmentation methods.

Method       With/without Tensor RT   mAP      Speed
Our model    Yes                      92.65%   41.23 fps
Our model    No                       91.56%   29.49 fps
SOLO v2      No                       82.30%   26.10 fps
SOLO         No                       80.00%   27.90 fps
Mask R-CNN   No                       75.97%   3.50 fps
FCN          No                       83.21%   1.02 fps

MSE = (1/(hw)) Σ_(i=0)^(h−1) Σ_(j=0)^(w−1) [x(i, j) − y(i, j)]²   (6)

Here, x and y are the original and denoised images, respectively, and h and w are the height and width of the image, respectively. MSE is the mean-square error of x and y, and MAX is the maximum value of the image colors.

However, PSNR is not the same as the visual quality observed by the human eyes. This is because the sensitivity of human vision to errors is not absolute, and the perception results of the eyes are affected by many factors. SSIM is a measure of the similarity between two images, which defines the structural information of images as a combination of brightness, contrast, and structure. As shown in Eqs. (7–9), the mean value was used as the estimate of brightness, the standard deviation as the estimate of contrast, and covariance as the measure of structural similarity.

l(x, y) = (2μxμy + c1)/(μx² + μy² + c1)   (7)

c(x, y) = (2σxσy + c2)/(σx² + σy² + c2)   (8)

s(x, y) = (σxy + c3)/(σxσy + c3)   (9)

Here, μx and μy are the mean values of x and y, respectively; σx² and σy² are the variances of x and y; σxy is the covariance of x and y; c1, c2 and c3 are constants used for stabilization to avoid dividing by zero; and c3 is set as half of c2.

As shown in Eq. (10), the SSIM is the product of brightness, contrast, and structure. Here, brightness, contrast, and structure are equally important; α, β, and γ are set as 1. Eq. (11) is derived from Eqs. (7–10).

SSIM(x, y) = l(x, y)^α × c(x, y)^β × s(x, y)^γ   (10)

SSIM(x, y) = [(2μxμy + c1)(2σxy + c2)] / [(μx² + μy² + c1)(σx² + σy² + c2)]   (11)

As shown in Table 6, after the image preprocessing by Pipe-Defog-Net and Pipe-Deblur-GAN, SSIM increased by 0.551 and 0.039, respectively, while PSNR enhanced by 27.97 and 3.46, respectively. The quality of the images was improved using preprocessing methods. Furthermore, the segmentation result was also introduced to evaluate the results of the preprocessing.
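For reference, the quality metrics above can be computed directly with NumPy. The snippet below follows Eqs. (6)–(11) in their global (whole-image, single-window) form; the PSNR line uses the standard 10·log10(MAX²/MSE) relation implied by the definition of MAX, and the constants c1 and c2 follow the usual SSIM convention, so these choices are assumptions.

import numpy as np

def mse(x, y):
    """Eq. (6): mean-square error between original x and processed y."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    return np.mean((x - y) ** 2)

def psnr(x, y, max_val=255.0):
    """Standard PSNR = 10*log10(MAX^2 / MSE); MAX is the maximum image value."""
    err = mse(x, y)
    return float("inf") if err == 0 else 10.0 * np.log10(max_val ** 2 / err)

def ssim_global(x, y, max_val=255.0, k1=0.01, k2=0.03):
    """Global SSIM following Eqs. (7)-(11), with c3 = c2 / 2 as stated in the text."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (k1 * max_val) ** 2, (k2 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    # Eq. (11): product form of l(x, y), c(x, y), s(x, y) with alpha = beta = gamma = 1.
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))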
Compared with the segmentation results of the foggy and blurry images, the segmentation mAP with Pipe-Defog-Net and Pipe-Deblur-GAN was improved by 7.93% and 15.43%, respectively, without speed reduction. Compared with the traditional defogging algorithm of Retinex and the deblurring algorithm of Wiener, the attention-based defogging network and the GAN-based deblurring network achieved better performance in evaluation indexes. The experiments proved that the preprocessing methods of Pipe-Defog-Net and Pipe-Deblur-GAN improved the accuracy of the segmentation models and the quality of images, which is a good way to enhance inspection performance.

Fig. 12 shows a comparison of the segmentation results. In the first row, the reflection is falsely regarded as a defect in the hazy images, and the segmentation width of the misalignment is out of bounds in the blurry images and the images deblurred by the Wiener algorithm. This is because the fog reduces the contrast between defects and the background, while the motion blur weakens the differentiation of the defect edge. In the second row, the misalignment in the distance is not detected in the images processed by fogging, blurring, Retinex, and Wiener. This means that fog and motion blur significantly affect the inspection of small and distant defects. In the third row, the segmentation area in the hazy image is incorrect. In the fourth row, the color of the obstacle is similar to that of fog, and it is not detected in hazy and blurry images. In the fifth row, the foggy fracture is not segmented and the blurry fracture is mistaken for the misalignment. Furthermore, the fractures in the images deblurred by the Wiener algorithm are not detected. In conclusion, fog and motion blur negatively affect the quality of the images and the segmentation performance. These noises lead to contrast reduction, enlargement of the boundary of the defects, and disregard for the inspection of small defects. The preprocessing methods of Pipe-Defog-Net and Pipe-Deblur-GAN enhance the image quality and the segmentation results are improved. In particular, the edges of defects are delineated.

5.2. On-site segmentation

To verify the performance of defogging and deblurring on real images, on-site images are defogged and deblurred by Pipe-Defog-Net and Pipe-Deblur-GAN at a speed of 8.60 fps and 12.17 fps before segmentation by Pipe-Yolact-Edge. As shown in Fig. 13, there are real and preprocessed images in the two left columns, and the segmented images are in the two right columns. In the first row, the fracture in the real image is not detected. In the second row, the fracture in the real image is incorrectly regarded as an obstacle. However, the fractures in the deblurred images are detected correctly. Furthermore, in real images, there is a missed inspection of the obstacle in the fourth row, and the misalignment is detected twice in the fifth row. Conversely, defects in defogged images can be inspected correctly. The experiment shows that the proposed defogging and deblurring method can be applied to real images, and can improve segmentation results.

The proposed model, with a model complexity of 14.36 GFLOPs and a parameter number of 23.75 M, was transplanted into the Jetson TX2 to implement the on-site inspection. The Jetson TX2 is a small development device, measuring 50 mm × 87 mm, with a 256-core NVIDIA Pascal GPU. It can be mounted onto a crawling robot to perform on-site inspections. As shown in Table 7, with the acceleration of Tensor RT, the model parameters of floating-point 32 are transformed to int 8, which can improve the efficiency of inference on Jetson TX2 with a speed of 2.38 fps. In particular, the model with Tensor-RT maintained a high mAP of 91.38% on the Jetson TX2, and thus can be used for on-site segmentation. In practice, the video is formed of consecutive frames, which have the same defect within a few seconds. We can extract three frames per second for inspection to overcome the low speed caused by the limitation of the computation power.
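The frame-sampling idea described above can be sketched with OpenCV as follows. The decoder, the robot interface, and the exact sampling rate used on the Jetson TX2 are assumptions; only the down-sampling of the video stream to a few frames per second is illustrated.

import cv2

def sample_frames(video_path, frames_per_second=3):
    """Yield roughly `frames_per_second` frames from an inspection video."""
    cap = cv2.VideoCapture(video_path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if the FPS metadata is missing
    step = max(int(round(native_fps / frames_per_second)), 1)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            yield frame        # frame is passed on to defogging/deblurring and segmentation
        index += 1
    cap.release()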
5.3. Generalization ability

The purpose of the training is to learn the laws hidden in the data; a trained network should be able to provide appropriate outputs for images beyond the training set. Generalization ability refers to the ability of a machine-learning algorithm to adapt to new samples. To further verify the generalization ability of the proposed model, images that were not in the training set were segmented. The misalignments were ring-like objects, and the fractures were formed by cracks and holes. The defects in these categories had no new shape to test the generalization ability. Therefore, different types of obstacles were used. As shown in Fig. 14, clays, rubbles, and rocks were common obstacles that could be detected in the first three columns. Furthermore, the bucket left by humans, the leaves stacked in the pipeline, and a foreign body could be picked out from the on-site videos. They were also segmented using Pipe-Yolact-Edge. There are two reasons why unseen obstacles can be segmented by the Yolact model. First, the CNN can extract abstract features of defects such as shapes, locations, and colors. The features of these unseen obstacles are similar to those of the obstacles in the training set.
Second, when the model is trained on the training set for too many iterations, the learned features overfit the defects in the training set but fail to generalize to new examples. In this research, the accuracy of the validation set was monitored to avoid overfitting during training. When the model achieved the highest accuracy, the parameters were saved as

Table 6
Segmentation and image-quality results with different preprocessing methods.

Model       Image preprocessing            mAP      Speed       SSIM    PSNR
Our model   Defogged by Pipe-Defog-Net     92.08%   30.14 fps   0.991   41.11
Our model   Defogged by Retinex            83.36%   25.57 fps   0.501   16.13
Our model   Foggy                          84.15%   29.51 fps   0.440   13.14
Our model   Deblurred by Pipe-Deblur-GAN   88.88%   27.79 fps   0.949   26.54
Our model   Deblurred by Wiener            52.08%   25.91 fps   0.480   17.92
Our model   Blurry                         73.45%   27.60 fps   0.910   23.08

6. Conclusions and future work

In previous research, deep-learning-based inspection algorithms for pipeline defects had achieved high accuracy, but ignored the effects of fog and motion blur. To improve the quality of images and inspection performance, an automatic defogging, deblurring, and real-time segmentation system for sewer pipeline defects was proposed in this study. First, an attention-based image defogging network and an FPN-based image deblurring network were proposed to preprocess the defect images. Second, a real-time segmentation network called Pipe-Yolact-Edge was proposed. Finally, the trained model was transferred into a small development device, and on-site inspection was performed, which is significant for practical applications.

The experimental results demonstrated that the segmentation accuracy was improved by 7.93% and 15.43% after defogging and deblurring, respectively, proving that the preprocessing of images can
respectively. Second, the comparison between the segmentation results of defogged, foggy, deblurred, and blurry images shows that fog and motion blur can reduce the contrast and enlarge the boundary of the defects, which negatively affects the image quality and segmentation performance and leads to missing inspection of small defects. Third, compared with some state-of-the-art segmentation algorithms, the proposed model achieved the highest mAP of 92.65% and the fastest speed of 41.23 fps, which meets the requirements of real-time segmentation. With the acceleration provided by Tensor RT, the segmentation speed improved by 39.8%.

In this study, pipeline image preprocessing and real-time segmentation were achieved. Although the proposed model achieved the highest mAP of 92.65%, it is possible to improve the segmentation of boundaries to further improve the accuracy of classification. There are other types of sewer pipe defects, such as corrosion, tree roots, and branch pipes. In the future, a database containing various pipeline defects should be established for automated detection. Another problem is that not all images of pipes contained fog or motion blur. Therefore, it is inefficient to defog and deblur all the frames in a video stream. In the future, a CNN-based classification network will be proposed to automatically select real inspection videos including fog and motion blur, which will be defogged and deblurred by Pipe-Defog-Net and Pipe-Deblur-GAN. Furthermore, post-processing, such as quantification of the actual area and number of defects, was not considered in this study. In the future, we will study the transformation between pixel size and real size, and a tracking network will be proposed to count the number of defects in a section of the pipe. The aim of our work in this domain is to create a complete inspection and evaluation system for pipeline defects.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this study.

Data availability

Data will be made available on request.

Acknowledgements

This research was funded by the National Key Research and Development Program of China (No. 2017YFC1501200), and the National Natural Science Foundation of China (No. 51978630). This project was supported by the Outstanding Young Talent Research Fund of Zhengzhou University (1621323001), the Postdoctoral Science Foundation of China (2020M672276), the National Youth Foundation of China (52108289) and the Key Scientific Research Projects of Higher Education in Henan Province (21A560013). The authors are grateful for this financial support.

References

[1] The Basic Information on Natural Disasters in 2021, Ministry of Emergency Management of the People's Republic of China. https://www.mem.gov.cn/xw/yjglbgzdt/202201/t20220123_407204.shtml, 2022.
[2] J. Li, G. Yin, X. Wang, W. Yan, Automated decision making in highway pavement preventive maintenance based on deep learning, Autom. Constr. 135 (2022) 1–19, https://doi.org/10.1016/j.autcon.2021.104111.
[3] M. Wang, J. Cheng, A unified convolutional neural network integrated with conditional random field for pipe defect segmentation, Computer-Aided Civil Infrastruct. Eng. 35 (2) (2020) 162–177, https://doi.org/10.1111/mice.12481.
[4] 2020 Statistical Yearbook of Urban and Rural Construction, Ministry of Housing and Urban-Rural Development of the People's Republic of China. https://www.mohurd.gov.cn/file/2021/20211012/f88e6161ebf4bcbd4ffde8d1e9b64f0e.xls?n=2020%E5%B9%B4%E5%9F%8E%E5%B8%82%E5%BB%BA%E8%AE%BE%E7%BB%9F%E8%AE%A1%E5%B9%B4%E9%89%B4, 2021.
[5] S.S. Kumar, D.M. Abraham, M.R. Jahanshahi, T. Iseley, J. Starr, Automated defect classification in sewer closed circuit television inspections using deep convolutional neural networks, Autom. Constr. 91 (2018) 273–283, https://doi.org/10.1016/j.autcon.2018.03.028.
[6] Y. Li, H. Wang, L.M. Dang, M. Jalil Piran, H. Moon, A robust instance segmentation framework for underground sewer defect detection, Measurement 190 (2022) 1–13, https://doi.org/10.1016/j.measurement.2022.110727.
[7] D. Ma, J. Liu, H. Fang, N. Wang, C. Zhang, Z. Li, J. Dong, A multi-defect detection system for sewer pipelines based on StyleGAN-SDM and fusion CNN, Constr. Build. Mater. 312 (2021) 1–18, https://doi.org/10.1016/j.conbuildmat.2021.125385.
[8] W. Guo, L. Soibelman, J.H. Garrett, Automated defect detection for sewer pipeline inspection and condition assessment, Autom. Constr. 18 (5) (2009) 587–596, https://doi.org/10.1016/j.autcon.2008.12.003.
[9] W. Guo, L. Soibelman, J.H. Garrett, Automated defect detection in urban wastewater pipes using invariant features found in video images, Constr. Res. Congr. (2009) 1194–1203, https://doi.org/10.1061/41020(339)121.
[10] S.K. Sinha, P.W. Fieguth, Automated detection of cracks in buried concrete pipe images, Autom. Constr. 15 (1) (2006) 58–72, https://doi.org/10.1016/j.autcon.2005.02.006.
[11] T. Su, M. Yang, T. Wu, J. Lin, Morphological segmentation based on edge detection for sewer pipe defects on CCTV images, Expert Syst. Appl. 38 (10) (2011) 13094–13114, https://doi.org/10.1016/j.eswa.2011.04.116.
[12] M.J. Chae, D.M. Abraham, Neuro-fuzzy approaches for sanitary sewer pipeline condition assessment, J. Comput. Civ. Eng. 15 (1) (2001) 4–14, https://doi.org/10.1061/(asce)0887-3801(2001)15:1(4).
[13] S.K. Sinha, P.W. Fieguth, Neuro-fuzzy network for the classification of buried pipe defects, Autom. Constr. 15 (1) (2006) 73–83, https://doi.org/10.1016/j.autcon.2005.02.005.
[14] M.R. Halfawy, J. Hengmeechai, Automated defect detection in sewer closed circuit television images using histograms of oriented gradients and support vector machine, Autom. Constr. 38 (2014) 1–13, https://doi.org/10.1016/j.autcon.2013.10.012.
[15] J. Myrans, R. Everson, Z. Kapelan, Automated detection of faults in sewers using CCTV image sequences, Autom. Constr. 95 (2018) 64–71, https://doi.org/10.1016/j.autcon.2018.08.005.
[16] A. Liaw, M. Wiener, Classification and regression by randomForest, R News 2 (3) (2002) 18–22. https://datajobs.com/data-science-repo/Random-Forest-[Liaw-and-Weiner].pdf.
[17] N. Sholevar, A. Golroo, S.R. Esfahani, Machine learning techniques for pavement condition evaluation, Autom. Constr. 136 (2022) 1–17, https://doi.org/10.1016/j.autcon.2022.104190.
[18] D. Meijer, L. Scholten, F. Clemens, A. Knobbe, A defect classification methodology for sewer image sets with convolutional neural networks, Autom. Constr. 104 (2019) 281–298, https://doi.org/10.1016/j.autcon.2019.04.013.
[19] J. Li, T. Liu, X. Wang, J. Yu, Automated asphalt pavement damage rate detection based on optimized GA-CNN, Autom. Constr. 136 (2022) 1–17, https://doi.org/10.1016/j.autcon.2022.104180.
[20] Z. Zhou, J. Zhang, C. Gong, Automatic detection method of tunnel lining multi-defects via an enhanced you only look once network, Computer-Aided Civil Infrastruct. Eng. 37 (6) (2022) 762–780, https://doi.org/10.1111/mice.12836.
[21] J. Guo, Q. Wang, Y. Li, Evaluation-oriented façade defects detection using rule-based deep learning method, Autom. Constr. 131 (2021) 1–15, https://doi.org/10.1016/j.autcon.2021.103910.
[22] T.Y. Lin, M. Maire, S. Belongie, J. Hays, C.L. Zitnick, Microsoft COCO: common objects in context, Computer Vis. Pattern Recognit. (2014) 740–755, https://doi.org/10.1007/978-3-319-10602-1_48.
[23] M. Everingham, L. Van Gool, C.K.I. Williams, J. Winn, A. Zisserman, The Pascal visual object classes (VOC) challenge, Int. J. Comput. Vis. 88 (2) (2009) 303–338, https://doi.org/10.1007/s11263-009-0275-4.
[24] J. Deng, W. Dong, R. Socher, L.J. Li, K. Li, F.F. Li, ImageNet: a large-scale hierarchical image database, in: Proc of IEEE Computer Vision & Pattern Recognition, 2009, pp. 248–255, https://doi.org/10.1109/CVPR.2009.5206848.
[25] M. Wang, S.S. Kumar, J.C.P. Cheng, Automated sewer pipe defect tracking in CCTV videos based on defect detection and metric learning, Autom. Constr. 121 (2021) 1–19, https://doi.org/10.1016/j.autcon.2020.103438.
[26] J.C.P. Cheng, M. Wang, Automated detection of sewer pipe defects in closed-circuit television images using deep learning techniques, Autom. Constr. 95 (2018) 155–171, https://doi.org/10.1016/j.autcon.2018.08.006.
[27] M. Wang, H. Luo, J.C.P. Cheng, Towards an automated condition assessment framework of underground sewer pipes based on closed-circuit television (CCTV) images, Tunn. Undergr. Space Technol. 110 (2021) 1–20, https://doi.org/10.1016/j.tust.2021.103840.
[28] Y. Gao, B. Kong, K.M. Mosalam, Deep leaf-bootstrapping generative adversarial network for structural image data augmentation, Computer-Aided Civil Infrastruct. Eng. 34 (9) (2019) 755–773, https://doi.org/10.1111/mice.12458.
[29] X. Qin, Z. Wang, Y. Bai, X. Xie, H. Jia, FFA-net: feature fusion attention network for single image dehazing, Proc. AAAI Conf. Artif. Intell. 34 (7) (2020) 11908–11915, https://doi.org/10.1609/aaai.v34i07.6865.
[30] J. Liu, R. Jia, W. Li, F. Ma, X. Wang, Image dehazing method of transmission line for unmanned aerial vehicle inspection based on densely connection pyramid network, Wirel. Commun. Mob. Comput. 2020 (4) (2020) 1–9, https://doi.org/10.1155/2020/8857271.
[31] L. Bao, C. Zhao, X. Xue, L. Yu, Improved dark channel defogging algorithm for defect detection in underwater structures, Adv. Mater. Sci. Eng. 2020 (2020) 1–13, https://doi.org/10.1155/2020/8760324.
[32] C. Dai, K. Jiang, Q. Wang, G. Lancioni, Recognition of tunnel lining cracks based on digital image processing, Math. Probl. Eng. 2020 (2020) 1–11, https://doi.org/10.1155/2020/5162583.
[33] S. Yang, Z. Chen, Z. Feng, X. Ma, Underwater image enhancement using scene depth-based adaptive background light estimation and dark channel prior algorithms, IEEE Access 7 (2019) 165318–165327, https://doi.org/10.1109/access.2019.2953463.
[34] T. Luo, R. Fan, Z. Chen, X. Wang, D. Chen, Deblurring streak image of streak tube imaging lidar using Wiener deconvolution filter, Opt. Express 27 (26) (2019) 37541–37551.
[35] J. Wen, J. Zhao, W. Cailing, S. Yan, W. Wang, Blind deblurring from single motion image based on adaptive weighted total variation algorithm, IET Signal Process. 10 (6) (2016) 611–618, https://doi.org/10.1049/iet-spr.2015.0458.
[36] J. Pan, D. Sun, H. Pfister, M.H. Yang, Deblurring images via dark channel prior, IEEE Trans. Pattern Anal. Mach. Intell. 40 (10) (2018) 2315–2328, https://doi.org/10.1109/TPAMI.2017.2753804.
[37] O. Kupyn, V. Budzan, M. Mykhailych, D. Mishkin, J. Matas, DeblurGAN: blind motion deblurring using conditional adversarial networks, IEEE/CVF Conf. Computer Vis. Pattern Recognit. 2018 (2018) 8183–8192, https://doi.org/10.1109/cvpr.2018.00854.
[38] C. Siu, M. Wang, J.C.P. Cheng, A framework for synthetic image generation and augmentation for improving automatic sewer pipe defect detection, Autom. Constr. 137 (2022), https://doi.org/10.1016/j.autcon.2022.104213.
[39] S. Jain, G. Seth, A. Paruthi, U. Soni, G. Kumar, Synthetic data augmentation for surface defect detection and classification using deep learning, J. Intell. Manuf. 33 (4) (2020) 1007–1020, https://doi.org/10.1007/s10845-020-01710-x.
[40] J. Lian, W. Jia, M. Zareapoor, Y. Zheng, R. Luo, D.K. Jain, N. Kumar, Deep-learning-based small surface defect detection via an exaggerated local variation-based generative adversarial network, IEEE Trans. Ind. Inform. 16 (2) (2020) 1343–1351, https://doi.org/10.1109/tii.2019.2945403.
[41] X. Hu, H. Zhang, D. Ma, R. Wang, J. Zheng, Minor class-based status detection for pipeline network using enhanced generative adversarial networks, Neurocomputing 424 (2021) 71–83, https://doi.org/10.1016/j.neucom.2020.11.009.
[42] H. Zhang, X. Hu, D. Ma, R. Wang, X. Xie, Insufficient data generative model for pipeline network leak detection using generative adversarial networks, IEEE Trans. Cybern. 52 (7) (2020) 7107–7120, https://doi.org/10.1109/TCYB.2020.3035518.
[43] J. Li, D. Yang, C. Guo, C. Ji, Y. Jin, H. Sun, Q. Zhao, Application of GPR system with convolutional neural network algorithm based on attention mechanism to oil pipeline leakage detection, Front. Earth Sci. 10 (2022), https://doi.org/10.3389/feart.2022.863730.
[44] S. Liu, H. Wang, R. Li, Attention module magnetic flux leakage linked deep residual network for pipeline in-line inspection, Sensors (Basel) 22 (6) (2022), https://doi.org/10.3390/s22062230.
[45] S.I. Hassan, L.M. Dang, I. Mehmood, S. Im, H. Moon, Underground sewer pipe condition assessment based on convolutional neural networks, Autom. Constr. 106 (2019) 1–12, https://doi.org/10.1016/j.autcon.2019.102849.
[46] X. Yin, Y. Chen, A. Bouferguene, H. Zaman, L. Kurach, A deep learning-based framework for an automated defect detection system for sewer pipes, Autom. Constr. 109 (2020) 1–17, https://doi.org/10.1016/j.autcon.2019.102967.
[47] H. Shukla, K. Piratla, Leakage detection in water pipelines using supervised classification of acceleration signals, Autom. Constr. 117 (2020) 1–15, https://doi.org/10.1016/j.autcon.2020.103256.
[48] D. Li, A. Cong, S. Guo, Sewer damage detection from imbalanced CCTV inspection data using deep convolutional neural networks with hierarchical classification, Autom. Constr. 101 (2019) 199–208, https://doi.org/10.1016/j.autcon.2019.01.017.
[49] S.S. Kumar, M. Wang, D.M. Abraham, M.R. Jahanshahi, J.C.P. Cheng, Deep learning-based automated detection of sewer defects in CCTV videos, J. Comput. Civ. Eng. 34 (1) (2020) 1–13, https://doi.org/10.1061/(ASCE)CP.1943-5487.0000866.
[50] Q. Zhou, Z. Situ, S. Teng, H. Liu, W. Chen, G. Chen, Automatic sewer defect detection and severity quantification based on pixel-level semantic segmentation, Tunn. Undergr. Space Technol. 123 (2022) 1–14, https://doi.org/10.1016/j.tust.2022.104403.
[51] G. Pan, Y. Zheng, S. Guo, Y. Lv, Automatic sewer pipe defect semantic segmentation based on improved U-net, Autom. Constr. 119 (2020) 1–12, https://doi.org/10.1016/j.autcon.2020.103383.
[52] S.G. Narasimhan, S.K. Nayar, Vision and the atmosphere, Int. J. Comput. Vis. 48 (3) (2002) 233–254, https://doi.org/10.1023/A:1016328200723.
[53] B. Li, W. Ren, D. Fu, D. Tao, D. Feng, W. Zeng, Z. Wang, Benchmarking single image dehazing and beyond, IEEE Trans. Image Process. 28 (1) (2018) 492–505, https://doi.org/10.1109/TIP.2018.2867951.
[54] C. Sakaridis, D. Dai, L. Van Gool, Semantic foggy scene understanding with synthetic data, Int. J. Comput. Vis. 126 (9) (2018) 973–992, https://doi.org/10.1007/s11263-018-1072-8.
[55] E.J. McCartney, F.F. Hall, Optics of the atmosphere: scattering by molecules and particles, Phys. Today 30 (5) (1977) 76–77, https://doi.org/10.1063/1.3037551.
[56] C. Szegedy, L. Wei, Y. Jia, P. Sermanet, A. Rabinovich, Going deeper with convolutions, in: IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9, https://doi.org/10.1109/CVPR.2015.7298594.
[57] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778, https://doi.org/10.1109/CVPR.2016.90.
[58] O. Ronneberger, P. Fischer, T. Brox, U-net: convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention 9351, 2015, pp. 234–241, https://doi.org/10.1007/978-3-319-24574-4_28.
[59] F. Yang, L. Zhang, S. Yu, D. Prokhorov, X. Mei, H. Ling, Feature pyramid and hierarchical boosting network for pavement crack detection, IEEE Trans. Intell. Transp. Syst. 21 (4) (2020) 1525–1535, https://doi.org/10.1109/tits.2019.2910595.
[60] E. Jeong, J. Kim, S. Tan, J. Lee, S. Ha, Deep learning inference parallelization on heterogeneous processors with TensorRT, IEEE Embed. Syst. Lett. 14 (1) (2022) 15–18, https://doi.org/10.1109/les.2021.3087707.
[61] M. Qasaimeh, K. Denolf, A. Khodamoradi, M. Blott, P.H. Jones, Benchmarking vision kernels and neural network inference accelerators on embedded platforms, J. Syst. Archit. (2020) 1–19, https://doi.org/10.1016/j.sysarc.2020.101896.
[62] S. Saurav, R. Saini, S. Singh, EmNet: a deep integrated convolutional neural network for facial emotion recognition in the wild, Appl. Intell. 51 (8) (2021) 5543–5570, https://doi.org/10.1007/s10489-020-02125-0.
[63] A. Torralba, B.C. Russell, J. Yuen, LabelMe: online image annotation and applications, Proc. IEEE 98 (8) (2010) 1467–1484, https://doi.org/10.1109/JPROC.2010.2050290.