Joint Learning of Blind Super-Resolution and Crack Segmentation
Joint learning is further improved by our two proposed extra paths that encourage the mutual optimization between
SR and segmentation. Comparative experiments with SOTA segmentation methods demonstrate the
superiority of our joint learning, and various ablation studies validate the effects of our contributions.
(a) Input LR (b) HR GT (c) Independent [12] (d) Multi-task [102] (e) Ours (CSBSR)
Figure 1: Difficulty in real-world crack segmentation. From an input Low-Resolution (LR) image (a), High-Resolution (HR) segmentation
results (c), (d), and (e) are acquired. (c) Independent and (d) Multi-task show the results on images enlarged by non-blind SR “trained
independently of segmentation” and “trained with segmentation in a multi-task learning manner,” respectively. (b) is the manually-annotated
ground-truth HR segmentation image.
Table 1
Problems ⟨A⟩, ⟨B⟩, ⟨C⟩, and ⟨D⟩ and their solutions 1, 2, 3, and 4. A check mark (√) in column P of row S means that solution S addresses problem ⟨P⟩.

                                        ⟨A⟩ Class imbalance   ⟨B⟩ Fine cracks   ⟨C⟩ LR cracks   ⟨D⟩ Blur
1. CSBSR                                                                              √             √
2. BC loss                                       √                   √
3. Segmentation-aware SR-loss weights                                √                √
4. Blur-reflected task learning                                                                      √
2. Related Work
2.1. Image Segmentation
Image segmentation techniques [78] are broadly divided into three categories, namely semantic segmentation [90], instance segmentation [47], and panoptic segmentation [59]. Crack segmentation is categorized as semantic segmentation because it classifies all pixels into crack and background pixels with no instances; that is, crack pixels are not divided into crack instances.

Class-imbalance Segmentation: As in various computer vision problems, class imbalance is a critical problem in image segmentation. Many approaches for class imbalance are applicable to class-imbalanced segmentation tasks. Examples include weighted losses such as the Weighted Cross Entropy (WCE) loss [26] and the focal loss [67] for segmentation [64, 49, 109], re-sampling [126] for segmentation [18], and hard mining [29] for segmentation [35].
Among all segmentation tasks, medical image segmentation has to cope with highly-imbalanced classes (e.g., tiny tumors and background). Such difficult medical image segmentation is tackled by a variety of loss functions such as the Dice loss [77], the Generalized Dice loss [97], the Combo loss [98], the Hausdorff loss [54], and the Boundary loss [55].

Crack Segmentation: Since the class-imbalance issue is also important for crack segmentation, as presented as Problem ⟨A⟩ in Table 1, the aforementioned schemes proposed against class imbalance are useful for crack segmentation. For example, in order to balance the number of samples between classes, CrackGAN [119] oversamples crack images by using DC-GAN [85]. The Dice, Combo, and WCE losses are employed for crack segmentation in [86], in [19], and in [71, 128, 69], respectively.
In addition to the class-imbalance issue, the fine boundaries of cracks are difficult to extract, which makes crack segmentation hard, as presented as Problem ⟨B⟩ in Table 1. For such difficult fine crack segmentation, the aforementioned schemes proposed against class imbalance (e.g., weighted losses, re-sampling, and class-imbalance-oriented losses) are also useful. Previous methods for such fine cracks are divided into the following two approaches, namely boundary-based methods and coarse-to-fine weighting.
In the boundary-based approach, the distance between the boundaries of the ground-truth and predicted cracks is minimized. In [54], the Hausdorff distance is evaluated by using the distance transform. While the computational cost of its exact solution is high, the sum of L2 distances is approximated by a sum of regional integrals for efficiency in the Boundary loss [55].
Various coarse-to-fine weighting approaches such as [20] employ pyramid and U-Net-like networks for weighting a fine but unreliable representation by more reliable results in a coarse representation. The effectiveness of this approach is also validated in crack segmentation [107, 128, 71, 69, 22, 66].
While the effectiveness of both approaches is validated, the coarse-to-fine weighting approach is applicable only to pyramid and U-Net-like architectures. On the other hand, the boundary-based approach can in general be employed with any other loss functions in any network architecture.
Figure 3: Proposed joint learning network with blind SR and segmentation. See the caption of Fig. 2 for the explanations of arrows and
ellipses. ⊙ indicates a pixelwise multiplication operator.
3.1. Joint Learning
CSBSR consists of blind SR and segmentation networks, as shown in Fig. 2 (c). Its detail is shown in Fig. 3. The blind SR network S(I^L; θ_S), where θ_S denotes all the parameters of this SR network, maps I^L to its SR image I^S. The crack segmentation network C takes I^S and outputs a crack segmentation image I^C = C(I^S; θ_C). Any differentiable SR and crack segmentation networks can be employed as S and C, respectively. Let L_S and L_C denote the loss functions for S and C, respectively. While L_S is back-propagated through S, L_C is back-propagated through both S and C in an end-to-end manner. The whole network is trained with the following loss, where the task weight β is a hyper-parameter:

\mathcal{L}_{J} = (1 - \beta)\,\mathcal{L}_{S} + \beta\,\mathcal{L}_{C}   (2)

Implementation details: In our experiments, DBPN [45, 43] and its extension to blind SR [111], which is called KBPN, are employed as S for a fair comparison between our proposed methods with non-blind SR and with blind SR (i.e., a comparison between CSSR and CSBSR). Different from DBPN as non-blind SR, KBPN also outputs its estimated blur kernel. The loss functions used in [45, 43] and KBPN [111] are used as L_S in our joint learning with no change.
C is implemented with each of U-Net [87], PSPNet [124], CrackFormer [69], and HRNet+OCR [112] for validating the wide applicability of our method; the implementations of these SR and segmentation networks are publicly available [41, 110, 1, 123, 2, 113]. Section 3.2 proposes a new general-purpose segmentation loss, which is applicable to all of these networks as L_C.
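To make the data flow of Eq. (2) concrete, the following is a minimal PyTorch-style sketch of one training step of the joint loss. The names sr_net, seg_net, sr_loss_fn, and seg_loss_fn are placeholders rather than the actual implementation; any differentiable SR and segmentation networks can be plugged in, and the optimizer settings are those described in Sec. 4.1.

```python
import torch

def joint_training_step(sr_net, seg_net, sr_loss_fn, seg_loss_fn,
                        optimizer, lr_img, hr_img, gt_mask, beta=0.5):
    """One step of the joint loss L_J = (1 - beta) * L_S + beta * L_C (Eq. 2)."""
    sr_img = sr_net(lr_img)                    # I^S = S(I^L; theta_S)
    seg_prob = seg_net(sr_img)                 # I^C = C(I^S; theta_C)

    loss_s = sr_loss_fn(sr_img, hr_img)        # L_S: reconstruction loss of the SR network
    loss_c = seg_loss_fn(seg_prob, gt_mask)    # L_C: segmentation loss (e.g., the BC loss)

    loss_j = (1.0 - beta) * loss_s + beta * loss_c
    optimizer.zero_grad()
    loss_j.backward()                          # L_C is back-propagated through both C and S
    optimizer.step()
    return loss_j.item()
```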
3.2. Boundary Combo Loss
For suppressing the class-imbalance difficulty in crack segmentation, we propose the Boundary Combo (BC) loss, which simultaneously achieves locally-fine and globally-robust segmentation. Fine segmentation can be achieved by a boundary-based approach such as the Boundary loss [55]. However, if only the boundary-based approach is employed, the segmentation network easily falls into local minima, as validated in [55]. This problem can be resolved by employing the boundary-based approach simultaneously with a loss that evaluates the whole image region. In [55], the Generalized Dice (GDice) loss [97] is empirically demonstrated to be a good choice. However, it is reported that the Sigmoid function included in the GDice loss and in its original Dice loss tends to cause the vanishing gradient problem [98].
This paper explores more appropriate losses combined with the Boundary loss for stable learning as well as fine segmentation. We improve learning stability by combining the GDice loss with the WCE loss, which is expressed without the derivative of the Sigmoid function that tends to cause gradient vanishing. Since the Dice loss and the WCE loss have different properties (i.e., they are categorized as region-based and distribution-based losses, respectively, as introduced in [75]), it is also validated that a pair of the Dice and WCE losses, which is called the Combo loss [98], works complementarily for better segmentation. Finally, we propose the following Boundary Combo (BC) loss, L_BC, as L_C for C in our joint learning:

\mathcal{L}_{BC} = \alpha \mathcal{L}_{B} + (1 - \alpha)\left( (1 - \gamma)\,\mathcal{L}_{D} + \gamma\,\mathcal{L}_{WCE} \right)   (3)

where L_B, L_D, and L_WCE denote the Boundary, Dice, and WCE losses, respectively. α ∈ [0, 1] and γ ∈ [0, 1] are hyper-parameters. L_BC thus consists of region-, distribution-, and boundary-based losses; according to the survey [75], a combination of these three loss categories has never been evaluated. As a variant of L_BC, we also propose L_GBC, in which the GDice loss L_GD is used in L_BC instead of L_D. While one may refer to the original papers of L_B, L_D, L_GD, and L_WCE for the details, these losses are briefly explained in the following three paragraphs.
Boundary loss (L_B): The Boundary loss [55] computes the distance-weighted 2D area between the ground-truth crack and its estimated one, which becomes zero in the ideal estimation, as follows:

D(\partial G, \partial S) = \int_{\partial G} \| q_{\partial S}(p) - p \|^{2}\, dp \simeq 2 \int_{\Delta S} D_{G}(p)\, dp = 2 \left( \int_{\Omega} \phi_{G}(p) s(p)\, dp - \int_{\Omega} \phi_{G}(p) g(p)\, dp \right)   (4)

where G and S denote the pixel sets of the ground-truth crack and its estimated one, respectively. p and q_{\partial S}(p) denote a point on boundary ∂G and its corresponding point on boundary ∂S, respectively; q_{\partial S}(p) is the intersection between ∂S and the normal of ∂G at p. ΔS = (S \ G) ∪ (G \ S) is the mismatch region between G and S. D_G(p) is the distance map from G. s(p) and g(p) are binary indicator functions, where s(p) = 1 if p ∈ S and g(p) = 1 if p ∈ G. φ_G(q) is the level-set representation of boundary ∂G: φ_G(q) = −D_G(q) if q ∈ G, and φ_G(q) = D_G(q) otherwise. Ω denotes the pixel set of the image. The second term in Eq. (4) is omitted because it is independent of the network parameters. By replacing s(p) with the network softmax output s_θ(p), we obtain the Boundary loss function below:

\mathcal{L}_{B} = \int_{\Omega} \phi_{G}(p)\, s_{\theta}(p)\, dp   (5)
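The following is a small sketch of how L_B in Eq. (5) can be computed in practice. It uses the signed-distance-map construction that is commonly used in Boundary-loss implementations (precomputed once per ground-truth mask with a Euclidean distance transform); scipy is used here only for illustration, and the function names are placeholders.

```python
import numpy as np
import torch
from scipy.ndimage import distance_transform_edt

def level_set_map(gt_mask: np.ndarray) -> np.ndarray:
    """Signed distance map phi_G of the ground-truth crack region G:
    negative inside G, positive outside (common construction for Eq. (5))."""
    outside = distance_transform_edt(gt_mask == 0)  # distance to G for background pixels
    inside = distance_transform_edt(gt_mask == 1)   # distance to background for crack pixels
    return outside - inside

def boundary_loss(seg_prob: torch.Tensor, phi_g: torch.Tensor) -> torch.Tensor:
    """Discrete counterpart of Eq. (5): mean over pixels of phi_G(p) * s_theta(p)."""
    return (phi_g * seg_prob).mean()
```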
Dice and GDice losses (L_D and L_GD): The Dice loss [77] is based on a harmonic mean of precision and recall and is expressed as follows:

\mathcal{L}_{D} = \frac{2 \sum_{j}^{M} \sum_{i}^{N} p_{ij} g_{ij}}{\sum_{j}^{M} \sum_{i}^{N} (p_{ij}^{2} + g_{ij}^{2})}   (6)

where M and N denote the number of classes (i.e., M = 2 in our problem) and the number of all pixels in each image, respectively. p_ij and g_ij are the classification probability (0 ≤ p_ij ≤ 1) and its ground truth (g_ij ∈ {0, 1}).
Different from the Dice loss, the GDice loss [97] is weighted by the number of pixels in each class as follows:

\mathcal{L}_{GD} = \frac{2 \sum_{j}^{M} w^{(GD)}_{j} \sum_{i}^{N} p_{ij} g_{ij}}{\sum_{j}^{M} w^{(GD)}_{j} \sum_{i}^{N} (p_{ij} + g_{ij})}   (7)

where w^{(GD)}_{j} = 1 / \sum_{i}^{N} g_{ij}.

WCE loss (L_WCE): The WCE loss [26] is the Cross Entropy loss weighted by a hyper-parameter w^{(WCE)}_{j}, which is determined based on the class imbalance (e.g., w^{(WCE)}_{j} = 1 / \sum_{i}^{N'} g_{ij}, where N' = N N_I and N_I is the number of all training images):

\mathcal{L}_{WCE} = - \sum_{j}^{M} \sum_{i}^{N} w^{(WCE)}_{j}\, g_{ij} \log p_{ij}   (8)
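Putting the three components together, the sketch below shows one way to implement Eq. (3) in PyTorch. The tensor shapes, the conventional "1 − Dice" form of the region term, and the mean reductions are assumptions for illustration, not the paper's exact implementation.

```python
import torch

def bc_loss(prob, gt_onehot, phi_g, class_weights, alpha=0.5, gamma=0.5, eps=1e-6):
    """Boundary Combo loss of Eq. (3):
    L_BC = alpha * L_B + (1 - alpha) * ((1 - gamma) * L_D + gamma * L_WCE).

    prob:          softmax output, shape (B, 2, H, W); channel 1 is "crack" (assumption).
    gt_onehot:     one-hot ground truth with the same shape as prob.
    phi_g:         level-set map of the GT boundary, shape (B, H, W) (see the sketch above).
    class_weights: per-class weights w_j^(WCE), shape (2,).
    """
    # Boundary term (discrete Eq. (5)).
    l_b = (phi_g * prob[:, 1]).mean()

    # Dice term; written here in the conventional "1 - Dice" form so that lower is better.
    inter = (prob * gt_onehot).sum(dim=(0, 2, 3))
    denom = (prob ** 2 + gt_onehot ** 2).sum(dim=(0, 2, 3))
    l_d = 1.0 - (2.0 * inter / (denom + eps)).mean()

    # Weighted cross-entropy term (Eq. (8)).
    w = class_weights.view(1, -1, 1, 1)
    l_wce = -(w * gt_onehot * torch.log(prob + eps)).mean()

    return alpha * l_b + (1.0 - alpha) * ((1.0 - gamma) * l_d + gamma * l_wce)
```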
3.3. Segmentation-aware Weights for SR
In addition to end-to-end learning with L_C (i.e., the segmentation loss in Eq. (2)), we propose to weight L_S by L_C for further optimizing the SR network S for segmentation. This weighting is achieved by pixelwise multiplying L_S by L_C.
It is not yet easy to discriminate between crack and background pixels for precisely detecting fine cracks. This difficulty arises especially around crack pixels. For such difficult pixelwise segmentation, our method employs the following two difficulty-aware weights:

• For detecting all fine thin cracks, a segmentation loss function is weighted so that pixels inside and around cracks receive higher weights. The weight given to pixel p, w^C_p, is expressed as follows:

w^{C}_{p} = \exp(-m^{C} D_{p})   (9)

where m^C and D_p denote a weight constant and the distance between p and its nearest crack pixel, respectively. w^C_p is called the Crack-Oriented (CO) weight.

• For hard pixel mining, a segmentation loss function is weighted so that pixels inside and around false-positive and false-negative pixels receive higher weights. For such difficulty-aware segmentation, in our method, the weight given to pixel p, w^F_p, is expressed as follows:

w^{F}_{p} = \exp(m^{F}\, |T^{P}_{p} - T^{GT}_{p}|)   (10)

where 0 ≤ T^P_p ≤ 1 and T^{GT}_p ∈ {0, 1} denote the value of the p-th pixel in the predicted and ground-truth segmentation images, respectively, and m^F is a weight constant. Our w^F_p is applicable to any loss function, such as our BC loss in Eq. (3) consisting of multiple loss functions, whereas the focal loss [67] and the anchor loss [88], both of which also penalize hard samples, are based on a weighted cross entropy loss. w^F_p is called the Fail-Oriented (FO) weight.

These two weights (9) and (10) are multiplied pixelwise by L_S, as illustrated in the sketch below.
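The sketch below illustrates one possible implementation of the CO and FO weights of Eqs. (9) and (10). It assumes gt_mask is a binary (H, W) tensor and pred_prob is the predicted crack-probability map; how the weights are combined with the pixelwise SR loss before reduction is only indicated schematically in the comment.

```python
import torch
from scipy.ndimage import distance_transform_edt

def crack_oriented_weight(gt_mask: torch.Tensor, m_c: float = 1.0) -> torch.Tensor:
    """Eq. (9): w^C_p = exp(-m^C * D_p), where D_p is the distance from pixel p
    to its nearest crack pixel (zero on crack pixels)."""
    dist = distance_transform_edt(gt_mask.cpu().numpy() == 0)
    return torch.exp(-m_c * torch.as_tensor(dist, dtype=torch.float32))

def fail_oriented_weight(pred_prob: torch.Tensor, gt_mask: torch.Tensor,
                         m_f: float = 1.0) -> torch.Tensor:
    """Eq. (10): w^F_p = exp(m^F * |T^P_p - T^GT_p|); large on false positives/negatives."""
    return torch.exp(m_f * (pred_prob - gt_mask.float()).abs())

# The pixelwise SR loss (e.g., an L1 error map before spatial averaging) is then
# multiplied by these weights before reduction, e.g.:
#   weighted_sr_loss = (w_c * w_f * l1_error_map).mean()
```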
3.4. Blur Skip for Blur-reflected Task Learning
It is not easy for the blind SR network to perfectly predict the ground-truth blur kernel K and the ground-truth HR image I^H so that I^S = I^H. Let K^P and K^S denote the predicted kernel and the blur kernel that remains in I^S, so that K = K^P + K^S and I^S = I^H ∗ K^S. We assume that K^S correlates with K^P.
Based on this assumption, this paper proposes blur-reflected segmentation learning via a skip connection, called the blur skip, from the SR network S to the segmentation network C. This skip connection forwards K^P to the end of C in order to condition the features extracted by C with K^P. While this conditioning is achieved by the Spatial Feature Transform (SFT) [104], SFT is marginally modified for CSBSR as follows. The detail of the modified SFT layer is shown in Fig. 4. In the original SFT layer, conditions are directly fed into conv layers for producing the conditioning features (depicted by red and yellow 3D boxes, respectively, in Fig. 4) used for scaling and shifting. Different from this original SFT layer, the target features ("Segmentation features" in Fig. 4) are concatenated to the conditions. It is empirically validated that this concatenation process slightly improves the segmentation quality.
Figure 4: The structure of our blur skip module using SFT [104]. Each 3D box and rectangle depict a feature set and a process, respectively.
⊙ indicates a pixelwise multiplication operator.
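A rough sketch of the modified SFT conditioning described in Sec. 3.4 and Fig. 4 is shown below. The class name, channel sizes, and the way the predicted kernel is flattened and spatially expanded are assumptions for illustration; only the overall structure (expand, concatenate, scale/shift with a residual connection) follows the figure.

```python
import torch
import torch.nn as nn

class BlurSkipSFT(nn.Module):
    """Sketch of the modified SFT layer: the predicted blur kernel K^P (flattened and
    spatially expanded) is concatenated with the segmentation features before the conv
    layers that produce the scale and shift maps; channel sizes are assumptions."""
    def __init__(self, feat_ch=64, cond_ch=441, hidden_ch=64):   # 441 = 21 * 21 kernel entries
        super().__init__()
        self.scale = nn.Sequential(nn.Conv2d(feat_ch + cond_ch, hidden_ch, 1),
                                   nn.ReLU(inplace=True),
                                   nn.Conv2d(hidden_ch, feat_ch, 1))
        self.shift = nn.Sequential(nn.Conv2d(feat_ch + cond_ch, hidden_ch, 1),
                                   nn.ReLU(inplace=True),
                                   nn.Conv2d(hidden_ch, feat_ch, 1))

    def forward(self, seg_feat, blur_kernel):
        b, _, h, w = seg_feat.shape
        cond = blur_kernel.flatten(1)[:, :, None, None].expand(b, -1, h, w)  # "Expand" in Fig. 4
        x = torch.cat([seg_feat, cond], dim=1)                               # "Concat" in Fig. 4
        return seg_feat * self.scale(x) + self.shift(x) + seg_feat           # scale, shift, residual
```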
3.5. Training Strategy
Our joint learning has several loss functions, weights, and hyper-parameters. They should be used properly for training our complex network consisting of S and C.

Step 1: As with most tasks, each of which has a limited amount of training data, S is pre-trained with huge general datasets for blind SR.

Step 2: With a dataset for crack segmentation, only S is initially finetuned with β = 0 in Eq. (2).

Step 3: The whole network is finetuned so that L_C is weighted by a constant (i.e., β ≠ 0).

4. Experimental Results
4.1. Pre-training and Training Details
For pre-training the SR network S, 3,450 images in the DIV2K dataset [6] (800 images) and the Flickr2K dataset [100] (2,650 images) were used. The whole network for crack segmentation C was not pre-trained, but its feature extractor was pre-trained with ImageNet [28].
For pre-training S (i.e., Step 1 in Sec. 3.5) and finetuning S and C (i.e., Steps 2 and 3), an image patch fed into each network is randomly cropped, with vertical and horizontal flips, from each training image for data augmentation. This patch is regarded as an HR image (I^H). From I^H, its LR images (I^L) are generated with various blur kernels (K) and bicubic downsampling (↓_s), as expressed in Eq. (1). K is randomly sampled from anisotropic 2D Gaussian blurs with variances σ_a², σ_b² ∈ [0.2, 4.0] and angle θ_gaus ∈ [0, π). The kernel size is 21 × 21 pixels. The HR-to-LR downscaling factor is 1/4. The feature extractor of C is pre-trained depending on the segmentation network as follows. For U-Net and PSPNet, the VGG-16 model provided by torchvision [4] is used. For HRNet+OCR, the authors' model [113] is used.
For pre-training of S in Step 1, the number of iterations is 200,000. The minibatch size is six. Adam [58] is used as the optimizer with β1 = 0.9, β2 = 0.999, and ε = 10^−8. The learning rate is 2 × 10^−4.
The number of iterations is 30,000 and 150,000 in Steps 2 and 3, respectively. The minibatch size and the optimizer are equal to those in the aforementioned pre-training. The learning rate is 2 × 10^−5.

4.2. Synthetically-degraded Crack Images
4.2.1. Training
For the experiments shown in Secs. 4.2 and 4.3, the Khanhha dataset [56] was used to finetune the whole network for CSBSR, i.e., the SR and segmentation networks.
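The sketch below illustrates the LR-image synthesis used for training (Eq. (1)): blur the HR patch with an anisotropic Gaussian kernel and bicubic-downsample it by 1/4. Treating the sampled σ values directly as standard deviations, and the exact rotation parameterization, are assumptions beyond what the text above states.

```python
import math
import numpy as np
import torch
import torch.nn.functional as F

def anisotropic_gaussian_kernel(size=21, sigma_a=2.0, sigma_b=1.0, theta=0.3):
    """Anisotropic 2D Gaussian blur kernel K (size x size), normalized to sum to one."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    xr = xx * math.cos(theta) + yy * math.sin(theta)     # rotate coordinates by theta
    yr = -xx * math.sin(theta) + yy * math.cos(theta)
    k = np.exp(-0.5 * ((xr / sigma_a) ** 2 + (yr / sigma_b) ** 2))
    return torch.as_tensor(k / k.sum(), dtype=torch.float32)

def degrade(hr_img, kernel, scale=4):
    """I^L = (I^H * K) downsampled by 1/scale with bicubic interpolation (Eq. (1))."""
    c = hr_img.shape[1]
    weight = kernel[None, None].repeat(c, 1, 1, 1)       # one kernel per channel
    blurred = F.conv2d(hr_img, weight, padding=kernel.shape[-1] // 2, groups=c)
    return F.interpolate(blurred, scale_factor=1.0 / scale, mode="bicubic",
                         align_corners=False)
```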
Table 2
Results on the Khanhha dataset. CSBSR is implemented with four different segmentation networks [124, 112, 69, 87]. To validate the effect of
our proposed joint learning, the SR and segmentation networks are also trained without joint learning in each pair of the SR and segmentation
networks; see (d) vs. (e), (f) vs. (g), (h) vs. (i), and (j) vs. (k). For comparison, the results of SOTA methods with SR and segmentation are
also shown in (b) and (c). For reference, instead of an LR image, its original HR image is directly fed into the segmentation network in “(a)
Segmentation in HR” for the upper bound analysis. The best score in each column except for “(a) Segmentation in HR” is colored by red.
The Khanhha dataset consists of the CRACK500 [120], GAPs [30], CrackForest [91], AEL [9], cracktree200 [127], DeepCrack [71], and CSSC [108] datasets. As shown in the sample images of these datasets (Fig. 5), the Khanhha dataset is challenging in that a variety of structures are observed and the properties of the annotated cracks differ between the elemental datasets [120, 30, 91, 9, 127, 71, 108]. In the Khanhha dataset, the image size is 448 × 448 pixels, which is regarded as an HR image in our experiments. The dataset has 9,122 training, 481 validation, and 1,695 test images. These training and test sets were used as the training images for all experiments and as the test images for the experiments shown in Sec. 4.2, respectively.

4.2.2. Evaluation Metrics
Each SR image is evaluated with PSNR and SSIM [106]. Each segmentation image is evaluated with Intersection over Union (IoU). While IoU is computed on a binarized image, the output of CSBSR is a segmentation image in which each pixel has a probability of being a crack or not. Since IoU differs depending on the threshold for binarization, the threshold for each method is determined so that the mean IoU over all test images is maximized. This maximized IoU is called IoU_max. For evaluation independent of thresholding, IoUs are also averaged over thresholds (AIU [107]). While IoU is a major metric for segmentation, it is inappropriate for evaluating fine thin cracks because a slight displacement makes IoU significantly small even if the structures of the ground-truth and estimated cracks are almost similar. For appropriately evaluating such similar cracks, the 95% Hausdorff Distance (HD95) [27] is employed. As with IoU, the HD95 threshold for each method is also determined so that the mean HD95 over all test images is minimized. This minimized HD95 is called HD95_min. For evaluation independent of thresholding, HD95s are also averaged over thresholds. This averaged HD95 is called AHD95.
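The following sketch shows how the threshold-swept segmentation metrics described in Sec. 4.2.2 can be computed; the 0.01 threshold step is an assumption, and the same sweep-and-aggregate scheme applies to HD95 (with min and mean instead of max and mean).

```python
import numpy as np

def iou(pred_bin, gt_bin, eps=1e-9):
    inter = np.logical_and(pred_bin, gt_bin).sum()
    union = np.logical_or(pred_bin, gt_bin).sum()
    return (inter + eps) / (union + eps)

def threshold_swept_iou(prob_maps, gt_masks, thresholds=np.arange(0.0, 1.01, 0.01)):
    """Mean IoU over the test set for every binarization threshold.
    IoU_max is the best mean IoU over thresholds; AIU is their average."""
    mean_ious = []
    for t in thresholds:
        ious = [iou(p >= t, g > 0) for p, g in zip(prob_maps, gt_masks)]
        mean_ious.append(np.mean(ious))
    mean_ious = np.asarray(mean_ious)
    return mean_ious.max(), mean_ious.mean()   # (IoU_max, AIU)
```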
4.2.3. Comparison with SOTA segmentation methods
For the comparative experiments, the 1,695 HR test images in the Khanhha dataset are degraded to their LR images in the same manner as the training image generation.
For validating the wide applicability of CSBSR, four SOTA segmentation networks (i.e., PSPNet [124] for Table 2 (e), HRNet+OCR [112] for Table 2 (g), CrackFormer [69] for Table 2 (i), and U-Net [87] for Table 2 (k)) are used as the segmentation network in CSBSR, as described in Sec. 3.1. While CSBSR is trained in a joint end-to-end manner (i.e., (e), (g), (i), and (k) in Table 2), the results of independent blind SR and segmentation networks (i.e., (d), (f), (h), and (j) in Table 2) are also shown for comparison. To focus on the difference between the network architectures for segmentation, all of these segmentation networks are trained with our BC loss in Eq. (3). In the BC loss, α = 0.5 and γ = 0.5 were determined empirically. The task weight β in Eq. (2) is determined empirically for each method and fixed during Step 3 of the training strategy (Sec. 3.5).
In addition, CSBSR is compared with SOTA methods in which non-blind SR and segmentation are used (i.e., Table 2 (b) SrcNet [12], in which SR and segmentation are trained independently, and Table 2 (c) DSRL [102], in which SR and segmentation are trained in a multi-task learning manner). The segmentation networks of SrcNet and DSRL are trained with the BCE loss. While SrcNet is implemented by ourselves because its code is not available, we used the publicly-available implementation of DSRL [3].
Quantitative Results: Table 2 shows the quantitative results. In all metrics, all variants of CSBSR are better than their original segmentation methods; that is, (e), (g), (i), and (k) are better than (d), (f), (h), and (j), respectively, in Table 2. As a result, CSBSR is the best in all segmentation metrics (i.e., IoU, AIU, HD95, and AHD95).
Our proposed methods are also compared with SOTA segmentation methods using SR (i.e., (b) and (c) in Table 2).
Figure 6: IoU and HD95 comparison with SOTA methods on the Khanhha dataset. (a) HR segmentation by PSPNet [124] (b) SR
segmentation by SrcNet. (c) SR segmentation by DSRL. (d) SR segmentation by CSBSR w/o joint learning. (e) SR segmentation by
CSSR. (f) SR segmentation by CSBSR.
Figure 7 (panels, left to right): (a) Input LR, (b) HR GT, (c) SrcNet [12], (d) DSRL [102], (e) Ours.
The performance improvement of CSBSR compared to SrcNet might be acquired by the BC loss, joint learning, and/or blind SR. In the comparison between CSBSR and DSRL, we can see the effectiveness of serial joint learning, as well as of the BC loss and blind SR.
Even in comparison with (a) segmentation in HR images (implemented by PSPNet with the BC loss), the segmentation scores of CSBSR get close to those of segmentation in HR. For example, the IoU and AIU of CSBSR with PSPNet are 93.0% and 98.7% of those of segmentation in HR. In terms of HD95, on the other hand, CSBSR is much inferior to segmentation in HR. This reveals that CSBSR should be further improved in order to extract fine crack structures.
The IoU and HD95 scores of our proposed CSBSR are shown in Fig. 6. For comparison, our method with non-blind SR (i.e., CSSR) and SOTA segmentation methods using SR are compared with CSBSR. As the upper limit, the scores of segmentation on ground-truth HR images are also shown as (a) in Fig. 6, while LR images are fed into all the other methods, (b), (c), (d), (e), and (f) in Fig. 6. It can be seen that (b) SrcNet and (c) DSRL are clearly inferior to the others in both IoU and HD95. In particular, the scores of DSRL change significantly depending on the threshold. This reveals that DSRL is sensitive to a change in the threshold. The scores of all the other methods accepting LR images are close to those of (a) segmentation in HR images. In particular, (f) CSBSR can get higher scores over a wide range of thresholds. This stability against a change in the threshold is crucial in applying CSBSR to a variety of segmentation tasks.
Visual Results: Figure 7 shows visual results. In the upper row, from left to right, the first and second images are an input LR image (enlarged by nearest-neighbor interpolation) and its ground-truth HR image. The remaining three images are the SR images of SrcNet, DSRL, and CSBSR. It can be seen that the SR image of CSBSR is much sharper than those of
(a) Input LR (b) HR GT (c) SS in HR (d) SrcNet [12] (e) DSRL [102] (f) CSBSR
Figure 8: Visual comparison on the Khanhha dataset. In the upper row of each example: (a) Input LR image (enlarged by Bicubic interpolation for visualization). (b, c) Ground-truth HR image. (d) SR image obtained by SrcNet. (e) SR image obtained by DSRL. (f) SR image obtained by our CSBSR. In the lower row of each example: (a) No image. (b) Ground-truth segmentation image in HR. (c) HR segmentation image obtained by PSPNet [124]. (d) SR segmentation image obtained by SrcNet. (e) SR segmentation image obtained by DSRL. (f) SR segmentation image obtained by our CSBSR.
SrcNet and DSRL. In terms of the crack segmentation images as well, CSBSR outperforms SrcNet and DSRL.
Figure 8 shows examples of more complex cracks. Since such complex crack pixels are difficult to detect correctly, even segmentation methods using SR reconstruction (i.e., SrcNet [12] and DSRL [102]) cannot detect many crack pixels, as shown in Fig. 8 (d) and (e). As shown in Fig. 8 (f), on the other hand, our CSBSR can obtain crack segmentation images that are similar to their corresponding segmentation images obtained on the original HR images shown in Fig. 8 (c). It can also be seen that CSBSR can reconstruct and detect even thin fine cracks in the SR and segmentation images, respectively. As a result, our results are similar to the ground-truth segmentation images shown in Fig. 8 (b).
Figure 9 shows examples where (f) the SR segmentation image obtained by CSBSR is better even than (c) the HR segmentation image obtained on the ground-truth HR image. These images are characterized by low image contrast
(a) Input LR (b) HR GT (c) SS in HR (d) SrcNet [12] (e) DSRL [102] (f) CSBSR
Figure 9: Examples where (f) the SR segmentation image obtained by our CSBSR is better than (c) the HR segmentation image obtained
in the ground-truth HR image.
around crack pixels, thin cracks, and/or local illumination changes around crack pixels.
We interpret that, while it is difficult for SR to reconstruct and for segmentation to detect such high-frequency and low-contrast structures as shown in Figs. 8 and 9, our joint learning of SR and segmentation with the segmentation-aware SR loss and the blur skip for blur-reflected segmentation learning can achieve these difficult tasks.
Figure 10 shows sample test images in which no crack pixels are observed. While there are no crack pixels in these images, the observed masonry joints tend to produce false positives. For real applications using automatic image inspection, it is important to suppress such false positives in order to avoid false alarms, because most images of real buildings contain no crack pixels. In Fig. 10, it can be seen that (d) SrcNet and (e) DSRL detect false positives around the masonry joints, while (f) CSBSR successfully ignores all of these masonry-joint pixels.

4.2.4. Effects of β
Table 3 shows the evaluation results obtained in accordance with changes in β. In all metrics of both the SR and segmentation tasks, CSBSR outperforms CSSR. Furthermore, in both CSSR and CSBSR, our proposed joint learning acquires better results in all segmentation metrics.
More specifically, in terms of the segmentation results, IoU_max and AIU do not change much depending on β. On the other hand, the best HD95_min and AHD95 scores are better with the training strategy with increasing β (i.e., "Increasing" in the table) and have a larger margin over the scores obtained with any fixed β. Intuitively speaking, the segmentation score should be best with β = 1 so that the segmentation loss (i.e., L_C in Eq. (2)) is fully weighted. We interpret that the segmentation scores are not best with β = 1 because it is difficult to fully optimize the whole network directly from the pre-trained SR and segmentation networks. That is why the training strategy with increasing β is better than β = 1.
In terms of the SR image quality, while the best SSIM is acquired without joint learning, the best PSNR is obtained with β = 0.3. Since the SR network is trained without joint learning just to improve SR, it is expected that the best SR results would be obtained without joint learning. This expectation is betrayed probably because of the feature-extractor augmentation through the training of the segmentation task. The features can be marginally augmented also for SR, as in multi-task learning, if β is smaller, while the features are optimized for the segmentation task if β is larger.
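Following the "Increasing" setting in Table 3, in which β is increased from 0 to 1 in proportion to iterations, a minimal sketch of such a schedule is given below; the linear form is an assumption consistent with that description.

```python
def beta_schedule(iteration: int, total_iterations: int) -> float:
    """Task weight beta in Eq. (2), increased linearly from 0 to 1 over Step 3."""
    return min(1.0, iteration / float(total_iterations))
```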
(a) Input LR (b) HR GT (c) SS in HR (d) SrcNet [12] (e) DSRL [102] (f) CSBSR (Ours)
Figure 10: Examples where there are no crack pixels in (a) input LR image.
Figure 11: Curves of IoU and HD95 scores varying with a change in the threshold for segmentation-image binarization. The compared losses are BC, GBC, WCE, B + GDice, and Combo, with segmentation in HR images (SS in HR) shown for reference.
4.2.5. Effects of Segmentation losses
To verify the effectiveness of our BC and GBC losses, CSBSR is trained with other losses for class-imbalanced segmentation (i.e., WCE [26], Dice [77], Combo [98], and GDice [97]). As shown in Table 4, the BC loss gets the best scores in four metrics (i.e., IoU, AIU, AHD95, and PSNR) and the second best in HD95. While it is in third place in SSIM, the gap from the best is tiny (0.705 vs 0.703).
Figure 11 shows the IoU and HD95 scores varying with a change in the threshold for binarizing the segmentation image. As shown in Table 4, GBC is inferior to BC. However, GBC gets higher scores over a large range of thresholds in both IoU and HD95. This property might be given by GDice, included in GBC, which is robust to class imbalance. On the other hand, while WCE gets better results in a few metrics in Table 4, its performance drop depending on the threshold is significant. This performance drop makes it difficult to apply the WCE loss to a variety of scenarios. As with GBC, the curves of BC also do not decrease much.
Based on the aforementioned observations, we conclude that our BC and GBC losses are superior to the other SOTA losses in terms of the maximum performance (as shown in Table 4) and stability (as shown in Fig. 11).
Table 3
Performance change depending on 𝛽. 𝛽 is fixed during Step 3 in the training strategy, except for “Increasing” shown in the bottom line in
which 𝛽 is increased from 0 to 1 in proportion to iterations.
Table 4
Comparison with other losses for class-imbalance segmentation. The best and second best scores are colored by red and blue, respectively.
4.2.6. Effects of Segmentation-aware SR-loss Weights
The effects of the additional weights given to L_S, which are proposed in Sec. 3.3, are evaluated in Table 5. Since w^C and w^F have hyper-parameters (i.e., m^C and m^F, respectively), the best results among m^C, m^F ∈ {2^−3, 2^−2, 2^−1, 2^0, 2^1, 2^2, 2^3} are shown in Table 5. We can make the following observations:

• All weights given to L_S improve HD95.

• Conversely, all weights given to L_S decrease IoU and AIU, while the performance drops are not significant. In particular, the IoU and AIU provided by w^F given to L_S are almost equal to those of the baseline CSBSR (i.e., 0.573 vs 0.573 in IoU and 0.551 vs 0.552 in AIU).

• While w^F weights the segmentation loss (L_C), the results are inferior to the baseline in most metrics, as shown in the bottom row of Table 5.

In addition to the quantitative comparison shown in Table 5, Fig. 12 visually shows the effect of the FO weight. All images are the results obtained with w^F = 1.0. In the left part of Fig. 12, we can see that w^F allows CSBSR to
Table 5
Ablation study of the weights given to L_S, namely L_C, w^C, and w^F. Scores better than the baseline (i.e., CSBSR w/o any weight) are underlined.
(a) Input LR (b) HR GT (c) w/o 𝑤𝐹 (d) w/ 𝑤𝐹 (b’) HR GT (c’) w/o 𝑤𝐹 (d’) w/ 𝑤𝐹
Figure 12: Visual comparison between CSBSR w/ and w/o the FO weight 𝑤𝐹 . [Left part] In the upper row of each example: (a) Input
LR image (enlarged by Bicubic interpolation for visualization). (b) Ground-truth HR image. (c) SR image obtained by CSBSR w/o 𝑤𝐹 .
(d) SR image obtained by CSBSR. In the lower row of each example: (a) No image. (b) Ground-truth segmentation image in HR. (c) SR
segmentation image obtained by CSBSR w/o 𝑤𝐹 . (d) SR segmentation image obtained by CSBSR. [Right part] Rectangle regions are
cropped from the SR images shown in the left part, and their zoom-in images are shown. The boundary color of each cropped image shows
the correspondence between the cropped images in the left and right parts. Differences between (c’) and (d’) are pointed by white arrows.
Table 6
Ablation study of our blur skip process. Scores better than the baseline (i.e., CSBSR w/o any weight) are underlined.
detect thin crack pixels in the segmentation images. In order to see the results of SR image enhancement by w^F, zoom-in images of several regions in the SR images are shown in the right part of Fig. 12.
In the (c) images obtained without w^F, the detected crack pixels are broken. In the (d) images obtained with w^F, on the other hand, cracks are detected more continuously, though it is difficult to visually see any significant difference between the zoom-in SR images shown in (c') and (d'). In the opposite way, the background textures enclosed by the purple dashed ellipse are falsely detected by CSBSR without w^F, as shown in (c) of the lower example. However, these background pixels reconstructed by CSBSR without and with w^F (enclosed by the purple dashed ellipses in (c') and (d')) are also almost the same as each other. These results demonstrate the effectiveness of w^F for discriminating between remarkably-similar crack and background pixels in the segmentation network of CSBSR.

4.2.7. Effects of Blur Skip
The effects of the proposed blur skip process are shown in Table 6. Since the quality of the estimated kernel is high enough (e.g., above 50 dB in PSNR), our blur skip should have the potential to support the segmentation task. While the blur skip alone does not work well for all metrics, the blur skip used with w^F improves HD95 and AHD95. Typical examples are shown in Fig. 13. While the results without the blur skip are much inferior to their ground truths, the blur skip can improve the performance, as shown in the rightmost image in Fig. 13.
(a) GT (b) Results without blur skip (c) Results with blur skip
Figure 13: Effectiveness of our proposed blur skip. The left and right images show the HR/SR image and the segmentation image,
respectively.
4.3. Crack Images with Real Degradations
For the experiments with real images, we captured 809 wall images (1280 × 720 pixels) with a flying drone (DJI MAVIC MINI). This dataset includes out-of-focus images as well as motion-blurred images. By using all the images in this dataset as test images, we visually verify the effectiveness of CSBSR for realistically-blurred images. Since it is essentially difficult to annotate severely-blurred cracks correctly, only a qualitative comparison is done with this dataset.
In the first row of Fig. 14, the cracks are very thin. DSRL and SrcNet cannot detect any crack pixels. In addition, false-positive cracks (enclosed by yellow ellipses) are detected. CSBSR, on the other hand, can detect most crack pixels, as depicted by the superimposed red pixels.
The second row of Fig. 14 shows the segmentation results detected on an image of complex cracks observed on a building wall. While DSRL detects no crack pixels, SrcNet and CSBSR successfully detect several crack pixels. CSBSR can detect more true-positive crack pixels, in particular along a crack located in the upper part of the image (enclosed by blue ellipses). However, there are also many false-negative crack pixels (enclosed by green ellipses) even in the segmentation image of CSBSR.
In the input image shown in the third row of Fig. 14, there are thin electrical wires as well as thin cracks (enclosed by blue and green ellipses). A crack segmentation method is required to detect only real cracks without being disturbed by the wires. DSRL detects several wire pixels (enclosed by the yellow ellipse) and crack pixels, while SrcNet detects nothing. While CSBSR detects only crack pixels, even CSBSR fails to detect the blurry cracks observed in the lower part of the image (enclosed by green ellipses).
As mentioned above, while our CSBSR outperforms SOTA segmentation methods using SR, it also fails to detect severely-degraded cracks. Improving crack segmentation in such severely-degraded images is important future work.
5. Concluding Remarks
This paper proposes an end-to-end joint learning network consisting of blind SR and segmentation networks. Blind SR allows us to apply the proposed method to realistically-blurred images. The information exchange between the SR and segmentation networks (i.e., the segmentation-aware SR-loss weights and the blur skip for blur-reflected task learning) enables further improvement. For better segmentation in class-imbalanced fine crack images, the BC loss is proposed.
Future work includes quantitative evaluation on real-image datasets in which ground-truth segmentation pixels are manually given. It is also interesting to apply CSBSR to other segmentation tasks such as medical imaging. An essential difficulty in SR is that SR is an ill-posed problem in which a larger number of pixels are reconstructed from a smaller number of pixels. In order to relieve this difficulty, multiple LR images are used as a set of input images in video SR [34, 80, 46, 42] and burst SR [16]. Our proposed method can also be extended to one using time-series images.

5.1. Acknowledgments
This work was partly supported by JSPS KAKENHI Grant Numbers 19K12129 and 22H03618.

References
[14] Yancheng Bai, Yongqiang Zhang, Mingli Ding, and Bernard Ghanem. Finding tiny faces in the wild with generative adversarial network. In CVPR, 2018.
[15] Yancheng Bai, Yongqiang Zhang, Mingli Ding, and Bernard Ghanem. SOD-MTGAN: small object detection via multi-task generative adversarial network. In ECCV, 2018.
[16] Goutam Bhat, Martin Danelljan, Radu Timofte, Kazutoshi Akita, Wooyeong Cho, Haoqiang Fan, Lanpeng Jia, Daeshik Kim, Bruno Lecouat, Youwei Li, Shuaicheng Liu, Ziluan Liu, Ziwei Luo, Takahiro Maeda, Julien Mairal, Christian Micheloni, Xuan Mo, Takeru Oba, Pavel Ostyakov, Jean Ponce, Sanghyeok Son, Jian Sun, Norimichi Ukita, Rao Muhammad Umer, Youliang Yan, Lei Yu, Magauiya Zhussip, and Xueyi Zou. NTIRE 2021 challenge on burst super-resolution: Methods and results. In CVPR Workshop, 2021.
[17] Yochai Blau, Roey Mechrez, Radu Timofte, Tomer Michaeli, and Lihi Zelnik-Manor. The 2018 PIRM challenge on perceptual image super-resolution. In ECCV Workshop, 2018.
[18] Samuel Rota Bulò, Gerhard Neuhold, and Peter Kontschieder. Loss max-pooling for semantic image segmentation. In CVPR, 2017.
[19] Hanshen Chen, Yishun Su, and Wei He. Automatic crack segmentation using deep high-resolution representation learning. Applied Optics, 60(21):6080–6090, 2021.
[20] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell., 40(4):834–848, 2018.
[21] Dong-Yoon Choi, Ji Hoon Choi, Jin Wook Choi, and Byung Cheol Song. Sharpness enhancement and super-resolution of around-view monitor images. IEEE Trans. Intell. Transp. Syst., 19(8):2650–2662, 2018.
[1] Crack segmentation. https://github.com/khanhha/crack_ [22] Wooram Choi and Young-Jin Cha. Sddnet: Real-time crack segmen-
segmentation. tation. IEEE Trans. Ind. Electron., 67(9):8016–8025, 2020.
[2] Crackformer-ii. https://github.com/LouisNUST/CrackFormer-II. [23] Victor Cornillère, Abdelaziz Djelouah, Yifan Wang, Olga Sorkine-
[3] Dual super-resolution learning for semantic segmentation. https: Hornung, and Christopher Schroers. Blind image super-
//github.com/Dootmaan/DSRL. resolution with spatially variant degradations˙ ACM Trans. Graph.,
[4] Torchvision.models. https://pytorch.org/vision/stable/models. 38(6):166:1–166:13, 2019.
html. [24] Tao Dai, Jianrui Cai, Yongbing Zhang, Shu-Tao Xia, and Lei Zhang.
[5] Allen Zhang abd Kelvin C. P. Wang, Yue Fei, Yang Liu, Siyu Tao, Second-order attention network for single image super-resolution. In
Cheng Chen, Joshua Q. Li, and Baoxian Li. Deep learning–based CVPR, 2019.
fully automated pavement crack detection on 3d asphalt surfaces [25] Dimitris Dais, Ihsan Engin Bal, Eleni Smyrou, and Vasilis Sarho-
with an improved cracknet. Journal of Computing in Civil Engi- sis. Automatic crack classification and segmentation on masonry
neering, 32(5), 2018. surfaces using convolutional neural networks and transfer learning.
[6] Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single Automation in Construction, 25, 2021.
image super-resolution: Dataset and study. In CVPRW, 2017. [26] Dimitris Dais, İhsan Engin Bal, Eleni Smyrou, and Vasilis Sarho-
[7] Kazutoshi Akita, Muhammad Haris, and Norimichi Ukita. Region- sis. Automatic crack classification and segmentation on masonry
dependent scale proposals for super-resolution in object detection. surfaces using convolutional neural networks and transfer learning.
In IPAS, 2020. Automation in Construction, 125:103606, 2021.
[8] Kazutoshi Akita, Masayoshi Hayama, Haruya Kyutoku, and Norim- [27] DeepMind. Surface distance metrics. https://github.com/deepmind/
ichi Ukita. AVM image quality enhancement by synthetic image surface-distance.
learning for supervised deblurring. In MVA, 2021. [28] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-
[9] Rabih Amhazand, Sylvie Chambon, Jérôme Idier, and Vincent Bal- Fei. Imagenet: A large-scale hierarchical image database. In CVPR,
tazart. Automatic crack detection on two-dimensional pavement 2009.
images: An algorithm based on minimal path selection. TITS, [29] Qi Dong, Shaogang Gong, and Xiatian Zhu. Class rectification hard
17(10):2718–2729, 2016. mining for imbalanced deep learning. In ICCV, 2017.
[10] Md Rifat Arefin, Vincent Michalski, Pierre-Luc St-Charles, Alfredo [30] Markus Eisenbach, Ronny Stricker, Daniel Seichter, Karl Amende,
Kalaitzis, Sookyung Kim, Samira Ebrahimi Kahou, and Yoshua Klaus Debes, Maximilian Sesselmann, Dirk Ebersbach, Ulrike
Bengio. Multi-image super-resolution for remote sensing using deep Stoeckert, and Horst-Michael Gross. How to get pavement distress
recurrent networks. In CVPR Workshops, 2020. detection ready for deep learning? a systematic approach. In IJCNN,
[11] Leanne Attard, Carl James Debono, Gianluca Valentino, and 2017.
Mario Di Castro. Tunnel inspection using photogrammetric tech- [31] Faris Elghaish, Saeed Talebi, Essam Abdellatef, Sandra T Matarneh,
niques and image processing: A review. ISPRS Journal of Pho- M Reza Hosseini, Song Wu, Mohammad Mayouf, Aso Hajirasouli,
togrammetry and Remote Sensing, 140:180–18, 2018. et al. Developing a new deep learning cnn model to detect and
[12] Hyunjin Bae, Keunyoung Jang, and Yun-Kyu An. Deep super classify highway cracks. Journal of Engineering, Design and
resolution crack network (srcnet) for improving computer vision– Technology, 2021.
based automated crack detectability in in situ bridges. Structural [32] Yue Fei, Kelvin C. P. Wang, Allen Zhang, Cheng Chen, Joshua Q.
Health Monitoring, 20(4):1428–1442, 2021. Li, Yang Liu, Guangwei Yang, and Baoxian Li. Pixel-level cracking
[13] Yuval Bahat and Tomer Michaeli. Explorable super resolution. In detection on 3d asphalt pavement images through deep-learning-
CVPR, 2020.
[73] Andreas Lugmayr, Martin Danelljan, Luc Van Gool, and Radu Tim- [90] Evan Shelhamer, Jonathan Long, and Trevor Darrell. Fully convo-
ofte. Srflow: Learning the super-resolution space with normalizing lutional networks for semantic segmentation. IEEE Trans. Pattern
flow. In ECCV, 2020. Anal. Mach. Intell., 39(4):640–651, 2017.
[74] Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, and Tieniu [91] Yong Shi, Limeng Cui, Zhiquan Qi, Fan Meng, and Zhensong Chen.
Tan. Unfolding the alternating optimization for blind super resolutio Automatic road crack detection using random structured forests.
n. In NeurIPS, 2020. TITS, 17(12):3434–3445, 2016.
[75] Jun Ma, Jianan Chen, Matthew Ng, Rui Huang, Yu Li, Chen Li, [92] Gyumin Shim, Jinsun Park, and In So Kweon. Robust reference-
Xiaoping Yang, and Anne L. Martel. Loss odyssey in medical image based super-resolution with similarity-aware deformable convolu-
segmentation. Medical Image Anal., 71:102035, 2021. tion. In CVPR, 2020.
[76] Yiqun Mei, Yuchen Fan, and Yuqian Zhou. Image super-resolution [93] Kodai Shimosato and Norimichi Ukita. Multi-modal data fusion
with non-local sparse attention. In CVPR, 2021. for land-subsidence image improvement in psinsar analysis. IEEE
[77] Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. V-net: Access, 9:141970–141980, 2021.
Fully convolutional neural networks for volumetric medical image [94] Maneet Singh, Shruti Nagpal, Richa Singh, and Mayank Vatsa. Dual
segmentation. In 3DV, 2016. directed capsule network for very low resolution image recognition.
[78] Shervin Minaee, Yuri Boykov, Fatih Porikli, Antonio Plaza, Nasser In ICCV, 2019.
Kehtarnavaz, and Demetri Terzopoulos. Image segmentation using [95] Jae Woong Soh, Sunwoo Cho, and Nam Ik Cho. Meta-transfer
deep learning: A survey. arXiv, 2001.05566, 2020. learning for zero-shot super-resolution. In CVPR, 2020.
[79] Masashi Nagaya and Norimichi Ukita. Embryo grading with unre- [96] Simon Stent, Riccardo Gherardi, Björn Stenger, Kenichi Soga, and
liable labels due to chromosome abnormalities by regularized PU Roberto Cipolla. An image-based system for change detection on
learning with ranking. IEEE Trans. Medical Imaging, 41(2):320– tunnel linings. In MVA, 2013.
331, 2022. [97] Carole H. Sudre, Wenqi Li, Tom Vercauteren, Sébastien Ourselin,
[80] Seungjun Nah, Radu Timofte, Shuhang Gu, Sungyong Baik, Seokil and M. Jorge Cardoso. Generalised dice overlap as a deep learning
Hong, Gyeongsik Moon, Sanghyun Son, Kyoung Mu Lee, Xintao loss function for highly unbalanced segmentations. In MICCAI,
Wang, Kelvin C. K. Chan, Ke Yu, Chao Dong, Chen Change Loy, 2017.
Yuchen Fan, Jiahui Yu, Ding Liu, Thomas S. Huang, Xiao Liu, Chao [98] Saeid Asgari Taghanaki, Yefeng Zheng, S Kevin Zhou, Bogdan
Li, Dongliang He, Yukang Ding, Shilei Wen, Fatih Porikli, Ratheesh Georgescu, Puneet Sharma, Daguang Xu, Dorin Comaniciu, and
Kalarot, Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita, Ghassan Hamarneh. Combo loss: Handling input and output im-
Peng Yi, Zhongyuan Wang, Kui Jiang, Junjun Jiang, Jiayi Ma, balance in multi-organ segmentation. Comput Med Imaging Graph,
Hang Dong, Xinyi Zhang, Zhe Hu, Kwan-Young Kim, Dong Un 75:24–33, 2019.
Kang, Se Young Chun, Kuldeep Purohit, A. N. Rajagopalan, Yapeng [99] Hossein Talebi and Peyman Milanfar. Learning to resize images for
Tian, Yulun Zhang, Yun Fu, Chenliang Xu, A. Murat Tekalp, computer vision tasks. In ICCV, 2021.
M. Akin Yilmaz, Cansu Korkmaz, Manoj Sharma, Megh Makwana, [100] Radu Timofte, Eirikur Agustsson, Luc Van Gool, Ming-Hsuan Yang,
Anuj Badhwar, Ajay Pratap Singh, Avinash Upadhyay, Rudrabha and Lei Zhang. Ntire 2017 challenge on single image super-
Mukhopadhyay, Ankit Shukla, Dheeraj Khanna, A. S. Mandal, San- resolution: Methods and results. In CVPRW, 2017.
tanu Chaudhury, Si Miao, Yongxin Zhu, and Xiao Huo. NTIRE 2019 [101] Radu Timofte, Shuhang Gu, Jiqing Wu, and Luc Van Gool. Ntire
challenge on video super-resolution: Methods and results. In CVPR 2018 challenge on single image super-resolution: Methods and re-
Workshop, 2019. sults. In NTIRE (CVPRW), 2018.
[81] Ben Niu, Weilei Wen, Wenqi Ren, Xiangde Zhang, Lianping Yang, [102] Li Wang, Dong Li, Yousong Zhu, Lu Tian, and Yi Shan. Dual super-
Shuzhen Wang, Kaihao Zhang, Xiaochun Cao, and Haifeng Shen. resolution learning for semantic segmentation. In CVPR, 2020.
Single image super-resolution via a holistic attention network. In [103] Longguang Wang, Yingqian Wang, Xiaoyu Dong, Qingyu Xu, Jun-
ECCV, 2020. gang Yang, Wei An, and Yulan Guo. Unsupervised degradation
[82] Yanwei Pang, Jiale Cao, Jian Wang, and Jungong Han. Jcs-net: Joint representation learning for blind super-resolution. In CVPR, 2021.
classification and super-resolution network for small-scale pedes- [104] Xintao Wang, Ke Yu, Chao Dong, and Chen Change Loy. Recover-
trian detection in surveillance images. IEEE Trans. Inf. Forensics ing realistic texture in image super-resolution by deep spatial feature
Secur., 14(12):3322–3331, 2019. transform. In CVPR, 2018.
[83] Seobin Park, Jinsu Yoo, Donghyeon Cho, Jiwon Kim, and Tae Hyun [105] Zhihao Wang, Jian Chen, and Steven C. H. Hoi. Deep learning for
Kim. Fast adaptation to super-resolution networks via meta-learning. image super-resolution: A survey. TPAMI, 43(10):3365–3387, 2021.
In ECCV, 2020. [106] Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simon-
[84] Prateek Prasanna, Kristin J. Dana, Nenad Gucunski, Basily B. celli. Image quality assessment: from error visibility to structural
Basily, Hung Manh La, Ronny Salim Lim, and Hooman Parvardeh. similarity. IEEE Transactions on image processing, 13(4):600–612,
Automated crack detection on concrete bridges. IEEE Trans Autom. 2004.
Sci. Eng., 13(2):591–599, 2016. [107] Fan Yang, Lei Zhang, Sijia Yu, Danil Prokhorov, Xue Mei, and
[85] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised rep- Haibin Ling. Feature pyramid and hierarchical boosting network
resentation learning with deep convolutional generative adversarial for pavement crack detection. TITS, 21(4):1525–1535, 2020.
networks. In ICLR, 2016. [108] Liang Yang, Bing Li, Wei Li, Liu Zhaoming, Guoyong Yang, and
[86] Amir Rezaie, Radhakrishna Achanta, Michele Godio, and Katrin Jizhong Xiao. Deep concrete inspection using unmanned aerial
Beyer. Comparison of crack segmentation using digital image vehicle towards cssc database. In IROS, 2017.
correlation measurements and deep learning. Construction and [109] Michael Yeung, Evis Sala, Carola-Bibiane Sch´’onlieb, and Leonardo
Building Materials, 261(20):120474, 2020. Rundo. Unified focal loss: Generalising dice and cross entropy-based
[87] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convo- losses to handle class imbalanced medical image segmentation.
lutional networks for biomedical image segmentation. In MICCAI, Computerized Medical Imaging and Graphics, 95:102026, 2022.
2015. [110] Tomoki Yoshida, Yuki Kondo, Takahiro Maeda, Kazutoshi Akita,
[88] Serim Ryou, Seong-Gyun Jeong, and Pietro Perona. Anchor loss: and Norimichi Ukita. Kernelized back-projection networks for blind
Modulating loss scale based on prediction difficulty. In ICCV, 2019. super resolution. https://github.com/Yuki-11/KBPN.
[89] Tamar Rott Shaham, Tali Dekel, and Tomer Michaeli. Singan: [111] Tomoki Yoshida, Yuki Kondo, Takahiro Maeda, Kazutoshi Akita,
Learning a generative model from a single natural image. In ICCV, and Norimichi Ukita. Kernelized back-projection networks for blind
2019. super resolution. arXiv, 2023.
[112] Yuhui Yuan, Xilin Chen, and Jingdong Wang. Object-contextual 6. Biography Section
representations for semantic segmentation. In ECCV, 2020.
[113] Yuhui Yuan et al. Ocnet series. https://github.com/openseg-group/
openseg.pytorch/blob/master/MODEL_ZOO.md.
[114] Kai Zhang, Jingyun Liang Luc Van Gool, and Radu Timofte. De- Yuki Kondo received the bachelor degree in en-
signing a practical degradation model for deep blind image super- gineering from Toyota Technological Institute in
resolution. In ICCV, 2021. 2022. Currently, he is a researcher with Toyota
[115] Kai Zhang, Luc Van Gool, and Radu Timofte. Deep unfolding Technological Institute. His research interests in-
network for image super-resolution. In CVPR, 2020. clude low-level vision including image and video
[116] Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, super-resolution and its application to tiny image
Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong analysis such as crack detection. His award in-
Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, cludes the best practical paper award in MVA2021.
Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru
Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu,
Dongliang He, Wenhao Wu, Yukang Ding, Chao Li, Fu Li, Shilei Norimichi Ukita received the B.E. and M.E. de-
Wen, Jianwei Li, Fuzhi Yang, Huan Yang, Jianlong Fu, Byung-Hoon grees in information engineering from Okayama
Kim, JaeHyun Baek, Jong Chul Ye, Yuchen Fan, Thomas S. Huang, University, Japan, in 1996 and 1998, respectively,
Junyeop Lee, Bokyeung Lee, Jungki Min, Gwantae Kim, Kanghyu and the Ph.D. degree in Informatics from Kyoto
Lee, Jaihyun Park, Mykola Mykhailych, Haoyu Zhong, Yukai Shi, University, Japan, in 2001. From 2001 to 2016, he
Xiaojun Yang, Zhijing Yang, Liang Lin, Tongtong Zhao, Jinjia Peng, was an assistant professor (2001 to 2007) and an
Huibing Wang, Zhi Jin, Jiahao Wu, Yifu Chen, Chenming Shang, associate professor (2007-2016) with the graduate
Huanrong Zhang, Jeongki Min, Hrishikesh P. S, Densen Puthussery, school of information science, Nara Institute of
and C. V. Jiji. Ntire 2020 challenge on perceptual extreme super- Science and Technology, Japan. In 2016, he be-
resolution: Methods and results. In NTIRE (CVPRW), 2020. came a professor with Toyota Technological Insti-
[117] Kai Zhang, Wangmeng Zuo, and Lei Zhang. Learning a single tute, Japan. He was a research scientist of Precur-
convolutional super-resolution network for multiple degradations. In sory Research for Embryonic Science and Tech-
CVPR, 2018. nology, Japan Science and Technology Agency,
[118] Kai Zhang, Wangmeng Zuo, and Lei Zhang. Deep plug-and-play during 2002–2006, and a visiting research scien-
super-resolution for arbitrary blur kernels. In CVPR, 2019. tist at Carnegie Mellon University during 2007–
[119] Kaige Zhang, Yingtao Zhang, and Heng-Da Cheng. Crackgan: 2009. Currently, he is also an adjunct professor at
Pavement crack detection using partially accurate ground truths Toyota Technological Institute at Chicago. Prof.
based on generative adversarial learning. TITS, 22(2):1306–1319, Ukita’s awards include the excellent paper award
2020. of IEICE (1999), the winner award in NTIRE 2018
[120] Lei Zhang, Fan Yang, Yimin Daniel Zhang, and Ying Julie Zhu. challenge on image super-resolution, the 1st place
Road crack detection using deep convolutional neural network. In in PIRM 2018 perceptual SR challenge, the best
ICIP, 2016. poster award in MVA2019, and the best practical
[121] Yongqiang Zhang, Yancheng Bai, Mingli Ding, Shibiao Xu, and paper award in MVA2021.
Bernard Ghanem. Kgsnet: Key-point-guided super-resolution net-
work for pedestrian detection in the wild. IEEE Trans. Neural
Networks Learn. Syst., 32(5):2251–2265, 2021.
[122] Zhifei Zhang, Zhaowen Wang, Zhe L. Lin, and Hairong Qi. Image
super-resolution by neural texture transfer. In CVPR, 2019.
[123] Hengshuang Zhao. Pytorch semantic segmentation. https://github.
com/hszhao/semseg/blob/master/model/pspnet.py.
[124] Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and
Jiaya Jia. Pyramid scene parsing network. In CVPR, 2017.
[125] Ruofan Zhou and Sabine Süsstrunk. Kernel modeling super-
resolution on real low-resolution images. In ICCV, 2019.
[126] Zhi-Hua Zhou and Xu-Ying Liu. Training cost-sensitive neural
networks with methods addressing the class imbalance problem.
IEEE Trans. Knowl. Data Eng., 18(1):63–77, 2006.
[127] Qin Zou, Yu Cao, Qingquan Li, Qingzhou Mao, and Song Wang.
Cracktree: Automatic crack detection from pavement images. Pat-
tern Recognition Letters, 33(3):227–238, 2012.
[128] Qin Zou, Zheng Zhang, Qingquan Li, Xianbiao Qi, Qian Wang, and
Song Wang. Deepcrack: Learning hierarchical convolutional fea-
tures for crack detection. IEEE Trans. Image Process., 28(3):1498–
1512, 2019.