BCDNet
https://doi.org/10.1007/s13042-023-01880-z
ORIGINAL ARTICLE
Received: 20 October 2022 / Accepted: 23 May 2023 / Published online: 25 June 2023
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023
Abstract
Change detection is an increasingly popular technique for the analysis of remote sensing data and is very important for an accurate understanding of changes occurring on the Earth's surface. The deep learning methods proposed so far are mainly simple networks, which results in poor detection of small changed areas because they cannot differentiate between the characteristics of the bi-temporal images. To solve this problem, this article proposes a novel Building Change Detection Network (BCDetNet) for building object change detection and its analysis from bi-temporal high-resolution satellite images. The proposed BCDetNet model can detect small change areas with the help of a multiple feature extraction block. The proposed BCDetNet model performs building change detection using bi-temporal high-resolution satellite images and is trained on two publicly available datasets, namely the LEVIR and WHU change detection (CD) datasets. These datasets contain RGB images with dimensions of 1024 × 1024 and 512 × 512, respectively. The BCDetNet model can learn from scratch during training and performs better than the benchmark change detection models with fewer trainable parameters. The BCDetNet model gives Recall 94.06%, Precision 93.00%, Jaccard score 88.40%, Accuracy 98.73%, F1 score 93.52% and Kappa coefficient 87.05% on the LEVIR CD dataset, and Recall 89.51%, Precision 92.78%, Jaccard score 84.38%, Accuracy 96.78%, F1 score 91.06% and Kappa coefficient 82.12% on the WHU CD dataset. This work is a step toward achieving the best results in building change detection from high-resolution satellite images.
Keywords Deep learning · Change detection · Siamese difference · Multiple feature extraction · Remote sensing
* Shyam Lal, [email protected]
K. S. Basavaraju, [email protected]
N. Solanki Hiren, [email protected]
N. Sravya, [email protected]
J. Nalini, [email protected]
Chintala Sudhakar Reddy, [email protected]

1 Department of Electronics and Communication Engineering, National Institute of Technology Karnataka, Mangalore, Karnataka 575025, India
2 Aerial Services and Digital Mapping, National Remote Sensing Centre, Indian Space Research Organisation, Balanagar, Hyderabad, Telangana 500037, India
3 Forest Biodiversity and Ecology Division, National Remote Sensing Centre, Indian Space Research Organisation, Balanagar, Hyderabad, Telangana 500037, India

1 Introduction
Given two co-registered images taken at different times, change detection (CD) identifies the changed area, e.g., the evolution of plants or urban mutations.

There are two types of CD: binary and semantic change detection. Binary CD assigns a binary label to each pixel of a pair of images taken at different times: a positive label means the area related to that pixel has changed, and a negative label means there is no change at that pixel. Semantic CD, in contrast, identifies what change has happened at each location.

The Copernicus and Landsat programs have made an enormous amount of Earth observation imagery available, which can now be used in advanced supervised machine learning algorithms that have become very popular in the past decade, mainly in image processing. It is crucial to find the best possible ways to use the available data. A lot of data is available in this field, but annotated datasets are scarce. As a result, the complexity of the models that may be employed is limited. Nevertheless, many datasets, like the Onera Satellite Change Detection dataset published in [1] and the Air Change dataset presented in [2], may be utilized to train supervised machine learning algorithms that can identify the change in image pairs.

The major contributions of this research paper are as follows:

1. Introduced a new Multiple Feature Extraction (MFE) block, which extracts multi-scale features to detect small change areas, which are crucial for accurate change detection from satellite images.
2. Developed a novel BCDetNet model by integrating the newly introduced Multiple Feature Extraction block with the Siam-diff model and an attention mechanism [3].
3. The proposed BCDetNet model performs better on two widely used CD datasets than the Siam-diff model and existing deep learning change detection models, as is evident from the experimental results.

The rest of the paper is organized as follows: Sect. 2 describes the related work. Section 3 gives a detailed description of the proposed work. Section 4 explains the CD datasets used and the implementation details. Section 5 presents the ablation study. Section 6 depicts the experimental results and the computational complexity of BCDetNet and other CD architectures. Section 7 concludes the work.

2 Related work

This section gives a summary of the background study and prerequisites for work on CD technology. Many researchers have worked on approaches that deal with weakly supervised learning based on robust statistics and dedicated mathematical modelling. The popular solution in this field is the formulation of robust loss functions and the improvement of deep learning models. In this section, the literature explored during the study of land cover CD is given. In the discipline of remote sensing, CD is a critical task, and various CD approaches have been developed for remote sensing using satellite images. To discover differences across remotely sensed images, manual approaches were initially used, but their main disadvantage was that they were time-consuming. There are now several supervised and unsupervised CD techniques in the literature, like graphical models [4, 5], principal component analysis [6], Markov Random Fields [7], and kernels [8]. Due to advancements in machine learning, many neural network-based strategies have been developed in recent years [9–12], which are capable of successfully addressing CD issues. Most image analysis issues have lately been dominated by more complex machine learning approaches (deep learning), and this progress is steadily reaching the challenge of change detection [13–15].

Since the development of AlexNet [16] and its first-place win in the 2012 ImageNet Large Scale Visual Recognition Challenge, Convolutional Neural Network (CNN) architectures for CD applications have been investigated by a number of researchers. The most current techniques can be divided into patch-based approaches, which use different types of CNN architectures to classify an image patch as changed or unchanged, and semantic segmentation approaches, which conduct semantic segmentation across the full image. Because a single training image can yield numerous training patches, patch-based techniques overcome the paucity of training data. Patch-based techniques, on the other hand, operate in a sliding-window mode, which is both sluggish and wasteful because the same locations are visited several times (patches that correspond to neighbouring centre pixels have a lot of overlap).

Because of the minimal quantity of data available, most of these methods rely on transfer learning techniques. Almost all such networks, for example, were trained on RGB images and therefore cannot be applied to SAR or multispectral images, as in the case of the dataset reported in [1]. These techniques also prevent end-to-end training, which has been shown to produce superior results for properly trained systems. As a result, the focus of this study is on algorithms that can learn purely from accessible change detection data and can therefore be applied to any dataset.

It is not a novel concept to use machine learning to compare images. CNNs are a class of image-processing algorithms that have been used to compare images in a variety of scenarios [17–19]. For issues involving dense prediction, such as pixel-level prediction, fully convolutional architectures (FCNNs) have been proposed [20–22] in Earth observation situations [23]. The use of such concepts in Earth observation, as well as their dominance over superpixel-based,
patch-based, and other techniques, has already been investigated [24]. Siamese models have also been proposed in several situations for the purpose of image comparison [18, 25]. With the advent of deep learning research, many novel models have been applied to change detection tasks in recent years. Fully convolutional networks (FCNs) are among the commonly used structures [26]. Deep learning approaches seek to learn or transform abstract features from bi-temporal images into a common feature space where their information is consistent and comparable. A symmetric UNet network [27] was proposed for landslide mapping, in which a pyramid pooling module is used to obtain multiscale change information. Peng et al. [28] proposed a UNet++ network with multiple side-output fusion. In [29], three UNet-based fully convolutional (FC) networks are presented: FC-Early Fusion (FC-EF), FC-Siamese-Concatenation (FC-Siam-conc), and FC-Siamese-Difference (FC-Siam-diff). FC-EF used the early fusion strategy, whereas FC-Siam-conc and FC-Siam-diff used the late fusion strategy, with FC-Siam-conc fusing the features through concatenation and FC-Siam-diff fusing the features through difference. Although these methods have proven effective in CD, they lack global feature extraction. They place little emphasis on spatial context information and the internal relationship between high- and low-level features. The obtained features are typically sensitive to noise, angle, shadow, and context, making them less robust to pseudo-changes. As a result, many improved algorithms have been proposed to better encode image context in the time-space dimension and improve feature discrimination ability, such as stacking more convolution layers [30] and using the attention mechanism [31–34].

The attention mechanism, which includes spatial attention (SA), channel attention, positional attention, and self-attention, can automatically weight the feature map, enhancing the changed features and weakening the unchanged features [35, 36]. Peng et al. [37] developed a model that captures object change features by introducing spatial and channel attention. Currently, attention-based methods have high computational complexity. This article presents a simple yet effective deep learning architecture, BCDetNet, for building change detection from high-resolution satellite images. The proposed model extends the FC-Siam-diff model with an attention module and an MFE block that can detect small change areas. The proposed model performs better than the benchmark models with fewer parameters, thus reducing computational complexity.

3 Proposed architecture

This section describes the proposed BCDetNet deep learning model for building object change detection from bi-temporal high-resolution satellite images. The proposed BCDetNet model is an extension of the Fully Convolutional Siamese difference model (FC-Siam-diff) [29]. It consists of the MFE block, the encoder-decoder block, and the attention mechanism block. The main contribution of this work is the introduction of the new MFE (Multi Feature Extraction) block, which is used as a feature extractor. Instead of feeding the input image directly to the modified FC-Siam-diff architecture (a combination of the FC-Siam-diff architecture and an attention module), the MFE block extracts features from different fields of view, enabling attention to both small and large land-cover changes. The schematic of the overall network architecture of BCDetNet is shown in Fig. 1.
Fig. 1 Schematic of the overall network architecture of BCDetNet. Block color legend: green is Convolution, purple is Transpose Convolution, yellow is the Final layer, and dotted lines show shared weights. E1 to E4 are encoder layers 1 to 4; D1 to D4 are decoder layers 1 to 4 (color figure online)
For instance, in Fig. 1 the notation '20 → 16 → 16' in E1 indicates that 20 feature maps are provided as input to the layer, and after the convolution operations there are 16 feature maps as output. This convention applies to all layers, where the first number represents the number of input feature maps and the last number represents the number of output feature maps after the convolution operation. In D1, the notation 'up 2, → 128 → 128' means that the feature maps are upsampled by a factor of 2, resulting in 128 input feature maps; after the convolution operation, there are 128 output feature maps.

The research aims to develop FCNN architectures that can learn to identify changes solely from change detection datasets, with no pretraining or transfer learning from other datasets. Unlike most recent work on change detection, these designs can be trained end to end. They evolved from the patch-based technique provided by Daudt et al. [29]; moving from patch-based architectures to a fully convolutional method improves speed and prediction accuracy without affecting training time, and these fully convolutional networks can process inputs of any size.

Two MFE blocks in the proposed architecture aim to extract features from different areas of the bi-temporal image, and an attention mechanism [3] is used to improve the performance. The goal is to extract important features using the MFE block and then combine the encoded information's more abstract and less localised information with the spatial details available in the network's earlier layers to create accurate class predictions with precise bounds in the output image.

3.1 Multi feature extracting (MFE) block

The input 1 and input 2 images are applied to the two MFE blocks. The MFE block is illustrated in Fig. 2; it uses two different filters of size 3 × 3 and 5 × 5 with ten channels each. After the convolution operation with filters of various sizes, the resulting multi-scale features are concatenated. The concatenated features are passed through a ReLU activation layer to speed up learning and produce more accurate results. The primary purpose of this block is to extract multi-scale features from smaller and larger change areas of the input image. The multi-scale features extracted by this block are given to the encoder unit. Let I be the input image with size H × W × 3. After convolution with the two filters of size 3 × 3 and 5 × 5 with ten channels each, and concatenation, the size of the extracted multi-scale feature f is H × W × 20. The MFE block operation is shown in Eq. 1, where φ is the ReLU activation function, w_{3×3} and w_{5×5} are kernels, and ∗ is the convolution operation:

f = φ{(I ∗ w_{3×3}) + (I ∗ w_{5×5})}    (1)
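For concreteness, a minimal TensorFlow/Keras sketch of such a block is given below. This is an illustrative reconstruction from the description above rather than the authors' released implementation, and the helper name mfe_block is ours:

```python
import tensorflow as tf
from tensorflow.keras import layers

def mfe_block(x):
    """Multiple Feature Extraction: parallel 3x3 and 5x5 convolutions
    (10 channels each), concatenated and passed through ReLU, so an
    H x W x 3 input yields an H x W x 20 multi-scale feature map."""
    f3 = layers.Conv2D(10, 3, padding="same")(x)   # 3x3 branch: local detail
    f5 = layers.Conv2D(10, 5, padding="same")(x)   # 5x5 branch: wider context
    f = layers.Concatenate(axis=-1)([f3, f5])      # multi-scale stack (20 maps)
    return layers.ReLU()(f)

inp = layers.Input(shape=(None, None, 3))          # any H x W, per the paper
out = mfe_block(inp)
print(tf.keras.Model(inp, out).output_shape)       # (None, None, None, 20)
```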
3.2 Encoder unit

As in [29], the encoder unit is divided into two streams with identical structures that share weights. Let f1 and f2 be the multi-scale features extracted from the two MFE blocks with input 1 (I1) and input 2 (I2), respectively; each is assigned to one of these streams. Like the FC-Siam-diff network, the encoder is composed of convolutional and pooling layers. Each stream of the encoder includes four blocks of convolutions, each followed by a 2 × 2 maxpool. The first and second layers of the encoder contain two 3 × 3 convolutions with 16 and 32 channels, respectively, followed by maxpool. The third and fourth layers of the encoder contain three 3 × 3 convolutions with 64 and 128 channels, respectively, followed by maxpool.
Four max pooling layers are used in stream 1, and three max pooling layers are used in stream 2 of the encoder unit. Because the primary goal of CD techniques is to identify differences between two images, the absolute value of the difference between the features learned at each encoder layer is concatenated at the corresponding decoder layer. Equations 2 to 5 denote the operations performed in the encoder part, where enc_{s1} and enc_{s2} are the features of encoder stream 1 and encoder stream 2, respectively, and η denotes the convolution and max pooling operations performed in each stream:

f1 = φ{(I1 ∗ w_{3×3}) + (I1 ∗ w_{5×5})}    (2)

f2 = φ{(I2 ∗ w_{3×3}) + (I2 ∗ w_{5×5})}    (3)

enc_{s1} = η{f1}    (4)

enc_{s2} = η{f2}    (5)

3.3 Decoder unit

The decoder part is responsible for projecting the learned features onto the pixel space. The decoder receives the features from the encoder unit. The channel-wise concatenation of the features from the corresponding stages in the encoder part is used here, as in the FC-Siam-diff model [29]. The decoder includes four upsampling layers. The input signal is upsampled by a factor of two before going through a 2 × 2 convolution step. The absolute value of the difference between the features from the corresponding stages of the encoder streams is then concatenated with this signal. The feature map is passed through a series of 3 × 3 convolutions and an attention module [3]. These operations are repeated after each upsampling layer, with five, five, and four 3 × 3 convolutions used after the first, second, and third upsampling layers, respectively. The resulting feature from the fourth decoder stage is fed into the fourth attention module. The decoder unit operation is shown in Eq. 6, where Ω represents the upsampling and convolution operations performed in the decoder unit and dec is the output obtained at each decoding stage:

dec = Ω{|enc_{s1} − enc_{s2}|}    (6)
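One decoder stage with this Siamese-difference skip connection can be sketched in Keras as follows. This is our own illustrative reconstruction under the layer counts stated above; the helper name up_block and the ReLU activations are assumptions, not details taken from the paper:

```python
import tensorflow as tf
from tensorflow.keras import layers

def up_block(x, skip_a, skip_b, n_filters, n_convs):
    """One decoder stage: upsample by 2, apply a 2x2 convolution,
    concatenate the absolute difference of the two encoder streams'
    features at this scale, then run a series of 3x3 convolutions."""
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(n_filters, 2, padding="same", activation="relu")(x)
    diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([skip_a, skip_b])
    x = layers.Concatenate(axis=-1)([x, diff])     # difference skip (Eq. 6)
    for _ in range(n_convs):                       # 5, 5, 4 convs per stage
        x = layers.Conv2D(n_filters, 3, padding="same", activation="relu")(x)
    return x
```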
3.4 Adaptive attention fusion module

Nowadays the attention mechanism is becoming increasingly popular in the field of deep learning, and in recent years many researchers have added different types of attention modules and compared spatial and channel attention modules [38]. This article adds an attention mechanism to the proposed CD model. The network architecture of the adaptive attention fusion module [3] used in this article is shown in Fig. 3.

Spatial attention contributes to increasing the distance between changed and unchanged pixels. Channel attention's job is to boost channels connected with changes in ground features and to block channels that are not relevant. Not all high-dimensional characteristics are useful for difference discrimination in the change detection phase [39, 40], and irrelevant features may make change detection more difficult. The attention mechanism presented in [3] is an adaptive attention fusion module, which helps to enhance useful information and suppress irrelevant information using a dual-stream attention mechanism.

Channel attention module (CAM) As shown in Fig. 3, first average pooling is applied on the concatenated input features
(F) of dimension H × W × C. To construct a C × 1 × 1 vector, the elements of each channel are averaged, where C is the number of channels. Then a one-dimensional convolution is performed on the vector with a kernel of size k1. As in Fig. 3, the result of the convolution is normalised to a weight coefficient as in [41]. Each element of the obtained result is then multiplied with each spatial element of the original feature to obtain the globally enhanced channel attention feature matrix (M_C), which has the expression in Eq. 7:

M_C = σ(conv1d(Avgpool(F))) ⊗ F    (7)

Here F is the input to the CAM (the merged feature matrix), and σ(.) is the sigmoid activation function as in Eq. 8:

σ(X_in) = 1 / (1 + e^{−X_in})    (8)

An adaptive technique is used to determine the size of the one-dimensional convolution kernel k1 based on the number of channels C; the relationship between k1 and C is described in Eq. 9.

Here F is the input to the SAM module, and σ(.) is the sigmoid activation function as defined in Eq. 8. The 2D convolution kernel size k2 is determined in the same way as k1: an adaptive value determination technique depending on the size of the input feature matrix (W and H) is used. Because the images in the datasets we have used are square (1024 × 1024 × 3 in LEVIR-CD and 512 × 512 × 3 in the WHU building dataset), W and H are equal, so we can take W = H. The functional link between W and k2 is constructed as in Eq. 13:

W = g(k2)    (13)

Because the size of the input bi-temporal images used in our study is 1024 × 1024 in one dataset and 512 × 512 in the other, the size of the feature matrix at the various stages is always an exponential power of 2. k2 is calculated in the same way as k1, as in Eq. 14:

k2 = Mod((log(W) + b) / a)_odd    (14)
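A minimal sketch of such a dual-stream attention fusion is given below. It is an illustrative reconstruction, not the exact module of [3]: the adaptive odd kernel sizes k1 and k2 are assumed to follow an ECA-style log-law in the spirit of [42] (the constants a and b of Eqs. 9 and 14 are left as parameters), statically known feature shapes are assumed, and the function names are ours:

```python
import math
import tensorflow as tf
from tensorflow.keras import layers

def odd_kernel(n, a=2, b=1):
    """Adaptive odd kernel size from a log-law (assumed ECA-style mapping,
    standing in for the unspecified constants of Eqs. 9 and 14)."""
    k = int(abs((math.log2(n) + b) / a))
    return k if k % 2 == 1 else k + 1

def attention_fusion(f):
    """Dual-stream attention: channel attention (Eq. 7) followed by spatial
    attention, each a sigmoid gate multiplied back onto the features."""
    c, w = int(f.shape[-1]), int(f.shape[1])
    # Channel stream: average pool -> 1D conv (kernel k1) -> sigmoid -> scale
    ca = layers.GlobalAveragePooling2D()(f)                  # (B, C)
    ca = layers.Conv1D(1, odd_kernel(c), padding="same")(
        layers.Reshape((c, 1))(ca))                          # (B, C, 1)
    ca = tf.sigmoid(layers.Reshape((1, 1, c))(ca))
    f = f * ca                                               # Eq. 7 gating
    # Spatial stream: channel-mean map -> 2D conv (kernel k2) -> sigmoid -> scale
    sa = tf.reduce_mean(f, axis=-1, keepdims=True)           # (B, H, W, 1)
    sa = layers.Conv2D(1, odd_kernel(w), padding="same",
                       activation="sigmoid")(sa)
    return f * sa
```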
images because of the time span of 5–14 years. LEVIR-CD covers many structures, including villas, high-rise flats, tiny garages, and extensive warehouses. The dataset contains 637 images, divided into: (1) training set, 445 images; (2) validation set, 64 images; (3) test set, 128 images.

WHU-CD [44] This dataset is a sub-dataset of an aerial image dataset. There are a total of two aerial images, which are sub-divided. The aerial dataset contains almost 220,000 individual buildings extracted from aerial photographs of Christchurch, New Zealand, with a spatial resolution of 0.075 m and a coverage area of 450 km². The WHU-CD dataset consists of a single image of size 15,354 × 32,507. We created patches of size 512 × 512, which resulted in a total of 828 images, divided into: (1) training set, 580 images; (2) validation set, 82 images; (3) test set, 166 images.
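As an aside, the 512 × 512 patching step described above can be sketched as follows. This is our own illustration with a small stand-in scene; non-overlapping tiling that drops partial border tiles is assumed, since the paper does not state the tiling strategy:

```python
import numpy as np

def extract_patches(image: np.ndarray, size: int = 512):
    """Tile a large (H, W, 3) scene into non-overlapping size x size
    patches, dropping any partial tiles at the borders."""
    h, w = image.shape[:2]
    return [image[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]

scene = np.zeros((1024, 1536, 3), dtype=np.uint8)   # small stand-in scene
print(len(extract_patches(scene)))                   # 2 * 3 = 6 tiles
```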
Table 1 gives the summary of the datasets.

4.2 Implementation details

The proposed model is implemented on an NVIDIA Quadro RTX4000 GPU with 8 GB of onboard memory. The model is built with TensorFlow 2.7 and the Keras API framework. The ADAM optimizer (with momentum parameters β1 = 0.9, β2 = 0.9999, epsilon = 10⁻⁷) is used with an initial learning rate of 0.0001. During training, if the validation accuracy does not improve for five epochs, the learning rate is automatically divided by 2. If the validation accuracy still does not improve after this reduction, training is stopped automatically after 20 epochs without improvement. BCDetNet and the other benchmark models are trained for 30 epochs. The parameters used to set up the neural network are given in Table 2.
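In Keras terms, this training schedule corresponds roughly to the following sketch. It is a reconstruction of the stated settings, not the authors' code; model, train_ds, and val_ds are assumed to exist:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(
    learning_rate=1e-4, beta_1=0.9, beta_2=0.9999, epsilon=1e-7)

callbacks = [
    # halve the learning rate when validation accuracy stalls for 5 epochs
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_accuracy", factor=0.5, patience=5),
    # stop if validation accuracy still fails to improve for 20 epochs
    tf.keras.callbacks.EarlyStopping(
        monitor="val_accuracy", patience=20),
]

# loss placeholder; replaced by the weighted loss of Sect. 4.3
model.compile(optimizer=optimizer, loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=callbacks)
```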
4.3 Loss function

Weighted class categorical cross entropy (wcce) is the loss function used here. For an image with b1 × b2 pixels and C classes, wcce is defined as in Eq. 16:

l_wcce(y′, y) = −(1/C) Σ_{i=1..b1, j=1..b2, c=1..2} w_c · Y_{ijc} · log(P_{ijc})    (16)

where P, Y ∈ (0, 1)^{b1,b2,C}, P_{ijc} is the predicted probability that pixel (i, j) belongs to class c, and w_c is the weight of each class. Y_{ijc} = 1 if pixel (i, j) belongs to class c, and Y_{ijc} = 0 otherwise.
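A minimal TensorFlow sketch of such a weighted categorical cross-entropy is given below. It is our own reading of Eq. 16 (averaging over pixels rather than applying the printed 1/C factor), and the class weights are assumed to be user-chosen:

```python
import tensorflow as tf

def weighted_cce(class_weights):
    """Weighted categorical cross-entropy over one-hot pixel labels
    (y_true, y_pred of shape (B, H, W, C)), following Eq. 16 up to the
    normalisation constant."""
    w = tf.constant(class_weights, dtype=tf.float32)      # shape (C,)

    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)      # avoid log(0)
        per_pixel = -tf.reduce_sum(w * y_true * tf.math.log(y_pred), axis=-1)
        return tf.reduce_mean(per_pixel)                  # mean over pixels
    return loss

# e.g. up-weight the (rarer) "changed" class relative to "unchanged"
loss_fn = weighted_cce([0.5, 2.0])
```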
4.4 Evaluation metrics

The following metrics are used to assess the performance of BCDetNet and the benchmark models used for comparison:

1. Jaccard coefficient [45] This is a widely used approach for determining the overlap between two sets and a measure of how similar or different binary data are. In the instance of binary change detection from satellite photos in a deep learning framework, the Jaccard coefficient
η = [(TP + FP) × (TP + FN) + (TN + FP) × (TN + FN)] / (TP + TN + FP + FN)²    (27)
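For reference, the Kappa coefficient follows from these confusion-matrix counts as sketched below. This is our own illustration, with η the chance agreement of Eq. 27:

```python
def kappa(tp, tn, fp, fn):
    """Cohen's kappa from confusion-matrix counts: observed agreement
    versus the chance agreement eta of Eq. 27."""
    total = tp + tn + fp + fn
    p_o = (tp + tn) / total                                  # observed accuracy
    eta = ((tp + fp) * (tp + fn) + (tn + fp) * (tn + fn)) / total**2
    return (p_o - eta) / (1 - eta)

print(round(kappa(tp=90, tn=880, fp=10, fn=20), 3))  # 0.84 on these toy counts
```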
5 Ablation study

First, the base model alone is tested, and the test results are tabulated in the second column of Tables 3 and 4. Secondly, the base model with the attention module is tested, and the results are tabulated in the third column of Tables 3 and 4. Finally, the base model with the attention unit and MFE is tested, and the results are tabulated in the fourth column of Tables 3 and 4. In BCDetNet, all performance metrics are improved compared with the base model alone or with the base model plus the attention module. For a better understanding of the model, the features extracted at the different stages are shown in Fig. 5.

Table 3 Ablation study of BCDetNet with simulated data experiments on the LEVIR-CD dataset (the BCDetNet column gives the best value for every quality metric)

                      Base model   Base model with attention   BCDetNet
MFE                   ×            ×                           ✓
Attention             ×            ✓                           ✓
Accuracy              0.9828       0.9848                      0.9873
Recall                0.9165       0.8375                      0.9406
Precision             0.7829       0.8799                      0.9300
F1-Score              0.8444       0.86165                     0.9352
Kappa coefficient     0.8385       0.8502                      0.8705
Jaccard Score         0.7349       0.8023                      0.8840
Trainable parameters  1,238,914    1,238,914                   1,508,578

Table 4 Ablation study of BCDetNet with simulated data experiments on the WHU-CD dataset (the BCDetNet column gives the best value for every quality metric)

                      Base model   Base model with attention   BCDetNet
MFE                   ×            ×                           ✓
Attention             ×            ✓                           ✓
Accuracy              0.9501       0.9606                      0.9678
Recall                0.7116       0.7023                      0.8951
Precision             0.7808       0.8436                      0.9278
F1-Score              0.7396       0.8722                      0.9106
Kappa coefficient     0.4805       0.8121                      0.8212
Jaccard Score         0.6336       0.7834                      0.8438
Trainable parameters  1,238,914    1,238,914                   1,508,578

Tables 3 and 4 show the ablation study of BCDetNet; from the tables it is evident that BCDetNet performs better than the base model alone and the base model plus the attention module. By incorporating the MFE block, the performance on the LEVIR-CD dataset improved significantly, as evidenced by a 7.36% increase in F1-score, a 2.03% improvement in Kappa coefficient, and an 8.17% increase in Jaccard score compared to the modified FC-Siam-diff architecture. On the WHU-CD dataset, the MFE block led to a 3.84% increase in F1-score, a 0.9% increase in Kappa coefficient, and a 6.04% improvement in Jaccard score compared to the modified FC-Siam-diff architecture. The 5 × 5 convolution in the MFE block captures high-level context information, while the 3 × 3 convolution focuses on local details to predict small changes.

As illustrated in the feature map visualization of Fig. 5, we can see the features extracted at the different stages of the proposed model. Figure 5d shows the output of the MFE block, which is used to extract multi-scale features from smaller and larger change areas of the input image. When the output of the MFE is given as input to the model with the encoder, decoder, and attention unit, the results are excellent, as is evident from the predicted output shown in Fig. 5h and the results tabulated in Tables 3 and 4.
Fig. 5 Feature maps visualization. a Input 1. b Input 2. c Label. d After MFE block. e Before final layer of encoder. f After encoder. g Before final layer of decoder. h Final output of BCDetNet

6 Experimental results
Table 5 compares the proposed BCDetNet with existing state-of-the-art models on the LEVIR-CD dataset. The accuracy of the proposed model is 98.73%, whereas for AGCDetNet and ADS-Net it is 98.38% and 98.45%, respectively. The recall metric for BCDetNet is much greater than for any other benchmark model, and it is higher by 2.5% than the FC-Siam-diff model, which has the highest value among the benchmark models. U-Net++ has the highest precision among the benchmark models, and BCDetNet has a slightly higher value than U-Net++. AGCDetNet has the highest Jaccard score among the existing state-of-the-art models, but BCDetNet has a 5% higher value than AGCDetNet. The F1-score is best for BCDetNet, which is 9% higher than the FC-Siam-diff model, the highest among the benchmark models. FC-Siam-diff has the best Kappa coefficient value among the benchmark models, whereas BCDetNet has a 3.20% higher value than FC-Siam-diff. FC-Siam-diff has fewer parameters than the proposed model, but BCDetNet outperforms FC-Siam-diff in all quality metrics with only slightly more parameters. The effectiveness of BCDetNet is evident in Table 5 for the LEVIR-CD dataset. AGCDetNet and ADS-Net have good accuracy among the benchmark models with more than 60 and 2.5 million parameters, respectively. BCDetNet gives higher accuracy than AGCDetNet and ADS-Net with around 1.5 million parameters, which shows the effectiveness of BCDetNet in terms of computational complexity. Figure 8 shows the benchmark and BCDetNet prediction results on the LEVIR-CD dataset. From Fig. 8 it is evident that the BCDetNet prediction results are much better than the benchmark model prediction results.

Table 6 shows the benchmark and BCDetNet quality metrics on the WHU-CD dataset. All quality metrics are increased in the proposed model compared to any benchmark model. From Table 6, it is evident that the number of parameters in BCDetNet is around 1.5 million, which is the second lowest; at the same time, BCDetNet gives the best result in every quality metric compared to any other benchmark model. Among all the benchmark models, AGCDet-Net gives the best results but contains the highest number of parameters, more than 60 million, compared to BCDetNet, which has only 1.5 million parameters and gives better results than AGCDet-Net. The percentage improvements in the quality metrics over the best benchmark model are Recall 12.36%, Precision 1.13%, F1-Score 8.74%, and Kappa 12.65%.

Figure 9 shows the benchmark and BCDetNet models' prediction results on the WHU-CD dataset. From Fig. 9 it is evident that the BCDetNet model prediction results are much better than the benchmark model prediction results.
Table 6 Benchmark and BCDetNet models' quality metrics on the WHU-CD dataset (columns: U-Net (2015), FC-Siam-diff (2018), U-Net++ (2019), AGCDet-Net (2021), ADS-Net (2021), Proposed BCDetNet)
6.2 Computation complexity study

The number of floating-point operations (FLOPs) and the total number of trainable parameters required to run the model are used to determine a model's complexity [47]. Tables 7 and 8 show the computation complexity study of BCDetNet and the other benchmark models used for comparison on the LEVIR-CD and WHU-CD datasets, respectively. Except for FC-Siam-diff, BCDetNet has the fewest parameters of the existing benchmark models used here. BCDetNet requires 38.74 billion FLOPs and just above 1.5 million parameters. The highest numbers of FLOPs and parameters are found in AGCDet-Net. ADS-Net utilizes the fewest FLOPs, and FC-Siam-diff uses the fewest parameters. The training time and prediction time per image are also measures of a model's complexity [47]. Except for FC-Siam-diff, BCDetNet takes the least training time and prediction time per image of the existing benchmark models used here on the LEVIR-CD dataset. AGCDet-Net takes the highest training time of 2.85 h and 0.30 s of prediction time per image on the LEVIR-CD dataset. Even in the
Fig. 8 Benchmark and BCDetNet prediction results on the LEVIR-CD dataset (rows: Input Image-2, Ground Truth, ADS-Net, AGCDet-Net, FC-Siam-diff, U-Net, U-Net++, Proposed BCDetNet)
Fig. 9 Benchmark and BCDetNet prediction results on the WHU-CD dataset (rows: Input Image-2, Ground Truth, ADS-Net, AGCDet-Net, FC-Siam-diff, U-Net, U-Net++, Proposed BCDetNet)
7 Conclusion

The proposed BCDetNet model extracts multi-scale features from the input image's smaller and larger change areas and gains contextual intelligence. During training, the proposed model can learn from scratch, and it outperforms benchmark models with fewer parameters. The proposed BCDetNet architecture performs better than the existing change detection deep learning models. The proposed model boosts the quality metrics Recall (2.41%), Jaccard score (5.15%), F1-score (9.08%), and Kappa coefficient (3.2%) compared to any benchmark model on the LEVIR-CD dataset. There is an improvement of Recall (12.36%), Precision (1.13%), Jaccard score (11.75%), F1-score (8.74%), and Kappa coefficient (12.36%) compared to any benchmark model on the WHU-CD dataset. The proposed BCDetNet model, with around 1.5 million parameters, performs better in all quality metrics than U-Net, U-Net++, AGCDet-Net, and ADS-Net with 31.04, 48.82, 60.20, and 2.57 million parameters, respectively. This work's limitation is that the number of parameters is slightly higher than that of the FC-Siam-diff model. Future work will include expanding this work to semantic change detection and developing a mechanism to reduce the number of parameters while retaining or further improving the quality metrics.
Funding This research work was supported by the RESPOND scheme of the Indian Space Research Organisation (ISRO), Govt. of India under Grant No. ISRO/RES/4/683/19-20, December 30, 2019.

Data availability The implementation code and datasets used during the current study are available from the corresponding author on reasonable request.

Declarations

Conflict of interest No conflict of interest exists in the submission of this manuscript, and the manuscript is approved by all authors for publication.

References

1. Daudt RC, Le Saux B, Boulch A, Gousseau Y (2018) Urban change detection for multispectral earth observation using convolutional neural networks. In: IGARSS 2018 - 2018 IEEE international geoscience and remote sensing symposium, pp 2115–2118. https://doi.org/10.1109/IGARSS.2018.8518015
2. Benedek C, Szirányi T (2009) Change detection in optical aerial images by a multilayer conditional mixed Markov model. IEEE Trans Geosci Remote Sens 47(10):3416–3430
3. Wang D, Chen X, Jiang M, Du S, Xu B, Wang J (2021) ADS-Net: an attention-based deeply supervised network for remote sensing image change detection. Int J Appl Earth Obs Geoinf 101:102348
4. Vakalopoulou M, Karatzalos K, Komodakis N, Paragios N (2015) Simultaneous registration and change detection in multitemporal, very high resolution remote sensing data. In: 2015 IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 61–69. https://doi.org/10.1109/CVPRW.2015.7301384
5. Vakalopoulou M, Platias C, Papadomanolaki M, Paragios N, Karantzalos K (2016) Simultaneous registration, segmentation and change detection from multisensor, multitemporal satellite image pairs. In: 2016 IEEE international geoscience and remote sensing symposium (IGARSS), pp 1827–1830. https://doi.org/10.1109/IGARSS.2016.7729469
6. Deng J, Wang K, Deng Y, Qi G (2008) PCA-based land-use change detection and analysis using multitemporal and multisensor satellite data. Int J Remote Sens 29(16):4823–4838
7. Singh P, Kato Z, Zerubia J (2014) A multilayer Markovian model for change detection in aerial image pairs with large time differences. In: 2014 22nd international conference on pattern recognition, pp 924–929. https://doi.org/10.1109/ICPR.2014.169
8. Volpi M, Tuia D, Camps-Valls G, Kanevski M (2011) Unsupervised change detection in the feature space using kernels. In: 2011 IEEE international geoscience and remote sensing symposium, pp 106–109. https://doi.org/10.1109/IGARSS.2011.6048909
9. Liu J, Gong M, Qin K, Zhang P (2018) A deep convolutional coupling network for change detection based on heterogeneous optical and radar images. IEEE Trans Neural Netw Learn Syst 29(3):545–559. https://doi.org/10.1109/TNNLS.2016.2636227
10. Gong M, Zhao J, Liu J, Miao Q, Jiao L (2016) Change detection in synthetic aperture radar images based on deep neural networks. IEEE Trans Neural Netw Learn Syst 27(1):125–138. https://doi.org/10.1109/TNNLS.2015.2435783
11. El Amin AM, Liu Q, Wang Y (2017) Zoom out CNNs features for optical remote sensing change detection. In: 2017 2nd international conference on image, vision and computing (ICIVC), pp 812–817. https://doi.org/10.1109/ICIVC.2017.7984667
12. Zhan Y, Fu K, Yan M, Sun X, Wang H, Qiu X (2017) Change detection based on deep Siamese convolutional network for optical aerial images. IEEE Geosci Remote Sens Lett 14(10):1845–1849. https://doi.org/10.1109/LGRS.2017.2738149
13. Stent S, Gherardi R, Stenger B, Cipolla R (2015) Detecting change for multi-view, long-term surface inspection. In: BMVC, pp 127-1
14. Liu J, Gong M, Qin K, Zhang P (2016) A deep convolutional coupling network for change detection based on heterogeneous optical and radar images. IEEE Trans Neural Netw Learn Syst 29(3):545–559
15. Gong M, Zhao J, Liu J, Miao Q, Jiao L (2015) Change detection in synthetic aperture radar images based on deep neural networks. IEEE Trans Neural Netw Learn Syst 27(1):125–138
16. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:84–90
17. Chopra S, Hadsell R, LeCun Y (2005) Learning a similarity metric discriminatively, with application to face verification. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), vol 1. IEEE, pp 539–546
18. Zagoruyko S, Komodakis N (2015) Learning to compare image patches via convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4353–4361
19. Stent S, Gherardi R, Stenger B, Cipolla R (2015) Detecting change for multi-view, long-term surface inspection. In: BMVC, pp 127-1
20. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
21. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241
22. Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PH (2016) Fully-convolutional Siamese networks for object tracking. In: European conference on computer vision. Springer, pp 850–865
23. Audebert N, Le Saux B, Lefèvre S (2018) Beyond RGB: very high resolution urban remote sensing with multimodal deep networks. ISPRS J Photogramm Remote Sens 140:20–32
24. Audebert N, Le Saux B, Lefèvre S (2017) Segment-before-detect: vehicle detection and classification through semantic segmentation of aerial images. Remote Sens 9(4):368
25. Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R (1993) Signature verification using a "Siamese" time delay neural network. Adv Neural Inf Process Syst 6
26. Papadomanolaki M, Vakalopoulou M, Karantzalos K (2021) A deep multitask learning framework coupling semantic segmentation and fully convolutional LSTM networks for urban change detection. IEEE Trans Geosci Remote Sens 59(9):7651–7668. https://doi.org/10.1109/TGRS.2021.3055584
27. Lei T, Zhang Y, Lv Z, Li S, Liu S, Nandi AK (2019) Landslide inventory mapping from bitemporal images using deep convolutional neural networks. IEEE Geosci Remote Sens Lett 16(6):982–986. https://doi.org/10.1109/LGRS.2018.2889307
28. Peng D, Zhang Y, Guan H (2019) End-to-end change detection for high resolution satellite images using improved UNet++. Remote Sens 11(11):1382. https://doi.org/10.3390/rs11111382
29. Caye Daudt R, Le Saux B, Boulch A (2018) Fully convolutional Siamese networks for change detection. In: 2018 25th IEEE international conference on image processing (ICIP), pp 4063–4067. https://doi.org/10.1109/ICIP.2018.8451652
30. Zhang M, Shi W (2020) A feature difference convolutional neural network-based change detection method. IEEE Trans Geosci Remote Sens 58(10):7232–7246. https://doi.org/10.1109/TGRS.2020.2981051
31. Zhang C, Yue P, Tapete D, Jiang L, Shangguan B, Huang L, Liu G (2020) A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images. ISPRS J Photogramm Remote Sens 166:183–200
32. Ding Q, Shao Z, Huang X, Altan O (2021) DSA-Net: a novel deeply supervised attention-guided network for building change detection in high-resolution remote sensing images. Int J Appl Earth Obs Geoinf 105:102591
33. Chen J, Yuan Z, Peng J, Chen L, Huang H, Zhu J, Liu Y, Li H (2021) DASNet: dual attentive fully convolutional Siamese networks for change detection in high-resolution satellite images. IEEE J Sel Top Appl Earth Obs Remote Sens 14:1194–1206. https://doi.org/10.1109/JSTARS.2020.3037893
34. Shi Q, Liu M, Li S, Liu X, Wang F, Zhang L (2022) A deeply supervised attention metric-based network and an open aerial image dataset for remote sensing change detection. IEEE Trans Geosci Remote Sens 60:1–16. https://doi.org/10.1109/TGRS.2021.3085870
35. Alimjan G, Jiaermuhamaiti Y, Jumahong H, Zhu S, Nurmamat P (2021) An image change detection algorithm based on multi-feature self-attention fusion mechanism UNet network. Int J Pattern Recognit Artif Intell 35(14):2159049
36. Diakogiannis FI, Waldner F, Caccetta P (2021) Looking for change? Roll the dice and demand attention. Remote Sens 13(18):3707
37. Peng X, Zhong R, Li Z, Li Q (2020) Optical remote sensing image change detection based on attention mechanism and image difference. IEEE Trans Geosci Remote Sens 59(9):7296–7307
38. Zhang C, Yue P, Tapete D, Jiang L, Shangguan B, Huang L, Liu G (2020) A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images. ISPRS J Photogramm Remote Sens 166:183–200. https://doi.org/10.1016/j.isprsjprs.2020.06.003
39. Saha S, Bovolo F, Bruzzone L (2019) Unsupervised deep change vector analysis for multiple-change detection in VHR images. IEEE Trans Geosci Remote Sens 57(6):3677–3693. https://doi.org/10.1109/TGRS.2018.2886643
40. Hu J, Shen L, Albanie S, Sun G, Wu E (2020) Squeeze-and-excitation networks. IEEE Trans Pattern Anal Mach Intell 42(8):2011–2023. https://doi.org/10.1109/TPAMI.2019.2913372
41. Bruzzone L, Bovolo F (2013) A novel framework for the design of change-detection systems for very-high-resolution remote sensing images. Proc IEEE 101(3):609–630. https://doi.org/10.1109/JPROC.2012.2197169
42. Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) ECA-Net: efficient channel attention for deep convolutional neural networks. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11531–11539. https://doi.org/10.1109/CVPR42600.2020.01155
43. Chen H, Shi Z (2020) A spatial-temporal attention-based method and a new dataset for remote sensing image change detection. Remote Sens 12(10):1662
44. Ji S, Wei S, Lu M (2019) Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Trans Geosci Remote Sens 57(1):574–586. https://doi.org/10.1109/TGRS.2018.2858817
45. Song K, Jiang J (2021) AGCDetNet: an attention-guided network for building change detection in high-resolution remote sensing images. IEEE J Sel Top Appl Earth Obs Remote Sens 14:4816–4831
46. Singh R, Rani R (2020) Semantic segmentation using deep convolutional neural network: a review. In: Proceedings of the international conference on innovative computing & communications (ICICC)
47. Basavaraju KS, Sravya N, Lal S, Nalini J, Reddy CS, Dell'Acqua F (2022) UCDNet: a deep learning model for urban change detection from bi-temporal multispectral Sentinel-2 satellite images. IEEE Trans Geosci Remote Sens 60:1–10. https://doi.org/10.1109/TGRS.2022.3161337

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.