
Multi-Scale Feature Based Land Cover Change Detection in Mountainous Terrain Using Multi-Temporal and Multi-Sensor Remote Sensing Images

This document describes a new method for detecting land cover changes using multi-temporal remote sensing images from different sensors. The method has three key contributions: 1) It uses a multi-scale feature descriptor generated from a pretrained VGG neural network to register images from different sensors onto the same coordinate system. 2) It employs a gradually increasing selection of inliers to robustly estimate correspondences and transformations during image registration. 3) It uses fuzzy C-means classification on the registered image pairs to generate a similarity matrix and detect land cover changes. The method is tested on multi-temporal image pairs from different satellites and unmanned aerial vehicles over mountainous terrain in southern China.


Received October 30, 2018, accepted November 17, 2018, date of publication November 29, 2018,

date of current version December 31, 2018.


Digital Object Identifier 10.1109/ACCESS.2018.2883254

Multi-Scale Feature Based Land Cover Change Detection in Mountainous Terrain Using Multi-Temporal and Multi-Sensor Remote Sensing Images

FEI SONG 1,2,3, ZHUOQIAN YANG 4, XUEYAN GAO 1,2,3, TINGTING DAN 1,2,3, YANG YANG 1,2,3, WANJING ZHAO 1,2,3, AND RUI YU 1,2,3
1 School of Information Science and Technology, Yunnan Normal University, Kunming 650500, China
2 Engineering Research Center of GIS Technology in Western China, Ministry of Education of the People’s Republic of China, Kunming 650500, China
3 Laboratory of Pattern Recognition and Artificial Intelligence, School of Information Science, Yunnan Normal University, Kunming 650500, China
4 College of Software, Beihang University, Beijing 100083, China

Corresponding author: Yang Yang ([email protected])


This work was supported in part by the National Natural Science Foundation of China under Grant 41661080 and in part by the Scientific
Research Foundation of Yunnan Provincial Department of Education under Grant 2018Y037.

ABSTRACT Land use and land cover (LULC) change is frequent in the mountainous terrain of southern China. Although remote sensing technology has become an important tool for gathering and monitoring LULC dynamics, image pairs can exhibit scale changes, noise, geometric distortions, and illumination variations when they are acquired by different types of sensors (e.g., satellites). Meanwhile, designing an efficient land cover change detection algorithm that ensures a high detection rate remains a critical and challenging step. To address these problems, we propose a robust multi-temporal change detection framework for land cover change in mountainous terrain, with the following contributions. i) To transform multi-temporal remote sensing image pairs acquired by different types of sensors into the same coordinate system by image registration, a multi-scale feature description is generated using layers of a pretrained VGG network. ii) A gradually increasing selection of inliers is defined to improve the robustness of feature point registration, and an L2-minimizing estimate (L2E)-based energy optimization is formulated to calculate a reasonable position in a reproducing kernel Hilbert space. iii) A fuzzy C-means classifier is adopted to generate a similarity matrix between the geometrically corrected image pair, and a robust and contractive change map is built through feature similarity analysis. Extensive experiments are conducted on multi-temporal image pairs taken by different types of satellites (e.g., Chinese GF and Landsat) or small unmanned aerial vehicles. Experimental results show that our method performs better in most cases than five state-of-the-art image registration methods and four state-of-the-art change detection methods.

INDEX TERMS LULC change, multi-scale feature description, inliers, L2E, fuzzy C-means classifier.

I. INTRODUCTION

Under the special natural conditions (e.g., overcast and foggy weather) and fragile ecological environment of the mountainous terrain of southern China, land use and land cover (LULC) change occurs frequently. In addition, land cover composition and its change play a crucial role in agricultural production, food security and sustainable development in mountainous terrain. Therefore, accurate and up-to-date information on land cover and its dynamic change is increasingly necessary at different spatial and temporal scales.

In recent years, many change detection methods [1]–[10], [13]–[15], [20], [26] have been developed to derive land cover change information from remote sensing images, such as the principal component analysis and k-means (PCA_Kmeans) clustering based change detection method [14], the spectral variance and slow feature analysis (SSFA) based change detection method [15], the local estimation and global search based deep network (LEGS) [4], and the semi-supervised fuzzy C-means (Semi_FCM) clustering based change detection method [5]. However, most of these methods only focus

2169-3536
2018 IEEE. Translations and content mining are permitted for academic research only.
77494 Personal use is also permitted, but republication/redistribution requires IEEE permission. VOLUME 6, 2018
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
F. Song et al.: Multi-Scale Feature Based Land Cover Change Detection in Mountainous Terrain

on the remote sensing images acquired by satellite sensors (e.g., Landsat, MODIS and SPOT-VGT), and their relatively low spatial resolution limits the identification of land cover, which is small in size and scattered in distribution in mountainous terrain. Compared with the above-mentioned methods, Wei et al. [9] and Milas et al. [11] capture more land distribution details than satellite remote sensing images by using a small unmanned aerial vehicle (UAV) with a small digital camera. Visual differences in camera viewpoint inevitably exist, even when the images are captured from the same location and matched using GPS data. Deep networks are robust (i.e., invariant) to differences in viewpoint and illumination conditions, yet remain sensitive to highly abstract, semantic differences between images. Specifically, the recently popular convolutional neural networks (CNNs) are particularly well suited to this task; Li and Yu [12] suggested that a high-quality visual saliency model can be learned from multi-scale features extracted using CNNs.

Moreover, these image pairs cannot be applied directly to identify regions of change, since scale changes, noise, geometric distortions, and discontinuous rotated images with illumination variations may also be produced in such multi-temporal images. The causes are as follows: (i) when a satellite revolves around its orbit, the acquired image can have geometric distortions due to the modeling inaccuracy of the sensor geometry and the jitter of the instrument platform during image acquisition; (ii) when collecting multi-temporal images of the same location, the imaging perspective of small UAVs is easily affected by wind speed and direction, complex terrain, aircraft posture (pitch, roll, yaw), flying height and other human factors. To effectively improve the matching degree between the image and the actual terrain, preprocessing of these image pairs is an essential step, i.e., an image registration method aligns image pairs of the same scene taken from different viewpoints, at different times or with different sensors. However, most current registration methods are only suitable for one type of sensor and are not sensitive enough to multi-temporal image pairs. Therefore, our goal focuses on multi-temporal remote sensing image pairs acquired by different types of sensors, and transforms them into the same coordinate system.

Numerous algorithms [21], [24], [25], [27]–[31] for different registration scenarios have been presented in the last few decades. The coherent point drift (CPD) algorithm for both rigid and non-rigid point set registration [21] treats one point set as the centroids of a Gaussian mixture model and then fits it to the other. It applies a fast Gaussian transform [22] and low-rank matrix approximation [23] techniques to reduce the large computational burden. Recently, in order to estimate the correspondence between two images, GLMDTPS [24] proposed a global and local mixture distance. PRGLS [25] formulated point registration as the estimation of a mixture of densities to preserve both global and local structures during matching. More recently, Zhang et al. [28], [29] introduced an effective method that maintains a high matching ratio on inliers while taking advantage of outliers for varying the warping grids.

In this paper, we present a robust change detection framework for monitoring land cover change in mountainous terrain with multi-temporal remote sensing images. In the preprocessing stage of change detection, a multi-scale feature based image registration method is proposed to align image pairs acquired by different types of sensors. Compared with current methods, the major contributions of our work include: (i) a multi-scale feature descriptor (MFD) constructed from a CNN-based feature descriptor (CFD) and shape context (SC), where the CFD is generated by layers of a pretrained VGG network; (ii) to estimate correspondences and transformations, a gradually increasing selection of inliers is used instead of a stationary distinction between inliers and outliers. At the early stage of registration, the rough transformation is quickly determined by the most reliable feature points, after which the registration details are optimized by increasing the number of feature points. Then, an L2-minimizing estimate (L2E) based energy optimization is formulated to calculate a reasonable position in a reproducing kernel Hilbert space; (iii) a fuzzy C-means classifier is adopted to generate a similarity matrix between transformed image pairs.

The rest of the paper is organized as follows. Section II introduces a novel deep learning based framework, which combines the CNN feature and the deep neural network (DNN), to detect land cover change in mountainous terrain. Section III demonstrates our experiments, and Section IV draws conclusions.

II. METHODOLOGY

In this section, we first give the details of three contributions:
• multi-scale feature description;
• dynamic inlier selection;
• fuzzy C-means classifier based pre-classification.
Second, we give the details of the proposed land cover change detection framework. Figure 1 shows the framework of the proposed method. Finally, our algorithm and parameter settings are discussed in the latter part of this section.

Let us consider an image pair $I_{t_1}$ and $I_{t_2}$, acquired over the same geographical area at two different times $t_1$ and $t_2$. The feature point sets A and B are first extracted from $I_{t_1}$ and $I_{t_2}$, respectively. Next, the transformed image $I_t$ is obtained by our registration algorithm. Note that $I'_{t_1}$ and $I'_{t_2}$ are obtained by an equal split of $I_t$ and $I_{t_2}$ according to a certain ratio. Finally, we input $I'_{t_1}$ and $I'_{t_2}$ into the model for change detection, and a change detection map $S_{map}$ is generated.

Throughout the paper we use the following notations:
• $A_{N\times D} = \{a_1, ..., a_N\}^T$, $B_{M\times D} = \{b_1, ..., b_M\}^T$ - feature point sets extracted from the image pair $I_{t_1}$ and $I_{t_2}$, respectively. D denotes the dimension of the feature point sets, and D = 2.
• $\tau$ - the transformation function.
• $B^*$ - transformed locations of the source point set B.
• $I_t$ - the transformed image.
• $I'_{t_1}$, $I'_{t_2}$ - obtained by an equal split of $I_t$ and $I_{t_2}$.
• $S_{map}$ - a change detection map.


FIGURE 1. Flowchart of the proposed land cover change detection framework, consisting of three main phases:
(1) a multi-scale feature description, (2) an effective registration processing, and (3) an effective change detection
processing. Note that correct feature point matches are denoted by yellow lines, incorrect ones are denoted by red lines.

A. MULTI-SCALE FEATURE DESCRIPTOR

Mountainous terrain is the major geomorphic structure in the south of China, with special natural conditions (e.g., overcast and foggy weather) and a fragile ecological environment. Therefore, it is sometimes hard to perform precise image registration, since images acquired by different types of sensors can aggravate the non-rigid geometric distortions of images. We use features extracted by convolutional neural networks (CNNs) to improve the feature expression.

1) CNN-BASED FEATURE DESCRIPTOR (CFD).

The CFD is constructed with a state-of-the-art CNN that uses the VGG-16 architecture and is pre-trained on the ImageNet dataset for image classification [32]. VGG-16 has sixteen layers (as shown in Figure 2), including 5 blocks of convolution computation, each with 2-3 convolution layers and a max-pooling layer at the end of each block, from which we select the pool3, pool4 and pool5_1 layers. We lay a 28 × 28 grid over the input image to divide it into patches, each corresponding to a 256-d vector in the pool3 output; a descriptor is generated in every 8 × 8 square. The center of each patch is regarded as a feature point. The 256-d vector is defined as the pool3 feature descriptor. The pool3 layer output directly forms our pool3 feature map $f_1$, which is of size 28 × 28 × 256. The pool4 layer output, which is of size 14 × 14 × 512, is handled slightly differently. In every 16 × 16 area we obtain a pool4 descriptor, and the pool4 feature map $f_2$


FIGURE 2. Architecture of the modified VGG-16 network. h and w denote the height and width of the input image, respectively. Since we only use convolution layers to extract features, the input image does not need to be resized, which preserves the features of the original image, as long as h and w are multiples of 32.

is written as:

$$f_2 = O_{p4} \otimes I_{2\times 2\times 1} \tag{1}$$

where $\otimes$ denotes the Kronecker product and $I_{2\times 2\times 1}$ is a tensor of the subscripted shape filled with 1s. Note that each pool4 descriptor in $f_2$ is shared by 4 feature points.

The pool5_1 layer output is of size 7 × 7 × 512, and the pool5_1 feature map $f_3$ takes the form:

$$f_3 = O_{p5\_1} \otimes I_{4\times 4\times 1} \tag{2}$$

Similarly, every pool5_1 descriptor is shared by 16 feature points. The distribution of feature descriptors is shown in Figure 3. After producing $f_1$, $f_2$ and $f_3$, the feature maps are normalized to unit variance using

$$f_k = \frac{f_k}{\sigma(f_k)}, \quad k = 1, 2, 3 \tag{3}$$

where $\sigma(\cdot)$ calculates the standard deviation of the elements in a matrix. The pool3, pool4 and pool5_1 descriptors of point sets A and B are then represented by $F_1(a)$, $F_2(a)$, $F_3(a)$ and $F_1(b)$, $F_2(b)$, $F_3(b)$, respectively.

The CFD is used to measure the feature distance between the two feature point sets A and B. It is a weighted sum of three distance values:

$$F_{cfd} = \sqrt{2}\,F_1(a, b) + F_2(a, b) + F_3(a, b) \tag{4}$$

where each component distance $F_k(a, b)$ is the Euclidean distance between the respective feature descriptors:

$$F_k(a, b) = \text{Euclidean-distance}(F_k(a), F_k(b)) \tag{5}$$

The distance computed with the pool3 descriptors, $F_1(a, b)$, is compensated with a weight $\sqrt{2}$ because $F_1$ is 256-d whereas $F_2$ and $F_3$ are 512-d. $C^{\theta}_{cfd}$ is the cost matrix of the CFD, with entries

$$C^{\theta}_{cfd}(m, n) = \begin{cases} \dfrac{F(b_m, a_n)}{F^{\theta}_{max}} & c_1, \\ 1 & \text{otherwise} \end{cases} \tag{6}$$

where $c_1$ denotes a valid match of $b_m$ and $a_n$ under threshold $\theta$, and $F^{\theta}_{max}$ is the maximum distance of all matched feature point pairs under threshold $\theta$.
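The descriptor construction of equations (1)-(5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, feature maps are assumed to be plain nested lists (a grid of descriptor vectors) rather than VGG-16 outputs, and the validity test of equation (6) is left out because its criterion is not specified in this section.

```python
import math


def kron_upsample(grid, factor):
    """Repeat every descriptor of a 2-D grid factor x factor times.

    Mirrors the Kronecker product with an all-ones tensor in equations
    (1) and (2): one coarse descriptor is shared by several feature
    points of the finer pool3 grid.
    """
    out = []
    for row in grid:
        wide = [vec for vec in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def normalize_unit_variance(grid):
    """Equation (3): divide a feature map by the standard deviation of
    all its elements (assumes the map is not constant)."""
    values = [x for row in grid for vec in row for x in vec]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [[[x / std for x in vec] for vec in row] for row in grid]


def euclidean(u, v):
    """Equation (5): Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cfd_distance(desc_a, desc_b):
    """Equation (4): sqrt(2) * F1 + F2 + F3.

    desc_a and desc_b are (pool3, pool4, pool5_1) descriptor triples of
    one feature point each.
    """
    d1, d2, d3 = (euclidean(fa, fb) for fa, fb in zip(desc_a, desc_b))
    return math.sqrt(2) * d1 + d2 + d3
```

With a 14 × 14 pool4 grid, `kron_upsample(grid, 2)` yields the 28 × 28 layout of the pool3 grid, so the three descriptors of a feature point can be read at the same grid position.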

FIGURE 3. Distribution of feature descriptors, shown in a 32 × 32 square region. Green dots represent pool3 descriptors, each generated in an 8 × 8 square region. Blue dots represent pool4 descriptors, each shared by 4 feature points. The cyan dot represents a pool5_1 descriptor, shared by 16 feature points.

2) SHAPE CONTEXT (SC).

The SC [33]–[35] could play such a role in shape matching. Consider a center point $a_n$ on the first shape and a center point $b_m$ on the second shape. First, the SC constructs a polar coordinate system (see Figure 4). Then, we compute a histogram $h_{a_n}$ or $h_{b_m}$ of the relative coordinates of the remaining $n - 1$ or $m - 1$ points for $a_n$ or $b_m$ on


FIGURE 4. Shape context (SC) computation and matching. Left of (a) and (b): diagrams of log-polar histogram bins centered at $a_n$ and $b_m$ used in computing the shape contexts. We use 5 bins for $\log(r)$ and 12 bins for $\theta$. Right of (a) and (b): each shape context, e.g., $h_{b_m}$ or $h_{a_n}$, is a log-polar histogram of the coordinates of the rest of the point set measured using the centered point as the origin.

the shape, respectively. $C_{sc}(m, n)$ denotes the cost of matching these two points, and is measured using the chi-square statistic as:

$$C_{sc}(m, n) = \frac{1}{2}\sum_{x=1}^{X}\frac{[h_{b_m}(x) - h_{a_n}(x)]^2}{h_{b_m}(x) + h_{a_n}(x)} \tag{7}$$

where $h_{b_m}(x)$ and $h_{a_n}(x)$ are two 1 × X sets that denote the number of points within each bin surrounding $b_m$ and $a_n$, respectively.

3) MIXTURE FEATURE DESCRIPTOR (MFD).

We first compute an integrated cost matrix $C_{mfd}$ using the element-wise Hadamard product (denoted by $\odot$):

$$C_{mfd} = C^{\theta}_{cfd} \odot C_{sc} \tag{8}$$

where the entries of $C^{\theta}_{cfd}$ and $C_{sc}$ take values in [0, 1]. Then, we apply the Jonker-Volgenant algorithm [37] to solve the linear assignment problem on the cost matrix $C_{mfd}$. Assigned point pairs are regarded as putatively corresponding.

B. DYNAMIC INLIER SELECTION

Our feature points are taken at the centers of square image patches. Because of large rotation angles and deformation, corresponding feature points may have image patches that overlap partly or completely. Thus, to improve the registration, feature points with large overlapping ratios should have a better degree of alignment, whereas partly overlapping patches should have a small distance between their centers. Therefore, the degree of alignment is determined using our dynamic inlier selection.

In point set registration, there are several ways to estimate the parameters of the mixture model, such as the EM algorithm, gradient descent and variational inference. Our point set registration mainly contains the following two steps: (i) correspondence estimation, in which the corresponding target point set $A_\psi$ is estimated between B and A; (ii) transformation updating, in which the transformation function $\tau$ is established to update the position of $\tau(B)$ constantly, until $\tau(B)$ and $A_\psi$ overlap as much as possible. Note that $\tau(B)$ (initially $\tau(B) = B$) indicates the transformed set B in each iteration.
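The cost construction of equations (7) and (8) and the subsequent assignment can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: histograms are assumed to be normalized to sum to one (so that the chi-square cost stays in [0, 1]), empty bins are skipped to avoid a 0/0 term, all function names are hypothetical, and an exhaustive search stands in for the Jonker-Volgenant solver, which is only practical for very small matrices.

```python
from itertools import permutations


def chi_square_cost(h_b, h_a):
    """Equation (7): half the chi-square statistic of two histograms."""
    total = 0.0
    for hb, ha in zip(h_b, h_a):
        if hb + ha > 0:  # assumption: an empty bin contributes nothing
            total += (hb - ha) ** 2 / (hb + ha)
    return 0.5 * total


def sc_cost_matrix(hists_b, hists_a):
    """C_sc(m, n) for every pair (b_m, a_n)."""
    return [[chi_square_cost(hb, ha) for ha in hists_a] for hb in hists_b]


def hadamard(c1, c2):
    """Equation (8): element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(r1, r2)] for r1, r2 in zip(c1, c2)]


def assign_brute_force(cost):
    """Minimum-cost one-to-one assignment on a small square matrix.

    Stand-in for the Jonker-Volgenant algorithm: tries every
    permutation, so it is only usable for a handful of points.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[m][p[m]] for m in range(n)))
    return list(enumerate(best))
```

For realistic point counts a dedicated linear assignment solver would replace `assign_brute_force`; the pairs it returns play the role of the putative correspondences.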


Therefore, the selected inliers are reassigned every k iterations to iteratively update B. Note that these inliers guide the bundle adjustment of point locations, whereas outliers are moved coherently. At the feature prematching stage, a low threshold $\theta_0$ is applied to filter out irrelevant points and coarsely select a large number of feature points. Then, a large starting threshold $\tilde{\theta}$ is adopted to select confident inliers. In the rest of the registration process, the threshold $\theta$ is reduced by a step-length $\iota$ every k iterations, allowing a few more feature points with high similarity to affect the estimation of correspondence and transformation. This technique enables feature points with high similarity to determine the overall transformation while other feature points refine the registration accuracy.

C. PRE-CLASSIFICATION

The pre-classification step chooses the pixels that are best suited to train the deep neural network. Fuzzy C-Means (FCM) is a popular image segmentation technique that segments an image by discovering cluster centers. Suppose $a'_{ij}$ and $b'_{ij}$ denote the gray levels of the image pixels at the corresponding positions (i, j) in $I'_{t_1}$ and $I'_{t_2}$, respectively. We use the FCM classifier to jointly classify the two input images, and a similarity matrix $s'_{ij}$ is established:

$$s'_{ij} = \frac{|a'_{ij} - b'_{ij}|}{a'_{ij} + b'_{ij}} \tag{9}$$

where $0 \le s'_{ij} \le 1$. Then, a global similarity threshold T is applied to $s'_{ij}$ by the iterative threshold method. Iterating over all $a'_{ij}$ and $b'_{ij}$, if $s'_{ij} > T$, then $a'_{ij}$ and $b'_{ij}$ are jointly labeled by FCM based on the principle of minimum variance $\delta^2_{ij}$; otherwise $a'_{ij}$ and $b'_{ij}$ are labeled separately. $\delta^2_{ij}$ is written as:

$$\delta^2_{ij} = \frac{a'_{ij}\, b'_{ij}}{a'_{ij} + b'_{ij}}\, a'_{ij}\, [s'_{ij}]^2 \tag{10}$$

The gray levels of the pixels at the same position in the two original images are compared to label the pixels. The label of a pixel and its surrounding neighborhood can be used to determine whether a pixel is part of an edge or noise. The results are then passed to the neural network for training.

D. MAIN PROCESS

1) IMAGE REGISTRATION

To effectively eliminate the geometric error and improve the matching degree between the image and the actual terrain, image registration is an essential step in the preprocessing of remote sensing images. It includes two processes: feature point set registration and image transformation. First, we carry out feature point set registration.

• Correspondence Estimation. The Gaussian mixture model (GMM) is a popular model in computer vision and pattern recognition. Thus, the set B is used as the GMM centroids, and the set A as the data points generated by the GMM. The GMM probability density function is

$$p(a_n \mid b_m) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{1}{2\sigma^2}\|a_n - b_m\|^2\right) \tag{11}$$

Then, the outlier and noise distribution is modeled as an additional uniform distribution $p(a \mid M + 1) = \frac{1}{N}$, which is added to the mixture model. Thus, the mixture model takes the form

$$p(a_n) = (1 - \varepsilon)\sum_{n=1}^{N}\log\sum_{m=1}^{M} p(m)\,p(a_n \mid b_m) + \varepsilon\frac{1}{N} \tag{12}$$

where $p(m) = \frac{1}{M}$ denotes the mixing weights, which are nonnegative and sum to one. We use equal isotropic covariances $\sigma^2$ and equal membership probabilities $p(m)$ for all GMM components ($m = 1, ..., M$). $\varepsilon$ denotes the weight of the uniform distribution, with $0 \le \varepsilon \le 1$. We compute the revised parameter as:

$$\varepsilon = 1 - \frac{\sum_{n=1}^{N}\sum_{m=1}^{M} p(m \mid a_n)}{N} \tag{13}$$

Subsequently, inlier selection calculates an M × N prior probability matrix $p_{mn}$, which is then taken by our Gaussian mixture model (GMM) based transformation solver:

$$p_{mn} = \begin{cases} 1 & \text{if } b_m \text{ and } a_n \text{ are corresponding}, \\ \dfrac{1 - \upsilon}{N} & \text{otherwise} \end{cases} \tag{14}$$

where $\upsilon \in (0, 1)$ should be designated according to our confidence in the accuracy of the inlier selection. The prior probability matrix requires normalization:

$$p_{mn} := \frac{p_{mn}}{\sum_{k=1}^{N} p_{mk}} \tag{15}$$

By equation (15), the M × N posterior probability matrix is obtained, which is used as the fuzzy correspondence matrix P between $I_s$ and $I_r$. Then, the corresponding target point set is obtained by

$$A_\psi = PA \tag{16}$$

Though the target coordinate $A_\psi$ is estimated by the GMM, the method will inescapably produce mismatches.

• Transformation Updating. First, a positive definite kernel (e.g., a Gaussian kernel) is chosen, and a reproducing kernel Hilbert space (RKHS) [38], [39] H is defined. Then, we employ the Gaussian radial basis function (GRBF), which has the form $G(b_i, b_j) = \exp\left(-\frac{|b_i - b_j|^2}{\beta^2}\right)$, where $\beta$ is a constant that controls the spatial smoothness and G is of size M × M. According to the representer theorem, the displacement function $\tau(B)$ takes the form

$$\tau(B) = \sum_{m=1}^{M} G(b, b_m)\,\psi_m \tag{17}$$
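The prior construction, normalization and target-point computation of equations (14)-(16) can be sketched in Python as follows. This is a simplified illustration rather than the authors' solver: the function names are hypothetical, the inlier pairs are assumed to be given as a list of (m, n) index pairs, and the Gaussian likelihood of equation (11) is not folded in.

```python
def prior_matrix(matches, M, N, upsilon):
    """Equation (14): 1 for selected inlier pairs (b_m, a_n),
    (1 - upsilon) / N for every other entry."""
    p = [[(1.0 - upsilon) / N for _ in range(N)] for _ in range(M)]
    for m, n in matches:
        p[m][n] = 1.0
    return p


def normalize_rows(p):
    """Equation (15): divide every entry by the sum of its row."""
    out = []
    for row in p:
        total = sum(row)
        out.append([v / total for v in row])
    return out


def target_points(P, A):
    """Equation (16): A_psi = P A, i.e. each target point is a weighted
    average of the points in A with weights taken from one row of P."""
    D = len(A[0])
    return [[sum(row[n] * A[n][d] for n in range(len(A))) for d in range(D)]
            for row in P]
```

For example, with two points per set, one selected pair (b_0, a_1) and υ = 0.5, the first row of the normalized matrix becomes [0.2, 0.8], so the target of b_0 lies at 80% of the way from a_0 to a_1, while the unmatched b_1 is pulled to the midpoint of the two points.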


FIGURE 5. Left: Vectorization of neighborhood features to be fed into the network. Right: The structure of an RBM, consisting of two layers, one
visible (v) and one hidden (h), with no connections within a layer. Hidden nodes are indicated by blue filled circles and the visible nodes indicated by
unfilled circles.

In equation (17), $\psi = (\psi_1, \psi_2, ..., \psi_m)^T$ is a D × 1 coefficient matrix. Therefore, the minimization of the energy equation in H boils down to finding a finite coefficient matrix $\psi$. The transformation function $\nu(B)$ is equivalent to the initial position plus the displacement function $\tau(B)$, i.e., $\nu(B) = B + \tau(B)$.

Though a reliable target coordinate $A_\psi$ is estimated by the GMM, the method will inescapably produce mismatches. Therefore, our next concern is to formulate a function by which a reasonable position $\tau(b_m)$ of $b_m$ is determined. This position in turn improves the accuracy of the correspondence estimation in subsequent iterations. Since the error of the L2-minimizing estimator is less than the error of maximum likelihood estimation (MLE), the L2 Euclidean distance is widely used in many applications and registration methods; in particular, the problem of point set registration can be well formulated as minimizing the L2 Euclidean distance between two point sets. Therefore, we employ the L2E [39] based energy function to estimate the transformation function $\tau$, which is written as

$$E(\psi, \sigma^2) = \frac{1}{2^D(\pi\sigma^2)^{D/2}} - \bar{p} + \lambda\|\tau\|^2_G \tag{18}$$

where $\bar{p} = \frac{2}{M}\sum_{m=1}^{M}\frac{1}{(2\pi\sigma^2)^{D/2}}\exp\left(-\frac{\|A_\psi - U_{m,\cdot}\psi\|^2}{2\sigma^2}\right)$, $U_{ij} = G(b_i, b_j)$, $U_{m,\cdot}$ denotes the m-th row of the matrix U, and $\psi_i$ denotes the i-th row of the coefficient matrix $\psi_{h\times D}$. Next, we take the partial derivatives of equation (18) with respect to the coefficient matrix $\psi$, set them to zero, and solve the resulting linear system of equations:

$$\frac{\partial E}{\partial \psi} = U^T\left(\frac{2\Phi \odot (H \otimes \mathbf{1})}{n\sigma^2(2\pi\sigma^2)^{D/2}}\right) + 2\lambda G\psi \tag{19}$$

where $\Phi = U\psi - A_\psi$, $H = \exp\{\mathrm{diag}(\Phi\Phi^T)/2\sigma^2\}$ is an M × 1 vector, $\mathrm{diag}(\cdot)$ denotes the diagonal of a matrix, and $\mathbf{1}$ is a 1 × D row vector of all ones. The symbols $\odot$ and $\otimes$ denote the Hadamard product and the Kronecker product, respectively.

After updating the coordinates of the source point set by $B = B + U\psi$, we anneal the covariance of the GMM by $\sigma^2 = \rho\sigma^2$, then return to correspondence estimation and continue the feature point set registration process until the maximum iteration number is reached. Note that the transformed source point set $B^*$ is obtained in the final iteration. Next, we employ the backward approach [40] to establish a thin-plate spline (TPS) [41] transformation model, and the transformed image $I_t$ is calculated using the model. $I'_{t_1}$ and $I'_{t_2}$ are obtained by an equal split of $I_t$ and $I_{t_2}$ according to a certain ratio (see Figure 1).

2) ESTABLISHING AND TRAINING THE DEEP NEURAL NETWORKS FOR CHANGE DETECTION

Although the difference image method is well researched, change detection is a comprehensive procedure that requires careful consideration of many factors, such as the nature of the change detection problem, image preprocessing, and the selection of suitable variables and algorithms. DNNs have brought profound and revolutionary changes to the realm of artificial intelligence, and achieved great improvements in many domains such as computer vision, speech recognition and natural language processing. Therefore, we employ a DNN trained on the pre-classification results to create a change detection map from the pre-processed image pair directly, without generating difference images. After pre-classification, the neighborhood features of each pixel and its corresponding pixel in the other image are converted into a vector as input to a neural network.

The Restricted Boltzmann Machine (RBM). An RBM is a stochastic neural network that consists of two layers of binary units: a visible layer v with n visible units and a hidden layer h with m hidden units. An example of this structure is shown in Figure 5, with the hidden nodes indicated by blue circles and the visible nodes indicated by white circles. A common use for RBMs is to create features for use in classification. The energy function of the RBM model for visible and hidden units can be represented by the following:

$$E(v, h) = -\eta^T v - \varsigma^T h - h^T W v \tag{20}$$


In equation (20), $\eta$ and $\varsigma$ are the biases of the visible units and hidden units, respectively. The matrix W denotes the weights of the connections between visible and hidden layer units, where each matrix element is the conditional probability of the next layer neuron conditioned on the previous layer neuron. The joint probability distribution of the visible units v and hidden units h of the RBM is given by

$$P(v, h) = \frac{1}{Z}e^{-E(v, h)} \tag{21}$$

where $Z = \sum_{v'}\sum_{h'} e^{-E(v', h')}$ is the partition function of the system. The conditional distributions are:

$$P(h_j = 1 \mid v) = \sigma(\varsigma_j + v^T W_{(:,j)}) \tag{22}$$

$$P(v_i = 1 \mid h) = \sigma(\eta_i + W_{(i,:)} h) \tag{23}$$

where $\sigma(\cdot)$ here denotes the logistic sigmoid function. Previous studies [42], [43] show that the updating rules for W, $\eta$ and $\varsigma$ during the training process with a learning rate $\gamma$ are the following:

$$\Delta W_{ij} = \gamma(\langle v_i h_j\rangle_d - \langle v_i h_j\rangle_m) \tag{24}$$

$$\Delta \eta_i = \gamma(\langle v_i\rangle_d - \langle v_i\rangle_m) \tag{25}$$

$$\Delta \varsigma_j = \gamma(\langle h_j\rangle_d - \langle h_j\rangle_m) \tag{26}$$

where $\langle\cdot\rangle_d$ and $\langle\cdot\rangle_m$ are the expectations under the distributions specified by the training input data and by the theoretical RBM model, respectively. Although computing $\langle v_i h_j\rangle_d$ is straightforward, $\langle v_i h_j\rangle_m$ is intractable due to the large number of possible joint (v, h) configurations. The Contrastive Divergence (CD) algorithm [42] is a learning procedure used to approximate $\langle v_i h_j\rangle_m$. For every input, it starts a Markov chain by assigning an input vector to the states of the visible units and performs a small number of full Gibbs sampling steps. The resulting reconstructed visible units are used to approximate the expectation under the model distribution.

A deep neural network (DNN) [44] pre-trained by stacking restricted Boltzmann machines (RBMs) demonstrates high performance. Therefore, we use a DNN trained on image features to detect land cover change. The process mainly contains the following three steps: (1) neighborhood features of each pixel at the same location on the image pair are fed to the DNN; (2) the RBMs are then unrolled to create a deep neural network for training. Note that the CD training algorithm is used to pre-train each RBM in the stack of RBMs on the training data; (3) the DNN is fine-tuned by the backpropagation of error derivatives.

After the training, the deep neural network is established. Next, $I'_{t_1}$ and $I'_{t_2}$ can be input to the network, and a robust and contractive change map $S_{map}$ is built.

E. OUR ALGORITHM AND PARAMETER SETTINGS

Our method is summarized in Algorithm 1. There are five groups of parameters in our method: (1) in the feature prematching stage, the threshold $\theta_0$ is automatically determined by selecting the most reliable 128 pairs of feature points. Similarly, $\tilde{\theta}$ is found by $\delta = (\tilde{\theta} - \theta_0)/10$, the covariance parameters $\delta^2$; (3) the outlier balancing weight is initialized as 0.5; (4) in the Gaussian radial basis function (GRBF), $\beta$ is used to control the spatial smoothness. Since we normalize the spatial coordinates of the sensed image feature points to [-1.5, 1.5], $\beta$ is set to 2; (5) in equation (15) and energy equation (18), $\delta^2$ are initialized to 1 and 0.05, respectively.

Algorithm 1 Land Cover Change Detection Using Multi-Temporal and Multi-Sensor Remote Sensing Images in Mountainous Terrain
Input: The source point set A and the target point set B
Output: The transformed image $I_t$
Initialize $\theta_0$, $\tilde{\theta}$, $\iota$, k, $\beta$, $\omega$, $\delta^2$, W and $\lambda$;
Image Registration.
while the maximum iteration number is not reached do
    Correspondence Estimation:
        Compute $C^{\theta}_{cfd}$, $C_{sc}$ and $C_{mfd}$ by equations (6), (7) and (8), respectively;
        Compute the posterior probability matrix P by equation (15);
        Compute the corresponding target point set $A_\psi$ by equation (16);
    Transformation Updating:
        Construct the kernel matrix G and U;
        Compute $\psi$ by using equation (18);
        Update $\sigma^2 = \rho\sigma^2$;
        Update the sensed image's feature point set by $B = B + U\psi$;
end
The transformed source point set $B^*$ is obtained in the final iteration;
The transformed image $I_t$ can be calculated using a thin-plate spline (TPS) [41] transformation model that is established by the backward approach [40].
Change Detection.
Pre-Classification:
    Compute a similarity matrix $s'_{ij}$ by equation (9).
    Set a global threshold value of similarity T.
Pre-Establishing and Training the Deep Neural Networks:
    The neighborhood features of each pixel and its corresponding pixel in another image are converted into a vector as inputs to DNN.
    Training.
Test:
    $I'_{t_1}$ and $I'_{t_2}$ can be input to the network.
    A change detection map $S_{map}$ will be obtained.

III. EXPERIMENTS AND RESULTS

A. STUDY AREA AND DATA SOURCE

The study was mainly carried out in the ten key land con-
determined by selecting the most reliable 64 pairs of feature servation regions of Sichuan, Guizhou and Hunan China
points; (2) in the inlier selection stage, the step-length ι is (see Figure 6). The regions locate in the warm temperate zone

VOLUME 6, 2018 77501


F. Song et al.: Multi-Scale Feature Based Land Cover Change Detection in Mountainous Terrain

FIGURE 6. Location of the study area in mountainous terrain of southern China. Red dots represent the ten key land conservation regions of Sichuan, Guizhou and Hunan, China. Sichuan Province, China (longitude range: 97°21′E to 108°33′E; latitude range: 26°03′N to 34°19′N). Guizhou Province, China (longitude range: 103°36′E to 109°35′E; latitude range: 24°37′N to 29°13′N). Hunan Province, China (longitude range: 111°53′E to 114°15′E; latitude range: 27°51′N to 28°41′N).

TABLE 1. The experimental datasets (I) and (II).

and have four distinct seasons because of the continental monsoon. These areas have a variety of land cover types, including cropland, built-up land, forest, etc. Among these land cover types, the most dominant one is cropland, which can easily be affected by pseudo changes caused by phenological differences. In addition, we also obtained some satellite remote sensing data from mountainous terrain in other countries to verify the applicability of the method.

We evaluate the performance of the proposed framework on an available data set. The data set contains a total of 6000 image pairs. To facilitate a fair comparison with other methods, we divided this dataset into three parts: 3000 image pairs for training, 1000 for validation and the remaining 2000 for testing. To achieve a better training effect, the data set is formed from two categories of remote sensing image pairs:

(1) 4000 image pairs acquired by different types of multi-sensor and multi-temporal satellites, including Chinese GF and Landsat. The details of datasets (I) and (II) are summarized in Table 1. In this dataset, a given satellite generally follows the same orbital path with the same viewing angle and passes over a certain spot on Earth at the same local time because of orbital mechanics. Therefore, image pairs acquired by


the same satellite cannot contain large viewpoint changes. However, image pairs acquired by different sensors suffer serious scale change.

TABLE 2. The experimental dataset (III).

(2) 2000 image pairs acquired by a small UAV, the DJI Phantom 4 Pro (DJI, Shenzhen, China) with a CMOS camera, which basically maintained the same flight altitude (around 50 to 150 m) while collecting images of the same location at different times. The details of dataset (III) are shown in Table 2. In this dataset, it was not easy to navigate the aircraft along the planned lines, since the operation of a small UAV is always limited by air traffic constraints and by the monitored object; e.g., mountainous landforms are rugged terrain, and in most seasons these areas are often overcast and foggy. Thus, image pairs of the same scene have to be captured from different viewpoints through multiple flight routes so that full coverage of the object surface can be obtained. In addition, small UAVs cannot avoid the influence of flight attitude (pitch, roll, yaw) in practice because of flight height, speed, airflow and other factors, which causes the acquired images to be squeezed, twisted, stretched and offset relative to the target position on the ground. Therefore, these image pairs often contain large rotation angles.

B. EXPERIMENTAL DESIGN
To quantitatively evaluate the results of the proposed framework, two kinds of experiments are conducted: image registration and change detection. The former is compared with state-of-the-art methods such as SIFT [45], SURF [46], CPD [21], GLMDTPS [24] and ZGL_CATE [28]. The latter is compared with PCA_Kmeans [14], SSFA [15], LEGS [4] and Semi_FCM [5]. In addition, we adopt two standard and widely used evaluation metrics, the precision-recall curve (PRC) and the root mean square error (RMSE). These experiments are performed on a PC with a 2.5 GHz Intel Core CPU and 8 GB of memory.

1) IMAGE REGISTRATION ACCURACY TEST
The root mean square error (RMSE) is used to quantify the image registration accuracy. We manually construct at least 15 pairs of corresponding points in each image pair as landmarks. Note that all the landmarks are well distributed and selected in areas of interest where the surface features are distinct and easily distinguished, or where the colour contrast with nearby surface features is large. The related formulation is as follows:

$$RMSE = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \| b_i^t - a_i^t \|^2} \qquad (27)$$

The RMSE reflects the spatial deviation between corresponding landmarks in the sensed image and the reference image, where M is the total number of selected landmarks and $b_i^t$ is the landmark that corresponds to $a_i^t$.

2) CHANGE DETECTION PRECISION TEST
A ground truth is compared with the change detection map to measure the accuracy of change detection. In the precision-recall curve, the precision metric measures the fraction of detections that are true positives, and the recall metric measures the fraction of positives that are correctly identified. Precision and Recall are defined as:

$$Precision = \frac{TP}{TP + FP} \qquad (28)$$

$$Recall = \frac{TP}{TP + FN} \qquad (29)$$

where TP denotes true positives, in which changed pixels are detected correctly; FP denotes false positives, in which unchanged pixels are detected as changed when compared with the ground truth; and FN denotes false negatives, in which changed pixels are detected as unchanged when compared with the ground truth.

C. RESULTS AND DISCUSSION
1) RESULTS AND DISCUSSION OF IMAGE REGISTRATION ACCURACY TEST
In this experiment, Table 3 shows quantitative comparisons of image registration measured using the mean RMSE, which tends to decrease from left to right. In addition, Figures 7, 8 and 9 show image registration results on six representative image pairs. The results show that our method achieves the best performance in most cases, especially when the appearance difference within the image pair is challenging. Our method can therefore be applied widely, since an effective change rule requires robust image registration. Moreover, ZGL_CATE can yield a better performance; however, its drawback is that the extracted feature points are not sensitive enough to multi-temporal images. CPD performs unsatisfactorily in some cases. In contrast, GLMDTPS can achieve better performance since it employs a mixture feature descriptor. However, GLMDTPS emphasizes a one-to-one correspondence relationship, which is vulnerable in the presence of outliers.

2) RESULTS AND DISCUSSION OF CHANGE DETECTION PRECISION TEST
In this experiment, the effectiveness of the proposed framework is evaluated by comparing it with the change detection methods PCA_Kmeans, SSFA, LEGS and Semi_FCM.
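The two evaluation metrics used in these tests, the RMSE of equation (27) and the precision and recall of equations (28) and (29), can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the representation of landmarks as (x, y) tuples, and the choice to return 0.0 when a denominator is zero are assumptions, not part of the original method.

```python
import math


def rmse(sensed_landmarks, reference_landmarks):
    """RMSE between corresponding 2-D landmarks, following equation (27)."""
    if not sensed_landmarks or len(sensed_landmarks) != len(reference_landmarks):
        raise ValueError("landmark lists must be non-empty and equally long")
    squared_sum = 0.0
    for (bx, by), (ax, ay) in zip(sensed_landmarks, reference_landmarks):
        # Squared Euclidean distance between a landmark and its counterpart.
        squared_sum += (bx - ax) ** 2 + (by - ay) ** 2
    return math.sqrt(squared_sum / len(sensed_landmarks))


def precision(tp, fp):
    """Fraction of detected changes that are real changes, equation (28)."""
    detected = tp + fp
    return tp / detected if detected else 0.0  # assumed convention for no detections


def recall(tp, fn):
    """Fraction of real changes that were detected, equation (29)."""
    actual = tp + fn
    return tp / actual if actual else 0.0  # assumed convention for no real changes
```

For example, with hypothetical counts TP = 29, FP = 1 and FN = 1, both `precision(29, 1)` and `recall(29, 1)` evaluate to 29/30, roughly 0.967.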


TABLE 3. Experimental results on image registration. Quantitative comparisons on image registration measured using the mean RMSE are carried out.

FIGURE 7. Registration examples on two typical image pairs from dataset (I). (i) LakeOroumeih; (ii) Bastrop. Left: Image pair $I_{t_1}$ and $I_{t_2}$ acquired over the same geographical area at two different times $t_1$ and $t_2$ by Landsat 8. Right: From the first column to the last, the registration results of SIFT, SURF, CPD, GLMDTPS, ZGL_CATE and Ours. For each method, the first row shows a 5 × 5 checkerboard and the second row shows the transformed image $I_t$.

TABLE 4. Experimental results on change detection. Quantitative comparisons on change detection measured using the PRC are carried out.

The comparison results are depicted in Figures 10, 11 and 12 and in Table 4. As shown in Table 4, the average precision values of our method on datasets (I), (II) and (III) reach (98.3%, 97.5%), (97.9%, 96.3%) and (98.4%, 96.8%), respectively. However, the average precision values of PCA_Kmeans only reach (78.3%, 77.4%), (73.2%, 72.9%) and (76.8%, 77.9%). This is mainly because our method adopts DNNs to directly create a change detection map from the pre-processed image pair, bypassing the steps of filtering or generating a difference image (DI). In contrast, PCA_Kmeans performs unsatisfactorily in some cases, since it often produces noisy results because it does not consider the spatial relationship among image pixels.
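One way to see how spatial context can enter a pixel-wise classifier is the neighborhood vector described in the method: the window around a pixel and the window around its co-located pixel in the other image are flattened into one input vector. The following is a minimal sketch under stated assumptions, not the authors' implementation: images are plain nested lists of single-band values, the window radius of 1 and the border handling (clamping to the nearest valid pixel) are choices made here for illustration, and the function names are hypothetical.

```python
def neighborhood_vector(img1, img2, row, col, radius=1):
    """Concatenate the (2r+1) x (2r+1) windows around (row, col) in both images."""
    height, width = len(img1), len(img1[0])
    vector = []
    for img in (img1, img2):
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                # Clamp indices so border pixels reuse the nearest valid pixel.
                r = min(max(row + dr, 0), height - 1)
                c = min(max(col + dc, 0), width - 1)
                vector.append(float(img[r][c]))
    return vector


def all_neighborhood_vectors(img1, img2, radius=1):
    """Build one input vector per pixel, in row-major order."""
    if len(img1) != len(img2) or len(img1[0]) != len(img2[0]):
        raise ValueError("images must have identical shape")
    return [
        neighborhood_vector(img1, img2, r, c, radius)
        for r in range(len(img1))
        for c in range(len(img1[0]))
    ]
```

With a radius of 1, each pixel yields a vector of 2 × 9 = 18 values, so a classifier sees the local surroundings in both acquisitions rather than a single pixel difference.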


FIGURE 8. Registration examples on two typical image pairs from dataset (II). (iii) Guizhou; (iv) Hunan. Left: Image pair $I_{t_1}$ and $I_{t_2}$ acquired over the same geographical area at two different times $t_1$ and $t_2$ by Chinese GF1 and Chinese GF2, respectively. Right: From the first column to the last, the registration results of SIFT, SURF, CPD, GLMDTPS, ZGL_CATE and Ours. For each method, the first row shows a 5 × 5 checkerboard and the second row shows the transformed image $I_t$.

FIGURE 9. Registration examples on two typical image pairs from dataset (III). (v) Sichuan; (vi) Guizhou. Left: Image pair $I_{t_1}$ and $I_{t_2}$ acquired over the same geographical area at two different times $t_1$ and $t_2$ by a small UAV. Right: From the first column to the last, the registration results of SIFT, SURF, CPD, GLMDTPS, ZGL_CATE and Ours. For each method, the first row shows a 5 × 5 checkerboard and the second row shows the transformed image $I_t$.

Moreover, SSFA and Semi_FCM can achieve better performance. Since SSFA employs the slow feature analysis (SFA) algorithm to extract the most temporally invariant component from the multi-temporal images and transform the data into a new feature space, the DI can be better generated. Compared with PCA_Kmeans and SSFA, Semi_FCM


FIGURE 10. Change detection examples on two typical image pairs from dataset (I). (i) LakeOroumeih; (ii) Bastrop. (i) Yanan, Sichuan Province; (ii) Ansun, Guizhou Province. Left: $I'_{t_1}$ and $I'_{t_2}$ are the divisions of $I_{t_1}$ and $I_{t_2}$ according to a certain ratio. Right: From the first column to the last, the change detection results of PCA_Kmeans, SSFA, Ground Truth and Ours. (i) PCA_Kmeans (TP:23; FP:8; FN:7; Precision:76.7%; Recall:75.2%), SSFA (TP:25; FP:5; FN:5; Precision:83.3%; Recall:83.3%), LEGS (TP:25; FP:3; FN:5; Precision:89.2%; Recall:83.3%), Semi_FCM (TP:24; FP:5; FN:6; Precision:82.7%; Recall:80.0%), Ours (TP:28; FP:0; FN:2; Precision:93.3%; Recall:100%). (ii) PCA_Kmeans (TP:22; FP:7; FN:8; Precision:73.3%; Recall:75.9%), SSFA (TP:27; FP:4; FN:3; Precision:90.0%; Recall:87.1%), LEGS (TP:26; FP:3; FN:4; Precision:89.6%; Recall:86.7%), Semi_FCM (TP:28; FP:4; FN:2; Precision:87.5%; Recall:93.3%), Ours (TP:29; FP:2; FN:1; Precision:96.7%; Recall:93.4%).

FIGURE 11. Change detection examples on two typical image pairs from dataset (II). (iii) Guizhou; (iv) Hunan. Left: $I'_{t_1}$ and $I'_{t_2}$ are the divisions of $I_{t_1}$ and $I_{t_2}$ according to a certain ratio. Right: From the first column to the last, the change detection results of PCA_Kmeans, SSFA, Ground Truth and Ours. (iii) PCA_Kmeans (TP:21; FP:5; FN:9; Precision:70.0%; Recall:80.8%), SSFA (TP:24; FP:2; FN:6; Precision:80.0%; Recall:92.3%), LEGS (TP:24; FP:2; FN:6; Precision:80.0%; Recall:92.3%), Semi_FCM (TP:26; FP:4; FN:4; Precision:86.7%; Recall:86.7%), Ours (TP:29; FP:1; FN:1; Precision:96.7%; Recall:96.7%). (iv) PCA_Kmeans (TP:23; FP:3; FN:7; Precision:76.7%; Recall:88.4%), SSFA (TP:25; FP:6; FN:5; Precision:83.3%; Recall:80.6%), LEGS (TP:26; FP:3; FN:4; Precision:89.6%; Recall:86.7%), Semi_FCM (TP:26; FP:4; FN:4; Precision:86.7%; Recall:86.7%), Ours (TP:29; FP:2; FN:1; Precision:96.7%; Recall:93.3%).

FIGURE 12. Change detection examples on two typical image pairs from dataset (III). (v) Sichuan; (vi) Guizhou. Left: $I'_{t_1}$ and $I'_{t_2}$ are the divisions of $I_{t_1}$ and $I_{t_2}$ according to a certain ratio. Right: From the first column to the last, the change detection results of PCA_Kmeans, SSFA, Ground Truth and Ours. (v) PCA_Kmeans (TP:19; FP:5; FN:11; Precision:63.3%; Recall:79.1%), SSFA (TP:23; FP:2; FN:7; Precision:76.7%; Recall:92.0%), LEGS (TP:27; FP:6; FN:3; Precision:81.8%; Recall:90.0%), Semi_FCM (TP:27; FP:6; FN:3; Precision:81.8%; Recall:90.0%), Ours (TP:29; FP:1; FN:1; Precision:96.7%; Recall:96.7%). (vi) PCA_Kmeans (TP:20; FP:3; FN:10; Precision:66.7%; Recall:86.9%), SSFA (TP:24; FP:5; FN:6; Precision:80.0%; Recall:77.4%), LEGS (TP:27; FP:6; FN:3; Precision:81.8%; Recall:90.0%), Semi_FCM (TP:28; FP:6; FN:2; Precision:82.3%; Recall:93.3%), Ours (TP:28; FP:2; FN:2; Precision:93.3%; Recall:93.3%).

uses semi-supervised fuzzy C-means to filter the pseudolabels from the difference image. LEGS can effectively capture local contrast, texture and shape information for saliency detection, as well as the complex relationship between different global saliency cues, by local estimation and global search. Therefore, LEGS also achieves better performance.
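As a rough illustration of the fuzzy C-means idea that underlies Semi_FCM, the sketch below runs plain fuzzy C-means on scalar values such as difference-image intensities. It is the standard unsupervised algorithm, not the semi-supervised variant of [5] and not the authors' pre-classification step; the function name, the fuzzifier m = 2, the fixed iteration count and the caller-supplied initial centers are all assumptions made for this example.

```python
def fuzzy_c_means_1d(values, initial_centers, m=2.0, iterations=50, eps=1e-12):
    """Plain fuzzy C-means on scalar values; returns (centers, memberships)."""
    centers = list(initial_centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: a value belongs more strongly to closer centers.
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
            row = [
                1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Center update: membership-weighted mean of the data.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships
```

Each row of the returned memberships sums to 1, so a value lying between two clusters receives partial membership in both instead of a hard label; a threshold on that membership is one simple way to separate confident pixels from uncertain ones.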


IV. CONCLUSION
In this paper, a robust change detection framework for land cover change in mountainous terrain is proposed, which can handle multi-temporal remote sensing image pairs acquired by different types of sensors. The advantages of our framework can be summarized in three main contributions: 1) a multi-scale feature descriptor is generated using layers of a pretrained VGG network; 2) a gradually increasing selection of inliers is used to estimate correspondences and transformations; 3) a fuzzy C-means classifier is adopted to generate a similarity matrix between the geometrically corrected image pair, and deep neural networks (DNNs) are applied to directly create a change detection map from the pre-processed image pair, bypassing the steps of filtering or generating a difference image (DI). The proposed framework can provide a stable change rule for monitoring land cover change from multi-temporal data. For the purpose of experimental evaluation, the dataset was mainly collected in the ten key land conservation regions of Sichuan, Guizhou and Hunan, China. Compared with five state-of-the-art registration methods and four state-of-the-art change detection methods, our method shows better performance in most cases.

Future studies will be conducted in two directions: (i) thematic applications of land cover changes, such as cultivated land changes; (ii) images from different sources, such as image pairs that combine a UAV image and a satellite remote sensing image. Indeed, combining images from different sources will identify more regions of change in many other typical regions with various land cover types.

ACKNOWLEDGMENT
We are grateful to David G. Lowe, Herbert Bay, Andriy Myronenko, Turgay Celik, Chen Wu, Lijun Wang and Pan Shao for providing their implementation source codes and test data sets. This greatly facilitated the comparison experiments. (Fei Song and Zhuoqian Yang contributed equally to this work.)

REFERENCES
[1] Y. Wang, F. Zhao, L. Cheng, and K. Yang, "Framework for monitoring the conversion of cultivated land to construction land using SAR image time series," Remote Sens. Lett., vol. 6, no. 10, pp. 794–803, 2015.
[2] K. Simonyan and A. Zisserman. (Sep. 2014). "Very deep convolutional networks for large-scale image recognition." [Online]. Available: https://arxiv.org/abs/1409.1556
[3] Y. Wu, S. Li, and S. Yu, "Monitoring urban expansion and its effects on land use and land cover changes in Guangzhou city, China," Environ. Monitor. Assessment, vol. 188, no. 1, p. 54, 2016.
[4] L. Wang, H. Lu, R. Xiang, and M.-H. Yang, "Deep networks for saliency detection via local estimation and global search," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 3183–3192.
[5] P. Shao, W. Shi, P. He, M. Hao, and X. Zhang, "Novel approach to unsupervised change detection based on a robust semi-supervised FCM clustering algorithm," Remote Sens., vol. 8, no. 3, p. 264, 2016.
[6] S. A. Azzouzi, A. Vidal-Pantaleoni, and H. A. Bentounes, "Desertification monitoring in Biskra, Algeria, with Landsat imagery by means of supervised classification and change detection methods," IEEE Access, vol. 5, pp. 9065–9072, 2017.
[7] K. Yang, Z. Yu, Y. Luo, Y. Yang, L. Zhao, and X. Zhou, "Spatial and temporal variations in the relationship between lake water surface temperatures and water quality - A case study of Dianchi Lake," Sci. Total Environ., vol. 624, pp. 859–871, May 2018.
[8] J. D. T. De Alban, G. M. Connette, P. Oswald, and E. L. Webb, "Combined Landsat and L-band SAR data improves land cover classification and change detection in dynamic tropical landscapes," Remote Sens., vol. 10, no. 2, p. 306, 2018.
[9] Z. Wei et al., "A small UAV based multi-temporal image registration for dynamic agricultural terrace monitoring," Remote Sens., vol. 9, no. 9, p. 904, 2017.
[10] K. Yang, A. Pan, Y. Yang, S. Zhang, S. H. Ong, and H. Tang, "Remote sensing image registration using multiple image features," Remote Sens., vol. 9, no. 6, p. 581, 2017.
[11] A. S. Milas, K. Arend, C. Mayer, M. A. Simonson, and S. Mackey, "Different colours of shadows: Classification of UAV images," Int. J. Remote Sens., vol. 38, nos. 8–10, pp. 3084–3100, 2017.
[12] G. Li and Y. Yu, "Visual saliency detection based on multiscale deep CNN features," IEEE Trans. Image Process., vol. 25, no. 11, pp. 5012–5024, Nov. 2016.
[13] Z. Lv, W. Shi, X. Zhou, and J. A. Benediktsson, "Semi-automatic system for land cover change detection using bi-temporal remote sensing images," Remote Sens., vol. 9, no. 11, p. 1112, 2017.
[14] T. Celik, "Unsupervised change detection in satellite images using principal component analysis and k-means clustering," IEEE Geosci. Remote Sens. Lett., vol. 6, no. 4, pp. 772–776, Oct. 2009.
[15] C. Wu, B. Du, and L. Zhang, "Slow feature analysis for change detection in multispectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2858–2874, May 2014.
[16] H. Lyu, H. Lu, and L. Mou, "Learning a transferable change rule from a recurrent neural network for land cover change detection," Remote Sens., vol. 8, no. 6, p. 506, 2016.
[17] H. Zhang, M. Gong, P. Zhang, L. Su, and J. Shi, "Feature-level change detection using deep representation and feature change analysis for multispectral imagery," IEEE Geosci. Remote Sens. Lett., vol. 13, no. 11, pp. 1666–1670, Nov. 2016.
[18] E. M. de Oliveira Silveira, J. M. de Mello, F. W. Acerbi, Jr., and L. M. T. de Carvalho, "Object-based land-cover change detection applied to Brazilian seasonal savannahs using geostatistical features," Int. J. Remote Sens., vol. 39, no. 8, pp. 2597–2619, 2018.
[19] R. Xiao, R. Cui, M. Lin, L. Chen, Y. Ni, and X. Lin, "SOMDNCD: Image change detection based on self-organizing maps and deep neural networks," IEEE Access, vol. 6, pp. 35915–35925, 2018.
[20] B. Uamkasem, H. L. Chao, and B. Jiantao, "Regional land use dynamic monitoring using Chinese GF high resolution satellite data," in Proc. Int. Conf. Appl. Syst. Innov., 2017, pp. 838–841.
[21] A. Myronenko and X. Song, "Point set registration: Coherent point drift," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 12, pp. 2262–2275, Dec. 2010.
[22] L. Greengard and J. Strain, "The fast gauss transform," SIAM J. Sci. Stat. Comput., vol. 12, no. 1, pp. 79–94, 2006.
[23] I. Markovsky, "Structured low-rank approximation and its applications," Automatica, vol. 44, no. 4, pp. 891–909, Apr. 2008.
[24] Y. Yang, S. H. Ong, and K. W. C. Foong, "A robust global and local mixture distance based non-rigid point set registration," Pattern Recognit., vol. 48, no. 1, pp. 156–173, 2015.
[25] J. Ma, J. Zhao, and A. L. Yuille, "Non-rigid point set registration by preserving global and local structures," IEEE Trans. Image Process., vol. 25, no. 1, pp. 53–64, Jan. 2016.
[26] K. Yang et al., "Quake warning funds on shaky ground," Science, vol. 358, no. 6368, p. 1263, 2017.
[27] S. Zhang, Y. Yang, K. Yang, Y. Luo, and S. H. Ong, "Point set registration with global-local correspondence and transformation estimation," in Proc. Int. Conf. Comput. Vis., 2017, pp. 2688–2696.
[28] S. Zhang, K. Yang, Y. Yang, and Y. Luo, "Nonrigid image registration for low-altitude SUAV images with large viewpoint changes," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 4, pp. 592–596, Apr. 2018.
[29] S. Zhang, K. Yang, Y. Yang, Y. Luo, and Z. Wei, "Non-rigid point set registration using dual-feature finite mixture model and global-local structural preservation," Pattern Recognit., vol. 80, pp. 183–195, Aug. 2018.
[30] F. Song, M. Li, Y. Yang, K. Yang, X. Gao, and T. Dan, "Small UAV based multi-viewpoint image registration for monitoring cultivated land changes in mountainous terrain," Int. J. Remote Sens., vol. 39, no. 21, pp. 7201–7224, 2018.


[31] T. Dan et al., "Multifeature energy optimization framework and parameter adjustment-based nonrigid point set registration," J. Appl. Remote Sens., vol. 12, no. 3, pp. 12–27, 2018.
[32] P. A. Permatasari, A. Fatikhunnada, Liyantono, Y. Setiawan, Syartinilia, and A. Nurdiana, "Analysis of agricultural land use changes in Jombang Regency, East Java, Indonesia using BFAST method," Procedia Environ. Sci., vol. 33, pp. 27–35, Apr. 2016.
[33] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and object recognition using shape contexts," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 4, pp. 509–522, Apr. 2002.
[34] J. Bohg and D. Kragic, "Learning grasping points with shape context," Robot. Auto. Syst., vol. 58, no. 4, pp. 362–377, 2010.
[35] Y. Gu, K. Ren, P. Wang, and G. Gu, "Polynomial fitting-based shape matching algorithm for multi-sensors remote sensing images," Infr. Phys. Technol., vol. 76, pp. 386–392, May 2016.
[36] R. Jonker and A. Volgenant, "A shortest augmenting path algorithm for dense and sparse linear assignment problems," Computing, vol. 38, no. 4, pp. 325–340, Nov. 1987.
[37] A. L. Yuille and N. M. Grzywacz, "A mathematical analysis of the motion coherence theory," Int. J. Comput. Vis., vol. 3, no. 2, pp. 155–175, 1989.
[38] J. Ma, J. Zhao, J. Tian, A. L. Yuille, and Z. Tu, "Robust point matching via vector field consensus," IEEE Trans. Image Process., vol. 23, no. 4, pp. 1706–1721, Apr. 2014.
[39] J. Ma, J. Zhao, Y. Ma, and J. Tian, "Non-rigid visible and infrared face registration via regularized Gaussian fields criterion," Pattern Recognit., vol. 48, no. 3, pp. 772–784, 2015.
[40] S. Ji and S. Peng, "Terminal perturbation method for the backward approach to continuous time mean–variance portfolio selection," Stochastic Process. Appl., vol. 118, no. 6, pp. 952–967, 2008.
[41] F. L. Bookstein, "Principal warps: Thin-plate splines and the decomposition of deformations," IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 6, pp. 567–585, Jun. 1989.
[42] G. E. Hinton, Training Products of Experts by Minimizing Contrastive Divergence. Cambridge, MA, USA: MIT Press, 2002.
[43] G. E. Hinton, "A practical guide to training restricted Boltzmann machines," Momentum, vol. 9, no. 1, pp. 599–619, 2010.
[44] A. Lozano-Diez, R. Zazo, D. T. Toledano, and J. Gonzalez-Rodriguez, "An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition," PLoS ONE, vol. 12, no. 8, p. e0182580, 2017.
[45] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004.
[46] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-up robust features (SURF)," Comput. Vis. Image Understand., vol. 110, no. 3, pp. 346–359, 2008.

FEI SONG received the B.S. degree from Sichuan Normal University, China, in 2013. He is currently pursuing the M.S. degree with the School of Information Science and Technology, Yunnan Normal University. His current research interests include image registration, point set registration, and change detection.

ZHUOQIAN YANG is currently pursuing the B.S. degree with the College of Software, Beihang University. His research interest includes computer vision and image registration.

XUEYAN GAO received the B.S. degree from Henan Normal University, China, in 2016. She is currently pursuing the M.S. degree with the School of Information Science and Technology, Yunnan Normal University. Her current research interests include image registration, point set registration, pattern recognition, and change detection.

TINGTING DAN received the B.S. degree from China West Normal University, China, in 2016. She is currently pursuing the M.S. degree with the School of Information Science and Technology, Yunnan Normal University. Her current research interests include image registration, point set registration, pattern recognition, and change detection.

YANG YANG received the master's degree from Waseda University, Japan, in 2007, and the Ph.D. degree from the National University of Singapore, Singapore, in 2013. He is currently an Associate Professor with the School of Information Science and Technology, Yunnan Normal University. His research interest covers image registration, remote sensing image processing, medical image processing, geography information system, and human masticatory system.

WANJING ZHAO received the B.S. degree from China Yunnan Normal University, China, in 2018. She is currently pursuing the M.S. degree with the School of Information Science and Technology, Yunnan Normal University. Her current research interests include image registration, point set registration, and pattern recognition.

RUI YU received the B.S. degree from Harbin Huade University, China, in 2017. She is currently pursuing the M.S. degree with the School of Information Science and Technology, Yunnan Normal University. Her current research interests include image registration, point set registration, and pattern recognition.
