
Chapter

Hyperspectral and Multispectral Image Fusion Using Deep Convolutional Neural Network - ResNet Fusion
K. Priya and K.K. Rajkumar

Abstract

In recent years, deep learning-based HS-MS fusion has become a very active research tool for the super-resolution of hyperspectral images. Deep convolutional neural networks (CNNs) help to extract more detailed spectral and spatial features from the hyperspectral image. In a CNN, each convolution layer takes its input from the previous layer, which may cause information loss as the depth of the network increases. This loss of information causes vanishing gradient problems, particularly in the case of very high-resolution images. To overcome this problem, in this work we propose a novel HS-MS ResNet fusion architecture built with skip connections. The ResNet fusion architecture contains residual blocks with different numbers of stacked convolution layers; in this work we tested residual blocks with two, three, and four stacked convolution layers. To strengthen the gradients and reduce the negative effects of gradient vanishing, we implemented the ResNet fusion architecture with different skip connections: short, long, and dense. We measured the strength and superiority of our ResNet fusion method against traditional methods on four public datasets using standard quality measures and found that our method outperforms all other compared methods.

Keywords: convolution neural network, residual network, ResNet fusion, stacked layer, dense skip connection

1. Introduction

Spectral imaging technology captures a contiguous spectrum for each image pixel over a selected range of wavelength bands. Thus, spectral images accommodate more information than conventional monochromatic or RGB images. The wide range of spectral information available in hyperspectral images brings spectral imaging technology into a new horizon of research for analyzing pixel content at the macroscopic level. This change in the image processing research area is expected to drive revolutionary developments in many walks of human life in the coming years. In general, spectral images are divided into multispectral (fewer than 20 sampled wavelength bands) and hyperspectral (more than 20 wavelength bands).


A multispectral image (MSI) captures a maximum of about 20 spectral bands, whereas a hyperspectral image (HSI) captures hundreds of contiguous spectral bands at a time. Due to this prominence, HSI is now an emerging area that at the same time faces many challenges in analyzing the minute details of pixel content in image processing and computer vision [1].
Hyperspectral images (HSIs) are rich in spectral information, which greatly strengthens their information-storing ability. This property of HSI enables rapid growth in many areas such as remote sensing, medical science, the food industry, and various computer vision tasks. However, hyperspectral images capture all these bands in narrow wavelength ranges, which limits the amount of energy received by each band. Therefore, the HSI information can be easily influenced by many kinds of noise, which lowers the spatial resolution of HSI [2].
Many studies have been introduced in the literature to control the tradeoff between spatial and spectral resolution in hyperspectral images. As a result, many HS-MS fusion methods have evolved over the past decades to address it. The straightforward HS-MS fusion approach has become a popular and trending research area in image processing and computer vision. The early approach is pansharpening-based image fusion, which fuses spectral and spatial information from low-resolution multispectral (LR-MS) images with high-resolution (HR) panchromatic (PAN) images to enhance the spatial and spectral resolution of the fused image. Subsequently, pansharpening image fusion algorithms were gradually extended to HS-MS image fusion [3].
In HS-MS fusion, a hyperspectral image with high spatial and spectral resolution is estimated by fusing an LR-HS image with an HR-MS image of the same scene. However, the quality of the estimated spatial and spectral data is highly influenced by the constraints used in the fusion process. Recently, neural network-based methods have been widely used to improve HS-MS fusion quality in both the spatial and spectral domains. One such network, the convolutional neural network (CNN) in deep learning (DL), performs much better in image reconstruction, super-resolution, object detection, etc. [4].
In a CNN, each layer takes the output from the previous layer, which tends to lose information as the network architecture gets deeper. In this work, we use ResNet-based HS-MS fusion, adding skip connections between the convolution layers. These skip connections help to carry identity information throughout the deep convolutional network [5].
The remaining sections of this paper are arranged as follows: Section 2 reviews the literature on HS-MS fusion methods, both traditional and newly introduced deep learning methods. Section 3 describes the materials and methods used in this work. Sections 4 and 5 present the problem formulation and the implementation of our work in detail. The results and discussion of our proposed method are given in Section 6, and finally, Section 7 concludes the proposed work with future scope.

2. Review of literature

2.1 Traditional methods

Many algorithms have been proposed over the past decades to enhance the spatial quality of HS images. One popular and attractive method is HS-MS image fusion, which is mainly divided into four groups: component substitution (CS), multiresolution analysis (MRA), Bayesian approaches, and spectral unmixing (SU) [6]. The
CS and MRA methods are described under the concept of an injection framework. In
this framework, the high-quality information from one image is injected into another
[7]. Apart from these, Bayesian-based methods use the probability or posterior distribution of prior information about the target image; the posterior distribution of the target image is conditioned on the given HS and MS images [8]. Later, spectral unmixing-based HS-MS image fusion was introduced, and it is one of the most promising and widely used methods for enhancing the quality of the HS image.
In the SU method, the quality of the abundance estimation depends heavily on the accuracy of the endmembers. Therefore, any obstruction during the endmember extraction process leads to inconsistency in the abundance estimation. To overcome this limitation, Paatero and Tapper in 1994 [9] introduced the nonnegative matrix factorization (NMF) method, which was popularized in an article by Lee and Seung in 1999 [10]. It has become an emerging tool for processing high-dimensional data due to its automatic feature extraction capability. The main advantage of the NMF method is that it gives a unique solution to the problem compared to other unmixing techniques [11]. In general, NMF-based spectral unmixing jointly estimates both the endmembers and the corresponding fractional abundances in a single step, mathematically represented as follows:

$$Y = EA \tag{1}$$

where the output matrix Y is simultaneously factorized into two nonnegative matrices, E (endmembers) and A (abundances), without any prior knowledge; hence NMF comes under an unsupervised framework [12]. NMF has since become one of the trending methods for blind-source spectral unmixing problems. NMF factorizes the input matrix into a product of two nonnegative matrices (the endmember matrix E and the abundance matrix A) by enforcing nonnegativity, so the NMF method is highly relevant in SU for enhancing the quality of the image by adding these constraints. Finally, SU-based fusion is accomplished using the coupled NMF (CNMF) method to obtain an enhanced hyperspectral image with high spatial and spectral quality. The CNMF fusion algorithm gives a high-fidelity reconstructed image compared to other existing fusion methods [13].
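As an illustration of this factorization, the following Python sketch runs a plain (uncoupled) NMF with scikit-learn. The data and the number of endmembers are made up for the example; this is not the CNMF fusion algorithm itself.

```python
import numpy as np
from sklearn.decomposition import NMF

# Illustrative sizes only: 50 bands, 1,000 pixels, 5 endmembers.
Y = np.abs(np.random.rand(50, 1000))   # nonnegative spectral data, Y ~= E @ A

model = NMF(n_components=5, init="nndsvd", max_iter=500)
E = model.fit_transform(Y)             # endmember matrix, 50 x 5
A = model.components_                  # abundance matrix, 5 x 1000

reconstruction_error = np.linalg.norm(Y - E @ A, "fro")
```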
Yokoya et al. in 2012 [14] introduced the coupled nonnegative matrix factorization (CNMF) method, an unsupervised unmixing-based HS-MS image fusion method. CNMF uses a straightforward approach to the unmixing and fusion processes, so its mathematical formulation and implementation are not as complex as those of other existing fusion methods. Finally, this method optimizes the solution with minimal residual error and reconstructs a high-fidelity hyperspectral image.
Simoes et al. in 2015 [15] introduced a super-resolution method for hyperspectral images termed HySure. This method formulates a model that preserves the edges between objects during unmixing-based data fusion. It uses an edge-preserving constraint called the vector total variation (VTV) regularizer, which preserves edges and promotes piecewise smoothness in the spatial quality of the image.
Lin et al. in 2018 [16] introduced a convex optimization-based CNMF (CO-CNMF) method that incorporates sparsity and sum-of-squared-distances (SSD) regularizers. To extract high-quality data from the images, this method uses an SSD regularizer and enforces sparsity using ℓ1-norm regularization.


Adding these two regularization terms, solved as two convex subproblems, helps to improve the performance of the existing CNMF method. However, performance degradation may occur in the CO-CNMF algorithm as the noise level increases. Therefore, it is necessary to add image denoising and spatial smoothing constraints to this fusion method.
Yang et al. in 2019 [17] introduced a CNMF method with total variation and signature-based regularizations, named TVSR-CNMF. The TV regularizer is added to the abundance matrix to ensure the image's spatial smoothness. Similarly, a signature-based regularizer (SR) is added to the endmember matrix for extracting high-quality spectral data. This method thus helps to reconstruct a hyperspectral image with good spatial and spectral quality.
Yang et al. in 2019 [18] introduced a sparsity and proximal minimum-volume regularized CNMF method named SPR-CNMF. The minimum-volume regularizer controls and minimizes the distance between the selected endmembers and the center of mass of the selected region in the image to reduce computational complexity. It refines the fusion at each iteration until it reaches the simplex with minimum volume. This method improves fusion performance by controlling the loss of cubic structural information.
Influenced by these works, we implemented an unmixing-based fusion algorithm named fully constrained CNMF (FC-CNMF). This method is a modified version of CNMF that includes all the spatial and spectral constraints available in the literature. In our method, a minimum-volume simplex constraint is imposed on the endmember matrix to fully exploit the spectral information. Similarly, sparsity and total variation constraints are incorporated into the abundance matrix to provide dimensionality reduction and spatial smoothness. Finally, we evaluated the quality of the fused image obtained by FC-CNMF against the methods discussed in the literature using standard quality measures. From these evaluations, we found that our method performs better, yielding higher fidelity in the reconstructed images.
These traditional approaches reconstruct a high-resolution hyperspectral image by fusing high-quality data from hyperspectral and multispectral images. To improve the quality of the reconstructed images, they use different constraints such as sparsity, minimum-volume simplex, and total variation regularization. The performance and quality of the reconstructed HS image are highly influenced by these constraints, and therefore the existing methods still leave ample space for enhancing the quality of HSI.

2.2 Deep learning methods

Deep learning (DL) is a subbranch of machine learning (ML) and has recently shown remarkable performance in research fields, especially image processing and computer vision. DL is based on artificial neural networks and has been widely used in different areas such as super-resolution, classification, image fusion, and object detection. DL-based image fusion methods have the ability to extract deep features automatically from the image. Therefore, DL-based methods overcome the difficulties faced by conventional image fusion methods and make the whole fusion process easier and simpler.
A deep learning-based HS-MS image fusion concept was first introduced by Palsson et al. in 2017 [19]. They used a 3-D convolutional neural network (3D-CNN) to fuse LR-HS and HR-MS images to construct an HR-HS image. This method improves the quality of the hyperspectral image by reducing noise and computational cost. However, the authors focused on enhancing the spatial data of the LR-HS image without any changes to the spectral information, which caused degradation of the spectral data [19].
Later, Masi et al. in 2017 [20] proposed a CNN architecture for image super-resolution that uses a deep CNN to extract both spatial and spectral features. The deep CNN is used to acquire features from HSI with a very complex spatial-spectral structure. However, the authors used a single-branch CNN architecture, which makes it difficult to extract discriminating features from the image.
To overcome this drawback, Shao and Cai in 2018 [21] designed a fusion method extending the CNN with the depth of a 3D-CNN to obtain better fusion performance. For implementation, they used a remote sensing image fusion neural network (RSIFNN) with two separate CNN branches: one branch extracts the spectral data and the other the spatial data from the image. In this way, the method exploits both the spectral and spatial information of the input images to reconstruct a hyperspectral image with high spectral and spatial resolution.
Yang et al. in 2019 [22] introduced a deep two-branch CNN for HS-MS fusion. This method uses a two-branch CNN architecture to extract spectral and spatial features from the LR-HSI and HR-MSI. The features extracted from the two branches are concatenated and then passed to a fully connected convolution layer to obtain the HR-HSI. In conventional fusion methods, the HR-HSI is reconstructed in a band-by-band fashion, whereas in the CNN concept all bands are reconstructed jointly, which helps to reduce the spectral distortion in the fused image. However, this method uses a fully connected layer for image reconstruction, which is heavily weighted and increases the number of network parameters.
Chen et al. in 2020 [23] introduced a spectral-spatial feature extraction fusion CNN (S2FEF-CNN), which extracts joint spectral and spatial features using three S2FEF blocks. The S2FEF method uses 1D and 2D convolution networks to extract spectral and spatial features and then fuses them. It uses a fully connected network layer for dimensionality reduction, which further reduces the network parameters during fusion. This method shows good results with less computational complexity compared to other deep learning-based fusion methods.
Although deep learning-based fusion methods have achieved tremendous improvements, they still possess many drawbacks [24]. As the network goes deeper, its performance saturates and then rapidly degrades. This is because, in a DL method, each convolution layer takes its input from the output of the previous layer, so by the time the last layer is reached, much of the meaningful information obtained in the initial layers is lost. The information loss worsens as the network architecture gets deeper. This brings some negative effects, such as overfitting of the data, an effect called the vanishing gradient problem [25].
Due to the vanishing gradient problem, existing deep learning-based fusion methods are unable to extract detailed features from high-dimensional images. He et al. [5] introduced a deep network with residual learning to address the vanishing gradient problem. In this framework, a residual block is added between the layers to diminish performance degradation. Networks built on this concept are called residual networks, or ResNets. Therefore, in this work, our aim is to bring this ResNet architecture into the standard CNN to exploit more detailed features from both the spatial and spectral data of HSI.

3. Materials and methods

3.1 Dataset

Four real datasets are used in this work: Washington DC Mall, Botswana, Pavia University, and Indian Pines. The Washington DC Mall dataset is a well-known dataset captured by the HYDICE sensor; it covers a spectral range from 400 to 2500 nm with a 1278 × 307 pixel size and 191 bands. The Botswana dataset, captured by the Hyperion sensor over the Okavango Delta in Botswana, covers a spectral range from 400 to 2500 nm with a 1476 × 256 pixel size and 145 bands. The Pavia University dataset was captured by the reflective optics spectrographic imaging system (ROSIS-3) at the University of Pavia, northern Italy, in 2003. It has a spectral range from 430 to 838 nm, a 610 × 340 pixel size, and 103 bands. Finally, the AVIRIS Indian Pines dataset was captured by the AVIRIS sensor over the Indian Pines test site in northwestern Indiana, USA, in 1992. It covers a spectral range from 400 to 2500 nm with a 512 × 614 pixel size and 192 bands [25]. All these datasets have been widely used in earlier spectral unmixing-based fusion research.

3.2 Convolutional neural networks

Convolutional neural networks (CNNs) play an important role in deep learning models. A CNN is an algorithm specially designed to work with images, extracting deep features through convolution. Convolution is a process that applies a kernel filter across every element of an image so the network can understand and react to each element within the image. This concept of convolution is especially helpful for extracting specific features from high-dimensional images. A convolutional network architecture is composed of an input layer, an output layer, and one or more hidden layers. The hidden layers are combinations of convolution layers, pooling layers, activation layers, and normalization layers. These layers automatically detect essential features without any human supervision, so CNNs are considered a powerful tool for image processing [27].

A. Convolution layer
The convolution layer is used to extract various features from the input image with the help of filters. In the convolution layer, a mathematical operation is performed between the input image and a filter with an m × m kernel. The filter slides across the input image, computing the dot product between the filter and the corresponding part of the image. This process is repeated to convolve the kernel over the entire image, and the output of the convolution operation is called a feature map. The feature map contains essential information about the image, such as object boundaries and edges [28].

B. Pooling layer
The convolution layer is followed by a pooling layer, which reduces the size of the feature map while maintaining the essential features. There are two main types: max pooling and average pooling. Max pooling takes the largest element in each region of the feature map, whereas average pooling computes the average of the elements [28].

C. Activation function
One of the most important characteristics of any CNN is its activation function. There are several activation functions, such as sigmoid, tanh, softmax, and ReLU, each with its own importance. ReLU is the most commonly used activation function in DL and accounts for the nonlinear nature of the input data [28].
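The following PyTorch sketch illustrates these three building blocks (convolution, max pooling, ReLU) on a dummy image; all sizes here are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

# Dummy single-band 32x32 image: (batch, channels, height, width).
x = torch.randn(1, 1, 32, 32)

conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
pool = nn.MaxPool2d(kernel_size=2)     # halves the spatial size
relu = nn.ReLU()                       # ReLU(x) = max(x, 0)

feature_map = relu(conv(x))            # (1, 8, 32, 32): one map per filter
pooled = pool(feature_map)             # (1, 8, 16, 16): downsampled features
```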

3.3 Residual network (ResNet)

A residual network is formed by stacking several residual blocks together. Each residual block consists of convolution layers, batch normalization, and activation layers. Batch normalization processes the data and brings numerical stability through scaling techniques, without distorting the structure of the data. The activation layer is added to the residual network to help the neural network learn more complex data. CNNs and other deep learning methods use the ReLU (rectified linear unit) function in the activation layer to accommodate the nonlinear nature of image data in the output. The residual blocks allow information to flow from the first layer to the last layers of the network through a residual or skip connection strategy. Therefore, ResNet can effectively carry features of the input data through to the output of the network and thus alleviate vanishing gradient problems.
Let x be the input to a residual block. Processing x with the two stacked convolution layers of a residual unit yields $F(W_i x)$, where $W_i$ is the weight of the convolution layers. In ResNet, before the output of the stacked layers is passed to the next layer, the input x is added back to it, providing an additional identity mapping known as a skip connection. Therefore, the general formulation of a residual block can be represented as follows:

$$y = F(W_i x) + x \tag{2}$$

Here x is the input and y is the output of the residual unit; y is then passed as input to the next residual block. The function $F(W_i x)$ represents the output of the stacked convolution layers, and $W_i$ is the weight associated with the $i$-th residual block. Figure 1 uses two convolution layers in the residual unit, so the output of this residual layer can be written as:

$$F(x, W) = W_2 \,\mathrm{ReLU}(W_1 x) \tag{3}$$

where ReLU represents the rectified linear unit nonlinear activation function, and $W_1$ and $W_2$ are the weights associated with convolution layers 1 and 2 of the residual block. Deep residual networks consist of many stacked residual blocks, and each block can be formulated in general as follows:

$$x_{i+1} = F(x_i, W_l) + x_i \tag{4}$$

where F is the output of a residual block with $l$ stacked convolution layers and $x_i$ is the residual connection into the $i$-th residual block; $x_{i+1}$ then becomes the output of the $i$-th residual block, calculated by a skip connection and element-wise addition.

Figure 1.
HS–MS fusion using CNN.

After passing through the ReLU activation layer, the output of the residual network can be represented as:

$$y = \mathrm{ReLU}(x_{i+1}) \tag{5}$$
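The residual unit of Eqs. (2)-(5) can be sketched in PyTorch as follows. The chapter gives no implementation, so the channel count and kernel size here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two stacked conv layers plus an identity skip: y = ReLU(F(x) + x)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv2(self.relu(self.conv1(x)))  # F(x) = W2 ReLU(W1 x), Eq. (3)
        return self.relu(out + x)                   # skip connection, Eqs. (2) and (5)

y = ResidualBlock(64)(torch.randn(1, 64, 32, 32))   # shape preserved: (1, 64, 32, 32)
```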

4. Problem formulation

Let $Z \in \mathbb{R}^{L \times N}$ be a high-resolution hyperspectral image with L spectral bands and N pixels. The observed LR-HSI, obtained by downsampling the spatial quality of Z with a Gaussian blur factor d, is represented as $Y_h \in \mathbb{R}^{L \times N/d}$ with L bands and N/d pixels. Similarly, the observed HR-MSI, obtained by downsampling the spectral quality of Z, is represented as $Y_m \in \mathbb{R}^{L_m \times N}$ with $L_m$ bands and N pixels, where $L_m < L$ [27].
Therefore, the hyperspectral image can be mathematically modeled as:

$$Z = EA + R \tag{6}$$

where Z is the original reference image, E and A are the endmember and abundance matrices, and R is the residual matrix. The observed $Y_m$ and $Y_h$ are spectrally and spatially degraded versions of image Z, respectively, and are mathematically represented by:

$$Y_m \approx SZ + R_m \tag{7}$$
$$Y_h \approx ZB + R_h \tag{8}$$

where $B \in \mathbb{R}^{N \times N/d}$ is a Gaussian blur filter with blurring factor d, used to blur the spatial quality of the reference hyperspectral image Z to obtain the LR-HSI $Y_h$. The spectral response function $S \in \mathbb{R}^{L_m \times L}$ is used to downsample the spectral quality of the reference hyperspectral image Z to obtain the HR-MSI $Y_m$. The term $L_m$ is the number of spectral bands in the multispectral image after downsampling. In this work, the reference image Z is downsampled in its spectral dimension using the standard Landsat 7 multispectral bands, which provide a high-quality visual image of Earth's surface, as the HR-MSI with $L_m = 7$ [28]. Both B and S are sparse matrices containing zeros and ones. In the literature, the residual matrices $R_m$ and $R_h$ are generally assumed to be zero-mean Gaussian noise. The original CNMF objective is therefore:

$$\mathrm{CNMF}(E, A) = \|Y_h - EA_h\|_F^2 + \|Y_m - E_m A\|_F^2 \tag{9}$$

However, in this work we use the residual terms $R_m$ and $R_h$ as nonnegative residual matrices to account for nonlinearity effects in the image fusion [29].

The objective function of the original CNMF method expressed in Eq. (9) can thus be rewritten as:

$$\mathrm{CNMF}(E, A, R) = \|Y_h - (EA_h + R_h)\|_F^2 + \|Y_m - (E_m A + R_m)\|_F^2 \tag{10}$$

Eq. (10) therefore represents the proposed HS-MS fusion model, which includes the nonlinear nature of the image. To implement this model, we use the standard deep neural network architectures CNN and ResNet. For further enhancement of the proposed method, we implement a modified ResNet architecture with different numbers of stacked layers and multiple skip connections.
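As a toy illustration of the observation model in Eqs. (7) and (8), the NumPy sketch below builds a crude band-averaging spectral response S and a block-averaging blur/downsampling operator B. Both operator forms and all sizes are our own simplifying assumptions (the chapter uses a Gaussian blur and the Landsat 7 response), and the block averaging ignores the 2D pixel layout.

```python
import numpy as np

L, N, d, Lm = 100, 64 * 64, 4, 7         # illustrative sizes; Lm = 7 as with Landsat 7
Z = np.random.rand(L, N)                 # reference HR-HSI, one column per pixel

# S averages groups of adjacent bands (a crude spectral response), Eq. (7).
S = np.zeros((Lm, L))
for i in range(Lm):
    S[i, i * (L // Lm):(i + 1) * (L // Lm)] = 1.0 / (L // Lm)
Ym = S @ Z                               # HR-MSI: Lm bands, N pixels

# B averages blocks of d pixels (blur + downsample in one matrix), Eq. (8).
B = np.zeros((N, N // d))
for j in range(N // d):
    B[j * d:(j + 1) * d, j] = 1.0 / d
Yh = Z @ B                               # LR-HSI: L bands, N/d pixels
```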

5. Problem implementation

5.1 CNN fusion architecture

In the CNN architecture, a 1D convolution operation is performed over the observed HS image $Y_h$ of dimension $L_h \times N_h$, with $L_h$ spectral bands and $N_h$ pixels, with the help of a filter to obtain the spectral data. In the same way, a 2D convolution operation is performed over the observed MS image $Y_m$ of dimension $L_m \times N_m$, with $L_m$ spectral bands and $N_m$ pixels, to obtain the spatial data. Finally, the high spectral component obtained from $Y_h$ and the high spatial component obtained from $Y_m$ are fused together to reconstruct the HR-HSI. The entire deep neural network-based HS-MS fusion is shown in Figure 1.
In the CNN architecture, a Conv1D() convolution filter with kernel size r and weights v is used to extract the spectral data from the LR-HSI $Y_h$, represented as follows:

$$f_{spec} = \mathrm{Conv1D}(\mathrm{ReLU}(F(v_i Y_h))) \tag{11}$$

Similarly, a Conv2D() convolution filter with kernel size r × r and weights w is used to extract the spatial data from the HR-MSI $Y_m$:

$$f_{spat} = \mathrm{Conv2D}(\mathrm{ReLU}(F(w_{ij} Y_m))) \tag{12}$$

The two convolutional layers use ReLU (rectified linear unit) activation functions, i.e., ReLU(x) = max(x, 0), to provide a nonlinear mapping of the data. Finally, the extracted spatial and spectral features are fused to obtain the high-quality reconstructed image, as shown in Eq. (13):

$$F = \mathrm{ReLU}(f_{spec} \times f_{spat}) \tag{13}$$

To implement this CNN fusion architecture, we use two convolution networks, 1D and 2D. Both use the same number of convolution layers: each network has four convolution layers with 32, 64, 128, and 256 filters. Kernel sizes of 3 × 3 and 1 × 3 are used for the 2D CNN and the 1D CNN, respectively, to extract the spatial and spectral information of the image. The architecture and parameters of the CNN HS-MS fusion are shown in Table 1.

| Layer | Type | Filters | Kernel size | Stride | Padding | Activation |
|---|---|---|---|---|---|---|
| Conv 1 | Conv 1D | 32 | 1 × 3 | 1 | Same | ReLU |
| | Conv 2D | 32 | 3 × 3 | 1 | Same | ReLU |
| Conv 2 | Conv 1D | 64 | 1 × 3 | 1 | Same | ReLU |
| | Conv 2D | 64 | 3 × 3 | 1 | Same | ReLU |
| Conv 3 | Conv 1D | 128 | 1 × 3 | 1 | Same | ReLU |
| | Conv 2D | 128 | 3 × 3 | 1 | Same | ReLU |
| Conv 4 | Conv 1D | 256 | 1 × 3 | 1 | Same | ReLU |
| | Conv 2D | 256 | 3 × 3 | 1 | Same | ReLU |
| Output layer | Conv 1D | 1 | 1 × 1 | 1 | Same | ReLU |
| | Conv 2D | 1 | 1 × 1 | 1 | Same | ReLU |

Table 1.
The simple CNN fusion architecture.
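A hedged PyTorch sketch of the two branches in Table 1 is given below. The exact layer wiring and the broadcasting used to fuse the two feature maps (Eq. (13)) are our assumptions rather than the chapter's actual implementation.

```python
import torch
import torch.nn as nn

def branch(conv, ch_in):
    # Four conv layers with 32/64/128/256 filters plus a 1x1 output layer, per Table 1.
    layers, ch = [], ch_in
    for ch_out in (32, 64, 128, 256):
        layers += [conv(ch, ch_out, kernel_size=3, padding=1), nn.ReLU()]
        ch = ch_out
    layers += [conv(ch, 1, kernel_size=1)]
    return nn.Sequential(*layers)

spectral_net = branch(nn.Conv1d, 1)                # 1D branch for LR-HSI spectra
spatial_net = branch(nn.Conv2d, 1)                 # 2D branch for HR-MSI bands

f_spec = spectral_net(torch.randn(1, 1, 100))      # Eq. (11): a 100-band spectrum
f_spat = spatial_net(torch.randn(1, 1, 64, 64))    # Eq. (12): a 64x64 band

# One plausible fusion scheme for Eq. (13): broadcast the spectral vector
# across the spatial map, giving a (1, 100, 64, 64) fused cube.
fused = torch.relu(f_spec.view(1, -1, 1, 1) * f_spat)
```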

In a CNN, each layer takes as input the output of the previous layer, which loses information as the network architecture gets deeper. This problem in deep neural networks leads to overfitting of the data and is known as the vanishing gradient problem [24]. To overcome this, we implemented HS-MS fusion using an alternative ResNet-based network architecture. In ResNet, we introduce a skip connection between two convolution layers. This skip connection helps to carry identity information throughout the deep convolutional network.

5.2 ResNet fusion architecture

The ResNet fusion architecture for HS-MS fusion uses residual or skip connections, which improve the feature extraction capability from the images. For implementation, we use a 1D ResNet to extract the spectral features from the LR-HSI and a 2D ResNet to extract the spatial features from the HR-MSI. Both the 1D and 2D ResNet architectures consist of three residual blocks, each having two convolutional layers and 64 filters, as shown in Figure 2. A 3 × 3 kernel for the 2D ResNet and a 1 × 3 kernel for the 1D ResNet are used to extract the spatial and spectral data from the MSI and HSI. Each residual block has a ReLU activation layer to accommodate the nonlinearity constraints included in the proposed hyperspectral image fusion model, as explained in Eq. (10). Finally, the feature embedding and image reconstruction are performed using another 2D CNN.

Figure 2.
Residual block with two stacked layers.


A. Spectral generative network

The spectral data of the hyperspectral image $Y_h$ are extracted using the 1D ResNet. Initially, spectral data are extracted from the LR-HSI using a 1D CNN, and the residual connection $r(Y_h)$ is mapped with the stacked convolution layers. The output of the 1D CNN and $r(Y_h)$ are then given as input to the next residual block, and this process is repeated for every residual block in the ResNet. The entire process in the 1D ResNet is expressed mathematically as:

$$f(Y_h^l) = \mathrm{ReLU}(W_l Y_h^l) \tag{14}$$

$$f_{spec}(Y_h^l) = f(Y_h^l) + r(Y_h^l) \tag{15}$$

Therefore, the output of the $i$-th residual block is represented as:

$$f_{spec}^{i} = f_{spec}^{i-1}(Y_h^l) + r^{i-1}(Y_h^l) \tag{16}$$

where $Y_h$ denotes the input LR-HSI data, $i$ indexes the residual units ($i = 1, 2, \ldots, I$), and $l$ indexes the convolution layers ($l = 1, 2, \ldots, L$). The weight of the convolution kernel is represented as W. Finally, the ReLU activation function is applied to introduce nonlinearity into the output of the deep network:

$$F_{spec} = \mathrm{ReLU}(f_{spec}) \tag{17}$$

B. Spatial generative network

The spatial data of the HR-MSI $Y_m$ are extracted using the 2D ResNet. Initially, spatial data are extracted from the HR-MSI using a 2D CNN, and the residual connection $r(Y_m)$ is mapped with the stacked convolution layers. The output of the 2D CNN and $r(Y_m)$ are then given as input to the next residual block, and this process is repeated for every residual block in the ResNet. The entire process in the 2D ResNet is expressed mathematically as:

$$f(Y_m^l) = \mathrm{ReLU}(W_l Y_m^l) \tag{18}$$

$$f_{spat}(Y_m^l) = f(Y_m^l) + r(Y_m^l) \tag{19}$$

Therefore, the output of the $i$-th residual block is represented as:

$$f_{spat}^{i} = f_{spat}^{i-1}(Y_m^l) + r^{i-1}(Y_m^l) \tag{20}$$

where $Y_m$ denotes the input HR-MSI data, $i$ indexes the residual blocks ($i = 1, 2, \ldots, I$), and $l$ indexes the convolution layers ($l = 1, 2, \ldots, L$). The weight of the convolution kernel is represented as W. Finally, as for the spectral extraction, ReLU is applied to introduce nonlinearity into the spatial output of the deep network:

$$F_{spat} = \mathrm{ReLU}(f_{spat}) \tag{21}$$


C. Fusion of spectral-spatial data

The spectral data from the LR-HSI and the spatial data from the HR-MSI are extracted using ResNet, with sizes (1 × 1 × Spec) and (Spat × Spat × 1), respectively. After obtaining the spatial and spectral features, the next step is to fuse this information by element-wise multiplication:

$$F_Z = F_{spec} \times F_{spat} \tag{22}$$

Then, the feature embedding and image reconstruction are performed using a ReLU activation layer. The proposed ResNet fusion framework is shown in Figure 3. The final generated HR-HSI Z can therefore be written as:

$$Z = \mathrm{ReLU}(F_Z) \tag{23}$$

D. Different stacked layers and skip connections

We also propose an extension to the ResNet fusion architecture that varies the number of stacked convolution layers (2 to 4) in the residual block to increase the fusion performance of the deep network. The two-layer residual block contains two stacked convolution layers followed by a ReLU activation layer. Similarly, the three-layer and four-layer residual blocks contain three and four stacked convolution layers, each followed by a ReLU activation layer. In addition, we extend the ResNet fusion architecture with different skip connections, which help to regulate the flow of information through a deeper network more effectively. For this, we use long skip and dense skip connections, as shown in Figure 4. The long skip connections are designed by connecting alternate residual layers, the i-th and (i + 2)-th, along with a short skip connection between every layer in the ResNet. In a dense skip connection, each layer i obtains additional input from all the preceding layers and passes its own feature maps to all subsequent layers. Using the dense skip connection, each layer in the ResNet receives feature maps from all the preceding layers, which limits the number of filters and network parameters needed for extracting deep features. To obtain a high-fidelity reconstructed image, we propose a modified version of ResNet with long and dense skip connections, shown in Figure 4.

Figure 3.
The framework of the proposed ResNet fusion architecture.

Figure 4.
Representation of short, long, and dense skip connections on ResNet.

Figure 4 shows three ResNet architectures, each having three residual blocks (Res Block), with three different types of skip connections. Algorithm 1 summarizes the procedure of our proposed ResNet fusion method.

Algorithm 1: ResNet Fusion

Input: LR hyperspectral image $Y_h$ and HR multispectral image $Y_m$

begin
1. Extract spectral features from $Y_h$ and spatial features from $Y_m$ using ResNet
2. $r(Y_h) \leftarrow Y_h$ and $r(Y_m) \leftarrow Y_m$
3. for each residual block in the ResNet, i = 1, 2, ..., I:
4.    for each convolution layer l in the residual block, l = 2, 3, 4:   # stacked convolution layers
         $f(Y_h^l) = \mathrm{ReLU}(W_l Y_h^l)$
         $f(Y_m^l) = \mathrm{ReLU}(W_l Y_m^l)$
      end for
      # add the residual connection
      $f_{spec}(Y_h^l) = f(Y_h^l) + r(Y_h^l)$
      $f_{spat}(Y_m^l) = f(Y_m^l) + r(Y_m^l)$
      $r(Y_h) \leftarrow f_{spec}(Y_h^l)$
      $r(Y_m) \leftarrow f_{spat}(Y_m^l)$
   end for
5. The extracted spectral features $F_{spec}$ of size (1 × 1 × Spec) and spatial features $F_{spat}$ of size (Spat × Spat × 1) are fused by element-wise multiplication:
6. $F_Z = F_{spec} \times F_{spat}$
7. Finally, generate the HR-HSI after feature embedding and image reconstruction using a ReLU activation layer:
8. $Z = \mathrm{ReLU}(F_Z)$
end

Output: HR hyperspectral image Z
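A minimal PyTorch sketch of the dense-skip residual stack described above is given below. The chapter does not provide an implementation, so the layer counts (three blocks, two stacked 3 × 3 convolution layers, 64 filters) follow the text, while the exact dense wiring (summing all preceding outputs) and all other details are our assumptions.

```python
import torch
import torch.nn as nn

class DenseSkipResNet(nn.Module):
    """Three residual blocks; each block's input is the sum of all earlier outputs."""
    def __init__(self, channels: int = 64, n_blocks: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            for _ in range(n_blocks)
        ])
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outputs = [x]                                    # r(Y) starts as the input itself
        for block in self.blocks:
            dense_in = torch.stack(outputs).sum(dim=0)   # dense skip: all predecessors
            outputs.append(self.relu(block(dense_in) + dense_in))
        return outputs[-1]

features = DenseSkipResNet()(torch.randn(1, 64, 32, 32))  # shape preserved
```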


6. Results and discussion

In this paper, we initially implemented CNN-based fusion by extracting the spectral data from the LR-HSI using a 1D convolution network and the spatial data from the HR-MSI using a 2D convolution network. These extracted spatial and spectral features are then fused together to obtain the HR-HSI. Extracting more detailed features from the HS and MS images requires a deeper CNN architecture, but as the CNN architecture becomes deeper it introduces the vanishing gradient problem. To overcome this, we implemented an unsupervised ResNet fusion network using skip connections. The proposed ResNet fusion inherits all the advantages of a standard CNN; in addition, ResNet allows designing a deeper network without performance degradation during feature extraction. Therefore, the proposed ResNet fusion architecture extracts more discriminative features from both the HSI and MSI and finally reconstructs a high-resolution HSI by fusing these high-quality features.
The performance of the CNN and ResNet fusion methods is evaluated on four benchmark datasets using standard quality measures, namely SAM, ERGAS, PSNR, and UIQI [30]. Further, we compared the performance of CNN and ResNet fusion against the baseline fusion methods, namely CNMF [14], FC-CNMF, and S2FEF-CNN [23]. Of these, CNN performs better than CNMF and FC-CNMF, and ResNet-based fusion shows outstanding performance compared to all other methods, including CNN. The results obtained by the CNN and ResNet fusion methods against the baseline methods on the four benchmark datasets are shown in Table 2. A low SAM indicates good spectral data in the fused image, and a low ERGAS indicates good statistical quality of the reconstructed image. High PSNR and UIQI values indicate good spatial quality and a high-fidelity reconstructed image with little spectral distortion.

| Dataset | Metric | CNMF | FC-CNMF | CNN | S2FEF-CNN | ResNet |
|---|---|---|---|---|---|---|
| Pavia University | SAM | 0.0633 | 0.0652 | 0.0451 | 0.0441 | 0.0409 |
| | ERGAS | 0.5423 | 0.4502 | 0.4311 | 0.4901 | 0.4029 |
| | PSNR (dB) | 64.4502 | 64.8923 | 65.1299 | 64.4915 | 66.1127 |
| | UIQI | 0.8779 | 0.9316 | 0.9262 | 0.9665 | 0.9872 |
| Indian Pines | SAM | 0.5113 | 0.3976 | 0.4525 | 0.4118 | 0.3896 |
| | ERGAS | 0.8733 | 0.6991 | 0.6434 | 0.7192 | 0.6170 |
| | PSNR (dB) | 62.6779 | 63.1076 | 63.1311 | 64.8165 | 65.2971 |
| | UIQI | 0.7988 | 0.8432 | 0.8118 | 0.8776 | 0.8991 |
| Washington DC Mall | SAM | 0.5609 | 0.5998 | 0.5956 | 0.5519 | 0.5171 |
| | ERGAS | 0.5741 | 0.5034 | 0.4993 | 0.4886 | 0.4850 |
| | PSNR (dB) | 64.09 | 64.12 | 64.19 | 65.11 | 65.1358 |
| | UIQI | 0.9199 | 0.9409 | 0.9213 | 0.9365 | 0.9656 |
| Botswana | SAM | 0.2541 | 0.2179 | 0.2233 | 0.2108 | 0.1908 |
| | ERGAS | 0.5194 | 0.4989 | 0.5034 | 0.4992 | 0.4698 |
| | PSNR (dB) | 63.1123 | 63.4321 | 63.9019 | 64.0116 | 64.8798 |
| | UIQI | 0.9703 | 0.9772 | 0.9715 | 0.9827 | 0.9960 |

Table 2.
The performance evaluation of different fusion algorithms on four hyperspectral datasets.


From Table 2, it is further clear that good spectral preservation is obtained on the Botswana dataset, where the SAM value is reduced by more than 0.02. Simultaneously, significant spatial preservation is achieved on the Indian Pines dataset, revealed by a PSNR increase of 1.5 dB.
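For reference, common textbook definitions of SAM and PSNR can be implemented as below; these are standard formulations and may differ in detail from the exact variants used to produce Table 2.

```python
import numpy as np

def sam(ref, est, eps=1e-12):
    """Mean spectral angle (radians) between per-pixel spectra; bands on axis 0."""
    dot = (ref * est).sum(axis=0)
    denom = np.linalg.norm(ref, axis=0) * np.linalg.norm(est, axis=0) + eps
    return np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0)))

def psnr(ref, est):
    """Peak signal-to-noise ratio in dB, assuming data scaled to [0, 1]."""
    mse = np.mean((ref - est) ** 2)
    return 10.0 * np.log10(1.0 / mse)

Z_ref = np.random.rand(100, 64, 64)      # bands x height x width (illustrative)
Z_est = Z_ref + 0.01 * np.random.randn(*Z_ref.shape)
print(sam(Z_ref.reshape(100, -1), Z_est.reshape(100, -1)), psnr(Z_ref, Z_est))
```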
The above work is extended by introducing different numbers of stacked convolution layers in the residual block of the ResNet. The experimental results obtained with different stacked convolution layers are shown in Table 3. From the SAM values in Table 3, it is clear that the spectral quality of the image degrades as the number of stacked layers in the residual block increases. The UIQI values in Table 3 also reveal that the quality of the reconstructed image diminishes as the number of stacked layers increases. The PSNR and ERGAS show stable performance, which confirms the spatial consistency of our proposed method. Analyzing the results in Table 3, we conclude that the ResNet fusion network with two stacked convolution layers acquires the most discriminative features from the source images and guarantees the quality of the reconstructed image.
Figure 5 shows the visual output of our proposed ResNet fusion method on the four benchmark datasets against all the baseline methods. From the figure, it is evident that ResNet fusion with two stacked convolution layers produces better results in most of the highlighted areas of the images from the four datasets.
We further extend the ResNet fusion architecture to reduce the number of parameters, making our proposed method more efficient and effective in handling high-dimensional data.
| Dataset | Metric | 2 layers | 3 layers | 4 layers |
|---|---|---|---|---|
| Pavia University | SAM | 0.0409 | 0.065 | 0.069 |
| | ERGAS | 0.4029 | 0.4029 | 0.4029 |
| | PSNR (dB) | 66.1127 | 66.1127 | 66.1127 |
| | UIQI | 0.9872 | 0.9713 | 0.9622 |
| Indian Pines | SAM | 0.3896 | 0.4186 | 0.4553 |
| | ERGAS | 0.6170 | 0.6170 | 0.6170 |
| | PSNR (dB) | 65.2971 | 65.2971 | 65.2971 |
| | UIQI | 0.8991 | 0.8904 | 0.8801 |
| Washington DC Mall | SAM | 0.5171 | 0.5529 | 0.5721 |
| | ERGAS | 0.4850 | 0.4850 | 0.4850 |
| | PSNR (dB) | 65.1358 | 65.1358 | 65.1358 |
| | UIQI | 0.9656 | 0.9432 | 0.9209 |
| Botswana | SAM | 0.1908 | 0.1978 | 0.2085 |
| | ERGAS | 0.4698 | 0.4698 | 0.4698 |
| | PSNR (dB) | 64.8798 | 64.8798 | 64.8798 |
| | UIQI | 0.9960 | 0.9822 | 0.9589 |

Table 3.
The performance of ResNet fusion with varying numbers of stacked convolution layers.


Figure 5.
The ground truth and fused image of different methods using four benchmark datasets.

| Architecture | Number of parameters |
|---|---|
| CNN | 31,586,081 |
| ResNet with short skip | 8,045,825 |
| ResNet with long skip | 390,529 |
| ResNet with dense skip | 19,393 |

Table 4.
The number of network parameters for each skip connection.

For that, we applied short skip, long skip, and dense skip connections to the ResNet architecture with two stacked convolution layers. Table 4 gives the total number of network parameters required by the ResNet architecture for each skip connection. From Table 4, it is clear that the ResNet architecture with dense skip connections requires far fewer network parameters than ResNet with short or long skip connections.
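For completeness, parameter counts like those in Table 4 can be obtained for any PyTorch model with a short helper (shown on the sketches above as an assumption; this is not necessarily how the chapter's numbers were produced):

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Example on a single convolution layer: 64 filters of size 1x3x3 plus biases.
print(count_parameters(nn.Conv2d(1, 64, kernel_size=3)))  # 1*3*3*64 + 64 = 640
```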

A. Time complexity
The performance and running time of all the proposed algorithms on the four benchmark datasets are compared in Figure 6. From this figure, it is evident that ResNet fusion with a dense skip connection takes the least running time while showing good performance in reconstructing a high-fidelity hyperspectral image.

[Figure 6: bar chart of running time in seconds for FC-CNMF, CNN, short-skip ResNet, and long-skip ResNet on the Washington DC Mall, Pavia University, Indian Pines, and Botswana datasets.]

Figure 6.
The running time of traditional and deep learning HS-MS image fusion.

Comparing ResNet with long skip and short skip connections, the long skip ResNet fusion architecture shows better performance and running time than the short skip variant. Evaluating all the ResNet fusion architectures, ResNet with a dense skip connection outperformed the other two. Comparing performance and running time overall, the FC-CNMF method performed better and ran faster than CNN-based fusion. Finally, we conclude that ResNet with a dense skip connection, with its small number of network parameters, shows the best performance in reconstructing an HR-HSI of good spatial and spectral quality compared to all the other proposed methods. However, although all our proposed methods perform well, the cost incurred in terms of time is high.

B. ResNet HS-MS fusion model

We analyzed our ResNet fusion architecture experimentally with various parameters to build a general model for the proposed HS-MS ResNet fusion algorithm. For this purpose, we trained the network using cropped HSI and MSI image pairs from each dataset; that is, each dataset is cropped into several patches and then divided into training and testing data. In the case of the Pavia University dataset, the 610 × 340 × 103 cube is cropped into several patches of size M × N × L; a patch size of M × N × L = 15 × 15 × 103 gave high performance for our network model. Similarly, we created training and testing samples for the other three datasets. The patch size for the Washington DC Mall dataset was M × N × L = 19 × 19 × 191, for the Botswana dataset 17 × 17 × 145, and for the Indian Pines dataset 19 × 19 × 192, yielding a network model with good running time and network parameters.
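A minimal sketch of this patch-cropping step is shown below; the non-overlapping tiling and border handling are our assumptions, since the chapter does not specify the cropping stride.

```python
import numpy as np

def crop_patches(cube, patch):
    """Crop an H x W x L cube into non-overlapping patch x patch x L tiles."""
    H, W, _ = cube.shape
    return [
        cube[r:r + patch, c:c + patch, :]
        for r in range(0, H - patch + 1, patch)
        for c in range(0, W - patch + 1, patch)
    ]

hsi = np.random.rand(128, 128, 64)   # small illustrative cube (H x W x L)
patches = crop_patches(hsi, 15)      # list of 15 x 15 x 64 training patches
```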
We measured the quality metric values of our ResNet fusion while varying the number of stacked layers and found that residual blocks with two stacked convolution layers each perform better than the others. The most significant part of ResNet is the skip connection, which helps information flow through the network more efficiently and effectively. So, we also experimented with three skip connections: short skip, long skip, and dense skip.


| Name | Layer | Type | Kernel size | Input size | Input content | Stride | Padding | Activation | Output size | Output content |
|---|---|---|---|---|---|---|---|---|---|---|
| Input layer | Conv 1 | 1D-CNN | 1 × 3 | 1 | 1D image (spectral) | 1 | same | ReLU | 64 | 1DConv1 |
| | | 2D-CNN | 3 × 3 | 2 | 2D image (spatial) | 1 | same | ReLU | 64 | 2DConv1 |
| Residual block 1 | Conv 2 | 1D-CNN | 1 × 3 | 64 | 1DConv1 | 1 | same | ReLU | 64 | 1DConv2 |
| | | 2D-CNN | 3 × 3 | 64 | 2DConv1 | 1 | same | ReLU | 64 | 2DConv2 |
| | Conv 3 | 1D-CNN | 1 × 3 | 64 | 1DConv2 | 1 | same | ReLU | 64 | 1DConv3 |
| | | 2D-CNN | 3 × 3 | 64 | 2DConv2 | 1 | same | ReLU | 64 | 2DConv3 |
| Skip connection | Add 1 | - | - | - | 1DConv1 + 1DConv3 | - | - | - | - | 1DResB1 |
| | | | | | 2DConv1 + 2DConv3 | | | | | 2DResB1 |
| Residual block 2 | Conv 4 | 1D-CNN | 1 × 3 | 64 | 1DResB1 | 1 | same | ReLU | 64 | 1DConv4 |
| | | 2D-CNN | 3 × 3 | 64 | 2DResB1 | 1 | same | ReLU | 64 | 2DConv4 |
| | Conv 5 | 1D-CNN | 1 × 3 | 64 | 1DConv4 | 1 | same | ReLU | 64 | 1DConv5 |
| | | 2D-CNN | 3 × 3 | 64 | 2DConv4 | 1 | same | ReLU | 64 | 2DConv5 |
| Skip connection | Add 2 | - | - | - | 1DConv1 + 1DResB1 + 1DConv5 | - | - | - | - | 1DResB2 |
| | | | | | 2DConv1 + 2DResB1 + 2DConv5 | | | | | 2DResB2 |
| Residual block 3 | Conv 6 | 1D-CNN | 1 × 3 | 64 | 1DResB2 | 1 | same | ReLU | 64 | 1DConv6 |
| | | 2D-CNN | 3 × 3 | 64 | 2DResB2 | 1 | same | ReLU | 64 | 2DConv6 |
| | Conv 7 | 1D-CNN | 1 × 3 | 64 | 1DConv6 | 1 | same | ReLU | 64 | 1DConv7 |
| | | 2D-CNN | 3 × 3 | 64 | 2DConv6 | 1 | same | ReLU | 64 | 2DConv7 |
| Skip connection | Add 3 | - | - | - | 1DConv1 + 1DResB1 + 1DResB2 + 1DConv7 | - | - | - | - | 1DResB3 |
| | | | | | 2DConv1 + 2DResB1 + 2DResB2 + 2DConv7 | | | | | 2DResB3 |
| Max pooling | Conv 8 | 1D-CNN | 1 × 3 | 64 | 1DResB3 | 1 | same | ReLU | 32 | 1DConv8 |
| | | 2D-CNN | 3 × 3 | 64 | 2DResB3 | 1 | same | ReLU | 32 | 2DConv8 |
| Flatten layer | Conv 9 | 1D-CNN | 1 × 1 | 32 | 1DConv8 | 1 | same | ReLU | 1 | Spectral data |
| | | 2D-CNN | 1 × 1 | 32 | 2DConv8 | 1 | same | ReLU | 1 | Spatial data |
| Upsampling layer | Conv 10 | 2D-CNN | 3 × 3 | 1 | Spectral/spatial data | 1 | same | ReLU | 32 | Spectral × spatial |
| Output layer | Conv 11 | 2D-CNN | 3 × 3 | 32 | Spectral × spatial | 1 | same | ReLU | 64 | Fused image |

Table 5.
ResNet dense-skip architecture for HS-MS image fusion.

From this experiment, we found that ResNet with a dense skip connection reduces the number of network parameters to a large extent.
Finally, we built a generative ResNet model for HS-MS image fusion, as shown in Table 5. The ResNet fusion model uses 1D and 2D convolution networks. These two convolution networks consist of three residual blocks; each residual block contains two convolution layers with 64 filters, a 3 × 3 kernel, stride = 1, max pooling, and "same" padding. To make information flow accurately throughout the network, we use dense skip connections. At the end, a 2D convolution decodes the reconstructed image into the original format.

7. Conclusion

In this work, we implemented HS-MS fusion with deep learning methods because of their strong ability to extract features from images. At first, we implemented the HS-MS fusion process with a conventional CNN. In a CNN, however, each layer takes the output of the previous layer, which tends to lose information as the network grows deeper. We therefore further implemented the fusion process in ResNet by adding skip connections between the convolution layers. These skip connections help to extract more detailed features from the images without degradation problems. Our ResNet fusion architecture includes three residual blocks, each a combination of stacked convolution layers and skip connections. Moreover, we modified the ResNet fusion architecture with different numbers of stacked layers and found that ResNet with two stacked layers gives the most accurate results. Finally, we extended the ResNet architecture to reduce the number of parameters by using different skip connections: short skip, long skip, and dense skip. From the experimental analysis, we found that the ResNet dense-skip variant improves image reconstruction performance with far fewer network parameters and less running time than the other fusion methods. This deep residual network helps to extract nonlinear features with the help of the ReLU activation layer. The experiments and performance analysis of our algorithm were carried out quantitatively on four benchmark datasets. The fusion results indicate that the ResNet dense-skip fusion method shows outstanding performance over traditional and DL methods, preserving the spatial and spectral data to a large extent in the reconstructed image.


Author details

K. Priya* and K.K. Rajkumar


Department of Information Technology, Kannur University, Kerala, India

*Address all correspondence to: [email protected]

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of
the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided
the original work is properly cited.

References

[1] Michael NH, Kudenov W. Review of snapshot spectral imaging technologies. Optical Engineering. 2013;52(10):090901

[2] Feng F, Zhao B, Tang L, Wang W, Jia S. Robust low-rank abundance matrix estimation for hyperspectral unmixing. IET International Radar Conference (IRC 2018). 2019;2019(21):6406-6409

[3] Dhore AD, Veena CS. Evaluation of various pansharpening methods using image quality metrics. In: 2nd International Conference on Electronics and Communication Systems (ICECS). IEEE; 2015. DOI: 10.1109/ecs.2015.7125039

[4] Wang Z, Chen B, Ruiying L, Zhang H, Liu H, Varshney PK. FusionNet: An unsupervised convolutional variational network for hyperspectral and multispectral image fusion. IEEE Transactions on Image Processing. 2020;29:7565-7577

[5] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE; 2016. pp. 770-778

[6] Loncan L, de Almeida LB, Bioucas-Dias JM, Briottet X, et al. Hyperspectral pansharpening: A review. IEEE Geoscience and Remote Sensing Magazine. 2015;3(3):27-46

[7] Vivone G, et al. A critical comparison among pansharpening algorithms. IEEE Transactions on Geoscience and Remote Sensing. 2015;53(5):2565-2586

[8] Wei Q, Bioucas-Dias J, Dobigeon N, Tourneret JY. Hyperspectral and multispectral image fusion based on a sparse representation. IEEE Transactions on Geoscience and Remote Sensing. 2015;53:3658-3668

[9] Paatero P, Tapper U. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics. 1994;5:111-126

[10] Lee DD, Seung HS. Algorithms for non-negative matrix factorization. In: Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press; 2001. pp. 556-562

[11] Tong L, Zhou J, Qian B, Yu J, Xiao C. Adaptive graph regularized multilayer nonnegative matrix factorization for hyperspectral unmixing. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2020;13:434-447

[12] Cao J, et al. An endmember initialization scheme for nonnegative matrix factorization and its application in hyperspectral unmixing. ISPRS International Journal of Geo-Information. 2018;7:195. DOI: 10.3390/ijgi7050195

[13] Nascimento JMP, Bioucas-Dias JM. Vertex component analysis: A fast algorithm to unmix hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing. 2005;43(4)

[14] Yokoya N, Yairi T, Iwasaki A. Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion. IEEE Transactions on Geoscience and Remote Sensing. 2012;50:528-537

[15] Simoes M, Bioucas-Dias J, Almeida L, Chanussot J. A convex formulation for hyperspectral image superresolution via subspace-based regularization. IEEE Transactions on Geoscience and Remote Sensing. 2015;53:3373-3388

[16] Lin C-H, Ma F, Chi C-Y, Hsieh C-H. A convex optimization-based coupled nonnegative matrix factorization algorithm for hyperspectral and multispectral data fusion. IEEE Transactions on Geoscience and Remote Sensing. 2018;56(3):1652-1667. DOI: 10.1109/tgrs.2017.2746078

[17] Yang F, Ma F, Ping Z, Guixian X. Total variation and signature-based regularizations on coupled nonnegative matrix factorization for data fusion. IEEE Access. 2019;7:2695-2706. DOI: 10.1109/ACCESS.2018.2857943

[18] Yang F, Ping Z, Ma F, Wang Y. Fusion of hyperspectral and multispectral images with sparse and proximal regularization. IEEE Access. 2019. DOI: 10.1109/ACCESS.2019.2961240

[19] Palsson F, Sveinsson JR, Ulfarsson MO. Multispectral and hyperspectral image fusion using a 3-D convolutional neural network. IEEE Geoscience and Remote Sensing Letters. 2017;14:639-643

[20] Masi G, Cozzolino D, Verdoliva L, Scarpa G. Pansharpening by convolutional neural networks. Remote Sensing. 2017;8(7):594

[21] Shao Z, Cai J. Remote sensing image fusion with deep convolutional neural network. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2018;11(5):1656-1669

[22] Yang J, Zhao Y-Q, Chan J. Hyperspectral and multispectral image fusion via deep two-branches convolutional neural network. Remote Sensing. 2019;10(5):800

[23] Chen L, Wei Z, Xu Y. A lightweight spectral-spatial feature extraction and fusion network for hyperspectral image classification. Remote Sensing. 2020;12:1395. DOI: 10.3390/rs12091395

[24] Song W, Li S, Fang L, Lu T. Hyperspectral image classification with deep feature fusion network. IEEE Transactions on Geoscience and Remote Sensing. 2018;56(7):3173-3184

[25] Hyperspectral datasets. Available from: http://lesun.weebly.com/hyperspectral-data-set.html

[26] Goodfellow I, Bengio Y, Courville A. Deep Learning. Available from: https://www.deeplearningbook.org/

[27] Ma F, Yang F, Ping Z, Wang W. Joint spatial-spectral smoothing in a minimum-volume simplex for hyperspectral image super-resolution. Applied Sciences. 2019;10(1)

[28] Landsat 7. Available from: https://www.usgs.gov/landsat-missions/landsat-7

[29] Hong D, Yokoya N, Chanussot J, Zhu X. An augmented linear mixing model to address spectral variability for hyperspectral unmixing. IEEE Transactions on Image Processing. 2018

[30] Wang Z, Bovik AC. A universal image quality index. IEEE Signal Processing Letters. 2002;9:81-84
