Article
An Enhanced Spectral Fusion 3D CNN Model for Hyperspectral
Image Classification
Junbo Zhou, Shan Zeng *, Zuyin Xiao, Jinbo Zhou, Hao Li and Zhen Kang
School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430023, China
* Correspondence: [email protected]
Abstract: With the continuous development of hyperspectral image technology and deep learning
methods in recent years, an increasing number of hyperspectral image classification models have
been proposed. However, due to the numerous spectral dimensions of hyperspectral images, most
classification models suffer from issues such as breaking spectral continuity and poor learning of
spectral information. In this paper, we propose a new classification model called the enhanced
spectral fusion network (ESFNet), which contains two parts: an optimized multi-scale fused spectral
attention module (FsSE) and a 3D convolutional neural network (3D CNN) based on the fusion of
different spectral strides (SSFCNN). Specifically, after sampling the hyperspectral images, our model
first implements the weighting of the spectral information through the FsSE module to obtain spectral
data with a higher degree of information richness. Then, the weighted spectral data are fed into
the SSFCNN to realize the effective learning of spectral features. The new model can maximize the
retention of spectral continuity and enhance the spectral information while being able to better utilize
the enhanced information to improve the model’s ability to learn hyperspectral image features, thus
improving the classification accuracy of the model. Experimental results on the Indian Pines and Pavia
University datasets demonstrate that our method outperforms other relevant baselines in terms of
classification accuracy and generalization performance.
Keywords: deep learning; hyperspectral image classification; attention mechanism; feature fusion; 3D CNN

1. Introduction
In recent years, with the continuous development of hyperspectral image (HSI) technology [1], the analysis and processing of hyperspectral data has become one of the hotspots in many research areas [2]. HSIs are characterized by high information content, strong spectral continuity, high spectral resolution and so on. These characteristics allow HSIs to be used in an increasingly wide range of applications, such as environmental monitoring, agricultural production, mineral development and other fields [3–5]. Among the many applications of HSIs, the classification of pixels in images is one of the main research tasks [6].

HSI classification is more complex than traditional image classification to a certain extent. This is mainly reflected in two points: First, the number of HSIs is much smaller than the number of conventional images. Taking the COCO public dataset [7] as an example, it contains 91 easily recognizable object categories with a total of 2.5 million tagged instances in 328,000 images. The common public dataset of HSIs generally has only one original dataset containing spectral data and label data. Second, the spectral dimension of an HSI is much larger than that of a traditional image. The large amount of spectral data makes it difficult for general classifiers to achieve high accuracy, especially when the training samples are extremely limited. Therefore, HSI classification can be studied from two aspects: classification and spectral information processing.

Early researchers faced with the problem of how to deal with complex spectral information mainly processed spectral information via the following methods: principal
convolutional neural networks or other deep learning methods are FusionNet [35],
HSI bidirectional encoder representation from transformers (HSI-BERT) [36], spatial–
spectral transformers (SST) [37] and two-stream spectral-spatial residual networks
(TSRN) [38]. With the CNN model being studied for a long time, the 3D CNN model
was proposed by Tran et al. [39]. The biggest advantage of 3D CNN over 2D CNN
is that the features of the channel dimension can be extracted, which is very suitable
for HSIs. Chen et al. [40] applied 3D CNN to the classification of HSIs. After that,
many researchers have begun using 3D CNN for HSI classification. For example,
Ahmad et al. [41] proposed a 3D CNN model that can rapidly classify hyperspectral
images. Zhong et al. [42] designed a residual module based on 3D CNN to extract
spatial and spectral information and applied it to HSI classification. Laban et al. [43]
proposed a 3D deep learning framework which combined PCA and 3D CNN. Due
to the advantages of 3D convolution, other 3D CNN-based models for hyperspectral image
analysis include spectral four-branch multi-scale networks (SFBMSN) [44], 3D ×
2D CNN [45] and 3D ResNet50 [46]. However, as 3D CNN has the ability to extract
both spatial and spectral information, there is no need to extract spatial and spectral
features separately.
By analyzing these two main lines, we can find some new ideas or problems that can
be solved: (1) Can the two main lines of research be better integrated? Although research
on HSIs serves classification models, ordinary classification models are not effective in
extracting the main features of the spectra due to the complexity of the original spectral
information of hyperspectral images. So, can we design a network structure that can better
learn the spectral features after processing? (2) In terms of classification models, 3D CNN
is theoretically well suited to HSIs. It is worthwhile to try to make the design idea of the
new network structure more closely fit 3D CNN.
In order to solve the above problems, we designed and tested a new HSI classification
model (ESFNet). The innovations of our model can be divided into two parts:
(1) We optimize the SeKG module [29]; the optimized module is termed FsSE. In order to better process and utilize
the spectral information while preserving the continuity between spectra as much as
possible, we reduce the convolution of multiple scales in the SeKG module to two
scales and set the scaling parameter in the excitation layer to 1. These two optimiza-
tions allow the module to extract correlations between spectra more efficiently while
retaining maximum spectral continuity, so that the classification model can better
learn the spectral features.
(2) We propose a new network named the spectral stride fusion network (SSFCNN). The
new network implements the fusion of different strides by taking advantage of the fact
that 3D CNN can slide in the spectral dimension. This structure not only enhances the
learning ability of the model regarding spectral features, but also solves the problem
of redundant spectra.
Our model effectively solves the problem of integrating the two main lines mentioned
above. On the one hand, the usefulness of the FsSE module cannot be realized if the
enhanced information of this module is not effectively utilized. On the other hand, without
the support of enhanced features, the advantages of SSFCNN cannot be better demonstrated.
Therefore, the two parts are complementary and indispensable, which greatly enhances
the model’s ability to learn spectral characteristics. A series of experiments shows that
our proposed ESFNet is effective, and its overall accuracy is better than that of other
classification models.
The rest of the paper is organized as follows. Section 2 introduces the FsSE module
and SSFCNN. Section 3 presents the datasets used for the experiments, the experimental
environment and the training and test sets. Section 4 focuses on the relevant experimental
analysis. Finally, conclusions and discussions are summarized in Section 5.
The input of the model is the hyperspectral data $X \in \mathbb{R}^{h \times w \times c}$, where h and w are the length and width of the input data, respectively, and c is the number of spectral bands. After global average pooling, a one-dimensional spectral channel vector $X_c = \{X_1, X_2, \ldots, X_c\}$ can be obtained. The formula can be expressed as:

$$X_l = \frac{\sum_{i=1}^{h} \sum_{j=1}^{w} X_l(i,j)}{h \times w}, \quad l = 1, 2, 3, \ldots, c \qquad (1)$$
After obtaining the one-dimensional spectral channel vector, we use multi-scale convolution to weight the spectral characteristics and enhance the correlation between the spectra. We only set up two convolution kernels of different scales, $K_s = \{K_{s1}, K_{s2}\}$. The layer is a 1D convolution, and the size of the convolution kernel is $1 \times 1 \times c_k$, $c_k \in \{3, 5, 7, \ldots\}$. The size of the convolution kernel can be adjusted according to the experiment. The convolution kernel slides in the direction of the spectral dimension, and the stride length is 1. The value generated by the convolution represents the correlation between the spectra at that scale. The size after convolution is ensured to be the same as the original size by zero-padding. Finally, we use the ReLU function to ensure that the channel correlation is positive. The specific calculation formula is as follows:
$$Y_l = \sum_{i=0}^{c_k - 1} X_{l+i} \cdot K_s^{(i+1)} + b, \quad l = 1, 2, 3, \ldots, c; \qquad X_{l+i} = 0 \ \text{for} \ l+i > c \qquad (2)$$
where $Y_c = \{Y_1, Y_2, \ldots, Y_c\}$ represents the result of the convolution at one scale, and the size of the output is still $1 \times 1 \times c$; since we only use two scales, two such output vectors are obtained (the $Y_1$ and $Y_2$ used below). b represents the bias value.
In order to obtain a wealth of spectral information, we fused the results of the obtained
spectral correlations at different convolution scales by channel. This can be expressed in a
formula as:
Fs = Xc ⊕ Y1 ⊕ Y2 (3)
Fs represents the spectral features after fusion. Xc represents the original one-dimensional
spectral channel vector. Y1 and Y2 represent the results obtained at two convolution scales.
⊕ represents the summation of these three vectors. Each channel of the fused feature
contains the original spectral information and the related features of the adjacent spectrum,
which can better generate channel weights that match the HSI.
In order to obtain the mask of the spectral channels, we need to input the fused results
into an excitation module consisting of two fully connected layers. In the SE module and
SeKG module, in order to reduce the amount of computation, the first fully connected
layer usually reduces the dimensionality of the data to c/r (c represents the number of
channels, while r represents the scaling parameter). The second fully connected layer
restores the dimensions to the original dimensions. This processing is a good choice for
ordinary images. However, for HSIs, the wealth of spectral information is the biggest
characteristic. In addition, we further enriched the spectral information by multi-scale
fusion. The method to reduce the dimension before restoring it will undoubtedly cause
part of this rich information to be lost. Therefore, we used a fully connected layer with the
same number of nodes in both layers (i.e., the scaling parameter is set to 1), which not only
reduces the effectiveness of this module but also preserves the spectral information. The
formula for calculating the channel mask can be expressed as
In Formula (4), L represents the sigmoid function. ∂ represents the ReLU function.
Li represents the fully connected layer. After obtaining the final channel weights M, the
weighted result is obtained by multiplying M with the two-dimensional matrix input to the
module through the scale operation.
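To make the data flow of the FsSE module concrete, the following is a minimal PyTorch sketch assembled from Equations (1)-(4): global average pooling, two zero-padded 1D convolutions over the spectral axis, fusion by summation, a two-layer excitation with the scaling parameter set to 1, and the final scale operation. The class name, the tensor layout and the default kernel sizes (5 and 7, following the combination found best in Section 4.2.1) are illustrative assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn as nn

class FsSE(nn.Module):
    """Sketch of the multi-scale fused spectral attention (FsSE) module.

    Assumes input of shape (batch, channels=c, height, width), where the
    channel axis holds the c spectral bands.
    """
    def __init__(self, num_bands: int, k1: int = 5, k2: int = 7):
        super().__init__()
        # Two 1D convolutions over the spectral dimension (Equation (2)),
        # zero-padded so the output length stays c.
        self.conv1 = nn.Conv1d(1, 1, kernel_size=k1, padding=k1 // 2, bias=True)
        self.conv2 = nn.Conv1d(1, 1, kernel_size=k2, padding=k2 // 2, bias=True)
        self.relu = nn.ReLU()
        # Excitation with scaling parameter r = 1: both layers keep c nodes.
        self.fc1 = nn.Linear(num_bands, num_bands)
        self.fc2 = nn.Linear(num_bands, num_bands)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Equation (1): global average pooling -> spectral channel vector X_c.
        xc = x.mean(dim=(2, 3))                                   # (b, c)
        # Equation (2): spectral correlations at two scales, kept positive by ReLU.
        y1 = self.relu(self.conv1(xc.unsqueeze(1))).squeeze(1)    # (b, c)
        y2 = self.relu(self.conv2(xc.unsqueeze(1))).squeeze(1)    # (b, c)
        # Equation (3): fuse the original vector with both correlation vectors.
        fs = xc + y1 + y2
        # Equation (4): two fully connected layers produce the channel mask M.
        m = self.sigmoid(self.fc2(self.relu(self.fc1(fs))))       # (b, c)
        # Scale operation: reweight each spectral band of the input.
        return x * m.view(b, c, 1, 1)

# Example: 200-band Indian Pines patches of size 11 x 11.
fsse = FsSE(num_bands=200)
weighted = fsse(torch.randn(4, 200, 11, 11))   # same shape as the input
```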
Figure 2. The difference between 3D convolution and 2D convolution. (a) Schematic diagram of 2D convolution; (b) Schematic diagram of 3D convolution.
In 3D CNN, the calculation formula of the output value $O_{xyz}$ of the neuron node (x, y, z) is as follows:

$$O_{xyz} = \sum_{i=0}^{K_w-1} \sum_{j=0}^{K_h-1} \sum_{m=0}^{K_c-1} I_{(x+i)(y+j)(z+m)} \cdot K_{(i+1)(j+1)(m+1)} + b \qquad (5)$$

In Formula (5), $K_w$, $K_h$ and $K_c$ represent the width, height and number of channels of the kernel, respectively. $I_{xyz}$ is the input, and b is the bias value.
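As a small, self-contained illustration (not code from the paper), Formula (5) is exactly the computation performed by a standard 3D convolution. In the PyTorch sketch below, placing the spectral bands on the depth axis is an assumption of this sketch, and the first entry of the stride tuple acts as the spectral stride used later in SSFCNN.

```python
import torch
import torch.nn as nn

# A single 3D convolution over an HSI patch, mirroring Formula (5).
# Assumed input layout: (batch, 1, bands, height, width).
conv = nn.Conv3d(
    in_channels=1,
    out_channels=8,
    kernel_size=(7, 3, 3),   # (K_c, K_h, K_w): 7 bands x 3 x 3 spatial window
    stride=(3, 1, 1),        # spectral stride 3, spatial stride 1
    bias=True,
)

patch = torch.randn(4, 1, 200, 11, 11)   # e.g. 200-band 11 x 11 patches
out = conv(patch)
print(out.shape)   # torch.Size([4, 8, 65, 9, 9])
```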
2.2.2. SSFCNN
HSIs are rich in spectral information. In order to make better use of this important characteristic, we designed a 3D CNN model based on spectral feature fusion named the spectral stride fusion network (SSFCNN). The structure we designed can both ensure that enough spectral information is collected and make the model learn more abundant features by fusing different spectral information. The reason why we want to emphasize the learning ability of the model for spectral features is that in an actual HSI, there are certain similarities between the spectra of different ground objects. Taking the Indian Pines dataset as an example, we plotted the spectral curves of these 16 types of samples. From Figure 3, we can see that the trend of the spectral curves of these 16 types of samples is basically the same and has strong continuity. This requires the model to have a stronger spectral learning capability. Therefore, our original intention of designing SSFCNN is to solve this problem.
Figure 3. The spectral curves of 16 types of samples.

Compared with the traditional HSI classification model using a convolutional neural network, a 3D convolutional neural network (3D CNN) can classify images relatively quickly without manual dimensionality reduction. The difference between the method in this paper and the general 3D CNN model is that the structure we designed allows the model to extract spectral features under different spectral strides and then fuse those features. This structure allows for both dimensionality reduction and for the network to learn different spectral features, which can better guide the model to classify targets. Figure 4 shows the network structure of our model.
Taking the Indian Pines dataset as an example, through Figure 4, we aimed to fuse the results of two different spectral strides. The values of stride for each layer are (1, 3) and (1, 5), respectively. The results of the two different strides are concatenated by the concat operation. We use different spectral sampling strides for concatenation because the spectral features extracted at different strides are not the same. With a small stride, the model can extract more spectral features, but it also extracts some redundant information; with a large stride, the model extracts less redundant information, but the wealth of the spectral features is likewise reduced. As a result, we planned to combine the results at different stages so that they could complement each other. With the extraction of the two strides, the model can learn more abundant spectral information. The first two layers of the model are designed according to this idea. In Layer3, in order to facilitate the final model output, the fusion strategy is no longer used. However, the size of the feature map arriving at Layer3 will be different due to the size of the patch. When the patch size is less than 9, the size of the feature map reaching that layer is already a one-dimensional vector, and there is no need to downsample the feature map. Therefore, we set a discriminator to judge the feature maps input to Layer3. Finally, the final output is calculated through the fully connected layer.
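The stride-fusion idea described above can be sketched as a single block in PyTorch. This is a minimal illustration under stated assumptions: the tensor layout, kernel sizes, output channels and the way the two branches are aligned before concatenation are choices made for the sketch, not the authors' exact SSFCNN architecture.

```python
import torch
import torch.nn as nn

class StrideFusionBlock(nn.Module):
    """One fusion layer: two 3D convolutions with different spectral strides,
    concatenated along the feature dimension (the 'concat' operation above).

    Assumed input layout: (batch, features, bands, height, width).
    """
    def __init__(self, in_ch: int, out_ch: int, spectral_strides=(1, 3)):
        super().__init__()
        s1, s2 = spectral_strides
        self.branch1 = nn.Conv3d(in_ch, out_ch, kernel_size=(7, 3, 3),
                                 stride=(s1, 1, 1), padding=(0, 1, 1))
        self.branch2 = nn.Conv3d(in_ch, out_ch, kernel_size=(7, 3, 3),
                                 stride=(s2, 1, 1), padding=(0, 1, 1))
        self.relu = nn.ReLU()

    def forward(self, x):
        f1 = self.relu(self.branch1(x))   # small stride: denser spectral features
        f2 = self.relu(self.branch2(x))   # large stride: less redundancy
        # The two branches produce different spectral lengths, so they are
        # cropped to a common length here (a simplifying assumption of this sketch).
        d = min(f1.shape[2], f2.shape[2])
        return torch.cat([f1[:, :, :d], f2[:, :, :d]], dim=1)

# Example with the stride pair (1, 3) mentioned above for Layer1:
block = StrideFusionBlock(in_ch=1, out_ch=8, spectral_strides=(1, 3))
fused = block(torch.randn(2, 1, 200, 11, 11))
print(fused.shape)   # torch.Size([2, 16, 65, 11, 11])
```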
3. Experimental Setting

3.1. Dataset
Two public datasets, the Indian Pines dataset and the Pavia University dataset, were used in this experiment. The Indian Pines dataset was collected by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor over Northwest Indiana, United States, in 1992. This dataset consists of 145 × 145 pixels with a spatial resolution of 20 m. There are 220 continuous bands in the wavelength range of 400~2500 nm, with 20 water absorption and low signal-to-noise ratio bands (104~108, 150~163, 220) removed. The ground truth includes 16 types of samples, most of which are crops at different growth stages. The spectral features of these 16 types of samples are relatively similar, and the image resolution is low, which can easily produce mixed pixels, thus causing some difficulties in image classification. Figure 5 shows the pseudo-color image and the ground truth, respectively.
Figure 5. Pseudo-color image (R: 50, G: 30, B: 20) and ground truth of Indian Pines dataset.
Figure 6 shows the Pavia University dataset. This dataset was acquired in 2002 using ROSIS sensors over Pavia, Italy. It includes nine types of samples, such as roads, meadows and roofs. The image consists of 610 × 340 pixels with a spatial resolution of 1.3 m. There are 115 bands in the wavelength range of 430~860 nm, of which 103 bands are reserved for testing after removing 12 bands with strong noise and water absorption.
Figure 6. Pseudo-color image (R: 60, G: 30, B: 2) and ground truth of Pavia University dataset.

3.2. Running Environment
The processor used for the experiments is an i7-10750H from Intel with a main frequency of 2.60 GHz. The graphics card used for the experiments is an RTX2060 from NVIDIA with 6 GB of video memory. The experimental device has 16 GB of memory. The system used is Windows 10. The deep learning framework used is Pytorch.

3.3. Dataset Processing
In this paper, a fully supervised learning approach is used, and the dataset is divided into the training set and the test set. Figure 7 shows the training set and test set of two datasets.
Figure 8. Comparison results of the four models on two datasets. (a) Accuracies of the four models on the Indian Pines dataset; (b) Accuracies of the four models on the Pavia University dataset.
In Figure 8, we can clearly see that the effect of ESFNet is better than the other three groups of comparison experimental models. We know that the FsSE module enhances the information contained in each band of the hyperspectral image (HSI), but the global averaging pooling in the FsSE module obtains a global receptive field and does not take into account the spatial information. SSFCNN has the capability of spatial learning while enhancing the learning ability of spectra, but if the classification model is allowed to learn the original spectral bands directly, the effective band information cannot be extracted efficiently. In the results shown in Figure 8, the accuracy only has a little difference when either method is used alone. However, after combining FsSE with SSFCNN, the accuracy of prediction can be significantly improved. The reason why the combined model can improve the accuracy is that it can both ensure the continuity of the HSI spectra and enhance the spectral information of the HSI while allowing the enhanced information to be better utilized in the classification model. Therefore, our idea of designing this new classification model for HSIs is effective.

4.2. Parameter Sensitivity Analysis
Some hyperparameter settings in the ESFNet module can have an impact on the effect of the model. After analysis, we mainly analyze the performance of the model from three aspects: the size of the convolution kernel in the FsSE module, the combination of strides in SSFCNN and the patch size of the input. We set the batch size to 16, used the RMSprop algorithm as the optimizer of the loss function and set the epoch of all models to 200. The test set is evaluated by selecting the model with the highest detection accuracy on the validation set, and finally the best choice of these three components is used as the final choice of the model, which is compared with other models in Section 4.3. It should be noted that the other parameters of the model are the same when we analyze a particular parameter.
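For reference, the training protocol described in this paragraph (batch size 16 via the data loader, RMSprop optimizer, 200 epochs, and keeping the weights with the highest validation accuracy) can be written as the following sketch; the learning rate and the use of cross-entropy loss are assumptions, since they are not stated here.

```python
import torch
from torch import nn, optim

def train(model, train_loader, val_loader, device="cuda"):
    """Sketch of the training protocol: RMSprop, 200 epochs, model selection
    by highest validation accuracy. The batch size of 16 is assumed to be set
    in the DataLoaders; the learning rate below is an assumption."""
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.RMSprop(model.parameters(), lr=1e-3)  # lr assumed
    best_acc, best_state = 0.0, None
    for epoch in range(200):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        # Validation pass: keep the weights with the highest accuracy.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in val_loader:
                x, y = x.to(device), y.to(device)
                pred = model(x).argmax(dim=1)
                correct += (pred == y).sum().item()
                total += y.numel()
        acc = correct / max(total, 1)
        if acc > best_acc:
            best_acc = acc
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
    if best_state is not None:
        model.load_state_dict(best_state)
    return model, best_acc
```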
4.2.1. Impact of Convolution Kernel Size of the FsSE Module on Model Accuracy
We will introduce the advantages of this module in Section 3. However, due to
the different settings of the convolution kernel size, the extracted spectral correlations
are different. Common convolution kernels are typically of size 3, 5 or 7. Therefore,
we conducted three sets of comparison experiments for both datasets. Figure 9 shows
the results.
Figure 9. Accuracy of different combinations of convolution kernels in the FsSE module for both datasets. (a) The result of the Indian Pines dataset; (b) The result of the Pavia University dataset.

From Figure 9, it can be seen that for extracting the correlation between the spectra, a high accuracy has been achieved by using the convolution of two scales. Because of the strong continuity existing between spectra, the use of two convolution kernels with little difference in size is enough to ensure that the model can extract sufficient spectral correlation. Therefore, our optimization of the SeKG module is effective. The best combination of convolution kernels for the Indian Pines dataset is 1 × 1 × 5 and 1 × 1 × 7 because the model has the highest accuracy with this combination. For the Pavia University dataset, the accuracy of the convolution kernel combinations 1 × 1 × 3 & 1 × 1 × 7 and 1 × 1 × 5 & 1 × 1 × 7 is the same. Thus, we need to analyze these three combinations from other perspectives. Table 1 shows the training time and the number of parameters for the three combinations on the Pavia University dataset.

Table 1. Training time and number of parameters required for three different combinations of convolutional kernel sizes on the Pavia University dataset.

Model   Training Time/s   Total Params
3&5     1570              249,816
3&7     1455              249,818
5&7     1160              249,820

From Table 1, it is obvious that the combination 5&7 takes the least time to train among the three combinations. Although the number of parameters is the largest among the three, the number of extra parameters is very small. Therefore, in the case that the two combinations of 3&7 and 5&7 have the same accuracy for the Pavia University dataset, the less time-consuming combination of 1 × 1 × 5 and 1 × 1 × 7 is chosen.
4.2.2. Impact of Patch Size on Model Accuracy
Because the image in the public hyperspectral dataset is only one piece, if the whole image is input into the network for training, it is not only disadvantageous for the network training, but also the amount of data is far from enough. Therefore, we need to sample the image and send the sampled part into the network for training, which can both reduce the training time of the model and increase the training volume of the model. Taking the Indian Pines dataset as an example, the size of the original image is 145 × 145 × 200, and we select a block of M × M × 200 pixels to input into the model for training. The choice of patch size, however, can have a significant impact on model accuracy and training time as well. If the selected size is too small, the model will not be trained properly; if the selected size is too large, the model training time will increase. Therefore, in order to select the best patch size, we chose seven different sizes of sampling windows for comparison. Figure 10 shows the accuracy for the two datasets when faced with different patch sizes.
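A simple way to realize the sampling described above (cutting M × M × 200 blocks out of the single labeled cube) is sketched below; centering each patch on a labeled pixel and mirror-padding the image borders are assumptions of this sketch, not details given in the paper.

```python
import numpy as np

def extract_patches(cube: np.ndarray, gt: np.ndarray, patch_size: int):
    """Cut M x M x bands blocks out of a single HSI cube (e.g. 145 x 145 x 200
    for Indian Pines). Each patch is centered on a labeled pixel; the borders
    are mirror-padded so edge pixels also get full-size patches."""
    m = patch_size // 2
    padded = np.pad(cube, ((m, m), (m, m), (0, 0)), mode="reflect")
    patches, labels = [], []
    for r in range(cube.shape[0]):
        for c in range(cube.shape[1]):
            if gt[r, c] == 0:          # 0 = unlabeled background
                continue
            patches.append(padded[r:r + patch_size, c:c + patch_size, :])
            labels.append(gt[r, c] - 1)
    return np.stack(patches), np.array(labels)

# Example with the patch size found best for Indian Pines in Figure 10a:
# X, y = extract_patches(indian_pines_cube, indian_pines_gt, patch_size=11)
```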
Figure 10. Accuracy of different patch sizes for two datasets. (a) Results for different patch sizes for the Indian Pines dataset. (b) Results for different patch sizes for the Pavia University dataset.

From Figure 10a, we can observe that the model has the highest accuracy in classifying the Indian Pines dataset when the size of the sampling is increased to 11, and then the accuracy decreases as the patch size increases. From Figure 10b, we can see that the accuracy of the model is highest when the patch size is 5. After that, the accuracy of the model on the Pavia University dataset decreases when the patch size continues to increase. Also, we know that the patch size not only affects the accuracy of the model, but also has an impact on the training time of the model. Table 2 shows the training time of the model with seven patch sizes.

Table 2. Training time for seven patch sizes for the two datasets.

Dataset: Indian Pines              Dataset: Pavia University
Patch Size   Training Time/s       Patch Size   Training Time/s
5            208                   5            703
7            293                   7            962
9            284                   9            865
11           620                   11           1160
13           688                   13           1779
15           1073                  15           1887
17           1021                  17           2223

Table 2 clearly shows the relationship between the training time and the patch size. With increasing size, the training time increases as well. For the Pavia University dataset, a patch size of 5 gives the best results and takes the least amount of time to train. For the Indian Pines dataset, although the training time is at its minimum when the patch size is set to 5, the accuracy is 7.296% lower than when the size is 11. Therefore, for the Indian Pines dataset, a patch size setting of 11 is optimal.
4.2.3. Impact of Stride Combinations on Model Accuracy
We already know that there is a certain degree of redundant information in the spectra of an HSI, which is the theoretical basis for the use of descending dimension methods such as PCA in numerous studies of HSI classification. In the same way, even if we weight the spectral information by the set FsSE module, the redundant information still exists, which requires us to find ways to make the 3D CNN model learn as much effective information as possible. Therefore, we designed SSFCNN to solve this problem. However, with different spectral strides, the extracted spectral features are different. To study the effect of this part on the model, we set up multiple combinations. Tables 3 and 4 show the accuracy and training time of these combinations on the two datasets.
Table 3. Accuracy and training time for different stride combinations on the Indian Pines dataset.
Combination of Strides   Training Time/s   Overall Accuracy/%
1_3&1_3                  585               88.585
1_3&1_5                  620               90.125
1_3&1_7                  506               88.770
1_5&1_3                  895               88.737
1_5&1_5                  410               88.379
1_5&1_7                  736               87.902
1_7&1_3                  617               88.444
1_7&1_5                  464               87.458
1_7&1_7                  474               88.466

Table 4. Accuracy and training time for different stride combinations on the Pavia University dataset.

Combination of Strides   Training Time/s   Overall Accuracy/%
1_3&1_3                  966               95.343
1_3&1_5                  703               95.558
1_3&1_7                  991               95.979
1_5&1_3                  819               95.660
1_5&1_5                  771               95.701
1_5&1_7                  1028              95.984
1_7&1_3                  769               95.522
1_7&1_5                  754               96.044
1_7&1_7                  897               95.561

From Table 3, the combination of Layer1 and Layer2 of the model is 1_3 and 1_5. The model has the best classification effect on the Indian Pines dataset, and the accuracy rate is generally 1%–2% higher compared with other combinations. Although the training time is a bit higher than for other combinations, it is still within an acceptable range. The accuracy on the Pavia University dataset can be analyzed by Table 4. It can be seen that the accuracies of different combinations are close. Figure 11 shows the accuracies more visually.

Figure 11. Accuracy of different stride combinations on the Pavia University dataset.

From the training time, when the combination is 1_7 and 1_5, the training time is the second lowest among all combinations, and the accuracy is also the highest. However, due to the large amount of data in the Pavia University dataset, the training time of the model is increased compared to the Indian Pines dataset. In summary, we use the combination 1_3&1_5 for the Indian Pines dataset and the combination 1_7&1_5 for the Pavia University dataset.
4.3. Comparison with Other Baselines

4.3.1. Baseline
In order to verify the advantages of the models in this paper, we have selected some mainstream models in the field of HSI classification for comparison. The implementation details of the comparison models are as follows:
(1) SVM: The SVM model in this paper used the radial basis function (RBF kernel), which
classifies by raw spectral features. We implemented the model using the SVM function
in the Sklearn module (a minimal configuration sketch is given after this list).
(2) ANN: The original spectral features are classified by an artificial neural network
(ANN), which contains four fully connected layers and a dropout layer, and was
trained with a learning rate of 0.0001 using the Adam algorithm.
(3) 1D CNN: We used the same 1D CNN structure as in [24], Pytorch to implement the
model and the stochastic gradient descent algorithm to train the model with a learning
rate of 0.01.
(4) 3D CNN: A structure proposed in [40] was used for the 3D CNN model, which is a
conventional structure consisting of three convolution-pooling layers and one fully
connected layer. The model was implemented in Pytorch and trained with a learning
rate of 0.003 using the stochastic gradient descent algorithm.
(5) Hamida (3D CNN + 1D classifier) [47]: We implemented the model in Pytorch, where
we extracted a 5 × 5 × 200 cube from the image as an input to the model. The
characteristic of the model is that it utilizes one-dimensional convolution instead of
the usual pooling method and finally utilizes one-dimensional convolution instead of
a fully connected layer. The model was trained with a learning rate of 0.01 using the
stochastic gradient descent algorithm.
(6) HybridSN: The model used the specific structure proposed in [48], and the model was
implemented in Pytorch. The patch size is 25 × 25. The model contains a total of four
convolutional layers and two fully connected layers, where the four convolutional
layers include three 3D convolutional layers and one 2D convolutional layer, with the
3D convolutional layer for learning spatial-spectral features and the 2D convolutional
layer for learning spatial features.
(7) RNN: We used an RNN model for HSI classification, which is similar to [31]. We
replaced the activation function with a tanh function and implemented the model
in Pytorch.
(8) SpectralFormer (SF): We implemented the model directly using the model code pro-
vided in [49]. The model is an improvement of Transformer with the addition of two
new modules, GSE and CAF, in order to improve the detail-capturing capacity of
subtle spectral discrepancies and enhance the information transitivity between layers,
respectively. We implemented it in Pytorch.
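As an example of how baseline (1) can be reproduced, the following sketch builds an RBF-kernel SVM on raw per-pixel spectra with scikit-learn; the placeholder data, the added feature standardization and the C/gamma values are assumptions, since the paper only states that the RBF kernel and the Sklearn SVM function were used.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder arrays standing in for raw per-pixel spectra (n_pixels, n_bands)
# and their class labels; in practice these come from the labeled HSI pixels.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(1000, 200)), rng.integers(0, 16, size=1000)
X_test, y_test = rng.normal(size=(200, 200)), rng.integers(0, 16, size=200)

# RBF-kernel SVM on raw spectral features; C and gamma are assumed values.
svm_baseline = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=100, gamma="scale"))
svm_baseline.fit(X_train, y_train)
print("Overall accuracy:", svm_baseline.score(X_test, y_test))
```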
Class Name [F1 Scores (%)]   SVM   RNN   ANN   1D CNN   SF   3D CNN   Hamida   HybridSN   ESFNet
1. Alfalfa 36.1 8.7 82.1 0.0 10.9 95.3 74.2 100.0 75.8
2. Corn-notill 74.8 63.3 79.3 52.8 69.2 90.2 89.4 93.7 93.5
3. Corn-mintill 72.6 44.9 70.4 31.1 68.7 54.5 77.9 59.2 85.6
4. Corn 64.4 44.1 71.0 2.7 64.5 55.4 76.3 46.2 88.3
5. Grass-pasture 86.5 75.6 91.0 8.6 84.0 70.2 92.5 73.6 93.0
6. Grass-trees 93.8 89.3 93.0 76.1 91.6 97.2 98.3 99.5 97.2
7. Grass-pasture-mowed 85.7 60.0 95.8 0.0 86.3 93.6 35.3 83.7 93.6
8. Hay-windrowed 94.7 91.7 97.6 87.3 93.3 67.7 96.8 74.1 95.8
9. Oats 52.6 0.0 71.4 0.0 66.7 80.0 86.5 94.7 50.0
10. Soybean-notill 73.2 56.1 77.4 34.7 71.9 83.9 87.6 87.0 91.4
11. Soybean-mintill 80.4 67.0 82.6 66.6 78.9 90.4 90.3 92.2 94.1
12. Soybean-clean 82.1 57.2 74.8 15.8 68.6 75.0 81.0 79.2 88.8
13. Wheat 93.7 90.2 96.2 81.9 95.7 100.0 99.2 97.6 99.5
14. Woods 91.8 90.2 94.5 82.2 91.5 82.5 95.5 85.0 97.9
15. B-G-T-D 62.8 56.1 69.5 12.9 53.5 37.5 76.5 39.5 68.5
16. Stone-Steel-Towers 91.0 81.0 86.7 90.3 90.3 74.1 97.6 91.1 97.6
OA(%) 81.0 68.9 83.2 59.6 78.3 72.9 88.5 76.3 90.1
Kappa × 100 0.783 0.645 0.808 0.522 0.751 0.698 0.869 0.735 0.888
Class Name [F1 Scores (%)]   SVM   RNN   ANN   1D CNN   SF   3D CNN   Hamida   HybridSN   ESFNet
1. Asphalt 91.5 90.5 95.8 90.2 92.9 90.8 97.2 93.5 97.9
2. Meadows 95.1 95.3 97.1 91.2 94.3 83.9 95.5 86.8 96.9
3. Gravel 79.3 73.3 85.8 56.3 73.1 83.4 93.0 87.2 93.4
4. Trees 92.8 93.4 96.3 90.5 92.7 93.6 96.7 94.6 98.4
5. Painted metal sheets 99.2 99.5 99.6 99.1 99.4 100.0 100.0 99.8 100.0
6. Bare Soil 84.5 88.8 93.4 70.3 83.0 94.5 94.6 100.0 99.6
7. Bitumen 71.2 73.5 91.8 80.1 84.7 95.4 95.2 100.0 96.6
8. Self-Blocking Bricks 85.8 80.6 88.5 81.6 78.8 98.2 95.5 98.8 96.4
9. Shadows 99.9 99.6 99.6 99.9 99.9 97.9 99.9 98.8 100.0
OA(%) 91.2 90.5 94.7 86.3 89.9 83.2 94.5 86.2 96.1
Kappa × 100 0.882 0.875 0.930 0.816 0.866 0.792 0.927 0.828 0.948
In Figure 12, the spectral curves of the classes Oats and Grass_trees are very close
to each other. In the band range of 50–100, the spectral curves of these two classes of
features almost overlap. Reflecting on the specific classification effect, our model classified
50% of the class Oats into the class Grass_trees, which can be seen in Figure 16j shown
later. The reason for this is that the model in this paper learns some spatial features while
enhancing the spectral learning ability. However, as we do not emphasize the learning
of spatial features, coupled with the very small number of class Oats in the training set,
the final features of Oats learned by our model are closer to those of Grass_trees, which
led to misclassification. In contrast, HybridSN was designed with a convolutional layer
dedicated to extracting spatial features. Therefore, the classification of such samples has
some advantages. ESFNet, however, has enhanced its ability to learn spectral features,
enabling it to gain an advantage in the classification of most categories. The reason is
that these two fusion operations can effectively extract the effective features of the sample
spectrum so that the model can be trained to achieve better results.
Although our model has average results on very few categories, it can stay ahead in most of the categories, which means that by our design, we can make our model learn enough features in most categories and make the model learn more complex spectral features by fusing the results of different strides, finally obtaining excellent classification results. For hyperspectral image classification, we are more concerned with performance in overall accuracy and performance in most categories, and our method is ahead of the other methods.
Figure 13 shows the training loss and validation accuracy of seven deep learning models on the Indian Pines dataset and the Pavia University dataset. Through these graphs, we can see that the 1D CNN model converges the slowest, our model converges the fastest on the Indian Pines dataset, and HybridSN converges the fastest on the Pavia University dataset. The validation accuracy of HybridSN is the highest of all the models, but when combined with the final test accuracy, it shows a certain degree of overfitting. There are two reasons for this situation: one is that the network layers of HybridSN are deeper compared to other networks, and the other is that the number of training samples is smaller. Although HybridSN is able to extract both spatial and spectral features, the smaller number of training samples makes the model not learn sufficiently, while the deeper network layers aggravate this problem. HybridSN had faster convergence and higher validation accuracy, but the model was still overfitted due to the two problems mentioned above. Our model performs well in terms of training loss and validation accuracy, and its convergence speed is also fast. Combining the validation accuracy and testing accuracy, our model does not have a serious problem of overfitting and has good generalization performance. In order to better show the differences between models, we plotted the confusion matrix of the nine models on two types of datasets as a significance test. The results are shown in Figures 14 and 15.

Figure 16. The ground truth and classification maps of the nine models on the Indian Pines dataset. (a) The ground truth. (b) SVM. (c) RNN. (d) ANN. (e) 1D CNN. (f) SF. (g) 3D CNN. (h) Hamida. (i) HybridSN. (j) ESFNet.
To determine the significant differences between the models, we used the Friedman
test [50] for statistical significance. We compared the significance of the models among the
categories in the two datasets.
In the Friedman test, we used the chi-square distribution to approximate the Friedman
test statistic. We calculated the ranking of the models in the above experiments in terms
of F1 scores in each category of the datasets. The results are shown in Tables 7 and 8. We
assume that there is no difference between the models, and thus the $R_j^2$ should be equal. Based on the following equation and the data in Tables 7 and 8, the value of the Friedman test statistic can be calculated.

$$X_{r,1}^2 = \frac{12}{n_1 k(k+1)} \sum_{j=1}^{k} R_j^2 - 3 n_1 (k+1) = 71.825, \quad \text{dataset: Indian Pines}$$
$$X_{r,2}^2 = \frac{12}{n_2 k(k+1)} \sum_{j=1}^{k} R_j^2 - 3 n_2 (k+1) = 39.489, \quad \text{dataset: Pavia University} \qquad (6)$$

Figure 17. The ground truth and classification maps of the nine models on the Pavia University dataset. (a) The ground truth. (b) SVM. (c) RNN. (d) ANN. (e) 1D CNN. (f) SF. (g) 3D CNN. (h) Hamida. (i) HybridSN. (j) ESFNet.

In Equation (6), $n_i$ is the number of categories in the ith dataset, and $R_j$ indicates the total rank of the jth model in Tables 7 and 8.
Class Name SVM RNN ANN 1D CNN SF 3D CNN Hamida HybridSN ESFNet
1. Alfalfa 6 8 3 9 7 2 5 1 4
2. Corn-notill 6 8 5 9 7 3 4 1 2
3. Corn-mintill 3 8 4 9 5 7 2 6 1
4. Corn 5 8 3 9 4 6 2 7 1
5. Grass-pasture 4 6 3 9 5 8 2 7 1
6. Grass-trees 5 8 6 9 7 3.5 2 1 3.5
7. Grass-pasture-mowed 5 7 1 9 4 2.5 8 6 2.5
8. Hay-windrowed 4 6 1 7 5 9 2 8 3
9. Oats 6 8.5 4 8.5 5 3 2 1 7
10. Soybean-notill 6 8 5 9 7 4 2 3 1
11. Soybean-mintill 6 8 5 9 7 3 4 2 1
12. Soybean-clean 2 8 6 9 7 5 3 4 1
13. Wheat 7 8 5 9 6 1 3 4 2
14. Woods 4 6 3 9 5 8 2 7 1
15. B-G-T-D 4 5 2 9 6 8 1 7 3
16. Stone-Steel-Towers 4 8 7 5.5 5.5 9 1.5 3 1.5
Total Rank 77 118.5 63 138 92.5 82 45.5 68 35.5
Table 8. Rankings of the nine models by F1 score in each category of the Pavia University dataset.
Class Name SVM RNN ANN 1D CNN SF 3D CNN Hamida HybridSN ESFNet
1. Asphalt 6 8 3 9 5 7 2 4 1
2. Meadows 5 4 1 7 6 9 3 8 2
3. Gravel 6 7 4 9 8 5 2 3 1
4. Trees 7 6 3 9 8 5 2 4 1
5. Painted metal sheets 8 6 5 9 7 2 2 4 2
6. Bare Soil 7 6 5 9 8 4 3 1 2
7. Bitumen 9 8 5 7 6 3 4 1 2
8. Self-Blocking Bricks 6 8 5 7 9 2 4 1 3
9. Shadows 3.5 6.5 6.5 3.5 3.5 9 3.5 8 1
Total Rank 57.5 59.5 37.5 69.5 60.5 46 25.5 34 15
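As a quick numerical check of Equation (6), the statistic can be recomputed directly from the "Total Rank" rows of Tables 7 and 8 (k = 9 models; n1 = 16 categories for Indian Pines and n2 = 9 for Pavia University). The short Python sketch below is illustrative only; the function and variable names are ours and are not part of the paper's code.

def friedman_statistic(rank_sums, n):
    # Friedman chi-square statistic from the per-model rank sums over n categories,
    # following Equation (6).
    k = len(rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

indian_pines = [77, 118.5, 63, 138, 92.5, 82, 45.5, 68, 35.5]        # Total Rank, Table 7
pavia_university = [57.5, 59.5, 37.5, 69.5, 60.5, 46, 25.5, 34, 15]  # Total Rank, Table 8

print(round(friedman_statistic(indian_pines, n=16), 3))       # 71.825
print(round(friedman_statistic(pavia_university, n=9), 3))    # 39.489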
5. Conclusions
In this paper, we proposed a new enhanced spectral fusion network (ESFNet) for
hyperspectral image classification. The new model can improve the classification accuracy
of hyperspectral images by targeted learning based on the characteristics of hyperspectral
images. Firstly, we optimized the SeKG module and termed the optimized module the FsSE
module. The FsSE module is designed to enhance the spectral information of hyperspectral
images and to maximally preserve the spectral continuity. Secondly, to enable the classification model to learn as much effective spectral information as possible, we designed the SSFCNN model, which fuses features extracted with different spectral strides. Different strides filter out redundant features to different degrees, and fusing the resulting feature maps, which carry different levels of learning, allows the outputs of the different strides to complement each other. In addition, because there are currently few 3D CNN networks with complex structures, we hope that our proposed SSFCNN can provide ideas for designing more complex 3D CNN networks in the future. In the experiments of this paper, we used two public hyperspectral datasets. Through a series of experiments, we demonstrated that the proposed ESFNet significantly improves classification performance by enhancing the model's ability to learn spectral features. In future work, we will explore better feature fusion methods and further improve the classification accuracy for hyperspectral images.
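To make the two components summarized above more concrete, the short PyTorch sketch below illustrates the general idea of spectral re-weighting followed by 3D convolutions with different spectral strides whose outputs are fused. All module names, layer sizes, kernel shapes, and the concatenation-based fusion are our own assumptions chosen for illustration; this is not the exact FsSE or SSFCNN implementation.

# Illustrative sketch only: SE-style spectral re-weighting followed by the fusion
# of 3D convolutions with different spectral strides. Sizes and the fusion rule
# are assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn


class SpectralSE(nn.Module):
    # Squeeze-and-excitation over the spectral axis of a (B, 1, S, H, W) cube.
    def __init__(self, bands, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(bands, bands // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(bands // reduction, bands),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (B, 1, S, H, W)
        w = x.mean(dim=(1, 3, 4))                # squeeze channel/spatial dims -> (B, S)
        w = self.fc(w).view(x.size(0), 1, -1, 1, 1)
        return x * w                             # band-wise re-weighted cube


class StrideFusion3D(nn.Module):
    # Two 3D conv branches that differ only in spectral stride; their outputs are
    # aligned to a common spectral depth and fused by concatenation.
    def __init__(self, out_channels=8):
        super().__init__()
        self.branch1 = nn.Conv3d(1, out_channels, kernel_size=(7, 3, 3),
                                 stride=(1, 1, 1), padding=(3, 1, 1))
        self.branch2 = nn.Conv3d(1, out_channels, kernel_size=(7, 3, 3),
                                 stride=(2, 1, 1), padding=(3, 1, 1))
        self.align = nn.AdaptiveAvgPool3d((16, None, None))   # common spectral depth

    def forward(self, x):
        f1 = self.align(torch.relu(self.branch1(x)))
        f2 = self.align(torch.relu(self.branch2(x)))
        return torch.cat([f1, f2], dim=1)        # complementary stride features


# Toy usage: a batch of two 11x11 patches with 30 spectral bands.
cube = torch.randn(2, 1, 30, 11, 11)
features = StrideFusion3D()(SpectralSE(bands=30)(cube))
print(features.shape)                            # torch.Size([2, 16, 16, 11, 11])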
Author Contributions: All authors have made great contributions to the work. Conceptualization,
J.Z. (Junbo Zhou) and S.Z.; software, J.Z. (Junbo Zhou); validation, J.Z. (Junbo Zhou), S.Z. and Z.X.;
formal analysis, J.Z. (Junbo Zhou), J.Z. (Jinbo Zhou) and H.L.; investigation, Z.K.; writing—original
draft preparation, J.Z. (Junbo Zhou) and S.Z.; writing—review and editing, J.Z. (Jinbo Zhou), Z.K.
and Z.X. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Hubei Province Natural Science Foundation for Distin-
guished Young Scholars, grant No. 2020CFA063, and funded by excellent young and middle-aged
scientific and technological innovation teams in colleges and universities of Hubei Province, grant
No. T2021009.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Goetz, A.F.H. Three decades of hyperspectral remote sensing of the Earth: A personal view. Remote Sens. Environ. 2009, 113,
S5–S16. [CrossRef]
2. Nalepa, J. Recent Advances in Multi- and Hyperspectral Image Analysis. Sensors 2021, 21, 6002. [CrossRef] [PubMed]
3. Kemker, R.; Kanan, C. Self-Taught Feature Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017,
55, 2693–2705. [CrossRef]
4. Lu, B.; Dao, P.D.; Liu, J.G.; He, Y.H.; Shang, J.L. Recent Advances of Hyperspectral Imaging Technology and Applications in
Agriculture. Remote Sens. 2020, 12, 2659. [CrossRef]
5. Kruse, F.A. Identification and mapping of minerals in drill core using hyperspectral image analysis of infrared reflectance spectra.
Int. J. Remote Sens. 1996, 17, 1623–1632. [CrossRef]
6. Wang, Z.M.; Du, B.; Zhang, L.F.; Zhang, L.P.; Jia, X.P. A Novel Semisupervised Active-Learning Algorithm for Hyperspectral
Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3071–3083. [CrossRef]
7. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollar, P.; Zitnick, C.L. Microsoft COCO: Common Objects in
Context. In Proceedings of the 13th European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September
2014; pp. 740–755.
8. Zeng, S.; Wang, Z.Y.; Gao, C.J.; Kang, Z.; Feng, D.G. Hyperspectral Image Classification With Global-Local Discriminant Analysis
and Spatial-Spectral Context. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 5005–5018. [CrossRef]
9. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]
10. Blanzieri, E.; Melgani, F. Nearest neighbor classification of remote sensing images with the maximal margin principle. IEEE Trans.
Geosci. Remote Sens. 2008, 46, 1804–1811. [CrossRef]
11. Yager, R.R. An extension of the naive Bayesian classifier. Inf. Sci. 2006, 176, 577–588. [CrossRef]
12. Zhang, Y.X.; Liu, K.; Dong, Y.N.; Wu, K.; Hu, X.Y. Semisupervised Classification Based on SLIC Segmentation for Hyperspectral
Image. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1440–1444. [CrossRef]
13. Shinde, P.P.; Shah, S. A review of machine learning and deep learning applications. In Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India, 16–18 August 2018; pp. 1–6. [CrossRef]
14. Zhu, X.X.; Tuia, D.; Mou, L.C.; Xia, G.S.; Zhang, L.P.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive
Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [CrossRef]
15. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM
2017, 60, 84–90. [CrossRef]
16. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [CrossRef] [PubMed]
17. Ma, W.P.; Zhang, J.; Wu, Y.; Jiao, L.C.; Zhu, H.; Zhao, W. A Novel Two-Step Registration Method for Remote Sensing Images
Based on Deep and Local Features. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4834–4843. [CrossRef]
18. Ma, J.Y.; Tang, L.F.; Fan, F.; Huang, J.; Mei, X.G.; Ma, Y. SwinFusion: Cross-domain Long-range Learning for General Image
Fusion via Swin Transformer. IEEE/CAA J. Autom. Sin. 2022, 9, 1200–1217. [CrossRef]
19. Zeng, N.Y.; Wang, Z.D.; Zhang, H.; Kim, K.E.; Li, Y.R.; Liu, X.H. An Improved Particle Filter With a Novel Hybrid Proposal
Distribution for Quantitative Analysis of Gold Immunochromatographic Strips. IEEE Trans. Nanotechnol. 2019, 18, 819–829.
[CrossRef]
20. Rawat, W.; Wang, Z.H. Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review. Neural Comput.
2017, 29, 2352–2449. [CrossRef]
21. Xu, H.; Ma, J.Y.; Jiang, J.J.; Guo, X.J.; Ling, H.B. U2Fusion: A Unified Unsupervised Image Fusion Network. IEEE Trans. Pattern
Anal. Mach. Intell. 2022, 44, 502–518. [CrossRef]
22. Chen, Y.S.; Lin, Z.H.; Zhao, X.; Wang, G.; Gu, Y.F. Deep Learning-Based Classification of Hyperspectral Data. IEEE J. Sel. Top.
Appl. Earth Obs. Remote Sens. 2014, 7, 2094–2107. [CrossRef]
23. Lv, W.J.; Wang, X.F. Overview of Hyperspectral Image Classification. J. Sens. 2020, 2020, 4817234. [CrossRef]
24. Hu, W.; Huang, Y.Y.; Wei, L.; Zhang, F.; Li, H.C. Deep Convolutional Neural Networks for Hyperspectral Image Classification.
J. Sens. 2015, 2015, 258619. [CrossRef]
25. Imani, M.; Ghassemian, H. An overview on spectral and spatial information fusion for hyperspectral image classification: Current
trends and challenges. Inf. Fusion 2020, 59, 59–83. [CrossRef]
26. Luo, F.L.; Zou, Z.H.; Liu, J.M.; Lin, Z.P. Dimensionality Reduction and Classification of Hyperspectral Image via Multistructure
Unified Discriminative Embedding. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16. [CrossRef]
27. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E.H. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42,
2011–2023. [CrossRef] [PubMed]
28. Zhao, Q.; Cai, X.; Chen, C.; Lv, L.; Chen, M. Commented content classification with deep neural network based on attention
mechanism. In Proceedings of the 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control
Conference (IAEAC), Chongqing, China, 25–26 March 2017; pp. 2016–2019.
29. Ma, W.P.; Ma, H.X.; Zhu, H.; Li, Y.T.; Li, L.W.; Jiao, L.C.; Hou, B. Hyperspectral image classification based on spatial and spectral
kernels generation network. Inf. Sci. 2021, 578, 435–456. [CrossRef]
30. Chen, Y.S.; Zhao, X.; Jia, X.P. Spectral-Spatial Classification of Hyperspectral Data Based on Deep Belief Network. IEEE J. Sel. Top.
Appl. Earth Obs. Remote Sens. 2015, 8, 2381–2392. [CrossRef]
31. Mou, L.C.; Ghamisi, P.; Zhu, X.X. Deep Recurrent Neural Networks for Hyperspectral Image Classification. IEEE Trans. Geosci.
Remote Sens. 2017, 55, 3639–3655. [CrossRef]
32. Zhao, W.Z.; Du, S.H. Spectral-Spatial Feature Extraction for Hyperspectral Image Classification: A Dimension Reduction and
Deep Learning Approach. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4544–4554. [CrossRef]
33. Zhang, M.M.; Li, W.; Du, Q. Diverse Region-Based CNN for Hyperspectral Image Classification. IEEE Trans. Image Process. 2018,
27, 2623–2634. [CrossRef] [PubMed]
34. Guo, A.J.X.; Zhu, F. A CNN-Based Spatial Feature Fusion Algorithm for Hyperspectral Imagery Classification. IEEE Trans. Geosci.
Remote Sens. 2019, 57, 7170–7181. [CrossRef]
35. Yang, L.M.; Yang, Y.H.; Yang, J.H.; Zhao, N.Y.; Wu, L.; Wang, L.G.; Wang, T.R. FusionNet: A Convolution-Transformer Fusion
Network for Hyperspectral Image Classification. Remote Sens. 2022, 14, 4066. [CrossRef]
36. He, J.; Zhao, L.N.; Yang, H.W.; Zhang, M.M.; Li, W. HSI-BERT: Hyperspectral Image Classification Using the Bidirectional Encoder
Representation From Transformers. IEEE Trans. Geosci. Remote Sens. 2020, 58, 165–178. [CrossRef]
37. He, X.; Chen, Y.S.; Lin, Z.H. Spatial-Spectral Transformer for Hyperspectral Image Classification. Remote Sens. 2021, 13, 498.
[CrossRef]
38. Khotimah, W.N.; Bennamoun, M.; Boussaid, F.; Sohel, F.; Edwards, D. A High-Performance Spectral-Spatial Residual Network for
Hyperspectral Image Classification with Small Training Data. Remote Sens. 2020, 12, 3137. [CrossRef]
39. Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning Spatiotemporal Features with 3D Convolutional Networks. In
Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 11–18 December 2015; pp. 4489–4497.
40. Chen, Y.S.; Jiang, H.L.; Li, C.Y.; Jia, X.P.; Ghamisi, P. Deep Feature Extraction and Classification of Hyperspectral Images Based on
Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [CrossRef]
41. Ahmad, M.; Khan, A.M.; Mazzara, M.; Distefano, S.; Ali, M.; Sarfraz, M.S. A Fast and Compact 3-D CNN for Hyperspectral Image
Classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [CrossRef]
42. Zhong, Z.L.; Li, J.; Luo, Z.M.; Chapman, M. Spectral-Spatial Residual Network for Hyperspectral Image Classification: A 3-D
Deep Learning Framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 847–858. [CrossRef]
43. Laban, N.; Abdellatif, B.; Ebeid, H.M.; Shedeed, H.A.; Tolba, M.F. Reduced 3-D Deep Learning Framework for Hyperspectral
Image Classification. In International Conference on Advanced Machine Learning Technologies and Applications; Springer: Cham,
Switzerland, 2020; pp. 13–22.
44. Shi, C.P.; Sun, J.W.; Wang, L.G. Hyperspectral Image Classification Based on Spectral Multiscale Convolutional Neural Network.
Remote Sens. 2022, 14, 1951. [CrossRef]
45. Diakite, A.; Jiangsheng, G.; Xiaping, F. Hyperspectral image classification using 3D 2D CNN. IET Image Process. 2021, 15,
1083–1092. [CrossRef]
46. Firat, H.; Hanbay, D. Classification of Hyperspectral Images Using 3D CNN Based ResNet50. In Proceedings of the 2021 29th
Signal Processing and Communications Applications Conference (SIU), Istanbul, Turkey, 9–11 June 2021; pp. 1–4.
47. Ben Hamida, A.; Benoit, A.; Lambert, P.; Ben Amar, C. 3-D Deep Learning Approach for Remote Sensing Image Classification.
IEEE Trans. Geosci. Remote Sens. 2018, 56, 4420–4434. [CrossRef]
48. Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D-2-D CNN Feature Hierarchy for Hyperspectral
Image Classification. IEEE Geosci. Remote Sens. Lett. 2020, 17, 277–281. [CrossRef]
49. Hong, D.F.; Han, Z.; Yao, J.; Gao, L.R.; Zhang, B.; Plaza, A.; Chanussot, J. SpectralFormer: Rethinking Hyperspectral Image
Classification With Transformers. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15. [CrossRef]
50. Sheskin, D.J. Handbook of Parametric and Nonparametric Statistical Procedures; CRC Press: Boca Raton, FL, USA, 2003. [CrossRef]