Continuous Wavelet Transform Peak-Seeking Attention Mechanism Convolutional Neural Network: A Lightweight Feature Extraction Network with an Attention Mechanism Based on the Continuous Wavelet Transform Peak-Seeking Method for Aero-Engine Hot Jet Fourier Transform Infrared Classification
Abstract
1. Introduction
- This paper presents a classification method for aero-engines based on infrared spectroscopy. The selective absorption of infrared radiation by different molecules is an established means of identifying substances; likewise, the distinctive spectral features in the infrared spectra of hot jet gases from different types of aero-engines can be used to classify the engines.
- Because infrared spectral data of aero-engine hot jets are scarce, this paper constructs a new benchmark data set. The data set covers the infrared spectrum over the 2.5~12 μm wavelength range and includes six different aero-engine models (turbojet and turbofan engines).
- This paper provides a deep learning framework for classifying aero-engine hot jet infrared spectra: a convolutional neural network (CNN) with a peak-seeking attention mechanism. The backbone consists of three structurally identical feature extraction blocks with batch normalization and max-pooling layers. In the peak-seeking attention branch, spectral peaks are detected with the continuous wavelet transform (CWT), and the peak wavenumbers that occur most frequently are counted. The attention mechanism weights these statistically selected peaks and applies the weights to the feature maps of the backbone CNN. The network structure is lightweight, balancing classification accuracy and computational efficiency.
2. Spectral Classification Network Structure Design
2.1. Overall Network Design
2.2. Backbone Network Design
2.2.1. One-Dimensional Convolutional Layer (Conv1D Layer)
2.2.2. Batch Normalization (BN)
2.2.3. Maximum Pooling Layer
2.2.4. Flatten Layer
2.2.5. Fully Connected Layer (FC Layer)
2.3. Attention Mechanism Based on Peak Seeking
2.3.1. Peak-Seeking Algorithm Block
Algorithm 1: Peak-seeking algorithm and peak statistics
Input: Spectral data.
Output: Peak data.
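As a concrete illustration of Algorithm 1, the sketch below detects peaks with SciPy's CWT-based peak finder and tallies how often each wavenumber bin hosts a peak across the training spectra. The function name, wavelet widths, and 4 cm⁻¹ bin width are our assumptions, not the authors' exact implementation.

```python
import numpy as np
from collections import Counter
from scipy.signal import find_peaks_cwt

def cwt_peak_statistics(spectra, wavenumbers, widths=np.arange(1, 31), bin_width=4.0):
    """Detect peaks in each spectrum via the continuous wavelet transform and
    count how often each wavenumber bin hosts a peak across the data set."""
    counts = Counter()
    for y in spectra:
        for i in find_peaks_cwt(y, widths):  # CWT ridge-line peak indices
            # Quantize peak positions so nearby detections fall into one bin.
            counts[round(wavenumbers[i] / bin_width) * bin_width] += 1
    return counts

# Example: the most frequent peak bins (e.g., the CO2 features near 2350 cm^-1)
# top_peaks = cwt_peak_statistics(train_spectra, wn).most_common(10)
```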
2.3.2. Attention Mechanism
Algorithm 2: CNN with Attention Mechanism
Input: Spectral data, peak data.
Output: Prediction labels for the prediction data set.
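One plausible reading of Algorithm 2's weighting step, as a minimal sketch: the peak-occurrence counts are normalized into an attention weight over the spectral axis and applied element-wise to the backbone's feature map. The residual-style (1 + w) form and the resampling of counts to the feature-map length are assumptions.

```python
import numpy as np

def peak_attention(feature_map, peak_counts):
    """Re-weight a (length, channels) feature map by normalized peak statistics.

    feature_map : backbone CNN activations aligned with the spectral axis.
    peak_counts : peak-occurrence counts resampled to `length` (assumed).
    """
    w = peak_counts / (peak_counts.max() + 1e-8)  # scale counts to [0, 1]
    return feature_map * (1.0 + w[:, None])       # emphasize peak regions
```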
2.4. Network Training Method
2.4.1. Optimizer
2.4.2. Loss Function
2.4.3. Activation Function
3. Spectral Data Set
3.1. Design of Aero-Engine Spectrum Measurement Experiment
3.2. Data Preprocessing
3.3. Data Set Construction
4. Experiments and Results
4.1. Performance Measures and Experimental Results
- ① Accuracy: the ratio of correctly classified samples to the total number of samples.
- ② Precision: the ratio of true positive predictions to the total number of samples predicted as positive.
- ③ Recall: the ratio of correctly predicted positive samples to the number of samples that are truly positive.
- ④ F1-score: the harmonic mean of precision and recall, quantifying the overall performance of a model.
- ⑤ Confusion matrix: the confusion matrix gives a comprehensive view of the classifier's performance across categories, displaying the discrepancy between actual and predicted values; its diagonal elements count the correct predictions for each category. Table 6 displays the confusion matrix layout (see the metric-computation sketch after this list).
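These five measures map directly onto scikit-learn; a minimal sketch with toy labels (macro averaging over classes is our assumption, as the averaging mode is not stated):

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [0, 0, 1, 1, 2, 2]   # toy ground-truth engine classes
y_pred = [0, 1, 1, 1, 2, 2]   # toy predictions

print(accuracy_score(y_true, y_pred))                    # correct / total
print(precision_score(y_true, y_pred, average="macro"))  # TP / (TP + FP) per class
print(recall_score(y_true, y_pred, average="macro"))     # TP / (TP + FN) per class
print(f1_score(y_true, y_pred, average="macro"))         # harmonic mean of the two
print(confusion_matrix(y_true, y_pred))                  # diagonal = correct predictions
```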
4.2. Comparative Experimental Results of Traditional Classification Methods
4.3. Comparative Experimental Results of Deep Learning Classification Methods
4.4. Analysis of Ablation Study
4.4.1. Effectiveness of Peak Features
4.4.2. Effectiveness of AM
4.4.3. Comparison of Network Design
- ①
- ② Network depth: in deep learning, network depth plays a decisive role in a network's expressive power. Deeper networks generally express features better, because depth shapes feature quality in terms of invariance and abstraction. We therefore compared networks of different depths, where each additional layer is one feature extraction block and all variants share the same loss function, optimizer, and learning rate; the results are shown in Table 19 and Figure 12 (see the depth-sweep sketch after this list).
- ③
- ④
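A sketch of how such a depth sweep can be assembled, stacking one feature extraction block per depth level; the channel widths, Keras API usage, and input length are our assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(depth, input_len, num_classes):
    """Stack `depth` feature extraction blocks (Conv1D + BN + max pooling)."""
    model = keras.Sequential([keras.Input(shape=(input_len, 1))])
    for d in range(depth):
        model.add(layers.Conv1D(32 * 2 ** min(d, 2), 3, activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(num_classes, activation="softmax"))
    # Identical training setup across depths, as in the experiment.
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# models = {d: build_cnn(d, 7464, 6) for d in range(1, 7)}  # depths 1-6, data set C
```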
4.4.4. Running Time
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Name | Manufacturer | Measurement Pattern | Spectral Resolution (cm⁻¹) | Spectral Measurement Range (µm) | Full Field of View Angle
---|---|---|---|---|---|
EM27 | Bruker | Active/Passive | Active: 0.5/1 Passive: 0.5/1/4 | 2.5~12 | 30 mrad (no telescope) (1.7°) |
Telemetry Fourier Transform Infrared Spectrometer | Aerospace Information Research Institute | Passive | 1 | 2.5~12 | 1.5° |
Aero-Engine Serial Number | Environmental Temperature | Environmental Humidity | Detection Distance |
---|---|---|---|
Turbofan engine 1 | 19 °C | 58.5% RH | 5 m
Turbofan engine 2 | 16 °C | 67% RH | 5 m
Turbojet engine | 14 °C | 40% RH | 5 m
Turbojet UAV | 30 °C | 43.5% RH | 11.8 m
Turbojet UAV with propeller at tail | 20 °C | 71.5% RH | 5 m
Turbojet manned aircraft | 19 °C | 73.5% RH | 10 m
Label | Type | Number of Data Pieces | Number of Error Data | Full Band Data Volume | Medium Wave Range Data Volume |
---|---|---|---|---|---|
1 | Turbofan engine 1 | 792 | 17 | 16,384 (1 cm⁻¹)/32,768 (0.5 cm⁻¹) | 7464/14,928
2 | Turbofan engine 2 | 258 | 2 | 16,384 (1 cm⁻¹)/32,768 (0.5 cm⁻¹) | 7464/14,928
3 | Turbojet engine | 384 | 4 | 16,384 (1 cm⁻¹)/32,768 (0.5 cm⁻¹) | 7464/14,928
Label | Type | Number of Data Pieces | Number of Error Data | Full Band Data Volume | Medium Wave Range Data Volume |
---|---|---|---|---|---|
1 | Turbojet UAV | 193 | 0 | 16,384 | 7464 |
2 | Turbojet UAV with propeller at tail | 48 | 0 | 16,384 | 7464 |
3 | Turbojet manned aircraft | 202 | 3 | 16,384 | 7464 |
Label | Type | Number of Data Pieces | Number of Error Data | Full Band Data Volume | Medium Wave Range Data Volume |
---|---|---|---|---|---|
1 | Turbojet UAV | 193 | 0 | 16,384 | 7464 |
2 | Turbojet UAV with propeller at tail | 48 | 0 | 16,384 | 7464 |
3 | Turbojet manned aircraft | 202 | 3 | 16,384 | 7464 |
4 | Turbofan engine 1 | 792 | 17 | 16,384 | 7464 |
5 | Turbofan engine 2 | 258 | 2 | 16,384 | 7464 |
6 | Turbojet engine | 384 | 4 | 16,384 | 7464 |
Real Results | Forecast: Positive Samples | Forecast: Negative Samples
---|---|---
Positive samples | TP | FN
Negative samples | FP | TN
Methods | Parameter Settings
---|---
CWT-AM-CNN | Conv1D(32, 3), Conv1D(64, 3), Conv1D(128, 3), activation = 'ReLU'; BatchNormalization(); MaxPooling1D(2); Dense(128, activation = 'ReLU'), activation = 'softmax'; optimizer = Adam, lr = 0.00001; loss = 'sparse_categorical_crossentropy', metrics = ['accuracy']; epochs = 500
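Read literally, the parameter table corresponds to a Keras backbone like the sketch below; the layer ordering within each block and the omission of the peak-seeking attention branch are our assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_backbone(input_len, num_classes):
    model = keras.Sequential([
        keras.Input(shape=(input_len, 1)),
        # Three feature extraction blocks, per the table.
        layers.Conv1D(32, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling1D(2),
        layers.Conv1D(64, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling1D(2),
        layers.Conv1D(128, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling1D(2),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.00001),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model  # trained for epochs = 500, per the table
```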
Data Set | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---
Data set A | 97.44% | 94.08% | 85.11% | [11 8 0] [0 77 0] [1 0 38] | 88.24%
Data set B | 100.00% | 100.00% | 100.00% | [19 0 0] [0 8 0] [0 0 17] | 100.00%
Data set C | 100.00% | 98.72% | 94.70% | [17 0 0 0 0 0] [0 7 0 0 0 0] [0 0 16 0 0 0] [0 0 0 84 0 0] [0 0 0 7 15 0] [0 0 0 0 0 33] | 96.18%
Characteristic Peak Type | Emission Peaks (cm⁻¹) | Absorption Peaks (cm⁻¹)
---|---|---
Peak standard features | 2350, 2390 | 720, 667
Characteristic peak range values | 2348–2350.5, 2377–2392 | 718–722, 666.7–670.5
Methods | Parameter Settings |
---|---|
SVM | decision_function_shape = ‘ovr’, kernel = ‘rbf’ |
XGBoost | objective = ‘multi:softmax’, num_classes = num_classes |
CatBoost | loss_function = ‘MultiClass’ |
Adaboost | n_estimators = 200 |
Random Forest | n_estimators = 300 |
LightGBM | ‘objective’: ‘multiclass’, ‘num_class’: num_classes |
Neural Network | hidden_layer_sizes = (100), activation = ‘ReLU’, solver = ‘adam’, max_iter = 200 |
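These settings translate directly into the standard Python APIs; a hedged sketch (unlisted parameters keep library defaults, scikit-learn spells the activation 'relu', and the multiclass wrappers infer the class count at fit time):

```python
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier

models = {
    "SVM": SVC(decision_function_shape="ovr", kernel="rbf"),
    "XGBoost": XGBClassifier(objective="multi:softmax"),  # num_class inferred at fit
    "CatBoost": CatBoostClassifier(loss_function="MultiClass"),
    "AdaBoost": AdaBoostClassifier(n_estimators=200),
    "Random Forest": RandomForestClassifier(n_estimators=300),
    "LightGBM": LGBMClassifier(objective="multiclass"),   # num_class inferred at fit
    "Neural Network": MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                                    solver="adam", max_iter=200),
}
# for name, clf in models.items():
#     clf.fit(X_train, y_train)  # X_train: peak feature vectors
```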
Classification Method | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---
Feature vector + SVM | 57.04% | 33.33% | 19.01% | [0 0 0] [19 77 39] [0 0 0] | 24.21% | |
Feature vector + XGBoost | 96.30% | 96.09% | 94.36% | [18 3 0] [1 74 1] [0 0 38] | 95.14% | |
Feature vector + CatBoost | 97.04% | 96.53% | 95.80% | [18 2 0] [1 75 1] [0 0 38] | 96.14% | |
Feature vector + AdaBoost | 74.81% | 74.29% | 71.93% | [11 25 0] [8 52 1] [0 0 38] | 71.35% | |
Feature vector + Random Forest | 97.04% | 96.53% | 95.80% | [18 2 0] [1 75 1] [0 0 38] | 96.14% | |
Feature vector + LightGBM | 96.30% | 96.09% | 94.36% | [18 3 0] [1 74 1] [0 0 38] | 95.14% | |
Feature vector + Neural Networks | 86.67% | 68.42% | 92.64% | [1 0 0] [16 77 0] [2 0 39] | 66.03% |
Classification Method | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---
Feature vector + SVM | 86.36% | 88.24% | 92.00% | [19 0 6] [0 8 0] [0 0 11] | 88.31% | |
Feature vector + XGBoost | 84.09% | 86.48% | 88.89% | [18 0 6] [0 8 0] [1 0 11] | 86.53% | |
Feature vector + CatBoost | 86.36% | 88.24% | 92.00% | [19 0 6] [0 8 0] [0 0 11] | 88.31% | |
Feature vector + AdaBoost | 77.27% | 80.60% | 85.19% | [18 0 9] [0 8 0] [1 0 8] | 79.93% | |
Feature vector + Random Forest | 86.36% | 88.24% | 92.00% | [19 0 6] [0 8 0] [0 0 11] | 88.31% | |
Feature vector + LightGBM | 84.09% | 86.48% | 88.89% | [18 0 6] [0 8 0] [1 0 11] | 86.53% | |
Feature vector + Neural Networks | 88.64% | 90.20% | 93.06% | [19 0 5] [0 8 0] [0 0 12] | 90.38% |
Classification Method | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---
Feature vector + SVM | 59.78% | 44.15% | 47.67% | [8 0 3 0 0 0] [0 3 0 0 0 0] [9 1 12 0 0 0] [0 3 1 84 22 33] [0 0 0 0 0 0] [0 0 0 0 0 0] | 42.38% | |
Feature vector + XGBoost | 94.97% | 92.44% | 93.59% | [15 0 3 0 0 0] [0 7 0 0 0 0] [2 0 13 0 0 0] [0 0 0 83 3 0] [0 0 0 1 19 0] [0 0 0 0 0 33] | 92.95% | |
Feature vector + CatBoost | 94.41% | 90.35% | 93.52% | [15 0 2 0 0 0] [0 6 0 0 0 0] [2 0 14 0 0 0] [0 0 0 83 4 0] [0 1 0 1 18 0] [0 0 0 0 0 33] | 91.81% | |
Feature vector + AdaBoost | 79.89% | 63.66% | 71.49% | [17 5 6 0 0 0] [0 2 0 0 0 0] [0 0 10 0 0 0] [0 0 0 84 18 3] [0 0 0 0 0 0] [0 0 0 0 4 30] | 62.56% | |
Feature vector + Random Forest | 94.41% | 91.40% | 92.70% | [15 0 4 0 0 0] [0 7 0 0 0 0] [2 0 12 0 0 0] [0 0 0 83 3 0] [0 0 0 1 19 0] [0 0 0 0 0 33] | 91.91% | |
Feature vector + LightGBM | 94.41% | 90.68% | 92.40% | [14 0 2 0 0 0] [0 6 0 0 0 0] [3 0 14 0 0 0] [0 0 0 82 2 0] [0 1 0 2 20 0] [0 0 0 0 0 33] | 91.42% | |
Feature vector + Neural Networks | 84.92% | 76.79% | 76.57% | [17 0 2 0 0 0] [0 6 0 0 0 0] [0 0 12 0 0 0] [0 0 2 84 18 0] [0 1 0 0 0 0] [0 0 0 0 4 33] | 76.02% |
Methods | Parameter Settings
---|---
AE | Dense(encoding_dim, activation = 'ReLU'); Dense(input_dim, activation = 'sigmoid'); Dense(num_classes, activation = 'softmax'); epochs = 500, optimizer = Adam(lr = 0.00001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy']
RNN | SimpleRNN(4, return_sequences = True); BatchNormalization(); Dense(4, activation = 'ReLU'); Dense(num_classes, activation = 'softmax'); epochs = 500, optimizer = Adam(lr = 0.00001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy']
LSTM | LSTM(8, return_sequences = True), BatchNormalization(); LSTM(8), BatchNormalization(); Dense(8, activation = 'ReLU'); Dense(num_classes, activation = 'softmax'); epochs = 500, optimizer = Adam(lr = 0.00001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy']
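For instance, the LSTM row corresponds to a Keras model like the sketch below; the input reshaping to (length, 1) and other unlisted details are assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_lstm(input_len, num_classes):
    model = keras.Sequential([
        keras.Input(shape=(input_len, 1)),
        layers.LSTM(8, return_sequences=True),
        layers.BatchNormalization(),
        layers.LSTM(8),
        layers.BatchNormalization(),
        layers.Dense(8, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.00001),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model  # epochs = 500, per the table
```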
Method | Data Set | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---|---
AE | A | 58.52% | 52.63% | 36.84% | [2 17 0] [0 77 0] [0 39 0] | 30.79%
AE | B | 38.64% | 12.88% | 33.33% | [0 0 19] [0 0 8] [0 0 17] | 18.58%
AE | C | 46.93% | 7.82% | 16.67% | [0 0 0 17 0 0] [0 0 0 7 0 0] [0 0 0 16 0 0] [0 0 0 84 0 0] [0 0 0 22 0 0] [0 0 0 33 0 0] | 10.65%
RNN | A | 38.64% | 12.88% | 33.33% | [0 0 19] [0 0 8] [0 0 17] | 18.58%
RNN | B | 57.03% | 19.01% | 33.33% | [0 19 0] [0 77 0] [0 39 0] | 24.21%
RNN | C | 46.92% | 7.80% | 16.66% | [0 0 0 17 0 0] [0 0 0 7 0 0] [0 0 0 16 0 0] [0 0 0 84 0 0] [0 0 0 22 0 0] [0 0 0 33 0 0] | 10.64%
LSTM | A | 38.63% | 12.88% | 33.33% | [0 0 19] [0 0 8] [0 0 17] | 18.58%
LSTM | B | 57.03% | 19.01% | 33.33% | [0 19 0] [0 77 0] [0 39 0] | 24.21%
LSTM | C | 62.57% | 48.72% | 41.91% | [4 0 13 0 0 0] [0 0 7 0 0 0] [0 0 16 0 0 0] [0 0 0 82 0 2] [0 0 0 22 0 0] [0 0 0 23 0 10] | 36.97%
XGBoost | Accuracy | Precision | Recall | Confusion Matrix | F1-Score | Running Time/s
---|---|---|---|---|---|---
Data set A | 100.00% | 100.00% | 100.00% | [19 0 0] [0 8 0] [0 0 17] | 100.00% | 0.1359 |
Data set B | 99.26% | 99.15% | 99.57% | [19 0 0] [0 77 1] [0 0 38] | 99.35% | 0.2040 |
Data set C | 98.88% | 95.24% | 98.15% | [17 0 0 0 0 0] [0 5 0 0 0 0] [0 2 16 0 0 0] [0 0 0 84 0 0] [0 0 0 0 22 0] [0 0 0 0 0 33] | 96.24% | 0.3402 |
Data Set | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---
Data set A | 94.07% | 96.86% | 85.96% | [11 8 0] [0 77 0] [0 0 39] | 89.47% | |
Data set B | 100% | 100% | 100% | [19 0 0] [0 8 0] [0 0 17] | 100% | |
Data set C | 96.09% | 98.72% | 94.70% | [17 0 0 0 0 0] [0 7 0 0 0 0] [0 0 16 0 0 0] [0 0 0 84 0 0] [0 0 0 7 15 0] [0 0 0 0 0 33] | 96.18% |
Data Set | Accuracy | Precision Score | Recall | Confusion Matrix | F1-Score
---|---|---|---|---|---
Data set A | 92.59% | 91.70% | 84.68% | [11 8 0] [1 76 0] [1 0 38] | 87.29% | |
Data set B | 100% | 100% | 100% | [19 0 0] [0 8 0] [0 0 17] | 100% | |
Data set C | 92.18% | 94.11% | 89.94% | [17 0 0 0 0 0] [0 7 0 0 0 0] [4 0 12 0 0 0] [0 0 0 81 0 3] [0 0 0 7 15 0] [0 0 0 0 0 33] | 91.02% |
Data Set | Evaluation Criterion | 1 Layer | 2 Layers | 3 Layers | 4 Layers | 5 Layers | 6 Layers
---|---|---|---|---|---|---|---
Data set A | Accuracy | 63% | 66% | 83% | 81% | 79% | 82%
Data set A | Training Time/s | 315.83 | 939.22 | 1332.54 | 1527.18 | 1735.24 | 2032.12
Data set A | Evaluation Time/s | 0.14 | 0.18 | 0.22 | 0.33 | 0.35 | 0.32
Data set B | Accuracy | 93% | 100% | 100% | 100% | 100% | 100%
Data set B | Training Time/s | 81.90 | 148.38 | 258.92 | 347.15 | 408.00 | 431.55
Data set B | Evaluation Time/s | 0.12 | 0.13 | 0.18 | 0.25 | 0.22 | 0.25
Data set C | Accuracy | 63% | 74% | 77% | 73% | 78% | 82%
Data set C | Training Time/s | 421.56 | 1088.86 | 1522.65 | 2014.09 | 2411.60 | 2850.66
Data set C | Evaluation Time/s | 0.16 | 0.15 | 0.21 | 0.23 | 0.30 | 0.36
Optimizers | Prediction Accuracy | Training Time/s | Prediction Time/s |
---|---|---|---|
SGD | 93% | 1663.36 | 0.25 |
SGDM | 93% | 2074.59 | 0.23 |
Adagrad | 94% | 2133.88 | 0.24 |
RMSProp | 89% | 2194.60 | 0.27 |
Adam | 94% | 2165.09 | 0.24 |
Learning Rate | Prediction Accuracy | Training Time/s | Prediction Time/s
---|---|---|---
0.01 | 47% | 878.21 | 0.26
0.001 | 75% | 1215.80 | 0.20
0.0001 | 42% | 1246.89 | 0.21
0.00001 | 95% | 1241.00 | 0.22
0.000001 | 95% | 1221.39 | 0.21
Method | Running Time/s (Data Set A) | Running Time/s (Data Set B) | Running Time/s (Data Set C)
---|---|---|---
CNN | 5 | 4 | 6 |
CNN-BN | 5 | 4 | 5 |
CWT-AM-CNN | 6 | 5 | 6 |
RNN | 980 | 243 | 1151 |
LSTM | 14 | 4 | 17 |
AE | 0.025 | 0.025 | 0.026 |
Feature vector + SVM | 0.08 | 0.01 | 0.12 |
Feature vector + XGBoost | 0.17 | 0.24 | 0.30 |
Feature vector + CatBoost | 3.09 | 2.61 | 4.74 |
Feature vector + AdaBoost | 0.30 | 0.26 | 0.39 |
Feature vector + Random Forest | 0.48 | 0.44 | 0.56 |
Feature vector + LightGBM | 0.20 | 0.17 | 0.44 |
Feature vector + Neural Networks | 0.29 | 0.31 | 0.85 |