IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 21, 2024 3501305

An Incept-TextCNN Model for Ship Target Detection in SAR Range-Compressed Domain

HongCheng Zeng, Member, IEEE, YuTong Song, Wei Yang, Member, IEEE, Tian Miao, Wei Liu, Senior Member, IEEE, WeiJie Wang, and Jie Chen, Senior Member, IEEE

Abstract—Traditionally, synthetic aperture radar (SAR)-based ship target detection is performed in the image domain, where SAR imaging processing has to be applied first. However, SAR imaging processing is complex and time-consuming, especially in the wide-swath working mode. In fact, for open sea scenes, most echoes are sea surface signals with no ship targets, and there is no need for imaging processing in those areas. Therefore, non-image domain ship target detection is studied in this letter, and a novel Incept-text convolutional neural network (TextCNN) model is proposed for ship target detection in the SAR range-compressed domain (RCD). In the proposed method, the SAR echo data are first converted into a 1-D range profile signal by range compression and mean pooling; the Incept-TextCNN model is then applied, and its output indicates whether ship targets exist in the relevant range cells. Finally, the effectiveness and efficiency of the proposed method are verified with simulated and real spaceborne SAR data, and the results demonstrate that the proposed model can filter out the invalid range-compressed data of the sea surface area, which can significantly reduce the amount of data for subsequent SAR imaging and ship classification.

Index Terms—Data filtering, ship target, synthetic aperture radar (SAR) range-compressed domain (RCD), text convolutional neural network (TextCNN).

Manuscript received 2 November 2023; revised 26 December 2023; accepted 3 January 2024. Date of publication 9 January 2024; date of current version 23 January 2024. This work was supported by the Beijing Natural Science Foundation under Grant 4222006. (Corresponding author: Wei Yang.)
HongCheng Zeng, YuTong Song, Wei Yang, and Jie Chen are with the School of Electronic and Information Engineering, Beihang University, Beijing 100191, China (e-mail: [email protected]; songyutong@buaa.edu.cn; [email protected]; [email protected]).
Tian Miao is with the Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100045, China (e-mail: [email protected]).
Wei Liu is with the School of Electronic Engineering and Computer Science, Queen Mary University of London, E1 4NS London, U.K. (e-mail: [email protected]).
WeiJie Wang is with the Shanghai Aerospace Electronic Technology Institute, Shanghai 201108, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/LGRS.2024.3351745
1558-0571 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

Ship detection is an important application of synthetic aperture radar (SAR). As SAR works in the microwave band, compared with optical sensors, SAR can obtain high-quality remote sensing images under complex weather conditions, and realize all-day, all-weather, and high-resolution wide-swath imaging tasks. As a result, SAR has become an important tool for ocean imaging and monitoring.

Traditionally, widely used ship detection methods are based on the constant false alarm rate (CFAR) in SAR [1]. However, there are clear limitations for CFAR-based detectors: their detection ability is affected by surrounding buildings and ports in nearshore scenes, and their characteristic pixel-by-pixel detection process leads to low processing efficiency. Recently, deep-learning-based techniques have been applied to SAR ship detection. The Faster R-CNN model is combined with the CFAR detector in [2], while SAR image target detection based on the SSD model is performed in [3]. In [4], a dense connection module is introduced in YOLOv3 to detect small targets. In [5], the large-size detection process is optimized, where the slices with potential targets are screened first, followed by further refined detection. Based on this, a new method is proposed in [6] to reduce the involved calculations through context information. These deep-learning-based algorithms significantly improve the detection speed and accuracy; however, those methods are performed in the SAR image domain, which has two major drawbacks: 1) SAR image processing is time consuming, especially azimuth focusing and 2) for sea surface scenarios, most areas have no targets, and a lot of computing resources are wasted in these regions.

To overcome these shortcomings, some target detection methods in the non-image domain have been presented in recent years, and one representative example is the range-compressed domain (RCD), as it does not require time-consuming azimuth focusing. In [7], a ship detector is proposed based on Faster R-CNN working in the RCD; in [8], a two-step detection method is presented with the first step using complex signal kurtosis in the RCD to screen possible ship areas coarsely, and the second step applying CNN to further detect the potential ship areas; an oriented ship detection strategy is designed in [9], which calculates the CFAR detection threshold in the range-Doppler domain; a supportive ship tracking concept is introduced in [10] in the range-Doppler domain using an airborne-based radar sensor; in addition, a method for ship detection from raw SAR echo data is proposed in [11]. However, most of them are based on 2-D data for detection, where the model size tends to be large and dependent on CPU resources, and detection based on 1-D data often does not take advantage of deep learning that can extract deep features.

In this letter, a novel Incept-TextCNN model is presented to detect ship targets. In the proposed method, the SAR echo signal is first converted into the 1-D range profile, and then the TextCNN model extracts the deep features of the amplitude information in the data and screens the range gates containing ship targets. As a result, the area of interest can be located quickly, the large non-target areas can be filtered out effectively, and the cost of subsequent imaging and detection of non-target areas can be reduced.

The remainder of this letter is organized as follows. Construction of the RCD data set is presented in Section II, and the Incept-TextCNN model is introduced in Section III for coarse detection of the ship target area. Experimental results are provided in Section IV, and conclusions are drawn in Section V.

II. RCD SHIP TARGET DATASET


To apply the deep-learning-based method, preparing the
RCD ship target data set is an important step in the proposed
method. The RCD data are the intermediate product of SAR
imaging. In this letter, the ship target data set in RCD is
constructed first, and a corresponding 1-D range profile is also
provided, which is obtained from range-compressed data and
contains information about signal amplitude changes along the
range direction.

A. Simulated Ship Target Data in RCD


Spaceborne SAR simulation is used to obtain sufficient range-compressed ship target data. As it is difficult to obtain the SAR echo data, the ship targets in real SAR images are used to simulate the echo data, which are then processed and transformed into range-compressed data. For the RCD data simulation, the target SAR image is used as the input, and the simulated echo data are obtained through parameter setting, SAR system simulation, random phase simulation, and echo simulation. Then, range fast Fourier transform (FFT), range-matched filtering, and range inverse FFT (IFFT) are performed to obtain the range-compressed data. Furthermore, a mean pooling operation is carried out along the azimuth direction to obtain the corresponding 1-D range profile signal, which is the input of the subsequent training model. Based on this simulation method, the simulated RCD ship target data and the corresponding 1-D range profiles are presented in Fig. 1, where the horizontal direction represents the range. Here, the range resolution is about 1 m. As shown, the fluctuation of signal amplitude can be clearly observed, and this characteristic will be useful in the subsequent ship target detection.

Fig. 1. Simulated ship target data. (a) Data A in SAR image. (b) Data B in SAR image. (c) RCD data of A. (d) RCD data of B. (e) One-dimensional range profile of A. (f) One-dimensional range profile of B.
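As an illustration of the processing chain in Section II-A, the following minimal Python sketch range-compresses each azimuth line by frequency-domain matched filtering and then mean-pools the amplitudes along azimuth to form a 1-D range profile. It is a sketch under stated assumptions rather than the authors' implementation: the function names `dft`, `range_compress`, and `range_profile`, the toy chirp replica, and the choice to pool magnitudes are all hypothetical, and a naive DFT stands in for the FFT so that only the standard library is needed.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def range_compress(echo_line, replica):
    """Range DFT -> matched filter (conjugate replica spectrum) -> inverse DFT."""
    n = len(echo_line)
    padded = list(replica) + [0j] * (n - len(replica))
    echo_spec = dft(echo_line)
    ref_spec = dft(padded)
    return dft([e * r.conjugate() for e, r in zip(echo_spec, ref_spec)],
               inverse=True)


def range_profile(rcd):
    """Mean-pool amplitudes along azimuth: one value per range cell."""
    n_az = len(rcd)
    return [sum(abs(line[r]) for line in rcd) / n_az
            for r in range(len(rcd[0]))]


# Toy scene (assumed values): 2 azimuth lines, 32 range cells, target at cell 5.
replica = [cmath.exp(1j * cmath.pi * 0.05 * t * t) for t in range(8)]
n_range, delay = 32, 5
echo = []
for amplitude in (1.0, 0.5):
    line = [0j] * n_range
    for t, value in enumerate(replica):
        line[delay + t] = amplitude * value
    echo.append(line)

rcd = [range_compress(line, replica) for line in echo]
profile = range_profile(rcd)
peak_cell = max(range(n_range), key=lambda r: profile[r])
print(peak_cell)
```

In this toy scene a single point target sits at range cell 5 in both azimuth lines, so the pooled profile peaks at that cell. For real data sizes, an FFT routine would replace the naive `dft`.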
B. Real Ship Target Data in RCD

For real SAR data, only range FFT, range-matched filtering, and range IFFT operations are needed. Using the Pujiang-2 spaceborne SAR data, Fig. 2 presents the real data and the corresponding 1-D range profiles under both good and bad sea conditions. Finally, based on the simulated and real ship target data, a 1-D range profile data set of ship targets in the RCD is generated, which contains 433 sets of training sample data, 110 sets of verification sample data, and 252 sets of test sample data, with a positive-to-negative sample ratio of 1:1 and a resolution of about 1–3 m. The overall ratio of simulated to real data in the training and verification samples is 6:4.

Fig. 2. Real ship target data. (a) Real data A in SAR image. (b) Real data B in SAR image. (c) One-dimensional range profile of A. (d) One-dimensional range profile of B.

III. SHIP TARGET DETECTION BASED ON INCEPT-TEXTCNN

To filter out the non-target area data, a novel Incept-TextCNN model for ship target detection is proposed in this section. The general idea is shown in Fig. 3: the 1-D range profile data are first obtained from the SAR echo, and then TextCNN outputs which range gates contain targets and which do not, according to the different amplitude characteristics of the background region and the region containing ship targets. This gives the interval in the range direction that requires further imaging processing and avoids imaging the entire data for refined detection in the image domain.

Fig. 3. General idea for the proposed ship detection model based on the RCD data.

Fig. 4. Description of the 1-D convolution operation.

Fig. 5. Structure of the conventional TextCNN model.

Fig. 6. Inception module diagram. (a) Original version. (b) Improved version.
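To make the general idea concrete, the sketch below cuts a long 1-D range profile into overlapping windows, asks a classifier whether each window contains a target, and merges the flagged windows into range-gate intervals that would still need imaging. The window width of 512 and the stride of 200 are the values the letter reports in Section IV; everything else is an assumption made for illustration, including the function names, the rule that the last window is flush with the far end, the merging step, and `toy_classifier`, which is a simple peak-to-mean test standing in for the trained network.

```python
def window_starts(n_cells, width, stride):
    """Start indices of overlapping windows covering range cells [0, n_cells)."""
    if n_cells <= width:
        return [0]
    starts = list(range(0, n_cells - width + 1, stride))
    if starts[-1] != n_cells - width:
        starts.append(n_cells - width)  # assumption: last window flush with the far end
    return starts


def flagged_intervals(profile, classify, width=512, stride=200):
    """Merge windows flagged by `classify` into [start, stop) range-gate intervals."""
    intervals = []
    for start in window_starts(len(profile), width, stride):
        stop = min(start + width, len(profile))
        if not classify(profile[start:stop]):
            continue
        if intervals and start <= intervals[-1][1]:
            intervals[-1][1] = max(intervals[-1][1], stop)  # overlaps previous interval
        else:
            intervals.append([start, stop])
    return [tuple(iv) for iv in intervals]


def toy_classifier(window):
    """Hypothetical stand-in for the trained network: peak-to-mean amplitude test."""
    return max(window) > 5.0 * (sum(window) / len(window))


# Toy profile (assumed values): flat clutter with one strong return at cells 900-909.
profile = [1.0] * 2000
for cell in range(900, 910):
    profile[cell] = 20.0

to_image = flagged_intervals(profile, toy_classifier)
kept = sum(stop - start for start, stop in to_image)
excluded_fraction = 1.0 - kept / len(profile)
print(to_image, excluded_fraction)
```

Only the merged interval would be passed on to azimuth focusing; the remaining range gates are discarded before imaging, which is where the saving comes from.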

A. Conventional TextCNN Model

In TextCNN, the output feature sequence of a 1-D convolution has two dimensions: length and depth. The length depends on the size of the convolution kernel, and the information in the length dimension is similar to the information of each channel in an image domain feature map. The depth is the number of channels; it depends on the number of convolution kernels and is similar to the number of channels in an image domain feature map [12]. The relationship between the length of the output feature sequence and the convolution kernel is shown in Fig. 4, and the calculation expression is given below

$$o = \frac{i + 2p - k}{s} + 1 \tag{1}$$

where o is the length of the output feature sequence, i is the length of the input feature sequence, p is the length of the all-zero padding extended around the feature matrix during convolution, k is the size of the convolution kernel, and s is the step size, i.e., the span of each movement of the convolution kernel.

The structure of the conventional text convolutional neural network (TextCNN) model for the RCD is shown in Fig. 5. After the 1-D RCD data are fed into the model, features are extracted by five convolutional blocks successively. The length of the feature sequence gradually decreases and the dimension gradually increases. Each convolutional block adopts the form of 1-D convolution plus batch normalization (BN) plus activation function, and carries out feature extraction, normalization, and nonlinear mapping to improve the representation ability of the model. After the fifth convolutional block outputs the feature sequence, the sequence is transformed into a 1-D sequence through the “flattening” operation. After that, the sequence dimension is reduced through three fully connected layers, normalized and activated through the BN layer and ReLU activation function, and finally, the confidence of each item is obtained through the Softmax function.

B. Proposed Incept-TextCNN Model

Based on the structure of the conventional TextCNN model, and by analogy with GoogLeNet’s Inception module [13] in the image domain, the concatenation of different convolutional feature sequences is introduced. In this way, a TextCNN based on the Inception module (Incept-TextCNN) for the RCD is constructed.

The Inception structure is shown in Fig. 6, where (a) is the original version. Features are extracted from the input feature map through different convolution kernels: kernels of multiple sizes are each convolved with the same input feature map so that feature maps of different scales can be obtained. After that, all feature maps are concatenated to obtain an output feature map containing information of different scales. Fig. 6(b) shows an improved version of Inception. On the basis of the original structure, 1 × 1 convolution is used to reduce the number of channels in the feature map, thus reducing the accumulation of parameters.

As shown in Fig. 7, the Incept-TextCNN model is based on the structure of TextCNN. When the 1-D RCD signal is fed into the model, a 1-D convolution operation is first performed by three convolution kernels of different sizes in block1, block2, and block3, and the three output feature sequences are kept the same length by zero padding and adjusting the step size. Then, to avoid an overly large model caused by too many feature sequence channels after concatenation, and following the Inception module in GoogLeNet, a convolution kernel of size 1 is used to reduce the depth of each feature sequence. Finally, the output sequences representing features at three different scales are concatenated along the depth dimension. The output of the first Inception module is obtained and sent to block4. The feature sequence output by block4 is then sent to the next Inception module, and features are further extracted by three different convolution kernels. Then convolution, flattening, fully connected layer processing, and activation function processing are carried out successively. Finally, the confidence of each item is obtained through the Softmax function.

Fig. 7. Structure of the proposed Incept-TextCNN model.

IV. EXPERIMENTAL RESULTS

In this section, the experimental results are presented and discussed using the RCD data set described in Section II. TextCNN and Incept-TextCNN are tested and the detection performance of the two models is compared. Moreover, the Incept-TextCNN model is used to test two real large-scene images, and it is demonstrated that the proposed model can effectively detect potential targets present in the range gates.

A. Specific Experimental Setup

The models are trained on the constructed simulated and real RCD data set. The training parameters of the coarse detection models based on TextCNN and Incept-TextCNN are set as follows: the initial learning rate is 0.0001, the batch size is 16, and the number of training rounds is 200. The AdaMax optimizer is used for training, with β1 set to 0.9 and β2 to 0.999.

B. Comparison Between TextCNN and Incept-TextCNN Models

The simulated and real data sets are used to train the TextCNN and Incept-TextCNN models, respectively, where the input data size of each group is 1 × 512. The training results on the verification set are shown in Fig. 8, where blue represents the TextCNN model and red represents the Incept-TextCNN model. Both models can realize ship target detection after a period of iterative training. The detection rate of the Incept-TextCNN model on the verification set improves faster, and its accuracy stabilizes above 90% after only a few rounds of training, whereas the TextCNN model converges more slowly and reaches the steady state after about 60 rounds of training. On the verification set, the accuracy of the Incept-TextCNN model can reach 97%, while the accuracy of the TextCNN model is lower; the average detection rate of the former after stabilization is 2.4% higher than that of the latter, where the accuracy is calculated as the ratio of correctly judged samples to the total number of samples. Tests on the sliced test data set resulted in an accuracy of 93.2%, a recall [3] of 88.4%, and an F1-score [3] of 90.7% for TextCNN, and an accuracy of 94.5%, a recall of 90.7%, and an F1-score of 92.6% for Incept-TextCNN. It can be seen that the detection accuracy, recall, and F1-score of the Incept-TextCNN model are all higher than those of TextCNN.

Fig. 8. Accuracy curves of the TextCNN model and the Incept-TextCNN model.

In summary, Incept-TextCNN outperforms TextCNN in detection. Its model size is 16.58 M, which is larger than TextCNN’s 1.10 M; however, because it uses 1-D convolution for 1-D data processing, it is lighter than models that apply 2-D convolution to 2-D data.
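For reference, the metrics quoted in Section IV-B can be computed from confusion-matrix counts as sketched below. The letter defines accuracy as the ratio of correctly judged samples to all samples and cites [3] for recall and F1-score; the standard definitions are assumed here. The function name `detection_metrics` and the counts in the example are hypothetical and are not the letter's data.

```python
def detection_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a balanced set of 200 slices (not the letter's data).
metrics = detection_metrics(tp=90, fp=5, fn=10, tn=95)
print(metrics)
```

Because the F1-score combines precision and recall, a coarse detector tuned for high recall can still show a lower F1-score if it raises many false alarms, which matches the letter's description of the coarse stage as having high detection and high false alarm rates.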

Compared with the 2-D convolutional models commonly used in image and non-image domains, Incept-TextCNN has an advantage in model size and detection efficiency.

C. Real Data Coarse Detection Based on Incept-TextCNN Model

The validity of the Incept-TextCNN model is further verified by using real large-scene SAR data. For a large image, the whole 1-D range profile signal is obtained first, and then the long 1-D data are divided into many groups of 1 × 512 data by an overlapping sliding window. The groups are input in sequence for model detection, and finally the groups of data with potential targets are determined; that is, the range gates containing potential targets are the corresponding output. Fig. 9 shows the coarse detection results for SAR images of the Taiwan Strait taken by the Pujiang-2 satellite, with a stride of 200 range gates when sliding the window on data A and 20 range gates on data B.

Fig. 9. Coarse detection results using real data. (a) Detection results using real data A displayed in RCD. (b) Detection results using real data B displayed in RCD. (c) Detection results using real data A displayed on 1-D range profile signal. (d) Detection results using real data B displayed on 1-D range profile signal.

As can be seen from Fig. 9, the Incept-TextCNN model has successfully detected the positions in the range direction of five ship targets in image A and two in image B, and 73.09% of the non-target sea area in image A and 70.34% of the non-target sea area in image B are excluded. This demonstrates the effectiveness of using a 1-D convolutional network for deep feature extraction on the 1-D RCD data, which contain only amplitude features and lack ship outline and size features. Note that the vertical length of the red box in Fig. 9(a) and (b) is consistent and covers all points in the azimuth direction, meaning that the exact position of the target in the azimuth direction cannot be given.

V. CONCLUSION

In this letter, a novel Incept-TextCNN model has been proposed for ship target detection in SAR RCD. It employs 1-D convolution to extract the features of 1-D RCD data and then uses activation functions, fully connected layers, and other structures to classify the target features. In addition, the Inception structure in GoogLeNet is introduced to fuse the feature information of different scales, which can improve the detection accuracy.

A key feature of the proposed method is to output the range gates containing potential ship targets (including strong clutter, artificial platforms, or island areas) to achieve coarse detection of ship targets with high detection rates and high false alarm rates. After the detection by the proposed method, only the selected areas need to be imaged and input into the subsequent image domain fine detection model. Thus, the time-saving advantages of the overall framework are mainly reflected in the following: (1) the coarse detection is performed on 1-D data, which is faster than detection on 2-D data; (2) there is no need to image all the echo data, but only the part containing targets; and (3) since a large number of non-target areas are not imaged, especially for open sea, there is no need to carry out image domain fine detection on non-target areas. Therefore, the proposed model will be very useful in improving the processing efficiency of ship target detection.

In the future, we will study nearshore ship targets and scenes with more clutter, improve the existing model to make it suitable for more complex scenes, and explore the performance of using other models such as NLP for detection.

REFERENCES

[1] T. Xie, M. Liu, M. Zhang, S. Qi, and J. Yang, “Ship detection based on a superpixel-level CFAR detector for SAR imagery,” Int. J. Remote Sens., vol. 43, no. 9, pp. 3412–3428, May 2022.
[2] M. Kang, X. Leng, Z. Lin, and K. Ji, “A modified faster R-CNN based on CFAR algorithm for SAR ship detection,” in Proc. Int. Workshop Remote Sens. With Intell. Process. (RSIP), May 2017, pp. 1–4.
[3] Z. Wang, L. Du, J. Mao, B. Liu, and D. Yang, “SAR target detection based on SSD with data augmentation and transfer learning,” IEEE Geosci. Remote Sens. Lett., vol. 16, no. 1, pp. 150–154, Jan. 2019.
[4] Z. Wang, Research on Intelligent Detection Algorithm of SAR Ship Target in Complex Environment. Beijing, China: Beijing Univ. Aeronautics and Astronautics, 2021.
[5] F. Xiaoya and W. Zhaocheng, “SAR ship target rapid detection method combined with scene classification in the inshore region,” J. Signal Process., vol. 36, no. 12, pp. 2123–2130, 2020, doi: 10.16798/j.issn.1003-0530.2020.12.019.
[6] L. Zhai, Y. Li, and Y. Su, “Inshore ship detection via saliency and context information in high-resolution SAR images,” IEEE Geosci. Remote Sens. Lett., vol. 13, no. 12, pp. 1870–1874, Dec. 2016.
[7] T. Loran, A. B. C. da Silva, S. K. Joshi, S. V. Baumgartner, and G. Krieger, “Ship detection based on faster R-CNN using range-compressed airborne radar data,” IEEE Geosci. Remote Sens. Lett., vol. 20, 2023, Art. no. 3500205, doi: 10.1109/LGRS.2022.3229141.
[8] X. Leng, J. Wang, K. Ji, and G. Kuang, “Ship detection in range-compressed SAR data,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., Kuala Lumpur, Malaysia, Jul. 2022, pp. 2135–2138, doi: 10.1109/IGARSS46834.2022.9884909.
[9] S. K. Joshi, S. V. Baumgartner, A. B. C. da Silva, and G. Krieger, “Range-Doppler based CFAR ship detection with automatic training data selection,” Remote Sens., vol. 11, no. 11, p. 1270, May 2019.
[10] S. K. Joshi, S. V. Baumgartner, and G. Krieger, “Tracking and track management of extended targets in range-Doppler using range-compressed airborne radar data,” IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 5102720, doi: 10.1109/TGRS.2021.3084862.
[11] X. Leng, K. Ji, and G. Kuang, “Ship detection from raw SAR echo data,” IEEE Trans. Geosci. Remote Sens., vol. 61, 2023, Art. no. 5207811, doi: 10.1109/TGRS.2023.3271905.
[12] Y. Kim, “Convolutional neural networks for sentence classification,” 2014, arXiv:1408.5882.
[13] C. Szegedy et al., “Going deeper with convolutions,” 2014, arXiv:1409.4842.