An Incept-TextCNN Model For Ship Target
Abstract: Traditionally, synthetic aperture radar (SAR)-based ship target detection is performed in the image domain, where SAR imaging processing has to be applied first. However, SAR imaging processing is complex and time-consuming, especially in the wide-swath working mode. For open sea scenes, most echoes are sea surface signals with no ship targets, and there is no need for imaging processing in those areas. Therefore, non-image domain ship target detection is studied in this letter, and a novel Incept-text convolutional neural network (TextCNN) model is proposed for ship target detection in the SAR range-compressed domain (RCD). In the proposed method, the SAR echo data are first converted into a 1-D range profile signal by range compression and mean pooling; the Incept-TextCNN model is then applied, and its output indicates whether ship targets exist in the relevant range cells. Finally, the effectiveness and efficiency of the proposed method are verified with simulated and real spaceborne SAR data, and the results demonstrate that the proposed model can filter out the invalid range-compressed data of the sea surface area, which can significantly reduce the amount of data for subsequent SAR imaging and ship classification.

Index Terms: Data filtering, ship target, synthetic aperture radar (SAR) range-compressed domain (RCD), text convolutional neural network (TextCNN).

I. INTRODUCTION

SHIP detection is an important application of synthetic aperture radar (SAR). As SAR works in the microwave band, compared with optical sensors, SAR can obtain high-quality remote sensing images under complex weather conditions, and realize all-day, all-weather, and high-resolution wide-swath imaging tasks. As a result, SAR has become an important tool for ocean imaging and monitoring.

Traditionally, widely used ship detection methods in SAR are based on the constant false alarm rate (CFAR) [1]. However, CFAR-based detectors have clear limitations: their detection ability is affected by surrounding buildings and ports in nearshore scenes, and their characteristic pixel-by-pixel detection process leads to low processing efficiency. Recently, deep-learning-based techniques have been applied to SAR ship detection. The Faster R-CNN model is combined with the CFAR detector in [2], while SAR image target detection based on the SSD model is performed in [3]. In [4], a dense connection module is introduced in YOLOv3 to detect small targets. In [5], the large-size detection process is optimized, where the slices with potential targets are screened first, followed by further refined detection. Based on this, a new method is proposed in [6] to reduce the involved calculations through context information. These deep-learning-based algorithms significantly improve the detection speed and accuracy; however, they are performed in the SAR image domain, which has two major drawbacks: 1) SAR image processing is time-consuming, especially azimuth focusing, and 2) for sea surface scenarios, most areas have no targets, and a lot of computing resources are wasted in these regions.

To overcome these shortcomings, some target detection methods in the non-image domain have been presented in recent years, and one representative example is the range-compressed domain (RCD), as it does not require time-consuming azimuth focusing. In [7], a ship detector is proposed based on Faster R-CNN working in the RCD; in [8], a two-step detection method is presented, with the first step using complex signal kurtosis in the RCD to screen possible ship areas coarsely and the second step applying a CNN to further detect the potential ship areas; an oriented ship detection strategy is designed in [9], which calculates the CFAR detection threshold in the range-Doppler domain; a supportive ship tracking concept is introduced in [10] in the range-Doppler domain using an airborne-based radar sensor; in addition, a method for ship detection from raw SAR echo data is proposed in [11]. However, most of them are based on 2-D data for detection, where the model size tends to be large and dependent on CPU resources, and detection based on 1-D data often does not take advantage of deep learning that can extract deep features.

In this letter, a novel Incept-TextCNN model is presented to detect ship targets. In the proposed method, the SAR echo signal is first converted into the 1-D range profile, and then the TextCNN model extracts the deep features of the

Manuscript received 2 November 2023; revised 26 December 2023; accepted 3 January 2024. Date of publication 9 January 2024; date of current version 23 January 2024. This work was supported by the Beijing Natural Science Foundation under Grant 4222006. (Corresponding author: Wei Yang.)
HongCheng Zeng, YuTong Song, Wei Yang, and Jie Chen are with the School of Electronic and Information Engineering, Beihang University, Beijing 100191, China (e-mail: [email protected]; songyutong@buaa.edu.cn; [email protected]; [email protected]).
Tian Miao is with the Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100045, China (e-mail: [email protected]).
Wei Liu is with the School of Electronic Engineering and Computer Science, Queen Mary University of London, E1 4NS London, U.K. (e-mail: [email protected]).
WeiJie Wang is with the Shanghai Aerospace Electronic Technology Institute, Shanghai 201108, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/LGRS.2024.3351745
1558-0571 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: St. Petersburg State University. Downloaded on March 01,2024 at 16:03:29 UTC from IEEE Xplore. Restrictions apply.
3501305 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 21, 2024
ZENG et al.: Incept-TextCNN MODEL FOR SHIP TARGET DETECTION IN SAR RCD 3501305
Fig. 3. General idea for the proposed ship detection model based on the RCD data.
Fig. 5. Structure of the conventional TextCNN model.
Fig. 6. Inception module diagram. (a) Original version. (b) Improved version.

the ship targets in the data, thus giving the extent in the range direction that requires further imaging processing and avoiding the need to image the entire data for refined detection in the image domain.

A. Conventional TextCNN Model

In TextCNN, the output feature sequence of a 1-D convolution has two dimensions: length and depth. The length depends on the size of the convolution kernel, and the information in the length dimension is similar to the information of each channel in an image-domain feature map. The depth is the number of channels; it depends on the number of convolution kernels and is similar to the number of channels in an image-domain feature map [12]. The relationship between the length of the output feature sequence and the convolution kernel is shown in Fig. 4, and the calculation expression is given below

$$o = \frac{i + 2p - k}{s} + 1 \tag{1}$$

where o is the length of the output feature sequence, i is the length of the input feature sequence, p is the length of the zero padding added on each side of the input during convolution, k is the size of the convolution kernel, and s is the step size, i.e., the span of each movement of the convolution kernel.

The structure of the conventional text convolutional neural network (TextCNN) model for the RCD is shown in Fig. 5. After the 1-D RCD data are fed into the model, features are extracted by five convolutional blocks successively. The length of the feature sequence gradually decreases and the depth gradually increases. Each convolutional block adopts the form of 1-D convolution plus batch normalization (BN) plus activation function, and carries out feature extraction, normalization, and nonlinear activation to improve the characterization ability of the model. After the fifth convolutional block outputs the feature sequence, the sequence is transformed into a 1-D sequence through the "flattening" operation. After that, the sequence dimension is reduced through three fully connected layers, normalized and activated through the BN layer and ReLU activation function, and finally, the confidence of each item is obtained through the Softmax function.

B. Proposed Incept-TextCNN Model

Based on the structure of the conventional TextCNN model, and by analogy with GoogLeNet's Inception [13] module in the image domain, the concatenation of different convolutional feature sequences is introduced; in this way, a TextCNN based on the Inception module (Incept-TextCNN) for the RCD is constructed.

The Inception structure is shown in Fig. 6, where (a) is the original version. Feature extraction from the input feature map is carried out through different convolution kernels. Convolution kernels of multiple sizes are applied separately to the same input feature map so that feature maps of different scales can be
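As a rough illustration of the multi-kernel idea behind the Inception module, the following pure-Python sketch applies kernels of several sizes to one 1-D feature sequence and stacks the resulting sequences along the depth dimension. The kernel sizes (1, 3, 5), the all-ones weights, the toy range profile, and the helper names `conv1d_out_len`, `conv1d_same`, and `inception_1d` are illustrative assumptions, not the configuration used in this letter. The floor division in `conv1d_out_len` is the usual convention when the stride does not divide evenly and is not written out in (1).

```python
def conv1d_out_len(i, k, p=0, s=1):
    """Output length following (1); floor division is an added assumption."""
    return (i + 2 * p - k) // s + 1


def conv1d_same(x, kernel):
    """Stride-1 1-D convolution with zero padding p = (k - 1) // 2."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("odd kernel size expected so that o == i")
    p = (k - 1) // 2
    padded = [0.0] * p + list(x) + [0.0] * p
    o = conv1d_out_len(len(x), k, p, 1)
    # Slide the kernel over the padded sequence, one step at a time.
    return [sum(w * padded[n + j] for j, w in enumerate(kernel))
            for n in range(o)]


def inception_1d(x, kernels):
    """Apply every kernel to the same input and stack the outputs by depth."""
    branches = [conv1d_same(x, kern) for kern in kernels]
    # Concatenation along depth requires all branches to share one length.
    assert len({len(b) for b in branches}) == 1
    return branches  # nested list with shape (depth, length)


# Hypothetical toy range profile with one bright scatterer.
profile = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0]
# Hypothetical kernels of sizes 1, 3, and 5 with all-ones weights.
kernels = [[1.0], [1.0] * 3, [1.0] * 5]
features = inception_1d(profile, kernels)
```

With a step size of 1 and p = (k - 1)/2, (1) gives o = i for every odd k, which is why the branches have equal length and can be concatenated along the depth dimension without cropping.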