1. Introduction
In the past few decades, rapid urbanization has covered large areas of the Earth's land surface with impervious surfaces (ISs) [1,2]. Impervious surfaces are mainly composed of artificial materials such as asphalt, cement, metal, and glass [
3]. Permeable natural surfaces have been replaced by impervious surfaces, which has changed the material circulation process of the global ecosystem, leading to increased ecological risks and threatening human health. In consideration of human well-being, the United Nations officially launched the 2030 Agenda for Sustainable Development in 2016, putting forward 17 sustainable development goals and calling on the world to take joint action to eradicate poverty, protect the planet, and improve the lives and futures of all people [
4]. Building inclusive, safe, resilient and sustainable cities and human settlements is one of the 17 sustainable development goals. Reasonable planning of urban layout is an important way to build the above-mentioned sustainable cities, and understanding the spatio-temporal distribution of urban impervious surfaces is the premise for reasonable planning of cities [
5]. Therefore, it is of great significance to monitor urban impervious surfaces.
With the development of Earth observation technology, remote sensing technology plays an irreplaceable role in the extraction of urban impervious surfaces. For example, Misra et al. used Sentinel-2 as the data source and evaluated the performances of three different machine learning algorithms in the extraction of urban impervious surfaces to better understand the feature extraction method and the appropriate classifier for the classification of urban impervious areas [
6]. Additionally, Huang et al. used more than three million Landsat images to map the world's impervious surfaces from 1972 to 2019, with a miss rate, false alarm rate, and F-score of 5.16%, 0.82%, and 0.954, respectively [7]. In rapidly urbanizing areas, it is very important to monitor changes in impervious surfaces regularly. To realize high-frequency dynamic monitoring of impervious surfaces, Liu et al. proposed a method that continuously captures impervious surface dynamics using spatio-temporal rules and dense Landsat time series, with an overall accuracy of 85.5% [
8]. However, the most rapidly urbanizing region in the world is located in Southeast Asia, which is a typical tropical or subtropical region with cloudy and rainy weather all year round [
9]. In cloudy and rainy tropical and subtropical regions, the proportion of acquisition dates on which optical remote sensing data are usable is generally below 40% at a single point and below 20% over a whole region, which makes it difficult for optical satellites to monitor impervious surfaces effectively. Synthetic aperture radar (SAR) detects the geometric and physical properties of ground objects by actively transmitting signals and then receiving the electromagnetic signals backscattered by ground objects. Because it is an active sensor operating at microwave wavelengths, which are long enough to penetrate clouds and rain, SAR is capable of all-day and all-weather Earth observation. Therefore, SAR is very suitable for monitoring impervious surfaces in such areas. However, urban impervious surfaces include roads, squares, buildings, etc., which have various spatial layouts and appear differently on SAR images. Therefore, extracting urban impervious surfaces from SAR images is a challenging task.
Many scholars have conducted in-depth studies on the extraction of urban impervious surfaces using SAR images. For example, Guo et al. used a full-polarization SAR (PolSAR) RADARSAT-2 image as the data source, extracted polarization features through polarization decomposition, and used them as the input for C5.0 decision tree algorithm to extract the impervious surface of Beijing, China [
10]. According to the characteristics of impervious surface on polarimetric SAR images, Zhang et al. proposed a new framework for impervious-surface classification based on H/A/Alpha decomposition theory [
11]. Ban et al. constructed a robust urban-area extractor based on ENVISAT advanced synthetic aperture radar (ASAR) intensity data to extract global built-up areas [
12]. By comparing different regions, Attarchi et al. verified the effectiveness of texture features derived from the gray-level co-occurrence matrix (GLCM) of full PolSAR images in urban impervious surface extraction [
13]. In addition, Jiang et al. used the coherence feature of SAR for land-cover classification and achieved considerable accuracy, which proved the potential of coherence information in land-cover classification [
14]. Subsequently, Sica et al. used interferometric SAR with short time series of Sentinel-1 images for land-cover classification, which again confirmed the feasibility of land-cover classification using SAR coherence features [15]. Although researchers have analyzed the effectiveness of intensity, coherence, and polarization features of SAR images in the extraction of urban impervious surfaces, four limitations remain: (1) Most studies only proved the effectiveness of a single feature of SAR images in the extraction of urban impervious surfaces but did not evaluate the differences between multiple features of SAR images or the effect of integrating them. (2) Current studies on impervious surface extraction using SAR data mainly focus on SAR image intensity or amplitude information, and rarely on phase and polarization information. (3) It is difficult to fully mine the information contained in SAR images by manually extracting only a small number of shallow features. (4) Traditional machine learning methods struggle to obtain satisfactory extraction results.
Deep learning has a strong feature-learning ability and can mine useful information from massive data. It has therefore been adopted by remote sensing scholars and has achieved satisfactory results. For example, Zhang et al. used full PolSAR and optical images as data sources and applied deep convolutional networks based on small patches to extract urban impervious surfaces [16]. The extraction accuracy was significantly better than those of traditional machine learning algorithms, such as random forest (RF) and support vector machine (SVM). Wang et al. proposed an urban impervious surface extraction method based on a modified deep belief network, which also achieved better accuracy than the traditional machine learning methods RF and SVM [17]. Wu et al. employed UNet as the backbone network and Gaofen-3 as the data source to extract the built-up areas of the whole of China, achieving overall accuracies of 76.05% to 93.45% [18]. Hafner et al. proposed a semi-supervised domain adaptation model based on Sentinel-1 and Sentinel-2 data to extract global urban regions [19]. Because few impervious surface datasets are available, research on impervious surface extraction based on deep learning is still relatively scant. With free access to Sentinel-1 and other SAR data, more datasets are likely to be released in the future, and impervious surface extraction methods based on deep learning can then be expected to develop rapidly.
Although the above studies have shown that methods based on deep learning can obtain better accuracy than traditional methods, most of them focused on the intensity or amplitude information of SAR images and ignored the rich and unique phase and scattering information about ground objects that SAR images contain. Cities are highly heterogeneous scenes. The spatial distribution of impervious surfaces is diverse, which makes their appearance on SAR images diverse as well. For example, tall buildings appear bright on SAR images due to double-bounce scattering, but low and randomly distributed residential areas do not always appear bright, because of volume scattering effects [
20]. Other examples are wide airports and roads, which reflect radar waves in a specular way. The reflected waves received by the radar antenna are very weak, appearing dark colors in an SAR image, which are difficult to distinguish from bodies of water. Therefore, it is difficult to accurately extract urban impervious surfaces only by using the intensity or amplitude features of SAR images. In addition, under the deep learning framework, the independent effectiveness of different types of SAR image features in impervious surface extraction and the role of multi-feature integration have not been fully evaluated. To bridge this gap, based on Sentinel-1 dual-polarization data, we selected UNet, HRNet, and Deeplabv3+ deep learning models as impervious surface extraction models to deeply explore the specific roles of SAR image intensity, coherence, and polarization scattering features in the extraction of urban impervious surfaces. This study provides a new data scheme for the extraction of impervious surfaces in the tropical and subtropical areas with cloudy and rainy weather.
The rest of this study is organized as follows. Section 2 describes the data and the preprocessing steps. Section 3 describes the methods, experimental details, and evaluation metrics. In Section 4, the experimental results are presented quantitatively and qualitatively. Section 5 discusses the temporal and spatial transfer capability of the impervious surface extraction models and the limitations of impervious surface extraction based on SAR images. Finally, the conclusions of this study are drawn in Section 6.
2. Dataset Description
Sentinel-1 is an Earth observation mission in the Copernicus Programme (formerly GMES) of the European Space Agency (ESA) and continues the Earth observation missions of ERS-1/2 and ENVISAT ASAR. It is composed of two satellites, each equipped with a C-band SAR sensor, and has four operation modes, namely, Stripmap (SM), Interferometric Wide (IW) Swath, Extra-Wide (EW) Swath, and Wave (WV) [
21]. As an innovative SAR system, Sentinel-1 not only provides dual-polarization capability but also has a short revisit time and fast product delivery. The revisit time of a single satellite is 12 days, whereas that of the two-satellite constellation is as short as 6 days. The Sentinel-1 data can be freely downloaded from the ESA data distribution website (https://scihub.esa.int (accessed on 25 June 2019)). Therefore, Sentinel-1 can provide strong data support for global environmental monitoring. In this study, we took the main urban area of Wuhan (covering an area of about 3000 km²) as the study area, and the Sentinel-1 single-look complex (SLC) images from 13 and 25 June 2019 as the data. Under the framework of deep learning, the roles of SAR image intensity, coherence, polarization features, and multi-feature integration in the extraction of urban impervious surfaces were studied. The detailed information of the data used is shown in
Table 1.
2.1. Intensity
Since this study needed the coherence and polarization information of SAR images, we used Sentinel-1 SLC products rather than Sentinel-1 GRD products. To preserve as much of the information in the SAR images as possible, we carried out only the necessary preprocessing steps. Converting SLC data to intensity data requires a series of preprocessing steps: orbit correction, thermal noise removal, radiometric calibration, S-1 TOPS deburst, multi-looking, and terrain correction. All preprocessing steps in this study were carried out in the software SNAP. Because of the coherent imaging mechanism, SAR images are seriously affected by speckle, which makes them difficult to interpret and makes information extraction harder. Therefore, this study used a refined Lee filter with a window size of 7 × 7 px for despeckling.
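As a rough illustration of what "SLC to intensity" means numerically, the sketch below converts a complex array to multilooked intensity in decibels. It is not the SNAP processing chain used in this study: orbit correction, calibration, deburst, terrain correction, and the refined Lee filter are all omitted. The function name `slc_to_intensity_db` and the default number of looks (1 in azimuth, 4 in range) are assumptions made for the example.

```python
import numpy as np

def slc_to_intensity_db(slc, looks=(1, 4), eps=1e-10):
    """Convert a complex SLC array to multilooked intensity in dB.

    looks = (azimuth_looks, range_looks); these defaults are assumed here,
    not taken from the study. Rows are treated as azimuth, columns as range.
    """
    intensity = np.abs(slc) ** 2                # |s|^2 for every pixel
    la, lr = looks
    rows = (intensity.shape[0] // la) * la      # crop to a multiple of the looks
    cols = (intensity.shape[1] // lr) * lr
    blocks = intensity[:rows, :cols].reshape(rows // la, la, cols // lr, lr)
    multilooked = blocks.mean(axis=(1, 3))      # average each la x lr block
    return 10.0 * np.log10(multilooked + eps)   # eps avoids log10(0)
```

Averaging in the linear intensity domain before taking the logarithm is the usual order of operations, since averaging dB values would bias the result.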
2.2. Interferometric Coherence
Interferometric SAR (InSAR) technology refers to acquiring two SAR SLC images repeatedly in the same area and then processing them coherently. This technology has been widely used in the monitoring of surface deformation and the acquisition of digital elevation model data [
22,
23]. The coherence coefficient is an important index to evaluate the quality of the interference fringe pattern. The larger the coherence coefficient is, the better the quality of the interference fringe pattern. In addition, the coherence coefficient can also be used to estimate the phase stability of targets in two SAR images. The value of the coherence is related to the platform parameters and the scatterers of the ground objects. In practical applications, the coherence map can be calculated from two SLC images after registration according to the following formula:
$$\gamma = \frac{\left| E\left[ s_1 s_2^{*} \right] \right|}{\sqrt{E\left[ \left| s_1 \right|^{2} \right] E\left[ \left| s_2 \right|^{2} \right]}}$$
where $\gamma$ is the coherence coefficient, $E[\cdot]$ is a mathematical expectation, and $s_1$ and $s_2$ represent the registered SLC images. The symbol * indicates conjugate multiplication. The value of $\gamma$ ranges from 0 to 1. If $\gamma = 0$, it means that the two SAR images are completely uncorrelated. If $\gamma = 1$, it means that the two SAR images are completely correlated; that is, during the radar imaging process, all parameters and ground objects have not changed. These are two extreme cases.
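To reduce the influence of noise, and assuming that the scatterers are ergodic, the ensemble mean in the coherence coefficient can be replaced with the local mean within a window of a certain size. The specific calculation formula is as follows:
$$\gamma = \frac{\left| \sum_{n=1}^{N} \sum_{m=1}^{M} s_1(n,m)\, s_2^{*}(n,m) \right|}{\sqrt{\sum_{n=1}^{N} \sum_{m=1}^{M} \left| s_1(n,m) \right|^{2} \sum_{n=1}^{N} \sum_{m=1}^{M} \left| s_2(n,m) \right|^{2}}}$$
where N and M represent the size of the sliding window over which the registered SLC images $s_1$ and $s_2$ are averaged. Generally, they are equal. Studies have shown that a window size of 5 × 5 px is suitable for the calculation of the coherence coefficient in urban areas [24].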
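The windowed coherence estimate described above can be sketched in a few lines of NumPy/SciPy. This is a minimal illustration, assuming two already co-registered complex arrays of equal shape; the function name `coherence`, the edge handling (`mode="nearest"`), and the small `eps` guard are choices made for the example rather than details of the SNAP workflow used in this study.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def coherence(s1, s2, window=5, eps=1e-12):
    """Sliding-window coherence magnitude of two co-registered complex images."""
    cross = s1 * np.conj(s2)                       # s1 * conj(s2) per pixel
    # Filter real and imaginary parts separately so only real arrays are smoothed.
    num_re = uniform_filter(cross.real, size=window, mode="nearest")
    num_im = uniform_filter(cross.imag, size=window, mode="nearest")
    p1 = uniform_filter(np.abs(s1) ** 2, size=window, mode="nearest")
    p2 = uniform_filter(np.abs(s2) ** 2, size=window, mode="nearest")
    # Local means replace local sums; the normalisation cancels in the ratio.
    gamma = np.hypot(num_re, num_im) / np.sqrt(p1 * p2 + eps)
    return np.clip(gamma, 0.0, 1.0)
```

Feeding the same image in twice should give values close to 1 everywhere, while two independent noise images give low but non-zero values, because a finite window biases the estimate upward.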
For each SLC image pair, the above formula is evaluated within a window of size N × M centered on the current pixel, and the window is slid pixel by pixel to obtain the final coherence image. The value of each pixel on the coherence image represents the coherence of the SLC pair at that pixel. The different degrees of decoherence of different ground objects are the basis of remote sensing image classification using interferometric coherence [14,25]. Since the speckle in the coherence image affects the accuracy of urban impervious surface extraction, this study smoothed the coherence image with a mean filter with a window size of 3 × 3 px.
2.3. Polarimetric Information
2.3.1. Polarimetric Covariance Matrix
In polarimetric SAR images, each pixel contains the amplitude and phase information of ground objects, which can be represented by the backscattering matrix S [26]:
$$S = \begin{bmatrix} S_{HH} & S_{HV} \\ S_{VH} & S_{VV} \end{bmatrix}$$
where $S_{VH}$ represents the scattering polarization information vertically transmitted and horizontally received by the radar antenna, which is related to the reflectivity of ground objects. The other elements in the backscattering matrix S are defined similarly. In the case of monostatic radar, according to the reciprocity theorem, the backscattering matrix S becomes a symmetric matrix; that is, $S_{HV} = S_{VH}$.
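For Sentinel-1 dual-polarization (VV and VH) data, the backscattering matrix S is:
$$S = \begin{bmatrix} 0 & 0 \\ S_{VH} & S_{VV} \end{bmatrix}$$
Then, the scattering vector K can be obtained by vectorizing the scattering matrix:
$$K = \begin{bmatrix} S_{VV} & S_{VH} \end{bmatrix}^{T}$$
where the superscript T represents the transpose.
From the scattering vector K, the polarization covariance matrix C of the Sentinel-1 image can be obtained as follows:
$$C = \left\langle K K^{*T} \right\rangle = \begin{bmatrix} \left\langle \left| S_{VV} \right|^{2} \right\rangle & \left\langle S_{VV} S_{VH}^{*} \right\rangle \\ \left\langle S_{VH} S_{VV}^{*} \right\rangle & \left\langle \left| S_{VH} \right|^{2} \right\rangle \end{bmatrix}$$
where the moduli and arguments of the complex elements $S_{VV}$ and $S_{VH}$ carry the SAR amplitude and phase information, respectively; $^{*}$ represents complex conjugation; and $\langle \cdot \rangle$ represents the statistical mean.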
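To make the construction of the dual-polarization covariance matrix concrete, the following sketch builds a per-pixel 2 × 2 matrix from two complex channels, approximating the statistical mean with a boxcar (moving-average) window. The names `covariance_c2` and `_boxcar`, the 5 × 5 window, and the VV-first ordering of the scattering vector are assumptions for illustration; they are not taken from the SNAP implementation used in this study.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def _boxcar(z, window):
    """Local mean of a real or complex 2-D array (stand-in for the statistical mean)."""
    if np.iscomplexobj(z):
        re = uniform_filter(z.real, size=window, mode="nearest")
        im = uniform_filter(z.imag, size=window, mode="nearest")
        return re + 1j * im
    return uniform_filter(z, size=window, mode="nearest")

def covariance_c2(s_vv, s_vh, window=5):
    """Per-pixel 2x2 covariance matrix <K K^{*T}> with K = [S_VV, S_VH]^T."""
    c11 = _boxcar(np.abs(s_vv) ** 2, window)        # <|S_VV|^2>
    c22 = _boxcar(np.abs(s_vh) ** 2, window)        # <|S_VH|^2>
    c12 = _boxcar(s_vv * np.conj(s_vh), window)     # <S_VV S_VH*>
    c2 = np.empty(s_vv.shape + (2, 2), dtype=np.complex128)
    c2[..., 0, 0] = c11
    c2[..., 1, 1] = c22
    c2[..., 0, 1] = c12
    c2[..., 1, 0] = np.conj(c12)                    # Hermitian symmetry
    return c2
```

The result has shape (rows, cols, 2, 2), so it can be passed directly to batched linear-algebra routines such as `np.linalg.eigh`.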
2.3.2. H/A/Alpha Dual Polarization Decomposition
Polarization decomposition is the most commonly used interpretation method in polarimetric SAR, which decomposes the polarization measurement data into multiple basic components to reveal the physical mechanisms of different scatterers [
27]. Common decomposition methods based on the polarization covariance matrix include Cloude decomposition [
28], Touzi decomposition [
29], and H/A/Alpha decomposition [
30]. To explore the roles of polarization features in the extraction of urban impervious surface, we used the H/A/Alpha decomposition method to extract polarization features, including polarimetric entropy (H), mean scattering angle (Alpha), and polarimetric anisotropy (A).
H/A/Alpha decomposition was initially proposed for full PolSAR data, and then extended to dual PolSAR data [
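31]. By eigenvalue decomposition of the covariance matrix, H/A/Alpha components can be obtained. The specific calculation formula is as follows:
$$H = -\sum_{i=1}^{2} p_i \log_{2} p_i, \qquad A = \frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}, \qquad \alpha = \sum_{i=1}^{2} p_i \alpha_i$$
with
$$p_i = \frac{\lambda_i}{\lambda_1 + \lambda_2}, \qquad \alpha_i = \arccos\left( \left| u_{i1} \right| \right)$$
where $\lambda_i$ ($\lambda_1 \geq \lambda_2$) and $u_i$ are the eigenvalues of the covariance matrix and their corresponding eigenvectors, respectively, $u_{i1}$ is the first element of $u_i$, and $p_i$ is the probability of the relative contribution of the eigenvalue $\lambda_i$ to the total backscatter power.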
Polarimetric entropy H describes the proportions of different scattering mechanisms in the total scattering process. It is a measure of the randomness of scatterers, and its values range from 0 to 1. The closer H is to 1, the higher the randomness of ground object scattering, and vice versa. Low polarimetric entropy H indicates that the pixel is dominated by a single scattering type, and high polarimetric entropy H indicates that multiple scattering processes are involved. The scattering angle $\alpha$ is an important factor for identifying the main scattering mechanism of ground objects. The range of $\alpha$ is [0°, 90°], which describes the change in scattering mechanism from odd scattering ($\alpha$ = 0°) to volume scattering ($\alpha$ = 45°) to even scattering ($\alpha$ = 90°). The polarimetric anisotropy A describes the anisotropy of the scattering randomness and reflects the influence of the two smaller scattering mechanisms on the results. It is a supplement to H, and its value range is [0, 1]. When H is large, the scattering of ground objects involves multiple scattering processes. In this case, it is not enough to consider only the scattering mechanism with the maximum scattering power, and the data need to be further analyzed through polarimetric anisotropy A. When A is large, two scattering mechanisms are dominant. When A is small and H is low, only one scattering mechanism is dominant. When A is small and H is high, the three scattering mechanisms are similar and the scattering is almost random.
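The eigenvalue-based quantities discussed above can be illustrated with a short sketch for a 2 × 2 dual-polarization covariance matrix. Several things here are assumptions rather than details confirmed by this study: the function name `h_a_alpha_dual`, the use of a base-2 logarithm so that H lies in [0, 1] with two eigenvalues, the definition of A as the normalized difference of the two eigenvalues, and the Cloude-Pottier style convention of taking each alpha angle from the first element of the corresponding eigenvector.

```python
import numpy as np

def h_a_alpha_dual(c2, eps=1e-12):
    """Entropy H, anisotropy A and mean alpha (degrees) from Hermitian (..., 2, 2) arrays."""
    eigvals, eigvecs = np.linalg.eigh(c2)               # eigenvalues in ascending order
    eigvals = np.clip(eigvals[..., ::-1], 0.0, None)    # reorder so lambda_1 >= lambda_2
    eigvecs = eigvecs[..., ::-1]                        # reorder eigenvector columns to match
    total = eigvals.sum(axis=-1, keepdims=True) + eps
    p = eigvals / total                                 # pseudo-probabilities p_i
    entropy = -(p * np.log2(p + eps)).sum(axis=-1)
    anisotropy = (eigvals[..., 0] - eigvals[..., 1]) / total[..., 0]
    # Assumed convention: alpha_i = arccos(|first element of eigenvector i|).
    first = np.clip(np.abs(eigvecs[..., 0, :]), 0.0, 1.0)
    alpha_i = np.degrees(np.arccos(first))
    alpha = (p * alpha_i).sum(axis=-1)
    return entropy, anisotropy, alpha
```

Two limiting cases are useful sanity checks: a matrix with a single non-zero eigenvalue gives H close to 0 and A close to 1, whereas the identity matrix (two equal eigenvalues) gives H close to 1 and A close to 0.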
2.4. Dataset Form
Figure 1 shows the visualization results of the intensity, coherence, and polarization features of ground objects. It can be seen that the intensity and coherence of an impervious surface are significantly higher than those of a pervious surface. To better understand the scattering mechanism of ground objects, the H, A, and Alpha can be split according to the split criteria shown in
Table 2 to form an H-Alpha plane, which consists of nine zones. According to the H and Alpha values, the impervious surface is generally located in zones 2, 4, 5, 7, and 8 of the H-Alpha plane. In this study, we aimed to fully explore the roles of intensity, coherence, and polarization features of SAR images in the extraction of urban impervious surfaces. To this end, we constructed seven types of datasets, as shown in
Table 3. Among them, dataset D1 contains only the intensity information; dataset D2 contains only the coherence information; dataset D3 contains only the polarization scattering information; dataset D4 contains the intensity and coherence information; dataset D5 contains the intensity and polarization scattering information; dataset D6 contains the coherence and polarization scattering information; and dataset D7 contains the intensity, coherence, and polarization scattering information. The dataset used in this study was annotated pixel by pixel. Because SAR images are difficult to interpret, a high-resolution optical image was first registered with the SAR image, and the SAR image was then annotated by visual interpretation with the assistance of the optical image. Impervious surfaces were labeled as 1, and pervious surfaces were labeled as 0. The annotation tool used was Adobe Photoshop software 5.0. After annotation, the whole image was cut into 128 × 128 px image patches, yielding 5603 patches in total. These patches were randomly divided into training and validation sets at a ratio of 8:2 and fed into the models.
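The dataset construction described above (seven feature combinations, 128 × 128 px patches, random 8:2 split) can be sketched as follows. This is an illustrative outline only: the dictionary `FEATURE_GROUPS`, the helper names, the use of non-overlapping tiles, and the fixed random seed are assumptions, and the study does not state whether its patches overlap.

```python
import numpy as np

# Feature groups per dataset, following the D1-D7 description in the text.
FEATURE_GROUPS = {
    "D1": ["intensity"],
    "D2": ["coherence"],
    "D3": ["polarization"],
    "D4": ["intensity", "coherence"],
    "D5": ["intensity", "polarization"],
    "D6": ["coherence", "polarization"],
    "D7": ["intensity", "coherence", "polarization"],
}

def make_patches(image, label, size=128):
    """Cut a (H, W, C) feature stack and its (H, W) mask into non-overlapping tiles."""
    rows, cols = label.shape
    xs, ys = [], []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            xs.append(image[r:r + size, c:c + size])
            ys.append(label[r:r + size, c:c + size])
    return np.stack(xs), np.stack(ys)

def split_train_val(n, train_fraction=0.8, seed=0):
    """Randomly split n patch indices into training and validation index arrays."""
    idx = np.random.default_rng(seed).permutation(n)
    cut = int(round(train_fraction * n))
    return idx[:cut], idx[cut:]
```

Splitting by patch index rather than by copying arrays keeps the same split reusable across D1 to D7, so that differences in accuracy reflect the input features and not the sampling.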
6. Conclusions
Urban impervious surface area has become an important indicator for measuring the quality of an urban ecological environment. Because of climatic conditions, it is very difficult to achieve spatio-temporally seamless impervious surface monitoring over large regions using optical remote sensing images alone, especially in cloudy and rainy subtropical regions. Synthetic aperture radar (SAR) is an active sensor that is very suitable for monitoring impervious surfaces in such areas. As a more advanced and complex SAR imaging mode, polarimetric SAR contains more scattering information about ground objects. However, current studies on impervious surface extraction using SAR data mainly focus on SAR image intensity or amplitude information, and rarely on phase and polarization information. Additionally, within the deep learning framework, there has been little research on the performance of SAR image intensity, coherence, and polarization features and their integration in impervious surface extraction. Therefore, we used Sentinel-1 dual-polarization data as the data source to extract the intensity, coherence, and polarization features, and input them into UNet, Deeplabv3+, and HRNet to evaluate the performance of each feature in the extraction of impervious surfaces. The experimental results show that among intensity, coherence, and polarization, intensity is the most useful feature for the extraction of impervious surfaces based on SAR images. Additionally, in most cases, multi-feature integration improves the extraction accuracy compared with using a single feature, and the combination of intensity and coherence gives significantly higher and more stable accuracy. In addition, we analyzed the limitations of extracting urban impervious surfaces from SAR images and gave a simple and effective solution.
The relevant findings of this study have certain reference significance for the seamless spatio-temporal monitoring of impervious surfaces in large-scale areas, especially in cloudy and rainy areas.