1. Introduction
Precipitation is the main driving force of the hydrological cycle, and precipitation estimation is an important issue in meteorology, climate, and hydrology research [1]. Real-time and accurate precipitation estimation can not only provide data for precipitation nowcasting [2] but also support extreme-precipitation monitoring [3,4], flood simulation [5], and other meteorological and hydrological studies [6,7].
Precipitation estimation mainly relies on three observation means: rain gauges, weather radar, and satellite remote sensing. Rain gauges measure precipitation directly; however, as point measurements, they are spatially discontinuous and sparse [8,9], which limits their application. Weather radar can provide continuous observations with high spatial and temporal resolution [10]. However, radar imagery only covers an area with a radius of about 1 degree, so in large-scale scenes the application value of radar data depends greatly on the deployment density of the radar observation network [11]. Satellite observation can provide global precipitation estimation with high temporal and spatial resolution and fill the data gaps over oceans, mountains, and areas without radar or rain gauges [12,13]. In some research areas, such as marine meteorology, satellites are therefore an irreplaceable means of observing precipitation.
Remote sensing precipitation observations can be divided into three main categories: infrared (IR), passive microwave (PMW), and active microwave [14]. Compared with the latter two categories, IR data have the advantage of high spatio-temporal resolution and have been widely used in authoritative precipitation products, such as the Climate Prediction Center morphing technique (CMORPH) [15], the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) [16], the Tropical Rainfall Measuring Mission (TRMM) [17], and the Global Precipitation Measurement (GPM) [18]. However, these high-quality precipitation products are released with a certain time delay, so they cannot support extreme-precipitation monitoring and nowcasting, which require high timeliness. How to design a lightweight and high-precision precipitation estimation algorithm for remote sensing infrared imagery therefore remains an important research topic.
The development of artificial intelligence provides a possible solution to this problem. Various machine learning techniques, e.g., random forest (RF), support vector machine (SVM), and artificial neural network (ANN), have been adopted to model the relationship between infrared observations and precipitation intensity [19,20,21,22]. PERSIANN [16] is a classic ANN-based algorithm for constructing the relationship between precipitation and infrared observations, developed by the University of California in cooperation with the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA). The PERSIANN family also includes two other precipitation estimation products, PERSIANN-CCS [23] and PERSIANN-CDR [24]. PERSIANN-CCS improves PERSIANN via predefined cloud-patch features, while PERSIANN-CDR replaces the PMW imagery used for model training in PERSIANN with the National Centers for Environmental Prediction (NCEP) precipitation product [24].
Although traditional machine learning methods have shown potential in precipitation detection and monitoring, deep learning (DL) methods are more accurate when processing big data [25]. Tao et al. [26] proposed an algorithm based on Stacked Denoising Autoencoders (SDAE) [27] named PERSIANN-SDAE. They argue that the PERSIANN products are based on manually defined features, which limits their precipitation estimation ability, whereas PERSIANN-SDAE automatically extracts features from infrared cloud images, leading to higher accuracy. Nevertheless, the inefficient structure of SDAE limits its ability to exploit neighborhood information for precipitation estimation. Therefore, Sadeghi et al. [28] proposed a precipitation estimation algorithm based on the Convolutional Neural Network (CNN) [29] named PERSIANN-CNN. Their research demonstrates that extracting spatial features from a data-driven perspective can capture precipitation information more effectively.
In recent years, the two-stage framework has been widely used in deep-learning studies of precipitation estimation [30,31,32,33]. It includes a preliminary “rain/no-rain” binary classification and a non-zero precipitation estimation. The core of the two-stage framework is using the results of the binary classification task as masks to filter the no-rain data out of the precipitation estimation task and alleviate the data imbalance problem. To a certain extent, training the precipitation estimation model on the masked data reduces the bias of the model toward no-rain data, which accounts for a large proportion of the samples. It has been shown that the two-stage framework with deep learning methods can provide more accurate and reliable precipitation estimates than the PERSIANN products.
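The hard-mask cascade of the two-stage framework can be sketched as follows; this is a minimal illustration, and the function name and the 0.5 probability threshold are our own assumptions rather than details from any cited paper:

```python
import torch

def two_stage_estimate(cls_logits, est_pred, threshold=0.5):
    """Hard-mask cascade used by two-stage frameworks (illustrative sketch).

    Pixels classified as "no-rain" are zeroed in the estimation map, so any
    classification error propagates directly into the final product.
    """
    rain_mask = (torch.sigmoid(cls_logits) > threshold).float()
    return est_pred * rain_mask
```

Note that a rainy pixel mis-classified as no-rain is irrecoverably set to zero here, which is exactly the error-accumulation behavior discussed below.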
However, as shown in Section 4.4.1, we experimentally found that the accuracy of a single classification model is lower than that of the precipitation estimation model. This is contrary to the intuition that classification is simpler than the non-linear regression problem, i.e., estimation. An intuitive explanation is that there is a large gap between the information captured from the supervision signals of the two networks: the estimation network is supervised by abundant quantitative precipitation intensity information, while the classification network receives only qualitative “rain/no-rain” information. Although the above two-stage framework obtains better results than a single estimation model by using the classification predictions as masks for the estimation model, such a simple cascaded mode leads to error accumulation from the classification model to the estimation model. For example, rainy pixels that are mistakenly classified as no-rain are filtered out directly and can then never be predicted correctly by the estimation model. When the accuracy of the classification model is unsatisfactory, this phenomenon becomes more serious.
If a novel combination mode of the classification and estimation models could be designed that alleviates the error accumulation phenomenon while retaining the ability to improve the data balance, the accuracy of precipitation estimation could be further improved. To achieve this goal, we propose a Multi-Task Collaboration deep learning Framework (MTCF) for precipitation estimation. The framework achieves a cross-branch positive information feedback loop based on a “classification-estimation” dual-branch multi-task learning mechanism. Specifically, we propose a multi-task consistency constraint mechanism, in which the information captured by the estimation branch is propagated back to the classification branch through gradients to improve the information abundance of the classification branch. At the same time, we propose a cross-branch interaction module (CBIM). Through a soft spatial attention mechanism, CBIM realizes the soft transfer of features from the classification branch to the estimation branch. In addition to alleviating the dilemma of data imbalance, it also reduces the error accumulation caused by the simple cascading of hard masks. Combining these two points, we realize a positive information feedback loop from estimation to classification and back to estimation. Extensive experiments based on Himawari-8 data demonstrate that our multi-task collaboration framework, compared with the previous two-stage “classification-estimation” framework, can derive high spatio-temporal resolution precipitation products with higher accuracy. Moreover, we analyze the correlation between the infrared bands of different wavelengths of Himawari-8 under different precipitation intensities to lay a foundation for optimizing the input and improving the generalization capability of the model to other infrared remote sensing data.
Our contributions can be summarized as follows:
We propose a multi-task collaboration framework named MTCF, i.e., a novel combination mode of the classification and estimation models, which alleviates the error accumulation and retains the ability to improve the data balance;
We propose a multi-task consistency constraint mechanism. Through this mechanism, the information abundance and the prediction accuracy of the classification branch are largely improved;
We propose a cross-branch interactive module, i.e., CBIM, which realizes the soft feature transformation between branches via the soft spatial attention mechanism. The consistency constraint mechanism and CBIM together make up a positive information feedback loop to produce more accurate estimation results;
We model and analyze the correlation between infrared bands of different wavelengths of Himawari-8 under different precipitation intensities to improve the applicability of the model on other data.
3. Methods
In this section, we illustrate the proposed multi-task collaboration precipitation estimation framework (MTCF) in detail. First, we give an overview of the proposed MTCF for precipitation estimation. Second, we introduce the baseline network structure in our framework. Then, we elaborate on the proposed multi-branch network with the proposed consistency constraint mechanism. Finally, we illustrate the proposed cross-branch interaction module (CBIM), which effectively improves the performance of precipitation estimation.
3.1. Overview
In view of the disadvantages of the two-stage framework mentioned in Section 1, we propose a multi-task collaboration deep learning framework named MTCF for precipitation estimation. In MTCF, we design a cross-branch positive information feedback loop, which is composed of the multi-task consistency constraint mechanism and the cross-branch interaction module. As shown in Figure 2, the whole framework consists of an encoder and two decoders, i.e., two parallel branches, which carry out the precipitation estimation and classification tasks, respectively.
On the one hand, we introduce a consistency constraint mechanism into the loss function. To ensure the consistency of the predictions of the two parallel tasks, the mechanism calculates a consistency loss between the classification and estimation branches, taking the predictions of the estimation branch as the pseudo ground-truth. Thus, the information captured by the estimation branch can be propagated back to the classification branch via gradients to improve the information abundance and classification accuracy of the classification branch. On the other hand, we propose a cross-branch interaction module (CBIM). It transmits the spatial features of the classification branch to the estimation branch and realizes the soft transfer and fusion of features between the two parallel branches. In addition to alleviating the dilemma of data imbalance, it also reduces the error accumulation (i.e., the transfer of errors from the classification model to the estimation model when classification results are used to balance the data distribution for the estimation task) caused by the two-stage framework. The combination of these two aspects forms a positive information propagation cycle of “estimation-classification-estimation”, and the experiments demonstrate that such a design achieves higher accuracy for precipitation estimation. Moreover, to explore the correlation between the infrared bands of different wavelengths of Himawari-8, we introduce a channel attention module [38] before the encoder. By designing corresponding experiments for different precipitation intensities, we obtain the change of the correlation of each infrared band under different precipitation intensities. During the experiments, we take the Himawari-8 data with 10 infrared bands as the input of the models and use the GPM data as the ground-truth for supervision.
3.2. Baseline Network
The precipitation estimation task aims at estimating per-pixel precipitation rate values and is usually formulated as a segmentation-like problem [39,40]. In this paper, we take U-Net [41], a widely used image segmentation model based on the fully convolutional network (FCN) [42], as our baseline network. The baseline network consists of an encoder and a decoder. The encoder progressively reduces the spatial resolution and performs feature representation learning, while the decoder upsamples the feature maps and performs pixel-level classification or regression. We give a detailed depiction of our baseline model as follows.
As shown in Figure 3, the encoder consists of repeated applications of double 3 × 3 convolutions, each followed by a rectified linear unit (ReLU), with a 2 × 2 max pooling operation with stride 2 for downsampling. The whole network has a downsampling ratio of 16. The decoder contains three types of layers: transposed convolution layers with stride 2 for upsampling, ReLU layers, and 3 × 3 convolution layers. Moreover, the feature maps in the encoder are directly concatenated with those of the same scale in the decoder to obtain accurate context information and achieve better predictions. At the end of the network, a single convolution layer converts the channel number to 1 for the precipitation classification or estimation prediction.
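The baseline described above can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the authors' released code: the channel widths, the 10-band input, and the exact stage count (four poolings, giving the stated downsampling ratio of 16) are assumptions consistent with the text.

```python
import torch
import torch.nn as nn

def double_conv(c_in, c_out):
    # two 3x3 convolutions, each followed by ReLU
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class UNetBaseline(nn.Module):
    """Minimal U-Net-style baseline; channel widths are illustrative."""
    def __init__(self, in_ch=10, widths=(32, 64, 128, 256, 512)):
        super().__init__()
        self.encs = nn.ModuleList()
        c = in_ch
        for w in widths:
            self.encs.append(double_conv(c, w))
            c = w
        self.pool = nn.MaxPool2d(2)
        self.ups = nn.ModuleList()
        self.decs = nn.ModuleList()
        for w_hi, w_lo in zip(widths[::-1][:-1], widths[::-1][1:]):
            self.ups.append(nn.ConvTranspose2d(w_hi, w_lo, 2, stride=2))
            self.decs.append(double_conv(w_lo * 2, w_lo))  # skip concat doubles channels
        self.head = nn.Conv2d(widths[0], 1, 1)  # final single conv -> 1 channel

    def forward(self, x):
        skips = []
        for i, enc in enumerate(self.encs):
            x = enc(x)
            if i < len(self.encs) - 1:
                skips.append(x)
                x = self.pool(x)  # four poolings: downsampling ratio 2^4 = 16
        for up, dec, skip in zip(self.ups, self.decs, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([x, skip], dim=1))  # encoder-decoder skip connection
        return self.head(x)
```

Feeding a 10-band patch of size 64 × 64 returns a single-channel map of the same spatial size, matching the per-pixel prediction setting.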
3.3. Multi-Branch Network with Consistency Constraint
In the experiments of Section 4.4.1, we found that, between the single classification and estimation tasks, the accuracy of the estimation task is significantly higher than that of the classification task under each precipitation intensity. We believe the reason is that the information captured by the classification network is insufficient: compared with the quantitative precipitation intensity information of the estimation task, the classification task obtains only the qualitative “rain/no-rain” information, which leads to poor predictions. In multi-task precipitation estimation, the classification task plays the role of alleviating the data imbalance of precipitation estimation. Thus, it is extremely important to improve the accuracy of the classification task in order to reduce the negative impact of error accumulation on the estimation task. An intuitive idea is to introduce the abundant quantitative precipitation information from the estimation branch to improve the information abundance of the classification model. Specifically, we design a consistency constraint mechanism between the two branches. Taking the predictions of the estimation branch as the pseudo ground-truth, the gap between the classification branch and the estimation branch is calculated and introduced into the loss function. In this way, the precipitation intensity information is transmitted to the classification branch through gradient back-propagation to improve its accuracy.
As shown in Figure 4, the total loss function is composed of three parts: (1) the classification loss $L_{cls}$, (2) the estimation loss $L_{est}$, and (3) the consistency loss $L_{con}$. The classification branch aims at determining whether the precipitation rate at each pixel is larger than the threshold m (mm/h); we use the binary cross-entropy loss for its supervision. The estimation branch is designed to output the concrete precipitation intensity (mm/h) of each pixel; we use the mean squared error (MSE) loss and the mean absolute error (MAE) loss for its supervision. For the consistency loss $L_{con}$, let $\hat{Y}_{cls}$ denote the predictions of the classification branch and $\hat{Y}_{est}$ the predictions of the estimation branch. Specifically, we convert the estimation prediction $\hat{Y}_{est}$ into a 0-1 mask $M_{est}$ as the pseudo ground-truth for the consistency loss: its values are 1 where the prediction values in $\hat{Y}_{est}$ are larger than m and 0 otherwise. Then, we supervise $\hat{Y}_{cls}$ with the mask $M_{est}$ via the binary cross-entropy loss to ensure the consistency of the two branches.
Thus, the total loss function is designed as follows:

$L_{total} = \lambda_{1} L_{cls} + \lambda_{2} L_{est} + \lambda_{3} L_{con}$

where $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are the loss weights for balancing the three losses, and $L_{cls}$, $L_{est}$, and $L_{con}$ are the classification loss, estimation loss, and consistency loss, respectively.
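A minimal sketch of this three-part loss is given below. Tensor names, the use of logits for the classification branch, and the equal weighting defaults are our assumptions; the key point is that the consistency term supervises the classification output with a pseudo ground-truth mask thresholded from the estimation branch's own predictions.

```python
import torch
import torch.nn.functional as F

def total_loss(cls_logits, est_pred, target, m=5.0, lambdas=(1.0, 1.0, 1.0)):
    """Sketch of the classification + estimation + consistency loss."""
    gt_mask = (target > m).float()
    # (1) classification loss: BCE against the true rain/no-rain mask
    l_cls = F.binary_cross_entropy_with_logits(cls_logits, gt_mask)
    # (2) estimation loss: MSE + MAE against the precipitation intensity
    l_est = F.mse_loss(est_pred, target) + F.l1_loss(est_pred, target)
    # (3) consistency loss: BCE against a 0-1 pseudo ground-truth mask
    #     thresholded from the estimation branch's predictions
    pseudo_gt = (est_pred > m).float()
    l_con = F.binary_cross_entropy_with_logits(cls_logits, pseudo_gt)
    w1, w2, w3 = lambdas
    return w1 * l_cls + w2 * l_est + w3 * l_con
```

Because the thresholding is non-differentiable, the consistency gradients flow into the classification branch, which is consistent with the stated goal of enriching that branch with intensity information.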
3.4. Cross-Branch Interaction with Soft Attention
From Section 2.4, we observe that the precipitation data exhibit an obviously unbalanced distribution. Direct precipitation estimation on the unbalanced data makes the model more inclined toward no or light precipitation, which accounts for the larger proportion of the data, so the prediction performance of the model for high-intensity precipitation is unsatisfactory.
The two-stage solution is to cascade the classification and estimation tasks. Usually, the predictions of the classification task are used to filter the precipitation data in the estimation task, which adjusts the data distribution by removing the non-precipitation pixels. One obvious problem with this solution is that the classification task has a crucial, even decisive, influence on the estimation: the errors of the classification task are directly reflected in the final estimation results.
In order to alleviate the error accumulation issue while still using the predictions of the classification task to balance the data distribution in the estimation task, we propose a cross-branch interaction attention mechanism. In this mechanism, we use the precipitation probability maps predicted by the classification branch as soft spatial attention masks to carry out multi-level feature transfer and fusion into the estimation branch. In this way, the attention of the model to data of different precipitation intensities can be adjusted, alleviating the data imbalance problem.
As shown in Figure 5, the proposed cross-branch interaction module (CBIM) is inserted between the two parallel branches. A detailed depiction of CBIM is shown in Figure 6, and its execution process is described as follows.
Let M denote the output probability maps from the precipitation classification branch. First, based on M, we generate four attention maps, as shown in Figure 5, i.e., $M_{1}$, $M_{2}$, $M_{3}$, $M_{4}$, with different scales via bilinear interpolation, where each $M_{i}$ matches the spatial resolution of the i-th decoder stage. Next, we use these resized attention maps $M_{i}$ to extract the region-aware estimation features. As shown in Figure 6, given the i-th stage feature maps $F_{i}^{est}$ in the precipitation estimation branch, the enhanced feature maps $\tilde{F}_{i}^{est}$ are calculated as follows:

$\tilde{F}_{i}^{est} = F_{i}^{est} * M_{i}$

where * indicates the element-wise multiplication operation. Then, we introduce the features from the classification branch, denoted by $F_{i}^{cls}$, into the estimation branch. Specifically, we utilize the above attention mask $M_{i}$ to obtain the region-aware classification features $\tilde{F}_{i}^{cls} = F_{i}^{cls} * M_{i}$, which are concatenated with the enhanced feature maps $\tilde{F}_{i}^{est}$ and the corresponding feature maps $F_{i}^{enc}$ in the encoder. This process can be calculated as follows:

$F_{i}^{agg} = [\tilde{F}_{i}^{est}, \tilde{F}_{i}^{cls}, F_{i}^{enc}]$

where $[\cdot]$ denotes the concatenation operation and $F_{i}^{enc}$ represents the corresponding feature maps in the encoder.
Finally, we use two sequential 1 × 1 convolution layers to refine the aggregated features $F_{i}^{agg}$ and output the final features $\hat{F}_{i}$ at stage i:

$\hat{F}_{i} = \mathrm{Conv}_{1 \times 1}(\mathrm{Conv}_{1 \times 1}(F_{i}^{agg}))$
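The per-stage computation of CBIM can be sketched as follows. This is an illustrative module, not the authors' code: channel sizes, the argument names, and the ReLU between the two 1 × 1 convolutions are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CBIM(nn.Module):
    """Cross-branch interaction module (sketch; channel sizes are assumptions).

    The classification branch's probability map serves as a soft spatial
    attention mask; attended estimation and classification features are
    concatenated with the encoder features and refined by two 1x1 convs.
    """
    def __init__(self, c_est, c_cls, c_enc, c_out):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(c_est + c_cls + c_enc, c_out, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 1))

    def forward(self, f_est, f_cls, f_enc, prob_map):
        # resize the full-resolution probability map to this stage's scale
        attn = F.interpolate(prob_map, size=f_est.shape[-2:],
                             mode="bilinear", align_corners=False)
        f_est = f_est * attn  # region-aware estimation features
        f_cls = f_cls * attn  # region-aware classification features
        fused = torch.cat([f_est, f_cls, f_enc], dim=1)
        return self.refine(fused)
```

Unlike a hard 0-1 mask, the soft attention map only reweights features, so pixels mis-classified as no-rain are attenuated rather than removed, which is how the module limits error accumulation.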
In this way, the features of the precipitation estimation branch are enhanced effectively via the proposed CBIM. Together, the proposed consistency constraint and CBIM constitute the whole positive information feedback loop: the former realizes information feedback from the estimation task to the classification task, reducing the error of the classification branch, while the latter realizes information transfer from the classification task to the estimation task, weakening the error accumulation of the classification branch. Through this feedback loop and its iteration, the model capacity can be effectively improved, realizing a win-win situation for the multi-task model of precipitation classification and estimation.
3.5. Implementation Details
In this section, we report the implementation details of our experiments. We conduct all the experiments with PyTorch 1.1 [43] on a workstation with 2 GTX 1080 Ti GPU cards. During the training phase, the mini-batch size is set to 8 and the total number of training epochs is set to 150. The network parameters are optimized by Adam, with an initial learning rate of 3 × 10 and a weight decay of 1 × 10. The classification threshold m is set to 5.0 mm/h in our experiments. The loss weights of the total loss function are all set to 1. The principle of setting these loss weights is to keep the three losses in the same order of magnitude, so as to ensure stable optimization in multi-task learning.
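A corresponding optimizer setup might look like the following. Note that the learning-rate and weight-decay exponents are truncated in the text above, so the 3e-4 and 1e-4 values here are hypothetical placeholders, not figures from the paper.

```python
import torch
import torch.nn as nn

def make_optimizer(model):
    # Hypothetical exponents: the source text gives only "3 x 10" and
    # "1 x 10", so 3e-4 / 1e-4 are illustrative values only.
    return torch.optim.Adam(model.parameters(), lr=3e-4, weight_decay=1e-4)
```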
6. Conclusions
In this paper, we have proposed a multi-task collaboration deep learning framework named MTCF for remote sensing infrared precipitation estimation. In this framework, we have developed a novel parallel cooperative combination mode of the classification and estimation models, realized by the consistency constraint mechanism and a cross-branch interaction module (CBIM) with soft attention. Compared with the simple cascaded combination mode of the previous two-stage framework, our framework alleviates the error accumulation problem in multi-task learning while retaining the ability to improve the data balance. Extensive experiments based on the Himawari-8 and GPM datasets have demonstrated the effectiveness of our framework against the two-stage framework and PERSIANN-CNN methods.
From the application perspective, our framework is lightweight, real-time, and high-precision, with a strong ability to identify the spatial distribution details of precipitation and to generalize to extreme weather conditions. It can provide strong data support for short-term precipitation prediction, extreme weather monitoring, flood control, and disaster prevention. In addition, the importance of the input infrared bands obtained in our work lays a foundation for optimizing the input of precipitation estimation and improving the generalization capability of our model to other infrared remote sensing data in the future. We hope that the proposed multi-task collaborative framework will serve as a solid basis and benefit other research on precipitation estimation.