Tropical Cyclone Forecast Using Multitask Deep Learning Framework
Abstract—A tropical cyclone is a powerful weather system that affects human daily life. Accurate and rapid tropical cyclone forecasts can guide disaster prevention and mitigation work. The mainstream tropical cyclone forecasting method is numerical forecasting, which requires abundant prior knowledge and expensive computation. Machine learning methods have received increasing attention because they can overcome these disadvantages. However, existing machine learning methods usually ignore some potential factors because they mainly concentrate on one aspect of the tropical cyclone forecast. This letter proposes a multitask machine learning framework to forecast tropical cyclone path and intensity. It has two modules: a prediction module and an estimation module. We use an improved generative adversarial network as the prediction module to predict the tropical cyclone spatial data at a certain moment in the future. Then, we use two different deep neural networks as the estimation module to extract the position and intensity from the generated prediction data. The proposed method is a general and relatively accurate tropical cyclone forecast method. We reach a 24 h path forecast error of 116 km and a 24 h intensity forecast error of 13.06 kt.
Index Terms—Tropical Cyclone Forecast, Generative Adversarial Network, Wasserstein Distance.
I. INTRODUCTION
and intensity of tropical cyclones, so it is very difficult to accurately describe it with existing models [1]. Furthermore, forecasting the intensity of tropical cyclones is more difficult than forecasting the path. Researchers believe that this is because the physical process that causes the intensity change of tropical cyclones is so complicated that we possess little knowledge of it [2]. These two factors affect each other during tropical cyclone development, so combining the two tasks is challenging but significant work.
In this letter, we propose a framework that gives a quick and reliable forecast of tropical cyclone path and intensity based on infrared images. The framework has a prediction module to predict the future spatial data of tropical cyclones and an estimation module to determine the value of the indicator from the predicted result. We use a retrospective CycleGAN [3] with Wasserstein loss [4] in the prediction module. In the estimation module, we build a new model called TIENet to predict the intensity and use TCLNet to predict the position. Our work achieves an average 6-hour path forecast error of 61 km and an average 24-hour path forecast error of 116 km, while our 6-hour and 24-hour intensity forecast errors reach 14.20 kt and 13.06 kt, respectively. These results are produced with the last 24 hours data within
Fig. 1. An overview of the forecast framework. WCycleGAN generates the predicted data from historical data; TCLNet and TIENet then extract the position and intensity information from it. The green point shows the forecast location, and the blue point shows the ground truth location.
The research of neural network methods used in tropical cyclone forecasting started at the end of the last century. Until the 2010s, MLP and BP networks were the mainstream neural network methods for forecasting the intensity and path of tropical cyclones [6] [7]. Since the mid-2010s, due to the development of deep learning, more new methods have been introduced into the forecast of tropical cyclones. The recurrent neural network (RNN) is a class of neural networks that exhibit temporal dynamic behavior. Moradi Kordmahalleh et al. [8] and Alemany et al. [9] used this method to forecast the path of tropical cyclones in 2016 and 2019, while Pan et al. [10] used it to forecast the intensity of tropical cyclones in 2019. Rüttgers et al. [5] introduced GAN into the tropical cyclone path forecast field in 2018. The LSTM network also plays an important role in tropical cyclone forecasting: Kim et al. [11] used convLSTM to forecast the tropical cyclone path, and Chen et al. [12] combined LSTM with CNN to forecast the intensity. These methods mainly concentrated on one aspect of the tropical cyclone forecast, so they were inclined to use specialized data. Furthermore, they ignored some potential factors as a result. It also means that several different models have to be called to obtain the expected results.
Since the generative adversarial network was proposed in 2014, it has become a research hotspot. Researchers have made a lot of effort to improve its performance. On the one hand, researchers changed the network's structure, for example, DCGAN [13], LAPGAN [14], and CycleGAN [15], to help the network fit more tasks. On the other hand, researchers adjusted the loss function of the network, for example, LSGAN [16] and WGAN [4], to strengthen the adversarial relationship between generator and discriminator, which can lead to higher-quality results. These high-performance models are of great help to meteorological research. Among all these works, CycleGAN is widely applied in computer vision as a result of its ability to build connections among
This method can be regarded as a simple application of the framework that we proposed. We only use two networks to accomplish the path and intensity forecast although this framework can complete more tasks. In this section, we will discuss the details of the three networks in the two modules that compose our framework.
A. Prediction module
This module consists of one network, which we call WCycleGAN. The network is inspired by the future-frame prediction work of Kwon et al. [3]. We use the same retrospective method, adopt their network architecture, and change the input and output layers to fit grayscale images. This GAN has two discriminators: a frame discriminator, as in other GANs, and a sequence discriminator that we use to enhance the relationship between the inputs and outputs. We improve the loss function with the idea of the Wasserstein distance [17], because the original discriminator loss converges to zero so rapidly that it cannot provide guidance for the generator. We apply the Wasserstein distance by using the gradient penalty [4], which solves this problem and provides a better result. We also believe the Wasserstein loss can help find the connection between long-interval sequence data such as the images of tropical cyclones. The generator loss can be formulated as follows:
$$L_G = L_{image} + \lambda_1 L_{LoG} + \lambda_2 L^{frame}_{G_{adv}} + \lambda_3 L^{seq}_{G_{adv}} \tag{1}$$
where $\lambda_1$, $\lambda_2$, $\lambda_3$ are parameters. We call the first two terms in the formula reconstruction losses and the last two adversarial losses. The reconstruction losses can be formulated as follows:
$$L_{image} = \frac{1}{6} \sum_{(p,q) \in S^{pair}_{m,n}} l_1(p, q) \tag{2}$$
$$L_{LoG} = \frac{1}{6} \sum_{(p,q) \in S^{pair}_{m,n}} l_1\big(LoG(p), LoG(q)\big) \tag{3}$$
where $x_m$ is the first picture in a sample, $x_{n+1}$ is the last picture in a sample, and $x'$ and $x''$ respectively represent the two
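As a rough illustration of the reconstruction terms in Eqs. (2) and (3) and the weighted sum in Eq. (1), the following NumPy sketch computes an l1 image loss and an l1 loss on Laplacian-of-Gaussian (LoG) responses for a list of image pairs. The function names, the 5 x 5 unnormalized LoG kernel with sigma = 1, the zero padding, and the use of the mean absolute difference for l1 are all assumptions made for illustration; the source does not specify them. The pair set is simply passed in as a list and averaged over its length, and the adversarial terms are taken as precomputed numbers rather than derived from real discriminators. The default weights are the values of λ1, λ2, λ3 reported in the training details.

```python
import numpy as np

def log_kernel(size=5, sigma=1.0):
    """Unnormalized, zero-sum LoG kernel (size and sigma are assumed values)."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    s2 = sigma ** 2
    k = (x ** 2 + y ** 2 - 2 * s2) / s2 ** 2 * np.exp(-(x ** 2 + y ** 2) / (2 * s2))
    return k - k.mean()  # zero response on constant regions away from the border

def filter2d(img, kernel):
    """Same-size 2-D filtering with zero padding (the kernel is symmetric)."""
    r = kernel.shape[0] // 2
    padded = np.pad(img, r)
    out = np.zeros(img.shape, dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def l1(a, b):
    """Mean absolute difference, used here as the l1 distance (an assumption)."""
    return float(np.mean(np.abs(a - b)))

def reconstruction_losses(pairs):
    """Average l1 and LoG-l1 losses over the given (p, q) image pairs."""
    k = log_kernel()
    l_image = sum(l1(p, q) for p, q in pairs) / len(pairs)
    l_log = sum(l1(filter2d(p, k), filter2d(q, k)) for p, q in pairs) / len(pairs)
    return l_image, l_log

def generator_loss(l_image, l_log, l_adv_frame, l_adv_seq,
                   lam1=0.005, lam2=0.003, lam3=0.003):
    """Weighted sum of Eq. (1); the adversarial terms are supplied by the caller."""
    return l_image + lam1 * l_log + lam2 * l_adv_frame + lam3 * l_adv_seq

# Toy usage with random 16 x 16 "images" (hypothetical data).
rng = np.random.default_rng(0)
real = rng.random((16, 16))
fake = real + 0.1
l_image, l_log = reconstruction_losses([(real, fake), (real, real)])
```

In this toy case the first pair differs by a constant offset, so the image term is driven by that offset while the LoG term only responds near the zero-padded border; a real training loop would compute both terms on generated and ground-truth frames.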
$$L^{seq}_{G_{adv}} = \frac{1}{4} \sum_{\tilde{X} \in M_{m,n}} \mathbb{E}\big[D^{seq}(\tilde{X})\big] \tag{6}$$
where
$$P_{m,n} = \{x'_m, x''_m, x'_{n+1}, x''_{n+1}\} \tag{7}$$
$$M_{m,n} = \big\{(x_m, \cdots, x'_{n+1}),\ (x_m, \cdots, x''_{n+1}),\ (x'_m, \cdots, x_{n+1}),\ (x''_m, \cdots, x_{n+1})\big\} \tag{8}$$
There are two kinds of discriminator losses: the frame loss and the sequence loss. The frame discriminator loss can be formulated as follows:
$$L_{D^{frame}} = \frac{1}{4} \sum_{\tilde{x} \in P_{m,n},\, x \in Q_{m,n}} \Big( -\mathbb{E}\big[D^{frame}(x)\big] + \mathbb{E}\big[D^{frame}(\tilde{x})\big] - \lambda_4\, \mathbb{E}\big[\big(\lVert \nabla_{\tilde{x} \in P_{m,n}} D^{frame}(\tilde{x}) \rVert_2 - 1\big)^2\big] \Big) \tag{9}$$
where $P_{m,n}$ is the same as $P_{m,n}$ in $L^{frame}_{G_{adv}}$, $\lambda_4$ is a parameter, $Q_{m,n} = \{x_m, x_{n+1}\}$, and $\mathbb{E}\big[\big(\lVert \nabla_{\tilde{x} \in P_{m,n}} D^{frame}(\tilde{x}) \rVert_2 - 1\big)^2\big]$ is the gradient penalty, which is used to ensure Lipschitz continuity. The sequence discriminator loss can be formulated as follows:
$$L_{D^{seq}} = \frac{1}{4} \sum_{\tilde{x} \in M_{m,n},\, x \in N_{m,n}} \Big( -\mathbb{E}\big[D^{seq}(x)\big] + \mathbb{E}\big[D^{seq}(\tilde{x})\big] - \lambda_5\, \mathbb{E}\big[\big(\lVert \nabla_{\tilde{x} \in P_{m,n}} D^{seq}(\tilde{x}) \rVert_2 - 1\big)^2\big] \Big) \tag{10}$$
where $M_{m,n}$ is the same as $M_{m,n}$ in $L^{seq}_{G_{adv}}$, $\lambda_5$ is a parameter, $N_{m,n} = (x_m, \cdots, x_{n+1})$, and $\mathbb{E}\big[\big(\lVert \nabla_{\tilde{x} \in P_{m,n}} D^{seq}(\tilde{x}) \rVert_2 - 1\big)^2\big]$ is also the gradient penalty.
B. Estimation module
This module consists of two networks: TIENet and TCLNet. We propose a novel network termed TIENet to determine the intensity. Its input is the predicted image from WCycleGAN and its output is a predicted Beaufort scale. Unlike other methods that predict the specific wind speed of the tropical cyclone, this approach reduces the amount of calculation while still evaluating the effect well. We stress the importance of the detailed structure of tropical cyclones, so we choose 5 convolutional layers with convolution kernels doubling and a stride of 1. This network has far fewer parameters than ResNet50 but provides similar performance. We choose the cross-entropy between the predicted Beaufort scales and the real intensity labels as its loss function. The structure of TIENet is illustrated in Fig. 2.
TCLNet comes from Tan C.'s work [18]. Its input is the predicted image from WCycleGAN and its output is a heatmap. It uses an improved MSE loss to describe the difference between the generated heatmap and the real heatmap and is trained to narrow this difference. The coordinate with the highest pixel value in the generated heatmap represents the location of the tropical cyclone center.
Fig. 2. The structure of TIENet. It has 5 convolution layers and 2 linear layers.
IV. EXPERIMENTS AND RESULT
A. Datasets
1) Tropical cyclone infrared time series dataset: This dataset comes from the infrared window channel in the US grid satellite dataset (GridSat) [19]. We crop a tropical cyclone at a certain time t according to the BST dataset [20] with a spatial resolution of 256 x 256 pixels (20 degrees of latitude by 20 degrees of longitude) and a temporal resolution of 6 h, with the cyclone center located at the center of the picture. Afterward, we keep the position of this window in the GridSat global image unchanged and crop the images at times t − 24, t − 18, t − 12, t − 6, t, t + 6, t + 12, t + 18, t + 24 under this window in turn. These images compose a sequence of 9 pictures; one example is shown in Fig. 3. Next, we extract 5 continuous images from each tropical cyclone sequence as a training sample, so each sequence provides 5 training samples.
Fig. 3. An example of the sequence in the tropical cyclone infrared time series dataset.
2) Estimation dataset: This dataset uses the results obtained by WCycleGAN as the input images. The 6h prediction results are the direct outputs of the network, while the 24h prediction results are the 4-step outputs. We get the coordinate (u, v) of the tropical cyclone center in the image according to the BST dataset, and the intensity label comes from the maximum wind speed near the tropical cyclone center extracted from the BST dataset, which is divided into 13 categories according to the Beaufort scale, corresponding to levels 7-17+. The heatmap is produced by the formula:
$$H(x, y) = \exp\left(\frac{(x - u)^2 + (y - v)^2}{-2\sigma^2}\right) \tag{11}$$
where $H(x, y)$ is the pixel value at (x, y) in the heatmap and $\sigma$ is a parameter, set to 15 here. An example of a set of images in the location and intensity determination dataset is shown in Fig. 4.
Fig. 4. An example of a set of images in location and intensity determination dataset.
B. Training Details
To train our networks, we use 4,930 training samples from 986 tropical cyclone sequences. In WCycleGAN, we set $\lambda_1 = 0.005$, $\lambda_2 = \lambda_3 = 0.003$, $\lambda_4 = \lambda_5 = 10$ according to Kwon et al.'s work [3] and Gulrajani et al.'s work [4]. We use Adam optimizers [21] with $\beta_1 = 0.5$, $\beta_2 = 0.999$, and the learning rate is set as follows. For WCycleGAN, we first use a learning rate of 0.0001 to train for 20 epochs, then reduce it to 0.00005 for 30 epochs, and finally to 0.00001 for 40 epochs. For TCLNet, after 4 epochs of training with a learning rate of 0.001, it is reduced to 0.00005 for 1 epoch. For TIENet, we train 10 epochs with a learning rate of 0.1, then reduce it to 0.01 for 20 epochs, and finally to 0.001 for 30 epochs.
TABLE I
Overall results on 50 6h test samples and 10 24h test samples. Larger is better for PSNR and SSIM; smaller is better for MSE, LD, and SE.
Metrics         6h          24h
Average PSNR    22.46       19.77
Average SSIM    0.65        0.64
Average MSE     5.89        11.10
Average LD      61 km       116 km
Max LD          122 km      192 km
Average SE      14.20 kt    13.06 kt
Max SE          38.49 kt    38.49 kt
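The heatmap label of Eq. (11) and the decoding rule stated for TCLNet (the center is the coordinate with the highest pixel value) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and treating x as the column index and y as the row index is an assumption. The image size of 256 and σ = 15 are the values given in the text.

```python
import numpy as np

def gaussian_heatmap(u, v, size=256, sigma=15.0):
    """Eq. (11): H(x, y) = exp(-((x - u)^2 + (y - v)^2) / (2 * sigma^2))."""
    # y indexes rows and x indexes columns (an assumed convention).
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - u) ** 2 + (y - v) ** 2) / (2.0 * sigma ** 2))

def decode_center(heatmap):
    """Return (x, y) of the highest-valued pixel in the heatmap."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(col), int(row)

# Build a label for a hypothetical center at (u, v) = (140, 100) and decode it.
heatmap = gaussian_heatmap(u=140, v=100)
center = decode_center(heatmap)  # (140, 100)
```

Encoding the center as a Gaussian blob rather than a single hot pixel gives the MSE-style loss a smooth target, so a prediction that is a few pixels off is penalized less than one that is far away.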
C. Metrics
We use PSNR, SSIM, and MSE (we multiply the original MSE by a factor of 100 to show the difference) to measure the quality of the results of WCycleGAN. We use the path forecast error (LD) to measure the quality of the results of TCLNet. It is calculated as follows: