
Submitted Paper

This study presents an LSTM neural network model for predicting drought in Bangladesh's Tropical Cancer Region using Standardized Precipitation Index (SPI) values. The model outperformed traditional statistical methods, demonstrating superior predictive accuracy and potential for enhancing drought management strategies. The research emphasizes the importance of AI in improving decision-making related to agriculture and water resource management in drought-prone areas.

Uploaded by

pronoti1000

Enhancing Drought Prediction Using LSTM Neural Network: A Case Study of Tropical Cancer Region in Bangladesh
Md. Joybor Rahman1,1*, Puspendu Biswas Paul1,2† and
Tarin Tabaschum2,2†
1* Assistant Professor, Department of Agricultural Construction and Environmental Engineering, Faculty of Agricultural Engineering & Technology, Sylhet Agricultural University, Sylhet, 3100, Bangladesh.
2 Department of Agricultural Construction and Environmental Engineering, Faculty of Agricultural Engineering & Technology, Sylhet Agricultural University, Sylhet, 3100, Bangladesh.

*Corresponding author(s). E-mail(s): [email protected];


Contributing authors: [email protected];
[email protected];
† These authors contributed equally to this work.

Abstract
Drought prediction is a crucial task in climate-related forecasting, with sig-
nificant implications for agriculture, water resource management, and disaster
preparedness. In this study, we present a Long Short-Term Memory (LSTM)
neural network model to predict drought using the Standardized Precipitation
Index (SPI) values from multiple regions, namely Sreemangal, Jessore, and Syl-
het. The dataset was preprocessed, and the data was split into training, testing,
and validation sets. The LSTM model was trained with a custom R-squared met-
ric to evaluate its performance. Additionally, we compared the LSTM model’s
performance against traditional statistical models, including Multiple Linear
Regression (MLR), Random Forest, and a calculated value from another model.
The results demonstrated that the LSTM model exhibited superior predictive
accuracy, achieving higher R-squared values and lower mean squared error (MSE)
and mean absolute error (MAE) values compared to the other models. This study
highlights the effectiveness of the LSTM model in drought prediction and its
potential to enhance decision-making in drought-prone regions.

Keywords: Deep learning for drought prediction; weather prediction model;

1 Introduction
Drought, a complex and pervasive natural phenomenon, exerts varying impacts on
regions worldwide. The ability to accurately detect and monitor drought conditions is
paramount for effective mitigation strategies, as it directly affects agriculture, water
availability, and communities’ resilience. Historically, the process of drought assess-
ment heavily relied on limited datasets and subjective evaluations [1]. However, with
the advent of Artificial Intelligence (AI), novel avenues have opened to make more
data-driven decisions and develop anticipatory measures to tackle the intricacies of
drought-related challenges.
AI’s emergence has marked a turning point in drought detection and monitor-
ing at a global scale [2]. By leveraging advanced data analytics and machine learning
algorithms, AI models offer unparalleled potential to process and interpret extensive
environmental data. This transformative capability enables the identification of intri-
cate patterns and the forecasting of impending drought events [3]. The application of
AI is not confined to any specific geographic area; rather, its principles are adaptable
and invaluable in diverse regions, including countries like Bangladesh, where under-
standing drought dynamics, refining early warning systems, and devising effective
drought management strategies are vital for resilience.
AI’s prowess in the realm of drought detection stems from its exceptional abil-
ity to manage large and heterogeneous datasets from diverse sources. These sources
span from satellite imagery and weather stations to soil sensors and climate models.
The amalgamation and analysis of these data streams empower AI models to untan-
gle complex relationships that contribute to the onset and progression of drought [4].
Within this framework, machine learning algorithms, a subset of AI, emerge as key
players. These algorithms can decipher historical data and recognize patterns indica-
tive of drought conditions, a capability that proves vital for both classification and
prediction [4]. This machine learning-driven approach equips the models to offer reli-
able insights into drought severity, duration, and spatial extent, thereby enhancing
the precision of drought assessments. An intrinsic advantage of these algorithms is
their adaptability; they evolve and refine predictions as new data becomes available,
ensuring that monitoring remains up-to-date.
The application of AI extends seamlessly to various regional contexts, showcas-
ing its transformative potential. Consider Bangladesh, where drought vulnerability
necessitates a close analysis of meteorological data. AI models equipped with machine
learning mechanisms can sift through historical climate data, satellite imagery, and
atmospheric variables, thus identifying key indicators of drought conditions [5].
These indicators could range from variations in temperature and precipitation pat-
terns to changes in atmospheric pressure. This information proves indispensable for
meteorologists, policymakers, and disaster management agencies, enabling improved
preparedness and response strategies [6].

Furthermore, AI’s influence transcends national boundaries through global-scale
initiatives. Satellite-based systems, such as the European Space Agency’s Climate
Change Initiative, leverage AI to analyze remote sensing data and create com-
prehensive global drought indicators. These indicators offer insights into drought
conditions across continents, fostering international collaboration in drought man-
agement and water resource planning [6]. Despite AI’s breakthroughs in drought
detection, hurdles persist. These include issues related to data availability, quality,
and consistency, particularly in developing regions lacking comprehensive monitoring
infrastructure. Additionally, the translation of AI-generated insights into actionable
information requires a collaborative effort involving scientists, policymakers, and local
communities.
In conclusion, AI’s rapid ascent has revolutionized the global landscape of drought
detection. By harnessing cutting-edge data analytics and machine learning techniques,
AI models can process vast troves of environmental data, thereby offering accurate and
timely information about drought conditions. This evolution extends from regional
applications to international monitoring initiatives, reshaping our comprehension of
drought dynamics and empowering us to proactively tackle the challenges it presents.
As technology advances and data availability improves, AI will continue to play a
pivotal role in mitigating the adverse impacts of drought on ecosystems and societies.
The main contributions of this research are as follows:
• Introduction of a novel LSTM-based neural network model for precise drought
prediction in South Asia’s Tropical Cancer Region.
• Effective utilization of deep learning techniques in capturing temporal dependencies,
enhancing the accuracy of drought event forecasting.
• Comparative analysis with traditional methods such as Multiple Linear Regression
and Random Forest, highlighting the LSTM model’s superior predictive capacity.
• Demonstrated robustness and applicability of the model across varied geographic
contexts like Sreemangal, Jessore, and Sylhet.
• Enhancement of drought prediction strategies through advanced neural network
models, contributing to improved water resource management, agriculture planning,
and climate change assessments.
• Focused research on developing a suitable neural network model to detect regional
precipitation patterns using local meteorological data.
Drought, a natural phenomenon characterized by a significant lack of precipitation,
remains an unpredictable event that can have far-reaching consequences. While many
may perceive drought as simply an extreme absence of rain, defining it objectively
for planning and management purposes poses a challenge. In essence, drought can be
understood as the relentless persistence of a precipitation deficit over a specific region
and time period [7].
To better comprehend the complexities of drought, researchers have categorized
it into four broad types [8]. Meteorological drought occurs when a region experiences
a prolonged absence of precipitation. Over the years, scientists have relied on the
analysis of precipitation data to study meteorological droughts [9]. By examining the
deficiency in precipitation relative to average values, researchers gain insights into the
severity and duration of droughts [10].
However, drought is not limited to the absence of surface rainfall; it also affects the
availability of surface and subsurface water resources, leading to hydrological droughts.
Scientists have turned to stream data to investigate hydrological droughts, employing
regression analyses to identify variables such as geology that play significant roles in
driving these drought events [11, 12].
Agricultural drought, on the other hand, primarily concerns the decline in soil
moisture levels and subsequent crop failures. The extent of soil moisture depletion
depends on various factors, including meteorological and hydrological conditions,
evapotranspiration rates, and the unique characteristics of different plants and soils.
To analyze agricultural droughts, researchers have developed drought indices that
combine measurements of precipitation, temperature, and soil moisture [13].
Beyond the physical and ecological dimensions, drought extends its impact to
the socio-economic realm, giving rise to socio-economic droughts. These droughts are
characterized by water supply systems struggling to meet the demands of economic
activities and goods, particularly when weather-related water shortages lead to low
supplies and high demand. This interplay between drought and the supply of
economic goods is referred to as socio-economic drought [13].
At the heart of drought’s effects lie the profound implications it has on plant
life, from the morphological to molecular levels. The severity of a drought is influ-
enced by various factors, including temperature, soil water holding capacity, and
rainfall patterns, all of which vary across different crops. Consequently, studying the
impacts of drought on plants requires a multi-faceted approach, considering the unique
characteristics of each crop and the environmental factors at play [14].
By examining and understanding the different dimensions of drought, researchers
aim to unravel its complexities and develop effective strategies for drought mitiga-
tion, water resource management, and agricultural planning. Through their work, they
strive to enhance our ability to predict, adapt to, and mitigate the impacts of drought
events, ultimately ensuring the resilience and sustainability of our ecosystems and
societies.
For the purpose of defining meteorological drought, McKee developed the Stan-
dardized Precipitation Index (SPI), a powerful tool for estimating the severity and
duration of drought occurrences [15]. This index has proven to be valuable not only in
short-term agricultural research but also in long-term analyses of subsurface waters,
river flows, and lake water levels [16]. What makes the SPI particularly useful is its
ability to be computed independently of the average base period, thanks to the utiliza-
tion of datasets spanning over 40 years to estimate mean precipitation [17]. However,
it is important to note that the computation of the SPI requires datasets of at least 50
years for time scales smaller than 12 months, while longer datasets are necessary for
time scales of 24, 36, and 48 months [18]. Due to its numerous benefits, the SPI has
gained widespread adoption worldwide as a means to define and track meteorological
dryness [19].
While the SPI provides valuable insights into drought conditions, further enhancements
can be achieved by combining this method with predictions based on stochastic
and weather regime modeling. The large-scale atmospheric circulation, characterized
by low-frequency components spanning one to three months, tends to cluster
around specific weather regimes [20]. Hidden Markov chains have been found to effectively
capture the residence and recurrence times of these weather regimes, along with
their transition probabilities, particularly observed in the Euro-Atlantic region. It has
also been observed that the local climate conditions, such as precipitation and tem-
perature, are probabilistically linked to these weather regimes. This is due to the fact
that certain weather regimes are more conducive to the development of drought conditions
than others. Researchers have investigated the influence of blocking and the
North Atlantic Oscillation (NAO) regimes on seasonal conditioning of river flows and
precipitation, with a specific focus on regions like Iberia [21]. By combining stochastic
and weather regime techniques, improved predictions can be achieved, potentially
extending the lead time for drought forecasts up to three months or even for seasonal
predictions.

Fig. 1: Content analysis (using the PRISMA protocol). [Flowchart. Identification: Google Scholar search over all fields with the key "Drought Index Prediction Using Neural Network", limited to articles and conference papers; article count = 312. Screening: duplicates removed. Eligibility: title, abstract, and metadata read to check whether each article concerns meteorological drought and artificial neural networks. Inclusion: articles analysed = 74.]

In addition to the SPI, other models and techniques have been employed for
drought prediction. For instance, the use of Cascaded Artificial Neural Fuzzy Inference
System (CANFIS) has shown promise in predicting drought events. CANFIS combines
the strengths of artificial neural networks (ANN) and fuzzy logic to capture the
complex relationships between input variables and drought outcomes [25]. By training
the CANFIS model with historical meteorological data, it becomes capable of fore-
casting drought conditions, providing valuable information for early warning systems
and decision-making processes.
Furthermore, the application of Neural Network (NN) models has proven effective
in drought prediction. NN models utilize a layered structure of interconnected nodes,
allowing them to learn and recognize patterns in historical data. By training NN mod-
els with historical climate data, including variables such as precipitation, temperature,
and soil moisture, they can forecast future drought events with a high degree of accu-
racy [22]. These models have been successfully employed in various regions to predict
drought occurrences and support drought management efforts.
To gain a comprehensive understanding of drought patterns and their implications,
multiple study areas are often examined [23]. By analyzing data from different regions,
researchers can identify commonalities and patterns that emerge during drought
events. This broader perspective helps in developing a comprehensive understanding
of the factors influencing drought occurrence and their impacts on various geographic
locations [24]. Additionally, studying multiple study areas allows for the validation
and generalization of drought prediction models, ensuring their applicability in diverse
regions and climates.
The utilization of the SPI, CANFIS, and Neural Network models has significantly
advanced our ability to detect, predict, and manage drought conditions [34]. These
tools provide valuable insights into drought severity, duration, and spatial distribution,
enabling stakeholders to make informed decisions and take proactive measures. Fur-
thermore, the combination of stochastic and weather regime techniques enhances our
understanding of the complex dynamics underlying drought events. By studying mul-
tiple regions, researchers can uncover common patterns and develop effective drought
prediction models [6, 23]. Ultimately, these advancements in drought detection and
prediction contribute to building resilience, improving water resource management,
and mitigating the impacts of drought on agriculture, ecosystems, and communities.

2 Methods
2.1 Overview
In this project, an LSTM-based neural network was developed to predict drought using Standardized Precipitation Index (SPI) values from multiple regions, including Sreemangal, Jessore, and Sylhet. After preprocessing the data and splitting it into training, testing, and validation sets, the LSTM model was trained with a custom R-squared metric to measure its fit to the data. The LSTM model's performance was compared with that of MLR, Random Forest, and a calculated value obtained from another model. The results showed that the LSTM model outperformed the other models, achieving higher R-squared values and lower mean squared error (MSE) and mean absolute error (MAE) values. These findings demonstrate the LSTM model's accuracy and potential for drought prediction, and the value of advanced machine learning techniques for climate-related forecasting tasks.

Table 1: Related research on drought detection and prediction
Reference | Year | Region | Description
Moreira et al. [21] | 2008 | Iberia | Examination of the influence of blocking and the North Atlantic Oscillation (NAO) regimes on river flows and precipitation in the context of drought prediction.
Partal et al. [25] | 2009 | Various regions | Application of the Cascaded Artificial Neural Fuzzy Inference System (CANFIS) for predicting drought events by capturing complex relationships between input variables and drought outcomes.
Abbot et al. [26] | 2012 | Queensland, Australia | Assessment of artificial intelligence in rainfall forecasting using a prototype artificial neural network model, achieving lower root mean squared error values compared to the Predictive Ocean Atmosphere Model for Australia (POAMA)-1.5 general circulation model.
Sobhani et al. [27] | 2019 | Iran | Modeling, analysis, and prediction of drought using climate data, focusing on the relationship between drought and climate factors.
Herzberg et al. [28] | 2019 | Vietnam | Application of multi-criteria decision analysis for land evaluation in the context of drought management and agriculture.
Subedi et al. [29] | 2019 | Nepal | Investigation of the impact of practicing bio-fertilizer on increasing rice yield under drought conditions.
Chattopadhyay et al. [30] | 2017 | India | Forecasting and mitigation of the adverse effects of climate change on agriculture, emphasizing the importance of effective adaptation strategies.
Togliatti et al. [31] | 2017 | Iowa, USA | Evaluation of predicted models to assess the impact of weather forecasting on crop production and agricultural decision-making.
Jakaria et al. [32] | 2020 | Tennessee, USA | Development of a machine learning-based smart weather forecasting system for fast and reliable predictions.
Sein et al. [33] | 2021 | Myanmar | Spatial analysis to obtain drought frequency and investigate the spatio-temporal patterns of drought occurrences.

2.2 Study Area


The study area is located in the Ganges Delta region near the Tropic of Cancer,
encompassing the cities of Jessore, Sylhet, and Sreemangal in Bangladesh. This region
is known for its fertile agricultural lands and is home to over 160 million people, making
it one of the most densely populated regions in the world. However, the region is prone
to natural disasters such as floods, cyclones, and droughts, which have significant
impacts on the local economy and the well-being of its inhabitants.
The topography of the region is characterized by low-lying flatlands and numerous
rivers, including the Ganges, Brahmaputra, and Meghna, which form an intricate
network of waterways. The region’s climate is tropical monsoon, with rainfall occurring
mainly from May to October. The average annual precipitation is around 2,500 mm,
but it varies widely across the region, with the northeastern parts receiving more
rainfall than the southwestern parts.
Agriculture is the primary economic activity in the region, with crops such as
rice, jute, and tea being the main produce. The region is also home to several large
industrial zones, including the Bangladesh Export Processing Zone, which has boosted
the country’s exports significantly. However, the reliance on agriculture and natural
resources makes the region highly vulnerable to climate change and its associated
impacts, such as droughts.

Fig. 2: Study area: Jessore, Sylhet, and Sreemangal (Moulvibazar). [Map spanning 87°30' to 93° longitude and 20°30' to 27°30' latitude, with the Tropic of Cancer marked. Basemap credits: Esri, HERE, Garmin, Foursquare, FAO, METI/NASA, NOAA, USGS.]

2.3 Data-set
2.3.1 Data collection
In this study, weather data was collected from three different regions of Bangladesh,
namely Jessore, Sylhet, and Sreemangal, for the period of 1980 to 2017. The data
was collected from the Bangladesh Meteorological Department (BMD), which is the
national agency responsible for the collection and dissemination of weather data in
Bangladesh.
The data collection process was carried out meticulously to ensure the accuracy
and reliability of the data. Both primary and secondary sources were used to gather the
weather data in this study. The primary source was the BMD database, which provided
us with the raw data for the three regions. The secondary source was the literature
on weather data for Bangladesh, which provided us with a contextual understanding
of the data.
The collected data consisted of various meteorological variables, such as tempera-
ture, humidity, precipitation, wind speed, and pressure. These variables were recorded
on a daily basis for each region. However, we encountered several missing data points
in the collected dataset. These missing data points were mainly due to technical issues
in the recording devices or natural disasters that affected the data collection process.
To address the issue of missing data, various techniques for data imputation were
employed. Statistical methods, such as mean imputation and regression imputation,
were used to fill in the missing data points. Additionally, domain-specific knowledge
and expert judgment were employed to fill in missing data points where statistical
methods were not applicable.
The collected data was then cleaned and processed to ensure consistency and uni-
formity across the three regions. We used a standardized format for data processing
to minimize errors and maintain consistency in the data. The processed data was then
stored in a database for further analysis.

2.3.2 Dataset preparation and missing data handling


The dataset for the neural network (NN) model, which is used to analyze the
weather patterns in three different regions of Bangladesh, was prepared as follows.
The data was collected from the Bangladesh Meteorological Department (BMD) for the
period of 1980 to 2017, and various techniques were used to clean and process it for
use in the NN model.
The first step in dataset preparation was to check for missing values in the collected
dataset. The missing values were identified and filled in using techniques such as mean
imputation and regression imputation. Additionally, domain-specific knowledge and
expert judgment were employed to fill in missing values where statistical methods were
not applicable. The formula for the mean is shown in Equation 1.

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{1} \]
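
As a minimal sketch of the mean imputation in Equation 1, assuming missing daily readings are stored as NaN, the gaps can be filled with the mean of the observed values. The function name `mean_impute` and the sample readings are hypothetical and are not taken from the BMD data.

```python
import numpy as np

def mean_impute(values):
    """Replace NaN entries with the mean of the observed entries (Equation 1)."""
    x = np.asarray(values, dtype=float)
    observed = ~np.isnan(x)
    if not observed.any():
        raise ValueError("cannot impute a series with no observed values")
    # x_bar = (1/n) * sum(x_i), taken over the n observed values only
    x_bar = x[observed].sum() / observed.sum()
    return np.where(observed, x, x_bar)

# Hypothetical daily rainfall readings (mm) with two gaps.
rainfall = [2.0, np.nan, 4.0, 6.0, np.nan]
filled = mean_impute(rainfall)
```

Mean imputation keeps the series mean unchanged but shrinks its variance, which is one reason regression imputation or expert judgment may be preferred for long gaps.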

Next, the data was standardized by scaling and normalizing the values of each
variable in the dataset. This step was important to ensure that all the variables had
the same range of values, and the NN model could learn from the data effectively.
After standardization, the dataset was split into training, validation, and testing
sets.
The training set was used to train the NN model, the validation set was used to
fine-tune the hyperparameters of the model, and the testing set was used to evaluate
the performance of the model on unseen data.
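
The standardization and splitting steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the 70/15/15 split ratios, the function names, and the synthetic data are all hypothetical. The sketch also fits the scaling statistics on the training rows only, a common precaution against information leaking from the validation and testing sets, whereas the text above standardizes before splitting.

```python
import numpy as np

def chronological_split(data, train_frac=0.7, val_frac=0.15):
    """Split rows in time order into training, validation, and testing sets."""
    n = len(data)
    train_end = int(round(n * train_frac))
    val_end = train_end + int(round(n * val_frac))
    return data[:train_end], data[train_end:val_end], data[val_end:]

def standardize(train, *others):
    """Scale each column to zero mean and unit variance using training statistics."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return tuple((part - mean) / std for part in (train, *others))

# Synthetic stand-in for two meteorological variables over 100 time steps.
rng = np.random.default_rng(0)
data = rng.normal(loc=[25.0, 80.0], scale=[5.0, 10.0], size=(100, 2))

train, val, test = chronological_split(data)
train_s, val_s, test_s = standardize(train, val, test)
```

Splitting in time order rather than at random keeps future observations out of the training set, which matters for a forecasting task.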
To further enhance the quality of the dataset, feature selection was performed to
identify the most important variables for predicting weather patterns. Various tech-
niques such as correlation analysis and recursive feature elimination were used to select
the most relevant features for the NN model.
Finally, the categorical variables in the dataset were encoded using one-hot encod-
ing. This step was necessary to convert the categorical variables into a numerical
format that could be fed into the NN model.
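
A minimal sketch of one-hot encoding is given below. The paper does not state which variables are categorical, so the use of the station name as the categorical variable is an assumption made only for illustration, as is the helper name `one_hot`.

```python
import numpy as np

def one_hot(labels):
    """Return the sorted category list and a 0/1 matrix with one column per category."""
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)), dtype=int)
    for row, label in enumerate(labels):
        encoded[row, index[label]] = 1
    return categories, encoded

# Hypothetical categorical column: the station each record came from.
stations = ["Jessore", "Sylhet", "Sreemangal", "Sylhet"]
categories, encoded = one_hot(stations)
```

Each row of `encoded` contains exactly one 1, in the column of its category, so no artificial ordering is imposed on the categories.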

2.4 Parameter selection


As it illustrates the connections between multiple variables, the correlation heatmap
is a useful tool for parameter selection. Data on precipitation, humidity, cloud cover,
temperature, and the Standardized Precipitation Index (SPI) are of special importance
here. The heatmap’s correlation coefficients help us determine which variables are
highly correlated with one another and which are very mildly so.
The magnitude and direction of the linear relationship between two variables can be
determined by calculating their correlation coefficient. Moderate correlations between
variables are preferable when picking parameters, as a correlation that is either too
high or too low can introduce bias or redundancy into the analysis. In this case, the
focus was placed on the variables having correlation coefficients between -0.75 and +0.75.
To avoid multicollinearity difficulties, the variables of interest were restricted to
those with a correlation of no more than 0.75. When two or more variables are highly
linked, a phenomenon known as multicollinearity emerges. This makes it difficult to
isolate the effect of any one of the factors on the outcome. Incorporating variables
with correlations greater than zero shows recognition that even weak relationships can
provide insight into and help explain the system under investigation.
Correlation coefficients below 0.75 and above 0 indicate that the selected variables
are largely independent and diversified, allowing for a more in-depth examination. This
method helps ensure that the chosen parameters represent the underlying processes
fairly and accurately. The correlation heatmap thus supports parameter selection by
confirming that the selected variables are appropriate for the analytical or modeling
tasks at hand. The heatmaps are shown in Fig. 3.
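
One possible reading of this selection rule can be sketched as a greedy filter on the pairwise Pearson correlation matrix: a variable is kept only if its absolute correlation with every variable already kept does not exceed 0.75. The function name, the variable names, and the toy data are hypothetical, and the paper does not specify the exact procedure, so this is an illustration rather than the authors' method.

```python
import numpy as np

def drop_collinear(data, names, threshold=0.75):
    """Keep variables whose |r| with every previously kept variable is <= threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    kept = []
    for j in range(len(names)):
        if all(abs(corr[j, k]) <= threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Toy data: the second column is a linear function of the first (r = 1),
# the third is only weakly correlated with it (r is about 0.29).
t = np.arange(6, dtype=float)
data = np.column_stack([t, 2.0 * t + 1.0, [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]])
names = ["precipitation", "cloud_cover", "humidity"]

selected = drop_collinear(data, names)
```

The result depends on column order, since the first variable of a highly correlated pair is the one retained; in practice the order could follow domain relevance.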

2.4.1 Additional data-set creation: SPI calculation


Fig. 3: Correlation heatmap of Jessore, Sylhet, and Sreemangal (Moulvibazar)

The SPI, which was developed by McKee, has become the standard index for defining, monitoring, and analyzing meteorological drought (MD) conditions on many time scales. Computing the SPI for a specified timescale at any location requires a multi-year record of monthly precipitation data from that location. The SPI is obtained by converting the initial precipitation series into a standardized normal distribution. In this study, the SPI was calculated by applying the gamma distribution over Sylhet and Sreemangal stations ??.
The SPI values were used for determining the drought severity.
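
The gamma-based SPI transformation can be sketched as below. This is a simplified illustration using SciPy, not the authors' code: it treats a single series of accumulated precipitation totals (for example, all totals for one calendar month), fits a two-parameter gamma distribution to the non-zero totals, mixes in the probability of a zero total, and maps the cumulative probability to a standard normal quantile. The function name and the synthetic input are hypothetical.

```python
import numpy as np
from scipy import stats

def spi_from_precip(precip):
    """Gamma-based SPI for one series of accumulated precipitation totals."""
    precip = np.asarray(precip, dtype=float)
    wet = precip[precip > 0]
    q = 1.0 - wet.size / precip.size  # empirical probability of a zero total
    # Fit a two-parameter gamma (location fixed at 0) to the non-zero totals.
    shape, _, scale = stats.gamma.fit(wet, floc=0)
    cdf = q + (1.0 - q) * stats.gamma.cdf(precip, shape, loc=0, scale=scale)
    cdf = np.clip(cdf, 1e-6, 1.0 - 1e-6)  # keep the normal quantile finite
    return stats.norm.ppf(cdf)

# Synthetic monthly totals (mm) standing in for a station record.
rng = np.random.default_rng(42)
precip = rng.gamma(shape=2.0, scale=50.0, size=360)
spi = spi_from_precip(precip)
```

Negative SPI values indicate drier-than-median conditions and positive values wetter-than-median conditions, so drought severity classes can be read directly from the transformed series.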

2.5 Model Architecture


The neural network model used for SPI forecasting is based on a Long Short-Term
Memory (LSTM) architecture, as shown in Figure ??. LSTMs are a type of recurrent
neural network (RNN) well suited to sequential data such as time series. They can learn
long-term dependencies in the data, which makes them effective for modeling
time-dependent patterns in SPI values.
The model architecture consists of four layers. The first is an LSTM layer with 64
units, which takes input sequences of shape (time steps, features), where the number of
features is Xtrain.shape[2]. The LSTM layer uses the ReLU activation function and
captures the temporal patterns in the SPI data. To reduce overfitting, a dropout layer
follows the LSTM layer and randomly drops 20% of the units during training, a
regularization mechanism that helps the model generalize to unseen data. The next layer
is a dense layer with 32 units and the ReLU activation function, which further processes
the information learned by the LSTM layer. The final layer is a dense layer with 4 units
and no activation function. It serves as the output layer and predicts the SPI values
for the next time step. Since SPI forecasting is a regression problem, the model outputs
continuous values.
For training the model, we define a custom metric called r2 metric using Keras backend
functions. The R-squared (R²) metric measures how well the model fits the data: it is
the proportion of the variance in the target variable that can be predicted from the
input data, so a higher R² value indicates a better fit. The model is compiled using the
Adam optimizer, an adaptive learning rate optimization algorithm, with a learning rate
of 0.001. The loss function is the mean squared error (MSE) between the predicted and
true SPI values, which the model minimizes during training. Table ?? summarizes the
structure of the neural network.
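As a rough check on the size of the network described above, the trainable parameters of each layer can be counted from the layer widths given in the text (64, 32 and 4). The sketch below uses the standard parameter formulas for an LSTM layer with biases and for dense layers. The number of input features is not stated in this section, so `n_features=4` is an illustrative assumption, and the function names are hypothetical.

```python
def lstm_params(n_features, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (n_features + units) + units)


def dense_params(n_inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return n_inputs * units + units


def layer_summary(n_features):
    """Trainable-parameter count per layer for the stack described in the text."""
    return [
        ("LSTM(64)", lstm_params(n_features, 64)),
        ("Dropout(0.2)", 0),  # dropout has no trainable parameters
        ("Dense(32, ReLU)", dense_params(64, 32)),
        ("Dense(4, linear)", dense_params(32, 4)),
    ]


# n_features=4 is an illustrative assumption, not a value taken from the paper.
for name, count in layer_summary(n_features=4):
    print(f"{name:<18}{count:>8}")
```

Under that assumption the LSTM layer holds most of the parameters, which is typical for this kind of stack.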

2.5.1 LSTM Layer


The first layer is an LSTM (Long Short-Term Memory) layer with 64 units. It takes
input in the form of a 3D tensor with shape (batch size, time steps, input features).
The LSTM layer processes the input sequence and generates an output tensor of
shape (batch size, time steps, 64). The activation function used in this layer is ReLU
(Rectified Linear Unit).

\[ h = \mathrm{LSTM}(x, W_{\mathrm{LSTM}}, b_{\mathrm{LSTM}}) \tag{2} \]
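The 3D input described above, with shape (batch size, time steps, input features), is usually built by sliding a fixed-length window over the monthly series. The following is a minimal sketch of that step. The window length of 3 and the use of the four SPI values as features are assumptions made for illustration, and `make_windows` is a hypothetical helper, not a function from the paper.

```python
def make_windows(series, time_steps):
    """Split a multivariate series into (window, next-row) training pairs.

    `series` is a list of rows, one row per month, each row holding the
    feature values for that month.
    """
    inputs, targets = [], []
    for start in range(len(series) - time_steps):
        inputs.append(series[start:start + time_steps])
        targets.append(series[start + time_steps])
    return inputs, targets


# Made-up rows of (SPI1, SPI3, SPI6, SPI12) values for six consecutive months.
series = [
    [0.1, 0.3, 0.2, 0.0],
    [-0.4, 0.1, 0.2, 0.1],
    [-1.1, -0.5, 0.0, 0.1],
    [-0.8, -0.9, -0.2, 0.0],
    [0.2, -0.6, -0.3, -0.1],
    [0.9, 0.1, -0.2, -0.1],
]
x, y = make_windows(series, time_steps=3)
print(len(x), len(x[0]), len(x[0][0]))  # samples, time steps, features
```

Each target row is the month that immediately follows its window, which matches the next-time-step prediction described for the output layer.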


The LSTM layer performs the following computations for each timestep $t$ in the
input sequence. The input to the LSTM layer at timestep $t$ is denoted as $X_t$, and the
hidden state and cell state at timestep $t-1$ are denoted as $h_{t-1}$ and $c_{t-1}$,
respectively.
- Input gate ($i_t$):

\[ i_t = \sigma(W_{xi} \cdot X_t + W_{hi} \cdot h_{t-1} + b_i) \]

- Forget gate ($f_t$):

\[ f_t = \sigma(W_{xf} \cdot X_t + W_{hf} \cdot h_{t-1} + b_f) \]

- Output gate ($o_t$):

\[ o_t = \sigma(W_{xo} \cdot X_t + W_{ho} \cdot h_{t-1} + b_o) \]

- Cell state update ($g_t$):

\[ g_t = \tanh(W_{xc} \cdot X_t + W_{hc} \cdot h_{t-1} + b_c) \]

- New cell state ($c_t$):

\[ c_t = f_t \odot c_{t-1} + i_t \odot g_t \]

- Hidden state ($h_t$):

\[ h_t = o_t \odot \tanh(c_t) \]

Where:
- $W_{xi}$, $W_{xf}$, $W_{xo}$, $W_{xc}$ are the weight matrices for the input gate,
  forget gate, output gate, and cell state update, respectively.
- $W_{hi}$, $W_{hf}$, $W_{ho}$, $W_{hc}$ are the weight matrices for the corresponding
  gates with respect to the previous hidden state $h_{t-1}$.
- $b_i$, $b_f$, $b_o$, $b_c$ are the bias vectors for the input gate, forget gate,
  output gate, and cell state update, respectively.
- $\odot$ denotes element-wise multiplication.
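The gate equations above can be traced with a small, dependency-free sketch of a single LSTM time step. It is an illustration only: the parameter layout (`params[name]` holding the input weights, recurrent weights and bias of one gate) and all names are hypothetical. The activation `act` defaults to tanh as in the equations. Passing a ReLU function instead would correspond to the ReLU activation mentioned in the text, since in Keras the `activation` argument of an LSTM layer replaces tanh in both the cell update and the hidden state.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(w_x, x, w_h, h, b):
    """Compute W_x x + W_h h + b for one gate, one output row at a time."""
    return [
        sum(wx * xv for wx, xv in zip(row_x, x))
        + sum(wh * hv for wh, hv in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(w_x, w_h, b)
    ]


def lstm_step(x_t, h_prev, c_prev, params, act=math.tanh):
    """One LSTM time step; returns the new hidden state and cell state."""
    def gate(name, fn):
        w_x, w_h, b = params[name]
        return [fn(v) for v in affine(w_x, x_t, w_h, h_prev, b)]

    i_t = gate("i", sigmoid)  # input gate
    f_t = gate("f", sigmoid)  # forget gate
    o_t = gate("o", sigmoid)  # output gate
    g_t = gate("c", act)      # candidate cell state update
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * act(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy example: 1 input feature, 2 units, all weights and biases set to zero,
# so every gate equals 0.5 and the candidate update equals 0.
zero = ([[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
params = {name: zero for name in "ifoc"}
h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(h_t, c_t)
```

With zero weights the forget gate simply halves the previous cell state, which makes the output easy to verify by hand.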

2.5.2 Dropout Layer


A dropout layer with a dropout rate of 20% is added after the LSTM layer. Dropout is a
regularization technique that helps prevent overfitting: during training, each unit in
the output of the LSTM layer is set to zero with a probability of 0.2 (20%).

\[ h_{\mathrm{dropout}} = \mathrm{Dropout}(h, 0.2) \tag{3} \]
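The effect of Equation (3) can be illustrated with a short sketch of inverted dropout, which is how the Keras `Dropout` layer behaves: units are zeroed with probability equal to the rate during training, the surviving units are scaled by 1 / (1 - rate), and the layer does nothing at inference time. The function below is a hypothetical stand-in written for illustration, not the library code.

```python
import random


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate` during
    training and rescale the survivors by 1 / (1 - rate)."""
    if not training or rate == 0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
h = [1.0] * 10000
h_dropout = dropout(h, rate=0.2, rng=rng)
dropped = h_dropout.count(0.0) / len(h)
print(f"fraction dropped: {dropped:.3f}")
```

The rescaling keeps the expected value of each unit unchanged, so the layers that follow see inputs on the same scale during training and inference.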

2.5.3 Dense Layers


Following the dropout layer, a dense layer with 32 neurons is added. The ReLU
activation function is applied element-wise to the output of this layer, introducing
non-linearity to the model.

\[ h_{D1} = \mathrm{ReLU}(h_{\mathrm{dropout}} \cdot W_{D1} + b_{D1}) \tag{4} \]


The final dense layer (output layer) consists of 4 neurons, one for each predicted SPI
value (SPI1, SPI3, SPI6, SPI12), so the output dimension of this layer is 4. No
activation function is applied: the output of the first dense layer is transformed by a
linear operation that directly produces the SPI predictions.

\[ y = h_{D1} \cdot W_{D2} + b_{D2} \tag{5} \]
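Equations (4) and (5) amount to two affine maps, the first followed by ReLU. A minimal sketch with toy dimensions is given below. The real layers map 64 values to 32 and 32 values to 4, whereas the example maps 2 to 2 to 4 so that the numbers can be checked by hand. The weights are made up and the helper names are hypothetical.

```python
def relu(v):
    return max(0.0, v)


def dense(inputs, weights, biases, activation=None):
    """Compute inputs · W + b, where weights[j][k] links input j to unit k."""
    outputs = []
    for k, bias in enumerate(biases):
        total = bias + sum(inputs[j] * weights[j][k] for j in range(len(inputs)))
        outputs.append(activation(total) if activation else total)
    return outputs


# Toy values for illustration only.
h_dropout = [1.0, 2.0]
w_d1 = [[1.0, -1.0], [0.5, 0.5]]
b_d1 = [0.0, -0.5]
w_d2 = [[0.1, 0.2, 0.3, 0.4], [1.0, 1.0, 1.0, 1.0]]
b_d2 = [0.0, 0.0, 0.0, -1.0]

h_d1 = dense(h_dropout, w_d1, b_d1, activation=relu)  # Equation (4)
y = dense(h_d1, w_d2, b_d2)                           # Equation (5), linear output
for name, value in zip(["SPI1", "SPI3", "SPI6", "SPI12"], y):
    print(f"{name}: {value:+.2f}")
```

The output layer is left linear so that it can produce negative values, which is necessary because SPI is negative under drought conditions.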

2.6 Model Training


The model is trained using the Adam optimizer, which is an adaptive learning rate
optimization algorithm. The mean squared error (MSE) loss function is used to measure
the discrepancy between the predicted SPI values and the actual SPI values during
training. Additionally, a custom metric, R-squared (R²), is tracked to evaluate the
model's performance during training. R-squared is a statistical measure of the
proportion of the variance in the dependent variable (the actual SPI values) that is
explained by the model's predictions.
The training is performed over 100 epochs with a batch size of 2. The batch size
determines the number of samples processed before the model's weights are updated.
Smaller batch sizes update the weights more often within each epoch, which can help
convergence, but the larger number of update steps also increases the computation time
per epoch.
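The loss and metric named above have simple closed forms, sketched here in plain Python. The functions are illustrative stand-ins for the Keras loss and the custom metric, not the authors' code, and the sample count of 400 used to show the number of weight updates per epoch is a made-up figure.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot: 1 for a perfect fit, 0 for predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass over the training data."""
    return math.ceil(n_samples / batch_size)


# Made-up SPI values for illustration.
y_true = [-1.0, 0.0, 1.0, 2.0]
y_pred = [-0.5, 0.0, 1.0, 1.5]
print(mse(y_true, y_pred), r_squared(y_true, y_pred))
print(updates_per_epoch(400, 2))  # 400 training samples is a hypothetical count
```

With a batch size of 2, the number of updates per epoch is half the number of training samples, which is why small batches lengthen each epoch.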

2.7 Model Testing and Validation


The model’s performance is evaluated on two separate datasets: the test dataset and
the validation dataset.
1. Testing: The model’s predictions are evaluated on the test dataset, which con-
tains unseen data. The test loss is computed, representing the mean squared error
between the predicted and actual SPI values on the test dataset. Additionally, the
test R-squared is calculated to determine how well the model’s predictions explain
the variance in the test dataset’s SPI values.
2. Validation: During training, the model’s performance is monitored using the val-
idation dataset, which serves as a proxy for the model’s ability to generalize to
unseen data. The validation loss and validation R-squared are computed similarly
to the testing metrics.
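One common way to obtain the validation and test datasets for a time series is a chronological split, in which the most recent samples are held out. The sketch below shows that approach together with the mean absolute error reported in the results. The section does not state how the data were divided, so the split order and the 60/20/20 proportions are assumptions, and the function names are hypothetical.

```python
def chronological_split(samples, val_fraction, test_fraction):
    """Split time-ordered samples into train, validation and test blocks
    without shuffling, so later months never appear in an earlier block."""
    n = len(samples)
    n_test = round(n * test_fraction)
    n_val = round(n * val_fraction)
    n_train = n - n_val - n_test
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# 100 placeholder samples, indexed in time order.
samples = list(range(100))
train, val, test = chronological_split(samples, val_fraction=0.2, test_fraction=0.2)
print(len(train), len(val), len(test))
print(mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))
```

Keeping the blocks in time order avoids evaluating the model on months that precede its training data.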

3 Results
3.1 Model Evaluation
The proposed LSTM model exhibited strong predictive power and generalization
capability for SPI forecasting in Sreemangal, Jessore, and Sylhet. The model achieved
high Test R-squared values, from 0.8882 (Sylhet) to 0.9277 (Sreemangal), and Validation
R-squared values within 0.015 of them, as well as low Test MSE and Test MAE values, for
all three datasets.

Table 2: Model Performance Comparison (Test R-squared) in Sreemangal, Jessore, and Sylhet

Model              Sreemangal   Jessore   Sylhet
LSTM               0.9277       0.8927    0.8882
MLR                0.8453       0.8603    0.8705
Random Forest      0.9124       0.8991    0.8947
Another NN Model   0.8956       0.8755    0.8793

Overall, the model’s architecture, which incorporates LSTM layers and dropout
regularization, effectively captured time-dependent patterns in SPI values and general-
ized well to unseen data. However, continuous monitoring and evaluation are essential
to ensure the model’s accuracy and reliability as new data becomes available. Addi-
tionally, further optimization and fine-tuning of hyperparameters may lead to even
better performance on the datasets.

3.2 Model Comparison


Table 3 compares the LSTM model with MLR, Random Forest, and Another NN Model (the
rows labeled Calculated) in predicting SPI values for the three districts. In
Sreemangal, the LSTM model achieved the highest R-squared values and the lowest MSE and
MAE values of all models. In Jessore, Random Forest scored slightly better than the LSTM
model on every metric, and in Sylhet Random Forest had slightly higher R-squared values
while the LSTM model had lower MSE and MAE values. The LSTM model outperformed MLR and
Another NN Model on every metric in all three districts.

Table 3: Performance Comparison in Sreemangal, Jessore, and Sylhet

Location     Model           Test R-squared   Val R-squared   Test MSE   Test MAE
Sreemangal   LSTM            0.9277           0.9274          0.723      0.717
Sreemangal   MLR             0.8453           0.8215          0.891      0.819
Sreemangal   Random Forest   0.9124           0.8989          0.742      0.724
Sreemangal   Calculated      0.8956           0.8812          0.796      0.775
Jessore      LSTM            0.8927           0.8780          0.7847     0.767
Jessore      MLR             0.8603           0.8372          0.857      0.822
Jessore      Random Forest   0.8991           0.8845          0.761      0.742
Jessore      Calculated      0.8755           0.8601          0.802      0.785
Sylhet       LSTM            0.8882           0.8777          0.778      0.7651
Sylhet       MLR             0.8705           0.8520          0.811      0.796
Sylhet       Random Forest   0.8947           0.8810          0.791      0.777
Sylhet       Calculated      0.8793           0.8655          0.802      0.789
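The ranking in Table 3 can be checked mechanically. The short sketch below copies the Test R-squared column from the table and reports, for each location, the best-scoring model and the margin between the LSTM model and the best of the other models. The helper name is hypothetical, and the snippet only restates the tabulated values.

```python
# Test R-squared values copied from Table 3.
test_r2 = {
    "Sreemangal": {"LSTM": 0.9277, "MLR": 0.8453,
                   "Random Forest": 0.9124, "Calculated": 0.8956},
    "Jessore": {"LSTM": 0.8927, "MLR": 0.8603,
                "Random Forest": 0.8991, "Calculated": 0.8755},
    "Sylhet": {"LSTM": 0.8882, "MLR": 0.8705,
               "Random Forest": 0.8947, "Calculated": 0.8793},
}


def best_model(scores):
    """Return the model name with the highest score."""
    return max(scores, key=scores.get)


for location, scores in test_r2.items():
    winner = best_model(scores)
    best_other = max(v for k, v in scores.items() if k != "LSTM")
    margin = scores["LSTM"] - best_other
    print(f"{location}: best = {winner}, LSTM margin = {margin:+.4f}")
```

A positive margin means the LSTM model has the highest test R-squared at that location, and a negative margin means another model scored higher.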

4 Discussion
The LSTM model demonstrated strong performance in forecasting SPI values for
Sreemangal, Jessore, and Sylhet. The model's architecture effectively captured
time-dependent patterns in SPI values and generalized well to unseen data. Overall, the
results suggest that the LSTM model is a reliable tool for SPI forecasting at these
stations.
Analyzing rainfall and SPI (Standardized Precipitation Index) data is crucial for
understanding and predicting droughts. SPI is a widely used drought index that measures
the deviation of precipitation from its long-term average, accounting for the spatial
and temporal variability of precipitation. SPI 1, SPI 3, SPI 6, and SPI 12 refer to the
SPI values calculated from precipitation accumulated over the past 1, 3, 6, and 12
months, respectively.
Rainfall and SPI data provide valuable information about the water balance of an
area and its susceptibility to drought. A negative SPI value indicates a rainfall deficit,
which can lead to drought conditions. Analyzing SPI values over different timescales
provides insight into the persistence and severity of droughts. For instance, a low SPI
6 value may indicate that the area has been experiencing a prolonged drought, while
a low SPI 1 value may suggest a more recent onset of drought conditions.
By analyzing rainfall and SPI data, areas at risk of drought can be identified, and
appropriate measures can be taken to mitigate its impacts. For example, water
conservation and management practices can be implemented to reduce water consumption
during droughts. Agricultural practices can also be adjusted to account for lower water
availability. In addition, early warning systems can be developed to alert communities
and policymakers about impending drought conditions, allowing proactive measures to be
taken. Overall, analyzing rainfall and SPI data is essential for managing the impacts of
drought and ensuring the sustainable use of water resources. The findings of this study
indicate that the study area faces a risk of short-term drought but a lower chance of
long-term drought.


References
[1] Anderson, M. C., Norman, J. M., Mecikalski, J. R., Otkin, J. A. & Kustas,
W. P. A climatological study of evapotranspiration and moisture stress across the
continental united states based on thermal remote sensing: 2. surface moisture
climatology. Journal of Geophysical Research: Atmospheres 112 (2007).

[2] Garrett, K. et al. Climate change effects on pathogen emergence: Artificial intel-
ligence to translate big data for mitigation. Annual Review of Phytopathology 60,
357–378 (2022).

[3] Dikshit, A., Pradhan, B. & Alamri, A. M. Pathways and challenges of the appli-
cation of artificial intelligence to geohazards modelling. Gondwana Research 100,
290–301 (2021).

[4] Pham, Q. B. et al. Groundwater level prediction using machine learning
algorithms in a drought-prone area. Neural Computing and Applications 34,
10751–10773 (2022).

[5] Steinhaeuser, K., Chawla, N. V. & Ganguly, A. R. An exploration of climate data
using complex networks, 23–31 (2009).

[6] Fernández-Manso, A., Quintano, C. & Fernández-Manso, O. Forecast of NDVI in
coniferous areas using temporal ARIMA analysis and climatic data at a regional
scale. International Journal of Remote Sensing 32, 1595–1617 (2011).

[7] Goetz, S. Multi-sensor analysis of NDVI, surface temperature and biophysical
variables at a mixed grassland site. International Journal of Remote Sensing 18,
71–94 (1997).

[8] Wilhite, D. A. & Glantz, M. H. Understanding: the drought phenomenon: the
role of definitions. Water International 10, 111–120 (1985).

[9] Pinkayan, S. Conditional probabilities of occurrence of wet and dry years over a
large continental area. Ph.D. thesis, Colorado State University. Libraries (1966).

[10] Panu, U. & Sharma, T. Challenges in drought research: some perspectives and
future directions. Hydrological Sciences Journal 47, S19–S30 (2002).

[11] Dracup, J. A., Lee, K. S. & Paulson Jr, E. G. On the statistical characteristics
of drought events. Water resources research 16, 289–296 (1980).

[12] Zecharias, Y. B. & Brutsaert, W. The influence of basin morphology on
groundwater outflow. Water Resources Research 24, 1645–1650 (1988).

[13] Mishra, A. K. & Singh, V. P. A review of drought concepts. Journal of Hydrology
391, 202–216 (2010).

[14] Manjula, M., Kumari, N. V. et al. Worldwide scenario of drought in general and
effect on mulberry in particular-a review. Int. J. Agric. Technol 11, 803–810
(2015).

[15] McKee, T. B., Doesken, N. J., Kleist, J. et al. The relationship of drought
frequency and duration to time scales, Vol. 17, 179–183 (Boston, MA, USA, 1993).

[16] Hayes, M. J., Svoboda, M. D., Wilhite, D. A. & Vanyarkho, O. V. Monitoring the
1996 drought using the standardized precipitation index. Bulletin of the American
meteorological society 80, 429–438 (1999).

[17] Agnew, C. Using the spi to identify drought (2000).

[18] Guttman, N. B. Accepting the standardized precipitation index: a calculation
algorithm 1. JAWRA Journal of the American Water Resources Association 35,
311–322 (1999).

[19] Nalbantis, I. & Tsakiris, G. Assessment of hydrological drought revisited. Water
Resources Management 23, 881–897 (2009).

[20] Paulo, A. A. & Pereira, L. S. Prediction of spi drought class transitions using
markov chains. Water resources management 21, 1813–1827 (2007).

[21] Moreira, E. E., Coelho, C. A., Paulo, A. A., Pereira, L. S. & Mexia, J. T. Spi-
based drought category prediction using loglinear models. Journal of hydrology
354, 116–130 (2008).

[22] Basheer, I. A. & Hajmeer, M. Artificial neural networks: fundamentals,
computing, design, and application. Journal of Microbiological Methods 43, 3–31
(2000).

[23] Campos, H., Cooper, M., Habben, J., Edmeades, G. & Schussler, J. Improving
drought tolerance in maize: a view from industry. Field crops research 90, 19–34
(2004).

[24] Cutter, S. L., Mitchell, J. T. & Scott, M. S. Revealing the vulnerability of people
and places: A case study of georgetown county, south carolina. Annals of the
association of American Geographers 90, 713–737 (2000).

[25] Partal, T. & Cigizoglu, H. K. Prediction of daily precipitation using
wavelet-neural networks. Hydrological Sciences Journal 54, 234–246 (2009).

[26] Abbot, J. & Marohasy, J. Application of artificial neural networks to rainfall
forecasting in Queensland, Australia. Advances in Atmospheric Sciences 29,
717–730 (2012).

[27] Sobhani, B., Safarian Zengir, V. & Kianian, M. Modeling, monitoring and pre-
diction of drought in iran. Iranian (Iranica) Journal of Energy & Environment
10, 216–224 (2019).

[28] Herzberg, R., Pham, T. G., Kappas, M., Wyss, D. & Tran, C. T. M. Multi-criteria
decision analysis for the land evaluation of potential agricultural land use types
in a hilly area of central vietnam. Land 8, 90 (2019).

[29] Subedi, R. et al. Climate-smart practices for improvement of crop yields in mid-
hills of nepal. Cogent Food & Agriculture 5, 1631026 (2019).

[30] Chattopadhyay, N. Combating effect of climate change and climatic variability
on Indian agriculture through smart weather forecasting and ICT application.
Agriculture under climate change: Threats, strategies and policies 3–8 (2017).

[31] Togliatti, K., Archontoulis, S. V., Dietzel, R., Puntel, L. & VanLoocke, A. How
does inclusion of weather forecasting impact in-season crop model predictions?
Field Crops Research 214, 261–272 (2017).

[32] Jakaria, A., Hossain, M. M. & Rahman, M. A. Smart weather forecasting using
machine learning: a case study in tennessee. arXiv preprint arXiv:2008.10789
(2020).

[33] Sein, Z. M. M. et al. Spatio-temporal analysis of drought variability in Myanmar
based on the standardized precipitation evapotranspiration index (SPEI) and its
impact on crop production. Agronomy 11, 1691 (2021).

[34] Malik, A. et al. Drought index prediction using advanced fuzzy logic model:
Regional case study over kumaon in india. Plos one 15, e0233280 (2020).
