
Covid-19 Short-Term Forecasting in Bangladesh Using Supervised Machine Learning

COVID-19 is a human-to-human transmissible virus responsible for damage to the human body, and people died all over the world. Bangladesh was affected by COVID-19 on March 8th, 2020. During the pandemic, people and the government struggled to prevent transmission due to an inadequate supply of vaccines and healthcare equipment. Therefore, it is essential to understand the upcoming infected cases for several days.

Volume 8, Issue 6, June – 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Covid-19 Short-Term Forecasting in Bangladesh Using Supervised Machine Learning

Md Mahbubur Rahman1, Farhana Tazmim Pinki2
1,2 Computer Science and Engineering Discipline, Khulna University, Khulna – 9206, Bangladesh

Abstract:- COVID-19 is a virus transmitted from human to human that damages the human body and has caused deaths all over the world. Bangladesh recorded its first COVID-19 case on March 8th, 2020. During the pandemic, people and the government struggled to prevent transmission because of an inadequate supply of vaccines and healthcare equipment. It is therefore essential to anticipate the number of infected cases several days ahead, which may help people and the government take decisions in advance and save lives. In this paper, we propose a COVID-19 short-term forecasting model using Linear Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO) Regression, and Support Vector Regression (SVR) to predict the next seven days of COVID-19 infected cases in Bangladesh during the pandemic. We considered data from 8th May 2021 to 21st July 2021 and analyzed different past data volumes to understand their impact on the model. The results reveal that SVR performed better than LR and LASSO in all scenarios, with high accuracy. The results also indicate that a higher volume of past data helps to increase prediction accuracy.

Keywords:- COVID-19, Short-term Forecast, New infected cases, Bangladesh, Supervised Machine Learning, LR, LASSO, SVR.

I. INTRODUCTION

The 21st-century world is facing a new pandemic caused by COVID-19. The first COVID-19 case was identified in Wuhan, China, in December 2019 [1]. According to the World Health Organization, COVID-19 symptoms include fever, dry cough, shortness of breath, fatigue, bodily pain, sore throat, etc. [2]. Because of its high human-to-human transmissibility, the disease has been increasingly unpredictable and has spread rapidly worldwide. As of January 30, 2023, the world had reported over 200 million confirmed cases and over 4 million deaths due to COVID-19 [3]. The pandemic has also affected global health and economies, causing economic downturns and disruptions to daily life. Bangladesh reported its first case on March 8, 2020; since then, the number of cases has escalated rapidly. The official statistics of the Institute of Epidemiology, Disease Control and Research (IEDCR) show 2,037,386 infected cases and 29,441 confirmed deaths up to 19th January 2023 [4] [5]. In the early stage of the pandemic, Bangladesh's government adopted strict actions, including lockdowns, stay-at-home recommendations, cancellation of mass gatherings, closure of schools and non-essential shops, flight bans, sealed international borders, and restrictions on export-import transportation. These actions have significantly affected the economy, with many businesses struggling and many people losing their jobs. The healthcare system in Bangladesh has also been under significant strain, with hospitals and healthcare facilities facing shortages of personal protective equipment, oxygen, and other essential supplies. The country has struggled to ramp up its testing, and there are concerns about the accuracy of its COVID-19 data. The Bangladeshi government carried out COVID-19 mass vaccination campaigns to contain the outbreak swiftly and administered 346,605,580 vaccine doses in three steps up to 15 January 2023 [4]. An inadequate supply of vaccines makes it challenging to normalize the situation. Virus mutation also prevents the vaccine from being fully effective in the human body. Population density is another concern for maintaining the necessary measures and delivering proper vaccine doses. In that case, it would be much more effective if the government and healthcare industries could know the upcoming transmission in advance. Since the beginning, Bangladesh has faced three pandemic waves of COVID-19. The first wave started on 8th March 2020. The second wave, of the Beta variant, started in mid-March 2021, and the third wave, of the Delta variant, started on 08 May 2021. Each new variant of COVID-19 created a new pandemic situation and rapidly increased confirmed cases and deaths [6] [7]. Significant increases in transmission affect daily life, and healthcare organizations also struggle to prevent transmission. Therefore, measuring the downturn of COVID-19 transmission in that situation is almost impossible. It is thus essential to understand how COVID-19 may spread in the near term in order to make sound decisions to prevent transmission. In this regard, short-term forecasting of COVID-19 transmission helps to understand and predict upcoming scenarios, manage public health efficiently, and optimize healthcare resources in a pandemic. The healthcare system can also prepare and provide the best support during the pandemic, which may help save lives.

II. LITERATURE REVIEW

The application of machine learning is presently an advanced approach for solving problems in several fields, such as business, industry, and science. Machine learning algorithms have shown promising results, with high accuracy, in forecasting from existing data.

IJISRT23JUN1007 www.ijisrt.com 1127


Several researchers have produced forecasts for Bangladesh using different machine-learning methods. From traditional methods to deep learning, this remains an open research interest. [8] developed a cloud-based model for forecasting COVID-19 infected cases in Bangladesh. Several machine learning models were employed to forecast the next seven days by training on previous sample data. They considered Bangladesh's COVID-19 infection and fatality data from the beginning of the outbreak to 10th June 2020. The data was split into 35 subsets, and each subset was trained and predicted individually. All models were evaluated by RMSE, MAE, and R2 values. For both infection and death cases, the Prophet model produced the best results. Another study by [9] demonstrates how machine learning models can be used to predict COVID-19 virus transmission in the upcoming days. Linear regression, support vector machines, lasso regression, and exponential smoothing (ES) methods were used to forecast the next ten days of COVID-19 infected cases. The results show that ES performs best among all models. LR and LASSO perform better for newly confirmed, death, and recovery cases. SVM exhibits low performance in all prediction scenarios for the given dataset. Another study by [10] evaluated the effectiveness of convolutional as well as recurrent neural networks in predicting COVID-19. Long short-term memory (LSTM), gated recurrent unit (GRU), convolutional neural network (CNN), and multivariate convolutional neural network (MCNN) models were used to forecast probable future outbreaks in Brazil, Russia, and the UK. The study highlighted that CNN performed better than the other algorithms when making forecasts using a few features and historical data. [7] looked into Bangladesh's COVID-19 third-wave data and forecasted the COVID-19 situation. Data from Mar 01, 2021, to July 31, 2021, was used to build an Auto-Regressive Integrated Moving Average (ARIMA) prediction model. The findings show that the three months following 1st August were expected to be alarming in Bangladesh. [1] compared the prediction accuracy of Auto-Regressive Integrated Moving Average (ARIMA) and eXtreme Gradient Boosting (XGBoost) in Bangladesh. 633 days of daily confirmed and death cases were collected from the Directorate General of Health Service (DGHS) and IEDCR. The final findings show that the ARIMA model performs better than the XGBoost model in predicting confirmed and death cases in Bangladesh. Most of this work is based on a specific range of data and does not explain how the volume of past data affects forecasting accuracy. We aim to investigate the effect of past data volume on model accuracy using different data scenarios of COVID-19 transmission and to create a short-term forecasting model using linear regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO) regression, and support vector regression (SVR). Many researchers have used a short range of past data to forecast COVID-19 up to 7-30 days ahead [6] [7]. “Forecasts are more accurate for shorter than longer time horizons. The shorter the time horizon of the forecast, the lower the degree of uncertainty. Data do not change very much in the short run” [11]. [12] also reported, for weather data, that “According to NOAA (National Oceanic and Atmospheric Administration U.S. Department of Commerce), a five-day forecast can accurately predict the weather 90 percent of the time. That drops to 80 percent for a seven-day forecast and then down to 50 percent for a 10-day forecast”. In our case, we forecast up to 7 days ahead to obtain higher prediction accuracy for COVID-19.

III. RESEARCH METHODOLOGY

This section gives a detailed overview of the proposed methodology: data collection, preprocessing, and future forecasting of COVID-19 transmission in Bangladesh.

A. Dataset Description
Online sources are now widely accepted as sources of COVID-19 data, and many academic researchers use research data from online sources [9] [13]. For this work, COVID-19 data was obtained from the GitHub repository of Johns Hopkins University's Center for Systems Science and Engineering [14]. The dataset folder (csse_covid_19_time_series) includes the COVID-19 daily time series data report, which is updated once a day. The dataset has six attributes; only the confirmed case data for Bangladesh was used in our work.

B. Data Preprocessing
Fig. 1 shows the COVID-19 infected cases from 2020-01-22 to 2022-01-18, where the most alarming situation occurred on 2021-07-28. The graph shows that infected cases increased very fast from 2021-05-08 to 2021-07-28. Therefore, in this study, we used COVID-19 infected cases data from 2021-05-08 to 2021-07-21 to understand future infected cases in a pandemic. We extracted the daily confirmed cases for Bangladesh, along with the continuous date value, from 8th May 2021 to 21st July 2021. These 75 days of confirmed cases data were selected to predict COVID-19 transmission. The data was split into three datasets for the forecasting models, containing past data volumes of 45, 60, and 75 days. The COVID-19 dataset is supervised data. We also checked the data for inconsistencies and completed all data processing using the Python programming language and the Jupyter Notebook platform.

C. Methodology
We propose three regression methods, Linear Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO) Regression, and Support Vector Regression (SVR), to create the COVID-19 prediction model. For the COVID-19 prediction analysis, the observations were recorded at a daily frequency. The time-step input feature of the forecasting models was constructed from day numbers derived directly from the date index column of the dataset. Confirmed cases are the target feature. The prepared data was divided into two sets for all regression models: 70% training data and 30% testing data. Fig. 2 shows the proposed workflow.
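The preprocessing steps described in Sections III-B and III-C (day-number time steps, the 45/60/75-day datasets, and the 70/30 split) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the case counts are synthetic placeholders, the helper names are hypothetical, each dataset is assumed to be the most recent n days (the paper does not say which end of the range is kept), and the seeded shuffle is an assumption about how the split was drawn.

```python
import random
from datetime import date, timedelta


def make_time_steps(dates):
    """Map each date to a 1-based day number counted from the first date."""
    start = dates[0]
    return [(d - start).days + 1 for d in dates]


def last_n_days(steps, cases, n):
    """Keep the most recent n observations (assumed 45/60/75-day scenario)."""
    return steps[-n:], cases[-n:]


def split_70_30(steps, cases, seed):
    """Shuffle row indices with a fixed seed, then cut 70% train / 30% test."""
    idx = list(range(len(steps)))
    random.Random(seed).shuffle(idx)
    cut = (7 * len(idx)) // 10  # integer arithmetic avoids float rounding
    train, test = idx[:cut], idx[cut:]
    x_train = [steps[i] for i in train]
    y_train = [cases[i] for i in train]
    x_test = [steps[i] for i in test]
    y_test = [cases[i] for i in test]
    return x_train, y_train, x_test, y_test


# Synthetic stand-in for the 75 daily counts (2021-05-08 to 2021-07-21).
dates = [date(2021, 5, 8) + timedelta(days=k) for k in range(75)]
cases = [1000 + 150 * k for k in range(75)]  # placeholder values, not real data
steps = make_time_steps(dates)

for n in (45, 60, 75):
    s, c = last_n_days(steps, cases, n)
    x_tr, y_tr, x_te, y_te = split_70_30(s, c, seed=3)
    print(n, "days:", len(x_tr), "train rows,", len(x_te), "test rows")
```

Fixing the seed keeps the split reproducible, which is presumably the role of the random state values reported later in Table 1.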


Fig. 1. Bangladesh daily new infected cases data from 2020-01-22 to 2022-01-18

Fig. 2. Proposed workflow.

• Linear Regression:
Linear regression is a popular prediction analysis technique in machine learning. It measures how strongly two variables are related. The aim is to fit a straight line to the observed data with minimal error. Observations depend on two variables, x and y, where x is the independent variable and y is the dependent variable. The linear regression equations below show how the dependent variable y is related to x.

$$ y = \beta_0 + \beta_1 x + \varepsilon \tag{1} $$

$$ E(y) = \beta_0 + \beta_1 x \tag{2} $$

Here, $y$ is the dependent variable, $x$ is the independent variable, $\beta_0$ represents the y-intercept, $\beta_1$ is the regression coefficient, and $\varepsilon$ is the error term of the estimate. The total error should be minimized to get the best-fit line; that is, the difference between the actual and predicted values should be as small as possible. The cost function to minimize is defined as:

$$ c = \frac{1}{n} \sum_{i=1}^{n} (pred_i - y_i)^2 \tag{3} $$

Here, $c$ is the cost function, $pred_i$ is the predicted value, $y_i$ is the actual value, and $n$ is the total number of data points.

• Lasso Regression:
“Lasso” stands for Least Absolute Shrinkage and Selection Operator. When linear regression cannot give the required best-fit line or the prediction is overfitted, lasso regression is used to reduce overfitting and improve prediction accuracy. Shrinking extreme coefficient values toward a central value stabilizes the model and reduces overfitting. The L1 regularization technique adds a penalty term equal to the absolute value of the coefficient's magnitude. Coefficients of features that do not improve the best-fit line are shrunk toward zero and can be set exactly to zero, which leaves a sparse model containing only the selected coefficients. The goal of the model is to minimize the following:

$$ \sum_{i=1}^{n} \Big( y_i - \sum_{j} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \tag{4} $$

or, equivalently,

the sum of squared residuals + λ |slope|
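For a single input feature such as the time step used in this paper, both the least-squares cost of Eq. (3) and the L1-penalized objective of Eq. (4) have closed-form minimizers, which makes the shrinkage effect easy to see. The sketch below is illustrative only: it assumes one feature, an unpenalized intercept, and made-up data, and the function names are hypothetical rather than taken from the paper.

```python
def fit_line(x, y, lam=0.0):
    """Minimise sum((y - b0 - b1*x)^2) + lam*|b1| for a single feature.

    lam = 0 gives ordinary least squares; lam > 0 gives the lasso solution
    via soft-thresholding. The intercept b0 is left unpenalised (assumption).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # Soft-threshold the cross-product term by lam/2, then rescale by sxx.
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    b1 = (shrunk if sxy >= 0 else -shrunk) / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def cost(x, y, b0, b1):
    """Mean squared error of the fitted line, as in Eq. (3)."""
    return sum((b0 + b1 * xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]  # exactly y = 1 + 2x

ols = fit_line(x, y)             # no penalty: recovers slope 2, intercept 1
lasso = fit_line(x, y, lam=8.0)  # moderate penalty: slope shrinks below 2
flat = fit_line(x, y, lam=1e6)   # very large penalty: slope becomes zero

print("OLS:", ols, "cost:", cost(x, y, *ols))
print("LASSO:", lasso, "cost:", cost(x, y, *lasso))
print("Large penalty:", flat)
```

With a small penalty the two fits are nearly identical, which is consistent with the almost equal LR and LASSO scores the paper reports for this single-feature setup.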


Here, λ is a tuning parameter for L1 regularization, where 0 ≤ λ < ∞, and λ |slope| is the penalty term. We aim to find a best-fit line that accepts a small increase in bias in exchange for lower variance.

• Support vector regression:
Support vector regression is well known for regression-based model prediction and analysis. The method is used for both linear and non-linear data. Whereas linear regression minimizes the sum of squared residuals to get the best-fit line, SVR creates a hyperplane and two parallel marginal planes to find a soft margin for the given dataset. The main aim is to maximize the distance between the marginal planes, minimize the error rate, and create a hyperplane for the best-fit line. SVR sets the acceptable error for the model using ϵ and finds the best-fit line. If a data point falls outside ϵ, SVR can still measure its deviation from the margin by ξ. The final objective function and constraints are as follows:

Minimize:

$$ \min \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} |\xi_i| \tag{5} $$

Constraints:

$$ |y_i - w_i x_i| \le \varepsilon + |\xi_i| \tag{6} $$

Here, $\|w\|$ is the norm of the weight vector, $C$ is the regularization constant that weights the penalty on deviations beyond the margin, $\varepsilon$ is the half-width of the margin (deviations between $-\varepsilon$ and $+\varepsilon$ are not penalized), $\xi$ is the distance from the marginal plane to a data point outside it, $y_i$ is the actual value, and $w_i x_i$ is the predicted value. Fig. 3 shows an illustrative example of the support vector machine. Nonlinear input data is transformed into a higher-dimensional space using a nonlinear function called a kernel function. The basic idea is that the input space is mapped from a low dimension to a higher-dimensional space, so that the data becomes linearly separable in the new space. The kernel technique helps us find a hyperplane with high accuracy and low error. We test SVR with polynomial kernels under different tuning parameters and finally pick the best model for the given dataset.

Fig. 3. Support vector Machine

D. Evaluation Parameters:
The study evaluates all the learning models using the most widely used evaluation metrics. Model performance is measured by R-squared ($R^2$), Mean Square Error (MSE), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE).

• R-squared
R-squared is a statistical measure that determines the goodness of fit of a regression model by comparing the residual sum of squares ($SS_{res}$) with the total sum of squares ($SS_{tot}$). The ideal value of $R^2$ is 1; a value closer to 1 indicates a better fit of the model.

$$ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \tag{7} $$

Here, $y_i$ is the actual data point, $\hat{y}_i$ is the predicted data point, and $\bar{y}$ is the mean of the actual values.

• Mean Square Error
Mean square error is the average squared distance between the data points and the regression line. Squaring removes negative signs and gives more weight to larger differences. A smaller MSE, close to 0, indicates a better fit of the model, whereas a high value means a high error rate.

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{8} $$

$n$ is the total number of data points, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value of the data point.

• Root Mean Square Error
RMSE is another error metric for numerical prediction. It calculates the standard deviation of the residuals. Standard deviation σ measures the spread of data points around the mean, and the residual prediction error is measured by calculating the distance between the predicted value and the actual value. RMSE gives the average magnitude of the error. A lower value, close to 0, indicates a better fit of the model. The formula of RMSE follows:


$$ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{9} $$

$$ \sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2} \tag{10} $$

$n$ is the total number of data points, $y_i$ is the actual value, $\hat{y}_i$ is the predicted value of the data point, $x_i$ is the actual data point, and $\mu$ is the mean of the actual data points.

• Mean Absolute Error
Mean absolute error is the average absolute difference between the actual and predicted values. MAE ranges from 0 to infinity, where a lower score indicates better performance of the model. The formula is as follows:

$$ MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \tag{11} $$

$n$ is the total number of data points, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value.

E. Hyperparameter Tuning
Sometimes the default settings of a regression model do not give the best result, so parameter tuning is essential to improve the results. In our case, several parameters were tuned to get the best result using LR, LASSO, and SVR. Table 1 shows the hyperparameter tuning for LR, LASSO, and SVR in all data scenarios.

Table 1. The Hyperparameters of best-performing regression models.
Models | Hyperparameters, 45 Days Dataset | Hyperparameters, 60 Days Dataset | Hyperparameters, 75 Days Dataset
LR | Random State = 13 | Random State = 58 | Random State = 3
LASSO | Random State = 13 | Random State = 58 | Random State = 3
SVR | Random State = 3, kernel="poly", C=0.3, gamma=0.3, degree=3 | Random State = 7, kernel="poly", C=0.3, gamma=0.3, degree=3 | Random State = 5, kernel="poly", C=0.3, gamma=0.3, degree=3

IV. RESULTS AND DISCUSSION

This work uses machine learning techniques to create a short-term forecasting model for daily new infections caused by the COVID-19 virus. The dataset in this study contains the number of infections that occur each day in Bangladesh. Fig. 1 shows a sharp increase in cases with the arrival of new variants, which is alarming for the country. This study attempts to develop a forecasting model that can predict the upcoming 7 days of newly infected cases by analyzing the previous infected cases data. Three different datasets (45 days, 60 days, and 75 days) have been used for the model. The study also measured the impact of past data volume on future forecasting. Three machine learning models, LR, LASSO, and SVR, were used to forecast the upcoming 7 days of newly infected cases.

A. Forecasting using 45 days Dataset
The study predicted newly infected cases by analyzing the infected cases of the past 45 days. According to the models' performance and results, SVR performs better than the LR and LASSO models and gives the best result for the given dataset. LR and LASSO performed almost the same, with equal R² scores. The outcome is displayed in Table 2.

Table 2. Models' performance on future forecasting for infected cases (45 days dataset).
Model | R² Score | MSE | MAE | RMSE
LR | 0.80 | 227815.61 | 363.27 | 477.30
LASSO | 0.80 | 227854.33 | 363.31 | 477.34
SVR | 0.93 | 79352.30 | 254.05 | 281.69

Figures 4, 5 and 6 show the performance of the LR, LASSO and SVR models, respectively. The figures show that the predicted infected cases have an increasing trend for the upcoming 7 days, which is very alarming.

Fig. 4. Infected cases prediction by LR for the upcoming 7 days.
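The SVR settings in Table 1 name a polynomial kernel with gamma = 0.3 and degree = 3, and the constraint in Eq. (6) tolerates errors up to ε before any slack is charged. A minimal sketch of both pieces is given below. It assumes the common kernel form (gamma · ⟨x, z⟩ + coef0)^degree with coef0 = 0 and a hypothetical ε of 0.1; the paper states neither the offset, nor ε, nor the library used, so these are assumptions made only for illustration.

```python
def poly_kernel(x, z, gamma=0.3, degree=3, coef0=0.0):
    """Polynomial kernel (gamma * <x, z> + coef0) ** degree for two vectors."""
    dot = sum(a * b for a, b in zip(x, z))
    return (gamma * dot + coef0) ** degree


def eps_insensitive(y_true, y_pred, eps=0.1):
    """Slack for one point: zero inside the eps-tube, linear outside it."""
    return max(abs(y_true - y_pred) - eps, 0.0)


# One-dimensional time-step inputs, matching the single-feature setup.
k = poly_kernel([2.0], [5.0])          # (0.3 * 10) ** 3
inside = eps_insensitive(10.0, 10.05)  # error below eps, so no slack
outside = eps_insensitive(10.0, 10.5)  # error of 0.5 leaves slack of about 0.4

print(k, inside, outside)
```

The cubic kernel lets the fitted curve bend with the case counts, which a straight line in the time step cannot do.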


Fig. 5. Infected cases prediction by LASSO for the upcoming 7 days.

Fig. 6. Infected cases prediction by SVR for the upcoming 7 days.

B. Forecasting using 60 days Dataset
The prediction models based on the past 60 days of data achieve higher R² scores than those based on the 45 days dataset. The R² scores of LR, LASSO and SVR all increased, and the results show that the SVR model performs better than the other models. LR and LASSO performed almost identically, with equal R² scores. Table 3 displays the outcomes.

Table 3. Models' performance on future forecasting for infected cases (60 days dataset).
Model | R² Score | MSE | MAE | RMSE
LR | 0.81 | 1320050.68 | 936.60 | 1148.93
LASSO | 0.81 | 1320058.97 | 936.59 | 1148.93
SVR | 0.97 | 170617.92 | 311.09 | 413.05

Figures 7, 8 and 9 show the performance of the LR, LASSO and SVR models, respectively. The predictions indicate an increasing trend of infected cases for the upcoming 7 days. Compared with the other models, SVR performs better in this scenario.

Fig. 7. Infected cases prediction by LR for the upcoming 7 days.

Fig. 8. Infected cases prediction by LASSO for the upcoming 7 days.

Fig. 9. Infected cases prediction by SVR for the upcoming 7 days.

C. Forecasting using 75 days Dataset
The performance of all models using the 75 days dataset was quite promising. The results show that SVR performs better than the other models. The R² scores of LR and LASSO also improved, and both have almost identical performance with equal R² scores. Table 4 displays the findings.

Table 4. Models' performance on future forecasting for infected cases (75 days dataset).
Model | R² Score | MSE | MAE | RMSE
LR | 0.88 | 1954244.90 | 1217.47 | 1397.94
LASSO | 0.88 | 1954258.42 | 1217.47 | 1397.94
SVR | 0.96 | 643586.37 | 593.40 | 802.23

Figures 10, 11 and 12 show the performance of the LR, LASSO and SVR models, respectively, where the models' predictions are quite promising. On the 75 days dataset, the SVR model gives the best prediction accuracy. LR and LASSO also improved on this dataset, and both models have the same results with equal R² scores.

Fig. 10. Infected cases prediction by LR for the upcoming 7 days.

Fig. 11. Infected cases prediction by LASSO for the upcoming 7 days.

Fig. 12. Infected cases prediction by SVR for the upcoming 7 days.
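Each forecast in this section extends a fitted model seven days past the last observed time step. A minimal sketch of that extrapolation step is shown below, assuming a straight-line model; the function name and the coefficients b0 and b1 are hypothetical placeholders, not values from the paper.

```python
def forecast_next_days(last_step, b0, b1, horizon=7):
    """Predict cases for the `horizon` time steps that follow `last_step`."""
    future_steps = range(last_step + 1, last_step + horizon + 1)
    return [(t, b0 + b1 * t) for t in future_steps]


# Hypothetical line: 1000 cases at step 0, growing by 150 cases per day.
preds = forecast_next_days(last_step=75, b0=1000.0, b1=150.0)
for step, value in preds:
    print(f"day {step}: {value:.0f} predicted cases")
```

The same loop applies to any fitted regressor: only the expression that maps a time step to a prediction changes.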

The overall summary of the experimental results is shown in Table 5.

Table 5. Experimental Results for all regression models in different data scenarios.
Models | Evaluation Parameters | Accuracy, 45 days | Accuracy, 60 days | Accuracy, 75 days
LR | R² Score | 0.80 | 0.81 | 0.88
LR | MSE | 227815.61 | 1320050.68 | 1954244.90
LR | MAE | 363.27 | 936.60 | 1217.47
LR | RMSE | 477.30 | 1148.93 | 1397.94
LASSO | R² Score | 0.80 | 0.81 | 0.88
LASSO | MSE | 227854.33 | 1320058.97 | 1954258.42
LASSO | MAE | 363.31 | 936.59 | 1217.47
LASSO | RMSE | 477.34 | 1148.93 | 1397.94
SVR | R² Score | 0.93 | 0.97 | 0.96
SVR | MSE | 79352.30 | 170617.92 | 643586.37
SVR | MAE | 254.05 | 311.09 | 593.40
SVR | RMSE | 281.69 | 413.05 | 802.23
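As a worked check on the table above, RMSE in Eq. (9) is the square root of MSE in Eq. (8), so the two rows reported for each model should agree up to rounding. The sketch below recomputes RMSE from the reported MSE values and also gives small reference implementations of the four metrics defined in Section III-D. The helper names and the 0.02 tolerance are illustrative choices, not part of the paper.

```python
import math


def mse(y, p):
    """Mean square error, Eq. (8)."""
    return sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)


def mae(y, p):
    """Mean absolute error, Eq. (11)."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)


def rmse(y, p):
    """Root mean square error, Eq. (9)."""
    return math.sqrt(mse(y, p))


def r2(y, p):
    """R-squared, Eq. (7), using the mean of the actual values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# (MSE, RMSE) pairs copied from the table, keyed by (model, days).
reported = {
    ("LR", 45): (227815.61, 477.30),
    ("LR", 60): (1320050.68, 1148.93),
    ("LR", 75): (1954244.90, 1397.94),
    ("LASSO", 45): (227854.33, 477.34),
    ("LASSO", 60): (1320058.97, 1148.93),
    ("LASSO", 75): (1954258.42, 1397.94),
    ("SVR", 45): (79352.30, 281.69),
    ("SVR", 60): (170617.92, 413.05),
    ("SVR", 75): (643586.37, 802.23),
}

# Collect any entry whose RMSE differs from sqrt(MSE) by more than 0.02.
mismatches = {
    key: (math.sqrt(m), r)
    for key, (m, r) in reported.items()
    if abs(math.sqrt(m) - r) > 0.02
}
print("inconsistent entries:", mismatches)
```

Because MSE and RMSE depend on the scale of the case counts while R² does not, the two kinds of score can move in different directions when the data window changes.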

From the table, we can see that the R² scores of the models are higher for the larger datasets than for the 45 days dataset, while the absolute error metrics (MSE, MAE, RMSE) also grow with the dataset size.

V. CONCLUSION

COVID-19 has created health and economic crises all over the world. Every time a new variant of COVID-19 causes a pandemic wave, many people are affected worldwide; the disease damages the human body, and people die. In this study, we propose a short-term forecasting model that predicts newly infected cases for the upcoming 7 days. LR, LASSO, and SVR regression models were used for forecasting. The findings show that the SVR model performed better than the LR and LASSO models in all three data scenarios. The results also indicate that increasing the volume of past data increases the accuracy of all models. Overall, we conclude that the short-term forecasting model using SVR gives better results for upcoming days of infected cases, which may help to understand the upcoming situation. This forecasting model can also help people and the government to make decisions and prepare for the upcoming situation. The healthcare system could prepare and provide the best support in a pandemic situation, which may help save lives. In the future, we intend to adopt deep learning approaches to forecast COVID-19 transmission for small datasets. Several related features, such as population density, age range, and health condition, can be considered to better understand future COVID-19 case scenarios.

REFERENCES

[1]. Md. Siddikur Rahman, Arman Hossain Chowdhury, Miftahuzzannat Amrin. 2022. "Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh." PLOS Global Public Health. doi:https://doi.org/10.1371/journal.pgph.0000495.
[2]. Noara AlHusseini, Muhammad Sajid, Afaf Altayeb, Shahd Alyousof, Haifa Alsheikh, Abdulrahman Alqahtani, Afrah Alsomali. 2021. "Depression and Obsessive-Compulsive Disorders Amid the COVID-19 Pandemic in Saudi Arabia." The Cureus Journal of Medical Science. doi:https://doi.org/10.7759/cureus.12978.
[3]. WHO. 2023. Coronavirus disease (COVID-19) pandemic. World Health Organization. Accessed March 15, 2023. https://www.who.int/emergencies/diseases/novel-coronavirus-2019.

[4]. IEDCR. 2023. Institute of Epidemiology, Disease Control and Research. Accessed March 15, 2023. https://old.dghs.gov.bd/index.php/bd/component/content/article?id=5391.
[5]. DGHS. 2023. COVID-19 Dynamic Dashboard for Bangladesh. Accessed March 15, 2023. http://dashboard.dghs.gov.bd/webportal/pages/covid19.php.
[6]. Mossamet Kamrun Nesa, Md. Rashed Babu, Mohammad Tareq Mamun Khan. 2022. "Forecasting COVID-19 situation in Bangladesh." Biosafety and Health vol. 4: pp. 6-10. doi:https://doi.org/10.1016/j.bsheal.2021.12.003.
[7]. Hafsa Binte Kibria, Oishi Jyoti, Abdul Matin. 2022. "Forecasting the spread of the third wave of COVID-19 pandemic using time series analysis in Bangladesh." Informatics in Medicine Unlocked vol. 28 (100815). doi:https://doi.org/10.1016/j.imu.2021.100815.
[8]. Md. Shahriare Satu, Koushik Chandra Howlader, Mufti Mahmud, M. Shamim Kaiser, Sheikh Mohammad Shariful Islam, Julian M. W. Quinn, Salem A. Alyami, Mohammad Ali Moni. 2021. "Short-Term Prediction of COVID-19 Cases Using Machine Learning Models." Applied Sciences (MDPI) vol. 11(9) (4266). doi:https://doi.org/10.3390/app11094266.
[9]. Furqan Rustam, Aijaz Ahmad Reshi, Arif Mehmood, Saleem Ullah, Byung-Won On, Waqar Aslam, Gyu Sang Choi. 2020. "COVID-19 Future Forecasting Using Supervised Machine Learning Models." IEEE Access vol. 8: pp. 101489-101499. https://ieeexplore.ieee.org/document/9099302.
[10]. Abdelkader Dairi, Fouzi Harrou, Abdelhafid Zeroual, Mohamad Mazen Hittawe, Ying Sun. 2021. "Comparative study of machine learning methods for COVID-19 transmission forecasting." Journal of Biomedical Informatics vol. 114. doi:https://doi.org/10.1016/j.jbi.2021.103791.
[11]. Nada R. Sanders, R. Dan Reid. 2012. "Principles of Forecasting." In Operations Management: An Integrated Approach, 5th Edition, by R. Dan Reid and Nada R. Sanders. Wiley. https://www.oreilly.com/library/view/operations-management-an/9781118122679/.
[12]. Moore, Audra. 2021. Forecast Accuracy: Long-Range Forecasting is Hard. KMTV 3 News Now. Accessed March 19, 2023. https://www.3newsnow.com/weather/weather-blog/forecast-accuracy.
[13]. Sanzida Solayman, Sk. Azmiara Aumi, Chand Sultana Mery, Muktadir Mubassir, Riasat Khan. 2023. "Automatic COVID-19 prediction using explainable machine learning techniques." International Journal of Cognitive Computing in Engineering (Elsevier) vol. 4: pp. 36-46. doi:https://doi.org/10.1016/j.ijcce.2023.01.003.
[14]. GitHub, Inc. 2023. time_series_covid19_confirmed_global.csv. https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series.
[15]. Abu Kaisar Mohammad Masum, Sharun Akter Khushbu, Mumenunnessa Keya, Sheikh Abujar, Syed Akhter Hossain. 2020. "COVID-19 in Bangladesh: A Deeper Outlook into The Forecast with Prediction of Upcoming Per Day Cases Using Time Series." 9th International Young Scientist Conference on Computational Science (YSC 2020). Procedia Computer Science. 291–300.
[16]. Iqra Sardar, Muhammad Azeem Akbar, Victor Leiva, Ahmed Alsanad, Pradeep Mishra. 2023. "Machine learning and automatic ARIMA/Prophet models-based forecasting of COVID-19: methodology, evaluation, and case study in SAARC countries." Stochastic Environmental Research and Risk Assessment vol. 37: 345–359. doi:https://doi.org/10.1007/s00477-022-02307-x.
[17]. Khondoker Nazmoon Nabi, Md Toki Tahmid, Abdur Rafi, Muhammad Ehsanul Kader, Md. Asif Haider. 2021. "Forecasting COVID-19 cases: A comparative analysis between recurrent and convolutional neural networks." Results in Physics vol. 24 (104137). doi:https://doi.org/10.1016/j.rinp.2021.104137.
