Covid-19 Short-Term Forecasting in Bangladesh Using Supervised Machine Learning
ISSN No:-2456-2165
Abstract:- COVID-19 is a human-to-human transmissible viral disease that damages the human body and has caused deaths all over the world. Bangladesh reported its first COVID-19 case on March 8th, 2020. During the pandemic, people and the government struggled to prevent transmission due to an inadequate supply of vaccines and healthcare equipment. Therefore, it is essential to anticipate the number of infected cases for the next several days, which may help people and the government take decisions in advance to save lives. In this paper, we propose a COVID-19 short-term forecasting model using Linear Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO) Regression, and Support Vector Regression (SVR) to predict the next seven days of COVID-19 infected cases in Bangladesh during the pandemic. We considered data from 8th May 2021 to 21st July 2021 and analyzed different volumes of past data to understand their impact on the model. The results reveal that SVR performed better than LR and LASSO in all aspects, with high accuracy. The results also indicate that a larger volume of past data helps to increase prediction accuracy.

Keywords:- COVID-19, Short-term Forecast, New infected cases, Bangladesh, Supervised Machine Learning, LR, LASSO, SVR.

I. INTRODUCTION

The 21st-century world is facing a new pandemic caused by COVID-19. The first COVID-19 case was identified in Wuhan, China, in December 2019 [1]. According to the World Health Organization, COVID-19 symptoms include fever, dry cough, shortness of breath, fatigue, bodily pain, sore throat, etc. [2]. Because of its high human-to-human transmissibility, the disease has been increasingly unpredictable and has spread rapidly worldwide. As of January 30, 2023, the world has reported over 200 million confirmed cases and over 4 million deaths due to COVID-19 [3]. The pandemic has also impacted global health and economies, causing economic downturns and disruptions to daily life. On March 8, 2020, Bangladesh reported its first case; since then, the number of cases has rapidly escalated. The official statistics of the Institute of Epidemiology, Disease Control and Research (IEDCR) show 2,037,386 infected cases and 29,441 confirmed deaths as of 19th January 2023 [4] [5]. In the early stage of the pandemic, Bangladesh’s government adopted strict actions, including lockdowns, stay-at-home recommendations, cancellations of mass gatherings, closures of schools and non-essential shops, flight bans, sealed international borders, and restrictions on export-import transportation. These actions have significantly impacted the economy, with many businesses struggling and many people losing their jobs. The healthcare system in Bangladesh has also been under significant strain, with hospitals and healthcare facilities facing shortages of personal protective equipment, oxygen, and other essential supplies. The country has struggled to ramp up its testing, and there are concerns about the accuracy of its COVID-19 data. The Bangladeshi government carried out COVID-19 mass vaccination campaigns to swiftly contain the outbreak and administered 346,605,580 vaccine doses in 3 steps up to 15 January 2023 [4]. An inadequate supply of vaccines makes it challenging to normalize the situation. Also, virus mutation prevents the vaccine from being fully effective in the human body. Population density is another concern for maintaining the necessary measures and delivering proper doses of vaccines. It would therefore be much more effective if the government and the healthcare industry could know the upcoming transmission in advance.

From the beginning, Bangladesh has faced three pandemic waves of COVID-19. The first wave started on 8th March 2020. The second wave, driven by the Beta variant, started in mid-March 2021, and the third wave, driven by the Delta variant, started on 08 May 2021. Each new variant of COVID-19 created a new pandemic situation, rapidly increasing confirmed cases and deaths [6] [7]. Significant increases in transmission affect daily life, and healthcare organizations struggle to prevent transmission. Measuring the downturn of COVID-19 transmission in that situation is therefore almost impossible. So, it is essential to understand how COVID-19 may spread in the near term in order to make sound decisions to prevent transmission. In this regard, a short-term forecast of COVID-19 transmission helps to understand and predict upcoming scenarios, manage public health efficiently, and optimize healthcare resources in a pandemic. The healthcare system can also prepare and provide the best support during the pandemic, which may help save lives.

II. LITERATURE REVIEW

Presently, machine learning is an advanced approach for solving problems in several fields, such as business, industry, and science. Machine learning algorithms have shown promising results, with high accuracy, in forecasting from existing data.
Fig. 1. Bangladesh daily new infected cases data from 2020-01-22 to 2022-01-18
Fig. 2. Proposed workflow.

Linear Regression:
Linear regression is a popular predictive analysis technique in machine learning. It measures how strong the relationship between two variables is. The aim is to fit a best-fit straight line to the observed data with minimal error. Observations depend on two variables, x and y, where x is the independent variable and y is the dependent variable.

$$E(y) = \beta_0 + \beta_1 x \tag{2}$$

Lasso Regression:
“Lasso” stands for Least Absolute Shrinkage and Selection Operator. When linear regression cannot give the required best-fit line or the predicted outcome is overfitted, lasso regression is used to reduce overfitting and improve prediction accuracy. Shrinking extreme coefficient values toward a central value makes the model more stable and reduces overfitting. The L1 regularization technique adds a penalty term equal to the absolute value of the magnitude of the coefficients. For a feature that is unable to improve the best-fit line, the penalty drives the coefficient to zero or close to zero. The above process makes the model sparse with the
Fig. 7. Infected cases prediction by LR for the upcoming 7 days.
Fig. 10. Infected cases prediction by LR for the upcoming 7 days.
Fig. 11. Infected cases prediction by LASSO for the upcoming 7 days.
Fig. 12. Infected cases prediction by SVR for the upcoming 7 days.
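As a rough illustration of how a 7-day forecast can be read off a fitted line of the form E(y) = β0 + β1 x, the sketch below fits a single-feature linear regression and its lasso counterpart in plain Python and extrapolates seven days ahead. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the day index is assumed to be the only feature, the case counts are invented, the function names `fit_line` and `forecast` and the penalty value `alpha=5.0` are hypothetical, and SVR is omitted because it needs a dedicated solver.

```python
# Illustrative only: the daily counts below are invented, not the paper's data.
from statistics import mean


def fit_line(x, y, alpha=0.0):
    """Fit y ~ b0 + b1*x.

    alpha = 0 gives ordinary least squares. alpha > 0 adds an L1 (lasso)
    penalty alpha*|b1| to the objective (1/2n) * sum((y - b0 - b1*x)^2).
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # Covariance and variance terms computed on centred data.
    rho = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
    z = sum((xi - x_bar) ** 2 for xi in x) / n
    if z == 0:
        raise ValueError("x must contain at least two distinct values")
    # Soft-thresholding: shrink rho toward zero by alpha, stopping at zero.
    shrunk = max(abs(rho) - alpha, 0.0) * (1 if rho > 0 else -1)
    b1 = shrunk / z
    b0 = y_bar - b1 * x_bar
    return b0, b1


def forecast(b0, b1, last_day, horizon=7):
    """Predict the `horizon` days that follow `last_day`."""
    return [b0 + b1 * d for d in range(last_day + 1, last_day + horizon + 1)]


# Hypothetical 10-day history: x is the day index, y the new infected cases.
days = list(range(1, 11))
cases = [100, 120, 135, 160, 170, 195, 210, 225, 250, 265]

b0_lr, b1_lr = fit_line(days, cases)             # plain linear regression
b0_la, b1_la = fit_line(days, cases, alpha=5.0)  # lasso with a small penalty

pred_lr = forecast(b0_lr, b1_lr, last_day=days[-1])
pred_la = forecast(b0_la, b1_la, last_day=days[-1])
```

With a single feature the lasso solution has a closed form, so the effect of the penalty is easy to see: the slope is pulled toward zero, and a large enough `alpha` sets it exactly to zero, leaving only the mean of the observed cases as the prediction.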
Table 1. Experimental Results for all regression models in different data scenarios.

Models   Evaluation Parameters   Accuracy
                                 45 days      60 days       75 days
LR       R² Score                0.80         0.81          0.88
         MSE                     227815.61    1320050.68    1954244.90
         MAE                     363.27       936.60        1217.47
         RMSE                    477.30       1148.93       1397.94
LASSO    R² Score                0.80         0.81          0.88
         MSE                     227854.33    1320058.97    1954258.42
         MAE                     363.31       936.59        1217.47
         RMSE                    477.34       1148.93       1397.94
SVR      R² Score                0.93         0.97          0.96
         MSE                     79352.30     170617.92     643586.37
         MAE                     254.05       311.09        593.40
         RMSE                    281.69       413.05        802.23
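The four evaluation parameters reported in Table 1 can be computed from paired actual and predicted values as sketched below. This is a minimal sketch using the standard textbook definitions; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return R^2, MSE, MAE and RMSE for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n        # mean squared error
    mae = sum(abs(e) for e in errors) / n       # mean absolute error
    rmse = sqrt(mse)                            # root mean squared error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    ss_res = sum(e * e for e in errors)                   # unexplained variation
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MSE": mse, "MAE": mae, "RMSE": rmse}


# Hypothetical values for illustration only (not the paper's data).
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(actual, predicted)
```

Because RMSE is the square root of MSE, the two rows of Table 1 can be cross-checked against each other; for LR with 45 days of data, the square root of 227815.61 is approximately 477.30.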
From the table, we can see that the R² score of the models was generally higher for the larger datasets.

V. CONCLUSION

COVID-19 has created health and economic crises all over the world. Every time a new variant of COVID-19 causes a pandemic wave, many people are affected worldwide, with damage to the human body and loss of life. In this study, we propose a short-term forecasting model to predict newly infected cases for the upcoming 7 days. LR, LASSO, and SVR regression models were used for forecasting. Findings show that the SVR model performed better than the LR and LASSO models in all three data scenarios. Results also indicate that increasing the volume of past data increases the accuracy of all models. Overall, we conclude that the short-term forecasting model using SVR gives better results for infected cases in the upcoming days, which may help to understand the upcoming situation. This forecasting model can also help people and the government to make decisions and prepare for the upcoming situation. The healthcare system could prepare and provide the best support during the pandemic, which may help save lives. In the future, we intend to adopt deep learning approaches to forecast COVID-19 transmission for small datasets. Several relational features, such as population density, people’s age range, and human health condition, can be considered to better understand future COVID-19 case scenarios.

REFERENCES

[1]. Md. Siddikur Rahman, Arman Hossain Chowdhury, Miftahuzzannat Amrin. 2022. "Accuracy comparison of ARIMA and XGBoost forecasting models in predicting the incidence of COVID-19 in Bangladesh." PLOS Global Public Health. doi:https://doi.org/10.1371/journal.pgph.0000495.
[2]. Noara AlHusseini, Muhammad Sajid, Afaf Altayeb, Shahd Alyousof, Haifa Alsheikh, Abdulrahman Alqahtani, Afrah Alsomali. 2021. "Depression and Obsessive-Compulsive Disorders Amid the COVID-19 Pandemic in Saudi Arabia." The Cureus Journal of Medical Science. doi:https://doi.org/10.7759/cureus.12978.
[3]. WHO. 2023. Coronavirus disease (COVID-19) pandemic. World Health Organization. Accessed March 15, 2023. https://www.who.int/emergencies/diseases/novel-coronavirus-2019.