Statistical Methods Applications Submitted Version
Abstract
Predicting the impact of infectious disease outbreaks on population, health care resources and economics has received special academic focus during the coronavirus (COVID-19) pandemic.
Focusing on human disease outbreak prediction techniques in the current literature, Marques et al. [1] state that there are four main methods to address the forecasting problem: Compartmental Models, Classic Statistical Models, State-Space Models and Machine Learning Models. We adopt their framework to compare our research with previous works.
Besides being divided by methods, forecasting problems can also be
divided by the number of variables that are taken into account to
make predictions. Considering this number of variables, forecasting
problems can be classified as univariate, causal and multivariate models.
Multivariate approaches have been applied in less than 10% of the research we found. This research is the first attempt to evaluate univariate and multivariate methods over real time-series data from three different countries in order to provide short-term predictions. We found no research in the literature with that scope and aim.
Springer Nature 2021 LATEX template
1 Introduction
Communicable (or infectious) diseases can spread rapidly because they are transmitted by breathing in an airborne virus, by insect bites, by sexual intercourse, or by skin contact with a patient who already suffers from the disease (Kaur et al. [2]). When the spread of a disease runs out of control and affects a community or region, with specific health behavior or other health-related events clearly in excess of normal expectancy, it is called an epidemic [3]. The term pandemic is commonly taken to refer to a widespread epidemic of contagious disease throughout the whole of a country or one or more continents at the same time [4].
Although personal measures should be taken to avoid infection and therefore its spread, such as not sharing personal items, cleaning hands properly, eating good and safe food, getting vaccinated, and covering the mouth when sneezing or coughing [2], health systems and governments of all countries must be able to develop and improve non-pharmacological measures, like animal source containment, early detection and diagnosis, rigorous infection control, timely case reporting and rapid information dissemination, quarantines, mask mandates and lockdowns, as well as pharmacological measures like vaccine development [5].
Over the last few decades, mathematical models of infectious disease growth have been helpful for gaining insights into transmission dynamics [6], making it possible to forecast new cases and deaths as well as to evaluate the impact of interventions [7].
Although these models still show numerous limitations and pitfalls, often driven by data scarcity and delay, Smirnova and Chowell [8] state that integrating the prediction results of mathematical models with public health practice has the potential to increase the timeliness and quality of health care unit responses.
In this context, during the COVID-19 pandemic, Marques et al. [1] applied four univariate forecasting approaches using real COVID-19 data from 5 countries. These approaches are Classical Statistical Models, Compartmental Models, State-Space Models and Machine Learning Models, and they are presented in Section 2.
After evaluating and comparing sixty-six previous works (see Table A1), we conclude that less than 10% of previous research applied multivariate techniques, and none of those studies applied them to more than one country or region. Thus, this research contributes to the application of forecasting methods to human infectious disease outbreaks by being the first attempt to evaluate, over real time-series data:
• Three different countries (Brazil, Italy, and United States);
• Six univariate and two multivariate methods;
• A short-term prediction of 28 days ahead, which is two or four times longer than in similar previous research.
2 Theoretical Background
Epidemic and pandemic disease outbreaks have been devastating populations worldwide throughout history [2], [9], [10]. From the Athens epidemic ("Plague of Athens") in 430-427 B.C. (see Hays [9] for more details) to the ongoing coronavirus (SARS-CoV-2) pandemic, also known as COVID-19, civilizations have lived with epidemics or pandemics caused mainly by viruses and bacteria.
Kaur et al. [2] summarized the most relevant disease outbreaks in human history, like the Black Death (black plague), cholera, malaria and influenza viruses (Spanish, Hong Kong and Russian Flu). In addition, Hays [9], [10] and Yamey et al. [11] point out these and many others, such as smallpox, HIV/AIDS, measles, dengue, Ebola and Zika virus.
Table 1 summarizes a non-exhaustive list of worldwide human disease outbreaks (epidemics or pandemics) by year, impact in number of deaths and where each one occurred.
Besides the number of human deaths caused by epidemics and pandemics, Kaur et al. [2] state that they will not disappear in the future if we do not find efficient ways to stop a disease before it spreads to other populations or countries.
To explain, evaluate and forecast the behavior of a variable such as outbreak cases, deaths, or transmission rate over time, many authors use a time-series approach [12], [13], [14], [15].
A time-series is a set of data points arranged in time and its analysis intends
to reveal reliable and meaningful statistics [1] that can be used to evaluate
some patterns and forecast future values [16]. It drew the attention of the
scientific community when Yule introduced a general approach for time-series
analysis in 1927 [17].
In the same year, Kermack and McKendrick [18] proposed a deterministic compartmental model that is widely applied in epidemiology, the Susceptible–Infectious–Removed (SIR) model.
Almost three decades later (1950s), classical time-series statistical models started to appear (Holt [19]; Brown [20]; Winters [21]; Box and Jenkins [22]), as well as Machine Learning (Samuel [23]) and State-Space Models (Kalman [24]).
Bringing forecasting methods to the human infectious disease outbreak context, Chretien et al. [25] proposed a framework to classify research as follows: Population-based forecasting studies (seasonal or pandemic), forecast
Table 1 Worldwide epidemics or pandemics. Source: The authors, adapted from Yamey et al. [11], Yang et al. [5], Kaur et al. [2].
Outbreak | Period | Impact | Location
Athens Plague | 430-427 B.C. | 75,000-100,000 deaths (25% of population) | Greece
Antonine Plague | 165-180 | 5 million deaths | Asia Minor, Egypt, Greece, Italy
Justinian Plague | 541-542 | 35 million deaths | Eastern Roman Empire
Black Plague | 1346-1353 | 50 million deaths (60% of Europe population); 75-200 million deaths worldwide | Europe, Africa and Asia
Fifth Cholera | 1881-1896 | 981,899 deaths | Asia, Africa, France, Germany, Russia, and South America
Russian Flu | 1889-1890 | 1 million deaths | Russia, Canada and Greenland
Modern Plague | 1894-1903 | 10 million deaths | India and China mostly
Sixth Cholera | 1899-1923 | 1.5 million deaths | North Africa, Middle East, India, Eastern Europe, Russia and America
Spanish Flu | 1918-1919 | 20-50 million deaths | Australia, Canada, UK, US, France and Spain
Asian Flu | 1956-1958 | 2 million deaths | China, Singapore, US and Hong Kong
Hong Kong Flu | 1968-1969 | 1 million deaths | Hong Kong, spreading to Singapore, Vietnam, US, India, Australia and Europe
HIV/AIDS | 1980-present | 39 million deaths | Worldwide
Severe Acute Respiratory Syndrome (SARS CoV) | 2002-2003 | 8422 people infected in 32 countries and 919 (11%) died | China, Vietnam, Singapore, Canada and others
Swine Flu | 2009-2010 | 150,000-750,000 deaths | Mexico, US, Africa and Southeast Asia
Middle East Respiratory Corona virus (MERS CoV) | 2012 | 858 | Saudi Arabia
Ebola | 2013 | 10,600 deaths | Guinea, Liberia, Sierra Leone
Zika Virus | 2015-2016 | 20 deaths | Latin America and Caribbean
Coronavirus SARS CoV 2 (COVID-19) | 2019-present | 4.94 million deaths | Worldwide
for calculation which reduces the need for saving the whole data from previous
iterations [77].
Talkhi et al. [27], Han et al. [78], Benítez et al. [15], Eilterson et al. [63], Yang et al. [68], Wang et al. [69], Nunes et al. [79] and Mode et al. [80] applied these models to several human disease outbreaks, like COVID-19, Ebola, Zika virus, ILI, SSTIS and HIV/AIDS.
[85], Wang et al. [61], Bomfim et al. [29], Liang et al. [30], Zhang et al. [32],
Wang et al. [33], Choi et al. [35], Chakraborty et al. [36], Stolerman et al. [86],
Wu et al. [40], Ray et al. [43], Caicedo-Torres et al. [87], Nguyen et al. [88],
Chau and Ngoc [89], Wu et al. [49], Kane et al. [51], Gerardi and Monteiro
[59] and Peng et al. [90] research.
• Data range and prediction: only four studies proposed to predict the whole pandemic period (wpp). Considering different time windows, twenty-six studies forecasted six or fewer steps ahead.
• Time window: twenty-eight studies worked with monthly cases and twenty-four with weekly cases.
• Variables: the number of patient cases was studied in sixty-three studies (95.45%), while deaths, recoveries, hospital admissions or discharges, and transmission rate are little explored (deaths appear in second place with only five studies).
• Epidemiological time-series prediction: thirty-four studies applied CSM, twenty-three applied MLM, twenty applied CM and only eight applied SSM. We found no study in which all approaches were applied. The number of publications by type of epidemiological time-series prediction is presented in Figure 2. Only two studies used three univariate approaches (Talkhi et al., 2021 and Katris, 2021), each in a single country (Iran and Greece respectively), sixteen studies used two approaches, forty-five studies used only one approach and three studies used approaches not mentioned by Marques et al. (2021).
• Of the twenty CM studies, only five worked with growth models (GM), and they basically applied three models: Richards (GMR), Gompertz (GMG) and Logistic (GML). We point out that there are fourteen other GM models (Fekedulegn et al. [91], Kaps et al. [92], Tsoularis and Wallace [93] and Khamis et al. [94]) that were not explored in the human disease outbreak context.
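As an illustration of the three growth models named in the last item, the following minimal sketch writes each curve for cumulative cases in one common textbook parameterization. The function names, the parameter names (K for final outbreak size, r for growth rate, t0 for a time shift, nu for the Richards shape parameter) and the parameterizations themselves are assumptions made for this example; the reviewed studies may use different forms.

```python
import numpy as np

def logistic(t, K, r, t0):
    """Logistic growth (GML): symmetric S-curve that reaches K/2 at t = t0."""
    return K / (1.0 + np.exp(-r * (t - t0)))

def gompertz(t, K, r, t0):
    """Gompertz growth (GMG): asymmetric S-curve that reaches K/e at t = t0."""
    return K * np.exp(-np.exp(-r * (t - t0)))

def richards(t, K, r, t0, nu):
    """Richards growth (GMR): generalized logistic; nu = 1 gives the logistic curve."""
    return K / (1.0 + nu * np.exp(-r * (t - t0))) ** (1.0 / nu)

if __name__ == "__main__":
    days = np.arange(0, 101, 25)
    # Hypothetical parameter values, chosen only to show the shape of each curve.
    print(logistic(days, 1000.0, 0.2, 50.0))
    print(gompertz(days, 1000.0, 0.2, 50.0))
    print(richards(days, 1000.0, 0.2, 50.0, 0.5))
```

In this form all three curves saturate at K, so they differ only in how quickly and how symmetrically they approach the final size.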
[Figure: number of publications per year by approach (Uni, Causal, Mul).]
[Figure: number of publications per year by prediction method (CSM, CM, SSM, MLM).]
[Figure: series R1, R2, R3, R4 and R5, 2020-03-12 / 2021-03-15.]
Fig. 3 Rio de Janeiro city COVID-19 daily cases per health region. Source: authors.
[Figure: series Center, Islands, North-East, North-West and South, 2020-02-24 / 2021-02-26.]
• We work with 341 past observations, which is more than 10 times the prediction length. This does not mean that our past data is large enough to train some models well and give reliable predictions;
• Forecasting daily new cases four weeks ahead allows decision makers in health units to better plan resource availability, or governments to choose adequate measures, at least better than in most previous research, where daily predictions worked with a shorter forecasting range (seven or fourteen days);
• Taking into account that resource availability depends on the resource allocation of health units (doctors, beds, among others), the smaller the time-window unit (so far, daily data) with which we are able to work and provide reliable predictions, the more useful the information will be in helping decision makers meet the resource requirements that ensure adequate treatment for patient demand.
[Figure: series for 2020-03-04 / 2021-03-07.]
Table 2 Regions or States division (for more details see Appendix B). Source: authors.
Q = 1)[frequency = 7]. For more details see Hyndman and Athanasopoulos [16] and Kotu and Deshpande [98];
• State-Space Model Univariate (SSM-U): The state of a deterministic dynamic system is the smallest vector that summarises the past of the system in full [77]. The assumptions of SSM are the linearity of the state dynamics and of the observation process, as well as the normal distribution of the noise in the state dynamics and in the measurements. The linearized process is defined by a linear autoregressive state equation x(t + 1) = A ∗ x(t) + W(t), where W(t) ∼ N(0, Q), together with a measurement equation y(t) = C ∗ x(t) + V(t), where V(t) ∼ N(0, R) and y(t) ∈ ℝ. The random variables W(t) and V(t) represent the process and measurement noise respectively and are assumed to be independent of each other and normally distributed. In our case we work with a state vector of length n (n = 2 for the linear model and n = 3 for the order 2 polynomial model), which means an n ∗ n matrix A. We select the best approach for each time-series based on the lowest Akaike Information Criterion (AIC).
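The selection between the linear (n = 2) and order 2 polynomial (n = 3) state vectors can be sketched with a plain Kalman filter. This is a minimal sketch under stated assumptions, not the implementation used in this research: the polynomial-trend form of A and C, the diffuse initial covariance, the coarse grid over the noise variances q and r, and the parameter count k = n + 2 used in the AIC are all choices made for illustration, and the function names are hypothetical.

```python
import numpy as np

def polynomial_trend_model(n):
    """Assumed form: A is upper bidiagonal with ones, C observes the first state only."""
    A = np.eye(n)
    for i in range(n - 1):
        A[i, i + 1] = 1.0
    C = np.zeros((1, n))
    C[0, 0] = 1.0
    return A, C

def kalman_loglik(y, A, C, q, r):
    """Gaussian log-likelihood of y under x(t+1) = A x(t) + W(t), y(t) = C x(t) + V(t)."""
    n = A.shape[0]
    x = np.zeros(n)
    x[0] = y[0]
    P = 1e4 * np.eye(n)          # vague prior on the initial state (assumption)
    Q = q * np.eye(n)            # process noise covariance, W(t) ~ N(0, Q)
    loglik = 0.0
    for obs in y:
        e = obs - (C @ x)[0]                 # one-step-ahead prediction error
        S = (C @ P @ C.T)[0, 0] + r          # its variance, with V(t) ~ N(0, r)
        loglik -= 0.5 * (np.log(2.0 * np.pi * S) + e * e / S)
        K = (P @ C.T)[:, 0] / S              # Kalman gain
        x = x + K * e                        # measurement update
        P = P - np.outer(K, C @ P)
        x = A @ x                            # time update
        P = A @ P @ A.T + Q
    return loglik

def select_trend_order(y, grid=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Return the state length (2 or 3) with the lowest AIC, plus both AIC values."""
    y = np.asarray(y, dtype=float)
    aic = {}
    for n in (2, 3):
        A, C = polynomial_trend_model(n)
        best = max(kalman_loglik(y, A, C, q, r) for q in grid for r in grid)
        k = n + 2                            # assumed count: initial state plus q and r
        aic[n] = 2 * k - 2 * best
    return min(aic, key=aic.get), aic

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = 5.0 + 0.8 * np.arange(120) + rng.normal(0.0, 2.0, 120)  # synthetic data
    print(select_trend_order(series))
```

A grid search stands in here for proper maximum-likelihood estimation of the noise variances, which keeps the sketch short at the cost of precision.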
• MLP: The multilayer perceptron (MLP) is a type of feed-forward neural network. It consists of three types of layers: the input layer, the output layer and the hidden layer. The input layer receives the input signal to be processed. An arbitrary number of hidden layers placed between the input and output layers are the
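\[
y_t^{(\omega)} =
\begin{cases}
\dfrac{y_t^{\omega} - 1}{\omega}, & \omega \neq 0, \\
\log y_t, & \omega = 0,
\end{cases}
\]
\[ y_t^{(\omega)} = l_{t-1} + \phi \, b_{t-1} + \sum_{i=1}^{T} s^{(i)}_{t-m_i} + d_t, \]
\[ l_t = l_{t-1} + \phi \, b_{t-1} + \alpha \, d_t, \]
\[ b_t = (1 - \phi) \, b + \phi \, b_{t-1} + \beta \, d_t, \]
\[ s^{(i)}_t = s^{(i)}_{t-m_i} + \gamma_i \, d_t, \]
\[ d_t = \sum_{i=1}^{p} \phi_i \, d_{t-i} + \sum_{i=1}^{q} \theta_i \, \epsilon_{t-i} + \epsilon_t, \]
\[ s^{(i)}_t = \sum_{j=1}^{k_i} s^{(i)}_{j,t}, \]
\[ s^{*(i)}_{j,t} = -s^{(i)}_{j,t-1} \sin \lambda^{(i)}_j + s^{*(i)}_{j,t-1} \cos \lambda^{(i)}_j + \gamma^{(i)}_2 \, d_t \]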
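The Box-Cox transformation with parameter ω that opens the TBATS equations, and its inverse for mapping forecasts back to the original scale, can be sketched as follows. The function names are assumptions made for this example, and the sketch assumes strictly positive observations; series with zero daily counts would need an offset or another treatment before the transform applies.

```python
import numpy as np

def box_cox(y, omega):
    """Box-Cox transform: (y**omega - 1) / omega for omega != 0, log(y) for omega == 0."""
    y = np.asarray(y, dtype=float)
    if omega == 0:
        return np.log(y)
    return (y ** omega - 1.0) / omega

def inv_box_cox(z, omega):
    """Inverse transform, used to bring forecasts back to the original scale."""
    z = np.asarray(z, dtype=float)
    if omega == 0:
        return np.exp(z)
    return (omega * z + 1.0) ** (1.0 / omega)

if __name__ == "__main__":
    cases = np.array([12.0, 40.0, 95.0, 210.0])   # hypothetical daily case counts
    for omega in (0.0, 0.5, 1.0):
        transformed = box_cox(cases, omega)
        print(omega, transformed, inv_box_cox(transformed, omega))
```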
[Figure: daily cases by day, showing training data, fitted values, test data and prediction.]
Models R1 R2 R3 R4 R5
Table 5 Forecasting parameters per model per region for Italy regions. Source: The authors.
Models AZ CA NV OR
Table 7 Forecasting results per model per region for Rio de Janeiro health region. Source: The authors.
Models | RMSE IN: R1 R2 R3 R4 R5 | RMSE OUT-ALL: R1 R2 R3 R4 R5 | RMSE OUT-MEAN: R1 R2 R3 R4 R5
ES | 18.6 62.5 77.2 52.7 73.2 | 19.9 57.2 119.4 47 104.9 | 14.4 43.5 65.8 33.9 64
ARIMA | 17 59.5 73.1 49.7 67.4 | 16.6 51.5 99.5 37.9 112 | 13.5 46.4 69.9 38.1 63.2
SSM-U | 20.4 72.5 89.5 57.4 83.5 | 24.8 75.3 137.7 58.9 118.5 | 19.5 66.7 107.3 48.1 82.1
MLP | 8.8 31.3 31.7 26.2 29.8 | 17.4 53 92.5 34.3 91.7 | 15.2 45.9 79.8 38.5 62.6
NNETAR | 2.1 21.8 23.4 38.8 21 | 16.5 65.4 99.4 46.4 101.4 | 13.2 66.1 96 48 66.5
TBATS | 17.2 59.5 72 51.7 72.7 | 20.4 56.3 118.1 49 104.1 | 14.1 46.6 78.4 39.5 54.9
Models | RMSE IN: CEN ISL NOW NOE SOT | RMSE OUT-ALL: CEN ISL NOW NOE SOT | RMSE OUT-MEAN: CEN ISL NOW NOE SOT
ES | 256.3 128.4 1085.9 757.8 364.7 | 436.7 454.3 827.7 991.9 614 | 414.2 143.4 764.6 673.7 368.8
ARIMA | 242.8 127.1 429.9 761 348.8 | 418.2 489.8 611.4 974.9 632.7 | 382.1 155.5 676.1 914.1 541.1
SSM-U | 165.2 54 397 587.4 238.8 | 551.4 420.4 493.1 1237.5 841.7 | 390.3 132.1 500.1 713.3 550.4
MLP | 73.9 38.6 103.5 242.7 81.1 | 471.2 879.7 1064.5 1424.7 1002.3 | 438.9 406.8 951.2 938.4 681.7
NNETAR | 36.9 17.9 71 77.9 45.7 | 544.3 426.4 2717.5 1720.3 1386.8 | 404.9 153.9 1474.5 1086.9 609.3
TBATS | 238.4 127 427.9 782.7 329.5 | 364.9 463.2 542.1 674.7 488.4 | 355.3 145.1 498.3 739.3 494.2
Table 9 Forecasting results per model per state for US. Source: The authors.
Models | RMSE IN: AZ CA NV OR | RMSE OUT-ALL: AZ CA NV OR | RMSE OUT-MEAN: AZ CA NV OR
ES | 1265.8 3167.4 274.3 152.3 | 662.5 7774.3 389.8 304.5 | 557.2 1996.3 194.5 152.5
ARIMA | 1286.4 2980.6 276.1 153.5 | 1195.4 6838 334.4 324.1 | 669.8 1959.1 190.1 148.9
SSM-U | 1165.6 1616.6 193.2 113.3 | 1568.3 7609.9 381.1 274.3 | 779.8 2301.4 219.8 145.5
MLP | 229 1210.5 102.7 48.7 | 1749.9 9550.3 150.7 490.7 | 843.9 2507.3 149 255.7
NNETAR | 152.1 500.8 37.1 18.5 | 948.4 29.3 × 10^3 234 694 | 699 7023.6 182.1 318.1
TBATS | 1251.7 2913.2 275.3 152.1 | 617.7 9332.4 395.3 301 | 590 2617 197.2 153.6
From Tables 7, 8 and 9 we can conclude that the models with the lowest in-sample RMSE are NNETAR (in thirteen time-series) and MLP (in one), which is not surprising since neural networks tend to fit better the more data we give them.
Although they outperform the other models in the in-sample comparison, ML models did not obtain the same result under RMSE OUT-ALL and RMSE OUT-MEAN, for which they achieved the lowest RMSE in only five and two time-series respectively.
When predicting daily cases 28 days ahead without adding new data or re-estimating parameters (OUT-ALL), MLP showed the best results for four RJ health regions and one US State. In second place, TBATS showed the best results for three IT regions and one US State. SSM-U appeared in third position, being chosen in two IT regions and one US State.
However, when we predict daily cases 28 days ahead adding new data weekly without parameter re-estimation (OUT-MEAN), ES models give the best predictions for six time-series, while TBATS and SSM-U were chosen for three and two time-series respectively. All these results are summarized in Table 10.
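The difference between the two out-of-sample schemes can be sketched in code. This is a hedged illustration only: it assumes that OUT-ALL is the RMSE of a single forecast over the whole test window, and that OUT-MEAN is the mean of the RMSE values obtained when the observed data are extended by one week at a time and the remaining test days are forecast again. The seasonal naive forecaster is a stand-in with no parameters to re-estimate, not one of the models evaluated here, and all function names are hypothetical.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def seasonal_naive(history, horizon, period=7):
    """Stand-in forecaster: repeat the last observed week over the horizon."""
    last_week = np.asarray(history, dtype=float)[-period:]
    repeats = -(-horizon // period)          # ceiling division
    return np.tile(last_week, repeats)[:horizon]

def rmse_out_all(train, test, forecaster):
    """One forecast for the whole test window, with no new data added."""
    return rmse(test, forecaster(train, len(test)))

def rmse_out_mean(train, test, forecaster, step=7):
    """Mean RMSE over forecasts made after adding observed data week by week."""
    train = np.asarray(train, dtype=float)
    test = np.asarray(test, dtype=float)
    scores = []
    for start in range(0, len(test), step):
        history = np.concatenate([train, test[:start]])
        remaining = test[start:]
        scores.append(rmse(remaining, forecaster(history, len(remaining))))
    return float(np.mean(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = 100.0 + 10.0 * np.sin(np.arange(369) * 2 * np.pi / 7) + rng.normal(0, 3, 369)
    train, test = series[:341], series[341:]   # 341 past observations, 28 test days
    print(rmse_out_all(train, test, seasonal_naive))
    print(rmse_out_mean(train, test, seasonal_naive))
```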
For SSM-U, the best approach according to the lowest AIC was the order 2 polynomial model (n = 3) in thirteen time-series. Only for the AZ time-series was the linear model (n = 2) chosen.
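Models | IN: RJ IT US Total | OUT-ALL: RJ IT US Total | OUT-MEAN: RJ IT US Total
ES | 0 0 0 0 | 0 0 0 0 | 3 2 1 6
ARIMA | 0 0 0 0 | 0 0 1 1 | 0 0 1 1
SSM-U | 0 0 0 0 | 0 2 1 3 | 0 1 1 2
MLP | 1 0 0 1 | 4 0 1 5 | 0 0 1 1
NNETAR | 4 5 4 13 | 1 0 0 1 | 1 0 0 1
TBATS | 0 0 0 0 | 0 3 1 4 | 1 2 0 3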
regressors type for each multivariate time-series using the lowest AIC. In addition, to select the VAR model order (p) we adopted the Schwarz Criterion (SC(n)) and obtained p = 1 for RJ with constant and trend deterministic regressors (18 parameters), p = 23 for IT with trend deterministic regressors (113 parameters) and p = 2 for US with constant and trend deterministic regressors (20 parameters).
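Order selection by the Schwarz Criterion can be sketched with ordinary least squares. This is an illustrative sketch, not the code used for the results above: it assumes a VAR with constant and linear trend regressors, fits every candidate order on the same effective sample, and uses the common form SC(p) = ln det(Σ_p) + ln(N) ∗ m / N, where m counts all estimated coefficients. The function names and the simulated data are hypothetical.

```python
import numpy as np

def schwarz_criterion(data, p, start):
    """SC of a VAR(p) with constant and trend, fitted by OLS on rows start..T-1."""
    T, K = data.shape
    Y = data[start:]
    X = np.array([
        np.concatenate([[1.0, float(t)]] + [data[t - i] for i in range(1, p + 1)])
        for t in range(start, T)
    ])
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)    # one column of coefficients per series
    resid = Y - X @ B
    n_obs = len(Y)
    sigma = resid.T @ resid / n_obs              # residual covariance estimate
    _, logdet = np.linalg.slogdet(sigma)
    n_params = K * X.shape[1]                    # all estimated coefficients
    return logdet + np.log(n_obs) * n_params / n_obs

def select_var_order(data, max_p):
    """Return the order in 1..max_p with the lowest SC, plus all SC values."""
    data = np.asarray(data, dtype=float)
    scores = {p: schwarz_criterion(data, p, max_p) for p in range(1, max_p + 1)}
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A1 = np.array([[0.5, 0.1], [0.0, 0.4]])      # synthetic VAR(1) coefficients
    series = np.zeros((400, 2))
    for t in range(1, 400):
        series[t] = A1 @ series[t - 1] + rng.normal(size=2)
    print(select_var_order(series, max_p=4))
```

Because the deterministic regressors are the same for every candidate order, counting them in m shifts all SC values by the same amount and does not change which order is selected.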
In SSM-M we select the vector length (nx) that gives the lowest Akaike Information Criterion (AIC). The nx can be 8 (linear model) or 12 (order 2 polynomial model) for US, and 10 (linear model) or 15 (order 2 polynomial model) for RJ and IT (two or three times the number of univariate time series).
Table 11 Forecasting results per multivariate model per region for Rio de Janeiro, Italy, and US. Source: The authors.
RJ | RMSE IN: R1 R2 R3 R4 R5 | RMSE OUT-ALL: R1 R2 R3 R4 R5 | RMSE OUT-MEAN: R1 R2 R3 R4 R5
VAR(1) | 102.8 67.7 110.3 55.7 83.0 | 63.3 100.3 166.8 66.7 118.5 | 29.4 58.7 94.9 46.8 71.1
SSM-M(10) | 21.2 73 91.3 59.5 86.4 | 83.1 107.7 184.2 78.4 123.8 | 20.0 68.9 112.3 45.5 79.5
IT | RMSE IN: CEN ISL NOW NOE SOT | RMSE OUT-ALL: CEN ISL NOW NOE SOT | RMSE OUT-MEAN: CEN ISL NOW NOE SOT
VAR(23) | 2331.1 3686.2 2292.1 308.3 2322.3 | 4749.5 1533.5 4522 11 × 10^3 4502.2 | 1026.5 1024.2 1326.4 2163.3 1062.1
SSM-M(10) | 216.6 94.1 477.7 742.8 335.2 | 2080.8 3389.4 1808 2022.5 1824.3 | 589.7 262.5 833.9 1231.9 565.0
US | RMSE IN: AZ CA NV OR | RMSE OUT-ALL: AZ CA NV OR | RMSE OUT-MEAN: AZ CA NV OR
VAR(2) | 2801.1 14.9 × 10^3 559.5 142.9 | 2908.8 13.1 × 10^3 992.9 461.4 | 959.9 3519.2 394.1 185.0
SSM-M(8) | 1341.2 2047.9 229.8 131.6 | 3860.8 4676.9 4645.5 4674.9 | 991.7 785.1 344.8 115.7
Data | Region | IN: Model RMSE, Model RMSE | OUT-ALL: Model RMSE, Model RMSE | OUT-MEAN: Model RMSE, Model RMSE
RJ | R1 | NNETAR 2.1, SSM-M 21.2 | NNETAR 16.5, SSM-M 93.3 | NNETAR 13.2, SSM-M 20
RJ | R2 | NNETAR 21.8, VAR 70.1 | ARIMA 51.5, VAR 90.9 | ES 43.5, VAR 56.9
RJ | R3 | NNETAR 23.4, SSM-M 91.3 | MLP 92.5, VAR 92.2 | ES 65.8, VAR 86
RJ | R4 | MLP 26.2, VAR 59.4 | MLP 34.3, SSM-M 84.1 | ES 33.9, VAR 42.2
RJ | R5 | NNETAR 21, VAR 85.6 | MLP 91.7, VAR 92.1 | TBATS 54.9, VAR 69.6
Italy | CEN | NNETAR 37.6, SSM-M 216.6 | TBATS 364.9, SSM-M 2104.3 | TBATS 355.3, SSM-M 589.7
Italy | ISL | NNETAR 17.2, SSM-M 94.1 | ES 454.3, VAR 1496.7 | ES 143.4, SSM-M 262.5
Italy | NOW | NNETAR 73.4, SSM-M 477.7 | TBATS 542.1, SSM-M 1843.5 | TBATS 498.3, SSM-M 833.9
Italy | NOE | NNETAR 78.3, VAR 308.3 | TBATS 674.7, SSM-M 2049.1 | ES 673.7, SSM-M 1231.9
Italy | SOT | NNETAR 45.1, SSM-M 335.2 | TBATS 488.4, SSM-M 1860.5 | ES 368.7, SSM-M 565.0
US | AZ | NNETAR 152.1, SSM-M 1341.2 | TBATS 617.7, VAR 1986 | ES 557.2, VAR 900
US | CA | NNETAR 513.3, SSM-M 2047.9 | MLP 164.9, SSM-M 5643.6 | ARIMA 1959.1, SSM-M 785.1
US | NV | NNETAR 37.6, SSM-M 229.8 | NNETAR 246.7, VAR 406.5 | MLP 157.2, VAR 298.1
US | OR | NNETAR 17.7, SSM-M 131.6 | SSM-M 268.6, VAR 117.4 | ARIMA 148.9, SSM-M 115.7
[Figures: daily cases (y-axis) by days (x-axis), one panel per time-series.]
5 Conclusions
In this research we applied 6 univariate and 2 multivariate models in order to find which approach best fits and predicts 14 time-series from a Brazilian city (RJ), all Italian regions and 4 US States.
We chose to apply multivariate methods, which are unusual in the current literature on human infectious disease outbreak prediction or forecasting (less than 10% of the research we found), considering the high correlation and auto-correlation between different time-series from the same place at many lags, as we saw in Figure 6. All auto-correlation plots are presented in Appendix C, where we see a significant correlation between regions' data up to lag 15 for RJ and at all lags for the Italian regions and US States.
Despite the strong potential of multivariate methods, we did not observe them outperforming univariate methods.
ML methods obtained the best in-sample results for all time-series, which is expected considering that this type of model usually provides better results the more data is available for training. However, the same result was not observed in the out-of-sample evaluation.
Our predictions, presented in Figures 9, 10 and 11, suggest that in the next 28 days:
• 4 RJ health regions will remain at the same level of daily new cases, but R5 is expected to face a considerable increase in daily new COVID-19 cases. However, it will at least be lower than the levels observed in previous data;
• IT regions will face an exponential increase in daily new COVID-19 cases, excluding the CEN region;
• In US States we can expect different behaviours of daily new COVID-19 cases. For AZ a slight decrease is expected, while in CA and NV will
Declarations
• Funding: The author(s) received no specific funding for this work.
• Conflict of interest/Competing interests: The authors have declared that no
competing interests exist.
• Ethics approval: Not applicable
• Consent to participate: Not applicable
• Consent for publication: Not applicable
• Availability of data and materials: All relevant data are within the
manuscript and its Supporting Information files.
• Code availability: https://github.com/DanielAssad/Short-term-forecasting.git
Authors | Year | CSM | CM | SSM | MLM | Others | Data range | Data prediction | Measure | Country | Approach | Disease outbreak
Current | 2022 | ES, ARIMA | - | KF, TBATS | MLP, NNETAR | - | 24/02/20 to 11/07/21 | 28 d | dca | Brazil, Italy, and US | Uni, Mul | COVID-19
[12] | 2021 | - | SEIR | - | - | Spatial auto-correlation | 21/01/20 to 12/03/20 | wpp | dca | China | Causal | COVID-19
[26] | 2021 | ARIMA | - | - | - | - | 01/01/10 to 12/12/16 | 6 m | mca | Thailand | Causal | Dengue
[13] | 2021 | ARIMA | - | - | LSTM | - | 22/01/20 to 12/03/20 | 60 d | Cca, Cde, Cre | 16 countries | Uni | COVID-19
[27] | 2021 | ES, ARIMA | - | TBATS | NNETAR, ELM, MLP | Prophet | 20/02/20 to 15/08/20 | 30 d | Cca, Cde | Iran | Uni | COVID-19
[14] | 2021 | ES, ARIMA | SIR | - | RNN | Newbolt-Granger, Bates-Granger | 21/01/20 to 14/08/20 | 4 w | rt | Greece | Uni | COVID-19
[101] | 2017 | - | - | - | - | lag non-linear model (DLNM) | 1998 to 2015 | 6 w | wca | Taiwan | Causal | Dengue
References
[1] Marques, J.A.L., Gois, F.N.B., Xavier-Neto, J., Fong, S.J.: Predictive
Models for Decision Support in the COVID-19 Crisis. Springer (2021)
[2] Kaur, H., Garg, S., Joshi, H., Ayaz, S., Sharma, S., Bhandari, M.:
A review: Epidemics and pandemics in human history. International
Journal of Pharma Research and Health Sciences 8, 3139–3142 (2020).
https://doi.org/10.21276/ijprhs.2020.02.01
[5] Yang, Y., Peng, F., Wang, R., Guan, K., Jiang, T., Xu, G., Sun, J.,
Chang, C.: The deadly coronaviruses: The 2003 sars pandemic and the
2020 novel coronavirus epidemic in china. Journal of autoimmunity 109,
102434 (2020)
[6] Chowell, G., Sattenspiel, L., Bansal, S., Viboud, C.: Mathematical mod-
els to characterize early epidemic growth: A review. Physics of life
reviews 18, 66–97 (2016)
[9] Hays, J.N.: Epidemics and Pandemics: Their Impacts on Human History.
ABC-CLIO (2005)
[10] White, P.: Epidemics and pandemics: their impacts on human history.
Reference Reviews (2006)
[11] Yamey, G., Schäferhoff, M., Aars, O.K., Bloom, B., Carroll, D., Chawla,
M., Dzau, V., Echalar, R., Gill, I.S., Godal, T., et al.: Financing of
international collective action for epidemic and pandemic preparedness.
The Lancet Global Health 5(8), 742–744 (2017)
[12] Chen, Y., Li, Q., Karimian, H., Chen, X., Li, X.: Spatio-temporal dis-
tribution characteristics and influencing factors of covid-19 in china.
Scientific Reports 11(1), 1–12 (2021)
[13] ArunKumar, K., Kalaga, D.V., Kumar, C.M.S., Chilkoor, G., Kawaji,
M., Brenza, T.M.: Forecasting the dynamics of cumulative covid-19 cases
(confirmed, recovered and deaths) for top-16 countries using statisti-
cal machine learning models: Auto-regressive integrated moving average
(arima) and seasonal auto-regressive integrated moving average (sarima).
Applied soft computing 103, 107161 (2021)
[14] Katris, C.: A time series-based statistical approach for outbreak spread
forecasting: Application of covid-19 in greece. Expert Systems with
Applications 166, 114077 (2021)
[15] Benítez, D., Montero, G., Rodríguez, E., Greiner, D., Oliver, A.,
González, L., Montenegro, R.: A phenomenological epidemic model
based on the spatio-temporal evolution of a gaussian probability density
function. Mathematics 8(11), 2000 (2020)
[22] Box, G., Jenkins, G.: Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco (1970)
[23] Samuel, A.L.: Some studies in machine learning using the game of
checkers. IBM Journal of research and development 3(3), 210–229 (1959)
[24] Kalman, R.E., et al.: Contributions to the theory of optimal control. Bol.
soc. mat. mexicana 5(2), 102–119 (1960)
[25] Chretien, J.-P., George, D., Shaman, J., Chitale, R.A., McKenzie, F.E.:
Influenza forecasting in human populations: a scoping review. PloS one
9(4), 94130 (2014)
[26] Kiang, M.V., Santillana, M., Chen, J.T., Onnela, J.-P., Krieger, N.,
Engø-Monsen, K., Ekapirat, N., Areechokchai, D., Prempree, P., Maude,
R.J., et al.: Incorporating human mobility data improves forecasts of
dengue fever in thailand. Scientific reports 11(1), 1–12 (2021)
[27] Talkhi, N., Fatemi, N.A., Ataei, Z., Nooghabi, M.J.: Modeling and
forecasting number of confirmed and death caused covid-19 in iran:
A comparison of time series forecasting methods. Biomedical Signal
Processing and Control 66, 102494 (2021)
[28] Khan, F., Saeed, A., Ali, S.: Modelling and forecasting of new cases,
deaths and recover cases of covid-19 by using vector autoregressive model
in pakistan. Chaos, Solitons & Fractals 140, 110189 (2020)
[29] Bomfim, R., Pei, S., Shaman, J., Yamana, T., Makse, H.A., Andrade Jr,
J.S., Lima Neto, A.S., Furtado, V.: Predicting dengue outbreaks at neigh-
bourhood level using human mobility in urban areas. Journal of the
Royal Society Interface 17(171), 20200691 (2020)
[30] Liang, X., Xu, Q., Guan, R., Zhao, Y.: Forecasting tuberculosis incidence
in china using baidu index: A comparative study. In: Proceedings of the
4th International Conference on Medical and Health Informatics, pp.
22–29 (2020)
[31] Ramos, A.C.V., Gomes, D., Santos Neto, M., Berra, T.Z., de Assis,
I.S., Yamamura, M., Crispim, J.d.A., Martoreli Junior, J.F., Bruce,
A.T.I., Dos Santos, F.L., et al.: Trends and forecasts of leprosy for a
hyperendemic city from brazil’s northeast: Evidence from an eleven-year
time-series analysis. PloS one 15(8), 0237165 (2020)
[32] Zhang, C., Fu, X., Zhang, Y., Nie, C., Li, L., Cao, H., Wang, J., Wang, B.,
Yi, S., Ye, Z.: Epidemiological and time series analysis of haemorrhagic
fever with renal syndrome from 2004 to 2017 in shandong province, china.
Scientific reports 9(1), 1–9 (2019)
[33] Wang, Y., Xu, C., Zhang, S., Yang, L., Wang, Z., Zhu, Y., Yuan, J.:
Development and evaluation of a deep learning approach for modeling
seasonality and trends in hand-foot-mouth disease incidence in mainland
china. Scientific reports 9(1), 1–15 (2019)
[34] Li, K., Liu, M., Feng, Y., Ning, C., Ou, W., Sun, J., Wei, W., Liang, H.,
Shao, Y.: Using baidu search engine to monitor aids epidemics inform for
targeted intervention of hiv/aids in china. Scientific reports 9(1), 1–12
(2019)
[35] Choi, S.B., Kim, J., Ahn, I.: Forecasting type-specific seasonal influenza
after 26 weeks in the united states using influenza activities in other
countries. PLoS One 14(11), 0220423 (2019)
[36] Chakraborty, T., Chattopadhyay, S., Ghosh, I.: Forecasting dengue epi-
demics using a hybrid methodology. Physica A: Statistical Mechanics
and its Applications 527, 121266 (2019)
[38] Haddawy, P., Yin, M.S., Wisanrakkit, T., Limsupavanich, R., Promrat,
P., Lawpoolsri, S., Sa-angchai, P.: Complexity-based spatial hierarchi-
cal clustering for malaria prediction. Journal of Healthcare Informatics
Research 2(4), 423–447 (2018)
[39] Wu, H., Wang, X., Xue, M., Wu, C., Lu, Q., Ding, Z., Zhai, Y., Lin, J.:
Spatial-temporal characteristics and the epidemiology of haemorrhagic
fever with renal syndrome from 2007 to 2016 in zhejiang province, china.
Scientific reports 8(1), 1–14 (2018)
[40] Wu, Y., Yang, Y., Nishiura, H., Saitoh, M.: Deep learning for epidemi-
ological predictions. In: The 41st International ACM SIGIR Conference
on Research & Development in Information Retrieval, pp. 1085–1088
(2018)
[41] Zhao, Y., Ge, L., Zhou, Y., Sun, Z., Zheng, E., Wang, X., Huang, Y.,
Cheng, H.: A new seasonal difference space-time autoregressive inte-
grated moving average (sd-starima) model and spatiotemporal trend
prediction analysis for hemorrhagic fever with renal syndrome (hfrs).
PloS one 13(11), 0207518 (2018)
[43] Ray, E.L., Sakrejda, K., Lauer, S.A., Johansson, M.A., Reich, N.G.:
Infectious disease prediction with kernel conditional density estimation.
Statistics in medicine 36(30), 4908–4929 (2017)
[44] Anggraeni, W., Aristiani, L.: Using google trend data in forecast-
ing number of dengue fever cases with arimax method case study:
Surabaya, indonesia. In: 2016 International Conference on Information
& Communication Technology and Systems (ICTS), pp. 114–118 (2016).
IEEE
[45] Ke, G., Hu, Y., Huang, X., Peng, X., Lei, M., Huang, C., Gu, L., Xian,
P., Yang, D.: Epidemiological analysis of hemorrhagic fever with renal
syndrome in china with the seasonal-trend decomposition method and
the exponential smoothing model. Scientific reports 6(1), 1–7 (2016)
[46] Li, S., Cao, W., Ren, H., Lu, L., Zhuang, D., Liu, Q.: Time series anal-
ysis of hemorrhagic fever with renal syndrome: A case study in jiaonan
county, china. Plos one 11(10), 0163771 (2016)
[47] Johansson, M.A., Reich, N.G., Hota, A., Brownstein, J.S., Santillana, M.:
Evaluating the performance of infectious disease forecasts: A comparison
of climate-driven and seasonal dengue forecasts for mexico. Scientific
reports 6(1), 1–11 (2016)
[48] Pradhan, A., Anasuya, A., Pradhan, M.M., Ak, K., Kar, P., Sahoo, K.C.,
Panigrahi, P., Dutta, A.: Trends in malaria in odisha, india—an analysis
of the 2003–2013 time-series data from the national vector borne disease
control program. PloS one 11(2), 0149126 (2016)
[49] Wu, W., Guo, J., An, S., Guan, P., Ren, Y., Xia, L., Zhou, B.: Compar-
ison of two hybrid models for forecasting the incidence of hemorrhagic
fever with renal syndrome in jiangsu province, china. PLoS One 10(8),
0135492 (2015)
[50] Mekparyup, J., Saithanu, K.: Forecasting the dengue hemorrhagic fever
cases using seasonal arima model in chonburi, thailand. Global J Pure
[51] Kane, M.J., Price, N., Scotch, M., Rabinowitz, P.: Comparison of arima
and random forest time series models for prediction of avian influenza
h5n1 outbreaks. BMC bioinformatics 15(1), 1–9 (2014)
[52] Feng, H., Duan, G., Zhang, R., Zhang, W.: Time series analysis of
hand-foot-mouth disease hospitalization in zhengzhou: establishment of
forecasting models using climate variables as predictors. PloS one 9(1),
87916 (2014)
[53] Soebiyanto, R.P., Adimi, F., Kiang, R.K.: Modeling and predicting
seasonal influenza transmission in warm regions using climatological
parameters. PloS one 5(3), 9450 (2010)
[54] Shen, Y., Jiang, C., Dun, Z.: Analysis and prediction of epidemiological
trend of scarlet fever from 1957 to 2004 in the downtown area of Beijing.
In: International Workshop on Biosurveillance and Biosecurity, pp. 164–
168 (2008). Springer
[55] Medina, D.C., Findley, S.E., Guindo, B., Doumbia, S.: Forecasting non-
stationary diarrhea, acute respiratory infection, and malaria time-series
in Niono, Mali. PLoS One 2(11), 1181 (2007)
[56] Burkom, H.S., Murphy, S.P., Shmueli, G.: Automated time series fore-
casting for biosurveillance. Statistics in medicine 26(22), 4202–4218
(2007)
[57] Nobre, F.F., Monteiro, A.B.S., Telles, P.R., Williamson, G.D.: Dynamic
linear model and SARIMA: a comparison of their forecasting performance
in epidemiology. Statistics in medicine 20(20), 3051–3069 (2001)
[58] Krause, A.L., Kurowski, L., Yawar, K., Van Gorder, R.A.: Stochastic
epidemic metapopulation models on networks: SIS dynamics and control
strategies. Journal of theoretical biology 449, 35–52 (2018)
[59] Gerardi, D., Monteiro, L.: System identification and prediction of dengue
fever incidence in Rio de Janeiro. Mathematical Problems in Engineering
2011 (2011)
[60] Paul, A., Reja, S., Kundu, S., Bhattacharya, S.: COVID-19 pandemic
models revisited with a new proposal: Plenty of epidemiological mod-
els outcast the simple population dynamics solution. Chaos, Solitons &
Fractals 144, 110697 (2021)
[61] Wang, P., Zheng, X., Li, J., Zhu, B.: Prediction of epidemic trends
in COVID-19 with logistic model and machine learning technics. Chaos,
[62] Smirnova, A., Sirb, B., Chowell, G.: On stable parameter estimation
and forecasting in epidemiology by the Levenberg–Marquardt algorithm
with Broyden’s rank-one updates for the Jacobian operator. Bulletin of
mathematical biology 81(10), 4210–4232 (2019)
[63] Eilertson, K.E., Fricks, J., Ferrari, M.J.: Estimation and prediction
for a mechanistic model of measles transmission using particle filter-
ing and maximum likelihood estimation. Statistics in medicine 38(21),
4146–4158 (2019)
[64] Suparit, P., Wiratsudakul, A., Modchang, C.: A mathematical model for
Zika virus transmission dynamics with a time-dependent mosquito biting
rate. Theoretical Biology and Medical Modelling 15(1), 1–11 (2018)
[65] Basile, L., Oviedo de la Fuente, M., Torner, N., Martínez, A., Jané, M.:
Real-time predictive seasonal influenza model in Catalonia, Spain. PloS
one 13(3), 0193651 (2018)
[66] Li, X., Doroshenko, A., Osgood, N.D.: Applying particle filtering in both
aggregated and age-structured population compartmental models of pre-
vaccination measles. PloS one 13(11), 0206529 (2018)
[67] Valeri, L., Patterson-Lomba, O., Gurmu, Y., Ablorh, A., Bobb, J.,
Townes, F.W., Harling, G.: Predicting subnational Ebola virus disease
epidemic dynamics from sociodemographic indicators. PloS one 11(10),
0163544 (2016)
[68] Yang, W., Karspeck, A., Shaman, J.: Comparison of filtering methods
for the modeling and retrospective forecasting of influenza epidemics.
PLoS computational biology 10(4), 1003583 (2014)
[70] Towers, S., Chowell, G.: Impact of weekday social contact patterns on the
modeling of influenza transmission, and determination of the influenza
latent period. Journal of theoretical biology 312, 87–95 (2012)
[71] Aguiar, M., Ballesteros, S., Kooi, B.W., Stollenwerk, N.: The role of
seasonality and import in a minimalistic multi-strain dengue model cap-
turing differences between primary and secondary infections: complex
dynamics and its implications for data analysis. Journal of theoretical
biology 289, 181–196 (2011)
[72] Laneri, K., Bhadra, A., Ionides, E.L., Bouma, M., Dhiman, R.C., Yadav,
R.S., Pascual, M.: Forcing versus feedback: epidemic malaria and monsoon
rains in northwest India. PLoS computational biology 6(9), 1000898
(2010)
[73] Santos, L., Costa, M., Pinho, S.T.R.d., Andrade, R.F.S., Barreto, F.R.,
Teixeira, M., Barreto, M.L.: Periodic forcing in a three-level cellular
automata model for a vector-transmitted disease. Physical Review E
80(1), 016102 (2009)
[74] Finkenstädt, B., Morton, A., Rand, D.: Modelling antigenic drift in
weekly flu incidence. Statistics in medicine 24(22), 3447–3461 (2005)
[75] Gamerman, D., Migon, H.S.: Forecasting the number of AIDS cases
in Brazil. Journal of the Royal Statistical Society: Series D (The
Statistician) 40(4), 427–442 (1991)
[76] Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and
Techniques. MIT Press (2009)
[77] Haykin, S.: Kalman Filtering and Neural Networks vol. 47. John Wiley
& Sons (2004)
[78] Han, T., Gois, F.N.B., Oliveira, R., Prates, L.R., de Almeida Porto,
M.M.: Modeling the progression of COVID-19 deaths using Kalman filter
and AutoML. Soft Computing, 1–16 (2021)
[79] Nunes, B., Natário, I., Lucília Carvalho, M.: Nowcasting influenza epidemics
using non-homogeneous hidden Markov models. Statistics in
Medicine 32(15), 2643–2660 (2013)
[80] Mode, C.J., Fife, D., Troy, S.M.: Stochastic methods for short term
projections of symptomatic HIV disease. Statistics in medicine 10(9),
1427–1440 (1991)
[81] Ongsulee, P.: Artificial intelligence, machine learning and deep learn-
ing. In: 2017 15th International Conference on ICT and Knowledge
Engineering (ICT&KE), pp. 1–6 (2017). IEEE
[83] Alzubi, J., Nayyar, A., Kumar, A.: Machine learning from theory to
algorithms: an overview. In: Journal of Physics: Conference Series, vol.
1142, p. 012012 (2018). IOP Publishing
[84] Ribeiro, M.H.D.M., Mariani, V.C., dos Santos Coelho, L.: Multi-step
[85] Deng, S., Wang, S., Rangwala, H., Wang, L., Ning, Y.: Cola-GNN:
Cross-location attention based graph neural networks for long-term ILI
prediction. In: Proceedings of the 29th ACM International Conference
on Information & Knowledge Management, pp. 245–254 (2020)
[86] Stolerman, L.M., Maia, P.D., Kutz, J.N.: Forecasting dengue fever in
Brazil: An assessment of climate conditions. PloS one 14(8), 0220106
(2019)
[88] Nguyen, H.L., Duong, T.H., Nguyen, C.P., Nguyen, D.C., Chiem,
T.P., Nguyen, M.H., Nguyen, T.N.M., Nguyen, H.V.: Specific k-mean
clustering-based perceptron for dengue prediction. International Jour-
nal of Intelligent Information and Database Systems 10(3-4), 269–288
(2017)
[89] Chau, N.H., Ngoc Anh, L.T.: Using local weather and geographical
information to predict cholera outbreaks in Hanoi, Vietnam. In:
Advanced Computational Methods for Knowledge Engineering, pp.
195–212. Springer (2016)
[90] Peng, L.-Z., Yi, L.-X., Hua, S.-Y.: A new epidemic disease predicting
method. In: 2008 International Conference on Intelligent Computation
Technology and Automation (ICICTA), vol. 1, pp. 550–553 (2008). IEEE
[91] Fekedulegn, D., Mac Siúrtáin, M.P., Colbert, J.J.: Parameter estimation
of nonlinear models in forestry. Silva Fennica 33(4), 327–336 (1999)
[92] Kaps, M., Herring, W., Lamberson, W.: Genetic and environmental
parameters for traits derived from the Brody growth curve and their relationships
with weaning weight in Angus cattle. Journal of Animal Science
78(6), 1436–1442 (2000)
[93] Tsoularis, A., Wallace, J.: Analysis of logistic growth models. Mathe-
matical biosciences 179(1), 21–55 (2002)
[94] Khamis, A.: Nonlinear growth models for modeling oil palm yield growth.
J. of mathematics and statistics 1(3), 225–233 (2005)
[96] Krispin, R.: Covid19italy: The 2019 Novel Coronavirus COVID-19 (2019-nCoV)
Italy Dataset. (2021). R package version 0.3.1.
https://CRAN.R-project.org/package=covid19italy
[98] Kotu, V., Deshpande, B.: Time series forecasting. Data Science; Elsevier:
Amsterdam, The Netherlands, 395–445 (2019)
[99] De Livera, A.M., Hyndman, R.J., Snyder, R.D.: Forecasting time series
with complex seasonal patterns using exponential smoothing. Journal of
the American statistical association 106(496), 1513–1527 (2011)
[100] Tabataba, F.S., Lewis, B., Hosseinipour, M., Tabataba, F.S., Venkatra-
manan, S., Chen, J., Higdon, D., Marathe, M.: Epidemic forecasting
framework combining agent-based models and smart beam particle fil-
tering. In: 2017 IEEE International Conference on Data Mining (ICDM),
pp. 1099–1104 (2017). IEEE
[101] Chuang, T.-W., Chaves, L.F., Chen, P.-J.: Effects of local and regional
climatic fluctuations on dengue outbreaks in southern Taiwan. PLoS One
12(6), 0178698 (2017)
[102] Guo, P., Zhang, J., Wang, L., Yang, S., Luo, G., Deng, C., Wen, Y.,
Zhang, Q.: Monitoring seasonal influenza epidemics by using internet
search data with an ensemble penalized regression model. Scientific
reports 7(1), 1–11 (2017)