Estimation of The Final Size of The Coronavirus Epidemic by The SIR Model
All content following this page was uploaded by Milan Batista on 21 March 2020.
Milan Batista
University of Ljubljana, Slovenia
[email protected]
(Feb 2020)
Abstract.
In this note, the SIR model is used to estimate the final size of the coronavirus
epidemic. The current prediction is that the size of the epidemic will be about 85 000
cases. The note complements the author’s note [1].
1. Introduction
In this note, we will try to estimate the final epidemic size by the SIR model [2, 3]. The
program that implements the model is available at
https://www.mathworks.com/matlabcentral/fileexchange/74658-fitviruscovid19
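The model equations are
$$\frac{dS}{dt} = -\frac{\beta I S}{N}, \tag{1}$$
$$\frac{dI}{dt} = \frac{\beta I S}{N} - \gamma I, \tag{2}$$
$$\frac{dR}{dt} = \gamma I. \tag{3}$$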
21.03.2020 21:57
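Here $S$, $I$, and $R$ are the numbers of susceptible, infected, and recovered persons at
time $t$, $\beta$ is the contact rate, and $1/\gamma$ is the average infectious period.
From (1), (2), (3) we have
$$N = S + I + R = \text{const}. \tag{4}$$
Dividing (1) by (3) and integrating from the initial values $S_0$ and $R_0$ gives
$$S = S_0 \exp\left(-\frac{\beta}{\gamma N}\,(R - R_0)\right). \tag{5}$$
In the limit $t \to \infty$ this becomes
$$S_\infty = S_0 \exp\left(-\frac{\beta}{\gamma N}\,(R_\infty - R_0)\right), \tag{6}$$
where $R_\infty$ is the final number of recovered persons. Because the final number of
infected people is zero, we have, using (4),
$$N = S_\infty + R_\infty. \tag{7}$$
Substituting (6) into (7) gives the equation for $R_\infty$:
$$R_\infty = N - S_0 \exp\left(-\frac{\beta}{\gamma N}\,(R_\infty - R_0)\right). \tag{8}$$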
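Relations (4), (5), and (8) can be checked numerically. The following MATLAB sketch integrates (1)-(3) with ode45 and compares the long-time solution with the root of (8) found by fzero. The parameter values are illustrative assumptions, not the fitted values of this note, and the script is a minimal illustration rather than the author's program.

```matlab
% Illustrative values only (hypothetical, not the fitted values of this note)
beta  = 0.4;                  % contact rate [1/day]
gamma = 0.2;                  % recovery rate [1/day], 1/gamma = 5 days
S0 = 9990; I0 = 10; R0 = 0;   % initial values
N  = S0 + I0 + R0;            % Eq. (4): the total population is constant

% Right-hand side of Eqs. (1)-(3), with z = [S; I; R]
rhs = @(t, z) [-beta*z(2)*z(1)/N; ...
               beta*z(2)*z(1)/N - gamma*z(2); ...
               gamma*z(2)];

% Integrate long enough for the number of infected to decay to zero
opts = odeset('RelTol', 1e-8, 'AbsTol', 1e-8);
[t, z] = ode45(rhs, [0 400], [S0; I0; R0], opts);
S = z(:,1); I = z(:,2); R = z(:,3);

% Largest deviation from Eq. (4) and from Eq. (5) along the trajectory
errN = max(abs(S + I + R - N));
errS = max(abs(S - S0*exp(-beta/(gamma*N)*(R - R0))));

% Final size from Eq. (8): root of g(x) = x - (N - S0*exp(...))
g    = @(x) x - (N - S0*exp(-beta/(gamma*N)*(x - R0)));
Rinf = fzero(g, [0, N]);      % g(0) = -I0 < 0 and g(N) > 0 bracket the root
Sinf = N - Rinf;              % Eq. (7)

fprintf('max |S+I+R-N|         = %g\n', errN);
fprintf('max deviation, Eq.(5) = %g\n', errS);
fprintf('R(end) from ode45     = %.2f\n', R(end));
fprintf('Rinf from Eq.(8)      = %.2f\n', Rinf);
fprintf('Sinf from Eq.(7)      = %.2f\n', Sinf);
```

The bracket [0, N] works here because the function g is negative at 0 (it equals -I0 when R0 = 0) and positive at N, so fzero is given a sign change.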
In order to use the model, we must estimate the model parameters $\beta$, $\gamma$ and the
initial values $S_0$ and $I_0$ from available data (we set $R_0 = 0$ and $I_0 = C_1$).
Now the available data is a time series of the total number of cases $C$, i.e.,
$$C = I + R. \tag{9}$$
We can estimate the parameters and initial values by minimizing the difference between
the actual and predicted number of cases, i.e., by minimizing
$$\left\| C_t - \hat{C}_t(\beta, \gamma, S_0) \right\|^2 = \min, \tag{10}$$
where $C_t = (C_1, C_2, \ldots, C_n)$ are the given numbers of cases at times
$t_1, t_2, \ldots, t_n$ and $\hat{C}_t = (\hat{C}_1, \hat{C}_2, \ldots, \hat{C}_n)$ are the
corresponding estimates calculated by the model.
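The minimization (10) can be sketched in MATLAB as follows. This is a minimal sketch under stated assumptions and not the author's program: the data are synthetic (generated from hypothetical parameter values), the helper names sirCases and sse are introduced here for illustration, and fminsearch is chosen for convenience because this text does not show which optimizer the author uses. Local functions in a script require MATLAB R2016b or later.

```matlab
% Synthetic, noise-free "data" from hypothetical parameters [beta, gamma, S0]
I0 = 5; R0 = 0;                        % I0 = C(1) and R0 = 0, as in the text
bTrue = [0.4, 0.2, 5000];
tspan = 0:59;                          % one sample per day
C = sirCases(bTrue, tspan, I0, R0);    % row vector of total cases, Eq. (9)

% Minimize the sum of squared residuals, Eq. (10), from a rough initial guess
b0   = [0.5, 0.25, 4000];
obj  = @(b) sse(b, C, tspan, I0, R0);
opts = optimset('TolX', 1e-6, 'TolFun', 1e-6, ...
                'MaxFunEvals', 5000, 'MaxIter', 5000);
bFit = fminsearch(obj, b0, opts);

fprintf('beta = %.4f, gamma = %.4f, S0 = %.1f\n', bFit);
fprintf('residual: initial %.4g, final %.4g\n', obj(b0), obj(bFit));

function f = sse(b, C, tspan, I0, R0)
%SSE Sum of squared residuals between data C and the model, Eq. (10).
    f = Inf;                           % reject infeasible parameter sets
    if any(b <= 0), return; end
    Chat = sirCases(b, tspan, I0, R0);
    if numel(Chat) == numel(C)         % the integration reached the last day
        f = sum((C - Chat).^2);
    end
end

function Chat = sirCases(b, tspan, I0, R0)
%SIRCASES Model estimate of the total number of cases C = I + R.
    beta = b(1); gamma = b(2); S0 = b(3);
    N = S0 + I0 + R0;                  % Eq. (4)
    rhs = @(t, z) [-beta*z(2)*z(1)/N; ...
                   beta*z(2)*z(1)/N - gamma*z(2); ...
                   gamma*z(2)];
    odeopts = odeset('RelTol', 1e-6, 'AbsTol', 1e-6);
    [~, z] = ode45(rhs, tspan, [S0; I0; R0], odeopts);
    Chat = (z(:,2) + z(:,3))';         % row vector, same shape as the data
end
```

Returning Inf for non-positive parameters keeps the simplex search away from values for which the model has no meaning. With real, noisy data the result can depend strongly on the initial guess.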
3. Results
The results of the calculation are shown in Table 1 and in Figure 1. From the data in
Table 1, we see that all data sets have a high $R^2$ (> 0.98). Also, we can see that the
final number of recovered persons converges and the predicted values do not differ
substantially; however, the predicted total population involved differs substantially.
Here we note that from day 28, the collection of data changed. Until day 28, the
estimated epidemic size was about 52 000 infections; after that, the prediction changed
to about 85 700 infections.
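For reference, the coefficient of determination quoted above is the one computed in the program listing at the end of this note (the lines that form SSres and SStot). Written out with the data $C_i$ and the model estimates $\hat{C}_i$ of (10), where the symbol $\bar{C}$ for the mean of the data is introduced here:

```latex
R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}},
\qquad
SS_{\mathrm{res}} = \sum_{i=1}^{n} \bigl(C_i - \hat{C}_i\bigr)^2,
\qquad
SS_{\mathrm{tot}} = \sum_{i=1}^{n} \bigl(C_i - \bar{C}\bigr)^2,
\qquad
\bar{C} = \frac{1}{n} \sum_{i=1}^{n} C_i .
```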
Table 1. Convergence study. After day 28, the method of data collection changed.
Figure 1. Actual and predicted number of cases by the SIR model and logistic model
(data up to 25 Feb 2020)
Having a series of final predictions, we can estimate the series limit by the Shanks
transformation [4, 5].
For data from Table 1, the current prediction is 84085 cases (Table 2).
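In its standard form, the Shanks transformation replaces three consecutive terms $A_{n-1}$, $A_n$, $A_{n+1}$ of a sequence by $S(A_n) = (A_{n+1}A_{n-1} - A_n^2)/(A_{n+1} - 2A_n + A_{n-1})$. Whether the note uses exactly this form is an assumption. The MATLAB sketch below applies it to a textbook test sequence (partial sums of the Leibniz series for pi), not to the predictions of Table 1, which are not reproduced here.

```matlab
% Test sequence (not the data of Table 1): partial sums of
% 4*(1 - 1/3 + 1/5 - ...), which converge slowly to pi
k = 0:9;
A = cumsum(4*(-1).^k ./ (2*k + 1));

% Shanks transformation applied to every interior term of a sequence;
% the result has two elements fewer than the input
shanks = @(a) (a(3:end).*a(1:end-2) - a(2:end-1).^2) ./ ...
              (a(3:end) - 2*a(2:end-1) + a(1:end-2));

S1 = shanks(A);     % one application
S2 = shanks(S1);    % repeated application

fprintf('error of last partial sum : %.2e\n', abs(A(end)  - pi));
fprintf('error after Shanks once   : %.2e\n', abs(S1(end) - pi));
fprintf('error after Shanks twice  : %.2e\n', abs(S2(end) - pi));
```

The denominator vanishes when the three terms are equally spaced, so for a nearly converged or noisy series of predictions the transformed values should be inspected rather than trusted blindly.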
4. Conclusion
If the method of data collection does not change again, and if the situation remains
stable, then the SIR model predicts an epidemic size of about 84 100 cases.
This prediction is comparable with the current prediction of 84 000 infections by the
empirical logistic model [6] (see Fig. 2).
close all
% get data
[CC,date0] = getData();
% use only the first ii data points
C = CC(1:ii);
% final number
Rinf = Rmax();
Sinf = S0*exp(-beta/gamma/N*(Rinf - R0));
% calculate R2
tspan = 0:length(C)-1; % final time
ic = [S0 I0 R0]'; % initial conditions
opts = []; % no options set
[~,z] = ode45( @SIR, tspan, ic, opts);
z = z(:,2)+z(:,3);
zbar = sum(C)/length(C);
SStot = sum((C - zbar).^2);
SSres = sum((C - z').^2);
R2 = 1 - SSres/SStot;
% print results
fprintf('%12s %3d %10d %10d %10d %7.3f %7.3f %7.3f %7.3f\n',...
    datestr(date0+ceil(length(C)-1)),ceil(length(C)),round(N,0),...
    round(Sinf,0),round(Rinf,0),beta,gamma,beta/gamma,R2)
% fprintf('Estimated parameters\n')
% fprintf(' End date %s\n',datestr(date0+ceil(length(C)-1)));
% fprintf(' Day number %d\n',ceil(length(C)));
% fprintf(' Population size %d\n',round(N,0));
% fprintf(' Initial infected %d\n',round(z(1,2),0));
% fprintf(' Remain susceptible %d\n',round(Sinf,0));
% fprintf(' Total recovered %d\n',round(Rinf,0));
% fprintf(' Contact rate %g\n',beta);
% fprintf(' recovery rate %g\n',gamma);
% fprintf(' Recovery time %g\n',1/gamma);
% fprintf(' Reproduction number %g\n',beta/gamma);
% fprintf(' R2 %g\n',R2);
end
% set parameters
tspan = 0:2*length(C); % final time
ic = [S0 I0 R0]'; % initial conditions
opts = []; % no options set
% simulate
[t,z] = ode45( @SIR, tspan, ic, opts);
% plot results
figure
hold on
plot(t,z,'LineWidth',2)
legend('Susceptible','Infected','Recovered',...
'Location','best','FontSize',12)
xlabel('Day (after 16 jan 2020)')
ylabel('Cases')
grid on
hold off
shg
% plot comparison
figure
hold on
plot(t,(z(:,2)+z(:,3))','k','LineWidth',2)
scatter(0:length(C)-1,C,50,'filled') % C(1) is the value at t = 0, as in tspan
legend('Predicted','Actual',...
'Location','best','FontSize',12)
xlabel('Day (after 16 jan 2020)')
ylabel('Cases')
grid on
hold off
shg
% save to global
ta = t;
Ca = z(:,2)+z(:,3);
function b = iniguess()
%INIGUESS Obtain initial guess
global beta gamma
global S0 I0 R0 % initial values
global C
global init
if ~init
beta = 1/0.00267103;
gamma = 1/0.00267232;
S0 = 1e8;
I0 = C(1);
R0 = 0;
init = true;
end
b(1) = beta;
b(2) = gamma;
b(3) = S0;
end
function b = parest
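function f = fun( b)
%FUN Optimization function (sum of squared residuals, Eq. (10))
global beta gamma S0 I0 R0
global C Ca
% set parameters
beta = b(1);
gamma = b(2);
S0 = b(3);
tend = length(C);
% NOTE: tspan and ic are not defined in the available listing; the two
% lines below are an assumption, by analogy with the R2 calculation
tspan = 0:tend-1; % one point per data value
ic = [S0 I0 R0]'; % initial conditions
% solve ODE
try
[tsol,zsol] = ode45(@SIR,tspan,ic);
catch
f = NaN;
return
end
% NOTE: the residual is not in the available listing; the two lines
% below are an assumption based on Eqs. (9) and (10)
Ca = zsol(:,2) + zsol(:,3); % predicted number of cases
f = sum((C - Ca').^2);
end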
function r = Rmax()
%RMAX Calculate number of recovered individuals after t=inf
global N S0 beta gamma R0
RN = beta/gamma;
r = fzero(@f,[0,S0]);
%-----------------------
function z = f(x)
z = x - (N - S0*exp(-RN*(x - R0)/N));
end
end
References
[1] M. Batista, Estimation of the final size of the coronavirus epidemic by the logistic
model, medRxiv (2020) 2020.02.16.20023606.
[2] H.W. Hethcote, The Mathematics of Infectious Diseases, SIAM Review 42(4)
(2000) 599-653.
[3] I. Nesteruk, Statistics based predictions of coronavirus 2019-nCoV spreading in
mainland China, medRxiv (2020) 2020.02.12.20021931.
[4] D. Shanks, Non-linear Transformations of Divergent and Slowly Convergent
Sequences, Journal of Mathematics and Physics 34(1-4) (1955) 1-42.
[5] C.M. Bender, S.A. Orszag, Advanced mathematical methods for scientists and
engineers I asymptotic methods and perturbation theory, Springer, New York, 1999.
[6] M. Batista, Estimation of the final size of the coronavirus epidemic by the logistic
model, 2020 DOI: 10.13140/RG.2.2.36053.37603.