SIR Models: An Introduction: Tingda Wang July 2020
Tingda Wang
July 2020
1 Introduction
The classic approach to disease modeling is a family of models called “SIR models”,
pioneered by Kermack and McKendrick in 1927.
These models try to predict quantities such as how a disease spreads, the total
number infected, and the duration of an epidemic, and to estimate various
epidemiological parameters. Such models can also show how different public
health interventions may affect the outcome of the epidemic. For example, we
can determine the most efficient strategy for distributing a limited number
of vaccines in a given population.
To understand this model, we will first look at its most fundamental form, as a
deterministic model consisting of ordinary differential equations (ODE). Then,
we will look at its stochastic representation as well as modeling individual agents
directly.
2 Model Variables
The SIR model of an epidemic focuses on populations of individuals which are
divided into three categories: those who are Susceptible to infection with a
disease, those who are Infected and are infectious with the disease as well as
those who have Recovered from the disease and are immune to further infec-
tion.
The sizes of the populations of susceptible (S), infected (I), and recovered (R)
individuals are the variables of this model that will change over discrete time
steps. We will assume they are related by the following transitions:
clearance of the pathogen from a typical individual.
We can model this SIR system in several ways: we will look at the continuous
deterministic approach, the stochastic probability distribution approach,
and the individual agent approach.
Note also that the three variables are functions of time. Therefore, we denote
them as S(t), I(t), R(t).
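\begin{align}
\frac{dS(t)}{dt} &= -\frac{a}{N}\, I(t)\, S(t) \tag{1} \\
\frac{dI(t)}{dt} &= \frac{a}{N}\, I(t)\, S(t) - b\, I(t) \tag{2} \\
\frac{dR(t)}{dt} &= b\, I(t) \tag{3}
\end{align}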
where S(t), I(t), and R(t) are the Susceptible, Infected, and Recovered pop-
ulations respectively. a is the rate of infection and b is the rate of recovery.
Both are assumed to be constant.
Looking at the change in R, the model assumes the sick recover at a steady
rate with factor b, hence the term bI(t).
Knowing the above, the change in the infected is just the newly infected minus
those who recover. Hence, we have the set of equations above.
This model makes certain assumptions, such as that all the sick recover
at a constant rate independent of the population. Also, there are no treatment
effects yet, but we will incorporate how policy or social distancing can affect
the disease progression in more advanced models.
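Equations (1)-(3) can also be integrated by hand to see what an ODE solver does with them. The following is a minimal Python sketch, not the document's Matlab code: the function names `sir_rhs` and `integrate_sir`, the fixed step `dt`, and the end time `t_end` are assumptions made for illustration, and a classical fourth-order Runge-Kutta step stands in for an adaptive solver. The initial values and the constants a = 2 ln 2, b = ln 2 are the ones used in the Matlab listing.

```python
import math


def sir_rhs(s, i, r, a, b, n):
    """Right-hand sides of equations (1)-(3) for the state (S, I, R)."""
    new_infections = a / n * i * s   # flow from S to I
    recoveries = b * i               # flow from I to R
    return (-new_infections, new_infections - recoveries, recoveries)


def integrate_sir(s0, i0, r0, a, b, t_end, dt):
    """Integrate the SIR equations with fixed-step fourth-order Runge-Kutta.

    Returns a list of (t, S, I, R) tuples. t_end and dt are illustrative
    choices, not values taken from the original text.
    """
    n = s0 + i0 + r0
    state = (float(s0), float(i0), float(r0))
    history = [(0.0,) + state]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        k1 = sir_rhs(*state, a, b, n)
        k2 = sir_rhs(*(x + 0.5 * dt * d for x, d in zip(state, k1)), a, b, n)
        k3 = sir_rhs(*(x + 0.5 * dt * d for x, d in zip(state, k2)), a, b, n)
        k4 = sir_rhs(*(x + dt * d for x, d in zip(state, k3)), a, b, n)
        state = tuple(
            x + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
            for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
        )
        history.append((k * dt,) + state)
    return history


history = integrate_sir(999, 1, 0, a=2 * math.log(2), b=math.log(2),
                        t_end=30.0, dt=0.01)
# Row of the history at which the infected count is largest.
t_peak, s_peak, i_peak, r_peak = max(history, key=lambda row: row[2])
print(f"I(t) peaks at t = {t_peak:.2f} with about {i_peak:.0f} infected")
```

Because the three right-hand sides sum to zero, S + I + R stays equal to N at every step, which is a convenient sanity check on any implementation.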
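%filename: SIRode.m
% Function for solving SIR system of ODEs
function SIRode
% time span
tsteps = 20; % assumed value; tsteps is not defined in the original listing
tspan = [0 tsteps];
% starting values
S = 999;
I = 1;
R = 0;
% constants
a = 2 * log(2);
b = log(2);
N = 1000;
% The solver call and plotting are not shown in the original listing;
% the lines below are an assumed reconstruction using ode45.
[t, x] = ode45(@(t, x) sir_rhs(x, a, b, N), tspan, [S; I; R]);
figure;
plot(t, x);
xlabel('time');
legend('S(t)', 'I(t)', 'R(t)');
end

% right-hand sides of equations (1)-(3), with x = [S; I; R]
function funcs = sir_rhs(x, a, b, N)
funcs = zeros(3, 1);
funcs(1) = (-1 * a / N) * x(2) * x(1);
funcs(2) = ((a / N) * x(2) * x(1)) - (b * x(2));
funcs(3) = b * x(2);
end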
3.2 Results
Figure 1: ODE results from Matlab code. The blue line is the S population,
orange line I, and yellow line R.
From Figure 1, we can see the results of the ODE solver that attempts to
find solutions for the above system of equations. Note how S(t) decreases as
I(t) increases. At first R(t) is flat but after some time of I(t) being nonzero,
R(t) begins to increase.
The bell-shaped curve of I(t) illustrates herd immunity, a form of indirect
protection from infectious disease that occurs when a large percentage of
a population has become immune after being infected. When a large proportion
of individuals are immune, and therefore unlikely to contribute to disease
transmission, chains of infection are more likely to be disrupted, which causes
the number of infected to fall and eventually approach zero.
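The turning point of the I(t) curve can be located directly from equation (2). The short derivation below is a sketch under the model's own assumptions; the symbol S^* for the susceptible count at the peak is introduced here for illustration, and the numerical values are the constants a = 2 ln 2, b = ln 2, N = 1000 from the Matlab listing.

```latex
% Equation (2), factored; I(t) > 0 during the epidemic
\frac{dI(t)}{dt} = \left( \frac{a}{N}\, S(t) - b \right) I(t) = 0
\quad\Longrightarrow\quad
S^{*} = \frac{b N}{a}

% With a = 2\ln 2,\; b = \ln 2,\; N = 1000:
S^{*} = \frac{(\ln 2)\cdot 1000}{2 \ln 2} = 500
```

With these constants, the number of infected starts to fall once roughly half of the population is no longer susceptible.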
\[
p = -dS(t) = \frac{a}{N}\, I(t)\, S(t)\, dt \tag{4}
\]
Then, all we need to do is iterate over these extremely small time steps and,
at each time step, sample from the Bernoulli distribution to determine whether
an additional S individual will be infected and whether an additional I
individual will recover.
Note also that, because the process is random, we may not see an increase in
infection if the first infected individual recovers before infecting anyone.
To make sure the system moves, we can enforce the rule that the population
cannot fully recover and the number of infected cannot reach 0.
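The procedure just described can be sketched in a few lines. The following Python sketch is an illustration rather than the document's Matlab code: the names `simulate_stochastic_sir` and `average_infected`, the `keep_infected` flag, and the values chosen for `dt`, `nsteps`, and the number of runs are all assumptions. In particular, cancelling a recovery that would leave nobody infected is only one possible way to enforce the rule that the number of infected cannot reach 0.

```python
import math
import random


def simulate_stochastic_sir(n, i0, a, b, dt, nsteps, keep_infected=False, rng=None):
    """One realization of the Bernoulli-step SIR simulation.

    At every step at most one infection and at most one recovery can occur.
    keep_infected=True is an assumed way of enforcing I > 0: a recovery that
    would leave nobody infected is cancelled.
    Returns a list of (S, I, R) tuples of length nsteps.
    """
    rng = rng or random.Random()
    s, r = n - i0, 0
    history = [(s, i0, r)]
    for _ in range(nsteps - 1):
        i = n - s - r
        # Bernoulli draws with the probabilities a/N * S * I * dt and b * I * dt
        infect = int(rng.random() < a / n * s * i * dt)
        recover = int(rng.random() < b * i * dt)
        if keep_infected and recover and i + infect - recover < 1:
            recover = 0
        s -= infect
        r += recover
        history.append((s, n - s - r, r))
    return history


def average_infected(histories):
    """Average I(t) over several realizations, step by step."""
    count = len(histories)
    return [sum(h[k][1] for h in histories) / count
            for k in range(len(histories[0]))]


rng = random.Random(0)  # fixed seed so that the run is repeatable
runs = [
    simulate_stochastic_sir(1000, 1, 2 * math.log(2), math.log(2),
                            dt=0.001, nsteps=20000, keep_infected=True, rng=rng)
    for _ in range(5)
]
i_av = average_infected(runs)
print("largest average number of infected:", max(i_av))
```

The step dt has to be small enough that both probabilities stay below 1; with N = 1000 and a = 2 ln 2 the infection probability is at most about 0.35 for dt = 0.001.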
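N = 1000; % total population
% a, b, dt, nsteps, jmax and T_max are assumed to be defined before this
% point; their values are not shown in this listing
S_av = zeros(nsteps, 1); % average over several stochastic realizations
R_av = zeros(nsteps, 1);
I_av = zeros(nsteps, 1);
for j = 1:jmax
    S = zeros(nsteps, 1);
    R = zeros(nsteps, 1);
    I = zeros(nsteps, 1);
    % initial conditions
    S(1) = 999;
    R(1) = 0;
    I(1) = N - S(1) - R(1);
    t = 1;
    while (t < nsteps)
        r1 = rand(1); % random number for S switch
        r2 = rand(1); % random number for R switch
        if (r1 < a/N*S(t)*I(t)*dt)
            S(t+1) = S(t) - 1;
        else
            S(t+1) = S(t);
        end
        if (r2 < b*I(t)*dt)
            R(t+1) = R(t) + 1;
        else
            R(t+1) = R(t);
        end
        I(t+1) = N - S(t+1) - R(t+1); % keep I consistent with S and R
        t = t + 1;
    end
    % accumulate this realization into the averages
    S_av = S_av + S/jmax;
    R_av = R_av + R/jmax;
    I_av = I_av + I/jmax;
end
t_vec = linspace(1, nsteps*dt, nsteps);
figure;
plot(t_vec, I_av, '-');
hold on
plot(t_vec, S_av, '--');
plot(t_vec, R_av, '.');
hold off
xlabel('time steps');
title('stochastic simulation');
legend('I(t)', 'S(t)', 'R(t)')
xlim([0, T_max])
saveas(gcf, 'stochastic_1', 'epsc')
saveas(gcf, 'stochastic_1', 'png')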
% N, a, b, dt, nsteps, jmax and T_max are assumed to be defined before this
% point; their values are not shown in this listing
S_av = zeros(nsteps, 1); % average over several stochastic realizations
R_av = zeros(nsteps, 1);
I_av = zeros(nsteps, 1);
for j = 1:jmax
    S = zeros(nsteps, 1);
    R = zeros(nsteps, 1);
    I = zeros(nsteps, 1);
    % initial conditions
    S(1) = 999;
    R(1) = 0;
    I(1) = N - S(1) - R(1);
    t = 1;
    while (t < nsteps)
        r1 = rand(1); % random number for S switch
        r2 = rand(1); % random number for R switch
        if (r1 < a/N*S(t)*I(t)*dt)
            S(t+1) = S(t) - 1;
        else
            S(t+1) = S(t);
        end
        if (r2 < b*I(t)*dt)
            R(t+1) = R(t) + 1;
        else
            R(t+1) = R(t);
        end
        I(t+1) = N - S(t+1) - R(t+1); % keep I consistent with S and R
        t = t + 1;
    end
    % accumulate this realization into the averages
    S_av = S_av + S/jmax;
    R_av = R_av + R/jmax;
    I_av = I_av + I/jmax;
end
t_vec = linspace(1, nsteps*dt, nsteps);
figure;
plot(t_vec, I_av, '-');
hold on
plot(t_vec, S_av, '--');
plot(t_vec, R_av, '.');
hold off
xlabel('time steps');
title('stochastic simulation');
legend('I(t)', 'S(t)', 'R(t)')
xlim([0, T_max])
saveas(gcf, 'stochastic_2', 'epsc')
saveas(gcf, 'stochastic_2', 'png')
4.2 Results
Looking at Figure 2, we see the results differ quite a bit from the deterministic
ODE approach. Due to the probabilistic nature of the model, it is possible that
a disease simply doesn’t take off due to chance events and the number of
infected quickly reaches zero. Here, the model predicts a much dampened spread
of the disease. However, by simply enforcing that the number of infected never
reaches 0, we can ensure such rare events don’t occur in the probabilistic setting.
In Figure 3, we see that the predicted results match the results in our determin-
istic model earlier. Therefore, we have developed a probabilistic counterpart
of the ODE deterministic models. With the added stochasticity, we are now
able to incorporate unpredictable factors and see how that changes the model
results.
Figure 2: Stochastic modeling when we assume population can fully recover. I
can reach zero.
Note the important difference here: the infected are only infectious for one
time step. Afterwards, they become part of the recovered population. Having
this assumption makes the math derivation easier.
Suppose that, for any infected individual, the probability of infecting any
given member of the susceptible population is p. Then we can represent each
such event as an independent, identically distributed Bernoulli random variable
(think of it as a coin toss) with probability p of infection and 1 − p of no
infection. Denote this as Bern(p).
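The step from a single Bern(p) contact to the infection probability that appears in the binomial update of equation (7) can be spelled out as follows. This is a sketch that assumes contacts with different infected individuals are independent; the symbol q is introduced here for illustration, and the numbers p = 0.001 and I_{t-1} = 10 are purely illustrative.

```latex
% A susceptible stays uninfected only if every one of the I_{t-1}
% infected individuals fails to infect them:
P(\text{escape}) = (1 - p)^{I_{t-1}}

% so the per-susceptible infection probability in one time step is
q = 1 - (1 - p)^{I_{t-1}}

% Illustration with p = 0.001 and I_{t-1} = 10:
q = 1 - 0.999^{10} \approx 0.00996
```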
Figure 3: Stochastic modeling when we assume population cannot fully recover.
So I > 0.
\begin{align}
S_t &= S_{t-1} - I_t \tag{6} \\
I_t &\sim \mathrm{Binom}\!\left(S_{t-1},\; 1 - (1 - p)^{I_{t-1}}\right) \tag{7} \\
R_t &= R_{t-1} + I_{t-1} \tag{8}
\end{align}
Here all infected are only infectious for one time step and then become part
of the recovered. The S population gradually becomes infected according to the
binomial process.
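Equations (6)-(8) translate almost line by line into code. The Python sketch below is an illustration, not the document's Matlab listing: the function name `reed_frost` and the values n = 1000, i0 = 1, p = 0.002, and nt = 30 are assumptions. The binomial draw is made by summing individual Bernoulli trials so that only the standard library is needed.

```python
import random


def reed_frost(n, i0, p, nt, rng=None):
    """Simulate the chain binomial update of equations (6)-(8).

    n  : total population (assumed value in the example below)
    i0 : initial number of infected
    p  : probability that one infected infects one given susceptible
    nt : number of time steps to record
    Returns a list of (S, I, R) tuples of length nt.
    """
    rng = rng or random.Random()
    s, i, r = n - i0, i0, 0
    history = [(s, i, r)]
    for _ in range(nt - 1):
        q = 1.0 - (1.0 - p) ** i  # infection probability per susceptible
        # Binomial(S, q) draw as a sum of S Bernoulli(q) trials
        new_i = sum(1 for _trial in range(s) if rng.random() < q)
        # The right-hand side uses the old i, so R gains last step's infected.
        s, i, r = s - new_i, new_i, r + i
        history.append((s, i, r))
    return history


history = reed_frost(n=1000, i0=1, p=0.002, nt=30, rng=random.Random(0))
for t, (s, i, r) in enumerate(history[:5]):
    print(t, s, i, r)
```

Calling `reed_frost` repeatedly with different seeds gives a different I(t) curve each time, and some runs die out after the first step or two.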
This model you see here is the stochastic Reed-Frost model, which is a chain bi-
nomial model, and is part of a large class of stochastic models known as Markov
chain models. A Markov chain is defined as a stochastic process with the prop-
erty that the future state of the system is dependent only on the present state
of the system and conditionally independent of all past states. This is known
as the memoryless property.
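% N (population), Nt (time steps), p (infection probability) and I(1,1)
% are assumed to be set beforehand; they are not shown in this listing
R=zeros(Nt,1);
S(1,1)=N-I(1,1);
% update step following equations (6)-(8) (assumed; loop not shown in the original)
for t=2:Nt
    I(t,1)=sum(rand(S(t-1),1)<1-(1-p)^I(t-1)); % binomial draw
    S(t,1)=S(t-1)-I(t);
    R(t,1)=R(t-1)+I(t-1);
end
figure;
plot(1:Nt,S,1:Nt,I,1:Nt,R);
xlabel('time step');
title('reed frost');
legend('S(t)','I(t)','R(t)')
xlim([0,Nt])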
5.3 Results
Running the model through a binomial random variable yields results similar
to those of our previous models. However, due to the stochastic nature of the
model, each instance will yield a slightly different set of curves.
In fact, try running several instances of the model and plotting the resulting
I(t) curves together. How do they compare?
Figure 4: Reed Frost Model Results
In individual agent models the entities, hence variables, of the model are the
individuals themselves. The population, then, is a set of these variables. The
individuals might be represented by a single variable, for example the state de-
scriptors of the specific individual as to being susceptible, infected or recovered,
or a complex data structure of descriptors, such as age, sex, height, weight, etc.
The dynamics of the model are represented by selecting individuals from the
population set and changing their attributes on the basis of the attribute values
of others in the population, often based on heuristics.
statistics would be presented as the results of the model. Note that the results
are aggregate in nature, and don’t differ from the ordinary differential equation
and stochastic models.
Individual-agent models are often called simulations and rightly so. Proponents
of this type of modeling stress the high degree of realism possible in the com-
plex dynamic functions one can incorporate as computer programs. However,
applied mathematicians frown on them because the complex dynamic functions
usually prohibit any formal analysis. Therefore, running the simulation pro-
gram with specific parameter values is pretty much all you can do with it. It is
also unreliable to generalize from a small number of runs of the program. One
usually would need to run multiple instances and average the results.
At each time step, each individual contacts some number of random individuals
in the population. The number could be an attribute of the individual, but for
simplicity this model assumes it is constant over the population. We denote
this as nc. This number represents the frequency of contact.
If the first individual is in the infected state and they contact an individual
in the susceptible state, the susceptible changes state to infected with some
given probability pt. Note that unlike our previous models, we separate the
two factors that make up the transmission rate of the disease: the frequency
of contacts and the virulence (contagious strength) of the pathogen. In the
stochastic model, both factors are incorporated into the probability p.
Finally, after the contacts have been made, an infected individual has their
state changed from ‘i’ to ‘r’. Therefore, like the Reed-Frost model, each infected
recovers after one timestep.
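The contact, transmission, and recovery rules above can be sketched as one update step on a list of state labels. The Python code below is an illustration under stated assumptions, not the document's Matlab implementation: the names `epidemic_step`, `census`, and `run_epidemic`, the initial counts (9990 susceptible, 10 infected), and the cap of 100 steps are all hypothetical. Contacts are drawn uniformly from the whole population, so an infected individual may contact itself or the same person twice.

```python
import random
from collections import Counter


def epidemic_step(pop, nc, pt, rng):
    """Apply one time step to a list of 's', 'i', 'r' labels.

    Every infected individual makes nc random contacts; a contacted
    susceptible becomes infected with probability pt. Every individual that
    was infected at the start of the step is recovered at the end of it.
    """
    n = len(pop)
    new_pop = list(pop)
    for j, state in enumerate(pop):
        if state != "i":
            continue
        for _ in range(nc):
            k = rng.randrange(n)  # index of a randomly chosen contact
            if pop[k] == "s" and rng.random() < pt:
                new_pop[k] = "i"
        new_pop[j] = "r"
    return new_pop


def census(pop):
    """Count the susceptible, infected, and recovered individuals."""
    counts = Counter(pop)
    return counts["s"], counts["i"], counts["r"]


def run_epidemic(s0, i0, r0, nc, pt, max_steps, rng=None):
    """Run until nobody is infected or max_steps is reached."""
    rng = rng or random.Random()
    pop = ["s"] * s0 + ["i"] * i0 + ["r"] * r0
    results = [census(pop)]
    for _ in range(max_steps):
        if results[-1][1] == 0:
            break
        pop = epidemic_step(pop, nc, pt, rng)
        results.append(census(pop))
    return results


results = run_epidemic(9990, 10, 0, nc=20, pt=0.1, max_steps=100,
                       rng=random.Random(0))
print("steps simulated:", len(results) - 1)
print("final (s, i, r):", results[-1])
```

Reading the old population `pop` while writing to `new_pop` keeps the step synchronous: individuals infected during a step do not start infecting others until the next one.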
6.2 Matlab Code
The code will be slightly more complex due to the nature of simulating individual
agents and having to aggregate the data (like a census). Therefore we will divide
the code into multiple parts for better readability and style.
function pop=initiate(s0,i0,r0)
% this fn creates an initial population vector for the epidemic simulation
% each state's population is appended to the growing vector, pop
% inputs are initial number of s, i, r individuals
% returns the pop vector
pop=[];
for i=1:s0
    pop=[pop,'s'];
end
for i=1:i0
    pop=[pop,'i'];
end
for i=1:r0
    pop=[pop,'r'];
end
The next function is the core of the model, dealing with the simulation
mechanics.
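function pop2=epidemic(nc,pt,pop1)
% epidemic simulation which calculates a new pop2
% nc is number of contacts per infected
% pop1 is vector of 's', 'i', and 'r'
% assume that all 'i' cells go to 'r' after each step
% each i makes nc random infectious contacts with other cells.
% 's' goes to 'i' with probability pt if it is contacted by an 'i'.
% the body below is an assumed reconstruction from the comments above
n=length(pop1);
pop2=pop1;
for j=1:n
    if pop1(j)=='i'
        for k=1:nc
            m=randi(n); % index of a randomly chosen contact
            if pop1(m)=='s' && rand<pt, pop2(m)='i'; end
        end
        pop2(j)='r';
    end
end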
Now we need a function to count the number of individuals in each state and
return the results.
function [s,i,r]=census(pop)
% counts the number of s, i and r cells in pop
s=0;
i=0;
r=0;
n=length(pop(1,:)); % pop is a 1 by n array
for j=1:n
    if pop(j)=='s'
        s=s+1;
    elseif pop(j)=='i'
        i=i+1;
    elseif pop(j)=='r'
        r=r+1;
    end
end
Lastly, we can make a script that runs an instance of the simulation using
the functions above.
[tt, results] = simulate(10000, 20, .1);
figure(1);
plot(results, '-o')
figure(2);
[tt, results] = simulate(10000, 4, .5);
plot(results, '-o')
figure(3);
[tt, results] = simulate(10000, 8, .01);
plot(results, '-o')
6.3 Results
Looking at the individual agent simulation results, we see that Figure 5 looks
roughly identical to the previous models. However, note that the time span is
now longer, at more than 25 time steps. We used parameters (20, .1) and (4, .5).
Interestingly, both sets of parameters gave the same plot below. Note
that the product of the two parameters, the combined effect of the virulence
and number-of-contacts parameters, is the same in this case.
Also, decreasing the parameters too much can result in no outbreak at all, as
seen in Figure 6.
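A rough way to see why the product matters is to count the expected number of new infections caused by one infected individual in one time step. The estimate below is a sketch, not a result from the original text: it assumes contacts are drawn uniformly from the population and ignores repeated contacts with the same individual. The symbol m is introduced here for illustration.

```latex
% Expected new infections per infected individual per step
m \approx n_c \, p_t \, \frac{S}{N}

% Early in the epidemic S/N \approx 1, so m \approx n_c \, p_t:
(n_c, p_t) = (20,\ 0.1):  \quad m \approx 2
(n_c, p_t) = (4,\ 0.5):   \quad m \approx 2
(n_c, p_t) = (8,\ 0.01):  \quad m \approx 0.08
```

Under this approximation the first two parameter sets start out with the same average growth, while in the third each infected individual produces far fewer than one new infection on average, so the outbreak is expected to die out.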
Overall, individual agent models give you a lot of freedom to simulate events
and allow for a better look at population dynamics. However, the results don’t
have the mathematical backing that the ODEs have. Therefore, all
you have is your intuition to interpret the results.
Figure 5: Individual Agent model results: nc = 20, pt = 0.1
Figure 6: Nothing happens when the parameters are too low: nc = 8, pt = 0.01