Lecture 02

The document discusses three epidemic models: the Reed-Frost model, the deterministic SIR model, and the stochastic SIR model in continuous time. The Reed-Frost model is a discrete-time SIR model where the probability of transmission between an infectious and susceptible individual is w. The final size distribution of an outbreak can be computed exactly for small populations. Simulation of the Reed-Frost model is also demonstrated using R code. The deterministic SIR model describes the dynamics of an epidemic using differential equations. The stochastic SIR model considers the stochastic effects in a continuous-time framework. Maximum likelihood estimation is also introduced as a method for statistical inference of model parameters based on observed outbreak data.

Reed-Frost model Deterministic SIR model Stochastic SIR model in continuous time References

The Mathematics and Statistics of Infectious Disease Outbreaks

Michael Höhle
Department of Mathematics
Stockholm University, Sweden

L2: Simulation and Fitting of Epidemic Models

LaMo: 2020-06-16 @ 22:13:45
MT3002 Michael Höhle 1 / 40

Overview

1 Reed-Frost model

2 Deterministic SIR model

3 Stochastic SIR model in continuous time


The Reed-Frost epidemic model


Discrete-time SIR model, where individuals are either
Susceptible,
Infectious or
Recovered / Removed (dead, isolated or immune)
Closed population with initially
$x_0 = n$ susceptible and
$y_0 = m$ infectious individuals
Dynamics are described by a discrete-time Markov chain

$$Y_{t+1} \mid x_t, y_t \sim \mathrm{Bin}\left(x_t,\; 1 - (1 - w)^{y_t}\right),$$

$$X_{t+1} = x_t - Y_{t+1},$$

where $w$ is the probability of an infectious contact between an
infectious and a susceptible during one unit of time.


The Reed-Frost epidemic model (2)

Exercise 1
Show that the one-step success probability in the Markov chain
equation on the previous slide is $1 - (1 - w)^{y_t}$.


Final size distribution


The final size of the epidemic is $Z = Y_1 + Y_2 + Y_3 + \dots$
Final size distribution can be computed exactly for small $n$,
say $n \le 30$.
Final size distribution for $n = 20$, $m = 1$ and $w = 0.02$:

[Figure: bar plot of the final size distribution; x-axis: final size, y-axis: probability]
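One possible way to carry out the exact computation is a forward recursion over the states $(x, y)$ of the Markov chain. The following is a minimal sketch under that assumption; the function name `fsize.exact` is hypothetical and not part of the lecture code.

```r
# Exact final size distribution of the Reed-Frost chain by forward recursion.
# fsize.exact is a hypothetical helper name; n, m, w are as on the slide.
fsize.exact <- function(n, m, w) {
  # P[x + 1, y + 1] = probability that the chain is currently in state (x, y)
  P <- matrix(0, nrow = n + 1, ncol = max(n, m) + 1)
  P[n + 1, m + 1] <- 1
  # final[z + 1] = P(Z = z), filled in as probability mass reaches y = 0
  final <- numeric(n + 1)
  while (any(P > 0)) {
    Pnew <- matrix(0, nrow = nrow(P), ncol = ncol(P))
    for (s in which(P > 0)) {
      # Recover (x, y) from the column-major linear index s
      x <- (s - 1) %% nrow(P)
      y <- (s - 1) %/% nrow(P)
      if (y == 0) {
        # No infectives left: the epidemic has ended with final size n - x
        final[n - x + 1] <- final[n - x + 1] + P[s]
      } else {
        # k new infections lead to state (x - k, k)
        k <- 0:x
        idx <- cbind(x - k + 1, k + 1)
        Pnew[idx] <- Pnew[idx] + P[s] * dbinom(k, size = x, prob = 1 - (1 - w)^y)
      }
    }
    P <- Pnew
  }
  final
}

fs <- fsize.exact(n = 20, m = 1, w = 0.02)
round(fs[1:4], 3)   # P(Z = 0), ..., P(Z = 3)
```

A quick sanity check is that `fs` sums to one and that `fs[1]` equals $(1 - w)^{n}$ for $m = 1$, the probability that the initial infective infects nobody.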


Simulated final size distribution for n = 100 and m = 1

[Figure: eight panels showing the simulated final size distribution (x-axis: final size, 0 to 98; y-axis: probability, 0 to 0.30), one panel each for w = 0.010, 0.015, 0.020, 0.025, 0.030, 0.035, 0.040 and 0.045]


R Code for Simulation of the Reed-Frost Model

fsize.RF <- function(n, m, w, samples) {
  # Initial susceptibles (one entry per simulated epidemic)
  xj <- rep(n, samples)
  # Initial infectives
  yj <- rep(m, samples)

  # Loop until all simulated epidemics have ceased
  while (any(yj > 0 & xj > 0)) {
    # Sample the next generation for all processes concurrently;
    # rbinom returns 0 whenever xj == 0 or yj == 0
    yj <- rbinom(samples, size = xj, prob = 1 - (1 - w)^yj)
    # Update the number of susceptibles
    xj <- xj - yj
  }
  # Final size of each simulated epidemic
  return(n - xj)
}


Aside: Inference by Maximum Likelihood Estimation

Maximum Likelihood Estimation is a method in statistics to
estimate the parameters of a statistical model.
The statistical model leads to a probability distribution for the
observed data, i.e. in the discrete case
$f_{\text{Model}}(y; \theta) = P_\theta(Y = y)$.
Considering the data as fixed, we can formulate the
likelihood function as $L(\theta; y) = f_{\text{Model}}(y; \theta)$.
The point in the parameter space that maximizes the
likelihood function is called the maximum likelihood estimate.
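As a small illustration of these steps, the sketch below maximizes a binomial log-likelihood numerically. The data (7 successes in 20 trials) are made up for the example, and the names `loglik` and `theta.hat` are hypothetical.

```r
# Binomial log-likelihood for y successes in n trials (illustrative data only)
loglik <- function(theta, y, n) dbinom(y, size = n, prob = theta, log = TRUE)

# Maximize over the parameter space (0, 1)
fit <- optimize(loglik, interval = c(0, 1), y = 7, n = 20, maximum = TRUE)
theta.hat <- fit$maximum   # numerically close to the analytical MLE y / n
```

For this model the maximum can also be found analytically as $\hat{\theta} = y/n$, which gives a convenient check on the numerical result.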


Statistical inference
Estimation of $w$ from time series data $y = (y_0, y_1, y_2, \dots, y_K)$
using the binomial likelihood

$$L(w) \propto \prod_{t=0}^{K-1} p_t^{\,y_{t+1}} (1 - p_t)^{x_t - y_{t+1}},$$

where $p_t = 1 - (1 - w)^{y_t}$. Knowledge of $x_0$ is therefore required.
Uncertainty of $\hat{w}$ can be quantified with a 95% confidence
interval.
Example: Generation sizes of a measles epidemic in
St. Petersburg (from Table 4.1 in Daley and Gani, 1999):
$y = (1, 4, 14, 10, 1, 0)$
Assume all susceptibles got infected:
$x_0 = 4 + 14 + 10 + 1 = 29$


Example
######################################################################
# Likelihood function for the Reed-Frost model
#
# Parameters:
# w.logit - logit(w) to have unrestricted parameter space
# x - vector containing the number of susceptibles at each time
# y - vector containing the number of infectious at each time
#
######################################################################

l <- function(w.logit,x,y) {
  if (length(x) != length(y)) { stop("x and y need to be the same length") }
  K <- length(x)
  w <- plogis(w.logit)
  p <- 1 - (1-w)^y
  return(sum(dbinom( y[-1], size=x[-K], prob=p[-K],log=TRUE)))
}

# Epidemic D in Table 4.1 of Daley and Gani (1999), assuming all susceptibles got infected
y <- c(1, 4, 14, 10, 1, 0)
x <- numeric(length(y))
x[1] <- sum(y[-1])
x[2:length(x)] <- x[1]-cumsum(y[2:length(y)])

mle <- optim(par=0,fn=l,method="BFGS",x=x,y=y,control=list(fnscale=-1),hessian=TRUE)


# Maximum likelihood estimator
(w.hat <- plogis(mle$par))
## [1] 0.1700922
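The 95% confidence interval mentioned earlier could, for instance, be obtained as a Wald interval on the logit scale from the Hessian returned by `optim`. The sketch below repeats the data and log-likelihood so that it runs on its own; the choice of a Wald interval and the names `loglik`, `se` and `w.ci` are assumptions, not necessarily what the lecture used.

```r
# Measles data and susceptibles as in the example, assuming x0 = 29
y <- c(1, 4, 14, 10, 1, 0)
x <- sum(y[-1]) - cumsum(c(0, y[-1]))   # 29, 25, 11, 1, 0, 0

# Reed-Frost log-likelihood on the logit scale
loglik <- function(w.logit) {
  K <- length(y)
  p <- 1 - (1 - plogis(w.logit))^y
  sum(dbinom(y[-1], size = x[-K], prob = p[-K], log = TRUE))
}

fit <- optim(par = 0, fn = loglik, method = "BFGS",
             control = list(fnscale = -1), hessian = TRUE)

# Standard error from the observed information (negative Hessian)
se <- sqrt(-1 / fit$hessian[1, 1])
# Wald interval on the logit scale, transformed back to the scale of w
w.hat <- plogis(fit$par)
w.ci <- plogis(fit$par + c(-1, 1) * qnorm(0.975) * se)
```

Building the interval on the logit scale and back-transforming keeps both limits inside (0, 1).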


Inference for x0

Maximize the log-likelihood for $x_0 = 29, 30, 31, \dots$

[Figure: profile log-likelihood (y-axis, about −21 to −17) against $x_0$ (x-axis, 30 to 45)]
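A profile log-likelihood of this kind can be sketched by maximizing over $w$ separately for each candidate $x_0$. The code below is one way to do this under the Reed-Frost likelihood from the previous slides; `profile.ll`, the grid `29:45` and the search interval for logit($w$) are assumptions made for the illustration.

```r
y <- c(1, 4, 14, 10, 1, 0)

# Maximized log-likelihood over w for a given initial number of susceptibles x0
profile.ll <- function(x0) {
  x <- x0 - cumsum(c(0, y[-1]))   # susceptibles at each generation
  K <- length(y)
  ll <- function(w.logit) {
    p <- 1 - (1 - plogis(w.logit))^y
    sum(dbinom(y[-1], size = x[-K], prob = p[-K], log = TRUE))
  }
  # One-dimensional maximization on the logit scale (interval is an assumption)
  optimize(ll, interval = c(-8, 2), maximum = TRUE)$objective
}

x0.grid <- 29:45
pll <- sapply(x0.grid, profile.ll)
# plot(x0.grid, pll, type = "b", xlab = "x0", ylab = "profile log likelihood")
```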


The SIR model

When the population considered is large, it can be sufficient
to disregard the stochasticity of the epidemic process and use
deterministic models.
Can formulate a continuous-time deterministic SIR model by
using ordinary differential equations (ODEs).
The deterministic system intends to model the mean
behaviour of the underlying stochastic system.
We assume a closed population (i.e. no demographic
turnover) of size N.


Example: CSFV in The Netherlands (1)

Classical swine fever virus (CSFV) is a highly contagious
disease of pigs and wild boar.
Characteristics of the disease are
Symptoms after infection: dullness and anorexia.
Acute form: rapid mortality often without clinical symptoms.
Secondary symptoms: diarrhea or respiratory problems.
A huge outbreak in the Netherlands took place between
February 1997 and May 1998.
429 infected herds detected and stamped out (∼ 700,000 pigs)
1286 herds pre-emptively slaughtered (∼ 1.1 million pigs)
Note: Netherlands has approximately 21,500 pig herds


Example: CSFV in the Netherlands (2)


Stegeman et al. (1999) provide estimates on the weekly
number of infectious herds from contact tracing and
serological analysis:
[Figure: number of infectious herds (y-axis, 0 to 60) against time in weeks after first infection (x-axis, 0 to 60)]


SIR differential equation system


As before, divide population into three groups
Susceptible,
Infectious or
Recovered / Removed
At all times $S(t) + I(t) + R(t) = N$, so $S(0) + I(0) = N$.
Describe dynamics using an ordinary differential equation
system

$$\frac{dS(t)}{dt} = -\beta S(t) I(t)$$

$$\frac{dI(t)}{dt} = \beta S(t) I(t) - \gamma I(t)$$

$$\frac{dR(t)}{dt} = \gamma I(t)$$

where $\beta, \gamma > 0$.
Solve ODE with initial condition $(S(0), I(0), 0)$ using
numerical routines (R package deSolve).

Example
Number of infected $I(t)$ for $\gamma = 0.3$, $N = 2 \times 10^4$, $I(0) = 1$
and different values of $\beta$.

[Figure: $I(t)$ (y-axis, 0 to 20000) against time (x-axis, 0 to 50) for β = 1.5e−04, β = 4.5e−05 and β = 3.0e−05]


Numerical Solution of the SIR ODE


Defining the vector of derivatives for the SIR ODE
##############################################################################
# Function to compute the derivative of the ODE system
#
# t - time
# y - current state vector of the ODE at time t
# parms - Parameter vector used by the ODE system
#
# Returns:
# list containing dS(t)/dt and dI(t)/dt
##############################################################################

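sir <- function(t,y, parms) {
  beta <- parms[1]
  gamma <- parms[2]
  S <- y[1]
  I <- y[2]
  return(list(c(S=-beta*S*I,I=beta*S*I-gamma*I)))
}

Use deSolve::lsoda
library(deSolve)
# N, times, beta.grid and gamma are set up elsewhere in the lecture code (not shown on the slide)
sim <- lsoda(y=c(N-1,1), times=times, func=sir,parms=c(beta.grid[1],gamma))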
head(sim, n=3)
## time 1 2
## [1,] 0.00000000 19999.00 1.000000
## [2,] 0.05005005 19998.84 1.144682
## [3,] 0.10010010 19998.66 1.310297


Aside: Numerical Solution of ODEs (1)

Simple method to solve an ODE system numerically given the
initial condition: Euler method
Example in R: step width h and initial values S(0) = N − 1
and I(0) = 1
# Step width of the Euler method
h <- 0.1
y <- matrix(NA, nrow=ceiling(20/h), ncol=3, dimnames=list(NULL, c("t","S","I")))
# Initial value
y[1,] <- c(0,N-1,1)
# Euler update: y(t + h) = y(t) + h * dy/dt
for (i in 2:nrow(y)) {
  y[i,] <- c(y[i-1,"t"]+h, y[i-1,c("S","I")] +
             h * sir(y[i-1,"t"], y[i-1,c("S","I")], parms=c(beta.grid[1],gamma))[[1]])
}
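The accuracy of the Euler method depends on the step width. The following self-contained sketch compares two step widths for the SIR system; the helper `euler`, the end time 20 and the parameter values ($N = 2 \times 10^4$, $\beta = 1.5 \times 10^{-4}$, $\gamma = 0.3$, taken from the earlier example slide) are assumptions about the setup rather than the lecture's actual script.

```r
# Derivative of the SIR system (same structure as the sir function above)
sir <- function(t, y, parms) {
  beta <- parms[1]; gamma <- parms[2]
  S <- y[[1]]; I <- y[[2]]
  list(c(S = -beta * S * I, I = beta * S * I - gamma * I))
}

# Euler method: returns the state (S, I) at time t.end for step width h
euler <- function(h, t.end, y0, parms) {
  y <- y0
  for (i in seq_len(round(t.end / h))) {
    y <- y + h * sir((i - 1) * h, y, parms)[[1]]
  }
  y
}

N <- 2e4
parms <- c(1.5e-4, 0.3)          # beta, gamma (assumed values)
y0 <- c(S = N - 1, I = 1)

coarse <- euler(h = 0.1,  t.end = 20, y0 = y0, parms = parms)
fine   <- euler(h = 0.01, t.end = 20, y0 = y0, parms = parms)
rbind(coarse, fine)              # state at t = 20 for both step widths
```

The difference between the two rows gives a rough impression of the discretization error at h = 0.1; adaptive solvers such as `lsoda` control this error automatically.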


Aside: Numerical Solution of ODEs (2)

Plot of the Euler solution for the SIR system

[Figure: $I(t)$ (y-axis, 0 to 12000) against time (x-axis, 0 to 20) for β = 0.00015, comparing the Euler method (h = 0.1) with lsoda]


Example cont.

Number of susceptibles $S(t)$ for $\gamma = 0.3$, $N = 2 \times 10^4$,
$I(0) = 1$ and different values of $\beta$.

[Figure: $S(t)$ (y-axis, 0 to 20000) against time (x-axis, 0 to 50) for β = 1.5e−04, β = 4.5e−05 and β = 3.0e−05]


Estimating parameters (1) – Gaussian observations


We have $k$ observations $y_i = (S(t_i), I(t_i))'$ at times $t_1, \dots, t_k$
with mean $E(y_i; \theta)$, determined by the SIR ODE.
The least squares estimate of $\theta = (\beta, \gamma)'$ minimizes the function

$$l(\theta) = \sum_{i=1}^{k} \| y_i - E(y_i; \theta) \|^2.$$

The solution $\hat{\theta}$ is found using numerical optimization routines.
Often only $I(t)$ is available, but not $S(t)$. Then least squares
corresponds to MLE for Gaussian observations with

$$I(t_i) \sim N(E(I(t_i); \theta), \sigma^2),$$

where $\sigma^2$ is the variance of the observation noise (kept fixed).
A square-root transform of $I(t_i)$ and $E(I(t_i); \theta)$ might be useful.

Estimating parameters (3) – MLE for CSFV Data


Define the log-likelihood function
######################################################################
# Gaussian log-likelihood (equivalent to a least-squares fit)
######################################################################

ll.gauss <- function(theta, take.sqrt=FALSE) {
  # Solve ODE using the parameter vector theta = log(c(beta, gamma))
  res <- lsoda(y=c(N-1,1), times=csfv$t, func=sir, parms=exp(theta))
  # Gaussian log-likelihood with sd = 1, on the raw or the square-root scale
  if (!take.sqrt) {
    return(sum(dnorm(csfv$I,mean=res[,3],sd=1,log=TRUE)))
  } else {
    return(sum(dnorm(sqrt(csfv$I),mean=sqrt(abs(res[,3])),sd=1,log=TRUE)))
  }
}

Maximize the log-likelihood using optim and compute estimates
# Determine MLE
N <- 21500
mle <- optim(log(c(0.00002,3)), fn=ll.gauss,control=list(fnscale=-1))

# Show estimates and resulting R0 estimate
beta.hat <- exp(mle$par)[1]
gamma.hat <- exp(mle$par)[2]
R0.hat <- beta.hat*N/gamma.hat


Estimating parameters (3) – MLE for CSFV Data

Plug-in of the MLE to find solution of the ODE


mu <- lsoda(y=c(N-1,1), times=csfv$t, func=sir,parms=exp(mle$par))
head(mu, n=3)
## time 1 2
## [1,] 1 21499.00 1.000000
## [2,] 2 21495.42 1.313401
## [3,] 3 21490.71 1.723989


Estimating parameters (3) – MLE for CSFV Data


Example: SIR model fitted to CSFV curve by Gaussian
likelihood

[Figure: number of infectious herds (y-axis, 0 to 80) against time in weeks after first infection (x-axis, 0 to 60); curves: CSFV outbreak, LS fit, LS-sqrt fit]

The MLEs are $\hat{\beta} = 0.00015$ (0.00014 for LS-sqrt), $\hat{\gamma} = 2.85$
(2.65) and $\hat{R}_0 = 1.10$ (1.10).


Estimating parameters (4) – Poisson observations


Assuming Gaussian observations ignores the fact that we
actually observe count data. For small counts this may
become problematic.
An alternative is to use a count data distribution, e.g.

$$y_i \sim \mathrm{Po}(I(t_i)).$$

As a consequence the log-likelihood is

$$\log L(\theta) = \sum_{i=1}^{k} \left[ y_i \log(I(t_i)) - I(t_i) \right] + \text{const}.$$

Since for the Poisson distribution $E(y_i) = \mathrm{Var}(y_i)$, it might be
necessary to address additional over-dispersion in the data
using, e.g., a negative binomial distribution.
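The Poisson log-likelihood above can be checked numerically against `dpois`. In the sketch below the counts `y` and model means `mu` are made-up numbers used only for illustration (in the application, `mu` would be the ODE solution $I(t_i)$), and `ll.pois` is a hypothetical helper name.

```r
# Poisson log-likelihood of counts y given model means mu
ll.pois <- function(y, mu) sum(dpois(y, lambda = mu, log = TRUE))

# Made-up counts and means, purely for illustration
y  <- c(1, 3, 7, 12, 9, 4)
mu <- c(1.2, 3.5, 6.8, 10.9, 9.4, 4.1)

# Same quantity written as on the slide; the constant is -sum(log(y!))
kernel <- sum(y * log(mu) - mu)
const  <- -sum(lgamma(y + 1))
c(dpois = ll.pois(y, mu), formula = kernel + const)
```

Because the constant does not depend on $\theta$, maximizing `kernel` and maximizing `ll.pois` give the same estimate.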

Estimating parameters (5) – MLE for CSFV Data

Example: SIR model fitted to CSFV curve by Poisson
likelihood

[Figure: number of infectious herds (y-axis, 0 to 80) against time in weeks after first infection (x-axis, 0 to 60); curves: CSFV outbreak, LS fit, LS-sqrt fit, Poisson fit]

The MLEs are $\hat{\beta} = 0.00013$, $\hat{\gamma} = 2.61$ and hence $\hat{R}_0 = 1.10$.


Exercise

Exercise 2
Read the blog post “Flatten the COVID-19 curve” and experiment
with different containment strategies using the Shiny App. Discuss
the pros and cons of different strategies. Discuss limitations of the
model when used to evaluate strategies.

As background information you might want to read the blog post
“Coronavirus: The Hammer and the Dance” by Tomas Pueyo.


Stochastic SIR model in continuous time (1)

If the population under study is large enough, deterministic
approximations are reasonably valid to obtain an
understanding of the disease.
In small populations, however, stochasticity plays an
important role for extinction, which cannot be ignored.
Stochastic epidemic modeling is described in Becker (1989),
Daley and Gani (1999) and Andersson and Britton (2000),
who all rely heavily on the theory of stochastic processes.


Stochastic SIR model in continuous time (2)


The stochastic SIR model can be described as a
continuous-time Markov process, where the event rates for
infection and removal are:

Event        Transition                                      Rate
Infection:   $(S(t), I(t)) \to (S(t) - 1, I(t) + 1)$         $\beta \cdot S(t) \cdot I(t)$
Removal:     $(S(t), I(t)) \to (S(t), I(t) - 1)$             $\gamma \cdot I(t)$

Again, $R(t)$ is implicitly given, because a fixed population of
size $S(0) + I(0)$ is assumed.
The integer size of the population is now taken into account:
Once $I(t) = 0$, the epidemic ceases.
Point process viewpoint: piecewise constant rates, while the
length of the infective period is exponentially distributed.
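Since the two events compete with exponential waiting times, the next event is an infection with probability $\beta S I / (\beta S I + \gamma I) = \beta S / (\beta S + \gamma)$. If only the final size is of interest, the event times can therefore be skipped and the embedded jump chain simulated directly. The sketch below does this; `final.size` is a hypothetical helper, and the parameter values are those of the simulation example in this lecture.

```r
# Final size of one stochastic SIR epidemic via the embedded jump chain
final.size <- function(beta, gamma, n, m) {
  x <- n   # susceptibles
  y <- m   # infectives
  while (y > 0) {
    # Next event is an infection with probability beta*x / (beta*x + gamma)
    if (runif(1) < beta * x / (beta * x + gamma)) {
      x <- x - 1; y <- y + 1
    } else {
      y <- y - 1
    }
  }
  n - x   # number of initially susceptible individuals that got infected
}

set.seed(1)
z <- replicate(1000, final.size(beta = 0.01, gamma = 0.5, n = 100, m = 1))
mean(z == 0)   # share of epidemics that die out without any new infection
```

For a single initial infective, the probability of no further infection is $\gamma / (\beta n + \gamma)$, which is $1/3$ for these values and can be compared with `mean(z == 0)`.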


Stochastic SIR model in continuous time (3)


10 SIR simulations with $S(0) = 100$, $I(0) = 1$, $\beta = 0.01$ and
$\gamma = 0.5$:

[Figure: number of susceptibles (y-axis, 0 to 100) against time (x-axis, 0 to 20) for the 10 simulated trajectories]


R Code for Simulation of the Stochastic SIR Model


rSIR <- function(T, beta, gamma, n, m) {
  # Initialize (x = number of susceptibles, y = number of infectives)
  t <- 0
  x <- n
  y <- m

  # Possible events
  eventLevels <- c("S->I","I->R")
  # Initialize result
  events <- data.frame(t=t,x=x,y=y,event=NA)
  # Loop until we are past time T or the epidemic has ceased
  while (t < T && y > 0) {
    # Draw the waiting time for each possible event
    wait <- rexp(2,c("S->I"=beta*x*y,"I->R"=gamma*y))
    # Which event occurs first
    i <- which.min(wait)
    # Advance time
    t <- t+wait[[i]]
    # Update population according to the event type
    if (eventLevels[i]=="S->I") { x <- x-1 ; y <- y+1 }
    if (eventLevels[i]=="I->R") { y <- y-1 }
    # Store result
    events <- rbind(events,c(t,x,y,i))
  }
  # Recode event type and return
  events$event <- factor(eventLevels[events$event], levels=eventLevels)
  return(events)
}


Stochastic SIR vs. deterministic SIR model

Same parameters as previously: 250 trajectories vs. the ODE
solution:

[Figure: number of infectives (y-axis, 0 to 30) against time (x-axis, 0 to 35)]


Exercise

Exercise 3
Set β = 0.01 and γ = 0.5 and S(0) = 100 and I (0) = 1. Simulate
1000 instances of the final size of the epidemic using rSIR and
make a histogram of the result.


Likelihood inference∗ (1)


Assume that the epidemic process is completely observed over
the interval $(0, \tau]$, where $\tau$ is the duration of the epidemic.
Denote the successive times of the $k$ infectious contacts by
$T_1, \dots, T_k$.
Denote the PDF of the duration of the infectious period by
$f_Y(y)$, e.g. exponentially distributed durations:
$f_Y(y) = \gamma \exp(-\gamma y)$.
Likelihood of the data $\{(t_i, y_i), i = 1, \dots, k\}$ is

$$L = \left[ \prod_{i=1}^{k} f_Y(y_i) \right] \left[ \prod_{i=1}^{k} \lambda(t_i) \right] \exp\left( - \int_0^{\tau} \lambda(u)\, du \right),$$

where $\lambda(t) = \beta \cdot I(t^-) \cdot S(t^-)$ is the conditional intensity
function (CIF) and $t^-$ denotes the time just prior to $t$.


Likelihood inference∗ (2)

A complication of the presented equations is that the CIF has
to be integrated over time. However, for the SIR model the
CIF is a piecewise constant function, so the integration is
tractable.
A binomial approximation exists for time series data, where
$C(t)$ denotes the number of new cases in the interval
$(t, t + 1]$ (Becker, 1989):
The conditional probability of a given susceptible escaping
infection during the interval $(t, t + 1]$ is approximately
$\pi_t = \exp\{-\lambda(t)\}$.
We then have

$$C(t) \sim \mathrm{Bin}(S(t), 1 - \pi_t)$$


Likelihood inference∗ (3) – GLM’s

For the SIR model, $\lambda(t) = \beta \cdot I(t)$ and binomial regression
with log link is applicable.
If $\lambda(t)$ can be assumed to be small, we have

$$1 - \pi_t = 1 - \exp\{-\lambda(t)\} \approx \lambda(t), \text{ so}$$

$$C(t) \sim \mathrm{Bin}(S(t), \lambda(t)) \approx \mathrm{Poisson}(S(t) \cdot \lambda(t))$$

For the linear formulation $\lambda(t) = \beta I(t)$, a Poisson regression
with identity link can be used, with explanatory variables
$(S(t))'$.
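To show the mechanics of such a Poisson regression, the sketch below fits $E[C(t)] = \beta \cdot S(t) \cdot I(t)$ with `glm` and an identity link, using the product $S(t) I(t)$ as the single covariate without intercept. Reusing the measles generation counts from earlier in the lecture is an assumption made purely for illustration: $\lambda(t)$ is not small for these data, so the Poisson approximation is crude here. The data frame `d` and the start value are also choices made for the example.

```r
# Generation counts of the measles epidemic quoted earlier in the lecture
y <- c(1, 4, 14, 10, 1, 0)
x <- sum(y[-1]) - cumsum(c(0, y[-1]))   # susceptibles, assuming x0 = 29

# New cases, susceptibles and infectives per time step
d <- data.frame(C = y[-1], S = x[-length(x)], I = y[-length(y)])
d <- d[d$S > 0, ]        # steps without susceptibles carry no information
d$SI <- d$S * d$I

# Poisson regression with identity link and no intercept: E[C] = beta * S * I
fit <- glm(C ~ 0 + SI, family = poisson(link = "identity"),
           data = d, start = 0.1)
beta.hat <- unname(coef(fit))
```

For this one-parameter model the MLE has the closed form `sum(d$C) / sum(d$SI)`, which the fitted coefficient should reproduce.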


Literature I

Andersson, H. and T. Britton (2000). Stochastic Epidemic Models
and their Statistical Analysis. Vol. 151. Springer Lecture Notes
in Statistics. Springer-Verlag.
Becker, N. G. (1989). Analysis of Infectious Disease Data.
Chapman & Hall/CRC.
Daley, D. J. and J. Gani (1999). Epidemic Modelling: An
Introduction. Cambridge University Press.
Stegeman, A., A. R. W. Elbers, J. Smak, and M. C. M. de Jong
(1999). “Quantification of the transmission of classical swine
fever virus between herds during the 1997–1998 epidemic in The
Netherlands”. In: Preventive Veterinary Medicine 42,
pp. 219–234.
