Computer Simulation Models


Big Data and Machine Learning in Healthcare Applications

Oleg S. Pianykh, PhD


Medical Analytics Group
Department of Radiology, Massachusetts General Hospital
Harvard Medical School

Simulation Models in Healthcare.

Process Mining.

Oleg Pianykh [email protected]

What is computer simulation?


Pandemic problem, version 1


• Consider the following:
• We have 12,000 people in a small town
• Every day, 100 people get infected
• How long will it take for all to get infected?

• Solution: 12,000/100 = 120 days

• Unrealistically simple! So, let’s get a bit more realistic…


Pandemic problem, version 2
• Consider the following:
• We have 12,000 people in a small town
• One person gets infected first.
• Each day, each infected person infects one healthy (susceptible) person
• After 4 days, an infected person recovers and cannot infect (or be infected)
again
• How long will the infection last?

• Solution: not so trivial!
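
The rules of version 2 can be stepped through day by day. The sketch below is one possible reading of the slide, not a definitive model: it assumes each infected person infects on four consecutive days and then recovers, and that new infections are capped by the number of people still susceptible. The function name `simulate_outbreak` and the cohort bookkeeping are illustrative choices.

```python
from collections import deque


def simulate_outbreak(population=12_000, infectious_days=4):
    """Day-by-day simulation of the 'version 2' pandemic rules (assumed reading)."""
    susceptible = population - 1
    # cohorts[k] = number of people who have already been infectious for k days
    cohorts = deque([1] + [0] * (infectious_days - 1))
    recovered = 0
    day = 0
    history = []  # one (day, susceptible, infected, recovered) tuple per day
    while sum(cohorts) > 0:
        day += 1
        infected = sum(cohorts)
        # each infected person infects one susceptible person, if any are left
        new_cases = min(infected, susceptible)
        susceptible -= new_cases
        # the oldest cohort has finished its infectious days and recovers
        recovered += cohorts.pop()
        cohorts.appendleft(new_cases)
        history.append((day, susceptible, sum(cohorts), recovered))
    return history


history = simulate_outbreak()
last_day, s_left, i_left, r_total = history[-1]
print(f"Infection ends on day {last_day}; {r_total} recovered, {s_left} never infected")
```

Changing `infectious_days` or `population` shows how sensitive the duration is to the assumed rules.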

Pandemic problem, version 2

[Figure: the town population on Day 1, Day 2, Day 3 and Day 4, with each person marked as Susceptible, Infected or Recovered. How long will the infection last?]


Pandemic problem, version 3: Real

What we observe:

Date         I      R     D
2/19/2020    3      0     0
2/20/2020    3      0     0
2/21/2020    20     0     1
2/22/2020    62     1     2
2/23/2020    155    2     3
2/24/2020    229    1     7
2/25/2020    322    1     10
2/26/2020    453    3     12
2/27/2020    655    45    17
2/28/2020    888    46    21
2/29/2020    1128   46    29
3/1/2020     1694   83    34
3/2/2020     2036   149   52
3/3/2020     2502   160   79
3/4/2020     3089   276   107
3/5/2020     3858   414   148
3/6/2020     4636   523   197
3/7/2020     5883   589   233
3/8/2020     7375   622   366

What we need to know:
• What is the incubation time?
• What is the infection rate?
• Should we have a lockdown? And for how long?
• How many do we need to vaccinate to stop the pandemic?
• How long will it last?

Computer simulations
• In many cases we:
• Do not know the formula for computing the exact outcome,
but
• Do know the process producing this outcome

• So we can literally model (simulate) how the process behaves by computing each of its steps:
• Simulating pandemic, day by day
• Simulating blood flow, millimeter by millimeter
• Simulating patient flow in emergency room, hour by hour


Study: Pandemic Model (SIR)
• Can we model the spread of infection?

[Figure: SIR model, showing the population split into Susceptible, Infected and Recovered groups]


SIR Model: S → I → R
• SIR is a popular epidemic model that splits a fixed population into 3 separate groups:
• S(t) “susceptible”: not yet infected at time t
• I(t) “infected”: infected and capable of spreading the infection
• R(t) “removed”: recovered with immunity, immunized, or dead; cannot infect or be infected again
• S, I and R are usually measured as fractions of the entire population: 0 ≤ S, I, R ≤ 1, with S + I + R = 1

Sources: http://en.wikipedia.org/wiki/Epidemic_model, http://en.wikipedia.org/wiki/Mathematical_modelling_of_infectious_disease

SIR equations: S(n) = S(t_n)

• SIR equations define how the (S, I, R) fractions change in time: day number (n+1) from day number n

• Let’s start with Susceptible S(n): they get infected.

$$S(n+1) = S(n) - I(n) \times \beta \times S(n)$$

• Each infected person infects a β-fraction of the currently susceptible…
• … so we subtract the newly infected from S(n) to get the updated S(n+1). Note that S(n) decreases.


SIR equations: R(n) = R(t_n)
• Recovered R(n):

$$R(n+1) = R(n) + I(n) \times \gamma$$

• A γ-fraction of the infected become recovered on each day. Note that R(n) increases.

SIR equations: I(n) = I(t_n) and the final system

$$S(n+1) = S(n) - I(n) \times \beta \times S(n)$$

$$I(n+1) = I(n) + \; ?$$

(What do we put here?)

$$R(n+1) = R(n) + I(n) \times \gamma$$

We want to complete the right-hand side of the I equation to make sure the total population size does not change from day n to day (n+1):

$$S(n+1) + I(n+1) + R(n+1) = S(n) + I(n) + R(n)$$


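SIR equations: I(n) = I(t_n) and the final system

$$S(n+1) = S(n) - I(n) \times \beta \times S(n)$$

$$I(n+1) = I(n) + I(n) \times \beta \times S(n) - I(n) \times \gamma$$

$$R(n+1) = R(n) + I(n) \times \gamma$$

The total population size does not change between day (n+1), on the left, and day n, on the right:

$$S(n+1) + I(n+1) + R(n+1) = S(n) + I(n) + R(n)$$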
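
As a quick check of the conservation property, the three update equations can be summed term by term. This is a worked verification using only the symbols already defined on the slides:

$$
\begin{aligned}
S(n+1) + I(n+1) + R(n+1)
&= \bigl[S(n) - I(n)\,\beta\,S(n)\bigr] \\
&\quad + \bigl[I(n) + I(n)\,\beta\,S(n) - I(n)\,\gamma\bigr] \\
&\quad + \bigl[R(n) + I(n)\,\gamma\bigr] \\
&= S(n) + I(n) + R(n)
\end{aligned}
$$

The newly infected term leaves S and enters I, and the recovery term leaves I and enters R, so both cancel in the sum.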

Example: β = 0.5, γ = 0.05

System Dynamics approach: continuously evolving population “mass”

[Figure: curves S(t), I(t) and R(t) over time]


Example: β = 0.5, γ = 0.05

[Figure: curves S(t), I(t) and R(t) over time, with the following annotations]
• Susceptible S(t): decreasing
• Recovered R(t): increasing
• Infected I(t): exponential growth at first, but slowing down as we get fewer to infect
• Peak: 70% infected after 22 days
• Pandemic stops when all are immune: S=0

SIR code

SIR model goes here
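
The original code was not preserved in this copy of the slides. As a stand-in, here is a minimal sketch that iterates the three SIR update equations exactly as written above. The initial infected fraction `i0 = 0.001` and the 120-day horizon are assumptions, not values from the slides, and the function name `simulate_sir` is illustrative.

```python
def simulate_sir(beta, gamma, i0=0.001, days=120):
    """Iterate the discrete SIR equations; S, I, R are population fractions."""
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = i * beta * s   # I(n) x beta x S(n)
        new_recoveries = i * gamma      # I(n) x gamma
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history


history = simulate_sir(beta=0.5, gamma=0.05)
peak_day, (_, peak_infected, _) = max(enumerate(history), key=lambda item: item[1][1])
print(f"Peak infected fraction {peak_infected:.2f} on day {peak_day}")
```

With β = 0.5 and γ = 0.05 the peak comes out near the 70% mark quoted on the slide; the exact day depends on the assumed `i0`.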



SIR equations with vaccination

$$S(n+1) = S(n) - I(n) \times \beta \times S(n) - k_{\text{vacc}} \times S(n)$$

$$I(n+1) = I(n) + I(n) \times \beta \times S(n) - I(n) \times \gamma$$

$$R(n+1) = R(n) + I(n) \times \gamma + k_{\text{vacc}} \times S(n)$$

Coefficient k_vacc defines the fraction of the Susceptible population that we vaccinate.
As a result, we exclude the vaccinated from Susceptible, and add them into Recovered.
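
The vaccination term can be added to a day-by-day SIR loop in a few lines. The sketch below is illustrative only: `simulate_sir_vacc` is a hypothetical name, and the initial infected fraction `i0 = 0.001` and the 120-day horizon are assumptions. It compares the infection peak with and without vaccination.

```python
def simulate_sir_vacc(beta, gamma, k_vacc, i0=0.001, days=120):
    """Discrete SIR with a fraction k_vacc of Susceptible vaccinated each day."""
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = i * beta * s
        new_recoveries = i * gamma
        vaccinated = k_vacc * s          # moved from Susceptible to Recovered
        s = s - new_infections - vaccinated
        i = i + new_infections - new_recoveries
        r = r + new_recoveries + vaccinated
        history.append((s, i, r))
    return history


peak_without = max(i for _, i, _ in simulate_sir_vacc(0.5, 0.05, k_vacc=0.0))
peak_with = max(i for _, i, _ in simulate_sir_vacc(0.5, 0.05, k_vacc=0.05))
print(f"Peak infected: {peak_without:.2f} without vaccination, {peak_with:.2f} with")
```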


Impact of vaccination

[Figure: SIR curves with vaccination, k_vacc = 0.05]


Adding more parameters
• To make the SIR model more realistic, one needs to consider a few more parameters, for instance:
• Number of initially infected, I0 = I(0).
• The day when infection started, t0. This may be days before the first cases were observed and reported.
• The “infection count” factor kI > 1: how many people are really infected compared to the reported cases. (Really infected) = kI × (Reported infected I). It is not uncommon for kI to be in the 5-10 range.
• Incubation period duration dT, when infected people cannot infect yet (SEIR model, with E standing for “exposed”).
• Many, many more...
• And, by the way, some of these parameters may change in time!

Adding more parameters/effects

• γ(t) = (1/3) × cos²(t/7) (daily “seasonal” pattern in recovery from infection)

[Figure: simulated curves with time-varying γ, annotated “More infected!” and “Second outbreak!”]
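
A time-varying recovery rate only requires recomputing γ inside the daily loop. The sketch below uses the γ(t) formula from the slide; β = 0.5, the initial infected fraction and the 200-day horizon are assumptions, so the resulting curve need not match the figure. The function name is illustrative.

```python
import math


def simulate_sir_seasonal(beta=0.5, i0=0.001, days=200):
    """Discrete SIR where gamma follows (1/3) * cos^2(t / 7)."""
    s, i = 1.0 - i0, i0
    infected = [i]
    for n in range(days):
        gamma = (1.0 / 3.0) * math.cos(n / 7.0) ** 2
        new_infections = i * beta * s
        new_recoveries = i * gamma
        s = s - new_infections
        i = i + new_infections - new_recoveries
        infected.append(i)
    return infected


infected = simulate_sir_seasonal()
# Count local maxima of I(n) as a rough indicator of repeated outbreaks
peaks = [n for n in range(1, len(infected) - 1)
         if infected[n - 1] < infected[n] > infected[n + 1]]
print(f"Local peaks of I(n) on days: {peaks}")
```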



Simulation + Machine Learning ?
• Q: OK, we talked about SIR models, but all they used were a few coefficients (β, γ). What does this have to do with data analysis/models?

• Q: How would you incorporate big, more realistic data into SIR models?

Simulation + Machine Learning !


• Simulation models can be viewed as machine learning models with unknown coefficients that we can learn by fitting the observed data

• Example: Learn the best coefficients (β, γ) in SIR to approximate the observed I(n) trend as closely as possible.

• Even better: Learn all SIR parameters from the observed infected counts. Use these values for decision making (lockdowns, resource allocations, …)

[Figure: learning model parameters by fitting the actual data]
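
One simple way to "learn" (β, γ) is to try candidate pairs and keep the one whose simulated I(n) is closest to the observations. The sketch below uses a plain grid search with a squared-error loss; this is a stand-in chosen for clarity, not necessarily the method behind the slide. The "observed" series is synthetic, generated from known coefficients so the fit can be checked, and all function names are illustrative.

```python
def simulate_infected(beta, gamma, i0=0.001, days=30):
    """Return the infected fraction I(n) for n = 0..days."""
    s, i = 1.0 - i0, i0
    infected = [i]
    for _ in range(days):
        new_infections = i * beta * s
        s = s - new_infections
        i = i + new_infections - i * gamma
        infected.append(i)
    return infected


def fit_sir(observed, betas, gammas):
    """Grid search: pick the (beta, gamma) pair with the smallest squared error."""
    def squared_error(params):
        beta, gamma = params
        model = simulate_infected(beta, gamma, days=len(observed) - 1)
        return sum((m - o) ** 2 for m, o in zip(model, observed))

    candidates = [(b, g) for b in betas for g in gammas]
    return min(candidates, key=squared_error)


# Synthetic observations (NOT real data), generated with beta=0.5, gamma=0.05
observed = simulate_infected(beta=0.5, gamma=0.05)
betas = [k / 10 for k in range(1, 11)]     # 0.1 .. 1.0
gammas = [k / 100 for k in range(1, 11)]   # 0.01 .. 0.10
best_beta, best_gamma = fit_sir(observed, betas, gammas)
print(f"Fitted beta={best_beta}, gamma={best_gamma}")
```

With real reported counts one would also have to fit quantities such as I0 and the reporting factor, and a gradient-based optimizer would usually replace the grid.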



Example: Solving SIR for parameters
[Figure: β parameter as a function of time, β(t), with the lockdown period marked]

$$S(n+1) = S(n) - I(n) \times \beta \times S(n)$$

• Monitor the solution β(t) to see if a new lockdown will be needed
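
Solving the S equation for β at each step gives β(n) = (S(n) − S(n+1)) / (I(n) × S(n)). The sketch below applies this to a synthetic series in which β drops from 0.5 to 0.1 on a hypothetical lockdown day; all values and function names are assumptions for illustration, not data from the slides.

```python
def simulate_with_lockdown(days=20, lockdown_day=10, gamma=0.05, i0=0.001):
    """Synthetic S(n), I(n) where beta drops from 0.5 to 0.1 at lockdown_day."""
    s, i = 1.0 - i0, i0
    s_series, i_series = [s], [i]
    for n in range(days):
        beta = 0.5 if n < lockdown_day else 0.1
        new_infections = i * beta * s
        s, i = s - new_infections, i + new_infections - i * gamma
        s_series.append(s)
        i_series.append(i)
    return s_series, i_series


def estimate_beta(s_series, i_series):
    """Invert S(n+1) = S(n) - I(n)*beta*S(n) for beta at every step n."""
    return [
        (s_now - s_next) / (i_now * s_now)
        for s_now, s_next, i_now in zip(s_series, s_series[1:], i_series)
    ]


s_series, i_series = simulate_with_lockdown()
beta_t = estimate_beta(s_series, i_series)
print([round(b, 3) for b in beta_t])
```

On real data S and I are noisy and under-reported, so the recovered β(t) would need smoothing before it could guide decisions.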


More SIR examples


• Pandemic Spanish flu of 1918:
• http://resources.modelling4all.org/spanish-flu/teacher-guide-to-spanish-flu-simulation
• Pandemic simulations
• http://www.youtube.com/watch?v=mm2u9RKwgsY
• http://www.youtube.com/watch?v=ZPwq_6pLxfw


Can we simulate even more complex systems?
How about:
• Individual patient properties: age, sex, weight/BMI, comorbidities, …
• Population density?
• Randomness (“incubation period between 6 and 14 days”)?
• Population mobility (travel)?
• Hospital infrastructure?
• …

• To simulate all this, we might need to start considering each person/location/resource as an independent entity with different properties

Two modeling paradigms


• System Dynamics (SD)
• We study the system as a continuous “mass”, evolving in time.
• The evolution is described in state equations (example: SIR)

• Discrete Event Simulation (DES)
• The system is modeled as a set of discrete (individual) agents, corresponding to different system resources (people, equipment, …)
• The agents interact with each other according to their individual properties


Discrete Event Simulation (DES)

• DES: all agents (patients, staff, equipment, resources) are modeled as discrete entities, moving through different processing queues (e.g., treatment for patients)

• Closer match for most real-life workflow modeling

• Easy to model, by assigning various properties to entities (patient age, sex, diagnosis, blood group, hair color, etc.)
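
To make the idea concrete, here is a minimal discrete-event sketch of a single-physician clinic with one first-come, first-served queue. Each patient is a separate entity with its own random arrival and treatment time. The exponential distributions, the mean times and the function name are assumptions for illustration; real DES tools handle many resources and queues.

```python
import random


def simulate_clinic(n_patients=200, mean_interarrival=10.0, mean_service=8.0, seed=0):
    """Single-server FIFO queue; returns each patient's waiting time in minutes."""
    rng = random.Random(seed)
    clock = 0.0            # arrival time of the current patient
    server_free_at = 0.0   # time when the physician finishes the previous patient
    waits = []
    for _ in range(n_patients):
        clock += rng.expovariate(1.0 / mean_interarrival)    # arrival event
        start = max(clock, server_free_at)                    # wait if physician is busy
        waits.append(start - clock)
        service_time = rng.expovariate(1.0 / mean_service)    # treatment duration
        server_free_at = start + service_time                 # departure event
    return waits


waits = simulate_clinic()
print(f"Mean wait: {sum(waits) / len(waits):.1f} min, longest wait: {max(waits):.1f} min")
```

Patient properties such as age or diagnosis could be added as extra random attributes that influence the treatment duration.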


Example: Building DES models with Arena


• You can easily specify entity properties using random distributions
• You can visualize entity flow during the simulation run
• When simulation completes, you can find collected statistics in the report pages
• No coding required

[Figure: Arena model screenshot with callouts: Create entities, Delete entities, Connect entity workflow, Specify entity properties]

Example: https://www.youtube.com/watch?v=EALuWtg7RX4&t=613s&ab_channel=EdwinCharlesBalakrishnan


Simulation limitations

• Rule-based: what if the rules change? And who knows the rules anyway?
• Hard to match with real complex data: how do you know your simulation model is correct?
• Simulation systems overloaded with too many parameters make simulation modeling more of an art than a science
• “What if” questions are great, but are they what we need? We need to know the real drivers of the system behavior. But how can we discover them?

Figure source: https://www.researchgate.net/figure/A-common-joke-to-illustrate-the-fact-that-we-are-often-guided-by-the-availability-of_fig4_43135243

Find the flow: Process mining in healthcare


• Process mining reconstructs process flow from the data this process created:
• Takes input data as an event log: event type, ID, and timestamp
• Uses specific algorithms to extract the real process flowchart from the timestamps and event types found in the data
• Can filter out less probable event sequences
• Can cluster similar sequences
• Great tool for studying how the process really functions
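
A very small version of this idea is to count which event type directly follows which within each case. The sketch below builds such "directly-follows" counts from a toy event log and drops rare transitions. It is a simplified illustration, not the algorithm used by any particular process mining product; the event log, the threshold and the function names are made up.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count transitions a -> b between consecutive events of the same case."""
    traces = defaultdict(list)
    for case_id, event_type, timestamp in event_log:
        traces[case_id].append((timestamp, event_type))
    edges = Counter()
    for events in traces.values():
        events.sort()  # order each case by timestamp
        for (_, a), (_, b) in zip(events, events[1:]):
            edges[(a, b)] += 1
    return edges


def filter_rare(edges, min_count):
    """Keep only transitions seen at least min_count times."""
    return {edge: n for edge, n in edges.items() if n >= min_count}


# Toy event log: (patient ID, event type, minutes since midnight)
event_log = [
    (1, "Arrival", 0), (1, "Triage", 5), (1, "Physician", 20), (1, "Departure", 60),
    (2, "Arrival", 3), (2, "Triage", 10), (2, "Physician", 30), (2, "Care", 45),
    (2, "Departure", 90),
    (3, "Arrival", 7), (3, "Physician", 15), (3, "Departure", 40),
]
edges = directly_follows(event_log)
main_flow = filter_rare(edges, min_count=2)
print(main_flow)
```

The surviving edges form the main flowchart; the dropped ones are the less probable paths.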



Example: Emergency department patient flow

[Figure: “Theory from a textbook” (Arrival → Triage → Physician → Care → Departure) compared with “Reality from process mining”]

Advantages of process mining

• Data-driven instead of rule-based: discover the real process
• Studying process-mined charts is the best way to understand the real functionality of the system, and helps locate its bottlenecks
• Leads to feature engineering
• Can we use it to model healthcare data then?

Process mining software Disco: https://fluxicon.com/disco/


Worst case: no equations, no rules

• In some cases, we know virtually nothing about the internal system composition,
rules, or behavior. We cannot do system dynamics, we cannot do DES either, and
process mining does not reveal anything simple.
• Example: A really complex/messy emergency department, where patient flow can
probabilistically branch into numerous possibilities.

• Then we can approximate the system with a machine learning model such that:
• System parameters are used as features
• Individual system outputs are used as the target variable

• Reviewing partial dependence plots will inform us on how each feature impacts
the output, and what features can be the key drivers overall.
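
A partial dependence curve can be computed by hand: fix one feature at a grid value for every record, average the model's predictions, and repeat for each grid value. The sketch below does this for a toy stand-in model; the model formula, feature names and records are hypothetical, and in practice the model would be a trained machine learning model.

```python
def partial_dependence(model, rows, feature, grid):
    """Average model prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        predictions = [model({**row, feature: value}) for row in rows]
        curve.append(sum(predictions) / len(predictions))
    return curve


def toy_wait_model(row):
    """Hypothetical stand-in for a trained model predicting wait time (minutes)."""
    return 5.0 + 2.0 * row["arrivals_per_hour"] - 3.0 * row["staff"]


rows = [
    {"arrivals_per_hour": 4, "staff": 2},
    {"arrivals_per_hour": 6, "staff": 3},
]
curve = partial_dependence(toy_wait_model, rows, feature="staff", grid=[1, 2, 3])
print(curve)
```

A steep curve marks a feature that strongly drives the output; a flat curve marks one that barely matters.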

Example of AI process modeling with partial dependence plots


Conclusions
• Simulations can help find the best solutions and predict outcomes, especially when trivial guesses do not help.

• Simulations can be seen as machine learning models, which can learn their optimal parameters by closely approximating the output (ground truth).

• Process mining and AI models open new possibilities for analyzing flowing, evolving data

"All models are wrong, but some are useful"
George Box, statistician