Parameter Estimation
System
L.G. de Pillis and A.E. Radunskaya
August 22, 2002
This work was supported in part by a grant from the W.M. Keck Foundation
PARAMETER ESTIMATION
Overview
1. Gathering Data
2. Fitting Curves to Data
3. Calculating Curves
4. Function Minimization
5. Root Finding
6. Example with MATLAB
7. Estimating s and d using measured steady-state values.
8. Parameter estimation without an explicit solution.
9. Demonstration of the procedure, and results.
Gathering Data
Notes for Gathering Data slide:
Answers:
(1) a and b.
Notes: If time allows, a discussion on what parameters might be measured experimentally should precede this slide. How might the different growth terms and competition terms be measured experimentally? Is it actually possible to isolate the effects of
the different cell types? What assumptions that were made in the construction of the
model should be questioned?
Mouse Data
[Figure: tumor population in number of cells ($\times 10^8$) versus time in days.]
Curve through Mouse Data
[Figure: fitted curve through the mouse data; tumor population in number of cells ($\times 10^8$) versus time in days.]
Notes for Mouse Data slide:
Note: The previous slide shows the cubic polynomial which best fits the data. (This
curve was generated using MATLAB's Basic Fitting tool in the pull-down menu of the
Figure window.) The students should discuss what type of curve might fit the data,
with justification for their answers. Some students may recognize the data as having
the S-shaped form characteristic of logistic growth. (In this case, you may assure the
students that the logistic differential equation is solved explicitly later in this module.)
Fitting the Curve
Step Two: Fitting the Curve to the Data
Main Idea: Minimize the total distance from the model curve to the
data points.
Collected Data: $d_1, \dots, d_n$ at times $t_1, \dots, t_n$
In our example these are the values of THE NUMBER OF TUMOR
CELLS AT THE 7 DIFFERENT TIMES.
Model Solution: $x(t)$
Goal: Minimize
$$D = \sum_{i=1}^{n} (x(t_i) - d_i)^2$$
Notes for Fitting the Curve slide:
Answers:
(1) the number of tumor cells at the 7 different times.
Note: There are two points to be made here:
How can we find the solution to the model equations? What equation are we
trying to solve here? Since there are no immune cells, the model reduces to a
one-dimensional logistic equation, which can be solved explicitly. (This is worked
out in the following notes page.) Numerical methods are discussed in the module
on Numerical Methods.
Distance to Data
[Figure: Distance to Data plot.]
Finding the Curve
The $x(t_i)$ are determined:
by solving the D.E. analytically (in which case we have $x(t)$ for all values of $t$)
or
Notes for Finding the Curve slide:
Answers:
(1) differentiating the function D . See the next slide.
Note:
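1. The differential equation we are solving here is:
$$\frac{dx}{dt} = a x (1 - bx)$$
Separating variables and using partial fractions:
$$\frac{1}{x(1 - bx)}\,dx = a\,dt$$
$$\int \left( \frac{1}{x} + \frac{b}{1 - bx} \right) dx = \int a\,dt$$
$$\ln\frac{x}{1 - bx} = at + C$$
Solving for x in terms of t, and writing the constant of integration in terms of the
initial value of x gives (with the constant $e^C$ renamed $C$):
$$\frac{x}{1 - bx} = Ce^{at}$$
$$x = Ce^{at} - bCe^{at}x \quad\Longrightarrow\quad x = \frac{1}{\frac{1}{C}e^{-at} + b}$$
Letting $x(0) = x_0$:
$$x_0 = \frac{1}{\frac{1}{C} + b} \quad\Longrightarrow\quad \frac{1}{C} = \frac{1}{x_0} - b$$
so that
$$x(t) = \frac{1}{(1/x_0 - b)e^{-at} + b}$$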
2. You can also use a weighted least squares fit, or any other norm, as your criterion. See [MT73, Section 10.2] for more details.
Function Minimization
Notes for Function Minimization slide:
Answers:
(1) a=1.45
(2) b=.6167
Note: In MATLAB, once data points are plotted, you can click on the Tools pull-down
menu button in the Figure window, then choose Basic Fitting, and then linear in
the pop-up menu. Here is the answer given by MATLAB's Basic Fitting with a linear
fitting routine:
[Figure: data points (data 1) with the linear fit y = 1.5*x + 0.62.]
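For the linear model $x = at + b$, the distance function is
$$D = \sum_{i=1}^{n} (a t_i + b - x_i)^2$$
and its derivative with respect to $a$ is
$$\frac{\partial D}{\partial a} = 2\left( \sum_{i=1}^{n} a t_i^2 + \sum_{i=1}^{n} (b t_i - x_i t_i) \right)$$
Setting the result equal to zero gives
$$a = \frac{\sum (x_i t_i) - b \sum t_i}{\sum t_i^2}$$
Computing $\partial D / \partial b$ and setting the result equal to zero gives
$$a \sum_{i=1}^{n} t_i = \sum_{i=1}^{n} (x_i - b)$$
$$b = \frac{\sum t_i \sum x_i t_i - \sum x_i \sum (t_i^2)}{\left( \sum t_i \right)^2 - n \sum (t_i^2)}$$
The matrix of second derivatives is
$$H(a, b) = \begin{pmatrix} \dfrac{\partial^2 D}{\partial a^2} & \dfrac{\partial^2 D}{\partial b\,\partial a} \\ \dfrac{\partial^2 D}{\partial a\,\partial b} & \dfrac{\partial^2 D}{\partial b^2} \end{pmatrix} = \begin{pmatrix} 2\sum t_i^2 & 2\sum t_i \\ 2\sum t_i & 2n \end{pmatrix}$$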
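As a quick numerical check of the closed-form expressions for a and b, the following MATLAB sketch evaluates them on a small data set and compares the result with `polyfit`. The values of `t` and `x` are made up for illustration (they are not the data from the slide), and the variable names are assumptions.

```matlab
% Illustrative data only (assumed values, not taken from the slides).
t = [1 2 3 4 5];
x = [2.1 3.4 5.2 6.4 8.1];
n = numel(t);

St  = sum(t);       % sum of t_i
Sx  = sum(x);       % sum of x_i
Stt = sum(t.^2);    % sum of t_i^2
Sxt = sum(x.*t);    % sum of x_i * t_i

% Closed-form solution of dD/da = 0 and dD/db = 0.
b = (St*Sxt - Sx*Stt) / (St^2 - n*Stt);
a = (Sxt - b*St) / Stt;

% Hessian of D; positive eigenvalues indicate that (a, b) is a minimum.
H = [2*Stt, 2*St; 2*St, 2*n];
eigH = eig(H);

% Cross-check with polyfit, which returns [slope, intercept] for degree 1.
c = polyfit(t, x, 1);
fprintf('a = %.4f, b = %.4f (polyfit: %.4f, %.4f)\n', a, b, c(1), c(2));
```

The Hessian depends only on the times $t_i$, so the check on its eigenvalues holds for any data measured at those times.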
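For the logistic model, the distance function is
$$D = \sum_{i=1}^{n} (x(t_i) - x_i)^2$$
and its derivative with respect to $a$ is
$$\frac{\partial D}{\partial a} = 2\sum_{i=1}^{n} \left( \frac{1}{(1/x_0 - b)e^{-a t_i} + b} - x_i \right) \frac{\partial x(t_i)}{\partial a}$$
$$= 2\sum_{i=1}^{n} \left( \frac{1}{(1/x_0 - b)e^{-a t_i} + b} - x_i \right) \frac{t_i (1/x_0 - b) e^{-a t_i}}{\left[ (1/x_0 - b)e^{-a t_i} + b \right]^2}$$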
Root Finding
Usually parameter estimation requires MINIMIZING a function, which in turn
requires finding the ZEROS of its DERIVATIVES.
This can be difficult to do by hand!
Fortunately, there are NUMERICAL algorithms for finding the zeroes of functions.
For example:
NEWTON'S METHOD: To find a zero of the function $f$, iterate the equation
$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}$$
Notes for Root Finding slide:
Answers:
(1) minimizing
(2) zeroes
(3) derivative(s)
(4) numerical
Root Finding Example
Applying this algorithm to the function f (x)
[Figure: Newton's method iterations, with labeled points x0 = 3, x1 = 1.67, and x0 = 1.2.]
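A minimal MATLAB sketch of the Newton iteration is given here. The function $f(x) = x^2 - 1$ is an assumption chosen for illustration, since the slide's own function is not reproduced in these notes; it was picked because, starting from x0 = 3, its first iterate is 5/3, or about 1.67. The tolerance and iteration cap are also illustrative choices.

```matlab
% Newton's method sketch. f and fprime are assumed for illustration.
f      = @(x) x.^2 - 1;    % assumed example function
fprime = @(x) 2*x;         % its derivative

x       = 3;               % initial guess x0
tol     = 1e-10;           % stop when the update is smaller than this
maxIter = 50;              % safety cap on the number of iterations
history = x;               % record of all iterates, starting with x0

for k = 1:maxIter
    step = f(x) / fprime(x);   % Newton update f(x_i)/f'(x_i)
    x = x - step;
    history(end+1) = x;
    if abs(step) < tol
        break
    end
end
fprintf('approximate root %.10f after %d iterations\n', x, k);
```

Changing the initial guess `x` to a negative value sends the iteration to the other zero of this function, which illustrates why the choice of starting point matters.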
Notes for Root Finding Example slide:
Notes: Newton's method may be discussed at greater length here, or the previous
two slides may be omitted altogether. Alternatively, or in addition, an exercise which
uses Newton's method or other minimizing routines could be assigned for in-class or
at-home work.
A root-finding demo using Newton's method can be found in the MATLAB appendix.
MATLAB demo code: See ParDemo1 scripts.
MATLAB Curve Fitting: Solve ODE
As in Newton's method, most numerical minimizing routines require an initial
guess. The value of this initial guess is usually very important.
In this example, we use MATLAB's fminsearch routine to fit the tumor
growth data to the function.
Step A: Find an explicit formula for $T(t)$ by solving the appropriate differential
equation. Use the first point in the data set as your initial condition. The initial
value problem we are solving is (remember, we are assuming that there are no
immune cells):
$$\frac{dT}{dt} = aT(1 - bT), \qquad T_0 = T(0).$$
The solution is
$$T(t) = \frac{1}{Ce^{-at} + b} \quad \text{where} \quad C = \frac{1}{T_0} - b$$
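The formula for $T(t)$ can be checked numerically. The sketch below, offered as an illustration rather than as part of the module's demo code, evaluates the formula and compares a finite-difference estimate of $dT/dt$ with the right-hand side $aT(1 - bT)$. The values of a and b are the estimates reported later in these notes; the initial value `T0`, the time grid, and the step `h` are hypothetical choices.

```matlab
a  = 0.14;  b = 3.18e-9;   % estimates reported later in these notes
T0 = 1e6;                  % hypothetical initial tumor size
C  = 1/T0 - b;
T  = @(t) 1 ./ (C*exp(-a*t) + b);   % explicit logistic solution

% Compare a central-difference derivative with the right-hand side of the DE.
t   = linspace(0, 90, 10);
h   = 1e-4;
lhs = (T(t + h) - T(t - h)) / (2*h);    % approximate dT/dt
rhs = a*T(t) .* (1 - b*T(t));           % a*T*(1 - b*T)
relResidual = max(abs(lhs - rhs) ./ abs(rhs));

fprintf('T(0) = %g, b*T(90) = %.4f, max relative residual = %.2e\n', ...
        T(0), b*T(90), relResidual);
```

The product `b*T(90)` approaches 1 because the logistic solution levels off at the carrying capacity 1/b.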
Notes for MATLAB Curve Fitting Solve ODE slide:
Answers:
(1) $\dfrac{dT}{dt} = aT(1 - bT)$, $T_0 = T(0)$
Notes: This is the logistic equation which can be solved by partial fractions. See the
notes after slide 7.
MATLAB Curve Fitting: Choose a Metric
Step B: Write the function to be minimized as a MATLAB M-file. This function
should return the sum of the squares of the distances of the solution to the data:
$$D(a, b) = \sum_{i=1}^{n} \left( x_{(a,b)}(t_i) - d_i \right)^2$$
where the inputs to the function are THE PARAMETERS a AND b, and the
function takes as additional arguments: THE DATA POINTS $\{(t_i, d_i)\}$ AND THE
SOLUTION TO THE DE, $x(t)$.
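A minimal sketch of such a distance function follows. It is written with anonymous functions so that it can be pasted into a single script, whereas the slide asks for an M-file. The names `logisticSolution` and `distanceFunction` and the data values are hypothetical placeholders, not the demo code from the appendix.

```matlab
% Hypothetical data: row 1 holds the times t_i, row 2 the measurements d_i.
% These numbers are made up for illustration; they are not the mouse data.
data = [0    10    20     30    40;
        1e6  4e6   1.5e7  5e7   1.2e8];

% Explicit logistic solution x(t) with x(0) = x0 and parameters p = [a, b].
logisticSolution = @(p, t, x0) 1 ./ ((1/x0 - p(2)) .* exp(-p(1).*t) + p(2));

% Sum of squared differences between the solution and the data.
% Arguments: parameter vector, handle to the solution, data array.
% The first data point is used as the initial condition.
distanceFunction = @(p, solution, data) ...
    sum((solution(p, data(1,:), data(2,1)) - data(2,:)).^2);

D1 = distanceFunction([0.14, 3.18e-9], logisticSolution, data);
D2 = distanceFunction([0.30, 3.18e-9], logisticSolution, data);
fprintf('D = %.3e for a = 0.14, D = %.3e for a = 0.30\n', D1, D2);
```

Passing the solution as a function handle keeps the distance function independent of the particular model, so the same routine could be reused with a numerically computed solution.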
Notes for MATLAB Curve Fitting Choose a Metric slide:
Answers:
(1) the parameters a and b. Note: In the demonstration code included in the appendix,
a and b are components of one input vector.
(2) the data points $\{(t_i, d_i)\}$, and the solution to the DE, $x(t)$.
MATLAB Curve Fitting: Example
To calculate the distance between our DE solution evaluated with given
parameters a and b and the data points, we call with MATLAB syntax:
distance_function([parameters],@function_name,data)
where
Notes for MATLAB Curve Fitting Example slide:
Answers:
(1) D Note: See previous slide.
(2) vector [a, b].
(3) calculates the solution to the DEs, x(t).
(4) $2 \times n$
MATLAB Curve Fitting: Use FMINSEARCH
MATLAB's fminsearch function will estimate the unknown parameters in the
differential equation by MINIMIZING THE DISTANCE between the solution and
the actual data points.
MATLAB syntax: [p,fval] = fminsearch(@distance_function,[initial_guess],[],data)
fminsearch takes as input
Notes for MATLAB Curve Fitting Use FMINSEARCH slide:
Answers:
(1) minimizing the distance
(2) minimizing parameter values
(3) the minimizing parameter values
(4) distance function
Notes:
The MATLAB routine takes as arguments the name of the function to be minimized,
distance_function (the first argument of this function must be the unknown parameters),
an initial guess, in this case a vector [initial_guess], and any additional values which are
required by the function to be minimized, data. The routine outputs the minimizing
values in the vector p, as well as the value of the function itself evaluated at those
minimizing values, fval. The empty vector [ ] is a placeholder for the options argument,
and can instead contain special options to be sent to the routine. See fminsearch in the
MATLAB Help menu for details.
MATLAB Curve Fitting: Program Flow
Program Flow: The syntax will be different in other languages but the procedure remains
roughly the same.
$t_1 \quad t_2 \quad t_3 \quad \dots \quad t_n$
$x_1 \quad x_2 \quad x_3 \quad \dots \quad x_n$
MATLAB Curve Fitting: Program Flow (continued)
4. Call fminsearch
Arguments: the name of the distance function, a vector containing the initial
guesses for the minimizing values of a and b, and the data; Output: the
minimizing values of a and b in p, and the associated minimum distance.
5. Plot the data and the solution T (t) evaluated using the computed best
parameter values for visual comparison.
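The program flow can be sketched end to end as follows. This is an illustration under stated assumptions, not the ParDemo2 code: the data are synthetic, generated from the logistic solution with known parameters so that the recovered values can be checked, and the names `logistic`, `distance`, `p0` and the option values are hypothetical. The data are captured inside an anonymous function instead of being passed as a trailing argument to fminsearch; this is a design choice that differs from the call shown on the slide.

```matlab
% Synthetic data generated from the logistic solution itself, so that the
% "true" parameters are known. All numbers are assumptions for illustration;
% they are not the mouse measurements.
aTrue = 0.14;  bTrue = 3.18e-9;  T0 = 1e6;
logistic = @(p, t) 1 ./ ((1/T0 - p(2)) .* exp(-p(1).*t) + p(2));
tData = 0:10:90;
dData = logistic([aTrue, bTrue], tData);
data  = [tData; dData];                  % 2-by-n data array

% Distance function: sum of squared differences between solution and data.
distance = @(p) sum((logistic(p, data(1,:)) - data(2,:)).^2);

% Initial guess, deliberately different from the true values.
p0 = [0.2, 2.5e-9];
options = optimset('MaxFunEvals', 2000, 'MaxIter', 2000);
[p, fval] = fminsearch(distance, p0, options);

fprintf('a = %.4f, b = %.4e, distance = %.3e\n', p(1), p(2), fval);
```

For the visual comparison in step 5, a call such as `semilogy(tData, dData, 'o', tData, logistic(p, tData), '-')` would plot the data and the fitted solution on a logarithmic vertical axis.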
Notes for MATLAB Curve Fitting: Program Flow slides:
Notes: A demo of the software used in the course might be appropriate here. If a
computer-classroom is available, one of the exercises for this section could be used as
an in-class project.
MATLAB demo code: see ParDemo2.
MATLAB Curve Fitting: Graphical Output
[Figure: graphical output of the curve fit, plotted on a logarithmic vertical axis against time (10 to 100).]
Notes for MATLAB Curve Fitting Graphical Output slide:
Note: The results are plotted on a logarithmic scale because the number of tumor
cells is so large, and also because the data are given this way in the paper.
Running the MATLAB routines will give the estimated values of a and b in the vector
p to be a = 0.14 and b = $3.18 \times 10^{-9}$. Ideally, the students will be able to see this
generated in real time in class.
Estimating the Other Parameters
The output of our minimization routine gives:
Notes for Estimating the Other Parameters slide:
Answers:
(1) 0.14 day$^{-1}$
(2) $3.18 \times 10^{-9}$ cell$^{-1}$
Notes: Make note of the units here, as a reminder of the role the parameters play in the
model. Also, we point out that the results here differ from the values given in [KMTP94].
This is due to the fact that the data we used were obtained by reading off values from
the rather small figures in the paper, while the authors of the paper presumably used
numerical data obtained from experiments. In the later sections, we will revert
to the parameter values given in the article, since we feel that the data they used are
more accurate. However, the demonstration is useful in order to illustrate the process
of parameter estimation.
(3) Cytotoxic T-Lymphocytes. Note: These are part of the Effector or E-population in
our model.
(4) $3.2 \times 10^5$ cells $= 0.0032 \times 10^8$ cells
Estimating s and d
From our earlier analysis of the model, we found that without any tumor, the
number of immune cells approaches the STABLE EQUILIBRIUM (tumor-free) at
$$E_{ss} = \frac{s}{d}.$$
Other experiments show that the average lifetime of a lymphocyte is 24.25 days,
giving a death rate of $d = 0.0412$/day.
If we assume that the measured number of CTLs is the steady state value, we
can estimate:
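Written out, the arithmetic behind these estimates is as follows, taking the measured CTL count to be $3.2 \times 10^5$ cells as in the notes and treating the death rate as the reciprocal of the average lifetime:

```latex
d = \frac{1}{24.25\ \text{days}} \approx 0.0412\ \text{day}^{-1},
\qquad
s = d \, E_{ss} \approx 0.0412\ \text{day}^{-1} \times 3.2 \times 10^{5}\ \text{cells}
  \approx 1.3 \times 10^{4}\ \text{cells/day}.
```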
Notes for Estimating s and d slide:
Answers:
(1) stable equilibrium (the tumor-free equilibrium)
(2) s/d, (we solved for this equilibrium by setting T
(4) $1.3 \times 10^4$ cells/day $= 0.0412\ \text{day}^{-1} \times 3.2 \times 10^5$ cells
Incorporating the Immune Parameters
To find the remaining parameters: p, g, m, and n, we need to repeat the DATA
FITTING procedure, using the following data:
[Figure: Data From Mice With an Intact Immune System. Logarithmic vertical axis versus Time in Days (20 to 120).]
Notes for Incorporating the Immune Parameters slide:
Answers:
(1) p, (2) g , (3) m, (4) n.
(5) data fitting
Note: It might be worth recalling the biological meaning of these parameters here, to tie
the equations back to the real world:
p =
g =
m =
n =
Note also that these data have a different shape from the data for the chimeric mice:
point out that the tumor grows and then shrinks, due to the response from the immune
system. Clinically, these tumors are called immunogenic.
Parameter Estimation Without an Explicit Solution
Again, we need to minimize the DISTANCE function D(p, g, m, n) with respect
to the unknown PARAMETERS.
However, we no longer have a formula for $T(t)$. We need to compute it by
NUMERICALLY INTEGRATING the system of differential equations.
The procedure is the same:
1. Gather the DATA: $\{(t_i, d_i)\}$ in an array.
2. Numerically evaluate $T(t)$ at the TIMES $t_i$ given in the data.
3. Calculate the distance between the COMPUTED VALUES $T(t_i)$ and the
DATA, $d_i$.
4. Minimize this distance over all possible PARAMETER values.
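The four steps can be sketched in MATLAB as follows. This is an illustration under several assumptions, not the ParDemo3 code. The model equations are not written out on this slide, so the sketch assumes the two-population form from [KMTP94], $dE/dt = s + pET/(g + T) - mET - dE$ and $dT/dt = aT(1 - bT) - nET$. The initial tumor size, the time grid, and the use of synthetic data generated from the reported parameter values are all hypothetical, and the iteration cap only keeps the demonstration short.

```matlab
% Fixed parameters (values estimated earlier in these notes).
a = 0.14;  b = 3.18e-9;  s = 1.3e4;  d = 0.0412;

% Assumed right-hand side with y = [E; T] and unknowns q = [p, g, m, n].
rhs = @(t, y, q) [ (s + q(1)*y(1)*y(2)/(q(2) + y(2)) - q(3)*y(1)*y(2) - d*y(1));
                   (a*y(2)*(1 - b*y(2)) - q(4)*y(1)*y(2)) ];

y0    = [3.2e5; 1e6];   % E(0) from the CTL estimate; T(0) is hypothetical
tData = 0:10:120;       % Step 1: data times (synthetic, for illustration)

% Step 2: integrate numerically and evaluate T at the data times.
% deval(sol, t, 2) returns the second solution component, T.
solveT = @(q) deval(ode45(@(t, y) rhs(t, y, q), [0 120], y0), tData, 2);

% Synthetic "data" generated from the parameter values reported on the
% results slide, standing in for the measurements.
qRef  = [0.12579, 662945.5977, 1.2778e-10, 2.5675e-8];
dData = solveT(qRef);

% Step 3: sum-of-squares distance between computed values and data.
distance = @(q) sum((solveT(q) - dData).^2);

% Step 4: minimize, starting from a perturbed guess (iterations capped).
q0 = 1.1 * qRef;
options = optimset('MaxIter', 60, 'MaxFunEvals', 120);
[qFit, fval] = fminsearch(distance, q0, options);
fprintf('distance: %.3e at the guess, %.3e after fminsearch\n', distance(q0), fval);
```

Each evaluation of the distance now requires a full numerical integration, so the minimization is considerably slower than in the case with an explicit solution.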
Notes for Parameter Estimation Without an Explicit Solution slide:
Answers:
(1) distance. Recall: The distance function measures the distance between the computed solution and the data. We've used a sum of squares formula in our routines, but
other distances, or norms, may also be used.
(2) parameters
(3) numerically integrating
(4) data: $\{(t_i, d_i)\}$
(5) times, $t_i$
(6) computed values $T(t_i)$
(7) data, $d_i$.
(8) parameter
MATLAB demo code: see ParDemo3. This demo estimates the parameters p, g, m
and n using the data given in Figures 1 a), b) and c) of [KMTP94].
Results: Estimates of Immune Response Parameters
p = 0.12579, g = 662945.5977, m = 1.2778e-010, n = 2.5675e-008
[Figure 6: tumor values on a logarithmic scale versus time in days (20 to 120). The parameters are estimated to be the values listed above, with $n = 2.57 \times 10^{-8}$.]
Notes for Results Estimates of Immune Response Parameters slide:
Notes: Point out that the tumor values are plotted on a logarithmic scale, to conform
with the figures in the article [KMTP94]. Thus, the discrepancies at lower tumor values
look larger than they are, relative to the higher values. On the other hand, we have
not made any attempts to refine the parameter estimation procedure for this demo,
preferring a straightforward approach. The estimated parameters give graphs which
do not match the data as well for the later time values. This suggests the following:
Questions for discussion:
1. In what way could the parameter estimation procedure be modified in order to
better match the data points at the later time values? Suggestion: Give more
weight to distances between computed values and data values for the later time
points, by multiplying the squared differences by an increasing function of time. It
should be noted, also, that in our estimation we force the initial values to match,
perhaps thereby encouraging stronger agreement with the data for small time
values.
2. Is it likely that the differences between computed values and the data are due
to experimental error? How could we test this? Suggestion: The data are an
average of several experiments: the first and last sets are an average of two
individual experiments, while the second set is the result of a single experiment.
We are using all of the data together, and assuming that the parameters are the
same for all of the experimental subjects. What is known about the variation
in immune response between the different mice? Perhaps a large sample is
needed to get parameters which are optimal for all of the experiments.
3. How bad is this fit? Is it bad enough to require an adjustment in the model
equations? If so, what adjustments do the errors suggest? Suggestion: This is
a tough question. Kuznetsov et al. argue that the qualitative results, in particular
the regrowth of the tumor shown in the computed graphs, have been observed
clinically. We contend that the sample size is too small, and that parameter
values vary too widely over the different mice. A larger sample size would be
needed to reach a definitive answer as to whether this model adequately mirrors
reality. In general, we can only hope to get qualitative information from a model
which only uses two cell types from the entire organism.
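The weighting suggested under question 1 can be sketched as follows. The weight function (time divided by the final time) is just one possible increasing function, and the times, data, and model values are made up for illustration; none of them come from the experiments.

```matlab
% Made-up values for illustration: times, data, and model values at those times.
t     = [10 20 40 60 80 100];
dData = [2.0e6 9.0e6 4.0e7 1.5e7 6.0e6 9.0e6];
model = [2.2e6 8.5e6 3.6e7 1.9e7 9.0e6 1.5e7];

w = t / max(t);                             % assumed weight, increasing in time
Dplain    = sum((model - dData).^2);        % unweighted sum of squares
Dweighted = sum(w .* (model - dData).^2);   % later points count for more

% Share of the total distance contributed by the last two time points.
late = 5:6;
shareLatePlain    = sum((model(late) - dData(late)).^2) / Dplain;
shareLateWeighted = sum(w(late) .* (model(late) - dData(late)).^2) / Dweighted;
fprintf('late-time share: %.3f unweighted, %.3f weighted\n', ...
        shareLatePlain, shareLateWeighted);
```

Minimizing `Dweighted` instead of `Dplain` would push the fit toward the later data points, at the cost of looser agreement at early times.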
Note: In the analysis of the module, we will use the parameter estimates from the
article [KMTP94] as the normal values, rather than our own estimates. We do this for
two reasons: we assume that the authors of the article had access to more precise data,
and we feel that using different parameter values might be confusing to the student who
is reading the original article as she works through the module.
We collect here for completeness the parameter estimates from [KMTP94]:
References
[KMTP94] Vladimir A. Kuznetsov, Iliya A. Makalkin, Mark A. Taylor, and Alan S. Perelson. Nonlinear dynamics of immunogenic tumors: Parameter estimation and
global bifurcation analysis. Bulletin of Mathematical Biology, 56(2), 1994.
[MT73] Daniel P. Maki and Maynard Thompson. Mathematical Models and Applications. Prentice-Hall, Inc., 1973.