
intro-to-maths-computation

The document outlines a course on Introduction to Mathematical Computing, focusing on the use of MATLAB/R for symbolic and numeric computations, data visualization, and statistical analysis. It covers course objectives, teaching methodologies, assessment procedures, and essential topics such as data manipulation, linear models, and MATLAB fundamentals. Additionally, it discusses the benefits and challenges of computational mathematics, emphasizing its applications in solving complex problems in science and engineering.


Intro To Maths Computation

Bsc. mathematics and computer science (Jomo Kenyatta University of Agriculture and
Technology)



Downloaded by Francis Mbae ([email protected])

Introduction to Mathematical Computing

Duncan K. Gathungu

November 2, 2023


Content
Introduction to Mathematical Computing
Contact Hours: 45 hours
Pre-Requisites: None
Purpose of the course:
This module shows how to use scientific software to perform symbolic and numeric
computations, visualization, experimentation and much more. The module introduces
scientific computation with MATLAB/R.
Expected Learning outcomes:
Students will learn through the application of concepts and techniques covered in the
module to real data sets and models. Students will be encouraged to examine issues of
substantive interest in these studies. Successful students will be able to:

1. Use the MATLAB/R programming language.

2. Produce basic statistics and create graphs from information given

3. Work with linear models in MATLAB/R, the MATLAB/R syntax for writing
functions, iterations and conditions

Course Description
Interactive use of MATLAB/R. Basic data types. Writing scripts. Graphical facili-
ties. Writing your own functions. String processing. File input/output. Vectorization.
Numeric issues, Debugging. Introduction to Monte-Carlo methods. Reproducible re-
search. Interfacing to databases. Advanced aspects. In summary, the following topics
will be covered: Manipulation and management of data in the MATLAB/R environ-
ment, summarizing data numerically and graphically, fitting linear models in MAT-
LAB/R, functions, iterations, and conditions in MATLAB/R.

Teaching and Learning Methodology: Teaching will comprise 2 hours of formal lectures
and 3 hours of computer-based practical classes every day for five days. Students
will undertake computer-based data analysis practicals involving the use of MATLAB/R
to perform statistical analysis and modelling using real data, including producing
graphs and fitting linear models. Assignments from practical classes are handed in for feedback.

Course Assessment Procedures: Practical assignments are assessed for feedback.


Course Textbooks

1. Faraway J., (2014). Linear Models with R; 2nd Edition; Chapman and Hall; ISBN:
1439887330, ISBN: 978-1439887332.

2. Attaway, S. (2018). MATLAB: A Practical Introduction to Programming and Problem
Solving (5th Ed.). Woburn, Elsevier. ISBN-13: 9780128154793.

3. Hahn, B. & Valentine, D. (2019). Essential MATLAB for Engineers and Scientists (7th
Ed.). London, Elsevier. ISBN-13: 9780081029978.

4. Dalgaard P, (2008). Introductory Statistics with R; 2nd Edition; Springer; ISBN:
978-0-387-79054-1.

Reference Textbooks

1. Maindonald, J. and Braun, J, (2006). Data Analysis and Graphics Using R; 2nd
Revised Edition; Cambridge University Press; ISBN: 1139460536, 9781139460538.

2. Meerschaert, M. M. (1998). Mathematical Modeling (Revised Ed.). San Diego, Elsevier.
ISBN-13: 9780124876521.

3. Crawley, M.J., (2005). Statistics: An Introduction Using R; 1st Edition; Wiley, New
York; ISBN: 978-0-470-02298-6.

Course Journals

1. International Journal of Computing Science and Mathematics 1752-5063.

2. The R Journal, the R Foundation for Statistical Computing.

Reference Journals

1. Journal of Advanced Mathematics and Applications ISSN: 2156-7565.

2. The Journal of statistical software, American Statistical Association


Introduction
Computational mathematics is the use of computers to solve mathematical problems.
It is a broad field that encompasses a wide range of topics, including numerical anal-
ysis, scientific computing, and mathematical modeling. Numerical analysis is the study
of numerical methods for solving mathematical problems. Numerical methods are
approximate methods that use computers to find solutions to problems that are too
difficult or impossible to solve analytically. Scientific computing is the use of comput-
ers to solve problems in science and engineering. Scientific computing problems often
involve the solution of differential equations, optimization problems, and statistical
problems. Mathematical modeling is the use of mathematical equations to describe a
physical system. Mathematical models can be used to predict the behavior of a sys-
tem, to design new systems, and to test the validity of existing theories.

Here are some of the benefits of using computational mathematics:

• It can be used to solve problems that are too difficult or impossible to solve ana-
lytically.

• It can be used to solve problems that are too large or complex to be solved by
hand.

• It can be used to solve problems that require a large number of calculations.

• It can be used to generate new insights into a problem.

• It can be used to create simulations of real-world systems.

Computational mathematics is a rapidly growing field with many exciting applications.
As computers continue to become faster and more powerful, the potential applications
of computational mathematics will only continue to grow.

Here are some of the challenges of using computational mathematics:

• It can be difficult to choose the right numerical method for a particular problem.

• Numerical methods can be computationally expensive.

• Numerical methods can be sensitive to the input data.

• Numerical methods can produce inaccurate results.


Matrix Laboratory (MATLAB) is a powerful mathematical computing software that
is used by engineers, scientists, and researchers to solve a wide variety of problems.
MATLAB is a versatile tool that provides a wide range of functions for a variety of
tasks, including:

• Data analysis

• Data visualization

• Numerical computing

• Symbolic computing

• Simulation

• Algorithm development

• Software development

MATLAB also has a powerful graphical user interface (GUI) that makes it easy to create
and edit plots, animations, and simulations.
In this introduction to mathematical computing using MATLAB, we will cover the
following topics:

1. Getting started with MATLAB.

2. Using MATLAB for mathematical operations.

3. Implementing and using functions.

4. Implementation of mathematical models.

5. Creating interactive visualizations.

MATLAB Desktop
MATLAB may be started via the Start menu or by clicking on the MATLAB icon on the
desktop. Upon startup, a new window will open containing the MATLAB "desktop",
and one or more MATLAB windows will open within the MATLAB desktop, as seen in
Figure 1.


Figure 1: Layout of the MATLAB desktop.

The main windows are: the Command Window, Command History, Current Folder, and
Workspace. You can customize the MATLAB windows that appear upon startup by
clicking on Layout in the Toolstrip and checking (or unchecking) the windows
that you wish to appear on the MATLAB desktop.

1. Command Window: In the Command Window, you can enter commands and
data, make calculations, and print results. You can write a script in the Command
Window and execute it. However, writing a script directly into the Command Window
is discouraged because it will not be saved, and if an error is made, the entire
script must be retyped. By using the up arrow (↑) key on your keyboard, the
previous command can be retrieved (and edited) for re-execution.

2. Command History Window: This window lists a history of the commands that
you have executed in the Command Window. You can double-click on a command in
this window and it will be re-executed.

3. Current Folder Toolbar: This toolbar gives the path to the Current Folder. To run
a MATLAB script, the script needs to be in the folder listed in this toolbar (or in
a folder on the MATLAB search path).


4. Current Folder Window (on the left): This window lists all the files in the Current
Folder whose path is listed in the Current Folder Toolbar. By double-clicking
on a file in this window, the file will open within MATLAB.

5. Script Window: To open this window, click on the New Script icon in the Toolstrip
in MATLAB's desktop. This will open the Script window (see Figure 2).

6. The Script window may be used to create, edit, and execute MATLAB scripts
(programs). Scripts are then saved as M-files. These files have the extension .m,
such as heat.m. To execute the script, you can click the Save and Run icon (the
green arrow) in the Script window (see Figure 2) or return to the Command Window
and type in the name of the program (without the .m extension).

Figure 2: Creating a new script.

Example 1. If the newly created script is called heat.m, to run it in the command window, we
just type heat and press enter.


Figure 3: Running a script called heat.m in the command window.

MATLAB Fundamentals
When using MATLAB for mathematical computing, the following should be considered:

1. Variable names: Must start with a letter; can contain letters, digits, and the
underscore character; and can be of any length, but must be unique within the first
63 characters (the value returned by namelengthmax in current MATLAB releases).
Note: Do not use a variable name that is the same as a file name, a
MATLAB function name, or a self-written function name.

2. MATLAB command names and variable names are case sensitive.

3. Semicolons are usually placed after variable definitions and program statements
when you do not want the command echoed to the screen. In the absence of a
semicolon, the defined variable appears on the screen, for example, if you entered
the following assignment in the Command Window:


Alternatively, if you add the semicolon after the assignment, then your command
is entered, but there is nothing printed to the screen, and the prompt immediately
appears for you to enter your next command:
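The two cases described in this item can be tried directly in the Command Window. The following is a minimal sketch; the variable names and values are illustrative assumptions, not taken from the original figures.

```matlab
x = 5       % no semicolon: the assignment is echoed to the Command Window
y = 7;      % semicolon: the assignment is made, but nothing is printed
z = x + y   % echoed again, because the semicolon is omitted
```

Suppressing output with the semicolon matters most inside loops and when working with large matrices, where echoing every assignment would flood the screen.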

4. The percent sign (%) is used for a comment line.

5. A separate Graphics window opens to display plots and graphs.

6. MATLAB’s clear and interrupt commands.

• clear: removes all variables and data from the Workspace.

• clc: clears the Command window.

• clf: clears the Graphics window.

• Ctrl-C: aborts a program that may be running in an infinite loop.

7. Commands are case sensitive. Use lowercase letters for commands.

8. The quit command or exit command terminates MATLAB.

9. The save command saves the variables in the Workspace to a data file in the
Current Folder. The data file name will have the .mat extension.

10. User-defined functions (also called self-written functions) are also saved as M-
files.

11. Scripts and functions are saved as ASCII text files. Thus, they may be written
either in the built-in Script window, in Notepad, or in any word processor (saved
as a text file). Be aware that the single quotation mark in Microsoft Word is not
the same as the one in MATLAB and will need to be changed in the MATLAB
program.


12. The basic data structure in MATLAB is a matrix. For example, the matrix

$$A = \begin{bmatrix} 1 & 3 \\ 6 & 5 \end{bmatrix}$$

can be written in MATLAB as

A = [1 3; 6 5]

where the semicolon within the brackets indicates the start of a new row within
the matrix.

13. A specific element in the matrix can be accessed by specifying the row followed
by the column. For example, from the above matrix A we can access the number 3,
which is in the 1st row and 2nd column, by typing A(1,2).

14. The colon operator (:) may be used to

(a) Create a new matrix from an existing matrix. For example, if

$$A = \begin{bmatrix} 5 & 7 & 10 \\ 2 & 5 & 2 \\ 1 & 3 & 1 \end{bmatrix},$$

typing A(:,1) returns the first column of A, that is, the column vector with
entries 5, 2, 1. The colon in the expression A(:,1) implies all the rows in
matrix A, and the 1 implies column 1. Typing A(:,2:3) returns the 3 × 2 matrix
made up of the second and third columns of A. The first colon in the expression
A(:,2:3) implies all the rows in A, and the 2:3 implies columns 2 and 3.

(b) Generate a series of numbers. The syntax is
n = starting value : step size : final value. If the step size is omitted, the
default step size is one. For example, n = 1:8 gives the row vector
1 2 3 4 5 6 7 8. To increment in steps of 2 we use n = 1:2:8, which gives
1 3 5 7.

These types of expressions are often used in a for loop.
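The matrix and colon-operator rules in items 12 to 14 can be checked with a short script. This is a minimal sketch; the variable names on the left-hand sides are illustrative assumptions.

```matlab
A = [5 7 10; 2 5 2; 1 3 1];   % semicolons separate the rows
a12    = A(1,2);              % element in row 1, column 2 (here 7)
col1   = A(:,1);              % all rows, column 1
cols23 = A(:,2:3);            % all rows, columns 2 and 3
n = 1:8;                      % default step of 1
m = 1:2:8;                    % step of 2; stops at 7 because 9 exceeds 8
disp(col1), disp(cols23), disp(n), disp(m)
```

Note that the final value of a range is an upper bound and is not necessarily reached, as 1:2:8 shows.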

15. Arithmetic operators

* Multiplication
/ Division
+ Addition
- Subtraction
^ Power / Exponentiation

For arithmetic statements containing several of these arithmetic operators, MATLAB
carries out the operations in a specific order. Expressions within parentheses
are evaluated first, in the following order: exponentiation, then multiplication
and division, and then addition and subtraction. Expressions outside parentheses
are then evaluated in the same order. Knowing this order may help you in deciding
where parentheses are required when you write arithmetic statements. For example,
for the expression $y = \frac{c}{2m}$, you might be tempted to write the expression
in the MATLAB Command window (after defining c and m) as y = c/2*m. MATLAB
evaluates this from left to right as $(c/2)\,m$, so parentheses are required:
y = c/(2*m).
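The effect of the order of operations can be verified numerically. The sketch below uses assumed values for c and m purely for illustration.

```matlab
c = 12;  m = 3;        % illustrative values (assumed)
y_wrong = c/2*m;       % evaluated left to right as (c/2)*m
y_right = c/(2*m);     % parentheses force division by the product 2*m
p = -2^2;              % exponentiation is applied before the unary minus
fprintf('c/2*m = %g, c/(2*m) = %g, -2^2 = %g\n', y_wrong, y_right, p);
```

When in doubt, adding parentheses costs nothing and makes the intended grouping explicit.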

16. Special characters

pi        $\pi$, approximately 3.1416
i or j    $\sqrt{-1}$, the imaginary unit
Inf       $\infty$
ans       The last computed unassigned result of an expression typed in the Command window

17. Trigonometric functions

sin sine
sinh hyperbolic sine
asin inverse sine
asinh inverse hyperbolic sine
cos cosine
cosh hyperbolic cosine
acos inverse cosine
acosh inverse hyperbolic cosine
tan tangent
tanh hyperbolic tangent
atan inverse tangent
atanh inverse hyperbolic tangent

The arguments of these trigonometric functions are in radians. However, the
arguments can be given in degrees if a d is placed after the function name, for
example sind(x), cosd(x), or tand(x).

NB: $x\ (\text{radians}) = x\ (\text{degrees}) \times \dfrac{\pi}{180}$.
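The relation between the radian and degree versions can be checked as follows. This is a small sketch with an assumed angle of 30 degrees.

```matlab
theta_deg = 30;                     % illustrative angle in degrees (assumed)
s_deg = sind(theta_deg);            % sine with the argument in degrees
s_rad = sin(theta_deg*pi/180);      % same angle converted to radians first
a_deg = asind(0.5);                 % inverse sine, result returned in degrees
fprintf('sind = %.4f, sin = %.4f, asind(0.5) = %.4f\n', s_deg, s_rad, a_deg);
```

The two sines agree to rounding error; the degree versions avoid the small errors that come from multiplying by an approximate value of pi.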


18. Exponential, logarithm, square root, and error functions

exp exponential
log natural log
log10 common (base 10) logarithm
sqrt square root
erf error function

For example
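A few of these functions can be tried in the Command Window. The sketch below uses illustrative inputs chosen so that the results are easy to check by hand.

```matlab
e1 = exp(1);        % Euler's number, about 2.7183
l1 = log(e1);       % natural logarithm undoes exp
l2 = log10(1000);   % base-10 logarithm
r  = sqrt(16);      % square root
p  = erf(0);        % error function at zero
fprintf('%.4f %.4f %.4f %.4f %.4f\n', e1, l1, l2, r, p);
```

Note that log is the natural logarithm in MATLAB; the base-10 logarithm must be requested explicitly with log10.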

19. Functions for complex numbers

abs Absolute value (magnitude)
angle Phase angle (in radians)
conj Complex conjugate
imag Complex imaginary part
real Complex real part

For example
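As a quick illustration, the functions above can be applied to a single complex number. The value of z is an assumption chosen so that the magnitude is a whole number.

```matlab
z   = 3 + 4i;       % illustrative complex number (assumed)
r   = abs(z);       % magnitude, sqrt(3^2 + 4^2)
phi = angle(z);     % phase angle in radians, equal to atan2(4,3)
zc  = conj(z);      % complex conjugate, 3 - 4i
re  = real(z);      % real part
im  = imag(z);      % imaginary part
fprintf('|z| = %g, angle = %.4f rad\n', r, phi);
```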

20. Other useful functions

size(X)          Gives the size, i.e. the number of rows and number of columns, of a matrix.
length(X)        For vectors, gives the number of elements in X.
linspace(X,Y,N)  Generates N equally spaced points between X and Y.
sum(X)           For vectors, gives the sum of the elements in X.
                 For matrices, gives a row vector containing the sum of the
                 elements in each column of the matrix.
max(X)           For vectors, gives the maximum element in X.
                 For matrices, gives a row vector containing the maximum
                 in each column of the matrix.
                 If X is complex, the elements are compared by absolute value.
min(X)           Same as max(X) but gives the minimum element.
sort(X)          For vectors, sorts the elements of X in ascending order.
                 For matrices, sorts each column of the matrix in ascending order.
factorial(n)     n! = 1 × 2 × 3 × ... × n
mod(x,y)         Modulo operator; gives the remainder from the division of x by y.
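The behaviour of these functions on vectors and matrices can be seen in a short script. The matrix X and vector v below are illustrative assumptions.

```matlab
X = [4 9 2; 3 5 7];          % 2-by-3 matrix (assumed data)
v = [3 1 2];                 % row vector (assumed data)
[nr, nc] = size(X);          % number of rows and columns
len      = length(v);        % number of elements in the vector
pts      = linspace(0, 1, 5);% 5 equally spaced points from 0 to 1
colSums  = sum(X);           % column-wise sums of a matrix
colMax   = max(X);           % column-wise maxima of a matrix
sorted   = sort(v);          % ascending order
f        = factorial(5);     % 5! = 120
r        = mod(7, 3);        % remainder of 7 divided by 3
```

For a matrix, sum, max, min and sort all work column by column, which is why they return one result per column.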

21. Sometimes it is necessary to preallocate a matrix of a given size. This can be done
by defining a matrix of all zeros or ones. For example,

$$A = \mathrm{zeros}(3) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
B = \mathrm{zeros}(3, 2) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix},$$

$$C = \mathrm{ones}(3) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \qquad
D = \mathrm{ones}(2, 3) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.$$

To generate the identity matrix, i.e. a matrix with ones on the main diagonal and
zeros elsewhere, we use eye. E.g.

$$I = \mathrm{eye}(3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
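A typical use of preallocation is to reserve storage before filling an array inside a loop. The following is a minimal sketch; the names N and squares are assumptions made for illustration.

```matlab
N = 5;                       % number of entries (assumed)
squares = zeros(1, N);       % preallocate a 1-by-N row vector of zeros
for k = 1:N
    squares(k) = k^2;        % fill each entry in place
end
I3 = eye(3);                 % 3-by-3 identity matrix
disp(squares)
```

Without the call to zeros, MATLAB would have to enlarge squares on every pass through the loop, which becomes slow for large N.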

MATLAB Input and Output


Several options exist for displaying output in the Command window: omitting the
semicolon so that the output is not suppressed, using the disp() command, and
using the fprintf command.

1. The disp() command prints only the items that are enclosed within the parentheses,
which can be a variable or alphanumeric information. Alphanumeric information
must be enclosed by single quotation marks, for example disp('Hello').

2. The fprintf command prints formatted text to the screen or to a file, for example
with a call such as fprintf('V = \t %f \n', V).

Here \n moves the cursor to a new line and \t moves the cursor several spaces along
the line (a tab). %f is a placeholder for a floating-point number, which is filled
with the value of the variable V. You can also specify the field width and the
number of decimal places to display. For example, %8.3f specifies a field 8
characters wide with 3 decimal places.


Other formats include

%i or %d Used for integers
%e Scientific notation (e.g., 1.23e18), default is 6 decimal places
%g Automatically uses the briefest of %f or %e formats
%s Used for string of characters
%c Used for a single character

Remark 2. We use single quotation marks in fprintf command.
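The display commands above can be compared side by side. This is a minimal sketch: the value of V and the wording of the messages are assumptions, and sprintf (which returns the formatted text instead of printing it) is used only to make the field width visible.

```matlab
V = 12.3456789;                       % illustrative value (assumed)
disp('The volume is')                 % text enclosed in single quotes
disp(V)                               % value of a variable
fprintf('V = %8.3f\n', V);            % field width 8, 3 decimal places
fprintf('V = %e\n', V);               % scientific notation
s = sprintf('%8.3f', V);              % same format, returned as text
fprintf('[%s] has %i characters\n', s, length(s));
```

The last line shows that %8.3f pads the number with leading spaces to fill 8 characters.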

3. The fprintf command can also be used to print the results of a MATLAB program to
a file. Before you print to the file you need to open it with the fopen command. The
syntax is

fo=fopen('filename','w')

fo points to a file named filename and w indicates writing to a file. To print to
filename we can use

fprintf(fo,'format',var1,var2,.....)

The format string contains the text format for var1, var2, etc.
For example
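The original listing is not reproduced in this copy. A minimal sketch of writing a small table to a file is given below; the data, the column format, and the closing call to fclose are assumptions added for illustration.

```matlab
t = 0:2;                              % illustrative data (assumed)
x = t.^2;
fo = fopen('output.txt', 'w');        % open output.txt for writing
fprintf(fo, '   t      x\n');         % heading, written once
for k = 1:length(t)
    fprintf(fo, '%4.1f %6.2f\n', t(k), x(k));   % one row per data point
end
fclose(fo);                           % close the file when finished
```

Closing the file with fclose ensures that everything written so far is actually saved to disk.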

This creates the file output.txt in the Current Folder.


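4. An existing data file can also be read into a program by using the load command,
for example

load filename.txt

x=filename(:,1);

y=filename(:,2);

Here load stores the contents of the file in a variable named after the file
(filename), so x and y are its first and second columns.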

Loops
Loops provide the means to repeat a series of statements with just a few lines of code.

1. for loop
The syntax for the for loop is

for index variable = starting value : step size : final value

The step size may be omitted, in which case MATLAB takes the step size as 1.
For example, with an index variable m taking the values from 1 up to 20, the for loop
can be written as

Listing 1: Loop
for m=1:20                          % outer index m
    for l=1:20                      % inner index l, repeated for every m
        fprintf(' %i %i\n', m,l);   % print the current pair of indices
    end
end

MATLAB sets the index m to 1, carries out the statements between the for and
end statements, then returns to the top of the loop, changes m to 2, and repeats
the process. After the process has been carried out 20 times, the program exits
the loop without further executing any of the statements within the loop. All
statements that are not to be repeated should not be within the for loop. For
example, table headings that are not to be repeated should be outside the for
loop.
Also notice that the statements within the for loop are indented for easier reading
and debugging.

Example 3. The position x of a person in a roller coaster is given as a function of
time t by $x = x_0 + v\cos(\theta)\,t$, where $v = 10$ m/s is the velocity of travel
and $\theta = 30^\circ$ is the angle of motion. If the initial position is $x_0 = 0.0$,
determine the position at different times from 0 to 10 seconds and print the output.

Listing 2: Example 1
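The original listing is not reproduced in this copy. One possible solution to Example 3 is sketched below, assuming a time step of 1 second and an output format chosen for illustration.

```matlab
x0 = 0.0;                 % initial position (m)
v = 10;                   % velocity (m/s)
theta = 30*pi/180;        % 30 degrees converted to radians for cos
fprintf(' t (s)    x (m)\n');          % heading printed once, outside the loop
for t = 0:1:10                         % assumed step of 1 second
    x = x0 + v*cos(theta)*t;           % position at time t
    fprintf('%5.1f  %8.3f\n', t, x);
end
```

The angle is converted to radians because cos expects radians; cosd(30) could be used instead.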

The output as seen in the command window is

2. while statement
In the while loop, MATLAB will carry out the statements between the while and
end statements as long as the condition in the while statement is satisfied. If an
index in the program is required, the use of the while loop statement (unlike the
for loop statement) requires that the program generate its own index, as shown
in the following example:

Listing 3: Example 1
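The original listing is not reproduced in this copy. The following minimal sketch shows a while loop that maintains its own index; the task (summing the integers 1 to 20) is an assumption chosen for illustration.

```matlab
m = 1;                    % the program creates its own index
total = 0;
while m <= 20             % repeat as long as the condition holds
    total = total + m;    % accumulate 1 + 2 + ... + 20
    m = m + 1;            % advance the index, otherwise the loop never ends
end
fprintf('Sum = %i\n', total);
```

Forgetting the line that increments m is a common cause of infinite loops, which can be interrupted with Ctrl-C.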


Conditional operators
1. if statement
The syntax is given as

Listing 4: If statements
if logical expression
    statement;
    statement;
else
    statement;
    statement;
end

If the logical expression is true, then only the upper set of statements are exe-
cuted. If the logical expression is false, then only the bottom set of statements are
executed.

2. Logical expressions are of the form

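a == b    (a equal to b)
a < b     (a less than b)
a > b     (a greater than b)
a <= b    (a less than or equal to b)
a >= b    (a greater than or equal to b)
a ~= b    (a not equal to b)

3. Compound logical expression

Listing 5: Compound logical expressions
a > b && a ~= c    % a > b and a not equal to c
a > b || a < c     % a > b or a < c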

4. if-elseif ladder
The syntax is given as

Listing 6: if-elseif expressions
if logical expression 1
    statement(s);
elseif logical expression 2
    statement(s);
elseif logical expression 3
    statement(s);
else
    statement(s);
end

The if-elseif ladder works from top down. If the top logical expression is true,
the statements related to that logical expression are executed, and the program
will leave the ladder. If the top logical expression is not true, the program moves
to the next logical expression. If that logical expression is true, the program will
execute the group of statements associated with that logical expression and leave
the ladder. If that logical expression is not true, the program moves to the next
logical expression and continues the process. If none of the logical expressions are
true, the program will execute the statements associated with the else statement.
The else statement is not required. In that case, if none of the logical expressions
are true, no statements within the ladder will be executed.
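The top-down behaviour of the ladder can be seen in a small grading example. This is a sketch only; the mark, the thresholds, and the letter grades are assumptions made for illustration.

```matlab
score = 72;                 % illustrative mark (assumed)
if score >= 80
    grade = 'A';
elseif score >= 70          % reached only if score < 80
    grade = 'B';
elseif score >= 60          % reached only if score < 70
    grade = 'C';
else
    grade = 'F';            % none of the conditions above were true
end
fprintf('Grade: %s\n', grade);
```

Because the ladder is left as soon as one condition is true, each elseif can rely on all earlier conditions having failed.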

5. switch group
In some cases, the switch group may be used as an alternative to the if-elseif
ladder. This syntax is given as

Listing 7: Switch syntax
switch(var)
    case var1
        statement(s);
    case var2
        statement(s);
    case var3
        statement(s);
    otherwise
        statement(s);
end

where var takes on the possible values var1, var2, var3, etc.
If var equals var1, those statements associated with var1 are executed, and the
program leaves the switch group. If var does not equal var1, the program tests
if var equals var2, and if yes, the program executes those statements associated
with var2 and leaves the switch group. If var does not equal any of var1, var2,
etc., the program executes the statements associated with the otherwise state-
ment. If var1, var2, etc., are strings, they need to be enclosed by single quotation
marks. It should be noted that var cannot be a logical expression, such as var1 >=
80.
For example

Listing 8: Example of a Switch
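The original listing is not reproduced in this copy. A minimal sketch of a switch group with string cases is given below; the unit-conversion task and all variable names are assumptions made for illustration.

```matlab
unit = 'cm';                % illustrative string value (assumed)
len = 250;                  % length expressed in the given unit
switch unit
    case 'mm'
        metres = len/1000;
    case 'cm'
        metres = len/100;
    case 'm'
        metres = len;
    otherwise
        metres = NaN;       % no case matched
        disp('Unknown unit');
end
fprintf('%g %s = %g m\n', len, unit, metres);
```

Each case is compared for equality with the switch variable, which is why a range test such as a grade threshold is better handled with an if-elseif ladder.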

MATLAB Graphics
1. Plot commands
MATLAB provides many different types of plots that can be accessed by clicking
the PLOTS tab in the desktop. For example

plot(x,y)        Linear plot of y vs x
semilogx(x,y)    Semilog plot (log scale for x-axis, linear scale for y-axis)
semilogy(x,y)    Semilog plot (linear scale for x-axis, log scale for y-axis)
loglog(x,y)      Log-log plot (log scale for both x- and y-axes)

The variable arguments in the plot commands need to be vectors. In addition, the
vectors need to be of the same length. If the arguments in the plot command are
scalars, the plot commands will produce just a single point.
For example, the following commands produce a simple graph of $y = x^2$ for
$0 \le x \le 1$.

Listing 9: Line graph
clc;                 % clear the Command window
close all;           % close any open figure windows
x = 0 : 0.01 : 1;    % grid of x values
y = x.^2;            % element-wise square
plot(x,y)
grid on

In general, MATLAB draws a piecewise linear function that connects the data
points; the graph will appear smooth if the spacing between the grid points is
sufficiently small.
The 'array operations' that are built into MATLAB are very useful for generating
vectors of vertical coordinates. For example, if x is a vector with more than one
element, then x^2 is undefined. However, x.^2 denotes the vector that is obtained
by squaring the components of x.

Remark 4. When you use the plot command, y does not have to be a function of x.

Example 5. Graph a unit circle centred at the origin. Without the command axis('square')
the graph would be an ellipse due to different scaling of the horizontal and vertical axes.
Try running the following code.

Listing 10: Line graph
theta = 0 : pi/60 : 2*pi;   % parameter values around the circle
x = cos(theta);
y = sin(theta);
plot(x,y)
axis('square')              % make the plot box square

2. Multiple plots
Suppose that the vectors x1 and y1 contain horizontal and vertical coordinates for
a curve, and suppose that the vectors x2 and y2 contain the coordinates for another
curve. The command plot(x1,y1,x2,y2) plots both curves on the same graph.
The vectors x1 and x2 could be the same. This procedure can be generalized to
any number of curves.

Example 6. The following commands produce plots of the curves $y = x$, $y = x^2$,
$y = x^3$, and $y = x^4$ on the same graph.

Listing 11: Multiple graphs
x = 0 : .01 : 1;
y1 = x;
y2 = x.^2;
y3 = x.^3;
y4 = x.^4;
plot(x,y1, x,y2, x,y3, x,y4)   % four (x,y) pairs in one call

If several curves are to be plotted simultaneously, and if they all use the same
vector of horizontal coordinates, then another method can be used to plot the
curves: the vertical coordinates are stored as the rows of a matrix Y, and all the
curves are plotted with a single call plot(x,Y).

Listing 12: Multiple graphs
x = 0 : .01 : 1;
Y = [x; x.^2; x.^3; x.^4];   % one curve per row
plot(x,Y)

Multiple curves on the same graph can be distinguished by color coding
the curves. Available color types are

black     'k'
blue      'b'
green     'g'
red       'r'
cyan      'c'
yellow    'y'

Multiple curves on the same graph can also be distinguished by using different
types of lines. The available line types are

solid         '-' (default)
dashed        '--'
dashed-dot    '-.'
dotted        ':'

Alternatively you can create a marker plot of discrete points by using one of these
marker styles:

point      '.'
plus       '+'
star       '*'
circle     'o'
x-mark     'x'
diamond    'd'

For example

Listing 13: Line graph
clc;
close all;
x = 0 : 0.1 : 1;
figure
y = x.^2;
plot(x,y,'r--+')     % red dashed line with plus markers
figure
y = x.^3;
plot(x,y,'r--+')
grid on              % applies to the second figure only

3. Axis control
The axis command can be used to control the ranges of x- and y-coordinates that
are plotted. (Unless you say otherwise, Matlab will choose the ranges automat-
ically.) For example, the command axis([0 10 -1 1]) specifies that the graph win-
dow will show the region 0 ≤ x ≤ 10, −1 ≤ y ≤ 1. The same effect is obtained
by the sequence of commands v = [0 10 -1 1]; axis(v) .

Listing 14: Axis control
clc;
close all;
x = 0 : 0.1 : 1;
y = x.^2;
plot(x,y,'r--+')
grid on
axis([-0.5 1.5 -0.1 0.8])   % [xmin xmax ymin ymax]


The axis command should be invoked after the graph is plotted. In general, it is
possible to plot a graph once and then execute the axis command several times
to alter the appearance of the plot.

4. Labelling plots
Suppose that a plot is currently residing in the graphics window. Some commands:

xlabel('info')      Places the character string info immediately below the x-axis
ylabel('info')      Places the character string info next to the y-axis
title('info')       Places the character string info above the graph
text(x,y,'info')    Places the lower left corner of the character string
                    info at position (x,y) in the graphics screen
gtext('info')       Same as text, except that the position is chosen interactively with the mouse

5. Screen control
To clear the contents of the graphics window, type clf.
The command hold on holds the current graph on the screen. Subsequent graph-
ing commands will add to the current plot; everything that is already in the
graphics window will be retained, and the axes will not change. The command
hold off turns off this mode.
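The effect of hold on and hold off, together with the labelling commands from the previous item, can be sketched as follows. The curves and label text are assumptions chosen for illustration.

```matlab
x = 0 : 0.01 : 1;
plot(x, x, 'b')            % first curve
hold on                    % keep the current graph
plot(x, x.^2, 'r--')       % second curve is added to the same axes
hold off                   % later plot commands replace the graph again
xlabel('x')
ylabel('y')
title('y = x and y = x^2')
```

Without hold on, the second plot command would erase the first curve before drawing the new one.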

6. Plotting several axes in the same graph window


It is possible to divide the graph window into several subwindows and then place
a plot in each subwindow. The command subplot(m,n,p) divides the graph window
into an m × n array of subwindows and then selects the pth subwindow for
the next plot. The subwindows are numbered row-wise. If the pth subwindow
already contains a plot, then subplot(m,n,p) causes that window to become the
current window.

Example 7. Plots of the curves $y = x$, $y = x^2$, $y = x^3$, and $y = x^4$ in a
2 × 2 array of plots.

Listing 15: Multiple plots
x = 0 : .01 : 1;
% Divide the graph window into a 2x2 array of windows,
% and plot y = x in the first window.
subplot(2,2,1)
plot(x, x)
title('y = x')
% Place plots in the other windows. Several commands
% can be placed on one line.
subplot(2,2,2), plot(x, x.^2), title('y = x^2')
subplot(2,2,3), plot(x, x.^3), title('y = x^3')
subplot(2,2,4), plot(x, x.^4), title('y = x^4')
% Go back and put y-labels on the first and fourth plots.
subplot(2,2,1), ylabel('first plot')
subplot(2,2,4), ylabel('last plot')

7. Using several graph windows
The command figure can be used to create new graphics windows. If, for example,
you have just started a MATLAB session, then a graphics command would
create a graph in a window labelled Figure No. 1. Subsequent graphics commands
would overwrite the contents of that window. If you wish to retain the
contents of that window while creating new graphs, then execute the command
figure to create a new window; if Figure No. 1 is the only existing graphics window,
then the new window is labelled Figure No. 2. Subsequent executions of
figure create additional windows.

Example 8. Suppose that the only open graphics window is Figure No. 1. The following
commands plot the graph of $y = x$ in Figure No. 1, $y = x^2$ in Figure No. 2, and
$y = x^3$ in Figure No. 3. A grid is then added to the third figure.

Listing 16: figure function
x = 0 : 0.01 : 1;
y = x;
plot(x,y)          % drawn in Figure No. 1
figure             % open Figure No. 2
plot(x, x.^2)
figure             % open Figure No. 3
plot(x, x.^3)
grid on


Three dimensional graphics


1. Arrays of independent and dependent variables
MATLAB contains commands for producing contour plots and surface plots. In
each case, MATLAB plots the data contained in a rectangular matrix. The entries
in such a matrix are regarded as z-coordinates, and the row and column indices
correspond to the independent variables.
If the z-coordinates are to be calculated from an explicit formula involving independent variables x and y, it is very useful to generate matrices containing values of these variables.

Example 9. Suppose that you want to plot a function of (x, y) for 0 ≤ x ≤ 1.5 and 0 ≤ y ≤ 1, with increment 0.5 in each variable. (In practice, the increment should generally be much smaller than this.) Arrays containing values of these variables are generated by the following commands.

Listing 17: Arrays


x = 0 : 0.5 : 1.5;
y = 0 : 0.5 : 1;
[X,Y] = meshgrid(x,y);

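The output from meshgrid is

X =
         0    0.5000    1.0000    1.5000
         0    0.5000    1.0000    1.5000
         0    0.5000    1.0000    1.5000

Y =
         0         0         0         0
    0.5000    0.5000    0.5000    0.5000
    1.0000    1.0000    1.0000    1.0000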

The matrix X thus contains values of x-coordinates, and matrix Y contains values
of y coordinates. In the matrix Y, the row index increases with increasing values
of y. Don’t worry about the values of y being upside down; this is taken care of
automatically by the contour plot and surface plot routines.
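
Once X and Y are available, a formula in x and y can be evaluated on the whole grid with elementwise operators. The following is a minimal sketch; the function z = x^2 + y is an arbitrary choice made for illustration and does not come from the notes.

```matlab
% Sketch: evaluate z = x^2 + y on the grid from Example 9.
% The function z = x^2 + y is a hypothetical example.
x = 0 : 0.5 : 1.5;          % 4 values of x
y = 0 : 0.5 : 1;            % 3 values of y
[X, Y] = meshgrid(x, y);    % X and Y are both 3-by-4

Z = X.^2 + Y;               % elementwise, so Z(i,j) = x(j)^2 + y(i)

disp(size(Z))               % rows follow y, columns follow x
disp(Z(3,4))                % value at x = 1.5, y = 1
```

Note that the row index of Z follows y and the column index follows x, which is the layout the contour and surface routines expect.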


2. Contour plot
The MATLAB function contour produces contour plots of functions of two real
variables; the MATLAB function contour3 produces three-dimensional contour
plots, in which contours are placed on a three-dimensional surface.

Example 10. Produce a contour plot of the function z = e^(-y) sin x for 0 ≤ x ≤ π and 0 ≤ y ≤ 1.

Listing 18: Contour plot


close all;
x = 0 : pi/30 : pi;
y = 0 : .1 : 1;
[X,Y] = meshgrid(x,y);
Z = exp(-Y) .* sin(X);
figure
contour(Z)
figure
contour3(Z)

In the example above the contours are not labelled, and the contour levels are chosen by MATLAB.
In the next example the contour levels are specified explicitly and each contour is labelled with the corresponding value of z.

Listing 19: Contour plot


close all;
clc;
x = 0 : pi/30 : pi;
y = 0 : .1 : 1;
[X,Y] = meshgrid(x,y);
Z = exp(-Y) .* sin(X);
v = .2 : .2 : 1;
cdata = contour(x,y,Z,v);
%clabel(cdata)

clabel(cdata,v,'FontSize',18,'Color','red')
axis([-1 5 -1 2])

• vector v contains the values of z for which contours are to be drawn.


• The statement cdata = contour(x,y,Z,v); produces the plot and stores data
about the plot in an array named cdata .
• clabel(cdata) produces labels of the contour curves.

3. Plotting an implicit function


One application of contour plotting is to plot curves that are defined implicitly.
If you want to plot a curve of the form f ( x, y) = 0, make a contour plot of f with
one contour level z = 0.

Remark 11. The command contour(x,y,Z,n) produces n contour levels if n is an integer.

Example 12. Plot the curve(s) defined by e^(xy) = 1 + x + y.

Listing 20: Implicit function plot


n = 100;
x = linspace(-5,5,n);
y = linspace(-5,5,n);
[X,Y] = meshgrid(x,y);
Z = exp(X.*Y) - (1 + X + Y);
% Draw only the contour level z = 0, which is the implicit curve.
contour(x,y,Z,[0 0])
%axis([-10 10 -10 10])

4. Surface plots
Examples of plotting surfaces in three dimensions are as follows.

mesh     Represent the surface as a wire-frame mesh.
meshc    Represent the surface as a wire-frame mesh, and also display a contour plot in the (x, y) plane.
surf     Same as mesh, except that the 'panes' between the mesh curves are colored (or shaded).
surfc    Same as surf, except that a contour plot is also shown in the (x, y) plane.


Example 13. Produce a mesh plot of the surface z = e^(-y) sin x for 0 ≤ x ≤ π and 0 ≤ y ≤ 1.

Listing 21: Mesh plot


x = 0 : pi/30 : pi;
y = 0 : .1 : 1;
[X,Y] = meshgrid(x,y);
Z = exp(-Y).*sin(X);
% Passing X and Y makes the axes show x and y values rather than indices.
mesh(X,Y,Z);
xlabel('x'); ylabel('y'); zlabel('z');
figure
meshc(X,Y,Z);
xlabel('x'); ylabel('y'); zlabel('z');
figure
surf(X,Y,Z);
xlabel('x'); ylabel('y'); zlabel('z');
figure
surfc(X,Y,Z);
xlabel('x'); ylabel('y'); zlabel('z');

5. Parametric plots
The functions plot3 and comet3 can be used to plot parametric curves in three
dimensions. The function comet is a two dimensional analogue of comet3. The
mesh and surf functions can be used to plot surfaces for which z is not a function
of x and y. Instead, write x, y, and z as functions of two independent variables,
and plot a ’parametric’ surface.
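
As a small illustration of a parametric curve in three dimensions, the sketch below draws a helix with plot3. The helix x = cos t, y = sin t, z = t is a hypothetical example chosen here, not one taken from the notes.

```matlab
% Sketch: a helix x = cos t, y = sin t, z = t drawn with plot3.
% The curve and the parameter range are illustrative choices.
t = linspace(0, 6*pi, 300);   % parameter values
x = cos(t);
y = sin(t);
z = t;

plot3(x, y, z)                % curve through the points (x(i), y(i), z(i))
xlabel('x'), ylabel('y'), zlabel('z')
grid on
```

Replacing plot3 with comet3 in this sketch animates the same curve.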


Example 14. If the axis of a torus is the z-axis then the torus can be parametrized in the form x = (a + b cos ψ) cos θ, y = (a + b cos ψ) sin θ, z = b sin ψ, for 0 ≤ θ ≤ 2π, 0 ≤ ψ ≤ 2π. Here, a is the distance from the z-axis to the center of a cross-section, b is the radius of a cross-section, θ is an angle of rotation about the z-axis, and ψ is an angle of rotation within a cross-section. Here, we plot a torus for which a = 2 and b = 1.

Listing 22: Parametric plot


theta = 0 : pi/16 : 2*pi;
psi = 0 : pi/16 : 2*pi;
[T, P] = meshgrid(theta, psi);
a = 2; b = 1;
X = ( a + b*cos(P) ) .* cos(T);
Y = ( a + b*cos(P) ) .* sin(T);
Z = b*sin(P);
surf(X,Y,Z)
figure
plot3(X,Y,Z)
figure
% comet3 is documented for vector input, so flatten the matrices.
comet3(X(:),Y(:),Z(:))

Listing 23: Comet plot


t = 0:pi/50:4*pi;
x = -sin(t) - sin(t/2);
y = -cos(t) + cos(t/2);
p = 0.7;
comet(x,y,p)

For practice

function [y1,y2] = compound()
% First a table of y1 = t^2/10 and y2 = t^3/100 is created.
% To plot y1 and y2 vs. t, they need to be made vectors.
% y1 and y2 vs. t are plotted on the same graph.
clc;
t = 0:10;
y1 = zeros(size(t));   % preallocate the result vectors
y2 = zeros(size(t));
for n = 1:length(t)
    y1(n) = t(n)^2/10;
    y2(n) = t(n)^3/100;
end

fprintf('       t         y1         y2 \n');
fprintf('------------------------------\n');
for n = 1:length(t)
    fprintf('%8.1f %10.2f %10.2f \n',t(n),y1(n),y2(n));
end
% Create the plot, y1 as a solid line, y2 as a dashed line.
plot(t,y1,t,y2,'--');
xlabel('t'), ylabel('y1,y2'), grid, title('y1 and y2 vs. t');

text(6.5,2.5,'y2');

text(4.2,2.4,'y1');

Linear systems
Solving a system of linear equations is one of the most important tasks in technical computing.
We first revisit some concepts already covered in previous classes.


Recall

Zeros and roots


MATLAB provides a way to compute the roots of a polynomial. MATLAB represents a polynomial by the vector of its coefficients in descending order of powers. So the vector

p = [1 -1 -1]

represents the polynomial p(x) = x^2 − x − 1.

The roots are computed by the roots function:

r = roots(p)

produces the two roots (1 ± √5)/2, approximately 1.6180 and −0.6180.

You can also use the Symbolic Toolbox which connects to a computer algebra system
to solve the equation without converting to a polynomial. The equation involves the
symbolic variable and a double equal sign. The solve function finds the two solutions.

The pretty function displays the results in a way that resembles typeset mathematics
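
The symbolic route described above might look as follows. This is a sketch that assumes the Symbolic Math Toolbox is installed; the variable names eqn, r and vals are chosen here for illustration.

```matlab
% Sketch, assuming the Symbolic Math Toolbox is available.
syms x                      % declare the symbolic variable
eqn = x^2 - x - 1 == 0;     % '==' builds a symbolic equation
r = solve(eqn, x);          % exact roots (1 - sqrt(5))/2 and (1 + sqrt(5))/2
pretty(r)                   % typeset-style display of the roots
vals = sort(double(r));     % numeric values, for comparison with roots(p)
```

Converting with double gives numbers that can be compared directly with the output of roots([1 -1 -1]).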


For a system of equations, the solve function can also be used to obtain the solutions, even though more efficient methods exist for systems of equations.
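
A small system can be used to compare the two approaches. The sketch below assumes the Symbolic Math Toolbox; the system x + y = 3, x − y = 1 is a hypothetical example chosen here.

```matlab
% Sketch, assuming the Symbolic Math Toolbox; the system is hypothetical.
syms x y
eqns = [x + y == 3, x - y == 1];
S = solve(eqns, [x y]);     % S is a struct with fields S.x and S.y
xs = double(S.x);
ys = double(S.y);

% For a linear system, the backslash operator is the more efficient route.
A = [1 1; 1 -1];
b = [3; 1];
v = A \ b;                  % v(1) is x, v(2) is y
```

Both approaches return x = 2 and y = 1 for this system; backslash works numerically and scales to much larger linear systems.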

Bisection

Suppose we would like to compute √2. We can use interval bisection, which is a systematic trial-and-error technique. We know that √2 is between 1 and 2. Try x = 1 1/2. Because x^2 is greater than 2, this x is too big. Try x = 1 1/4. Because x^2 is less than 2, this x is too small. Continuing this way, our approximations are

1 1/2, 1 1/4, 1 3/8, 1 7/16, 1 13/32, ...

This can be implemented in MATLAB as follows, where we include a counter to record the number of iterations.

M = 2;
a = 1;
b = 2;
k = 0;
while b-a > 1.2e-10
    x = (a + b)/2;
    if x^2 > M
        b = x;
    else
        a = x;
    end
    k = k + 1;   % count this iteration before reporting it
    fprintf('Iteration %i: a = %5.8f, b = %5.8f, b-a = %5.4e, eps = %5.4e\n', k, a, b, b-a, eps)
end

Newton’s method
Newton's method for solving f(x) = 0 draws the tangent to the graph of f(x) at the current point and determines where the tangent intersects the x-axis. The method requires one starting value, x_0, and the iteration is

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} . \]
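
The iteration can be sketched in a few lines. The example below applies it to f(x) = x^2 − 2, so that the result can be compared with the bisection example; the starting value, tolerance and iteration cap are illustrative choices, not values from the notes.

```matlab
% Sketch: Newton's method for f(x) = x^2 - 2, whose positive root is sqrt(2).
% Starting value, tolerance and iteration cap are illustrative choices.
f      = @(x) x.^2 - 2;     % function whose zero we want
fprime = @(x) 2*x;          % its derivative
x      = 1;                 % starting value x0
tol    = 1e-12;
maxit  = 50;

for k = 1:maxit
    dx = f(x) / fprime(x);  % Newton step f(x_n)/f'(x_n)
    x  = x - dx;            % x_{n+1} = x_n - f(x_n)/f'(x_n)
    if abs(dx) < tol
        break
    end
end
fprintf('x = %.12f after %d iterations\n', x, k);
```

Newton's method typically needs far fewer iterations than bisection here, at the cost of requiring the derivative and a reasonable starting value.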


Ordinary Differential Equations


There are several techniques for obtaining numerical solutions of ordinary differential equations (ODEs). The choice of technique depends on the efficiency and accuracy required and on special features of the functions involved; the stiffness of the ODEs also plays a crucial role. An ODE is an equation that contains one independent variable (e.g. time) and one or more derivatives with respect to that independent variable. In the time domain, ODEs are typically posed as initial-value problems, so all the conditions are specified at the initial time (for example t = 0).
The initial value problem (IVP) for an ODE involves finding a function y(t) that satisfies

\[ \frac{dy(t)}{dt} = f(t, y(t)), \]

subject to the initial condition

\[ y(t_0) = y_0 . \]

A numerical solution to this problem generates a sequence of values for the independent variable, t_0, t_1, ..., and a corresponding sequence of values for the dependent variable, y_0, y_1, ..., so that each y_n approximates the solution at t_n:

\[ y_n \approx y(t_n), \quad n = 0, 1, \ldots \]

Modern numerical methods automatically determine the step sizes

\[ h_n = t_{n+1} - t_n , \]

so that the estimated error in the numerical solution is controlled by a specified tolerance.
The fundamental theorem of calculus gives us an important connection between differential equations and integrals:

\[ y(t+h) = y(t) + \int_t^{t+h} f(s, y(s))\, ds . \]

Numerical quadrature cannot be applied directly to approximate this integral, since the function y(s) is unknown. The basic idea is therefore to choose a sequence of values of h so that this formula can be used to generate a numerical solution.
MATLAB has several built-in functions for the numerical solution of ODEs. Before we look at some of them, we look at how some of these numerical techniques are structured.

Single step methods


The simplest numerical method for the solution of IVPs is Euler's method. It uses a fixed step size h and generates the approximate solution by

\[ y_{n+1} = y_n + h f(t_n, y_n), \qquad t_{n+1} = t_n + h . \]

The MATLAB code would use an initial point t0, a final point tfinal, an initial value y0, a step size h and a function f. The primary loop would be

t = t0;
y = y0;
while t <= tfinal
    y = y + h*f(t,y);
    t = t + h;
end
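
To see the loop in action, the sketch below applies it to a test problem with a known solution. The equation y' = −2y, y(0) = 1 and the number of steps are illustrative choices made here; the exact solution exp(−2t) is used only to measure the error. The loop counts steps rather than testing t against tfinal, which avoids taking an extra step past the end of the interval.

```matlab
% Sketch: Euler's method for y' = -2y, y(0) = 1, on [0, 1].
% Test equation and step count are illustrative; exact solution is exp(-2t).
f  = @(t, y) -2*y;
t0 = 0;  tfinal = 1;  y0 = 1;
N  = 100;                   % number of steps
h  = (tfinal - t0) / N;     % fixed step size

t = t0;
y = y0;
for n = 1:N                 % counting steps avoids overshooting tfinal
    y = y + h*f(t, y);      % y_{n+1} = y_n + h f(t_n, y_n)
    t = t + h;              % t_{n+1} = t_n + h
end

err = abs(y - exp(-2*tfinal));
fprintf('y(%g) = %.6f, error = %.2e\n', t, y, err);
```

Halving h roughly halves the error, which is what first-order accuracy means in practice.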

An improvement on this technique is called the improved Euler method, or Heun's method. It approximates f(t, y) over a step by the average of its values at t_0 and t_1. For example,

\[ y(t_1) = y_0 + \int_{t_0}^{t_1} f(t, y(t))\, dt \approx y_0 + \int_{t_0}^{t_1} \frac{f(t_0, y_0) + f(t_1, y_1)}{2}\, dt . \]


In order to approximate f(t, y(t)) by the average of its values at t_0 and t_1, we need to know its values at t_0 and t_1. We know the former but not the latter. We therefore use another method, in this case Euler's method, to provide an initial approximation ŷ_1 for y(t_1); that is, we take ŷ_1 = y_0 + h f(t_0, y_0) as an approximation of y at t_1. We then write

\[ y(t_1) \approx y_0 + \int_{t_0}^{t_1} \frac{f(t_0, y_0) + f(t_1, \hat{y}_1)}{2}\, dt = y_0 + h \left[ \frac{f(t_0, y_0) + f(t_1, \hat{y}_1)}{2} \right], \]

and take the right-hand side as y_1.

We move on to approximate y at t_2 = t_1 + h. First, Euler's method gives the initial approximation ŷ_2 = y_1 + h f(t_1, y_1), which we use to obtain

\[ y(t_2) \approx y_2 = y_1 + h \left[ \frac{f(t_1, y_1) + f(t_2, \hat{y}_2)}{2} \right]. \]

In general we first find

\[ \hat{y}_{n+1} = y_n + h f(t_n, y_n) \]

and then use it to find

\[ y_{n+1} = y_n + h \left[ \frac{f(t_n, y_n) + f(t_{n+1}, \hat{y}_{n+1})}{2} \right]. \]

This general method with ŷ_{n+1} is also called the second-order Runge-Kutta method. It is an example of a predictor-corrector method: it uses Euler's method to predict a value and then corrects it.
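
The predictor and corrector steps can be sketched as follows. The test problem y' = −2y, y(0) = 1 and the step count are illustrative choices made here, and the exact solution exp(−2t) is used only to measure the error.

```matlab
% Sketch: Heun's (improved Euler) method for y' = -2y, y(0) = 1, on [0, 1].
% Test problem and step count are illustrative; exact solution is exp(-2t).
f  = @(t, y) -2*y;
t  = 0;  y = 1;  tfinal = 1;
N  = 100;
h  = (tfinal - t) / N;

for n = 1:N
    yhat = y + h*f(t, y);                       % predictor (Euler step)
    y    = y + h*(f(t, y) + f(t + h, yhat))/2;  % corrector (average slope)
    t    = t + h;
end

err = abs(y - exp(-2*tfinal));
fprintf('Heun: y(1) = %.8f, error = %.2e\n', y, err);
```

With the same step size, the error is considerably smaller than for plain Euler, at the cost of two evaluations of f per step.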
The fourth-order Runge-Kutta method improves on the method discussed above for increased accuracy. This technique calculates the values k_1, k_2, k_3, k_4 and k from the formulas

\[
\begin{aligned}
k_1 &= h f(t_n, y_n), \\
k_2 &= h f\left(t_n + \frac{h}{2},\; y_n + \frac{k_1}{2}\right), \\
k_3 &= h f\left(t_n + \frac{h}{2},\; y_n + \frac{k_2}{2}\right), \\
k_4 &= h f(t_n + h,\; y_n + k_3),
\end{aligned}
\]


and

\[ k = \frac{1}{6} \left[ k_1 + 2k_2 + 2k_3 + k_4 \right], \]

then set y_{n+1} = y_n + k and take this as the approximation at t_{n+1} = t_n + h.
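
A direct transcription of these formulas is sketched below, again on the illustrative test problem y' = −2y, y(0) = 1 (an assumption made here, with exact solution exp(−2t)).

```matlab
% Sketch: classical fourth-order Runge-Kutta for y' = -2y, y(0) = 1, on [0, 1].
% Test problem and step count are illustrative choices.
f  = @(t, y) -2*y;
t  = 0;  y = 1;  tfinal = 1;
N  = 20;
h  = (tfinal - t) / N;

for n = 1:N
    k1 = h*f(t,       y);
    k2 = h*f(t + h/2, y + k1/2);
    k3 = h*f(t + h/2, y + k2/2);
    k4 = h*f(t + h,   y + k3);
    y  = y + (k1 + 2*k2 + 2*k3 + k4)/6;   % y_{n+1} = y_n + k
    t  = t + h;
end

err = abs(y - exp(-2*tfinal));
fprintf('RK4: y(1) = %.10f, error = %.2e\n', y, err);
```

Even with only 20 steps the error is small, which illustrates why higher-order formulas are preferred when f is smooth.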

First Order Equations


Though MATLAB is primarily a numerics package, it can solve straightforward differential equations symbolically. Suppose, for example, that we want to solve the first order differential equation given by

y′(x) = xy.

In MATLAB we can use the built-in function dsolve(). For this problem the syntax looks like this

y = dsolve('Dy = y*x','x');

Notice in particular that MATLAB uses capital D to indicate the derivative and
requires that the entire equation appear in single quotes. MATLAB takes t to be the
independent variable by default, so here x must be explicitly specified as the indepen-
dent variable. Alternatively, if you are going to use the same equation a number of
times, you might choose to define it as a variable, say, eqn1.

eqn1 = 'Dy = y*x'
y = dsolve(eqn1,'x')

To solve an IVP with, say, the initial condition y(1) = 1, we can use either of the following structures

eqn1 = 'Dy = y*x';
y = dsolve(eqn1,'y(1)=1','x');

%% OR we can use the following syntax
inits = 'y(1)=1';
y = dsolve(eqn1,inits,'x');

Now that we’ve solved the ODE, suppose we want to plot the solution to get a rough
idea of its behavior. We run immediately into two minor difficulties:


1. Our expression for y(x) isn't suited for array operations (.*, ./, .^).

2. y, as MATLAB returns it, is actually a symbolic object.

The first of these obstacles is straightforward to fix, using vectorize(). For the second, we employ the useful command eval(), which evaluates or executes text strings that constitute valid MATLAB commands. Hence, we can use

x = linspace(0,1,20);
z = eval(vectorize(y));
plot(x,z)

Remark 15. eval() evaluates strings (character arrays), and y, as we have defined it, is a
symbolic object. However, vectorize converts symbolic objects into strings.
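
The quoted-string form of dsolve used above belongs to older releases. Newer releases of the Symbolic Math Toolbox are built around symbolic equations and may not accept character input, so it is worth checking the documentation for your version. The following sketch shows the same IVP in the symbolic-function style and uses matlabFunction in place of eval(vectorize()); it assumes a reasonably recent Symbolic Math Toolbox, and the names ode, cond, ySol and yfun are chosen here for illustration.

```matlab
% Sketch, assuming a recent Symbolic Math Toolbox.
syms y(x)                          % y is a symbolic function of x
ode  = diff(y, x) == x*y;          % y'(x) = x*y
cond = y(1) == 1;                  % initial condition y(1) = 1
ySol = dsolve(ode, cond);          % symbolic solution, exp(x^2/2 - 1/2)

yfun = matlabFunction(ySol);       % numeric, vectorized function handle
xx   = linspace(0, 1, 20);
plot(xx, yfun(xx))
xlabel('x'), ylabel('y')
```

The handle yfun accepts vectors directly, so no string conversion is needed before plotting.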

Second and Higher Order Equations


Suppose we want to solve and plot the solution to the second order equation given by

\[ \frac{d^2 y(x)}{dx^2} + 8 \frac{dy(x)}{dx} + 2 y(x) = \cos x, \quad y(0) = 0, \quad \frac{dy}{dx}(0) = 1 . \]

We can implement it as follows

eqn2 = 'D2y + 8*Dy + 2*y = cos(x)';
inits2 = 'y(0)=0, Dy(0)=1';
y = dsolve(eqn2,inits2,'x');
x = linspace(0,1);
z = eval(vectorize(y));
plot(x,z)

Systems of ODEs
Suppose we want to solve and plot solutions to a system of three ordinary differential equations.

Example 16. Obtain the solution of the following system of ODEs

\[
\begin{aligned}
x'(t) &= x(t) + 2y(t) - z(t), \\
y'(t) &= x(t) + z(t), \\
z'(t) &= 4x(t) - 4y(t) + 5z(t).
\end{aligned}
\]

We can implement it as follows

[x,y,z] = dsolve('Dx=x+2*y-z','Dy=x+z','Dz=4*x-4*y+5*z');

Remark 17. If you use MATLAB to check your work, keep in mind that its choice of constants C1, C2, and C3 probably won't correspond with your own. For example, you might have C = −2C1 + (1/2)C3, so that the coefficients of exp(t) in the expression for x are combined. Fortunately, there is no such ambiguity when initial values are assigned. Notice that since no independent variable was specified, MATLAB used its default, t.

To solve an initial value problem, we simply define a set of initial values and add them at the end of our dsolve() command. Suppose we have x(0) = 1, y(0) = 2, and z(0) = 3. We have, then,

inits = 'x(0)=1,y(0)=2,z(0)=3';
[x,y,z] = dsolve('Dx=x+2*y-z','Dy=x+z','Dz=4*x-4*y+5*z',inits);
t = linspace(0,.5,25);
xx = eval(vectorize(x));
yy = eval(vectorize(y));
zz = eval(vectorize(z));
plot(t, xx, t, yy, t, zz)

MATLAB Ordinary Differential Equations Solvers


The names of the MATLAB ordinary differential equation solvers are all of the form
odennxx with digits nn indicating the order of the underlying method and a possibly
empty xx indicating some special characteristic of the method. If the error estimate
is obtained by comparing formulas with different orders, the digits nn indicate these
orders. For example, ode45 obtains its error estimate by comparing a fourth-order and
a fifth-order formula.
These solvers can be used with the following syntax:


[outputs] = function_handle(inputs)
[t, state] = solver(@dstate, tspan, Initialconditions, options)

where

1. state: An array. The solution of the ODE (the values of the state at every time).

2. solver: MATLAB algorithm e.g. ode45, ode23 etc

3. dstate: Handle for the function containing the derivatives.

4. tspan: Vector that specifies the interval of the solution e.g. [t0 : 5 : tf].

5. Initial conditions: A vector of the initial conditions for the system (row or column).
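
The calling pattern can be tried on a small problem. The sketch below uses a hypothetical test equation, y' = −2y with y(0) = 1, whose exact solution exp(−2t) is used only to check the result. Because dstate is defined here as an anonymous function, the handle is passed directly rather than with the @ prefix.

```matlab
% Sketch of the calling pattern on a hypothetical test problem,
% y' = -2y with y(0) = 1, whose exact solution is exp(-2t).
dstate  = @(t, y) -2*y;                       % derivative function
tspan   = [0 1];                              % integrate from t0 = 0 to tf = 1
y0      = 1;                                  % initial condition
options = odeset('RelTol', 1e-6, 'AbsTol', 1e-8);

[t, state] = ode45(dstate, tspan, y0, options);

plot(t, state, 'o-')
xlabel('t'), ylabel('y')
err = abs(state(end) - exp(-2));              % error at t = 1
```

The output t is a column vector of the times chosen by the solver, and state holds the solution at those times, one row per time.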

Different numerical methods reduce the error at different rates as the step size decreases. For example, Euler's method, the midpoint method and the classical Runge-Kutta method are of 1st, 2nd and 4th order respectively. The solvers are implemented differently, and which one is appropriate depends on the circumstances.
In summary

1. ode45: Based on an explicit Runge-Kutta 4th and 5th order formula. In computing y(t_{n+1}), it needs only the solution at the immediately preceding time point, y(t_n).
Usage: Nonstiff problems, medium accuracy. Use most of the time. This should be the first solver you try.

2. ode23: Based on an explicit Runge-Kutta 2nd and 3rd order formula. It is often more efficient than ode45 at crude tolerances and in the presence of moderate stiffness.
Usage: Nonstiff problems, low accuracy. Use for large error tolerances or moderately stiff problems.

3. ode113: Uses a variable-order Adams-Bashforth-Moulton predictor-corrector algorithm. It is often more efficient than ode45 at stringent tolerances and if the ordinary differential equation file function is particularly expensive to evaluate.
Usage: Nonstiff problems, low to high accuracy. Use for stringent error tolerances or computationally intensive ordinary differential equation functions.


4. ode15s: A variable-order solver based on the numerical differentiation formulas (NDFs). Optionally, it uses the backward differentiation formulas (BDFs, also known as Gear's method), which are usually less efficient. Like ode113, ode15s is a multistep solver. Try ode15s if ode45 fails or is very inefficient and you suspect that the problem is stiff, or if you are solving a differential-algebraic problem.
Usage: Stiff problems, low to medium accuracy. Use if ode45 is slow (stiff systems) or there is a mass matrix.

5. ode23s: Based on a modified Rosenbrock formula of order two. Because it is a one-step solver, it is often more efficient than ode15s at crude tolerances. It can solve some kinds of stiff problems for which ode15s is not effective.
Usage: Stiff problems, low accuracy. Use for large error tolerances with stiff systems or with a constant mass matrix.

6. ode23t: An implementation of the trapezoidal rule using a 'free' interpolant. Use this solver if the problem is only moderately stiff and you need a solution without numerical damping. ode23t can solve differential-algebraic equations.
Usage: Moderately stiff problems, low accuracy. Use for moderately stiff problems where you need a solution without numerical damping.

7. ode23tb: An implementation of TR-BDF2, an implicit Runge-Kutta formula with a first stage that is a trapezoidal rule step and a second stage that is a BDF of order two. By construction, the same iteration matrix is used in evaluating both stages. Like ode23s, this solver is often more efficient than ode15s at crude tolerances.
Usage: Stiff problems, low accuracy. Use for large error tolerances with stiff systems or if there is a mass matrix.

Example 18. Numerically approximate the solution of the first order differential equation

\[ \frac{dy}{dx} = x y^2 + y, \quad y(0) = 1, \]

on the interval x ∈ [0, 0.5].

For any differential equation in the form y′ = f(x, y), we begin by defining the function f(x, y). For single equations, we can define f(x, y) as an inline (anonymous) function. For this example we can implement it as follows

% An anonymous function is used here; inline() is the older equivalent.
f = @(x,y) x*y^2 + y;
[x,y] = ode45(f,[0 .5],1);
plot(x,y);

Choosing the partition


In approximating this solution, the algorithm ode45 has selected a certain partition of the interval [0, .5], and MATLAB has returned a value of y at each point in this partition. It is often the case in practice that we would like to specify the partition of values on which MATLAB returns an approximation. For example, we might only want to approximate y(.1), y(.2), ..., y(.5). We can specify this by entering the vector of values [0, .1, .2, .3, .4, .5] as the domain in ode45. That is, we use

xvalues = 0:.1:.5;
[x,y] = ode45(f,xvalues,1);
plot(x,y)

Remark 19. It is important to point out here that MATLAB continues to use roughly the same partition of values that it originally chose; the only thing that has changed is the set of values at which it returns a solution. In this way, no accuracy is lost.

Options
Several options are available for the MATLAB ode45 solver, giving the user limited control over the algorithm. Two important options are the relative and absolute tolerances, respectively RelTol and AbsTol in MATLAB. At each step of the ode45 algorithm, an error is estimated for that step. If y_k is the approximation of y(x_k) at step k, and e_k is the estimated error at this step, then MATLAB chooses its partition to ensure

\[ e_k \le \max(\mathrm{RelTol} \times |y_k|,\ \mathrm{AbsTol}), \]

where the default values are RelTol = .001 and AbsTol = .000001. As an example of when we might want to change these values, observe that if y_k becomes large, then the error e_k will be allowed to grow quite large. In this case, we decrease the value of RelTol. For the equation y′ = xy^2 + y, with y(0) = 1, the values of y get quite large as x nears 1. In fact, with the default error tolerances, we find that the command


f = @(x,y) x*y^2 + y;   % anonymous function; inline() is the older equivalent
[x,y] = ode45(f,[0,1],1);
plot(x,y);

leads to a warning message, caused by the fact that the values of y are getting too large as x nears 1. (Note at the top of the column vector for y that it is multiplied by 10^14.) In order to fix this problem, we choose a smaller value for RelTol.

options = odeset('RelTol',1e-10);
[x,y] = ode45(f,[0,1],1,options);
max(y)

which is 2.4251e+07.
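
The rapid growth near x = 1 can be confirmed by hand, because the equation is of Bernoulli type. The derivation below is an addition to the notes; the substitution u = 1/y and the constant C are introduced here for illustration.

```latex
% Substitute u = 1/y in y' = x y^2 + y, y(0) = 1.
\[
u = \frac{1}{y}
\;\Longrightarrow\;
u' = -\frac{y'}{y^2} = -\frac{x y^2 + y}{y^2} = -x - u,
\qquad u(0) = 1 .
\]
% The linear equation u' + u = -x has the general solution
\[
u(x) = C e^{-x} + 1 - x ,
\]
% and u(0) = 1 forces C = 0, so
\[
u(x) = 1 - x
\;\Longrightarrow\;
y(x) = \frac{1}{1 - x} .
\]
```

The exact solution therefore has a vertical asymptote at x = 1, which is why any solver struggles as x approaches 1 and why a tighter relative tolerance is needed there.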

Example 20. Using a function file, the ODE in the previous example can be implemented as follows. The function is saved in the file firstode.m:

function yprime = firstode(x,y)
% FIRSTODE: Computes yprime = x*y^2+y
yprime = x*y^2 + y;

The solver is then called from the command line or from a script:

xspan = [0,.5];
y0 = 1;
[x,y] = ode23(@firstode,xspan,y0);
plot(x,y)

Example 21. Solve the first order ODE given by

\[ \frac{dy}{dt} = \alpha y(t) - \gamma y(t)^2, \quad y(0) = 10 . \]

We implement this in MATLAB as follows

function [t,y] = example_seven()
tspan = [0 9];   % set time interval
y0 = 10;         % set initial condition

    % dstate evaluates the r.h.s. of the ODE
    function dydt = dstate(t,y)
        alpha = 2;
        gamma = 0.0001;
        dydt = alpha*y - gamma*y^2;
    end

[t,y] = ode45(@dstate,tspan,y0);
plot(t,y)
disp([t,y])      % displays t and y(t)
hold on
% Separate names keep the returned t and y consistent with each other.
[t1,y1] = ode23(@dstate,tspan,y0);
plot(t1,y1)
[t2,y2] = ode113(@dstate,tspan,y0);
plot(t2,y2)
hold off
end

Example 22. Implement the following system of equations (the Van der Pol equations in relaxation oscillation) in MATLAB and plot the solutions. The system is given by

\[
\begin{aligned}
\frac{dy}{dt} &= x, & y(0) &= 2, \\
\frac{dx}{dt} &= 1000\left(1 - y^2\right) x - y, & x(0) &= 0 .
\end{aligned}
\]

This is a stiff system because the limit cycle has portions where the solution components change slowly, alternating with regions of very sharp change, so we will need ode15s. Hence we implement it as follows

function [T,Y] = call_osc()
tspan = [0 3000];
y_0 = 2;
x_0 = 0;
[T,Y] = ode15s(@osc,tspan,[y_0 x_0]);
plot(T,Y(:,1),'o')

    function dydt = osc(t,y)
        % Create a 2-by-1 column vector of zeros to hold the two derivatives.
        dydt = zeros(2,1);
        dydt(1) = y(2);
        dydt(2) = 1000*(1 - y(1)^2)*y(2) - y(1);
        % In this case, y(1) is y and y(2) is x, and dydt(1)
        % is dy/dt and dydt(2) is dx/dt.
    end
end

Arguments From data


Motivation: When the US tested the atomic bomb in 1945, the British mathematician Sir Geoffrey Taylor was able to estimate accurately the energy released by the bomb, using dimensional analysis of the radius of the shock wave as a function of time, measured from film footage of the explosion. Since the data was still classified, he assumed that the expanding shock wave with radius R due to the explosion could be expressed as

\[ R = f(t, E, \rho, p), \tag{1} \]

where t is time, E is the released energy (a function of the mass of the bomb), ρ is the density of the ambient air and p is the air pressure. From dimensional analysis, he deduced the formula for the radius of the shock wave as

\[ R = \left( \frac{t^2 E}{\rho} \right)^{1/5}, \tag{2} \]

which describes the radius of the shock wave as a function of t and the parameters E and ρ. Given the measurement data (t, R(t)) and the value of the density, ρ = 1.25 kg/m^3, it was possible to estimate E and hence the yield of the nuclear bomb, using the following data.


Time (in milliseconds)    Radius (metres)
0.10                      11.1
0.24                      19.9
0.38                      25.4
0.52                      28.8
0.66                      31.9
...                       ...
3.53                      61.1
3.80                      62.9
4.07                      64.3

From equation (2), taking logs on both sides yields

\[ \log R = \frac{2}{5} \log t + \frac{1}{5} \log E - \frac{1}{5} \log \rho . \tag{3} \]

Rewriting (3), we have

\[ \log R = \frac{2}{5} \log t + b, \quad \text{where } b = \frac{1}{5} \log E - \frac{1}{5} \log \rho . \tag{4} \]

• E ≈ 8.05 × 10^13 joules can be obtained by a least squares fit of the data. Using a conversion factor of 1 kiloton = 4.186 × 10^12 joules, Taylor was able to estimate the yield of the bomb as 19.2 kilotons; it was later revealed that the actual yield was 21.1 kilotons. (Taylor's approach was quite accurate.)

• Now exploring the usefulness of the data above, the logarithmic representation given by (4) is equivalent to an equation of the form

y(t) = αx(t) + β,

with the new variables y = log R and x = log t, known parameter α = 2/5 and unknown parameter β.

• If the measurement data

{(t_i, R(t_i)) : i = 1, 2, ...}

were exact, it would be possible to determine the unknown coefficient β from a single measurement (x(t_1), y(t_1)) = (log t_1, log R(t_1)).


• Taking into account that the measurement data are subject to measuring errors, e.g., from the measurement apparatus or from other sources of error not part of the measurement model, a more realistic model could be an equation of the form

\[ Y(t) = \alpha X(t) + \beta + \epsilon(t), \tag{5} \]

where ϵ(t) represents statistical noise or errors in obtaining the data.
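
The fit of the intercept in (4) can be sketched with the rows that appear in the table above. The sketch assumes natural logarithms and converts the times to seconds; because the elided rows of the table are not available, the estimate it produces will differ somewhat from the value of E quoted earlier.

```matlab
% Sketch: estimate E from the listed rows of the table by fixing the
% slope alpha = 2/5 and fitting only the intercept b in equation (4).
% Assumptions: natural logarithms, and t converted from ms to seconds.
t_ms = [0.10 0.24 0.38 0.52 0.66 3.53 3.80 4.07];   % time in milliseconds
R    = [11.1 19.9 25.4 28.8 31.9 61.1 62.9 64.3];   % radius in metres
rho  = 1.25;                                         % air density, kg/m^3

t = t_ms * 1e-3;                    % convert to seconds
x = log(t);
y = log(R);

% With the slope fixed, the least squares intercept is the mean residual.
b = mean(y - (2/5)*x);

% From (4): b = (1/5) log E - (1/5) log rho, so E = rho * exp(5 b).
E = rho * exp(5*b);
kilotons = E / 4.186e12;
fprintf('E = %.3g J (about %.1f kilotons)\n', E, kilotons);
```

Fixing the slope at 2/5 uses the physics of equation (2); fitting both slope and intercept with polyfit would instead test whether the data support that exponent.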

Curve fitting and linear least squares Problem

Curve Fitting Problems


Given data (x_1, y_1), ..., (x_n, y_n), we consider the following curve-fitting models:

1. Find a straight line y = ax + b as in (5) that 'best fits' all the data points.

2. Find an mth order polynomial

\[ y = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0 \]

that 'best fits' all data points.

3. Find a curve of the form

\[ y = a_0 f_0(x) + a_1 f_1(x) + \cdots + a_m f_m(x) \]

that 'best fits' all data points. Here f_0(x), f_1(x), ..., f_m(x) are given functions.
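
The three models can be tried out with polyfit and the backslash operator. The sketch below uses made-up data, and the basis functions 1, x and sin x for the third model are an arbitrary choice made here for illustration.

```matlab
% Sketch of the three models on made-up data (the numbers are hypothetical).
x = [0 1 2 3 4 5]';
y = [1.1 1.9 3.2 3.8 5.1 6.0]';

% Model 1: straight line; polyfit returns [slope intercept].
p1 = polyfit(x, y, 1);

% Model 2: polynomial of order m = 2; coefficients in descending powers.
p2 = polyfit(x, y, 2);

% Model 3: general basis, here f0(x) = 1, f1(x) = x, f2(x) = sin(x).
F = [ones(size(x)), x, sin(x)];   % one column per basis function
a = F \ y;                        % least squares coefficients a0, a1, a2

% Compare the residual norms of the three fits.
r1 = norm(y - polyval(p1, x));
r2 = norm(y - polyval(p2, x));
r3 = norm(y - F*a);
fprintf('residuals: %.4f  %.4f  %.4f\n', r1, r2, r3);
```

Models 2 and 3 both contain the straight line as a special case, so their residuals cannot exceed that of model 1; a smaller residual alone does not mean a better model.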

Least Square Fitting:


Suppose that y = a_1 x + a_0 is a line of best fit, also called a regression line. We need to clarify the meaning of 'best fit'.

• When n > 2, there is little hope that a single line passes through all the data points, i.e., that all the following equations hold simultaneously:

\[
\begin{aligned}
y_1 &= a_1 x_1 + a_0, \\
y_2 &= a_1 x_2 + a_0, \\
&\;\;\vdots \\
y_n &= a_1 x_n + a_0 .
\end{aligned}
\]

We look for a best fitting line that minimises the total error

\[ d(a_0, a_1) = [y_1 - (a_1 x_1 + a_0)]^2 + \cdots + [y_n - (a_1 x_n + a_0)]^2 . \tag{6} \]

Data          Prediction        Error
x_1   y_1     a_1 x_1 + a_0     y_1 − (a_1 x_1 + a_0)
x_2   y_2     a_1 x_2 + a_0     y_2 − (a_1 x_2 + a_0)
...   ...     ...               ...
x_n   y_n     a_1 x_n + a_0     y_n − (a_1 x_n + a_0)

From the table, d(a_0, a_1) measures the total error between the data y_i and the prediction a_1 x_i + a_0 for i = 1, ..., n. This is a standard minimization problem: we look for a pair (â_0, â_1) at which the function d(a_0, a_1) is a minimum.

• The choice of the Euclidean norm for the measure of error gives rise to the term 'least squares'. It ensures that d(a_0, a_1) is a differentiable function.

Mathematical Formulation of least square problem


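Let

\[
b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \quad
A = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad
x = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} .
\]

Then the least squares solution x̂ = (â_0, â_1)^T satisfies

\[ \| b - A\hat{x} \| \le \| b - Ax \| \tag{7} \]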


for all x ∈ R^2. Here ||·|| denotes the Euclidean norm and the superscript T denotes transposition of matrices and vectors.

• The expression given by (6) is in fact the square of the Euclidean norm, ||b − Ax||^2, and we use the fact that ||b − Ax|| is minimized if and only if ||b − Ax||^2 is minimized.

More generally, let A be an m × n matrix and b ∈ R^m. Then x̂ ∈ R^n is a least squares solution of Ax = b if x̂ satisfies

\[ \| b - A\hat{x} \| \le \| b - Ax \| \tag{8} \]

for all x ∈ R^n.
Geometrical Illustration:
A least squares solution $\hat{x}$ can be found from geometrical observations. The set

\[ \operatorname{col}(A) = \{ Ax : x \in \mathbb{R}^n \} \]

is the column space of the matrix $A$. Therefore, minimizing $d = \|b - Ax\|$ is equivalent to finding the distance from the vector $b$ to the subspace $\operatorname{col}(A)$.

• From geometry, we know that this distance is attained at the orthogonal projection of $b$ onto the subspace $\operatorname{col}(A)$,

\[ \hat{b} = \operatorname{Proj}_{\operatorname{col}(A)} b. \]

Therefore, the least squares solution necessarily satisfies

\[ A\hat{x} = \hat{b}. \tag{9} \]

• It is desirable to find a solution $\hat{x}$ of (9) without having to compute the projection $\hat{b}$. Consider the following diagram.

[Figure: the vector $b$, its orthogonal projection $\hat{b} = \operatorname{Proj}_{\operatorname{col}(A)} b$ onto the subspace $\operatorname{col}(A)$, and the residual $b - \hat{b}$.]

From this geometry, $b - \hat{b}$ is orthogonal to $\operatorname{col}(A)$, and thus $(b - A\hat{x}) \perp \operatorname{col}(A)$. In terms of the dot product, $b - A\hat{x}$ has zero dot product with every column of $A$. In matrix form, this relation becomes

\[ A^T (b - A\hat{x}) = 0. \]

• Based on these observations, a least squares solution $\hat{x}$ necessarily satisfies the normal system

\[ A^T A \hat{x} = A^T b. \tag{10} \]

• The least squares solution $\hat{x}$ may not be unique; however, the projection $\hat{b} = A\hat{x}$ is unique, and systems (9) and (10) have the same set of solutions.

Theorem 23. The following statements are equivalent:

1. The least squares solution is unique for each $b \in \mathbb{R}^m$.

2. The columns of $A$ are linearly independent ($A$ has full column rank).

3. The matrix $A^T A$ is invertible; in this case $\hat{x} = (A^T A)^{-1} A^T b$.

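Example 24. Find the line $y = a_0 + a_1 x$ that best fits the data points $(2, 1)$, $(5, 2)$, $(7, 3)$ and $(8, 3)$.

In this case

\[
A = \begin{bmatrix} 1 & 2 \\ 1 & 5 \\ 1 & 7 \\ 1 & 8 \end{bmatrix}, \quad
b = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 3 \end{bmatrix}, \quad
x = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}.
\]

The normal system $A^T A x = A^T b$ becomes

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 5 & 7 & 8 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 1 & 5 \\ 1 & 7 \\ 1 & 8 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 5 & 7 & 8 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 3 \\ 3 \end{bmatrix},
\]

namely

\[
\begin{bmatrix} 4 & 22 \\ 22 & 142 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
=
\begin{bmatrix} 9 \\ 57 \end{bmatrix}.
\]

On solving,

\[
\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
=
\begin{bmatrix} 4 & 22 \\ 22 & 142 \end{bmatrix}^{-1}
\begin{bmatrix} 9 \\ 57 \end{bmatrix}
=
\begin{bmatrix} \tfrac{2}{7} \\[2pt] \tfrac{5}{14} \end{bmatrix},
\]

hence the least squares line is $y = \tfrac{2}{7} + \tfrac{5}{14} x$.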

[Figure: plot of the data points for Example 24, with $x$ on the horizontal axis (2 to 8) and $y$ on the vertical axis (0.5 to 3.5).]
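The computation in Example 24 can be checked numerically. The following is a minimal MATLAB sketch, not part of the original notes; the variable names `a_normal` and `a_qr` are chosen here for illustration. It solves the normal system (10) explicitly and also lets the backslash operator solve the least squares problem directly.

```matlab
% Data points from Example 24
x = [2; 5; 7; 8];
y = [1; 2; 3; 3];

% Design matrix for the model y = a0 + a1*x
A = [ones(size(x)), x];

% Solve the normal system (A'*A)*a = A'*y
a_normal = (A' * A) \ (A' * y);

% Let backslash solve the overdetermined system in the least squares sense
a_qr = A \ y;

fprintf('a0 = %.4f, a1 = %.4f\n', a_normal(1), a_normal(2));
```

Both approaches should agree with the exact values $a_0 = 2/7$ and $a_1 = 5/14$. For larger or ill-conditioned problems, `A \ y` is usually preferred because it avoids forming $A^T A$.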

Example 25. Find the quadratic curve that best fits the data points $(2, 1)$, $(-1, 5)$, $(6, 2)$, $(4, -1)$.
The quadratic curve is of the form

\[ y = a_2 x^2 + a_1 x + a_0. \]

• We need to find the coefficients $(\hat{a}_0, \hat{a}_1, \hat{a}_2)$ such that

\[
\begin{aligned}
d ={}& [(\hat{a}_0 + 2\hat{a}_1 + 4\hat{a}_2) - 1]^2 + [(\hat{a}_0 + (-1)\hat{a}_1 + \hat{a}_2) - 5]^2 \\
&+ [(\hat{a}_0 + 6\hat{a}_1 + 36\hat{a}_2) - 2]^2 + [(\hat{a}_0 + 4\hat{a}_1 + 16\hat{a}_2) - (-1)]^2
\end{aligned}
\]

is smallest among all choices of $(a_0, a_1, a_2)$. Let

\[
A = \begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \\ 1 & x_4 & x_4^2 \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 4 \\ 1 & -1 & 1 \\ 1 & 6 & 36 \\ 1 & 4 & 16 \end{bmatrix}, \quad
b = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}
= \begin{bmatrix} 1 \\ 5 \\ 2 \\ -1 \end{bmatrix}, \quad
x = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}.
\]

The normal system $A^T A x = A^T b$ becomes

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & -1 & 6 & 4 \\ 4 & 1 & 36 & 16 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 4 \\ 1 & -1 & 1 \\ 1 & 6 & 36 \\ 1 & 4 & 16 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & -1 & 6 & 4 \\ 4 & 1 & 36 & 16 \end{bmatrix}
\begin{bmatrix} 1 \\ 5 \\ 2 \\ -1 \end{bmatrix},
\]

namely

\[
\begin{bmatrix} 4 & 11 & 57 \\ 11 & 57 & 287 \\ 57 & 287 & 1569 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 7 \\ 5 \\ 65 \end{bmatrix}.
\]

The unique solution is

\[
\begin{aligned}
\hat{a}_0 &\approx 2.9719, && (11)\\
\hat{a}_1 &\approx -1.909, && (12)\\
\hat{a}_2 &\approx 0.2826. && (13)
\end{aligned}
\]

The unique least squares curve is $y = 0.2826x^2 - 1.909x + 2.9719$.

Remark 26. In the above example, we use a non-linear best-fit function. Why do we still call it a linear least squares problem? Because the best-fit function is a linear combination of basis functions, and finding the best-fit function means finding the best choice of coefficients, which enter the model linearly.
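The point of Remark 26 can be made concrete with a short MATLAB sketch. It is an illustration only: the cell array `basis` is a hypothetical way of storing the basis functions $f_0, f_1, f_2$ and does not come from the notes. Each column of the design matrix is one basis function evaluated at the data, so the unknown coefficients enter linearly even though the fitted curve is quadratic in $x$.

```matlab
% Data points from Example 25
x = [2; -1; 6; 4];
y = [1; 5; 2; -1];

% Basis functions 1, x, x^2 (hypothetical cell-array layout for illustration)
basis = {@(x) ones(size(x)), @(x) x, @(x) x.^2};

% Column j of A holds the j-th basis function evaluated at the data
A = zeros(numel(x), numel(basis));
for j = 1:numel(basis)
    A(:, j) = basis{j}(x);
end

a = A \ y;               % least squares coefficients [a0; a1; a2]
p = polyfit(x, y, 2);    % same fit; coefficients in decreasing powers of x

fprintf('a0 = %.4f, a1 = %.4f, a2 = %.4f\n', a(1), a(2), a(3));
```

Replacing the entries of `basis` with other given functions changes the fitted family without changing the solution step.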

Example 27. Find the least squares function of the form $x(t) = a_0 e^{a_1 t}$, $t > 0$, $a_0 > 0$, for the data points

\[ (t_1, x_1), (t_2, x_2), \ldots, (t_n, x_n), \quad x_1, x_2, \ldots, x_n > 0. \]

Let $y(t) = \ln x(t) = \ln a_0 + a_1 t$, and

\[
\begin{aligned}
b_0 &= \ln a_0,\\
b_1 &= a_1,\\
y_1 &= \ln x_1, \; \ldots, \; y_n = \ln x_n.
\end{aligned}
\]

Then we first solve the following linear least squares problem: find the least squares line of the form

\[ y(t) = b_0 + b_1 t, \quad t > 0, \]

and obtain the least squares solution $(\hat{b}_0, \hat{b}_1)$. The best-fit curve for the original problem is then given by

\[ x(t) = \hat{a}_0 e^{\hat{a}_1 t}, \quad \text{where } \hat{a}_0 = e^{\hat{b}_0}, \; \hat{a}_1 = \hat{b}_1. \]

Note that this procedure minimizes the squared error in $\ln x$ rather than in $x$, so the result can differ slightly from a direct non-linear fit.

Non-Linear Least Squares Problems

Let $\theta = (\theta_1, \ldots, \theta_m)$ be a multidimensional parameter. Consider a family of scalar-valued curves $y = f(x, \theta)$ that depend on the parameter $\theta$.

Non-linear least squares fitting:

Given data points

\[ (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n), \]

find a parameter value $\hat{\theta}$ such that the curve $y = f(x, \hat{\theta})$ minimizes the sum of squared errors (SSE):

\[ \mathrm{SSE}(\theta) = \sum_{i=1}^{n} \left( y_i - f(x_i, \theta) \right)^2. \tag{14} \]

NB: In general, linear least squares cannot be applied directly to this kind of problem, because $f$ may depend non-linearly on $\theta$.
$\mathrm{SSE}(\theta)$ is treated as a smooth function of $\theta$. To find its minimum in $\mathbb{R}^m$, we use calculus to find a critical point:

\[ \frac{\partial\, \mathrm{SSE}(\theta)}{\partial \theta_j} = 0, \quad j = 1, \ldots, m. \tag{15} \]


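Using the chain rule of differentiation, and dropping the constant factor 2, (15) can be written as

\[ \sum_{i=1}^{n} \left[ y_i - f(x_i, \theta) \right] \left( -\frac{\partial f(x_i, \theta)}{\partial \theta_j} \right) = 0, \quad j = 1, \ldots, m. \tag{16} \]

This system may still be non-linear in $\theta$ through the dependence on $f(x_i, \theta)$, so numerical schemes are used to find approximate solutions.
Starting from an initial guess $\theta^{(0)}$ of $\theta$, we define a sequence of approximations $\{\theta^{(k)}\}$ inductively, in the spirit of Newton's method.
Expanding $f(x_i, \theta)$ by its first-order Taylor polynomial at $\theta^{(k)}$ gives

\[ f(x_i, \theta) \approx f\left(x_i, \theta^{(k)}\right) + \sum_{s=1}^{m} \frac{\partial f\left(x_i, \theta^{(k)}\right)}{\partial \theta_s} \left( \theta_s - \theta_s^{(k)} \right). \]

• Let $J = (J_{ij})$, with $J_{ij} = \dfrac{\partial f\left(x_i, \theta^{(k)}\right)}{\partial \theta_j}$, be the $n \times m$ Jacobian matrix at $\theta^{(k)}$. Note that $J_{ij}$ depends on $k$; for simplicity of notation, we suppress this dependence.
Equation (16) can then be approximated by

\[ \sum_{i=1}^{n} \left[ y_i - f\left(x_i, \theta^{(k)}\right) - \sum_{s=1}^{m} J_{is} \left( \theta_s - \theta_s^{(k)} \right) \right] \left( -J_{ij} \right) = 0, \quad j = 1, \ldots, m. \tag{17} \]

• Let

\[
\begin{aligned}
\Delta y_i &= y_i - f\left(x_i, \theta^{(k)}\right),\\
\Delta \theta_j^{(k+1)} &= \theta_j^{(k+1)} - \theta_j^{(k)}.
\end{aligned}
\]

Equation (17), with $\theta = \theta^{(k+1)}$, can be written as

\[ \sum_{i=1}^{n} \sum_{s=1}^{m} J_{is} J_{ij} \, \Delta\theta_s^{(k+1)} = \sum_{i=1}^{n} J_{ij} \, \Delta y_i, \quad j = 1, \ldots, m, \tag{18} \]

from which we can solve for $\Delta\theta^{(k+1)}$ and define

\[ \theta^{(k+1)} = \theta^{(k)} + \Delta\theta^{(k+1)}, \quad \text{for } k = 0, 1, 2, \ldots. \]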


In matrix notation, (18) can be written as

\[ J^T J \, \Delta\theta = J^T \Delta y. \tag{19} \]

Equation (19) is called the normal system.

• The following iteration scheme for the non-linear least squares problem is called the Gauss-Newton method:

1. Choose the initial value $\theta^{(0)}$.
2. Solve the normal equation (19) for $\Delta\theta^{(1)}$.
3. Update $\theta$ by $\theta^{(1)} = \theta^{(0)} + \Delta\theta^{(1)}$.
4. Repeat the iteration until convergence is achieved, i.e., until the difference $\theta^{(k+1)} - \theta^{(k)}$ is below a prescribed error tolerance.

There are many other methods for non-linear least squares problems that aim at improved efficiency and rates of convergence.

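• If $f(x, \theta)$ is a linear function of $\theta$, for example

\[ f(x, \theta) = \alpha(x) \cdot \theta, \]

where $\alpha(x) = (\alpha_1(x), \ldots, \alpha_m(x))$, we expect the Gauss-Newton method to reduce to the linear least squares method. Indeed,

\[ f(x_i, \theta) = \alpha(x_i) \cdot \theta = \sum_{j=1}^{m} \alpha_j(x_i)\, \theta_j, \quad i = 1, \ldots, n, \]

and

\[ \mathrm{SSE}(\theta) = \sum_{i=1}^{n} \left( y_i - \alpha(x_i) \cdot \theta \right)^2. \]

Therefore

\[
\begin{aligned}
\frac{\partial\, \mathrm{SSE}}{\partial \theta_j}
&= 2 \sum_{i=1}^{n} \left( y_i - \alpha(x_i) \cdot \theta \right) \left( -\alpha_j(x_i) \right) \\
&= -2 \sum_{i=1}^{n} \alpha_j(x_i) \left( y_i - \alpha(x_i) \cdot \theta \right), \quad j = 1, \ldots, m.
\end{aligned}
\tag{20}
\]

• Let $A = \left( \alpha_j(x_i) \right)$ be the $n \times m$ matrix with entries $A_{ij} = \alpha_j(x_i)$. Setting (20) equal to zero, the system can be written in matrix form as

\[ A^T y - A^T A \theta = 0, \]

which gives the normal equation

\[ A^T A \theta = A^T y \]

of the linear least squares problem.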

Example 28. Given the data

\[ (1, 4.6), (2, 8.82), (3, 16), (4, 31.3), (5, 58.5), \]

find the best-fit curve $x = a_0 e^{a_1 t}$.

Solution one (using linear least squares)

Using the transformation $y = \ln x$, $b_0 = \ln a_0$, $b_1 = a_1$, we obtain

\[ y = b_0 + b_1 t, \]

and the new data set is

\[ (1, 1.526), (2, 2.177), (3, 2.773), (4, 3.444), (5, 4.069). \]

Using the linear least squares method, let

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \end{bmatrix}, \quad
b = \begin{bmatrix} 1.526 \\ 2.177 \\ 2.773 \\ 3.444 \\ 4.069 \end{bmatrix}, \quad
y = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}.
\]

The normal equation $A^T A y = A^T b$ becomes

\[
\begin{bmatrix} 5 & 15 \\ 15 & 55 \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
=
\begin{bmatrix} 13.989 \\ 48.32 \end{bmatrix},
\quad \text{so} \quad
\begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
=
\begin{bmatrix} 0.892 \\ 0.635 \end{bmatrix}.
\]

Therefore

\[ a_0 = e^{b_0} = e^{0.892} \approx 2.44, \qquad a_1 = b_1 = 0.635. \]

Hence the least squares curve is

\[ x = 2.44\, e^{0.635 t}. \]
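Solution one can be reproduced with a few lines of MATLAB. This is a sketch for checking the arithmetic, not code from the original notes; the variable names are chosen here for illustration.

```matlab
% Data from Example 28
t = [1; 2; 3; 4; 5];
x = [4.6; 8.82; 16; 31.3; 58.5];

% Linearize the model: ln(x) = b0 + b1*t
y = log(x);
A = [ones(size(t)), t];

% Solve the linear least squares problem for [b0; b1]
b = A \ y;

% Map back to the parameters of x = a0*exp(a1*t)
a0 = exp(b(1));
a1 = b(2);
fprintf('a0 = %.3f, a1 = %.3f\n', a0, a1);
```

The result should match the hand computation, $a_0 \approx 2.44$ and $a_1 \approx 0.635$, up to rounding.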

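Solution two (using the Gauss-Newton method)

To solve the non-linear least squares problem, consider the non-linear function

\[ f(t, a) = a_0 e^{a_1 t}, \quad a = (a_0, a_1), \quad t = (1, 2, \ldots, 5)^T. \]

The Jacobian matrix is

\[
J(t, a) = \begin{bmatrix}
e^{a_1} & a_0 e^{a_1} \\
e^{2a_1} & 2a_0 e^{2a_1} \\
e^{3a_1} & 3a_0 e^{3a_1} \\
e^{4a_1} & 4a_0 e^{4a_1} \\
e^{5a_1} & 5a_0 e^{5a_1}
\end{bmatrix},
\]

and the normal equation $J^T J \Delta a = J^T \Delta y$ becomes

\[
\begin{bmatrix}
e^{2a_1} + e^{4a_1} + e^{6a_1} + e^{8a_1} + e^{10a_1} &
a_0 \left( e^{2a_1} + 2e^{4a_1} + 3e^{6a_1} + 4e^{8a_1} + 5e^{10a_1} \right) \\
a_0 \left( e^{2a_1} + 2e^{4a_1} + 3e^{6a_1} + 4e^{8a_1} + 5e^{10a_1} \right) &
a_0^2 \left( e^{2a_1} + 4e^{4a_1} + 9e^{6a_1} + 16e^{8a_1} + 25e^{10a_1} \right)
\end{bmatrix}
\begin{bmatrix} \Delta a_0 \\ \Delta a_1 \end{bmatrix}
=
\begin{bmatrix}
e^{a_1} & e^{2a_1} & e^{3a_1} & e^{4a_1} & e^{5a_1} \\
a_0 e^{a_1} & 2a_0 e^{2a_1} & 3a_0 e^{3a_1} & 4a_0 e^{4a_1} & 5a_0 e^{5a_1}
\end{bmatrix}
\begin{bmatrix} \Delta y_1 \\ \Delta y_2 \\ \Delta y_3 \\ \Delta y_4 \\ \Delta y_5 \end{bmatrix}.
\]

• Here

\[
\begin{bmatrix} \Delta a_0 \\ \Delta a_1 \end{bmatrix}
= \begin{bmatrix} a_0 - a_0^{(k)} \\ a_1 - a_1^{(k)} \end{bmatrix}
\quad \text{and} \quad
\Delta y_i = x_i - a_0^{(k)} e^{a_1^{(k)} t_i}, \quad 1 \le i \le 5,
\]

where $x_i$ are the data values and the matrices are evaluated at $a = a^{(k)}$.

• Choosing the initial vector

\[ \begin{bmatrix} a_0^{(0)} \\ a_1^{(0)} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

and using the iteration scheme

\[
\begin{bmatrix} a_0^{(k+1)} \\ a_1^{(k+1)} \end{bmatrix}
= \begin{bmatrix} a_0^{(k)} \\ a_1^{(k)} \end{bmatrix}
+ \left. \left( J^T J \right)^{-1} J^T \right|_{\left(a_0^{(k)},\, a_1^{(k)}\right)}
\begin{bmatrix}
x_1 - a_0^{(k)} e^{a_1^{(k)} t_1} \\
x_2 - a_0^{(k)} e^{a_1^{(k)} t_2} \\
x_3 - a_0^{(k)} e^{a_1^{(k)} t_3} \\
x_4 - a_0^{(k)} e^{a_1^{(k)} t_4} \\
x_5 - a_0^{(k)} e^{a_1^{(k)} t_5}
\end{bmatrix},
\]

we obtain the following iterates.

For $k = 0$:

\[
\begin{bmatrix} a_0^{(1)} \\ a_1^{(1)} \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \end{bmatrix}
+ \begin{bmatrix} 25472.8 & 123383 \\ 123383 & 602214 \end{bmatrix}^{-1}
\begin{bmatrix} 2.718 & 7.389 & 20.086 & 54.598 & 148.413 \\ 2.718 & 14.778 & 60.257 & 218.393 & 742.066 \end{bmatrix}
\begin{bmatrix} 1.882 \\ 1.431 \\ -4.086 \\ -23.298 \\ -89.913 \end{bmatrix}
= \begin{bmatrix} 1.386 \\ 0.801 \end{bmatrix}.
\]

For $k = 1$:

\[
\begin{bmatrix} a_0^{(2)} \\ a_1^{(2)} \end{bmatrix}
= \begin{bmatrix} 1.386 \\ 0.801 \end{bmatrix}
+ \begin{bmatrix} 3777.53 & 24873.7 \\ 24873.7 & 166018 \end{bmatrix}^{-1}
\begin{bmatrix} 2.228 & 4.965 & 11.064 & 24.653 & 54.934 \\ 3.089 & 13.768 & 46.017 & 136.716 & 380.807 \end{bmatrix}
\begin{bmatrix} 1.511 \\ 1.936 \\ 0.661 \\ -2.879 \\ -17.66 \end{bmatrix}
= \begin{bmatrix} 2.104 \\ 0.651 \end{bmatrix}.
\]

For $k = 2$:

\[ \begin{bmatrix} a_0^{(3)} \\ a_1^{(3)} \end{bmatrix} = \begin{bmatrix} 2.428 \\ 0.635 \end{bmatrix}. \]

For $k = 3$:

\[ \begin{bmatrix} a_0^{(4)} \\ a_1^{(4)} \end{bmatrix} = \begin{bmatrix} 2.431 \\ 0.636 \end{bmatrix}. \]

For $k = 4$:

\[ \begin{bmatrix} a_0^{(5)} \\ a_1^{(5)} \end{bmatrix} = \begin{bmatrix} 2.431 \\ 0.636 \end{bmatrix}. \]

Working to three decimal places, the iterates no longer change after the fourth iteration, so the desired accuracy is achieved at the fourth iteration and

\[ a_0 \approx 2.431, \quad a_1 \approx 0.636. \]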

[Figure: plot for Example 28, with the horizontal axis running from 1 to 5 and the vertical axis from 10 to 70.]
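The Gauss-Newton iteration of Solution two can be sketched in MATLAB as follows. This is an illustrative implementation under stated assumptions, not the notes' own code: the stopping tolerance `tol` and the iteration cap `maxIter` are values chosen here, and no damping or line search is used, so convergence from other starting points is not guaranteed.

```matlab
% Data from Example 28
t = [1; 2; 3; 4; 5];
x = [4.6; 8.82; 16; 31.3; 58.5];

a = [1; 1];        % initial guess [a0; a1], as in the worked example
tol = 1e-6;        % stopping tolerance on the step size (assumed value)
maxIter = 50;      % safety cap on the number of iterations (assumed value)

for k = 1:maxIter
    f  = a(1) * exp(a(2) * t);                        % model values at current a
    dy = x - f;                                       % residuals (Delta y)
    J  = [exp(a(2) * t), a(1) * t .* exp(a(2) * t)];  % Jacobian [df/da0, df/da1]
    da = (J' * J) \ (J' * dy);                        % solve the normal system (19)
    a  = a + da;                                      % Gauss-Newton update
    if norm(da) < tol
        break
    end
end
fprintf('a0 = %.3f, a1 = %.3f after %d iterations\n', a(1), a(2), k);
```

Printing `a` inside the loop should reproduce the iterates listed above, up to rounding.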

Fitting data to models and parameter estimation

Data is often used to validate mathematical models and to increase confidence in the model that has been developed. Suppose we consider a model described by an initial value problem for a system of differential equations,

\[ \frac{dx}{dt} = f(x, \theta), \quad x \in \mathbb{R}^d, \; t \in [0, t_{\max}], \tag{21} \]
\[ x(0) = x_0. \tag{22} \]

Here $\theta \in \mathbb{R}^m$ is an $m$-dimensional parameter and $[0, t_{\max}]$ is the finite time interval on which the model is considered. The data are often given at discrete observation times $t_1, t_2, \ldots, t_p \in [0, t_{\max}]$ in the form

\[ \left( t_1, g\left(x^{(1)}\right) \right), \left( t_2, g\left(x^{(2)}\right) \right), \ldots, \left( t_p, g\left(x^{(p)}\right) \right), \quad x^{(i)} \in \mathbb{R}^d. \]

The function $g : \mathbb{R}^d \to \mathbb{R}^n$ represents measurable quantities of the state variable $x$, also called the observables. To fit the data, it is natural to consider the values of the solution $x(t, \theta)$ only at these observation times:

\[ x(t, \theta) \approx \left( x(t_1, \theta), x(t_2, \theta), \ldots, x(t_p, \theta) \right)^T. \]

From the fundamental theory of differential equations we know that, if the vector field $f(x, \theta)$ is a smooth function (having continuous partial derivatives) with respect to $(x, \theta)$, then the solution $x(t, \theta)$ depends on $\theta$ with the same order of smoothness as $f$. In the above notation, the dependence of the solution on the initial condition $x_0$ is understood and suppressed. We keep $x_0$ fixed and discuss fitting the parameter $\theta$. In many applications, the initial conditions are not known and also need to be fitted. Since the solution $x(t, x_0, \theta)$ is a diffeomorphism² with respect to the initial condition $x_0$, we can consider $x_0$ as part of the parameter $\theta$. We denote the data points by

\[ y = \left( g\left(x^{(1)}\right), g\left(x^{(2)}\right), \ldots, g\left(x^{(p)}\right) \right), \quad x^{(i)} \in \mathbb{R}^d. \]

The sum of squared errors (SSE) between the solution and the data is measured by

\[ \mathrm{SSE}(\theta) = d\left( g(x(t, \theta)), y \right)^2 = \sum_{i=1}^{p} \left\| g(x(t_i, \theta)) - g\left(x^{(i)}\right) \right\|^2. \tag{23} \]

Remark 29. It is important to measure the differences in quantities of the same type. Since the data are given as the observable quantities $g(x)$, we need to compare the observable part of the model solution, $g(x(t, \theta))$, with the data. The expression $\| g(x) - g(y) \|$ is the Euclidean norm of the $n$-dimensional vector $g(x) - g(y)$; in its discrete form it is written as
$\| g(x) - g(y) \|^2 = \sum_{i=1}^{n} | g_i(x) - g_i(y) |^2$.

The main idea in least squares fitting is to obtain the value $\hat{\theta}$ of the model parameter $\theta$ at which $\mathrm{SSE}(\theta)$ is a minimum. This is clearly a non-linear least squares problem, since the solution $x(t, \theta)$ depends on the parameter $\theta$ through a system of differential equations and this dependence is, in general, non-linear. We can apply the Gauss-Newton method to the least squares parameter fitting problem. The differential equations can be discretized to give a system of difference equations for the solution. The normal equation for the difference equations can then be derived and solved by Gauss-Newton iteration.

² A diffeomorphism is a one-to-one continuously differentiable mapping $f : M \to N$ of a differentiable manifold $M$ (e.g. a domain in a Euclidean space) into a differentiable manifold $N$ for which the inverse mapping is also continuously differentiable.
This procedure is computationally expensive when (21) is of high dimension, hence it is recommended to use optimized functions in MATLAB, namely:

1. lsqcurvefit (Optimization Toolbox): This function requires the following inputs: the model equation, the initial guess for the parameters to be fitted, the time points and the data points. It then solves the non-linear least squares problem directly.

2. nlinfit (Statistics and Machine Learning Toolbox): This is a non-linear regression routine that uses iterative least squares estimation, starting from an initial value for the parameters.

3. fminsearch: This function needs the sum of squared errors $\mathrm{SSE}(\theta)$ between the model output and the data. With an initial guess $\theta_0$, the model can be solved numerically to produce a value of $\mathrm{SSE}(\theta_0)$. fminsearch takes the least squares error function and the initial guess of the parameter value, and uses a direct search routine to find a minimum of the least squares error. To reduce the risk that the value returned by fminsearch is only a local minimum, the process is repeated with several choices of the initial guess.
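As a small illustration of the third option, the sketch below applies fminsearch to the exponential model of Example 28 rather than to an ODE model. The anonymous function `sse` and the starting guess taken from the log-linear fit are choices made here for illustration; only base MATLAB is assumed.

```matlab
% Data from Example 28
t = [1; 2; 3; 4; 5];
x = [4.6; 8.82; 16; 31.3; 58.5];

% Sum of squared errors for the model x = a(1)*exp(a(2)*t)
sse = @(a) sum((x - a(1) * exp(a(2) * t)).^2);

a_start = [2.44; 0.635];                 % starting guess from the log-linear fit
[a_fit, sse_fit] = fminsearch(sse, a_start);

fprintf('a0 = %.3f, a1 = %.3f, SSE = %.4f\n', a_fit(1), a_fit(2), sse_fit);
```

Rerunning with a few different values of `a_start` and comparing `sse_fit` is a simple way to apply the advice about local minima.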

polyfit function

MATLAB calls curve fitting with a polynomial 'polynomial regression'. The function polyfit(x,y,m) returns a vector of $(m + 1)$ coefficients $a_i$ that represent the best-fit polynomial of degree $m$ for the set of data points $(x_i, y_i)$. The coefficients are ordered by decreasing powers of $x$; that is,

\[ y_c = a_1 x^m + a_2 x^{m-1} + a_3 x^{m-2} + \cdots + a_m x + a_{m+1}. \]

To obtain $y_c$ at $(x_1, x_2, \ldots, x_n)$, we use the MATLAB function polyval(a,x), which returns a vector of length $n$ giving $y_{c,i}$, where

\[ y_{c,i} = a_1 x_i^m + a_2 x_i^{m-1} + a_3 x_i^{m-2} + \cdots + a_m x_i + a_{m+1}. \]

As discussed in the previous section, the precision of the fit is measured by the mean squared error (MSE); in the listing below it is stored in a variable named MSE. It is defined as

\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - y_{c,i} \right)^2, \tag{24} \]

where $n$ is the number of data points.

Example 30. In this example we obtain the best-fit polynomials for the approximating function. The listing fits degrees 2 through 7 and tabulates the MSE for degrees 2, 3, 4 and 5. The data are given as x = −10 : 2 : 10, with a finer grid x2 = −10 : 0.5 : 10 used for plotting, and
y = (−980, −620, −70, 80, 100, 90, 0, −80, −90, 10, 220).

Listing 24: Using the polyval and polyfit functions

    clear; clc;
    x  = -10:2:10;                                    % data abscissae
    y  = [-980 -620 -70 80 100 90 0 -80 -90 10 220];  % data ordinates
    x2 = -10:0.5:10;                                  % finer grid for plotting
    MSE = zeros(1,7);                                 % MSE(n): error of the degree-n fit
    for n = 2:7
        fprintf('n = %i \n',n);
        coef = polyfit(x,y,n);                        % coefficients, decreasing powers
        yc2  = polyval(coef,x2);                      % fit on the fine grid
        yc   = polyval(coef,x);                       % fit at the data points
        MSE(n) = sum((y-yc).^2)/length(x);            % mean squared error, equation (24)
        fprintf('    x      y       yc \n');
        fprintf('----------------------------\n');
        for i = 1:length(x)
            fprintf('%5.1f %5.1f %8.3f \n',x(i),y(i),yc(i));
        end
        fprintf('\n\n');
        subplot(2,3,n-1), plot(x2,yc2,'-r',x,y,'->k',x,yc,'o'),
        legend('yc2','y','yc','Location','best')
        xlabel('x'), ylabel('y'), grid, axis([-10 10 -1500 500]);
        title(sprintf('Degree %d polynomial fit',n));
    end
    fprintf(' n      MSE \n')
    fprintf('--------------------\n');
    for n = 2:5
        fprintf(' %d %8.2f \n',n,MSE(n))
    end

The listing produces one plot of the fit for each degree.

From the values obtained for the MSE, the MSE decreases as the order of the fitted polynomial is increased.

Cubic splines

Given a set of $n$ data points, suppose that an $m$th-degree polynomial is selected as the approximating curve and that this curve produces values that are not allowed. For example, suppose it is known that a particular property represented by the approximating curve (such as absolute pressure or absolute temperature) must be positive, and the approximating function produces values that are negative. In this case, the approximating function is not satisfactory. The method of cubic splines is intended to avoid this problem. Given a set of $(n + 1)$ data points $(x_i, y_i)$, $i = 1, 2, \ldots, (n + 1)$, the method of cubic splines develops a set of $n$ cubic functions such that $y(x)$ is represented by a different cubic in each of the $n$ intervals and the set of cubics passes through the $(n + 1)$ data points.


This is accomplished by forcing the slopes and the curvatures to be the same for each pair of cubics that join at a data point.

Remark 31. The curvature $K$ is given by

\[ K = \pm \frac{\dfrac{d^2 y}{dx^2}}{\left[ 1 + \left( \dfrac{dy}{dx} \right)^2 \right]^{3/2}}. \]

Consider two adjacent intervals, $i - 1$ and $i$, in a cubic spline curve-fitting scheme.

This type of fitting can be accomplished by using the following equations:

\[
\begin{aligned}
\left[ y(x_i) \right]_{\text{int } i-1} &= \left[ y(x_i) \right]_{\text{int } i},\\
\left[ y'(x_i) \right]_{\text{int } i-1} &= \left[ y'(x_i) \right]_{\text{int } i},\\
\left[ y''(x_i) \right]_{\text{int } i-1} &= \left[ y''(x_i) \right]_{\text{int } i}.
\end{aligned}
\]

In the interval $(i - 1)$, $(x_{i-1} \le x \le x_i)$,

\[ y(x) = A_{i-1} + B_{i-1}(x - x_{i-1}) + C_{i-1}(x - x_{i-1})^2 + D_{i-1}(x - x_{i-1})^3. \]

In the interval $i$, $(x_i \le x \le x_{i+1})$,

\[ y(x) = A_i + B_i(x - x_i) + C_i(x - x_i)^2 + D_i(x - x_i)^3. \]

From the above equations, we have fewer equations than unknowns, hence additional assumptions are needed. The values of $\dfrac{d^2 y}{dx^2}$ at $x_1$ and at $x_{n+1}$ must be assumed. The following alternatives exist:


1. Assume that $y''(x_1) = y''(x_{n+1}) = 0$.
This forces the splines to approach straight lines at the end points.

2. Assume that $y''(x_{n+1}) = y''(x_n)$ and $y''(x_1) = y''(x_2)$.
This forces the splines to approach parabolas at the end points.

In MATLAB, the cubic spline function is spline, with the following syntax:

yy = spline(xi,yi,xx),

where (xi, yi) is the given set of data points and yy is the value of y at xx. The spline function determines the four cubic coefficients for each section of the given data and evaluates yy by the cubic spline method.
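The four coefficients per interval can be inspected directly. The following sketch uses hypothetical sample data (not from the notes) together with the base MATLAB functions `spline`, `unmkpp` and `ppval`; it evaluates one interval by hand and compares the result with `ppval`.

```matlab
% Hypothetical sample data for illustration
xi = [0 1 2 3 4];
yi = [1 3 2 5 4];

pp = spline(xi, yi);               % piecewise-polynomial form of the spline
[breaks, coefs] = unmkpp(pp);      % breakpoints and per-interval coefficients

% Row k of coefs holds [D C B A] for interval k, i.e.
% y = D*(x-xk)^3 + C*(x-xk)^2 + B*(x-xk) + A on breaks(k) <= x <= breaks(k+1)
disp(coefs);

% Evaluate interval 2 by hand at x = 1.5 and compare with ppval
dx = 1.5 - breaks(2);
y_manual = polyval(coefs(2, :), dx);
y_ppval  = ppval(pp, 1.5);
```

Note that MATLAB stores the coefficients from the cubic term down to the constant, which is the reverse of the order $A_i, B_i, C_i, D_i$ used in the equations above.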

Remark 32. Using the function interp1 in MATLAB gives the same results as spline when the 'spline' method is specified for interpolation. The syntax for interpolating by the spline method is

yi = interp1(x,y,xi,'spline')

Example 33. Consider the data given by distance = [0.52:0.3:4.12] and pressure = [165.5, 96.5, 69.0, 52.4, 37.2, 27.6, 21.4, 17.2, 13.8, 11.7, 10.3, 9.0, 7.2]. We apply the two methods, spline and interp1, to this set of data points and compare the results.

Listing 25: Using the spline and interp1 functions

    clear; clc;
    dist  = 0.52:0.3:4.12;
    press = [165.5 96.5 69.0 52.4 37.2 27.6 21.4 17.2 13.8 11.7 ...
             10.3 9.0 7.2];
    d  = 0.52:0.1:5;                       % evaluation grid; points beyond 4.12 are extrapolated
    p1 = spline(dist,press,d);             % cubic spline evaluated on d
    p2 = interp1(dist,press,d,'spline');   % same interpolation via interp1
    plot(d,p1,'k<-',d,p2,'-*r'), xlabel('km from ground zero'),
    ylabel('overpressure (kPa)'), grid,
    title('peak over-pressure vs. distance from blast')
    legend('Spline','interp1','Location','best')


Example 34. Consider the following data for the infection of flu in a school.

Day                              3   4    5    6    7    8    9   10  11  12  13  14
Number of infected individuals  25  75  227  296  258  236  192  126  71  28  11   7

Let us use the function fminsearch to fit these data to solutions of the mathematical model for flu given by the system of ODEs

\[
\begin{aligned}
\frac{dS}{dt} &= -\beta S I,\\
\frac{dI}{dt} &= \beta S I - \alpha I.
\end{aligned}
\]

It is possible to fit the parameters $\alpha$ and $\beta$ as well as the initial conditions. We can pre-estimate $\alpha$ from the duration of infectiousness and take the two initial conditions from the data above: $I(3) = 25$ and $S(3) = 738$. Since the duration of infectiousness is 2-4 days, we may take $\alpha = 0.3$ as a starting value, together with $\beta = 0.0025$. We use MATLAB to fit the data to the solutions of the model.

Listing 26: Using fminsearch to fit the parameters of the ODE model

    function ODE_model_fitting
        close all
        clc
        Filename = 'Data';                 % spreadsheet: column 1 = day, column 2 = cases
        fludata = xlsread(Filename);
        format long                        % display results with higher precision
        tdata = fludata(:,1);              % t-coordinates from the data
        qdata = fludata(:,2);              % y-coordinates, i.e. the number of infections

        tforward = 3:0.01:14;              % t mesh for the solution of the ODEs
        tmeasure = (1:100:1101)';          % indices of tforward at days 3, 4, ..., 14
        y0 = [738.0 25.0];                 % initial conditions [S(3) I(3)]

        k = [0.3 0.0025];                  % initial values [a b] of the parameters to be fitted

        % Solve the model with the initial parameter values
        [~, Y] = ode23s(@(t,y) model_1(t,y,k), tforward, y0);
        yint = Y(tmeasure,2);              % model prediction of I at the data times

        figure(1)
        subplot(1,2,1);
        plot(tdata,qdata,'r*');
        hold on
        plot(tdata,yint,'b-');
        xlabel('time in days');
        ylabel('Number of cases');
        title('Fitting before optimizing the parameters')
        axis([3 14 0 500]);
        grid on

        [k, fval] = fminsearch(@moder, k); % minimization routine; fval is the final SSE
        [~, Y] = ode23s(@(t,y) model_1(t,y,k), tforward, y0);
        yint = Y(tmeasure,2);              % model prediction with the fitted parameters
        disp(k)                            % display the fitted [a b]

        subplot(1,2,2)
        plot(tdata,qdata,'r*');
        hold on
        plot(tdata,yint,'b-');             % plotting final fit
        xlabel('time in days');
        ylabel('Number of cases');
        title('Fitting after optimizing the parameters')
        axis([3 14 0 500]);
        grid on

        function dy = model_1(~, y, k)     % the system of ODEs
            a = k(1);
            b = k(2);
            dy = zeros(2,1);
            dy(1) = -b * y(1) * y(2);
            dy(2) =  b * y(1) * y(2) - a * y(2);
        end

        function error_in_data = moder(k)  % computing the error in the data
            [~, Ysol] = ode23s(@(t,y) model_1(t,y,k), tforward, y0);
            q = Ysol(tmeasure,2);
            error_in_data = sum((q - qdata).^2);   % computes SSE
        end
    end


Monte Carlo Methods


Monte Carlo Methods are a class of computational algorithms that rely on random
sampling to obtain numerical solutions to complex problems. They were first intro-
duced during the Manhattan Project in the 1940s, where they were used to simulate
the behavior of neutrons in nuclear reactions. Since then, they have evolved into an
essential tool in scientific research and engineering, offering solutions to problems that
may be analytically intractable or computationally intensive.
The name ’Monte Carlo’ is derived from the famous Monte Carlo Casino in Monaco,
which is known for its games of chance involving random outcomes. The core idea be-
hind these methods is to approximate complex calculations by performing random
experiments or simulations and then analyzing the outcomes statistically to arrive at
an approximate solution.
The key concepts and general structure of Monte Carlo Methods are as follows

(i) Random Sampling: Monte Carlo methods rely on the generation of random num-
bers to simulate various scenarios. These random samples are drawn from prob-
ability distributions that represent the uncertainties or variables in the problem
being studied.

(ii) Integration: One of the primary applications of Monte Carlo methods is in approximating definite integrals of complex functions. Instead of using traditional calculus methods, Monte Carlo integration involves randomly sampling points within the integration domain and using the average of the function evaluations at those points to approximate the integral.

(iii) Importance Sampling: To improve the efficiency of Monte Carlo simulations, im-
portance sampling is often employed. This technique involves biased sampling
to focus on regions of the problem space that have a significant impact on the final
result, rather than uniformly sampling the entire space.

(iv) Markov Chain Monte Carlo (MCMC): MCMC methods are a specialized class
of Monte Carlo techniques that use Markov chains to generate correlated ran-
dom samples. These methods are particularly useful when dealing with high-
dimensional spaces and are commonly used in Bayesian statistics and machine
learning for parameter estimation and inference.

Some applications of Monte Carlo methods include, but are not limited to:

(i) Simulation: Monte Carlo simulations are extensively used to model and analyze
complex systems, such as financial markets, traffic flow, weather patterns, and
nuclear reactions. These simulations can help predict outcomes and assess risk in
real-world scenarios.

(ii) Optimization: In optimization problems, Monte Carlo methods can be employed


to search for the best possible solution within a large parameter space. They are
especially useful when the objective function is challenging to evaluate analyti-
cally or has many local optima.

(iii) Numerical Integration: As mentioned earlier, Monte Carlo integration can be


used to approximate integrals, especially in cases where traditional methods like
the trapezoidal or Simpson’s rule are not feasible.

(iv) Uncertainty Analysis: Monte Carlo methods are essential for quantifying uncer-
tainty in complex systems. By running multiple simulations with randomly var-
ied inputs, one can assess the uncertainty and sensitivity of the model’s outputs.

(v) Gaming and Gambling: The original inspiration for Monte Carlo methods came
from games of chance, and these methods continue to be used in gambling and
casino industries for statistical analysis and predicting outcomes.


Monte Carlo methods have proven to be versatile and powerful tools for tackling com-
plex problems that defy traditional analytical approaches. However, they require care-
ful consideration of sample sizes, convergence criteria, and the underlying probability
distributions to ensure accurate results and reliable conclusions. As computational
power continues to advance, Monte Carlo methods will likely remain a crucial compo-
nent in addressing real-world challenges and refining our understanding of complex
systems.

Random vectors and variables


Let $\mathbb{R}$ be the set of real numbers and $\mathbb{N} = \{1, 2, 3, \ldots\}$ be the set of natural numbers. For any $n \in \mathbb{N}$, we denote the set of real $n$-dimensional column vectors by $\mathbb{R}^n$. $\mathbf{X} \in \mathbb{R}^n$ is a real $n$-dimensional random vector with components $X_1, \ldots, X_n$. In Monte Carlo simulation, we often consider samples from the distribution of a random vector $\mathbf{X}$. We denote such samples by $\mathbf{X}_1, \mathbf{X}_2, \ldots$. The bold notation $\mathbf{X}_i$, which indicates the $i$th sample (a random vector), should not be confused with $X_i$, which indicates the $i$th component of $\mathbf{X}$. When necessary, we indicate the $j$th component of $\mathbf{X}_i \in \mathbb{R}^n$ by $(\mathbf{X}_i)_j$.

Probability, expectation, variance and covariance


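We denote the probability of an event $A$ by $P\{A\}$. For example, the probability that a random variable $X$ exceeds a threshold $x$ is written $P\{X > x\}$. We denote expectation by $E$. For a random variable $X \in \mathbb{R}$, the variance of $X$ is

\[ \operatorname{Var} X := E\left[ (X - EX)^2 \right] = E X^2 - (EX)^2. \]

The covariance between random variables $X$ and $Y$ is

\[ \operatorname{Cov}(X, Y) := E\left[ (X - EX)(Y - EY) \right], \]

so $\operatorname{Cov}(X, Y) = \operatorname{Cov}(Y, X)$ and $\operatorname{Cov}(X, X) = \operatorname{Var} X$. The correlation between $X$ and $Y$ is

\[ \rho := \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var} X \, \operatorname{Var} Y}}. \]

For a random vector $\mathbf{X} = (X_1, \ldots, X_n) \in \mathbb{R}^n$, we define the $n \times n$ covariance matrix as

\[
\operatorname{Cov} \mathbf{X} := E\left[ (\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T \right]
= \begin{bmatrix}
\operatorname{Var} X_1 & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_n) \\
\operatorname{Cov}(X_2, X_1) & \operatorname{Var} X_2 & \cdots & \operatorname{Cov}(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}(X_n, X_1) & \operatorname{Cov}(X_n, X_2) & \cdots & \operatorname{Var} X_n
\end{bmatrix}.
\]

Since $\operatorname{Cov}(X_j, X_i) = \operatorname{Cov}(X_i, X_j)$, $\operatorname{Cov} \mathbf{X}$ is symmetric for any $\mathbf{X}$. The covariance between random vectors $\mathbf{X} \in \mathbb{R}^n$ and $\mathbf{Y} \in \mathbb{R}^m$ is the $n \times m$ matrix

\[
\operatorname{Cov}(\mathbf{X}, \mathbf{Y}) := E\left[ (\mathbf{X} - E\mathbf{X})(\mathbf{Y} - E\mathbf{Y})^T \right]
= \begin{bmatrix}
\operatorname{Cov}(X_1, Y_1) & \operatorname{Cov}(X_1, Y_2) & \cdots & \operatorname{Cov}(X_1, Y_m) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}(X_n, Y_1) & \operatorname{Cov}(X_n, Y_2) & \cdots & \operatorname{Cov}(X_n, Y_m)
\end{bmatrix}.
\]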
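In simulation, the covariance matrix is usually estimated from samples. The sketch below uses a small hypothetical sample matrix `S` (invented here for illustration) and compares the sample analogue of $E[(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T]$ with MATLAB's built-in `cov`. It assumes a MATLAB release with implicit expansion (R2016b or later) for the row-wise subtraction.

```matlab
% Hypothetical samples for illustration: each row is one sample of (X1, X2, X3)
S = [1 2 0;
     3 1 4;
     5 9 2;
     2 6 5];
N = size(S, 1);

% Sample analogue of E[(X - EX)(X - EX)']: centre the rows, then average
D = S - mean(S, 1);              % subtract the sample mean of each component
C_manual = (D' * D) / (N - 1);   % unbiased normalisation, as used by cov

C_builtin = cov(S);              % MATLAB's sample covariance matrix
disp(C_builtin);
```

The two matrices should agree, and the result is symmetric, as noted above for $\operatorname{Cov} \mathbf{X}$.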

Definition of the problem

Consider the problem of estimating a parameter

\[ \theta := E h(\mathbf{X}). \]

Here $\mathbf{X}$ is a random vector in $\mathbb{R}^m$ and $h : \mathbb{R}^m \to \mathbb{R}$. The random variables $X_1, \ldots, X_m$ may be continuous, discrete, or mixed. They can represent static quantities, or they may be the values of some stochastic process over time. The dimension $m$ could be very large. The function $h$ may be non-linear, non-smooth or not even analytically representable. For example, $h$ could be the indicator function $I_A : \mathbb{R}^m \to \mathbb{R}$ for some complicated set $A \subset \mathbb{R}^m$, in which case $\theta = E I_A(\mathbf{X}) = P\{\mathbf{X} \in A\}$. We assume that $E|h(\mathbf{X})|$ and $\operatorname{Var} h(\mathbf{X})$ are finite.
In principle, the parameter $\theta$ could be computed analytically or by deterministic methods for numerical integration. For example, if $\mathbf{X}$ is continuous with probability density function (PDF) $f$ defined on all of $\mathbb{R}^m$, the expected value of $h(\mathbf{X})$ is the multiple integral

\[ E h(\mathbf{X}) = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} h(\mathbf{x}) f(\mathbf{x}) \, dx_1 \ldots dx_m. \]


Remark 35. In practice, this integral is usually too complex to evaluate analytically. The computational cost of deterministic methods for numerical integration typically increases exponentially with the dimension $m$ of $\mathbf{X}$. By contrast, Monte Carlo methods for computing $E h(\mathbf{X})$ converge at a rate that is independent of $m$ (the standard error decreases like $1/\sqrt{N}$ in the number of samples $N$), which makes them attractive tools for complex, high-dimensional systems.

Intuitive explanation: Monte Carlo Integration

Suppose that we want to evaluate the integral

\[ \int_0^1 e^{-x^3} \, dx. \]

The integrand $e^{-x^3}$ has no antiderivative in terms of elementary functions, so we use numerical techniques to evaluate the integral. In the approach of Riemann integration, we choose evenly spaced points $x_1, \ldots, x_K$ over the interval $[0, 1]$, obtain the corresponding function values $f(x_1), \ldots, f(x_K)$, and use

\[ \frac{1}{K} \sum_{i=1}^{K} f(x_i) \]

to approximate the integral. If the function is smooth, then as $K \to \infty$ this numerical value converges to the exact integral.
Suppose we rewrite the integral as

\[ \int_0^1 e^{-x^3} \cdot 1 \, dx = E\left( e^{-U^3} \right), \]

where $U$ is a uniform random variable on the interval $[0, 1]$. The integral is therefore the expected value of the random variable $e^{-U^3}$, which implies that evaluating the integral is the same as estimating this expected value. So we can generate independent, identically distributed (iid) random variables $U_1, \ldots, U_K \sim \mathrm{Uni}[0, 1]$, compute

\[ W_1 = e^{-U_1^3}, \; \ldots, \; W_K = e^{-U_K^3}, \]

and then use

\[ \bar{W}_K = \frac{1}{K} \sum_{i=1}^{K} W_i = \frac{1}{K} \sum_{i=1}^{K} e^{-U_i^3}, \]


as the numerical evaluation of $\int_0^1 e^{-x^3}\, dx$.
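
A minimal Python sketch of this uniform-sampling estimate follows. The function name `mc_uniform`, the seed and the sample sizes are illustrative assumptions; only the standard library is used.

```python
import math
import random


def mc_uniform(k, rng):
    """Average W_i = exp(-U_i^3) over k iid draws U_i ~ Uni[0, 1]."""
    total = 0.0
    for _ in range(k):
        u = rng.random()              # one uniform draw on [0, 1)
        total += math.exp(-u ** 3)    # W_i = exp(-U_i^3)
    return total / k


rng = random.Random(12345)  # fixed seed (arbitrary choice) so runs are reproducible
for k in (100, 10_000, 1_000_000):
    print(k, mc_uniform(k, rng))
```

Unlike the Riemann sum, each run with a different seed gives a slightly different answer, and the spread between runs narrows as `k` grows.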
Hence, by the Law of Large Numbers³,
\[
\bar{W}_K \xrightarrow{P} E(W_i) = E\left(e^{-U_i^3}\right) = \int_0^1 e^{-x^3}\, dx,
\]
so $\bar{W}_K$ is an alternative numerical method that is statistically consistent. In the
above example, the integral can be written as
\[
I = \int f(x)\, p(x)\, dx,
\]
where $f$ is some function and $p$ is a probability density function. Let $X$ be a random
variable with density $p$; then the integral above equals
\[
\int f(x)\, p(x)\, dx = E(f(X)) = I.
\]

The alternative numerical method for evaluating this integral is to generate $N$ iid data
points $X_1, \dots, X_N \sim p$ and then use the sample average
\[
\bar{I}_N = \frac{1}{N} \sum_{i=1}^{N} f(X_i).
\]

This method of evaluating integrals via simulating random points is called Monte
Carlo Simulation.

Remark 36. A crucial feature of Monte Carlo simulation is that its statistical theory is rooted
in the theory of the sample average, where we use the sample average as an estimator of the
expected value. The bias and the variance of the estimator are the key quantities in evaluating
the quality of an estimator.

³ The law of large numbers states that an observed sample average from a large sample will be
close to the true population average, and that it will get closer the larger the sample.

Now, since we are using the sample average as an estimator of the expected value, the estimator
is unbiased: $\operatorname{bias}(\bar{I}_N) = E(\bar{I}_N) - I = 0$. The variance is then

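\[
\begin{aligned}
\operatorname{Var}(\bar{I}_N) &= \frac{1}{N} \operatorname{Var}(f(X_1)) \\
&= \frac{1}{N} \Big( E\big(f^2(X_1)\big) - \underbrace{\big(E(f(X_1))\big)^2}_{I^2} \Big) \\
&= \frac{1}{N} \left( \int f^2(x)\, p(x)\, dx - I^2 \right).
\end{aligned}
\]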
Hence the variance contains two components: $\int f^2(x)\, p(x)\, dx$ and $I^2$.
The quantity $I$ is fixed, so what we need to choose is the number of random points $N$ and the
sampling distribution $p$. If we change the sampling distribution $p$, the function $f$ must
also change so that the product $f(x)\, p(x)$, and hence $I$, stays the same.
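
The relation $\operatorname{Var}(\bar{I}_N) = \operatorname{Var}(f(X_1))/N$ can be checked empirically. The sketch below repeats the uniform-sampling estimate many times and compares the spread of the estimates with the variance of a single term divided by $N$. The names `estimate`, `repeats` and the specific sizes are assumptions chosen for illustration.

```python
import math
import random
import statistics


def estimate(n, rng):
    """One Monte Carlo estimate of I = integral of exp(-x^3) on [0, 1] from n uniform draws."""
    return sum(math.exp(-rng.random() ** 3) for _ in range(n)) / n


rng = random.Random(2024)  # arbitrary fixed seed
n = 100          # random points per estimate
repeats = 2000   # number of independent estimates

# Empirical variance of the estimator across independent repetitions.
estimates = [estimate(n, rng) for _ in range(repeats)]
var_of_estimator = statistics.variance(estimates)

# Variance of a single term f(U) = exp(-U^3), estimated from one large sample.
single_terms = [math.exp(-rng.random() ** 3) for _ in range(200_000)]
var_single = statistics.variance(single_terms)

print("empirical Var(I_N):", var_of_estimator)
print("Var(f(X_1)) / N   :", var_single / n)
```

The two printed numbers should agree up to sampling noise.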
For instance, we have seen how to evaluate the integral $\int_0^1 e^{-x^3}\, dx$ using uniform
random variables. We can also generate $K$ iid points $B_1, \dots, B_K \sim \mathrm{Beta}(2, 2)$
from the beta distribution⁴. The PDF of $\mathrm{Beta}(2, 2)$ is
\[
p_{\mathrm{Beta}(2,2)}(x) = 6x(1 - x).
\]
We can then rewrite
\[
\int_0^1 e^{-x^3}\, dx
= \int_0^1 \underbrace{\frac{e^{-x^3}}{6x(1 - x)}}_{f(x)} \cdot \underbrace{6x(1 - x)}_{p(x)}\, dx
= E\left( \frac{e^{-B_1^3}}{6B_1(1 - B_1)} \right).
\]
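
The Beta(2, 2) version can be sketched the same way, using `random.betavariate` from the Python standard library. The function name `mc_beta22`, the seed and the sample size are illustrative assumptions. One caveat that appears to hold for this particular choice of $p$: the first variance component $\int f^2(x)\, p(x)\, dx = \int_0^1 \frac{e^{-2x^3}}{6x(1-x)}\, dx$ diverges at both endpoints, so the estimator is still unbiased and consistent but has infinite variance, and it tends to be noisier than the uniform version.

```python
import math
import random


def mc_beta22(k, rng):
    """Average f(B_i) = exp(-B_i^3) / (6 B_i (1 - B_i)) over k iid draws B_i ~ Beta(2, 2)."""
    total = 0.0
    for _ in range(k):
        b = rng.betavariate(2, 2)   # draw from Beta(2, 2); values exactly 0 or 1 are not expected in practice
        total += math.exp(-b ** 3) / (6 * b * (1 - b))
    return total / k


rng = random.Random(7)  # arbitrary fixed seed
print(mc_beta22(100_000, rng))
```

Comparing this output with the uniform-sampling estimate over several seeds gives a feel for how the choice of $p$ affects the spread of the estimator.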

Remark 37. It is important to note that different choices of $p$ lead to different variances of
the estimator. The expectation is always fixed at $I$, so the second part of the variance
remains the same, but the first part, $\int f^2(x)\, p(x)\, dx$, depends on the choice of $p$
and $f$. This raises the issue of how to choose the sampling procedure (importance sampling).
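⁴ The probability density function (PDF) of the beta distribution with parameters
$\alpha, \beta > 0$ is $\frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}$ for $0 \le x \le 1$,
where $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}$ and $\Gamma$ is the gamma function.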


Monte-Carlo framework
STEP I: Statistical properties of the Monte-Carlo estimators

Instead of computing expectation integrals analytically or by deterministic numerical
methods, Monte-Carlo methods generate independent, identically distributed random
samples $X_1, \dots, X_n$ from the distribution of $X$ and estimate $E\,h(X)$ by the
corresponding sample average,
\[
\bar{\theta}_n := \frac{1}{n} \sum_{i=1}^{n} h(X_i). \tag{25}
\]

Since $X_1, \dots, X_n$ have the same distribution as $X$, the sample average $\bar{\theta}_n$
is an unbiased estimator of $\theta$:
\[
E\,\bar{\theta}_n = \frac{1}{n} \sum_{i=1}^{n} E\,h(X_i) = \frac{n\, E\,h(X)}{n} = \theta. \tag{26}
\]

We assume that $E|h(X)| < \infty$, so the strong Law of Large Numbers implies that
$\bar{\theta}_n$ is a consistent estimator of $\theta$, that is,
\[
\bar{\theta}_n \to \theta \quad \text{almost surely as } n \to \infty.
\]

We also assume that
\[
\sigma^2 := \operatorname{Var} h(X) < \infty,
\]
so $h(X_1), h(X_2), \dots$ is a sequence of iid random variables with finite mean $\theta$ and
finite variance $\sigma^2$. Therefore the Central Limit Theorem⁵ gives the asymptotic
distribution of $\bar{\theta}_n$:
\[
\frac{\sqrt{n}\,(\bar{\theta}_n - \theta)}{\sigma} \Rightarrow N(0, 1) \quad \text{as } n \to \infty. \tag{27}
\]
Here $\Rightarrow$ denotes convergence in distribution and $N(0, 1)$ is a standard normal
random variable.

STEP II: Selecting Sample Size

When selecting the sample size, it is important to choose it so that our Monte-Carlo
estimator is as accurate as we require. Let us fix $\alpha \in (0, 1)$, so that $1 - \alpha$ is
the confidence level⁶, and an error tolerance $\epsilon > 0$. Our aim here is to find $n$ such that
\[
P\left( |\bar{\theta}_n - \theta| \le \epsilon \right) = 1 - \alpha.
\]

⁵ Let $\{X_1, \dots, X_n\}$ be a sequence of iid random variables with expected value $\mu$ and
finite variance $\sigma^2$, and consider the sample average $\bar{X} = \frac{X_1 + \dots + X_n}{n}$.
By the law of large numbers, the sample averages converge almost surely to the expected value
$\mu$ as $n \to \infty$; the Central Limit Theorem adds that $\sqrt{n}\,(\bar{X} - \mu)/\sigma$
converges in distribution to $N(0, 1)$.
⁶ Typical values of $\alpha$ are 0.05 and 0.01.

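Using (27) gives us an approximate⁷ answer, because $\frac{\sqrt{n}\,(\bar{\theta}_n - \theta)}{\sigma}$
is asymptotically distributed as a standard normal random variable. For large $n$ we have
\[
\begin{aligned}
P\left( |\bar{\theta}_n - \theta| \le \epsilon \right)
&= P\left( \frac{\sqrt{n}\,|\bar{\theta}_n - \theta|}{\sigma} \le \frac{\sqrt{n}\,\epsilon}{\sigma} \right) \\
&\approx P\left( |N(0, 1)| \le \frac{\sqrt{n}\,\epsilon}{\sigma} \right) \\
&= P\left( N(0, 1) \le \frac{\sqrt{n}\,\epsilon}{\sigma} \right) - P\left( N(0, 1) \le -\frac{\sqrt{n}\,\epsilon}{\sigma} \right) \\
&= 2\, P\left( N(0, 1) \le \frac{\sqrt{n}\,\epsilon}{\sigma} \right) - 1,
\end{aligned}
\]
where the last line follows from the symmetry and normalization of the standard normal PDF.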

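To guarantee that $P\left( |\bar{\theta}_n - \theta| \le \epsilon \right) \approx 1 - \alpha$, we choose $n$ such that
\[
2\, P\left( N(0, 1) \le \frac{\sqrt{n}\,\epsilon}{\sigma} \right) - 1 = 1 - \alpha
\iff
\Phi\left( \frac{\sqrt{n}\,\epsilon}{\sigma} \right) = 1 - \frac{\alpha}{2},
\]
where $\Phi$ is the standard normal cumulative distribution function (CDF). With the definition
\[
z_{1 - \alpha/2} := \Phi^{-1}\left( 1 - \frac{\alpha}{2} \right),
\]
the required sample size is
\[
n = \left( \frac{\sigma z_{1 - \alpha/2}}{\epsilon} \right)^2.
\]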
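
As a rough illustration, the sample-size formula can be evaluated with `statistics.NormalDist` from the Python standard library. The function name `required_sample_size` and the example inputs are assumptions; rounding up to the next integer is added here because $n$ must be a whole number.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, eps, alpha):
    """Smallest integer n with n >= (sigma * z_{1 - alpha/2} / eps)^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1 - alpha/2} = Phi^{-1}(1 - alpha/2)
    return math.ceil((sigma * z / eps) ** 2)


# Example inputs (assumed): sigma = 0.2, tolerance 0.01, alpha = 0.05.
print(required_sample_size(sigma=0.2, eps=0.01, alpha=0.05))
```

Because $n$ scales with $1/\epsilon^2$, tightening the tolerance by a factor of 10 multiplies the required sample size by roughly 100.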
In practice, we usually don't know the variance $\sigma^2$ of $h(X)$. It can, however, be
estimated by the sample variance
\[
\bar{\sigma}_n^2 := \frac{1}{n - 1} \sum_{i=1}^{n} \left( h(X_i) - \bar{\theta}_n \right)^2.
\]

⁷ It is approximate because the Central Limit Theorem only gives the asymptotic distribution of
$\bar{\theta}_n$ in the limit of large $n$. How large must $n$ be before $\bar{\theta}_n$ starts
to 'look normal'? It depends on 'how normal' the distribution of $h(X)$ is, but the typical rule
of thumb is that $n$ should be at least 30.

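The sample variance is an unbiased and consistent estimator of $\sigma^2$. By the converging
together theorem (Slutsky's theorem), we can replace $\sigma$ by $\bar{\sigma}_n$ in the
Central Limit Theorem:
\[
\frac{\sqrt{n}\,(\bar{\theta}_n - \theta)}{\bar{\sigma}_n} \Rightarrow N(0, 1) \quad \text{as } n \to \infty.
\]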

This gives a two-stage procedure for deciding how large a sample size to use:

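1. Choose a pilot sample size $n_0$ (say 50 or 100), generate $X_1, \dots, X_{n_0}$ iid from the
distribution of $X$, and estimate $\sigma$ by $\bar{\sigma}_{n_0}$.

2. Set $n = \left( \frac{\bar{\sigma}_{n_0} z_{1 - \alpha/2}}{\epsilon} \right)^2$, rounded up to the next integer.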
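
The two-stage procedure can be sketched as follows, applied to the earlier example $h(U) = e^{-U^3}$ with $U$ uniform on $[0, 1]$. The function name `two_stage_sample_size`, the `sampler` callable, the seed and the tolerance are all assumptions made for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def two_stage_sample_size(h, sampler, eps, alpha, n0=100):
    """Pilot run of size n0 to estimate sigma, then n = ceil((sigma_hat * z / eps)^2)."""
    pilot = [h(sampler()) for _ in range(n0)]     # stage 1: pilot sample of h(X_i)
    sigma_hat = statistics.stdev(pilot)           # sample standard deviation (n0 - 1 divisor)
    z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{1 - alpha/2}
    return math.ceil((sigma_hat * z / eps) ** 2)  # stage 2: required sample size


rng = random.Random(0)  # arbitrary fixed seed
n = two_stage_sample_size(lambda u: math.exp(-u ** 3), rng.random, eps=0.005, alpha=0.05)
print(n)
```

Since the pilot estimate of $\sigma$ is itself random, the resulting $n$ varies somewhat from run to run; a larger pilot size `n0` reduces that variation.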

STEP III: Constructing confidence intervals

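Now we ask: given $\alpha$ and $n$, for what tolerance $\epsilon > 0$ can we be
$100(1 - \alpha)\%$ confident that $\theta$ lies in the interval
$\left( \bar{\theta}_n - \epsilon, \bar{\theta}_n + \epsilon \right)$?
Using the results obtained from Step II,
\[
\epsilon = \frac{\sigma z_{1 - \alpha/2}}{\sqrt{n}}.
\]
Because $\sigma$ is usually unknown, we use $\bar{\sigma}_n$ when constructing the confidence intervals.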

Convergence rate
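The confidence interval half-width $\epsilon = \frac{\sigma z_{1 - \alpha/2}}{\sqrt{n}}$ gives a
measure of the Monte-Carlo convergence rate. For fixed $\alpha$ and $\sigma$, $\epsilon$
decreases like $1/\sqrt{n}$, so halving the error requires four times as many samples. For
fixed $\alpha$ and $n$, $\epsilon$ is directly proportional to the standard deviation
$\sigma$ of $h(X)$, meaning Monte-Carlo works better for problems with less variability.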
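
A short numerical illustration of how the half-width depends on $n$, using assumed values $\sigma = 0.2$ and $\alpha = 0.05$; the function name `half_width` is hypothetical.

```python
import math
from statistics import NormalDist


def half_width(sigma, n, alpha):
    """Confidence-interval half-width eps = sigma * z_{1 - alpha/2} / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return sigma * z / math.sqrt(n)


# Each step multiplies n by 4, which should halve the half-width.
for n in (100, 400, 1600, 6400):
    print(n, half_width(sigma=0.2, n=n, alpha=0.05))
```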

Challenges

The challenges of using Algorithm 1 are as follows:

(i) How to generate the random samples X1 , . . . , Xn iid from the distribution of X.

(ii) Improving the convergence rate of Monte Carlo algorithm.


Algorithm 1 Monte-Carlo estimation of θ = E (h (X))


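Input: $\alpha \in (0, 1)$ (for confidence level $1 - \alpha$) and sample size $n \in \mathbb{N}$.

1. Generate $X_1, \dots, X_n$ iid from the distribution of $X$.

2. Estimate $\theta = E(h(X))$ and $\sigma^2 = \operatorname{Var} h(X)$ by
\[
\bar{\theta}_n := \frac{1}{n} \sum_{i=1}^{n} h(X_i),
\qquad
\bar{\sigma}_n^2 := \frac{1}{n - 1} \sum_{i=1}^{n} \left( h(X_i) - \bar{\theta}_n \right)^2.
\]

3. Conclude with approximately $100(1 - \alpha)\%$ confidence that
\[
\theta \in \left( \bar{\theta}_n - \frac{\bar{\sigma}_n z_{1 - \alpha/2}}{\sqrt{n}}, \;
\bar{\theta}_n + \frac{\bar{\sigma}_n z_{1 - \alpha/2}}{\sqrt{n}} \right).
\]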
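
The three steps of Algorithm 1 might be written in Python as below. This is a sketch under stated assumptions: the function name `monte_carlo_ci`, the `sampler` callable and the example target $h(U) = e^{-U^3}$ with $U$ uniform on $[0, 1]$ are chosen here for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def monte_carlo_ci(h, sampler, n, alpha):
    """Return (theta_hat, lower, upper) following the three steps of Algorithm 1."""
    values = [h(sampler()) for _ in range(n)]   # step 1: h(X_1), ..., h(X_n)
    theta_hat = statistics.fmean(values)        # step 2: sample average
    sigma_hat = statistics.stdev(values)        # step 2: square root of the sample variance
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{1 - alpha/2}
    eps = sigma_hat * z / math.sqrt(n)          # step 3: confidence-interval half-width
    return theta_hat, theta_hat - eps, theta_hat + eps


rng = random.Random(42)  # arbitrary fixed seed
theta_hat, lower, upper = monte_carlo_ci(
    lambda u: math.exp(-u ** 3), rng.random, n=50_000, alpha=0.05
)
print(theta_hat, (lower, upper))
```

The interval is only approximate, since it relies on the Central Limit Theorem and on the estimated standard deviation.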

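Remark 38. To address the difficulty of sampling directly from the distribution of $X$, Markov
chains are used, giving what we call Markov Chain Monte Carlo (MCMC). MCMC is powerful because
a suitably constructed Markov chain produces samples whose distribution converges to a given
target distribution, even when that distribution cannot be sampled from directly. The
Metropolis-Hastings algorithm is a standard way to construct such a chain: it proposes a move
and then accepts or rejects it, which corrects for the difference between the proposal and the
target distribution.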
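
To make the accept/reject idea concrete, here is a minimal sketch of random-walk Metropolis, a special case of Metropolis-Hastings with a symmetric Gaussian proposal. It is not taken from the notes: the function names, the standard normal target, the step size and the seed are all assumptions chosen for illustration.

```python
import math
import random


def random_walk_metropolis(log_target, x0, n_steps, step, rng):
    """Random-walk Metropolis: symmetric Gaussian proposals with an accept/reject step."""
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, target(proposal) / target(x)).
        log_ratio = log_target(proposal) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)   # a rejected move repeats the current state
    return samples


def log_std_normal(x):
    """Log-density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x


rng = random.Random(3)  # arbitrary fixed seed
chain = random_walk_metropolis(log_std_normal, x0=0.0, n_steps=50_000, step=1.0, rng=rng)
print(sum(chain) / len(chain))                 # sample mean, expected to be near 0
print(sum(x * x for x in chain) / len(chain))  # sample second moment, expected to be near 1
```

Note that the samples in the chain are correlated rather than iid, so the error formulas of Steps II and III do not apply to them without modification.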
