
STOCKHOLM UNIVERSITY April 17, 2008

Mathematics department
Div. of mathematical statistics
Mikael Andersson

Computer exercise 1: Introduction to MCMC-simulation

The purpose of this computer exercise is to give you, in a comparatively simple situation, an introduction to the general ideas behind MCMC-simulation and how it is implemented in practice. These instructions are written for the software package Matlab, but you are allowed to solve the exercise using other software.
The choice of Matlab is mainly due to pedagogical considerations. Today, there are
several programs developed for MCMC-simulation (e.g. WinBUGS for Microsoft Windows),
but using such high-level software usually does not give any deeper insight into the details
behind the algorithms. In addition, Matlab is very efficient when it comes to extensive
numerical computations, random number generation and handling large data sets.
No previous experience of Matlab is required, although it would be a great advantage.
The syntax and the specific functions used will be introduced gradually as needed. For
those who wish to learn more about Matlab, there are extensive resources on the web, see
for example www.sgr.nada.kth.se/unix/software/matlab/. It is also possible to get a
short description of every defined function in Matlab using the command help followed by
the name of the function.

1 Logging in and starting Matlab


First, enter your user name and password and click “Go”. You then enter the Linux operating system and can work in a desktop environment similar to Windows. If this is the first time you log in, a window for system configuration appears. If you do not want to configure the system, just click “Avbryt” (Cancel).
You start Matlab by first clicking the “K Menu” at the bottom left, then “Math” on the menu that appears and finally “Matlab”. A total of four windows will appear: “Workspace”, “Command History”, “Current Directory” and “Command Window”. You write all commands and operations that you want Matlab to execute in “Command Window”; in “Command History” you see all commands that have been executed; in “Workspace” you see all variables you create; and in “Current Directory” you see all files in the current directory. To terminate the session you either write exit in “Command Window” or open the menu “File” at the top and choose “Exit MATLAB”.

2 A short introduction to Matlab


Essentially, Matlab works like an unusually advanced pocket calculator. At the command line (starting with the prompt >>) in “Command Window” you can write simple arithmetic expressions like 5+2 or (3.7-4.8)/3*1.96 and the result is displayed immediately. There are also several pre-defined functions like exp(2.5) for e^2.5, sqrt(2.5) for √2.5 and log(2.5) for log(2.5) (the natural logarithm).
Data sets are stored in the form of vectors and matrices. To create a vector you write

>> vec=[1 2 3 4]

vec =

1 2 3 4

and to create a matrix you write

>> mat=[1.4 2.7 4.2;3.3 1.9 8.4]

mat =

1.4000 2.7000 4.2000


3.3000 1.9000 8.4000

To access particular entries from these variables you use indices like

>> vec(2)

ans =

     2

>> mat(2,3)

ans =

    8.4000

You can also access several entries like

>> vec(2:4)

ans =

2 3 4

>> mat(1:2,2:3)

ans =

2.7000 4.2000
1.9000 8.4000

This is a very crude introduction to the structure of Matlab, but, as mentioned, we will
introduce more aspects and functions as we go along.

3 MCMC-simulation of the Normal-Inverse-χ² distribution
As an introduction to MCMC-simulation, we will consider the situation with normally
distributed data with unknown mean and variance. As we have seen in the lectures and
in the book, Section 3.3, the family of Normal-Inverse-χ² distributions is conjugate for the parameter vector θ = (µ, σ²) in the normal distribution. This comparatively simple
situation can be handled analytically, but it might be a good idea to carry out a simulation
algorithm for a model without too many parameters to keep track of.
In this exercise, we will focus on the case with the non-informative prior distribution

    p(µ, σ²) ∝ 1/σ²
according to Section 3.2 in the book. This implies the posterior distribution
    p(µ, σ² | y) ∝ (1/σ²)^(n/2 + 1) · exp( −[(n − 1)s² + n(ȳ − µ)²] / (2σ²) )

where ȳ and s² are the average and sample variance of the sample y = (y1, y2, . . . , yn).
As an application with a biostatistical connection, we will use data from a clinical trial
where the dissolving time of a certain substance in stomach acid from eight patients was
measured. The result (in seconds) was

42.7 43.4 44.6 45.1 45.6 45.9 46.8 47.6

First create the vector y in Matlab consisting of these observations and calculate the average
and sample variance using the functions mean and var. (Write help mean and help var
for short descriptions of these functions.)
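For example, entering the data and computing the summaries might look like this (the variable names y, ybar and s2 are our own choice, not prescribed by the exercise):

>> y=[42.7 43.4 44.6 45.1 45.6 45.9 46.8 47.6];
>> ybar=mean(y)    % sample mean, about 45.2125
>> s2=var(y)       % sample variance (divisor n-1), about 2.6898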

3.1 The Metropolis-Hastings algorithm
We are now going to construct an algorithm for simulation of the posterior distribution for
given data. Let us first rename our parameters as θ1 = µ and θ2 = σ², which yields a more
convenient notation, and let n = 8. The posterior distribution can now be written
    p(θ1, θ2 | y) ∝ θ2^(−5) · exp( −[7s² + 8(ȳ − θ1)²] / (2θ2) )
We will now construct the algorithm such that the parameters are updated one at a time instead of simultaneously. Let us start with the jump distributions.
We will use the uniform distribution for θ1 , mostly because it is easy to simulate but
also because it is symmetric. If we are at step t in the Markov chain, we simulate a new
value for θ1 according to
    θ1* ∼ U[θ1^(t−1) − d1, θ1^(t−1) + d1]
where U[x1, x2] denotes the uniform distribution on the interval x1 ≤ x ≤ x2 and d1 denotes the maximal jump length. An alternative way to express this is

    θ1* = θ1^(t−1) + X

where X ∼ U[−d1, d1]. In Matlab it is easy to generate uniform random numbers between 0 and 1 using the command rand. To obtain X, we make the simple transformation (rand-0.5)*2*d1. Choose some d1 and test this.
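For example, with d1 = 0.1 (each call to rand gives a new random value):

>> d1=0.1;
>> x=(rand-0.5)*2*d1    % one draw from U[-d1, d1]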
For θ2, which is non-negative, it is unsuitable to choose a uniform distribution for the jumps, because then we risk generating negative values. We will instead consider the transformed parameter φ = log θ2, which is unbounded from below. In this parametrisation
we can choose the uniform distribution as

    φ* = φ^(t−1) + Y

where Y ∼ U[−d2, d2]. To express the jump distribution for θ2 in terms of the jump distribution for φ we use the relation
    Jt(θ2* | θ2^(t−1)) = Jt(φ* | φ^(t−1)) · |dφ/dθ2| = Jt(φ* | φ^(t−1)) · (1/θ2*)

The ratio between the densities for the jump distributions used in the Metropolis-Hastings
algorithm can now be written

    Jt(θ2* | θ2^(t−1)) / Jt(θ2^(t−1) | θ2*)
        = [Jt(φ* | φ^(t−1)) / θ2*] / [Jt(φ^(t−1) | φ*) / θ2^(t−1)]
        = θ2^(t−1) / θ2*

since Jt(φ* | φ^(t−1)) = Jt(φ^(t−1) | φ*).


When simulating θ2* we hence go through the following steps:
a. Calculate φ^(t−1) = log θ2^(t−1).

b. Simulate φ* = φ^(t−1) + Y.

c. Calculate θ2* = exp(φ*).

Let us now try out a couple of steps in the algorithm. First we have to choose starting values for the parameters, for example θ1^(0) = 45 and θ2^(0) = 2. In this simple situation we know that the distribution is centered at the point (ȳ, s²), so to get fast convergence we choose a starting point close to this. To be able to handle data in a convenient way, we
store the simulated values in the vectors theta1 and theta2. We begin by entering the
starting values

>> theta1(1)=45

theta1 =

45

>> theta2(1)=2

theta2 =

     2
Vectors are indexed from 1 and up, so θ1^(0) corresponds to theta1(1), θ1^(1) corresponds to theta1(2), and so on.
Let d1 = 0.1 and simulate θ1* as

>> d1=0.1

d1 =

0.1000

>> t1=theta1(1)+(rand-0.5)*2*d1

t1 =

45.0315

Since this is a random number, you will most likely get something else.
In the next step, we are going to calculate the ratio

    r = p(θ1*, θ2^(0) | y) / p(θ1^(0), θ2^(0) | y)

for the starting values θ1^(0), θ2^(0) and our simulated value θ1*. Use the expression for the posterior distribution p(θ1, θ2 | y) and calculate r. Then generate a uniform random number between 0 and 1 and check if this is smaller than r. In Matlab you can use an if-statement as
>> if rand<r

theta1(2)=t1

else

theta1(2)=theta1(1)

end

theta1 =

45.0000 45.0315
If the condition after if is satisfied, then the first command is executed; otherwise the command after else is executed. In this case, the new value θ1* was accepted and hence the vector theta1 was updated with the new value 45.0315.
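For reference, one way to compute r is via a small helper for the unnormalized posterior density. This is only a sketch; the name post and the reuse of ybar and s2 from Section 3 are our own choices, not part of the original instructions:

post=@(t1,t2) t2^(-5)*exp(-(7*s2+8*(ybar-t1)^2)/(2*t2));  % unnormalized p(theta1,theta2|y)
r=post(t1,theta2(1))/post(theta1(1),theta2(1));           % symmetric jumps, no correction factor
if rand<r
    theta1(2)=t1;           % accept the proposed value
else
    theta1(2)=theta1(1);    % reject and keep the old value
end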
Let d2 = 0.2 and simulate θ2* as
>> d2=0.2

d2 =

0.2000

>> phi=log(theta2(1))+(rand-0.5)*2*d2

phi =

0.6065

>> t2=exp(phi)

t2 =

1.8339
Then, calculate the ratio

    r = [p(θ1^(1), θ2* | y) / Jt(θ2* | θ2^(0))] / [p(θ1^(1), θ2^(0) | y) / Jt(θ2^(0) | θ2*)]
      = [p(θ1^(1), θ2* | y) · θ2*] / [p(θ1^(1), θ2^(0) | y) · θ2^(0)]

and test in the same way as above if a random number between 0 and 1 is smaller than r.
Finally, update the vector theta2 depending on the result in a similar way as above.
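A sketch of this step, reusing the hypothetical post helper from above; the factor t2/theta2(1) is the correction coming from the jump distribution for θ2:

r=post(theta1(2),t2)*t2/(post(theta1(2),theta2(1))*theta2(1));
if rand<r
    theta2(2)=t2;           % accept the proposed value
else
    theta2(2)=theta2(1);    % reject and keep the old value
end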
This will produce vectors similar to

>> theta1

theta1 =

45.0000 45.0315

>> theta2

theta2 =

2.0000 1.8339

To reach (almost) convergence of the Markov chain constructed in this way, we naturally have to repeat these steps a large number of times. To do this more conveniently, we will now define a new function in Matlab that simulates a Markov chain for a fixed number of steps. Start by clicking the menu “File” in the top left corner of the “Command Window”, then click “New” and finally “M-file”. A new window then appears where we will write our function.
Write on the first line

function [z1,z2]=MHalg(N,y,theta10,theta20,d1,d2)

The command function says that we are going to define a function with the name MHalg,
where N,y,theta10,theta20,d1,d2 are the arguments of the function and z1 and z2 are
the results, in our case the simulated parameters. We start as before by entering the
starting values and introduce a step counter t as

function [z1,z2]=MHalg(N,y,theta10,theta20,d1,d2)

theta1(1)=theta10;
theta2(1)=theta20;
t=0;

Putting a semicolon (;) at the end of a line has the effect that the result of the operation
on that line is not written out in “Command Window”. Since we are going to carry out several operations hundreds or even thousands of times, “Command Window” would otherwise be jammed with results.
The argument N denotes the number of steps we are going to simulate. To get the
function to repeat the following operations the required number of times, we can use a
while-statement as

while t<N
    operation 1
    operation 2
    ...
    operation k
end

Write all the steps that you carried out earlier for the simulation of θ1^(1), but remember to use the step counter t in the right way, as

t1=theta1(t+1)+(rand-0.5)*2*d1;

and

theta1(t+2)=t1;

Write the corresponding steps for the simulation of θ2^(1) using the step counter in the same way as before, update the vectors again and increase the step counter as

t=t+1;

Finish the while-statement with end. After this you write

z1=theta1;
z2=theta2;

so that the function yields the vectors as results. When all this is done you click “File”
again, then “Save As” and write MHalg.m on the line “Enter file name:”. Do not forget the
extension .m at the end, so that Matlab will recognise the file.
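Assembled along these lines, the complete file might look as follows. This is only one possible sketch; the helper post is our own choice, and any equivalent way of computing r works just as well:

function [z1,z2]=MHalg(N,y,theta10,theta20,d1,d2)
% Metropolis-Hastings sampler for (mu, sigma^2) = (theta1, theta2)
% with the non-informative prior p(mu, sigma^2) proportional to 1/sigma^2
n=length(y);
ybar=mean(y);
s2=var(y);
% unnormalized posterior density p(theta1, theta2 | y)
post=@(t1,t2) t2^(-(n/2+1))*exp(-((n-1)*s2+n*(ybar-t1)^2)/(2*t2));
theta1(1)=theta10;
theta2(1)=theta20;
t=0;
while t<N
    % update theta1 with a symmetric uniform jump
    t1=theta1(t+1)+(rand-0.5)*2*d1;
    r=post(t1,theta2(t+1))/post(theta1(t+1),theta2(t+1));
    if rand<r
        theta1(t+2)=t1;
    else
        theta1(t+2)=theta1(t+1);
    end
    % update theta2 with a uniform jump on phi = log(theta2)
    phi=log(theta2(t+1))+(rand-0.5)*2*d2;
    t2=exp(phi);
    r=post(theta1(t+2),t2)*t2/(post(theta1(t+2),theta2(t+1))*theta2(t+1));
    if rand<r
        theta2(t+2)=t2;
    else
        theta2(t+2)=theta2(t+1);
    end
    t=t+1;
end
z1=theta1;
z2=theta2;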
By calling the function in “Command Window” as

>> [z1,z2]=MHalg(100,y,45,2,0.1,0.2)

we get a simulated Markov chain in 100 steps with starting values 45 and 2 and maximal
jump lengths 0.1 and 0.2 respectively. Try this and plot the results using

>> plot(0:100,z1)
>> plot(0:100,z2)
>> plot(z1,z2)

It is also possible to illustrate the simulated parameters in histograms, which can be done
in Matlab as

>> hist(z1,10)
>> hist(z2,10)

The second argument, in this example 10, denotes the number of bars in the histogram. A
simple way to test if the simulated values are correct is to calculate the sample means

>> mean(z1)
>> mean(z2)

and compare to ȳ and s². Remember that the mean of the inverse-χ² distribution is not equal to s² but to (n − 1)s²/(n − 3).
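For n = 8 this mean is 7s²/5. A quick numerical check (assuming the data vector y is still defined):

>> mean(z1)              % should be close to mean(y), about 45.21
>> 7*var(y)/5            % posterior mean of theta2, about 3.77
>> mean(z2)              % should be close to the value above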

3.2 Evaluation of simulations


To conclude this exercise, we will investigate how quickly the chain converges and try to find optimal jump lengths. Let us simulate four separate sequences of 200 steps each, throw away the first 100 steps and calculate the scale reduction and effective number of simulations according to Section 11.6 in the textbook.
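For reference, with m = 4 sequences of length n = 100 each, the quantities computed below are, in the textbook's notation,

    B = \frac{n}{m-1} \sum_{j=1}^{m} (\bar{\psi}_{.j} - \bar{\psi}_{..})^2 ,
    \qquad
    W = \frac{1}{m} \sum_{j=1}^{m} s_j^2 ,

    \hat{R} = \sqrt{ \frac{ \frac{n-1}{n} W + \frac{1}{n} B }{ W } } ,
    \qquad
    n_{\mathrm{eff}} = \frac{ m n ( \frac{n-1}{n} W + \frac{1}{n} B ) }{ B } ,

where \bar{\psi}_{.j} and s_j^2 are the mean and variance within sequence j and \bar{\psi}_{..} is the overall mean.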
To simplify the calculations, we will store all sequences for each parameter in a large matrix. As before, since a number of operations need to be carried out, it is a good idea to write them in another M-file called, say, steplengths.m.

[z1,z2]=MHalg(200,y,45,2,0.1,0.2);
psi1(1,1:100)=z1(102:201);
psi2(1,1:100)=z2(102:201);
[z1,z2]=MHalg(200,y,45.5,2,0.1,0.2);
psi1(2,1:100)=z1(102:201);
psi2(2,1:100)=z2(102:201);
[z1,z2]=MHalg(200,y,45,10,0.1,0.2);
psi1(3,1:100)=z1(102:201);
psi2(3,1:100)=z2(102:201);
[z1,z2]=MHalg(200,y,45.5,10,0.1,0.2);
psi1(4,1:100)=z1(102:201);
psi2(4,1:100)=z2(102:201);

I have chosen four fairly scattered starting points based on the results from the first simu-
lation. To run the operations, you simply write

>> steplengths

in “Command Window”.
We have now obtained two matrices psi1 and psi2 consisting of four rows and 100
columns each. Using the Matlab functions mean, var and sum, we can now calculate the
variances between groups and within groups as

>> B1=100/3*sum(power(mean(psi1’)-mean(mean(psi1)),2))

B1 =

0.9307

>> B2=100/3*sum(power(mean(psi2’)-mean(mean(psi2)),2))

B2 =

163.9868

and

>> W1=mean(var(psi1’))

W1 =

0.0206

>> W2=mean(var(psi2’))

W2 =

0.8309

Again, it is quite possible that you will get other values, but the difference should not be
too large. We get the scale reductions

>> R1=sqrt((99/100*W1+1/100*B1)/W1)

R1 =

1.2010

>> R2=sqrt((99/100*W2+1/100*B2)/W2)

R2 =

1.7215

and effective number of simulations

>> neff1=4*100*(99/100*W1+1/100*B1)/B1

neff1 =

12.7543

>> neff2=4*100*(99/100*W2+1/100*B2)/B2

neff2 =

6.0065

According to my simulations, we satisfy neither the condition of a low scale reduction nor that of a large effective number of simulations. Try to increase the number of simulations until these criteria are satisfied. Also try longer or shorter jump lengths d1 and d2 to increase the efficiency of the algorithm. Another convenient feature of Matlab is that previous commands can be recalled by repeatedly pressing the “arrow up” key.

3.3 Bayesian inference


Finally, we will use our simulated values to make inference on the parameters. Assume that we have found a suitable number of simulations and useful jump lengths and have stored the values in z1 and z2. Since these are assumed to follow the posterior distribution, we can easily estimate means using sample means as

>> mean(z1)

>> mean(z2)

Posterior probability intervals are almost as easily obtained by sorting all values

>> z1=sort(z1);

>> z2=sort(z2);

and removing the 2.5 % smallest and 2.5 % largest values (if we want a 95 % interval). If
we have simulated 1000 values, we get the limits

>> z1(25)

>> z1(975)

and the corresponding for z2.
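More generally, if N values have been simulated and sorted, the same limits can be picked out as follows (a sketch, following the indexing convention above):

>> N=length(z1);
>> z1(round(0.025*N))    % lower 95 % limit; gives z1(25) when N = 1000
>> z1(round(0.975*N))    % upper 95 % limit; gives z1(975) when N = 1000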

3.4 Exercise
The computer exercise must be summarised in a written report containing a print-out of the file MHalg.m, plots and histograms of your simulated values, sample means and 95 % probability intervals, and your choice of optimal jump lengths. It is also possible to send in the report by e-mail to [email protected], preferably in PDF format.
