140.656 Lab 8 2008

Lab 8: Introduction to WinBUGS

Goals:
1. Introduce the concepts of Bayesian data analysis.
2. Learn the basic syntax of WinBUGS.
3. Learn the basics of using WinBUGS in a simple example.

Next lab we will look at two case studies: (1) NMMAPS and (2) Hospital ranking data.

PART I

WinBUGS and Bayesian Analysis

A. What is WinBUGS
BUGS = Bayesian Inference Using Gibbs Sampling
Not using Windows? Try
OpenBUGS: http://mathstat.helsinki.fi/openbugs
JAGS: http://www-fis.iarc.fr/~martyn/software/jags

WinBUGS is Bayesian analysis software that uses Markov Chain Monte
Carlo (MCMC) to fit statistical models.
WinBUGS can be used in statistical problems as simple as estimating
means and variances or as complicated as fitting multilevel models,
measurement error models, and missing data models.
WinBUGS fits fixed-effect and multilevel models using the Bayesian
approach. Stata fits fixed effects and limited multilevel models using
maximum likelihood or generalized least squares.
Often results from WinBUGS and Stata are very similar.

B. What is Bayesian Analysis


Consider the following conditional probability statement:

$$P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta)\, P(\theta)}{P(\text{data})},$$

where $\theta$ is the unobserved parameter that we want to learn about using the observed data.

There are three important components:


(1) $P(\text{data} \mid \theta)$:
This represents the model part in a statistical analysis.
It describes our assumptions about how the observed data were generated
given the parameter $\theta$.
E.g. $\theta$ is the mean of a Normal distribution with variance 1.
E.g. $\theta$ is the vector of coefficients in a linear regression model, where the data
include both the response (Y) and covariates (X).
(2) $P(\theta)$:
This represents our prior assumption about $\theta$.
It describes our prior belief about $\theta$, typically using a distribution.
E.g. Assume $\theta \sim \text{Normal}(0, 20)\, I(\theta > 0)$, so $\theta$ is strictly positive.
E.g. Assume $\theta \sim \text{Normal}(0, 10^5)$, a pretty non-informative prior.
Choosing appropriate priors can be tricky and we will see many
examples that are typically used in standard statistical analyses.

(3) $P(\theta \mid \text{data})$:
This represents the posterior distribution of $\theta$.
It describes all the information on $\theta$ after combining our prior knowledge
of $\theta$ with what our data tell us about $\theta$.
Since $P(\theta \mid \text{data})$ is a probability distribution, statistical inference is
made by examining the different characteristics of this distribution.
E.g. the posterior mean, median or mode can be our estimate of $\theta$.
E.g. the variance and middle 95% of the posterior distribution can tell
us about the uncertainty in our estimates.
Therefore, Bayesian data analyses typically involve the following ingredients:
(1) Specify a model that describes the relation between the unknown parameters
and the observed data.
(2) Specify prior distributions for the unknown parameters.
(3) Obtain the posterior distributions.
(4) Make inference using the posterior distributions.
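
As a worked illustration of the four ingredients, here is a sketch for the first example above (the mean of a Normal distribution with variance 1). The prior variance $\tau^2$, the sample size $n$ and the sample mean $\bar{x}$ are notation introduced for this sketch only; this is one of the few cases where the posterior has a closed form.

```latex
% (1) Model
x_i \mid \theta \sim \text{Normal}(\theta, 1), \quad i = 1, \ldots, n
% (2) Prior
\theta \sim \text{Normal}(0, \tau^2)
% (3) Posterior: multiply model and prior, then complete the square in \theta
P(\theta \mid \text{data}) \propto
  \exp\Big(-\tfrac{1}{2}\sum_{i=1}^{n}(x_i - \theta)^2\Big)
  \exp\Big(-\frac{\theta^2}{2\tau^2}\Big)
\quad\Longrightarrow\quad
\theta \mid \text{data} \sim
  \text{Normal}\Big(\frac{n\bar{x}}{n + 1/\tau^2},\; \frac{1}{n + 1/\tau^2}\Big)
% (4) Inference: as \tau^2 grows (e.g. \tau^2 = 10^5) the prior stops mattering
\theta \mid \text{data} \approx \text{Normal}(\bar{x},\, 1/n)
```

With a very flat prior the posterior mean is essentially the sample mean, which is why non-informative priors tend to give answers close to maximum likelihood.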
C. Why Do Bayesian Analysis
Here are some advantages of the Bayesian approach:

All uncertainty in parameter estimation is included in the final inference (e.g.
Bayesian versus empirical Bayes estimates of random effects).
Estimates (and particularly their uncertainty) for any function of the parameters can be
easily obtained by examining the corresponding posterior distribution.
Prior information can be easily integrated.
Inference does not rely on large-sample asymptotic theory.

Here are some elements that make Bayesian analysis more complex:

Need to specify prior distributions.
Only in very simple models can $P(\theta \mid \text{data})$ be derived explicitly.
→ Solution: Monte Carlo sampling!!!

Markov Chain Monte Carlo?

Since we specify $P(\text{data} \mid \theta)$ and $P(\theta)$, the only element that prevents us from
obtaining $P(\theta \mid \text{data})$ is the marginal distribution $P(\text{data})$.
However, $P(\text{data})$ is often difficult to evaluate, especially when the number of
parameters is large.
Note that $P(\text{data})$ does NOT depend on the parameter $\theta$, so the posterior is known
up to a constant: $P(\theta \mid \text{data}) \propto P(\text{data} \mid \theta)\, P(\theta)$. Many methods have
been developed to draw samples from $P(\theta \mid \text{data})$ using only this unnormalized
form, and WinBUGS does this for us automatically!!!
Given samples from $P(\theta \mid \text{data})$, we can calculate the desired statistics such as the
mean or variance to make statistical inference. Once the sampler has converged, how closely
our samples resemble the true posterior distribution is limited only by the number of draws we
make.
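
To make the idea concrete, here is a minimal sketch of one such method, a random-walk Metropolis sampler, written in Python rather than WinBUGS. It is not what WinBUGS runs internally; the data values, function names, step size and number of draws are all made up for illustration. The model is the earlier example (Normal data with variance 1 and a Normal(0, 10^5) prior on the mean). Note that P(data) never appears: only the product of model and prior is evaluated.

```python
import math
import random
import statistics


def log_unnormalized_posterior(theta, data, prior_var=1e5):
    """log P(data | theta) + log P(theta), with constants dropped.

    Model: each observation ~ Normal(theta, variance 1).
    Prior: theta ~ Normal(0, prior_var).
    """
    log_lik = -0.5 * sum((x - theta) ** 2 for x in data)
    log_prior = -0.5 * theta ** 2 / prior_var
    return log_lik + log_prior


def metropolis(data, n_draws=5000, step=1.0, seed=1):
    """Random-walk Metropolis: propose theta + noise, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_unnormalized_posterior(theta, data)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_unnormalized_posterior(proposal, data)
        # P(data) would appear in both numerator and denominator, so it cancels
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


# Hypothetical observations, made up for this sketch
data = [4.8, 5.6, 3.9, 5.2, 6.1, 4.4, 5.0, 5.5]
draws = metropolis(data)[1000:]  # drop the first 1000 draws as burn-in
post_mean = statistics.mean(draws)
post_sd = statistics.stdev(draws)
print(round(post_mean, 2), round(post_sd, 2))
```

With a prior this flat, the posterior mean should land close to the sample mean of the data and the posterior standard deviation close to 1/sqrt(n).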

PART II

WinBUGS in Action

A. How to install WinBUGS


(1) Download the .exe file from: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
(2) Find the file WinBUGS14.exe where WinBUGS is installed. You may want to
create a shortcut.
(3) Follow the instructions in the patch file and install the patch in WinBUGS.
(4) Fill out a registration form to obtain a key through email.
(5) Update the key in WinBUGS.

B. A Simple Demonstration: Inference for Two Normal Means.


Data: two samples of size 20 from two independent Normal distributions with unknown
variances.
Data: $X = (x_1, x_2, \ldots, x_{20})$ and $Y = (y_1, y_2, \ldots, y_{20})$
(1) Statistical model:

$$x_i \sim \text{Normal}(\mu_1, \sigma_1^2), \quad i = 1, 2, \ldots, 20$$
$$y_i \sim \text{Normal}(\mu_2, \sigma_2^2), \quad i = 1, 2, \ldots, 20$$

(2) Prior distributions:
In our model, we have four unknowns so four prior distributions are needed.

$$\mu_1 \sim \text{Normal}(0, 100^2)$$
$$\mu_2 \sim \text{Normal}(0, 100^2)$$
$$1/\sigma_1^2 \sim \text{Gamma}(0.001, 0.001)$$
$$1/\sigma_2^2 \sim \text{Gamma}(0.001, 0.001)$$

The Normal distributions for the $\mu$'s are flat and cover a large range of values.

We set the inverse of the variance to have a gamma prior distribution since
the gamma distribution only takes positive values. Gamma(0.001, 0.001) has an
extremely large standard deviation.
We pick the above prior distributions such that they are non-informative in that
the data will easily dominate the posterior distributions.
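
To put a number on "extremely large": assuming the shape-rate form Gamma(a, b), which is the parameterization used by dgamma in WinBUGS, the prior mean and standard deviation of the precision work out as follows (a and b are notation for this sketch).

```latex
a = 0.001, \quad b = 0.001
E\big[1/\sigma^2\big] = \frac{a}{b} = 1
\text{Var}\big[1/\sigma^2\big] = \frac{a}{b^2} = \frac{0.001}{0.001^2} = 1000
\text{SD}\big[1/\sigma^2\big] = \sqrt{1000} \approx 31.6
```

So the prior standard deviation is roughly 30 times the prior mean, which is what makes the prior so diffuse.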

A typical WinBUGS program includes three sections: Model, Data, and Initial Values.
Model: Translating our statistical model into a WinBUGS program:
model{
   for (i in 1:20){
      # Model, P( data | theta )
      x[i] ~ dnorm (mu[1], prec[1])
      y[i] ~ dnorm (mu[2], prec[2])
   }
   # Priors, P( theta )
   mu[1] ~ dnorm (0, 0.0001)
   mu[2] ~ dnorm (0, 0.0001)
   prec[1] ~ dgamma (0.001, 0.001)
   prec[2] ~ dgamma (0.001, 0.001)

   # Convert precisions to variances
   s2[1] <- 1/prec[1]
   s2[2] <- 1/prec[2]
}

model { } specifies the statistical model we are fitting.


for (i in 1:20) { } is short-hand for writing the two statements in {}
20 times.
The square brackets allow us to index a vector of values. They are
equivalent to the subscripts in our previous model.
WinBUGS uses precision as a parameter in specifying a Normal
distribution instead of variance!!!
o precision = 1/variance
o dnorm (0, 0.0001) is the same as a Normal distribution with mean
0 and variance 1/0.0001 = 10000 = 100^2.
The last two lines tell WinBUGS to also keep track of the variances.
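
Because the precision parameterization is easy to get wrong, a small Python sketch of the conversion may help. The helper names are hypothetical and are not part of WinBUGS; the point is only the reciprocal relationship.

```python
import math


def precision_to_variance(prec):
    """WinBUGS dnorm takes a precision; the variance is its reciprocal."""
    return 1.0 / prec


def variance_to_precision(variance):
    """Precision to pass to dnorm for a desired variance."""
    return 1.0 / variance


prior_var = precision_to_variance(0.0001)          # variance implied by dnorm(0, 0.0001)
prior_sd = math.sqrt(prior_var)                    # corresponding standard deviation
prec_for_sd_10 = variance_to_precision(10.0 ** 2)  # precision needed for a prior sd of 10
print(prior_var, prior_sd, prec_for_sd_10)
```

A common mistake is to type the variance or the standard deviation where the precision belongs, which turns an intended flat prior into a very tight one.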

Our data in WinBUGS format:


list(x = c(6.62, 6.71, 5.07, 4.39, 5.68, 3.94, 5.83, 2.31, 3.60, 4.64,
1.79, 3.12, 3.46, 8.25, 5.49, 6.49, 2.65, 9.14, 5.31, 6.58), y =
c(9.06, 7.00, 8.59, 8.70, 8.64, 8.03, 9.27, 6.01, 7.92, 6.20, 6.39,
9.10, 7.63, 6.75, 8.88, 8.44, 8.95, 5.66, 9.78, 8.09))

Often we also need to give initial values for our parameters:


list( mu=c(0,0), prec=c(1,1))

Run the Analysis in WinBUGS


A. Load model, data, and initial values:
1. Open a new document in WinBUGS and paste all three parts (model, data, initial
values) into it.
2. Save the file as an .odc file.
3. From the top panel, open Model → Specification. A dialogue box will open.
4. Double-click (highlight) the word "model" in your file and click "check model"
in the dialogue box.
5. Look at the bottom left-hand corner for "model is syntactically correct".
6. Highlight the word "list" for the data section and click "load data" in the dialogue
box. Look at the bottom left-hand corner for "data loaded".
7. Click "compile" and look for "model compiled".
8. Highlight the word "list" for the initial value section and click "load inits" in the
dialogue box. Look for "model is initialized".
9. [Optional] If you did not initialize all parameters, click "gen inits" in the dialogue
box.
B. Run Sampler:
1. Open the Sample Monitor Tool window: Menu → Inference → Sample.
2. Type the parameters we are interested in into the "node" box and click "set". In this
example we will track both mu and s2.
3. Open the Update Tool window: Menu → Model → Update.
4. In the Update Tool box, type the number of posterior samples you want in the
"updates" box. E.g. 20000.
5. Click "Update" and watch it run in the "iteration" box!
C. Posterior Inference:
1. In the Sample Monitor Tool, select mu from the drop-down box in "node".
2. Enter the number of initial samples that we want to drop in the "beg" box. This is
also known as burn-in. Let's choose 2000 here.
3. Enter the thinning interval in the "thin" box, so that only every jth iteration is kept. For
example, if you pick 10, then only every 10th sample from the 2000th~10000th
iterations will be used for posterior inference. Let's choose 10 here.
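
The effect of the "beg" and "thin" settings can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function name is hypothetical, the chain is a stand-in list of iteration numbers rather than real draws, and WinBUGS may count the first retained iteration slightly differently.

```python
def keep_for_inference(chain, beg, thin):
    """Drop the first `beg` draws (burn-in), then keep every `thin`-th draw."""
    return chain[beg::thin]


# Stand-in for one monitored node: iteration numbers 1..10000 instead of real draws
chain = list(range(1, 10001))
kept = keep_for_inference(chain, beg=2000, thin=10)
n_kept = len(kept)
print(n_kept, kept[:3])
```

With 10000 iterations, a burn-in of 2000 and thinning of 10, the number of retained draws matches the "sample: 800" shown in the output that follows.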
Some Posterior Inference:
History: Show the sampled value at each iteration. Look for chains that show no particular
pattern and low auto-correlation.

[Figure: history (trace) plots of mu[1], mu[2], s2[1], and s2[2] against iteration]

Density: Plot the density estimates of the parameters.


[Figure: density plots; panel titles as extracted: mu[2] sample: 800, mu[2] sample: 800, s2[1] sample: 800, s2[2] sample: 800]

Stats: Show statistics of the posterior distribution based on the samples.


node    mean    sd       MC error   2.5%     median   97.5%   start   sample
mu[1]   5.051   0.4654   0.01625    4.134    5.051    5.99    2000    800
mu[2]   7.957   0.2765   0.01001    7.429    7.972    8.498   2000    800

node    mean    sd       MC error   2.5%     median   97.5%   start   sample
s2[1]   4.312   1.605    0.04095    2.148    3.944    8.394   2000    800
s2[2]   1.662   0.5761   0.01988    0.8416   1.549    3.122   2000    800

Parameter       Truth   Posterior mean   95% Posterior Interval   95% Confidence Interval   MLE
$\mu_1$         5       5.05             (4.13, 5.99)             (4.14, 5.96)              5.05
$\mu_2$         8       7.96             (7.43, 8.50)             (7.38, 8.52)              7.95
$\sigma_1^2$    4       4.32             (2.15, 8.39)             (2.19, 8.08)              3.79
$\sigma_2^2$    1       1.66             (0.84, 3.12)             (0.85, 3.15)              1.48

The point estimates and 95% posterior intervals for the two means are very similar to
the MLEs and their large-sample 95% confidence intervals.
The point estimates for the variances are a bit different. Why?
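
For readers who want to check the non-Bayesian side of the comparison, here is a Python sketch using the data listed earlier. Two things are assumptions on my part, not statements from the lab: the intervals for the means use the t quantile with 19 degrees of freedom (hard-coded as 2.093), and the variance estimates use the n - 1 divisor. Under those assumptions the sketch should reproduce the means, variances and mean intervals in the table to two decimals.

```python
import math
import statistics

x = [6.62, 6.71, 5.07, 4.39, 5.68, 3.94, 5.83, 2.31, 3.60, 4.64,
     1.79, 3.12, 3.46, 8.25, 5.49, 6.49, 2.65, 9.14, 5.31, 6.58]
y = [9.06, 7.00, 8.59, 8.70, 8.64, 8.03, 9.27, 6.01, 7.92, 6.20,
     6.39, 9.10, 7.63, 6.75, 8.88, 8.44, 8.95, 5.66, 9.78, 8.09]

T_975_DF19 = 2.093  # 97.5% quantile of the t distribution with 19 degrees of freedom


def mean_with_t_interval(sample, t_quantile=T_975_DF19):
    """Sample mean and the interval mean +/- t * s / sqrt(n)."""
    n = len(sample)
    centre = statistics.mean(sample)
    half_width = t_quantile * statistics.stdev(sample) / math.sqrt(n)
    return centre, (centre - half_width, centre + half_width)


mean_x, ci_x = mean_with_t_interval(x)
mean_y, ci_y = mean_with_t_interval(y)
var_x = statistics.variance(x)  # divides by n - 1 (assumed to match the table)
var_y = statistics.variance(y)

print(round(mean_x, 2), [round(v, 2) for v in ci_x], round(var_x, 2))
print(round(mean_y, 2), [round(v, 2) for v in ci_y], round(var_y, 2))
```

Comparing these numbers with the posterior summaries, and looking at the shape of the posterior densities of s2[1] and s2[2], is a good starting point for the question above.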

Some Additional Analyses


Perhaps we are also interested in:
1) The difference of the population means.
2) The ratio of the two variance components.
In standard non-Bayesian analysis, the confidence intervals for these estimates can be
quite tricky. However, in a Bayesian analysis we simply add the following two lines in
the model section of our WinBUGS code.
mu.diff <- mu[1] - mu[2]
var.ratio <- s2[1]/s2[2]
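
The reason this works is that any function of the parameters can be evaluated draw by draw. The Python sketch below illustrates this with a hand-written Gibbs sampler for the same model and priors; it is a simplified stand-in and not WinBUGS's actual implementation, and the function names, seed, burn-in and number of iterations are arbitrary choices for this example. Because the runs are random, its summaries will be close to, but not identical with, the WinBUGS output.

```python
import random
import statistics

x = [6.62, 6.71, 5.07, 4.39, 5.68, 3.94, 5.83, 2.31, 3.60, 4.64,
     1.79, 3.12, 3.46, 8.25, 5.49, 6.49, 2.65, 9.14, 5.31, 6.58]
y = [9.06, 7.00, 8.59, 8.70, 8.64, 8.03, 9.27, 6.01, 7.92, 6.20,
     6.39, 9.10, 7.63, 6.75, 8.88, 8.44, 8.95, 5.66, 9.78, 8.09]


def gibbs_normal(data, n_iter, rng, prior_prec=0.0001, a=0.001, b=0.001):
    """Gibbs sampler for data ~ Normal(mu, 1/prec) with
    mu ~ Normal(0, 1/prior_prec) and prec ~ Gamma(shape a, rate b).
    Returns lists of draws of mu and of the variance 1/prec."""
    n, total = len(data), sum(data)
    mu, prec = 0.0, 1.0  # same initial values as the lab's inits
    mu_draws, s2_draws = [], []
    for _ in range(n_iter):
        # mu | prec, data is Normal with precision n*prec + prior_prec
        post_prec = n * prec + prior_prec
        mu = rng.gauss(prec * total / post_prec, post_prec ** -0.5)
        # prec | mu, data is Gamma(a + n/2, rate b + SS/2)
        ss = sum((d - mu) ** 2 for d in data)
        prec = rng.gammavariate(a + n / 2, 1.0 / (b + ss / 2))  # second argument is a scale
        mu_draws.append(mu)
        s2_draws.append(1.0 / prec)
    return mu_draws, s2_draws


rng = random.Random(2008)
burn_in = 1000
mu1, s2_1 = gibbs_normal(x, 11000, rng)
mu2, s2_2 = gibbs_normal(y, 11000, rng)

# Derived quantities are computed draw by draw, like mu.diff and var.ratio in the model
mu_diff = [m1 - m2 for m1, m2 in zip(mu1[burn_in:], mu2[burn_in:])]
var_ratio = [v1 / v2 for v1, v2 in zip(s2_1[burn_in:], s2_2[burn_in:])]


def summary(draws):
    """Posterior mean, median and the middle 95% of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    return (statistics.mean(ordered), statistics.median(ordered),
            ordered[int(0.025 * n)], ordered[int(0.975 * n)])


diff_mean, diff_median, diff_lo, diff_hi = summary(mu_diff)
ratio_mean, ratio_median, ratio_lo, ratio_hi = summary(var_ratio)
print("mu.diff  ", round(diff_mean, 2), round(diff_lo, 2), round(diff_hi, 2))
print("var.ratio", round(ratio_median, 2), round(ratio_lo, 2), round(ratio_hi, 2))
```

No extra theory is needed for the interval of the difference or the ratio: each is read off the sorted draws in exactly the same way as for the original parameters.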

[Figure: density plots of mu.diff (sample: 10800) and var.ratio (sample: 10000)]

node        mean     sd       MC error   2.5%     median   97.5%    start   sample
mu.diff     -2.909   0.5467   0.004712   -3.965   -2.913   -1.811   9201    10800
var.ratio   2.847    1.442    0.01527    1.013    2.555    6.47     10001   10000

We conclude the difference between $\mu_1$ and $\mu_2$ is -2.91 with a 95% posterior interval of (-3.96, -1.81), and the ratio of $\sigma_1^2$ to $\sigma_2^2$ has a posterior median of 2.55 (95% PI:
1.01 to 6.47). Therefore we found evidence that $\mu_1$ is less than $\mu_2$ and $\sigma_1^2$ is greater than $\sigma_2^2$.

Stata has a package that allows you to run WinBUGS in Stata (link below). It still
requires you to write a WinBUGS program, but you can analyze the posterior samples in
Stata. R has similar libraries (BRugs and R2WinBUGS) that call WinBUGS or
OpenBUGS. JAGS is another MCMC program that you can call from R and is not based
on BUGS.

Run WinBUGS in Stata:


http://www2.le.ac.uk/departments/health-sciences/extranet/BGE/geneticepidemiology/gedownload/information/