Chapter 2: Inference for a Normal Population
This chapter shows how to make inferences for the mean and variance of a normal
population using a conjugate prior distribution. First we need the multi-parameter version
of Bayes Theorem.
Suppose that now the probability (density) function we used to describe the data depends
on many parameters, that is, f (x|θ) where θ = (θ1 , θ2 , . . . , θp )T . After observing the
data, the likelihood function for θ is f (x|θ). Prior beliefs about θ are represented through
a probability (density) function π(θ). Therefore, using Bayes Theorem, the posterior
probability (density) function for θ is
$$\pi(\theta \,|\, x) = \frac{\pi(\theta)\, f(x \,|\, \theta)}{f(x)}$$

where

$$f(x) = \begin{cases} \displaystyle\int_\Theta \pi(\theta)\, f(x \,|\, \theta)\, d\theta & \text{if } \theta \text{ is continuous,} \\[2ex] \displaystyle\sum_\Theta \pi(\theta)\, f(x \,|\, \theta) & \text{if } \theta \text{ is discrete.} \end{cases}$$
Example 2.1
If $X$ has a generalised $t_a(b, c)$ distribution (see page 101) then show that $Y = (X - b)/\sqrt{c} \sim t_a \equiv t_a(0, 1)$.
Recall the general result: if X is a random variable with probability density function fX (x)
and g is a bijective (1–1) function then the random variable Y = g(X) has probability
density function
$$f_Y(y) = f_X\{g^{-1}(y)\} \left| \frac{d}{dy}\, g^{-1}(y) \right|. \tag{2.1}$$
Solution
Here we take $Y = g(X) = (X - b)/\sqrt{c}$, from which we obtain $X = g^{-1}(Y) = b + \sqrt{c}\, Y$. Therefore, using (2.1), we have

$$\begin{aligned}
f_Y(y) = f_X\{g^{-1}(y)\} \left| \frac{d}{dy}\, g^{-1}(y) \right|
&= f_X(b + \sqrt{c}\, y) \times \sqrt{c} \\
&= \frac{\Gamma\left(\frac{a+1}{2}\right)}{\sqrt{ac\pi}\, \Gamma\left(\frac{a}{2}\right)} \left(1 + \frac{y^2}{a}\right)^{-\frac{a+1}{2}} \times \sqrt{c}, \quad y \in \mathbb{R} \\
&= \frac{\Gamma\left(\frac{a+1}{2}\right)}{\sqrt{a\pi}\, \Gamma\left(\frac{a}{2}\right)} \left(1 + \frac{y^2}{a}\right)^{-\frac{a+1}{2}}, \quad y \in \mathbb{R}.
\end{aligned}$$

This is the $t_a$ density and so $Y = (X - b)/\sqrt{c} \sim t_a$.
Comment
Values for the density function fY (y ) and the distribution function FY (y ) can be obtained
by using the R functions dgt and pgt in the package nclbayes.
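Outside R, the same density and distribution function values can be reproduced with scipy's standard t distribution via the location–scale relationship just derived. The sketch below is illustrative only: the function names simply mirror dgt and pgt (whose argument order is taken from how they are used later in these notes) and are not part of nclbayes.

```python
import numpy as np
from scipy import stats

def dgt(x, a, b, c):
    """Density of the generalised t_a(b, c) distribution.

    Uses the location-scale result: if X ~ t_a(b, c) then
    (X - b)/sqrt(c) ~ t_a, so scipy's t with loc=b and
    scale=sqrt(c) matches the generalised density.
    """
    return stats.t.pdf(x, df=a, loc=b, scale=np.sqrt(c))

def pgt(x, a, b, c):
    """Distribution function of the generalised t_a(b, c) distribution."""
    return stats.t.cdf(x, df=a, loc=b, scale=np.sqrt(c))

# The generalised distribution reduces to the standard t_a when b = 0, c = 1
assert np.isclose(dgt(0.7, 5, 0, 1), stats.t.pdf(0.7, df=5))
# By symmetry, half the probability lies below the location parameter b
assert np.isclose(pgt(5.41, 5, 5.41, 0.16), 0.5)
```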
It is clear that ta (0, 1) ≡ ta by examining their densities. Therefore, it makes sense
to think of the ta distribution as the standard ta –distribution and make all calculations
for the generalised ta (b, c) distribution from this standard distribution. The relationship
between this standard and generalised version of the t-distribution is directly analogous
to that between the standard normal N(0, 1) distribution and its more general version:
the N(b, c) distribution. In both cases the relationship is one of location and scale:
$$Y \sim N(b, c) \implies \frac{Y - b}{\sqrt{c}} \sim N(0, 1), \qquad\qquad Y \sim t_a(b, c) \implies \frac{Y - b}{\sqrt{c}} \sim t_a.$$
2.2 Prior to Posterior Analysis
Suppose we have a random sample from a normal distribution in which both the mean µ
and the precision τ are unknown, that is, Xi |µ, τ ∼ N(µ, 1/τ ), i = 1, 2, . . . , n (indepen-
dent). We shall adopt a (joint) prior distribution for µ and τ for which
$$\mu \,|\, \tau \sim N\left(b, \frac{1}{c\tau}\right) \quad \text{and} \quad \tau \sim Ga(g, h)$$

for known values $b$, $c$, $g$ and $h$. This is the normal-gamma $NGa(b, c, g, h)$ distribution, with density function

$$\pi(\mu, \tau) \propto \tau^{g - \frac{1}{2}} \exp\left[ -\frac{\tau}{2} \left\{ c(\mu - b)^2 + 2h \right\} \right], \quad \mu \in \mathbb{R},\ \tau > 0. \tag{2.2}$$
Solution
Notice that this posterior density is of the same form as the prior density (2.2). Therefore,
we can conclude that the posterior distribution is
$$(\mu, \tau)^T \,|\, x \sim NGa(B, C, G, H).$$
Suppose (µ, τ )T ∼ NGa(b, c, g, h). From the definition of the NGa distribution we know
that $\tau \sim Ga(g, h)$. This also means that $\sigma = 1/\sqrt{\tau} \sim \text{Inv-Chi}(g, h)$; see page 101.
The (marginal) density for $\mu$ is, for $\mu \in \mathbb{R}$,

$$\pi(\mu) = \int_0^\infty \pi(\mu, \tau)\, d\tau \propto \int_0^\infty \tau^{g - \frac{1}{2}} \exp\left[ -\frac{\tau}{2} \left\{ c(\mu - b)^2 + 2h \right\} \right] d\tau.$$

Now, as the integral of a gamma density over its entire range is one, we have

$$\int_0^\infty \frac{b^a \theta^{a-1} e^{-b\theta}}{\Gamma(a)}\, d\theta = 1 \implies \int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \frac{\Gamma(a)}{b^a}.$$

Therefore, for $\mu \in \mathbb{R}$,

$$\begin{aligned}
\pi(\mu) &\propto \int_0^\infty \tau^{g + \frac{1}{2} - 1} \exp\left[ -\frac{\tau}{2} \left\{ c(\mu - b)^2 + 2h \right\} \right] d\tau \\
&\propto \frac{\Gamma\left(g + \frac{1}{2}\right)}{\left[ \left\{ c(\mu - b)^2 + 2h \right\}/2 \right]^{g + \frac{1}{2}}} \\
&\propto \left\{ c(\mu - b)^2 + 2h \right\}^{-g - \frac{1}{2}} \\
&\propto \left( 1 + \frac{c(\mu - b)^2}{2h} \right)^{-\frac{2g+1}{2}}.
\end{aligned}$$
Comparing this density with that of the generalised t–distribution (on page 101) gives
$$\mu \sim t_{2g}\left(b, \frac{h}{gc}\right). \tag{2.4}$$
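This marginal result can be checked numerically. The Python sketch below (an independent illustration; nothing here comes from the notes or nclbayes) simulates from the $NGa(b, c, g, h)$ distribution by drawing $\tau \sim Ga(g, h)$ and then $\mu \,|\, \tau \sim N\{b, 1/(c\tau)\}$, and compares the empirical distribution of $\mu$ with the claimed $t_{2g}\{b, h/(gc)\}$ marginal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
b, c, g, h = 5.41, 0.25, 2.5, 0.1   # the prior parameters used in Example 2.2

# Simulate from NGa(b, c, g, h): tau ~ Ga(g, h) (rate h, so scale 1/h),
# then mu | tau ~ N(b, 1/(c*tau))
n = 400_000
tau = rng.gamma(shape=g, scale=1/h, size=n)
mu = rng.normal(loc=b, scale=1/np.sqrt(c*tau))

# Marginally mu ~ t_{2g}{b, h/(gc)}: standardise and compare the
# empirical distribution with the standard t_{2g} cdf
z = (mu - b)/np.sqrt(h/(g*c))
for q in (-1.0, 0.0, 1.0):
    assert abs(np.mean(z <= q) - stats.t.cdf(q, df=2*g)) < 0.01
```

A Monte Carlo check like this is a useful sanity test whenever a marginal distribution has been derived by integrating out a parameter.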
In summary, the prior $(\mu, \tau)^T \sim NGa(b, c, g, h)$ has marginal distributions

• $\mu \sim t_{2g}\{b, h/(gc)\}$
• $\tau \sim Ga(g, h)$

Also $\sigma = 1/\sqrt{\tau} \sim \text{Inv-Chi}(g, h)$.
The posterior $(\mu, \tau)^T \,|\, x \sim NGa(B, C, G, H)$ has marginal distributions

• $\mu \,|\, x \sim t_{2G}\{B, H/(GC)\}$
• $\tau \,|\, x \sim Ga(G, H)$
The relationships between the prior and posterior variance of µ and mean and variance
of τ and of σ are rather more complex.
Example 2.2
Recall Example 1.4 on the earth’s density. Previously we assumed that the measurements
followed a N(µ, 0.22 ) distribution, that is, the standard deviation of the measurements
was known to be 0.2 g/cm3 . Now we consider the case where this standard deviation is
unknown and determine posterior distributions using the theory in section 2.2.
Before we can proceed, we must specify the parameters in the NGa(b, c, g, h) prior distri-
bution for (µ, τ ). In the previous analysis, we assumed that the population measurement
precision was τ = 1/0.22 = 25 and assumed a N(5.41, 0.42 ) prior distribution for the
population mean, that is, µ|τ = 25 ∼ N(5.41, 0.42 ).
Choice of b and c: the conditional prior distribution for µ is µ|τ ∼ N{b, 1/(cτ )} and so
matching the prior distributions for µ (when τ = 25) gives b = 5.41 and c = 0.25.
Choice of g and h: the marginal prior distribution for $\tau$ is $\tau \sim Ga(g, h)$. Previously, we assumed $\tau = 25$ (with $Var(\tau) = 0$) and so take this value as the prior mean: $E(\tau) = 25$. Suppose we also decide that $Var(\tau) = 250$. These two requirements give $g = 2.5$ and $h = 0.1$. Therefore, we will assume the prior distribution
$$(\mu, \tau)^T \sim NGa(5.41, 0.25, 2.5, 0.1).$$
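The moment-matching step for $g$ and $h$ can be written out explicitly: for $\tau \sim Ga(g, h)$ we have $E(\tau) = g/h$ and $Var(\tau) = g/h^2$, so $h = E(\tau)/Var(\tau)$ and $g = E(\tau)^2/Var(\tau)$. A minimal sketch (the function name is mine, not from the notes):

```python
def gamma_params(mean, var):
    """Match a Ga(g, h) distribution to a given mean and variance.

    For tau ~ Ga(g, h): E(tau) = g/h and Var(tau) = g/h^2,
    so h = mean/var and g = mean^2/var.
    """
    h = mean / var
    g = mean**2 / var
    return g, h

# The choices in Example 2.2: E(tau) = 25 and Var(tau) = 250
g, h = gamma_params(25, 250)
assert abs(g - 2.5) < 1e-12 and abs(h - 0.1) < 1e-12
```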
We have seen that if $(\mu, \tau)^T \sim NGa(b, c, g, h)$ then the marginal distribution of $\mu$ is $\mu \sim t_{2g}\{b, h/(gc)\}$. Therefore, with this choice of prior distribution, the marginal prior distribution for $\mu$ is

$$\mu \sim t_5(5.41, 0.16).$$
Figure 2.1 shows the close match between the new (marginal) prior distribution for µ and
that used previously.
Figure 2.1: Marginal prior density for µ: new version (solid) and previous version (dashed)
Determine the posterior distribution for (µ, τ )T . Also determine the marginal prior dis-
tribution for τ and for σ, and the marginal posterior distribution for each of µ, τ and σ.
Solution
We can combine the information in the NGa(5.41, 0.25, 2.5, 0.1) prior distribution
for $(\mu, \tau)^T$ with that in the data ($n = 23$, $\bar{x} = 5.4848$, $s = 0.1882$) using the results in Section 2.2.
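The prior-to-posterior update can be sketched numerically. Two assumptions to flag: the update formulas below are the standard conjugate results for the NGa prior (the notes' equation (2.3), which is not reproduced in this extract), and $s = 0.1882$ is treated as the divisor-$n$ standard deviation, since that choice reproduces the posterior marginal $t_{28}(5.484, 0.001561)$ quoted below.

```python
def nga_update(b, c, g, h, n, xbar, sumsq):
    """Conjugate update for a NGa(b, c, g, h) prior with normal data.

    sumsq is sum((x_i - xbar)^2).  These are the standard textbook
    formulas (equation (2.3) in the notes, not shown in this extract).
    """
    B = (c*b + n*xbar) / (c + n)
    C = c + n
    G = g + n/2
    H = h + sumsq/2 + c*n*(xbar - b)**2 / (2*(c + n))
    return B, C, G, H

# Example 2.2 data; sum of squares taken as n*s**2 (divisor-n assumption)
n, xbar, s = 23, 5.4848, 0.1882
B, C, G, H = nga_update(5.41, 0.25, 2.5, 0.1, n, xbar, n*s**2)

assert abs(B - 5.484) < 1e-3 and G == 14
assert abs(H/(G*C) - 0.001561) < 1e-5   # scale of the marginal t for mu
```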
Plots of the (marginal) prior and posterior distributions of µ, τ and σ are given in Fig-
ure 2.2. Note that the (marginal) prior and posterior distributions for σ can be determined
from that of τ . We can also examine the joint prior and posterior distributions for (µ, τ )T
via the contour plots of their densities to see if there is any change in the dependence
structure; see Figure 2.3. This figure is produced by using the R command NGacontour
in the nclbayes package as follows:
mu=seq(4.5,6.5,len=1000)
tau=seq(0,71,len=1000)
NGacontour(mu,tau,b,c,g,h,lty=3)
NGacontour(mu,tau,B,C,G,H,add=TRUE)
in which the variables b,c,g,h,B,C,G,H have already been set to their prior/posterior
values. A careful look at the values of the contour levels plotted shows that the highest
Figure 2.2: Prior (dashed) and posterior (solid) densities for µ, τ and σ
contour level plotted for the prior density is 0.024 and the lowest level for the posterior
density is 0.05. From this we can conclude that the posterior distribution is far more
concentrated than the prior distribution. Also the contours for the posterior distribution
are much more elliptical than those for the prior distribution. This indicates a change
in the dependence structure. However, the main changes shown by the figure are in the
mean and variability of µ and τ .
Wikipedia tells us that the actual mean density of the earth is 5.515 g/cm3 . We can
determine the (posterior) probability that the mean density is within 0.1 of this value as
follows. We already know that µ|x ∼ t28 (5.484, 0.001561) and so we can calculate
using pgt(5.615,28,5.484,0.001561)-pgt(5.415,28,5.484,0.001561).
Without the data, the only basis for determining the earth’s density is via the prior
distribution. Here the prior distribution is µ ∼ t5 (5.41, 0.16) and so the (prior) probability
that the mean density is within 0.1 of the (now known) true value can be calculated using pgt(5.615,5,5.41,0.16)-pgt(5.415,5,5.41,0.16).
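Both probabilities can be sketched in Python as well (a scipy analogue of the pgt calls above, not the nclbayes functions themselves):

```python
import numpy as np
from scipy import stats

def pgt(x, a, b, c):
    """cdf of the generalised t_a(b, c); (X - b)/sqrt(c) ~ t_a."""
    return stats.t.cdf(x, df=a, loc=b, scale=np.sqrt(c))

# Posterior probability that mu lies within 0.1 of 5.515,
# with mu|x ~ t_28(5.484, 0.001561)
post = pgt(5.615, 28, 5.484, 0.001561) - pgt(5.415, 28, 5.484, 0.001561)

# The corresponding prior probability, with mu ~ t_5(5.41, 0.16)
prior = pgt(5.615, 5, 5.41, 0.16) - pgt(5.415, 5, 5.41, 0.16)

# The data sharpen the inference considerably
assert prior < post
```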
Figure 2.3: Contour plot of the prior (dashed) and posterior (solid) densities for (µ, τ )T .
2.3 Confidence Intervals and Regions

Example 2.3
Determine the 100(1 − α)% highest density interval (HDI) for the population mean µ in
terms of quantiles of the standard t-distribution.
Solution
The marginal posterior distribution is $\mu \,|\, x \sim t_{2G}\{B, H/(GC)\}$. This is a symmetric distribution and so the HDI is an equi-tailed interval. Therefore the HDI $(\ell, u)$ for $\mu$ must satisfy

$$Pr(\mu < \ell \,|\, x) = \alpha/2 \quad \text{and} \quad Pr(\mu > u \,|\, x) = \alpha/2.$$

Also

$$\frac{\mu - B}{\sqrt{H/(GC)}} \sim t_{2G}$$
and so

$$Pr(\mu > u \,|\, x) = \alpha/2 \;\Rightarrow\; Pr\left( \frac{\mu - B}{\sqrt{H/(GC)}} > \frac{u - B}{\sqrt{H/(GC)}} \,\middle|\, x \right) = \alpha/2 \;\Rightarrow\; \frac{u - B}{\sqrt{H/(GC)}} = t_{2G;\alpha/2}$$
where $t_{2G;p}$ is the upper $p$ point of the $t_{2G}$ distribution. Therefore

$$u = B + t_{2G;\alpha/2} \sqrt{\frac{H}{GC}}.$$

Similar calculations give

$$\ell = B + t_{2G;1-\alpha/2} \sqrt{\frac{H}{GC}} = B - t_{2G;\alpha/2} \sqrt{\frac{H}{GC}}$$

since the $t$ distribution is symmetric about zero. Thus the $100(1-\alpha)\%$ HDI for $\mu$ is

$$\left( B - t_{2G;\alpha/2} \sqrt{\frac{H}{GC}},\; B + t_{2G;\alpha/2} \sqrt{\frac{H}{GC}} \right).$$
These intervals can be calculated easily using the R function qgt in the package nclbayes.
For example, the prior and posterior 95% HDIs for µ can be calculated using
c(qgt(0.025,2*g,b,h/(g*c)),qgt(0.975,2*g,b,h/(g*c)))
c(qgt(0.025,2*G,B,H/(G*C)),qgt(0.975,2*G,B,H/(G*C)))
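A Python analogue of these qgt calls (using scipy's t quantiles; the numerical parameters are those of Example 2.2) reproduces the intervals for µ reported in Table 2.1:

```python
import numpy as np
from scipy import stats

def qgt(p, a, b, c):
    """Quantile function of the generalised t_a(b, c) distribution,
    mirroring R's qgt: (X - b)/sqrt(c) ~ t_a."""
    return stats.t.ppf(p, df=a, loc=b, scale=np.sqrt(c))

# Prior: mu ~ t_5(5.41, 0.16); posterior: mu|x ~ t_28(5.484, 0.001561)
prior_hdi = (qgt(0.025, 5, 5.41, 0.16), qgt(0.975, 5, 5.41, 0.16))
post_hdi = (qgt(0.025, 28, 5.484, 0.001561), qgt(0.975, 28, 5.484, 0.001561))

# These reproduce the mu rows of Table 2.1
assert np.allclose(prior_hdi, (4.3818, 6.4382), atol=1e-3)
assert np.allclose(post_hdi, (5.4031, 5.5649), atol=1e-3)
```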
Determining a highest density interval (HDI) for the population precision τ or standard
deviation σ is more complicated as their posterior distributions are not symmetric. The
(marginal) posterior for τ is τ |x ∼ Ga(G, H) and the (marginal) posterior for σ is σ|x ∼
Inv-Chi(G, H). HDIs can be found by using the R functions hdiGamma and hdiInvchi
in the package nclbayes. More standard equi-tailed confidence intervals can be found
using the functions qgamma and qinvchi.
For example, the prior and posterior 95% HDIs for τ can be calculated using R com-
mands hdiGamma(0.95,g,h) and hdiGamma(0.95,G,H), and those for σ using com-
mands hdiInvchi(0.95,g,h) and hdiInvchi(0.95,G,H). The 95% equi-tailed confi-
dence intervals are calculated in a similar way to the HDIs for µ above. So for τ , the
prior and posterior intervals are calculated using
c(qgamma(0.025,g,h),qgamma(0.975,g,h))
c(qgamma(0.025,G,H),qgamma(0.975,G,H))
c(qinvchi(0.025,g,h),qinvchi(0.975,g,h))
c(qinvchi(0.025,G,H),qinvchi(0.975,G,H))
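The gamma-based intervals can likewise be checked with scipy. There is no direct scipy counterpart to qinvchi, so the sketch below converts gamma quantiles for $\tau$ into quantiles for $\sigma = 1/\sqrt{\tau}$; this conversion is my own, assumed from the Inv-Chi relationship stated earlier (the map is decreasing, so the endpoints swap).

```python
from scipy import stats

g, h = 2.5, 0.1   # prior parameters from Example 2.2

# Equi-tailed 95% interval for tau ~ Ga(g, h); scipy parameterises the
# gamma by shape and scale = 1/rate
tau_lo = stats.gamma.ppf(0.025, a=g, scale=1/h)
tau_hi = stats.gamma.ppf(0.975, a=g, scale=1/h)

# sigma = 1/sqrt(tau) ~ Inv-Chi(g, h); decreasing map, endpoints swap
sigma_lo, sigma_hi = 1/tau_hi**0.5, 1/tau_lo**0.5

# These match the prior equi-tailed rows of Table 2.1
assert abs(tau_lo - 4.1561) < 1e-2 and abs(tau_hi - 64.1625) < 1e-2
assert abs(sigma_lo - 0.1248) < 1e-3 and abs(sigma_hi - 0.4905) < 1e-3
```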
                          Prior                 Posterior
µ  HDI (= equi-tailed)    (4.3818, 6.4382)      (5.4031, 5.5649)
τ  HDI                    (1.4812, 55.9573)     (14.0193, 42.2530)
   equi-tailed            (4.1561, 64.1625)     (15.0674, 43.7625)
σ  HDI                    (0.1062, 0.4246)      (0.1466, 0.2505)
   equi-tailed            (0.1248, 0.4905)      (0.1512, 0.2576)
Table 2.1: Prior and posterior 95% intervals for the analysis in Example 2.2
The numerical values for the prior and posterior 95% intervals for the analysis in Exam-
ple 2.2 are given in Table 2.1. Notice that there is little difference between the posterior
HDI and equi-tailed intervals for τ and for σ, whereas the prior intervals are fairly differ-
ent. This is because the prior distributions are quite skewed but the posterior distributions
are fairly symmetric; see Figure 2.2.
In Bayesian inference it can also be useful to determine (joint) confidence regions for several parameters, in this case, for $(\mu, \tau)^T$. In general this is a difficult problem to solve mathematically, and that is true here.
Example 2.4
Solution
mu=seq(3.5,7.5,len=1000)
tau=seq(0,80,len=1000)
NGacontour(mu,tau,b,c,g,h,p=c(0.95,0.9,0.8),lty=3)
NGacontour(mu,tau,B,C,G,H,p=c(0.95,0.9,0.8),add=TRUE)
produces a plot containing the 95%, 90% and 80% prior and posterior confidence regions
for (µ, τ )T for the prior and posterior distributions in Example 2.2; see Figure 2.4. The
upper plot shows contours of both prior and posterior densities. The numbers within
the plot are the contour levels. The largest prior confidence region is the 95% region.
The next largest is the 90% prior confidence region and the smallest is the 80% prior
confidence region. The same ordering holds for the posterior confidence regions. The
posterior contours are so concentrated in the middle of the plot that there is no room to
put in the contour levels. However, these can be seen on the lower plot which also shows
the contours but focuses the parameter range to highlight the contours of the posterior
density. The values of the contours in this lower plot show that the posterior density is
much more peaked, that is, the posterior has a much reduced variability. The location
of the centre of the central contour for both the prior and posterior densities shows that
there has been little change in the mean/mode.
2.4 Predictive Distribution

Suppose we sample another value y randomly from the population. What values is it
likely to take? This is described by its predictive distribution. We can determine this
distribution by using the definition of the predictive density
$$f(y \,|\, x) = \iint f(y \,|\, \mu, \tau)\, \pi(\mu, \tau \,|\, x)\, d\mu\, d\tau$$
or by using Candidate's formula (as this is a conjugate analysis). However, for this model/prior, there is a more straightforward method to determine the predictive distribution.
As Y is a random value from the population, we have that Y |µ, τ ∼ N(µ, 1/τ ). We also
know that the posterior distribution is (µ, τ )T |x ∼ NGa(B, C, G, H). Therefore, we can
write
$$Y = \mu + \varepsilon,$$

where

$$\varepsilon \,|\, \tau \sim N(0, 1/\tau) \quad \text{and} \quad \mu \,|\, x, \tau \sim N\left(B, \frac{1}{C\tau}\right).$$

Hence, given $\tau$, $Y$ is the sum of two independent normal random quantities, and so

$$Y \,|\, x, \tau \sim N\left(B, \frac{1}{\tau} + \frac{1}{C\tau}\right) \equiv N\left(B, \frac{C+1}{C\tau}\right).$$
Figure 2.4: 95%, 90% and 80% prior (dashed) and posterior (solid) confidence regions
for (µ, τ )T
Thus, as $\tau \,|\, x \sim Ga(G, H)$,

$$(Y, \tau)^T \,|\, x \sim NGa\left(B, \frac{C}{C+1}, G, H\right)$$

and so, by the marginal result (2.4), $Y \,|\, x \sim t_{2G}\{B, H(C+1)/(GC)\}$.
We can determine 100(1 − α)% predictive intervals by noting that the predictive distri-
bution is symmetric about its mean and therefore the HDI is
$$\left( B - t_{2G;\alpha/2} \sqrt{\frac{H(C+1)}{GC}},\; B + t_{2G;\alpha/2} \sqrt{\frac{H(C+1)}{GC}} \right).$$
These predictive intervals can be calculated easily using the R function qgt. For example,
in Example 2.2, the prior and posterior predictive HDIs for a new value Y from the
population are (4.2604, 6.5596) and (5.0855, 5.8825) respectively, calculated using
c(qgt(0.025,2*g,b,h*(c+1)/(g*c)),qgt(0.975,2*g,b,h*(c+1)/(g*c)))
c(qgt(0.025,2*G,B,H*(C+1)/(G*C)),qgt(0.975,2*G,B,H*(C+1)/(G*C)))
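A scipy version of the posterior predictive calculation reproduces the quoted HDI. The posterior parameter values used here are the ones implied by Example 2.2 (rounded; they give the quoted marginal $t_{28}(5.484, 0.001561)$), which is an assumption since (2.3) is not shown in this extract.

```python
import numpy as np
from scipy import stats

def qgt(p, a, b, c):
    """Quantile function of the generalised t_a(b, c) distribution,
    mirroring R's qgt: (X - b)/sqrt(c) ~ t_a."""
    return stats.t.ppf(p, df=a, loc=b, scale=np.sqrt(c))

# Posterior parameters for Example 2.2 (rounded values)
B, C, G, H = 5.4844, 23.25, 14, 0.50801
scale = H*(C + 1)/(G*C)   # predictive scale H(C+1)/(GC)

pred_hdi = (qgt(0.025, 2*G, B, scale), qgt(0.975, 2*G, B, scale))
# Matches the quoted posterior predictive HDI (5.0855, 5.8825)
assert np.allclose(pred_hdi, (5.0855, 5.8825), atol=2e-3)
```

Note how much wider the predictive interval is than the posterior HDI for µ itself: the predictive scale includes the extra factor (C + 1), reflecting the sampling variability of the new observation.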
2.5 Summary
(ii) The posterior distribution is (µ, τ )T |x ∼ NGa(B, C, G, H) where the posterior pa-
rameters are given by (2.3).
(iii) The marginal prior distributions are $\mu \sim t_{2g}\{b, h/(gc)\}$, $\tau \sim Ga(g, h)$ and $\sigma = 1/\sqrt{\tau} \sim \text{Inv-Chi}(g, h)$.
(iv) The marginal posterior distributions are µ|x ∼ t2G {B, H/(GC)}, τ |x ∼ Ga(G, H),
σ|x ∼ Inv-Chi(G, H).
(v) Prior and posterior means and standard deviations for µ, τ and σ can be calculated
from the properties of the t, Gamma and Inv-Chi distributions.
(vi) Prior and posterior probabilities and densities for µ, τ and σ can be calculated using
the R functions pgt, dgt, pgamma, dgamma, pinvchi, dinvchi.
(vii) HDIs or equi-tailed CIs for µ, τ and σ can be calculated using qgt, hdiGamma,
hdiInvchi, qgamma, qinvchi.
(viii) Contour plots of the prior and posterior densities for (µ, τ )T can be plotted using
the NGacontour function.
(ix) Prior and posterior confidence regions for (µ, τ )T can be plotted using the NGacontour
function.
(x) The predictive distribution for a new observation Y from the population is Y |x ∼
t2G {B, H(C + 1)/(GC)} and its HDI can be calculated using the qgt function.
2.6 Why Do We Have So Many Different Distributions?
So far we have used many distributions, some you will have met before and some will be
new. After a while the variety and sheer number of different distributions can become
overwhelming. Why do we need so many distributions and why do we name so many of
them?
Statistics studies the random variation in experiments, samples and processes. The variety
of applications leads to their randomness being described by many different distributions.
In many applications, bespoke distributions will need to be formulated. However, some
distributions come up time and time again for modelling random variation in data and
for describing prior beliefs. It is helpful for us to be able to refer to these distributions –
and so we give each one a name – and also to be able to quote known results for these
distributions such as their mean and variance. In this chapter you have been introduced
to a generalisation of the t-distribution and the inverse chi distribution, and we have been
able to use results for their mean and variance to study prior and posterior distributions
and have been able to plot these distributions using functions in the R package nclbayes.
You will meet several other new distributions in the remainder of the module. You won’t
be surprised to hear that it is useful to have a working knowledge of each of these
distributions but perhaps not vital to remember all their properties listed in these notes.
To help in this regard, the exam paper will contain a list of all the distributions used in
the exam, together with their density (or probability function) and any useful results such
as their mean and variance (as needed for the exam); see the specimen exam paper at
the back of this booklet.
• determine the predictive distribution of another value from the population, and its
predictive interval
• determine the predictive distribution of the mean of another random sample from
the population
both in general and for a particular prior and data set. Also you should be able to:
• appreciate the benefit of naming distributions and for having lists of properties for
these distributions