Lecture Material 2.5 - Bayesian Estimation & Concepts

Chapter 18

Bayesian Statistics
18.1 Bayesian Concepts
The classical methods of estimation that we have studied in this text are based
solely on information provided by the random sample. These methods essentially
interpret probabilities as relative frequencies. For example, in arriving at a 95%
confidence interval for μ, we interpret the statement

P (−1.96 < Z < 1.96) = 0.95

to mean that 95% of the time in repeated experiments Z will fall between −1.96
and 1.96. Since
Z = (X̄ − μ)/(σ/√n)

for a normal sample with known variance, the probability statement here means
that 95% of the random intervals (X̄ − 1.96σ/√n, X̄ + 1.96σ/√n) contain the true
mean μ. Another approach to statistical methods of estimation is called Bayesian
methodology. The main idea of the method comes from Bayes’ rule, described
in Section 2.7. The key difference between the Bayesian approach and the classical
or frequentist approach is that in Bayesian concepts, the parameters are viewed as
random variables.

Subjective Probability
Subjective probability is the foundation of Bayesian concepts. In Chapter 2, we
discussed two possible approaches to probability, namely the relative frequency and
the indifference approaches. The first one determines a probability as a consequence
of repeated experiments. For instance, to decide the free-throw percentage of a
basketball player, we can record the number of shots made and the total number
of attempts this player has made. The probability of hitting a free-throw for this
player can be calculated as the ratio of these two numbers. On the other hand,
if we have no knowledge of any bias in a die, the probability that a 3 will appear
in the next throw will be 1/6. Such an approach to probability interpretation is
based on the indifference rule.


However, in many situations, the preceding probability interpretations cannot


be applied. For instance, consider the questions “What is the probability that
it will rain tomorrow?” “How likely is it that this stock will go up by the end
of the month?” and “What is the likelihood that two companies will be merged
together?” They can hardly be interpreted by the aforementioned approaches, and
the answers to these questions may be different for different people. Yet these
questions are constantly asked in daily life, and the approach used to explain these
probabilities is called subjective probability, which reflects one’s subjective opinion.

Conditional Perspective
Recall that in Chapters 9 through 17, all statistical inferences were based on the
fact that the parameters are unknown but fixed quantities, apart from those in
Section 9.14, in which the parameters were treated as variables and the maximum
likelihood estimates (MLEs) were calculated conditioning on the observed sample
data. In Bayesian statistics, not only are the parameters treated as variables as in
MLE calculation, but also they are treated as random.
Because the observed data are the only experimental results for the practitioner,
statistical inference is based on the actual observed data from a given experiment.
Such a view is called a conditional perspective. Furthermore, in Bayesian concepts,
since the parameters are treated as random, a probability distribution can be
specified, generally by using the subjective probability for the parameter. Such a
distribution is called a prior distribution and it usually reflects the experimenter’s
prior belief about the parameter. In the Bayesian perspective, once an experiment
is conducted and data are observed, all knowledge about the parameter is contained
in the actual observed data and in the prior information.

Bayesian Applications
Although Bayes’ rule is credited to Thomas Bayes, Bayesian applications were
first introduced by French scientist Pierre Simon Laplace, who published a paper
on using Bayesian inference on the unknown binomial proportions (for binomial
distribution, see Section 5.2).
Since the introduction of the Markov chain Monte Carlo (MCMC) computa-
tional tools for Bayesian analysis in the early 1990s, Bayesian statistics has become
more and more popular in statistical modeling and data analysis. Meanwhile,
methodology developments using Bayesian concepts have progressed dramatically,
and they are applied in fields such as bioinformatics, biology, business, engineer-
ing, environmental and ecology science, life science and health, medicine, and many
others.

18.2 Bayesian Inferences


Consider the problem of finding a point estimate of the parameter θ for the pop-
ulation with distribution f (x| θ), given θ. Denote by π(θ) the prior distribution
of θ. Suppose that a random sample of size n, denoted by x = (x1 , x2 , . . . , xn ), is
observed.

Definition 18.1: The distribution of θ, given x, which is called the posterior distribution, is given
by

π(θ|x) = f(x|θ)π(θ) / g(x),

where g(x) is the marginal distribution of x.

The marginal distribution of x in the above definition can be calculated using
the following formula:

g(x) = Σ_θ f(x|θ)π(θ),                     θ is discrete,
g(x) = ∫_{−∞}^{∞} f(x|θ)π(θ) dθ,           θ is continuous.

Example 18.1: Assume that the prior distribution for the proportion of defectives produced by a
machine is
p 0.1 0.2
π(p) 0.6 0.4
Denote by x the number of defectives among a random sample of size 2. Find the
posterior probability distribution of p, given that x is observed.
Solution : The random variable X follows a binomial distribution
 
f(x|p) = b(x; 2, p) = C(2, x) p^x q^{2−x},   x = 0, 1, 2,

where C(2, x) denotes the binomial coefficient. The marginal distribution of x can
be calculated as

g(x) = f(x|0.1)π(0.1) + f(x|0.2)π(0.2)
     = C(2, x)[(0.1)^x (0.9)^{2−x} (0.6) + (0.2)^x (0.8)^{2−x} (0.4)].
Hence, for x = 0, 1, 2, we obtain the marginal probabilities as
x 0 1 2
g(x) 0.742 0.236 0.022
The posterior probability of p = 0.1, given x, is
π(0.1|x) = f(x|0.1)π(0.1) / g(x)
         = (0.1)^x (0.9)^{2−x} (0.6) / [(0.1)^x (0.9)^{2−x} (0.6) + (0.2)^x (0.8)^{2−x} (0.4)],

and π(0.2|x) = 1 − π(0.1|x).
Suppose that x = 0 is observed. Then

π(0.1|0) = f(0|0.1)π(0.1) / g(0) = (0.1)^0 (0.9)^2 (0.6) / 0.742 = 0.6550,
and π(0.2|0) = 0.3450. If x = 1 is observed, π(0.1|1) = 0.4576, and π(0.2|1) =
0.5424. Finally, π(0.1|2) = 0.2727, and π(0.2|2) = 0.7273.
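As a numerical aside (not part of the original text), the marginal and posterior
probabilities above can be reproduced with a short Python script that uses only the
standard library:

    # Sketch: discrete-prior posterior computation for Example 18.1.
    from math import comb

    prior = {0.1: 0.6, 0.2: 0.4}            # prior pi(p)
    n = 2                                    # sample size

    def likelihood(x, p, n=2):
        """Binomial likelihood b(x; n, p)."""
        return comb(n, x) * p**x * (1 - p)**(n - x)

    for x in range(n + 1):
        g = sum(likelihood(x, p) * w for p, w in prior.items())   # marginal g(x)
        posterior = {p: likelihood(x, p) * w / g for p, w in prior.items()}
        print(x, round(g, 3), {p: round(v, 4) for p, v in posterior.items()})
    # Matches the text: g(0) = 0.742 with pi(0.1|0) ≈ 0.655, and so on.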
The prior distribution for Example 18.1 is discrete, although the natural range
of p is from 0 to 1. Consider the following example, where we have a prior distri-
bution covering the whole space for p.

Example 18.2: Suppose that the prior distribution of p is uniform (i.e., π(p) = 1, for 0 < p <
1). Use the same random variable X as in Example 18.1 to find the posterior
distribution of p.
Solution : As in Example 18.1, we have
 
f(x|p) = b(x; 2, p) = C(2, x) p^x q^{2−x},   x = 0, 1, 2.

The marginal distribution of x can be calculated as

g(x) = ∫_0^1 f(x|p)π(p) dp = C(2, x) ∫_0^1 p^x (1 − p)^{2−x} dp.

The integral above can be evaluated at each x directly as g(0) = 1/3, g(1) = 1/3,
and g(2) = 1/3. Therefore, the posterior distribution of p, given x, is
π(p|x) = [C(2, x) p^x (1 − p)^{2−x}] / (1/3) = 3 C(2, x) p^x (1 − p)^{2−x},   0 < p < 1.

The posterior distribution above is actually a beta distribution (see Section 6.8)
with parameters α = x + 1 and β = 3 − x. So, if x = 0 is observed, the posterior
distribution of p is a beta distribution with parameters (1, 3). The posterior mean
is μ = 1/(1 + 3) = 1/4 and the posterior variance is σ² = (1)(3)/[(1 + 3)²(1 + 3 + 1)] = 3/80.
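As a quick numerical check (an aside assuming scipy is available), these moments
follow directly from the Beta(x + 1, 3 − x) form of the posterior:

    # Sketch: posterior moments of Example 18.2 for x = 0.
    from scipy.stats import beta

    x, n = 0, 2
    posterior = beta(x + 1, n - x + 1)       # Beta(1, 3) when x = 0
    print(posterior.mean())                   # 0.25   (= 1/4)
    print(posterior.var())                    # 0.0375 (= 3/80)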

Using the posterior distribution, we can estimate the parameter(s) in a popu-


lation in a straightforward fashion. In computing posterior distributions, it is very
helpful if one is familiar with the distributions in Chapters 5 and 6. Note that
in Definition 18.1, the variable in the posterior distribution is θ, while x is given.
Thus, we can treat g(x) as a constant as we calculate the posterior distribution of
θ. Then the posterior distribution can be expressed as

π(θ|x) ∝ f (x|θ)π(θ),

where the symbol “∝” stands for “is proportional to.” In calculating the posterior
distribution, we can drop the factors that do not depend on θ; they are absorbed into
the normalization constant, i.e., the marginal density g(x).

Example 18.3: Suppose that random variables X1 , . . . , Xn are independent and from a Poisson
distribution with mean λ. Assume that the prior distribution of λ is exponential
with mean 1. Find the posterior distribution of λ when x̄ = 3 with n = 10.
Solution : The density function of X = (X1 , . . . , Xn ) is

f(x|λ) = ∏_{i=1}^{n} e^{−λ} λ^{x_i} / x_i! = e^{−nλ} λ^{Σ_{i=1}^{n} x_i} / ∏_{i=1}^{n} x_i!,

and the prior distribution is

π(λ) = e^{−λ},   for λ > 0.



Hence, using Definition 18.1 we obtain the posterior distribution of λ as



π(λ|x) ∝ f(x|λ)π(λ) = [e^{−nλ} λ^{Σ x_i} / ∏ x_i!] e^{−λ} ∝ e^{−(n+1)λ} λ^{Σ x_i}.

Referring to the gamma distribution in Section 6.6, we conclude that the posterior
distribution of λ follows a gamma distribution with parameters 1 + Σ_{i=1}^{n} x_i and
1/(n + 1). Hence, the posterior mean and variance of λ are (Σ_{i=1}^{n} x_i + 1)/(n + 1)
and (Σ_{i=1}^{n} x_i + 1)/(n + 1)², respectively.
So, when x̄ = 3 with n = 10, we have Σ_{i=1}^{10} x_i = 30. Hence, the posterior
distribution of λ is a gamma distribution with parameters 31 and 1/11.
From Example 18.3 we observe that sometimes it is quite convenient to use
the “proportional to” technique in calculating the posterior distribution, especially
when the result can be formed to a commonly used distribution as described in
Chapters 5 and 6.
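As an illustration (not from the text, and assuming scipy is available), the
Poisson-gamma update of Example 18.3 can be written in a few lines. Note that the
exponential prior with mean 1 is a gamma distribution with shape 1 and scale 1, and
the book's second gamma parameter corresponds to scipy's scale:

    # Sketch: Poisson-gamma posterior of Example 18.3.
    from scipy.stats import gamma

    n, xbar = 10, 3.0
    sum_x = n * xbar                          # 30

    shape = 1 + sum_x                         # 31
    scale = 1.0 / (n + 1)                     # 1/11
    posterior = gamma(a=shape, scale=scale)
    print(posterior.mean())                   # 31/11  ≈ 2.818
    print(posterior.var())                    # 31/121 ≈ 0.256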

Point Estimation Using the Posterior Distribution


Once the posterior distribution is derived, we can easily use the summary of the
posterior distribution to make inferences on the population parameters. For in-
stance, the posterior mean, median, and mode can all be used to estimate the
parameter.

Example 18.4: Suppose that x = 1 is observed for Example 18.2. Find the posterior mean and
the posterior mode.
Solution : When x = 1, the posterior distribution of p can be expressed as

π(p|1) = 6p(1 − p), for 0 < p < 1.

To calculate the mean of this distribution, we need to find


∫_0^1 6p^2 (1 − p) dp = 6 (1/3 − 1/4) = 1/2.

To find the posterior mode, we need to obtain the value of p at which the posterior
distribution is maximized. Taking the derivative of π(p|1) with respect to p, we
obtain 6 − 12p. Solving 6 − 12p = 0 gives p = 1/2. The second derivative is
−12, which implies that the posterior mode is achieved at p = 1/2.
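As a numerical cross-check (an aside assuming scipy), the posterior mean can be read
from the Beta(2, 2) distribution and the mode found by direct maximization of the
posterior density:

    # Sketch: posterior mean and mode for Example 18.4.
    from scipy.optimize import minimize_scalar
    from scipy.stats import beta

    posterior = beta(2, 2)                    # pi(p|1) = 6p(1 - p)
    print(posterior.mean())                   # 0.5
    mode = minimize_scalar(lambda p: -posterior.pdf(p), bounds=(0, 1), method="bounded")
    print(mode.x)                             # ≈ 0.5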
Bayesian methods of estimation concerning the mean μ of a normal population
are based on the following example.

Example 18.5: If x̄ is the mean of a random sample of size n from a normal population with
known variance σ 2 , and the prior distribution of the population mean is a normal
distribution with known mean μ0 and known variance σ02 , then show that the
posterior distribution of the population mean is also a normal distribution with
mean μ∗ and standard deviation σ ∗ , where


μ* = [σ₀²/(σ₀² + σ²/n)] x̄ + [(σ²/n)/(σ₀² + σ²/n)] μ₀   and   σ* = √(σ₀²σ²/(nσ₀² + σ²)).
Solution : The density function of our sample is
f(x_1, x_2, . . . , x_n | μ) = [1/((2π)^{n/2} σ^n)] exp{ −(1/2) Σ_{i=1}^{n} [(x_i − μ)/σ]² },

for −∞ < xi < ∞ and i = 1, 2, . . . , n, and the prior is


π(μ) = [1/(√(2π) σ₀)] exp{ −(1/2) [(μ − μ₀)/σ₀]² },   −∞ < μ < ∞.

Then the posterior distribution of μ is


π(μ|x) ∝ exp{ −(1/2) [ Σ_{i=1}^{n} ((x_i − μ)/σ)² + ((μ − μ₀)/σ₀)² ] }
        ∝ exp{ −(1/2) [ n(x̄ − μ)²/σ² + (μ − μ₀)²/σ₀² ] },

due to

Σ_{i=1}^{n} (x_i − μ)² = Σ_{i=1}^{n} (x_i − x̄)² + n(x̄ − μ)²

from Section 8.5. Completing the squares for μ yields the posterior distribution
π(μ|x) ∝ exp{ −(1/2) [(μ − μ*)/σ*]² },

where
μ* = (nx̄σ₀² + μ₀σ²)/(nσ₀² + σ²),   σ* = √(σ₀²σ²/(nσ₀² + σ²)).

This is a normal distribution with mean μ∗ and standard deviation σ ∗ .


The Central Limit Theorem allows us to use the result of Example 18.5 also when we
select sufficiently large random samples (n ≥ 30 for many engineering experimental
cases) from nonnormal populations whose distributions are not very far from symmetric,
provided that the prior distribution of the mean is approximately normal.
Several comments need to be made about Example 18.5. The posterior mean
μ∗ can also be written as
μ* = [σ₀²/(σ₀² + σ²/n)] x̄ + [(σ²/n)/(σ₀² + σ²/n)] μ₀,
which is a weighted average of the sample mean x̄ and the prior mean μ0 . Since both
coefficients are between 0 and 1 and they sum to 1, the posterior mean μ∗ is always
between x̄ and μ0 . This means that the posterior estimation of μ is influenced by


both x̄ and μ0 . Furthermore, the weight of x̄ depends on the prior variance as
well as the variance of the sample mean. For a large sample problem (n → ∞),
the posterior mean μ∗ → x̄. This means that the prior mean does not play any
role in estimating the population mean μ using the posterior distribution. This
is very reasonable since it indicates that when the amount of data is substantial,
information from the data will dominate the information on μ provided by the prior.
On the other hand, when the prior variance is large (σ02 → ∞), the posterior mean
μ∗ also goes to x̄. Note that for a normal distribution, the larger the variance,
the flatter the density function. The flatness of the normal distribution in this
case means that there is almost no subjective prior information available on the
parameter μ before the data are collected. Thus, it is reasonable that the posterior
estimation μ∗ only depends on the data value x̄.
Now consider the posterior standard deviation σ ∗ . This value can also be
written as
σ* = √[ σ₀² (σ²/n) / (σ₀² + σ²/n) ].

It is obvious that the value σ* is smaller than both σ₀ and σ/√n, the prior stan-
dard deviation and the standard deviation of x̄, respectively. This suggests that
the posterior estimation is more accurate than both the prior and the sample data.
Hence, incorporating both the data and prior information results in better pos-
terior information than using any of the data or prior alone. This is a common
phenomenon in Bayesian inference. Furthermore, to compute μ∗ and σ ∗ by the for-
mulas in Example 18.5, we have assumed that σ 2 is known. Since this is generally
not the case, we shall replace σ 2 by the sample variance s2 whenever n ≥ 30.
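To make the update concrete, the two formulas above can be wrapped in a small helper;
the following Python sketch is illustrative (the function name is ours, not the book's):

    # Sketch: normal-normal posterior update of Example 18.5.
    from math import sqrt

    def normal_posterior(xbar, n, sigma, mu0, sigma0):
        """Posterior N(mu_star, sigma_star^2) for a normal mean with known
        sampling standard deviation `sigma` and normal prior N(mu0, sigma0^2)."""
        var_data = sigma**2 / n               # variance of the sample mean
        var0 = sigma0**2                      # prior variance
        w = var0 / (var0 + var_data)          # weight given to the sample mean
        mu_star = w * xbar + (1 - w) * mu0
        sigma_star = sqrt(var0 * var_data / (var0 + var_data))
        return mu_star, sigma_star

With the data of Example 18.7 below, normal_posterior(780, 25, 100, 800, 10) returns
approximately (796.0, 8.94), in agreement with the values computed there.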

Bayesian Interval Estimation


Similar to the classical confidence interval, in Bayesian analysis we can calculate a
100(1 − α)% Bayesian interval using the posterior distribution.

Definition 18.2: The interval a < θ < b will be called a 100(1 − α)% Bayesian interval for θ if
∫_{−∞}^{a} π(θ|x) dθ = ∫_{b}^{∞} π(θ|x) dθ = α/2.

Recall that under the frequentist approach, the probability of a confidence


interval, say 95%, is interpreted as a coverage probability, which means that if an
experiment is repeated again and again (with considerable unobserved data), the
probability that the intervals calculated according to the rule will cover the true
parameter is 95%. However, in Bayesian interval interpretation, say for a 95%
interval, we can state that the probability of the unknown parameter falling into
the calculated interval (which only depends on the observed data) is 95%.

Example 18.6: Supposing that X ∼ b(x; n, p), with known n = 2, and the prior distribution of p
is uniform π(p) = 1, for 0 < p < 1, find a 95% Bayesian interval for p.

Solution : As in Example 18.2, when x = 0, the posterior distribution is a beta distribution


with parameters 1 and 3, i.e., π(p|0) = 3(1 − p)², for 0 < p < 1. Thus, we need to
solve for a and b using Definition 18.2, which yields the following:

0.025 = ∫_0^a 3(1 − p)² dp = 1 − (1 − a)³

and

0.025 = ∫_b^1 3(1 − p)² dp = (1 − b)³.

The solutions to the above equations result in a = 0.0084 and b = 0.7076. There-
fore, the probability that p falls into (0.0084, 0.7076) is 95%.
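Equivalently, since the posterior is a beta distribution with parameters 1 and 3, a and
b are simply its 2.5th and 97.5th percentiles. A short scipy-based check (an aside, not
part of the text):

    # Sketch: equal-tail 95% Bayesian interval of Example 18.6.
    from scipy.stats import beta

    posterior = beta(1, 3)                    # posterior of p when x = 0
    a = posterior.ppf(0.025)                  # ≈ 0.0084
    b = posterior.ppf(0.975)                  # ≈ 0.7076
    print(a, b)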
For the normal population and normal prior case described in Example 18.5,
the posterior mean μ∗ is the Bayes estimate of the population mean μ, and a
100(1−α)% Bayesian interval for μ can be constructed by computing the interval
μ* − z_{α/2} σ* < μ < μ* + z_{α/2} σ*,
which is centered at the posterior mean and contains 100(1 − α)% of the posterior
probability.

Example 18.7: An electrical firm manufactures light bulbs that have a length of life that is ap-
proximately normally distributed with a standard deviation of 100 hours. Prior
experience leads us to believe that μ is a value of a normal random variable with a
mean μ0 = 800 hours and a standard deviation σ0 = 10 hours. If a random sample
of 25 bulbs has an average life of 780 hours, find a 95% Bayesian interval for μ.
Solution : According to Example 18.5, the posterior distribution of the mean is also a normal
distribution with mean
μ* = [(25)(780)(10)² + (800)(100)²] / [(25)(10)² + (100)²] = 796

and standard deviation

σ* = √{ (10)²(100)² / [(25)(10)² + (100)²] } = √80.

The 95% Bayesian interval for μ is then given by

796 − 1.96√80 < μ < 796 + 1.96√80,
or
778.5 < μ < 813.5.
Hence, we are 95% sure that μ will be between 778.5 and 813.5.
On the other hand, ignoring the prior information about μ, we could proceed
as in Section 9.4 and construct the classical 95% confidence interval
   
780 − 1.96(100/√25) < μ < 780 + 1.96(100/√25),
or 740.8 < μ < 819.2, which is seen to be wider than the corresponding Bayesian
interval.
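As an aside (not part of the text), the arithmetic of Example 18.7 and of the classical
interval can be verified with a few lines of Python using only the standard library:

    # Sketch: Bayesian versus classical 95% interval for Example 18.7.
    from math import sqrt

    n, xbar, sigma = 25, 780, 100
    mu0, sigma0 = 800, 10
    z = 1.96

    mu_star = (n * xbar * sigma0**2 + mu0 * sigma**2) / (n * sigma0**2 + sigma**2)
    sigma_star = sqrt(sigma0**2 * sigma**2 / (n * sigma0**2 + sigma**2))
    print(mu_star - z * sigma_star, mu_star + z * sigma_star)     # ≈ (778.5, 813.5)

    # Classical 95% confidence interval ignoring the prior:
    print(xbar - z * sigma / sqrt(n), xbar + z * sigma / sqrt(n))  # (740.8, 819.2)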

18.3 Bayes Estimates Using Decision Theory Framework


Using Bayesian methodology, the posterior distribution of a parameter can be
obtained. Bayes estimates can also be derived using the posterior distribution and
a loss function when a loss is incurred. A loss function is a function that describes
the cost of a decision associated with an event of interest. Here we only list a few
commonly used loss functions and their associated Bayes estimates.

Squared-Error Loss

Definition 18.3: The squared-error loss function is

L(θ, a) = (θ − a)2 ,

where θ is the parameter (or state of nature) and a an action (or estimate).

A Bayes estimate minimizes the posterior expected loss, given the observed sample
data.

Theorem 18.1: The mean of the posterior distribution π(θ|x), denoted by θ∗ , is the Bayes esti-
mate of θ under the squared-error loss function.

Example 18.8: Find the Bayes estimates of p, for all the values of x, for Example 18.1 when the
squared-error loss function is used.
Solution : When x = 0, p∗ = (0.1)(0.6550) + (0.2)(0.3450) = 0.1345.
When x = 1, p∗ = (0.1)(0.4576) + (0.2)(0.5424) = 0.1542.
When x = 2, p∗ = (0.1)(0.2727) + (0.2)(0.7273) = 0.1727.
Note that the classical estimate of p is p̂ = x/n = 0, 1/2, and 1, respectively,
for the x values at 0, 1, and 2. These classical estimates are very different from
the corresponding Bayes estimates.

Example 18.9: Repeat Example 18.8 in the situation of Example 18.2.


Solution : Since the posterior distribution of p is a B(x + 1, 3 − x) distribution (see Section
6.8 on page 201), the Bayes estimate of p is
p* = E_{π(p|x)}(p) = 3 C(2, x) ∫_0^1 p^{x+1} (1 − p)^{2−x} dp,

which yields p∗ = 1/4 for x = 0, p∗ = 1/2 for x = 1, and p∗ = 3/4 for x = 2,


respectively. Notice that when x = 1 is observed, the Bayes estimate and the
classical estimate p̂ are equivalent.
For the normal situation as described in Example 18.5, the Bayes estimate of
μ under the squared-error loss will be the posterior mean μ∗ .

Example 18.10: Suppose that the sampling distribution of a random variable, X, is Poisson with
parameter λ. Assume that the prior distribution of λ follows a gamma distribution

with parameters (α, β). Find the Bayes estimate of λ under the squared-error loss
function.
Solution : Using Example 18.3, we conclude that the posterior distribution of λ follows a
gamma distribution with parameters (x + α, (1 + 1/β)^{−1}). Using Theorem 6.4, we
obtain the posterior mean

λ̂ = (x + α)/(1 + 1/β).

Since the posterior mean is the Bayes estimate under the squared-error loss, λ̂ is
our Bayes estimate.

Absolute-Error Loss
The squared-error loss described above is similar to the least-squares concept we
discussed in connection with regression in Chapters 11 and 12. In this section, we
introduce another loss function as follows.

Definition 18.4: The absolute-error loss function is defined as

L(θ, a) = |θ − a|,

where θ is the parameter and a an action.

Theorem 18.2: The median of the posterior distribution π(θ|x), denoted by θ∗ , is the Bayes
estimate of θ under the absolute-error loss function.

Example 18.11: Under the absolute-error loss, find the Bayes estimator for Example 18.9 when
x = 1 is observed.
Solution : Again, the posterior distribution of p is a B(x + 1, 3 − x). When x = 1, it is a beta
distribution with density π(p | x = 1) = 6p(1 − p) for 0 < p < 1 and 0 otherwise.
The median of this distribution is the value of p* such that

1/2 = ∫_0^{p*} 6p(1 − p) dp = 3p*² − 2p*³,

which yields the answer p* = 1/2. Hence, the Bayes estimate in this case is 0.5.
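As a numerical aside (assuming scipy is available), the posterior median can be read
directly from the Beta(2, 2) distribution:

    # Sketch: Bayes estimate under absolute-error loss is the posterior median.
    from scipy.stats import beta

    print(beta(2, 2).median())                # 0.5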

Exercises

18.1 Estimate the proportion of defectives being produced by the machine in Example
18.1 if the random sample of size 2 yields 2 defectives.

18.2 Let us assume that the prior distribution for the proportion p of drinks from a
vending machine that overflow is

p      0.05  0.10  0.15
π(p)   0.3   0.5   0.2

If 2 of the next 9 drinks from this machine overflow, find
(a) the posterior distribution for the proportion p;
(b) the Bayes estimate of p.

18.3 Repeat Exercise 18.2 when 1 of the next 4 drinks overflows and the uniform prior
distribution is

π(p) = 10,   0.05 < p < 0.15.

18.4 Service calls come to a maintenance center according to a Poisson process with λ
calls per minute. A data set of 20 one-minute periods yields an average of 1.8 calls. If
the prior for λ follows an exponential distribution with mean 2, determine the posterior
distribution of λ.

18.5 A previous study indicates that the percentage of chain smokers, p, who have lung
cancer follows a beta distribution (see Section 6.8) with mean 70% and standard
deviation 10%. Suppose a new data set collected shows that 81 out of 120 chain smokers
have lung cancer.
(a) Determine the posterior distribution of the percentage of chain smokers who have
lung cancer by combining the new data and the prior information.
(b) What is the posterior probability that p is larger than 50%?

18.6 The developer of a new condominium complex claims that 3 out of 5 buyers will
prefer a two-bedroom unit, while his banker claims that it would be more correct to say
that 7 out of 10 buyers will prefer a two-bedroom unit. In previous predictions of this
type, the banker has been twice as reliable as the developer. If 12 of the next 15
condominiums sold in this complex are two-bedroom units, find
(a) the posterior probabilities associated with the claims of the developer and banker;
(b) a point estimate of the proportion of buyers who prefer a two-bedroom unit.

18.7 The burn time for the first stage of a rocket is a normal random variable with a
standard deviation of 0.8 minute. Assume a normal prior distribution for μ with a mean
of 8 minutes and a standard deviation of 0.2 minute. If 10 of these rockets are fired and
the first stage has an average burn time of 9 minutes, find a 95% Bayesian interval
for μ.

18.8 The daily profit from a juice vending machine placed in an office building is a
value of a normal random variable with unknown mean μ and variance σ². Of course, the
mean will vary somewhat from building to building, and the distributor feels that these
average daily profits can best be described by a normal distribution with mean
μ₀ = $30.00 and standard deviation σ₀ = $1.75. If one of these juice machines, placed in
a certain building, showed an average daily profit of x̄ = $24.90 during the first 30
days with a standard deviation of s = $2.10, find
(a) a Bayes estimate of the true average daily profit for this building;
(b) a 95% Bayesian interval of μ for this building;
(c) the probability that the average daily profit from the machine in this building is
between $24.00 and $26.00.

18.9 The mathematics department of a large university is designing a placement test to
be given to incoming freshman classes. Members of the department feel that the average
grade for this test will vary from one freshman class to another. This variation of the
average class grade is expressed subjectively by a normal distribution with mean
μ₀ = 72 and variance σ₀² = 5.76.
(a) What prior probability does the department assign to the actual average grade being
somewhere between 71.8 and 73.4 for next year's freshman class?
(b) If the test is tried on a random sample of 100 students from the next incoming
freshman class, resulting in an average grade of 70 with a variance of 64, construct a
95% Bayesian interval for μ.
(c) What posterior probability should the department assign to the event of part (a)?

18.10 Suppose that in Example 18.7 the electrical firm does not have enough prior
information regarding the population mean length of life to be able to assume a normal
distribution for μ. The firm believes, however, that μ is surely between 770 and 830
hours, and it is thought that a more realistic Bayesian approach would be to assume the
prior distribution

π(μ) = 1/60,   770 < μ < 830.

If a random sample of 25 bulbs gives an average life of 780 hours, follow the steps of
the proof for Example 18.5 to find the posterior distribution π(μ | x_1, x_2, . . . , x_25).

18.11 Suppose that the time to failure T of a certain hinge is an exponential random
variable with probability density

f(t) = θe^{−θt},   t > 0.

From prior experience we are led to believe that θ is a value of an exponential random
variable with probability density

π(θ) = 2e^{−2θ},   θ > 0.

If we have a sample of n observations on T, show that the posterior distribution of Θ is
a gamma distribution with parameters

α = n + 1   and   β = (Σ_{i=1}^{n} t_i + 2)^{−1}.

18.12 Suppose that a sample consisting of 5, 6, 6, 7, 5, 6, 4, 9, 3, and 6 comes from a
Poisson population with mean λ. Assume that the parameter λ follows a gamma distribution
with parameters (3, 2). Under the squared-error loss function, find the Bayes estimate
of λ.

18.13 A random variable X follows a negative binomial distribution with parameters k = 5
and p [i.e., b*(x; 5, p)]. Furthermore, we know that p follows a uniform distribution on
the interval (0, 1). Find the Bayes estimate of p under the squared-error loss function.

18.14 A random variable X follows an exponential distribution with mean 1/β. Assume the
prior distribution of β is another exponential distribution with mean 2.5. Determine the
Bayes estimate of β under the absolute-error loss function.

18.15 A random sample X_1, . . . , X_n comes from a uniform distribution (see Section
6.1) population U(0, θ) with unknown θ. The data are given below:

0.13, 1.06, 1.65, 1.73, 0.95, 0.56, 2.14, 0.33, 1.22, 0.20,
1.55, 1.18, 0.71, 0.01, 0.42, 1.03, 0.43, 1.02, 0.83, 0.88

Suppose the prior distribution of θ has the density

π(θ) = 1/θ²,   θ > 1,
π(θ) = 0,      θ ≤ 1.

Determine the Bayes estimator under the absolute-error loss function.
