
Chapter 10: Multiple Regression: Bayesian Inference

This chapter considers Bayesian estimation and prediction for the multiple linear regression model in which the x variables are fixed constants.

1 Elements of Bayesian Statistical Inference


Let f (y|θ) denote the joint likelihood function of y1 , y2 , . . . , yn , and let p(θ) denote the prior
distribution imposed on θ. Then the posterior distribution of θ is given by

$$\pi(\theta \mid y) = \frac{f(y \mid \theta)\, p(\theta)}{\int f(y \mid \theta)\, p(\theta)\, d\theta} = \frac{1}{c(y)}\, f(y \mid \theta)\, p(\theta),$$

where $c(y) = \int f(y \mid \theta)\, p(\theta)\, d\theta$ is the normalizing constant of the posterior distribution.
Bayesian inference for the model is always based on the posterior distribution $\pi(\theta \mid y)$. For example, let $q(y_0 \mid \theta)$ denote a prediction function for $y_0$ given $\theta$. Then

$$r(y_0 \mid y) = \int q(y_0 \mid \theta)\, \pi(\theta \mid y)\, d\theta,$$

which is the posterior predictive distribution of $y_0$ given the observed data $y$.
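
As a concrete toy illustration (not from the chapter), the normalizing constant $c(y)$ and the posterior can be approximated on a grid. The following Python sketch does this for a one-parameter normal-mean model; all data values and the prior are invented for illustration:

```python
import numpy as np

# Toy model: y_i ~ N(theta, 1) with prior theta ~ N(0, 1).
# All data values and hyperparameters here are invented.
y = np.array([0.8, 1.3, 0.5, 1.1])
theta = np.linspace(-4.0, 4.0, 2001)          # grid over the parameter
dt = theta[1] - theta[0]

# f(y|theta) on the grid; constant factors cancel after normalization,
# so the (2*pi)^(-n/2) likelihood constant is omitted.
loglik = -0.5 * ((y[:, None] - theta[None, :]) ** 2).sum(axis=0)
prior = np.exp(-0.5 * theta**2) / np.sqrt(2 * np.pi)   # N(0, 1) density

unnorm = np.exp(loglik) * prior               # f(y|theta) * p(theta)
c_y = unnorm.sum() * dt                       # c(y) as a Riemann sum
posterior = unnorm / c_y                      # pi(theta|y)

print("posterior mean:", (theta * posterior).sum() * dt)
```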

2 A Bayesian Multiple Linear Regression Model


For convenience, one often parameterizes Bayesian models using the precision $\tau$ rather than the variance $\sigma^2$. With this reparameterization, one often assumes

$$
\begin{aligned}
y \mid \beta, \tau &\sim N_n\!\left(X\beta,\ \tfrac{1}{\tau} I\right), \\
\beta \mid \tau &\sim N_{k+1}\!\left(\phi,\ \tfrac{1}{\tau} V\right), \\
\tau &\sim \mathrm{Gamma}(\alpha, \delta),
\end{aligned}
\tag{1}
$$

where $\alpha$ and $\delta$ are prior hyperparameters.
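
The hierarchy in (1) can be simulated directly. A minimal sketch with invented hyperparameter values, reading $\mathrm{Gamma}(\alpha, \delta)$ as shape $\alpha$ and rate $\delta$ (the convention consistent with the posterior parameters that appear later in the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 2                                   # sample size, number of predictors
alpha, delta = 2.0, 1.0                        # illustrative Gamma hyperparameters
phi = np.zeros(k + 1)                          # prior mean of beta
V = np.eye(k + 1)                              # prior scale matrix

X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # design with intercept

# Draw from the hierarchy in (1): tau, then beta | tau, then y | beta, tau
tau = rng.gamma(shape=alpha, scale=1.0 / delta)             # tau ~ Gamma(alpha, delta)
beta = rng.multivariate_normal(phi, V / tau)                # beta | tau ~ N(phi, V/tau)
y = X @ beta + rng.normal(scale=1.0 / np.sqrt(tau), size=n) # y | beta, tau
```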

Theorem 2.1. Consider the Bayesian multiple regression model, for which the prior distributions
are as specified in (1). Then the joint prior distribution is conjugate, that is, π(β, τ |y) is of the
same form as π(β, τ ).

Theorem 2.2. Consider the Bayesian multiple regression model, for which the prior distributions are as specified in (1). The marginal posterior distribution $\pi(\beta \mid y)$ is a multivariate t distribution with parameters $(n + 2\alpha, \phi_*, W_*)$, where

$$\phi_* = (V^{-1} + X'X)^{-1}(V^{-1}\phi + X'y), \tag{2}$$

and

$$W_* = \left[ \frac{(y - X\phi)'(I + XVX')^{-1}(y - X\phi) + 2\delta}{n + 2\alpha} \right] (V^{-1} + X'X)^{-1}. \tag{3}$$
Theorem 2.3. Consider the Bayesian multiple regression model, for which the prior distributions are as specified in (1). The marginal posterior distribution $\pi(\tau \mid y)$ is a gamma distribution with parameters $\alpha + n/2$ and $(-\phi_*' V_*^{-1} \phi_* + \phi' V^{-1} \phi + y'y + 2\delta)/2$, where $V_* = (V^{-1} + X'X)^{-1}$ and $\phi_* = V_*(V^{-1}\phi + X'y)$.
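
As a concrete illustration of Theorems 2.2 and 2.3 (not part of the chapter), the following sketch computes $\phi_*$, $W_*$, and the gamma parameters of $\pi(\tau \mid y)$; all data and hyperparameter values are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 2                                  # sample size, number of predictors
alpha, delta = 2.0, 1.0                       # illustrative prior hyperparameters
phi = np.zeros(k + 1)                         # prior mean of beta
V = np.eye(k + 1)                             # prior scale matrix
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)   # synthetic response

Vinv = np.linalg.inv(V)
V_star = np.linalg.inv(Vinv + X.T @ X)        # V* = (V^-1 + X'X)^-1
phi_star = V_star @ (Vinv @ phi + X.T @ y)    # posterior location, eq. (2)

# Scale matrix W* of the multivariate t posterior, eq. (3)
r = y - X @ phi
quad = r @ np.linalg.inv(np.eye(n) + X @ V @ X.T) @ r
W_star = (quad + 2 * delta) / (n + 2 * alpha) * V_star

# Gamma parameters of pi(tau | y), Theorem 2.3
shape = alpha + n / 2
rate = (-phi_star @ (Vinv + X.T @ X) @ phi_star
        + phi @ Vinv @ phi + y @ y + 2 * delta) / 2
print(phi_star, shape, rate)
```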

3 Inference in Bayesian Multiple Linear Regression


Point Estimate and Credible Interval A convenient property of the multivariate t-distribution is that linear functions of the random vector follow the (univariate) t-distribution. Thus, given $y$,

$$\frac{a'\beta - a'\phi_*}{\sqrt{a' W_* a}} \sim t(n + 2\alpha),$$

and, as an important special case,

$$\frac{\beta_i - \phi_{*i}}{\sqrt{w_{*ii}}} \sim t(n + 2\alpha),$$

where $\phi_{*i}$ is the $i$th element of $\phi_*$ and $w_{*ii}$ is the $i$th diagonal element of $W_*$. Thus a Bayesian point estimate of $\beta_i$ is its posterior mean $\phi_{*i}$, and a $100(1-\omega)\%$ Bayesian credible interval for $\beta_i$ is

$$\phi_{*i} \pm t_{\omega/2,\, n+2\alpha} \sqrt{w_{*ii}}.$$
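
A minimal sketch of the interval computation using scipy.stats.t; the posterior quantities below are placeholder numbers standing in for the output of the earlier sketch:

```python
import numpy as np
from scipy import stats

# Illustrative posterior quantities (placeholders, not real output)
n, alpha = 50, 2.0
phi_star_i = 1.93          # i-th element of phi*
w_star_ii = 0.015          # i-th diagonal element of W*

omega = 0.05               # for a 95% credible interval
t_crit = stats.t.ppf(1 - omega / 2, df=n + 2 * alpha)
lo = phi_star_i - t_crit * np.sqrt(w_star_ii)
hi = phi_star_i + t_crit * np.sqrt(w_star_ii)
print(f"95% credible interval for beta_i: ({lo:.3f}, {hi:.3f})")
```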

Hypothesis Test For example, to test the hypothesis $\beta_i > \beta_{i0}$, we can calculate the probability

$$P\!\left(t(n + 2\alpha) > \frac{\beta_{i0} - \phi_{*i}}{\sqrt{w_{*ii}}}\right).$$

The larger this probability is, the more credible the hypothesis is.
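
This posterior probability can be read off the t survival function; again with placeholder values for the posterior quantities:

```python
import numpy as np
from scipy import stats

# Illustrative posterior quantities (placeholders, not real output)
n, alpha = 50, 2.0
phi_star_i, w_star_ii = 1.93, 0.015
beta_i0 = 1.5              # hypothesized lower bound for beta_i

# P(beta_i > beta_i0 | y) under the marginal t posterior
t_stat = (beta_i0 - phi_star_i) / np.sqrt(w_star_ii)
prob = stats.t.sf(t_stat, df=n + 2 * alpha)
print(f"P(beta_i > {beta_i0} | y) = {prob:.4f}")
```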

Special Cases of Inference First, we consider the use of a diffuse prior. Let $\phi = 0$, let $V$ be a diagonal matrix with all diagonal elements equal to a large constant (say, $10^6$), and let $\alpha$ and $\delta$ both be equal to a small constant (say, $10^{-6}$). In this case $V^{-1}$ is close to $0$, and so $\phi_*$, the Bayesian point estimate of $\beta$ in (2), is approximately equal to the least squares estimator

$$(X'X)^{-1} X'y.$$

Also, since $(I + XVX')^{-1} = I - X(X'X + V^{-1})^{-1}X'$, the covariance matrix $W_*$ approaches

$$W_* = \frac{y'\left(I - X(X'X)^{-1}X'\right)y}{n}\,(X'X)^{-1} = \frac{n-k-1}{n}\, s^2 (X'X)^{-1}.$$
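
This limit is easy to verify numerically. A sketch with invented data, comparing the Bayesian estimate under the diffuse prior to least squares:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Diffuse prior: phi = 0, V = 1e6 * I (alpha and delta do not enter phi*)
V = 1e6 * np.eye(k + 1)
Vinv = np.linalg.inv(V)
phi_star = np.linalg.solve(Vinv + X.T @ X, X.T @ y)   # Bayesian estimate, eq. (2)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)             # least squares estimate

print(np.max(np.abs(phi_star - b_ols)))              # near zero
```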
The second special case of inference is the case in which $\phi = 0$ and $V$ is a diagonal matrix with a constant on the diagonal. Thus $V = aI$, where $a$ is a positive number, and the Bayesian estimator of $\beta$ becomes

$$\left(X'X + \tfrac{1}{a} I\right)^{-1} X'y,$$

which is known as the ridge estimator.
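
A sketch of the ridge estimator in this parameterization, with invented data; the prior variance scale $a$ controls the shrinkage, with larger $a$ (a flatter prior) giving less shrinkage toward $0$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

a = 10.0                                              # prior scale, V = a * I
beta_ridge = np.linalg.solve(X.T @ X + np.eye(k + 1) / a, X.T @ y)
print(beta_ridge)
```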

Bayesian Point and Interval Estimation of $\sigma^2$ A possible Bayesian point estimator of $\sigma^2$ is the mean of the marginal inverse gamma density implied by Theorem 2.3:

$$\frac{(-\phi_*' V_*^{-1} \phi_* + \phi' V^{-1} \phi + y'y + 2\delta)/2}{\alpha + n/2 - 1},$$

and a $100(1-\omega)\%$ Bayesian credible interval for $\sigma^2$ is given by the $\omega/2$ and $1 - \omega/2$ quantiles of the inverse gamma distribution.
Consider the special case in which $\alpha$ and $\delta$ are both close to $0$, $\phi = 0$, and $V$ is a diagonal matrix with all diagonal elements equal to a large constant. Then the Bayesian point estimator of $\sigma^2$ is approximately

$$\frac{(y'y - \phi_*' V_*^{-1} \phi_*)/2}{n/2 - 1} = \frac{y'y - y'X(X'X)^{-1}X'y}{n - 2} = \frac{n-k-1}{n-2}\, s^2.$$
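
Since $\tau \mid y$ is gamma, $\sigma^2 = 1/\tau \mid y$ is inverse gamma with the same shape and with scale equal to the posterior rate. A sketch using scipy.stats.invgamma, with placeholder parameter values standing in for the output of Theorem 2.3:

```python
from scipy import stats

# Gamma posterior parameters for tau (illustrative placeholder values)
shape = 27.0               # alpha + n/2
rate = 24.5                # posterior rate parameter from Theorem 2.3

# sigma^2 = 1/tau ~ inverse gamma with this shape and scale = rate
post = stats.invgamma(shape, scale=rate)
point = rate / (shape - 1)                 # posterior mean of sigma^2
interval = post.ppf([0.025, 0.975])        # 95% credible interval
print(point, interval)
```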

4 Bayesian Inference through MCMC Simulations


Consider the Bayesian multiple regression model, for which the prior distributions are as specified in (1). Then the conditional distribution of $\beta \mid \tau, y$ is $N_{k+1}(\phi_*, \tau^{-1} V_*)$, and the conditional distribution of $\tau \mid \beta, y$ is $\mathrm{Gamma}\big((\alpha_{**} + k + 3)/2,\ [(\beta - \phi_*)' V_*^{-1} (\beta - \phi_*) + \delta_{**}]/2\big)$, where $\alpha_{**} = 2\alpha - 2 + n$ and $\delta_{**} = -\phi_*' V_*^{-1} \phi_* + \phi' V^{-1} \phi + y'y + 2\delta$. The Gibbs sampler can then be used to simulate from the posterior distribution.
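
A minimal Gibbs sampler following these two full conditionals, with synthetic data and invented hyperparameters; the gamma draw uses numpy's shape/scale convention, so the rate is inverted:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 50, 2
alpha, delta = 2.0, 1.0                            # illustrative hyperparameters
phi = np.zeros(k + 1)
V = np.eye(k + 1)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

Vinv = np.linalg.inv(V)
V_star_inv = Vinv + X.T @ X
V_star = np.linalg.inv(V_star_inv)
phi_star = V_star @ (Vinv @ phi + X.T @ y)
shape = alpha + (n + k + 1) / 2                    # equals (alpha** + k + 3)/2
delta_ss = (-phi_star @ V_star_inv @ phi_star
            + phi @ Vinv @ phi + y @ y + 2 * delta)  # delta**

tau = 1.0                                          # initial value
draws = []
for _ in range(5000):
    beta = rng.multivariate_normal(phi_star, V_star / tau)   # beta | tau, y
    resid = beta - phi_star
    rate = (resid @ V_star_inv @ resid + delta_ss) / 2
    tau = rng.gamma(shape, 1.0 / rate)                       # tau | beta, y
    draws.append(np.append(beta, tau))

draws = np.array(draws)[1000:]                     # discard burn-in
print("posterior means (beta, tau):", draws.mean(axis=0))
```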
