Linear Model Methodology
\[
Y_v = X_v \beta + \epsilon_v, \tag{6.33}
\]
\[
-2X'Y + 2X'X\beta + A'\lambda = 0. \tag{6.38}
\]
\[
\hat{\beta}_r = (X'X)^{-1}X'Y - (X'X)^{-1}A'\left[A(X'X)^{-1}A'\right]^{-1}\left[A(X'X)^{-1}X'Y - m\right]
\]
\[
= \hat{\beta} - (X'X)^{-1}A'\left[A(X'X)^{-1}A'\right]^{-1}(A\hat{\beta} - m), \tag{6.41}
\]
where \(\hat{\beta}\) is the OLS estimator given in (6.9). This solution is called the restricted
least-squares estimator of \(\beta\). It is easy to see that \(\hat{\beta}_r\) satisfies the equality
Full-Rank Linear Models 139
\[
S(\hat{\beta}_r) = (Y - X\hat{\beta})'(Y - X\hat{\beta}) + (\hat{\beta}_r - \hat{\beta})'X'X(\hat{\beta}_r - \hat{\beta})
\]
\[
= \mathrm{SSE} + (\hat{\beta}_r - \hat{\beta})'X'X(\hat{\beta}_r - \hat{\beta}). \tag{6.42}
\]
Hence, \(S(\hat{\beta}_r) > \mathrm{SSE}\), since equality is attained if and only if \(\hat{\beta} = \hat{\beta}_r\), which is
not possible, as \(\hat{\beta}\) does not satisfy the constraint \(A\hat{\beta} = m\). Using (6.41), formula (6.42) can also be written as
\[
S(\hat{\beta}_r) = \mathrm{SSE} + (A\hat{\beta} - m)'\left[A(X'X)^{-1}A'\right]^{-1}(A\hat{\beta} - m).
\]
The assertion that \(S(\beta)\) attains its minimum value at \(\hat{\beta}_r\) in the constrained
parameter space can be verified as follows: Let \(\beta\) be any vector in the con-
strained parameter space. Then, \(A\beta = m = A\hat{\beta}_r\). Hence, \(A(\beta - \hat{\beta}_r) = 0\).
Therefore,
\[
S(\beta) = (Y - X\beta)'(Y - X\beta)
\]
\[
= \left[Y - X\hat{\beta}_r + X(\hat{\beta}_r - \beta)\right]'\left[Y - X\hat{\beta}_r + X(\hat{\beta}_r - \beta)\right]
\]
\[
= (Y - X\hat{\beta}_r)'(Y - X\hat{\beta}_r) + 2(Y - X\hat{\beta}_r)'X(\hat{\beta}_r - \beta) + (\hat{\beta}_r - \beta)'X'X(\hat{\beta}_r - \beta).
\]
The cross-product term vanishes since, by (6.41),
\[
(Y - X\hat{\beta}_r)'X(\hat{\beta}_r - \beta) = (A\hat{\beta} - m)'\left[A(X'X)^{-1}A'\right]^{-1}A(\hat{\beta}_r - \beta)
= 0, \quad \text{since } A(\hat{\beta}_r - \beta) = 0.
\]
It follows that
\[
S(\beta) = S(\hat{\beta}_r) + (\beta - \hat{\beta}_r)'X'X(\beta - \hat{\beta}_r) \ \ge\ S(\hat{\beta}_r),
\]
which establishes the minimizing property of \(\hat{\beta}_r\).
To find the MLE of \(\beta\) and \(\sigma^2\), we proceed as follows: We first find the sta-
tionary values of \(\beta\) and \(\sigma^2\) for which the partial derivatives of \(l(\beta, \sigma^2, y)\) with
respect to \(\beta\) and \(\sigma^2\) are equal to zero. The next step is to verify that these
values maximize \(l(\beta, \sigma^2, y)\). Setting the partial derivatives of \(l(\beta, \sigma^2, y)\) with
respect to \(\beta\) and \(\sigma^2\) equal to zero, we get
\[
\frac{\partial l(\beta, \sigma^2, y)}{\partial \beta} = -\frac{1}{2\sigma^2}\left(-2X'y + 2X'X\beta\right) = 0, \tag{6.44}
\]
\[
\frac{\partial l(\beta, \sigma^2, y)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\,(y - X\beta)'(y - X\beta) = 0. \tag{6.45}
\]
Let \(\tilde{\beta}\) and \(\tilde{\sigma}^2\) denote the solutions of equations (6.44) and (6.45) for \(\beta\) and \(\sigma^2\),
respectively. From (6.44) we find that
\[
\tilde{\beta} = (X'X)^{-1}X'Y, \tag{6.46}
\]
which is the same as \(\hat{\beta}\), the OLS estimator of \(\beta\) in (6.9). Note that \(Y\) was used
in place of \(y\) in (6.46) since the latter originated from the likelihood function
in (6.43), where it was treated as a mathematical variable. In formula (6.46),
however, \(Y\) is treated as a random vector, since the estimator is data dependent. From
(6.45) we get
\[
\tilde{\sigma}^2 = \frac{1}{n}\,(Y - X\tilde{\beta})'(Y - X\tilde{\beta}). \tag{6.47}
\]
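A short numerical check of (6.44)–(6.47), using simulated data (the model dimensions and coefficients are arbitrary, chosen only for illustration): the solution of (6.44) reproduces the OLS estimator, (6.47) equals \(\mathrm{SSE}/n\), and the log-likelihood is indeed maximized at these values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative full-rank model.
n, p = 25, 2
X = rng.standard_normal((n, p))
Y = X @ np.array([0.7, -1.3]) + rng.standard_normal(n)

# MLE of beta from (6.46): identical to the OLS estimator.
beta_tilde = np.linalg.solve(X.T @ X, X.T @ Y)

# MLE of sigma^2 from (6.47): SSE / n (not SSE / (n - p)).
resid = Y - X @ beta_tilde
sigma2_tilde = (resid @ resid) / n

def loglik(beta, sigma2):
    """Normal log-likelihood l(beta, sigma^2, y) for the model Y ~ N(X beta, sigma^2 I)."""
    r = Y - X @ beta
    return -0.5 * n * np.log(2 * np.pi * sigma2) - (r @ r) / (2 * sigma2)

# The log-likelihood at the stationary point is at least as large as at
# randomly perturbed parameter values, consistent with it being a maximum.
best = loglik(beta_tilde, sigma2_tilde)
for _ in range(100):
    b = beta_tilde + 0.1 * rng.standard_normal(p)
    s2 = sigma2_tilde * np.exp(0.1 * rng.standard_normal())
    assert loglik(b, s2) <= best
```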
where \(-\infty < \theta < \infty\). Let \(h(Y)\) be a function such that \(E[h(Y)] = 0\) for all \(\theta\).
Then,
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} h(y)\exp\left[-\frac{1}{2}(y - \theta)^2\right]dy = 0, \quad -\infty < \theta < \infty,
\]
which, after multiplying both sides by \(\sqrt{2\pi}\,e^{\theta^2/2}\), is equivalent to
\[
\int_{-\infty}^{\infty} h(y)\,e^{-y^2/2}\,e^{\theta y}\,dy = 0, \quad -\infty < \theta < \infty. \tag{6.52}
\]
where \(\phi(y) \ge 0\) and \(t_1(y), t_2(y), \ldots, t_k(y)\) are real-valued functions of \(y\) only,
and \(c(\theta) \ge 0\) and \(q_1(\theta), q_2(\theta), \ldots, q_k(\theta)\) are real-valued functions of \(\theta\)
only. Then, \(\mathcal{F}\) is called an exponential family.
Several well-known distributions belong to the exponential family. These
include the normal, gamma, and beta distributions, among the continuous
distributions; and the binomial, Poisson, and negative binomial, among the
discrete distributions. For example, for the family of normal distributions,
\(N(\mu, \sigma^2)\), we have
\[
g(y, \theta) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{1}{2\sigma^2}(y - \mu)^2\right], \quad -\infty < \mu < \infty, \ \sigma > 0, \tag{6.55}
\]
\[
= \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{\mu^2}{2\sigma^2}\right)\exp\left(-\frac{y^2}{2\sigma^2} + \frac{\mu y}{\sigma^2}\right).
\]
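The factored form in (6.55) is exactly the exponential-family shape, with \(t_1(y) = y^2\), \(t_2(y) = y\), \(q_1(\theta) = -1/(2\sigma^2)\), and \(q_2(\theta) = \mu/\sigma^2\). A quick sanity check of the algebra, with arbitrary illustrative values for \(\mu\), \(\sigma^2\), and \(y\):

```python
import math

# Arbitrary illustrative parameter and observation values.
mu, sigma2, y = 0.8, 2.5, -1.3

# Normal density, first line of (6.55).
density = math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Factored exponential-family form c(theta) * exp(q1*t1 + q2*t2),
# second line of (6.55).
c = math.exp(-mu ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
q1, t1 = -1 / (2 * sigma2), y ** 2
q2, t2 = mu / sigma2, y

assert math.isclose(density, c * math.exp(q1 * t1 + q2 * t2))
```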
\[
g(y, \theta) = (2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\left[(y - X\hat{\beta})'(y - X\hat{\beta}) + (\hat{\beta} - \beta)'X'X(\hat{\beta} - \beta)\right]\right\}
\]
\[
= (2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\left[n\hat{\sigma}^2 + (\hat{\beta} - \beta)'X'X(\hat{\beta} - \beta)\right]\right\}. \tag{6.57}
\]
We note that the right-hand side of (6.57) is a function of \(\hat{\sigma}^2\), \(\hat{\beta}\), and the ele-
ments of \(\theta\). Hence, by the Factorization Theorem (Theorem 6.2), the statistic
\((\hat{\beta}', \hat{\sigma}^2)'\) is sufficient for \(\theta\) [the function \(g_1\) in Theorem 6.2, in this case, is
identically equal to one, and the function \(g_2\) is equal to the right-hand side
of (6.57)].
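The factorization in (6.57) rests on the algebraic identity \((y - X\beta)'(y - X\beta) = n\hat{\sigma}^2 + (\hat{\beta} - \beta)'X'X(\hat{\beta} - \beta)\), which holds for every \(\beta\) because the residual vector \(y - X\hat{\beta}\) is orthogonal to the columns of \(X\). A small NumPy sketch (with made-up \(X\), \(y\)) confirms it:

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up design matrix and response, for illustration only.
n, p = 30, 4
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)                 # OLS / ML estimator
sigma2_hat = ((y - X @ beta_hat) @ (y - X @ beta_hat)) / n   # MLE of sigma^2

# The identity underlying (6.57): for ANY beta, the sum of squares splits
# into a part depending on y only through (beta_hat, sigma2_hat) and a
# quadratic form in (beta_hat - beta).
for _ in range(50):
    beta = rng.standard_normal(p)
    lhs = (y - X @ beta) @ (y - X @ beta)
    rhs = n * sigma2_hat + (beta_hat - beta) @ (X.T @ X) @ (beta_hat - beta)
    assert np.isclose(lhs, rhs)
```

Since the only dependence on \(y\) enters through \((\hat{\beta}, \hat{\sigma}^2)\), the factor \(g_1\) can indeed be taken identically equal to one.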
Now, to show completeness, let us rewrite (6.57) as
\[
g(y, \theta) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}\,\beta'X'X\beta\right)
\]
\[
\times \exp\left\{-\frac{1}{2\sigma^2}\left[n\hat{\sigma}^2 + \hat{\beta}'X'X\hat{\beta}\right] + \frac{1}{\sigma^2}\,\beta'X'X\hat{\beta}\right\}. \tag{6.58}
\]
By comparing (6.58) with (6.54), we find that \(g(y, \theta)\) belongs to the exponential
family with \(k = p + 1\),
\[
\phi(y) = 1,
\]
\[
c(\theta) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}\,\beta'X'X\beta\right),
\]
\[
q_1(\theta) = -\frac{1}{2\sigma^2}, \qquad t_1(y) = n\hat{\sigma}^2 + \hat{\beta}'X'X\hat{\beta}, \tag{6.59}
\]
\[
q_2(\theta) = \frac{1}{\sigma^2}\,\beta, \qquad t_2(y) = X'X\hat{\beta}. \tag{6.60}
\]
Furthermore, the set of points
\[
\left[q_1(\theta), q_2'(\theta)\right]' = \left[-\frac{1}{2\sigma^2},\ \frac{1}{\sigma^2}\,\beta'\right]', \quad \text{as } \theta \text{ ranges over the parameter space,}
\]
is a subset of a (p + 1)-dimensional Euclidean space with a negative first
coordinate, and this subset has a nonempty interior. Hence, by Theorem 6.3,
\(t(Y) = \left[t_1(Y), t_2'(Y)\right]'\) is a complete statistic. But, from (6.59) and (6.60), we can
solve for \(\hat{\beta}\) and \(\hat{\sigma}^2\) in terms of \(t_1(Y)\) and \(t_2(Y)\), and we obtain
\[
\hat{\beta} = (X'X)^{-1}\,t_2(Y),
\]
\[
\hat{\sigma}^2 = \frac{1}{n}\left[t_1(Y) - t_2'(Y)(X'X)^{-1}X'X(X'X)^{-1}t_2(Y)\right]
= \frac{1}{n}\left[t_1(Y) - t_2'(Y)(X'X)^{-1}t_2(Y)\right].
\]
It follows that \((\hat{\beta}', \hat{\sigma}^2)'\) is a complete statistic (any invertible function of a
statistic with a complete family has a complete family; see Arnold, 1981,
Lemma 1.3, p. 3). We finally conclude that \((\hat{\beta}', \hat{\sigma}^2)'\) is a complete and sufficient
statistic for \((\beta', \sigma^2)'\).
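The inversion from \((t_1, t_2)\) back to \((\hat{\beta}, \hat{\sigma}^2)\), which is what makes completeness carry over, can also be checked numerically (the data here are again arbitrary, generated only for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Arbitrary illustrative data for a full-rank model.
n, p = 40, 3
X = rng.standard_normal((n, p))
Y = rng.standard_normal(n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
sigma2_hat = ((Y - X @ beta_hat) @ (Y - X @ beta_hat)) / n

# Components of the complete statistic, as in (6.59) and (6.60).
t1 = n * sigma2_hat + beta_hat @ (X.T @ X) @ beta_hat
t2 = X.T @ X @ beta_hat          # note t2 = X'Y, a function of the data alone

# Solving back: (beta_hat, sigma2_hat) is an invertible function of (t1, t2).
beta_rec = np.linalg.solve(X.T @ X, t2)
sigma2_rec = (t1 - t2 @ np.linalg.inv(X.T @ X) @ t2) / n

assert np.allclose(beta_rec, beta_hat)
assert np.isclose(sigma2_rec, sigma2_hat)
```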
Corollary 6.2 Let \(Y \sim N(X\beta, \sigma^2 I_n)\), where \(X\) is of order \(n \times p\) and rank \(p\ (< n)\).
Then, \(\hat{\beta} = (X'X)^{-1}X'Y\), and
\[
\mathrm{MSE} = \frac{1}{n - p}\,Y'\left[I_n - X(X'X)^{-1}X'\right]Y,
\]