
Econometrics [EM2008]

Lecture 2
The k-variable linear regression model

Irene Mammi

[email protected]

Academic Year 2018/2019

outline

- the k-variable linear regression model
  - matrix formulation
  - partial correlation coefficients
  - inference
  - prediction

- References:
  - Johnston, J. and J. DiNardo (1997), Econometric Methods, 4th Edition, McGraw-Hill, New York, Chapter 3.
the multivariate model

- the bivariate framework is too restrictive for realistic analysis of economic phenomena
- it is generally more useful to specify multivariate relations
- restrict the analysis to a single equation which now includes k variables
- the specification of such a relationship is

  \[
  Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \cdots + \beta_k X_{kt} + u_t, \qquad t = 1, \ldots, n
  \]

  which identifies k − 1 explanatory variables, namely X2, X3, . . . , Xk, that are thought to influence the dependent variable
- nb: the X's may be transformations of other variables, but the relationship is linear in the β coefficients
- assume that the disturbances are white noise
- k + 1 parameters to estimate: the β's and the disturbance variance σ²
matrix formulation of the k-variable model
- matrices are indicated by uppercase bold letters, vectors by lowercase bold letters
- vectors are generally taken as column vectors
- for example,

  \[
  y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \qquad
  x_2 = \begin{pmatrix} X_{21} \\ X_{22} \\ \vdots \\ X_{2n} \end{pmatrix}
  \]

  are n × 1 vectors, also referred to as n-vectors, containing the sample observations on Y and X2
- the n sample observations on the k-variable model can be written as

  \[
  y = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u
  \]
matrix formulation of the k-variable model (cont.)

- the y vector is expressed as a linear combination of the x vectors plus the disturbance vector u
- the x_1 vector is a column of ones to allow for the intercept term
- collecting all the x vectors into a matrix X and the β coefficients into a vector β, we can write

  y = Xβ + u

  where

  \[
  X = \begin{pmatrix}
  1 & X_{21} & \cdots & X_{k1} \\
  1 & X_{22} & \cdots & X_{k2} \\
  \vdots & \vdots & \ddots & \vdots \\
  1 & X_{2n} & \cdots & X_{kn}
  \end{pmatrix}
  \qquad \text{and} \qquad
  \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}
  \]

  (a small numerical sketch of assembling y and X follows below)
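To make the notation concrete, here is a minimal numpy sketch; the data values and variable names are made up for illustration and are not from the slides. It assembles the n-vector y and the design matrix X with a leading column of ones.

```python
import numpy as np

# hypothetical sample: n = 5 observations on Y and on two regressors X2, X3 (so k = 3)
Y  = np.array([3.1, 4.0, 5.2, 6.1, 7.3])
X2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X3 = np.array([2.0, 1.5, 3.5, 3.0, 4.5])

y = Y                                      # n-vector of observations on Y
X = np.column_stack([np.ones(len(Y)),      # x_1: column of ones for the intercept
                     X2, X3])              # x_2, x_3: regressor columns

print(X.shape)                             # (5, 3): an n x k design matrix
```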
the algebra of least squares

- if the unknown vector β is replaced by some guess or estimate b, this defines a vector of residuals e,

  e = y − Xb

- the least squares principle is to choose b to minimize the residual sum of squares e'e, namely,

  \[
  \begin{aligned}
  RSS &= e'e \\
      &= (y - Xb)'(y - Xb) \\
      &= y'y - b'X'y - y'Xb + b'X'Xb \\
      &= y'y - 2b'X'y + b'X'Xb
  \end{aligned}
  \]
the algebra of least squares (cont.)
- the first-order conditions for the minimization are

  \[
  \frac{\partial RSS}{\partial b} = -2X'y + 2X'Xb = 0
  \]

  giving the normal equations

  \[
  (X'X)b = X'y
  \]

- if y is replaced by Xb + e the result is

  \[
  (X'X)b = X'(Xb + e) = (X'X)b + X'e
  \]

  thus X'e = 0, which is another fundamental least-squares result
- the first element in this equation gives ∑ e_t = 0, that is,

  \[
  \bar{e} = \bar{Y} - b_1 - b_2\bar{X}_2 - \cdots - b_k\bar{X}_k = 0
  \]

  (a numerical sketch of solving the normal equations follows below)
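A minimal numpy sketch of the least-squares algebra on synthetic data (the data-generating values are arbitrary assumptions): it solves the normal equations (X'X)b = X'y and checks that X'e = 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])   # intercept column + 2 regressors
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)   # normal equations (X'X)b = X'y
e = y - X @ b                           # residual vector

print(b)                                # least-squares coefficients b1, ..., bk
print(np.allclose(X.T @ e, 0))          # True: X'e = 0, so the residuals sum to zero
```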
the algebra of least squares (cont.)

- ⇒ the residuals have zero mean, and the regression plane passes through the point of means in k-dimensional space
- the remaining elements are of the form

  \[
  \sum_t X_{it} e_t = 0, \qquad i = 2, \ldots, k
  \]

  which implies that each regressor has zero sample correlation with the residuals
- this, in turn, implies that ŷ (= Xb), the vector of the regression values for Y, is uncorrelated with e, for

  \[
  \hat{y}'e = (Xb)'e = b'X'e = 0
  \]
the algebra of least squares (cont.)

normal equations for the two-variable case

- here, k = 2 and the model of interest is Y = β1 + β2 X + u
- the X matrix is

  \[
  X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}
  \]

  thus,

  \[
  X'X = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{pmatrix}
        \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}
      = \begin{pmatrix} n & \sum X \\ \sum X & \sum X^2 \end{pmatrix}
  \]
the algebra of least squares (cont.)

and

\[
X'y = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{pmatrix}
      \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
    = \begin{pmatrix} \sum Y \\ \sum XY \end{pmatrix}
\]

giving

\[
\begin{pmatrix} n & \sum X \\ \sum X & \sum X^2 \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
= \begin{pmatrix} \sum Y \\ \sum XY \end{pmatrix}
\]

or

\[
\begin{aligned}
n b_1 + b_2 \sum X &= \sum Y \\
b_1 \sum X + b_2 \sum X^2 &= \sum XY
\end{aligned}
\]
the algebra of least squares (cont.)

normal equations for the three-variable case

- in a similar way, it may be shown that the normal equations for fitting a three-variable equation by least squares are

  \[
  \begin{aligned}
  n b_1 + b_2 \sum X_2 + b_3 \sum X_3 &= \sum Y \\
  b_1 \sum X_2 + b_2 \sum X_2^2 + b_3 \sum X_2 X_3 &= \sum X_2 Y \\
  b_1 \sum X_3 + b_2 \sum X_2 X_3 + b_3 \sum X_3^2 &= \sum X_3 Y
  \end{aligned}
  \]
decomposition of the sum of squares

- the zero covariances between the regressors and the residuals underlie the decomposition of the sum of squares
- decomposing the y vector into the part explained by the regression and the unexplained part, we have

  y = ŷ + e = Xb + e

  from which it follows that

  \[
  y'y = (\hat{y} + e)'(\hat{y} + e) = \hat{y}'\hat{y} + e'e = b'X'Xb + e'e
  \]

- however, y'y = ∑ Y_t² is the sum of squares of the actual Y values; interest normally centers on analyzing the variation in Y, measured by the sum of the squared deviations from the sample mean, namely,

  \[
  \sum_t (Y_t - \bar{Y})^2 = \sum_t Y_t^2 - n\bar{Y}^2
  \]
decomposition of the sum of squares (cont.)

- subtracting nȲ² from each side of the previous decomposition gives

  \[
  (y'y - n\bar{Y}^2) = (b'X'Xb - n\bar{Y}^2) + e'e
  \]
  \[
  TSS = ESS + RSS
  \]

  where TSS denotes the total sum of squares in Y, and ESS and RSS the explained and residual (unexplained) sums of squares (a numerical check follows below)
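A short numpy check of the decomposition on synthetic data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

TSS = y @ y - n * y.mean() ** 2                  # y'y - n*Ybar^2
ESS = b @ (X.T @ X) @ b - n * y.mean() ** 2      # b'X'Xb - n*Ybar^2
RSS = e @ e                                      # e'e
print(np.isclose(TSS, ESS + RSS))                # True: TSS = ESS + RSS
```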
equation in deviation form
- alternatively, express all the data in the form of deviations from the sample mean
- the least-squares equation is

  \[
  Y_t = b_1 + b_2 X_{2t} + b_3 X_{3t} + \cdots + b_k X_{kt} + e_t, \qquad t = 1, \ldots, n
  \]

- averaging over the sample observations gives

  \[
  \bar{Y} = b_1 + b_2 \bar{X}_2 + b_3 \bar{X}_3 + \cdots + b_k \bar{X}_k
  \]

  which contains no term in e, since ē is zero
- subtracting the second equation from the first gives

  \[
  y_t = b_2 x_{2t} + b_3 x_{3t} + \cdots + b_k x_{kt} + e_t, \qquad t = 1, \ldots, n
  \]

  where lowercase y_t and x_it denote deviations from the sample means
- the intercept b1 disappears, but it may be recovered from

  \[
  b_1 = \bar{Y} - b_2 \bar{X}_2 - \cdots - b_k \bar{X}_k
  \]
equation in deviation form (cont.)
- nb: the least-squares slope coefficients b2, . . . , bk are identical in both forms of the regression equation, and so are the residuals
- collecting all n observations, the deviation form of the equation may be written compactly using a transformation matrix

  \[
  A = I_n - \frac{1}{n} i i'
  \]

  where i is a column vector of n ones
- it follows that Ae = e and Ai = 0
- write the least-squares equation as

  \[
  y = Xb + e = \begin{pmatrix} i & X_2 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} + e
  \]

  where X_2 is the n × (k − 1) matrix of observations on the regressors and b_2 is the (k − 1)-element vector containing the coefficients b2, b3, . . . , bk
equation in deviation form (cont.)
- premultiplying by A gives

  \[
  Ay = \begin{pmatrix} 0 & AX_2 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} + Ae = (AX_2) b_2 + e
  \]

  or

  \[
  y_* = X_* b_2 + e
  \]

  where y_* = Ay and X_* = AX_2 give the data in deviation form. Since X'e = 0, it follows that X_*'e = 0
- premultiplying the previous equation by X_*' gives

  \[
  X_*' y_* = (X_*' X_*) b_2
  \]

  which are the familiar normal equations, except that now the data have all been expressed in deviation form and the b_2 vector contains the k − 1 slope coefficients and excludes the intercept term (a numerical check follows below)
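A numpy sketch of the deviation-form result on synthetic data (illustrative only): demeaning with A = I − (1/n)ii' and regressing Ay on AX_2 reproduces the slope coefficients of the full regression, and the intercept is recovered from the sample means.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

i = np.ones((n, 1))
A = np.eye(n) - (i @ i.T) / n             # demeaning (deviation-form) matrix

X2 = X[:, 1:]                             # regressors other than the column of ones
y_star, X_star = A @ y, A @ X2            # data in deviation form

b_full  = np.linalg.solve(X.T @ X, X.T @ y)                      # full regression (with intercept)
b_slope = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)  # deviation-form regression

print(np.allclose(b_full[1:], b_slope))   # True: identical slope coefficients
print(b_full[0], y.mean() - X2.mean(axis=0) @ b_slope)           # intercept recovered from the means
```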
equation in deviation form (cont.)
- the decomposition of the sum of squares may be expressed as

  \[
  y_*' y_* = b_2' X_*' X_* b_2 + e'e
  \]
  \[
  TSS = ESS + RSS
  \]

- the coefficient of multiple correlation R is defined as the positive square root of

  \[
  R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}
  \]

- the adjusted R² is defined as

  \[
  \bar{R}^2 = 1 - \frac{RSS/(n-k)}{TSS/(n-1)}
  \]

- the numerator and the denominator of the ratio on the RHS are unbiased estimators of the disturbance variance and of the variance of Y, respectively
equation in deviation form (cont.)
- the relation between the adjusted and unadjusted coefficients is

  \[
  \bar{R}^2 = 1 - \frac{n-1}{n-k}(1 - R^2) = \frac{1-k}{n-k} + \frac{n-1}{n-k} R^2
  \]

- two alternative criteria for comparing the fit of specifications are the Schwarz criterion

  \[
  SC = \ln\frac{e'e}{n} + \frac{k}{n}\ln n
  \]

  and the Akaike information criterion

  \[
  AIC = \ln\frac{e'e}{n} + \frac{2k}{n}
  \]

  (a computational sketch of these fit measures follows below)
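A numpy sketch computing R², the adjusted R², and the two criteria exactly as defined above, on synthetic data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

RSS = e @ e
TSS = np.sum((y - y.mean()) ** 2)

R2     = 1 - RSS / TSS
R2_adj = 1 - (RSS / (n - k)) / (TSS / (n - 1))
SC     = np.log(RSS / n) + k * np.log(n) / n     # Schwarz criterion, in the slide's form
AIC    = np.log(RSS / n) + 2 * k / n             # Akaike criterion, in the slide's form
print(R2, R2_adj, SC, AIC)
```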
generalizing partial correlation
- the normal equations solve for b = (X'X)^{-1} X'y
- the residuals from the LS regression may be expressed as

  \[
  e = y - Xb = y - X(X'X)^{-1}X'y = My
  \]

  where

  \[
  M = I - X(X'X)^{-1}X'
  \]

- M is a symmetric, idempotent matrix; it also has the properties MX = 0 and Me = e
- now write the general regression in partitioned form as

  \[
  y = \begin{pmatrix} x_2 & X_* \end{pmatrix} \begin{pmatrix} b_2 \\ b_{(2)} \end{pmatrix} + e
  \]

- in this partitioning x_2 is the n × 1 vector of observations on X2, with coefficient b2, and X_* is the n × (k − 1) matrix of all the other variables (including the column of ones) with coefficient vector b_(2)
generalizing partial correlation (cont.)
- the normal equations for this setup are

  \[
  \begin{pmatrix} x_2'x_2 & x_2'X_* \\ X_*'x_2 & X_*'X_* \end{pmatrix}
  \begin{pmatrix} b_2 \\ b_{(2)} \end{pmatrix}
  = \begin{pmatrix} x_2'y \\ X_*'y \end{pmatrix}
  \]

- the solution for b2 is

  \[
  b_2 = (x_2' M_* x_2)^{-1} (x_2' M_* y)
  \]

  where

  \[
  M_* = I - X_*(X_*'X_*)^{-1}X_*'
  \]

  M_* is a symmetric, idempotent matrix with the properties M_* X_* = 0 and M_* e = e
- we have that
  - M_* y is the vector of residuals when y is regressed on X_*
  - M_* x_2 is the vector of residuals when x_2 is regressed on X_*
generalizing partial correlation (cont.)
- regressing the first vector on the second gives a slope coefficient which, using the symmetry and idempotency of M_*, equals the b2 coefficient defined above
- a simpler way to prove the same result is as follows: write the partitioned regression as

  \[
  y = x_2 b_2 + X_* b_{(2)} + e
  \]

- premultiplying by M_*, obtain

  \[
  M_* y = (M_* x_2) b_2 + e
  \]

- finally, premultiply by x_2', which gives

  \[
  x_2' M_* y = (x_2' M_* x_2) b_2
  \]

  (a numerical check of this partialling-out result follows below)
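A numpy sketch of this partialling-out result on synthetic data (illustrative only): the coefficient from regressing M_*y on M_*x_2 equals the b2 obtained from the full regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

x2     = X[:, 1]                          # the regressor of interest
X_rest = np.delete(X, 1, axis=1)          # X_*: all other columns, including the ones

# M_* = I - X_*(X_*'X_*)^{-1}X_*'
M_rest = np.eye(n) - X_rest @ np.linalg.solve(X_rest.T @ X_rest, X_rest.T)

b2_partial = (x2 @ M_rest @ y) / (x2 @ M_rest @ x2)   # (x2'M_*x2)^{-1} x2'M_*y
b_full     = np.linalg.solve(X.T @ X, X.T @ y)
print(np.isclose(b2_partial, b_full[1]))              # True: same coefficient
```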
inference in the k-variables equation
assumptions

1. X is nonstochastic and has full rank k
2. the errors have the properties

   E(u) = 0

   and

   var(u) = E(uu') = σ²I

- since the expected value operator is applied to every element of a vector or matrix, we have

  \[
  E(u) = E\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
       = \begin{pmatrix} E(u_1) \\ E(u_2) \\ \vdots \\ E(u_n) \end{pmatrix}
       = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = 0
  \]
inference in the k-variables equation (cont.)
\[
E(uu') = E\left[\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
\begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix}\right]
= \begin{pmatrix}
E(u_1^2) & E(u_1 u_2) & \cdots & E(u_1 u_n) \\
E(u_2 u_1) & E(u_2^2) & \cdots & E(u_2 u_n) \\
\vdots & \vdots & \ddots & \vdots \\
E(u_n u_1) & E(u_n u_2) & \cdots & E(u_n^2)
\end{pmatrix}
\]
\[
= \begin{pmatrix}
\mathrm{var}(u_1) & \mathrm{cov}(u_1, u_2) & \cdots & \mathrm{cov}(u_1, u_n) \\
\mathrm{cov}(u_2, u_1) & \mathrm{var}(u_2) & \cdots & \mathrm{cov}(u_2, u_n) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(u_n, u_1) & \mathrm{cov}(u_n, u_2) & \cdots & \mathrm{var}(u_n)
\end{pmatrix}
= \begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
= \sigma^2 I
\]
inference in the k-variables equation (cont.)

- the previous matrix is the variance-covariance matrix of the error term
- this matrix embodies two strong assumptions: homoskedasticity and no serial correlation
inference in the k-variables equation (cont.)
Mean and Variance of b

- write the normal equations as

  \[
  b = (X'X)^{-1}X'y
  \]

- substitute for y to get

  \[
  b = (X'X)^{-1}X'(X\beta + u) = \beta + (X'X)^{-1}X'u
  \]

  from which

  \[
  b - \beta = (X'X)^{-1}X'u
  \]

- take expectations (moving the expectation operator to the right past nonstochastic terms such as X):

  \[
  E(b - \beta) = (X'X)^{-1}X'E(u) = 0
  \]

  giving E(b) = β
inference in the k-variables equation (cont.)
- under the assumptions of the model, the LS estimators are unbiased estimators of the β parameters
- to obtain the variance-covariance matrix of the LS estimators, consider

  \[
  \mathrm{var}(b) = E[(b - \beta)(b - \beta)']
  \]

  and, substituting for b − β, get

  \[
  \begin{aligned}
  E[(b - \beta)(b - \beta)'] &= E[(X'X)^{-1}X'uu'X(X'X)^{-1}] \\
  &= (X'X)^{-1}X'E[uu']X(X'X)^{-1} \\
  &= \sigma^2 (X'X)^{-1}
  \end{aligned}
  \]

  thus

  \[
  \mathrm{var}(b) = \sigma^2 (X'X)^{-1}
  \]
inference in the k-variables equation (cont.)
Estimation of σ²

- the variance-covariance matrix of the LS estimators involves the error variance σ², which is unknown
- it is reasonable to base an estimate on the residual sum of squares from the fitted regression
- write e = My = M(Xβ + u) = Mu, since MX = 0, so that

  \[
  E(e'e) = E(u'M'Mu) = E(u'Mu)
  \]

- exploiting the fact that the trace of a scalar is the scalar itself, write

  \[
  \begin{aligned}
  E(u'Mu) &= E[\mathrm{tr}(u'Mu)] \\
  &= E[\mathrm{tr}(uu'M)] \\
  &= \sigma^2 \mathrm{tr}(M) \\
  &= \sigma^2 \mathrm{tr}(I) - \sigma^2 \mathrm{tr}[X(X'X)^{-1}X'] \\
  &= \sigma^2 \mathrm{tr}(I) - \sigma^2 \mathrm{tr}[(X'X)^{-1}(X'X)] \\
  &= \sigma^2 (n - k)
  \end{aligned}
  \]
inference in the k-variables equation (cont.)

- thus

  \[
  s^2 = \frac{e'e}{n-k}
  \]

  defines an unbiased estimator of σ²
- the square root s measures the standard deviation of the Y values about the regression plane; it is referred to as the standard error of the estimate or the standard error of the regression (SER) (see the sketch below)
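A numpy sketch, on synthetic data (illustrative only), of s² = e'e/(n − k), the SER, and the implied coefficient standard errors, the square roots of the diagonal of s²(X'X)^{-1}.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

s2   = e @ e / (n - k)                   # unbiased estimator of sigma^2
ser  = np.sqrt(s2)                       # standard error of the regression (SER)
se_b = np.sqrt(s2 * np.diag(XtX_inv))    # standard errors of b1, ..., bk
print(ser, se_b)
```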
inference in the k-variables equation (cont.)
Gauss-Markov theorem

- this is the fundamental LS theorem
- it states that, conditional on the assumptions made, no other linear, unbiased estimator of the β coefficients can have smaller sampling variances than the least-squares estimator
- in particular:
  1. each LS estimator b_i is a best linear unbiased estimator (BLUE) of the population parameter β_i
  2. the BLUE of any linear combination of the β's is that same linear combination of the b's
  3. the BLUE of E(Y_s) is

     \[
     \hat{Y}_s = b_1 + b_2 X_{2s} + b_3 X_{3s} + \cdots + b_k X_{ks}
     \]

     which is the value found by inserting the relevant vector of X values into the regression equation
testing linear hypotheses about β

- we have established the properties of the LS estimators of β
- now we show how to test hypotheses about β
- consider, for example,
  (i)   H0: β_i = 0
  (ii)  H0: β_i = β_i0
  (iii) H0: β_2 + β_3 = 1
  (iv)  H0: β_3 = β_4, or β_3 − β_4 = 0
  (v)   H0: β_2 = β_3 = · · · = β_k = 0, i.e.

        \[
        \begin{pmatrix} \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{pmatrix}
        = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}
        \]

  (vi)  H0: β_2 = 0, where β_2 is a subvector of coefficients
testing linear hypotheses about β (cont.)
- all these examples fit into the general linear framework

  \[
  R\beta = r
  \]

  where R is a q × k matrix of known constants, with q < k, and r is a q-vector of known constants. Each null hypothesis determines the relevant elements in R and r
- for the previous examples we have
  (i)   R = [0 · · · 0 1 0 · · · 0], with 1 in the ith position;  r = 0;  q = 1
  (ii)  R = [0 · · · 0 1 0 · · · 0], with 1 in the ith position;  r = β_i0;  q = 1
  (iii) R = [0 1 1 0 · · · 0];  r = 1;  q = 1
  (iv)  R = [0 0 1 −1 0 · · · 0];  r = 0;  q = 1
  (v)   R = [0  I_{k−1}], where 0 is a vector of k − 1 zeros;  r = 0;  q = k − 1
  (vi)  R = [0_{k2×k1}  I_{k2}];  r = 0;  q = k2
testing linear hypotheses about β (cont.)
- we now derive a general testing procedure for the general linear hypothesis

  H0: Rβ − r = 0

- given the LS estimator, we can compute the vector (Rb − r), which measures the discrepancy between expectation and observation
- if this vector is "large", it casts doubt on the null hypothesis
- the distinction between "large" and "small" is determined from the sampling distribution under the null, in this case the distribution of Rb when Rβ = r
- from the unbiasedness result, it follows that E(Rb) = Rβ
- therefore

  \[
  \begin{aligned}
  \mathrm{var}(Rb) &= E[R(b - \beta)(b - \beta)'R'] \\
  &= R\,\mathrm{var}(b)\,R' \\
  &= \sigma^2 R(X'X)^{-1}R'
  \end{aligned}
  \]
testing linear hypotheses about β (cont.)
- we know the mean and the variance of the vector Rb
- we need a further assumption to determine the form of the sampling distribution: since b is a function of the u vector, the sampling distribution of Rb is determined by the distribution of u
- assume that the u_i are normally distributed, so that

  \[
  u \sim N(0, \sigma^2 I)
  \]

- it follows that

  \[
  b \sim N[\beta, \sigma^2(X'X)^{-1}]
  \]

  then

  \[
  Rb \sim N[R\beta, \sigma^2 R(X'X)^{-1}R']
  \]

  and so

  \[
  R(b - \beta) \sim N[0, \sigma^2 R(X'X)^{-1}R']
  \]

- if the null hypothesis Rβ = r is true, then

  \[
  (Rb - r) \sim N[0, \sigma^2 R(X'X)^{-1}R']
  \]
testing linear hypotheses about β (cont.)
- this equation gives the sampling distribution of Rb, and from it we may derive a χ² variable, namely

  \[
  (Rb - r)'[\sigma^2 R(X'X)^{-1}R']^{-1}(Rb - r) \sim \chi^2(q)
  \]

- σ² is unknown, but it can be shown that

  \[
  \frac{e'e}{\sigma^2} \sim \chi^2(n - k)
  \]

  and that this statistic is distributed independently of b
- a computable test statistic, which has an F distribution under the null, is

  \[
  \frac{(Rb - r)'[R(X'X)^{-1}R']^{-1}(Rb - r)/q}{e'e/(n-k)} \sim F(q, n - k)
  \]

- the test procedure is to reject Rβ = r if the computed F value exceeds the relevant critical value (a computational sketch follows below)
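A numpy/scipy sketch of the general F statistic on synthetic data (illustrative only); the restriction tested, β2 + β3 = 1, is example (iii), so R = [0 1 1] and r = 1.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.6, 0.4]) + rng.normal(scale=0.5, size=n)   # true beta2 + beta3 = 1

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - k)

R = np.array([[0.0, 1.0, 1.0]])           # q x k restriction matrix
r = np.array([1.0])                       # q-vector
q = R.shape[0]

d = R @ b - r                             # discrepancy Rb - r
F = (d @ np.linalg.solve(R @ XtX_inv @ R.T, d)) / q / s2
p_value = stats.f.sf(F, q, n - k)         # upper-tail probability of F(q, n - k)
print(F, p_value)
```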
testing linear hypotheses about β (cont.)

- it is helpful to rewrite the statistic as

  \[
  (Rb - r)'[s^2 R(X'X)^{-1}R']^{-1}(Rb - r)/q \sim F(q, n - k)
  \]

  thus, s²(X'X)^{-1} is the estimated variance-covariance matrix of b
- if we let c_ij denote the (i, j)th element of (X'X)^{-1}, then

  \[
  s^2 c_{ii} = \mathrm{var}(b_i) \qquad \text{and} \qquad s^2 c_{ij} = \mathrm{cov}(b_i, b_j), \qquad i, j = 1, 2, \ldots, k
  \]
testing linear hypotheses about β (cont.)
- going back to the previous examples . . .
  (i) H0: β_i = 0: Rb picks out b_i and R(X'X)^{-1}R' picks out c_ii, the ith diagonal element of (X'X)^{-1}. Thus we have

      \[
      F = \frac{b_i^2}{s^2 c_{ii}} = \frac{b_i^2}{\mathrm{var}(b_i)} \sim F(1, n - k)
      \]

      or, taking the square root,

      \[
      t = \frac{b_i}{s\sqrt{c_{ii}}} = \frac{b_i}{\mathrm{s.e.}(b_i)} \sim t(n - k)
      \]

  (ii) H0: β_i = β_i0: this hypothesis is tested by

      \[
      t = \frac{b_i - \beta_{i0}}{\mathrm{s.e.}(b_i)} \sim t(n - k)
      \]

      One may also compute a 95% confidence interval for β_i:

      \[
      b_i \pm t_{0.025}\, \mathrm{s.e.}(b_i)
      \]

      (a computational sketch follows below)
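A numpy/scipy sketch of cases (i)–(ii) on synthetic data (illustrative only): the t statistic for a single coefficient and its 95% confidence interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - k)
se_b = np.sqrt(s2 * np.diag(XtX_inv))

i, beta_i0 = 1, 0.0                        # test H0: beta_2 = 0 (index 1 in 0-based numpy)
t = (b[i] - beta_i0) / se_b[i]
p_value = 2 * stats.t.sf(abs(t), n - k)

t_crit = stats.t.ppf(0.975, n - k)         # t_{0.025}(n - k)
ci = (b[i] - t_crit * se_b[i], b[i] + t_crit * se_b[i])
print(t, p_value, ci)
```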
testing linear hypotheses about β (cont.)
  (iii) H0: β_2 + β_3 = 1: Rb gives the sum of the two estimated coefficients, b_2 + b_3. Premultiplying (X'X)^{-1} by R gives a row vector whose elements are the sums of the corresponding elements in the second and third rows of (X'X)^{-1}. Forming the inner product with R' gives the sum of the second and third elements of that row vector, that is, c_22 + 2c_23 + c_33, noting that c_23 = c_32. Thus

      \[
      \begin{aligned}
      s^2 R(X'X)^{-1}R' &= s^2(c_{22} + 2c_{23} + c_{33}) \\
      &= \mathrm{var}(b_2) + 2\,\mathrm{cov}(b_2, b_3) + \mathrm{var}(b_3) \\
      &= \mathrm{var}(b_2 + b_3)
      \end{aligned}
      \]

      The test statistic is then

      \[
      t = \frac{b_2 + b_3 - 1}{\sqrt{\mathrm{var}(b_2 + b_3)}} \sim t(n - k)
      \]

      Alternatively, one may compute, say, a 95% confidence interval for the sum (β_2 + β_3) as

      \[
      (b_2 + b_3) \pm t_{0.025}\sqrt{\mathrm{var}(b_2 + b_3)}
      \]
testing linear hypotheses about β (cont.)

  (iv) H0: β_3 = β_4: the test statistic here is

      \[
      t = \frac{b_3 - b_4}{\sqrt{\mathrm{var}(b_3 - b_4)}} \sim t(n - k)
      \]

  (v) H0: β_2 = β_3 = · · · = β_k = 0: this case involves a composite hypothesis about all k − 1 regressor coefficients. The F statistic for testing the joint significance of the complete set of regressors is

      \[
      F = \frac{ESS/(k-1)}{RSS/(n-k)} \sim F(k - 1, n - k)
      \]

      This statistic may also be expressed as

      \[
      F = \frac{R^2/(k-1)}{(1 - R^2)/(n-k)} \sim F(k - 1, n - k)
      \]
testing linear hypotheses about β (cont.)

  (vi) H0: β_2 = 0: this hypothesis postulates that a subset of coefficients is a zero vector. Partition the regression equation as follows:

      \[
      y = \begin{pmatrix} X_1 & X_2 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} + e = X_1 b_1 + X_2 b_2 + e
      \]

      where X_1 has k_1 columns, including a column of ones, X_2 has k_2 = k − k_1 columns, and b_1 and b_2 are the corresponding subvectors of regression coefficients. The hypothesis may be tested by running two separate regressions. First regress y on X_1 and denote the RSS by e_*'e_*. Then run the regression on all the X's, obtaining the RSS denoted by e'e. The test statistic is

      \[
      F = \frac{(e_*'e_* - e'e)/k_2}{e'e/(n-k)} \sim F(k_2, n - k)
      \]

      (a computational sketch of this two-regression test follows below)
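A numpy/scipy sketch of case (vi) on synthetic data (illustrative only): compare the restricted RSS (y regressed on X_1 alone) with the unrestricted RSS (y regressed on all the X's).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k1, k2 = 60, 2, 2                       # k = k1 + k2 regressors in total
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, k1 - 1))])   # includes the column of ones
X2 = rng.normal(size=(n, k2))                                      # the subset under test
X  = np.column_stack([X1, X2])
y  = X1 @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)     # true beta_2 block is zero

def rss(y, X):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return e @ e

rss_r, rss_u = rss(y, X1), rss(y, X)       # restricted and unrestricted RSS
k = k1 + k2
F = ((rss_r - rss_u) / k2) / (rss_u / (n - k))
print(F, stats.f.sf(F, k2, n - k))         # should not reject H0 here: beta_2 is truly zero
```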
restricted and unrestricted regressions

- examples (v) and (vi) may be interpreted as the outcome of two separate regressions
- recall that ESS may be expressed as ESS = y_*'y_* − e'e, where y_* = Ay
- it may be shown that y_*'y_* is the RSS when y_* is regressed on x_1 (= i)
- in both cases (v) and (vi) the first regression may be regarded as a restricted regression and the second as an unrestricted regression
- e_*'e_* is the restricted RSS and e'e is the unrestricted RSS
fitting the restricted regressions
- question: how to fit the restricted regression?
- answer: (1) either work out each specific case from first principles, or (2) derive a general formula into which specific cases can be fitted
- (1) as for the first approach, consider example (iii) with the regression in deviation form,

  \[
  y = b_2 x_2 + b_3 x_3 + e
  \]

- we want to impose the restriction b_2 + b_3 = 1. Substituting the restriction into the regression gives

  \[
  y = b_2 x_2 + (1 - b_2)x_3 + e_* \qquad \text{or} \qquad (y - x_3) = b_2(x_2 - x_3) + e_*
  \]

  so form two new variables, (y − x_3) and (x_2 − x_3): the simple regression of the first on the second (without a constant) gives the restricted estimate of b_2, and the RSS from this regression is the restricted RSS, e_*'e_*
fitting the restricted regressions (cont.)
- (2) the general approach requires a b_* vector that minimizes the RSS subject to the restrictions Rb_* = r. To do so, set up the function

  \[
  \phi = (y - Xb_*)'(y - Xb_*) - 2\lambda'(Rb_* - r)
  \]

  where λ is a q-vector of Lagrange multipliers
- the first-order conditions are

  \[
  \frac{\partial \phi}{\partial b_*} = -2X'y + 2(X'X)b_* - 2R'\lambda = 0
  \]
  \[
  \frac{\partial \phi}{\partial \lambda} = -2(Rb_* - r) = 0
  \]

- the solution for b_* is

  \[
  b_* = b + (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(r - Rb)
  \]

  where b is the unrestricted LS estimator (X'X)^{-1}X'y
fitting the restricted regressions (cont.)
- the residuals from the restricted regression are

  \[
  e_* = y - Xb_* = y - Xb - X(b_* - b) = e - X(b_* - b)
  \]

- transposing and multiplying, we obtain

  \[
  e_*'e_* = e'e + (b_* - b)'X'X(b_* - b)
  \]

- substituting for (b_* − b) and simplifying gives

  \[
  e_*'e_* - e'e = (r - Rb)'[R(X'X)^{-1}R']^{-1}(r - Rb)
  \]

  where, apart from q, the expression on the RHS is the same as the numerator in the F statistic
- thus an alternative expression of the test statistic for H0: Rβ = r is

  \[
  F = \frac{(e_*'e_* - e'e)/q}{e'e/(n-k)} \sim F(q, n - k)
  \]

  (a computational sketch follows below)
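A numpy sketch of the general restricted estimator b_* on synthetic data (illustrative only), checking that Rb_* = r holds exactly and computing the F statistic from the increase in RSS.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.6, 0.4]) + rng.normal(scale=0.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

R = np.array([[0.0, 1.0, 1.0]])            # hypothetical restriction: beta2 + beta3 = 1
r = np.array([1.0])
q = R.shape[0]

middle = np.linalg.inv(R @ XtX_inv @ R.T)
b_star = b + XtX_inv @ R.T @ middle @ (r - R @ b)    # restricted LS estimator
e_star = y - X @ b_star

print(R @ b_star)                          # equals r: the restriction holds exactly
F = ((e_star @ e_star - e @ e) / q) / (e @ e / (n - k))
print(F)                                   # same value as the Wald form given earlier
```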
prediction

- suppose that we have fitted a regression model, and we now consider some specific vector of regressor values,

  \[
  c' = \begin{pmatrix} 1 & X_{2f} & \cdots & X_{kf} \end{pmatrix}
  \]

- we wish to predict the value of Y conditional on c
- a point prediction is obtained by inserting the given X values into the regression equation, giving

  \[
  \hat{Y}_f = b_1 + b_2 X_{2f} + \cdots + b_k X_{kf} = c'b
  \]

- the Gauss-Markov theorem shows that c'b is the BLUE of c'β; here c'β = E(Y_f), so that Ŷ_f is an optimal predictor of E(Y_f)
- as var(Rb) = R var(b) R', replacing R by c' gives

  \[
  \mathrm{var}(c'b) = c'\,\mathrm{var}(b)\,c
  \]
prediction (cont.)
- if we assume normality for the error term, it follows that

  \[
  \frac{c'b - c'\beta}{\sqrt{\mathrm{var}(c'b)}} \sim N(0, 1)
  \]

- when the unknown σ² in var(b) is replaced by s², we have

  \[
  \frac{\hat{Y}_f - E(Y_f)}{s\sqrt{c'(X'X)^{-1}c}} \sim t(n - k)
  \]

  from which a 95% confidence interval for E(Y_f) is

  \[
  \hat{Y}_f \pm t_{0.025}\, s\sqrt{c'(X'X)^{-1}c}
  \]

- to obtain a confidence interval for Y_f rather than E(Y_f), note that the two differ only by the error u_f that appears in the prediction period
- the point prediction is the same as before, but the uncertainty of the prediction increases
prediction (cont.)

- we have Ŷ_f = c'b as before, and now Y_f = c'β + u_f, so that the prediction error is

  \[
  e_f = Y_f - \hat{Y}_f = u_f - c'(b - \beta)
  \]

- squaring both sides and taking expectations gives the variance of the prediction error

  \[
  \mathrm{var}(e_f) = \sigma^2 + c'\,\mathrm{var}(b)\,c = \sigma^2\left(1 + c'(X'X)^{-1}c\right)
  \]

  from which we derive a t statistic

  \[
  \frac{\hat{Y}_f - Y_f}{s\sqrt{1 + c'(X'X)^{-1}c}} \sim t(n - k)
  \]

  (a computational sketch of both prediction intervals follows below)
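A numpy/scipy sketch of the prediction formulas on synthetic data (illustrative only; the regressor vector c is an arbitrary choice): the point prediction c'b, the 95% interval for E(Y_f), and the wider 95% interval for Y_f itself.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s = np.sqrt(e @ e / (n - k))

c = np.array([1.0, 0.5, -1.0])             # hypothetical regressor values [1, X2f, ..., Xkf]
y_hat = c @ b                              # point prediction c'b
t_crit = stats.t.ppf(0.975, n - k)

h = c @ XtX_inv @ c                        # c'(X'X)^{-1}c
ci_mean = (y_hat - t_crit * s * np.sqrt(h),     y_hat + t_crit * s * np.sqrt(h))      # for E(Yf)
ci_pred = (y_hat - t_crit * s * np.sqrt(1 + h), y_hat + t_crit * s * np.sqrt(1 + h))  # for Yf itself
print(y_hat, ci_mean, ci_pred)
```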
