Unit 8
MODEL: INFERENCES
Structure
8.0 Objectives
8.1 Introduction
8.2 Assumptions of Multiple Regression Models
8.2.1 Classical Assumptions
8.2.2 Test for Normality of the Error Term
8.3 Testing of Single Parameter
8.3.1 Test of Significance Approach
8.3.2 Confidence Interval Approach
8.4 Testing of Overall Significance
8.5 Test of Equality between Two Parameters
8.6 Test of Linear Restrictions on Parameters
8.6.1 The t-Test Approach
8.6.2 Restricted Least Squares
8.7 Structural Stability of a Model: Chow Test
8.8 Prediction
8.8.1 Mean Prediction
8.8.2 Individual Prediction
8.9 Let Us Sum Up
8.10 Answers/Hints to Check Your Progress Exercises
8.0 OBJECTIVES
After going through this unit, you should be able to
explain the need for the assumption of normality in the case of multiple
regression;
describe the procedure of testing of hypothesis on individual estimators;
test the overall significance of a regression model;
test for the equality of two regression coefficients;
explain the procedure of applying the Chow test;
make predictions on the basis of a multiple regression model;
interpret the results obtained from the testing of hypothesis, both individual
and joint; and
apply various tests such as the likelihood ratio (LR), Wald (W) and Lagrange
multiplier (LM) tests.
Dr. Pooja Sharma, Assistant Professor, Daulat Ram College, University of Delhi
Multiple Regression
Models
8.1 INTRODUCTION
In the previous unit we discussed the interpretation and estimation of
multiple regression models. We looked at the assumptions required for
ordinary least squares (OLS) and maximum likelihood (ML) estimation. In
the present Unit we look at the methods of hypothesis testing in multiple
regression models.
Recall that in Unit 3 of this course we mentioned the procedure of hypothesis
testing. Further, in Unit 5 we explained the procedure of hypothesis testing in the
case of two-variable regression models. Now let us extend the procedure of
hypothesis testing to multiple regression models. There are two scenarios in
multiple regression models so far as hypothesis testing is concerned: (i) testing of
individual coefficients, and (ii) joint testing of some of the parameters. We also
discuss the method of testing for the structural stability of a regression model by
applying the Chow test. Further, we discuss three important tests, viz., the
Likelihood Ratio test, the Wald test, and the Lagrange Multiplier test. Finally, we
deal with the issue of prediction on the basis of a multiple regression equation.
One of the assumptions in hypothesis testing is that the error term 𝑢 follows the
normal distribution. Is there a method to test for the normality of a variable? We
will discuss this issue also. However, let us begin with an overview of the basic
assumptions of multiple regression models.
where
n = sample size
S = measure of skewness (S = m₃/m₂^(3/2))
K = measure of kurtosis (K = m₄/m₂²)
Here m₂, m₃ and m₄ denote the second, third and fourth sample moments of the
residuals about their mean. For a normal distribution, S = 0 and K = 3.
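The normality test of this section can be illustrated numerically. The sketch below (assuming the section uses the standard Jarque-Bera statistic JB = n[S²/6 + (K − 3)²/24]) computes S, K and JB from a set of residuals in plain Python; the residual values are made up for illustration. Under the null hypothesis of normal errors, JB follows a chi-square distribution with 2 degrees of freedom, so at the 5 per cent level we reject normality when JB exceeds 5.99.

```python
def jarque_bera(residuals):
    """Jarque-Bera statistic JB = n * (S^2/6 + (K - 3)^2/24)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n   # second moment
    m3 = sum((e - mean) ** 3 for e in residuals) / n   # third moment
    m4 = sum((e - mean) ** 4 for e in residuals) / n   # fourth moment
    S = m3 / m2 ** 1.5          # skewness (0 for a symmetric sample)
    K = m4 / m2 ** 2            # kurtosis (3 for a normal distribution)
    return n * (S ** 2 / 6 + (K - 3) ** 2 / 24)

# Illustrative residuals (assumed values, not from the text):
jb = jarque_bera([-1.9, -1.0, -0.5, 0.0, 0.5, 1.0, 1.9])
print(round(jb, 3))
# Since jb is below the 5% chi-square(2) critical value 5.99,
# we do not reject the hypothesis of normality here.
```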
8.3 TESTING OF SINGLE PARAMETER
There are two approaches to hypothesis testing: (i) test of significance approach,
and (ii) confidence interval approach. We discuss both the approaches below.
8.3.1 Test of Significance Approach
In this approach we proceed as follows:
(i) Take the point estimate of the parameter that we want to test, viz., b₁,
b₂ or b₃.
(ii) Set the null hypothesis. Suppose we expect that the variable 𝑋₂ has no
influence on Y. It implies that 𝛽₂ should be zero. Thus, the null hypothesis
is 𝐻₀: 𝛽₂ = 0. In this case what should be the alternative hypothesis? The
alternative hypothesis is 𝐻₁: 𝛽₂ ≠ 0.
(iii) If 𝛽₂ ≠ 0, then 𝛽₂ could be either positive or negative. Thus we have
to apply a two-tail test. Accordingly, the critical value of the t-ratio has
to be decided.
(iv) Let us consider another scenario. Suppose we expect that 𝛽₂ should be
positive. It implies that our null hypothesis is 𝐻₀: 𝛽₂ > 0. The
alternative hypothesis is 𝐻₁: 𝛽₂ ≤ 0.
(v) If the null hypothesis is not true, 𝛽₂ could be either zero or negative.
Thus the critical region or rejection region lies on one side of the
t probability curve. Therefore, we have to apply a one-tail test.
Accordingly, the critical value of the t-ratio is to be decided.
(vi) Remember that the null hypothesis depends on economic theory or
logic. Therefore, you have to set the null hypothesis according to
some logic. If you expect that the explanatory variable should have no
effect on the dependent variable, then set the parameter as zero in the
null hypothesis.
(vii) Decide on the level of significance. It represents the extent of error you
are willing to tolerate. If the level of significance is 5 per cent (α = 0.05),
your decision on the null hypothesis will be wrong 5 per cent of the
time. If you take a 1 per cent level of significance (α = 0.01), then your
decision on the null hypothesis will be wrong 1 per cent of the time (i.e., it
will be correct 99 per cent of the time).
(viii) Compute the t-ratio. Here the standard error is the positive square root
of the variance of the estimator. The formula for the variance of the
OLS estimators in multiple regression models is given in Unit 7.
𝑡 = 𝑏₂ / SE(𝑏₂) … (8.5)
(ix) Compare the computed value of the t-ratio with the tabulated value of
the t-ratio. Be careful about two issues while reading the t-table:
(i) the level of significance, and (ii) the degrees of freedom. The level of
significance we have mentioned above. The degrees of freedom are (n–k), as
you know from the previous Unit.
(x) If the computed value of the t-ratio is greater in absolute value than the
tabulated value, reject the null hypothesis (and thereby accept the
alternative hypothesis). If the computed value is smaller in absolute
value than the tabulated value, do not reject the null hypothesis.
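As a concrete illustration of steps (viii)–(x), the following sketch works through the decision rule with made-up numbers; the estimate, its standard error, and the critical value are assumptions for illustration, not values from the text.

```python
# Hypothetical estimation results (assumed values):
b2 = 0.726           # estimated coefficient b2
se_b2 = 0.048        # standard error of b2
t_computed = b2 / se_b2          # t-ratio of equation (8.5) under H0: beta2 = 0
t_critical = 2.0                 # approx. two-tail 5% critical value for large df
if abs(t_computed) > t_critical:
    decision = "reject H0: beta2 = 0"
else:
    decision = "do not reject H0"
print(decision)
```

Here t = 0.726/0.048 ≈ 15.1, far above the critical value, so the coefficient is statistically significant at the 5 per cent level.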
8.3.2 Confidence Interval Approach
We have discussed interval estimation in Unit 3 and Unit 5. Thus, here we
bring out the essential points only.
(i) Remember that confidence interval (CI) is created individually for
each parameter. There cannot be a single confidence interval for a
group of parameters.
(ii) The confidence interval is built on the basis of the logic described above
in the test of significance approach.
(iii) Suppose we have the null hypothesis 𝐻₀: 𝛽₂ = 0 and the alternative
hypothesis 𝐻₁: 𝛽₂ ≠ 0. The estimator of 𝛽₂ is 𝑏₂. We know the
standard error of 𝑏₂.
(iv) Here also we decide on the level of significance (α). We refer to the t-
table and find out the t-ratio for desired level of significance.
(v) The degrees of freedom are known to us, i.e., (n–k).
(vi) Since the above is case of two-tailed test, we take 𝛼⁄2 on each side of
the t probability curve. Therefore, we take the t-ratio corresponding to
the probability 𝛼⁄2 and the degrees of freedom applicable.
(vii) Remember that the confidence interval is created with the help of the
estimator and its standard error. We test whether the hypothesised value
of the parameter lies within the confidence interval or not.
(viii) Construct the confidence interval as follows:
𝑏₂ − t_(α⁄2) SE(𝑏₂) ≤ 𝛽₂ ≤ 𝑏₂ + t_(α⁄2) SE(𝑏₂) … (8.6)
(ix) The probability of the parameter lying within the confidence interval
is (1 − α). If we have taken the level of significance as 5 per cent, then
the probability that 𝛽₂ lies within the confidence interval is 95 per
cent.
P[𝑏₂ − t_(α⁄2) SE(𝑏₂) ≤ 𝛽₂ ≤ 𝑏₂ + t_(α⁄2) SE(𝑏₂)] = (1 − α) … (8.7)
(x) If the hypothesised value of the parameter (in this case, 𝛽₂ = 0) lies
within the confidence interval, do not reject the null hypothesis.
(xi) If it lies outside the confidence interval, reject the null hypothesis
and accept the alternative hypothesis.
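The confidence-interval steps above can be sketched with the same kind of hypothetical numbers (the estimate, standard error and t-value are assumptions, not values from the text):

```python
# Hypothetical values: b2, SE(b2) and the t-value for alpha/2 are assumed.
b2, se_b2, t_half_alpha = 0.726, 0.048, 2.0
lower = b2 - t_half_alpha * se_b2    # lower limit, as in equation (8.6)
upper = b2 + t_half_alpha * se_b2    # upper limit
hypothesised = 0.0                   # value of beta2 under H0
in_interval = lower <= hypothesised <= upper
print((round(lower, 3), round(upper, 3)))
print("do not reject H0" if in_interval else "reject H0")
```

Since the interval (0.630, 0.822) does not contain zero, the null hypothesis 𝛽₂ = 0 is rejected; this agrees with the test-of-significance approach, as the two approaches always give the same decision for a two-tail test.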
Check Your Progress 2
1) Describe the steps you would follow in testing the hypothesis that 𝛽₂ < 0.
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
........................................................................................................................
(iii) Decide on the level of significance. It has the same connotation as in
the case of the t-test described above.
(iv) For the multiple regression model the F-statistic is given by
F = [ESS/(k − 1)] / [RSS/(n − k)] … (8.10)
𝐻₀: 𝛽₂ = 𝛽₃ = ⋯ = 𝛽ₖ = 0 … (8.25)
(iv) The corresponding alternative hypothesis is that not all of the 𝛽s are
zero.
(v) Estimate the unrestricted regression model given at (8.11). Obtain the
residual sum of squares (RSS) on the basis of the estimated regression
equation. Denote it as RSS_UR.
(ix) Find out the computed value of F on the basis of equation (8.10).
Compare it with the tabulated value of F (given at the end of the
book). Read the tabulated F value for desired level of significance and
applicable degrees of freedom.
(x) If the computed value of F is greater than the tabulated value, then
reject the null hypothesis.
(xi) If the computed value is less than the tabulated value, do not reject the
null hypothesis.
As mentioned earlier, the residual sum of squares (RSS) and the coefficient of
determination (𝑅²) are related. Therefore, it is possible to carry out the F-test on
the basis of 𝑅² also. If we have the coefficient of determination for the
unrestricted model (𝑅²_UR) and the coefficient of determination for the restricted
model (𝑅²_R), then we can test the joint hypothesis about the set of parameters.
The F-statistic will be
F = [(𝑅²_UR − 𝑅²_R)/m] / [(1 − 𝑅²_UR)/(n − k)] … (8.27)
where m is the number of restrictions imposed.
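Equation (8.27) can be computed directly once the two 𝑅² values are known. The sketch below uses illustrative numbers; the 𝑅² values, m, n and k are assumptions for the example, not data from the text.

```python
def f_restrictions(r2_ur, r2_r, m, n, k):
    """F-statistic of equation (8.27): m restrictions, n observations,
    k parameters in the unrestricted model."""
    return ((r2_ur - r2_r) / m) / ((1 - r2_ur) / (n - k))

# Illustrative values: imposing 2 restrictions lowers R^2 from 0.92 to 0.88
F = f_restrictions(r2_ur=0.92, r2_r=0.88, m=2, n=30, k=5)
print(round(F, 2))   # compare with the tabulated F(m, n - k) = F(2, 25) value
```

If the computed F exceeds the tabulated F for m and (n − k) degrees of freedom, the restrictions are rejected, i.e., the dropped variables jointly matter.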
8.7 STRUCTURAL STABILITY OF A MODEL: CHOW TEST
Many times we come across situations where there is a change in the pattern of
the data. The relationship between the dependent and independent variables may
not remain the same throughout the sample. For example, the saving behaviour
of poor and rich households may be different. The output of an industry may be
different after a policy change. In such situations it may not be appropriate to run
a single regression for the entire dataset. There is a need to check for the
structural stability of the econometric model.
There are various procedures to allow for structural breaks in a regression model.
We will discuss the dummy variable approach in Unit 9. In this Unit we discuss
a very simple and specific case.
Suppose we have data on n observations. We suspect that the first 𝑛₁
observations are different from the remaining 𝑛₂ observations (we have 𝑛₁ +
𝑛₂ = 𝑛). In this case we run the following three regression equations:
𝑌ₜ = 𝜆₁ + 𝜆₂𝑋ₜ + 𝑢ₜ (number of observations: 𝑛₁) … (8.28)
𝑌ₜ = 𝑟₁ + 𝑟₂𝑋ₜ + 𝑣ₜ (number of observations: 𝑛₂) … (8.29)
𝑌ₜ = 𝛼₁ + 𝛼₂𝑋ₜ + 𝑤ₜ (number of observations: 𝑛 = 𝑛₁ + 𝑛₂) … (8.30)
If both the sub-samples are the same, then we should have 𝜆₁ = 𝑟₁ = 𝛼₁
and 𝜆₂ = 𝑟₂ = 𝛼₂. If the two sub-samples are different, then there is a
structural break in the sample. It implies that the parameters of equations (8.28)
and (8.29) are different. In order to test for the structural stability of the
regression model we apply the Chow test.
We proceed as follows:
(i) Run the regression model (8.28). Obtain its residual sum of squares, RSS₁.
(ii) Run regression model (8.29). Obtain its residual sum of squares, RSS₂.
(iii) Run regression model (8.30). Obtain its residual sum of squares, RSS₃.
(iv) In regression model (8.30) we are forcing the model to have the same
parameters in both the sub-samples. Therefore, let us call the residual
sum of squares obtained from this model RSS_R (so RSS_R = RSS₃).
(v) Since the regression models given at (8.28) and (8.29) are independent, let
us call them together the unrestricted model. Therefore, RSS_UR = RSS₁ + RSS₂.
(vi) Suppose both the sub-samples are the same. In that case there should not
be any difference between RSS_R and RSS_UR. Our null hypothesis in that
case is H₀: there is no structural change (or, there is parameter
stability).
(vii) Test the above by the following test statistic:
F = [(RSS_R − RSS_UR)/k] / [RSS_UR/(𝑛₁ + 𝑛₂ − 2k)] … (8.31)
where k is the number of parameters estimated in each of the sub-sample
equations.
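The Chow test computation in (8.31) can be sketched as follows; the RSS values and sample sizes below are illustrative assumptions, not data from the text.

```python
def chow_f(rss_r, rss_1, rss_2, n1, n2, k):
    """Chow test F-statistic of equation (8.31).
    rss_r: RSS of the pooled (restricted) regression (8.30);
    rss_1, rss_2: RSS of the sub-sample regressions (8.28) and (8.29);
    k: number of parameters estimated in each equation."""
    rss_ur = rss_1 + rss_2                                   # unrestricted RSS
    return ((rss_r - rss_ur) / k) / (rss_ur / (n1 + n2 - 2 * k))

# Illustrative values for a two-parameter (intercept + slope) model:
F = chow_f(rss_r=1785.0, rss_1=420.0, rss_2=800.0, n1=12, n2=14, k=2)
print(round(F, 2))   # compare with tabulated F(k, n1 + n2 - 2k)
```

If the computed F exceeds the tabulated F for k and (𝑛₁ + 𝑛₂ − 2k) degrees of freedom, we reject H₀ and conclude that there is a structural break.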
E(𝑌₀ | 𝑋₀′) = 𝑋₀′𝛽 ... (8.37)
where the values of 𝑋₀ are fixed. You should note that (8.36) gives an unbiased
prediction of E(𝑌₀ | 𝑋₀′), since E(𝑋₀′𝑏) = 𝑋₀′𝛽.
Here var(𝑌̂₀ | 𝑋₀) stands for E[(𝑌₀ − 𝑌̂₀)² | 𝑋₀]. In practice we replace σ² by its
unbiased estimator σ̂².
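The contrast between mean prediction (Section 8.8.1) and individual prediction (Section 8.8.2) can be summarised in the standard matrix form; this is a sketch assuming the notation 𝑋₀, 𝑏, σ² of this section:

```latex
\underbrace{\operatorname{var}(\hat{Y}_0 \mid X_0)}_{\text{mean prediction}}
  = \sigma^2 \, X_0'(X'X)^{-1}X_0,
\qquad
\underbrace{\operatorname{var}(Y_0 - \hat{Y}_0 \mid X_0)}_{\text{individual prediction}}
  = \sigma^2\left[\,1 + X_0'(X'X)^{-1}X_0\,\right].
```

The additional σ² term in the individual-prediction variance reflects the disturbance 𝑢₀ attached to the individual observation 𝑌₀, which remains unpredictable even if 𝛽 were known exactly; this is why the confidence interval for an individual prediction is wider than that for the mean prediction.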
Check Your Progress 4
1) Consider a Cobb-Douglas production function. Write down the steps of
testing the hypothesis that it exhibits constant returns to scale.
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................
3) Point out why individual prediction has higher variance than mean
prediction.
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................