L8.2 2023
The Problem of Inference: Multiple Regression Analysis

The Normality Assumption

▰ The ui follow the normal distribution with zero mean and constant variance σ².

The Normality Assumption

▰ Upon replacing σ² by its unbiased estimator σ̂² in the computation of the standard errors, each of the variables

t = (β̂1 − β1)/se(β̂1),  t = (β̂2 − β2)/se(β̂2),  t = (β̂3 − β3)/se(β̂3)

▰ follows the t distribution with n − 3 df.

EXAMPLE

▰ Child Mortality Example: regress child mortality (cm) on per capita GNP (pgnp) and the female literacy rate (flr).


Source SS df MS Number of obs = 64
F(2, 61) = 73.83
Model 257362.373 2 128681.187 Prob > F = 0.0000
Residual 106315.627 61 1742.87913 R-squared = 0.7077
Adj R-squared = 0.6981
Total 363678 63 5772.66667 Root MSE = 41.748

cm Coef. Std. Err. t P>|t| [95% Conf. Interval]

pgnp     -.0056466   .0020033    -2.82   0.006   -.0096524   -.0016408
flr      -2.231586   .2099472   -10.63   0.000   -2.651401    -1.81177
_cons     263.6416   11.59318    22.74   0.000    240.4596    286.8236

EXAMPLE

▰ Child Mortality Example

▰ What about the statistical significance of the observed results?

Hypothesis Testing in Multiple Regression

▰ Testing hypotheses about an individual partial regression coefficient.
▰ Testing the overall significance of the estimated multiple regression
model.
▰ Testing that two or more coefficients are equal to one another.
▰ Testing that the partial regression coefficients satisfy certain
restrictions.
▰ Testing the stability of the estimated regression model over time or in different cross-sectional units.
▰ Testing the functional form of regression models.

Hypothesis Testing about Individual Regression Coefficients
▰ Suppose H0: β2 = 0 and H1: β2 ≠ 0.
▰ The null hypothesis states that, with X3 (female literacy rate) held constant, X2 (PGNP) has no (linear) influence on Y (child mortality).
▰ To test the null hypothesis, we use the t statistic t = β̂2/se(β̂2).

Example

▰ With 61 df, the critical t value at the 5 percent level is about 2.0 for a two-tail test or 1.671 for a one-tail test.
▰ Since the computed t value of 2.8187 (in absolute terms) exceeds the critical value of 2.0, we can reject the null hypothesis that PGNP has no effect on child mortality.
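▰ As a quick check, the reported t statistic, critical value, and p value can be reproduced in Stata from the coefficient and standard error in the output above (a sketch; display, invttail, and ttail are built-in Stata commands/functions):

* Sketch: reproduce the t test for pgnp from the reported output.
display -.0056466 / .0020033                     // t = -2.8187
display invttail(61, .025)                       // 5% two-tail critical value, 61 df (about 2.00)
display 2*ttail(61, abs(-.0056466/.0020033))     // two-sided p value (about 0.006)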


Example

▰ The 95 percent confidence interval for β2 is (−.0096524, −.0016408), that is, β̂2 ± t.025 se(β̂2) with 61 df, as reported in the regression output.
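▰ A minimal Stata sketch reproduces this interval from the reported coefficient and standard error:

* Sketch: 95% confidence interval for the pgnp coefficient (61 df).
display -.0056466 - invttail(61, .025)*.0020033   // lower limit: -.0096524
display -.0056466 + invttail(61, .025)*.0020033   // upper limit: -.0016408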

Testing the Overall Significance of the Sample Regression

▰ Consider the following hypothesis: H0: β2 = β3 = 0.
▰ This null hypothesis is a joint hypothesis that β2 and β3 are jointly or simultaneously equal to zero.
▰ A test of such a hypothesis is called a test of the overall significance of the observed or estimated regression line, that is, of whether Y is linearly related to both X2 and X3.

The F Test

▰ Recall the identity TSS = ESS + RSS.
▰ TSS has n − 1 df and RSS has n − 3 df.
▰ ESS has 2 df since it is a function of the two explanatory variables.

The F Test

▰ Under the assumption of a normal distribution for ui and the null hypothesis β2 = β3 = 0, the variable

F = (ESS/2) / (RSS/(n − 3))

▰ is distributed as the F distribution with 2 and n − 3 df.
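▰ A short Stata sketch computes this F from the ESS and RSS reported in the ANOVA output above:

* Sketch: overall F for the child mortality regression.
display (257362.373/2) / (106315.627/61)   // F = 73.83
display Ftail(2, 61, 73.83)                // p value, essentially zero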

The F Test

▰ ANOVA table for the three-variable regression:

Source of variation        SS      df      MSS
Due to regression (ESS)    ESS      2      ESS/2
Due to residuals (RSS)     RSS    n − 3    RSS/(n − 3)
Total (TSS)                TSS    n − 1

The F Test

▰ Under the assumption that the ui are normally distributed, we get

E(RSS/(n − 3)) = E(σ̂²) = σ²

The F Test

▰ With the additional assumption that β2 = β3 = 0, it can be shown that

E(ESS/2) = σ²

▰ If the null hypothesis is true, both equations give identical estimates of the true σ².
▰ In that case there is no (linear) relationship between Y and X2 and X3, and the sole source of variation in Y is the random forces represented by ui.

The F Test

▰ Summary of the F statistic: in the k-variable case, to test H0: β2 = β3 = · · · = βk = 0, compute F = (ESS/(k − 1)) / (RSS/(n − k)), which follows the F distribution with k − 1 and n − k df under the null.

Example

▰ For the child mortality regression, F = (257362.373/2) / (106315.627/61) = 128681.19/1742.88 ≈ 73.83, as reported in the Stata output.

Testing the Overall Significance of a Multiple Regression: The F Test

The F Test

▰ If F > Fα(k − 1, n − k), reject H0; otherwise do not reject it.

An Important Relationship between R² and F

▰ The overall F statistic and R² are related by

F = (R²/(k − 1)) / ((1 − R²)/(n − k))

where k is the number of parameters, including the intercept.
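▰ A one-line Stata sketch verifies the relationship for the child mortality regression (R² = 0.7077, n = 64, k = 3):

* Sketch: the overall F recomputed from R² alone.
display (.7077/2) / ((1 - .7077)/61)   // about 73.8, matching F(2, 61) in the ANOVA output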

Example Revisited

▰ The results are


Source SS df MS Number of obs = 64
F(2, 61) = 73.83
Model 257362.373 2 128681.187 Prob > F = 0.0000
Residual 106315.627 61 1742.87913 R-squared = 0.7077
Adj R-squared = 0.6981
Total 363678 63 5772.66667 Root MSE = 41.748

cm Coef. Std. Err. t P>|t| [95% Conf. Interval]

pgnp     -.0056466   .0020033    -2.82   0.006   -.0096524   -.0016408
flr      -2.231586   .2099472   -10.63   0.000   -2.651401    -1.81177
_cons     263.6416   11.59318    22.74   0.000    240.4596    286.8236

The F Test

▰ ANOVA table in terms of R²:

Source of variation       SS                 df      MSS
Due to regression     R² · TSS                2      R² · TSS/2
Due to residuals     (1 − R²) · TSS         n − 3    (1 − R²) · TSS/(n − 3)
Total                    TSS               n − 1

Testing the Overall Significance of a Multiple Regression in Terms of R²

▰ Compute F = (R²/(k − 1)) / ((1 − R²)/(n − k)).
▰ If F > Fα(k − 1, n − k), reject H0; otherwise do not reject it.

The “Incremental” or “Marginal” Contribution of an Explanatory Variable

▰ What happens if we introduce PGNP and FLR sequentially?
▰ We ask whether the addition of a variable to the model increases ESS (and hence R²) “significantly” in relation to the RSS.
▰ This contribution may appropriately be called the incremental, or marginal, contribution of an explanatory variable.

Example

▰ First regress child mortality on PGNP alone:

Source SS df MS Number of obs = 64


F(1, 62) = 12.36
Model 60449.4605 1 60449.4605 Prob > F = 0.0008
Residual 303228.539 62 4890.78289 R-squared = 0.1662
Adj R-squared = 0.1528
Total 363678 63 5772.66667 Root MSE = 69.934

cm Coef. Std. Err. t P>|t| [95% Conf. Interval]

pgnp     -.0113645   .0032325    -3.52   0.001   -.0178262   -.0049027
_cons     157.4244    9.845583   15.99   0.000    137.7434    177.1055

Example

▰ ANOVA table for this regression (from the output above):

Source of variation      SS           df      MSS
Due to regression     60449.4605       1    60449.4605
Due to residuals     303228.539       62     4890.7829
Total                363678           63

Example

▰ Regressing child mortality on PGNP gives the fitted line: predicted CM = 157.4244 − 0.0114 PGNP.
▰ The F value, F = 60449.4605/4890.7829 ≈ 12.36,
▰ follows the F distribution with 1 and 62 df. This F value is highly significant, as the computed p value is 0.0008.
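▰ A Stata sketch reproduces this F and its p value from the mean squares in the output above:

* Sketch: F for the PGNP-only regression.
display 60449.4605 / 4890.78289   // F(1, 62) = 12.36
display Ftail(1, 62, 12.36)       // p value = 0.0008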

“Marginal” Contribution of an Explanatory Variable

▰ Suppose we decide to add FLR to the model and obtain the multiple regression. The questions we want to answer are:
1. What is the marginal, or incremental, contribution of FLR, knowing that PGNP is already in the model and that it is significantly related to CM?
2. Is the incremental contribution of FLR statistically significant?
3. What is the criterion for adding variables to the model?

“Marginal” Contribution of an Explanatory Variable

▰ To assess the incremental contribution of X3 after allowing for the contribution of X2, we compute

F = ((ESS_new − ESS_old)/number of new regressors) / (RSS_new/(n − number of parameters in the new model))

“Marginal” Contribution of an Explanatory Variable

▰ ANOVA table to assess the incremental contribution of a variable(s):

Source of variation                   SS              df
ESS due to X2 alone                   Q1               1
Incremental ESS due to adding X3      Q2 = Q3 − Q1     1
ESS due to both X2 and X3             Q3               2
RSS                                   Q4 = Q5 − Q3   n − 3
Total                                 Q5 = TSS       n − 1

Example

▰ Rerun the model with both variables:

Source SS df MS Number of obs = 64


F(2, 61) = 73.83
Model 257362.373 2 128681.187 Prob > F = 0.0000
Residual 106315.627 61 1742.87913 R-squared = 0.7077
Adj R-squared = 0.6981
Total 363678 63 5772.66667 Root MSE = 41.748

cm Coef. Std. Err. t P>|t| [95% Conf. Interval]

pgnp     -.0056466   .0020033    -2.82   0.006   -.0096524   -.0016408
flr      -2.231586   .2099472   -10.63   0.000   -2.651401    -1.81177
_cons     263.6416   11.59318    22.74   0.000    240.4596    286.8236

Example

▰ In the example, ESS_new = 257362.37 (PGNP and FLR), ESS_old = 60449.46 (PGNP only), RSS_new = 106315.63, n = 64, and one new regressor (FLR) was added.

Example

▰ We get: F = ((257362.37 − 60449.46)/1) / (106315.63/61) ≈ 112.98.

“Marginal” Contribution of an Explanatory Variable

▰ Alternatively, the same test can be written in terms of R²:

F = ((R²_new − R²_old)/number of new regressors) / ((1 − R²_new)/(n − number of parameters in the new model))

Example

▰ Thus, we get F = ((0.7077 − 0.1662)/1) / ((1 − 0.7077)/61) ≈ 113.0,
▰ which is almost the same as above.
▰ If we use the R² version of the F test, we need to ensure that the dependent variable in the new and the old models is the same.
▰ If the dependent variables differ, use the F test based on residual sums of squares given previously.
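▰ Both versions of the incremental F can be checked with a short Stata sketch (all numbers are taken from the two outputs above):

* Sketch: incremental F for adding flr, in ESS form and in R² form.
display ((257362.373 - 60449.4605)/1) / (106315.627/61)   // about 112.98
display ((.7077 - .1662)/1) / ((1 - .7077)/61)            // about 113.0
display Ftail(1, 61, 112.98)                              // p value, essentially zero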

When to Add a New Variable

▰ If the inclusion of a variable increases the adjusted R² (R̄²), it is retained in the model.
▰ When does R̄² increase?
▰ R̄² will increase if the t value of the coefficient of the newly added variable is larger than 1 in absolute value.
▰ Equivalently, R̄² will increase with the addition of an extra explanatory variable only if its F (= t²) value exceeds 1.
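▰ A Stata sketch confirms the rule for the child mortality example, using R̄² = 1 − (1 − R²)(n − 1)/(n − k); adding FLR (|t| = 10.63 > 1) raises the adjusted R²:

* Sketch: adjusted R² before and after adding flr.
display 1 - (1 - .1662)*63/62   // PGNP only: about .1528
display 1 - (1 - .7077)*63/61   // PGNP and FLR: about .6981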

When to Add a Group of Variables

▰ If adding (dropping) a group of variables to the model gives an F value greater (less) than 1, R̄² will increase (decrease).

Testing the Equality of Two Regression Coefficients

▰ Consider the four-variable regression model

Yi = β1 + β2X2i + β3X3i + β4X4i + ui

▰ Test the hypotheses H0: β3 = β4 (equivalently, β3 − β4 = 0) against H1: β3 ≠ β4.

Testing the Equality of Two Regression Coefficients

▰ How do we test such a null hypothesis?
▰ Under the classical assumptions, it can be shown that

t = (β̂3 − β̂4) / se(β̂3 − β̂4)

▰ follows the t distribution with (n − 4) df, because this is a four-variable model, or more generally with (n − k) df.

Testing the Equality of Two Regression Coefficients

▰ The standard error can be computed as

se(β̂3 − β̂4) = √(var(β̂3) + var(β̂4) − 2 cov(β̂3, β̂4))

The Cubic Cost Function

▰ Consider the cubic total cost function

Yi = β1 + β2Xi + β3Xi² + β4Xi³ + ui

where Y is total cost and X is output.

The Cubic Cost Function Revisited

▰ Run the regression using the Stata command regress y x xsq xcb:

Source SS df MS Number of obs = 10


F(3, 6) = 1202.22
Model 38918.1562 3 12972.7187 Prob > F = 0.0000
Residual 64.7438228 6 10.7906371 R-squared = 0.9983
Adj R-squared = 0.9975
Total 38982.9 9 4331.43333 Root MSE = 3.2849

y Coef. Std. Err. t P>|t| [95% Conf. Interval]

x         63.47766   4.778607    13.28   0.000    51.78483    75.17049
xsq      -12.96154   .9856646   -13.15   0.000   -15.37337    -10.5497
xcb       .9395882   .0591056    15.90   0.000     .794962    1.084214
_cons     141.7667   6.375322    22.24   0.000    126.1668    157.3665

The Cubic Cost Function Revisited

▰ The estimated cost function, where Y is total cost and X is output, is

Ŷ = 141.7667 + 63.4777X − 12.9615X² + 0.9396X³

Example

▰ We want to test the hypothesis that the coefficients of the X² and X³ terms in the cubic cost function are the same, that is, H0: β3 = β4.
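▰ In Stata this hypothesis can be tested directly with the built-in test command after the cubic regression; a sketch (it assumes the cost data are already in memory, with the variable names shown in the output above):

* Sketch: Wald test that the xsq and xcb coefficients are equal.
regress y x xsq xcb   // the cubic cost regression
test xsq = xcb        // F(1, 6) test of H0: beta3 = beta4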

Restricted Least Squares: Testing Linear Equality Restrictions

▰ Consider the Cobb–Douglas production function:

Yi = β1 X2i^β2 X3i^β3 e^ui

▰ In log form, the equation becomes

ln Yi = β0 + β2 ln X2i + β3 ln X3i + ui,  where β0 = ln β1

Cobb–Douglas Production Function

▰ Suppose we wish to test the constant returns to scale hypothesis H0: β2 + β3 = 1.
▰ This is a case of a linear equality restriction.
▰ There are two approaches to testing it.

The t-Test Approach

▰ Estimate the model in the usual manner without taking the restriction into account explicitly.
▰ This is called the unrestricted or unconstrained regression.
▰ Then apply the t test:

t = (β̂2 + β̂3 − 1) / se(β̂2 + β̂3)

The F-Test Approach: Restricted Least Squares

▰ A direct approach is to build the restriction into the estimation: substituting β2 = 1 − β3 gives the restricted regression

ln(Yi/X2i) = β0 + β3 ln(X3i/X2i) + ui

▰ Then compare the restricted and unrestricted residual sums of squares:

F = ((RSS_R − RSS_UR)/m) / (RSS_UR/(n − k))

where m is the number of linear restrictions and k is the number of parameters in the unrestricted regression.

EXAMPLE : Cobb–Douglas Production Function

▰ Use the data for the Mexican economy for 1955–1974.



EXAMPLE : Cobb–Douglas Production Function

▰ Now run the unrestricted regression using the command regress lgdp llabor lcapital:

Source SS df MS Number of obs = 20


F(2, 17) = 1719.20
Model 2.75165006 2 1.37582503 Prob > F = 0.0000
Residual .01360456 17 .000800268 R-squared = 0.9951
Adj R-squared = 0.9945
Total 2.76525462 19 .145539717 Root MSE = .02829

lgdp Coef. Std. Err. t P>|t| [95% Conf. Interval]

llabor      .3397362   .1856928    1.83   0.085   -.0520414    .7315138
lcapital    .8459951   .093352     9.06   0.000    .6490397    1.042951
_cons      -1.652429   .6062017   -2.73   0.014   -2.931402   -.3734547

EXAMPLE : Cobb–Douglas Production Function

▰ Regression output: predicted ln(GDP) = −1.6524 + 0.3397 ln(Labor) + 0.8460 ln(Capital), R² = 0.9951.
▰ Note that β̂2 + β̂3 = 1.1857, which on its face suggests increasing returns to scale; the question is whether the excess over 1 is statistically significant.
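▰ The t-test approach can be carried out with Stata's built-in lincom command; a sketch (it assumes the Mexican data are already in memory, with the variable names shown in the output above):

* Sketch: t test of constant returns to scale after the unrestricted regression.
regress lgdp llabor lcapital
lincom llabor + lcapital - 1   // estimate of beta2 + beta3 - 1 with its t test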

EXAMPLE : Cobb–Douglas Production Function

▰ To test constant returns to scale, H0: β2 + β3 = 1,
▰ impose the restriction and estimate the restricted regression ln(GDPi/Labori) = β0 + β3 ln(Capitali/Labori) + ui.

EXAMPLE : Cobb–Douglas Production Function

▰ Run the restricted regression using the command regress lgdplab lcaplab:

Source SS df MS Number of obs = 20


F(1, 18) = 789.93
Model .729755249 1 .729755249 Prob > F = 0.0000
Residual .016628834 18 .000923824 R-squared = 0.9777
Adj R-squared = 0.9765
Total .746384083 19 .039283373 Root MSE = .03039

lgdplab Coef. Std. Err. t P>|t| [95% Conf. Interval]

lcaplab    1.015301   .0361244   28.11   0.000    .939406    1.091195
_cons     -.4947181   .1218164   -4.06   0.001  -.7506448   -.2387914

Example

▰ With RSS_R = .016628834 (df = 18) and RSS_UR = .01360456 (df = 17), the restricted-least-squares F is

F = ((.016628834 − .01360456)/1) / (.01360456/17) ≈ 3.78

EXAMPLE : Cobb–Douglas Production Function

▰ The relevant values are:

teststatistic = 3.7790767
crit          = 4.4513218
pvalue        = .06863821

▰ Since the test statistic (3.78) does not exceed the 5 percent critical value (4.45), and the p value is about 0.069, we do not reject constant returns to scale at the 5 percent level.
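▰ These three values can be reproduced in Stata from the two residual sums of squares (a sketch; Ftail and invFtail are built-in functions):

* Sketch: restricted-least-squares F test of constant returns to scale.
display ((.016628834 - .01360456)/1) / (.01360456/17)   // test statistic = 3.779
display invFtail(1, 17, .05)                            // 5% critical value = 4.451
display Ftail(1, 17, 3.7790767)                         // p value = .0686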

Testing for Structural or Parameter Stability of Regression Models: The Chow Test

▰ In time series analysis we may encounter structural change, that is, the values of the parameters of the model may not remain the same over the entire time period.
▰ We run three regressions:
▰ First subperiod model
▰ Second subperiod model
▰ Complete time period model

The Chow Test

▰ Chow test assumptions:
▰ 1. u1t ∼ N(0, σ²) and u2t ∼ N(0, σ²). That is, the error terms in the subperiod regressions are normally distributed with the same (homoscedastic) variance σ².
▰ 2. The two error terms u1t and u2t are independently distributed.

Mechanics of the Chow test

▰ Estimate the full-period regression, which is appropriate if there is no parameter instability, and obtain RSS3 with df = (n1 + n2 − k).
▰ RSS3 is called the restricted residual sum of squares (RSSR) because it is obtained by imposing the restrictions that λ1 = γ1 and λ2 = γ2, that is, that the subperiod regressions are not different.
▰ Estimate the first subperiod regression and obtain RSS1, with df = (n1 − k).
▰ Estimate the second subperiod regression and obtain RSS2, with df = (n2 − k).

Chow test

▰ Because of the independence assumption, we can add RSS1 and RSS2 to obtain the unrestricted residual sum of squares (RSSUR), that is,

RSSUR = RSS1 + RSS2,  with df = (n1 + n2 − 2k)

Chow test

▰ The idea behind the Chow test is that if in fact there is no structural change (i.e., both regressions are essentially the same), then RSSR and RSSUR should not be statistically different. Therefore, we form the ratio

F = ((RSSR − RSSUR)/k) / (RSSUR/(n1 + n2 − 2k)) ∼ F(k, n1 + n2 − 2k)

▰ We do not reject the null hypothesis of parameter stability (i.e., no structural change) if the computed F value does not exceed the critical F value from the F table at the chosen level of significance (or if the p value is above that level).

Example: Disposable personal income and personal savings

▰ Run the regression of savings on income for 1970–1981:

Source SS df MS Number of obs = 12


F(1, 10) = 92.19
Model 16456.2587 1 16456.2587 Prob > F = 0.0000
Residual 1785.03254 10 178.503254 R-squared = 0.9021
Adj R-squared = 0.8924
Total 18241.2912 11 1658.2992 Root MSE = 13.361

savings Coef. Std. Err. t P>|t| [95% Conf. Interval]

income     .0803319   .0083665    9.60   0.000    .0616901    .0989737
_cons      1.016115   11.63771    0.09   0.932   -24.91432    26.94655

Example: Disposable personal income and personal savings

▰ Regression output: predicted savings = 1.0161 + 0.0803 income (1970–1981).

Example: Disposable personal income and personal savings

▰ Run the regression of savings on income for 1982–1995:

Source SS df MS Number of obs = 14


F(1, 12) = 3.14
Model 2614.39647 1 2614.39647 Prob > F = 0.1020
Residual 10005.2214 12 833.768451 R-squared = 0.2072
Adj R-squared = 0.1411
Total 12619.6179 13 970.739837 Root MSE = 28.875

savings Coef. Std. Err. t P>|t| [95% Conf. Interval]

income     .0148624   .0083932    1.77   0.102   -.0034248    .0331496
_cons      153.4947   32.71227    4.69   0.001    82.22075    224.7686

Example: Disposable personal income and personal savings

▰ Regression output: predicted savings = 153.4947 + 0.0149 income (1982–1995). Note that the income coefficient is not significant at the 5 percent level in this subperiod (p = 0.102).

Example: Disposable personal income and personal savings

▰ Run the regression of savings on income for the full period 1970–1995:

Source SS df MS Number of obs = 26


F(1, 24) = 79.10
Model 76621.7867 1 76621.7867 Prob > F = 0.0000
Residual 23248.3 24 968.679166 R-squared = 0.7672
Adj R-squared = 0.7575
Total 99870.0867 25 3994.80347 Root MSE = 31.124

savings Coef. Std. Err. t P>|t| [95% Conf. Interval]

income     .0376791   .0042366    8.89   0.000    .0289353     .046423
_cons      62.42267   12.76075    4.89   0.000    36.08578    88.75957

Example: Disposable personal income and personal savings

▰ Regression output: predicted savings = 62.4227 + 0.0377 income (1970–1995).

Example: Disposable personal income and personal savings

▰ We get RSSR = 23248.30 (df = 24), RSS1 = 1785.03, and RSS2 = 10005.22, so RSSUR = RSS1 + RSS2 = 11790.25 (df = 22). The Chow statistic is

F = ((23248.30 − 11790.25)/2) / (11790.25/22) ≈ 10.69
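▰ A Stata sketch reproduces the Chow statistic and the critical value from the three residual sums of squares reported above:

* Sketch: Chow test for the savings-income regression, k = 2.
display ((23248.3 - (1785.03254 + 10005.2214))/2) / ((1785.03254 + 10005.2214)/22)   // F = 10.69
display invFtail(2, 22, .01)   // 1% critical value = 5.72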

Example: Disposable personal income and personal savings

▰ From the F tables, for 2 and 22 df the 1 percent critical F value is 5.72.
▰ Since the computed F value of about 10.69 exceeds this critical value, the Chow test supports the view that the savings–income relation underwent a structural change in the United States over the period 1970–1995, assuming that the assumptions underlying the test are fulfilled.
