
Chapter 3:
Multiple Linear Regression
3.1. Method of Ordinary Least Squares Revised
3.1.1 Generalising the Simple Model to Multiple Linear Regression
• Previously, we used the model

  yt = α + βxt + ut ,   t = 1, 2, …, T
• But what if our dependent (y) variable depends on more than one independent variable?
For example the number of cars sold might plausibly depend on
1. the price of cars
2. the price of public transport
3. the price of petrol
4. the extent of the public’s concern about global warming
• Similarly, stock returns might depend on several factors.
• Having just one independent variable is not enough in this case; we want more than one x variable.
Cont'd
• Now, with more than one predictor, we write:

  yt = β1 + β2 x2t + β3 x3t + … + βk xkt + ut ,   t = 1, 2, …, T

• β1, β2, …, βk are called the partial regression coefficients.

• Where is x1? It is the constant term: x1 is implicitly equal to 1 for every observation, so β1 is the intercept.

• Each regression coefficient is the expected change in the outcome variable per one-unit change in the corresponding predictor, if all other variables in the model are held constant.
Parameter Interpretation

Controlling for the other predictors in the model, there is a linear relationship between E(y) and x1 with slope β1.

i.e., consider the case of k = 2 explanatory variables:

  E(y) = α + β1x1 + β2x2

If x1 goes up 1 unit with x2 held constant, the change in E(y) is

  [α + β1(x1 + 1) + β2x2] − [α + β1x1 + β2x2] = β1.
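As a quick numeric check of this identity, here is a minimal Python sketch; the coefficients (α = 2, β1 = 3, β2 = 5) are made up purely for illustration:

```python
# Illustrative coefficients only: E(y) = 2 + 3*x1 + 5*x2
alpha, beta1, beta2 = 2.0, 3.0, 5.0

def expected_y(x1, x2):
    return alpha + beta1 * x1 + beta2 * x2

# Raise x1 by one unit while holding x2 fixed: the change in E(y) equals beta1.
x1, x2 = 4.0, 7.0
print(expected_y(x1 + 1, x2) - expected_y(x1, x2))  # 3.0, i.e. beta1
```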
Simple vs. Multiple Regression

Simple regression:
• One dependent variable Y predicted from one independent variable X
• One regression coefficient
• r²: proportion of variation in dependent variable Y predictable from X

Multiple regression:
• One dependent variable Y predicted from a set of independent variables (X1, X2, …, Xk)
• One regression coefficient for each independent variable
• R²: proportion of variation in dependent variable Y predictable by the set of independent variables (X's)
Assumptions
• Assumption 1. The regression model is linear in the parameters.
• Assumption 2. The values of the regressors, the X’s, are fixed in
repeated sampling.
• Assumption 3. For given X’s, the mean value of the disturbance ui is
zero.
• Assumption 4. For given X’s, the variance of ui is constant or
homoscedastic.
• Assumption 5. For given X’s, there is no autocorrelation in the
disturbances.
Cont'd
• Assumption 6. If the X’s are stochastic, the disturbance term and the
(stochastic) X’s are independent or at least uncorrelated.
• Assumption 7. The number of observations must be greater than the
number of regressors.
• Assumption 8. There must be sufficient variability in the values taken
by the regressors.
• Assumption 9. The regression model is correctly specified.
• Assumption 10. There is no exact linear relationship (i.e., no perfect multicollinearity) among the regressors.
…none of the regressors can be written as exact linear combinations of the
remaining regressors in the model.
• Assumption 11. The stochastic (disturbance) term ui is normally
distributed.
The Multiple Regression Model

Idea: examine the linear relationship between one dependent variable (Y) and two or more independent variables (Xi).

Multiple regression model with k independent variables:

  Yi = β0 + β1 X1i + β2 X2i + … + βk Xki + εi

where β0 is the Y-intercept, β1, …, βk are the population slopes, and εi is the random error.
Cont'd

The coefficients of the multiple regression model are estimated using sample data.

Multiple regression equation with k independent variables:

  Ŷi = b0 + b1 X1i + b2 X2i + … + bk Xki

where Ŷi is the estimated (or predicted) value of Y, b0 is the estimated intercept, and b1, …, bk are the estimated partial slope coefficients.
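As an illustration of how these coefficients can be obtained in practice, here is a minimal Python sketch using statsmodels; the data values below are made up and only stand in for an actual sample:

```python
# Minimal OLS sketch with statsmodels; the numbers below are illustrative, not real data.
import numpy as np
import statsmodels.api as sm

price       = np.array([5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 5.0, 6.2])
advertising = np.array([3.3, 3.5, 3.0, 4.5, 4.0, 3.8, 3.1, 4.2])
sales       = np.array([350, 330, 310, 400, 360, 340, 370, 390])

X = sm.add_constant(np.column_stack([price, advertising]))  # prepend the column of ones for b0
results = sm.OLS(sales, X).fit()
print(results.params)     # b0, b1, b2
print(results.summary())  # R^2, adjusted R^2, F test, t tests, confidence intervals
```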
Example: 2 Independent Variables
• A distributor of pies wants to evaluate factors thought to influence demand.

• Dependent variable: pie sales (units per week)

• Independent variables: Price (in birr)
                         Advertising (in birr 100s)
The Multiple Regression Equation

  Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

where
  Sales is in number of pies per week
  Price is in birr
  Advertising is in birr 100s.

b1 = −24.975: sales will decrease, on average, by 24.975 pies per week for each 1-birr increase in selling price, net of the effects of changes due to advertising.

b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each 100-birr increase in advertising, net of the effects of changes due to price.
3.3. Multiple Correlation Coefficient (R) and Coefficient
of Multiple Determination (R2)

• R = the magnitude of the relationship between the dependent variable and the best linear combination of the predictor variables.

• R² = the proportion of variation in Y accounted for by the set of independent variables (X's).
Properties of R and R2

• 0 ≤ R² ≤ 1
• R = √R², so 0 ≤ R ≤ 1 (i.e., it can't be negative)
• The larger their values, the better the set of explanatory variables predicts y
• R² = 1 when observed y = predicted y, so SSE (the residual sum of squares) = 0
• R² = 0 when every predicted y equals ȳ, so TSS = SSE.
  When this happens, b1 = b2 = … = bk = 0 and the correlation r = 0 between y and each x predictor.
• R² cannot decrease when predictors are added to the model
• With a single predictor, R² = r² and R = |r|
Cont'd
• We would like some measure of how well our regression model actually fits the data.
• We have goodness of fit statistics to test this, i.e. how well the sample regression function (SRF) fits the data.
• The most common goodness of fit statistic is known as R². One way to define R² is to say that it is the square of the correlation coefficient between y and ŷ, the fitted values.
• For another explanation, recall that what we are interested in doing is explaining the variability of y about its mean value, ȳ, i.e. the total sum of squares, TSS:

  TSS = Σt (yt − ȳ)²

• We can split the TSS into two parts: the part which we have explained (known as the explained sum of squares, ESS) and the part which we did not explain using the model (the residual sum of squares, RSS).
Recall how to define R²
• That is, TSS = ESS + RSS:

  Σt (yt − ȳ)² = Σt (ŷt − ȳ)² + Σt ût²

• Our goodness of fit statistic is

  R² = ESS / TSS

• But since TSS = ESS + RSS, we can also write

  R² = ESS / TSS = (TSS − RSS) / TSS = 1 − RSS / TSS

• R² must always lie between zero and one. To understand this, consider two extremes:
  RSS = TSS, i.e. ESS = 0, so R² = ESS/TSS = 0
  ESS = TSS, i.e. RSS = 0, so R² = ESS/TSS = 1
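The decomposition above translates directly into code. A minimal Python sketch (with made-up numbers) that computes R² both as ESS/TSS and as 1 − RSS/TSS:

```python
import numpy as np

def r_squared(y, y_hat):
    """Return (ESS/TSS, 1 - RSS/TSS); the two agree when y_hat comes from OLS with an intercept."""
    tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
    rss = np.sum((y - y_hat) ** 2)         # residual sum of squares
    ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
    return ess / tss, 1 - rss / tss

# Illustrative data: fit a one-regressor OLS line so the fitted values are genuine OLS fits.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
slope, intercept = np.polyfit(x, y, 1)
print(r_squared(y, intercept + slope * x))
```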
Example: Multiple Coefficient of Determination

  r² = SSR / SST = 29460.0 / 56493.3 = 0.52148

52.1% of the variation in pie sales is explained by the variation in price and advertising.

Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

ANOVA          df    SS          MS          F         Significance F
  Regression    2    29460.027   14730.013   6.53861   0.01201
  Residual     12    27033.306    2252.776
  Total        14    56493.333

               Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
  Intercept     306.52619     114.25389         2.68285   0.01993    57.58835   555.46404
  Price         -24.97509      10.83213        -2.30565   0.03979   -48.57626    -1.37392
  Advertising    74.13096      25.96732         2.85478   0.01449    17.55303   130.70888
Problems with R² as a Goodness of Fit Measure
• There are a number of them:

1. R² is defined in terms of variation about the mean of y, so that if a model is reparameterised (rearranged) and the dependent variable changes, R² will change.

2. R² never falls if more regressors are added to the regression. For example, consider:
   Regression 1: yt = β1 + β2 x2t + β3 x3t + ut
   Regression 2: yt = β1 + β2 x2t + β3 x3t + β4 x4t + ut
   R² will always be at least as high for regression 2 relative to regression 1.

3. R² quite often takes on values of 0.9 or higher for time series regressions.
Adjusted R2
• In order to get around these problems, a modification is often made which takes into account the loss of degrees of freedom associated with adding extra variables. This is known as R̄², or adjusted R²:

  R̄² = 1 − [(T − 1) / (T − k)] (1 − R²)

• So if we add an extra regressor, k increases and, unless R² increases by a more than offsetting amount, R̄² will actually fall.
Adjusted R2
(continued)
• Shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used:

  R²adj = 1 − (1 − R²) (n − 1) / (n − k − 1) = 1 − [SSE / (n − k − 1)] / [SST / (n − 1)]

  (where n = sample size, k = number of independent variables, and SSE is the error (residual) sum of squares)

• Penalizes excessive use of unimportant independent variables
• Smaller than R²
• Useful for comparing models
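A minimal Python sketch of the adjustment, using the R², n and k reported in the pie-sales output below:

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Pie-sales output: R^2 = 0.52148 with n = 15 observations and k = 2 regressors.
print(adjusted_r_squared(0.52148, 15, 2))  # ≈ 0.4417, matching the reported 0.44172 up to rounding
```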
Adjusted R²

  R²adj = 0.44172

44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and the number of independent variables.

Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

(ANOVA and coefficient tables as shown earlier.)
Is the Model Significant?
• F Test for Overall Significance of the Model
• Shows if there is a linear relationship between all of the X variables
considered together and Y
• Use F-test statistic
• Hypotheses:
  H0: β1 = β2 = … = βk = 0 (no linear relationship)
  H1: at least one βi ≠ 0 (at least one independent variable affects Y)
F Test for Overall Significance
• Test statistic:

  FSTAT = MSR / MSE = [SSR / k] / [SSE / (n − k − 1)]

  where FSTAT has numerator d.f. = k and denominator d.f. = (n − k − 1).
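A minimal Python (scipy) sketch of this computation, using the sums of squares from the pie-sales ANOVA table shown on the next slide:

```python
from scipy import stats

# From the pie-sales ANOVA table: SSR = 29460.027, SSE = 27033.306, n = 15, k = 2.
ssr, sse, n, k = 29460.027, 27033.306, 15, 2
msr = ssr / k                                # mean square due to regression
mse = sse / (n - k - 1)                      # mean square error
f_stat = msr / mse
p_value = stats.f.sf(f_stat, k, n - k - 1)   # upper-tail area of the F(k, n-k-1) distribution
print(f_stat, p_value)                       # ≈ 6.5386 and ≈ 0.0120 (the "Significance F")
```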
F Test for Overall Significance
(continued)
  FSTAT = MSR / MSE = 14730.0 / 2252.8 = 6.5386

FSTAT has 2 and 12 degrees of freedom; the p-value for the F test is the Significance F value, 0.01201.

ANOVA          df    SS          MS          F         Significance F
  Regression    2    29460.027   14730.013   6.53861   0.01201
  Residual     12    27033.306    2252.776
  Total        14    56493.333

(Regression statistics and coefficient tables as shown earlier.)
F Test for Overall Significance
(continued)

H0: β1 = β2 = 0
H1: β1 and β2 not both zero
α = .05, df1 = 2, df2 = 12

Critical value: F0.05 = 3.885

Test statistic: FSTAT = MSR / MSE = 6.5386

[Figure: F distribution with the α = .05 rejection region to the right of the critical value F0.05 = 3.885]

Decision: since the FSTAT test statistic falls in the rejection region (p-value < .05), reject H0.

Conclusion: there is evidence that at least one independent variable affects Y.
Are Individual Variables Significant?

• Use t tests of the individual variable slopes.

• Shows if there is a linear relationship between the variable Xj and Y, holding constant the effects of the other X variables.

• Hypotheses:
  H0: βj = 0 (no linear relationship)
  H1: βj ≠ 0 (a linear relationship does exist between Xj and Y)
Are Individual Variables Significant?
(continued)

H0: βj = 0 (no linear relationship)
H1: βj ≠ 0 (a linear relationship does exist between Xj and Y)

Test statistic:

  tSTAT = (bj − 0) / Sbj   (df = n − k − 1)

where bj and Sbj are the coefficient and standard error of the parameter estimate.
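A minimal Python (scipy) sketch of one such t test, using the Price coefficient and standard error from the output on the next slide:

```python
from scipy import stats

# Price coefficient from the pie-sales output: b1 = -24.97509, S_b1 = 10.83213, df = 15 - 2 - 1.
b_j, s_bj, df = -24.97509, 10.83213, 12
t_stat = (b_j - 0) / s_bj
p_value = 2 * stats.t.sf(abs(t_stat), df)    # two-sided p-value
print(t_stat, p_value)                       # ≈ -2.306 and ≈ 0.0398
```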
Are Individual Variables Significant? (continued)

t Stat for Price is tSTAT = −2.306, with p-value .0398
t Stat for Advertising is tSTAT = 2.855, with p-value .0145

               Coefficients   Standard Error   t Stat     P-value
  Intercept     306.52619     114.25389         2.68285   0.01993
  Price         -24.97509      10.83213        -2.30565   0.03979
  Advertising    74.13096      25.96732         2.85478   0.01449

(Regression statistics and ANOVA table as shown earlier.)
Inferences about the Slope:
t Test Example
H0: βj = 0
H1: βj ≠ 0

For Price, tSTAT = −2.306, with p-value .0398
For Advertising, tSTAT = 2.855, with p-value .0145

d.f. = 15 − 2 − 1 = 12, α = .05, tα/2 = 2.1788

[Figure: t distribution with two-tailed rejection regions (α/2 = .025 in each tail) beyond ±2.1788]

Decision: the test statistic for each variable falls in the rejection region (p-values < .05), so reject H0 for each variable.

Conclusion: there is evidence that both Price and Advertising affect pie sales at α = .05.
Confidence Interval Estimate
for the Slope
Confidence interval for the population slope βj:

  bj ± tα/2 · Sbj ,  where t has (n − k − 1) d.f.

               Coefficients   Standard Error
  Intercept     306.52619     114.25389
  Price         -24.97509      10.83213
  Advertising    74.13096      25.96732

Here, t has (15 − 2 − 1) = 12 d.f.

Example: form a 95% confidence interval for the effect of changes in price (X1) on pie sales:

  −24.975 ± (2.1788)(10.832)

So the interval is (−48.576, −1.374).
(This interval does not contain zero, so price has a significant effect on sales.)
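The same interval can be reproduced with a minimal Python (scipy) sketch, using the Price coefficient and standard error from the output:

```python
from scipy import stats

# 95% CI for the Price slope: b1 ± t_{0.025, 12} * S_b1.
b1, s_b1, df = -24.97509, 10.83213, 15 - 2 - 1
t_crit = stats.t.ppf(0.975, df)                  # ≈ 2.1788
print(b1 - t_crit * s_b1, b1 + t_crit * s_b1)    # ≈ (-48.576, -1.374)
```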
Confidence Interval Estimate
for the Slope
(continued)
Confidence interval for the population slope βj:

               Coefficients   Standard Error   …   Lower 95%    Upper 95%
  Intercept     306.52619     114.25389        …    57.58835    555.46404
  Price         -24.97509      10.83213        …   -48.57626     -1.37392
  Advertising    74.13096      25.96732        …    17.55303    130.70888

Example: this output also reports these interval endpoints. Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each 1-birr increase in the selling price, holding the effect of advertising constant.
Using The Equation to Make
Predictions
Predict sales for a week in which the selling price is birr 5.50 and advertising is birr 350:

  Sales = 306.526 − 24.975(Price) + 74.131(Advertising)
        = 306.526 − 24.975(5.50) + 74.131(3.5)
        = 428.62

Note that Advertising is in birr 100s, so birr 350 means that X2 = 3.5.

Predicted sales is 428.62 pies.
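The same prediction as a minimal Python sketch (remembering that Advertising enters in units of birr 100):

```python
# Fitted equation: Sales = 306.526 - 24.975*Price + 74.131*Advertising (Advertising in birr 100s).
b0, b1, b2 = 306.526, -24.975, 74.131
price, advertising = 5.50, 3.5            # birr 5.50 and birr 350 -> X2 = 3.5
sales_hat = b0 + b1 * price + b2 * advertising
print(round(sales_hat, 2))                # ≈ 428.62 pies per week
```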
