Multiple Regression Slides Mod-Ed

Learning Objectives for the Multiple Regression Model

We will discuss:
◼ How to develop a multiple regression model
◼ How to interpret the regression coefficients
◼ How to determine which independent variables to include in the regression model
◼ How to determine which independent variables are most important in predicting a dependent variable

Three-Category Approach to Quantitative Methods

◼ Quantitative methods have three distinctive categories or functions:
◼ Descriptive statistics
◼ Analysis of differences in the same variables measured (or observed) under different conditions: t-test, Z-test, ANOVA, MANOVA
◼ Analysis of relationships between/among variables: regression models (cross-sectional, time-series, and panel data). As one US professor put it, regression is the king of econometrics!
◼ Recommended texts: Wooldridge, Jeffrey M., Introductory Econometrics, 7th edition; Gujarati, Damodar N., Basic Econometrics, 5th edition
◼ Key figures: Ragnar Frisch (the father of econometrics), Jan Tinbergen, Simon Kuznets (GDP), and Jacob Marschak (born in Ukraine)

Regression Models

◼ A regression model can be applied to estimate the parameters of many relationships in economics, business, and the social sciences. The term regression was coined by Francis Galton (1886).
◼ Regression toward the mean (or regression to the mean): "on average, move towards the middle."
◼ Any time you ask how much a change in one variable will affect another variable, regression analysis is a potential tool.
◼ Similarly, any time you wish to predict the value of one variable given the value of another, least squares regression is a tool to consider.

The Multiple Regression Model

Idea: examine the linear relationship between one dependent variable (Y) and two or more independent variables (Xi).

Multiple regression model with k independent variables:

Yi = β0 + β1X1i + β2X2i + ⋅⋅⋅ + βkXki + εi

where β0 is the Y-intercept, β1, …, βk are the population slopes, and εi is the random error.
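For concreteness, here is a minimal Python sketch (not from the slides) that simulates data from such a population model with two independent variables; the parameter values and the error standard deviation are hypothetical.

```python
# Simulate Y_i = beta0 + beta1*X1_i + beta2*X2_i + eps_i for a sample of n = 100.
import numpy as np

rng = np.random.default_rng(0)
n = 100
beta0, beta1, beta2 = 2.0, 0.5, -1.5    # hypothetical population parameters
X1 = rng.uniform(0, 10, n)              # independent variables
X2 = rng.uniform(0, 10, n)
eps = rng.normal(0, 1.0, n)             # random error term
Y = beta0 + beta1 * X1 + beta2 * X2 + eps
```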

Introduction (2 of 2)

◼ The variable to be predicted is called the dependent variable or response variable.
◼ Its value depends on the value of the independent variable(s).
◼ The independent variables are also called explanatory or predictor variables.

Multiple Regression Equation

The coefficients of the multiple regression model are estimated using sample data.

Multiple regression equation with k independent variables:

Ŷi = b0 + b1X1i + b2X2i + ⋅⋅⋅ + bkXki

where Ŷi is the estimated (or predicted) value of Y, b0 is the estimated intercept, and b1, …, bk are the estimated slope coefficients.

We will use software to obtain the regression slope coefficients and other regression summary measures.
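As a minimal sketch of what that software does, the Python below obtains b0, b1, b2 by ordinary least squares on the simulated data from the previous sketch (the data-generating values are the same hypothetical ones as above):

```python
import numpy as np

# Re-create the simulated sample used in the earlier sketch.
rng = np.random.default_rng(0)
X1, X2 = rng.uniform(0, 10, 100), rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X1 - 1.5 * X2 + rng.normal(0, 1.0, 100)

X = np.column_stack([np.ones(100), X1, X2])   # a column of ones gives the intercept b0
b, *_ = np.linalg.lstsq(X, Y, rcond=None)     # b = [b0, b1, b2], the estimated coefficients
Y_hat = X @ b                                 # estimated (predicted) values of Y
print("b0, b1, b2 =", b)
```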

Multiple Regression Equation (continued)

Two-variable model (a plane in the space of Y, X1, and X2):

Ŷ = b0 + b1X1 + b2X2

Line of best fit

[Figure: the fitted regression plane (line of best fit) through the sample data.]

Estimating β1 and β2 manually is very tedious work

◼ Solving the normal equations simultaneously, we obtain the least squares estimators

β̂2 = Σxiyi / Σxi²    and    β̂1 = Ȳ − β̂2X̄

Estimating β1 and β2 manually is very tedious work (continued)

◼ where X̄ and Ȳ are the sample means of X and Y, and where we define xi = (Xi − X̄) and yi = (Yi − Ȳ).
◼ Henceforth we adopt the convention of letting the lowercase letters denote deviations from mean values.
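As a minimal sketch of these deviation-form formulas, the Python below computes β̂1 and β̂2; the X and Y arrays are assumed to be the income–consumption sample from Gujarati's textbook example, which reproduces the estimates reported on the following slides (β̂1 = 24.4545, β̂2 = 0.5091):

```python
import numpy as np

# Assumed sample data (Gujarati's example): X = weekly income, Y = weekly consumption.
X = np.array([80., 100., 120., 140., 160., 180., 200., 220., 240., 260.])
Y = np.array([70., 65., 90., 95., 110., 115., 120., 140., 155., 150.])

x = X - X.mean()                      # lowercase letters: deviations from the means
y = Y - Y.mean()
b2 = np.sum(x * y) / np.sum(x * x)    # slope estimate, beta2-hat = 0.5091
b1 = Y.mean() - b2 * X.mean()         # intercept estimate, beta1-hat = 24.4545
```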

Sample Data for Estimating β1 and β2

[Table of sample X and Y values shown on the slide; see the accompanying Excel file.]


Regression Model Equation

Show the Excel file.
• β̂1 = 24.4545, se(β̂1) = 6.4138
• β̂2 = 0.5091, se(β̂2) = 0.0357
• The estimated regression line therefore is
• Ŷi = 24.4545 + 0.5091Xi
• which is shown graphically on the following slides.

[Figure: scatter plot of the sample data with the fitted regression line.]

The Economic Model

◼ Let's set up an economic model in which sales revenue depends on one or more explanatory variables.
◼ We initially hypothesize that sales revenue is linearly related to price and advertising expenditure.
◼ The economic model is:

SALES = β1 + β2PRICE + β3ADVERT

and the corresponding econometric model, with subscript i for the i-th observation and random error ei, is

yi = β1 + β2xi2 + β3xi3 + ei

Data: Sales_Example.xls. Stata commands for exploring the data

◼ sum Y X, d
◼ graph twoway (lfit Y X) (scatter Y X)
◼ corr Y X
◼ reg Y X1 X2
◼ predict yhat
◼ predict rvar, res
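For readers not using Stata, a rough pandas/statsmodels equivalent of the same workflow is sketched below; the file name comes from the slide, and the column names Y, X1, X2 are assumed to match those used in the Stata commands.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_excel("Sales_Example.xls")        # load the sample data

print(df[["Y", "X1", "X2"]].describe())        # roughly: sum Y X, d
print(df[["Y", "X1", "X2"]].corr())            # roughly: corr Y X

X = sm.add_constant(df[["X1", "X2"]])          # reg Y X1 X2
model = sm.OLS(df["Y"], X).fit()
print(model.summary())

df["yhat"] = model.fittedvalues                # predict yhat
df["resid"] = model.resid                      # predict rvar, res
```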

Multiple Regression Output

Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

ANOVA
              df    SS          MS          F         Significance F
  Regression   2    29460.027   14730.013   6.53861   0.01201
  Residual    12    27033.306    2252.776
  Total       14    56493.333

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
  Intercept     306.52619       114.25389      2.68285   0.01993    57.58835   555.46404
  Price         -24.97509        10.83213     -2.30565   0.03979   -48.57626    -1.37392
  Advertising    74.13096        25.96732      2.85478   0.01449    17.55303   130.70888

The Multiple Regression Equation

Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

where
  Sales is in number of pies per week
  Price is in $
  Advertising is in $100s

b1 = −24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising.

b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price.


Is the Model Significant?

◼ F test for overall significance of the model
◼ Shows if there is a linear relationship between all of the X variables considered together and Y
◼ Use the F-test statistic
◼ Hypotheses:
  H0: β1 = β2 = … = βk = 0 (no linear relationship)
  H1: at least one βi ≠ 0 (at least one independent variable affects Y)

F Test for Overall Significance

◼ Test statistic:

FSTAT = MSR / MSE = (SSR / k) / (SSE / (n − k − 1))

where FSTAT has numerator d.f. = k and denominator d.f. = (n − k − 1).
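A minimal sketch of computing FSTAT and its p-value from the pie-sales sums of squares in the Excel output above (SSR, SSE, n, and k as reported there):

```python
from scipy import stats

SSR, SSE = 29460.027, 27033.306
n, k = 15, 2
MSR = SSR / k                                # = 14730.0
MSE = SSE / (n - k - 1)                      # = 2252.8
F_stat = MSR / MSE                           # = 6.5386
p_value = stats.f.sf(F_stat, k, n - k - 1)   # = 0.0120 (the "Significance F")
```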

F Test for Overall Significance in Excel (continued)

FSTAT = MSR / MSE = 14730.0 / 2252.8 = 6.5386

With 2 and 12 degrees of freedom, the p-value for the F test is 0.01201 (Significance F).

Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

ANOVA
              df    SS          MS          F         Significance F
  Regression   2    29460.027   14730.013   6.53861   0.01201
  Residual    12    27033.306    2252.776
  Total       14    56493.333

(The coefficients table is the same as in the Multiple Regression Output slide.)

F Test for Overall Significance (continued)

H0: β1 = β2 = 0
H1: β1 and β2 not both zero
α = .05, df1 = 2, df2 = 12

Critical value: F0.05 = 3.885

Test statistic: FSTAT = MSR / MSE = 6.5386

Decision: since the FSTAT test statistic is in the rejection region (p-value < .05), reject H0.

Conclusion: there is evidence that at least one independent variable affects Y.

Using the Equation to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

Sales = 306.526 − 24.975(Price) + 74.131(Advertising)
      = 306.526 − 24.975(5.50) + 74.131(3.5)
      = 428.62

Note that Advertising is in $100s, so $350 means that X2 = 3.5.

Predicted sales is 428.62 pies.
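A quick check of this calculation (remembering that advertising enters in $100s):

```python
# Plug the chosen price and advertising level into the fitted equation.
price, advertising = 5.50, 3.5          # $5.50 price; $350 of advertising = 3.5 (in $100s)
sales = 306.526 - 24.975 * price + 74.131 * advertising
print(sales)                            # approximately 428.62 pies
```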

Multiple Coefficient of Determination

r² = SSR / SST = 29460.0 / 56493.3 = .52148

52.1% of the variation in pie sales is explained by the variation in price and advertising.

Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

ANOVA
              df    SS          MS          F         Significance F
  Regression   2    29460.027   14730.013   6.53861   0.01201
  Residual    12    27033.306    2252.776
  Total       14    56493.333

(The coefficients table is the same as in the Multiple Regression Output slide.)

Adjusted r²

◼ r² never decreases when a new X variable is added to the model
◼ This can be a disadvantage when comparing models
◼ What is the net effect of adding a new variable?
◼ We lose a degree of freedom when a new X variable is added
◼ Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?

Adjusted r² (continued)

◼ Shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used:

r²adj = 1 − (1 − r²) [(n − 1) / (n − k − 1)]

(where n = sample size, k = number of independent variables)

◼ Penalizes excessive use of unimportant independent variables
◼ Smaller than r²
◼ Useful in comparing among models
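A minimal sketch of both r² and adjusted r² computed from the sums of squares in the pie-sales output:

```python
SSR, SST = 29460.027, 56493.333
n, k = 15, 2

r2 = SSR / SST                                   # = 0.52148
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)    # = 0.44172
```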

Adjusted r² in Excel

r²adj = .44172

44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.

Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

(The ANOVA and coefficients tables are the same as in the Multiple Regression Output slide.)

Are Individual Variables Significant? (continued)

H0: βj = 0 (no linear relationship between Xj and Y)
H1: βj ≠ 0 (linear relationship does exist between Xj and Y)

Test statistic:

tSTAT = (bj − 0) / Sbj        (df = n − k − 1)
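A minimal sketch of this test for the Price coefficient, using the estimate and standard error from the Excel output:

```python
from scipy import stats

b_price, se_price = -24.97509, 10.83213
n, k = 15, 2

t_stat = (b_price - 0) / se_price                   # = -2.306
p_value = 2 * stats.t.sf(abs(t_stat), n - k - 1)    # = 0.0398, two-tailed p-value
```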

Are Individual Variables Significant? Excel Output (continued)

t Stat for Price is tSTAT = −2.306, with p-value .0398.
t Stat for Advertising is tSTAT = 2.855, with p-value .0145.

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
  Intercept     306.52619       114.25389      2.68285   0.01993    57.58835   555.46404
  Price         -24.97509        10.83213     -2.30565   0.03979   -48.57626    -1.37392
  Advertising    74.13096        25.96732      2.85478   0.01449    17.55303   130.70888

(Regression Statistics and ANOVA are the same as in the Multiple Regression Output slide.)

Residuals in Multiple Regression

Two-variable model:  Ŷ = b0 + b1X1 + b2X2

Residual for observation Yi:  ei = (Yi − Ŷi)

The best-fit equation is found by minimizing the sum of squared errors, Σe².

[Figure: a sample observation Yi, its fitted value Ŷi on the regression plane over the (X1, X2) axes, and the residual ei.]
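A minimal sketch of computing the residuals and Σe² in Python; the data and fit are the hypothetical simulated ones from the earlier sketches, not the slides' data.

```python
import numpy as np

# Hypothetical sample and least-squares fit, as in the earlier sketches.
rng = np.random.default_rng(0)
X1, X2 = rng.uniform(0, 10, 100), rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X1 - 1.5 * X2 + rng.normal(0, 1.0, 100)
X = np.column_stack([np.ones(100), X1, X2])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

residuals = Y - X @ b           # e_i = Y_i - Yhat_i for each observation
SSE = np.sum(residuals ** 2)    # Σe², the quantity minimized by the best-fit equation
```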

Copyright ©2015 Pearson Education, Ltd. All rights reserved.
