Multiple Linear Reg Ex 2
\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_K x_{Ki}
In practice, we will always use a computer to obtain the regression slope
coefficients and other regression summary measures.
MultipleLinearRegEx2 -1
Example of MLRM (2 Independent Variables)
• A distributor of frozen dessert pies wants to evaluate factors
thought to influence demand. Data are collected for 15 weeks.
• Dependent variable: Pie sales (units per week)
• Independent variables: Price (in $) and Advertising (in $100s)
Pie Sales Example

Multiple regression equation:

Sales = b0 + b1(Price) + b2(Advertising)

\hat{y} = b_0 + b_1 X_1 + b_2 X_2

Week   Pie Sales   Price ($)   Advertising ($100s)
  1       350        5.50             3.3
  2       460        7.50             3.3
  3       350        8.00             3.0
  4       430        8.00             4.5
  5       350        6.80             3.0
  6       380        7.50             4.0
  7       430        4.50             3.0
  8       470        6.40             3.7
  9       450        7.00             3.5
 10       490        5.00             4.0
 11       340        7.20             3.5
 12       300        7.90             3.2
 13       440        5.90             4.0
 14       450        5.00             3.5
 15       300        7.00             2.7
Estimating a Multiple Linear
Regression Equation
• Excel will be used to generate the coefficients and
measures of goodness of fit for multiple
regression
• Excel:
• Tools / Data Analysis... / Regression
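The coefficients Excel produces can be cross-checked outside Excel. A minimal sketch in Python (not part of the original slides), fitting the 15-week pie sales data by ordinary least squares with numpy:

```python
import numpy as np

# Pie sales data for the 15 weeks (from the table above)
sales = np.array([350, 460, 350, 430, 350, 380, 430, 470, 450, 490,
                  340, 300, 440, 450, 300], dtype=float)
price = np.array([5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
                  7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00])
advertising = np.array([3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
                        3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7])

# Design matrix with an intercept column, then the least-squares fit
X = np.column_stack([np.ones_like(sales), price, advertising])
b, *_ = np.linalg.lstsq(X, sales, rcond=None)
b0, b1, b2 = b
# b0 ≈ 306.526, b1 ≈ -24.975, b2 ≈ 74.131, matching the Excel output
```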
Multiple Regression Output

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

ANOVA        df      SS          MS          F        Significance F
Regression    2   29460.027   14730.013   6.53861      0.01201
Residual     12   27033.306    2252.776
Total        14   56493.333
The Multiple Regression Equation

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

• b1 = -24.975: sales decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effect of advertising
• b2 = 74.131: sales increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effect of price
Coefficient of Determination, R^2
(continued)

R^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = .52148

52.1% of the variation in pie sales is explained by the variation in
price and advertising

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA        df      SS          MS          F        Significance F
Regression    2   29460.027   14730.013   6.53861      0.01201
Residual     12   27033.306    2252.776
Total        14   56493.333
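The R^2 arithmetic can be verified directly from the ANOVA sums of squares; a quick check (Python, not in the original slides):

```python
SSR = 29460.027   # regression sum of squares, from the ANOVA table
SST = 56493.333   # total sum of squares

r_squared = SSR / SST
# about 0.52148: 52.1% of the variation in pie sales is explained
```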
Estimation of Error Variance

The population model is

Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i

and the error variance is estimated by

s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - K - 1} = \frac{SSE}{n - K - 1}

where e_i = y_i - \hat{y}_i are the residuals.
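Plugging the pie sales ANOVA numbers into this formula reproduces the "Standard Error" line of the Excel output; a sketch:

```python
SSE = 27033.306    # residual (error) sum of squares from the ANOVA table
n, K = 15, 2       # 15 weeks, 2 independent variables

s2_e = SSE / (n - K - 1)   # estimated error variance, SSE / 12
s_e = s2_e ** 0.5          # standard error of the estimate
# s2_e ≈ 2252.776 (the residual MS), s_e ≈ 47.46341
```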
Adjusted Coefficient of Determination, \bar{R}^2
• R2 never decreases when a new X variable is added to the
model, even if the new variable is not an important
predictor variable
• This can be a disadvantage when comparing models
• What is the net effect of adding a new variable?
• We lose a degree of freedom when a new X variable is
added
• Did the new X variable add enough explanatory power
to offset the loss of one degree of freedom?
Adjusted Coefficient of Determination, \bar{R}^2
(continued)

• Used to correct for the fact that adding non-relevant
independent variables will still reduce the error sum of squares

\bar{R}^2 = 1 - \frac{SSE/(n - K - 1)}{SST/(n - 1)}

(where n = sample size, K = number of independent variables)
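With the pie sales numbers, the formula reproduces the "Adjusted R Square" value from the Excel output; a quick check:

```python
SSE = 27033.306   # residual sum of squares
SST = 56493.333   # total sum of squares
n, K = 15, 2      # 15 weeks, 2 independent variables

adj_r2 = 1 - (SSE / (n - K - 1)) / (SST / (n - 1))
# about 0.44172, as reported in the regression output
```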
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172      (\bar{R}^2 = .44172)
Standard Error       47.46341
Observations         15

ANOVA        df      SS          MS          F        Significance F
Regression    2   29460.027   14730.013   6.53861      0.01201
Residual     12   27033.306    2252.776
Total        14   56493.333
Coefficient of Multiple Correlation
• The coefficient of multiple correlation is the correlation between the
predicted value and the observed value of the dependent variable
R = r(\hat{y}, y) = \sqrt{R^2}
• Is the square root of the multiple coefficient of determination
• Used as another measure of the strength of the linear relationship
between the dependent variable and the independent variables
• Comparable to the correlation between Y and X in simple regression
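For the pie sales regression this relationship is easy to confirm:

```python
r_squared = 0.52148           # R Square from the Excel output
multiple_R = r_squared ** 0.5
# about 0.72213, the "Multiple R" line of the output
```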
Evaluating Individual Regression
Coefficients
• Use t-tests for individual coefficients
• Shows if a specific independent variable is conditionally
important
• Hypotheses:
• H0: βj = 0 (Xj has no linear influence on Y)
• H1: βj ≠ 0 (Xj has a linear influence on Y)
Evaluating Individual Regression Coefficients
(continued)
Test Statistic:

t = \frac{b_j - 0}{s_{b_j}} \sim t_{n-K-1}

(with n - K - 1 degrees of freedom)
Evaluating Individual
Regression Coefficients
(continued)
t-value for Price is t = -2.306, with p-value .0398
t-value for Advertising is t = 2.855, with p-value .0145

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA        df      SS          MS          F        Significance F
Regression    2   29460.027   14730.013   6.53861      0.01201
Residual     12   27033.306    2252.776
Total        14   56493.333
Example: Evaluating Individual Regression Coefficients

H0: βj = 0
H1: βj ≠ 0

From Excel output:

               Coefficients   Standard Error     t Stat    P-value
Price            -24.97509         10.83213   -2.30565    0.03979
Advertising       74.13096         25.96732    2.85478    0.01449

d.f. = 15 - 2 - 1 = 12
α = .05
t_{12, .025} = 2.1788

Decision: The test statistic for each variable falls in the rejection
region (|t| > 2.1788; both p-values < .05), so reject H0 for each variable.

Conclusion: There is evidence that both Price and Advertising affect
pie sales at α = .05.
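The decision above can be reproduced from the coefficient table alone; a sketch (Python, not in the original slides):

```python
# Coefficients and standard errors from the Excel output
b_price, se_price = -24.97509, 10.83213
b_advert, se_advert = 74.13096, 25.96732

t_price = b_price / se_price      # ≈ -2.30565
t_advert = b_advert / se_advert   # ≈ 2.85478

t_crit = 2.1788                   # t critical value, 12 d.f., alpha/2 = .025
reject_price = abs(t_price) > t_crit
reject_advert = abs(t_advert) > t_crit
# both comparisons are True, so H0 is rejected for each variable
```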
Confidence Interval Estimate for the Slope

Confidence interval limits for the population slope βj:

b_j \pm t_{n-K-1, \alpha/2} \, s_{b_j}
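For example, the 95% interval for the Price slope uses b1 = -24.97509, s_b1 = 10.83213, and t_{12,.025} = 2.1788 from the earlier slides; a sketch:

```python
b_price, se_price = -24.97509, 10.83213   # from the coefficient table
t_crit = 2.1788                           # t with n - K - 1 = 12 d.f.

half_width = t_crit * se_price
ci_low, ci_high = b_price - half_width, b_price + half_width
# roughly (-48.576, -1.374); the interval excludes 0
```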
Test on All Coefficients
• F-Test for Overall Significance of the Model
• Shows if there is a linear relationship between all of
the X variables considered together and Y
• Use F test statistic
• Hypotheses:
H0: β1 = β2 = … = βk = 0 (no linear relationship)
H1: at least one βi ≠ 0 (at least one independent
variable affects Y)
F-Test for Overall Significance

• Test statistic:

F = \frac{MSR}{s_e^2} = \frac{SSR/K}{SSE/(n - K - 1)}

where F has K (numerator) and n - K - 1 (denominator)
degrees of freedom

• The decision rule is: reject H0 if F > F_{K, n-K-1, \alpha}

ANOVA        df      SS          MS          F        Significance F
Regression    2   29460.027   14730.013   6.53861      0.01201
Residual     12   27033.306    2252.776
Total        14   56493.333
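The F statistic in the ANOVA table follows directly from the sums of squares; a quick check:

```python
SSR, SSE = 29460.027, 27033.306   # from the ANOVA table
n, K = 15, 2

MSR = SSR / K              # regression mean square, 14730.013
MSE = SSE / (n - K - 1)    # residual mean square, 2252.776
F = MSR / MSE
# about 6.53861, matching the ANOVA table (Significance F = .01201 < .05)
```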
Tests on a Subset of Regression Coefficients

• Consider the model with K x variables and r additional z variables:

y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_K x_{Ki} + \alpha_1 z_{1i} + \cdots + \alpha_r z_{ri} + \varepsilon_i

• Hypotheses:

H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_r = 0
H_1: at least one \alpha_j \neq 0  (j = 1, ..., r)
Tests on a Subset of Regression
Coefficients
(continued)
• Goal: compare the error sum of squares for the complete
model with the error sum of squares for the restricted model
• First run a regression for the complete model and obtain SSE
• Next run a restricted regression that excludes the z variables (the
number of variables excluded is r) and obtain the restricted error
sum of squares SSE(r)
• Compute the F statistic and apply the decision rule for a significance
level
Reject H_0 if

F = \frac{(SSE(r) - SSE)/r}{s_e^2} > F_{r, n-K-r-1, \alpha}
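A small helper (hypothetical, not from the slides) makes the computation concrete; the SSE numbers below are made up for illustration only:

```python
def subset_f_stat(sse_restricted, sse_full, r, n, K):
    """F statistic for jointly testing the r excluded z variables.

    K counts the x variables and r the z variables, so the full
    model has n - K - r - 1 error degrees of freedom.
    """
    s2_e = sse_full / (n - K - r - 1)          # full-model error variance
    return ((sse_restricted - sse_full) / r) / s2_e

# Hypothetical numbers: dropping r = 2 z variables raises SSE
# from 1800 to 2400, with n = 30 observations and K = 3 x variables
F = subset_f_stat(2400.0, 1800.0, r=2, n=30, K=3)
# compare F to the F(2, 24) critical value at the chosen alpha
```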