Multiple Regression (Output)
Multiple regression models the relationship between a quantitative dependent variable and two or more independent variables using a linear equation. It examines the linear relationship between one dependent variable (Y) and two or more independent variables (Xi). The multiple regression model with k independent variables is
Y = β0 + β1X1 + β2X2 + … + βkXk + ε
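For readers who want to reproduce such a fit in software, a minimal sketch in Python using the statsmodels library is given below (the example in this handout uses Excel output; the data values and column names here are hypothetical placeholders, not the pie-sales data).

import pandas as pd
import statsmodels.api as sm

# Hypothetical weekly data: a dependent variable y and two predictors x1, x2
df = pd.DataFrame({
    "y":  [300, 320, 280, 360, 310, 340, 290, 370, 330, 350],
    "x1": [6.0, 5.5, 7.0, 5.0, 6.5, 5.8, 7.2, 4.8, 6.2, 5.4],
    "x2": [3.0, 3.5, 2.8, 4.0, 3.2, 3.6, 2.5, 4.2, 3.1, 3.8],
})

X = sm.add_constant(df[["x1", "x2"]])  # adds the intercept column (b0)
model = sm.OLS(df["y"], X).fit()       # ordinary least squares fit
print(model.summary())                 # coefficients, r-square, F test, t tests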
Example 1:
A distributor of frozen dessert pies wants to evaluate factors thought to influence demand. Data are collected for 15 weeks. The dependent variable is pie sales (units per week) and the independent variables are price (in $) and advertising (in $100s).
The regression output is given below.
Questions
I. Fit the multiple linear regression equation.
II. Interpret the intercept and regression coefficients.
III. Predict sales for a week in which the selling price is $5.50 and advertising is $350.
IV. Interpret Coefficient of Determination.
V. Explain adjusted r-square.
VI. Is the model significant?
VII. Are Individual Variables Significant?
VIII. Obtain Confidence Interval Estimate for the Slope
Solution:
I. Multiple regression equation (read from the coefficient estimates in the output):
Ŷ (pie sales) = b0 + b1(Price) + b2(Advertising)
II. Interpretation of the intercept and coefficients: b0 estimates average weekly sales when price and advertising are both zero; b1 estimates the change in average sales for a $1 increase in price, holding advertising constant; b2 estimates the change in average sales for each additional $100 of advertising, holding price constant.
III. Prediction of sales for a week in which the selling price is $5.50 and advertising is $350 (X2 = 3.5 in $100s): Ŷ = b0 + b1(5.50) + b2(3.5). A sketch of this calculation follows.
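A minimal sketch of this prediction step in Python, assuming the three coefficients have been read from the output (only b1 = -24.975 is quoted later in this handout; the values used for b0 and b2 below are placeholders to be replaced with the numbers in the output).

# Coefficients read from the regression output (b0 and b2 below are placeholders)
b0 = 300.0     # intercept: replace with the value reported in the output
b1 = -24.975   # price slope (this value is quoted in part VIII of this handout)
b2 = 70.0      # advertising slope: replace with the value reported in the output

price = 5.50         # selling price in dollars
advertising = 3.5    # advertising in $100s, i.e. $350

predicted_sales = b0 + b1 * price + b2 * advertising   # Y-hat = b0 + b1*X1 + b2*X2
print(round(predicted_sales, 1))                       # predicted pies per week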
IV. Coefficient of determination (r²).
Comment: 52.1% of the variation in pie sales is explained by the variation in price and advertising.
V. Adjusted r-square.
It shows the proportion of variation in Y explained by all of the X variables, adjusted for the number of X variables used. It penalizes excessive use of unimportant independent variables, is always smaller than r-square, and is useful for comparing models.
r²adj = 0.44172
Comment: 44.2% of the variation in pie sales is explained by the variation in price and
advertising, taking into account the sample size and number of independent variables.
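As a check, the adjusted r-square can be recomputed from the quantities quoted above (r² ≈ 0.521, n = 15 weeks, k = 2 independent variables) using the standard adjustment formula; a minimal Python sketch:

# Adjusted r-square: r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
r2 = 0.521   # coefficient of determination quoted in part IV
n = 15       # number of weeks in the sample
k = 2        # number of independent variables (price and advertising)

r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(r2_adj, 3))   # about 0.441, matching the reported 0.44172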
VI. Significance of the model.
F Test for Overall Significance of the Model,
Hypotheses:
H0: β1 = β2 = … = βk = 0 (no linear relationship)
H1: at least one βi ≠ 0 (at least one independent variable affects Y)
Test statistic:
F = MSR / MSE = (SSR / k) / (SSE / (n - k - 1)), with numerator df = k and denominator df = n - k - 1
Critical region: reject H0 if F > Fα; (k, n - k - 1), or equivalently if the p-value < α.
Decision: At the 5% level of significance, the p-value < 0.05, so we reject H0. The model is significant: there is evidence that at least one independent variable affects Y.
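A sketch of how the 5% critical value and the p-value for this F test can be obtained in Python with scipy (df = k = 2 and n - k - 1 = 12; the F statistic itself is read from the ANOVA table in the output, so a placeholder value is used here):

from scipy import stats

k, n = 2, 15
df1, df2 = k, n - k - 1               # numerator and denominator degrees of freedom

f_crit = stats.f.ppf(0.95, df1, df2)  # 5% critical value, about 3.89
print(round(f_crit, 2))

F = 6.5                               # placeholder: use MSR/MSE from the ANOVA table
p_value = stats.f.sf(F, df1, df2)     # right-tail p-value for the observed F
print(p_value < 0.05)                 # True means reject H0 at the 5% level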
VII. Are Individual Variables Significant?
t-tests of individual variable slopes
Hypotheses:
H0: βj = 0 (no linear relationship)
H1: βj ≠ 0 (linear relationship does exist between Xj and Y)
Test Statistic:
t = (bj - 0) / Sbj, with df = n - k - 1
Decision: The test statistic for each variable falls in the rejection region (p-values < .05). Reject
H0 for each variable. There is evidence that both Price and Advertising affect pie sales at α = .05.
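Using the price slope and standard error quoted in part VIII (b1 = -24.975, Sb1 = 10.832), the t statistic and its two-tailed p-value can be reproduced; a minimal Python sketch:

from scipy import stats

b1, se_b1 = -24.975, 10.832   # price slope and its standard error (from part VIII)
n, k = 15, 2
df = n - k - 1                # 12 degrees of freedom

t_stat = (b1 - 0) / se_b1                   # about -2.306
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p-value, about 0.04
print(round(t_stat, 3), round(p_value, 3))
print(p_value < 0.05)                       # True, so reject H0 for Price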
VIII. Confidence Interval Estimate for the Slope
Confidence interval for the population slope βj
bj ± t(n - k - 1) · Sbj
Example: Form a 95% confidence interval for the effect of changes in price (X1) on pie sales:
-24.975 ± (2.1788)*(10.832)
So the interval is (-48.576, -1.374)
Here, b1 = -24.975, S.E.(b1) = 10.832, and t(n - k - 1) = 2.1788 (the two-tailed 5% critical value with 12 df).
(This interval does not contain zero, so price has a significant effect on sales)
Excel output also reports these interval endpoints:
Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each $1 increase in the selling price.
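The 95% interval above can be reproduced from the quoted slope, standard error, and critical t value; a minimal Python sketch:

from scipy import stats

b1, se_b1 = -24.975, 10.832   # price slope and standard error from the output
df = 15 - 2 - 1               # n - k - 1 = 12

t_crit = stats.t.ppf(0.975, df)   # 2.1788 for a 95% interval
lower = b1 - t_crit * se_b1       # about -48.576
upper = b1 + t_crit * se_b1       # about -1.374
print(round(t_crit, 4), round(lower, 3), round(upper, 3))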
Dummy Variables:
A dummy variable is a categorical explanatory variable with two levels (yes or no, on or off, male or female), coded as 0 or 1.
Ŷ = b0 + b1X1 + b2X2
Let:
Y = pie sales
X1 = price
X2 = holiday (X2 = 1 if a holiday occurred during the week; X2 = 0 if there was no holiday that week)
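A minimal Python sketch of coding a holiday dummy and including it in the regression; the data values and variable names here are made up for illustration and are not the pie-sales data.

import pandas as pd
import statsmodels.api as sm

# Made-up weekly data: pie sales, price, and whether the week contained a holiday
df = pd.DataFrame({
    "sales":   [300, 420, 330, 480, 310, 450, 340, 470],
    "price":   [7.0, 5.5, 6.8, 5.0, 7.5, 5.2, 6.5, 5.1],
    "holiday": [0,   1,   0,   1,   0,   1,   0,   1],  # 1 = holiday week, 0 = no holiday
})

X = sm.add_constant(df[["price", "holiday"]])  # intercept plus the two predictors
model = sm.OLS(df["sales"], X).fit()
print(model.params)   # the holiday coefficient estimates the shift in mean sales for holiday weeks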