Multiple Regression (Output)

Uploaded by

statistics.cou
Copyright
© All Rights Reserved

Table of Contents

Assumptions of the Classical Linear Regression Model
Multiple Linear Regression
Estimated Multiple Linear Regression Model
Example 1

Assumptions of the Classical Linear Regression Model:
1. The regression model is linear, correctly specified, and has an additive error term.
2. The error term has a zero population mean.
3. All explanatory variables are uncorrelated with the error term.
4. Observations of the error term are uncorrelated with each other (no serial correlation).
5. The error term has a constant variance (no heteroskedasticity).
6. No explanatory variable is a perfect linear function of any other explanatory variables (no
perfect multicollinearity).
7. The error term is normally distributed (not required).

Multiple Linear Regression

Multiple linear regression estimates the linear relationship between one quantitative dependent variable (Y) and two or more independent variables (Xi). The multiple regression model with k independent variables is:

Y = β0 + β1 X1 + β2 X2 + … + βk Xk + ε

where β0 is the intercept, β1, …, βk are the population slopes, and ε is the random error term.

Estimated Multiple Linear Regression Model

Ŷ = b0 + b1 X1 + b2 X2 + … + bk Xk

where b0, b1, …, bk are the sample estimates of β0, β1, …, βk.

Example 1:
A distributor of frozen dessert pies wants to evaluate factors thought to influence demand. Data are collected for 15 weeks. The dependent variable is pie sales (units per week), and the independent variables are price (in $) and advertising (in $100s).
The output is given below.

Questions
I. Fit the multiple linear regression equation.
II. Interpret the intercept and regression coefficients.
III. Predict sales for a week in which the selling price is $5.50 and advertising is $350.
IV. Interpret the coefficient of determination.
V. Explain adjusted r-square.
VI. Is the model significant?
VII. Are the individual variables significant?
VIII. Obtain a confidence interval estimate for the slope.
Solution:
I. Multiple regression equation:

Sales = b0 + b1 (Price) + b2 (Advertising)


Sales = 306.526 - 24.975 (Price) + 74.131 (Advertising)

II. Interpretation of the intercept and regression coefficients.

b0 = 306.526: if the price is $0 and no money is spent on advertising, estimated sales are, on average, about 306.5 pies per week. Since $0 is not a realistic selling price, this value should be interpreted with caution.
b1 = -24.975: sales decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising.
b2 = 74.131: sales increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price.

III. Prediction of sales for a week in which the selling price is $5.50 and advertising is $350.

Note that advertising is in $100s, so $350 means that X2 = 3.5.

Predicted sales = 306.526 - 24.975 (5.50) + 74.131 (3.5) = 428.62

So, predicted sales are 428.62 pies per week.
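The prediction can be checked with a few lines of Python. This is only a sketch: the coefficients come from the fitted equation in part I, while the function name `predict_sales` and its argument names are illustrative choices, not part of the original output.

```python
# Coefficients from the fitted equation in part I.
B0 = 306.526            # intercept
B_PRICE = -24.975       # slope for price (per $1)
B_ADVERTISING = 74.131  # slope for advertising (per $100)


def predict_sales(price_dollars, advertising_dollars):
    """Predicted weekly pie sales (hypothetical helper).

    Advertising is supplied in dollars and converted to $100s,
    because the model was fitted with advertising in $100s.
    """
    advertising_hundreds = advertising_dollars / 100.0
    return B0 + B_PRICE * price_dollars + B_ADVERTISING * advertising_hundreds


predicted = predict_sales(5.50, 350)
print(f"Predicted sales: {predicted:.2f} pies per week")
```

Passing the advertising amount in dollars and converting inside the function guards against the most common mistake in this example, which is entering 350 instead of 3.5.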
IV. Interpretation of the coefficient of determination (r-square).

r² = SSR / SST = 29460.0 / 56493.3 = 0.52148

Comment: 52.1% of the variation in pie sales is explained by the variation in price and
advertising.
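As a quick check, the ratio can be recomputed from the two sums of squares quoted in the output. The variable names below are illustrative.

```python
# Sums of squares quoted from the regression output.
ssr = 29460.0   # regression (explained) sum of squares
sst = 56493.3   # total sum of squares

r_squared = ssr / sst   # proportion of variation in sales explained
sse = sst - ssr         # unexplained (error) sum of squares

print(f"r-square = {r_squared:.5f}")
print(f"Unexplained share = {sse / sst:.5f}")
```

The explained and unexplained shares sum to 1, which is another way of reading the result: roughly 48% of the variation in pie sales is not accounted for by price and advertising.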
V. Adjusted r-square.
It shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used. It penalizes excessive use of unimportant independent variables, and it is never larger than r-square. It is useful for comparing models with different numbers of independent variables.

Adjusted r² = 0.44172

Comment: 44.2% of the variation in pie sales is explained by the variation in price and
advertising, taking into account the sample size and number of independent variables.
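The adjusted value can be reproduced from the quantities already given. The handout does not state the formula, so the sketch below assumes the standard textbook definition, adjusted r² = 1 - (1 - r²)(n - 1)/(n - k - 1), with n = 15 weeks and k = 2 independent variables.

```python
# Quantities from the example: 15 weeks of data, 2 independent variables.
n = 15
k = 2
ssr = 29460.0   # regression sum of squares
sst = 56493.3   # total sum of squares

r_squared = ssr / sst

# Standard adjusted r-square formula (assumed; not printed in the handout).
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"r-square          = {r_squared:.5f}")
print(f"adjusted r-square = {adjusted_r_squared:.5f}")
```

The factor (n - 1)/(n - k - 1) grows with k, so adding a variable raises the adjusted value only if it improves r-square by more than the penalty.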
VI. Significance of the model.
F Test for Overall Significance of the Model,
Hypotheses:
H0: β1 = β2 = … = βk = 0 (no linear relationship)
H1: at least one βi ≠ 0 (at least one independent variable affects Y)
Test statistic:

F = MSR / MSE = (SSR / k) / (SSE / (n - k - 1))

Critical region: reject H0 if F > F(α; k, n - k - 1).

Decision: At the 5% level of significance, the p-value is less than 0.05. Therefore, we reject H0; that is, the model is significant. There is evidence that at least one independent variable affects Y.
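The F statistic can be recomputed from SSR and SST as a sanity check. In the sketch below, the sums of squares, n = 15 and k = 2 come from the example. The p-value line is an added assumption: it uses a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here. For other values of k, an F table or a statistics library would be needed.

```python
# Values from the regression output quoted in this example.
ssr = 29460.0   # regression sum of squares
sst = 56493.3   # total sum of squares
n, k = 15, 2    # 15 weekly observations, 2 independent variables

sse = sst - ssr            # error (residual) sum of squares
msr = ssr / k              # mean square regression
mse = sse / (n - k - 1)    # mean square error
f_stat = msr / mse

# Upper-tail probability of F(2, df2): valid ONLY for numerator df = 2.
df2 = n - k - 1
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(f"F = {f_stat:.4f}, p-value = {p_value:.4f}")
print("Reject H0 at the 5% level" if p_value < 0.05 else "Do not reject H0")
```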
VII. Are Individual Variables Significant?
t-tests of individual variable slopes
Hypotheses:
• H0: βj = 0 (no linear relationship)
• H1: βj ≠ 0 (linear relationship does exist between Xj and Y)
Test statistic:

t = (bj - 0) / Sbj
(df = n – k – 1)

t-value for Price is t = -2.306, with p-value .0398


t-value for Advertising is t = 2.855, with p-value .0145
Critical region: reject H0 if |t| > t(α/2; n - k - 1).

Decision: The test statistic for each variable falls in the rejection region (p-values < .05). Reject H0 for each variable. There is evidence that both Price and Advertising affect pie sales at α = .05.
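The t statistic for Price can be recomputed from its coefficient and standard error, and both statistics can be compared with the critical value. This is a sketch under stated assumptions: the standard error 10.832 and the two-tailed critical value 2.1788 (df = 12) are the figures used in the confidence interval part of this example. The standard error for Advertising is not quoted in the text, so its reported t-value is used directly.

```python
# Price: coefficient and standard error as quoted in the example.
b_price = -24.975
se_price = 10.832
t_price = (b_price - 0) / se_price   # test statistic under H0: beta = 0

# Advertising: reported t-value (its standard error is not quoted).
t_advertising = 2.855

# Two-tailed 5% critical value for df = n - k - 1 = 15 - 2 - 1 = 12.
t_critical = 2.1788

for name, t_value in [("Price", t_price), ("Advertising", t_advertising)]:
    significant = abs(t_value) > t_critical
    print(f"{name}: t = {t_value:.3f}, reject H0: {significant}")
```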
VIII. Confidence Interval Estimate for the Slope
Confidence interval for the population slope βj:

bj ± t(n - k - 1) Sbj

Example: Form a 95% confidence interval for the effect of changes in price (X1) on pie sales.
Here, b1 = -24.975, Sb1 = 10.832, and t(n - k - 1) = 2.1788 with df = 15 - 2 - 1 = 12.
-24.975 ± (2.1788)(10.832)
So the interval is (-48.576, -1.374).
(This interval does not contain zero, so price has a significant effect on sales.)
Excel output also reports these interval endpoints.
Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each $1 increase in the selling price.

Similarly, we can obtain the confidence interval for advertising.
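The interval arithmetic for Price is sketched below. The function name `slope_confidence_interval` is a hypothetical helper, and the inputs are the values quoted in the example. The same function would give the interval for Advertising once its standard error is read from the output, which is not reproduced in the text.

```python
def slope_confidence_interval(b, se, t_crit):
    """Return (lower, upper) for b +/- t_crit * se (hypothetical helper)."""
    margin = t_crit * se
    return b - margin, b + margin


# Price slope, its standard error, and the critical t for df = 12 at 95%.
lower, upper = slope_confidence_interval(b=-24.975, se=10.832, t_crit=2.1788)

print(f"95% CI for the price slope: ({lower:.3f}, {upper:.3f})")
print("Interval excludes zero:", not (lower <= 0 <= upper))
```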

Dummy Variables:
A dummy variable is a categorical explanatory variable with two levels:
• Yes or no, on or off, male or female.
• Coded as 0 or 1.

Ŷ = b0 + b1 X1 + b2 X2
Let:
Y = pie sales
X1 = price
X2 = holiday (X2 = 1 if a holiday occurred during the week) (X2 = 0 if there was no holiday that
week)

Sales = 300 - 30 (Price) + 15 (Holiday)

b2 = 15: on average, sales were 15 pies greater in weeks with a holiday than in weeks without a holiday, given the same price.
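The effect of the dummy variable can be seen by evaluating the equation twice at the same price. In the sketch below, the coefficients are those of the holiday equation above. The price of $6.00 is an arbitrary illustrative value, not taken from the example.

```python
def predict_sales(price, holiday):
    """Predicted pie sales from Sales = 300 - 30(Price) + 15(Holiday).

    holiday must be 1 (holiday week) or 0 (no holiday).
    """
    return 300 - 30 * price + 15 * holiday


price = 6.00  # illustrative price, chosen arbitrarily
with_holiday = predict_sales(price, holiday=1)
without_holiday = predict_sales(price, holiday=0)

print(f"Holiday week:     {with_holiday:.0f} pies")
print(f"Non-holiday week: {without_holiday:.0f} pies")
print(f"Difference:       {with_holiday - without_holiday:.0f} pies")
```

The difference equals b2 at any price, because the dummy shifts the intercept (from 300 to 315) and leaves the slope for price unchanged.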
