Assignment5_Regression Analysis - solutions
The operations manager of a musical instrument distributor feels that demand for bass drums may be
related to the number of television appearances by the popular rock group Green Shades during the
preceding month. The manager has collected the data shown in the following table:
Demand for Bass Drums    Green Shades TV Appearances
3                        3
6                        4
7                        7
5                        4
10                       9
8                        5
6                        5
(a) Using the formulas of linear regression in this chapter, calculate the intercept, b₀, and the slope, b₁.
Verify your answer with Excel. Present your Excel output in addition to your manual calculation.
(b) Using the equations presented in this chapter, compute the SST, SSE, and SSR. Find the regression
equation for these data. Show all your calculation steps and verify them with Excel, too. Present your
Excel output in addition to your manual calculation.
(c) Based on the available information, what is your estimate for bass drum sales if the Green Shades
performed on TV four times in the last month?
Solution:
(a and b)
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.8783
R Square             0.7714
Adjusted R Square    0.7257
Standard Error       1.1655
Observations         7.0000
ANOVA
              df        SS         MS         F          Significance F
Regression    1.0000    22.9222    22.9222    16.8740    0.0093
Residual      5.0000    6.7921     1.3584
Total         6.0000    29.7143
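The manual calculation for parts (a) and (b) can be cross-checked with a short script. This is a minimal sketch, assuming the first column of the data table is bass drum demand (Y) and the second is TV appearances (X); that reading reproduces the total sum of squares of 29.7143 in the Excel output. The variable names are illustrative only.

```python
# Data from the table: y = bass drum demand, x = Green Shades TV appearances
# (column assignment is an assumption; see the note above).
demand = [3, 6, 7, 5, 10, 8, 6]
tv = [3, 4, 7, 4, 9, 5, 5]

n = len(demand)
x_bar = sum(tv) / n
y_bar = sum(demand) / n

# Slope and intercept from the least-squares formulas.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv, demand))
s_xx = sum((x - x_bar) ** 2 for x in tv)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Sums of squares: total, error (residual), and regression.
fitted = [b0 + b1 * x for x in tv]
sst = sum((y - y_bar) ** 2 for y in demand)
sse = sum((y - f) ** 2 for y, f in zip(demand, fitted))
ssr = sum((f - y_bar) ** 2 for f in fitted)

r_square = ssr / sst
f_stat = (ssr / 1) / (sse / (n - 2))  # 1 regression df, n - 2 residual df

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"SST = {sst:.4f}, SSE = {sse:.4f}, SSR = {ssr:.4f}")
print(f"R Square = {r_square:.4f}, F = {f_stat:.4f}")
```

Under that column assignment the script gives b0 ≈ 1.4101 and b1 ≈ 0.9494, so the regression equation is Ŷ = 1.4101 + 0.9494X, and the sums of squares agree with the ANOVA table.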
(c)
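For part (c), the estimate comes from plugging X = 4 into the fitted line. The sketch below refits the line from the data table and evaluates it, again assuming the first column is demand and the second is TV appearances; `fit_line` is a hypothetical helper name, not something from the chapter.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Assumed column order: demand first, TV appearances second.
demand = [3, 6, 7, 5, 10, 8, 6]
tv = [3, 4, 7, 4, 9, 5, 5]

b0, b1 = fit_line(tv, demand)
predicted = b0 + b1 * 4  # four TV appearances last month

print(f"Predicted bass drum demand: {predicted:.2f}")
```

This prints a predicted demand of about 5.21 bass drums.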
Problem 2 (Salary)
The following data give the starting salary for students who recently graduated from a local university
and accepted jobs soon after graduation. The starting salary, grade-point average (GPA), and major
(business or other) are provided.
Salary  29500  38000     44800     36500  42000     31500  36200     30000
GPA     3.1    3.5       3.8       2.9    3.4       2.1    2.5       3.6
Major   Other  Business  Business  Other  Business  Other  Business  Other
(a) Develop a linear regression model that could be used to predict starting salary based on GPA and
major.
(b) Is this a good model? Comment based on the fit and the statistical significance of the model (use
significance level 𝛼 = 0.05).
(c) What does the model say about the starting salary for a business major compared to a nonbusiness
major? Specifically, does the business major significantly impact the starting salary?
Solution
(b) The p-value for the F test is 0.058, which is above 0.05, so the model is not significant at the 5% level.
Hence, the model is not very useful for predicting starting salary. The R-squared and adjusted R-squared
values look reasonable, but with only eight observations that apparent fit may simply be a product of the
very small sample size.
(c) The t-test for Z, the dummy variable for major, is significant at the 5% level (p-value = 0.04). Thus,
business majors do have a significantly different starting salary from other majors.
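The test statistics behind these conclusions can be reproduced from the data table. This is a sketch under two stated assumptions: Z is coded 1 for Business and 0 for Other, and the critical values F(0.05; 2, 5) = 5.79 and t(0.025; 5) = 2.571 are taken from standard tables rather than from the assignment. With only two predictors, the normal equations reduce to a 2x2 system in deviation form, so no matrix library is needed.

```python
from math import sqrt

salary = [29500, 38000, 44800, 36500, 42000, 31500, 36200, 30000]
gpa = [3.1, 3.5, 3.8, 2.9, 3.4, 2.1, 2.5, 3.6]
major = ["Other", "Business", "Business", "Other",
         "Business", "Other", "Business", "Other"]
z = [1 if m == "Business" else 0 for m in major]  # assumed dummy coding

n, k = len(salary), 2  # observations, predictors


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-deviations of u and v about their means."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


s_gg, s_zz, s_gz = cross(gpa, gpa), cross(z, z), cross(gpa, z)
s_gy, s_zy = cross(gpa, salary), cross(z, salary)

# Solve the 2x2 normal equations by Cramer's rule.
det = s_gg * s_zz - s_gz ** 2
b_gpa = (s_gy * s_zz - s_gz * s_zy) / det
b_z = (s_gg * s_zy - s_gz * s_gy) / det
b0 = mean(salary) - b_gpa * mean(gpa) - b_z * mean(z)

sst = cross(salary, salary)
ssr = b_gpa * s_gy + b_z * s_zy
sse = sst - ssr
mse = sse / (n - k - 1)

r_square = ssr / sst
f_stat = (ssr / k) / mse
t_z = b_z / sqrt(mse * s_gg / det)  # t statistic for the major dummy

F_CRIT = 5.79   # F(0.05; 2, 5), from standard tables
T_CRIT = 2.571  # two-sided t(0.025; 5), from standard tables

print(f"Salary = {b0:.1f} + {b_gpa:.1f} * GPA + {b_z:.1f} * Z")
print(f"R Square = {r_square:.4f}")
print(f"F = {f_stat:.3f}, exceeds critical value: {f_stat > F_CRIT}")
print(f"t(Z) = {t_z:.3f}, exceeds critical value: {abs(t_z) > T_CRIT}")
```

The F statistic falls just short of its critical value while the t statistic for Z exceeds its own, which matches the p-values of 0.058 and 0.04 quoted in the solution.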
Solutions to Assignment 5 – Regression Analysis
A company is looking to develop a regression model to predict the number of units sold for some
high-end chocolate. Data is provided below:
A)
Sales (units) Price ($) Advertising ($) Holiday
438 100 50 No
560 120 40 Yes
459 110 45 No
516 103 55 Yes
493 108 40 No
498 109 30 No
514 106 45 Yes
After conducting the analysis in Excel, they obtain the following output with many missing values that are
marked by “?”.
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.9585
R Square             ?
Adjusted R Square    ?
Standard Error       16.0912
Observations         ?
ANOVA
              df    SS           MS    F    Significance F
Regression    ?     8784.0756    ?     ?    0.0383
Residual      ?     ?            ?
Total         ?     9560.8571
b) What is the predicted sales volume in a holiday season if the price is $115 and advertising is $48?
c) Please calculate and fill in all missing values according to the information provided. Show your manual
calculation steps and verify them using Excel, too.
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.9585
R Square             0.9188
Adjusted R Square    0.8375
Standard Error       16.0912
Observations         7
ANOVA
              df    SS           MS           F          Significance F
Regression    3     8784.0756    2928.0252    11.3083    0.0383
Residual      3     776.7816     258.9272
Total         6     9560.8571
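The filled-in values follow from the three numbers that were given (SSR, SST, and the seven data rows) plus the count of X variables. The sketch below retraces that arithmetic; the variable names are illustrative, and k = 3 assumes the model uses Price, Advertising, and Holiday as its predictors.

```python
from math import sqrt

# Given in the partial Excel output and the data table.
ssr = 8784.0756  # regression sum of squares
sst = 9560.8571  # total sum of squares
n = 7            # observations (rows in the data table)
k = 3            # predictors: Price, Advertising, Holiday (assumed)

# Degrees of freedom.
df_regression = k
df_residual = n - k - 1
df_total = n - 1

# Remaining ANOVA entries.
sse = sst - ssr
msr = ssr / df_regression
mse = sse / df_residual
f_stat = msr / mse

# Regression statistics.
r_square = ssr / sst
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
multiple_r = sqrt(r_square)
standard_error = sqrt(mse)

print(f"df: {df_regression}, {df_residual}, {df_total}")
print(f"SSE = {sse:.4f}, MSR = {msr:.4f}, MSE = {mse:.4f}, F = {f_stat:.4f}")
print(f"R Square = {r_square:.4f}, Adjusted R Square = {adj_r_square:.4f}")
print(f"Multiple R = {multiple_r:.4f}, Standard Error = {standard_error:.4f}")
```

The computed Multiple R and Standard Error match the two values Excel already reported (0.9585 and 16.0912), which is a useful consistency check on the reconstruction. SSE comes out as 776.7815 here against 776.7816 in the table; the difference is rounding in the four-decimal inputs.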
Yes.
The overall model is statistically significant (Significance F = 0.0383 < 0.05), and the fit of the model is also good (R Square = 0.9188).
Among the X variables, only the Holiday variable is statistically significant.
“In a multiple linear regression model, an X variable that does not pass the test of significance at 𝛼 =
0.05 may indeed be significantly related to the Y variable.” Is this statement true or false? Explain.
True.
Multicollinearity may exist among the X variables.
In that case the individual t-tests may not be reliable: a variable that is genuinely related to Y may appear insignificant.
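One standard way to see why, offered here as background rather than as part of the assignment's solution, is the formula for the standard error of a coefficient in multiple regression. The symbols are assumptions of this sketch: s is the regression standard error, S_jj is the sum of squared deviations of X_j about its mean, and R_j² is the R-squared from regressing X_j on the other X variables.

```latex
\operatorname{SE}(b_j) = \frac{s}{\sqrt{S_{jj}\,\bigl(1 - R_j^{2}\bigr)}},
\qquad
t_j = \frac{b_j}{\operatorname{SE}(b_j)},
\qquad
S_{jj} = \sum_{i=1}^{n} \bigl(x_{ij} - \bar{x}_j\bigr)^{2}
```

When X_j is highly correlated with the other X variables, R_j² is close to 1, so the standard error grows and the t statistic shrinks, even if X_j has a real relationship with Y.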