Assignment5_Regression Analysis - solutions

The document provides solutions to four regression analysis problems, including calculating demand for bass drums based on TV appearances, predicting starting salaries based on GPA and major, forecasting chocolate sales using various factors, and discussing pitfalls in regression. Key findings include the significance of the business major on salary and the statistical validity of the models used. The document emphasizes the importance of understanding model significance and the potential for multicollinearity affecting results.


Solutions to Assignment 5 – Regression Analysis

Problem 1 (Bass Drums)

The operations manager of a musical instrument distributor feels that demand for bass drums may be
related to the number of television appearances by the popular rock group Green Shades during the
preceding month. The manager has collected the data shown in the following table:

Demand for Bass Drums    Green Shades TV Appearances
3                        3
6                        4
7                        7
5                        4
10                       9
8                        5
6                        5

(a) Using the formulas of linear regression in this chapter, calculate the intercept, b0, and the slope, b1.
Verify your answer with Excel. Present your Excel output in addition to your manual calculation.
(b) Using the equations presented in this chapter, compute the SST, SSE, and SSR. Find the regression
equation for these data. Show all your calculation steps and verify them with Excel, too. Present your
Excel output in addition to your manual calculation.
(c) Based on the available information, what is your estimate for bass drum sales if the Green Shades
performed on TV four times in the last month?
Solution:

(a and b)
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.8783
R Square             0.7714
Adjusted R Square    0.7257
Standard Error       1.1655
Observations         7

ANOVA
              df    SS         MS         F          Significance F
Regression    1     22.9222    22.9222    16.8740    0.0093
Residual      5     6.7921     1.3584
Total         6     29.7143

                              Coefficients    Standard Error    t Stat    P-value    Lower 95%
Intercept                     1.4101          1.2987            1.0858    0.3271     -1.9283
Green Shades TV Appearance    0.9494          0.2311            4.1078    0.0093     0.3553

(c)

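Using the regression equation, Predicted Demand = 1.4101 + 0.9494*(TV Appearances), with four appearances:

Estimated sales = 1.4101 + 0.9494*4 = 5.21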
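
The manual calculation requested in parts (a) and (b) can be sketched in a few lines of Python. This is only an illustration of the standard least-squares formulas applied to the seven data points in the table; the variable names are chosen here for readability and are not part of the assignment.

```python
# Data from the Problem 1 table, one entry per month.
demand = [3, 6, 7, 5, 10, 8, 6]   # Y: demand for bass drums
tv = [3, 4, 7, 4, 9, 5, 5]        # X: Green Shades TV appearances

n = len(demand)
x_bar = sum(tv) / n
y_bar = sum(demand) / n

# Slope b1 = S_xy / S_xx and intercept b0 = y_bar - b1 * x_bar.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv, demand))
s_xx = sum((x - x_bar) ** 2 for x in tv)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Sums of squares: total, error (residual), and regression.
fitted = [b0 + b1 * x for x in tv]
sst = sum((y - y_bar) ** 2 for y in demand)
sse = sum((y - f) ** 2 for y, f in zip(demand, fitted))
ssr = sum((f - y_bar) ** 2 for f in fitted)

r_squared = ssr / sst
forecast_4 = b0 + b1 * 4  # part (c): four TV appearances

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"SST = {sst:.4f}, SSE = {sse:.4f}, SSR = {ssr:.4f}")
print(f"R Square = {r_squared:.4f}, forecast at X = 4: {forecast_4:.2f}")
```

The printed values should agree with the Coefficients, SS, and R Square entries of the Excel output, and SST should equal SSE + SSR up to rounding.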



Problem 2 (Salary)

The following data give the starting salary for students who recently graduated from a local university
and accepted jobs soon after graduation. The starting salary, grade-point average (GPA), and major
(business or other) are provided.
Salary    GPA    Major
29500     3.1    Other
38000     3.5    Business
44800     3.8    Business
36500     2.9    Other
42000     3.4    Business
31500     2.1    Other
36200     2.5    Business
30000     3.6    Other

(a) Develop a linear regression model that could be used to predict starting salary based on GPA and
major.

(b) Is this a good model? Comment based on the fit and the statistical significance of the model (use
significance level 𝛼 = 0.05).

(c) What does the model say about the starting salary for a business major compared to a nonbusiness
major? Specifically, does the business major significantly impact the starting salary?

Solution

(a) The estimated prediction equation is

Predicted Salary = 26667 + 1780.5 × GPA + 7707.3 × Z,

where Z = 1 if the major is business and 0 otherwise.

(b) The p-value for the F test is 0.058, which exceeds 0.05, so the model is not statistically significant at the 5% level.
Hence, the model is not very useful for predicting the starting salary. The R-squared and adjusted
R-squared values look reasonable, but with only eight observations this apparent fit may simply reflect the
very small sample size.

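(c) The coefficient of Z is 7707.3, so the model estimates that a business major starts about 7,707 higher
than a nonbusiness major with the same GPA. The t-test for Z has a p-value of 0.04, which is significant at
the 5% level. Thus, the business major does have a significantly different starting salary compared with
other majors.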
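
As a cross-check on the Excel results, the model in part (a) can be fitted with a short Python sketch that solves the normal equations directly. The helper `solve_normal_equations` is written here for illustration only. The p-value line uses a closed form that holds only when the F test has exactly 2 numerator degrees of freedom, as it does for this two-variable model.

```python
salary = [29500, 38000, 44800, 36500, 42000, 31500, 36200, 30000]
gpa = [3.1, 3.5, 3.8, 2.9, 3.4, 2.1, 2.5, 3.6]
major = ["Other", "Business", "Business", "Other",
         "Business", "Other", "Business", "Other"]

# Design matrix rows: [1, GPA, Z], with Z = 1 for business majors and 0 otherwise.
X = [[1.0, g, 1.0 if m == "Business" else 0.0] for g, m in zip(gpa, major)]


def solve_normal_equations(X, y):
    """Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


b0, b_gpa, b_z = solve_normal_equations(X, salary)

n, k = len(salary), 2                 # 8 observations, 2 X variables
df_resid = n - k - 1
y_bar = sum(salary) / n
fitted = [b0 + b_gpa * row[1] + b_z * row[2] for row in X]
sse = sum((y - f) ** 2 for y, f in zip(salary, fitted))
sst = sum((y - y_bar) ** 2 for y in salary)
ssr = sst - sse

r_squared = ssr / sst
f_stat = (ssr / k) / (sse / df_resid)
# Upper-tail probability of F(2, df_resid); valid only for 2 numerator df.
p_value = (1 + 2 * f_stat / df_resid) ** (-df_resid / 2)

print(f"Salary = {b0:.0f} + {b_gpa:.1f} * GPA + {b_z:.1f} * Z")
print(f"R Square = {r_squared:.4f}, F = {f_stat:.4f}, Significance F = {p_value:.3f}")
```

The coefficients should match part (a), and the Significance F should come out near the 0.058 quoted in part (b).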

Problem 3 (Chocolate Data)

A company is looking to develop a regression model to predict the number of units sold for some
high-end chocolate. Data is provided below:
Sales (units)    Price ($)    Advertising ($)    Holiday
438              100          50                 No
560              120          40                 Yes
459              110          45                 No
516              103          55                 Yes
493              108          40                 No
498              109          30                 No
514              106          45                 Yes

After conducting the analysis in Excel, they obtain the following output with many missing values that are
marked by “?”.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9585
R Square             ?
Adjusted R Square    ?
Standard Error       16.0912
Observations         ?

ANOVA
              df    SS           MS    F    Significance F
Regression    ?     8784.0756    ?     ?    0.0383
Residual      ?     ?            ?
Total         ?     9560.8571

                   Coefficients    Standard Error    t Stat    P-value    Lower 95%
Intercept          381.7020        197.1301          1.9363    0.1482     -245.6540
Price ($)          1.5935          1.4850            ?         0.3619     -3.1324
Advertising ($)    -1.9347         1.2228            ?         0.2118     -5.8261
Holiday            63.8318         15.9239           ?         0.0279     13.1548

a) Define the variables and write down the regression equation.

X variables: Price ($), Advertising ($), and Holiday, where Holiday is a dummy variable equal to 1 for Yes and 0 for No
Y variable: Sales (units)
Regression equation: Predicted Sales = 381.7020 + 1.5935*Price – 1.9347*Advertising + 63.8318*Holiday

b) What is the predicted sales volume in a holiday season if the price is $115 and advertising is $48?

Predicted Sales = 381.7020 + 1.5935*115 – 1.9347*48 + 63.8318*1 = 535.92, or approximately 536 units
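
The arithmetic in part b) can be checked with a small Python sketch. The function name `predict_sales` and the dictionary layout are illustrative choices, and the coefficients are copied from the Excel output rather than re-estimated.

```python
# Coefficients copied from the Excel output for Problem 3.
COEF = {
    "intercept": 381.7020,
    "price": 1.5935,
    "advertising": -1.9347,
    "holiday": 63.8318,
}


def predict_sales(price, advertising, holiday):
    """Predicted units sold; holiday is True for a holiday season."""
    z = 1 if holiday else 0  # dummy variable: 1 for Yes, 0 for No
    return (COEF["intercept"]
            + COEF["price"] * price
            + COEF["advertising"] * advertising
            + COEF["holiday"] * z)


holiday_sales = predict_sales(115, 48, holiday=True)
non_holiday_sales = predict_sales(115, 48, holiday=False)

print(f"Holiday season:     {holiday_sales:.2f}")
print(f"Non-holiday season: {non_holiday_sales:.2f}")
```

The difference between the two predictions is exactly the Holiday coefficient, which is how a dummy variable shifts the intercept.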



c) Please calculate and fill in all missing values according to the information provided. Show your manual
calculation steps and verify them using Excel, too.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9585
R Square             0.9188
Adjusted R Square    0.8375
Standard Error       16.0912
Observations         7

ANOVA
              df    SS           MS           F          Significance F
Regression    3     8784.0756    2928.0252    11.3083    0.0383
Residual      3     776.7816     258.9272
Total         6     9560.8571

                   Coefficients    Standard Error    t Stat     P-value    Lower 95%
Intercept          381.7020        197.1301          1.9363     0.1482     -245.6540
Price ($)          1.5935          1.4850            1.0730     0.3619     -3.1324
Advertising ($)    -1.9347         1.2228            -1.5822    0.2118     -5.8261
Holiday            63.8318         15.9239           4.0085     0.0279     13.1548
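
The manual steps behind the filled-in values can be sketched as follows. Each missing entry follows from quantities already present in the partial output, together with n = 7 (the number of rows in the data table) and k = 3 (the number of X variables). The variable names are illustrative.

```python
import math

# Values given in the partial Excel output.
ss_regression = 8784.0756
ss_total = 9560.8571
n = 7  # observations: rows in the data table
k = 3  # X variables: Price, Advertising, Holiday
coef_and_se = {
    "Price": (1.5935, 1.4850),
    "Advertising": (-1.9347, 1.2228),
    "Holiday": (63.8318, 15.9239),
}

# Degrees of freedom.
df_regression = k
df_residual = n - k - 1
df_total = n - 1

# ANOVA table entries.
ss_residual = ss_total - ss_regression
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Regression statistics.
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
std_error = math.sqrt(ms_residual)  # should agree with the given 16.0912

# t Stat = coefficient / standard error.
t_stats = {name: b / se for name, (b, se) in coef_and_se.items()}

print(f"R Square = {r_squared:.4f}, Adjusted R Square = {adj_r_squared:.4f}")
print(f"df = {df_regression}, {df_residual}, {df_total}")
print(f"SS residual = {ss_residual:.4f}")
print(f"MS = {ms_regression:.4f}, {ms_residual:.4f}, F = {f_stat:.4f}")
for name, t in t_stats.items():
    print(f"t Stat ({name}) = {t:.4f}")
```

Small differences in the last decimal place relative to the Excel output are expected, because the inputs here are already rounded to four decimals.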

d) Is this a good model? Please explain. (Use significance level 𝛼 = 0.05.)

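Yes.
The overall model is statistically significant: Significance F = 0.0383, which is less than 0.05.
The fit of the model is also good: R Square = 0.9188 and Adjusted R Square = 0.8375.
Among the X variables, only Holiday is individually significant (p-value 0.0279 < 0.05). Price (p-value 0.3619) and Advertising (p-value 0.2118) are not.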

Problem 4 (Pitfalls in Regression)

“In a multiple linear regression model, an X variable that does not pass the test of significance at 𝛼 =
0.05 may be indeed significantly related to the Y variable. ” Is this statement true or false? Explain.

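True.
Multicollinearity may exist among the X variables.
When X variables are highly correlated with each other, the standard errors of their coefficients are inflated, so the individual t-tests may not be reliable: a variable that is truly related to Y may appear insignificant.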
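
To see how strongly multicollinearity can affect the individual t-tests, consider the simplest case of a model with two X variables whose sample correlation is r. In that case the variance of each slope estimate is multiplied by the variance inflation factor, VIF = 1 / (1 - r^2), relative to uncorrelated predictors. The sketch below only tabulates this factor for a few illustrative values of r. It is not fitted to any data in this assignment.

```python
import math


def variance_inflation_factor(r):
    """VIF for either of two predictors whose sample correlation is r."""
    return 1.0 / (1.0 - r ** 2)


# Illustrative correlations between the two X variables (assumed values).
for r in (0.0, 0.5, 0.9, 0.99):
    vif = variance_inflation_factor(r)
    se_multiplier = math.sqrt(vif)  # the standard error grows by sqrt(VIF)
    print(f"r = {r:4.2f}   VIF = {vif:6.2f}   SE multiplier = {se_multiplier:5.2f}")
```

Because the t statistic is the coefficient divided by its standard error, a larger standard error pushes the t statistic toward zero and the p-value upward, even when the variable is genuinely related to Y.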
