Hendrickson Assignment4
X-Variable Y-Variable
YouTube Views Guitar Sales
30 8
40 11
70 12
60 10
80 15
50 13
[Chart: Guitar Sales vs. YouTube Views; trendline f(x) = 0.1 x + 6, R² = 0.593220338983051]
(a) Graph these data to see whether a linear equation might describe the relationship between the views on YouTube a
See above
(b) Using Excel, compute the SST, SSE, and SSR. Find the least-squares regression line for these data.
See Summary Output: SSR = 17.50, SSE = 12.00, SST = 29.50. The least-squares regression line is y = 0.1x + 6.
(c) Using the regression equation, predict guitar sales if there were 40,000 views last month.
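The YouTube Views values in the data run from 30 to 80, so 40,000 views is entered as x = 40, assuming the views column is recorded in thousands.
y = 0.1*40 + 6
y = 4 + 6 = 10 guitars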
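The prediction in (c) can be sanity-checked by evaluating the fitted line in the same units as the data. The sketch below is only illustrative: the function name `predict_guitar_sales` is made up here, and it assumes the YouTube Views column is recorded in thousands, which the data range (30 to 80) suggests but the sheet does not state.

```python
def predict_guitar_sales(views_in_thousands):
    """Evaluate the fitted line y = 0.1 x + 6 from the regression output."""
    slope = 0.1      # coefficient on YouTube Views (Summary Output)
    intercept = 6.0  # intercept (Summary Output)
    return slope * views_in_thousands + intercept


# ASSUMPTION: views are in thousands, so 40,000 views corresponds to x = 40.
predicted = predict_guitar_sales(40)
print(round(predicted, 2))
```

Under that assumption the prediction falls inside the observed sales range of 8 to 15, which is what one would expect for an x value that lies inside the observed views range.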
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.77
R Square 0.59
Adjusted R Square 0.49
Standard Error 1.73
Observations 6.00
ANOVA
df SS MS F Significance F
SSR Regression 1.00 17.50 17.50 5.83 0.07
SSE Residual 4.00 12.00 3.00
SST Total 5.00 29.50
               Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept      6.00          2.38            2.52    0.07     -0.62      12.62      -0.62        12.62
YouTube Views  0.10          0.04            2.42    0.07     -0.01      0.21       -0.01        0.21
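The SST, SSE, and SSR values in the ANOVA table can be reproduced directly from the six data points. This is a minimal sketch in plain Python using the closed-form least-squares formulas rather than Excel's regression tool; the variable names are chosen here for illustration.

```python
# Data from the YouTube Views / Guitar Sales table.
views = [30, 40, 70, 60, 80, 50]
sales = [8, 11, 12, 10, 15, 13]

n = len(views)
mean_x = sum(views) / n
mean_y = sum(sales) / n

# Least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(views, sales))
sxx = sum((x - mean_x) ** 2 for x in views)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Fitted values on the regression line.
fitted = [intercept + slope * x for x in views]

sst = sum((y - mean_y) ** 2 for y in sales)             # total sum of squares
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))  # residual (error)
ssr = sum((f - mean_y) ** 2 for f in fitted)            # regression
r_squared = ssr / sst

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"SST={sst:.2f} SSE={sse:.2f} SSR={ssr:.2f} R^2={r_squared:.4f}")
```

The printed values should agree with the Summary Output: SSR 17.50, SSE 12.00, SST 29.50, and R² of about 0.59.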
The following data give the selling price, square footage, number of bedrooms, and age of houses that have sold in a neighborhood in the past 6 months. Develop three regression models to predict the selling price based upon each of the other factors individually. Which of these is best? Explain.
Based on the three regression model equations, I believe the best predictor would be age because R² = .702, followed by SqFt because R² = .7, then bedroom last because R² = .43.
Selling Price
84000
79000
91500
120000
127500
132500
145000
164000
155000
168000
172500
174000
175000
177500
184000
195500
195000
Sq FT  Bedroom  Age
1670   2        30
1339   2        25
1712   3        30
1840   3        40
2300   3        18
2234   3        30
2311   3        19
2377   3        7
2736   4        10
2500   3        1
2500   4        3
2479   3        3
2400   3        1
3124   4        0
2500   3        2
2854   3        3
[Chart: Predicting Selling Price. Selling Price vs. Square Footage; trendline f(x) = 51.0272115301307 x + 26532.2361399712, R² = 0.699966957304038]
[Chart: Predicting Selling Price. Selling Price vs. Bedrooms; trendline f(x) = 41403.0612244898 x + 20331.6326530613, R² = 0.433216525633361]
[Chart: Predicting Selling Price. Selling Price vs. Age; trendline f(x) = -2424.91368148468 x + 182504.704359085]
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.93
R Square 0.87
Adjusted R Square 0.84
Standard Error 15231.90
Observations 17.00
ANOVA
            df     SS              MS             F      Significance F
Regression  3.00   19794476008.27  6598158669.42  28.44  0.00
Residual    13.00  3016141638.78   232010895.29
Total       16.00  22810617647.06
           Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  91446.49      26076.89        3.51    0.00     35110.80   147782.19  35110.80     147782.19
Sq FT      29.86         10.86           2.75    0.02     6.39       53.32      6.39         53.32
Bedroom    2116.86       10003.01        0.21    0.84     -19493.33  23727.04   -19493.33    23727.04
Age        -1504.77      370.82          -4.06   0.00     -2305.87   -703.66    -2305.87     -703.66
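The regression statistics for the three-predictor housing model (Sq FT, Bedroom, Age) follow from the ANOVA sums of squares. As a rough cross-check, the sketch below recomputes R Square, Adjusted R Square, F, and the Standard Error from the SS values and the observation count reported in the output; the variable names are illustrative, and the formulas are the standard ones rather than anything specific to Excel.

```python
import math

# Values copied from the ANOVA table of the housing regression.
ss_regression = 19794476008.27
ss_residual = 3016141638.78
ss_total = 22810617647.06
n_obs = 17        # Observations
n_predictors = 3  # Sq FT, Bedroom, Age

df_residual = n_obs - n_predictors - 1  # 13

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

ms_regression = ss_regression / n_predictors
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# The standard error of the regression is the square root of MS residual.
std_error = math.sqrt(ms_residual)

print(f"R^2={r_squared:.2f} adjusted R^2={adj_r_squared:.2f}")
print(f"F={f_stat:.2f} standard error={std_error:.2f}")
```

The results should match the reported 0.87, 0.84, 28.44, and 15231.90 after rounding.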
The total expenses of a hospital are related to many factors. Two of these factors are the number of beds in the hospital and the number of admissions. Data were collected on 14 hospitals, as shown in the following table. Find the best regression model to predict the total expenses of a hospital. Discuss the accuracy of this model. Should both variables be included in the model? Why or why not?
From the regression output, the best regression model is Total Expenses = 0.6539 + 0.0231 * Number of Beds + 0.6230 * Admissions. The model has an R² value of 0.97, which means that 97% of the variability in the dependent variable is explained by the independent variables, so the regression model is working well. However, not both variables are significant in the model. The p-value for Number of Beds is high, so the number of beds is not a significant predictor and does not need to be included in the model.
Total Expenses
57
127
157
24
14
93
45
6
99
12
11
15
21
63
[Chart: Total Expenses vs. Number of Beds]
[Chart: Total Expenses vs. Admissions; trendline f(x) = 0.668591363439665 x + 1.51805258012299, R² = 0.974287669830764]
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.99
R Square 0.97
Adjusted R Square 0.97
Standard Error 8.37
Observations 14.00
ANOVA
            df     SS        MS        F       Significance F
Regression  2.00   29901.34  14950.67  213.48  0.00
Residual    11.00  770.37    70.03
Total       13.00  30671.71
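Whether Number of Beds adds anything beyond Admissions can also be examined with adjusted R², which penalizes each extra predictor. The sketch below is a supplementary check and not part of the Excel output: it takes the two-predictor R² from the ANOVA sums of squares and the admissions-only R² from the chart trendline, and the helper name `adjusted_r_squared` is introduced here for illustration.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


n_obs = 14  # hospitals in the data set

# Two-predictor model (beds and admissions): R^2 = SSR / SST from ANOVA.
r2_both = 29901.34 / 30671.71
adj_both = adjusted_r_squared(r2_both, n_obs, 2)

# Admissions-only model: R^2 as shown on the chart trendline.
r2_admissions = 0.974287669830764
adj_admissions = adjusted_r_squared(r2_admissions, n_obs, 1)

print(f"beds + admissions: adjusted R^2 = {adj_both:.4f}")
print(f"admissions only:   adjusted R^2 = {adj_admissions:.4f}")
```

With these inputs the admissions-only model comes out slightly higher (about 0.972 versus about 0.970), which is consistent with leaving Number of Beds out of the model.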