
Worksheet 10 - Spring 2014 - Chapter 10 - Key

The document describes using simple linear regression to analyze the relationship between average passing yards per attempt and percentage of games won for 10 NFL teams in 2011. It finds a statistically significant positive linear relationship, with the percentage of games won increasing by an estimated 17.2% for each additional average passing yard per attempt. About 66% of the variation in winning percentage is explained by the linear model. A 95% confidence interval is constructed for the predicted winning percentage of a team with an average of 6.9 passing yards per attempt. Extrapolating the model beyond the original data range is not appropriate.

Uploaded by

Misbah Mirza
Copyright
© All Rights Reserved

Worksheet 10 (Chapter 10): Simple Linear Regression

Name: _______KEY_______________________________________ Section: _________________________

The National Football League (NFL) records a variety of performance data for individuals and teams. To investigate the
importance of passing on the percentage of games won by a team, the following data show the average number of passing
yards per attempt (yds/Attempt) and the percentage of games won (WinPct) for a random sample of 10 NFL teams for the
2011 season (NFL website, February 12, 2012). The data can be found in the NFL.xls file on BB.
a. Using DDXL create a scatterplot for this data.

b. Does it appear from the scatterplot that fitting a line to this data is a reasonable method for analysis?
Yes – the points on the scatterplot follow a roughly linear pattern.

c. Using DDXL perform a simple linear regression analysis for this data. Provide the output given by DDXL here.

d. What is the least squares regression line from the output?

$\hat{y} = -70.391 + 17.1751\,x$

$\widehat{WinPct} = -70.391 + 17.1751\,(Yds/Att)$

e. What is your value of slope? Interpret this value in context of the problem.

β̂1 = 17.1751 – For each additional yard in average passing yards per attempt, the predicted winning percentage
increases by an estimated 17.1751 percentage points.
f. What is your value of y-intercept? Does this value have an interpretation in the context of the problem? If so,
what is that interpretation? If not, why not?
β̂0 = -70.391 – No. An average of 0 passing yards per attempt is not feasible, and a winning percentage of
-70.391% is impossible, so the y-intercept has no meaningful interpretation in this context.

g. What conditions are required of the probability distribution of the random errors (or residuals) (ε's)?

1. ε's have a mean of 0
2. ε's are normally distributed
3. ε's have a constant variance
4. ε's are independent

h. Give the estimated value of the standard deviation of the random errors. Interpret this value in context of the
problem.

s = 11.65 – We expect approximately 95% of the observed winning percentages to lie within
2(11.65) = 23.3 percentage points of their respective least squares predicted winning percentages.

i. Perform a two-tailed hypothesis test of the slope (alpha = 0.05). [Give Hypotheses, Conditions, Test Statistic,
P-value, and Conclusion]

Hypothesis Test for $\beta_1$

Hypotheses:
$H_0 : \beta_1 = 0$
$H_A : \beta_1 \neq 0$
Assumptions: (See part g above)
Test:
Test Statistic: t = 3.92
p-value = 0.0044

Summary: At the 5% significance level, the p-value (0.0044) is less than alpha (0.05), so we reject H0. There
is sufficient evidence to suggest that the population slope of the regression line predicting
winning percentage from average passing yards per attempt is different from 0.

Our model is statistically useful.
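
The p-value reported above can be checked outside DDXL. The following is a minimal Python sketch, not part of the original worksheet: it assumes n - 2 = 8 degrees of freedom (n = 10 teams) and integrates the Student t density numerically with Simpson's rule, using only the standard library. The function name `t_two_sided_p` is hypothetical.

```python
import math

def t_two_sided_p(t_stat: float, df: int, steps: int = 20000) -> float:
    """Two-sided p-value for a t statistic via Simpson integration of the t density.

    `steps` must be even for Simpson's rule.
    """
    # Normalizing constant of the Student t density with `df` degrees of freedom.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x: float) -> float:
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    b = abs(t_stat)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * total * h / 3  # P(-|t| < T < |t|), using symmetry about 0
    return 1 - central

n = 10            # number of teams in the sample
df = n - 2        # degrees of freedom for simple linear regression
p_value = t_two_sided_p(3.92, df)
print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

The result should agree with the DDXL p-value to about four decimal places, which supports rejecting H0 at alpha = 0.05.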

j. Give the value of coefficient of correlation (r). Interpret this value in context of the problem.

$r = \sqrt{0.658} = 0.8112$

There is a strong positive linear association between winning percentage and average passing
yards per attempt.
k. Give the value of coefficient of determination (r²). Interpret this value in the context of the problem.

$r^2 = 0.658$
About 65.8% of the sample variation in winning percentage can be explained by using average
passing yards per attempt to predict winning percentage in our linear model.
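
As a consistency check on parts (i) through (k), the sketch below (an illustration in Python, not part of the original key) recovers r from r² using the sign of the slope, and then recomputes the slope test statistic from the standard identity t = r·sqrt(n - 2)/sqrt(1 - r²). The variable names are hypothetical; the numbers are the ones reported above.

```python
import math

r_squared = 0.658   # coefficient of determination from the output
slope = 17.1751     # estimated slope; its sign determines the sign of r
n = 10              # number of teams

# r takes the sign of the slope, so it is positive here.
r = math.copysign(math.sqrt(r_squared), slope)

# In simple linear regression the slope t statistic can be written in terms of r.
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r = {r:.4f}")
print(f"t = {t_from_r:.2f}")
```

The recomputed t should match the test statistic t = 3.92 from part (i), up to rounding.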

l. Based on your information above – Is your model statistically useful? Why?

Yes – The hypothesis test for the slope rejected H0, giving evidence that the population slope is different from zero.

m. Based on your information above – Is your model practically useful? Why?

Yes – The standard deviation of the random errors is relatively small (compared to the winning percentage values)
and the coefficient of determination is relatively large.

n. If the answers to (l) and (m) are yes, then use your simple linear regression equation to predict the percentage of
games won for a team that averages 6.9 passing yards per attempt.

$\hat{y} = -70.391 + 17.1751\,(Yds/Att)$

$\hat{y} = -70.391 + 17.1751\,(6.9)$
$\hat{y} = 48.11719\%$
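
The substitution above can be reproduced with a few lines of Python. This is only an illustrative sketch using the fitted coefficients from part (d); the function name `predict_win_pct` is hypothetical and not part of the worksheet or of DDXL.

```python
INTERCEPT = -70.391   # estimated y-intercept from part (d)
SLOPE = 17.1751       # estimated slope from part (d)

def predict_win_pct(yds_per_att: float) -> float:
    """Predicted winning percentage from the least squares line."""
    return INTERCEPT + SLOPE * yds_per_att

predicted = predict_win_pct(6.9)
print(f"Predicted WinPct at 6.9 yds/attempt: {predicted:.5f}%")
```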

o. By showing values plugged into the proper equation - Create a 95% confidence interval for the estimation given
in part (n)

$\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$

$48.12 \pm (2.306)(11.65)\sqrt{\frac{1}{10} + \frac{(6.9 - 6.8)^2}{7.18}}$
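
To carry the arithmetic through, the following Python sketch evaluates the interval formula with the values plugged in above. It is an illustration rather than DDXL output; the variable names are hypothetical, and the critical value 2.306 is the t value for 8 degrees of freedom used in the worksheet.

```python
import math

y_hat = 48.11719   # predicted WinPct at x_p = 6.9, from part (n)
t_crit = 2.306     # t critical value for 95% confidence, df = 8
s = 11.65          # estimated standard deviation of the random errors
n = 10             # number of teams
x_p = 6.9          # yards per attempt at which we estimate the mean
x_bar = 6.8        # sample mean of yards per attempt
ss_xx = 7.18       # sum of squared deviations of x about its mean

# Margin of error for the mean response at x_p.
margin = t_crit * s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)
lower, upper = y_hat - margin, y_hat + margin

print(f"margin = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The endpoints should agree, after rounding to two decimals, with the interval 39.56% to 56.67% interpreted in part (q).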

p. Have DDXL create the 95% confidence interval given in part (o). Provide the output given by DDXL here.
(Note: this interval should be identical to what you got in part (o).)
q. Interpret the interval found in parts (o and p).

We are 95% confident that the true mean winning percentage for teams whose average passing yards
per attempt is 6.9 lies between 39.56% and 56.67%.

r. Why would it not be appropriate to use your equation to predict the percentage of games won for a team that
averages 10.2 passing yards per attempt? Explain. What is this called?

This is called extrapolation. A value of 10.2 average passing yards per attempt is above the range of the x-values
used to fit this equation, so the relationship between passing yards and winning percentage may be different there
(for example, a different slope), and the prediction may be inaccurate.
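
A quick numerical illustration of the problem, sketched in Python with the coefficients from part (d) (the variable names are hypothetical): plugging 10.2 into the fitted line gives a value that is not a valid percentage at all.

```python
INTERCEPT = -70.391   # estimated y-intercept from part (d)
SLOPE = 17.1751       # estimated slope from part (d)

x_new = 10.2          # yards per attempt, above the x-values used to fit the line
predicted = INTERCEPT + SLOPE * x_new

# A winning percentage must lie between 0 and 100.
is_valid_percentage = 0 <= predicted <= 100

print(f"Predicted WinPct at {x_new} yds/attempt: {predicted:.1f}%")
print("Valid percentage" if is_valid_percentage else "Not a valid percentage")
```

The line predicts a winning percentage above 100%, which shows concretely why the fitted equation should not be trusted outside the range of the sampled data.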
