Worksheet 10 - Spring 2014 - Chapter 10 - Key
The National Football League (NFL) records a variety of performance data for individuals and teams. To investigate the
importance of passing on the percentage of games won by a team, the following data show the average number of passing
yards per attempt (yds/Attempt) and the percentage of games won (WinPct) for a random sample of 10 NFL teams for the
2011 season (NFL website, February 12, 2012). The data can be found in the NFL.xls file on BB.
a. Using DDXL create a scatterplot for this data.
b. Does it appear from the scatterplot that fitting a line to this data is a reasonable method for analysis?
Yes – the points on the scatterplot follow a roughly linear pattern.
c. Using DDXL perform a simple linear regression analysis for this data. Provide the output given by DDXL here.
e. What is your value of slope? Interpret this value in context of the problem.
β1 = 17.1751 – For each additional yard in average passing yards per attempt, the predicted percentage of games
won increases by about 17.1751 percentage points.
f. What is your value of y-intercept? Does this value have an interpretation in the context of the problem? If so
what is that interpretation, if not, why not?
β0 = -70.391 – This value has no practical interpretation: an average of 0 passing yards per attempt is not
feasible, and a winning percentage of -70.391 is impossible.
g. What conditions must the probability distribution of the random errors (or residuals) (ε’s) satisfy?
h. Give the estimated value of the standard deviation of the random errors. Interpret this value in context of the
problem.
s = 11.65 - We expect approximately 95% of the observed winning percentages to lie within
2(11.65) = 23.3 percentage points of their respective least squares predicted winning percentages.
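The band in part (h) is simple arithmetic. A minimal sketch, using only the value s = 11.65 reported above; the variable names are illustrative, and the "approximately 95%" figure assumes the random errors are roughly normal.

```python
# Residual standard deviation reported in part (h), in percentage points.
s = 11.65

# Roughly 95% of observed winning percentages should fall within 2s
# of the value predicted by the least squares line.
half_width = 2 * s

print(f"Approximate 95% band: prediction +/- {half_width:.1f} percentage points")
```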
i. Perform a two-tailed Hypothesis test of slope (alpha = 0.05). [Give Hypotheses, Conditions, Test Statistics, P-
value, and Conclusions]
Hypotheses:
H0: β1 = 0
HA: β1 ≠ 0
Assumptions: (See part g above)
Test:
Test Statistic: t = 3.92
p-value = 0.0044
Summary: At the 5% significance level, the p-value (0.0044) is less than alpha (0.05), so we
reject H0. There is sufficient evidence that the population slope of the regression line predicting
winning percentage from average passing yards per attempt is different from 0.
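The p-value in part (i) can be checked from the test statistic alone. The sketch below is one possible way to do it, not DDXL's method: it assumes df = n - 2 = 8 and integrates the t density numerically with Simpson's rule, using only the Python standard library. The function names `t_pdf` and `two_tailed_p` are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the t density over [0, |t|]."""
    b = abs(t_stat)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|) by symmetry
    return 1 - central_area

n = 10           # teams in the sample
t_stat = 3.92    # test statistic from part (i)
df = n - 2       # degrees of freedom for simple linear regression
alpha = 0.05

p_value = two_tailed_p(t_stat, df)
print(f"p-value = {p_value:.4f}; reject H0: {p_value < alpha}")
```

Since the p-value is well below 0.05, the decision to reject H0 does not hinge on rounding in the reported test statistic.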
j. Give the value of coefficient of correlation (r). Interpret this value in context of the problem.
r = √0.658 = 0.8112
There is a strong positive linear association between winning percentage and average passing
yards per attempt.
k. Give the value of coefficient of determination (r²). Interpret this value in the context of the problem.
r² = 0.658
About 65.8% of the sample variation in winning percentage can be explained by using average
passing yards per attempt to predict winning percentage in our linear model.
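The values in parts (j) and (k) can be cross-checked against each other and against the test statistic. A short sketch under two standard facts for simple linear regression: r carries the sign of the slope, and r² = t² / (t² + n - 2). All inputs are the rounded values reported in this key, so small discrepancies are expected.

```python
import math

r_squared = 0.658   # coefficient of determination from part (k)
slope = 17.1751     # estimated slope from part (e)
t_stat = 3.92       # test statistic from part (i)
n = 10              # number of teams in the sample

# r has the same sign as the slope, so the positive root applies here.
r = math.copysign(math.sqrt(r_squared), slope)

# In simple linear regression, r^2 = t^2 / (t^2 + n - 2).
r_squared_from_t = t_stat**2 / (t_stat**2 + (n - 2))

print(f"r = {r:.4f}")
print(f"r^2 recovered from t = {r_squared_from_t:.3f}")
```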
Yes – The hypothesis test for slope found evidence of a slope different from zero.
Yes – The standard deviation of the random errors is relatively small (compared to the winning percentage values)
and the coefficient of determination is relatively large.
n. If the answers to (l) and (m) are yes, then use your simple linear regression equation to predict the percentage
of games won for a team that averages 6.9 passing yards per attempt.
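Part (n) amounts to substituting x = 6.9 into the fitted line from parts (e) and (f). A minimal sketch; the function name `predict_win_pct` is illustrative, and the coefficients are the rounded estimates reported above.

```python
b0 = -70.391   # estimated y-intercept from part (f)
b1 = 17.1751   # estimated slope from part (e)

def predict_win_pct(yds_per_attempt):
    """Predicted winning percentage from the fitted least squares line."""
    return b0 + b1 * yds_per_attempt

y_hat = predict_win_pct(6.9)
print(f"Predicted winning percentage: {y_hat:.2f}")
```

The result agrees with the 48.12 used as the center of the interval in part (o).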
o. Showing the values substituted into the appropriate equation, create a 95% confidence interval for the estimate
given in part (n).
ŷ ± (t_α/2)(s)·√( 1/n + (x_p − x̄)² / SS_xx )
48.12 ± (2.306)(11.65)·√( 1/10 + (6.9 − 6.8)² / 7.18 )
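The interval in part (o) can be evaluated numerically. A sketch using only values that appear in this key (t critical value 2.306, s = 11.65, n = 10, x̄ = 6.8, SS_xx = 7.18); the variable names are illustrative, and because x̄ and SS_xx are rounded, the last digit may differ slightly from software output.

```python
import math

y_hat = -70.391 + 17.1751 * 6.9  # predicted winning percentage at x_p = 6.9
t_crit = 2.306                   # t critical value for 95% confidence, df = 8
s = 11.65                        # estimated standard deviation of the errors
n = 10                           # number of teams in the sample
x_p, x_bar = 6.9, 6.8            # x value of interest and sample mean of x
ss_xx = 7.18                     # sum of squared deviations of x

# Margin of error for the mean response at x_p.
margin = t_crit * s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)
lower, upper = y_hat - margin, y_hat + margin

print(f"{y_hat:.2f} +/- {margin:.2f} -> ({lower:.2f}, {upper:.2f})")
```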
p. Have DDXL create the 95% confidence interval given in part (o). Provide the output given by DDXL here.
(Note: this interval should be identical to what you got in part (o).)
q. Interpret the interval found in parts (o) and (p).
We are 95% confident that the true mean winning percentage for teams whose average passing yards
per attempt is 6.9 lies between 39.56% and 56.67%.
r. Why would it not be appropriate to use your equation to predict the percentage of games won for a team that
averages 10.2 passing yards per attempt? Explain. What is this called?
This is called extrapolation. A value of 10.2 passing yards per attempt lies above the range of x-values used to
create this equation. The relationship for such teams may differ from the one estimated here, so the equation
could predict their winning percentage inaccurately.
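One way to see the danger in part (r) is to substitute x = 10.2 into the fitted line anyway. A minimal sketch using the rounded coefficients from parts (e) and (f); it only illustrates that the line can produce an impossible value outside the sampled range and says nothing about how such teams actually perform.

```python
b0, b1 = -70.391, 17.1751  # fitted intercept and slope from parts (f) and (e)

x_new = 10.2               # average passing yards per attempt, beyond the sampled range
y_hat = b0 + b1 * x_new    # naive prediction from the fitted line

print(f"Prediction at {x_new}: {y_hat:.1f}%")
print("Exceeds 100%:", y_hat > 100)
```

A predicted winning percentage above 100 cannot occur, which shows the linear model cannot hold that far from the observed data.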