Mod 3 Worksheet Review 14KEY
A. There is a very strong linear positive correlation between subscribers and cell service
regions.
B. There is a strong linear negative correlation between subscribers and corded phone sales.
CORRELATION COEFFICIENTS
A correlation coefficient, denoted by r, is a number from -1 to 1 that measures how well a line
fits a set of data pairs (x, y). If r is near 1, the points lie close to a line with positive slope. If r
is near -1, the points lie close to a line with negative slope. If r is near 0, the points do not lie
close to any line.
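The worksheet does not state a formula for r, so the following is only a minimal sketch using the standard Pearson formula. The function name correlation_coefficient and the three small data sets are made up for illustration; they are not worksheet data.

```python
import math

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r for paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: points exactly on a line with positive slope
r_rising = correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])
# Made-up data: points exactly on a line with negative slope
r_falling = correlation_coefficient([1, 2, 3, 4], [8, 6, 4, 2])
# Made-up data: no linear trend at all
r_none = correlation_coefficient([1, 2, 3, 4], [1, -1, -1, 1])

print(r_rising, r_falling, r_none)
```

The three results illustrate the three cases in the definition: r near 1, r near -1, and r near 0.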
Example 2: Tell whether the correlation coefficient for the data is closest to -1, -0.5, 0, 0.5, or 1.
Solution:
a. -0.5
b. 0
c. 1
Example 3: For each scatter plot, (a) tell whether the data have a positive correlation, a negative
correlation, or approximately no correlation, and (b) tell whether the correlation
coefficient is closest to -1, -0.5, 0, 0.5, or 1.
Solution, part (b), in the order the scatter plots appear: 0.5, -1, 0
BEST-FITTING LINES
If the correlation coefficient for a set of data is near ±1, the data can be reasonably modeled by
a line. The best-fitting line is the line that lies as close as possible to all the data points. You
can approximate a best-fitting line by graphing.
Approximating a Best-Fitting Line
STEP 1 Draw a scatter plot of the data.
STEP 2 Sketch the line that appears to follow most closely the trend given by the data points.
There should be about as many points above the line as below it.
STEP 3 Choose two points on the line, and estimate the coordinates of each point.
STEP 4 Write an equation of the line that passes through the two points from Step 3. This
equation is a model for the data.
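Steps 3 and 4 come down to finding the slope and y-intercept of the line through two chosen points. A minimal sketch follows; the helper name line_through and the two points are hypothetical and are not read from any graph in this worksheet.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points (hypothetical helper)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("A vertical line has no slope-intercept form.")
    slope = (y2 - y1) / (x2 - x1)
    # Solve y1 = slope * x1 + intercept for the intercept
    intercept = y1 - slope * x1
    return slope, intercept

# Two made-up points estimated from a sketched line
slope, intercept = line_through((2, 5), (6, 13))
print(f"y = {slope}x + {intercept}")
```

Because the two points are estimated by eye, different people will get slightly different equations for the same data; any line that follows the trend closely is an acceptable model.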
Example 4: The table shows the number y (in thousands) of alternative-fueled
vehicles in use in the United States x years after 1997. Approximate the
best-fitting line for the data.
x 0 1 2 3 4 5 6 7
[Graph: scatter plot of the data; x-axis from 0 to 7, y-axis from 300 to 500]
Extension: Use the equation of the line of fit from the above example to predict the
number of alternative-fueled vehicles in use in the United States in
2010.
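Solution: The year 2010 is 13 years after 1997, so if you are confident the trend will
continue, you could substitute x = 13 into the equation. Otherwise, 2010 is outside the
scope of the data (x = 0 to 7), and the prediction may not be reliable.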
5. A line that lies as close as possible to a set of data points (x, y) is called the best-fitting line
for the data points.
6. Describe how to tell whether a set of data points shows a positive correlation, a negative correlation, or
approximately no correlation.
Positive correlation: As x increases, y increases.
Negative correlation: As x increases, y decreases.
No correlation: No visible pattern appears.
Tell whether the data have a positive correlation, a negative correlation, or approximately no correlation.
7. 8. 9.
Tell whether the correlation coefficient for the data is closest to -1, -0.5, 0, 0.5, or 1.
10. 0    11. 0.5    12. -1
In Exercises 13–14, (a) draw a scatter plot of the data using an appropriate scale, (b) approximate the
best-fitting line, and (c) estimate y when x = 20. Graph can be found on the last page.
13. 14.
17. MULTIPLE CHOICE A set of data has correlation coefficient r. For which value of r would the
data points lie closest to a line?
Answer: A
18. The data pairs (x, y) give U.S. average annual public college tuition y (in dollars) x years after
1997. Find the best-fitting line for the data using Statcato. Also write the value of r and r squared.
(0, 2271), (1, 2360), (2, 2430), (3, 2506), (4, 2562), (5, 2727), (6, 2928)
y = 2236.607 + 101.321x
r = 0.9724
r squared = 0.9455
SE = 57.5549
The important point is to be able to interpret the meaning of these values in context.
Slope: The average tuition increases by about $101.32 per year.
Y intercept: The model predicts an average tuition of about $2236.61 in 1997 (x = 0).
R: The correlation coefficient r = 0.9724 shows that there is a very strong positive linear
correlation between year and tuition.
R squared: About 94.55% of the variation in tuition can be explained by the year.
SE: Predictions of tuition made from the line can be expected to be off by roughly $57.55.
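The worksheet uses Statcato for this exercise. As a cross-check only, the same values can be reproduced from the standard least-squares formulas; the sketch below is plain Python, not Statcato output, and the variable names are chosen here for illustration. It assumes SE means the standard error of the estimate, with n - 2 in the denominator.

```python
import math

# Data pairs from Exercise 18: x = years after 1997, y = tuition in dollars
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [2271, 2360, 2430, 2506, 2562, 2727, 2928]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and products of deviations from the means
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Standard error of the estimate (assumed definition: n - 2 degrees of freedom)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
se = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"y = {intercept:.3f} + {slope:.3f}x")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}, SE = {se:.4f}")
```

Rounded to the same number of places, these agree with the values in the key.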