AP Stats 3.1
Section 3.1
Scatterplots and Correlation
After this section, you should be able to…
Sample Answer:
There is a moderately strong,
positive, linear relationship
between body weight and pack
weight. There is one possible
outlier: the hiker with a body
weight of 187 pounds seems to
be carrying relatively less weight
than the other group members.
It appears that lighter students
are carrying lighter backpacks.
Describe and interpret the scatterplot below. The y-axis refers
to a state’s mean SAT math score. The x-axis refers to the
percentage of students in that state taking the SAT.
Sample Answer:
There is a moderately strong,
negative, curved relationship
between the percent of
students in a state who take
the SAT and the mean SAT
math score.
Further, there are two distinct
clusters of states and at least
one possible outlier that falls
outside the overall pattern.
What is Correlation?
• A number that describes the direction and
strength of the linear relationship between
two quantitative variables.
• Correlation values are always between -1 and 1, inclusive.
• Correlation is abbreviated: r
• The strength of the linear relationship
increases as r moves away from 0 towards -1
or 1.
What does “r” tell us?!
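• Correlation describes the direction and strength of
the linear relationship between x and y. Its square, r²,
is the fraction of the variation in y that is ‘explained’
by the linear relationship with x.
• Notice that the formula adds up the products of each
individual's z-score for x and z-score for y, then
divides the sum by n − 1.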
What does “r” mean?
r value   Strength
-1        Perfectly linear; negative
-0.75     Strong negative relationship
-0.50     Moderately strong negative relationship
-0.25     Weak negative relationship
 0        No linear relationship
 0.25     Weak positive relationship
 0.50     Moderately strong positive relationship
 0.75     Strong positive relationship
 1        Perfectly linear; positive
Describe and interpret the scatterplot below. Be
sure to estimate the correlation.
Sample Answer:
As the number of boats registered in Florida
increases, so does the number of manatees killed
by boats. The scatterplot shows a strong,
positive, linear relationship. The estimated
correlation is approximately r = 0.85.
• 0.235
• -0.456
• 0.975
• -0.784
Calculate Correlation: TI-Nspire
1. Enter x values in list 1 and y values in list 2.
2. Press MENU, then 4: Statistics
3. Option 1: Stat Calculations
4. Option 3: Linear Regression mx + b
5. X: a[] , Y: b[] , ENTER
6. The correlation is the value labeled r in the output.
r = 0.97
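For checking calculator output by hand, the same three numbers (slope m, intercept b, and correlation r) can be computed in a few lines of Python. This is only a sketch: the function name `linear_regression_mx_b` and the data in `list1` and `list2` are made up for illustration and are not the data from the slide.

```python
from statistics import mean, stdev

def linear_regression_mx_b(xs, ys):
    """Return (m, b, r) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    # Correlation: sum of z-score products, divided by n - 1
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    m = r * s_y / s_x        # slope of the least-squares line
    b = y_bar - m * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return m, b, r

# Made-up illustrative data (hypothetical, not from the slides)
list1 = [1, 2, 3, 4, 5]
list2 = [2, 4, 5, 4, 5]

m, b, r = linear_regression_mx_b(list1, list2)
print(f"m = {m:.3f}, b = {b:.3f}, r = {r:.3f}")
# prints: m = 0.600, b = 2.200, r = 0.775
```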
Facts about Correlation
1. Correlation requires that both variables be quantitative.
2. Correlation does not describe curved relationships between
variables, no matter how strong the relationship is.
3. Correlation is not resistant. r is strongly affected by a few
outlying observations.
4. Correlation makes no distinction between explanatory and
response variables.
5. r does not change when we change the units of measurement
of x, y, or both.
6. r does not change when we add or subtract a constant to
either x, y or both.
7. The correlation r itself has no unit of measurement.
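Facts 3 through 6 can be checked numerically. The sketch below uses made-up data and a hypothetical helper named `correlation`; the extra point (6, -10) is an invented outlier chosen only to show how much a single observation can move r.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Made-up illustrative data (hypothetical)
x_feet = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x_feet, y)

# Fact 4: swapping explanatory and response variables leaves r unchanged
r_swapped = correlation(y, x_feet)

# Fact 5: changing units (feet to inches) leaves r unchanged
x_inches = [12 * x for x in x_feet]
r_units = correlation(x_inches, y)

# Fact 6: adding a constant to every y value leaves r unchanged
y_shifted = [value + 100 for value in y]
r_shifted = correlation(x_feet, y_shifted)

# Fact 3: one outlying observation can change r dramatically
r_outlier = correlation(x_feet + [6], y + [-10])

print(f"original r       = {r:.3f}")
print(f"x and y swapped  = {r_swapped:.3f}")
print(f"feet to inches   = {r_units:.3f}")
print(f"y shifted by 100 = {r_shifted:.3f}")
print(f"with one outlier = {r_outlier:.3f}")
```

With this data the first four values agree, while the single added outlier is enough to make r negative.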
r: Ignores Distinctions
Between x & y
r: Highly Affected by
Outliers
Why?!
• Since r is calculated using standardized values
(z-scores), the correlation value will not
change if the units of measure are changed
(feet to inches, etc.)
• Adding a constant to either x or y or both will
not change the correlation because neither
the standard deviation nor distance from the
mean will be impacted.
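Both bullets can be seen in one short calculation. As a sketch, suppose every x value is rescaled and shifted to x* = a·x + b, where a > 0 is a unit-conversion factor and b is an added constant (the symbols a, b, and x* are introduced here for illustration). The z-scores do not change:

```latex
\bar{x^*} = a\bar{x} + b, \qquad s_{x^*} = a\, s_x
\quad\Longrightarrow\quad
\frac{x_i^* - \bar{x^*}}{s_{x^*}}
  = \frac{(a x_i + b) - (a\bar{x} + b)}{a\, s_x}
  = \frac{x_i - \bar{x}}{s_x}
```

Because r is built only from z-scores, it is the same before and after the change. The same argument applies to y.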
Correlation Formula:
Suppose that we have data on variables x and y for n
individuals.
The values for the first individual are x1 and y1, the values
for the second individual are x2 and y2, and so on.
The means and standard deviations of the two variables are
x-bar and sx for the x-values and y-bar and sy for the y-
values.
The correlation r between x and y is:
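$$r = \frac{1}{n-1}\left[\left(\frac{x_1-\bar{x}}{s_x}\right)\left(\frac{y_1-\bar{y}}{s_y}\right) + \left(\frac{x_2-\bar{x}}{s_x}\right)\left(\frac{y_2-\bar{y}}{s_y}\right) + \cdots + \left(\frac{x_n-\bar{x}}{s_x}\right)\left(\frac{y_n-\bar{y}}{s_y}\right)\right]$$

Or, more compactly:

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$$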
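To see the formula at work, here is a step-by-step sketch in Python on a tiny made-up data set of n = 3 individuals (the values are hypothetical, chosen so the arithmetic is easy to follow by hand).

```python
from statistics import mean, stdev

# Made-up data for n = 3 individuals (hypothetical)
xs = [1, 2, 3]
ys = [1, 3, 2]
n = len(xs)

x_bar, s_x = mean(xs), stdev(xs)  # x-bar = 2, s_x = 1
y_bar, s_y = mean(ys), stdev(ys)  # y-bar = 2, s_y = 1

# Standardize each value: z = (value - mean) / standard deviation
z_x = [(x - x_bar) / s_x for x in xs]  # [-1.0, 0.0, 1.0]
z_y = [(y - y_bar) / s_y for y in ys]  # [-1.0, 1.0, 0.0]

# Multiply the z-scores for each individual, add them up, divide by n - 1
products = [zx * zy for zx, zy in zip(z_x, z_y)]  # [1.0, 0.0, 0.0]
r = sum(products) / (n - 1)

print(f"r = {r}")  # prints: r = 0.5
```

Only the first individual, who is below average in both x and y, contributes to the sum; the other two each have a z-score of 0 on one variable.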