Notes3.1 TPS6up

Chapter 3 focuses on exploring two-variable quantitative data, specifically through scatterplots and correlation. It teaches how to distinguish between explanatory and response variables, create scatterplots, and describe relationships in terms of direction, form, strength, and unusual features. Additionally, it emphasizes the properties of correlation, including its limitations and the distinction between correlation and causation.

Chapter 3

Exploring Two-Variable Quantitative Data

Section 3.1
Scatterplots and Correlation

LEARNING TARGETS
By the end of this section, you should be able to:
• DISTINGUISH between explanatory and response variables for quantitative data.
• MAKE a scatterplot to display the relationship between two quantitative
variables.
• DESCRIBE the direction, form, and strength of a relationship displayed in a
scatterplot and identify unusual features.
• INTERPRET the correlation.
• UNDERSTAND the basic properties of correlation, including how the correlation
is influenced by unusual points.
• DISTINGUISH correlation from causation.

Starnes/Tabor, The Practice of Statistics


3.1a - Univariate and Bivariate Data
A one-variable data set is sometimes called univariate data.

A data set that describes the relationship between two variables is sometimes
called bivariate data.

Analysis of relationships between two variables builds on the same tools we used
to analyze one variable:
• Plot the data, then look for overall patterns and departures from those
patterns.
• Add numerical summaries.
• When there’s a regular overall pattern, use a simplified model to describe it.

Explanatory and Response Variables
Most statistical studies examine data on more than one variable. Analysis of
relationships between two variables builds on the same tools we used to analyze
one variable.

A response variable measures an outcome of a study.


An explanatory variable may help predict or explain changes in a response
variable.

Note: In many studies, the goal is to show that changes in one or more explanatory
variables actually cause changes in a response variable. However, other
explanatory-response relationships don’t involve direct causation.

Displaying Relationships: Scatterplots
A scatterplot shows the relationship (association) between two quantitative
variables measured on the same individuals. The values of one variable appear on
the horizontal axis, and the values of the other variable appear on the vertical axis.
Each individual in the data set appears as a point in the graph.

How to Make a Scatterplot


• Label the axes.
The eXplanatory variable goes on the horizontal axis (the x-axis). The response
variable goes on the vertical axis (the y-axis). If there is no explanatory
variable, either variable can go on the horizontal axis.
• Scale the axes.
• Plot individual data values.

Describing Scatterplots
To describe a scatterplot, follow the basic strategy of data analysis from
Chapter 1: look for patterns and important departures from those
patterns.

Two variables have a positive association when above-average values of one
variable tend to accompany above-average values of the other variable and when
below-average values also tend to occur together.

Two variables have a negative association when above-average values of one
variable tend to accompany below-average values of the other variable.

There is no association between two variables if knowing the value of one
variable does not help us predict the value of the other variable.
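
The definitions above can be checked numerically with a rough sketch. The function name `association_direction` and the data sets are hypothetical, and the simple majority rule is only a heuristic for illustration, not a formal test: it counts how often the deviations from the two means share a sign.

```python
from statistics import mean

def association_direction(xs, ys):
    """Heuristic: compare each point's deviations from the two means."""
    x_bar, y_bar = mean(xs), mean(ys)
    same_side = 0      # both values above average, or both below average
    opposite_side = 0  # one value above average, the other below
    for x, y in zip(xs, ys):
        product = (x - x_bar) * (y - y_bar)
        if product > 0:
            same_side += 1
        elif product < 0:
            opposite_side += 1
    if same_side > opposite_side:
        return "positive"
    if opposite_side > same_side:
        return "negative"
    return "no clear direction"

# Hypothetical data for six students (made up for illustration).
hours_studied = [1, 2, 3, 4, 5, 6]
hours_of_tv = [6, 5, 4, 3, 2, 1]
exam_score = [55, 60, 58, 70, 75, 82]

print(association_direction(hours_studied, exam_score))  # positive
print(association_direction(hours_of_tv, exam_score))    # negative
```

A scatterplot is still the better tool for judging direction, because it also shows form, strength, and unusual features that a count like this ignores.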

[Figure: example scatterplots labeled Positive Association, Negative Association, and No Association]

Describing Scatterplots

How to Describe a Scatterplot


To describe a scatterplot, make sure to address the following four
characteristics in the context of the data:
• Direction: A scatterplot can show a positive association, negative
association, or no association.
• Form: A scatterplot can show a linear form or a nonlinear form. The form is
linear if the overall pattern follows a straight line. Otherwise, the form is
nonlinear.
• Strength: A scatterplot can show a weak, moderate, or strong association.
An association is strong if the points don’t deviate much from the form
identified. An association is weak if the points deviate quite a bit from the
form identified.
• Unusual features: Look for individual points that fall outside the overall
pattern and distinct clusters of points.

Describing Scatterplots
The scatterplot shows the association between mean SAT Math score and
percent of students who take the SAT for the 50 U.S. states. Describe the
association shown by the scatterplot.

There is a moderately strong (strength), negative (direction), curved (form)
relationship between the percent of students in a state who take the SAT and
the mean SAT Math score. Further, there are two distinct clusters of states and
two unusual points that fall outside the overall pattern (unusual features).

3.1b - Measuring Linear Association: Correlation

A scatterplot displays the direction, form, and strength of a relationship
between two quantitative variables. When the association between two
quantitative variables is linear, we can use the correlation r to help describe
the strength and direction of the association.

For a linear association between two quantitative variables, the correlation r
measures the direction and strength of the association.

CAUTION: It is only appropriate to use the correlation to describe strength and
direction for a linear relationship.

Measuring Linear Association: Correlation

Some Important Properties of the Correlation r


• The correlation r is always a number between –1 and 1 (–1 ≤ r ≤ 1).
• The correlation r indicates the direction of a linear relationship by its sign:
r > 0 for a positive association and r < 0 for a negative association.
• The extreme values r = –1 and r = 1 occur only in the case of a perfect
linear relationship, when the points lie exactly along a straight line.
• If the linear relationship is strong, the correlation r will be close to 1 or –1.
• If the linear relationship is weak, the correlation r will be close to 0.
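
These properties can be illustrated with a short sketch. The helper `correlation` and all data values are assumptions made up for this example. The helper uses a sum-of-products form that is algebraically equivalent to the usual definition of r.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Correlation r, written as sum of products over root of sums of squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
perfect_up = [3, 5, 7, 9, 11]     # y = 2x + 1, points exactly on a rising line
perfect_down = [7, 4, 1, -2, -5]  # y = 10 - 3x, points exactly on a falling line
fairly_strong = [2, 1, 4, 3, 5]   # roughly increasing
weak = [3, 1, 4, 2, 3]            # little linear pattern

r_up = correlation(x, perfect_up)          # approximately 1
r_down = correlation(x, perfect_down)      # approximately -1
r_strong = correlation(x, fairly_strong)   # approximately 0.8
r_weak = correlation(x, weak)              # close to 0

for label, r in [("up", r_up), ("down", r_down),
                 ("fairly strong", r_strong), ("weak", r_weak)]:
    print(f"{label}: r = {r:.3f}")
```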

Cautions about Correlation

CAUTION:

1. Correlation doesn’t imply causation.
2. Correlation does not measure form.
3. Correlation should only be used to describe linear relationships.
4. Correlation is NOT a resistant measure of strength.
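
Cautions 2 through 4 can be seen in a small sketch with made-up data. The helper `correlation` is an assumed implementation written for this example. The first data set has a perfect curved pattern yet r = 0, and the second shows how one unusual point can change r dramatically.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Correlation r, written as sum of products over root of sums of squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# A perfect curved (nonlinear) relationship: y = x squared.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [x ** 2 for x in x_curve]
r_curve = correlation(x_curve, y_curve)  # 0, even though y is determined by x

# A perfect line, then the same data with one unusual point added at (6, 0).
x_line = [1, 2, 3, 4, 5]
y_line = [2, 4, 6, 8, 10]
r_without = correlation(x_line, y_line)             # approximately 1
r_with = correlation(x_line + [6], y_line + [0])    # approximately 0.143

print(f"curved pattern: r = {r_curve:.3f}")
print(f"without unusual point: r = {r_without:.3f}")
print(f"with unusual point: r = {r_with:.3f}")
```

This is why a scatterplot should always accompany the correlation: r alone cannot reveal a curved form or an influential point.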

Calculating Correlation Coefficient, r

How to Calculate the Correlation r


Suppose that we have data on variables x and y for n individuals. The values
for the first individual are $x_1$ and $y_1$, the values for the second
individual are $x_2$ and $y_2$, and so on. The means and standard deviations of
the two variables are $\bar{x}$ and $s_x$ for the x-values and $\bar{y}$ and
$s_y$ for the y-values.
The correlation r between x and y is:

\[
r = \frac{1}{n-1}\left[
\left(\frac{x_1-\bar{x}}{s_x}\right)\left(\frac{y_1-\bar{y}}{s_y}\right)
+ \left(\frac{x_2-\bar{x}}{s_x}\right)\left(\frac{y_2-\bar{y}}{s_y}\right)
+ \cdots
+ \left(\frac{x_n-\bar{x}}{s_x}\right)\left(\frac{y_n-\bar{y}}{s_y}\right)
\right]
\]

\[
r = \frac{1}{n-1}\sum \left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)
\]
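
The formula can be translated into code almost term by term. The following is a minimal sketch: the function name `correlation_from_z_scores` and the data are made up for illustration, and `statistics.stdev` is used because it computes the sample standard deviation with an n - 1 divisor, matching $s_x$ and $s_y$.

```python
from statistics import mean, stdev

def correlation_from_z_scores(xs, ys):
    """Average product of standardized values, using an n - 1 divisor."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = 0.0
    for x, y in zip(xs, ys):
        z_x = (x - x_bar) / s_x  # standardized x-value
        z_y = (y - y_bar) / s_y  # standardized y-value
        total += z_x * z_y
    return total / (n - 1)

# Hypothetical data for five individuals.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 1, 4, 3, 5]

r = correlation_from_z_scores(x_values, y_values)
print(f"r = {r:.3f}")  # approximately 0.8
```

Each term in the sum is positive when an individual is on the same side of both means and negative otherwise, which is why the sign of r matches the direction of the association.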


Facts About Correlation

How correlation behaves is more important than the details of the formula.
Here are some important facts about r.

1. Correlation requires that both variables be quantitative.
2. Correlation makes no distinction between explanatory and response variables.
3. r does not change when we change the units of measurement of x, y, or both.
4. The correlation r has no unit of measurement. It’s just a number.
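
Facts 2 and 3 are easy to verify numerically. In this sketch the helper `correlation` and the height and weight values are assumptions invented for the example. Converting inches to centimeters and pounds to kilograms leaves r unchanged (up to rounding error), and so does swapping the roles of x and y.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Correlation r, written as sum of products over root of sums of squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical heights (inches) and weights (pounds) for six people.
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [115, 120, 140, 155, 160, 180]

# The same measurements in metric units.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_us = correlation(height_in, weight_lb)
r_metric = correlation(height_cm, weight_kg)   # same value as r_us
r_swapped = correlation(weight_lb, height_in)  # same value as r_us

print(f"inches and pounds:     r = {r_us:.4f}")
print(f"centimeters and kg:    r = {r_metric:.4f}")
print(f"x and y roles swapped: r = {r_swapped:.4f}")
```

The units cancel because each value is standardized before the products are averaged, so r is a pure number.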

Section Summary

LEARNING TARGETS
After this section, you should be able to:
• DISTINGUISH between explanatory and response variables for quantitative data.
• MAKE a scatterplot to display the relationship between two quantitative
variables.
• DESCRIBE the direction, form, and strength of a relationship displayed in a
scatterplot and identify unusual features.
• INTERPRET the correlation.
• UNDERSTAND the basic properties of correlation, including how the correlation
is influenced by unusual points.
• DISTINGUISH correlation from causation.
