Chapter 03 Describing Bivariate Data
and Statistics
Fourteenth Edition
Chapter 3
Describing Bivariate Data
[Figure: bar charts of percent by opinion (Agree, Disagree, No Opinion) for men and women]
[Figure: scatterplot axes with a point plotted at x = 2, y = 5]
Copyright ©2006 Brooks/Cole
A division of Thomson Learning, Inc.
Describing the Scatterplot
• We can describe the relationship between two variables, x
and y, using the patterns shown in the scatterplot.
• What pattern or form do you see?
– Straight line upward or downward
– Curve
– No pattern at all, just a random scattering of points
• How strong is the pattern?
– Strong: all of the points follow the pattern exactly
– Weak: the relationship is only weakly visible
• Are there any unusual observations?
– Clusters or outliers
Describing the Scatterplot
Example 3.3: The number of household members, x, and the
amount spent on groceries per week, y, are measured for six
households in a local area.
Example 3.4: from the book.
[Figure: scatterplot panels labeled "Curvilinear" and "No relationship"]
Numerical Measures for
Two Quantitative Variables
• A constant rate of increase or decrease is perhaps
the most common pattern found in bivariate
scatterplots.
• Assume that the two variables x and y exhibit a
linear pattern or form.
• There are two numerical measures to describe
– The strength and direction of the relationship
between x and y (Correlation Coefficient, r)
– The form of the relationship (Regression)
• Example 3.5
The Correlation Coefficient
• The strength and direction of the relationship between x
and y are measured using the correlation coefficient, r.
$$ r = \frac{s_{xy}}{s_x s_y} $$
• The new quantity $s_{xy}$ is called the covariance between x and y and is defined as
$$ s_{xy} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1} $$
$s_x$ = standard deviation of the x's
$s_y$ = standard deviation of the y's
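The covariance and correlation formulas above can be turned into a few lines of Python. This is only a minimal sketch: the function names `covariance` and `correlation` and the small data set are made up for illustration and do not come from the slides.

```python
import math


def covariance(xs, ys):
    """Sample covariance using the computing formula:
    (sum(x*y) - sum(x)*sum(y)/n) / (n - 1)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


def correlation(xs, ys):
    """Correlation coefficient r = s_xy / (s_x * s_y)."""
    # covariance(v, v) is the sample variance of v, so its square root
    # is the sample standard deviation.
    s_x = math.sqrt(covariance(xs, xs))
    s_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (s_x * s_y)


# Hypothetical data, invented purely to exercise the functions.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_xy = covariance(xs, ys)
r = correlation(xs, ys)
print(f"s_xy = {s_xy:.3f}, r = {r:.3f}")
```

Because r divides the covariance by both standard deviations, it is unit-free and always falls between -1 and 1, whatever units x and y are measured in.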
The Correlation Coefficient
• When a data point (x, y) is in either area I or III in the
scatterplot, the cross product will be positive;
• When a data point is in area II or IV, the cross product
will be negative. We can draw these conclusions:
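• The scatterplot indicates a positive linear relationship.
$$ s_{xy} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1} = \frac{18447 - \dfrac{(81)(1123)}{5}}{4} = 63.6 $$
$$ r = \frac{s_{xy}}{s_x s_y} = \frac{63.6}{1.924(37.36)} = .885 $$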
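As a quick check of the arithmetic on this slide, the same calculation can be run from the summary values it quotes (n = 5, Σx = 81, Σy = 1123, Σxy = 18447, sx = 1.924, sy = 37.36). The variable names are illustrative; the individual data points are not given here, so the sketch works from the sums alone.

```python
# Summary values as quoted on the slide.
n = 5
sum_x = 81
sum_y = 1123
sum_xy = 18447
s_x = 1.924   # standard deviation of the x's
s_y = 37.36   # standard deviation of the y's

# Covariance via the computing formula, then the correlation coefficient.
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)
r = s_xy / (s_x * s_y)

print(f"s_xy = {s_xy:.1f}")
print(f"r    = {r:.3f}")
```

The positive covariance matches the upward trend described for the scatterplot, and r close to 1 indicates that the linear pattern is fairly strong.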
Interpreting r
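Slope b and y-intercept a of the regression line y = a + bx:
$$ b = r\,\frac{s_y}{s_x} \qquad a = \bar{y} - b\,\bar{x} $$
$$ b = r\,\frac{s_y}{s_x} = (.885)\,\frac{37.3604}{1.9235} = 17.189 $$
$$ a = \bar{y} - b\,\bar{x} = 224.6 - 17.189(16.2) = -53.86 $$
Regression line: $y = -53.86 + 17.189x$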
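The slope and intercept can be reproduced from the summary statistics quoted on the slide. This is a sketch under the assumption that the line has the form y = a + bx with b = r(sy/sx) and a = ȳ − b·x̄; the variable names are illustrative.

```python
# Summary statistics as quoted on the slide.
r = 0.885
s_x = 1.9235
s_y = 37.3604
x_bar = 16.2
y_bar = 224.6

# Slope: correlation rescaled by the ratio of standard deviations.
b = r * s_y / s_x
# Intercept: forces the line through the point (x_bar, y_bar).
a = y_bar - b * x_bar

print(f"b = {b:.3f}")
print(f"a = {a:.2f}")
```

The slide rounds b to 17.189 before computing a, so the unrounded intercept computed here can differ from -53.86 in the second decimal place. Note also that the intercept is negative.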
Example
• Predict the selling price for another residence
with 1600 square feet of living area.
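A prediction of this kind amounts to plugging the new x into the fitted line. The sketch below assumes that x is recorded in hundreds of square feet (consistent with a mean of x of 16.2), so 1600 square feet corresponds to x = 16. That unit convention and the function name `predict_price` are assumptions, not something stated on the slides, and the result is in whatever units y was recorded in.

```python
def predict_price(x, a=-53.86, b=17.189):
    """Predicted y from the regression line y = a + b*x.

    Default coefficients are the rounded values from the slide.
    """
    return a + b * x


# ASSUMPTION: x is in hundreds of square feet, so 1600 sq ft -> x = 16.
x_new = 1600 / 100
y_hat = predict_price(x_new)
print(f"Predicted selling price at x = {x_new:g}: {y_hat:.2f}")
```

Since x = 16 lies close to the mean of the observed x values, this is a prediction well inside the neighborhood of the data rather than an extrapolation far beyond it.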