Chapter 03 Describing Bivariate Data

Uploaded by

Fardin Selim Khan


Introduction to Probability and Statistics
Fourteenth Edition

Chapter 3
Describing Bivariate Data

Copyright ©2006 Brooks/Cole

Some images © 2001-(current year) www.arttoday.com A division of Thomson Learning, Inc.
Bivariate Data
• Sometimes the data that are collected consist of
observations for two variables on the same experimental unit.
• When two variables are measured on a single
experimental unit, the resulting data are called bivariate
data. Examples are:
• 1. An auto insurance company might be interested in the
number of vehicles owned by a policyholder as well as
the number of drivers in the household.
• 2. An economist might need to measure the amount spent
per week on groceries in a household and also the
number of people in that household.

Bivariate Data
• 3. A real estate agent might measure the selling price of a
residential property and the square footage of the living
area.
• You can describe each variable individually, and you can
also explore the relationship between the two variables.
• Bivariate data (qualitative or quantitative) can be
described with
– Graphs – allow you to study 2 variables together
– Numerical Measures

Graphs for Qualitative/Categorical Variables
• When at least one of the two variables is qualitative
(categorical), you can use simple or comparative pie charts,
line charts, and bar charts to display and describe the data.
• Sometimes you will have one qualitative and one
quantitative variable that have been measured in two
different populations or groups.
• In this case, you can use two side-by-side pie charts or a
bar chart in which the bars for the two populations are
placed side by side.

Graphs for Qualitative/Categorical Variables
• Another option is to use a stacked bar chart, in which the
bars for each category are stacked on top of each other.
• Example 3.1: Are professors in private colleges paid more
than professors at public colleges?
• A sample of 400 college professors is selected, and the rank,
type of college, and salary of each are recorded. The number in
each cell is the average salary (in thousands of dollars) for all
professors who fell into that category.

Graphs for Qualitative/Categorical Variables
• To display the average salaries of these 400 professors,
you can use a side-by-side bar chart.

• Salaries are substantially higher for full professors in
private colleges; however, there are less striking
differences at the lower two ranks.
Graphs for Qualitative/Categorical Variables
• Example 3.2 (From Book)
• Another Example: Do you think that men and women are
treated equally in the workplace?
Variable #1 = Opinion
Variable #2 = Gender (Men, Women)

Comparative Bar Charts
[Figure: stacked bar chart and side-by-side bar chart of Percent by Opinion (Agree, Disagree, No Opinion) for each Gender (Men, Women)]

Describe the relationship between opinion and gender:
More women than men feel that they are
not treated equally in the workplace.
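The percentages plotted in a comparative bar chart are relative frequencies computed within each group. A minimal sketch of that calculation follows. The counts are hypothetical placeholders chosen for illustration, not the survey results behind the chart, and the function name `percent_within_group` is likewise an illustrative choice.

```python
# Hypothetical counts for illustration only; NOT the data behind the slide's chart.
counts = {
    "Men":   {"Agree": 30, "Disagree": 15, "No Opinion": 5},
    "Women": {"Agree": 10, "Disagree": 35, "No Opinion": 5},
}

def percent_within_group(table):
    """Convert raw counts to percentages within each group (row)."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {opinion: 100 * n / total for opinion, n in row.items()}
    return result

percents = percent_within_group(counts)
for group, row in percents.items():
    print(group, {opinion: round(p, 1) for opinion, p in row.items()})
```

Comparing percentages rather than raw counts keeps the comparison fair when the two groups have different sizes.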
Scatterplot for Two Quantitative
Variables
• When both of the variables are quantitative, call one
variable x and the other y. A single measurement is a pair
of numbers (x, y) that can be plotted using a two-
dimensional graph called a scatterplot.
• It is the two-dimensional extension of the dotplot we
used to graph one quantitative variable.
[Figure: the point (2, 5) plotted on x and y axes, located at x = 2 and y = 5]
Describing the Scatterplot
• We can describe the relationship between two variables, x
and y, using the patterns shown in the scatterplot.
• What pattern or form do you see?
– Straight line upward or downward
– Curve
– No pattern at all, just a random scattering of points
• How strong is the pattern?
– Strong: all of the points follow the pattern exactly
– Weak: the relationship is only weakly visible
• Are there any unusual observations?
– Clusters or outliers
Describing the Scatterplot
Example 3.3: The number of household members, x, and the
amount spent on groceries per week, y, are measured for six
households in a local area.

Example 3.4 (From Book)

Examples

[Figure: four scatterplots labeled Positive linear - strong, Negative linear - weak, Curvilinear, and No relationship]
Numerical Measures for
Two Quantitative Variables
• A constant rate of increase or decrease is perhaps
the most common pattern found in bivariate
scatterplots.
• Assume that the two variables x and y exhibit a
linear pattern or form.
• There are two numerical measures to describe
– The strength and direction of the relationship
between x and y (Correlation Coefficient, r)
– The form of the relationship (Regression)
• Example 3.5
The Correlation Coefficient
• The strength and direction of the relationship between x
and y are measured using the correlation coefficient, r:

$$r = \frac{s_{xy}}{s_x s_y}$$

• The new quantity sxy is called the covariance between x and y
and is defined as

$$s_{xy} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{n - 1}$$

sx = standard deviation of the x’s
sy = standard deviation of the y’s
The Correlation Coefficient
• When a data point (x, y) is in either area I or III in the
scatterplot, the cross product will be positive;
• When a data point is in area II or IV, the cross product
will be negative. We can draw these conclusions:

The Correlation Coefficient
• If most of the points are in areas I and III (forming a
positive pattern), sxy and r will be positive.
• If most of the points are in areas II and IV (forming
a negative pattern), sxy and r will be negative.
• If the points are scattered across all four areas
(forming no pattern), sxy and r will be close to 0.
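These sign rules follow from the definitional form of the covariance. As a sketch, assume the four areas are the quadrants obtained by drawing a vertical line at $\bar{x}$ and a horizontal line at $\bar{y}$ (the figure is not reproduced here, so this is an assumption about its layout). The cross product for a point is then $(x_i - \bar{x})(y_i - \bar{y})$, which is positive in areas I and III and negative in areas II and IV. Summing the cross products and expanding recovers the computing formula for the covariance:

```latex
\begin{aligned}
s_{xy} &= \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} \\
\sum (x_i - \bar{x})(y_i - \bar{y})
  &= \sum x_i y_i - \bar{y}\sum x_i - \bar{x}\sum y_i + n\,\bar{x}\,\bar{y} \\
  &= \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
\end{aligned}
```

The last step uses $\bar{x} = \sum x_i / n$ and $\bar{y} = \sum y_i / n$, so each of the three trailing terms equals $(\sum x_i)(\sum y_i)/n$ in magnitude.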

Example
• Living area x and selling price y of 5 homes.

Residence            1    2    3    4    5
x (hundred sq ft)   14   15   17   19   16
y ($000)           178  230  240  275  200

• The scatterplot indicates a positive
linear relationship.

Example
Calculate:

  x     y     xy
 14   178   2492
 15   230   3450
 17   240   4080
 19   275   5225
 16   200   3200
 81  1123  18447   (column totals)

$\bar{x} = 16.2$, $s_x = 1.924$
$\bar{y} = 224.6$, $s_y = 37.360$

$$s_{xy} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{n - 1} = \frac{18447 - \frac{(81)(1123)}{5}}{4} = 63.6$$

$$r = \frac{s_{xy}}{s_x s_y} = \frac{63.6}{1.924(37.36)} = .885$$
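The hand calculation can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `covariance` is an illustrative choice, not something from the text.

```python
from math import sqrt

# Living area x and selling price y ($000) for the five homes in the example.
x = [14, 15, 17, 19, 16]
y = [178, 230, 240, 275, 200]

def covariance(xs, ys):
    """Sample covariance via the computing formula used on the slide."""
    n = len(xs)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)

s_xy = covariance(x, y)
s_x = sqrt(covariance(x, x))  # the variance is the covariance of x with itself
s_y = sqrt(covariance(y, y))
r = s_xy / (s_x * s_y)

print(f"s_xy = {s_xy:.1f}, s_x = {s_x:.3f}, s_y = {s_y:.3f}, r = {r:.3f}")
```

The printed values should agree with the hand calculation: 63.6, 1.924, 37.360, and .885.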
Interpreting r

• −1 ≤ r ≤ 1: Range of r. Sign of r indicates direction of the linear relationship.
• r ≈ 0: Weak relationship; random scatter of points
• r ≈ 1 or −1: Strong relationship; either positive or negative
• r = 1 or −1: All points fall exactly on a straight line.
Interpreting r
• The value of r always lies between -1 and 1.
• When r is positive, y tends to increase as x increases.
• When r is negative, y tends to decrease as x increases.
• When r takes the value exactly -1 or 1, all the points
lie exactly on a straight line.
• If r = 0, then there is no apparent linear relationship
between the two variables.
• The closer the value of r is to -1 or 1, the stronger
the linear relationship between the two variables.
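These properties can be illustrated on small data sets. The sketch below uses made-up points chosen for illustration (they do not come from the text) and a hypothetical helper `pearson_r`. It gives r = 1 and r = −1 for exactly linear data, and r = 0 for a symmetric curve, where a clear relationship exists but it is not linear.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r = s_xy / (s_x * s_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    # The (n - 1) divisors cancel, so sums of squares are enough.
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                  # illustrative values, not from the text
rising = [2 * v + 1 for v in xs]      # exact line with positive slope
falling = [10 - 3 * v for v in xs]    # exact line with negative slope
curved = [(v - 3) ** 2 for v in xs]   # symmetric parabola

print(pearson_r(xs, rising), pearson_r(xs, falling), pearson_r(xs, curved))
```

The parabola case is a reminder that r measures only linear association: a value near 0 does not rule out a curvilinear pattern, which is why the scatterplot should be examined first.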
The Regression Line
• Sometimes x and y are related in a particular way—the value of y
depends on the value of x.
– y = dependent variable
– x = independent variable
• Example: the cost of a home (y) may depend on its amount of
floor space (x); a student’s grade point average (x) may explain
her score on an achievement test (y).
• The form of the linear relationship between x and y can be
described by fitting a line as best we can through the points. This
is the regression line,
y = a + bx.
– a = y-intercept of the line
– b = slope of the line

The Regression Line

• For every one-unit increase in x, y increases by an amount
b. The quantity b determines whether the line is increasing
(b > 0), decreasing (b < 0), or horizontal (b = 0) and is
appropriately called the slope of the line.
The Regression Line
• When plotting the (x, y) points for two variables x and y,
the points generally do not fall exactly on a straight line,
but they may show a trend that could be described as a
linear pattern.
• We can describe this trend by fitting a line as best we
can through the points.
• This best-fitting line relating y to x, often called the
regression or least squares line, is found by minimizing
the sum of the squared differences between the data
points and the line itself.
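In symbols, the criterion can be sketched as follows. The name SSE (sum of squared errors) is introduced here for illustration and does not appear on the slides, and the differences are taken as vertical distances between each observed y and the line, which is the usual least squares convention.

```latex
\begin{aligned}
\mathrm{SSE}(a, b) &= \sum_{i=1}^{n} \bigl( y_i - (a + b x_i) \bigr)^2 \\
\frac{\partial\,\mathrm{SSE}}{\partial a} &= -2 \sum \bigl( y_i - a - b x_i \bigr) = 0
  \;\Longrightarrow\; a = \bar{y} - b\bar{x} \\
\frac{\partial\,\mathrm{SSE}}{\partial b} &= -2 \sum x_i \bigl( y_i - a - b x_i \bigr) = 0
  \;\Longrightarrow\; b = \frac{s_{xy}}{s_x^2} = r\,\frac{s_y}{s_x}
\end{aligned}
```

The final equality uses $r = s_{xy} / (s_x s_y)$, so the slope can be computed from the correlation coefficient and the two standard deviations.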

The Regression Line
• To find the slope and y-intercept of the
best fitting line, use:

$$b = r\,\frac{s_y}{s_x}$$
$$a = \bar{y} - b\bar{x}$$

• The least squares regression line is y = a + bx
The Regression Line
• Since sx and sy are both positive, b and r have the
same sign, so that:
1. When r is positive, so is b, and the line is
increasing with x.
2. When r is negative, so is b, and the line is
decreasing with x.
3. When r is close to 0, then b is close to 0.
• Example 3.7 from Book

Example in Excel

  x     y     xy
 14   178   2492
 15   230   3450
 17   240   4080
 19   275   5225
 16   200   3200
 81  1123  18447   (column totals)

Recall:
$\bar{x} = 16.2$, $s_x = 1.9235$
$\bar{y} = 224.6$, $s_y = 37.3604$
r = .885

$$b = r\,\frac{s_y}{s_x} = (.885)\frac{37.3604}{1.9235} = 17.189$$
$$a = \bar{y} - b\bar{x} = 224.6 - 17.189(16.2) = -53.86$$

Regression Line: $y = -53.86 + 17.189x$
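The same slope and intercept can be reproduced in a few lines. A minimal sketch, standard library only: it uses the identity b = r(sy/sx) = sxy/sx², so r does not have to be rounded first. The variable names are illustrative.

```python
# Living area x and selling price y ($000) for the five homes.
x = [14, 15, 17, 19, 16]
y = [178, 230, 240, 275, 200]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of cross products and of squared deviations about the means.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)

b = sxy / sxx            # slope; equals r * s_y / s_x because (n - 1) cancels
a = mean_y - b * mean_x  # y-intercept

print(f"y = {a:.2f} + {b:.3f}x")
```

Working from unrounded sums avoids the small rounding drift that can appear when r, sx, and sy are each rounded before being combined.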
Example
• Predict the selling price for another residence
with 1600 square feet of living area.

Predict: y = −53.86 + 17.189x
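Carrying the prediction through, as a sketch: the data values 14 to 19 suggest that x is recorded in hundreds of square feet, so 1600 square feet is taken here as x = 16 (an assumption about the units), with y in thousands of dollars.

```latex
y = -53.86 + 17.189(16) = -53.86 + 275.024 = 221.164
```

Under that assumption the predicted selling price is about $221,164.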

The Correlation and Regression
• The regression approach is used when the values of x are set
in advance and then the corresponding value of y is
measured.
• The correlation approach is used when an experimental unit
is selected at random and then measurements are made on
both variables x and y.
• Most data analysts begin any data-based investigation by
examining plots of the variables involved.
• If the relationship between two variables is of interest, data
analysts can also explore bivariate plots in conjunction with
numerical measures of location, dispersion, and correlation.
• Graphs and numerical descriptive measures are only the first
of many statistical tools.
Key Concepts
I. Bivariate Data
1. Both qualitative and quantitative variables
2. Describing each variable separately
3. Describing the relationship between the variables
II. Describing Two Qualitative Variables
1. Side-by-Side pie charts
2. Comparative line charts
3. Comparative bar charts
– Side-by-Side
– Stacked
4. Relative frequencies to describe the relationship between
the two variables.
Key Concepts
III. Describing Two Quantitative Variables
1. Scatterplots
– Linear or nonlinear pattern
– Strength of relationship
– Unusual observations; clusters and outliers
2. Covariance and correlation coefficient
3. The best fitting line
– Calculating the slope and y-intercept
– Graphing the line
– Using the line for prediction