Chapter 2, Lecture Notes
► There are two types of data sets: categorical and quantitative.
► Categorical data use labels or names to identify categories of like items.
► Quantitative data are numerical values that indicate how much or how many.
► Example 1:
Coca-Cola  Pepsi      Pepsi  Coca-Cola  Sprite
Coca-Cola  Sprite     Pepsi  Coca-Cola  Pepsi
2.1 Summarizing Data for a Categorical Variable
Frequency, Relative Frequency, and Percent Frequency Distributions
Category    freq.   relative f.     percent f.
Coca-Cola   4       4/10 = 0.4      (0.4)(100) = 40%
Pepsi       4       4/10 = 0.4      (0.4)(100) = 40%
Sprite      2       2/10 = 0.2      (0.2)(100) = 20%

Cumulative Distributions
Category    cum. freq.   cum. rel. f.   cum. percent f.
Coca-Cola   4            0.4            40%
Pepsi       8            0.8            80%
Sprite      10           1.0            100%
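The two tables above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the function names frequency_table and cumulative_table are illustrative choices, not part of the lecture material.

```python
from collections import Counter
from itertools import accumulate

# Soft drink purchases from Example 1.
purchases = [
    "Coca-Cola", "Pepsi", "Pepsi", "Coca-Cola", "Sprite",
    "Coca-Cola", "Sprite", "Pepsi", "Coca-Cola", "Pepsi",
]

def frequency_table(data, categories):
    """Return (category, freq, relative freq, percent freq) for each category."""
    counts = Counter(data)
    n = len(data)
    return [(c, counts[c], counts[c] / n, 100 * counts[c] / n) for c in categories]

def cumulative_table(rows):
    """Return running totals of the frequency column in count, relative, and percent form."""
    n = sum(row[1] for row in rows)
    running = accumulate(row[1] for row in rows)
    return [(row[0], cum, cum / n, 100 * cum / n) for row, cum in zip(rows, running)]

categories = ["Coca-Cola", "Pepsi", "Sprite"]
table = frequency_table(purchases, categories)
cumulative = cumulative_table(table)

for name, freq, rel, pct in table:
    print(f"{name:<10} {freq:>3} {rel:>5.1f} {pct:>5.0f}%")
for name, cum, rel, pct in cumulative:
    print(f"{name:<10} {cum:>3} {rel:>5.1f} {pct:>5.0f}%")
```

The cumulative rows depend on the order of the categories, so the same order as in the table is passed in explicitly.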
► The relative frequencies add up to 1.
Histogram (= Bar Chart)
The y axis of a histogram is relative frequency.
Example 1:
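As a rough stand-in for the chart, the sketch below prints a text bar chart of the Example 1 data, with bar length proportional to relative frequency. The function name text_bar_chart and the scale of 20 characters for a relative frequency of 1.0 are assumptions made for illustration.

```python
from collections import Counter

# Soft drink purchases from Example 1.
purchases = [
    "Coca-Cola", "Pepsi", "Pepsi", "Coca-Cola", "Sprite",
    "Coca-Cola", "Sprite", "Pepsi", "Coca-Cola", "Pepsi",
]

def text_bar_chart(data, width=20):
    """Return one line per category; bar length is relative frequency times width."""
    counts = Counter(data)
    n = len(data)
    lines = []
    for category in sorted(counts):
        rel = counts[category] / n
        bar = "#" * round(rel * width)
        lines.append(f"{category:<10} {bar} {rel:.1f}")
    return lines

chart = text_bar_chart(purchases)
print("\n".join(chart))
```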
Shapes of Histograms
Pie Chart
Creating Frequency Distribution and Bar Chart in Excel
Step 1: Select any cell in the data set.
Step 2: Click the Insert tab on the ribbon.
Step 3: Click Recommended Charts.
Step 4: Click OK.
Step 5: Select the Frequency Distribution table and click on the bar chart icon.
2.2 Summarizing Data for a Quantitative Variable
Example 2: 10, 5, 2, 9, 8, 11, 3, 12, 4, 7
Frequency, Relative Frequency, and Percent Frequency Distributions
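For quantitative data the values are first grouped into classes. The notes do not state which classes are used for Example 2, so the sketch below assumes four classes of width 3 starting at 2 (2-4, 5-7, 8-10, 11-13); the function name class_frequencies is likewise only illustrative.

```python
# Data from Example 2.
values = [10, 5, 2, 9, 8, 11, 3, 12, 4, 7]

def class_frequencies(data, start, width, num_classes):
    """Count the integer values falling in each class [lower, lower + width - 1]."""
    n = len(data)
    rows = []
    for k in range(num_classes):
        lower = start + k * width
        upper = lower + width - 1
        freq = sum(lower <= x <= upper for x in data)
        rows.append((f"{lower}-{upper}", freq, freq / n, 100 * freq / n))
    return rows

# Assumed class limits: 2-4, 5-7, 8-10, 11-13.
distribution = class_frequencies(values, start=2, width=3, num_classes=4)

for label, freq, rel, pct in distribution:
    print(f"{label:<6} {freq:>3} {rel:>5.1f} {pct:>5.0f}%")
```

A different class width or starting point would give a different table, so the choice of classes should be stated alongside any distribution built this way.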
Histogram (= Bar Chart)
How to Create Frequency Distribution and Bar Chart in Excel
Use PivotTable.
softdrink.xlsx
Audit.xlsx
Dot Plot
Stem-and-Leaf Display
► Put the digits on each line in order.
► Use a rectangle to contain the leaves of each stem.
► Rotating this page counterclockwise onto its side
provides a picture that is similar to a histogram.
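The first of these steps can be sketched in Python. No data set is shown for this display in the notes, so the example borrows the ten meal prices listed in Section 2.3 as an assumption; the tens digit is the stem and the ones digit is the leaf, and the function name stem_and_leaf is illustrative.

```python
from collections import defaultdict

# Meal prices ($) of the 10 restaurants listed in Section 2.3 (borrowed as sample data).
prices = [18, 22, 28, 38, 33, 28, 19, 11, 23, 13]

def stem_and_leaf(data):
    """Group two-digit values by tens digit (stem), with ones digits (leaves) in order."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    return dict(stems)

display = stem_and_leaf(prices)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Sorting the data before splitting it is what puts the digits on each line in order.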
2.3 Summarizing Data for Two Variables Using Tables
A crosstabulation is a tabular summary of data for two variables.
Quality rating and meal price data for 10 LA restaurants:
Restaurant   Quality Rating   Meal Price ($)
1            Good             18
2            Very Good        22
3            Good             28
4            Excellent        38
5            Very Good        33
6            Good             28
7            Very Good        19
8            Very Good        11
9            Very Good        23
10           Good             13
                          Meal Price
Quality Rating   $10–19   $20–29   $30–39   Total
Good             2        2        0        4
Very Good        2        2        1        5
Excellent        0        0        1        1
Total            4        4        2        10
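The crosstabulation can be built directly from the ten restaurant records. The following is a minimal sketch in plain Python; the helper names price_class and crosstab are illustrative, and the price classes are taken from the column headings of the table.

```python
from collections import Counter

# (quality rating, meal price in $) for the 10 restaurants.
restaurants = [
    ("Good", 18), ("Very Good", 22), ("Good", 28), ("Excellent", 38),
    ("Very Good", 33), ("Good", 28), ("Very Good", 19), ("Very Good", 11),
    ("Very Good", 23), ("Good", 13),
]

RATINGS = ["Good", "Very Good", "Excellent"]
PRICE_CLASSES = [(10, 19), (20, 29), (30, 39)]

def price_class(price):
    """Return the (lower, upper) class that contains the price."""
    for lower, upper in PRICE_CLASSES:
        if lower <= price <= upper:
            return (lower, upper)
    raise ValueError(f"price {price} is outside all classes")

def crosstab(records):
    """Count records for every (rating, price class) pair; rows follow RATINGS."""
    counts = Counter((rating, price_class(price)) for rating, price in records)
    return [[counts[(r, c)] for c in PRICE_CLASSES] for r in RATINGS]

table = crosstab(restaurants)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

for rating, row, total in zip(RATINGS, table, row_totals):
    print(f"{rating:<10} {row} {total}")
print(f"{'Total':<10} {col_totals} {sum(col_totals)}")
```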
How to Create a Crosstabulation in Excel
Quality Rating.xlsx
We obtain column percentages by dividing each
element in a particular column by the total for that
column.
                          Meal Price
Quality Rating   $10–19   $20–29   $30–39
Good             50%      50%      0%
Very Good        50%      50%      50%
Excellent        0%       0%       50%
Total            100%     100%     100%
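The column percentages follow mechanically from the crosstabulation counts. A short sketch, assuming the counts are stored as a list of rows (Good, Very Good, Excellent) with one entry per price class; the function name column_percentages is illustrative.

```python
# Crosstabulation counts: rows are Good, Very Good, Excellent;
# columns are $10-19, $20-29, $30-39.
counts = [
    [2, 2, 0],
    [2, 2, 1],
    [0, 0, 1],
]

def column_percentages(table):
    """Divide each cell by its column total and express the result as a percent."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100 * cell / total for cell, total in zip(row, col_totals)]
        for row in table
    ]

col_pct = column_percentages(counts)
for row in col_pct:
    print(" ".join(f"{value:>5.0f}%" for value in row))
```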
We obtain row percentages by dividing each element in a
particular row by the total for that row.
                          Meal Price
Quality Rating   $10–19   $20–29   $30–39   Total
Good             50%      50%      0%       100%
Very Good        40%      40%      20%      100%
Excellent        0%       0%       100%     100%
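Row percentages use the same counts but divide by row totals instead. Again a sketch under the assumption that the counts are held as a list of rows; the function name row_percentages is illustrative.

```python
# Crosstabulation counts: rows are Good, Very Good, Excellent;
# columns are $10-19, $20-29, $30-39.
counts = [
    [2, 2, 0],
    [2, 2, 1],
    [0, 0, 1],
]

def row_percentages(table):
    """Divide each cell by its row total and express the result as a percent."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

row_pct = row_percentages(counts)
for row in row_pct:
    print(" ".join(f"{value:>5.0f}%" for value in row))
```

Row percentages answer "within a quality rating, how are prices distributed?", while column percentages answer "within a price class, how are ratings distributed?".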
Simpson's Paradox
Simpson's paradox occurs when conclusions drawn from two or more
separate crosstabulations are reversed once the data are aggregated
into a single crosstabulation.
                  Judge
Verdict     Luckett      Kendall      Total
Upheld      129 (86%)    110 (88%)    239
Reversed    21 (14%)     15 (12%)     36
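The table above is the aggregated view, in which Kendall has the higher share of upheld verdicts. To see how a reversal can arise, the sketch below splits each judge's cases into two court types. This split is an assumption made for illustration: it is not given in these notes and was chosen only so that the counts add up to the totals in the table.

```python
# Assumed (upheld, reversed) counts by court type. Only the per-judge sums
# (129/21 for Luckett, 110/15 for Kendall) appear in the notes.
cases = {
    "Luckett": {"Common Pleas": (29, 3), "Municipal": (100, 18)},
    "Kendall": {"Common Pleas": (90, 10), "Municipal": (20, 5)},
}

def upheld_rate(upheld, reversed_):
    """Share of verdicts that were upheld."""
    return upheld / (upheld + reversed_)

def aggregate(judge):
    """Sum a judge's upheld and reversed counts over all court types."""
    upheld = sum(u for u, _ in cases[judge].values())
    reversed_ = sum(r for _, r in cases[judge].values())
    return upheld, reversed_

for court in ("Common Pleas", "Municipal"):
    for judge in cases:
        rate = upheld_rate(*cases[judge][court])
        print(f"{court:<13} {judge:<8} {rate:.1%}")

for judge in cases:
    print(f"{'Aggregated':<13} {judge:<8} {upheld_rate(*aggregate(judge)):.1%}")
```

Under this assumed split, Luckett has the higher upheld rate within each court type, yet Kendall has the higher rate once the two courts are combined, because the judges handle very different mixes of cases.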
Example:
No. of Commercials   Weekly Sales (in thousands of dollars)
2                    50
5                    57
3                    54
Scatter Diagram:
Shapes of Scatter Diagrams
Side-by-Side Bar Chart
                               Meal Price
Quality Rating   $10–19   $20–29   $30–39   $40–49
Good             53.8%    33.9%    2.6%     0%
Very Good        43.6%    54.2%    60.5%    21.4%
Excellent        2.6%     11.9%    36.8%    78.6%
Total            100%     100%     100%     100%
Stacked Bar Chart
How to Create Side-by-Side and Stacked Bar Charts in Excel
Quality Rating.xlsx
Choosing the Type of Graphical Display
Group the following charts into three categories:
► Displays Used to Show the Distribution of Data
► Displays Used to Make Comparisons
► Displays Used to Show Relationships
Bar Chart
Pie Chart
Dot Plot
Histogram
Stem-and-Leaf Display
Side-by-Side Bar Chart