Chapter 2 Organizing and Visualizing Data
Organizing and Visualizing Variables
Tallying Data
One Categorical Variable: Summary Table
Two Categorical Variables: Contingency Table
Source: Data extracted and adapted from “Main Reason Young Adults Shop Online?”
USA Today, December 5, 2012, p. 1A.
The number of classes depends on the number of values in the data: the
more values there are, the more classes are typically used. In general, a
frequency distribution should have at least 5 but no more than 15 classes.
To determine the width of a class interval, divide the range of the data
(highest value minus lowest value) by the number of class groupings desired.
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
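The class-width rule can be checked against the 20 values listed above. The following is a minimal sketch, assuming 5 class groupings are desired (the smallest number the guideline allows). Rounding the raw width up to the next multiple of 5 or 10 is a common convenience and is an assumption here, not a rule stated in the text. The names class_width and convenient_width are illustrative.

```python
import math

# The 20 values listed in the text.
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

def class_width(values, n_classes):
    """Range (highest value minus lowest value) divided by the number of classes."""
    return (max(values) - min(values)) / n_classes

n_classes = 5  # assumption: the minimum number of classes suggested by the guideline
raw_width = class_width(data, n_classes)

# Assumption: round the raw width up to the next multiple of 10 for readable boundaries.
convenient_width = math.ceil(raw_width / 10) * 10

print(f"range = {max(data) - min(data)}")
print(f"raw class width = {raw_width}")
print(f"convenient class width = {convenient_width}")
```

With these values the range is 58 minus 12, or 46, so the raw width is 9.2 and a convenient width is 10.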
Summary Table For One Variable
Contingency Table For Two Variables
Pie or Doughnut Chart
The “Vital Few”
Copyright © 2017 Pearson Education, Ltd. Chapter 2 - 28
Visualizing Categorical Data:
Side By Side Bar Charts DCOVA
The side by side bar chart represents the data from a contingency table.
Invoice Size      No Errors    Errors     Total
Small Amount      50.75%       30.77%     47.50%
Medium Amount     29.85%       61.54%     35.00%
Large Amount      19.40%        7.69%     17.50%
Total             100.0%       100.0%     100.0%

[Figure: side by side bar chart, “Invoice Size Split Out By Errors & No Errors”; categories Large, Medium, Small; percentage axis from 0.0% to 70.0%]
[Figure: doughnut chart, “Invoice Size & Errors”; Inner Ring With Errors, Outer Ring No Errors]
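The percentages in the invoice table are column percentages: each column sums to 100%. The sketch below shows how such percentages could be derived from a contingency table of counts. The counts are HYPOTHETICAL; they do not appear in the source and were chosen only because they reproduce the percentages shown. The function name column_percentages is likewise illustrative.

```python
# HYPOTHETICAL counts (not from the source), chosen so that the resulting
# column percentages match the table in the text.
counts = {
    "Small Amount":  {"No Errors": 170, "Errors": 20},
    "Medium Amount": {"No Errors": 100, "Errors": 40},
    "Large Amount":  {"No Errors": 65,  "Errors": 5},
}

def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    columns = ["No Errors", "Errors"]
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    grand_total = sum(col_totals.values())
    result = {}
    for size, row in table.items():
        pct = {c: 100 * row[c] / col_totals[c] for c in columns}
        # The Total column uses the row total over the grand total.
        pct["Total"] = 100 * sum(row[c] for c in columns) / grand_total
        result[size] = pct
    return result

pct = column_percentages(counts)
for size, row in pct.items():
    print(f"{size:<14}" + "".join(f"{row[c]:>10.2f}%" for c in ("No Errors", "Errors", "Total")))
```

A side by side bar chart would then draw one bar per column for each invoice size, using these percentages as bar lengths.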
Ordered Array
Frequency Distributions and Cumulative Distributions
Stem-and-Leaf Display
Histogram
Polygon
Ogive
[Figure: histogram; vertical axis: Frequency; horizontal axis: class midpoints 5, 15, 25, 35, 45, 55, More]
(In a percentage histogram the vertical axis would be defined to show the percentage of observations per class.)
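The frequencies, percentages, and cumulative percentages behind a histogram, percentage histogram, and ogive can be tallied directly. This is a minimal sketch using the 20 values listed earlier in the chapter. The class boundaries (10 up to 20, 20 up to 30, and so on, each of width 10) are an assumption inferred from the midpoints on the histogram axis; the source does not state them.

```python
# The 20 values listed earlier in the text.
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Assumed class layout: 5 classes of width 10 starting at 10.
lower, width, n_classes = 10, 10, 5

def frequency_distribution(values, lower, width, n_classes):
    """Count how many values fall in each class [lower + k*width, lower + (k+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        idx = (v - lower) // width
        if not 0 <= idx < n_classes:
            raise ValueError(f"value {v} falls outside the class boundaries")
        counts[idx] += 1
    return counts

counts = frequency_distribution(data, lower, width, n_classes)
percentages = [100 * c / len(data) for c in counts]

# Cumulative percentages are the running total of the class percentages.
cumulative = []
running = 0.0
for p in percentages:
    running += p
    cumulative.append(running)

for k in range(n_classes):
    lo = lower + k * width
    print(f"{lo} but less than {lo + width}: "
          f"frequency {counts[k]}, {percentages[k]:.0f}%, cumulative {cumulative[k]:.0f}%")
```

The frequencies give the bar heights of the histogram, the percentages give those of the percentage histogram, and the cumulative percentages give the points of the ogive.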
Two Numerical Variables
Scatter Plot
Time-Series Plot
[Figure: scatter plot; horizontal axis: Volume per Day]
Volume per Day    Vertical-axis value
29                146
33                160
38                167
42                170
50                188
55                195
60                200
[Figure: time-series plot; vertical axis: Number of Franchises; horizontal axis: Year, 2007 to 2015]
Year    Number of Franchises
2015    95
table.
Allows interactive changing of the level of
summarization and formatting of the variables.
Allows you to interactively “slice” your data to
summarize subsets of data that meet specified criteria.
Can be used to discover possible patterns and
relationships in multidimensional data that simpler
tables and charts would fail to make apparent.
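The idea of slicing data to summarize only the subset that meets a criterion can be sketched in a few lines. Everything in this example is HYPOTHETICAL: the records, the field names (size, errors, region), and the helper slice_and_tally are made up for illustration and are not taken from the source or from any spreadsheet product.

```python
from collections import Counter

# HYPOTHETICAL invoice records, invented purely for illustration.
invoices = [
    {"size": "Small",  "errors": False, "region": "East"},
    {"size": "Small",  "errors": True,  "region": "West"},
    {"size": "Medium", "errors": True,  "region": "East"},
    {"size": "Medium", "errors": True,  "region": "West"},
    {"size": "Medium", "errors": False, "region": "East"},
    {"size": "Large",  "errors": False, "region": "West"},
    {"size": "Large",  "errors": True,  "region": "East"},
    {"size": "Small",  "errors": False, "region": "West"},
]

def slice_and_tally(records, keep, by):
    """Keep only the records for which keep(record) is true, then tally them by one field."""
    return Counter(r[by] for r in records if keep(r))

# Slice 1: only invoices with errors, summarized by invoice size.
errors_by_size = slice_and_tally(invoices, lambda r: r["errors"], "size")

# Slice 2: only invoices from the East region, summarized by error status.
east_by_errors = slice_and_tally(invoices, lambda r: r["region"] == "East", "errors")

print(dict(errors_by_size))
print(dict(east_by_errors))
```

Changing the criterion or the field used for the tally corresponds to changing the slice or the level of summarization.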
[Figure: movie revenues by week, per month]
Selective summarization:
Presenting only part of the data collected.
Chartjunk.
An Example of Selective Summarization:
These Two Summarizations Tell Totally Different Stories DCOVA
First summarization:

Company    Change from Prior Year
A          +7.2%
B          +24.4%
C          +24.9%
D          +24.8%
E          +12.5%
F          +35.1%
G          +29.7%

Second summarization:

Company    Year 1     Year 2     Year 3
A          -22.6%     -33.2%     +7.2%
B          -4.5%      -41.9%     +24.4%
C          -18.5%     -31.5%     +24.9%
D          -29.4%     -48.1%     +24.8%
E          -1.9%      -25.3%     +12.5%
F          -1.6%      -37.8%     +35.1%
G          +7.4%      -13.6%     +29.7%
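One way to see why the two summarizations tell different stories is to compound the yearly changes into a single three-year change. The sketch below assumes each percentage is a change relative to the prior year, so the changes combine multiplicatively; the function name cumulative_change is illustrative.

```python
# Yearly percentage changes (Year 1, Year 2, Year 3) from the three-year table.
changes = {
    "A": [-22.6, -33.2, 7.2],
    "B": [-4.5, -41.9, 24.4],
    "C": [-18.5, -31.5, 24.9],
    "D": [-29.4, -48.1, 24.8],
    "E": [-1.9, -25.3, 12.5],
    "F": [-1.6, -37.8, 35.1],
    "G": [7.4, -13.6, 29.7],
}

def cumulative_change(yearly_pct):
    """Compound year-over-year percentage changes into one overall percentage change."""
    factor = 1.0
    for pct in yearly_pct:
        factor *= 1 + pct / 100
    return (factor - 1) * 100

for company, yearly in changes.items():
    print(f"{company}: latest year {yearly[-1]:+.1f}%, "
          f"three-year change {cumulative_change(yearly):+.1f}%")

# Companies that are still below their starting level after three years.
still_down = [c for c, yearly in changes.items() if cumulative_change(yearly) < 0]
print("Below starting level:", still_down)
```

Under this assumption, every company shows a gain in the latest year, yet six of the seven remain below their starting level over the three years; only company G ends higher.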
[Figures: Bad Presentation versus Good Presentations; charts with horizontal-axis categories FR, SO, JR, SR and Q1, Q2, Q3, Q4]