
Describing Data Visually

This document discusses various methods for visually summarizing and presenting numerical and categorical data, including: bar charts and pie charts for categorical data; stem-and-leaf displays, dot plots, frequency distributions, histograms, and scatter plots for numerical data. The objectives are to learn how to develop visual representations of data to facilitate decision making and understand which graphical methods are best suited for different variable types.

Uploaded by

Yin Yin
Copyright
© All Rights Reserved

LECTURE 3

Describing Data Visually


Chapter Contents

• Bar Charts and Pie Charts
• Stem-and-Leaf Displays and Dot Plots
• Frequency Distributions and Histograms
• Scatter Plots
• Line Charts
Learning Objectives

In this lecture you will learn:

• To develop tables and charts for categorical data
• To develop tables and charts for numerical data
Graphical Presentation of Data
• Data in raw form are usually not easy to use for decision making
• Some type of organization, such as a graph or table, is needed
• The type of graph to use depends on the variable being summarized
Tables and Charts for Categorical Data

Categorical Data
• Tabulating Data: Summary Table
• Graphing Data: Bar Charts, Pie Charts, Pareto Diagram
The Summary Table
Summarize data by category

Example: Current Investment Portfolio (variables are categorical)

Investment Type   Amount (in thousands $)   Percentage (%)
Stocks             46.5                      42.27
Bonds              32.0                      29.09
CD                 15.5                      14.09
Savings            16.0                      14.55
Total             110.0                     100.0
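
The percentage column can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the lecture.

```python
# Amounts (in thousands of dollars) from the Current Investment Portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(amounts.values())

# Percentage of the total for each category, rounded to two decimals.
percentages = {name: round(100 * value / total, 2) for name, value in amounts.items()}

for name, value in amounts.items():
    print(f"{name:<8} {value:>6.1f} {percentages[name]:>7.2f}")
print(f"{'Total':<8} {total:>6.1f}")
```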
Bar and Pie Charts

• Bar charts and pie charts are often used for qualitative data (categories or nominal scale)
• Pies or bars represent categories
• Height of bar or size of pie slice shows the frequency or percentage for each category
Bar Chart Example
Current Investment Portfolio (data as in the summary table)

[Figure: horizontal bar chart "Investor's Portfolio"; bars for Savings, CD, Bonds, and Stocks; horizontal axis: Amount in $1000's, 0 to 50]
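
A horizontal bar chart of this kind can be approximated in plain text. The sketch below is illustrative only: the function name text_bar_chart and the scale of one '#' per $2,000 are assumptions, not part of the lecture.

```python
# Amounts (in thousands of dollars) from the Current Investment Portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

def text_bar_chart(data, unit=2.0):
    """Return one line per category; each '#' stands for `unit` thousand dollars."""
    lines = []
    for name, value in data.items():
        bar = "#" * round(value / unit)  # bar length is proportional to the amount
        lines.append(f"{name:<8}| {bar} {value}")
    return lines

chart = text_bar_chart(amounts)
print("\n".join(chart))
```

The bar length carries the comparison, which is the role bar height plays in a vertical chart.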
Vertical and Horizontal Bar Charts
Pie Chart Example
Current Investment Portfolio (data as in the summary table)

[Figure: pie chart with slices Stocks 42%, Bonds 29%, Savings 15%, CD 14%]

Percentages are rounded to the nearest percent.
Effective Pie Charts
• A pie chart should only have a few slices
• Each slice should be labeled with data values or percents

[Figures: 2D pie chart; 3D pie chart]
Pareto Diagram

• Used to portray categorical data (nominal scale)
• A bar chart, where categories are shown in descending order of frequency
• A cumulative polygon is often shown in the same graph
Pareto Diagram Example

Investment Type   Amount (in thousands $)   Percentage (%)   Cum. percentage (%)
Stocks             46.5                      42.27             42.27
Bonds              32.0                      29.09             71.36
Savings            16.0                      14.55             85.91
CD                 15.5                      14.09            100.00
Total             110.0                     100.0
Current Investment Portfolio

[Figure: Pareto diagram; bar graph (left axis, 0% to 45%): % invested in each category; line graph (right axis, 0% to 100%): cumulative % invested; categories in order: Stocks, Bonds, Savings, CD]
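
The ordering and the cumulative line of a Pareto diagram can be computed as follows. This is a sketch under the assumption that "frequency" here means the invested amount; the function name pareto_table is illustrative.

```python
# Amounts (in thousands of dollars) from the Current Investment Portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

def pareto_table(data):
    """Sort categories by descending amount and accumulate their percentages."""
    total = sum(data.values())
    rows = []
    running = 0.0
    for name, value in sorted(data.items(), key=lambda item: item[1], reverse=True):
        running += value
        rows.append((name, round(100 * value / total, 2), round(100 * running / total, 2)))
    return rows

rows = pareto_table(amounts)
for name, pct, cum_pct in rows:
    print(f"{name:<8} {pct:>6.2f} {cum_pct:>7.2f}")
```

Sorting before accumulating is what makes the cumulative polygon rise steeply at first and then flatten.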
Tables and Charts for Numerical Data

Numerical Data
• Ordered Array: Stem-and-Leaf Display, Dot Plot
• Frequency Distributions and Cumulative Distributions: Histogram, Polygon, Ogive
The Ordered Array

A sequence of data in rank order:

• Shows range (min to max)
• Provides some signals about variability within the range
• May help identify outliers (unusual observations)
• If the data set is large, the ordered array is less useful
The Ordered Array
(continued)

• Data in raw form (as collected):

24, 26, 24, 21, 27, 27, 30, 41, 32, 38

• Data in ordered array from smallest to largest:

21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-Leaf Diagram

• A simple way to see distribution details in a data set

METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)
Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

• Here, use the 10's digit for the stem unit:

                   Stem   Leaf
• 21 is shown as    2      1
• 38 is shown as    3      8
• 41 is shown as    4      1
Example
(continued)
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

• Completed stem-and-leaf diagram:

Stem   Leaves
2      1 4 4 6 7 7
3      0 2 8
4      1
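
The method can be sketched in Python for two-digit data. The function name stem_and_leaf is illustrative; it assumes non-negative integers with the 10's digit as the stem.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and list the units digits (leaves)."""
    groups = defaultdict(list)
    for value in sorted(values):          # sort first so leaves come out in order
        stem, leaf = divmod(value, 10)    # 21 -> stem 2, leaf 1
        groups[stem].append(leaf)
    return dict(groups)

raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
display = stem_and_leaf(raw)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```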
Using other stem units
• Using the 100's digit as the stem:
• Round off the 10's digit to form the leaves

                        Stem   Leaf
• 613 would become       6      1
• 776 would become       7      8
• ...
• 1224 becomes          12      2
Using other stem units
(continued)

• Using the 100's digit as the stem:

Data:
613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891, 894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224

• The completed stem-and-leaf display:

Stem   Leaves
6      136
7      2258
8      346699
9      13368
10     356
11     47
12     2
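
The same idea with rounded leaves can be sketched as below. The function name is illustrative, and rounding halves upward is an assumption; the lecture only says to round off the 10's digit.

```python
def stem_and_leaf_hundreds(values):
    """Round each value to the nearest ten, then split into hundreds (stem) and tens (leaf)."""
    display = {}
    for value in sorted(values):
        tens = (value + 5) // 10          # round half up to the nearest ten (assumption)
        stem, leaf = divmod(tens, 10)     # 776 -> 78 -> stem 7, leaf 8
        display.setdefault(stem, []).append(leaf)
    return display

data = [613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891,
        894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224]
display = stem_and_leaf_hundreds(data)
for stem, leaves in display.items():
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in leaves)}")
```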
Dot Plots
• A dot plot is the simplest graphical display of n individual values of numerical data:
• Easy to understand
• Not good for large samples (e.g., > 5,000).
Making a Dot Plot
1. Make a scale that covers the data range
2. Mark the axes and label them
3. Plot each data value as a dot above the scale at its
approximate location

Note:
If more than one data value lies at about the same axis
location, the dots are piled up vertically.
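
The steps above can be sketched as a text dot plot. This is only an illustration for integer data; the function name dot_plot and the use of '*' for a dot are assumptions.

```python
from collections import Counter

def dot_plot(values):
    """Return text rows with dots stacked above an integer scale from min to max."""
    counts = Counter(values)
    low, high = min(values), max(values)
    rows = []
    # Build the plot from the top row down; repeated values pile up vertically.
    for level in range(max(counts.values()), 0, -1):
        rows.append("".join("*" if counts[x] >= level else " "
                            for x in range(low, high + 1)))
    return rows

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
rows = dot_plot(data)
width = max(data) - min(data) + 1

print("\n".join(rows))
print("-" * width)                                                        # the scale
print(str(min(data)).ljust(width - len(str(max(data)))) + str(max(data)))  # axis labels
```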
Interpreting a Dot Plot
• Range of data shows dispersion
• Clustering shows central tendency

• The range is from 21 to 41
• Most data values lie between 24 and 27
Tabulating Numerical Data: Frequency Distributions

What is a Frequency Distribution?

• A frequency distribution is a table containing class groupings and the corresponding frequencies with which data fall within each grouping
Frequency Distribution Example

Example:
A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Frequency and Cumulative Frequency
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class     Frequency   Relative Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 - 20    3          0.15                  15            3                      15
20 - 30    6          0.30                  30            9                      45
30 - 40    5          0.25                  25           14                      70
40 - 50    4          0.20                  20           18                      90
50 - 60    2          0.10                  10           20                     100
Total     20          1.00                 100
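
The table can be reproduced with a short function. This is a sketch that assumes each class means "lower but less than upper"; the function and key names are illustrative.

```python
def frequency_distribution(values, start, width, n_classes):
    """Tabulate classes of the form 'lower but less than upper'."""
    n = len(values)
    rows = []
    cumulative = 0
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        freq = sum(1 for v in values if lower <= v < upper)
        cumulative += freq
        rows.append({
            "class": f"{lower} but less than {upper}",
            "frequency": freq,
            "relative": freq / n,
            "percentage": 100 * freq / n,
            "cum_frequency": cumulative,
            "cum_percentage": 100 * cumulative / n,
        })
    return rows

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
table = frequency_distribution(temps, start=10, width=10, n_classes=5)
for row in table:
    print(row)
```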
Graphing Numerical Data: The Histogram

• A graph of the data in a frequency distribution is called a histogram
• The class boundaries (or class midpoints) are shown on the horizontal axis
• The vertical axis is either frequency, relative frequency, or percentage
• Bars of the appropriate heights are used to represent the number of observations within each class
Histogram Example

Class                 Class Midpoint   Frequency
10 but less than 20   15               3
20 but less than 30   25               6
30 but less than 40   35               5
40 but less than 50   45               4
50 but less than 60   55               2

[Figure: "Histogram: Daily High Temperature"; vertical axis: Frequency, 0 to 7; horizontal axis: Class Midpoints, 5 to 65; no gaps between bars]
Histograms in Excel

1. Select Tools/Data Analysis

Histograms in Excel
(continued)

2. Choose Histogram
3. Input data range and bin range (bin range is a cell range containing the upper interval endpoints for each class grouping). Select Chart Output and click "OK"
Graphing Numerical Data: The Frequency Polygon

Class                 Class Midpoint   Frequency
10 but less than 20   15               3
20 but less than 30   25               6
30 but less than 40   35               5
40 but less than 50   45               4
50 but less than 60   55               2

[Figure: "Frequency Polygon: Daily High Temperature"; vertical axis: Frequency, 0 to 7; horizontal axis: Class Midpoints, 5 to 65]

(In a percentage polygon the vertical axis would be defined to show the percentage of observations per class)
Graphing Cumulative Frequencies: The Ogive (Cumulative % Polygon)

Class                 Lower class boundary   Cumulative Percentage
Less than 10           0                       0
10 but less than 20   10                      15
20 but less than 30   20                      45
30 but less than 40   30                      70
40 but less than 50   40                      90
50 but less than 60   50                     100

[Figure: "Ogive: Daily High Temperature"; vertical axis: Cumulative Percentage, 0 to 100; horizontal axis: Class Boundaries (Not Midpoints), 10 to 60]
Tabulating and Graphing Multivariate Categorical Data
• Contingency Table for Investment Choices ($1000's)

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                 46.5         55           27.5        129
Bonds                  32.0         44           19.0         95
CD                     15.5         20           13.5         49
Savings                16.0         28            7.0         51
Total                 110.0        147           67.0        324

(Individual values could also be expressed as percentages of the overall total, percentages of the row totals, or percentages of the column totals)
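
The three percentage views mentioned above can be sketched as follows. The dictionary layout and the helper name percent_of are illustrative assumptions.

```python
# Contingency table for investment choices ($1000's); keys A, B, C are the investors.
table = {
    "Stocks":  {"A": 46.5, "B": 55.0, "C": 27.5},
    "Bonds":   {"A": 32.0, "B": 44.0, "C": 19.0},
    "CD":      {"A": 15.5, "B": 20.0, "C": 13.5},
    "Savings": {"A": 16.0, "B": 28.0, "C": 7.0},
}
investors = ["A", "B", "C"]

row_totals = {cat: sum(row.values()) for cat, row in table.items()}
col_totals = {inv: sum(row[inv] for row in table.values()) for inv in investors}
grand_total = sum(row_totals.values())

def percent_of(value, base):
    """Percentage of `base` represented by `value`, rounded to two decimals."""
    return round(100 * value / base, 2)

pct_overall = {c: {i: percent_of(v, grand_total) for i, v in r.items()} for c, r in table.items()}
pct_row = {c: {i: percent_of(v, row_totals[c]) for i, v in r.items()} for c, r in table.items()}
pct_col = {c: {i: percent_of(v, col_totals[i]) for i, v in r.items()} for c, r in table.items()}

print(pct_overall["Stocks"], pct_row["Stocks"], pct_col["Stocks"])
```

Which base to use depends on the question: column percentages compare how each investor splits a portfolio, while row percentages compare who holds each category.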
Tabulating and Graphing Multivariate Categorical Data
(continued)

• Side-by-side bar charts

[Figure: "Comparing Investors"; horizontal side-by-side bar chart with categories Savings, CD, Bonds, Stocks; horizontal axis 0 to 60; legend: Investor A, Investor B, Investor C]
Side-by-Side Chart Example
• Sales by quarter for three sales territories:

        1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East    20.4      27.4      59        20.4
West    30.6      38.6      34.6      31.6
North   45.9      46.9      45        43.9

[Figure: side-by-side bar chart by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 60; legend: East, West, North]
Scatter Plots

• Scatter plots are used to examine possible relationships between two numerical variables

• The scatter plot:
• One variable is measured on the vertical axis and the other variable is measured on the horizontal axis
Interpreting a Scatter Plot

• Scatter plots can convey patterns in data pairs that would not be apparent from a table.
Scatter Plots Example

Volume per day   Cost per day
23               131
24               120
26               140
29               151
33               160
38               167
41               185
42               170
50               188
55               195
60               200

[Figure: scatter plot "Cost per Day vs. Production Volume"; vertical axis: Cost per Day, 0 to 250; horizontal axis: Volume per Day, 0 to 70]
Scatter Plots in Excel

1. Select the chart wizard
2. Select XY(Scatter) option, then click "Next"
3. When prompted, enter the data range, desired legend, and desired destination to complete the scatter diagram
Time Series Plot

• A time series plot is used to study patterns in the values of a variable over time.
• In a time series plot, one variable is measured on the vertical axis and the time period is measured on the horizontal axis.
• Can display several variables at once.
Time Series Plot Example

Year   Number of Franchises
1996    43
1997    54
1998    60
1999    73
2000    82
2001    95
2002   107
2003    99
2004    95

[Figure: line chart "Number of Franchises"; vertical axis 0 to 120; horizontal axis: years 1995 to 2005]
Log Scales

A log scale is useful for time series data that might be expected to grow at a compound annual percentage rate (e.g., GDP, the national debt, or your future income).

It reveals whether the quantity is growing at an increasing percent (concave upward), constant percent (straight line), or declining percent (concave downward).
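
The straight-line property can be checked numerically. The sketch below uses a hypothetical series that grows 10% per period (not data from the lecture); equal steps in log10 correspond to a straight line on a log scale.

```python
import math

def log_steps(series):
    """Differences between consecutive log10 values of a positive series."""
    logs = [math.log10(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical series: starts at 100 and grows at a constant 10% per period.
constant_growth = [100 * 1.10 ** t for t in range(6)]

# On an arithmetic scale the increments keep growing (curve bends upward) ...
arithmetic_steps = [b - a for a, b in zip(constant_growth, constant_growth[1:])]
# ... but on a log scale every step is the same size (straight line).
steps = log_steps(constant_growth)

print([round(s, 2) for s in arithmetic_steps])
print([round(s, 4) for s in steps])
```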
Comparison of Arithmetic and Log Scales
Pictograms

• A visual display in which data values are replaced by pictures.
A Complete Example

• Sort the data and then summarize in a graphical display. Here are the sorted P/E ratios (values from Table 3.2).
Stem-and-leaf Displays

For the 44 P/E ratios, the stem-and-leaf display and dot plot are given below.
Dot Plots
Frequency Distribution
Histograms
Make a histogram with appropriate bins.
Frequency Polygons and Ogives
Deceptive Graphs
Error 1: Nonzero Origin
• A nonzero origin will exaggerate the trend.

[Figures: deceptive version; correct version]
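
The size of the exaggeration can be put in numbers. In this sketch the values 95 and 107 are taken from the franchise example, while the axis starting point of 90 and the function name apparent_ratio are hypothetical.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn heights for values a and b when the axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

true_ratio = apparent_ratio(95, 107)                 # axis starts at zero
distorted = apparent_ratio(95, 107, axis_start=90)   # hypothetical nonzero origin

print(round(true_ratio, 3), round(distorted, 3))
```

With a zero origin the second value is drawn about 13% taller than the first; with the axis starting at 90 it is drawn more than three times as tall.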
Deceptive Graphs
Error 2: Elastic Graph Proportions

• Keep the aspect ratio (width/height) below 2.00 so as not to exaggerate the graph. By default, Excel uses an aspect ratio of 1.68.
Deceptive Graphs
Error 4: 3-D and Novelty Graphs

• Can make trends appear to dwindle into the distance or loom towards you.
Deceptive Graphs
Error 5: 3-D and Rotated Graphs

• Can make trends appear to dwindle into the distance or loom towards you.
Deceptive Graphs
Error 8: Complex Graphs

• Avoid if possible. Keep your main objective in mind. Break graph into smaller parts.
Deceptive Graphs
Error 11: Area Trick

• As figure height increases, so does width, distorting the graph.
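
A small calculation shows why this distorts. The sketch assumes the figure's width is scaled by the same factor as its height; the function name area_scale is illustrative.

```python
def area_scale(height_scale):
    """Area multiplier when width is scaled by the same factor as height."""
    return height_scale ** 2

# A value that doubles is drawn twice as tall, but the figure covers four times the area.
print(area_scale(2))
print(area_scale(3))
```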
