Displaying and Describing Quantitative Data

This document provides information on displaying and describing quantitative data through various statistical methods. It discusses how to summarize numerical data using histograms, stem-and-leaf plots, and analyzing shape and skewness. It also covers measuring the center through mean and median, and spread using boxplots, interquartile range, and standard deviation. Examples are provided on building frequency distributions for continuous data and interpreting histograms, stem-and-leaf plots, and boxplots.

Uploaded by

Josh Potash

Displaying and Describing Quantitative Data
Summarizing numerical data
Histograms
Stem-and-Leaf plots
Shape and Skewness
Center: Mean vs. Median
Boxplots (5 number summary)
Measuring the spread

Continuous Data: may take on any value in
some interval
Summarized in a grouped data frequency
table

Example: A manufacturer of insulation randomly selects 20
winter days and records the daily high temperature
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

NOTE: Temperature is a continuous variable because it could be
measured to any degree of precision desired
Frequency Distribution:
Continuous Data

1. Determine the number of categories
(classes/bins)
2. Establish class width
Minimum width is the range of the data divided by the number of classes
Largest data point - Smallest data point = Range
3. Set the class boundaries
4. Determine the frequency in each class
Count the number of data points in each category

Building a Frequency Table:
Continuous Data
How Many Categories?
Many (Narrow class intervals)
May yield a very jagged distribution
with gaps from empty classes
Can give a poor indication of how
frequency varies across classes

Few (Wide class intervals)
May compress variation too much
and yield a blocky distribution
Can obscure important patterns of
variation
[Figure: two histograms of the temperature data, Frequency vs. Temperature. One uses few, wide classes (labels 0, 30, 60, More); the other uses many narrow classes (labels 4, 8, 12, ..., 60, More).]
(X axis labels are upper class endpoints)
General Guidelines
Number of Data Points    Number of Classes
under 50                 5 - 7
50 - 100                 6 - 10
100 - 250                7 - 12
over 250                 10 - 20

Class widths can typically be reduced as the
number of observations increases
Distributions with numerous observations are
more likely to be smooth and have gaps filled
since data are plentiful
Considerations:
Continuous Data

Must be mutually exclusive

Must be all-inclusive

Bins should be of equal width

Avoid empty categories
How should the endpoints be
determined?
Often by trial and error

The goal is to create a distribution that is
neither too "jagged" nor too "blocky"

You want to appropriately show the pattern
of variation

Sort raw data from low to high:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 20)
Compute class width: 10 (46/5 = 9.2, then round up)
Determine class boundaries: 10, 20, 30, 40, 50 (lower limits)
(Sometimes class midpoints are reported: 15, 25, 35, 45, 55)
Count the number of values in each class
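
The steps above can be sketched in a few lines of Python. This is only an illustration: the slides say to "round off" the width without giving a rule, so rounding up to a multiple of 10 (the `round_to` parameter) is an assumption made here.

```python
import math

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5
round_to = 10  # assumed "nice" rounding step; not specified in the slides

data_range = max(temps) - min(temps)      # 58 - 12 = 46
raw_width = data_range / num_classes      # 46 / 5 = 9.2
# Round the raw width up to the next multiple of round_to.
class_width = math.ceil(raw_width / round_to) * round_to

# Start the first class at the multiple of the width at or below the minimum.
start = math.floor(min(temps) / class_width) * class_width
lower_limits = [start + i * class_width for i in range(num_classes)]

print(data_range, class_width, lower_limits)
```

With a different `round_to` or number of classes the boundaries change, which is the trial-and-error step the slides describe.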
Example:
Continuous Data
Frequency Distribution Example
Data from low to high:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class              Frequency    Relative Frequency
10 but under 20    3            .15
20 but under 30    6            .30
30 but under 40    5            .25
40 but under 50    4            .20
50 but under 60    2            .10
Total              20           1.00
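
As a check on the table, the counts can be reproduced with a short Python sketch. It assumes each class includes its lower boundary and excludes its upper one, matching the "10 but under 20" wording; the variable names are illustrative.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
boundaries = [10, 20, 30, 40, 50, 60]

rows = []
for lower, upper in zip(boundaries, boundaries[1:]):
    # "lower but under upper": lower end included, upper end excluded
    count = sum(1 for t in temps if lower <= t < upper)
    rows.append((lower, upper, count, count / len(temps)))

for lower, upper, count, rel in rows:
    print(f"{lower} but under {upper}: {count:2d}  {rel:.2f}")

frequencies = [count for _, _, count, _ in rows]
relative = [rel for _, _, _, rel in rows]
```

Because the classes are mutually exclusive and all-inclusive, the frequencies sum to the number of observations and the relative frequencies sum to 1.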
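Frequency Distribution Histogram
[Figure: histogram of the temperature frequency distribution. Horizontal axis: class midpoints (5, 15, 25, 35, 45, 55, More); vertical axis: Frequency. Bar heights: 0, 3, 6, 5, 4, 2, 0.]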
Histogram Example
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
No gaps between bars, since the data are continuous
[Figure: the same histogram with class endpoints/bins on the horizontal axis: 0, 10, 20, 30, 40, 50, 60]
Frequency Histograms
Visual representation of the frequency table
The classes/intervals/bins are shown on the
horizontal axis
Frequency is measured on the vertical axis

Bars of the appropriate heights can be used to
represent the number of observations within
each class
Shows the center of the data and the spread
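
A rough text version of such a histogram can be drawn without a plotting library. The sketch below uses the class frequencies from the temperature example; it draws bars horizontally rather than vertically, and the `text_histogram` helper is a hypothetical name introduced for illustration.

```python
# Class frequencies from the temperature example.
frequencies = {"10-20": 3, "20-30": 6, "30-40": 5, "40-50": 4, "50-60": 2}

def text_histogram(freqs):
    """Return one line per class; the bar length equals the class frequency."""
    lines = []
    for label, count in freqs.items():
        bar = "#" * count
        lines.append(f"{label:>5} | {bar}")
    return lines

lines = text_histogram(frequencies)
print("\n".join(lines))
```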

Stem-and-Leaf Plots
A quick and dirty histogram
11, 24, 24, 25, 27, 30, 30, 31, 32, 33, 44, 46, 47, 50, 52

5 | 02
4 | 467
3 | 00123
2 | 4457
1 | 1
Key: 1|1 stands for 11

The parts: stems (left of the bar) and leaves (right of the bar)
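
The plot above can be built mechanically: the tens digit is the stem and the ones digit is the leaf. A minimal Python sketch, assuming non-negative two-digit integers (the `stem_and_leaf` function is an illustrative name, not something from the slides):

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem); the ones digits are the leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

data = [11, 24, 24, 25, 27, 30, 30, 31, 32, 33, 44, 46, 47, 50, 52]
plot = stem_and_leaf(data)

# Print the largest stem on top, as on the slide.
for stem in sorted(plot, reverse=True):
    leaves = "".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```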
Splitting Stems
You want about 7 to 10 stems (depending on how
much data you have)
Create more stems by splitting them
8 | 555555566    8.5 to 8.9
8 | 0012233      8.0 to 8.4
7 | 8899         7.5 to 7.9
7 | 02           7.0 to 7.4
Or More
8 | 2233         8.2 and 8.3
8 | 001          8.0 and 8.1
7 | 8899         7.8 and 7.9
7 |              7.6 and 7.7
7 |              7.4 and 7.5
7 | 2            7.2 and 7.3
7 | 0            7.0 and 7.1
20 18 16 29 26 14 21 26 24 21 22
8 28 21 31 20 29 33 26 18 21 19
38 21 13 29 17 15 35 26

3 | 1358
2 | 00111126668999
1 | 34567889
0 | 8

3 | 58
3 | 13
2 | 6668999
2 | 0011112
1 | 567889
1 | 34
0 | 8
Key: 1|3 means 13
Normal Body Temperature
96.3 96.7 96.9 97.0 97.1
97.1 97.1 97.2 97.3 97.4
97.4 97.4 97.4 97.5 97.5
97.6 97.6 97.6 97.7 97.8
97.8 97.8 97.8 97.9 97.9
98.0 98.0 98.0 98.0 98.0
98.0 98.1 98.1 98.2 98.2
98.2 98.2 98.3 98.3 98.4
98.4 98.4 98.4 98.5 98.5
98.6 98.6 98.6 98.6 98.6
98.6 98.7 98.7 98.8 98.8
98.8 98.9 99.0 99.0 99.0
99.1 99.2 99.3 99.4 99.5
99 | 5
99 | 0001234
98 | 55666666778889
98 | 000000112222334444
97 | 556667888899
97 | 0111234444
96 | 79
96 | 3
Key: 96|3 means 96.3
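
Splitting each stem in two (leaves 0 to 4 and leaves 5 to 9) can be sketched as follows for the body temperature data. This is a minimal illustration; `split_stem_plot` is a hypothetical helper, and it assumes values recorded to one decimal place.

```python
from collections import defaultdict

temps = [
    96.3, 96.7, 96.9, 97.0, 97.1, 97.1, 97.1, 97.2, 97.3, 97.4,
    97.4, 97.4, 97.4, 97.5, 97.5, 97.6, 97.6, 97.6, 97.7, 97.8,
    97.8, 97.8, 97.8, 97.9, 97.9, 98.0, 98.0, 98.0, 98.0, 98.0,
    98.0, 98.1, 98.1, 98.2, 98.2, 98.2, 98.2, 98.3, 98.3, 98.4,
    98.4, 98.4, 98.4, 98.5, 98.5, 98.6, 98.6, 98.6, 98.6, 98.6,
    98.6, 98.7, 98.7, 98.8, 98.8, 98.8, 98.9, 99.0, 99.0, 99.0,
    99.1, 99.2, 99.3, 99.4, 99.5,
]

def split_stem_plot(values):
    """Map (stem, half) to a string of leaves; half 0 is leaves 0-4, half 1 is 5-9."""
    rows = defaultdict(str)
    for v in sorted(values):
        tenths = round(v * 10)            # 98.6 -> 986, avoids floating-point noise
        stem, leaf = divmod(tenths, 10)   # 986 -> stem 98, leaf 6
        half = 0 if leaf < 5 else 1
        rows[(stem, half)] += str(leaf)
    return dict(rows)

rows = split_stem_plot(temps)
for key in sorted(rows, reverse=True):    # highest values on top
    print(f"{key[0]} | {rows[key]}")
```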

Symmetric or Skewed?
Is the distribution
symmetric?
A distribution is symmetric if
the right and left sides of the
histogram are approximately
mirror images of each other.

left skewed?
It is skewed to the left
if the left side of the
histogram extends
much farther out than
the right side.
or right skewed?
A distribution is skewed
to the right if the right
side of the histogram
(side with larger values)
extends much farther
out than the left side

Other Distributions
Bimodal Distribution
Uniform Distribution
Outliers
Data points that don't seem to fit in the
distribution.
Far to the left or right in the graph.
Quantitative Summaries
Mean
Median
5 number summary -- Boxplots
Measuring Spread
Describing Quantitative Data
Where is it? What is its center?
What is the spread or variability? How much
noise is in the data?
What is the shape of the distribution? Is it
symmetric?
Measuring these attributes
Center of the Distribution
The average salary, height, etc.

Mean: add up the data and divide by the
number of observations.

Median: An equal number of observations
more and less than the median.
Mean
Add up the data and divide by the number of
observations

Data: 1, 2, 2, 3, 4
Mean = (1 + 2 + 2 + 3 + 4) /5 = 2.4

Data: 10, 12, 56, 78, 113, 1209
Mean = (10 + 12 + 56 +78 + 113 + 1209)/6 = 246.3
Some Algebra
Median
The middle observation
Data: 1, 2, 2, 3, 4
Mean = (1 + 2 + 2 + 3 + 4) /5 = 2.4
Median = 2

Data: 10, 12, 56, 78, 113, 1209
Mean = (10 + 12 + 56 +78 + 113 + 1209)/6 = 246.3
Median = (56 + 78)/2 = 67
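
Both examples can be checked with Python's standard `statistics` module. The sketch below only reproduces the arithmetic on the slides; the variable names are illustrative.

```python
from statistics import mean, median

small = [1, 2, 2, 3, 4]
skewed = [10, 12, 56, 78, 113, 1209]

small_mean, small_median = mean(small), median(small)
skewed_mean, skewed_median = mean(skewed), median(skewed)

print(small_mean, small_median)               # mean and median of the first data set
print(round(skewed_mean, 1), skewed_median)   # the single large value pulls the mean up
```

In the second data set the value 1209 raises the mean far above most of the observations, while the median stays between the two middle values.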

Comparison
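Mean and median are similar for symmetric distributions.
In a skewed distribution, the mean is pulled in the direction of the skew (toward the long tail).
[Figure: a skewed distribution with the median and mean marked]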
Modes
Mode: peak in the distribution
Bimodal = Two Modes

Mean and Median
5 number summary
Median
Minimum, Maximum
Quartiles: the middle observation above the
median and the middle observation below the median

Min, Q1, Med, Q3, Max
Finding Quartiles
1. Data: 7, 23, 75, 82, 34, 91, 10
2. Put it in order:
7, 10, 23, 34, 75, 82, 91
3. Find the median: 34
4. Below the median: 7, 10, 23
Lower Quartile Q1 = 10
5. Above the median: 75, 82, 91
Upper Quartile Q3 = 82
More Quartiles
7, 8, 22, 38, 48, 62
Median = (22 + 38)/2 = 30
Below the median: 7, 8, 22
Q1 = 8
Above the median: 38, 48, 62
Q3 = 48
One more time
125, 126, 127, 129, 133, 136, 136, 140, 141,
143, 143, 147, 152
Median = 136 (the 7th of 13 observations)
Below the median: 125, 126, 127, 129, 133, 136
Q1 = (127 + 129)/2 = 128
Above the median: 140, 141, 143, 143, 147, 152
Q3 = (143 + 143)/2 = 143

5 Numbers
125, 126, 127, 129, 133, 136, 136, 140, 141,
143, 143, 147, 152

Five number summary
(min, Q1, med, Q3, max) =
(125, 128, 136, 143, 152)
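
The quartile rule used in these examples (take the median of the values below and above the median, leaving out the middle observation when the count is odd) can be sketched as below. `five_number_summary` is an illustrative name; statistical libraries often use interpolation-based quartile definitions that can give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # observations below the median position
    upper = data[half + (n % 2):]    # skip the middle observation when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary_a = five_number_summary([7, 23, 75, 82, 34, 91, 10])
summary_b = five_number_summary([7, 8, 22, 38, 48, 62])
summary_c = five_number_summary([125, 126, 127, 129, 133, 136, 136,
                                 140, 141, 143, 143, 147, 152])
print(summary_a, summary_b, summary_c, sep="\n")
```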
5 Number Summary
Example: Shares traded daily on NYSE
Max       3,115,805,723
Q3        1,739,245,625
Median    1,584,406,064
Q1        1,451,269,968
Min         545,244,020
Box Plot
Example: monthly credit card charges($)
[Figure: monthly credit card charges ($) on the horizontal axis, 0 to 7000; Count on the vertical axis, ticks at 100, 200, 300]
How would you describe this distribution?
We can compare groups
Side-by-side boxplots compare two sets of data.
Do they have the same center? Spread? Shape?
Is the difference between the medians much
bigger than the variability in the data?
Let's examine the % hotel occupancy rate in Hawaii
to compare the summer months with the
non-summer months

Case: Hotel Occupancy in Hawaii
[Figure: Boxplot of Hotel Occupancy vs Season. Horizontal axis: Season (Summer, Non-Summer); vertical axis: Hotel Occupancy (%), 50 to 90]
Case (cont.): Occupancy by each of the 4
seasons
There are two high seasons for hotels in Hawaii.
[Figure: Boxplot of Hotel Occupancy vs Season. Horizontal axis: Season (Fall, Summer, Spring, Winter); vertical axis: Hotel Occupancy (%), 50 to 90]
[Figure: Time Series Plot of Hotel Occupancy. Horizontal axis: Month and Year (January and July marked, 2000 to 2004); vertical axis: Hotel Occupancy (%), 50 to 90]
Time Series Plots
What additional information does this graph give us?
Seasonal behavior of unemployment rates
Data from 1980-2001
Measuring the Spread
How much variability is in the data?
1. Range
Maximum - Minimum
2. Interquartile Range
Q3 - Q1
3. Standard Deviation
Square root of the average squared distance from the mean
Why I don't use the range.
An outlier is going to be either the largest or
smallest data point.
If there is an outlier then I don't want to use it;
it isn't typical.
Even if there are no outliers, I'm more
interested in central numbers than extreme
numbers.
IQR
The IQR is the length of the central half of the data.

IQR = Q3 - Q1
It is the least sensitive to outliers of any of our
measures of spread.
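
A small sketch of why the IQR resists outliers while the range does not. The first data set is the one from the mean and median example; the second, with 1209 replaced by 120, is hypothetical and introduced only for comparison. Quartiles use the median-of-halves rule from the earlier slides.

```python
from statistics import median

def data_range(values):
    return max(values) - min(values)

def iqr(values):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[half + len(data) % 2:])   # skip the middle value when n is odd
    return q3 - q1

with_outlier = [10, 12, 56, 78, 113, 1209]
without_outlier = [10, 12, 56, 78, 113, 120]   # hypothetical: outlier replaced

print(data_range(with_outlier), iqr(with_outlier))
print(data_range(without_outlier), iqr(without_outlier))
```

The range changes drastically between the two data sets, while the IQR is the same for both.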
How much does it vary?
Range = 46555
IQR = 6587
Variance
Average of the squared deviations
Shows variation about the mean
Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

Standard Deviation
Square root of the variance
Has the same units as the original data
Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
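
The sample variance and standard deviation can be computed directly from their definitions, with the divisor n - 1. The sketch below applies them to the small data set from the mean example (1, 2, 2, 3, 4); the function name is illustrative, and the standard library's `statistics.variance` and `statistics.stdev` are used only as a cross-check.

```python
import math
from statistics import stdev, variance

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

data = [1, 2, 2, 3, 4]
s2 = sample_variance(data)   # squared deviations sum to 5.2, divided by 4
s = math.sqrt(s2)            # same units as the original data

print(round(s2, 4), round(s, 4))
print(round(variance(data), 4), round(stdev(data), 4))   # library cross-check
```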
Standardizing
How many standard deviations is a value from the mean?

Z = 2: the value is 2 standard deviations above the mean

Z = -2: the value is 2 standard deviations below the mean

Allows us to compare values with different units

$$z = \frac{\text{value} - \text{mean}}{\text{st. dev.}}$$
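
Standardizing is a one-line calculation. In the sketch below the mean of 100 and standard deviation of 15 are hypothetical numbers chosen only to give round results; they do not come from the slides.

```python
def z_score(value, mean, st_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / st_dev

# Hypothetical distribution: mean 100, standard deviation 15.
z_above = z_score(130, 100, 15)
z_below = z_score(70, 100, 15)
print(z_above, z_below)
```

Because z has no units, values measured on different scales can be compared once each is standardized against its own mean and standard deviation.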
