Chapter 2: Summarizing Data
Categorical Data (variables are qualitative) | Numerical Data
Pie Chart
[Figure: pie chart of investment categories (e.g. Bonds 29%). Percentages are rounded to the nearest percent.]
Pareto Diagram
[Figure: Pareto diagram. The axis for the bar chart shows the % invested in each category; the axis for the line graph shows the cumulative % invested.]
Vilfredo Pareto (1848–1923): the “Vital Few”
Bivariate Categorical Data
■ Contingency Tables
■ Side-by-Side Bar Charts
Class              Frequency   Relative Frequency   Percentage
10 but under 20        3              .15                15
20 but under 30        6              .30                30
30 but under 40        5              .25                25
40 but under 50        4              .20                20
50 but under 60        2              .10                10
Total                 20             1.00               100
Frequency Distribution:
Discrete Data
■The following data record the number of
children in the families of the 47 workers in a
company:
1 1 3 2 0 2 0 1 2 2 1 3
5 2 4 0 0 2 4 1 1 2 2 0
3 0 0 2 1 3 6 0 2 1 0 3
2 2 2 1 0 0 1 1 3 1 4
Frequency distribution table
Number of children in family    Number of workers
0
1
2
3
4
5
6
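The blank table above can be filled in by tallying the raw data. The following is a minimal sketch in Python, assuming the 47 values are typed in exactly as listed on the slide; the function name `frequency_table` is illustrative and not part of the original material.

```python
from collections import Counter

# Number of children in the families of the 47 workers (copied from the slide).
children = [
    1, 1, 3, 2, 0, 2, 0, 1, 2, 2, 1, 3,
    5, 2, 4, 0, 0, 2, 4, 1, 1, 2, 2, 0,
    3, 0, 0, 2, 1, 3, 6, 0, 2, 1, 0, 3,
    2, 2, 2, 1, 0, 0, 1, 1, 3, 1, 4,
]

def frequency_table(values):
    """Return {value: count} for every integer from min(values) to max(values)."""
    counts = Counter(values)
    # Include values with zero occurrences so the table has no missing rows.
    return {v: counts.get(v, 0) for v in range(min(values), max(values) + 1)}

table = frequency_table(children)

print("Children  Workers")
for value, freq in table.items():
    print(f"{value:>8}  {freq:>7}")
print(f"{'Total':>8}  {sum(table.values()):>7}")
```

Listing every value between the minimum and maximum, even one that never occurs, keeps the rows of the table in order from low to high.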
Frequency Distribution:
Discrete Data
■Discrete data: possible values are countable
Example: An advertiser asks 200 customers how many days per week they read the daily newspaper.
Number of days read   Frequency
0                        44
1                        24
2                        18
3                        16
4                        20
5                        22
6                        26
7                        30
Total                   200
Relative Frequency
Relative Frequency: What proportion is in each category?
Number of days read   Frequency   Relative Frequency
0                        44             .22
1                        24             .12
2                        18             .09
3                        16             .08
4                        20             .10
5                        22             .11
6                        26             .13
7                        30             .15
Total                   200            1.00
22% of the people in the sample report that they read the newspaper 0 days per week.
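Each relative frequency is the class frequency divided by the total number of observations. A minimal Python sketch using the newspaper frequencies from the slide follows; the function name `relative_frequencies` is an illustrative choice.

```python
# Frequencies from the newspaper example: days read per week -> number of customers.
frequency = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

def relative_frequencies(freq):
    """Divide each frequency by the total so the results sum to 1."""
    total = sum(freq.values())
    return {value: count / total for value, count in freq.items()}

rel = relative_frequencies(frequency)

print("Days  Frequency  Relative frequency")
for days, share in rel.items():
    print(f"{days:>4}  {frequency[days]:>9}  {share:>18.2f}")
```

Multiplying a relative frequency by 100 gives the percentage for that value.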
NOTE
For developing frequency and relative frequency
distributions for discrete data:
(1) List all possible values of the variable. If the
variable is quantitative, order the possible values
from low to high.
(2) Count the number of occurrences at each
value of the variable and place this value in a
column labeled “frequency”.
(3) Determine the relative frequencies.
Frequency Distribution:
Continuous Data
[Figure: classes for discrete data and for continuous data, each class marked with its lower limit and upper limit.]
Distribution classes
■ Class widths (class lengths):
- continuous data: the numerical difference between the lower and upper class limits.
- discrete data: the numerical difference between the lower limit of one class and the lower limit of the immediately following class.
■ Class midpoints: the values at the centre of each class, halfway between its lower and upper limits.
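The definitions above translate into three small calculations. The sketch below is in Python; the function names are illustrative, and the discrete classes 10-14 and 15-19 are a hypothetical example that does not come from the slides.

```python
def continuous_width(lower, upper):
    """Width of a continuous class: upper limit minus lower limit."""
    return upper - lower

def discrete_width(lower, next_lower):
    """Width of a discrete class: next class's lower limit minus this class's lower limit."""
    return next_lower - lower

def midpoint(lower, upper):
    """Centre of a class: the average of its lower and upper limits."""
    return (lower + upper) / 2

# Continuous class "10 but under 20" from the chapter's frequency table.
cont_width = continuous_width(10, 20)
cont_mid = midpoint(10, 20)

# Hypothetical discrete classes 10-14 and 15-19 (assumed for illustration only).
disc_width = discrete_width(10, 15)
disc_mid = midpoint(10, 14)

print(cont_width, cont_mid, disc_width, disc_mid)
```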
Distribution classes
■ Open-ended class:
- A class without a lower or an upper limit.
- Usually used for the first class, which has no defined lower limit, and/or the last class, which has no defined upper limit.
Example classes: < 10, 10-15, 15-20, >= 20
Grouping Data by Classes
Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
■ Find range: 58 - 12 = 46
■ Select number of classes: 5 (usually between 5 and 20)
■ Compute class width: 10 (46/5 = 9.2, then round up)
■ Determine class boundaries: 10, 20, 30, 40, 50, 60
■ Compute class midpoints: 15, 25, 35, 45, 55
■ Count observations & assign to classes
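The grouping steps above can be sketched in Python as follows. The class width of 10 and the starting boundary of 10 are the rounded values chosen on the slide, not values the code derives on its own; the function name `group_into_classes` is illustrative.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def group_into_classes(values, n_classes, start, width):
    """Count values in classes of the form 'lower but under upper'."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"value {v} falls outside the class boundaries")
        counts[index] += 1
    return counts

data_range = max(data) - min(data)        # 58 - 12 = 46
n_classes = 5
raw_width = data_range / n_classes        # 9.2 before rounding
width = 10                                # rounded up to a convenient value (slide's choice)
start = 10                                # convenient lower boundary at or below the minimum

boundaries = [start + i * width for i in range(n_classes + 1)]
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
counts = group_into_classes(data, n_classes, start, width)

for lo, hi, mid, count in zip(boundaries, boundaries[1:], midpoints, counts):
    print(f"{lo} but under {hi}: midpoint {mid}, frequency {count}")
```

Using "lower but under upper" classes means a value equal to a boundary, such as 30, is counted in the class that starts at that boundary.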
Frequency Distribution Example
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Frequency Distribution
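The Histogram
[Figure: histogram of the frequency distribution. Labels mark the class boundaries and class midpoints; there are no gaps between bars.]
[Figure annotation: “lost” orders in the upper tail of the distribution.]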
Histogram: Example
■ In the Journal of Experimental Social Psychology (Vol. 45, 2009) study on
whether money can buy love (p. 63), the researchers randomly assigned
participants to the role of either gift-giver or gift-receiver. (Gift-givers, recall,
were asked about a birthday gift they recently gave, while gift-recipients were
asked about a birthday gift they recently received.) Two quantitative variables
were measured for each of the 237 participants: gift price (measured in dollars)
and overall level of appreciation for the gift (measured as the sum of the two 7-point
appreciation scales, with higher values indicating a higher level of
appreciation).
■ One of the objectives of the research was to investigate whether givers and
receivers differ on the price of the gift reported and on the level of appreciation
reported.
■ Use BUYLOV to construct side-by-side histograms for the quantitative
variables, one histogram for gift-givers and one for gift-recipients.
The histograms for birthday gift price
The prices reported by gift-recipients tended to be higher than the prices reported by gift-givers.
The histograms for overall level of
appreciation
Gift-givers and gift-recipients respond differently, with gift-recipients more likely to express a greater level of appreciation for the gift than what gift-givers perceive.
Organizing Numerical Data
Numerical Data 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
Tables Polygons
[Figure: “Meal Costs” chart, series “City”. Horizontal axis: meal cost, marked from 25 to 95 in steps of 10; vertical axis: frequency, from 0 to 16.]
Cumulative and relative cumulative frequency distribution
The Polygon
[Figure: polygon plotted at the class midpoints.]
▪ A percentage polygon is formed by having the
midpoint of each class represent the data in that
class and then connecting the sequence of
midpoints at their respective class percentages.
▪ The cumulative percentage polygon, or ogive,
displays the variable of interest along the X axis,
and the cumulative percentages along the Y axis.
▪ Useful when there are two or more groups to
compare.
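The values plotted in a percentage polygon and in an ogive can be computed from a frequency distribution. The Python sketch below reuses the chapter's five-class example (frequencies 3, 6, 5, 4, 2); it only produces the numbers and leaves the plotting itself, which would need a charting tool, out of scope.

```python
from itertools import accumulate

# Class midpoints and frequencies from the chapter's frequency distribution example.
midpoints = [15, 25, 35, 45, 55]
frequencies = [3, 6, 5, 4, 2]

total = sum(frequencies)

# Percentage polygon: one point per class at (midpoint, class percentage).
percentages = [100 * f / total for f in frequencies]
polygon_points = list(zip(midpoints, percentages))

# Ogive: running total of the class percentages, ending at 100.
cumulative_percentages = list(accumulate(percentages))

for mid, pct, cum in zip(midpoints, percentages, cumulative_percentages):
    print(f"midpoint {mid}: {pct:.0f}% of observations, {cum:.0f}% cumulative")
```

Each cumulative percentage is the share of observations that fall below the upper boundary of the corresponding class.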
The Polygons
Types of Relationships
■ Linear Relationships
■ Curvilinear Relationships
■ No Relationship
Summary
■END OF CHAPTER 2