2 Graphical Descriptive Techniques 1
Statistics
DR. LEONARDO C. MEDINA, JR.
Chapter Two
Graphical
Descriptive Techniques 1
2.2
Introduction & Re-cap…
Descriptive statistics involves arranging, summarizing, and
presenting a set of data in such a way that useful information
is produced.
[Diagram: Data → Statistics → Information]
2.3
Populations & Samples
[Diagram: Population → Sample (a subset of the population)]
The graphical & tabular methods presented here apply to both entire
populations and samples drawn from populations.
2.4
Definitions…
A variable is some characteristic of a population or sample.
E.g. student grades.
Typically denoted with a capital letter: X, Y, Z…
2.5
Types of Data & Information
Data (at least for purposes of Statistics) fall into three main
groups:
Interval Data
Nominal Data
Ordinal Data
2.6
Interval Data…
Interval data
• Real numbers, e.g. heights, weights, prices, etc.
• Also referred to as quantitative or numerical.
2.7
Nominal Data…
Nominal Data
• The values of nominal data are categories.
E.g. responses to questions about marital status, coded
as:
Single = 1, Married = 2, Divorced = 3, Widowed = 4
2.8
Ordinal Data…
Ordinal Data appear to be categorical in nature, but their
values have an order, a ranking.
2.10
Hierarchy of Data…
Interval
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Ordinal
Values must represent the ranked order of the data.
Calculations based on an ordering process are valid.
Data may be treated as nominal but not as interval.
Nominal
Values are the arbitrary numbers that represent categories.
Only calculations based on the frequencies of occurrence are valid.
Data may not be treated as ordinal or interval.
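The hierarchy above can be illustrated with a minimal Python sketch. All data values below are invented for illustration (they do not come from the slides), and the standard-library `statistics` and `collections` modules are just one possible choice of tools.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical data, invented for illustration only.
prices = [12.5, 8.0, 10.0, 9.5]  # interval: real numbers
ratings = [1, 3, 3, 4, 5]        # ordinal: codes for ranked categories (1 = lowest)
marital = [1, 2, 2, 4, 2]        # nominal: Single = 1, Married = 2, Divorced = 3, Widowed = 4

# Interval data: all calculations are valid, e.g. the mean.
mean_price = mean(prices)

# Ordinal data: calculations based on ordering are valid, e.g. the median.
median_rating = median(ratings)

# Nominal data: only frequencies of occurrence are valid.
marital_counts = Counter(marital)
most_common_code, most_common_freq = marital_counts.most_common(1)[0]

print(mean_price, median_rating, most_common_code, most_common_freq)
```

Averaging the `marital` codes would run without error, but the result would be meaningless because the codes are arbitrary labels.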
2.11
Graphical & Tabular Techniques for Nominal Data…
2.12
Example 2.1 Work Status in the GSS 2012 Survey
[GSS2012*] In Chapter 1 we briefly introduced the General Social Survey.
In the 2012 survey respondents were asked the following question:
"Last week were you working full time, part time, going to school, keeping
house, or what?" The responses were:
1. Working full time
2. Working part time
3. Temporarily not working
4. Unemployed, laid off
5. Retired
6. School
7. Keeping house
8. Other
The responses were recorded using the codes 1, 2, 3, 4, 5, 6, 7, and 8,
respectively.
2.13
Frequency and Relative Frequency Distributions
2.14
Nominal Data (Frequency)
[Figure: bar chart of frequency by WRKSTAT code; pie chart of relative frequency by WRKSTAT code]
WRKSTAT   Frequency   Relative Frequency
1         912         46.2%
2         226         11.5%
3          40          2.0%
4         104          5.3%
5         357         18.1%
6          70          3.5%
7         210         10.6%
8          54          2.7%
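The relative frequencies shown in the pie chart can be recomputed from the bar chart frequencies. The following is a minimal sketch. It assumes each count belongs to the WRKSTAT code whose pie chart percentage it matches, since the slides do not list the pairs explicitly.

```python
# Frequencies by WRKSTAT code, read from the bar chart labels.
# Assumption: each count is paired with its code by matching the
# pie chart percentages; the slides do not state the pairs directly.
frequencies = {1: 912, 2: 226, 3: 40, 4: 104, 5: 357, 6: 70, 7: 210, 8: 54}

total = sum(frequencies.values())

# Relative frequency = frequency / total number of responses.
relative = {code: count / total for code, count in frequencies.items()}

for code in sorted(frequencies):
    print(f"{code}  {frequencies[code]:4d}  {relative[code]:.1%}")
```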
2.17
Describing the Relationship between Two Nominal Variables
To describe the relationship between two nominal variables, we must
remember that we are permitted only to determine the frequency of the
values. As a first step we need to produce a cross-classification table,
which lists the frequency of each combination of the values of the two
variables.
2.18
Example 2.4 Newspaper Readership Survey
In a major North American city there are four competing
newspapers: the Globe and Mail (G&M), Post, Sun, and
Star. To help design advertising campaigns, the advertising
managers of the newspapers need to know which segments
of the newspaper market are reading their papers. A survey
was conducted to analyze the relationship between
newspapers read and occupation. A sample of newspaper
readers was asked to report which newspaper they read,
Globe and Mail (1), Post (2), Star (3), or Sun (4), and to
indicate whether they were a blue-collar worker (1), a
white-collar worker (2), or a professional (3). The responses
are stored in Xm02-04 using these codes. Some of the data are
listed here.
2.19
Example 2.4
Reader   Occupation   Newspaper
1        2            2
2        1            4
3        2            1
.        .            .
.        .            .
352      3            2
353      1            3
354      2            3
Determine whether the two nominal variables are related.
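As a first step, the coded records can be tallied into a cross-classification table. The sketch below uses only the six records listed above (readers 1 to 3 and 352 to 354), not the full Xm02-04 file, so its counts are illustrative. The variable names are assumptions made for this example.

```python
from collections import Counter

# The six (occupation, newspaper) pairs listed in Example 2.4;
# the full data set has 354 readers.
records = [(2, 2), (1, 4), (2, 1), (3, 2), (1, 3), (2, 3)]

occupations = {1: "Blue collar", 2: "White collar", 3: "Professional"}
newspapers = {1: "G&M", 2: "Post", 3: "Star", 4: "Sun"}

# Count how often each combination of values occurs.
counts = Counter(records)

# Print one row per occupation and one column per newspaper.
print(f"{'Occupation':<14}" + "".join(f"{name:>6}" for name in newspapers.values()))
for occ_code, occ_name in occupations.items():
    row = [counts[(occ_code, paper_code)] for paper_code in newspapers]
    print(f"{occ_name:<14}" + "".join(f"{n:>6}" for n in row))
```

Combinations that never occur are reported as 0 because a `Counter` returns 0 for missing keys.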
2.20
Cross-Classification Table of Frequencies
                      Newspaper
Occupation      G&M   Post   Star   Sun   Total
Blue collar      27     18     38    37     120
White collar     29     43     21    15     108
Professional     33     51     22    20     126
Total            89    112     81    72     354
2.21
Row Relative Frequencies
                      Newspaper
Occupation      G&M   Post   Star   Sun   Total
Blue collar     .23    .15    .32   .31    1.00
White collar    .27    .40    .19   .14    1.00
Professional    .26    .40    .17   .16    1.00
Total           .25    .32    .23   .20    1.00
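Row relative frequencies are obtained by dividing each cell by its row total. A minimal sketch using the frequencies from the cross-classification table follows. The variable names are assumptions made for this example.

```python
newspapers = ["G&M", "Post", "Star", "Sun"]

# Frequencies from the cross-classification table (rows: occupation).
table = {
    "Blue collar":  [27, 18, 38, 37],
    "White collar": [29, 43, 21, 15],
    "Professional": [33, 51, 22, 20],
}

# Divide each cell by its row total so every row sums to 1.
row_relative = {
    occupation: [count / sum(counts) for count in counts]
    for occupation, counts in table.items()
}

print(f"{'Occupation':<14}" + "".join(f"{name:>6}" for name in newspapers))
for occupation, shares in row_relative.items():
    print(f"{occupation:<14}" + "".join(f"{share:6.2f}" for share in shares))
```

Because each row is scaled by its own total, the occupations can be compared directly even though their sample sizes differ.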
2.22
Graphing the Relationship between 2 Nominal Variables
[Figure: bar charts of newspaper frequencies (G&M, Post, Star, Sun), one chart per occupation (Blue collar, White collar, Professional); vertical axis from 0 to 60]
2.23
INTERPRET
If the two variables are unrelated, the patterns exhibited in
the bar charts should be approximately the same. If some
relationship exists, then some bar charts will differ from
others.
The graphs tell us the same story as did the table. The shapes
of the bar charts for occupations 2 and 3 (White-collar and
Professional) are very similar. Both differ considerably from
the bar chart for occupation 1 (Blue-collar).
2.24