Data Overview
Leguma Bakari
Email: [email protected]
Phone: +255 762 760 095
December 25, 2023
Eastern Africa Statistical Training Center (EASTC)
Outline
Introduction
Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale
Classification of Data
Definitions
Scales of Measurement of Data
Nominal Scale
Nominal Scale Arithmetic Application
• Inequality (< or >)
• From mathematics: 2 > 1.
• For a nominal variable, the equivalent statement would be
• Married > Single, which is meaningless.
• Inequality implies ranking.
• Since inequality cannot be applied, and computing the median
requires ranking the values (ascending or descending),
• the median is an invalid statistic for nominal data.
• Among the three common measures of central tendency (mean,
median and mode),
• the mode is the only valid statistic, since it involves only counts
(frequencies).
• In conclusion, no arithmetic or ordering operation can be applied
to nominal data; only counting the categories is meaningful.
Marital Status   Frequency
Single           23
Married          67
Widowed          19
Divorced         32
• In the frequency table above, Married is the mode, since it has the
highest frequency (67).
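As a minimal sketch of how the mode is read off a frequency table, the counts above can be placed in a Python `Counter`. The variable names are illustrative only; the frequencies are the ones in the table.

```python
from collections import Counter

# Frequencies copied from the marital status table above.
marital_status = Counter({"Single": 23, "Married": 67, "Widowed": 19, "Divorced": 32})

# most_common(1) returns the (category, count) pair with the highest count.
mode_category, mode_count = marital_status.most_common(1)[0]
print(mode_category, mode_count)  # Married 67

# Sorting the labels only orders them alphabetically; it does not mean
# that one marital status is "greater" than another.
print(sorted(marital_status))
```

Counting is the only operation used here, which is why the mode remains valid for nominal data.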
Ordinal Scale
Ordinal Scale Arithmetic Application
• Inequality (< or >)
• From mathematics: 2 > 1.
• For an ordinal variable, the equivalent statement is
• Secondary > Primary, which is meaningful.
• Inequality implies ranking.
• Since inequality can be applied, and computing the median
requires ranking the values (ascending or descending),
• the median is a valid statistic for ordinal data.
• Among the three common measures of central tendency (mean,
median and mode),
• the mode and the median are the only valid statistics for summarization.
• In conclusion, inequality is the only mathematical operation that can
be applied to ordinal data.
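The ranking argument can be sketched in Python as follows. The education levels beyond Primary and Secondary, and the seven responses, are hypothetical values chosen only for illustration.

```python
# Hypothetical ordinal variable: education level, listed from lowest to highest.
LEVELS = ["Primary", "Secondary", "Diploma", "Degree"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical responses from seven people (an odd count, so the median
# is a single category).
responses = ["Secondary", "Primary", "Degree", "Secondary",
             "Diploma", "Primary", "Secondary"]

# Ranking is meaningful for ordinal data, so the responses can be sorted by rank.
ordered = sorted(responses, key=RANK.__getitem__)

# The median is the middle value of the ranked list.
median_level = ordered[len(ordered) // 2]
print(median_level)  # Secondary
```

Only the order of the categories is used; the rank numbers are never added or averaged, which is why the mean stays out of reach for ordinal data.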
Binary Variable
Interval Scale
Interval Scale Arithmetic Application
• Subtraction (−)
• From mathematics: 15 °C − 5 °C = 10 °C, which is possible and
valid.
• Any computation that involves subtraction is valid because of
this property.
• Multiplication (×)
• From mathematics: 15 °C × 0 °C = ?, which has no meaningful
interpretation because 0 °C is an arbitrary zero point.
• Any computation that involves multiplication (e.g. the geometric
mean) is invalid because of this property.
• Geometric mean = $\sqrt[n]{y_1 \times y_2 \times \cdots \times y_n}$
• Division (÷)
• From mathematics: 15 °C ÷ −5 °C = ?, which has no meaningful
interpretation for the same reason.
• Any computation that involves division (e.g. the harmonic mean)
is invalid because of this property.
• Harmonic mean = $\dfrac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$
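One way to see why subtraction is meaningful on an interval scale while division is not is to express the same two temperatures in another unit. The sketch below uses the 15 °C and 5 °C values from the slide; the conversion to Fahrenheit is added here purely for illustration.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

a_c, b_c = 15.0, 5.0
a_f, b_f = celsius_to_fahrenheit(a_c), celsius_to_fahrenheit(b_c)  # 59.0 and 41.0

# Differences agree across units once the scale factor 9/5 is applied.
diff_c = a_c - b_c  # 10.0
diff_f = a_f - b_f  # 18.0, which equals 10.0 * 9 / 5
print(diff_f == diff_c * 9 / 5)  # True

# Ratios do not agree: the answer depends on where the unit puts its zero.
ratio_c = a_c / b_c  # 3.0
ratio_f = a_f / b_f  # about 1.44
print(ratio_c, ratio_f)
```

Because the ratio changes with the choice of unit, a statement such as "15 °C is three times as hot as 5 °C" carries no physical meaning.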
• Inequality (< or >)
• From mathematics: 15 °C > 5 °C, which is possible and valid.
• Inequality implies ranking.
• Since inequality can be applied, and computing the median
requires ranking the values (ascending or descending),
• the median is a valid statistic for interval data.
• All three common measures of central tendency (arithmetic mean,
median and mode) are valid statistics for summarization.
• The geometric mean and the harmonic mean cannot be applied.
• In conclusion, only inequality, addition and subtraction can be
applied to interval data.
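A hedged sketch of the same point for summary statistics: the arithmetic mean of a set of temperatures gives a consistent answer whichever unit is used, while the geometric mean does not. The four temperature readings are hypothetical, and `geometric_mean` requires Python 3.8 or later.

```python
import math
from statistics import geometric_mean, mean

def c_to_f(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

temps_c = [5.0, 10.0, 15.0, 20.0]        # hypothetical readings in Celsius
temps_f = [c_to_f(c) for c in temps_c]   # the same readings in Fahrenheit

# Arithmetic mean: converting the Celsius mean gives the Fahrenheit mean.
mean_agrees = math.isclose(c_to_f(mean(temps_c)), mean(temps_f))

# Geometric mean: converting the Celsius result does not give the Fahrenheit result.
gm_agrees = math.isclose(c_to_f(geometric_mean(temps_c)), geometric_mean(temps_f))

print(mean_agrees)  # True
print(gm_agrees)    # False
```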
Ratio Scale
Ratio Scale Arithmetic Application
• Subtraction (−)
• From mathematics: 150 cm − 50 cm = 100 cm, which is possible
and valid.
• Any computation that involves subtraction is valid because of
this property.
• Multiplication (×)
• From mathematics: 50 cm × 5 cm = 250 cm², which is possible
and valid.
• Any computation that involves multiplication (e.g. the geometric
mean) is valid because of this property.
• Geometric mean = $\sqrt[n]{y_1 \times y_2 \times \cdots \times y_n}$
• Division (÷)
• From mathematics: 50 cm ÷ 5 cm = 10, which is possible and
valid.
• Any computation that involves division (e.g. the harmonic mean)
is valid because of this property.
• Harmonic mean = $\dfrac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$
• Inequality (< or >)
• From mathematics: 50 cm > 5 cm, which is possible and valid.
• Inequality implies ranking.
• Since inequality can be applied, and computing the median
requires ranking the values (ascending or descending),
• the median is a valid statistic for ratio data.
• All three common measures of central tendency (arithmetic mean,
median and mode) are valid statistics for summarization.
• The geometric mean and the harmonic mean can also be applied.
• In conclusion, all mathematical operations can be applied to
ratio data.
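The geometric and harmonic mean formulas for ratio data can be sketched in Python as below. The three lengths reuse the centimetre values from the slide examples as an illustrative sample, and the library functions assume Python 3.8 or later.

```python
import math
from statistics import geometric_mean, harmonic_mean, mean

lengths_cm = [5.0, 50.0, 150.0]  # illustrative sample of lengths in cm
n = len(lengths_cm)

# Geometric mean: the n-th root of the product of the values.
gm_formula = math.prod(lengths_cm) ** (1 / n)

# Harmonic mean: n divided by the sum of the reciprocals.
hm_formula = n / sum(1 / x for x in lengths_cm)

# The standard library gives the same results.
gm = geometric_mean(lengths_cm)
hm = harmonic_mean(lengths_cm)
am = mean(lengths_cm)
print(math.isclose(gm, gm_formula), math.isclose(hm, hm_formula))  # True True

# Changing the unit (cm to m) rescales the geometric mean by the same factor,
# because a ratio scale has a true zero.
lengths_m = [x / 100 for x in lengths_cm]
print(math.isclose(geometric_mean(lengths_m), gm / 100))  # True
```

For positive values the three means are ordered as harmonic ≤ geometric ≤ arithmetic, which can serve as a quick sanity check on any computed result.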
Classification of Data
Types of Data in Terms of Nature
Types of Data in Terms of Source
Primary Data
Secondary Data
Types of Data in Terms of Design
Time Series Data
Panel (Longitudinal) Data
Summary