
02 Data Types Overview (Moodle Slides)

The document defines and provides examples of different types of data that can be present in a data set, including: 1) Univariate, bivariate, and multivariate data depending on the number of variables. 2) Qualitative and quantitative variables that can be nominal, ordinal, discrete, or continuous. 3) Examples of primary and secondary data sources and notation for variable names in data sets.

Uploaded by

Thabang Khowa

Exploring data Types of data

Outcomes
Know the definitions

Identify the four types of data in a data set


Exploring data Types of data

E.g.:

Population of cars: N = 96
Sample of cars: n = 12

Sampled without replacement
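
As a rough illustration of what "sampled without replacement" means, the following Python sketch draws n = 12 cars from a population of N = 96. The numbering of the cars from 1 to 96 and the seed value are assumptions made only for this example; the slides do not say how the sample was actually drawn.

```python
import random

N = 96  # population size from the slide
n = 12  # sample size from the slide

# Assumption for illustration: cars in the population are labelled 1..N.
population = list(range(1, N + 1))

# Fixed, arbitrary seed so the draw can be repeated.
rng = random.Random(2024)

# random.sample draws WITHOUT replacement: no car can be chosen twice.
sample = rng.sample(population, n)
print("Without replacement:", sorted(sample))
print("Distinct cars:", len(set(sample)))  # always equals n

# For contrast, random.choices draws WITH replacement: repeats are possible.
sample_with_replacement = rng.choices(population, k=n)
print("With replacement:", sorted(sample_with_replacement))
```

In the first draw every selected car is distinct; in the second the same car may appear more than once.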


Exploring data Types of data

E.g.: # = Cases / Sample points / Elements / Observations

Each row represents a case / sample point / element / observation.

Population of cars: N = 96
Sample of cars: n = 12
Sampled without replacement

#   Car
01
02
03
04
05
06
07
08
09
10
11
12
Exploring data Types of data

E.g.: # = Cases / Sample points / Elements / Observations

Each row represents a case / sample point / element / observation.

#   Car
01
02
03
04
05
06
07
08
09
10
11
12

Note:
This is only a list of labels: Car 1, Car 2, …, Car 12.
This is not a data set because there are no columns containing information that describes the observations.
Exploring data Types of data

E.g.: # = Cases / Sample points / Elements / Observations

Each row represents a case / sample point / element / observation.

No specific order (Raw data)

#   Car  Colour
01       Red
02       Yellow
03       Blue
04       Red
05       Blue
06       Grey
07       Green
08       Green
09       Blue
10       Green
11       Green
12       Yellow

Now we have a data set because we have listed something that describes the observations (cars).
Exploring data Types of data

E.g.: # = Cases / Sample points / Elements / Observations

Each row represents a case / sample point / element / observation.

No specific order (Raw data)

#   Car  Colour
01       Red
02       Yellow
03       Blue
04       Red
05       Blue
06       Grey
07       Green
08       Green
09       Blue
10       Green
11       Green
12       Yellow

Grouped data (Arranged)

#Red  #Yellow  #Blue  #Green  #Grey
2     2        3      4       1
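
Turning raw data into grouped data amounts to counting how often each value occurs. A minimal Python sketch of that step, using the twelve colours listed in the raw table; the variable names are chosen only for this example.

```python
from collections import Counter

# Raw data: the Colour column, in case order 01..12.
colours = [
    "Red", "Yellow", "Blue", "Red", "Blue", "Grey",
    "Green", "Green", "Blue", "Green", "Green", "Yellow",
]

# Grouped data: one count per distinct colour.
counts = Counter(colours)

for colour, count in counts.items():
    print(f"#{colour}: {count}")

# The counts must add up to the sample size n.
print("Total:", sum(counts.values()))
```

A quick check on any grouped table is that the counts add up to the sample size, here n = 12.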
Exploring data Types of data

E.g.: Variable / Characteristic = Thing of interest. Can assume many different values.

#   Car  Colour
01       Red
02       Yellow
03       Blue
04       Red
05       Blue
06       Grey
07       Green
08       Green
09       Blue
10       Green
11       Green
12       Yellow

Only one variable ⇒ univariate data
‘Colour’ is a category ⇒ qualitative variable
‘Colour’ has no natural order ⇒ nominal variable
Exploring data Types of data

E.g.: Variable / Characteristic = Thing of interest. Can assume many different values.

#   Car  Colour  Age
01       Red     5
02       Yellow  9
03       Blue    3
04       Red     2
05       Blue    19
06       Grey    8
07       Green   15
08       Green   7
09       Blue    10
10       Green   6
11       Green   3
12       Yellow  4

Two variables ⇒ bivariate data
‘Age’ is a measurement ⇒ quantitative variable
‘Age’ (in years) takes only whole-number values ⇒ discrete variable
Exploring data Types of data

E.g.: Variable / Characteristic = Thing of interest. Can assume many different values.

#   Car  Colour  Age  Distance
01       Red     5    50016.30
02       Yellow  9    127694.00
03       Blue    3    36011.50
04       Red     2    27558.20
05       Blue    19   240321.20
06       Grey    8    93483.60
07       Green   15   211027.70
08       Green   7    92577.30
09       Blue    10   125465.60
10       Green   6    88640.00
11       Green   3    38401.60
12       Yellow  4    43605.20

Three variables ⇒ multivariate data
‘Distance’ is a measurement ⇒ quantitative variable
‘Distance’ can take any value in a range, including decimals ⇒ continuous variable
Exploring data Types of data

E.g.: Variable / Characteristic = Thing of interest. Can assume many different values.

#   Car  Colour  Age  Distance   Risk
01       Red     5    50016.30   High
02       Yellow  9    127694.00  Medium
03       Blue    3    36011.50   High
04       Red     2    27558.20   High
05       Blue    19   240321.20  Low
06       Grey    8    93483.60   Medium
07       Green   15   211027.70  Low
08       Green   7    92577.30   Medium
09       Blue    10   125465.60  Low
10       Green   6    88640.00   Medium
11       Green   3    38401.60   High
12       Yellow  4    43605.20   High

Four variables ⇒ multivariate data
‘Risk’ is not a measurement ⇒ qualitative variable
‘Risk’ has a natural order ⇒ ordinal variable
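
The four variable types can be told apart by what one may sensibly do with the values. The Python sketch below stores the sample as plain tuples, records the classification given on the slides, and then uses it: ordinal values can be ranked, quantitative values can be averaged. The names `cars`, `variable_types` and `RISK_ORDER`, and the numeric ranks 0 to 2, are assumptions made for this illustration; only the order Low < Medium < High matters.

```python
# Rows copied from the slide: (colour, age in years, distance, risk).
cars = [
    ("Red", 5, 50016.30, "High"),
    ("Yellow", 9, 127694.00, "Medium"),
    ("Blue", 3, 36011.50, "High"),
    ("Red", 2, 27558.20, "High"),
    ("Blue", 19, 240321.20, "Low"),
    ("Grey", 8, 93483.60, "Medium"),
    ("Green", 15, 211027.70, "Low"),
    ("Green", 7, 92577.30, "Medium"),
    ("Blue", 10, 125465.60, "Low"),
    ("Green", 6, 88640.00, "Medium"),
    ("Green", 3, 38401.60, "High"),
    ("Yellow", 4, 43605.20, "High"),
]

# Classification of each variable as stated on the slides.
variable_types = {
    "Colour": ("qualitative", "nominal"),
    "Age": ("quantitative", "discrete"),
    "Distance": ("quantitative", "continuous"),
    "Risk": ("qualitative", "ordinal"),
}

# Ordinal: categories have a natural order, so cases can be ranked.
# The rank numbers are hypothetical; only their order carries meaning.
RISK_ORDER = {"Low": 0, "Medium": 1, "High": 2}
sorted_by_risk = sorted(cars, key=lambda car: RISK_ORDER[car[3]])

# Quantitative: arithmetic such as a mean is meaningful.
mean_age = sum(car[1] for car in cars) / len(cars)
mean_distance = sum(car[2] for car in cars) / len(cars)

# Nominal: no order and no arithmetic; values can only be compared for equality.
green_cars = [car for car in cars if car[0] == "Green"]

print(variable_types)
print("Lowest-risk case:", sorted_by_risk[0])
print("Mean age:", round(mean_age, 2))
print("Number of green cars:", len(green_cars))
```

Sorting by ‘Colour’ or averaging ‘Risk’ would run without error in most tools, but the result would have no statistical meaning, which is why the type of each variable should be identified first.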
Exploring data Types of data

E.g.:

Sample of cars: n = 12

#   Car  Colour  Age  Distance   Risk
01       Red     5    50016.30   High
02       Yellow  9    127694.00  Medium
03       Blue    3    36011.50   High
04       Red     2    27558.20   High
05       Blue    19   240321.20  Low
06       Grey    8    93483.60   Medium
07       Green   15   211027.70  Low
08       Green   7    92577.30   Medium
09       Blue    10   125465.60  Low
10       Green   6    88640.00   Medium
11       Green   3    38401.60   High
12       Yellow  4    43605.20   High
Exploring data Types of data

E.g.:

Sample of cars: n = 12

#   Car  Colour  Age  Distance   Risk
01  04   Red     5    50016.30   High
02  54   Yellow  9    127694.00  Medium
03  14   Blue    3    36011.50   High
04  18   Red     2    27558.20   High
05  09   Blue    19   240321.20  Low
06  45   Grey    8    93483.60   Medium
07  68   Green   15   211027.70  Low
08  07   Green   7    92577.30   Medium
09  43   Blue    10   125465.60  Low
10  66   Green   6    88640.00   Medium
11  27   Green   3    38401.60   High
12  08   Yellow  4    43605.20   High
Exploring data Types of data

E.g.:

#   Car  Colour  Age  Distance   Risk
01  04   Red     5    50016.30   High
02  54   Yellow  9    127694.00  Medium
03  14   Blue    3    36011.50   High
04  18   Red     2    27558.20   High
05  09   Blue    19   240321.20  Low
06  45   Grey    8    93483.60   Medium
07  68   Green   15   211027.70  Low
08  07   Green   7    92577.30   Medium
09  43   Blue    10   125465.60  Low
10  66   Green   6    88640.00   Medium
11  27   Green   3    38401.60   High
12  08   Yellow  4    43605.20   High

New data collected by the researcher through experimentation / observation / survey ⇒ Primary data
Otherwise ⇒ Secondary data
Exploring data Types of data – Notation

E.g.:

Variable names
• Always single capitals
• Usually (but not always) the last letters of the alphabet, like X, Y, Z

#   Car  C       A    D          R
01  04   Red     5    50016.30   High
02  54   Yellow  9    127694.00  Medium
03  14   Blue    3    36011.50   High
04  18   Red     2    27558.20   High
05  09   Blue    19   240321.20  Low
06  45   Grey    8    93483.60   Medium
07  68   Green   15   211027.70  Low
08  07   Green   7    92577.30   Medium
09  43   Blue    10   125465.60  Low
10  66   Green   6    88640.00   Medium
11  27   Green   3    38401.60   High
12  08   Yellow  4    43605.20   High
Exploring data Types of data – Notation

E.g.:

Variable names
• Always single capitals
• Usually (but not always) the last letters of the alphabet, like X, Y, Z

Variable values
• Always lower case version of variable name
• Must have a subscript / index equal to the case number

#   Car  C    A    D    R
01  04   c₀₁  a₀₁  d₀₁  r₀₁
02  54   c₀₂  a₀₂  d₀₂  r₀₂
03  14   c₀₃  a₀₃  d₀₃  r₀₃
04  18   c₀₄  a₀₄  d₀₄  r₀₄
05  09   c₀₅  a₀₅  d₀₅  r₀₅
06  45   c₀₆  a₀₆  d₀₆  r₀₆
07  68   c₀₇  a₀₇  d₀₇  r₀₇
08  07   c₀₈  a₀₈  d₀₈  r₀₈
09  43   c₀₉  a₀₉  d₀₉  r₀₉
10  66   c₁₀  a₁₀  d₁₀  r₁₀
11  27   c₁₁  a₁₁  d₁₁  r₁₁
12  08   c₁₂  a₁₂  d₁₂  r₁₂
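
The subscript notation maps directly onto indexing a list. In the Python sketch below the variable A (Age) holds the twelve ages from the sample, and a small helper returns the value for a given case number. The helper name `value` is hypothetical, and the shift by one is needed only because Python lists start at index 0 while the case numbers start at 1.

```python
# Variable A (Age): the values a_01, a_02, ..., a_12 in case order.
A = [5, 9, 3, 2, 19, 8, 15, 7, 10, 6, 3, 4]

def value(variable, i):
    """Return the value for case number i (1-based, as on the slides)."""
    return variable[i - 1]  # Python lists are 0-based

a_01 = value(A, 1)   # value of Age for case 01
a_05 = value(A, 5)   # value of Age for case 05
a_12 = value(A, 12)  # value of Age for case 12

n = len(A)  # sample size = number of cases

print(a_01, a_05, a_12, n)
```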
Exploring data Types of data – Summary

In general:

#   X   Y   Z
1   x₁  y₁  z₁
2   x₂  y₂  z₂
3   x₃  y₃  z₃
⋮   ⋮   ⋮   ⋮
n   xₙ  yₙ  zₙ

Data
Sample size = #Cases

• Sampling with / without replacement
• Raw / grouped data
• Primary / secondary data
• Univariate / bivariate / multivariate data
• Variables
  • Names are single capitals
  • Values are lower case with case# index
  • Qualitative
    • Ordinal
    • Nominal
  • Quantitative
    • Discrete
    • Continuous
