Lecture 1.4
Usha Mohan
Statistics for Data Science -1
Learning objectives
1. What is statistics?
   - Descriptive statistics, inferential statistics.
   - Distinguish between a sample and a population.
2. Understand how data are collected.
   - Identify variables and cases (observations) in a data set.
3. Types of data
   - Classify data as categorical (qualitative) or numerical (quantitative) data.
   - Understand cross-sectional versus time-series data.
   - Measurement scales.
4. Creating data sets; downloading and manipulating data sets; working on subsets of data.
5. Framing questions that can be answered from data.
Introduction
Basic definitions
Population and sample
Understanding data
Classification of data
Categorical and numerical
Cross-sectional versus time-series data
Scales of measurement
Scales of measurement
- If the data have all the properties of ordinal data and the interval between values is expressed in terms of a fixed unit of measure, then the scale of measurement is the interval scale.
- Interval data are always numeric. The difference between any two values can be computed and is meaningful.
- Ratios of values have no meaning here because the value of zero is arbitrary.
- Interval: numerical values that can be added or subtracted (no absolute zero).
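The point about the arbitrary zero can be checked numerically. The sketch below is illustrative only: the two readings and the offset are made-up values, and `shift_zero` is a hypothetical helper, not something from the lecture. It moves the zero of a scale and compares differences and ratios before and after the move.

```python
def shift_zero(value, offset):
    """Re-express a reading on a scale whose zero is moved by `offset` (hypothetical helper)."""
    return value + offset


# Two made-up readings on an interval scale.
low, high = 10.0, 20.0
offset = 5.0  # an arbitrary choice of new zero point

low_shifted = shift_zero(low, offset)
high_shifted = shift_zero(high, offset)

# Differences are unchanged by the choice of zero ...
diff_original = high - low
diff_shifted = high_shifted - low_shifted

# ... but ratios depend on where zero sits.
ratio_original = high / low
ratio_shifted = high_shifted / low_shifted

print("differences:", diff_original, diff_shifted)
print("ratios:", ratio_original, ratio_shifted)
```

The differences agree while the ratios do not, which is why ratio statements are avoided for interval data.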
Example: temperature
- Suppose the possible responses to a question on how hot the day is are "comfortable" and "uncomfortable"; then temperature as a variable is nominal.
- Suppose the temperature of a liquid is recorded as cold, warm, or hot; then the variable is ordinal.
- Example: Consider an air-conditioned room where the temperature is set at 20°C, while the temperature outside the room is 40°C. It is correct to say that the difference in temperature is 20°C, but it is incorrect to say that outdoors is twice as hot as indoors.
- Temperature in degrees Fahrenheit or degrees Celsius (centigrade) is an interval variable: neither scale has an absolute zero.

                   Celsius (°C)   Fahrenheit (°F)
  Freezing point   0              32
  Boiling point    100            212
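The air-conditioned room example can be worked through in code. This is a minimal sketch, assuming the standard conversion F = C × 9/5 + 32, which agrees with the freezing and boiling points in the table; the function name `celsius_to_fahrenheit` is chosen here for illustration.

```python
def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit (0 -> 32, 100 -> 212)."""
    return celsius * 9 / 5 + 32


indoor_c, outdoor_c = 20, 40  # the room example from the slide

indoor_f = celsius_to_fahrenheit(indoor_c)    # 68.0
outdoor_f = celsius_to_fahrenheit(outdoor_c)  # 104.0

# The difference is meaningful on both scales (20 Celsius degrees = 36 Fahrenheit degrees).
diff_c = outdoor_c - indoor_c   # 20
diff_f = outdoor_f - indoor_f   # 36.0

# The ratio changes with the scale, so "twice as hot" is not a valid statement.
ratio_c = outdoor_c / indoor_c  # 2.0
ratio_f = outdoor_f / indoor_f  # about 1.53

print("difference:", diff_c, "C,", diff_f, "F")
print("ratio:", ratio_c, "in Celsius,", round(ratio_f, 2), "in Fahrenheit")
```

The same two temperatures give a ratio of 2.0 in Celsius but roughly 1.53 in Fahrenheit, so the ratio reflects the choice of scale rather than the temperatures themselves.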
- If the data have all the properties of interval data and the ratio of two values is meaningful, then the scale of measurement is the ratio scale.
- Examples: height, weight, age, marks, etc.
- Ratio: numerical values that can be added, subtracted, multiplied, or divided (makes ratio comparisons possible).
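By contrast with the interval scale, a ratio-scale variable keeps its ratios when the unit changes, because zero means "none" in every unit. The sketch below uses height as the example; the two heights are made-up values and `cm_to_inches` is a hypothetical helper.

```python
import math

CM_PER_INCH = 2.54  # one inch is defined as exactly 2.54 cm


def cm_to_inches(cm):
    """Convert a length in centimetres to inches (hypothetical helper)."""
    return cm / CM_PER_INCH


# Two made-up heights in centimetres.
taller_cm, shorter_cm = 180.0, 90.0

ratio_cm = taller_cm / shorter_cm
ratio_in = cm_to_inches(taller_cm) / cm_to_inches(shorter_cm)

# The ratio is the same in both units, up to floating-point rounding.
print(ratio_cm, ratio_in, math.isclose(ratio_cm, ratio_in))
```

Here "twice as tall" holds in centimetres and in inches alike, which is what makes ratio comparisons valid on this scale.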
Summary