Course content
Probability is a field of mathematics that investigates
the behaviour of mathematically defined random
phenomena.
Statistics attempts to describe, model and interpret the
behaviour of observed random phenomena.
In this course, we will learn probability in order to use it
as a modelling device in statistics.
Learning outcomes
After passing the course the student
knows:
1 the basic concepts and rules of probability
2 the basic properties of one- and two-dimensional
discrete and continuous probability distributions
3 common one- and two-dimensional discrete and continuous
probability distributions and knows how to apply them to
simple random phenomena
4 the basic properties of the bivariate normal distribution
5 the basic methods for collecting and describing statistical
data
6 how to apply basic methods of estimation and testing in
simple problems of statistical inference
7 the basic concepts of statistical dependence, correlation and
What is statistics?
Statistics is a collection of tools to study uncertain data.
The observed data itself is not statistics. Statistics is the
conclusions we can draw from our observations, and the
techniques to draw these conclusions.
Applicable whenever there is quantifiable data available.
Terminology
Population is the set that contains all possible objects
of a statistical experiment.
Unit is an element of the population.
Sample is a subset of the population.
Observation is an observed value of a variable attached to
each unit in the sample.
Statistical data is the collection of all observations.
Why statistics?
We want to learn something about an entire population, but
cannot afford to collect (or store) all the data we would want.
We want to draw the strongest conclusions we can from limited
data.
Perhaps counterintuitively, to get a useful sample, we want to
know as little as possible about the sample, i.e. the sample
should be selected randomly.
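As a minimal sketch of random selection, the snippet below draws a simple random sample without replacement using Python's standard library. The population of 1000 numbered units, the sample size 50 and the seed are all assumptions made for illustration; none of them come from the course material.

```python
import random

# Hypothetical population: unit identifiers 1..1000 (illustrative only).
population = list(range(1, 1001))

# A fixed seed makes the illustration reproducible.
rng = random.Random(0)

# Simple random sample without replacement:
# every subset of 50 units is equally likely to be chosen.
sample = rng.sample(population, k=50)

print(len(sample), "units drawn, all distinct:", len(set(sample)) == len(sample))
```

Because the selection ignores every property of the units, no property can systematically influence which units are observed.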
What is “typical” anyway?
Assume we have a data set $S = \{x_1, \ldots, x_n\}$ of $n$
numerical observations.
Three different notions: mean, median and mode
Mean is the “average” value: $\bar{x} = \frac{x_1 + \cdots + x_n}{n}$.
Median is the “center” value: order the sample such that
$x_1 \le x_2 \le \cdots \le x_n$.
If $n = 2k - 1$ is odd, then the median is $x_k$.
If $n = 2k$ is even, then the median is the average of $x_k$ and $x_{k+1}$.
Mode is the most frequent value (it might not be unique).
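The three definitions can be written out directly. The following is a sketch, not a reference implementation: the function names `mean`, `median` and `modes` and the data set are chosen here for illustration. The function `modes` returns a list because the mode need not be unique.

```python
from collections import Counter

def mean(xs):
    """Arithmetic mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle value of the ordered sample (average of the two middle values if n is even)."""
    s = sorted(xs)
    n = len(s)
    k = n // 2
    if n % 2 == 1:
        return s[k]              # odd n: the single middle value
    return (s[k - 1] + s[k]) / 2  # even n: average of the two middle values

def modes(xs):
    """All values that attain the highest frequency."""
    counts = Counter(xs)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

# Hypothetical data set with an even number of observations and two modes.
data = [1, 2, 2, 3, 3, 7]
print(mean(data), median(data), modes(data))
```

For this data set the mean is 3, the median is 2.5, and both 2 and 3 are modes.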
Example
S = {−8, 0, 1, 1, 2, 2, 2}
Mean = $\frac{-8+0+1+1+2+2+2}{7} = 0$, Median = 1, Mode = 2
Example
S = {−16, 1, 1, 2, 3, 4, 5}
Mean = $\frac{-16+1+1+2+3+4+5}{7} = 0$, Median = 2, Mode = 1
Example
S = {−8, −1, −1, 1, 2, 3, 4}
Mean = $\frac{-8-1-1+1+2+3+4}{7} = 0$, Median = 1, Mode = −1
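The three examples can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names `examples` and `summary` are chosen here. All three data sets have mean 0, yet their medians and modes differ.

```python
import statistics

# The three example data sets from the slides.
examples = [
    [-8, 0, 1, 1, 2, 2, 2],
    [-16, 1, 1, 2, 3, 4, 5],
    [-8, -1, -1, 1, 2, 3, 4],
]

# One (mean, median, mode) triple per data set.
summary = [
    (statistics.mean(s), statistics.median(s), statistics.mode(s))
    for s in examples
]

for s, (m, med, mo) in zip(examples, summary):
    print(s, "mean:", m, "median:", med, "mode:", mo)
```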
Mean (or average) value
If $x_i = a + b y_i$ for all $i$, then $\bar{x} = a + b \bar{y}$.
The average of {100, 400, −200, 1000} can be computed as
$100 \cdot \frac{1 + 4 - 2 + 10}{4}$.
The average of {127, 99, 82, 104} can be computed as
$100 + \frac{27 - 1 - 18 + 4}{4}$.
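Both shortcuts can be confirmed numerically. The sketch below uses the two data sets from the slide; the helper `mean` and the variable names are chosen here for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Scaling: x_i = 100 * y_i, so the mean of x equals 100 * (mean of y).
ys = [1, 4, -2, 10]
xs = [100 * y for y in ys]          # [100, 400, -200, 1000]
scaled = 100 * mean(ys)
direct_scaled = mean(xs)

# Shifting: w_i = 100 + z_i, so the mean of w equals 100 + (mean of z).
zs = [27, -1, -18, 4]
ws = [100 + z for z in zs]          # [127, 99, 82, 104]
shifted = 100 + mean(zs)
direct_shifted = mean(ws)

print(scaled, direct_scaled)    # both equal 325.0
print(shifted, direct_shifted)  # both equal 103.0
```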
If a sample is composed of several smaller samples, then the
mean of the whole sample can be computed as a weighted
average of the means of the smaller samples.
Let the sample $x$ consist of $r$ parts $x_1, x_2, \ldots, x_r$, where $x_i$
consists of $n_i$ units and $n_1 + \cdots + n_r = N$.
If $\bar{x}_i$ denotes the mean of the $i$-th part, then
$\bar{x} = \frac{n_1}{N} \bar{x}_1 + \cdots + \frac{n_r}{N} \bar{x}_r$.
This is in general not the same as the unweighted mean of the
part means, because larger parts must be given larger weight.
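A small numerical sketch of the difference follows. The two parts below are hypothetical and deliberately unequal in size (2 and 6 units), and all names are chosen for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical sample split into r = 2 parts of unequal size.
parts = [[2, 4], [10, 10, 10, 10, 10, 10]]

N = sum(len(p) for p in parts)  # total number of units

# Weighted average of the part means, with weights n_i / N.
weighted = sum(len(p) / N * mean(p) for p in parts)

# Mean computed directly from the pooled sample.
pooled = mean([x for p in parts for x in p])

# Unweighted mean of the part means, which ignores the part sizes.
mean_of_means = mean([mean(p) for p in parts])

print(weighted, pooled, mean_of_means)
```

Here the weighted average and the pooled mean are both 8.25, while the unweighted mean of the part means is 6.5.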
Median value
The median is useful when we want to ignore outliers.
If we want to understand the typical standard of living in a
developing country, it is useful to compare the median income
to the poverty line, but not the mean income.
The median does not require that the data can be meaningfully
added and subtracted, only that the data can be ordered.
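The effect of an outlier on the two measures can be illustrated with invented numbers. The incomes and the poverty line below are hypothetical, in arbitrary units, and are not real data.

```python
import statistics

# Hypothetical incomes (arbitrary units); one unit earns far more than the rest.
incomes = [2, 3, 3, 4, 5, 5, 6, 7, 500]

# Hypothetical poverty line, assumed only for this illustration.
poverty_line = 10

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
below_line = sum(1 for x in incomes if x < poverty_line)

print("mean:", mean_income, "median:", median_income)
print(below_line, "of", len(incomes), "units are below the line")
```

The single large income pulls the mean far above the poverty line, although 8 of the 9 units are below it. The median stays at 5 and reflects the typical unit.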
Mode
The mode is useful even for qualitative data.
For example, the mode of the data set {bus, car, bicycle,
pedestrian, pedestrian, car, pedestrian} is pedestrian, but the
mean and median of this data set are meaningless.
The mode requires that the observations fall into a fairly small
number of possible outcome categories.
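For qualitative data, finding the mode amounts to counting categories. A minimal sketch using the data set from the example above and `collections.Counter` (the variable names are chosen here):

```python
from collections import Counter

# The qualitative data set from the slide.
trips = ["bus", "car", "bicycle", "pedestrian", "pedestrian", "car", "pedestrian"]

# Frequency of each category.
counts = Counter(trips)

# most_common(1) returns a list holding the (value, count) pair with the highest count.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)
```

No arithmetic or ordering of the categories is needed, only the ability to tell whether two observations are equal.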
Measures of Variability: Variance
The sample variance $s^2$ is essentially the average squared deviation
of the observations from the mean (with divisor $n - 1$ instead of $n$).
The idea is that it measures the spread of the data about the
mean.
It is difficult to interpret because it is in squared units.
It cannot be negative, and it is zero only when all data points are equal.
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
The sample standard deviation $s$ is the positive square root
of the variance,
$s = \sqrt{s^2}$
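The definitions of the sample variance and standard deviation can be sketched in a few lines and compared with Python's `statistics` module, whose `variance` and `stdev` functions also use the divisor $n - 1$. The function name `sample_variance` is chosen here, and the data set is the first example set {−8, 0, 1, 1, 2, 2, 2}.

```python
import math
import statistics

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

data = [-8, 0, 1, 1, 2, 2, 2]  # mean is 0

s2 = sample_variance(data)  # (64 + 0 + 1 + 1 + 4 + 4 + 4) / 6 = 13.0
s = math.sqrt(s2)           # standard deviation, in the same units as the data

print(s2, s)
print(statistics.variance(data), statistics.stdev(data))
```

The standard deviation is back in the units of the observations, which makes it easier to interpret than the variance.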
Seungchul 12