Introduction to Statistics Part IV: Statistical Inference
Achim Ahrens, Anna Babloyan, Erkal Ersoy
September 2015
Outline
1. Descriptive statistics
   - Sample statistics (mean, variance, percentiles)
   - Graphs (box plot, histogram)
   - Data transformations (log transformation, unit of measure)
   - Correlation vs. causation
2. Probability theory
   - Conditional probabilities and independence
   - Bayes' theorem
3. Probability distributions
   - Discrete and continuous probability functions
   - Probability density function & cumulative distribution function
   - Binomial, Poisson and normal distribution
   - E[X] and V[X]
4. Statistical inference
   - Population vs. sample
   - Law of large numbers
   - Central limit theorem
   - Confidence intervals
   - Hypothesis testing and p-values
Introduction
Recall that in the last lecture we assumed that we know the probability distribution of the random variable in question as well as the parameters of the distribution (e.g. µ and σ² for the normal distribution). Under these assumptions, we were able to obtain the probability that the random variable takes values within a particular interval (e.g. P(X ≤ 8)).
[Figure: normal density curves f(x) for N(0, 1), N(0, 2) and N(0, 3), plotted for x from −8 to 8.]
What if we don’t know µ?
Population vs. sample
Suppose we are interested in the distribution of heights in the UK. The residents of the UK are the population; the parameter µ is the true average height of UK residents and σ² the true variance.
If we were to measure the height of all UK residents, we would conduct a census. However, measuring the height of every individual is hardly feasible, or feasible only at an exorbitant cost. Instead, we can randomly select a sample from the population and make inferences from the sample to the population.
In particular, we can use the sample statistics (e.g. sample mean and sample variance) to make inferences about the true, but unknown, population parameters (µ and σ²).
Population vs. sample
We randomly select a sample from the UK population and measure the
heights of the individuals in the sample.
Simple random sample
A simple random sample is one in which each individual in the population has an equal chance of being chosen.
Since the draws are random, the height of the first, second, third, . . . nth
selected individual is random, too. That is, X1 , X2 , . . . , Xn are random
variables.
I.I.D.
Suppose we draw n items (X1, X2, ..., Xn) at random from the same population. Since X1, X2, ..., Xn are drawn from the same population, they are identically distributed. Furthermore, since the realisation of Xi does not depend on the realisation of Xj (for i, j = 1, ..., n; i ≠ j), they are independently distributed. We say that X1, X2, ..., Xn are independently and identically distributed (i.i.d.).
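These two definitions can be sketched in a few lines of Python. The population below is simulated, an assumption purely for illustration; in practice it would be every UK resident, which is exactly what we cannot enumerate.

```python
import random

random.seed(7)

# A stand-in population of heights in cm (hypothetical; the real
# population would be every UK resident).
population = [random.gauss(170, 10) for _ in range(100_000)]

# Simple random sample: each individual has the same chance of
# being chosen; random.sample draws without replacement.
sample = random.sample(population, 10)

# Each draw comes from the same population, so the X_i are
# identically distributed; for a large population, draws without
# replacement are approximately independent, so the X_i are
# (approximately) i.i.d.
print(sample)
```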
Population vs. sample
Now, we draw a sample (n = 10, heights in cm):
182 197 183 171 171 162 152 157 192 174
Given this sample, what is our best guess about µ? It’s just the sample
mean.
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{10}(182 + \cdots + 174) = 174.1$$
The sample mean is an unbiased and consistent estimator of the
unknown population mean µ.
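As a minimal check of this computation (Python standard library, heights as listed above):

```python
import statistics

# The n = 10 sampled heights (cm) from above.
heights = [182, 197, 183, 171, 171, 162, 152, 157, 192, 174]

# Our point estimate of the population mean mu: the sample mean.
xbar = statistics.fmean(heights)
print(xbar)  # -> 174.1
```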
Unbiasedness vs. consistency
To understand unbiasedness, note that the sampling distribution of x̄ is
centered at µ. When we repeatedly sample (more on this in a bit), x̄ is
sometimes above the true value of the parameter µ and sometimes below
it. However, the key aspect here is that there is no systematic tendency
to overestimate or underestimate the true parameter. This makes x̄ an
unbiased estimator of the parameter µ.
CLT in Action
[Figures: histograms (density scale) of the simulated sample means after 100, 5,000 and 10,000 repetitions.]
Central limit theorem
[Figure: histogram (density scale) of the 10,000 simulated sample means.]
The mean of $\bar{x}^{(1)}, \bar{x}^{(2)}, \ldots, \bar{x}^{(10{,}000)}$ is 170.0007 and the standard deviation is 0.8139225.
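A sketch of such a repeated-sampling experiment. The population parameters µ = 170, σ = 10 and sample size n = 100 are assumptions for illustration; the slides' own simulation settings are not shown, so the numbers differ from the 0.8139 above.

```python
import random
import statistics

random.seed(42)

# Assumed population parameters and sample size (illustrative only).
MU, SIGMA, n = 170, 10, 100
reps = 10_000

# Draw `reps` independent samples of size n; record each sample mean.
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))
    for _ in range(reps)
]

# The sampling distribution of xbar centres on mu, with
# standard deviation sigma / sqrt(n) = 1 here.
print(statistics.fmean(sample_means))
print(statistics.stdev(sample_means))
```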
The mean and the standard deviation of x̄
If x̄ is the mean of a random sample of size n drawn from a large population with mean µ and standard deviation σ, then the mean of the sampling distribution of x̄ is µ and its standard deviation is σ/√n. More formally, the central limit theorem makes this precise.
Short digression: The expected value of X̄
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{1}{N}X_1 + \frac{1}{N}X_2 + \frac{1}{N}X_3 + \cdots + \frac{1}{N}X_N$$
From last lecture, we know that the expectation of a sum is the
sum of the expectations and thus:
$$\begin{aligned}
E[\bar{X}] = E\left[\frac{1}{N}\sum_{i=1}^{N} X_i\right] &= E\left[\frac{1}{N}X_1\right] + E\left[\frac{1}{N}X_2\right] + \cdots + E\left[\frac{1}{N}X_N\right] \\
&= \frac{1}{N}E[X_1] + \frac{1}{N}E[X_2] + \cdots + \frac{1}{N}E[X_N] \\
&= \frac{1}{N}\mu + \frac{1}{N}\mu + \cdots + \frac{1}{N}\mu \\
&= \mu
\end{aligned}$$
Short digression: The variance of X̄
$$\begin{aligned}
V[\bar{X}] = V\left[\frac{1}{N}\sum_{i=1}^{N} X_i\right] &= V\left[\frac{1}{N}X_1 + \frac{1}{N}X_2 + \cdots + \frac{1}{N}X_N\right] \\
&= \frac{1}{N^2}V[X_1] + \frac{1}{N^2}V[X_2] + \cdots + \frac{1}{N^2}V[X_N] \\
&= \frac{1}{N^2}\sigma^2 + \frac{1}{N^2}\sigma^2 + \cdots + \frac{1}{N^2}\sigma^2 \\
&= \frac{\sigma^2}{N}
\end{aligned}$$
where the second step uses the independence of the Xi, so all covariance terms vanish. This result tells us that the variance of the sample mean decreases as the sample size increases.
Making statistical inferences
Confidence intervals
[Figure: standard normal density N(0, 1) with the central 68% and 95% areas around the mean marked.]
Making statistical inferences
Confidence intervals
As discussed earlier, the sample mean x̄ is an appropriate estimator of the unknown population mean µ because it is an unbiased estimator of µ and it approaches the true population parameter as the sample size increases. We have also mentioned, however, that this estimate varies from sample to sample. So, how reliable is this estimator? To answer this question, we need to consider the spread as well. From the central limit theorem (CLT), we know that if the population mean is µ and the standard deviation is σ, then repeated samples of n observations should yield a sample mean x̄ with the following distribution: $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$.
Confidence Interval
A confidence interval with confidence level C consists of two parts:
1. An interval obtained from the data, of the form
   estimate ± margin of error
2. A chosen confidence level C, which gives the probability that the calculated interval will contain the true parameter value.
Confidence Intervals
Calculating the interval
Example
Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different samples of the liquid. He calculates the sample mean to be 101.82. If he knows that the standard deviation for this procedure is 1.2 degrees, what is the confidence interval for the population mean at a 95% confidence level?
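The arithmetic for this example can be sketched as follows, with z* = 1.96 as the standard normal critical value for 95% confidence and σ treated as known:

```python
import math
import statistics

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
sigma = 1.2     # known standard deviation of the procedure
z_star = 1.96   # standard normal critical value for C = 95%

xbar = statistics.fmean(readings)
margin = z_star * sigma / math.sqrt(len(readings))

# 95% CI: estimate +/- margin of error
print(f"{xbar:.2f} +/- {margin:.2f}")                 # -> 101.82 +/- 0.96
print(f"({xbar - margin:.2f}, {xbar + margin:.2f})")  # -> (100.86, 102.78)
```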
Confidence Intervals
Behaviour of confidence intervals
Confidence intervals get smaller as:
1. The number of observations, n, increases
2. The level of confidence decreases
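Both effects can be read directly off the margin-of-error formula z*·σ/√n. The σ below is illustrative; the z* values are the standard normal critical values for each confidence level.

```python
import math

sigma = 2.0  # illustrative population standard deviation

# 1. Larger n shrinks the margin (95% confidence, z* = 1.96);
#    quadrupling n halves the margin of error.
for n in (10, 40, 160):
    print(n, round(1.96 * sigma / math.sqrt(n), 2))

# 2. Lower confidence shrinks the margin (n = 10 fixed):
for z_star, level in ((2.576, "99%"), (1.96, "95%"), (1.645, "90%")):
    print(level, round(z_star * sigma / math.sqrt(10), 2))
```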
Tests of Significance
Why we need them (and some terminology)
Null hypothesis, H0
The statement or hypothesis being tested in a significance test is called the null hypothesis, normally denoted H0. A significance test assesses the evidence for and against this hypothesis and allows us to either reject or fail to reject H0.
Consider the following example to see how we can put these to use.
Tests of Significance
Example: Are the bottles being filled as advertised?
Suppose we are appointed as inspectors at an Irn Bru factory here in Scotland.
We have data on past production and observe that the distribution of the
contents is normal with standard deviation of 2ml. To assess the bottling
process, we randomly select 10 bottles, measure their contents and obtain the
following results:
502.9 499.8 503.2 502.8 500.9 503.9 498.2 502.5 503.8 501.4
For this sample of observations, the mean content, x̄, is 501.94 ml. Is this sample mean far enough from 500 ml to provide convincing evidence that the mean content of all bottles produced at the factory differs from the advertised amount of 500 ml?
Tests of Significance
P-values
Consider our earlier example about the Irn Bru factory, where we
calculated the z-statistic to be 3.07 using our sample of size
n = 10, standard deviation of 2 ml and the sample mean
x̄ = 501.94:
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{501.94 - 500}{2/\sqrt{10}} = 3.07$$
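A sketch of this calculation, together with the corresponding two-sided p-value from the standard normal distribution (via the complementary error function, which the standard library provides):

```python
import math

xbar, mu0, sigma, n = 501.94, 500.0, 2.0, 10

# z-statistic: how many standard errors xbar lies from mu0.
z = (xbar - mu0) / (sigma / math.sqrt(n))
print(round(z, 2))  # -> 3.07

# Two-sided p-value: P(|Z| >= z) under H0, using the erfc
# relation for the standard normal CDF.
p = math.erfc(abs(z) / math.sqrt(2))
print(round(p, 4))
```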
Tests of Significance
Calculating p-values
Now that we have obtained the p-value, we need to decide what level of significance to use in our test. The significance level determines how much evidence we require to reject H0, and is usually denoted by the Greek letter alpha, α.
If we choose α = 0.05, rejecting H0 requires evidence so strong that it would
happen no more than 5% of the time if H0 is true. If we choose α = 0.01, we
require even stronger evidence against H0 to be able to reject it: evidence
against H0 would need to be so strong that it would happen only 1% of the
time if H0 is true.
Tests of Significance
P-values and statistical significance
Statistical significance
If the p-value we calculate is smaller than our chosen α, we reject H0 at significance level α.
For example, rejecting H0 at α = 0.01 suggests that there is very strong evidence against the null hypothesis: if H0 were true, a sample mean at least as extreme as the one observed would occur no more than 1% of the time.
Tests of Significance
One- and two-sided alternative hypotheses
Tests of Significance
Summary 1/2
I Significance tests allow us to formally assess the evidence
against a null hypothesis (H0 ) provided by data. This way, we
can judge whether the deviations from what the null
hypothesis suggests are due to chance.
I When stating hypotheses, H0 is usually a statement that no
effect exists (e.g. all bottles at a factory are filled with a mean
quantity of 500 ml). The alternative hypothesis, Ha , on the
other hand, suggests that a parameter differs from its null
value in either direction (two-sided alternative) or in a specific
direction (one-sided alternative).
I The test itself is conducted using a test statistic. The
corresponding p-value is calculated assuming H0 is true, and it
indicates the probability that the test statistic will take a value
at least as "surprising" as the observed one.
Tests of Significance
Summary 2/2
Tests of Significance
Standard error
Tests of Significance
z versus t distribution
Tests of Significance
t distributions
Confidence Intervals
...using t distributions
And formally,
t Confidence Interval
A level C confidence interval for a population mean µ is
$$\bar{x} \pm t^* \times \frac{s}{\sqrt{n}}$$
where t* is the critical value with area C between −t* and t* under the t(n − 1) density curve, and n − 1 is the degrees of freedom.
Confidence Intervals
...using t distributions
Example
Here are monthly dollar amounts for phone service for a random sample of 8 households: 43, 47, 51, 36, 50, 42, 37, 41. We would like to construct a 95% CI for the average monthly expenditure, µ.
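A sketch of this computation. The critical value t*(df = 7) ≈ 2.365 for 95% confidence is hard-coded, since the Python standard library has no t-distribution quantile function:

```python
import math
import statistics

amounts = [43, 47, 51, 36, 50, 42, 37, 41]  # monthly $ amounts, n = 8

xbar = statistics.fmean(amounts)
s = statistics.stdev(amounts)  # sample standard deviation (n - 1 divisor)
t_star = 2.365                 # t*(df = 7) for C = 95% (hard-coded)

margin = t_star * s / math.sqrt(len(amounts))
print(f"({xbar - margin:.2f}, {xbar + margin:.2f})")  # -> (38.71, 48.04)
```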
Hypothesis tests
...using t distributions
Example
Suppose that the overall U.S. average monthly expenditure for phone service is $49. Is the sample mean, x̄, of 43.5 different from the national average of $49?
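The corresponding one-sample t test can be sketched as follows, with the 5% two-sided critical value t*(df = 7) ≈ 2.365 again hard-coded; the sample mean and standard deviation are recomputed from the data above:

```python
import math
import statistics

amounts = [43, 47, 51, 36, 50, 42, 37, 41]
mu0 = 49.0  # hypothesised national average under H0

xbar = statistics.fmean(amounts)
s = statistics.stdev(amounts)

# t-statistic: distance of xbar from mu0 in estimated standard errors.
t = (xbar - mu0) / (s / math.sqrt(len(amounts)))
print(round(t, 2))  # -> -2.85

# Two-sided test at the 5% level: reject H0 if |t| > t*(7) ~= 2.365.
print(abs(t) > 2.365)  # -> True
```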