Lecture 7
Confidence intervals, p-values
Statistical Inference
The process of making guesses about the truth from a sample.
Sample (observation): sample statistics
$\hat{\mu} = \bar{X}_n = \frac{\sum_{i=1}^{n} x_i}{n}$
$\hat{\sigma}^2 = s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X}_n)^2}{n-1}$
*Hat notation (^) is often used to indicate an “estimate”.
Truth (not observable): population parameters
$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Make guesses about the whole population.
Statistics vs. Parameters
Sample Statistic – any summary measure calculated
from data; e.g., could be a mean, a difference in means
or proportions, an odds ratio, or a correlation coefficient
E.g., the mean vitamin D level in a sample of 100 men is 63
nmol/L
E.g., the correlation coefficient between vitamin D and
cognitive function in the sample of 100 men is 0.15
1. Lee DM, Tajar A, Ulubaev A, et al. Association between 25-hydroxyvitamin D levels and cognitive performance
in middle-aged and older European men. J Neurol Neurosurg Psychiatry. 2009 Jul;80(7):722-9.
Distribution of a trait:
vitamin D
Right-skewed!
Mean= 63 nmol/L
Standard deviation = 33
nmol/L
Distribution of a trait: DSST
Normally distributed
Mean = 28 points
Standard deviation = 10 points
Distribution of a statistic…
Statistics follow distributions too…
But the distribution of a statistic is a
theoretical construct.
Statisticians pose a thought experiment: how
much would the value of the statistic fluctuate
if one could repeat a particular study over and
over again with different samples of the same
size?
By answering this question, statisticians are
able to pinpoint exactly how much uncertainty
is associated with a given statistic.
Distribution of a statistic
Two approaches to determine the
distribution of a statistic:
1. Computer simulation
Repeat the experiment over and over again
virtually!
More intuitive; can directly observe the behavior
of statistics.
2. Mathematical theory
Proofs and formulas!
More practical; use formulas to solve problems.
Example of computer
simulation…
How many heads come up in 100
coin tosses?
Flip coins virtually
Flip a coin 100 times; count the number
of heads.
Repeat this over and over again a large
number of times (we’ll try 30,000
repeats!)
Plot the 30,000 results.
Coin tosses…
Conclusions:
We usually get
between 40 and 60
heads when we flip
a coin 100 times.
It’s extremely
unlikely that we will
get 30 heads or 70
heads (didn’t
happen in 30,000
experiments!).
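The virtual coin-flipping experiment can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the slide's plot; the function name `simulate_heads` and the seed are our own choices, and a fair coin is assumed.

```python
import random

def simulate_heads(n_flips=100, repeats=30_000, seed=0):
    """Return the number of heads observed in each simulated experiment."""
    rng = random.Random(seed)
    # Each of the n_flips random bits stands for one fair coin toss (1 = heads).
    return [bin(rng.getrandbits(n_flips)).count("1") for _ in range(repeats)]

heads = simulate_heads()
share_40_to_60 = sum(40 <= h <= 60 for h in heads) / len(heads)
print(f"Share of experiments with 40 to 60 heads: {share_40_to_60:.3f}")
print(f"Fewest heads seen: {min(heads)}, most heads seen: {max(heads)}")
```

Plotting a histogram of `heads` gives the bell-shaped picture described above; the exact minimum and maximum vary from run to run with the seed.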
Distribution of the sample
mean, computer simulation…
1. Specify the underlying distribution of
vitamin D in all European men aged 40 to 79.
Right-skewed
Standard deviation = 33 nmol/L
True mean = 62 nmol/L (this is arbitrary; does not
affect the distribution)
2. Select a random sample of 100 virtual men
from the population.
3. Calculate the mean vitamin D for the
sample.
4. Repeat steps (2) and (3) a large number of
times (say 1000 times).
5. Explore the distribution of the 1000 means.
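Steps 1 to 5 can be sketched as follows. The lecture only says the trait is right-skewed, so the gamma distribution used here is an assumption chosen to match the stated mean (62) and standard deviation (33); the names `SHAPE`, `SCALE` and `simulate_sample_means` are ours.

```python
import random
import statistics

TRUE_MEAN = 62.0  # nmol/L, from the lecture
TRUE_SD = 33.0    # nmol/L, from the lecture

# Assumption: a gamma distribution stands in for the right-skewed trait.
SHAPE = (TRUE_MEAN / TRUE_SD) ** 2  # gamma mean = shape * scale
SCALE = TRUE_SD ** 2 / TRUE_MEAN    # gamma variance = shape * scale**2

def simulate_sample_means(n=100, repeats=1000, seed=1):
    """Draw `repeats` samples of n virtual men and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.gammavariate(SHAPE, SCALE) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means()
print(f"Mean of the sample means: {statistics.fmean(means):.1f} nmol/L")
print(f"SD of the sample means: {statistics.stdev(means):.2f} nmol/L")
```

A histogram of `means` should look roughly bell-shaped even though the individual values are skewed.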
Distribution of mean
vitamin D (a sample
statistic)
Normally distributed!
Surprise!
Mean= 62 nmol/L (the true
mean)
Standard deviation = 3.3
nmol/L
Distribution of mean
vitamin D (a sample
statistic)
Normally distributed (even though
the trait is right-skewed!)
Mean = true mean
Standard deviation = 3.3 nmol/L
The standard deviation of a statistic is
called a standard error
The standard error of a mean = $\frac{s}{\sqrt{n}}$
If I increase the sample
size to n=400…
Standard error = 1.7 nmol/L
$\frac{s}{\sqrt{n}} = \frac{33}{\sqrt{400}} = 1.65 \approx 1.7$
If I increase the variability of
vitamin D (the trait) to
SD=40…
Standard error = 4.0 nmol/L
$\frac{s}{\sqrt{n}} = \frac{40}{\sqrt{100}} = 4.0$
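The two "what if" calculations above are easy to check numerically. A minimal sketch; the helper name `standard_error` is ours.

```python
import math

def standard_error(sd, n):
    """Standard error of a mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

print(standard_error(33, 100))  # baseline study from the lecture
print(standard_error(33, 400))  # four times the sample size halves the standard error
print(standard_error(40, 100))  # more variable trait, same sample size
```

Because n sits under a square root, cutting the standard error in half takes four times as many subjects.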
Mathematical Theory…
The Central Limit
Theorem!
If all possible random samples, each of size n, are
taken from any population with a mean $\mu$ and a
standard deviation $\sigma$, the sampling distribution of
the sample means (averages) will:
1. have mean: $\mu_{\bar{x}} = \mu$
2. have standard deviation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
3. be approximately normally distributed regardless of the shape
of the parent population (normality improves with larger n).
It all comes back to Z!
Symbol Check
$E(\bar{X}_n) = E\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) = \frac{\sum_{i=1}^{n} E(x)}{n} = \frac{nE(x)}{n} = E(x) = \mu$
1. have mean: $\mu_{\bar{x}} = \mu$
2. have standard deviation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
3. be approximately normally distributed regardless of the shape
of the parent population (normality improves with larger n)
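The same bookkeeping gives the standard deviation in item 2. A short derivation, assuming the n observations are independent draws that share a common variance $\sigma^2$ (an assumption the slides leave implicit):

```latex
\operatorname{Var}(\bar{X}_n)
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{n\sigma^2}{n^2}
  = \frac{\sigma^2}{n},
\qquad
\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}
```

Independence is what lets the variance of the sum split into a sum of variances in the second step.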
Central Limit Theorem
caveats for small samples:
For small samples:
The sample standard deviation is an imprecise
estimate of the true standard deviation (σ); this
imprecision changes the distribution to a t-distribution.
A t-distribution approaches a normal distribution for large n
(≥100), but has fatter tails for small n (<100).
If the underlying distribution is non-normal, the
distribution of the means may be non-normal.
Confidence Interval
confidence interval = observed mean $\pm Z_{\alpha/2} \times \frac{s}{\sqrt{n}}$
Single population mean
(small n, normally
distributed trait)
Hypothesis test:
$T_{n-1} = \frac{\text{observed mean} - \text{null mean}}{s/\sqrt{n}}$
Confidence Interval
confidence interval = observed mean $\pm T_{n-1,\alpha/2} \times \frac{s}{\sqrt{n}}$
Examples of Sample
Statistics:
Single population mean
Single population proportion
Difference in means (t-test)
Difference in proportions (Z-test)
Odds ratio/risk ratio
Correlation coefficient
Regression coefficient
…
Distribution of a correlation
coefficient?? Computer
simulation…
1. Specify the true correlation coefficient
Correlation coefficient = 0.15
2. Select a random sample of 100 virtual
men from the population.
3. Calculate the correlation coefficient for
the sample.
4. Repeat steps (2) and (3) 15,000 times
5. Explore the distribution of the 15,000
correlation coefficients.
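The five steps can be sketched as below. Two assumptions are ours, not the lecture's: both traits are drawn from a standard bivariate normal distribution, and the sketch uses 5,000 repeats instead of 15,000 to keep pure-Python runtime short. The names `pearson_r` and `simulate_correlations` are illustrative.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_correlations(rho=0.15, n=100, repeats=5000, seed=2):
    """Return the sample correlation from each of `repeats` virtual studies."""
    rng = random.Random(seed)
    k = math.sqrt(1 - rho ** 2)
    results = []
    for _ in range(repeats):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # y = rho*x + sqrt(1 - rho^2)*noise has true correlation rho with x.
        ys = [rho * x + k * rng.gauss(0, 1) for x in xs]
        results.append(pearson_r(xs, ys))
    return results

rs = simulate_correlations()
print(f"Mean of the sample correlations: {statistics.fmean(rs):.3f}")
print(f"SD of the sample correlations: {statistics.stdev(rs):.3f}")
```

Raising `repeats` to 15,000 reproduces the lecture's setup at the cost of a longer run.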
Distribution of a
correlation coefficient…
Normally distributed!
Mean = 0.15 (true correlation)
Standard error = 0.10
Distribution of a
correlation coefficient in
general…
1. Shape of the distribution
Normally distributed for large samples
T-distribution for small samples (n<100)
2. Mean = true correlation coefficient (r)
3. Standard error $\approx \sqrt{\frac{1-r^2}{n}}$
Many statistics follow
normal (or t-distributions)
…
Means/difference in means
T-distribution for small samples
Proportions/difference in
proportions
Regression coefficients
T-distribution for small samples
Natural log of the odds ratio
Estimation (confidence
intervals)…
What is a good estimate for the
true mean vitamin D in the
population (the population
parameter)?
63 nmol/L +/- margin of error
95% confidence interval
Goal: capture the true effect (e.g.,
the true mean) most of the time.
A 95% confidence interval should
include the true effect about 95%
of the time.
A 99% confidence interval should
include the true effect about 99%
of the time.
Recall: 68-95-99.7 rule for normal distributions! There is a
95% chance that the sample mean will fall within two
standard errors of the true mean = 62 +/- 2*3.3 = 55.4 nmol/L
to 68.6 nmol/L
[Figure: normal curve centered on the mean, marked at Mean - 2 Std error = 55.4 and Mean + 2 Std error = 68.6]
To be precise,
95% of
observations fall
between Z=-1.96
and Z= +1.96 (so
the “2” is a
rounded number)
…
95% confidence interval
There is a 95% chance that the sample
mean is between 55.4 nmol/L and 68.6
nmol/L
For every sample mean in this range,
sample mean +/- 2 standard errors will
include the true mean:
For example, if the sample mean is 68.6
nmol/L:
95% CI = 68.6 +/- 6.6 = 62.0 to 75.2
This interval just hits the true mean, 62.0.
95% confidence interval
Thus, for normally distributed statistics,
the formula for the 95% confidence
interval is:
sample statistic ± 2 x (standard error)
Examples:
95% CI for mean vitamin D:
63 nmol/L ± 2 x (3.3) = 56.4 – 69.6 nmol/L
95% CI for the correlation coefficient:
0.15 ± 2 x (0.1) = -.05 – .35
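The two worked examples can be reproduced with a small helper. A minimal sketch; the function name `approx_95_ci` is ours, and the default multiplier of 2 is the rounded value used on the slide (1.96 would be the more precise choice).

```python
def approx_95_ci(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) as estimate +/- multiplier * standard error."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin

low, high = approx_95_ci(63, 3.3)
print(f"Mean vitamin D: {low:.1f} to {high:.1f} nmol/L")

low_r, high_r = approx_95_ci(0.15, 0.1)
print(f"Correlation coefficient: {low_r:.2f} to {high_r:.2f}")
```

The interval for the correlation coefficient includes 0, while the interval for the mean stays far from 100.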
Simulation of 20 studies of
100 men…
Vertical line indicates the true mean (62)
95% confidence
intervals for the mean
vitamin D for each of
the simulated studies.
Only 1 confidence
interval missed the true
mean.
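The same idea can be checked over many more than 20 virtual studies. A sketch under stated assumptions: the trait follows a gamma distribution matched to mean 62 and SD 33 (our choice; the lecture only says right-skewed), each interval uses the study's own sample SD with the multiplier 1.96, and the function name `coverage` is ours.

```python
import math
import random
import statistics

TRUE_MEAN, TRUE_SD = 62.0, 33.0
# Assumption: gamma distribution as a stand-in for the right-skewed trait.
SHAPE = (TRUE_MEAN / TRUE_SD) ** 2
SCALE = TRUE_SD ** 2 / TRUE_MEAN

def coverage(n=100, studies=2000, z=1.96, seed=3):
    """Fraction of simulated studies whose 95% CI contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        sample = [rng.gammavariate(SHAPE, SCALE) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if mean - z * se <= TRUE_MEAN <= mean + z * se:
            hits += 1
    return hits / studies

cov = coverage()
print(f"Share of intervals that captured the true mean: {cov:.3f}")
```

The share should land near 0.95, in line with 1 miss in 20 studies; with skewed data and n = 100 it can fall slightly short.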
Confidence Intervals give:
*A plausible range of values for a
population parameter.
*The precision of an estimate. (When
sampling variability is high, the
confidence interval will be wide to reflect
the uncertainty of the observation.)
*Statistical significance (if the 95% CI
does not cross the null value, it is
significant at .05)
Confidence Intervals
point estimate ± (measure of how confident we want to be) x (standard error)
Point estimate: the value of the statistic in my sample (e.g., mean, odds ratio, etc.)
Measure of how confident we want to be: from a Z table or a T table, depending on the sampling distribution of the statistic.
Confidence level   Z value
80%     1.28
90%     1.645
95%     1.96
98%     2.33
99%     2.58
99.8%   3.09
99.9%   3.29
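The Z multipliers do not have to come from a printed table; they can be computed from the standard normal distribution. A minimal sketch using Python's standard library; the helper name `z_multiplier` is ours.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided Z multiplier: leaves (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}: {z_multiplier(level):.3f}")
```

For small samples the multiplier would come from a t-distribution instead, which the standard library does not provide.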
99% confidence
intervals…
99% CI for mean vitamin D:
63 nmol/L ± 2.6 x (3.3) = 54.4 – 71.6
nmol/L
99% CI for the correlation coefficient:
0.15 ± 2.6 x (0.1) = -.11 – .41
Testing Hypotheses
1. Is the mean vitamin D in middle-
aged and older European men
lower than 100 nmol/L (the
“desirable” level)?
2. Is cognitive function correlated
with vitamin D?
Is the mean vitamin D
different from 100?
Start by assuming that the mean =
100
This is the “null hypothesis”
This is usually the “straw man” that
we want to shoot down
Determine the distribution of
statistics assuming that the null is
true…
Computer simulation
(10,000 repeats)…
Normally
distributed
Std error = 3.3
Mean = 100
Compare the null
distribution to the
observed value…
What’s the probability of seeing a sample mean of 63 nmol/L if the true mean is 100 nmol/L?
It didn’t happen in 10,000 simulated studies. So the probability is less than 1/10,000.
Compare the null
distribution to the
observed value…
This is the p-
value!
P-value <
1/10,000
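The null simulation can be sketched the same way as the earlier ones. Assumptions, all ours: the trait follows a gamma distribution with the null mean of 100 and the observed SD of 33, each virtual study has 100 men, and the name `simulate_null_means` is illustrative.

```python
import random
import statistics

NULL_MEAN, SD, N = 100.0, 33.0, 100
OBSERVED_MEAN = 63.0

# Assumption: gamma distribution matched to the null mean and the SD.
SHAPE = (NULL_MEAN / SD) ** 2
SCALE = SD ** 2 / NULL_MEAN

def simulate_null_means(repeats=10_000, seed=4):
    """Sample means from `repeats` virtual studies in which the null is true."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.gammavariate(SHAPE, SCALE) for _ in range(N)]
        means.append(statistics.fmean(sample))
    return means

null_means = simulate_null_means()
as_extreme = sum(m <= OBSERVED_MEAN for m in null_means)
print(f"Smallest simulated mean: {min(null_means):.1f} nmol/L")
print(f"Studies with a mean of 63 or lower: {as_extreme} of {len(null_means)}")
```

A count of zero only bounds the p-value from above (below roughly 1 in 10,000); it does not say how small it really is.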
Calculating the p-value
with a formula…
Because we know how normal curves work, we can calculate exactly the
probability of seeing an average of 63 nmol/L if the true average is
100 nmol/L (i.e., if our null hypothesis is true):
$Z = \frac{63 - 100}{3.3} = -11.2$
Z = -11.2, P-value << .0001
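The formula-based calculation can be carried out directly. A minimal sketch; the variable names are ours, and `math.erfc` is used for the tail area because the ordinary normal CDF rounds to exactly 0 this far out.

```python
import math

observed_mean = 63.0   # nmol/L, from the sample
null_mean = 100.0      # nmol/L, the null hypothesis
standard_error = 3.3   # nmol/L

z = (observed_mean - null_mean) / standard_error
# Lower-tail area under the standard normal curve at z.
p_lower_tail = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"Z = {z:.1f}")
print(f"P(sample mean <= 63 | true mean = 100) = {p_lower_tail:.2g}")
```

The result is far smaller than the 1/10,000 bound that the simulation could provide.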
The P-value
P-value is the probability that we would have seen our
data (or something more unexpected) just by chance if
the null hypothesis (null value) is true.
Null distribution:
Normally
distributed
Std error = 0.1
Mean = 0
What’s the probability of
our data?
This is a two-sided
hypothesis test, so “more
extreme” includes as big or
bigger negative correlations
(<-0.15).
P-value = 7% + 7% = 14%
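The two-sided p-value for the correlation can be computed from the null distribution above (mean 0, standard error 0.1). A minimal sketch with our own variable names; each tail comes out a little under 7%, which the slide rounds.

```python
from statistics import NormalDist

observed_r = 0.15
null_r = 0.0
standard_error = 0.1

z = (observed_r - null_r) / standard_error
one_tail = 1 - NormalDist().cdf(abs(z))  # correlations of +0.15 or larger
p_two_sided = 2 * one_tail               # adds correlations of -0.15 or smaller

print(f"Z = {z:.2f}")
print(f"One tail: {one_tail:.3f}, two-sided p-value: {p_two_sided:.3f}")
```

Doubling one tail works because the normal null distribution is symmetric around 0.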
What’s the probability of
our data?
Normally distributed,
standard error = 11.1
Computer simulation
assuming the null (15,000
repeats)…
If the vaccine is
completely
ineffective, we
could still get
23 excess
infections just
by chance.
Probability of
23 or more
excess
infections =
0.04
How to interpret p=.04…
P(data/null) = .04
P(null/data) ≠ .04
P(null/data) ≈ 22%*
*estimated using Bayes’ Rule
(and prior data on the vaccine)
*Gilbert PB, Berger JO, Stablein D, Becker S, Essex M, Hammer SM, Kim JH, DeGruttola VG.
Statistical interpretation of the RV144 HIV vaccine efficacy trial in Thailand: a case study for
statistical issues in efficacy trials. J Infect Dis 2011; 203: 969-975.
Alternative analysis of the
data (“intention to treat”)
…
56/8202 (6.8 per 1000) infections
in the vaccine group versus
76/8200 (9.3 per 1000) in the placebo group
Computer simulation
assuming the null (15,000
repeats)…
Probability of
20 or more
excess
infections =
0.08
P=.08 is only
slightly different
than p=.04!
Confidence intervals…
95% CI (analysis 1): .0014 to .0055