What Do P-Values and Confidence Intervals Really Tell Us?
Sampling variability…
- you only get to pick one sample!
[Figure labels: Inference; Sample k]
Interpreting the results
[Diagram: Population -> Sample (selection of subjects); Sample -> Population (inference)]
Inference:
- p-values
- Confidence intervals
Significance testing
Choice of statistical test:
- Predictor/independent variable
- Outcome/dependent variable
- Study design & question
Choice of statistical test when…
Outcome: dichotomous (yes/no; alive, dead)
Independent variable: categorical (e.g., smoking yes vs no)
Is smoking associated with the outcome?
             Outcome +   Outcome -
Smk (yes)    a           b           (pS+)
Smk (no)     c           d           (pS-)
Statistical test…
Two sample proportion or Chi-square or Risk ratio
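As a rough sketch of the tests listed for a 2x2 table, the snippet below computes a Pearson chi-square statistic (without continuity correction) and a risk ratio. The function names are illustrative. In place of smoking data it uses the Chiche trial counts quoted later in these slides (3 of 50 deaths vs 8 of 45). The p-value uses the identity for one degree of freedom, P(chi-square > x) = erfc(sqrt(x/2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are the two exposure groups; columns are outcome + and outcome -.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def risk_ratio(a, b, c, d):
    """Risk of outcome + in the first row divided by risk in the second row."""
    return (a / (a + b)) / (c / (c + d))

# Chiche trial: 3 of 50 died on IV nitrate, 8 of 45 died on control.
a, b, c, d = 3, 47, 8, 37
chi2, p_value = chi_square_2x2(a, b, c, d)
rr = risk_ratio(a, b, c, d)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}, risk ratio = {rr:.2f}")
```

For a 2x2 table this chi-square statistic is the square of the two-sample proportion Z statistic, so the two tests give the same p-value.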
Choice of statistical test when…
Outcome: dichotomous (yes/no; alive, dead)
Independent variable: continuous (e.g., smoking pack yrs)
Is smoking associated with the outcome?
                           Outcome +         Outcome -
Smoking amount (pk yrs)    x̄ (group mean)    x̄ (group mean)
Statistical test…
Two sample t-test
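A minimal sketch of the two-sample t statistic for this setting, assuming equal variances in the two outcome groups (the pooled-variance form). The pack-year values are invented purely for illustration, and the function name is hypothetical. Turning t into a p-value needs the t distribution, which the standard library does not provide.

```python
import math
import statistics

def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x) +
                  (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, nx + ny - 2

# Hypothetical pack-years by outcome group (not real data).
outcome_pos = [30, 42, 25, 38, 45, 33]
outcome_neg = [12, 20, 18, 25, 10, 15]
t, df = two_sample_t(outcome_pos, outcome_neg)
print(f"t = {t:.2f} on {df} degrees of freedom")
```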
Significance testing
Subjects with Acute MI
- IV nitrate: mortality PN
- No nitrate: mortality PC
Is mortality different between the groups (PN vs PC)?
Obtaining P values
Table columns: Trial | Number dead / randomized (Intravenous nitrate, Control) | Risk Ratio | 95% C.I. | P value
How do we get this p-value?
Table adapted from Whitley and Ball. Critical Care; 6(3):222-225, 2002
Null hypothesis (H0)
There is no association between the independent and dependent/outcome variables
Formal basis for hypothesis testing
Example of significance testing
In the Chiche trial:
pN = 3/50 = 0.06; pC = 8/45 = 0.178
Null hypothesis:
H0: pN – pC = 0 or pN = pC
Statistical test:
Two-sample proportion
General form of a test statistic
\[ \bar{p} = \frac{X_N + X_C}{n_N + n_C}, \quad p_N = \frac{X_N}{n_N}, \quad p_C = \frac{X_C}{n_C} \]
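Test statistic for Two Population Proportions
The test statistic for p1 – p2 is a Z statistic:
\[ Z = \frac{(p_N - p_C) - (P_N - P_C)_0}{\sqrt{\bar{p}\,(1 - \bar{p})\left(\frac{1}{n_N} + \frac{1}{n_C}\right)}} \]
- (p_N - p_C): observed difference
- (P_N - P_C)_0: difference under the null hypothesis
- n_N: no. of subjects in IV nitrate group
- n_C: no. of subjects in control group
where
\[ \bar{p} = \frac{X_N + X_C}{n_N + n_C}, \quad p_N = \frac{X_N}{n_N}, \quad p_C = \frac{X_C}{n_C} \]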
Testing significance at 0.05 level
[Figure: critical values at -1.96 and +1.96]
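The ±1.96 cut-offs are the standard normal quantiles that leave 0.025 in each tail. A short sketch using Python's standard library shows where they come from (the helper `reject_null` is an illustrative name, not something defined in the slides):

```python
from statistics import NormalDist

alpha = 0.05
# Two-sided test: alpha/2 goes in each tail of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(f"Reject H0 at the {alpha} level if |Z| > {z_crit:.2f}")

def reject_null(z, alpha=0.05):
    """True if the Z statistic falls in either rejection region."""
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)
```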
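Two Population Proportions (continued)
\[ Z = \frac{0.06 - 0.178}{\sqrt{0.116\,(1 - 0.116)\left(\frac{1}{50} + \frac{1}{45}\right)}} = -1.79 \]
where
\[ \bar{p} = \frac{3 + 8}{50 + 45} = 0.116, \quad p_N = \frac{3}{50} = 0.06, \quad p_C = \frac{8}{45} = 0.178 \]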
Statistical test for p1 – p2
Two Population Proportions, Independent Samples
[Figure: two tail areas, below -1.79 and above +1.79]
P(Z < -1.79) + P(Z > 1.79) = 0.08
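The Chiche calculation can be checked with a short sketch of the pooled two-proportion Z test. The function name is illustrative. Under the normal approximation the two-sided tail area comes out at roughly 0.07, close to the 0.08 quoted on the slide. The small gap presumably reflects rounding or a slightly different test behind the published figure, which is an assumption.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x_n, n_n, x_c, n_c):
    """Pooled two-sample proportion Z test; returns (Z, two-sided p-value)."""
    p_n, p_c = x_n / n_n, x_c / n_c
    p_bar = (x_n + x_c) / (n_n + n_c)  # pooled proportion under H0
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n_n + 1 / n_c))
    z = (p_n - p_c) / se
    # Two-sided p-value: area in both tails beyond |Z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Chiche trial: 3/50 deaths on IV nitrate, 8/45 deaths on control.
z, p_value = two_proportion_z_test(3, 50, 8, 45)
print(f"Z = {z:.2f}, two-sided p = {p_value:.3f}")
```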
What is a P value?
- ‘P’ stands for probability
- Tail area probability based on the observed effect
- Calculated as the probability of an effect as large as or larger than the observed effect (more extreme in the tails of the distribution), assuming the null hypothesis is true
What is a P value?
- Fisher suggested that the 5% level (p<0.05) could be used as a scientific benchmark for concluding that fairly strong evidence exists against H0
- It was never intended as an absolute threshold
- Strength of evidence is on a continuum
- Simply noting the magnitude of the P-value should suffice
- Scientific context is critical
What is a P value?
- P<0.05 is an arbitrary cut-point
- Does it make sense to adopt a therapeutic agent because the P-value obtained in an RCT was 0.049, and at the same time ignore the results of another therapeutic agent because its P-value was 0.051?
P-values
Trial      Intravenous nitrate   Control   Risk Ratio   95% C.I.        P value
Flaherty   11/56                 11/48     0.83         (0.33, 2.12)    0.70
Lis        5/64                  10/76     0.56         (0.19, 1.65)    0.29
P = 0.08: …8 out of 100 such trials would show a risk reduction of 66% or more extreme just by chance
P = 0.70 (Flaherty): …70 out of 100 such trials would show a risk reduction of 17% or more extreme just by chance…very likely a chance finding
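The "out of 100 such trials" reading can be illustrated by simulation. The sketch below assumes the null hypothesis holds, with both arms sharing the pooled Chiche death rate of 11/95. It counts how often chance alone gives a Z statistic at least as extreme as the one observed. The function names, the number of simulated trials, and the seed are illustrative choices.

```python
import math
import random

def z_statistic(x_n, n_n, x_c, n_c):
    """Pooled two-sample proportion Z statistic."""
    p_bar = (x_n + x_c) / (n_n + n_c)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n_n + 1 / n_c))
    return (x_n / n_n - x_c / n_c) / se

def simulate_null(n_n, n_c, p_common, z_obs, n_trials=5000, seed=1):
    """Fraction of simulated null trials with |Z| at least as large as |z_obs|."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_trials):
        # Both arms share the same true death rate, so any difference is chance.
        x_n = sum(rng.random() < p_common for _ in range(n_n))
        x_c = sum(rng.random() < p_common for _ in range(n_c))
        if x_n + x_c in (0, n_n + n_c):
            continue  # no deaths or all deaths: no difference, so not extreme
        if abs(z_statistic(x_n, n_n, x_c, n_c)) >= abs(z_obs):
            extreme += 1
    return extreme / n_trials

z_obs = z_statistic(3, 50, 8, 45)  # observed Chiche result
fraction = simulate_null(50, 45, p_common=11 / 95, z_obs=z_obs)
print(f"Share of null trials at least as extreme: {fraction:.3f}")
```

The simulated share is an empirical stand-in for the two-sided p-value. It should land in the same neighbourhood as the tail area from the Z test, not match it exactly, because the counts are small and discrete.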
Confidence intervals
“Statistics means never having to say you’re certain!”
Computing confidence intervals (CI)
General formula:
(Sample statistic) ± [(confidence level) × (measure of how high the sampling variability is)]
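A minimal sketch of the general formula, assuming a normal-based multiplier for the confidence level and a standard error as the measure of sampling variability. The example applies it to the Chiche risk difference with an unpooled standard error. That choice of standard error is an assumption made for illustration and does not come from the slides.

```python
import math
from statistics import NormalDist

def confidence_interval(estimate, standard_error, level=0.95):
    """Estimate plus or minus (normal multiplier x standard error)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Chiche trial risk difference (IV nitrate minus control).
p_n, n_n = 3 / 50, 50
p_c, n_c = 8 / 45, 45
diff = p_n - p_c
# Unpooled standard error of a difference in proportions (assumed here).
se = math.sqrt(p_n * (1 - p_n) / n_n + p_c * (1 - p_c) / n_c)
lower, upper = confidence_interval(diff, se)
print(f"Risk difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```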
Interpretation of Confidence intervals
[Figure: CI shown relative to the null value]
Connection between P-values and CIs
If a 95% CI includes the null effect, the P-value is >0.05 (and we would fail to reject the null hypothesis)
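The rule can be checked against the trial rows quoted in these slides. In this sketch the helper name is illustrative, and the null value is taken as 1 because the effect measure is a ratio (it would be 0 for a difference).

```python
def ci_includes_null(lower, upper, null_value):
    """True if the interval contains the null value."""
    return lower <= null_value <= upper

# Rows quoted in these slides: (trial, risk ratio, 95% CI, P value).
trials = [
    ("Flaherty", 0.83, (0.33, 2.12), 0.70),
    ("Lis", 0.56, (0.19, 1.65), 0.29),
]

for name, rr, (lower, upper), p in trials:
    includes = ci_includes_null(lower, upper, null_value=1.0)
    verdict = "fail to reject H0" if includes else "reject H0"
    print(f"{name}: CI ({lower}, {upper}), P = {p} -> {verdict}")
```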
Interpreting confidence intervals
Table columns: Trial | Number dead / randomized (Intravenous nitrate, Control) | Risk Ratio | 95% C.I. | P value
Table adapted from Whitley and Ball. Critical Care; 6(3):222-225, 2002
What about clinical importance?
“A difference, to be a difference, must make a
difference.” -- Gertrude Stein
Interpretation of Confidence intervals
[Figure: CI shown relative to the null value]
“Multivariate analysis showed that routine use of immediate node dissection had no impact on survival (hazard ratio 0·72, 95% CI 0·5–1·02), whilst the status of regional nodes affected survival significantly (p=0·007).”
-- The point estimate of 0·72 and p-value of 0·07 suggest that the result (or a result even more extreme) is consistent with a relative survival benefit of 28%, and that the probability of the result being due to chance is small in comparison.
Clinical vs statistical significance
Reaction of investigator to results of a
statistical significance test
Statistical significance
Which statement(s) is/are correct?
Which of the following odds ratios for the relationship
between various risk factors and heart disease are
statistically significant at the .05-significance level?
Which are likely to be clinically significant?
Columns to fill in for each item: Statistically significant? | Clinically significant?
A. Odds ratio for every 1-year increase in age: 1.10 (95% CI: 1.01–1.19)
Summary of key points
A confidence interval:
- Quantifies how confident we are about the true value in the source population
- Gains precision with larger sample size
- Corresponds to hypothesis testing, but is much more informative than a P-value
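Example: 95% CI for an odds ratio
              Cases   Control   Total
Exposure +    20      10        30
Exposure -    6       24        30
OR = (20 * 24) / (6 * 10) = 8.0
\[ \ln(\mathrm{OR}) \pm \left[ 1.96 \times \mathrm{S.E.}(\ln \mathrm{OR}) \right] \]
\[ \mathrm{LCL} = e^{\ln(8) - 1.96\sqrt{\frac{1}{20} + \frac{1}{6} + \frac{1}{10} + \frac{1}{24}}}, \quad \mathrm{UCL} = e^{\ln(8) + 1.96\sqrt{\frac{1}{20} + \frac{1}{6} + \frac{1}{10} + \frac{1}{24}}} \]
\[ 95\%\ \mathrm{CI} = \left( (8.0)\,e^{-1.96\sqrt{\frac{1}{20} + \frac{1}{6} + \frac{1}{10} + \frac{1}{24}}},\ (8.0)\,e^{+1.96\sqrt{\frac{1}{20} + \frac{1}{6} + \frac{1}{10} + \frac{1}{24}}} \right) = (2.47, 25.9) \]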
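The worked example can be reproduced with a short sketch of the log-scale interval for an odds ratio: the interval is built on ln(OR) and then exponentiated. The function name is illustrative. It uses the exact normal quantile instead of the rounded 1.96, so results agree with the hand calculation up to rounding.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio and confidence limits from a 2x2 table.

    a, b = exposed cases, exposed controls
    c, d = unexposed cases, unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Table from the example: 20 and 10 exposed, 6 and 24 unexposed.
odds_ratio, lower, upper = odds_ratio_ci(20, 10, 6, 24)
print(f"OR = {odds_ratio:.1f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, it is asymmetric around 8.0 on the odds-ratio scale. It excludes the null value of 1, so the association is statistically significant at the 0.05 level.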