Statistics IV: Interpreting the Results of Statistical Tests
When in doubt as to the type of distribution that the sample data follow,
particularly when the sample size is small, non-parametric analysis should be
undertaken, accepting that the analysis may lack power. The best way to avoid
mistakes in choosing the appropriate statistical test for analysis of data
is to design a study with sufficiently large numbers of subjects in each group.
The main advantage of the paired over the unpaired study design is that paired
statistical tests are more powerful, so fewer subjects need to be recruited in
order to detect a given difference between the study groups. Against this are
pragmatic difficulties and additional time needed for crossover studies, and the
danger that, despite a washout period, there may still be an influence of the first
treatment on the second. The pitfalls of matching patients for all important
characteristics also have to be considered.
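
The gain in power from pairing can be illustrated with a minimal sketch. The data below are hypothetical (eight subjects measured under two treatments in a crossover design) and the helper names `unpaired_t` and `paired_t` are introduced here purely for illustration. Because each subject acts as his or her own control, the large between-subject variability drops out of the paired calculation, and the same mean difference yields a much larger t statistic.

```python
import math
import statistics


def unpaired_t(a, b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(a, b):
    """Paired t statistic: a one-sample t on the within-subject differences."""
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Hypothetical measurements: the same eight subjects under treatments A and B.
treatment_a = [120, 135, 150, 142, 128, 160, 138, 145]
treatment_b = [115, 131, 144, 139, 122, 155, 135, 139]

print(f"unpaired t = {unpaired_t(treatment_a, treatment_b):.2f}")
print(f"paired t   = {paired_t(treatment_a, treatment_b):.2f}")
```

The unpaired calculation treats the two columns as if they came from different subjects, so the spread between subjects swamps the treatment effect. The paired calculation is only valid when the observations really are matched.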
For example, we may obtain two sample data sets which appear to be from different
populations when we examine the data. Let us consider that the appropriate
statistical test is applied and the P-value obtained is 0.02. Conventionally, the
threshold for statistical significance is set at P < 0.05. In the above example,
the P-value falls below this threshold and the null hypothesis is rejected. What exactly does a
P-value of 0.02 mean? Let us imagine that the study is repeated numerous times. If
the null hypothesis is true and the population means are not different, a difference
between the sample means at least as large as that observed in the first study
would be observed only 2% of the time.
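
This "repeat the study numerous times" interpretation can be sketched as a small simulation. Everything numerical here is an assumption made for illustration: both groups are drawn from the same normal population (so the null hypothesis is true by construction), each group has 20 subjects, the standard deviation is 1, and the observed difference of 0.7356 is chosen so that the theoretical two-sided probability is about 0.02. The function name is likewise hypothetical.

```python
import random
import statistics


def fraction_at_least_as_extreme(observed_diff, n_per_group, sd, n_repeats, seed=0):
    """Repeat a two-group study under a true null hypothesis and return the
    fraction of repeats whose difference in sample means is at least as
    large (in either direction) as the observed difference."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_repeats):
        # Both groups come from the same population: the null hypothesis holds.
        group_a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diff = statistics.fmean(group_a) - statistics.fmean(group_b)
        if abs(diff) >= abs(observed_diff):
            extreme += 1
    return extreme / n_repeats


fraction = fraction_at_least_as_extreme(
    observed_diff=0.7356, n_per_group=20, sd=1.0, n_repeats=20000
)
print(f"fraction of repeats at least as extreme: {fraction:.3f}")
```

The printed fraction should lie close to 0.02, varying slightly with the random seed. Note that it describes how often chance alone produces such a difference when the null hypothesis is true; it is not the probability that the null hypothesis is true.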
In fact, if the actual P-value of the first study were 0.048 and that of a second
study were 0.052, the two studies would be entirely consistent with each other, even
though only the first falls below the conventional threshold. The
conventional value for statistical significance (P < 0.05) should always be viewed
in context and a P-value close to this arbitrary cut-off point should perhaps lead
to the conclusion that further work may be necessary before accepting or rejecting
the null hypothesis.
Confidence intervals
A confidence interval is a range of values, calculated from the sample data, which
is likely to include an unknown population parameter, for example, the mean. The
most commonly reported is the 95%
confidence interval (CI 95%), although any other confidence interval may be
calculated. If an investigation is repeated numerous times, the CI 95% generated
will contain the population mean 95% of the time.
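
The repeated-investigation reading of the CI 95% can also be checked by simulation. The sketch below assumes a normally distributed population with a known "true" mean of 100 and standard deviation of 15 (both hypothetical), samples of 25 subjects, and a hard-coded critical value of 2.064, the two-sided 95% point of Student's t distribution with 24 degrees of freedom. The function names are introduced for illustration only.

```python
import math
import random
import statistics

SAMPLE_SIZE = 25
# Two-sided 95% critical value of Student's t with SAMPLE_SIZE - 1 = 24 degrees
# of freedom, taken from standard tables. It must change if SAMPLE_SIZE changes.
T_CRIT_DF24 = 2.064


def ci95(sample):
    """95% confidence interval for the mean of a sample of SAMPLE_SIZE values."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - T_CRIT_DF24 * se, mean + T_CRIT_DF24 * se


def coverage(true_mean, sd, n_repeats, seed=0):
    """Fraction of repeated investigations whose CI 95% contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(SAMPLE_SIZE)]
        lower, upper = ci95(sample)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_repeats


observed_coverage = coverage(true_mean=100.0, sd=15.0, n_repeats=10000)
print(f"proportion of intervals containing the true mean: {observed_coverage:.3f}")
```

The proportion should be close to 0.95. Each individual interval either contains the population mean or does not; the 95% refers to the long-run behaviour of the procedure across repeated investigations.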