Lecture1 - Parametric Statistical Tests For Independent Groups
Endgames
STATISTICAL QUESTION
Researchers investigated whether a school based educational programme aimed at reducing consumption of carbonated drinks prevented excessive weight gain in children. A cluster randomised controlled trial study design was used. The intervention, which was delivered over one school year, included focused education promoting a healthy diet together with discouragement of carbonated drink consumption. The control group received no intervention. Children were followed for three years from baseline [1].

The main outcome measures included body mass index (BMI) converted to age and sex specific z scores. A total of 644 children aged between 7 and 11 years from six schools were recruited. Measurements were obtained from 434 children three years after baseline. Distributional assumptions of normality in the BMI z scores were verified. At follow-up the age and sex specific BMI z scores had increased in the control group by a mean of 0.1 (SD 0.53) but decreased in the intervention group by 0.01 (SD 0.58). The mean difference between treatment groups was not significant (0.1 (95% confidence interval 0 to 0.21); P=0.06).

Which one of the following statistical tests was most likely used to compare treatment groups in the mean difference in the mean change in BMI z score over three years from baseline?

a) Paired t test
b) Student’s t test
c) Wilcoxon rank sum test
d) Wilcoxon signed ranks test

Answers

Student’s t test (answer b) would most likely have been used to compare the treatment groups in the mean difference in mean change in BMI z score over three years from baseline.

The trial investigated whether an educational programme aimed at reducing consumption of carbonated drinks prevented excessive weight gain in children. The outcome measures included BMI, recorded at baseline and after three years. The distribution of BMI would have been different for children of different ages, with each distribution described by a unique mean and standard deviation. Boys would have been expected to have on average a greater BMI than girls. Each child’s BMI was therefore transformed to a z score specific for their age and sex. The standardisation of outcome measures by using z scores has been described in a previous question [2]. Each child’s change in BMI z score was calculated; that is, his or her BMI z score at baseline was subtracted from that after three years. The mean change in BMI z score over three years in the intervention group was compared with that in the control group, providing an estimate of the true effect of the school based educational programme at preventing excessive weight gain in children.

Student’s t test (answer b) would most likely have been used to compare treatment groups in the mean change in BMI z score over the three years of follow-up. Student’s t test, also known as the independent samples t test and described in a previous question [3], compares the means of a variable measured on a continuous scale in two independent groups. It is a parametric test that assumes that the change scores in the age and sex specific BMI z scores were approximately normally distributed in both groups and that the variances for the two groups were equal. The researchers reported that distributional assumptions of normality in the BMI z scores had been verified. Parametric tests have been described in a previous question [4].

A further indication that Student’s t test (answer b) would most likely have been used was that a 95% confidence interval for the mean difference between treatment groups in mean change in BMI z scores from baseline was presented. A 95% confidence interval for the mean difference should have been derived only if the assumption of normality required for a parametric test could be made. If the assumption of normality could not have been made, the Wilcoxon rank sum test (answer c), a non-parametric test described below, would have been used. However, under such circumstances it would not have been sensible to derive the 95% confidence interval for the mean difference between the treatment groups.

In the statistical test to compare the intervention with the control treatment, the null hypothesis stated that in the population from which the sample was taken there was no difference between treatment groups in the mean change in BMI z score over three years of follow-up. The alternative hypothesis was two sided: in the population the intervention, when compared with control treatment, resulted in a larger or a smaller mean change in BMI z score over three years of follow-up. Although at follow-up the BMI z scores had increased in the control group by a mean of 0.1 but decreased in the intervention group by 0.01, the mean difference between the groups was not significant (P=0.06) at the 5% level of significance. Therefore, there was no evidence of a difference between treatments in the effect on the mean change in BMI z score over three years.

The paired t test (answer a) is used to compare two related measurements of a continuous variable. The age and sex specific BMI z score of each child was recorded at baseline and again three years later. The paired t test could have been used to test whether the mean change in BMI z scores over three years was significantly different from zero in each treatment group. The paired t test is a parametric test and would have assumed that the distribution of the change in BMI z scores in the population was normally distributed. Although analysis of the mean change in BMI z scores in each treatment group may be useful, it would be more informative to compare the mean change in the intervention group with that in the control group. Children in the control group did not receive any intervention, and thus this group provided an estimate of the natural change in BMI z scores over three years. However, investigating the mean change within groups can be misleading, as a previous question has described [5].

The Wilcoxon rank sum test (answer c) and Wilcoxon signed ranks test (answer d) are non-parametric methods that have been described in previous questions [6, 7]. The Wilcoxon rank sum test is used to compare two independent groups in a variable measured on a continuous or ordinal scale. The Wilcoxon signed ranks test is used to compare two related samples in a variable that is continuous or ordinal. The Wilcoxon rank sum test and Wilcoxon signed ranks test are non-parametric methods and therefore make no assumption about the distribution of the variable in the population. The tests are used when the distribution of the variable does not satisfy the assumption of normality or when it is not achieved after a transformation of the data. The log transformation of data has been described in a previous question [8]. If the assumption of normality could not have been made, and non-parametric tests had been used, then it would not have been sensible to calculate a 95% confidence interval for the mean difference between treatment groups in the mean change in BMI z scores from baseline.

Competing interests: None declared.

1 James J, Thomas P, Kerr D. Preventing childhood obesity: two year follow-up results from the Christchurch obesity prevention programme in schools (CHOPPS). BMJ 2007;335:762-5.
2 Sedgwick P. Standardisation of outcome measures (z scores). BMJ 2012;345:e6178.
3 Sedgwick P. Independent samples t test. BMJ 2010;340:c2673.
4 Sedgwick P. Parametric v non-parametric statistical tests. BMJ 2012;344:e1753.
5 Sedgwick P. Analysis of outcome measures within treatment groups. BMJ 2012;345:e7201.
6 Sedgwick P. Non-parametric statistical tests for independent groups: numerical data. BMJ 2012;344:e3354.
7 Sedgwick P. Non-parametric statistical tests for two related groups: numerical data. BMJ 2012;344:e2537.
8 Sedgwick P. Log transformation of data. BMJ 2012;345:e6727.

Cite this as: BMJ 2012;345:e8145
doi: 10.1136/bmj.e8145 (Published 30 November 2012)
© BMJ Publishing Group Ltd 2012
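
The independent samples comparison discussed in the answer can be illustrated with a minimal sketch of Student's t test using a pooled variance, together with a 95% confidence interval for the mean difference. Everything named here is an assumption made for illustration: the function `student_t`, the variable names, and above all the data, which are invented change scores and not values from the trial (the article does not report the group sizes needed to reproduce its result). The critical value 2.306 is the standard two sided 5% tabulated value of the t distribution with 8 degrees of freedom, which matches this toy example only.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Student's t test for two independent groups, assuming equal variances.

    Returns (mean difference, standard error, t statistic, degrees of freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff, se, mean_diff / se, df


# Hypothetical changes in BMI z score (follow-up minus baseline); NOT trial data.
control = [0.3, -0.2, 0.5, 0.1, -0.1]
intervention = [0.1, -0.4, 0.2, -0.3, 0.0]

mean_diff, se, t_stat, df = student_t(control, intervention)

# Assumed critical value: two sided 5% point of t with 8 degrees of freedom.
T_CRIT_DF8 = 2.306
ci_lower = mean_diff - T_CRIT_DF8 * se
ci_upper = mean_diff + T_CRIT_DF8 * se

print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
print("significant at 5% level:", abs(t_stat) > T_CRIT_DF8)
```

In this toy example the confidence interval includes zero, so the two sided test does not reject the null hypothesis at the 5% level. With real data the critical value must be taken for the actual degrees of freedom, and the normality and equal variance assumptions described above should be checked first.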
For personal use only: See rights and reprints http://www.bmj.com/permissions Subscribe: http://www.bmj.com/subscribe