Assignment #2
“Students who have a running/walking path (excluding a sidewalk) within 200m of where
they live will engage in more hours of moderate-vigorous physical activity during the middle
of the term (PA last week) compared to students who do not have a running/walking path
within 200m of where they live.”
1a) Select and state the appropriate statistical analysis given the research hypothesis (1
mark).
The t-test for two independent means (unequal sample sizes) will be used.
1b) Using JASP, produce appropriate frequency distribution tables of the data (2
marks).
The distribution does not contain any data that is ‘problematic’ since the values inputted are
appropriate to the question that was asked. Firstly, all participants have responded to the
question ‘Path within 200m?’ with either a ‘No’ or a ‘Yes’. Secondly, there are no entered
values like -3 hours or 170 hours that fall outside the limits of a 0-168-hour period within the
week.
2a) State the null and alternative hypothesis (assume two-tailed, alpha = .05) (2 marks).
μ1: Mean hours of moderate-vigorous physical activity engaged in during the last week by
students who do not have a running/walking path (excluding a sidewalk) within 200m of where
they live.
μ2: Mean hours of moderate-vigorous physical activity engaged in during the last week by
students who have a running/walking path (excluding a sidewalk) within 200m of where they live.
H0: μ1 = μ2
H1: μ1 ≠ μ2
2b) State the decision rule for the analysis if you were to conduct the statistical analysis
by hand (2 marks).
Rejection rule:
2c) Using JASP, conduct the appropriate statistical analysis given the research
hypothesis (two-tailed, alpha = .05). Include the JASP output of the t-table in your
assignment write up.
2d) Check and discuss the assumptions of the analysis you conducted (eg: what were the
assumptions based on the analysis? Were the assumptions met?) Please include the
JASP output for any assumption checks you did using JASP.
There are 5 assumptions associated with the analysis that was conducted.
2. Independence:
This assumption states that scores on the dependent variable must be independent,
meaning that they must come from different participants. In this study, data was
collected using a between groups design. Each participant was categorized into two
groups based on their response to the question that was asked regarding whether or
not they had a running/walking path (excluding sidewalk) within 200m of where they
lived. Therefore, this assumption is met since no participant was present in both
groups.
4. Homogeneity of variance:
Homogeneity of variance assumes that the two groups have equal variances, which
was tested using Levene's test. Since the p-value of 0.54 for the F-statistic is greater
than 0.05, the variances of the two groups do not statistically significantly differ from
each other. Therefore, the assumption of equality of variances is met.
Since the p-value of 0.89 for the ‘No’ group is greater than 0.05, the distribution of
the hours of physical activity engaged in by KIN 206 students who do not have a
running/walking path (excluding a sidewalk) within 200m of where they live does not
statistically significantly differ from a normal distribution. Thus, the assumption of normality
is met for this group.
However, the p-value of 0.00989 for the ‘Yes’ group is less than 0.05, which indicates
that the distribution of the hours of physical activity engaged by KIN 206 students who have
a running/walking path (excluding a sidewalk) within 200m of where they live statistically
significantly differs from a normal distribution. Although this initial analysis suggests that the
assumption of normality for this group is not met, it is important to evaluate the assumption
of normality further.
Moreover, a slightly negative kurtosis score of -0.97 implies that the distribution has a
little more variability compared to a normal distribution that is mesokurtic (neither peaked
nor flat). This is also illustrated by the histogram, which has a flatter shape
(platykurtic). Given that only a kurtosis statistic greater than 2 or less than -2 raises
concerns regarding the degree to which the distribution deviates from normality, the data
can still be treated as approximating a normal distribution.
The mean hours of physical activity last week engaged by 15 KIN 206 students who did not
have a running/walking path within 200m of where they live (excluding a sidewalk) (M =
6.27, SD = 4.10) did not statistically significantly differ from the mean hours of physical
activity last week engaged by 45 KIN 206 students who have a running/walking path within
200m of where they live (excluding a sidewalk) (M = 6.02, SD = 4.25), t(58) = 0.19, p >
0.05, with analyses suggesting an extremely small effect size (d = 0.06).
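As a rough cross-check of the reported values, the pooled-variance t statistic and Cohen's d can be recomputed from the summary statistics alone. The sketch below is not the JASP procedure; it assumes equal variances (consistent with the Levene's test result) and uses the rounded means and standard deviations reported above, so the last decimal may differ slightly from the JASP output. The function name `pooled_t_from_summary` is a hypothetical helper introduced for illustration.

```python
import math

def pooled_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Independent-samples t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (m1 - m2) / se_diff
    # Cohen's d: mean difference in units of the pooled standard deviation.
    d = (m1 - m2) / math.sqrt(pooled_var)
    return t, df, d

# 'No' group: M = 6.27, SD = 4.10, N = 15; 'Yes' group: M = 6.02, SD = 4.25, N = 45
t, df, d = pooled_t_from_summary(6.27, 4.10, 15, 6.02, 4.25, 45)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

The recomputed t and d should land close to the reported t(58) = 0.19 and d = 0.06, with any small gap attributable to rounding of the inputs.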
2f) Relate the results of the analysis to the research hypothesis (1 mark).
The existence of a walking/running path (excluding a sidewalk) within 200m of where a KIN
206 student lived did not affect the hours of moderate-vigorous physical activity they
engaged in during the middle of the term.
3a) Calculate and report the 95% confidence intervals for each sample mean in the
analysis you conducted in step 2. Show the formula that you used to calculate the 95%
confidence interval and the main values within the formula (eg: the mean, standard
error, t) (4 marks).
Students without a path within 200m (‘No’ group, N = 15):
sx̅ = s / √N
sx̅ = 4.10 / √15 = 1.0586
df = N – 1
df = 15 – 1 = 14
A 95% C.I implies α = 0.05 and df = 14.
t = ± 2.145
95% C.I = 6.27 ± 2.145 (1.0586)
There is a 0.95 probability that the interval of 4.00 hours to 8.54 hours contains the
population mean hours of moderate-vigorous physical activity during the middle of the term
(PA last week) engaged by 15 KIN 206 students who do not have a running/walking path within
200m of where they live.

Students with a path within 200m (‘Yes’ group, N = 45):
sx̅ = s / √N
sx̅ = 4.25 / √45 = 0.6336
df = N – 1
df = 45 – 1 = 44
A 95% C.I implies α = 0.05 and df = 44.
t = ± 2.021
95% C.I = 6.02 ± 2.021 (0.6336)
There is a 0.95 probability that the interval of 4.74 hours to 7.30 hours contains the
population mean hours of moderate-vigorous physical activity during the middle of the term
(PA last week) engaged by 45 KIN 206 students who have a running/walking path within 200m of
where they live.
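The hand calculation above can be sketched in a few lines of Python. This is only an illustration of the formula mean ± t × (s / √N); the critical t values are taken from the write-up rather than computed, and `mean_ci` is a hypothetical helper name.

```python
import math

def mean_ci(mean, sd, n, t_crit):
    """Confidence interval for a sample mean: mean ± t_crit * (sd / sqrt(n))."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se     # half-width of the interval
    return mean - margin, mean + margin

# Critical t values (2.145 and 2.021) are the table values used in the write-up.
no_path = mean_ci(6.27, 4.10, 15, 2.145)
yes_path = mean_ci(6.02, 4.25, 45, 2.021)

print(f"No path:  {no_path[0]:.2f} to {no_path[1]:.2f} hours")
print(f"Yes path: {yes_path[0]:.2f} to {yes_path[1]:.2f} hours")
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies and mirrors the table lookup done by hand.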
3b) Using JASP, create a descriptive plot with 95% confidence intervals and include it
in your assignment write up (1 mark).
[JASP descriptive plot with 95% confidence intervals; axis label: Hours of PA during the last week]
3c) Based on your work in steps 3a) and 3b), do the intervals between the samples
overlap (2 marks)? What do you think this overlap or lack of overlap means (2 marks)?
From our calculations and analysis of the descriptive plot above, it is evident that the
intervals between the samples overlap. Since the statistical analysis was conducted to deduce
whether the presence of a running/walking path (excluding sidewalk) within 200m of a KIN
206 student’s area of residence affects the hours of moderate-vigorous physical activity they
engage in during the middle of the term, the overlap between the C.Is suggests that the
difference between the two groups of KIN 206 students is not statistically significant. This
reinforces the conclusion that was reached through our independent samples t-test, which
failed to reject the null hypothesis.
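The overlap can also be checked numerically from the interval endpoints calculated in 3a). The sketch below is a minimal illustration with a hypothetical helper, `intervals_overlap`; it is an informal check that complements the t-test rather than replacing it.

```python
def intervals_overlap(a, b):
    """Return True when closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# 95% confidence intervals from step 3a), in hours
no_path_ci = (4.00, 8.54)
yes_path_ci = (4.74, 7.30)

overlap = intervals_overlap(no_path_ci, yes_path_ci)
print(overlap)
```

Here the ‘Yes’ interval sits entirely inside the ‘No’ interval, so the check reports an overlap.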
Hypothesis 2:
“Students will participate in more hours of moderate-vigorous physical activity
during the first week of term compared to during the middle of term.”
1a) Select and state the appropriate statistical analysis given the research hypothesis (1
mark).
1b) Using JASP, produce appropriate frequency distribution tables of the data (2
marks).
The distribution does not contain any data that is ‘problematic’ since the values inputted are
appropriate to the question that was asked. For instance, there are no entered values like -3
hours or 170 hours that fall outside the limits of a 0-168-hour period within the week.
2a) State the null and alternative hypothesis (assume two-tailed, alpha = .05) (2 marks).
H0: μD = 0
H1: μD ≠ 0
2b) State the decision rule for the analysis if you were to conduct the statistical analysis
by hand (2 marks).
Degrees of freedom = N_D - 1
Degrees of freedom = 60 - 1 = 59
Rejection rule:
If t < -2.009 or t > 2.009, reject H0; otherwise do not reject H0.
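The two-tailed rejection rule can be expressed as a short function. This is a minimal sketch: `two_tailed_decision` is a hypothetical name, the critical value 2.009 is the table value used above, and the observed t of -0.76 is the value reported in the results of this analysis.

```python
def two_tailed_decision(t_obs, t_crit):
    """Apply a two-tailed rejection rule: reject H0 when |t_obs| exceeds t_crit."""
    if abs(t_obs) > t_crit:
        return "reject H0"
    return "do not reject H0"

decision = two_tailed_decision(-0.76, 2.009)
print(decision)
```

Comparing the absolute value of t against a single positive critical value is equivalent to checking both tails separately.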
2c) Using JASP, conduct the appropriate statistical analysis given the research
hypothesis (two-tailed, alpha = .05). Include the JASP output of the t-table in your
assignment write up.
-2.009 < t = -0.76 < 2.009; therefore, do not reject H0 (p > 0.05)
2d) Check and discuss the assumptions of the analysis you conducted (eg: what were the
assumptions based on the analysis? Were the assumptions met?) Please include the
JASP output for any assumption checks you did using JASP.
There are 2 assumptions associated with the analysis that was conducted.
2. Normality:
This assumption states that the distribution of differences in the dependent variable
should be normally distributed and can be verified using the Shapiro-Wilk’s test – a
statistical test to assess whether the data deviates from a normal distribution.
Upon first glance, since the p-value for the distribution of difference scores between
the hours of physical activity engaged by KIN 206 students is less than 0.01 (which is
automatically less than 0.05), it can be inferred that the distribution statistically significantly
differs from the normal distribution. However, this assumption must be explored further via
examining the modality, symmetry, and kurtosis for the distribution of the difference scores.
Descriptive Statistics

                         Hours of PA during   Hours of PA during   Difference
                         the first week       the last week        scores
Valid                    60                   60                   60
Missing                  0                    0                    0
Median                   6.00                 5.50                 0.00
Mean                     5.82                 6.08                 -0.27
Std. Deviation           3.97                 4.18                 2.73
Skewness                 0.18                 0.27                 -0.73
Std. Error of Skewness   0.31                 0.31                 0.31
Kurtosis                 -0.85                -0.93                1.73
Std. Error of Kurtosis   0.61                 0.61                 0.61
Minimum                  0.00                 0.00                 -9.00
Maximum                  14.00                14.00                6.00
Distribution Plots
The mean hours of moderate-vigorous physical activity engaged in by 60 KIN 206 students during
the first week of the term (M = 5.82, SD = 3.97) did not statistically significantly differ
from the mean hours of moderate-vigorous physical activity they engaged in during the last week
(M = 6.08, SD = 4.18), t(59) = -0.76, p > 0.05, with analyses suggesting a very small effect
size (d = 0.10).
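As a rough cross-check, the dependent-samples t statistic and effect size can be recomputed from the mean and standard deviation of the difference scores in the descriptive statistics (mean = -0.27, SD = 2.73, N = 60). The sketch below is not the JASP procedure; `paired_t_from_summary` is a hypothetical helper, and because the inputs are rounded, the result may differ from JASP in the last decimal.

```python
import math

def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired-samples t and Cohen's d computed from difference-score summaries."""
    se = sd_diff / math.sqrt(n)   # standard error of the mean difference
    t = mean_diff / se
    d = mean_diff / sd_diff       # effect size in SD-of-difference units
    return t, n - 1, d

t, df, d = paired_t_from_summary(-0.27, 2.73, 60)
print(f"t({df}) = {t:.2f}, |d| = {abs(d):.2f}")
```

The sign of t and d only reflects the order of subtraction (first week minus last week); the reported effect size is its magnitude.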
2f) Relate the results of the analysis to the research hypothesis (1 mark).
The statistical analyses revealed that the hours of moderate-vigorous physical activity KIN
206 students participated in were not affected by the time of the term, i.e. whether it was
during the first week or during the middle.
3a) Calculate and report the 95% confidence intervals for each sample mean in the
analysis you conducted in step 2. Show the formula that you used to calculate the 95%
confidence interval and the main values within the formula (eg: the mean, standard
error, t) (4 marks).
Hours of PA during the first week (N = 60):
sx̅ = s / √N
sx̅ = 3.97 / √60 = 0.5125
df = N – 1
df = 60 – 1 = 59
A 95% C.I implies α = 0.05 and df = 59.
t = ± 2.009
95% C.I = 5.82 ± 2.009 (0.5125)
There is a 0.95 probability that the interval of 4.79 hours to 6.85 hours contains the
population mean hours of moderate-vigorous physical activity engaged by KIN 206 students
during the first week of term.

Hours of PA during the last week (N = 60):
sx̅ = s / √N
sx̅ = 4.18 / √60 = 0.5396
df = N – 1
df = 60 – 1 = 59
A 95% C.I implies α = 0.05 and df = 59.
t = ± 2.009
95% C.I = 6.08 ± 2.009 (0.5396)
There is a 0.95 probability that the interval of 5.00 hours to 7.16 hours contains the
population mean hours of moderate-vigorous physical activity engaged by KIN 206 students
during the middle of the term (PA last week).
3b) Using JASP, create a descriptive plot with 95% confidence intervals and include it
in your assignment write up (1 mark).
[JASP descriptive plot with 95% confidence intervals: Hours of PA during the first week – Hours of PA during the last week]
3c) Based on your work in steps 3a) and 3b), do the intervals between the samples
overlap (2 marks)? What do you think this overlap or lack of overlap means (2 marks)?
From our calculations and analysis of the descriptive plot above, it is evident that the
intervals between the samples overlap. Since the statistical analysis was conducted to deduce
whether the time of the term affects the hours of moderate-vigorous physical activity engaged
by KIN 206 students, the overlap between the C.Is suggests that the difference between the
hours of physical activity engaged within the two time periods is not statistically significant.
This reinforces the conclusion that was reached through our dependent samples t-test, which
failed to reject the null hypothesis.