Assignment #2

This document contains an analysis of physical activity levels of students based on proximity to walking paths. It tests the hypothesis that students closer to paths engage in more moderate-vigorous activity. Frequency tables show activity hour distributions. A t-test finds no significant difference between groups. Assumptions like independence, normality and equal variance are checked and met.

Uploaded by

Suprita Anand

Assignment #2

Suprita Anand (93856235)

Jenna Multani (10143741)

The University of British Columbia

KIN 206: Introduction to Statistics in Kinesiology

Dr. Carolyn McEwen

March 17th, 2022


Hypothesis 1:

“Students who have a running/walking path (excluding a sidewalk) within 200m of where
they live will engage in more hours of moderate-vigorous physical activity during the middle
of the term (PA last week) compared to students who do not have a running/walking path
within 200m of where they live.”

1a) Select and state the appropriate statistical analysis given the research hypothesis (1
mark).

The t-test for two independent means (unequal sample sizes) will be used.

1b) Using JASP, produce appropriate frequency distribution tables of the data (2
marks).

Frequencies for Hours of PA during the last week


Path within 200m?   Hours of PA during the last week   Frequency   Percent   Valid Percent   Cumulative Percent
No 0 2 13.33 13.33 13.33
  2 1 6.67 6.67 20.00
  3 1 6.67 6.67 26.67
  4 1 6.67 6.67 33.33
  5 1 6.67 6.67 40.00
  6 2 13.33 13.33 53.33
  7 2 13.33 13.33 66.67
  9 2 13.33 13.33 80.00
  11 2 13.33 13.33 93.33
  14 1 6.67 6.67 100.00
Missing 0 0.00    
  Total 15 100.00    
Yes 0 6 13.33 13.33 13.33
  1 1 2.22 2.22 15.56
  2 1 2.22 2.22 17.78
  3 5 11.11 11.11 28.89
  4 10 22.22 22.22 51.11
  5 1 2.22 2.22 53.33
  6 2 4.44 4.44 57.78
  7 3 6.67 6.67 64.44
  8 2 4.44 4.44 68.89
  9 1 2.22 2.22 71.11
  10 5 11.11 11.11 82.22
  11 2 4.44 4.44 86.67
  12 3 6.67 6.67 93.33
  14 3 6.67 6.67 100.00
Missing 0 0.00    
  Total 45 100.00    
1c) Is there problematic data? Explain your answer (1 mark).

The distribution does not contain any data that is ‘problematic’ since the values inputted are
appropriate to the question that was asked. Firstly, all participants have responded to the
question ‘Path within 200m?’ with either a ‘No’ or a ‘Yes’. Secondly, there are no entered
values like -3 hours or 170 hours that fall outside the limits of a 0-168-hour period within the
week.
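
The range check described above can be sketched in Python as a cross-check. This is a minimal sketch, not part of the JASP workflow: the counts are re-keyed by hand from the frequency table in 1b, and the names `freq` and `out_of_range` are hypothetical.

```python
# Frequency table from 1b, re-keyed by hand as {hours: count} per group.
freq = {
    "No": {0: 2, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 9: 2, 11: 2, 14: 1},
    "Yes": {0: 6, 1: 1, 2: 1, 3: 5, 4: 10, 5: 1, 6: 2, 7: 3, 8: 2,
            9: 1, 10: 5, 11: 2, 12: 3, 14: 3},
}

HOURS_PER_WEEK = 7 * 24  # 168, the upper limit used in the answer above


def out_of_range(table, low=0, high=HOURS_PER_WEEK):
    """Return the hour values in a {hours: count} table that fall outside [low, high]."""
    return [hours for hours in table if not low <= hours <= high]


# Group sizes implied by the table, and any impossible values per group.
group_sizes = {group: sum(table.values()) for group, table in freq.items()}
problem_values = {group: out_of_range(table) for group, table in freq.items()}

print(group_sizes)
print(problem_values)  # an empty list means every value lies within 0-168 hours
```

The group sizes recovered this way (15 and 45) agree with the totals in the table, and both lists of out-of-range values come back empty.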

2a) State the null and alternative hypothesis (assume two-tailed, alpha = .05) (2 marks).

μ1: Mean hours of moderate-vigorous physical activity engaged in during the last week by
students who do not have a running/walking path (excluding a sidewalk) within 200m of where
they live.

μ2: Mean hours of moderate-vigorous physical activity engaged in during the last week by
students who have a running/walking path (excluding a sidewalk) within 200m of where they live.

H0: μ1 = μ2

H1: μ1 ≠ μ2

2b) State the decision rule for the analysis if you were to conduct the statistical analysis
by hand (2 marks).

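Degrees of freedom = (N1 - 1) + (N2 - 1)

Degrees of freedom = (15 - 1) + (45 - 1) = 58

For α = .05 (two-tailed) and df = 58, critical value = ± 2.009 (this matches the tabled value
for df = 50; the exact value for df = 58 is approximately ± 2.002, which leads to the same
decision)

Rejection rule:

If t < -2.009 or t > 2.009, reject H0; otherwise do not reject H0.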
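
For readers without a t-table at hand, the critical value in the decision rule can be approximated numerically. The following is a minimal sketch using only the Python standard library; the function names (`t_pdf`, `t_cdf`, `t_critical`, `reject_h0`) are hypothetical, and the approach (Simpson's rule plus bisection) is simply one convenient way to invert the t distribution, not the method used by JASP.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0: 0.5 plus the Simpson's-rule area under the density on [0, x]."""
    h = x / steps
    area = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + area * h / 3


def t_critical(df, alpha=0.05):
    """Two-tailed critical value: the t whose upper-tail area is alpha / 2 (found by bisection)."""
    lo, hi = 0.0, 20.0
    target = 1 - alpha / 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def reject_h0(t, df, alpha=0.05):
    """Apply the two-tailed rejection rule: reject H0 when |t| exceeds the critical value."""
    return abs(t) > t_critical(df, alpha)


print(round(t_critical(58), 3))  # exact-df critical value
print(round(t_critical(50), 3))  # value for df = 50
```

Under these assumptions the critical value for df = 58 comes out slightly below the 2.009 used above, so the tabled value is the more conservative of the two.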

2c) Using JASP, conduct the appropriate statistical analysis given the research
hypothesis (two-tailed, alpha = .05). Include the JASP output of the t-table in your
assignment write up.

Independent Samples T-Test

                                   t      df   p      Mean Difference   SE Difference   95% CI for Mean Difference   Cohen's d   95% CI for Cohen's d
Hours of PA during the last week   0.19   58   0.85   0.24              1.26            [-2.27, 2.76]                0.06        [-0.53, 0.64]
Note.  Student's t-test.

Since -2.009 < t = 0.19 < 2.009, do not reject H0 (p = 0.85 > 0.05).
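
As a cross-check on the JASP output, the pooled-variance (Student's) t statistic can be recomputed from the frequency table in 1b. This is a minimal sketch: the raw scores are rebuilt by hand from the tabled counts, and the names `freq_no`, `freq_yes` and `expand` are hypothetical.

```python
import math
from statistics import mean, variance

# Counts re-keyed by hand from the frequency table in 1b: {hours: count}.
freq_no = {0: 2, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 9: 2, 11: 2, 14: 1}
freq_yes = {0: 6, 1: 1, 2: 1, 3: 5, 4: 10, 5: 1, 6: 2, 7: 3, 8: 2,
            9: 1, 10: 5, 11: 2, 12: 3, 14: 3}


def expand(table):
    """Turn a {value: count} table back into a flat list of scores."""
    return [value for value, count in table.items() for _ in range(count)]


no, yes = expand(freq_no), expand(freq_yes)
n1, n2 = len(no), len(yes)
df = n1 + n2 - 2

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n1 - 1) * variance(no) + (n2 - 1) * variance(yes)) / df

mean_diff = mean(no) - mean(yes)                      # 'No' minus 'Yes'
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the difference
t_stat = mean_diff / se_diff
cohens_d = mean_diff / math.sqrt(pooled_var)

print(df, round(mean_diff, 2), round(se_diff, 2), round(t_stat, 2), round(cohens_d, 2))
```

The rounded values agree with the t, df, mean difference, SE difference and Cohen's d reported in the table.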

2d) Check and discuss the assumptions of the analysis you conducted (eg: what were the
assumptions based on the analysis? Were the assumptions met?) Please include the
JASP output for any assumption checks you did using JASP.

There are 5 assumptions associated with the analysis that was conducted.

1. Level of measurement of the dependent variable:


This assumption states that the dependent variable must be measured at an interval or
ratio level. In this study, the dependent variable is the hours of moderate-vigorous
physical activity engaged in by KIN 206 students, which has a true zero point and is
therefore measured at the ratio level. Therefore, this assumption is met.

2. Independence:
This assumption states that scores on the dependent variable must be independent,
meaning that they must come from different participants. In this study, data were
collected using a between-groups design. Each participant was categorized into one of
two groups based on their response to the question of whether or not they had a
running/walking path (excluding sidewalk) within 200m of where they lived. Therefore,
this assumption is met since no participant was present in both groups.

3. Levels of the independent variable:


This assumption states that there must be two levels of the independent variable. In
this study, the two levels of the independent variable are as follows: KIN 206 students
who have a running/walking path (excluding sidewalk) within 200m of where they
live and KIN 206 students who do not have a running/walking path (excluding
sidewalk) within 200m of where they live. Therefore, this assumption is met.

4. Homogeneity of variance:
Homogeneity of variance assumes that the two groups have equal variances, which
was tested using Levene's test. Since the p-value of 0.54 for the F-statistic is greater
than 0.05, the variances of the two groups do not differ statistically significantly from
each other. Therefore, the assumption of equality of variances is met.

Test of Equality of Variances (Levene's)


  F df p
Hours of PA during the last week 0.39 1 0.54
5. Normality:
This assumption states that the dependent variable should be normally distributed and
can be verified using the Shapiro-Wilk’s test – a statistical test to assess whether the
data deviates from a normal distribution.

Test of Normality (Shapiro-Wilk)


    W p
Hours of PA during the last week No 0.97 0.89
  Yes 0.93 9.89e-3
Note.  Significant results suggest a deviation from normality.

Since the p-value of 0.89 for the ‘No’ group is greater than 0.05, the distribution of
the hours of physical activity engaged in by KIN 206 students who do not have a
running/walking path (excluding a sidewalk) within 200m of where they live does not
differ statistically significantly from a normal distribution. Thus, the assumption of normality
is met for this group.

However, the p-value of 0.00989 for the ‘Yes’ group is less than 0.05, which indicates
that the distribution of the hours of physical activity engaged by KIN 206 students who have
a running/walking path (excluding a sidewalk) within 200m of where they live statistically
significantly differs from a normal distribution. Although this initial analysis suggests that the
assumption of normality for this group is not met, it is important to evaluate the assumption
of normality further.

[Histogram: Hours of PA during the last week, ‘Yes’ group]

Descriptive Statistics

                         Hours of PA during the last week
                         No      Yes
Valid                    15      45
Missing                  0       0
Median                   6.00    4.00
Mean                     6.27    6.02
Std. Error of Mean       1.06    0.63
Std. Deviation           4.10    4.25
Variance                 16.78   18.02
Skewness                 0.12    0.33
Std. Error of Skewness   0.58    0.35
Kurtosis                 -0.55   -0.97
Std. Error of Kurtosis   1.12    0.69
Minimum                  0.00    0.00
Maximum                  14.00   14.00

Since the mean hours of physical activity (6.02 hours) for KIN 206 students who have
a running/walking path (excluding sidewalk) within 200m of where they live is greater than
the median (4.00 hours), the distribution is positively skewed. The histogram illustrates
that the highest frequencies occur towards the left of the distribution and the lowest
frequencies occur towards the right tail. However, the skewness statistic of 0.33 is only
slightly positive, so the data still approximate a normal distribution; only a skewness
statistic greater than 2 or less than -2 raises concerns about the degree to which the
distribution is asymmetrical.

Moreover, the slightly negative kurtosis statistic of -0.97 implies that the distribution has
a little more variability than a normal distribution, which is mesokurtic (neither peaked
nor flat). This is also illustrated by the histogram, which has a flatter (platykurtic) shape.
Given that only a kurtosis statistic greater than 2 or less than -2 raises concerns about the
degree to which the distribution deviates from normality, the data still approximate a
normal distribution.

Therefore, based on the aforementioned analysis, it can be concluded that the
assumption of normality for the ‘Yes’ group is met.
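
The two data-driven checks above (equal variances and the shape of the ‘Yes’ distribution) can be reproduced approximately outside JASP. The sketch below makes two assumptions that the source does not state: Levene's test is computed on absolute deviations from each group's mean (rather than the median), and skewness and kurtosis use the usual bias-adjusted sample formulas. All function and variable names are hypothetical, and the scores are rebuilt by hand from the frequency table in 1b.

```python
from statistics import mean

# Counts re-keyed by hand from the frequency table in 1b: {hours: count}.
freq_no = {0: 2, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 9: 2, 11: 2, 14: 1}
freq_yes = {0: 6, 1: 1, 2: 1, 3: 5, 4: 10, 5: 1, 6: 2, 7: 3, 8: 2,
            9: 1, 10: 5, 11: 2, 12: 3, 14: 3}


def expand(table):
    """Turn a {value: count} table back into a flat list of scores."""
    return [value for value, count in table.items() for _ in range(count)]


def levene_f(*groups):
    """Levene's F: a one-way ANOVA on absolute deviations from each group mean (assumed centring)."""
    devs = []
    for g in groups:
        centre = mean(g)
        devs.append([abs(x - centre) for x in g])
    n_total = sum(len(z) for z in devs)
    grand = sum(sum(z) for z in devs) / n_total
    z_means = [mean(z) for z in devs]
    between = sum(len(z) * (zm - grand) ** 2 for z, zm in zip(devs, z_means))
    within = sum((v - zm) ** 2 for z, zm in zip(devs, z_means) for v in z)
    return (between / (len(devs) - 1)) / (within / (n_total - len(devs)))


def skew_kurtosis(data):
    """Bias-adjusted sample skewness and excess kurtosis."""
    n, m = len(data), mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    skew = (m3 / m2 ** 1.5) * (n * (n - 1)) ** 0.5 / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)
    return skew, kurt


no, yes = expand(freq_no), expand(freq_yes)
f_stat = levene_f(no, yes)
skew_yes, kurt_yes = skew_kurtosis(yes)

print(round(f_stat, 2), round(skew_yes, 2), round(kurt_yes, 2))
within_rule_of_thumb = abs(skew_yes) < 2 and abs(kurt_yes) < 2  # the ±2 criterion used above
```

With these choices the F statistic and the ‘Yes’ group's skewness and kurtosis land within rounding of the values reported in the tables, and both shape statistics fall inside the ±2 range.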

2e) Write a concluding statement based on your analysis (6 marks).

The mean hours of physical activity last week engaged in by the 15 KIN 206 students who did
not have a running/walking path within 200m of where they live (excluding a sidewalk) (M =
6.27, SD = 4.10) did not differ statistically significantly from the mean hours of physical
activity last week engaged in by the 45 KIN 206 students who have a running/walking path within
200m of where they live (excluding a sidewalk) (M = 6.02, SD = 4.25), t(58) = 0.19, p =
0.85, with analyses suggesting an extremely small effect size (d = 0.06).

2f) Relate the results of the analysis to the research hypothesis (1 mark).

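The presence of a walking/running path (excluding a sidewalk) within 200m of where a KIN
206 student lived was not associated with a statistically significant difference in the hours
of moderate-vigorous physical activity they engaged in during the middle of the term. The
research hypothesis that students with a nearby path would engage in more hours was
therefore not supported.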

3a) Calculate and report the 95% confidence intervals for each sample mean in the
analysis you conducted in step 2. Show the formula that you used to calculate the 95%
confidence interval and the main values within the formula (eg: the mean, standard
error, t) (4 marks).

‘No’ group

95% C.I = x̄ ± t (sx̄)

x̄ = 6.27, s = 4.10, N = 15

sx̄ = s / √N = 4.10 / √15 = 1.0586

df = N – 1 = 15 – 1 = 14

A 95% C.I implies α = 0.05; with df = 14, t = ± 2.145

95% C.I = 6.27 ± 2.145 (1.0586)

95% C.I = 6.27 ± 2.27

Lower limit = 4.00 (2dp)

Upper limit = 8.54 (2dp)

95% C.I = 4.00, 8.54

We can be 95% confident that the interval of 4.00 hours to 8.54 hours contains the
population mean hours of moderate-vigorous physical activity during the middle of the term
(PA last week) engaged in by KIN 206 students who do not have a running/walking path
within 200m of where they live.

‘Yes’ group

95% C.I = x̄ ± t (sx̄)

x̄ = 6.02, s = 4.25, N = 45

sx̄ = s / √N = 4.25 / √45 = 0.6336

df = N – 1 = 45 – 1 = 44

A 95% C.I implies α = 0.05; with df = 44, t = ± 2.021 (this matches the tabled value for df = 40)

95% C.I = 6.02 ± 2.021 (0.6336)

95% C.I = 6.02 ± 1.2805

Lower limit = 4.74 (2dp)

Upper limit = 7.30 (2dp)

95% C.I = 4.74, 7.30

We can be 95% confident that the interval of 4.74 hours to 7.30 hours contains the
population mean hours of moderate-vigorous physical activity during the middle of the term
(PA last week) engaged in by KIN 206 students who have a running/walking path within 200m
of where they live.

3b) Using JASP, create a descriptive plot with 95% confidence intervals and include it
in your assignment write up (1 mark).

[Descriptive plot: Hours of PA during the last week by group (Path within 200m?: No, Yes), with 95% confidence intervals]

3c) Based on your work in steps 3a) and 3b), do the intervals between the samples
overlap (2 marks)? What do you think this overlap or lack of overlap means (2 marks)?

From our calculations and analysis of the descriptive plot above, it is evident that the
intervals for the two samples overlap. Since the statistical analysis was conducted to deduce
whether the presence of a running/walking path (excluding sidewalk) within 200m of a KIN
206 student’s area of residence affects the hours of moderate-vigorous physical activity they
engage in during the middle of the term, the overlap between the C.Is suggests that the
difference between the two groups of KIN 206 students is not statistically significant. This
reinforces the conclusion reached through our independent samples t-test, which did not
reject the null hypothesis.
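
The hand calculation in 3a and the overlap judgement in 3c can be checked with a few lines of Python. This is a minimal sketch that takes the means, standard deviations, sample sizes and tabled t values from the write-up as inputs; `ci_mean` and `overlaps` are hypothetical helper names.

```python
import math


def ci_mean(m, s, n, t_crit):
    """Confidence interval for a mean: m ± t_crit * (s / sqrt(n))."""
    margin = t_crit * s / math.sqrt(n)
    return m - margin, m + margin


def overlaps(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Inputs taken from 3a: mean, SD, N and the tabled critical t for each group.
ci_no = ci_mean(6.27, 4.10, 15, 2.145)
ci_yes = ci_mean(6.02, 4.25, 45, 2.021)

print([round(v, 2) for v in ci_no])
print([round(v, 2) for v in ci_yes])
print(overlaps(ci_no, ci_yes))
```

The interval for the ‘Yes’ group is narrower mainly because its sample is three times larger, which shrinks the standard error.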

Hypothesis 2:
“Students will participate in more hours of moderate-vigorous physical activity
during the first week of term compared to during the middle of term.”

1a) Select and state the appropriate statistical analysis given the research hypothesis (1
mark).

The t-test for paired sample means will be used.

1b) Using JASP, produce appropriate frequency distribution tables of the data (2
marks).

Frequencies for Hours of PA during the first week


Hours of PA during the first week   Frequency   Percent   Valid Percent   Cumulative Percent
0 9 15.00 15.00 15.00
1 1 1.67 1.67 16.67
2 4 6.67 6.67 23.33
3 4 6.67 6.67 30.00
4 6 10.00 10.00 40.00
5 3 5.00 5.00 45.00
6 9 15.00 15.00 60.00
7 4 6.67 6.67 66.67
8 5 8.33 8.33 75.00
9 3 5.00 5.00 80.00
10 3 5.00 5.00 85.00
11 2 3.33 3.33 88.33
12 4 6.67 6.67 95.00
13 2 3.33 3.33 98.33
14 1 1.67 1.67 100.00
Missing 0 0.00    
Total 60 100.00    

Frequencies for Hours of PA during the last week


Hours of PA during the last week   Frequency   Percent   Valid Percent   Cumulative Percent
0 8 13.33 13.33 13.33
1 1 1.67 1.67 15.00
2 2 3.33 3.33 18.33
3 6 10.00 10.00 28.33
4 11 18.33 18.33 46.67
5 2 3.33 3.33 50.00
6 4 6.67 6.67 56.67
7 5 8.33 8.33 65.00
8 2 3.33 3.33 68.33
9 3 5.00 5.00 73.33
10 5 8.33 8.33 81.67
11 4 6.67 6.67 88.33
12 3 5.00 5.00 93.33
14 4 6.67 6.67 100.00
Missing 0 0.00    
Total 60 100.00    

Frequencies for Difference scores


Difference scores   Frequency   Percent   Valid Percent   Cumulative Percent
-9 1 1.67 1.67 1.67
-8 1 1.67 1.67 3.33
-5 1 1.67 1.67 5.00
-4 5 8.33 8.33 13.33
-3 3 5.00 5.00 18.33
-2 3 5.00 5.00 23.33
-1 4 6.67 6.67 30.00
0 26 43.33 43.33 73.33
1 1 1.67 1.67 75.00
2 7 11.67 11.67 86.67
3 5 8.33 8.33 95.00
4 1 1.67 1.67 96.67
5 1 1.67 1.67 98.33
6 1 1.67 1.67 100.00
Missing 0 0.00    
Total 60 100.00    

1c) Is there problematic data? Explain your answer (1 mark).

The distribution does not contain any data that is ‘problematic’ since the values inputted are
appropriate to the question that was asked. For instance, there are no entered values like -3
hours or 170 hours that fall outside the limits of a 0-168-hour period within the week.

2a) State the null and alternative hypothesis (assume two-tailed, alpha = .05) (2 marks).

H0: μD = 0

H1: μD ≠ 0

2b) State the decision rule for the analysis if you were to conduct the statistical analysis
by hand (2 marks).

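Degrees of freedom = ND - 1

Degrees of freedom = 60 - 1 = 59

For α = .05 (two-tailed) and df = 59, critical value = ± 2.009 (this matches the tabled value
for df = 50; the exact value for df = 59 is approximately ± 2.001, which leads to the same
decision)

Rejection rule:
If t < -2.009 or t > 2.009, reject H0; otherwise do not reject H0.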

2c) Using JASP, conduct the appropriate statistical analysis given the research
hypothesis (two-tailed, alpha = .05). Include the JASP output of the t-table in your
assignment write up.

Paired Samples T-Test

Measure 1                               Measure 2                          t       df   p      Mean Difference   SE Difference   Cohen's d
Hours of PA during the first week   -   Hours of PA during the last week   -0.76   59   0.45   -0.27             0.35            -0.10
Note.  Student's t-test.

Since -2.009 < t = -0.76 < 2.009, do not reject H0 (p = 0.45 > 0.05).
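
The paired-samples result can be cross-checked from the difference-score frequency table in 1b, since a paired t-test is a one-sample t-test on the differences. This is a minimal sketch: the counts are re-keyed by hand, and `diff_freq` and the other names are hypothetical.

```python
import math
from statistics import mean, stdev

# Difference scores (first week minus last week) from 1b: {difference: count}.
diff_freq = {-9: 1, -8: 1, -5: 1, -4: 5, -3: 3, -2: 3, -1: 4, 0: 26,
             1: 1, 2: 7, 3: 5, 4: 1, 5: 1, 6: 1}
diffs = [d for d, count in diff_freq.items() for _ in range(count)]

n = len(diffs)
df = n - 1
mean_d = mean(diffs)          # mean difference
sd_d = stdev(diffs)           # sample SD of the differences
se_d = sd_d / math.sqrt(n)    # standard error of the mean difference
t_stat = mean_d / se_d
cohens_d = mean_d / sd_d      # effect size based on the SD of the differences

print(df, round(mean_d, 2), round(se_d, 2), round(t_stat, 2), round(cohens_d, 2))
```

The rounded values agree with the t, df, mean difference, SE difference and Cohen's d shown in the table.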

2d) Check and discuss the assumptions of the analysis you conducted (eg: what were the
assumptions based on the analysis? Were the assumptions met?) Please include the
JASP output for any assumption checks you did using JASP.

There are 2 assumptions associated with the analysis that was conducted.

1. Level of measurement of the dependent variable:


This assumption states that the dependent variable must be measured at an interval or
ratio level. In this study, the dependent variable is the hours of moderate-vigorous
physical activity engaged in by KIN 206 students during the start and middle of the
term, which has a true zero point and is therefore measured at the ratio level.
Therefore, this assumption is met.

2. Normality:
This assumption states that the distribution of differences in the dependent variable
should be normally distributed and can be verified using the Shapiro-Wilk’s test – a
statistical test to assess whether the data deviates from a normal distribution.

Test of Normality (Shapiro-Wilk)


      W p
Hours of PA during the first week - Hours of PA during the last week 0.92 < .001
Note.  Significant results suggest a deviation from normality.

At first glance, since the p-value for the distribution of difference scores between
the hours of physical activity engaged in by KIN 206 students is less than .001 (and therefore
less than 0.05), it can be inferred that the distribution differs statistically significantly
from the normal distribution. However, this assumption must be explored further by
examining the modality, symmetry, and kurtosis of the distribution of the difference scores.

Descriptive Statistics

                         Hours of PA during the first week   Hours of PA during the last week   Difference scores
Valid                    60                                  60                                 60
Missing                  0                                   0                                  0
Median                   6.00                                5.50                               0.00
Mean                     5.82                                6.08                               -0.27
Std. Deviation           3.97                                4.18                               2.73
Skewness                 0.18                                0.27                               -0.73
Std. Error of Skewness   0.31                                0.31                               0.31
Kurtosis                 -0.85                               -0.93                              1.73
Std. Error of Kurtosis   0.61                                0.61                               0.61
Minimum                  0.00                                0.00                               -9.00
Maximum                  14.00                               14.00                              6.00

Distribution Plots

[Distribution plots: Hours of PA during the first week; Hours of PA during the last week; Difference scores]
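
The modality, symmetry and kurtosis of the difference scores mentioned above can also be examined numerically. The sketch below is only an illustration: it rebuilds the difference scores by hand from the frequency table in 1b, assumes the usual bias-adjusted sample formulas for skewness and kurtosis, and applies the same ±2 rule of thumb that the Hypothesis 1 write-up uses. The names are hypothetical.

```python
from statistics import mean, multimode

# Difference scores (first week minus last week) from 1b: {difference: count}.
diff_freq = {-9: 1, -8: 1, -5: 1, -4: 5, -3: 3, -2: 3, -1: 4, 0: 26,
             1: 1, 2: 7, 3: 5, 4: 1, 5: 1, 6: 1}
diffs = [d for d, count in diff_freq.items() for _ in range(count)]


def central_moment(data, k):
    """k-th central moment (dividing by n)."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)


n = len(diffs)
m2, m3, m4 = (central_moment(diffs, k) for k in (2, 3, 4))

# Bias-adjusted sample skewness and excess kurtosis (assumed to match the reported statistics).
skew = (m3 / m2 ** 1.5) * (n * (n - 1)) ** 0.5 / (n - 2)
kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)

modes = multimode(diffs)  # most frequent difference score(s)
within_rule_of_thumb = abs(skew) < 2 and abs(kurt) < 2

print(modes, round(skew, 2), round(kurt, 2), within_rule_of_thumb)
```

Under these assumptions the distribution has a single mode at 0, and the skewness and kurtosis reproduce the tabled values and fall inside the ±2 range.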

2e) Write a concluding statement based on your analysis (6 marks).

The hours of moderate-vigorous physical activity engaged in by 60 KIN 206 students during the
first week of the term (M = 5.82, SD = 3.97) did not differ statistically significantly from
the hours of moderate-vigorous physical activity they engaged in during the middle of the term
(PA last week) (M = 6.08, SD = 4.18), t(59) = -0.76, p = 0.45, with analyses suggesting a very
small effect size (d = -0.10).

2f) Relate the results of the analysis to the research hypothesis (1 mark).

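The statistical analyses revealed that the hours of moderate-vigorous physical activity KIN
206 students participated in did not differ statistically significantly by the time of the
term, i.e. whether it was during the first week or during the middle. The research hypothesis
that students would participate in more hours during the first week was therefore not
supported.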

3a) Calculate and report the 95% confidence intervals for each sample mean in the
analysis you conducted in step 2. Show the formula that you used to calculate the 95%
confidence interval and the main values within the formula (eg: the mean, standard
error, t) (4 marks).

Hours of PA first week

95% C.I = x̄ ± t (sx̄)

x̄ = 5.82, s = 3.97, N = 60

sx̄ = s / √N = 3.97 / √60 = 0.5125

df = N – 1 = 60 – 1 = 59

A 95% C.I implies α = 0.05; with df = 59, t = ± 2.009 (this matches the tabled value for df = 50)

95% C.I = 5.82 ± 2.009 (0.5125)

95% C.I = 5.82 ± 1.0296

Lower limit = 4.79 (2dp)

Upper limit = 6.85 (2dp)

95% C.I = 4.79, 6.85

We can be 95% confident that the interval of 4.79 hours to 6.85 hours contains the
population mean hours of moderate-vigorous physical activity engaged in by KIN 206 students
during the first week of term.

Hours of PA last week

95% C.I = x̄ ± t (sx̄)

x̄ = 6.08, s = 4.18, N = 60

sx̄ = s / √N = 4.18 / √60 = 0.5396

df = N – 1 = 60 – 1 = 59

A 95% C.I implies α = 0.05; with df = 59, t = ± 2.009 (this matches the tabled value for df = 50)

95% C.I = 6.08 ± 2.009 (0.5396)

95% C.I = 6.08 ± 1.0840

Lower limit = 5.00 (2dp)

Upper limit = 7.16 (2dp)

95% C.I = 5.00, 7.16

We can be 95% confident that the interval of 5.00 hours to 7.16 hours contains the
population mean hours of moderate-vigorous physical activity engaged in by KIN 206 students
during the middle of term (PA last week).

3b) Using JASP, create a descriptive plot with 95% confidence intervals and include it
in your assignment write up (1 mark).

[Descriptive plot: Hours of PA during the first week and Hours of PA during the last week, with 95% confidence intervals]

3c) Based on your work in steps 3a) and 3b), do the intervals between the samples
overlap (2 marks)? What do you think this overlap or lack of overlap means (2 marks)?

From our calculations and analysis of the descriptive plot above, it is evident that the
intervals for the two samples overlap. Since the statistical analysis was conducted to deduce
whether the time of the term affects the hours of moderate-vigorous physical activity engaged
in by KIN 206 students, the overlap between the C.Is suggests that the difference between the
hours of physical activity engaged in within the two time periods is not statistically
significant. This reinforces the conclusion reached through our dependent samples t-test,
which did not reject the null hypothesis.
