Unit 7 2 Hypothesis Testing and Test of Differences
Definition. A measure of effect size is intended to provide a measurement of the absolute magnitude of a treatment
effect, independent of the size of the sample(s) being used.
One of the simplest and most direct methods for measuring effect size is Cohen’s d. Cohen (1988)
recommended that effect size be standardized by measuring the mean difference in terms of the standard deviation.
The resulting measure of effect size is computed as
$$\text{Cohen's } d = \frac{\text{mean difference}}{\text{standard deviation}}$$
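As a minimal sketch of this calculation, the helper below divides a mean difference by a standard deviation. The function name `cohens_d` is an assumption made for illustration; the input values are the ones used in the English-grades example of this unit (female mean 90.23, male mean 84.35, standard deviation of the entire group 4.63).

```python
def cohens_d(mean_1: float, mean_2: float, std_dev: float) -> float:
    """Standardized mean difference: (mean_1 - mean_2) / std_dev."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (mean_1 - mean_2) / std_dev


# Values from the English-grades example in this unit.
d = cohens_d(90.23, 84.35, 4.63)
print(round(d, 2))  # prints 1.27
```

Because the result is expressed in standard-deviation units, it does not grow or shrink with the number of scores in the sample.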
Interpreting r²
In addition to developing the Cohen’s d measure of effect size, Cohen (1988) also proposed
criteria for evaluating the size of a treatment effect that is measured by r². The criteria were actually suggested for
evaluating the size of a correlation, r, but are easily extended to apply to r². Cohen’s standards for interpreting r² are
shown in table below.
Confidence Interval
Definition. A confidence interval is an interval, or range of values, centered around a sample statistic. The logic
behind a confidence interval is that a sample statistic, such as a sample mean, should be relatively near to the
corresponding population parameter. Therefore, we can confidently estimate that the value of the parameter should be
located in the interval.
Test of Differences Between Means
There are two general research designs that can be used to obtain the two sets of data to be compared:
• The two sets of data could come from two completely separate groups of participants. For example, the study
could involve a sample of men compared with a sample of women. Or the study could compare grades for one
group of freshmen who are given laptop computers with grades for a second group who are not given
computers.
• The two sets of data could come from the same group of participants. For example, the researcher could obtain
one set of scores by measuring depression for a sample of patients before they begin therapy and then obtain a
second set of data by measuring the same individuals after 6 weeks of therapy.
Definition. The t statistic is used to test hypotheses about an unknown population mean, µ, when the value of σ is
unknown. The formula for the t statistic has the same structure as the z-score formula, except that the t statistic uses the
estimated standard error in the denominator.
Definition. Degrees of freedom describe the number of scores in a sample that are independent and free to vary.
Because the sample mean places a restriction on the value of one score in the sample, there are n – 1 degrees of
freedom for a sample with n scores.
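The two definitions above can be illustrated with a short sketch. It assumes the usual single-sample form, in which the estimated standard error is the sample standard deviation divided by the square root of n; the function name `one_sample_t` and the scores are hypothetical and are not taken from the STAT 201 data file.

```python
import math
import statistics


def one_sample_t(scores, mu):
    """Return (t, df) for testing H0: the population mean equals mu."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    sample_mean = statistics.mean(scores)
    sample_sd = statistics.stdev(scores)        # sample SD, divides by n - 1
    est_std_error = sample_sd / math.sqrt(n)    # estimated standard error
    t = (sample_mean - mu) / est_std_error
    df = n - 1                                  # one score is fixed by the mean
    return t, df


scores = [5, 7, 9, 11, 13]   # hypothetical scores
t, df = one_sample_t(scores, mu=7)
print(f"t({df}) = {t:.3f}")
```

The only difference from a z-score is the denominator: the sample standard deviation stands in for the unknown σ, which is why the result is referred to a t distribution with n − 1 degrees of freedom.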
Two basic assumptions are necessary for hypothesis tests with the t statistic.
• The values in the sample must consist of independent observations. In everyday terms, two observations
are independent if there is no consistent, predictable relationship between the first observation and the second.
More precisely, two events (or observations) are independent if the occurrence of the first event has no effect
on the probability of the second event.
• The population that is sampled must be normal. This assumption is a necessary part of the mathematics
underlying the development of the t statistic and the t distribution table. However, violating this assumption
has little practical effect on the results obtained for a t statistic, especially when the sample size is relatively
large. With very small samples, a normal population distribution is important. With larger samples, this
assumption can be violated without affecting the validity of the hypothesis test. If you have reason to suspect
that the population distribution is not normal, use a large sample to be safe.
The goal of an independent-measures research study is to evaluate the mean difference between two populations
(or between two treatment conditions). Using subscripts to differentiate the two populations, the mean for the first
population is µ₁, and the second population mean is µ₂. The difference between means is simply µ₁ − µ₂. As always,
the null hypothesis states that there is no change, no effect, or, in this case, no difference. Thus, in symbols, the null
hypothesis for the independent-measures test is
H0: µ₁ − µ₂ = 0 or µ₁ = µ₂ (No difference between the population means)
Example: Using the Data File for STAT 201, test whether the grades of male students in English significantly differ
from the female students assuming that the data is approximately normally distributed and the students were randomly
selected. Use the steps in hypothesis testing.
Solution:
1. Formulate the null hypothesis.
Ho: There is no significant difference in the grades of students in English when the students were classified
as to sex.
Ha: There is a significant difference in the grades of students in English when the students were classified as
to sex.
4. Compute the statistical test.
Using SPSS
Steps: (1) Click Analyze, then select (2) Compare Means, and (3) click Independent-Samples T
Test.
A dialog box will open. (4) Put Grades in English under the Test Variable(s) box, and (5) Sex under the
Grouping Variable box.
Then (6) click the Define Groups… box and (7) write 1 for Group 1, and 2 for Group 2. After which, (8)
click Continue, and then (9) click OK.
SPSS Output
Group Statistics
                    Sex      N    Mean      Std. Deviation   Std. Error Mean
Grades in English   Male     23   84.3478   3.52406          .73482
                    Female   22   90.2273   3.66362          .78109
Std. Deviation (Entire Group) = 4.63
Effect Size:
$$\text{Cohen's } d = \frac{M_{\text{Female}} - M_{\text{Male}}}{\text{Std. Deviation}} = \frac{90.23 - 84.35}{4.63} = \frac{5.88}{4.63} = 1.27$$
Interpretation:
Results in the Independent Samples Test table showed a mean difference of -5.879 (male minus female) with a
standard error of 1.071; we are 95% confident that the population mean difference lies between -8.04 and -3.719.
It further shows that a significant difference existed in the grades of students in English when the students were
classified as to sex, since the p-value reported as 0.000 (that is, p < 0.001) is less than the level of significance,
which is 0.05, with a t-value of -5.487 and 43 degrees of freedom.
This suggests that female students, with a mean grade of 90.23, performed better than male students,
with a mean grade of 84.35, as shown in the Group Statistics table, with a large effect size, d = 1.27.
5. Compare the significance/ probability obtained to the level of significance. Make your decision.
Reject H0 if p≤α, otherwise do not reject.
Decision: Since the p-value of 0.000 is less than the level of significance which is 0.05, reject the null hypothesis
(Ho).
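The SPSS figures in this example can be cross-checked from the Group Statistics table alone. The sketch below is an illustration under stated assumptions, not SPSS's own procedure: it uses the pooled-variance (equal variances assumed) form of the independent-measures t, the function name is hypothetical, and the critical value 2.0167 (two-tailed, α = 0.05, df = 43) is an assumption read from a t table rather than computed.

```python
import math


def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-measures t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    mean_diff = mean1 - mean2
    return mean_diff, std_error, mean_diff / std_error, df


# Group Statistics values from the SPSS output (male first, female second).
mean_diff, std_error, t, df = pooled_t_from_summary(
    23, 84.3478, 3.52406, 22, 90.2273, 3.66362
)

# ASSUMPTION: two-tailed critical t for df = 43 at alpha = 0.05, from a t table.
T_CRIT_DF43 = 2.0167
lower = mean_diff - T_CRIT_DF43 * std_error
upper = mean_diff + T_CRIT_DF43 * std_error

# For a two-tailed test, |t| >= critical t gives the same decision as p <= alpha.
reject_h0 = abs(t) >= T_CRIT_DF43

print(f"mean difference = {mean_diff:.4f}, SE = {std_error:.4f}")
print(f"t({df}) = {t:.3f}")
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Up to rounding of the tabled means and standard deviations, the results agree with the reported standard error of 1.071, the t-value of -5.487 with 43 degrees of freedom, and the interval from -8.04 to -3.719.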
Solution:
1. Formulate the null hypothesis.
Ho: There is no significant difference between the pretest and the post test scores of students when exposed
to a certain intervention.
Ha: There is a significant difference between the pretest and the post test scores of students when exposed to
a certain intervention.
Steps: (1) Click Analyze, then select (2) Compare Means, and (3) click Paired-Samples T Test.
A dialog box will open, (4) Click and Put Pretest and Posttest under the Paired Variables box, and (5)
Click OK.
SPSS Output
5. Compare the significance/ probability obtained to the level of significance. Make your decision.
Reject H0 if p≤α, otherwise do not reject.
Decision: Since the p-value of 0.000 is less than the level of significance which is 0.05, reject the null hypothesis
(Ho).
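The calculation behind the Paired-Samples T Test can be sketched as follows. The pretest and posttest scores here are hypothetical and are not taken from the data file used in this unit, and the function name `paired_t` is an assumption. The sketch relies on the standard repeated-measures form: compute one difference score per participant, then test whether the mean difference departs from zero.

```python
import math
import statistics


def paired_t(before, after):
    """Return (mean_difference, t, df) for a paired-samples t test."""
    if len(before) != len(after):
        raise ValueError("each participant needs both scores")
    # One difference score per participant: after minus before.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_d, mean_d / std_error, n - 1


pretest = [80, 82, 85, 78, 90]     # hypothetical scores
posttest = [84, 85, 86, 83, 92]    # hypothetical scores
mean_d, t, df = paired_t(pretest, posttest)
print(f"mean difference = {mean_d}, t({df}) = {t:.3f}")
```

Because the same participants supply both scores, the test works on n difference scores and therefore has n − 1 degrees of freedom, not 2n − 2.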
Analyzing the total variability into these two components is the heart of ANOVA.
Thus, the entire process of ANOVA requires nine calculations: three values for SS, three values for df, two
variances (between and within), and a final F-ratio. However, these nine calculations are all logically related and are all
directed toward finding the final F-ratio. The figure below shows the logical structure of ANOVA calculations.
If an ANOVA were used to evaluate these data, a significant F-ratio would indicate that at least one of the
sample mean differences is large enough to satisfy the criterion of statistical significance. As the name implies, post
hoc tests are done after an ANOVA. More specifically, these tests are done after ANOVA when
1. You reject Ho and
2. There are three or more treatments (k ≥3).
Rejecting Ho indicates that at least one difference exists among the treatments. If there are only two treatments,
then there is no question about which means are different and, therefore, no need for posttests. However, with three or
more treatments (k ≥3), the problem is to determine exactly which means are significantly different.
Definition. Post hoc tests (or posttests) are additional hypothesis tests that are done after an ANOVA to determine
exactly which mean differences are significant and which are not.
The independent-measures ANOVA requires the same three assumptions that were necessary for the
independent-measures t hypothesis test:
1. The observations within each sample must be independent.
2. The populations from which the samples are selected must be normal.
3. The populations from which the samples are selected must have equal variances (homogeneity of variance).
Example: Using the Data File for STAT 201, test if there is a significant difference in the SATT score of students when
classified as to the highest educational attainment of the father assuming that the data is approximately normally
distributed. Use the steps in hypothesis testing.
Solution:
1. Formulate the null hypothesis.
Ho: There is no significant difference in the SATT score of students when classified as to highest educational
attainment of the father.
Ha: There is a significant difference in the SATT score of students when classified as to highest educational
attainment of the father.
2. Set the level of significance and tailedness of the test.
α = 0.05
Tailedness: two-tailed
A dialog box will open. (4) Click and put SATT Score under the Dependent List box, and (5) put Father
HEA in the Factor box.
For Post Hoc Test, click Post Hoc box, check either Scheffe, LSD, or Bonferroni (depending on the data/
study) and click Continue.
SPSS Output
Descriptives
SATT Score
                                                               95% Confidence Interval for Mean
                    N    Mean      Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Secondary           15   79.2667   5.21627         1.34684     76.3780      82.1553      70.00    87.00
Bachelor's Degree   15   83.4000   5.05399         1.30494     80.6012      86.1988      70.00    90.00
Master's Degree     15   90.7333   4.25049         1.09747     88.3795      93.0872      84.00    96.00
Total               45   84.4667   6.74739         1.00584     82.4395      86.4938      70.00    96.00
ANOVA
SATT Score
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   1011.733         2    505.867       21.429   .000
Within Groups    991.467          42   23.606
Total            2003.200         44
Effect Size:
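Between Master's and Secondary:
$$\text{Cohen's } d = \frac{M_{\text{Master's}} - M_{\text{Secondary}}}{\text{Std. Deviation}} = \frac{90.73 - 79.27}{6.75} = \frac{11.46}{6.75} = 1.70$$
Between Master's and Bachelor's Degree:
$$\text{Cohen's } d = \frac{M_{\text{Master's}} - M_{\text{Bachelor's}}}{\text{Std. Deviation}} = \frac{90.73 - 83.40}{6.75} = \frac{7.33}{6.75} = 1.09$$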
Interpretation:
Results in the ANOVA table showed that a significant difference existed in the SATT score of students when the
students were classified as to highest educational attainment of the father, since the p-value reported as 0.000
(that is, p < 0.001) is less than the level of significance, which is 0.05, with an F-value of 21.429 and degrees of
freedom of 2 between groups and 42 within groups.
Since a significant difference existed in the SATT score of students when the students were classified as to highest
educational attainment of the father, a post hoc test is employed to determine between which groups/categories
of the highest educational attainment of the father the significant differences existed.
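The nine ANOVA calculations (three SS values, three df values, two variances, and the F-ratio) can be reproduced from the Descriptives table. The sketch below is an illustration, not SPSS's procedure; the function name and dictionary keys are assumptions, and because it starts from rounded group means and standard deviations its results match the ANOVA table only up to rounding.

```python
def anova_from_summary(groups):
    """One-way ANOVA from a list of (n, mean, sd) tuples, one per group."""
    k = len(groups)
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

    # Three sums of squares.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    ss_total = ss_between + ss_within

    # Three degrees of freedom.
    df_between = k - 1
    df_within = total_n - k
    df_total = total_n - 1

    # Two variances (mean squares) and the final F-ratio.
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within, "df_total": df_total,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# (n, mean, sd) for Secondary, Bachelor's Degree, Master's Degree.
result = anova_from_summary([
    (15, 79.2667, 5.21627),
    (15, 83.4000, 5.05399),
    (15, 90.7333, 4.25049),
])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The between-groups and within-groups sums of squares add up to the total, which is the partition of variability that the F-ratio is built on.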
Multiple Comparisons
Dependent Variable: SATT Score
Scheffe
                                        Mean Difference                       95% Confidence Interval
(I) Father HEA      (J) Father HEA      (I-J)             Std. Error   Sig.   Lower Bound   Upper Bound
Secondary           Bachelor's Degree   -4.13333          1.77412      .078   -8.6355       .3688
                    Master's Degree     -11.46667*        1.77412      .000   -15.9688      -6.9645
Bachelor's Degree   Secondary           4.13333           1.77412      .078   -.3688        8.6355
                    Master's Degree     -7.33333*         1.77412      .001   -11.8355      -2.8312
Master's Degree     Secondary           11.46667*         1.77412      .000   6.9645        15.9688
                    Bachelor's Degree   7.33333*          1.77412      .001   2.8312        11.8355
*. The mean difference is significant at the 0.05 level.
Interpretation:
Using Scheffe as a post hoc test, a significant difference in the SATT score existed between students whose
fathers are secondary graduates and students whose fathers are master’s degree holders (Mean Diff. = -11.467,
p < 0.001). This means that students whose fathers are master’s degree holders have better SATT scores than
students whose fathers are secondary graduates, with a large effect size, d = 1.70.
Also, a significant difference in the SATT score existed between students whose fathers are bachelor’s
degree holders and students whose fathers are master’s degree holders (Mean Diff. = -7.333, p = 0.001). This
means that students whose fathers are master’s degree holders have better SATT scores than students whose fathers are
bachelor’s degree holders, with a large effect size, d = 1.09.
The difference between students whose fathers are secondary graduates and those whose fathers are bachelor’s degree
holders was not significant (Mean Diff. = -4.133, p = 0.078).
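The standard error and confidence bounds in the Multiple Comparisons table can be approximated by hand. The sketch below assumes the usual Scheffe interval for a pairwise difference, difference ± sqrt((k − 1) × F critical) × standard error, with the within-groups mean square from the ANOVA table. The function name is hypothetical, and the critical value F(2, 42) = 3.22 at α = 0.05 is an assumption read from an F table rather than computed.

```python
import math

MS_WITHIN = 23.606   # within-groups mean square from the ANOVA table
K_GROUPS = 3         # Secondary, Bachelor's Degree, Master's Degree
F_CRIT = 3.22        # ASSUMPTION: F(2, 42) at alpha = 0.05, from an F table


def scheffe_interval(mean_i, mean_j, n_i, n_j):
    """Return (difference, std_error, lower, upper) for one pairwise comparison."""
    diff = mean_i - mean_j
    std_error = math.sqrt(MS_WITHIN * (1 / n_i + 1 / n_j))
    margin = math.sqrt((K_GROUPS - 1) * F_CRIT) * std_error
    return diff, std_error, diff - margin, diff + margin


# Group means and sizes from the Descriptives table.
sec_vs_bach = scheffe_interval(79.2667, 83.4000, 15, 15)
sec_vs_mast = scheffe_interval(79.2667, 90.7333, 15, 15)

for label, (diff, se, lower, upper) in [
    ("Secondary vs Bachelor's", sec_vs_bach),
    ("Secondary vs Master's", sec_vs_mast),
]:
    significant = lower > 0 or upper < 0   # interval excludes zero
    print(f"{label}: diff = {diff:.3f}, SE = {se:.3f}, "
          f"CI = [{lower:.3f}, {upper:.3f}], significant = {significant}")
```

An interval that excludes zero corresponds to a starred (significant) mean difference in the table, while an interval that contains zero, as for Secondary versus Bachelor's Degree, does not.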
5. Compare the significance/ probability obtained to the level of significance. Make your decision.
Reject H0 if p≤α, otherwise do not reject.
Decision: Since the p-value of 0.000 is less than the level of significance which is 0.05, reject the null hypothesis
(Ho).
Problem Set:
A. Using the Data File for STAT 201, answer the following problems using the steps in hypothesis testing.
1. Is there a significant difference in the SATT score of students when classified as to sex?
Report
SATT Score
Sex Mean N Std. Deviation
Male 84.7391 23 5.30195
Female 84.1818 22 8.11017
Total 84.4667 45 6.74739
T-Test
Group Statistics
2. Is there a significant difference in the pretest scores of students when classified as to type of high school graduated
from?
Report
Pretest
HS Graduated Mean N Std. Deviation
Public 81.3214 28 5.93204
Private 81.9412 17 4.90498
Total 81.5556 45 5.51673
T-Test
Group Statistics
HS Graduated N Mean Std. Deviation Std. Error Mean
Pretest Public 28 81.3214 5.93204 1.12105
Private 17 81.9412 4.90498 1.18963
B. Ninety students in a math class were exposed to an intervention in which the teacher used GeoGebra in teaching
graphing trigonometric functions. Students were given a test to measure their performance on the said topic. Test
if there is a significant difference between the pretest and the posttest scores of students when exposed to an
intervention of using GeoGebra in teaching graphing trigonometric functions, assuming that the data are
approximately normally distributed. Use the steps in hypothesis testing.
T-Test
Paired Samples Statistics
                    Mean      N    Std. Deviation   Std. Error Mean
Pair 1   Pretest    83.5556   45   5.70309          .85017
         Posttest   87.2222   45   4.63136          .69040
C. Using the Data File for STAT 201, answer the given problem using the steps in hypothesis testing.
1. Is there a significant difference in the Entrance Test score of students when classified as to the highest educational
attainment of the father?
Report
Entrance Test
HEA Mean N Std. Deviation
Secondary 318.6667 15 69.44439
Bachelors 356.1333 15 61.94568
Masters 418.0667 15 38.68825
Total 364.2889 45 70.35482
Oneway
ANOVA
Entrance Test