Statistical Analysis Cont 1
Serkan Adıgüzel
PhD Candidate
COURSE FLOW
• Parametric Tests
• Independent samples t test
• Paired samples t test
• The One-Way, Between-Subjects Analysis of Variance (ANOVA)
• The One-Way, Repeated Measures Analysis of Variance (ANOVA)
• A causal claim is the boldest kind of claim a scientist can make.
• A causal claim replaces verb phrases such as related to, is associated with, or linked to with powerful
verbs such as makes, influences, or affects.
• Causal claims are special: When researchers make a causal claim, they are also stating something about
interventions and treatments.
• If your design is intended to test a causal relationship, then you have to choose a specific statistical
analysis.
INDEPENDENT SAMPLES T TEST
LET’S LOOK AT THE SPSS FILE
• n = 36
INDEPENDENT SAMPLES T TEST
• Recall that the one-sample t test is used when we have one sample of data and want to compare its
mean with the population mean. The independent samples t test, in contrast, compares the means of two separate groups.
• It is called the “independent samples” t test because each member of the sample is randomly assigned to one
and only one experimental group. This type of experiment is called a between-subjects experiment.
• Just to avoid confusion, the fact that this statistical tool is called the “independent” samples t test has nothing to
do with the notion of an independent variable. Rather, the word “independent” signifies that each member of
the sample was randomly assigned to one experimental group.
• What is the logic of the independent samples t test? It tests whether the mean difference observed in
the sample is generalizable to the populations from which the samples were drawn.
• What do we need?
• Your sample data should follow a normal distribution, or each group should have more than 15 participants.
• T tests require continuous data. Continuous variables can take on any numeric value, and the scale can be
meaningfully divided into smaller increments, including fractional and decimal values. Typically, you measure
continuous variables on a scale. For example, when you measure depression levels, IQ scores, and age, you have
continuous data.
• We must know the mean for each group on a dependent variable.
• Use an independent samples t test when you want to compare the means of precisely two groups—no more
and no less! Typically, you perform this test to determine whether two population means are different.
• The independent samples t test is also known as the two sample t test.
• For an example of an independent samples t test: do students who learn using Method A have a different mean
score than students who learn using Method B? (A minimal sketch of this comparison outside SPSS follows below.)
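• For readers who want to check the same kind of comparison outside SPSS, here is a minimal Python/SciPy sketch. The Method A and Method B scores are hypothetical example data, not values from the course file.

```python
# Illustrative sketch (not the SPSS procedure): an independent samples t test in SciPy.
from scipy import stats

method_a = [72, 85, 78, 90, 66, 81, 79, 88]   # hypothetical scores, Method A
method_b = [64, 70, 75, 68, 73, 71, 77, 69]   # hypothetical scores, Method B

# equal_var=False gives the Welch ("equal variances not assumed") test;
# set equal_var=True for the classic equal-variances t test.
t_stat, p_value = stats.ttest_ind(method_a, method_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```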
HOW TO RUN
INDEPENDENT SAMPLES T TEST
• To run an Independent Samples t Test in SPSS, click Analyze > Compare Means > Independent-
Samples T Test.
• A Test Variable(s): The dependent variable(s). This is the continuous variable whose
means will be compared between the two groups. You may run multiple t tests
simultaneously by selecting more than one test variable.
• B Grouping Variable: The independent variable. The categories (or groups) of the
independent variable will define which samples will be compared in the t test. The
grouping variable must have at least two categories (groups); it may have more than two
categories but a t test can only compare two groups, so you will need to specify which
two groups to compare. You can also use a continuous variable by specifying a cut point
to create two groups (i.e., values at or above the cut point and values below the cut point).
• C Define Groups: Click Define Groups to define the category indicators (groups) to use in
the t test. If the button is not active, make sure that you have already moved your
independent variable to the right in the Grouping Variable field. You must define the
categories of your grouping variable before you can run the Independent Samples t Test
procedure.
• D Options: The Options section is where you can set your desired confidence level for the
confidence interval for the mean difference, and specify how SPSS should handle
missing values.
• Define Groups
• Clicking the Define Groups button (C) opens the Define Groups window:
• 1 Use specified values: If your grouping variable is categorical, select Use specified
values. Enter the values for the categories you wish to compare in the Group 1 and Group
2 fields. If your categories are numerically coded, you will enter the numeric codes. If
your group variable is string, you will enter the exact text strings representing the two
categories. If your grouping variable has more than two categories (e.g., takes on values
of 1, 2, 3, 4), you can specify two of the categories to be compared (SPSS will disregard the
other categories in this case).
• Note that when computing the test statistic, SPSS will subtract the mean of Group 2
from the mean of Group 1. Changing the order of the subtraction affects the sign of the
results, but does not affect the magnitude of the results.
• 2 Cut point: If your grouping variable is numeric and continuous, you can designate a cut
point for dichotomizing the variable. This will separate the cases into two categories
based on the cut point. Specifically, for a given cut point x, the new categories will be:
• Group 1: All cases where grouping variable ≥ x
• Group 2: All cases where grouping variable < x
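• A minimal Python sketch of the same cut-point idea, assuming a small hypothetical data set with an age column as the grouping variable and a score column as the test variable:

```python
# Illustrative sketch: dichotomizing a continuous grouping variable at a cut point,
# mirroring the SPSS "Cut point" option. Data and column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "age":   [19, 34, 27, 45, 52, 23, 38, 61, 29, 41],
    "score": [88, 75, 82, 70, 65, 90, 77, 60, 85, 72],
})

cut_point = 35
group1 = df.loc[df["age"] >= cut_point, "score"]   # at or above the cut point
group2 = df.loc[df["age"] < cut_point, "score"]    # below the cut point

t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=True)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```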
HOW TO READ
INDEPENDENT SAMPLES T TEST OUTPUT
• The first section, Group Statistics, provides basic information about the group comparisons,
including the sample size (n), mean, standard deviation, and standard error for mile times by
group.
• The second section, Independent Samples Test, displays the results most relevant to the
Independent Samples t Test. There are two parts that provide different pieces of
information: (A) Levene’s Test for Equality of Variances and (B) t-test for Equality of
Means.
• A Levene's Test for Equality of Variances: This section has the test results for Levene's
Test.
• The p-value of Levene's test is printed as ".000" (but should be read as p < 0.001 -- i.e., p very small).
This tells us that we should look at the "Equal variances not assumed" row for the t test (and
corresponding confidence interval) results. (If this test result had not been significant -- that is, if
we had observed p > α -- then we would have used the "Equal variances assumed" output.)
• B t-test for Equality of Means provides the results for the actual Independent
Samples t Test. From left to right:
• t is the computed test statistic, using the formula for the equal-variances-assumed test
statistic (first row of table) or the formula for the equal-variances-not-assumed test
statistic (second row of table)
• df is the degrees of freedom, using the equal-variances-assumed degrees of freedom formula (first
row of table) or the equal-variances-not-assumed degrees of freedom formula (second row of
table)
• Sig (2-tailed) is the p-value corresponding to the given test statistic and degrees of freedom
• Mean Difference is the difference between the sample means, i.e., x̄1 − x̄2; it also corresponds to the
numerator of the test statistic for that test
• Std. Error Difference is the standard error of the mean difference estimate; it also corresponds to
the denominator of the test statistic for that test
• C Confidence Interval of the Difference: This part of the t-test output complements the
significance test results. Typically, if the CI for the mean difference contains 0 within the
interval -- i.e., if the lower boundary of the CI is a negative number and the upper
boundary of the CI is a positive number -- the results are not significant at the chosen
significance level. In this example, the 95% CI is [01:57, 02:32], which does not contain
zero; this agrees with the small p-value of the significance test.
• Since p < .001 is less than our chosen significance level α = 0.05, we can reject the null
hypothesis, and conclude that the mean mile time for athletes and non-athletes
is significantly different.
• Based on the results, we can state the following:
• There was a significant difference in mean mile time between non-athletes and athletes (t(315.846) =
15.047, p < .001).
• The average mile time for athletes was 2 minutes and 14 seconds lower than the average mile
time for non-athletes.
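• As a rough Python analogue of this "read Levene's test first" logic, here is a minimal sketch. The mile times below are hypothetical values in seconds and do not reproduce the numbers from the SPSS example.

```python
# Illustrative sketch: run Levene's test, then choose the matching t test variant.
from scipy import stats

athletes     = [402, 395, 410, 388, 420, 399, 405, 392]   # hypothetical mile times (s)
non_athletes = [540, 565, 512, 580, 530, 555, 520, 548]   # hypothetical mile times (s)

# Levene's test for equality of variances
levene_stat, levene_p = stats.levene(athletes, non_athletes)

# If Levene's test is significant (p < .05), use the "equal variances not assumed" (Welch) test.
equal_var = levene_p >= 0.05
t_stat, p_value = stats.ttest_ind(athletes, non_athletes, equal_var=equal_var)
print(f"Levene p = {levene_p:.3f}; t = {t_stat:.2f}, p = {p_value:.3g}")
```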
HOW TO REPORT
INDEPENDENT SAMPLES T TEST
• Examples:
• An independent-samples t-test was run to determine if the Mind Over Matter coping strategy was
more effective at reducing anxiety than deep breathing exercises. The results showed that the
participants using the Mind Over Matter strategy (M = 21, SD = 2.2) reported lower levels of anxiety
than participants using deep breathing exercises (M = 28, SD = 2.7). This difference was
significant (t(19) = 4.37, p < .01).
• The 25 participants who received the drug intervention (M = 480, SD = 34.5) compared to the 28
participants in the control group (M = 425, SD = 31) demonstrated significantly better peak flow
scores, t(51) = 2.1, p = .04.
• There was no significant effect for sex, t(38) = 1.7, p = .097, despite women (M = 55, SD = 8) attaining
higher scores than men (M = 53, SD = 7.8).
• When reporting the results of the independent-samples t-test, APA Style has very specific
requirements on what information should be included. Below is the key information required for
reporting the results of the test. Replace the placeholders with the appropriate values from your
output.
• t(degrees of freedom) = the t statistic, p = p value.
• When reporting the p-value, there are two ways to approach it. One is when the results are not
significant. In that case, you want to report the p-value exactly: p = .24. The other is when the results
are significant. In this case, you can report the p-value as being less than the level of significance: p <
.05.
• The t statistic should be reported to two decimal places. (The "no leading zero" convention applies to statistics that cannot exceed 1, such as p values and correlations, not to t.)
• Degrees of freedom for this test are (n1 - 1) + (n2 - 1) or (n1 + n2) - 2, where "n1" represents the
number of people in one group and "n2" represents the number of people in the other group. The n for
each group can be found in the SPSS output.
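• In symbols, restating the degrees-of-freedom rule above:

```latex
\mathrm{df} = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2
```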
PAIRED SAMPLES T TEST
LET’S LOOK AT THE SPSS FILE
• n = 15
PAIRED SAMPLES T TEST
• The Paired-Samples T Test procedure compares the means of two variables for a single
group.
• The Paired Samples t Test compares the means of two measurements taken from the same
individual, object, or related units. These "paired" measurements can represent things like:
• A measurement taken at two different times (e.g., pre-test and post-test score with an intervention
administered between the two time points)
• A measurement taken under two different conditions (e.g., completing a test under a "control" condition
and an "experimental" condition)
• Example
• In a study on anxiety levels, all patients are measured at the beginning of the study, given a therapy, and measured again.
Thus, each subject has two measures, often called before and after measures.
• What do we need?
• Dependent variable that is continuous (i.e., interval or ratio level)
• Related samples/groups (i.e., dependent observations)
• Random sample of data from the population
• Normal distribution (approximately) of the difference between the paired values
• The paired samples t test is also known as the dependent t test or the repeated measures t test. (A minimal sketch of this test outside SPSS follows below.)
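• A minimal Python/SciPy sketch of a paired samples t test, assuming hypothetical before/after anxiety scores for the same patients:

```python
# Illustrative sketch: a paired samples t test with SciPy.
from scipy import stats

before = [42, 38, 51, 45, 39, 47, 50, 44, 41, 36]   # hypothetical pre-therapy anxiety
after  = [35, 33, 44, 40, 37, 41, 46, 39, 38, 34]   # hypothetical post-therapy anxiety

# ttest_rel pairs the observations by position (same patient measured twice)
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```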
HOW TO RUN
PAIRED SAMPLES T TEST
• To run a Paired Samples t Test in SPSS, click Analyze > Compare Means > Paired-Samples T
Test.
• A Pair: The “Pair” column represents the number of Paired Samples t Tests to run. You
may choose to run multiple Paired Samples t Tests simultaneously by selecting multiple
sets of matched variables. Each new pair will appear on a new line.
• B Variable1: The first variable, representing the first group of matched values. Move the
variable that represents the first group to the right where it will be listed beneath the
“Variable1” column.
• C Variable2: The second variable, representing the second group of matched values.
Move the variable that represents the second group to the right where it will be listed
beneath the “Variable2” column.
• D Options: Clicking Options will open a window where you can specify the Confidence
Interval Percentage and how the analysis will address Missing Values (i.e., Exclude
cases analysis by analysis or Exclude cases listwise). Click Continue when you are
finished making specifications.
HOW TO READ
PAIRED SAMPLES T TEST OUTPUT
• There are three tables: Paired Samples Statistics, Paired Samples Correlations, and Paired
Samples Test.
• Paired Samples Statistics gives univariate descriptive statistics (mean, sample size, standard deviation,
and standard error) for each variable entered. Notice that the sample size here is 398; this is because the
paired t-test can only use cases that have non-missing values for both variables.
• Paired Samples Correlations shows the bivariate Pearson correlation coefficient (with a two-tailed test of
significance) for each pair of variables entered.
• Paired Samples Test gives the hypothesis test results.
• First column: The pair of variables being tested, and the order the subtraction was carried out. (If you have
specified more than one variable pair, this table will have multiple rows.)
• Mean: The average difference between the two variables.
• Standard deviation: The standard deviation of the difference scores.
• Standard error mean: The standard error (standard deviation divided by the square root of the sample size). Used
in computing both the test statistic and the upper and lower bounds of the confidence interval.
• t: The test statistic (denoted t) for the paired T test.
• df: The degrees of freedom for this test.
• Sig. (2-tailed): The p-value corresponding to the given test statistic t with degrees of freedom df.
• Because the p value is <.05, the mean difference between the variables is statistically significant at α = 0.05.
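• To make the table's columns concrete, the short sketch below recomputes the paired t statistic from the difference scores. The before/after values are hypothetical, not from the course file.

```python
# Illustrative sketch: the paired t statistic computed from difference scores,
# matching the Mean, Std. Deviation, Std. Error Mean, t, and df columns above.
import math
import statistics

before = [42, 38, 51, 45, 39, 47, 50, 44, 41, 36]
after  = [35, 33, 44, 40, 37, 41, 46, 39, 38, 34]

diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
mean_diff = statistics.mean(diffs)            # "Mean"
sd_diff = statistics.stdev(diffs)             # "Std. Deviation"
se_diff = sd_diff / math.sqrt(n)              # "Std. Error Mean"
t_stat = mean_diff / se_diff                  # "t", with df = n - 1
print(f"t({n - 1}) = {t_stat:.2f}")
```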
HOW TO REPORT
PAIRED SAMPLES T TEST
• Examples:
• “The results of this study indicate that there is a statistically significant difference between the mean
test scores of the experimental group and the control group. Specifically, the experimental group had
a higher mean test score (M = 85, SD = 10) than the control group (M = 80, SD =
15). A paired-samples t-test revealed a t-statistic of 2.17, with df = 49 (p < .05).”
• A paired samples t-test was conducted to determine the effect of training on a math test score. The
results indicate a non-significant difference between the math test score before training (M = 73.08, SD =
16.89) and the math test score after training (M = 68.83, SD = 17.69); t(36) = 1.086, p = .284.
• A paired samples t test was conducted to compare differences in participants’ reported feelings of
disgust about Kathy’s actions, when attributed to one’s self or when attributed to the victim (Maria).
Findings indicated that there was a difference between feelings of disgust when attributed to one’s
self compared to when attributed to the victim, t(179) = 3.10, p = .002, Cohen’s d = 0.23. Participants
attributed more feelings of disgust to themselves (M = 5.67, SD = 1.24) compared to the victim (M =
5.83, SD = 1.21).
THE ONE-WAY, BETWEEN-SUBJECTS
ANALYSIS OF VARIANCE
(ANOVA)
LET’S LOOK AT THE SPSS FILE
• n = 15
THE ONE-WAY, BETWEEN-SUBJECTS
ANALYSIS OF VARIANCE
(ANOVA)
• The one-way analysis of variance (ANOVA) is used to determine whether there are any statistically significant
differences between the means of three or more independent (unrelated) groups.
• For example, you could use a one-way ANOVA to understand whether exam performance differed based on test anxiety
levels amongst students, dividing students into three independent groups (e.g., low, medium and high-stressed students).
• Your data must meet the following requirements:
• Dependent variable that is continuous (i.e., interval or ratio level)
• Independent variable that is categorical (i.e., two or more groups)
• Independent samples/groups (i.e., independence of observations)
• Random sample of data from the population
• Normal distribution (approximately) of the dependent variable for each group (i.e., for each level of the factor)
• This test is also known as One-Factor ANOVA, One-Way Analysis of Variance, or Between-Subjects ANOVA. (A minimal sketch outside SPSS follows below.)
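• A minimal Python/SciPy sketch of a one-way between-subjects ANOVA, assuming hypothetical completion times (in minutes) for three course groups:

```python
# Illustrative sketch: a one-way between-subjects ANOVA with SciPy.
from scipy import stats

beginner     = [27.1, 29.3, 25.8, 28.4, 26.9, 27.7, 28.8, 26.2, 27.5, 25.9]
intermediate = [23.9, 22.5, 24.8, 23.1, 24.2, 22.8, 23.6, 24.5, 22.9, 23.4]
advanced     = [23.5, 22.1, 24.0, 23.8, 22.6, 23.9, 22.4, 24.3, 23.0, 22.7]

# Tests whether at least one group mean differs from the others
f_stat, p_value = stats.f_oneway(beginner, intermediate, advanced)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```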
HOW TO RUN
ONE-WAY ANOVA
• To run a One-Way ANOVA in SPSS, click Analyze > Compare Means > One-Way ANOVA.
• A Dependent List: The dependent variable(s). This is the variable whose means will be
compared between the samples (groups). You may run multiple means
comparisons simultaneously by selecting more than one dependent variable.
• B Factor: The independent variable. The categories (or groups) of the independent
variable will define which samples will be compared. The independent variable must
have at least two categories (groups), but usually has three or more groups when used in
a One-Way ANOVA.
• C Contrasts: (Optional) Specify contrasts, or planned comparisons, to be conducted after
the overall ANOVA test.
• D Post Hoc: (Optional) Request post hoc (also known as multiple comparisons) tests.
Specific post hoc tests can be selected by checking the associated boxes.
• E Options: Clicking Options will produce a window where you can specify
which Statistics to include in the output (Descriptive, Fixed and random effects,
Homogeneity of variance test, Brown-Forsythe, Welch), whether to include a Means plot,
and how the analysis will address Missing Values (i.e., Exclude cases analysis by
analysis or Exclude cases listwise). Click Continue when you are finished making
specifications.
HOW TO READ
ONE-WAY ANOVA OUTPUT
• This is the table that shows the output of the ANOVA analysis and whether there is a statistically significant
difference between our group means. We can see that the significance value is 0.021 (i.e., p = .021), which is
below 0.05, and, therefore, there is a statistically significant difference in the mean length of time to complete
the spreadsheet problem between the different courses taken.
• This is great to know, but we do not know which of the specific groups differed. Luckily, we can find this out in
the Multiple Comparisons table which contains the results of the Tukey post hoc test.
• From the results so far, we know that there are statistically significant differences between the groups
as a whole. The table below, Multiple Comparisons, shows which groups differed from each other. The
Tukey post hoc test is generally the preferred test for conducting post hoc tests on a one-way ANOVA,
but there are many others. We can see from the table below that there is a statistically significant
difference in time to complete the problem between the group that took the beginner course and the
intermediate course (p = 0.046), as well as between the beginner course and advanced course (p =
0.034). However, there were no differences between the groups that took the intermediate and
advanced course (p = 0.989).
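• A minimal Python sketch of a Tukey post hoc test, using statsmodels and the same kind of hypothetical completion times as in the ANOVA sketch above:

```python
# Illustrative sketch: Tukey HSD post hoc comparisons with statsmodels.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

times = np.array([27.1, 29.3, 25.8, 28.4, 26.9,    # beginner
                  23.9, 22.5, 24.8, 23.1, 24.2,    # intermediate
                  23.5, 22.1, 24.0, 23.8, 22.6])   # advanced
groups = ["beginner"] * 5 + ["intermediate"] * 5 + ["advanced"] * 5

# Compares every pair of groups, adjusting p-values for multiple comparisons
result = pairwise_tukeyhsd(endog=times, groups=groups, alpha=0.05)
print(result.summary())
```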
HOW TO REPORT
ONE-WAY ANOVA
• Examples:
• There was a statistically significant difference between groups as determined by one-way ANOVA
(F(2,27) = 4.467, p = .021). A Tukey post hoc test revealed that the time to complete the problem was
statistically significantly lower after taking the intermediate (23.6 ± 3.3 min, p = .046) and advanced
(23.4 ± 3.2 min, p = .034) course compared to the beginners course (27.2 ± 3.0 min). There was no
statistically significant difference between the intermediate and advanced groups (p = .989).
THE ONE-WAY, REPEATED MEASURES
ANALYSIS OF VARIANCE
(ANOVA)
LET’S LOOK AT THE SPSS FILE
• password: uedufy
• n = 50
THE ONE-WAY, REPEATED MEASURES
ANALYSIS OF VARIANCE
(ANOVA)
• A one-way repeated measures ANOVA (also known as a within-subjects ANOVA) is used to determine whether three
or more group means are different where the participants are the same in each group. For this reason, the groups are
sometimes called "related" groups
• For example, you could use a one-way repeated measures ANOVA to understand whether there is a difference in anxiety levels
amongst moderately anxious participants after a hypnotherapy programme aimed at reducing anxiety (e.g., with three time
points: anxiety immediately before, 1 month after and 6 months after the hypnotherapy programme). In this example, "anxiety
level" is your dependent variable, whilst your independent variable is "time" (i.e., with three related groups, where each of the
three time points is considered a "related group").
• Your data must meet the following requirements:
• Dependent variable that is continuous (i.e., interval or ratio level)
• Related samples/groups (i.e., dependence of observations)
• Random sample of data from the population
• Normal distribution (approximately) of the dependent variable for each group (i.e., for each level of the factor)
• This test is also known as the within-subjects ANOVA. (A minimal sketch outside SPSS follows below.)
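• A minimal Python sketch of a one-way repeated measures ANOVA using statsmodels' AnovaRM. The subject IDs, period labels, and weight values are hypothetical, and the between-subjects Program factor from the SPSS example is omitted (AnovaRM handles the within-subjects factor only).

```python
# Illustrative sketch: a one-way repeated measures ANOVA with statsmodels.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long format: one row per subject per time point (hypothetical data)
data = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    "period":  ["t1", "t2", "t3"] * 5,
    "weight":  [58, 52, 49, 61, 55, 51, 54, 47, 44, 60, 53, 50, 57, 49, 46],
})

# depvar = dependent variable, subject = subject identifier, within = repeated factor
result = AnovaRM(data, depvar="weight", subject="subject", within=["period"]).fit()
print(result)
```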
HOW TO RUN
REPEATED MEASURES ANOVA
1. Navigate to Analyze → General Linear Model → Repeated Measures in the SPSS top menu
• The Within-Subject Factor Name, e.g., Period or Time Frame. You can
rename it to something more appropriate if needed.
• The Number of Levels is the number of repeated measurements (dependent variables)
in the study. In our case, this is three.
• Click the Add button to move the factor and the defined variables into
the respective box.
• Click the Define button.
• In the Repeated Measures window we have to specify, in the right
order, a dependent variable for each Within-Subjects Variables slot.
• For instance, in our SPSS sample dataset, we have three
dependent variables representing three periods: 1st Period,
2nd Period, and 3rd Period. In your study, these will be your
own repeated measurements.
• The 1st Period will be moved to the _?_(1) position, the 2nd
Period to the _?_(2) position, the 3rd Period to _?_(3), and so
on. You can select each variable and use the arrow to move it
in the appropriate box or even easier, use the mouse to drag
and drop each dependent variable to its appropriate position.
• Add the independent variable of interest to the Between-
Subject Factor(s) box. In our case, the independent variable is
the Program.
• [Optional] Click the Options button and, in the Repeated
Measures: Options window, make sure the following checkboxes
are checked:
• Descriptive statistics
• Estimates of effect size
• Homogeneity tests
HOW TO READ
REPEATED MEASURES ANOVA OUTPUT
1. The first table in the ANOVA repeated measures output is Within-Subjects Factors which is represented by
three periods, respectively 1st Period, 2nd Period, and 3rd Period in our example.
2. The Between-Subjects Factors table shows the treatments (conditions) applied to the subjects. In our
examples, these conditions are Low Carbs Diet and Low Fat Diet. We can also observe that the
population N is equal for both treatments (N = 25 subjects).
3. In the Descriptive Statistics table we can see that the mean is overall higher in the 1st Period (Total
Mean = 54.76), followed by the 2nd Period (Total Mean = 45.74) and the 3rd Period (Total Mean = 42.16), which can
indicate a link between the diet programs and the participants' weight loss.
4. The Tests of Within-Subjects Effects table determines whether there were any significant differences between
means at any point in time. As with the previous tests, the Sphericity Assumed row is evaluated at an α level of 0.05,
and a p-value < 0.05 shows statistical significance.
• In our case, we can assume sphericity for both Period and Period * Program.
HOW TO REPORT
REPEATED MEASURES ANOVA
• Examples:
• A repeated-measures ANOVA determined that mean SPQ scores differed significantly across three
time points (F(2, 58) = 5.699, p = .006). A post hoc pairwise comparison using the Bonferroni
correction showed an increased SPQ score between the initial assessment and follow-up assessment
one year later (20.1 vs 20.9, respectively), but this was not statistically significant (p = .743). However,
the increase in SPQ score did reach significance when comparing the initial assessment to a second
follow-up assessment taken two years after the original assessment (20.1 vs 22.26, p = .010).
Therefore, we can conclude that the results for the ANOVA indicate a significant time effect for
untreated fear of spiders as measured on the SPQ scale.
THANK YOU