Class 5 Lecture
Paired t-test
Assumptions?
DV?
IV?
REVIEW OF KEY CONCEPT
Sample statistics and population
parameters
We usually cannot measure population parameters directly (why? time and money constraints make it impractical to observe the whole population)
We use sample statistics instead (with the assumption that the sample is randomly selected and representative of the population)
We then conduct tests on the samples we collect. If we find an effect in the sample, we infer that the effect can also be observed in the population
REVIEW OF KEY CONCEPT
Parametric estimates vs. non-parametric estimates?
Parametric estimates: when all the statistical assumptions of the test are met, we can use parametric estimates
If some statistical assumptions are violated, you will need to consider non-parametric estimates (Class 9,
categorical data analysis)
If you have a similar study question that involves a grouping IV with 3 or more categories and a continuous outcome measure, you use a method called analysis of variance (ANOVA), also known as the F test (Fisher’s test)
ANOVA (ANALYSIS OF VARIANCE) FAMILY
ANOVA is a big family
One-way ANOVA (Fisher’s test or F-test)*
DV is one continuous variable
IV is one categorical (nominal, 3+ groups) variable
Two-way ANOVA (factorial ANOVA; two-factors ANOVA)
DV is one continuous variable
IV is two categorical variables
MANOVA (Multivariate ANOVA)
DV is multiple continuous variables
IV is one categorical (nominal, 3+ groups) variable
ANCOVA (Analysis of Covariance)
DV is a continuous variable
IV is one categorical (nominal, 3+ groups) variable
Covariates can be any type of variable
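As a minimal sketch of how these designs are specified in R (the variable names dv, dv2, iv1, iv2, and cov, and the data frame dat, are illustrative placeholders only):
# One-way ANOVA: one continuous DV, one categorical IV
fit1 <- aov(dv ~ iv1, data = dat)
# Two-way (factorial) ANOVA: one continuous DV, two categorical IVs (with interaction)
fit2 <- aov(dv ~ iv1 * iv2, data = dat)
# MANOVA: multiple continuous DVs, one categorical IV
fit3 <- manova(cbind(dv, dv2) ~ iv1, data = dat)
# ANCOVA: one continuous DV, one categorical IV, plus a covariate
fit4 <- aov(dv ~ iv1 + cov, data = dat)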
WHAT DOES THE ANOVA TEST TELL ME?
It shows you the “group effect”; this reflects how ANOVA originally comes from clinical research
An ANOVA can also be used to describe differences between groups where there is no experimental design or intervention
Is there a difference between each pair of groups?
T-TEST VS. ANOVA
Independent t-test
  IV: nominal, two levels (e.g., males or females; young or old)
  DV: interval/ratio measurement, independence of observations, normal distribution, equal variance
ANOVA (F test)
  IV: nominal, three or more levels (e.g., Caucasian, African American, Hispanic)
  DV: interval/ratio measurement, independence of observations, normal distribution, equal variance
HYPOTHESIS TESTING
Research hypothesis:
There is a difference in the mean of the DV among groups A, B, & C
(MeanA ≠ MeanB ≠ MeanC; more precisely, at least one group mean differs from the others)
Null hypothesis
There is no difference in the means of DV among groups A, B, & C
(MeanA = MeanB = MeanC)
Note: The IV doesn’t have to be limited to three groups, but it is better to keep the number of groups under 6, because post-hoc comparisons become complicated as the number of groups increases (e.g., 6 groups already imply 15 pairwise comparisons).
ASSUMPTIONS
IV: nominal variable with 3 or more groups
DV: continuous and normally distributed
Independence of observations
Rule of 30 (i.e., with 30+ observations per group, the test is robust to moderate departures from normality)
Equal variance
Rule of thumb: if the largest group variance is no more than 1.5 times the smallest group variance, the test will be robust (a quick R check of this ratio is sketched below)
But it is safer to run a formal test (what test? Levene’s test, covered in Step 3)
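A minimal sketch of this rule-of-thumb check in R, assuming a data frame dat with a continuous DV and a grouping IV (names are placeholders):
# Variance of the DV within each group
group_var <- tapply(dat$DV, dat$IV, var, na.rm = TRUE)
group_var
# Ratio of the largest to the smallest group variance (rule of thumb: below ~1.5)
max(group_var) / min(group_var)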
WHY NOT JUST RUN MULTIPLE T-TESTS?
If we had three groups, why not just run the t-test 3 times (once per pair)?
It would inflate the Type I error (false positive) because we run multiple tests on the same data
We control the Type I error at 5%, meaning we limit the probability of getting a significant result purely by chance to only 5%
WHY NOT JUST RUN MULTIPLE T-TESTS?
Why can’t we run the t-test 3 times? Because it inflates the Type I error
If each of these t-tests uses a 0.05 level of significance (i.e., 5%), that means for each test, the probability of falsely rejecting the null hypothesis (known as Type I error) is only 5%. Therefore, the probability of making NO Type I error in a single test is 95%
If we assume the 3 tests are independent, then the overall probability of no Type I error is 0.95 * 0.95 * 0.95 = 0.857
Type I error across the 3 t-tests = 1 − 0.857 = 0.143 (or 14.3%). This means the Type I error inflates from 5% to 14.3%
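The same calculation, sketched in R for any number of independent tests at alpha = .05:
alpha <- 0.05
k <- 3                          # number of pairwise t-tests
fwer <- 1 - (1 - alpha)^k       # probability of at least one Type I error across k tests
fwer                            # 0.142625, i.e., about 14.3%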
The ANOVA test takes these multiple (pairwise) comparisons into account in a SINGLE test and controls the Type I error at 5% (no inflation). Consequently, it is called an “omnibus” test
HOW ARE RESULTS SIGNIFICANT?
BSS = between sum of squares. This can be thought of as the distance between each group mean and the overall (grand) mean
WSS = within (or error) sum of squares. This can be thought of as the distance between individual data points and their own group mean
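These two quantities combine into the F statistic: with k groups and N total observations,
F = (BSS / (k − 1)) / (WSS / (N − k)) = MS(between) / MS(within)
so a large F (between-group variation that is large relative to within-group variation) yields a small p-value.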
STEP 1: CHECK DESCRIPTIVE STATISTICS FOR EACH VARIABLE
Descriptive statistics (R code) using the “psych” package: mean, SD, skewness, and other indices…
install.packages("psych")
library(psych)
describe(data$var)
Check (1) sample size in each group, (2) mean, (3) variance, (4) skewness, and (5) others such as min, max, range… (for by-group statistics, see the describeBy sketch below)
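To get these descriptives separately for each group, the same package offers describeBy (a sketch; data, var, and group are placeholders):
# Descriptive statistics (n, mean, sd, skewness, min, max, range...) for each group
describeBy(data$var, group = data$group)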
STEP 2: CHECK THE ASSUMPTIONS FOR EACH
VARIABLE, BY GROUP
R output (Q-Q plot): Self-rated
health, by education level (low,
medium, high)
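One way to produce such Q-Q plots in base R (a minimal sketch, assuming a data frame dat with a continuous DV and a three-level IV; names are illustrative):
# One Q-Q plot per group to check normality of the DV within each group
par(mfrow = c(1, 3))
for (g in levels(factor(dat$IV))) {
  x <- dat$DV[dat$IV == g]
  qqnorm(x, main = paste("Q-Q plot:", g))
  qqline(x)
}
par(mfrow = c(1, 1))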
STEP 3: CHECK THE EQUAL VARIANCE ASSUMPTION (LEVENE’S TEST)
Levene’s test (R code)
install.packages("car")
library(car)
leveneTest(data$DV, data$IV, center = median)   # center = median gives the robust (Brown-Forsythe) version
Test results:
If the test is significant (p < 0.05), the equal variance assumption is violated
If the test is not significant (p > 0.05), we can proceed as if the variances are equal
If equal variance does not hold, we request the Welch correction for the ANOVA (R code)
oneway.test(DV ~ IV, data = yourdata)   # var.equal = FALSE (the default) applies the Welch correction
STEP 4: CONDUCT F TEST
R output (ANOVA), self-rated health by education
Even though we applied the Welch correction, the result remains significant! (though the F and p values differ slightly)
But this Welch-corrected result is the one to report, not the prior one. Why???
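A sketch of the two calls side by side (DV, IV, and the data frame dat are placeholders as before):
# Standard one-way ANOVA (assumes equal variance); also needed later for the post-hoc test
aov.model <- aov(DV ~ IV, data = dat)
summary(aov.model)
# Welch-corrected ANOVA (does not assume equal variance)
oneway.test(DV ~ IV, data = dat)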
STEP 5: CONDUCT POST-HOC TEST
If the ANOVA is significant, it means at least one pair of groups has a mean difference. We use a post-hoc test to identify which pairwise comparisons drive the difference
R has the pairwise.t.test function as part of the base system. It supports p-value adjustment methods such as Bonferroni, Holm, and others (a sketch of this alternative appears below)
We will use Tukey HSD for the post-hoc tests, as it is more conservative. To use this method, you need the “multcomp” package with the glht command
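For reference, a minimal sketch of the base-R alternative with a Bonferroni adjustment (DV, IV, and dat are placeholders):
# Pairwise t-tests with Bonferroni-adjusted p-values
pairwise.t.test(dat$DV, dat$IV, p.adjust.method = "bonferroni")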
STEP 5: CONDUCT POST-HOC TEST
Tukey HSD post-hoc test (R code)
install.packages("multcomp")
library(multcomp)
# aov.model is the fitted ANOVA object, e.g., aov.model <- aov(DV ~ IV, data = dat)
model <- glht(aov.model, linfct = mcp(IV = "Tukey"))
summary(model)
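summary(model) reports the estimated mean difference and the Tukey-adjusted p-value for each pairwise comparison, and confint(model) gives the corresponding simultaneous confidence intervals. Note that the IV must be coded as a factor in the fitted model for mcp(IV = "Tukey") to work.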
STEP 5: CONDUCT POST-HOC TEST
STEP 6: PROVIDE YOUR INTERPRETATION (WRITE-UP)
You must report the overall F test, including the degrees of freedom, F value, and significance.
You need to report whether the equal variance assumption holds. If not, you will need to apply the Welch correction and report it.
Report the results from the post hoc test (report the comparisons concisely). You
will also need to report the mean and SD for each group.
EXAMPLES
This study examines the effect of education levels (low, medium, and high) on older adults’ self-rated health. The significant Levene’s test (F(2,3838) = 8.47, p < .001)
indicated the equal variance assumption was violated, and therefore the Welch
correction was applied.
The F-test results showed that education had a significant effect on self-rated health (F(2,3838) = 24.676, p < .001). The Tukey post-hoc test indicated that older adults with
high education (M = 3.22; SD = 0.88) had higher levels of self-rated health
compared to those with medium education (M = 2.94; SD = 0.88) and low education
(M = 2.73; SD = 0.92). All the comparisons were statistically significant at the 0.05
level.
EXAMPLES OF TABLES
Do we need to do a post-hoc test for ANOVA?
No; why?
Do we need to do a post-hoc test for ANOVA?
Yes; why?