Chapter 2
HYPOTHESIS TESTING IN R
Richie Cotton
Data Evangelist at DataCamp
Two-sample problems
Another problem is to compare sample statistics across groups of a variable.
converted_comp is a numerical variable.
Do users who first programmed as a child tend to be compensated more highly than those who
started as adults?
Hypotheses
H0: The mean compensation (in USD) is the same for those that coded first as a child and
those that coded first as an adult.
$H_0: \mu_{\text{child}} = \mu_{\text{adult}}$
$H_0: \mu_{\text{child}} - \mu_{\text{adult}} = 0$
HA: The mean compensation (in USD) is greater for those that coded first as a child
compared to those that coded first as an adult.
Calculating groupwise summary statistics
library(dplyr)

# Mean compensation within each age_first_code_cut group
stack_overflow %>%
  group_by(age_first_code_cut) %>%
  summarize(mean_compensation = mean(converted_comp))

# A tibble: 2 x 2
  age_first_code_cut mean_compensation
  <chr>                          <dbl>
1 adult                        111544.
2 child                        138275.
Test statistics
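The sample mean estimates the population mean.
The difference between the two sample means, x̄child − x̄adult, is therefore the test statistic for the difference between the population means.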
Standardizing the test statistic
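$z = \dfrac{\text{sample stat} - \text{population parameter}}{\text{standard error}}$
$t = \dfrac{\text{difference in sample stats} - \text{difference in population parameters}}{\text{standard error}}$
$t = \dfrac{(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) - (\mu_{\text{child}} - \mu_{\text{adult}})}{SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}$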
Standard error
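$SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) \approx \sqrt{\dfrac{s_{\text{child}}^2}{n_{\text{child}}} + \dfrac{s_{\text{adult}}^2}{n_{\text{adult}}}}$
Here s is the sample standard deviation of compensation in each group, and n is the number of observations in each group.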
Assuming the null hypothesis is true
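$t = \dfrac{(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) - (\mu_{\text{child}} - \mu_{\text{adult}})}{SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}$
$H_0: \mu_{\text{child}} - \mu_{\text{adult}} = 0$
$t = \dfrac{\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}}{SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}$
$t = \dfrac{\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}}{\sqrt{\dfrac{s_{\text{child}}^2}{n_{\text{child}}} + \dfrac{s_{\text{adult}}^2}{n_{\text{adult}}}}}$

stack_overflow %>%
  group_by(age_first_code_cut) %>%
  summarize(
    xbar = mean(converted_comp),
    s = sd(converted_comp),
    n = n()
  )

# A tibble: 2 x 4
  age_first_code_cut    xbar       s     n
  <chr>                <dbl>   <dbl> <int>
1 adult              111544. 270381.  1579
2 child              138275. 278130.  1001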
Calculating the test statistic
# A tibble: 2 x 4
  age_first_code_cut    xbar       s     n
  <chr>                <dbl>   <dbl> <int>
1 adult              111544. 270381.  1579
2 child              138275. 278130.  1001

numerator <- xbar_child - xbar_adult
denominator <- sqrt(
  s_child ^ 2 / n_child + s_adult ^ 2 / n_adult
)
t_stat <- numerator / denominator
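As a rough check, the summary table can be plugged straight into the formula. The sketch below hard-codes the rounded values printed in the tibble, which is an assumption made for illustration: the course computes them from the full stack_overflow data, so the result agrees only to the displayed precision.

```r
# Rounded summary statistics copied from the tibble (assumed inputs)
xbar_child <- 138275; s_child <- 278130; n_child <- 1001
xbar_adult <- 111544; s_adult <- 270381; n_adult <- 1579

# Difference in sample means; under H0 the population difference is 0
numerator <- xbar_child - xbar_adult

# Standard error of the difference in means
denominator <- sqrt(s_child ^ 2 / n_child + s_adult ^ 2 / n_adult)

# Standardized test statistic
t_stat <- numerator / denominator
t_stat
```

The value should land close to the t-statistic of about 2.40 that this chapter reports when it calculates the p-value.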
Calculating p-values from t-statistics
t-distributions
The test statistic, t, follows a t-distribution.
t-distributions have a parameter named degrees of freedom, or df.
t-distributions look like normal distributions, with fatter tails.
Degrees of freedom
As you increase the degrees of freedom, the t-distribution gets closer to the normal distribution.
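One way to see this convergence numerically is to compare critical values. The sketch below uses the base R quantile functions qt() and qnorm(); the particular df values and the 0.975 quantile are arbitrary choices for illustration.

```r
# Upper 2.5% critical values for t-distributions with increasing df
df_values <- c(2, 5, 10, 30, 100, 1000)
t_crit <- qt(0.975, df = df_values)

# The corresponding critical value for the standard normal distribution
z_crit <- qnorm(0.975)

# Fatter tails mean larger critical values; the gap shrinks as df grows
data.frame(df = df_values, t_crit = t_crit, gap_to_normal = t_crit - z_crit)
```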
Calculating degrees of freedom
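Suppose your dataset has 5 independent observations.
Four of the values are 2, 6, 8, and 5.
If you also know the sample mean, the fifth value is no longer free to vary: it is fixed by the mean and the other four values.
So there are 4 degrees of freedom, one fewer than the number of observations.
In the two-sample case, one degree of freedom is lost for each group mean:
$df = n_{\text{child}} + n_{\text{adult}} - 2$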
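The idea can be made concrete with a few lines of arithmetic. In the sketch below, the sample mean of 5 is a hypothetical value assumed purely for illustration; the group sizes are the ones shown in the summary table earlier in the chapter.

```r
# Four of the five observations from the example
known_values <- c(2, 6, 8, 5)
n <- 5

# Hypothetical sample mean, assumed for illustration only
sample_mean <- 5

# The fifth value is forced: required total minus the known values
last_value <- n * sample_mean - sum(known_values)
last_value

# Degrees of freedom for the two-group compensation example
n_child <- 1001
n_adult <- 1579
degrees_of_freedom <- n_child + n_adult - 2
degrees_of_freedom
```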
Hypotheses
H0: The mean compensation (in USD) is the same for those that coded first as a child and
those that coded first as an adult.
HA: The mean compensation (in USD) is greater for those that coded first as a child
compared to those that coded first as an adult.
Significance level
α = 0.1
If p ≤ α, then reject H0.
Calculating p-values: one proportion vs. a value
p_value <- pnorm(z_score, lower.tail = FALSE)
Calculating p-values: two means from different groups
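numerator <- xbar_child - xbar_adult
denominator <- sqrt(s_child ^ 2 / n_child + s_adult ^ 2 / n_adult)
t_stat <- numerator / denominator
2.4046
degrees_of_freedom <- n_child + n_adult - 2
2578
# Right-tailed test (HA: greater), so use the upper tail
p_value <- pt(t_stat, df = degrees_of_freedom, lower.tail = FALSE)
0.008130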
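To connect the p-value back to the decision rule, the sketch below starts from the rounded t-statistic and degrees of freedom reported on this slide (so the result is approximate) and applies the significance level of 0.1 chosen earlier. The choice of lower.tail = FALSE reflects the "greater than" alternative hypothesis.

```r
# Rounded values reported above (assumed inputs)
t_stat <- 2.4046
degrees_of_freedom <- 2578
alpha <- 0.1

# Right-tailed test: HA says the child mean is greater
p_value <- pt(t_stat, df = degrees_of_freedom, lower.tail = FALSE)
p_value

# Decision rule: reject H0 when p <= alpha
reject_h0 <- p_value <= alpha
reject_h0
```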
Paired t-tests
US Republican presidents dataset
state    county    repub_percent_08  repub_percent_12
Alabama  Bullock              25.69             23.51
Alabama  Chilton              78.49             79.78
Alabama  Clay                 73.09             72.31
Alabama  Cullman              81.85             84.16
Alabama  Escambia             63.89             62.46
Alabama  Fayette              73.93             76.19
Alabama  Franklin             68.83             69.68
...      ...                    ...               ...
500 rows; each row represents county-level votes in a presidential election.
1 https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ
Hypotheses
Question: Was the percentage of votes given to the Republican candidate lower in 2008
compared to 2012?
$H_0: \mu_{2008} - \mu_{2012} = 0$
$H_A: \mu_{2008} - \mu_{2012} < 0$
The data is paired, since each voter percentage refers to the same county.
From two samples to one
sample_data <- repub_votes_potus_08_12 %>%
  mutate(diff = repub_percent_08 - repub_percent_12)
Calculate sample statistics of the difference
sample_data %>%
  summarize(xbar_diff = mean(diff))

  xbar_diff
1 -2.643027
Revised hypotheses
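Old hypotheses
$H_0: \mu_{2008} - \mu_{2012} = 0$
$H_A: \mu_{2008} - \mu_{2012} < 0$
New hypotheses
$H_0: \mu_{\text{diff}} = 0$
$H_A: \mu_{\text{diff}} < 0$
$t = \dfrac{\bar{x}_{\text{diff}} - \mu_{\text{diff}}}{\sqrt{\dfrac{s_{\text{diff}}^2}{n_{\text{diff}}}}}$
$df = n_{\text{diff}} - 1$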
Calculating the p-value
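$t = \dfrac{\bar{x}_{\text{diff}} - \mu_{\text{diff}}}{\sqrt{\dfrac{s_{\text{diff}}^2}{n_{\text{diff}}}}}$
$df = n_{\text{diff}} - 1$

# Mean of the differences as a single number (about -2.643)
xbar_diff <- mean(sample_data$diff)
n_diff <- nrow(sample_data)
s_diff <- sample_data %>%
  summarize(sd_diff = sd(diff)) %>%
  pull(sd_diff)
t_stat <- (xbar_diff - 0) / sqrt(s_diff ^ 2 / n_diff)
-16.06374
degrees_of_freedom <- n_diff - 1
499
# Left-tailed test (HA: less), so keep the default lower tail
p_value <- pt(t_stat, df = degrees_of_freedom)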
Testing differences between two means using t.test()
t.test(
  # Vector of differences
  sample_data$diff,
  # Choose between "two.sided", "less", "greater"
  alternative = "less",
  # Null hypothesis population parameter
  mu = 0
)

One Sample t-test
data: sample_data$diff
t = -16.064, df = 499, p-value < 2.2e-16
alternative hypothesis: true mean is less than 0
95 percent confidence interval:
-Inf -2.37189
sample estimates:
mean of x
-2.643027
t.test() with paired = TRUE
t.test(
  sample_data$repub_percent_08,
  sample_data$repub_percent_12,
  alternative = "less",
  mu = 0,
  paired = TRUE
)

Paired t-test
data: sample_data$repub_percent_08 and sample_data$repub_percent_12
t = -16.064, df = 499, p-value < 2.2e-16
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -2.37189
sample estimates:
mean of the differences
-2.643027
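The three routes shown so far (manual calculation, a one-sample test on the differences, and paired = TRUE) should agree. The sketch below checks this on simulated data; the vectors pct_08 and pct_12 are hypothetical stand-ins, not the course's county dataset, and the simulation parameters are arbitrary.

```r
set.seed(42)

# Hypothetical paired data: 200 simulated "counties" (not the course dataset)
n <- 200
pct_08 <- rnorm(n, mean = 55, sd = 15)
pct_12 <- pct_08 + rnorm(n, mean = 2, sd = 2)

# Route 1: manual calculation on the differences
differences <- pct_08 - pct_12
t_manual <- (mean(differences) - 0) / sqrt(sd(differences) ^ 2 / n)
p_manual <- pt(t_manual, df = n - 1)

# Route 2: one-sample t-test on the differences
one_sample <- t.test(differences, alternative = "less", mu = 0)

# Route 3: paired t-test on the two original vectors
paired <- t.test(pct_08, pct_12, alternative = "less", mu = 0, paired = TRUE)

# All three routes give the same t-statistic
c(
  manual = t_manual,
  one_sample = unname(one_sample$statistic),
  paired = unname(paired$statistic)
)
```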
Unpaired t.test()
t.test(
  x = sample_data$repub_percent_08,
  y = sample_data$repub_percent_12,
  alternative = "less",
  mu = 0
)

Welch Two Sample t-test
data: sample_data$repub_percent_08 and sample_data$repub_percent_12
t = -2.8788, df = 992.76, p-value = 0.002039
alternative hypothesis: true difference in means is less than 0
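The gap between the paired and unpaired results comes from ignoring the pairing. The sketch below illustrates this on simulated data in which the two measurements of each unit are strongly correlated; the data and parameters are hypothetical, chosen only to make the effect visible.

```r
set.seed(42)

# Hypothetical paired data with strong within-pair correlation
n <- 200
pct_08 <- rnorm(n, mean = 55, sd = 15)
pct_12 <- pct_08 + rnorm(n, mean = 2, sd = 2)

# Same data, with and without the pairing information
paired <- t.test(pct_08, pct_12, alternative = "less", paired = TRUE)
unpaired <- t.test(pct_08, pct_12, alternative = "less")

# Compare test statistic, degrees of freedom, and p-value side by side
data.frame(
  test = c("paired", "unpaired (Welch)"),
  t = c(unname(paired$statistic), unname(unpaired$statistic)),
  df = c(unname(paired$parameter), unname(unpaired$parameter)),
  p_value = c(paired$p.value, unpaired$p.value)
)
```

In this simulation the unpaired test has a much weaker t-statistic, because between-unit variation inflates its standard error.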
ANOVA tests
Job satisfaction: 5 categories
stack_overflow %>%
  count(job_sat)

# A tibble: 5 x 2
  job_sat                   n
  <fct>                 <int>
1 Very dissatisfied       187
2 Slightly dissatisfied   385
3 Neither                 245
4 Slightly satisfied      777
5 Very satisfied          981
Visualizing multiple distributions
Question: Is mean annual compensation different for different levels of job satisfaction?

library(ggplot2)

# One box plot of compensation per job satisfaction level
stack_overflow %>%
  ggplot(aes(x = job_sat, y = converted_comp)) +
  geom_boxplot() +
  coord_flip()
Analysis of variance (ANOVA)
mdl_comp_vs_job_sat <- lm(converted_comp ~ job_sat, data = stack_overflow)
anova(mdl_comp_vs_job_sat)
Response: converted_comp
            Df   Sum Sq  Mean Sq F value Pr(>F)
job_sat      4 1.09e+12 2.73e+11    3.65 0.0057 **
Residuals 2570 1.92e+14 7.47e+10
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
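The same lm() plus anova() workflow can be tried without the course data. The sketch below uses a simulated data frame; the name sim, its column names, the group means, and the significance level are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical data: 5 satisfaction levels, 50 simulated respondents each
levels_sat <- c("very_dis", "slightly_dis", "neither", "slightly_sat", "very_sat")
sim <- data.frame(
  job_sat = factor(rep(levels_sat, each = 50), levels = levels_sat),
  comp = rnorm(250, mean = rep(c(100, 100, 100, 105, 120), each = 50), sd = 30)
)

# Fit a linear model with the categorical variable, then run the ANOVA
mdl <- lm(comp ~ job_sat, data = sim)
anova_table <- anova(mdl)
anova_table

# Pull out the p-value for the job_sat term and apply a decision rule
alpha <- 0.2  # significance level assumed for illustration
p_value <- anova_table[["Pr(>F)"]][1]
p_value <= alpha
```

With 5 groups and 250 rows, the table has 4 degrees of freedom for job_sat and 245 for the residuals.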
Pairwise tests
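$\mu_{\text{very dissatisfied}} \neq \mu_{\text{slightly dissatisfied}}$
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{neither}}$
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{slightly satisfied}}$
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{very satisfied}}$
$\mu_{\text{slightly dissatisfied}} \neq \mu_{\text{neither}}$
$\mu_{\text{slightly dissatisfied}} \neq \mu_{\text{slightly satisfied}}$
$\mu_{\text{slightly dissatisfied}} \neq \mu_{\text{very satisfied}}$
$\mu_{\text{neither}} \neq \mu_{\text{slightly satisfied}}$
$\mu_{\text{neither}} \neq \mu_{\text{very satisfied}}$
$\mu_{\text{slightly satisfied}} \neq \mu_{\text{very satisfied}}$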
pairwise.t.test()
pairwise.t.test(stack_overflow$converted_comp, stack_overflow$job_sat, p.adjust.method = "none")
Significant differences: "Very satisfied" vs. "Slightly dissatisfied"; "Very satisfied" vs. "Neither";
"Very satisfied" vs. "Slightly satisfied"
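As the number of groups increases...
With k groups there are k(k − 1)/2 pairwise tests (10 for the 5 job satisfaction levels), so the chance of at least one false positive rises quickly.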
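A short calculation shows how fast the problem grows. The sketch below counts the pairs with choose() and estimates the chance of at least one false positive; treating the tests as independent and using a significance level of 0.1 are both simplifying assumptions, so the probabilities are indicative only.

```r
n_groups <- 2:10

# Number of pairwise comparisons: choose(k, 2) = k * (k - 1) / 2
n_pairs <- choose(n_groups, 2)

# Chance of at least one false positive, assuming independent tests
# each run at alpha = 0.1 (both are simplifying assumptions)
alpha <- 0.1
p_any_false_positive <- 1 - (1 - alpha) ^ n_pairs

data.frame(n_groups, n_pairs, p_any_false_positive)
```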
Bonferroni correction
pairwise.t.test(stack_overflow$converted_comp, stack_overflow$job_sat, p.adjust.method = "bonferroni")
Significant differences: "Very satisfied" vs. "Slightly dissatisfied"; "Very satisfied" vs. "Slightly
satisfied"
More methods
p.adjust.methods
Bonferroni and Holm adjustments
p_values
0.268603 0.795778 0.295702 0.344819 0.368580 0.829315 0.003840 0.412482 0.159389 0.000838
Bonferroni
pmin(1, 10 * p_values)
1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.03840 1.00000 1.00000 0.00838
Holm (roughly)
pmin(1, 10:1 * sort(p_values))
0.00838 0.03456 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.82931
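These hand calculations can be compared with base R's p.adjust(). The sketch below reuses the ten p-values listed on this slide; the cummax() step is what the "roughly" leaves out, since the Holm adjustment also forces the adjusted values to be non-decreasing.

```r
p_values <- c(0.268603, 0.795778, 0.295702, 0.344819, 0.368580,
              0.829315, 0.003840, 0.412482, 0.159389, 0.000838)

# Bonferroni: multiply every p-value by the number of tests, cap at 1
bonf_manual <- pmin(1, length(p_values) * p_values)
bonf_builtin <- p.adjust(p_values, method = "bonferroni")
all.equal(bonf_manual, bonf_builtin)

# Holm: multiply sorted p-values by 10, 9, ..., 1
holm_rough <- pmin(1, 10:1 * sort(p_values))

# The full method also takes a running maximum before capping at 1
holm_exact <- pmin(1, cummax(10:1 * sort(p_values)))
holm_builtin <- sort(p.adjust(p_values, method = "holm"))
all.equal(holm_exact, holm_builtin)

# Positions where the rough version differs from the full method
which(holm_rough != holm_exact)
```

Only the largest p-value is affected here: the rough version leaves it at about 0.829, whereas the full method raises it to 1.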