
Chapter 2


Performing t-tests

HYPOTHESIS TESTING IN R

Richie Cotton
Data Evangelist at DataCamp
Two-sample problems
A common problem is comparing a sample statistic across the groups of a categorical variable.
converted_comp is a numerical variable.

age_first_code_cut is a categorical variable with two levels ("child" and "adult").

Are users who first programmed as a child compensated more, on average, than those who
started as adults?

Hypotheses
H0: The mean compensation (in USD) is the same for those who first coded as a child and
those who first coded as an adult.

$H_0: \mu_{\text{child}} = \mu_{\text{adult}}$

$H_0: \mu_{\text{child}} - \mu_{\text{adult}} = 0$

HA: The mean compensation (in USD) is greater for those who first coded as a child
than for those who first coded as an adult.

$H_A: \mu_{\text{child}} > \mu_{\text{adult}}$

$H_A: \mu_{\text{child}} - \mu_{\text{adult}} > 0$

Calculating groupwise summary statistics
stack_overflow %>%
  group_by(age_first_code_cut) %>%
  summarize(mean_compensation = mean(converted_comp))

# A tibble: 2 x 2
  age_first_code_cut mean_compensation
  <chr>                          <dbl>
1 adult                        111544.
2 child                        138275.

Test statistics
The sample mean estimates the population mean.

$\bar{x}$ denotes a sample mean.

$\bar{x}_{\text{child}}$ is the original sample mean compensation for those who first coded as a child.
$\bar{x}_{\text{adult}}$ is the original sample mean compensation for those who first coded as an adult.
$\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}$ is a test statistic.
z-scores are one type of (standardized) test statistic.

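Standardizing the test statistic
$$z = \frac{\text{sample stat} - \text{population parameter}}{\text{standard error}}$$

$$t = \frac{\text{difference in sample stats} - \text{difference in population parameters}}{\text{standard error}}$$

$$t = \frac{(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) - (\mu_{\text{child}} - \mu_{\text{adult}})}{SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}$$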

Standard error
$$SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) \approx \sqrt{\frac{s_{\text{child}}^2}{n_{\text{child}}} + \frac{s_{\text{adult}}^2}{n_{\text{adult}}}}$$

s is the standard deviation of the variable.

n is the sample size (number of observations/rows in sample).

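Assuming the null hypothesis is true
$$t = \frac{(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) - (\mu_{\text{child}} - \mu_{\text{adult}})}{SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}$$

$H_0: \mu_{\text{child}} - \mu_{\text{adult}} = 0$

$$t = \frac{\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}}{SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}$$

$$t = \frac{\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}}{\sqrt{\frac{s_{\text{child}}^2}{n_{\text{child}}} + \frac{s_{\text{adult}}^2}{n_{\text{adult}}}}}$$

stack_overflow %>%
  group_by(age_first_code_cut) %>%
  summarize(
    xbar = mean(converted_comp),
    s = sd(converted_comp),
    n = n()
  )

# A tibble: 2 x 4
  age_first_code_cut    xbar       s     n
  <chr>                <dbl>   <dbl> <int>
1 adult              111544. 270381.  1579
2 child              138275. 278130.  1001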

Calculating the test statistic
# A tibble: 2 x 4
  age_first_code_cut    xbar       s     n
  <chr>                <dbl>   <dbl> <int>
1 adult              111544. 270381.  1579
2 child              138275. 278130.  1001

$$t = \frac{\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}}{\sqrt{\frac{s_{\text{child}}^2}{n_{\text{child}}} + \frac{s_{\text{adult}}^2}{n_{\text{adult}}}}}$$

numerator <- xbar_child - xbar_adult
denominator <- sqrt(
  s_child ^ 2 / n_child + s_adult ^ 2 / n_adult
)
t_stat <- numerator / denominator

2.4046
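
As a quick check, the summary statistics from the tibble can be plugged straight into the formula. This is a minimal sketch in base R that hard-codes the rounded values shown on the slide, so the result should agree with the slide only to about four decimal places. The variable names follow the slide's code.

```r
# Summary statistics copied (rounded) from the tibble on the slide
xbar_child <- 138275; s_child <- 278130; n_child <- 1001
xbar_adult <- 111544; s_adult <- 270381; n_adult <- 1579

# Difference in sample means; under H0 the population difference is 0
numerator <- xbar_child - xbar_adult

# Approximate standard error of the difference in means
denominator <- sqrt(s_child ^ 2 / n_child + s_adult ^ 2 / n_adult)

t_stat <- numerator / denominator
print(round(t_stat, 4))
```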

Let's practice!

Calculating p-values from t-statistics
t-distributions
The test statistic, t, follows a t-distribution.
t-distributions have a parameter named degrees of freedom, or df.
t-distributions look like normal distributions, with fatter tails.

Degrees of freedom
As you increase the degrees of freedom, the t-distribution gets closer to the normal
distribution.

A normal distribution is a t-distribution with infinite degrees of freedom.

Degrees of freedom are the maximum number of logically independent values in
the data sample.
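
The "fatter tails" and the convergence to the normal distribution can be seen numerically. The sketch below compares the upper-tail probability beyond 2 under t-distributions with a few degrees of freedom against the standard normal. The cutoff of 2 and the chosen df values are arbitrary illustrative choices, not values from the course.

```r
# Upper-tail probability beyond t = 2 for several degrees of freedom
dfs <- c(2, 5, 30, 1000)
t_tails <- pt(2, df = dfs, lower.tail = FALSE)

# The same tail probability under the standard normal distribution
normal_tail <- pnorm(2, lower.tail = FALSE)

# Tail probabilities shrink toward the normal value as df grows
print(data.frame(df = dfs, t_tail = t_tails, normal_tail = normal_tail))
```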

Calculating degrees of freedom
Suppose your dataset has 5 independent observations.
Four of the values are 2, 6, 8, and 5.

You also know the sample mean is 5.

The last value is no longer independent; it must be 4.

There are 4 degrees of freedom.

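For the two-sample test, one degree of freedom is lost for each of the two group means:

$df = n_{\text{child}} + n_{\text{adult}} - 2$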
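
The five-observation example can be verified with a few lines of arithmetic. This sketch simply solves for the one value that the known mean forces; the variable names are illustrative.

```r
known_values <- c(2, 6, 8, 5)
n <- 5
sample_mean <- 5

# The mean fixes the total of all n values, so the last value is determined
last_value <- n * sample_mean - sum(known_values)
print(last_value)

# Only n - 1 of the values were free to vary
degrees_of_freedom <- n - 1
print(degrees_of_freedom)
```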

Hypotheses
H0: The mean compensation (in USD) is the same for those who first coded as a child and
those who first coded as an adult.

HA: The mean compensation (in USD) is greater for those who first coded as a child
than for those who first coded as an adult.

Use a right-tailed test.

Significance level
$\alpha = 0.1$

If $p \le \alpha$, then reject $H_0$.

Calculating p-values: one proportion vs. a value
p_value <- pnorm(z_score, lower.tail = FALSE)

Calculating p-values: two means from different groups
numerator <- xbar_child - xbar_adult
denominator <- sqrt(s_child ^ 2 / n_child + s_adult ^ 2 / n_adult)
t_stat <- numerator / denominator

2.4046

degrees_of_freedom <- n_child + n_adult - 2

2578

The standard error in the test statistic was calculated with an approximation (not bootstrapping),

so use the t-distribution CDF, pt(), rather than the normal CDF, pnorm().

p_value <- pt(t_stat, df = degrees_of_freedom, lower.tail = FALSE)

0.008130
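
To see how much the choice of distribution matters here, the sketch below recomputes the right-tailed p-value from the t-statistic and sample sizes reported on the slides, using both the t and the normal CDF, and applies the decision rule with α = 0.1. Because it starts from the rounded t-statistic, the result should match the slide's 0.008130 only approximately.

```r
t_stat <- 2.4046                         # value reported on the slide
degrees_of_freedom <- 1001 + 1579 - 2    # n_child + n_adult - 2
alpha <- 0.1

# Right-tailed p-value from the t-distribution
p_value_t <- pt(t_stat, df = degrees_of_freedom, lower.tail = FALSE)

# The same calculation with the normal CDF, for comparison only
p_value_z <- pnorm(t_stat, lower.tail = FALSE)

# Decision rule: reject H0 when p <= alpha
reject_h0 <- p_value_t <= alpha

print(c(t = p_value_t, normal = p_value_z))
print(reject_h0)
```

With this many degrees of freedom the two p-values are nearly identical; the gap is larger for small samples.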

Let's practice!

Paired t-tests
US Republican presidents dataset
state    county    repub_percent_08  repub_percent_12
Alabama  Bullock              25.69             23.51
Alabama  Chilton              78.49             79.78
Alabama  Clay                 73.09             72.31
Alabama  Cullman              81.85             84.16
Alabama  Escambia             63.89             62.46
Alabama  Fayette              73.93             76.19
Alabama  Franklin             68.83             69.68
...      ...                    ...               ...
500 rows; each row represents county-level votes in a presidential election.

1 https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ

Hypotheses
Question: Was the percentage of votes given to the Republican candidate lower in 2008
than in 2012?

$H_0: \mu_{2008} - \mu_{2012} = 0$

$H_A: \mu_{2008} - \mu_{2012} < 0$

Set the significance level to $\alpha = 0.05$.

The data is paired, since each pair of percentages refers to the same county.

From two samples to one
sample_data <- repub_votes_potus_08_12 %>%
  mutate(diff = repub_percent_08 - repub_percent_12)

ggplot(sample_data, aes(x = diff)) +
  geom_histogram(binwidth = 1)

Calculate sample statistics of the difference
sample_data %>%
  summarize(xbar_diff = mean(diff))

  xbar_diff
1 -2.643027

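Revised hypotheses
Old hypotheses

$H_0: \mu_{2008} - \mu_{2012} = 0$

$H_A: \mu_{2008} - \mu_{2012} < 0$

New hypotheses

$H_0: \mu_{\text{diff}} = 0$

$H_A: \mu_{\text{diff}} < 0$

$$t = \frac{\bar{x}_{\text{diff}} - \mu_{\text{diff}}}{\sqrt{\frac{s_{\text{diff}}^2}{n_{\text{diff}}}}}$$

$df = n_{\text{diff}} - 1$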

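Calculating the p-value
$$t = \frac{\bar{x}_{\text{diff}} - \mu_{\text{diff}}}{\sqrt{\frac{s_{\text{diff}}^2}{n_{\text{diff}}}}}$$

$df = n_{\text{diff}} - 1$

n_diff <- nrow(sample_data)

s_diff <- sample_data %>%
  summarize(sd_diff = sd(diff)) %>%
  pull(sd_diff)

# xbar_diff is the mean of diff calculated earlier (-2.643027)
t_stat <- (xbar_diff - 0) / sqrt(s_diff ^ 2 / n_diff)

-16.06374

degrees_of_freedom <- n_diff - 1

499

# Left-tailed test, so keep the default lower.tail = TRUE
p_value <- pt(t_stat, df = degrees_of_freedom)

2.084965e-47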

Testing differences between two means using t.test()
t.test(
  # Vector of differences
  sample_data$diff,
  # Choose between "two.sided", "less", "greater"
  alternative = "less",
  # Null hypothesis population parameter
  mu = 0
)

	One Sample t-test

data:  sample_data$diff
t = -16.064, df = 499, p-value < 2.2e-16
alternative hypothesis: true mean is less than 0
95 percent confidence interval:
     -Inf -2.37189
sample estimates:
mean of x
-2.643027

t.test() with paired = TRUE
t.test(
  sample_data$repub_percent_08,
  sample_data$repub_percent_12,
  alternative = "less",
  mu = 0,
  paired = TRUE
)

	Paired t-test

data:  sample_data$repub_percent_08 and sample_data$repub_percent_12
t = -16.064, df = 499, p-value < 2.2e-16
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
     -Inf -2.37189
sample estimates:
mean of the differences
              -2.643027
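
The one-sample test on the differences and the paired test give the same result because they are the same calculation. The sketch below checks this on synthetic data, since the county dataset is not included here. The vectors `pct_08` and `pct_12`, the seed, and the distribution parameters are all made up for illustration and are not real election figures.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Synthetic stand-in for the county data: 500 paired percentages
n <- 500
pct_08 <- rnorm(n, mean = 57, sd = 20)
pct_12 <- pct_08 + rnorm(n, mean = 2.6, sd = 3.7)
d <- pct_08 - pct_12

# Manual one-sample t-statistic on the differences (mu_diff = 0 under H0)
t_manual <- mean(d) / sqrt(var(d) / n)

# One-sample test on the differences, and the paired two-vector form
one_sample <- t.test(d, alternative = "less", mu = 0)
paired <- t.test(pct_08, pct_12, alternative = "less", mu = 0, paired = TRUE)

print(c(manual = t_manual,
        one_sample = unname(one_sample$statistic),
        paired = unname(paired$statistic)))
```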

Unpaired t.test()
t.test(
  x = sample_data$repub_percent_08,
  y = sample_data$repub_percent_12,
  alternative = "less",
  mu = 0
)

	Welch Two Sample t-test

data:  sample_data$repub_percent_08 and sample_data$repub_percent_12
t = -2.8788, df = 992.76, p-value = 0.002039
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
      -Inf -1.131469
sample estimates:
mean of x mean of y
 56.52034  59.16337

Running an unpaired t-test on paired data gives a higher chance of a false negative error
(less statistical power).
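
The loss of power can be illustrated on synthetic paired data, where the two measurements for each unit are strongly correlated. This is a sketch under made-up assumptions: the seed, sample size, and distribution parameters are hypothetical and chosen only so that the pairing carries most of the signal.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Synthetic paired measurements: the second is the first plus a small shift
n <- 500
pct_08 <- rnorm(n, mean = 57, sd = 20)
pct_12 <- pct_08 + rnorm(n, mean = 2.6, sd = 3.7)

# Paired test uses the within-pair differences
paired <- t.test(pct_08, pct_12, alternative = "less", paired = TRUE)

# Unpaired test (Welch by default) ignores the pairing
unpaired <- t.test(pct_08, pct_12, alternative = "less")

print(c(paired_t = unname(paired$statistic),
        unpaired_t = unname(unpaired$statistic)))
print(c(paired_p = paired$p.value, unpaired_p = unpaired$p.value))
```

The mean difference is identical in both tests; only the standard error changes, because the unpaired test has to absorb the large between-unit variation.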

Let's practice!

ANOVA tests
Job satisfaction: 5 categories
stack_overflow %>%
  count(job_sat)

# A tibble: 5 x 2
  job_sat                   n
  <fct>                 <int>
1 Very dissatisfied       187
2 Slightly dissatisfied   385
3 Neither                 245
4 Slightly satisfied      777
5 Very satisfied          981

Visualizing multiple distributions
Question: Is mean annual compensation different for different levels of job satisfaction?

stack_overflow %>%
  ggplot(aes(x = job_sat, y = converted_comp)) +
  geom_boxplot() +
  coord_flip()

Analysis of variance (ANOVA)
mdl_comp_vs_job_sat <- lm(converted_comp ~ job_sat, data = stack_overflow)

anova(mdl_comp_vs_job_sat)

Analysis of Variance Table

Response: converted_comp
            Df   Sum Sq  Mean Sq F value Pr(>F)
job_sat      4 1.09e+12 2.73e+11    3.65 0.0057 **
Residuals 2570 1.92e+14 7.47e+10

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

1 Linear regressions with lm() are taught in "Introduction to Regression in R"
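
The F value and p-value in the table can be reproduced from the other columns: F is the ratio of the two mean squares, and the p-value is the upper tail of an F-distribution with the two listed degrees of freedom. This sketch uses the rounded figures printed in the table, so it should match the printed F value and Pr(>F) only approximately.

```r
# Values copied (rounded) from the ANOVA table
mean_sq_job_sat <- 2.73e11
mean_sq_resid <- 7.47e10
df_job_sat <- 4
df_resid <- 2570

# F statistic: between-group mean square over residual mean square
f_value <- mean_sq_job_sat / mean_sq_resid

# p-value: upper tail of the F-distribution
p_value <- pf(f_value, df1 = df_job_sat, df2 = df_resid, lower.tail = FALSE)

print(round(c(F = f_value, p = p_value), 4))
```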

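Pairwise tests
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{slightly dissatisfied}}$
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{neither}}$
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{slightly satisfied}}$
$\mu_{\text{very dissatisfied}} \neq \mu_{\text{very satisfied}}$
$\mu_{\text{slightly dissatisfied}} \neq \mu_{\text{neither}}$
$\mu_{\text{slightly dissatisfied}} \neq \mu_{\text{slightly satisfied}}$
$\mu_{\text{slightly dissatisfied}} \neq \mu_{\text{very satisfied}}$
$\mu_{\text{neither}} \neq \mu_{\text{slightly satisfied}}$
$\mu_{\text{neither}} \neq \mu_{\text{very satisfied}}$
$\mu_{\text{slightly satisfied}} \neq \mu_{\text{very satisfied}}$

Set significance level to $\alpha = 0.2$.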

pairwise.t.test()
pairwise.t.test(stack_overflow$converted_comp, stack_overflow$job_sat, p.adjust.method = "none")

Pairwise comparisons using t tests with pooled SD

data: stack_overflow$converted_comp and stack_overflow$job_sat

                      Very dissatisfied Slightly dissatisfied Neither Slightly satisfied
Slightly dissatisfied 0.26860           -                     -       -
Neither               0.79578           0.36858               -       -
Slightly satisfied    0.29570           0.82931               0.41248 -
Very satisfied        0.34482           0.00384               0.15939 0.00084

P value adjustment method: none

Significant differences: "Very satisfied" vs. "Slightly dissatisfied"; "Very satisfied" vs. "Neither";
"Very satisfied" vs. "Slightly satisfied"

As the number of groups increases...
The number of pairwise comparisons grows quickly, and with it the chance of at least one false positive.
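
A short calculation shows how fast this grows. With k groups there are k(k - 1)/2 pairs, which gives the 10 comparisons for the 5 job satisfaction levels. The false-positive probability below is a simplified sketch that assumes the tests are independent, which pairwise comparisons sharing groups generally are not, so treat it as a rough guide rather than an exact figure.

```r
alpha <- 0.2
n_groups <- 2:10

# Number of distinct pairs among k groups: k * (k - 1) / 2
n_pairs <- choose(n_groups, 2)

# Chance of at least one false positive if every H0 is true,
# ASSUMING independent tests (a simplification)
p_any_false_positive <- 1 - (1 - alpha) ^ n_pairs

print(data.frame(n_groups, n_pairs, p_any_false_positive))
```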

Bonferroni correction
pairwise.t.test(stack_overflow$converted_comp, stack_overflow$job_sat, p.adjust.method = "bonferroni")

Pairwise comparisons using t tests with pooled SD

data: stack_overflow$converted_comp and stack_overflow$job_sat

                      Very dissatisfied Slightly dissatisfied Neither Slightly satisfied
Slightly dissatisfied 1.0000            -                     -       -
Neither               1.0000            1.0000                -       -
Slightly satisfied    1.0000            1.0000                1.0000  -
Very satisfied        1.0000            0.0384                1.0000  0.0084

P value adjustment method: bonferroni

Significant differences: "Very satisfied" vs. "Slightly dissatisfied"; "Very satisfied" vs. "Slightly satisfied"

More methods
p.adjust.methods

"holm" "hochberg" "hommel" "bonferroni" "BH" "BY" "fdr" "none"

Bonferroni and Holm adjustments
p_values

0.268603 0.795778 0.295702 0.344819 0.368580 0.829315 0.003840 0.412482 0.159389 0.000838

Bonferroni

pmin(1, 10 * p_values)

1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.03840 1.00000 1.00000 0.00838

Holm (roughly; the exact method also forces the adjusted values to be non-decreasing)

pmin(1, 10:1 * sort(p_values))

0.00838 0.03456 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.82931
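
The "roughly" matters for the last value: 0.82931 is smaller than the adjusted values before it, which an adjusted p-value sequence should not allow. A sketch of the fuller calculation takes a running maximum with `cummax()` before capping at 1, and can be compared against the built-in `p.adjust()`. The `p_values` vector is copied from the slide; the other variable names are illustrative.

```r
# Unadjusted pairwise p-values, copied from the slide
p_values <- c(0.268603, 0.795778, 0.295702, 0.344819, 0.368580,
              0.829315, 0.003840, 0.412482, 0.159389, 0.000838)
m <- length(p_values)
sorted_p <- sort(p_values)

# Bonferroni: multiply every p-value by the number of tests, cap at 1
bonferroni_manual <- pmin(1, m * p_values)

# The slide's rough Holm: decreasing multipliers on the sorted p-values
holm_rough <- pmin(1, m:1 * sorted_p)

# Holm with the running maximum that keeps the sequence non-decreasing
holm_manual <- pmin(1, cummax(m:1 * sorted_p))

# Built-in equivalents for comparison
bonferroni_builtin <- p.adjust(p_values, method = "bonferroni")
holm_builtin <- p.adjust(sorted_p, method = "holm")

print(rbind(holm_rough, holm_manual, holm_builtin))
```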

Let's practice!