
Hypothesis Testing With T Tests

The document discusses the t-test and its use in hypothesis testing, explaining Student's t distribution and how it can be used to test differences between sample means when the population standard deviation is unknown. It covers one-sample, independent samples, and paired/dependent samples t-tests, and provides steps and an example of conducting a one-sample t-test to analyze therapy attendance data.

Hypothesis Testing with t Tests

Arlo Clark-Foos
What have we done so far?
• Hypothesis Testing & Inferential Statistics
◦ Alpha levels, cutoffs, p values
◦ One-tailed vs. two-tailed tests
◦ Goal: Estimate the likelihood of obtaining a sample mean given some known population parameters (μ and σ)
◦ What if we didn’t know all of the population parameters?
The Story of Student’s t

“Guinness is the best beer available, it does not need advertising as its quality will
sell it, and those who do not drink it are to be sympathized with rather than
advertised to.” --W.S. Gosset (aka “Student”)

The pros and cons of beer sampling…


Using Samples to Estimate
Population Variability
• Acknowledge error
• Smaller samples, less spread
• New! The sample standard deviation, s, divides by N − 1:

$$s = \sqrt{\frac{\sum (X - M)^2}{N - 1}}$$

• This correction will affect larger samples less than it will affect smaller samples.
◦ N = 65, N – 1 = 64 (change of 1.5%)
◦ N = 4, N – 1 = 3 (change of 25%)
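
A minimal Python sketch of the N − 1 correction, using only the standard library. The four scores are hypothetical and are not taken from the slides; `sample_sd` and `denominator_change` are illustrative helper names.

```python
import math
import statistics

def sample_sd(scores):
    """Sample standard deviation: sqrt(sum((X - M)^2) / (N - 1))."""
    m = sum(scores) / len(scores)
    ss = sum((x - m) ** 2 for x in scores)
    return math.sqrt(ss / (len(scores) - 1))

def denominator_change(n):
    """Proportional change in the denominator when N becomes N - 1."""
    return 1 / n

scores = [2, 4, 4, 6]                     # hypothetical scores, illustration only
s = sample_sd(scores)                     # divides by N - 1
sigma_style = statistics.pstdev(scores)   # divides by N

print(round(s, 3), round(sigma_style, 3))
print(f"{denominator_change(65):.1%}", f"{denominator_change(4):.1%}")
```

The corrected estimate is always the larger of the two, and the gap shrinks as N grows, which is the point of the 1.5% versus 25% comparison.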
What happened to σM?
• Still there, only used for z tests
• We have a new measure of standard deviation for a sample (as opposed to a population): s
◦ We need a new measure of standard error based on the sample standard deviation:

$$s_M = \frac{s}{\sqrt{N}}$$

◦ Wait, what happened to “N − 1”?
◦ We already did that when we calculated s, so don’t correct again!
Student’s t Statistic

$$t = \frac{M - \mu_M}{s_M}$$

Indicates the distance of a sample mean from a population mean in standard errors (like standard deviations)
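
As a rough illustration, the standard error and the t statistic can be written as two small Python functions. The scores and the population mean of 5 are hypothetical, and the function names are chosen only for this sketch.

```python
import math
import statistics

def standard_error(scores):
    """Estimated standard error of the mean: s_M = s / sqrt(N).

    statistics.stdev already divides by N - 1, so no second correction is made.
    """
    return statistics.stdev(scores) / math.sqrt(len(scores))

def t_statistic(scores, mu):
    """Distance of the sample mean from mu, measured in standard errors."""
    return (statistics.mean(scores) - mu) / standard_error(scores)

scores = [4, 6, 8, 10]   # hypothetical sample
mu = 5                   # hypothetical population mean
print(round(standard_error(scores), 3), round(t_statistic(scores, mu), 3))
```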
The many flavors of tea

• Single-Sample t
◦ One sample, compared with a known population mean
◦ Goal: Is our sample different from the population?

• Independent Samples t
◦ Different (independent) samples of participants experience each level of the IV
◦ Are our samples from different populations?

• Paired/Dependent Samples t
◦ Same or related (dependent) samples of participants experience each level of the IV
◦ Are our samples from different populations?
Degrees of Freedom
• Necessary when making estimates…
• The number of scores that are free to vary when estimating a population parameter from a sample
◦ df = N – 1 (for a Single-Sample t Test)

Example: I decide to ask 6 people how often they floss their teeth and record their average = 2 (times per week)
• Eventual goal: Estimate population parameters (population variability).
• How many scores are free to vary and can still produce an average of 2?

Status    Sample A      Sample B
Free      3             2
Free      5             1
Free      1             0
Free      0             0
Free      2             0
LOCKED    1             9
          Average = 2   Average = 2
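
The flossing example can be checked with a short Python sketch: once five of the six scores are chosen, the sixth is forced by the required average. The helper name `locked_score` is made up for this illustration.

```python
def locked_score(free_scores, n, mean):
    """Return the value the last score must take so that a sample of
    size n, containing the n - 1 freely chosen scores, has the given mean."""
    if len(free_scores) != n - 1:
        raise ValueError("exactly n - 1 scores are free to vary")
    return n * mean - sum(free_scores)

# The two columns from the slide: five free scores each, average fixed at 2.
print(locked_score([3, 5, 1, 0, 2], n=6, mean=2))  # 1
print(locked_score([2, 1, 0, 0, 0], n=6, mean=2))  # 9
```

Only N – 1 = 5 scores carry independent information about variability, which is why df = 5 here.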
One Tailed vs. Two Tailed Tests
Six Steps for Hypothesis Testing
1. Identify
2. State the hypotheses
3. Characteristics of the comparison
distribution
4. Critical values
5. Calculate
6. Decide
old hat
Single-Sample t Test: Attendance in
Therapy Sessions
• Our counseling center on campus is concerned that most students requiring therapy do not take advantage of its services. Right now students attend only 4.6 sessions in a given year! Administrators are considering having patients sign a contract stating they will attend at least 10 sessions in an academic year.
• Question: Does signing the contract actually increase participation/attendance?
• We had 5 patients sign the contract and we counted the number of times they attended therapy sessions.
Number of Attended Therapy Sessions
6
6
12
7
8
Single-Sample t Test: Attendance in
Therapy Sessions
1. Identify
◦ Populations:
• Pop 1: All clients who sign contract
• Pop 2: All clients who do not sign contract

◦ Distribution:
• One sample mean: Distribution of means

◦ Test & Assumptions: Population mean is known but not standard deviation → single-sample t test
1. Data are interval
2. Probably not random selection
3. Sample size of 5 is less than 30, therefore distribution might not be normal
Single-Sample t Test: Attendance in
Therapy Sessions
2. State the null and research hypotheses

H0: Clients who sign the contract will attend the same number of sessions as those who do not sign the contract.

H1: Clients who sign the contract will attend a different number of sessions than those who do not sign the contract.
Single-Sample t Test: Attendance in
Therapy Sessions
3. Determine characteristics of comparison
distribution (distribution of sample means)
◦ Population: μM = μ = 4.6 times
◦ Sample: M = 7.8 times, s = 2.490, sM = 1.114

# of Sessions (X)   X - M    (X - M)²
6                   -1.8     3.24
6                   -1.8     3.24
12                  4.2      17.64
7                   -0.8     0.64
8                   0.2      0.04
MX = 7.8                     SSX = 24.8

$$s = \sqrt{\frac{\sum (X - M)^2}{N - 1}} = \sqrt{\frac{24.8}{5 - 1}} = 2.490 \qquad s_M = \frac{s}{\sqrt{N}} = \frac{2.490}{\sqrt{5}} = 1.114$$
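
The step 3 arithmetic can be reproduced with a few lines of Python. This is only a sketch of the hand calculation using the five attendance scores from the slide; the variable names are illustrative.

```python
import math

sessions = [6, 6, 12, 7, 8]              # attendance data from the slide
n = len(sessions)
m = sum(sessions) / n                    # sample mean M
deviations = [x - m for x in sessions]   # X - M for each client
ss = sum(d ** 2 for d in deviations)     # sum of squared deviations
s = math.sqrt(ss / (n - 1))              # sample standard deviation
s_m = s / math.sqrt(n)                   # standard error of the mean

print(f"M = {m:.1f}, SS = {ss:.1f}, s = {s:.3f}, sM = {s_m:.3f}")
```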
Single-Sample t Test: Attendance in
Therapy Sessions
μM = 4.6, sM = 1.114, M = 7.8, N = 5, df = 4
4. Determine critical value (cutoffs)
◦ In Behavioral Sciences, we use p = .05 (5%)
◦ Our hypothesis (“Clients who sign the contract will attend a different number of sessions than those who do not sign the contract.”) is nondirectional, so our hypothesis test is two-tailed.

df = 4
Single-Sample t Test: Attendance in
Therapy Sessions
μM = 4.6, sM = 1.114, M = 7.8, N = 5, df = 4
4. Determine critical value (cutoffs)

tcrit = ±2.776
Single-Sample t Test: Attendance in
Therapy Sessions
μM = 4.6, sM = 1.114, M = 7.8, N = 5, df = 4
5. Calculate the test statistic

$$t = \frac{M - \mu_M}{s_M} = \frac{7.8 - 4.6}{1.114} = 2.873$$
6. Make a decision

Cutoffs: -2.776 and +2.776
Single-Sample t Test: Attendance in
Therapy Sessions
μM = 4.6, sM = 1.114, M = 7.8, N = 5, df = 4

6. Make a decision
t = 2.873 > tcrit = 2.776, so reject the null hypothesis

Clients who sign a contract will attend more sessions than those who do not sign a contract, t(4) = 2.87, p < .05.
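
Steps 3 through 6 can be collected into one function. This is a sketch, not a replacement for a statistics package: the critical value 2.776 (df = 4, two-tailed, p = .05) is passed in by hand from a t table, and `single_sample_t` is a name invented for this example. Carrying full precision gives t ≈ 2.874 rather than the 2.873 obtained from the rounded sM; both round to 2.87.

```python
import math
import statistics

def single_sample_t(scores, mu, t_crit):
    """Return (t, df, reject) for a two-tailed single-sample t test.

    t_crit is the positive cutoff looked up in a t table for df = N - 1.
    """
    m = statistics.mean(scores)
    s_m = statistics.stdev(scores) / math.sqrt(len(scores))
    t = (m - mu) / s_m
    return t, len(scores) - 1, abs(t) > t_crit

t, df, reject = single_sample_t([6, 6, 12, 7, 8], mu=4.6, t_crit=2.776)
print(f"t({df}) = {t:.2f}, reject H0: {reject}")
```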
Reporting Results in APA Format
1. Write the symbol for the test statistic (e.g., z
or t)
2. Write the degrees of freedom in parentheses
3. Write an equal sign and then the value of the
test statistic (2 decimal places)
4. Write a comma and then whether the p value
associated with the test statistic was less than
or greater than the cutoff p value of .05 (or
report exact p value).

t(4) = 2.87, p < .05
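
The four reporting steps can be sketched as a small formatting helper. The function name `apa_t` and its parameters are assumptions made for illustration; it covers only the "less than or greater than the cutoff" form, not exact p values.

```python
def apa_t(t, df, significant, alpha=0.05):
    """Build a string of the form 't(df) = value, p < .05'."""
    alpha_text = f"{alpha:.2f}".lstrip("0")    # ".05", no leading zero
    relation = "<" if significant else ">"
    return f"t({df}) = {t:.2f}, p {relation} {alpha_text}"

print(apa_t(2.873, 4, True))   # t(4) = 2.87, p < .05
```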


Another example?
A Tale of a Tail
The citizens of several Georgia towns are worried that a fiberglass insulation plant is polluting the local environment. A doctor at one hospital, Our Sister of the Failing Mercy, noted 3 babies born with a coccyx (tailbone) in the past year. Digging deeper, she read a report in Human Pathology (Dao & Netsky, 1984) that between 1884 and 1984 only 23 babies were born with tails. She believes her hospital data to be unusual, but she also writes to doctors at 8 other hospitals to determine how often they had seen this birth defect in the past year (these data are below).

Hospital        # of Tails
North Shore     2
Town Center     0
St. Mary        0
University      1
Failing Mercy   3
South Central   1
Oakmont         0
Bellevue        1
East Valley     0
∑ = 8, N = 8, M = 1

“between 1884 and 1984 only 23 babies were born with tails…”
23 babies in 100 years (µ) = .23
Rolling Kegs for Fun and Profit
Stella Cunliffe

• Making the perfect handmade cask
◦ Not too big (too much beer, lower profits)
◦ Not too small (not enough beer, fewer clients)

• Expectation: High rejection of poorly sized casks
• Data: Little-to-no rejections…why?
◦ Visit workstation for employee making rejections.
◦ Conditions must be equal in order to make a fair comparison.
• “how impossible it is to find human beings without biases, without prejudices, and without the delightful idiosyncrasies which make them so fascinating.” (Cunliffe, 1976, p. 5)
Studies with Two Samples
• Independent
◦ What is it?
◦ Pros
◦ Cons

• Paired (Dependent)
◦ What is it?
◦ Pros
◦ Cons

• Hypothetical Beer Tasting Experiment
◦ What is the ideal design for this study?
Two Related (dependent) samples

PAIRED SAMPLES
t TEST
Paired-Samples t Test

$$t = \frac{M - \mu_M}{s_M}$$

• Used to compare 2 means for a within-groups design, a situation in which every participant is in both samples (paired/dependent)

• New Terminology
◦ Distribution of Mean Differences
◦ Difference Scores: X1 – Y1, X2 – Y2, …

• Let’s walk through an example…


Six Steps for Hypothesis Testing
1. Identify
2. State the hypotheses
3. Characteristics of the comparison
distribution
4. Critical values
5. Calculate
6. Decide
old hat
Paired Samples t Test:
Does Studying in the Exam Room Help?
I have a debate with a research assistant about context effects in studying
for exams. She believes that she does far better when she studies for an
exam in the same room as she later takes the exam. I told her that I could
see it either hurting or helping students. We agreed to have a group of 5
participants complete two highly similar math exams. Half of these
participants studied for and completed the exams in the same room while
the other half were in different rooms, order counterbalanced*. Data for
SAME and DIFFERENT rooms are below.

SAME (X)   DIFFERENT (Y)   Difference Score (Y - X)
122        111             -11
131        116             -15
127        113             -14
123        119             -4
132        121             -11

M = -11
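
A short Python sketch of the difference scores. Note that the values shown on the slide (-11, -15, …) correspond to DIFFERENT minus SAME, so that is the direction computed here; the variable names are illustrative.

```python
same = [122, 131, 127, 123, 132]        # SAME room (X)
different = [111, 116, 113, 119, 121]   # DIFFERENT room (Y)

# One difference score per participant: DIFFERENT minus SAME.
differences = [y - x for x, y in zip(same, different)]
mean_difference = sum(differences) / len(differences)

print(differences, mean_difference)  # [-11, -15, -14, -4, -11] -11.0
```

From here on the paired test treats `differences` as a single sample, so everything that follows is the single-sample procedure applied to these five numbers.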
Paired Samples t Test:
Does Studying in the Exam Room Help?
1. Identify
◦ Populations:
• Pop 1: Exam grades when studying and testing are in the same room.
• Pop 2: Exam grades when studying and testing are in different rooms.

◦ Distribution:
• Mean of Difference Scores: Distribution of Mean Differences

◦ Test & Assumptions: One group of participants that is studied at two time points → paired-samples t test
1. Data are interval
2. Probably not random selection
3. Sample size of 5 is less than 30, therefore distribution might not be normal
Paired Samples t Test:
Does Studying in the Exam Room Help?
2. State the null and research hypotheses

H0: Studying and testing in the same room will result in the same grade as studying and testing in different rooms.

H1: Studying and testing in the same room will result in a different grade than studying and testing in different rooms.
Paired Samples t Test:
Does Studying in the Exam Room Help?
3. Determine characteristics of comparison
distribution (distribution of mean differences)
◦ Population: μM = 0 (i.e., no mean difference)
◦ Sample(s): M = -11, s = 4.301, sM = 1.923

Difference Score   Deviation Score   Squared Deviation
(Y - X)            (Score - Mean)    (Score - Mean)²
-11                0                 0
-15                -4                16
-14                -3                9
-4                 7                 49
-11                0                 0
M = -11                              SS = 74

$$s = \sqrt{\frac{\sum (X - M)^2}{N - 1}} = \sqrt{\frac{74}{5 - 1}} = 4.301 \qquad s_M = \frac{s}{\sqrt{N}} = \frac{4.301}{\sqrt{5}} = 1.923$$
Paired Samples t Test:
Does Studying in the Exam Room Help?
μM = 0, sM = 1.923, M = -11, N = 5, df = 4
4. Determine critical value (cutoffs)
◦ In Behavioral Sciences, we use p = .05 (5%)
◦ Our hypothesis (“Studying and testing in the same room will result in a different grade than studying and testing in different rooms.”) is nondirectional, so our hypothesis test is two-tailed.

df = 4
Paired Samples t Test:
Does Studying in the Exam Room Help?
μM = 0, sM = 1.923, M = -11, N = 5, df = 4
4. Determine critical value (cutoffs)

tcrit = ±2.776
Paired Samples t Test:
Does Studying in the Exam Room Help?
μM = 0, sM = 1.923, M = -11, N = 5, df = 4
5. Calculate the test statistic

$$t = \frac{M - \mu_M}{s_M} = \frac{-11 - 0}{1.923} = -5.720$$
6. Make a decision

Cutoffs: -2.776 and +2.776
Paired Samples t Test:
Does Studying in the Exam Room Help?
μM = 0, sM = 1.923, M = -11, N = 5, df = 4

6. Make a decision
t = -5.720 falls beyond the cutoff of -2.776 (|t| = 5.720 > 2.776), so reject the null hypothesis

People studying and testing in different rooms performed worse than those studying and testing in the same room, t(4) = -5.72, p < .05.
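
A sketch of the whole paired-samples test in Python, under the same assumptions as the hand calculation: differences are taken as DIFFERENT minus SAME, the comparison mean is 0, and the critical value 2.776 (df = 4, two-tailed, p = .05) is supplied from a t table. `paired_t` is a hypothetical helper name.

```python
import math
import statistics

def paired_t(x, y, t_crit):
    """Return (t, df, reject) for a two-tailed paired-samples t test.

    Difference scores are y - x; the null hypothesis is a mean difference of 0.
    """
    diffs = [b - a for a, b in zip(x, y)]
    m = statistics.mean(diffs)
    s_m = statistics.stdev(diffs) / math.sqrt(len(diffs))
    t = (m - 0) / s_m
    return t, len(diffs) - 1, abs(t) > t_crit

same = [122, 131, 127, 123, 132]
different = [111, 116, 113, 119, 121]
t, df, reject = paired_t(same, different, t_crit=2.776)
print(f"t({df}) = {t:.2f}, reject H0: {reject}")
```

The decision compares the absolute value of t with the cutoff, so the sign of t only tells us the direction of the difference.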
Paired Samples: Ideal Design?
• Given the simple calculations and low cost (time and participants), why wouldn’t we always use paired/dependent/within-subjects designs?

• Back to the hypothetical beer tasting experiment…
Two Unrelated (unpaired) samples

INDEPENDENT
SAMPLES
t TEST
My Father’s “Chinese Lock”

• Does teaching someone a strategy decrease the amount of time it takes to solve the puzzle?

• How do you design this study?
◦ Paired Samples option #1: No Strategy → Strategy
◦ Paired Samples option #2: Strategy → No Strategy
Independent Samples t Test
• Used to compare 2 means for a between-groups design, a situation in which each participant is assigned to only one condition.

$$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$$

• New Statistics & Terminology:
◦ Distribution of Differences Between Means
◦ dfX, dfY, dfTotal
◦ Pooled Variance
◦ Standard Error of the Difference
Six Steps for Hypothesis Testing
1. Identify
2. State the hypotheses
3. Characteristics of the comparison
distribution
4. Critical values
5. Calculate
6. Decide
old hat
Independent Samples t Test:
Gender Differences in Humor Appreciation

I tend to believe that very few differences exist between males and females in cognitive abilities, but there is some evidence that there are gender differences in, for example, humor appreciation.
Independent Samples t Test:
Gender Differences in Humor Appreciation
In this hypothetical study we ask: what percentage of cartoons do men and women consider funny? We recruited 9 people from the psychology subject pool and asked them to view a cartoon. After the cartoon, each participant gave us a humor rating of the cartoon, from 0-100 (100 being the funniest possible). Here are those data.

Women (X)   Men (Y)
84          88
97          90
58          52
90          97
            86
Independent Samples t Test:
Gender Differences in Humor Appreciation
1. Identify
◦ Populations:
• Pop 1: Women exposed to the cartoon
• Pop 2: Men exposed to the cartoon

◦ Distribution:
• Difference Between Means: Distribution of Differences Between Means
• Not Distribution of Mean Differences

◦ Test & Assumptions: Two groups of participants, each participant in only one group → independent-samples t test
1. Data are interval
2. Random selection
3. Sample size of 9 is less than 30, therefore distribution might not be normal
Independent Samples t Test:
Gender Differences in Humor Appreciation
2. State the null and research hypotheses

H0: Women will categorize the same number of cartoons as funny as will men.

H1: Women will categorize a different number of cartoons as funny than will men.
Independent Samples t Test:
Gender Differences in Humor Appreciation
3. Determine characteristics of comparison
distribution (distribution of differences
between means)
◦ Population: μ1 = μ2 (i.e., no difference between means)

$$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$$

What the @#$% ?


Independent Samples t Test:
Gender Differences in Humor Appreciation
3. Determine characteristics of comparison
distribution

◦ Standard Error of the Difference (sDifference):
a) Calculate variance for each sample
b) Pool variances, accounting for sample size
c) Convert from squared standard deviation to squared standard error
d) Add the two variances
e) Take square root to get estimated standard error for distribution of differences between means.
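
Steps a) through e) can be sketched as a single Python function. The function name is an assumption made for this example, and the data are the humor ratings given earlier. Computed directly from the raw scores, the two sample variances are about 289.58 (women) and 309.8 (men), which leads to a standard error of the difference of about 11.64.

```python
import math
import statistics

def standard_error_of_difference(x, y):
    """Estimated standard error for the distribution of differences between means."""
    # a) Variance of each sample (N - 1 in the denominator).
    var_x = statistics.variance(x)
    var_y = statistics.variance(y)
    # b) Pool the variances, weighting each by its share of the degrees of freedom.
    df_x, df_y = len(x) - 1, len(y) - 1
    df_total = df_x + df_y
    pooled = (df_x / df_total) * var_x + (df_y / df_total) * var_y
    # c) Convert the pooled variance to a squared standard error for each sample.
    var_mx = pooled / len(x)
    var_my = pooled / len(y)
    # d) Add the two, then e) take the square root.
    return math.sqrt(var_mx + var_my)

women = [84, 97, 58, 90]
men = [88, 90, 52, 97, 86]
print(round(standard_error_of_difference(women, men), 3))
```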
Calculating sDifference
a) Calculate variance (s²) for each sample

Women (X)   X - M     (X - M)²
84          1.75      3.063
97          14.75     217.563
58          -24.25    588.063
90          7.75      60.063
MX = 82.25            SSX = 868.75

$$s_X^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{868.75}{4 - 1} = 289.583$$

Men (Y)     Y - M     (Y - M)²
88          5.4       29.16
90          7.4       54.76
52          -30.6     936.36
97          14.4      207.36
86          3.4       11.56
MY = 82.6             SSY = 1239.2

$$s_Y^2 = \frac{\sum (Y - M)^2}{N - 1} = \frac{1239.2}{5 - 1} = 309.8$$
Independent Samples t Test
• New Types of Sample Means…
• New Degrees of Freedom

Women (X): 84, 97, 58, 90 (N = 4)     dfX = N – 1 = 4 – 1 = 3
Men (Y): 88, 90, 52, 97, 86 (N = 5)   dfY = N – 1 = 5 – 1 = 4

dfTotal = dfX + dfY = 3 + 4 = 7
Pooled Variance
b) Pool variances, accounting for sample size
• Weighted average of the two estimates of variance – one from each sample – that are calculated when conducting an independent samples t test.

$$s^2_{Pooled} = \left(\frac{df_X}{df_{Total}}\right) s_X^2 + \left(\frac{df_Y}{df_{Total}}\right) s_Y^2$$

$$s^2_{Pooled} = \left(\frac{3}{7}\right) 289.583 + \left(\frac{4}{7}\right) 309.8 = 124.107 + 177.029 = 301.136$$
Independent Samples t Test:
Gender Differences in Humor Appreciation
c) Convert from squared standard deviation to squared standard error

$$s^2_{Pooled} = 301.136$$

$$s^2_{M_X} = \frac{s^2_{Pooled}}{N_X} = \frac{301.136}{4} = 75.284 \qquad s^2_{M_Y} = \frac{s^2_{Pooled}}{N_Y} = \frac{301.136}{5} = 60.227$$

d) Add the two variances

$$s^2_{Difference} = s^2_{M_X} + s^2_{M_Y} = 75.284 + 60.227 = 135.511$$

e) Take square root to get estimated standard error for distribution of differences between means.

$$s_{Difference} = \sqrt{s^2_{Difference}} = \sqrt{135.511} = 11.641$$
Independent Samples t Test:
Gender Differences in Humor Appreciation
4. Determine critical value (cutoffs)
◦ In Behavioral Sciences, we use p = .05 (5%)
◦ Our hypothesis (“Women will categorize a different number of cartoons funny than will men.”) is nondirectional, so our hypothesis test is two-tailed.

dfTotal = 7
tcrit = ±2.365
Independent Samples t Test:
Gender Differences in Humor Appreciation
4. Determine critical values
◦ dfTotal = 7, p = .05, tcrit = ±2.365

5. Calculate a test statistic

$$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$$

$$t = \frac{82.25 - 82.6}{s_{Difference}} = \frac{-0.35}{s_{Difference}} \approx -0.03$$
Independent Samples t Test:
Gender Differences in Humor Appreciation
6. Make a decision

• Fail to reject the null hypothesis
◦ We found no evidence that men and women differ in how humorous they find cartoons, t(7) = -0.03, p > .05
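
The full independent-samples test can be sketched in Python as follows. The critical value 2.365 (dfTotal = 7, two-tailed, p = .05) is entered by hand from a t table, and `independent_t` is a name invented for this example. The pooled variance is written as (SSX + SSY) / dfTotal, which is algebraically the same as the df-weighted average of the two sample variances.

```python
import math
import statistics

def independent_t(x, y, t_crit):
    """Return (t, df_total, reject) for a two-tailed independent-samples t test
    that pools the two sample variances."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    df_total = (len(x) - 1) + (len(y) - 1)
    pooled = (ss_x + ss_y) / df_total
    s_difference = math.sqrt(pooled / len(x) + pooled / len(y))
    t = (mean_x - mean_y) / s_difference
    return t, df_total, abs(t) > t_crit

women = [84, 97, 58, 90]
men = [88, 90, 52, 97, 86]
t, df_total, reject = independent_t(women, men, t_crit=2.365)
print(f"t({df_total}) = {t:.2f}, reject H0: {reject}")
```

With t this close to zero the result is nowhere near either cutoff, so the null hypothesis is retained.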
Summary
