
Lecture 3 - Statistical Tests

The document discusses statistical tests, specifically the t-test. It explains that the t-test is used to test hypotheses about means when the population variance is unknown. The t-test comes in three varieties: single sample, independent samples, and dependent samples. It discusses the assumptions, calculations, and interpretations for each type of t-test.


STATISTICAL TESTS
THE T-TEST
 What is the main use of the t-test?
 How is the distribution of t related to the unit normal?
 When would we use a t-test instead of a z-test? Why might
we prefer one to the other?
 What are the chief varieties or forms of the t-test?
 What is the standard error of the difference between means?
What are the factors that influence its size?
• Identify the appropriate version of t to use for a given design.
• Compute and interpret t-tests appropriately.
• Given that
H0: μ = 75; H1: μ ≠ 75; s_y = 14; N = 49; t(.05, 48) = 2.01
construct a rejection region. Draw a picture to illustrate.
BACKGROUND
 The t-test is used to test hypotheses about means when the population variance is
unknown (the usual case). Closely related to z, the unit normal.
 Developed by Gossett for the quality control of beer.
 Comes in 3 varieties: Single sample, independent samples, and dependent
samples.
 Single sample t – we have only 1 group; want to test against a hypothetical
mean.
 Independent samples t – we have 2 means, 2 groups; no relation between groups,
e.g., people randomly assigned to a single group.
 Dependent t – we have two means. Either the same units in both groups, or the units are
related, e.g., leaves-roots, top soil-deep soil, raw-treated H2O.
ONE SAMPLE T-TEST

 One sample t-test: statistical procedure used to determine whether a


sample of observations could have been generated by a process with a
specific mean.
 Suppose you are interested in determining whether an assembly line
produces soap tablets with pH 7.5.
 To test this hypothesis, you could collect a sample of soap tablets
from the production line, measure their pH, and compare the sample
with a value of 7.5 using a one-sample t-test.
 There are two kinds of hypotheses for a one sample t-test, the null
hypothesis and the alternative hypothesis.
 The null hypothesis assumes that no difference exists between the true
mean (μ) & the comparison value (m0), whereas the alternative
hypothesis assumes that some difference exists.
 Purpose of one sample t-test is to determine if the null hypothesis
should be rejected, given the sample data.
 Alternative hypothesis can assume one of 3 forms depending on the
question being asked.
 If goal is to measure any difference, regardless of direction, a two-
tailed hypothesis is used.
 If the direction of difference between sample mean and comparison
value matters, either an upper-tailed or lower-tailed hypothesis is
used. The null hypothesis remains the same for each type of one
sample t-test.
 The hypotheses are formally defined below:
 The null hypothesis (H0) assumes the difference between the true
mean (mu) & the comparison value (m0) is equal to zero.
 The two-tailed alternative hypothesis (H1) assumes that the difference
between the true mean (mu) & comparison value (m0) is not equal to
zero.
 The upper-tailed alternative hypothesis (H1) assumes the true mean
(mu) of sample is greater than comparison value (m0).
 The lower-tailed alternative hypothesis (H1) assumes the true mean
(mu) of the sample is less than the comparison value (m0).
 Mathematically this is presented as follows:
 H0: μ = m0
 H1: μ ≠ m0 (two-tailed)
 H1: μ > m0 (upper-tailed)
 H1: μ < m0 (lower-tailed)
 Note. It is important to remember that hypotheses are never about
data, they are about the processes which produce the data.
 If you are interested in knowing whether the mean pH of a sample of
soap tablets is equal to 7.5, the real question being asked is whether
the process that produced those soap tablets has a mean pH of 7.5.
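To make the procedure concrete, here is a minimal Python sketch of the soap-tablet test. The pH readings are hypothetical values invented purely for illustration; this is not part of the original course materials.

```python
import math
import statistics

# Hypothetical pH readings from the production line (illustrative only)
sample = [7.4, 7.6, 7.5, 7.3, 7.7, 7.5, 7.4, 7.6]
m0 = 7.5  # comparison value under H0

xbar = statistics.mean(sample)       # sample mean
s = statistics.stdev(sample)         # sample SD (n - 1 in the denominator)
se = s / math.sqrt(len(sample))      # estimated standard error of the mean
t = (xbar - m0) / se                 # compare to t with n - 1 = 7 df
```

With these invented readings the sample mean is essentially 7.5, so t is approximately 0 and H0 would not be rejected at any conventional alpha.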
ASSUMPTIONS
 As a parametric procedure (a procedure which estimates unknown
parameters), the one sample t-test makes several assumptions.
Although t-tests are quite robust, it is good practice to evaluate the
degree of deviation from these assumptions in order to assess the
quality of the results. The one sample t-test has four main
assumptions:
 The dependent variable must be continuous (interval/ratio).
 The observations are independent of one another.
 The dependent variable should be approximately normally distributed.
 The dependent variable should not contain any outliers.
EXAMPLE
 Given a sample size of 200 with a sample mean of 11 and a sample
standard deviation of 5.
 Use the z test to examine whether there is a significant difference between
the sample mean (11) and a standard of 10 at the 95% confidence level.
 Note: at 95% CI the critical z = 1.96. (Study the use of the z-score in the
literature.)
• Large samples (n > 100), normally distributed: may use the z or t-test.
• Small samples (n = 30 to 99), normally distributed: use the t-test.
• For details refer to handouts.
SINGLE-SAMPLE Z TEST
 For large samples (N > 100) we can use z to test hypotheses about means.
 Suppose
z_M = (X̄ − μ) / est.σ_M,  where s_X = √[Σ(X − X̄)² / (N − 1)]  and  est.σ_M = s_X / √N
 Then, given
H0: μ = 10; H1: μ ≠ 10; s_X = 5; N = 200
est.σ_M = s_X / √N = 5 / √200 = 5 / 14.14 = .35
 If X̄ = 11, then z = (11 − 10) / .35 = 2.83; 2.83 > 1.96, so p < .05. Significant difference.
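The same arithmetic can be checked with a few lines of Python (a sketch, not part of the original slides):

```python
import math

# Single-sample z test from the slide: H0: mu = 10, H1: mu != 10
s_x, n, xbar, mu0 = 5, 200, 11, 10

se = s_x / math.sqrt(n)       # estimated standard error: 5 / 14.14 ~ .35
z = (xbar - mu0) / se         # ~ 2.83
significant = abs(z) > 1.96   # True: reject H0 at alpha = .05
```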
THE T DISTRIBUTION
We use t when the population variance is unknown (the usual case)
and sample size is small (N<100, the usual case). If you use a stat
package for testing hypotheses about means, you will use t.

The t distribution is a short, fat relative of the normal. The shape of t depends on its df. As N
becomes infinitely large, t becomes normal.
DEGREES OF FREEDOM
For the t distribution, degrees of freedom are always a
simple function of the sample size, e.g., (n-1).

One way of explaining df is that if we know the total or
mean, and all but one of the n scores, the last score is not free to
vary; it is fixed by the other scores: 4 + 3 + 2 + X = 10, so X = 1.
EXAMPLE
 We look at the same example but with n = 25, s = 5.
 Given X̄ = 11, we wish to test H0: μ = 10 vs. H1: μ ≠ 10 at the 95%
confidence level.
 To construct the rejection region
 What will the df and α be?
 Check the critical value from your t-tables now.
 Determine the answer.
SINGLE-SAMPLE T - TEST
With a small sample size, we compute the same numbers
as we did for z, but we compare them to the t distribution
instead of the z distribution.
H0: μ = 10; H1: μ ≠ 10; s_X = 5; N = 25
est.σ_M = s_X / √N = 5 / √25 = 1
X̄ = 11, so t = (11 − 10) / 1 = 1
t(.05, 24) = 2.064 (cf. z = 1.96); 1 < 2.064, n.s.
Interval = X̄ ± t · est.σ_M = 11 ± 2.064(1) = [8.936, 13.064]
Interval is about 9 to 13 and contains 10, so n.s.
Rejection region: X̄ < 8.936 or X̄ > 13.064.
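A short Python sketch of the same computation (illustrative, using the slide's numbers):

```python
import math

# Single-sample t test from the slide: H0: mu = 10, N = 25, s = 5
s_x, n, xbar, mu0 = 5, 25, 11, 10
t_crit = 2.064  # t(.05, 24), from the t table

se = s_x / math.sqrt(n)                           # 5 / 5 = 1
t = (xbar - mu0) / se                             # 1; 1 < 2.064, so n.s.
lo, hi = xbar - t_crit * se, xbar + t_crit * se   # (8.936, 13.064)
contains_mu0 = lo < mu0 < hi                      # True: fail to reject H0
```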
REVIEW
How are the distributions of z and t related?
 Given that
H0: μ = 75; H1: μ ≠ 75; s_y = 14; N = 49; t(.05, 48) = 2.01
construct a rejection region. Draw a picture to illustrate.


DIFFERENCE BETWEEN MEANS (1)
 Most studies have at least 2 groups (e.g., M vs. F, Exp vs. Control).
 If we want to know the difference in population means, the best guess is the
difference in sample means.
 Unbiased: E(ȳ1 − ȳ2) = E(ȳ1) − E(ȳ2) = μ1 − μ2
 Variance of the difference: var(ȳ1 − ȳ2) = σ²_M1 + σ²_M2
 Standard error: σ_diff = √(σ²_M1 + σ²_M2)
DIFFERENCE BETWEEN MEANS (2)
 We can estimate the standard error of the difference
between means.
est.σ_diff = √(est.σ²_M1 + est.σ²_M2)
 For large samples, we can use z:
z_diff = [(X̄1 − X̄2) − (μ1 − μ2)] / est.σ_diff
H0: μ1 − μ2 = 0; H1: μ1 − μ2 ≠ 0
X̄1 = 10; N1 = 100; SD1 = 2
X̄2 = 12; N2 = 100; SD2 = 3
est.σ_diff = √(4/100 + 9/100) = √(13/100) = .36
z_diff = [(10 − 12) − 0] / .36 = −2 / .36 = −5.56; p < .05
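The large-sample z for a difference between means can be sketched in Python (numbers from the slide):

```python
import math

# Large-sample z for a difference between means
x1, n1, sd1 = 10, 100, 2
x2, n2, sd2 = 12, 100, 3

se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # sqrt(13/100) ~ .36
z = ((x1 - x2) - 0) / se_diff                   # ~ -5.55; |z| > 1.96 -> p < .05
```

The slide's −5.56 comes from dividing by the rounded standard error of .36.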
INDEPENDENT SAMPLES T (1)
 Looks just like z:
t_diff = [(ȳ1 − ȳ2) − (μ1 − μ2)] / est.σ_diff
 df = N1 − 1 + N2 − 1 = N1 + N2 − 2
 If the SDs are equal, the estimate is:
σ²_diff = σ²/N1 + σ²/N2 = σ²(1/N1 + 1/N2)
 The pooled variance estimate is a weighted average:
σ̂² = [(N1 − 1)s1² + (N2 − 1)s2²] / (N1 + N2 − 2)
 Pooled standard error of the difference (computed):
est.σ_diff = √{[(N1 − 1)s1² + (N2 − 1)s2²] / (N1 + N2 − 2) × (N1 + N2) / (N1·N2)}
INDEPENDENT SAMPLES T (2)
est.σ_diff = √{[(N1 − 1)s1² + (N2 − 1)s2²] / (N1 + N2 − 2) × (N1 + N2) / (N1·N2)}
H0: μ1 − μ2 = 0; H1: μ1 − μ2 ≠ 0
t_diff = [(ȳ1 − ȳ2) − (μ1 − μ2)] / est.σ_diff
ȳ1 = 18; s1² = 7; N1 = 5
ȳ2 = 20; s2² = 5.83; N2 = 7
est.σ_diff = √{[4(7) + 6(5.83)] / (5 + 7 − 2) × (12/35)} = 1.47
t_diff = [(18 − 20) − 0] / 1.47 = −2 / 1.47 = −1.36; n.s.
t_crit = t(.05, 10) = 2.23
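A Python sketch of the pooled-variance computation with the slide's numbers:

```python
import math

# Independent-samples t with the slide's numbers
y1, s1_sq, n1 = 18, 7.0, 5
y2, s2_sq, n2 = 20, 5.83, 7

df = n1 + n2 - 2                                          # 10
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df   # weighted average
se_diff = math.sqrt(pooled_var * (n1 + n2) / (n1 * n2))   # ~ 1.47
t = ((y1 - y2) - 0) / se_diff                             # ~ -1.36; |t| < 2.23 -> n.s.
```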
REVIEW
What is the standard error of the difference between
means? What are the factors that influence its size?
Describe a design (what IV? What DV?) where it
makes sense to use the independent samples t test.
DEPENDENT T (1)
Observations come in pairs: brother-sister, repeated measures.
σ²_diff = σ²_M1 + σ²_M2 − 2cov(ȳ1, ȳ2)
The problem is solved by finding the differences between pairs, Di = yi1 − yi2:
D̄ = ΣDi / N
s_D = √[Σ(Di − D̄)² / (N − 1)]
est.σ_MD = s_D / √N
t = [D̄ − E(D̄)] / est.σ_MD,  df = N(pairs) − 1
DEPENDENT T (2)
Brother   Sister   Diff   (D − D̄)²
5         7        2      1
7         8        1      0
3         3        0      1
ȳ = 5     ȳ = 6    D̄ = 1
s_D = √[Σ(D − D̄)² / (N − 1)] = √(2/2) = 1
est.σ_MD = 1 / √3 = .58
t = [D̄ − E(D̄)] / est.σ_MD = 1 / .58 = 1.72
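The paired computation can be sketched in Python with the brother/sister data from the table:

```python
import math
import statistics

# Paired (dependent) t for the brother/sister slide data
brothers = [5, 7, 3]
sisters = [7, 8, 3]

diffs = [s - b for b, s in zip(brothers, sisters)]  # [2, 1, 0]
dbar = statistics.mean(diffs)                       # 1
s_d = statistics.stdev(diffs)                       # 1
se = s_d / math.sqrt(len(diffs))                    # 1/sqrt(3) ~ .58
t = (dbar - 0) / se                                 # ~ 1.73, df = 2
```

The slide's 1.72 reflects dividing by the rounded standard error of .58.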
ASSUMPTIONS
 The t-test is based on assumptions of normality and
homogeneity of variance.
 You can test for both these (make sure you learn the SAS
methods).
 As long as the samples in each group are large and nearly
equal, the t-test is robust, that is, still good, even though the
assumptions are not met.
REVIEW
Describe a design where it makes sense to use a
single-sample t.
Describe a design where it makes sense to use a
dependent samples t.
STRENGTH OF ASSOCIATION (1)
 The scientific purpose is to predict or explain variation.
 Our variable Y has some variance that we would like to account for. There are statistical
indexes of how well our IV accounts for variance in the DV. These are measures of how
strongly or closely associated our IVs and DVs are.
 Variance accounted for (two equal-size groups):
(σ²_Y − σ²_Y|X) / σ²_Y = (μ1 − μ2)² / (4σ²_Y)
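The variance-accounted-for formula above can be checked numerically; the group means and variance below are hypothetical values chosen only for illustration:

```python
# Proportion of variance in Y associated with group membership,
# for two equal-size groups (hypothetical numbers for illustration)
mu1, mu2, var_y = 12.0, 10.0, 25.0

prop_var = (mu1 - mu2) ** 2 / (4 * var_y)  # (2^2) / 100 = .04
```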
STRENGTH OF ASSOCIATION (2)
 How much of the variance in Y is associated with the IV?
(σ²_Y − σ²_Y|X) / σ²_Y = (μ1 − μ2)² / (4σ²_Y)
Compare the 1st (left-most) curve with the curve in the middle and the
one on the right. In each case, how much of the variance in Y is
associated with the IV, group membership? More in the second
comparison. As the mean difference gets big, so does the variance
accounted for.
[Figure: overlapping normal curves; the farther apart the group means, the more variance is associated with group membership.]

ASSOCIATION & SIGNIFICANCE
 Power increases with association (effect size) and sample size.
 Effect size: d = (X̄1 − X̄2) / σ_p
 Significance = effect size × sample size:
t = (X̄ − μ) / (s / √N)  (single sample)
t = (X̄1 − X̄2) / √[σ̂²_p (1/N1 + 1/N2)]  (independent samples)
so that, roughly, t = d√N.
Increasing sample size does not increase effect size
(strength of association). It decreases the standard
error so power is greater, |t| is larger.
ESTIMATING POWER (1)
 If the null is false, the statistic is no longer distributed as t, but rather as noncentral t. This
makes power computation difficult.
 Howell introduces the noncentrality parameter delta to use for estimating power. For the one-
sample t,

δ = d√n
Recall the relations between t and d on the previous slide.
ESTIMATING POWER (2)
 Suppose (Howell, p. 231) that we have 25 people, a sample mean of 105, and a hypothesized
mean and SD of 100 and 15, respectively. Then

d = (105 − 100) / 15 = 1/3 = .33
δ = d√n = .33√25 = 1.65, so power = .38.
Howell presents an appendix where delta is related to power. For power = .8,
alpha = .05, delta must be 2.80. To solve for N, we compute:
δ = d√n, so n = (δ/d)² = (2.8/.33)² = 8.48² = 71.91
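The delta arithmetic for this one-sample example can be sketched in Python (values from the slide; the rounding follows the slide's .33):

```python
import math

# Howell-style power estimate for the one-sample t (slide's numbers)
d = round((105 - 100) / 15, 2)   # effect size, rounded to .33 as on the slide
n = 25

delta = d * math.sqrt(n)         # .33 * 5 = 1.65 -> power ~ .38 (Howell's table)
n_needed = (2.80 / d) ** 2       # delta = 2.80 for power .8 -> ~ 72
```

The slide's 71.91 squares the pre-rounded ratio 8.48; carrying full precision gives roughly 72 either way.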
ESTIMATING POWER (3)
 Dependent t can be cast as a single sample t using difference scores.
 Independent t. To use Howell’s method, the result is n per group, so double it. Suppose d = .5
(medium effect) and n =25 per group.

δ = d√(n/2) = .50√(25/2) = .5√12.5 = 1.77
From Howell's appendix, a delta of 1.77 with alpha = .05 results in power of .43.
For a power of .8, we need delta = 2.80:
n = 2(δ/d)² = 2(2.8/.5)² = 62.72
Need 63 per group.
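The two-sample version of the same sketch in Python (d = .5, n = 25 per group, as on the slide):

```python
import math

# Two-sample power sketch: d = .5 (medium effect), n = 25 per group
d, n = 0.5, 25

delta = d * math.sqrt(n / 2)         # ~ 1.77 -> power ~ .43 from Howell's table
n_per_group = 2 * (2.80 / d) ** 2    # ~ 62.72 -> need 63 per group
```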
SAS PROC POWER – SINGLE SAMPLE EXAMPLE
proc power;
  onesamplemeans test=t
  nullmean = 100
  mean = 105
  stddev = 15
  power = .8
  ntotal = . ;
run;

The POWER Procedure
One-Sample t Test for Mean
Fixed Scenario Elements
Distribution          Normal
Method                Exact
Null Mean             100
Mean                  105
Standard Deviation    15
Nominal Power         0.8
Number of Sides       2
Alpha                 0.05

Computed N Total
Actual Power    N Total
0.802           73

2 SAMPLE T POWER
Calculate sample size:
proc power;
  twosamplemeans
  meandiff = .5
  stddev = 1
  power = 0.8
  ntotal = . ;
run;

Two-Sample t Test for Mean Difference
Fixed Scenario Elements
Distribution          Normal
Method                Exact
Mean Difference       0.5
Standard Deviation    1
Nominal Power         0.8
Number of Sides       2
Null Difference       0
Alpha                 0.05
Group 1 Weight        1
Group 2 Weight        1

Computed N Total
Actual Power    N Total
0.801           128
2 SAMPLE T POWER
proc power;
  twosamplemeans
  meandiff = 5   [assumed difference]
  stddev = 10    [assumed SD]
  sides = 1      [1 tail]
  ntotal = 50    [25 per group]
  power = . ;    *[tell me!];
run;

The POWER Procedure
Two-Sample t Test for Mean Difference
Fixed Scenario Elements
Distribution          Normal
Method                Exact
Number of Sides       1
Mean Difference       5
Standard Deviation    10
Total Sample Size     50
Null Difference       0
Alpha                 0.05
Group 1 Weight        1
Group 2 Weight        1

Computed Power
Power
0.539
TYPICAL POWER IN PSYCH
 Average effect size is about d=.40.
 Consider power for effect sizes between .3 and .6. What kind of sample size do we need for
power of .8?

proc power;
  twosamplemeans
  meandiff = .3 to .6 by .1
  stddev = 1
  power = .8
  ntotal = . ;
  plot x = power min = .5 max = .95;
run;

Two-Sample t Test
Computed N Total
Index   Mean Diff   Actual Power   N Total
1       0.3         0.801          352
2       0.4         0.804          200
3       0.5         0.801          128
4       0.6         0.804          90
Typical studies are underpowered.
