PSY 240: Statistics in Psychology: One Sample Statistics: Calculating Significance For Paired Means

The sampling distribution of the difference between means is a theoretical probability distribution that describes all possible mean differences that could be obtained by randomly sampling from the population. It has a normal shape due to the central limit theorem, which states that as sample sizes increase, the sampling distribution of means will approximate a normal distribution regardless of the shape of the population distribution. The mean of the sampling distribution equals the population mean difference, and its standard deviation is equal to the standard error of the difference between means.


PSY 240: Statistics in Psychology

One Sample Statistics: Calculating Significance for Paired Means
Exam Review

 Exam 2 statistics:
 M = 75.90, SD = 15.72, Skew = -.43 (moderate)
 Scores were lower than Exam 1 but had more variability – basically, there were fewer B’s

Paired Samples Test

                         Paired Differences
                                                                95% CI of the Difference
                         Mean      Std. Deviation  Std. Error Mean  Lower     Upper      t      df   Sig. (2-tailed)
Pair 1  Exam1 - Exam2    7.93103   12.76415        2.37024          3.07581   12.78626   3.346  28   .002
The Possibility of Being Wrong

 Remember that when a finding has a very low probability (lower than alpha) we reject the null
 But since this is based on probability, it's possible that we could be wrong
 With an alpha = .05, our outcome could have occurred simply by chance 5% of the time
 Thus, we will be wrong 5 times out of 100
Possibilities and Probabilities

                          Actual Situation (Often Unknown)
Researcher’s Decision     No True Difference      True Difference
Not Significant           Correct Decision        Type II Error
                          (p = 1 – α)             (p = 1 – power)
Significant               Type I Error            Correct Decision
                          (p = α)                 (p = power)
The Possibility of Being Right

 Just as we are wrong 5% of the time for Type I errors (rejecting the null when we should not), there is some incidence of error when we should reject the null but do not
 This is called Type II error, and is dependent on several things
 Sample Size (N)
 α (usually .05 or .01)
 Effect size (e.g., Cohen’s d)
The Possibility of Being Right

 Power is the probability of rejecting the null hypothesis given that we should
 In other words, if there is a difference in the population, what is the chance that we will find it in our study?
 These things together contribute to “power”:
 Sample Size (N): Larger N means more power
 α (either .05 or .01): Larger α means more power
 Effect size (Cohen’s d): Larger effect means more power
 We like power and want the probability to be high
 High power = Greater chance of getting significant result
 Low power = The study might be a waste of time
The Concept of Power

 We want Statistical Power
 We want to find significance when true differences exist (otherwise why run the study?)
 Specifically, our goal is to correctly identify treatment effects
 Often, we need to balance power against the likelihood of making an error
 This can be calculated or estimated
 Psychological studies tend to have low power
Factors Influencing Power
Maximizing Chances for Significance

 The primary ways of increasing power:
 Larger mean differences
 Decreasing variability within groups
 Increasing sample size
 Can you see how these affect the formula for the significance test?
Maximizing Chances for Significance: Effect size

 With a larger difference between means, we will have a larger number in the numerator of our test statistic
 It is easier to see the difference between two distributions if they overlap less
Maximizing Chances for Significance: N

 N affects the SE by increasing our confidence that our sample is like the population (i.e., the law of large numbers)
 Think of the SE formula: SD/√N
 As N increases, the SE decreases
 Since the SE is the denominator of t and z, our test statistic will get larger with more people
 N has NOTHING to do with effect size – only our judgment of statistical significance
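The SD/√N formula above can be checked directly; a minimal sketch in plain Python (illustrative values only):

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD / sqrt(N)."""
    return sd / math.sqrt(n)

# Same SD, growing N: the SE (the denominator of t and z) shrinks,
# so the test statistic grows even though the effect size is unchanged.
for n in (4, 16, 64):
    print(n, standard_error(10.0, n))  # 5.0, then 2.5, then 1.25
```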
Maximizing Chances for Significance: Error variance

 Our SD really represents variability in scores that we cannot understand
 We know that scores vary, but why?
 Sometimes this is due to group differences, but not everyone within a group is the same
 Thus, the SD is “statistical error” and is why we account for it in the SE
 A large SD will increase our SE, making our test statistic smaller
Making Errors

 Type I Error: Finding statistical significance when there is no true difference
 Also known as a False Positive
 The frequency of making this error = α
 Type II Error: Failing to find statistical significance when there is a true difference
 Also known as a False Negative
Using Power Tables

 Power tables provide the probabilities of finding significance (i.e., power) for various combinations of effect size and sample size
 For the example: With d = 1.0 and n = 3, Power = .157 (for a 2 sample test)
 This means that we only gave ourselves about a 16% chance of even finding statistical significance with this effect size!
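Tabled power values like the .157 above can be reproduced with the noncentral t distribution. A sketch using scipy (not the course's tool; the noncentrality formula d·√(n/2) is the usual convention for a two-sample test with equal group sizes):

```python
import math
from scipy import stats

def two_sample_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, equal-n independent samples t test."""
    df = 2 * (n_per_group - 1)
    ncp = d * math.sqrt(n_per_group / 2)      # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-tailed critical value
    # Probability that |t| exceeds the critical value under the alternative
    return (1 - stats.nct.cdf(t_crit, df, ncp)) + stats.nct.cdf(-t_crit, df, ncp)

print(round(two_sample_power(1.0, 3), 3))  # close to the tabled .157
```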
Effect and Sample Sizes

 The more people we have in our sample, the less sampling error we have and the more accurate our parameter estimates will be
 Large effect sizes are relatively easy to detect (meaning we don't need big sample sizes to declare statistical significance)
 Small effect sizes are difficult to detect (meaning we need large sample sizes to declare statistical significance)
Estimating Power

 If N = 3 for a dependent samples design, what effect size would be needed to achieve power = .80?
 If d = 1.0, what sample size would be needed in order to achieve power = .80?
 What is the power associated with a medium effect size and n = 50 for both a dependent and independent samples design?
Moral of the story…

 Within subjects designs are always more powerful than between-subjects designs
 So, when possible, have your subjects complete as many measures as possible
 However, consider the consequences of long experiments, such as participant fatigue, and perhaps even frustration!
Repeated/Paired Measures

 When we have data on two similar measurements for each individual, we can ask two different questions
 Is there a mean difference between the measures (and is it significant)?
 Is there a correlation between the measures (and is it significant)?
 Both questions can be answered by using a variation of the t test procedure
Example Data (with Scatterplot)
Testing Mean Differences

 For each person, create a difference score (a number that reflects the difference in measurement between the two time points)
 Run a one-sample t test (conceptually speaking) on the difference scores
Computing the Difference Scores

Participant   Q1     Q2     Q1 – Q2
1             5      7      -2
2             6      7      -1
3             6      8      -2
4             7      8      -1
5             8      9      -1
M             6.4    7.8    -1.4
SD            1.14   .837   .548
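The table above can be reproduced in a few lines; a sketch in plain Python (stdlib only, using the five Q1/Q2 scores from the table):

```python
import statistics

q1 = [5, 6, 6, 7, 8]
q2 = [7, 7, 8, 8, 9]

# Difference score for each participant: Q1 - Q2
diffs = [a - b for a, b in zip(q1, q2)]
print(diffs)                             # [-2, -1, -2, -1, -1]
print(statistics.mean(diffs))            # -1.4
print(round(statistics.stdev(diffs), 3)) # 0.548
```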
Computing the paired-samples t test

 The difference scores had M = -1.4, SD = .548, and N = 5
 So…

t(N − 1) = (M_Difference − µ_Difference) / (SD_Difference / √N)
Calculation of the paired t

t(4) = (−1.4 − 0) / (0.548 / √5) = −1.4 / .245 = −5.71

For an alpha (α) of .05: CVt = ±2.776
The approximate p value is less than .01 because the magnitude of our t statistic falls between the critical values for .01 and .001
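The hand calculation can be cross-checked with scipy's paired-samples test (a sketch, not the course's required method; it uses the same five Q1/Q2 scores from the table):

```python
from scipy import stats

q1 = [5, 6, 6, 7, 8]
q2 = [7, 7, 8, 8, 9]

# Paired-samples t test on the Q1 - Q2 difference scores
t, p = stats.ttest_rel(q1, q2)
print(round(t, 2), round(p, 4))  # t(4) ≈ -5.72, with .001 < p < .01
```

The tiny discrepancy from the slide's −5.71 comes only from rounding SD to .548 in the hand calculation.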
Cohen’s d for paired samples

 Conceptually, there is nothing new about calculating effect size:

d = M_Difference / SD_Difference

d = −1.4 / .548 = −2.55

 This is a very large effect size (|d| = 2.55), which would make sense because we had a significant result with a very small sample
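A quick check of the effect size calculation in plain Python (the sign of d just reflects the Q1 − Q2 direction):

```python
import statistics

diffs = [-2, -1, -2, -1, -1]  # Q1 - Q2 difference scores from the table above

# Cohen's d for paired samples: mean difference over SD of the differences
d = statistics.mean(diffs) / statistics.stdev(diffs)
print(round(d, 2))  # about -2.56; |d| matches the slide's 2.55 up to rounding
```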
Steps in Creating the CI for the Mean Difference (Raw Effect)

1. Establish the center of the interval.
2. Obtain the t score appropriate for the level of confidence.
3. Estimate the standard error.
4. Calculate the confidence limits in raw score values by using the following equation:

CI_D = M_D ± (CV_t)(SE_D)
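The four steps can be sketched for the paired example (M_D = −1.4, SD_D = .548, N = 5); scipy supplies the critical t:

```python
import math
import statistics
from scipy import stats

diffs = [-2, -1, -2, -1, -1]  # difference scores from the example
n = len(diffs)

center = statistics.mean(diffs)                        # 1. center of the interval
cv_t = stats.t.ppf(0.975, n - 1)                       # 2. critical t for 95% confidence
se = statistics.stdev(diffs) / math.sqrt(n)            # 3. standard error of the mean difference
lower, upper = center - cv_t * se, center + cv_t * se  # 4. confidence limits
print(round(lower, 2), round(upper, 2))  # -2.08 -0.72; zero is outside, so p < .05
```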
The Significance Test and Confidence Interval
PSY 240: Statistics in Psychology

Two Sample Statistics: Understanding the Sampling Distribution
Our Basic Sampling Distribution

 The Sampling Distribution of the Difference between Means
 Theoretical probability distribution of all possible mean differences for samples of a given size from the population
 Characterized by a mean and the standard error of the mean
 But what is the shape of this sampling distribution? Is it reasonable to assume that it is normal?
The Sampling Distribution

 The basic characteristics of the sampling distribution of each mean difference are essentially the same: there is some location and some spread (SE)
 How is a distribution of mean differences created?
 How does the average mean difference relate to the population mean difference?
 The standard deviation of mean differences is NOT the same as the standard deviation of scores or the same as the standard error for either mean
Estimating the Two Population Means

 The central limit theorem suggested that in the one sample case, our sample mean was the best estimate of the population mean
 In the two sample case, the difference between our sample means is the best estimate of the difference between the two population means
 Thus, the difference between the means is the location (mean) of our sampling distribution
 If we assume the null hypothesis is true, then the center will be equal to 0, and if standardized, have a SE = 1
Standard Error of the Difference between Means

 But, in order to standardize, we need the raw SE
 The standard deviation of the sampling distribution of mean differences is a function of the standard errors of each mean
 If sample sizes are equal:

SE_Difference = √(SE²_M1 + SE²_M2)

 If sample sizes are unequal, the process is more complex
Standard Error of the Difference between Means

 If sample sizes are unequal, it is necessary to pool our estimates of population variance first:

MS_WITHIN = SS_WITHIN / df_WITHIN = (SS1 + SS2) / (df1 + df2)

 Then the standard error of the difference is:

SE_Difference = √(MS_WITHIN/n1 + MS_WITHIN/n2)
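The pooling steps above can be sketched as one function (a sketch assuming the SDs and n's are known; with both groups at SD = 2 and n = 3 it reproduces the SE used in the worked example later):

```python
import math

def pooled_se_difference(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Standard error of the difference between two independent means."""
    ss1 = (n1 - 1) * sd1 ** 2                        # SS = (n - 1) * variance
    ss2 = (n2 - 1) * sd2 ** 2
    ms_within = (ss1 + ss2) / ((n1 - 1) + (n2 - 1))  # pooled variance estimate
    return math.sqrt(ms_within / n1 + ms_within / n2)

print(round(pooled_se_difference(2.0, 3, 2.0, 3), 3))  # 1.633
```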
PSY 240: Statistics in Psychology

Two Sample Statistics: Calculating Two Sample Statistics
Independent Groups t Test

 Since the population variance is estimated, we use a t formula:

t(N − 2) = (M1 − M2) / SE_Difference

 Note the similarity to the one sample t formula
Steps in Creating the CI for the Mean Difference (Raw Effect)

1. Establish the center of the interval.
2. Obtain the t score appropriate for the level of confidence.
3. Estimate the standard error.
4. Calculate the confidence limits in raw score values by using the following equation:

CI_D = (M1 − M2) ± (CV_t)(SE_Diff)
Standardized Effect Size

 Cohen’s d Statistic:

d = (µ1 − µ2) / σ ≅ (M1 − M2) / √MS_WITHIN

 The denominator is like a “pooled” or average SD
 Important Points
 Interpreted similar to a standardized score
 Rule of thumb interpretations for size (.2 = small, .5 = medium, .8 = large)
A Two Group Example

Suppose we compare a control group to an experimental group.
 Using an α = .05, are the two groups significantly different?
 What is the confidence interval for the mean difference?
 What is the standardized effect size?
Working the Example

Group Statistics

        Group          N   Mean     Std. Deviation   Std. Error Mean
Score   Control        3   4.0000   2.00000          1.15470
        Experimental   3   6.0000   2.00000          1.15470
Working the Example

Independent Samples Test

t-test for Equality of Means

                                        Mean         Std. Error   95% CI of the Difference
        t       df   Sig. (2-tailed)   Difference   Difference   Lower      Upper
Score   -1.225  4    .288              -2.0000      1.63299      -6.53392   2.53392
Steps for hand calculation

SE_D = √(1.1547² + 1.1547²) = √(1.333 + 1.333) = √2.667 = 1.633

t(N − 2) = (M1 − M2) / SE_D

t(6 − 2) = (4 − 6) / 1.633 = −2 / 1.633 = −1.225
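scipy can reproduce the SPSS output from the summary statistics alone (a cross-check, assuming equal variances as in the hand calculation above):

```python
from scipy import stats

# Control: M = 4, SD = 2, n = 3; Experimental: M = 6, SD = 2, n = 3
t, p = stats.ttest_ind_from_stats(4.0, 2.0, 3, 6.0, 2.0, 3, equal_var=True)
print(round(t, 3), round(p, 3))  # matches the SPSS table: t = -1.225, Sig. = .288
```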
APA Method of Presentation

An independent samples t test indicated that the difference between the control group (M = 4.00, SD = 2.00) and the experimental group (M = 6.00, SD = 2.00) was not statistically significant, t(4) = -1.225, p = .288, d = 1.0. The 95% confidence interval for the difference between means ranged from -6.534 to 2.534.
