0% found this document useful (0 votes)

129 views19 pages

Some Basic Null Hypothesis Tests

The document discusses several common null hypothesis testing procedures, focusing on the one-sample t-test, dependent t-test, and independent t-test. It provides details on how to conduct and interpret a one-sample t-test to compare a sample mean to a hypothetical population mean. An example is given of a one-sample t-test comparing students' estimates of calories in a cookie to the actual calorie amount. The test results in rejection of the null hypothesis that the sample mean equals the population mean.

Uploaded by

Sudhir Jain

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

129 views19 pages

Some Basic Null Hypothesis Tests

Uploaded by

Sudhir Jain

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 19

Some Basic Null Hypothesis Tests

Learning Objectives

1. Conduct and interpret one-sample, dependent-samples, and independent-

samples t tests.
2. Interpret the results of one-way, repeated measures, and factorial ANOVAs.
3. Conduct and interpret null hypothesis tests of Pearson’s r.

In this section, we look at several common null hypothesis testing procedures. The
emphasis here is on providing enough information to allow you to conduct and interpret
the most basic versions. In most cases, the online statistical analysis tools mentioned
in Chapter 12 will handle the computations—as will programs such as Microsoft Excel
and SPSS.

The t Test

As we have seen throughout this book, many studies in psychology focus on the
difference between two means. The most common null hypothesis test for this type of
statistical relationship is the t test. In this section, we look at three types of t tests that
are used for slightly different research designs: the one-sample t test, the dependent-
samples t test, and the independent-samples t test.

One-Sample t Test

The one-sample t test is used to compare a sample mean (M) with a hypothetical

population mean (μ0) that provides some interesting standard of comparison. The null
hypothesis is that the mean for the population (µ) is equal to the hypothetical population
mean: μ = μ0. The alternative hypothesis is that the mean for the population is different
from the hypothetical population mean: μ ≠ μ0. To decide between these two
hypotheses, we need to find the probability of obtaining the sample mean (or one more
extreme) if the null hypothesis were true. But finding this p value requires first
computing a test statistic called t. (A test statistic is a statistic that is computed only to
help find the p value.) The formula for t is as follows:

Again, M is the sample mean and µ0 is the hypothetical population mean of
interest. SD is the sample standard deviation and N is the sample size.
The reason the t statistic (or any test statistic) is useful is that we know how it is
distributed when the null hypothesis is true. As shown in Figure 13.1, this distribution is
unimodal and symmetrical, and it has a mean of 0. Its precise shape depends on a
statistical concept called the degrees of freedom, which for a one-sample t test is N − 1.
(There are 24 degrees of freedom for the distribution shown in Figure 13.1.) The
important point is that knowing this distribution makes it possible to find the p value for
any t score. Consider, for example, a t score of +1.50 based on a sample of 25. The
probability of a t score at least this extreme is given by the proportion of t scores in the
distribution that are at least this extreme. For now, let us define extreme as being far
from zero in either direction. Thus the p value is the proportion of t scores that are +1.50
or above or that are −1.50 or below—a value that turns out to be .14.

Figure 13.1 Distribution of t Scores (With 24 Degrees of Freedom) When the Null
Hypothesis Is True. The red vertical lines represent the two-tailed critical values, and
the green vertical lines the one-tailed critical values when α = .05.

Fortunately, we do not have to deal directly with the distribution of t scores. If we were
to enter our sample data and hypothetical mean of interest into one of the online
statistical tools in Chapter 12 or into a program like SPSS (Excel does not have a one-
sample t test function), the output would include both the t score and the p value. At this
point, the rest of the procedure is simple. If p is less than .05, we reject the null
hypothesis and conclude that the population mean differs from the hypothetical mean of
interest. If p is greater than .05, we retain the null hypothesis and conclude that there is
not enough evidence to say that the population mean differs from the hypothetical mean
of interest. (Again, technically, we conclude only that we do not have enough evidence
to conclude that it does differ.)

If we were to compute the t score by hand, we could use a table like Table 13.2 to make
the decision. This table does not provide actual p values. Instead, it provides
the critical values of t for different degrees of freedom (df) when α is .05. For now, let
us focus on the two-tailed critical values in the last column of the table. Each of these
values should be interpreted as a pair of values: one positive and one negative. For
example, the two-tailed critical values when there are 24 degrees of freedom are +2.064
and −2.064. These are represented by the red vertical lines in Figure 13.1. The idea is
that any t score below the lower critical value (the left-hand red line in Figure 13.1) is in
the lowest 2.5% of the distribution, while any t score above the upper critical value (the
right-hand red line) is in the highest 2.5% of the distribution. Therefore any t score
beyond the critical value in either direction is in the most extreme 5% of t scores when
the null hypothesis is true and has a p value less than .05. Thus if the t score we
compute is beyond the critical value in either direction, then we reject the null
hypothesis. If the t score we compute is between the upper and lower critical values,
then we retain the null hypothesis.

Table 13.2 Table of Critical Values of t When α = .05

One-tailed critical
df Two-tailed critical value
value

3 2.353 3.182

4 2.132 2.776

5 2.015 2.571

6 1.943 2.447

7 1.895 2.365

8 1.860 2.306
9 1.833 2.262

10 1.812 2.228

11 1.796 2.201

12 1.782 2.179

13 1.771 2.160

14 1.761 2.145

15 1.753 2.131

16 1.746 2.120

17 1.740 2.110

18 1.734 2.101

19 1.729 2.093

20 1.725 2.086

21 1.721 2.080

22 1.717 2.074

23 1.714 2.069
24 1.711 2.064

25 1.708 2.060

30 1.697 2.042

35 1.690 2.030

40 1.684 2.021

45 1.679 2.014

50 1.676 2.009

60 1.671 2.000

70 1.667 1.994

80 1.664 1.990

90 1.662 1.987

100 1.660 1.984

Thus far, we have considered what is called a two-tailed test, where we reject the null
hypothesis if the t score for the sample is extreme in either direction. This test makes
sense when we believe that the sample mean might differ from the hypothetical
population mean but we do not have good reason to expect the difference to go in a
particular direction. But it is also possible to do a one-tailed test, where we reject the
null hypothesis only if the t score for the sample is extreme in one direction that we
specify before collecting the data. This test makes sense when we have good reason to
expect the sample mean will differ from the hypothetical population mean in a particular
direction.

Here is how it works. Each one-tailed critical value in Table 13.2 can again be
interpreted as a pair of values: one positive and one negative. A t score below the lower
critical value is in the lowest 5% of the distribution, and a t score above the upper critical
value is in the highest 5% of the distribution. For 24 degrees of freedom, these values
are −1.711 and +1.711. (These are represented by the green vertical lines in Figure
13.1.) However, for a one-tailed test, we must decide before collecting data whether we
expect the sample mean to be lower than the hypothetical population mean, in which
case we would use only the lower critical value, or we expect the sample mean to be
greater than the hypothetical population mean, in which case we would use only the
upper critical value. Notice that we still reject the null hypothesis when the t score for
our sample is in the most extreme 5% of the t scores we would expect if the null
hypothesis were true—so α remains at .05. We have simply redefined extreme to refer
only to one tail of the distribution. The advantage of the one-tailed test is that critical
values are less extreme. If the sample mean differs from the hypothetical population
mean in the expected direction, then we have a better chance of rejecting the null
hypothesis. The disadvantage is that if the sample mean differs from the hypothetical
population mean in the unexpected direction, then there is no chance at all of rejecting
the null hypothesis.

Example One-Sample t Test

Imagine that a health psychologist is interested in the accuracy of university students’

estimates of the number of calories in a chocolate chip cookie. He shows the cookie to
a sample of 10 students and asks each one to estimate the number of calories in it.
Because the actual number of calories in the cookie is 250, this is the hypothetical
population mean of interest (µ0). The null hypothesis is that the mean estimate for the
population (μ) is 250. Because he has no real sense of whether the students will
underestimate or overestimate the number of calories, he decides to do a two-tailed
test. Now imagine further that the participants’ actual estimates are as follows:

250, 280, 200, 150, 175, 200, 200, 220, 180, 250

The mean estimate for the sample (M) is 212.00 calories and the standard deviation
(SD) is 39.17. The health psychologist can now compute the t score for his sample:

If he enters the data into one of the online analysis tools or uses SPSS, it would also tell
him that the two-tailed p value for this t score (with 10 − 1 = 9 degrees of freedom) is .
013. Because this is less than .05, the health psychologist would reject the null
hypothesis and conclude that university students tend to underestimate the number of
calories in a chocolate chip cookie. If he computes the t score by hand, he could look
at Table 13.2 and see that the critical value of t for a two-tailed test with 9 degrees of
freedom is ±2.262. The fact that his t score was more extreme than this critical value
would tell him that his p value is less than .05 and that he should reject the null
hypothesis.

Finally, if this researcher had gone into this study with good reason to expect that
university students underestimate the number of calories, then he could have done a
one-tailed test instead of a two-tailed test. The only thing this decision would change is
the critical value, which would be −1.833. This slightly less extreme value would make it
a bit easier to reject the null hypothesis. However, if it turned out that university students
overestimate the number of calories—no matter how much they overestimate it—the
researcher would not have been able to reject the null hypothesis.

The Dependent-Samples t Test

The dependent-samples t test (sometimes called the paired-samples t test) is used to

compare two means for the same sample tested at two different times or under two
different conditions. This comparison is appropriate for pretest-posttest designs or
within-subjects experiments. The null hypothesis is that the means at the two times or
under the two conditions are the same in the population. The alternative hypothesis is
that they are not the same. This test can also be one-tailed if the researcher has good
reason to expect the difference goes in a particular direction.

It helps to think of the dependent-samples t test as a special case of the one-

sample t test. However, the first step in the dependent-samples t test is to reduce the
two scores for each participant to a single difference score by taking the difference
between them. At this point, the dependent-samples t test becomes a one-sample t test
on the difference scores. The hypothetical population mean (µ 0) of interest is 0 because
this is what the mean difference score would be if there were no difference on average
between the two times or two conditions. We can now think of the null hypothesis as
being that the mean difference score in the population is 0 (µ 0 = 0) and the alternative
hypothesis as being that the mean difference score in the population is not 0 (µ 0 ≠ 0).

Example Dependent-Samples t Test

Imagine that the health psychologist now knows that people tend to underestimate the
number of calories in junk food and has developed a short training program to improve
their estimates. To test the effectiveness of this program, he conducts a pretest-posttest
study in which 10 participants estimate the number of calories in a chocolate chip
cookie before the training program and then again afterward. Because he expects the
program to increase the participants’ estimates, he decides to do a one-tailed test. Now
imagine further that the pretest estimates are

230, 250, 280, 175, 150, 200, 180, 210, 220, 190
and that the posttest estimates (for the same participants in the same order) are

250, 260, 250, 200, 160, 200, 200, 180, 230, 240

The difference scores, then, are as follows:

+20, +10, −30, +25, +10, 0, +20, −30, +10, +50

Note that it does not matter whether the first set of scores is subtracted from the second
or the second from the first as long as it is done the same way for all participants. In this
example, it makes sense to subtract the pretest estimates from the posttest estimates
so that positive difference scores mean that the estimates went up after the training and
negative difference scores mean the estimates went down.

The mean of the difference scores is 8.50 with a standard deviation of 27.27. The health
psychologist can now compute the t score for his sample as follows:

If he enters the data into one of the online analysis tools or uses Excel or SPSS, it
would tell him that the one-tailed p value for this t score (again with 10 − 1 = 9 degrees
of freedom) is .148. Because this is greater than .05, he would retain the null hypothesis
and conclude that the training program does not increase people’s calorie estimates. If
he were to compute the t score by hand, he could look at Table 13.2 and see that the
critical value of t for a one-tailed test with 9 degrees of freedom is +1.833. (It is positive
this time because he was expecting a positive mean difference score.) The fact that
his t score was less extreme than this critical value would tell him that his p value is
greater than .05 and that he should fail to reject the null hypothesis.

The Independent-Samples t Test

The independent-samples t test is used to compare the means of two separate

samples (M1 and M2). The two samples might have been tested under different
conditions in a between-subjects experiment, or they could be preexisting groups in a
correlational design (e.g., women and men, extraverts and introverts). The null
hypothesis is that the means of the two populations are the same: µ 1 = µ2. The
alternative hypothesis is that they are not the same: µ 1 ≠ µ2. Again, the test can be one-
tailed if the researcher has good reason to expect the difference goes in a particular
direction.

The t statistic here is a bit more complicated because it must take into account two
sample means, two standard deviations, and two sample sizes. The formula is as
follows:

Notice that this formula includes squared standard deviations (the variances) that
appear inside the square root symbol. Also, lowercase n1 and n2 refer to the sample
sizes in the two groups or condition (as opposed to capital N, which generally refers to
the total sample size). The only additional thing to know here is that there are N − 2
degrees of freedom for the independent-samples t test.

Example Independent-Samples t Test

Now the health psychologist wants to compare the calorie estimates of people who
regularly eat junk food with the estimates of people who rarely eat junk food. He
believes the difference could come out in either direction so he decides to conduct a
two-tailed test. He collects data from a sample of eight participants who eat junk food
regularly and seven participants who rarely eat junk food. The data are as follows:

Junk food eaters: 180, 220, 150, 85, 200, 170, 150, 190

Non–junk food eaters: 200, 240, 190, 175, 200, 300, 240

The mean for the junk food eaters is 220.71 with a standard deviation of 41.23. The
mean for the non–junk food eaters is 168.12 with a standard deviation of 42.66. He can
now compute his t score as follows:

If he enters the data into one of the online analysis tools or uses Excel or SPSS, it
would tell him that the two-tailed p value for this t score (with 15 − 2 = 13 degrees of
freedom) is .015. Because this p value is less than .05, the health psychologist would
reject the null hypothesis and conclude that people who eat junk food regularly make
lower calorie estimates than people who eat it rarely. If he were to compute the t score
by hand, he could look at Table 13.2 and see that the critical value of t for a two-tailed
test with 13 degrees of freedom is ±2.160. The fact that his t score was more extreme
than this critical value would tell him that his p value is less than .05 and that he should
fail to retain the null hypothesis.

The Analysis of Variance

When there are more than two groups or condition means to be compared, the most
common null hypothesis test is the analysis of variance (ANOVA). In this section, we
look primarily at the one-way ANOVA, which is used for between-subjects designs with
a single independent variable. We then briefly consider some other versions of the
ANOVA that are used for within-subjects and factorial research designs.

One-Way ANOVA

The one-way ANOVA is used to compare the means of more than two samples
(M1, M2…MG) in a between-subjects design. The null hypothesis is that all the means are
equal in the population: µ1= µ2 =…= µG. The alternative hypothesis is that not all the
means in the population are equal.

The test statistic for the ANOVA is called F. It is a ratio of two estimates of the
population variance based on the sample data. One estimate of the population variance
is called the mean squares between groups (MSB) and is based on the differences
among the sample means. The other is called
the mean squares within groups (MSW) and is based on the differences among the
scores within each group. The F statistic is the ratio of the MSB to the MSW and can
therefore be expressed as follows:

F = MSB ÷ MSW

Again, the reason that F is useful is that we know how it is distributed when the null
hypothesis is true. As shown in Figure 13.2, this distribution is unimodal and positively
skewed with values that cluster around 1. The precise shape of the distribution depends
on both the number of groups and the sample size, and there is a degrees of freedom
value associated with each of these. The between-groups degrees of freedom is the
number of groups minus one: dfB = (G − 1). The within-groups degrees of freedom is the
total sample size minus the number of groups: dfW = N − G. Again, knowing the
distribution of F when the null hypothesis is true allows us to find the p value.
Figure 13.2 Distribution of the F Ratio With 2 and 37 Degrees of Freedom When the
Null Hypothesis Is True. The red vertical line represents the critical value when α is .05.

The online tools in Chapter 12 and statistical software such as Excel and SPSS will
compute F and find the p value. If p is less than .05, then we reject the null hypothesis
and conclude that there are differences among the group means in the population.
If p is greater than .05, then we retain the null hypothesis and conclude that there is not
enough evidence to say that there are differences. In the unlikely event that we would
compute F by hand, we can use a table of critical values like Table 13.3 “Table of
Critical Values of ” to make the decision. The idea is that any F ratio greater than the
critical value has a p value of less than .05. Thus if the F ratio we compute is beyond
the critical value, then we reject the null hypothesis. If the F ratio we compute is less
than the critical value, then we retain the null hypothesis.

Table 13.3 Table of Critical Values of F When α = .05

dfW 2 3 4
8 4.459 4.066 3.838

9 4.256 3.863 3.633

10 4.103 3.708 3.478

11 3.982 3.587 3.357

12 3.885 3.490 3.259

13 3.806 3.411 3.179

14 3.739 3.344 3.112

15 3.682 3.287 3.056

16 3.634 3.239 3.007

17 3.592 3.197 2.965

18 3.555 3.160 2.928

19 3.522 3.127 2.895

20 3.493 3.098 2.866

21 3.467 3.072 2.840

22 3.443 3.049 2.817

23 3.422 3.028 2.796

24 3.403 3.009 2.776

25 3.385 2.991 2.759

30 3.316 2.922 2.690

35 3.267 2.874 2.641

40 3.232 2.839 2.606

45 3.204 2.812 2.579

50 3.183 2.790 2.557

55 3.165 2.773 2.540

60 3.150 2.758 2.525

65 3.138 2.746 2.513

70 3.128 2.736 2.503

75 3.119 2.727 2.494

80 3.111 2.719 2.486

85 3.104 2.712 2.479

90 3.098 2.706 2.473

95 3.092 2.700 2.467

100 3.087 2.696 2.463

Example One-Way ANOVA

Imagine that the health psychologist wants to compare the calorie estimates of
psychology majors, nutrition majors, and professional dieticians. He collects the
following data:

Psych majors: 200, 180, 220, 160, 150, 200, 190, 200

Nutrition majors: 190, 220, 200, 230, 160, 150, 200, 210, 195

Dieticians: 220, 250, 240, 275, 250, 230, 200, 240

The means are 187.50 (SD = 23.14), 195.00 (SD = 27.77), and 238.13 (SD = 22.35),
respectively. So it appears that dieticians made substantially more accurate estimates
on average. The researcher would almost certainly enter these data into a program
such as Excel or SPSS, which would compute F for him and find the p value. Table 13.4
shows the output of the one-way ANOVA function in Excel for these data. This table is
referred to as an ANOVA table. It shows that MSB is 5,971.88, MSW is 602.23, and their
ratio, F, is 9.92. The p value is .0009. Because this value is below .05, the researcher
would reject the null hypothesis and conclude that the mean calorie estimates for the
three groups are not the same in the population. Notice that the ANOVA table also
includes the “sum of squares” (SS) for between groups and for within groups. These
values are computed on the way to finding MSB and MSW but are not typically reported
by the researcher. Finally, if the researcher were to compute the F ratio by hand, he
could look at Table 13.3 and see that the critical value of F with 2 and 21 degrees of
freedom is 3.467 (the same value in Table 13.4 under Fcrit). The fact that his F score was
more extreme than this critical value would tell him that his p value is less than .05 and
that he should reject the null hypothesis.

Table 13.4 Typical One-Way ANOVA Output From Excel

Source of
SS df MS F p-value Fcrit
variation
9.91623 0.00092
Between groups 11,943.75 2 5,971.875 3.4668
4 8

2
Within groups 12,646.88 602.2321
1

2
Total 24,590.63
3

ANOVA Elaborations

Post Hoc Comparisons

When we reject the null hypothesis in a one-way ANOVA, we conclude that the group
means are not all the same in the population. But this can indicate different things. With
three groups, it can indicate that all three means are significantly different from each
other. Or it can indicate that one of the means is significantly different from the other
two, but the other two are not significantly different from each other. It could be, for
example, that the mean calorie estimates of psychology majors, nutrition majors, and
dieticians are all significantly different from each other. Or it could be that the mean for
dieticians is significantly different from the means for psychology and nutrition majors,
but the means for psychology and nutrition majors are not significantly different from
each other. For this reason, statistically significant one-way ANOVA results are typically
followed up with a series of post hoc comparisons of selected pairs of group means to
determine which are different from which others.

One approach to post hoc comparisons would be to conduct a series of independent-

samples t tests comparing each group mean to each of the other group means. But
there is a problem with this approach. In general, if we conduct a t test when the null
hypothesis is true, we have a 5% chance of mistakenly rejecting the null hypothesis
(see Section 13.3 “Additional Considerations” for more on such Type I errors). If we
conduct several t tests when the null hypothesis is true, the chance of mistakenly
rejecting at least one null hypothesis increases with each test we conduct. Thus
researchers do not usually make post hoc comparisons using standard t tests because
there is too great a chance that they will mistakenly reject at least one null hypothesis.
Instead, they use one of several modified t test procedures—among them the
Bonferonni procedure, Fisher’s least significant difference (LSD) test, and Tukey’s
honestly significant difference (HSD) test. The details of these approaches are beyond
the scope of this book, but it is important to understand their purpose. It is to keep the
risk of mistakenly rejecting a true null hypothesis to an acceptable level (close to 5%).
Repeated-Measures ANOVA

Recall that the one-way ANOVA is appropriate for between-subjects designs in which
the means being compared come from separate groups of participants. It is not
appropriate for within-subjects designs in which the means being compared come from
the same participants tested under different conditions or at different times. This
requires a slightly different approach, called the repeated-measures ANOVA. The
basics of the repeated-measures ANOVA are the same as for the one-way ANOVA.
The main difference is that measuring the dependent variable multiple times for each
participant allows for a more refined measure of MSW. Imagine, for example, that the
dependent variable in a study is a measure of reaction time. Some participants will be
faster or slower than others because of stable individual differences in their nervous
systems, muscles, and other factors. In a between-subjects design, these stable
individual differences would simply add to the variability within the groups and increase
the value of MSW. In a within-subjects design, however, these stable individual
differences can be measured and subtracted from the value of MSW. This lower value
of MSW means a higher value of F and a more sensitive test.

Factorial ANOVA

When more than one independent variable is included in a factorial design, the
appropriate approach is the factorial ANOVA. Again, the basics of the factorial ANOVA
are the same as for the one-way and repeated-measures ANOVAs. The main difference
is that it produces an F ratio and p value for each main effect and for each interaction.
Returning to our calorie estimation example, imagine that the health psychologist tests
the effect of participant major (psychology vs. nutrition) and food type (cookie vs.
hamburger) in a factorial design. A factorial ANOVA would produce separate F ratios
and p values for the main effect of major, the main effect of food type, and the
interaction between major and food. Appropriate modifications must be made
depending on whether the design is between subjects, within subjects, or mixed.

Testing Pearson’s r

For relationships between quantitative variables, where Pearson’s r is used to describe

the strength of those relationships, the appropriate null hypothesis test is a test of
Pearson’s r. The basic logic is exactly the same as for other null hypothesis tests. In this
case, the null hypothesis is that there is no relationship in the population. We can use
the Greek lowercase rho (ρ) to represent the relevant parameter: ρ = 0. The alternative
hypothesis is that there is a relationship in the population: ρ ≠ 0. As with the t test, this
test can be two-tailed if the researcher has no expectation about the direction of the
relationship or one-tailed if the researcher expects the relationship to go in a particular
direction.
It is possible to use Pearson’s r for the sample to compute a t score with N − 2 degrees
of freedom and then to proceed as for a t test. However, because of the way it is
computed, Pearson’s r can also be treated as its own test statistic. The online statistical
tools and statistical software such as Excel and SPSS generally compute
Pearson’s r and provide the p value associated with that value of Pearson’s r. As
always, if the p value is less than .05, we reject the null hypothesis and conclude that
there is a relationship between the variables in the population. If the p value is greater
than .05, we retain the null hypothesis and conclude that there is not enough evidence
to say there is a relationship in the population. If we compute Pearson’s r by hand, we
can use a table like Table 13.5, which shows the critical values of r for various samples
sizes when α is .05. A sample value of Pearson’s r that is more extreme than the critical
value is statistically significant.

Table 13.5 Table of Critical Values of Pearson’s r When α = .05

Critical value of r, one-

N Critical value of r, two-tailed
tailed

5 .805 .878

10 .549 .632

15 .441 .514

20 .378 .444

25 .337 .396

30 .306 .361

35 .283 .334

40 .264 .312
45 .248 .294

50 .235 .279

55 .224 .266

60 .214 .254

65 .206 .244

70 .198 .235

75 .191 .227

80 .185 .220

85 .180 .213

90 .174 .207

95 .170 .202

100 .165 .197

Example Test of Pearson’s r

Imagine that the health psychologist is interested in the correlation between people’s
calorie estimates and their weight. He has no expectation about the direction of the
relationship, so he decides to conduct a two-tailed test. He computes the correlation for
a sample of 22 university students and finds that Pearson’s r is −.21. The statistical
software he uses tells him that the p value is .348. It is greater than .05, so he retains
the null hypothesis and concludes that there is no relationship between people’s calorie
estimates and their weight. If he were to compute Pearson’s r by hand, he could look
at Table 13.5 and see that the critical value for 22 − 2 = 20 degrees of freedom is .444.
The fact that Pearson’s r for the sample is less extreme than this critical value tells him
that the p value is greater than .05 and that he should retain the null hypothesis.
Key Takeaways

 To compare two means, the most common null hypothesis test is the t test. The one-
sample t test is used for comparing one sample mean with a hypothetical population mean of
interest, the dependent-samples t test is used to compare two means in a within-subjects
design, and the independent-samples t test is used to compare two means in a between-
subjects design.
 To compare more than two means, the most common null hypothesis test is the analysis
of variance (ANOVA). The one-way ANOVA is used for between-subjects designs with one
independent variable, the repeated-measures ANOVA is used for within-subjects designs, and
the factorial ANOVA is used for factorial designs.
 A null hypothesis test of Pearson’s r is used to compare a sample value of
Pearson’s r with a hypothetical population value of 0.

Exercises

1. Practice: Use one of the online tools, Excel, or SPSS to reproduce the one-sample t test,
dependent-samples t test, independent-samples t test, and one-way ANOVA for the four sets
of calorie estimation data presented in this section.
2. Practice: A sample of 25 university students rated their friendliness on a scale of 1
(Much Lower Than Average) to 7 (Much Higher Than Average). Their mean rating was 5.30
with a standard deviation of 1.50. Conduct a one-sample t test comparing their mean rating
with a hypothetical mean rating of 4 (Average). The question is whether university students
have a tendency to rate themselves as friendlier than average.
3. Practice: Decide whether each of the following Pearson’s r values is statistically
significant for both a one-tailed and a two-tailed test.
a. The correlation between height and IQ is +.13 in a sample of 35.
b. For a sample of 88 university students, the correlation between how disgusted
they felt and the harshness of their moral judgments was +.23.
c. The correlation between the number of daily hassles and positive mood is −.43
for a sample of 30 middle-aged adults.

Edur 8131 Notes 5 T Test
No ratings yet
Edur 8131 Notes 5 T Test
23 pages
Lecture 2 Hypothesis Test I - Updated2
No ratings yet
Lecture 2 Hypothesis Test I - Updated2
33 pages
11paired T
No ratings yet
11paired T
49 pages
Week4 Modified
No ratings yet
Week4 Modified
28 pages
T Test and Anova - Unit 3 - Notes
No ratings yet
T Test and Anova - Unit 3 - Notes
41 pages
Research III: Quarter 3 Week 4
No ratings yet
Research III: Quarter 3 Week 4
20 pages
Lecture 6 T-Test Part A
No ratings yet
Lecture 6 T-Test Part A
7 pages
Identifying Rejection Region
No ratings yet
Identifying Rejection Region
68 pages
CH 9 10
No ratings yet
CH 9 10
48 pages
8-9.t Test
No ratings yet
8-9.t Test
102 pages
Hypothesis Tests Regarding A Parameter 10.3 Hypothesis Tests For A Population Mean
No ratings yet
Hypothesis Tests Regarding A Parameter 10.3 Hypothesis Tests For A Population Mean
39 pages
Hypothesis Testing
No ratings yet
Hypothesis Testing
18 pages
T Test
No ratings yet
T Test
141 pages
Hypothesis Testing
No ratings yet
Hypothesis Testing
45 pages
Group 5 Parametric Statistics
No ratings yet
Group 5 Parametric Statistics
67 pages
UNIT IV New
No ratings yet
UNIT IV New
71 pages
Statistics Basic Concepts
No ratings yet
Statistics Basic Concepts
13 pages
Statistical Tests
No ratings yet
Statistical Tests
11 pages
Comparing Two Groups (Part 1)
No ratings yet
Comparing Two Groups (Part 1)
24 pages
XV. Z Test For A Mean
No ratings yet
XV. Z Test For A Mean
34 pages
What Is A T-Test
No ratings yet
What Is A T-Test
9 pages
4 T Distribution (Statistics IEM 2-2)
No ratings yet
4 T Distribution (Statistics IEM 2-2)
32 pages
Parametric Test
No ratings yet
Parametric Test
49 pages
T-Tests & Chi2
No ratings yet
T-Tests & Chi2
35 pages
Basics of Hypothesis Testing: The T Statistic
No ratings yet
Basics of Hypothesis Testing: The T Statistic
4 pages
Unit 10
No ratings yet
Unit 10
30 pages
10.statistics Unit-X 2
No ratings yet
10.statistics Unit-X 2
32 pages
T Test
No ratings yet
T Test
1 page
Inferential Statistics For Print Part I
No ratings yet
Inferential Statistics For Print Part I
23 pages
One Sample T-Test
No ratings yet
One Sample T-Test
2 pages
CH 21
No ratings yet
CH 21
58 pages
Application of The Paired T-Test
No ratings yet
Application of The Paired T-Test
6 pages
T Tests
No ratings yet
T Tests
66 pages
Hypothesis Testing Part 2
No ratings yet
Hypothesis Testing Part 2
63 pages
2statistics Prac New
No ratings yet
2statistics Prac New
13 pages
Day 13 - Intro To T Statistic
No ratings yet
Day 13 - Intro To T Statistic
7 pages
Statistical Inferences
No ratings yet
Statistical Inferences
46 pages
25-Tests For Population Means (One Sample and Two Samples) - 30!09!2023
No ratings yet
25-Tests For Population Means (One Sample and Two Samples) - 30!09!2023
12 pages
One Sample Parametric Tests: Unit IV: Statistical Procedures
No ratings yet
One Sample Parametric Tests: Unit IV: Statistical Procedures
6 pages
T Test
No ratings yet
T Test
58 pages
Hypothesis
No ratings yet
Hypothesis
49 pages
T - Test
No ratings yet
T - Test
9 pages
T Test Function in Statistical Software
No ratings yet
T Test Function in Statistical Software
9 pages
Comparison of Means: Hypothesis Testing
No ratings yet
Comparison of Means: Hypothesis Testing
52 pages
LESSON 2.1: Z-Test and T-Test
No ratings yet
LESSON 2.1: Z-Test and T-Test
4 pages
T Testz Test
No ratings yet
T Testz Test
47 pages
4 Hypothesis Testing 1 Sample Mean For Students
No ratings yet
4 Hypothesis Testing 1 Sample Mean For Students
18 pages
Waterlily As Alternative Rust Remover Final Paper
No ratings yet
Waterlily As Alternative Rust Remover Final Paper
25 pages
Math
No ratings yet
Math
24 pages
Week 8:: Hypothesis Testing With One-Sample T-Test
No ratings yet
Week 8:: Hypothesis Testing With One-Sample T-Test
18 pages
Hypothesis Test
No ratings yet
Hypothesis Test
6 pages
Statistics U4
No ratings yet
Statistics U4
38 pages
Application of T-Test To Analyze The Small Sample of Statistical Research
No ratings yet
Application of T-Test To Analyze The Small Sample of Statistical Research
4 pages
What Is Hypothesis Testing
100% (1)
What Is Hypothesis Testing
32 pages
5 Session 18-19 (Z-Test and T-Test)
No ratings yet
5 Session 18-19 (Z-Test and T-Test)
28 pages
"T" Test: DR - Shovan Padhy DM1 Yr (Senior Resident) Dept. of CP & T NIMS, Hyderabad
No ratings yet
"T" Test: DR - Shovan Padhy DM1 Yr (Senior Resident) Dept. of CP & T NIMS, Hyderabad
20 pages
T - Test
No ratings yet
T - Test
45 pages
Initial-public-offering-IPO-at India Bulls
No ratings yet
Initial-public-offering-IPO-at India Bulls
79 pages
Biostat W9
No ratings yet
Biostat W9
18 pages
Foundation Studies General Mathematics B: Hypothesis Testing Notes
89% (9)
Foundation Studies General Mathematics B: Hypothesis Testing Notes
62 pages
Ttest
No ratings yet
Ttest
8 pages
Hypothesis
100% (2)
Hypothesis
25 pages
Chapter 3
No ratings yet
Chapter 3
17 pages
Statistics For Data Science
No ratings yet
Statistics For Data Science
30 pages
Sharma, Singh (2007) A Correlation Between P-Wave Velocity, Impact Strength Index, Slake Durability Index and Uniaxial Compressive Strength
No ratings yet
Sharma, Singh (2007) A Correlation Between P-Wave Velocity, Impact Strength Index, Slake Durability Index and Uniaxial Compressive Strength
6 pages
CHAPTER-12-Data Analysis Using SPSS
No ratings yet
CHAPTER-12-Data Analysis Using SPSS
619 pages
Ial Record
No ratings yet
Ial Record
144 pages
Agriculture Marketing in Thanjavur District - A Study With Reference To Regulated Markets
No ratings yet
Agriculture Marketing in Thanjavur District - A Study With Reference To Regulated Markets
12 pages
The Effect of Problem Solving On Physics in Senior Secondary School, Ondo West Local Government
No ratings yet
The Effect of Problem Solving On Physics in Senior Secondary School, Ondo West Local Government
35 pages
Math 208
100% (1)
Math 208
52 pages
Basic Econometrics 4th Edition Damodar N. Gujarati PDF Download
No ratings yet
Basic Econometrics 4th Edition Damodar N. Gujarati PDF Download
54 pages
RM 5 Wpu
No ratings yet
RM 5 Wpu
92 pages
The Perception of Female Leadership - Impact of Gender and Leader
No ratings yet
The Perception of Female Leadership - Impact of Gender and Leader
117 pages
Week10 One and Two Way Anova
No ratings yet
Week10 One and Two Way Anova
93 pages
Coco Imrad
No ratings yet
Coco Imrad
11 pages
Tellington Touch Effect On Fasting Blood Sugar Level in Type 2 Diabetic Patients
No ratings yet
Tellington Touch Effect On Fasting Blood Sugar Level in Type 2 Diabetic Patients
4 pages
Module 3 Paired and Two Sample T Test
No ratings yet
Module 3 Paired and Two Sample T Test
18 pages
Advanced Statistical Methods Using R
No ratings yet
Advanced Statistical Methods Using R
32 pages
2023 Q2 Restorative Methods As A Strategy For The Prevention of Violence and Bullying in Primary and Secondary Schools in Mexico
No ratings yet
2023 Q2 Restorative Methods As A Strategy For The Prevention of Violence and Bullying in Primary and Secondary Schools in Mexico
14 pages
232-Article Text-427-1-10-20240123
No ratings yet
232-Article Text-427-1-10-20240123
9 pages
Gender Diversity in The Audit Committee and Firm Risk
No ratings yet
Gender Diversity in The Audit Committee and Firm Risk
20 pages
Octoviana Et Al (2021)
No ratings yet
Octoviana Et Al (2021)
20 pages
PCST Asd
No ratings yet
PCST Asd
8 pages
An Assessment of Teachers' Performance Appraisal in Secondary Schools of Wolaita Zone, South Ethiopia
No ratings yet
An Assessment of Teachers' Performance Appraisal in Secondary Schools of Wolaita Zone, South Ethiopia
21 pages
Effects of Aggregation On Budgeting
No ratings yet
Effects of Aggregation On Budgeting
18 pages
Journal of Primary Education: Apriati Sri Kuwita Gandi, Sri Haryani & Deni Setiawan
No ratings yet
Journal of Primary Education: Apriati Sri Kuwita Gandi, Sri Haryani & Deni Setiawan
6 pages
Common Statistical Tests Are Linear Models
No ratings yet
Common Statistical Tests Are Linear Models
1 page
De-Mystifying Math and Stats for Machine Learning: Mastering the Fundamentals of Mathematics and Statistics for Machine Learning
From Everand
De-Mystifying Math and Stats for Machine Learning: Mastering the Fundamentals of Mathematics and Statistics for Machine Learning
Seaport AI Madhavan
No ratings yet
Statistics II Essentials
From Everand
Statistics II Essentials
Emil Milewski
2.5/5 (1)
Chi Squared for Beginners
From Everand
Chi Squared for Beginners
Stephanie Glen
No ratings yet

Some Basic Null Hypothesis Tests

Uploaded by

Some Basic Null Hypothesis Tests

Uploaded by

Some Basic Null Hypothesis Tests

1. Conduct and interpret one-sample, dependent-samples, and independent-

The one-sample t test is used to compare a sample mean (M) with a hypothetical

Table 13.2 Table of Critical Values of t When α = .05

100 1.660 1.984

Imagine that a health psychologist is interested in the accuracy of university students’

The dependent-samples t test (sometimes called the paired-samples t test) is used to

It helps to think of the dependent-samples t test as a special case of the one-

The difference scores, then, are as follows:

+20, +10, −30, +25, +10, 0, +20, −30, +10, +50

The independent-samples t test is used to compare the means of two separate

The Analysis of Variance

Table 13.3 Table of Critical Values of F When α = .05

9 4.256 3.863 3.633

10 4.103 3.708 3.478

11 3.982 3.587 3.357

12 3.885 3.490 3.259

13 3.806 3.411 3.179

14 3.739 3.344 3.112

15 3.682 3.287 3.056

16 3.634 3.239 3.007

17 3.592 3.197 2.965

18 3.555 3.160 2.928

19 3.522 3.127 2.895

20 3.493 3.098 2.866

21 3.467 3.072 2.840

22 3.443 3.049 2.817

24 3.403 3.009 2.776

25 3.385 2.991 2.759

30 3.316 2.922 2.690

35 3.267 2.874 2.641

40 3.232 2.839 2.606

45 3.204 2.812 2.579

50 3.183 2.790 2.557

55 3.165 2.773 2.540

60 3.150 2.758 2.525

65 3.138 2.746 2.513

70 3.128 2.736 2.503

75 3.119 2.727 2.494

80 3.111 2.719 2.486

85 3.104 2.712 2.479

95 3.092 2.700 2.467

100 3.087 2.696 2.463

Example One-Way ANOVA

Dieticians: 220, 250, 240, 275, 250, 230, 200, 240

Table 13.4 Typical One-Way ANOVA Output From Excel

Post Hoc Comparisons

One approach to post hoc comparisons would be to conduct a series of independent-

For relationships between quantitative variables, where Pearson’s r is used to describe

Table 13.5 Table of Critical Values of Pearson’s r When α = .05

Critical value of r, one-

100 .165 .197

Example Test of Pearson’s r

You might also like