SOCI 301 Final Notes

Chapter 5 – Hypothesis Testing

hypothesis testing Procedure for deciding whether the outcome of a study (results for a
sample) supports a particular theory or practical innovation (which is thought to apply to a
population).
hypothesis A prediction, often based on observation, previous research, or theory, that is
tested in a research study.
theory A set of principles that attempt to explain one or more facts, relationships, or
events; behavioral and social scientists often derive specific predictions (hypotheses) from
theories that are then tested in research studies.

research hypothesis Statement in hypothesis testing about the predicted relation
between populations (usually a prediction of a difference between population means).
null hypothesis Statement about a relationship between populations that is the opposite
of the research hypothesis; a statement that in the population there is no difference (or a
difference opposite to that predicted) between populations; a contrived statement set up to
examine whether it can be rejected as part of hypothesis testing.
comparison distribution Distribution used in hypothesis testing. It represents the
population situation if the null hypothesis is true. It is the distribution to which you
compare the score based on your sample’s results.

conventional levels of significance (p < .05, p < .01) The levels of significance widely
used in the behavioral and social sciences.
statistically significant Conclusion that the results of a study would be unlikely if in fact
the sample studied represents a population that is no different from the population in
general; an outcome of hypothesis testing in which the null hypothesis is rejected.

Here is a summary of the five steps of hypothesis testing:

1. Restate the question as a research hypothesis and a null hypothesis about
the populations.
2. Determine the characteristics of the comparison distribution.
3. Determine the cutoff sample score on the comparison distribution at
which the null hypothesis should be rejected.
4. Determine your sample’s score on the comparison distribution.
5. Decide whether to reject the null hypothesis.
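
As a minimal sketch, the five steps can be traced in Python for the simplest case: one individual's score compared against a known, normally distributed population, with a directional prediction that the score will be higher. The function name, the example numbers, and the use of `statistics.NormalDist` are illustrative assumptions, not part of the course notes.

```python
from statistics import NormalDist


def single_score_test(score, pop_mean, pop_sd, alpha=0.05):
    """One-tailed (upper-tail) test of a single score; hypothetical helper."""
    # Step 1 (done on paper): research hypothesis = the sampled population
    # has a higher mean; null hypothesis = it does not.
    # Step 2: the comparison distribution is the known population,
    # assumed normal with the given mean and standard deviation.
    # Step 3: cutoff Z score that leaves `alpha` in the upper tail.
    cutoff_z = NormalDist().inv_cdf(1 - alpha)
    # Step 4: the sample's score on the comparison distribution.
    z = (score - pop_mean) / pop_sd
    # Step 5: reject the null hypothesis only if Z is beyond the cutoff.
    return z, cutoff_z, z > cutoff_z


z, cutoff_z, reject = single_score_test(score=39, pop_mean=30, pop_sd=4)
print(f"Z = {z:.2f}, cutoff = {cutoff_z:.2f}, reject null: {reject}")
```

With these made-up numbers, Z is 2.25, which is beyond the cutoff of about 1.64, so the null hypothesis is rejected.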

directional hypothesis Research hypothesis predicting a particular direction of
difference between populations; for example, a prediction that the population like the
sample studied has a higher mean than the population in general.
one-tailed test Hypothesis-testing procedure for a directional hypothesis; situation in
which the region of the comparison distribution in which the null hypothesis would be
rejected is all on one side (or tail) of the distribution.
nondirectional hypothesis Research hypothesis that does not predict a particular
direction of difference between the population like the sample studied and the population in
general.
two-tailed test Hypothesis-testing procedure for a nondirectional hypothesis; the
situation in which the region of the comparison distribution in which the null hypothesis
would be rejected is divided between the two sides (tails) of the distribution.
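
The difference between the two kinds of test shows up in the cutoff scores. The sketch below is illustrative only: the helper name `z_cutoffs` is hypothetical, and it assumes a normal comparison distribution. It puts the whole rejection region in one tail for a one-tailed test and splits it evenly between the tails for a two-tailed test.

```python
from statistics import NormalDist


def z_cutoffs(alpha=0.05, two_tailed=True):
    """Return the Z cutoff(s) for a given significance level (hypothetical helper)."""
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    if two_tailed:
        # Half of alpha goes in each tail.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    # All of alpha goes in the upper tail (prediction of a higher mean).
    return (std_normal.inv_cdf(1 - alpha),)


print([round(c, 2) for c in z_cutoffs(0.05, two_tailed=False)])
print([round(c, 2) for c in z_cutoffs(0.05, two_tailed=True)])
print([round(c, 2) for c in z_cutoffs(0.01, two_tailed=True)])
```

At the same significance level, the two-tailed cutoff is more extreme than the one-tailed cutoff, so a result in the predicted direction has to be larger to reach significance.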

decision error Incorrect conclusion in hypothesis testing in relation to the real (but
unknown) situation, such as deciding the null hypothesis is false when it is really true.

Type I error Rejecting the null hypothesis when in fact it is true; getting a statistically
significant result when in fact the research hypothesis is not true.

Type II error Failing to reject the null hypothesis when in fact it is false; failing to get a
statistically significant result when in fact the research hypothesis is true.
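
A small simulation can make the Type I error concrete. This is a sketch under stated assumptions: every simulated study draws its Z score from the comparison distribution itself (so the null hypothesis is true by construction), the test is two-tailed, and the function name and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist


def type_i_error_rate(n_studies=10_000, alpha=0.05, seed=0):
    """Share of simulated studies that reject a true null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    rejections = 0
    for _ in range(n_studies):
        z = rng.gauss(0, 1)  # Z score when the null hypothesis is true
        if abs(z) > cutoff:
            rejections += 1  # significant result, but the null is true
    return rejections / n_studies


print(type_i_error_rate())
```

The printed proportion should land close to the significance level, which is the long-run rate of Type I errors when the null hypothesis is true.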

Chapter 6 – Hypothesis Testing with Groups of Scores

distribution of means Distribution of means of samples of a given size from a particular
population (also called a sampling distribution of the mean); comparison distribution when
testing hypotheses involving a single sample of more than one individual.

mean of a distribution of means The mean of a distribution of means of samples of a
given size from a particular population; it comes out to be the same as the mean of the
population of individuals.
variance of a distribution of means Variance of the population divided by the number
of scores in each sample.
- The variance of a distribution of means is the variance of the population of
individuals divided by the number of individuals in each sample.

$\text{Population } SD_M^2 = \text{Population } SD^2 / N$

Population $SD_M^2$ is the variance of the distribution of means, Population $SD^2$ is the
variance of the population of individuals, and $N$ is the number of individuals in each sample.

standard deviation of a distribution of means (Population $SD_M$) Square root of the
variance of the distribution of means; same as standard error (SE).
- The standard deviation of a distribution of means is the square root of the variance of
the distribution of means.

standard error (SE) Same as standard deviation of a distribution of means; also
called standard error of the mean (SEM).

shape of a distribution of means Contour of a histogram of a distribution of means,
such as whether it follows a normal curve or is skewed; in general, a distribution of means
will tend to be unimodal and symmetrical and is often normal.

Summary of Rules and Formulas for Determining the Characteristics of a Distribution of Means

- Rule 1: The mean of a distribution of means is the same as the mean of the population
of individuals:
$\text{Population } M_M = \text{Population } M$
- Rule 2a: The variance of a distribution of means is the variance of the population of
individuals divided by the number of individuals in each sample:
$\text{Population } SD_M^2 = \text{Population } SD^2 / N$
- Rule 2b: The standard deviation of a distribution of means is the square root of the
variance of the distribution of means:
$\text{Population } SD_M = \sqrt{\text{Population } SD_M^2}$
- Rule 3: The shape of a distribution of means is approximately normal if either (a) each
sample is of 30 or more individuals or (b) the distribution of the population of individuals is
normal.
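
The rules above can be sketched as a single function. The function name, the `population_is_normal` flag, and the example numbers (population mean 200, population SD 48, samples of 64) are assumptions made for illustration.

```python
import math


def distribution_of_means(pop_mean, pop_variance, n, population_is_normal=False):
    """Characteristics of a distribution of means (hypothetical helper)."""
    mean_m = pop_mean                  # Rule 1: same mean as the population
    variance_m = pop_variance / n      # Rule 2a: population variance / N
    sd_m = math.sqrt(variance_m)       # Rule 2b: square root of the variance
    # Rule 3: approximately normal if N >= 30 or the population is normal.
    approx_normal = n >= 30 or population_is_normal
    return mean_m, variance_m, sd_m, approx_normal


print(distribution_of_means(pop_mean=200, pop_variance=48 ** 2, n=64))
```

For these numbers the variance of the distribution of means is 2304 / 64 = 36 and its standard deviation is 6, far smaller than the population standard deviation of 48.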

Hypothesis Testing with a Distribution of Means: The Z Test

Z test Hypothesis-testing procedure in which there is a single sample and the population
variance is known.

The Z score for the sample’s mean on the distribution of means is the sample’s mean minus
the mean of the distribution of means, divided by the standard deviation of the distribution
of means.
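
The Z score calculation described above can be sketched as follows. The function name and example values are hypothetical, and the one-tailed branch assumes the research hypothesis predicts a higher mean.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_variance, n, alpha=0.05, two_tailed=True):
    """Z test for one sample mean with known population variance (sketch)."""
    # Standard deviation of the distribution of means.
    sd_m = math.sqrt(pop_variance / n)
    # Z = (sample mean - mean of distribution of means) / SD of that distribution.
    z = (sample_mean - pop_mean) / sd_m
    tail_alpha = alpha / 2 if two_tailed else alpha
    cutoff = NormalDist().inv_cdf(1 - tail_alpha)
    # One-tailed case assumes the predicted direction is "higher".
    reject = abs(z) > cutoff if two_tailed else z > cutoff
    return z, cutoff, reject


z, cutoff, reject = z_test(sample_mean=220, pop_mean=200, pop_variance=48 ** 2, n=64)
print(f"Z = {z:.2f}, cutoff = +/-{cutoff:.2f}, reject null: {reject}")
```

Here the sample mean is 20 points above the population mean and the standard deviation of the distribution of means is 6, so Z is about 3.33 and the null hypothesis is rejected at the .05 level.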
confidence interval (CI) Roughly speaking, the region of scores (that is, the scores
between an upper and lower value) that is likely to include the true population mean; more
precisely, the range of possible population means from which it is not highly unlikely that
you could have obtained your sample mean.

confidence limit Upper or lower value of a confidence interval.

95% confidence interval Confidence interval in which, roughly speaking, there is a 95%
chance that the population mean falls within this interval.

99% confidence interval Confidence interval in which, roughly speaking, there is a 99%
chance that the population mean falls within this interval.

Steps for Figuring the 95% and 99% Confidence Intervals


Here are three steps for figuring a confidence interval. These steps assume that the
distribution of means is approximately a normal distribution.

1. Estimate the population mean and figure the standard deviation of the
distribution of means. The best estimate of the population mean is the sample mean.
Next, find the variance of the distribution of means in the usual way:
$\text{Population } SD_M^2 = \text{Population } SD^2 / N$. Take the square root of the
variance of the distribution of means to find the standard deviation of the distribution
of means: $\text{Population } SD_M = \sqrt{\text{Population } SD_M^2}$.
2. Find the Z scores that go with the confidence interval you want. For the
95% confidence interval, the Z scores are +1.96 and −1.96. For the 99% confidence
interval, the Z scores are +2.58 and −2.58.
3. To find the confidence interval, change these Z scores to raw scores. To
find the lower limit, multiply −1.96 (for the 95% confidence interval) or −2.58 (for
the 99% confidence interval) by the standard deviation of the distribution of means
(Population $SD_M$) and add this to the estimated population mean (the sample mean).
To find the upper limit, multiply +1.96 (for the 95% confidence interval) or +2.58 (for
the 99% confidence interval) by the standard deviation of the distribution of means
(Population $SD_M$) and add this to the estimated population mean.
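
The three steps above can be sketched in Python. The function name and the example numbers are assumptions; the Z scores are computed with `statistics.NormalDist` instead of being read from a table, so the 99% value is 2.576 where the notes round to 2.58.

```python
import math
from statistics import NormalDist


def confidence_interval(sample_mean, pop_variance, n, level=0.95):
    """Confidence limits for a population mean, known variance (sketch)."""
    # Step 1: the sample mean estimates the population mean;
    # figure the standard deviation of the distribution of means.
    sd_m = math.sqrt(pop_variance / n)
    # Step 2: Z score that leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Step 3: change the Z scores to raw scores around the sample mean.
    return sample_mean - z * sd_m, sample_mean + z * sd_m


lower, upper = confidence_interval(sample_mean=220, pop_variance=48 ** 2, n=64)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
lower99, upper99 = confidence_interval(220, 48 ** 2, 64, level=0.99)
print(f"99% CI: {lower99:.2f} to {upper99:.2f}")
```

The 99% interval is wider than the 95% interval because a larger Z score is multiplied by the same standard deviation of the distribution of means.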

Chapter 7 – Effect Size and Statistical Power

effect size Standardized measure of difference (lack of overlap) between populations.
Effect size increases with greater differences between means.
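
The notes do not give a formula for effect size. One common standardized form, assumed here for illustration, divides the difference between the two population means by the population standard deviation. The function name and numbers are hypothetical.

```python
def effect_size(mean_1, mean_2, pop_sd):
    """Standardized mean difference (assumed form, not from the notes)."""
    # Difference between the population means, in standard deviation units.
    return (mean_1 - mean_2) / pop_sd


small_gap = effect_size(mean_1=208, mean_2=200, pop_sd=48)
large_gap = effect_size(mean_1=224, mean_2=200, pop_sd=48)
print(round(small_gap, 2), round(large_gap, 2))
```

With the standard deviation held constant, the larger difference between means gives the larger effect size, matching the definition above.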

meta-analysis Statistical method for combining effect sizes from different studies.

statistical power Probability that the study will give a significant result if the research
hypothesis is true.
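
As a rough sketch, power can be figured directly for a one-tailed Z test when the population variance is known. The function name, the predicted mean, and the other numbers are assumptions for illustration; the approach is to find the raw-score cutoff on the comparison distribution and then ask how much of the predicted distribution of means lies beyond it.

```python
import math
from statistics import NormalDist


def z_test_power(pop_mean, predicted_mean, pop_sd, n, alpha=0.05):
    """Power of a one-tailed (upper-tail) Z test; hypothetical helper."""
    sd_m = pop_sd / math.sqrt(n)
    # Raw-score cutoff on the comparison (null) distribution of means.
    cutoff_raw = pop_mean + NormalDist().inv_cdf(1 - alpha) * sd_m
    # Distribution of means if the research hypothesis is true.
    predicted = NormalDist(mu=predicted_mean, sigma=sd_m)
    # Power is the area of the predicted distribution beyond the cutoff.
    return 1 - predicted.cdf(cutoff_raw)


for n in (64, 256):
    power = z_test_power(pop_mean=200, predicted_mean=208, pop_sd=48, n=n)
    print(n, round(power, 2))
```

With the same predicted difference, the larger sample gives noticeably higher power, because the standard deviation of the distribution of means shrinks as the sample size grows.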

power table Table for a hypothesis-testing procedure showing the statistical power of a
study for various effect sizes and sample sizes.
[T]hey were confronted with the serious problem of having to accept the null hypothesis….
We can view this issue in terms of statistical power…. A minimal statistical power of .80
[80%] is required before one can consider the argument that the lack of significance may be
interpreted as evidence that Ho [the null hypothesis] is true. To conduct a power analysis, it
is necessary to specify an expected mean difference, the alpha [significance] level, and
whether a one-tailed or two-tailed test will be used. Given a power requirement of .8, one
can then determine the N necessary. Once these conditions are satisfied, if the experiment
fails to find a significant difference, then one can make the following kind of a statement:
“We have designed an experiment with a .8 probability of finding a significant difference, if
such exists in the population. Because we failed to find a significant effect, we think it quite
unlikely that one exists. Even if it does exist, its contribution would appear to be minimal….”
Mody et al. never discussed power, even though they interpreted negative findings as
evidence for the validity of the null hypothesis in all of their experiments…. Because the
participants were split in this experiment, the ns [sample sizes] were reduced to 10 per
group. Under such conditions one would not expect to find a significant difference, unless
the experimental variable was very powerful. In other words it is more difficult to reject the
null hypothesis when working with small ns [sample sizes]. The only meaningful conclusion
that can be drawn from this study is that no meaningful interpretation can be made of the
lack of findings…. 

SET 1 – CHAPTER 7
(c) Regarding situation (a), the significance tells you the probability of getting your results if
the null hypothesis is true; sample size is already taken into account in figuring the significance.
Regarding situation (b), it is possible to get a significant result with a large sample even when the
actual practical effect is slight, such as when the mean of your sample (and thus, your best
estimate of the mean of the population that gets the experimental treatment) is only slightly higher
than the mean of the known population. This is possible because significance is based on the
difference between the mean of your sample and the known population mean with this difference
then divided by the standard deviation of the distribution of means. If the sample size is very
large, then the standard deviation of the distribution of means is very small. (This is because it is
figured by taking the square root of the result of dividing the population variance by the sample
size.) Thus, even a small difference between the means when divided by a very small
denominator can give a large overall result, making the study significant.
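
The point about sample size can be checked with a few lines of Python. The numbers are made up: a 1-point difference between the sample mean and the known population mean, and a population standard deviation of 10. Only the sample size changes.

```python
import math


def z_for_sample_size(mean_difference, pop_sd, n):
    """Z score for a fixed mean difference at a given sample size (sketch)."""
    # SD of the distribution of means: square root of (variance / N).
    sd_m = math.sqrt(pop_sd ** 2 / n)
    return mean_difference / sd_m


for n in (25, 100, 10_000):
    print(n, round(z_for_sample_size(mean_difference=1, pop_sd=10, n=n), 2))
```

The same 1-point difference gives a Z score of 0.5 with 25 participants but 10 with 10,000 participants, which is far beyond any conventional cutoff even though the practical effect is unchanged.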
5.
Power is the chance of rejecting the null hypothesis if the research hypothesis is true. In
other words, the power of a study represents the likelihood that you will get a statistically
significant result in your study, if in fact the research hypothesis is true. Ideally, a study
should have power of 80% or more. If a study has low power and does not get a statistically
significant result, the result of the study is entirely inconclusive. This is because it is not
clear whether the nonsignificant result is due to the low power of the study or because the
research hypothesis is in fact false.
Effect size can be thought of as the degree to which distributions do not overlap. The larger
the effect size, the larger the power. As noted in the quotation from the research article, the
study had a high level of power (about 90%) for detecting both large and medium-sized
effects. Given this high level of power, the researchers were able to conclude that the most
likely reason for the nonsignificant study results is that the research hypothesis is in fact
false. As the researchers noted, with such a high level of power, it is very unlikely that the
results of the study would be nonsignificant if there were in fact a medium-sized or large
effect in the population. Since smaller effect sizes are associated with lower power, the
researchers were careful not to rule out the possibility that there is in fact a small effect in
the population (which may not have been detected in the study due to the lower power for
identifying a small effect size).
7.
One situation is that when planning an experiment, figuring power gives you the chance to
make changes of various kinds (or even abandon the project) if power is too low. (Or if
power is higher than reasonably needed, you would then be able to make changes to make
the study less costly, for example, by reducing the number of participants.) Another
situation is figuring power after a study is done that had nonsignificant results. If you figure
that power was high in the study, this means you can be pretty confident that the null
hypothesis really is true in the population, in the sense that the true difference in the
population is really smaller than the effect size you used to figure power. But if you figure
the power of the study was low, this tells you that the result really is ambiguous and that it is
still reasonable to think that future research testing this hypothesis might have a chance of
being significant. A third possibility is figuring power after a study is done that got a
significant result and the researchers do not give the effect size. If the study had high power
(as when it used a large sample), this tells you that the effect size could have been small and
thus the effect not very important for practical application. But if the study seems to have
had low power (as from having a small sample), this tells you that the effect size must have
been large for them to get a significant result.

Chapter 8

t test Hypothesis-testing procedure in which the population variance is unknown; it
compares t scores from a sample to a comparison distribution called a t distribution.

biased estimate Estimate of a population parameter that is likely systematically to
overestimate or underestimate the true value of the population parameter. For
example, $SD^2$ would be a biased estimate of the population variance (it would systematically
underestimate it).

unbiased estimate of the population variance Estimate of the population variance,
based on sample scores, which has been corrected so that it is equally likely to overestimate
or underestimate the true population variance; the correction used is dividing the sum of
squared deviations by the sample size minus 1, instead of the usual procedure of dividing by
the sample size directly.

degrees of freedom Number of scores free to vary when estimating a population
parameter; usually part of a formula for making that estimate. For example, in the formula
for estimating the population variance from a single sample, the degrees of freedom is the
number of scores minus 1.

- The estimated population variance is the sum of squared deviation scores divided by
the degrees of freedom.
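
A short sketch of the two variance calculations, using made-up scores. The function name is hypothetical; the standard library's `statistics.pvariance` (divide by N) and `statistics.variance` (divide by N − 1) are shown alongside as a cross-check.

```python
import statistics


def estimated_population_variance(scores):
    """Unbiased estimate: sum of squared deviations / degrees of freedom."""
    n = len(scores)
    mean = sum(scores) / n
    sum_squared_deviations = sum((x - mean) ** 2 for x in scores)
    df = n - 1  # degrees of freedom for a single sample
    return sum_squared_deviations / df, df


scores = [8, 6, 4, 9, 3]
unbiased, df = estimated_population_variance(scores)
biased = statistics.pvariance(scores)  # divides by N instead of N - 1
print(unbiased, df, biased)
print(statistics.variance(scores))     # library version of the N - 1 formula
```

For these five scores the sum of squared deviations is 26, so dividing by N gives 5.2 while dividing by the degrees of freedom (4) gives the larger, unbiased estimate of 6.5.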

t distribution Mathematically defined curve that is the comparison distribution used in
a t test.
- The t distribution has higher (heavier) tails than a normal distribution.

t table Table of cutoff scores on the t distribution for various degrees of freedom,
significance levels, and one- and two-tailed tests.
t score On a t distribution, number of standard deviations from the mean (like a Z score,
but on a t distribution).

Page 245: summary of hypothesis testing with sample mean and unknown population
variance.

repeated-measures design Research strategy in which each person is tested more than
once; same as within-subjects design.

t test for dependent means Hypothesis-testing procedure in which there are two scores
for each person and the population variance is not known; it determines the significance of a
hypothesis that is being tested using difference or change scores from a single group of
people.

difference score Difference between a person’s score on one testing and the same
person’s score on another testing; often an after score minus a before score, in which case it
is also called a change score.
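
A minimal sketch of the t score for dependent means, using made-up before and after scores for four people. The function name is hypothetical, and the sketch assumes the usual null hypothesis that the population mean of the difference scores is 0. The cutoff still has to come from a t table, since Python's standard library has no t distribution.

```python
import math


def dependent_means_t(before, after):
    """t score and degrees of freedom from paired scores (sketch)."""
    # Difference (change) scores: after minus before for each person.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sum_squared_deviations = sum((d - mean_diff) ** 2 for d in diffs)
    df = n - 1
    # Estimated population variance of the difference scores.
    est_variance = sum_squared_deviations / df
    # Standard deviation of the distribution of means of difference scores.
    s_m = math.sqrt(est_variance / n)
    # Assumed null hypothesis: population mean difference is 0.
    t = (mean_diff - 0) / s_m
    return t, df


t, df = dependent_means_t(before=[10, 12, 9, 11], after=[12, 15, 9, 14])
print(f"t({df}) = {t:.2f}")
```

Here t is about 2.83 with 3 degrees of freedom. The two-tailed .05 cutoff for 3 degrees of freedom in a standard t table is 3.182, so this made-up result would not be significant.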

Page 252: summary of t-test for dependent means

Page 255: Review and Comparison of Z Test, t Test for a Single Sample, and t Test for
Dependent Means

assumption A condition, such as a population’s having a normal distribution, required for
carrying out a particular hypothesis-testing procedure; a part of the mathematical
foundation for the accuracy of the tables used in determining cutoff values.

Page 261: approximate number of participants needed to conduct a study with 80% power
