UCLA Chapter 8
• Write a Null and Alternative Hypothesis for a Test Involving a Population Proportion
A procedure that enables us to choose between two claims when we have variability in our measurements.
1. Hypothesize: State a hypothesis (claim) that will be weighed against a neutral “skeptical” claim.
2. Prepare: Determine how you’ll use data to make your decision and make sure you have enough
data to minimize the probability of making mistakes.
4. Interpret
A Pair of Hypotheses
• Always contains =
• The research hypothesis; the statement about a population parameter we intend to demonstrate is
true
The null hypothesis always gets the benefit of the doubt throughout the hypothesis-testing procedure.
We only reject the null hypothesis if the observed outcome would be extremely unusual if the null hypothesis
were true.
This is analogous to assuming that a defendant in a jury trial is innocent unless proven guilty “beyond a
reasonable doubt.”
The sign in the alternative hypothesis determines whether a hypothesis is one-sided or two-sided.
Example: A 2014 Pew Poll found that 61% of Americans believed in global warming. A researcher
believes this rate has declined. State the null and alternative hypotheses.
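One way to write the pair for this example is sketched below. It assumes p denotes the current proportion of Americans who believe in global warming (a label not given in the text). Because the researcher believes the rate has declined, the alternative uses "<", which makes the test one-sided (left-tailed).

```latex
H_0 : p = 0.61 \qquad\qquad H_a : p < 0.61
```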
Test statistic
The test statistic compares our observed outcome with the outcome we would expect if the null hypothesis
were true.
When the test statistic is far away from the value we would expect if the null hypothesis were true, we
reject the null hypothesis and conclude the evidence supports the alternative hypothesis.
To test hypotheses regarding the population proportion, we can use the steps that follow,
provided that the central limit theorem conditions are met:
Note: When observed results are unlikely under the assumption that the null hypothesis is true, we say
the result is statistically significant. When results are found to be statistically significant, we reject
the null hypothesis.
Recall: The best point estimate of p, the proportion of the population with a certain characteristic, is
given by $\hat{p} = \frac{x}{n}$, where x is the number of individuals in the sample with the specified characteristic and n
is the sample size.
$$\mu_{\hat{p}} = p$$
$$\text{Standard Error} = SE = \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
$$z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}}$$
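Convert the z statistic to a P-value:
[Figure: three standard normal curves (vertical axis: Frequency) with the P-value region shaded. Right-tailed: area to the right of z = 1.56. Left-tailed: area to the left of z = −1.56. Two-tailed: area to the left of z = −1.56 plus area to the right of z = 1.56.]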
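The conversion from a z statistic to a P-value can be sketched in Python as follows. This is a minimal illustration that assumes the standard normal model applies; the helper name `p_value` and the tail labels are made up for this sketch, and the value 1.56 is the one used in the figure.

```python
from statistics import NormalDist


def p_value(z, tail):
    """Return the P-value for a z statistic under the standard normal model.

    tail must be "left", "right", or "two" (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # Area to the left of z
        return std_normal.cdf(z)
    if tail == "right":
        # Area to the right of z
        return 1 - std_normal.cdf(z)
    if tail == "two":
        # Area in both tails beyond |z|
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


right = p_value(1.56, "right")
left = p_value(-1.56, "left")
two = p_value(1.56, "two")
print(round(right, 4), round(left, 4), round(two, 4))
```

By symmetry the left-tailed and right-tailed areas match (about 0.059), and the two-tailed P-value is twice that (about 0.119).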
Interpretation of the P-value:
• The P-value answers the question: What is the probability of the observed test statistic, or one
more extreme, when Ho is true?
• Thus, smaller and smaller P-values provide stronger and stronger evidence against Ho.
Examples
Significance level
Rule of thumb
8.2: Hypothesis testing
– Find and label the following values:
n, x, p̂, p
– Check the conditions of the Central Limit Theorem if the distribution is not normal.
    ∗ Simple Random Sample: The sample is obtained by simple random sampling.
    ∗ Large sample size: n × p ≥ 10 and n × (1 − p) ≥ 10
• State the null and alternative hypotheses:
H◦ :
Ha :
• Compute the test statistic (zo):
    ∗ Left-tailed test
    ∗ Two-tailed test
    ∗ Right-tailed test
– Compare the P-value with α.
    ∗ P-value ≤ α
        · Reject Ho.
    ∗ P-value > α
        · Fail to reject Ho.
• State a conclusion interpreting the results of the hypothesis test:
Example: In 1997, 46% of Americans said they did not trust the media “when it comes to reporting the
news fully, accurately and fairly”. In a 2022 poll of 1,010 adults nationwide, 525 stated they did not trust
the media. At the 5% level of significance, is there evidence to support the claim that the percentage of
Americans who do not trust the media to report fully and accurately has increased since 1997?
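One way to carry out the computation for this example is sketched below. It assumes the hypotheses are H◦: p = 0.46 and Ha: p > 0.46 (a right-tailed test, since the claim is an increase) and that the poll can be treated as a simple random sample. The variable names are chosen for this sketch only.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.46     # proportion under the null hypothesis (the 1997 value)
n = 1010      # sample size of the 2022 poll
x = 525       # number who said they did not trust the media
alpha = 0.05  # significance level

# Large-sample condition, checked with the null value p0
clt_ok = n * p0 >= 10 and n * (1 - p0) >= 10

p_hat = x / n                    # sample proportion
se = sqrt(p0 * (1 - p0) / n)     # standard error under the null hypothesis
z = (p_hat - p0) / se            # test statistic

# Right-tailed test (assumed Ha: p > 0.46): area to the right of z
p_val = 1 - NormalDist().cdf(z)
reject = p_val <= alpha

print(f"conditions met: {clt_ok}")
print(f"p_hat = {p_hat:.4f}, SE = {se:.4f}, z = {z:.2f}, P-value = {p_val:.5f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

Under these assumptions z is roughly 3.8 and the P-value is far below 0.05, so the sketch rejects H◦ in favor of the claim that distrust has increased.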
Example: According to the Centers for Disease Control and Prevention (CDC), the percent of adults 20 years of age and
over in the United States who are overweight is 69.0% (see
http://www.cdc.gov/nchs/fastats/obesity-overweight.htm). One city’s council wants to know if
the proportion of overweight citizens in their city is different from this known national proportion. They
take a random sample of 150 adults 20 years of age or older in their city and find that 98 are classified as
overweight. Let’s use the four-step hypothesis testing procedure to determine if there is evidence that the
proportion in this city is different from the known national proportion at a significance level of 5%.
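The arithmetic for this example can be sketched as follows, assuming the hypotheses H◦: p = 0.69 and Ha: p ≠ 0.69 (two-tailed, because the council asks whether the proportion is "different"). The large-sample condition holds, since 150 × 0.69 = 103.5 and 150 × 0.31 = 46.5 are both at least 10. Values are rounded.

```latex
\hat{p} = \frac{98}{150} \approx 0.6533
\qquad
SE = \sqrt{\frac{0.69\,(1 - 0.69)}{150}} \approx 0.0378

z = \frac{\hat{p} - p_0}{SE} = \frac{0.6533 - 0.69}{0.0378} \approx -0.97

\text{P-value} = 2 \cdot P(Z \le -0.97) \approx 2\,(0.166) = 0.332
```

Since 0.332 > 0.05, we fail to reject H◦: the sample does not provide evidence that the city's proportion differs from the national proportion.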
8.3: Hypothesis tests in detail
Type I Error is concluding that the alternative hypothesis is correct when the null hypothesis is
correct. Alpha (α) is the probability of concluding that the alternative hypothesis is correct when the
null hypothesis is correct. This is also known as a false positive.
Type II Error is concluding that the null hypothesis is correct when the alternative hypothesis is
correct. Beta (β) is the probability of concluding that the null hypothesis is correct when the alternative
hypothesis is correct. This is also known as a false negative.
Example: In the judicial process, a jury is a sworn body of people (the jurors) convened to render an
impartial verdict (a finding of fact on a question) officially submitted to them by a court. Here are the
two verdicts that the jury members can reach:
• The person is found not guilty.
• The person is found guilty.
f. How can the jury minimize the probability of making a Type I or Type II error?
Example: In 2008, 62% of American adults regularly volunteered their time for charity work. A
researcher believes that this percentage is different today. For the following claims, explain what it would
mean to make a Type I error. What would it mean to make a Type II error?
Example: Suppose we conducted a hypothesis test on the average height of men and rejected the null
hypothesis H◦ : µ = 66. If the true average height of the population of men is 69, what can be said about
our decision to reject H◦ ?
Example: Suppose we conducted a hypothesis test on the average salary of single mothers and rejected
the null hypothesis H◦ : µ = 25,000. If the true average salary of single mothers is 25,000, what can be
said about our conclusion for our hypothesis test?
Note: For a fixed sample size, as the probability of a Type I error increases, the probability of a Type II
error decreases, and vice-versa.
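The trade-off in the note can be illustrated numerically. The sketch below uses entirely hypothetical numbers (null value 0.50, true proportion 0.55, sample size 200) and a right-tailed one-proportion z-test with the normal approximation. The function name `type_ii_error` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, p0, p_true, n):
    """Approximate beta for a right-tailed one-proportion z-test.

    p0 is the null value, p_true the (hypothetical) true proportion.
    """
    std_normal = NormalDist()
    se_null = sqrt(p0 * (1 - p0) / n)          # standard error if Ho is true
    se_true = sqrt(p_true * (1 - p_true) / n)  # standard error if p_true holds
    # Smallest sample proportion that leads to rejecting Ho at level alpha
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * se_null
    # Beta: chance the sample proportion falls below the cutoff when p_true holds
    return std_normal.cdf((cutoff - p_true) / se_true)


alphas = [0.01, 0.05, 0.10]
# Hypothetical scenario: Ho says p = 0.50, the truth is p = 0.55, n = 200
betas = [type_ii_error(a, p0=0.50, p_true=0.55, n=200) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f}  ->  beta = {b:.3f}")
```

With the sample size held fixed, each increase in α lowers β in this scenario. Increasing the sample size is the way to reduce both at once.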
Cautions about Writing Conclusions
Because we can never be 100% certain that our conclusion in hypothesis testing is true, when your
p-value is greater than your significance level, AVOID using any of the following phrases:
• “We accept H◦ .”
Say instead:
Confidence intervals and hypothesis tests are closely related but ask slightly different questions.
Hypothesis test: “Are the data consistent with the parameter being one particular value or might the
parameter be something else?”
Even though they are designed to answer different questions, they are similar enough to lead us to reach
the same types of conclusions.
A confidence interval can lead us to the same type of conclusion as a two-sided hypothesis test.
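As an illustration of this link, the sketch below builds a 95% confidence interval for the city proportion from the CDC example in section 8.2 (98 overweight adults out of 150) and checks whether the national value 0.69 falls inside it. The variable names are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist

n, x = 150, 98     # sample size and count from the CDC example
p0 = 0.69          # value of p stated in the null hypothesis
confidence = 0.95  # matches a two-sided test at alpha = 0.05

p_hat = x / n
# Critical value z* that leaves (1 - confidence) / 2 in each tail
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
# The interval uses p_hat, not p0, in its standard error
se_hat = sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - z_star * se_hat
upper = p_hat + z_star * se_hat
contains_null = lower <= p0 <= upper

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print("0.69 is inside the interval" if contains_null else "0.69 is outside the interval")
```

The interval contains 0.69, which agrees with failing to reject H◦ in a two-sided test at the 5% level. Because the interval's standard error is based on p̂ while the test's is based on the null value, the two approaches can occasionally disagree in borderline cases.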
8.4: Comparing proportions from two populations
Example: In January 2014, the Gallup organization reported that 45% of Americans reported feeling
“pretty good” about the amount of money they had to spend. In January 2015, Gallup reported that
49% of Americans felt this way. Both samples had a sample size of 3500. Can we conclude that economic
confidence has improved since 2014 or could this difference be due to chance variation during the
sampling procedure?
First, let us label p1 and p2.
p1 :
p2 :
We are interested in the relationship between these two parameters. In comparing two population
proportions, the null hypothesis is
Ho : p1 = p2
the alternative hypothesis is one of these 3 possibilities:
a. Ha : p1 ≠ p2
b. Ha : p1 > p2
c. Ha : p1 < p2
Suppose that a simple random sample of size n1 is taken from a population where x1 of the individuals have
a specified characteristic, and a simple random sample of size n2 is independently taken from a different
population where x2 of the individuals have a specified characteristic. The sampling distribution of $\hat{p}_1 - \hat{p}_2$
is built from the two sample proportions
$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}$$
The best point estimate of p is called the pooled estimate of p, denoted by p̂, where
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
The standard error of the difference is
$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
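These formulas can be applied to the Gallup example as a sketch. It assumes p1 is the 2014 proportion and p2 the 2015 proportion (the labels are left blank in the notes), that the counts are exactly 45% and 49% of 3500, and that the alternative is Ha: p1 < p2 because the question asks about improvement. The function name `two_proportion_z` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return the z statistic for Ho: p1 = p2 using the pooled estimate."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled estimate of p
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Assumed counts: 45% of 3500 (2014) and 49% of 3500 (2015)
x1, n1 = 1575, 3500
x2, n2 = 1715, 3500

z = two_proportion_z(x1, n1, x2, n2)
# Left-tailed test (assumed Ha: p1 < p2): area to the left of z
p_val = NormalDist().cdf(z)

print(f"z = {z:.2f}, P-value = {p_val:.5f}")
```

Under these assumptions z is about −3.35 and the P-value is well below 0.05, which suggests the increase is unlikely to be due to chance variation alone.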
The sampling distribution of p̂1 − p̂2 is approximately normal by the central limit theorem if the following
conditions are met:
• Large samples: Both sample sizes must be large enough. We use p̂, the pooled sample proportion,
where
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
n1 × p̂ ≥ 10
n1 × (1 − p̂) ≥ 10
n2 × p̂ ≥ 10
n2 × (1 − p̂) ≥ 10
• Random Samples: If we are not told explicitly that the sample was randomly drawn we may have
to assume this condition is satisfied.
• Independent within Samples: The observations within each sample must be independent of one
another.
Example: Myth Busters, a popular television program on the Discovery Channel, once conducted an
experiment to investigate whether or not yawning is contagious. The premise of the experiment was to
invite a stranger to sit in a booth for an extended period of time. Fifty subjects were said to be tested in
total, of which 34 were “seeded” with a yawn by the person conducting the experiment. The other 16
were not given a yawn seed. Using a two-way mirror and a hidden camera, the experimenters observed
and recorded the results which are given below. Do the data provide statistical evidence that those
“seeded” with a yawn are more likely to actually yawn at α = 0.10?
Example: A popular British TV show called Goldenballs features a final round where two contestants
each make a decision to either split or steal the final jackpot. If both choose “split,” they share the prize,
but if one chooses “split” and the other picks “steal,” the whole prize goes to the player who steals. If
both choose “steal,” they both win nothing. Some researchers collected data from 287 episodes, each
with two participants, to give 574 “split” or “steal” decisions. Some results are displayed in the table
below broken down by the age of the participant. We use the data in the table to test if there is a
significant difference at a significance level of 5% in the proportions who choose “split” between younger
and older players.
Extra practice: An economist believes that the percentage of urban households with Internet access is
greater than the percentage of rural households with Internet access. He obtains a random sample of 800
urban households and finds that 338 of them have Internet access. He obtains a random sample of 750
rural households and finds that 292 of them have Internet access. Test the economist’s claim at the α =
0.05 level of significance.
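A worked sketch of this practice problem follows, assuming p1 is the urban proportion and p2 the rural proportion, so that the hypotheses are H◦: p1 = p2 and Ha: p1 > p2 (right-tailed). The large-sample conditions hold because n1 × p̂, n1 × (1 − p̂), n2 × p̂ and n2 × (1 − p̂) are all far above 10. Values are rounded.

```latex
\hat{p}_1 = \frac{338}{800} = 0.4225
\qquad
\hat{p}_2 = \frac{292}{750} \approx 0.3893
\qquad
\hat{p} = \frac{338 + 292}{800 + 750} = \frac{630}{1550} \approx 0.4065

\sigma_{\hat{p}_1 - \hat{p}_2}
  = \sqrt{0.4065\,(1 - 0.4065)\left(\frac{1}{800} + \frac{1}{750}\right)} \approx 0.0250

z = \frac{\hat{p}_1 - \hat{p}_2}{\sigma_{\hat{p}_1 - \hat{p}_2}}
  = \frac{0.4225 - 0.3893}{0.0250} \approx 1.33

\text{P-value} = P(Z \ge 1.33) \approx 0.092
```

Since 0.092 > 0.05, we fail to reject H◦: at the 5% level the samples do not provide sufficient evidence that urban households have a higher rate of Internet access than rural households.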