Mod 7 Study Guide
Parameter: State the parameter of interest (be sure to include the word “all”, “true”, or “population” in your description).
• Ex. We are estimating the true proportion of students who watch soccer in Florida OR We are estimating the population mean travel time to work for Florida residents.
Conditions: Be sure to check the appropriate conditions (it would be a “SIN” to forget them). Please see the conditions that need to be verified below on the right.
• To earn full credit, students must state and verify each of the conditions by substituting the correct values in the appropriate formulas.
Calculations: Students must either substitute the correct values into the appropriate formula OR name the test (see names below) and report the test statistic AND p-value.
One Population Mean
t-test for μ (by hand using formula, or using calculator)
• Calculator inputs: x-bar, Sx, n, and μ (null parameter value), OR Sample Data in L1.
• Note: if σ is known, perform a z-test.
Two Population Means
Hypothesis Test or Confidence Interval
• Note: we do NOT pool unless both population standard deviations are equal, σ1 = σ2 (very unlikely).
• Note: be sure to use the given t-table to determine the critical value (t*) using the confidence level and degrees of freedom.
AP Tip: When naming a test, do not use calculator terms (calculator speak). Instead, refer to the test by its full name.
• Using Calculator: t-test. Proper Name: t-test for μ.
• Using Calculator: 2-sample t-test. Proper Name: t-test for μ1 - μ2.
• Using Calculator: 2-sample t-interval. Proper Name: t-interval for μ1 - μ2.
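The by-hand calculations named above can be sketched in Python. This is only an illustration, not a substitute for showing work on paper: the function names and the summary statistics are hypothetical, and the formulas assumed are the standard one-sample t statistic and the unpooled two-sample t statistic with a null difference of 0.

```python
import math

def one_sample_t(x_bar, s_x, n, mu_0):
    """t statistic and degrees of freedom for a t-test for mu."""
    standard_error = s_x / math.sqrt(n)
    t = (x_bar - mu_0) / standard_error
    return t, n - 1

def two_sample_t(x_bar1, s1, n1, x_bar2, s2, n2):
    """Unpooled t statistic for a t-test for mu1 - mu2 (null difference 0)."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x_bar1 - x_bar2) / standard_error

# Hypothetical summary statistics, for illustration only.
t, df = one_sample_t(x_bar=52, s_x=6, n=36, mu_0=50)
print(f"t = {t:.2f}, df = {df}")
```

The degrees of freedom for the two-sample case are left to the calculator here, since the unpooled procedure does not use a simple n - 1 rule.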
Conclusion: when making a decision, we must do three things: 1) compare p-value to alpha level, 2) reject/fail to reject the null accordingly, and 3) state that we have sufficient/insufficient evidence to support the alternative hypothesis in the context of the problem.
• For example, “Since 0.0376 < 0.05, we reject the null. We have sufficient evidence to support the alternative hypothesis that the true mean of [in context].”
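The three-part conclusion can be written as a small template. The sketch below is a hypothetical helper (the name `conclusion` and its arguments are not from the guide); it assumes the convention used in the example above, rejecting only when the p-value is less than alpha.

```python
def conclusion(p_value, alpha, claim):
    """Return the three-part conclusion: compare, decide, state evidence in context."""
    if p_value < alpha:
        comparison = f"Since {p_value} < {alpha}"
        decision = "we reject the null"
        evidence = "sufficient"
    else:
        comparison = f"Since {p_value} >= {alpha}"
        decision = "we fail to reject the null"
        evidence = "insufficient"
    return (f"{comparison}, {decision}. We have {evidence} evidence "
            f"to support the alternative hypothesis that {claim}.")

print(conclusion(0.0376, 0.05, "the true mean of [in context]"))
```

Note that the "fail to reject" branch still talks about evidence for the alternative and never says the null is accepted.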
Note: In a matched pairs design, there is one sample with dependent (paired) observations. In a 2-sample t-test, there are two independent samples (and individuals are NOT paired).
Parameter: True population Slope, β, relating x and y in context.
Conditions: Same as 1-sample t-test (see the conditions below) using the differences as 1 sample.
Calculations: df = n - 1
• Using Calculator: t-test. Proper Name: paired t-test.
• Using Calculator: t-interval. Proper Name: paired t-interval.
Conclusion: Same as 1-sample t-test or t-interval. :)
Proportions (Categorical): when working with sample proportions we are collecting binary responses (yes/no) from each member.
Means (Quantitative): when working with sample means we are collecting a value with units from each member.
When there are two populations, each condition must be verified for each sample, respectively. Therefore, there should be two random samples, independence for each, and normality for each. Additionally, there is a fourth condition (please see the *Additional Condition below).
The following three conditions must be verified for either setting, and two of them are the same (the third, Normality, is where they differ):
1) Simple Random Sample or Randomized Experiment (representative of population)
2) Independence* - Sample must be less than 10% of population (since we are sampling without replacement). To verify: N > 10n. Also, see the *Additional Condition below.
3) Normality: Large Counts Condition (proportions) or Central Limit Theorem (means)
Normality Condition for Proportions: At least 10 “successes” and 10 “failures”. When n is large, we can use a Normal probability model to approximate binomial probabilities if the Large Counts Condition is met (be sure to substitute values to verify).
Normality Condition for Means: To verify, we must show that at least one of the following criteria has been met:
1. We are told the shape of the population distribution is normal/approximately normal OR
2. n ≥ 30 and state that the CLT applies OR
3. n < 30 and we verify the shape of the distribution of the sample is not strongly skewed and there are no outliers.
Calculator Tip: When estimating a population mean and the sample size is less than 30, use the calculator to create a boxplot to check for strong skewness and outliers.
*Additional Condition: When there are two populations, we must verify that the samples are independent of one another. This is either told in the scenario or must be reasonably inferred from the scenario.
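The numeric parts of the conditions above can be checked with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, `p` stands for whichever proportion the procedure calls for, and the Large Counts check assumes the usual form np ≥ 10 and n(1 - p) ≥ 10. The Random condition and the independence of two samples still have to be argued from the scenario.

```python
def ten_percent_condition(n, population_size):
    """Independence when sampling without replacement: N > 10n."""
    return population_size > 10 * n

def large_counts_condition(n, p):
    """Normality for proportions: at least 10 expected successes and failures."""
    return n * p >= 10 and n * (1 - p) >= 10

def normality_for_means(n, population_normal=False,
                        sample_not_skewed_no_outliers=False):
    """Return the criterion that justifies Normality for means, or None."""
    if population_normal:
        return "population distribution stated to be approximately normal"
    if n >= 30:
        return "n >= 30, so the CLT applies"
    if sample_not_skewed_no_outliers:
        return "n < 30, but a graph of the sample shows no strong skew or outliers"
    return None

# Hypothetical values, for illustration only.
print(ten_percent_condition(n=50, population_size=1000))
print(large_counts_condition(n=100, p=0.3))
print(normality_for_means(n=12))
```

A result of `None` from the last function means none of the three criteria has been shown, so a graph of the sample data is still needed.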
• Many students lose credit on the AP Statistics Exam when defining parameters because their description refers to the sample
instead of the population or because the description isn't clear about which group of individuals the parameter is describing. When
defining a parameter, we suggest including the word all or the word true in your description to make it clear that you aren't
referring to a sample statistic.
• Terminology matters. Never just say “the distribution.” Always say the “distribution of [blank]”, and be careful to distinguish the
distribution of the population, the distribution of sample data, and the distribution of a statistic. Likewise, don't use ambiguous
terms like “sample distribution”, which could refer to the distribution of the sample data or to the sampling distribution of a
statistic. You will lose credit on the free-response questions for misusing statistical terms.
• Notation matters. The symbols all have specific and different meanings. Either use notation correctly or don’t use it at all. You can
expect to lose credit if you use incorrect notation.
• The free response section almost always has a question that asks students to calculate a probability of some sort. Students should
always check the necessary conditions before calculating a probability even if the question doesn't specifically ask for the
conditions. Students will not be asked to perform a probability calculation in a context where the conditions have not been met.
There may, however, be a question that focuses on just the conditions. In this case, the conditions may not be met.
• The Random and Independence conditions are the same for sampling distributions that involve proportions and means. The only
condition that changes is the Normality condition. When working with proportions we must check the Large Counts Condition and
when working with means we must check the criteria for the Central Limit Theorem.
• If a free-response question asks you to complete a hypothesis test, you are expected to do the entire four-step process. That
includes clearly defining the parameter, identifying the procedure, and checking conditions. Yes, all three/four conditions. :)
• When your sample size is fewer than 30 observations AND the population shape is not given to be approximately normal, it is not
enough just to make a graph of the data on the calculator when assessing Normality. You must sketch the graph on your paper to
receive credit. You don't have to draw multiple graphs; any appropriate graph will do.
• There is almost always one free-response question that asks students to perform a significance test. Students will most likely be
asked if the data provide convincing evidence for the alternative hypothesis, rather than if the data provide convincing evidence
against the null hypothesis.
• When the p-value is not small, we fail to reject the null. Instead of failing to reject the null hypothesis, many students use language
that sounds like they accept the null hypothesis. Accepting the null hypothesis will always lose credit on the AP Statistics Exam.
• On the AP Statistics Exam, it is acceptable for students to use a confidence interval rather than the test statistic and p-value to
address a two-sided alternative hypothesis. However, if the alternative hypothesis is one-sided, students will lose credit for using
a confidence interval approach unless they explicitly address the imperfect link between the one-sided test and the confidence
level, for instance by adjusting the confidence level appropriately. Our recommendation for the AP Statistics Exam is to always
stick with a significance test.
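One common way to adjust the confidence level is to match the tail areas: a two-sided test at level alpha pairs with a (1 - alpha) interval, while a one-sided test at level alpha pairs with a (1 - 2·alpha) interval. The helper below is a hypothetical sketch of that rule, not a requirement stated by the guide.

```python
def matching_confidence_level(alpha, two_sided=True):
    """Confidence level whose interval lines up with a test at level alpha."""
    return 1 - alpha if two_sided else 1 - 2 * alpha

# A one-sided test at alpha = 0.05 lines up with a 90% interval.
print(matching_confidence_level(0.05, two_sided=False))
```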
• Many students lose credit when defining parameters in an experiment by describing the sample proportion rather than the true
proportion. For example, “the true proportion of the men who had surgery and survived 20 years” describes the sample statistic
and not the population parameter.
• When identifying the parameter of interest, it is essential to state which mean is μ1 and which is μ2. Because the test
looks at whether there is a statistically significant difference between the means, your alternative hypothesis cannot be interpreted
without knowing which population each symbol represents.
• For any two-sample hypothesis test or interval, you must check and state conditions for both samples. If you do not include the
work for both, you will not get credit for checking the conditions.
• Don’t overreact to minor issues in the graphs when checking the Normal and Equal SD conditions.
• We use a t-distribution because we are using an estimate of the standard deviation. Because this estimate is a variable, not a
constant, the shape of the distribution of the standardized test statistic is no longer Normal.
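A short simulation can make this visible. The sketch below assumes a Normal population with mean 0 and standard deviation 1 and a small sample size of 5; the function name, seed, and number of repetitions are arbitrary choices for illustration. It standardizes each sample mean twice, once with the known σ and once with the sample standard deviation s, and counts how often the result falls beyond ±1.96.

```python
import math
import random
import statistics

def tail_rates(n=5, reps=20000, cutoff=1.96, seed=1):
    """Share of standardized sample means beyond +/-cutoff, using sigma vs. using s."""
    rng = random.Random(seed)
    beyond_with_sigma = 0
    beyond_with_s = 0
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]  # population: mu = 0, sigma = 1
        x_bar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        z = x_bar / (1 / math.sqrt(n))  # standardized with the known sigma
        t = x_bar / (s / math.sqrt(n))  # standardized with the estimate s
        beyond_with_sigma += abs(z) > cutoff
        beyond_with_s += abs(t) > cutoff
    return beyond_with_sigma / reps, beyond_with_s / reps

rate_z, rate_t = tail_rates()
print(f"beyond 1.96 using sigma: {rate_z:.3f}; using s: {rate_t:.3f}")
```

The statistic built with s lands in the tails noticeably more often than 5% of the time, which is the heavier-tailed behavior the t-distribution accounts for.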
• Why perform a significance test for slope? When data from a random sample or randomized experiment suggest that a linear
association exists between two variables, there are two possible explanations for why the sample slope differs from zero. The first
explanation is that there really is no association between the variables and we got a non-zero slope due to sampling variability or
the chance variation due to random assignment. The second explanation is that there really is an association between the two
variables. We do a significance test to decide which explanation is more plausible.
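The test statistic for slope can be computed directly from the data. This sketch assumes the standard least-squares formulas with a null slope of 0 (t = b / SE_b with df = n - 2); the function name and the data values are hypothetical. The p-value is left to the calculator or computer output.

```python
import math

def slope_t_statistic(x, y):
    """Least-squares slope b, its standard error, t = b / SE_b, and df = n - 2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # sample slope
    a = y_bar - b * x_bar              # y-intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))       # standard deviation of the residuals
    se_b = s / math.sqrt(sxx)          # standard error of the slope
    return b, se_b, b / se_b, n - 2

# Hypothetical data, for illustration only.
b, se_b, t_slope, df_slope = slope_t_statistic([1, 2, 3, 4, 5],
                                               [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"b = {b:.2f}, SE = {se_b:.4f}, t = {t_slope:.2f}, df = {df_slope}")
```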
• If the conditions for inference aren't met, the stated confidence level and significance levels may not be correct. For example, if the
conditions aren’t met, “95%” confidence intervals for the slope may capture the true slope less than 95% of the time.
• Always reject a null hypothesis when the hypothesized value is not in the confidence interval (for a two-sided alternative, with the confidence level matching the significance level).
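As a minimal sketch of this decision rule (the function name and the interval endpoints are hypothetical), the check is simply whether the hypothesized value falls outside the interval:

```python
def reject_null_from_interval(lower, upper, null_value):
    """Reject the null when the hypothesized value is outside the interval."""
    return not (lower <= null_value <= upper)

# Hypothetical interval for a difference in means, with null value 0.
print(reject_null_from_interval(1.2, 3.4, 0))
```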
• Don’t forget to review how to read computer output for: Slope, y-intercept, r-squared, SE, t-score, and p-value AND how to
interpret each of them.
Please note the last line of the t-table contains the various confidence levels. When in need of a t-critical value (t*), see where the confidence level
column intersects the appropriate degrees of freedom row. For z*, use the df = infinity row (last row) and the confidence level.
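The z* values in the df = infinity row can be reproduced with Python's standard library, which may help when checking a table lookup. The function name `z_star` is a hypothetical helper; t* values for finite degrees of freedom still come from the table or calculator.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value that leaves (1 - C)/2 in each tail of the standard Normal."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```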