AP Stats Study Guide
HEY FOLKS. HERE’S THE ORG POST WITH SOME INFO ABOUT THE AP TEST AND MORE RESOURCES
IF YOU HAVE ANY QUESTIONS, OR WANT TO SEE THE NOTES I TOOK AND/OR EXAMPLES OF PROBLEMS, CONTACT ME
GOOD LUCK ON THE TEST. DRINK WATER AND SLEEP WELL.
Center (C): Would the mean or median be a better description? If there are outliers, use median.
Spread (S): Would the standard deviation or I.Q.R. be the better description? If there are outliers, use I.Q.R.
● [...] distribution is linear.
● Always provide numbers to support your claim whenever possible.
INTERPRETING
Item Variable Explanation Example
Y-intercept (a)
● Explanation: When the [explanatory variable] is zero [explanatory units], our model predicts that [response variable] will be [y-intercept value] [response units].
● Example: a ≈ 65.141. When the subject’s age is zero years, our model predicts that glucose level will be 65.141 units.
● NOTE: Is this an extrapolation in the data set?
Correlation Coefficient (r)
● Explanation: [Explanatory Variable] and [Response Variable] have a [strength] [direction] [linear/nonlinear] relationship.
● Example: SLOPE = Increasing; r ≈ 0.52. Subject’s age and glucose level have a positive moderate linear relationship.
NOTE:
● Strength
○ Weak: |r| = 0.0 - 0.3
○ Moderate: |r| = 0.4 - 0.6
○ Strong: |r| = 0.7/+
● Direction (Depends on Slope)
○ Positive = Increasing
○ Negative = Decreasing
SAMPLING STRATEGIES
TYPE NAME ABOUT
VALID Simple Random Sample (SRS) Every possible sample of a given size is equally likely to be chosen
VALID Stratified Random Sample Divide the population into strata, then perform an SRS in each stratum proportionally
VALID Cluster Sample Divide the population into clusters, then randomly select a cluster. All individuals in the cluster are sampled
VALID Systematic Random Sample Choose a random starting point, then take every nth person to be in the sample
INVALID Voluntary Response Sample Respondents choose themselves
INVALID Convenience Sample Respondents are easy to sample
INVALID Undercoverage Sample Missing members of the population
INVALID Nonresponse bias Respondents refuse to answer
INVALID Response bias Respondents are “led” to a particular answer or response
These are invalid methods of collecting a sample from a population because they may not represent the entire population accurately. The proportion may be under- or overestimated.
-------------------------
On the AP test, be able to explain in context why a certain strategy would be better than another, or explain correctly in context why an invalid strategy is flawed.
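The four valid strategies in the table can be sketched in Python. This is only a practice illustration: the population of 100 ID numbers, the two strata names, and the helper function names are all made up here, not part of the AP material.

```python
import random

def simple_random_sample(population, n, rng):
    # SRS: every possible group of n individuals is equally likely
    return rng.sample(population, n)

def stratified_sample(strata, fraction, rng):
    # SRS inside each stratum, sized proportionally to the stratum
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, rng):
    # Randomly pick one cluster and take everyone in it
    return list(rng.choice(clusters))

def systematic_sample(population, step, rng):
    # Random starting point, then every nth individual
    start = rng.randrange(step)
    return population[start::step]

rng = random.Random(0)  # fixed seed so the demo is repeatable
population = list(range(100))  # hypothetical ID numbers
strata = {"juniors": population[:60], "seniors": population[60:]}
clusters = [population[i:i + 10] for i in range(0, 100, 10)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 0.10, rng)
clus = cluster_sample(clusters, rng)
syst = systematic_sample(population, 10, rng)
print(srs, strat, clus, syst, sep="\n")
```

Notice that the stratified sample always contains 6 juniors and 4 seniors, while the SRS could by chance contain any mix.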
INFERENCES
Central Limit Theorem (C.L.T.): If a population distribution is not normal, the C.L.T. states that when n is large, the sampling distribution of x̄ is approximately normal.
IMPORTANT TIP: A lot of tests have the same structure/set-up to them (i.e., anything with proportions has the same conditions, etc.).
1 Proportion z Interval
DESCRIPTION
● Confidence level
● Context
CONDITIONS
Random: Data came from random sample
Normal:
● np̂ ≥ 10
● n(1-p̂) ≥ 10
Independent: Either
● Individual observations are independent when sampling without replacement
● Sample is less than 10% of its respective population (n ≤ 0.10N)
EQUATIONS
z* = invNorm[ (1 + C.L.) ÷ 2, 0, 1 ]
p̂ ± z*√{ [ p̂(1-p̂) ] ÷ n }
DEFINITIONS
p̂ = Point estimate
z* = Critical value
Standard Deviation =
● √{ [ p̂(1-p̂) ] ÷ n }
Margin of Error =
● z*√{ [ p̂(1-p̂) ] ÷ n }
CALCULATOR
STAT ⇒ TESTS ⇒ A: 1-PropZInt
CONCLUSION
I am __% confident that the population proportion of [context] is within the interval ( __ , __ ).
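To check a calculator answer for the 1 proportion z interval by hand, the equations above can be sketched in Python. The sample numbers (120 successes out of 200) and the function name are made up for practice; `NormalDist().inv_cdf` from the standard library plays the role of invNorm.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence):
    p_hat = successes / n                                # point estimate
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)      # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical data: n*p_hat = 120 and n*(1 - p_hat) = 80 are both >= 10
low, high = one_prop_z_interval(successes=120, n=200, confidence=0.95)
print(f"({low:.4f}, {high:.4f})")
```

With these made-up numbers the interval comes out to roughly (0.532, 0.668).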
1 Sample z Interval for Means
DESCRIPTION
● Confidence level
● Degrees of Freedom = n-1 (only needed for t)
● Context
NOTES
Identifying
● σ is known (rare)
CONDITIONS (z and t)
Random: Data came from random sample
Normal:
For z
● n ≥ 30
For t
● n ≥ 30
● n ≥ 15
○ Without strong skewness or outliers
● n < 15
○ Without strong skewness or outliers
○ Distribution is approx. normal
Independent: Either
● Individual observations are independent when sampling without replacement
● Sample is less than 10% of its respective population (n ≤ 0.10N)
EQUATIONS
z* = invNorm[ (1 + C.L.) ÷ 2, 0, 1 ]
x̄ ± z*( σx ÷ √n )
σx̄ = σx ÷ √n
Sigma x bar equals sigma x divided by the square root of n
CALCULATOR
STAT ⇒ TESTS ⇒ 7: ZInterval
CONCLUSION
I am __% confident that the population mean of [context] is within the interval ( __ , __ ).
1 Sample t Interval for Means
DESCRIPTION
● Confidence level
● Degrees of Freedom = n-1
● Context
NOTES
Identifying between z & t
● z test
○ σ is known (rare)
○ comes from the population
EQUATIONS
t* = invT[ (1 + C.L.) ÷ 2, d.f. ]
x̄ ± t*( Sx ÷ √n )
Standard error of x̄ = Sx ÷ √n
1 Proportion z Test
DESCRIPTION
● Significance Level
○ Use α=0.05 if none given
● Context
HYPOTHESIS
Ho: p = #
Ha: p ? #
● Replace ? with either ≠, >, or <
CONDITIONS
Random: Data came from random sample
Normal:
● np ≥ 10
● n(1-p) ≥ 10
Independent: Either
● Individual observations are independent when sampling without replacement
● Sample is less than 10% of its respective population (n ≤ 0.10N)
GRAPH & EQUATIONS
CALCULATOR
STAT ⇒ TESTS ⇒ 5: 1-PropZTest
P-VAL
DISTR (2ND VARS) ⇒ DISTR ⇒ 2: normalcdf(
● Shading left: Lower = -1 × 10^99
● Shading right: Upper = 1 × 10^99
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
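The guide leaves the test statistic to the calculator, so here is a small sketch of what 1-PropZTest and normalcdf are doing, assuming the standard formula z = (p̂ - p) ÷ √{ [ p(1-p) ] ÷ n } with p taken from Ho. The data (60 successes out of 100, Ho: p = 0.5) and the function name are made up for practice.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative):
    # alternative is one of "<", ">", or "!=" (the ? in Ha)
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    left = NormalDist().cdf(z)           # area shaded to the left of z
    if alternative == "<":
        p_val = left
    elif alternative == ">":
        p_val = 1 - left                 # area shaded to the right of z
    else:
        p_val = 2 * min(left, 1 - left)  # two-sided: double the smaller tail
    return z, p_val

# Hypothetical data: Ho: p = 0.5, Ha: p > 0.5
z, p_val = one_prop_z_test(successes=60, n=100, p0=0.5, alternative=">")
alpha = 0.05
decision = "reject" if p_val < alpha else "fail to reject"
print(f"z = {z:.2f}, p-val = {p_val:.4f}, {decision} Ho")
```

Here z is 2 and the p-val is about 0.023, which is less than α = 0.05, so the conclusion sentence would start with "I reject the Ho".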
1 Sample z Test / 1 Sample t Test
DESCRIPTION
● Significance Level
○ Use α=0.05 if none given
● For t: Degrees of Freedom = n-1
● Context
HYPOTHESIS
Ho: μ = #
Ha: μ ? #
● Replace ? with either ≠, >, or <
NOTES
Identifying between z & t
● z test
○ σ is known (rare)
○ comes from the population
CONDITIONS
Random: Data came from random sample
Normal:
For z
● n ≥ 30
For t
● n ≥ 30
● n ≥ 15
○ Without strong skewness or outliers
● n < 15
○ Without strong skewness or outliers
○ Distribution is approx. normal
Independent: Either
● Individual observations are independent when sampling without replacement
● Sample is less than 10% of its respective population (n ≤ 0.10N)
GRAPH & EQUATIONS
CALCULATOR
Choose the appropriate one
STAT ⇒ TESTS ⇒ 1: Z-Test
STAT ⇒ TESTS ⇒ 2: T-Test
P-VAL
DISTR (2ND VARS) ⇒ DISTR ⇒ 2: normalcdf(
● Shading left: Lower = -1 × 10^99
● Shading right: Upper = 1 × 10^99
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
Matched Pairs t Test
DESCRIPTION
● Significance Level
○ Use α=0.05 if none given
● Degrees of Freedom = n-1
● Context
HYPOTHESIS
Ho: μD = #
Ha: μD ? #
● Replace ? with either ≠, >, or <
*NOTE: BE SURE TO ADD SUBSCRIPT D ON MU, X BAR, AND S
NOTES
Identifying
● Two extremely similar subjects
● Before/after on the same subjects
CONDITIONS
Random: Data came from random sample
Normal:
● n ≥ 30
● n ≥ 15
○ Without strong skewness or outliers
● n < 15
○ Without strong skewness or outliers
○ Distribution is approx. normal
Independent: Either
● Individual observations are independent when sampling without replacement
● Sample is less than 10% of its respective population (n ≤ 0.10N)
GRAPH & EQUATIONS
CALCULATOR
STAT ⇒ TESTS ⇒ 2: T-Test
P-VAL
DISTR (2ND VARS) ⇒ DISTR ⇒ 2: normalcdf(
● Shading left: Lower = -1 × 10^99
● Shading right: Upper = 1 × 10^99
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
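The key move in a matched pairs test is to turn the two lists into ONE list of differences and then run a 1 sample t test on it. A minimal sketch, assuming the usual statistic t = (x̄D - #) ÷ (SD ÷ √n): the before/after readings below are invented, and Python's standard library has no t distribution, so the p-val would still come from T-Test on the calculator.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements on the same five subjects
before = [140, 152, 138, 160, 147]
after = [135, 150, 136, 152, 145]

diffs = [b - a for b, a in zip(before, after)]  # one difference per subject
x_bar_d = mean(diffs)                           # x bar with subscript D
s_d = stdev(diffs)                              # S with subscript D (sample SD)
n = len(diffs)

mu_d_null = 0                                   # Ho: mu_D = 0
t_stat = (x_bar_d - mu_d_null) / (s_d / sqrt(n))
df = n - 1
print(f"x̄D = {x_bar_d}, SD = {s_d:.3f}, t = {t_stat:.3f}, d.f. = {df}")
```

Because n = 5 is under 15 here, the Normal condition would require the differences to look approximately normal with no strong skewness or outliers.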
2 Proportion z Interval
DESCRIPTION
● Confidence Level
● Use “difference”
● Context
PARAMETER
Let p1 = population proportion of …
p2 = population proportion of …
CONDITIONS
Random: Data came from random sample
Normal:
● n1p̂1 ≥ 10
● n1(1-p̂1) ≥ 10
● n2p̂2 ≥ 10
● n2(1-p̂2) ≥ 10
“Both sample sizes are sufficiently large.”
Independent: Either
● Both samples are independent
● Both samples are less than 10% of their respective populations
EQUATIONS
(p̂1 - p̂2) ± z*√{ [ p̂1(1-p̂1) ] ÷ n1 + [ p̂2(1-p̂2) ] ÷ n2 }
Margin of error is everything after the ± symbol
CALCULATOR
STAT ⇒ TESTS ⇒ B: 2-PropZInt
NOTES
Make sure to use “difference”
CONCLUSION
I am __% confident that the difference in population proportion of [context] is within the interval ( __ , __ ).
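The 2 proportion z interval, (p̂1 - p̂2) ± z*√{ [ p̂1(1-p̂1) ] ÷ n1 + [ p̂2(1-p̂2) ] ÷ n2 }, can be sketched in Python as a check on 2-PropZInt. The counts (45 of 100 and 30 of 100) and the function name are made up for practice.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence):
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    # Each sample contributes its own p_hat(1 - p_hat)/n term (no pooling)
    std_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    margin = z_star * std_error  # everything after the ± symbol
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

# Hypothetical counts; all four Normal checks (45, 55, 30, 70) are >= 10
low, high = two_prop_z_interval(x1=45, n1=100, x2=30, n2=100, confidence=0.95)
print(f"({low:.4f}, {high:.4f})")
```

The interval here is roughly (0.017, 0.283). Since it does not contain 0, it would be consistent with a real "difference" between the two population proportions.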
2 Proportion z Test
DESCRIPTION
● Significance Level
○ Use α=0.05 if none given
● Use “difference”
● Context
PARAMETER
Let p1 = population proportion of …
p2 = population proportion of …
HYPOTHESIS
Ho: p1 - p2 = 0
Ha: p1 - p2 ? 0
● Replace ? with either ≠, >, or <
CONDITIONS
Random: Data came from random sample
Normal:
● n1p̂1 ≥ 10
● n1(1-p̂1) ≥ 10
● n2p̂2 ≥ 10
● n2(1-p̂2) ≥ 10
“Both sample sizes are sufficiently large.”
Independent: Either
● Both samples are independent
● Both samples are less than 10% of their respective populations
GRAPH & EQUATIONS
CALCULATOR
STAT ⇒ TESTS ⇒ 6: 2-PropZTest
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
2 Sample z Interval for Means
DESCRIPTION
● Confidence level
● For t: Degrees of Freedom = use d.f. on calculator
● Context
CONDITIONS
Random: Data came from random sample
Normal: (do for both n1 and n2)
● n ≥ 30
● n ≥ 15
○ Without strong skewness or outliers
EQUATION
FOR z
(x̄1 - x̄2) ± z*√( σ1² ÷ n1 + σ2² ÷ n2 )
CONCLUSION
I am __% confident that the population mean difference of [context] is within the interval ( __ , __ ).
2 Sample z Test for Means / 2 Sample t Test for Means
DESCRIPTION
● Significance Level
○ Use α=0.05 if none given
● For t: Degrees of Freedom = use d.f. on calculator
● Context
PARAMETER
Let μ1 = population mean __ of …
μ2 = population mean __ of …
HYPOTHESIS
Ho: μ1 - μ2 = 0
Ha: μ1 - μ2 ? 0
● Replace ? with either ≠, >, or <
NOTES
Identifying between z & t
● z test
○ σ is known (rare)
CONDITIONS
Random: Data came from random sample
Normal: (do for both n1 and n2)
● n ≥ 30
● n ≥ 15
○ Without strong skewness or outliers
● n < 15
○ Without strong skewness or outliers
○ Distribution is approx. normal
Independent: Either
● Individual observations are independent when sampling without replacement
● Sample is less than 10% of its respective population (n ≤ 0.10N)
GRAPH & EQUATION
FOR z
z = [ (x̄1 - x̄2) - (μ1 - μ2) ] ÷ √( σ1² ÷ n1 + σ2² ÷ n2 )
FOR t
t = [ (x̄1 - x̄2) - (μ1 - μ2) ] ÷ √( S1² ÷ n1 + S2² ÷ n2 )
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
EXAMPLE
I fail to reject the Ho because the p-val, 0.08, is greater than α = 0.05. There is insufficient evidence to suggest that there is a difference in mean response time between the Northern and Southern fire stations.
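The 2 sample z statistic, z = [ (x̄1 - x̄2) - (μ1 - μ2) ] ÷ √( σ1² ÷ n1 + σ2² ÷ n2 ), can be worked through in Python. The summary numbers below are invented for practice and are NOT the fire station data behind the guide's example; the function name is also just for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(x_bar1, x_bar2, sigma1, sigma2, n1, n2, null_diff=0.0):
    # Standard deviation of (x_bar1 - x_bar2) when both sigmas are known
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((x_bar1 - x_bar2) - null_diff) / std_error
    left = NormalDist().cdf(z)
    p_val = 2 * min(left, 1 - left)  # two-sided, Ha uses ≠
    return z, p_val

# Hypothetical summary statistics; both samples have n >= 30
z, p_val = two_sample_z_test(x_bar1=8.2, x_bar2=7.5,
                             sigma1=2.0, sigma2=2.4, n1=40, n2=36)
alpha = 0.05
decision = "reject" if p_val < alpha else "fail to reject"
print(f"z = {z:.3f}, p-val = {p_val:.3f}, {decision} Ho")
```

With these numbers z is about 1.37 and the p-val is about 0.17, which is greater than α = 0.05, so the conclusion would be "I fail to reject the Ho".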
Chi Squared Goodness of Fit Test
DESCRIPTION
● Significance Level
○ Use α=0.05 if none stated
● Degrees of Freedom = k - 1
○ k = # of categories
● Context
HYPOTHESIS
Ho: p1 = __, p2 = __, p3 = __, ...
Ha: At least one pi is incorrect
CONDITIONS
Random, Normal, 10% condition when necessary
Expected Counts: All expected counts exceed 5
● E[x] = np
● Always in parentheses
● Must show expected counts in the answers
Independence: Sample is less than 10% of its population
GRAPH & EQUATIONS
χ² = ∑ [ (Observed - Expected)² ÷ Expected ]
CALCULATOR
STAT ⇒ TESTS ⇒ D: χ²GOF-Test
TIP: Press draw to see what the graph should look like
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
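The expected counts E[x] = np and the statistic χ² = ∑ [ (Observed - Expected)² ÷ Expected ] are easy to verify by hand or in Python. In this sketch the observed counts and the claimed proportions are made up, and the p-val is left to the calculator because Python's standard library has no chi-squared distribution.

```python
# Hypothetical observed counts for k = 4 categories
observed = [18, 22, 20, 40]
claimed_props = [0.25, 0.25, 0.25, 0.25]  # Ho: p1 = p2 = p3 = p4 = 0.25

n = sum(observed)
expected = [n * p for p in claimed_props]  # E[x] = np for each category

# Condition check: all expected counts exceed 5
assert all(e > 5 for e in expected)

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                     # k - 1

print(f"expected = {expected}, chi-squared = {chi_sq:.2f}, d.f. = {df}")
```

Every expected count here is 25, the statistic is 12.32, and d.f. = 3.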
Chi Squared Test for Homogeneity / Chi Squared Test for Association/Independence
DESCRIPTION
● Significance Level
○ Use α=0.05 if none stated
● Degrees of Freedom = (# of Rows - 1)(# of Columns - 1)
● Context
HYPOTHESIS
For Homogeneity
Ho: There is no difference in __ when __
For Independence
Ho: There is no association between __ and __
Ha: There is an association between __ and __
NOTES
Identifying
● Homogeneity
○ Multiple samples and treatment groups
● Independence
○ Cut into groups
CONDITIONS
Random, Normal, 10% condition when necessary
Expected Counts: All expected counts exceed 5
● E[x] = ( Row Total ∗ Column Total ) ÷ Grand Total
● Always in parentheses
● Must show expected counts in the answers
Independence: Sample is less than 10% of its population
GRAPH & EQUATIONS
χ² = ∑ [ (Observed - Expected)² ÷ Expected ]
CALCULATOR
MATRIX (2nd x⁻¹) ⇒ EDIT ⇒ 1: [A]
● Adjust dimensions accordingly
● Insert in observed count values
STAT ⇒ TESTS ⇒ C: χ²-Test
● Observed: [A]
● Expected: [B]
MATRIX (2nd x⁻¹) ⇒ EDIT ⇒ 2: [B]
● Get expected count values
CONCLUSION
I reject/fail to reject the Ho because the p-val, [value], is less/greater than α = [value]. There is sufficient/insufficient evidence to suggest [Ha in words].
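What the calculator puts in matrix [B] can be reproduced with E[x] = ( Row Total ∗ Column Total ) ÷ Grand Total. A minimal sketch with a made-up 2 × 2 table of observed counts; it also uses the standard degrees of freedom for a two-way table, (# of Rows - 1)(# of Columns - 1).

```python
# Hypothetical observed counts, playing the role of matrix [A]
observed = [
    [20, 30],
    [30, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts, playing the role of matrix [B]
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"expected = {expected}, chi-squared = {chi_sq:.2f}, d.f. = {df}")
```

Each expected count is 25 in this table, so the statistic is 4.00 with d.f. = 1.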
PROBABILITY RULES
NAME ABOUT/EQUATION
Probability of Event A: P(A) = (# of outcomes corresponding to A) ÷ (Total # of outcomes in the sample space)
Sample Space (S):
● P(S) = 1
P(A|B): Reads “the probability that event A happens, given that event B happens”
● | symbol means “given”
Rule of addition: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
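These rules can be checked on a tiny sample space. The sketch below rolls one fair die, with A = "even" and B = "greater than 3" chosen just for illustration. It also uses the standard formula P(A|B) = P(A ∩ B) ÷ P(B), which the table above describes in words only.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # event A: roll is even
B = {4, 5, 6}                      # event B: roll is greater than 3

def prob(event):
    # (# of outcomes in the event) ÷ (total # of outcomes in the sample space)
    return Fraction(len(event & sample_space), len(sample_space))

p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(A & B)            # P(A ∩ B)
p_a_or_b = p_a + p_b - p_a_and_b   # rule of addition
p_a_given_b = p_a_and_b / p_b      # P(A|B)

print(prob(sample_space), p_a_or_b, p_a_given_b)
```

The rule of addition gives 1/2 + 1/2 - 1/3 = 2/3, which matches counting the outcomes in A ∪ B = {2, 4, 5, 6} directly.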
RANDOM VARIABLES
IDENTIFYING
B Binary B Binary
I Independent I Independent
MORE
● EX: 3! = 3 Factorial = 3 ∗ 2 ∗ 1 = 6
● EX: 5! = 5 Factorial = 5 ∗ 4 ∗ 3 ∗2 ∗ 1 = 120