AP Stats Module 2 Notes
Explanatory variable: variable used to explain or predict changes in the values of another variable,
AKA independent variable (x).
Response variable: variable that measures the outcome (prediction); it responds to the explanatory variable,
AKA dependent variable (y).
y - intercept:
Candy Slope interpretation: On average, for each additional [x in
context], the predicted [y in context] changes by [a units].
Slope:
*Always report four decimal places when possible.
Tip: Always show all work when calculating a residual and include units.
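For reference, the least-squares regression line (LSRL) and its coefficients can be written as below. This is a sketch in one common notation, which is an assumption because these notes do not fix their own symbols: a is the y-intercept, b is the slope, r is the correlation coefficient, and s_x and s_y are the sample standard deviations of x and y. Check the symbols against the AP Statistics Formula Sheet before using them.

```latex
\hat{y} = a + bx, \qquad b = r\,\frac{s_y}{s_x}, \qquad a = \bar{y} - b\,\bar{x}
```

Note that a here means the y-intercept, not the placeholder "[a units]" used in the slope interpretation template.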
Residual values
Residual = Actual - Predicted value. Think AP Statistics.
To interpret: The residual represents how much our model either over/underestimated the actual value to be.
Be careful, a positive residual is an UNDERestimate and a negative residual is an OVERestimate. This has to do with where the point is relative to the LSRL.
Residual Plots: Used to determine whether the current linear model is appropriate. The x-axis usually plots the x-variable and the y-axis is usually the residuals.
Random Scatter is GOOD! It means that the current linear model is appropriate.
Visible pattern is BAD! It means that another model could be better!
R-Squared: Coefficient of Determination
Helps to determine whether a linear model is appropriate (after checking that the residual plot shows no visible pattern). The closer to 1 r-squared is, the more appropriate the linear model.
To interpret: r-squared is the percent of variation in [y] that can be accounted for by the LSRL relating [y in context] to [x in context].
When reading computer output, we NEVER report r-squared adj (adjusted).
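The residual and r-squared ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data (hours studied vs. test score) are hypothetical, and the helper names lsrl, residuals, and describe are made up for this example. The slope is computed as r times s_y over s_x, and each residual is Actual - Predicted.

```python
from statistics import mean, stdev


def lsrl(xs, ys):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    n = len(xs)
    # r is the sum of products of z-scores divided by n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x          # slope
    a = y_bar - b * x_bar      # y-intercept
    return a, b, r


def residuals(xs, ys, a, b):
    """Residual = Actual - Predicted for every data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def describe(residual, tol=1e-9):
    """Positive residual: model UNDERestimated; negative: OVERestimated."""
    if residual > tol:
        return "underestimate"
    if residual < -tol:
        return "overestimate"
    return "exact"


# Hypothetical data: hours studied (x) and test score in points (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

a, b, r = lsrl(hours, scores)
res = residuals(hours, scores, a, b)
r_squared = r ** 2

print(f"predicted score = {a:.4f} + {b:.4f} * hours")
print(f"r-squared = {r_squared:.4f}")
for x, y, e in zip(hours, scores, res):
    print(f"hours={x}, actual={y} points, residual={e:.4f} points ({describe(e)})")
```

Plotting hours against the residuals would give the residual plot; random scatter there supports the linear model, while a visible pattern suggests another model could be better.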
For Power (x^2, x^3, x^(1/2), or x^n in general): Graph log(x) vs. log(y) (see top graph on right)
For Exponential (2^x, e^x, (1/4)^x, or a^x in general): Graph x vs. log(y) (see bottom graph on right)
To find log(x), find log(L1) and store in L3. To find log(y), find log(L2) and store in L4.
After transforming the original association, check to see if linearity was achieved by:
1. Checking the new x vs. y in a scatterplot
Correlation
Correlation* coefficient (r) - measures both direction (+/-) and strength (closer to -1 or 1 stronger, closer to 0 weaker).
*Correlation does NOT imply causation!
[Figure: Examples of correlation]
• Round to four decimal places!
• Be sure to include units for both your x- and y-variable.
• When interpreting slope and y-intercept, the use of the word “predicted” or “estimated” is mandatory. Otherwise, it seems as though y is the actual value.
• Please remember that your variables (x and y-hat) must be defined. Be sure to include the context (what do x and y-hat stand for). We should always write 'where x represents ___ and y-hat represents the predicted ____.' Also, when defining variables, there is a BIG difference between saying x is the hand span versus x is the length of arm span.
• Always write your answers in the context of the problem when interpreting
slope, y-intercept, r, r-squared, residuals, etc.
• The sign of the residual is opposite to what one would believe. A negative
residual is an overestimate and a positive residual is an underestimate. Think
about the order of the subtraction! :)
• Remember that correlation is a measure of association, not causation.
• When you are asked to describe the association shown in a scatterplot, you are expected to discuss the direction, form, and strength
of the association, along with any unusual features, in the context of the problem. This means that you need to use both variable
names in your description.
• Correlation: It is only appropriate to use correlation to describe the strength and direction of linear relationships.
• Don’t make predictions using values of x that are much larger or much smaller than those that actually appear in your data (known as
extrapolation, see 02.02 Page 4 of 8).
• When asked to interpret the slope or y intercept, it is very important to include the word predicted in your response. Otherwise, it
might appear that you believe the regression equation provides actual values of y.
• Remember that slope is change in y over change in x. The slope formula b = r(Sy/Sx) is still consistent with this and can be found on the AP Statistics Formula Sheet.
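The transformation advice earlier in these notes (graph log(x) vs. log(y) for a power model, x vs. log(y) for an exponential model) can be tried out with a short Python sketch. Everything here is an assumption for illustration: the data are generated from made-up models (y = 3x^2 and y = 2(1.5)^x), and the helper name correlation is hypothetical. The sketch uses r as a quick numerical check only; the notes' own check is to look at the new scatterplot, and r alone does not prove linearity.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data generated from known models (no noise).
x = [1, 2, 3, 4, 5, 6]
y_power = [3 * v ** 2 for v in x]      # power model: y = 3x^2
y_exp = [2 * 1.5 ** v for v in x]      # exponential model: y = 2(1.5)^x

# Same idea as storing log(L1) in L3 and log(L2) in L4 on the calculator.
log_x = [math.log10(v) for v in x]
log_y_power = [math.log10(v) for v in y_power]
log_y_exp = [math.log10(v) for v in y_exp]

r_power_raw = correlation(x, y_power)               # curved, so |r| < 1
r_power_loglog = correlation(log_x, log_y_power)    # log(x) vs. log(y)
r_exp_semilog = correlation(x, log_y_exp)           # x vs. log(y)

print(f"power data, untransformed: r = {r_power_raw:.4f}")
print(f"power data, log-log:       r = {r_power_loglog:.4f}")
print(f"exponential data, x vs log(y): r = {r_exp_semilog:.4f}")
```

With real data the transformed points will not fall exactly on a line, so the transformed scatterplot should still be inspected for a leftover pattern.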