ES 209 Lecture Notes Week 14
ES 209 Lecture Notes Week 14
Linear Regression
Analysis of Variance (ANOVA)
Statistical Hypotheses: General
Concepts
• Example : A certain type of cold vaccine is known to be only 25% effective after a
period of 2 years. To determine if a new and more expensive vaccine is superior
in providing protection against the same virus for a longer period of time,
suppose that 20 people are chosen at random and inoculated. If more than 8 of
those receiving the new vaccine surpass the 2-year period without contracting
the virus, the new vaccine will be considered superior to the one presently in use.
• We are testing the null hypothesis that the new vaccine is equally effective after a
period of 2 years as the one now commonly used. The alternative hypothesis is
that the new vaccine is superior. This is equivalent to testing the hypothesis that
the binomial parameter for the probability of a success on a given trial is p = 1/4
against the alternative that p > 1/4. This is usually written as follows:
• Example : A certain type of cold vaccine is known to be only 25% effective after a
period of 2 years. To determine if a new and more expensive vaccine is superior
in providing protection against the same virus for a longer period of time,
suppose that 20 people are chosen at random and inoculated. If more than 8 of
those receiving the new vaccine surpass the 2-year period without contracting
the virus, the new vaccine will be considered superior to the one presently in use.
• The critical value is the number 8. Therefore, if x > 8, we reject H0 in favor of the
alternative hypothesis H1. If x ≤ 8, we fail to reject H0.
Testing a Statistical Hypothesis
or perhaps
Example: A procedure for estimating the tar content for various levels of the
inlet temperature from experimental information. It is likely that for many
example runs in which the inlet temperature is the same, say 130◦C, the outlet
tar content will not be the same.
ü Tar content, gas mileage (mpg), and the price of houses (in thousands of
dollars) are natural dependent variables, or responses.
ü Inlet temperature, engine volume (cubic feet), and square feet of living
space are, respectively, natural independent variables, or regressors.
ü A reasonable form of a relationship between the response Y and the
regressor x is the linear relationship
Y = β0 + β1x,
where, of course, β0 is the intercept and β1 is the slope.
ü If the relationship is exact, then it is a deterministic relationship between
two scientific variables and there is no random or probabilistic component
to it.
Introduction to Linear Regression
• The concept of regression analysis deals with finding the best relationship between Y
and x, quantifying the strength of that relationship, and using methods that allow for
prediction of the response values given values of the regressor x.
Introduction to Linear Regression