Stats135 Reviewer
★ the simple linear regression model is given by

    Y = β₀ + β₁X + ε

    ○ where β₀ and β₁ are constants called the regression coefficients or parameters (also regression beta coefficients)
    ○ β₁ is the slope of the regression line
    ○ ε is a random disturbance or error (also known as the residual error)
★ the observations on the dependent variable Y
    ○ assumed to be random observations from populations of random variables, with the mean of each population given by the expected value E[Y]
    ○ the deviation of an observation Y from its population mean is taken into account by adding a random error ε
    ○ the random error is given by ε = Y − E[Y]
★ the random errors εᵢ have zero mean
★ the random errors are assumed to have a common variance (homoscedastic) and to be pairwise independent
★ the random errors are assumed to be normally distributed
    ○ implies that the Y are also normally distributed
★ these assumptions are frequently stated as εᵢ ~ iid N(0, σ²)
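To make these assumptions concrete, here is a small NumPy sketch that simulates data from the model with εᵢ ~ iid N(0, σ²); the parameter values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative true parameters (made up for this example)
beta0, beta1, sigma = 1.0, 2.0, 0.5

# each y is its population mean beta0 + beta1*x plus an iid N(0, sigma^2) error
x = np.linspace(1.0, 5.0, 20)
y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
```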
parameter estimation

★ we estimate the parameters using the least squares method
    ○ gives the line that minimizes the sum of squares of the vertical distances from each point to the line
    ○ these errors can be obtained as εᵢ = yᵢ − (β₀ + β₁xᵢ)
★ the sum of the squares of these distances can then be written as

    S(β₀, β₁) = Σ εᵢ² = Σ (yᵢ − β₀ − β₁xᵢ)²

★ the values of β₀ and β₁ that minimize the equation above are given by

    B₁ = Σ (xᵢ − x̄)(yᵢ − ȳ) / Σ (xᵢ − x̄)²   and   B₀ = ȳ − B₁x̄

★ the estimates B₀ and B₁ are called least squares estimates of β₀ and β₁ because they are the solution to the least squares method
    ○ the intercept and the slope of the line that has the smallest possible sum of squares of the vertical distances from each point to the line
    ○ for this reason, the line is called the least squares regression line
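Continuing the simulation sketch above, a minimal implementation of the closed-form estimates (variable names are illustrative):

```python
x_bar, y_bar = x.mean(), y.mean()

# least squares estimates from the closed-form formulas above
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)  # should land near the true values 1.0 and 2.0
```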
★ fitted values
    ○ for each observation in our data, we can compute the fitted value ŷᵢ = B₀ + B₁xᵢ
★ the vertical distances eᵢ = yᵢ − ŷᵢ are called ordinary least squares residuals
    ○ property: their sum is equal to zero, Σ eᵢ = 0
    ○ means that the sum of the distances above the line is equal to the sum of the distances below the line
★ ex.
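For example, continuing the same sketch, the fitted values and residuals can be computed directly and the zero-sum property checked numerically (up to floating-point error):

```python
# fitted values on the estimated line
y_hat = beta0_hat + beta1_hat * x

# ordinary least squares residuals: vertical distances to the line
e = y - y_hat

print(np.sum(e))  # ~0.0: distances above and below the line cancel
```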
➔ after we compute the least squares estimates of the parameters of a linear model, let us compute the following quantities:

    SST = Σ (yᵢ − ȳ)²   SSR = Σ (ŷᵢ − ȳ)²   SSE = Σ (yᵢ − ŷᵢ)²

➔ a fundamental equality, in both SLR and MLR, is given by

    SST = SSR + SSE

➔ this arises from the description of an observation as yᵢ = ŷᵢ + eᵢ
➔ subtracting ȳ from both sides, we obtain yᵢ − ȳ = (ŷᵢ − ȳ) + eᵢ
➔ the ratio R² = SSR/SST can be interpreted as the proportion of the total variation in Y that is accounted for by the predictor variable X
    ◆ R² is called the goodness-of-fit index
    ◆ we can rewrite R² as R² = 1 − SSE/SST
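Continuing the sketch, the decomposition and the two equivalent forms of R² can be verified on the simulated data:

```python
# total, regression, and error sums of squares
sst = np.sum((y - y_bar) ** 2)
ssr = np.sum((y_hat - y_bar) ** 2)
sse = np.sum((y - y_hat) ** 2)

print(np.isclose(sst, ssr + sse))  # True: SST = SSR + SSE
print(ssr / sst, 1.0 - sse / sst)  # the two equivalent forms of R^2
```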
test of hypotheses

➔ hypotheses for the constant β₀:

    H₀: β₀ = 0   vs   H₁: β₀ ≠ 0

➔ hypotheses for the slope β₁:

    H₀: β₁ = 0   vs   H₁: β₁ ≠ 0
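These are usually carried out as t-tests; the test statistics aren't shown in this part of the notes, so the following is a sketch using the standard SLR standard-error formulas and two-sided p-values from a t distribution with n − 2 degrees of freedom:

```python
from scipy import stats

n = x.size
s2 = sse / (n - 2)                                  # estimated error variance
sxx = np.sum((x - x_bar) ** 2)

se_b1 = np.sqrt(s2 / sxx)                           # standard error of beta1_hat
se_b0 = np.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))  # standard error of beta0_hat

t_b0 = beta0_hat / se_b0                            # tests H0: beta0 = 0
t_b1 = beta1_hat / se_b1                            # tests H0: beta1 = 0

# two-sided p-values with n - 2 degrees of freedom
p_b0 = 2 * stats.t.sf(abs(t_b0), df=n - 2)
p_b1 = 2 * stats.t.sf(abs(t_b1), df=n - 2)
print(p_b0, p_b1)
```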