Week 3-4
Regression
Population Model
• Cross-sectional analysis
• Assume that the sample is collected randomly from the population.
• We want to know how y varies with changes in x.
• What if y is affected by factors other than x?
• What is the functional form?
• How can we distinguish causality from correlation?
• Consider the following model, which holds in the population:

y = β0 + β1x + u
Population Model
• We allow for other factors to affect y by including u (error term).
• If the other factors in u are held fixed, ∆u = 0, then x has a linear effect on y: ∆y = β1∆x.
• Linearity: a one-unit change in x has the same effect on y, regardless of the starting value of x.
[Figure: the conditional density f(y) at x1 and x2, each centered on the regression line E(y|x) = β0 + β1x.]
Ordinary Least Squares
● Basic idea of regression is to estimate the population parameters from a sample.
● Let {(xi, yi): i = 1, …, n} denote a random sample of size n from the population.
● For each observation in this sample, it will be the case that yi = β0 + β1xi + ui.
● ui is unobserved.
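A minimal Python sketch of this setup (numpy assumed; the parameter values and distributions are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical population parameters, chosen only for illustration
beta0, beta1 = 1.0, 2.0
n = 100

x = rng.normal(5.0, 2.0, n)    # x drawn from the population
u = rng.normal(0.0, 1.0, n)    # unobserved error with E(u) = 0
y = beta0 + beta1 * x + u      # each draw satisfies yi = β0 + β1·xi + ui
```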
To calculate the estimates of the coefficients that minimize the differences between the data points and the line, use the formulas:

b1 = cov(X, Y) / sx²
b0 = ȳ − b1x̄

The regression equation that estimates the equation of the first-order linear model is:

ŷ = b0 + b1x
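As a sketch, the two formulas map directly onto code (numpy assumed; the function name is mine):

```python
import numpy as np

def ols_fit(x, y):
    """b1 = cov(X, Y) / sx^2,  b0 = ybar - b1 * xbar."""
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # sample cov / sample var
    b0 = y.mean() - b1 * x.mean()
    return b0, b1
```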
Example 17.1: Relationship between odometer reading and a used car's selling price

• A car dealer wants to find the relationship between the odometer reading (independent variable x) and the selling price (dependent variable y) of used cars.
• A random sample of 100 cars is selected, and the data recorded.
• Find the regression line.

Car   Odometer   Price
1     37388      5318
2     44758      5061
3     45833      5008
4     30862      5795
5     31705      5784
6     34010      5359
…     …          …
Solution
• Solving by hand
• To calculate b0 and b1 we need to calculate several statistics first:

x̄ = 36,009.45;  sx² = Σ(xi − x̄)² / (n − 1) = 43,528,688
cov(X, Y) = −1,356,256

b1 = cov(X, Y) / sx² = −1,356,256 / 43,528,688 = −.0312
b0 = ȳ − b1x̄ = 5,411.41 − (−.0312)(36,009.45) = 6,533

ŷ = b0 + b1x = 6,533 − .0312x
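A quick numeric check of these figures (only the summary statistics from the slide are used; variable names are mine):

```python
xbar, ybar = 36_009.45, 5_411.41
s2_x = 43_528_688
cov_xy = -1_356_256

b1 = cov_xy / s2_x            # ≈ -0.0312
b0 = ybar - b1 * xbar         # ≈ 6,533
print(f"yhat = {b0:,.0f} + ({b1:.4f})x")
```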
Alternative approach in deriving OLS estimates

● To derive the OLS estimates we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that Cov(x, u) = E(xu) = 0.
● Why? Remember from basic probability that Cov(X, Y) = E(XY) − E(X)E(Y).
● We can write our 2 restrictions just in terms of x, y, β0 and β1, since u = y − β0 − β1x.
Alternate approach, continued

● If one uses calculus to solve the minimization problem for the two parameters, you obtain the following first order conditions, which are the same as we obtained before, multiplied by n:

Σ (yi − b0 − b1xi) = 0
Σ xi(yi − b0 − b1xi) = 0
Deriving OLS continued

● We can write our 2 restrictions just in terms of x, y, β0 and β1, since u = y − β0 − β1x:

E(y − β0 − β1x) = 0
E[x(y − β0 − β1x)] = 0

● These are called moment restrictions.
• b0 and b1, the estimates from the data, solve the sample analogues of these restrictions.
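A minimal sketch (numpy assumed; function name mine) that solves the sample versions of the two moment restrictions as a 2×2 linear system:

```python
import numpy as np

def mom_fit(x, y):
    """Solve the sample analogues of the moment restrictions:
       mean(y - b0 - b1*x) = 0 and mean(x*(y - b0 - b1*x)) = 0."""
    A = np.array([[1.0, x.mean()],
                  [x.mean(), np.mean(x**2)]])
    rhs = np.array([y.mean(), np.mean(x * y)])
    b0, b1 = np.linalg.solve(A, rhs)
    return b0, b1
```

These give the same b0 and b1 as the covariance formulas, since the moment conditions are the OLS first order conditions divided by n.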
More Derivation
• The OLS regression line always passes through the point of sample means: ȳ = b0 + b1x̄.
Goodness of Fit
● How do we think about how well our sample regression line fits our sample data?
● We can compute the fraction of the total sum of squares (SST) that is explained by the model; call this the R-squared of the regression.
● We can think of each observation as being made up of an explained part and an unexplained part: yi = ŷi + ûi.
● We then define the following:

SST = Σ (yi − ȳ)²   (total sum of squares)
SSE = Σ (ŷi − ȳ)²   (explained sum of squares)
SSR = Σ ûi²         (residual sum of squares)

Then SST = SSE + SSR, and R² = SSE/SST = 1 − SSR/SST.
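A sketch computing this decomposition (numpy assumed; the identity SST = SSE + SSR holds because b0 and b1 are the OLS estimates):

```python
import numpy as np

def r_squared(x, y, b0, b1):
    yhat = b0 + b1 * x
    sst = np.sum((y - y.mean()) ** 2)     # total
    sse = np.sum((yhat - y.mean()) ** 2)  # explained
    ssr = np.sum((y - yhat) ** 2)         # residual
    assert np.isclose(sst, sse + ssr)     # SST = SSE + SSR under OLS
    return sse / sst
```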
Homoskedastic Case

[Figure: when x and u are independent, the conditional density f(y|x) has the same spread at x1 and x2, each centered on E(y|x) = β0 + β1x.]
Heteroskedastic Case
[Figure: the spread of f(y|x) varies with x (shown at x1, x2, x3), while the mean stays on the line E(y|x) = β0 + β1x.]
Variance of OLS estimators
• Var(b1) = σ² / Σ(xi − x̄)². As σ² increases, so does Var(b1): the more noise in the relationship between y and x (i.e. the larger the variability in u), the harder it is to learn something about β1.
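A small simulation sketch (made-up parameters) illustrating this: multiplying the error standard deviation by 5 makes the sampling spread of b1 about 5 times wider.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200)

def sd_of_b1(sigma, reps=2000):
    """Spread of b1 across repeated samples from y = 1 + 2x + u, u ~ N(0, sigma^2)."""
    draws = []
    for _ in range(reps):
        y = 1.0 + 2.0 * x + rng.normal(0.0, sigma, x.size)
        draws.append(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))
    return np.std(draws)

print(sd_of_b1(1.0), sd_of_b1(5.0))  # second spread is roughly 5x the first
```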
• The F statistic measures the relative increase in the SSR when moving from the unrestricted to the restricted model:

F = [(SSRr − SSRur) / q] / [SSRur / (n − k − 1)]

where q is the number of restrictions and n − k − 1 is the unrestricted residual degrees of freedom.
F-stat from R-squared
• Sometimes it is more convenient to compute the F-stat using R-squareds rather than SSRs.
• Substituting SSR = SST(1 − R²) gives:

F = [(R²ur − R²r) / q] / [(1 − R²ur) / (n − k − 1)]
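As a sketch, the R² form maps directly onto a one-line function (names mine):

```python
def f_from_r2(r2_ur, r2_r, q, n, k):
    """F = [(R2_ur - R2_r)/q] / [(1 - R2_ur)/(n - k - 1)],
    with q restrictions and k regressors in the unrestricted model."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))
```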
F-stat (cont’d)
● Just as with t statistics, p-values can be calculated by looking up the percentile in the appropriate F distribution.
● If only one exclusion is being tested, then F = t², and the p-values will be the same.
● If H0 fails to be rejected, this means that we must look for other variables to explain y.
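A sketch using scipy (assumed available; the t value is made up) showing the p-value lookup and the F = t² equivalence:

```python
from scipy import stats

t, df = 2.1, 96                     # made-up t statistic and residual df
p_t = 2 * stats.t.sf(abs(t), df)    # two-sided p-value from the t test
p_f = stats.f.sf(t**2, 1, df)       # p-value from F = t^2 with 1 restriction
print(p_t, p_f)                     # identical
```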
The F-statistic for Overall Significance of a Regression

• We use an F statistic to test the null hypothesis that all slope coefficients are jointly zero.
● A small R-squared can still produce a highly significant F stat, especially when n is large.
● That is why we must look at the F-stat for joint significance on top of the R-squared.
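A quick sketch (made-up numbers, scipy assumed) of how a tiny R² can still be jointly significant in a large sample, using the overall-significance F with the restricted R² equal to zero:

```python
from scipy import stats

n, k = 10_000, 3        # large sample, k regressors (made-up values)
r2 = 0.01               # tiny R-squared
F = (r2 / k) / ((1 - r2) / (n - k - 1))
print(F, stats.f.sf(F, k, n - k - 1))   # F ≈ 33.7, p-value ≈ 0
```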