The Simple Regression Model
y = b0 + b1x + u
Some Terminology
2021/10/10
b₀ is the intercept parameter and b₁ is the slope parameter.

A Simple Assumption
Assume the average value of u does not depend on x: E(u|x) = E(u). Combined with the normalization E(u) = 0, this gives the zero conditional mean assumption E(u|x) = 0.
[Figure: E(y|x) = b₀ + b₁x — the population regression function is a straight line in x]
Example: Earnings = b₀ + b₁·education + u, where u contains unobserved factors such as ability. The assumption E(u|x) = E(u) then implies that average ability is the same across education levels: E(abil|educ = 9) = E(abil|educ = 16).
[Figure: sample points (x₁, y₁), (x₂, y₂), (x₃, y₃) scattered around the population regression line; the errors u₁, u₂, u₃ are the vertical distances from each point to the line]
n⁻¹ Σᵢ₌₁ⁿ (yᵢ − b̂₀ − b̂₁xᵢ) = 0
n⁻¹ Σᵢ₌₁ⁿ xᵢ(yᵢ − b̂₀ − b̂₁xᵢ) = 0
ȳ = b̂₀ + b̂₁x̄,  or  b̂₀ = ȳ − b̂₁x̄
Σᵢ₌₁ⁿ xᵢ(yᵢ − (ȳ − b̂₁x̄) − b̂₁xᵢ) = 0

Σᵢ₌₁ⁿ xᵢ(yᵢ − ȳ) = b̂₁ Σᵢ₌₁ⁿ xᵢ(xᵢ − x̄)

Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) = b̂₁ Σᵢ₌₁ⁿ (xᵢ − x̄)²
b̂₁ = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²

provided that Σᵢ₌₁ⁿ (xᵢ − x̄)² ≠ 0
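As a quick numerical sketch (Python with NumPy, on made-up data), the slope and intercept formulas above can be computed directly and checked against a library least-squares fit:

```python
import numpy as np

# Hypothetical data, invented for illustration: x and y linearly related plus noise
rng = np.random.default_rng(0)
x = rng.uniform(9, 16, size=100)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=100)

# OLS slope: sum of (xi - xbar)(yi - ybar) over sum of (xi - xbar)^2
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# Intercept from the first order condition: b0_hat = ybar - b1_hat * xbar
b0_hat = y.mean() - b1_hat * x.mean()

# np.polyfit computes the same least-squares line
slope, intercept = np.polyfit(x, y, 1)
```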
More OLS
[Figure: sample points and the fitted regression line ŷ = b̂₀ + b̂₁x; the residuals ûᵢ are the vertical distances from each point to the fitted line]
OLS chooses b̂₀ and b̂₁ to minimize the sum of squared residuals:

Σᵢ₌₁ⁿ ûᵢ² = Σᵢ₌₁ⁿ (yᵢ − b̂₀ − b̂₁xᵢ)²
Σᵢ₌₁ⁿ (yᵢ − b̂₀ − b̂₁xᵢ) = 0
Σᵢ₌₁ⁿ xᵢ(yᵢ − b̂₀ − b̂₁xᵢ) = 0
Examples
The OLS residuals satisfy:

Σᵢ₌₁ⁿ ûᵢ = 0, and thus the sample average of the residuals, (1/n) Σᵢ₌₁ⁿ ûᵢ, is 0

Σᵢ₌₁ⁿ xᵢûᵢ = 0

ȳ = b̂₀ + b̂₁x̄  (the point (x̄, ȳ) is always on the OLS regression line)
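A small simulation (hypothetical data) confirms that these algebraic properties hold exactly for any OLS fit, not just on average:

```python
import numpy as np

# Simulated data, invented purely to check the algebraic properties numerically
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x          # OLS residuals

sum_resid = u_hat.sum()          # should be 0
sum_x_resid = (x * u_hat).sum()  # should be 0
```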
More terminology
We can think of each observation as being made up of an explained part and an unexplained part, yᵢ = ŷᵢ + ûᵢ. We then define the following:

Σ(yᵢ − ȳ)² is the total sum of squares (SST)
Σ(ŷᵢ − ȳ)² is the explained sum of squares (SSE)
Σûᵢ² is the residual sum of squares (SSR)
SST = SSE + SSR, since

Σᵢ (yᵢ − ȳ)² = Σᵢ [(yᵢ − ŷᵢ) + (ŷᵢ − ȳ)]²
            = Σᵢ [ûᵢ + (ŷᵢ − ȳ)]²
            = Σᵢ ûᵢ² + 2 Σᵢ ûᵢ(ŷᵢ − ȳ) + Σᵢ (ŷᵢ − ȳ)²
            = SSR + SSE,

because the cross term Σᵢ ûᵢ(ŷᵢ − ȳ) is zero by the properties Σûᵢ = 0 and Σxᵢûᵢ = 0.
Goodness-of-Fit
◼ How do we think about how well our sample regression line fits our sample data?
◼ We can compute the fraction of the total sum of squares (SST) that is explained by the model; call this the R-squared of the regression
◼ R² = SSE/SST = 1 − SSR/SST
◼ R² is equal to the square of the sample correlation coefficient between yᵢ and ŷᵢ
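A sketch (simulated, illustrative data) verifying that the two expressions for R² agree and that R² equals the squared sample correlation between yᵢ and ŷᵢ:

```python
import numpy as np

# Hypothetical data for illustration
rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 3.0 - 1.5 * x + rng.normal(size=200)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum(u_hat ** 2)               # residual sum of squares

r2_a = sse / sst                       # R^2 = SSE/SST
r2_b = 1 - ssr / sst                   # R^2 = 1 - SSR/SST
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2  # squared correlation of y and y_hat
```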
Goodness-of-Fit
◼ In the social sciences, low R² values in regression equations are not uncommon, especially for cross-sectional analysis
◼ A seemingly low R² does not necessarily mean that an OLS regression equation is useless
◼ An example: CEO salary and return on equity
◼ Level-level: y = b₀ + b₁x; b₁ gives the change in y for a one-unit change in x
◼ Level-log: y = b₀ + b₁·log(x); b₁/100 gives the change in y for a 1% change in x
◼ Log-level: log(y) = b₀ + b₁x; 100·b₁ gives the approximate % change in y for a one-unit change in x
◼ Log-log: log(y) = b₀ + b₁·log(x); b₁ is the elasticity of y with respect to x
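As an illustration of the log-log case, here is a sketch with a hypothetical constant-elasticity data-generating process (all parameter values invented); the fitted slope recovers the elasticity:

```python
import numpy as np

# Hypothetical constant-elasticity DGP: y = exp(0.5) * x^0.8 * exp(u)
rng = np.random.default_rng(3)
x = rng.uniform(1.0, 10.0, size=500)
u = rng.normal(0.0, 0.05, size=500)
y = np.exp(0.5 + 0.8 * np.log(x) + u)

# Fit the log-log model: log(y) = b0 + b1*log(x)
lx, ly = np.log(x), np.log(y)
b1 = np.sum((lx - lx.mean()) * (ly - ly.mean())) / np.sum((lx - lx.mean()) ** 2)
# b1 estimates the elasticity of y with respect to x (0.8 here by construction)
```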
Unbiasedness of OLS
Write the OLS slope estimator as

b̂₁ = Σᵢ (xᵢ − x̄)yᵢ / s_x²,  where  s_x² ≡ Σᵢ (xᵢ − x̄)²
Σᵢ (xᵢ − x̄)yᵢ = Σᵢ (xᵢ − x̄)(b₀ + b₁xᵢ + uᵢ)
             = Σᵢ (xᵢ − x̄)b₀ + Σᵢ (xᵢ − x̄)b₁xᵢ + Σᵢ (xᵢ − x̄)uᵢ
             = b₀ Σᵢ (xᵢ − x̄) + b₁ Σᵢ (xᵢ − x̄)xᵢ + Σᵢ (xᵢ − x̄)uᵢ
Σᵢ (xᵢ − x̄) = 0  and  Σᵢ (xᵢ − x̄)xᵢ = Σᵢ (xᵢ − x̄)²,  so

b̂₁ = b₁ + Σᵢ (xᵢ − x̄)uᵢ / s_x²
Taking expectations (conditional on the sample values of x), with dᵢ = (xᵢ − x̄) and E(uᵢ) = 0:

E(b̂₁) = b₁ + (1/s_x²) Σᵢ dᵢ E(uᵢ) = b₁
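Unbiasedness can be illustrated by simulation: holding the xᵢ fixed and redrawing errors with E(u) = 0 many times, the average of b̂₁ across replications is close to the true b₁ (all parameter values below are made up):

```python
import numpy as np

rng = np.random.default_rng(4)
b0_true, b1_true = 1.0, 2.0
x = rng.uniform(0.0, 5.0, size=40)   # x held fixed across replications

estimates = []
for _ in range(5000):
    u = rng.normal(0.0, 1.0, size=40)  # E(u|x) = 0 by construction
    y = b0_true + b1_true * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b1)

mean_b1 = np.mean(estimates)  # should be close to b1_true = 2.0
```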
Unbiasedness Summary
◼ Var(u|x) = E(u²|x) − [E(u|x)]²
◼ E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)
◼ Thus σ² is also the unconditional variance, called the error variance
◼ σ, the square root of the error variance, is called the standard deviation of the error
◼ We can then say: E(y|x) = b₀ + b₁x and Var(y|x) = σ²
Homoscedastic Case
[Figure: the conditional density f(y|x) has the same spread at every x, centered on E(y|x) = b₀ + b₁x]
Heteroscedastic Case
[Figure: the spread of the conditional density f(y|x) varies with x around E(y|x) = b₀ + b₁x]
Var(b̂₁) = Var(b₁ + (1/s_x²) Σᵢ dᵢuᵢ)
        = (1/s_x²)² Var(Σᵢ dᵢuᵢ)
        = (1/s_x²)² Σᵢ dᵢ² Var(uᵢ)
        = (1/s_x²)² Σᵢ dᵢ² σ²
        = σ² (1/s_x²)² Σᵢ dᵢ²
        = σ² (1/s_x²)² s_x²
        = σ²/s_x² = Var(b̂₁)
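A simulation sketch (hypothetical values) checking that the sampling variance of b̂₁ matches σ²/s_x² under homoscedastic errors:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma = 1.5                             # hypothetical error standard deviation
x = rng.uniform(0.0, 10.0, size=30)     # x held fixed across replications
sx2 = np.sum((x - x.mean()) ** 2)       # s_x^2 = sum of squared deviations

b1_draws = []
for _ in range(20000):
    u = rng.normal(0.0, sigma, size=30)  # homoscedastic errors
    y = 0.5 + 1.0 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sx2
    b1_draws.append(b1)

var_sim = np.var(b1_draws)     # simulated sampling variance of b1_hat
var_theory = sigma**2 / sx2    # theoretical variance sigma^2 / s_x^2
```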
An unbiased estimator of the error variance:

σ̂² = (1/(n − 2)) Σᵢ ûᵢ² = SSR/(n − 2)
The standard error of the slope estimate:

se(b̂₁) = σ̂ / (Σᵢ (xᵢ − x̄)²)^(1/2)
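A final sketch (simulated data; scipy assumed available) computing σ̂² and se(b̂₁) from these formulas and comparing with `scipy.stats.linregress`:

```python
import numpy as np
from scipy import stats

# Hypothetical data to check the formulas against a library implementation
rng = np.random.default_rng(6)
x = rng.normal(size=60)
y = 2.0 + 0.7 * x + rng.normal(0.0, 0.5, size=60)

n = len(x)
sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)   # sigma_hat^2 = SSR / (n - 2)
se_b1 = np.sqrt(sigma2_hat) / np.sqrt(sxx)  # se(b1_hat)

res = stats.linregress(x, y)  # res.stderr is the slope's standard error
```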