The Simple Regression Model

y = b0 + b1x + u

Some Terminology

◼ In the simple linear regression model, y = b0 + b1x + u, we typically refer to y as the
❑ Dependent Variable, or
❑ Explained Variable, or
❑ Response Variable, or
❑ Predicted Variable, or
❑ Regressand


Some Terminology, cont.

◼ In the simple linear regression of y on x, we typically refer to x as the
❑ Independent Variable, or
❑ Explanatory Variable, or
❑ Control Variable, or
❑ Predictor Variable, or
❑ Regressor

A Simple Assumption

◼ The average value of u, the error term, in the population is 0. That is, E(u) = 0
◼ This is not a restrictive assumption, since we can always use b0, the intercept parameter, to normalize E(u) to 0


Zero Conditional Mean

◼ We need to make a crucial assumption about how u and x are related
◼ We want it to be the case that E(u) does not depend on the value of x. That is,
◼ E(u|x) = E(u) = 0, which implies
◼ E(y|x) = b0 + b1x, where b1 is the slope parameter

Zero Conditional Mean

◼ The zero conditional mean assumption E(u|x) = 0 breaks y into two components:
◼ The piece b0 + b1x is called the systematic part of y
◼ u is called the unsystematic part, or the part of y not explained by x


E(y|x) as a linear function of x, where for any x the distribution of y is centered about E(y|x)

[Figure: the conditional density f(y) at x1 and x2, each centered on the population regression line E(y|x) = b0 + b1x.]

Example: Returns to Education

◼ A model of human capital investment implies that getting more education should lead to higher earnings
◼ In the simplest case, this implies an equation like

$$earnings = b_0 + b_1\,education + u$$

◼ If ability is part of the error u, then E(u|x) = E(u) requires average ability to be the same at every level of education, e.g. E(abil|9) = E(abil|16)


Ordinary Least Squares

◼ Basic idea of regression is to estimate the population parameters from a sample
◼ Let {(xi, yi): i = 1, …, n} denote a random sample of size n from the population
◼ For each observation in this sample, it will be the case that yi = b0 + b1xi + ui

Population regression line, sample data points and the associated error terms

[Figure: four sample points (x1, y1), …, (x4, y4) scattered around the population regression line E(y|x) = b0 + b1x, with each error u1, …, u4 drawn as the vertical distance from the point to the line.]


Deriving OLS Estimates

◼ To derive the OLS estimates we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that
◼ Cov(x, u) = E(xu) = 0
◼ Why? Remember from basic probability that Cov(X, Y) = E(XY) − E(X)E(Y)


Deriving OLS continued

◼ We can write our 2 restrictions just in terms of x, y, b0 and b1, since u = y − b0 − b1x:
◼ E(y − b0 − b1x) = 0
◼ E[x(y − b0 − b1x)] = 0
◼ These are called moment restrictions


Deriving OLS using M.O.M.

◼ The method of moments approach to estimation implies imposing the population moment restrictions on the sample moments
◼ What does this mean? Recall that for E(X), the mean of a population distribution, a sample estimator of E(X) is simply the arithmetic mean of the sample


More Derivation of OLS

◼ We want to choose values of the parameters that will ensure that the sample versions of our moment restrictions are true
◼ The sample versions are as follows:

$$n^{-1}\sum_{i=1}^{n}\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$

$$n^{-1}\sum_{i=1}^{n} x_i\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$
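Since the two sample moment conditions are linear in the estimates, they can be solved directly as a 2×2 linear system. Below is a minimal sketch (not from the original slides; it assumes NumPy and uses simulated data) that does exactly this:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(5, 2, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)   # true b0 = 1.0, b1 = 0.5

# Rearranging the two sample moment conditions gives a linear system:
#   sum(y - b0 - b1*x)     = 0  ->  n*b0      + sum(x)*b1    = sum(y)
#   sum(x*(y - b0 - b1*x)) = 0  ->  sum(x)*b0 + sum(x**2)*b1 = sum(x*y)
A = np.array([[n, x.sum()],
              [x.sum(), (x ** 2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
b0_hat, b1_hat = np.linalg.solve(A, rhs)
print(b0_hat, b1_hat)   # should be close to 1.0 and 0.5
```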


More Derivation of OLS

◼ Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows:

$$\bar{y} = \hat{b}_0 + \hat{b}_1\bar{x}, \quad\text{or}\quad \hat{b}_0 = \bar{y} - \hat{b}_1\bar{x}$$

More Derivation of OLS

$$\sum_{i=1}^{n} x_i\left(y_i - \left(\bar{y} - \hat{b}_1\bar{x}\right) - \hat{b}_1 x_i\right) = 0$$

$$\sum_{i=1}^{n} x_i\left(y_i - \bar{y}\right) = \hat{b}_1 \sum_{i=1}^{n} x_i\left(x_i - \bar{x}\right)$$

$$\sum_{i=1}^{n} \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right) = \hat{b}_1 \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2$$


So the OLS estimated slope is

$$\hat{b}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2},$$

provided that $\sum_{i=1}^{n}(x_i - \bar{x})^2 > 0$

Summary of OLS slope estimate

◼ The slope estimate is the sample covariance between x and y divided by the sample variance of x
◼ If x and y are positively correlated, the slope will be positive
◼ If x and y are negatively correlated, the slope will be negative
◼ Only need x to vary in our sample
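In code, the estimates are one line each. A minimal sketch on simulated data (NumPy assumed; illustrative, not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10, 3, size=200)
y = 2.0 + 0.7 * x + rng.normal(0, 1, size=200)   # true b0 = 2.0, b1 = 0.7

# Slope: sample covariance of x and y over the sample variance of x
b1_hat = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
# Intercept: chosen so the fitted line passes through the point of means
b0_hat = y.mean() - b1_hat * x.mean()
print(b0_hat, b1_hat)
```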


More OLS

◼ Intuitively, OLS is fitting a line through the sample points such that the sum of squared residuals is as small as possible, hence the term least squares
◼ The residual, û, is an estimate of the error term, u, and is the difference between the sample point and the fitted line (sample regression function)


Sample regression line, sample data points and the associated estimated error terms

[Figure: the fitted line ŷ = b̂0 + b̂1x through four sample points, with each residual û1, …, û4 drawn as the vertical distance from the point to the fitted line.]


Alternate approach to derivation

◼ Given the intuitive idea of fitting a line, we can set up a formal minimization problem
◼ That is, we want to choose our parameters such that we minimize the following:

$$\sum_{i=1}^{n}\hat{u}_i^2 = \sum_{i=1}^{n}\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right)^2$$


Alternate approach, continued

◼ If one uses calculus to solve the minimization problem for the two parameters, one obtains the following first order conditions, which are the same as those we obtained before, multiplied by n:

$$\sum_{i=1}^{n}\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$

$$\sum_{i=1}^{n} x_i\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$
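The minimization view can be checked numerically: minimizing the sum of squared residuals with a generic optimizer should reproduce the closed-form estimates. A sketch under the assumption that SciPy is available (simulated data, illustrative only):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=100)
y = 3.0 - 0.4 * x + rng.normal(0, 1, size=100)   # true b0 = 3.0, b1 = -0.4

def ssr(params):
    """Sum of squared residuals as a function of (b0, b1)."""
    b0, b1 = params
    return np.sum((y - b0 - b1 * x) ** 2)

res = minimize(ssr, x0=[0.0, 0.0])               # generic numerical minimizer

# Closed-form OLS estimates for comparison
b1_cf = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0_cf = y.mean() - b1_cf * x.mean()
print(res.x, (b0_cf, b1_cf))                     # the two should agree closely
```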

Examples

◼ Example 1: CEO Salary and Return on Equity

◼ Example 2: Wage and Education


Algebraic Properties of OLS

◼ The sum of the OLS residuals is zero
◼ Thus, the sample average of the OLS residuals is zero as well
◼ The sample covariance between the regressors and the OLS residuals is zero
◼ The OLS regression line always goes through the mean of the sample


Algebraic Properties (precise)


$$\sum_{i=1}^{n}\hat{u}_i = 0 \quad\text{and thus}\quad \frac{1}{n}\sum_{i=1}^{n}\hat{u}_i = 0$$

$$\sum_{i=1}^{n} x_i\hat{u}_i = 0$$

$$\bar{y} = \hat{b}_0 + \hat{b}_1\bar{x}$$
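All three properties can be verified numerically on any dataset. A minimal sketch (NumPy assumed, simulated data; not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(0, 1, size=50)
y = 1.0 + 2.0 * x + rng.normal(0, 1, size=50)

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x                          # OLS residuals

print(np.isclose(u_hat.sum(), 0))                # residuals sum to zero
print(np.isclose((x * u_hat).sum(), 0))          # zero sample covariance with x
print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # line passes through the means
```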

More terminology
We can think of each observation as being made up of an explained part and an unexplained part, $y_i = \hat{y}_i + \hat{u}_i$. We then define the following:

$\sum (y_i - \bar{y})^2$ is the total sum of squares (SST)

$\sum (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)

$\sum \hat{u}_i^2$ is the residual sum of squares (SSR)

Then SST = SSE + SSR


Proof that SST = SSE + SSR

$$\sum (y_i - \bar{y})^2 = \sum \left[(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})\right]^2 = \sum \left[\hat{u}_i + (\hat{y}_i - \bar{y})\right]^2$$

$$= \sum \hat{u}_i^2 + 2\sum \hat{u}_i(\hat{y}_i - \bar{y}) + \sum (\hat{y}_i - \bar{y})^2 = \mathrm{SSR} + 2\sum \hat{u}_i(\hat{y}_i - \bar{y}) + \mathrm{SSE},$$

and we know that $\sum \hat{u}_i(\hat{y}_i - \bar{y}) = 0$

Goodness-of-Fit
◼ How do we think about how well our sample regression line fits our sample data?
◼ We can compute the fraction of the total sum of squares (SST) that is explained by the model, and call this the R-squared of the regression
◼ R² = SSE/SST = 1 − SSR/SST
◼ R² is equal to the square of the sample correlation coefficient between yi and ŷi
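A short numerical check of the decomposition and of both characterizations of R² (a sketch on simulated data, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0, 1, size=100)
y = 0.5 + 1.5 * x + rng.normal(0, 2, size=100)

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

sst = ((y - y.mean()) ** 2).sum()      # total sum of squares
sse = ((y_hat - y.mean()) ** 2).sum()  # explained sum of squares
ssr = (u_hat ** 2).sum()               # residual sum of squares

print(np.isclose(sst, sse + ssr))      # SST = SSE + SSR
r2 = sse / sst                         # same as 1 - ssr/sst
print(np.isclose(r2, np.corrcoef(y, y_hat)[0, 1] ** 2))  # R^2 = corr(y, y_hat)^2
```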


Goodness-of-Fit
◼ In the social sciences, low R² values in regression equations are not uncommon, especially for cross-sectional analysis
◼ A seemingly low R² does not necessarily mean that an OLS regression equation is useless
◼ An Example: CEO Salary and Return on Equity


Units of Measurement and Functional Form

The equation

$$\widehat{salary} = 963.191 + 18.501\,roe,$$

where salary is measured in thousands of dollars.

The above equation can be written as

$$\widehat{salardol} = 963{,}191 + 18{,}501\,roe,$$

where salardol is salary in dollars.


Units of Measurement and Functional Form (cont)

The above equation can also be written as

$$\widehat{salary} = 963.191 + 1850.1\,roedec,$$

where roedec is the decimal equivalent of roe.
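The rescaling pattern is mechanical and easy to reproduce. The sketch below uses simulated data loosely mimicking the CEO salary example (it is not the actual dataset, and NumPy is assumed): multiplying y by 1000 scales both coefficients by 1000, while dividing x by 100 multiplies only the slope by 100:

```python
import numpy as np

rng = np.random.default_rng(5)
roe = rng.uniform(0, 60, size=200)                          # return on equity, percent
salary = 963.0 + 18.5 * roe + rng.normal(0, 100, size=200)  # thousands of dollars

def ols(x, y):
    """Closed-form simple OLS; returns (intercept, slope)."""
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    return y.mean() - b1 * x.mean(), b1

print(ols(roe, salary))          # baseline (b0, b1)
print(ols(roe, salary * 1000))   # salary in dollars: both coefficients scale by 1000
print(ols(roe / 100, salary))    # roe as a decimal: slope x100, intercept unchanged
```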


Units of Measurement and Functional Form (cont)
◼ Example 1: Log Wage and Education

◼ Example 2: Log CEO Salary and Log Firm Sales


Units of Measurement and Functional Form (cont)

| Model       | Dependent Variable | Independent Variable | Interpretation of b1 |
|-------------|--------------------|----------------------|----------------------|
| Level-level | y                  | x                    | Δy = b1 Δx           |
| Level-log   | y                  | log(x)               | Δy = (b1/100) %Δx    |
| Log-level   | log(y)             | x                    | %Δy = (100 b1) Δx    |
| Log-log     | log(y)             | log(x)               | %Δy = b1 %Δx         |


Unbiasedness of OLS

◼ Assumption SLR.1: the population model is linear in parameters, y = b0 + b1x + u
◼ Assumption SLR.2: we have a random sample of size n, {(xi, yi): i = 1, 2, …, n}, from the population model. Thus we can write the sample model as yi = b0 + b1xi + ui
◼ Assumption SLR.3: there is variation in the xi
◼ Assumption SLR.4: E(u|x) = 0


Unbiasedness of OLS (cont)

◼ In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter
◼ Start with a simple rewrite of the formula as

$$\hat{b}_1 = \frac{\sum (x_i - \bar{x})\,y_i}{s_x^2}, \quad\text{where}\quad s_x^2 \equiv \sum (x_i - \bar{x})^2$$


Unbiasedness of OLS (cont)

$$\sum (x_i - \bar{x})\,y_i = \sum (x_i - \bar{x})(b_0 + b_1 x_i + u_i)$$

$$= b_0 \sum (x_i - \bar{x}) + b_1 \sum (x_i - \bar{x})\,x_i + \sum (x_i - \bar{x})\,u_i$$


Unbiasedness of OLS (cont)

$$\sum (x_i - \bar{x}) = 0, \qquad \sum (x_i - \bar{x})\,x_i = \sum (x_i - \bar{x})^2,$$

so the numerator can be rewritten as $b_1 s_x^2 + \sum (x_i - \bar{x})\,u_i$, and thus

$$\hat{b}_1 = b_1 + \frac{\sum (x_i - \bar{x})\,u_i}{s_x^2}$$

Unbiasedness of OLS (cont)

Let $d_i = (x_i - \bar{x})$, so that

$$\hat{b}_1 = b_1 + \left(\frac{1}{s_x^2}\right)\sum d_i u_i;\quad\text{then}$$

$$E\left(\hat{b}_1\right) = b_1 + \left(\frac{1}{s_x^2}\right)\sum d_i\,E(u_i) = b_1$$

Unbiasedness Summary

◼ The OLS estimates of b1 and b0 are unbiased
◼ Proof of unbiasedness depends on our 4 assumptions; if any assumption fails, then OLS is not necessarily unbiased
◼ Remember, unbiasedness is a description of the estimator; in a given sample we may be “near” or “far” from the true parameter
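Unbiasedness can be illustrated by Monte Carlo simulation: across many samples generated under SLR.1–SLR.4, the slope estimates should average out to the true b1. A minimal sketch (NumPy assumed, simulated data; not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(6)
b1_true, n, reps = 0.5, 50, 5000
x = rng.uniform(0, 10, size=n)        # regressor held fixed across replications

estimates = np.empty(reps)
for r in range(reps):
    u = rng.normal(0, 1, size=n)      # fresh errors each replication, E(u|x) = 0
    y = 1.0 + b1_true * x + u
    estimates[r] = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()

print(estimates.mean())               # close to 0.5: the estimator is unbiased
```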


Variance of the OLS Estimators

◼ Now we know that the sampling distribution of our estimate is centered around the true parameter
◼ We want to think about how spread out this distribution is
◼ It is much easier to think about this variance under an additional assumption, so:
◼ Assumption SLR.5: Var(u|x) = σ² (Homoscedasticity)


Variance of OLS (cont)

◼ Var(u|x) = E(u²|x) − [E(u|x)]²
◼ E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)
◼ Thus σ² is also the unconditional variance, called the error variance
◼ σ, the square root of the error variance, is called the standard deviation of the error
◼ We can say: E(y|x) = b0 + b1x and Var(y|x) = σ²


Homoscedastic Case

[Figure: conditional densities f(y|x) at x1 and x2 with identical spread, each centered on the line E(y|x) = b0 + b1x.]


Heteroscedastic Case

[Figure: conditional densities f(y|x) at x1, x2, and x3 whose spread changes with x, each centered on the line E(y|x) = b0 + b1x.]

Variance of OLS (cont)

$$\mathrm{Var}\left(\hat{b}_1\right) = \mathrm{Var}\left(b_1 + \left(\frac{1}{s_x^2}\right)\sum d_i u_i\right) = \left(\frac{1}{s_x^2}\right)^2 \mathrm{Var}\left(\sum d_i u_i\right)$$

$$= \left(\frac{1}{s_x^2}\right)^2 \sum d_i^2\,\mathrm{Var}(u_i) = \left(\frac{1}{s_x^2}\right)^2 \sigma^2 \sum d_i^2$$

$$= \sigma^2 \left(\frac{1}{s_x^2}\right)^2 s_x^2 = \frac{\sigma^2}{s_x^2} = \mathrm{Var}\left(\hat{b}_1\right)$$

Variance of OLS Summary

◼ The larger the error variance, σ², the larger the variance of the slope estimate
◼ The larger the variability in the xi, the smaller the variance of the slope estimate
◼ As a result, a larger sample size should decrease the variance of the slope estimate
◼ One problem is that the error variance is unknown
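The formula Var(b̂1) = σ²/s_x² can likewise be checked by simulation, comparing the empirical variance of the slope estimates against σ²/Σ(xi − x̄)². A sketch (NumPy assumed, simulated data; illustrative only):

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps, sigma2 = 50, 5000, 4.0
x = rng.uniform(0, 10, size=n)          # regressor held fixed across replications
sx2 = ((x - x.mean()) ** 2).sum()       # s_x^2: sum of squared deviations of x

estimates = np.empty(reps)
for r in range(reps):
    y = 1.0 + 0.5 * x + rng.normal(0, np.sqrt(sigma2), size=n)
    estimates[r] = ((x - x.mean()) * (y - y.mean())).sum() / sx2

print(estimates.var(), sigma2 / sx2)    # the two should agree closely
```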


Estimating the Error Variance

◼ We don’t know what the error variance, σ², is, because we don’t observe the errors, ui
◼ What we observe are the residuals, ûi
◼ We can use the residuals to form an estimate of the error variance


Error Variance Estimate (cont)

$$\hat{u}_i = y_i - \hat{b}_0 - \hat{b}_1 x_i = (b_0 + b_1 x_i + u_i) - \hat{b}_0 - \hat{b}_1 x_i = u_i - \left(\hat{b}_0 - b_0\right) - \left(\hat{b}_1 - b_1\right)x_i$$

Then, an unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum \hat{u}_i^2 = \mathrm{SSR}/(n-2)$$

Error Variance Estimate (cont)

$$\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \text{standard error of the regression}$$

Recall that $\mathrm{sd}\left(\hat{b}_1\right) = \sigma / s_x$. If we substitute $\hat{\sigma}$ for $\sigma$, then we have the standard error of $\hat{b}_1$:

$$\mathrm{se}\left(\hat{b}_1\right) = \hat{\sigma}\Big/\left(\sum (x_i - \bar{x})^2\right)^{1/2}$$
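Putting the pieces together, a minimal sketch (simulated data, NumPy assumed; illustrative only) that computes σ̂², the standard error of the regression, and se(b̂1):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 100
x = rng.normal(5, 2, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x

sigma2_hat = (u_hat ** 2).sum() / (n - 2)            # unbiased: SSR / (n - 2)
ser = np.sqrt(sigma2_hat)                            # standard error of the regression
se_b1 = ser / np.sqrt(((x - x.mean()) ** 2).sum())   # se(b1_hat)
print(ser, se_b1)
```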
