Introduction OLS

Lecture 1: OLS revisited


Econometric Methods

Andrzej Torój

SGH Warsaw School of Economics – Institute of Econometrics


Outline

1 Introduction
Course information
Econometrics: a reminder

2 OLS: theoretical reminder


Point estimation
Measuring precision
Model quality diagnostics under OLS


Course information

Course information

lecturers: Andrzej Torój & Piotr Dybka
my website: http://web.sgh.waw.pl/~atoroj/ (lecture slides, exercise files, literature, contact)
final grade: homework (50%) + written exam (50%); details: website


Econometrics: a reminder

Why econometrics?

investigation of relationships
finding parameter values in economic models (e.g. elasticities)
confronting economic theories with data
forecasting
simulating policy scenarios

Example (1/4)
Student satisfaction survey
Master students of Applied Econometrics at Warsaw School of Economics in the winter semester 2016/2017 were asked to rate their satisfaction with their studies on a scale from 0 to 100. In addition, their average grade from previous studies and their sex were recorded.
1 What kind of data is this? Cross-section, time series, panel? Frequency? Micro- or macroeconomic?
2 How can we quickly visualise a hypothesised causal link from the average grade to satisfaction with studying?
3 Does such a relationship seem to be there?
4 How can the sex of the respondent potentially affect satisfaction with studies or the relationship in question? How can we visualise this?
5 Bottom line, what is the right specification of the linear regression model?


Point estimation

Linear regression model

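$$y_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \ldots + \beta_k x_{k,i} + \varepsilon_i = \begin{bmatrix} 1 & x_{1,i} & x_{2,i} & \ldots & x_{k,i} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix} + \varepsilon_i = x_i \beta + \varepsilon_i$$

The vector of parameters $\begin{bmatrix} \beta_0 & \beta_1 & \beta_2 & \ldots & \beta_k \end{bmatrix}^T$ is unknown.
Minimize the dispersion of $\varepsilon_i$ around zero, as measured e.g. by $\sum_{i=1}^{n} \varepsilon_i^2$.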

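Ordinary Least Squares (OLS)

$$S = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{1,i} - \beta_2 x_{2,i} - \ldots - \beta_k x_{k,i} \right)^2 \to \min_{\beta_0, \beta_1, \ldots}$$

FOC: $\frac{\partial S}{\partial \beta} = 0$

Denote:
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_{1,1} & x_{2,1} & \ldots & x_{k,1} \\ 1 & x_{1,2} & x_{2,2} & \ldots & x_{k,2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & x_{2,n} & \ldots & x_{k,n} \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}$$
and obtain:
$$\beta = \left( X^T X \right)^{-1} X^T y$$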
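The closed-form solution can be tried out numerically. The sketch below is only an illustration in Python with NumPy: the data are synthetic (they are not the survey data used in the lecture), and the names `beta_true`, `beta_hat` and `beta_lstsq` are introduced here for the example. It solves the normal equations $X^T X \beta = X^T y$ and cross-checks the result against NumPy's least-squares routine.

```python
import numpy as np

# Synthetic data: an assumption for illustration, not the lecture's dataset
rng = np.random.default_rng(0)
n, k = 50, 2
x = rng.normal(size=(n, k))                 # k explanatory variables
X = np.column_stack([np.ones(n), x])        # column of ones for the constant
beta_true = np.array([1.0, 2.0, -0.5])      # hypothetical parameter values
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Normal equations (X'X) beta = X'y, solved without forming the inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's built-in least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)
print(np.allclose(beta_hat, beta_lstsq))
```

Solving the linear system with `np.linalg.solve` gives the same estimates as the textbook formula with an explicit inverse, but is usually the numerically safer choice.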

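Proof

$$S = \sum_{i=1}^{n} \varepsilon_i^2 = \varepsilon^T \varepsilon = (y - X\beta)^T (y - X\beta) = y^T y - \beta^T X^T y - y^T X \beta + \beta^T X^T X \beta = y^T y - 2 y^T X \beta + \beta^T X^T X \beta$$
(the 2nd and 3rd components are scalars and transposes of each other, so they are equal)
$$\frac{\partial S}{\partial \beta} = 0 \iff \frac{\partial\, y^T y}{\partial \beta} - \frac{\partial\, 2 y^T X \beta}{\partial \beta} + \frac{\partial\, \beta^T X^T X \beta}{\partial \beta} = 0$$
According to the rules of matrix calculus:
$$-2 y^T X + 2 \beta^T X^T X = 0 \;\Rightarrow\; \beta^T X^T X = y^T X \;\Rightarrow\; X^T X \beta = X^T y \;\Rightarrow\; \beta = \left( X^T X \right)^{-1} X^T y$$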
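For reference, the two matrix-calculus rules used in the last step can be written out as follows. The symbols $a$ and $A$ are generic placeholders introduced here, and derivatives are written as row vectors to match the proof:

$$\frac{\partial\, a^T \beta}{\partial \beta} = a^T, \qquad \frac{\partial\, \beta^T A \beta}{\partial \beta} = \beta^T \left( A + A^T \right) = 2 \beta^T A \quad \text{for symmetric } A.$$

Setting $a^T = 2 y^T X$ and $A = X^T X$ (which is symmetric) gives $-2 y^T X + 2 \beta^T X^T X$, while $y^T y$ does not depend on $\beta$ and drops out.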

Example (2/4)

Student satisfaction survey


1 Run the regression model with an automated command in R.
2 Write the equation and try to interpret the parameters. Be
careful – it’s tricky! (Why?)
3 Manually replicate the parameter estimates.


Measuring precision

Estimator as a random variable


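β̂ is an estimator of the true parameter value β (function of the random sample choice)
samples, and hence the values of β̂, can be different
the estimator, as a (vector) random variable, has its variance(-covariance matrix)

$$\hat{\beta} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix}, \qquad Var(\hat{\beta}) = \begin{bmatrix} var(\hat{\beta}_0) & cov(\hat{\beta}_0, \hat{\beta}_1) & cov(\hat{\beta}_0, \hat{\beta}_2) & \cdots \\ cov(\hat{\beta}_0, \hat{\beta}_1) & var(\hat{\beta}_1) & cov(\hat{\beta}_1, \hat{\beta}_2) & \cdots \\ cov(\hat{\beta}_0, \hat{\beta}_2) & cov(\hat{\beta}_1, \hat{\beta}_2) & var(\hat{\beta}_2) & \cdots \\ \vdots & \vdots & \vdots & \ddots \\ & & & var(\hat{\beta}_k) \end{bmatrix}$$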

Variance-covariance matrix of a random vector

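Definition:
$$Var(\beta) = E\left\{ \left[ \beta - E(\beta) \right] \left[ \beta - E(\beta) \right]^T \right\}$$
For a centered variable, i.e. $E(\varepsilon) = 0$, this definition simplifies:
$$Var(\varepsilon) = E\left( \varepsilon \varepsilon^T \right)$$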

OLS estimator: properties

$\hat{\beta} = \left( X^T X \right)^{-1} X^T y$ is an estimator (function of the sample) of the “true”, unknown values β (population / data generating process). Under certain conditions (among others $E(X^T \varepsilon) = 0$ and $E(\varepsilon \varepsilon^T) = \sigma^2 I$), the OLS estimator is:
unbiased: $E(\hat{\beta}) = \beta$
consistent: β̂ converges (in probability) to β as n grows
efficient: lowest possible variance among linear unbiased estimators (i.e. highest precision)
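The idea that the estimator is a random variable, and that it is unbiased, can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions, not part of the lecture material: the parameter values, sample size, error distribution and variable names below are all hypothetical. It draws many samples from the same data generating process with the regressors held fixed, re-estimates β̂ each time, and inspects the mean and the empirical variance-covariance matrix of the estimates.

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_samples = 100, 2000
beta_true = np.array([1.0, 2.0])    # hypothetical true parameters
sigma = 1.5                         # hypothetical error standard deviation

# Regressors are drawn once and then held fixed across samples
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=n)])

estimates = np.empty((n_samples, 2))
for s in range(n_samples):
    eps = rng.normal(scale=sigma, size=n)        # new error draw in every sample
    y = X @ beta_true + eps
    estimates[s] = np.linalg.solve(X.T @ X, X.T @ y)

mean_estimate = estimates.mean(axis=0)           # should be close to beta_true
empirical_cov = np.cov(estimates, rowvar=False)  # 2 x 2 variance-covariance matrix

print(mean_estimate)
print(empirical_cov)
```

Each row of `estimates` is one realisation of β̂; the average across rows lands close to `beta_true`, while the spread across rows is what the variance-covariance matrix describes.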

Variance of the error term (1)

1. Variance of the error term (scalar):
$$\hat{\sigma}^2 = \frac{1}{n - (k+1)} \sum_{i=1}^{n} \varepsilon_i^2$$
Why such a formula if the general formula is
$$Var(X) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \;?$$
First of all note that $\bar{\varepsilon} = 0$ (prove it on your own).
Second, we need to know why 1 turned into (k + 1) in the denominator.
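One possible route to the first claim is sketched below. It assumes the model contains a constant, and it writes $\hat{\varepsilon} = y - X\hat{\beta}$ for the empirical error terms (the hat is notation introduced here). Substituting the OLS formula:

$$X^T \hat{\varepsilon} = X^T y - X^T X \hat{\beta} = X^T y - X^T X \left( X^T X \right)^{-1} X^T y = 0.$$

The first row of $X^T$ consists of ones, so the first element of this vector equation reads $\sum_{i=1}^{n} \hat{\varepsilon}_i = 0$, and hence $\bar{\varepsilon} = 0$.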

Variance of the error term (2)

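By your intuition, what is the standard deviation in the following dataset of 3 observations (the values 3, 2 and 1, as used in the formula below)?

Without a correction in the denominator:
$$\sqrt{Var} = \sqrt{\frac{1}{3} \left[ (3-2)^2 + (2-2)^2 + (1-2)^2 \right]} = \sqrt{\frac{2}{3}} \neq 1$$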

Variance of the error term (3)


The intuition behind the standard deviation of 1 is built upon an implicit, graphical calibration of the mean based on the data sample.

With an adequate correction for that in the denominator:
$$\sqrt{Var} = \sqrt{\frac{1}{3-1} \left[ (3-2)^2 + (2-2)^2 + (1-2)^2 \right]} = \sqrt{\frac{2}{2}} = 1$$
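The two calculations for the three observations 3, 2 and 1 can be reproduced with Python's standard library, as a quick check rather than as part of the lecture. `statistics.pstdev` divides by n, while `statistics.stdev` divides by n − 1; the variable names are chosen here for the example.

```python
import statistics

data = [3, 2, 1]

# Divides by n (no degrees-of-freedom correction): sqrt(2/3)
sd_uncorrected = statistics.pstdev(data)

# Divides by n - 1 (one degree of freedom used up by the mean): sqrt(2/2)
sd_corrected = statistics.stdev(data)

print(sd_uncorrected)
print(sd_corrected)
```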

Variance of the error term (4)

When X is directly observed, terms like $(x_i - \bar{x})$ consume one degree of freedom (one quantity, $\bar{x}$, is estimated beforehand).
When ε is not observed, the terms $\varepsilon_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1,i} - \ldots - \hat{\beta}_k x_{k,i}$ consume (k + 1) degrees of freedom (the k + 1 elements of the vector β̂ are estimated beforehand).

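Variance-covariance matrix of the estimator

$$\begin{aligned}
Var(\hat{\beta}) &= E\left[ (\hat{\beta} - \beta)(\hat{\beta} - \beta)^T \right] = \\
&= E\left\{ \left[ (X^T X)^{-1} X^T y - \beta \right] \cdot \left[ (X^T X)^{-1} X^T y - \beta \right]^T \right\} = \\
&= E\left\{ \left[ (X^T X)^{-1} X^T (X\beta + \varepsilon) - \beta \right] \cdot \left[ (X^T X)^{-1} X^T (X\beta + \varepsilon) - \beta \right]^T \right\} = \\
&= E\left[ (X^T X)^{-1} X^T \varepsilon \cdot \varepsilon^T X (X^T X)^{-1} \right] = \\
&= (X^T X)^{-1} X^T E\left( \varepsilon \varepsilon^T \right) X (X^T X)^{-1} = \\
&= (X^T X)^{-1} X^T \sigma^2 I X (X^T X)^{-1} = \\
&= \sigma^2 (X^T X)^{-1} X^T X (X^T X)^{-1} = \\
&= \sigma^2 (X^T X)^{-1}
\end{aligned}$$

Empirical counterpart: $\widehat{Var}(\hat{\beta}) = \hat{\sigma}^2 (X^T X)^{-1} \equiv [d_{i,j}]_{(k+1) \times (k+1)}$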

Standard errors of estimation

Standard errors of estimation (vector, one for each parameter):
$$s(\hat{\beta}_0) = \sqrt{d_{1,1}}, \quad s(\hat{\beta}_1) = \sqrt{d_{2,2}}, \quad s(\hat{\beta}_2) = \sqrt{d_{3,3}}, \quad \ldots$$

Calculating S.E.
1. estimate parameters,
2. compute the empirical error terms,
3. estimate their variance,
4. compute the variance-covariance matrix of the OLS estimator,
5. compute the SE as the square roots of its diagonal elements.
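The five steps can be traced in code. The following is a minimal sketch in Python with NumPy on synthetic data; the data, the parameter values and all variable names are assumptions made for the illustration, not the lecture's exercise files.

```python
import numpy as np

# Synthetic data: hypothetical, for illustration only
rng = np.random.default_rng(1)
n, k = 60, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

# 1. estimate parameters
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# 2. compute the empirical error terms (residuals)
resid = y - X @ beta_hat

# 3. estimate their variance with n - (k + 1) degrees of freedom
sigma2_hat = resid @ resid / (n - (k + 1))

# 4. variance-covariance matrix of the OLS estimator
cov_beta = sigma2_hat * XtX_inv

# 5. standard errors: square roots of the diagonal elements
se = np.sqrt(np.diag(cov_beta))

print(beta_hat)
print(se)
```

The explicit inverse is kept here on purpose, because step 4 needs $(X^T X)^{-1}$ itself and not only the solution of the normal equations.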

t-tests for variable significance

t-Student test
H0: $\beta_i = 0$, i.e. the i-th explanatory variable does not significantly influence y
H1: $\beta_i \neq 0$, i.e. the i-th explanatory variable significantly influences y
Test statistic: $t = \frac{\hat{\beta}_i}{s(\hat{\beta}_i)}$ is distributed as t(n − k − 1) under H0.
p-value < α* – reject H0
p-value > α* – do not reject H0
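A short worked example may help; the numbers are hypothetical and chosen only for illustration. Suppose $\hat{\beta}_i = 1.2$ and $s(\hat{\beta}_i) = 0.4$ in a model with $n - k - 1 = 30$ degrees of freedom:

$$t = \frac{\hat{\beta}_i}{s(\hat{\beta}_i)} = \frac{1.2}{0.4} = 3.0, \qquad |t| = 3.0 > t_{0.975}(30) \approx 2.042.$$

The statistic exceeds the two-sided 5% critical value, so the p-value is below $\alpha^* = 0.05$ and H0 is rejected in this hypothetical case.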

Example (3/4)

Student satisfaction survey


1 Compute the fitted values and the error terms.
2 Use this result to estimate the variance of the error term.
3 Estimate the variance-covariance matrix of the β̂ estimates.
4 Derive the standard errors from this matrix.
5 Replicate and interpret the t statistics and the p-values.

Model quality diagnostics under OLS

R-squared (7)

$R^2 \in [0; 1]$ is the share of $y_t$ volatility explained by the model in total $y_t$ volatility:
$$\sum_{t=1}^{T} (y_t - \bar{y})^2 = \sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2 + \sum_{t=1}^{T} (y_t - \hat{y}_t)^2, \qquad R^2 = \frac{\sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$$

Standard goodness-of-fit measure in OLS regressions with a constant.
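The decomposition and the resulting R-squared can be checked numerically. The sketch below uses Python with NumPy and synthetic data; the data and the names `tss`, `ess` and `rss` (total, explained and residual sums of squares) are assumptions introduced for the example.

```python
import numpy as np

# Synthetic data: hypothetical, for illustration only
rng = np.random.default_rng(2)
T = 40
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])     # the model includes a constant
y = 1.0 + 0.8 * x + rng.normal(scale=0.5, size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat                     # fitted values

tss = np.sum((y - y.mean()) ** 2)        # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)    # explained sum of squares
rss = np.sum((y - y_hat) ** 2)           # residual sum of squares

r_squared = ess / tss

print(np.isclose(tss, ess + rss))        # the decomposition holds with a constant
print(r_squared)
```

Dropping the column of ones from `X` generally breaks the equality `tss == ess + rss`, which is why the measure is tied to regressions with a constant.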

Wald test statistic

Wald test
H0: $\beta_1 = \beta_2 = \ldots = \beta_k = 0$, i.e. no explanatory variable influences y
H1: $\exists i \; \beta_i \neq 0$, i.e. at least 1 explanatory variable influences y
Test statistic: $F = \frac{R^2 / k}{(1 - R^2) / (T - k - 1)}$, distributed as F(k, T − k − 1).

Adjusted R-squared

$$\bar{R}^2 = \underbrace{R^2}_{\text{fit}} - \underbrace{\frac{k}{T - (k+1)} \left( 1 - R^2 \right)}_{\text{penalty for overparametrization}}$$
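Both the F statistic of the joint significance test and the adjusted R-squared depend only on $R^2$, T and k, so they are easy to compute by hand. The helper functions below are a small Python sketch; the function names and the input values are hypothetical and chosen so that the arithmetic is easy to follow.

```python
def f_statistic(r2: float, T: int, k: int) -> float:
    """F = (R^2 / k) / ((1 - R^2) / (T - k - 1))."""
    return (r2 / k) / ((1.0 - r2) / (T - k - 1))


def adjusted_r2(r2: float, T: int, k: int) -> float:
    """Adjusted R^2 = R^2 - k / (T - (k + 1)) * (1 - R^2)."""
    return r2 - k / (T - (k + 1)) * (1.0 - r2)


# Hypothetical inputs: R^2 = 0.5, T = 23 observations, k = 2 regressors
r2, T, k = 0.5, 23, 2

F = f_statistic(r2, T, k)       # (0.5 / 2) / (0.5 / 20)
adj = adjusted_r2(r2, T, k)     # 0.5 - (2 / 20) * 0.5

print(F, adj)
```

The adjusted value also equals $1 - (1 - R^2)\frac{T-1}{T-k-1}$, the form in which the measure is often reported.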

Example (4/4)

Student satisfaction survey


1 Interpret the F-test result.
2 Replicate the F statistic and its p-value manually.
3 Interpret the R-squared.
4 Replicate the R-squared and adjusted R-squared manually.
