Econometrics PDF

This document provides an introduction to econometrics and regression analysis. It discusses what a regression is, how regression equations are fitted to data, and how to interpret regression coefficients. Specifically, it covers: 1) What a regression is and how it is used to understand relationships between variables. 2) How a simple linear regression equation is fitted to minimize the sum of squared distances between the data points and regression line. 3) How to interpret the intercept and slope coefficients from a regression analysis. 4) How to perform and interpret log-log regressions where coefficients represent elasticities. 5) How to include multiple explanatory variables in a regression.
Introduction to Econometrics

Arthur Campbell

MIT

16th February 2007

Arthur Campbell (MIT) Introduction to Econometrics 02/16/07 1 / 19


Today’s Recitation

What is a Regression?
Regression Equation
Regression Coefficients, Standard Errors, T-statistics, Level of
Significance, R² values
Interaction terms



What is a Regression?

It is a statistical tool for understanding the relationship between
different variables
Usually we want to know the causal effect of one variable on another
For instance, we might ask: how much extra income do people receive
if they have had one more year of education, all other things equal?
When I represents income and E education, this is equivalent to asking
what is ∂I/∂E?
To answer this question the econometrician collects data on income
and education, and uses it to estimate a regression equation



What is a Regression?

The simplest regression is a regression with a single explanatory
variable. In the case of income and education this could be

I = β0 + β1 E + ε

I is called the dependent (endogenous) variable and E is known as
the explanatory (exogenous) variable
β0 and β1 are the regression coefficients
ε is the noise term
This regression equation will put a straight line through the data



Fitting the regression equation

Consider the following set of data on income and education

Figure by MIT OCW and adapted from:

Sykes, Alan. "An introduction to regression analysis." Chicago Working Paper in Law and Economics 020 (October 1993): 4.



Fitting the regression equation

The regression will typically fit the line which minimizes the sum of
the squared vertical distances of the data points to the line

Figure by MIT OCW and adapted from:

Sykes, Alan. "An introduction to regression analysis." Chicago Working Paper in Law and Economics 020 (October 1993): 7.



Fitting the regression equation

The criterion we have used here is

$$\min_{\beta_0, \beta_1} \sum_i (y_i - \beta_0 - \beta_1 X_i)^2$$

This determines the values of β0 and β1 and hence the position of the
line
There are many potential criteria we could use, such as

$$\min_{\beta_0, \beta_1} \sum_i |y_i - \beta_0 - \beta_1 X_i|$$

However, provided the noise term ε from earlier satisfies certain
assumptions, the sum of squared distances is optimal
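
As a rough illustration of the squared-distance criterion, the sketch below fits the line using the standard closed-form solution for a single explanatory variable. The function names and the data are made up for this example and do not come from the slides.

```python
def fit_line(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_distances(x, y, b0, b1):
    """Value of the least-squares criterion at a candidate line."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: years of education and income in thousands of dollars.
education = [8, 10, 12, 12, 14, 16, 16, 18]
income = [22, 30, 38, 45, 50, 62, 70, 85]

b0, b1 = fit_line(education, income)
best = sum_squared_distances(education, income, b0, b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}, criterion={best:.3f}")
```

Moving either coefficient away from the fitted values should raise the criterion, which is one quick way to check that the line really is the minimizer.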



Interpreting the coefficients in the linear regression model

β0 is the intercept of the line
β1 is the slope of the line, or in other words ∂I/∂E
If for instance β1 = ∂I/∂E = 15,000, this would imply that for every
additional year of schooling an individual would on average earn
$15,000 more
For a given level of income and education we could now work out the
elasticity of income wrt education
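
As a worked illustration of that last point, the elasticity of income with respect to education in the linear model is the slope scaled by the ratio E/I. The values E = 12 years and I = $60,000 below are hypothetical and chosen only to make the arithmetic easy:

$$\eta_{I,E} = \frac{\partial I}{\partial E} \cdot \frac{E}{I} = \beta_1 \frac{E}{I} = 15{,}000 \times \frac{12}{60{,}000} = 3$$

Unlike the slope, this elasticity changes from point to point along the line because E/I changes.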



Interpreting the coefficients in the log-log regression model

Consider now an isoelastic demand curve

$$Q_D = \beta_0 P^{\beta_1}$$

Now take the logarithm of both sides

$$\ln Q_D = \ln \beta_0 + \beta_1 \ln P$$

We can estimate the following regression relationship

$$\ln Q_D = \ln \beta_0 + \beta_1 \ln P + \varepsilon$$

to determine β0 and β1
Here each data point would be (ln QD, ln P), the value of the
intercept is ln β0 and the slope is β1



Interpreting the coefficients in the log-log regression model

In this log-log specification β1 is again the derivative of the dependent
variable wrt the explanatory variable,

$$\beta_1 = \frac{\partial \ln Q_D}{\partial \ln P} = \frac{\partial Q_D}{\partial P} \frac{P}{Q_D}$$

and has the natural interpretation of the elasticity of demand with
respect to price
In Problem Set 2 you will be asked to calculate elasticities from the
regression results
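
The sketch below shows one way to see the log-log interpretation at work: it generates quantities from an isoelastic demand curve with made-up parameters (β0 = 50, β1 = -0.5, no noise term), regresses ln QD on ln P, and reads the elasticity off the slope. All names and numbers are assumptions for illustration only.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical isoelastic demand curve Q = beta0 * P ** beta1.
beta0_true, beta1_true = 50.0, -0.5
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
quantities = [beta0_true * p ** beta1_true for p in prices]

# Take logs of both sides, then fit a straight line in (ln P, ln Q).
log_p = [math.log(p) for p in prices]
log_q = [math.log(q) for q in quantities]
intercept, elasticity = fit_line(log_p, log_q)

# The intercept estimates ln(beta0), so exponentiate to recover beta0.
beta0_hat = math.exp(intercept)
print(f"elasticity={elasticity:.4f}, beta0={beta0_hat:.4f}")
```

With real data the noise term would keep the fit from being exact, but the slope would still be read as the price elasticity of demand.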



Multivariable regression

The regression may in fact contain more than one explanatory variable
For instance we might think that a person's income is influenced by
both the number of years of education and the number of years of
experience in the labour force
In this case we might run the following multivariable regression

I = β0 + β1 E + β2 L + ε

Here we can find the effects of education (E) and labour force
experience (L) on income separately
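
A minimal sketch of a two-variable regression follows, assuming the coefficients are found by solving the least-squares normal equations (X'X) b = X'y with a small hand-written linear solver. The data are generated from hypothetical coefficients (5, 3 and 1.5) with no noise, so the fit should recover them; none of these numbers come from the slides.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients; a constant is added for the intercept."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(xi[a] * xi[b] for xi in X) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(k)]
    return solve(xtx, xty)


# Hypothetical (education, experience) pairs, both measured in years.
data = [(10, 2), (12, 5), (12, 10), (16, 1), (16, 8), (18, 3)]
# Income in thousands, generated exactly from assumed coefficients.
income = [5 + 3 * e + 1.5 * l for e, l in data]

b0, b1, b2 = ols(data, income)
print(f"b0={b0:.3f}, b1={b1:.3f}, b2={b2:.3f}")
```

Each slope is then the effect of its own variable with the other held fixed, which is what lets the two effects be separated.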



Results of a regression

Basic Model: Double Log

Variable    1975-1980    2001-2006
β0          -0.615       -1.697***
            (0.929)      (0.587)
ln(P)       -0.335***    -0.042***
            (0.024)      (0.009)
ln(Y)       0.467***     0.530***
            (0.096)      (0.058)
Jan         -0.079***    -0.044***
            (0.010)      (0.006)
Feb         -0.129***    -0.122***
            (0.019)      (0.010)
Mar         -0.019***    -0.008
            (0.006)      (0.005)
Apr         -0.021       -0.024***
            (0.016)      (-0.005)
May         0.013        0.026***
            (0.011)      (0.004)
Jun         0.020        0.000
            (0.010)      (0.004)
Jul         0.031***     0.040***
            (0.010)      (0.005)
Aug         0.042***     0.046***
            (0.010)      (0.004)
Sep         -0.028***    -0.039***
            (0.006)      (0.005)
Oct         0.002        0.008
            (0.010)      (0.005)
Nov         -0.058***    -0.032***
            (0.012)      (0.004)
εj's        y            y
R²          0.85         0.94
σ̂           0.027        0.011

*** (p < 0.01)

Figure by MIT OCW and adapted from: Hughes, J., C. Knittel, and D. Sperling. "Evidence of a shift in the short-run price elasticity of gasoline demand." Center for the Study of Energy Markets Working Paper 159 (2006): Table 1.



Dummy variables and seasonality

In the previous slide the regression included 11 dummy variables for
the months Jan-Nov
These variables take a value of 1 if the data point was observed
during that month and 0 otherwise
They are included to remove any seasonality in the data. A positive
coefficient means that more (gasoline) was consumed during that
month than in the month without a dummy variable (December)
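
One way the 11 month dummies might be built is sketched below. The function name and the three-letter month labels are assumptions made for this illustration; December gets no dummy, so it acts as the baseline month.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_dummies(month):
    """Return the 11 dummies for Jan..Nov; December is the omitted month."""
    if month not in MONTHS:
        raise ValueError(f"unknown month: {month}")
    return [1 if month == m else 0 for m in MONTHS[:-1]]


print(month_dummies("Mar"))  # a single 1 in the March position
print(month_dummies("Dec"))  # all zeros: December is the baseline
```

Leaving one month out matters: with an intercept in the regression, a full set of 12 dummies would always sum to 1 and could not be separated from the intercept.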



Standard Errors (s)

When the error terms ε are normally distributed it is possible to show
that our estimates of the β's from the regression are also normally
distributed
Standard errors represent how accurately we have estimated a
coefficient
A very small standard error means it is a very accurate estimate
Standard errors are typically reported in parentheses beneath the
coefficient's value, as in the regression results from earlier



t-statistic

A t-statistic is used to measure how confident we are, given the results
of the regression, that the true β is different from 0
For instance if we measured a very high value for β with a very small
standard error we would be very confident
On the other hand if we found a small value of β with a high standard
error we would be far less confident
The t-statistic is calculated as

$$t = \frac{\beta}{s}$$

The magnitude of this term, not the sign, is what is important since β
can be positive or negative
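
As a quick illustration, the ratio can be computed for two entries of the 1975-1980 column in the earlier regression results: the ln(P) coefficient (-0.335 with standard error 0.024) and the October dummy (0.002 with standard error 0.010). The function name is chosen for this sketch.

```python
def t_statistic(beta, standard_error):
    """Ratio of a coefficient estimate to its standard error."""
    return beta / standard_error


t_price = t_statistic(-0.335, 0.024)  # ln(P), 1975-1980
t_oct = t_statistic(0.002, 0.010)     # Oct dummy, 1975-1980

# Only the magnitude matters when judging whether beta differs from 0.
print(f"|t| for ln(P): {abs(t_price):.2f}")
print(f"|t| for Oct:   {abs(t_oct):.2f}")
```

The large magnitude for ln(P) and the small one for October line up with the first carrying *** in the table and the second carrying no stars.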



Level of significance (p)

Associated with a t-statistic is a level of significance
The level of significance is the probability of obtaining an estimate at
least this far from 0 if the real value of β were 0
As the magnitude of β/s increases the level of significance decreases
The significance of an estimate is often indicated with a *, **, or ***;
the meaning of these is usually indicated below the regression results
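
A rough sketch of how a level of significance could be obtained from a t-statistic is given below. It assumes a large sample, so that the t distribution can be approximated by the standard normal; with few data points the exact t distribution would give somewhat larger values. The function name is an assumption for this example.

```python
import math


def two_sided_p_value(t):
    """Two-sided p-value under a standard normal approximation.

    Equals 2 * (1 - Phi(|t|)), where Phi is the standard normal CDF,
    written here with the complementary error function.
    """
    return math.erfc(abs(t) / math.sqrt(2))


# A larger |t| gives a smaller level of significance.
for t in (0.2, 1.96, -13.96):
    print(f"t={t:>7}: p={two_sided_p_value(t):.4g}")
```

Under this approximation, the *** threshold used in the earlier table (p < 0.01) corresponds to a t-statistic with magnitude above roughly 2.58.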



Goodness of fit (R-squared)

The goodness of fit measure R² is a measure of the extent to which
the variation of the dependent variable is explained by the explanatory
variable(s).
The formula for it is

$$R^2 = 1 - \frac{\text{sum of squared errors}}{\text{sum of squared deviations from mean}}$$

$$R^2 = 1 - \frac{\sum_i (y_i - \beta_0 - \beta_1 x_i)^2}{\sum_i (y_i - \bar{y})^2}$$

where ȳ is the average value of y
The ratio of the sum of squared errors to the sum of squared deviations
from the mean is the share of the total variation of y that is
unexplained by the regression, so 1 minus this ratio is the share
which is explained by the regression
R² will be between 0 and 1; values close to 1 indicate good
explanatory power
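
The formula can be sketched directly in code. The observed and fitted values below are made up for illustration, and the function name is an assumption.

```python
def r_squared(y, fitted):
    """1 - (sum of squared errors) / (sum of squared deviations from mean)."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


# Hypothetical observed values and fitted values from some regression line.
y = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(y, fitted)
print(f"R-squared = {r2:.4f}")
```

Two boundary cases help with intuition: fitted values equal to the data give an R² of 1, and fitted values all equal to the mean of y give an R² of 0.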
Adjusted R-squared

An obvious way to increase the R² of a regression is to simply
increase the number of explanatory variables, since including
additional variables cannot decrease its explanatory power
The adjusted R² is a measure of explanatory power which is adjusted
for the number of explanatory variables included in the regression
The formula for the adjusted R² is

$$R^2_{\text{Adjusted}} = 1 - (1 - R^2) \frac{n - 1}{n - m - 1}$$

where n is the number of data points and m is the number of
explanatory variables
The adjusted R² increases when a new variable is added if the new
term improves the model more than would be expected by chance
It is never greater than the actual R², and is strictly less whenever
m ≥ 1 and R² < 1
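
A small sketch of the adjustment follows. The R² of 0.85 is taken from the 1975-1980 column of the earlier results and m = 13 counts ln(P), ln(Y) and the 11 month dummies, but n = 72 is only a guess (monthly data over six years) and should be treated as hypothetical.

```python
def adjusted_r_squared(r2, n, m):
    """Adjusted R-squared for n data points and m explanatory variables."""
    if n - m - 1 <= 0:
        raise ValueError("need more data points than estimated coefficients")
    return 1 - (1 - r2) * (n - 1) / (n - m - 1)


# n = 72 is an assumed sample size, not a figure reported in the slides.
adj = adjusted_r_squared(0.85, n=72, m=13)
print(f"adjusted R-squared = {adj:.4f}")
```

The penalty factor (n - 1) / (n - m - 1) grows with m, so an extra variable raises the adjusted R² only if it raises R² by enough to offset it.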



Interaction terms in a regression
An interaction term is where we construct a new explanatory variable
from 2 or more underlying variables
For instance we could multiply two variables together, say Price and
Income
The regression equation we would estimate would then be

$$Q_D = \beta_0 + \beta_1 P + \beta_2 Y + \beta_3 P Y$$

We do this if we think that the effect of P on QD is different when Y
is high or low, and similarly the effect of Y on QD is different when P
is high or low
Consider the demand elasticity wrt price

$$E_D = \frac{\partial Q_D}{\partial P} \frac{P}{Q_D} = (\beta_1 + \beta_3 Y) \frac{P}{Q_D}$$

We see here that, holding everything else constant, increasing Y by 1
unit will increase ED by β3 P/QD.
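
The effect of the interaction term on the elasticity can be checked numerically. In the sketch below the coefficient values, price, income and quantity are all hypothetical, and QD is passed in as a fixed number to match the "holding everything else constant" wording.

```python
def price_elasticity(p, y, q, b1, b3):
    """E_D = (b1 + b3 * Y) * P / Q_D for the model with a P*Y interaction."""
    return (b1 + b3 * y) * p / q


# Hypothetical coefficients and evaluation point.
b1, b3 = -2.0, 0.01
p, y, q = 3.0, 50.0, 100.0

e_low = price_elasticity(p, y, q, b1, b3)
e_high = price_elasticity(p, y + 1, q, b1, b3)

# Raising Y by one unit shifts the elasticity by b3 * P / Q_D.
print(f"E_D at Y={y}:   {e_low:.4f}")
print(f"E_D at Y={y + 1}: {e_high:.4f}")
print(f"change: {e_high - e_low:.4f}  vs  b3*P/Q_D: {b3 * p / q:.4f}")
```

In a fitted model QD would itself move when Y changes, so this is the partial effect with QD held fixed rather than the full change in the elasticity.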
