Lecture 3 - Econometria I

The document discusses the assumptions and properties of simple and multiple linear regression models. It outlines the assumptions of the simple linear regression model, including that the model is linear in parameters, random sampling is used, there is variation in the explanatory variable, the error term has a zero conditional mean, and the error term is homoskedastic. It then discusses interpreting the OLS regression equation and outlines two key assumptions of multiple linear regression models.

Econometrics I
Prof. Dr. Daniel Roland
Federal University of ABC - 1/2024

Simple/Multiple linear regression: assumptions and estimator properties
Simple linear regression model

 We explored the simple linear regression model and derived the OLS estimator previously. Now we will discuss the model's assumptions and implications before moving on to multiple linear regression models. Here are the simple linear regression (SLR) assumptions.

SLR.1 (Linear in parameters) – In the population model, the dependent variable, y, is related to the independent variable, x, and the error (or disturbance), u, as

$$y = \beta_0 + \beta_1 x + u$$

where $\beta_0$ and $\beta_1$ are the population intercept and slope parameters, respectively.
Simple linear regression model

SLR.2 (Random Sampling) – We have a random sample of size n, $\{(x_i, y_i) : i = 1, 2, \dots, n\}$, following the population model shown previously. Therefore, the random sample is given by

$$y_i = \beta_0 + \beta_1 x_i + u_i, \quad i = 1, 2, \dots, n$$

Samples are meant to be a representation of the population with all its characteristics. If our sample is not random, it might contain bias which would ultimately invalidate any results. E.g. a survey interested in finding the population's opinion on a particular football team would be very misleading if it interviewed only the supporters of that football team.
Simple linear regression model

SLR.3 (Sample variation in the explanatory variable) – The sample outcomes on x, namely $\{x_i : i = 1, 2, \dots, n\}$, are not all the same value.

If there is no variation in the explanatory variable, there is no way to estimate the OLS regression line. A simple inspection of summary statistics can check this assumption: if the standard deviation of x is not zero, the assumption holds.
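As a quick illustration of how one might check SLR.3 in practice, here is a minimal Python sketch; the data array `x` is hypothetical.

```python
import numpy as np

# Hypothetical sample of the explanatory variable (e.g. years of education)
x = np.array([8, 11, 12, 12, 15, 16, 16, 18], dtype=float)

# SLR.3 holds as long as the sample values of x are not all identical,
# i.e. the sample standard deviation is strictly positive.
std_x = x.std(ddof=1)
print(f"sample std of x: {std_x:.3f}")
if std_x > 0:
    print("SLR.3 satisfied: there is variation in x.")
else:
    print("SLR.3 violated: x is constant, the OLS slope cannot be computed.")
```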
Simple linear regression model

SLR.4 (Zero Conditional Mean) – The error u has an expected value of zero given any value of the explanatory variable. In other words,

$$E(u \mid x) = 0$$

This assumption can be violated if there are important unobserved factors, measurement error, reverse causality or non-linearity. For now, it is sufficient to know that the simple linear regression model relies on this assumption in order to produce unbiased estimators.
Simple linear regression model

 If assumptions SLR.1-SLR.4 hold, we know that the OLS estimators are unbiased. That is, across repeated samples their expected value equals the true parameters of the population regression function (PRF). But is that sufficient?

[Figure omitted; source: Encyclopedia of Social Measurement, 2005.]


Simple linear regression model

 If we add one more assumption, we ensure that the OLS estimators are not only unbiased, but also have the smallest variance in the class of all linear unbiased estimators.

SLR.5 (Homoskedasticity) – The error, u, has the same variance given any value of the explanatory variable. In other words,

$$Var(u \mid x) = \sigma^2$$
Simple linear regression model

 Recall that the homoskedasticity assumption, also called the "constant variance" assumption, i.e. $Var(u \mid x) = \sigma^2$, is not the same as the zero conditional mean assumption, i.e. $E(u \mid x) = 0$. The former refers to the variance of u given x while the latter refers to its expected value.
 Essentially, the homoskedasticity assumption says that the variance of the error term does not vary with x.
Simple linear regression model

 The assumptions presented, SLR.1-SLR.5, are frequently called the Gauss-Markov assumptions, after the mathematicians Carl Friedrich Gauss and Andrey Markov.
 The Gauss-Markov theorem states that if SLR.1-SLR.5 hold, then the OLS estimator is the best linear unbiased estimator (BLUE). The results generalise to the multiple linear regression model with a few minor modifications.
Multiple linear regression model

 Multiple linear regression models allow us more freedom in our attempts to explore social and economic problems.
 The concept of ceteris paribus becomes operational, as we can explicitly control for other factors that might affect the outcome variable y.
 The assumption that all other factors affecting y are uncorrelated with x, which is often unrealistic, can be relaxed.
Multiple linear regression model

Suppose we have a simple linear regression model looking at the effect of education on wage:

$$wage = \beta_0 + \beta_1 educ + u$$

We would have to assume that years of experience, which is in the error term, is uncorrelated with education – a not so reasonable assumption.

With a multiple linear regression model, we can write:

$$wage = \beta_0 + \beta_1 educ + \beta_2 exper + u$$


Multiple linear regression model

Now suppose you want to explore the effect of public spending per student (expend) on their average standardized test score (avgscore). However, in the USA, public school expenditure per student is funded by property and local income taxes, so it tends to be correlated with family income; thus we could include average family income in our regression:

$$avgscore = \beta_0 + \beta_1 expend + \beta_2 avgfinc + u$$


Multiple linear regression model

A multiple linear regression model can be written in the population as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u$$

where $\beta_0$ is the intercept, $\beta_1$ is the parameter associated with $x_1$, $\beta_2$ is the parameter associated with $x_2$, and so on.
Multiple linear regression model

A key assumption for the general multiple regression model is very similar to the one we saw in the simple linear regression model, and it can be written in terms of a conditional expectation:

$$E(u \mid x_1, x_2, \dots, x_k) = 0$$

In other words, all factors in the unobserved error term must be uncorrelated with the explanatory variables.
Multiple linear regression model

The estimated OLS equation can be written in a form similar to the simple regression case. If we have only two explanatory variables, that would be:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$

where $\hat{\beta}_0$ is the estimate of $\beta_0$, $\hat{\beta}_1$ is the estimate of $\beta_1$ and $\hat{\beta}_2$ is the estimate of $\beta_2$.
Multiple linear regression model

We use the method of OLS to choose the values of $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$ that minimise the sum of squared residuals. That is, given n observations on y, $x_1$ and $x_2$, $\{(x_{i1}, x_{i2}, y_i) : i = 1, 2, \dots, n\}$, the estimates of $\beta_0$, $\beta_1$ and $\beta_2$ are chosen simultaneously to make the following expression as small as possible:

$$\sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} \right)^2$$

The same principles can be generalised to a model with k explanatory variables.
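The objective being minimised can be written directly as a function of candidate coefficients. Below is a minimal Python sketch with made-up data; the arrays and the candidate values passed to `ssr` are purely illustrative.

```python
import numpy as np

def ssr(b0, b1, b2, y, x1, x2):
    """Sum of squared residuals for candidate coefficients (b0, b1, b2)."""
    residuals = y - b0 - b1 * x1 - b2 * x2
    return np.sum(residuals ** 2)

# Hypothetical data with n = 5 observations
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y  = np.array([3.1, 4.2, 7.9, 8.1, 11.0])

# OLS chooses the (b0, b1, b2) that make this value as small as possible
print(ssr(0.5, 1.0, 1.0, y, x1, x2))
```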
Multiple linear regression model

Let us assume that $x_1, x_2, \dots, x_k$ are independent variables and $y$ is the dependent variable. Given a sample of n observations,

$$(x_{i1}, x_{i2}, \dots, x_{ik}, y_i), \quad i = 1, 2, \dots, n$$

the population linear regression model is given by:

$$E(y_i \mid x_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$


Multiple linear regression model

And the sample linear regression function (or OLS regression line) is given by

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \dots + \hat{\beta}_k x_{ik}$$

Thus, the sum of squared residuals to be minimised is:

$$\sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik} \right)^2$$
Deriving the OLS estimator

We have k+1 linear equations and k+1 unknowns $\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k$:

$$\frac{\partial}{\partial \hat{\beta}_0} S(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k) = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0$$

$$\frac{\partial}{\partial \hat{\beta}_1} S(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k) = -2 \sum_{i=1}^{n} x_{i1}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0$$

$$\frac{\partial}{\partial \hat{\beta}_2} S(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k) = -2 \sum_{i=1}^{n} x_{i2}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0$$

$$\vdots$$

$$\frac{\partial}{\partial \hat{\beta}_k} S(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k) = -2 \sum_{i=1}^{n} x_{ik}(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}) = 0$$
Deriving the OLS estimator

 The previous system of equations can be solved:
 Manually (which would demand an incredible amount of time)
 Using a matrix algebra approach (which is beyond the scope of this course)
 Through the use of econometric software (which will be our approach by the end of this course). A numerical sketch of this route follows below.
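For readers curious about what the software route looks like, here is a minimal sketch using NumPy's least-squares solver on hypothetical data; in practice a dedicated econometrics package would be used, and the data below are invented for illustration only.

```python
import numpy as np

# Hypothetical data: n = 6 observations, k = 2 regressors
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([5.0, 3.0, 6.0, 2.0, 7.0, 4.0])
y  = np.array([4.0, 5.5, 9.0, 8.5, 13.0, 11.0])

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solving the k+1 first-order conditions is equivalent to solving the
# normal equations X'X b = X'y; lstsq does this numerically.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("beta_hat (intercept, b1, b2):", beta_hat)
```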
Interpreting the OLS regression equation

Let's start with two independent variables, such that

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$

The intercept $\hat{\beta}_0$ is the predicted value of y when both $x_1$ and $x_2$ equal zero. The estimates $\hat{\beta}_1$ and $\hat{\beta}_2$ have partial effect, or ceteris paribus, interpretations:

$$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2$$


Interpreting the OLS regression equation

We can generalise this to k explanatory variables in a similar fashion:

$$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2 + \dots + \hat{\beta}_k \Delta x_k$$

 With (plenty of!) caveats, multiple linear regression allows economists to mimic a laboratory environment where it is easy to fix each factor, even though the data collected did not explicitly hold each factor constant.
 Once we obtain the partial effects, we can also explore how y is affected by simultaneous changes in two or more independent variables, as in the sketch below.
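The sketch below illustrates the partial-effect interpretation: it computes the predicted change in y for simultaneous changes in two regressors, using made-up coefficient estimates (`b1_hat`, `b2_hat` are assumptions, not results from any actual regression).

```python
# Hypothetical OLS estimates from a fitted model with two regressors
b1_hat, b2_hat = 0.8, -1.5   # partial effects of x1 and x2

# Predicted change in y when x1 rises by 2 units and x2 rises by 1 unit,
# with all other factors held fixed (ceteris paribus)
dx1, dx2 = 2.0, 1.0
dy_hat = b1_hat * dx1 + b2_hat * dx2
print(f"predicted change in y: {dy_hat:.2f}")  # 0.8*2 - 1.5*1 = 0.10
```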
Multiple linear regression assumptions

The first two multiple linear regression (MLR) assumptions are very similar to the SLR assumptions:

MLR.1 (Linear in Parameters) – The model in the population can be written as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

where $\beta_0, \beta_1, \dots, \beta_k$ are the unknown parameters (constants) of interest and u is an unobservable random error or disturbance term. Note that this does not exclude non-linearity in the independent variables.
Multiple linear regression assumptions

MLR.2 (Random sampling) – We have a random sample of n observations, $\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i) : i = 1, 2, \dots, n\}$, following the population model in assumption MLR.1.

We cannot compute our estimates without a sample from the population, and we need to ensure that the sample we have is representative, i.e. that it represents the population. Just as in SLR.2, if our sample is biased then our estimates will also be biased.
Multiple linear regression assumptions

MLR.3 (No Perfect Collinearity) – In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.

If an independent variable is an exact linear combination of the other independent variables, then we say the model suffers from perfect collinearity and it cannot be estimated by OLS. The assumption does allow the independent variables to be correlated, and they often are, but it rules out perfect correlation. A small sketch of this failure follows below.
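As a hedged illustration with invented data, the sketch below constructs a regressor that is an exact linear function of another, so the cross-product matrix X'X is singular and the normal equations have no unique solution.

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 2.0 * x1 + 3.0          # exact linear function of x1: perfect collinearity
y  = np.array([2.0, 4.1, 5.9, 8.2])

X = np.column_stack([np.ones_like(x1), x1, x2])

# X'X is singular because one column of X is a linear combination of the others
try:
    np.linalg.solve(X.T @ X, X.T @ y)
except np.linalg.LinAlgError as err:
    print("OLS fails under perfect collinearity:", err)
```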
Multiple linear regression assumptions

The MLR.3 assumption does not exclude non-linear combinations of independent variables, such as the use of a squared variable.

MLR.4 (Zero conditional mean) – The error u has an expected value of zero given any values of the independent variables. In other words,

$$E(u \mid x_1, x_2, \dots, x_k) = 0$$

There are several ways in which this assumption might fail to hold.
Multiple linear regression assumptions

 The functional form is not specified properly.
 An omitted variable in our functional relationship is correlated with one (or more) of the independent variables.
 Measurement error.
 Simultaneous effects between regressor and regressand.

For now, we only need to worry about the first two. When MLR.4 holds, it is often said that we have exogenous explanatory variables.
Unbiasedness of OLS

If assumptions MLR.1 through MLR.4 hold, the OLS estimators are unbiased.

Theorem 3.1 (Unbiasedness of OLS) – Under assumptions MLR.1 through MLR.4, for any value of the population parameter $\beta_j$, the expected value of the OLS estimate equals that parameter. That is,

$$E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \dots, k$$

This theorem also applies to the simple linear regression model; essentially, it formalises the unbiasedness of OLS.
Multiple linear regression assumptions

MLR.5 (Homoskedasticity) – The error u has the same variance given any values of the explanatory variables. In other words,

$$Var(u \mid x_1, x_2, \dots, x_k) = \sigma^2$$

Just as in the simple linear regression model, the variance of the unobserved error is not affected by the values of the explanatory variables. This holds for each explanatory variable in the model.
Multiple linear regression assumptions

If we consider a model explaining the wage of school teachers, we could write

$$wage = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 tenure + u$$

Assumption MLR.5 (homoskedasticity) would state that

$$Var(u \mid educ, exper, tenure) = \sigma^2$$

If the variance of the error changes with any of the three explanatory variables, then heteroskedasticity is present.
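As an illustration (simulated data, not a formal test), the sketch below generates errors whose variance grows with educ, violating MLR.5, and compares the error variance across low and high values of educ; all numbers are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

educ = rng.uniform(8, 20, size=n)
# Heteroskedastic errors: the standard deviation of u grows with educ,
# so Var(u | educ) is not constant and MLR.5 fails.
u = rng.normal(0.0, 0.5 * educ, size=n)

low  = u[educ < 12]
high = u[educ >= 16]
print("Var(u | educ < 12): ", low.var().round(2))
print("Var(u | educ >= 16):", high.var().round(2))
# Under homoskedasticity the two variances would be approximately equal.
```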
The Gauss-Markov Theorem (BLUE)

Theorem 3.4 (Gauss-Markov Theorem) – Under Assumptions MLR.1 through MLR.5, $\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k$ are the best linear unbiased estimators (BLUE) of $\beta_0, \beta_1, \dots, \beta_k$, respectively.

Best – It has the smallest variance.
Linear – It is within the class of linear estimators.
Unbiased – $E(\hat{\beta}_j) = \beta_j$.
Estimator – A rule that can be applied to any sample to produce an estimate.
Inclusion/exclusion of regressors

When we add or remove an explanatory variable from our model, the effect (or lack thereof) depends on the relevance of the variable being included or excluded. So we have two situations of interest:

1) Inclusion of an irrelevant variable in a regression model.
2) Exclusion of a relevant variable from a regression model.


Including an irrelevant variable in a regression model

The first case, in which an irrelevant variable is included in the model, is also called overspecification of the model. It can look like this:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

In this example, $x_3$ has no effect on y and the model is overspecified; in this case, $\beta_3 = 0$. But we do not know this and estimate the model:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3$$
Including an irrelevant variable in a regression model

 Thankfully, the inclusion of an irrelevant regressor does not affect the unbiasedness of OLS. That is to say, if assumptions MLR.1 through MLR.4 hold, the OLS estimators of each $\beta_j$ are unbiased, i.e. $E(\hat{\beta}_j) = \beta_j$ for any value of $\beta_j$.
 In any particular sample, the estimate $\hat{\beta}_3$ will most likely not be exactly zero, but it should be close to zero and, on average across all samples, it will equal zero.
 While the OLS estimators remain unbiased, the inclusion of an irrelevant variable does affect the variance of the OLS estimators (to be seen). A simulation sketch follows below.
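A Monte Carlo sketch of this point, under assumed true parameters (the vector `beta` below is an assumption, with the coefficient on the irrelevant x3 set to zero): individual estimates of that coefficient vary around zero, but their average over many simulated samples is close to zero.

```python
import numpy as np

rng = np.random.default_rng(42)
beta = np.array([1.0, 0.5, -0.3, 0.0])   # assumed true parameters; beta_3 = 0 (x3 irrelevant)
n, reps = 200, 2_000

b3_estimates = []
for _ in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
    y = X @ beta + rng.normal(size=n)
    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    b3_estimates.append(b_hat[3])

# Individual estimates are rarely exactly zero, but their average is close to it
print("average estimate of beta_3:", np.mean(b3_estimates).round(4))
```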
Exclusion of a relevant variable in a regression model

 The second case, in which we exclude a relevant variable from the regression model, is more complex.
 We saw previously that the exclusion of a relevant variable can be problematic, especially if that variable is correlated with one of the other regressors, as this violates MLR.4 and the unbiasedness of the OLS estimators. Now we take a closer look at the process. Suppose we want to estimate the following:

$$wage = \beta_0 + \beta_1 educ + \beta_2 abil + u$$


Exclusion of a relevant variable in a regression model

 Instead of the previous model, we estimate the following:

$$wage = \beta_0 + \beta_1 educ + v$$

where $v = \beta_2 abil + u$.

The estimated equation is

$$\widetilde{wage} = \tilde{\beta}_0 + \tilde{\beta}_1 educ$$
Exclusion of a relevant variable in a regression model

 Notice the difference between "~" and "^" in the model. We use the tilde to show explicitly that we are estimating an underspecified model.
 In this case, the underspecified coefficient $\tilde{\beta}_1$ can be written as

$$\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$$

where $\hat{\beta}_1$ and $\hat{\beta}_2$ are the estimates from the regression that includes both educ and abil (the true model), while $\tilde{\delta}_1$ is the slope of the simple linear regression of abil on educ, i.e. ability as a function of education.
Exclusion of a relevant variable in a regression model

Since we can treat $\tilde{\delta}_1$ as non-random (it depends only on the sample values of the regressors, on which we condition) and assumptions MLR.1-MLR.4 are met, we can write:

$$E(\tilde{\beta}_1) = E(\hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1) = E(\hat{\beta}_1) + E(\hat{\beta}_2)\tilde{\delta}_1 = \beta_1 + \beta_2 \tilde{\delta}_1$$

From this, we can calculate the bias in the estimation of $\beta_1$:

$$Bias(\tilde{\beta}_1) = E(\tilde{\beta}_1) - \beta_1 = \beta_2 \tilde{\delta}_1$$
Exclusion of a relevant variable in a regression model

$$Bias(\tilde{\beta}_1) = \beta_2 \tilde{\delta}_1$$

The term on the right side of the equation is called the omitted variable bias, and we can infer the direction of the bias from the signs of both $\beta_2$ and $\tilde{\delta}_1$. Of course, if either of them equals zero, then the OLS estimate of $\beta_1$ is unbiased.

The sign of $\tilde{\delta}_1$ will depend on the correlation between the regressors educ and abil or, in general terms, between $x_1$ and $x_2$, given by $Corr(x_1, x_2)$. A simulation sketch of this bias follows below.
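Below is a simulation sketch of the omitted variable bias formula, with assumed true parameters and a correlated pair of regressors (beta0, beta1, beta2 and delta1 are assumptions chosen for illustration): the average of the underspecified estimate is close to $\beta_1 + \beta_2 \delta_1$ rather than $\beta_1$.

```python
import numpy as np

rng = np.random.default_rng(7)
beta0, beta1, beta2 = 1.0, 0.6, 0.4   # assumed true parameters
delta1 = 0.5                          # population slope of x2 on x1
n, reps = 500, 2_000

b1_tilde = []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = delta1 * x1 + rng.normal(scale=0.5, size=n)  # x2 correlated with x1
    y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    # Underspecified regression: omit x2 and regress y on x1 only
    X = np.column_stack([np.ones(n), x1])
    b_tilde, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1_tilde.append(b_tilde[1])

print("average of beta_1 tilde:   ", np.mean(b1_tilde).round(3))
print("beta_1 + beta_2 * delta_1: ", beta1 + beta2 * delta1)  # approx. 0.8
```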
Exclusion of a relevant variable in a regression model

$$Bias(\tilde{\beta}_1) = \beta_2 \tilde{\delta}_1$$

Summary of the bias in $\tilde{\beta}_1$ when $x_2$ is omitted in an underspecified model:

                   Corr(x1, x2) > 0     Corr(x1, x2) < 0
beta_2 > 0         Positive bias        Negative bias
beta_2 < 0         Negative bias        Positive bias

Note that we can tell the direction of the bias in this way, but not its magnitude.
Exclusion of a relevant variable in a regression model

$$Bias(\tilde{\beta}_1) = \beta_2 \tilde{\delta}_1$$

To summarise, if the excluded variable is not correlated with any other regressor, then there is no bias in the OLS estimates. However, if there is correlation, the direction of the bias will depend on the signs of both the true parameter and the correlation between the regressors.

Moving on to equations with three or more regressors, in general, if a relevant variable is excluded then the OLS estimators will be biased, even if the excluded variable is correlated with only one of the regressors. The direction of the bias is not as straightforward as in the case with two explanatory variables.
To do list

 Read sections 3.1-3.3 and 3.5 of Wooldridge if you haven't already.
 Our next lecture will cover goodness-of-fit from chapters 2 and 3.
 Problem set #1 has been released on Moodle. You can submit your answers up until 15th March at noon. Late submissions will not be accepted.
