
Classical Assumptions

of Regression Model
DR. INDRA, S.Si, M.Si
Introduction: Review of OLS

Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi
• Objective of OLS → Minimize the sum of
squared residuals:
min Σi=1..n ei²   where ei = Yi − Ŷi

• Remember that OLS is not the only possible


estimator of the βs.
• But OLS is the best estimator under certain
assumptions…
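A minimal sketch of this objective, using simulated data (all coefficient values and variable names below are illustrative assumptions, not part of the slides), is to build the design matrix, solve the least-squares problem, and evaluate the sum of squared residuals:

```python
# Minimal sketch: OLS as the minimizer of the sum of squared residuals,
# illustrated on simulated data (true coefficients chosen arbitrarily).
import numpy as np

rng = np.random.default_rng(0)
n = 200
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
eps = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 0.5 * X2 + eps          # assumed "true" model

X = np.column_stack([np.ones(n), X1, X2])    # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

e = Y - X @ beta_hat                         # residuals ei = Yi - Yhat_i
print("beta_hat:", beta_hat)
print("sum of squared residuals:", np.sum(e**2))
```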
Property of Estimator under OLS

The OLS estimator of the regression
coefficients should have two key properties:
1. Unbiased
2. Efficient
These properties hold only if several assumptions
(called the Classical Assumptions) are fulfilled.
Classical Assumptions

1. Regression is linear in parameters


2. Some observations in X are different
3. Error term has zero population mean
4. Error term is not correlated with X’s
5. No serial correlation
6. No heteroskedasticity
7. No perfect multicollinearity
and we usually add:
8. Error term is normally distributed
Assumption 1: Linearity in Parameter

• The regression model:


– A) is linear
• It can be written as
Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi
• This doesn’t mean that the theory must be linear
• For example… suppose we believe that CEO salary is
related to the firm’s sales and CEO’s tenure.
• We might believe the model is:
log(salaryi) = β0 + β1 log(salesi) + β2 tenurei + β3 tenurei² + εi
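This model is nonlinear in the original variables but linear in the parameters, so OLS still applies once the regressors are transformed. A hedged sketch with simulated data (sample size and coefficients are assumptions for illustration):

```python
# Sketch: nonlinear in the variables, linear in the parameters.
# Regressing log(salary) on log(sales), tenure and tenure^2 is still a
# linear-in-beta problem.  Data below are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n = 500
sales = rng.lognormal(mean=10.0, sigma=1.0, size=n)
tenure = rng.integers(1, 30, size=n).astype(float)
eps = rng.normal(scale=0.2, size=n)
log_salary = 4.0 + 0.3 * np.log(sales) + 0.05 * tenure - 0.001 * tenure**2 + eps

# Linear-in-parameters design matrix: [1, log(sales), tenure, tenure^2]
X = np.column_stack([np.ones(n), np.log(sales), tenure, tenure**2])
beta_hat, *_ = np.linalg.lstsq(X, log_salary, rcond=None)
print(beta_hat)   # estimates of beta_0 ... beta_3
```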
Assumption-1: Linearity in Parameter

The regression model:


– B) is correctly specified
• The model must have the right variables
• No omitted variables
• The model must have the correct functional form
• This is all untestable → We need to rely on economic
theory.
– C) must have an additive error term
• The model must have + εi
Assumption-2 : Some observations in X are different

 1
S xy S xx =  ( X t − X ) 2 =  X t − n( X ) 2 =  X t − (X t ) 2
2 2

 = n
S xx S xy =  ( X t − X )(Yt − Y ) = ( X t Yt ) − n X Y

  1  1
 =Y − X =
n
 Yt − 
n
 Xt

The estimator cannot be computed if the variation of X (Sxx) is 0, i.e. in the
situation where all observations of Xt are identical. In other words, this
assumption requires a nonzero sample variance:
var(X) = (1/(n−1)) Σ(Xt − X̄)² > 0
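A short sketch of these formulas on simulated data (the explicit guard against Sxx = 0 is the content of this assumption):

```python
# Sketch of the simple-regression formulas above:
#   beta1_hat = Sxy / Sxx,  beta0_hat = Ybar - beta1_hat * Xbar.
# If every Xt is identical, Sxx = 0 and the slope is undefined.
import numpy as np

def ols_simple(X, Y):
    Xbar, Ybar = X.mean(), Y.mean()
    S_xx = np.sum((X - Xbar) ** 2)
    S_xy = np.sum((X - Xbar) * (Y - Ybar))
    if S_xx == 0:
        raise ValueError("all X observations identical: Sxx = 0, slope undefined")
    beta1 = S_xy / S_xx
    beta0 = Ybar - beta1 * Xbar
    return beta0, beta1

rng = np.random.default_rng(2)
X = rng.normal(size=100)
Y = 3.0 + 2.0 * X + rng.normal(size=100)
print(ols_simple(X, Y))                 # roughly (3, 2)
# ols_simple(np.full(100, 5.0), Y)      # would raise: no variation in X
```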
Assumption-3: Error term has zero population mean

• ε is a random variable with E(ε) = 0 (on average, the
value of ε is 0)
• Each observation has a random error with a mean of
zero
• What if E(εi)≠0? → This is actually fixed by adding a
constant (AKA intercept) term.
• Example: Suppose instead the mean of εi
was -4 → Then we know E(εi+4)=0
• We can add 4 to the error term and subtract 4 from the
constant term:
• Yi =β0+ β1Xi+εi
• Yi =(β0-4)+ β1Xi+(εi+4)
• We can rewrite:
• Yi =β0*+ β1Xi+εi*
• Where β0*= β0-4 and εi*=εi+4
• Now E(εi*)=0, so we are OK
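A quick numerical check of this re-centering argument (simulated data, with the error mean deliberately set to -4 as an assumption for illustration):

```python
# Sketch: an error term with mean -4 only shifts the estimated intercept;
# the slope is unaffected, which is why E(eps) = 0 can be "fixed" by the constant.
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
X = rng.normal(size=n)
eps = rng.normal(loc=-4.0, scale=1.0, size=n)   # mean of eps is -4, not 0
Y = 2.0 + 1.5 * X + eps                         # assumed beta0 = 2, beta1 = 1.5

Xmat = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xmat, Y, rcond=None)
print(beta_hat)   # intercept near 2 - 4 = -2, slope still near 1.5
```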
Assumption-4:Error term is not correlated with X’s

• This assumption is called exogeneity → important!
• All explanatory variables are uncorrelated with
the error term
E(εi|X1i, X2i, …, XKi) = 0
• Explanatory variables are determined outside of the
model (They are exogenous)
• What happens if this assumption is violated?
• Suppose we have the model:
• Yi =β0+ β1Xi+εi
• Suppose Xi and εi are positively correlated → When Xi is
large, εi tends to be large as well.
Assumption-4:Error term is not correlated with X’s

• Why would x and ε be correlated?


• Suppose you are trying to study the
relationship between the price of a
hamburger and the quantity sold across a
wide variety of Ventura County restaurants.
• We estimate the relationship using the following model:
salesi= β0+β1pricei+εi
• What’s the problem?
Assumption-4:Error term is not correlated with X’s

• What’s the problem?


– What else determines sales of hamburgers?
– How would you decide between buying a
burger at McDonald’s ($0.89) or a burger at
TGI Fridays ($9.99)?
– Quality differs
– salesi = β0 + β1pricei + εi → quality isn't an X
variable even though it should be.
– It becomes part of εi
Assumption-4:Error term is not correlated with X’s

What’s the problem?


– But price and quality are highly positively
correlated
– Therefore x and ε are also positively
correlated.
– This means that the estimate of β1 will be too
high
– This is called “Omitted Variables Bias”
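The hamburger example can be mimicked with simulated data (all coefficients below are made-up assumptions) to see the bias directly: price and unobserved quality are positively correlated, and leaving quality out pushes the price coefficient upward.

```python
# Sketch of omitted variables bias: price and unobserved quality are positively
# correlated and quality raises sales, so regressing sales on price alone
# pushes the price coefficient upward relative to the assumed true value (-3).
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
quality = rng.normal(size=n)
price = 5.0 + 2.0 * quality + rng.normal(size=n)        # higher quality -> higher price
sales = 100.0 - 3.0 * price + 10.0 * quality + rng.normal(size=n)

X_short = np.column_stack([np.ones(n), price])           # quality omitted
X_full = np.column_stack([np.ones(n), price, quality])   # quality included
b_short, *_ = np.linalg.lstsq(X_short, sales, rcond=None)
b_full, *_ = np.linalg.lstsq(X_full, sales, rcond=None)
print("price coefficient, quality omitted :", b_short[1])   # biased upward (> -3)
print("price coefficient, quality included:", b_full[1])    # close to -3
```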
Assumption-5:No Serial Correlation

• The error term (u) is independently distributed → Cov(ut, us)
= E(ut·us) = 0, for t ≠ s
• This assumption says that the error term (u) is independently
and identically distributed (i.i.d.)
• Serial Correlation: The error terms across observations
are correlated with each other
• i.e. ε1 is correlated with ε2, etc.
• This is most important in time series
• If errors are serially correlated, an increase in the error
term in one time period affects the error term in the next.
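A common textbook illustration of serial correlation is an AR(1) error process; the sketch below (the persistence parameter 0.8 is an assumption for illustration) shows errors that carry over from one period to the next:

```python
# Sketch: AR(1) errors as an example of serial correlation.  Each error depends
# on the previous one, so a shock in period t carries over into period t+1.
import numpy as np

rng = np.random.default_rng(5)
T, rho = 500, 0.8
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + rng.normal()     # u_t correlated with u_{t-1}

# Sample autocorrelation at lag 1: close to rho here, roughly 0 for i.i.d. errors
print(np.corrcoef(u[:-1], u[1:])[0, 1])
```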
Assumption-6:No Heteroskedasticity
(Homoscedasticity)
• The error term (u) has an identical distribution
with the same variance (σ²):
• Var(ut) = E(ut²) = σ²

• Homoskedasticity: The error has a constant


variance
• This is what we want…as opposed to
• Heteroskedasticity: The variance of the
error depends on the values of Xs.
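One way to see the difference is to simulate both cases (the scale parameters are assumptions for illustration): under homoskedasticity the error spread is the same everywhere, while under heteroskedasticity it grows with X.

```python
# Sketch: homoskedastic vs. heteroskedastic errors.  In the second case the
# error variance grows with X instead of staying constant at sigma^2.
import numpy as np

rng = np.random.default_rng(6)
n = 1_000
X = rng.uniform(1, 10, size=n)
u_homo = rng.normal(scale=2.0, size=n)     # Var(u) = 4 for every observation
u_hetero = rng.normal(scale=0.5 * X)       # Var(u) = (0.5*X)^2 depends on X

small, large = X < 3, X > 8                # error spread for small vs. large X
print("homoskedastic  :", u_homo[small].std(), u_homo[large].std())      # similar
print("heteroskedastic:", u_hetero[small].std(), u_hetero[large].std())  # very different
```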
Assumption-6:No Heteroskedasticity
(Homoscedasticity)

Homoskedasticity:
The error has constant
variance
Assumption-6:No Heteroskedasticity
(Homoscedasticity)

Heteroskedasticity:
Spread of error
depends on X.
Assumption-6:No Heteroskedasticity
(Homoscedasticity)

Another form of
Heteroskedasticity
Assumption 7: No Perfect Multicollinearity

• Two variables are perfectly collinear if one can


be determined perfectly from the other (i.e. if
you know the value of x, you can always find
the value of z).
• Example: If we regress income on age, and
include both age in months and age in years.
– But age in years = age in months/12
– e.g. if we know someone is 246 months old,
we also know that they are 20.5 years old.
Assumption 7: No Perfect Multicollinearity

• What’s wrong with this?


• incomei= β0 + β1agemonthsi + β2ageyearsi + εi
• What is β1?
• It is the change in income associated with a
one unit increase in “age in months,” holding
age in years constant.
– But if you hold age in years constant, age in
months doesn’t change!
Assumption-7: No Perfect Multicollinearity

β1 = Δincome/Δagemonths
Holding Δageyears = 0
If Δageyears = 0; then Δagemonths = 0
So β1 = Δincome/0
It is undefined!
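The age-in-months / age-in-years example can be reproduced numerically (the numbers are illustrative assumptions): the design matrix loses a column of rank, so (X'X) cannot be inverted and the two age coefficients are not separately identified.

```python
# Sketch: perfect multicollinearity.  Age in years is exactly age in months / 12,
# so one column of X is a linear combination of the others and the individual
# coefficients on the two age variables are not identified.
import numpy as np

rng = np.random.default_rng(7)
n = 200
age_months = rng.integers(240, 720, size=n).astype(float)
age_years = age_months / 12.0            # exact linear function of age_months
income = 20_000 + 100 * age_months + rng.normal(scale=5_000, size=n)  # outcome in the example

X = np.column_stack([np.ones(n), age_months, age_years])
print("rank of X:", np.linalg.matrix_rank(X), "out of", X.shape[1], "columns")  # 2, not 3
print("condition number of X'X:", np.linalg.cond(X.T @ X))  # huge (or infinite):
                                                             # (X'X) is not invertible
```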
Assumption-8: Normally Distributed Error

• This is not required for OLS, but it is important for


hypothesis testing
• More on this assumption next time.
Putting Assumptions All Together

• Last class, we talked about how to compare


estimators. We want:
• 1. β̂ is unbiased.
– E(β̂) = β
– on average, the estimator is equal to the population
value
• 2. β̂ is efficient
– The variance of the estimator is as small as possible
Putting Assumptions All Together
• Possible distributions of error

Which one is the best?


Gauss-Markov Theorem

• Given OLS assumptions 1 through 4, the OLS
estimators of the βs are unbiased and consistent.
• An unbiased (least squares) estimator is said to be
more efficient if its variance is smaller than that of
other unbiased estimators. This condition requires
assumptions 5 and 6.
• According to assumptions 1-7, the OLS estimator of βk
is the minimum variance estimator from the
set of all linear unbiased estimators of βk for k=0,1,2,…,K
• OLS is BLUE → The Best, Linear, Unbiased Estimator
Gauss-Markov Theorem

• What happens if we add assumption 8?


• Given assumptions 1 through 8, OLS is the
best unbiased estimator
Gauss-Markov Theorem

With Assumptions 1-8 OLS is:


1. Unbiased: E(β̂) = β
2. Minimum Variance – the sampling distribution
is as small as possible
3. Consistent – as n→∞, the estimators converge
to the true parameters
– As n increases, variance gets smaller, so each estimate
approaches the true value of β.
4. Normally Distributed. You can apply statistical
tests to them.
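As a rough Monte Carlo illustration of these properties (all simulation settings below are assumptions), one can draw many samples that satisfy the assumptions and look at how the OLS slope estimates behave:

```python
# Sketch: a small Monte Carlo under the classical assumptions.  Across many
# simulated samples the OLS slope estimates average out to the true beta1
# (unbiasedness) and their spread shrinks as n grows (consistency).
import numpy as np

rng = np.random.default_rng(8)
beta0, beta1 = 1.0, 2.0                              # assumed true parameters

def slope_estimates(n, reps=2_000):
    slopes = np.empty(reps)
    for r in range(reps):
        X = rng.normal(size=n)
        Y = beta0 + beta1 * X + rng.normal(size=n)   # i.i.d. normal errors
        Xmat = np.column_stack([np.ones(n), X])
        slopes[r] = np.linalg.lstsq(Xmat, Y, rcond=None)[0][1]
    return slopes

for n in (25, 100, 400):
    s = slope_estimates(n)
    print(f"n={n:4d}  mean of beta1_hat={s.mean():.3f}  std={s.std():.3f}")
```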
