
Two-Variable Regression Analysis

The document discusses regression analysis and two-variable linear regression models. It introduces the concepts of the population regression function (PRF), which defines the theoretical linear relationship between two variables, and the sample regression function (SRF), which is estimated from sample data. It also discusses the error term and how real-world data may differ from the theoretical PRF due to random and unspecified factors captured in the error term. Least squares estimation is introduced as a method to estimate the parameters of the SRF.

Uploaded by

Muthia Ardhini


Week 2

Two-variable Regression Analysis
Gujarati(2003): Chapter 2
All rights reserved by Dr.Bill Wan Sing Hung - HKBU

Purpose of Regression Analysis

1. Estimate a relationship among economic variables, such as Y = f(X).

2. Forecast or predict the value of one variable, Y, based on the value of another variable, X.

Weekly Food Expenditures
Y = dollars spent each week on food items.
X = consumer’s family weekly income.

The relationship between X and the expected value of Y, given X, might be linear:

$E(Y \mid X_i) = f(X_i) = \beta_1 + \beta_2 X_i$

Each conditional mean $E(Y \mid X_i)$ is thus a function of $X_i$. This equation is known as the population regression function (PRF).
[Figure: conditional probability distribution f(Y|X=80) of food expenditures Y, given an income of X = 80 dollars.]
[Figure: conditional probability distributions f(Y|X=80) and f(Y|X=100) of food expenditures Y, given incomes of X = 80 and X = 100 dollars.]
[Figure: population regression line (PRF) through the conditional means E(Y|X_i), which are 65 at X = 80, 101 at X = 140, and 149 at X = 220; the distribution of Y given X = 220 is drawn around its mean.]
[Figure: the line $E(Y \mid X) = \beta_1 + \beta_2 X$, plotting average consumption $E(Y \mid X)$ against income $X$, with intercept $\beta_1$ and slope $\beta_2 = \Delta E(Y \mid X) / \Delta X$.]

The Econometric Model: a linear relationship between average consumption and income.
Stochastic Specification of PRF
Given any income level $X_i$, a family's consumption is clustered around the average consumption of all families at that $X_i$, that is, around its conditional expectation $E(Y \mid X_i)$.
The deviation of any individual $Y_i$ is:

$u_i = Y_i - E(Y \mid X_i)$

or $Y_i = E(Y \mid X_i) + u_i$

or $Y_i = \beta_1 + \beta_2 X_i + u_i$

where $u_i$ is the stochastic error, or stochastic disturbance.

The Error Term
Y is a random variable composed of two parts:

I. Systematic component: $E(Y) = \beta_1 + \beta_2 X$
   This is the mean of Y.

II. Random component: $u = Y - E(Y) = Y - \beta_1 - \beta_2 X$
   This is called the random or stochastic error.

Together E(Y) and u form the model:

$Y = \beta_1 + \beta_2 X + u$
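For example, given an income of X = 80 dollars, the individual consumption values are:
$Y_1 = 55 = \beta_1 + \beta_2 (80) + u_1$
$Y_2 = 60 = \beta_1 + \beta_2 (80) + u_2$
$Y_3 = 65 = \beta_1 + \beta_2 (80) + u_3$
$Y_4 = 70 = \beta_1 + \beta_2 (80) + u_4$
$Y_5 = 75 = \beta_1 + \beta_2 (80) + u_5$

Estimated average:
$\hat{Y}_1 = 65 = \hat{\beta}_1 + \hat{\beta}_2 (80)$
$\hat{Y}_2 = 65 = \hat{\beta}_1 + \hat{\beta}_2 (80)$
$\hat{Y}_3 = 65 = \hat{\beta}_1 + \hat{\beta}_2 (80)$
$\hat{Y}_4 = 65 = \hat{\beta}_1 + \hat{\beta}_2 (80)$
$\hat{Y}_5 = 65 = \hat{\beta}_1 + \hat{\beta}_2 (80)$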
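
The X = 80 example can be checked with a few lines of Python. This is only a sketch: it treats the average of the five listed consumption values as the conditional mean, as the slide does, and the variable names are made up for the illustration.

```python
# Consumption of the five families observed at income X = 80 (values from the slide).
consumption_at_80 = [55, 60, 65, 70, 75]

# Average consumption at X = 80, used here as the conditional mean E(Y | X = 80).
conditional_mean = sum(consumption_at_80) / len(consumption_at_80)

# Deviation of each family from that mean: u_i = Y_i - E(Y | X = 80).
deviations = [y - conditional_mean for y in consumption_at_80]

print(conditional_mean)  # 65.0
print(deviations)        # [-10.0, -5.0, 0.0, 5.0, 10.0]
print(sum(deviations))   # 0.0
```

The deviations sum to zero, which is what a mean-zero error term looks like within a single income group.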
The reasons for stochastic disturbance

• Vagueness of theory
• Unavailability of data
• Direct effect vs indirect effect
• (Core variables vs peripheral variables)
• Intrinsic randomness in human behaviour
• Poor proxy variables
• Principle of parsimony
• Wrong functional form

Unobservable Nature of Error Term
• Unspecified factors (explanatory variables not in the model) may be in the error term.
  For example, a final exam score depends not only on class attendance but also on other unobserved factors such as student ability, maths background, effort, etc.
• Approximation error is in the error term if the relationship between Y and X is not exactly linear.
• Strictly unpredictable random behavior that may be unique to that observation is in the error term.
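[Figure: the sample regression line (SRF) $\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X$ and the population regression line (PRF) $E(Y \mid X) = \beta_1 + \beta_2 X$, with observations $Y_1, \dots, Y_4$ at $x_1, \dots, x_4$; at $x_2$ the figure marks $Y_2$, $\hat{Y}_2$, $E(Y \mid x_2)$, the error $u_2$ and the residual $\hat{u}_2$.]

The relationship among $Y_i$, $u_i$ and the true regression line.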
The Sample Regression Function (SRF)
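[Figure: two sample regression lines, SRF1 and SRF2, each of the form $\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X$, fitted to observations $Y_1, \dots, Y_4$ at $x_1, \dots, x_4$, with residuals $\hat{u}_1, \dots, \hat{u}_4$.]

Different samples will have different SRFs.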
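SRF:
$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$
or $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$ ($\hat{u}_i$: residual)
or $Y_i = b_1 + b_2 X_i + e_i$

PRF:
$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$
$Y_i = \beta_1 + \beta_2 X_i + u_i$ ($u_i$: error term or disturbance)

$\hat{Y}_i$ = estimator of $Y_i$ (of $E(Y \mid X_i)$)
$\hat{\beta}_i$ or $b_i$ = estimator of $\beta_i$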
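Least Squares Method

[Figure: two candidate lines through the same five observations, SRF1: $\hat{Y} = b_1 + b_2 X$ and SRF2: $\hat{Y} = a_1 + a_2 X$, with the residual of each observation marked.]

SRF1: $\sum |\hat{u}| = |1| + |-1| + |-1| + |1| + |-1.5| = 5.5$
$\sum \hat{u}^2 = 1^2 + 1^2 + 1^2 + 1^2 + 1.5^2 = 6.25$ (smaller)

SRF2: $\sum |\hat{u}| = |2| + |0| + |-1/2| + |1| + |-2| = 5.5$
$\sum \hat{u}^2 = 2^2 + 0^2 + (-1/2)^2 + 1^2 + (-2)^2 = 9.25$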
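
The comparison of the two lines can be reproduced directly. The sketch below only re-adds the residuals listed on the slide; the dictionary and helper names are illustrative, not part of the original material.

```python
# Residuals read off the slide for the two candidate lines.
residuals = {
    "SRF1": [1, -1, -1, 1, -1.5],
    "SRF2": [2, 0, -0.5, 1, -2],
}

def sum_abs(us):
    """Sum of absolute residuals."""
    return sum(abs(u) for u in us)

def sum_sq(us):
    """Sum of squared residuals."""
    return sum(u * u for u in us)

for name, us in residuals.items():
    print(name, sum_abs(us), sum_sq(us))
# SRF1 5.5 6.25
# SRF2 5.5 9.25
```

Both lines tie on the sum of absolute residuals, but the squared criterion separates them because it penalizes the two large residuals of SRF2 more heavily.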
Ordinary Least Squares (OLS) Method

$Y_i = \beta_1 + \beta_2 X_i + u_i$
$u_i = Y_i - \beta_1 - \beta_2 X_i$

Minimize the sum of squared errors:

$\sum_{i=1}^{n} u_i^2 = \sum_{i=1}^{n} (Y_i - \beta_1 - \beta_2 X_i)^2 = f(\beta_1, \beta_2)$

Minimize w.r.t. $\beta_1$ and $\beta_2$:

$f(\beta_1, \beta_2) = \sum_{i=1}^{n} (Y_i - \beta_1 - \beta_2 X_i)^2 = f(\cdot)$

$\partial f(\cdot) / \partial \beta_1 = -2 \sum (Y_i - \beta_1 - \beta_2 X_i)$
$\partial f(\cdot) / \partial \beta_2 = -2 \sum X_i (Y_i - \beta_1 - \beta_2 X_i)$

Set each of these two derivatives equal to zero and solve these two equations for the two unknowns, $\beta_1$ and $\beta_2$.
To minimize $f(\cdot)$, set the two derivatives equal to zero:

$\partial f(\cdot) / \partial \beta_1 = -2 \sum (Y_i - b_1 - b_2 X_i) = 0$
$\partial f(\cdot) / \partial \beta_2 = -2 \sum X_i (Y_i - b_1 - b_2 X_i) = 0$

When these two derivatives are set to zero, $\beta_1$ and $\beta_2$ become $b_1$ and $b_2$, because they no longer represent just any values of $\beta_1$ and $\beta_2$ but the special values that correspond to the minimum of $f(\cdot)$.
$-2 \sum (Y_i - b_1 - b_2 X_i) = 0$
$-2 \sum X_i (Y_i - b_1 - b_2 X_i) = 0$

$\sum Y_i - n b_1 - b_2 \sum X_i = 0$
$\sum X_i Y_i - b_1 \sum X_i - b_2 \sum X_i^2 = 0$

$n b_1 + b_2 \sum X_i = \sum Y_i$
$b_1 \sum X_i + b_2 \sum X_i^2 = \sum X_i Y_i$
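In matrix form:

$\begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}$

Solve for the two unknowns:

$b_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - (\sum X_i)^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum x y}{\sum x^2}$

where $x = X_i - \bar{X}$ and $y = Y_i - \bar{Y}$ are deviations from the sample means.

$b_1 = \bar{Y} - b_2 \bar{X}$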
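
The closed-form solution can be sketched in plain Python. The function names `ols_fit` and `slope_raw_sums` are hypothetical, and the input data are the three conditional means quoted earlier in the deck (65, 101 and 149 at incomes 80, 140 and 220), used here only as a convenient check because they lie exactly on a line.

```python
import math

def ols_fit(xs, ys):
    """Return (b1, b2) for the line Y = b1 + b2*X fitted by least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Deviation form: b2 = sum(x*y) / sum(x^2), with x, y measured from their means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("X values must not all be identical")
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2

def slope_raw_sums(xs, ys):
    """Same slope computed from the raw-sum form of the normal equations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)

# Conditional means quoted on the PRF slide: (income, mean consumption).
incomes = [80, 140, 220]
mean_consumption = [65, 101, 149]

b1, b2 = ols_fit(incomes, mean_consumption)
print(round(b1, 6), round(b2, 6))
print(math.isclose(b2, slope_raw_sums(incomes, mean_consumption)))
```

With these three points the fit should come out at an intercept of about 17 and a slope of about 0.6, and the two formulas for the slope should agree.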
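[Figure: the fitted line $\hat{Y} = b_1 + b_2 X$ and an alternative line $\hat{Y}^* = b_1^* + b_2^* X$ through the observations $Y_1, \dots, Y_4$ at $x_1, \dots, x_4$, with the residuals $\hat{u}_1^*, \dots, \hat{u}_4^*$ measured from the alternative line.]

Why is the SRF the best line?

The sum of squared residuals from any other line $\hat{Y}^*$ is larger.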
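
A quick numerical check of this claim, as a sketch only: the five (income, consumption) pairs below are invented for the illustration, and `ssr` and `ols_fit` are hypothetical helper names. The code fits the least squares line and then compares its sum of squared residuals with that of several nearby lines.

```python
def ssr(b1, b2, xs, ys):
    """Sum of squared residuals of the line Y = b1 + b2*X."""
    return sum((y - b1 - b2 * x) ** 2 for x, y in zip(xs, ys))

def ols_fit(xs, ys):
    """Least squares intercept and slope from the deviation formulas."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2

# Hypothetical (income, consumption) pairs, invented for this illustration.
xs = [80, 100, 120, 140, 160]
ys = [70, 65, 90, 95, 110]

b1, b2 = ols_fit(xs, ys)
best = ssr(b1, b2, xs, ys)

# Shift the intercept and slope a little in every direction and compare.
alternatives = [
    (b1 + d1, b2 + d2)
    for d1 in (-5.0, 0.0, 5.0)
    for d2 in (-0.05, 0.0, 0.05)
    if (d1, d2) != (0.0, 0.0)
]
worse = all(ssr(a1, a2, xs, ys) > best for a1, a2 in alternatives)
print(worse)  # True
```

Every shifted line gives a larger sum of squared residuals, which is the sense in which the fitted line is "best".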
Assumptions of Simple Regression
1. The linear regression model is linear in the parameters:
   $Y = \beta_1 + \beta_2 X + u$
2. X values are fixed in repeated sampling, that is, X is nonstochastic.
3. Zero mean value of the error terms (disturbances, $u_i$):
   $E(u_i \mid X_i) = 0$
4. Homoscedasticity, or equal variance of $u_i$: the conditional variances of $u_i$ are identical, i.e.,
   $\mathrm{var}(u_i \mid X_i) = \sigma^2$
Homoscedasticity Case

[Figure: densities $f(Y_i)$ of expenditure $Y_i$ at income levels $x_1 = 80$ and $x_2 = 100$.]

The probability density functions for $Y_i$ at two levels of family income, $X_i$, are identical.
Heteroscedasticity Case

[Figure: densities $f(Y_i)$ of expenditure $Y_i$ at income levels $x_1$, $x_2$ and $x_3$, with the spread widening as income rises.]

The variance of $Y_i$ increases as family income, $x_i$, increases.
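
The difference between the two cases can be illustrated with a small simulation. Everything in this sketch is an assumption made for the illustration: the PRF coefficients, the two income levels, the error standard deviations and the sample size are all hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical PRF and income levels, chosen only for illustration.
BETA1, BETA2 = 17.0, 0.6
LOW_INCOME, HIGH_INCOME = 80, 220
N = 2000

def simulate(x, sigma):
    """Draw N expenditures Y = BETA1 + BETA2*x + u with u ~ N(0, sigma^2)."""
    return [BETA1 + BETA2 * x + rng.gauss(0.0, sigma) for _ in range(N)]

# Homoscedastic case: the same error spread at every income level.
homo_low = statistics.variance(simulate(LOW_INCOME, 5.0))
homo_high = statistics.variance(simulate(HIGH_INCOME, 5.0))

# Heteroscedastic case: the error spread grows in proportion to income.
hetero_low = statistics.variance(simulate(LOW_INCOME, 0.05 * LOW_INCOME))
hetero_high = statistics.variance(simulate(HIGH_INCOME, 0.05 * HIGH_INCOME))

print(round(homo_high / homo_low, 2))      # close to 1
print(round(hetero_high / hetero_low, 2))  # well above 1
```

In the homoscedastic run the two sample variances are nearly equal, while in the heteroscedastic run the variance at the higher income is several times larger.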
Assumptions of SRF (continued)

5. No autocorrelation between the disturbances:
   $\mathrm{cov}(u_i, u_j \mid X_i, X_j) = 0$ for $i \neq j$

6. Zero covariance between $u_i$ and $X_i$, i.e.,
   $\mathrm{cov}(u_i, X_i) = E(u_i X_i) = 0$

7. The number of observations (n) must be greater than the number of parameters (k) to be estimated:
   $n > k$
Assumptions of SRF (continued)

8. Variability in X values: the X values in a given sample must not all be the same; at least two must differ.

9. No specification bias or error: the regression model is correctly specified.

10. There is no perfect multicollinearity: no perfect linear relationship among the independent variables, i.e.,
   $X_k \neq \lambda X_m$
One more assumption that is often used in
practice but is not required for least squares:
(Optional) The values of Y are normally distributed about their mean for each value of X:

$Y \sim N(\beta_1 + \beta_2 X, \sigma^2)$

The Error Term Assumptions
1. The value of Y, for each value of X, is
   $Y = \beta_1 + \beta_2 X + u$
2. The average value of the random error u is:
   $E(u) = 0$
3. The variance of the random error u is:
   $\mathrm{var}(u) = \sigma^2 = \mathrm{var}(Y)$
4. The covariance between any pair of u's is:
   $\mathrm{cov}(u_i, u_j) = \mathrm{cov}(Y_i, Y_j) = 0$
5. u is normally distributed with mean 0 and variance $\sigma^2$:
   $u \sim N(0, \sigma^2)$

Prediction
Estimated regression equation:

$\hat{y}_t = 4 + 1.5 x_t$

$x_t$ = years of experience
$\hat{y}_t$ = predicted wage rate

If $x_t = 2$ years, then $\hat{y}_t = 7.00$ dollars per hour.
If $x_t = 3$ years, then $\hat{y}_t = 8.50$ dollars per hour.
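
The two predictions follow from plugging the experience values into the estimated equation. A minimal sketch, where the function name `predict_wage` and its default arguments are illustrative and simply restate the coefficients 4 and 1.5 from the slide:

```python
def predict_wage(years_experience, intercept=4.0, slope=1.5):
    """Predicted hourly wage from the slide's fitted line y_hat = 4 + 1.5 * x."""
    return intercept + slope * years_experience

for years in (2, 3):
    print(years, predict_wage(years))
# 2 7.0
# 3 8.5
```

Each extra year of experience raises the predicted wage by the slope, 1.50 dollars per hour.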
Mean Prediction:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$

$\hat{Y} = 24.454 + 0.5090 X$

At $X = 100$:

$\hat{Y} = 24.454 + 0.5090 (100) = 75.354$ (estimated result)

The “ex post” and “ex ante” forecasting:

For example, suppose you have data on Y and X for 1947–1999, and the consumption expenditure function estimated over 1947–1995 is

1947–1995: $\hat{Y}_t = 238.4 + 0.87 X_t$

Given the values $X_{96} = 10,419$; $X_{97} = 10,625$; …; $X_{99} = 11,286$, the calculated predictions, or “ex post” forecasts, are:

1996: $\hat{Y}_{96} = 238.4 + 0.87 (10,419) = 9,355$
1997: $\hat{Y}_{97} = 238.4 + 0.87 (10,625) = 9,535.50$
…
1999: $\hat{Y}_{99} = 238.4 + 0.87 (11,286) = 10,113.70$

(The reported ex post values are consistent with a slope of about 0.875 rather than exactly 0.87.)

The calculated prediction, or “ex ante” forecast, based on the assumed value $X_{2000} = 12,000$ is:

2000: $\hat{Y}_{2000} = 238.4 + 0.87 (12,000) = 10,678.4$
Forecasting with the two-variable regression model

[Figure: timeline marking the ex-post forecast period (1996 to 1999) and the ex-ante forecast period (1999 to 2003).]

Estimated regression function in a time-series context:

$\hat{Y}_t = \hat{\beta}_1 + \hat{\beta}_2 X_t$

Forecast for period $t + \tau$:

$\hat{Y}_{t+\tau} = \hat{\beta}_1 + \hat{\beta}_2 X_{t+\tau}$

$\tau$: number of periods into the future
$X_{t+\tau}$: an observed or controlled future value

Forecast error:

$\hat{u}^f_{t+\tau} = \hat{Y}_{t+\tau} - Y_{t+\tau}$
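Comparison of Forecasts

Mean squared error (MSE):
$\mathrm{MSE} = \frac{\sum (\hat{Y} - Y)^2}{n - k}$

Root mean squared error (RMSE):
$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{\frac{\sum (\hat{Y} - Y)^2}{n - k}}$

Mean absolute percentage error (MAPE):
$\mathrm{MAPE} = \frac{1}{n - k} \sum_i \frac{|\hat{Y}_i - Y_i|}{Y_i}$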
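
The three measures can be computed together. The sketch below follows the slide's divisor n - k throughout; the function name `forecast_metrics`, the argument `k` and the four actual and forecast values are all hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math

def forecast_metrics(actual, forecast, k=0):
    """Return (MSE, RMSE, MAPE), each using the divisor n - k from the slide."""
    n = len(actual)
    if n != len(forecast) or n - k <= 0:
        raise ValueError("need equal-length series and n > k")
    # Forecast errors: predicted value minus actual value.
    errors = [f - a for a, f in zip(actual, forecast)]
    mse = sum(e * e for e in errors) / (n - k)
    rmse = math.sqrt(mse)
    # Absolute error relative to the actual value (assumes positive actuals).
    mape = sum(abs(e) / a for e, a in zip(errors, actual)) / (n - k)
    return mse, rmse, mape

# Hypothetical actual and forecast values, invented for this illustration.
actual = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 300.0, 420.0]

mse, rmse, mape = forecast_metrics(actual, forecast, k=2)
print(mse)             # 300.0
print(round(rmse, 4))  # 17.3205
print(round(mape, 4))  # 0.1
```

A lower value on any of the three measures indicates forecasts that track the actual series more closely; RMSE is in the units of Y, while MAPE is unit-free.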
