Introduction To Econometrics

1) Econometrics integrates economic theory, mathematics, and statistics to test hypotheses about economic phenomena, estimate relationships between economic variables, and forecast future values. 2) There are four main types of data in econometrics: cross-sectional, time series, pooled cross sections, and panel data. 3) Simple regression analysis is used to test relationships between a dependent variable and one independent variable and predict future values. The OLS estimator provides the best linear unbiased estimates.

INTRODUCTION TO

ECONOMETRICS
Eugene Kaciak, Ph.D.
Faculty of Business, Brock University
St. Catharines, Ontario, Canada
E-mail: [email protected]
Personal website: http://spartan.ac.brocku.ca/~ekaciak
1
Introduction
Econometrics is the integration of economic
theory, mathematics, and statistical
techniques for the purpose of:
Testing hypotheses about economic
phenomena
Estimating coefficients of economic
relationships
Forecasting or predicting future values
of economic variables or phenomena.
2
In econometrics, data sets come in
four main types:
Cross-sectional data set: a sample
of objects taken at a given point in
time
Time series data set: observations
on variables over time

3
Pooled cross sections: cross-
sectional data for different objects
taken at different time-periods.
Panel (or longitudinal) data set:
cross-sectional data for the same
objects taken at different time-
periods.

4
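For illustration only (not part of the original slides), here is a minimal Python sketch with made-up numbers showing how the four data structures differ in shape:

# Hypothetical records illustrating the four data types.

# Cross-sectional: many units observed at one point in time.
cross_section = [
    {"firm": "A", "year": 2020, "sales": 120},
    {"firm": "B", "year": 2020, "sales": 95},
]

# Time series: one unit observed over time.
time_series = [
    {"year": 2018, "gdp": 1.80},
    {"year": 2019, "gdp": 1.85},
    {"year": 2020, "gdp": 1.76},
]

# Pooled cross sections: different units drawn in different periods.
pooled = [
    {"firm": "A", "year": 2019, "sales": 110},
    {"firm": "C", "year": 2020, "sales": 130},  # not the same firms each period
]

# Panel (longitudinal): the same units observed in each period.
panel = [
    {"firm": "A", "year": 2019, "sales": 110},
    {"firm": "A", "year": 2020, "sales": 120},
    {"firm": "B", "year": 2019, "sales": 90},
    {"firm": "B", "year": 2020, "sales": 95},
]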
Econometric research, in general,
involves the following three stages:
Specification of the model, together
with the a priori theoretical
expectations about the sign and the
size of the parameters of the
function.

5
Collection of data on the variables
of the model and estimation of the
coefficients of the function with
appropriate techniques.
Evaluation of the estimated
coefficients of the function on the
basis of economic, statistical, and
econometric criteria

6
Simple Regression Analysis
1. Testing hypotheses about the relationship
between
a dependent (or explained, endogenous,
predicted, response, effect, regressand)
variable Y and
one independent (or explanatory,
exogenous, predictor, control, causal,
regressor) variable X
2. Prediction
7
The simple linear regression model is:
Yi = β0 + β1X1i + ui
 i = 1, 2, …, n observations
 β0+β1X1 is pop. reg. line (function)
 β0 is the intercept of the pop. reg. line
 β1 is the slope of the pop. reg. line
 ui is the error term

8
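To make the model concrete, here is an optional Python sketch (assuming NumPy is available) that simulates a sample from Yi = β0 + β1X1i + ui; the parameter values β0 = 3.6, β1 = 0.75, and σ = 1.35 are borrowed from the worked example later in the deck:

import numpy as np

# Simulate n observations from the population model Y = beta0 + beta1*X + u.
rng = np.random.default_rng(42)
n, beta0, beta1, sigma = 10, 3.6, 0.75, 1.35

X = rng.uniform(5, 10, size=n)        # regressor values
u = rng.normal(0.0, sigma, size=n)    # error term: mean 0, constant variance
Y = beta0 + beta1 * X + u             # population regression line plus error

print(np.round(X, 2))
print(np.round(Y, 2))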
Assumptions of Simple Reg. Analysis
The error term ui is assumed to be:
normally distributed with
expected value = 0
constant variance σ² (this is called homoscedasticity)
• if the variance σ² is not constant, we have heteroscedasticity

9
Any two error terms ui, uj (i≠j) are
uncorrelated;
if they are correlated, this
condition is called
autocorrelation
The variable X1 assumes fixed
values in repeated sampling (so that
X1i and ui are also uncorrelated).
10
11
Ordinary least-squares
(OLS) estimators are best
linear unbiased estimators
(BLUE)

12
Lack of bias (an unbiased
estimator) means that the expected
value of the estimate b of parameter
β equals β, i.e., E(b) = β.
OLS estimators are the best (B in
BLUE) among all unbiased (U in
BLUE) linear (L in BLUE)
estimators (E in BLUE).
13
This is known as the Gauss-
Markov theorem and represents
the most important justification for
using OLS.

14
Another desired feature of an
estimator is consistency.
An estimator is consistent if, as the
sample size approaches infinity, its
expected value approaches the true
parameter (i.e., it is asymptotically
unbiased) and its distribution
collapses onto the true parameter.

15
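As an informal illustration of consistency (not part of the original slides, and assuming NumPy), the Monte Carlo sketch below shows the distribution of the OLS slope estimate collapsing onto the true β1 as the sample size grows:

import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = 3.6, 0.75, 1.35   # illustrative true parameter values

for n in (10, 100, 10_000):
    estimates = []
    for _ in range(500):                # 500 simulated samples of size n
        X = rng.uniform(5, 10, size=n)
        Y = beta0 + beta1 * X + rng.normal(0.0, sigma, size=n)
        Sxy = (X * Y).sum() - n * X.mean() * Y.mean()
        Sxx = (X ** 2).sum() - n * X.mean() ** 2
        estimates.append(Sxy / Sxx)     # OLS slope estimate b1
    b1 = np.array(estimates)
    print(f"n = {n:>6}: mean of b1 = {b1.mean():.3f}, std of b1 = {b1.std():.3f}")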
16
Here is another practical way of
calculating b1:

b1 = Sxy/Sxx = [ΣXY – n(avgX)(avgY)] / [ΣX² – n(avgX)²]

17
Simple Regression Analysis
 Example:
Estimate the model: Y = β0 + β1X + u,
where X is hours of labor and Y is
output, based on a sample of n = 10:
i 1 2 3 4 5 6 7 8 9 10
X 10 7 10 5 8 8 6 7 9 10
Y 11 10 12 6 10 7 9 10 11 10

18
We compute:
ΣX = 80  ΣY = 96  avgX = 8  avgY = 9.6
ΣXY = 789  ΣX² = 668  ΣY² = 952
Sxx = ΣX² – n(avgX)² = 668 – 10(8²) = 28
Syy = ΣY² – n(avgY)² = 952 – 10(9.6²) = 30.4
Sxy = ΣXY – n(avgX)(avgY) = 789 – 10(8)(9.6) = 21
19
Find the OLS estimates of β0 and β1
 Sxx = 28 Syy = 30.4 Sxy = 21
 b1 = Sxy/Sxx = 21/28 = 0.75
 b0 = avgY – b1avgX = 9.6 – 0.75(8) = 3.6
 Ŷ = 3.6 + 0.75X
 Constant term has no natural
interpretation. It captures the mean of Y
as well as the average effect of omitted
variables.
20
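These hand calculations can be checked with a short Python sketch (assuming NumPy is installed); it reproduces Sxx = 28, Syy = 30.4, Sxy = 21, b1 = 0.75, and b0 = 3.6:

import numpy as np

# Data from the example (n = 10)
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)
Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)
n = len(X)

Sxx = (X ** 2).sum() - n * X.mean() ** 2        # 28
Syy = (Y ** 2).sum() - n * Y.mean() ** 2        # 30.4
Sxy = (X * Y).sum() - n * X.mean() * Y.mean()   # 21

b1 = Sxy / Sxx                    # slope estimate: 0.75
b0 = Y.mean() - b1 * X.mean()     # intercept estimate: 3.6
print(Sxx, Syy, Sxy, b1, b0)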
 Standard errors of parameter
estimates
Given: Sxx = 28 Syy = 30.4 Sxy = 21
Residual Sum of Squares (RSS) = Syy – b1Sxy = 30.4 – 0.75(21) = 14.65
Standard Error of Regression (SER or ŝ):
ŝ = SER = √[RSS/(n – 2)] = √(14.65/8) = √1.83 = 1.35
Note also that: RSS = (SER)²[n – (k+1)], where k = # of X’s
21
• Sxx = 28 Syy = 30.4 Sxy = 21
• RSS = 14.65 ŝ = SER = 1.35
Standard Error of b0 = SE(b0)
= ŝ√[(1/n) + (avgX)²/Sxx]
= 1.35√[(1/10) + 8²/28]
= 1.35√2.39
= 1.35(1.54)
= 2.09
22
 Standard Error of b1 = SE(b1)
= ŝ/√Sxx
= 1.35/√28
= 1.35/5.29
= 0.256
The model estimated by OLS:
Ŷ = 3.6 + 0.75X
(2.09) (0.256) St. Errors
23
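A short Python sketch (assuming NumPy) reproduces RSS, the standard error of regression, and both coefficient standard errors from the summary values above:

import numpy as np

# Summary values from the worked example
n, Sxx, Syy, Sxy, avgX, b1 = 10, 28.0, 30.4, 21.0, 8.0, 0.75

RSS = Syy - b1 * Sxy                                 # 14.65
SER = np.sqrt(RSS / (n - 2))                         # s-hat ≈ 1.35
SE_b0 = SER * np.sqrt(1.0 / n + avgX ** 2 / Sxx)     # ≈ 2.09
SE_b1 = SER / np.sqrt(Sxx)                           # ≈ 0.256
print(RSS, SER, SE_b0, SE_b1)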
Testing the statistical significance of
β0
H0: β0 = 0 vs. H1: β0 > 0
t0 = (b0 – 0)/SE(b0) = (3.6 – 0)/2.09 = 1.72
t0 has a t-student distribution with
n-2 = 8 d.f.
tcritical or tcr = 1.8595 (for α = 5%)

24
Since t0 < tcr --> Cannot reject H0 at the 5%
significance level
If H1: β0 ≠ 0, tcr = 2.3060
(for α/2 = 2.5%)
The p-value is the smallest α at which H0 can be rejected (Excel reports the two-sided p-value, 0.123, for this coefficient)
The general rule is to ignore the constant term’s lack of significance.
25
Testing the statistical significance of β1
H0: β1 = 0 vs. H1: β1 > 0
t1 = (b1 – 0)/SE(b1) = (0.75 – 0)/0.256 =
2.93
t1 has a t-student distribution with n-
2 = 8 d.f.
tcritical or tcr = 1.8595 (for α = 5%)

26
Since t1 > tcr --> Reject H0 at the 5% level of significance
 If H1: β1 ≠ 0, then tcr = 2.3060
(for α/2 = 2.5%)
 p-value (Excel) = 0.019

27
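Both t-tests can be reproduced with SciPy (assuming it is installed); the sketch below recomputes the t-statistics, the critical values, and the two-sided p-values that Excel reports:

from scipy import stats

# Estimates and standard errors from the slides above
n, b0, b1, SE_b0, SE_b1 = 10, 3.6, 0.75, 2.09, 0.256

t0 = b0 / SE_b0                              # ≈ 1.72
t1 = b1 / SE_b1                              # ≈ 2.93
t_cr_1sided = stats.t.ppf(0.95, df=n - 2)    # 1.8595 (alpha = 5%, one-sided)
t_cr_2sided = stats.t.ppf(0.975, df=n - 2)   # 2.3060 (alpha/2 = 2.5%)
p0 = 2 * stats.t.sf(abs(t0), df=n - 2)       # ≈ 0.123 (two-sided)
p1 = 2 * stats.t.sf(abs(t1), df=n - 2)       # ≈ 0.019 (two-sided)
print(t0, t1, t_cr_1sided, t_cr_2sided, p0, p1)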
95% Confidence Interval for β0
95% = 1 – α, where α is the significance level. In this case, α = 5%
Lower limit of the confidence interval = b0 – tcrSE(b0) = 3.6 – 2.3060(2.09) = -1.22
Upper limit of the confidence interval = b0 + tcrSE(b0) = 3.6 + 2.3060(2.09) = 8.42
28
95% Confidence Interval for β1
Lower limit of the confidence interval = b1 – tcrSE(b1) = 0.75 – 2.3060(0.256) = 0.160
Upper limit of the confidence interval = b1 + tcrSE(b1) = 0.75 + 2.3060(0.256) = 1.34

29
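The same 95% confidence intervals can be obtained in a few lines of Python (assuming SciPy for the critical value):

from scipy import stats

# Estimates and standard errors from the slides above
n, b0, b1, SE_b0, SE_b1 = 10, 3.6, 0.75, 2.09, 0.256
t_cr = stats.t.ppf(0.975, df=n - 2)              # 2.3060 for a 95% interval

ci_b0 = (b0 - t_cr * SE_b0, b0 + t_cr * SE_b0)   # ≈ (-1.22, 8.42)
ci_b1 = (b1 - t_cr * SE_b1, b1 + t_cr * SE_b1)   # ≈ (0.16, 1.34)
print(ci_b0, ci_b1)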
Goodness-of-fit of the model
The coefficient of determination R²
R² = b1(Sxy/Syy) = 0.75(21/30.4) = 0.518
or
R² = t1²/(t1² + n – 2) = 2.93²/(2.93² + 8) = 0.518
or
R² = ESS/TSS = 1 – RSS/TSS,
where TSS = ESS + RSS (next slide)

30
 TSS = ESS + RSS
• TSS = Total Sum of Squares = Σ(Y – avgY)²
• ESS = Explained Sum of Squares = Σ(Ŷ – avgY)²
• RSS = Residual Sum of Squares = Σ(Y – Ŷ)² = Σû²
• where û = Y – Ŷ is the residual
31
• ESS = b1Sxy = 0.75(21) = 15.75 or
• ESS = R²Syy = 0.518(30.4) = 15.75
• RSS = (1 – R²)Syy = (1 – 0.518)(30.4) = 14.65

32
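A quick Python check of the decomposition and of R² from the summary values above (a sketch, no extra libraries needed):

# Summary values from the worked example
n, Syy, Sxy, b1 = 10, 30.4, 21.0, 0.75

TSS = Syy               # total sum of squares, Σ(Y – avgY)²
ESS = b1 * Sxy          # explained sum of squares: 15.75
RSS = TSS - ESS         # residual sum of squares: 14.65
R2 = ESS / TSS          # coefficient of determination ≈ 0.518
print(TSS, ESS, RSS, R2)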
Test of the overall significance of
the regression (when the # of slope
parameters is 1)
H0: β1 = 0 vs.
H1: β1 ≠ 0
F = R²/[(1 – R²)/(n – 2)] = 0.518/[(1 – 0.518)/(10 – 2)] = 8.60

33
F has an F-distribution with
d.f. = 1 and n-2
We find Fcr= 5.32
Since F = 8.60 > 5.32 = Fcr
reject H0 at the 5% level
 Note: p-value (Excel) = 0.0189

34
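The F-test can be verified in Python (assuming SciPy); with a single slope parameter, F is simply the square of the t-statistic for β1:

from scipy import stats

n, k, R2 = 10, 1, 0.518                           # k = number of slope parameters
F = (R2 / k) / ((1 - R2) / (n - k - 1))           # ≈ 8.60
F_cr = stats.f.ppf(0.95, dfn=k, dfd=n - k - 1)    # ≈ 5.32
p_value = stats.f.sf(F, dfn=k, dfd=n - k - 1)     # ≈ 0.019
print(F, F_cr, p_value)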
Prediction with the Simple Reg.
Model
Ŷ = 3.6 + 0.75X   R² = 0.518
(2.09) (0.256)   standard errors
[0.123] [0.019]   p-values

Let X0 = 6. Then Ŷ0 = 3.6 + .75(6)=8.1

35
The 95% confidence interval of the
forecast:
LHS = Ŷ0 – tcr Ŝ
RHS = Ŷ0 + tcrŜ
where
Ŝ = ŝ√[1 + (1/n) + (X0 – avgX)²/Sxx]

36
• Let’s compute:
The 95% confidence interval of the forecast
Ŷ0 = 8.1:
Ŝ = ŝ√[1 + (1/n) + (X0 – avgX)²/Sxx] = 1.35√[1 + 0.1 + (6 – 8)²/28] = 1.51
LHS = Ŷ0 – tcrŜ = 8.1 – 2.3060(1.51) = 8.1 – 3.48 = 4.62
RHS = Ŷ0 + tcrŜ = 8.1 + 2.3060(1.51) = 8.1 + 3.48 = 11.58
37
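Finally, the point forecast and its 95% forecast interval can be reproduced in Python (assuming NumPy and SciPy):

import numpy as np
from scipy import stats

# Summary values from the slides above
n, Sxx, avgX = 10, 28.0, 8.0
b0, b1, SER = 3.6, 0.75, 1.35
X0 = 6.0

Y0_hat = b0 + b1 * X0                                        # 8.1
S_f = SER * np.sqrt(1 + 1.0 / n + (X0 - avgX) ** 2 / Sxx)    # ≈ 1.51
t_cr = stats.t.ppf(0.975, df=n - 2)                          # 2.3060
lower, upper = Y0_hat - t_cr * S_f, Y0_hat + t_cr * S_f      # ≈ (4.6, 11.6)
print(Y0_hat, S_f, lower, upper)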
