Econometrics Final Review (KTL)
Methodology of Econometrics
- Statement of theory or hypothesis
- Specification of the mathematical model of the theory
- Specification of the statistical, or econometric model
- Collecting data
- Estimation of the parameters of the econometric model
- Hypothesis testing
- Forecasting or prediction
- Using the model for control or policy purposes
Types of data
- Time series data: observations on a variable at different points in time
- Cross-section data: data on one or more variables collected at the same point in time
- Panel data / pooled data: a combination of time series and cross-section data
Chapter 2
Properties of OLS statistics
- The sum and the sample average of the OLS residuals are zero ↔ ∑ûᵢ = 0
- The sample covariance between the regressors and the OLS residuals is zero ↔ ∑Xᵢûᵢ = 0
- The point (X̄, Ȳ) is always on the OLS regression line
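A minimal numerical check of these three properties, using simulated data (the numbers are purely illustrative):

```python
# Sketch: verify the algebraic properties of OLS residuals on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 + 3 * x + rng.normal(size=100)

res = sm.OLS(y, sm.add_constant(x)).fit()
u_hat = res.resid

print(u_hat.sum())                                   # ~0: sum (and mean) of residuals is zero
print((x * u_hat).sum())                             # ~0: regressors and residuals are uncorrelated
print(res.params[0] + res.params[1] * x.mean(), y.mean())  # fitted value at X-bar equals Y-bar
```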
Properties of least-squares estimators
- Gauss-Markov Theorem: under the classical assumptions, the least-squares estimators are BLUE (Best Linear Unbiased Estimators), i.e. they have minimum variance among all linear unbiased estimators
Chapter 5
1. Multicollinearity
- High (but not perfect) correlation between two or more independent variables is called multicollinearity
- Under perfect collinearity the estimators are indeterminate and their variances are infinite; under high (but imperfect) collinearity the estimators are determinate but have large variances and standard errors
- The t ratio of one or more coefficients tends to be statistically insignificant
- R2, the overall measure of goodness of fit, can be very high
- When collinearity is high, tests on individual regressors are not reliable
- Detecting multicollinearity: variance inflation factor (VIF) and tolerance (TOL = 1/VIF)
- Curing the problem: do nothing, a priori information, additional or new data
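A minimal sketch of the VIF/TOL check with statsmodels; the regressors below are simulated purely for illustration (x2 is built to be nearly collinear with x1):

```python
# Sketch: detect multicollinearity via VIF and tolerance (TOL = 1/VIF).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Skip the constant; a common rule of thumb flags VIF > 10 as problematic.
for i, name in enumerate(X.columns[1:], start=1):
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF = {vif:.2f}, TOL = {1 / vif:.3f}")
```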
2. Heteroscedasticity
- The variances of the error terms uᵢ are not constant across observations
- Heteroscedasticity can arise as a result of the presence of outliers
- Heteroscedasticity is likely to happen in cross-sectional analysis
- OLS estimators are still linear and unbiased
- Not BLUE → the variances of the coefficient estimators are no longer minimum, σ̂² is biased, and the t and F statistics are unreliable
- Detecting heteroscedasticity: graphical method, Park test, Breusch-Pagan test, White’s General
Heteroscedasticity test
- Correcting for heteroscedasticity:
+ When σᵢ² is known: the method of weighted least squares (WLS)
+ When σᵢ² is unknown: adopt one of the four standard assumptions about the form of the error variance and transform the model accordingly
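A hedged sketch of detection (Breusch-Pagan) and correction (weighted least squares) with statsmodels, assuming for illustration that σᵢ² is proportional to xᵢ²; the data are simulated:

```python
# Sketch: detect heteroscedasticity (Breusch-Pagan) and correct it with WLS.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=300)
y = 1 + 2 * x + rng.normal(scale=x, size=300)   # Var(u_i) grows with x_i (illustrative)
X = sm.add_constant(x)

ols_res = sm.OLS(y, X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(ols_res.resid, X)
print(f"Breusch-Pagan p-value: {lm_pval:.4f}")   # small p-value -> evidence of heteroscedasticity

# Assumed form sigma_i^2 proportional to x_i^2 -> weight each observation by 1/x_i^2 (WLS).
wls_res = sm.WLS(y, X, weights=1.0 / x**2).fit()
print(wls_res.params)
```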
3. Autocorrelation
- Autocorrelation is most likely to occur in time series data
- Reasons for autocorrelation: inertia, excluded variables, incorrect functional form, the Cobweb phenomenon, lags, manipulation of data, data transformation
- The coefficient estimators are still linear and unbiased
- Not BLUE → the variances of the coefficient estimators are no longer minimum, σ̂² is biased, and the t and F statistics are unreliable
- Should use GLS instead of OLS
- Detecting autocorrelation: graphical method, the runs test, the Durbin-Watson test, the Breusch-Godfrey (BG) test
- Curing the problem: first check whether it is pure autocorrelation or model misspecification (if misspecification, add the important omitted variables); for pure autocorrelation use GLS; use Newey-West (HAC) standard errors in large samples; or continue to use OLS
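A sketch of the Durbin-Watson and Breusch-Godfrey checks plus Newey-West (HAC) standard errors with statsmodels; the AR(1) data below are simulated for illustration:

```python
# Sketch: detect autocorrelation (Durbin-Watson, Breusch-Godfrey) and
# keep OLS with Newey-West (HAC) standard errors.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                      # AR(1) errors: u_t = 0.7 * u_{t-1} + e_t
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1 + 2 * x + u
X = sm.add_constant(x)

res = sm.OLS(y, X).fit()
print("Durbin-Watson:", durbin_watson(res.resid))          # values near 2 suggest no autocorrelation
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
print("Breusch-Godfrey p-value:", lm_pval)

# Continue with OLS but correct the standard errors with Newey-West (HAC).
hac_res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(hac_res.bse)
```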
Chapter: Dummy Variables
- A dummy variable (binary variable) is a variable that takes on the value 1 or 0.
- Do not include both male and female dummies in a model with an intercept → dummy variable trap due to perfect collinearity.
- If there are n categories, there should be n −1 dummy variables.
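A small sketch with pandas illustrating the n − 1 rule (the column names are purely illustrative):

```python
# Sketch: create n-1 dummies for a categorical variable to avoid the dummy
# variable trap (one category is dropped and serves as the base group).
import pandas as pd

df = pd.DataFrame({"gender": ["male", "female", "female", "male"],
                   "wage": [10.0, 9.5, 11.0, 12.0]})

# 2 categories -> 1 dummy; drop_first=True drops the first category ("female"),
# which becomes the base group, leaving only the column gender_male.
dummies = pd.get_dummies(df["gender"], prefix="gender", drop_first=True)
df = pd.concat([df, dummies], axis=1)
print(df)
```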
Solved Exam Papers
Exam 1 - K59 - 27/12/2021
Section A: MCQ
1B 2C 3B 4A 5D 6B 7A 8D 9C 10C
Section B: Short answer
Problem 1:
(i)
- rosneg and utility have negative relationships with lsalary.
- roe, finance, consprod, and lsales have positive relationships with lsalary.
- R² < 50%, which is not high.
- Prob > F = 0 → reject H₀ → not all the slope coefficients are simultaneously zero.
(ii)
- Δ = 0.161 − 0.138 = 0.023 = 2.3%
Taking finance as the base group, the equation is:
lsalary = α₀ + α₁·lsales + α₂·roe + α₃·rosneg + γ₂·consprod + γ₃·utility + v
Use a t-test on the consprod coefficient to test whether the difference is statistically significant.
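A hedged sketch of how this regression could be estimated with statsmodels' formula API; the data file name is an assumption, and only the variable names given in the problem are used:

```python
# Sketch (hypothetical data file): estimate the model with finance as the base
# group and read the t-statistic on consprod to test the intercept difference.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("ceosal.csv")   # assumed file name; not given in the problem
res = smf.ols("lsalary ~ lsales + roe + rosneg + consprod + utility", data=df).fit()
print(res.summary())
# The coefficient on consprod estimates the lsalary difference between the
# consumer-products and finance groups; its t-statistic and p-value give the test.
```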
Problem 2:
(i) Hypothesis:
H₀: β₁ = 0
H₁: β₁ ≠ 0
t = −0.163 / 0.0181 = −9 → this is fairly strong → reject H₀ → there is a tradeoff between working and sleeping
(ii)
F = [R² / (k − 1)] / [(1 − R²) / (n − k)] = 19.635 > F* → reject H₀ → not all the slope coefficients are simultaneously zero
(iii) Hypothesis:
H₀: β₁ = −0.1
H₁: β₁ < −0.1
t = (−0.163 + 0.1) / 0.0181 = −3.48 → this is fairly strong → reject H₀
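A quick sketch reproducing the two t statistics above (the estimate and standard error are taken from the regression output quoted in the problem):

```python
# Sketch: reproduce the t statistics used in (i) and (iii).
b_hat, se = -0.163, 0.0181       # estimate and standard error from the regression output

t_i = (b_hat - 0) / se           # (i)   H0: beta1 = 0
t_iii = (b_hat - (-0.1)) / se    # (iii) H0: beta1 = -0.1
print(round(t_i, 2), round(t_iii, 2))   # approximately -9.01 and -3.48
# With a large sample, both are far below the one-sided 5% critical value
# (about -1.645), so H0 is rejected in each case.
```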
(iv)
Step 1: Setting the other factors aside, build a new equation with sleep as the dependent variable and age and agesq as regressors
Step 2: Run this regression
Step 3: State the hypothesis that both slope coefficients (on age and agesq) are zero
Step 4: Use an F-test of joint significance (see the sketch below)
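A hedged sketch of Steps 1-4 with statsmodels; the data file name is hypothetical, and the overall F statistic of this auxiliary regression tests exactly the hypothesis stated in Step 3:

```python
# Sketch (hypothetical data file): regress sleep on age and agesq and test
# that both slope coefficients are jointly zero.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("sleep75.csv")   # assumed file name; not given in the problem
res = smf.ols("sleep ~ age + agesq", data=df).fit()
print(res.fvalue, res.f_pvalue)   # overall F test: H0: both slope coefficients are zero
# Equivalently: print(res.f_test("age = 0, agesq = 0"))
```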
Problem 3:
Yᵢ = β₁ · X₂ᵢ^β₂ · X₃ᵢ^β₃ · uᵢ → lnYᵢ = lnβ₁ + β₂·lnX₂ᵢ + β₃·lnX₃ᵢ + lnuᵢ → lnYᵢ = β₀ + β₂·lnX₂ᵢ + β₃·lnX₃ᵢ
→ lnYᵢ − lnX₂ᵢ = β₀ + (β₂ − 1)·lnX₂ᵢ + β₃·lnX₃ᵢ → ln(Yᵢ / X₂ᵢ) = β₀ + β₂*·lnX₂ᵢ + β₃·lnX₃ᵢ, where β₂* = β₂ − 1
Test the hypothesis that β₂* = 0:
H₀: β₂* = β₃ = 0
H₁: at least one coefficient ≠ 0
F = (0.995 / 2) / ((1 − 0.995) / 17) = 1691.5 > F(2, 17; 0.05) = 3.59 → reject H₀
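A quick sketch reproducing this F statistic and the F(2, 17) critical value with scipy:

```python
# Sketch: reproduce the F statistic for the joint test in Problem 3 and
# compare it with the 5% critical value of F(2, 17).
from scipy import stats

r2, q, df_resid = 0.995, 2, 17
F = (r2 / q) / ((1 - r2) / df_resid)
print(round(F, 1))                       # about 1691.5
print(stats.f.ppf(0.95, q, df_resid))    # 5% critical value, about 3.59
# F far exceeds the critical value, so H0 is rejected.
```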
Problem 4:
(i) A one-point increase in female*educ leads to a 0.724% decrease in wage.
(ii) A one-point increase in female leads to the