Detecting and Resolving Model Specification Errors in STATA
Interpretation: The data are somewhat positively skewed, as the right tail is longer.
*You can also check the data normality with the following STATA command
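A minimal sketch of such a check, assuming the wage-equation variables used later in this handout (educ, exper, tenure, IQ) and the user-written jb command from SSC:

```stata
* Estimate the baseline wage model and test its residuals for normality
reg wage educ exper tenure IQ
predict res, residuals

* Built-in skewness/kurtosis normality test
sktest res

* Jarque-Bera test (user-written; install once with: ssc install jb)
jb res
```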
Interpretation: Since the JB test statistic is highly significant (its p-value < 0.05), the data are non-normal and contain outliers as well.
Resolving Data Non-Normality Issue
i) Application of log-lin or double-log model: In this approach, you run a log model, which smooths and normalizes the data to some extent.
*Run the following STATA command of log-lin model of wage to address data normality
issue
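A sketch of the log-lin specification, assuming the same regressors as the wage equations later in this handout:

```stata
* Log-lin model: log of wage regressed on the levels of the regressors
g lwage = ln(wage)
reg lwage educ exper tenure IQ
```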
*Now predict the residuals of this log-lin model and apply JB test again to check data
normality.
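For example (residual name assumed):

```stata
* Residuals of the log-lin model, tested again for normality
predict res_log, residuals
jb res_log        // user-written command; sktest res_log also works
```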
Interpretation: Compared with the previous results, the JB test statistic has fallen sharply from 738.4 to 42.89, so the data have been normalized to a great extent, though they are still not perfectly normal.
ii) Winsorization of Data: This statistical method is used to replace the outliers with the nearest
values of quartiles or percentiles.
*First install winsor2 command in STATA as follows:
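One way this step might look; the 1st/99th-percentile cut-offs are an assumed (and common) choice:

```stata
* Install the user-written winsor2 command from SSC
ssc install winsor2

* Winsorize at the 1st and 99th percentiles (cut-offs assumed here);
* suffix(_w) creates new variables such as lwage_w
winsor2 lwage educ exper tenure IQ, cuts(1 99) suffix(_w)
```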
*Now re-run the log-lin model with the winsorized variables, predict its residuals, and apply the JB test again to check data normality.
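A sketch using the _w-suffixed variable names that winsor2 produces by default:

```stata
* Re-estimate the log-lin model on the winsorized variables
reg lwage_w educ_w exper_w tenure_w IQ_w
predict res_w, residuals
jb res_w
```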
Interpretation: Bravo! The data have been completely normalized and the outliers have been removed as well. Our hypothesis testing will now be valid, since the t and F tests require the normality condition.
2) Model Specification Tests (Detecting Omitted Variable Bias)
i) Ramsey’s RESET Test: Ramsey’s RESET test is commonly used to detect model specification error by including the quadratic and cubic powers of the fitted values of the Y variable (in this case, wage).
*Run the following STATA command using our final model with winsorized variables.
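Assuming the _w-suffixed variables created by winsor2, the final model might be:

```stata
* Final model with winsorized variables
reg lwage_w educ_w exper_w tenure_w IQ_w
```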
* After running the above regression, now predict the fitted values of lwage_w, and generate
(g) the quadratic and cubic values of lwage_w as follows:
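For example (the fitted-value name lwage_w_hat and the power names yhat2 and yhat3 are assumptions):

```stata
* Fitted values of the dependent variable, then their powers
predict lwage_w_hat
g yhat2 = lwage_w_hat^2
g yhat3 = lwage_w_hat^3
```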
* Now run Ramsey’s RESET test by regressing lwage_w on the regressors plus the quadratic and cubic terms of its fitted values, with the following STATA command.
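A sketch, reusing the fitted-value powers generated above (names assumed):

```stata
* RESET regression: original regressors plus powers of the fitted values
reg lwage_w educ_w exper_w tenure_w IQ_w yhat2 yhat3
test yhat2 yhat3      // joint F-test on the added terms
```

Stata's built-in estat ovtest, run immediately after the original regression, performs the same RESET test in one step.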
Interpretation: Since the F-value of the unrestricted model is highly significant (its p-value < 0.05), we conclude that the model is misspecified.
ii) Lagrange Multiplier (LM) Test: In this test, we regress the residuals of our model on the quadratic and cubic terms of the fitted (estimated) values of Y.
*Run the following STATA command
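A sketch of the auxiliary regression, reusing the residuals and fitted-value powers created in the steps above (names assumed):

```stata
* LM auxiliary regression: residuals on the fitted-value powers
reg res_w yhat2 yhat3
```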
*The following STATA command generates the value of the LM test by multiplying the number of observations (e(N)) by the r-squared (e(r2)).
. scalar nR2=e(N)*e(r2)
*The following command generates the 5% critical value of χ2 distribution.
. scalar chi2critical=invchi2tail(e(df_m), 0.05)
*The following command generates the p-value of χ2 distribution.
. scalar p_value=chi2tail(e(df_m), nR2)
* Now list all the scalar values generated previously with the following command.
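The scalars defined above can be displayed with:

```stata
scalar list nR2 chi2critical p_value
```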
Interpretation: Since calculated value of LM test (586.65) is much greater than χ2 critical value
(5.99), we reject the null hypothesis of no specification errors.
3) Detecting the Right Functional Form: If we want to compare two competing models with the same dependent variable, we run the regressions in STATA and choose the model with the highest Adjusted R2 and the lowest AIC or BIC values. A problem arises, however, when the models have different DVs. For instance, suppose you want to compare two models in which the DV is wage in one model and lwage in the other.
wage = ß1 + ß2educ + ß3exper + ß4tenure + ß5IQ (1)
lwage = ß1 + ß2leduc + ß3lexper + ß4tenure + ß5lIQ (2)
In this case, we use the following Box-Cox transformation procedure to choose the right functional
form.
*Step 1. Find out the geometric mean of wage variable with the following STATA command:
*Step 2. Now divide wage variable by its geometric mean to create new variable ‘wagestar’
with the following STATA command
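Steps 1 and 2 might be implemented as follows; ameans reports the arithmetic, geometric, and harmonic means, and (on the assumption used here) stores the geometric mean in r(mean_g):

```stata
* Step 1: geometric mean of wage
ameans wage

* Step 2: scale wage by its geometric mean
g wagestar = wage/r(mean_g)
```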
*Step 3. Now regress both models 1 and 2 with newly created common variable ‘wagestar’
with the following STATA commands:
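A sketch, assuming the log regressors (leduc, lexper, lIQ) already exist; e(rss) stores each model's residual sum of squares for the Box-Cox statistic in the next step:

```stata
* Model 1: linear form with the scaled DV
reg wagestar educ exper tenure IQ
scalar RSS1 = e(rss)

* Model 2: log form with the scaled DV
g lwagestar = ln(wagestar)
reg lwagestar leduc lexper tenure lIQ
scalar RSS2 = e(rss)
```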
*Step 4. Now calculate the Box-Cox statistic as follows:
B-Cox stat = (n/2) ln(RSS2/RSS1)
where RSS1 and RSS2 represent the residual sums of squares of the two models.
Note: keep the higher RSS in the numerator, which is RSS2 of the log model in this case. Moreover, the B-Cox stat follows the chi-square distribution with k-1 degrees of freedom, where k is the number of coefficients.
B-Cox Stat = 0.5 * 935 * ln(152.02/150.92) = 3.395
*Now calculate the p-value of the Box-Cox statistic with k-1 = 5-1 = 4 degrees of freedom; there are four IVs plus the intercept in our model.
*Now list the calculated p-value with the following STATA command
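For example (the scalar name pBC is an assumption):

```stata
* p-value of the Box-Cox statistic with k-1 = 4 degrees of freedom
scalar pBC = chi2tail(4, 3.395)
scalar list pBC
```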
Interpretation: Since the test statistic is insignificant (its p-value is greater than 0.05), we cannot conclude that the log model is superior to the linear model.