Running a Proper Regression Analysis
V G R Chandran Govindaraju, UiTM
Topics
Running a proper regression analysis
First half of the day:
1. What is regression?
2. How to estimate? (Simple and multiple regression)
3. Checking the assumptions of regression
Types of data
Cross-sectional
Time series
Panel data
Where to get the data: DOS and BNM
Let's download some data
Data transformation: level data, growth rates, index numbers, nominal to real values, exponential to linear models, etc.
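The transformations listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the procedure used in the session: the function names and the sample figures are hypothetical, and the price index is assumed to have a base period equal to 100.

```python
import math


def growth_rates(levels):
    """Period-on-period percentage growth of a level series."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(levels, levels[1:])]


def to_index(levels, base_position=0):
    """Rescale a series so that the chosen base period equals 100."""
    base = levels[base_position]
    return [100.0 * value / base for value in levels]


def nominal_to_real(nominal, price_index):
    """Deflate nominal values with a price index (base period = 100)."""
    return [100.0 * n / p for n, p in zip(nominal, price_index)]


def log_transform(levels):
    """Natural logs turn an exponential trend into a linear one."""
    return [math.log(value) for value in levels]


# Hypothetical figures, for illustration only
nominal_gdp = [100.0, 110.0, 121.0]
cpi = [100.0, 105.0, 110.0]

growth = growth_rates(nominal_gdp)
real_gdp = nominal_to_real(nominal_gdp, cpi)
log_gdp = log_transform(nominal_gdp)
```

With these made-up numbers the nominal series grows by 10 percent each period, so its logs rise by a constant step, which is the sense in which an exponential model becomes linear after taking logs.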
EXPLORE DATA
What is regression?
Relationship between two variables (simple) or more than two variables (multiple)
Model: $Y = \alpha + \beta X + \varepsilon$
$\alpha$ is the intercept
$\beta$ is the coefficient
$\varepsilon$ is the error term
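For the simple (one-regressor) model, the ordinary least squares estimates of the intercept and the coefficient have a closed form. The sketch below assumes OLS is the estimator in use; the function name `simple_ols` and the sample data are hypothetical.

```python
def simple_ols(x, y):
    """Estimate intercept and slope of y on x by ordinary least squares.

    Returns (intercept, slope, residuals).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sample covariance of x and y divided by sample variance of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line passes through the point of means
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Hypothetical data lying exactly on the line y = 1 + 2x
intercept, slope, residuals = simple_ols([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

The residuals returned here are the estimated error terms, which are what the assumption checks in the later slides are applied to.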
Linearity
Straight enough condition (scatter plots)
SPSS: Graphs: Scatter: Matrix: enter the dependent (outcome) variable first and then each of the independent variables (categorical/nominal variables don't need to be entered, but do it anyway to see what it looks like).
SPSS: Analyze: Regression: Linear: Ramsey RESET test.
Normality
We do not need to test each series; just test the residuals
Jarque-Bera statistic, or the Q-Q and P-P plots
We use the JB statistic
Null hypothesis: the residuals are normally distributed
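As a rough sketch of what the JB statistic measures, it combines the skewness S and kurtosis K of the residuals as JB = n/6 * (S^2 + (K - 3)^2 / 4) and is compared with a chi-square distribution with 2 degrees of freedom. The function name below is hypothetical, and the p-value uses the fact that the chi-square(2) upper tail probability is exp(-JB/2), which holds only asymptotically.

```python
import math


def jarque_bera(residuals):
    """Jarque-Bera statistic and asymptotic p-value for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments (divided by n, as in the usual JB definition)
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Upper tail of a chi-square distribution with 2 degrees of freedom
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical residuals, for illustration only
jb_stat, jb_p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

A large p-value means the null of normal residuals is not rejected. With very few observations, as in this toy input, the asymptotic approximation is not reliable.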
Serial Correlation/AutoCorrelation
Likely a problem in time series data, especially data observed at short intervals (high frequency)
What causes autocorrelation?
Omitted variables
Misspecification
Consequences of autocorrelation
OLS estimators remain unbiased but are inefficient
Estimated variances of the coefficients are biased and inconsistent, so t-tests and F-tests become unreliable
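One common way to screen residuals for first-order autocorrelation is the Durbin-Watson statistic. The slides do not name a specific test here, so this choice is an assumption made for illustration, and the function name is hypothetical. The statistic is the sum of squared successive differences of the residuals divided by their sum of squares; values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered in time."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that flip sign every period (negative autocorrelation)
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
# Hypothetical residuals that never change (extreme positive autocorrelation)
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

The statistic has to be compared against tabulated lower and upper bounds for the sample size and number of regressors, which this sketch does not include.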
Heteroskedasticity
The opposite of homoskedasticity
Hetero means unequal; homo means equal
The second part of the word, skedasticity, means spread (variance)
Example: consumption of rich and poor households. The rich have a wider spread (they can choose between saving and consuming); the poor have a narrower spread
There are many ways to test for heteroskedasticity:
Graphically: plot the squared residuals against the dependent or an independent variable; there must not be a systematic pattern
However, graphical methods are difficult to apply in multiple regression
Heteroskedasticity
The following tests can be used:
Breusch-Pagan LM test, Glejser LM test, Harvey-Godfrey LM test, Park test, Goldfeld-Quandt test, and White test
Let's use the White test
Null hypothesis: no heteroskedasticity (the errors are homoskedastic)
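The mechanics of the White test can be sketched for a model with a single regressor: regress the squared residuals on a constant, x and x squared, then compute LM = n * R-squared, which is asymptotically chi-square with 2 degrees of freedom under the null. All function names below are hypothetical, and the sketch assumes x takes at least three distinct values so the auxiliary regression can be solved.

```python
import math


def solve_linear_system(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def r_squared(columns, y):
    """R-squared from regressing y on a constant plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0.0:
        return 0.0  # squared residuals are constant: nothing to explain
    return 1.0 - ss_res / ss_tot


def white_test(x, residuals):
    """White LM statistic and asymptotic p-value for a one-regressor model."""
    squared_resid = [e * e for e in residuals]
    x_squared = [xi * xi for xi in x]
    lm = len(x) * r_squared([x, x_squared], squared_resid)
    p_value = math.exp(-lm / 2.0)  # chi-square upper tail, 2 degrees of freedom
    return lm, p_value
```

A small p-value rejects the null of homoskedasticity. With more than one regressor the auxiliary regression also includes the squares and cross-products of all regressors, and the degrees of freedom change accordingly, which this sketch does not cover.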
Multicollinearity
Whether there is any relationship between the regressors
Consequences: the parameters are indeterminate under perfect multicollinearity (however, real data rarely exhibit perfect multicollinearity)
Imperfect multicollinearity: the regressors are correlated, but less than perfectly
How to detect?
Correlation matrix
Check the significance of the individual coefficients (t-test) and the joint significance (F-test)
Run the regression separately for each regressor
VIF in EViews or in SPSS (a VIF value of less than 10 is OK)
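For the special case of two regressors, the VIF reduces to 1 / (1 - r^2), where r is the correlation between them; in general each regressor's VIF uses the R-squared from regressing it on all the others. The sketch below covers only the two-regressor case, with hypothetical function names and made-up data.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_regressors(x1, x2):
    """Variance inflation factor when the model has exactly two regressors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical regressors that move almost in lockstep
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 11.0]
vif = vif_two_regressors(x1, x2)
```

For this made-up pair the VIF is far above the threshold of 10 mentioned above, which is what near-collinear regressors look like. Perfectly collinear inputs would make the denominator zero.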
Dummy Variables
A dummy variable is a variable that takes on the value 1 or 0
Examples: male (= 1 if male, 0 otherwise), south (= 1 if in the south, 0 otherwise), etc.
Dummy variables are also called binary variables, because they take only two values
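A short sketch of how a dummy is built and what its coefficient means: when y is regressed on a constant and a single dummy, the intercept is the mean of the group coded 0 and the dummy coefficient is the difference between the two group means. The function names and the sample values are hypothetical.

```python
def make_dummy(labels, category):
    """Return 1 where the label equals the category, 0 otherwise."""
    return [1 if label == category else 0 for label in labels]


def regress_on_dummy(d, y):
    """OLS of y on a constant and one dummy; returns (intercept, coefficient)."""
    n = len(y)
    mean_d = sum(d) / n
    mean_y = sum(y) / n
    sdy = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y))
    sdd = sum((di - mean_d) ** 2 for di in d)
    coefficient = sdy / sdd
    intercept = mean_y - coefficient * mean_d
    return intercept, coefficient


# Hypothetical observations, for illustration only
gender = ["male", "female", "male", "female"]
outcome = [12.0, 8.0, 14.0, 10.0]

male = make_dummy(gender, "male")
intercept, coefficient = regress_on_dummy(male, outcome)
```

Here the group coded 0 has mean 9 and the group coded 1 has mean 13, so the intercept is 9 and the dummy coefficient is 4.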
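[Figure: Example of d0 > 0. Plot of y against x showing the baseline line y = b0 + b1x with intercept b0, and the line for d = 1 shifted by d0. Source: Economics 20 - Prof. Anderson]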
Thank you
QUESTIONS PLEASE
More materials will soon be available (by end of the month) through my website:
www.vgrchandran.com/default.html