Practical Examples With STATA
(MGMT3071)
By: Teklebirhan A.
Note the following notations from the above table:
SS = Sum of squares
df = Degrees of freedom
MS = Mean squares
Number of obs = No of observations used in the regression
F() = F value from the joint test of significance of the model
Prob > F = p-value of the F test
R-squared = Model’s R-Squared
Adj R-squared = Model’s Adjusted R-squared
Root MSE = Root Mean Squared Error
Coef. = estimated coefficients ($\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3, \hat{\beta}_4$, respectively)
t = t-ratios/statistics of the corresponding coefficients
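The output described above comes from Stata's regress command. As a minimal sketch (assuming the household dataset with the variables Saving, FS, Income, Sex and Wealth used in these slides is already in memory):

* Estimate the saving model; the output header reports SS, df, MS, Number of obs,
* F, Prob > F, R-squared, Adj R-squared and Root MSE, while the coefficient table
* reports Coef., Std. Err. and t for each regressor.
reg Saving FS Income Sex Wealth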
Report the Regression Result. How?
There are two ways to report a regression result:
a) By fitting the estimated coefficients in to the model &
b) Table form
$\widehat{Saving} = 178.3 - 13.54\,FS + \ldots\,Income - \ldots\,Sex + \ldots\,Wealth$
              (197.502)   (27.059)      (0.015)       (95.928)      (0.028)
$R^2 = 84.8\%$
NB: The values in parentheses are the standard errors of the respective parameters.
b) Table Form
eststo: reg Saving FS Income Sex Wealth
esttab using save.rtf, se r2 label
Household's Monthly Saving (in ETB)
--------------------------------------------
Family Size                  -13.54
                             (27.06)
Constant                      178.3
                             (197.5)
--------------------------------------------
Observations                     50
R-squared                     0.848
--------------------------------------------
Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
You can use the eststo and esttab commands to have Stata produce a regression table that looks like those in journal articles.
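As a further sketch (the model names m1 and m2 and the file name save2.rtf are illustrative, not from the slides), eststo and esttab can also store and compare several specifications side by side:

eststo clear
eststo m1: reg Saving FS Income
eststo m2: reg Saving FS Income Sex Wealth
esttab m1 m2 using save2.rtf, se r2 label replace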
Can we interpret the above result now?
NO!!!
Why???
BECAUSE this estimated model does not yet have statistical backing: the validity of the model has not been tested.
Besides, we have not tested whether the CLRM assumptions are satisfied.
1. Statistical Tests of Significance (FOT)
(Testing Validity of the Regression Model)
Broadly speaking, a test of significance is a procedure
by which sample results are used to verify the truth or
falsity of a null hypothesis.
$\hat{\beta}_i \sim N\!\big(\beta_i,\ \sigma^2_{\hat{\beta}_i}\big), \qquad t = \dfrac{\hat{\beta}_i - \beta_i}{se(\hat{\beta}_i)}$
NB: To test a hypothesis, we choose a level of significance.
The level of significance is the probability of making a 'wrong' decision, i.e., the probability of rejecting the null hypothesis when it is actually true.
It is customary in econometric research to choose the 1%, 5%, or 10% level of significance (tolerance level).
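By default Stata reports 95% confidence intervals (the 5% level); as a small illustration, the level() option of regress changes the reported interval, e.g. to 99% for the 1% level:

* Re-display the model with 99% confidence intervals (1% significance level)
reg Saving FS Income Sex Wealth, level(99)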
i. F-statistic: measures the overall significance of the model.
The F-statistic tests:
The null hypothesis ($H_0$): $\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$, i.e., that all the slope coefficients are jointly equal to zero, implying that there is no relationship between the dependent variable and the independent variables.
The alternative hypothesis ($H_1$): at least one $\beta_j \neq 0$, i.e., not all of the coefficients are equal to zero.
Decision Rule:
If $H_0$ is accepted, it implies that there is no statistically significant relationship between the dependent and independent variables, even though the estimated coefficients are not exactly zero.
If $H_0$ is rejected, i.e., the F-statistic is significant, then the overall model is valid and we can go ahead and check the other tests.
How do we do the F-test?
You can calculate the F-statistic manually or using a computer; this training concentrates on computer use.
You are strongly advised to read any statistics book for the manual F-statistic calculation and how it is applied.
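In Stata, the overall F-statistic and Prob > F appear in the header of the regress output; the same joint test can also be reproduced explicitly with the test command. A sketch, assuming the saving model from the earlier slides:

reg Saving FS Income Sex Wealth
* Joint F-test that all slope coefficients are simultaneously zero
test FS Income Sex Wealth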
Classical Linear Regressions: Regression Diagnostics:
1-Normality of Residuals
If the normality assumption is violated, hypothesis testing based on the standard statistical tests (such as the t and F tests) would be invalid.
Therefore, after a model has been estimated we have to test the
normality of the residuals using graphical and non-graphical tests.
A. Graphical method:
reg Saving FS Income Sex Wealth
predict r, resid
kdensity r, normal
(or: histogram r, kdensity normal)
B. Non-graphical methods: the Doornik-Hansen test (mvtest norm r) and the Shapiro-Wilk test (swilk r) for normality.
Both test the null hypothesis that the distribution of the residuals is normal.
. mvtest norm r
. swilk r
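Beyond the kernel density plot, standardized normal probability and quantile plots of the residuals are additional graphical checks that are commonly used (a sketch, using the residual variable r generated above):

* Standardized normal probability plot of the residuals
pnorm r
* Quantile-normal plot of the residuals
qnorm r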
[Figure: plot of the residuals (axis labelled "Residuals", roughly -500 to 1,000)]
chi2(1) = 14.43
Prob > chi2 = 0.0001
The hettest result implies that our model is heteroskedastic: the p-value (0.0001) is significant at the 1% level, i.e., smaller than the standard 0.01, and thus we can reject the null hypothesis of constant variance.
Both of our tests suggest the possible presence of heteroskedasticity in our
model.
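A sketch of the commands that typically produce the results discussed above, assuming the two tests referred to are the Breusch-Pagan test (estat hettest) and White's test (estat imtest, white), with re-estimation under robust standard errors as the remedy:

reg Saving FS Income Sex Wealth
* Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat hettest
* White's general test (assumed here to be the second test referred to above)
estat imtest, white
* Re-estimate with heteroskedasticity-robust (Huber-White) standard errors
reg Saving FS Income Sex Wealth, vce(robust)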
[Stata output: coefficient table with robust standard errors (Saving: Coef., Robust Std. Err., t, P>|t|, 95% Conf. Interval)]
Note that after the robust regression, our inference is no longer distorted by the heteroskedasticity problem.
. estat bgodfrey

Breusch-Godfrey LM test for autocorrelation
lags(p)      chi2       df      Prob > chi2
   1        12.012       1         0.0005

The small p-value (0.0005) leads us to reject the null hypothesis of no serial correlation.
The insignificant hat-square (_hatsq) term indicates that the model has no specification error in its functional form and no omitted relevant variables.
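A sketch of the post-estimation commands behind these results, assuming the "hat square" refers to the _hatsq term produced by Stata's linktest (estat ovtest is an additional, optional check for omitted variables):

* Breusch-Godfrey LM test for serial correlation in the residuals
estat bgodfrey
* Specification (link) test: an insignificant _hatsq suggests no functional-form error
linktest
* Ramsey RESET test for omitted variables (optional extra check)
estat ovtest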
General Guidelines for Building a Regression
Model
Make sure all relevant predictors are included.
These are based on your research question, theory,
empirical evidence and knowledge on the topic.
Get your sampling technique right.
Obtain the right data and choose an appropriate estimation technique.