Prac 10 Autocorrelation
This practical exercise focuses on the methods of detection and remedies for autocorrelation in
multiple regression estimation as discussed in Session 10. Autocorrelation is most often
encountered when working with time-series data.
Please follow the instructions carefully, and only ask a tutor for help if you can’t work out what to
do yourself.
1. Log on to the workstation using your student username and password (Recall: for
emergencies, the generic login is username teacher and password training).
2. Open Stata.
3. Start a log file:
Click on the log icon (fourth button from the left – it looks like a brown notebook).
Select where you want the log file to be saved.
Choose the Log (*.log) format from the ‘save as type’ drop-down list.
Provide a name for the log file (for this prac exercise, you can call your log file prac10).
Click on ‘Save’.
Check that the message in the Results window shows that you’ve started a text file.
4. Download the dataset autocorr1.dta from the Moodle site for the course and save it in a
location that is easy for you to find. Go back to Stata and open the dataset from where you
have saved it.
This data set contains time series data on aggregate consumption expenditure and
disposable income of South African households. To confirm this, type:
desc
5. Regress consumption on disposable income, predicting the residuals and calling the
variable that contains the residuals e. Type:
reg cons inc
predict e, resid
6. Graph the residuals against time (year). Use the option yline(0) to include a horizontal line
at 0 in your scatter plot. Type:
scatter e year, yline(0)
From what you observe, do you think there may be a problem and if so, what sort?
7. We need to generate the lagged values of the residuals in order to see whether there is a
relationship between current and previous values of the residuals. Note that _n is Stata’s
internal reference for the current observation; _n-1 means the previous observation. Type:
gen elag=e[_n-1]
Check the effect of this command by looking at the values of the variable elag in the Data
Browser (you can open this by clicking on the magnifying glass icon).
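The lag construction in step 7 can be tried on a toy dataset first. The sketch below is an illustration only: the five-row dataset and the variable name elag2 are made up for the example, and it assumes the data are sorted by year with no gaps. It also shows Stata's lag operator L., which requires tsset and gives the same values here.

```stata
* Toy data (hypothetical values, not autocorr1.dta)
clear
set obs 5
gen year = 2000 + _n
gen e = _n^2              // e = 1, 4, 9, 16, 25
sort year

* Explicit subscripting: the value from the previous row
gen elag = e[_n-1]

* Lag operator: the value from the previous time period
tsset year
gen elag2 = L.e

list year e elag elag2
```

The first observation of both lag variables is missing because there is no earlier value to copy.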
8. Graph e against elag, including the options xline(0) and yline(0) so that Stata draws
reference lines at zero on both axes of the scatter plot. Type:
scatter e elag, xline(0) yline(0)
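9. Test formally for autocorrelation using the Durbin-Watson test. First declare the data to
be time series. Type:
tsset year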
This tells Stata that the variable corresponding to the time period is called ‘year’. In order to
perform the Durbin-Watson test, type:
dwstat
You need to look up the critical values in the Durbin-Watson d-statistic table (Table B-4 in
Appendix B (from Moodle)), and use the rules in the lecture notes to interpret them. What
can you conclude? Does it confirm your earlier impressions?
10. Conduct the runs test for autocorrelation to confirm your findings above:
runtest e, mean
The null hypothesis is that the residuals are randomly distributed (i.e. no autocorrelation).
Using the p-value displayed with the results of the test, what can you conclude?
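The idea behind the runs test can be seen on a toy series. The sketch below uses made-up residuals (ten negative values followed by ten positive values) rather than the prac data. With ten values on each side of the mean, roughly 11 runs would be expected under randomness, so a series with only 2 runs is strong evidence against the null.

```stata
* Toy residuals: ten negative values followed by ten positive values
clear
set obs 20
gen e = cond(_n <= 10, -1, 1)

* The mean is 0, so the series forms just 2 runs (one below, one above)
runtest e, mean
scalar nruns = r(n_runs)
scalar pval = r(p)
display "runs = " nruns "   p-value = " pval
```

Too few runs suggests positive autocorrelation (residuals keep the same sign for long stretches); too many runs suggests negative autocorrelation.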
A First Approach
11. Given your findings above, first assume that the value of rho = 1 and perform the GLS
transformation of the variables. Note that we are assuming an AR(1) process – is this
necessarily a valid assumption? You can transform your model into a first differenced
equation as indicated in your notes as follows.
First generate the lagged values of both variables:
gen consl=cons[_n-1]
gen incl=inc[_n-1]
Then form the first differences of both the dependent and explanatory variables:
gen dcons=cons-consl
gen dinc=inc-incl
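For reference, the transformation behind this step can be written out. This is the standard textbook AR(1) derivation, with Y standing for cons and X for inc; the symbols (beta_1, beta_2, u_t, epsilon_t) are chosen here for illustration and may differ from the notation in your notes.

```latex
\begin{align*}
Y_t &= \beta_1 + \beta_2 X_t + u_t, \qquad u_t = \rho u_{t-1} + \varepsilon_t \\
\rho Y_{t-1} &= \rho\beta_1 + \rho\beta_2 X_{t-1} + \rho u_{t-1} \\
Y_t - \rho Y_{t-1} &= \beta_1(1-\rho) + \beta_2 (X_t - \rho X_{t-1}) + \varepsilon_t \\
\text{With } \rho = 1:\quad \Delta Y_t &= \beta_2 \Delta X_t + \varepsilon_t
\end{align*}
```

Subtracting the second line from the first removes the autocorrelated part of the error. When rho = 1 the intercept term drops out, and the transformed variables are simply the first differences dcons and dinc.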
12. Regress dcons on dinc, then compute the Durbin-Watson statistic for the transformed model:
dwstat
What do you conclude? Note that, strictly speaking, the DW test is not appropriate here – a
non-parametric test would be suitable (why?).
13. Predict the residuals of the transformed model and conduct the runs test. Type:
B Second Approach
14. Now estimate the value of rho from the DW statistic of the original regression model (the
one plagued by autocorrelation in step 5 above). Recall that d ≈ 2(1 – ρ̂) from equation 4 of your
notes, hence ρ̂ ≈ 1 – d/2. Obtain the estimate of rho using this equation and the DW statistic:
gen rho2=…
What is its value? Type:
display rho2
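The link between the DW statistic and rho follows from expanding the numerator of d. The derivation below is the standard textbook one, written with e_t for the OLS residuals; the notation in your notes may differ. The approximation holds because, in a reasonably long series, the sums of squares over t = 2, ..., n are almost equal to the sum over all n observations.

```latex
\begin{align*}
d &= \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}
   = \frac{\sum_{t=2}^{n} e_t^2 + \sum_{t=2}^{n} e_{t-1}^2 - 2\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2} \\
  &\approx 2\left(1 - \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}\right)
   = 2(1 - \hat{\rho}) \\
\hat{\rho} &\approx 1 - \frac{d}{2}
\end{align*}
```

As a check on the formula, d = 2 corresponds to an estimated rho of 0 (no first-order autocorrelation), while d close to 0 corresponds to an estimated rho close to 1.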
15. Now using this estimate of rho (rho2), perform the GLS transformation of the variables as
explained in your notes. [Notice that we are assuming an AR(1) process – is this
necessarily a valid assumption?]
gen dcons2=cons-(rho2*consl)
gen dinc2=inc-(rho2*incl)
16. Regress the transformed variables as follows (using regdw in order to get Stata to print out
a DW statistic):
C Third Approach
18. Now obtain an estimate of rho from the OLS residuals as follows.
The estimated value of rho (i.e. rho3) is the coefficient on elag. To get Stata to retrieve this
value, enter the command below. (Note: _b means the coefficient on the variable that
follows.)
gen rho3=_b[elag]
gen dcons3=cons-(rho3*consl)
gen dinc3=inc-(rho3*incl)
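The mechanics of retrieving a coefficient with _b[] and using it in a transformation can be checked on a toy series where the answer is known in advance. The sketch below is an assumption-laden illustration, not the prac's own commands: the data are constructed so that each value is exactly half the previous one, the regression is run without a constant, and the estimate is stored in a scalar rather than a variable.

```stata
* Toy series built so that e_t = 0.5 * e_(t-1) exactly (hypothetical data)
clear
set obs 20
gen double e = 0.5^_n
gen double elag = e[_n-1]

* Slope of e on its own lag, without a constant, is the estimate of rho
regress e elag, noconstant
scalar rho3 = _b[elag]
display "estimated rho = " rho3

* Quasi-difference with the estimate; here the result is zero by construction
gen double estar = e - rho3*elag
summarize estar
```

Using gen, as in the prac, stores the same number in every row of a new variable, whereas scalar stores it once; either works inside later gen commands.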
21. What do you conclude? Use the runs test to confirm your findings. Type: