
DOC-20191127-WA0017. Ecs4863

The document provides solutions for past exam papers on ECS4863. It includes calculating South Africa's investment to GDP ratio from 1990 to 2016, building an econometric model for investment by testing for unit roots and cointegration between variables, and estimating an error correction model for investment. Diagnostic checks are also performed on the error correction model.

Uploaded by

thobejanekelly

ECS4863

EXAM PREPARATION SOLUTIONS FOR PAST PAPERS.

NB: Please contact me (Jeka: 073 768 7330) if you need data sets to practice.

May/June 2017

ANSWERS

QUESTION 1: [55]
Question 1.1 (10)

Calculate South Africa’s investment to GDP ratio for the period 1990Q1 to 2016Q2, using the data
provided in the worksheet. Name the variable: inv_per_gdp (Hint: you will have to generate the
variable using GFCF and GDP)

(a) Provide a graph of the variable: inv_per_gdp (2)

[Figure: INV_PER_GDP, 1990 to 2016; vertical axis from 14 to 26]
(b) Comment on the trends you observe for the periods: 1996 – 2001 and 2002 – 2016.
Do they confirm the World Bank’s findings?
Comment:
• Between 1996 and 2001 the investment-to-GDP ratio fluctuated frequently. It dropped significantly in the 1st quarter of 1999, started to increase slowly from the 2nd quarter, and then dropped again in the 1st quarter of 2002.
• From the 2nd quarter of 2002 the investment-to-GDP ratio increased sharply until 2009. This is in line with a 2011 World Bank report, which indicated that the real returns to capital in South Africa rose sharply during the same period. The ratio declined abruptly after the 1st quarter of 2009, which may be owing to the global economic crisis.

(c) Complete the following table: (4)

Descriptive statistic | Value
Mean | 18.29702
Maximum | 25.35447
Minimum | 14.50758
Standard deviation | 2.282499
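
The calculation behind this table can be sketched outside EViews. The Python snippet below is only an illustration: the function names and the four data points are hypothetical (they are not the exam data), the ratio is assumed to be expressed as a percentage, and the standard deviation is assumed to be the sample (n - 1) version.

```python
from statistics import mean, stdev


def inv_per_gdp(gfcf, gdp):
    """Investment-to-GDP ratio in percent, period by period."""
    return [100.0 * i / y for i, y in zip(gfcf, gdp)]


def describe(series):
    """Mean, maximum, minimum and sample standard deviation of a series."""
    return {
        "mean": mean(series),
        "max": max(series),
        "min": min(series),
        "std": stdev(series),
    }


# Hypothetical values, NOT the exam data
gfcf = [50.0, 60.0, 45.0, 80.0]
gdp = [250.0, 300.0, 300.0, 400.0]

ratio = inv_per_gdp(gfcf, gdp)
stats = describe(ratio)
print(ratio)
print(stats)
```

With the real worksheet, the two lists would be replaced by the full GFCF and GDP columns for 1990Q1 to 2016Q2.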

Question 1.2 (45)

Next you need to build an econometric model for investment in South Africa
Calculate the following variables by using the formulas as provided:

Variable name | Description | Formula
RGFCF | Real gross fixed capital formation | RGFCF = GFCF/PPI x 100
RGDP | Real gross domestic product | RGDP = GDP/PPI x 100
INF | Producer inflation (quarter on quarter of same year) | INF = (PPI/PPI(-1)-1) x 100
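
As a rough illustration of the three formulas, the Python sketch below applies them to a short made-up series. The helper names `deflate` and `qoq_inflation` and all numbers are assumptions for illustration only; in the exam these variables are generated in EViews.

```python
def deflate(nominal, ppi):
    """Real series = nominal / PPI x 100."""
    return [100.0 * x / p for x, p in zip(nominal, ppi)]


def qoq_inflation(ppi):
    """INF = (PPI / PPI(-1) - 1) x 100; the first period has no lag, so it is None."""
    return [None] + [
        100.0 * (ppi[t] / ppi[t - 1] - 1.0) for t in range(1, len(ppi))
    ]


# Hypothetical values, NOT the exam data
ppi = [80.0, 100.0, 125.0]
gfcf = [40.0, 60.0, 100.0]

rgfcf = deflate(gfcf, ppi)
inf = qoq_inflation(ppi)
print(rgfcf)
print(inf)
```

Note that one observation is lost when INF is generated, because the first period has no lagged PPI.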

(a) On a single graph show the variables GFCF and RGFCF. Comment on what you
observe as far as their likely stationarity is concerned. (4)

Compiled by Jeka: 073 768 7330 www.jekanomics.com


[Figure: GFCF and RGFCF on a single graph, 1990 to 2016; vertical axis from 0 to 240,000]
Comment:
• By merely inspecting the shapes of the two graphs you should be able to tell that:
• RGFCF seems to be stationary, as its mean and variance seem to be constant over time; it is therefore possibly I(0).
• GFCF seems to be non-stationary, as its mean seems to rise over time. Its variance could be constant around the trend, in which case it is I(1), or its mean could be changing rapidly over time, in which case it is I(2).

(b) Use the ADF test to test for unit roots. Provide your results for the following
variables in the table below (Hint: please remember to log variables, where
appropriate, before performing the tests): (8)

Variable | Model | Lags | ADF test statistic
LRGFCF | Trend and Intercept | 4 | -2.5698
LRGFCF | Intercept | 4 | -0.81538
LRGFCF | None | 4 | 1.6094
INF | Trend and Intercept | 0 | -7.02507***
INF | Intercept | 0 | -6.9075***
INF | None | 0 | -4.2279***
DRGFCF | Trend and Intercept | 3 | -3.0223
DRGFCF | Intercept | 3 | -3.1356**
DRGFCF | None | 3 | -2.6519***
DINF | Trend and Intercept | 5 | -7.306***
DINF | Intercept | 5 | -7.346***
DINF | None | 5 | -7.3767***

Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)
Comment:
The order of integration of the two variables. (2)
INF = I(0), stationary.
LRGFCF = I(1), non-stationary.

[Note: You do not have to include D(INF) in normal scenarios, given the already highly significant results for INF in levels. Please also read section 21.8, p.748-754, which provides different ways of testing for stationarity and notes that in "real life" one test often does not suffice. However, as far as this module is concerned, the results you obtain from the ADF test can be treated as sufficient evidence when testing for stationarity.]

(c) Test for cointegration between variables:

(i) Estimate the following long-run cointegration equation and use your results
to complete the table. (Remember to include an intercept term.) (4)

LRGFCF = f (LRGDP)

Dependent variable: LRGFCFt

Variable | Coefficient
LRGDPt | 1.178005
C | -3.929171

(ii) Interpret the coefficients of the long-run equation. (2)

Comment:
LRGFCF and LRGDP are positively related (i.e. LRGDP has a positive sign), which is correct according to our a priori (economic theory) expectations. The magnitude of the coefficient is not between 0 and 1. Because both variables are in logs, the coefficient is an elasticity: if RGDP increases by 1%, RGFCF increases by about 1.18%.

(iii) Generate the residual series with the command PROC/make residual series.
Name the series: RES_GFCF. Perform a unit root test on RES_GFCF and
report your answers in the table below. (2)

Variable | Model | Lags | ADF test statistic
RES_GFCF | None | 4 | -3.2***

Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)

(iv) Can we conclude that the variables in the long-run equation are indeed
cointegrated? Explain. (2)

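Comment:
• The ADF test statistic for RES_GFCF is statistically significant at the 1% level (i.e. three stars), which means that we can reject the null hypothesis of a unit root in the residuals, that is, the null hypothesis of no cointegration. LRGFCF and LRGDP are therefore cointegrated.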
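
The two-step logic of this residual-based test (estimate the long-run equation, then test the residuals for a unit root) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the EViews procedure: the data are simulated, the unit root regression has no constant and no lagged differences (the exam uses 4 lags), and the resulting t-statistic would have to be compared with appropriate cointegration critical values rather than standard t tables.

```python
import random


def ols_simple(y, x):
    """OLS of y on an intercept and x; returns (intercept, slope, residuals)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def df_tstat(e):
    """t-statistic on rho in d(e_t) = rho * e_(t-1) + v_t (no constant, no lags)."""
    lagged = e[:-1]
    diff = [e[t] - e[t - 1] for t in range(1, len(e))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diff))
    se = (ssr / (len(diff) - 1) / sxx) ** 0.5
    return rho / se


# Simulated data, NOT the exam data: x is a random walk and
# y = 1 + 2x + stationary noise, so x and y are cointegrated by construction.
random.seed(0)
x = [0.0]
for _ in range(199):
    x.append(x[-1] + random.gauss(0, 1))
y = [1.0 + 2.0 * xi + random.gauss(0, 0.5) for xi in x]

intercept, slope, resid = ols_simple(y, x)
t_stat = df_tstat(resid)
print(slope, t_stat)
```

A strongly negative statistic on the residuals, as in the RES_GFCF row above, is what leads to rejecting the null of no cointegration.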

(d) Build an ECM for investment, using the variables and lag lengths as provided in
the table below. Complete the following table: (10 x ½ = 5)

Dependent variable: D(LRGFCF)t

Variable | Coefficient | Std. error | t-statistic
D(LRGDP)t | 0.446465 | 0.156814 | 2.847103
D(LPRIME)t | 0.181358 | 0.070490 | 2.572821
INF | -0.005765 | 0.002745 | -2.100420
RES_GFCFt-1 | -0.123824 | 0.032407 | -3.820912
CONSTANT | 0.014675 | 0.006486 | 2.262541

Sample period (adjusted): 1990Q2 2016Q2
R²: 0.257440
Adjusted R²: 0.227737
S.E. of regression: 0.035941
F-statistic: 8.667299

(e) Perform diagnostic checks on the ECM.

(i) Complete the following table: (6)

Test | Null hypothesis | Test statistic | P-value | Conclusion
Jarque-Bera | H0: Normally distributed residuals | JB = 1.06635 | 0.586738 | Fail to reject H0: residuals are normally distributed.
Ljung-Box Q | H0: No serial correlation | LBQ(6) = 24.189 | 0.0000 | Reject H0: there is serial correlation up to the 6th lag.
Breusch-Godfrey LM test | H0: No serial correlation | nR²(2) = 3.008087 | 0.2222 | Fail to reject H0: there is no autocorrelation up to order 2.
ARCH-LM | H0: No heteroscedasticity | nR²(1) = 3.968550 | 0.0064 | Reject H0: there is heteroscedasticity up to order 2.
White | H0: No heteroscedasticity | nR² (no CT) = 12.819671 | 0.016 | Reject H0: there is heteroscedasticity.
Ramsey RESET | H0: No misspecification | LR(1) = 2.819671 | 0.0931 | Fail to reject H0: no misspecification.

NB: *Rule of thumb: if p < α, reject H0.
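
The rule of thumb can be illustrated with a small Python sketch. It assumes the statistics follow a chi-square distribution under the null and, to stay within the standard library, only covers 1 or 2 degrees of freedom; the function names are hypothetical. The three statistics are taken from the table above.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail p-value of a chi-square statistic, for df = 1 or df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only covers df = 1 or df = 2")


def decide(p_value, alpha=0.05):
    """Rule of thumb: reject H0 when the p-value is below alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


p_jb = chi2_sf(1.06635, 2)      # Jarque-Bera, 2 degrees of freedom
p_bg = chi2_sf(3.008087, 2)     # Breusch-Godfrey with 2 lags
p_reset = chi2_sf(2.819671, 1)  # Ramsey RESET LR statistic with 1 fitted term

for name, p in [("JB", p_jb), ("BG", p_bg), ("RESET", p_reset)]:
    print(name, round(p, 4), decide(p))
```

The choice of α = 0.05 is a convention; at the 10% level the RESET test above would lead to rejection.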

(ii) Given your conclusions on the diagnostic check of the ECM, do you think
that this is an acceptable model? (Please provide reasons, no marks will be
awarded for only stating yes/no.) (4)

Comment:
• The model is unacceptable, because several of the diagnostic tests fail.
• The Ljung-Box Q test gives evidence of serial correlation. (Data are often of a "cyclical" nature. When errors associated with observations of different time periods are related to each other, we refer to the errors as being serially correlated.) Inference based on the model is therefore unreliable.
• The White test indicates that there is heteroscedasticity.
• The ARCH-LM test indicates that there is heteroscedasticity up to order 2.

(f) Regardless of the results you obtained in question 1(d), you still decide to create
a model to combine your long run and ECM. Provide your EViews model statement
(6 marks), and a copy of the graph depicting the actual and modelled values (2
marks), in the space provided.
(8)

res_gfcf = lrgfcf - (-3.92917120657 + 1.17800547382*lrgdp)

lrgfcf = 0.446465400297*d(lrgdp) + 0.181357706001*d(lprime) - 0.00576476633367*inf - 0.123823577887*res_gfcf(-1) + 0.0146753964222 + lrgfcf(-1)

rgfcf = exp(lrgfcf)
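
To see how the three statements interact, the recursion can be written out in Python. This is a sketch only: the coefficients are copied from the model statement, the function name `solve_model` and the input series are hypothetical, and the solution is assumed to be dynamic (only the starting value of lrgfcf comes from the data).

```python
import math

# Coefficients copied from the long-run equation and the ECM
LR_CONST, LR_SLOPE = -3.92917120657, 1.17800547382
B_DGDP, B_DPRIME, B_INF = 0.446465400297, 0.181357706001, -0.00576476633367
B_ECT, B_CONST = -0.123823577887, 0.0146753964222


def solve_model(lrgdp, lprime, inf, lrgfcf0):
    """Build the level of lrgfcf forward from its first value."""
    lrgfcf = [lrgfcf0]
    for t in range(1, len(lrgdp)):
        # Last period's deviation from the long-run relationship
        res_lag = lrgfcf[t - 1] - (LR_CONST + LR_SLOPE * lrgdp[t - 1])
        d_lrgfcf = (
            B_DGDP * (lrgdp[t] - lrgdp[t - 1])
            + B_DPRIME * (lprime[t] - lprime[t - 1])
            + B_INF * inf[t]
            + B_ECT * res_lag
            + B_CONST
        )
        # The change is added to last period's level, not to the residual
        lrgfcf.append(lrgfcf[t - 1] + d_lrgfcf)
    return lrgfcf, [math.exp(v) for v in lrgfcf]


# Hypothetical inputs, NOT the exam data: flat GDP and prime rate, zero
# inflation, and a starting value exactly on the long-run relationship.
lrgdp = [13.0, 13.0, 13.0]
lprime = [2.5, 2.5, 2.5]
inf = [0.0, 0.0, 0.0]
start = LR_CONST + LR_SLOPE * 13.0

path, levels = solve_model(lrgdp, lprime, inf, start)
print(path)
```

In the first step only the constant moves the series; in the second step the error correction term pulls the series back towards the long-run relationship, so the change is smaller.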

QUESTION 2:
(a) Explain your a priori/hypothesized expectations regarding the relationship between the dependent variable (i.e. Admitted to Graduate Program) and the explanatory variables (i.e. Q and V). (2)

Economic theory/expectations:

• The quantitative aptitude test score (Q) and the verbal aptitude test score (V) are both expected to be positively related to the qualitative variable ADM, which indicates whether a student is admitted to the graduate programme. The higher a student's quantitative and verbal aptitude scores, the higher the probability of being admitted to the programme.

Consider the following model:

$ADM_i = \beta_0 + \beta_1 Q_i + \beta_2 V_i + \varepsilon_i$
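(b) Estimate the preceding model using linear probability modelling (LPM). Write down the estimated results, and interpret them in terms of economic meaning and statistical significance. (6)

• Estimated model: ADM = -2.86739 + 0.003126Q + 0.002343V

• Both slope coefficients are positive, as expected. Holding V constant, each additional point on Q raises the estimated probability of admission by about 0.0031; holding Q constant, each additional point on V raises it by about 0.0023.

• Although our statistical results look satisfactory, the LPM is not a satisfactory model because of the non-normality of the error term, heteroscedasticity and other problems.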
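
One of the other well-known problems of the LPM is that fitted probabilities are not confined to the 0-1 range. The Python sketch below plugs a few score combinations into the estimated equation to show this; the scores are hypothetical and are not taken from the exam data set.

```python
def lpm_probability(q, v):
    """Fitted 'probability' of admission from the estimated LPM."""
    return -2.86739 + 0.003126 * q + 0.002343 * v


# Hypothetical score combinations, NOT observations from the exam data
for q, v in [(300, 300), (600, 600), (800, 800)]:
    p = lpm_probability(q, v)
    label = "valid" if 0.0 <= p <= 1.0 else "outside [0, 1]"
    print(q, v, round(p, 4), label)
```

Low scores give a negative fitted value and high scores give a value above 1, neither of which can be a probability.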

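(c) R² is not considered to be a well-suited measure of goodness of fit in this instance. Suggest an alternative measure and briefly explain its workings. (2)

An alternative measure is the count R², which is the number of correct predictions divided by the total number of observations. An observation is predicted as 1 if its fitted probability is greater than 0.5 and as 0 otherwise.

Weighted least squares addresses a different problem, namely the heteroscedasticity of the LPM error term: it is the procedure used to obtain more efficient estimates of the standard errors. The entire LPM is divided by the square root of the weights to obtain the Weighted Least Squares model.
[10]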

QUESTION 3: [20]

(a) Explain the concept of linearity. Also comment on what is meant by an intrinsically linear regression model. (4)

• In linear regression the term "linear" can be understood in two ways: linearity in the variables and linearity in the parameters. Linear regression, however, always means linearity in the parameters, irrespective of linearity in the explanatory variables. A linear regression for two variables is represented mathematically as:

Y = 𝐵1 + 𝐵2 X + u or Y = 𝐵1 + 𝐵2 X² + u (where u is the error term)

• Here the variable X can enter non-linearly (i.e. X or X²) and we can still consider this a linear regression. However, if the parameters are not linear, say the regression equation is Y = 𝐵1² + 𝐵2² X + u, then this cannot be said to represent a linear regression equation.
• Some econometric models may appear non-linear in the parameters but are inherently or intrinsically linear, because with suitable transformations they can be made linear in the parameters. Models that cannot be linearized in this way are called intrinsically non-linear regression models. When we say "non-linear regression model" we mean that it is intrinsically non-linear.
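
As an illustration of an intrinsically linear model, consider a multiplicative specification. The example below is a standard textbook case and is not taken from the exam paper; the symbol α is introduced here only for convenience. Taking natural logs of both sides gives a model that is linear in the parameters α and β2:

```latex
Y_i = \beta_1 X_i^{\beta_2} e^{u_i}
\quad\Longrightarrow\quad
\ln Y_i = \ln\beta_1 + \beta_2 \ln X_i + u_i
        = \alpha + \beta_2 \ln X_i + u_i,
\qquad \alpha = \ln\beta_1 .
```

The transformed model can be estimated by OLS on ln Y and ln X, which is why the original model counts as intrinsically linear.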

Source: Also See Gujarati (2008): 38-39


(b) Are the following models linear regression models? Why or why not? (2)

(i) $\ln Y_i = \beta_1 + \beta_2 \left( \frac{1}{X_i} \right) + \varepsilon_i$
• Linear in the parameters: the parameters are raised to the first power only.
(ii) $Y_i = \beta_1 + \beta_2^3 X_i + \varepsilon_i$
• Non-linear in the parameters, since the parameter β2 is raised to the 3rd power.

NB: The term “linear” regression will always mean a regression that is linear in the parameters;
the β’s (that is, the parameters) are raised to the first power only. It may or may not be
linear in the explanatory variables, the X’s (see Gujarati 2008:38).

(c) Discuss the following statement: “Researchers should always keep in mind that their
results are only as good as the data they are working with”
(4)

• When researchers find that the results of their research are unsatisfactory, the cause may be not that they used the wrong model but that the quality of the data was poor. Unfortunately, because of the non-experimental nature of the data used in most social science studies, researchers very often have no choice but to depend on the available data.
• But they should always keep in mind that the data used may not be the best and should try not to be too dogmatic about the results obtained from a given study, especially when the quality of the data is suspect.
• Because of the reasons listed below, and many other problems, the researcher should always keep in mind that the results of research are only as good as the quality of the data.
o Reasons:
i) There is the possibility of observational errors, either of omission or commission, since the data are non-experimental.
ii) Even in experimentally collected data, errors of measurement arise from approximations and round-offs.
iii) In questionnaire-type surveys, the problem of nonresponse can be serious and can cause selectivity bias.
iv) The sampling methods used in obtaining the data may vary so widely that it is often difficult to compare the results obtained from the various samples.

v) Economic data are generally available at a highly aggregate level (macro-level data). Such highly aggregated data may not tell us much about the individuals or micro units that may be the ultimate object of study.
vi) Because of confidentiality, certain data can be published only in highly aggregate form, and such macro analysis often fails to reveal the dynamics of the behaviour of the micro units.

d) In your own words explain the differences between linear probability models (LPMs) and logistic (Logit) models. (10)

Aspect | Linear probability model (LPM) | Logit model
Specification | Underlying probability distribution is the binomial distribution | Underlying probability distribution is the logistic distribution
Estimation | Ordinary Least Squares (OLS) | Maximum Likelihood Estimation technique
Problem/Advantage | Possibility of getting predicted values greater than 1 or less than 0. | The predicted values are bounded between 0 and 1 (0 < y < 1).

QUESTION 4: [15]

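Consider the following simultaneous-equation model developed for some hypothesised economy:

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 I_t + \varepsilon_{1t}$

$I_t = \beta_3 + \beta_4 Y_t + \beta_5 Q_t + \varepsilon_{2t}$

$C_t = \beta_6 + \beta_7 Y_t + \beta_8 C_{t-1} + \beta_9 P_t + \varepsilon_{3t}$

$Q_t = \beta_{10} + \beta_{11} Q_{t-1} + \beta_{12} Q_{t-2} + \beta_{13} R_t + \varepsilon_{4t}$
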


Where:
Y = national income
I = Investment
C = Consumption
Q = Profits
P = Price index
R = Productivity

(a) Which of the variables are endogenous and which are exogenous? Make sure to also
include the lagged variables in your answer.

Endogenous variables:

Y, I, C and Q

Exogenous (predetermined) variables:
𝑌𝑡−1, 𝑄𝑡−1, 𝐶𝑡−1, 𝑄𝑡−2, R and P. The lagged endogenous variables are predetermined, so they are treated as exogenous in the current period; R and P are exogenous.
(10 x ½ = 5)
(b) Explain the method of Indirect Least Squares (ILS) and comment if it will be suited for this
model?

The ILS method involves three steps:

• Obtain the reduced-form equations.
• Apply Ordinary Least Squares (OLS) to the reduced-form equations individually.
• Obtain estimates of the original structural coefficients from the reduced-form coefficients.

ILS is suited only to exactly identified equations. In cases where equations are over-identified, ILS cannot be used, because it does not give unique estimates of the structural coefficients. In such a case, the more popular 2SLS is used.

(5)
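
Whether ILS is usable therefore depends on the identification status of each equation. The Python sketch below applies the order condition (an equation with m endogenous variables needs at least m - 1 of the system's predetermined variables excluded from it) to the four equations of the model above. The variable labels and the function name are assumptions made for illustration, and the order condition is necessary but not sufficient, so this is a first check only.

```python
# Variables of the system, labelled as in the model above
ENDOGENOUS = {"Y", "I", "C", "Q"}
PREDETERMINED = {"Y(-1)", "C(-1)", "Q(-1)", "Q(-2)", "P", "R"}

# Variables appearing in each structural equation (left- and right-hand side)
EQUATIONS = {
    "Y": {"Y", "Y(-1)", "I"},
    "I": {"I", "Y", "Q"},
    "C": {"C", "Y", "C(-1)", "P"},
    "Q": {"Q", "Q(-1)", "Q(-2)", "R"},
}


def order_condition(variables):
    """Classify an equation by comparing K - k with m - 1."""
    m = len(variables & ENDOGENOUS)        # endogenous variables included
    k = len(variables & PREDETERMINED)     # predetermined variables included
    excluded = len(PREDETERMINED) - k      # K - k
    if excluded < m - 1:
        return "under-identified"
    if excluded == m - 1:
        return "exactly identified"
    return "over-identified"


status = {name: order_condition(v) for name, v in EQUATIONS.items()}
for name, result in status.items():
    print(name, result)
```

On this check every equation is over-identified, which is consistent with the comment that 2SLS rather than ILS would be used.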
(c) Explain the concept of unilateral causal dependence, and if it is relevant to the above
model?

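Unilateral causal dependence is the defining feature of a recursive (triangular) system: each endogenous variable depends only on endogenous variables determined earlier in the causal chain, so there is no feedback and OLS can be applied to each equation separately. That is not the case in the above model. Here we do have a simultaneous-equation problem: from the structure of the system it is clear that there is interdependence among the endogenous variables.
• Y depends on I in the first equation while I depends on Y in the second equation, and the second and third equations contain both exogenous and endogenous variables on the right-hand side.
• Thus, the equations do not exhibit a unilateral causal dependence, and OLS cannot be applied to each equation separately.
(3)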
(d) In your own words, explain the problem of an equation being underidentified. (2)
• Underidentification means that too little information is available: numerical values for the structural parameters cannot be obtained from the reduced-form coefficients.

Appendix

ECM EVIEWS OUTPUT

End

January/February 2019
Answers.

QUESTION 1: [55]

Question 1.1 (5)

Use the information as provided to answer the following questions

(a) Draw a graph of real imports of goods and services and paste it in the space provided. (1)

[Figure: REAL IMPORTS, 1970 to 2015; vertical axis from 100,000 to 1,000,000]

(b) To what do you attribute the sudden drop in imports around 2009? (1)

Comment: The global financial crisis and recession of 2007-2009.

(c)Calculate the imports to GDP (IMP: GDP) ratio. Copy and paste a graph of the ratio for the
whole period and comment on the relative size/importance of imports for the South
African economy. (3)

[Figure: IMP_GDP Ratio, 1970 to 2015; vertical axis from .12 to .32]

Comment:
i. Prior to South Africa's transition to democracy (1994), imports were on average below 20% of Gross Domestic Product, possibly because South Africa was a relatively closed economy owing to the sanctions against apartheid.
ii. In the period 2009-2015 the relative size of imports was about 29% of GDP, which implies that the South African economy had become integrated with the international community (an open economy).

Question 1.2 (50)

(a)Test the variables for stationarity. Provide your results for the following variables in the table
below: (8)

Variable | Model | Lags | ADF test statistic
LGDP | Trend and Intercept | 1 | -1.918751
LGDP | Intercept | 1 | -0.273673
LGDP | None | 1 | 3.358554
LIMP | Trend and Intercept | 0 | -1.914282
LIMP | Intercept | 0 | 0.119503
LIMP | None | 0 | 1.993517
DLGDP | Trend and Intercept | 0 | -4.6459***
DLGDP | Intercept | 0 | -4.7006***
DLGDP | None | 0 | -2.9552***
DLIMP | Trend and Intercept | 0 | -6.0652***
DLIMP | Intercept | 0 | -5.9530***
DLIMP | None | 0 | -5.64726***

Statistically significant at the: 10% level (*), 5% level (**), 1% level (***).

Comment: LGDP and LIMP are non-stationary in levels and stationary in first differences, i.e. integrated of order one, I(1).

You can assume that the variables LPZ, LRELPZ, LRAND and LCPI are non-stationary,
integrated of order I (1).

(b) Test for cointegration between variables:

i) Estimate the following long-run cointegration equation and use your results to complete
the table. (4)

IMP= f (GDP, RELPZ, DUM)

Dependent variable: LIMPt

Variable | Coefficient
LGDPt | 1.372653
LRELPZt | -0.491880
DUMt | 0.220729
C | -7.060542

(ii) Evaluate the potential long-run equation. Do the estimated coefficients correspond
to your a priori expectations in terms of size and sign? Explain. (6)

Economic Evaluation.
i. To evaluate our (potential) cointegration equation: LGDP is positively related to LIMP (i.e. it has a positive sign), which is correct according to our a priori (economic theory) expectations. The magnitude of the coefficient is not between 0 and 1: it is an elasticity greater than one, so if GDP increases by 1% we expect imports to increase by about 1.4%.
ii. LRELPZ is negatively related to LIMP (i.e. it has a negative sign), which is also correct according to our a priori (economic theory) expectations. The magnitude of the coefficient lies between 0 and 1 in absolute value: if relative prices decrease by 1% we expect imports to increase by about 0.49%.
iii. Dummy: Other things being equal, real imports were on average about 22% higher after 1994 (the coefficient of 0.220729 is an approximate proportional change, because the dependent variable is in logs). This is in line with a priori economic theory expectations.
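
Because the dependent variable is in logs, reading the dummy coefficient directly as a percentage is an approximation. The Python sketch below compares it with the exact proportional change, exp(coefficient) - 1; the function name is hypothetical and the coefficient is the one reported in the long-run table.

```python
import math


def dummy_effect_percent(coefficient):
    """Exact percentage change implied by a dummy in a log-dependent model."""
    return 100.0 * (math.exp(coefficient) - 1.0)


coef = 0.220729
approx = 100.0 * coef                # reading the coefficient directly
exact = dummy_effect_percent(coef)   # exp(b) - 1
print(round(approx, 2), round(exact, 2))
```

The gap between the two readings grows with the size of the coefficient, so the exact version is the safer one to quote when the coefficient is not small.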

ii) Generate the residual series: RES_IMP. Test the residual series for stationarity and complete the table below. (2)

Variable | Model | Lags | ADF test statistic
RES_IMP | None | 0 | -2.178582**

Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)

iii) Can we conclude that the variables in the long-run equation are indeed cointegrated? Explain. (2)

• Our results show that the ADF test statistic for RES_IMP is statistically significant at the 5% level (i.e. two stars), which means that we can reject the null hypothesis of no cointegration. The variables in the long-run equation are therefore cointegrated.

(c) Build an ECM for real imports, using the variables (and lags) as provided in the table below.

(i) Complete the following table: (10 x ½ = 5)

Dependent variable: D(LIMP)t

Variable | Coefficient | Std. error | t-statistic
D(LGDP)t | 3.452632 | 0.382775 | 9.019997
D(LRAND)t-1 | 0.216804 | 0.101239 | 2.141509
RES_IMPt-1 | -0.265124 | 0.050740 | -5.225135
CONSTANT | -0.053742 | 0.012239 | -4.391078

Sample period (adjusted): 1972 2016

Adjusted R²: 0.765795

(ii) Evaluate the ECM statistically. (5)

• The coefficient of the lagged residual should be negative, lie between -1 and 0, and be statistically significant. In our example the lagged residual is RES_IMP(-1).
• From the EViews output we can see that it has a negative sign (-0.265124), which is between -1 and zero, and also that it is statistically significant (t-statistic of -5.23, p-value < 0.05). About 27% of any deviation from the long-run equilibrium is corrected each year.
• The adjusted R² is 0.766, indicating that about 77% of the variation in the dependent variable is explained by the explanatory variables after adjusting for degrees of freedom. This is more than 60% and is therefore defendable.
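
The size of the error correction coefficient can be translated into a speed of adjustment. The Python sketch below is a simplification that ignores the other short-run terms in the ECM: it assumes a deviation from equilibrium shrinks by the factor (1 + coefficient) each period. The function names are hypothetical.

```python
import math


def remaining_gap(adjustment_coef, periods):
    """Share of an initial disequilibrium left after a number of periods."""
    return (1.0 + adjustment_coef) ** periods


def half_life(adjustment_coef):
    """Periods needed for half of a disequilibrium to be corrected."""
    return math.log(0.5) / math.log(1.0 + adjustment_coef)


coef = -0.265124  # coefficient on RES_IMP(-1) from the ECM table
print(round(remaining_gap(coef, 1), 4))
print(round(half_life(coef), 2))
```

With annual data this suggests that roughly half of a deviation from the long-run relationship is removed within a little over two years.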

(d) Perform diagnostic checks on the ECM. (6)

Test | Null hypothesis | Test statistic | P-value | Conclusion
Jarque-Bera | H0: Normally distributed residuals | JB = 0.7222799 | 0.696701 | Fail to reject H0. Residuals are normally distributed.
Ljung-Box Q | H0: No serial correlation | LBQ(6) = 9.5820 | 0.143 | Fail to reject H0. No serial correlation up to the 6th lag.
Breusch-Godfrey LM test | H0: No serial correlation | nR²(2) = 6.704582 | 0.0350 | Reject H0. There is autocorrelation up to second order.
ARCH-LM | H0: No heteroscedasticity | nR²(1) = 0.162662 | 0.6867 | Fail to reject H0. There is no 1st order autoregressive conditional heteroscedasticity.
White | H0: No heteroscedasticity | nR² (no CT) = 3.862538 | 0.2767 | Fail to reject H0. No heteroscedasticity.
Ramsey RESET | H0: No misspecification | LR(1) = 0.884096 | 0.3471 | Fail to reject H0. No misspecification.

e) Create a model statement in EViews to combine your long-run and ECM. Provide
your model statement (4 marks), and a copy of the graph depicting the actual and
modelled values (2 marks), in the space provided. (6)

RES_IMP = LIMP - (1.37265323557*LGDP - 0.491879963791*LRELPZ + 0.220729334239*DUM - 7.0605424033)

LIMP = 3.4526315628*D(LGDP) + 0.216803654946*D(LRAND(-1)) - 0.265124421849*RES_IMP(-1) - 0.0537415348293 + LIMP(-1)

IMP = EXP(LIMP)

[Figure: IMP (actual) and IMP (Baseline), 1970 to 2015; vertical axis from 100,000 to 1,000,000]

f) Supply your estimated value for imports (IMP^) for the year 2016. (2)

774080.70

g) Discuss the performance of the model (relative to the actual values). In your
discussion focus on two specific periods: 1970-2010 and 2011-2016. (4)

The actual values and the fitted values of real imports are reasonably close to each other over the period 1970 to 2010. In the period 2011 to 2016, however, the fitted values diverge from the actual values, with the model over- or underestimating real imports.

QUESTION 2:
(a) The Port Elizabeth Lifeguard Association received 100 applications for positions as trainee
lifeguards for 2018. Using their historic records, it is evident that, on average, 6 out of ten
females and 4 out of ten males, are selected. Calculate the following:

(i) The probability of being selected (or not) of both genders.


(2)
(ii) The odds of selection of both genders.
(2)
(iii) The odds ratio (between females and males) of selection. What does this mean
if you are a female applicant? (3)
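Answers:

(a) (i) The probability of being selected for:
Females: 6/10 = 0.6 (or 60%), so the probability of not being selected is 1 - 0.6 = 0.4 (or 40%)
Males: 4/10 = 0.4 (or 40%), so the probability of not being selected is 1 - 0.4 = 0.6 (or 60%)
(ii) The odds of selection, $P_i / (1 - P_i)$, for:
Females: 0.6/(1 - 0.6) = 1.5, i.e. 1.5 to 1 that a female will be selected
Males: 0.4/(1 - 0.4) = 0.67, i.e. 0.67 to 1 that a male will be selected
(iii) Odds ratio (females to males):
1.5/0.6667 = 2.25
If you are a female applicant, your odds of being selected are 2.25 times those of a male applicant, so you have a higher chance of being selected than a male counterpart.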
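
The three quantities in this answer (probability, odds and odds ratio) can be checked with a few lines of Python. The sketch uses the selection rates given in the question; the function name `odds` is an assumption for illustration.

```python
def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1.0 - p)


p_female, p_male = 0.6, 0.4   # selection rates from the historic records

odds_female = odds(p_female)
odds_male = odds(p_male)
odds_ratio = odds_female / odds_male

print(round(odds_female, 2), round(odds_male, 2), round(odds_ratio, 2))
```

The odds ratio compares the two groups directly, whereas the odds describe each group on its own.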

(b) In this question, you need to analyse data related to housing loan applications of 40
individuals.
You are provided the following information in sheet “ECS4863 JanFeb 19 Question 2b” of
the MS EXCEL file “ECS4863_Jan Feb 2019 exam data.xls”.

The variables are:

Loan application outcome (Y) = 1, if loan is approved; 0, otherwise


Deposit (Dep) = size of deposit (percentage)
Income-to-loan ratio (IL) = individual’s income relative to loan amount (percentage)

(i) Explain your a priori/hypothesized expectations regarding the relationship


between the dependent variable (i.e, Y) and each of the explanatory variables.
(2)

Consider the following logit model:

$L_i = \ln\left( \frac{P_i}{1 - P_i} \right) = \beta_0 + \beta_1 Dep_i + \beta_2 IL_i + \varepsilon_i$

(ii) Estimate the preceding model using logistic modelling (Logit). Write down
the estimated results. (3)

(iii) Interpret your results in terms of economic meaning and statistical


significance. (4)

Answers.

(b)
(i) We expect a positive relationship between the loan application outcome and the deposit: the larger the deposit an applicant puts down, the less risk the lender carries, and the more likely the loan is to be approved.
We also expect a positive relationship between the loan application outcome and the income-to-loan ratio: the higher an individual's income relative to the loan amount, the more likely they are to be able to pay back the loan, and therefore the more likely the bank is to approve it.

(ii) $L_i = -10.31534 + 0.147302\,Dep_i + 0.194575\,IL_i$
se = (3.190573) (0.065864) (0.061948)
McFadden R² = 0.439397

(iii) The estimated slope coefficients suggest that for a unit increase in the deposit, the log of the odds in favour of the loan being approved goes up by 0.15 units, holding the income-to-loan ratio constant. Similarly, for a unit increase in the individual's income relative to the loan amount, the log of the odds in favour of the loan being approved goes up by 0.19 units. Both variables are statistically significant at the 5% level of significance (z = 0.147302/0.065864 ≈ 2.24 and z = 0.194575/0.061948 ≈ 3.14, both above 1.96).
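
Log-odds are hard to read directly, so it can help to convert the estimates into odds multipliers and probabilities. The Python sketch below does this with the reported coefficients and standard errors; the function name and the example applicant (a 20% deposit and a 40% income-to-loan ratio) are hypothetical.

```python
import math

# Estimates and standard errors as reported in part (ii)
B0, B_DEP, B_IL = -10.31534, 0.147302, 0.194575
SE_DEP, SE_IL = 0.065864, 0.061948


def approval_probability(dep, il):
    """Predicted probability of approval from the estimated logit."""
    logit = B0 + B_DEP * dep + B_IL * il
    return 1.0 / (1.0 + math.exp(-logit))


z_dep = B_DEP / SE_DEP            # z-statistic for the deposit coefficient
z_il = B_IL / SE_IL               # z-statistic for the income-to-loan coefficient
odds_mult_dep = math.exp(B_DEP)   # odds multiplier per unit of Dep
odds_mult_il = math.exp(B_IL)     # odds multiplier per unit of IL

p_example = approval_probability(20.0, 40.0)  # hypothetical applicant
print(round(z_dep, 2), round(z_il, 2))
print(round(odds_mult_dep, 3), round(odds_mult_il, 3), round(p_example, 3))
```

Unlike the linear probability model, the predicted probability from the logit always lies strictly between 0 and 1.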

[15]

QUESTION 3.

(a)Suppose you have monthly data over a number of years, how many dummy variables will you
introduce to test the following hypotheses (provide an answer for both a present and
suppressed intercept term)?

(i) All 12 months of the year exhibit seasonal patterns.


(ii) Only March, June, September and December exhibit seasonal patterns.

(b)Discuss the following measurement scales of variables and provide your own example of
each: (4)

(i) Interval scale:

 An interval scale variable satisfies the last two properties of the ratio scale variable but
not the first. Thus, the distance between two time periods, say (2000–1995) is
meaningful, but not the ratio of two time periods (2000/1995). At 11:00 a.m. PST on
August 11, 2007, Portland, Oregon, reported a temperature of 60 degrees Fahrenheit
while Tallahassee, Florida, reached 90 degrees. Temperature is not measured on a
ratio scale since it does not make sense to claim that Tallahassee was 50 percent
warmer than Portland. This is mainly due to the fact that the Fahrenheit scale does not
use 0 degrees as a natural base.

(ii) Ordinal scale:

 A variable belongs to this category only if it satisfies the third property of the ratio scale
(i.e., natural ordering). Examples are grading systems (A, B, C grades) or income class
(upper, middle, lower).
 For these variables the ordering exists but the distances between the categories cannot
be quantified. Students of economics will recall the indifference curves between two
goods. Each higher indifference curve indicates a higher level of utility, but one cannot
quantify by how much one indifference curve is higher than the others.

c) Classical linear regression relies strongly on various assumptions underlying the method
of least squares (OLS).

i) Explain any two of these assumptions. (4).

 Assumption 1: The regression model is linear in the coefficients and the error term: This
assumption addresses the functional form of the model.
 Assumption 2: The error term has the population mean of zero. Under this assumption
the error term accounts for the variation in the dependent variable that the independent
variable does not explain.

ii) Comment on how realistic these assumptions are in practice and how you can apply them when reviewing research done by others. (3)
• The realism of assumptions is an age-old question in the philosophy of economic science. Some argue that it does not matter whether the assumptions are realistic; what matters are the predictions based on those assumptions.
• Notable among the proponents of the irrelevance-of-assumptions thesis is Milton Friedman. To him, unreality of assumptions is a positive advantage: "to be important . . . a hypothesis must be descriptively false in its assumptions."
• One may not subscribe to this viewpoint fully, but in any scientific study we make certain assumptions because they facilitate the development of the subject matter in gradual steps, not because they are necessarily realistic in the sense that they replicate reality exactly. If simplicity is a desirable criterion of good theory, all good theories idealize and oversimplify outrageously.

QUESTION 4: [10]
(a) State if the following statements are true or false. Provide a brief reason for your
answer (no reason = no marks!). (4)

(i) A behavioural equation is one that expresses an endogenous variable solely in


terms of the predetermined variables and stochastic disturbances.
(ii) Reduced-form coefficients are also known as impact multipliers.
(iii) In econometric models endogenous variables play a crucial role and are often
under the control of the government.
(iv) There is no such thing as an R² for simultaneous-equation models as a whole.

Answers
(a)
(i) False, because a behavioural equation can also include endogenous variables as explanatory variables, especially in a multi-equation model. An equation that expresses an endogenous variable solely in terms of the predetermined variables and stochastic disturbances is a reduced-form equation.
(ii) True, reduced-form coefficients are also known as impact, or short-run, multipliers, because they measure the immediate impact on the endogenous variable of a unit change in the value of the exogenous variable.
(iii) False. Endogenous variables are determined within the model; they are not often under the control of the government.
(iv) False, an R² can also be calculated for simultaneous-equation models.

(b) Consider the following basic linear Keynesian macroeconomic model of the
South African economy:

$Y_t = C_t + I_t + G_t + NX_t$

$C_t = \beta_0 + \beta_1 YD_t + \beta_2 C_{t-1} + \varepsilon_{1t}$

$YD_t = Y_t - T_t$

$I_t = \beta_3 + \beta_4 Y_t + \beta_5 r_{t-1} + \varepsilon_{2t}$
Where:

Yt = Gross Domestic Product (GDP) in year t


Ct = total personal consumption expenditure in year t
It = total gross private domestic investment in year t
Gt = government purchases of goods and services in year t
NXt = net exports of goods and services (exports minus imports) in year t
Tt = taxes in year t.
rt = the interest rate in year t.
YDt = disposable income in year t.

i) Which of the variables are endogenous and which are exogenous? Make sure to
also include the lagged variables in your answer. (8 x ½ = 4)
ii) Which single-equation estimation method are you most likely to use to estimate the
reduced form equations? Why? (2)

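(i) Endogenous: Y_t, C_t, YD_t and I_t
Exogenous: G_t, T_t and NX_t
Predetermined (lagged): C_{t-1} and r_{t-1}
(ii) OLS. Each reduced form equation expresses an endogenous variable solely in
terms of the predetermined (exogenous and lagged) variables and a disturbance
term, so the regressors are uncorrelated with the error term and the
assumptions of OLS are satisfied.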

END

MAY/JUNE 2019 – EXAMINATION.

MEMO

QUESTION 1: [55]

You are employed as the chief economist for the BRICS New Development Bank (NDB). Your task is to
prepare for a seminar at the African Regional Centre of the BRICS NDB in Sandton, Johannesburg. The
focus of the seminar is on possible drivers of economic growth in emerging economies that make up the
BRICS (Brazil, Russia, India, China and South Africa). You turn to theory and find that GDP is negatively
correlated with inflation and interest rates. The causality regarding unemployment is however difficult to
determine.

You decide to begin with a model for Brazil (with the intention of replicating the study for the other
countries in the group). You manage to get quarterly data for Brazil, for the period 1994Q1 to 2012Q4.
The variables of interest are:

• Nominal GDP (national currency)
• Consumer price index (CPI)
• Money market rates (nominal).

Question 1.1 (5)

Use the information as provided by the BRICS NDB and explain how you will calculate the
following (You may use mathematical/statistical notation, but make sure to explain its
components. Note that you only need to explain how you will go about calculating the variables,
no actual calculations are required):

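(a) Real GDP (2)

RGDP_t = (GDP_t / CPI_t) * 100

where GDP_t is nominal GDP in quarter t and CPI_t is the consumer price index in quarter t.

(b) Inflation Rate (using the quarter on same quarter of previous year method) (2)

INF_t = ((CPI_t / CPI_{t-4}) - 1) * 100

where CPI_{t-4} is the CPI in the same quarter of the previous year. Because the data are
quarterly, the same quarter of the previous year is four periods back, not one.

(c) Real Interest rates (use money market rates as proxy for interest rates) (1)

Real interest rate_t = Nominal interest rate_t - Inflation rate_t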
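The three calculations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample figures are hypothetical, the data are assumed to be quarterly (so the "same quarter of the previous year" is four observations back), and the first four inflation values are left undefined because no year-earlier CPI exists for them.

```python
def real_gdp(nominal_gdp, cpi):
    """Deflate nominal GDP by the CPI (index with base = 100)."""
    return [g / p * 100 for g, p in zip(nominal_gdp, cpi)]


def yoy_inflation(cpi, lag=4):
    """Quarter on same quarter of previous year inflation, in percent.

    lag=4 assumes quarterly data; the first `lag` values are None.
    """
    return [None if t < lag else (cpi[t] / cpi[t - lag] - 1) * 100
            for t in range(len(cpi))]


def real_interest(nominal_rate, inflation):
    """Real rate = nominal rate minus inflation (None where inflation is missing)."""
    return [None if inf is None else r - inf
            for r, inf in zip(nominal_rate, inflation)]


# Hypothetical numbers, for illustration only (not the exam data)
cpi = [100.0, 101.0, 102.0, 103.0, 110.0]
nominal_gdp = [500.0, 510.0, 520.0, 530.0, 560.0]
money_market_rate = [15.0, 15.0, 14.0, 14.0, 13.0]

inflation = yoy_inflation(cpi)
print(real_gdp(nominal_gdp, cpi))
print(inflation)
print(real_interest(money_market_rate, inflation))
```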

Question 1.2 (50)

After cleaning and renaming the data, you have the following variables:

BRA_GDP = Brazil Real Gross Domestic Product (GDP), (National currency, 2010 = 100)
BRA_INFLA = Brazil Inflation Rate, Percent (quarter on same quarter of previous year)
BRA_INT = Brazil Real interest rate, (percentage per annum)
BRA_UNEMP = Brazil Unemployment rate (percent, %)
BRA_DUM = Dummy variable for inflation targeting (introduced in 1999, thus dummy = 0
before 1999; and 1 from 1999Q1)

Quarterly data for the above variables is available in sheet “ECS4863 MayJun 19 Question 1” of the
MS EXCEL file “ECS4863_May June 2019 exam data.xls”.

(a) Test the variables for stationarity. Provide your results for the following variables in the
table below: (8)

Variable        Model                  Lags   ADF test statistic
LBRA_GDP        Trend and Intercept    5      -1.099883
                Intercept              5      1.580744
                None                   5      5.116498
LBRA_UNEMP      Trend and Intercept    4      -1.723455
                Intercept              4      -2.185123
                None                   4      -0.026812
DLBRA_GDP       Trend and Intercept    4      -5.375804***
                Intercept              3      -2.947926*
                None                   3      -2.961132***
DLBRA_UNEMP     Trend and Intercept    4      -5.227453***
                Intercept              3      -3.173179**
                None                   3      -3.19505***
Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)

Assume that the other variables (i.e. BRA_INFLA, BRA_INT, BRA_DUM) are non-stationary
and integrated of order one, I(1).

(b) Test for cointegration between variables:

(i) Estimate the following long-run cointegration equation and use your results to
complete the table. (2)

LBRA_GDP = f (INFL, UNEMP, DUMMY)

Dependent variable: LBRA_GDPt

Variable        Coefficient
LBRA_INFLAt     -0.095827
LBRA_UNEMPt     -0.635853
BRA_DUMt        0.504946
C               28.37804

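(ii) Interpret the coefficients of the long-run equation. (3)

• LBRA_INFLA: because both variables are in logs, the coefficient is an elasticity. If
BRA_INFLA increases by 1%, BRA_GDP will decrease by about 0.095827%, holding the other
variables constant.
• LBRA_UNEMP: if BRA_UNEMP increases by 1%, BRA_GDP will decrease by about 0.635853%,
holding the other variables constant.
• BRA_DUM: when BRA_DUM = 0 the coefficient has no effect on LBRA_GDP. When
BRA_DUM = 1 (inflation targeting, from 1999Q1) LBRA_GDP is higher by 0.504946, that is,
the intercept shifts from 28.37804 to 28.882986. The coefficient is positive.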
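Because the dependent variable is in logs while the dummy is not, the dummy coefficient is a shift in log GDP rather than a percentage. A common way to express it as a percentage is 100·(exp(coefficient) − 1). The short sketch below applies that standard conversion to the reported coefficient; the function name is hypothetical and the conversion is offered as an aid to interpretation, not as part of the memo's marking scheme.

```python
import math


def dummy_effect_percent(coefficient):
    """Percentage change in the level of the dependent variable when a dummy
    switches from 0 to 1 in a model whose dependent variable is in logs."""
    return (math.exp(coefficient) - 1) * 100


bra_dum_coefficient = 0.504946  # long-run coefficient on BRA_DUM reported above
print(round(dummy_effect_percent(bra_dum_coefficient), 2))
```

For small coefficients the approximation 100·coefficient is close to the exact figure, but for a coefficient as large as 0.5 the two differ noticeably.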

(iii) Do the estimated coefficients correspond to your a priori expectations in terms of
size and sign? Explain. (2)
• To evaluate our (potential) cointegration equation, it is evident that both LBRA_INFLA
and LBRA_UNEMP are negatively related to LBRA_GDP (i.e. they both have a negative
sign), which is correct according to our a priori (economic theory) expectations.
• The magnitudes of the coefficients seem plausible in that all are between 0 and 1 in
absolute value: if BRA_INFLA increases by 1% we expect BRA_GDP to decrease by
0.096%, while if BRA_UNEMP increases by 1% we expect BRA_GDP to decrease
by 0.64%.

(iv) Generate the residual series: RES_BRAZIL. Perform a unit root test on
RES_BRAZIL and report your answers in the table below. (2)

Variable        Model    Lags   ADF test statistic
RES_BRAZIL      None     0      -3.705290***

Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)

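(v) Can we conclude that the variables in the long-run equation are indeed
cointegrated? Explain. (2)

• Yes. The ADF test statistic on RES_BRAZIL (-3.705290) is statistically significant at the
1% level (three stars), which means that we can reject the null hypothesis of a unit root in
the residuals (i.e. of no cointegration). The residuals are stationary, so the variables in
the long-run equation are cointegrated.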

(c) Build an ECM for the GDP in Brazil, using the variables as provided in the table below.

Complete the following table: (10 x ½ = 5)

Dependent variable: D(LBRA_GDP)t

Variable           Coefficient   Std. error   t-statistic
D(LBRA_GDP)t-1     -0.378687     0.092478     -4.094868
D(LBRA_INFL)t      -0.046175     0.014763     -3.127722
D(LBRA_UNEMP)t     -0.295189     0.061206     -4.822870
D(BRA_INT)t        -2.78E-05     4.14E-06     -6.725834
RES_BRAZILt-1      -0.079751     0.041806     -1.907636
CONSTANT           0.015965      0.004924     3.242329

Sample period (adjusted): 1994Q3 2012Q4
R²: 0.497396
Adjusted R²: 0.460439
S.E. of regression: 0.039421

(d) Perform diagnostic checks on the ECM.

(i) Complete the following table: (6)

Test                 Null hypothesis                      Test statistic               P-value    Conclusion
Jarque-Bera          H0: Normally distributed residuals   JB = 1.847019                0.397123   Fail to reject H0. Residuals are normally distributed.
Ljung-Box Q          H0: No serial correlation            LBQ(6) = 46.288              0.000      Reject H0. There is serial correlation up to the 6th lag.
Breusch-Godfrey LM   H0: No serial correlation            nR²(2) = 17.58607            0.0002     Reject H0. There is autocorrelation up to order 2.
ARCH-LM              H0: No heteroscedasticity            nR²(1) = 2.528796            0.1118     Fail to reject H0. No heteroscedasticity up to order 1.
White                H0: No heteroscedasticity            nR² (no CT) = 4.871842       0.4317     Fail to reject H0. No heteroscedasticity.
Ramsey RESET         H0: No misspecification              LR(1) = 0.189599             0.6633     Fail to reject H0. No misspecification.
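Each conclusion in a diagnostic table of this kind follows from one rule: reject H0 when the p-value is below the chosen significance level. The sketch below applies that rule to the p-values reported for the Brazil ECM. The 5% level and the function name are assumptions made for illustration; the memo does not state which level it uses.

```python
def decide(p_value, alpha=0.05):
    """Standard decision rule: reject H0 when the p-value is below alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


# P-values as reported in the diagnostic table for the Brazil ECM
p_values = {
    "Jarque-Bera": 0.397123,
    "Ljung-Box Q(6)": 0.000,
    "Breusch-Godfrey LM(2)": 0.0002,
    "ARCH-LM(1)": 0.1118,
    "White": 0.4317,
    "Ramsey RESET": 0.6633,
}

for test_name, p in p_values.items():
    print(f"{test_name}: {decide(p)}")
```

Under this rule the two serial correlation tests (Ljung-Box and Breusch-Godfrey) both lead to rejection, while the remaining tests do not.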

(e) Regardless of the results you obtained in question 1(d),

(i) Write the model statement in the space provided. (6)

RES_BRAZIL = LBRA_GDP - (28.3780397114 - 0.0958270447992*LBRA_INFLA -
0.635852503501*LBRA_UNEMP + 0.504946460421*BRA_DUM)

LBRA_GDP = -0.378687080036*D(LBRA_GDP(-1)) -
0.0461748136591*D(LBRA_INFLA) - 0.295188744854*D(LBRA_UNEMP) -
2.78187891141e-05*D(BRA_INT) - 0.0797509566805*RES_BRAZIL(-1) +
0.0159654498461 + LBRA_GDP(-1)

BRA_GDP = EXP(LBRA_GDP)
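The model statement works in three layers: the long-run residual, the ECM written in terms of the level of LBRA_GDP (the estimated change plus last period's level), and the antilog that returns GDP itself. A minimal Python sketch of one simulation step is given below, using the coefficients from the statement. The function names and the input values in the example are hypothetical; EViews solves the model internally, so this is only meant to show the arithmetic.

```python
import math

# Long-run (cointegrating) coefficients from the model statement
LR_CONST, LR_INFLA, LR_UNEMP, LR_DUM = (
    28.3780397114, -0.0958270447992, -0.635852503501, 0.504946460421)


def long_run_residual(lgdp, linfla, lunemp, dum):
    """RES_BRAZIL: actual log GDP minus its long-run fitted value."""
    fitted = LR_CONST + LR_INFLA * linfla + LR_UNEMP * lunemp + LR_DUM * dum
    return lgdp - fitted


def ecm_next_lgdp(lgdp_lag1, lgdp_lag2, d_linfla, d_lunemp, d_int, res_lag1):
    """One ECM step: predicted change in log GDP plus last period's level."""
    d_lgdp = (-0.378687080036 * (lgdp_lag1 - lgdp_lag2)  # D(LBRA_GDP(-1))
              - 0.0461748136591 * d_linfla
              - 0.295188744854 * d_lunemp
              - 2.78187891141e-05 * d_int
              - 0.0797509566805 * res_lag1               # error correction term
              + 0.0159654498461)
    return lgdp_lag1 + d_lgdp


# Hypothetical inputs, for illustration only
lgdp_next = ecm_next_lgdp(27.00, 26.98, d_linfla=0.01, d_lunemp=-0.02,
                          d_int=0.5, res_lag1=0.05)
print(lgdp_next, math.exp(lgdp_next))  # log level and BRA_GDP = EXP(LBRA_GDP)
```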

(ii) Draw the graph of the actual and modelled values. (4)

[Graph: BRA_GDP and BRA_GDP (Baseline), 1994 to 2012]


(f) You present your findings to the “New Development Bank” board. The board members
request you to provide the following two graphs:

(i) A graph showing the impact of a temporary increase of 20% in the inflation rate,
during 2005 (that is 2005Q1 to 2005Q4). (5)

[Graph: BRA_INFLA and BRA_INFLA_TS (temporary shock scenario), 1994 to 2012]

(ii) A graph showing the impact of a permanent worsening of 10% in the unemployment
rate, starting from 2000Q1. (5)

[Graph: BRA_UNEMP and BRA_UNEMP_PS (permanent shock scenario), 1994 to 2012]
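The two scenario series differ only in how long the shock lasts: the temporary shock scales the variable inside a fixed window, while the permanent shock scales it from a start date onwards. The sketch below builds both kinds of series in Python. It assumes that "an increase of 20%" and "a worsening of 10%" mean multiplying the series by 1.20 and 1.10 respectively (rather than adding percentage points); the function name and the flat sample series are hypothetical.

```python
def apply_shock(series, periods, start, pct, end=None):
    """Scale `series` by (1 + pct/100) from `start` to `end` inclusive.

    `periods` holds (year, quarter) tuples; end=None makes the shock permanent.
    """
    shocked = []
    for value, period in zip(series, periods):
        in_window = period >= start and (end is None or period <= end)
        shocked.append(value * (1 + pct / 100) if in_window else value)
    return shocked


# Quarterly dates 1994Q1 to 2012Q4, matching the sample period in the question
periods = [(year, q) for year in range(1994, 2013) for q in range(1, 5)]

# Hypothetical flat series, for illustration only (not the exam data)
bra_infla = [6.0] * len(periods)
bra_unemp = [9.0] * len(periods)

# Temporary shock: +20% to inflation during 2005Q1 to 2005Q4
bra_infla_ts = apply_shock(bra_infla, periods, (2005, 1), 20, end=(2005, 4))
# Permanent shock: +10% to unemployment from 2000Q1 onwards
bra_unemp_ps = apply_shock(bra_unemp, periods, (2000, 1), 10)
```

Feeding the shocked series into the model in place of the originals, and solving again, gives the scenario path that is compared with the baseline.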

QUESTION 2: [20]

a) Briefly explain the main differences between a Linear Probability model, Probit and logit.

Specification
  Probit model: the cumulative distribution function of the standard normal distribution.
  LPM model: underlying probability distribution is the binomial distribution.
  Logit model: underlying probability distribution is the logistic distribution.
Estimation
  Probit model: Maximum Likelihood Estimation technique.
  LPM model: Ordinary Least Squares (OLS).
  Logit model: Maximum Likelihood Estimation technique.

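b) Discuss the use of pseudo R-squared as a measure of goodness of fit in binary regressand
models.

• The conventional R-squared is of limited value when the regressand is binary, so a pseudo
R-squared is used as a measure of goodness of fit for such nonlinear regression models.
Examples are the McFadden R-squared and the Cox and Snell R-squared. They play the role of
the usual R-squared in linear regression, but they are computed from the likelihoods of the
models with and without predictors, and should be read only as rough indicators of fit.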

c) i) Estimate a logistic regression and write down the estimated results. (4)

Dependent Variable: CANCER
Method: ML - Binary Logit (Newton-Raphson / Marquardt steps)
Date: 11/27/19 Time: 09:40
Sample: 1 178
Included observations: 178
Convergence achieved after 6 iterations
Coefficient covariance computed using observed Hessian

Variable    Coefficient   Std. Error   z-Statistic   Prob.
AGE         0.015428      0.020985     0.735213      0.4622
AGPI        0.099333      0.052973     1.875155      0.0608
CHK         -1.569722     0.513854     -3.054804     0.0023
MISC        0.322766      0.220780     1.461938      0.1438
HIGD        -0.058738     0.085383     -0.687939     0.4915
WEIGHT      -0.027915     0.009806     -2.846654     0.0044
C           0.505525      2.224197     0.227284      0.8202

McFadden R-squared      0.185046    Mean dependent var      0.224719
S.D. dependent var      0.418575    S.E. of regression      0.382180
Akaike info criterion   0.947093    Sum squared resid       24.97657
Schwarz criterion       1.072219    Log likelihood          -77.29126
Hannan-Quinn criter.    0.997835    Deviance                154.5825
Restr. deviance         189.6826    Restr. log likelihood   -94.84129
LR statistic            35.10005    Avg. log likelihood     -0.434221
Prob(LR statistic)      0.000004

Obs with Dep=0          138         Total obs               178
Obs with Dep=1          40

ii) Comment on the statistical significance of the estimated parameters. (3)

• Comment: Qualitatively, the results of the logit model are:

o AGE is statistically insignificant (p-value 0.4622).
o HIGD is statistically insignificant (p-value 0.4915).
o CHK is statistically significant at the 1% level (p-value 0.0023).
o AGPI is statistically significant at the 10% level (p-value 0.0608).
o MISC is statistically insignificant (p-value 0.1438).
o WEIGHT is statistically significant at the 1% level (p-value 0.0044).

iii) Comment on the overall significance of the model.

• Individually, 3 of the slope coefficients are statistically significant and 3 are statistically
insignificant. Collectively, however, the model is significant: the LR statistic is 35.10005 with
a p-value of 0.000004, so we reject the null hypothesis that all slope coefficients are jointly
zero. The value of the McFadden R-squared (0.185046) is not very large. Of course in most
empirical research one could not typically hope to find predictors which are strong enough to
give predicted probabilities close to 0 or 1, and so one shouldn't be surprised.
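Both summary measures used here can be reproduced from the two log-likelihoods in the EViews output: the McFadden R-squared is 1 minus the ratio of the fitted to the restricted (intercept-only) log-likelihood, and the LR statistic is twice their difference. A short check in Python, with hypothetical function names, is:

```python
def mcfadden_r2(log_lik, restricted_log_lik):
    """McFadden pseudo R-squared: 1 - (fitted log-likelihood / restricted log-likelihood)."""
    return 1 - log_lik / restricted_log_lik


def lr_statistic(log_lik, restricted_log_lik):
    """Likelihood ratio statistic for the joint significance of all slopes."""
    return 2 * (log_lik - restricted_log_lik)


# Values taken from the logit output above
log_lik = -77.29126
restricted_log_lik = -94.84129

print(round(mcfadden_r2(log_lik, restricted_log_lik), 6))
print(round(lr_statistic(log_lik, restricted_log_lik), 4))
```

The results agree with the reported McFadden R-squared and LR statistic up to rounding of the printed log-likelihoods. Under H0 the LR statistic follows a chi-square distribution with 6 degrees of freedom (one per slope coefficient).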

iv) Based on these results, what recommendations do you make about cancer to the
Minister of Health? (4)

• Encourage more physical examinations. A doctor may feel areas of the body for lumps that may
indicate a tumour, and laboratory tests, such as urine and blood tests, may help the doctor
identify abnormalities that can be caused by cancer in South Africa.
• How these effects interact with breast cancer risk depends on a woman's age. Women who give
birth to their first child at age 35 or younger tend to get a protective benefit from pregnancy.
Breast cancer risk is increased for about 10 years after a first birth.
• The evidence also shows that, in general, the more weight people gain as adults, the higher the
risk of postmenopausal breast cancer. In contrast, the evidence shows that, in general, the more
excess weight people have as young adults, the lower the risk of breast cancer.

QUESTION 3: [15]
a) Log(Wage) =β0 + β1Educ + β2Exper + β3 Gender + β4 Marij + u.

(i) Based on your results, what is the difference in monthly salary between Marijuana
smokers and non-smokers? (4)
Dependent Variable: WAGE
Method: Least Squares
Date: 11/27/19 Time: 12:22
Sample: 1 935
Included observations: 935

Variable    Coefficient   Std. Error   t-Statistic   Prob.
EDUC        76.58566      6.229174     12.29467      0.0000
EXPER       16.37716      3.141524     5.213127      0.0000
GENDER      -16.46777     24.38498     -0.675324     0.4996
MARIJ       185.9404      39.61594     4.693575      0.0000
C           -421.1403     111.5746     -3.774519     0.0002

R-squared            0.156247    Mean dependent var      957.9455
Adjusted R-squared   0.152618    S.D. dependent var      404.3608
S.E. of regression   372.2278    Akaike info criterion   14.68222
Sum squared resid    1.29E+08    Schwarz criterion       14.70811
Log likelihood       -6858.939   Hannan-Quinn criter.    14.69209
F-statistic          43.05452    Durbin-Watson stat      1.808733
Prob(F-statistic)    0.000000

Model (as estimated above, with WAGE in levels as the dependent variable):

WAGE = -421.1403 + 76.58566Educ + 16.37716Exper - 16.46777Gender + 185.9404Marij.

o Because the dependent variable in the output is WAGE and not Log(Wage), the
coefficient on Marij is measured in wage units: for the same levels of educ, exper
and gender, smokers earn about 185.94 more per month than non-smokers. The
difference is statistically significant (p-value 0.0000).
o If the model is estimated with Log(Wage) as the dependent variable, as specified
in the question, the percentage difference is 100[exp(β4) - 1]%.

(ii) Do your results change when you consider the squared values of experience
(Exper²)? How would you justify such a specification? (3)
o The results will change. One simple way to capture diminishing returns is to add
a quadratic term to a linear relationship. With a positive coefficient on Exper and a
negative coefficient on Exper², each additional year of experience increases the
wage by less than the previous year, reflecting a diminishing marginal return to
experience. Beyond the turning point the quadratic implies that an additional year
of experience lowers the wage. This is not very realistic, but it is one of the
consequences of using a quadratic function to capture a diminishing marginal
effect: at some point, the function must reach a maximum and curve downward.
For practical purposes, the point at which this happens is often large enough to
be inconsequential, but not always.
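The point at which the quadratic reaches its maximum follows from setting the marginal effect b1 + 2·b2·Exper equal to zero, which gives Exper* = −b1 / (2·b2). The sketch below computes the marginal effect and the turning point. The coefficient values are hypothetical, chosen only to illustrate the calculation; they are not estimates from the exam data.

```python
def marginal_effect(b_exper, b_exper2, exper):
    """Effect of one more year of experience at a given level of experience."""
    return b_exper + 2 * b_exper2 * exper


def turning_point(b_exper, b_exper2):
    """Experience level at which the quadratic reaches its maximum."""
    return -b_exper / (2 * b_exper2)


# Hypothetical coefficients, for illustration only
b_exper, b_exper2 = 0.298, -0.0061

print(marginal_effect(b_exper, b_exper2, 1))    # return to the 2nd year
print(marginal_effect(b_exper, b_exper2, 10))   # smaller return to the 11th year
print(turning_point(b_exper, b_exper2))         # years of experience at the maximum
```

If the turning point lies beyond the range of experience observed in the sample, the downward-sloping part of the curve is of no practical concern.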

b) Specify a model that would allow you to test whether drug usage has different effects
on earnings for men and women.

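o Add an interaction term between women and usage:

Log(Wage) = β0 + β1Usage + β2Educ + β3Exper + β4Exper² + β5Women + β6Women·Usage + u.

The effect of drug usage on earnings is β1 for men and β1 + β6 for women, so the
null hypothesis that the effect is the same for men and women is H0: β6 = 0.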

c) Hypothesising that marijuana usage varies across individuals, you decide to categorise
people as follows: (i) Non-user; (ii) Light user (1 to 5 times per month); (iii) Moderate
user (6 to 10 times per month); and (iv) Heavy user (more than 10 times per month).
How does your model specification change?

o Take the base group to be non-users. Dummy variables are then needed for
the other three groups: lghtuser, moduser, and hvyuser. Assuming no interactive
effect with gender, the model would be:

Log(wage) = β0 + δ1 lghtuser + δ2 moduser + δ3 hvyuser + β2Educ + β3Exper + β4Exper²
+ β5Women + u

d) Explain in detail what a “dummy variable trap” is. How can you overcome this challenge?

• The dummy variable trap arises when a set of dummy variables is perfectly collinear,
so that OLS cannot identify the parameters of the model. That happens mainly if you
include a dummy for every category of a variable, e.g. you have 3 dummies for
education: "no degree", "high school", and "college". If you include all three dummies
in the regression together with an intercept (a vector of ones), the dummies sum to
one for every observation, so this set of dummies is linearly dependent with the
intercept and OLS cannot be solved.
• The solution to the dummy variable trap is to drop one of the dummy variables (or
alternatively, drop the intercept): if there are m categories, use m - 1 dummies in the
model. The category left out can be thought of as the reference category, and the
coefficients of the remaining dummies represent the difference from it.
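The m − 1 rule can be illustrated with the four marijuana usage categories from part (c): non-user is the reference group, so only three dummies are created. The sketch below is a minimal illustration; the function name is hypothetical and the cut-offs are those given in the question.

```python
def usage_dummies(times_per_month):
    """Map monthly usage to the three dummies; non-user is the base group."""
    return {
        "lghtuser": 1 if 1 <= times_per_month <= 5 else 0,
        "moduser": 1 if 6 <= times_per_month <= 10 else 0,
        "hvyuser": 1 if times_per_month > 10 else 0,
    }


for usage in (0, 3, 8, 15):
    dummies = usage_dummies(usage)
    # At most one dummy equals 1; all three are 0 for the base group, so the
    # dummies never sum to the intercept column of ones for every observation.
    print(usage, dummies, sum(dummies.values()))
```

Adding a fourth dummy for non-users would make the four dummies sum to one for every observation, reproducing the intercept column and hence the trap.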

QUESTION 4: [10]

You are employed as the Chief Economist in the Department of Energy where there is
increasing need to understand the energy demand and supply factors. After your extensive
literature search, you decide to follow the work of Halvorsen (1975) and specify the following
model:

Residential Demand for electricity

log Q = α1 + α2 log P + α3 log Y + α4 log G + α5 log D + α6 log J + α7 log R + α8 log H + ε

Electricity supply

log P = β1 + β2 log Q + β3 log L + β4 log IPP + β5 log F + β6 log R + β7 log I + β8 log T + u

Where:
Q = Residential electricity sales
P = Price of residential electricity
Y = Annual income
G = Price for residential gas
D = Heating Degree Days (A measure of how cold the temperature was on a given
day or over a period of days)
J = Average June temperature
R = Percentage of population in rural areas
H = Household size (number of people in the household)
T = Time trend variable
L = Labour cost
IPP = Percentage generated by Independent Power Producers
F = Fuel cost per Kilowatt-hour generation
I = Ratio of industrial sales to residential sales

(a) Which of the variables are endogenous and which are exogenous? (4)
 Endogenous variables
log Q and log P.
 Exogenous variables.
logY, logL, logG, logD, logJ, logR, logH, logIPP, logF, logI and logT

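(b) Comment on the identification of the two equations. (2)

• An equation is identified if numerical estimates of the parameters of the structural
equation can be obtained from the estimated reduced form coefficients. If this cannot be
done, the equation is under-identified. An identified equation is exactly identified if
unique values of the structural parameters are obtained, and over-identified if more
than one value can be obtained for some of the parameters.
• Order condition: each equation contains two endogenous variables (log Q and log P),
so it must exclude at least one exogenous variable of the system. The demand equation
excludes five (L, IPP, F, I and T) and the supply equation excludes five (Y, G, D, J
and H), so both equations are over-identified.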
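The order condition can be checked mechanically by counting, for each equation, the exogenous variables of the system that it excludes and comparing that count with the number of endogenous variables in the equation minus one. The sketch below does this for the two electricity equations, using the variable lists given in the question. The function name is hypothetical, and the order condition is only a necessary condition for identification, not a sufficient one.

```python
def order_condition(excluded_exogenous, endogenous_in_equation):
    """Classify an equation by the order condition: K - k versus m - 1."""
    required = endogenous_in_equation - 1
    if excluded_exogenous < required:
        return "under-identified"
    if excluded_exogenous == required:
        return "exactly identified"
    return "over-identified"


all_exogenous = {"Y", "G", "D", "J", "R", "H", "L", "IPP", "F", "I", "T"}
demand_exogenous = {"Y", "G", "D", "J", "R", "H"}    # appear in the demand equation
supply_exogenous = {"L", "IPP", "F", "R", "I", "T"}  # appear in the supply equation

for name, included in (("Demand", demand_exogenous), ("Supply", supply_exogenous)):
    excluded = all_exogenous - included
    # Each equation contains two endogenous variables: log Q and log P
    print(name, sorted(excluded), order_condition(len(excluded), 2))
```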

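(c) Describe in detail how you would estimate this model. (4)

Estimation of structural equations using the Indirect Least Squares (ILS) method.

The ILS method involves three steps:

o Obtain the reduced form equations.
o Apply Ordinary Least Squares (OLS) to the reduced form equations individually.
o Obtain estimates of the original structural coefficients from the estimated
reduced form coefficients.

ILS works only for exactly identified equations. In cases where equations are
over-identified, ILS cannot be used and the more popular Two-Stage Least Squares
(2SLS) method is used. In the first stage, each endogenous regressor (log P in the
demand equation, log Q in the supply equation) is regressed on all the exogenous
variables in the system. In the second stage, the structural equation is estimated by
OLS with the endogenous regressor replaced by its fitted values from the first stage.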

End

JANUARY/FEBRUARY 2016.

MEMO.

QUESTION 1: [65]

For Question 1 you need to estimate a demand function for skilled labour in South Africa. In theory
the production function may be used to derive the demand for labour within a framework of profit
maximization. The theoretical specification of the demand for skilled labour (NS) may be specified as:

NS = f (GDP, LC, LP)

Where:

NS = demand for skilled labour


GDP = real output (real GDP at market prices)
LC = nominal unit labour cost
LP = labour productivity

Table 1: Detailed description of the data

Variable                                         Description
NS = demand for skilled labour                   Labour: Employment in the non-agricultural sectors: Grand total (Seasonally adjusted, 2010=100 (Period))
GDP = real output (real GDP at market prices)    Gross domestic product at market prices (GDP)
LC = nominal unit labour cost                    Labour: Labour costs in the non-agricultural sectors: Nominal unit labour costs (Seasonally adjusted, 2010=100 (Period))
LP = labour productivity                         Labour: Labour costs in the non-agricultural sectors: Labour productivity (Seasonally adjusted, 2010=100 (Period))

Annual data for the above variables is available on sheet “ECS4863 JanFeb16 Question 1” of the MS
EXCEL file “ECS4863_Jan Feb 2016 exam_data.xls” for the period 1970-2014 (i.e. 45 years).

(Hint: create a workfile in EViews using the regular frequency/annual options and for the relevant years.
Then copy the data from EXCEL into EViews.)

(a) Use the ADF test to test all four variables for unit roots. Provide your answers in the table below
(Hint: please remember to log variables before performing the tests): (16)

Variable   Model                  Lags   ADF test statistic
LNS        Trend and Intercept    1      -2.080231
           Intercept              1      -0.883143
           None                   1      1.341926
LGDP       Trend and Intercept    1      1.777274
           Intercept              1      -0.0251063
           None                   1      3.384360
LLC        Trend and Intercept    0      -0.618018
           Intercept              0      -2.181973
           None                   1      1.469061
LLP        Trend and Intercept    3      2.665692
           Intercept              3      -1.673093
           None                   3      0.526113
DLNS       Trend and Intercept    0      -3.457736*
           Intercept              0      -3.487169**
           None                   0      -3.160672***
DLGDP      Trend and Intercept    0      -4.627472***
           Intercept              0      -4.671391***
           None                   0      -2.883014***
DLLC       Trend and Intercept    0      4.817874***
           Intercept              0      -4.396242***
           None                   0      -1.554889
DLLP       Trend and Intercept    2      -2.663889
           Intercept              2      -2.693153*
           None                   2      -2.655411***
Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)

(b) Test for cointegration between variables:

(i) Estimate the following long-run cointegration equation and use your results to
complete the table. (Remember to include an intercept term.) (4)

LNS = f(LGDP, LLC)

Dependent variable: LNSt

Variable   Coefficient
LGDPt      1.298990
LLCt       -0.129727

(ii) Interpret the coefficients of the long-run equation. Do the coefficients correspond
to your a priori expectations in terms of their size and sign? (8)

• To evaluate our (potential) cointegration equation, it is evident that LGDP is positively related
to LNS (i.e. it has a positive sign), which is correct according to our a priori (economic theory)
expectations. Because both variables are in logs, the coefficient is an elasticity: if GDP
increases by 1% we expect NS to increase by about 1.30%. The magnitude is, however, larger
than expected, since it is not between 0 and 1.
• It is also evident that LLC is negatively related to LNS (i.e. it has a negative sign), which is
correct according to our a priori (economic theory) expectations. If LC increases by 1% we
expect NS to decrease by about 0.13%. The magnitude seems plausible, since it is between 0
and 1 in absolute value.

(iii) Generate the residual series with the command GENR: RESNS = RESID. Perform a unit
root test on RESNS and report your answers in the table below (2)

Variable Model Lags ADF test statistic

RESNS None 1 -2.811186***

Statistically significant at the: 10% level (*), 5% level (**), 1% level (***)

(iv) Can we conclude that the variables in the long run equation are indeed
cointegrated? Explain. (2)
• Yes. The ADF test statistic on RESNS (-2.811186) is statistically significant at the 1% level
(three stars), which means that we can reject the null hypothesis of a unit root in the residuals
(i.e. of no cointegration). The residuals are stationary, so the variables are cointegrated.
(c) Build an Error Correction Model (ECM) for the demand for skilled labour.

(i) Complete the following table: (8)

Dependent variable: DLNSt

Variable     Coefficient   Std. error   t-statistic
DLGDPt       0.892320      0.053556     16.66151
DLLPt        -0.970114     0.032150     -30.17447
RESNSt-1     -0.036667     0.015163     -2.418204
CONSTANT     0.002268      0.001746     1.298846

Sample period (adjusted): 1971-2014
R²: 0.971835
Adjusted R²: 0.969722
S.E. of regression: 0.007521

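(ii) Can we interpret the signs and size of the coefficients in the ECM? (2)
• Yes. The coefficient of the lagged residual (RESNS(-1)) is -0.036667. It is negative, lies
between -1 and 0, and is statistically significant (t-statistic -2.418204), as required of an
error correction term. It implies that about 3.7% of the previous year's deviation from the
long-run equilibrium is corrected each year.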
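The size of the error correction coefficient can be made more tangible by converting it into a half-life: the number of periods needed for half of a deviation from the long-run equilibrium to disappear, other things equal. With adjustment coefficient a (between −1 and 0), a fraction 1 + a of the gap remains after each period, so the half-life is ln(0.5) / ln(1 + a). The sketch below applies this to the two ECMs in this memo. The half-life is an illustrative addition, not something the exam asks for, and the function name is hypothetical.

```python
import math


def half_life(adjustment_coefficient):
    """Periods needed to close half of a deviation from long-run equilibrium.

    Assumes -1 < adjustment_coefficient < 0 and ignores short-run dynamics.
    """
    return math.log(0.5) / math.log(1 + adjustment_coefficient)


print(half_life(-0.036667))  # skilled labour ECM, annual data: result in years
print(half_life(-0.079751))  # Brazil GDP ECM, quarterly data: result in quarters
```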
(d) Perform diagnostic checks on the ECM.

(i) Complete the following table: (12)

Test                 Null hypothesis                      Test statistic             P-value    Conclusion
Jarque-Bera          H0: Normally distributed residuals   JB = 27.50960              0.000001   Reject H0. Residuals are not normally distributed.
Ljung-Box Q          H0: No serial correlation            LBQ(6) = 11.954            0.063      Fail to reject H0. There is no serial correlation up to the 6th lag.
Breusch-Godfrey LM   H0: No serial correlation            nR²(2) = 15.44363          0.0004     Reject H0. There is autocorrelation up to order 2.
ARCH-LM              H0: No heteroscedasticity            nR²(2) = 9.327868          0.0094     Reject H0. There is heteroscedasticity up to order 2.
White                H0: No heteroscedasticity            nR² (no CT) = 36.27086     0.0000     Reject H0. There is heteroscedasticity.
Ramsey RESET         H0: No misspecification              LR(2) = 0.086129           0.7692     Fail to reject H0. No misspecification.

(ii) Given your conclusions on the diagnostic check of the ECM, do you think that this is an
acceptable model? (Please provide reasons, no marks will be earned for
only stating yes/no.) (3)
 NO:
Reasons:
o Residuals are not normally distributed
o There is autocorrelation up to the order 2.
o There is heteroscedasticity up to the order 2.
(e) Regardless of the results you obtained in question 1(d), suppose you still decide to create a model
statement in EViews to combine your long run and ECM.

(i) Provide the missing values/variables in the model statement (please write your
answer next to the correct option in the space provided below the statement): (5)

RESNS = (a)--------- - ((b)---------- * LGDP - 0.129726760054 * LLC - 14.0665958153)

(c)--------- = 0.892320312733 * D(LGDP) - (d)--------- * D(LLP) - 0.0366672785059 *
RESNS(-1) + 0.00226826179642 + LNS(-1)

NS = EXP((e)--------- )

(a) LNS (b) 1.2989899682 (c) LNS (d) 0.970114223164 (e) LNS

(iii) Graph the actual and estimated values for the demand for labour. Comment on the fit
you observe. (Hint: copy/paste your graph of NS and NS_0.) (3)

[Graph: NS and NS (Baseline), 1970 to 2010]

Comment: A very good fit is observed. However, in practice this is not always the case. Keep in
mind what the purpose of the model (e.g. academic publication, scenario analysis, forecasting,
etc.) is and also how good/reliable the data is that you are working with. But if you went through
all the checks/steps of the model process and you got good results up to here, the model results
will usually be acceptable.

QUESTION 2: [20]

(a) Consider the following statement: “(the) use of coefficients of determination as a
summary statistic should be avoided in models with qualitative dependent variables”
(Aldrich and Nelson, Qualitative Response Model, Journal of Economic Literature,
1981, vol. 19:331-354).

(i) List and briefly explain two possible problems when using OLS to estimate linear
probability models (LPMs).

• Non-normality of the error term: The assumption that the error is normally distributed is
critical for performing hypothesis tests after estimating your econometric model. In an
LPM the dependent variable takes only the values 0 and 1, so for given values of the
regressors the error term can take only two values and cannot be normally distributed.
• Heteroscedasticity: The classical linear regression model (CLRM) assumes that the error
term is homoscedastic. The assumption of homoscedasticity is required to prove that the
OLS estimators are efficient (or best), which is an important component of the
Gauss-Markov theorem. In an LPM the variance of the error term depends on the values
of the regressors, so the errors are heteroscedastic, the Gauss-Markov theorem no
longer applies and the OLS estimators are not efficient.

ii) What other measures of goodness of fit, apart from R², are available in binary regressand
models? List any two and briefly explain how they work. (2)

• The McFadden (pseudo) R-squared: a measure of goodness of fit for some common
nonlinear regression models, calculated as one minus the ratio of the log-likelihood of
the fitted model to the log-likelihood of the model with only an intercept.
• The Cox and Snell R-squared: it plays the role of the usual R-squared for linear
regression, but it depends on the likelihoods for the models with and without predictors.

(b) For Question 2 (b) you need to estimate a logistic regression (logit) function to explain the
determinants of being accepted into an honours module at university, using hypothetical
data obtained from 200 students

The variables are as follows:

HON = if a student is accepted into an honours module or not, where 1 = yes and 0 = no.
READ = reading mark obtained (out of 100)
MATH = mathematics mark obtained (out of 100)

The variables are provided in the sheet named: ECS4863 JanFeb16 Question 2.
(i) Hypothesise the expected relationship with the two explanatory variables, i.e.
READ and MATH.
H0: B1 = B2 =0
 For every one-unit increase in reading mark obtained (so, for every additional point on the
reading test), we expect an increase in the log-odds of HON
 For every one-unit increase in MATH, we expect an increase in the log-odds of HON.

(ii) Import the data into EViews and estimate the following function (Hint: remember
to change your estimation setting to binary/logit): (2)

Logit (HON) = -10.83161 + 0.056716READ + 0.118779MATH

(iii) Interpret your estimated coefficients. (Hint: remember to do the relevant
adjustments to the coefficients.)
• READ: The coefficient of 0.056716 attached to READ is to be interpreted as follows: Take its
antilog, subtract one from it and multiply the result by 100. Thus, antilog (0.056716) = 1.05836;
subtracting one from this and multiplying the difference by 100 gives 5.84%. This means that if
the reading mark increases by one unit, the odds in favour of HON = 1 (a student being
accepted into an honours module) go up by 5.84%.

• MATH: The coefficient of 0.118779 attached to MATH is to be interpreted as follows: Take its
antilog, subtract one from it and multiply the result by 100. Thus, antilog (0.118779) = 1.12612;
subtracting one from this and multiplying the difference by 100 gives 12.61%. This means that
if MATH (mathematics mark) increases by one unit, the odds in favour of HON = 1 (a student
being accepted into an honours module) go up by 12.61%.
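The adjustment described above (antilog, minus one, times 100) is easy to verify numerically, and the same estimated equation can be used to turn a pair of marks into a predicted probability through the logistic function. The sketch below uses the estimated coefficients from part (ii); the function names and the example marks are hypothetical.

```python
import math


def odds_change_percent(coefficient):
    """Percentage change in the odds for a one-unit increase in the regressor."""
    return (math.exp(coefficient) - 1) * 100


def honours_probability(read_mark, math_mark):
    """Predicted probability of HON = 1 from the estimated logit equation."""
    z = -10.83161 + 0.056716 * read_mark + 0.118779 * math_mark
    return 1 / (1 + math.exp(-z))


print(round(odds_change_percent(0.056716), 2))  # READ
print(round(odds_change_percent(0.118779), 2))  # MATH
print(honours_probability(60, 60))              # hypothetical student with 60 for both
```

Note that the percentage change in the odds is constant for each additional mark, whereas the change in the probability depends on the starting level of both marks.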

(iv) Which of the coefficients are statistically significant? Explain. (2)

MATH is statistically significant at the 1% level since the p-value is 0.0001.
READ is statistically significant at the 5% level since the p-value is 0.0255.
(v) What is the effect of a percentage point increase in the mathematics mark obtained on the
odds of being accepted into an honours class? (2)

• The odds of being accepted will go up by 12.61% (the odds ratio is about 1.126).


QUESTION 3: [7]

(a) Explain simultaneous-equation bias and why, or in which instances, OLS may not be
applied in a system of simultaneous equations.
Given the following equations representing a simultaneous equations model:

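J1 = α1 J2 + β1 Z1 + u1

J2 = α2 J1 + β2 Z2 + u2

• Whenever an explanatory variable is also an endogenous variable, the ordinary least squares
(OLS) estimator of its coefficient is biased. This arises when one or more of the explanatory
variables is jointly determined with the dependent variable.
• In the first equation, a shock to u1 changes J1, and through the second equation J1 in turn
changes J2. J2 is therefore correlated with u1, which violates the OLS assumption that the
regressors are uncorrelated with the error term. OLS applied to the first equation suffers from
simultaneous-equation bias because of this simultaneity, and the same holds for J1 and u2 in
the second equation.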
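The bias can be seen in a small simulation of the two-equation system. The sketch below generates data from the reduced form of the model and then applies OLS to the first structural equation. Every numerical choice is an assumption made for illustration: the parameter values (α1 = α2 = 0.5, β1 = β2 = 1), standard normal Z's and errors, the sample size and the seed. The regression is run without an intercept because all variables have mean zero.

```python
import random


def ols_alpha1(n=20000, alpha1=0.5, beta1=1.0, alpha2=0.5, beta2=1.0, seed=0):
    """OLS estimate of alpha1 from regressing J1 on J2 and Z1 (no intercept)."""
    rng = random.Random(seed)
    det = 1 - alpha1 * alpha2
    s22 = s2z = szz = s2y = szy = 0.0
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Reduced form: solve the two structural equations for J1 and J2
        j1 = (beta1 * z1 + alpha1 * beta2 * z2 + u1 + alpha1 * u2) / det
        j2 = (alpha2 * beta1 * z1 + beta2 * z2 + alpha2 * u1 + u2) / det
        s22 += j2 * j2
        s2z += j2 * z1
        szz += z1 * z1
        s2y += j2 * j1
        szy += z1 * j1
    # Solve the 2x2 normal equations for the coefficient on J2 (Cramer's rule)
    return (s2y * szz - s2z * szy) / (s22 * szz - s2z ** 2)


print("true alpha1: 0.5, OLS estimate:", ols_alpha1())
```

With these settings the OLS estimate settles near 2/3 rather than the true value of 0.5, and increasing the sample size does not remove the gap, because J2 contains u1.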

QUESTION 4: [8]

You have recently been employed as economic advisor to the Department of Agriculture. As
part of your primary duties, your line manager asks you to build an econometric model to
explain the value added by the agriculture, forestry and fishing sector to the overall GDP of
South Africa.

(a) Before you start working on the model, you are requested to provide a brief overview of the
methodology that you plan to apply. Your overview should include items such as the various
steps to be followed, a discussion of potential explanatory variables (including a priori
expectations) and the estimation technique(s) to be employed. (Hint: please do not provide any
regression or other output.) Your answer should not exceed 2 pages.
Discuss:
• Statement of Economic Theory or Hypothesis.
• Specification of the Mathematical Model (single-equation model of the value added by the
sector).
• Specification of the Econometric Model.
• Obtaining Data.
• Estimation of the Econometric Model.
• Hypothesis Testing.
• Forecasting or Prediction.
• Use of the Model for Control or Policy Purposes.

End.
