Econometrics - Exercise Set 5 (Solution)
Exercise 1
Remember exercise 2 in exercise set 4. Load the same dataset (E42.RData) and estimate the following
model again.
1.1
Would quantile regression be appropriate to use with this dataset? If yes, why? If no, why not?
Yes, since the effects from the different regressors might vary depending on the different levels of GDP.
E.g., the role of education may be more/less important at different stages of economic development.
A simple QQ-plot of the least squares regression model indicates that the residuals are not normally
distributed, and we have a few outliers for which the least squares estimate might not be representative.
Least-squares method
Quantile regression
Quantile regression estimates conditional quantiles of the dependent variable; the conditional median is
the special case 𝜏 = 0.5. Minimizing the expression below for a given quantile (𝜏) yields the best-fit line
for the 𝜏th conditional quantile.
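In standard textbook notation, least squares minimizes the sum of squared residuals, while quantile regression minimizes the sum of check-function losses rho_tau(u) = u * (tau - 1{u < 0}) applied to the residuals. The sketch below illustrates this for the simplest case of an intercept-only model, where the minimizer is a sample quantile. It is written in plain Python rather than the R used in the course, the data are hypothetical, and the function names are chosen here for illustration only.

```python
def check_loss(u, tau):
    """Check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(y, tau):
    """Return the observation q that minimizes sum_i rho_tau(y_i - q)."""
    # A minimizer is always attained at one of the observations,
    # so searching over the sample values is sufficient.
    return min(sorted(y), key=lambda q: sum(check_loss(yi - q, tau) for yi in y))


# Hypothetical toy data with one large outlier
y = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(y, 0.5)   # tau = 0.5 gives the median (LAD)
upper_fit = best_constant(y, 0.75)   # tau = 0.75 gives an upper quantile
mean_fit = sum(y) / len(y)           # least-squares fit of a constant

print(median_fit, upper_fit, mean_fit)
```

The outlier pulls the least-squares fit (the mean) far away from most of the data, while the median fit is unaffected, which is the robustness argument made in 1.1.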
1.2
[Figure: marginal effect of each regressor across quantiles. Y-axes: effect on GDP; X-axes: quantiles.]
The graphs above plot the marginal effects for all deciles. The effects of education and capital are
roughly constant across quantiles, while those of labour and energy are decreasing and increasing, respectively.
1.3
Labour: The marginal effect decreases substantially when focusing on larger values of GDP. That
is, an increase in labour has a smaller marginal effect when GDP is large. This could indicate the
importance of labour for developing countries. Notice that labour is barely significant in the third
specification.
Capital: Capital only appears to have a (slightly) significant effect for countries with low GDP. For
countries with low GDP, an increase in the capital stock is expected to decrease GDP.
Exercise 2
The dataset E52.RData contains information on net financial wealth (nettfa), age of survey respondent
(age), annual family income (inc), family size (fsize) and information on participation in certain pension
plans for people in the United States. The wealth and income variables are both recorded in thousands
of euros.
For this problem, use only data for married people without children living at home (marr = 1, fsize =
2) and consider the following model,
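Judging from the coefficients interpreted in 2.1 (income, age and the intercept) and the hypothesis on 𝛽1 tested in 2.3, the model is presumably of the following form; the exact notation is an assumption:

```latex
nettfa = \beta_0 + \beta_1 \, inc + \beta_2 \, age + u
```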
2.1
Estimate the model above by OLS, and comment on the results. Do not forget to comment on the
intercept.
Income: A one-unit increase in income is expected to increase net financial wealth by 1.399. The
Age: A one-year increase in age is expected to increase net financial wealth by 1.840. The
Intercept: This is the predicted net financial wealth for an individual for whom age = 0 and income
= 0. In this setting, the intercept brings little relevant information as we are not interested in predicting
2.2
Reformulate the model such that the intercept fits the data. Please comment on your findings.
Note: There are multiple ways of reformulating the model. This can be done by demeaning the
variables, applying a log transformation, or including additional regressors.
Demeaned model
In this case, the intercept can be interpreted as the average net financial wealth when all the regressors
are set to their means. From the output, we see that when income and age are evaluated at their
The interpretation of the coefficients does not change.
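The two properties of the demeaned model (unchanged slope, intercept equal to the mean of the dependent variable) can be checked numerically. The sketch below uses a single regressor and hypothetical toy numbers, not the E52.RData sample, and plain Python rather than R; the helper `ols_simple` is defined here for illustration only.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data (not taken from E52.RData)
inc = [10.0, 20.0, 30.0, 40.0, 50.0]
nettfa = [2.0, 9.0, 31.0, 38.0, 70.0]

# Original regressor: the intercept is the prediction at inc = 0
a_raw, b_raw = ols_simple(inc, nettfa)

# Demeaned regressor: the intercept is the prediction at the mean of inc
inc_dm = [v - mean(inc) for v in inc]
a_dm, b_dm = ols_simple(inc_dm, nettfa)

print(a_raw, b_raw)
print(a_dm, b_dm)
```

With the raw regressor the intercept is an extrapolation to zero income, which can be far outside the data; after demeaning it is simply the sample mean of the dependent variable.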
Additional regressors
Added male, pira, p401k, incsq, and agesq. Thus, I will be estimating:
IRA: Significant at the 1% level. Having an “individual retirement account” increases net wealth
by 35.832.
Income (sq.): Significant at the 1% level. Interestingly, for low values of income the effect is negative,
Log-transformed model
Income: A one-unit increase in income is expected to increase net financial wealth by 3.1%. This
Age: A one-year increase in age is expected to increase net financial wealth by 4.4%.
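The percentage interpretation above reads 100 times the coefficient as a percentage change, which is an approximation that works well for small coefficients. As a rough check, the sketch below compares it with the exact change implied by a log-level model, 100 * (exp(b) - 1). The coefficient values 0.031 and 0.044 are assumed here from the rounded percentages in the text, and the function name is hypothetical.

```python
import math


def exact_pct_change(b):
    """Exact percentage change in y for a one-unit increase in x when log(y) = ... + b * x."""
    return 100.0 * (math.exp(b) - 1.0)


# Coefficients assumed from the rounded 3.1% and 4.4% reported in the text
b_inc, b_age = 0.031, 0.044

pct_inc = exact_pct_change(b_inc)
pct_age = exact_pct_change(b_age)

print(round(pct_inc, 2), round(pct_age, 2))
```

For coefficients of this size the exact and approximate figures differ only in the second decimal, so the interpretation in the text is unaffected.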
2.3
Given the extremely low p-values, we can reject 𝐻0 : 𝛽1 = 1. Hence, it is plausible that 𝛽1 = 1.399 in
Note that the test is not exactly applicable to the 3rd and 4th model as we include squared covariates
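For reference, a test of 𝐻0 : 𝛽1 = 1 is usually based on the t-statistic below. This is the standard textbook form, with n observations and k regressors as assumed notation; the standard errors themselves are not reproduced here.

```latex
t = \frac{\hat{\beta}_1 - 1}{\operatorname{se}(\hat{\beta}_1)},
\qquad \text{reject } H_0 : \beta_1 = 1 \text{ at level } \alpha
\text{ if } |t| > t_{n-k-1,\, 1-\alpha/2}
```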
2.4
Remove age and agesq from the models and comment on your findings.
Every regressor turns out to be highly significant. For the original and demeaned model, the coefficient
The magnitude of the income, male, IRA, and income squared coefficients increases in Model 3 (with
The effect of a one-unit increase in income decreases from 3.1% to 2.9%. This might be caused by the
2.5
Estimate the model(s) without age and agesq by Quantile Regression for 𝝉=
Model 1 and 2
Interestingly, from our original and demeaned models, it seems that the effect of income becomes stronger
at higher quantiles of net financial wealth. Of the quantile estimates, the one at the 75% quantile is closest to the OLS estimate.
Model 3
In model 3, where additional regressors are included, the effect from income is negative and increasing
The effect of being male is relatively constant in all specifications, with the exception of at 𝜏 = 0.95
Having an “individual retirement account” seems to have a larger effect for individuals
with higher net financial wealth. The same can be said for having an “employer-sponsored defined-
The positive coefficient on inc² suggests that the effect from income follows a U-shape for all quantiles.
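To make the U-shape explicit, consider the marginal effect of income on the 𝜏th conditional quantile in a model containing both income and its square. The coefficient names below are assumed notation, not taken from the estimation output.

```latex
\frac{\partial Q_\tau(nettfa \mid x)}{\partial\, inc}
  = \beta_{inc}(\tau) + 2\, \beta_{incsq}(\tau)\, inc,
\qquad
inc^{*}(\tau) = -\frac{\beta_{inc}(\tau)}{2\, \beta_{incsq}(\tau)}
```

With a negative coefficient on income and a positive coefficient on income squared, the marginal effect is negative below the turning point inc* and positive above it.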
Model 4
In the log-transformed model, the effect of income is roughly constant across the 𝜏 = 5% and 𝜏 = 25%
quantiles (approx. 3.4%). From then on, the effect decreases as net financial wealth increases.
2.6
How are the Standard Errors estimated in the Quantile Regression? And under which assumptions
Hint: Look in “summary.rq” in the help section in R to find the different estimation methods for the
standard errors.
With the bootstrap option (se = "boot" in summary.rq), the standard errors are computed as:
$$SE(\hat{\theta}) = \sqrt{\frac{\sum_{j=1}^{J} (\hat{\theta} - \hat{\theta}_j)^2}{J - 1}}$$
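This formula is used because analytical standard errors are cumbersome to obtain, so we draw J random
samples with replacement and re-estimate the model on each; $\hat{\theta}_j$ is the estimate from the jth resample.
$\hat{\theta}$ is the estimate based on the original sample (computed separately for each quantile).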
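The resampling procedure can be sketched as follows. For brevity the estimator here is a sample median rather than a full quantile regression, the data are simulated, and the code is plain Python rather than the R routine used in the course; the function name and its parameters are hypothetical.

```python
import math
import random
import statistics


def bootstrap_se(sample, estimator, n_draws=500, seed=0):
    """Bootstrap SE: sqrt(sum_j (theta_hat - theta_j)^2 / (J - 1))."""
    rng = random.Random(seed)
    theta_hat = estimator(sample)  # estimate on the original sample
    n = len(sample)
    sq_dev = 0.0
    for _ in range(n_draws):
        # Draw n observations with replacement and re-estimate
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        sq_dev += (theta_hat - estimator(resample)) ** 2
    return math.sqrt(sq_dev / (n_draws - 1))


# Simulated data for illustration only
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

se_median = bootstrap_se(data, statistics.median)
print(se_median)
```

In a regression setting the same idea applies with whole observations (dependent variable and regressors together) being resampled and the model re-estimated on each draw.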
Exercise 3
Perform an analysis of factors affecting birthweight similar to the one in Koenker and Hallock (2001),
Quantile Regression, but using a dataset from the MASS package in R. Be sure to install the
MASS package and load the dataset using the following code: library(MASS); data = birthwt
3.1
The dataset is substantially smaller than that of Koenker and Hallock (2001), p. 6. Is that an issue?
Yes, since it makes computing standard errors and coefficients more difficult because fewer observations
3.2
Estimate the relationship between birthweight and other relevant factors by OLS and Quantile
Regression for 𝜏 = [0.05, 0.25, 0.5, 0.75, 0.95]. Argue for each and every variable, and whether
Mother’s weight: Mother’s weight seems to positively impact the birth weight of a child. The
effect seems to be largest around the middle quantiles, and is only statistically significant at 𝜏 = 0.75.
Race:
• Black: Increasing negative effects, with a sudden drop at the 95% quantile.
Smoker: Negative effect, which seems to be most prominent at 𝜏 = 0.5 (the LAD regression).
Hypertension: Largest negative effect at 𝜏 = 0.25, which is the only significant effect (at
Uterine irritability: Negative effect which is significant across all quantiles except the 95%
quantile.
We experience insignificant coefficients for the tails of the distribution of children’s birthweight with
3.3
How are the standard errors calculated, and under which assumptions are they valid?
$$SE(\hat{\theta}) = \sqrt{\frac{\sum_{j=1}^{J} (\hat{\theta} - \hat{\theta}_j)^2}{J - 1}}$$
The method remains valid even if the errors are not normally distributed and there is heteroskedasticity.