Econometrics - Exercise Set 5 (Solution)

Econometrics - Exercise set 5

Exercise 1
Remember exercise 2 in exercise set 4. Load the same dataset (E42.RData) and estimate the following
model again.

𝑔𝑑𝑝𝑖 = 𝛽0 + 𝛽1 𝑙𝑎𝑏𝑜𝑢𝑟𝑖 + 𝛽2 𝑒𝑑𝑢𝑖 + 𝛽3 𝑒𝑛𝑒𝑟𝑔𝑦𝑖 + 𝛽4 𝑐𝑎𝑝𝑖𝑡𝑎𝑙𝑖 + 𝜀𝑖

1.1

Would quantile regression be appropriate to use with this dataset? If yes, why? If no, why?

Yes, since the effects of the different regressors might vary depending on the level of GDP. E.g., the role of education may be more or less important at different stages of economic development.

A simple QQ-plot of the least squares regression model indicates that the residuals are not normally distributed, and we have a few outliers for which the least squares estimate might not be representative.

Least-squares method

Least squares estimates the conditional mean of 𝑦 given the regressors by minimizing the sum of squared residuals:

$$S(\beta) = \sum_{i=1}^{n} (y_i - x_i'\beta)^2$$

Quantile regression

Quantile regression estimates a conditional quantile of the dependent variable; for 𝜏 = 0.5 this is the conditional median. Minimizing the expression below for a given quantile (𝜏) yields the best-fit 𝜏th quantile:

$$S(\xi) = \tau \sum_{y_i \geq \xi} |y_i - \xi| + (1 - \tau) \sum_{y_i < \xi} |y_i - \xi|$$
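As a rough illustration of why minimizing this expression picks out a quantile, the sketch below evaluates the loss at every observed value and keeps the minimizer. It is written in Python purely for illustration (the exercise itself is solved in R); the function names and the five data points are made up and are not taken from E42.RData.

```python
def check_loss(xi, ys, tau):
    """tau * sum over y >= xi of |y - xi|  +  (1 - tau) * sum over y < xi of |y - xi|."""
    above = sum(abs(y - xi) for y in ys if y >= xi)
    below = sum(abs(y - xi) for y in ys if y < xi)
    return tau * above + (1 - tau) * below


def best_fit_quantile(ys, tau):
    # The loss is piecewise linear in xi, so a minimizer is attained at a data point.
    return min(sorted(ys), key=lambda xi: check_loss(xi, ys, tau))


ys = [1.0, 2.0, 3.0, 4.0, 100.0]     # made-up sample with one large outlier
median = best_fit_quantile(ys, 0.5)  # minimizer for tau = 0.5
upper = best_fit_quantile(ys, 0.75)  # minimizer for tau = 0.75
mean = sum(ys) / len(ys)             # least-squares counterpart, for comparison
print(median, upper, mean)
```

For this sample the minimizer at 𝜏 = 0.5 is 3.0, which the outlier does not move, while the mean is pulled up to 22.0. This is the sense in which the quantile estimate can be more representative than least squares when outliers are present.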

1.2

Conduct Quantile Regression on the model with 𝜏 = [0.25, 0.5, 0.75]

[Figure: quantile regression coefficient plots. Y-axis: effect on GDP. X-axis: quantiles.]

The graphs above plot the marginal effects for all deciles. The effects of education and capital are roughly constant across quantiles, while the effect of labour is decreasing and the effect of energy is increasing.

1.3

Comment on your findings in 1.2

Labour: The marginal effect decreases substantially at higher quantiles of GDP. That is, an increase in labour has a smaller marginal effect when GDP is large. This could indicate the importance of labour for developing countries. Notice that labour is barely significant in the third specification.

Education: Positive and (highly) significant across all quantiles.

Energy: Significant at the 1% level across all specifications. The effect gradually increases as we consider higher quantiles.

Capital: Capital only appears to have a (slightly) significant effect for countries with low GDP. For these countries, an increase in the capital stock is expected to decrease GDP.

Exercise 2
The dataset E52.RData contains information on net financial wealth (nettfa), age of survey respondent
(age), annual family income (inc), family size (fsize) and information on participation in certain pension
plans for people in the United States. The wealth and income variables are both recorded in thousands
of euros.

For this problem, use only data for married people without children living at home (marr = 1, fsize =
2) and consider the following model,

𝑛𝑒𝑡𝑡𝑓𝑎 = 𝛽0 + 𝛽1 𝑖𝑛𝑐 + 𝛽2 𝑎𝑔𝑒 + 𝜀

2.1

Estimate the model above by OLS, and comment on the results. Do not forget to comment on the intercept

Income: A one-unit increase in income is expected to increase net financial wealth by 1.399 units (both variables are recorded in thousands). The coefficient is significant at the 1% level.

Age: A one-year increase in age is expected to increase net financial wealth by 1.840 units. The coefficient is significant at the 1% level.

Intercept: This is the predicted net financial wealth for an individual for whom age = 0 and income = 0. In this setting, the intercept carries little relevant information, as we are not interested in predicting the wealth of infants.

2.2

Reformulate the model such that the intercept fits the data. Please comment on your findings

Note: There are multiple ways of reformulating the model. This can either be done by demeaning the
variables, making a log-transformation or including additional regressors.

Demeaned model

To demean the model, we subtract from each regressor its sample mean:

$$nettfa = \beta_0 + \beta_1 (inc - \overline{inc}) + \beta_2 (age - \overline{age}) + \epsilon$$

In this case, the intercept can be interpreted as the average net financial wealth when all the regressors are set to their means. From the output, we see that when income and age are evaluated at their average values, net financial wealth is expected to be positive (48.035).
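A short derivation may help show why demeaning changes only the intercept. The symbol $\beta_0^{c}$ is notation introduced here for the intercept of the demeaned model; it does not appear in the exercise. Adding and subtracting the means in the original model gives:

```latex
\begin{aligned}
nettfa &= \beta_0 + \beta_1\, inc + \beta_2\, age + \epsilon \\
       &= \underbrace{\left(\beta_0 + \beta_1\, \overline{inc} + \beta_2\, \overline{age}\right)}_{\beta_0^{c}}
          + \beta_1 \left(inc - \overline{inc}\right)
          + \beta_2 \left(age - \overline{age}\right) + \epsilon
\end{aligned}
```

The slopes $\beta_1$ and $\beta_2$ are the same in both lines. Because the centred regressors have sample mean zero and OLS residuals average to zero, the estimated $\beta_0^{c}$ equals the sample mean of nettfa.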

The interpretation of the coefficients does not change.

Additional regressors

Added male, pira, p401k, incsq, and agesq. Thus, I will be estimating:

$$nettfa = \beta_0 + \beta_1 inc + \beta_2 age + \beta_3 male + \beta_4 pira + \beta_5 p401k + \beta_6 inc^2 + \beta_7 age^2 + \epsilon$$

Male: Insignificant effect.

IRA: Significant at the 1% level. Having an “individual retirement account” increases net financial wealth by 35.832.

P401k: Significant at the 1% level. Having an “employer-sponsored defined-contribution pension account” increases net financial wealth by 15.894.

Income (sq.): Significant at the 1% level. Interestingly, for low values of income the effect of income is negative, but it turns positive at some point.

Age (sq.): Insignificant effect.
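The point at which the income effect changes sign can be written down from the quadratic specification. The sketch below uses $inc^{*}$ as notation introduced here for that turning point; no numerical value is computed because the coefficient estimates are not reproduced in the text.

```latex
\frac{\partial\, nettfa}{\partial\, inc} = \beta_1 + 2 \beta_6\, inc
\qquad \Longrightarrow \qquad
inc^{*} = -\frac{\beta_1}{2 \beta_6}
```

With $\beta_1 < 0$ and $\beta_6 > 0$, the marginal effect is negative for $inc < inc^{*}$ and positive for $inc > inc^{*}$.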

Log-transformed model

Here, I will be estimating:

log(𝑛𝑒𝑡𝑡𝑓𝑎) = 𝛽0 + 𝛽1 𝑖𝑛𝑐 + 𝛽2 𝑎𝑔𝑒 + 𝜖

Income: A one-unit increase in income is expected to increase net financial wealth by approximately 3.1%. This effect is significant at the 1% level.

Age: A one-year increase in age is expected to increase net financial wealth by approximately 4.4%.
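In a log-level model, reading a coefficient as a percentage relies on the approximation 100·β for small β. A minimal sketch of the exact conversion, 100·(exp(β) − 1), is given below in Python. The coefficient values 0.031 and 0.044 are assumed from the rounded percentages reported above; they are not read from the regression output.

```python
import math


def exact_pct_change(beta):
    """Exact % change in y for a one-unit increase in x when log(y) = ... + beta * x."""
    return 100.0 * (math.exp(beta) - 1.0)


inc_effect = exact_pct_change(0.031)  # assumed coefficient on inc
age_effect = exact_pct_change(0.044)  # assumed coefficient on age
print(round(inc_effect, 2), round(age_effect, 2))
```

For coefficients of this size the exact and approximate figures differ by less than a tenth of a percentage point, so the approximate reading is harmless here.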

2.3

Test the following hypothesis, 𝐻0 : 𝛽1 = 1 (coefficient on income), on one or more of your reformulated models and compare to the original model

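Given the extremely low p-values, we can reject 𝐻0 : 𝛽1 = 1. Hence, the coefficient on income is significantly different from 1, which is consistent with the estimate of 1.399 in the original and demeaned models.

Note that the test is not exactly applicable to the 3rd and 4th models, as they include squared covariates and use a log-transformation, respectively, so the coefficient on income no longer has the same interpretation.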

2.4

Remove age and agesq from the models and comment on your findings

Every regressor turns out to be highly significant. For the original and demeaned models, the coefficient on income decreases slightly compared to the regressions including age.

The magnitudes of the income, male, IRA, and income squared coefficients increase in Model 3 (with additional regressors), while the 401(k) coefficient decreases in magnitude.

In the log-transformed model (Model 4), the effect of a one-unit increase in income decreases from 3.1% to 2.9%. This might be caused by income and age being negatively correlated.
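The link between the omitted age variable and the change in the income coefficient can be made explicit with the usual omitted-variable identity for a regression on inc and age. The symbols $\tilde{\beta}_1$ (income coefficient without age) and $\hat{\delta}_1$ (slope from regressing age on inc) are notation introduced here.

```latex
\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2\, \hat{\delta}_1,
\qquad
\hat{\delta}_1 = \frac{\widehat{\mathrm{Cov}}(inc,\, age)}{\widehat{\mathrm{Var}}(inc)}
```

Since the age coefficient $\hat{\beta}_2$ is positive, the income coefficient can only fall when age is dropped if $\hat{\delta}_1 < 0$, that is, if income and age are negatively correlated in the sample.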

2.5

Estimate the model(s) without age and agesq by Quantile Regression for 𝜏 = [0.05, 0.25, 0.5, 0.75, 0.95], and compare to OLS

Model 1 and 2

Interestingly, from our original and demeaned models, it seems that the effect of income becomes stronger at higher quantiles of net financial wealth. The OLS estimate is closest to the estimate at the 75% quantile.

Model 3

In Model 3, where additional regressors are included, the effect of income is negative and increasing in magnitude as we consider higher quantiles of net financial wealth.

The effect of being male is relatively constant across specifications, except at 𝜏 = 0.95, where the magnitude increases substantially.

Having an “individual retirement account” seems to have a larger effect for individuals with higher net financial wealth. The same can be said for having an “employer-sponsored defined-contribution pension account”.

The positive coefficient on $inc^2$ suggests that the effect of income follows a U-shape for all quantiles.

Model 4

In the log-transformed model, the effect of income is roughly the same at 𝜏 = 0.05 and 𝜏 = 0.25 (approx. 3.4%). From then on, the effect decreases as net financial wealth increases.

2.6

How are the Standard Errors estimated in the Quantile Regression? And under which assumptions are they valid?

Hint: Look in “summary.rq” in the help section in R to find the different estimation methods for the
standard errors.

The standard errors in the quantile regression can be estimated by bootstrapping, using the formula:

$$SE(\hat{\theta}) = \sqrt{\frac{\sum_{j=1}^{J} (\hat{\theta} - \hat{\theta}_j)^2}{J - 1}}$$

The reason for this formula is that analytical standard errors are cumbersome to obtain, so we draw $J$ random samples with replacement. $\hat{\theta}$ is the estimate based on the original sample (for each quantile), and $\hat{\theta}_j$ denotes the estimate from bootstrap sample number $j$.
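The resampling procedure can be sketched in a few lines. The example below is in Python for illustration and bootstraps the sample median of made-up data rather than a regression coefficient; the function name, the number of replications, and the data are all assumptions, and this is not how the summary.rq output in R was produced.

```python
import math
import random
import statistics


def bootstrap_se(sample, estimator, n_boot=500, seed=0):
    """sqrt( sum_j (theta_hat - theta_hat_j)^2 / (J - 1) ), with J = n_boot."""
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = estimator(sample)  # estimate on the original sample
    squared_devs = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        squared_devs.append((theta_hat - estimator(resample)) ** 2)
    return math.sqrt(sum(squared_devs) / (n_boot - 1))


# Made-up, right-skewed data (not from E52.RData)
data_rng = random.Random(42)
wealth = [data_rng.expovariate(1 / 20) for _ in range(200)]
se_median = bootstrap_se(wealth, statistics.median)
print(se_median)
```

In the regression setting, one would resample whole observations (rows of the dataset) and re-run the quantile regression on each resample, collecting the coefficient of interest as the bootstrap estimate.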

Exercise 3
Perform an analysis of factors affecting birthweight similar to the one in Koenker and Hallock (2001), Quantile Regression, but using a dataset from the MASS package in R. Be sure to install the MASS package and load the dataset using the following code: library(MASS); data = birthwt

This data frame contains the following columns:

• low: indicator of birth weight less than 2.5 kg.
• age: mother’s age in years.
• lwt: mother’s weight in pounds at last menstrual period.
• race: mother’s race (1 = white, 2 = black, 3 = other).
• smoke: smoking status during pregnancy.
• ptl: number of previous premature labours.
• ht: history of hypertension.
• ui: presence of uterine irritability.
• ftv: number of physician visits during the first trimester.
• bwt: birth weight in grams.

3.1

The dataset is substantially smaller than that of Koenker and Hallock (2001), p. 6. Is that an issue?

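Yes. With fewer observations, the coefficient estimates are less precise and their standard errors are larger, particularly at extreme quantiles such as 𝜏 = 0.05 and 𝜏 = 0.95, where few observations lie. The asymptotic results that justify inference (consistency and the central limit theorem) are also a poorer approximation in small samples.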

3.2

Estimate the relationship between birthweight and other relevant factors by OLS and Quantile Regression for 𝜏 = [0.05, 0.25, 0.5, 0.75, 0.95]. Argue for each and every variable, and whether they should be transformed in any way

Mother’s weight: Mother’s weight seems to positively impact the birth weight of a child. The effect seems to be largest around the middle quantiles, and is only statistically significant at 𝜏 = 0.75.

Race:

• Black: Increasing negative effects, with a sudden drop at the 95% quantile.

• Other: We only estimate significant effects (at a 10% significance level) at 𝜏 = 0.25 and 𝜏 = 0.50.

Age: Largest negative effect at the 75% quantile.

Premature labour: Insignificant.

Smoker: Negative effect, which seems to be most prominent at 𝜏 = 0.5 (the LAD regression).

Hypertension: Largest negative effect at 𝜏 = 0.25, which is the only significant effect (at a 10% significance level).

Uterine irritability: Negative effect which is significant across all quantiles except the 95% quantile.

We find insignificant coefficients in the tails of the distribution of children’s birthweight, with the exception of “smoker” and “uterine irritability”.

3.3

How are the standard errors calculated, and under which assumptions are they valid?

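$$SE(\hat{\theta}) = \sqrt{\frac{\sum_{j=1}^{J} (\hat{\theta} - \hat{\theta}_j)^2}{J - 1}}$$

The standard errors are computed by bootstrapping, as in the formula above. An assumption is that the number of samples drawn, $J$, must be large.

The method remains valid if the errors are not normally distributed and there is heteroskedasticity.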
