Metrics Jan 2021

The document outlines an examination for a PhD in econometrics consisting of three parts over three days. Part A includes three questions on statistics, OLS estimation, and a regression application. Part B includes three questions on causal analysis using the potential outcomes framework, maximum likelihood estimation, and partitioned regression. Part C includes three questions on a choice experiment, models for health outcome measures, and the Tobit and Heckman selectivity models.

Uploaded by

Ahmed leo

PhD Econometrics Examination

Part A

Wednesday, March 17th, 2021

Total Time: 3 hours

PART A

(Answer any TWO of the following three questions)

Q1. Statistics
a. Let the pdf of the random variable $x$ be

$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$

Define a new random variable $y = x^2$. Find the pdf of $y$.

Let $X_1, X_2, \ldots, X_n$ represent a random sample from the $\chi^2(1)$ distribution, and let $\bar{X}$ be the sample mean.

b. Find the limiting distribution of $\sqrt{\bar{X}}$. (Note: the $\chi^2(k)$ distribution has mean $k$ and variance $2k$.)

c. Find the limiting distribution of $\sqrt{n}(\sqrt{\bar{X}} - 1)/\bar{X}$.

Consider two random variables $X_1$ and $X_2$, whose joint density function is

$f(x_1, x_2) = \begin{cases} 2, & 0 < x_1 < x_2 < 1 \\ 0, & \text{elsewhere} \end{cases}$

d. Find the marginal density functions of $X_1$ and $X_2$.

e. Find the conditional density function of $X_1$ given $X_2$.

Q2. OLS estimation

For a dependent variable vector $y$ with $n$ observations, the corresponding independent variable
matrix is $X$ with $k$ variables, the parameter vector is $\beta$, and the residual vector is $e$.

a. Write out the matrix representation of the linear regression and state the dimensions of each
matrix/vector.

b. Derive the OLS solution of parameter vector estimate b.

c. Interpret the goodness-of-fit measure $R^2$ and derive its expression as a function of $y$,
$\hat{y}$ and $e$.

d. Derive the distribution of the estimated parameter vector, assuming the residuals are normally
distributed, $N(0, \sigma^2 I_n)$.

e. If a restriction $R\beta = q$ is imposed on the regression coefficients, demonstrate what
happens to the sum of squared errors $e'e$.

Q3. Regression application

Consider the following time series regression output, in which annual per capita gasoline
consumption is regressed on a set of explanatory variables: income, price, and a linear trend.
log(gas/pop) ~ log(income) + log(price) + ltrend

Residuals:
     Min        1Q    Median        3Q       Max
-0.06302  -0.01648   0.00579   0.01840   0.04726

Coefficients:
               Estimate  Std. Error  t value  Pr(>|t|)
Intercept    -16.634946    1.002185  -16.599   < 2e-16  ***
log(income)    1.870306    0.114454   16.341   < 2e-16  ***
log(price)    -0.114410    0.022667   -5.048  1.73e-05  ***
ltrend        -0.017939    0.002599      ???       ???
---
Residual standard error: 0.02956 on 32 degrees of freedom
Multiple R-squared: 0.9653, Adjusted R-squared: 0.962
F-statistic: 296.7 on 3 and 32 DF, p-value: < 2.2e-16

a. Using $t_{0.025} = 2$, calculate the 95% confidence interval for the coefficient on log(income).
Round to the 2nd decimal place.

b. Calculate the t-value of the ltrend coefficient. Is it significant at the 1% level?

c. Test the hypothesis that the income elasticity equals 1.

d. To perform an F test of the hypothesis that the coefficient on log(price) equals 0, what is the
value of the test statistic, and why?

e. Suppose the residual plot of this time series regression is as given (the figure is not
reproduced in this copy). What type of problem is likely to exist in the error terms? What kind of
problem does it introduce into the estimates?
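
The arithmetic behind parts a to d can be sketched in a few lines of Python. This is only an illustration built from the numbers printed in the regression output; the variable names are assumptions made here, and the F statistic for a single restriction is taken as the square of the corresponding t ratio.

```python
# Estimates and standard errors copied from the regression output.
coef = {
    "log(income)": (1.870306, 0.114454),
    "log(price)": (-0.114410, 0.022667),
    "ltrend": (-0.017939, 0.002599),
}
t_crit = 2.0  # critical value supplied in part a

# Part a: 95% confidence interval for the log(income) coefficient.
b_inc, se_inc = coef["log(income)"]
ci_income = (round(b_inc - t_crit * se_inc, 2), round(b_inc + t_crit * se_inc, 2))

# Part b: t ratio for ltrend (estimate divided by standard error).
b_tr, se_tr = coef["ltrend"]
t_trend = b_tr / se_tr

# Part c: t ratio for H0: income elasticity = 1.
t_elasticity = (b_inc - 1.0) / se_inc

# Part d: with one restriction, F equals the squared t ratio.
b_pr, se_pr = coef["log(price)"]
f_price = (b_pr / se_pr) ** 2

print(ci_income, round(t_trend, 3), round(t_elasticity, 3), round(f_price, 2))
```

Each statistic is then compared with the relevant critical value for 32 degrees of freedom, which the sketch does not look up.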

PhD Econometrics Examination

Part B

Thursday, March 18th, 2021

Total Time: 3 hours

PART B

(Answer any TWO of the following three questions)

Q1. Causal Analysis

In econometric analysis, we are often concerned with estimating the causal effect of some treatment
variable, $D_i$, on an outcome variable of interest, $Y_i$. However, obtaining a causal estimate can be
challenging.

a. What is the “Fundamental Problem of Causal Inference”? Please define it and explain
what it means for empirical analysis.

b. Define $Y_{1i}$ as the outcome of individual $i$ if she/he is treated and $Y_{0i}$ as the outcome of
that same individual if she/he were not treated. If treatment were randomly assigned
across individuals in the sample, then the treatment effect of $D_i$ on $Y_i$ is as follows:

$E_i[Y_{1i} - Y_{0i}] = E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]$, (B.1)

where $D_i$ is equal to one if individual $i$ was treated and equal to zero if she/he was not
treated (i.e., was in the control group).

Suppose that treatment was not randomly assigned. Using the Potential Outcomes
Framework, decompose the expectation in equation (B.1) into the “average treatment
effect” and “selection bias”. Say in words what is captured by the average treatment
effect term and the selection bias terms that you derive.

c. A large body of evidence indicates that in utero and early life health affects adult
economic well-being. Suppose that you are interested in estimating the effect of being
born at a low birth weight (an indicator of poor in utero health and health at birth) on
adult economic well-being for a sample of individuals in Ethiopia. Note: in utero refers
to the period during which a child was in his/her mother’s womb (i.e., after conception
but before birth). So, you estimate the following model:

$Y_i = \alpha + \beta \, \mathit{LBW}_i + \varepsilon_i$ (B.2)

where $Y_i$ is the adult earnings of individual $i$ and $\mathit{LBW}_i$ is equal to one if individual $i$ was
born at a low birth weight and zero otherwise. Ideally, then, $\beta$ would capture the effect of
being born at a low birth weight on adult earnings.

What assumption is required in order for the OLS estimate of $\beta$ to represent the unbiased,
causal effect of low birth weight on earnings? Why is this assumption likely to fail?

d. You decide to instrument for low birth weight using the instrumental variable $Z$. Which
two requirements are necessary for this instrument to be valid? Please define both
mathematically and in words.

e. Using $Z$ as your instrument, derive the GMM-IV estimator for $\beta$.

f. Derive and describe the estimator for $\beta$ using the control function approach, with $Z$
as your instrument.

g. Would the following variables be plausible instruments for being born at a low birth
weight? Explain why or why not and be sure to address each of the requirements for a
valid instrument in your explanation.

i. Mother’s and/or father’s birth weight

ii. Mother’s and/or father’s highest grade attainment

iii. The prevalence of infectious disease in the area where individual $i$ was born
during his/her in utero period.

iv. Rainfall during the most recent growing season prior to individual $i$'s birth.
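
For intuition on parts d and e, the just-identified IV estimator with a single instrument and an intercept reduces to a ratio of sample covariances, cov(Z, Y) / cov(Z, LBW). The sketch below only illustrates that arithmetic. The data are invented for the example, the outcome is generated without an error term so the ratio recovers the assumed slope exactly, and none of the names come from the exam.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in the IV ratio)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

# Hypothetical data: z is a binary instrument, lbw the low-birth-weight dummy.
z = [0, 0, 1, 1, 0, 1, 1, 0]
lbw = [0, 1, 1, 1, 0, 1, 0, 0]

# Outcome built with an assumed intercept of 10 and slope of -3, no noise.
y = [10.0 - 3.0 * d for d in lbw]

# Relevance requires cov(z, lbw) != 0; otherwise the ratio is undefined.
first_stage_cov = cov(z, lbw)
beta_iv = cov(z, y) / first_stage_cov

print(first_stage_cov, beta_iv)
```

With real data the exclusion restriction cannot be checked this way; only the relevance condition shows up in the denominator.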

Q2. Maximum Likelihood

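Let $\tilde{y}$ be some unobserved latent variable such that

$\tilde{y} = x\beta + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2 I)$.

You observe $y_i$ and $x_i$, $i = 1, \ldots, N$, such that

$y_i = \begin{cases} 1 & \text{if } \tilde{y}_i > 0 \\ 0 & \text{if } \tilde{y}_i \le 0 \end{cases}$

Define $\phi(\theta)$ as the pdf for a standard normal and $\Phi(\theta)$ as the cdf for the standard normal. Note:

$\frac{\partial \Phi(z)}{\partial \theta} = \phi(z) \frac{\partial z}{\partial \theta}$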

a. What is $\theta$, the identifiable parameter of interest in this problem?

b. Derive the probabilities that $y_i = 1$ and $y_i = 0$ for individual $i$.

c. Derive the contribution of each individual in your sample to the overall likelihood
function (i.e., derive $\mathcal{L}_i(\theta)$) and the individual log-likelihood function.

d. Derive the Score Function needed to identify $\hat{\theta}_{MLE}$.

e. Explain what is implied by the simplified form of the Score function (i.e., what is the
implied orthogonality condition).
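
A numerical check can help when verifying a derived score. The sketch below assumes the standard probit form with a scalar regressor, in which the log-likelihood contribution is y_i log Φ(x_i θ) + (1 − y_i) log(1 − Φ(x_i θ)), and compares the analytic score with a central finite difference. The data and the trial value of θ are invented for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def loglik(theta, xs, ys):
    """Sum of individual probit log-likelihood contributions."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = norm_cdf(x * theta)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def score(theta, xs, ys):
    """Analytic derivative of loglik with respect to theta."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = x * theta
        p = norm_cdf(z)
        total += (y - p) * norm_pdf(z) / (p * (1.0 - p)) * x
    return total

# Hypothetical sample and trial parameter value.
xs = [-1.5, -0.5, 0.2, 0.8, 1.7]
ys = [0, 0, 1, 0, 1]
theta = 0.4

h = 1e-6
numeric_score = (loglik(theta + h, xs, ys) - loglik(theta - h, xs, ys)) / (2 * h)
analytic_score = score(theta, xs, ys)
print(numeric_score, analytic_score)
```

The score is a weighted sum of the regressor times a generalized residual, which is the structure part e asks about.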

Q3. Partitioned Regression and Frisch-Waugh-Lovell Theorem

Consider the model $y = Xb + e$, where $X$ is an $n \times k$ matrix. Let the data matrix $X$ be partitioned
into two matrices, $X = [X_1 : X_2]$, where $X_1$ and $X_2$ have the dimensions $n \times k_1$ and $n \times k_2$,
respectively, and $k_1 + k_2 = k$. Thus, we can rewrite the model as

$y = X_1 b_1 + X_2 b_2 + e$. (B.3)

a. Perform an OLS regression of $X_1$ on $X_2$. Derive the matrix of residuals from this
regression and denote it $e_{12}$. (Hint: use the residual matrix for $X_2$.)

b. Perform an OLS regression of $y$ on $X_2$. Derive the matrix of residuals from this
regression and denote it $e_{y2}$. (Hint: use the residual matrix for $X_2$.)

c. Perform an OLS regression of $e_{y2}$ on $e_{12}$. Derive the OLS coefficient from this
regression and denote it $\tilde{b}_1$. (You may use the normal equations to do this.)

d. Show that $\tilde{b}_1 = \hat{b}_1$, where $\hat{b}_1$ is the OLS coefficient on $X_1$ obtained from a regression
of $y$ on both $X_1$ and $X_2$. (Hint: use the answer derived in part c and substitute it into
the full model for $y$ represented by equation (B.3). The residual $e$ in the regression of
$y$ on $X$ is orthogonal to both $X_1$ and $X_2$.)

e. Denote the residuals from the regression of $e_{y2}$ on $e_{12}$ as $\tilde{e}$. Show that these
residuals, based on the model $e_{y2} = e_{12} \tilde{b}_1 + \tilde{e}$, are the same as the residuals obtained
from the regression of $y$ on both $X_1$ and $X_2$. (Hint: decompose $e_{y2}$ and $e_{12}$ into their
original parts. You will also need to use the results from part d.)

f. Suppose that $X_1' X_2 = 0$, meaning that the two sets of variables are orthogonal. Show
that, in this case, $\hat{b}_1 = b_1^*$, where $b_1^*$ is the OLS coefficient on $X_1$ obtained from a
regression of $y$ on $X_1$ alone.

g. Define the Frisch-Waugh Theorem and describe its intuition.
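
The equalities in parts d and e can be checked numerically before attempting the proof. The sketch below is a minimal illustration, assuming a single column for each of X1 and X2 and no intercept, so that every regression reduces to simple sums; the numbers are invented.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(v, x):
    """Residuals from an OLS regression of v on the single regressor x."""
    slope = dot(x, v) / dot(x, x)
    return [vi - slope * xi for vi, xi in zip(v, x)]

# Hypothetical data with one column in each block.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [3.0, 4.0, 8.0, 9.0, 13.0]

# Full regression of y on x1 and x2: solve the 2x2 normal equations.
a, b, d = dot(x1, x1), dot(x1, x2), dot(x2, x2)
det = a * d - b * b
b1_full = (d * dot(x1, y) - b * dot(x2, y)) / det
b2_full = (a * dot(x2, y) - b * dot(x1, y)) / det
e_full = [yi - b1_full * u - b2_full * w for yi, u, w in zip(y, x1, x2)]

# Two-step route: partial x2 out of x1 and y, then regress residual on residual.
e12 = residualize(x1, x2)
ey2 = residualize(y, x2)
b1_fwl = dot(e12, ey2) / dot(e12, e12)
e_fwl = [r - b1_fwl * s for r, s in zip(ey2, e12)]

max_resid_gap = max(abs(p - q) for p, q in zip(e_full, e_fwl))
print(b1_full, b1_fwl, max_resid_gap)
```

Both the coefficient on x1 and the residual vector agree across the two routes up to floating-point error.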

PhD Econometrics Examination

Part C

Friday, March 19th, 2021

Total Time: 3 hours

PART C

(Answer any TWO of the following three questions)

Q7. Consider the following study/survey scenario. To find out people’s preferences for buying micro
health insurance, a choice experiment study was set up. The following attributes were considered,
with their respective levels:

Price: 10, 200, 500 (in NRP, Nepalese Rupees / month)
Major Surgery: No, Yes
Doctor’s Visits (/ month): 1, 2, unlimited
Lab Work: No, Yes
Immunization: No, Yes

In addition, other socioeconomic variables were collected: Age, Gender, Income, Current Insurance
(Yes/No), Education Level, Distance (in minutes) to Nearest Clinic, and Number of Children under 18.
The objective was to calculate the marginal willingness to pay value.

a. Set up a RUM structure for this model using the indirect utility functions etc.

b. Present two examples of choice sets.

c. Set up the log-likelihood function for this model.

d. Why is it called a conditional logit model?

e. Present the formula for the marginal willingness to pay for each attribute, and interpret them.

f. What would be the total WTP value?
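
As a rough illustration of the quantities involved, the sketch below computes conditional logit choice probabilities for one choice set and the usual marginal willingness to pay ratio, minus the attribute coefficient divided by the price coefficient. The coefficient values, the reduced attribute list, and the alternatives are all invented for the example and are not estimates from any data.

```python
import math

# Hypothetical utility coefficients (price in NRP per month).
beta = {"price": -0.004, "surgery": 0.8, "lab": 0.3}

def utility(alternative):
    """Deterministic part of indirect utility, linear in attributes."""
    return sum(beta[name] * alternative[name] for name in beta)

def choice_probs(choice_set):
    """Conditional logit probabilities: exp(V_j) / sum_k exp(V_k)."""
    v = [utility(alt) for alt in choice_set]
    v_max = max(v)  # subtract the max for numerical stability
    expv = [math.exp(vi - v_max) for vi in v]
    total = sum(expv)
    return [e / total for e in expv]

# One illustrative choice set with three insurance plans.
choice_set = [
    {"price": 10, "surgery": 0, "lab": 0},
    {"price": 200, "surgery": 1, "lab": 0},
    {"price": 500, "surgery": 1, "lab": 1},
]
probs = choice_probs(choice_set)

# Marginal WTP for each non-price attribute.
mwtp = {name: -beta[name] / beta["price"] for name in beta if name != "price"}
print(probs, mwtp)
```

The log-likelihood in part c would sum the log of the probability of the chosen alternative over respondents and choice sets.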

Q8. In the same survey, new mothers were asked the following health outcome questions: 1) Number
of times the woman visited the clinic for antenatal care; 2) The BMI of the child at birth; 3) Mode of
delivery (at home by family members, by community midwife, or at clinic); and 4) Self-rated health
status (4 = Feeling very well … 1 = Not feeling well at all). That is, there were four different data
generating processes to describe the outcome variables (antenatal visits, BMI, mode of delivery, and
self-reported ranked health status).

a. For each health outcome measure, choose an appropriate modeling/estimation method and
explain why you chose it.

b. For each health outcome case, write out, in steps, all regression equations and the
log-likelihood functions.

c. Also, describe the expected sign for each of the independent variables you chose to include in
your model.

Q9. Define and describe the difference between the Tobit and the Heckman Selectivity model. Give a
real-world example for each of the cases with the corresponding log-likelihood functions. Show your
work.
