Assignment9Sol - Copy

Intro to Econometrics NYU

Uploaded by

warn2104

Assignment 9: ECON-UA 266 - Intro to Econometrics

Sahar Parsa

Fall 2024

The ninth assignment is due on Friday, November 22nd, 2024. It covers the material related to logit and
probit models as well as panel data methods. For the Data questions, report the output of your analysis in a
“report style” that is pleasing to read, and add the code you used to generate your results.

Question 1
Four hundred driver’s license applicants were randomly selected and asked whether they passed their driving
test ($\text{Pass}_i = 1$) or failed their test ($\text{Pass}_i = 0$); data were also collected on their gender ($\text{Male}_i = 1$ if
male and $= 0$ if female) and their years of driving experience ($\text{Experience}_i$, in years). The following table
summarizes several estimated models (standard errors in parentheses).

              (1) Probit    (2) Logit    (3) Linear Probability
Experience    0.031         0.040        0.006
              (0.009)       (0.016)      (0.002)
Constant      0.712         1.059        0.774
              (0.126)       (0.221)      (0.034)
a. Using the results in column (1), does the probability of passing the test depend on Experience?
Suppose that Matthew has 10 years of driving experience. What is the probability that he will pass the
test? Christopher is a new driver (zero years of experience). What is the probability that he will pass
the test? The sample included values of Experience between 0 and 40 years, and only four people in the
sample had more than 30 years of driving experience. Jed is 95 years old and has been driving since he
was 15. What is the model’s prediction for the probability that Jed will pass the test? Do you think
that this prediction is reliable? Why or why not?
Solution :
At the 5% significance level, the probability of passing the test depends on Experience. In the probit model, the
coefficient on Experience is 0.031 with a standard error of 0.009. The corresponding t-statistic is:

$$t_1 = 0.031/0.009 \approx 3.44$$

The t-statistic is greater than 1.96 (the critical value at the 5% significance level). Therefore, we reject the null
hypothesis that the probability of passing the test does not depend on Experience.
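The significance check above is simple arithmetic, and it can be reproduced with a short script. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the p-value relies on the large-sample normal approximation, which is an assumption of this sketch rather than something stated in the assignment.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def t_stat(coef: float, se: float) -> float:
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / se

def two_sided_p(t: float) -> float:
    """Two-sided p-value under the normal approximation (an assumption here)."""
    return 2.0 * (1.0 - std_normal_cdf(abs(t)))

# Probit coefficient on Experience and its standard error, from column (1).
t1 = t_stat(0.031, 0.009)
print(round(t1, 2))            # 3.44
print(abs(t1) > 1.96)          # True: reject at the 5% level
print(two_sided_p(t1) < 0.05)  # True
```

The same two functions apply unchanged to the logit and linear probability columns by swapping in their coefficients and standard errors.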
For the probit model, the probability of passing is

$$\Pr(\text{Pass}_i = 1 \mid \text{Experience}_i) = \Phi(0.712 + 0.031 \times \text{Experience}_i)$$

where Φ is the standard normal cumulative distribution function.

Given that Matthew has 10 years of driving experience, his probability of passing is Φ(0.712 + 0.031 × 10) = Φ(1.022).
According to the cumulative normal distribution table, Φ(1.022) = Pr(Z ≤ 1.022) = 0.84661. Thus, Matthew
has an 84.66% probability of passing.

Similarly, Christopher’s probability of passing is Φ(0.712) = 0.76177.
For Jed, the predicted probability is Φ(0.712 + 0.031 × 80) = Φ(3.192) ≈ 0.9993. Note that because our sample
only includes individuals with 0 to 40 years of experience, the model might not apply to Jed, who has
80 years of driving experience. In particular, we could worry that the probability of passing the test
conditional on experience decreases after a certain number of years, as drivers become very old. This could
be picked up by a z-index that is quadratic in experience. But again, we might not have the right sample to
pick up this effect for the population at large, and our model might only apply to the population specific to
our study. More generally, nothing rules out using a model to extrapolate out of sample, but one has to
understand the context well before trusting such a prediction.
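The three probit predictions can be checked without a normal table. The sketch below evaluates Φ through `math.erf` from the Python standard library; the function name `probit_prob` and its default arguments are illustrative and simply hard-code the column (1) estimates.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(experience: float, const: float = 0.712, slope: float = 0.031) -> float:
    """Predicted Pr(Pass = 1 | Experience) from the probit estimates in column (1)."""
    return std_normal_cdf(const + slope * experience)

# Years of experience for the three drivers discussed in part (a).
for name, years in [("Matthew", 10), ("Christopher", 0), ("Jed", 80)]:
    print(f"{name}: {probit_prob(years):.4f}")
```

Jed's value is computed exactly like the others; the code cannot tell that 80 years lies far outside the 0 to 40 range of the sample, which is why the reliability question has to be answered from context rather than from the number itself.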
b. Answer (a) using the results in column (2). Sketch the predicted probabilities from the probit and logit
in columns (1) and (2) for values of Experience between 0 and 60. Are the probit and logit models
similar?
Solution :
At the 5% significance level, the probability of passing the test depends on Experience. In the logit model, the
coefficient on Experience is 0.040 with a standard error of 0.016. The corresponding t-statistic is:

$$t_2 = 0.040/0.016 = 2.5$$

The t-statistic is greater than 1.96, the critical value at the 5% significance level. Therefore, we reject the null
hypothesis that the probability of passing the test does not depend on Experience.
For the logit model:

$$\Pr(\text{Pass}_i = 1 \mid \text{Experience}_i) = \left(1 + \exp(-(1.059 + 0.040 \times \text{Experience}_i))\right)^{-1}$$

Matthew has 10 years of driving experience, so his probability of passing is $(1 + \exp(-(1.059 + 0.040 \times 10)))^{-1} = 0.81138$.
Similarly, Christopher’s probability of passing is $(1 + \exp(-1.059))^{-1} = 0.74250$.
For Jed, the predicted probability is $(1 + \exp(-(1.059 + 0.040 \times 80)))^{-1} = 0.98606$. The same comment about
the large number of years of experience for Jed applies here as well.
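The logit predictions follow from plugging each experience level into the logistic function. A minimal sketch, assuming only the column (2) estimates; the function name `logit_prob` is illustrative.

```python
import math

def logit_prob(experience: float, const: float = 1.059, slope: float = 0.040) -> float:
    """Predicted Pr(Pass = 1 | Experience) from the logit estimates in column (2)."""
    z = const + slope * experience
    return 1.0 / (1.0 + math.exp(-z))

# Years of experience for the three drivers discussed in part (a).
for name, years in [("Matthew", 10), ("Christopher", 0), ("Jed", 80)]:
    print(f"{name}: {logit_prob(years):.5f}")
```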

[Figure: Logit vs. Probit. Predicted probability (y-axis, about 0.75 to 1.00) against Experience (x-axis, 0 to 60) for the Logit and Probit models.]
Over the experience range considered here, the probit model always predicts a higher probability than the
logit model, but the two sets of predictions are very similar. The two curves also have a similar shape,
and both show a diminishing effect of experience on the probability of passing. Another familiar
plot would have been to plot both the probit and the logit predictions against the z-index.
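In place of the sketch, the two curves can be tabulated on a grid. The code below is a rough illustration that hard-codes the column (1) and column (2) estimates and then checks, for every whole year from 0 to 60, the two claims made above: that the probit prediction lies above the logit prediction, and that each extra year of experience adds less than the previous one. The grid and the function names are choices made for this example.

```python
import math

def probit_prob(x: float) -> float:
    """Probit prediction from column (1): Phi(0.712 + 0.031 * x)."""
    z = 0.712 + 0.031 * x
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_prob(x: float) -> float:
    """Logit prediction from column (2): 1 / (1 + exp(-(1.059 + 0.040 * x)))."""
    z = 1.059 + 0.040 * x
    return 1.0 / (1.0 + math.exp(-z))

# Tabulate both curves every 10 years of experience.
for x in range(0, 61, 10):
    p, l = probit_prob(x), logit_prob(x)
    print(f"{x:>2}  probit={p:.3f}  logit={l:.3f}  gap={p - l:.3f}")

# Claim 1: probit lies above logit at every whole year from 0 to 60.
probit_above = all(probit_prob(x) > logit_prob(x) for x in range(61))

# Claim 2: the gain from one more year of experience shrinks as experience grows.
probit_gains = [probit_prob(x + 1) - probit_prob(x) for x in range(60)]
logit_gains = [logit_prob(x + 1) - logit_prob(x) for x in range(60)]
diminishing = all(a > b for a, b in zip(probit_gains, probit_gains[1:])) and \
              all(a > b for a, b in zip(logit_gains, logit_gains[1:]))

print(probit_above, diminishing)
```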
c. Answer (a) using the results in column (3). Sketch the predicted probabilities from the probit and
linear probability models in columns (1) and (3) as a function of Experience for values of Experience between 0
and 60. Do you think that the linear probability model is appropriate here? Why or why not?
Solution :
At the 5% significance level, the probability of passing the test depends on Experience. In the linear probability
model, the coefficient on Experience is 0.006 with a standard error of 0.002. The corresponding
t-statistic is:

$$t_3 = 0.006/0.002 = 3$$

The t-statistic is greater than 1.96, the critical value at the 5% significance level. Therefore, we reject the null
hypothesis that the probability of passing the test does not depend on Experience.
For the linear probability model:

$$\Pr(\text{Pass}_i = 1 \mid \text{Experience}_i) = 0.774 + 0.006 \times \text{Experience}_i$$

Matthew has 10 years of driving experience, so his probability of passing is 0.774 + 0.006 × 10 = 0.834.
Similarly, Christopher’s probability of passing is 0.774.
For Jed, the predicted probability is 0.774 + 0.006 × 80 = 1.254.
In this case, the linear probability model is inappropriate for predicting probabilities, because the predicted
value can be larger than 1, whereas probabilities are bounded between 0 and 1. For example, a person with
50 years of experience has a predicted probability of 0.774 + 0.006 × 50 = 1.074. The linear probability plot
also shows that the predicted probability exceeds 1 beyond some experience level.
[Figure: Probit vs. Linear. Predicted probability (y-axis, about 0.85 to 1.15) against Experience (x-axis, 0 to 60) for the Linear Probability and Probit models.]
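The experience level at which the linear probability model leaves the unit interval can be computed directly by solving 0.774 + 0.006 × Experience = 1. The sketch below hard-codes the column (3) estimates; the function and variable names are illustrative.

```python
def lpm_prob(experience: float, const: float = 0.774, slope: float = 0.006) -> float:
    """Fitted value of the linear probability model in column (3); not bounded by 1."""
    return const + slope * experience

# Experience level at which the fitted value reaches 1.
threshold = (1.0 - 0.774) / 0.006
print(f"Fitted value reaches 1 at about {threshold:.1f} years of experience")

# Fitted values for the cases discussed in the text.
for years in (0, 10, 50, 80):
    fitted = lpm_prob(years)
    flag = "  <- exceeds 1" if fitted > 1.0 else ""
    print(f"{years:>2} years: {fitted:.3f}{flag}")
```

Under these estimates the fitted value crosses 1 a little before 38 years of experience, which lies inside the 0 to 40 range of the sample, so the problem is not confined to extreme cases such as Jed.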

Question 2
Suppose that, for one semester, you can collect the following data on a random sample of college juniors
and seniors for each class taken: a standardized final exam score, percentage of lectures attended, a dummy
variable indicating whether the class is within the student’s major, cumulative grade point average prior to
the start of the semester, and SAT score.
a. Is this dataset a cluster sample? Why would you classify this data set as a cluster sample? Roughly, how
many observations would you expect for the typical student?
Solution :
This dataset is a cluster sample because the scores within a class are likely to be correlated. In this question,
the final exam scores for each class form a cluster, as different professors likely have different grading criteria
and students might help each other study the material.
A typical undergraduate student takes about 4 courses per semester, so I would expect about 4 observations
per student.
b. Write a model that explains the final exam score in terms of the percentage of lectures attended and the other
characteristics. Use s to subscript student and c to subscript class. Which variables do not change
within a student?
Solution :

$$\text{FinalExamScore}_{sc} = \beta_0 + \beta_1 \text{Attend}_{sc} + \beta_2 \text{Major}_{sc} + \beta_4 \text{GPA}_s + \beta_5 \text{SATscore}_s + \varepsilon_{sc}$$

where,
$\text{FinalExamScore}_{sc}$ = final exam score for each student in the given class
$\text{Attend}_{sc}$ = percentage of lectures attended for each student in the given class
$\text{Major}_{sc}$ = a dummy variable indicating whether this class is in the student’s major
$\text{GPA}_s$ = cumulative grade point average prior to the start of the semester for each student
$\text{SATscore}_s$ = SAT score for each student
Among these variables, $\text{GPA}_s$ and $\text{SATscore}_s$ do not change within a student, as they are determined
before the semester starts.
c. If you pool all of the data and use OLS, what are you assuming about the unobserved student
characteristics that affect performance and attendance rate? What roles do SAT score and prior GPA
play in this regard?
Solution :
If we use OLS on the pooled data, unbiased estimators require the unobserved student
characteristics to be uncorrelated with $\text{Attend}_{sc}$. However, we might worry that ability is correlated
with attendance and is an important omitted variable in this case. SAT scores and prior GPA are fixed
student characteristics in our setting and help alleviate the omitted variable problem, but they do not
eliminate it, as other variables might matter as well. It is unlikely that SAT scores and GPA
adequately capture a student’s ability.
d. If you think SAT score and prior GPA do not adequately capture student ability, how would you
estimate the effect of attendance on final exam performance?
Solution :
We would use a fixed effects model to estimate the effect of attendance on final exam scores. If $\text{GPA}_s$
and $\text{SATscore}_s$ are unable to capture student ability, then the student characteristics $A_s$ defined
below are likely to be correlated with $\text{Attend}_{sc}$. As a
result, the pooled OLS estimators are biased and inconsistent. Instead, we should use the fixed effects model.
The lecture covered three methods for estimating the fixed effects model. Here we use
entity-demeaned OLS.

$$\text{FinalExamScore}_{sc} = \beta_0 + \beta_1 \text{Attend}_{sc} + \beta_2 \text{Major}_{sc} + \beta_a A_s + \varepsilon_{sc}$$

where $A_s$ collects the student characteristics that do not change within a student; $\text{GPA}_s$ and $\text{SATscore}_s$ are
included in $A_s$.
1. Let’s first specify the fixed effects model from the previous regression:

$$\text{FinalExamScore}_{sc} = \beta_s + \beta_1 \text{Attend}_{sc} + \beta_2 \text{Major}_{sc} + \varepsilon_{sc}$$

where $\beta_s = \beta_0 + \beta_a A_s$ is a set of individual student intercepts.

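2. The entity averages satisfy:

$$\frac{1}{C}\sum_{c=1}^{C} \text{FinalExamScore}_{sc} = \beta_s + \beta_1 \frac{1}{C}\sum_{c=1}^{C} \text{Attend}_{sc} + \beta_2 \frac{1}{C}\sum_{c=1}^{C} \text{Major}_{sc} + \frac{1}{C}\sum_{c=1}^{C} \varepsilon_{sc}$$

where $C$ is the number of classes a typical student takes, which can be rewritten as:

$$\overline{\text{FinalExamScore}}_{s} = \beta_s + \beta_1 \overline{\text{Attend}}_{s} + \beta_2 \overline{\text{Major}}_{s} + \bar{\varepsilon}_{s}$$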

Subtracting the entity-average equation from the fixed effects regression gives the deviations from the entity
averages and eliminates the student fixed effects:

$$\text{FinalExamScore}_{sc} - \overline{\text{FinalExamScore}}_{s} = \beta_1 [\text{Attend}_{sc} - \overline{\text{Attend}}_{s}] + \beta_2 [\text{Major}_{sc} - \overline{\text{Major}}_{s}] + \varepsilon_{sc} - \bar{\varepsilon}_{s}$$

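To simplify notation, we write the demeaned variables with a tilde:

$$\widetilde{\text{FinalExamScore}}_{sc} = \beta_1 \widetilde{\text{Attend}}_{sc} + \beta_2 \widetilde{\text{Major}}_{sc} + \tilde{\varepsilon}_{sc}$$

3. The final step is to estimate $\beta_1$ by regressing $\widetilde{\text{FinalExamScore}}_{sc}$ on $\widetilde{\text{Attend}}_{sc}$ and $\widetilde{\text{Major}}_{sc}$ using
OLS.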
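The three steps above can be sketched on a tiny made-up dataset. Everything in the example is hypothetical: three students with four classes each, invented attendance and major values, an unobserved student intercept that rises with attendance, and true coefficients of 0.5 on attendance and 2.0 on the major dummy. The data are generated without noise so that the within estimator recovers the coefficients exactly, while a pooled regression of score on attendance alone does not.

```python
from collections import defaultdict

def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def ols_two_regressors(y, x1, x2):
    """OLS without a constant for y = b1*x1 + b2*x2, via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def pooled_slope(y, x):
    """Slope from a pooled simple regression of y on x with a constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

# Hypothetical data: 3 students, 4 classes each.
student = [0] * 4 + [1] * 4 + [2] * 4
attend = [40, 50, 60, 70, 60, 70, 80, 90, 80, 85, 90, 95]
major = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0]
intercept = {0: 50.0, 1: 60.0, 2: 70.0}  # unobserved beta_s, higher for high-attendance students
score = [intercept[s] + 0.5 * a + 2.0 * m for s, a, m in zip(student, attend, major)]

# Steps 2 and 3: demean within student, then run OLS on the demeaned variables.
b1, b2 = ols_two_regressors(
    demean_by_group(score, student),
    demean_by_group(attend, student),
    demean_by_group(major, student),
)
print(f"within estimates: beta1={b1:.3f}, beta2={b2:.3f}")
print(f"pooled slope on attendance: {pooled_slope(score, attend):.3f}")
```

Because the student intercepts are positively related to attendance in this made-up data, the pooled slope overstates the attendance effect, while demeaning removes the intercepts before estimation.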

Question 3
From Stock and Watson Chapter 11: Consider a model for new capital investment in a particular industry
(say, manufacturing), where the cross section observations are at the county level and there are T years of
data for each county:

$$\log(\text{invest}_{it}) = \beta_0 z_{it} + \beta_1 \text{tax}_{it} + \beta_2 \text{disaster}_{it} + \beta_i + \beta_t + \varepsilon_{it}$$

The variable $\text{tax}_{it}$ is a measure of the marginal tax rate on capital in the county. $\text{disaster}_{it}$ is a dummy
indicator equal to one if there was a significant natural disaster in county $i$ at time period $t$ (for example, a
major flood, a hurricane, or an earthquake). The variables in $z_{it}$ are other factors affecting capital investment,
and the $\beta_t$ represent different time intercepts.
a. Why is allowing for aggregate time effects in the equation important?
Solution :
The equation includes aggregate time effects because the macroeconomic environment for investing changes
over time. These macroeconomic changes affect all counties systematically and to the same extent. According to
business cycle theory, the return on investment varies across years and depends heavily on the macroeconomic
environment in a given year. During a recession, companies with cash might hold on to it and stop
investing, and consumers might have less income and be unable to save, which matters because savings are a
source of funds for investment. Finally, banks might hold on to their reserves as well (a credit crunch). Overall,
there might be fewer ideas to invest in. Thus, it is essential to include time effects in our
equation. But note that the time fixed effects are aggregate country-level time characteristics and would not
pick up any variation across counties within a time period.
b. What kinds of variables are captured in βi ?
Solution :
$\beta_i$ captures county characteristics affecting capital investment, other than $\text{tax}_{it}$ and $\text{disaster}_{it}$, that are
unobservable but do not change across years. They may include the county’s inherent investment
culture (i.e., whether investors are risk-averse or risk-loving) and the geographic location of each county $i$.
c. Interpreting the equation in a causal fashion, what sign does economic reasoning suggest for β1 ?
Solution :
One would expect a negative relationship between tax and capital investment, because a higher marginal tax
rate on capital lowers the after-tax return on capital investment: $\beta_1$ is negative. Likewise, we would expect $\beta_2$
to be negative. In periods of natural disasters, we would expect less investment due to heightened uncertainty.
d. Explain in detail how you would estimate this model; be specific about the assumptions you are making.
Solution :
We can use a regression with N − 1 county dummies, T − 1 time dummies, and a constant, and estimate
everything by OLS. The main drawback is that there are more than 3000 counties in the US, so this
might be too demanding computationally. Alternatively, we could estimate the model on the entity-demeaned variables:
we run OLS on the demeaned data with T − 1 (demeaned) time dummies and no constant. This gives unbiased estimators as
long as the county and time effects are fixed effects and there are no other omitted variables that change
across counties and over time and affect capital investment.
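For a balanced panel, entity demeaning combined with time dummies is numerically equivalent to demeaning each variable by county and by year at once, which avoids creating thousands of dummy variables. The sketch below uses this two-way demeaning in place of the dummy-variable regression described above, so it is a swapped-in but equivalent technique under the balanced-panel assumption. The data are entirely hypothetical: three counties, three years, a single regressor (the tax rate), invented county and year effects, a true tax coefficient of -2.0, and no noise. The $z_{it}$ and disaster terms are left out to keep the example short.

```python
def two_way_demean(panel):
    """Two-way within transformation for a balanced panel (rows: counties, cols: years).

    Each cell becomes x_it - mean_i - mean_t + grand_mean.
    """
    n, t = len(panel), len(panel[0])
    row_means = [sum(row) / t for row in panel]
    col_means = [sum(panel[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row_means) / n
    return [[panel[i][j] - row_means[i] - col_means[j] + grand for j in range(t)]
            for i in range(n)]

def slope_no_constant(y, x):
    """OLS slope of y on x without a constant, for panels stored as nested lists."""
    num = sum(a * b for ry, rx in zip(y, x) for a, b in zip(ry, rx))
    den = sum(b * b for rx in x for b in rx)
    return num / den

# Hypothetical balanced panel: 3 counties (rows) by 3 years (columns).
tax = [[0.10, 0.20, 0.15],
       [0.30, 0.25, 0.40],
       [0.20, 0.35, 0.30]]
county_effect = [1.0, 2.0, 3.0]   # beta_i, invented
year_effect = [0.0, 0.5, -0.5]    # beta_t, invented
true_beta1 = -2.0                 # invented tax coefficient

log_invest = [[county_effect[i] + year_effect[j] + true_beta1 * tax[i][j]
               for j in range(3)] for i in range(3)]

beta1_hat = slope_no_constant(two_way_demean(log_invest), two_way_demean(tax))
print(f"estimated tax coefficient: {beta1_hat:.3f}")
```

With an unbalanced panel the simple double-demeaning formula no longer matches the dummy-variable regression, so the time dummies would have to be included explicitly as in the text.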
