Assignment10Sol - Copy
Sahar Parsa
Fall 2024
The tenth assignment is due on Friday, December 6th, 2024. It covers the material related to quasi-experimental
methods.
Question 1
Read the following article: “The Immigration Equation,” by Roger Lowenstein. The New York Times Magazine
Section, July 9, 2006: https://www.nytimes.com/2006/07/09/magazine/the-immigration-equation.html
You can also download it from LexisNexis, which you can access through the Library. Answer the following three
questions:
a. What are the two positions that are taken on the impact of immigration on the labor market by Card
and Borjas?
Solution :
Borjas argued that immigrants hurt the economic prospects of the Americans they compete with. And now
that the most significant contingent of immigrants is poorly educated Mexicans, they hurt poorer Americans,
especially African-Americans the most.
Card, however, said that immigration is no big deal and that a lot of the opposition to it is most likely social
or cultural.
b. Explain what a natural experiment is and the importance of “natural experiments” in economics. Are
natural experiments part of the quasi-experimental toolkit? Discuss how Card uses a natural experiment
to estimate the impact of illegal immigration on the labor market. What does he find?
Solution :
A natural experiment is an observational study that shares the key properties of a randomized controlled
experiment, except that the assignment to treatment is generated outside the control of the researcher. For
instance, an earthquake could cause economic damage in an essentially random way, which one could exploit
to study the effect of economic loss on voting behavior. Card and Borjas use such natural experiments to study
the effect of immigration on local labor markets. Natural experiments are part of the quasi-experimental toolkit.
In 1980, 125,000 Cubans were suddenly permitted to immigrate. Known as Marielitos, they arrived in
South Florida with virtually no advance notice, and approximately half remained in the Miami area. They
joined an already-sizable Cuban community and swelled the city’s labor force by 7 percent. Card compared the
aftershocks in Miami (the treatment group) with the labor markets in four cities (Tampa, Atlanta, Houston, and
Los Angeles, the control group) that had not suddenly been injected with immigrants. Because the influx was
not a response to labor market conditions in Miami, cause and effect in this “natural experiment” could be
delineated.
Card concluded that the Mariel influx appears to have had virtually no effect on the wages or unemployment
rates of less-skilled workers. Two observations support this finding. First, black workers in Miami did better
than those in the control cities: their wages were fractionally higher than in 1979, while black wages in the
control cities were down. Second, although unemployment rose in all of the cities the following year, black
unemployment in Miami had retreated to below its 1979 level by 1985, while in the control cities it remained
much higher.
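The logic of this comparison can be written as a difference-in-differences. This is a sketch of the idea rather than Card's exact estimator. The notation is introduced here for illustration: $\bar{Y}$ is an average outcome (for example, wages or unemployment of less-skilled workers), and "pre" and "post" refer to the periods before and after the 1980 influx.

```latex
\hat{\delta} =
  \left(\bar{Y}_{\text{Miami, post}} - \bar{Y}_{\text{Miami, pre}}\right)
  - \left(\bar{Y}_{\text{control, post}} - \bar{Y}_{\text{control, pre}}\right)
```

If Miami and the control cities would have followed the same trend without the influx, $\hat{\delta}$ isolates its effect. Card's conclusion corresponds to a $\hat{\delta}$ close to zero.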
c. Explain the evidence that Borjas generates to support his argument. What data does he use?
Solution :
According to the graph Borjas generated, during the 1980s and 1990s, for instance, immigrants caused high-school
dropouts to suffer a 5 percent wage decline relative to college graduates. Assuming businesses did not hire any
of the new immigrants, Borjas’s finding would translate to a hefty 9 percent wage loss for the unskilled over
those two decades and lesser declines for other groups.
Question 2
In 1985, neither Florida nor Georgia had laws banning open alcohol containers in vehicle passenger
compartments. By 1990, Florida had passed such a law, but Georgia had not.
a. Suppose you can collect random samples of the driving-age population in both states, for 1985 and
1990. Let arrest be a binary variable equal to unity if a person was arrested for drunk driving during
the year. Without controlling for any other factors, write down a linear probability model that allows
you to test whether the open container law reduced the probability of being arrested for drunk driving.
Which coefficient in your model measures the effect of the law?
Solution :
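$$
\begin{aligned}
Arrest_{c90} &= \beta_c + \beta_{90} + \beta_3 \times FL_{c90} + \mu_{c90} \\
Arrest_{c85} &= \beta_c + \mu_{c85} \\
Arrest_{c90} - Arrest_{c85} &= \beta_{90} + \beta_3 \times FL_{c90} + \mu_{c90} - \mu_{c85}
\end{aligned}
$$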
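Note that β90 appears as the constant in the first-difference equation, where it captures the time fixed effect.
Hence, the first-difference model requires a constant. Here c indexes the state and FLc90 equals one for Florida
in 1990, the only state-year in which the law is in force, so β3 is the coefficient that measures the effect of the law.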
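As a rough illustration, the same effect can be estimated from pooled individual-level samples with a linear probability model that includes a state dummy, a year dummy, and their interaction. The sketch below uses simulated data. The variable names (`florida`, `y90`), the sample size, and all probabilities are assumptions made for the example, not values from the assignment. The interaction coefficient plays the role of β3.

```r
# Simulated pooled cross-sections; all numbers below are assumed for illustration
set.seed(1)
n <- 200000
florida <- rbinom(n, 1, 0.5)   # 1 = Florida, 0 = Georgia
y90     <- rbinom(n, 1, 0.5)   # 1 = sampled in 1990, 0 = sampled in 1985

# Assumed arrest probabilities: 5% baseline, +1pp state gap,
# +1pp time trend, and a -2pp effect of the law (Florida in 1990 only)
p <- 0.05 + 0.01 * florida + 0.01 * y90 - 0.02 * florida * y90
arrest <- rbinom(n, 1, p)
sim <- data.frame(arrest, florida, y90)

# Linear probability model: the interaction coefficient is the
# difference-in-differences estimate of the law's effect
lpm <- lm(arrest ~ florida * y90, data = sim)
did_hat <- unname(coef(lpm)["florida:y90"])

# The same number computed directly from the four cell means
m <- with(sim, tapply(arrest, list(florida, y90), mean))
did_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

print(c(regression = did_hat, cell_means = did_means))
```

Because the model is saturated in the two dummies, the regression estimate and the cell-mean calculation coincide. In practice one would also report heteroskedasticity-robust standard errors, since the error variance in a linear probability model depends on the regressors.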
Question 3
In politics, there is a large literature exploring whether Democratic politicians differ from Republican politicians
in office. To that end, scholars have used a variety of outcome variables, including data on spending and
revenues. Consider the following model:
$$Y_i = \beta_0 + \beta_1 \times Democrat_i + \varepsilon_i$$
where $Y_i$ is a measure of spending in city $i$ and $Democrat_i$ is a dummy equal to one if the mayor is a Democrat
and zero otherwise.
a. Can you interpret what β1 is capturing?
Solution :
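β1 captures the difference in average spending between cities with a Democratic mayor and cities with a
non-Democratic mayor. Under a causal interpretation, it is the effect of having a Democratic mayor on city spending.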
b. Can you explain why, theoretically, β1 should be different from zero? Do you expect it to be positive
or negative?
Solution :
It is commonly believed that Democrats are more likely than Republicans to support social policies, increase
government involvement, and increase government spending. We would therefore expect β1 to be positive.
c. Explain why estimating β1 with OLS will generate a biased estimator.
Solution :
1. One source of endogeneity is voters’ preferences. Factors such as labor-market conditions, voter
characteristics, the quality of candidates, the resources available for campaigns, and other unmeasured
characteristics of cities and candidates influence who wins the election. If they also affect spending,
they bias estimates of the impact of the mayor’s party allegiance.
2. Endogeneity problems also come from cities’ economic and demographic characteristics. These
characteristics include population, GDP level, and whether the city is located in the South.
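Both points are instances of omitted variable bias. As a sketch, using the notation of the model in this question, the standard large-sample result for the OLS slope in a bivariate regression is:

```latex
\operatorname{plim} \hat{\beta}_1
  = \beta_1 + \frac{\operatorname{Cov}(Democrat_i, \varepsilon_i)}{\operatorname{Var}(Democrat_i)}
```

The second term is nonzero whenever unobserved determinants of spending, such as voter preferences, are correlated with the mayor's party.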
d. Suppose you have information about all the elections in all American cities from 1950 to today, including the
margin of victory of the party in office. Can you use this variable and restrict the dataset to the close
elections, i.e., margins of victory of less than 1%, to estimate the regression above with OLS?
Solution :
The margin of victory is the difference between the percentage of the vote cast for the winner and for the
runner-up, i.e., the candidate who finishes second. One could restrict the sample to elections that the winning
candidate won by a small margin, i.e., less than 1%. When so few votes separate the candidates, the outcome of
the race can be seen as almost random. In such races, one can compare cities where a Democrat won by a few
votes to cities where a Republican won by a few votes. These cities would on average look the same ex ante and
would differ only in the party of the mayor. This could give us a natural experiment for the effect of the party
on government spending. In fact, a number of academic studies have used this method, known as a regression
discontinuity design.
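The close-election idea can be illustrated with a small simulation. Everything in this sketch is assumed for the example: the confounder `pref` (standing in for voter preferences), the signed Democratic margin `margin` in percentage points, the true effect of 1, and the other coefficients. It compares OLS on the full sample with OLS restricted to races decided by less than one point.

```r
# Simulated cities; all variable names and parameter values are hypothetical
set.seed(42)
n <- 50000
pref <- rnorm(n)                          # unobserved voter preferences (assumed)
margin <- 10 * pref + rnorm(n, sd = 10)   # Democratic vote margin, in points
democrat <- as.integer(margin > 0)        # Democrat wins if the margin is positive
true_effect <- 1
spending <- 5 + true_effect * democrat + 2 * pref + rnorm(n)
cities <- data.frame(spending, democrat, margin)

# OLS on all elections: biased, because pref drives both the
# election outcome and spending
naive <- unname(coef(lm(spending ~ democrat, data = cities))["democrat"])

# OLS on close elections only (margin below 1 point in absolute value)
close_races <- subset(cities, abs(margin) < 1)
close_est <- unname(coef(lm(spending ~ democrat, data = close_races))["democrat"])

print(c(naive = naive, close = close_est, truth = true_effect))
```

In this setup the full-sample estimate is far from the true effect, while the close-election estimate is much nearer to it. A small bias remains because winners and losers still differ slightly within the window, which is one reason applied work often fits a regression on each side of the threshold instead of comparing raw means.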
e. Will it lead to an unbiased estimator? Explain why?
Solution :
Because the identity of the elected mayor is determined by a small margin, it is less likely to be correlated
with city-level characteristics that could drive a difference in spending between cities run by a Democrat and
cities run by a Republican. The estimator is therefore approximately unbiased within the sample of close elections.
f. Does it lead to an external validity problem? Explain why?
Solution :
This methodology has external validity issues because the estimated effect applies only to very competitive
races, in which the candidate won by a small margin. Highly competitive races differ from races with a larger
margin, because a larger margin could give the winner more room to implement policies in line with their other
identities as opposed to their partisan identity.
The model therefore applies only to the cities with a small margin of victory (competitive races) and estimates
the partisan effect only within this subset. This is known as a local treatment effect.
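library(dplyr)  # provides %>%, mutate(), select(), and filter()

# Convert age to numeric (via character, in case it is stored as a factor)
# and drop rows that contain any missing value
mydata <- mydata %>%
  mutate(Age = as.numeric(as.character(age))) %>%
  na.omit()

# Keep white women aged 25 to 49 and the variables used below
white_female <- mydata %>%
  select(year, sex, Age, race, educ, labforce, wkswork1) %>%
  filter(sex == "female", race == "white", Age >= 25, Age <= 49)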
library(arsenal)
##
## Attaching package: 'arsenal'
## The following object is masked from 'package:lubridate':
##
## is.Date
newdata1 <- white_female %>%
  group_by(year)

my_controls <- tableby.control(
  total = FALSE,
  test = FALSE,
  numeric.stats = c("meansd", "medianq1q3", "min", "max"),
  cat.stats = c("countpct"),
  ordered.stats = c("countpct"),
  stats.labels = list(
    meansd = "Mean ± SD",
    medianq1q3 = "Median (IQR)",
    min = "Min",
    max = "Max"
  )
)
b. Do you observe anything unusual about the dataset? Do you think there could be something wrong
with the data? How would you describe the problem?
Solution :
According to the descriptive statistics table, the average number of weeks worked decreased from 12.35 to
3.70. It is also unusual that the share of women with no schooling increased from 12.5% to 74.3%. Changes of
this size are implausible and suggest that the dataset contains measurement error, including recall error. We
should correct this before using the data. Otherwise, using these variables as covariates could bias our estimators.
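A quick way to spot this kind of problem is to compute the key summaries by year and look for implausible jumps. The sketch below uses a tiny invented data frame in place of the assignment's extract. The values, the two years, and the label `"no schooling"` are assumptions for illustration only.

```r
# Toy stand-in for the real extract; every value here is invented
toy <- data.frame(
  year     = c(1980, 1980, 1980, 1990, 1990, 1990),
  wkswork1 = c(40, 0, 52, 0, 0, 10),
  educ     = c("grade 12", "no schooling", "college",
               "no schooling", "no schooling", "grade 12")
)

# Mean weeks worked and share with no schooling, by year
mean_weeks <- tapply(toy$wkswork1, toy$year, mean)
share_none <- tapply(toy$educ == "no schooling", toy$year, mean)

by_year <- data.frame(
  year               = names(mean_weeks),
  mean_weeks         = as.numeric(mean_weeks),
  share_no_schooling = as.numeric(share_none)
)
print(by_year)
```

A sharp drop in `mean_weeks` together with a jump in `share_no_schooling` between years, as in the table described above, would suggest a change in coding or a data error and not a real change in behavior.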