
Assignment 2023 2024

This document provides instructions for completing two exercises analyzing financial and public health data using regression analysis in Stata. Exercise A involves analyzing the relationship between dengue prevalence, humidity, and temperature in different regression models. Exercise B consists of an event study of Apple's stock price response to the January 2007 iPhone announcement using abnormal return calculations and a segmented linear regression model with Newey-West standard errors. The exercises must be completed and submitted as a .do and .pdf file by October 10th.

Uploaded by

mariana

Assignment

Deadline: Monday, October 10th


You may work in groups of up to 7 students. Submit two files: a working Stata
script (a .do file) and a PDF document with the written comments that some
questions require. Upload both as a single zipped file to Moodle. Remember to
write your names and student numbers in each file (as comments in the .do file,
of course).

Exercise A: Regressions

1. Load the dengue.csv file from this site. Documentation on the variables is
available here.

• In Stata, some of the variables will be read as text. Use destring *, replace
force to turn them numeric.
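
As a rough illustration of what the destring step does, the sketch below builds a tiny dataset whose numeric columns arrive as text and then converts them. The variable names and values are invented for the example and are not taken from dengue.csv; with the real file you would first load it, for instance with import delimited, assuming it sits in your working directory.

```stata
* Minimal sketch: numeric columns that were read as text.
* humid and temp are hypothetical names; the values are made up.
clear
input str6 humid str6 temp
"25.1" "18.3"
"NA"   "21.0"
"30.4" "NA"
end

* force converts anything non-numeric (here "NA") to missing (.)
destring *, replace force

describe
list
```

Because force silently turns unreadable entries into missing values, it is worth checking how many missings each variable ends up with.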

2. Run an OLS regression using average humidity to predict whether dengue was observed
in the area, and look at the results.

3. Write two sentences, one interpreting the intercept and one interpreting the slope.
4. Get a set of summary statistics for the humidity variable and write a comment on how
this can help you make sense of the intercept in the regression from step 2.

5. We might recognize that, if we’re interested in the effect of humidity on dengue,
temperature might be on a back-door path. Add a control for temperature, rerun the
regression, and show the results.
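
Steps 2 to 5 could look roughly like the sketch below. It runs on simulated data so that it is self-contained; dengue, humid and temp are hypothetical variable names, so check the dataset documentation for the real ones.

```stata
* Synthetic stand-in for the dengue data (all names are hypothetical)
clear
set seed 12345
set obs 500
generate temp   = rnormal(20, 5)
generate humid  = 5 + 0.8*temp + rnormal(0, 3)
generate dengue = runiform() < invlogit(-4 + 0.15*humid)

* Step 2: OLS (linear probability model) of dengue on humidity
regress dengue humid

* Step 4: the observed range of humidity shows how far humid = 0 is from the data
summarize humid, detail

* Step 5: add temperature as a control
regress dengue humid temp
```

Comparing the humidity coefficient across the two regress outputs shows how much of the raw association is absorbed by temperature.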

6. A long one: Now let’s say we’re directly interested in the relationship between
temperature and humidity. Run an OLS regression of humidity on temperature.
Calculate the residuals of that regression, and then make a plot that will let you evaluate
whether there is likely heteroskedasticity in the model. Rerun the model with
heteroskedasticity-robust standard errors. Show both models, and say whether you think
there is heteroskedasticity.

• In Stata, you can access residuals using predict resid, r after running the model.
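
The residual check in step 6 might be sketched as follows. The data are simulated with an error spread that grows with temperature, so the fan shape is built in; the variable names are assumptions.

```stata
* Simulated data, heteroskedastic by construction (names are hypothetical)
clear
set seed 2023
set obs 300
generate temp  = runiform(5, 30)
generate humid = 2 + 0.9*temp + rnormal(0, 0.2*temp)

regress humid temp
predict resid, residuals

* Residuals against the regressor: a fan shape suggests heteroskedasticity
scatter resid temp, yline(0)

* Same coefficients, heteroskedasticity-robust standard errors
regress humid temp, vce(robust)
```

Only the standard errors change between the two models; the point estimates are identical.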

7. In the graph in the last problem you may have noticed that for certain ranges of
temperature, the errors were clearly nonzero on average. This can indicate a functional
form problem. Run the model from question 6 again (with heteroskedasticity-robust
standard errors), but this time use the logarithm of humidity in place of humidity. Add a
sentence interpreting the coefficient on temperature.

• In Stata, you’ll need to create logged humidity in the data before running the regression.
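
A minimal sketch of the logged-outcome model, again on simulated data with assumed variable names. The two display lines illustrate the usual reading of a coefficient when the dependent variable is in logs.

```stata
* Simulated data where log(humid) is linear in temp (names are hypothetical)
clear
set seed 77
set obs 300
generate temp  = runiform(5, 30)
generate humid = exp(1 + 0.05*temp + rnormal(0, 0.2))

* The logged variable must exist in the data before it is used in regress
generate log_humid = ln(humid)
regress log_humid temp, vce(robust)

* With a logged outcome, 100*b approximates the percent change in humidity
* per one-unit rise in temperature; 100*(exp(b) - 1) is the exact figure
display "approx. % change per unit: " 100*_b[temp]
display "exact % change per unit:   " 100*(exp(_b[temp]) - 1)
```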

Exercise B: Event studies

1. We are going to download some financial data, using a package designed to do so. Download the
closing price of Apple stock (AAPL) each day from December 1, 2006 through January 31, 2007
from Yahoo Finance. Then, rename the closing price AAPL_Close, and merge the data with
the S&P 500 closing price on the same day (which you’ll call SP500_Close) found
in SP500_Historicalnew2.csv.

Finally, sort the data in date order, and create the columns AAPL_RR and SP500_RR which are
the day-to-day percentage increases (RR, or rate of return), which you can calculate by taking
the closing price, subtracting the closing price one row above, and then dividing all of that by
the price one row above.

• In Stata, use the getsymbols command from the getsymbols package (ssc install
getsymbols). You will probably want to get the S&P500 data prepared to merge in first (and
saved as a .dta file) before using getsymbols. You’ll need to do some date finagling in the
S&P500 data to get the date variable recognized as a date. You can do this by generating a new
variable daten = date(date, "YMD"). Finally, use merge 1:1 daten to merge
together your getsymbols and S&P500 data. Then, to create the RR variables, use sort to
sort the data, and AAPL_Close[_n-1] (or similarly for S&P) to access the price one row
above.
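
The date conversion, merge and rate-of-return steps could be sketched as below. To stay self-contained the example skips getsymbols and the CSV file and instead types in two tiny toy datasets; the prices are invented round numbers, not real quotes, and the layout of the real files may differ.

```stata
* Toy S&P 500 data with string dates, as they might appear in a CSV
clear
input str10 date SP500_Close
"2006-12-01" 1000
"2006-12-04" 1010
"2006-12-05" 1005
end
generate daten = date(date, "YMD")
format daten %td
drop date
tempfile sp500
save `sp500'

* Toy Apple data standing in for the downloaded prices
clear
input str10 date AAPL_Close
"2006-12-01" 100
"2006-12-04" 102
"2006-12-05" 101
end
generate daten = date(date, "YMD")
format daten %td
drop date

merge 1:1 daten using `sp500'
sort daten

* Rate of return: (price - previous price) / previous price
generate AAPL_RR  = (AAPL_Close  - AAPL_Close[_n-1])  / AAPL_Close[_n-1]
generate SP500_RR = (SP500_Close - SP500_Close[_n-1]) / SP500_Close[_n-1]
list daten AAPL_Close AAPL_RR SP500_Close SP500_RR
```

The first row has no previous price, so both RR variables are missing there. Checking _merge after the merge shows whether every trading day was matched.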

2. Apple announced the iPhone on January 7, 2007. Let’s start by graphing the event study of how
this affected the Apple stock price. Make a line graph of the AAPL_RR on the y-axis and date on
the x-axis. Add a vertical line at the event date.

Write a line commenting on whether it seems like the announcement improved Apple’s stock price.

Language-specific instructions:

• In Stata, use the twoway command, which can graph multiple lines (one for the line graph and
another for the vertical line) using parentheses: twoway () (). The line graph should be in
the first set of parentheses, and in the second you can do function y = eventdate,
horiz ra(bottom top), where eventdate is the event date, which you can make
with date("YYYY-MM-DD", "YMD"), and bottom top are the bottom and top of the
range you want it to draw the line for.
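
One way the twoway call described above might be put together is sketched here. The return series is simulated on consecutive calendar days for simplicity, and the locals eventdate, bottom and top are names chosen for this illustration.

```stata
* Simulated daily returns starting 1 Dec 2006 (not real data)
clear
set seed 9
set obs 40
generate daten = td(01dec2006) + _n - 1
format daten %td
generate AAPL_RR = rnormal(0, 0.02)

* Event date as stated in the assignment text
local eventdate = date("2007-01-07", "YMD")

* Draw the vertical line over the observed range of the series
summarize AAPL_RR
local bottom = r(min)
local top    = r(max)

twoway (line AAPL_RR daten) ///
       (function y = `eventdate', horizontal range(`bottom' `top')), ///
       legend(off) xtitle("Date") ytitle("AAPL daily return")
```

With the horizontal option, the constant function is drawn along the y-axis, which produces a vertical line at the event date.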

3. Calculate the abnormal return for Apple in each period (both pre-event and post-event) using
the mean method. Be sure to only use data from the observation period to calculate the mean.
Save this new variable as AAPL_AR_mean.

(Note: the process of making AAPL_RR has created a missing value for the first day. You may
have to deal with this.)

4. Calculate the abnormal return for Apple in each period (both pre-event and post-event) using
the market method. Save this new variable as AAPL_AR_market.

5. Calculate the abnormal return for Apple in each period (both pre-event and post-event) using
the risk-adjusted method. Use an OLS regression to estimate the relationship between Apple and
the market. Save this new variable as AAPL_AR_risk.
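
The three abnormal-return definitions in steps 3 to 5 can be sketched side by side. The returns below are simulated, and the sketch assumes that the observation period is every day before the event date, which the assignment text does not spell out; adjust obs_period if your course defines it differently.

```stata
* Simulated returns (not real data); first row missing, as in the real RR series
clear
set seed 42
set obs 40
generate daten = td(01dec2006) + _n - 1
format daten %td
generate SP500_RR = rnormal(0.0005, 0.01)
generate AAPL_RR  = 0.001 + 1.2*SP500_RR + rnormal(0, 0.015)
replace AAPL_RR  = . in 1
replace SP500_RR = . in 1

* Assumption: observation period = all days before the event date
local eventdate = date("2007-01-07", "YMD")
generate byte obs_period = daten < `eventdate'

* Mean method: subtract the observation-period average return
summarize AAPL_RR if obs_period
generate AAPL_AR_mean = AAPL_RR - r(mean)

* Market method: subtract the market return on the same day
generate AAPL_AR_market = AAPL_RR - SP500_RR

* Risk-adjusted method: fit on the observation period, predict for all days
regress AAPL_RR SP500_RR if obs_period
predict AAPL_hat, xb
generate AAPL_AR_risk = AAPL_RR - AAPL_hat
```

summarize and regress skip missing values on their own, so the missing first day only shows up as a missing abnormal return in that row.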

6. Make a graph of each of the AR variables, like in step 2 (ideally, put all three on the same set of
axes and label them, but this is not required). Then, make a comment on what the event study
results look like, and whether they’re different by method.

7. Using AAPL_AR_market, show an estimate of the effect of the announcement on the stock
price on the day of the announcement. Then, create a basic t-statistic for the effect using the
standard deviation of the pre-event AAPL_AR_market. Use this t-statistic to create a p-value
for the test against the null hypothesis that the effect is 0. Show the estimate, t-statistic, and
p-value.

You can get a p-value for a two-sided test by making your t-statistic negative if it isn’t already,
feeding that to the cumulative standard normal distribution function, and then multiplying that
by 2.

Finally, write a line explaining whether the effect is statistically significant at the 99% level
(that is, whether the p-value is below 0.01).

• In Stata, the cumulative standard normal distribution function is normal(). Note you can pull
a value out and store it as a local, which may be handy for this. For example, summarize
AAPL_AR_market followed by local AR_mkt_mean = r(mean).
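
The estimate, t-statistic and p-value from step 7 could be computed along these lines. The abnormal returns are simulated on consecutive calendar days so that the toy data has a row for the event date; the local names are chosen for this example.

```stata
* Simulated abnormal returns (not real data)
clear
set seed 101
set obs 40
generate daten = td(01dec2006) + _n - 1
format daten %td
generate AAPL_AR_market = rnormal(0, 0.015)

local eventdate = date("2007-01-07", "YMD")

* Effect estimate: the abnormal return on the event day
summarize AAPL_AR_market if daten == `eventdate'
local effect = r(mean)

* The pre-event standard deviation plays the role of the standard error
summarize AAPL_AR_market if daten < `eventdate'
local sd_pre = r(sd)

* Two-sided p-value: make t negative, take the normal CDF, double it
local t = `effect' / `sd_pre'
local p = 2 * normal(-abs(`t'))

display "estimate = " `effect' "   t = " `t' "   p = " `p'
```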

8. Comment on whether you think it would make sense or not to estimate this event study design
using a segmented linear regression (i.e. Y = Time + AfterEvent + Time*AfterEvent), and why.

9. Regardless of your answer to 8, estimate this event study design using a segmented linear
regression, counting the event day in the “after” period (and creating an After variable to use
in regression). However, instead of any of the AR measures, use AAPL_Close as the dependent
variable (think to yourself - why am I having you do this?). Use Newey-West standard errors
with a maximum of 3 lags (see Chapter 13). Write a line describing the estimated effect you found.
(Tip: Create a variable Time that’s just the row number and use that as your Time variable. This
may be easier to work with in a regression than a date, and also doesn’t run into issues with the
fact that this data doesn’t have weekends.)
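
A sketch of the segmented regression with Newey-West standard errors follows. The price series is simulated with a built-in jump and slope change, so the numbers mean nothing beyond showing the mechanics. Time and After follow the names used in the assignment; everything else is assumed.

```stata
* Simulated price series (not real data)
clear
set seed 7
set obs 60
generate daten = td(01dec2006) + _n - 1
format daten %td

* Row-number time index, as the tip suggests
sort daten
generate Time = _n

* The event day itself counts as "after"
local eventdate = date("2007-01-07", "YMD")
generate byte After = daten >= `eventdate'

generate AAPL_Close = 85 + 0.1*Time + After*(5 + 0.3*(Time - 38)) + rnormal(0, 1)

* newey requires the data to be declared as a time series
tsset Time

* Time, After and their interaction; Newey-West SEs with up to 3 lags
newey AAPL_Close c.Time##i.After, lag(3)
```

In this parameterization the coefficient on After is the gap between the two fitted lines at Time = 0, not at the event day. Centering Time on the event row before estimating is one way to make that coefficient the jump at the event.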
