HW 2 Write-Up
Problem 1
If an aircraft is present in a certain area, a radar correctly registers its presence with probability
0.99. If it is not present, the radar falsely registers an aircraft presence with probability 0.10.
Suppose that on average across all days, an aircraft is present with probability 0.05.
The radar today just registered the presence of an aircraft. What is the probability that an
aircraft is actually present? Make sure to show your work.
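Let A be the event that an aircraft is present and B the event that the radar registers one.
We know that:
P(B|A) = 0.99
P(A) = 0.05
P(B|A’) = 0.10
P(A’) = 1 - 0.05 = 0.95
By Bayes' rule:
P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A’)P(A’)]
= 0.0495 / (0.0495 + 0.095) = 0.0495 / 0.1445 ≈ 0.343
So there is only about a 34% chance that an aircraft is actually present.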
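Plugging the four values above into Bayes' rule can be checked numerically. Below is a minimal sketch in Python (not the R used elsewhere in this write-up); the function and argument names are illustrative assumptions, not part of the assignment.

```python
def posterior_present(p_detect, p_false_alarm, p_present):
    """Return P(A|B): the probability an aircraft is present given a radar hit.

    p_detect      = P(B|A), radar registers an aircraft that is present
    p_false_alarm = P(B|A'), radar registers an aircraft that is not present
    p_present     = P(A), prior probability that an aircraft is present
    """
    p_absent = 1.0 - p_present
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
    p_hit = p_detect * p_present + p_false_alarm * p_absent
    return p_detect * p_present / p_hit


posterior = posterior_present(p_detect=0.99, p_false_alarm=0.10, p_present=0.05)
print(round(posterior, 4))  # 0.3426
```

Even with a 99% detection rate, the posterior stays low because aircraft are rare (5%) and the false-alarm term 0.10 × 0.95 outweighs the true-detection term 0.99 × 0.05.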
Problem 2
Whose creatinine clearance rate is healthier (higher) for their age: a 40-year-old with a rate of
135, or a 60-year-old with a rate of 112?
C) Rather than comparing observed and predicted values across the whole data set, I predicted
the expected rate at each of the two ages directly and compared each person against it.
I built a new data frame containing the two ages I wanted to test (40 and 60) and used the
model from Part A to predict the creatinine clearance rate at each age.
The 40-year-old has an observed rate of 135 mL/minute, and the model predicts 123.02
mL/minute for that age, so they are about 12 mL/minute above average.
The 60-year-old has an observed rate of 112 mL/minute, and the model predicts 110.62
mL/minute, so they are only about 1.4 mL/minute above the average creatinine clearance
rate for a person of this age.
Therefore, we can conclude that the 40-year-old has the healthier rate for their age: their
rate is well above the model's prediction, whereas the 60-year-old is almost exactly at the
predicted value.
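The comparison comes down to each person's residual (observed rate minus the rate predicted for their age). Here is a minimal Python sketch of that arithmetic, using only the predictions quoted in this write-up; the variable names are illustrative, and the implied slope assumes the Part A model is a straight line in age.

```python
# Model predictions quoted in the write-up, in mL/minute, keyed by age.
predicted = {40: 123.02, 60: 110.62}
# Observed creatinine clearance rates from the question, in mL/minute.
observed = {40: 135.0, 60: 112.0}

# Residual = observed - predicted; a larger residual means "healthier for their age".
residuals = {age: observed[age] - predicted[age] for age in observed}
healthier_age = max(residuals, key=residuals.get)

# Slope implied by the two predictions (assumes a linear model in age).
implied_slope = (predicted[60] - predicted[40]) / (60 - 40)

for age, resid in residuals.items():
    print(f"age {age}: residual = {resid:.2f} mL/minute")
print(f"healthier for their age: the {healthier_age}-year-old")
print(f"implied slope: {implied_slope:.2f} mL/minute per year")
```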
Problem 3
Is the observed data (49 cases out of 1072) consistent with the null hypothesis that, over the
long run, elderly residents within 5 miles of the power plant experience COPD at the national
background rate?
Use Monte Carlo simulation (with at least 100,000 simulations) to calculate a p-value under this
null hypothesis.
Include the following items in your write-up:
• the null hypothesis you are testing: over the long run, elderly residents within 5 miles
of the power plant experience COPD at the national background rate
• the test statistic you used to measure evidence against the null hypothesis: the number of
COPD cases among the 1072 residents, whose observed value is 49 (49/1072, or about 4.6%)
• a picture of the probability distribution of the test statistic, assuming that the null hypothesis is
true
• and a one-sentence conclusion about whether you think the null hypothesis looks plausible in
light of the data.
In light of the data, the null hypothesis does not look plausible: the simulated p-value is very
low (below 0.05), which is strong evidence against the null hypothesis.
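The Monte Carlo p-value described in this problem can be sketched as follows. This is a minimal illustration in Python (not the R used elsewhere in this write-up), under stated assumptions: the function name and arguments are hypothetical, the p-value is taken as one-sided (simulated counts at least as large as the observed one), and the national background rate is passed in as `rate` because its value is not reproduced in this write-up.

```python
import random


def simulate_p_value(n, observed, rate, n_sims=100_000, seed=0):
    """Estimate P(count >= observed) under the null hypothesis.

    Under the null, each of the n residents independently has COPD with
    probability `rate`. Each simulation draws one such sample and counts cases.
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # One simulated neighborhood: n independent yes/no draws.
        count = sum(rng.random() < rate for _ in range(n))
        if count >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims
```

For this problem the call would look like `simulate_p_value(n=1072, observed=49, rate=background_rate)`, where `background_rate` stands for the national rate given in the problem statement. The pure-Python loop is slow at 100,000 simulations; a vectorized binomial sampler would be the usual choice in practice.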
The risk of a firm's stock can be divided into unsystematic risk (specific to the firm) and
systematic risk (related to the market as a whole). Beta is the measure of systematic risk: it
is the slope of the regression of the stock's return on the market's return. For example, if the
market return goes up by one percentage point, beta is the expected change, in percentage
points, in the stock's return. Beta therefore measures a stock's volatility in relation to the
overall market. High-beta stocks carry more systematic risk but offer a higher potential return,
while low-beta stocks carry less systematic risk and tend to offer lower returns. This market
beta should not be confused with standardized beta coefficients, which are the coefficients that
would result if the variables in a linear regression were converted to z-scores. In our
regression model, the slope term is beta, the intercept is alpha, and R^2 measures how much of
the variation in the stock's return is explained by the market's return.
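To make the roles of beta, alpha, and R^2 concrete, here is a minimal sketch of a one-variable least-squares fit in Python (not the R used for the actual analysis). The function name is hypothetical and the return series are made up purely for illustration; they are not the data behind the table.

```python
def fit_market_model(stock_returns, market_returns):
    """Least-squares fit of stock = alpha + beta * market; returns (alpha, beta, r_squared)."""
    n = len(market_returns)
    mean_x = sum(market_returns) / n
    mean_y = sum(stock_returns) / n
    sxx = sum((x - mean_x) ** 2 for x in market_returns)
    syy = sum((y - mean_y) ** 2 for y in stock_returns)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(market_returns, stock_returns))
    beta = sxy / sxx                     # slope: systematic risk
    alpha = mean_y - beta * mean_x       # intercept
    r_squared = sxy ** 2 / (sxx * syy)   # share of variation explained by the market
    return alpha, beta, r_squared


# Made-up daily returns, for illustration only.
market = [0.010, -0.020, 0.005, 0.015, -0.010]
stock = [0.001 + 0.5 * m for m in market]  # moves exactly half as much as the market

alpha, beta, r_squared = fit_market_model(stock, market)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, R^2 = {r_squared:.3f}")
```

Because the made-up stock is an exact linear function of the market, the fit recovers a beta of 0.5 and an R^2 of 1; real return data are noisy, so R^2 is typically well below 1.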
• the table itself, along with an informative caption below the table, no more than 2-3 sentences
in length, to give readers the information necessary to interpret the table.
How to read the table, with an example:
● To build this table, I used the stargazer package in R, which takes one line of code, and
exported its output to Overleaf, a LaTeX editor, to render the table
● In the column corresponding to AAPL, the SPY row gives beta = 1.066, the intercept row
gives alpha = 0.009, and R^2 = 0.013
This table shows each of the 6 stocks regressed on the return of the S&P 500 (SPY). Beta is the
slope coefficient of each model and alpha is the intercept. R^2 = 0.648 is highest for Google,
so the market explains more of the variation in Google's returns than in any other stock's.
Takeaway: higher values of R^2 indicate a stronger model, while smaller values indicate a
weaker model.
• a conclusion that answers two questions: in light of your analysis, which of these six stocks
has the lowest systematic risk? And which has the highest systematic risk? (Again, watch the
video to understand how this is measured using the regression model.)
Systematic risk is the risk associated with the overall market going up and down. In our
regression model, systematic risk is measured by beta, the slope coefficient on SPY. The beta
of the average firm is 1.0.
The highest SPY coefficient indicates the stock with the most systematic risk. Apple has the
highest systematic risk (beta = 1.066), with potentially higher returns, and Walmart has the
lowest (beta = 0.519), with more consistent but probably lower returns.