HWK6 Stats

The document provides instructions for homework assignment 6 on statistics. It covers topics like submitting homework, using R code, and explaining answers. It then provides exercises on interval estimation, hypothesis testing, and bootstrap confidence intervals using a dataset on cherry tree measurements.

Uploaded by

aakelley3

Stat 371 Homework #6

Allison Kelley

• Submit your homework to Canvas by the due date and time. Email your lecturer if you have
extenuating circumstances and need to request an extension.
• If an exercise asks you to use R, include a copy of all relevant code and output in your submitted
homework file. You can copy/paste your code, take screenshots, or compile your work in an
Rmarkdown document.
• If a problem does not specify how to compute the answer, you may use any appropriate method. I
may ask you to use R or use manual calculations on your exams, so practice accordingly.
• You must include an explanation and/or intermediate calculations for an exercise to be complete.
• Be sure to submit the HWK6 Autograde Quiz which will give you ~20 of your 40 accuracy points.
• 50 points total: 40 points accuracy, and 10 points completion

Interval estimation for a population proportion


Exercise 1. An automobile club pays for emergency road services (ERS) requested by its members. Upon
examining a sample of 2927 ERS calls from the club members, the club finds that 1499 calls related to
starting problems, 849 calls involved serious mechanical failures requiring towing, 498 calls involved flat
tires or lockouts, and 81 calls were for other reasons.
a. Construct a 98% confidence interval “by hand” for the proportion of all ERS calls from
club members that are serious mechanical problems requiring towing services (after checking
that necessary assumptions are well met).
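n = 2927 calls in total; 849 involved serious mechanical problems requiring towing
p-hat = 849/2927 = 0.29
p-hat ~ approx. N(0.29, 0.29(1 - 0.29)/2927) = N(0.29, 0.00007), so standard error = sqrt(0.00007) = 0.0084
n(0.29) >= 5? 848.83 YES
n(1 - 0.29) >= 5? 2078.17 YES
98% CI: alpha = 1 - 0.98 = 0.02, alpha/2 = 0.02/2 = 0.01
p-hat (point est.) +/- crit value x standard error
0.29 +/- qnorm(0.99) x standard error, where qnorm(0.99) = 2.326
(0.2705, 0.3096), carrying unrounded values through the calculation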
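As a check on the hand calculation, the interval can be sketched in R. This is a minimal sketch that assumes the large-sample normal (z) interval for a proportion; the variable names are illustrative and not part of the assignment.

```r
# Counts from the exercise
n <- 2927   # total ERS calls sampled
x <- 849    # calls with serious mechanical failures requiring towing

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error of p-hat

# Large-sample conditions: both expected counts should be at least 5
conditions_ok <- (n * p_hat >= 5) && (n * (1 - p_hat) >= 5)

# A 98% confidence level leaves alpha/2 = 0.01 in each tail
z_crit <- qnorm(0.99)
ci_prop <- p_hat + c(-1, 1) * z_crit * se
round(ci_prop, 4)
```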
b. The current policy rate the automobile club pays is based on the thought that 20% of
services requested will be serious mechanical problems requiring towing. However, the
insurance company claims that the auto club has a higher rate of serious mechanical
problems requiring towing services. Using your confidence interval in part (a), respond to the
insurance company’s claim.
The data support the insurance company's claim. We are 98% confident that the true proportion of
ERS calls involving serious mechanical problems that require towing is between 27.05% and 30.96%
(the CI endpoints were 0.2705 and 0.3096). The entire interval lies above 20%, so the true
population proportion requiring towing appears to be higher than the 20% assumed by the current policy rate.

c. The club wants to construct a 95% confidence interval for the proportion of members who
want a chocolate fountain at the annual picnic. They want the margin of error to be less
than 0.01. How large of a random sample of club members should they contact if they
start with the assumption that 50% are in favor of a chocolate fountain at the picnic?
(Hint: write out the formula for margin of error, then solve for n)

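Margin of error = z*sqrt(p-hat(1 - p-hat)/n)

qnorm(0.975) = 1.959964, or approx. 1.96
Require 1.959964 x sqrt(0.5(1 - 0.5)/n) < 0.01
Solving for n: n > 1.959964^2 x 0.25/0.01^2 = 9603.65
The sample size must be at least 9604 (using the rounded value 1.96 gives the bound 9604 directly).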
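The algebra above can be checked with a short R sketch. The variable names are illustrative, and the planning value of 0.5 comes from the exercise; rounding up with `ceiling()` is the usual convention for sample sizes.

```r
z_crit <- qnorm(0.975)   # critical value for 95% confidence
p_guess <- 0.5           # planning value for the proportion
target_me <- 0.01        # desired margin of error

# Solve z * sqrt(p(1 - p)/n) <= target_me for n, then round up
n_exact <- z_crit^2 * p_guess * (1 - p_guess) / target_me^2
n_required <- ceiling(n_exact)

# Margin of error actually achieved at the rounded-up sample size
me_achieved <- z_crit * sqrt(p_guess * (1 - p_guess) / n_required)
c(n_required = n_required, me_achieved = me_achieved)
```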

T test for a single population mean


Exercise 2. Recall the cherry tree data set in R, trees. Note that the diameter (in inches) is labelled Girth
in the data.
a. Consider the hypothesis test of H0 : µD = 12 vs HA : µD ≠ 12 where µD is the mean
diameter of cherry trees from which this sample was collected. Use an alpha level of α =
0.10.

(i) Compute the t test statistic and p-value by hand (not using t.test) and then
confirm the values using t.test.
t = (xbar - mu0)/(sd/sqrt(n))
= (13.248 - 12)/(3.138/sqrt(31))
= 1.248/0.5636 = 2.214 = t test statistic
p-value = 2 x pt(-2.214, 30) = 0.0346
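The by-hand calculation and its confirmation with `t.test` can be sketched as follows. This assumes the built-in `trees` data set and a two-sided alternative, as stated in the exercise; the object names are illustrative.

```r
x <- trees$Girth   # diameters (inches) from the built-in trees data
mu0 <- 12          # null value
n <- length(x)

# t statistic and two-sided p-value computed from the formula
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Confirm against the built-in test
check <- t.test(x, mu = mu0)
c(by_hand_t = t_stat, by_hand_p = p_value,
  t.test_t = unname(check$statistic), t.test_p = check$p.value)
```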

(ii) Use the p value to draw a conclusion about the hypotheses: H0 : µD = 12 vs
HA : µD ≠ 12 in the context of the question.

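The p-value of 0.0346 is less than α = 0.10, so we reject the null hypothesis. There is enough evidence
at the 10% level to conclude that the mean diameter of the cherry trees differs from 12 inches.
Equivalently, the value 12 is not contained in the 90% confidence interval for µD.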

(iii) Compare the conclusions drawn from the 90% confidence interval for µD in homework 5,
exercise 2(b) and the hypothesis test in the previous question.

The two approaches agree: 12 lies outside the 90% confidence interval of (12.307, 14.193), so the
interval also leads us to reject the null hypothesis at α = 0.10.

b. Consider the hypothesis test of H0 : µH = 77 vs HA : µH ≠ 77 where µH is the mean
height of cherry trees from which this sample was collected. Use an alpha level of α =
0.10.
(i) Compute the t test statistic and p-value by hand (not using t.test) and then
confirm the values using t.test.
mean(trees$Height) = 76
sd(trees$Height) = 6.371813
length(trees$Height) = 31
t = (xbar - mu0)/(sd/sqrt(n))
= (76 - 77)/(6.37/sqrt(31)) = -0.8738 = t test statistic
Critical values for α = 0.10: qt(0.05, 30) = -1.697261 and qt(0.95, 30) = 1.697261
p-value = 2 x pt(-0.8738, 30) = 0.389
t.test(trees$Height, mu = 77, conf.level = 0.9) confirms these values and gives the 90% CI (74.05764, 77.94236)

(ii) Use the p value to draw a conclusion about the hypotheses: H0 : µH = 77 vs
HA : µH ≠ 77 in the context of the question.

The p-value of 0.389 is greater than α = 0.10, so we fail to reject the null hypothesis: there is not enough
evidence to conclude that the mean height differs from 77. This agrees with the 90% confidence interval, which contains 77.
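The link between the p-value and the confidence interval can be illustrated with a short R sketch. It assumes the built-in `trees` data set and a two-sided test, for which "p-value greater than α" and "null value inside the (1 - α) interval" are equivalent statements; the object names are illustrative.

```r
alpha <- 0.10
mu0 <- 77
res <- t.test(trees$Height, mu = mu0, conf.level = 1 - alpha)

ci <- res$conf.int
inside_ci <- ci[1] <= mu0 && mu0 <= ci[2]   # is 77 in the 90% CI?
fail_to_reject <- res$p.value > alpha       # is the p-value above alpha?

# For a two-sided t test these two logical values always agree
c(inside_ci = inside_ci, fail_to_reject = fail_to_reject)
```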

(iii) Compare the conclusions drawn from the 90% confidence interval for µH in
homework 5, exercise 2(b) and the hypothesis test in the previous question.

These conclusions agree with the one in the previous homework. The hypothesis test adds a p-value,
which is the probability of getting a result at least as extreme as the one observed,
assuming the null hypothesis is true. The sample mean height of 76 is close to 77,
and 77 is included in the 90% CI, so both approaches fail to reject the null hypothesis.
c. The code below calculates the lower and upper critical values needed for a 90% bootstrap
confidence interval for µD (mean diameter). Do not edit this code - just run the chunk and
read off the output.
n <- 31
x_bar <- mean(trees$Girth)

t_hat <- numeric(1000)

set.seed(371)
# Bootstrap loop
for(i in 1:1000){
  # 2. Draw a SRS of size n from data
  x_star <- sample(trees$Girth, size = n, replace = T)

  # 3. Calculate resampled mean and sd
  x_bar_star <- mean(x_star)
  s_star <- sd(x_star)

  # 4. Calculate t_hat, and store it in vector
  t_hat[i] <- (x_bar_star - x_bar) / (s_star/sqrt(n))
}

# Find left and right critical values of approx. distribution
quantile(t_hat, probs = 0.05, names = F)

## [1] -1.690054
quantile(t_hat, probs = 0.95, names = F)

## [1] 1.523721
Use these critical values to construct a 90% bootstrap t confidence interval for µD
(mean diameter) from the sample data in the trees data set. Compare this confidence
interval to the regular t CI constructed in homework 5, 2(b) and brainstorm possible
reasons for the relationships you noticed.

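The point est. for average diameter is 13.24839, with standard error 3.138/sqrt(31) = 0.5636.
Lower limit = 13.24839 - 1.523721 x 0.5636 = 12.390
Upper limit = 13.24839 - (-1.690054) x 0.5636 = 14.201
(12.390, 14.201)
The two confidence intervals are similar, which is to be expected, as the bootstrap method
calculates a t-hat value many times over (1000 resamples here), approximating the sampling distribution of t.
The bootstrap critical values are not symmetric (-1.69 and 1.52), so unlike the regular t CI, the
bootstrap interval is not centered on the point estimate.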