Solutions: Stat 101 Final
SOLUTIONS
Question #1    / 10    Question #7                  / 7
Question #2    / 10    Question #8                  / 5
Question #3    / 14    Question #9                  / 4
Question #4    / 15    Question #10                 / 3
Question #5    / 4     Multiple Choice (#11-20)     / 20
Question #6    / 8     TOTAL                        / 100
Figure 1: Common percentiles for the standard normal distribution
Percentile .005 .01 .025 .05 .10 .90 .95 .975 .99 .995
z-value -2.58 -2.33 -1.96 -1.64 -1.28 1.28 1.64 1.96 2.33 2.58
1. (10) Descriptive Statistics
For each of the following situations, draw a rough sketch of an appropriate visualization
(no need to number or label the axes), and give one summary statistic that would be
appropriate.
2. (10) Coffee and depression
A study was conducted to evaluate the relationship between coffee consumption and
depression in women,1 following 50,739 women from 1996 through 2006. All women
were free of depressive symptoms at the beginning of the study. The table below shows
the clinical depression status at the end of the study by amount of caffeinated coffee
consumption.
                                      Caffeinated coffee consumption
Clinical depression   ≤ 1 cup/week   2-6 cups/week   1 cup/day   2-3 cups/day   ≥ 4 cups/day   Total
Yes                   670            *373*           905         564            95             2607
No                    11545          6244            16329       11726          2288           48132
Total                 12215          6617            17234       12290          2383           50739
(a) (1) Calculate the overall proportion of women who do not suffer from depression.
48132/50739 = 0.9486
(b) (1) What type of test is appropriate for evaluating if there is an association between
coffee intake and depression?
Chi-squared test for association
(c) (1) Write the hypotheses for this test.
H0: Caffeinated coffee consumption and depression in women are not associated.
HA: Caffeinated coffee consumption and depression in women are associated.
(d) (2) Calculate the contribution of the starred cell, *373*, to the test statistic.
$$\text{expected} = \frac{2607 \times 6617}{50739} = 339.9854 \approx 340$$
$$\frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \frac{(373 - 340)^2}{340} = 3.20$$
(e) (2) Given that the test statistic is 20.93 and the p-value is 0.0003, what is the
conclusion of the hypothesis test? Make sure to interpret it in context.
The p-value is small, so we reject H0. The data provide convincing evidence
that caffeinated coffee consumption and depression in women are associated.
(f) (3) One of the authors of this study was quoted in the NYTimes as saying it was
“too early to recommend that women load up on extra coffee” based on just this
study.2 Do you agree with this statement? Why or why not?
Yes, this is an observational study. Based on this study we can’t deduce that
drinking more coffee causes a decreased risk of depression. There may be other
confounding variables that cause decreased depression in women who drink more
coffee.
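The chi-squared arithmetic in parts (d) and (e) can be checked numerically. The following is a minimal Python sketch, not part of the original solutions; the variable and function names are illustrative, and the counts are copied from the table above. It uses the unrounded expected count (339.9854 rather than 340), so the cell contribution comes out very slightly above the hand-computed 3.20.

```python
# Observed counts from the coffee/depression table.
# Columns: <=1 cup/week, 2-6 cups/week, 1 cup/day, 2-3 cups/day, >=4 cups/day.
observed = [
    [670, 373, 905, 564, 95],            # clinical depression: yes
    [11545, 6244, 16329, 11726, 2288],   # clinical depression: no
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def expected_count(i, j):
    """Expected count under independence: row total * column total / grand total."""
    return row_totals[i] * col_totals[j] / grand_total


def contribution(i, j):
    """Contribution of one cell to the chi-squared statistic."""
    exp = expected_count(i, j)
    return (observed[i][j] - exp) ** 2 / exp


starred_expected = expected_count(0, 1)       # the starred cell, 373
starred_contribution = contribution(0, 1)

# The test statistic is the sum of the contributions of all ten cells.
chi_squared = sum(contribution(i, j) for i in range(2) for j in range(5))
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"expected count for starred cell: {starred_expected:.4f}")
print(f"contribution of starred cell:    {starred_contribution:.2f}")
print(f"chi-squared statistic:           {chi_squared:.2f} on {degrees_of_freedom} df")
```

Summing the contributions over every cell reproduces the test statistic quoted in part (e).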
1 Lucas, M. et al., Coffee, Caffeine, and Risk of Depression Among Women. Arch Intern Med. 2011;171(17).
2 NYTimes. September 16, 2011. http://well.blogs.nytimes.com/2011/09/26/coffee-drinking-linked-to-less-depression-in-women
3. (14) Coffee and physical activity
Another factor shown to have an association with depression is physical activity. Based
on exercise data collected on the participants in the study described in Question 2,
researchers estimated the total hours of metabolic equivalent tasks (MET) per week,
a measure of physical activity, for each individual. The table below gives summary
statistics of MET for women in this study based on the amount of coffee consumption.
                 Caffeinated coffee consumption
       ≤ 1 cup/week   2-6 cups/week   1 cup/day   2-3 cups/day   ≥ 4 cups/day   Total
Mean   18.7           19.6            19.3        18.9           17.5
SD     21.1           25.5            22.5        22.0           22.0
n      12215          6617            17234       12290          2383           50739
(a) (1) Determine if the distribution of physical activity level (measured in MET) in
each group is nearly normal, right-skewed, or left-skewed. Explain your reasoning.
The distributions must be right-skewed. Physical activity level cannot be negative, and
in every group the standard deviation is larger than the mean, which is too large for
a nearly normal distribution.
(b) (1) Write the hypotheses for evaluating if the average physical activity level varies
among the different levels of coffee consumption. (No need to define your parameters.)
(c) (3) Given below is part of the output associated with this test. Fill in the empty
cells.
            Df       Sum Sq     Mean Sq   F value   Pr(>F)
coffee      Cell 1   10508      Cell 3    Cell 5    0.0003
Residuals   Cell 2   25564819   Cell 4
Cell 1: dfG = k − 1 = 5 − 1 = 4
Cell 2: dfE = n − k = 50739 − 5 = 50734
Cell 3: MSG = SSG / dfG = 10508 / 4 = 2627
Cell 4: MSE = SSE / dfE = 25564819 / 50734 = 503.9
Cell 5: F = MSG / MSE = 2627 / 503.9 = 5.2
(d) (2) If an F-distribution is appropriate to use to calculate the p-value, give the
degrees of freedom. If the F-distribution is not appropriate to use, explain why.
4, 50734
(e) (2) Interpret the p-value in context. Do not give the conclusion of the test; actually
state what the p-value means.
If in fact the mean physical activity level were equal for women in all five groups,
the probability of observing sample means that differ at least as much as the ones
observed would be 0.0003.
(f) (3) Give a 95% confidence interval for the difference in mean total hours of MET
between women who drink ≤ 1 cup/week, and women who drink ≥ 4 cups/day.
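$$\text{statistic} \pm t^* \times SE$$
$$\left(\bar{X}_{\le 1\,\text{cup/week}} - \bar{X}_{\ge 4\,\text{cups/day}}\right) \pm 2 \times \sqrt{MSE \left(\frac{1}{n_{\le 1\,\text{cup/week}}} + \frac{1}{n_{\ge 4\,\text{cups/day}}}\right)}$$
$$(18.7 - 17.5) \pm 2 \times \sqrt{503.9 \left(\frac{1}{12215} + \frac{1}{2383}\right)}$$
$$1.2 \pm 2 \times 0.50$$
$$(0.2,\ 2.2)$$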
(g) (2) Knowing that physical activity level is associated with both coffee and depression,
how could you test to see whether coffee is associated with depression, even
after accounting for physical activity level?
Do logistic regression with depression as the response and coffee and physical
activity level as explanatory variables, and see whether the coefficient for coffee
is significant. Another option is to conduct a randomized experiment, randomly
assigning people to drink different amounts of coffee, and then see if there are
different rates of depression between the groups.
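The ANOVA cells in part (c) and the interval in part (f) can be reproduced from the summary numbers alone. The following is a minimal Python sketch with illustrative variable names; it assumes, as the worked solution does, a critical value t* of about 2 and the pooled variance estimate MSE.

```python
import math

# Summary values copied from the MET table and the partial ANOVA output.
group_sizes = [12215, 6617, 17234, 12290, 2383]
ss_group = 10508        # Sum Sq for coffee
ss_error = 25564819     # Sum Sq for residuals

k = len(group_sizes)
n = sum(group_sizes)

df_group = k - 1                  # Cell 1
df_error = n - k                  # Cell 2
ms_group = ss_group / df_group    # Cell 3
ms_error = ss_error / df_error    # Cell 4
f_value = ms_group / ms_error     # Cell 5

# 95% interval for the (<=1 cup/week) mean minus the (>=4 cups/day) mean.
mean_low, n_low = 18.7, 12215
mean_high, n_high = 17.5, 2383
t_star = 2  # approximation used in the worked solution
se = math.sqrt(ms_error * (1 / n_low + 1 / n_high))
diff = mean_low - mean_high
ci = (diff - t_star * se, diff + t_star * se)

print(df_group, df_error, round(ms_group), round(ms_error, 1), round(f_value, 1))
print(f"SE = {se:.2f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

Because the residual degrees of freedom are so large, using 1.96 in place of 2 would change the interval only in the second decimal place.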
4. (15) Body Fat Percentage
Body fat percentage can be complicated to estimate, while variables such as Age, Height,
Weight, and measurements of various body parts are easy to measure. Based on data3
on body fat percentage and other various measurements, we develop a model to predict
body fat percentage, based on easy-to-obtain measurements. We begin by throwing
all available explanatory variables into the model, and then run stepwise regression,
which gives the following model:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -20.06213 10.84654 -1.850 0.06558 .
Age 0.05922 0.02850 2.078 0.03876 *
Weight -0.08414 0.03695 -2.277 0.02366 *
Neck -0.43189 0.20799 -2.077 0.03889 *
Abdomen 0.87721 0.06661 13.170 < 2e-16 ***
Hip -0.18641 0.12821 -1.454 0.14727
Thigh 0.28644 0.11949 2.397 0.01727 *
Forearm 0.48255 0.17251 2.797 0.00557 **
Wrist -1.40487 0.47167 -2.978 0.00319 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
3 Penrose, K., Nelson, A., and Fisher, A. (1985), "Generalized Body Composition Prediction Equation for
Men Using Simple Measurement Techniques", Medicine and Science in Sports and Exercise, 7(2), 189.
(e) (2) Your friend wants to estimate his body fat percentage, has measured the necessary
explanatory variables, and asks you for help using the model. You give him
a point estimate calculated from the model, and, being a good statistician, want
to give him an idea of how certain you are about the estimate. Should you give
him a confidence interval or a prediction interval? Explain.
A prediction interval, because that is more relevant for an individual case. Your
friend wants to know what his body fat percentage is, not the average body fat
percentage of all people like him.
(f) (2) Upon a closer look at the model, your friend replies “The coefficient for Weight
is negative?!? AND significant??? That doesn’t make any sense! How can this
be possible?!?”. How would you reply?
Weight may be correlated with other explanatory variables in the model, so even if
it is positively correlated with body fat percentage on its own, the coefficient may
be negative due to the other variables in the model. For example, if you already
know that someone has a very large Abdomen, also knowing they weigh a lot may
actually decrease the predicted body fat percentage.
(g) (1) Besides the fact that it is significant, how would you quantify the success of
this model?
R² is 0.7467, so about 75% of the variability in body fat percentages
can be explained by this model.
(h) (3) Based on the plots below, do the conditions for multiple regression appear to
be satisfied? Explain.
Yes. There is no obvious pattern in the residual plot, so linearity is okay. The variability of the
residuals appears to be constant, and the histogram shows residuals that appear to
be approximately normally distributed. We cannot really assess the independence
assumption based on the information given because we do not know whether or not
this was a random sample (or whether there are friends or family members in the
sample). However, most likely the independence assumption is fine.
5. (4) Balance and Age
People of different ages were asked to stand on a “force platform” and asked to maintain
a stable upright position. The “wiggle” of the board in the forward-backward direction
is recorded; more wiggle corresponds to less balance. The participants are divided into
two age groups: young and elderly. The average wiggle among elderly people was 26.33
mm, and the average among young people was 18.125 mm.4
(a) (2) The randomization distribution for the difference in means is given below, based
on 100 simulated randomizations. Is balance associated with age? Estimate the
p-value.
The observed difference is 26.33 - 18.125 = 8.205. Only 1 out of 100 dots exceeds
8.205, so the proportion in the upper tail is 0.01. We double this (because we have
a two-sided alternative) to get the p-value 0.02. (If you count the number of dots
above 8.205 and below -8.205 to get a p-value of 0.03, that is okay as well).
(b) (2) The bootstrap distribution for the difference in means is given below, based on
100 bootstrap samples. Estimate a 90% confidence interval for the true difference
in means.
Trimming off the most extreme 5 dots from either side, we are left with the middle
90% of dots, or the 90% confidence interval. This is roughly 4.5 mm to 15 mm.
4 Teasdale, Bard, LaRue, and Fleury. (1993). Experimental Aging Research.
6. (8) Cookies and Kindness
In an experiment conducted on male undergraduate students studying in a library,
researchers had a student randomly approach some students with free cookies.5 A few
minutes later, the experimenter approached the students and asked if they would be
willing to volunteer to help with a psychology experiment. 9 out of 13 students who got
the cookies offered to help, and 6 out of the 12 students who did not get the cookies
offered to help. Are students who receive cookies more willing to help?
(b) (2) What distribution would you compare the statistic calculated in part 6a to, in
order to calculate a p-value?
A randomization distribution. The sample sizes are too small to use a normal
distribution.
(c) (2) Give the mean and standard deviation of the distribution referred to in part
6b.
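mean = 0,
$$\text{standard deviation} = SE = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{13} + \frac{1}{12}\right)} = \sqrt{\frac{9 + 6}{13 + 12}\left(1 - \frac{9 + 6}{13 + 12}\right)\left(\frac{1}{13} + \frac{1}{12}\right)} = 0.196.$$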
(d) (2) The p-value for the test is 0.28. State the conclusion in context.
We do not have enough evidence to say that students who receive cookies are more
willing to help.
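The standard error in part (c) and the p-value quoted in part (d) can both be checked with a few lines of Python. This is a sketch under stated assumptions, not the method used in the original solutions: instead of simulating a randomization distribution, it enumerates the randomization distribution exactly, using the fact that with the 15 helpers held fixed, the number who land in the cookie group follows a hypergeometric distribution. All names are illustrative.

```python
import math

helped_cookie, n_cookie = 9, 13      # students who received cookies
helped_control, n_control = 6, 12    # students who did not

# Pooled proportion and standard error of the difference under the null hypothesis.
pooled = (helped_cookie + helped_control) / (n_cookie + n_control)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_cookie + 1 / n_control))

total_helped = helped_cookie + helped_control
total = n_cookie + n_control


def hypergeom_pmf(x):
    """P(exactly x of the helpers fall in the cookie group) under random reallocation."""
    return (math.comb(total_helped, x)
            * math.comb(total - total_helped, n_cookie - x)
            / math.comb(total, n_cookie))


# One-sided p-value: at least as many helpers in the cookie group as were observed.
p_value = sum(hypergeom_pmf(x)
              for x in range(helped_cookie, min(n_cookie, total_helped) + 1))

print(f"pooled proportion = {pooled:.2f}, SE = {se:.3f}")
print(f"exact one-sided randomization p-value = {p_value:.2f}")
```

A simulated randomization distribution would give a p-value that varies a little from run to run around this exact value.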
5 Isen, A. M., & Levin, P. F. (1972). Effects of feeling good on helping: Cookies and kindness. Journal of
Personality and Social Psychology, 21, 384-388.
7. (7) Exercise Hours
Based on sample data on undergraduates at another university, when asked how many
hours they exercise per week we find X̄ = 9.054 and s = 5.74. Suppose we want to
do inference for the true standard deviation, σ, of exercise hours per week for undergraduates
at this university. By bootstrapping, we find out that the standard error of
the sample standard deviation is about 0.31. You may assume that the distribution of
sample standard deviations is approximately normal.
(a) (4) Test whether the true standard deviation is significantly higher than 5. Show
all details of the test, including hypotheses, test statistic, p-value, and a conclusion
in context.
H0: σ = 5
Ha: σ > 5
$$z = \frac{\text{statistic} - \text{null}}{SE} = \frac{5.74 - 5}{0.31} = 2.39.$$
Based on the standard normal percentiles in Figure 1, z = 2.39 exceeds 2.33 (the 99th
percentile), so this corresponds to a p-value < 0.01. We have evidence to reject H0, and
can conclude that the standard deviation of hours of exercise per week is significantly
greater than 5.
(b) (3) Give a 90% confidence interval for the true standard deviation, and interpret
this interval in context.
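The z-test in part (a) can be verified numerically, and the same ingredients suggest how the interval in part (b) might be formed. The following Python sketch is not part of the original solutions. It assumes the normal approximation stated in the question and, for the interval, the form statistic ± z* × SE with z* = 1.64 taken from Figure 1; variable names are illustrative.

```python
from statistics import NormalDist

sample_sd = 5.74   # observed sample standard deviation, s
null_sd = 5        # value of sigma under H0
se = 0.31          # bootstrap standard error of the sample standard deviation

# Part (a): one-sided z-test of H0: sigma = 5 against Ha: sigma > 5.
z = (sample_sd - null_sd) / se
p_value = 1 - NormalDist().cdf(z)

# Part (b), sketched under the stated assumptions: statistic +/- z* x SE.
z_star = 1.64      # 95th percentile of the standard normal (Figure 1)
ci_90 = (sample_sd - z_star * se, sample_sd + z_star * se)

print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
print(f"90% CI for sigma: ({ci_90[0]:.2f}, {ci_90[1]:.2f})")
```

Under these assumptions the interval is roughly 5.23 to 6.25 hours, which lies entirely above 5 and so agrees with the conclusion of the test in part (a).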
8. (5) HIV and the ELISA test
About 0.5% of the American population carries HIV. The ELISA test is one of the first
and most accurate tests for HIV. For those who carry HIV, the ELISA test is 99.7%
accurate. For those who do not carry HIV, the test is 92.6% accurate. If the test result
is positive for a randomly selected individual there are two possible options: he either
carries or does not carry HIV. We decide to use a Bayesian approach to evaluate these
hypotheses.
(a) (1) What is the probability that a randomly selected individual carries HIV?
P(HIV) = 0.005
(b) (1) What is the probability that the test result is positive for an individual who
carries HIV?
P(+ | HIV) = 0.997
(c) (1) What is the probability that the test result is positive for an individual that
does not carry HIV?
P(+ | not HIV) = 1 - 0.926 = 0.074
(d) (1) What is the probability that the test result is positive for a randomly selected
individual?
P(+) = P(+ | HIV) P(HIV) + P(+ | not HIV) P(not HIV) = 0.997 × 0.005 + 0.074 × 0.995 = 0.079
(e) (1) If an individual has tested positive, what is the probability that the individual
carries HIV?
$$P(\text{HIV} \mid +) = \frac{P(+ \mid \text{HIV})\,P(\text{HIV})}{P(+)} = \frac{0.997 \times 0.005}{0.079} = 0.063$$
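The Bayes' rule calculation in this question can be written out step by step. The following is a minimal Python sketch with illustrative variable names; the denominator P(+) is obtained from the law of total probability using the values given in the question.

```python
p_hiv = 0.005                     # prior: P(HIV)
p_pos_given_hiv = 0.997           # P(+ | HIV)
p_pos_given_no_hiv = 1 - 0.926    # P(+ | not HIV), the false positive rate

# Law of total probability: overall chance of a positive test.
p_pos = p_pos_given_hiv * p_hiv + p_pos_given_no_hiv * (1 - p_hiv)

# Bayes' rule: posterior probability of carrying HIV given a positive test.
p_hiv_given_pos = p_pos_given_hiv * p_hiv / p_pos

print(f"P(+) = {p_pos:.3f}")
print(f"P(HIV | +) = {p_hiv_given_pos:.3f}")
```

The posterior is small even though the test is accurate, because carriers are rare and most positive results therefore come from the much larger group of non-carriers.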
9. (4) AIDS Survival
Based on data from Australia on people diagnosed with AIDS,6 the following logistic
regression model was fit, with the response variable whether the person was alive or
dead at the end of the study. Positive coefficients correspond to a higher chance of
surviving. sexM is a dummy variable for males, age is age when diagnosed, and T.categ
denotes different categories of transmission. Based on the model below, calculate the
predicted probability that a 22 year old male who contracted the disease through
receipt of blood (T.categ = blood) survived the duration of the study.
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.251434 0.341967 0.735 0.46218
sexM -0.052151 0.303300 -0.172 0.86348
age 0.007922 0.004111 1.927 0.05395 .
T.categhsid 0.070279 0.248652 0.283 0.77745
T.categid -0.890161 0.324852 -2.740 0.00614 **
T.categhet -0.878695 0.352845 -2.490 0.01276 *
T.categhaem 0.084729 0.309908 0.273 0.78455
T.categblood 0.877505 0.290621 3.019 0.00253 **
T.categmother -0.541732 0.796871 -0.680 0.49662
T.categother -0.247278 0.247846 -0.998 0.31842
$$\widehat{\log(\text{odds})} = 0.251 - 0.052 + 0.008 \times 22 + 0.878 = 1.253$$
$$\hat{p} = \frac{e^{\widehat{\log(\text{odds})}}}{1 + e^{\widehat{\log(\text{odds})}}} = \frac{e^{1.253}}{1 + e^{1.253}} = 0.78$$
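The predicted probability can be recomputed directly from the coefficient table. The following is a minimal Python sketch with illustrative names. It uses the unrounded estimates from the output, so the log-odds differ slightly from the hand calculation (about 1.251 rather than 1.253), while the probability still rounds to 0.78.

```python
import math

# Estimates copied from the logistic regression output (only the terms that apply).
coef = {
    "(Intercept)": 0.251434,
    "sexM": -0.052151,
    "age": 0.007922,
    "T.categblood": 0.877505,
}

# 22-year-old male, transmission category "blood": each dummy variable equals 1.
log_odds = (coef["(Intercept)"]
            + coef["sexM"] * 1
            + coef["age"] * 22
            + coef["T.categblood"] * 1)

# Inverse logit converts log-odds to a probability.
p_hat = math.exp(log_odds) / (1 + math.exp(log_odds))

print(f"log(odds) = {log_odds:.3f}")
print(f"predicted probability of survival = {p_hat:.2f}")
```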
MULTIPLE CHOICE - 2 points each
11. (2) The p-value for a test is 0.002. Which is the correct interpretation of this number?
(a) A statistic this extreme would only occur in 0.002 of all samples.
(b) The probability that this statistic occurred just by random chance is 2/1000.
(c) If the null hypothesis is true, we would only get a statistic this extreme in 2 out of 1000 samples.
(d) There is a 0.2% chance that the null hypothesis is true.
(e) There is a 0.2% chance that the alternative hypothesis is true.
12. (2) Which association is broken by randomizing cases into treatment groups in a
randomized experiment?
13. (2) Which association is broken by re-randomizing units into treatment groups
(reallocating) for a randomization test?
(a) the association between the explanatory variable and a confounding variable
(b) the association between a confounding variable and the response variable
(c) the association between the explanatory variable and the response variable
14. (2) An article in the American Association of Retired People (AARP) website dated
December 2010 states that “Of the 3,012 people ages 45 and up who participated in
our study, 35 percent are chronically lonely (as rated on the UCLA Loneliness Scale,
a standard measurement tool.)”7. Assume that the participants were selected through
a simple random sample. Which of the following is the margin of error for a 95%
confidence interval?
(a) 0.012%
(b) 0.85%
(c) 1.2%
(d) 1.7%
(e) 3.4%
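The margin of error for Question 14 follows from the usual formula for a single proportion, z* × sqrt(p̂(1 − p̂)/n). The following is a minimal Python sketch with illustrative names, assuming z* = 1.96 from Figure 1 for 95% confidence.

```python
import math

p_hat = 0.35     # sample proportion rated chronically lonely
n = 3012         # sample size
z_star = 1.96    # 97.5th percentile of the standard normal (Figure 1)

standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

print(f"margin of error = {margin_of_error * 100:.1f}%")
```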
15. (2) If based on the same data, which is wider, a 90% confidence interval or a 95%
confidence interval?
(a) 90%
(b) 95%
7 http://www.hs.iastate.edu/2010/12/06/russellscale/
16. (2) A math teacher brags that a majority of her students scored above the mean on
her last exam. This means
17. (2) Increasing the sample size will cause the standard deviation of the bootstrap
distribution to
(a) Increase
(b) Decrease
(c) Stay about the same
18. (2) Increasing the number of bootstrap samples (the number of simulations) will cause
the standard deviation of the bootstrap distribution to
(a) Increase
(b) Decrease
(c) Stay about the same
19. (2) Does the South Beach diet work? Researchers randomly divided 500 people into
two equal-sized groups. One group spent 6 months on the South Beach diet. The
other group received a pamphlet about controlling portion sizes. At the beginning of
the study, the average difference in weights between the two groups was about 0. After
the study, the average difference was about 8 pounds, and the South Beach group had
the lower average weight. To test whether an average difference of 8 pounds could be
due to chance, a statistician writes everyone’s end-of-diet weight on an index card. He
shuffles these cards together very well, and then deals them into two equal-sized groups.
Which of the following best describes the outcome of the statistician’s activity?
(a) The average difference between the two stacks of cards will be about 0
pounds.
(b) The average difference between the two stacks of cards will be about 8 pounds.
(c) If the diet was effective, the average difference between the two stacks of cards will
be more than 8 pounds.
20. (2) A researcher is comparing the hypotheses of no association between two variables
versus an association between two variables. She calculates the probability that there
is an association between the two variables, based on the observed data. Which type
of inference was used?
(a) Frequentist
(b) Bayesian