Lab Test 2018 Answers PDF

1) This document contains a lab test for a university course on experimental design and data analysis. The test has two questions involving data analysis and hypothesis testing using the R programming language. 2) The first question involves analyzing data on episodes of otitis media (ear infections) in babies' first two years. The second question analyzes data from a pilot study on a new cholesterol-reducing drug. 3) Key analyses include calculating summary statistics, plotting distributions, confidence intervals, hypothesis tests, and prediction intervals. The goal is to apply statistical and programming skills to answer questions about real medical data.

Uploaded by

Zihan Yan
Lab Test 2018 Answers

Experimental Design And Data Analysis (University of Melbourne)



Downloaded by Zihan Yan ([email protected])

MAST10011 Computer Test — Semester 1, 2018

Name:

Student Number:

Tutor:

Instructions:
• This test contains 2 problems, worth a total of 25 marks.
• It accounts for 10% of the final assessment for this subject.
• This test will be conducted under examination conditions: you are not allowed to talk
  until after you leave the room. You must work individually.
• This is an open book test: you may use any printed or hand-written materials. Electronic
  devices (including, but not limited to, calculators and mobile phones) are not permitted
  and will be confiscated.
• Write your answers in the spaces provided. You will need to use R to produce graphs
  and statistics in order to answer the questions on this test. You DO NOT need to write
  down your commands, or copy any graphs which you produce.
• You may only use R and RStudio. Internet access and all other software, including Excel,
  are NOT PERMITTED.
• You are allowed to use your lecture notes, annotated computer lab sheets and any other
  printed or written notes.
• You have 40 minutes to complete the test.

DO NOT TURN THIS PAGE until you are instructed to do so.

Test Number 1


Question 1 [12 marks]


Otitis media, a disease of the middle ear, is one of the most common reasons for visiting a doctor in
the first 2 years of life (other than a routine visit). Let X be the random variable that represents the
number of episodes of otitis media in the first 2 years of a baby’s life. In Australia, X has the following
probability mass function and cumulative distribution function:
x            0     1     2     3     4
Pr(X = x)    0.13  0.41  0.27  0.15  0.04
Pr(X ≤ x)    0.13  0.54  0.81  0.96  1.00

a. [2 marks] Calculate the median and interquartile range of X.


Solution: median = 1; IQR = 1. 1 mark each.
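
The median and quartiles can be read off the cumulative distribution function as the smallest x for which Pr(X ≤ x) reaches 0.5, 0.25 and 0.75 respectively. A minimal sketch in R, assuming that convention for the quantiles of a discrete distribution (the helper name `q_discrete` is hypothetical, not part of the test):

```r
x   <- 0:4
p   <- c(0.13, 0.41, 0.27, 0.15, 0.04)
cdf <- cumsum(p)                      # Pr(X <= x)

# Hypothetical helper: smallest x whose cumulative probability reaches q
q_discrete <- function(q) x[min(which(cdf >= q))]

med <- q_discrete(0.50)
iqr <- q_discrete(0.75) - q_discrete(0.25)
c(median = med, IQR = iqr)
```

Here the lower quartile is 1 (since Pr(X ≤ 1) = 0.54 ≥ 0.25) and the upper quartile is 2 (since Pr(X ≤ 2) = 0.81 ≥ 0.75), which gives the IQR of 1 quoted in the solution.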
b. [2 marks] Produce a bar graph of the distribution of X on-screen, and comment on its shape.
Justify your comment with supporting statistics.

> p <- c(0.13,0.41,0.27,0.15,0.04)
> barplot(p, names.arg=0:4)

[Figure: bar plot of Pr(X = x) for x = 0, 1, 2, 3, 4; vertical axis from 0.0 to 0.4]

Solution: X is clearly right-skewed. This can be seen by the fact that its mean (1.56, below) is
larger than its median (1). 1 mark for right-skewed, 1 mark for justification.
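
The supporting statistics can be computed directly from the probability mass function. The following is a sketch of one way to do it, using the definitions E(X) = Σ x Pr(X = x) and var(X) = Σ (x − E(X))² Pr(X = x); the variable names are illustrative only:

```r
x <- 0:4
p <- c(0.13, 0.41, 0.27, 0.15, 0.04)

mu    <- sum(x * p)                   # E(X)
sigma <- sqrt(sum((x - mu)^2 * p))    # sd(X)
round(c(mean = mu, sd = sigma), 3)
```

The mean exceeding the median of 1 is the numerical evidence for the right skew.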


c. [2 marks] In a family with 4 children, what is the probability that exactly 1 will have more than
1 episode of otitis media in the first 2 years of their life?

> dbinom(1, 4, 1-0.54)

[1] 0.2897338

Solution: 0.290. 1 mark for binomial, 1 mark for correct answer.
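
The reasoning behind the `dbinom` call: each child independently has more than 1 episode with probability 1 − Pr(X ≤ 1) = 0.46, so the number of such children among 4 is Binomial(4, 0.46). A short check against the binomial formula, assuming independence between siblings as the solution implicitly does:

```r
p_more <- 1 - 0.54                            # Pr(X > 1) for one child

# Binomial formula for exactly 1 success in 4 trials
manual  <- choose(4, 1) * p_more^1 * (1 - p_more)^3
builtin <- dbinom(1, size = 4, prob = p_more)
c(manual = manual, builtin = builtin)
```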

d. [3 marks] From the distribution above, it can be calculated that E(X) = 1.56 and sd(X) = 1.023.
In a town of 5000 babies, calculate a 95% probability interval for the total number of episodes of
otitis media for all of these babies.

> qnorm(c(0.025,0.975),5000*1.56,1.023*sqrt(5000))

[1] 7658.222 7941.778

Solution: (7658, 7942). 1 mark for normal, 1 mark for correct mean/sd, 1 mark for correct
answer.
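
The `qnorm` call relies on the central limit theorem: assuming the 5000 babies are independent, the total has mean 5000 × 1.56 and standard deviation 1.023 × √5000, and is approximately normal. The same interval can be sketched step by step (variable names are illustrative):

```r
n     <- 5000
mu    <- 1.56
sigma <- 1.023

total_mean <- n * mu                  # mean of the sum
total_sd   <- sigma * sqrt(n)         # sd of the sum of n independent terms
interval   <- total_mean + c(-1, 1) * qnorm(0.975) * total_sd
interval
```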

e. [3 marks] In a study of 200 babies in New Zealand, it was found that 79 of them had more than 1
episode of otitis media in the first 2 years of their life. Test the hypothesis that there is a different
proportion of such babies in New Zealand than in Australia, at the 5% significance level. Clearly
state your hypotheses, p-value and conclusion in the context of the problem.

> prop.test(79, 200, 1-0.54)

1-sample proportions test with continuity correction

data: 79 out of 200, null probability 1 - 0.54


X-squared = 3.1451, df = 1, p-value = 0.07615
alternative hypothesis: true p is not equal to 0.46
95 percent confidence interval:
0.3274613 0.4666408
sample estimates:
p
0.395

Solution: We test H0: p_NZ = 0.46 against H1: p_NZ ≠ 0.46. With a p-value of 0.076, we do not
reject H0: there is insufficient evidence to show that New Zealand has a different proportion of
these babies. 1 mark for hypotheses, 1 mark for p-value, 1 mark for conclusion in context.
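
For readers who want to see where the reported X-squared and p-value come from, the statistic can be reproduced by hand. This is a sketch assuming the usual continuity-corrected form, in which 0.5 is subtracted from the absolute difference between the observed and expected counts:

```r
x  <- 79
n  <- 200
p0 <- 1 - 0.54                                    # null proportion from Australia

expected <- n * p0                                # expected count under H0
x2       <- (abs(x - expected) - 0.5)^2 / (n * p0 * (1 - p0))
p_value  <- pchisq(x2, df = 1, lower.tail = FALSE)
c(X_squared = x2, p_value = p_value)
```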


Question 2 [13 marks]


A new drug, RedChol, is designed to reduce cholesterol levels. A pilot study is conducted of 20 overweight
middle-aged males (45–64 years old, with BMI > 30) who volunteered for the study. These individuals
were randomly assigned to two treatment protocols which were followed for three months:
• a low-cholesterol diet, a basic exercise routine and RedChol (12 subjects);
• a low-cholesterol diet, a basic exercise routine and a placebo (8 subjects).
The cholesterol levels of the participants (in mg/100mL) were measured before and after the study, and
the difference (after−before) recorded in the table below.
RedChol     6.9    6.6  -10.6  -16.8    3.5  -10.4  -22.1    0.4  -17.2  -21.9  -11.1   -8.5
Placebo   -26.8  -15.5  -13.3   -6.2    8.4    1.1   -1.9    3.6
a. [2 marks] Calculate a 95% confidence interval for the average reduction in cholesterol for the
RedChol group.
> redchol <- c(6.9,6.6,-10.6,-16.8,3.5,-10.4,-22.1,0.4,-17.2,-21.9,-11.1,-8.5)
> placebo <- c(-26.8,-15.5,-13.3,-6.2,8.4,1.1,-1.9,3.6)
> t.test(redchol)
One Sample t-test

data: redchol
t = -2.7829, df = 11, p-value = 0.01781
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-15.103220 -1.763446
sample estimates:
mean of x
-8.433333
Solution: (-15.10, -1.76). 1 mark for t-test, 1 mark for answer.
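
The interval reported by `t.test` is the sample mean plus or minus a t quantile times the standard error. A minimal sketch that rebuilds it by hand (the names `n`, `se` and `ci` are illustrative):

```r
redchol <- c(6.9, 6.6, -10.6, -16.8, 3.5, -10.4, -22.1, 0.4, -17.2, -21.9, -11.1, -8.5)

n  <- length(redchol)
se <- sd(redchol) / sqrt(n)                       # standard error of the mean
ci <- mean(redchol) + c(-1, 1) * qt(0.975, df = n - 1) * se
ci
```

Because the data are recorded as after − before, a negative interval corresponds to a reduction in cholesterol.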
b. [2 marks] What distributional assumption are you making to answer part (a)? Produce an
appropriate plot on-screen to check this assumption, and comment on your findings.
> qqnorm(redchol)
> qqline(redchol)

[Figure: Normal Q-Q plot of redchol; horizontal axis: Theoretical Quantiles (−1.5 to 1.5), vertical axis: Sample Quantiles (−20 to 5)]

Solution: We assume that the difference in cholesterol levels of the RedChol group is normal.
From the QQ plot, this does not seem unreasonable. 1 mark for normal assumption, 1 mark for
QQ plot.


c. [3 marks] Calculate a 95% prediction interval for the reduction in cholesterol for someone taking
the RedChol treatment.

> mean(redchol) + c(-1,1)*qt(0.975,df=11)*sd(redchol)*sqrt(1+1/12)

[1] -32.48195 15.61529

Solution: (-32.48, 15.62). 1 mark for mean/sd, 1 mark for formula, 1 mark for answer.
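
The R expression above corresponds to the standard prediction interval for a single new observation from a normal population, written here with n = 12 and s the sample standard deviation of the RedChol differences:

```latex
\bar{x} \;\pm\; t_{n-1,\,0.975}\; s\,\sqrt{1 + \frac{1}{n}},
\qquad n = 12,\quad \bar{x} = -8.433 .
```

The 1 under the square root accounts for the variability of the new individual and the 1/n for the uncertainty in the sample mean; dropping the 1 recovers the confidence interval for the mean from part (a), which is why the prediction interval is so much wider.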

d. [3 marks] Test the hypothesis that RedChol is as effective as a placebo for reducing cholesterol,
at the 5% significance level. Clearly state your hypotheses, p-value and conclusion in the context
of the problem.

> t.test(redchol, placebo, var.equal=TRUE)

Two Sample t-test

data: redchol and placebo


t = -0.42222, df = 18, p-value = 0.6779
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-12.59913 8.38246
sample estimates:
mean of x mean of y
-8.433333 -6.325000

Solution: We test H0: µ_RedChol = µ_placebo against H1: µ_RedChol ≠ µ_placebo. With a p-value of
0.678, we do not reject H0: there is insufficient evidence to conclude that RedChol performs
differently from a placebo. 1 mark for hypotheses, 1 mark for p-value, 1 mark for conclusion in context.
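
With `var.equal=TRUE` the test pools the two sample variances. The following sketch reproduces the statistic and p-value by hand, assuming equal population variances as the solution does (variable names are illustrative):

```r
redchol <- c(6.9, 6.6, -10.6, -16.8, 3.5, -10.4, -22.1, 0.4, -17.2, -21.9, -11.1, -8.5)
placebo <- c(-26.8, -15.5, -13.3, -6.2, 8.4, 1.1, -1.9, 3.6)

n1 <- length(redchol)
n2 <- length(placebo)
df <- n1 + n2 - 2

# Pooled standard deviation across the two groups
sp <- sqrt(((n1 - 1) * var(redchol) + (n2 - 1) * var(placebo)) / df)

t_stat  <- (mean(redchol) - mean(placebo)) / (sp * sqrt(1 / n1 + 1 / n2))
p_value <- 2 * pt(-abs(t_stat), df = df)          # two-sided p-value
c(t = t_stat, df = df, p_value = p_value)
```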

e. [3 marks] The makers of RedChol claim that it reduces cholesterol levels by an average of
12mg/100mL in three months. The makers of a rival drug, BlueChol, claim that RedChol reduces
cholesterol levels by only 8mg/100mL. Calculate the sample size needed to distinguish between
these claims with power 0.9 and significance 0.05.

> library(pwr)
> pwr.t.test(d=4/sd(redchol), sig.level=0.05, power=0.9, type="one.sample")

One-sample t test power calculation

n = 74.31763
d = 0.3810377
sig.level = 0.05
power = 0.9
alternative = two.sided

Solution: We require a sample of size at least 75. 1 mark for t-test, 1 mark for some right
arguments, 1 mark for correct answer.
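
If the `pwr` package is not available, a similar calculation can be sketched with `power.t.test` from base R's `stats` package. As in the solution, this assumes the pilot standard deviation `sd(redchol)` is a reasonable planning value and that the difference to detect is 12 − 8 = 4 mg/100mL. The normal-approximation line is included only as a rough sanity check and is not part of the marked answer:

```r
redchol <- c(6.9, 6.6, -10.6, -16.8, 3.5, -10.4, -22.1, 0.4, -17.2, -21.9, -11.1, -8.5)

delta <- 12 - 8                                   # difference between the two claims
s     <- sd(redchol)                              # planning value from the pilot data

# Base-R power calculation for a one-sample, two-sided t-test
exact <- power.t.test(delta = delta, sd = s, sig.level = 0.05,
                      power = 0.9, type = "one.sample")

# Rough normal approximation: n = ((z_{0.975} + z_{0.9}) / d)^2
n_approx <- ((qnorm(0.975) + qnorm(0.9)) / (delta / s))^2

c(t_based = ceiling(exact$n), normal_approx = ceiling(n_approx))
```

The normal approximation comes out slightly smaller than the t-based answer because it ignores the extra uncertainty from estimating the standard deviation.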
