Homework 02 Key Answer STAT 4444

The document discusses analyzing survey data from a local election using frequentist and Bayesian statistical methods. It provides examples of calculating maximum likelihood estimates, confidence intervals, and credible intervals to analyze the proportion of voters supporting a sales tax increase. Both approaches find significant evidence that less than half of voters support the increase.

Homework #2 (20 points+1 bonus point) Applied Bayesian Statistics

Due by Monday, February 27th, at 11:59 PM

Your name: ______Key answer ________ Class ID (uploaded on canvas): _________


SHOW all your work to get the full CREDIT
Question 1: (2 pts, 0.5 each) The shape of Beta distribution depends on its parameters, 𝛼 and 𝛽.
Plot each of the following densities by the R statistical software:

Question 2: (3 pts, 1 pts each) It is known that the uniform distribution is a special case of the beta
distribution.
a. What are the numeric values of the uniform distribution? That is, U(0, 1) = Beta(?, ?)
U(0,1)=Beta(1,1)
b. What is the equivalent prior sample size for a U(0, 1) prior?
The equivalent sample size of Beta(α, β) is n = α + β − 2. So, for U(0,1) = Beta(1,1) the
equivalent prior sample size is 1 + 1 − 2 = 0.
c. What is the equivalent prior sample size for a beta(9,9) prior?
n = 9 + 9 − 2 = 16

Question 3: During the severe floods in the Midwest in 2008, Iowa City and Coralville in Johnson
County, Iowa, were hit hard and hundreds of homes, businesses, churches, and university buildings
were destroyed. Less than a year later, a vote was held on a proposal to impose a local sales tax of one
cent on the dollar to pay for flood-prevention and flood-mitigation projects. A few days before the
actual vote, a local newspaper reported in its online edition:

“The outcome of Tuesday’s local-option sales tax election in Johnson County appears too close to
call, based on results from a Gazette Communications poll of voters. The telephone survey of 320
registered voters in Johnson County, conducted April 27–29, shows 40% in favor of the 4-year 1%
sales tax. . . ”

Always believe in yourself and never give up on your dreams! 1 I am available if you have any questions or concerns!
A member of a local organization called “Ax the Tax” claims that this means that under half of all
registered voters in the county support the local-option sales tax. She would like to use the sample
survey data of the newspaper to test the two hypotheses:

H0 : π ≥ 0.5 Ha : π < 0.5

where π represents the proportion of all Johnson County registered voters who support the sales tax.
Let us practice on frequentist approach and Bayesian approach using this real problem.

1. First: frequentist approach (5 pts) – use R functions as needed and report R functions output.
a. Use calculus to derive the maximum likelihood estimator of the population proportion π, which is
$\hat{\pi} = y/n$, where y is the number of supporters among the n sampled voters.
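
The key states the estimator without showing the steps. One standard route, sketched here under the assumption that y follows a Binomial(n, π) model with 0 < y < n, is to maximize the log-likelihood:

```latex
\begin{align*}
L(\pi) &= \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}, \\
\ell(\pi) &= \log\binom{n}{y} + y\log\pi + (n-y)\log(1-\pi), \\
\frac{d\ell}{d\pi} &= \frac{y}{\pi} - \frac{n-y}{1-\pi} = 0
  \;\Longrightarrow\; y(1-\pi) = (n-y)\,\pi
  \;\Longrightarrow\; \hat{\pi} = \frac{y}{n}, \\
\frac{d^{2}\ell}{d\pi^{2}} &= -\frac{y}{\pi^{2}} - \frac{n-y}{(1-\pi)^{2}} < 0 .
\end{align*}
```

The negative second derivative confirms that the critical point is a maximum.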

b. Calculate the point estimate (MLE) of the population proportion, π, using the given
information.
$\hat{\pi} = \frac{128}{320} = 0.4$
c. Calculate a 95% confidence interval for the population proportion π.

> binom.test(128,320,p=0.5,alternative="two.sided")
Exact binomial test

data: 128 and 320


number of successes = 128, number of trials = 320, p-value = 0.0004118
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.3459083 0.4559608
sample estimates:
probability of success
0.4
The 95% confidence interval is (0.3459083, 0.4559608).
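
As a cross-check, the exact interval that binom.test reports can be reproduced from beta quantiles (the Clopper-Pearson construction). This is a minimal sketch; the variable names are illustrative and not part of the original key.

```r
# Exact (Clopper-Pearson) 95% confidence interval from beta quantiles.
y <- 128      # number of voters in favor
n <- 320      # sample size
alpha <- 0.05

lower <- qbeta(alpha / 2, y, n - y + 1)
upper <- qbeta(1 - alpha / 2, y + 1, n - y)
ci_manual <- c(lower, upper)

# Interval reported by binom.test, for comparison.
ci_builtin <- as.numeric(binom.test(y, n, p = 0.5)$conf.int)

print(ci_manual)
print(ci_builtin)
```

Both vectors should agree with the interval printed in the output above.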

d. Interpret the 95% confidence interval you calculated in (c) in the context.
We are 95% confident that the true proportion of Johnson County registered voters who
support the sales tax is between 0.346 and 0.456.
OR if we selected all the possible samples of size 320 and for each we calculated a 95%
confidence interval, about 95% of the confidence intervals would contain the true population
proportion and about 5% would not.

e. Test the claim at 5% significance level and state your decision.

> binom.test(128,320,p=0.5,alternative="less")
Exact binomial test

data: 128 and 320


number of successes = 128, number of trials = 320, p-value = 0.0002059
alternative hypothesis: true probability of success is less than 0.5
95 percent confidence interval:
0.0000000 0.4471654
sample estimates:
probability of success
0.4
The p-value (0.0002059) is less than 0.05, so we reject the null hypothesis. At the 5%
significance level, there is sufficient evidence that the true population proportion is less than 0.5.

f. What is the interpretation of the p-value of the test you did in (e)?
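The p-value is the probability, computed assuming the null hypothesis is true (π = 0.5), of
getting a test statistic equal to the observed one or more extreme in the direction of Ha.
Here it is the probability of observing 128 or fewer supporters among 320 voters if π = 0.5.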
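
The interpretation can be checked numerically: for the one-sided test in (e), the p-value is the lower-tail binomial probability evaluated at π = 0.5. A short sketch, with illustrative variable names:

```r
# One-sided p-value as a lower-tail binomial probability under H0 (pi = 0.5).
y <- 128
n <- 320

p_lower_tail <- pbinom(y, size = n, prob = 0.5)   # P(Y <= 128 | pi = 0.5)
p_builtin <- binom.test(y, n, p = 0.5, alternative = "less")$p.value

print(c(manual = p_lower_tail, binom.test = p_builtin))
```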

2. Second: Bayesian approach (8 pts) - use R functions as needed


a. (4 pts) Under a uniform (0,1) prior distribution
I. Use calculus to show that the posterior distribution of the population proportion π, is
𝐵𝑒𝑡𝑎(129,193).
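
The derivation is not reproduced in the key. A sketch of the usual argument, assuming a Binomial(320, π) likelihood with y = 128 and the uniform prior p(π) = 1 on (0, 1):

```latex
\begin{align*}
p(\pi \mid y) &\propto p(y \mid \pi)\,p(\pi)
  = \binom{320}{128}\,\pi^{128}(1-\pi)^{192}\cdot 1
  \;\propto\; \pi^{129-1}(1-\pi)^{193-1}, \qquad 0 < \pi < 1, \\
\int_{0}^{1} \pi^{129-1}(1-\pi)^{193-1}\,d\pi &= B(129,193)
  = \frac{\Gamma(129)\,\Gamma(193)}{\Gamma(322)}, \\
p(\pi \mid y) &= \frac{\Gamma(322)}{\Gamma(129)\,\Gamma(193)}\,\pi^{128}(1-\pi)^{192},
\end{align*}
```

which is the Beta(129, 193) density.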

II. Calculate the posterior mean, posterior mode, posterior median, and posterior
variance of the population proportion, π.
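• Posterior mean $= \frac{\alpha}{\alpha+\beta} = \frac{129}{129+193} = 0.4006$
• Posterior mode $= \frac{\alpha-1}{\alpha+\beta-2} = \frac{128}{320} = 0.4$
• Posterior median $\approx \frac{\alpha-1/3}{\alpha+\beta-2/3} = \frac{129-1/3}{129+193-2/3} = 0.4004$
• Posterior variance $= \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} = \frac{129(193)}{(129+193)^2(129+193+1)} = 0.00074$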
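
These summaries are easy to verify in R. The sketch below evaluates the closed-form expressions for a Beta(129, 193) posterior and compares the median approximation with the exact median from qbeta; the variable names are illustrative.

```r
# Posterior summaries for Beta(a, b) with a = 129, b = 193.
a <- 129
b <- 193

post_mean <- a / (a + b)
post_mode <- (a - 1) / (a + b - 2)
post_median_approx <- (a - 1/3) / (a + b - 2/3)   # closed-form approximation
post_median_exact <- qbeta(0.5, a, b)             # exact median via the quantile function
post_var <- a * b / ((a + b)^2 * (a + b + 1))

print(round(c(mean = post_mean, mode = post_mode,
              median_approx = post_median_approx,
              median_exact = post_median_exact,
              variance = post_var), 5))
```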

III. Calculate a 95% equal-tail posterior credible set and interpret it.
qbeta(c(0.025,0.975),129,193)
The 95% equal-tail posterior credible set is (0.3478036, 0.4546081).
Interpretation: given the data, with probability 0.95 the population proportion is between
0.348 and 0.455.
IV. Calculate P(π ≥ 0.5|y) and P(π < 0.5|y) and what is your conclusion? [i.e. Is there
significant evidence in support of hypothesis Ha: π < 0.5?]
pbeta(0.5,129,193)

$P(\pi < 0.5 \mid y) = 0.9998303$ and $P(\pi \ge 0.5 \mid y) = 1 - 0.9998303 = 0.0001697$.
The posterior probability of the alternative hypothesis is 0.9998303, far larger than the
posterior probability of the null hypothesis, so there is strong evidence in support of
Ha: π < 0.5.

b. (4 pts) Under a beta (20,45) prior distribution


I. Use calculus to show that the posterior distribution of the population proportion π, is
𝐵𝑒𝑡𝑎(148,237).
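
The calculus reduces to the conjugate update rule: a Beta(a, b) prior combined with y successes in n binomial trials gives a Beta(a + y, b + n − y) posterior. A small sketch of that rule follows; `beta_posterior` is a hypothetical helper name, not a function from the key or from any package.

```r
# Conjugate beta-binomial update: Beta(a, b) prior, y successes in n trials.
# beta_posterior is an illustrative helper, not a library function.
beta_posterior <- function(a, b, y, n) {
  c(alpha = a + y, beta = b + n - y)
}

print(beta_posterior(1, 1, 128, 320))     # uniform prior
print(beta_posterior(20, 45, 128, 320))   # beta(20,45) prior
```

The two calls return the parameters of the posteriors used in parts (a) and (b).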

II. Plot the likelihood function, prior distribution, and posterior distribution by R
function triplot.
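
The plot itself is missing from this copy of the key. The question asks for the triplot function; the base-R sketch below is a substitute, not the key's own code, and it draws the same three curves. It assumes the likelihood is rescaled to integrate to 1, which makes it the Beta(y + 1, n − y + 1) density.

```r
# Base-R stand-in for a prior / likelihood / posterior plot (not the key's triplot call).
a <- 20; b <- 45      # prior parameters
y <- 128; n <- 320    # survey data

grid <- seq(0.001, 0.999, length.out = 999)
prior_dens <- dbeta(grid, a, b)
lik_dens <- dbeta(grid, y + 1, n - y + 1)    # likelihood rescaled to integrate to 1
post_dens <- dbeta(grid, a + y, b + n - y)   # Beta(148, 237)

matplot(grid, cbind(prior_dens, lik_dens, post_dens), type = "l",
        lty = 1:3, col = 1, xlab = expression(pi), ylab = "density")
legend("topright", lty = 1:3,
       legend = c("Prior Beta(20,45)", "Likelihood (scaled)", "Posterior Beta(148,237)"))
```

The posterior curve should sit between the prior and the scaled likelihood, closer to the likelihood because the sample is much larger than the equivalent prior sample size.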

III. Calculate a 95% equal-tail posterior credible set and interpret it. Which credible set is
wider, under uniform or beta prior? why?
qbeta(c(0.025,0.975),148,237)
• The 95% equal-tail posterior credible set is (0.3364848, 0.4334839). The
interpretation: given the data, with probability 0.95 the true population
proportion is between 0.3364848 and 0.4334839.
• The 95% credible set under the uniform prior is wider. The uniform prior has an
equivalent prior sample size of 0, while the beta(20,45) prior adds the equivalent
of 20 + 45 − 2 = 63 prior observations, so the posterior under the beta prior
carries more information and is more concentrated.
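
The width comparison can be made explicit by computing both equal-tail intervals side by side. A brief sketch with illustrative variable names:

```r
# 95% equal-tail credible sets under the two priors, and their widths.
ci_unif <- qbeta(c(0.025, 0.975), 129, 193)   # posterior under the uniform prior
ci_beta <- qbeta(c(0.025, 0.975), 148, 237)   # posterior under the beta(20,45) prior

width_unif <- diff(ci_unif)
width_beta <- diff(ci_beta)

print(rbind(uniform = c(ci_unif, width = width_unif),
            beta_20_45 = c(ci_beta, width = width_beta)))
```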

IV. Calculate P(π ≥ 0.5|y) and P(π < 0.5|y) and what is your conclusion? [i.e. Is there
significant evidence in support of hypothesis Ha : π < 0.5?]
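pbeta(0.5,148,237) gives $P(\pi < 0.5 \mid y) = 0.9999975$, so
$P(\pi \ge 0.5 \mid y) = 1 - 0.9999975 = 0.0000025$. The posterior probability of the
alternative hypothesis is 0.9999975 and the posterior probability of the null hypothesis
is very small, so there is strong evidence in support of Ha: π < 0.5.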
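
For reference, the posterior probabilities of both hypotheses under both priors can be computed with pbeta in one pass. A minimal sketch with illustrative names:

```r
# Posterior probabilities of Ha (pi < 0.5) and H0 (pi >= 0.5) under each prior.
p_ha_unif <- pbeta(0.5, 129, 193)
p_h0_unif <- pbeta(0.5, 129, 193, lower.tail = FALSE)

p_ha_beta <- pbeta(0.5, 148, 237)
p_h0_beta <- pbeta(0.5, 148, 237, lower.tail = FALSE)

print(rbind(uniform = c(Ha = p_ha_unif, H0 = p_h0_unif),
            beta_20_45 = c(Ha = p_ha_beta, H0 = p_h0_beta)))
```

Using lower.tail = FALSE avoids the loss of precision that 1 − pbeta(...) can suffer when the upper-tail probability is tiny.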

3. (1 bonus point) What are your conclusions, your observations, or your comments from all the
analyses above in 1 and 2?
• The MLE and posterior mode under uniform prior are the same, 0.40.
• The 95% credible set under uniform prior is wider than under beta(20,45) prior
because the beta(20,45) prior is informative.
• The frequentist test and the Bayesian analyses under both the uniform prior and the
Beta(20,45) prior lead to the same decision: reject the null hypothesis.
• Student answers may differ but must be correct to receive the bonus point.

V. (2 pts) Using a uniform prior for the population proportion π and a random sample of 𝑛 = 320
voters, 𝑦 = 128 support the sales tax, suppose that the newspaper plans on taking a new survey
of 25 voters, 𝑛∗ = 25. Let y∗ denote the number in this new sample who support the sales tax.
a. Find the posterior predictive probability that y∗ = 7.

$P(y^* = 7 \mid y) = \binom{25}{7}\,\frac{\Gamma(129+193)}{\Gamma(129)\,\Gamma(193)}\cdot\frac{\Gamma(7+129)\,\Gamma(18+193)}{\Gamma(129+193+25)} = 0.0810533$

OR use pbetap(c(129,193),25,7) = 0.0810533
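
The same probability can be computed in base R without pbetap, since the ratio of gamma functions equals B(a + k, b + n∗ − k) / B(a, b). The sketch below works on the log scale to avoid overflow; `dbetabinom` is a hypothetical helper name defined here, not a base-R or package function.

```r
# Beta-binomial posterior predictive pmf, computed on the log scale.
# dbetabinom is an illustrative helper defined here, not a library function.
dbetabinom <- function(k, n_new, a, b) {
  exp(lchoose(n_new, k) + lbeta(a + k, b + n_new - k) - lbeta(a, b))
}

p7 <- dbetabinom(7, n_new = 25, a = 129, b = 193)
print(p7)

# Sanity check: the predictive probabilities over 0..25 sum to 1.
print(sum(dbetabinom(0:25, n_new = 25, a = 129, b = 193)))
```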

b. Find the 90% posterior predictive interval for y∗. Hint: find the predictive
probabilities for each of the possible values of y∗ and ordering them from largest
probability to smallest probability. Then add the most probable values of y∗ into your
probability set one at a time until the total probability exceeds 0.90 for the first time.

Using the pbetap function, we can calculate the predictive probabilities of all the
possible values of y∗ (0 through 25). We then order the values by probability, from
largest to smallest, and add them to the set one at a time until the total probability
first exceeds 0.90. The resulting 90% posterior predictive interval is (6, 14), and its
exact probability content is 92.6%.
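
The procedure in the hint can be sketched in base R as follows. The predictive probabilities are computed directly from the beta-binomial formula rather than with pbetap, and the variable names are illustrative.

```r
# 90% posterior predictive set for y* by accumulating the most probable values.
a <- 129; b <- 193    # posterior Beta(129, 193)
n_new <- 25
k <- 0:n_new

# Beta-binomial predictive probabilities for y* = 0, ..., 25.
pred <- exp(lchoose(n_new, k) + lbeta(a + k, b + n_new - k) - lbeta(a, b))

# Order from most to least probable and keep values until the total exceeds 0.90.
ord <- order(pred, decreasing = TRUE)
n_keep <- which(cumsum(pred[ord]) > 0.90)[1]
pred_set <- sort(k[ord[seq_len(n_keep)]])
coverage <- sum(pred[ord[seq_len(n_keep)]])

print(pred_set)
print(coverage)
```

The retained values form a contiguous block, so the set can be reported as an interval with its exact probability content.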
