
Bayesian Statistics

The document discusses Bayesian statistics, focusing on estimating binomial probabilities using beta distributions as priors. It includes a case study involving an insurance company's survey responses, where posterior distributions are derived and analyzed under different prior assumptions. Additionally, it explores the impact of prior distributions on posterior estimates and includes calculations and plots to illustrate these concepts.

Uploaded by

parviarora06

BAYESIAN STATISTICS

For the estimation of a binomial probability θ from a single observation x of the random variable X with the prior
distribution of θ being beta with parameters α and β, investigate the form of the posterior distribution of θ and
determine the Bayesian estimate of θ under quadratic loss.
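
A short derivation sketch, assuming X ~ Binomial(n, θ) with n known (the question does not name the number of trials, so n is notation introduced here) and the usual parameterisation of the beta density:

```latex
\begin{align*}
f(\theta \mid x)
  &\propto \underbrace{\theta^{x}(1-\theta)^{n-x}}_{\text{likelihood}}
     \times \underbrace{\theta^{\alpha-1}(1-\theta)^{\beta-1}}_{\text{prior}} \\
  &= \theta^{x+\alpha-1}(1-\theta)^{n-x+\beta-1},
\end{align*}
so $\theta \mid x \sim \mathrm{Beta}(x+\alpha,\; n-x+\beta)$.
Under quadratic loss the Bayesian estimate is the posterior mean:
\[
\hat{\theta} = \frac{x+\alpha}{n+\alpha+\beta}
  = w\,\frac{x}{n} + (1-w)\,\frac{\alpha}{\alpha+\beta},
\qquad w = \frac{n}{n+\alpha+\beta}.
\]
```

Written this way, the estimate is a weighted average of the sample proportion x/n and the prior mean α/(α + β), with more weight on the data as n grows.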

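n <- 10000; alpha <- 2; beta <- 3   # illustrative values only, not given in the source

x <- rep(0, n)

for (i in 1:n) {
  theta <- rbeta(1, alpha, beta)   # draw theta from the Beta(alpha, beta) prior
  x[i] <- rbinom(1, 1, theta)      # one Bernoulli trial given that theta
}

mean(x)   # should be close to the prior mean alpha / (alpha + beta)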

Q1.

An insurance company designed a new product and wanted to assess its clients’ responses to the product. A survey was
carried out giving an opportunity to each participating client to give a positive or negative response to the product,
independently of other clients.

Let X be the random variable representing the number of positive responses to the new product.

(i) Identify the distribution of X and its parameters. [1]

Out of 160 clients who responded independently to the survey, 101 gave a positive response for the new product.

The probability of obtaining a positive response for the product is denoted by θ and a Beta prior distribution with
parameters (α, β) is assumed for θ. The posterior distribution of θ is proportional to:

$f(\theta \mid x) \propto \theta^{x + \alpha - 1} (1 - \theta)^{n - x + \beta - 1}$,

where x is the number of positive responses obtained out of n clients surveyed.

(ii) Specify the posterior distribution of θ with its parameters. [2]


(iii) Comment on the prior distribution of θ in relation to the posterior distribution. [1]
(iv) State the parameter values for which the prior is a Uniform(0, 1) distribution. [1]
(v) (a) Plot the prior density of θ with the parameters obtained in part (iv). Set the maximum limit of the y axis
to 12. [2]
(b) Plot the posterior distribution of θ on the same graph as above. [2]

[Hint: you may find the lines function useful.]
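
Parts (ii) to (v) can be sketched in R as follows. This is one possible approach rather than a model answer; the variable names (`a0`, `b0`, `post_a`, `post_b`) and the grid of θ values are choices made here for illustration.

```r
# Survey data
n <- 160   # clients surveyed
x <- 101   # positive responses

# (iv) Uniform(0, 1) is the Beta(1, 1) distribution
a0 <- 1
b0 <- 1

# (ii) conjugate update: the posterior is Beta(x + a0, n - x + b0)
post_a <- x + a0
post_b <- n - x + b0

theta <- seq(0, 1, by = 0.001)

# (v)(a) prior density, with the y axis capped at 12
plot(theta, dbeta(theta, a0, b0), type = "l", ylim = c(0, 12),
     xlab = "theta", ylab = "density")

# (v)(b) posterior density added to the same graph
lines(theta, dbeta(theta, post_a, post_b), col = "blue")
```

With the uniform prior the posterior parameters work out to 102 and 60.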

An analyst consulted by the company suggests that, based on previous experience, a Beta prior with parameters (40, 24)
is more appropriate.

(vi) Plot the new prior and posterior distributions of θ on the same graph from part (v). [3]
(vii) Comment on the plots obtained in parts (v) and (vi). [2]

The company will put the new product on the market only if there is a high probability that θ is higher than 60%.
(viii) (a) Calculate the probability P(θ > 0.6 | X) in the case of both priors; that is, Uniform(0,1) and Beta with
parameters (40, 24). [4]

(b) Comment on your answer to part (viii)(a). [2] [Total 20]
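
A minimal sketch for parts (vi) and (viii)(a), assuming the conjugate Beta posterior stated earlier in the question. The helper `post_par` and the line styles are illustrative choices, not part of the question.

```r
n <- 160   # clients surveyed
x <- 101   # positive responses

# Posterior parameters for a Beta(a0, b0) prior: Beta(x + a0, n - x + b0)
post_par <- function(a0, b0) c(x + a0, n - x + b0)

unif_post    <- post_par(1, 1)     # Uniform(0, 1) prior
analyst_post <- post_par(40, 24)   # analyst's Beta(40, 24) prior

# (vi) both priors and both posteriors on one graph, y limit as in part (v)
theta <- seq(0, 1, by = 0.001)
plot(theta, dbeta(theta, 1, 1), type = "l", ylim = c(0, 12),
     xlab = "theta", ylab = "density")
lines(theta, dbeta(theta, unif_post[1], unif_post[2]), col = "blue")
lines(theta, dbeta(theta, 40, 24), lty = 2)
lines(theta, dbeta(theta, analyst_post[1], analyst_post[2]),
      col = "blue", lty = 2)

# (viii)(a) posterior probability that theta exceeds 0.6 under each prior
p_uniform <- pbeta(0.6, unif_post[1], unif_post[2], lower.tail = FALSE)
p_analyst <- pbeta(0.6, analyst_post[1], analyst_post[2], lower.tail = FALSE)
```

Printing `p_uniform` and `p_analyst` gives the two probabilities to compare in part (viii)(b).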

Q2. A study was carried out to estimate the proportion, p, of workers that commute by train to work. A total of n = 200
workers were sampled at random and were asked the question: ‘Do you take the train to work?’ The workers’ answers
were recorded as a binary outcome, yi, for worker i, with 1 for yes and 0 for no. The data are available in the file
BinaryTrain.RData.

Two commuters, Alice and Norman, were interested in the study and proposed different prior distributions for the
proportion p.

Alice assumed a discrete prior distribution g(p) given in the following table:

p       0.1    0.2    0.3    0.4    0.5
g(p)    0.5    0.2    0.2    0.05   0.05

Norman chose to use a beta prior distribution for p, with parameters 3 and 12.

(i) (a) Calculate the mean and the standard deviation for Alice’s prior distribution. [4]

(b) Generate 10,000 random values from Norman’s prior distribution. [1]

(c) Calculate the mean and standard deviation of the values generated in part (i)(b). [2]

(d) Comment on whether or not Alice and Norman have similar prior beliefs for p. [2]
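
Part (i) can be sketched as below. The seed is an arbitrary choice made here for reproducibility, and the variable names are illustrative.

```r
# (a) Alice's discrete prior, taken from the table
p_vals <- c(0.1, 0.2, 0.3, 0.4, 0.5)
g      <- c(0.5, 0.2, 0.2, 0.05, 0.05)

alice_mean <- sum(p_vals * g)
alice_sd   <- sqrt(sum(p_vals^2 * g) - alice_mean^2)

# (b) 10,000 draws from Norman's Beta(3, 12) prior
set.seed(1)   # arbitrary seed, only so the run can be repeated
norman_draws <- rbeta(10000, 3, 12)

# (c) sample mean and standard deviation of the draws
norman_mean <- mean(norman_draws)
norman_sd   <- sd(norman_draws)

c(alice_mean = alice_mean, alice_sd = alice_sd,
  norman_mean = norman_mean, norman_sd = norman_sd)
```

Alice's prior has mean 0.195 and standard deviation about 0.116. The exact Beta(3, 12) values are 0.2 and 0.1, so the simulated figures for Norman should land close to those.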

Norman’s beta prior distribution for p is adopted for the remainder of the question.

The likelihood of the model in the study is given by:

$L(p) \propto p^{\sum y_i} (1 - p)^{n - \sum y_i}$

The posterior density of p is given by:

$f(p \mid y) \propto p^{2 + \sum y_i} (1 - p)^{11 + n - \sum y_i}$,

where $\sum y_i$ is the total sum of all the binary data.

(ii) Plot the shape of the posterior density of p without identifying it. [4]
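
One way to approach part (ii) is to evaluate the unnormalised posterior on a grid. The file BinaryTrain.RData is not reproduced here, so the sketch below simulates a stand-in data set; the simulated `y` is purely hypothetical and should be replaced by the real data once the file is loaded.

```r
# Hypothetical stand-in for the study data (NOT the contents of BinaryTrain.RData)
set.seed(1)
y <- rbinom(200, 1, 0.2)   # 200 made-up 0/1 answers

n <- length(y)
s <- sum(y)   # total number of "yes" answers

# Posterior density up to a normalising constant, as given in the question
p <- seq(0, 1, by = 0.001)
kernel <- p^(2 + s) * (1 - p)^(11 + n - s)

plot(p, kernel, type = "l", xlab = "p",
     ylab = "unnormalised posterior density")
```

Only the shape matters at this stage, so the vertical scale of the plot has no interpretation.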

(iii) Plot the density of Norman’s prior distribution by setting ylim = c(0,14). [3]

The posterior distribution of p is beta with parameters $3 + \sum y_i$ and $12 + n - \sum y_i$.

(iv) (a) Plot the posterior density of p by adding it to the plot in part (iii). [3]

(b) Compare the two densities using your answer in part (iv)(a). [1]

(c) Comment on the extent to which the posterior distribution is affected by the prior distribution. [1]
(v) Determine a 90% interval estimate for p based on its posterior distribution. [2]

(vi) Determine the exact posterior probability that p exceeds 0.25. [2]

(vii) (a) Generate 10,000 samples from the posterior distribution of p.

(b) Calculate the proportion of sampled values of p that exceed 0.25.

(c) Compare your answer in part (vii)(b) with your answer in part (vi). [3] [Total 28]
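
Parts (iii) to (vii) can be sketched as follows. As BinaryTrain.RData is not available here, `y` is again a simulated, hypothetical stand-in rather than the study data. The equal-tailed form of the 90% interval is also an assumption, since the question does not say which kind of interval is wanted.

```r
# Hypothetical stand-in for the study data (NOT the contents of BinaryTrain.RData)
set.seed(1)
y <- rbinom(200, 1, 0.2)
n <- length(y)
s <- sum(y)

# Posterior parameters for Norman's Beta(3, 12) prior
a_post <- 3 + s
b_post <- 12 + n - s

# (iii) prior density and (iv)(a) posterior density on the same plot
p <- seq(0, 1, by = 0.001)
plot(p, dbeta(p, 3, 12), type = "l", ylim = c(0, 14),
     xlab = "p", ylab = "density")
lines(p, dbeta(p, a_post, b_post), col = "blue")

# (v) equal-tailed 90% interval from the posterior quantiles
ci90 <- qbeta(c(0.05, 0.95), a_post, b_post)

# (vi) exact posterior probability that p exceeds 0.25
exact_prob <- pbeta(0.25, a_post, b_post, lower.tail = FALSE)

# (vii) the same probability estimated from 10,000 posterior draws
draws <- rbeta(10000, a_post, b_post)
sim_prob <- mean(draws > 0.25)
```

The simulated proportion `sim_prob` should agree with `exact_prob` up to Monte Carlo error, which is the comparison part (vii)(c) asks for.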

Q3. Consider the n = 30 independent and identically distributed observations $(y_1, y_2, \ldots, y_n)$ given below from a random
variable Y with probability distribution function $f(y; \theta) = \theta^{y} e^{-\theta} / y!$.

You can enter the y values into R by using:

y = c(5,5,6,2,4,10,2,5,5,2,5,3,7,4,4,5,4,6,7,2,8,4,6,4,3, 6,6,6,5,7)

By assuming a prior distribution proportional to $e^{-\alpha\theta}$, we can show that the posterior distribution of θ is:

$f(\theta \mid y_1, y_2, \ldots, y_n) \propto \theta^{\sum y_i} e^{-(n + \alpha)\theta}$

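We can observe that the posterior distribution of θ is Gamma with shape parameter $\sum y_i + 1$ and rate parameter $n + \alpha$, since a Gamma(a, b) density is proportional to $\theta^{a-1} e^{-b\theta}$.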

(i) (a) Plot the posterior probability density function of θ for values of θ in the interval [3.2, 6.8] and assuming α
= 0.01.

[Hint: the range of values of θ can be obtained in R by seq(3.2, 6.8, by = 0.01).]

(b) Carry out a simulation of N = 5,000 posterior samples for the parameter θ. [8]

(ii) Plot the histogram of the posterior distribution of θ. [2]


(iii) Calculate the mean, median and standard deviation of the posterior distribution of θ. [3]

Two possible values for the true value of parameter θ are θ = 15 and θ = 5.
(iv) Comment on these two values based on the posterior distribution of θ plotted in part (ii) and summarised in
part (iii). [3] [Total 16]
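
A minimal sketch for Q3, offered as one possible approach. It takes the Gamma shape to be ∑yi + 1 and the rate to be n + α, which is what matching the kernel θ^(∑yi) e^(−(n + α)θ) against a Gamma(a, b) density proportional to θ^(a−1) e^(−bθ) gives. The seed and the plotting options are arbitrary choices made here.

```r
y <- c(5, 5, 6, 2, 4, 10, 2, 5, 5, 2, 5, 3, 7, 4, 4,
       5, 4, 6, 7, 2, 8, 4, 6, 4, 3, 6, 6, 6, 5, 7)
n <- length(y)
alpha <- 0.01

# Posterior Gamma parameters implied by the kernel
shape <- sum(y) + 1
rate  <- n + alpha

# (i)(a) posterior density over the interval [3.2, 6.8]
theta <- seq(3.2, 6.8, by = 0.01)
plot(theta, dgamma(theta, shape = shape, rate = rate), type = "l",
     xlab = "theta", ylab = "posterior density")

# (i)(b) N = 5,000 posterior samples
set.seed(1)   # arbitrary seed, only so the run can be repeated
N <- 5000
post <- rgamma(N, shape = shape, rate = rate)

# (ii) histogram of the sampled values
hist(post, xlab = "theta", main = "Posterior samples of theta")

# (iii) summaries of the simulated posterior
c(mean = mean(post), median = median(post), sd = sd(post))
```

Here ∑yi = 148, so the exact posterior mean is 149 / 30.01, about 4.97, and the simulated mean should be close to that.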
