Subject CT6 — 2015

UNIT 5 — BAYESIAN STATISTICS

Syllabus objectives

1. Use Bayes' Theorem to calculate simple conditional probabilities.
2. Explain what is meant by a prior distribution, a posterior distribution and a conjugate prior distribution.
3. Derive the posterior distribution for a parameter in simple cases.
4. Explain what is meant by a loss function. Use simple loss functions to derive Bayesian estimates of parameters.

0 Introduction

The Bayesian philosophy involves a completely different approach to statistics. The Bayesian version of estimation is considered here for the basic situation concerning the estimation of a parameter given a random sample from a particular distribution. Classical estimation involves the method of maximum likelihood.

The fundamental difference between Bayesian and classical methods is that the parameter θ is considered to be a random variable in Bayesian methods. In classical statistics θ is a fixed but unknown quantity. This leads to difficulties such as the careful interpretation required for classical confidence intervals, where it is the interval that is random. As soon as the data are observed and a numerical interval is calculated, there is no probability involved: a statement such as P(10.45 < θ < 13.26) = 0.95 cannot be made because θ is not a random variable. In Bayesian statistics no such difficulties arise, and probability statements can be made concerning the values of a parameter θ.

1 Bayes' Theorem

If B_1, B_2, ..., B_n constitute a partition of a sample space S and P(B_i) ≠ 0 for i = 1, 2, ..., n, then for any event A in S such that P(A) ≠ 0:

    P(B_j \mid A) = \frac{P(A \mid B_j) \, P(B_j)}{P(A)}, \qquad j = 1, 2, \ldots, n,

where

    P(A) = \sum_{i=1}^{n} P(A \mid B_i) \, P(B_i).

1.1 An example

Three manufacturers supply clothing to a retailer. 60% of the stock comes from manufacturer 1, 30% from manufacturer 2 and 10% from manufacturer 3. 10% of the clothing from manufacturer 1 is faulty, 5% from manufacturer 2 is faulty and 15% from manufacturer 3 is faulty. What is the probability that a faulty garment comes from manufacturer 3?

Solution

Let A be the event that a garment is faulty, and let B_j be the event that the garment comes from manufacturer j. Substituting the figures into the formula for Bayes' Theorem:

    P(B_3 \mid A) = \frac{0.15 \times 0.1}{(0.1 \times 0.6) + (0.05 \times 0.3) + (0.15 \times 0.1)} = \frac{0.015}{0.09} = 0.1667.

Although manufacturer 3 supplies only 10% of the garments to the retailer, nearly 17% of the faulty garments come from that manufacturer.
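The arithmetic above is straightforward to check numerically. Below is a minimal Python sketch (not part of the original unit) applying Bayes' Theorem to the manufacturer figures; the variable names are my own.

```python
# Bayes' Theorem for the clothing example: P(B_j | A) = P(A | B_j) P(B_j) / P(A)

priors = [0.60, 0.30, 0.10]        # P(B_j): share of stock from manufacturer j
fault_rates = [0.10, 0.05, 0.15]   # P(A | B_j): fault rate for manufacturer j

# Law of total probability: P(A) = sum_i P(A | B_i) P(B_i)
p_faulty = sum(p * f for p, f in zip(priors, fault_rates))

# Posterior probabilities P(B_j | A)
posteriors = [p * f / p_faulty for p, f in zip(priors, fault_rates)]

print(p_faulty)       # 0.09
print(posteriors[2])  # 0.1667 (manufacturer 3)
```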
2 Prior and posterior distributions

Suppose X = (X_1, X_2, ..., X_n) is a random sample from a population specified by the density or probability function f(x; θ), and it is required to estimate θ.

As a result of the parameter θ being a random variable, it will have a distribution. This allows the use of any knowledge available about possible values for θ before the collection of any data. This knowledge is quantified by expressing it as the prior distribution of θ. Then, after collecting appropriate data, the posterior distribution of θ is determined, and this forms the basis of all inference concerning θ.

2.1 Notation

As θ is a random variable, it should really be denoted by the capital Θ, and its prior density written as f_Θ(θ). However, for simplicity no distinction will be made between Θ and θ, and the density will simply be denoted by f(θ). Note that referring to a density here implies that θ is continuous. In most applications this will be the case, as even when X is discrete (like the binomial or Poisson), the parameter (p or λ) will vary in a continuous space ((0, 1) or (0, ∞), respectively).

Also, the population density or probability function will be denoted by f(x | θ) rather than the earlier f(x; θ), as it represents the conditional distribution of X given θ.

2.2 Determining the posterior density

Suppose that X is a random sample from a population specified by f(x | θ) and that θ has the prior density f(θ).

The posterior density of θ | X is determined by applying the basic definition of a conditional density:

    f(\theta \mid x) = \frac{f(x \mid \theta) \, f(\theta)}{f(x)}

Note that f(x) = \int f(x \mid \theta) \, f(\theta) \, d\theta. This result is like a continuous version of Bayes' Theorem from basic probability.

A useful way of expressing the posterior density is to use proportionality. f(x) does not involve θ and is just the constant needed to make the posterior a proper density that integrates to unity, so

    f(\theta \mid x) \propto f(x \mid \theta) \, f(\theta).

Also note that f(x | θ), being the joint density of the sample values, is none other than the likelihood. So the posterior is proportional to the product of the likelihood and the prior.

For a given likelihood, if the prior distribution leads to a posterior distribution belonging to the same family as the prior distribution, then this prior is called the conjugate prior for this likelihood.

3 The loss function

To obtain an estimator of θ, a loss function must first be specified. This is a measure of the "loss" incurred when g(x) is used as an estimator of θ. A loss function is sought which is zero when the estimation is exactly correct, that is, g(x) = θ, and which is positive and does not decrease as g(x) gets further away from θ. The Bayesian estimator is then the g(x) which minimises the expected loss with respect to the posterior distribution.

There is one very commonly used loss function, called quadratic or squared error loss; two others are also used in practice.

The main loss function is quadratic loss, defined by

    L(g(x), \theta) = (g(x) - \theta)^2,

and it is related to mean square error from classical statistics.

A second loss function is absolute error loss, defined by

    L(g(x), \theta) = |g(x) - \theta|.

A third loss function is "0/1" or "all-or-nothing" loss, defined by

    L(g(x), \theta) = \begin{cases} 0 & \text{if } g(x) = \theta \\ 1 & \text{if } g(x) \neq \theta. \end{cases}

The Bayesian estimator that arises by minimising the expected loss for each of these loss functions in turn is the mean, median and mode, respectively, of the posterior distribution, each of which is a measure of location of the posterior distribution.

The expected posterior loss is

    EPL = E[L(g(x), \theta)] = \int L(g(x), \theta) \, f(\theta \mid x) \, d\theta.

3.1 Quadratic loss

For simplicity, g will be written instead of g(x). So

    EPL = \int (g - \theta)^2 f(\theta \mid x) \, d\theta

    \frac{d \, EPL}{dg} = 2 \int (g - \theta) f(\theta \mid x) \, d\theta.

Equating to zero:

    g \int f(\theta \mid x) \, d\theta = \int \theta f(\theta \mid x) \, d\theta,

but \int f(\theta \mid x) \, d\theta = 1, so

    g = \int \theta f(\theta \mid x) \, d\theta = E(\theta \mid x).

Clearly this minimises EPL, since the second derivative, 2 \int f(\theta \mid x) \, d\theta = 2, is positive.

∴ The Bayesian estimator under quadratic loss is the mean of the posterior distribution.
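The proportionality result and the quadratic-loss conclusion can be illustrated numerically with a simple grid approximation. The sketch below is my own illustration, not part of the unit; it assumes a Poisson sample with a Gamma prior (both choices are arbitrary) and checks that the g minimising the expected posterior loss coincides with the posterior mean.

```python
# Grid approximation: posterior ∝ likelihood × prior, normalised so it integrates
# to one, then the expected posterior loss EPL(g) is minimised over the grid.
import numpy as np
from scipy import stats

x = np.array([3, 1, 4, 2, 2])           # illustrative Poisson(theta) sample
theta = np.linspace(0.01, 10, 2001)     # grid over the parameter space

prior = stats.gamma.pdf(theta, a=2, scale=1.0)                       # f(theta)
likelihood = np.prod(stats.poisson.pmf(x[:, None], theta), axis=0)   # f(x | theta)

posterior = likelihood * prior
posterior /= np.trapz(posterior, theta)   # dividing by f(x) makes it a proper density

post_mean = np.trapz(theta * posterior, theta)   # E(theta | x)

# EPL(g) = ∫ (g - theta)^2 f(theta | x) dtheta, minimised over the grid
epl = [np.trapz((g - theta) ** 2 * posterior, theta) for g in theta]
g_best = theta[np.argmin(epl)]

print(post_mean, g_best)   # the two agree, up to the grid resolution
```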
3.2 Absolute error loss

Again, g will be written instead of g(x). So

    EPL = \int |g - \theta| \, f(\theta \mid x) \, d\theta.

Assuming the range for θ is (−∞, ∞), then

    EPL = \int_{-\infty}^{g} (g - \theta) f(\theta \mid x) \, d\theta + \int_{g}^{\infty} (\theta - g) f(\theta \mid x) \, d\theta

    \frac{d \, EPL}{dg} = \int_{-\infty}^{g} f(\theta \mid x) \, d\theta - \int_{g}^{\infty} f(\theta \mid x) \, d\theta.

[Recall that \frac{d}{dy} \int_{a}^{b(y)} F(x, y) \, dx = F(b(y), y) \, b'(y) + \int_{a}^{b(y)} \frac{\partial}{\partial y} F(x, y) \, dx; here the boundary terms vanish because the integrands (g − θ) and (θ − g) are zero at θ = g.]

Equating to zero:

    \int_{-\infty}^{g} f(\theta \mid x) \, d\theta = \int_{g}^{\infty} f(\theta \mid x) \, d\theta,

that is, P(θ < g | x) = P(θ > g | x) = ½.

∴ The Bayesian estimator under absolute error loss is the median of the posterior distribution.

3.3 "0/1" loss

Consider the loss function

    L(g, \theta) = \begin{cases} 1 & \text{if } |g - \theta| > \varepsilon \\ 0 & \text{if } |g - \theta| \leq \varepsilon \end{cases}

for small ε > 0; as ε → 0, this tends to the required loss function. Then

    EPL = 1 - \int_{g - \varepsilon}^{g + \varepsilon} f(\theta \mid x) \, d\theta \approx 1 - 2\varepsilon \, f(g \mid x)

for small ε, which is minimised by taking g to be the mode of f(θ | x).

∴ The Bayesian estimator under "0/1" loss is the mode of the posterior distribution.

3.4 An example

For the estimation of a binomial probability θ from a single observation X, with the prior distribution of θ being beta with parameters α and β, investigate the form of the posterior distribution of θ and determine the Bayesian estimator of θ under quadratic loss.

Solution

The proportionality argument can be used, and any constants simply omitted as appropriate.

Prior:

    f(\theta) \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1},

omitting the constant \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}.

Likelihood:

    f(x \mid \theta) \propto \theta^{x} (1 - \theta)^{n - x},

omitting the constant \binom{n}{x}.

Posterior:

    f(\theta \mid x) \propto \theta^{x} (1 - \theta)^{n - x} \cdot \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} = \theta^{x + \alpha - 1} (1 - \theta)^{n - x + \beta - 1}.

Now it can be seen that, apart from the appropriate constant of proportionality, this is the density of a beta random variable. Therefore the immediate conclusion is that the posterior distribution of θ given X is beta with parameters (x + α) and (n − x + β).

It can also be seen that the posterior density and the prior density belong to the same family of distributions. Thus the conjugate prior for the binomial distribution is the beta distribution.

The Bayesian estimator under quadratic loss is the mean of this distribution, that is,

    E(\theta \mid x) = \frac{x + \alpha}{(x + \alpha) + (n - x + \beta)} = \frac{x + \alpha}{n + \alpha + \beta}.
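The conjugate beta-binomial update in the example is easy to verify with scipy. The sketch below is illustrative rather than part of the unit; the numbers α = 2, β = 3, n = 20, x = 7 are made up, and it also reports the posterior median and mode, i.e. the Bayesian estimators under absolute error and "0/1" loss.

```python
# Beta prior + binomial likelihood -> Beta(x + alpha, n - x + beta) posterior.
from scipy import stats

alpha, beta = 2.0, 3.0   # prior: theta ~ Beta(alpha, beta)
n, x = 20, 7             # data: x successes in n trials

a_post = x + alpha       # posterior parameters, by conjugacy
b_post = n - x + beta

posterior = stats.beta(a_post, b_post)

mean = posterior.mean()                       # quadratic loss estimator
median = posterior.median()                   # absolute error loss estimator
mode = (a_post - 1) / (a_post + b_post - 2)   # "0/1" loss (valid for a_post, b_post > 1)

print(mean, (x + alpha) / (n + alpha + beta))  # both give 0.36
print(median, mode)
```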
