
Bayesian Updating with Continuous Priors

Class 13, 18.05

Jeremy Orloff and Jonathan Bloom

1 Learning Goals

1. Understand a parameterized family of distributions as representing a continuous range of hypotheses for the observed data.

2. Be able to state Bayes’ theorem and the law of total probability for continuous densities.

3. Be able to apply Bayes’ theorem to update a prior probability density function to a posterior pdf given data and a likelihood function.

4. Be able to interpret and compute posterior predictive probabilities.

2 Introduction

Up to now we have only done Bayesian updating when we had a finite number of hypotheses,
e.g. our dice example had five hypotheses (4, 6, 8, 12 or 20 sides). Now we will study
Bayesian updating when there is a continuous range of hypotheses. The Bayesian update
process will be essentially the same as in the discrete case. As usual when moving from
discrete to continuous we will need to replace the probability mass function by a probability
density function, and sums by integrals.
The first few sections of this note are devoted to working with pdfs. In particular we will
cover the law of total probability and Bayes’ theorem. We encourage you to focus on how
these are essentially identical to the discrete versions. After that, we will apply Bayes’
theorem and the law of total probability to Bayesian updating.

3 Examples with continuous ranges of hypotheses

Here are three standard examples with continuous ranges of hypotheses.

Example 1. Suppose you have a system that can succeed or fail with probability p. Then
we can hypothesize that p is anywhere in the range [0, 1]. That is, we have a continuous
range of hypotheses. We will often model this example with a ‘bent’ coin with unknown
probability p of heads.

Example 2. The lifetime of a certain isotope is modeled by an exponential distribution exp(λ). In principle, the mean lifetime 1/λ can be any real number in (0, ∞).

Example 3. We are not restricted to a single parameter. In principle, the parameters µ and σ of a normal distribution can be any real numbers in (−∞, ∞) and (0, ∞), respectively.
If we model gestational length for single births by a normal distribution, then from millions
of data points we know that µ is about 40 weeks and σ is about one week.


In all of these examples we modeled the random process giving rise to the data by a distribution with parameters, called a parametrized distribution. Every possible choice of the parameter(s) is a hypothesis, e.g. we can hypothesize that the probability of success in Example 1 is p = 0.7313. We have a continuous set of hypotheses because we could take any value between 0 and 1.

4 Notational conventions

4.1 Parametrized models

As in the examples above, our hypotheses often take the form ‘a certain parameter has value θ’. We will often use the letter θ to stand for an arbitrary hypothesis. This will leave symbols like p, f, and x to take their usual meanings as pmf, pdf, and data. Also, rather than saying ‘the hypothesis that the parameter of interest has value θ’ we will simply say ‘the hypothesis θ’.

4.2 Big and little letters

We have two parallel notations for outcomes and probability:

1. (Big letters) Event A, probability function P(A).
2. (Little letters) Value x, pmf p(x) or pdf f(x).

These notations are related by P(X = x) = p(x), where x is a value of the discrete random variable X and ‘X = x’ is the corresponding event.

We carry these notations over to the probabilities used in Bayesian updating.

1. (Big letters) From hypotheses H and data D we compute several associated probabilities

P (H), P (D), P (H|D), P (D|H).

In the coin example we might have H = ‘the chosen coin has probability 0.6 of heads’, D
= ‘the flip was heads’, and P(D|H) = 0.6.
2. (Small letters) Hypothesis values θ and data values x both have probabilities or probability densities:
p(θ) p(x) p(θ|x) p(x|θ)
f (θ) f (x) f (θ|x) f (x|θ)
In the coin example we might have θ = 0.6 and x = 1, so p(x|θ) = 0.6. We might also write
p(x = 1|θ = 0.6) to emphasize the values of x and θ, but we will never just write p(1|0.6)
because it is unclear which value is x and which is θ.
Although we will still use both types of notation, from now on we will mostly use the small
letter notation involving pmfs and pdfs. Hypotheses will usually be parameters represented
by Greek letters (θ, λ, µ, σ, . . . ) while data values will usually be represented by English
letters (x, xi , y, . . . ).

5 Quick review of pdf and probability

Suppose X is a random variable with pdf f (x). Recall f (x) is a density; its units are
probability/(units of x).
[Figure: two graphs of the pdf f(x). On the left, the shaded area between c and d is P(c ≤ X ≤ d); on the right, a thin strip of width dx at x has area f(x) dx.]

The probability that the value of X is in [c, d] is given by

P(c ≤ X ≤ d) = ∫_c^d f(x) dx.

The probability that X is in an infinitesimal range dx around x is f (x) dx. In fact, the
integral formula is just the ‘sum’ of these infinitesimal probabilities. We can visualize these
probabilities by viewing the integral as area under the graph of f (x).
In order to manipulate probabilities instead of densities in what follows, we will make
frequent use of the notion that f (x) dx is the probability that X is in an infinitesimal range
around x of width dx. Please make sure that you fully understand this notion.
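Since everything that follows leans on this notion, here is a minimal numerical sketch (our own illustration, not part of the original notes; the density and grid size are just example choices) of ‘probability as a sum of f(x) dx terms’, using the density f(x) = 2x on [0, 1] that reappears in the coin examples below.

```python
import numpy as np

# Approximate P(0.5 <= X <= 1) for the pdf f(x) = 2x on [0, 1] by summing
# the infinitesimal probabilities f(x) dx over a fine grid.
f = lambda x: 2 * x
dx = 1e-5
x = np.arange(0.5, 1.0, dx)   # left endpoints of the small slices
prob = np.sum(f(x) * dx)      # Riemann sum approximating the integral
print(prob)                   # ~0.75, the area under f(x) between 0.5 and 1
```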

6 Continuous priors, discrete likelihoods

In the Bayesian framework we have probabilities of hypotheses –called prior and posterior
probabilities– and probabilities of data given a hypothesis –called likelihoods. In earlier
classes both the hypotheses and the data had discrete ranges of values. We saw in the
introduction that we might have a continuous range of hypotheses. The same is true for
the data, but for today we will assume that our data can only take a discrete set of values.
In this case, the likelihood of data x given hypothesis θ is written using a pmf: p(x|θ).
We will use the following coin example to explain these notions. We will carry this example
through in each of the succeeding sections.
Example 4. Suppose we have a bent coin with unknown probability θ of heads. The
value of θ is random and could be anywhere between 0 and 1. For this and the examples
that follow we’ll suppose that the value of θ follows a distribution with continuous prior
probability density f (θ) = 2θ. We have a discrete likelihood because tossing a coin has only
two outcomes, x = 1 for heads and x = 0 for tails.

p(x = 1|θ) = θ, p(x = 0|θ) = 1 − θ.

Think: This can be tricky to wrap your mind around. We have a coin with an unknown
probability θ of heads. The value of the parameter θ is itself random and has a prior pdf
f (θ). It may help to see that the discrete examples we did in previous classes are similar.
For example, we had a coin that might have probability of heads 0.5, 0.6, or 0.9. So,

we called our hypotheses H0.5 , H0.6 , H0.9 and these had prior probabilities P (H0.5 ) etc. In
other words, we had a coin with an unknown probability of heads, we had hypotheses about
that probability and each of these hypotheses had a prior probability.

7 The law of total probability

The law of total probability for continuous probability distributions is essentially the same
as for discrete distributions. We replace the prior pmf by a prior pdf and the sum by an
integral. We start by reviewing the law for the discrete case.
Recall that for a discrete set of hypotheses H1, H2, . . . , Hn the law of total probability says

P(D) = Σ_{i=1}^n P(D|Hi) P(Hi).     (1)

This is the total prior probability of D because we used the prior probabilities P(Hi).
In the little letter notation, with θ1, θ2, . . . , θn for hypotheses and x for data, the law of total probability is written

p(x) = Σ_{i=1}^n p(x|θi) p(θi).     (2)
We also called this the prior predictive probability of the outcome x to distinguish it from
the prior probability of the hypothesis θ.
Likewise, there is a law of total probability for continuous pdfs. We state it as a theorem
using little letter notation.
Theorem. Law of total probability. Suppose we have a continuous parameter θ in the range [a, b], and discrete random data x. Assume θ is itself random with density f(θ) and that x and θ have likelihood p(x|θ). In this case, the total probability of x is given by the formula

p(x) = ∫_a^b p(x|θ) f(θ) dθ.     (3)
Proof. Our proof will be by analogy to the discrete version: The probability term p(x|θ)f (θ) dθ
is perfectly analogous to the term p(x|θi )p(θi ) in Equation 2 (or the term P (D|Hi )P (Hi )
in Equation 1). Continuing the analogy: the sum in Equation 2 becomes the integral in
Equation 3.
As in the discrete case, when we think of θ as a hypothesis explaining the probability of the
data we call p(x) the prior predictive probability for x.
Example 5. (Law of total probability.) Continuing with Example 4. We have a bent coin
with probability θ of heads. The value of θ is random with prior pdf f (θ) = 2θ on [0, 1].
Suppose I flip the coin once. What is the total probability of heads?
answer: In Example 4 we noted that the likelihoods are p(x = 1|θ) = θ and p(x = 0|θ) =
1 − θ. So the total probability of x = 1 is
p(x = 1) = ∫_0^1 p(x = 1|θ) f(θ) dθ = ∫_0^1 θ · 2θ dθ = ∫_0^1 2θ² dθ = 2/3.
Since the prior is weighted towards higher probabilities of heads, so is the total probability.
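For readers who like a numerical check, here is a small sketch (ours, with an arbitrarily chosen grid size) that approximates the integral in Example 5 by the corresponding Riemann sum.

```python
import numpy as np

# Approximate p(x = 1) = integral over [0, 1] of p(x = 1 | θ) f(θ) dθ,
# with prior f(θ) = 2θ and likelihood p(x = 1 | θ) = θ.
dtheta = 1e-5
theta = np.arange(0, 1, dtheta) + dtheta / 2   # midpoints of the small slices
prior = 2 * theta                              # f(θ) = 2θ
likelihood = theta                             # p(x = 1 | θ) = θ
p_heads = np.sum(likelihood * prior * dtheta)  # sum of p(x|θ) f(θ) dθ terms
print(p_heads)                                 # ~0.6667, i.e. 2/3
```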


8 Bayes’ theorem for continuous probability densities

The statement of Bayes’ theorem for continuous pdfs is essentially identical to the statement
for pmfs. We state it including dθ so we have genuine probabilities:
Theorem. Bayes’ Theorem. Use the same assumptions as in the law of total probability,
i.e. θ is a continuous parameter with pdf f (θ) and range [a, b]; x is random discrete data;
together they have likelihood p(x|θ). With these assumptions:
f(θ|x) dθ = p(x|θ) f(θ) dθ / p(x) = p(x|θ) f(θ) dθ / ∫_a^b p(x|θ) f(θ) dθ.     (4)

Proof. Since this is a statement about probabilities it is just the usual statement of Bayes’
theorem. This is important enough to warrant spelling it out in words: Let Θ be the random
variable that produces the value θ. Consider the events

H = ‘Θ is in an interval of width dθ around the value θ’

and
D = ‘the value of the data is x’.
Then P (H) = f (θ) dθ, P (D) = p(x), and P (D|H) = p(x|θ). Now our usual form of Bayes’
theorem becomes
f(θ|x) dθ = P(H|D) = P(D|H) P(H) / P(D) = p(x|θ) f(θ) dθ / p(x).
Looking at the first and last terms in this equation we see the new form of Bayes’ theorem.
Finally, we firmly believe that it is more conducive to careful thinking about probability to keep the factor of dθ in the statement of Bayes’ theorem. But because it appears in the numerator on both sides of Equation 4, many people drop the dθ and write Bayes’ theorem in terms of densities as

f(θ|x) = p(x|θ) f(θ) / p(x) = p(x|θ) f(θ) / ∫_a^b p(x|θ) f(θ) dθ.
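For instance, with the prior f(θ) = 2θ from Example 4 and one observed head (x = 1), the density form gives f(θ|x = 1) = θ · 2θ / (2/3) = 3θ², which is exactly the posterior computed in Example 6 in the next section.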

9 Bayesian updating with continuous priors

Now that we have Bayes’ theorem and the law of total probability we can finally get to
Bayesian updating. Before continuing with Example 4, we point out two features of the
Bayesian updating table that appears in the next example:
1. The table for continuous priors is very simple: since we cannot have a row for each of
an infinite number of hypotheses, we’ll have just one row, which uses the variable θ to stand for all hypotheses.
2. By including dθ, all the entries in the table are probabilities and all our usual probability
rules apply.
Example 6. (Bayesian updating.) Continuing Examples 4 and 5. We have a bent coin
with unknown probability θ of heads. The value of θ is random with prior pdf f (θ) = 2θ.
Suppose we flip the coin once and get heads. Compute the posterior pdf for θ.

answer: We make an update table with the usual columns. Since this is our first example
the first row is the abstract version of Bayesian updating in general and the second row is
Bayesian updating for this particular example.
hypothesis | prior | likelihood | Bayes numerator | posterior
θ | f(θ) dθ | p(x = 1|θ) | p(x = 1|θ) f(θ) dθ | f(θ|x = 1) dθ
θ | 2θ dθ | θ | 2θ² dθ | 3θ² dθ
total | ∫_a^b f(θ) dθ = 1 | | p(x = 1) = ∫_0^1 2θ² dθ = 2/3 | 1

Therefore the posterior pdf (after seeing 1 heads) is f(θ|x) = 3θ².


We have a number of comments:
1. Since we used the prior probability f(θ) dθ, the hypothesis should really have been ‘the unknown parameter is in an interval of width dθ around θ’. Even for us that is too much to write, so you will have to think it every time we write that the hypothesis is θ.

2. The posterior pdf for θ is found by removing the dθ from the posterior probability in the table:

f(θ|x) = 3θ².

3. (i) As always, p(x) is the total probability. Since we have a continuous distribution, we compute an integral instead of a sum.
(ii) Notice that by including dθ in the table, it is clear what integral we need to compute
to find the total probability p(x).
4. The table organizes the continuous version of Bayes’ theorem. Namely, the posterior pdf
is related to the prior pdf and likelihood function via:
f(θ|x) dθ = p(x|θ) f(θ) dθ / ∫_a^b p(x|θ) f(θ) dθ = p(x|θ) f(θ) dθ / p(x).

Removing the dθ in the numerator of both sides we have the statement in terms of densities.

5. Regarding both sides as functions of θ, we can again express Bayes’ theorem in the form:

f (θ|x) ∝ p(x|θ) · f (θ)

posterior ∝ likelihood × prior.
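The proportionality form also suggests a simple computational recipe. The following sketch (our own illustration, not part of the notes; the grid size is an arbitrary choice) carries out Example 6 on a fine grid of θ values: multiply prior by likelihood, normalize, and recover the posterior density 3θ².

```python
import numpy as np

dtheta = 1e-4
theta = np.arange(dtheta / 2, 1, dtheta)       # midpoints of small slices of [0, 1]
prior = 2 * theta                              # prior pdf f(θ) = 2θ
likelihood = theta                             # p(x = 1 | θ) = θ, one observed head

bayes_numerator = likelihood * prior * dtheta  # p(x|θ) f(θ) dθ for each slice
p_x = bayes_numerator.sum()                    # total probability p(x = 1)
posterior = bayes_numerator / p_x              # posterior probabilities (sum to 1)
posterior_pdf = posterior / dtheta             # divide out dθ to get a density

print(p_x)                                            # ~0.6667 = 2/3
print(np.max(np.abs(posterior_pdf - 3 * theta**2)))   # tiny: matches f(θ|x) = 3θ²
```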

9.1 Flat priors

One important prior is called a flat or uniform prior. A flat prior assumes that every
hypothesis is equally probable. For example, if θ has range [0, 1] then f (θ) = 1 is a flat
prior.
Example 7. (Flat priors.) We have a bent coin with unknown probability θ of heads.
Suppose we toss it once and get tails. Assume a flat prior and find the posterior probability
for θ.

answer: This is just Example 6 with a change of prior and likelihood.

hypothesis | prior | likelihood | Bayes numerator | posterior
θ | f(θ) dθ | p(x = 0|θ) | p(x = 0|θ) f(θ) dθ | f(θ|x = 0) dθ
θ | 1 · dθ | 1 − θ | (1 − θ) dθ | 2(1 − θ) dθ
total | ∫_a^b f(θ) dθ = 1 | | p(x = 0) = ∫_0^1 (1 − θ) dθ = 1/2 | 1

So the posterior pdf is f(θ|x = 0) = 2(1 − θ).

9.2 Using the posterior pdf

Example 8. In the previous example the prior probability was flat. First show that this means that a priori the coin is equally likely to be biased towards heads or tails. Then, after observing one heads, what is the (posterior) probability that the coin is biased towards heads?
answer: Since the parameter θ is the probability the coin lands heads, the first part of the problem asks us to show P(θ > 0.5) = 0.5 and the second part asks for P(θ > 0.5 | x = 1). Updating the flat prior on one observed head (just as in Example 7, but with likelihood θ instead of 1 − θ) gives the posterior pdf f(θ|x = 1) = 2θ. Both probabilities are easily computed from the prior and posterior pdfs respectively.
The prior probability that the coin is biased towards heads is
P(θ > 0.5) = ∫_{0.5}^1 f(θ) dθ = ∫_{0.5}^1 1 · dθ = θ |_{0.5}^1 = 1/2.

A probability of 1/2 means the coin is equally likely to be biased toward heads or tails. The posterior probability that it’s biased towards heads is
P(θ > 0.5 | x = 1) = ∫_{0.5}^1 f(θ|x = 1) dθ = ∫_{0.5}^1 2θ dθ = θ² |_{0.5}^1 = 3/4.
We see that observing one heads has increased the probability that the coin is biased towards
heads from 1/2 to 3/4.
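A quick numerical check of Example 8 (our own sketch, with an assumed grid size) uses the same grid idea for both probabilities.

```python
import numpy as np

dtheta = 1e-5
theta = np.arange(0.5 + dtheta / 2, 1, dtheta)      # grid over (0.5, 1)
prior_prob = np.sum(np.ones_like(theta) * dtheta)   # ~1/2: integral of the flat prior
posterior_prob = np.sum(2 * theta * dtheta)         # ~3/4: integral of the posterior 2θ
print(prior_prob, posterior_prob)
```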

10 Predictive probabilities

Just as in the discrete case we are also interested in using the posterior probabilities of the
hypotheses to make predictions for what will happen next.
Example 9. (Prior and posterior prediction.) Continuing Examples 4, 5, 6: we have a
coin with unknown probability θ of heads and the value of θ has prior pdf f (θ) = 2θ. Find
the prior predictive probability of heads. Then suppose the first flip was heads and find the
posterior predictive probabilities of both heads and tails on the second flip.
answer: For notation let x1 be the result of the first flip and let x2 be the result of the
second flip. The prior predictive probability is exactly the total probability computed in
Examples 5 and 6.
p(x1 = 1) = ∫_0^1 p(x1 = 1|θ) f(θ) dθ = ∫_0^1 2θ² dθ = 2/3.

The posterior predictive probabilities are the total probabilities computed using the poste­
rior pdf. From Example 6 we know the posterior pdf is f (θ|x1 = 1) = 3θ2 . So the posterior
predictive probabilities are
p(x2 = 1|x1 = 1) = ∫_0^1 p(x2 = 1|θ, x1 = 1) f(θ|x1 = 1) dθ = ∫_0^1 θ · 3θ² dθ = 3/4,

p(x2 = 0|x1 = 1) = ∫_0^1 p(x2 = 0|θ, x1 = 1) f(θ|x1 = 1) dθ = ∫_0^1 (1 − θ) · 3θ² dθ = 1/4.

(More simply, we could have computed p(x2 = 0|x1 = 1) = 1 − p(x2 = 1|x1 = 1) = 1/4.)
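Here is a small sketch (ours, with an assumed grid size) checking the posterior predictive probabilities against the posterior pdf f(θ|x1 = 1) = 3θ².

```python
import numpy as np

dtheta = 1e-5
theta = np.arange(dtheta / 2, 1, dtheta)
posterior = 3 * theta**2                                  # posterior pdf after one head
p_heads_next = np.sum(theta * posterior * dtheta)         # integral of θ · 3θ² dθ = 3/4
p_tails_next = np.sum((1 - theta) * posterior * dtheta)   # integral of (1 − θ) · 3θ² dθ = 1/4
print(p_heads_next, p_tails_next)                         # ~0.75 and ~0.25
```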

11 From discrete to continuous Bayesian updating

To develop intuition for the transition from discrete to continuous Bayesian updating, we’ll
walk a familiar road from calculus. Namely we will:
(i) approximate the continuous range of hypotheses by a finite number.
(ii) create the discrete updating table for the finite number of hypotheses.
(iii) consider how the table changes as the number of hypotheses goes to infinity.
In this way, we will see the prior and posterior pmfs converge to the prior and posterior pdfs.
Example 10. To keep things concrete, we will work with the ‘bent’ coin with a flat prior f(θ) = 1 from Example 7. Our goal is to go from discrete to continuous by increasing the number of hypotheses.

4 hypotheses. We slice [0, 1] into 4 equal intervals: [0, 1/4], [1/4, 1/2], [1/2, 3/4], [3/4, 1].

Each slice has width Δθ = 1/4. We put our 4 hypotheses θi at the centers of the four slices:

θ1 : ‘θ = 1/8’, θ2 : ‘θ = 3/8’, θ3 : ‘θ = 5/8’, θ4 : ‘θ = 7/8’.


The flat prior gives each hypothesis a probability of 1/4 = 1 · Δθ. We have the table:
hypothesis | prior | likelihood | Bayes num. | posterior
θ = 1/8 | 1/4 | 1/8 | (1/4) × (1/8) | 1/16
θ = 3/8 | 1/4 | 3/8 | (1/4) × (3/8) | 3/16
θ = 5/8 | 1/4 | 5/8 | (1/4) × (5/8) | 5/16
θ = 7/8 | 1/4 | 7/8 | (1/4) × (7/8) | 7/16
Total | 1 | – | Σ_{i=1}^n θi Δθ | 1

Here are the density histograms of the prior and posterior pmf. The prior and posterior
pdfs from Example 7 are superimposed on the histograms in red.

[Figure: density histograms of the prior (left) and posterior (right) pmfs over the 4 hypotheses θ = 1/8, 3/8, 5/8, 7/8, with the continuous prior and posterior pdfs superimposed in red.]

8 hypotheses. Next we slice [0,1] into 8 intervals each of width Δθ = 1/8 and use the
center of each slice for our 8 hypotheses θi .
θ1 : ’θ = 1/16’, θ2 : ’θ = 3/16’, θ3 : ’θ = 5/16’, θ4 : ’θ = 7/16’
θ5 : ’θ = 9/16’, θ6 : ’θ = 11/16’, θ7 : ’θ = 13/16’, θ8 : ’θ = 15/16’
The flat prior gives each hypothesis the probability 1/8 = 1 · Δθ. Here are the table and density histograms.
hypothesis | prior | likelihood | Bayes num. | posterior
θ = 1/16 | 1/8 | 1/16 | (1/8) × (1/16) | 1/64
θ = 3/16 | 1/8 | 3/16 | (1/8) × (3/16) | 3/64
θ = 5/16 | 1/8 | 5/16 | (1/8) × (5/16) | 5/64
θ = 7/16 | 1/8 | 7/16 | (1/8) × (7/16) | 7/64
θ = 9/16 | 1/8 | 9/16 | (1/8) × (9/16) | 9/64
θ = 11/16 | 1/8 | 11/16 | (1/8) × (11/16) | 11/64
θ = 13/16 | 1/8 | 13/16 | (1/8) × (13/16) | 13/64
θ = 15/16 | 1/8 | 15/16 | (1/8) × (15/16) | 15/64
Total | 1 | – | Σ_{i=1}^n θi Δθ | 1

[Figure: density histograms of the prior (left) and posterior (right) pmfs over the 8 hypotheses θ = 1/16, 3/16, . . . , 15/16, with the continuous pdfs superimposed in red.]

20 hypotheses. Finally we slice [0,1] into 20 pieces. This is essentially identical to the
previous two cases. Let’s skip right to the density histograms.
[Figure: density histograms of the prior (left) and posterior (right) pmfs for 20 hypotheses, with the continuous pdfs superimposed in red.]
Looking at the sequence of plots we see how the prior and posterior density histograms
converge to the prior and posterior probability density functions.
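The discretization in this section is easy to automate. The sketch below (our own illustration; the function name and the choice of n are ours) reproduces the 4-hypothesis table above and shows that the posterior density-histogram heights lie on the line 2θ, the posterior pdf for a flat prior updated on one observed head.

```python
import numpy as np

def discrete_update(n):
    # Discrete Bayesian update for the flat-prior bent coin after one observed head.
    dtheta = 1.0 / n
    theta = (np.arange(n) + 0.5) * dtheta    # slice midpoints: 1/8, 3/8, ... when n = 4
    prior = np.full(n, dtheta)               # flat prior: each hypothesis gets 1 · Δθ
    likelihood = theta                       # p(x = 1 | θ) = θ
    numerator = likelihood * prior           # Bayes numerators p(x|θ) f(θ) Δθ
    posterior = numerator / numerator.sum()  # posterior pmf over the n hypotheses
    return theta, posterior

theta, posterior = discrete_update(4)
print(posterior)             # [0.0625 0.1875 0.3125 0.4375] = 1/16, 3/16, 5/16, 7/16
print(posterior / (1 / 4))   # density heights 0.25, 0.75, 1.25, 1.75, i.e. 2θ at the midpoints
```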
MIT OpenCourseWare
https://ocw.mit.edu

18.05 Introduction to Probability and Statistics


Spring 2014

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
