
CSC535: Probabilistic Graphical Models

Bayesian Probability and Statistics

Prof. Jason Pacheco


Why Graphical Models?

Data elements often have dependence arising from structure

Examples: Pose Estimation, Protein Structure

Exploit structure to simplify representation and computation


Why “Probabilistic”?

Stochastic processes have many sources of uncertainty

Randomness enters at every stage: the state of nature, the process, and the measurement
PGMs let us represent and reason about these in structured ways
What is Probability?

What does it mean that the probability of heads is ½ ?

Two schools of thought…


Frequentist Perspective
Proportion of successes (heads) in repeated
trials (coin tosses)

Bayesian Perspective
Belief of outcomes based on assumptions
about nature and the physics of coin flips

Neither is better/worse, but we can compare interpretations…


Administrivia

• HW1 due 11:59pm tonight


• Will accept submissions through Friday, -0.5pts per day late
• HW only worth 4pts so maximum score on Friday is 75%
• Late policy only applies to this HW
Frequentist & Bayesian Modeling

We will use the following notation throughout:


θ - unknown quantity (e.g. coin bias)        X - observed data

Frequentist (Conditional Model): p(X | θ)
• θ is a non-random unknown parameter
• p(X | θ) is the sampling / data-generating distribution

Bayesian (Generative Model): p(X | θ) p(θ), i.e. likelihood × prior belief
• θ is a random variable (latent)
• Requires specifying the prior belief p(θ)
Frequentist Inference

Example: Suppose we observe the outcome of N coin flips, X1, …, XN ∈ {0, 1}. What is the probability of heads θ (the coin bias)?

• Coin bias is not random (e.g. there is some true value)


• Uncertainty reported as confidence interval (typically 95%)
Correct Interpretation: Over repeated trials of N coin flips, the true θ will fall inside the confidence interval 95% of the time (in the limit)

• Inferences are valid over multiple trials, never for a single trial


Wrong Interpretation: For this trial there is a 95% chance that θ falls in the confidence interval
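To make the repeated-trials interpretation concrete, here is a small simulation sketch (not from the slides; the true bias, number of flips, and use of a Wald interval are assumptions for illustration):

```python
# Sketch: simulate repeated trials of N coin flips and check how often a 95%
# confidence interval covers the true (fixed, non-random) bias.
import numpy as np

rng = np.random.default_rng(0)
theta_true = 0.5    # true coin bias (assumed for the simulation)
N = 100             # flips per trial
trials = 10_000

covered = 0
for _ in range(trials):
    flips = rng.random(N) < theta_true
    theta_hat = flips.mean()
    # Wald (normal-approximation) 95% confidence interval
    se = np.sqrt(theta_hat * (1 - theta_hat) / N)
    lo, hi = theta_hat - 1.96 * se, theta_hat + 1.96 * se
    covered += (lo <= theta_true <= hi)

print(f"Coverage over {trials} trials: {covered / trials:.3f}")   # ~0.95
```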
Bayesian Inference

The posterior distribution is a complete representation of uncertainty


Posterior computed by Bayes' rule:

  p(θ | X) = p(X | θ) p(θ) / p(X)

where p(θ) is the prior belief, p(X | θ) is the likelihood, and p(X) is the marginal likelihood (more on this later)
• Must specify a prior belief p(θ) about the coin bias
• Coin bias θ is a random quantity
• A credible interval can be reported in lieu of the full posterior, and takes an intuitive interpretation for a single trial

Interval Interpretation: For this trial there is a 95% chance that θ lies in the interval
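As a concrete sketch (an assumed example, not from the slides): with a Beta prior on the coin bias and a Bernoulli likelihood, the posterior is another Beta distribution and Bayes' rule reduces to a simple count update:

```python
# Sketch: Beta prior on the coin bias with a Bernoulli likelihood gives a
# Beta posterior (conjugacy). Prior and data values are assumed.
from scipy import stats

a0, b0 = 2.0, 2.0      # Beta(2, 2) prior belief about the bias (assumed)
heads, tails = 7, 3    # observed data from N = 10 flips (assumed)

# Conjugate update: posterior is Beta(a0 + heads, b0 + tails)
posterior = stats.beta(a0 + heads, b0 + tails)

print("Posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```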
Bayesian Inference Example
About 29% of American adults have
high blood pressure (BP). Home test
has 30% false positive rate and no
false negative error.

A recent home test states that you have high


BP. Should you start medication?
Bayesian Inference Example
About 29% of American adults have
high blood pressure (BP). Home test
has 30% false positive rate and no
false negative error.

• Latent quantity of interest is hypertension: θ ∈ {0, 1}
• Measurement of hypertension: home test result X ∈ {0, 1}
• Prior: p(θ = 1) = 0.29
• Likelihood: p(X = 1 | θ = 0) = 0.3 (false positive rate), p(X = 0 | θ = 1) = 0 (no false negatives)
Bayesian Inference Example
About 29% of American adults have
high blood pressure (BP). Home test
has 30% false positive rate and no
false negative error.

Suppose we get a positive measurement X = 1; then the posterior is:

  p(θ = 1 | X = 1) = p(X = 1 | θ = 1) p(θ = 1) / p(X = 1)
                   = (1.0 × 0.29) / (1.0 × 0.29 + 0.3 × 0.71) ≈ 0.58

What conclusions can be drawn from this calculation?
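The same calculation as a short script, using only the numbers stated on the slide:

```python
# Sketch: Bayes' rule for the blood-pressure example.
prior_bp = 0.29            # p(high BP)
p_pos_given_bp = 1.0       # no false negatives
p_pos_given_no_bp = 0.3    # 30% false positive rate

# Marginal likelihood of a positive test
p_pos = p_pos_given_bp * prior_bp + p_pos_given_no_bp * (1 - prior_bp)

posterior_bp = p_pos_given_bp * prior_bp / p_pos
print(f"p(high BP | positive test) = {posterior_bp:.3f}")   # ~0.577
```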


Marginal Likelihood

Posterior calculation requires the marginal likelihood, p(X) = ∫ p(X | θ) p(θ) dθ (a sum when θ is discrete)

• Also called the partition function or evidence


• Key quantity for model learning and selection
• NP-hard to compute in general (actually #P-hard)

Example: Consider a binary latent vector θ = (θ1, …, θN) with θi ∈ {0, 1}; the marginal likelihood p(X) = Σ_θ p(X | θ) p(θ) then sums over 2^N configurations
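A minimal sketch of why this gets expensive (the toy model is assumed for illustration, not from the slides): computing the exact marginal likelihood over a binary latent vector means enumerating all 2^N configurations:

```python
# Sketch: brute-force marginal likelihood p(X) = sum_theta p(X | theta) p(theta)
# over every binary configuration of a latent vector theta (assumed toy model).
import itertools
import numpy as np

rng = np.random.default_rng(0)
N = 12                                   # latent dimension (2^12 = 4096 terms)
X = rng.integers(0, 2, size=N)           # toy binary observation

def log_prior(theta):
    # Assumed prior: each theta_i is an independent fair coin
    return N * np.log(0.5)

def log_likelihood(X, theta):
    # Assumed likelihood: X_i matches theta_i with probability 0.9
    match = (X == theta)
    return np.sum(np.where(match, np.log(0.9), np.log(0.1)))

log_terms = [
    log_likelihood(X, np.array(theta)) + log_prior(np.array(theta))
    for theta in itertools.product([0, 1], repeat=N)
]
# Log-sum-exp over all 2^N configurations
log_pX = np.logaddexp.reduce(log_terms)
print(f"Enumerated {2 ** N} terms, log p(X) = {log_pX:.3f}")
```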
Bayesian Updating

Consider two conditionally independent observations X1 and X2; their joint distribution is:

  p(X1, X2, θ) = p(X2 | θ) p(X1 | θ) p(θ)          (probability chain rule + conditional independence)

So, conditioned on X1:

  p(X2, θ | X1) = p(X2 | θ) p(θ | X1)              (update prior belief after seeing X1)

This is proportional to the full posterior by Bayes' rule:

  p(θ | X1, X2) ∝ p(X2 | θ) p(θ | X1)              (the normalizer involves the marginal likelihood p(X1, X2))

In general, given conditionally independent X1, …, XN:

  p(θ | X1, …, XN) ∝ p(θ) ∏_i p(Xi | θ)
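A quick sketch of this sequential updating in the conjugate Beta-Bernoulli setting (an assumed example, not from the slides): each observation updates the previous posterior, and the final result matches a single batch update:

```python
# Sketch: sequential Bayesian updating of a Beta prior on a coin bias,
# one flip at a time (prior and data values are assumed).
a, b = 1.0, 1.0                  # Beta(1, 1) uniform prior
flips = [1, 0, 1, 1, 0, 1]       # toy observations: 1 = heads

for x in flips:
    # Yesterday's posterior is today's prior
    a, b = a + x, b + (1 - x)

print(f"Sequential posterior: Beta({a:.0f}, {b:.0f})")
# Same as the batch update: Beta(1 + #heads, 1 + #tails)
```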


Exchangeability

We often assume the model is invariant to data ordering

Def: Consider N random variables X1, …, XN and any permutation π of the indices. The variables are exchangeable if every permutation has equal probability,

  p(X1, …, XN) = p(Xπ(1), …, Xπ(N))

• A sequence X1, X2, … is infinitely exchangeable if every finite subsequence is exchangeable
• Being i.i.d. implies exchangeability, but the converse is not true
de Finetti’s Theorem

Simple hierarchical representation for exchangeable models:

  p(X1, …, XN) = ∫ ∏_i p(Xi | θ) p(θ) dθ

• Observe: this is the marginal likelihood for a model with prior p(θ)
• Often used as justification for Bayesian statistics
• Technically only true for infinitely exchangeable sequences, but a reasonable approximation for many finite sequences
Posterior Marginal

In hierarchical models a subset of variables may be of interest

Normal distribution with random parameters: Xi ~ N(μ, σ²), with prior p(μ, σ²)

• Nuisance variable (e.g. the variance σ²)
• Quantity of interest (e.g. the mean μ)

Marginalize out nuisance variables:

  p(μ | X) = ∫ p(μ, σ² | X) dσ²

Use of a conjugate prior ensures an analytic posterior
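A numerical sketch of marginalizing a nuisance variable (the model, grid, and flat prior are assumptions for illustration, not from the slides):

```python
# Sketch: posit X_i ~ N(mu, sigma^2), compute the joint posterior on a grid
# (flat prior assumed), then sum over sigma to get the marginal of mu.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=1.5, size=30)    # toy data

mu_grid = np.linspace(0.0, 4.0, 200)
sigma_grid = np.linspace(0.5, 3.0, 150)

# Unnormalized log joint posterior on the grid
log_post = np.array([
    [stats.norm.logpdf(X, loc=mu, scale=sig).sum() for sig in sigma_grid]
    for mu in mu_grid
])
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Posterior marginal of mu: sum out the nuisance parameter sigma
mu_marginal = post.sum(axis=1)
print("Posterior mean of mu:", np.sum(mu_grid * mu_marginal))
```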
Prediction

Can make predictions of an unobserved X before seeing any data,

  p(X) = ∫ p(X | θ) p(θ) dθ          (similar calculation to the marginal likelihood)

This is the prior predictive distribution

When we observe data X we can predict future observations Xnew,

  p(Xnew | X) = ∫ p(Xnew | θ) p(θ | X) dθ

This is the posterior predictive distribution
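A short sketch in the Beta-Bernoulli setting (an assumed example, not from the slides): both predictive integrals have closed forms, equal to the prior and posterior means of the bias:

```python
# Sketch: prior and posterior predictive probability of heads on the next flip
# for a Beta-Bernoulli model (prior and data values are assumed).
a0, b0 = 2.0, 2.0      # Beta(2, 2) prior
heads, tails = 7, 3    # observed flips

# Prior predictive: integrate Bernoulli(theta) against the Beta prior,
# which gives the prior mean of theta.
prior_pred_heads = a0 / (a0 + b0)

# Posterior predictive: the same integral against the Beta posterior.
a, b = a0 + heads, b0 + tails
post_pred_heads = a / (a + b)

print(f"Prior predictive p(heads) = {prior_pred_heads:.3f}")
print(f"Posterior predictive p(heads | data) = {post_pred_heads:.3f}")
```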


Prediction Example
About 29% of American adults have
high blood pressure (BP). Home test
has 30% false positive rate and no
false negative error.

What is the likelihood of another positive measurement? Using the posterior predictive:

  p(X2 = 1 | X1 = 1) = p(X2 = 1 | θ = 1) p(θ = 1 | X1 = 1) + p(X2 = 1 | θ = 0) p(θ = 0 | X1 = 1)
                     = 1.0 × 0.58 + 0.3 × 0.42 ≈ 0.70

What conclusions can be drawn from this calculation?
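The same predictive calculation as a script, assuming the two tests are conditionally independent given the true blood-pressure status:

```python
# Sketch: posterior predictive probability of a second positive test,
# using the blood-pressure numbers above.
posterior_bp = 0.29 / (0.29 + 0.3 * 0.71)     # p(high BP | first positive)

p_second_pos = 1.0 * posterior_bp + 0.3 * (1 - posterior_bp)
print(f"p(second positive | first positive) = {p_second_pos:.3f}")   # ~0.70
```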


Model Validation

How do we know if the model is good?

Supervised Learning
Validation set consists of known θ values. Are the true values typically preferred under the posterior?

[Figure: example posteriors, “Good (maybe lucky)” vs. “Not Good (maybe unlucky)”]

Repeat trials over validation set for more certainty


Model Validation

How do we know if the model is good?

Unsupervised Learning
Validation set only contains observable data X. Check validation data against the posterior predictive distribution.

[Figure: example predictive fits, “Good (maybe lucky)” vs. “Not Good (maybe unlucky)”]

Repeat trials over validation set for more certainty


Likelihood and Odds Ratios

Which parameter value, θ1 or θ2, is more likely to have generated the observed data X?

The posterior odds ratio is:

  p(θ1 | X) / p(θ2 | X) = [p(θ1) / p(θ2)] × [p(X | θ1) / p(X | θ2)]
                            (Prior Odds Ratio)   (Likelihood Ratio)

Observe: the marginal likelihood cancels!
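A quick sketch using the blood-pressure example from earlier: the posterior odds follow from the prior odds and the likelihood ratio alone, with no marginal likelihood needed:

```python
# Sketch: posterior odds of high BP vs. no high BP after one positive test.
prior_odds = 0.29 / 0.71           # p(BP) / p(no BP)
likelihood_ratio = 1.0 / 0.3       # p(+ | BP) / p(+ | no BP)

posterior_odds = prior_odds * likelihood_ratio
print(f"Posterior odds (BP vs. no BP): {posterior_odds:.2f}")                   # ~1.36
print(f"Implied posterior prob: {posterior_odds / (1 + posterior_odds):.3f}")   # ~0.577
```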


Bayesian Estimation

Task: produce an estimate θ̂(X) of θ after observing data X.

Bayes estimators minimize an expected loss function:

  θ̂ = argmin_t E[ L(θ, t) | X ]

Example: Minimum mean squared error (MMSE):

  θ̂_MMSE = argmin_t E[ (θ − t)² | X ] = E[ θ | X ]

The posterior mean always minimizes squared error.
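A small sketch (an assumed Beta posterior, not from the slides) checking numerically that the expected squared loss under the posterior is minimized at the posterior mean:

```python
# Sketch: with samples from a posterior, the expected squared error
# E[(theta - t)^2 | X] is minimized at the posterior mean.
import numpy as np
from scipy import stats

posterior = stats.beta(9, 5)     # assumed posterior, e.g. from earlier flips
theta_samples = posterior.rvs(size=100_000, random_state=0)

candidates = np.linspace(0, 1, 501)
expected_sq_loss = [np.mean((theta_samples - t) ** 2) for t in candidates]

best = candidates[np.argmin(expected_sq_loss)]
print("Minimizer of expected squared loss:", best)
print("Posterior mean:", theta_samples.mean())    # essentially the same
```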


Bayes Estimation: More Examples

Minimum absolute error:

  θ̂_MAE = argmin_t E[ |θ − t| | X ] = median of p(θ | X)

Note: Same answer for any loss that is a linear function of |θ − t|.

Maximum a posteriori (MAP):
Very common to produce maximum probability estimates,

  θ̂_MAP = argmax_θ p(θ | X)

The loss function is degenerate (0-1 loss on a vanishingly small window), so MAP is not a Bayes estimator (unless θ is discrete).
Posterior Summarization

Ideally we would report the full posterior distribution as the


result of inference…but this is not always possible

Summary of Posterior Location:


Point estimates: mean (MMSE), mode, median (min. absolute
error)

Summary of Posterior Uncertainty:


Credible intervals / regions, posterior entropy, variance

Bayesian analysis should report uncertainty when possible


Credible Interval

Def. For parameter θ, a (1 − α) credible interval [l, u] satisfies

  p(l ≤ θ ≤ u | X) = 1 − α

i.e., an interval containing a fixed percentage of the posterior probability density.

Note: This is not unique -- consider the 95% intervals below:

[Source: Gelman et al., “Bayesian Data Analysis”]
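A short sketch (an assumed Beta posterior, not from the slides) showing two different but equally valid 95% credible intervals for the same posterior:

```python
# Sketch: the 95% credible interval is not unique; here are two valid choices
# for the same (assumed) Beta posterior.
from scipy import stats

posterior = stats.beta(9, 5)

# Central (equal-tailed) 95% interval
central = posterior.interval(0.95)

# Another valid 95% interval: put all excluded mass in the lower tail
lopsided = (posterior.ppf(0.05), posterior.ppf(1.0))

print(f"Central 95% interval: ({central[0]:.3f}, {central[1]:.3f})")
print(f"One-sided 95% interval: ({lopsided[0]:.3f}, {lopsided[1]:.3f})")
```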


Summary

• Marginal likelihood is required for Bayesian inference, and can be hard to compute: p(X) = ∫ p(X | θ) p(θ) dθ

• One exception is the posterior odds ratio (used in model selection, hypothesis testing, …), where the marginal likelihood cancels

• The posterior predictive can be used to assess model quality in the unsupervised setting: p(Xnew | X) = ∫ p(Xnew | θ) p(θ | X) dθ
Summary

• Bayesian estimation minimizes an expected loss function: θ̂ = argmin_t E[ L(θ, t) | X ]

• Common estimators: Posterior mean → MMSE, Posterior median → MAE


• Posterior uncertainty can be summarized by (not necessarily unique) credible intervals: p(l ≤ θ ≤ u | X) = 1 − α

• Interpretation: For this trial the parameter θ lies in the interval with the specified probability (e.g. 0.95)
