
Chapter 2


Under the Bayesian hood
BAYESIAN DATA ANALYSIS IN PYTHON

Michal Oleszak
Machine Learning Engineer
Bayes' Theorem revisited

P(A|B) = P(B|A) * P(A) / P(B)



Bayes' Theorem revisited

P(parameters | data) = P(data | parameters) * P(parameters) / P(data)

P(parameters | data) → posterior distribution: what we know about the parameters after having seen the data

P(parameters) → prior distribution: what we know about the parameters before seeing
any data

P(data | parameters) → likelihood of the data according to our statistical model

P(data) → scaling factor (the evidence), which normalizes the numerator so that the posterior probabilities sum to one, as the sketch below shows
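
A minimal worked sketch of these pieces (the numbers are made up for illustration): suppose the coin is either fair or biased towards heads, both equally likely a priori, and we observe a single heads.

# Two competing hypotheses about the coin, with hypothetical probabilities
priors = {"fair": 0.5, "biased": 0.5}        # P(parameters)
likelihoods = {"fair": 0.5, "biased": 0.8}   # P(data | parameters): chance of heads

# Numerator of Bayes' Theorem for each hypothesis
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}

# Scaling factor P(data): total probability of observing heads
scaling_factor = sum(unnormalized.values())  # 0.65

# Posterior: what we believe about the coin after seeing one heads
posteriors = {h: unnormalized[h] / scaling_factor for h in unnormalized}
print(posteriors)  # {'fair': 0.3846..., 'biased': 0.6153...}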



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

import numpy as np
import pandas as pd

num_heads = np.arange(0, 101, 1)
head_prob = np.arange(0, 1.01, 0.01)

coin = pd.DataFrame([(x, y) for x in num_heads for y in head_prob])
coin.columns = ["num_heads", "head_prob"]

num_heads head_prob
0 0 0.00
1 0 0.01
2 0 0.02
... ...
10199 100 0.99
10200 100 1.00
[10201 rows x 2 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

from scipy.stats import uniform


coin["prior"] = uniform.pdf(coin["head_prob"])

num_heads head_prob
0 0 0.00
1 0 0.01
2 0 0.02
... ...
10199 100 0.99
10200 100 1.00
[10201 rows x 2 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

from scipy.stats import uniform


coin["prior"] = uniform.pdf(coin["head_prob"])

num_heads head_prob prior


0 0 0.00 1.0
1 0 0.01 1.0
2 0 0.02 1.0
... ... ...
10199 100 0.99 1.0
10200 100 1.00 1.0
[10201 rows x 3 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

from scipy.stats import uniform


coin["prior"] = uniform.pdf(coin["head_prob"])

from scipy.stats import binom


coin["likelihood"] = binom.pmf(coin["num_heads"], 100, coin["head_prob"])

num_heads head_prob prior


0 0 0.00 1.0
1 0 0.01 1.0
2 0 0.02 1.0
... ... ...
10199 100 0.99 1.0
10200 100 1.00 1.0
[10201 rows x 3 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

from scipy.stats import uniform


coin["prior"] = uniform.pdf(coin["head_prob"])

from scipy.stats import binom


coin["likelihood"] = binom.pmf(coin["num_heads"], 100, coin["head_prob"])

num_heads head_prob prior likelihood


0 0 0.00 1.0 1.000000
1 0 0.01 1.0 0.366032
2 0 0.02 1.0 0.132620
... ... ... ...
10199 100 0.99 1.0 0.366032
10200 100 1.00 1.0 1.000000
[10201 rows x 4 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

coin["posterior_prob"] = coin["prior"] * coin["likelihood"]


coin["posterior_prob"] /= coin["posterior_prob"].sum()

num_heads head_prob prior likelihood


0 0 0.00 1.0 1.000000
1 0 0.01 1.0 0.366032
2 0 0.02 1.0 0.132620
... ... ... ...
10199 100 0.99 1.0 0.366032
10200 100 1.00 1.0 1.000000
[10201 rows x 4 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

coin["posterior_prob"] = coin["prior"] * coin["likelihood"]


coin["posterior_prob"] /= coin["posterior_prob"].sum()

num_heads head_prob prior likelihood posterior_prob


0 0 0.00 1.0 1.000000 0.009901
1 0 0.01 1.0 0.366032 0.003624
2 0 0.02 1.0 0.132620 0.001313
... ... ... ... ...
10199 100 0.99 1.0 0.366032 0.003624
10200 100 1.00 1.0 1.000000 0.009901
[10201 rows x 5 columns]



Tossing the coin again: grid approximation
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

from scipy.stats import binom


from scipy.stats import uniform

num_heads = np.arange(0, 101, 1)


head_prob = np.arange(0, 1.01, 0.01)
coin = pd.DataFrame([(x, y) for x in num_heads for y in head_prob])
coin.columns = ["num_heads", "head_prob"]

coin["prior"] = uniform.pdf(coin["head_prob"])
coin["likelihood"] = binom.pmf(coin["num_heads"], 100, coin["head_prob"])

coin["posterior_prob"] = coin["prior"] * coin["likelihood"]


coin["posterior_prob"] /= coin["posterior_prob"].sum()



Plotting posterior distribution
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?

heads75 = coin.loc[coin["num_heads"] == 75].copy()
heads75["posterior_prob"] /= heads75["posterior_prob"].sum()

num_heads head_prob prior likelihood posterior_prob


7575 75 0.00 1.0 0.000000e+00 0.000000e+00
7576 75 0.01 1.0 1.886367e-127 1.867690e-129
... ... ... ... ...
7674 75 0.99 1.0 1.141263e-27 1.129964e-29
7675 75 1.00 1.0 0.000000e+00 0.000000e+00
[101 rows x 5 columns]

import seaborn as sns
import matplotlib.pyplot as plt

sns.lineplot(x=heads75["head_prob"], y=heads75["posterior_prob"])
plt.show()
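
A usage sketch (not part of the original slides): the renormalized heads75 grid can also yield a point summary directly, without any sampling.

# Posterior mean of head_prob, computed as a probability-weighted average over the grid
posterior_mean = (heads75["head_prob"] * heads75["posterior_prob"]).sum()
print(posterior_mean)  # about 0.745 under the uniform prior, close to the observed 75/100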



Plotting posterior distribution
Q: What's the probability of tossing heads with a coin, if we observed 75 heads in 100 tosses?
A: [figure: the posterior density of head_prob, peaking near 0.75]



Let's practice calculating posteriors using grid approximation!

Prior belief
BAYESIAN DATA ANALYSIS IN PYTHON

Michal Oleszak
Machine Learning Engineer
Prior distribution
Prior distribution reflects what we know about the parameter before observing any data:
nothing → uniform distribution (all values equally likely)

old posterior → can be updated with new data (a sketch of this sequential updating follows the list below)

One can choose any probability distribution as a prior to include external info in the model:
expert opinion

common knowledge

previous research

subjective belief
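
A sketch of the "old posterior becomes the new prior" idea using grid approximation; the split of the tosses into two days is made up for illustration.

import numpy as np
from scipy.stats import binom

head_prob = np.arange(0, 1.01, 0.01)

# Day 1: start from a uniform prior, observe 30 heads in 50 tosses
posterior = np.ones_like(head_prob) * binom.pmf(30, 50, head_prob)
posterior /= posterior.sum()

# Day 2: yesterday's posterior becomes today's prior, observe 45 heads in 50 new tosses
posterior = posterior * binom.pmf(45, 50, head_prob)
posterior /= posterior.sum()

# Updating twice gives the same result as updating once with all 75 heads in 100 tosses
combined = binom.pmf(75, 100, head_prob)
combined /= combined.sum()
print(np.allclose(posterior, combined))  # True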



Prior's impact

[figure: how the choice of prior impacts the posterior]


Prior distribution
The prior distribution is chosen before we see the data.

Prior choice can impact posterior results (especially with little data).

To avoid cherry-picking, prior choices should be:


clearly stated,

explainable: based on previous research, sensible assumptions, expert opinion, etc.



Choosing the right prior
Our prior belief: heads less likely

Some choices are better than others! One possible encoding of this belief is sketched below.
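
One way to encode the belief that heads are less likely (an illustrative choice, not the course's prescribed prior) is a Beta(2, 5) distribution, which puts most of its mass below 0.5. It can replace the uniform prior on the grid, reusing the coin DataFrame and its likelihood column from the previous lesson.

from scipy.stats import beta

# Informative prior: Beta(2, 5) concentrates probability below head_prob = 0.5
coin["prior"] = beta.pdf(coin["head_prob"], 2, 5)

# The rest of the grid approximation is unchanged
coin["posterior_prob"] = coin["prior"] * coin["likelihood"]
coin["posterior_prob"] /= coin["posterior_prob"].sum()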



Conjugate priors
Some priors, multiplied with specific likelihoods, yield known posteriors.

They are known as conjugate priors.

In the case of coin tossing:


if we choose a prior Beta(a, b),

then the posterior is Beta(#heads + a, #tosses - #heads + b)

We can sample from the posterior using numpy (a quick check for the 75-heads example is sketched below).

get_heads_prob() from Chapter 1:

def get_heads_prob(tosses):
num_heads = np.sum(tosses)
# prior: Beta(1,1)
return np.random.beta(num_heads + 1, len(tosses) - num_heads + 1, 1000)
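
A quick sanity check (my own sketch, not from the slides): for 75 heads in 100 tosses and the flat Beta(1, 1) prior, get_heads_prob() samples the conjugate Beta(76, 26) posterior, which concentrates around the same head_prob values the grid approximation favored.

import numpy as np

# Simulate the observed data: 75 heads out of 100 tosses
tosses = np.array([1] * 75 + [0] * 25)
draws = get_heads_prob(tosses)           # 1000 draws from Beta(76, 26)

print(draws.mean())                      # about 0.745, the Beta(76, 26) mean
print(np.percentile(draws, [5, 95]))     # most draws lie near head_prob = 0.75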



Two ways to get the posterior

Simulation

If the posterior is known, we can sample from it using numpy:

draws = np.random.beta(2, 4, 1000)

Outcome: an array of 1000 posterior draws:

array([0.05941031, ..., 0.70015975])

Can be plotted with sns.kdeplot(draws)

Calculation

If the posterior is not known, we can calculate it using grid approximation.

Outcome: posterior probability for each grid element:

       head_prob  posterior_prob
0           0.00        0.009901
1           0.01        0.003624
...          ...             ...
10199       0.99        0.003624
10200       1.00        0.009901

Can be plotted with sns.lineplot(x=df["head_prob"], y=df["posterior_prob"])



Let's practice working with priors!

Reporting Bayesian results
BAYESIAN DATA ANALYSIS IN PYTHON

Michal Oleszak
Machine Learning Engineer
The honest way
Report the prior and the posterior of each parameter

posterior_draws

array([8.02800413, 8.97359548, 7.57437476, ..., 5.85264609, 7.92875104, 7.41463758])

Plot prior and posterior distributions

sns.kdeplot(prior_draws, shade=True, label="prior")


sns.kdeplot(posterior_draws, shade=True, label="posterior")



The honest way

[figures: prior and posterior density plots]

Bayesian point estimates
No single number can fully convey the complete information contained in a distribution.

However, sometimes a point estimate of a parameter is needed:

posterior_mean = np.mean(posterior_draws)
posterior_median = np.median(posterior_draws)
posterior_p75 = np.percentile(posterior_draws, 75)


Credible intervals
An interval such that the probability that the parameter falls inside it is x%.

The wider the credible interval, the more uncertainty in the parameter estimate.

The parameter is random, so it can fall into an interval with some probability.

In the frequentist world, the (confidence) interval is random while the parameter is fixed.

A simple way to compute a credible interval from posterior draws is sketched below.
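
One common construction (a sketch; the course's HPD interval follows on the next slide) is the equal-tailed credible interval, obtained from posterior percentiles; posterior_draws is the array of draws introduced earlier.

import numpy as np

# 90% equal-tailed credible interval: 5% of posterior mass cut off on each side
lower, upper = np.percentile(posterior_draws, [5, 95])
print(f"The parameter lies in [{lower:.2f}, {upper:.2f}] with 90% probability.")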



Highest Posterior Density (HPD)
import pymc3 as pm

hpd = pm.hpd(posterior_draws, hdi_prob=0.9)
print(hpd)

[-4.86840193 4.96075498]
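
As an alternative sketch (assuming the arviz package is installed; this is not in the original slides), the same kind of interval can typically be computed with the ArviZ library directly.

import arviz as az

# 90% Highest Density Interval of the posterior draws
hdi = az.hdi(posterior_draws, hdi_prob=0.9)
print(hdi)  # roughly the same bounds as pm.hpd above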



Let's practice reporting Bayesian results!
