Understanding Bayesian Statistics Intuitively (In Python)

Imam AR, Indonesia
Published on Sep 15, 2020

Tags: bayesianstatistics, bayesianinference, statistics

Overview

Table of Contents

Introduction to Bayesian inference

On using informative prior

Some other example: linear regression

Practical implementation using PyMC3

Introduction to Bayesian inference

I would like to start this, unlike other typical Bayesian tutorials, without Bayes' theorem first. Instead, let us start with a very simple practical example in statistics: a coin toss! Suppose we do 5 consecutive coin tosses and the result is {HTTTT} (H means the head side of the coin and T means the tail side; in other words, we get 1 head and 4 tails). The question is: how certain are we that it is a fair coin? Or that it is not?

In Bayesian inference, it is essential to understand how the data was generated. In this case, the coin toss can be modeled as a binomial process. Here is the key point from the Wikipedia page:

In probability theory and statistics, the binomial distribution with parameters n and p, denoted Bin(n,p), is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes-no question, and each with its own boolean-valued outcome: success/yes/true/one (with probability p), or failure/no/false/zero (with probability q = 1 − p).

If we consider H (head) a success and T (tail) a failure, this is exactly a binomial process by that definition. The binomial process has two parameters, n (number of experiments conducted) and p (probability of success), and is denoted Bin(n,p). Let us map our case onto the binomial distribution's parameters. What we have: 5 coin tosses (n=5) with an unknown probability of success (unknown p) and an outcome of {HTTTT} (1 of 5 outcomes is a success). Putting this into an input-process-output perspective:

Input: n & p

Process: binomial distribution with parameters n & p

Output: {HTTTT}
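
This generative view can be written in a couple of lines of numpy. A toy sketch (the value of p here is made up purely for illustration):

import numpy as np

# Input: number of tosses n and a hypothetical probability of heads p
n, p = 5, 0.5

# Process: one draw from Bin(n, p)
n_heads = np.random.binomial(n, p)

# Output: the number of heads, e.g. 1 for an outcome like {HTTTT}
print(n_heads)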

As said before, understanding the data generation process is essential for Bayesian inference. Since we would like to know how likely it is that the coin is fair, we are basically looking for the parameter p. If p is likely to be around 0.5, then we are fairly sure it is a fair coin, and vice versa. How do we get this p? Why don't we try a simulation?

Let us assume we have no idea at all about the value of p before looking at the data. Then we can assume any value between 0 and 1 is equally probable, which means we model our prior belief about p as uniformly distributed from 0 to 1, denoted p_prior ~ Uniform(0,1). We put this into our simulation machine by doing the following:

1. Take a random value from the distribution of p_prior; for example, we get p_prior = Uniform(0,1) = 0.3.

2. Put this value into our data generation process, which is a binomial process. For example, we get Bin(n,p) = Bin(5,0.3) = {HHTTT} (2 successes and 3 failures). As this differs from the actual data we observed ({HTTTT}), we reject this value.

3. Redo the simulation from step 1 as many times as possible, collecting only the values of p that produce the same data as the data we have.

Here are several iterations of the simulation:


Iteration 1: p = Uniform(0,1) = 0.32 --> Bin(5,0.32) = {HHTTT}. Rejected.

Iteration 2: p = Uniform(0,1) = 0.11 --> Bin(5,0.11) = {HTTTT}. Accepted: keep p = 0.11.

Iteration 3: p = Uniform(0,1) = 0.69 --> Bin(5,0.69) = {HHHTT}. Rejected.

Doing this many times, we get a set of possible values of p that match our data. I tried this in Python with 100,000 iterations and plotted the distribution of the values of p obtained from the simulation. Here is the code:

import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

sns.set()
matplotlib.rcParams['figure.figsize'] = [10, 5]

n = 5  # number of trials / coin tosses
p_posterior = list()
for i in range(100000):
    p_prior = np.random.random()  # draw p from Uniform(0, 1)
    n_success = np.random.binomial(n, p_prior, 1)[0]  # how many heads we get
    if n_success == 1:  # collect p every time the simulation matches the observed data
        p_posterior.append(p_prior)
ax = sns.distplot(p_posterior)
ax.set(xlabel='p')

And here is the distribution of p we generated:

Oh, look! We get an intuitive view of the result. The distribution of the parameter p shows that the most probable value of p is around 0.2 (of course: 1 of the 5 observations is a "success", or head). BUT (big but there), since we only have 5 observations, we are actually quite uncertain about that, and this is also reflected in the resulting distribution: it is quite a fat distribution, which implies our uncertainty about the value. Beyond that, we can also infer other values from the result. For example, how confident are we that the actual value of p lies between 0.15 and 0.25? We can estimate this as the number of our samples that lie between 0.15 and 0.25 divided by the total number of samples (from my code, I get ~24%), with the following code:
p_posterior = np.array(p_posterior)
sum((p_posterior < 0.25) & (p_posterior > 0.15)) / len(p_posterior)

Because the result of Bayesian inference is a sample distribution, unlike in the frequentist approach, we can infer many things directly from that result.
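
For instance, a couple of other summaries fall straight out of the same samples (a small illustration using numpy):

p_posterior.mean()  # posterior mean of p
np.percentile(p_posterior, [2.5, 97.5])  # an approximate 95% credible interval for p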

This resulting distribution is what we call the posterior distribution, or posterior belief. In a sense, what we have done is update our prior belief about p (which was previously uniformly distributed from 0 to 1) into a new belief after looking at the data. And this is what Bayesian inference is all about: using the data to update our belief about some parameter/value. In probability terms, our prior belief is denoted P(p) (the probability of p), and we infer the probability of p given the data, denoted P(p|x), via some likelihood distribution P(x|p) (the probability of the data given the parameter). P(p|x) is basically proportional to the product of the probability of p, P(p), and the likelihood of the data given the parameter p, P(x|p) (remember our process: multiplying probabilities corresponds to requiring two events to both hold, which is exactly what happens in our simulation). We can write this as

P(p|x) ∝ P(p)P(x|p)
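
To make this concrete with our coin toss (a worked illustration): with the uniform prior, P(p) is constant, and the binomial likelihood of getting 1 head in 5 tosses is P(x|p) = C(5,1) p (1-p)^4. So P(p|x) ∝ p(1-p)^4, which is maximized at p = 0.2, exactly where the simulated distribution above peaks.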

You will encounter this notation in any Bayesian inference tutorial out there, as it is the core concept of the inference. The concept is actually very natural for humans: we revise our prior belief after looking at data/evidence. After watching the sun rise every day, we believe with huge confidence that it will also rise tomorrow. Or when you would like to lend a car to somebody, you would rather trust a friend you have already seen to be trustworthy (in terms of driving capability and honesty) than a complete stranger. And the more evidence you have, the more confident you are about the future outcome (whether the car will come back safely or not). This is very natural to the way humans think.

On using informative prior

One interesting thing about Bayesian statistics is its ability to incorporate our opinion, or prior belief, into the model. In the coin toss case, we usually know that most coins have a 50:50 chance of landing heads or tails. So we would like to incorporate this belief into our model. How do we do it? This is done by using an informative prior.


Remember that the prior distribution we were using previously was p_prior ~ Uniform(0,1)? That meant we believed the value of p could be anything from 0 to 1 with equal probability, and we then updated this belief according to the data. Now, we want to set this prior belief according to our knowledge first. In real-world scenarios, this can come in the form of expert opinion or a previous similar experiment that has been conducted.

Going back to the coin toss case, we know that p can only take values between 0 and 1, and we have some belief of a 50:50 chance of getting heads or tails. First, we need to find a distribution that helps us incorporate this belief. Let us use a Beta distribution in this case. Note that the most important thing about a prior distribution is its shape. The Beta distribution is a perfect choice here because its range of possible values is between 0 and 1, and we can alter its shape by changing the α and β parameters of the distribution. The Beta distribution is denoted Beta(α,β), and you can see on its Wikipedia page how different parameters affect the distribution's shape. I would like to use Beta(5,5) as my prior belief, which gives the distribution shape below.

p_prior = np.random.beta(5, 5, 100000)
ax = sns.distplot(p_prior)
ax.set(xlabel="p")

Why Beta(5,5)? As I said, the most important thing about this prior distribution is its shape. If you want a thinner shape, reflecting a stronger belief that p equals 0.5, you may choose Beta(100,100); for a weaker belief that p = 0.5, you may choose Beta(2,2). It really depends on how well the distribution reflects your prior belief.
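
To get a feel for this, here is a quick side-by-side of those three choices (an illustrative sketch, using the same plotting style as above):

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Compare how alpha = beta controls how tightly the prior concentrates around 0.5
for a in [2, 5, 100]:
    samples = np.random.beta(a, a, 100000)
    sns.distplot(samples, hist=False, label="Beta({0},{0})".format(a))
plt.xlabel("p")
plt.legend()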

Now let us see the difference between our previous setup and the current setup with the modified prior belief.


Previously

p_prior ~ Uniform(0,1)
data ~ Binomial(5, p)

Current Setup

p_prior ~ Beta(5,5)
data ~ Binomial(5, p)

With this configuration, we can redo the simulation as follows:

n = 5  # number of coin tosses
p_posterior = list()
for i in range(100000):
    p_prior = np.random.beta(5, 5, 1)[0]  # draw p from the Beta(5,5) prior
    n_success = np.random.binomial(n, p_prior, 1)[0]
    if n_success == 1:  # keep p only when the simulation matches the observed data
        p_posterior.append(p_prior)
ax = sns.distplot(p_posterior)
ax.set(xlabel="p")

See the difference in the result? Here, the resulting distribution is not really skewed towards 0.2; the peak of the distribution is still closer to 0.5 (compared with the previous result), but it is not exactly at 0.5 either; in fact it lies between 0.3 and 0.4. Since we have a prior belief of a 50:50 chance, our belief starts to change after looking at the data, but not as extremely as when we had zero assumptions about the value of p.

Some other example: linear regression

In this section, I will give you a simple example of how linear regression is performed within the Bayesian framework. I will only show you the setup, not actually implement it in code, in order to give you a better view of how to set up a problem in Bayesian inference.

By now you might have realized that in Bayesian inference the parameters themselves have distributions instead of single-number values. In the usual approach, we fit data to the equation y = ax + c to do linear regression. But in Bayesian statistics, we say y is the data, and the data is normally distributed with a mean value of ax + c. We can write this as follows:

y ~ Normal(μ, σ)
μ = ax + c

Notice that there are other parameters, namely a, c, and σ, for which we also need to define prior distributions. Since we don't have any prior information or knowledge, we can use non-informative priors in this case, so our final model becomes:

y ~ Normal(μ, σ)
μ = ax + c
a ~ Uniform(-inf, inf)
c ~ Uniform(-inf, inf)
σ ~ HalfUniform(0, inf)

The actual implementation of this model will not be covered in this example; I only want to show you how model construction works in Bayesian inference.
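
That said, for a sense of how this particular setup would translate into code, here is a rough PyMC3 sketch (illustrative only; the x and y arrays and the use of Flat/HalfFlat as the non-informative priors are my own assumptions, not part of the setup above):

import numpy as np
import pymc3 as pm

# made-up data purely for illustration
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + np.random.normal(0, 0.3, size=50)

with pm.Model() as linreg_model:
    a = pm.Flat("a")              # non-informative prior on the slope
    c = pm.Flat("c")              # non-informative prior on the intercept
    sigma = pm.HalfFlat("sigma")  # sigma must be positive
    mu = a * x + c
    y_obs = pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    trace = pm.sample(2000, tune=1000)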

Practical implementation using PyMC3

One of the biggest problems with Bayesian inference is its expensive computational cost, which requires huge resources. In our previous case, it was possible to use a simulation procedure to get posterior samples because we only handled 5 observations with very limited parameters and processes. But with many observations, finding a set of simulation results that exactly match our observations would require a really long time and many, many iterations.

Fortunately, in this open-source era and with current computational resources, applying Bayesian inference has become much easier and more practical. For example, one of the most popular families of algorithms for obtaining posterior samples without this kind of brute-force simulation is MCMC (Markov chain Monte Carlo). Of course, if you would like to implement such an algorithm yourself, it takes time to get it right (and it will not be demonstrated in this article either). Instead, we can use a popular open-source package available in Python called PyMC3. Since our focus is practical inference, we can use this package directly, as the most important thing is the result. There are several assumptions and diagnostics that need to be understood before using this method for Bayesian inference, but that will be for another article (stay tuned!). As a closing example, I will show you how to implement our coin toss problem using the PyMC3 package below.


First, define our model

p ~ Beta(5,5)
observation/data ~ Bin(5,p)

Second, put it into code

import pymc3 as pm

data = np.array([1])  # 1 of the 5 tosses resulted in success (head)
with pm.Model() as coin_model:
    p = pm.Beta("p", 5, 5)
    obs = pm.Binomial("obs", n=5, p=p, observed=data)
    trace = pm.sample(10000, tune=2000, cores=4)
ax = sns.distplot(trace["p"])
ax.set(xlabel="p")
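
Here trace["p"] holds the posterior samples of p drawn by the sampler, so the plot is directly comparable to the Beta(5,5) simulation result from earlier.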

Of course, not all subjects related to Bayesian statistics can be discussed within one article. There are many other aspects, such as the various types of statistical distributions, diagnostics of MCMC results, etc. The whole point of this article is to understand the general idea of the Bayesian inference framework intuitively. If you are interested in learning further, below are several sources I recommend reading on this matter:

Book: Bayesian Methods for Hackers. A book about Bayesian inference from a practical point of view with PyMC. If you are a hacker, this book is really convenient to use, as it puts more weight on practical implementation. The book uses PyMC instead of PyMC3, but there is a converted PyMC3 version of the implementation on GitHub.

Book: Bayesian Analysis with Python, Second Edition. Covers the concepts and practical implementation using PyMC3.

Book: Statistical Rethinking. One of the most-used books for thoroughly understanding and starting Bayesian statistics. Implementation uses Stan and R.
