Chapter 3
BAYESIAN DATA ANALYSIS IN PYTHON
Michal Oleszak
Machine Learning Engineer
A/B testing
Randomized experiment: divide users into two groups (A and B)
No click (failure)
x = NumberOfSuccesses + a
y = NumberOfObservations − NumberOfSuccesses + b
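The two parameters above define a Beta posterior for the click probability. A minimal sketch of how a helper such as `simulate_beta_posterior` might be written is shown here; the body, the `num_draws` and `seed` arguments, and the toy click vector are assumptions for illustration, not the course's own implementation.

```python
import numpy as np


def simulate_beta_posterior(trials, beta_prior_a, beta_prior_b,
                            num_draws=10_000, seed=None):
    """Draw samples from the Beta posterior of a success probability.

    Assumed implementation: only the call signature with three
    positional arguments appears in the slides.
    """
    rng = np.random.default_rng(seed)
    trials = np.asarray(trials)
    num_successes = trials.sum()
    num_failures = len(trials) - num_successes
    # x = NumberOfSuccesses + a, y = NumberOfObservations - NumberOfSuccesses + b
    return rng.beta(num_successes + beta_prior_a,
                    num_failures + beta_prior_b,
                    size=num_draws)


# Made-up click data: 1 = click (success), 0 = no click (failure)
clicks = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
draws = simulate_beta_posterior(clicks, 1, 1, seed=42)
print(draws.mean())
```

With a Beta(1, 1) prior, 3 clicks in 11 observations give a Beta(4, 9) posterior, so the mean of the draws should sit close to 4/13.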
[0 1 1 0 0 0 0 0 0 0 1 ... ]
[0 0 0 1 0 0 0 1 1 0 1 ... ]
A_posterior = simulate_beta_posterior(A_clicks, 1, 1)
B_posterior = simulate_beta_posterior(B_clicks, 1, 1)
-0.0077850237030215215
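The slides do not show how the value above was computed. One common way to compare two posteriors is to take the draw-by-draw difference, then summarize how often B beats A and how large the shortfall is when it does not. The sketch below uses made-up click counts in place of the real `A_clicks` and `B_clicks`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws with a Beta(1, 1) prior:
# A: 60 clicks in 1000 views, B: 75 clicks in 1000 views (made-up counts)
A_posterior = rng.beta(1 + 60, 1 + 940, size=10_000)
B_posterior = rng.beta(1 + 75, 1 + 925, size=10_000)

# Posterior of the difference in click rates
diff = B_posterior - A_posterior

# Probability that B has the higher click rate
prob_b_better = (diff > 0).mean()

# Average shortfall in the scenarios where B turns out worse than A
loss = diff[diff < 0]
expected_loss = loss.mean() if loss.size else 0.0

print(prob_b_better, expected_loss)
```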
Decision analysis
Decision-makers care about maximizing profit, reducing costs, saving lives, etc.
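One way to connect a posterior to such goals is to push every posterior draw through a business formula and look at the resulting distribution. The following is a sketch only: the click counts, the number of impressions, the revenue per click, and the campaign cost are all hypothetical numbers chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior draws of a click rate (75 clicks in 1000 views)
click_rate_draws = rng.beta(1 + 75, 1 + 925, size=10_000)

# Hypothetical business inputs
num_impressions = 10_000
revenue_per_click = 3.5
campaign_cost = 2_000.0

# Each click-rate draw becomes one possible profit outcome
profit_draws = click_rate_draws * num_impressions * revenue_per_click - campaign_cost

expected_profit = profit_draws.mean()
prob_loss = (profit_draws < 0).mean()
low, high = np.percentile(profit_draws, [2.5, 97.5])

print(expected_profit, prob_loss, (low, high))
```

The result is a full distribution of profit rather than a single number, so the chance of losing money can be read off directly.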
Linear regression
y = β0 + β1 x1 + β2 x2 + ...
sales = β0 + β1 marketingSpending
Frequentist inference:
sales = β0 + β1 marketingSpending + ε
ε ∼ N(0, σ)
Bayesian inference:
sales ∼ N(β0 + β1 marketingSpending, σ)
plt.show()
β0 ∼ N(5, 2)
β1 ∼ N(2, 10)
σ ∼ Unif(0, 3)
Uniform prior for standard deviation, as we don't know what it could be.
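To get a feel for what these priors imply, one can draw from them and simulate sales before seeing any data. This is a sketch under two assumptions: the second parameter of each normal is read as a standard deviation, and the spending value of 1.0 is a hypothetical input.

```python
import numpy as np

rng = np.random.default_rng(2)
num_draws = 5_000

# Draws from the priors stated above
beta0 = rng.normal(5, 2, size=num_draws)    # intercept
beta1 = rng.normal(2, 10, size=num_draws)   # effect of marketing spending
sigma = rng.uniform(0, 3, size=num_draws)   # standard deviation of sales

# Hypothetical spending level at which to simulate sales
marketing_spending = 1.0

# Prior predictive draws: sales ~ N(beta0 + beta1 * spending, sigma)
sales_prior_pred = rng.normal(beta0 + beta1 * marketing_spending, sigma)

print(sales_prior_pred.mean(), sales_prior_pred.std())
```

A wide spread in the simulated sales reflects how little the prior on β1 commits to before data arrive.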
Choose conjugate priors and simulate from a known posterior → unintuitive priors
Third way: simulate from the posterior even with non-conjugate priors!
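To illustrate the idea of sampling from a posterior that has no closed form, here is a hand-written random-walk Metropolis sampler for the regression model with the priors listed earlier. It is a teaching sketch, not the algorithm PyMC3 uses; the data are synthetic, and the step sizes, chain length, and burn-in are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data (made up): sales = 5 + 2 * spending + noise
spending = rng.uniform(0, 10, size=200)
sales = 5 + 2 * spending + rng.normal(0, 1.5, size=200)


def log_posterior(beta0, beta1, sigma):
    """Unnormalized log posterior: normal likelihood plus the priors."""
    if not 0 < sigma < 3:  # Unif(0, 3) prior on sigma
        return -np.inf
    # N(5, 2) prior on beta0 and N(2, 10) prior on beta1 (up to constants)
    log_prior = -0.5 * ((beta0 - 5) / 2) ** 2 - 0.5 * ((beta1 - 2) / 10) ** 2
    resid = sales - (beta0 + beta1 * spending)
    log_lik = -len(sales) * np.log(sigma) - 0.5 * np.sum((resid / sigma) ** 2)
    return log_prior + log_lik


current = np.array([5.0, 2.0, 1.0])      # starting values
current_lp = log_posterior(*current)
step = np.array([0.15, 0.03, 0.08])      # proposal standard deviations

draws = []
for _ in range(20_000):
    proposal = current + rng.normal(0, step)
    proposal_lp = log_posterior(*proposal)
    # Accept with probability min(1, posterior ratio)
    if np.log(rng.uniform()) < proposal_lp - current_lp:
        current, current_lp = proposal, proposal_lp
    draws.append(current)

draws = np.array(draws[5_000:])          # drop burn-in
marketing_spending_draws = draws[:, 1]   # posterior draws of beta1
print(marketing_spending_draws.mean())
```

No conjugacy is needed: the sampler only requires evaluating the prior and the likelihood up to a constant.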
print(marketing_spending_draws)
import pymc3 as pm
pm.plot_posterior(
    marketing_spending_draws,
    hdi_prob=0.95
)
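The `hdi_prob=0.95` argument asks for a 95% highest density interval on the plot. To show what that interval means, the sketch below computes one by hand as the narrowest window that contains 95% of the draws. The `hdi` helper is a hypothetical illustration that assumes a unimodal posterior, and the normal draws stand in for real posterior samples.

```python
import numpy as np


def hdi(draws, prob=0.95):
    """Narrowest interval containing `prob` of the draws (unimodal case)."""
    sorted_draws = np.sort(np.asarray(draws))
    n = len(sorted_draws)
    window = int(np.ceil(prob * n))  # number of draws inside the interval
    # Width of every candidate interval spanning `window` consecutive draws
    widths = sorted_draws[window - 1:] - sorted_draws[:n - window + 1]
    start = int(np.argmin(widths))
    return sorted_draws[start], sorted_draws[start + window - 1]


# Stand-in posterior draws (made up): centered at 2 with spread 0.5
rng = np.random.default_rng(4)
example_draws = rng.normal(2, 0.5, size=20_000)

low, high = hdi(example_draws, prob=0.95)
print(low, high)
```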