
Tools for Probabilistic Data Analysis in Python *

Dan Foreman-Mackey | #pyastro16

* in 15 minutes
What have I done?
Tools for Probabilistic Data Analysis in Python
Physics

mean model
(physical parameters → predicted data)

inference
(parameter estimation)

Data

noise
(stochastic; instrument, systematics, etc.)
A few examples

1 linear regression

2 maximum likelihood

3 uncertainty quantification
Linear regression

if you have:
a linear mean model and
known Gaussian uncertainties

and you want:


"best" parameters and uncertainties
Linear (mean) models

y = m x + b

y = a_2 x^2 + a_1 x + a_0

y = a \sin(x + w)

+ known Gaussian uncertainties

Linear regression

# x, y, yerr are numpy arrays of the same shape

import numpy as np

A = np.vander(x, 2)                      # design matrix: rows are [x_n, 1]
ATA = np.dot(A.T, A / yerr[:, None]**2)  # A^T C^{-1} A, with C = diag(yerr^2)
sigma_w = np.linalg.inv(ATA)             # covariance of the parameters
mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))  # best-fit parameters

where

A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_N & 1 \end{pmatrix}
\quad \mathrm{and} \quad
w = \begin{pmatrix} m \\ b \end{pmatrix}
That's it!
(in other words: "Don't use MCMC for linear regression!")
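
As a sanity check, here is a minimal end-to-end sketch of the snippet above on synthetic data (the seed, true parameters, and error bars are illustrative, not from the talk):

import numpy as np

np.random.seed(42)
x = np.sort(np.random.uniform(0, 10, 50))
yerr = 0.1 + 0.4 * np.random.rand(50)           # heteroskedastic error bars
y = 0.5 * x + 1.0 + yerr * np.random.randn(50)  # truth: m = 0.5, b = 1.0

A = np.vander(x, 2)
ATA = np.dot(A.T, A / yerr[:, None]**2)
sigma_w = np.linalg.inv(ATA)
mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))

print(mean_w)                     # should land near [0.5, 1.0]
print(np.sqrt(np.diag(sigma_w)))  # 1-sigma parameter uncertainties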
Maximum likelihood

if you have:
a non-linear mean model and/or
non-Gaussian/unknown noise

and you want:


"best" parameters
Likelihoods

p(data | physics)
"probability of the data given physics"

parameterized by some parameters θ

Example likelihood function (log-likelihood, with mean model f_θ):

\ln p(\{y_n\} \,|\, \theta) = -\frac{1}{2} \sum_{n=1}^{N} \frac{[y_n - f_\theta(x_n)]^2}{\sigma_n^2} + \mathrm{constant}
Likelihoods

SciPy

# x, y, yerr are numpy arrays of the same shape

import numpy as np
from scipy.optimize import minimize

def model(theta, x):
    # f_theta(x_n) = a / (1 + exp(-b (x_n - c)))
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    # -ln p({y_n} | theta), up to the constant
    return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

r = minimize(neg_log_like, [1.0, 10.0, 1.5])

print(r)

f_\theta(x_n) = \frac{a}{1 + e^{-b (x_n - c)}}
\qquad
\ln p(\{y_n\} \,|\, \theta) = -\frac{1}{2} \sum_{n=1}^{N} \frac{[y_n - f_\theta(x_n)]^2}{\sigma_n^2} + \mathrm{constant}
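
A minimal way to exercise this fit on synthetic data (all numbers illustrative, not from the talk):

import numpy as np
from scipy.optimize import minimize

np.random.seed(1234)
x = np.sort(np.random.uniform(0.0, 3.0, 50))
yerr = np.full_like(x, 0.05)
y = 1.0 / (1.0 + np.exp(-10.0 * (x - 1.5))) + yerr * np.random.randn(50)  # truth: a=1, b=10, c=1.5

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

print(minimize(neg_log_like, [1.0, 10.0, 1.5]).x)  # should land near [1.0, 10.0, 1.5]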
"But it doesn't work…"
— everyone
1 initialization

2 bounds

3 convergence

4 gradients
Gradients

\frac{\mathrm{d}}{\mathrm{d}\theta} \ln p(\{y_n\} \,|\, \theta)

seriously?
AutoDiff to the rescue!

"The most criminally underused tool in the [PyAstro] toolkit"
— adapted from justindomke.wordpress.com
AutoDiff

"Compile"-time exact gradients, built by applying the chain rule to every primitive operation:

GradType sin(GradType x):
    return GradType(
        sin(x.value),           # value of the result
        x.grad * cos(x.value)   # chain rule for the gradient
    )
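
For intuition, here is a minimal runnable forward-mode sketch of the same idea in Python (the Dual class and dual_sin helper are illustrative, not part of any library):

import math

class Dual:
    # a value paired with its derivative
    def __init__(self, value, grad=0.0):
        self.value = value
        self.grad = grad

def dual_sin(x):
    # chain rule: d/dt sin(x(t)) = cos(x) * dx/dt
    return Dual(math.sin(x.value), x.grad * math.cos(x.value))

x = Dual(1.0, 1.0)      # seed with dx/dx = 1
y = dual_sin(x)
print(y.value, y.grad)  # sin(1) and cos(1)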
AutoDiff in Python

1 Theano: deeplearning.net/software/theano

2 HIPS/autograd: github.com/HIPS/autograd
HIPS/autograd just works

import autograd.numpy as np
from autograd import elementwise_grad

def f(x):
    y = np.exp(-x)
    return (1.0 - y) / (1.0 + y)

df = elementwise_grad(f)    # first derivative
ddf = elementwise_grad(df)  # second derivative
HIPS/autograd just works

[Figure: f(x) and its first and second derivatives (df, ddf) plotted over x from -4 to 4]
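
A sketch of how that figure could be reproduced (the matplotlib usage here is an assumption, not from the talk):

import matplotlib.pyplot as plt
import autograd.numpy as np
from autograd import elementwise_grad

def f(x):
    y = np.exp(-x)
    return (1.0 - y) / (1.0 + y)

df = elementwise_grad(f)
ddf = elementwise_grad(df)

x = np.linspace(-4, 4, 200)
for g in (f, df, ddf):
    plt.plot(x, g(x))   # f, f', f'' on one set of axes
plt.xlabel("x")
plt.show()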
before autograd

# x, y, yerr are numpy arrays of the same shape

import numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    r = (y - model(theta, x)) / yerr
    return 0.5 * np.sum(r*r)

r = minimize(neg_log_like, [1.0, 10.0, 1.5])

print(r)
after autograd

# x, y, yerr are numpy arrays of the same shape

from autograd import grad
import autograd.numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    r = (y - model(theta, x)) / yerr
    return 0.5 * np.sum(r*r)

r = minimize(neg_log_like, [1.0, 10.0, 1.5],
             jac=grad(neg_log_like))
print(r)

(115 function calls without the gradient; 66 with it)
HIPS/autograd just works

but… HIPS/autograd is not super fast

you might need to drop down to a compiled language

or...
Use Julia?
Uncertainty quantification

if you have:
a non-linear mean model and/or
non-Gaussian/unknown noise

and you want:


parameter uncertainties
Uncertainty

p(physics | data) ∝ p(data | physics) × p(physics)

posterior ∝ likelihood × prior
(the posterior is the distribution of physical parameters consistent with the data)
You're going to have to

SAMPLE

(photo: CC BY-ND, Flickr user Franz Jachim)
MCMC sampling

it's hammer time!

emcee
The MCMC Hammer
MCMC sampling with emcee

dfm.io/emcee; github.com/dfm/emcee
MCMC sampling with emcee

# x, y, yerr are numpy arrays of the same shape

import emcee
import numpy as np

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def log_prob(theta):
    log_prior = 0.0  # improper flat prior
    r = (y - model(theta, x)) / yerr
    return -0.5 * np.sum(r*r) + log_prior

# 32 walkers started in a small ball around the maximum-likelihood point
ndim, nwalkers = 3, 32
p0 = np.array([1.0, 10.0, 1.5])
p0 = p0 + 0.01*np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 1000)
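
The log_prior = 0.0 above is an improper flat prior. A minimal sketch of a proper flat prior inside bounds (the bounds are illustrative, not from the talk):

def log_prob(theta):
    a, b, c = theta
    # zero prior probability outside the (illustrative) bounds
    if not (0.0 < a < 10.0 and 0.0 < b < 100.0 and 0.0 < c < 10.0):
        return -np.inf
    r = (y - model(theta, x)) / yerr
    return -0.5 * np.sum(r * r)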
MCMC sampling with emcee

[Figure: corner plot of the posterior samples in a, b, and c for f_θ(x_n) = a / (1 + e^{-b (x_n - c)}); made using github.com/dfm/corner.py]
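
A sketch of how such a plot could be made from the sampler above (get_chain with discard/flat is the emcee v3 API; the number of discarded burn-in steps is illustrative):

import corner

samples = sampler.get_chain(discard=100, flat=True)  # (n_steps * n_walkers, 3)
fig = corner.corner(samples, labels=["a", "b", "c"])
fig.savefig("corner.png")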
1 initialization

2 bounds

3 convergence

4 gradients

with MCMC, that list becomes:

1 initialization

2 priors

3 convergence

4 gradients?
Other MCMC samplers in Python

1 pymc-devs/pymc3 (hierarchical inference)

2 stan-dev/pystan (hierarchical inference)

3 JohannesBuchner/PyMultiNest (nested sampling)

4 eggplantbren/DNest4 (nested sampling)
in summary…
If your data analysis problem looks like this… *

Physics

mean model
(physical parameters → predicted data)

inference
(parameter estimation)

Data

noise
(stochastic; instrument, systematics, etc.)

* it probably does
… now you know how to solve it! *

https://speakerdeck.com/dfm/pyastro16

* in theory
