
(Seminar) An Introduction To Simulation-Based Inference

The document presents an introduction to simulation-based inference, highlighting its significance in statistical analysis and the advancements made possible by deep learning. It discusses various algorithms for inference, including neural ratio estimation and diagnostics for validating results. Challenges such as the curse of dimensionality and the need for extensive data are acknowledged as areas requiring further development.


An introduction to

simulation-based inference

51st SLAC Summer Institute

August 16, 2023

Gilles Louppe
[email protected]

1 / 36
2 / 36
v_x = v cos(α),  v_y = v sin(α),
dx/dt = v_x,  dy/dt = v_y,  dv_y/dt = −G.

3 / 36
import numpy as np
from numpy import random

G = 9.81  # gravitational acceleration (m/s^2); the imports and this constant are added here, not shown on the slide

def simulate(v, alpha, dt=0.001):
    v_x = v * np.cos(alpha)  # x velocity (m/s)
    v_y = v * np.sin(alpha)  # y velocity (m/s)
    y = 1.1 + 0.3 * random.normal()  # initial height with random perturbation (m)
    x = 0.0

    while y > 0:  # simulate until ball hits floor
        v_y += dt * -G  # acceleration due to gravity
        x += dt * v_x
        y += dt * v_y

    return x + 0.25 * random.normal()  # noisy measurement of the landing position (m)
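
A minimal usage sketch (the prior ranges below are illustrative assumptions, not from the slides): draw parameters and generate one synthetic observation.

v_true = random.uniform(5.0, 15.0)           # hypothetical prior over the initial speed (m/s)
alpha_true = random.uniform(0.0, np.pi / 2)  # hypothetical prior over the launch angle (rad)
x_obs = simulate(v_true, alpha_true)         # one noisy landing position (m)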

4 / 36
5 / 36
What parameter values θ are the most plausible?

6 / 36
7 / 36
Outline
1. Simulation-based inference

2. Algorithms

Neural ratio estimation

Neural posterior estimation

Neural score estimation

3. Diagnostics

8 / 36
Simulation-based inference

8 / 36
Scientific simulators

9 / 36
θ, z, x ∼ p(θ, z, x)

10 / 36
θ, z ∼ p(θ, z∣x)

11 / 36
12 / 36
p(x∣θ) = ∭ p(z_p∣θ) p(z_s∣z_p) p(z_d∣z_s) p(x∣z_d) dz_p dz_s dz_d

yikes!

13 / 36
Bayesian inference

Start with

a simulator that can generate N samples x_i ∼ p(x_i∣θ_i),

a prior model p(θ),

observed data x_obs ∼ p(x_obs∣θ_true).

Then, estimate the posterior

p(θ∣x_obs) = p(x_obs∣θ) p(θ) / p(x_obs).

14 / 36
15 / 36
Algorithms

15 / 36

Credits: Cranmer, Brehmer and Louppe, 2020. 16 / 36
Approximate Bayesian Computation (ABC)

Issues:

How to choose x′ ? ϵ? ∣∣ ⋅ ∣∣?

No tractable posterior.

Need to run new simulations for new data or new prior.


Credits: Johann Brehmer. 17 / 36
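
A minimal rejection-ABC sketch for the projectile example (reusing simulate and the numpy imports above; the prior ranges, tolerance eps, and number of proposals are illustrative assumptions):

def abc_rejection(x_obs, n_proposals=100_000, eps=0.1):
    # Propose parameters from the prior, simulate, and keep proposals whose
    # simulated landing position falls within eps of the observation.
    accepted = []
    for _ in range(n_proposals):
        v = random.uniform(5.0, 15.0)              # assumed prior over speed (m/s)
        alpha = random.uniform(0.0, np.pi / 2)     # assumed prior over angle (rad)
        if abs(simulate(v, alpha) - x_obs) < eps:  # the distance and eps must be chosen by hand
            accepted.append((v, alpha))
    return np.array(accepted)  # samples from the ABC approximation of p(θ∣x_obs)

Changing x_obs, the prior, or eps requires running all the simulations again, which is one of the issues listed above.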

Credits: Cranmer, Brehmer and Louppe, 2020. 18 / 36
Neural ratio estimation
The likelihood-to-evidence ratio r(x∣θ) = p(x∣θ)/p(x) = p(x,θ)/(p(x)p(θ)) can be learned, even
if neither the likelihood nor the evidence can be evaluated:

A classifier is trained to distinguish pairs x, θ ∼ p(x, θ) from pairs x, θ ∼ p(x)p(θ); its output yields r^(x∣θ).

Credits: Cranmer et al, 2015; Hermans et al, 2020. 19 / 36
The solution d found after training approximates the optimal classifier

d(x, θ) ≈ d∗(x, θ) = p(x, θ) / (p(x, θ) + p(x)p(θ)).

Therefore,

r(x∣θ) = p(x∣θ)/p(x) = p(x, θ)/(p(x)p(θ)) ≈ d(x, θ)/(1 − d(x, θ)) = r^(x∣θ).

20 / 36
p(θ∣x) ≈ r^(x∣θ)p(θ)

21 / 36
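
A minimal PyTorch sketch of this training loop (assuming PyTorch is available; the network width, optimizer settings, and the prior_sample / simulator_sample helpers returning tensors of shape (batch, 1) are illustrative assumptions, not part of the slides):

import torch
import torch.nn as nn

# Classifier d(x, θ); its raw output is the logit of d.
classifier = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                           nn.Linear(64, 64), nn.ReLU(),
                           nn.Linear(64, 1))
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    theta = prior_sample(256)        # hypothetical helper: θ ∼ p(θ), shape (256, 1)
    x = simulator_sample(theta)      # hypothetical helper: x ∼ p(x∣θ), shape (256, 1)
    theta_marg = theta[torch.randperm(len(theta))]  # shuffling breaks the pairing: x, θ ∼ p(x)p(θ)

    logits_joint = classifier(torch.cat([x, theta], dim=1))      # label 1: joint pairs
    logits_marg = classifier(torch.cat([x, theta_marg], dim=1))  # label 0: marginal pairs
    loss = bce(logits_joint, torch.ones_like(logits_joint)) \
         + bce(logits_marg, torch.zeros_like(logits_marg))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Since d/(1 − d) ≈ r, the classifier logit is directly log r^(x∣θ),
# and the approximate posterior is p^(θ∣x) ∝ exp(logit) · p(θ).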
Constraining dark matter with stellar streams

Interaction of Pal 5 with two …


Image credits: C. Bickel/Science; D. Erkal. 22 / 36

Credits: Hermans et al, 2021. 23 / 36
Preliminary results for GD-1 suggest a preference for CDM over WDM.

24 / 36
Neural Posterior Estimation

min_φ E_p(x) [KL(p(θ∣x) ∣∣ q_φ(θ∣x))]


25 / 36
Normalizing flows

A normalizing flow is a sequence of invertible transformations f_k that map a
simple distribution p_0 to a more complex distribution p_K:

By the change of variables formula, the log-likelihood of a sample x is given by

log p(x) = log p(z_0) − ∑_{k=1}^{K} log ∣det J_{f_k}(z_{k−1})∣.

26 / 36
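
To connect the two previous slides, here is a minimal sketch of neural posterior estimation with a single conditional affine transformation, i.e. a one-step normalizing flow (assuming PyTorch; practical applications stack many invertible steps, and the prior_sample / simulator_sample helpers are the same illustrative assumptions as before). Maximizing log q_φ(θ∣x) over simulated pairs minimizes the expected KL objective above.

import math
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    # One invertible step θ = μ(x) + exp(s(x)) · z with z ∼ N(0, I);
    # deeper flows compose several such transformations f_k.
    def __init__(self, x_dim=1, theta_dim=1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * theta_dim))
        self.theta_dim = theta_dim

    def log_prob(self, theta, x):
        mu, log_sigma = self.net(x).chunk(2, dim=-1)
        z = (theta - mu) * torch.exp(-log_sigma)   # inverse transformation z = f^{-1}(θ)
        log_base = -0.5 * (z ** 2).sum(-1) - 0.5 * self.theta_dim * math.log(2 * math.pi)
        log_det = log_sigma.sum(-1)                # log ∣det J_f∣ of the forward map
        return log_base - log_det                  # change of variables, as on the slide

flow = ConditionalAffineFlow()
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-3)

for step in range(1000):
    theta = prior_sample(256)       # hypothetical helper: θ ∼ p(θ), shape (256, 1)
    x = simulator_sample(theta)     # hypothetical helper: x ∼ p(x∣θ), shape (256, 1)
    loss = -flow.log_prob(theta, x).mean()   # maximize E[log q_φ(θ∣x)]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()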
Exoplanet atmosphere characterization


Credits: NASA/JPL-Caltech, 2010. 27 / 36

Credits: Vasist et al, 2023. 28 / 36
Diagnostics

28 / 36
p^(θ∣x) = sbi(p(x∣θ), p(θ), x)

We must make sure our approximate simulation-based inference algorithms can (at least) actually realize faithful inferences on the (expected) observations.

How do we know this is good enough?

29 / 36
Mode convergence

The maximum a posteriori estimate converges towards the nominal value θ∗ for an increasing number of independent and identically distributed observables x_i ∼ p(x∣θ∗):

lim_{N→∞} arg max_θ p(θ∣{x_i}_{i=1}^N) = lim_{N→∞} arg max_θ p(θ) ∏_i r(x_i∣θ) = θ∗


Credits: Brehmer et al, 2019. 30 / 36
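
In practice, with a ratio estimator such as the classifier sketched earlier (1-D x and θ assumed), the pooled posterior for N i.i.d. observations can be evaluated on a grid, up to a constant, as log p(θ) + Σ_i log r^(x_i∣θ); a small illustrative sketch:

def log_posterior_pooled(theta_grid, x_list, log_prior, classifier):
    # theta_grid: (M, 1) tensor of candidate parameter values
    # x_list:     observed tensors x_i, each of shape (1,)
    # Returns log p(θ) + Σ_i log r^(x_i∣θ), up to an additive constant.
    log_post = log_prior(theta_grid)                 # hypothetical prior log-density, shape (M,)
    for x in x_list:
        pairs = torch.cat([x.expand(len(theta_grid), 1), theta_grid], dim=1)
        log_post = log_post + classifier(pairs).squeeze(-1)   # classifier logit = log r^
    return log_post

# The MAP estimate is theta_grid[log_posterior_pooled(...).argmax()];
# as N grows, it converges towards θ∗ as stated above.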
Coverage diagnostic

For x, θ ∼ p(x, θ), compute the 1 − α credible interval based on p^(θ∣x).

If the fraction of samples for which θ is contained within the interval is larger than the nominal coverage probability 1 − α, then the approximate posterior p^(θ∣x) has coverage.


Credits: Hermans et al, 2021; Siddharth Mishra-Sharma, 2021. 31 / 36
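
A minimal sketch of this diagnostic for a 1-D parameter (assuming a posterior_sample(x, n) helper that draws from p^(θ∣x); central credible intervals are used here for simplicity, whereas highest-density regions are common in practice):

def empirical_coverage(pairs, posterior_sample, alpha=0.05, n_post=1000):
    # pairs: list of (theta_true, x) drawn from p(θ, x) with the simulator.
    # Returns the fraction of pairs whose true θ lies inside the central
    # 1 − alpha credible interval of the approximate posterior.
    hits = 0
    for theta_true, x in pairs:
        samples = posterior_sample(x, n_post)   # hypothetical helper: θ ∼ p^(θ∣x)
        lo, hi = np.quantile(samples, [alpha / 2, 1 - alpha / 2])
        hits += int(lo <= theta_true <= hi)
    return hits / len(pairs)

# If the returned fraction is at least 1 − alpha, the approximate posterior
# has coverage at that credibility level (it is not overconfident).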

Credits: Hermans et al, 2021. 32 / 36
What if diagnostics fail?

33 / 36
Balanced NRE
Enforce neural ratio estimation to be conservative by using binary classifiers d^ that are balanced, i.e. such that

E_{p(θ,x)}[d^(θ, x)] = E_{p(θ)p(x)}[1 − d^(θ, x)].


Credits: Delaunoy et al, 2022. 34 / 36
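
A sketch of how this condition could be imposed during training (reusing logits_joint, logits_marg and bce from the NRE sketch above; the quadratic penalty and its weight lam are illustrative assumptions — see Delaunoy et al. for the exact formulation):

lam = 100.0  # penalty strength (assumed)

d_joint = torch.sigmoid(logits_joint)   # d^ on pairs from p(θ, x)
d_marg = torch.sigmoid(logits_marg)     # d^ on pairs from p(θ)p(x)

# Balancing condition: E_{p(θ,x)}[d^] = E_{p(θ)p(x)}[1 − d^],
# i.e. d_joint.mean() + d_marg.mean() should equal 1.
balance = d_joint.mean() + d_marg.mean() - 1.0
loss = bce(logits_joint, torch.ones_like(logits_joint)) \
     + bce(logits_marg, torch.zeros_like(logits_marg)) \
     + lam * balance ** 2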

Credits: Delaunoy et al, 2022. 35 / 36
Summary
Advances in deep learning have enabled new approaches to statistical inference.

This is a major evolution in the statistical capabilities for science, as it enables the analysis of complex models and data without simplifying assumptions.

Inference remains approximate and requires careful validation.

Obstacles remain to be overcome, such as the curse of dimensionality and the need for large amounts of data.

36 / 36
The end.

36 / 36
