A Bayesian approach with Gaussian priors to the inverse problem of source identification in elliptic PDEs

Matteo Giordano

ESOMAS Department, University of Turin,
Corso Unione Sovietica 218 bis, Turin, Italy
Abstract

We consider the statistical linear inverse problem of making inference on an unknown source function in an elliptic partial differential equation from noisy observations of its solution. We employ nonparametric Bayesian procedures based on Gaussian priors, leading to convenient conjugate formulae for posterior inference. We review recent results providing theoretical guarantees on the quality of the resulting posterior-based estimation and uncertainty quantification, and we discuss the application of the theory to the important classes of Gaussian series priors defined on the Dirichlet-Laplacian eigenbasis and Matérn process priors. We provide an implementation of posterior inference for both classes of priors, and investigate its performance in a numerical simulation study. The reproducible code is available at: https://github.com/MattGiord/Bayesian-Source-Identification

Keywords. Parameter identification; semiparametric inference; uncertainty quantification; frequentist analysis of posterior distributions; simulation study.

1 Introduction

Linear inverse problems consist in the task of recovering unknown objects or physical quantities from linear indirect noisy measurements. A widespread mathematical formulation for these problems postulates that the recovery target be an element $f$ of a Hilbert space $\mathbb{H}_{1}$, and that the data arise according to the equation

$$Y^{\varepsilon}=G(f)+\varepsilon W, \tag{1}$$

where $G:\mathbb{H}_{1}\to\mathbb{H}_{2}$ is a linear operator between $\mathbb{H}_{1}$ and another Hilbert space $\mathbb{H}_{2}$, $W$ is additive observational noise and $\varepsilon>0$ is the noise level. In view of the central limit theorem, normality of the measurement errors can often be maintained, whereby $W$ is assumed to be a white noise process indexed by $\mathbb{H}_{2}$. The goal is then to estimate $f$ from an observed realisation of $Y^{\varepsilon}$.

Observation schemes as in equation (1) are found in a variety of scientific fields and engineering applications, including medical imaging [9], geophysics [40], acoustics [12] and finance [7]. For example, Computerised Tomography (CT), a technique to obtain detailed images of the human body and information about the density variation of the tissues, is based on a mathematical model (related to the ‘Radon transform’) for the absorption of X-rays. Similar concepts underpin many other medical imaging techniques, such as Magnetic Resonance (MR) and Positron Emission Tomography (PET); see [9] for further details and references.

In many such applications, the unknown $f$ may be characterised as a functional coefficient governing a partial differential equation (PDE), while the observed object $G(f)$ is the corresponding PDE solution. The ‘forward operator’ is then the coefficient-to-solution map $G:f\mapsto G(f)$. See the monograph [24] for an extensive overview of PDE-based inverse problems.

In the present article, we shall mostly focus on the representative example of ‘source identification’ in elliptic PDEs. Many of the ideas developed hereafter have a natural application to other linear inverse problems. Let $\mathcal{O}\subset\mathbb{R}^{d}$ be an open and bounded set with smooth boundary $\partial\mathcal{O}$. Let the unknown (square-integrable) function $f\in\mathbb{H}_{1}\equiv L^{2}(\mathcal{O})$ be the ‘source term’ in the elliptic PDE with zero Dirichlet boundary conditions,

$$\begin{split}\nabla\cdot(c\nabla u)-f&=0,\ \ \text{on}\ \ \mathcal{O},\\ u&=0,\ \ \text{on}\ \ \partial\mathcal{O},\end{split} \tag{2}$$

where $\nabla\cdot$ and $\nabla$ denote, respectively, the divergence and gradient operators, and where the smooth and positive ‘diffusion coefficient’ $c\in C^{\infty}(\overline{\mathcal{O}})$, $\inf_{x\in\mathcal{O}}c(x)>0$, is assumed to be known. By standard elliptic theory, e.g. [33, Chapter 2] and [14, Chapter 6], for any $f\in L^{2}(\mathcal{O})$ there exists a unique (weak) solution $G(f)\equiv u$ in the Sobolev space $H^{1}(\mathcal{O})\subset L^{2}(\mathcal{O})\equiv\mathbb{H}_{2}$, giving rise to a linear (injective, self-adjoint and compact) operator $G:L^{2}(\mathcal{O})\to L^{2}(\mathcal{O})$; see Appendix B in [18] for further details. We then consider the problem of estimating the source $f$ from observations $Y^{\varepsilon}$ arising as in (1), with $G$ the solution map associated to the PDE (2) and $W$ a Gaussian white noise process indexed by $L^{2}(\mathcal{O})$. An illustration of the problem with synthetic data is provided in Figure 1 below.
Among the numerous application areas, inverse problems based on elliptic PDEs are important building blocks in oil reservoir modelling [47]. Source identification problems have been extensively investigated in the applied mathematics and statistics communities; see [2, 13, 21, 18] and the many references therein.
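To make the source-to-solution map $G$ concrete, the following minimal Python sketch discretises the PDE (2) in the simplest setting $d=1$, $\mathcal{O}=(0,1)$, with a standard finite-difference scheme and a tridiagonal solve. It is an illustration only and is not taken from the accompanying MATLAB code; the function name `solve_source_to_solution`, the uniform grid and the choice of scheme are assumptions made here.

```python
import math


def solve_source_to_solution(f_vals, c_func):
    """Approximate G(f) for (c u')' = f on (0, 1) with u(0) = u(1) = 0.

    f_vals holds f at the n interior nodes x_i = i * h, h = 1 / (n + 1);
    c_func is the (assumed known) positive diffusion coefficient.
    """
    n = len(f_vals)
    h = 1.0 / (n + 1)
    sub, diag, sup = [], [], []
    for i in range(1, n + 1):
        c_left = c_func((i - 0.5) * h)   # c evaluated at x_{i-1/2}
        c_right = c_func((i + 0.5) * h)  # c evaluated at x_{i+1/2}
        sub.append(c_left / h ** 2)
        diag.append(-(c_left + c_right) / h ** 2)
        sup.append(c_right / h ** 2)

    # Thomas algorithm: forward elimination, then back substitution.
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = sup[0] / diag[0]
    dp[0] = f_vals[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - sub[i] * cp[i - 1]
        cp[i] = sup[i] / denom
        dp[i] = (f_vals[i] - sub[i] * dp[i - 1]) / denom
    u = [0.0] * n
    u[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        u[i] = dp[i] - cp[i] * u[i + 1]
    return u


if __name__ == "__main__":
    n = 199
    xs = [(i + 1) / (n + 1) for i in range(n)]
    # With c = 1 and f = -pi^2 sin(pi x), the exact solution is u = sin(pi x).
    f = [-math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    u = solve_source_to_solution(f, lambda x: 1.0)
    print(max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs)))
```

Since the discrete system is linear in the right-hand side, the computed map inherits the linearity of $G$: doubling `f_vals` doubles the output.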

Figure 1: Left: a source function $f$ on a rotated elliptically-shaped domain, modelling three heat sources centred at the points $(-0.5,0)$, $(0,0)$ and $(0,0.5)$. Right: noisy observations of the associated PDE solution $G(f)$ (with fixed diffusivity $c$).
Figure 2: Left: the posterior mean estimate of the source function $f$. Right: cross section along the $x$-axis of the true source function $f_{0}$ (in black), of the posterior mean (in red), and of $2500$ posterior samples (in green).

Here, we shall pursue the popular nonparametric Bayesian approach to inverse problems [41]. We shall assign to the unknown source $f$ a prior distribution $\Pi(\cdot)$ supported on the function space $L^{2}(\mathcal{O})$ and then, following the Bayesian paradigm, combine it with the (Gaussian) data likelihood induced by the statistical model (1) to form the posterior distribution $\Pi(\cdot|Y^{\varepsilon})$ of $f|Y^{\varepsilon}$, representing our updated belief about the inferential target and providing us with point estimates and uncertainty quantification. In particular, Gaussian priors represent a natural choice for inference in the context of the observation scheme (1) due to their conjugacy, which leads to convenient explicit formulae for the posterior distribution.

The Bayesian approach to inverse problems dates back at least to the 1980s, with early seminal work laid out in [43, 31, 32] among others, and has since gained enormous popularity across several applied fields. See the monographs [25, 42], as well as the more recent reviews [8, 6], where further references can be found. Over the last decade, a large number of articles have investigated the theoretical recovery performance of Gaussian priors in linear inverse problems. We refer in particular to [29, 30, 4, 27, 20], and also mention [1, 36, 19, 35, 17] for results in nonlinear problems. Recently, Giordano and Kekkonen [18], building on earlier findings by [34], identified general conditions on the forward operator, on the ground truth and on the prior distribution under which ‘semiparametric’ Bernstein-von Mises theorems can be obtained, characterising the asymptotic shape of the posterior distribution of a large collection of linear functionals of the unknown. For the problem of estimating the source $f$ in the elliptic PDE (2) from observations $Y^{\varepsilon}$ arising as in (1), they showed that a wide class of ‘standard’ centred Gaussian process priors (such as the one associated to the commonly used Matérn kernel) yields, in the small noise limit $\varepsilon\to 0$ and under the frequentist assumption that the data have been generated by a fixed ground truth $f_{0}$, valid and optimal estimation and uncertainty quantification, via the posterior mean estimate and credible intervals centred around it, for one-dimensional aspects of the unknown source function.

For this study, we shall focus on the implementation of the Bayesian procedures with Gaussian priors for the inverse problem of source identification that were investigated in [18], corroborating the theory developed therein with a numerical simulation study. To this end, we will first review, in Section 2.2, the asymptotic results derived in [18]. We will then provide, in Section 2.3, two examples of Gaussian priors satisfying the assumptions of the general theory; in particular, we will consider centred Gaussian series priors defined on the eigenbasis of the Dirichlet-Laplacian, as well as centred stationary Gaussian process priors associated to the Matérn covariance kernel. Both classes of priors are of practical interest and widely used, and they will lead to two different discretisation strategies for the implementation.

We will present the numerical simulation study in Section 3, where we will provide an implementation of posterior inference in the source identification problem for Gaussian series priors and for the Matérn process priors. The series approach will hinge on a natural discretisation of the parameter space via high-dimensional basis expansions, while for the Matérn priors we will employ piecewise linear functions defined on the elements of a deterministic triangular mesh. Under a suitable discretisation of the statistical model (1), we will provide the explicit formulae for the Gaussian conjugate posterior distributions, which we will exploit to efficiently compute the posterior mean estimates and to implement, via posterior sampling, credible sets for uncertainty quantification. In the study, we will numerically investigate the asymptotic concentration of the posterior distribution around the ground truth generating the data, the efficiency of the posterior mean estimator (in terms of minimality of its asymptotic variance), and the frequentist coverage of the obtained credible intervals. Overall, the study shows a close correspondence between the numerical results and the performance predicted by the theory developed in [18]. The posterior mean estimate (relative to a Gaussian series prior) and the associated uncertainty quantification (obtained via the cross-section of 2500 posterior samples along the $x$-axis) are depicted in Figure 2, to be compared to the true source function shown in Figure 1. The reproducible (MATLAB) code used for the study is available at: https://github.com/MattGiord/Bayesian-Source-Identification.

2 A nonparametric Bayesian approach with Gaussian priors

In this section, we formally introduce the statistical model and the Bayesian procedures of interest for the present article. We will provide details on the likelihood, the prior and the posterior distribution in Section 2.1. We will then review the asymptotic results of [18] in Section 2.2, and further present two classes of Gaussian priors to which the theory applies in Section 2.3.

2.1 Details on the Bayesian model

Throughout, let $\mathcal{O}\subset\mathbb{R}^{d}$, $d\in\mathbb{N}$, be a non-empty, open and bounded set with smooth boundary $\partial\mathcal{O}$. For a fixed, smooth and positive ‘diffusion coefficient’ $c\in C^{\infty}(\overline{\mathcal{O}})$, $\inf_{x\in\mathcal{O}}c(x)>0$, let $G:L^{2}(\mathcal{O})\to L^{2}(\mathcal{O})$ be the (linear) source-to-solution map associated to the elliptic PDE (2); see Appendix B in [18] for details.
For such $G$, and a given noise level $\varepsilon>0$, consider observations $Y^{\varepsilon}$ arising as in (1) for some unknown $f\in L^{2}(\mathcal{O})$, where $W$ is a white noise process indexed by $L^{2}(\mathcal{O})$ defined on some probability space $(\Omega,\mathcal{S},\Pr)$, that is, the centred Gaussian process $(W(g),\ g\in L^{2}(\mathcal{O}))$ with covariance $\mathrm{E}[W(g_{1})W(g_{2})]=\langle g_{1},g_{2}\rangle_{2}$. Throughout most of the article, we will regard $\varepsilon$ as known. In practice, it can often be replaced by an estimate (cf. Section 3.3.2).
As described, for example, in Chapter 1 of [16], observing $Y^{\varepsilon}$ is understood as observing a realisation of the Gaussian process $(Y^{\varepsilon}(g),\ g\in L^{2}(\mathcal{O}))$ with mean $\mathrm{E}[Y^{\varepsilon}(g)]=\langle G(f),g\rangle_{2}$ and covariance $\mathrm{Cov}[Y^{\varepsilon}(g_{1}),Y^{\varepsilon}(g_{2})]=\varepsilon^{2}\langle g_{1},g_{2}\rangle_{2}$, the factor $\varepsilon^{2}$ arising from the scaling of the white noise in (1). Such an observation scheme serves as a convenient continuous counterpart of the inverse regression model

$$Y_{i}=G(f)(X_{i})+\sigma W_{i},\qquad i=1,\dots,n, \tag{3}$$

comprising noisy point evaluations of the PDE solution $G(f)$ over a set of points $X_{1},\dots,X_{n}\in\mathcal{O}$, corrupted by Gaussian measurement errors $\sigma W_{1},\dots,\sigma W_{n}\overset{\textnormal{iid}}{\sim}N(0,\sigma^{2})$, $\sigma>0$, known to be asymptotically equivalent (in the sense of [10]) to the white noise model (1) under suitable assumptions on the grid and the calibration $n/\sigma^{2}\simeq\varepsilon^{-2}$.
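As a small worked example of the calibration, the sketch below treats $n/\sigma^{2}\simeq\varepsilon^{-2}$ as an exact equality, ignoring the constants hidden in $\simeq$. This simplification and the helper name `equivalent_noise_level` are assumptions made for illustration.

```python
import math


def equivalent_noise_level(n, sigma):
    """Return eps solving n / sigma^2 = eps^(-2), i.e. eps = sigma / sqrt(n)."""
    if n <= 0 or sigma <= 0:
        raise ValueError("n and sigma must be positive")
    return sigma / math.sqrt(n)


if __name__ == "__main__":
    # Quadrupling the number of design points halves the matched noise level.
    for n in (100, 400, 1600):
        print(n, equivalent_noise_level(n, sigma=1.0))
```

Under this heuristic, the small noise limit $\varepsilon\to 0$ corresponds to a growing sample size $n$ at a fixed error standard deviation $\sigma$.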

For continuous observations $Y^{\varepsilon}$ arising as in (1), for any $f\in L^{2}(\mathcal{O})$, the (cylindrically-defined) law $P^{\varepsilon}_{f}$ of $Y^{\varepsilon}$ is absolutely continuous with respect to the law $P_{0}^{\varepsilon}$ of the scaled white noise $\varepsilon W$, with log-likelihood

$$\ell_{\varepsilon}(f):=\log\frac{dP^{\varepsilon}_{f}}{dP^{\varepsilon}_{0}}(Y^{\varepsilon})=\frac{1}{\varepsilon^{2}}Y^{\varepsilon}[G(f)]-\frac{1}{2\varepsilon^{2}}\|G(f)\|_{2}^{2}.$$

In view of the joint measurability of $\ell_{\varepsilon}$, regarding $f$ as a random function with values in $L^{2}(\mathcal{O})$ and assigning to it any prior distribution in the form of a Borel probability measure $\Pi(\cdot)$ supported on $L^{2}(\mathcal{O})$ then induces, via Bayes’ formula (for example, [15, p.7]), the posterior distribution

$$\Pi(A|Y^{\varepsilon})=\frac{\int_{A}e^{\ell_{\varepsilon}(f)}d\Pi(f)}{\int_{L^{2}(\mathcal{O})}e^{\ell_{\varepsilon}(f')}d\Pi(f')},\qquad A\subseteq L^{2}(\mathcal{O})\ \textnormal{measurable},$$

that is, the conditional distribution of $f|Y^{\varepsilon}$. In particular, it will be of interest to consider Gaussian priors which, in view of the linearity of the forward operator $G$ and the normality assumption on the noise $W$, will lead to conjugate Gaussian posteriors. Concrete formulae are provided in Section 3. In the following, we will repeatedly use elements of the theory of Gaussian processes and measures on Hilbert spaces, and we refer to [16, Chapter 2] for the necessary background. For a Gaussian prior $\Pi(\cdot)$ on $L^{2}(\mathcal{O})$, the ‘information geometry’ is encoded within an associated reproducing kernel Hilbert space (RKHS) of functions defined on the domain $\mathcal{O}$, strictly contained inside $L^{2}(\mathcal{O})$. Popular prior choices in applications and theoretical studies typically model functions belonging to a ‘smoothness scale’, with associated RKHS equal to (or included in) a Sobolev space $H^{\alpha}(\mathcal{O})$, for some regularity level $\alpha>0$. These include Gaussian series priors defined on bases spanning the Sobolev scale, as well as stationary Gaussian processes with the Matérn covariance kernel; see [15, Chapter 11].
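The conjugacy mentioned above can be illustrated in a simplified diagonal setting, which is a sketch under stated assumptions rather than the discretisation used in Section 3. Assume $d=1$, $\mathcal{O}=(0,1)$ and $c\equiv 1$, so that $G$ acts on the Dirichlet-Laplacian eigenfunctions $e_{j}$ (with eigenvalues $\lambda_{j}=(j\pi)^{2}$) as multiplication by $-\lambda_{j}^{-1}$, and assume independent prior coefficients $f_{j}\sim N(0,\lambda_{j}^{-\alpha})$. The observations $Y_{j}=-\lambda_{j}^{-1}f_{j}+\varepsilon W_{j}$ then yield a scalar Gaussian posterior for each coefficient. All function names and the specific prior scaling are hypothetical choices made for this example.

```python
import math
import random


def dirichlet_eigenvalues(J):
    """Eigenvalues (j * pi)^2 of the Dirichlet-Laplacian on (0, 1)."""
    return [(j * math.pi) ** 2 for j in range(1, J + 1)]


def simulate_coefficients(f_coeffs, eps, rng):
    """Draw Y_j = -f_j / lambda_j + eps * W_j with W_j iid N(0, 1)."""
    lams = dirichlet_eigenvalues(len(f_coeffs))
    return [-f_j / lam + eps * rng.gauss(0.0, 1.0)
            for f_j, lam in zip(f_coeffs, lams)]


def conjugate_posterior(y, eps, alpha):
    """Posterior means and variances of the coefficients f_j given Y_j."""
    means, variances = [], []
    for y_j, lam in zip(y, dirichlet_eigenvalues(len(y))):
        g = -1.0 / lam               # G e_j = g * e_j when c = 1
        prior_var = lam ** (-alpha)  # assumed prior variance of f_j
        post_var = 1.0 / (g * g / eps ** 2 + 1.0 / prior_var)
        post_mean = post_var * g * y_j / eps ** 2
        means.append(post_mean)
        variances.append(post_var)
    return means, variances


def sample_posterior(means, variances, rng):
    """Draw one posterior sample of the coefficient vector."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0)
            for m, v in zip(means, variances)]


if __name__ == "__main__":
    rng = random.Random(0)
    f_true = [1.0, -0.5, 0.25]
    y = simulate_coefficients(f_true, eps=1e-3, rng=rng)
    means, variances = conjugate_posterior(y, eps=1e-3, alpha=1.0)
    print(means)
    print(sample_posterior(means, variances, rng))
```

Because prior and forward map are diagonal in the same basis here, no matrix inversion is needed; for general $c$ or for Matérn priors the posterior covariance is a full matrix and the same formulae hold in matrix form.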

2.2 Theoretical guarantees for estimation and uncertainty quantification

The asymptotic properties of nonparametric Bayesian procedures with Gaussian priors in inverse problems have recently been investigated by Giordano and Kekkonen [18], resulting, under general ‘regularity conditions’ for the forward operator, for the ground truth and for the prior distribution, in semiparametric Bernstein-von Mises theorems that entail the convergence of certain one-dimensional posterior distributions to limiting Gaussian probability measures with minimal variance, in the small noise limit and under the frequentist assumption that the data have been generated by a fixed ground truth. These results were then leveraged in [18] to prove the asymptotic efficiency of the associated posterior mean estimators, as well as to derive theoretical guarantees certifying that credible intervals centred around them are asymptotically valid confidence intervals with minimal width. In this section, we provide a review of the findings of [18] for the inverse problem of source identification. These will later be corroborated by the results of the numerical simulation study presented in Section 3.

The investigation of [18] builds on the semiparametric approach to the Bernstein-von Mises theorem in infinite-dimensional statistical models developed by Castillo and Nickl [11] and later refined by Monard et al. [34] in the inverse problem setting. It is based on the study of the posterior distributions of a class of scaled and centred one-dimensional functionals of the unknown which, in the context of the source identification problem, take the form $\varepsilon^{-1}\langle f-\bar{f}_{\varepsilon},\psi\rangle_{2}$ for test functions $\psi\in L^{2}(\mathcal{O})$, where $\bar{f}_{\varepsilon}:=E^{\Pi}[f|Y^{\varepsilon}]$ is the posterior mean. Let $\mathcal{L}(\varepsilon^{-1}\langle f-\bar{f}_{\varepsilon},\psi\rangle_{2}|Y^{\varepsilon})$ denote the associated scaled and centred posterior distribution.

Theorem 1 (Theorem 4.1 in [18]).

Let $\Pi(\cdot)$ be a centred Gaussian Borel probability measure supported on $L^{2}(\mathcal{O})$ with RKHS equal to $H^{\alpha}(\mathcal{O})$ for some $\alpha>d/2$. Let $f_{0}\in H^{\beta}(\mathcal{O})$, for some $\beta>\alpha-d/2$, be compactly supported inside $\mathcal{O}$, and consider observations $Y^{\varepsilon}\sim P^{\varepsilon}_{f_{0}}$ from the statistical model (1) with $G(f)$ the solution to the PDE (2) and $f=f_{0}$. Then, for any $\gamma>2+d/2$ and any compactly supported test function $\psi\in H^{\gamma}(\mathcal{O})$, we have

$$\mathcal{L}(\varepsilon^{-1}\langle f-\bar{f}_{\varepsilon},\psi\rangle_{2}|Y^{\varepsilon})\overset{\mathcal{L}}{\longrightarrow}N(0,\|\nabla\cdot(c\nabla\psi)\|^{2}_{2}), \tag{4}$$

in $P^{\varepsilon}_{f_{0}}$-probability as $\varepsilon\to 0$.

The result asserts that the random (data-dependent) one-dimensional probability distribution $\mathcal{L}(\varepsilon^{-1}\langle f-\bar{f}_{\varepsilon},\psi\rangle_{2}|Y^{\varepsilon})$ converges (in the topology of weak convergence) in probability to a centred normal distribution with variance $\|\nabla\cdot(c\nabla\psi)\|^{2}_{2}$. The latter can be shown to be minimal [18, Remark 2.4], as it coincides with the Cramér-Rao lower bound for estimating the one-dimensional quantity $\langle f,\psi\rangle_{2}$ from data $Y^{\varepsilon}$ arising as in (1). Furthermore, the class of test functions $\psi$ for which the convergence (4) is obtained is to be understood to be maximal, in the sense that the infinite-dimensional Gaussian probability measure with marginals identified by the right-hand side of (4) is tight (a necessary condition for weak convergence) only when $\gamma>2+d/2$; see Lemma 4.2 in [18] and the related discussion.
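To see where the limiting variance comes from, it may help to work out a simplified special case, offered here as a heuristic illustration rather than as the argument of [18]. Assume $c\equiv 1$, so that $G$ is the inverse of the Dirichlet-Laplacian, and let $(e_{j},\lambda_{j})_{j\geq 1}$ be its orthonormal eigenpairs, $\Delta e_{j}=-\lambda_{j}e_{j}$ with $\lambda_{j}>0$. The symbols $e_{j}$, $\lambda_{j}$, $f_{j}$, $\psi_{j}$, $Y_{j}$, $W_{j}$ and $I_{j}$ are notation introduced only for this example. Testing (1) against $e_{j}$ and using the self-adjointness of $G$ gives independent scalar Gaussian experiments, and the information bound for $\langle f,\psi\rangle_{2}$ follows, at least formally, coordinate by coordinate:

$$\begin{aligned}
Y_{j}&:=Y^{\varepsilon}(e_{j})=-\lambda_{j}^{-1}f_{j}+\varepsilon W_{j},\qquad W_{j}\overset{\textnormal{iid}}{\sim}N(0,1),\quad f_{j}:=\langle f,e_{j}\rangle_{2},\\
I_{j}&=\frac{\lambda_{j}^{-2}}{\varepsilon^{2}}\qquad\text{(Fisher information for }f_{j}\text{)},\\
\sum_{j}\frac{\psi_{j}^{2}}{I_{j}}&=\varepsilon^{2}\sum_{j}\lambda_{j}^{2}\psi_{j}^{2}=\varepsilon^{2}\|\Delta\psi\|_{2}^{2},\qquad\psi_{j}:=\langle\psi,e_{j}\rangle_{2}.
\end{aligned}$$

After rescaling by $\varepsilon^{-1}$, the bound equals $\|\Delta\psi\|_{2}^{2}$, which is the variance in (4) for $c\equiv 1$. The last identity uses that a compactly supported $\psi\in H^{\gamma}(\mathcal{O})$ with $\gamma>2$ lies in the domain of the Dirichlet-Laplacian.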

A first important consequence of Theorem 1 is a central limit theorem for the ‘plug-in’ posterior mean estimators $\langle\bar{f}_{\varepsilon},\psi\rangle_{2}$ of the one-dimensional aspects $\langle f,\psi\rangle_{2}$ of the unknown. Note that, for Gaussian priors, these can be efficiently computed via the explicit formulae for the conjugate Gaussian posteriors. The central limit theorem follows, as argued in Remark 2.4 in [18], from the convergence of moments in the limit (4). In particular, under the assumptions of Theorem 1, it holds that

$$\varepsilon^{-1}\left(\langle\bar{f}_{\varepsilon},\psi\rangle_{2}-\langle f_{0},\psi\rangle_{2}\right)\overset{d}{\longrightarrow}N(0,\|\nabla\cdot(c\nabla\psi)\|^{2}_{2}), \tag{5}$$

as $\varepsilon\to 0$. In view of the aforementioned minimality of the asymptotic variance $\|\nabla\cdot(c\nabla\psi)\|^{2}_{2}$, the result indeed implies the asymptotic efficiency of the plug-in estimators $\langle\bar{f}_{\varepsilon},\psi\rangle_{2}$.

The second implication of the Bernstein-von Mises result stated in Theorem 1 concerns the coverage and width of credible intervals built around the efficient estimators $\langle\bar{f}_{\varepsilon},\psi\rangle_{2}$, which can be shown to be asymptotically valid frequentist confidence intervals and to have diameter shrinking at the optimal parametric rate $\varepsilon$. For any level $a\in(0,1)$, consider the $100(1-a)\%$-credible interval

$$C_{\varepsilon,a}:=\{z\in\mathbb{R}:|z-\langle\bar{f}_{\varepsilon},\psi\rangle_{2}|\leq R_{\varepsilon,a}\}, \tag{6}$$

where $R_{\varepsilon,a}>0$ is the $(1-a/2)$-quantile of the centred one-dimensional (Gaussian) posterior distribution of $\langle f-\bar{f}_{\varepsilon},\psi\rangle_{2}|Y^{\varepsilon}$, so that

$$\Pi\left(f:\langle f,\psi\rangle_{2}\in C_{\varepsilon,a}|Y^{\varepsilon}\right)=1-a.$$

Then, in the setting of Theorem 1, the asymptotic frequentist coverage of $C_{\varepsilon,a}$ is given by

$$P^{\varepsilon}_{f_{0}}\left(\langle f_{0},\psi\rangle_{2}\in C_{\varepsilon,a}\right)\to 1-a, \tag{7}$$

as $\varepsilon\to 0$, while its radius $R_{\varepsilon,a}$ satisfies

$$R_{\varepsilon,a}=O_{P^{\varepsilon}_{f_{0}}}(\varepsilon).$$

See Corollary 2.5 in [18]. Note that although an analytic formulation of the credible intervals $C_{\varepsilon,a}$ requires the derivation of the quantiles of the one-dimensional posterior distributions of $\langle f,\psi\rangle_{2}|Y^{\varepsilon}$, these can typically be numerically approximated by efficiently sampling from the explicitly available conjugate posterior distributions.
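The sampling-based approximation of the radius can be sketched as follows in Python. This is a minimal illustration only: the function name `credible_radius` and the numerical values of the posterior mean and standard deviation are hypothetical, and the one-dimensional posterior is simply assumed to be Gaussian with known parameters.

```python
import numpy as np
from statistics import NormalDist

def credible_radius(post_mean, post_sd, a=0.05, n_samples=100_000, seed=0):
    """Return (exact, Monte Carlo) radius of a (1 - a)-credible interval
    centred at the mean of a one-dimensional Gaussian posterior."""
    # Exact radius: (1 - a/2)-quantile of the centred Gaussian posterior.
    exact = NormalDist(0.0, post_sd).inv_cdf(1.0 - a / 2.0)
    # Monte Carlo radius: empirical (1 - a)-quantile of |sample - mean|.
    rng = np.random.default_rng(seed)
    samples = rng.normal(post_mean, post_sd, size=n_samples)
    approx = float(np.quantile(np.abs(samples - post_mean), 1.0 - a))
    return exact, approx

# Hypothetical posterior for <f, psi>: mean 0.3, standard deviation 0.01.
exact, approx = credible_radius(post_mean=0.3, post_sd=0.01)
print(exact, approx)
```

With a large enough sample the two radii should agree closely, which is the sense in which the empirical quantiles can stand in for the analytic ones.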

2.3 Examples of Gaussian priors

In this section, we provide two concrete examples of Gaussian priors to which Theorem 1 applies. For both instances, an implementation of the resulting posterior inference will be presented in Section 3 below, based on two different discretisation strategies. The first example concerns Gaussian series priors.

Example 2 (Gaussian series priors on the Dirichlet-Laplacian eigenbasis).

Let $(\phi_{j},\ j\in\mathbb{N})\subset H^{1}(\mathcal{O})\cap C^{\infty}(\overline{\mathcal{O}})$ be the orthonormal basis of the space $L^{2}(\mathcal{O})$ formed by the eigenfunctions of the (negative) Dirichlet-Laplacian,

$$\begin{split}-\Delta\phi_{j}-\lambda_{j}\phi_{j}&=0,\ \ \text{on}\ \ \mathcal{O}\\ \phi_{j}&=0,\ \ \text{on}\ \ \partial\mathcal{O},\end{split}\qquad j\in\mathbb{N}, \tag{8}$$

with associated eigenvalues $0<\lambda_{1}<\lambda_{2}\leq\lambda_{3}\leq\dots$, satisfying $\lambda_{j}\to\infty$ as $j\to\infty$ according to Weyl’s asymptotics, namely $\lambda_{j}=O(j^{2/d})$ as $j\to\infty$. We refer to Example 6.3 and Section 7.4 in [23] for details. The associated Hilbert scale

$$\mathbb{H}^{\alpha}:=\Bigg\{f\in L^{2}(\mathcal{O}):\|f\|^{2}_{\mathbb{H}^{\alpha}}:=\sum_{j=1}^{\infty}\lambda_{j}^{\alpha}|\langle f,\phi_{j}\rangle_{2}|^{2}<\infty\Bigg\},\qquad\alpha\geq 0,$$

then satisfies $\mathbb{H}^{0}=L^{2}(\mathcal{O})$ (with equality of norms) and the continuous (strict) embedding $\mathbb{H}^{\alpha}\subset H^{\alpha}(\mathcal{O})$ for all $\alpha>0$ [44, p. 472]. In fact, it holds that $\|f\|_{\mathbb{H}^{\alpha}}\simeq\|f\|_{H^{\alpha}}$ for all $f\in\mathbb{H}^{\alpha}$ and $\alpha\geq 0$.

For any $\alpha>d/2$, consider the Gaussian random series

$$F:=\sum_{j=1}^{\infty}\lambda_{j}^{-\alpha/2}F_{j}\phi_{j},\qquad F_{j}\overset{\text{iid}}{\sim}N(0,1), \tag{9}$$

corresponding to the Karhunen-Loève expansions of certain commonly used Gaussian process priors with covariance kernel given by an inverse power of the Laplacian [41, Section 2.4]. By Weyl’s asymptotics, we have

$$\text{E}[\|F\|_{2}^{2}]=\sum_{j=1}^{\infty}\lambda_{j}^{-\alpha}\simeq\sum_{j=1}^{\infty}j^{-2\alpha/d}<\infty,$$

since $2\alpha/d>1$, showing that $F$ takes values almost surely in $L^{2}(\mathcal{O})$. By Lemma I.5 in [15], the law $\Pi(\cdot)$ of $F$ is then seen to define a Gaussian Borel probability measure supported on $L^{2}(\mathcal{O})$. Furthermore, by Theorem I.12 in [15], its RKHS is equal to $\mathbb{H}^{\alpha}$. Noting that, for any compactly supported test function $\psi\in H^{\gamma}(\mathcal{O})$, $\gamma>2+d/2$, the approximation argument in the proof of Theorem 4.1 in [18] can be carried out with minimal modifications with $H^{\alpha}(\mathcal{O})$ replaced by $\mathbb{H}^{\alpha}$, we conclude that Theorem 1 applies by modelling the unknown source function $f$ via the Gaussian series (9).

While not explicitly available for general domains $\mathcal{O}$, we note that the Dirichlet-Laplacian eigenbasis can be numerically computed via efficient finite element methods for elliptic eigenvalue problems, offering a broadly applicable framework for implementation. More details will be provided in Section 3 below.
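To make the series construction (9) concrete, the following Python sketch draws from a truncated version of the prior in a special case where the eigenpairs are explicit. It assumes, purely for illustration, the unit square $(0,1)^2$ in place of a general domain, for which $\phi_{k,l}(x,y)=2\sin(k\pi x)\sin(l\pi y)$ and $\lambda_{k,l}=\pi^2(k^2+l^2)$; the function names and the truncation rule (all eigenvalues below `lam_max`) are hypothetical choices.

```python
import numpy as np

def square_eigenpairs(lam_max):
    """Dirichlet-Laplacian eigenvalues on (0,1)^2 not exceeding lam_max,
    returned as sorted tuples (lambda, k, l)."""
    kmax = int(np.sqrt(lam_max) / np.pi) + 1
    pairs = [(np.pi**2 * (k * k + l * l), k, l)
             for k in range(1, kmax + 1)
             for l in range(1, kmax + 1)
             if np.pi**2 * (k * k + l * l) <= lam_max]
    pairs.sort()
    return pairs

def sample_series_prior(x, y, alpha, lam_max, rng):
    """One draw of the series (9), truncated to eigenvalues <= lam_max."""
    draw = np.zeros_like(x, dtype=float)
    for lam, k, l in square_eigenpairs(lam_max):
        phi = 2.0 * np.sin(k * np.pi * x) * np.sin(l * np.pi * y)
        # Coefficient lambda_j^{-alpha/2} * F_j with F_j ~ N(0, 1).
        draw += lam ** (-alpha / 2.0) * rng.standard_normal() * phi
    return draw

rng = np.random.default_rng(0)
x, y = np.meshgrid(np.linspace(0, 1, 21), np.linspace(0, 1, 21))
draw = sample_series_prior(x, y, alpha=1.5, lam_max=500.0, rng=rng)
pairs = square_eigenpairs(500.0)
```

Each draw vanishes on the boundary of the square, reflecting the Dirichlet condition in (8) inherited from the basis functions.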

A second example of interest involves stationary Gaussian processes defined via a covariance kernel of choice. In particular, the Matérn kernel is widely used in applications [37, Section 4.2].

Example 3 (Matérn process priors).

Let $F=(F(x),\ x\in\mathcal{O})$ be the centred and stationary Gaussian process with Matérn covariance kernel

$$C_{\alpha,\ell}(x,y)=\frac{2^{1-\alpha}}{\Gamma(\alpha)}\left(\frac{|x-y|\sqrt{2\alpha}}{\ell}\right)^{\alpha}B_{\alpha}\left(\frac{|x-y|\sqrt{2\alpha}}{\ell}\right),\qquad x,y\in\mathcal{O}, \tag{10}$$

with regularity parameter $\alpha>d/2$ and length scale $\ell>0$. Above, $\Gamma$ denotes the gamma function and $B_{\alpha}$ is the modified Bessel function of the second kind. The finite-dimensional distributions of $F$ are identified by the relation

$$(F(x_{1}),\dots,F(x_{M}))^{T}\sim N_{M}(0,\mathbf{C}), \tag{11}$$

with $\mathbf{C}:=(C_{\alpha,\ell}(x_{h},x_{m}))_{h,m=1}^{M}\in\mathbb{R}^{M,M}$, holding for any $M\in\mathbb{N}$ and any $x_{1},\dots,x_{M}\in\mathcal{O}$. By Lemma I.4 in [15], a version of $F$ can be identified with sample paths belonging almost surely to the Hölder space $C^{\alpha^{\prime}}(\mathcal{O})\subset L^{2}(\mathcal{O})$ for any $0<\alpha^{\prime}<\alpha-d/2$, and therefore, in view of Lemma I.7 in [15], the law $\Pi(\cdot)$ of such version defines a Gaussian Borel probability measure supported on $L^{2}(\mathcal{O})$.
Moreover, the results in Section 11.4.4 in [15] imply that the RKHS of $F$ equals, with norm equivalence, the set of restrictions to the domain $\mathcal{O}$ of functions in the Sobolev space $H^{\alpha}(\mathbb{R}^{d})$. Since $\mathcal{O}$ is assumed to have smooth boundary, the latter is indeed equal to $H^{\alpha}(\mathcal{O})$. Thus, Theorem 1 applies with $\Pi(\cdot)$ a Matérn process prior with covariance kernel (10).
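The finite-dimensional relation (11) suggests a direct way to simulate the prior on a finite set of points. The Python sketch below is an illustration under stated assumptions: it fixes $\alpha=3/2$, for which the kernel (10) reduces to the closed form $(1+z)e^{-z}$ with $z=\sqrt{3}|x-y|/\ell$, so that no Bessel function routine is needed; the function names, the length scale and the small diagonal jitter added before the Cholesky factorisation are hypothetical choices.

```python
import numpy as np

def matern32(points, ell):
    """Covariance matrix C in (11) for the kernel (10) with alpha = 3/2,
    using the closed form (1 + z) * exp(-z), z = sqrt(3) * |x - y| / ell."""
    diff = points[:, None, :] - points[None, :, :]
    z = np.sqrt(3.0) * np.linalg.norm(diff, axis=-1) / ell
    return (1.0 + z) * np.exp(-z)

def sample_matern(points, ell, rng, jitter=1e-10):
    """One draw of (F(x_1), ..., F(x_M)) ~ N_M(0, C) via Cholesky."""
    C = matern32(points, ell)
    # The jitter is a numerical safeguard only, not part of the model.
    L = np.linalg.cholesky(C + jitter * np.eye(len(points)))
    return L @ rng.standard_normal(len(points))

rng = np.random.default_rng(1)
# 50 illustrative points in [-0.5, 0.5]^2.
pts = rng.uniform(-0.5, 0.5, size=(50, 2))
C = matern32(pts, ell=0.25)
draw = sample_matern(pts, 0.25, rng)
```

For general $\alpha$ the kernel would have to be evaluated through a modified Bessel function of the second kind; the closed form is used here only to keep the sketch dependency-free.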

3 Numerical simulation study

For illustration, we take as working domain the area $\mathcal{O}$ contained inside a rotated ellipse with horizontal semi-axis of unit length, vertical semi-axis of length $3/4$, and rotation angle $\theta=\pi/6$,

$$\{(\cos(t)\cos(\theta)-3/4\sin(t)\sin(\theta),\ 3/4\sin(t)\cos(\theta)+\cos(t)\sin(\theta)),\ t\in[0,2\pi)\}.$$

For an unknown source function $f\in L^{2}(\mathcal{O})$, we assume in practice that we are given $n$ noisy point evaluations $\mathbf{Y}:=(Y_{1},\dots,Y_{n})^{T}\in\mathbb{R}^{n}$ of the solution $G(f)$ to the PDE (2) generated according to the equivalent discrete statistical model (3), for a given deterministic grid of points $x_{1},\dots,x_{n}\in\mathcal{O}$ comprising the nodes of a triangular mesh covering $\mathcal{O}$ (Figure 3, top-left). We then seek to estimate $f$ from data $\mathbf{Y}$.
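As a rough illustration of the geometry, the following Python sketch parametrises the boundary of the rotated ellipse and checks whether candidate design points lie in $\mathcal{O}$. The helper names (`boundary`, `level`, `inside`) are hypothetical and are not taken from the accompanying code, where the mesh is generated by MATLAB PDE Toolbox.

```python
import numpy as np

A, B, THETA = 1.0, 0.75, np.pi / 6  # semi-axes and rotation angle

def boundary(t):
    """Boundary points of the rotated ellipse for parameters t in [0, 2*pi)."""
    x = A * np.cos(t) * np.cos(THETA) - B * np.sin(t) * np.sin(THETA)
    y = B * np.sin(t) * np.cos(THETA) + A * np.cos(t) * np.sin(THETA)
    return x, y

def level(x, y):
    """Quadratic form equal to 1 on the boundary and < 1 inside the domain."""
    # Rotate back by -THETA to the axis-aligned ellipse.
    u = x * np.cos(THETA) + y * np.sin(THETA)
    v = -x * np.sin(THETA) + y * np.cos(THETA)
    return (u / A) ** 2 + (v / B) ** 2

def inside(x, y):
    """True where (x, y) lies in the open domain enclosed by the ellipse."""
    return level(x, y) < 1.0

t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
bx, by = boundary(t)
print(inside(0.0, 0.0), inside(1.0, 0.0))
```

Note that the point $(1,0)$ falls outside the domain because of the rotation, even though the unrotated ellipse would touch it.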

3.1 Posterior inference with Gaussian series priors

3.1.1 Methodology

For the Gaussian series priors defined via the Dirichlet-Laplacian eigenpairs $\{(\phi_{j},\lambda_{j}),\ j\in\mathbb{N}\}$ considered in Example 2, we discretise the parameter space by modelling the unknown source function $f$ as the finite sum

$$f=\sum_{j=1}^{J}f_{j}\phi_{j},\qquad f_{1},\dots,f_{J}\in\mathbb{R},\qquad J\in\mathbb{N}. \tag{12}$$

For any such $f$, the linearity of the forward map $G$ then implies that the discrete observations are given by

$$Y_{i}=\sum_{j=1}^{J}f_{j}G(\phi_{j})(x_{i})+\sigma W_{i}\equiv(\mathbf{G}\mathbf{f})_{i}+\sigma W_{i},$$

where $\mathbf{G}:=[G(\phi_{j})(x_{i}),\ i=1,\dots,n,\ j=1,\dots,J]\in\mathbb{R}^{n,J}$ and $\mathbf{f}:=(f_{1},\dots,f_{J})^{T}\in\mathbb{R}^{J}$, whereby the inverse regression model (3) can be written in matrix notation as

$$\mathbf{Y}=\mathbf{G}\mathbf{f}+\sigma\mathbf{W} \tag{13}$$

with $\mathbf{W}:=(W_{1},\dots,W_{n})^{T}\sim N_{n}(0,\mathbf{I}_{n})$. Thus, for any given $\mathbf{f}\in\mathbb{R}^{J}$, $\mathbf{Y}|\mathbf{f}\sim N_{n}(\mathbf{G}\mathbf{f},\sigma^{2}\mathbf{I}_{n})$.
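The matrix form (13) can be illustrated on a toy problem where the forward map is explicit. The Python sketch below makes several assumptions that are not part of the study: a one-dimensional domain $(0,1)$, a constant coefficient $c\equiv 1$, and a PDE of the form $u''=f$ with zero boundary values (an assumed sign convention), so that $\phi_j(x)=\sqrt{2}\sin(j\pi x)$, $\lambda_j=(j\pi)^2$ and $G(\phi_j)=-\phi_j/\lambda_j$. The function names are hypothetical.

```python
import numpy as np

def forward_matrix(x, J):
    """G[i, j] = G(phi_j)(x_i) for the toy problem u'' = f on (0, 1),
    u(0) = u(1) = 0, together with the eigenvalues lambda_j."""
    j = np.arange(1, J + 1)
    lam = (j * np.pi) ** 2
    phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, j))
    return -phi / lam, lam

def simulate(f_coef, x, sigma, rng):
    """Generate Y = G f + sigma * W as in (13), with W ~ N_n(0, I_n)."""
    G, _ = forward_matrix(x, len(f_coef))
    Y = G @ f_coef + sigma * rng.standard_normal(len(x))
    return Y, G

rng = np.random.default_rng(0)
x = np.linspace(0.05, 0.95, 30)        # illustrative design points
f_coef = np.array([1.0, -0.5, 0.25])   # illustrative coefficients f_1..f_3
Y, G = simulate(f_coef, x, sigma=1e-3, rng=rng)
```

In the actual study the columns of $\mathbf{G}$ are not available in closed form and are obtained numerically, as described later in this section.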

Figure 3: Top-left: triangular mesh on a rotated elliptically-shaped domain with $n=4500$ nodes. Top-right, bottom-left and bottom-right, respectively: numerical approximations of the first Dirichlet-Laplacian eigenfunction $\phi_{1}$, of the second eigenfunction $\phi_{2}$ and of the eigenfunction $\phi_{84}$ associated to the largest eigenvalue $\lambda_{84}=493.3725$ found in the range $[0,500]$.
Figure 4: Numerical approximations of the Dirichlet-Laplacian eigenvalues in the range $[0,500]$.

In practice, outside certain special cases for the domain $\mathcal{O}$ (such as square or circular ones), the Dirichlet-Laplacian eigenpairs are not explicitly available. For general domains, we then resort to numerical methods to solve the elliptic eigenvalue problem (8). In particular, we used MATLAB PDE Toolbox, which allows the user to specify a search range $[0,\lambda_{\text{max}}]$, $\lambda_{\text{max}}>0$, for the eigenvalues, and then returns numerical approximations, obtained via finite element methods, of the eigenvalues in the prescribed interval and of the corresponding eigenfunctions. For the considered rotated elliptically-shaped domain, Figure 3 shows the first, the second and the last eigenfunction returned by the elliptic PDE solver. The range was set to $[0,\lambda_{\text{max}}]=[0,500]$, for which $J=84$ eigenvalues were found. The numerical approximations to the eigenvalues are displayed in Figure 4. They grow linearly in $j$, as expected from Weyl’s asymptotics $\lambda_{j}=O(j^{2/d})$ for two-dimensional domains ($d=2$).
The computation of the matrix $\mathbf{G}$ in the discretised observation model (13) is performed by numerically solving, again using MATLAB PDE Toolbox, the elliptic PDE (2) with $f$ replaced by $\phi_{j}$ (or, more precisely, by the finite element approximation of $\phi_{j}$) and then evaluating $\mathbf{G}_{ij}:=G(\phi_{j})(x_{i})$ for all $i=1,\dots,n$ and $j=1,\dots,J$.
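The linear growth of the eigenvalues can be checked against the leading-order form of Weyl's law in dimension two, $\lambda_j\approx 4\pi j/|\mathcal{O}|$, where $|\mathcal{O}|=\pi\cdot 1\cdot 3/4$ is the area of the ellipse. The short Python sketch below evaluates this approximation; the function name is hypothetical, and the comparison is only indicative, since the leading-order formula ignores boundary effects and the reported eigenvalues are themselves finite element approximations.

```python
import numpy as np

def weyl_eigenvalue(j, area):
    """Leading-order Weyl approximation lambda_j ~ 4 * pi * j / |O| (d = 2)."""
    return 4.0 * np.pi * j / area

area = np.pi * 1.0 * 0.75           # ellipse with semi-axes 1 and 3/4
pred = weyl_eigenvalue(84, area)    # leading-order value for j = 84
reported = 493.3725                 # lambda_84 as reported in Figure 3
print(pred, pred / reported)
```

The leading-order value is of the same order as the reported $\lambda_{84}$ and somewhat below it, which is plausibly attributable to the boundary correction that the one-term formula omits.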

Under the discretisation (12), the Gaussian series prior described in Example 2 is approximately implemented by truncating the random series (9) at level $J$, and then assigning to the coefficients $f_{1},\dots,f_{J}$ in (12) independent Gaussian priors $N(0,\lambda_{j}^{-\alpha})$, $j=1,\dots,J$. In the discretised observation model (13), this corresponds to assigning to the vector $\mathbf{f}$ the $J$-dimensional Gaussian prior with diagonal covariance matrix

$$\mathbf{f}\sim N_{J}(0,\mathbf{\Lambda}),\qquad\mathbf{\Lambda}:=\text{diag}(\lambda_{1}^{-\alpha},\dots,\lambda_{J}^{-\alpha})\in\mathbb{R}^{J,J}. \tag{14}$$

Thus, recalling that, according to (13), $\mathbf{Y}|\mathbf{f}\sim N_{n}(\mathbf{G}\mathbf{f},\sigma^{2}\mathbf{I}_{n})$, a standard conjugate computation for multivariate models with Gaussian likelihood and prior yields the posterior distribution

$$\mathbf{f}|\mathbf{Y}\sim N_{J}(\bar{\mathbf{f}}_{n},\mathbf{\Lambda}_{n}), \tag{15}$$

where

$$\bar{\mathbf{f}}_{n}:=\frac{1}{\sigma^{2}}\mathbf{\Lambda}_{n}\mathbf{G}^{T}\mathbf{Y};\qquad\mathbf{\Lambda}_{n}:=(\sigma^{-2}\mathbf{G}^{T}\mathbf{G}+\mathbf{\Lambda}^{-1})^{-1}. \tag{16}$$

Using the conjugate formulae (15) and (16), it is straightforward to compute posterior mean estimates and to draw posterior samples. In turn, this allows for the efficient implementation of credible sets centred around the posterior mean, replacing the theoretical posterior quantiles (for example, the ones used in the definition of the credible intervals (6)) with the empirical quantiles of a sufficiently large sample from the posterior distribution.
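A minimal Python sketch of these computations is given below. It is not the accompanying MATLAB implementation: the function names are hypothetical, and the matrix `G` and the values `lam` are random or arbitrary stand-ins for the finite element quantities, used only so that the example runs on its own.

```python
import numpy as np

def gaussian_posterior(G, Y, sigma, prior_var):
    """Posterior mean and covariance from (16) for the prior
    N_J(0, diag(prior_var)) and the likelihood N_n(G f, sigma^2 I_n)."""
    precision = G.T @ G / sigma**2 + np.diag(1.0 / prior_var)
    cov = np.linalg.inv(precision)
    cov = 0.5 * (cov + cov.T)  # symmetrise against round-off
    mean = cov @ (G.T @ Y) / sigma**2
    return mean, cov

def credible_radius(psi_coef, mean, cov, a, rng, n_samples=20_000):
    """Empirical (1 - a)-quantile of |<f - mean, psi>| over posterior draws,
    with psi represented by its coefficient vector psi_coef."""
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    return float(np.quantile(np.abs((samples - mean) @ psi_coef), 1.0 - a))

rng = np.random.default_rng(0)
n, J, sigma, alpha = 200, 10, 0.1, 1.5
G = rng.standard_normal((n, J))           # stand-in for the matrix in (13)
lam = np.arange(1, J + 1, dtype=float)    # stand-in eigenvalues
f_true = rng.standard_normal(J) * lam ** (-alpha / 2)
Y = G @ f_true + sigma * rng.standard_normal(n)

mean, cov = gaussian_posterior(G, Y, sigma, lam ** (-alpha))
radius = credible_radius(np.eye(J)[0], mean, cov, a=0.05, rng=rng)
```

Here the functional of interest is the first coefficient, so the empirical radius can be compared directly with the Gaussian value computed from the first diagonal entry of the posterior covariance.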

3.1.2 Experiments

Throughout the numerical simulation study, the true source function (shown in Figure 1, left) was taken to be

f0(x,y)=e(5x2.5)2(5y)2+e(7.5x)2(2.5y)2+e(5x2.5)2(5y)2,(x,y)𝒪.formulae-sequencesubscript𝑓0𝑥𝑦superscript𝑒superscript5𝑥2.52superscript5𝑦2superscript𝑒superscript7.5𝑥2superscript2.5𝑦2superscript𝑒superscript5𝑥2.52superscript5𝑦2𝑥𝑦𝒪f_{0}(x,y)=e^{-(5x-2.5)^{2}-(5y)^{2}}+e^{-(7.5x)^{2}-(2.5y)^{2}}+e^{-(5x-2.5)^% {2}-(5y)^{2}},\qquad(x,y)\in\mathcal{O}.italic_f start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_x , italic_y ) = italic_e start_POSTSUPERSCRIPT - ( 5 italic_x - 2.5 ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - ( 5 italic_y ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT + italic_e start_POSTSUPERSCRIPT - ( 7.5 italic_x ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - ( 2.5 italic_y ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT + italic_e start_POSTSUPERSCRIPT - ( 5 italic_x - 2.5 ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - ( 5 italic_y ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT , ( italic_x , italic_y ) ∈ caligraphic_O .

Figure 1 (right) shows $n=4500$ discrete noisy observations, over the nodes of the triangular mesh depicted in Figure 3 (top-left), of the corresponding PDE solution $G(f_{0})$ arising as in the inverse regression model (3) with noise standard deviation $\sigma=0.0005$ (with corresponding signal-to-noise ratio $\|G(f_{0})\|_{2}/\sigma=37.55$). The diffusion coefficient was taken to be $c(x,y):=2+5e^{-(5x-2)^{2}-(5y-2)^{2}}+5e^{-(5x+2)^{2}-(5y+2)^{2}}$, $(x,y)\in\mathcal{O}$. The PDE solution $G(f_{0})$ was calculated using MATLAB PDE Toolbox, which also contains the routine to create the triangular mesh.

Figure 5: Left to right, top to bottom: posterior mean estimates $\bar{f}_{n}$ of the source function $f$ for increasing sample sizes $n=100,250,500,2000$.

The posterior mean estimate $\bar{f}_{n}:=\sum_{j=1}^{J}\bar{\mathbf{f}}_{n,j}\phi_{j}$ shown in Figure 2 (left) was obtained by computing the vector of coefficients $\bar{\mathbf{f}}_{n}$ according to the conjugate formula in (16). A diagonal Gaussian prior as in (14) was used, with regularity parameter $\alpha=3/4$. The parameter space was discretised using $J=84$ basis functions. The obtained $L^{2}$-estimation error was $\|\bar{f}_{n}-f_{0}\|_{2}=0.060077$. For comparison, $\|f_{0}\|_{2}=0.4764$ (with corresponding relative error $\|\bar{f}_{n}-f_{0}\|_{2}/\|f_{0}\|_{2}=12.5\%$), while the $L^{2}$-approximation error incurred by projecting $f_{0}$ onto the linear space spanned by the employed set of basis functions (furnishing a lower bound for the $L^{2}$-estimation error) is $0.0486$. The $2500$ posterior draws whose cross-sections along the $x$-axis are shown in Figure 2 (right) were sampled from the conjugate Gaussian posterior distribution in (15).

Figure 5 provides an illustration of asymptotic convergence in the infinitely-informative data limit, showing the posterior mean estimates obtained for increasing sample sizes. The (decreasing) $L^{2}$-estimation errors for sample sizes ranging between $n=50$ and $n=4500$ are reported in Table 1. Across the experiments, the same discretisation with $J=84$ basis functions and the same diagonal Gaussian prior with regularity $\alpha=3/4$ were used.

Table 1: $L^{2}$-estimation errors achieved by the posterior mean estimator $\bar{f}_{n}$ for increasing sample sizes.

| $n$ | 50 | 100 | 250 | 500 | 750 | 1000 | 2000 | 3000 | 4500 |
|---|---|---|---|---|---|---|---|---|---|
| $\lVert\bar{f}_{n}-f_{0}\rVert_{2}$ | 0.22 | 0.18 | 0.13 | 0.099 | 0.088 | 0.078 | 0.076 | 0.069 | 0.060 |
| $\lVert\bar{f}_{n}-f_{0}\rVert_{2}/\lVert f_{0}\rVert_{2}$ | 45.8% | 37.5% | 27.1% | 20.6% | 18.3% | 16.3% | 15.8% | 14.3% | 12.5% |

We next consider semiparametric inference for one-dimensional linear functionals $\langle f,\psi\rangle_{2}$, $\psi\in L^{2}(\mathcal{O})$, and provide a numerical illustration of the asymptotic results presented in Section 2.2. In particular, we focus on test functions $\psi=\phi_{j}$, $j\in\{1,\dots,J\}$, belonging to the Dirichlet-Laplacian eigenbasis, for which, under the discretisation (12), $\langle f,\phi_{j}\rangle_{2}=f_{j}$. Accordingly, for $\bar{\mathbf{f}}_{n}$ and $\mathbf{\Lambda}_{n}$ as in (16), the plug-in posterior estimators are given by $\langle\bar{f}_{n},\phi_{j}\rangle_{2}=\bar{\mathbf{f}}_{n,j}$, with corresponding posterior variances $\mathbf{\Lambda}_{n,jj}$. Thus, the $95\%$-credible interval for $\langle f,\phi_{j}\rangle_{2}$ is given by $\bar{\mathbf{f}}_{n,j}\pm 1.96\sqrt{\mathbf{\Lambda}_{n,jj}}$.

In order to compute the asymptotic variances $\|\nabla\cdot(c\nabla\psi)\|_{2}^{2}$ appearing on the right-hand side of (4) and (5), we obtain the singular value decomposition (SVD) of the forward operator $G$, corresponding to finding the eigenfunctions $(\xi_{k},\ k\in\mathbb{N})\subset L^{2}(\mathcal{O})$ and eigenvalues $(\eta_{k},\ k\in\mathbb{N})\subset[0,\infty)$ solving the problem

$$\begin{split}-\nabla\cdot(c\nabla\xi)-\eta\xi&=0,\quad\textnormal{on}\ \mathcal{O}\\ \xi&=0,\quad\textnormal{on}\ \partial\mathcal{O},\end{split} \tag{17}$$

whereupon there follow the identities

$$G(f)=\sum_{k=1}^{\infty}\eta_{k}^{-1}\langle f,\xi_{k}\rangle_{2}\xi_{k},$$

and

$$\nabla\cdot(c\nabla u)=\sum_{k=1}^{\infty}\eta_{k}\langle u,\xi_{k}\rangle_{2}\xi_{k};\qquad\|\nabla\cdot(c\nabla u)\|_{2}^{2}=\sum_{k=1}^{\infty}\eta_{k}^{2}\langle u,\xi_{k}\rangle_{2}^{2}.$$

Figure 6: Left to right, top to bottom: histograms relative to $1000$ realisations of the plug-in posterior mean estimators $\langle\bar{f}_{n},\phi_{j}\rangle_{2}$, for $j=2,4,8,16$. The vertical red lines identify the true parameters $\langle f_{0},\phi_{j}\rangle_{2}$. The vertical green lines identify the predicted asymptotic quantiles $\langle f_{0},\phi_{j}\rangle_{2}\pm 1.96\|\nabla\cdot(c\nabla\phi_{j})\|_{2}\,\sigma/\sqrt{n}$ (recalling the asymptotic regime $\varepsilon^{2}\simeq\sigma^{2}/n$, under which the asymptotic standard deviation is $\|\nabla\cdot(c\nabla\phi_{j})\|_{2}\,\sigma/\sqrt{n}$).

Table 2: Observed coverages for increasing sample sizes of the $95\%$-credible intervals for the linear functionals $\langle f,\phi_{j}\rangle_{2}$, with $j=2,4,8,16$.

| $n$ | 50 | 100 | 250 | 500 | 750 | 1000 |
|---|---|---|---|---|---|---|
| $\phi_{2}$ | 0.885 | 0.921 | 0.954 | 0.969 | 0.947 | 0.962 |
| $\phi_{4}$ | 0.904 | 0.936 | 0.943 | 0.96 | 0.953 | 0.958 |
| $\phi_{8}$ | 0.92 | 0.934 | 0.951 | 0.944 | 0.963 | 0.952 |
| $\phi_{16}$ | 0.949 | 0.922 | 0.92 | 0.946 | 0.945 | 0.96 |

In practice, we tackle the eigenvalue problem (17) via finite element methods exactly as outlined in Section 3.1.1 for the computation of the Dirichlet-Laplacian eigenbasis, obtaining numerical approximations of the eigenpairs. We note that, while used here as a convenient computational device to evaluate the asymptotic variances, knowledge of the SVD of the forward operator $G$ is not assumed for the theoretical results of Section 2.2, nor is it required for the specification of the two classes of Gaussian priors introduced in Examples 2 and 3 respectively. The theory and methodology investigated in the present article are indeed generally applicable to inverse problems where the SVD might be challenging or infeasible to compute, or to settings where the properties of the associated eigenpairs might be unknown.
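To make the spectral formula for the asymptotic variance concrete, the following Python sketch treats a one-dimensional analogue of (17) on the interval $(0,1)$ with a finite difference discretisation. This is an illustrative simplification and not the two-dimensional finite element computation used in the paper: the interval domain, the constant coefficient `c`, the grid size and the test function $\psi(x)=\sqrt{2}\sin(\pi x)$ are all assumptions made for the example.

```python
import numpy as np

N = 199                         # number of interior grid points on (0, 1)
h = 1.0 / (N + 1)
x = np.linspace(h, 1 - h, N)    # interior nodes; Dirichlet values are zero at 0 and 1

def c(t):
    return np.ones_like(t)      # constant diffusion coefficient (assumption)

# Symmetric finite difference matrix for u -> -(c u')' with c evaluated at cell midpoints
c_half = c(np.linspace(h / 2, 1 - h / 2, N + 1))
main = (c_half[:-1] + c_half[1:]) / h**2
off = -c_half[1:-1] / h**2
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

# Discrete eigenpairs; columns rescaled to have unit (discrete) L2 norm
eta, V = np.linalg.eigh(A)
xi = V / np.sqrt(h)

# Asymptotic variance sum_k eta_k^2 <psi, xi_k>^2 for the chosen test function
psi = np.sqrt(2) * np.sin(np.pi * x)
coeffs = h * (xi.T @ psi)       # quadrature approximation of the L2 inner products
asymptotic_variance = np.sum(eta**2 * coeffs**2)
```

With $c\equiv 1$ the exact value is $\|\psi''\|_{2}^{2}=\pi^{4}\approx 97.4$, which the discrete sum reproduces up to discretisation error; replacing `c` by a non-constant function only changes the matrix `A`.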

Figure 6 shows the (approximate) distributions of the plug-in posterior mean estimators $\langle\bar{f}_{n},\psi\rangle_{2}$ for four representative test functions $\psi=\phi_{2},\phi_{4},\phi_{8},\phi_{16}$. The plots present the histograms relative to $1000$ realisations of the estimators, obtained by drawing $1000$ independent collections of observations from the inverse regression model (3). For each experiment, a sample of size $n=1000$ was drawn, with noise standard deviation $\sigma=0.0005$. As expected from the central limit theorem (5), the distributions of the plug-in estimators $\langle\bar{f}_{n},\psi\rangle_{2}$ exhibit a normal shape, are approximately centred around the true parameter $\langle f_{0},\psi\rangle_{2}$, and their spread is mostly captured by the asymptotic variance $\|\nabla\cdot(c\nabla\psi)\|^{2}_{2}$.

Finally, Table 2 reports the coverage, for increasing sample sizes, of the $95\%$-credible intervals (6) for the same linear functionals $\langle f,\phi_{j}\rangle_{2}$, with $j=2,4,8,16$, considered in the previous set of experiments. The results were obtained by drawing $1000$ independent collections, of size $n=1000$, of observations from the inverse regression model (3), with noise standard deviation $\sigma=0.0005$. For each random sample, a realisation of the $95\%$-credible intervals $\bar{\mathbf{f}}_{n,j}\pm 1.96\sqrt{\mathbf{\Lambda}_{n,jj}}$, with $\mathbf{\Lambda}_{n}$ the posterior covariance matrix in (16), was obtained, and the final coverage scores were computed as the fraction of times in which the true parameters $\langle f_{0},\phi_{j}\rangle_{2}$ were contained in the obtained credible intervals. As expected from the theoretical convergence result in (7), the observed coverages stabilise, as the sample size increases, around the prescribed credibility level $95\%$.
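The coverage experiment can be mimicked on a toy problem with the Python sketch below. It is only a schematic analogue of the procedure described above: the design matrix `G`, the weakly informative prior, the vector `f0` and the numbers of observations and repetitions are hypothetical choices, and no PDE is solved.

```python
import numpy as np

rng = np.random.default_rng(1)
n, J, sigma, reps = 200, 5, 0.5, 500
G = rng.normal(size=(n, J))                   # stand-in for the discretised forward map
prior_cov = 100.0 * np.eye(J)                 # weakly informative Gaussian prior (assumption)
f0 = np.array([1.0, -0.5, 0.25, 0.0, 0.1])    # illustrative true coefficients

# The posterior covariance does not depend on the data, so it is computed once
post_cov = np.linalg.inv(G.T @ G / sigma**2 + np.linalg.inv(prior_cov))
half_width = 1.96 * np.sqrt(np.diag(post_cov))

hits = np.zeros(J)
for _ in range(reps):
    # Fresh data set, posterior mean, and check whether each interval contains the truth
    Y = G @ f0 + sigma * rng.normal(size=n)
    post_mean = post_cov @ G.T @ Y / sigma**2
    hits += np.abs(post_mean - f0) <= half_width

coverage = hits / reps                        # observed coverage per coefficient
```

In this toy setting the prior is nearly flat relative to the data, so the observed coverages are expected to lie close to the nominal level; with a more informative prior, the bias induced by shrinkage can move them away from it at small sample sizes.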

3.2 Posterior inference with the Matérn process prior

Next, we consider the Matérn process priors introduced in Example 3. We discretise the parameter space by assuming that $f$ is given by the finite sum

$$f=\sum_{m=1}^{M}f_{m}\varphi_{m},\qquad f_{1},\dots,f_{M}\in\mathbb{R},\qquad M\in\mathbb{N}, \tag{18}$$

where $\varphi_{1},\dots,\varphi_{M}$ are piecewise linear functions on the nodes $z_{1},\dots,z_{M}\in\mathcal{O}$ of a deterministic triangular mesh, uniquely identified by the property $\varphi_{m}(z_{m'})=1_{\{m=m'\}}$; see Figure 7. Accordingly, $f$ in (18) satisfies $f(z_{m})=f_{m}$, and for any $x\in\mathcal{O}$ the value $f(x)$ is obtained by linearly interpolating the pairs $\{(z_{m},f_{m}),\ m=1,\dots,M\}$.
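The equivalence between the expansion (18) and linear interpolation of the nodal values can be checked in a one-dimensional analogue, sketched below in Python. The uniform grid on $[0,1]$, the nodal values and the helper `hat` are assumptions introduced for illustration; on a two-dimensional triangular mesh the hat functions are defined element by element instead.

```python
import numpy as np

nodes = np.linspace(0.0, 1.0, 6)       # uniform 1D mesh z_1, ..., z_M (assumption)
h = nodes[1] - nodes[0]

def hat(m, x):
    """Piecewise linear function equal to 1 at nodes[m] and 0 at every other node."""
    return np.clip(1.0 - np.abs(x - nodes[m]) / h, 0.0, None)

coeffs = np.sin(2 * np.pi * nodes)     # illustrative nodal values f_m
x = np.linspace(0.0, 1.0, 101)

# Finite sum f = sum_m f_m * phi_m evaluated on a fine grid
f_x = sum(coeffs[m] * hat(m, x) for m in range(len(nodes)))

# The same values obtained by linear interpolation of the pairs (z_m, f_m)
f_interp = np.interp(x, nodes, coeffs)
```

Both arrays agree up to floating point error, which is the one-dimensional counterpart of the statement that $f(x)$ is obtained by linearly interpolating the pairs $(z_{m},f_{m})$.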

Figure 7: Left to right: piecewise linear interpolation functions $\varphi_{75}$, $\varphi_{100}$ and $\varphi_{250}$ for a triangular mesh with $M=1169$ nodes.

Under the discretisation (18), the inverse regression model (3) can be written in matrix notation exactly as in (13), now with

$$\mathbf{G}:=[G(\varphi_{m})(x_{i}),\ i=1,\dots,n,\ m=1,\dots,M]\in\mathbb{R}^{n,M},$$

and with

$$\mathbf{f}:=(f_{1},\dots,f_{M})^{T}\in\mathbb{R}^{M}.$$

Thus, again, $\mathbf{Y}|\mathbf{f}\sim N_{n}(\mathbf{G}\mathbf{f},\sigma^{2}\mathbf{I}_{n})$. Similarly to Section 3.1.1, the numerical computation of the matrix $\mathbf{G}$ can be carried out with finite element methods for elliptic PDEs.

Recalling that, under the discretisation (18), $f_{m}=f(z_{m})$ for $m=1,\dots,M$, and using the finite-dimensional distributions property (11), assigning to $f$ a Matérn process prior with covariance $C_{\alpha,\ell}$ as in (10) corresponds to assigning to the vector $\mathbf{f}$ the $M$-dimensional Gaussian prior with covariance matrix

$$\mathbf{f}\sim N_{M}(0,\mathbf{C}),\qquad\mathbf{C}:=[C_{\alpha,\ell}(z_{h},z_{m})]_{h,m=1}^{M}\in\mathbb{R}^{M,M}.$$

The same conjugate computation as the one outlined in Section 3.1.1 can then be carried out, leading to the Gaussian posterior distribution $\mathbf{f}|\mathbf{Y}\sim N_{M}(\bar{\mathbf{f}}_{n},\mathbf{C}_{n})$, with posterior mean and covariance matrix respectively given by

$$\bar{\mathbf{f}}_{n}:=\frac{1}{\sigma^{2}}\mathbf{C}_{n}\mathbf{G}^{T}\mathbf{Y};\qquad\mathbf{C}_{n}:=(\sigma^{-2}\mathbf{G}^{T}\mathbf{G}+\mathbf{C}^{-1})^{-1}.$$

Using the above conjugate formulae, posterior inference for the source function $f$ based on the Matérn process prior can efficiently be implemented.
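A minimal Python sketch of this computation follows. Several ingredients are assumptions made for illustration rather than the choices of the paper: the kernel is the closed-form Matérn covariance with smoothness $3/2$, which need not coincide with the parametrisation of $C_{\alpha,\ell}$ in (10); the nodes are random points in $[-1,1]^{2}$ instead of mesh nodes; and `G` is a random matrix. The sketch also evaluates the algebraically equivalent form of the posterior that avoids inverting $\mathbf{C}$, which may be preferable when the prior covariance matrix is ill-conditioned.

```python
import numpy as np

def matern32(z, ell):
    """Matérn covariance matrix with smoothness 3/2 at the rows of z (illustrative kernel)."""
    r = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    s = np.sqrt(3.0) * r / ell
    return (1.0 + s) * np.exp(-s)

rng = np.random.default_rng(2)
M, n, sigma, ell = 30, 40, 0.1, 0.25
z = rng.uniform(-1.0, 1.0, size=(M, 2))   # stand-in for the mesh nodes
C = matern32(z, ell)
G = rng.normal(size=(n, M))               # stand-in for the discretised forward map
Y = G @ rng.multivariate_normal(np.zeros(M), C) + sigma * rng.normal(size=n)

# Posterior via the precision form, as in the conjugate formulae above
C_n = np.linalg.inv(G.T @ G / sigma**2 + np.linalg.inv(C))
mean_precision = C_n @ G.T @ Y / sigma**2

# Equivalent form that only requires solving n x n systems and never inverts C
S = G @ C @ G.T + sigma**2 * np.eye(n)
mean_kernel = C @ G.T @ np.linalg.solve(S, Y)
cov_kernel = C - C @ G.T @ np.linalg.solve(S, G @ C)
```

The two forms agree by the matrix inversion lemma; which one is cheaper depends on whether $n$ or $M$ is larger.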

Figure 8: The posterior mean estimate of the source function $f$ based on a Matérn process prior with parameters $\alpha=10$, $\ell=0.25$.

Table 3: $L^{2}$-estimation errors achieved by the posterior mean estimator $\bar{f}_{n}$ arising from the Matérn process prior for increasing sample sizes.

| $n$ | 50 | 100 | 250 | 500 | 750 | 1000 | 2000 | 3000 | 4500 |
|---|---|---|---|---|---|---|---|---|---|
| $\lVert\bar{f}_{n}-f_{0}\rVert_{2}$ | 0.30 | 0.30 | 0.18 | 0.12 | 0.13 | 0.10 | 0.086 | 0.076 | 0.067 |
| $\lVert\bar{f}_{n}-f_{0}\rVert_{2}/\lVert f_{0}\rVert_{2}$ | 62.5% | 62.5% | 37.5% | 25% | 27.1% | 20.8% | 17.9% | 15.6% | 13.9% |

For the ground truth displayed in Figure 1 (left), Table 3 reports the $L^{2}$-estimation errors attained by the posterior mean estimate $\bar{f}_{n}=\sum_{m=1}^{M}\bar{\mathbf{f}}_{n,m}\varphi_{m}$ based on an increasing number of observations from the inverse regression model (3), with noise standard deviation $\sigma=0.0005$ (with corresponding signal-to-noise ratio $\|G(f_{0})\|_{2}/\sigma=37.55$). Across the experiments, the parameter space was discretised using a triangular mesh with $M=1169$ nodes. The prior regularity parameter for the Matérn process prior was set to $\alpha=10$, and the length-scale parameter to $\ell=0.25$. Figure 8 shows the posterior mean estimate resulting from $n=4500$ observations.

The results are relative to the same collection of synthetic data sets with increasing sample size employed in Section 3.1.2, allowing a direct comparison with the results obtained with the Gaussian series priors considered therein. Overall, the achieved $L^{2}$-estimation errors are comparable in magnitude for each sample size, although the performance of the Gaussian series priors was consistently slightly better. It is plausible that this small discrepancy is caused by finite sample effects, prior tuning and the various numerical approximations.

3.3 Further numerical experiments

3.3.1 Sensitivity to the noise variance

Table 4: $L^{2}$-estimation errors achieved by the posterior mean estimator $\bar{f}_{n}$ arising from the Gaussian series prior for decreasing noise standard deviation.

| $\sigma$ | 0.01 | 0.005 | 0.0025 | 0.001 | 0.0005 | 0.0001 |
|---|---|---|---|---|---|---|
| $\lVert G(f_{0})\rVert_{2}/\sigma$ | 1.88 | 3.75 | 7.51 | 18.77 | 37.55 | 187.74 |
| $\lVert\bar{f}_{n}-f_{0}\rVert_{2}$ | 0.21 | 0.16 | 0.14 | 0.078 | 0.060 | 0.049 |
| $\lVert\bar{f}_{n}-f_{0}\rVert_{2}/\lVert f_{0}\rVert_{2}$ | 43.75% | 33.33% | 29.17% | 16.25% | 12.5% | 10.20% |

For the empirical results presented in Sections 3.1 and 3.2, a fixed noise standard deviation $\sigma=0.0005$ in the inverse regression model (3) was used (with corresponding signal-to-noise ratio $\|G(f_{0})\|_{2}/\sigma=37.55$). Here, we provide a brief investigation of the sensitivity of the considered methodology to the value of $\sigma$, performing a set of experiments with decreasing standard deviation from $\sigma=0.01$ (for which $\|G(f_{0})\|_{2}/\sigma=1.8774$) to $\sigma=0.0001$ (for which $\|G(f_{0})\|_{2}/\sigma=187.74$), based on the same domain and ground truth used previously. Across the experiments, the sample size was kept fixed at $n=4500$.

For concreteness, we focus on the Gaussian series priors from Section 3.1, with the same prior tuning employed therein; similar results may be obtained with the Matérn process priors. The $L^{2}$-estimation errors associated with the resulting posterior mean estimates are shown in Table 4. Unsurprisingly, these were observed to decrease monotonically as the signal-to-noise ratio increased. In particular, at the lowest value $\sigma=0.0001$, the $L^{2}$-estimation error may be seen to approach the $L^{2}$-approximation error resulting from projecting $f_{0}$ onto the employed basis, which is equal to $0.0486$.

3.3.2 Inference with unknown noise variance

We conclude the simulation study by considering the important practical scenario where the noise standard deviation $\sigma$ is itself unknown and needs to be estimated from the data. Given observations $\{(Y_i,X_i)\}_{i=1}^n$ from the inverse regression model (3), we undertake the simple ‘empirical Bayes’ approach of obtaining a preliminary estimate $\hat{\sigma}_n$ of $\sigma$, and then carry over the methodology laid out in Sections 3.1 and 3.2 with $\sigma$ replaced by $\hat{\sigma}_n$. Alternatively, we note that a joint Bayesian model for $f$ and $\sigma$ in (3) could be considered by endowing $\sigma$ with a prior distribution. For example, an independent inverse-gamma prior on the variance $\sigma^2$ would lead (conditionally given $f$) to a conjugate posterior distribution, whereupon joint posterior samples for the pair $(f,\sigma)$ could readily be obtained via a Gibbs sampler, alternating draws from the Gaussian posterior distribution of $f \,|\, \{(Y_i,X_i)\}_{i=1}^n,\sigma$ and draws from the inverse-gamma posterior distribution of $\sigma^2 \,|\, \{(Y_i,X_i)\}_{i=1}^n,f$. For brevity, we will not pursue this approach further here.
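To illustrate the alternating scheme just described, the following is a minimal sketch of such a Gibbs sampler, not the implementation from the accompanying repository. It assumes a finite-dimensional stand-in for model (3), $Y = \Phi\theta + \sigma\varepsilon$, where the matrix `Phi` (hypothetical) holds the forward-mapped basis functions evaluated at the design points, $\theta$ has independent centred Gaussian priors with variances `prior_var`, and the inverse-gamma prior with hypothetical hyperparameters `a`, `b` is placed on the variance $\sigma^2$, for which the conditional conjugacy holds. The one-dimensional sine basis and all numerical values in the toy example are illustrative assumptions.

```python
import numpy as np

def gibbs_f_sigma(Phi, y, prior_var, a, b, n_iter=2000, seed=0):
    """Gibbs sampler for y = Phi @ theta + sigma * noise, with
    theta_j ~ N(0, prior_var[j]) and sigma^2 ~ InvGamma(a, b) (assumed model)."""
    rng = np.random.default_rng(seed)
    n, J = Phi.shape
    PtP, Pty = Phi.T @ Phi, Phi.T @ y
    sigma2 = np.var(y)  # crude initial value
    thetas = np.empty((n_iter, J))
    sigma2s = np.empty(n_iter)
    for t in range(n_iter):
        # theta | y, sigma^2 is Gaussian with the precision matrix below
        precision = PtP / sigma2 + np.diag(1.0 / prior_var)
        L = np.linalg.cholesky(precision)
        mean = np.linalg.solve(precision, Pty / sigma2)
        theta = mean + np.linalg.solve(L.T, rng.standard_normal(J))
        # sigma^2 | y, theta is InvGamma(a + n/2, b + ||residual||^2 / 2)
        resid = y - Phi @ theta
        shape = a + 0.5 * n
        rate = b + 0.5 * resid @ resid
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)
        thetas[t], sigma2s[t] = theta, sigma2
    return thetas, sigma2s

# Toy example (illustrative values only): 1D sine basis with smoothing weights
rng = np.random.default_rng(2)
n, J, sigma_true = 400, 5, 0.01
x = rng.uniform(0.0, 1.0, n)
j = np.arange(1, J + 1)
Phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, j)) / (np.pi * j) ** 2
theta0 = 1.0 / j**2
y = Phi @ theta0 + sigma_true * rng.standard_normal(n)

thetas, sigma2s = gibbs_f_sigma(Phi, y, prior_var=1.0 / j**2, a=1.0, b=1e-4)
sigma_post_mean = np.sqrt(sigma2s[500:]).mean()  # discard burn-in draws
print("posterior mean of sigma:", sigma_post_mean)
```

Note that the scale parameter `b` should be chosen on the scale of the noise variance; a value that is large relative to $n\sigma^2$ would dominate the conditional posterior of $\sigma^2$.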

Table 5: Inferential results for the difference-based estimator $\hat{\sigma}_n$ of the noise standard deviation and the empirical Bayes posterior mean estimator $\hat{f}_n$, for increasing sample sizes.
$n$                                     1000      2000      3000      4500
$\hat{\sigma}_n$                        0.0034    0.0024    0.0023    0.00072
$\|\hat{f}_n-f_0\|_2$                   0.18      0.14      0.12      0.063
$\|\hat{f}_n-f_0\|_2/\|f_0\|_2$         37.5%     29.17%    25%       13.12%

Several strategies have been proposed in the literature for variance estimation in nonparametric regression models, ranging from residual-based estimators using kernel smoothing [22] and splines [46], to difference-based estimators [38]. See [5] for an overview. Here, we will consider the difference-based method proposed in [38], estimating $\sigma$ in model (3) by

$$\hat{\sigma}_n := \sqrt{\hat{\sigma}^2_n}, \qquad \hat{\sigma}^2_n := \frac{1}{2(n-1)}\sum_{i=2}^n (Y_i - Y_{i-1})^2.$$
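The estimator above amounts to a few lines of code. The sketch below is illustrative only: it assumes the responses are passed in an order in which consecutive design points are close to each other (as in the one-dimensional setting of [38]), so that the regression function contributes little to the successive differences. How observations are ordered for a multi-dimensional design is not specified here and is left as an assumption; the function name `rice_sigma` and the toy data are hypothetical.

```python
import numpy as np

def rice_sigma(y):
    """Difference-based estimate of the noise standard deviation:
    sqrt( sum_{i=2}^n (Y_i - Y_{i-1})^2 / (2 (n - 1)) )."""
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 2:
        raise ValueError("need at least two observations")
    diffs = np.diff(y)  # successive differences Y_i - Y_{i-1}, i = 2, ..., n
    sigma2_hat = np.sum(diffs**2) / (2.0 * (n - 1))
    return np.sqrt(sigma2_hat)

# Toy check (illustrative values): smooth 1D signal on sorted design points
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 5000))
sigma_true = 0.05
y = np.sin(2.0 * np.pi * x) + sigma_true * rng.standard_normal(x.size)
sigma_hat = rice_sigma(y)
print("estimated sigma:", sigma_hat)
```

If the responses are not ordered so that neighbouring design points are close, the squared differences also pick up the variation of the regression function, which inflates the estimate.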

Based on $\hat{\sigma}_n$, the ‘empirical Bayes posterior mean’ estimate $\hat{f}_n$ arising from a Gaussian series prior or a Matérn process prior can then be readily computed exactly as described in Sections 3.1 and 3.2 respectively, replacing $\sigma$ with $\hat{\sigma}_n$ in the relevant conjugate formulae.
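The plug-in step can be sketched as follows. This is a generic finite-dimensional stand-in rather than the formulae of Sections 3.1 and 3.2, which are not reproduced here: it assumes the model $Y = \Phi\theta + \sigma\varepsilon$ with independent priors $\theta_j \sim N(0,\lambda_j)$, for which the standard conjugate posterior mean is $(\Phi^T\Phi + \sigma^2\Lambda^{-1})^{-1}\Phi^T Y$. The matrix `Phi`, the one-dimensional sine basis, and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def posterior_mean(Phi, y, sigma, prior_var):
    """Posterior mean of theta in y = Phi @ theta + sigma * noise,
    under independent priors theta_j ~ N(0, prior_var[j]) (assumed model)."""
    A = Phi.T @ Phi + sigma**2 * np.diag(1.0 / prior_var)
    return np.linalg.solve(A, Phi.T @ y)

# Toy data (illustrative values): sorted 1D design, smoothing forward map
rng = np.random.default_rng(1)
n, J, sigma_true = 2000, 8, 0.0005
x = np.sort(rng.uniform(0.0, 1.0, n))
j = np.arange(1, J + 1)
Phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, j)) / (np.pi * j) ** 2
theta0 = 1.0 / j**2
prior_var = 1.0 / j**2
y = Phi @ theta0 + sigma_true * rng.standard_normal(n)

# Step 1: difference-based estimate of sigma from successive responses
sigma_hat = np.sqrt(np.sum(np.diff(y) ** 2) / (2.0 * (n - 1)))

# Step 2: plug sigma_hat into the conjugate formula; compare with known sigma
theta_eb = posterior_mean(Phi, y, sigma_hat, prior_var)
theta_oracle = posterior_mean(Phi, y, sigma_true, prior_var)
print("sigma_hat:", sigma_hat)
print("error (plug-in):", np.linalg.norm(theta_eb - theta0))
print("error (known sigma):", np.linalg.norm(theta_oracle - theta0))
```

In this formulation $\sigma$ enters the posterior mean only through the regularisation term $\sigma^2\Lambda^{-1}$, so a moderate error in $\hat{\sigma}_n$ changes the estimate little whenever the data term $\Phi^T\Phi$ dominates.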

Table 5 summarises the inferential results obtained with the difference-based estimation procedure for increasing sample sizes. For these experiments, the noise standard deviation was set to $\sigma=0.0005$, and a Gaussian series prior with the same tuning as in Sections 3.1.2 and 3.3.1 was used. The results show a progressive improvement in the reconstruction quality for both the noise standard deviation $\sigma$ and the unknown source function $f$. In particular, for the largest considered sample size $n=4500$, the $L^2$-estimation error $\|\hat{f}_n-f_0\|_2$ was only marginally higher than the one obtained under the same experimental conditions (and prior tuning) in Section 3.1.2 (cf. Table 1), where the value of $\sigma$ was assumed to be known.

4 Summary and discussion

In this article we have considered the nonparametric Bayesian approach with Gaussian priors to linear inverse problems, focusing on the important example of source identification in elliptic PDEs. The main advantages of the considered methodology lie in its modelling flexibility, its ease of implementation (cf. the conjugate formulae (15) and (16)), as well as its theoretical guarantees on estimation and uncertainty quantification (cf. Section 2.2). The performance of the approach has been investigated in a numerical simulation study (cf. Section 3) under two distinct prior models (Gaussian series and Matérn process priors), for both of which excellent reconstruction results have been obtained.

The present work also raises various related research questions. Firstly, it is of interest and practical importance to further explore the setting where the noise standard deviation $\sigma$ in the inverse regression model (3) is unknown. While the simple difference-based estimator considered in Section 3.3.2 has proved effective, several competing approaches, including the joint conjugate Bayesian model outlined in Section 3.3.2, could be investigated. Furthermore, a related interesting question concerns the extension of the theoretical results presented in Section 2.2 to the setting with unknown variance; see e.g. [26] for related results in a direct regression model.

Lastly, let us mention the important issue of specifying the hyperparameter values for the considered prior distributions, namely the truncation level and regularity in the Gaussian series (14), and the smoothness and length-scale parameters in the Matérn covariance kernel (10). There is by now a vast literature investigating the methodological and theoretical aspects of empirical and hierarchical Bayesian approaches to fully data-driven selection of the hyperparameters; see [28, 39, 45, 3] and the many references therein. Investigating the implications and performance of these methods in the context of the observation model and prior distributions considered in the present article is an interesting problem for future research.

Acknowledgement.

The Author is grateful to three anonymous referees for many helpful comments that led to an improvement of the article. This research has been partially supported by MUR, PRIN project 2022CLTYP4. The Author also gratefully acknowledges the support of “de Castro” Statistics Initiative, Collegio Carlo Alberto, Torino. There are no conflicts of interest to declare that are relevant to the content of this chapter.

References

  • [1] Abraham, K., and Nickl, R. On statistical Calderón problems. Math. Stat. Learn. 2, 2 (2019), 165–216.
  • [2] Adavani, S. S., and Biros, G. Fast algorithms for source identification problems with elliptic PDE constraints. SIAM Journal on Imaging Sciences 3, 4 (2010), 791–808.
  • [3] Agapiou, S., Bardsley, J. M., Papaspiliopoulos, O., and Stuart, A. M. Analysis of the Gibbs sampler for hierarchical inverse problems. SIAM/ASA Journal on Uncertainty Quantification 2, 1 (2014), 511–544.
  • [4] Agapiou, S., Stuart, A. M., and Zhang, Y.-X. Bayesian posterior contraction rates for linear severely ill-posed inverse problems. J. Inverse Ill-Posed Probl. 22, 3 (2014), 297–321.
  • [5] Alharbi, Y. F. M. Error variance estimation in nonparametric regression models. PhD thesis, University of Birmingham, 2013.
  • [6] Arridge, S., Maass, P., Öktem, O., and Schönlieb, C.-B. Solving inverse problems using data-driven models. Acta Numer. 28 (2019), 1–174.
  • [7] Baumeister, J. Inverse problems in finance. In Recent developments in computational finance: Foundations, algorithms and applications. World Scientific, 2013, pp. 81–157.
  • [8] Benning, M., and Burger, M. Modern regularization methods for inverse problems. Acta Numerica 27 (2018), 1–111.
  • [9] Bertero, M., and Piana, M. Inverse problems in biomedical imaging: modeling and methods of solution. In Complex systems in biomedicine. Springer Italia, Milan, 2006, pp. 1–33.
  • [10] Brown, L. D., and Low, M. G. Asymptotic equivalence of nonparametric regression and white noise. Ann. Statist. 24, 6 (1996), 2384–2398.
  • [11] Castillo, I., and Nickl, R. Nonparametric Bernstein–von Mises Theorems in Gaussian white noise. Ann. Statist. 41, 4 (2013), 1999–2028.
  • [12] Collins, M., and Kuperman, W. Inverse problems in ocean acoustics. Inverse Problems 10, 5 (1994), 1023.
  • [13] Elvetun, O. L., and Nielsen, B. F. A regularization operator for source identification for elliptic PDEs. Inverse Problems & Imaging 15, 4 (2021).
  • [14] Evans, L. C. Partial differential equations, second ed., vol. 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2010.
  • [15] Ghosal, S., and van der Vaart, A. W. Fundamentals of Nonparametric Bayesian Inference. Cambridge University Press, New York, 2017.
  • [16] Giné, E., and Nickl, R. Mathematical foundations of infinite-dimensional statistical models. Cambridge University Press, New York, 2016.
  • [17] Giordano, M. Bayesian nonparametric inference in PDE models: asymptotic theory and implementation. In 2023 JSM Proceedings. Zenodo, 2023, pp. 1–17.
  • [18] Giordano, M., and Kekkonen, H. Bernstein–von Mises theorems and uncertainty quantification for linear inverse problems. SIAM/ASA J. Uncertain. Quantif. 8, 1 (2020), 342–373.
  • [19] Giordano, M., and Nickl, R. Consistency of Bayesian inference with Gaussian process priors in an elliptic inverse problem. Inverse Problems 36, 8 (2020), 085001–85036.
  • [20] Giordano, M., and Ray, K. Nonparametric Bayesian inference for reversible multidimensional diffusions. The Annals of Statistics 50, 5 (2022), 2872–2898.
  • [21] Gugushvili, S., van der Vaart, A., and Yan, D. Bayesian linear inverse problems in regularity scales. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 56, 3 (2020), 2081–2107.
  • [22] Hall, P., and Marron, J. On variance estimation in nonparametric regression. Biometrika 77, 2 (1990), 415–419.
  • [23] Haroske, D. D., and Triebel, H. Distributions, Sobolev Spaces, Elliptic Equations. EMS Press, 2007.
  • [24] Isakov, V. Inverse problems for partial differential equations, third ed., vol. 127 of Applied Mathematical Sciences. Springer, Cham, 2017.
  • [25] Kaipio, J., and Somersalo, E. Statistical and Computational Inverse Problems. No. 160 in Applied Mathematical Sciences. Springer-Verlag New York, 2004.
  • [26] Kejzlar, V., Son, M., Bhattacharya, S., and Maiti, T. A fast and calibrated computer model emulator: an empirical Bayes approach. Statistics and Computing 31, 4 (2021), 49.
  • [27] Kekkonen, H., Lassas, M., and Siltanen, S. Posterior consistency and convergence rates for Bayesian inversion with hypoelliptic operators. Inverse Problems 32, 8 (2016), 085005, 31.
  • [28] Knapik, B., Szabò, B., van der Vaart, A. W., and van Zanten, H. Bayes procedures for adaptive inference in inverse problems for the white noise model. Probab. Theory Relat. Fields, 164 (2015), 771–813.
  • [29] Knapik, B., van der Vaart, A. W., and van Zanten, J. H. Bayesian inverse problems with Gaussian priors. Ann. Statist. 39, 5 (2011), 2626–2657.
  • [30] Knapik, B. T., van der Vaart, A. W., and van Zanten, J. H. Bayesian recovery of the initial condition for the heat equation. Comm. Statist. Theory Methods 42, 7 (2013), 1294–1313.
  • [31] Lehtinen, M. S. On statistical inversion theory. Theory and applications of inverse problems 67 (1988), 46–57.
  • [32] Lehtinen, M. S., Paivarinta, L., and Somersalo, E. Linear inverse problems for generalised random variables. Inverse Problems 5, 4 (1989), 599.
  • [33] Lions, J. L., and Magenes, E. Non-Homogeneous Boundary Value Problems and Applications, 1 ed. Grundlehren der mathematischen Wissenschaften. Springer-Verlag Berlin Heidelberg, 1972.
  • [34] Monard, F., Nickl, R., and Paternain, G. P. Efficient nonparametric Bayesian inference for $X$-ray transforms. Ann. Statist. 47, 2 (2019), 1113–1147.
  • [35] Monard, F., Nickl, R., and Paternain, G. P. Consistent inversion of noisy non-Abelian X-ray transforms. Comm. Pure Appl. Math. 74, 5 (2021), 1045–1099.
  • [36] Nickl, R., van de Geer, S., and Wang, S. Convergence rates for penalized least squares estimators in PDE constrained regression problems. SIAM/ASA J. Uncertain. Quantif. 8, 1 (2020), 374–413.
  • [37] Rasmussen, C. E., and Williams, C. K. I. Gaussian processes for machine learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA, 2006.
  • [38] Rice, J. Bandwidth choice for nonparametric regression. The annals of Statistics (1984), 1215–1230.
  • [39] Rousseau, J., and Szabo, B. Asymptotic behaviour of the empirical Bayes posteriors associated to maximum marginal likelihood estimator. The Annals of Statistics 45, 2 (2017), 833–865.
  • [40] Snieder, R., and Trampert, J. Inverse problems in geophysics. In Wavefield Inversion (Vienna, 1999), A. Wirgin, Ed., Springer Vienna, pp. 119–190.
  • [41] Stuart, A. M. Inverse problems: a Bayesian perspective. Acta Numer. 19 (2010), 451–559.
  • [42] Tarantola, A. Inverse problem theory and methods for model parameter estimation. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2005.
  • [43] Tarantola, A., Valette, B., et al. Inverse problems= quest for information. Journal of geophysics 50, 1 (1982), 159–170.
  • [44] Taylor, M. E. Partial Differential Equations I. Springer New York, NY, 2011.
  • [45] Teckentrup, A. L. Convergence of Gaussian process regression with estimated hyper-parameters and applications in Bayesian inverse problems. SIAM/ASA Journal on Uncertainty Quantification 8, 4 (2020), 1310–1337.
  • [46] Wahba, G. Improper priors, spline smoothing and the problem of guarding against model errors in regression. Journal of the Royal Statistical Society Series B: Statistical Methodology 40, 3 (1978), 364–372.
  • [47] Yeh, W. W.-G. Review of parameter identification procedures in groundwater hydrology: The inverse problem. Water Resources Research 22, 2 (1986), 95–108.