REXPACO ASDI: Joint unmixing and deconvolution of the circumstellar environment by angular and spectral differential imaging

Olivier Flasseur1, Loïc Denis2, Éric Thiébaut1, Maud Langlois1
1Université de Lyon, Université Lyon1, ENS de Lyon, CNRS, Centre de Recherche Astrophysique de Lyon UMR 5574, Saint-Genis-Laval, France
2Université de Lyon, UJM-Saint-Etienne, CNRS, Institut d'Optique Graduate School, Laboratoire Hubert Curien UMR 5516, Saint-Étienne, France
E-mail: [email protected]
Abstract

Angular and spectral differential imaging is an observational technique of choice to investigate the immediate vicinity of stars. By leveraging the relative angular motion and spectral scaling between on-axis and off-axis sources, post-processing techniques can separate residual star light from light emitted by surrounding objects such as circumstellar disks or point-like objects. This paper introduces a new algorithm that jointly unmixes these components and deconvolves disk images. The proposed algorithm is based on a statistical model of the residual star light, accounting for its spatial and spectral correlations. These correlations are crucial yet remain inadequately modeled by existing reconstruction algorithms. We employ dedicated shrinkage techniques to estimate the large number of parameters of our correlation model in a data-driven fashion. We show that the resulting separable model of the spatial and spectral covariances captures the star light very accurately, enabling its efficient suppression. We apply our method to datasets from the VLT/SPHERE instrument and compare its performance with standard algorithms (median subtraction, PCA, PACO). We demonstrate that considering the multiple correlations within the data significantly improves reconstruction quality, resulting in better preservation of both disk morphology and photometry. With its unique joint spectral modeling, the proposed algorithm can reconstruct disks with circular symmetry (e.g., rings, spirals) at intensities one million times fainter than the star, without needing additional reference datasets free from off-axis objects.

keywords:
techniques: high angular resolution – techniques: image processing – methods: numerical – methods: statistical – methods: data analysis
pubyear: 2024

1 Introduction

Direct imaging is a recent observational technique that makes it possible to probe the close environment of young stars (Traub & Oppenheimer, 2010; Bowler, 2016). The targeted tasks are threefold (see e.g., Pueyo (2018); Currie et al. (2022a); Follette (2023) for reviews): (i) detecting (massive) exoplanets, (ii) characterizing their physical properties by estimating their spectral energy distribution (SED), and (iii) reconstructing the flux distribution image of the circumstellar environment surrounding young nearby stars. In this paper, we primarily focus on the latter objective and we also address the unmixing of spatially resolved disks from point-like sources.

Circumstellar disks are key components of the intricate processes governing planetary formation. As an illustration, several studies performed in total intensity or in polarimetry (Esposito et al., 2020; Garufi et al., 2020; Langlois et al., 2020) have revealed the presence of a diversity of structures such as spirals, warps, rings, gaps, shadows and asymmetries, which are considered as potential indicators for the presence of exoplanets (Benisty et al., 2015; Muro-Arena et al., 2020). High-quality reconstruction of the circumstellar environment from high-contrast data thus offers a unique vantage point to understand the physical processes governing these objects (Keppler et al., 2018; Haffert et al., 2019; Mesa et al., 2019b). It also makes it possible to study the intricate interactions between exoplanets and disks, and to provide critical insights into the mechanisms steering the evolution of exoplanetary systems.

In this context, direct imaging faces two observational challenges. First, the objects of interest (i.e., spatially resolved disks and exoplanets appearing as point-like sources) have a very low contrast (typically lower than $10^{-4}$ in the infrared). Throughout this paper, we define the contrast of the objects of interest as the ratio of their peak intensity to the star peak intensity; this also corresponds to the classical definition of contrast for single-pixel point objects. Second, these off-axis objects are located in the immediate vicinity of the star, thus necessitating high angular resolution to separate them from the star (disks are generally observed within an angular separation of less than 1 arcsecond). The angular resolution requirement can be achieved using large ground-based telescopes equipped with extreme adaptive optics systems to compensate in real-time for atmospheric turbulence. The contrast is further improved by filtering most of the star light with a coronagraph. However, this is not sufficient to recover interpretable images of the circumstellar environment, as residual star light still dominates (see Fig. 1). To further reduce the impact of star light, differential imaging is employed. This observational technique involves capturing several images in configurations that introduce diversity (e.g., relative motion in ADI or SDI) between the objects of interest and the star diffraction patterns, known as speckles, caused by diffraction effects in the telescope pupil. There are two primary configurations for differential imaging. In angular differential imaging (ADI; Marois et al. (2006)), a sequence of images is acquired over a few hours of observation. During the acquisition, the pupil of the telescope keeps a constant orientation (so-called pupil-tracking mode), while the field of view rotates due to Earth’s rotation. 
This leads to a rotation of the objects of interest around the optical axis in the images, while quasi-static speckles created by uncorrected optical aberrations stay mostly fixed between individual exposures. In spectral differential imaging (SDI; Sparks & Ford (2002); Thatte et al. (2007)), images are captured simultaneously in several spectral bands. Due to diffraction, the speckle pattern scales linearly with wavelength, in first approximation. By properly rescaling the images spectrally, the speckle patterns are aligned, while the objects of interest undergo radial motion and homothety due to the scaling transform. ADI and SDI can be advantageously combined to form angular and spectral differential imaging (ASDI) sequences, see e.g. Vigan et al. (2010); Christiaens et al. (2019); Kiefer et al. (2021). The images recorded in ADI, SDI, or ASDI are then combined in a post-processing step to enhance contrast and obtain interpretable images of the circumstellar environment.
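
The two sources of diversity described above can be illustrated with a minimal geometric sketch. It assumes the star sits at the origin, that the field rotates counter-clockwise (a sign convention chosen here for illustration), and that the speckle-aligning rescaling uses the factor $\lambda_{\text{ref}}/\lambda$ mentioned in Sect. 2. The function names `apparent_position` and `displacement` are hypothetical and do not come from any existing pipeline.

```python
import math


def apparent_position(x, y, rotation_deg, wavelength, wavelength_ref):
    """Apparent position of an off-axis source in a speckle-aligned frame.

    (x, y) is the source position relative to the star (at the origin)
    for zero field rotation and at the reference wavelength.
    """
    theta = math.radians(rotation_deg)
    # ADI: the field of view rotates about the optical axis.
    x_rot = x * math.cos(theta) - y * math.sin(theta)
    y_rot = x * math.sin(theta) + y * math.cos(theta)
    # SDI: rescaling by wavelength_ref / wavelength aligns the speckles
    # but moves the off-axis source radially.
    scale = wavelength_ref / wavelength
    return scale * x_rot, scale * y_rot


def displacement(separation, rotation_deg, wavelength, wavelength_ref):
    """Distance between the reference position of a source (no rotation,
    reference wavelength) and its position in another configuration."""
    x1, y1 = apparent_position(separation, 0.0, rotation_deg,
                               wavelength, wavelength_ref)
    return math.hypot(x1 - separation, y1)


# Same rotation and wavelength ratio, two angular separations (in pixels).
near = displacement(10.0, 30.0, 1.2, 1.0)
far = displacement(20.0, 30.0, 1.2, 1.0)
print(f"displacement at 10 px: {near:.2f} px, at 20 px: {far:.2f} px")
```

In this sketch the displacement grows linearly with the angular separation, so sources close to the star move very little between frames and channels.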

The classical post-processing pipeline typically begins by estimating the stellar component, for instance, by averaging a stack of images with aligned speckles. This stellar component is then subtracted from the data, followed by the alignment and stacking of the residuals to compensate for rotations and scaling of the field of view. Beyond simple averaging, the stellar component can be estimated using various techniques: a median approach (Marois et al., 2006; Lagrange et al., 2009), a weighted linear combination (LOCI methods; Lafrenière et al. (2007); Marois et al. (2013); Marois et al. (2014); Wahhaj et al. (2015)), or principal component analysis (PCA-based methods; Soummer et al. (2012); Amara & Quanz (2012)). All of these methods can be applied to spatio-temporo-spectral data from IFS by leveraging differential diversity in various ways, see e.g. Christiaens et al. (2019); Kiefer et al. (2021) for PCA. This can be done using ADI alone (i.e., a different model for each spectral channel), SDI alone (i.e., a different model for each temporal frame), ADI+SDI (i.e., two models: the first exploiting ADI diversity and the second exploiting SDI diversity on the ADI residuals), SDI+ADI (i.e., two models applied in reverse order of ADI+SDI models), or ASDI (i.e., a single model that jointly leverages both angular and spectral diversities). This latter strategy, combined with the specific case of PCA, is known as COmbined Differential Imaging (CODI; Kiefer et al. (2021)). However, these methods share a common drawback: part of the signal of interest is included in the estimated stellar component, resulting in its loss when the star component is subtracted from the data. 
This critical phenomenon, known as self-subtraction (Milli et al., 2012; Pairet et al., 2019), is particularly problematic close to the star, where the diversity between the disk and the star light is more limited (the apparent displacement of the off-axis objects due to the rotations and scaling transforms being separation-dependent). Consequently, disentangling the component of interest from the star light is even more difficult nearer the star. Self-subtraction can introduce various artifacts, such as partial replicas, suppression of some smooth extended structures, and smearing or non-uniform attenuations of disk features (Milli et al., 2012).
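
Self-subtraction can be illustrated with a deliberately simplified toy model, which is not one of the algorithms cited above. It considers the time series of a single pixel in which a constant speckle intensity is superimposed with a source that stays on the pixel for a given number of frames. The stellar component is estimated by a temporal median and subtracted. The function name `recovered_flux` and all numerical values are hypothetical.

```python
from statistics import median


def recovered_flux(n_frames, n_frames_on_pixel, speckle=100.0, source=1.0):
    """Mean source flux left on a pixel after temporal median subtraction.

    The pixel sees a constant speckle intensity; the source covers it
    during the first `n_frames_on_pixel` frames only.
    """
    series = [speckle + (source if t < n_frames_on_pixel else 0.0)
              for t in range(n_frames)]
    star_estimate = median(series)          # classical median estimate
    residuals = [value - star_estimate for value in series]
    # Average the residuals over the frames where the source is present.
    return sum(residuals[:n_frames_on_pixel]) / n_frames_on_pixel


# Far from the star the source sweeps quickly across the pixel.
print(recovered_flux(100, 10))
# Close to the star it stays on the pixel for most of the sequence.
print(recovered_flux(100, 80))
```

When the source occupies the pixel for more than half of the frames, the median absorbs it and the subtraction removes it entirely. This is the situation encountered close to the star, where the apparent displacement is small.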

To mitigate the impact of self-subtraction, several approaches were considered. Some works perform iterative PCA in which the current disk reconstruction is subtracted from the measurements to improve progressively the estimation of the star light (Pairet et al., 2019; Stapper & Ginski, 2022). In the same vein, data imputation strategies (Ren et al., 2020; Ren, 2023) discard measurements affected by the disk, either through a data-driven approach or based on prior knowledge of its shape and location, during the estimation of the star light contribution. This type of approach remains limited by the strategy designed to discard fractions of the field of view impacted by the disk on each image. Other works consider a parametric model of a disk and iteratively adjust its parameters (Esposito et al., 2013; Currie et al., 2017; Milli et al., 2017) by minimizing the resulting residuals, possibly by modeling the effect of the self-subtraction (Lawson et al., 2020; Mazoyer et al., 2020; Hom et al., 2024). These approaches are mainly applicable to simple disk structures, such as ellipses, which are typical morphologies of debris disks. Another technique, Reference Differential Imaging (RDI; see Smith & Terrile (1984); Lafrenière et al. (2009); Lagrange et al. (2010) for some first examples of applications), employs additional images of one or more reference stars without known exoplanets or disks. These additional data can be captured simultaneously with the observation of the target star using the star-hopping technique (Wahhaj et al., 2021), or they can be drawn from a large library of archival observations (Ren et al., 2018; Xuan et al., 2018). RDI can be effectively combined with other observing strategies to simultaneously exploit their diversity. 
For instance, when integrating RDI with ADI, the nuisance component can be estimated and suppressed using PCA (Ruane et al., 2019; Xie et al., 2022; Juillard et al., 2024) or deep learning techniques (Chintarungruangchai et al., 2023; Wolf et al., 2024; Bodrito et al., 2024). RDI reconstructions can also be constrained by additional observations from other imaging modalities, such as polarimetry, where speckles and disk components behave differently than in total intensity images recorded with ADI/ASDI (Lawson et al., 2022). In practice, the effectiveness of RDI approaches depends heavily on the similarity between the reference and the actual observations, including factors such as star brightness, spectrum, and observation conditions. This degree of similarity becomes increasingly critical as we search for fainter objects. Finally, a last category of approaches jointly addresses the problem of estimating star light residuals and reconstructing the flux distribution of the disk and exoplanets. In ADI, three approaches based on an inverse-problems formulation were recently proposed: MAYONNAISE (Pairet et al., 2021), MUSTARD (Juillard et al., 2022, 2023), and REXPACO (Flasseur et al., 2021, 2022). These algorithms employ different strategies and regularization penalties in the inversion to separate the components of interest. In a first step, MAYONNAISE uses iterative PCA to initialize the inversion process. Building on this preliminary reconstruction, a second step involves estimating and unmixing multiple components by jointly minimizing a data fidelity term. The unmixed components are the star light residuals (restricted to lie within the subspace identified in the first step), the disk (enforced to have a sparse representation in a shearlet basis), and the exoplanets (restricted to be sparse). Non-negativity constraints are also enforced during the minimization. MUSTARD is a variant of MAYONNAISE that primarily differs in the formulation of the direct model. 
The reconstructed speckle field is constrained to be identical along the temporal axis to account explicitly for its quasi-static behavior. Unlike MAYONNAISE, MUSTARD does not use iterative PCA for initialization, nor does it enforce sparsity of the disk component in a shearlet basis. Additionally, MUSTARD can incorporate a regularization term based on a predefined mask, which helps resolve ambiguities between the speckle field and portions of the disk that are rotation invariant. Both MAYONNAISE and MUSTARD assume noise to be white, independent, and identically distributed. REXPACO follows a rather different modeling approach, as it does not explicitly estimate the residual star light in each image. Instead, it builds a statistical and local model of all fluctuations other than the component of interest (i.e., noise and star light). REXPACO learns the spatial correlations of these fluctuations at the scale of 2D image patches, following an approach initially introduced for exoplanet detection in the PACO algorithm (Flasseur et al., 2018), based on PAtch COvariances. The component of interest is deconvolved with an edge-preserving smoothness regularization and a positivity constraint. Further extending REXPACO for ADI post-processing, a recent enhancement replaces its multivariate Gaussian model of the nuisance with a scaled mixture of multivariate Gaussian models (Flasseur et al., 2022). This improved model offers better fidelity to the observations and enhanced robustness against outlier data (e.g., defective pixels or large stellar leakages), which are identified and neutralized in a data-driven manner. In ADI, REXPACO can be combined with PACO to disentangle user-identified candidate point-like sources from the circumstellar environment.

In this paper, we address the problem of reconstructing circumstellar disks from ASDI sequences through joint multi-spectral post-processing. Compared to ADI, this raises several challenges: (i) modeling the temporal and spectral fluctuations of the residual star light, (ii) jointly exploiting both temporal and spectral information to effectively extract the component of interest, and (iii) ensuring the tractability of estimating high-dimensional models from large datasets. As an illustration of point (iii), typical ASDI datasets produced by the Integral Field Spectrograph (IFS) of the Spectro-Polarimetry High-contrast Exoplanet Research instrument (SPHERE; Beuzit et al. (2019)) at the Very Large Telescope (VLT) are $N=290\times 290$ pixels, have $L=39$ spectral bands and $T\approx 100$ individual exposures. Several hundred million pixel measurements must then be combined to produce a multi-spectral reconstruction of the component of interest. Modeling the full covariance associated with this volume of measurements theoretically involves estimating $N(N+1)/2$ degrees of freedom from the data, which is not feasible without making approximations to the covariance.
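
The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below takes $N$, $L$ and $T$ from the text and counts the free parameters of a symmetric covariance matrix. The patch size `patch_size = 50` is a hypothetical value, chosen only to compare the full count with that of a separable model made of one $K\times K$ spatial and one $L\times L$ spectral covariance per patch.

```python
def covariance_dof(n):
    """Number of free parameters of a symmetric n x n covariance matrix."""
    return n * (n + 1) // 2


n_pixels = 290 * 290   # N, pixels per image
n_channels = 39        # L, spectral bands
n_exposures = 100      # T, approximate number of exposures

# Total number of pixel measurements in one ASDI sequence.
n_measurements = n_pixels * n_channels * n_exposures

# Degrees of freedom of a full covariance over N pixels, as in the text.
full_dof = covariance_dof(n_pixels)

# Separable patch model: one K x K and one L x L covariance per patch.
patch_size = 50        # K, hypothetical value used for illustration only
separable_dof = covariance_dof(patch_size) + covariance_dof(n_channels)

print(f"measurements:           {n_measurements:,}")
print(f"full covariance dof:    {full_dof:,}")
print(f"separable dof per patch: {separable_dof:,}")
```

Under these assumptions the full covariance has more free parameters than there are measurements, whereas the separable local model requires only a few thousand parameters per patch.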

Table 1: Summary of the main notations.

| Notation | Range | Definition |
|---|---|---|
| ▷ Constants and related indexes | | |
| $K$ | $\mathbb{N}^{*}$ | number of pixels in a patch |
| $N$ | $\mathbb{N}^{*}$ | number of pixels in a dataset |
| $N^{\prime}$ | $\mathbb{N}^{*}$ | number of pixels in a reconstructed image |
| $T$ | $\mathbb{N}^{*}$ | number of temporal frames |
| $L$ | $\mathbb{N}^{*}$ | number of spectral channels |
| $L_{\text{eff}}$ | $\mathbb{N}^{*}$ | effective number of spectral channels |
| $n^{(\prime)}$ | $\llbracket 1,N^{(\prime)}\rrbracket$ | pixel index |
| $t$ | $\llbracket 1,T\rrbracket$ | temporal index |
| $\ell$ | $\llbracket 1,L\rrbracket$ | spectral index |
| $\mathbb{K}$ | | set of patch locations |
| ▷ Data quantities | | |
| $\bm{v}$ | $\mathbb{R}^{NTL}$ | ASDI sequence (with speckles aligned) |
| $\bm{f}$ | $\mathbb{R}^{NTL}$ | nuisance component |
| $\mathbf{E}_{n(,t)(,\ell)}$ | $\mathbb{R}^{NTL\times K(L)(T)}$ | patch extractor at pixel $n$ (, time $t$) (, channel $\ell$) |
| $\mathbf{V}_{n,t}$ | $\mathbb{R}^{K\times L}$ | residual multi-spectral patch at pixel $n$, time $t$ |
| $\bm{u}$ | $\mathbb{R}_{+}^{N^{\prime}L}$ | spatio-spectral flux distribution |
| ▷ Operators | | |
| $\mathbf{M}$ | $\mathbb{R}^{N^{\prime}L\times NTL}$ | direct image formation model: $\mathbf{M}=\mathbf{S}\,\mathbf{Z}\,\mathbf{A}\,\mathbf{B}\,\mathbf{R}$ |
| $\mathbf{F}_{t}$ | $\mathbb{R}^{N^{\prime}L\times NL}$ | sparse operator at time $t$: $\mathbf{F}_{t}=(\mathbf{S}\,\mathbf{Z}\,\mathbf{A}\,\mathbf{R})_{t}$ |
| $\mathbf{B}$ | $\mathbb{R}^{N^{\prime}TL\times N^{\prime}L}$ | convolution by off-axis PSF |
| $\mathbf{R}$ | $\mathbb{R}^{N^{\prime}TL\times N^{\prime}TL}$ | apparent field rotation |
| $\mathbf{A}$ | $\mathbb{R}^{N^{\prime}TL\times N^{\prime}TL}$ | coronagraph attenuation |
| $\mathbf{Z}$ | $\mathbb{R}^{N^{\prime}TL\times MTL}$ | field of view cropping |
| $\mathbf{S}$ | $\mathbb{R}^{MTL\times NTL}$ | spectral scaling |
| $\odot$ | $\mathbb{R}^{X\times X},\ X\in\mathbb{N}^{*}$ | Hadamard (element-wise) product |
| $\otimes$ | $\mathbb{R}^{X\times X},\ X\in\mathbb{N}^{*}$ | Kronecker product |
| ▷ Estimated quantities | | |
| $\widehat{\bm{\mu}}^{\,\mathrm{spec}}$ | $\mathbb{R}^{NL}$ | multi-spectral mean of $\bm{f}$ |
| $\widetilde{\bm{\mu}}^{\,\mathrm{spec}}$ | $\mathbb{R}^{NL}$ | shrunk multi-spectral mean of $\bm{f}$ |
| $\widehat{\mathbf{C}}^{\mathrm{spat}}$ | $\mathbb{R}^{K\times K}$ | local empirical spatial covariance of $\bm{f}$ |
| $\widehat{\mathbf{C}}^{\mathrm{spec}}$ | $\mathbb{R}^{L\times L}$ | local empirical spectral covariance of $\bm{f}$ |
| $\widetilde{\mathbf{C}}^{\mathrm{spat}}$ | $\mathbb{R}^{K\times K}$ | local shrunk spatial covariance of $\bm{f}$ |
| $\widetilde{\mathbf{C}}^{\mathrm{spec}}$ | $\mathbb{R}^{L\times L}$ | local shrunk spectral covariance of $\bm{f}$ |
| $\widetilde{\rho}^{\,\mathrm{spat}}$ | $[0,1]$ | spatial shrinkage coefficient |
| $\widetilde{\rho}^{\,\mathrm{spec}}$ | $[0,1]$ | spectral shrinkage coefficient |
| $\widehat{\bm{\sigma}}$ | $\mathbb{R}_{+}^{T}$ | temporal weights ($\widehat{\bm{\sigma}}=\{\widehat{\sigma}_{t}\}_{t=1:T}$) |
| $\widetilde{\bm{\sigma}}$ | $\mathbb{R}_{+}^{T}$ | shrunk temporal weights ($\widetilde{\bm{\sigma}}=\{\widetilde{\sigma}_{t}\}_{t=1:T}$) |
| $\mathbf{\Psi}^{\mathrm{spat}}$ | $\mathbb{R}^{K\times K}$ | matrix of spatial shrinkage coefficients |
| $\mathbf{\Psi}^{\mathrm{spec}}$ | $\mathbb{R}^{L\times L}$ | matrix of spectral shrinkage coefficients |
| $\mathbf{\Gamma}$ | $\mathbb{R}^{KL\times KL}$ | shrunk spatio-spectral precision matrix |
| $\widehat{\bm{u}}$, $\widetilde{\bm{u}}$ | $\mathbb{R}_{+}^{N^{\prime}L}$ | reconstructed spatio-spectral flux distribution |
| $\widehat{\bm{\beta}}$ | $\mathbb{R}_{+}^{2}$ | regularization hyper-parameters |
| ▷ Other quantities and metrics | | |
| $\bm{u}_{\text{inv}}$ | $\mathbb{R}^{N^{\prime}}$ | flux distribution invariant by ASDI |
| $\bm{u}_{\text{gt}}$ | $\mathbb{R}_{+}^{N^{\prime}L}$ | ground truth flux distribution |
| $\alpha_{\text{gt}}$ | $\mathbb{R}_{+}$ | maximum ground truth contrast (disk vs star) |
| MSE | $\mathbb{R}$ | mean square error |
| N-RMSE | $\mathbb{R}_{+}$ | normalized root mean square error |
| SURE | $\mathbb{R}$ | Stein’s unbiased risk estimator |

Our contributions: This paper extends the REXPACO algorithm (Flasseur et al., 2021, 2022) to ASDI sequences. This extension, named REXPACO ASDI, involves several specific methodological developments, including:

  • a spatio-spectral separable model of the covariances of the nuisance,

  • a spatio-temporal weighting of the measurements based on their relative quality,

  • a technique to estimate the components of the covariances and weights model,

  • a regularization strategy of the (noisy) sample covariances,

  • a strategy to jointly refine the model of the residual star light and reconstruct the disk of interest,

  • a spatio-spectral regularization of the reconstructed multi-spectral images,

  • a strategy to unmix point-like sources from the disk material.

Beyond these methodological developments, the proposed approach is, to the best of our knowledge, the first one to leverage joint processing of multi-spectral data through an inverse problem framework for reconstructing circumstellar disks in high-contrast imaging. We illustrate in this paper the benefits of an accurate exploitation of the spectral diversity to improve reconstruction fidelity. In particular, we show that REXPACO ASDI can faithfully reconstruct disks with nearly circularly symmetric morphologies (e.g., spirals and rings). Such morphological structures are especially challenging to reconstruct without additional data diversity complementary to A(S)DI (e.g., based on RDI techniques) to build an unbiased model of the nuisance component.
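
The interest of a spatio-spectral separable covariance model, listed among the contributions above, can be illustrated with a short numerical sketch. It assumes that the covariance of a vectorized $K\times L$ patch is the Kronecker product of an $L\times L$ spectral covariance and a $K\times K$ spatial covariance, with column-major stacking; this ordering is a convention chosen here and may differ from the one used in the following sections. The covariances are random stand-ins and the sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
K, L = 6, 4  # toy patch size and number of channels (hypothetical values)


def random_spd(n):
    """Random symmetric positive-definite matrix, a stand-in covariance."""
    a = rng.standard_normal((n, 3 * n))
    return a @ a.T / (3 * n) + 0.1 * np.eye(n)


C_spat = random_spd(K)  # K x K spatial covariance
C_spec = random_spd(L)  # L x L spectral covariance

# Separable covariance of a vectorized K x L patch (column-major stacking).
C_full = np.kron(C_spec, C_spat)

# The precision matrix inherits the Kronecker structure.
P_spat = np.linalg.inv(C_spat)
P_spec = np.linalg.inv(C_spec)
precision = np.kron(P_spec, P_spat)

# Applying the precision to a patch never requires the KL x KL matrix.
V = rng.standard_normal((K, L))
direct = precision @ V.flatten(order="F")
factored = (P_spat @ V @ P_spec).flatten(order="F")

# Number of free parameters: full versus separable model.
n_params_full = (K * L) * (K * L + 1) // 2
n_params_separable = K * (K + 1) // 2 + L * (L + 1) // 2
print(n_params_full, n_params_separable)
```

In this toy setting the separable model has 31 free parameters instead of 300, and the precision matrix is applied through two small matrix products rather than one large one.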

Section 2 develops the statistical model for the residual star light and different noise contributions. Building on this model, Sect. 3 presents a reconstruction method that jointly extracts and deconvolves the component of interest: the multi-spectral image of the disk surrounding the target star. Section 4 showcases reconstruction results on several ASDI sequences obtained with the VLT/SPHERE instrument. Additionally, Sect. 5 describes an iterative method to unmix the contribution of candidate point-like sources from the circumstellar disk. Finally, Sect. 6 draws the conclusions of this work.

Throughout the text, the reader can refer to Table 1 summarizing the main notations.

2 Statistical model of the nuisance

Figure 1: Illustration of a dataset acquired with ASDI: (a) images captured at different wavelengths; (b) spatio-spectral slices along the two lines –1– and –2– drawn in (a); (c) spatio-temporal slices along the lines –1– and –2–. The four square areas define four regions studied in more detail in Fig. 2. The component of interest, a spiral-shaped circumstellar disk, is shown in (d) based on REXPACO ASDI reconstruction given in Sect. 4.3. In the first channel, shown in blue, the signal of the disk is faint (contrast about $1.5\times 10^{-6}$) compared to the stellar leakages. Images are displayed using pseudo-colors (ranging from blue to red) chosen to cover the infrared spectrum. Colored polygons delimit the common field of view seen in all spectral channels. Dataset: SAO 206462 (2015-05-15), see Table 2 for the observation parameters.

In contrast to other methods in the literature, we do not explicitly extract the residual star light component from the data but rather develop a statistical model to describe both the residual star light (i.e., the speckles) and the various stochastic noise contributions (thermal noise, detector readout noise, photon noise). In pupil-tracking mode and after chromatic speckle alignment by rescaling the images according to the wavelength, residual star light is very similar from one spectral channel to the next (up to some chromatic factor). There are, however, some fluctuations due to noise, chromatic phenomena, and the evolution of the phase aberrations during observation. These fluctuations display some spatial and spectral correlations and are highly non-stationary. In particular, they are much stronger close to the star. We describe in Sects. 2.1 and 2.2 the rationale of the statistical model embedded in REXPACO ASDI, and we develop in Sect. 2.3 a methodology to estimate the resulting large number of parameters directly from the data.

Figure 1 shows a dataset of a star (SAO 206462) surrounded by a bright disk observed using the ASDI technique. Slices along different dimensions of this 4D dataset are displayed. The coronagraphic mask is aligned with the star, at the center of the field of view (center of the images shown in Fig. 1(a)). Residual star light dominates the central area and extends over most of the field of view. It takes the form of granular intensity structures (speckles). During a pre-processing step, all images were rescaled by a wavelength-specific factor $\lambda_{\text{ref}}/\lambda$ to compensate for diffraction and spatially align the speckles. The solid line –1– drawn in the $x$ direction in Fig. 1(a) crosses a bright speckle. This speckle is visible in the left part of Fig. 1(b) and the first row of Fig. 1(c). It remains at the same spatial location for all wavelengths $\lambda$ and all times $t$. Structures of interest, such as the disk that surrounds the star SAO 206462, undergo a rotation about the image center over time and a scaling with the wavelength (due to the rescaling applied in the pre-processing step). These spatial transformations are visible in the slices along the dotted line –2– drawn in the images of Fig. 1(a): the line crosses the disk (as well as an area with strong residual star light, close to the image center). The spatio-spectral slice shown at the right of Fig. 1(b) displays a scaling of the disk with respect to the wavelength (shorter wavelengths are dilated due to the speckle-aligning pre-processing), whereas the rotation motion can be noted in the spatio-temporal slices shown at the bottom of Fig. 1(c), in particular for a bright structure of the disk highlighted within a white box, which is moving closer to the image center during the sequence. Figure 1(d) shows only the component of interest: the circumstellar disk. 
The images were obtained with the reconstruction method introduced in this paper; see Sect. 4.3 for a spectrally combined visualization of the reconstructed disk. Comparing Figs. 1(a) and 1(d) illustrates that high-contrast observations suffer from a strong nuisance component that has to be numerically suppressed in order to reconstruct the component of interest.

The accuracy of the residual star light and noise model has a strong impact on the reconstruction of the component of interest $\bm{u}$, as further discussed in Sect. 4. In the remainder of this section, we first assume that the object $\bm{u}$ has a negligible impact on the statistical distribution of the nuisance term, i.e., the statistical distribution of the aligned data $\text{p}_V(\bm{v})$ in the absence of disk or exoplanet is nearly identical to the distribution $\text{p}_V(\bm{v}-\mathbf{M}\,\bm{u})$ of the nuisance component $\bm{f}=\bm{v}-\mathbf{M}\,\bm{u}$ obtained when the modeled contribution $\mathbf{M}\,\bm{u}$ of the component of interest has been subtracted from the data $\bm{v}$ (the direct model, $\mathbf{M}$, is presented in Sect. 3.1). This assumption is made in order to initiate the estimation of the model parameters, and we introduce in Sects. 2.3 and 3.3 several strategies to jointly estimate the statistical distribution of the nuisance terms and reconstruct the component of interest. These joint and iterative strategies significantly enhance the fidelity of the reconstruction by explicitly accounting for the bias induced by the disk on the nuisance model.

2.1 Patch-based statistical modeling

Image patches (i.e., neighborhoods of a few tens to a hundred pixels) offer an interesting trade-off between locality (small enough to capture a local behavior) and complexity (they include enough pixels to collect geometrical and textural information). Their use has been very successful in image restoration, with methods based on image self-similarity (Buades et al., 2005), collaborative filtering (Dabov et al., 2007), sparse coding (Aharon et al., 2006; Mairal et al., 2009), mixture models (Zoran & Weiss, 2011; Yu et al., 2011), or Gaussian models (Lebrun et al., 2013). Whereas deep neural networks have become the state-of-the-art approach to learn rich models (either generative or discriminative) of natural images, patch-based models retain serious advantages when the number of training samples is limited or in the case of highly non-stationary images. As can be seen in Fig. 1(a), images in an ASDI dataset are far from stationary: residual star light is strongest at the center of the image (at the actual location of the star). Observations made during separate nights around different stars also often display significantly different structures because of changes in the observing conditions (which impact the residual aberrations uncorrected by adaptive optics, and hence the spatial distribution of speckles due to star light) and star brightness. This limits the possibility of using external observations (e.g., using the RDI technique, see Sect. 1) to learn a model to process a specific ASDI sequence and motivates the development of a patch-based approach based solely on the ASDI sequence of interest.

Under our patch-based model, the distribution of an ASDI sequence $\bm{v}\in\mathbb{R}^{NLT}$, formed by the collection of $T$ multi-spectral images with $L$ spectral bands and $N$ pixels in each band, after chromatic speckle alignment and without disk or exoplanet, is given by:

\begin{equation}
\text{p}_V(\bm{v}) \approx \prod_{n\in\mathbb{K}} \text{p}_{V_n}(\mathbf{E}_n\bm{v})\,, \tag{1}
\end{equation}

where $\text{p}_V$ is the joint distribution of the whole ASDI dataset and $\mathbf{E}_n$ is the linear operator that extracts a $K\times L_{\text{eff}}\times T$-pixel spatio-spectro-temporal patch centered at the $n$-th spatial location of the field of view (i.e., $\bm{v}_n=\mathbf{E}_n\bm{v}$ is a 4D patch). Throughout the text, we do not differentiate the $x$ and $y$ spatial dimensions, to simplify the notations, but rather use 2D spatial indices $n$. The set of spatial locations $\mathbb{K}$ is defined to prevent patch overlapping while tiling the whole field of view (i.e., $\text{Card}(\mathbb{K})\times K=N$ and juxtaposed square patches are used).
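The tiling performed by the operators $\mathbf{E}_n$ can be illustrated with a short Python/NumPy sketch. It is only an illustration under stated assumptions: the function name `extract_patches`, the array layout `(T, L, H, W)`, and the requirement that the image sides be multiples of the patch side are hypothetical choices, and all $L$ channels are kept rather than $L_{\text{eff}}$ channels.

```python
import numpy as np

def extract_patches(v, p):
    """Cut a dataset into juxtaposed, non-overlapping square patches.

    v : array of shape (T, L, H, W) (assumed layout: time, channel, rows, columns)
    p : side of the square spatial patch, so that K = p * p
    Returns an array of shape (n_patches, T, L, K), one 4D patch per location n.
    """
    T, L, H, W = v.shape
    if H % p or W % p:
        raise ValueError("this sketch assumes image sides that are multiples of p")
    # Split each spatial axis into (block index, offset inside the block).
    blocks = v.reshape(T, L, H // p, p, W // p, p)
    # Bring the two block indices first: (H//p, W//p, T, L, p, p).
    blocks = blocks.transpose(2, 4, 0, 1, 3, 5)
    # Merge block indices into n and the two offsets into the pixel index k.
    return blocks.reshape(-1, T, L, p * p)

# Toy usage: Card(K) * K = N holds by construction.
v = np.arange(2 * 3 * 4 * 6, dtype=float).reshape(2, 3, 4, 6)
patches = extract_patches(v, 2)
```

Because the patches do not overlap, every measurement of the dataset belongs to exactly one patch, which is what allows the product form of Eq. (1).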

The model (1) assumes a statistical independence between patches, which is a simplifying hypothesis that eases a data-driven learning of the distribution $\text{p}_V$. In the sequel, each distribution $\text{p}_{V_n}$ is modeled by a different multivariate Gaussian in order to capture the correlations between observations within a spatio-spectro-temporal patch. By adapting the parameters of these Gaussian distributions to the spatial location $n$, a non-stationary model is obtained, with the capability to capture the variations between areas close to the star (at the center of the image) and areas farther away. The statistical model of a patch is thus given by its assumed distribution:

\begin{equation}
\text{p}_{V_n}(\bm{v}_n) = \frac{1}{\sqrt{|2\pi\mathbf{C}_n|}} \exp\left(-\tfrac{1}{2}\bigl\|\bm{v}_n-\bm{\mu}_n\bigr\|_{\mathbf{C}_n^{-1}}^{2}\right)\,, \tag{2}
\end{equation}

with $\|\bm{a}\|_{\mathbf{B}}^{2}=\bm{a}^{\top}\,\mathbf{B}\,\bm{a}$ and $|\mathbf{C}_n|$ the determinant of matrix $\mathbf{C}_n$. The Gaussian distribution $\text{p}_{V_n}$ is defined by the patch expectation $\bm{\mu}_n\in\mathbb{R}^{KLT}$ and the covariance matrix $\mathbf{C}_n\in\mathbb{R}^{KLT\times KLT}$. In order to estimate these two quantities at each location $n$, additional hypotheses and an estimation technique are required.

2.2 Constraining the structure of the average vector and of the covariance matrix

Estimating and handling different Gaussian parameters for each patch location is not feasible given the number of parameters involved: the set of all mean vectors $\{\bm{\mu}_n\}_{n\in\mathbb{K}}$ has as many free parameters as the total number of measurements in $\bm{v}$ (i.e., $NLT$) and the set of all covariance matrices $\{\mathbf{C}_n\}_{n\in\mathbb{K}}$ represents many times the number of measurements in $\bm{v}$ (more precisely, $NLT(KLT+1)/2$ free parameters, which represents more than 300,000 times the size of $\bm{v}$ for typical values of $K\approx 13\times 13$, $L\approx 39$, and $T\approx 100$).
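The order of magnitude quoted above can be checked with a few lines of Python. This is plain arithmetic on the typical values given in the text; the variable names are illustrative only.

```python
# Typical sizes quoted in the text.
K = 13 * 13   # pixels per spatial patch
L = 39        # spectral channels
T = 100       # temporal frames

d = K * L * T                            # dimension of one 4D patch
cov_params_per_patch = d * (d + 1) // 2  # free entries of a symmetric d x d matrix

# There are N / K patches and N * L * T measurements, so the ratio of
# covariance parameters to measurements does not depend on N:
#   (N / K) * d * (d + 1) / 2 / (N * L * T) = (d + 1) / 2
ratio = (d + 1) / 2
print(f"{ratio:,.1f}")  # prints 329,550.5
```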

There are two options to reduce the number of parameters in the Gaussian models of Eqs. (1) and (2). Option (i) is to assume a certain level of stationarity for the means or covariances with respect to the spatial location $n$. Option (ii) is to impose a structure on the mean $\bm{\mu}_n$ and on the covariance $\mathbf{C}_n$. Beyond obtaining more tractable models, these assumptions are also indispensable, for a single ASDI dataset $\bm{v}$, to constrain the estimator of the parameters of the Gaussian models.

The strong spatial non-stationarity of ASDI datasets led us to favor option (ii). We considered several ways to select a structure suitable to ASDI observations and built on our experience of point-source detection in ASDI datasets (Flasseur et al., 2020b). We found that it is preferable to use a common mean vector $\bm{\mu}_n$ for all times $t$ rather than a time-specific mean vector common to all wavelengths (the spectral variations being stronger than the temporal fluctuations):

\begin{equation}
\text{Mean}\left[\bm{v}_{n,(k,\ell,:)}\right] = \bm{\mu}_{n,(k,\ell,:)} = \frac{1}{T}\sum_{t'=1}^{T}\bm{v}_{n,(k,\ell,t')} = \bm{\mu}_{n,(k,\ell)}^{\mathrm{spec}}\,, \tag{3}
\end{equation}

where $\bm{\mu}_n$ represents the mean vector at patch location $n$, and $\bm{\mu}_{n,(k,\ell,t)}$ denotes its specific entry at pixel $k$, spectral channel $\ell$ and time $t$. Equation (3) can be rewritten in the more concise form:

\begin{equation}
\bm{\mu}_n = \text{vec}\!\left(\begin{pmatrix}|\\ \bm{\mu}_n^{\mathrm{spec}}\\ |\end{pmatrix}\overset{\longleftarrow\, T\,\longrightarrow}{\begin{pmatrix}1&\cdots&1\end{pmatrix}}\right)\,, \tag{4}
\end{equation}

where $\bm{\mu}_n^{\mathrm{spec}}$ is a $KL$-pixel multi-spectral vector that represents the temporal average of the multi-spectral patches and $\text{vec}(\cdot)$ performs the vectorization of a matrix by stacking its columns (it transforms a $KL\times T$ matrix into a vector of dimension $KLT$).
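The mean structure of Eqs. (3) and (4) can be sketched in Python/NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `temporal_mean_patch` and the patch layout `(T, L, K)` (time, channel, pixel) are assumptions made here.

```python
import numpy as np

def temporal_mean_patch(v_n):
    """Build the structured mean of one 4D patch.

    v_n : array of shape (T, L, K), assumed layout (time, channel, pixel)
    Returns (mu_spec, mu_n): the KL-vector of Eq. (3) and the KLT-vector of Eq. (4).
    """
    T = v_n.shape[0]
    # Eq. (3): temporal average, flattened with the pixel index varying fastest.
    mu_spec = v_n.mean(axis=0).reshape(-1)
    # Eq. (4): outer product with a row of T ones gives a KL x T matrix ...
    mu_matrix = np.outer(mu_spec, np.ones(T))
    # ... and vec(.) stacks its columns (Fortran order in NumPy).
    mu_n = mu_matrix.reshape(-1, order="F")
    return mu_spec, mu_n

rng = np.random.default_rng(0)
v_n = rng.standard_normal((5, 3, 4))     # toy patch: T=5, L=3, K=4
mu_spec, mu_n = temporal_mean_patch(v_n)
```

The stacked vector `mu_n` simply repeats `mu_spec` once per frame, which is why only $KL$ values per patch need to be estimated.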

To capture the structures of both the spatial and the spectral covariances, we model the covariance between two pixels of the patch $\bm{v}_n$ by:

\begin{equation}
\text{Cov}\!\left[\bm{v}_{n,(k_1,\ell_1,t_1)},\,\bm{v}_{n,(k_2,\ell_2,t_2)}\right] =
\begin{cases}
0 & \text{if } t_1\neq t_2\,,\\
\sigma_{n,t}^{2}\,\mathbf{C}_{n,(k_1,k_2)}^{\mathrm{spat}}\,\mathbf{C}_{n,(\ell_1,\ell_2)}^{\mathrm{spec}} & \text{if } t_1=t_2=t\,,
\end{cases} \tag{5}
\end{equation}

where $\sigma_{n,t}^{2}$ is a scalar that represents the global level of fluctuation in the multi-spectral slice at time $t$, $\mathbf{C}_n^{\mathrm{spat}}$ is a $K\times K$ covariance matrix encoding the spatial structure of the fluctuations (a $K$-pixel spatial patch corresponds to a 2D square window, so this covariance matrix contains information about 2D spatial structures), and $\mathbf{C}_n^{\mathrm{spec}}$ is an $L\times L$ covariance matrix encoding spectral correlations. To prevent a degeneracy by multiplicative factors, we normalize the covariance matrices $\mathbf{C}_n^{\mathrm{spat}}$ and $\mathbf{C}_n^{\mathrm{spec}}$ such that their traces are equal to $K$ and $L$, respectively. In the covariance model of Eq. (5), multi-spectral slices at different times $t_1$ and $t_2$ are considered uncorrelated (and, thus, mutually independent given the joint Gaussian assumption of Eq. (2)). The time-varying variance parameter $\sigma_{n,t}^{2}$ plays the role of a scale parameter in a compound-Gaussian model (Conte et al., 1995), also known as a Gaussian scale mixture model (Wainwright & Simoncelli, 1999). A large value of parameter $\sigma_{n,t}^{2}$ almost discards the time frame $t$ from the $n$-th 4D patch, which limits the impact of possible outliers and thus makes the estimator more robust (Flasseur et al., 2020a).

The covariance structure given in Eq. (5) corresponds to the following separable covariance matrix:

\begin{equation}
\text{Cov}\!\left[\bm{v}_n\right] = \text{diag}(\bm{\sigma}_n^{2})\otimes\mathbf{C}_n^{\mathrm{spec}}\otimes\mathbf{C}_n^{\mathrm{spat}}\,, \tag{6}
\end{equation}

where $\text{diag}(\bm{\sigma}_n^{2})$ is a $T\times T$ diagonal matrix whose $t$-th diagonal entry is $\sigma_{n,t}^{2}$ and $\otimes$ is the Kronecker matrix product: $\mathbf{A}\otimes\mathbf{B}$, with $\mathbf{A}\in\mathbb{R}^{n\times n}$ and $\mathbf{B}\in\mathbb{R}^{m\times m}$, is the $nm\times nm$ matrix with an $n\times n$ block structure such that the $ij$-th block is the $m\times m$ matrix $A_{ij}\mathbf{B}$. Note that this is equivalent to modeling each multi-spectral slice $\bm{v}_{n,t}\in\mathbb{R}^{KL}$ of $\bm{v}_n$ as a random vector following the compound-Gaussian model $\mathcal{N}(\bm{\mu}_n^{\mathrm{spec}},\sigma_{n,t}^{2}\,\mathbf{C}_n^{\mathrm{spec}}\otimes\mathbf{C}_n^{\mathrm{spat}})$, i.e., the scaled and centered vectors $\frac{1}{\sigma_{n,t}}(\bm{v}_{n,t}-\bm{\mu}_n^{\mathrm{spec}})$ are independent and identically distributed for all $1\leq t\leq T$ according to the centered Gaussian $\mathcal{N}(\bm{0},\mathbf{C}_n^{\mathrm{spec}}\otimes\mathbf{C}_n^{\mathrm{spat}})$.
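The correspondence between the entry-wise model of Eq. (5) and the Kronecker form of Eq. (6) can be verified numerically on a toy example. The sketch below uses deliberately tiny, arbitrary sizes and random matrices; the helper names `random_normalized_cov` and `flat_index` are hypothetical, and the ordering (pixel index fastest, then channel, then time) is the one implied by the block definition of the Kronecker product given above.

```python
import numpy as np

rng = np.random.default_rng(0)
K, L, T = 4, 3, 5   # toy sizes, far smaller than in real ASDI data

def random_normalized_cov(d):
    """Random symmetric positive definite d x d matrix with trace equal to d."""
    A = rng.standard_normal((d, 2 * d))
    C = A @ A.T
    return C * d / np.trace(C)

C_spat = random_normalized_cov(K)
C_spec = random_normalized_cov(L)
sigma2 = rng.uniform(0.5, 2.0, size=T)

# Eq. (6): full KLT x KLT covariance of the 4D patch.
C_n = np.kron(np.diag(sigma2), np.kron(C_spec, C_spat))

def flat_index(k, l, t):
    """Position of entry (k, l, t) in the vectorized patch (k varies fastest)."""
    return (t * L + l) * K + k

# Eq. (5), case t1 = t2 = t: product of the scale and of the two factors.
same_time = C_n[flat_index(1, 2, 3), flat_index(0, 1, 3)]
expected = sigma2[3] * C_spat[1, 0] * C_spec[2, 1]

# Eq. (5), case t1 != t2: zero covariance.
cross_time = C_n[flat_index(1, 2, 3), flat_index(1, 2, 4)]
```

In practice the full matrix `C_n` would not be formed explicitly; only the three factors need to be stored.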

With the structure of the mean vector $\bm{\mu}_n$ given in Eqs. (3) and (4), corresponding to a multi-spectral patch constant through time, there are only $NL$ free parameters to estimate all mean vectors from $\bm{v}\in\mathbb{R}^{NLT}$. The covariance structure defined in Eqs. (5) and (6) leads to $T+K(K+1)/2+L(L+1)/2-2$ free parameters per 4D patch (the $-2$ comes from the two normalization constraints), which leads to approximately $NK/2$ free parameters for the whole ASDI dataset (because $K\gg L$ and $K^{2}/2\gg T$), and is typically one to two orders of magnitude smaller than the total number of measurements in $\bm{v}$ ($K/2$ is typically less than one hundred whereas $LT$ is several thousand). Jointly with an adequate estimation method, the structures assumed in Eqs. (3), (4), (5) and (6) can thus be used to derive a non-stationary model of the nuisance terms.
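For the typical sizes quoted in Sect. 2.2 ($K\approx 13\times 13$, $L\approx 39$, $T\approx 100$), the parameter count of the structured covariance can be evaluated directly. The snippet below is plain arithmetic with illustrative variable names.

```python
K, L, T = 13 * 13, 39, 100   # typical sizes quoted in the text

# Free covariance parameters per 4D patch: T scale factors, two symmetric
# factors, minus the two trace normalization constraints.
per_patch = T + K * (K + 1) // 2 + L * (L + 1) // 2 - 2

# With N / K patches and N * L * T measurements, the number of covariance
# parameters per measurement does not depend on N.
params_per_measurement = per_patch / (K * L * T)
approx_per_measurement = (K / 2) / (L * T)   # the N K / 2 approximation

print(per_patch)                            # prints 15243
print(round(1 / params_per_measurement))    # prints 43
```

With these values there are roughly 43 measurements per covariance parameter, consistent with the one to two orders of magnitude stated above.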

2.3 Estimation of the model parameters

The estimation of the parameters of a separable covariance model has been studied in several previous works from the signal-processing community; see, for example, Lu & Zimmerman (2005); Genton (2007); Werner et al. (2008). We build on these works and introduce several additional elements specific to high-contrast imaging: (i) whereas most works consider decompositions of the covariance matrix as a Kronecker product of two factors, we also include in Eqs. (5) and (6) the temporal scaling factors $\sigma_{n,t}$ for increased robustness (Flasseur et al., 2020a, 2022); (ii) given the limited number of samples, we replace maximum likelihood estimates by shrinkage covariance estimators (Ledoit & Wolf, 2004; Chen et al., 2010; Flasseur et al., 2024) to ensure that all estimated covariance matrices are positive definite and to reduce estimation errors; (iii) to account for the superimposition of a component of interest and nuisance terms, we develop a joint estimation strategy in Sect. 3.3 based on the estimation technique developed in this section.

2.3.1 Maximum likelihood estimators

A first possibility is to determine the parameters of the model of the nuisance statistics so as to maximize the likelihood of the data knowing the object of interest $\bm{u}$. Given the considered problem and the assumed independence of the patches, this amounts to minimizing the following co-log-likelihood:

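\begin{equation}
\mathscr{L} = \sum_{n\in\mathbb{K}} \mathscr{L}_n\!\left(\bm{\mu}_n^{\mathrm{spec}},\big\{\sigma_{n,t}^{2}\big\}_{t\in 1:T},\mathbf{C}_n^{\mathrm{spec}},\mathbf{C}_n^{\mathrm{spat}},\bm{u}\right)\,, \tag{7}
\end{equation}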

where $\mathscr{L}_n$ is the co-log-likelihood of the patch at location $n$:

\begin{equation}
\mathscr{L}_n\!\left(\bm{\mu}_n^{\mathrm{spec}},\big\{\sigma_{n,t}^{2}\big\}_{t\in 1:T},\mathbf{C}_n^{\mathrm{spec}},\mathbf{C}_n^{\mathrm{spat}},\bm{u}\right) = \frac{1}{2}\sum_{t=1}^{T}\left(\left\|\bm{v}_{n,t}-\bm{\mu}_n^{\mathrm{spec}}-[\mathbf{M}\,\bm{u}]_{n,t}\right\|_{\mathbf{C}_{n,t}^{-1}}^{2}+\log\left|\mathbf{C}_{n,t}\right|\right), \tag{8}
\end{equation}

with $\mathbf{C}_{n,t}=\sigma_{n,t}^{2}\,\mathbf{C}_n^{\mathrm{spec}}\otimes\mathbf{C}_n^{\mathrm{spat}}$ the assumed covariance of the patch data $\bm{v}_{n,t}$ and $\left|\mathbf{C}_{n,t}\right|$ its determinant. The term $\mathbf{M}\,\bm{u}$ accounts for the contribution of the object of interest in the data; the linear model matrix $\mathbf{M}$ is detailed in Sect. 3.1. The maximum likelihood estimators (MLEs) of the parameters of the nuisance statistics are then given by:

\begin{align}
\widehat{\bm{\mu}}_n^{\,\mathrm{spec}} &= \operatorname*{arg\,min}_{\bm{\mu}_n^{\mathrm{spec}}} \mathscr{L}_n\!\left(\bm{\mu}_n^{\mathrm{spec}},\big\{\widehat{\sigma}_{n,t}^{2}\big\}_{t\in 1:T},\widehat{\mathbf{C}}_n^{\mathrm{spec}},\widehat{\mathbf{C}}_n^{\mathrm{spat}},\widehat{\bm{u}}\right) \notag\\
&= \frac{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}\,\left(\bm{v}_{n,t}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\right)}{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}}\,, \tag{9}\\
\widehat{\sigma}_{n,t}^{2} &= \operatorname*{arg\,min}_{\sigma_{n,t}^{2}} \mathscr{L}_n\!\left(\widehat{\bm{\mu}}_n^{\mathrm{spec}},\big\{\sigma_{n,t'}^{2}\big\}_{t'\in 1:T},\widehat{\mathbf{C}}_n^{\mathrm{spec}},\widehat{\mathbf{C}}_n^{\mathrm{spat}},\widehat{\bm{u}}\right) \notag\\
&= \frac{1}{K\,L}\left\|\bm{v}_{n,t}-\widehat{\bm{\mu}}_n^{\mathrm{spec}}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\right\|_{\big(\widehat{\mathbf{C}}_n^{\mathrm{spec}}\big)^{-1}\otimes\big(\widehat{\mathbf{C}}_n^{\mathrm{spat}}\big)^{-1}}^{2}\,, \tag{10}
\end{align}
𝐂^nspecsuperscriptsubscript^𝐂𝑛spec\displaystyle\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}over^ start_ARG bold_C end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT =argmin𝐂nspecn(𝝁^nspec,{σ^n,t2}t1:T,𝐂nspec,𝐂^nspat,𝒖^)absentsubscriptargminsuperscriptsubscript𝐂𝑛specsubscript𝑛superscriptsubscript^𝝁𝑛specsubscriptsuperscriptsubscript^𝜎𝑛𝑡2:𝑡1𝑇superscriptsubscript𝐂𝑛specsuperscriptsubscript^𝐂𝑛spat^𝒖\displaystyle=\operatorname*{arg\,min}_{\mathbf{C}_{n}^{\mathrm{spec}}}% \mathscr{L}_{n}\!\left(\widehat{\bm{\mu}}_{n}^{\mathrm{spec}},\big{\{}\widehat% {\sigma}_{n,t}^{2}\big{\}}_{t\in 1:T},\mathbf{C}_{n}^{\mathrm{spec}},\widehat{% \mathbf{C}}_{n}^{\mathrm{spat}},\widehat{\bm{u}}\right)= start_OPERATOR roman_arg roman_min end_OPERATOR start_POSTSUBSCRIPT bold_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT end_POSTSUBSCRIPT script_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( over^ start_ARG bold_italic_μ end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT , { over^ start_ARG italic_σ end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_t ∈ 1 : italic_T end_POSTSUBSCRIPT , bold_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT , over^ start_ARG bold_C end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spat end_POSTSUPERSCRIPT , over^ start_ARG bold_italic_u end_ARG )
=1TKt=1T𝐕^n,t(σ^n,t2𝐂^nspat)1𝐕^n,t,absent1𝑇𝐾superscriptsubscript𝑡1𝑇superscriptsubscript^𝐕𝑛𝑡topsuperscriptsuperscriptsubscript^𝜎𝑛𝑡2superscriptsubscript^𝐂𝑛spat1subscript^𝐕𝑛𝑡\displaystyle=\frac{1}{T\,K}\sum_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}^{\top}% \left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}% \right)^{-1}\,\widehat{\mathbf{V}}_{n,t}\,,= divide start_ARG 1 end_ARG start_ARG italic_T italic_K end_ARG ∑ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT over^ start_ARG bold_V end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT ( over^ start_ARG italic_σ end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT over^ start_ARG bold_C end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spat end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT over^ start_ARG bold_V end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT , (11)
𝐂^nspatsuperscriptsubscript^𝐂𝑛spat\displaystyle\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}over^ start_ARG bold_C end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spat end_POSTSUPERSCRIPT =argmin𝐂nspatn(𝝁^nspec,{σ^n,t2}t1:T,𝐂^nspec,𝐂nspat,𝒖^)absentsubscriptargminsuperscriptsubscript𝐂𝑛spatsubscript𝑛superscriptsubscript^𝝁𝑛specsubscriptsuperscriptsubscript^𝜎𝑛𝑡2:𝑡1𝑇superscriptsubscript^𝐂𝑛specsuperscriptsubscript𝐂𝑛spat^𝒖\displaystyle=\operatorname*{arg\,min}_{\mathbf{C}_{n}^{\mathrm{spat}}}% \mathscr{L}_{n}\!\left(\widehat{\bm{\mu}}_{n}^{\mathrm{spec}},\big{\{}\widehat% {\sigma}_{n,t}^{2}\big{\}}_{t\in 1:T},\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}% ,\mathbf{C}_{n}^{\mathrm{spat}},\widehat{\bm{u}}\right)= start_OPERATOR roman_arg roman_min end_OPERATOR start_POSTSUBSCRIPT bold_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spat end_POSTSUPERSCRIPT end_POSTSUBSCRIPT script_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( over^ start_ARG bold_italic_μ end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT , { over^ start_ARG italic_σ end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_t ∈ 1 : italic_T end_POSTSUBSCRIPT , over^ start_ARG bold_C end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT , bold_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spat end_POSTSUPERSCRIPT , over^ start_ARG bold_italic_u end_ARG )
=1TLt=1T𝐕^n,t(σ^n,t2𝐂^nspec)1𝐕^n,t,absent1𝑇𝐿superscriptsubscript𝑡1𝑇subscript^𝐕𝑛𝑡superscriptsuperscriptsubscript^𝜎𝑛𝑡2superscriptsubscript^𝐂𝑛spec1superscriptsubscript^𝐕𝑛𝑡top\displaystyle=\frac{1}{T\,L}\sum_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}\left(% \widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-% 1}\,\widehat{\mathbf{V}}_{n,t}^{\top}\,,= divide start_ARG 1 end_ARG start_ARG italic_T italic_L end_ARG ∑ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT over^ start_ARG bold_V end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT ( over^ start_ARG italic_σ end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT over^ start_ARG bold_C end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_spec end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT over^ start_ARG bold_V end_ARG start_POSTSUBSCRIPT italic_n , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT , (12)

with $\widehat{\bm{u}}$ the estimator of the object of interest, and where $\widehat{\mathbf{V}}_{n,t}$ is a $K\times L$ matrix corresponding to the residual multi-spectral patch at pixel $n$ and time $t$: at row $k$ and column $\ell$, it is equal to $\big[\bm{v}_{n,t}-\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\big]_{k,\ell}$. The complete derivation of these expressions is given in Appendix A. These equations are generally interdependent, which affects the optimization strategy; see Sect. 3.3.

Figure 2: MLEs of the spatial and spectral correlation matrices given by Eqs. (11)–(12) computed in the four regions of interest, indicated by small colored squares in Fig. 1(a), with each matrix corresponding to its respective color-coded region. The angular separation with respect to the star (i.e., image center) increases from the region in (a), close to the star, to the region in (d), which is farther away. Dataset: SAO 206462 (2015-05-15), see Table 2 for the observation parameters.

The multi-spectral mean $\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}$ in Eq. (9) is obtained by weighted averaging, with weights inversely proportional to the patch-wise variance $\sigma_{n,t}^{2}$: this limits the impact of outliers. The patch-wise variance $\sigma_{n,t}^{2}$ in Eq. (10) corresponds to the average squared deviation to the mean, computed after spatial and spectral whitening. The estimator $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ of the spectral covariance given in Eq. (11) is the sample covariance of the residuals $\widehat{\mathbf{V}}_{n,t}^{\top}$ whitened for the spatial covariances by $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$.
Conversely, the estimator $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ of the spatial covariance given in Eq. (12) corresponds to the sample covariance of the residuals $\widehat{\mathbf{V}}_{n,t}$ whitened for the spectral covariances by $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$.
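The down-weighting of perturbed frames by the weighted average of Eq. (9) can be illustrated with a minimal NumPy sketch. The function name `weighted_mean`, the array layout `(T, K, L)` and the synthetic values are assumptions made for illustration only; the object contribution $[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}$ is assumed to have been subtracted from the patches beforehand.

```python
import numpy as np

def weighted_mean(patches, sigma2):
    """Inverse-variance weighted mean over time, in the spirit of Eq. (9).

    patches: array (T, K, L) of patches (object term assumed already removed).
    sigma2:  array (T,) of patch-wise variances.
    """
    weights = 1.0 / np.asarray(sigma2, dtype=float)
    # Contract the time axis: sum_t w_t * patches[t], then normalize.
    return np.tensordot(weights, patches, axes=1) / weights.sum()

# Hypothetical data: three quiet frames and one strongly perturbed frame.
patches = np.ones((4, 3, 2))
patches[3] = 100.0
sigma2 = np.array([1.0, 1.0, 1.0, 1.0e4])

mu_weighted = weighted_mean(patches, sigma2)  # stays close to 1
mu_plain = patches.mean(axis=0)               # pulled towards the outlier
```

In this toy case the perturbed frame receives a weight $10^{4}$ times smaller than the others, so it barely moves the weighted mean, whereas the plain average is dominated by it.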

In practice, the whitening operation by either $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ or $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ is done by first computing the Cholesky decompositions $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}=\mathbb{W}_{n}^{\mathrm{spec}}\big(\mathbb{W}_{n}^{\mathrm{spec}}\big)^{\top}$ and $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}=\mathbb{W}_{n}^{\mathrm{spat}}\big(\mathbb{W}_{n}^{\mathrm{spat}}\big)^{\top}$, with $\mathbb{W}_{n}^{\mathrm{spec}}$ and $\mathbb{W}_{n}^{\mathrm{spat}}$ triangular matrices. We then compute $\left(\mathbb{W}_{n}^{\mathrm{spec}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}^{\top}$ and $\left(\mathbb{W}_{n}^{\mathrm{spat}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}$, which respectively amount to spectral and spatial whitening of the residuals $\widehat{\mathbf{V}}_{n,t}$. We finally take the sample covariances of these whitened residuals.
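The whitening step described above can be sketched as follows. This is a minimal NumPy illustration on synthetic data, not the authors' implementation: the helper name `whiten`, the matrix sizes and the random inputs are assumptions. It checks that the Gram matrix of the spatially whitened residual equals the summand of Eq. (11).

```python
import numpy as np

def whiten(cov, residual):
    """Return W^{-1} @ residual, where cov = W @ W.T with W lower triangular."""
    chol = np.linalg.cholesky(cov)          # lower-triangular Cholesky factor W
    return np.linalg.solve(chol, residual)  # solves W z = residual for z

rng = np.random.default_rng(0)
K, L = 5, 4                                 # hypothetical patch and channel counts
a = rng.standard_normal((K, K))
c_spat = a @ a.T + K * np.eye(K)            # synthetic positive-definite matrix
sigma2 = 2.0                                # synthetic patch-wise variance
v = rng.standard_normal((K, L))             # synthetic residual patch (K x L)

z = whiten(sigma2 * c_spat, v)              # spatially whitened residual
gram = z.T @ z                              # L x L summand of Eq. (11)
direct = v.T @ np.linalg.inv(sigma2 * c_spat) @ v
```

A dedicated triangular solver (for instance `scipy.linalg.solve_triangular`) would exploit the structure of the Cholesky factor; `np.linalg.solve` is used here only to keep the sketch free of additional dependencies.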

From the expressions in Eqs. (9)-(12), it is not possible to derive a closed-form expression of each parameter that does not also depend on the other parameters (i.e., estimators (9)-(12) are interdependent). Yet, these formulae can be applied alternately until convergence, a method called flip-flop by Lu & Zimmerman (2005), who report a faster convergence than when maximizing the log-likelihood with an iterative optimization algorithm (Newton’s method).
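The alternating scheme can be sketched as below. This is a hedged NumPy illustration rather than the implementation used in this work: the function name `flip_flop`, the fixed iteration count, the identity initialization and the synthetic data are all assumptions, and the object term $[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}$ is assumed to have been subtracted from the patches. The quadratic form of Eq. (10) is evaluated through the identity $\|\mathrm{vec}(\mathbf{V})\|^{2}_{\mathbf{A}^{-1}\otimes\mathbf{B}^{-1}}=\operatorname{tr}(\mathbf{V}^{\top}\mathbf{B}^{-1}\mathbf{V}\mathbf{A}^{-1})$, valid for symmetric $\mathbf{A}$ and $\mathbf{B}$.

```python
import numpy as np

def flip_flop(patches, n_iter=20):
    """Alternate the closed-form updates (9)-(12) on patches of shape (T, K, L)."""
    T, K, L = patches.shape
    sigma2 = np.ones(T)
    c_spat, c_spec = np.eye(K), np.eye(L)
    for _ in range(n_iter):
        w = 1.0 / sigma2
        # Eq. (9): inverse-variance weighted mean over time.
        mu = np.tensordot(w, patches, axes=1) / w.sum()
        res = patches - mu                       # residuals V_t, shape (T, K, L)
        i_spat, i_spec = np.linalg.inv(c_spat), np.linalg.inv(c_spec)
        # Eq. (10): tr(V_t^T C_spat^-1 V_t C_spec^-1) / (K L), for each t.
        sigma2 = np.einsum('tkl,kj,tjm,ml->t', res, i_spat, res, i_spec) / (K * L)
        w = 1.0 / sigma2
        # Eq. (11): spectral covariance from spatially whitened residuals.
        c_spec = np.einsum('t,tkl,kj,tjm->lm', w, res, i_spat, res) / (T * K)
        # Eq. (12): spatial covariance from spectrally whitened residuals.
        i_spec = np.linalg.inv(c_spec)
        c_spat = np.einsum('t,tkl,lm,tjm->kj', w, res, i_spec, res) / (T * L)
    return mu, sigma2, c_spec, c_spat

# Synthetic example: T = 40 frames, K = 4 pixels, L = 3 channels.
rng = np.random.default_rng(0)
data = rng.standard_normal((40, 4, 3))
data *= rng.uniform(0.5, 2.0, size=40)[:, None, None]   # frame-dependent scale
mu, sigma2, c_spec, c_spat = flip_flop(data)
```

Note that the factors $\sigma_{n,t}^{2}$, $\mathbf{C}_{n}^{\mathrm{spec}}$ and $\mathbf{C}_{n}^{\mathrm{spat}}$ are only defined up to reciprocal scalings in such a separable model; the sketch does not fix this normalization, and a convergence test would replace the fixed iteration count in practice.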

Figure 2 illustrates the spatial and spectral covariance matrices estimated under this model from an ASDI dataset of the VLT/SPHERE-IFS instrument. MLEs of the spatial and spectral correlation matrices were computed with the flip-flop method for the four regions of interest indicated by small colored squares in Fig. 1(a). To compare matrices with very different variances, we normalized each covariance $\text{Cov}[a,b]$ by $\sqrt{\text{Cov}[a,a]\,\text{Cov}[b,b]}$, i.e., we show the correlation coefficients. Due to the vectorization of 2D spatial patches, the spatial correlations display a blocky structure. The spatial correlations within a patch globally decrease with the 2D distance between pixels. They are stronger in the area (a), which is the closest to the star. Spectral correlations are also much stronger close to the star. As can be observed in Fig. 1(a), after the scaling transform applied in the pre-processing step, regions far from the star are not seen at the longest wavelengths. The size of multi-spectral patches extracted in these regions is reduced from $KL$ to $KL_{\text{eff}}$ pixels (with $L_{\text{eff}}<L$ the effective number of wavelengths seen at location $n$) and the size of the spectral covariance matrix $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ is reduced accordingly, from $L\times L$ to $L_{\text{eff}}\times L_{\text{eff}}$.
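The normalization used to display correlation coefficients can be written in a few lines. The sketch below assumes NumPy; the function name `covariance_to_correlation` and the example matrix are illustrative choices.

```python
import numpy as np

def covariance_to_correlation(cov):
    """Divide Cov[a, b] by sqrt(Cov[a, a] * Cov[b, b]) for every pair (a, b)."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

# Hypothetical 2 x 2 covariance with very different variances.
cov = np.array([[4.0, 2.0],
                [2.0, 9.0]])
corr = covariance_to_correlation(cov)   # unit diagonal, off-diagonal 2 / (2 * 3)
```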

2.3.2 Shrinkage estimator of covariances

Given that the numbers $T$ of exposures and $L$ of spectral channels are limited, the empirical covariance estimates $\widehat{\mathbf{C}}^{\mathrm{spat}}_{n}$ and $\widehat{\mathbf{C}}^{\mathrm{spec}}_{n}$ (indifferently $\widehat{\mathbf{C}}_{n}$ in the following) are very noisy (when $T\simeq K$ or $L_{\text{eff}}\simeq K$) and can even be rank-deficient (in particular when $T<K$ or $L_{\text{eff}}<K$). To reduce the estimation error on $\widehat{\mathbf{C}}_{n}$ and ensure its positive definiteness, shrinkage techniques combine the maximum likelihood estimator with another estimator of smaller variance (Ledoit & Wolf, 2004). As in our previous works (Flasseur et al., 2018, 2020b, 2021, 2023a, 2023b), we consider the convex combination between the low-bias/high-variance sample covariance $\widehat{\mathbf{C}}_{n}$ and a high-bias/low-variance matrix $\widehat{\mathbf{F}}_{n}$:

\begin{equation}
\widetilde{\mathbf{C}}_{n}=\gamma\big((1-\widetilde{\rho}_{n})\,\widehat{\mathbf{C}}_{n}+\widetilde{\rho}_{n}\,\widehat{\mathbf{F}}_{n}\big)\,, \tag{13}
\end{equation}

with $\widehat{\mathbf{F}}_{n}=\mathrm{Diag}(\widehat{\mathbf{C}}_{n})$ a diagonal matrix such that $[\widehat{\mathbf{F}}_{n}]_{i,i}=[\widehat{\mathbf{C}}_{n}]_{i,i}$, $\widetilde{\rho}_{n}\in[0,1]$, and $\gamma$ a factor introduced to compensate for the fact that $\widehat{\mathbf{C}}$ is a biased estimate of the true (and unknown) covariance $\mathbf{C}$. The estimator $\widetilde{\mathbf{C}}_{n}$ defined in Eq. (13) shrinks off-diagonal values (i.e., the covariances) of $\widehat{\mathbf{C}}_{n}$ towards 0 (by multiplication by the factor $1-\widetilde{\rho}_{n}$) and leaves diagonal values (i.e., the sample variances) unchanged. By controlling the shrinkage amount, the hyper-parameter $\widetilde{\rho}_{n}$ plays a critical role, as it sets a bias-variance trade-off.
Compared to other regularization techniques such as diagonal loading (i.e., adding a small fraction of the identity matrix to $\widehat{\mathbf{C}}_{n}$), definition (13) is attractive because it is data-driven: it locally adapts to the fluctuations observed in the non-stationary data and to the number of samples (in particular, we have $L_{\text{eff}}\neq L$ on the borders of the field of view). Such a shrinkage estimator is thus well-suited to imaging systems suffering from non-stationary perturbations.
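To illustrate how the convex combination of Eq. (13) restores positive definiteness, consider the following minimal NumPy sketch. The function name `shrink_covariance`, the fixed value `rho=0.3`, the choice `gamma=1` and the synthetic samples are assumptions for illustration; in the method, $\widetilde{\rho}_{n}$ and $\gamma$ are set in a data-driven way.

```python
import numpy as np

def shrink_covariance(c_hat, rho, gamma=1.0):
    """Convex combination of c_hat and its diagonal, following Eq. (13)."""
    f_hat = np.diag(np.diag(c_hat))        # diagonal target F = Diag(C_hat)
    return gamma * ((1.0 - rho) * c_hat + rho * f_hat)

# Hypothetical under-sampled case: T = 3 samples of dimension K = 6.
rng = np.random.default_rng(1)
samples = rng.standard_normal((3, 6))
c_hat = samples.T @ samples / 3            # rank at most 3, hence singular
c_tilde = shrink_covariance(c_hat, rho=0.3)

min_eig_before = np.linalg.eigvalsh(c_hat).min()    # numerically zero
min_eig_after = np.linalg.eigvalsh(c_tilde).min()   # strictly positive
```

Because $\widehat{\mathbf{C}}_{n}$ is positive semi-definite and its diagonal is positive definite whenever all sample variances are non-zero, any $\widetilde{\rho}_{n}>0$ yields an invertible matrix.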

It remains to find the optimal level of shrinkage $\widetilde{\rho}_{n}$ appropriate for each patch location $n$. An optimal setting can be defined based on risk minimization between the true covariance $\mathbf{C}_{n}$ and its shrunk counterpart $\widetilde{\mathbf{C}}_{n}$ (Ledoit & Wolf, 2004). However, such an oracle estimator cannot be used in practice since $\mathbf{C}_{n}$ is unknown. In a recent work (Flasseur et al., 2024), we derived a practical closed-form expression for a quasi-optimal setting that asymptotically approximates the oracle (for readability, the patch index $n$ is omitted in the following equations):

\begin{equation}
\widetilde{\rho}=\frac{(\gamma\nu+\epsilon-1)\big(\operatorname{tr}(\widehat{\mathbf{C}}^{2})-\sum_{i}[\widehat{\mathbf{C}}]_{i,i}^{2}\big)+\gamma\eta\big(\operatorname{tr}^{2}(\widehat{\mathbf{C}})-\sum_{i}[\widehat{\mathbf{C}}]_{i,i}^{2}\big)}{\gamma\nu\big(\operatorname{tr}(\widehat{\mathbf{C}}^{2})-\sum_{i}[\widehat{\mathbf{C}}]_{i,i}^{2}\big)}\,, \tag{14}
\end{equation}

with:

\begin{align}
\epsilon &= \frac{\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-4}}{\left(\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-2}\right)^{2}}\,, \tag{15}\\
\zeta &= \frac{\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-6}}{\left(\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-2}\right)^{3}}\,, \tag{16}\\
\gamma &= (1-\epsilon)^{-1}\,, \tag{17}\\
\nu &= 1-\epsilon-2\,\zeta+2\,\epsilon^{2}\,, \tag{18}\\
\eta &= \epsilon-2\,\zeta+\epsilon^{2}\,. \tag{19}
\end{align}

This analytic solution depends solely on the sample covariance $\widehat{\mathbf{C}}_{n}$ and on the patch variances $\{\widehat{\sigma}_{n,t}^{2}\}_{t=1:T}$ introduced in the MLEs (9)-(12) to improve robustness against outliers. In addition, formulae (14)-(19) explicitly account for the use of $\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}$ as an empirical estimate of the true unknown mean $\bm{\mu}_{n}^{\mathrm{spec}}$ (Flasseur et al., 2024). It is worth noting that the shrinkage technique developed in this paragraph is general: it does not rely on the specific covariance structure of our problem, namely, the spatio-spectral separability of the covariance.
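Formulae (14)-(19) translate directly into code. The sketch below assumes NumPy; the function name `shrinkage_factor` and the example inputs are illustrative, and the final clipping of $\widetilde{\rho}$ to $[0,1]$ is a safeguard added here, not a step stated in the text. For $T$ equal patch variances, the formulae reduce to $\epsilon=1/T$, $\gamma\nu=1$ and $\gamma\eta=1/T$, which gives a simple sanity check.

```python
import numpy as np

def shrinkage_factor(c_hat, sigma2):
    """Evaluate Eqs. (14)-(19) from a sample covariance and patch variances."""
    inv = 1.0 / np.asarray(sigma2, dtype=float)     # sigma_t^{-2}
    total = inv.sum()
    eps = np.sum(inv**2) / total**2                 # Eq. (15)
    zeta = np.sum(inv**3) / total**3                # Eq. (16)
    gamma = 1.0 / (1.0 - eps)                       # Eq. (17)
    nu = 1.0 - eps - 2.0 * zeta + 2.0 * eps**2      # Eq. (18)
    eta = eps - 2.0 * zeta + eps**2                 # Eq. (19)
    sum_diag2 = np.sum(np.diag(c_hat)**2)
    off = np.trace(c_hat @ c_hat) - sum_diag2       # tr(C^2) - sum_i C_ii^2
    num = (gamma * nu + eps - 1.0) * off \
        + gamma * eta * (np.trace(c_hat)**2 - sum_diag2)
    rho = num / (gamma * nu * off)                  # Eq. (14)
    # Assumption of this sketch: enforce rho in [0, 1] by clipping.
    return float(np.clip(rho, 0.0, 1.0)), gamma

# Hypothetical inputs: T = 10 frames with equal variances, 2 x 2 covariance.
c_hat = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
rho, gamma = shrinkage_factor(c_hat, np.ones(10))
```

The denominator of Eq. (14) vanishes when $\widehat{\mathbf{C}}$ is exactly diagonal; this degenerate case is not handled in the sketch.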

In the following, the shrunk covariance is given by:

\begin{equation}
\widetilde{\mathbf{C}}=\mathbf{\Psi}\odot\widehat{\mathbf{C}}\,, \tag{20}
\end{equation}

where $\odot$ denotes the Hadamard (element-wise) product, and $\mathbf{\Psi}$ is a weighting matrix whose diagonal entries are 1 and whose off-diagonal entries are $1-\widetilde{\rho}$, with $\widetilde{\rho}$ given by Eq. (14).
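The Hadamard form of Eq. (20) is convenient to implement, as the short NumPy sketch below suggests. The function name `hadamard_shrink` and the numerical values are assumptions for illustration. The example also checks that this form coincides with the convex combination $(1-\widetilde{\rho})\,\widehat{\mathbf{C}}+\widetilde{\rho}\,\mathrm{Diag}(\widehat{\mathbf{C}})$.

```python
import numpy as np

def hadamard_shrink(c_hat, rho):
    """Shrink off-diagonal entries of c_hat by (1 - rho), as in Eq. (20)."""
    psi = np.full(c_hat.shape, 1.0 - rho)   # off-diagonal weights 1 - rho
    np.fill_diagonal(psi, 1.0)              # diagonal weights equal to 1
    return psi * c_hat                      # element-wise (Hadamard) product

# Hypothetical 3 x 3 sample covariance and shrinkage factor.
c_hat = np.array([[2.0, 0.5, 0.1],
                  [0.5, 1.0, 0.3],
                  [0.1, 0.3, 1.5]])
rho = 0.4
c_tilde = hadamard_shrink(c_hat, rho)
convex = (1.0 - rho) * c_hat + rho * np.diag(np.diag(c_hat))
```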

2.3.3 Shrunk spatio-spectral covariance

To introduce the shrinkage with our particular factorization of the spatio-spectral covariance as $\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}}$ (see Eq. (6)), we propose to apply the shrinkage on each of the components $\mathbf{C}_{n}^{\mathrm{spec}}$ and $\mathbf{C}_{n}^{\mathrm{spat}}$ separately. Furthermore, following the prescription in Flasseur et al. (2021), we estimate the shrinkage factors once at the initialization of the reconstruction algorithm. As a consequence, in subsequent steps, the shrinkage factors depend neither on the object of interest $\bm{u}$ nor on the nuisance statistics defined in Eqs. (9)–(12). This amounts to rewriting the MLEs in Eqs. (9)–(12) as:

\begin{align}
\widetilde{\bm{\mu}}_{n}^{\,\mathrm{spec}} &= \frac{\sum_{t=1}^{T}\widetilde{\sigma}_{n,t}^{-2}\,\left(\bm{v}_{n,t}-[\mathbf{M}\,\widetilde{\bm{u}}]_{n,t}\right)}{\sum_{t=1}^{T}\widetilde{\sigma}_{n,t}^{-2}}, \tag{21}\\
\widetilde{\sigma}_{n,t}^{2} &= \tfrac{1}{KL}\left\|\bm{v}_{n,t}-\widetilde{\bm{\mu}}_{n}^{\,\mathrm{spec}}-[\mathbf{M}\,\widetilde{\bm{u}}]_{n,t}\right\|_{\big(\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}, \tag{22}\\
\widehat{\mathbf{C}}_{n}^{\mathrm{spec}} &= \tfrac{1}{TK}\sum_{t=1}^{T}\widetilde{\mathbf{V}}_{n,t}^{\top}\left(\widetilde{\sigma}_{n,t}^{2}\,\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}\right)^{-1}\,\widetilde{\mathbf{V}}_{n,t}, \tag{23}\\
\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}} &= \mathbf{\Psi}^{\mathrm{spec}}_{n}\odot\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}, \tag{24}\\
\widehat{\mathbf{C}}_{n}^{\mathrm{spat}} &= \tfrac{1}{TL}\sum_{t=1}^{T}\widetilde{\mathbf{V}}_{n,t}\left(\widetilde{\sigma}_{n,t}^{2}\,\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-1}\,\widetilde{\mathbf{V}}_{n,t}^{\top}, \tag{25}\\
\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}} &= \mathbf{\Psi}^{\mathrm{spat}}_{n}\odot\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}, \tag{26}
\end{align}

where $\widetilde{\mathbf{V}}_{n,t}$ is defined as in Eqs. (9)–(12) but with $\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}$ replaced by $\widetilde{\bm{\mu}}_{n}^{\,\mathrm{spec}}$ and $\widehat{\bm{u}}$ by $\widetilde{\bm{u}}$, and where $\mathbf{\Psi}^{\mathrm{spec}}_{n}$ and $\mathbf{\Psi}^{\mathrm{spat}}_{n}$ are computed according to Eq. (20) for the respective sample covariances $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ and $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ given by Eqs. (23) and (25), as estimated during the initialization stage of the reconstruction algorithm.
The sample covariances $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ and $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ in Eqs. (23) and (25) differ from their MLE counterparts in Eqs. (11) and (12) in that the whitening now accounts for the shrinkage. The assumed separable model of the covariance now takes the form $\widetilde{\mathbf{C}}_{n}=\text{diag}(\widetilde{\bm{\sigma}}_{n}^{2})\otimes\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}\otimes\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}$.
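One pass over Eqs. (21)–(26) for a single patch location can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The function names are hypothetical. The residuals $\bm{v}_{n,t}-[\mathbf{M}\,\widetilde{\bm{u}}]_{n,t}$ are assumed to be stored as $K\times L$ matrices (spatial pixels by spectral channels) and vectorized column-wise, so that the weighted norm of Eq. (22) equals $\mathrm{tr}\big[(\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}})^{-1}\,\widetilde{\mathbf{V}}_{n,t}^{\top}(\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}})^{-1}\,\widetilde{\mathbf{V}}_{n,t}\big]$. The shrinkage factors are taken as given (frozen after initialization), the order of the updates within the pass is one possible choice, and explicit inverses are used for readability only.

```python
import numpy as np

def shrink(C_hat, rho):
    """Eq. (20): ones on the diagonal, (1 - rho) elsewhere, applied element-wise."""
    Psi = np.full(C_hat.shape, 1.0 - rho)
    np.fill_diagonal(Psi, 1.0)
    return Psi * C_hat

def update_nuisance(R, sigma2, C_spec, C_spat, rho_spec, rho_spat):
    """One pass over Eqs. (21)-(26) for a single patch location n.

    R       : (T, K, L) residuals v_{n,t} - [M u]_{n,t}, one K x L matrix per frame
    sigma2  : (T,) current temporal variances
    C_spec  : (L, L) current shrunk spectral covariance
    C_spat  : (K, K) current shrunk spatial covariance
    rho_*   : shrinkage factors, assumed already estimated (hypothetical inputs)
    """
    T, K, L = R.shape
    w = 1.0 / sigma2
    mu = np.tensordot(w, R, axes=1) / w.sum()             # Eq. (21), weighted mean
    V = R - mu                                            # centered patches
    W = np.linalg.inv(C_spat) @ V                         # C_spat^{-1} V_t
    S = np.einsum("tkl,tkm->tlm", V, W)                   # V_t^T C_spat^{-1} V_t
    sigma2 = np.einsum("tlm,ml->t", S, np.linalg.inv(C_spec)) / (K * L)  # Eq. (22)
    C_spec_hat = (S / sigma2[:, None, None]).sum(axis=0) / (T * K)       # Eq. (23)
    C_spec = shrink(C_spec_hat, rho_spec)                 # Eq. (24)
    U = V @ np.linalg.inv(C_spec)                         # V_t C_spec^{-1}
    P = np.einsum("tkl,tjl->tkj", U, V)                   # V_t C_spec^{-1} V_t^T
    C_spat_hat = (P / sigma2[:, None, None]).sum(axis=0) / (T * L)       # Eq. (25)
    C_spat = shrink(C_spat_hat, rho_spat)                 # Eq. (26)
    return mu, sigma2, C_spec, C_spat
```

Calling `update_nuisance` repeatedly, feeding its outputs back as inputs, gives a simple fixed-point iteration; the Kronecker product of size $KL\times KL$ is never formed explicitly.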

3 Reconstruction of the component of interest

3.1 Direct model

We extend the forward model developed for ADI in Flasseur et al. (2021, 2022) by including the spectral dimension. Since the whole ASDI sequence is acquired within a short time (a few hours of observations during a single night), we assume the component of interest (e.g., circumstellar disk and potential exoplanets) does not evolve during the observations: its proper rotation around the host star and its photometric evolution are negligible at such short time scales. The multi-spectral image of this component is simply described by the vector $\bm{u}\in\mathbb{R}_{+}^{N^{\prime}L}$ of its pixel values, and there is no temporal dimension in this spatio-spectral reconstruction. Due to the apparent rotation of the field of view during the ASDI sequence, the number $N^{\prime}$ of pixels in each spectral band of the reconstruction should be greater than $N$ to model any part of the disk seen within the sensor field of view in at least one exposure.

The contribution of $\bm{u}$ to the data $\bm{v}$ is modeled by $\mathbf{M}\,\bm{u}$ with the linear operator:

$$\mathbf{M}=\begin{pmatrix}\mathbf{M}_{1}\\ \vdots\\ \mathbf{M}_{T}\end{pmatrix}\text{ and }\mathbf{M}_{t}=\mathbf{S}\,\mathbf{Z}\,\mathbf{A}\,\mathbf{B}_{t}\,\mathbf{R}_{t}\,, \tag{27}$$

where $\mathbf{M}_{t}$, the model for the $t$-th frame, accounts for several instrumental effects:

  • a rotation $\mathbf{R}_{t}$ applied to all off-axis sources due to the pupil-tracking mode (the field of view rotates while the residual star light remains fixed), implemented as a sparse interpolation matrix,

  • a blur $\mathbf{B}_{t}$ due to the instrument, modeled as a 2D discrete convolution by the off-axis point spread function (PSF),

  • an attenuation $\mathbf{A}$, very strong on the optical axis, then quickly decreasing (due to the coronagraph), modeled as a diagonal matrix (Flasseur et al., 2021),

  • the absence of measurements outside the spatial extension of the sensor (a non-square area due to the instrumental design of the integral field spectrograph), modeled as a diagonal matrix $\mathbf{Z}$ that replaces values outside the sensor area by zeros and keeps other values unchanged (i.e., zero-padding),

  • a last, time-invariant transform $\mathbf{S}$ produced by the image scaling applied during the pre-processing step, corresponding to a sparse interpolation matrix.

With the VLT/SPHERE instrument, the off-axis point spread function (PSF) is quite stable and its core is almost rotation invariant, leading to the approximation $\mathbf{B}_{t}\,\mathbf{R}_{t}\approx\mathbf{R}_{t}\,\mathbf{B}$. The model given in Eq. (27) can thus be approximated by:

$$\mathbf{M}\,\bm{u}\approx\begin{pmatrix}\mathbf{F}_{1}\\ \vdots\\ \mathbf{F}_{T}\end{pmatrix}\mathbf{B}\,\bm{u}\,, \tag{28}$$

where $\mathbf{B}$ is a time-invariant blurring operator and $\mathbf{F}=\{\mathbf{F}_{t}\}_{t=1:T}$ are sparse matrices that perform rotations, scalings, and attenuations according to the transmission of the coronagraph and the sensor field of view. The model in Eq. (28) is only approximate: it neglects possible anisotropies or temporal evolutions of the PSF. Thanks to this approximation, a single convolution of the multi-spectral dataset is performed instead of $T$ convolutions. This accelerates the numerical evaluation of the forward model by one to two orders of magnitude, which is critical to achieving reconstructions on datasets on the order of a few hours. We verified through numerical simulations that the impact of these approximations on the reconstructions is negligible (less than 1%) in practice for VLT/SPHERE data. If approximation (28) does not hold (e.g., for instruments that do not produce a stable off-axis PSF, or if the latter is not rotation invariant), the full model (27) can be evaluated, at each iteration of the optimization procedure (see Sect. 3.2), on a random subset of temporal frames using stochastic gradient descent (e.g., with the Adam optimizer; Kingma & Ba, 2014). The stochasticity of this procedure reduces both memory consumption and computation time, and leads to an approximate solution. Based on simulated disks and off-axis PSFs, we observe a typical relative difference of less than 5% on the reconstructed flux distribution obtained with the two strategies: (i) approximate model without stochastic optimization versus (ii) full model with stochastic optimization. In the following, we use strategy (i) only, given that approximation (28) can be made with VLT/SPHERE data.
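The difference between Eqs. (27) and (28) can be illustrated on a toy problem. The sketch below is deliberately simplified and all names in it are hypothetical: rotations are restricted to quarter turns (so that no interpolation is needed), the PSF is an isotropic Gaussian on a periodic grid, the scaling $\mathbf{S}$ is omitted, and a single spectral channel is used. Under these assumptions the blur commutes exactly with the rotation, so the one-blur model reproduces the $T$-blur model. With real parallactic angles and a measured PSF the equality only holds approximately, as stated above.

```python
import numpy as np

def gaussian_psf(n, fwhm):
    """Isotropic Gaussian PSF on a periodic n x n grid, centered on pixel (0, 0)."""
    d = np.minimum(np.arange(n), n - np.arange(n))  # periodic distance to the origin
    s = fwhm / 2.3548                               # FWHM to standard deviation
    h = np.exp(-(d[:, None] ** 2 + d[None, :] ** 2) / (2 * s ** 2))
    return h / h.sum()

def blur(x, psf):
    """Circular 2D convolution computed with the FFT (toy operator B)."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(psf)))

def full_model(u, psf, quarter_turns, atten, mask):
    """Toy version of Eq. (27) without S: M_t u = Z A B R_t u, one blur per frame."""
    return np.stack([mask * atten * blur(np.rot90(u, k), psf) for k in quarter_turns])

def fast_model(u, psf, quarter_turns, atten, mask):
    """Toy version of Eq. (28): a single blur, then F_t = Z A R_t for each frame."""
    bu = blur(u, psf)
    return np.stack([mask * atten * np.rot90(bu, k) for k in quarter_turns])
```

The fast variant calls `blur` once instead of once per frame, which is where the speed-up discussed above comes from.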

3.2 Regularized inversion

We reconstruct the component of interest using a penalized maximum likelihood approach, i.e., by solving the following numerical optimization problem:

$$\widehat{\bm{u}}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\big\{\mathscr{C}(\bm{\Omega},\bm{u})\equiv\mathscr{L}\left(\bm{\Omega},\bm{u}\right)+\mathscr{R}(\bm{u})\big\}, \tag{29}$$

where $\bm{\Omega}=\Big\{\bm{\mu}_{n}^{\mathrm{spec}},\bm{\sigma}_{n}^{2},\mathbf{C}_{n}^{\mathrm{spec}},\mathbf{C}_{n}^{\mathrm{spat}}\Big\}_{n\in\mathbb{K}}$ represents the parameters of the statistical model of the nuisances, the co-log-likelihood $\mathscr{L}$ is given in Eqs. (6)–(8), and $\mathscr{R}(\bm{u})$ is a regularization term to favor plausible reconstructions $\bm{u}$. We selected a combination of two regularization functions applying to the same $\bm{u}$: an edge-preserving one that favors smooth images with sharp edges co-located at all wavelengths, and a sparsity-inducing $\text{L}^{1}$ norm. The regularization reads:

\begin{align}
\mathscr{R}(\bm{u}) &= \beta_{\text{smooth}}\sum_{n=1}^{N^{\prime}}\sqrt{\tfrac{1}{L}\sum_{\ell=1}^{L}\left\lVert\mathbf{D}_{n,\ell}\,\bm{u}\right\rVert_{2}^{2}+\tau^{2}} \nonumber\\
&\quad+\beta_{\text{sparse}}\sum_{n=1}^{N^{\prime}}\sum_{\ell=1}^{L}|u_{n,\ell}|, \tag{30}
\end{align}

where $\mathbf{D}_{n,\ell}\,\bm{u}\approx\mathbf{\nabla}_{\!n}\bm{u}_{:,\ell}$ approximates by finite differences the 2D spatial gradient of $\bm{u}$ at pixel $n$ in the $\ell$-th spectral channel, and $\tau$ is a parameter chosen so as to be negligible compared to the average norm of the spatial gradient where there is a sharp edge (the regularization then approaches an isotropic vectorial total variation; Bresson & Chan, 2008) and similar to the gradient magnitude in smoothly-varying areas (this prevents the appearance of the staircasing effect common with total variation; Charbonnier et al., 1997; Blomgren et al., 1997; Louchet & Moisan, 2008). We illustrate qualitatively through numerical simulations in Sect. 4.4 that these quite classical regularization penalties in image processing remain suited to disks having very different morphologies, like elliptical disks with sharp edges or spiral disks with smooth edges. Hyper-parameters $\beta_{\text{smooth}}$ and $\beta_{\text{sparse}}$ balance the weight of each regularization term with respect to the data-fitting term. Note that, due to the positivity constraint in Eq. (29), the $\text{L}^{1}$ norm $\|\bm{u}\|_{1}$ corresponds to the simple differentiable term $\sum_{n=1}^{N^{\prime}}\sum_{\ell=1}^{L}u_{n,\ell}$ for any feasible object $\bm{u}$, and thus the regularization $\mathscr{R}(\bm{u})$ is differentiable for $\tau\neq 0$ (in practice, we choose $\tau=10^{-6}$).

To solve the smooth constrained optimization problem in Eq. (29), we use a limited-memory quasi-Newton method with bound constraints, VMLM-B (Thiébaut, 2002), which is a more efficient variant of L-BFGS-B (Zhu et al., 1997). To minimize $\mathscr{C}(\bm{\Omega},\bm{u})$ in $\bm{u}$ given $\bm{\Omega}$, the VMLM-B optimizer requires evaluating the cost function $\mathscr{C}(\bm{\Omega},\bm{u})$ and its first derivatives $\nabla_{\bm{u}}\mathscr{C}(\bm{\Omega},\bm{u})$ with respect to $\bm{u}$. The analytic expression of these first derivatives reads, for all $\bm{u}\geq\mathbf{0}$:

\begin{align}
\nabla_{\bm{u}}\mathscr{C}\left(\bm{\Omega},\bm{u}\right) &= \sum_{n\in\mathbb{K}}\underbrace{\mathbf{M}^{\top}\sum_{t=1}^{T}\frac{1}{\sigma_{n,t}^{2}}\,\mathbf{E}_{n,t}^{\top}\,\mathbf{\Gamma}_{n}\,\big[\mathbf{E}_{n,t}\,\mathbf{M}\,\bm{u}+\bm{\mu}_{n}-\bm{v}_{n,t}\big]}_{\nabla_{\bm{u}}\mathscr{L}_{n}\left(\bm{\Omega}_{n},\bm{u}\right)\text{, see Eqs. (6)–(8)}} \nonumber\\
&\quad+\underbrace{\beta_{\mathrm{smooth}}\sum_{n=1}^{N^{\prime}}\frac{\mathbf{D}_{n,\ell}^{\top}\,\mathbf{D}_{n,\ell}\,\bm{u}}{\sqrt{\frac{1}{L}\sum_{\ell=1}^{L}\left\lVert\mathbf{D}_{n,\ell}\,\bm{u}\right\rVert_{2}^{2}+\tau^{2}}}+\beta_{\mathrm{sparse}}\,\bm{1}}_{\nabla_{\bm{u}}\mathscr{R}(\bm{u})\text{, see Eq. (30)}}, \tag{31}
\end{align}

where $\mathbf{1}$ is an array of the same size as $\bm{u}$ filled with ones, $\mathbf{E}_{n,t}$ is the $K\,L\times K\,L\,T$ operator that extracts a multi-spectral patch at spatial location $n$ and time frame $t$ (by extension of its definition introduced in Sect. 2.1), $\bm{\Omega}_{n}$ denotes the subset of the statistical model parameters for the $n$-th patch, and $\mathbf{\Gamma}_{n}$ is equal to $\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}$.
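The regularization of Eq. (30) and its gradient can be sketched as below for an object stored as an array of shape $(H, W, L)$ with $HW=N^{\prime}$. This is an illustration under assumptions of our own, not the authors' code: the function name is hypothetical, $\mathbf{D}_{n,\ell}$ is taken to be a forward finite difference with a zero gradient on the last row and column, and the object is assumed non-negative so that the $\text{L}^{1}$ norm reduces to a plain sum. Differentiating Eq. (30) term by term, each pixel contributes a factor $1/L$ and a sum over the spectral channels $\ell$, which the sketch makes explicit.

```python
import numpy as np

def regularization(u, beta_smooth, beta_sparse, tau):
    """Value and gradient of Eq. (30) for a non-negative float cube u of shape (H, W, L)."""
    L = u.shape[-1]
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1] = u[1:] - u[:-1]            # forward differences along rows
    gy[:, :-1] = u[:, 1:] - u[:, :-1]   # forward differences along columns
    # Per-pixel smoothed gradient magnitude, shared by all spectral channels.
    mag = np.sqrt((gx ** 2 + gy ** 2).sum(axis=-1) / L + tau ** 2)
    value = beta_smooth * mag.sum() + beta_sparse * u.sum()  # L1 norm = sum for u >= 0
    # Chain rule: weight the differences, then apply the adjoint of the differences.
    wx = gx / (L * mag[..., None])
    wy = gy / (L * mag[..., None])
    grad = np.zeros_like(u)
    grad[1:] += wx[:-1]
    grad[:-1] -= wx[:-1]
    grad[:, 1:] += wy[:, :-1]
    grad[:, :-1] -= wy[:, :-1]
    return value, beta_smooth * grad + beta_sparse
```

Because the magnitude under the square root is pooled over the $L$ channels, an edge present in one channel lowers the cost of placing an edge at the same pixel in the others, which is what favors co-located edges.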

Solving the problem in Eq. (29) yields an estimator $\widetilde{\bm{u}}$ of the object of interest given the parameters $\bm{\Omega}$ of the statistical model. We next consider different strategies to jointly obtain estimators of these parameters from the same dataset.

3.3 Joint estimation of all unknowns from the data

Formally, the estimators of the object of interest $\bm{u}$ and of the parameters $\bm{\Omega}$ of the nuisance statistics provided by REXPACO ASDI are the ones for which Eqs. (21)–(26) and (29) jointly hold. Solving this system of non-linear equations is intrinsically difficult because there is no closed-form solution (at least due to the non-negativity constraint for $\bm{u}$) and because of the interdependence of the equations. In the following sub-sections, we develop practical algorithms to iteratively solve this system of equations.

3.3.1 Alternating strategy

Even though there is no joint closed-form solution to the set of equations (21)–(26) and (29), we note that each of these equations readily provides an estimator of some unknowns when the rest of the unknowns are fixed. This property can be exploited to solve the set of equations (21)–(26) by the following alternating strategy. Given the object $\bm{u}$, the parameters $\bm{\Omega}$ can be estimated by repeatedly applying Eqs. (21)–(26) in turn until convergence to a so-called fixed point solution. This procedure is applied to each patch to estimate all the nuisance parameters. We denote the resulting parameters as $\widetilde{\bm{\Omega}}(\bm{u})$ in the following. A first possible algorithm to find the solution is then:

1. Let $i=0$ and initially assume a null object $\widetilde{\bm{u}}^{[0]}=\bm{0}$.
2. Estimate nuisance statistics $\widetilde{\bm{\Omega}}^{[i+1]}=\widetilde{\bm{\Omega}}\big(\bm{u}^{[i]}\big)$ as the fixed point solution of Eqs. (21)–(26) for the current estimate of the object $\bm{u}^{[i]}$. If $i=0$, also include Eq. (14) in the fixed point method to determine the shrinkage factors $\widetilde{\rho}^{\,\mathrm{spec}}$ and $\widetilde{\rho}^{\,\mathrm{spat}}$. These factors define $\mathbf{\Psi}^{\mathrm{spec}}$ and $\mathbf{\Psi}^{\mathrm{spat}}$ for all subsequent iterations, i.e., for $i>0$.
3. Update the object $\widetilde{\bm{u}}^{[i+1]}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\mathscr{C}\big(\widetilde{\bm{\Omega}}^{[i+1]},\bm{u}\big)$ by applying the reconstruction algorithm described in Sect. 3.2.
4. Let $i\leftarrow i+1$ and, unless estimators $\widetilde{\bm{\Omega}}^{[i]}$ and $\widetilde{\bm{u}}^{[i]}$ have converged, go to step 2.

In practice, we assume the algorithm reaches convergence when the condition $\big\lVert\widetilde{\bm{u}}^{[i+1]}-\widetilde{\bm{u}}^{[i]}\big\rVert\leq\eta\,\big\lVert\widetilde{\bm{u}}^{[i+1]}\big\rVert$ is satisfied, with $\eta=10^{-6}$.
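The four steps and the stopping rule above can be sketched on a toy problem. The following is only an illustration under strong simplifying assumptions, not the REXPACO ASDI implementation: the data follow a hypothetical scalar model $v_t = m + u\,h_t$, where the offset `m` stands in for the nuisance statistics and the amplitude `u` (constrained to be non-negative) stands in for the object. The function name `alternate` and the model are introduced here for illustration.

```python
def alternate(v, h, eta=1e-6, max_iter=1000):
    """Alternating estimation for the toy model v[t] = m + u * h[t].

    m stands in for the nuisance statistics and u >= 0 for the object.
    Each update is the exact minimizer of sum((v - m - u*h)**2) with the
    other variable held fixed (a Gauss-Seidel-like sweep).
    """
    n = len(v)
    hh = sum(x * x for x in h)
    u = 0.0  # Step 1: null object
    m = 0.0
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Step 2: nuisance update for the current object estimate
        m = sum(vt - u * ht for vt, ht in zip(v, h)) / n
        # Step 3: object update under the positivity constraint
        u_new = max(0.0, sum(ht * (vt - m) for vt, ht in zip(v, h)) / hh)
        # Step 4: relative-change stopping rule with tolerance eta
        converged = abs(u_new - u) <= eta * abs(u_new)
        u = u_new
        if converged:
            break
    return u, m, n_iter


# Noise-free toy data generated with m = 2 and u = 3
h = [0.0, 1.0, 2.0, 3.0]
v = [2.0 + 3.0 * ht for ht in h]
u_hat, m_hat, n_iter = alternate(v, h)
```

Even on this two-parameter problem the sweep needs several tens of iterations to meet the tolerance, because each update only partially corrects for the coupling between the two unknowns.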

This first algorithm implements a simple alternating strategy which is equivalent, for non-linear equations, to the Gauss–Seidel method for solving a system of linear equations. The alternating method converges slowly due to the need for multiple reconstructions of the object of interest, which are progressively refined in each iteration of Step 3. This process represents the primary computational bottleneck (the computational cost of estimating nuisance statistics is negligible by comparison). However, as discussed in the following subsections, the computational efficiency of this estimation strategy can be significantly improved.

3.3.2 Partially hierarchical optimization

Noting that the joint solution of Eqs. (21) and (22) only depends on the object and on the spatial and spectral covariances, we introduce the following auxiliary cost function:

$$
\begin{aligned}
\mathscr{D}\big(\bm{u},\big\{\mathbf{C}^{\mathrm{spec}}_{n},\mathbf{C}^{\mathrm{spat}}_{n}\big\}_{n\in\mathbb{K}}\big)
&=\min_{\substack{\{\bm{\mu}^{\mathrm{spec}}_{n}\}_{n\in\mathbb{K}}\\ \{\sigma_{n,t}^{2}\}_{n\in\mathbb{K},\,t\in 1:T}}}\mathscr{C}(\bm{\Omega},\bm{u})\\
&=\mathscr{C}(\bm{\Omega},\bm{u})\,\Big|_{\substack{\bm{\mu}^{\mathrm{spec}}_{n}\,=\,\widetilde{\bm{\mu}}^{\,\mathrm{spec}}_{n}(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n})\\ \bm{\sigma}_{n}^{2}\,=\,\widetilde{\bm{\sigma}}^{2}_{n}(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n})}}
\end{aligned}
\tag{38}
$$

In practice, for each patch $n$ and given the object $\bm{u}$ and the covariances $\mathbf{C}^{\mathrm{spec}}_{n}$ and $\mathbf{C}^{\mathrm{spat}}_{n}$, the estimators $\widetilde{\bm{\mu}}^{\,\mathrm{spec}}_{n}\big(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n}\big)$ and $\widetilde{\bm{\sigma}}^{2}_{n}\big(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n}\big)$ are obtained by applying Eqs. (21) and (22) iteratively until convergence to a fixed point.
Such estimators define a stationary point of $\mathscr{C}$ with respect to the parameters $\bm{\mu}_{n}$ and $\bm{\sigma}_{n}$; the corresponding partial derivatives of $\mathscr{C}$ are therefore null. Hence, by the chain rule, the derivatives of the auxiliary function $\mathscr{D}$ with respect to $\bm{u}$ are simply given by $\nabla_{\bm{u}}\mathscr{C}$ in Eq. (31) evaluated at the stationary point. Thanks to this property, solving:

$$
\widetilde{\bm{u}}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\mathscr{D}\big(\bm{u},\big\{\widetilde{\mathbf{C}}^{\mathrm{spec}}_{n},\widetilde{\mathbf{C}}^{\mathrm{spat}}_{n}\big\}_{n\in\mathbb{K}}\big) \tag{39}
$$

can be done similarly to solving the constrained reconstruction problem in Eq. (29), that is with a quasi-Newton method like VMLM-B (Thiébaut, 2002).
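The chain-rule argument used above can be written out explicitly. As a shorthand introduced here only for illustration, let $\bm{\theta}$ collect the parameters $\{\bm{\mu}^{\mathrm{spec}}_{n}\}$ and $\{\sigma^{2}_{n,t}\}$ that are minimized out in Eq. (38), and let $\widetilde{\bm{\theta}}(\bm{u})$ denote their fixed-point estimate for the given covariances, assumed differentiable in $\bm{u}$:

```latex
\nabla_{\bm{u}}\mathscr{D}
  = \nabla_{\bm{u}}\mathscr{C}\,\Big|_{\bm{\theta}=\widetilde{\bm{\theta}}(\bm{u})}
  + \Big(\frac{\partial\widetilde{\bm{\theta}}}{\partial\bm{u}}\Big)^{\!\top}
    \underbrace{\nabla_{\bm{\theta}}\mathscr{C}\,\Big|_{\bm{\theta}=\widetilde{\bm{\theta}}(\bm{u})}}_{=\,\bm{0}\ \text{(stationary point)}}
  = \nabla_{\bm{u}}\mathscr{C}\,\Big|_{\bm{\theta}=\widetilde{\bm{\theta}}(\bm{u})}
```

The second term vanishes whatever the sensitivity $\partial\widetilde{\bm{\theta}}/\partial\bm{u}$ is, so this sensitivity never needs to be computed.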

Minimizing the auxiliary function $\mathscr{D}$ instead of $\mathscr{C}$, the estimators are obtained by the following algorithm:

1. Let $i=0$, assume a null object $\widetilde{\bm{u}}^{[0]}=\bm{0}$, and initialize model statistics $\widetilde{\bm{\Omega}}^{[0]}$ as in Step 2 of the first iteration of the algorithm given in Sect. 3.3.1.
2. Update the object by minimizing the auxiliary cost function: $\widetilde{\bm{u}}^{[i+1]}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\mathscr{D}\Big(\bm{u},\big\{\widetilde{\mathbf{C}}^{\mathrm{spec}\,[i]}_{n},\widetilde{\mathbf{C}}^{\mathrm{spat}\,[i]}_{n}\big\}_{n\in\mathbb{K}}\Big)$.
3. Update the nuisance statistics: $\widetilde{\bm{\Omega}}^{[i+1]}=\widetilde{\bm{\Omega}}\big(\bm{u}^{[i+1]}\big)$.
4. Let $i\leftarrow i+1$ and, unless estimators $\widetilde{\bm{\Omega}}^{[i]}$ and $\widetilde{\bm{u}}^{[i]}$ have converged, go to step 2.

As for the alternating strategy presented in Sect. 3.3.1, we assume that the partially hierarchical optimization scheme reaches convergence when the condition $\big\lVert\widetilde{\bm{u}}^{[i+1]}-\widetilde{\bm{u}}^{[i]}\big\rVert\leq\eta\,\big\lVert\widetilde{\bm{u}}^{[i+1]}\big\rVert$ is satisfied, with $\eta=10^{-6}$.

It may be noted that the estimators $\widetilde{\bm{\mu}}^{\,\mathrm{spec}\,[i+1]}_{n}$ and $\widetilde{\sigma}^{2\,[i+1]}_{n,t}$ can also be considered as a by-product of the minimization of $\mathscr{D}$ in Step 2 of the above algorithm. Hence, Step 3 can be modified to restrict the updating of the nuisance statistics to that of the covariances $\widetilde{\mathbf{C}}^{\mathrm{spec}}_{n}$ and $\widetilde{\mathbf{C}}^{\mathrm{spat}}_{n}$ ($\forall n\in\mathbb{K}$), e.g. by finding a fixed point of Eqs. (23)–(26).

The hierarchical optimization in Step 2 yields estimates such that Eqs. (21), (22), and (29) jointly hold for given covariance matrices. As a result, the convergence speed is improved compared to the algorithm described in Sect. 3.3.1.

3.3.3 Fully hierarchical approximation

In principle, all the parameters could be found by solving:

$$
\widetilde{\bm{u}}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\big\{\mathscr{F}(\bm{u})\equiv\mathscr{C}\big(\widetilde{\bm{\Omega}}(\bm{u}),\bm{u}\big)\big\}, \tag{45}
$$

and taking $\widetilde{\bm{\Omega}}=\widetilde{\bm{\Omega}}(\widetilde{\bm{u}})$. The estimator $\widetilde{\bm{\Omega}}(\bm{u})$ of the nuisance statistics is however not truly a stationary point of $\mathscr{C}$ for the covariance matrices $\widetilde{\mathbf{C}}^{\mathrm{spec}}_{n}$ and $\widetilde{\mathbf{C}}^{\mathrm{spat}}_{n}$ although it is a stationary point for the other parameters of the nuisance statistics. We nevertheless make the following approximation:

$$
\nabla_{\bm{u}}\mathscr{F}(\bm{u})\approx\left.\nabla_{\bm{u}}\mathscr{C}(\bm{\Omega},\bm{u})\right|_{\bm{\Omega}=\widetilde{\bm{\Omega}}(\bm{u})}, \tag{46}
$$

since, under this approximation, the constrained problem in Eq. (45) can be solved by a quasi-Newton method such as VMLM-B (Thiébaut, 2002).

In practice, we verified numerically that the approximation in Eq. (46) holds to a numerical precision that is sufficient to achieve the convergence of the quasi-Newton method. We also verified that the fully alternating strategy described in Sect. 3.3.1 and the fully hierarchical approach assuming Eq. (46) both converge to the same estimators. The fully hierarchical approach is however much faster than the algorithms presented in Sects. 3.3.1 and 3.3.2. For example, the approximate fully hierarchical algorithm reduces the computational load of the alternating strategy by a factor comparable to the number of iterations $i$ required to reach convergence with the algorithm described in Sect. 3.3.1 (ranging from 30 to 100 in practice). Consequently, we exclusively employed the approximate fully hierarchical optimization strategy throughout this paper and recommend it as the preferred method for estimating parameters in REXPACO ASDI.
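One way such a gradient approximation could be checked numerically is a finite-difference test, sketched below on a toy problem. This is an assumption about the kind of check involved, not a description of the one actually performed. The toy cost, the scalar nuisance `m`, and all function names are hypothetical. In this toy the nuisance estimate is an exact stationary point of the cost, so the two gradients agree to rounding error; with shrunk covariances, as in Eq. (46), only approximate agreement should be expected.

```python
def cost(m, u, v, h):
    """Toy cost C(m, u) = 0.5 * sum((v - m - u*h)**2)."""
    return 0.5 * sum((vt - m - u * ht) ** 2 for vt, ht in zip(v, h))


def nuisance(u, v, h):
    """Closed-form minimizer of the toy cost in m for a given u."""
    return sum(vt - u * ht for vt, ht in zip(v, h)) / len(v)


def profiled_cost(u, v, h):
    """F(u) = C(m~(u), u): the nuisance is re-estimated for every u."""
    return cost(nuisance(u, v, h), u, v, h)


def partial_gradient(u, v, h):
    """dC/du evaluated at m = m~(u), ignoring the dependence of m~ on u."""
    m = nuisance(u, v, h)
    return -sum(ht * (vt - m - u * ht) for vt, ht in zip(v, h))


def finite_difference(u, v, h, step=1e-4):
    """Central finite-difference estimate of dF/du."""
    upper = profiled_cost(u + step, v, h)
    lower = profiled_cost(u - step, v, h)
    return (upper - lower) / (2.0 * step)


# Arbitrary toy data: compare the two gradients at u = 1
v = [2.3, 4.9, 8.2, 10.8]
h = [0.0, 1.0, 2.0, 3.0]
gap = abs(partial_gradient(1.0, v, h) - finite_difference(1.0, v, h))
```

The benefit of the partial gradient is its cost: it requires one nuisance estimate per evaluation, whereas the finite-difference gradient requires two per component of $\bm{u}$.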

3.4 Unsupervised setting of the regularization hyper-parameters

Figure 3: (figure modified) Effect of the setting of the regularization parameters on REXPACO ASDI reconstructions of a simulated elliptical disk at contrast level $\alpha_{\text{gt}}=1\times 10^{-5}$: (a) Comparison between the MSE (left; Eq. (47)) and the SURE criterion (right; Eq. (50)). The star symbol represents the minimum of the two criteria. We observe that it corresponds to the same setting of $\bm{\beta}$ in both cases: $\widetilde{\beta}_{\text{sparse}}^{\text{MSE}}=\widetilde{\beta}_{\text{sparse}}^{\text{SURE}}=7.5\times 10^{5}$ and $\widetilde{\beta}_{\text{smooth}}^{\text{MSE}}=\widetilde{\beta}_{\text{smooth}}^{\text{SURE}}=1\times 10^{7}$. The square and circle symbols respectively represent examples of an under-regularization (i.e., $\bm{\beta}<\widetilde{\bm{\beta}}^{\text{MSE}}$) and of an over-regularization (i.e., $\bm{\beta}>\widetilde{\bm{\beta}}^{\text{MSE}}$). Panel (b) shows the reconstructed flux distribution $\widetilde{\bm{u}}$ for the three settings of the regularization hyper-parameters symbolized in panel (a), see corresponding symbols on top of images. The reconstructions can be compared qualitatively to the ground truth $\bm{u}_{\text{gt}}$ given in Fig. 10. Given that the simulated disk has a constant contrast $\alpha_{\text{gt}}$ over the spectral channels, only the spectral mean of the spatio-spectral flux distributions reconstructed by REXPACO ASDI is displayed. Dataset: HD 172555 (2015-07-11), see Sect. 4.1 for the observing parameters.

As in our previous work on the REXPACO algorithm (Flasseur et al., 2021), we propose a strategy to set optimally, and in a data-driven fashion, the hyper-parameters $\bm{\beta}=\{\beta_{\text{smooth}},\beta_{\text{sparse}}\}$ involved in the regularization term $\mathscr{R}$ of Eq. (30). These two free parameters represent the relative weights of the two combined priors on the sought flux distribution and they also set the relative weight of the priors with respect to the data-fidelity term $\mathscr{L}$ defined in Eq. (7). In other words, the hyper-parameters $\bm{\beta}$ set a critical bias-variance trade-off. These hyper-parameters can be tuned manually by trial and error until the reconstruction is qualitatively acceptable, but this approach relies on the user's judgment and the resulting setting is likely not optimal. Instead, we capitalize on the large variety of methods available in the signal processing literature to set regularization hyper-parameters by minimizing a figure of merit, see e.g. Craven & Wahba (1978); Wahba et al. (1985); Stein (1981). One of these criteria is the so-called Stein’s Unbiased Risk Estimator (SURE; Stein, 1981), which we also selected among other metrics in our previous works dedicated to the post-processing of high-contrast observations (Flasseur et al., 2020b, 2021) given its ability to approximate the mean square error (MSE) in the measurement space:

$$
\text{MSE}(\bm{\beta})=\sum_{n\in\mathbb{K}}\sum_{t=1}^{T}\left\|\frac{1}{\widehat{\sigma}_{n,t}^{2}}\mathbf{E}_{n,t}\left(\mathbf{M}\left(\bm{u}_{\text{gt}}-\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})\right)\right)\right\|_{\widetilde{\mathbf{\Gamma}}_{n}}^{2}, \tag{47}
$$

with $\bm{u}_{\text{gt}}$ the unknown ground truth flux distribution and $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ the flux distribution reconstructed from the data $\bm{v}$ using the set of regularization hyper-parameters $\bm{\beta}$. It is shown in the literature (Stein, 1981) that the SURE estimator gives an unbiased estimation of $\text{MSE}(\bm{\beta})$ without requiring the value of the unknown ground truth flux-distribution $\bm{u}_{\text{gt}}$ involved in the MSE (47).

By extending our previous work (Flasseur et al., 2021) to the multi-spectral model of the nuisance and of the object components, the resulting SURE risk estimator can be numerically evaluated by:

$$
\begin{aligned}
\text{SURE}(\bm{\beta})\approx{}&\sum_{n\in\mathbb{K}}\sum_{t=1}^{T}\left\|\frac{1}{\widetilde{\sigma}_{n,t}^{2}}\mathbf{E}_{n,t}\left(\bm{v}-\widetilde{\bm{\mu}}^{\mathrm{spec}}-\mathbf{M}\,\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})\right)\right\|_{\widetilde{\mathbf{\Gamma}}_{n}}^{2}\\
&+(2/\xi)\,\bm{b}^{\top}\,\mathbf{M}\,\left[\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})-\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})\right]-N\,T\,L\,,
\end{aligned}
\tag{50}
$$

where $\bm{b}\in\mathbb{R}^{N^{\prime}\,T\,L}$ is an independent and identically distributed pseudo-random vector of unit variance, and $\xi$ is the amplitude of this perturbation. This expression, like the MSE in Eq. (47), is tailored to our problem: it accounts for the structured model of the covariances of the nuisance (i.e., separable spatially and spectrally), as defined by the matrix $\mathbf{\Gamma}$. It also accounts for our patch-based strategy to model the full covariance through the partition of the image into non-overlapping patches with the operator $\mathbf{E}$. In addition, expression (50) is a practical approximation of the original SURE criterion, which involves the computation of the Jacobian matrix of the mapping $\bm{v}\mapsto\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ with respect to the components of the data $\bm{v}$. Given that there is no closed-form expression for such a term, we approximate it by resorting to finite differences through a Monte-Carlo perturbation of the data, as proposed by Girard (1989) and Ramani et al. (2012). This strategy leads to the approximate expression (50) involving the reconstruction of the two flux distributions $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ and $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})$ obtained respectively from the data $\bm{v}$ and the perturbed counterpart $\bm{v}+\xi\bm{b}$. The optimal setting $\widetilde{\bm{\beta}}^{\text{SURE}}$ of the regularization hyper-parameters $\bm{\beta}$ is obtained by minimizing the SURE score (50) with respect to $\bm{\beta}$.
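The structure of the Monte-Carlo approximation (a residual term, a finite-difference trace term obtained from two reconstructions, and a constant) can be sketched as follows. This is a deliberately simplified illustration, not the criterion of Eq. (50): it assumes an identity forward model, no nuisance mean, and data already whitened to unit-variance white noise, so the operators $\mathbf{M}$, $\mathbf{E}_{n,t}$, and $\widetilde{\mathbf{\Gamma}}_{n}$ do not appear. The functions `monte_carlo_sure` and `shrink` are hypothetical names, and `shrink` is a toy linear reconstruction standing in for the actual solver.

```python
import random


def monte_carlo_sure(v, reconstruct, xi, rng):
    """Monte-Carlo SURE for data v corrupted by unit-variance white noise.

    Simplifying assumptions (not those of the paper): identity forward
    model, no nuisance mean, already-whitened data.
    """
    b = [rng.gauss(0.0, 1.0) for _ in v]  # i.i.d. unit-variance probe vector
    u0 = reconstruct(v)  # reconstruction from the data
    u1 = reconstruct([vi + xi * bi for vi, bi in zip(v, b)])  # from perturbed data
    residual = sum((vi - ui) ** 2 for vi, ui in zip(v, u0))
    # Finite-difference estimate of the trace of the Jacobian of v -> u~(v)
    divergence = sum(bi * (a1 - a0) for bi, a1, a0 in zip(b, u1, u0)) / xi
    return residual + 2.0 * divergence - len(v)


def shrink(beta):
    """Toy reconstruction with quadratic regularization: u~ = v / (1 + beta)."""
    return lambda v: [vi / (1.0 + beta) for vi in v]


# Constant toy signal observed through unit-variance noise
rng = random.Random(0)
truth = [1.0] * 10000
v = [x + rng.gauss(0.0, 1.0) for x in truth]
sure = monte_carlo_sure(v, shrink(1.0), xi=0.1, rng=rng)
mse = sum((x - ui) ** 2 for x, ui in zip(truth, shrink(1.0)(v)))
```

Evaluating `monte_carlo_sure` over a grid of `beta` values and keeping the minimizer mimics the selection of $\widetilde{\bm{\beta}}^{\text{SURE}}$; each grid point costs two calls to `reconstruct`.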

In Fig. 3, we illustrate the benefits of the proposed data-driven setting of the regularization hyper-parameters $\bm{\beta}$ by resorting to the numerical injection of a synthetic elliptical disk within an object-free dataset of the HD 172555 star obtained with the VLT/SPHERE-IFS instrument (see Sect. 4.1 for the description of the dataset). The experiments are conducted for a disk of contrast (see definition in Sect. 1) $\alpha_{\text{gt}}=1\times 10^{-5}$ in every spectral channel. The corresponding ground truth flux distribution $\bm{u}_{\text{gt}}$ to be reconstructed is given in Fig. 10 bottom-left. We start by comparing the SURE criterion (50) to the MSE (47). The tested values of the hyper-parameters are $\beta_{\text{smooth}}\in\left[1\times 10^{2},1\times 10^{10}\right]$ and $\beta_{\text{sparse}}\in\left[7.5\times 10^{0},7.5\times 10^{8}\right]$ with a regular sampling of $\log(\bm{\beta})$. For the computation of the SURE metric, we have to set the value of the parameter $\xi$ involved in Eq. (50), namely the strength of the perturbation $\bm{b}$. We found this value not to be critical, yet it should be neither too small, to prevent errors due to numerical underflows in the computation of the difference $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})-\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$, nor too large, so that the approximation (50) stays valid. As in our previous work on the REXPACO algorithm (Flasseur et al., 2021), we set it empirically to $\xi=0.1\times\text{MAD}(\bm{v})$, where the median absolute deviation $\text{MAD}(\bm{v})=\text{median}(|\bm{v}-\text{median}(\bm{v})|)$ is a robust estimator (up to a constant factor) of the standard deviation of the data $\bm{v}$. Panel (a) of Fig. 3 gives the results of the comparison between MSE and SURE. It illustrates that our custom SURE definition is an accurate proxy of the MSE: the global minimum of the two metrics is obtained for the same tested values of our grid of parameters $\bm{\beta}$. The SURE criterion (50) can thus be safely used to approximate the MSE when facing real cases where the ground truth flux distribution $\bm{u}_{\text{gt}}$ is not available. Panel (b) of Fig. 3 completes this study by showing an example of the reconstructed flux distribution in three cases: an under-regularized reconstruction (i.e., $\widetilde{\bm{\beta}}<\widetilde{\bm{\beta}}^{\text{MSE}}$), the optimal regularization (i.e., $\widetilde{\bm{\beta}}=\widetilde{\bm{\beta}}^{\text{MSE}}=\widetilde{\bm{\beta}}^{\text{SURE}}$), and an over-regularized reconstruction (i.e., $\widetilde{\bm{\beta}}\gg\widetilde{\bm{\beta}}^{\text{MSE}}$). It illustrates the benefits of the regularization with an optimal strength: the reconstructed flux distribution is very similar to the ground truth presented in Fig. 10 bottom-left. The nuisance component is well discarded, even very close to the host star, and the reconstructed disk has sharp edges matching the ground truth. An under-regularization causes a slightly worse rejection of the nuisance component (i.e., a non-null background remains in the reconstruction) and the reconstructed disk exhibits some ripples and non-homogeneous parts. In the opposite case of an over-regularization, the reconstructed flux distribution is severely biased towards zero, which results in important morphological distortions of the disk, in particular due to a too strong promotion of sparsity.
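The empirical rule $\xi=0.1\times\text{MAD}(\bm{v})$ quoted in this paragraph is straightforward to compute. A minimal sketch, assuming the data have been flattened into a plain sequence of numbers (the function name `perturbation_amplitude` is hypothetical):

```python
import statistics


def perturbation_amplitude(v, factor=0.1):
    """Return xi = factor * MAD(v), with MAD the median absolute deviation."""
    center = statistics.median(v)
    mad = statistics.median([abs(x - center) for x in v])
    return factor * mad


# The outlier (100.0) barely affects the result, unlike a standard deviation
xi = perturbation_amplitude([1.0, 2.0, 3.0, 4.0, 100.0])
```

Because it relies on medians, this amplitude is insensitive to a few extreme pixel values, which is the reason for preferring the MAD over a sample standard deviation here.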

By construction, the optimal setting of the hyper-parameters $\bm{\beta}$ by minimizing the SURE criterion (50) requires performing two reconstructions ($\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ and $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})$) for each tested pair $\bm{\beta}$ of hyper-parameters. Given that more than 120 individual reconstructions are presented in the following section to evaluate the performance of the proposed approach, deriving $\widetilde{\bm{\beta}}^{\text{SURE}}$ for each reconstruction would have been an unreasonable computational overhead. We thus chose to evaluate the optimal setting in only one case: the disk of SAO 206462. This computation leads to $\widetilde{\beta}_{\text{sparse}}^{\text{SURE}}=7.5\times 10^{4}$ and $\widetilde{\beta}_{\text{smooth}}^{\text{SURE}}=1\times 10^{6}$. These values are not too far from the optimal ones derived in the numerical experiments of Fig. 3 of this section, which were performed on a totally different dataset.
When facing a new dataset, we thus simply scale these pre-computed values according to the number of frames in the target dataset relative to the dataset of SAO 206462, in order to keep a constant relative weighting between the regularization and the data fidelity terms. We found that this setting was qualitatively acceptable in all our experiments, i.e., no significant artifact was ever observed, either in terms of poor rejection of the nuisance component or in terms of non-physical discontinuities in the disk structures. We recommend using this strategy when processing a large number of datasets. A careful data-dependent and data-driven setting of the hyper-parameters $\bm{\beta}$ with SURE can be reserved for specific cases where the setting seems to be more critical (e.g., in the case of a very faint disk) or to refine the reconstruction obtained with the pre-computed and scaled values of the regularization hyper-parameters.
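The rescaling rule above can be sketched in a few lines of Python. The sketch assumes a simple proportional scaling with the number $T$ of temporal frames, taking the SAO 206462 dataset ($T=63$ in Table 2) as the reference; the exact weighting used by the authors is not spelled out in this section, and the function and constant names are hypothetical.

```python
# Reference values obtained with SURE on SAO 206462 (see text and Table 2).
REF_NUM_FRAMES = 63
REF_BETA_SPARSE = 7.5e4
REF_BETA_SMOOTH = 1.0e6


def scale_hyperparameters(num_frames, ref_num_frames=REF_NUM_FRAMES):
    """Scale the reference hyper-parameters to a dataset with `num_frames` frames.

    Assumption: the data fidelity term grows linearly with the number of
    frames, so the regularization weights are scaled by the same ratio.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    ratio = num_frames / ref_num_frames
    return {
        "sparse": REF_BETA_SPARSE * ratio,
        "smooth": REF_BETA_SMOOTH * ratio,
    }


# Example: the PDS 70 dataset has T = 87 frames (Table 2).
print(scale_hyperparameters(87))
```

The scaled values serve as a starting point only; they can be refined with SURE when a dataset appears more demanding.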

4 Results

4.1 Datasets description

Table 2: Summary of the main observational parameters for the VLT/SPHERE datasets analyzed in this paper. The columns include: target name, ESO survey ID, observation date, number ($L$) of spectral channels, spectral filter band ($\Delta_{\lambda}$), number ($T$) of available temporal frames, total apparent field of view rotation ($\Delta_{\text{par}}$), number of sub-integration exposures (NDIT), individual exposure time (DIT; Detector Integration Time), average coherence time ($\tau_{0}$), average seeing, and the first publication reporting an analysis of the same data. All observations were conducted using the apodized Lyot coronagraph (Carbillet et al., 2011) on the VLT/SPHERE instrument. $^{\text{(a)}}$ The contribution of the three known exoplanets (HR 8799 c, d, e), which are within the SPHERE-IFS field of view, was masked. $^{\text{(b)}}$ While the IRDIS dataset from the same epoch (recorded simultaneously using the IRDIFS-EXT mode of SPHERE) was analyzed in Boccaletti et al. (2021), no reconstruction from the IFS dataset was reported in that study. $^{\text{(c)}}$ The first value is the real amplitude of the parallactic rotation, while the second corresponds to the simulated parallactic rotation used in our experiments with synthetic disk simulations (see Sect. 4.4).
| Target | ESO ID | Obs. date | $L$ | $\Delta_{\lambda}$ (µm) | $T$ | $\Delta_{\text{par}}$ (°) | NDIT | DIT (s) | $\tau_{0}$ (ms) | Seeing (") | Related paper |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SPHERE-IFS data used for validation of the statistical model, see Sect. 4.2 | | | | | | | | | | | |
| HR 8799 $^{\text{(a)}}$ | 095.C-0298(C) | 2015-07-04 | 39 | 0.96-1.64 | 46 | 16.4 | 4 | 64 | 2.3 | 0.94 | Langlois et al. (2021) |
| SPHERE-IFS data used for qualitative analysis by reconstructing known real disks, see Sect. 4.3 | | | | | | | | | | | |
| HR 4796 | 095.C-0298(H) | 2015-02-03 | 39 | 0.96-1.33 | 56 | 48.2 | 4 | 64 | 13.7 | 0.67 | Milli et al. (2017) |
| SAO 206462 | 095.C-0298(A) | 2015-05-15 | 39 | 0.96-1.64 | 63 | 63.7 | 4 | 64 | 8.9 | 0.59 | Maire et al. (2017) |
| MWC 758 | 1100.C-0481(K) | 2018-12-17 | 39 | 0.96-1.33 | 63 | 29.2 | 4 | 96 | 8.3 | 0.98 | Boccaletti et al. (2021) |
| PDS 70 | 1100.C-0481(D) | 2018-02-24 | 39 | 0.96-1.64 | 87 | 93.4 | 3 | 96 | 7.5 | 0.66 | Mesa et al. (2019b) |
| HD 163296 | 1100.C-0481(G) | 2018-05-07 | 39 | 0.96-1.64 | 48 | 14.2 | 3 | 96 | 2.6 | 1.04 | Mesa et al. (2019a) |
| AB Aurigae | 104.20V7.001 | 2020-01-18 | 39 | 0.96-1.64 | 51 | 38.5 | 2 | 64 | 5.6 | 0.71 | this paper $^{\text{(b)}}$ |
| SPHERE-IFS data used for quantitative analysis by reconstructing synthetic disks, see Sect. 4.4 | | | | | | | | | | | |
| HD 172555 | 095.C-0192 | 2015-07-11 | 39 | 0.96-1.33 | 62 | 12.9 / 30.0 $^{\text{(c)}}$ | 8 | 32 | 3.9 | 1.20 | Flasseur et al. (2020b) |
| SPHERE-IRDIS data used to compare ADI and ASDI post-processing, see Sect. 4.5 | | | | | | | | | | | |
| SAO 206462 | 095.C-0298(A) | 2015-05-15 | 2 | 2.11-2.25 | 63 | 63.7 | 4 | 64 | 8.9 | 0.59 | Maire et al. (2017) |

For our comparisons, we selected eight datasets from the SPHERE-IFS instrument, acquired under diverse observing conditions.

First, in Sect. 4.2, we consider a dataset of HR 8799 to assess the relevance of the statistical model proposed in this paper. This emblematic star hosts four known exoplanets, all detected by direct imaging (Marois et al., 2008, 2010), three of which fall within the SPHERE-IFS field of view. After masking the contribution of these point-like sources within the data, we conduct a model ablation analysis to show that it is critical to accurately model the correlations of the nuisance component.

Then, in Sects. 4.2 and 4.3, we consider six additional datasets from stars with previously imaged circumstellar disks. These datasets are used to qualitatively assess the benefits of the proposed algorithm on real disks in comparison to baseline methods. The selected disks are at different evolution stages and have very diverse morphologies. The stars included in the analysis are:
– HR 4796A, which is the primary member of a binary system within the TW Hydrae association with an age of about 12 Myr (Bell et al., 2015). Located at about 72.8 pc (Van Leeuwen, 2007), HR 4796A harbors a debris disk observable in a face-on configuration, initially imaged by the Hubble Space Telescope (Schneider et al., 1999). Subsequently, its morphology and spectroscopy have been studied intensively by direct imaging (Milli et al., 2017, 2019). The disk showcases a slender ring and a high surface brightness hinting at the potential presence of exoplanets, though no companion has been detected yet.
– SAO 206462, which is located within the Upper Centaurus Lupus constellation, has an estimated age of about 9 Myr (Müller et al., 2011). Located at about 157 pc (Brown et al., 2016), it hosts a nearly face-on transition disk imaged both in thermal emission (Doucet et al., 2006) and in scattered light (Grady et al., 2009). It includes two discernible spiral arms, several asymmetric features, and an inner cavity. High-contrast and high-resolution observations suggest that the observed structures may be attributed to the presence of low-mass exoplanets located within the spiral arms or within the inner cavity (Maire et al., 2017).
– MWC 758, which is located within the Taurus association, has an estimated age of about 3.5 Myr. Located at about 156 pc (Brown et al., 2021), it hosts a protoplanetary disk in the form of a spiral with (at least) three arms (Reggiani et al., 2018). Recently, two candidate protoplanets have been proposed based on the post-processing of VLT/SPHERE and LBTI/LMIRCam observations by algorithms dedicated to the detection of point-like sources (Reggiani et al., 2018; Wagner et al., 2023). The first one is interior to the spiral (angular separation about 0.11”) and the second one is exterior to the Southern arm (angular separation about 0.62”). According to numerical models, each of these two massive candidate exoplanets would be able to generate the observed spiral arm (Wagner et al., 2019). However, the existence of these candidate exoplanets remains uncertain given the presence of disk material at their location, which could also lead to disk features being misinterpreted as point-like sources.
– PDS 70, which is located within the Scorpius-Centaurus association, has an estimated age of about 5 Myr (Müller et al., 2018). Located at about 113 pc (Brown et al., 2016), this star is notable for hosting a protoplanetary disk within which two confirmed exoplanets, PDS 70 b and PDS 70 c, are in the process of formation. The exoplanet PDS 70 b was directly imaged using the VLT/SPHERE instrument in the near-infrared (Keppler et al., 2018), while PDS 70 c was unveiled through observations with the VLT/MUSE instrument in $\text{H}_{\alpha}$ (Haffert et al., 2019). A third candidate exoplanet was also recently detected using JWST observations in the near- and mid-infrared (Christiaens et al., 2024). By harboring multiple nascent exoplanets, this system stands as a unique case. Several structures such as arcs, outer and inner gaps, and potential spiral arms, particularly on the north side of the outer disk, were also resolved by direct imaging (Riaud et al., 2006; Keppler et al., 2018; Mesa et al., 2019b; Juillard et al., 2022).
– HD 163296, which is located within the Sagittarius association, has an estimated age of about 5 Myr. Located at about 101.5 pc (Gaia et al., 2018), it hosts a protoplanetary disk with a diameter larger than 1000 au (Isella et al., 2007; Tilling et al., 2012; Muro-Arena et al., 2018). Sub-millimeter observations have shown that this disk harbors multiple rings whose structures are due to variations in the gas pressure (Teague et al., 2018). Moreover, multiple asymmetries in the continuum emission have been observed, which supports the hypothesis of the existence of (yet undetected) sub-stellar companions (Isella et al., 2018). Near-infrared observations with VLT/SPHERE made it possible to set mass limits of about 3-4 MJup at 30 au, 6-7 MJup between 30 and 80 au, and 2-4 MJup beyond 200 au for such plausible exoplanets (Mesa et al., 2019a).
– AB Aurigae, which is located within the Auriga association, has an estimated age of about 4 Myr. Located at about 163 pc (Brown et al., 2016), it hosts a protoplanetary disk with complex spiral features (Boccaletti et al., 2020). Recently, three candidate point-like sources were identified within the circumstellar environment. Two of them were identified from VLT/SPHERE observations (Boccaletti et al., 2020). The first one appears very elongated and is embedded within the Southern spiral arm. The second one is located exterior to the Northern spiral arm and is more similar to a point-like source (while being detectable only from SPHERE-IRDIS data and not from SPHERE-IFS data recorded simultaneously). In addition, these two features are detectable both in polarimetry and total intensity, which suggests that they are more likely due to scattering dust particles (Boccaletti et al., 2020). A third candidate protoplanet was identified by Currie et al. (2022b) from SUBARU/SCExAO data. It behaves as a bright emission source at an angular separation of about 0.59”, interior to a dust ring resolved in millimeter observations. However, given that the candidate exoplanet would be at its first stage of formation, likely still accreting material from the disk, it does not appear as a point-like source, but rather as a very elongated pattern, which makes the detection difficult to confirm. Nevertheless, its location and estimated SED would be compatible with model predictions as a driver of the observed spiral arms (Currie et al., 2022b).

In Sect. 4.4, we quantitatively assess the performance of the proposed algorithm against baseline methods of the field. To this end, we resort to numerical injections of synthetic disks of various morphologies into a real SPHERE-IFS dataset of the star HD 172555 (Schütz et al., 2005; Lisse et al., 2009). To the best of our knowledge, no off-axis objects (either point-like sources or disks) have ever been imaged around this star within the SPHERE-IFS field of view (Nielsen et al., 2008; Nielsen & Close, 2010). We also generate a synthetic vector of parallactic angles (linearly distributed between 0° and 30°) differing from the experimental values, to cancel out any potential signal from (unknown) real objects.
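As a minimal illustration, the synthetic vector of parallactic angles described above can be generated as follows for the $T=62$ frames of the HD 172555 dataset (Table 2). The function name is hypothetical, and the inclusion of both endpoints (0° and 30°) is an assumption.

```python
import numpy as np


def synthetic_parallactic_angles(num_frames, total_rotation_deg=30.0):
    """Angles (in degrees) linearly distributed between 0 and the total rotation."""
    return np.linspace(0.0, total_rotation_deg, num_frames)


angles = synthetic_parallactic_angles(62)
print(angles[:3], angles[-1])
```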

Finally, in Sect. 4.5, we consider an additional dataset from the Infrared Dual-band Imager and Spectrograph (IRDIS; Dohlen et al., 2008) of the SPHERE instrument. Its dual-band mode allows simultaneous imaging in two distinct spectral channels for each individual exposure (Vigan et al., 2014). The selected dataset corresponds to the observation of the star SAO 206462. The IRDIS and IFS data of this star were collected simultaneously using the IRDIFS-EXT mode of the SPHERE instrument (Beuzit et al., 2019). In our previous work with the ADI version of the REXPACO algorithm (Flasseur et al., 2021), we processed this dataset with a mono-spectral approach. In Sect. 4.5, we revisit these data with the proposed REXPACO ASDI algorithm to illustrate the benefits of a joint spectral processing.

Figure 4: Empirical distributions of the centered and whitened patches averaged over the whole field of view for different covariance models. Data used in this figure contain only the contribution from the nuisance component (i.e., no exoplanet or disk). Dataset: HR 8799 (2015-07-04) with the three known exoplanets masked out, see Table 2 for the observation parameters.
Figure 5: (figure modified) Empirical distributions of the centered and whitened patches for different covariance models: diagonal spatial and spectral covariances in the first column; full spatial covariance, diagonal spectral covariance, and temporal weighting in the second column; full spectral covariance, diagonal spatial covariance, and temporal weighting in the third column; and finally, in the last column, the full spatio-spectral separable model introduced in this work. The reported empirical distributions are computed locally: at a location selected at a small angular separation in rows (a) and (b), and at a location selected at a larger angular separation in rows (c) and (d). The corresponding global empirical distributions computed over the whole field of view are given in Fig. 4. Patches represented in this figure contain only the contribution from the nuisance component (i.e., no exoplanet or disk). Dataset: HR 8799 (2015-07-04) with the three known exoplanets masked out, see Table 2 for the observation parameters.

All datasets were calibrated and assembled from SPHERE raw data using the pre-reduction and handling pipeline of the SPHERE consortium (Pavlov et al., 2008). During this step, background, flat-field, bad pixels, registration, true-North, wavelength and astrometric calibrations are performed. These standard pre-processing steps are followed by additional refinements implemented at the SPHERE Data Center (Delorme et al., 2017), aimed at reducing cross-talk, enhancing bad pixel correction, and mitigating spectral cross-talk effects.

Table 2 summarizes the main observation parameters associated with each dataset.

4.2 Validation of the statistical model of the nuisance component

Before evaluating the reconstruction method on high-contrast observations of circumstellar disks, we aim to show that our statistical model of the nuisances is relevant. We use the same ASDI dataset (HR 8799, 2015-07-04) as in Flasseur et al. (2020b), with the three known exoplanets within the SPHERE-IFS field of view masked out so that the resulting data correspond only to the nuisance term. Following a similar analysis as in Flasseur et al. (2020b), Fig. 4 displays the empirical distribution of all patches in the field of view after performing different post-processing. If the random vectors $\bm{v}_{n}$ are accurately modeled by a Gaussian distribution with mean $\bm{\mu}_{n}$ and covariance $\mathbf{C}_{n}$, as described in Eq. (2), the centered and whitened vectors $\mathbf{C}_{n}^{-1/2}(\bm{v}_{n}-\bm{\mu}_{n})$ should follow $\mathcal{N}(\bm{0},\mathbf{I})$, corresponding to the red dashed line in Fig. 4. We thus compare a standard Gaussian distribution with the empirical marginal distribution of $\mathbf{C}_{n}^{-1/2}(\bm{v}_{n}-\bm{\mu}_{n})$ for several covariance models. These comprise the three models considered in Flasseur et al. (2020b), drawn in gray dashed lines: (i) no covariance ($\mathbf{C}_{n}=\mathbf{I}$); (ii) only spatial covariances; (iii) spatial covariances plus temporal and spectral weighting; and four additional models: (iv) diagonal spatial and spectral covariances (i.e., spatial, spectral, and temporal weighting via a separable model); (v) full spatial covariance, diagonal spectral covariance, and temporal weighting; (vi) full spectral covariance, diagonal spatial covariance, and temporal weighting; and finally (vii) the full separable model introduced in this paper, see Eq. (5). As shown by Fig. 4, the full spatio-spectral separable model (green curve) provides the best fit to the empirical distribution (i.e., the green curve closely matches the red dashed line of the standard Gaussian distribution). This justifies the use of the full spatio-spectral separable model in our loss function $\mathscr{L}_{n}$. As an average trend over the field of view, we also observe that neglecting spatial covariances is more detrimental than ignoring spectral covariances, as model (v) better approximates $\mathcal{N}(\bm{0},\mathbf{I})$ than model (vi). Figure 5 completes this study with a more localized examination of the empirical distribution of patches for models (iv)-(vii) across two nuisance regimes: (1) a regime near the star where speckles dominate, and (2) a regime at larger separations where stochastic noise prevails. A similar representation was provided in Fig. 4 of Flasseur et al. (2020b) on this dataset for the three models considered in our previous work for exoplanet detection in angular and spectral differential imaging: (i) no covariance; (ii) spatial covariances only; and (iii) spatial covariances with temporal and spectral weighting. Based on this analysis, the full spatio-spectral separable model introduced in this paper is the most effective at statistically describing the fluctuations of the nuisance component across both noise regimes (i.e., regardless of the distance to the star). Notably, for all models, the empirical distributions of centered and whitened patches follow a Gaussian law with zero mean and unit variance more closely far from the star than near the star. This is to be expected, given that the nuisance is stronger, more correlated, and fluctuates more in the vicinity of the star than farther away; see Fig. 2.
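The whitening test underlying Figs. 4 and 5 can be sketched on simulated data. The following Python example draws Gaussian vectors with a known mean and covariance, applies $\mathbf{C}^{-1/2}(\bm{v}-\bm{\mu})$, and checks that the result has approximately unit variance. The function names and the toy covariance are assumptions for illustration; the separable model of Eq. (5) and the data-driven estimation of $\bm{\mu}_{n}$ and $\mathbf{C}_{n}$ are not reproduced.

```python
import numpy as np


def inverse_sqrt(cov):
    """Symmetric inverse square root C^{-1/2} of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    # V diag(1 / sqrt(lambda)) V^T
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T


def whiten(samples, mean, cov):
    """Center and whiten samples stored as rows: returns (C^{-1/2} (v - mu))^T per row."""
    # C^{-1/2} is symmetric, so right-multiplication applies it to each row.
    return (samples - mean) @ inverse_sqrt(cov)


# Toy data (assumption): correlated Gaussian vectors of dimension 5.
rng = np.random.default_rng(0)
dim = 5
A = rng.normal(size=(dim, dim))
cov = A @ A.T + np.eye(dim)
mean = rng.normal(size=dim)
samples = rng.multivariate_normal(mean, cov, size=20000)

whitened = whiten(samples, mean, cov)
print("empirical mean:", whitened.mean(axis=0))
print("empirical std: ", whitened.std(axis=0))
```

With a mis-specified covariance (e.g., keeping only its diagonal), the whitened samples remain correlated and their joint distribution departs from $\mathcal{N}(\bm{0},\mathbf{I})$, which is the kind of mismatch the ablation study quantifies.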

Figure 6: Ablation study: neglecting spatial and/or spectral correlations of the nuisance component significantly degrades reconstruction quality. The displayed images represent the deconvolved reconstructions $\widetilde{\bm{u}}$. Pseudo-color images are shown with colors selected to cover the infrared spectrum according to the colormaps on the right. Datasets: HR 4796 (2015-02-03), SAO 206462 (2015-05-15) and MWC 758 (2018-12-17), see Table 2 for the observation parameters.

We conclude this ablation study by showing how the reconstruction results are impacted if simpler covariance models are considered rather than the full model of Eq. (5). Figure 6 displays examples on real data of the reconstructed disk component for the same four nuisance models as in Fig. 5 (i.e., models (iv)-(vii)). The datasets of HR 4796 and MWC 758 suffer from a strong nuisance component. Ignoring the spatial correlations leads to severe artifacts: a ghost circular structure is reconstructed and contaminates a large fraction of the field of view. For SAO 206462 and PDS 70, close inspection of the central region reveals spurious structures in all reconstructions except those obtained with the full model (vii) of the nuisance. While ignoring spectral correlations is also harmful (e.g., a bright nuisance halo remains around the MWC 758 disk), its effect is less pronounced compared to omitting spatial correlations, aligning with the findings in Fig. 4, where empirical residual distributions were analyzed across the whole field of view. These qualitative observations emphasize again the value of accurately modeling the nuisance’s spatial and spectral correlations to improve the reconstruction quality.

Figure 7: Spatial distribution of the spatial and spectral shrinkage parameters. Dataset: HR 8799 (2015-07-04), see Table 2 for the observation parameters.

The shrinkage parameters $\widetilde{\rho}_{n}^{\mathrm{spat}}$ and $\widetilde{\rho}_{n}^{\mathrm{spec}}$ can significantly influence the statistical model. We monitor their values by displaying maps of the spatial and spectral shrinkage parameters in Fig. 7 for the dataset shown in Fig. 1. Values of $\widetilde{\rho}_{n}^{\mathrm{spat}}$ and $\widetilde{\rho}_{n}^{\mathrm{spec}}$ remain relatively low (below 0.13), suggesting a moderate bias towards zero and indicating that the off-diagonal sample covariances are only slightly attenuated by the shrinkage. Spectral shrinkage intensifies at the edges of the field of view, where fewer samples are available due to spectral scaling (i.e., $L_{\text{eff}}\leq L$). Conversely, spatial shrinkage is stronger at some locations of the field of view for this dataset, illustrating that a uniform shrinkage value across the entire field of view would be sub-optimal.
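To make the effect of a shrinkage value of 0.13 concrete, the sketch below applies a shrinkage towards the diagonal, $(1-\rho)\,\widehat{\mathbf{S}}+\rho\,\text{diag}(\widehat{\mathbf{S}})$, to a toy sample covariance. This convex-combination form is an assumption consistent with the statement that off-diagonal sample covariances are attenuated; the estimator of $\rho$ used in the paper is not reproduced, and the function name is hypothetical.

```python
import numpy as np


def shrink_covariance(sample_cov, rho):
    """Shrink a sample covariance towards its diagonal with weight rho in [0, 1].

    Diagonal entries are preserved; off-diagonal entries are scaled by (1 - rho).
    """
    sample_cov = np.asarray(sample_cov, dtype=float)
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    target = np.diag(np.diag(sample_cov))
    return (1.0 - rho) * sample_cov + rho * target


# Toy 2x2 sample covariance (assumption) and the largest shrinkage value quoted in the text.
S = np.array([[2.0, 0.8],
              [0.8, 1.0]])
C = shrink_covariance(S, 0.13)
print(C)
```

With $\rho=0.13$, the off-diagonal terms keep 87% of their sample value, which is the sense in which the attenuation is described as slight.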

4.3 Qualitative analysis: reconstruction of disks from SPHERE-IFS data

Table 3: Number of modes optimized for PCA ASDI reconstructions.

Known real disks, see Sect. 4.3

| Target | Number of modes |
|---|---|
| HR 4796 | 18 |
| SAO 206462 | 6 |
| MWC 758 | 4 |
| PDS 70 | 14 |
| HD 163296 | 42 |
| AB Aurigae | 20 |

Synthetic disks, see Sect. 4.4

| Morphology | $\alpha_{\text{gt}}=1\times 10^{-6}$ | $\alpha_{\text{gt}}=5\times 10^{-6}$ | $\alpha_{\text{gt}}=1\times 10^{-5}$ |
|---|---|---|---|
| elliptical disk | 14 | 4 | 4 |
| circular disk | 26 | 10 | 4 |
| spiral disk | 26 | 12 | 4 |
Figure 8: Disk reconstruction results from real data obtained with the SPHERE-IFS instrument. The images in the fourth column show the outputs of the proposed REXPACO ASDI method, which have been re-blurred for direct comparison with the reference methods (i.e., these correspond to $\mathbf{B}\,\widetilde{\bm{u}}$). The deblurred reconstructions $\widetilde{\bm{u}}$ are displayed in the last column (corresponding to the last column of Fig. 6). Pseudo-color images are displayed as in Fig. 6. Datasets: HR 4796 (2015-02-03), SAO 206462 (2015-05-15), MWC 758 (2018-12-17), PDS 70 (2018-02-24), HD 163296 (2018-05-07) and AB Aurigae (2020-01-18), see Table 2 for the observation parameters.

Having established the benefits of the proposed statistical model, we now apply it to the six SPHERE-IFS datasets presented in Sect. 4.1, which correspond to observations of stars hosting known circumstellar disks with diverse morphological structures. These include SAO 206462 (already shown in Fig. 1) and MWC 758, both featuring a spiral disk; HR 4796, which hosts a thin elliptical disk; and PDS 70, AB Aurigae and HD 163296, each hosting a protoplanetary disk of complex shape and several candidate or confirmed exoplanets in formation within the surrounding gas and dust material. Figure 8 presents reconstructions produced by various reference methods alongside those obtained with our method. As the other methods do not perform a deconvolution, we show in the fourth column of Fig. 8 our reconstruction re-blurred at the resolution of the instrument (i.e., $\mathbf{B}\widetilde{\bm{u}}$ instead of $\widetilde{\bm{u}}$). Based on code availability, three standard methods were selected for comparison: (i) median ASDI (Sparks & Ford, 2002; Marois et al., 2006; Thatte et al., 2007), which estimates the nuisance component by temporally and spectrally stacking the observations using medians; (ii) PCA ASDI (Soummer et al., 2012; Amara & Quanz, 2012; Christiaens et al., 2019), which employs principal component analysis to remove the nuisance component; and (iii) PACO ASDI (Flasseur et al., 2020b), originally developed for exoplanet detection from ASDI datasets but also capable of partially reconstructing thin disks, see Sect. 1. For median ASDI and PCA ASDI, we used the Vortex Image Processing package (VIP; Gonzalez et al., 2017; Christiaens et al., 2023; see https://github.com/vortex-exoplanet/VIP), whereas we employed our unsupervised pipeline for PACO ASDI (Flasseur et al., 2020b; see http://doi.org/10.5281/zenodo.3679426 for a frozen implementation). The number of modes in PCA ASDI has been manually optimized, with the selected value being constant across all angular separations (i.e., we applied so-called full-frame PCA ASDI). In practice, we evaluated all possible mode numbers (in increments of two). For experiments involving a synthetic disk with a known ground truth flux distribution $\bm{u}_{\text{gt}}$, we selected the number of modes that minimizes the MSE between the estimate $\widetilde{\bm{u}}$ and the ground truth $\bm{u}_{\text{gt}}$. For real disks, we visually selected the number of modes that best preserves fine structures while effectively removing most of the stellar leakage. Table 3 summarizes the number of modes for PCA ASDI reconstructions of the disks analyzed in this paper. For the other hyper-parameters of median ASDI and PCA ASDI, we used the default values provided within VIP. The reconstructions obtained with the reference methods all suffer from noticeable artifacts, particularly at the center of the field of view where the reduced angular diversity makes it challenging to disentangle the components. In comparison, both the blurred and the deblurred reconstructions shown in the last two columns of Fig. 8 are far more satisfactory. The REXPACO ASDI reconstruction of the HR 4796 disk displays a near-continuous elliptical structure and a flux asymmetry on the West side of the ring, consistent with the predictions of intensity scattering models, see Milli et al. (2017). For SAO 206462, the REXPACO ASDI reconstruction exhibits two main spiral arms whose overall morphology and spatial extent are in good agreement with radiative transfer and hydro-dynamical models of transitional disks shaped by giant planets, which are responsible for sculpting multiple spiral arms, see e.g. Bae et al. (2016); Maire et al. (2017). Additionally, the REXPACO ASDI reconstructions of HR 4796 (respectively, SAO 206462, MWC 758, PDS 70, HD 163296) can be qualitatively compared with the reconstructions in Fig. 4 of Milli et al. (2017) (respectively, Fig. 1 bottom-left of Maire et al. (2017), Fig. A.1 second line of Boccaletti et al. (2021), Fig. 1 first line-second row of Mesa et al. (2019b), Fig. 1 right of Mesa et al. (2019a)). These results were derived from custom routines implementing, respectively, median ASDI, RDI ADI, median ASDI, PCA ASDI, and PCA ASDI applied to the same datasets. The REXPACO ASDI reconstructions exhibit significantly fewer artifacts, such as non-physical discontinuities in the disk structures and residual stellar leakage near the star. The deconvolution step in the proposed method also enhances the spatial resolution of thin disk structures. In contrast, baseline methods like median ASDI and PCA ASDI tend to subtract part of the disk component when removing the nuisance term. This leads to substantial flux biases and a high-pass filtering effect. PACO ASDI, being optimized for point-like detections, manages to recover parts of the disks in areas with large gradients. It is much more successful on the thin disk of HR 4796 and on the extended disk of PDS 70 than on the thicker spiral disks of SAO 206462 and MWC 758. Finally, the multi-spectral REXPACO ASDI reconstructions in Fig. 8 can be compared to the mono-spectral reconstructions produced by the REXPACO ADI algorithm (see Fig. 11 of Flasseur et al., 2021) on mono-spectral datasets of the same target stars (except MWC 758, HD 163296, and AB Aurigae). These mono-spectral datasets were recorded using the Infrared Dual-band Imager and Spectrograph (IRDIS) of the SPHERE instrument, operating simultaneously with the IFS but in a different spectral band and at a different spectral resolution.
The joint multi-spectral processing leads to a better rejection of the nuisance component, thereby reducing non-physical reconstruction artifacts such as discontinuities, especially within spiral arms. These comparisons illustrate that joint processing of multi-spectral datasets is particularly beneficial for disks having a circular symmetry, such as SAO 206462 or MWC 758, as it helps to disentangle the disk light from the stellar light. This is because, owing to the chromatic scaling of speckles exploited by ASDI, these two components do not superimpose in all spectral channels. The advantages of joint spectral processing are further explored and discussed in Sect. 4.5.
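The selection rule described above for PCA ASDI on synthetic disks (scan the number of modes in increments of two and keep the value minimizing the MSE to the ground truth) can be sketched as follows. This is only an illustration: `select_n_modes` and `toy_reconstruct` are hypothetical names, the scan starting at one mode is an assumption, and the toy reconstruction merely mimics the trade-off between stellar leakage and self-subtraction; it is not the PCA ASDI implementation of VIP.

```python
import numpy as np


def select_n_modes(reconstruct, u_gt, max_modes, step=2):
    """Pick the number of PCA modes minimizing the MSE to the ground truth.

    `reconstruct` is a hypothetical callable: n_modes -> image shaped like u_gt.
    Starting the scan at 1 mode is an assumption of this sketch.
    """
    mse = {}
    for k in range(1, max_modes + 1, step):
        mse[k] = float(np.mean((reconstruct(k) - u_gt) ** 2))
    best = min(mse, key=mse.get)
    return best, mse


# Toy stand-in for PCA ASDI (NOT the real algorithm): more modes remove more
# stellar leakage but also subtract more of the disk itself.
rng = np.random.default_rng(0)
u_gt = np.ones((16, 16))
leak = rng.normal(size=u_gt.shape)


def toy_reconstruct(k):
    return (1.0 - 0.02 * k) * u_gt + leak / k


best_k, mse_by_k = select_n_modes(toy_reconstruct, u_gt, max_modes=21)
print(best_k, round(mse_by_k[best_k], 4))
```

For real disks no ground truth is available, which is why the text resorts to a visual selection instead of this criterion.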

Figure 9: Disk reconstructions $\widetilde{\bm{u}}$ obtained with REXPACO ASDI: same results as in Fig. 8 (last column) for the protoplanetary disks MWC 758 and AB Aurigae. The main disk features reported in the literature are overlaid with straight lines and (candidate) point-like sources also identified in the literature (not only based on data of the same instrument) are highlighted with circles. Newly identified candidate disk features from our reconstructions are indicated with dashed lines, see text. Datasets: MWC 758 (2018-12-17) and AB Aurigae (2020-01-18), see Table 2 for the observation parameters.

Figure 9 focuses on the protoplanetary disks MWC 758 and AB Aurigae reconstructed with the proposed REXPACO ASDI algorithm. Known disk features and (candidate) point-like sources reported in the literature, as well as new disk features identified through our reconstructions, are overlaid. For MWC 758, the three spiral arms identified by Wagner et al. (2019) (highlighted with solid arrows) are well reconstructed by REXPACO ASDI. We also reconstruct two additional elongated structures interior to the Northern main spiral arm. These features could be interpreted as additional spiral arms and they appear connected to the main spiral arms by material bridges. Neither of the two point-like sources (b and c) identified by Reggiani et al. (2018); Wagner et al. (2023) is detected in our reconstruction. This may be due to the VLT/SPHERE-IFS observations being taken in the Y-J spectral band, whereas the two exoplanets were discovered using Keck/NIRC2 and LBTI/LMIRCam observations in the L’ and M’ bands, where contrast for such candidate sources is more favorable. For AB Aurigae, REXPACO ASDI reconstructs the two main spiral arms previously identified by Boccaletti et al. (2020). We also identify additional complex structures such as gaps and splittings within the main spiral arms. Consistent with Boccaletti et al. (2020), we detect a bright emission source (f1) embedded within the Southern spiral arm, though it appears very extended, suggesting that it is part of the disk. Like Boccaletti et al. (2020), we do not detect the Northern point-like source (f2) from this SPHERE-IFS dataset. It can also be noted that the point-like source f2 was identified by Boccaletti et al. (2020) at the same epoch, but from a dataset obtained with the SPHERE-IRDIS instrument, operating simultaneously with SPHERE-IFS. We also clearly detect the Northern bright emission source (CC c) identified by Currie et al. (2022b) from SUBARU/SCExAO data.
However, CC c does not appear as a point-like source in our reconstruction, likely because this candidate exoplanet, if real, would be at its first stage of formation, still accreting material from the disk. Since the SPHERE-IFS wavelengths are shorter than those of SUBARU/SCExAO, it is also possible that the point sources are beyond reach at these wavelengths with SPHERE-IFS.

4.4 Quantitative analysis: reconstruction of synthetic disks injected into SPHERE-IFS data

Figure 10: Ground truth images for three synthetic disks: an elliptical disk, a circular disk, and a spiral disk. The first line gives the contribution $\mathbf{B}\,\bm{u}_{\text{gt}}$ of the disks (i.e., blurred by the off-axis PSF to be at the same spatial resolution as the instrument) that are injected within the data. The second line gives the flux distribution $\bm{u}_{\text{gt}}$ free from the blur introduced by the off-axis PSF. For each disk, three slice-cuts are defined (denoted by profile 1, 2, 3) for the 1D visualization of the reconstructed flux given in Figs. 24, 25, 26, and 27 of Appendix B.

In this section, we quantitatively assess the performance of the proposed approach in comparison to three baseline methods: median ASDI, PCA ASDI and PACO ASDI. The general principles of these approaches are outlined in Sect. 1, and their specific settings are detailed in Sect. 4.3.

We consider three simulated disks representative of common morphologies in high-contrast observations: (i) a spatially centered elliptical disk with sharp edges and with an eccentricity of about 0.80; (ii) a circular disk with sharp edges and whose center is shifted by five pixels from the star center in the two spatial dimensions; (iii) a spiral disk exhibiting two arms with smooth edges. Figure 10 illustrates the ground truth flux distribution for each of the disk types used in this analysis.

While these toy models were not generated using physics-based simulators (e.g., modeling the hydrodynamics and radiative transfer), cases (i) and (ii) typically correspond to debris disks while case (iii) resembles a particular instance of transition or protoplanetary disks. Additionally, it can be noted that these synthetic disks resemble the real circumstellar disks reconstructed in Fig. 8 so that these simulations can help to assess the quality of the reconstructions of these real circumstellar disks: the elliptical disk (i) has a spatial extent similar to the HR 4796 disk, and the spiral disk (iii) has similar spatial extent and morphology to the SAO 206462 disk.

Each simulated disk is injected into the HD 172555 dataset (which contains no known off-axis source), at three different contrast levels $\alpha_{\text{gt}} \in \{1\times 10^{-6}, 5\times 10^{-6}, 1\times 10^{-5}\}$. For our simulations, we consider gray objects, meaning the contrast is constant across the spectral band, resulting in an identical flux distribution across all spectral channels. Consequently, all reconstructions presented in this section are averaged over the whole spectral band. A total of 90 semi-synthetic datasets have been generated: for each disk type and contrast level $\alpha_{\text{gt}}$, the simulated disk has been injected at ten different orientations relative to the nuisance component (which remains the same for all simulations). This simulation protocol allows us to evaluate the mean and variance of the reconstructions.
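The count of 90 semi-synthetic datasets follows from the protocol above (3 disk types, 3 contrast levels, 10 orientations). The short enumeration below makes this explicit; the evenly spaced orientation angles and the dictionary keys are assumptions made for illustration, since the text does not specify how the ten orientations are chosen.

```python
from itertools import product

DISK_TYPES = ("elliptical", "circular", "spiral")
CONTRASTS = (1e-6, 5e-6, 1e-5)   # contrast levels alpha_gt
N_ORIENTATIONS = 10              # injections per (disk, contrast) pair


def injection_grid():
    """Yield one configuration per semi-synthetic dataset."""
    for disk, alpha, k in product(DISK_TYPES, CONTRASTS, range(N_ORIENTATIONS)):
        # Assumption of this sketch: orientations evenly spaced over 360 degrees.
        angle_deg = 360.0 * k / N_ORIENTATIONS
        yield {"disk": disk, "alpha_gt": alpha, "orientation_deg": angle_deg}


grid = list(injection_grid())
print(len(grid))  # 3 disks x 3 contrasts x 10 orientations = 90
```

Grouping the reconstructions by `(disk, alpha_gt)` then gives ten realizations per configuration, from which a mean and a variance can be computed.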

Figure 11: Reconstructions of simulated elliptical disks: comparisons between median ASDI, PCA ASDI and re-blurred REXPACO ASDI reconstructions. Single reconstructions (associated with a selected orientation of the disk with respect to the nuisance) and the average reconstructions (over ten different injections of the same disk but with various orientations with respect to the nuisance) are displayed. The three columns correspond to the three considered levels of contrast: $\alpha_{\text{gt}} \in \{1\times 10^{-6}, 5\times 10^{-6}, 1\times 10^{-5}\}$. Dataset: HD 172555 (2015-07-11), see Table 2 for the observation parameters.
Figure 12: Reconstructions of simulated elliptical disks: comparisons between PACO ASDI and deconvolved REXPACO ASDI reconstructions. Single reconstructions (associated with a selected orientation of the disk with respect to the nuisance) and the average reconstructions (over ten different injections of the same disk but with various orientations with respect to the nuisance) are displayed. The three columns correspond to the three considered levels of contrast: $\alpha_{\text{gt}} \in \{1\times 10^{-6}, 5\times 10^{-6}, 1\times 10^{-5}\}$. Dataset: HD 172555 (2015-07-11), see Table 2 for the observation parameters.
Figure 13: Same as Fig. 11 for synthetic circular disks.
Figure 14: Same as Fig. 12 for synthetic circular disks.
Figure 15: Same as Fig. 11 for synthetic spiral disks.
Figure 16: Same as Fig. 12 for synthetic spiral disks.
Table 4: Quantitative assessment of the reconstruction quality on synthetic disks. N-RMSE as defined in Eq. (112) is reported for the reconstructions displayed in Figs. 11-16 and 24-26. The N-RMSE is also computed on the restrictions $\mathcal{D}(\bm{u}_{\text{gt}})$ and $\mathcal{D}(\widetilde{\bm{u}})$ to the area actually covered by the simulated disks. The best scores are highlighted in bold fonts.

| Score | Algorithm | $\alpha_{\text{gt}} = 1\times 10^{-6}$ | $\alpha_{\text{gt}} = 5\times 10^{-6}$ | $\alpha_{\text{gt}} = 1\times 10^{-5}$ |
|---|---|---|---|---|
| Elliptical disk, see Figs. 11, 12 and 24 | | | | |
| $\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}})$ | PACO ASDI | 0.52 | 0.53 | 0.58 |
| $\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}})$ | REXPACO ASDI | **0.12** | **0.11** | **0.10** |
| $\text{N-RMSE}(\mathcal{D}(\bm{u}_{\text{gt}}), \mathcal{D}(\widetilde{\bm{u}}))$ | PACO ASDI | 0.26 | 0.41 | 0.53 |
| $\text{N-RMSE}(\mathcal{D}(\bm{u}_{\text{gt}}), \mathcal{D}(\widetilde{\bm{u}}))$ | REXPACO ASDI | **0.10** | **0.06** | **0.04** |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | median ASDI | 0.68 | 0.56 | 0.46 |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | PCA ASDI | 0.40 | 0.30 | 0.30 |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | REXPACO ASDI | **0.13** | **0.06** | **0.05** |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | median ASDI | 0.66 | 0.54 | 0.45 |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | PCA ASDI | 0.39 | 0.29 | 0.27 |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | REXPACO ASDI | **0.12** | **0.03** | **0.01** |
| Circular disk, see Figs. 13, 14 and 25 | | | | |
| $\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}})$ | PACO ASDI | 0.74 | 0.71 | 0.77 |
| $\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}})$ | REXPACO ASDI | **0.14** | **0.12** | **0.11** |
| $\text{N-RMSE}(\mathcal{D}(\bm{u}_{\text{gt}}), \mathcal{D}(\widetilde{\bm{u}}))$ | PACO ASDI | 0.51 | 0.60 | 0.71 |
| $\text{N-RMSE}(\mathcal{D}(\bm{u}_{\text{gt}}), \mathcal{D}(\widetilde{\bm{u}}))$ | REXPACO ASDI | **0.10** | **0.06** | **0.04** |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | median ASDI | 0.97 | 0.97 | 0.97 |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | PCA ASDI | 0.91 | 0.87 | 0.65 |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | REXPACO ASDI | **0.15** | **0.08** | **0.07** |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | median ASDI | 0.97 | 0.96 | 0.96 |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | PCA ASDI | 0.90 | 0.87 | 0.63 |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | REXPACO ASDI | **0.12** | **0.04** | **0.02** |
| Spiral disk, see Figs. 15, 16 and 26 | | | | |
| $\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}})$ | PACO ASDI | 0.63 | 0.64 | 0.69 |
| $\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}})$ | REXPACO ASDI | **0.60** | **0.39** | **0.38** |
| $\text{N-RMSE}(\mathcal{D}(\bm{u}_{\text{gt}}), \mathcal{D}(\widetilde{\bm{u}}))$ | PACO ASDI | 0.25 | 0.40 | 0.60 |
| $\text{N-RMSE}(\mathcal{D}(\bm{u}_{\text{gt}}), \mathcal{D}(\widetilde{\bm{u}}))$ | REXPACO ASDI | **0.06** | **0.05** | **0.03** |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | median ASDI | 0.99 | 0.96 | 0.91 |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | PCA ASDI | 0.82 | 0.80 | 0.70 |
| $\text{N-RMSE}(\mathbf{B}\,\bm{u}_{\text{gt}}, \mathbf{B}\,\widetilde{\bm{u}})$ | REXPACO ASDI | **0.58** | **0.36** | **0.35** |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | median ASDI | 0.99 | 0.96 | 0.91 |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | PCA ASDI | 0.82 | 0.80 | 0.69 |
| $\text{N-RMSE}(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}), \mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}}))$ | REXPACO ASDI | **0.14** | **0.05** | **0.04** |

Figures 11-12, 13-14 and 15-16 report the reconstruction results for the elliptical, circular and spiral disks, respectively. Figures 24, 25, and 26 complement these reconstruction results with a slice-cut analysis along the three profiles defined in Fig. 10.

Because median ASDI and PCA ASDI do not perform a deconvolution, the comparisons are performed at the resolution of the instrument, as in Sect. 4.3. The deconvolved flux distributions $\widetilde{\bm{u}}$ estimated by REXPACO ASDI are thus re-blurred by the off-axis PSF so that the quantity $\mathbf{B}\,\widetilde{\bm{u}}$ can be directly compared with the median ASDI and PCA ASDI images in Figs. 11, 13 and 15. The REXPACO ASDI reconstructions $\widetilde{\bm{u}}$, deconvolved from the off-axis PSF, are in turn compared to the PACO ASDI flux distribution maps in Figs. 12, 14 and 16.

Overall, the three comparative techniques make significant errors on the sought objects, in terms of both morphological distortion and photometric under-estimation, regardless of the type of disk. These errors are more pronounced where the diversity induced by ASDI is too limited to disentangle the nuisance from the off-axis objects. As an illustration, the circular disk and the arms of the spiral disk are barely visible near the star in the median ASDI and PCA ASDI images, even for the brightest cases, which indicates that substantial self-subtraction occurs. In addition, some stellar leakage remains, especially near the star, due to the absence of explicit modeling of the correlations of the nuisance. Flux distributions estimated by PACO ASDI are also affected by significant artifacts: continuous structures manifest as a series of point sources due to assumptions made in the model regarding the target objects. Unlike for the other tested algorithms, this effect worsens when the contrast improves. In addition, only the gradients of smooth structures are (approximately) recovered by PACO ASDI.

In comparison, reconstructions produced by REXPACO ASDI seem much closer to the ground truth, even for the lowest level of contrast $\alpha_{\text{gt}} = 1\times 10^{-6}$, with an improved object fidelity and a better rejection of the star light. Unlike median ASDI or PCA ASDI reconstructions, which display non-physical negative values, REXPACO ASDI flux distributions are consistently non-negative (see slice-cut profiles in Figs. 24-26), owing to the explicit non-negativity constraint imposed in the minimization problem (29). In addition, an important result is the ability of REXPACO ASDI to reconstruct disks having a quasi-circular symmetry (that are especially challenging to reconstruct due to the lack of angular diversity), without the need for additional diversity complementing ASDI, e.g. leveraging multiple datasets as done in RDI techniques (see Sect. 1). The deblurred reconstructions of REXPACO ASDI shown in Figs. 12, 14, 16 and 24-26 are in good agreement with the ground truth. As expected, the reconstruction fidelity is higher when the disk is brighter: more spurious fluctuations are visible in the deblurred reconstruction at $\alpha_{\text{gt}} = 1\times 10^{-6}$ than at $\alpha_{\text{gt}} = 1\times 10^{-5}$. Moreover, the spatial resolution is also significantly improved by the deconvolution process.
However, some discrepancies can be noted in the deblurred line profiles, such as a slight Gibbs effect (i.e., signal ripples) near sharp edges induced by the edge-preserving regularization (even if it is beneficial overall) and a residual bias on the photometry for some parts of the spiral disk (even though the overall morphology is preserved). We discuss the latter phenomenon in Sect. 4.5, dedicated to the comparison between ADI and ASDI processing.

After this qualitative analysis, we now compare, as done in Flasseur et al. (2021), the reconstruction quality of median ASDI, PCA ASDI, PACO ASDI and REXPACO ASDI by reporting the normalized root mean square error (N-RMSE; the lower the score, the higher the reconstruction fidelity):

$$\text{N-RMSE}(\bm{u}_{\text{gt}}, \widetilde{\bm{u}}) = \frac{\|\bm{u}_{\text{gt}} - \widetilde{\bm{u}}\|_{2}}{\|\bm{u}_{\text{gt}}\|_{2}}\,. \tag{112}$$

Table 4 reports the N-RMSE for two regions of the reconstructed flux distribution: (i) the entire image, and (ii) the disk area only. In the latter case, Eq. (112) is modified to account solely for disk regions. This metric shows a clear improvement brought by REXPACO ASDI compared to the other tested algorithms, with an error reduction exceeding a factor of 10 for the more challenging configurations (e.g., circular or spiral disks). A more modest error reduction is obtained for configurations (i.e., morphology and contrast) leading to an easier separation of the disk from the nuisance contribution, as for the elliptical disk.
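As an illustration of Eq. (112) and of its restriction to the disk area, the following minimal sketch computes both scores with NumPy. The function name `n_rmse`, the boolean `mask` argument, and the choice of defining the disk area as the support of the ground truth (`u_gt > 0`) are assumptions made for this example; the paper does not detail how the restriction is implemented.

```python
import numpy as np


def n_rmse(u_gt, u_hat, mask=None):
    """Normalized RMSE of Eq. (112): ||u_gt - u_hat||_2 / ||u_gt||_2.

    If `mask` (boolean array, hypothetical argument) is given, both images are
    restricted to the masked pixels before computing the norms.
    """
    u_gt = np.asarray(u_gt, dtype=float)
    u_hat = np.asarray(u_hat, dtype=float)
    if mask is not None:
        mask = np.asarray(mask, dtype=bool)
        u_gt, u_hat = u_gt[mask], u_hat[mask]
    return float(np.linalg.norm(u_gt - u_hat) / np.linalg.norm(u_gt))


# Toy example: a 2x2 "disk" of unit flux inside a 4x4 field of view.
u_gt = np.zeros((4, 4))
u_gt[1:3, 1:3] = 1.0
u_hat = 0.9 * u_gt          # 10% photometric under-estimation on the disk
u_hat[0, 0] = 0.1           # one pixel of residual leakage off the disk

disk_area = u_gt > 0        # assumed definition of the disk area
score_full = n_rmse(u_gt, u_hat)
score_disk = n_rmse(u_gt, u_hat, mask=disk_area)
print(score_full, score_disk)
```

In this toy case the disk-only score reflects the photometric bias alone, while the full-image score is also penalized by the leakage outside the disk.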

This study also provides valuable insights for interpreting the reconstructions of real disks presented in Fig. 8, as both the simulations and real data share comparable angular and spectral diversity (i.e., similar amounts of parallactic rotation, same number and spreading of the spectral channels). Additionally, the simulated disks possess morphologies closely resembling the real disks. Consequently, this study suggests that the reconstructed flux distribution of HR 4796 can be confidently interpreted as having an elliptical morphology. Similarly, the outer disk of HD 163296 has roughly the same morphology, allowing for confidence in the reconstructed structures on the Northern side, though the quality of the reconstruction is strongly limited by the low disk contrast (lower than $5\times 10^{-7}$) on the Southern side. SAO 206462, MWC 758 and AB Aurigae all exhibit spiral arms with a spatial extent quite similar to that of the simulated spiral disk studied in this section. We can thus expect that the morphology of these three real disks is well reconstructed with, likely, a slight photometric bias on some structures in the vicinity of the host star. The case of PDS 70 is more challenging due to its intricate structures, including a smooth flux distribution near the star at the shortest wavelengths. While no non-physical discontinuities are observed in the outer disk, dedicated hydro-dynamical simulations of this object are needed to identify the areas impacted by potential artifacts. Such a study is out of the scope of this paper and is left for future work dedicated to the re-analysis of multi-epoch and multi-instrument observations of PDS 70.

4.5 On the importance of a joint spectral processing

In this section, we aim to illustrate the benefits of joint spectral processing, incorporating fine modeling of correlations between spectral channels, compared to mono-spectral processing that does not leverage the apparent chromatic displacement of the speckle field induced by ASDI (see Sect. 1).

On the latter point, we start by identifying parts of disks that are expected to suffer the most from the self-subtraction effect for different disk morphologies, spectral bands and total amounts of parallactic rotation. For that purpose, we consider the three synthetic disk morphologies studied in Sect. 4.4, and we assume a null nuisance component to evaluate solely the influence of limited angular and spectral diversity on reconstruction quality. As done by Juillard et al. (2023) for ADI, given a ground truth flux distribution $\bm{u}_{\text{gt}} \in \mathbb{R}^{N' \times L}$, we define the spectrally aggregated flux $\bm{u}_{\text{inv}} \in \mathbb{R}^{N'}$, which is invariant to both the apparent rotation induced by ADI and the homothetic spectral motion of the speckle field induced by SDI, as:

$$\left[\bm{u}_{\text{inv}}\right]_{n} = \min_{t=1:T,\,\ell=1:L} \left[\mathbf{F}\,\bm{u}_{\text{gt}}\right]_{n,t,\ell}\,, \quad \forall n \in \llbracket 1; N' \rrbracket\,, \tag{113}$$

with $\mathbf{F}$ the sparse operator performing rotations, scalings, and attenuations as defined for the forward image formation model in Sect. 3.1. Taking the minimum intensity value (operator min) across the temporal and spectral dimensions in Eq. (113) enables the identification of ASDI-invariant flux regions. The output is 0 for non-invariant regions and 1 for areas of the disk that are fully affected by angular and/or spectral invariance. We also consider the quantity $\bm{u}_{\text{gt}} - \bm{u}_{\text{inv}}$ representing the expected reconstructed flux distribution if the invariant component $\bm{u}_{\text{inv}}$ cannot be disentangled from the nuisance component (i.e., the angular and spectral diversity are not sufficient to perform signal unmixing). Fig. 17 represents these two quantities in ADI, SDI and ASDI for the three typical morphologies considered in Sect. 4.4, for a simulated total amount of parallactic rotation $\Delta_{\text{par}} = \{30^\circ, 45^\circ\}$, and for simulated spectral bands YJ (i.e., $\lambda \in [0.96 - 1.33]\,\mathrm{\mu m}$) or YJH (i.e., $\lambda \in [0.96 - 1.64]\,\mathrm{\mu m}$). In ADI, i.e. in the absence of a joint spectral processing, we observe that a large fraction of the circular and spiral disks remains invariant with respect to the background.
The elliptical disk is less affected by this phenomenon, even if it is not negligible, especially near the ellipse handles along its minor axis. This lack of diversity translates into a partial attenuation and distortion of the reconstructed disk, due to object self-subtraction, see e.g. Milli et al. (2012); Pairet et al. (2019); Juillard et al. (2023) for related studies in ADI. Moreover, as expected, the total amount of parallactic rotation brings only limited diversity at short angular separations: the angular-invariant flux distribution only slightly decreases when $\Delta_{\text{par}}$ increases from $30^\circ$ to $45^\circ$, regardless of the disk morphology. SDI effectively eliminates most signal ambiguities caused by object invariances. It leads to no invariant flux for elliptical and circular disks. Joint spectral processing with ASDI further improves the unmixing capability of post-processing algorithms, as only a very small fraction of the spiral disk remains invariant for the setting $\Delta_{\text{par}} = 30^\circ$ in the YJ band. This results in a slight object self-subtraction that could explain the photometric bias observed in Figs. 16 and 26 on the reconstructed spiral disk for similar settings (in terms of disk morphology, parallactic rotation, and spectral band). Increasing the spectral width towards the H band and the total parallactic rotation towards $45^\circ$ leads to a negligible invariant flux distribution, which would allow REXPACO ASDI to reconstruct the underlying off-axis object without self-subtraction and without the need to leverage a database archive as in RDI techniques.
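The principle of Eq. (113) can be illustrated with a simplified, nuisance-free sketch. The assumptions are substantial and should be kept in mind: the operator $\mathbf{F}$ is replaced here by a plain rotation and radial scaling about the image center (nearest-neighbor interpolation, no blur, no coronagraph attenuation), the scaling factor is taken as $\lambda_{\text{ref}}/\lambda_\ell$, and the object is a perfectly centered ring rather than one of the disks of Fig. 10. The helper names `warp` and `invariant_flux` are hypothetical.

```python
import numpy as np


def warp(img, angle_deg, scale):
    """Rotate by `angle_deg` and magnify by `scale` about the image center.

    Simplified stand-in for the operator F (assumption): nearest-neighbor
    inverse mapping on a square image, zero outside the field of view.
    """
    n = img.shape[0]
    c = (n - 1) / 2.0
    y, x = np.mgrid[0:n, 0:n] - c
    a = np.deg2rad(angle_deg)
    # Inverse transform: source coordinates of each output pixel.
    xs = (np.cos(a) * x + np.sin(a) * y) / scale
    ys = (-np.sin(a) * x + np.cos(a) * y) / scale
    xi = np.rint(xs + c).astype(int)
    yi = np.rint(ys + c).astype(int)
    ok = (xi >= 0) & (xi < n) & (yi >= 0) & (yi < n)
    out = np.zeros_like(img)
    out[ok] = img[yi[ok], xi[ok]]
    return out


def invariant_flux(u_gt, angles_deg, scales):
    """Pixel-wise minimum over all (angle, scale) pairs, as in Eq. (113)."""
    frames = [warp(u_gt, a, s) for a in angles_deg for s in scales]
    return np.min(frames, axis=0)


# Centered ring (radii 18 to 22 pixels) in a 65x65 field of view.
n = 65
yy, xx = np.mgrid[0:n, 0:n] - (n - 1) / 2.0
r = np.hypot(xx, yy)
ring = ((r >= 18) & (r <= 22)).astype(float)

angles = np.linspace(0.0, 30.0, 16)        # 30 degrees of parallactic rotation
wavelengths = np.linspace(0.96, 1.33, 10)  # YJ band, in microns
scales = wavelengths[0] / wavelengths      # assumed convention: lambda_ref / lambda

u_adi = invariant_flux(ring, angles, [1.0])    # rotation only
u_sdi = invariant_flux(ring, [0.0], scales)    # spectral scaling only
u_asdi = invariant_flux(ring, angles, scales)  # both diversities

for name, u in (("ADI", u_adi), ("SDI", u_sdi), ("ASDI", u_asdi)):
    print(name, "invariant fraction:", u.sum() / ring.sum())
```

In this idealized setting, most of the centered ring is invariant under rotation alone, whereas the radial scaling leaves no invariant flux, in line with the qualitative behavior described for circular disks.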

Figure 17: Nuisance-free study on the importance of a joint spectral processing. The invariant flux distribution $\bm{u}_{\text{inv}}$ as defined in Eq. (113) is reported on the first row of panels (a), (b) and (c) for the three synthetic disks (i.e., elliptical, circular, and spiral) whose ground truth flux distributions $\bm{u}_{\text{gt}}$ are represented in Fig. 10. The second row of panels (a), (b) and (c) gives the difference $\bm{u}_{\text{gt}}-\bm{u}_{\text{inv}}$. Two total amounts $\Delta_{\text{par}}$ of parallactic rotation and two spectral bands are considered (if applicable) in each case. Panel (a) is for ADI (i.e., each spectral channel is considered independently in Eq. (113)), panel (b) is for SDI (i.e., assuming the parallactic angle is equal for each temporal exposure), and panel (c) is for ASDI (i.e., all temporal frames and spectral channels are processed jointly in Eq. (113)).
Figure 18: Comparison between ADI and ASDI on the reconstruction of synthetic disks. The considered elliptical, circular, and spiral disks are injected ($\alpha_{\text{gt}}=1\times 10^{-5}$, $\Delta_{\text{par}}=30^\circ$, YJ band) within a real SPHERE-IFS dataset and processed with the mono-spectral algorithm REXPACO ADI (Flasseur et al., 2021) and its multi-spectral version REXPACO ASDI proposed in this paper. The reconstructions $\mathbf{B}\,\widetilde{\bm{u}}$ are re-blurred by the off-axis PSF, and REXPACO ASDI results are similar to the ones presented in the third column of Figs. 11, 13, and 15. Dataset: HD 172555 (2015-07-11), see Table 2 for the observation parameters.

Figure 18 completes this study by comparing a post-processing relying on ADI only (here, with the REXPACO ADI algorithm) to a post-processing that also leverages the spectral diversity brought by ASDI (here, with the REXPACO ASDI algorithm) for three particular configurations of the extensive simulations performed in Sect. 4.4. Figure 27 complements the results displayed in Fig. 18 with a slice-cut analysis along the three profiles defined in Fig. 10.

In all cases, the same total amount of information is used, i.e. all spectral channels are considered in ADI but they are processed individually instead of jointly as in ASDI. The conclusions derived from the nuisance-free simulations in Fig. 17 directly translate to the reconstruction quality: REXPACO ADI leads to a bias of up to 20% and 60% on the reconstructed flux distribution of the elliptical and circular disks, respectively. This bias is almost null for the elliptical disk reconstructed with the proposed REXPACO ASDI algorithm. For the spiral disk, a bias of up to 20% can remain on some parts of the ASDI reconstruction, even though it is significantly smaller than for ADI. This residual bias can be attributed to a still insufficient angular and spectral diversity, as shown by the nuisance-free study for a simulated spiral disk in the YJ band with a total parallactic rotation $\Delta_{\text{par}}$ of 30°.

Figure 19: Comparison of reconstructions $\widetilde{\bm{u}}$ obtained on SPHERE-IFS data with the mono-spectral REXPACO algorithm (left) and the proposed multi-spectral REXPACO ASDI algorithm (right, similar results as in the last column of Fig. 8). Pseudo-color images are displayed as in Fig. 6. Datasets: same as in Fig. 8.

Conversely, on the same real data of Sects. 4.2-4.3, we perform a model ablation study complementary to the one presented in Sect. 4.2. Unlike in Sect. 4.2, we consider here the spatial covariances when estimating the nuisance component, but we process each spectral channel individually with REXPACO ADI instead of jointly with REXPACO ASDI as done in Sects. 4.2-4.3. Figure 19 displays the resulting reconstructed flux distributions compared to the corresponding REXPACO ASDI reconstructions. The absence of joint spectral processing is detrimental in three respects. First, significant residual star light remains in the ADI reconstructions, in particular for HR 4796 and AB Aurigae. Its typical rainbow-pattern signature is due to the absence of modeling of the spectral correlations of the nuisance. Second, the sensitivity is lowered due to the absence of explicit exploitation of the spectral diversity, even if the same total amount of data is processed. As an illustration, the HD 163296 disk is almost invisible in the ADI reconstruction. Third, significant non-physical artifacts and discontinuities in the disk features are present in the ADI reconstructions, especially for disks having a circular symmetry like SAO 206462, MWC 758 and PDS 70. This latter effect is due to the lack of diversity between the sought off-axis objects and the nuisance component in ADI, as discussed in the previous paragraph.

Figure 20: Comparison of reconstructions obtained on SPHERE-IRDIS data with the mono-spectral REXPACO ADI algorithm (left) and the proposed multi-spectral REXPACO ASDI algorithm (right). The white arrow points out a part of the West spiral arm severely impacted by reconstruction artifacts with a post-processing based solely on ADI, see Fig. 11 of Flasseur et al. (2021). Pseudo-color images are displayed as in Fig. 6. Dataset: SAO 206462 (2015-05-15), see Table 2 for the observation parameters.

Now that we have established the causes of the limitations of ADI and emphasized the benefits of ASDI to produce faithful reconstructions of the circumstellar environment from IFS data, we illustrate that even a limited spectral diversity can be useful to improve the quality of the reconstructions. In our previous work on the REXPACO algorithm designed for ADI (Flasseur et al., 2021), we considered datasets from the SPHERE-IRDIS imager in its dual band configuration (i.e., producing simultaneously datasets on $L=2$ spectral channels). In Flasseur et al. (2021), we have shown that REXPACO ADI is able to produce disk reconstructions with a significantly improved quality compared to standard post-processing methods like median ADI, PCA ADI, and PACO ADI. We also noticed that some plausible artifacts can remain due to the lack of diversity between the disk and the nuisance component. Here, we revisit with the proposed REXPACO ASDI algorithm a SPHERE-IRDIS dataset (SAO 206462) considered in Flasseur et al. (2021) and for which the reconstruction seems the most impacted by residual artifacts. Figure 20 compares our new reconstruction obtained by a joint spectral processing with REXPACO ASDI to the REXPACO ADI reconstruction. Notably, we identified in our ADI reconstruction a spurious reconstruction effect on the West spiral arm, taking the form of a flux discontinuity (see white arrow in Fig. 20). This likely artifact is effectively mitigated in the ASDI reconstruction, primarily due to the joint spectral processing of both available spectral channels. Furthermore, the disk appears significantly fainter in the second channel compared to the first, leading to better separation between the disk and the nuisances. In this case, the second channel serves almost like a reference channel, nearly free from the signal of the target object. 
Overall, the morphology of SAO 206462 extracted with REXPACO ASDI from the IRDIS dataset exhibits structures very similar to those in the IFS reconstruction presented in Fig. 8. This example illustrates qualitatively that even a very limited spectral diversity (in the present case, $L=2$ spectral channels, and a band width $\Delta_{\lambda}<0.15\,\mathrm{\mu m}$) is sufficient to significantly improve the reconstruction quality by reducing morphological distortions and flux attenuations.

This study yields two main conclusions. First, ASDI post-processing should be favored over ADI and SDI, as it significantly mitigates ambiguities due to object invariances. This finding supports the choices made in Sects. 4.3 and 4.4 regarding the application of comparative algorithms that jointly exploit the ASDI diversities. Second, while ASDI offers a theoretical advantage in diversity, this benefit fully translates into improved reconstruction fidelity only when appropriate models of the data are employed. As an illustration, all median ASDI reconstructions (i.e., based on an overly simplistic and empirical model of the nuisance) shown in Sects. 4.3 and 4.4 display strong artifacts, despite the method jointly exploiting both ADI and SDI diversities.

5 Unmixing point-like sources from extended features

5.1 Alternate unmixing

Input: ASDI sequence $\bm{v}$.
Input: Forward operator $\mathbf{M}$.
Input: Relative precision $\eta\in(0,1)$, $\eta=10^{-3}$ in practice.
Output: Flux distribution $\widetilde{\bm{u}}$ of disk.
Output: Flux distribution $\widehat{\bm{\alpha}}$ of point-like sources.
Output: S/N of detection of point-like sources.
Output: Astrometry (separation $\widehat{\rho}$, angle $\widehat{\theta}$) of point-like sources.
▶ Step 1. Initialization.
    $i\leftarrow 0$   ◁ iteration counter
    $\widetilde{\bm{u}}^{[i]}\leftarrow\text{REXPACO ASDI}(\bm{v})$   ◁ apply REXPACO on data
    $\text{S/N}^{[i]},\widehat{\bm{\alpha}}^{[i]}\leftarrow\text{PACO ASDI}(\bm{v})$   ◁ apply PACO on data
▶ Step 2. User identification of (candidate) point-like sources.
    $P>0$   ◁ set number of sources
    $\widehat{\rho}^{[i]}_{1:P},\widehat{\theta}^{[i]}_{1:P}$   ◁ set rough astrometry
▶ Step 3. Main iteration loop.
do
    $i\leftarrow i+1$   ◁ update iteration counter
    $\widetilde{\bm{u}}^{[i]}\leftarrow\text{REXPACO ASDI}(\underbrace{\bm{v}-\mathbf{M}\,\widehat{\bm{\alpha}}^{[i-1]}}_{\text{PACO residuals}})$
    $\text{S/N}^{[i]},\widehat{\bm{\alpha}}^{[i]},\widehat{\rho}^{[i]}_{1:P},\widehat{\theta}^{[i]}_{1:P}\leftarrow\text{PACO ASDI}(\underbrace{\bm{v}-\mathbf{M}\,\widetilde{\bm{u}}^{[i-1]}}_{\text{REXPACO residuals}})$
while $\big\lVert\widehat{\bm{\alpha}}^{[i]}-\widehat{\bm{\alpha}}^{[i-1]}\big\rVert>\eta\,\big\lVert\widehat{\bm{\alpha}}^{[i]}\big\rVert$
Algorithm 1: Alternating REXPACO ASDI and PACO ASDI (unmixing disk and point-like sources).
Figure 21: Unmixing synthetic sources from the circumstellar environment. Each sub-graph corresponds to a specific fake source (numbered #1 to #6). Sources #1, #2, and #3 (top) exhibit a flat SED in contrast units (i.e., same spectrum as the star), while sources #4, #5, and #6 (bottom) have a bi-modal SED. The SEDs estimated by Algorithm 1 are shown with error bars, where the color represents the iteration. These estimates can be compared to the simulated ground truth SED indicated by gray straight lines. The S/N maps obtained with PACO ASDI at iteration 0 of Algorithm 1 are also included as insets, highlighting simulated point-like sources within the circumstellar environment (see circles).

In this section, we investigate the unmixing of the contribution of point-like sources embedded in spatially extended structures like circumstellar disks. The approach we propose is an extension to multi-spectral observations of the unmixing strategy described in our previous work (Flasseur et al., 2021) for ADI data. It consists in combining REXPACO ASDI with PACO ASDI (Flasseur et al., 2020b): the former is dedicated to the reconstruction of disks, while the latter is dedicated to the detection and sub-pixel characterization of point-like sources. In our experiments, this alternating strategy proved to be more satisfactory than a joint and regularized reconstruction of both a sparse component (for point-like sources) and a smooth component (for the disk). One of the main peculiarities of the proposed alternating strategy is the ability to select manually the number and the rough location of candidate point-like sources to unmix from the disk material. In contrast, a joint reconstruction of both a sparse component and a smooth component either leads to many more nonzero values in the sparse component than the actual number of point-like sources, or misses the faintest sources, depending on the relative weights given to the sparsity and smoothness regularizations.

The proposed unmixing procedure works as follows: (i) REXPACO ASDI and PACO ASDI are applied independently on a target ASDI observation; (ii) based on the spatio-spectral S/N maps obtained with PACO ASDI and on the spatio-spectral flux distribution obtained with REXPACO ASDI, candidate point-like sources to unmix from the disk material are identified manually by the user; (iii) REXPACO ASDI and PACO ASDI are iteratively applied until convergence of the two retrieved components. During step (iii), the astrometry and photometry of the selected point-like sources are refined with sub-pixel accuracy by PACO ASDI within a $3\times 3$ pixel box, based on the residual data obtained after subtraction of the disk contribution as currently reconstructed by REXPACO ASDI. Similarly, the spatio-spectral flux distribution of the disk is refined by REXPACO ASDI on updated residuals obtained after subtraction of the refined point-source contribution estimated by PACO ASDI. This procedure is summarized by Algorithm 1.
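The control flow of Algorithm 1 can be sketched as follows. This is only an illustration of the alternating scheme and of its stopping rule, not an implementation of REXPACO ASDI or PACO ASDI: both are replaced here by plain least-squares solvers on a toy linear model, the user-driven Step 2 is omitted, and the names `alternate_unmixing`, `disk_model`, `point_model`, `solve_disk`, and `solve_points` are hypothetical. Two separate toy operators stand in for the action of $\mathbf{M}$ on each component.

```python
import numpy as np

def alternate_unmixing(v, disk_model, point_model, solve_disk, solve_points,
                       eta=1e-3, max_iter=100):
    """Alternate two solvers on each other's residuals (structure of Algorithm 1)."""
    # Step 1: both solvers are applied independently on the raw data.
    u = solve_disk(v)
    alpha = solve_points(v)
    n_iter = 0
    # Step 3: each solver works on the data minus the other component,
    # using the estimates of the previous iteration as in Algorithm 1.
    for n_iter in range(1, max_iter + 1):
        u_new = solve_disk(v - point_model(alpha))
        alpha_new = solve_points(v - disk_model(u))
        converged = (np.linalg.norm(alpha_new - alpha)
                     <= eta * np.linalg.norm(alpha_new))
        u, alpha = u_new, alpha_new
        if converged:
            break
    return u, alpha, n_iter

# Toy stand-ins (assumptions): v = A u + B alpha with random linear models.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))   # plays the role of the disk forward model
B = rng.standard_normal((200, 3))   # plays the role of the point-source model
u_true = rng.standard_normal(5)
alpha_true = rng.standard_normal(3)
v = A @ u_true + B @ alpha_true

def least_squares(matrix):
    """Return a solver computing the least-squares fit of `matrix` to the data."""
    return lambda data: np.linalg.lstsq(matrix, data, rcond=None)[0]

u_hat, alpha_hat, n_iter = alternate_unmixing(
    v, lambda u: A @ u, lambda a: B @ a, least_squares(A), least_squares(B))
print(n_iter, np.linalg.norm(alpha_hat - alpha_true))
```

In this toy problem, the initial independent fits are biased because each solver partly absorbs the other component, and the alternation progressively removes this cross-contamination, which mirrors the behavior the procedure is designed to achieve on real data.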

5.2 Case study on the PDS 70 system

Figure 22: Unmixing disk and point-like components by combining REXPACO ASDI and PACO ASDI. The two algorithms are applied independently at iteration 0 of Algorithm 1, while they are applied on the current residual data (i.e., the data minus the current contribution estimated by the algorithm applied last) for the other iterations. Concerning the reconstruction of the disk component, the spatio-spectral flux distribution $\widetilde{\bm{u}}$ and its re-blurred version $\mathbf{B}\,\widetilde{\bm{u}}$ obtained with REXPACO ASDI are displayed for each iteration. Concerning the estimation of the point-like contributions, the spatio-spectral S/N maps of detection and the estimated flux contribution maps $\mathbf{B}\,\widehat{\bm{\alpha}}$ obtained with PACO ASDI are reported for each iteration. As the reported S/N comes from a detection algorithm, it should not be interpreted as a proper image of the multi-spectral flux distribution. The flux maps are non-null only at the locations of the three characterized point-like sources. Dataset: PDS 70 (2018-02-24), see Sect. 4.1 for the observation parameters.

We first evaluate the unmixing ability of the proposed algorithm through numerical experiments on a SPHERE-IFS dataset of PDS 70. We injected (not simultaneously) six faint point-like sources, and we disregarded the unmixing of the real known exoplanets, focusing solely on separating the synthetic sources from the circumstellar environment. Figure 21 compares the estimated SED of the synthetic sources across various iterations of Algorithm 1. It shows that estimation errors decrease over iterations, generally converging towards zero (except for the first spectral channels of source #3, which display a remaining discrepancy with the ground truth). The errors are larger when the SED of the point-like sources closely resembles that of the star (i.e., for sources #1, #2 and #3), as the disk material shares spectral similarities with it, making unmixing more ambiguous. Overall, these results demonstrate the capability of the proposed approach to effectively disentangle the signal from point-like sources, even when they are partially buried in disk material.

As a case study, we apply Algorithm 1 to the same SPHERE-IFS dataset of PDS 70, focusing now on unmixing real point-like sources. We recall that PDS 70 hosts two known exoplanets (Keppler et al., 2018; Haffert et al., 2019) in accretion phase within a protoplanetary disk (Isella et al., 2019), see also Sect. 4.3. Based on the processing of the same dataset, Mesa et al. (2019b) also identified a point-like feature (PLF) with several post-processing algorithms dedicated to the detection of point-like sources. As in Mesa et al. (2019b), the independent application of PACO ASDI reveals a PLF in the resulting spatio-spectral S/N maps, see iteration 0 in Fig. 22. Exoplanet PDS 70 c cannot be detected in the same S/N maps, likely due to its proximity to the disk material, which is over-subtracted by PACO ASDI since this algorithm is not specifically designed to preserve extended structures. Scrutinizing the REXPACO ASDI reconstruction makes it possible to detect PDS 70 b and c, appearing as red point-like sources, even though they are embedded within the disk material. The outer and inner structures of the disk, as well as the spiral feature identified by Juillard et al. (2022) from SPHERE-IRDIS observations, are also reconstructed. At the first application of REXPACO ASDI, the PLF seems more likely to be a part of the inner disk hosted by the star, see iteration 0 in Fig. 22. Iterating between REXPACO ASDI and PACO ASDI leads to several remarks. First, the extractions of PDS 70 b and PDS 70 c improve along the iterations. As a qualitative illustration, the REXPACO ASDI reconstruction obtained after a single iteration of Algorithm 1 exhibits a discontinuous footprint within the disk material at the location of the two exoplanets, a sign of an overestimation of their contribution by PACO ASDI. 
At convergence of the proposed unmixing scheme, the disk component appears smooth and continuous at the locations of the two exoplanets, without any residual signature of PDS 70 b and c. Across the iterations, the contribution of the PLF in the sparse component decreases since it is increasingly explained by the disk component. At convergence of the iterative procedure, the residual spatio-spectral S/N maps from PACO ASDI are almost free from the disk contribution and the signature of the PLF is significantly attenuated with respect to the initial S/N maps at iteration 0. These results also support the conclusions of Mesa et al. (2019b), who considered the PLF likely to be a part of the disk based on its estimated photometry (its SED being very similar to that of the disk).

Figure 23: Unmixing disk and point-like components by combining REXPACO ASDI and PACO ASDI. The estimated astrometry along the iterations is reported for the three considered point-like sources (PDS 70 b, PDS 70 c and the PLF). When estimations are similar from one iteration to the other, the error bars are slightly shifted artificially to better show the evolution of the estimation accuracy. These cases are marked by a star symbol and the common estimated astrometry is given by the highest iteration. Dataset: PDS 70 (2018-02-24), see Sect. 4.1 for the observation parameters.
Table 5: Estimated astrometry $(\widehat{\rho},\,\widehat{\theta})$ of PDS 70 b and c as well as the candidate PLF. Values obtained with our unmixing scheme combining REXPACO ASDI and PACO ASDI are compared to the values reported in the literature on data from the same instrument taken at the same observation date.
| Source | $\widehat{\rho}$ (mas) | $\widehat{\theta}$ (degrees) | Reference |
| --- | --- | --- | --- |
| PDS 70 b | $192.2 \pm 8.0$ | $146.8 \pm 2.4$ | Müller et al. (2018) |
| PDS 70 b | $186.8 \pm 0.2$ | $145.4 \pm 0.1$ | this paper |
| PDS 70 c | $209 \pm 13$ | $281.2 \pm 0.5$ | Mesa et al. (2019b) |
| PDS 70 c | $211.0 \pm 0.5$ | $280.3 \pm 0.1$ | this paper |
| PLF | $118 \pm 4$ | $316.8 \pm 0.5$ | Mesa et al. (2019b) |
| PLF | $111.5 \pm 0.3$ | $318.4 \pm 0.1$ | this paper |

Figure 23 completes this study by showing the estimated astrometry of PDS 70 b and c, as well as of the PLF, along the iterations of the unmixing method. It shows that the estimated astrometry of the three sources evolves during the iterations. The estimated angular separation $\widehat{\rho}$ evolves by up to 15 mas (i.e., 2 pixels) and the estimated parallactic angle $\widehat{\theta}$ evolves by up to 0.5 degree for PDS 70 c (located very near the outer disk arm), which illustrates the impact of the disk material on the characterization and orbital parameter estimation of point-like sources embedded within it. The estimation shift, both in angular separation and in parallactic angle, is even larger for the PLF since its sparse contribution gets fainter during the iterations and no longer resembles a point source. In addition, the accuracy of the astrometry improves (i.e., the error bars get smaller) for PDS 70 b and c while it degrades (i.e., the error bars get larger) for the PLF. This observation is in agreement with the qualitative results presented in Fig. 22, which preferentially attribute the PLF to the disk component. Table 5 reports our final astrometric measurements obtained for the considered sources. The retrieved values are compared to the most accurate measurements available in the literature using direct imaging for these three sources and at the same observation date. Overall, our estimations are compatible (within two times the standard deviation, at most) with the values reported in the literature. However, our estimations are much more accurate: the uncertainties are decreased by a factor between 5 and 40. If the astrometric estimations we derived are confirmed (e.g., based on a multi-epoch analysis), they could significantly correct the orbits of the exoplanets PDS 70 b and c.
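The range quoted for the reduction of the astrometric uncertainties can be recomputed directly from the values of Table 5. The short sketch below only restates this arithmetic; the dictionary names `literature` and `this_paper` are hypothetical containers for the uncertainties (in mas for the separation, in degrees for the angle) copied from the table.

```python
# 1-sigma uncertainties (separation in mas, angle in degrees) copied from Table 5.
literature = {"PDS 70 b": (8.0, 2.4), "PDS 70 c": (13.0, 0.5), "PLF": (4.0, 0.5)}
this_paper = {"PDS 70 b": (0.2, 0.1), "PDS 70 c": (0.5, 0.1), "PLF": (0.3, 0.1)}

gains = {}
for source, (sig_rho_lit, sig_theta_lit) in literature.items():
    sig_rho, sig_theta = this_paper[source]
    # Ratio of literature uncertainty to the uncertainty reported in this paper.
    gains[source] = (sig_rho_lit / sig_rho, sig_theta_lit / sig_theta)
    print(f"{source}: separation x{gains[source][0]:.1f}, angle x{gains[source][1]:.1f}")

all_gains = [g for pair in gains.values() for g in pair]
print(f"overall range: x{min(all_gains):.0f} to x{max(all_gains):.0f}")
```

The smallest ratio is obtained for the angle of PDS 70 c and of the PLF, and the largest one for the separation of PDS 70 b.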

Beyond the benefits of the proposed iterative approach to unmix point-like sources from the circumstellar environment, this study illustrates that applying a post-processing algorithm not specifically designed for the recovery of extended sources can lead to critical artifacts and biases. In particular, it can lead to mistaking a disk feature for a point-like source. These observations could encourage revisiting systems where candidate point-like sources embedded in disk material were recently identified by processing the data with algorithms not tailored to reconstruct extended features, and even less to unmix disk and point-like components.

6 Discussion and conclusion

In this paper, we introduced REXPACO ASDI, a new algorithm for reconstructing circumstellar environments from high-contrast observations in pupil-tracking mode. Our approach utilizes spectral diversity inherent in ASDI data. REXPACO ASDI combines a tailored statistical model of non-stationary nuisances with a forward image formation model of the off-axis sources. These models are jointly used to solve a reconstruction task in a regularized inverse problem framework. This method, specifically designed for extended sources, is the first to leverage jointly angular and spectral diversity introduced by ASDI for reconstructing the spatio-spectral flux distribution of circumstellar environments.

On the methodological side, we employ a local modeling approach to capture spatial and spectral correlations of nuisances for a more accurate statistical description of the data. This model utilizes a spatio-spectral separable approximation to reduce the large number of free parameters needed to model full covariances. For similar reasons, the model is local, i.e. its parameters differ with the location in the field of view and are estimated at the scale of small patches. Our model can thus be interpreted as a block-diagonal approximation of the full spatio-spectral covariance. Tailored estimators of model parameters, based on covariance shrinkage, are developed to reduce estimation uncertainty and improve robustness. We illustrate on real data that this approximate statistical model effectively captures most nuisance correlations. An ablation study reveals that jointly accounting for spatio-spectral correlations directly from the data is crucial for accurately capturing the statistics of ASDI observations, outperforming methods that first model spatial correlations from spatio-temporo-spectral data and then spectral correlations from reduced quantities, as in our previous work dedicated to exoplanet detection from similar ASDI observations (Flasseur et al., 2020b).
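The two ingredients of this local model, covariance shrinkage and spatio-spectral separability, can be illustrated with a minimal sketch. It is not the estimator derived in the paper: the shrinkage weight `rho` is fixed by hand here, the shrinkage target is assumed to be the diagonal of the sample covariance, and the patch data are random. The function name `shrink_covariance` and all sizes are hypothetical.

```python
import numpy as np

def sample_covariance(samples):
    """Sample covariance of `samples` with shape (n_samples, n_features)."""
    centered = samples - samples.mean(axis=0)
    return centered.T @ centered / samples.shape[0]

def shrink_covariance(samples, rho):
    """Convex combination of the sample covariance and its diagonal (assumed target)."""
    cov = sample_covariance(samples)
    return (1.0 - rho) * cov + rho * np.diag(np.diag(cov))

rng = np.random.default_rng(0)
n_frames, n_pixels, n_channels = 10, 25, 4   # fewer frames than pixels per patch

patches = rng.standard_normal((n_frames, n_pixels))    # toy spatial samples
spectra = rng.standard_normal((n_frames, n_channels))  # toy spectral samples

raw_spatial = sample_covariance(patches)               # rank-deficient: not invertible
cov_spatial = shrink_covariance(patches, rho=0.3)      # well-conditioned after shrinkage
cov_spectral = shrink_covariance(spectra, rho=0.3)

# Separable model: the covariance of a multi-spectral patch is approximated by
# the Kronecker product of a small spectral and a small spatial covariance.
cov_patch = np.kron(cov_spectral, cov_spatial)

print(np.linalg.matrix_rank(raw_spatial), np.linalg.eigvalsh(cov_spatial).min() > 0)
print(cov_patch.shape)
```

With fewer temporal frames than pixels in a patch, the raw sample covariance is singular, whereas the shrunk estimate is positive definite. The separable form also requires far fewer free parameters than a full covariance of the multi-spectral patch, which is the motivation stated above.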

We proposed a specific reconstruction strategy to refine jointly the statistical model of the nuisance and the reconstructed flux distribution of the circumstellar environment. This hierarchical estimation strategy derives estimators of the nuisance component mostly unbiased from the contamination of the sought off-axis objects. This method also prevents iterating between the characterization of the nuisance and the reconstruction task, thus leading to an algorithm that scales to the size of typical datasets recorded with the ASDI technique, both in terms of computational burden and memory storage. We apply regularization to the spatio-spectral flux distribution using suitable penalties. These penalties improve both rejection of residual starlight and fidelity of reconstructed features. We demonstrate the versatility of these priors in recovering various structures within the circumstellar environment, such as sharp edges and smooth transitions.

REXPACO ASDI operates in a fully unsupervised manner, allowing optimal estimation of all hyper-parameters from the dataset itself, without relying on prior knowledge about the disk properties or requiring trial-and-error reconstructions. Among the free hyper-parameters, the patch size is set based on the full width at half maximum of the off-axis PSF. The spatially adaptive regularization of noisy covariances through shrinkage is obtained via a derived closed-form expression, minimizing estimation risk for the statistical nuisance model. Hyper-parameters that determine the relative weights of reconstruction regularization can be estimated quasi-optimally by minimizing Stein’s unbiased risk estimator. However, this process is time-consuming because it requires multiple reconstructions with different penalty weights. As this setting is not the most critical, it can be approximated from the optimal setting obtained on a standard dataset by scaling regularization parameters with respect to the acquired number of frames.

We tested the proposed algorithm using injection of synthetic disks with different morphologies, orientations, and contrast levels. While these simulations could be complemented and refined by even more extensive experiments, they allowed us to identify the key capabilities and benefits of REXPACO ASDI. We showed that the proposed method is very versatile since it is able to reconstruct faithful spectral images of the considered disks for contrasts up to $10^{-6}$. One of our major results is the ability of REXPACO ASDI to reconstruct disks that are partly rotation-invariant, i.e. whose morphology makes the unmixing of the disk and the speckles particularly difficult when only leveraging the ADI diversity. These disks are known to be especially challenging to reconstruct without an additional source of diversity in the data, for instance provided by multiple observations as in RDI. Unlike this latter category of methods, the unmixing capability of REXPACO ASDI is achieved from a single ASDI dataset, i.e. the model of the nuisance is dataset-dependent. Using simulated flux distributions, we also illustrated that the theoretical fraction of flux lost due to unmixing ambiguities is negligible if the different spectral channels are processed jointly. This property of ASDI is due to the chromatic scaling of the speckle field caused by diffraction.

By resorting to a model ablation, applied both on synthetic and real disks, we illustrated that the joint spectral processing of REXPACO ASDI efficiently unmixes disk features from the nuisance and requires an accurate model of spatio-spectral correlations that are very strong in ASDI observations. Ignoring these correlations in the statistical model of the nuisance is particularly detrimental to the quality of the reconstruction.

As a proof of concept, we analyzed real SPHERE-IFS datasets containing six known circumstellar disks with various morphologies, including challenging features like spiral arms. Despite the absence of ground truth for these real objects, we observed that our method outperformed median ASDI, PCA ASDI, PACO ASDI, and the mono-spectral version of REXPACO in rejecting nuisances. Our approach significantly reduced non-physical artifacts, such as discontinuities from partial self-subtraction. Additionally, the reconstructed flux distribution showed improved spatial resolution compared to the original data, as we accounted for the blur introduced by the off-axis PSF through deconvolution in the forward image formation model. We also processed a dual-band dataset obtained with the IRDIS imager of the SPHERE instrument. Although spectral diversity was limited in this dataset, we illustrated that our approach enhanced reconstruction quality compared to the mono-spectral REXPACO algorithm designed for ADI observations.

Given the complementary capabilities of REXPACO ASDI, we can expect that it will be helpful to unveil new disks, to improve the spatio-spectral interpretation of their flux distribution, and thus to better understand the phenomena governing the formation of planetary systems, such as the intricate interactions between exoplanets and the disk material. In particular, we illustrated that the latter goal can be achieved by combining REXPACO ASDI with the detection algorithm PACO ASDI to unmix point-like sources from the circumstellar material. As an initialization step, this latter strategy only requires the rough locations (typically, with pixel-level accuracy) of candidate point-like sources to be unmixed from the disk material. Based on numerical experiments, we illustrated that this combined approach can significantly reduce the photometric bias occurring during the characterization of point-like sources embedded within disk material. As a case study, we applied this strategy to a dataset of PDS 70. Our results illustrated the ability of the proposed approach to identify components that are more likely disk features than point-like sources, even when they are mistaken for point sources at the initialization step.

As future work, we plan to improve the fidelity of the model of the nuisance, especially in the vicinity of the star where the model is slightly inaccurate. Disk reconstruction is very challenging in this area due to large stellar leakages that could be more accurately captured by accounting for the spatial correlations at a larger spatial scale than a patch of a few pixels. In addition, even though spectral diversity is very useful for retrieving a faithful flux distribution, a distortion can remain in some cases. This limitation could be tackled by building a more complex model leveraging deep learning techniques to model the nuisance distribution from multiple archival datasets.

Beyond the specific field of application of the proposed algorithm, its statistical modeling of the spatio-spectral correlations of the nuisance component and the estimation strategy of the underlying parameters are very general approaches. These methodological developments could be specialized to other large-scale reconstruction problems encountered in other imaging modalities such as microscopy or remote sensing. These fields often involve multi-spectral measurements, where signals of interest are faint and affected by multi-correlated and non-stationary nuisances.

Acknowledgements

We thank the anonymous Referee for her/his careful reading of the manuscript as well as her/his insightful comments and suggestions.

This project was funded in part by the French National Research Agency (ANR) under the project DDISK (grant ANR-21-CE31-0015) and by the Région Auvergne-Rhône-Alpes under the project DIAGHOLO. This work was also supported by the ANR under the France 2030 program (PEPR Origins, reference ANR-22-EXOR-0016), by the French National Programs (PNP and PNPS), and by the Action Spécifique Haute Résolution Angulaire (ASHRA) of CNRS/INSU co-funded by CNES.

OF, LD, and ÉT conceived and designed the method presented in this paper. OF developed, tested, and implemented the algorithm. OF selected the raw data. ML pre-reduced them through the SPHERE Data Center. OF performed the analysis of the data. OF, LD, ÉT, and ML wrote the manuscript.

Data Availability

The raw data used in this article are freely available on the ESO archive facility at http://archive.eso.org/eso/eso_archive_main.html. They were pre-reduced with the SPHERE Data Centre, jointly operated by OSUG/IPAG (Grenoble), PYTHEAS/LAM/CESAM (Marseille), OCA/Lagrange (Nice), Observatoire de Paris/LESIA (Paris), and Observatoire de Lyon/CRAL (Lyon, France). The resulting pre-processed datasets will be shared upon reasonable request to the corresponding author.

References

  • Aharon et al. (2006) Aharon M., Elad M., Bruckstein A., 2006, IEEE Transactions on Signal Processing, 54, 4311
  • Amara & Quanz (2012) Amara A., Quanz S. P., 2012, Monthly Notices of the Royal Astronomical Society, 427, 948
  • Bae et al. (2016) Bae J., Zhu Z., Hartmann L., 2016, The Astrophysical Journal, 819, 134
  • Bell et al. (2015) Bell C. P., Mamajek E. E., Naylor T., 2015, Monthly Notices of the Royal Astronomical Society, 454, 593
  • Benisty et al. (2015) Benisty M., et al., 2015, Astronomy & Astrophysics, 578, L6
  • Beuzit et al. (2019) Beuzit J.-L., et al., 2019, Astronomy & Astrophysics, 631, A155
  • Blomgren et al. (1997) Blomgren P., Chan T. F., Mulet P., Wong C.-K., 1997, in Proceedings of international conference on image processing. pp 384–387
  • Boccaletti et al. (2020) Boccaletti A., et al., 2020, Astronomy & Astrophysics, 637, L5
  • Boccaletti et al. (2021) Boccaletti A., et al., 2021, Astronomy & Astrophysics, 652, L8
  • Bodrito et al. (2024) Bodrito T., Flasseur O., Mairal J., Ponce J., Langlois M., Lagrange A.-M., 2024, Monthly Notices of the Royal Astronomical Society, p. stae2174
  • Bowler (2016) Bowler B. P., 2016, Publications of the Astronomical Society of the Pacific, 128, 102001
  • Bresson & Chan (2008) Bresson X., Chan T. F., 2008, Inverse Problems & Imaging, 2, 455
  • Brown et al. (2016) Brown A. G., et al., 2016, Astronomy & Astrophysics, 595, A2
  • Brown et al. (2021) Brown A. G., et al., 2021, Astronomy & Astrophysics, 649, A1
  • Buades et al. (2005) Buades A., Coll B., Morel J.-M., 2005, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition. pp 60–65
  • Carbillet et al. (2011) Carbillet M., et al., 2011, Experimental Astronomy, 30, 39
  • Charbonnier et al. (1997) Charbonnier P., Blanc-Féraud L., Aubert G., Barlaud M., 1997, IEEE Transactions on Image Processing, 6, 298
  • Chen et al. (2010) Chen Y., Wiesel A., Eldar Y. C., Hero A. O., 2010, IEEE Transactions on Signal Processing, 58, 5016
  • Chintarungruangchai et al. (2023) Chintarungruangchai P., Jiang G., Hashimoto J., Komatsu Y., Konishi M., 2023, New Astronomy, 100, 101997
  • Christiaens et al. (2019) Christiaens V., et al., 2019, Monthly Notices of the Royal Astronomical Society, 486, 5819
  • Christiaens et al. (2023) Christiaens V., et al., 2023, Journal of Open Source Software, 8
  • Christiaens et al. (2024) Christiaens V., et al., 2024, Astronomy & Astrophysics, 685, L1
  • Conte et al. (1995) Conte E., Lops M., Ricci G., 1995, IEEE Transactions on Aerospace and Electronic Systems, 31, 617
  • Craven & Wahba (1978) Craven P., Wahba G., 1978, Numerische Mathematik, 31, 377
  • Currie et al. (2017) Currie T., et al., 2017, The Astrophysical Journal Letters, 836, L15
  • Currie et al. (2022a) Currie T., Biller B., Lagrange A.-M., Marois C., Guyon O., Nielsen E., Bonnefoy M., De Rosa R., 2022a, arXiv preprint arXiv:2205.05696
  • Currie et al. (2022b) Currie T., et al., 2022b, Nature Astronomy, 6, 751
  • Dabov et al. (2007) Dabov K., Foi A., Katkovnik V., Egiazarian K., 2007, IEEE Transactions on Image Processing, 16, 2080
  • Delorme et al. (2017) Delorme P., et al., 2017, arXiv preprint arXiv:1712.06948
  • Dohlen et al. (2008) Dohlen K., et al., 2008, in Ground-based and Airborne Instrumentation for Astronomy II. pp 1266–1275
  • Doucet et al. (2006) Doucet C., Pantin E., Lagage P., Dullemond C., 2006, Astronomy & Astrophysics, 460, 117
  • Esposito et al. (2013) Esposito T. M., Fitzgerald M. P., Graham J. R., Kalas P., 2013, The Astrophysical Journal, 780, 25
  • Esposito et al. (2020) Esposito T. M., et al., 2020, The Astronomical Journal, 160, 24
  • Flasseur et al. (2018) Flasseur O., Denis L., Thiébaut É., Langlois M., 2018, Astronomy & Astrophysics, 618, A138
  • Flasseur et al. (2020a) Flasseur O., Denis L., Thiébaut É., Langlois M., 2020a, Astronomy & Astrophysics, 634, A2
  • Flasseur et al. (2020b) Flasseur O., Denis L., Thiébaut É., Langlois M., 2020b, Astronomy & Astrophysics, 637, A9
  • Flasseur et al. (2021) Flasseur O., Thé S., Denis L., Thiébaut É., Langlois M., 2021, Astronomy & Astrophysics, 651, A62
  • Flasseur et al. (2022) Flasseur O., Denis L., Thiébaut É., Langlois M., et al., 2022, in Adaptive Optics Systems VIII. pp 1175–1189
  • Flasseur et al. (2023a) Flasseur O., Bodrito T., Mairal J., Ponce J., Langlois M., Lagrange A.-M., 2023a, in 2023 31st European Signal Processing Conference (EUSIPCO). pp 1723–1727
  • Flasseur et al. (2023b) Flasseur O., Bodrito T., Mairal J., Ponce J., Langlois M., Lagrange A.-M., 2023b, Monthly Notices of the Royal Astronomical Society, 527, 1534
  • Flasseur et al. (2024) Flasseur O., Thiébaut E., Denis L., Langlois M., 2024, accepted in EUSIPCO, arXiv preprint arXiv:2403.07104
  • Follette (2023) Follette K. B., 2023, Publications of the Astronomical Society of the Pacific, 135, 093001
  • Gaia et al. (2018) Gaia C., et al., 2018, Astronomy & Astrophysics, 616
  • Garufi et al. (2020) Garufi A., et al., 2020, Astronomy & Astrophysics, 633, A82
  • Genton (2007) Genton M. G., 2007, Environmetrics: The official Journal of the International Environmetrics Society, 18, 681
  • Girard (1989) Girard D. A., 1989, Numerische Mathematik, 56, 1
  • Gonzalez et al. (2017) Gonzalez C. A. G., et al., 2017, The Astronomical Journal, 154, 7
  • Grady et al. (2009) Grady C., et al., 2009, The Astrophysical Journal, 699, 1822
  • Haffert et al. (2019) Haffert S., Bohn A., de Boer J., Snellen I., Brinchmann J., Girard J., Keller C., Bacon R., 2019, Nature Astronomy, 3, 749
  • Hom et al. (2024) Hom J., et al., 2024, Monthly Notices of the Royal Astronomical Society, 528, 6959
  • Isella et al. (2007) Isella A., Testi L., Natta A., Neri R., Wilner D., Qi C., 2007, Astronomy & Astrophysics, 469, 213
  • Isella et al. (2018) Isella A., et al., 2018, The Astrophysical Journal Letters, 869, L49
  • Isella et al. (2019) Isella A., Benisty M., Teague R., Bae J., Keppler M., Facchini S., Pérez L., 2019, The Astrophysical Journal Letters, 879, L25
  • Juillard et al. (2022) Juillard S., Christiaens V., Absil O., 2022, Astronomy & Astrophysics, 668, A125
  • Juillard et al. (2023) Juillard S., Christiaens V., Absil O., 2023, Astronomy & Astrophysics, 679, A52
  • Juillard et al. (2024) Juillard S., Stasevic S., Christiaens V., Absil O., Milli J., 2024, Astronomy & Astrophysics, 688, A185
  • Keppler et al. (2018) Keppler M., et al., 2018, Astronomy & Astrophysics, 617, A44
  • Kiefer et al. (2021) Kiefer S., Bohn A. J., Quanz S. P., Kenworthy M., Stolker T., 2021, Astronomy & Astrophysics, 652, A33
  • Kingma & Ba (2014) Kingma D. P., Ba J., 2014, arXiv preprint arXiv:1412.6980
  • Lafrenière et al. (2007) Lafrenière D., Marois C., Doyon R., Nadeau D., Artigau E., 2007, The Astrophysical Journal, 660, 770
  • Lafrenière et al. (2009) Lafrenière D., Marois C., Doyon R., Barman T., 2009, The Astrophysical Journal, 694, L148
  • Lagrange et al. (2009) Lagrange A.-M., et al., 2009, Astronomy & Astrophysics, 493, L21
  • Lagrange et al. (2010) Lagrange A.-M., et al., 2010, Science, 329, 57
  • Langlois et al. (2020) Langlois M., Gratton R., Lagrange A.-M., Delorme P., Boccaletti A., Bonnefoy M., Maire A.-L., et al., 2020, in revision for Astronomy & Astrophysics
  • Langlois et al. (2021) Langlois M., et al., 2021, Astronomy & Astrophysics, 651, A71
  • Lawson et al. (2020) Lawson K., et al., 2020, The Astronomical Journal, 160, 163
  • Lawson et al. (2022) Lawson K., Currie T., Wisniewski J. P., Groff T. D., McElwain M. W., Schlieder J. E., 2022, The Astrophysical Journal Letters, 935, L25
  • Lebrun et al. (2013) Lebrun M., Buades A., Morel J.-M., 2013, SIAM Journal on Imaging Sciences, 6, 1665
  • Ledoit & Wolf (2004) Ledoit O., Wolf M., 2004, Journal of Multivariate Analysis, 88, 365
  • Lisse et al. (2009) Lisse C. M., Chen C., Wyatt M., Morlok A., Song I., Bryden G., Sheehan P., 2009, The Astrophysical Journal, 701, 2019
  • Louchet & Moisan (2008) Louchet C., Moisan L., 2008, in 2008 16th European Signal Processing Conference. pp 1–5
  • Lu & Zimmerman (2005) Lu N., Zimmerman D. L., 2005, Statistics & Probability Letters, 73, 449
  • Mairal et al. (2009) Mairal J., Bach F., Ponce J., Sapiro G., Zisserman A., 2009, in IEEE International Conference on Computer Vision. pp 2272–2279
  • Maire et al. (2017) Maire A.-L., et al., 2017, Astronomy & Astrophysics, 601, A134
  • Marois et al. (2006) Marois C., Lafrenière D., Doyon R., Macintosh B., Nadeau D., 2006, The Astrophysical Journal, 641, 556
  • Marois et al. (2008) Marois C., Macintosh B., Barman T., Zuckerman B., Song I., Patience J., Lafrenière D., Doyon R., 2008, Science, 322, 1348
  • Marois et al. (2010) Marois C., Zuckerman B., Konopacky Q. M., Macintosh B., Barman T., 2010, Nature, 468, 1080
  • Marois et al. (2013) Marois C., Correia C., Véran J.-P., Currie T., 2013, International Astronomical Union, 8, 48
  • Marois et al. (2014) Marois C., Correia C., Galicher R., Ingraham P., Macintosh B., Currie T., De Rosa R., 2014, in SPIE Astronomical Instrumentation + Telescopes. p. 91480U
  • Mazoyer et al. (2020) Mazoyer J., et al., 2020, in Ground-based and Airborne Instrumentation for Astronomy VIII. pp 1080–1099
  • Mesa et al. (2019a) Mesa D., et al., 2019a, Monthly Notices of the Royal Astronomical Society, 488, 37
  • Mesa et al. (2019b) Mesa D., et al., 2019b, Astronomy & Astrophysics, 632, A25
  • Milli et al. (2012) Milli J., Mouillet D., Lagrange A.-M., Boccaletti A., Mawet D., Chauvin G., Bonnefoy M., 2012, Astronomy & Astrophysics, 545, A111
  • Milli et al. (2017) Milli J., et al., 2017, Astronomy & Astrophysics, 599, A108
  • Milli et al. (2019) Milli J., et al., 2019, Astronomy & Astrophysics, 626, A54
  • Müller et al. (2011) Müller A., van den Ancker M., Launhardt R., Pott J.-U., Fedele D., Henning T., 2011, Astronomy & Astrophysics, 530, A85
  • Müller et al. (2018) Müller A., et al., 2018, Astronomy & Astrophysics, 617, L2
  • Muro-Arena et al. (2018) Muro-Arena G., et al., 2018, Astronomy & Astrophysics, 614, A24
  • Muro-Arena et al. (2020) Muro-Arena G., et al., 2020, Astronomy & Astrophysics, 635, A121
  • Nielsen & Close (2010) Nielsen E. L., Close L. M., 2010, The Astrophysical Journal, 717, 878
  • Nielsen et al. (2008) Nielsen E. L., Close L. M., Biller B. A., Masciadri E., Lenzen R., 2008, The Astrophysical Journal, 674, 466
  • Pairet et al. (2019) Pairet B., Jacques L., Cantalloube F., 2019, Signal Processing with Adaptive Sparse Structured Representations, 1, 1
  • Pairet et al. (2021) Pairet B., Cantalloube F., Jacques L., 2021, Monthly Notices of the Royal Astronomical Society, 503, 3724
  • Pavlov et al. (2008) Pavlov A., Möller-Nilsson O., Feldt M., Henning T., Beuzit J.-L., Mouillet D., 2008, Advanced Software and Control for Astronomy II, 7019, 1093
  • Pueyo (2018) Pueyo L., 2018, Handbook of Exoplanets, pp 705–765
  • Ramani et al. (2012) Ramani S., Liu Z., Rosen J., Nielsen J.-F., Fessler J. A., 2012, IEEE Transactions on Image Processing, 21, 3659
  • Reggiani et al. (2018) Reggiani M., et al., 2018, Astronomy & Astrophysics, 611, A74
  • Ren (2023) Ren B. B., 2023, Astronomy & Astrophysics, 679, A18
  • Ren et al. (2018) Ren B., Pueyo L., Zhu G. B., Debes J., Duchêne G., 2018, The Astrophysical Journal, 852, 104
  • Ren et al. (2020) Ren B., Pueyo L., Chen C., Choquet É., Debes J. H., Duchêne G., Ménard F., Perrin M. D., 2020, The Astrophysical Journal, 892, 74
  • Riaud et al. (2006) Riaud P., Mawet D., Absil O., Boccaletti A., Baudoz P., Herwats E., Surdej J., 2006, Astronomy & Astrophysics, 458, 317
  • Ruane et al. (2019) Ruane G., et al., 2019, The Astronomical Journal, 157, 118
  • Schneider et al. (1999) Schneider G., et al., 1999, The Astrophysical Journal Letters, 513, L127
  • Schütz et al. (2005) Schütz O., Meeus G., Sterzik M., 2005, Astronomy & Astrophysics, 431, 175
  • Smith & Terrile (1984) Smith B. A., Terrile R. J., 1984, Science, 226, 1421
  • Soummer et al. (2012) Soummer R., Pueyo L., Larkin J., 2012, The Astrophysical Journal Letters, 755, L28
  • Sparks & Ford (2002) Sparks W. B., Ford H. C., 2002, The Astrophysical Journal, 578, 543
  • Stapper & Ginski (2022) Stapper L. M., Ginski C., 2022, Astronomy & Astrophysics, 668
  • Stein (1981) Stein C. M., 1981, The Annals of Statistics, pp 1135–1151
  • Teague et al. (2018) Teague R., Bae J., Bergin E. A., Birnstiel T., Foreman-Mackey D., 2018, The Astrophysical Journal Letters, 860, L12
  • Thatte et al. (2007) Thatte N., Abuter R., Tecza M., Nielsen E. L., Clarke F. J., Close L. M., 2007, Monthly Notices of the Royal Astronomical Society, 378, 1229
  • Thiébaut (2002) Thiébaut É., 2002, in Astronomical Data Analysis II. pp 174–183
  • Tilling et al. (2012) Tilling I., et al., 2012, Astronomy & Astrophysics, 538, A20
  • Traub & Oppenheimer (2010) Traub W. A., Oppenheimer B. R., 2010, Exoplanets, pp 111–156
  • Van Leeuwen (2007) Van Leeuwen F., 2007, Astronomy & Astrophysics, 474, 653
  • Vigan et al. (2010) Vigan A., Moutou C., Langlois M., Allard F., Boccaletti A., Carbillet M., Mouillet D., Smith I., 2010, Monthly Notices of the Royal Astronomical Society, 407, 71
  • Vigan et al. (2014) Vigan A., et al., 2014, in Ground-based and Airborne Instrumentation for Astronomy V. pp 1568–1577
  • Wagner et al. (2019) Wagner K., Stone J. M., Spalding E., Apai D., Dong R., Ertel S., Leisenring J., Webster R., 2019, The Astrophysical Journal, 882, 20
  • Wagner et al. (2023) Wagner K., et al., 2023, Nature Astronomy, 7, 1208
  • Wahba et al. (1985) Wahba G., et al., 1985, The Annals of Statistics, 13, 1378
  • Wahhaj et al. (2015) Wahhaj Z., et al., 2015, Astronomy & Astrophysics, 581, A24
  • Wahhaj et al. (2021) Wahhaj Z., et al., 2021, Astronomy & Astrophysics, 648, A26
  • Wainwright & Simoncelli (1999) Wainwright M. J., Simoncelli E. P., 1999, in Neural Information Processing Systems. pp 855–861
  • Werner et al. (2008) Werner K., Jansson M., Stoica P., 2008, IEEE Transactions on Signal Processing, 56, 478
  • Wolf et al. (2024) Wolf T. N., Jones B. A., Bowler B. P., 2024, The Astronomical Journal, 167, 92
  • Xie et al. (2022) Xie C., et al., 2022, arXiv preprint arXiv:2208.07915
  • Xuan et al. (2018) Xuan W. J., et al., 2018, The Astronomical Journal, 156, 156
  • Yu et al. (2011) Yu G., Sapiro G., Mallat S., 2011, IEEE Transactions on Image Processing, 21, 2481
  • Zhu et al. (1997) Zhu C., Byrd R. H., Lu P., Nocedal J., 1997, ACM Transactions on Mathematical Software, 23, 550
  • Zoran & Weiss (2011) Zoran D., Weiss Y., 2011, in IEEE International Conference on Computer Vision. pp 479–486

Appendix A Derivation of the maximum likelihood estimators for a weighted mixture of multi-variate Gaussian models

In this appendix, we detail the technical elements leading to the MLEs (9)-(12) of the parameters of a weighted mixture of multi-variate Gaussians, knowing the object of interest $\bm{u}$, see Sect. 2.3.1.

Under the assumptions of Sect. 2.2, the co-log-likelihood of the 4D patch $\bm{v}_{n}$ is given by Eq. (8) and can be rewritten as:

$$
\begin{aligned}
\mathscr{L}_{n} &= \frac{T\,K}{2}\,\log\big\lvert\mathbf{C}_{n}^{\mathrm{spec}}\big\rvert + \frac{T\,L}{2}\,\log\big\lvert\mathbf{C}_{n}^{\mathrm{spat}}\big\rvert \\
&\quad + \sum_{t=1}^{T}\left(\frac{K\,L}{2}\,\log\sigma_{n,t}^{2} + \frac{1}{2\,\sigma_{n,t}^{2}}\,\left\lVert\bm{r}_{n,t}\right\rVert_{(\mathbf{C}_{n}^{\mathrm{spec}})^{-1}\otimes(\mathbf{C}_{n}^{\mathrm{spat}})^{-1}}^{2}\right)\,,
\end{aligned}
\tag{121}
$$

with:

$$
\bm{r}_{n,t} = \bm{v}_{n,t} - \bm{\mu}_{n}^{\mathrm{spec}} - [\mathbf{M}\,\bm{u}]_{n,t}
\tag{122}
$$

the residuals in the $t$-th frame of the $n$-th patch, and where we used the following properties of the Kronecker product of any $n\times n$ matrix $\mathbf{A}$ and $m\times m$ matrix $\mathbf{B}$: $|\mathbf{A}\otimes\mathbf{B}| = |\mathbf{A}|^{m}\,|\mathbf{B}|^{n}$ and $(\mathbf{A}\otimes\mathbf{B})^{-1} = \mathbf{A}^{-1}\otimes\mathbf{B}^{-1}$.
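As a quick numerical sanity check of the two Kronecker-product properties used above, the following NumPy sketch draws small random matrices and tests both identities. The helper `random_spd` and the sizes `n = 3`, `m = 4` are illustrative assumptions and are not part of the paper; the matrices are made symmetric positive-definite only so that they resemble covariance matrices and their determinants are positive.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_spd(size, rng):
    """Draw a random symmetric positive-definite matrix (illustrative helper)."""
    a = rng.standard_normal((size, size))
    return a @ a.T + size * np.eye(size)


n, m = 3, 4                      # A is n x n, B is m x m (toy sizes)
A = random_spd(n, rng)
B = random_spd(m, rng)
AB = np.kron(A, B)               # (n*m) x (n*m) Kronecker product

# Property 1: |A (x) B| = |A|^m |B|^n, compared on log-determinants for stability.
_, logdet_AB = np.linalg.slogdet(AB)
_, logdet_A = np.linalg.slogdet(A)
_, logdet_B = np.linalg.slogdet(B)
logdet_identity_holds = np.isclose(logdet_AB, m * logdet_A + n * logdet_B)

# Property 2: (A (x) B)^{-1} = A^{-1} (x) B^{-1}.
inverse_identity_holds = np.allclose(
    np.linalg.inv(AB), np.kron(np.linalg.inv(A), np.linalg.inv(B))
)

print(logdet_identity_holds, inverse_identity_holds)
```

With $\mathbf{A}=\mathbf{C}_{n}^{\mathrm{spec}}$ ($L\times L$) and $\mathbf{B}=\mathbf{C}_{n}^{\mathrm{spat}}$ ($K\times K$), the first property gives the factors $K$ and $L$ in front of the two log-determinants of Eq. (121).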

To obtain the MLEs, we differentiate the expression of $\mathscr{L}_{n}$ given in Eq. (121):

$$
\begin{aligned}
\partial\mathscr{L}_{n} ={}& \tfrac{T\,K}{2}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spec}}\Big) + \tfrac{T\,L}{2}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spat}}\Big) \\
&+ \sum_{t=1}^{T}\biggl\{\tfrac{K\,L}{2}\,\frac{\partial\sigma_{n,t}^{2}}{\sigma_{n,t}^{2}} - \frac{\partial\sigma_{n,t}^{2}}{2\,\sigma_{n,t}^{4}}\,\left\lVert\bm{r}_{n,t}\right\rVert_{(\mathbf{C}_{n}^{\mathrm{spec}})^{-1}\otimes(\mathbf{C}_{n}^{\mathrm{spat}})^{-1}}^{2} \\
&\quad + \frac{1}{\sigma_{n,t}^{2}}\,\bm{r}_{n,t}^{\top}\,\Big(\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\Big)\,\partial\bm{r}_{n,t} \\
&\quad - \frac{1}{2\,\sigma_{n,t}^{2}}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spec}}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\mathbf{V}_{n,t}^{\top}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\Big) \\
&\quad - \frac{1}{2\,\sigma_{n,t}^{2}}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spat}}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\mathbf{V}_{n,t}^{\top}\Big)\biggr\}\,,
\end{aligned}
\tag{123}
$$

where we obtained the last two terms by rewriting the squared norm term in Eq. (121) as:

$$
\left\lVert\bm{r}_{n,t}\right\rVert_{(\mathbf{C}_{n}^{\mathrm{spec}})^{-1}\otimes(\mathbf{C}_{n}^{\mathrm{spat}})^{-1}}^{2} = \operatorname{tr}\Big(\mathbf{V}_{n,t}^{\top}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\Big)
\tag{124}
$$

with $\mathbf{V}_{n,t}$ the $K\times L$ matrix whose element at row $k$ and column $\ell$ is $[\bm{r}_{n,t}]_{k,\ell}$. The following set of conditions is sufficient for the partial derivatives of $\mathscr{L}_{n}$ with respect to $\bm{\mu}_{n}^{\mathrm{spec}}$, $\sigma_{n,t}^{2}$, $\mathbf{C}_{n}^{\mathrm{spec}}$, and $\mathbf{C}_{n}^{\mathrm{spat}}$, respectively, to be equal to zero:

$$
\begin{cases}
\sum_{t=1}^{T}\sigma_{n,t}^{-2}\,\bm{r}_{n,t}=0\,,\\[8pt]
\frac{K\,L}{\sigma_{n,t}^{2}}-\frac{1}{\sigma_{n,t}^{4}}\,\left\lVert\bm{r}_{n,t}\right\rVert_{\left(\mathbf{C}_{n}^{\mathrm{spec}}\right)^{-1}\otimes\left(\mathbf{C}_{n}^{\mathrm{spat}}\right)^{-1}}^{2}=0\,,\\[8pt]
T\,K\,\mathbf{I}-\sum\limits_{t=1}^{T}\frac{1}{\sigma_{n,t}^{2}}\,\left(\mathbf{C}_{n}^{\mathrm{spec}}\right)^{-1}\,\mathbf{V}_{n,t}^{\top}\,\left(\mathbf{C}_{n}^{\mathrm{spat}}\right)^{-1}\,\mathbf{V}_{n,t}=\mathbf{0}\,,\\[8pt]
T\,L\,\mathbf{I}-\sum\limits_{t=1}^{T}\frac{1}{\sigma_{n,t}^{2}}\,\left(\mathbf{C}_{n}^{\mathrm{spat}}\right)^{-1}\,\mathbf{V}_{n,t}\,\left(\mathbf{C}_{n}^{\mathrm{spec}}\right)^{-1}\,\mathbf{V}_{n,t}^{\top}=\mathbf{0}\,,
\end{cases} \tag{125}
$$

with $\mathbf{I}$ the identity matrix. These conditions hold if:

$$
\begin{cases}
\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}=\frac{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}\,\left(\bm{v}_{n,t}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\right)}{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}}\,,\\[8pt]
\widehat{\sigma}_{n,t}^{2}=\tfrac{1}{K\,L}\,\left\lVert\bm{v}_{n,t}-\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\bm{u}]_{n,t}\right\rVert_{\left(\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-1}\otimes\left(\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}\right)^{-1}}^{2}\,,\\[8pt]
\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}=\tfrac{1}{T\,K}\sum\limits_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}^{\top}\,\left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}\,,\\[8pt]
\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}=\tfrac{1}{T\,L}\sum\limits_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}\,\left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}^{\top}\,.
\end{cases} \tag{126}
$$

These correspond to the expressions given in Eqs. (9)–(12).
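As an illustration, the coupled expressions of Eq. (126) can be iterated numerically, and the identity of Eq. (124) can be checked on the way. The sketch below is a minimal NumPy version under assumptions that are not part of this appendix: the function names are hypothetical, the array `v` is assumed to hold the residuals $\bm{v}_{n,t}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}$ of a single patch $n$ arranged as a $T\times K\times L$ array, the iteration starts from unit variances and identity covariances, and it runs for a fixed number of sweeps. It includes neither the shrinkage step used in the paper nor any normalization of the scale shared between $\widehat{\sigma}_{n,t}^{2}$, $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ and $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$.

```python
import numpy as np


def sq_norm_kron(V, c_spec, c_spat):
    """Left-hand side of Eq. (124): r^T (C_spec^-1 kron C_spat^-1) r."""
    r = V.flatten(order="F")  # column-major vec(V): r[k + K*l] = V[k, l]
    W = np.kron(np.linalg.inv(c_spec), np.linalg.inv(c_spat))
    return float(r @ W @ r)


def sq_norm_trace(V, c_spec, c_spat):
    """Right-hand side of Eq. (124): tr(V^T C_spat^-1 V C_spec^-1)."""
    A = np.linalg.solve(c_spat, V)  # C_spat^-1 V, shape (K, L)
    # tr(X C_spec^-1) = tr(C_spec^-1 X) with X = V^T C_spat^-1 V
    return float(np.trace(np.linalg.solve(c_spec, V.T @ A)))


def fixed_point_estimates(v, n_iter=20):
    """Iterate the four updates of Eq. (126) for one patch.

    v : array of shape (T, K, L), assumed to already have the
        off-axis contribution [M u] removed (illustrative assumption).
    """
    T, K, L = v.shape
    sigma2 = np.ones(T)   # assumed initialization
    c_spec = np.eye(L)    # assumed initialization
    c_spat = np.eye(K)    # assumed initialization
    for _ in range(n_iter):
        # Weighted temporal mean (first line of Eq. (126)).
        w = 1.0 / sigma2
        mu = np.tensordot(w, v, axes=1) / w.sum()  # shape (K, L)
        V = v - mu                                  # residuals V_t
        # Per-frame scaling factors (second line), via the trace form.
        sigma2 = np.array(
            [sq_norm_trace(Vt, c_spec, c_spat) for Vt in V]
        ) / (K * L)
        # Spectral covariance (third line), L x L.
        c_spec = sum(
            Vt.T @ np.linalg.solve(c_spat, Vt) / s
            for Vt, s in zip(V, sigma2)
        ) / (T * K)
        # Spatial covariance (fourth line), K x K.
        c_spat = sum(
            Vt @ np.linalg.solve(c_spec, Vt.T) / s
            for Vt, s in zip(V, sigma2)
        ) / (T * L)
    return mu, sigma2, c_spec, c_spat
```

Calling `fixed_point_estimates(v)` on a stack with enough frames for both covariance estimates to be invertible returns the four estimates. Because the spatial covariance is updated last in each sweep, the fourth condition of Eq. (125) holds up to rounding for the returned values, whereas the other conditions are only met once the iteration has converged. The trace form is preferred over the explicit Kronecker product because it avoids building a $KL\times KL$ matrix.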

Appendix B Additional reconstruction results on simulated synthetic disks

This appendix complements the results presented in Sects. 4.4 and 4.5 regarding the reconstruction of the flux distributions for synthetic disks. Figures 24, 25, and 26 report line cuts extracted from Figs. 11-12, 13-14, and 15-16, respectively, comparing the proposed approach to the median ASDI, PCA ASDI and PACO ASDI baselines. Figure 27 reports line cuts extracted from Fig. 18, comparing the proposed REXPACO ASDI algorithm to its mono-spectral version (REXPACO ADI; Flasseur et al., 2021).

Figure 24: Line cuts along the three profiles defined in Fig. 10, extracted from the reconstructions of synthetic elliptical disks shown in Figs. 11-12 for a contrast $\alpha_{\text{gt}}=5\times 10^{-6}$. Panel (a) compares median ASDI, PCA ASDI and REXPACO ASDI reconstructions $\mathbf{B}\,\widetilde{\bm{u}}$ to the ground truth $\mathbf{B}\,\bm{u}_{\text{gt}}$. These quantities are shown within the range $[-0.25\times\max(\mathbf{B}\,\bm{u}_{\text{gt}});\,2\times\max(\mathbf{B}\,\bm{u}_{\text{gt}})]$ to highlight both over-estimation and under-estimation of the signal of interest. The minimum value (zero) of the ground truth $\mathbf{B}\,\bm{u}_{\text{gt}}$ is also marked on the right vertical axis. Panel (b) compares PACO ASDI and REXPACO ASDI reconstructions $\widetilde{\bm{u}}$ against the ground truth $\bm{u}_{\text{gt}}$. The ground truth flux distribution $\mathbf{B}\,\bm{u}_{\text{gt}}$ and the associated slice-cut locations are recalled in insets. Dataset: HD 172555 (2015-07-11); see Table 2 for the observation parameters.
Figure 25: Same as Fig. 24, for synthetic circular disks; see the reconstructed flux distributions in Figs. 13-14.
Figure 26: Same as Fig. 24, for synthetic spiral disks; see the reconstructed flux distributions in Figs. 15-16.
Figure 27: Comparison between ADI and ASDI on the reconstruction of synthetic disks. The considered elliptical, circular, and spiral disks are injected ($\alpha_{\text{gt}}=1\times 10^{-5}$, $\Delta_{\text{par}}=30^{\circ}$, YJ band) within a real SPHERE-IFS dataset and processed with the mono-spectral algorithm REXPACO ADI (Flasseur et al., 2021) and with its multi-spectral version REXPACO ASDI proposed in this paper. Line cuts along the three profiles defined in Fig. 10 are extracted from the reconstructions displayed in Fig. 18. The ground truth flux distribution $\mathbf{B}\,\bm{u}_{\text{gt}}$ and the associated slice-cut locations are recalled in insets. Quantities $\mathbf{B}\,\bm{u}_{\text{gt}}$ and $\mathbf{B}\,\widetilde{\bm{u}}$ are shown within the range $[-0.25\times\max(\mathbf{B}\,\bm{u}_{\text{gt}});\,2\times\max(\mathbf{B}\,\bm{u}_{\text{gt}})]$ to highlight both over-estimation and under-estimation of the signal of interest. The minimum value (zero) of the ground truth $\mathbf{B}\,\bm{u}_{\text{gt}}$ is also marked on the right vertical axis. Dataset: HD 172555 (2015-07-11); see Table 2 for the observation parameters.