REXPACO ASDI: Joint unmixing and deconvolution of the circumstellar environment by angular and spectral differential imaging

Olivier Flasseur¹, Loïc Denis², Éric Thiébaut¹, Maud Langlois¹
¹Université de Lyon, Université Lyon1, ENS de Lyon, CNRS, Centre de Recherche Astrophysique de Lyon UMR 5574, Saint-Genis-Laval, France
²Université de Lyon, UJM-Saint-Etienne, CNRS, Institut d'Optique Graduate School, Laboratoire Hubert Curien UMR 5516, Saint-Étienne, France

Abstract

Angular and spectral differential imaging is an observational technique of choice to investigate the immediate vicinity of stars. By leveraging the relative angular motion and spectral scaling between on-axis and off-axis sources, post-processing techniques can separate residual star light from light emitted by surrounding objects such as circumstellar disks or point-like objects. This paper introduces a new algorithm that jointly unmixes these components and deconvolves disk images. The proposed algorithm is based on a statistical model of the residual star light, accounting for its spatial and spectral correlations. These correlations are crucial yet remain inadequately modeled by existing reconstruction algorithms. We employ dedicated shrinkage techniques to estimate the large number of parameters of our correlation model in a data-driven fashion. We show that the resulting separable model of the spatial and spectral covariances captures the star light very accurately, enabling its efficient suppression. We apply our method to datasets from the VLT/SPHERE instrument and compare its performance with standard algorithms (median subtraction, PCA, PACO). We demonstrate that considering the multiple correlations within the data significantly improves reconstruction quality, resulting in better preservation of both disk morphology and photometry. With its unique joint spectral modeling, the proposed algorithm can reconstruct disks with circular symmetry (e.g., rings, spirals) at intensities one million times fainter than the star, without needing additional reference datasets free from off-axis objects.

keywords:

techniques: high angular resolution – techniques: image processing – methods: numerical – methods: statistical – methods: data analysis

Publication year: 2024

1 Introduction

Direct imaging is a recent observational technique that makes it possible to probe the close environment of young stars (Traub & Oppenheimer, 2010; Bowler, 2016). The targeted tasks are threefold (see e.g., Pueyo (2018); Currie et al. (2022a); Follette (2023) for reviews): (i) detecting (massive) exoplanets, (ii) characterizing their physical properties by estimating their spectral energy distribution (SED), and (iii) reconstructing the flux distribution image of the circumstellar environment surrounding young nearby stars. In this paper, we primarily focus on the latter objective and we also address the unmixing of spatially resolved disks from point-like sources.

Circumstellar disks are key components of the intricate processes governing planetary formation. As an illustration, several studies performed in total intensity or in polarimetry (Esposito et al., 2020; Garufi et al., 2020; Langlois et al., 2020) have revealed the presence of a diversity of structures such as spirals, warps, rings, gaps, shadows and asymmetries, which are considered potential indicators of the presence of exoplanets (Benisty et al., 2015; Muro-Arena et al., 2020). High-quality reconstruction of the circumstellar environment from high-contrast data thus offers a unique vantage point to understand the physical processes governing these objects (Keppler et al., 2018; Haffert et al., 2019; Mesa et al., 2019b). It also makes it possible to study the intricate interactions between exoplanets and disks, and to provide critical insights into the mechanisms steering the evolution of exoplanetary systems.

In this context, direct imaging faces two observational challenges. First, the objects of interest (i.e., spatially resolved disks and exoplanets appearing as point-like sources) have a very low contrast (typically lower than $10^{-4}$ in the infrared). Throughout this paper, we define the contrast of the objects of interest as the ratio of their peak intensity to the star peak intensity; this also corresponds to the classical definition of contrast for single-pixel point objects. Second, these off-axis objects are located in the immediate vicinity of the star, thus necessitating high angular resolution to separate them from the star (disks are generally observed inside an angle of less than 1 arcsecond). The angular resolution requirement can be achieved using large ground-based telescopes equipped with extreme adaptive optics systems to compensate in real-time for atmospheric turbulence. The contrast is further improved by filtering most of the star light with a coronagraph. However, this is not sufficient to recover interpretable images of the circumstellar environment, as residual star light still dominates (see Fig. 1). To further reduce the impact of star light, differential imaging is employed. This observational technique involves capturing several images in configurations that introduce diversity (e.g., relative motion in ADI or SDI) between the objects of interest and the star diffraction patterns, known as speckles, caused by diffraction effects in the telescope pupil. There are two primary configurations for differential imaging. In angular differential imaging (ADI; Marois et al. (2006)), a sequence of images is acquired over a few hours of observation. During the acquisition, the pupil of the telescope keeps a constant orientation (so-called pupil-tracking mode), while the field of view rotates due to Earth’s rotation.
This leads to a rotation of the objects of interest around the optical axis in the images, while quasi-static speckles created by uncorrected optical aberrations stay mostly fixed between individual exposures. In spectral differential imaging (SDI; Sparks & Ford (2002); Thatte et al. (2007)), images are captured simultaneously in several spectral bands. Due to diffraction, the speckle pattern scales linearly with wavelength, in first approximation. By properly rescaling the images spectrally, the speckle patterns are aligned, while the objects of interest undergo radial motion and homothety due to the scaling transform. ADI and SDI can be advantageously combined to form angular and spectral differential imaging (ASDI) sequences, see e.g. Vigan et al. (2010); Christiaens et al. (2019); Kiefer et al. (2021). The images recorded in ADI, SDI, or ASDI are then combined in a post-processing step to enhance contrast and obtain interpretable images of the circumstellar environment.
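The two kinds of diversity described above can be illustrated with a minimal sketch. It assumes an idealized geometry (pure rotation about the star, then a rescaling by $\lambda_{\text{ref}}/\lambda$); the helper name `apparent_position`, the rotation sign convention, and all numerical values are illustrative assumptions, not taken from the paper.

```python
import math

def apparent_position(x0, y0, field_angle_deg, wavelength, wavelength_ref):
    """Apparent (x, y) offset from the star of an off-axis source in a
    speckle-aligned frame (hypothetical helper, idealized geometry)."""
    theta = math.radians(field_angle_deg)
    # ADI: the field of view rotates about the optical axis over time.
    x_rot = x0 * math.cos(theta) - y0 * math.sin(theta)
    y_rot = x0 * math.sin(theta) + y0 * math.cos(theta)
    # SDI: rescaling each channel by lambda_ref / lambda aligns the speckles
    # but moves off-axis sources radially.
    scale = wavelength_ref / wavelength
    return scale * x_rot, scale * y_rot

# A source 40 pixels from the star, at three illustrative field angles and
# two illustrative wavelengths (arbitrary but consistent units).
for angle in (0.0, 15.0, 30.0):
    for lam in (1.0, 1.6):
        x, y = apparent_position(40.0, 0.0, angle, lam, wavelength_ref=1.0)
        print(f"angle={angle:5.1f} deg, lambda={lam}: x={x:7.2f}, y={y:7.2f}")
```

In this idealized picture, both the rotational and the radial displacements are proportional to the separation from the star, so the diversity available to separate a source from the speckles shrinks close to the optical axis.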

The classical post-processing pipeline typically begins by estimating the stellar component, for instance, by averaging a stack of images with aligned speckles. This stellar component is then subtracted from the data, followed by the alignment and stacking of the residuals to compensate for rotations and scaling of the field of view. Beyond simple averaging, the stellar component can be estimated using various techniques: a median approach (Marois et al., 2006; Lagrange et al., 2009), a weighted linear combination (LOCI methods; Lafrenière et al. (2007); Marois et al. (2013); Marois et al. (2014); Wahhaj et al. (2015)), or principal component analysis (PCA-based methods; Soummer et al. (2012); Amara & Quanz (2012)). All of these methods can be applied to spatio-temporo-spectral data from IFS by leveraging differential diversity in various ways, see e.g. Christiaens et al. (2019); Kiefer et al. (2021) for PCA. This can be done using ADI alone (i.e., a different model for each spectral channel), SDI alone (i.e., a different model for each temporal frame), ADI+SDI (i.e., two models: the first exploiting ADI diversity and the second exploiting SDI diversity in the ADI residuals), SDI+ADI (i.e., the same two models applied in the reverse order), or ASDI (i.e., a single model that jointly leverages both angular and spectral diversities). This latter strategy, in the specific case of PCA, is known as COmbined Differential Imaging (CODI; Kiefer et al. (2021)). However, these methods share a common drawback: part of the signal of interest is included in the estimated stellar component, resulting in its loss when the star component is subtracted from the data.
This critical phenomenon, known as self-subtraction (Milli et al., 2012; Pairet et al., 2019), is particularly problematic close to the star, where the diversity between the disk and the star light is more limited (the apparent displacement of the off-axis objects due to the rotations and scaling transforms being separation-dependent). Consequently, disentangling the component of interest from the star light is even more difficult nearer the star. Self-subtraction can introduce various artifacts, such as partial replicas, suppression of some smooth extended structures, and smearing or non-uniform attenuations of disk features (Milli et al., 2012).
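The classical pipeline and the self-subtraction it suffers from can be reproduced on a toy example. The sketch below is a deliberately simplified assumption: the field rotates by exact multiples of 90 degrees so that `np.rot90` performs the derotation without interpolation, whereas real pipelines handle arbitrary parallactic angles. The function name `classical_adi` and the synthetic data are hypothetical.

```python
import numpy as np

def classical_adi(cube, quarter_turns):
    """Toy median-subtraction ADI.
    cube: (T, H, H) speckle-aligned frames; quarter_turns: field rotation of
    each frame, in units of 90 degrees (toy assumption)."""
    star = np.median(cube, axis=0)      # estimate of the stellar component
    residuals = cube - star             # subtract it from every frame
    # Undo the field rotation of each residual frame, then stack.
    derotated = [np.rot90(r, -k) for r, k in zip(residuals, quarter_turns)]
    return np.mean(derotated, axis=0)

rng = np.random.default_rng(0)
speckles = rng.normal(size=(9, 9))      # static speckle pattern
turns = [0, 1, 2, 3]

# Case 1: a point source, which moves from frame to frame.
point = np.zeros((9, 9))
point[2, 6] = 1.0
cube_point = np.stack([speckles + np.rot90(point, k) for k in turns])
recovered_point = classical_adi(cube_point, turns)

# Case 2: a structure invariant under the rotations (a crude "ring").
ring = sum(np.rot90(point, k) for k in turns)
cube_ring = np.stack([speckles + np.rot90(ring, k) for k in turns])
recovered_ring = classical_adi(cube_ring, turns)

print("point source flux recovered:", recovered_point[2, 6])
print("ring flux recovered:", np.abs(recovered_ring).max())
```

In this toy setting the moving point source survives the median subtraction, while the rotation-invariant structure is absorbed entirely into the stellar estimate, which is the extreme case of self-subtraction.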

To mitigate the impact of self-subtraction, several approaches have been considered. Some works perform iterative PCA in which the current disk reconstruction is subtracted from the measurements to progressively improve the estimation of the star light (Pairet et al., 2019; Stapper & Ginski, 2022). In the same vein, data imputation strategies (Ren et al., 2020; Ren, 2023) discard measurements affected by the disk, either through a data-driven approach or based on prior knowledge of its shape and location, during the estimation of the star light contribution. This type of approach remains limited by the strategy designed to discard fractions of the field of view impacted by the disk on each image. Other works consider a parametric model of a disk and iteratively adjust its parameters (Esposito et al., 2013; Currie et al., 2017; Milli et al., 2017) by minimizing the resulting residuals, possibly by modeling the effect of the self-subtraction (Lawson et al., 2020; Mazoyer et al., 2020; Hom et al., 2024). These approaches are mainly applicable to simple disk structures, such as ellipses, which are typical morphologies of debris disks. Another technique, Reference Differential Imaging (RDI; see Smith & Terrile (1984); Lafrenière et al. (2009); Lagrange et al. (2010) for early examples of applications), employs additional images of one or more reference stars without known exoplanets or disks. These additional data can be captured simultaneously with the observation of the target star using the star-hopping technique (Wahhaj et al., 2021), or they can be drawn from a large library of archival observations (Ren et al., 2018; Xuan et al., 2018). RDI can be effectively combined with other observing strategies to simultaneously exploit their diversity.
For instance, when integrating RDI with ADI, the nuisance component can be estimated and suppressed using PCA (Ruane et al., 2019; Xie et al., 2022; Juillard et al., 2024) or deep learning techniques (Chintarungruangchai et al., 2023; Wolf et al., 2024; Bodrito et al., 2024). RDI reconstructions can also be constrained by additional observations from other imaging modalities, such as polarimetry, where speckles and disk components behave differently than in total intensity images recorded with ADI/ASDI (Lawson et al., 2022). In practice, the effectiveness of RDI approaches depends heavily on the similarity between the reference and the actual observations, including factors such as star brightness, spectrum, and observation conditions. This degree of similarity becomes increasingly critical as we search for fainter objects. Finally, a last category of approaches jointly addresses the problem of estimating star light residuals and reconstructing the flux distribution of the disk and exoplanets. In ADI, three approaches based on an inverse-problems formulation were recently proposed: MAYONNAISE (Pairet et al., 2021), MUSTARD (Juillard et al., 2022, 2023), and REXPACO (Flasseur et al., 2021, 2022). These algorithms employ different strategies and regularization penalties in the inversion to separate the components of interest. In a first step, MAYONNAISE uses iterative PCA to initialize the inversion process. Building on this preliminary reconstruction, a second step involves estimating and unmixing multiple components by jointly minimizing a data fidelity term. The unmixed components are the star light residuals (restricted to lie within the subspace identified in the first step), the disk (enforced to have a sparse representation in a shearlet basis), and the exoplanets (restricted to be sparse). Non-negativity constraints are also enforced during the minimization. MUSTARD is a variant of MAYONNAISE that primarily differs in the formulation of the direct model.
The reconstructed speckle field is constrained to be identical along the temporal axis to account explicitly for its quasi-static behavior. Unlike MAYONNAISE, MUSTARD does not use iterative PCA for initialization, nor does it enforce sparsity of the disk component in a shearlet basis. Additionally, MUSTARD can incorporate a regularization term based on a predefined mask, which helps resolve ambiguities between the speckle field and portions of the disk that are rotation invariant. Both MAYONNAISE and MUSTARD assume noise to be white, independent, and identically distributed. REXPACO follows a quite different modeling approach, as it does not explicitly estimate the residual star light in each image. Instead, it builds a statistical and local model of all fluctuations other than the component of interest (i.e., noise and star light). REXPACO learns the spatial correlations of these fluctuations at the scale of 2D image patches, following an approach initially introduced for exoplanet detection in the PACO algorithm (Flasseur et al., 2018), based on PAtch COvariances. The component of interest is deconvolved with an edge-preserving smoothness regularization and a positivity constraint. Further extending REXPACO for ADI post-processing, a recent enhancement replaces its multivariate Gaussian model of the nuisance with a scaled mixture of multivariate Gaussian models (Flasseur et al., 2022). This improved model offers better fidelity to the observations and enhanced robustness against outlier data (e.g., defective pixels or large stellar leakages), which are identified and neutralized in a data-driven manner. In ADI, REXPACO can be combined with PACO to disentangle user-identified candidate point-like sources from the circumstellar environment.

In this paper, we address the problem of reconstructing circumstellar disks from ASDI sequences through joint multi-spectral post-processing. Compared to ADI, this raises several challenges: (i) modeling the temporal and spectral fluctuations of the residual star light, (ii) jointly exploiting both temporal and spectral information to effectively extract the component of interest, and (iii) ensuring the tractability of estimating high-dimensional models from large datasets. As an illustration of point (iii), typical ASDI datasets produced by the Integral Field Spectrograph (IFS) of the Spectro-Polarimetry High-contrast Exoplanet Research instrument (SPHERE; Beuzit et al. (2019)) at the Very Large Telescope (VLT) have $N=290\times 290$ pixels, $L=39$ spectral bands, and $T\approx 100$ individual exposures. Several hundred million pixel measurements must then be combined to produce a multi-spectral reconstruction of the component of interest. Modeling the full covariance associated with this volume of $NLT$ measurements theoretically involves estimating $NLT(NLT+1)/2$ degrees of freedom from the data, which is not feasible without making approximations to the covariance.
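As a quick order-of-magnitude check of the figures quoted above, the following sketch counts the measurements in a typical sequence and applies the generic fact that a symmetric $d\times d$ covariance matrix has $d(d+1)/2$ free entries. The helper name `symmetric_dof` is an illustrative assumption.

```python
# Typical SPHERE-IFS dataset dimensions quoted in the text.
N = 290 * 290   # pixels per image
L = 39          # spectral channels
T = 100         # temporal frames (approximate)

n_measurements = N * L * T

def symmetric_dof(d):
    """Number of free entries of a symmetric d x d matrix."""
    return d * (d + 1) // 2

print(f"measurements: {n_measurements:,}")
print(f"free entries of a full covariance: {symmetric_dof(n_measurements):.3e}")
```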

Table 1: Summary of the main notations.

Notation	Range	Definition
$\triangleright$ Constants and related indexes
$K$	$\mathbb{N}^{*}$	number of pixels in a patch
$N$	$\mathbb{N}^{*}$	number of pixels in a dataset
$N^{\prime}$	$\mathbb{N}^{*}$	number of pixels in a reconstructed image
$T$	$\mathbb{N}^{*}$	number of temporal frames
$L$	$\mathbb{N}^{*}$	number of spectral channels
$L_{\text{eff}}$	$\mathbb{N}^{*}$	effective number of spectral channels
$n^{(^{\prime})}$	$\llbracket 1,N^{(^{\prime})}\rrbracket$	pixel index
$t$	$\llbracket 1,T\rrbracket$	temporal index
$\ell$	$\llbracket 1,L\rrbracket$	spectral index
$\mathbb{K}$	–	set of patch locations
$\triangleright$ Data quantities
$\bm{v}$	$\mathbb{R}^{NTL}$	ASDI sequence (with speckles aligned)
$\bm{f}$	$\mathbb{R}^{NTL}$	nuisance component
$\mathbf{E}_{n(,t)(,\ell)}$	$\mathbb{R}^{NTL\times K(L)(T)}$	patch extractor at pixel $n$ (, time $t$ ) (, channel $\ell$ )
$\mathbf{V}_{n,t}$	$\mathbb{R}^{K\times L}$	residual multi-spectral patch at pixel $n$ , time $t$
$\bm{u}$	$\mathbb{R}_{+}^{N^{\prime}L}$	spatio-spectral flux distribution
$\triangleright$ Operators
$\mathbf{M}$	$\mathbb{R}^{N^{\prime}L\times NTL}$	direct image formation model: $\mathbf{M}=\mathbf{S}\,\mathbf{Z}\,\mathbf{A}\,\mathbf{B}\,\mathbf{R}$
$\mathbf{F}_{t}$	$\mathbb{R}^{N^{\prime}L\times NL}$	sparse operator at time $t$ : $\mathbf{F}_{t}=(\mathbf{S}\,\mathbf{Z}\,\mathbf{A}\,\mathbf{R})_{t}$
$\mathbf{B}$	$\mathbb{R}^{N^{\prime}TL\times N^{\prime}L}$	convolution by off-axis PSF
$\mathbf{R}$	$\mathbb{R}^{N^{\prime}TL\times N^{\prime}TL}$	apparent field rotation
$\mathbf{A}$	$\mathbb{R}^{N^{\prime}TL\times N^{\prime}TL}$	coronagraph attenuation
$\mathbf{Z}$	$\mathbb{R}^{N^{\prime}TL\times MTL}$	field of view cropping
$\mathbf{S}$	$\mathbb{R}^{MTL\times NTL}$	spectral scaling
$\odot$	$\mathbb{R}^{X\times X}\,,X\in\mathbb{N}^{*}$	Hadamard (element-wise) product
$\otimes$	$\mathbb{R}^{X\times X}\,,X\in\mathbb{N}^{*}$	Kronecker product
$\triangleright$ Estimated quantities
$\widehat{\bm{\mu}}^{\,\mathrm{spec}}$	$\mathbb{R}^{NL}$	multi-spectral mean of $\bm{f}$
$\widetilde{\bm{\mu}}^{\,\mathrm{spec}}$	$\mathbb{R}^{NL}$	shrunk multi-spectral mean of $\bm{f}$
$\widehat{\mathbf{C}}^{\mathrm{spat}}$	$\mathbb{R}^{K\times K}$	local empirical spatial covariance of $\bm{f}$
$\widehat{\mathbf{C}}^{\mathrm{spec}}$	$\mathbb{R}^{L\times L}$	local empirical spectral covariance of $\bm{f}$
$\widetilde{\mathbf{C}}^{\mathrm{spat}}$	$\mathbb{R}^{K\times K}$	local shrunk spatial covariance of $\bm{f}$
$\widetilde{\mathbf{C}}^{\mathrm{spec}}$	$\mathbb{R}^{L\times L}$	local shrunk spectral covariance of $\bm{f}$
$\widetilde{\rho}^{\,\mathrm{spat}}$	$\left[0,1\right]$	spatial shrinkage coefficient
$\widetilde{\rho}^{\,\mathrm{spec}}$	$\left[0,1\right]$	spectral shrinkage coefficient
$\widehat{\bm{\sigma}}$	$\mathbb{R}_{+}^{T}$	temporal weights ( $\widehat{\bm{\sigma}}=\{\widehat{\sigma}_{t}\}_{t=1:T}$ )
$\widetilde{\bm{\sigma}}$	$\mathbb{R}_{+}^{T}$	shrunk temporal weights ( $\widetilde{\bm{\sigma}}=\{\widetilde{\sigma}_{t}\}_{t=1:T}$ )
${\mathbf{\Psi}}^{\mathrm{spat}}$	$\mathbb{R}^{K\times K}$	matrix of spatial shrinkage coefficients
${\mathbf{\Psi}}^{\mathrm{spec}}$	$\mathbb{R}^{L\times L}$	matrix of spectral shrinkage coefficients
$\mathbf{\Gamma}$	$\mathbb{R}^{KL\times KL}$	shrunk spatio-spectral precision matrix
$\widehat{\bm{u}}$ , $\widetilde{\bm{u}}$	$\mathbb{R}_{+}^{N^{\prime}L}$	reconstructed spatio-spectral flux distribution
$\widehat{\bm{\beta}}$	$\mathbb{R}_{+}^{2}$	regularization hyper-parameters
$\triangleright$ Other quantities and metrics
$\bm{u}_{\text{inv}}$	$\mathbb{R}^{N^{\prime}}$	flux distribution invariant by ASDI
$\bm{u}_{\text{gt}}$	$\mathbb{R}_{+}^{N^{\prime}L}$	ground truth flux distribution
$\alpha_{\text{gt}}$	$\mathbb{R}_{+}$	maximum ground truth contrast (disk vs star)
MSE	$\mathbb{R}$	mean square error
N-RMSE	$\mathbb{R}_{+}$	normalized root mean square error
SURE	$\mathbb{R}$	Stein’s unbiased risk estimator

Our contributions: This paper extends the REXPACO algorithm (Flasseur et al., 2021, 2022) to ASDI sequences. This extension, named REXPACO ASDI, involves several specific methodological developments, including:

• a spatio-spectral separable model of the covariances of the nuisance,
• a spatio-temporal weighting of the measurements based on their relative quality,
• a technique to estimate the components of the covariances and weights model,
• a regularization strategy of the (noisy) sample covariances,
• a strategy to jointly refine the model of the residual star light and reconstruct the disk of interest,
• a spatio-spectral regularization of the reconstructed multi-spectral images,
• a strategy to unmix point-like sources from the disk material.

Beyond these methodological developments, the proposed approach is, to the best of our knowledge, the first one to leverage joint processing of multi-spectral data through an inverse problem framework for reconstructing circumstellar disks in high-contrast imaging. We illustrate in this paper the benefits of an accurate exploitation of the spectral diversity to improve reconstruction fidelity. In particular, we show that REXPACO ASDI can faithfully reconstruct disks with nearly circularly symmetric morphologies (e.g., spirals and rings). Such morphological structures are especially challenging to reconstruct without additional data diversity complementary to A(S)DI (e.g., based on RDI techniques) to build an unbiased model of the nuisance component.

Section 2 develops the statistical model for the residual star light and different noise contributions. Building on this model, Sect. 3 presents a reconstruction method that jointly extracts and deconvolves the component of interest: the multi-spectral image of the disk surrounding the target star. Section 4 showcases reconstruction results on several ASDI sequences obtained with the VLT/SPHERE instrument. Additionally, Sect. 5 describes an iterative method to unmix the contribution of candidate point-like sources from the circumstellar disk. Finally, Sect. 6 draws the conclusions of this work.

Throughout the text, the reader can refer to Table 1 summarizing the main notations.

2 Statistical model of the nuisance

Figure 1: Illustration of a dataset acquired with ASDI: (a) images captured at different wavelengths; (b) spatio-spectral slices along the two lines –1– and –2– drawn in (a); (c) spatio-temporal slices along the lines –1– and –2–. The four square areas define four regions studied in more detail in Fig. 2. The component of interest, a spiral-shaped circumstellar disk, is shown in (d) based on the REXPACO ASDI reconstruction given in Sect. 4.3. In the first channel, shown in blue, the signal of the disk is faint (contrast about $1.5\times 10^{-6}$) compared to the stellar leakages. Images are displayed using pseudo-colors (ranging from blue to red) chosen to cover the infrared spectrum. Colored polygons delimit the common field of view seen in all spectral channels. Dataset: SAO 206462 (2015-05-15), see Table 2 for the observation parameters.

In contrast to other methods in the literature, we do not explicitly extract the residual star light component from the data but rather develop a statistical model to describe both the residual star light (i.e., the speckles) and the various stochastic noise contributions (thermal noise, detector readout noise, photon noise). With pupil tracking mode and after chromatic speckle alignment by rescaling the images according to the wavelength, residual star light is very similar from one spectral channel to the next (up to some chromatic factor). There are, however, some fluctuations due to noise, chromatic phenomena, and the evolution of the phase aberrations during observation. These fluctuations display some spatial and spectral correlations and are highly non-stationary. In particular, they are much stronger close to the star. We describe in Sects. 2.1 and 2.2 the rationale of the statistical model embedded in REXPACO ASDI, and we develop in Sect. 2.3 a methodology to estimate the resulting large number of parameters directly from the data.

Figure 1 shows a dataset of a star (SAO 206462) surrounded by a bright disk observed using the ASDI technique. Slices along different dimensions of this 4D dataset are displayed. The coronagraphic mask is aligned with the star, at the center of the field of view (center of the images shown in Fig. 1(a)). Residual star light dominates the central area and extends over most of the field of view. It takes the form of granular intensity structures (speckles). During a pre-processing step, all images were rescaled by a wavelength-specific factor $\lambda_{\text{ref}}/\lambda$ to compensate for diffraction and spatially align the speckles. The solid line –1– drawn in the $x$ direction in Fig. 1(a) crosses a bright speckle. This speckle is visible in the left part of Fig. 1(b) and the first row of Fig. 1(c). It remains at the same spatial location for all wavelengths $\lambda$ and all times $t$ . Structures of interest, such as the disk that surrounds the star SAO 206462, undergo a rotation about the image center throughout time and a scaling with the wavelength (due to the rescaling applied in the pre-processing step). These spatial transformations are visible in the slices along the dotted line –2– drawn in the images of Fig. 1(a): the line crosses the disk (as well as an area with strong residual star light, close to the image center). The spatio-spectral slice shown at the right of Fig. 1(b) displays a scaling of the disk with respect to the wavelength (shorter wavelengths are dilated due to the speckle-aligning pre-processing), whereas the rotation motion can be noted in the spatio-temporal slices shown at the bottom of Fig. 1(c), in particular for a bright structure of the disk highlighted within a white box, which is moving closer to the image center during the sequence. Figure 1(d) shows only the component of interest: the circumstellar disk. The images were obtained with the reconstruction method introduced in this paper, see Sect. 
4.3 for a spectrally combined visualization of the reconstructed disk. Comparing Figs. 1(a) and 1(d) illustrates that high-contrast observations suffer from a strong nuisance component, which has to be numerically suppressed in order to reconstruct the component of interest.

The accuracy of the model of residual star light and noise has a strong impact on the reconstruction of the component of interest $\bm{u}$, as further discussed in Sect. 4. In the remainder of this section, we first assume that the object $\bm{u}$ has a negligible impact on the statistical distribution of the nuisance term, i.e., the statistical distribution of the aligned data $\text{p}_{V}(\bm{v})$ in the absence of disk or exoplanet is nearly identical to the distribution $\text{p}_{V}(\bm{v}-\mathbf{M}\,\bm{u})$ of the nuisance component $\bm{f}=\bm{v}-\mathbf{M}\,\bm{u}$ obtained when the modeled contribution $\mathbf{M}\,\bm{u}$ of the component of interest has been subtracted from the data $\bm{v}$ (the direct model, $\mathbf{M}$, is presented in Sect. 3.1). This assumption is made in order to initiate the estimation of the model parameters, and we introduce in Sects. 2.3–3.3 several strategies to jointly estimate the statistical distribution of the nuisance terms and reconstruct the component of interest. These joint and iterative strategies significantly enhance the fidelity of the reconstruction by explicitly accounting for the bias induced by the disk on the nuisance model.

2.1 Patch-based statistical modeling

Image patches (i.e., neighborhoods of a few tens to a hundred pixels) offer an interesting trade-off between locality (small enough to capture a local behavior) and complexity (they include enough pixels to collect geometrical and textural information). Their use has been very successful in image restoration, in methods based on image self-similarity (Buades et al., 2005), collaborative filtering (Dabov et al., 2007), sparse coding (Aharon et al., 2006; Mairal et al., 2009), mixture models (Zoran & Weiss, 2011; Yu et al., 2011), or Gaussian models (Lebrun et al., 2013). Whereas deep neural networks have become the state-of-the-art approach to learn rich models (either generative or discriminative) of natural images, patch-based models retain serious advantages when the number of training samples is limited or in the case of highly non-stationary images. As can be seen in Fig. 1(a), images in an ASDI dataset are far from stationary: residual star light is the strongest at the center of the image (at the actual location of the star). Observations made during separate nights around different stars also often display significantly different structures because of changes in the observing conditions (which impact the residual aberrations uncorrected by adaptive optics, and hence the spatial distribution of speckles due to star light) and star brightness. This limits the possibility of using external observations (e.g., using the RDI technique, see Sect. 1) to learn a model to process a specific ASDI sequence and motivates the development of a patch-based approach based solely on the ASDI sequence of interest.

Under our patch-based model, the distribution of an ASDI sequence $\bm{v}\in\mathbb{R}^{NLT}$, formed by the collection of $T$ multi-spectral images with $L$ spectral bands and $N$ pixels in each band, after chromatic speckle alignment and without disk or exoplanet, is given by:

$$\text{p}_{V}(\bm{v})\approx\prod_{n\in\mathbb{K}}\text{p}_{V_{n}}(\mathbf{E}_{n}\bm{v})\,,\tag{1}$$

where $\text{p}_{V}$ is the joint distribution of the whole ASDI dataset and $\mathbf{E}_{n}$ is the linear operator that extracts a $K\times L_{\text{eff}}\times T$-pixel spatio-spectro-temporal patch centered at the $n$-th spatial location of the field of view (i.e., $\bm{v}_{n}=\mathbf{E}_{n}\bm{v}$ is a 4D patch). Throughout the text, we do not differentiate the $x$ and $y$ spatial dimensions to simplify the notations but rather use 2D spatial indices $n$. The set of spatial locations $\mathbb{K}$ is defined to prevent patch overlapping while tiling the whole field of view (i.e., $\text{Card}(\mathbb{K})\times K=N$ and juxtaposed square patches are used).
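The non-overlapping tiling by juxtaposed square patches can be sketched as follows. This is a minimal illustration under stated assumptions: the array layout `(H, W, L, T)`, the function name `tile_patches`, and the requirement that the image side be a multiple of the patch side are choices made for the example, not details given in the paper.

```python
import numpy as np

def tile_patches(v, side):
    """Split an ASDI array into juxtaposed, non-overlapping square patches.
    v: array of shape (H, W, L, T); returns (n_patches, side*side, L, T)."""
    H, W, L, T = v.shape
    if H % side or W % side:
        raise ValueError("image sides must be multiples of the patch side")
    p = v.reshape(H // side, side, W // side, side, L, T)
    p = p.transpose(0, 2, 1, 3, 4, 5)   # (tile row, tile col, i, j, L, T)
    return p.reshape(-1, side * side, L, T)

rng = np.random.default_rng(0)
v = rng.normal(size=(8, 8, 3, 5))       # toy sizes: N = 64, L = 3, T = 5
patches = tile_patches(v, side=4)       # K = 16 pixels per patch
print(patches.shape)                    # Card(K) patches of K x L x T values
```

With this tiling, the number of patches times the number of pixels per patch equals the number of pixels per image, which is the condition $\text{Card}(\mathbb{K})\times K=N$ stated above.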

The model (1) assumes a statistical independence between patches, which is a simplifying hypothesis that eases a data-driven learning of the distribution $\text{p}_{V}$ . In the sequel, each distribution $\text{p}_{V_{n}}$ is modeled by a different multivariate Gaussian in order to capture the correlations between observations within a spatio-spectro-temporal patch. By adapting the parameters of these Gaussian distributions to the spatial location $n$ , a non-stationary model is obtained, with the capability to capture the variations between areas close to the star (at the center of the image) and areas farther away. The statistical model of a patch is thus given by its assumed distribution:

$$\text{p}_{V_{n}}(\bm{v}_{n})=\frac{1}{\sqrt{|2\pi\mathbf{C}_{n}|}}\exp\left(-\tfrac{1}{2}\bigl\|\bm{v}_{n}-\bm{\mu}_{n}\bigr\|_{\mathbf{C}_{n}^{-1}}^{2}\right)\,,\tag{2}$$

with $\|\bm{a}\|_{\mathbf{B}}^{2}=\bm{a}^{\top}\,\mathbf{B}\,\bm{a}$ and $|\mathbf{C}_{n}|$ the determinant of matrix $\mathbf{C}_{n}$ . The Gaussian distribution $\text{p}_{V_{n}}$ is defined by the patch expectation $\bm{\mu}_{n}\in\mathbb{R}^{KLT}$ and the covariance matrix $\mathbf{C}_{n}\in\mathbb{R}^{KLT\times KLT}$ . In order to estimate these two quantities at each location $n$ , additional hypotheses and an estimation technique are required.
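For a vectorized patch, the logarithm of the density in Eq. (2) can be evaluated as sketched below. The use of a Cholesky factorization is an implementation choice made for this example (it avoids forming $\mathbf{C}_{n}^{-1}$ explicitly), and the function name `patch_log_density` and the random test data are hypothetical.

```python
import numpy as np

def patch_log_density(v_n, mu_n, C_n):
    """Log of the multivariate Gaussian density of Eq. (2) for one patch."""
    r = v_n - mu_n
    chol = np.linalg.cholesky(C_n)      # C_n = chol @ chol.T
    z = np.linalg.solve(chol, r)        # z = chol^{-1} r
    mahalanobis = z @ z                 # r^T C_n^{-1} r
    # log|2 pi C_n| = d log(2 pi) + 2 sum(log(diag(chol)))
    log_det = r.size * np.log(2.0 * np.pi) + 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (mahalanobis + log_det)

rng = np.random.default_rng(0)
d = 6                                   # toy patch dimension
A = rng.normal(size=(d, d))
C = A @ A.T + d * np.eye(d)             # symmetric positive definite covariance
mu = rng.normal(size=d)
v = rng.normal(size=d)
print(patch_log_density(v, mu, C))
```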

2.2 Constraining the structure of the average vector and of the covariance matrix

Estimating and handling different Gaussian parameters for each patch location is not feasible given the number of parameters involved: the set of all mean vectors $\{\bm{\mu}_{n}\}_{n\in\mathbb{K}}$ has as many free parameters as the total number of measurements in $\bm{v}$ (i.e., $NLT$ ) and the set of all covariance matrices $\{\mathbf{C}_{n}\}_{n\in\mathbb{K}}$ represents many times the number of measurements in $\bm{v}$ (more precisely, $NLT(KLT+1)/2$ free parameters, which represents more than 300,000 times the size of $\bm{v}$ for typical values of $K\approx 13\times 13$ , $L\approx 39$ , and $T\approx 100$ ).
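The parameter counts quoted above can be checked with a few lines of arithmetic. The sketch uses the typical values given in the text, together with $N=290\times 290$ from Sect. 1; the variable names are illustrative.

```python
# Typical values quoted in the text.
K = 13 * 13     # pixels per patch
L = 39          # spectral channels
T = 100         # temporal frames
N = 290 * 290   # pixels per image

n_measurements = N * L * T                         # size of v
n_mean_params = N * L * T                          # all mean vectors together
n_cov_params = N * L * T * (K * L * T + 1) // 2    # all covariance matrices

ratio = n_cov_params / n_measurements              # equals (K*L*T + 1) / 2
print(f"measurements:          {n_measurements:,}")
print(f"covariance parameters: {n_cov_params:,}")
print(f"ratio:                 {ratio:,.1f}")
```

The ratio depends only on the patch dimension $KLT$, which is why it stays above 300,000 regardless of the image size.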

There are two options to reduce the number of parameters in the Gaussian models of Eqs. (1) and (2). Option (i) is to assume a certain level of stationarity for the means or covariances with respect to the spatial location $n$. Option (ii) is to impose a structure on the mean $\bm{\mu}_{n}$ and on the covariance $\mathbf{C}_{n}$. Beyond yielding more tractable models, such assumptions are indispensable to constrain the estimation of the parameters of the Gaussian models from a single ASDI dataset $\bm{v}$.

The strong spatial non-stationarity of ASDI datasets led us to favor option (ii). We considered several ways to select a structure suitable to ASDI observations and built on our experience of point-source detection in ASDI datasets (Flasseur et al., 2020b). We found that it is preferable to use a common mean vector $\bm{\mu}_{n}$ for all times $t$ rather than a time-specific mean vector common to all wavelengths (the spectral variations being stronger than the temporal fluctuations):

\text{Mean}\left[\bm{v}_{n,(k,\ell,:)}\right]=\bm{\mu}_{n,(k,\ell,:)}=\frac{1}{T}\sum\limits_{t^{\prime}=1}^{T}\bm{v}_{n,(k,\ell,t^{\prime})}=\bm{\mu}_{n,(k,\ell)}^{\mathrm{spec}}\,,

(3)

where $\bm{\mu}_{n}$ represents the mean vector at patch location $n$ , and $\bm{\mu}_{n,(k,\ell,t)}$ denotes its specific entry at pixel $k$ , spectral channel $\ell$ and time $t$ . Equation (3) can be rewritten in the more concise form:

\displaystyle\bm{\mu}_{n}=\text{vec}\!\left(\begin{pmatrix}|\\ \bm{\mu}_{n}^{\mathrm{spec}}\\ |\end{pmatrix}\overset{\text{\scriptsize$\longleftarrow T\longrightarrow$}}{\begin{pmatrix}1&\cdots&1\end{pmatrix}}\right)\,,

(4)

where $\bm{\mu}_{n}^{\mathrm{spec}}$ is a $KL$ -pixel multi-spectral vector that represents the temporal average of the multi-spectral patches and $\text{vec}(\cdot)$ performs the vectorization of a matrix by stacking its columns (it transforms a $KL\times T$ matrix into a vector of dimension $KLT$ ).
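The structure of Eqs. (3) and (4) can be illustrated with a minimal NumPy sketch. The toy sizes and the random patch are assumptions made for the example; `order="F"` reproduces the column-stacking convention of $\text{vec}(\cdot)$.

```python
import numpy as np

K, L, T = 4, 3, 5  # toy sizes, not those of real ASDI data
rng = np.random.default_rng(0)
patch = rng.normal(size=(K * L, T))  # each column is a multi-spectral slice v_{n,t}

mu_spec = patch.mean(axis=1)               # Eq. (3): temporal average, length KL
mu_matrix = np.outer(mu_spec, np.ones(T))  # KL x T matrix inside vec(.) in Eq. (4)
mu = mu_matrix.flatten(order="F")          # vec(.): stack the columns, length KLT
```

The resulting vector simply repeats `mu_spec` once per time frame, which is why only $KL$ values per patch need to be estimated.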

To capture the structures of both the spatial and the spectral covariances, we model the covariance between two pixels of the patch $\bm{v}_{n}$ by:

\text{Cov}\!\left[\bm{v}_{n,(k_{1},\ell_{1},t_{1})},\,\bm{v}_{n,(k_{2},\ell_{2},t_{2})}\right]=\begin{cases}0&\text{if }t_{1}\neq t_{2}\,,\\ \sigma_{n,t}^{2}\,\mathbf{C}_{n,\,(k_{1},k_{2})}^{\mathrm{spat}}\,\mathbf{C}_{n,\,(\ell_{1},\ell_{2})}^{\mathrm{spec}}&\text{if }t_{1}=t_{2}=t\,,\end{cases}

(5)

where $\sigma_{n,t}^{2}$ is a scalar that represents the global level of fluctuation in the multi-spectral slice at time $t$, $\mathbf{C}_{n}^{\mathrm{spat}}$ is a $K\times K$ covariance matrix encoding the spatial structure of the fluctuations (a $K$-pixel spatial patch corresponds to a 2D square window, so this covariance matrix contains information about 2D spatial structures), and $\mathbf{C}_{n}^{\mathrm{spec}}$ is an $L\times L$ covariance matrix encoding spectral correlations. To prevent a degeneracy by multiplicative factors, we normalize the covariance matrices $\mathbf{C}_{n}^{\mathrm{spat}}$ and $\mathbf{C}_{n}^{\mathrm{spec}}$ so that their traces equal $K$ and $L$, respectively. In the covariance model of Eq. (5), multi-spectral slices at different times $t_{1}$ and $t_{2}$ are considered uncorrelated (and, thus, mutually independent given the joint Gaussian assumption of Eq. (2)). The time-varying variance parameter $\sigma_{n,t}^{2}$ plays the role of a scale parameter in a compound-Gaussian model (Conte et al., 1995), also known as a Gaussian scale mixture model (Wainwright & Simoncelli, 1999). A large value of $\sigma_{n,t}^{2}$ almost discards the time frame $t$ from the $n$-th 4D patch, which limits the impact of possible outliers and thus makes the estimator more robust (Flasseur et al., 2020a).

The covariance structure given in Eq. (5) corresponds to the following separable covariance matrix:

\displaystyle\text{Cov}\!\left[\bm{v}_{n}\right]=\text{diag}(\bm{\sigma}_{n}^{2})\otimes\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}}\,,

(6)

where $\text{diag}(\bm{\sigma}_{n}^{2})$ is a $T\times T$ diagonal matrix whose $t$-th diagonal entry is $\sigma_{n,t}^{2}$ and $\otimes$ is the Kronecker matrix product: $\mathbf{A}\otimes\mathbf{B}$, with $\mathbf{A}\in\mathbb{R}^{n\times n}$ and $\mathbf{B}\in\mathbb{R}^{m\times m}$, is the $nm\times nm$ matrix with an $n\times n$ block structure such that the $ij$-th block is the $m\times m$ matrix $A_{ij}\mathbf{B}$. Note that this is equivalent to modeling each multi-spectral slice $\bm{v}_{n,t}\in\mathbb{R}^{KL}$ of $\bm{v}_{n}$ as a random vector following the compound-Gaussian model $\mathcal{N}(\bm{\mu}_{n}^{\mathrm{spec}},\sigma_{n,t}^{2}\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}})$, i.e., the scaled and centered vectors $\frac{1}{\sigma_{n,t}}(\bm{v}_{n,t}-\bm{\mu}_{n}^{\mathrm{spec}})$ are independent and identically distributed for all $1\leq t\leq T$ according to the centered Gaussian $\mathcal{N}(\bm{0},\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}})$.
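The correspondence between the entry-wise model of Eq. (5) and the Kronecker form of Eq. (6) can be checked numerically. The sketch below uses toy sizes and random matrices (assumptions for the example) and relies on the index ordering implied by the vectorization: time is the slowest index, then the spectral channel, then the pixel.

```python
import numpy as np

def random_cov(d, rng):
    """Random symmetric positive-definite matrix with trace d (toy data)."""
    a = rng.normal(size=(d, 2 * d))
    c = a @ a.T
    return c * d / np.trace(c)

rng = np.random.default_rng(1)
K, L, T = 3, 2, 4
C_spat, C_spec = random_cov(K, rng), random_cov(L, rng)
sigma2 = rng.uniform(0.5, 2.0, size=T)

# Eq. (6): KLT x KLT separable covariance.
C = np.kron(np.diag(sigma2), np.kron(C_spec, C_spat))

def idx(k, ell, t):
    """Position of pixel k, channel ell, time t in the vectorized patch."""
    return t * K * L + ell * K + k

same_t = C[idx(0, 1, 2), idx(2, 0, 2)]  # same time frame: product of Eq. (5)
diff_t = C[idx(0, 1, 2), idx(2, 0, 3)]  # different time frames: zero
```

In practice the full $KLT\times KLT$ matrix never needs to be formed; only its three factors are stored.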

With the structure of the mean vector $\bm{\mu}_{n}$ given in Eqs. (3) and (4), corresponding to a multi-spectral patch constant through time, there are only $NL$ free parameters to estimate all mean vectors from $\bm{v}\in\mathbb{R}^{NLT}$. The covariance structure defined in Eqs. (5) and (6) leads to $T+K(K+1)/2+L(L+1)/2-2$ free parameters per 4D patch (the $-2$ comes from the two normalization constraints), which leads to approximately $NK/2$ free parameters for the whole ASDI dataset (because $K\gg L$ and $K^{2}/2\gg T$). This is typically one to two orders of magnitude smaller than the total number of measurements in $\bm{v}$ ($K/2$ is typically less than one hundred whereas $LT$ is several thousand). Jointly with an adequate estimation method, the structures assumed in Eqs. (3), (4), (5) and (6) can thus be used to derive a non-stationary model of the nuisance terms.
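The parameter count of the structured covariance can be evaluated for the typical sizes quoted in Sect. 2.2. This is only an arithmetic sketch; the variable names are ours.

```python
K, L, T = 13 * 13, 39, 100  # typical sizes quoted in Sect. 2.2

# Eqs. (5)-(6): T scale factors, a symmetric K x K matrix and a symmetric L x L
# matrix, minus the two trace normalizations.
cov_params_per_patch = T + K * (K + 1) // 2 + L * (L + 1) // 2 - 2
measurements_per_patch = K * L * T

reduction = measurements_per_patch / cov_params_per_patch
print(cov_params_per_patch, round(reduction, 1))
```

For these sizes, each patch carries a few tens of measurements per covariance parameter, consistent with the one to two orders of magnitude stated above.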

2.3 Estimation of the model parameters

The estimation of the parameters of a separable covariance model has been studied in several previous works from the signal-processing community, see for example Lu & Zimmerman (2005); Genton (2007); Werner et al. (2008). We build on these works and introduce several additional elements specific to high-contrast imaging: (i) whereas most works consider decompositions of the covariance matrix as a Kronecker product of two factors, we also include in Eqs. (5) and (6) the temporal scaling factors $\sigma_{n,t}$ for increased robustness (Flasseur et al., 2020a, 2022); (ii) given the limited number of samples, we replace maximum likelihood estimates by shrinkage covariance estimators (Ledoit & Wolf, 2004; Chen et al., 2010; Flasseur et al., 2024) to ensure that all estimated covariance matrices are positive definite and to reduce estimation errors; (iii) to account for the superimposition of a component of interest and nuisance terms, we develop a joint estimation strategy in Sect. 3.3 based on the estimation technique developed in this section.

2.3.1 Maximum likelihood estimators

A first possibility is to determine the parameters of the model of the nuisance statistics so as to maximize the likelihood of the data given the object of interest $\bm{u}$. Given the assumed independence of the patches, this amounts to minimizing the following co-log-likelihood:

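\displaystyle\mathscr{L}=\sum_{n\in\mathbb{K}}\mathscr{L}_{n}\!\left(\bm{\mu}_{n}^{\mathrm{spec}},\big\{\sigma_{n,t}^{2}\big\}_{t\in 1:T},\mathbf{C}_{n}^{\mathrm{spec}},\mathbf{C}_{n}^{\mathrm{spat}},\bm{u}\right)\,,

(7)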

where $\mathscr{L}_{n}$ is the co-log-likelihood of the patch at location $n$ :

	$\displaystyle\mathscr{L}_{n}\!\left(\bm{\mu}_{n}^{\mathrm{spec}},\big\{\sigma_{n,t}^{2}\big\}_{t\in 1:T},\mathbf{C}_{n}^{\mathrm{spec}},\mathbf{C}_{n}^{\mathrm{spat}},\bm{u}\right)=$
	$\displaystyle\hskip 14.22636pt\frac{1}{2}\sum_{t=1}^{T}\left(\left\|\bm{v}_{n,t}-\bm{\mu}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\bm{u}]_{n,t}\right\|_{\mathbf{C}_{n,t}^{-1}}^{2}+\log\left\lvert\mathbf{C}_{n,t}\right\rvert\right),$		(8)

with $\mathbf{C}_{n,t}=\sigma_{n,t}^{2}\,\mathbf{C}^{\mathrm{spec}}_{n}\otimes\mathbf{C}^{\mathrm{spat}}_{n}$ the assumed covariance of the patch data $\bm{v}_{n,t}$ and $\left\lvert\mathbf{C}_{n,t}\right\rvert$ its determinant. The term $\mathbf{M}\,\bm{u}$ accounts for the contribution of the object of interest to the data; the linear model matrix $\mathbf{M}$ is detailed in Sect. 3.1. The maximum likelihood estimators (MLEs) of the parameters of the nuisance statistics are then given by:

$\displaystyle\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}$	$\displaystyle=\operatorname*{arg\,min}_{\bm{\mu}_{n}^{\mathrm{spec}}}\mathscr{L}_{n}\!\left(\bm{\mu}_{n}^{\mathrm{spec}},\big\{\widehat{\sigma}_{n,t}^{2}\big\}_{t\in 1:T},\widehat{\mathbf{C}}_{n}^{\mathrm{spec}},\widehat{\mathbf{C}}_{n}^{\mathrm{spat}},\widehat{\bm{u}}\right)$
	$\displaystyle=\frac{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}\,\left(\bm{v}_{n,t}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\right)}{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}}\,,$	(9)
$\displaystyle\widehat{\sigma}_{n,t}^{2}$	$\displaystyle=\operatorname*{arg\,min}_{\sigma_{n,t}^{2}}\mathscr{L}_{n}\left(\widehat{\bm{\mu}}_{n}^{\mathrm{spec}},\big\{\sigma_{n,t^{\prime}}^{2}\big\}_{t^{\prime}\in 1:T},\widehat{\mathbf{C}}_{n}^{\mathrm{spec}},\widehat{\mathbf{C}}_{n}^{\mathrm{spat}},\widehat{\bm{u}}\right)$
	$\displaystyle=\frac{1}{K\,L}\left\|\bm{v}_{n,t}-\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\right\|_{\big(\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}\,,$	(10)
$\displaystyle\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$	$\displaystyle=\operatorname*{arg\,min}_{\mathbf{C}_{n}^{\mathrm{spec}}}\mathscr{L}_{n}\!\left(\widehat{\bm{\mu}}_{n}^{\mathrm{spec}},\big\{\widehat{\sigma}_{n,t}^{2}\big\}_{t\in 1:T},\mathbf{C}_{n}^{\mathrm{spec}},\widehat{\mathbf{C}}_{n}^{\mathrm{spat}},\widehat{\bm{u}}\right)$
	$\displaystyle=\frac{1}{T\,K}\sum_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}^{\top}\left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}\,,$	(11)
$\displaystyle\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$	$\displaystyle=\operatorname*{arg\,min}_{\mathbf{C}_{n}^{\mathrm{spat}}}\mathscr{L}_{n}\!\left(\widehat{\bm{\mu}}_{n}^{\mathrm{spec}},\big\{\widehat{\sigma}_{n,t}^{2}\big\}_{t\in 1:T},\widehat{\mathbf{C}}_{n}^{\mathrm{spec}},\mathbf{C}_{n}^{\mathrm{spat}},\widehat{\bm{u}}\right)$
	$\displaystyle=\frac{1}{T\,L}\sum_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}\left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}^{\top}\,,$	(12)

with $\widehat{\bm{u}}$ the estimator of the object of interest and where $\widehat{\mathbf{V}}_{n,t}$ is a $K\times L$ matrix corresponding to the residual multi-spectral patch at pixel $n$ and time $t$: at row $k$ and column $\ell$ it is equal to $\big[\bm{v}_{n,t}-\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\big]_{k,\ell}$. The complete derivation of these expressions is given in Appendix A. These equations are interdependent, which affects the optimization strategy, see Sect. 3.3.

The multi-spectral mean $\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}$ in Eq. (9) is obtained by weighted averaging, with weights inversely proportional to the patch-wise variance $\sigma_{n,t}^{2}$ : this limits the impact of outliers. The patch-wise variance $\sigma_{n,t}^{2}$ in Eq. (10) corresponds to the average squared deviation to the mean, computed after spatial and spectral whitening. The estimator $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ of the spectral covariance given in Eq. (11) is readily the sample covariance of the residuals $\widehat{\mathbf{V}}_{n,t}^{\top}$ whitened for the spatial covariances by $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ . Conversely, the estimator $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ of the spatial covariance given in Eq. (12) corresponds to the sample covariance of the residuals $\widehat{\mathbf{V}}_{n,t}$ whitened for the spectral covariances by $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ .

In practice, the whitening operation by either $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ or $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ is done by first computing the Cholesky decompositions $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}=\mathbb{W}_{n}^{\mathrm{spec}}\big(\mathbb{W}_{n}^{\mathrm{spec}}\big)^{\top}$ and $\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}=\mathbb{W}_{n}^{\mathrm{spat}}\big(\mathbb{W}_{n}^{\mathrm{spat}}\big)^{\top}$ with $\mathbb{W}_{n}^{\mathrm{spec}}$ and $\mathbb{W}_{n}^{\mathrm{spat}}$ triangular matrices. We then compute $\left(\mathbb{W}_{n}^{\mathrm{spec}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}^{\top}$ and $\left(\mathbb{W}_{n}^{\mathrm{spat}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}$, which respectively amount to spectral and spatial whitening of the residuals $\widehat{\mathbf{V}}_{n,t}$. We finally take the sample covariances of these whitened residuals.
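The equivalence between Cholesky-based whitening and the explicit inverse appearing in Eqs. (11) and (12) can be checked on toy data. The sketch below is an illustration under assumed sizes and random inputs; it uses a generic linear solve with the triangular factor rather than a dedicated triangular solver.

```python
import numpy as np

rng = np.random.default_rng(2)
K, L = 5, 3
a = rng.normal(size=(K, 2 * K))
C_spat = a @ a.T                 # toy symmetric positive-definite matrix
V = rng.normal(size=(K, L))      # toy residual patch (K pixels x L channels)

W = np.linalg.cholesky(C_spat)   # lower triangular factor, C_spat = W @ W.T
V_white = np.linalg.solve(W, V)  # W^{-1} V, without forming the inverse

term_whitened = V_white.T @ V_white             # sample second moment of whitened data
term_direct = V.T @ np.linalg.solve(C_spat, V)  # V^T C_spat^{-1} V, as in Eq. (11)
```

Both terms agree because $\mathbf{V}^{\top}\mathbf{C}^{-1}\mathbf{V}=(\mathbb{W}^{-1}\mathbf{V})^{\top}(\mathbb{W}^{-1}\mathbf{V})$ when $\mathbf{C}=\mathbb{W}\mathbb{W}^{\top}$.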

From the expressions in Eqs. (9)-(12), it is not possible to derive a closed-form expression of each parameter that does not also depend on other parameters (i.e., estimators (9)-(12) are interdependent). Yet, these formulae can be applied alternately until convergence, a method called flip-flop in Lu & Zimmerman (2005) where a faster convergence is reported compared to maximizing the log-likelihood using an iterative optimization algorithm (Newton’s method).
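A minimal sketch of such an alternation for a single patch is given below. It makes several simplifying assumptions that are ours: the object contribution is ignored ($\mathbf{M}\,\bm{u}=\bm{0}$), a fixed number of iterations replaces a convergence test, the trace normalizations of Sect. 2.2 are enforced by rescaling after each update, and the function name `flip_flop` and the synthetic data are hypothetical.

```python
import numpy as np

def flip_flop(V, n_iter=50):
    """Alternate Eqs. (9)-(12) on one patch V of shape (T, K, L), with M u = 0."""
    T, K, L = V.shape
    sigma2 = np.ones(T)
    C_spat, C_spec = np.eye(K), np.eye(L)
    for _ in range(n_iter):
        w = 1.0 / sigma2
        mu = np.tensordot(w, V, axes=1) / w.sum()   # Eq. (9): weighted mean, K x L
        R = V - mu                                  # residuals, T x K x L
        Ci_spat, Ci_spec = np.linalg.inv(C_spat), np.linalg.inv(C_spec)
        # Eq. (10): whitened squared norm of each slice, divided by K L.
        sigma2 = np.einsum("tkl,kj,tjm,ml->t", R, Ci_spat, R, Ci_spec) / (K * L)
        # Eq. (11): spectral covariance of spatially whitened residuals.
        C_spec = np.einsum("tkl,t,kj,tjm->lm", R, 1.0 / sigma2, Ci_spat, R) / (T * K)
        C_spec *= L / np.trace(C_spec)              # trace normalized to L
        # Eq. (12): spatial covariance of spectrally whitened residuals.
        Ci_spec = np.linalg.inv(C_spec)
        C_spat = np.einsum("tkl,t,lm,tjm->kj", R, 1.0 / sigma2, Ci_spec, R) / (T * L)
        C_spat *= K / np.trace(C_spat)              # trace normalized to K
    return mu, sigma2, C_spec, C_spat

# Synthetic separable data: the second half of the frames is 25 times more variable.
rng = np.random.default_rng(0)
T, K, L = 200, 4, 3
A_spat = np.eye(K) + 0.3 * rng.normal(size=(K, K))
A_spec = np.eye(L) + 0.3 * rng.normal(size=(L, L))
true_sigma2 = np.where(np.arange(T) < T // 2, 1.0, 25.0)
Z = rng.normal(size=(T, K, L))
V = np.sqrt(true_sigma2)[:, None, None] * (A_spat @ Z @ A_spec.T)

mu, sigma2, C_spec, C_spat = flip_flop(V)
```

On this toy patch, the estimated $\sigma_{n,t}^{2}$ are much larger for the second half of the frames, which are therefore down-weighted in the mean of Eq. (9) and in the covariances of Eqs. (11) and (12).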

Figure 2 illustrates the spatial and spectral covariance matrices estimated under this model from an ASDI dataset of the VLT/SPHERE-IFS instrument. MLEs of the spatial and spectral correlation matrices were computed with the flip-flop method for the four regions of interest indicated by small colored squares in Fig. 1(a). To compare matrices with very different variances, we normalized each covariance Cov[a,b] by $\sqrt{\text{Cov[a,a]}\text{Cov[b,b]}}$ , i.e. we show the correlation coefficients. Due to the vectorization of 2D spatial patches, the spatial correlations display a blocky structure. The spatial correlations within a patch globally decrease with the 2D distance between pixels. They are stronger in the area (a) which is the closest to the star. Spectral correlations are also much stronger close to the star. As can be observed in Fig. 1(a), after the scaling transform applied in the pre-processing step, regions far from the star are not seen at the longest wavelengths. The size of multi-spectral patches extracted in these regions is reduced from $KL$ to $KL_{\text{eff}}$ pixels (with $L_{\text{eff}}<L$ the effective number of wavelengths seen at location $n$ ) and the size of the spectral covariance matrix $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ is reduced accordingly, from $L\times L$ to $L_{\text{eff}}\times L_{\text{eff}}$ .
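The normalization used to display correlation coefficients rather than covariances can be written in a few lines. The helper name and the small example matrix below are ours.

```python
import numpy as np

def correlation_from_covariance(C):
    """Divide each Cov[a, b] by sqrt(Cov[a, a] * Cov[b, b])."""
    std = np.sqrt(np.diag(C))
    return C / np.outer(std, std)

C = np.array([[4.0, 1.0], [1.0, 9.0]])
R = correlation_from_covariance(C)  # unit diagonal, off-diagonal entries in [-1, 1]
```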

2.3.2 Shrinkage estimator of covariances

Given that the numbers $T$ of exposures and $L$ of spectral channels are limited, the empirical covariance estimates $\widehat{\mathbf{C}}^{\mathrm{spat}}_{n}$ and $\widehat{\mathbf{C}}^{\mathrm{spec}}_{n}$ (indifferently $\widehat{\mathbf{C}}_{n}$ in the following) are very noisy (when $T\simeq K$ or $L_{\text{eff}}\simeq K$) and can even be rank-deficient (in particular when $T<K$ or $L_{\text{eff}}<K$). To reduce the estimation error on $\widehat{\mathbf{C}}_{n}$ and ensure its positive definiteness, shrinkage techniques combine the maximum likelihood estimator with another estimator of smaller variance (Ledoit & Wolf, 2004). As in our previous works (Flasseur et al., 2018, 2020b, 2021, 2023a, 2023b), we consider the convex combination between the low-bias/high-variance sample covariance $\widehat{\mathbf{C}}_{n}$ and a high-bias/low-variance matrix $\widehat{\mathbf{F}}_{n}$:

\displaystyle\widetilde{\mathbf{C}}_{n}=\gamma\left((1-\widetilde{\rho}_{n})\,\widehat{\mathbf{C}}_{n}+\widetilde{\rho}_{n}\,\widehat{\mathbf{F}}_{n}\right)\,,

(13)

with $\widehat{\mathbf{F}}_{n}=\mathrm{Diag}(\widehat{\mathbf{C}}_{n})$ a diagonal matrix such that $[\widehat{\mathbf{F}}_{n}]_{i,i}=[\widehat{\mathbf{C}}_{n}]_{i,i}$, $\widetilde{\rho}_{n}\in[0,1]$, and $\gamma$ a factor introduced to compensate for the fact that $\widehat{\mathbf{C}}$ is a biased estimate of the true (and unknown) covariance $\mathbf{C}$. The estimator $\widetilde{\mathbf{C}}_{n}$ defined in Eq. (13) shrinks off-diagonal values (i.e., the covariances) of $\widehat{\mathbf{C}}_{n}$ towards 0 (by multiplication by the factor $1-\widetilde{\rho}_{n}$) and leaves diagonal values (i.e., the sample variances) unchanged. By controlling the shrinkage amount, the hyper-parameter $\widetilde{\rho}_{n}$ plays a critical role as it sets a bias-variance trade-off. Compared to other regularization techniques such as diagonal loading (i.e., adding a small fraction of the identity matrix to $\widehat{\mathbf{C}}_{n}$), definition (13) is attractive because it is data-driven: it locally adapts to the fluctuations observed in the non-stationary data and to the number of samples (in particular, we have $L_{\text{eff}}\neq L$ on the borders of the field of view). Such a shrinkage estimator is thus well-suited to imaging systems suffering from non-stationary perturbations.

It remains to find the optimal level of shrinkage $\widetilde{\rho}_{n}$ appropriate for each patch location $n$. An optimal setting can be defined based on risk minimization between the true covariance $\mathbf{C}_{n}$ and its shrunk counterpart $\widetilde{\mathbf{C}}_{n}$ (Ledoit & Wolf, 2004). However, such an oracle estimator cannot be used in practice since $\mathbf{C}_{n}$ is unknown. In a recent work (Flasseur et al., 2024), we derived a practical closed-form expression for a quasi-optimal setting that asymptotically approximates the oracle (for readability, the patch index $n$ is omitted in the following equations):

\widetilde{\rho}=\frac{(\gamma\nu+\epsilon-1)\big(\operatorname{tr}(\widehat{\mathbf{C}}^{2})-\sum_{i}[\widehat{\mathbf{C}}]_{i,i}^{2}\big)+\gamma\eta\big(\operatorname{tr}^{2}(\widehat{\mathbf{C}})-\sum_{i}[\widehat{\mathbf{C}}]_{i,i}^{2}\big)}{\gamma\nu\big(\operatorname{tr}(\widehat{\mathbf{C}}^{2})-\sum_{i}[\widehat{\mathbf{C}}]_{i,i}^{2}\big)},

(14)

with:

$\displaystyle\epsilon$	$\displaystyle=\frac{\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-4}}{\left(\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-2}\right)^{2}},$	(15)
$\displaystyle\zeta$	$\displaystyle=\frac{\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-6}}{\left(\sum_{t=1}^{T}\widehat{\sigma}_{t}^{-2}\right)^{3}},$	(16)
$\displaystyle\gamma$	$\displaystyle=(1-\epsilon)^{-1},$	(17)
$\displaystyle\nu$	$\displaystyle=1-\epsilon-2\,\zeta+2\,\epsilon^{2},$	(18)
$\displaystyle\eta$	$\displaystyle=\epsilon-2\,\zeta+\epsilon^{2}\,.$	(19)

This analytic solution depends solely on the sample covariance $\widehat{\mathbf{C}}_{n}$ and on the patch variances $\{\widehat{\sigma}_{n,t}^{2}\}_{t=1:T}$ introduced in the MLEs (9)-(12) to improve robustness against outliers. In addition, formulae (14)-(19) explicitly account for the use of $\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}$ as an empirical estimate of the true unknown mean $\bm{\mu}_{n}^{\mathrm{spec}}$ (Flasseur et al., 2024). It is worth noting that the shrinkage technique developed in this paragraph is general: it holds whatever the covariance structure and thus applies, in particular, to the spatio-spectral separable covariance of our problem.
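Equations (14)-(19) translate directly into a few lines of NumPy. The sketch below is a plain transcription for one sample covariance; the function name is hypothetical and the final clipping to $[0,1]$ is our assumption (the text only states that $\widetilde{\rho}\in[0,1]$). It assumes a non-diagonal $\widehat{\mathbf{C}}$, since the denominator of Eq. (14) vanishes otherwise.

```python
import numpy as np

def shrinkage_factor(C_hat, sigma2_hat):
    """Evaluate Eqs. (14)-(19) for one sample covariance and T patch variances."""
    w = 1.0 / np.asarray(sigma2_hat, dtype=float)   # sigma_t^{-2}
    eps = np.sum(w**2) / np.sum(w) ** 2             # Eq. (15)
    zeta = np.sum(w**3) / np.sum(w) ** 3            # Eq. (16)
    gamma = 1.0 / (1.0 - eps)                       # Eq. (17)
    nu = 1.0 - eps - 2.0 * zeta + 2.0 * eps**2      # Eq. (18)
    eta = eps - 2.0 * zeta + eps**2                 # Eq. (19)
    diag_sq = np.sum(np.diag(C_hat) ** 2)
    off = np.trace(C_hat @ C_hat) - diag_sq         # tr(C^2) - sum_i C_ii^2
    tr2 = np.trace(C_hat) ** 2 - diag_sq            # tr^2(C) - sum_i C_ii^2
    rho = ((gamma * nu + eps - 1.0) * off + gamma * eta * tr2) / (gamma * nu * off)
    return float(np.clip(rho, 0.0, 1.0))            # clipping is our assumption

C_hat = np.array([[1.0, 0.5], [0.5, 1.0]])
rho = shrinkage_factor(C_hat, sigma2_hat=np.ones(10))
```

With equal variances over $T=10$ frames, $\epsilon=1/T$ and $\zeta=1/T^{2}$, so that $\gamma\nu=1$ and the expression reduces to $\widetilde{\rho}=\tfrac{1}{T}(1+b/a)$, where $a$ and $b$ denote the two bracketed terms of Eq. (14); this gives 0.5 for the toy matrix above.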

In the following, the shrunk covariance is given by:

\widetilde{\mathbf{C}}=\mathbf{\Psi}\odot\widehat{\mathbf{C}},

(20)

where $\odot$ denotes the Hadamard (element-wise) product, and $\mathbf{\Psi}$ is a weighting matrix whose diagonal entries are 1 and whose off-diagonal entries are $1-\widetilde{\rho}$ , where $\widetilde{\rho}$ is given by Eq. (14).
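The element-wise form of Eq. (20) is straightforward to apply once $\widetilde{\rho}$ is known. The sketch below uses an arbitrary value of $\widetilde{\rho}$ and a toy matrix (both assumptions) and checks that it coincides with Eq. (13) when $\gamma=1$.

```python
import numpy as np

def apply_shrinkage(C_hat, rho):
    """Eq. (20): keep the diagonal, scale off-diagonal entries by (1 - rho)."""
    Psi = np.full(C_hat.shape, 1.0 - rho)
    np.fill_diagonal(Psi, 1.0)
    return Psi * C_hat  # Hadamard (element-wise) product

C_hat = np.array([[2.0, 1.0, 0.5], [1.0, 3.0, -1.0], [0.5, -1.0, 1.0]])
C_tilde = apply_shrinkage(C_hat, rho=0.4)

# Same result as Eq. (13) with F = Diag(C_hat), taking gamma = 1 (assumption).
C_eq13 = (1.0 - 0.4) * C_hat + 0.4 * np.diag(np.diag(C_hat))
```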

2.3.3 Shrunk spatio-spectral covariance

To introduce the shrinkage with our particular factorization of the spatio-spectral covariance as $\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}}$ (see Eq. (6)), we propose to apply the shrinkage on each of the components $\mathbf{C}_{n}^{\mathrm{spec}}$ and $\mathbf{C}_{n}^{\mathrm{spat}}$ separately. Furthermore, following the prescription in Flasseur et al. (2021), we estimate the shrinkage factors once at the initialization of the reconstruction algorithm. As a consequence, in subsequent steps, the shrinkage factors depend neither on the object of interest $\bm{u}$ nor on the nuisance statistics defined in Eqs. (9)–(12). This amounts to rewriting the MLEs in Eqs. (9)–(12) as:

$\displaystyle\widetilde{\bm{\mu}}_{n}^{\,\mathrm{spec}}$	$\displaystyle=\frac{\sum_{t=1}^{T}\widetilde{\sigma}_{n,t}^{-2}\,\left(\bm{v}_{n,t}-[\mathbf{M}\,\widetilde{\bm{u}}]_{n,t}\right)}{\sum_{t=1}^{T}\widetilde{\sigma}_{n,t}^{-2}},$	(21)
$\displaystyle\widetilde{\sigma}_{n,t}^{2}$	$\displaystyle=\tfrac{1}{KL}\left\|\bm{v}_{n,t}-\widetilde{\bm{\mu}}_{n}^{\,\mathrm{spec}}-[\mathbf{M}\,\widetilde{\bm{u}}]_{n,t}\right\|_{\big(\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}\big)^{-1}}^{2},$	(22)
$\displaystyle\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$	$\displaystyle=\tfrac{1}{TK}\sum_{t=1}^{T}\widetilde{\mathbf{V}}_{n,t}^{\top}\left(\widetilde{\sigma}_{n,t}^{2}\,\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}\right)^{-1}\,\widetilde{\mathbf{V}}_{n,t},$	(23)
$\displaystyle\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}$	$\displaystyle=\mathbf{\Psi}^{\mathrm{spec}}_{n}\odot\widehat{\mathbf{C}}_{n}^{\mathrm{spec}},$	(24)
$\displaystyle\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$	$\displaystyle=\tfrac{1}{TL}\sum_{t=1}^{T}\widetilde{\mathbf{V}}_{n,t}\left(\widetilde{\sigma}_{n,t}^{2}\,\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-1}\,\widetilde{\mathbf{V}}_{n,t}^{\top},$	(25)
$\displaystyle\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}$	$\displaystyle=\mathbf{\Psi}^{\mathrm{spat}}_{n}\odot\widehat{\mathbf{C}}_{n}^{\mathrm{spat}},$	(26)

where $\widetilde{\mathbf{V}}_{n,t}$ is defined as in Eqs. (9)–(12) but replacing $\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}$ by $\widetilde{\bm{\mu}}_{n}^{\,\mathrm{spec}}$ as well as $\widehat{\bm{u}}$ by $\widetilde{\bm{u}}$, and where $\mathbf{\Psi}^{\mathrm{spec}}_{n}$ and $\mathbf{\Psi}^{\mathrm{spat}}_{n}$ are computed according to Eq. (20) for the respective sample covariances $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ and $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ given by Eqs. (23) and (25), as estimated during the initialization stage of the reconstruction algorithm. The sample covariances $\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}$ and $\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}$ in Eqs. (23) and (25) differ from their MLE counterparts in Eqs. (11) and (12) in that the whitening accounts for the shrinkage. The assumed separable model of the covariance now takes the form $\widetilde{\mathbf{C}}_{n}=\text{diag}(\widetilde{\bm{\sigma}}_{n}^{2})\otimes\widetilde{\mathbf{C}}_{n}^{\mathrm{spec}}\otimes\widetilde{\mathbf{C}}_{n}^{\mathrm{spat}}$.

3 Reconstruction of the component of interest

3.1 Direct model

We extend the forward model developed for ADI in Flasseur et al. (2021, 2022) by including the spectral dimension. Since the whole ASDI sequence is acquired within a short time (a few hours of observations during a single night), we assume the component of interest (e.g., circumstellar disk and potential exoplanets) does not evolve during the observations: its proper rotation around the host star and photometry evolution are negligible at such short time scales. The multi-spectral image of this component is simply described by the vector $\bm{u}\in\mathbb{R}_{+}^{N^{\prime}L}$ of its pixel values and there is no temporal dimension in this spatio-spectral reconstruction. Due to the apparent rotation of the field of view during the ASDI sequence, the number $N^{\prime}$ of pixels in each spectral band of the reconstruction should be greater than $N$ to model any part of the disk seen within the sensor field of view on at least one exposure.

The contribution of $\bm{u}$ to the data $\bm{v}$ is modeled by $\mathbf{M}\,\bm{u}$ with the linear operator:

\displaystyle\mathbf{M}=\begin{pmatrix}\mathbf{M}_{1}\\ \vdots\\ \mathbf{M}_{T}\\ \end{pmatrix}\text{ and }\mathbf{M}_{t}=\mathbf{S}\,\mathbf{Z}\,\mathbf{A}\,\mathbf{B}_{t}\,\mathbf{R}_{t}\,,

(27)

where $\mathbf{M}_{t}$ , the model for the $t$ -th frame, accounts for several instrumental effects:

• a rotation $\mathbf{R}_{t}$ applied to all off-axis sources due to the pupil-tracking mode (the field of view rotates while the residual star light remains fixed), implemented as a sparse interpolation matrix,
• a blur $\mathbf{B}_{t}$ due to the instrumental blurring, modeled as a 2D discrete convolution by the off-axis point spread function (PSF),
• an attenuation $\mathbf{A}$, very strong on the optical axis, then quickly decreasing (due to the coronagraph), modeled as a diagonal matrix (Flasseur et al., 2021),
• the absence of measurements outside the spatial extension of the sensor (a non-square area due to the instrumental design of the integral field spectrograph), modeled as a diagonal matrix $\mathbf{Z}$ that replaces values outside the sensor area by zeros and keeps other values unchanged (i.e., zero-padding),
• a last, time-invariant transform $\mathbf{S}$ produced by the image scaling applied during the pre-processing step, corresponding to a sparse interpolation matrix.

With the VLT/SPHERE instrument, the off-axis point spread function (PSF) is quite stable and its core is almost rotation invariant, leading to the approximation $\mathbf{B}_{t}\,\mathbf{R}_{t}\approx\mathbf{R}_{t}\,\mathbf{B}$ . The model given in Eq. (27) can thus be approximated by:

\displaystyle\mathbf{M}\,\bm{u}\approx\begin{pmatrix}\mathbf{F}_{1}\\ \vdots\\ \mathbf{F}_{T}\\ \end{pmatrix}\!\,\mathbf{B}\,\bm{u}\,,

(28)

where $\mathbf{B}$ is a time-invariant blurring operator and $\mathbf{F}=\{\mathbf{F}_{t}\}_{t=1:T}$ are sparse matrices that perform rotations, scalings, and attenuations according to the transmission of the coronagraph and the sensor field of view. The model in Eq. (28) is only approximate: it neglects possible anisotropies or temporal evolutions of the PSF. Thanks to this approximation, a single convolution of the multi-spectral dataset is performed instead of $T$ convolutions, which accelerates the numerical evaluation of the forward model by one to two orders of magnitude. This acceleration is critical to complete reconstructions in a few hours. We verified through numerical simulations that the impact of these approximations on the reconstructions was negligible (less than 1%) in practice for VLT/SPHERE data. If approximation (28) does not hold (e.g., for instruments that do not produce a stable off-axis PSF or if the latter is not rotation invariant), the full model (27) can be evaluated, at each iteration of the optimization procedure (see Sect. 3.2), on a random subset of temporal frames using stochastic gradient descent (e.g., with the Adam optimizer; Kingma & Ba (2014)). The stochasticity of this procedure reduces both memory consumption and computation time, and leads to an approximate solution. Based on simulated disks and off-axis PSFs, we observe a typical relative difference of less than 5% on the reconstructed flux distribution obtained with the two strategies ((i) approximate model and no stochastic optimization versus (ii) full model and stochastic optimization). In the following, we use only strategy (i), given that approximation (28) can be made with VLT/SPHERE data.
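The commutation $\mathbf{B}_{t}\,\mathbf{R}_{t}\approx\mathbf{R}_{t}\,\mathbf{B}$ behind Eq. (28) can be illustrated in an idealized setting where it holds exactly: rotations by multiples of 90° (no interpolation) and a blur kernel invariant under these rotations. The sketch below is only a toy illustration with hypothetical names; real field rotations require interpolation, and attenuation, masking and scaling are left out.

```python
import numpy as np

def blur(img, kernel):
    """Zero-padded 2D filtering by a small odd-sized kernel (toy stand-in for B)."""
    r = kernel.shape[0] // 2
    padded = np.pad(img, r)  # zero padding on all sides
    out = np.zeros_like(img)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

# Kernel invariant under 90-degree rotations (toy stand-in for the off-axis PSF core).
kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
rng = np.random.default_rng(3)
u = rng.random((8, 8))

# Full model: rotate, then blur, for each frame (one convolution per frame).
full = [blur(np.rot90(u, k), kernel) for k in range(4)]

# Approximate model: blur once, then rotate for each frame (a single convolution).
blurred = blur(u, kernel)
approx = [np.rot90(blurred, k) for k in range(4)]
```

In this idealized case both lists contain the same images, while the second strategy evaluates the convolution only once.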

3.2 Regularized inversion

We reconstruct the component of interest using a penalized maximum likelihood approach, i.e., by solving the following numerical optimization problem:

\displaystyle\widehat{\bm{u}}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\big\{\mathscr{C}(\bm{\Omega},\bm{u})\equiv\mathscr{L}\left(\bm{\Omega},\bm{u}\right)+\mathscr{R}(\bm{u})\big\},

(29)

where $\bm{\Omega}=\Big\{\bm{\mu}_{n}^{\mathrm{spec}},\bm{\sigma}_{n}^{2},\mathbf{C}_{n}^{\mathrm{spec}},\mathbf{C}_{n}^{\mathrm{spat}}\Big\}_{n\in\mathbb{K}}$ represents the parameters of the statistical model of the nuisances, the co-log-likelihood $\mathscr{L}$ is given in Eqs. (6)–(8), and $\mathscr{R}(\bm{u})$ is a regularization term to favor plausible reconstructions $\bm{u}$. We selected a combination of two regularization functions applying to the same $\bm{u}$: an edge-preserving one that favors smooth images with sharp edges co-located at all wavelengths and a sparsity-inducing $\text{L}^{1}$ norm. The regularization reads:

	$\displaystyle\mathscr{R}(\bm{u})$	$\displaystyle=\beta_{\text{smooth}}\sum_{n=1}^{N^{\prime}}\sqrt{\tfrac{1}{L}\sum_{\ell=1}^{L}\left\lVert\mathbf{D}_{n,\ell}\,\bm{u}\right\rVert_{2}^{2}+\tau^{2}}$
		$\displaystyle\quad+\beta_{\text{sparse}}\sum_{n=1}^{N^{\prime}}\sum_{\ell=1}^{L}\lvert u_{n,\ell}\rvert,$		(30)

where $\mathbf{D}_{n,\ell}\,\bm{u}\approx\mathbf{\nabla}_{\!n}\bm{u}_{:,\ell}$ approximates by finite differences the 2D spatial gradient of $\bm{u}$ at pixel $n$ in the $\ell$-th spectral channel and with $\tau$ a parameter chosen so as to be negligible compared to the average norm of the spatial gradient where there is a sharp edge (the regularization then approaches an isotropic vectorial total variation; Bresson & Chan (2008)) and similar to the gradient magnitude in smoothly-varying areas (this prevents the appearance of the staircasing effect common with total variation; Charbonnier et al. (1997); Blomgren et al. (1997); Louchet & Moisan (2008)). We illustrate qualitatively through numerical simulations in Sect. 4.4 that these regularization penalties, quite classical in image processing, remain adapted to disks having very different morphologies, like elliptical disks with sharp edges or spiral disks with smooth edges. Hyper-parameters $\beta_{\text{smooth}}$ and $\beta_{\text{sparse}}$ balance the weight of each regularization term with respect to the data-fitting term. Note that, due to the positivity constraint in Eq. (29), the $\text{L}^{1}$ norm $\|\bm{u}\|_{1}$ corresponds to the simple differentiable term $\sum_{n=1}^{N^{\prime}}\sum_{\ell=1}^{L}u_{n,\ell}$ for any feasible object $\bm{u}$, and thus the regularization $\mathscr{R}(\bm{u})$ is differentiable for $\tau\neq 0$ (in practice, we choose $\tau=10^{-6}$).
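A minimal sketch of the evaluation of Eq. (30) is given below. It assumes that $\bm{u}$ is stored as a cube of shape $(L, H, W)$ with $N^{\prime}=HW$, and that $\mathbf{D}_{n,\ell}$ is implemented with forward differences set to zero on the last row and column; these implementation choices and the function name are ours, not taken from the text.

```python
import numpy as np

def regularization(u, beta_smooth, beta_sparse, tau=1e-6):
    """Evaluate Eq. (30) for a non-negative cube u of shape (L, H, W)."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :, :-1] = u[:, :, 1:] - u[:, :, :-1]   # forward differences along x
    gy[:, :-1, :] = u[:, 1:, :] - u[:, :-1, :]   # forward differences along y
    grad_sq = (gx**2 + gy**2).mean(axis=0)       # (1/L) sum_l ||D_{n,l} u||^2, per pixel
    smooth = np.sqrt(grad_sq + tau**2).sum()     # edge-preserving term, edges shared by channels
    sparse = np.abs(u).sum()                     # L1 norm (equals u.sum() when u >= 0)
    return beta_smooth * smooth + beta_sparse * sparse

u = np.ones((2, 3, 3), dtype=float)
value = regularization(u, beta_smooth=1.0, beta_sparse=0.1)
```

Because the square root is taken after averaging over the spectral channels, an edge located at the same pixel in all channels is penalized once, which is what favors co-located edges.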

To solve the smooth constrained optimization problem in Eq. (29), we use a limited-memory quasi-Newton method with bound constraints, VMLM-B (Thiébaut, 2002), which is a more efficient variant of L-BFGS-B (Zhu et al., 1997). To minimize $\mathscr{C}(\bm{\Omega},\bm{u})$ in $\bm{u}$ given $\bm{\Omega}$, the VMLM-B optimizer requires evaluating the cost function $\mathscr{C}(\bm{\Omega},\bm{u})$ and its first derivatives $\nabla_{\bm{u}}\mathscr{C}(\bm{\Omega},\bm{u})$ with respect to $\bm{u}$. The analytic expression of these first derivatives reads, for all $\bm{u}\geq\mathbf{0}$:

	$\displaystyle\nabla_{\bm{u}}\mathscr{C}\left(\bm{\Omega},\bm{u}\right)=\sum_{n\in\mathbb{K}}\underbrace{\mathbf{M}^{\top}\sum_{t=1}^{T}\frac{1}{\sigma_{n,t}^{2}}\,\mathbf{E}_{n,t}^{\top}\,\mathbf{\Gamma}_{n}\,\big[\mathbf{E}_{n,t}\,\mathbf{M}\,\bm{u}+\bm{\mu}_{n}-\bm{v}_{n,t}\big]}_{\nabla_{\bm{u}}\mathscr{L}_{n}\left(\bm{\Omega}_{n},\bm{u}\right)\text{, see Eqs. (6)--(8)}}$
	$\displaystyle\hskip 34.1433pt+\underbrace{\beta_{\mathrm{smooth}}\sum_{n=1}^{N^{\prime}}\frac{\mathbf{D}_{n,\ell}^{\top}\,\mathbf{D}_{n,\ell}\,\bm{u}}{\sqrt{\frac{1}{L}\sum_{\ell=1}^{L}\left\lVert\mathbf{D}_{n,\ell}\,\bm{u}\right\rVert_{2}^{2}+\tau^{2}}}+\beta_{\mathrm{sparse}}\,\bm{1}}_{\nabla_{\bm{u}}\mathscr{R}(\bm{u})\text{, see Eq. (30)}},$		(31)

where $\bm{1}$ is an array of the same size as $\bm{u}$ filled with ones, $\mathbf{E}_{n,t}$ is the $K\,L\times K\,L\,T$ operator that extracts a multi-spectral patch at spatial location $n$ and time frame $t$ (by extension of its definition introduced in Sect. 2.1), $\bm{\Omega}_{n}$ denotes the subset of the statistical model parameters for the $n$-th patch, and $\mathbf{\Gamma}_{n}$ is equal to $\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}$.
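Because $\mathbf{\Gamma}_{n}$ is a Kronecker product, it does not need to be formed explicitly when evaluating Eq. (31). As a sketch, assume (this ordering is an assumption, not stated above) that a multi-spectral patch is stored as a $K\times L$ matrix $\mathbf{X}$ and vectorized column by column, so that the spatial index runs fastest. The standard identity $(\mathbf{A}\otimes\mathbf{B})\operatorname{vec}(\mathbf{X})=\operatorname{vec}(\mathbf{B}\,\mathbf{X}\,\mathbf{A}^{\top})$ and the symmetry of the spectral covariance then give:

```latex
\mathbf{\Gamma}_{n}\,\operatorname{vec}(\mathbf{X})
  = \Big[\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\Big]\operatorname{vec}(\mathbf{X})
  = \operatorname{vec}\Big(\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{X}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\Big)
```

Under this assumption, applying the precision matrix only involves one $K\times K$ and one $L\times L$ linear solve per patch, instead of a product with a $K\,L\times K\,L$ matrix.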

Solving the problem in Eq. (29) yields an estimator $\widetilde{\bm{u}}$ of the object of interest given the parameters $\bm{\Omega}$ of the statistical model. We consider next different strategies to jointly obtain estimators of these parameters from the same dataset.

3.3 Joint estimation of all unknowns from the data

Formally, the estimators of the object of interest $\bm{u}$ and of the parameters $\bm{\Omega}$ of the nuisance statistics provided by REXPACO ASDI are the ones for which Eqs. (21)–(26) and (29) jointly hold. Solving this system of non-linear equations is intrinsically difficult because there is no closed-form solution (at least due to the non-negativity constraint for $\bm{u}$ ) and because of the interdependence of the equations. In the following sub-sections, we develop practical algorithms to iteratively solve this system of equations.

3.3.1 Alternating strategy

Even though there is no joint closed-form solution to the set of equations (21)–(26) and (29), we note that each of these equations readily provides an estimator of some unknowns when the rest of the unknowns are fixed. This property can be exploited to solve the whole set of equations by the following alternating strategy. Given the object $\bm{u}$, the parameters $\bm{\Omega}$ can be estimated by repeatedly applying Eqs. (21)–(26) in turn until convergence to a so-called fixed point solution. This procedure is applied to each patch to estimate all the nuisance parameters. We denote the resulting parameters as $\widetilde{\bm{\Omega}}(\bm{u})$ in the following. A first possible algorithm to find the solution is then:

1.	Let $i=0$ and initially assume a null object $\widetilde{\bm{u}}^{[0]}=\bm{0}$.
2.	Estimate nuisance statistics $\widetilde{\bm{\Omega}}^{[i+1]}=\widetilde{\bm{\Omega}}\big(\widetilde{\bm{u}}^{[i]}\big)$ as the fixed point solution of Eqs. (21)–(26) for the current estimate of the object $\widetilde{\bm{u}}^{[i]}$. If $i=0$, also include Eq. (14) in the fixed point method to determine the shrinkage factors $\widetilde{\rho}^{\,\mathrm{spec}}$ and $\widetilde{\rho}^{\,\mathrm{spat}}$. These factors define $\mathbf{\Psi}^{\mathrm{spec}}$ and $\mathbf{\Psi}^{\mathrm{spat}}$ for all subsequent iterations, i.e. for $i>0$.
3.	Update the object $\widetilde{\bm{u}}^{[i+1]}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\mathscr{C}\big(\widetilde{\bm{\Omega}}^{[i+1]},\bm{u}\big)$ by applying the reconstruction algorithm described in Sect. 3.2.
4.	Let $i\leftarrow i+1$ and, unless estimators $\widetilde{\bm{\Omega}}^{[i]}$ and $\widetilde{\bm{u}}^{[i]}$ have converged, go to step 2.

In practice, we assume the algorithm reaches convergence when the condition $\big\lVert\widetilde{\bm{u}}^{[i+1]}-\widetilde{\bm{u}}^{[i]}\big\rVert\leq\eta\,\big\lVert\widetilde{\bm{u}}^{[i+1]}\big\rVert$ is satisfied, with $\eta=10^{-6}$.
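The structure of the four steps and of the stopping rule can be sketched on a deliberately small toy problem. This is not the REXPACO ASDI model: the nuisance is reduced to a single constant mean `mu`, the object to a single non-negative scalar `u`, and the data are assumed to follow `v[t] = m[t] * u + mu`. All names are hypothetical.

```python
def alternate(v, m, eta=1e-6, max_iter=10_000):
    """Toy alternating scheme: nuisance update, object update, stopping test."""
    u = 0.0  # step 1: start from a null object
    mu = 0.0
    for i in range(max_iter):
        # Step 2: nuisance statistics given the current object
        # (here, only the mean of the residuals).
        mu = sum(vt - mt * u for vt, mt in zip(v, m)) / len(v)
        # Step 3: object update given the nuisance, under u >= 0
        # (closed-form least squares followed by projection, valid in 1-D).
        num = sum(mt * (vt - mu) for vt, mt in zip(v, m))
        den = sum(mt * mt for mt in m)
        u_new = max(0.0, num / den)
        # Step 4: relative-change stopping criterion with tolerance eta.
        converged = abs(u_new - u) <= eta * abs(u_new)
        u = u_new
        if converged:
            return u, mu, i + 1
    return u, mu, max_iter

# Noise-free data generated with u = 2 and mu = 5.
u_hat, mu_hat, n_iter = alternate([5.0, 7.0, 9.0, 11.0], [0.0, 1.0, 2.0, 3.0])
```

Even on this two-unknown problem, the alternation needs a few tens of outer iterations to meet the tolerance, because each update is computed with the other unknown held at its previous value.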

This first algorithm implements a simple alternating strategy which is equivalent, for non-linear equations, to the Gauss–Seidel method for solving a system of linear equations. The alternating method converges slowly due to the need for multiple reconstructions of the object of interest, which are progressively refined in each iteration of Step 3. This process represents the primary computational bottleneck (the computational cost of estimating nuisance statistics is negligible by comparison). However, as discussed in the following subsections, the computational efficiency of this estimation strategy can be significantly improved.

3.3.2 Partially hierarchical optimization

Noting that the joint solution of Eqs. (21) and (22) only depends on the object and on the spatial and spectral covariances, we introduce the following auxiliary cost function:

	$\displaystyle\mathscr{D}\big(\bm{u},\big\{\mathbf{C}^{\mathrm{spec}}_{n},\mathbf{C}^{\mathrm{spat}}_{n}\big\}_{n\in\mathbb{K}}\big)=\min_{\substack{\{\bm{\mu}^{\mathrm{spec}}_{n}\}_{n\in\mathbb{K}}\\ \{\sigma_{n,t}^{2}\}_{n\in\mathbb{K},\,t\in 1:T}}}\mathscr{C}(\bm{\Omega},\bm{u})$
	$\displaystyle\qquad=\mathscr{C}(\bm{\Omega},\bm{u})\,\Big|_{\substack{\bm{\mu}^{\mathrm{spec}}_{n}\,=\,\widetilde{\bm{\mu}}^{\,\mathrm{spec}}_{n}(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n})\\ \bm{\sigma}_{n}^{2}\,=\,\widetilde{\bm{\sigma}}^{2}_{n}(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n})}}$		(38)

In practice, for each patch $n$ and given the object $\bm{u}$ and the covariances $\mathbf{C}^{\mathrm{spec}}_{n}$ and $\mathbf{C}^{\mathrm{spat}}_{n}$, the estimators $\widetilde{\bm{\mu}}^{\,\mathrm{spec}}_{n}\big(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n}\big)$ and $\widetilde{\bm{\sigma}}^{2}_{n}\big(\bm{u},\,\mathbf{C}^{\mathrm{spec}}_{n},\,\mathbf{C}^{\mathrm{spat}}_{n}\big)$ are obtained by applying Eqs. (21) and (22) iteratively until convergence to a fixed point. Such estimators define a stationary point of $\mathscr{C}$ with respect to the parameters $\bm{\mu}_{n}$ and $\bm{\sigma}_{n}$; the corresponding partial derivatives of $\mathscr{C}$ therefore vanish. Hence, by the chain rule, the derivatives of the auxiliary function $\mathscr{D}$ in $\bm{u}$ are simply given by $\nabla_{\bm{u}}\mathscr{C}$ in Eq. (31) evaluated at the stationary point. Thanks to this property, solving:

	$\displaystyle\widetilde{\bm{u}}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\mathscr{D}\big(\bm{u},\big\{\widetilde{\mathbf{C}}^{\mathrm{spec}}_{n},\widetilde{\mathbf{C}}^{\mathrm{spat}}_{n}\big\}_{n\in\mathbb{K}}\big)$		(39)

can be done similarly to solving the constrained reconstruction problem in Eq. (29), that is with a quasi-Newton method like VMLM-B (Thiébaut, 2002).
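The chain-rule argument used for the gradient of $\mathscr{D}$ can be written out explicitly. In the sketch below, $\bm{\theta}$ is a shorthand introduced here (it is not a notation of the paper) for the means and variances $\{\bm{\mu}^{\mathrm{spec}}_{n},\sigma_{n,t}^{2}\}$, $\widetilde{\bm{\theta}}(\bm{u})$ is their fixed point at given covariances, and $\mathbf{J}(\bm{u})$ is the Jacobian of $\bm{u}\mapsto\widetilde{\bm{\theta}}(\bm{u})$, assumed to exist:

```latex
\nabla_{\bm{u}}\mathscr{D}(\bm{u})
  = \left.\nabla_{\bm{u}}\mathscr{C}(\bm{\theta},\bm{u})\right|_{\bm{\theta}=\widetilde{\bm{\theta}}(\bm{u})}
  + \mathbf{J}(\bm{u})^{\top}
    \underbrace{\left.\nabla_{\bm{\theta}}\mathscr{C}(\bm{\theta},\bm{u})\right|_{\bm{\theta}=\widetilde{\bm{\theta}}(\bm{u})}}_{=\,\bm{0}\ \text{(stationary point)}}
  = \left.\nabla_{\bm{u}}\mathscr{C}(\bm{\theta},\bm{u})\right|_{\bm{\theta}=\widetilde{\bm{\theta}}(\bm{u})}
```

The Jacobian $\mathbf{J}(\bm{u})$, which has no closed form, therefore never needs to be computed: evaluating Eq. (31) at the fixed point is sufficient.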

Minimizing the auxiliary function $\mathscr{D}$ instead of $\mathscr{C}$ , the estimators are obtained by the following algorithm:

1.	Let $i=0$ , assume a null object $\widetilde{\bm{u}}^{[0]}=\bm{0}$ , and initialize model statistics $\widetilde{\bm{\Omega}}^{[0]}$ as in Step 2 of the first iteration of the algorithm given in Sect. 3.3.1.
2.	Update the object by minimizing the auxiliary cost function:
	$\widetilde{\bm{u}}^{[i+1]}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\mathscr{D}\Big(\bm{u},\big\{\widetilde{\mathbf{C}}^{\mathrm{spec}\,[i]}_{n},\widetilde{\mathbf{C}}^{\mathrm{spat}\,[i]}_{n}\big\}_{n\in\mathbb{K}}\Big)$.
3.	Update the nuisance statistics: $\widetilde{\bm{\Omega}}^{[i+1]}=\widetilde{\bm{\Omega}}\big(\widetilde{\bm{u}}^{[i+1]}\big)$.
4.	Let $i\leftarrow i+1$ and, unless estimators $\widetilde{\bm{\Omega}}^{[i]}$ and $\widetilde{\bm{u}}^{[i]}$ have converged, go to step 2.

As for the alternating strategy presented in Sect. 3.3.1, we assume that the partially hierarchical optimization scheme reaches convergence when the condition $\big\lVert\widetilde{\bm{u}}^{[i+1]}-\widetilde{\bm{u}}^{[i]}\big\rVert\leq\eta\,\big\lVert\widetilde{\bm{u}}^{[i+1]}\big\rVert$ is satisfied, with $\eta=10^{-6}$.

It may be noted that the estimators $\widetilde{\bm{\mu}}^{\,\mathrm{spec}\,[i+1]}_{n}$ and $\widetilde{\sigma}^{2\,[i+1]}_{n,t}$ can also be considered as a by-product of the minimization of $\mathscr{D}$ in Step 2 of the above algorithm. Hence, Step 3 can be modified to restrict the updating of the nuisance statistics to that of the covariances $\widetilde{\mathbf{C}}^{\mathrm{spec}}_{n}$ and $\widetilde{\mathbf{C}}^{\mathrm{spat}}_{n}$ ( $\forall n\in\mathbb{K}$ ), e.g. by finding a fixed point of Eqs. (23)–(26).

The hierarchical optimization in Step 2 yields estimates such that Eqs. (21), (22), and (29) jointly hold for given covariance matrices. As a result, the convergence speed is improved compared to the algorithm described in Sect. 3.3.1.

3.3.3 Fully hierarchical approximation

In principle, all the parameters could be found by solving:

	$\displaystyle\widetilde{\bm{u}}=\operatorname*{arg\,min}_{\bm{u}\geq\bm{0}}\big\{\mathscr{F}(\bm{u})\equiv\mathscr{C}\big(\widetilde{\bm{\Omega}}(\bm{u}),\bm{u}\big)\big\},$		(45)

and taking $\widetilde{\bm{\Omega}}=\widetilde{\bm{\Omega}}(\widetilde{\bm{u}})$ . The estimator $\widetilde{\bm{\Omega}}(\bm{u})$ of the nuisance statistics is however not truly a stationary point of $\mathscr{C}$ for the covariance matrices $\widetilde{\mathbf{C}}^{\mathrm{spec}}_{n}$ and $\widetilde{\mathbf{C}}^{\mathrm{spat}}_{n}$ although it is a stationary point for the other parameters of the nuisance statistics. We nevertheless make the following approximation:

	$\displaystyle\nabla_{\bm{u}}\mathscr{F}(\bm{u})\approx\left.\nabla_{\bm{u}}\mathscr{C}(\bm{\Omega},\bm{u})\right|_{\bm{\Omega}=\widetilde{\bm{\Omega}}(\bm{u})},$		(46)

since, under this approximation, the constrained problem in Eq. (45) can be solved by a quasi-Newton method such as VMLM-B (Thiébaut, 2002).

In practice, we verified numerically that the approximation in Eq. (46) holds to a numerical precision that is sufficient to achieve the convergence of the quasi-Newton method. We also verified that the fully alternating strategy described in Sect. 3.3.1 and the fully hierarchical approach assuming Eq. (46) both converge to the same estimators. The fully hierarchical approach is however much faster than the algorithms presented in Sects. 3.3.1 and 3.3.2. For example, the approximate fully hierarchical algorithm reduces the computational load of the alternating strategy by a factor comparable to the number of iterations $i$ required to reach convergence with the algorithm described in Sect. 3.3.1 (ranging from 30 to 100 in practice). Consequently, we exclusively employed the approximate fully hierarchical optimization strategy throughout this paper and recommend it as the preferred method for estimating parameters in REXPACO ASDI.

3.4 Unsupervised setting of the regularization hyper-parameters

As in our previous work on the REXPACO algorithm (Flasseur et al., 2021), we propose a strategy to set optimally, and in a data-driven fashion, the hyper-parameters $\bm{\beta}=\{\beta_{\text{smooth}},\beta_{\text{sparse}}\}$ involved in the regularization term $\mathscr{R}$ of Eq. (30). These two free parameters represent the relative weights of the two combined priors on the sought flux distribution and they also set the relative weight of the priors with respect to the data-fidelity term $\mathscr{L}$ defined in Eq. (7). In other words, the hyper-parameters $\bm{\beta}$ set a critical bias-variance trade-off. These hyper-parameters can be tuned manually by trial and error until the reconstruction is qualitatively acceptable, but this approach relies on the user's judgment and the resulting setting is unlikely to be optimal. Instead, we capitalize on the large variety of methods available in the signal processing literature to set regularization hyper-parameters by minimizing a figure of merit, see e.g. Craven & Wahba (1978); Wahba et al. (1985); Stein (1981). One of these criteria is Stein's Unbiased Risk Estimator (SURE; Stein (1981)), which we also selected among other metrics in our previous works dedicated to the post-processing of high-contrast observations (Flasseur et al., 2020b, 2021) given its ability to approximate the mean square error (MSE) in the measurement space:

	$\displaystyle\text{MSE}(\bm{\beta})=\sum_{n\in\mathbb{K}}\sum_{t=1}^{T}\left\|\frac{1}{\widetilde{\sigma}_{n,t}^{2}}\mathbf{E}_{n,t}\,\left(\mathbf{M}\left(\bm{u}_{\text{gt}}-\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})\right)\right)\right\|_{\widetilde{\mathbf{\Gamma}}_{n}}^{2}\,,$		(47)

with $\bm{u}_{\text{gt}}$ the unknown ground truth flux distribution and $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ the flux distribution reconstructed from the data $\bm{v}$ using the set of regularization hyper-parameters $\bm{\beta}$ . It is shown in the literature (Stein, 1981) that the SURE estimator gives an unbiased estimation of $\text{MSE}(\bm{\beta})$ without requiring the value of the unknown ground truth flux-distribution $\bm{u}_{\text{gt}}$ involved in the MSE (47).

By extending our previous work (Flasseur et al., 2021) to the multi-spectral model of the nuisance and of the object components, the resulting SURE risk estimator can be numerically evaluated by:

	$\displaystyle\text{SURE}(\bm{\beta})\approx\sum_{n\in\mathbb{K}}\sum_{t=1}^{T}\left\|\frac{1}{\widetilde{\sigma}_{n,t}^{2}}\mathbf{E}_{n,t}\left(\bm{v}-\widetilde{\bm{\mu}}^{\mathrm{spec}}-\mathbf{M}\,\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})\right)\right\|_{\widetilde{\mathbf{\Gamma}}_{n}}^{2}$
	$\displaystyle\qquad+(2/\xi)\,\bm{b}^{\top}\,\mathbf{M}\,\left[\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})-\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})\right]-N\,T\,L\,,$		(50)

where $\bm{b}\in\mathbb{R}^{N^{\prime}\,T\,L}$ is an independent and identically distributed pseudo-random vector of unit variance, and $\xi$ is the amplitude of this perturbation. This expression, like the MSE in Eq. (47), is tailored to our problem: it accounts for the structured model of the covariances of the nuisance (i.e., separable spatially and spectrally), as defined by the matrix $\mathbf{\Gamma}$. It also accounts for our patch-based strategy to model the full covariance through the partition of the image into non-overlapping patches with the operator $\mathbf{E}$. In addition, expression (50) is a practical approximation of the original SURE criterion, which involves the Jacobian matrix of the mapping $\bm{v}\mapsto\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ with respect to the components of the data $\bm{v}$. Given that there is no closed-form expression for such a term, we approximate it by finite differences through a Monte-Carlo perturbation of the data, as proposed by Girard (1989); Ramani et al. (2012). This strategy leads to the approximate expression (50) involving the reconstruction of the two flux distributions $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ and $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})$ obtained respectively from the data $\bm{v}$ and the perturbed counterpart $\bm{v}+\xi\bm{b}$. The optimal setting $\widetilde{\bm{\beta}}^{\text{SURE}}$ of the regularization hyper-parameters $\bm{\beta}$ is obtained by minimizing the SURE score (50) with respect to $\bm{\beta}$.
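The Monte-Carlo finite-difference term of Eq. (50) can be illustrated on a much simpler setting. The sketch below assumes white noise of unit variance, an identity forward operator, and a linear shrinkage estimator standing in for the reconstruction; none of these simplifications hold in REXPACO ASDI, and all function names are hypothetical. The probe vector uses random signs, which is one possible choice of i.i.d. entries with zero mean and unit variance.

```python
import random

def shrink(v, beta):
    """Toy 'reconstruction': minimizer of ||v - u||^2 + beta * ||u||^2."""
    return [x / (1.0 + beta) for x in v]

def mc_sure(v, beta, xi, rng):
    """Residual + (2/xi) * b^T [u(v + xi*b) - u(v)] - N, for unit-variance noise."""
    n = len(v)
    b = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # random-sign probe
    u = shrink(v, beta)
    u_pert = shrink([x + xi * bi for x, bi in zip(v, b)], beta)
    residual = sum((x - ux) ** 2 for x, ux in zip(v, u))
    divergence = sum(bi * (up - ux) for bi, up, ux in zip(b, u_pert, u)) / xi
    return residual + 2.0 * divergence - n

# Synthetic data: a step signal plus unit-variance Gaussian noise.
data_rng = random.Random(0)
truth = [3.0] * 50 + [0.0] * 50
v = [x + data_rng.gauss(0.0, 1.0) for x in truth]

# Two reconstructions per tested beta, as in the text; keep the best score.
betas = [0.0, 0.1, 0.3, 1.0, 3.0]
scores = {beta: mc_sure(v, beta, xi=0.1, rng=random.Random(1)) for beta in betas}
beta_sure = min(scores, key=scores.get)
```

For this linear estimator the finite difference is exact, so the probe only serves to estimate a trace. For a non-linear reconstruction such as the one in Sect. 3.2, the result also depends on the choice of $\xi$.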

In Fig. 3, we illustrate the benefits of the proposed data-driven setting of the regularization hyper-parameters $\bm{\beta}$ by resorting to the numerical injection of a synthetic elliptical disk within an object-free dataset of the HD 172555 star obtained with the VLT/SPHERE-IFS instrument (see Sect. 4.1 for the description of the dataset). The experiments are conducted for a disk of contrast (see definition in Sect. 1) $\alpha_{\text{gt}}=1\times 10^{-5}$ in every spectral channel. The corresponding ground truth flux distribution $\bm{u}_{\text{gt}}$ to be reconstructed is given in Fig. 10 bottom-left. We start by comparing the SURE criterion (50) to the MSE (47). The tested values of the hyper-parameters are $\beta_{\text{smooth}}\in\left[1\times 10^{2},1\times 10^{10}\right]$ and $\beta_{\text{sparse}}\in\left[7.5\times 10^{0},7.5\times 10^{8}\right]$ with a regular sampling of $\log(\bm{\beta})$. For the computation of the SURE metric, we have to set the value of the parameter $\xi$ involved in Eq. (50), namely the strength of the perturbation $\bm{b}$. We found this value not to be critical, yet it should be neither too small, to prevent errors due to numerical underflows in the computation of the difference $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})-\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$, nor too large, so that the approximation (50) stays valid. As in our previous work on the REXPACO algorithm (Flasseur et al., 2021), we set it empirically to $\xi=0.1\times\text{MAD}(\bm{v})$, where the median absolute deviation $\text{MAD}(\bm{v})=\text{median}(|\bm{v}-\text{median}(\bm{v})|)$ is a robust estimator of the dispersion of the data $\bm{v}$ (proportional to the standard deviation for Gaussian data). Panel (a) of Fig. 3 gives the results of the comparison between MSE and SURE. It illustrates that our custom SURE definition is an accurate proxy of the MSE: the global minimum of the two metrics is obtained for the same tested values of our grid of parameters $\bm{\beta}$. The SURE criterion (50) can thus be safely used to approximate the MSE when facing real cases where the ground truth flux distribution $\bm{u}_{\text{gt}}$ is not available. Panel (b) of Fig. 3 completes this study by showing an example of the reconstructed flux distribution in three cases: an under-regularized reconstruction (i.e., $\widetilde{\bm{\beta}}<\widetilde{\bm{\beta}}^{\text{MSE}}$), the optimal regularization (i.e., $\widetilde{\bm{\beta}}=\widetilde{\bm{\beta}}^{\text{MSE}}=\widetilde{\bm{\beta}}^{\text{SURE}}$), and an over-regularized reconstruction (i.e., $\widetilde{\bm{\beta}}\gg\widetilde{\bm{\beta}}^{\text{MSE}}$). It illustrates the benefits of the regularization with an optimal strength: the reconstructed flux distribution is very similar to the ground truth presented in Fig. 10 bottom-left. The nuisance component is well discarded, even very close to the host star, and the reconstructed disk has sharp edges matching the ground truth. An under-regularization causes a slightly worse rejection of the nuisance component (i.e., a non-null background remains in the reconstruction) and the reconstructed disk exhibits some ripples and non-homogeneous parts. In the opposite case of an over-regularization, the reconstructed flux distribution is severely biased towards zero, which results in significant morphological distortions of the disk, in particular due to an overly strong promotion of sparsity.
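The two practical ingredients of this experiment, the perturbation amplitude $\xi=0.1\times\text{MAD}(\bm{v})$ and the regular sampling of $\log(\bm{\beta})$, can be sketched as follows. The function names and the number of grid points (9) are illustrative assumptions; the text does not state how many values were tested.

```python
import math
import statistics

def mad(values):
    """Median absolute deviation: median(|v - median(v)|)."""
    med = statistics.median(values)
    return statistics.median([abs(x - med) for x in values])

def perturbation_amplitude(values, factor=0.1):
    """Amplitude xi = factor * MAD(v), with factor = 0.1 as in the text."""
    return factor * mad(values)

def log_grid(lo, hi, n):
    """n values between lo and hi, regularly spaced in log10."""
    a, b = math.log10(lo), math.log10(hi)
    return [10.0 ** (a + i * (b - a) / (n - 1)) for i in range(n)]

# The MAD is barely affected by the single outlier in this toy data vector.
xi = perturbation_amplitude([1.0, 2.0, 3.0, 4.0, 100.0])

# Hypothetical grids covering the ranges quoted in the text.
beta_smooth_grid = log_grid(1e2, 1e10, 9)
beta_sparse_grid = log_grid(7.5e0, 7.5e8, 9)
```

Using the median rather than the mean keeps $\xi$ tied to the bulk of the data, which matters for high-contrast images where a few very bright pixels near the star would otherwise dominate a standard-deviation estimate.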

By construction, the optimal setting of the hyper-parameters $\bm{\beta}$ by minimizing the SURE criterion (50) requires performing two reconstructions ($\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v})$ and $\widetilde{\bm{u}}_{\bm{\beta}}(\bm{v}+\xi\bm{b})$) for each tested pair $\bm{\beta}$ of hyper-parameters. Given that more than 120 individual reconstructions are presented in the following section to evaluate the performance of the proposed approach, deriving $\widetilde{\bm{\beta}}^{\text{SURE}}$ for each reconstruction would have entailed an unreasonable computational overhead. We thus chose to evaluate the optimal setting in only one case: the disk of SAO 206462. This computation leads to $\widetilde{\beta}_{\text{sparse}}^{\text{SURE}}=7.5\times 10^{4}$ and $\widetilde{\beta}_{\text{smooth}}^{\text{SURE}}=1\times 10^{6}$. These values are not too far from the optimal ones derived in the numerical experiments of Fig. 3 on a totally different dataset. When facing a new dataset, we thus simply scale these pre-computed values according to the number of frames within the target dataset relative to the dataset of SAO 206462, in order to keep a constant relative weighting between the regularization and the data-fidelity terms. We found that this setting was qualitatively acceptable in all our experiments, i.e. no significant artifact was ever observed either in terms of a bad rejection of the nuisance component or in terms of non-physical discontinuities in the disk structures. We recommend using this strategy when processing a large number of datasets. A careful data-dependent and data-driven setting of the hyper-parameters $\bm{\beta}$ with SURE can be reserved for specific cases where the setting seems to be more critical (e.g., in the case of a very faint disk) or to refine the reconstruction obtained with the pre-computed and scaled values of the regularization hyper-parameters.
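One possible reading of this rescaling rule is sketched below. It assumes a linear scaling with the number of frames $T$, which is natural since the data-fidelity term sums over $T$ frames, but the exact rule is not given in the text. The reference value $T=63$ is taken from the SAO 206462 entry of Table 2, and the function name is hypothetical.

```python
# Reference values obtained with SURE on SAO 206462 (from the text).
BETA_REF = {"sparse": 7.5e4, "smooth": 1e6}
# Number of temporal frames of the SAO 206462 dataset (Table 2).
T_REF = 63

def scale_hyperparameters(t_target, t_ref=T_REF, beta_ref=BETA_REF):
    """Scale the reference hyper-parameters linearly with the frame count.

    Assumption: the data-fidelity term grows proportionally to the number
    of frames, so the regularization weights are scaled by t_target / t_ref.
    """
    if t_target <= 0:
        raise ValueError("the number of frames must be positive")
    ratio = t_target / t_ref
    return {name: value * ratio for name, value in beta_ref.items()}

# Example: a dataset with 87 frames, such as PDS 70 in Table 2.
beta_pds70 = scale_hyperparameters(87)
```

If the number of spectral channels or the noise level differs strongly from the reference dataset, this simple proportional rule may not be sufficient and a dedicated SURE evaluation remains the safer option.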

4 Results

4.1 Datasets description

Table 2: Summary of the main observational parameters for the VLT/SPHERE datasets analyzed in this paper. The columns include: target name, ESO survey ID, observation date, number ($L$) of spectral channels, spectral filter band ($\Delta_{\lambda}$), number ($T$) of available temporal frames, total apparent field of view rotation ($\Delta_{\text{par}}$), number of sub-integration exposures (NDIT), individual exposure time (DIT; Detector Integration Time), average coherence time ($\tau_{0}$), average seeing, and the first publication reporting an analysis of the same data. All observations were conducted using the apodized Lyot coronagraph (Carbillet et al., 2011) on the VLT/SPHERE instrument.

${}^{\text{(a)}}$ The contribution of the three known exoplanets (HR 8799 c, d, e), which are within the SPHERE-IFS field of view, was masked.
${}^{\text{(b)}}$ While the IRDIS dataset from the same epoch (recorded simultaneously using the IRDIFS-EXT mode of SPHERE) was analyzed in Boccaletti et al. (2021), no reconstruction from the IFS dataset was reported in that study.
${}^{\text{(c)}}$ The first value is the real amplitude of the parallactic rotation, while the second corresponds to the simulated parallactic rotation used in our experiments with synthetic disk simulations (see Sect. 4.4).

Target	ESO ID	Obs. date	$L$	$\Delta_{\lambda}$ ($\mu\mathrm{m}$)	$T$	$\Delta_{\text{par}}$ (°)	NDIT	DIT (s)	$\tau_{0}$ (ms)	Seeing (”)	Related paper
SPHERE-IFS data used for validation of the statistical model, see Sect. 4.2
HR 8799 ${}^{\text{(a)}}$	095.C-0298(C)	2015-07-04	39	0.96-1.64	46	16.4	4	64	2.3	0.94	Langlois et al. (2021)
SPHERE-IFS data used for qualitative analysis by reconstructing known real disks, see Sect. 4.3
HR 4796	095.C-0298(H)	2015-02-03	39	0.96-1.33	56	48.2	4	64	13.7	0.67	Milli et al. (2017)
SAO 206462	095.C-0298(A)	2015-05-15	39	0.96-1.64	63	63.7	4	64	8.9	0.59	Maire et al. (2017)
MWC 758	1100.C-0481(K)	2018-12-17	39	0.96-1.33	63	29.2	4	96	8.3	0.98	Boccaletti et al. (2021)
PDS 70	1100.C-0481(D)	2018-02-24	39	0.96-1.64	87	93.4	3	96	7.5	0.66	Mesa et al. (2019b)
HD 163296	1100.C-0481(G)	2018-05-07	39	0.96-1.64	48	14.2	3	96	2.6	1.04	Mesa et al. (2019a)
AB Aurigae	104.20V7.001	2020-01-18	39	0.96-1.64	51	38.5	2	64	5.6	0.71	this paper ${}^{\text{(b)}}$
SPHERE-IFS data used for quantitative analysis by reconstructing synthetic disks, see Sect. 4.4
HD 172555	095.C-0192	2015-07-11	39	0.96-1.33	62	12.9/30.0 ${}^{\text{(c)}}$	8	32	3.9	1.20	Flasseur et al. (2020b)
SPHERE-IRDIS data used to compare ADI and ASDI post-processing, see Sect. 4.5
SAO 206462	095.C-0298(A)	2015-05-15	2	2.11-2.25	63	63.7	4	64	8.9	0.59	Maire et al. (2017)

For our comparisons, we selected eight datasets from the SPHERE-IFS instrument, acquired under diverse observing conditions.

First, in Sect. 4.2, we consider a dataset of HR 8799 to assess the relevance of the statistical model proposed in this paper. This emblematic star hosts four known exoplanets, all detected by direct imaging (Marois et al., 2008, 2010), three of which fall within the SPHERE-IFS field of view. After masking the contribution of these point-like sources within the data, we conduct a model ablation analysis to show that it is critical to accurately model the correlations of the nuisance component.

Then, in Sects. 4.2 and 4.3, we consider six additional datasets from stars with previously imaged circumstellar disks. These datasets are used to qualitatively assess the benefits of the proposed algorithm on real disks in comparison to baseline methods. The selected disks are at different evolutionary stages and have very diverse morphologies. The stars included in the analysis are:
– HR 4796A, which is the primary member of a binary system within the TW Hydrae association with an age of about 12 Myr (Bell et al., 2015). Located at about 72.8 pc (Van Leeuwen, 2007), HR 4796A harbors a debris disk observable in a face-on configuration, initially imaged by the Hubble Space Telescope (Schneider et al., 1999). Subsequently, its morphology and spectroscopy have been studied intensively by direct imaging (Milli et al., 2017, 2019). The disk showcases a slender ring and a high surface brightness hinting at the potential presence of exoplanets, though no companion has been detected yet.
– SAO 206462, which is located within the Upper Centaurus Lupus association, has an estimated age of about 9 Myr (Müller et al., 2011). Located at about 157 pc (Brown et al., 2016), it hosts a nearly face-on transition disk imaged both in thermal emission (Doucet et al., 2006) and in scattered light (Grady et al., 2009). It includes two discernible spiral arms, several asymmetric features, and an inner cavity. High-contrast and high-resolution observations suggest that the observed structures may be attributed to the presence of low-mass exoplanets located within the spiral arms or within the inner cavity (Maire et al., 2017).
– MWC 758, which is located within the Taurus association, has an estimated age of about 3.5 Myr. Located at about 156 pc (Brown et al., 2021), it hosts a protoplanetary disk in the form of a spiral with (at least) three arms (Reggiani et al., 2018). Recently, two candidate protoplanets have been proposed based on the post-processing of VLT/SPHERE and LBTI/LMIRCam observations by algorithms dedicated to the detection of point-like sources (Reggiani et al., 2018; Wagner et al., 2023). The first one is interior to the spiral (angular separation about 0.11”) and the second one is exterior to the Southern arm (angular separation about 0.62”). According to numerical models, each of these two massive candidate exoplanets would be able to generate the observed spiral arm (Wagner et al., 2019). However, the existence of these candidate exoplanets remains uncertain given the presence of disk material at their locations, which could lead to disk features being misinterpreted as point-like sources.
– PDS 70, which is located within the Scorpius-Centaurus association, has an estimated age of about 5 Myr (Müller et al., 2018). Located at about 113 pc (Brown et al., 2016), this star is notable for hosting a protoplanetary disk within which two confirmed exoplanets, PDS 70 b and PDS 70 c, are in the process of formation. The exoplanet PDS 70 b was directly imaged using the VLT/SPHERE instrument in near-infrared (Keppler et al., 2018), while PDS 70 c was unveiled through observations with the VLT/MUSE instrument in $\text{H}_{\alpha}$ (Haffert et al., 2019). A third candidate exoplanet was also recently detected using JWST observations in the near and mid-infrared (Christiaens et al., 2024). By harboring multiple nascent exoplanets, this system stands as a unique case. Several structures such as arcs, outer and inner gaps, and potential spiral arms, particularly on the north side of the outer disk, were also resolved by direct imaging (Riaud et al., 2006; Keppler et al., 2018; Mesa et al., 2019b; Juillard et al., 2022).
– HD 163296, which is located within the Sagittarius association, has an estimated age of about 5 Myr. Located at about 101.5 pc (Gaia et al., 2018), it hosts a protoplanetary disk with a diameter larger than 1000 au (Isella et al., 2007; Tilling et al., 2012; Muro-Arena et al., 2018). Sub-millimeter observations have shown that this disk harbors multiple rings whose structures are due to variations in the gas pressure (Teague et al., 2018). Moreover, multiple asymmetries in the continuum emission have been observed, which supports the hypothesis of the existence of (yet undetected) sub-stellar companions (Isella et al., 2018). Near-infrared observations with VLT/SPHERE made it possible to place mass limits of about 3-4 $M_{\text{Jup}}$ at 30 au, 6-7 $M_{\text{Jup}}$ between 30 and 80 au, and 2-4 $M_{\text{Jup}}$ beyond 200 au for such plausible exoplanets (Mesa et al., 2019a).
– AB Aurigae, which is located within the Auriga association, has an estimated age of about 4 Myr. Located at about 163 pc (Brown et al., 2016), it hosts a protoplanetary disk with complex spiral features (Boccaletti et al., 2020). Recently, three candidate point-like sources were identified within the circumstellar environment. Two of them were identified from VLT/SPHERE observations (Boccaletti et al., 2020). The first one appears very elongated and is embedded within the Southern spiral arm. The second one is located exterior to the Northern spiral arm and is more similar to a point-like source (while being detectable only from SPHERE-IRDIS data and not from SPHERE-IFS data recorded simultaneously). In addition, these two features are detectable both in polarimetry and total intensity, which suggests that they are more likely due to scattering dust particles (Boccaletti et al., 2020). A third candidate protoplanet was identified by Currie et al. (2022b) from SUBARU/SCExAO data. It behaves as a bright emission source at an angular separation of about 0.59”, interior to a dust ring resolved in millimeter observations. However, given that the candidate exoplanet would be at its first stage of formation, likely still accreting material from the disk, it does not appear as a point-like source, but rather as a very elongated pattern, which makes the detection difficult to confirm. Nevertheless, its location and estimated SED would be compatible with model predictions as a driver of the observed spiral arms (Currie et al., 2022b).

In Sect. 4.4, we quantitatively assess the performance of the proposed algorithm against baseline methods of the field. To this end, we resort to numerical injections of synthetic disks of various morphologies into a real SPHERE-IFS dataset of the HD 172555 star (Schütz et al., 2005; Lisse et al., 2009). To the best of our knowledge, no off-axis objects (either point-like sources or disks) have ever been imaged around this star within the SPHERE-IFS field of view (Nielsen et al., 2008; Nielsen & Close, 2010). We also generate a synthetic vector of parallactic angles (linearly distributed between 0° and 30°) differing from the experimental values, to cancel out any potential signal from (unknown) real objects.

Finally, in Sect. 4.5, we consider an additional dataset from the Infrared Dual-band Imager and Spectrograph (IRDIS; Dohlen et al. (2008)) of the SPHERE instrument. Its dual-band mode allows simultaneous imaging in two distinct spectral channels for each individual exposure (Vigan et al., 2014). The selected dataset corresponds to the observation of the star SAO 206462. The IRDIS and IFS data of this star were collected simultaneously using the IRDIFS-EXT mode of the SPHERE instrument (Beuzit et al., 2019). In our previous work with the ADI version of the REXPACO algorithm (Flasseur et al., 2021), we processed this dataset but with a mono-spectral approach. In Sect. 4.5, we revisit these data with the proposed REXPACO ASDI algorithm to illustrate the benefits of a joint spectral processing.

All datasets were calibrated and assembled from SPHERE raw data using the pre-reduction and handling pipeline of the SPHERE consortium (Pavlov et al., 2008). During this step, background, flat-field, bad pixels, registration, true-North, wavelength and astrometric calibrations are performed. These standard pre-processing steps are followed by additional refinements implemented at the SPHERE Data Center (Delorme et al., 2017), aimed at reducing cross-talk, enhancing bad pixel correction, and mitigating spectral cross-talk effects.

Table 2 summarizes the main observation parameters associated with each dataset.

4.2 Validation of the statistical model of the nuisance component

Before evaluating the reconstruction method on high-contrast observations of circumstellar disks, we aim to show that our statistical model of the nuisances is relevant. We use the same ASDI dataset (HR 8799, 2015-07-04) as in Flasseur et al. (2020b) with the three known exoplanets within the SPHERE-IFS field of view masked out so that the resulting data correspond only to the nuisance term. Following a similar analysis as in Flasseur et al. (2020b), Fig. 4 displays the empirical distribution of all patches in the field of view after performing different post-processing. If random vectors $\bm{v}_{n}$ are accurately modeled by a Gaussian distribution with mean $\bm{\mu}_{n}$ and covariance $\mathbf{C}_{n}$ , as described in Eq. (2), the centered and whitened vectors $\mathbf{C}_{n}^{-1/2}(\bm{v}_{n}-\bm{\mu}_{n})$ should follow $\mathcal{N}(\bm{0},\mathbf{I})$ , corresponding to the red dashed line in Fig. 4. We thus compare a standard Gaussian distribution with the empirical marginal distribution of $\mathbf{C}_{n}^{-1/2}(\bm{v}_{n}-\bm{\mu}_{n})$ for several covariance models: the three models considered in Flasseur et al. (2020b), drawn in gray dashed-lines: (i) no covariance ( $\mathbf{C}_{n}=\mathbf{I}$ ); (ii) only spatial covariances; (iii) spatial covariances plus temporal and spectral weighting; and four additional models: (iv) diagonal spatial and spectral covariances (i.e., spatial, spectral, and temporal weighting via a separable model); (v) full spatial covariance, diagonal spectral covariance, and temporal weighting; (vi) full spectral covariance, diagonal spatial covariance, and temporal weighting; and finally (vii) the full separable model introduced in this paper, see Eq. (5). As shown by Fig. 4, the full spatio-spectral separable model (green curve) provides the best fit to the empirical distribution (i.e., the green curve closely matches the red dashed line of the standard Gaussian distribution). 
This justifies the use of the full spatio-spectral separable model in our loss function $\mathscr{L}_{n}$ . As an average trend over the field of view, we also observe that neglecting spatial covariances is more detrimental than ignoring spectral covariances, as model (v) better approximates $\mathcal{N}(\bm{0},\mathbf{I})$ than model (vi). Figure 5 completes this study with a more localized examination of the empirical distribution of patches for models (iv)-(vii) across two nuisance regimes: (1) a regime near the star where speckles dominate, and (2) a regime at larger separations where stochastic noise prevails. A similar representation was provided in Fig. 4 of Flasseur et al. (2020b) on this dataset for the three other models, considered in our previous work for exoplanet detection in angular and spectral differential imaging: (i) no covariance; (ii) spatial covariances only; and (iii) spatial covariances with temporal and spectral weighting. Based on this analysis, the full spatio-spectral separable model introduced in this paper is the most effective at statistically describing the fluctuations of the nuisance component across both noise regimes (i.e., regardless of the distance to the star). Notably, for all models, the empirical distributions of centered and whitened patches follow a Gaussian law with zero mean and unit variance more closely far from the star than near the star. This is to be expected, given that the nuisance is stronger, more correlated, and fluctuates more in the vicinity of the star than farther away; see Fig. 2.
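The whitening test used in this validation can be illustrated on a toy example. The sketch below is not the REXPACO ASDI implementation: it assumes a two-dimensional Gaussian with a known mean and covariance (hypothetical values), and it whitens with the inverse Cholesky factor rather than the symmetric inverse square root $\mathbf{C}_{n}^{-1/2}$, since both transforms yield an identity covariance. It contrasts a correct covariance model with model (i), $\mathbf{C}_{n}=\mathbf{I}$.

```python
import math
import random


def cholesky_2x2(c):
    """Lower-triangular factor L of a 2x2 symmetric positive-definite C = L L^T."""
    l11 = math.sqrt(c[0][0])
    l21 = c[1][0] / l11
    l22 = math.sqrt(c[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def whiten(v, mu, chol):
    """Return z solving L z = v - mu (forward substitution)."""
    z0 = (v[0] - mu[0]) / chol[0][0]
    z1 = (v[1] - mu[1] - chol[1][0] * z0) / chol[1][1]
    return (z0, z1)


def variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical mean and covariance of a correlated "nuisance" vector.
mu = (3.0, -1.0)
cov = [[4.0, 1.8], [1.8, 1.0]]
chol = cholesky_2x2(cov)

# Draw correlated Gaussian samples v = mu + L g with g ~ N(0, I).
rng = random.Random(0)
samples = []
for _ in range(20000):
    g0, g1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    samples.append((mu[0] + chol[0][0] * g0,
                    mu[1] + chol[1][0] * g0 + chol[1][1] * g1))

# Whitening with the correct model versus centering only (C = I).
whitened = [whiten(v, mu, chol) for v in samples]
centered_only = [(v[0] - mu[0], v[1] - mu[1]) for v in samples]

var_whitened = variance([z[0] for z in whitened])
var_identity_model = variance([z[0] for z in centered_only])
```

With the correct covariance, the empirical variance of the whitened component is close to one, whereas the identity model leaves a variance close to the original value of four. This mismatch is what shows up as a departure from the standard Gaussian curve.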

We conclude this ablation study by showing how the reconstruction results are impacted if simpler covariance models are considered rather than the full model of Eq. (5). Figure 6 displays examples on real data of the reconstructed disk component for the same four nuisance models as in Fig. 5 (i.e., models (iv)-(vii)). The datasets of HR 4796 and MWC 758 suffer from a strong nuisance component. Ignoring the spatial correlations leads to severe artifacts: a ghost circular structure is reconstructed and contaminates a large fraction of the field of view. For SAO 206462 and PDS 70, close inspection of the central region reveals spurious structures in all reconstructions except those obtained with the full model (vii) of the nuisance. While ignoring spectral correlations is also harmful (e.g., a bright nuisance halo remains around the MWC 758 disk), its effect is less pronounced compared to omitting spatial correlations, aligning with the findings in Fig. 4, where empirical residual distributions were analyzed across the whole field of view. These qualitative observations emphasize again the value of accurately modeling the nuisance’s spatial and spectral correlations to improve the reconstruction quality.

The shrinkage parameters $\widetilde{\rho}_{n}^{\mathrm{spat}}$ and $\widetilde{\rho}_{n}^{\mathrm{spec}}$ can significantly influence the statistical model. We monitor their values by displaying maps of the spatial and spectral shrinkage parameters in Fig. 7 for the dataset shown in Fig. 1. Values of $\widetilde{\rho}_{n}^{\mathrm{spat}}$ and $\widetilde{\rho}_{n}^{\mathrm{spec}}$ remain relatively low (below 0.13), suggesting a moderate bias towards zero and indicating that the off-diagonal sample covariances are only slightly attenuated by the shrinkage. Spectral shrinkage intensifies at the edges of the field of view, where fewer samples are available due to spectral scaling (i.e., $L_{\text{eff}}\leq L$ ). Conversely, spatial shrinkage is stronger at some locations of the field of view for this dataset, illustrating that a uniform shrinkage value across the entire field of view would be sub-optimal.
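As an illustration of how a shrinkage factor attenuates off-diagonal covariances, the sketch below assumes that the shrunk covariance is the convex combination $(1-\rho)\,\mathbf{S}+\rho\,\mathrm{diag}(\mathbf{S})$ of a sample covariance $\mathbf{S}$ and its diagonal. This form and the numerical values are assumptions made for the example; the data-driven estimation of $\rho$ is not reproduced here.

```python
def shrink_covariance(sample_cov, rho):
    """Shrink a sample covariance toward its diagonal.

    Assumed form: (1 - rho) * S + rho * diag(S), so that variances are kept
    and off-diagonal covariances are scaled by (1 - rho).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    size = len(sample_cov)
    return [
        [sample_cov[i][j] if i == j else (1.0 - rho) * sample_cov[i][j]
         for j in range(size)]
        for i in range(size)
    ]


# Hypothetical 2x2 sample covariance and a shrinkage value of 0.13.
sample_cov = [[2.0, 0.8], [0.8, 1.0]]
shrunk = shrink_covariance(sample_cov, 0.13)
```

Under this assumption, a shrinkage value of 0.13 scales the off-diagonal terms by 0.87 and leaves the variances unchanged, which matches the statement that low values only slightly attenuate the off-diagonal sample covariances.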

4.3 Qualitative analysis: reconstruction of disks from SPHERE-IFS data

Table 3: Number of modes optimized for PCA ASDI reconstructions.

	Known real disks, see Sect. 4.3
HR 4796	18
SAO 206462	6
MWC 758	4
PDS 70	14
HD 163296	42
AB Aurigae	20
	Synthetic disks, see Sect. 4.4
	$\alpha_{\text{gt}}=1\times 10^{-6}$	$\alpha_{\text{gt}}=5\times 10^{-6}$	$\alpha_{\text{gt}}=1\times 10^{-5}$
elliptical disk	14	4	4
circular disk	26	10	4
spiral disk	26	12	4

Having established the benefits of the proposed statistical model, we now apply it to the six SPHERE-IFS datasets presented in Sect. 4.1, which correspond to observations of stars hosting known circumstellar disks with diverse morphological structures. These include SAO 206462 (already shown in Fig. 1) and MWC 758, both featuring a spiral disk; HR 4796, which hosts a thin elliptical disk; and PDS 70, AB Aurigae and HD 163296, each hosting a protoplanetary disk of complex shape and several candidate or confirmed exoplanets in formation within the surrounding gas and dust material. Figure 8 presents reconstructions produced by various reference methods alongside those obtained with our method. As the other methods do not perform a deconvolution, we show in the fourth column of Fig. 8 our reconstruction re-blurred at the resolution of the instrument (i.e., $\mathbf{B}\widetilde{\bm{u}}$ instead of $\widetilde{\bm{u}}$ ). Based on code availability, three standard methods were selected for comparison: (i) median ASDI (Sparks & Ford, 2002; Marois et al., 2006; Thatte et al., 2007), which estimates the nuisance component by temporally and spectrally stacking the observations using medians; (ii) PCA ASDI (Soummer et al., 2012; Amara & Quanz, 2012; Christiaens et al., 2019), which employs principal component analysis to remove the nuisance component; and (iii) PACO ASDI (Flasseur et al., 2020b), originally developed for exoplanet detection from ASDI datasets but also capable of partially reconstructing thin disks, see Sect. 1. For median ASDI and PCA ASDI, we used the Vortex Image Processing package (VIP; Gonzalez et al., 2017; Christiaens et al., 2023; see https://github.com/vortex-exoplanet/VIP), whereas we employed our unsupervised pipeline (see http://doi.org/10.5281/zenodo.3679426 for a frozen implementation) for PACO ASDI (Flasseur et al., 2020b). 
The number of modes in PCA ASDI has been manually optimized, with the selected value being constant across all angular separations (i.e., we applied so-called full frame PCA ASDI). In practice, we evaluated all possible mode numbers (in increments of two). For experiments involving a synthetic disk with a known ground truth flux distribution $\bm{u}_{\text{gt}}$ , we selected the number of modes that minimizes the MSE between the estimate $\widetilde{\bm{u}}$ and the ground truth $\bm{u}_{\text{gt}}$ . For real disks, we visually selected the optimal number of modes to best preserve fine structures while effectively removing most of the stellar leakage. Table 3 summarizes the number of modes for PCA ASDI reconstructions of the disks analyzed in this paper. For the other hyper-parameters of median ASDI and PCA ASDI, we used default values provided within VIP. The reconstructions obtained with the reference methods all suffer from noticeable artifacts, particularly at the center of the field of view where the reduced angular diversity makes it challenging to disentangle the components. In comparison, both the blurred and the deblurred reconstructions shown in the last two columns of Fig. 8 are far more satisfactory. The REXPACO ASDI reconstruction of the HR 4796 disk displays a near-continuous elliptical structure and a flux asymmetry on the West side of the ring, consistent with the predictions of intensity scattering models, see Milli et al. (2017). For SAO 206462, the REXPACO ASDI reconstruction exhibits two main spiral arms whose overall morphology and spatial extent are in good agreement with radiative transfer and hydro-dynamical models of transitional disks shaped by giant planets, which are responsible for sculpting multiple spiral arms, see e.g. Bae et al. (2016); Maire et al. (2017). 
Additionally, the REXPACO ASDI reconstructions of HR 4796 (respectively, SAO 206462, MWC 758, PDS 70, HD 163296) can be qualitatively compared with the reconstructions in Fig. 4 of Milli et al. (2017) (respectively, Fig. 1 bottom-left of Maire et al. (2017), Fig. A.1 second line of Boccaletti et al. (2021), Fig. 1 first line-second row of Mesa et al. (2019b), Fig. 1 right of Mesa et al. (2019a)). These results were derived from custom implementations of, respectively, median ASDI, RDI ADI, median ASDI, PCA ASDI, and PCA ASDI applied to the same datasets. The REXPACO ASDI reconstructions exhibit significantly fewer artifacts, such as non-physical discontinuities in the disk structures and residual stellar leakage near the star. The deconvolution step in the proposed method also enhances the spatial resolution of thin disk structures. In contrast, baseline methods like median ASDI and PCA ASDI tend to subtract part of the disk component when removing the nuisance term. This leads to substantial flux biases and a high-pass filtering effect. PACO ASDI, being optimized for point-like detections, manages to recover parts of the disks in large gradient areas. It is much more successful on the thin disk of HR 4796 and on the extended disk of PDS 70 than on the thicker spiral disks of SAO 206462 and MWC 758. Finally, the multi-spectral REXPACO ASDI reconstructions in Fig. 8 can be compared to the mono-spectral reconstructions produced by the REXPACO ADI algorithm (Flasseur et al., 2021, see their Fig. 11) on mono-spectral datasets of the same target stars (except MWC 758, HD 163296, and AB Aurigae). These mono-spectral datasets were recorded using the InfraRed Dual Imaging Spectrograph (IRDIS) of the SPHERE instrument, operating simultaneously with the IFS but in a different spectral band and resolution. 
The joint multi-spectral processing leads to a better rejection of the nuisance component, thereby reducing non-physical reconstruction artifacts such as discontinuities, especially within spiral arms. These comparisons illustrate that joint processing of multi-spectral datasets is particularly beneficial for disks having a circular symmetry, such as SAO 206462 or MWC 758, as it helps to disentangle the disk light from the stellar light. This is because these two components do not always superimpose, owing to the chromatic scaling of speckles exploited in ASDI. The advantages of joint spectral processing are further explored and discussed in Sect. 4.5.

Figure 9 focuses on protoplanetary disks MWC 758 and AB Aurigae reconstructed with the proposed REXPACO ASDI algorithm. Known disk features and (candidate) point-like sources reported in the literature, as well as new disk features identified through our reconstructions, are overlaid. For MWC 758, the three spiral arms identified by Wagner et al. (2019) (highlighted with solid arrows) are well reconstructed by REXPACO ASDI. We also reconstruct two additional elongated structures interior to the Northern main spiral arm. These features could be interpreted as additional spiral arms and they appear connected to the main spiral arms by material bridges. Neither of the two point-like sources (b and c) identified by Reggiani et al. (2018); Wagner et al. (2023) is detected in our reconstruction. This may be due to the VLT/SPHERE-IFS observations being taken in the Y-J spectral band, whereas the two candidate exoplanets were discovered using Keck/NIRC2 and LBTI/LMIRCam observations in the L’ and M’ bands, where contrast for such sources is more favorable. For AB Aurigae, REXPACO ASDI reconstructs the two main spiral arms previously identified by Boccaletti et al. (2020). We also identify additional complex structures such as gaps and splittings within the main spiral arms. Consistent with Boccaletti et al. (2020), we detect a bright emission source (f1) embedded within the Southern spiral arm, though it appears very extended, suggesting that it is part of the disk. Like Boccaletti et al. (2020), we do not detect the Northern point-like source (f2) from this SPHERE-IFS dataset. Note that the point-like source f2 was identified by Boccaletti et al. (2020) at the same epoch, but from a dataset obtained with the SPHERE-IRDIS instrument, operating simultaneously with SPHERE-IFS. We also clearly detect the Northern bright emission source (CC c) identified by Currie et al. (2022b) from SUBARU/SCExAO data. 
However, CC c does not appear as a point-like source in our reconstruction, likely because this candidate exoplanet, if real, would be at its first stage of formation, still accreting material from the disk. The SPHERE-IFS wavelengths being shorter than those of SUBARU/SCExAO, it is also possible that the point sources are beyond reach at these wavelengths with SPHERE-IFS.

4.4 Quantitative analysis: reconstruction of synthetic disks injected into SPHERE-IFS data

In this section, we quantitatively assess the performance of the proposed approach in comparison to three baseline methods: median ASDI, PCA ASDI and PACO ASDI. The general principles of these approaches are outlined in Sect. 1, and their specific settings are detailed in Sect. 4.3.

We consider three simulated disks representative of common morphologies in high-contrast observations: (i) a spatially centered elliptical disk with sharp edges and with an eccentricity of about 0.80; (ii) a circular disk with sharp edges and whose center is shifted by five pixels from the star center in the two spatial dimensions; (iii) a spiral disk exhibiting two arms with smooth edges. Figure 10 illustrates the ground truth flux distribution for each of the disk types used in this analysis.

While these toy models were not generated using physics-based simulators (e.g., modeling the hydrodynamics and radiative transfer), cases (i) and (ii) typically correspond to debris disks while case (iii) resembles a particular instance of transition or protoplanetary disks. Additionally, it can be noted that these synthetic disks resemble the real circumstellar disks reconstructed in Fig. 8 so that these simulations can help to assess the quality of the reconstructions of these real circumstellar disks: the elliptical disk (i) has a spatial extent similar to the HR 4796 disk, and the spiral disk (iii) has similar spatial extent and morphology to the SAO 206462 disk.

Each simulated disk is injected into the HD 172555 dataset (which contains no known off-axis source), at three different contrast levels $\alpha_{\text{gt}}\in\{1\times 10^{-6},5\times 10^{-6},1\times 10^{-5}\}$ . For our simulations, we consider gray objects, meaning the contrast is constant across the spectral band, resulting in an identical flux distribution across all spectral channels. Consequently, all reconstructions presented in this section are averaged over the whole spectral band. A total of 90 semi-synthetic datasets have been generated: for each disk type and contrast level $\alpha_{\text{gt}}$ , the simulated disk has been injected at ten different orientations relative to the nuisance component (which remains the same for all simulations). This simulation protocol allows us to evaluate the mean and variance of the reconstructions.

Table 4: Quantitative assessment of the reconstruction quality on synthetic disks. N-RMSE as defined in Eq. (112) is reported for the reconstructions displayed in Figs. 11-16 and 24-26. The N-RMSE is also computed on the restrictions $\mathcal{D}(\bm{u}_{\text{gt}})$ and $\mathcal{D}(\widetilde{\bm{u}})$ to the area actually covered by the simulated disks. The best scores are highlighted in bold fonts.

Score	Algorithm	$\alpha_{\text{gt}}=1\times 10^{-6}$	$\alpha_{\text{gt}}=5\times 10^{-6}$	$\alpha_{\text{gt}}=1\times 10^{-5}$
		— Elliptical disk, see Figs. 11, 12 and 24 —
$\text{N-RMSE}\left(\bm{u}_{\text{gt}},\widetilde{\bm{u}}\right)$	PACO ASDI	0.52	0.53	0.58
$\text{N-RMSE}\left(\bm{u}_{\text{gt}},\widetilde{\bm{u}}\right)$	REXPACO ASDI	0.12	0.11	0.10
$\text{N-RMSE}\left(\mathcal{D}(\bm{u}_{\text{gt}}),\mathcal{D}(\widetilde{\bm{u}})\right)$	PACO ASDI	0.26	0.41	0.53
$\text{N-RMSE}\left(\mathcal{D}(\bm{u}_{\text{gt}}),\mathcal{D}(\widetilde{\bm{u}})\right)$	REXPACO ASDI	0.10	0.06	0.04
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	median ASDI	0.68	0.56	0.46
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	PCA ASDI	0.40	0.30	0.30
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	REXPACO ASDI	0.13	0.06	0.05
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	median ASDI	0.66	0.54	0.45
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	PCA ASDI	0.39	0.29	0.27
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	REXPACO ASDI	0.12	0.03	0.01
		— Circular disk, see Figs. 13, 14 and 25 —
$\text{N-RMSE}\left(\bm{u}_{\text{gt}},\widetilde{\bm{u}}\right)$	PACO ASDI	0.74	0.71	0.77
$\text{N-RMSE}\left(\bm{u}_{\text{gt}},\widetilde{\bm{u}}\right)$	REXPACO ASDI	0.14	0.12	0.11
$\text{N-RMSE}\left(\mathcal{D}(\bm{u}_{\text{gt}}),\mathcal{D}(\widetilde{\bm{u}})\right)$	PACO ASDI	0.51	0.60	0.71
$\text{N-RMSE}\left(\mathcal{D}(\bm{u}_{\text{gt}}),\mathcal{D}(\widetilde{\bm{u}})\right)$	REXPACO ASDI	0.10	0.06	0.04
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	median ASDI	0.97	0.97	0.97
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	PCA ASDI	0.91	0.87	0.65
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	REXPACO ASDI	0.15	0.08	0.07
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	median ASDI	0.97	0.96	0.96
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	PCA ASDI	0.90	0.87	0.63
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	REXPACO ASDI	0.12	0.04	0.02
		— Spiral disk, see Figs. 15, 16 and 26 —
$\text{N-RMSE}\left(\bm{u}_{\text{gt}},\widetilde{\bm{u}}\right)$	PACO ASDI	0.63	0.64	0.69
$\text{N-RMSE}\left(\bm{u}_{\text{gt}},\widetilde{\bm{u}}\right)$	REXPACO ASDI	0.60	0.39	0.38
$\text{N-RMSE}\left(\mathcal{D}(\bm{u}_{\text{gt}}),\mathcal{D}(\widetilde{\bm{u}})\right)$	PACO ASDI	0.25	0.40	0.60
$\text{N-RMSE}\left(\mathcal{D}(\bm{u}_{\text{gt}}),\mathcal{D}(\widetilde{\bm{u}})\right)$	REXPACO ASDI	0.06	0.05	0.03
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	median ASDI	0.99	0.96	0.91
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	PCA ASDI	0.82	0.80	0.70
$\text{N-RMSE}\left(\mathbf{B}\,\bm{u}_{\text{gt}},\mathbf{B}\,\widetilde{\bm{u}}\right)$	REXPACO ASDI	0.58	0.36	0.35
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	median ASDI	0.99	0.96	0.91
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	PCA ASDI	0.82	0.80	0.69
$\text{N-RMSE}\left(\mathbf{B}\,\mathcal{D}(\bm{u}_{\text{gt}}),\mathbf{B}\,\mathcal{D}(\widetilde{\bm{u}})\right)$	REXPACO ASDI	0.14	0.05	0.04

Figures 11-12, 13-14 and 15-16 report the reconstruction results for the elliptical, circular and spiral disks, respectively. Figures 24, 25, and 26 complement these reconstruction results with a slice-cut analysis along the three profiles defined in Fig. 10.

Because median ASDI and PCA ASDI do not perform a deconvolution, the comparisons are performed at the resolution of the instrument, as in Sect. 4.3. The deconvolved flux distributions $\widetilde{\bm{u}}$ estimated by REXPACO ASDI are thus re-blurred by the off-axis PSF so that the quantity $\mathbf{B}\,\widetilde{\bm{u}}$ can be directly compared with the median ASDI and PCA ASDI images in Figs. 11, 13 and 15. The deconvolved REXPACO ASDI reconstructions $\widetilde{\bm{u}}$ are in turn compared to the PACO ASDI flux distribution maps in Figs. 12, 14 and 16.

Overall, the three comparative techniques make significant errors on the sought objects, both in terms of morphological distortions and photometric under-estimation, regardless of the type of disk. These errors are more pronounced where the diversity induced by ASDI is the most limited to disentangle the nuisance from the off-axis objects. As an illustration, the circular disk and the arms of the spiral disk are barely visible near the star in the median ASDI and PCA ASDI images, even for the brightest cases, which indicates that substantial self-subtraction occurs. In addition, some stellar leakage remains, especially near the star, due to the absence of explicit modeling of the correlations of the nuisance. Flux distributions estimated by PACO ASDI are also affected by significant artifacts: continuous structures manifest as a series of point sources due to assumptions made in the model regarding the target objects. Unlike for the other tested algorithms, this effect worsens when the contrast improves. In addition, only the gradients of smooth structures are (approximately) recovered by PACO ASDI.

In comparison, reconstructions produced by REXPACO ASDI are much closer to the ground truth, even for the lowest level of contrast $\alpha_{\text{gt}}=1\times 10^{-6}$ , with an improved object fidelity and a better rejection of the star light. Unlike median ASDI or PCA ASDI reconstructions, which display non-physical negative values, REXPACO ASDI flux distributions are consistently non-negative (see slice-cut profiles in Figs. 24-26), owing to the explicit non-negativity constraint imposed in the minimization problem (29). In addition, an important result is the ability of REXPACO ASDI to reconstruct disks having a quasi-circular symmetry (which are especially challenging to reconstruct due to the lack of angular diversity), without the need for additional diversity complementing ASDI, e.g. leveraging multiple datasets as done in RDI techniques (see Sect. 1). The deblurred reconstructions of REXPACO ASDI shown in Figs. 12, 14, 16 and 24-26 are in good agreement with the ground truth. As expected, the reconstruction fidelity is higher when the disk is brighter: more spurious fluctuations are visible in the deblurred reconstruction at $\alpha_{\text{gt}}=1\times 10^{-6}$ than at $\alpha_{\text{gt}}=1\times 10^{-5}$ . Moreover, the spatial resolution is also significantly improved by the deconvolution process. However, some discrepancies can be noted in the deblurred line profiles, such as a slight Gibbs effect (i.e., signal ripples) near sharp edges induced by the edge-preserving regularization (even if it is beneficial overall) and a residual bias on the photometry for some parts of the spiral disk (even though the overall morphology is preserved). We discuss the latter phenomenon in Sect. 4.5 dedicated to the comparison between ADI and ASDI processing.

After this qualitative analysis, we now compare, as done in Flasseur et al. (2021), the reconstruction quality of median ASDI, PCA ASDI, PACO ASDI and REXPACO ASDI by reporting the normalized root mean square error (N-RMSE; the lower the score, the higher the reconstruction fidelity):

\text{N-RMSE}(\bm{u}_{\text{gt}},\widetilde{\bm{u}})=\frac{||\bm{u}_{\text{gt}}-\widetilde{\bm{u}}||_{2}}{||\bm{u}_{\text{gt}}||_{2}}\,.

(112)

Table 4 reports the N-RMSE for two regions of the reconstructed flux distribution: (i) the entire image, and (ii) the disk area only. In the latter case, Eq. (112) is modified to account solely for disk regions. This metric shows a clear improvement brought by REXPACO ASDI compared to the other tested algorithms, with error reduction exceeding a factor of 10 for more challenging configurations (e.g., circular or spiral disks). A more modest error reduction is obtained for configurations (i.e., morphology and contrast) leading to an easier separation of the disk from the nuisance contribution, as for the elliptical disk.
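The metric of Eq. (112) and its disk-restricted variant can be sketched in a few lines. In this minimal example, images are flattened lists of pixel values, and the disk area is assumed to be the set of pixels where the ground truth is strictly positive. That mask definition, the function name `n_rmse` and the toy values are assumptions made for the illustration; the exact definition of the restriction $\mathcal{D}$ is not reproduced here.

```python
import math


def n_rmse(u_gt, u_est, mask=None):
    """Normalized RMSE ||u_gt - u_est||_2 / ||u_gt||_2, optionally on masked pixels."""
    if len(u_gt) != len(u_est):
        raise ValueError("u_gt and u_est must have the same number of pixels")
    if mask is None:
        mask = [True] * len(u_gt)
    num = math.sqrt(sum((g - e) ** 2 for g, e, m in zip(u_gt, u_est, mask) if m))
    den = math.sqrt(sum(g ** 2 for g, m in zip(u_gt, mask) if m))
    if den == 0.0:
        raise ValueError("ground truth has zero norm on the selected pixels")
    return num / den


# Toy flattened images: two background pixels and two disk pixels.
u_gt = [0.0, 0.0, 3.0, 4.0]
u_est = [0.5, 0.0, 3.0, 3.0]

# Assumed disk area: pixels where the ground truth is strictly positive.
disk_mask = [g > 0.0 for g in u_gt]

full_score = n_rmse(u_gt, u_est)             # entire image
disk_score = n_rmse(u_gt, u_est, disk_mask)  # disk area only
```

In this toy case the full-image score is larger than the disk-only score because the spurious background value of 0.5 contributes to the numerator but not to the mask.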

This study also provides valuable insights for interpreting the reconstructions of real disks presented in Fig. 8, as both the simulations and real data share comparable angular and spectral diversity (i.e., similar amounts of parallactic rotation, same number and spreading of the spectral channels). Additionally, the simulated disks possess morphologies closely resembling the real disks. Consequently, this study suggests that the reconstructed flux distribution of HR 4796 can be confidently interpreted as having an elliptical morphology. Similarly, the outer disk of HD 163296 has roughly the same morphology, allowing for confidence in the reconstructed structures on the Northern side, though the quality of the reconstruction is strongly limited by the low disk contrast (lower than $5\times 10^{-7}$ ) on the Southern side. SAO 206462, MWC 758 and AB Aurigae all exhibit spiral arms with a spatial extent quite similar to the simulated spiral disk studied in this section. We can thus expect that the morphology of these three real disks is well reconstructed with, likely, a slight photometric bias on some structures in the vicinity of the host star. The case of PDS 70 is more challenging due to its intricate structures including a smooth flux distribution near the star in the shortest wavelengths. While no non-physical discontinuities are observed in the outer disk, dedicated hydro-dynamical simulations of this object are needed to identify the areas impacted by potential artifacts. Such a study is out of the scope of this paper and is left for future work dedicated to the re-analysis of multi-epoch and multi-instrument observations of PDS 70.

4.5 On the importance of a joint spectral processing

In this section, we aim to illustrate the benefits of joint spectral processing, incorporating fine modeling of correlations between spectral channels, compared to mono-spectral processing that does not leverage the apparent chromatic displacement of the speckle field induced by ASDI (see Sect. 1).

On the latter point, we start by identifying parts of disks that are expected to suffer the most from the self-subtraction effect for different disk morphologies, spectral bands and total amounts of parallactic rotation. For that purpose, we consider the three synthetic disk morphologies studied in Sect. 4.4, and we assume a null nuisance component to evaluate solely the influence of limited angular and spectral diversity on reconstruction quality. As done by Juillard et al. (2023) for ADI, given a ground truth flux distribution $\bm{u}_{\text{gt}}\in\mathbb{R}^{N^{\prime}\times L}$ , we define the spectrally aggregated flux $\bm{u}_{\text{inv}}\in\mathbb{R}^{N^{\prime}}$ , which is invariant to both the apparent rotation induced by ADI and the homothetic spectral motion of the speckle field induced by SDI, as:

{\left[\bm{u}_{\text{inv}}\right]}_{n}=\text{min}_{t=1:T,\,\ell=1:L}\left[\mathbf{F}\,\bm{u}_{\text{gt}}\right]_{n,t,\ell}\,,\forall n\in\llbracket 1;N^{\prime}\rrbracket\,,

(113)

with $\mathbf{F}$ the sparse operator performing rotations, scalings, and attenuations as defined for the forward image formation model in Sect. 3.1. Taking the minimum intensity value (operator min) across the temporal and spectral dimensions in Eq. (113) enables the identification of ASDI-invariant flux regions. The output is 0 for non-invariant regions and 1 for areas of the disk that are fully affected by angular and/or spectral invariance. We also consider the quantity $\bm{u}_{\text{gt}}-\bm{u}_{\text{inv}}$ representing the expected reconstructed flux distribution if the invariant component $\bm{u}_{\text{inv}}$ cannot be disentangled from the nuisance component (i.e., the angular and spectral diversity are not sufficient to perform signal unmixing). Fig. 17 represents these two quantities in ADI, SDI and ASDI for the three typical morphologies considered in Sect. 4.4, for a simulated total amount of parallactic rotation $\Delta_{\text{par}}\in\{30^{\circ},45^{\circ}\}$ , and for simulated spectral bands YJ (i.e., $\lambda\in\left[0.96-1.33\right]\,\mu\mathrm{m}$ ) or YJH (i.e., $\lambda\in\left[0.96-1.64\right]\,\mu\mathrm{m}$ ). In ADI, i.e. in the absence of a joint spectral processing, we observe that a large fraction of the circular and spiral disks remains invariant with respect to the background. The elliptical disk is less affected by this phenomenon, even if it is not negligible, especially near the ellipse handles along its minor axis. This lack of diversity translates into a partial attenuation and distortion of the reconstructed disk, due to object self-subtraction, see e.g. Milli et al. (2012); Pairet et al. (2019); Juillard et al. (2023) for related studies in ADI. 
Moreover, as expected, the total amount of parallactic rotation brings only a limited diversity at short angular separations: the angular-invariant flux distribution only slightly decreases when $\Delta_{\text{par}}$ increases from 30° to 45°, regardless of the disk morphology. SDI effectively eliminates most signal ambiguities caused by object invariances. It leads to no invariant flux for elliptical and circular disks. Joint spectral processing with ASDI further improves the unmixing capability of post-processing algorithms, as only a very small fraction of the spiral disk remains invariant for the setting $\Delta_{\text{par}}=30^{\circ}$ in YJ band. This results in a slight object self-subtraction that could explain the photometric bias observed in Figs. 16 and 26 on the reconstructed spiral disk for similar settings (in terms of disk morphology, parallactic rotation, and spectral band). Increasing the spectral width towards the H band and the total parallactic rotation towards 45° leads to a negligible invariant flux distribution, which would allow REXPACO ASDI to reconstruct the underlying off-axis object without self-subtraction, and without the need to leverage a database archive as in RDI techniques.
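The pixel-wise minimum of Eq. (113) can be illustrated with a deliberately simplified toy model. The sketch below does not implement the operator $\mathbf{F}$: it assumes a log-polar grid (rows are log-radius bins, columns are angle bins) on which the ADI rotation becomes a circular shift along the angle axis and the SDI scaling becomes a shift along the log-radius axis, with integer shifts and no attenuation. All function names and values are hypothetical.

```python
import math


def shift_radial(img, k):
    """Shift rows (log-radius axis) outward by k bins, zero-filling vacated rows."""
    n_rows, n_cols = len(img), len(img[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        if 0 <= r - k < n_rows:
            out[r] = list(img[r - k])
    return out


def shift_angular(img, k):
    """Rotate by k angle bins (circular shift along the columns)."""
    n_cols = len(img[0])
    return [[row[(c - k) % n_cols] for c in range(n_cols)] for row in img]


def invariant_flux(u_gt, angular_shifts, radial_shifts):
    """Pixel-wise minimum over all (rotation, scaling) versions of u_gt."""
    n_rows, n_cols = len(u_gt), len(u_gt[0])
    u_inv = [[math.inf] * n_cols for _ in range(n_rows)]
    for k_rad in radial_shifts:
        scaled = shift_radial(u_gt, k_rad)
        for k_ang in angular_shifts:
            frame = shift_angular(scaled, k_ang)
            for r in range(n_rows):
                for c in range(n_cols):
                    u_inv[r][c] = min(u_inv[r][c], frame[r][c])
    return u_inv


def total_flux(img):
    return sum(sum(row) for row in img)


# Toy circularly symmetric ring: unit flux in a single log-radius bin.
n_rows, n_cols = 8, 12
ring = [[1.0 if r == 4 else 0.0 for _ in range(n_cols)] for r in range(n_rows)]

# ADI only: four rotations, a single spectral channel.
adi_only = invariant_flux(ring, angular_shifts=range(4), radial_shifts=[0])
# ASDI: the same rotations combined with three spectral scalings.
asdi = invariant_flux(ring, angular_shifts=range(4), radial_shifts=range(3))
```

In this toy setting the ring is entirely invariant under rotations alone, whereas adding the radial shifts leaves no invariant flux. This is consistent with the qualitative behavior described for circularly symmetric disks.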

Figure 18 completes this study by comparing a post-processing relying on ADI only (here, with the REXPACO ADI algorithm) to a post-processing that also leverages the spectral diversity brought by ASDI (here, with the REXPACO ASDI algorithm) for three particular configurations of the extensive simulations performed in Sect. 4.4. Figure 27 complements the results displayed in Fig. 18 with a slice-cut analysis along the three profiles defined in Fig. 10.

In all cases, the same total amount of information is used, i.e. all spectral channels are considered in ADI but they are processed individually instead of jointly as in ASDI. The conclusions derived from the nuisance-free simulations in Fig. 17 directly translate to the reconstruction quality: REXPACO ADI leads to a bias of up to 20% and 60% on the reconstructed flux distributions of the elliptical and circular disks, respectively. This bias is almost null for the elliptical disk reconstructed with the proposed REXPACO ASDI algorithm. For the spiral disk, a bias of up to 20% can remain on some parts of the ASDI reconstruction, even though it is significantly smaller than for ADI. This residual bias can be attributed to a still insufficient angular and spectral diversity, as shown by the nuisance-free study for a simulated spiral disk in the YJ band with a total parallactic rotation $\Delta_{\text{par}}$ of 30°.

Conversely, on the same real data of Sects. 4.2-4.3, we perform a model ablation study complementary to the one presented in Sect. 4.2. Unlike in Sect. 4.2, we consider here the spatial covariances when estimating the nuisance component but we process each spectral channel individually with REXPACO ADI instead of jointly with REXPACO ASDI as done in Sects. 4.2-4.3. Figure 19 displays the resulting reconstructed flux distributions compared to the corresponding REXPACO ASDI reconstructions. The absence of joint spectral processing is detrimental in three respects. First, significant residual star light remains in the ADI reconstructions, in particular for HR 4796 and AB Aurigae. Its typical rainbow-pattern signature is due to the absence of modeling of the spectral correlations of the nuisance. Second, the sensitivity is lowered due to the absence of explicit exploitation of the spectral diversity, even if the same total amount of data is processed. As an illustration, the HD 163296 disk is almost invisible in the ADI reconstruction. Third, significant non-physical artifacts and discontinuities in the disk features are present in the ADI reconstructions, especially for disks having a circular symmetry like SAO 206462, MWC 758 and PDS 70. This latter effect is due to the lack of diversity between the sought off-axis objects and the nuisance component in ADI, as discussed in the previous paragraph.

Now that we have established the causes of the limitations of ADI and emphasized the benefits of ASDI to produce faithful reconstructions of the circumstellar environment from IFS data, we illustrate that even a limited spectral diversity can be useful to improve the quality of the reconstructions. In our previous work on the REXPACO algorithm designed for ADI (Flasseur et al., 2021), we considered datasets from the SPHERE-IRDIS imager in its dual-band configuration (i.e., simultaneously producing datasets in $L=2$ spectral channels). In Flasseur et al. (2021), we showed that REXPACO ADI is able to produce disk reconstructions with a significantly improved quality compared to standard post-processing methods like median ADI, PCA ADI, and PACO ADI. We also noticed that some plausible artifacts can remain due to the lack of diversity between the disk and the nuisance component. Here, we revisit with the proposed REXPACO ASDI algorithm a SPHERE-IRDIS dataset (SAO 206462) considered in Flasseur et al. (2021) and for which the reconstruction seems the most impacted by residual artifacts. Figure 20 compares our new reconstruction obtained by a joint spectral processing with REXPACO ASDI to the REXPACO ADI reconstruction. Notably, we identified in our ADI reconstruction a spurious reconstruction effect on the West spiral arm, taking the form of a flux discontinuity (see white arrow in Fig. 20). This likely artifact is effectively mitigated in the ASDI reconstruction, primarily due to the joint spectral processing of both available spectral channels. Furthermore, the disk appears significantly fainter in the second channel than in the first, leading to better separation between the disk and the nuisances. In this case, the second channel serves almost like a reference channel, nearly free from the signal of the target object. Overall, the morphology of SAO 206462 extracted with REXPACO ASDI from the IRDIS dataset exhibits structures very similar to those in the IFS reconstruction presented in Fig. 8. This example illustrates qualitatively that even a very limited spectral diversity (in the present case, $L=2$ spectral channels and a bandwidth $\Delta_{\lambda}<0.15\,\mu\mathrm{m}$) is sufficient to significantly improve the reconstruction quality by reducing morphological distortions and flux attenuations.

This study yields two main conclusions. First, ASDI post-processing should be favored over ADI and SDI, as it significantly mitigates ambiguities due to object invariances. This finding supports the choices made in Sects. 4.3 and 4.4 regarding the application of comparative algorithms that exploit jointly ASDI diversities. Second, while ASDI offers a theoretical advantage in diversity, this benefit fully translates into improved reconstruction fidelity only when appropriate models of the data are employed. As an illustration, all median ASDI reconstructions (i.e., based on an overly simplistic and empirical model of the nuisance) shown in Sects. 4.3 and 4.4 display strong artifacts, despite the method jointly exploiting both ADI and SDI diversities.

5 Unmixing point-like sources from extended features

5.1 Alternate unmixing

Input: ASDI sequence $\bm{v}$.
Input: Forward operator $\mathbf{M}$.
Input: Relative precision $\eta\in(0,1)$ ($\eta=10^{-3}$ in practice).
Output: Flux distribution $\widetilde{\bm{u}}$ of disk.
Output: Flux distribution $\widehat{\bm{\alpha}}$ of point-like sources.
Output: S/N of detection of point-like sources.
Output: Astrometry (separation $\widehat{\rho}$, angle $\widehat{\theta}$) of point-like sources.

$\blacktriangleright$ Step 1. Initialization.
$i\leftarrow 0$ $\triangleleft$ iteration counter
$\widetilde{\bm{u}}^{[i]}\leftarrow\text{REXPACO ASDI}(\bm{v})$ $\triangleleft$ apply REXPACO on data
$\text{S/N}^{[i]},\widehat{\bm{\alpha}}^{[i]}\leftarrow\text{PACO ASDI}(\bm{v})$ $\triangleleft$ apply PACO on data

$\blacktriangleright$ Step 2. User identification of (candidate) point-like sources.
$P>0$ $\triangleleft$ set number of sources
$\widehat{\rho}^{[i]}_{1:P},\widehat{\theta}^{[i]}_{1:P}$ $\triangleleft$ set rough astrometry

$\blacktriangleright$ Step 3. Main iteration loop.
do
$i\leftarrow i+1$ $\triangleleft$ update iteration counter
$\widetilde{\bm{u}}^{[i]}\leftarrow\text{REXPACO ASDI}(\underbrace{\bm{v}-\mathbf{M}\,\widehat{\bm{\alpha}}^{[i-1]}}_{\text{PACO residuals}})$
$\text{S/N}^{[i]},\widehat{\bm{\alpha}}^{[i]},\widehat{\rho}^{[i]}_{1:P},\widehat{\theta}^{[i]}_{1:P}\leftarrow\text{PACO ASDI}(\underbrace{\bm{v}-\mathbf{M}\,\widetilde{\bm{u}}^{[i-1]}}_{\text{REXPACO residuals}})$
while $\big\lVert\widehat{\bm{\alpha}}^{[i]}-\widehat{\bm{\alpha}}^{[i-1]}\big\rVert>\eta\,\big\lVert\widehat{\bm{\alpha}}^{[i]}\big\rVert$

Algorithm 1 Alternating REXPACO ASDI and PACO ASDI (unmixing disk and point-like sources).

In this section, we investigate the unmixing of the contribution of point-like sources embedded in spatially extended structures like circumstellar disks. The approach we propose extends to multi-spectral observations the unmixing strategy described in our previous work (Flasseur et al., 2021) for ADI data. It consists in combining REXPACO ASDI with PACO ASDI (Flasseur et al., 2020b): the former is dedicated to the reconstruction of disks, while the latter is dedicated to the detection and sub-pixel characterization of point-like sources. In our experiments, this alternating strategy proved to be more satisfactory than a joint and regularized reconstruction of both a sparse component (for point-like sources) and a smooth component (for the disk). One of the main peculiarities of the proposed alternating strategy is the ability to manually select the number and the rough location of candidate point-like sources to unmix from the disk material. In contrast, a joint reconstruction of both a sparse component and a smooth component either leads to many more nonzero values in the sparse component than the actual number of point-like sources, or misses the faintest sources, depending on the relative weights given to the sparsity and smoothness regularizations.

The proposed unmixing procedure works as follows: (i) REXPACO ASDI and PACO ASDI are applied independently to a target ASDI observation; (ii) based on the spatio-spectral S/N maps obtained with PACO ASDI and on the spatio-spectral flux distribution obtained with REXPACO ASDI, candidate point-like sources to unmix from the disk material are identified manually by the user; (iii) REXPACO ASDI and PACO ASDI are iteratively applied until convergence of the two retrieved components. During step (iii), the astrometry and photometry of the selected point-like sources are refined with sub-pixel accuracy by PACO ASDI within a $3\times 3$ pixel box, based on the residual data obtained after subtraction of the disk contribution as currently reconstructed by REXPACO ASDI. Similarly, the spatio-spectral flux distribution of the disk is refined by REXPACO ASDI on updated residuals obtained after subtraction of the refined point-source contribution estimated by PACO ASDI. This procedure is summarized by Algorithm 1.
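The control flow of Algorithm 1 can be sketched in a few lines of Python. This is only a minimal illustration under strong assumptions: the function `alternate_unmixing` and its arguments are hypothetical names, and the two estimators are passed as generic callables that stand in for REXPACO ASDI and PACO ASDI, which are not reimplemented here. In the toy usage, a one-dimensional moving average plays the role of the disk estimator, a fixed-position amplitude read-out plays the role of the point-source estimator, and the forward operator is the identity.

```python
import math


def alternate_unmixing(v, apply_forward, reconstruct_disk, characterize_sources,
                       eta=1e-3, max_iter=100):
    """Alternate a disk estimator and a point-source estimator (hypothetical sketch)."""
    def norm(x):
        return math.sqrt(sum(t * t for t in x))

    # Step 1: both estimators are first applied to the raw data.
    u = reconstruct_disk(v)
    alpha = characterize_sources(v)
    for _ in range(max_iter):
        # Each estimator sees the data minus the model of the other component,
        # both built from the previous iterates, as in Algorithm 1.
        res_for_disk = [a - b for a, b in zip(v, apply_forward(alpha))]
        res_for_sources = [a - b for a, b in zip(v, apply_forward(u))]
        u_new = reconstruct_disk(res_for_disk)
        alpha_new = characterize_sources(res_for_sources)
        change = norm([a - b for a, b in zip(alpha_new, alpha)])
        u, alpha = u_new, alpha_new
        # Stop once the relative change of the source fluxes falls below eta.
        if change <= eta * norm(alpha):
            break
    return u, alpha


# Toy stand-ins (assumptions for illustration only, not the actual algorithms).
def moving_average(x, half_width=2):
    """Smooth estimator standing in for the disk reconstruction."""
    n = len(x)
    out = []
    for k in range(n):
        lo, hi = max(0, k - half_width), min(n, k + half_width + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


SOURCE_INDEX = 10  # rough location selected by the user (Step 2)


def pick_source(x):
    """Sparse estimator standing in for the point-source characterization."""
    out = [0.0] * len(x)
    out[SOURCE_INDEX] = x[SOURCE_INDEX]
    return out


disk = [1.0] * 21               # flat "disk" of unit flux
data = list(disk)
data[SOURCE_INDEX] += 2.0       # point-like source of flux 2 on top of it
u_hat, alpha_hat = alternate_unmixing(data, lambda x: x, moving_average, pick_source)
print(alpha_hat[SOURCE_INDEX], u_hat[SOURCE_INDEX])
```

In this toy setting the first point-source estimate absorbs the underlying disk flux and is therefore overestimated, and the alternation progressively corrects it, which mirrors qualitatively the behavior reported for PACO ASDI at the first iteration.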

5.2 Case study on the PDS 70 system

We first evaluate the unmixing ability of the proposed algorithm through numerical experiments on a SPHERE-IFS dataset of PDS 70. We injected (not simultaneously) six faint point-like sources, and we disregarded the unmixing of the real known exoplanets, focusing solely on separating the synthetic sources from the circumstellar environment. Figure 21 compares the estimated SED of the synthetic sources across various iterations of Algorithm 1. It shows that estimation errors decrease over iterations, generally converging towards zero (except for the first spectral channels of source #3, which display a remaining discrepancy with the ground truth). The errors are larger when the SED of the point-like sources closely resembles that of the star (i.e., for sources #1, #2 and #3), as the disk material shares spectral similarities with it, making unmixing more ambiguous. Overall, these results demonstrate the capability of the proposed approach to effectively disentangle the signal from point-like sources, even when they are partially buried in disk material.

As a case study, we apply Algorithm 1 to the same SPHERE-IFS dataset of PDS 70, focusing now on unmixing real point-like sources. We recall that PDS 70 hosts two known exoplanets (Keppler et al., 2018; Haffert et al., 2019) in accretion phase within a protoplanetary disk (Isella et al., 2019), see also Sect. 4.3. Based on the processing of the same dataset, Mesa et al. (2019b) also identified a point-like feature (PLF) with several post-processing algorithms dedicated to the detection of point-like sources. As in Mesa et al. (2019b), the independent application of PACO ASDI makes it possible to identify a PLF in the resulting spatio-spectral S/N maps, see iteration 0 in Fig. 22. Exoplanet PDS 70 c cannot be detected in the same S/N maps, likely due to its proximity to the disk material, which is over-subtracted by PACO ASDI since this algorithm is not specifically designed to preserve extended structures. Scrutinizing the REXPACO ASDI reconstruction makes it possible to detect PDS 70 b and c, which appear as red point-like sources, even though they are embedded within the disk material. The outer and inner structures of the disks, as well as the spiral feature identified by Juillard et al. (2022) from SPHERE-IRDIS observations, are also reconstructed. At the first application of REXPACO ASDI, the PLF seems more likely to be a part of the inner disk hosted by the star, see iteration 0 in Fig. 22. Iterating between REXPACO ASDI and PACO ASDI leads to several remarks. First, the extractions of PDS 70 b and PDS 70 c improve along the iterations. As a qualitative illustration, the REXPACO ASDI reconstruction obtained after a single iteration of Algorithm 1 exhibits a discontinuous footprint within the disk material at the location of the two exoplanets, a sign that their contribution is overestimated by PACO ASDI. At convergence of the proposed unmixing scheme, the disk component appears smooth and continuous at the locations of the two exoplanets, without any residual signature of PDS 70 b and c. Across the iterations, the contribution of the PLF in the sparse component decreases since it is increasingly explained by the disk component. At convergence of the iterative procedure, the residual spatio-spectral S/N maps from PACO ASDI are almost free from the disk contribution and the signature of the PLF is significantly attenuated with respect to the initial S/N maps at iteration 0. These results also support the conclusions of Mesa et al. (2019b), who attribute the PLF to the disk as the most likely explanation, based on its estimated photometry (its SED being very similar to that of the disk).

Table 5: Estimated astrometry $(\widehat{\rho},\,\widehat{\theta})$ of PDS 70 b and c as well as the candidate PLF. Values obtained with our unmixing scheme combining REXPACO ASDI and PACO ASDI are compared to the values reported in the literature on data from the same instrument taken at the same observation date.

Source	$\widehat{\rho}$ (mas)	$\widehat{\theta}$ (degrees)	reference
PDS 70 b	192.2 $\pm$ 8.0	146.8 $\pm$ 2.4	Müller et al. (2018)
PDS 70 b	186.8 $\pm$ 0.2	145.4 $\pm$ 0.1	this paper
PDS 70 c	209 $\pm$ 13	281.2 $\pm$ 0.5	Mesa et al. (2019b)
PDS 70 c	211.0 $\pm$ 0.5	280.3 $\pm$ 0.1	this paper
PLF	118 $\pm$ 4	316.8 $\pm$ 0.5	Mesa et al. (2019b)
PLF	111.5 $\pm$ 0.3	318.4 $\pm$ 0.1	this paper

Figure 23 completes this study by showing the estimated astrometry of PDS 70 b and c, as well as of the PLF, along the iterations of the unmixing method. It shows that the estimated astrometry of the three sources evolves during the iterations. The estimated angular separation $\widehat{\rho}$ changes by up to 15 mas (i.e., 2 pixels) and the estimated parallactic angle $\widehat{\theta}$ changes by up to 0.5 degree for PDS 70 c (located very near the outer disk arm), which illustrates the impact of the disk material on the characterization and orbital parameter estimation of point-like sources embedded within it. The shift of the estimates, both in angular separation and in parallactic angle, is even larger for the PLF since its sparse contribution gets fainter during the iterations and no longer resembles a point source. In addition, the precision of the astrometry improves (i.e., the error bars get smaller) for PDS 70 b and c while it degrades (i.e., the error bars get larger) for the PLF. This observation is in agreement with the qualitative results presented in Fig. 22, which preferentially attribute the PLF to the disk component. Table 5 reports our final astrometric measurements obtained for the considered sources. The retrieved values are compared to the most accurate measurements available in the literature using direct imaging for these three sources and at the same observation date. Overall, our estimations are compatible (within two times the standard-deviation, at most) with the values reported in the literature. However, our estimations are much more precise: the uncertainties are decreased by a factor between 5 and 40. If the astrometric estimations we derived are confirmed (e.g., based on a multi-epoch analysis), they could be significant corrective factors for the orbits of the exoplanets PDS 70 b and c.
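The quoted reduction of the uncertainties can be checked directly from the error bars listed in Table 5. The short sketch below only divides the literature uncertainties by ours. The dictionary names and layout are illustrative choices, and the numbers are copied from the table.

```python
# 1-sigma uncertainties copied from Table 5:
# (separation in mas, angle in degrees) for each source.
literature = {
    "PDS 70 b": (8.0, 2.4),   # Müller et al. (2018)
    "PDS 70 c": (13.0, 0.5),  # Mesa et al. (2019b)
    "PLF": (4.0, 0.5),        # Mesa et al. (2019b)
}
this_paper = {
    "PDS 70 b": (0.2, 0.1),
    "PDS 70 c": (0.5, 0.1),
    "PLF": (0.3, 0.1),
}

# Ratio of literature to new uncertainties, per source and per coordinate.
ratios = {
    name: tuple(lit / new for lit, new in zip(literature[name], this_paper[name]))
    for name in literature
}
all_ratios = [r for pair in ratios.values() for r in pair]

for name, (ratio_rho, ratio_theta) in ratios.items():
    print(f"{name}: separation x{ratio_rho:.1f}, angle x{ratio_theta:.1f}")
print(f"range: x{min(all_ratios):.1f} to x{max(all_ratios):.1f}")
```

The smallest ratio is obtained for the angles of PDS 70 c and of the PLF, and the largest for the separation of PDS 70 b.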

Beyond the benefits of the proposed iterative approach to unmix point-like sources from the circumstellar environment, this study illustrates that applying a post-processing algorithm not specifically designed for the recovery of extended sources can lead to critical artifacts and biases. In particular, it can lead to mistaking a disk feature for a point-like source. These observations could encourage revisiting systems where candidate point-like sources embedded in disk material were recently identified by post-processing the data with algorithms not tailored to reconstruct extended features, and even less to unmix disk and point-like components.

6 Discussion and conclusion

In this paper, we introduced REXPACO ASDI, a new algorithm for reconstructing circumstellar environments from high-contrast observations in pupil-tracking mode. Our approach utilizes spectral diversity inherent in ASDI data. REXPACO ASDI combines a tailored statistical model of non-stationary nuisances with a forward image formation model of the off-axis sources. These models are jointly used to solve a reconstruction task in a regularized inverse problem framework. This method, specifically designed for extended sources, is the first to leverage jointly angular and spectral diversity introduced by ASDI for reconstructing the spatio-spectral flux distribution of circumstellar environments.

On the methodological side, we employ a local modeling approach to capture spatial and spectral correlations of nuisances for a more accurate statistical description of the data. This model utilizes a spatio-spectral separable approximation to reduce the large number of free parameters needed to model full covariances. For similar reasons, the model is local, i.e. its parameters differ with the location in the field of view and are estimated at the scale of small patches. Our model can thus be interpreted as a block-diagonal approximation of the full spatio-spectral covariance. Tailored estimators of model parameters, based on covariance shrinkage, are developed to reduce estimation uncertainty and improve robustness. We illustrate on real data that this approximate statistical model effectively captures most nuisance correlations. An ablation study reveals that jointly accounting for spatio-spectral correlations directly from the data is crucial for accurately capturing the statistics of ASDI observations, outperforming methods that first model spatial correlations from spatio-temporo-spectral data and then spectral correlations from reduced quantities, as in our previous work dedicated to exoplanet detection from similar ASDI observations (Flasseur et al., 2020b).
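To give an order of magnitude of the reduction brought by the separable approximation, the sketch below counts the free parameters of a full symmetric covariance over $K$ pixels and $L$ channels against those of one $K\times K$ spatial and one $L\times L$ spectral covariance. The patch size ($K=49$, i.e. a hypothetical $7\times 7$ pixel patch) and the number of channels ($L=39$) are illustrative assumptions, and the count ignores the scale ambiguity between the two factors.

```python
def n_free_params(dim):
    """Number of free entries of a symmetric dim x dim covariance matrix."""
    return dim * (dim + 1) // 2


K = 49  # pixels per patch (hypothetical 7 x 7 patch, for illustration only)
L = 39  # spectral channels (illustrative value)

full = n_free_params(K * L)                       # full spatio-spectral covariance
separable = n_free_params(K) + n_free_params(L)   # spatial factor + spectral factor

print(f"full: {full}, separable: {separable}, ratio: {full / separable:.0f}")
```

With these values, the full model has about 1.8 million free parameters per patch against about two thousand for the separable one, which is why the latter can be estimated from the limited number of frames of a single dataset.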

We proposed a specific reconstruction strategy to refine jointly the statistical model of the nuisance and the reconstructed flux distribution of the circumstellar environment. This hierarchical estimation strategy derives estimators of the nuisance component mostly unbiased from the contamination of the sought off-axis objects. This method also prevents iterating between the characterization of the nuisance and the reconstruction task, thus leading to an algorithm that scales to the size of typical datasets recorded with the ASDI technique, both in terms of computational burden and memory storage. We apply regularization to the spatio-spectral flux distribution using suitable penalties. These penalties improve both rejection of residual starlight and fidelity of reconstructed features. We demonstrate the versatility of these priors in recovering various structures within the circumstellar environment, such as sharp edges and smooth transitions.

REXPACO ASDI operates in a fully unsupervised manner, allowing optimal estimation of all hyper-parameters from the dataset itself, without relying on prior knowledge about the disk properties or requiring trial-and-error reconstructions. Among the free hyper-parameters, the patch size is set based on the full width at half maximum of the off-axis PSF. The spatially adaptive regularization of noisy covariances through shrinkage is obtained via a derived closed-form expression, minimizing estimation risk for the statistical nuisance model. Hyper-parameters that determine the relative weights of reconstruction regularization can be estimated quasi-optimally by minimizing Stein’s unbiased risk estimator. However, this process is time-consuming because it requires multiple reconstructions with different penalty weights. As this setting is not the most critical, it can be approximated from the optimal setting obtained on a standard dataset by scaling regularization parameters with respect to the acquired number of frames.
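As a reminder of what covariance shrinkage does, the sketch below applies the generic convex combination $(1-\rho)\,\mathbf{S}+\rho\,\mathrm{diag}(\mathbf{S})$ found in the shrinkage literature (e.g., Ledoit & Wolf, 2004; Chen et al., 2010) to a small sample covariance $\mathbf{S}$. It is not the estimator derived in this work: the weight $\rho$ is simply passed as an argument, whereas the closed-form, risk-minimizing expression mentioned above is not reproduced here, and the function name `shrink_covariance` is hypothetical.

```python
def shrink_covariance(sample_cov, rho):
    """Return (1 - rho) * S + rho * diag(S) for a square matrix S (list of lists).

    Generic shrinkage toward the diagonal; rho is assumed to be given.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    n = len(sample_cov)
    return [
        [
            # Diagonal entries (variances) are kept, off-diagonal ones are damped.
            sample_cov[r][c] if r == c else (1.0 - rho) * sample_cov[r][c]
            for c in range(n)
        ]
        for r in range(n)
    ]


# Toy 2 x 2 sample covariance with strongly correlated entries.
sample = [[2.0, 1.8], [1.8, 2.0]]
shrunk = shrink_covariance(sample, rho=0.5)
print(shrunk)
```

Damping the off-diagonal terms moves the estimate away from singularity, which is the property that makes the shrunk covariance safer to invert when only a few samples are available.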

We tested the proposed algorithm using injection of synthetic disks with different morphologies, orientations, and contrast levels. While these simulations could be complemented and refined by even more extensive experiments, they allowed us to identify the key capabilities and benefits of REXPACO ASDI. We showed that the proposed method is very versatile since it is able to reconstruct faithful spectral images of the considered disks for contrasts down to $10^{-6}$. One of our major results is the ability of REXPACO ASDI to reconstruct disks that are partly rotation-invariant, i.e. whose morphology makes the unmixing of the disk and the speckles particularly difficult when only leveraging the ADI diversity. These disks are known to be especially challenging to reconstruct without an additional source of diversity in the data, for instance provided by multiple observations as in RDI. Unlike this latter category of methods, the unmixing capability of REXPACO ASDI is achieved from a single ASDI dataset, i.e. the model of the nuisance is dataset-dependent. Using simulated flux distributions, we also illustrated that the theoretical fraction of flux lost due to unmixing ambiguities is negligible if the different spectral channels are processed jointly. This property of ASDI is due to the chromatic scaling of the speckle field caused by diffraction.
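The two sources of diversity can be compared with simple geometry. In ADI, an off-axis source at separation $\rho$ moves along an arc, with a chord length $2\rho\sin(\Delta_{\text{par}}/2)$ between the first and last frames, whereas in SDI a speckle located at $\rho$ in the bluest channel moves radially by $\rho\,(\lambda_{\max}/\lambda_{\min}-1)$. The sketch below evaluates both quantities. The separation of 400 mas and the YJ band limits of 0.95 and 1.35 µm are assumed values chosen for illustration, and the function names are hypothetical. Only the 30° rotation is taken from the nuisance-free study discussed earlier.

```python
import math


def adi_displacement(rho, delta_par_deg):
    """Chord length travelled by an off-axis source under field rotation."""
    return 2.0 * rho * math.sin(math.radians(delta_par_deg) / 2.0)


def sdi_displacement(rho, lambda_min, lambda_max):
    """Radial shift of a speckle located at rho in the bluest channel."""
    return rho * (lambda_max / lambda_min - 1.0)


rho = 400.0  # separation in mas (hypothetical value)
tangential = adi_displacement(rho, delta_par_deg=30.0)
radial = sdi_displacement(rho, lambda_min=0.95, lambda_max=1.35)  # assumed YJ limits

print(f"ADI (tangential): {tangential:.1f} mas, SDI (radial): {radial:.1f} mas")
```

The key point is the direction of the motion rather than its amplitude: the ADI displacement is tangential, so a circularly symmetric structure is mapped onto itself, while the chromatic displacement of the speckles is radial and therefore remains discriminative for rings.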

By resorting to a model ablation, applied both on synthetic and real disks, we illustrated that the joint spectral processing of REXPACO ASDI efficiently unmixes disk features from the nuisance and requires an accurate model of spatio-spectral correlations that are very strong in ASDI observations. Ignoring these correlations in the statistical model of the nuisance is particularly detrimental to the quality of the reconstruction.

As a proof of concept, we analyzed real SPHERE-IFS datasets containing six known circumstellar disks with various morphologies, including challenging features like spiral arms. Despite the absence of ground truth for these real objects, we observed that our method outperformed median ASDI, PCA ASDI, PACO ASDI, and the mono-spectral version of REXPACO in rejecting nuisances. Our approach significantly reduced non-physical artifacts, such as discontinuities from partial self-subtraction. Additionally, the reconstructed flux distribution showed improved spatial resolution compared to the original data, as we accounted for the blur introduced by the off-axis PSF through deconvolution in the forward image formation model. We also processed a dual-band dataset obtained with the IRDIS imager of the SPHERE instrument. Although spectral diversity was limited in this dataset, we illustrated that our approach enhanced reconstruction quality compared to the mono-spectral REXPACO algorithm designed for ADI observations.

Given the complementary capabilities of REXPACO ASDI, we can expect that it will be helpful to unveil new disks, to improve the spatio-spectral interpretation of their flux distribution, and thus to better understand the phenomena governing the formation of planetary systems, like the intricate interactions between exoplanets and the disk material. In particular, we illustrated that the latter goal can be achieved by combining REXPACO ASDI with the detection algorithm PACO ASDI to unmix point-like sources from the circumstellar material. As an initialization step, this latter strategy only requires the rough locations (typically, with pixel-level accuracy) of candidate point-like sources to be unmixed from the disk material. Based on numerical experiments, we illustrated that this combined approach can significantly reduce the photometry bias occurring during the characterization of point-like sources embedded within disk material. As a case study, we applied this strategy to a dataset of PDS 70. Our results illustrated the ability of the proposed approach to identify components that are more likely disk features than point-like sources, even when they are mistaken for point sources at the initialization step.

As future work, we plan to improve the fidelity of the model of the nuisance, especially in the vicinity of the star where the model is slightly inaccurate. Disk reconstruction is very challenging in this area due to large stellar leakages that could be more accurately captured by accounting for the spatial correlations at a larger spatial scale than a patch of a few pixels. In addition, even though spectral diversity is very useful for retrieving faithful flux distributions, distortions can remain in some cases. This limitation could be tackled by building a more complex model leveraging deep learning techniques to model the nuisance distribution from multiple archival data.

Beyond the specific field of application of the proposed algorithm, its statistical modeling of the spatio-spectral correlations of the nuisance component and the estimation strategy of the underlying parameters are very general approaches. These methodological developments could be specialized to other large-scale reconstruction problems encountered in other imaging modalities such as microscopy or remote sensing. These fields often involve multi-spectral measurements, where signals of interest are faint and affected by multi-correlated and non-stationary nuisances.

Acknowledgements

We thank the anonymous Referee for her/his careful reading of the manuscript as well as her/his insightful comments and suggestions.

This project was funded in part by the French National Research Agency (ANR) under the project DDISK (grant ANR-21-CE31-0015) and by the Région Auvergne-Rhône-Alpes under the project DIAGHOLO. This work was also supported by the ANR under the France 2030 program (PEPR Origins, reference ANR-22-EXOR-0016), by the French National Programs (PNP and PNPS), and by the Action Spécifique Haute Résolution Angulaire (ASHRA) of CNRS/INSU co-funded by CNES.

OF, LD, and ÉT conceived and designed the method presented in this paper. OF developed, tested, and implemented the algorithm. OF selected the raw data. ML pre-reduced them through the SPHERE Data Center. OF performed the analysis of the data. OF, LD, ÉT, and ML wrote the manuscript.

Data Availability

The raw data used in this article are freely available on the ESO archive facility at http://archive.eso.org/eso/eso_archive_main.html. They were pre-reduced with the SPHERE Data Centre, jointly operated by OSUG/IPAG (Grenoble), PYTHEAS/LAM/CESAM (Marseille), OCA/Lagrange (Nice), Observatoire de Paris/LESIA (Paris), and Observatoire de Lyon/CRAL (Lyon, France). The resulting pre-processed datasets will be shared upon reasonable request to the corresponding author.

References

Aharon et al. (2006) Aharon M., Elad M., Bruckstein A., 2006, IEEE Transactions on Signal Processing, 54, 4311
Amara & Quanz (2012) Amara A., Quanz S. P., 2012, Monthly Notices of the Royal Astronomical Society, 427, 948
Bae et al. (2016) Bae J., Zhu Z., Hartmann L., 2016, The Astrophysical Journal, 819, 134
Bell et al. (2015) Bell C. P., Mamajek E. E., Naylor T., 2015, Monthly Notices of the Royal Astronomical Society, 454, 593
Benisty et al. (2015) Benisty M., et al., 2015, Astronomy & Astrophysics, 578, L6
Beuzit et al. (2019) Beuzit J.-L., et al., 2019, Astronomy & Astrophysics, 631, A155
Blomgren et al. (1997) Blomgren P., Chan T. F., Mulet P., Wong C.-K., 1997, in Proceedings of international conference on image processing. pp 384–387
Boccaletti et al. (2020) Boccaletti A., et al., 2020, Astronomy & Astrophysics, 637, L5
Boccaletti et al. (2021) Boccaletti A., et al., 2021, Astronomy & Astrophysics, 652, L8
Bodrito et al. (2024) Bodrito T., Flasseur O., Mairal J., Ponce J., Langlois M., Lagrange A.-M., 2024, Monthly Notices of the Royal Astronomical Society, p. stae2174
Bowler (2016) Bowler B. P., 2016, Publications of the Astronomical Society of the Pacific, 128, 102001
Bresson & Chan (2008) Bresson X., Chan T. F., 2008, Inverse Problems & Imaging, 2, 455
Brown et al. (2016) Brown A. G., et al., 2016, Astronomy & Astrophysics, 595, A2
Brown et al. (2021) Brown A. G., et al., 2021, Astronomy & Astrophysics, 649, A1
Buades et al. (2005) Buades A., Coll B., Morel J.-M., 2005, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition. pp 60–65
Carbillet et al. (2011) Carbillet M., et al., 2011, Experimental Astronomy, 30, 39
Charbonnier et al. (1997) Charbonnier P., Blanc-Féraud L., Aubert G., Barlaud M., 1997, IEEE Transactions on Image Processing, 6, 298
Chen et al. (2010) Chen Y., Wiesel A., Eldar Y. C., Hero A. O., 2010, IEEE Transactions on Signal Processing, 58, 5016
Chintarungruangchai et al. (2023) Chintarungruangchai P., Jiang G., Hashimoto J., Komatsu Y., Konishi M., 2023, New Astronomy, 100, 101997
Christiaens et al. (2019) Christiaens V., et al., 2019, Monthly Notices of the Royal Astronomical Society, 486, 5819
Christiaens et al. (2023) Christiaens V., et al., 2023, Journal of Open Source Software, 8
Christiaens et al. (2024) Christiaens V., et al., 2024, Astronomy & Astrophysics, 685, L1
Conte et al. (1995) Conte E., Lops M., Ricci G., 1995, IEEE Transactions on Aerospace and Electronic Systems, 31, 617
Craven & Wahba (1978) Craven P., Wahba G., 1978, Numerische Mathematik, 31, 377
Currie et al. (2017) Currie T., et al., 2017, The Astrophysical Journal Letters, 836, L15
Currie et al. (2022a) Currie T., Biller B., Lagrange A.-M., Marois C., Guyon O., Nielsen E., Bonnefoy M., De Rosa R., 2022a, arXiv preprint arXiv:2205.05696
Currie et al. (2022b) Currie T., et al., 2022b, Nature Astronomy, 6, 751
Dabov et al. (2007) Dabov K., Foi A., Katkovnik V., Egiazarian K., 2007, IEEE Transactions on Image Processing, 16, 2080
Delorme et al. (2017) Delorme P., et al., 2017, arXiv preprint arXiv:1712.06948
Dohlen et al. (2008) Dohlen K., et al., 2008, in Ground-based and Airborne Instrumentation for Astronomy II. pp 1266–1275
Doucet et al. (2006) Doucet C., Pantin E., Lagage P., Dullemond C., 2006, Astronomy & Astrophysics, 460, 117
Esposito et al. (2013) Esposito T. M., Fitzgerald M. P., Graham J. R., Kalas P., 2013, The Astrophysical Journal, 780, 25
Esposito et al. (2020) Esposito T. M., et al., 2020, The Astronomical Journal, 160, 24
Flasseur et al. (2018) Flasseur O., Denis L., Thiébaut É., Langlois M., 2018, Astronomy & Astrophysics, 618, A138
Flasseur et al. (2020a) Flasseur O., Denis L., Thiébaut É., Langlois M., 2020a, Astronomy & Astrophysics, 634, A2
Flasseur et al. (2020b) Flasseur O., Denis L., Thiébaut É., Langlois M., 2020b, Astronomy & Astrophysics, 637, A9
Flasseur et al. (2021) Flasseur O., Thé S., Denis L., Thiébaut É., Langlois M., 2021, Astronomy & Astrophysics, 651, A62
Flasseur et al. (2022) Flasseur O., Denis L., Thiébaut É., Langlois M., et al., 2022, in Adaptive Optics Systems VIII. pp 1175–1189
Flasseur et al. (2023a) Flasseur O., Bodrito T., Mairal J., Ponce J., Langlois M., Lagrange A.-M., 2023a, in 2023 31st European Signal Processing Conference (EUSIPCO). pp 1723–1727
Flasseur et al. (2023b) Flasseur O., Bodrito T., Mairal J., Ponce J., Langlois M., Lagrange A.-M., 2023b, Monthly Notices of the Royal Astronomical Society, 527, 1534
Flasseur et al. (2024) Flasseur O., Thiébaut E., Denis L., Langlois M., 2024, accepted in EUSIPCO, arXiv preprint arXiv:2403.07104
Follette (2023) Follette K. B., 2023, Publications of the Astronomical Society of the Pacific, 135, 093001
Gaia et al. (2018) Gaia C., et al., 2018, Astronomy & Astrophysics, 616
Garufi et al. (2020) Garufi A., et al., 2020, Astronomy & Astrophysics, 633, A82
Genton (2007) Genton M. G., 2007, Environmetrics: The official Journal of the International Environmetrics Society, 18, 681
Girard (1989) Girard D. A., 1989, Numerische Mathematik, 56, 1
Gonzalez et al. (2017) Gonzalez C. A. G., et al., 2017, The Astronomical Journal, 154, 7
Grady et al. (2009) Grady C., et al., 2009, The Astrophysical Journal, 699, 1822
Haffert et al. (2019) Haffert S., Bohn A., de Boer J., Snellen I., Brinchmann J., Girard J., Keller C., Bacon R., 2019, Nature Astronomy, 3, 749
Hom et al. (2024) Hom J., et al., 2024, Monthly Notices of the Royal Astronomical Society, 528, 6959
Isella et al. (2007) Isella A., Testi L., Natta A., Neri R., Wilner D., Qi C., 2007, Astronomy & Astrophysics, 469, 213
Isella et al. (2018) Isella A., et al., 2018, The Astrophysical Journal Letters, 869, L49
Isella et al. (2019) Isella A., Benisty M., Teague R., Bae J., Keppler M., Facchini S., Pérez L., 2019, The Astrophysical Journal Letters, 879, L25
Juillard et al. (2022) Juillard S., Christiaens V., Absil O., 2022, Astronomy & Astrophysics, 668, A125
Juillard et al. (2023) Juillard S., Christiaens V., Absil O., 2023, Astronomy & Astrophysics, 679, A52
Juillard et al. (2024) Juillard S., Stasevic S., Christiaens V., Absil O., Milli J., 2024, Astronomy & Astrophysics, 688, A185
Keppler et al. (2018) Keppler M., et al., 2018, Astronomy & Astrophysics, 617, A44
Kiefer et al. (2021) Kiefer S., Bohn A. J., Quanz S. P., Kenworthy M., Stolker T., 2021, Astronomy & Astrophysics, 652, A33
Kingma & Ba (2014) Kingma D. P., Ba J., 2014, arXiv preprint arXiv:1412.6980
Lafrenière et al. (2007) Lafrenière D., Marois C., Doyon R., Nadeau D., Artigau E., 2007, The Astrophysical Journal, 660, 770
Lafrenière et al. (2009) Lafrenière D., Marois C., Doyon R., Barman T., 2009, The Astrophysical Journal, 694, L148
Lagrange et al. (2009) Lagrange A.-M., et al., 2009, Astronomy & Astrophysics, 493, L21
Lagrange et al. (2010) Lagrange A.-M., et al., 2010, Science, 329, 57
Langlois et al. (2020) Langlois M., Gratton R., Lagrange A.-M., Delorme P., Boccaletti A., Bonnefoy M., Maire A.-L., et al., 2020, in revision for Astronomy & Astrophysics
Langlois et al. (2021) Langlois M., et al., 2021, Astronomy & Astrophysics, 651, A71
Lawson et al. (2020) Lawson K., et al., 2020, The Astronomical Journal, 160, 163
Lawson et al. (2022) Lawson K., Currie T., Wisniewski J. P., Groff T. D., McElwain M. W., Schlieder J. E., 2022, The Astrophysical Journal Letters, 935, L25
Lebrun et al. (2013) Lebrun M., Buades A., Morel J.-M., 2013, SIAM Journal on Imaging Sciences, 6, 1665
Ledoit & Wolf (2004) Ledoit O., Wolf M., 2004, Journal of Multivariate Analysis, 88, 365
Lisse et al. (2009) Lisse C. M., Chen C., Wyatt M., Morlok A., Song I., Bryden G., Sheehan P., 2009, The Astrophysical Journal, 701, 2019
Louchet & Moisan (2008) Louchet C., Moisan L., 2008, in 2008 16th European Signal Processing Conference. pp 1–5
Lu & Zimmerman (2005) Lu N., Zimmerman D. L., 2005, Statistics & Probability Letters, 73, 449
Mairal et al. (2009) Mairal J., Bach F., Ponce J., Sapiro G., Zisserman A., 2009, in IEEE International Conference on Computer Vision. pp 2272–2279
Maire et al. (2017) Maire A.-L., et al., 2017, Astronomy & Astrophysics, 601, A134
Marois et al. (2006) Marois C., Lafrenière D., Doyon R., Macintosh B., Nadeau D., 2006, The Astrophysical Journal, 641, 556
Marois et al. (2008) Marois C., Macintosh B., Barman T., Zuckerman B., Song I., Patience J., Lafrenière D., Doyon R., 2008, Science, 322, 1348
Marois et al. (2010) Marois C., Zuckerman B., Konopacky Q. M., Macintosh B., Barman T., 2010, Nature, 468, 1080
Marois et al. (2013) Marois C., Correia C., Véran J.-P., Currie T., 2013, International Astronomical Union, 8, 48
Marois et al. (2014) Marois C., Correia C., Galicher R., Ingraham P., Macintosh B., Currie T., De Rosa R., 2014, in SPIE Astronomical Instrumentation + Telescopes. p. 91480U
Mazoyer et al. (2020) Mazoyer J., et al., 2020, in Ground-based and Airborne Instrumentation for Astronomy VIII. pp 1080–1099
Mesa et al. (2019a) Mesa D., et al., 2019a, Monthly Notices of the Royal Astronomical Society, 488, 37
Mesa et al. (2019b) Mesa D., et al., 2019b, Astronomy & Astrophysics, 632, A25
Milli et al. (2012) Milli J., Mouillet D., Lagrange A.-M., Boccaletti A., Mawet D., Chauvin G., Bonnefoy M., 2012, Astronomy & Astrophysics, 545, A111
Milli et al. (2017) Milli J., et al., 2017, Astronomy & Astrophysics, 599, A108
Milli et al. (2019) Milli J., et al., 2019, Astronomy & Astrophysics, 626, A54
Müller et al. (2011) Müller A., van den Ancker M., Launhardt R., Pott J.-U., Fedele D., Henning T., 2011, Astronomy & Astrophysics, 530, A85
Müller et al. (2018) Müller A., et al., 2018, Astronomy & Astrophysics, 617, L2
Muro-Arena et al. (2018) Muro-Arena G., et al., 2018, Astronomy & Astrophysics, 614, A24
Muro-Arena et al. (2020) Muro-Arena G., et al., 2020, Astronomy & Astrophysics, 635, A121
Nielsen & Close (2010) Nielsen E. L., Close L. M., 2010, The Astrophysical Journal, 717, 878
Nielsen et al. (2008) Nielsen E. L., Close L. M., Biller B. A., Masciadri E., Lenzen R., 2008, The Astrophysical Journal, 674, 466
Pairet et al. (2019) Pairet B., Jacques L., Cantalloube F., 2019, Signal Processing with Adaptive Sparse Structured Representations, 1, 1
Pairet et al. (2021) Pairet B., Cantalloube F., Jacques L., 2021, Monthly Notices of the Royal Astronomical Society, 503, 3724
Pavlov et al. (2008) Pavlov A., Möller-Nilsson O., Feldt M., Henning T., Beuzit J.-L., Mouillet D., 2008, Advanced Software and Control for Astronomy II, 7019, 1093
Pueyo (2018) Pueyo L., 2018, Handbook of Exoplanets, pp 705–765
Ramani et al. (2012) Ramani S., Liu Z., Rosen J., Nielsen J.-F., Fessler J. A., 2012, IEEE Transactions on Image Processing, 21, 3659
Reggiani et al. (2018) Reggiani M., et al., 2018, Astronomy & Astrophysics, 611, A74
Ren (2023) Ren B. B., 2023, Astronomy & Astrophysics, 679, A18
Ren et al. (2018) Ren B., Pueyo L., Zhu G. B., Debes J., Duchêne G., 2018, The Astrophysical Journal, 852, 104
Ren et al. (2020) Ren B., Pueyo L., Chen C., Choquet É., Debes J. H., Duchêne G., Ménard F., Perrin M. D., 2020, The Astrophysical Journal, 892, 74
Riaud et al. (2006) Riaud P., Mawet D., Absil O., Boccaletti A., Baudoz P., Herwats E., Surdej J., 2006, Astronomy & Astrophysics, 458, 317
Ruane et al. (2019) Ruane G., et al., 2019, The Astronomical Journal, 157, 118
Schneider et al. (1999) Schneider G., et al., 1999, The Astrophysical Journal Letters, 513, L127
Schütz et al. (2005) Schütz O., Meeus G., Sterzik M., 2005, Astronomy & Astrophysics, 431, 175
Smith & Terrile (1984) Smith B. A., Terrile R. J., 1984, Science, 226, 1421
Soummer et al. (2012) Soummer R., Pueyo L., Larkin J., 2012, The Astrophysical Journal Letters, 755, L28
Sparks & Ford (2002) Sparks W. B., Ford H. C., 2002, The Astrophysical Journal, 578, 543
Stapper & Ginski (2022) Stapper L. M., Ginski C., 2022, Astronomy & Astrophysics, 668
Stein (1981) Stein C. M., 1981, The Annals of Statistics, pp 1135–1151
Teague et al. (2018) Teague R., Bae J., Bergin E. A., Birnstiel T., Foreman-Mackey D., 2018, The Astrophysical Journal Letters, 860, L12
Thatte et al. (2007) Thatte N., Abuter R., Tecza M., Nielsen E. L., Clarke F. J., Close L. M., 2007, Monthly Notices of the Royal Astronomical Society, 378, 1229
Thiébaut (2002) Thiébaut É., 2002, in Astronomical Data Analysis II. pp 174–183
Tilling et al. (2012) Tilling I., et al., 2012, Astronomy & Astrophysics, 538, A20
Traub & Oppenheimer (2010) Traub W. A., Oppenheimer B. R., 2010, Exoplanets, pp 111–156
Van Leeuwen (2007) Van Leeuwen F., 2007, Astronomy & Astrophysics, 474, 653
Vigan et al. (2010) Vigan A., Moutou C., Langlois M., Allard F., Boccaletti A., Carbillet M., Mouillet D., Smith I., 2010, Monthly Notices of the Royal Astronomical Society, 407, 71
Vigan et al. (2014) Vigan A., et al., 2014, in Ground-based and Airborne Instrumentation for Astronomy V. pp 1568–1577
Wagner et al. (2019) Wagner K., Stone J. M., Spalding E., Apai D., Dong R., Ertel S., Leisenring J., Webster R., 2019, The Astrophysical Journal, 882, 20
Wagner et al. (2023) Wagner K., et al., 2023, Nature Astronomy, 7, 1208
Wahba et al. (1985) Wahba G., et al., 1985, The Annals of Statistics, 13, 1378
Wahhaj et al. (2015) Wahhaj Z., et al., 2015, Astronomy & Astrophysics, 581, A24
Wahhaj et al. (2021) Wahhaj Z., et al., 2021, Astronomy & Astrophysics, 648, A26
Wainwright & Simoncelli (1999) Wainwright M. J., Simoncelli E. P., 1999, in Neural Information Processing Systems. pp 855–861
Werner et al. (2008) Werner K., Jansson M., Stoica P., 2008, IEEE Transactions on Signal Processing, 56, 478
Wolf et al. (2024) Wolf T. N., Jones B. A., Bowler B. P., 2024, The Astronomical Journal, 167, 92
Xie et al. (2022) Xie C., et al., 2022, arXiv preprint arXiv:2208.07915
Xuan et al. (2018) Xuan W. J., et al., 2018, The Astronomical Journal, 156, 156
Yu et al. (2011) Yu G., Sapiro G., Mallat S., 2011, IEEE Transactions on Image Processing, 21, 2481
Zhu et al. (1997) Zhu C., Byrd R. H., Lu P., Nocedal J., 1997, ACM Transactions on Mathematical Software, 23, 550
Zoran & Weiss (2011) Zoran D., Weiss Y., 2011, in IEEE International Conference on Computer Vision. pp 479–486

Appendix A Derivation of the maximum likelihood estimators for a weighted mixture of multi-variate Gaussian models

In this appendix, we detail the technical steps leading to the MLEs (9)–(12) of the parameters of a weighted mixture of multivariate Gaussian models, given the object of interest $\bm{u}$ (see Sect. 2.3.1).

Under the assumptions of Sect. 2.2, the co-log-likelihood of the 4D patch $\bm{v}_{n}$ is given by Eq. (8) and can be rewritten as:

$$\begin{aligned}
\mathscr{L}_{n}&=\frac{T\,K}{2}\,\log\big\lvert\mathbf{C}_{n}^{\mathrm{spec}}\big\rvert+\frac{T\,L}{2}\,\log\big\lvert\mathbf{C}_{n}^{\mathrm{spat}}\big\rvert\\
&\quad+\sum_{t=1}^{T}\left(\frac{K\,L}{2}\,\log\sigma_{n,t}^{2}+\frac{1}{2\,\sigma_{n,t}^{2}}\,\left\lVert\bm{r}_{n,t}\right\rVert_{\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}\right)\,,
\end{aligned}\tag{121}$$

with:

$$\bm{r}_{n,t}=\bm{v}_{n,t}-\bm{\mu}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\bm{u}]_{n,t}\tag{122}$$

the residuals in the $t$-th frame of the $n$-th patch, and where we used the following properties of the Kronecker product of any $n\times n$ matrix $\mathbf{A}$ and $m\times m$ matrix $\mathbf{B}$: $|\mathbf{A}\otimes\mathbf{B}|=|\mathbf{A}|^{m}|\mathbf{B}|^{n}$ and $(\mathbf{A}\otimes\mathbf{B})^{-1}=\mathbf{A}^{-1}\otimes\mathbf{B}^{-1}$.
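The two Kronecker-product properties used above can be checked numerically. The following is a minimal NumPy sketch, not part of the original derivation: the helper `random_spd`, the matrix sizes, and the random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(size, rng):
    """Draw a random symmetric positive-definite matrix (illustrative helper)."""
    g = rng.standard_normal((size, size))
    return g @ g.T + size * np.eye(size)

n, m = 3, 4                      # sizes of A (n x n) and B (m x m), chosen arbitrarily
A = random_spd(n, rng)
B = random_spd(m, rng)
AB = np.kron(A, B)               # (n*m) x (n*m) Kronecker product

# |A (x) B| = |A|^m |B|^n, compared on the log scale for numerical stability
_, logdet_AB = np.linalg.slogdet(AB)
_, logdet_A = np.linalg.slogdet(A)
_, logdet_B = np.linalg.slogdet(B)
det_gap = abs(logdet_AB - (m * logdet_A + n * logdet_B))

# (A (x) B)^{-1} = A^{-1} (x) B^{-1}
inv_gap = np.max(np.abs(np.linalg.inv(AB)
                        - np.kron(np.linalg.inv(A), np.linalg.inv(B))))

print(f"log-determinant mismatch: {det_gap:.2e}")
print(f"inverse mismatch:         {inv_gap:.2e}")
```

Both mismatches should be at the level of floating-point round-off. Applied to $\sigma_{n,t}^{2}\,\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}}$, the determinant property is what produces the factors $T\,K/2$, $T\,L/2$, and $K\,L/2$ in front of the logarithms in Eq. (121).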

To obtain the MLEs, we differentiate the expression of $\mathscr{L}_{n}$ given in Eq. (121):

$$\begin{aligned}
\partial\mathscr{L}_{n}={}&\tfrac{T\,K}{2}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spec}}\Big)+\tfrac{T\,L}{2}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spat}}\Big)\\
&+\sum_{t=1}^{T}\biggl\{\tfrac{K\,L}{2}\,\frac{\partial\sigma_{n,t}^{2}}{\sigma_{n,t}^{2}}-\frac{\partial\sigma_{n,t}^{2}}{2\,\sigma_{n,t}^{4}}\,\left\lVert\bm{r}_{n,t}\right\rVert_{\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}\\
&\quad+\frac{1}{\sigma_{n,t}^{2}}\,\bm{r}_{n,t}^{\top}\,\Big(\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\Big)\,\partial\bm{r}_{n,t}\\
&\quad-\frac{1}{2\,\sigma_{n,t}^{2}}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spec}}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\mathbf{V}_{n,t}^{\top}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\Big)\\
&\quad-\frac{1}{2\,\sigma_{n,t}^{2}}\,\operatorname{tr}\Big(\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\partial\mathbf{C}_{n}^{\mathrm{spat}}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\mathbf{V}_{n,t}^{\top}\Big)\biggr\}\,,
\end{aligned}\tag{123}$$

where we obtained the last two terms by rewriting the squared norm term in Eq. (121) as:

$$\left\lVert\bm{r}_{n,t}\right\rVert_{\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}=\operatorname{tr}\Big(\mathbf{V}_{n,t}^{\top}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\Big)\tag{124}$$

with $\mathbf{V}_{n,t}$ the $K\times L$ matrix whose element at row $k$ and column $\ell$ is $[\bm{r}_{n,t}]_{k,\ell}$. The following set of conditions is sufficient for the partial derivatives of $\mathscr{L}_{n}$ with respect to $\bm{\mu}_{n}^{\mathrm{spec}}$, $\sigma_{n,t}^{2}$, $\mathbf{C}_{n}^{\mathrm{spec}}$, and $\mathbf{C}_{n}^{\mathrm{spat}}$, respectively, to be equal to zero:

$$\begin{cases}
\sum_{t=1}^{T}\sigma_{n,t}^{-2}\,\bm{r}_{n,t}=0\,,\\[6pt]
\frac{K\,L}{\sigma_{n,t}^{2}}-\frac{1}{\sigma_{n,t}^{4}}\,\left\lVert\bm{r}_{n,t}\right\rVert_{\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}=0\,,\\[6pt]
T\,K\,\mathbf{I}-\sum\limits_{t=1}^{T}\frac{1}{\sigma_{n,t}^{2}}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\mathbf{V}_{n,t}^{\top}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}=\mathbf{0}\,,\\[6pt]
T\,L\,\mathbf{I}-\sum\limits_{t=1}^{T}\frac{1}{\sigma_{n,t}^{2}}\,\big(\mathbf{C}_{n}^{\mathrm{spat}}\big)^{-1}\,\mathbf{V}_{n,t}\,\big(\mathbf{C}_{n}^{\mathrm{spec}}\big)^{-1}\,\mathbf{V}_{n,t}^{\top}=\mathbf{0}\,,
\end{cases}\tag{125}$$

with $\mathbf{I}$ the identity matrix. These conditions hold if:

$$\begin{cases}
\widehat{\bm{\mu}}_{n}^{\,\mathrm{spec}}=\frac{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}\,\left(\bm{v}_{n,t}-[\mathbf{M}\,\widehat{\bm{u}}]_{n,t}\right)}{\sum_{t=1}^{T}\widehat{\sigma}_{n,t}^{-2}}\,,\\[6pt]
\widehat{\sigma}_{n,t}^{2}=\tfrac{1}{K\,L}\,\left\lVert\bm{v}_{n,t}-\widehat{\bm{\mu}}_{n}^{\mathrm{spec}}-[\mathbf{M}\,\bm{u}]_{n,t}\right\rVert_{\big(\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\big)^{-1}\otimes\big(\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}\big)^{-1}}^{2}\,,\\[6pt]
\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}=\tfrac{1}{T\,K}\sum\limits_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}^{\top}\,\left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}\,,\\[6pt]
\widehat{\mathbf{C}}_{n}^{\mathrm{spat}}=\tfrac{1}{T\,L}\sum\limits_{t=1}^{T}\widehat{\mathbf{V}}_{n,t}\,\left(\widehat{\sigma}_{n,t}^{2}\,\widehat{\mathbf{C}}_{n}^{\mathrm{spec}}\right)^{-1}\,\widehat{\mathbf{V}}_{n,t}^{\top}\,.
\end{cases}\tag{126}$$

These correspond to the expressions given in Eqs. (9)–(12).
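Because each estimator in Eq. (126) depends on the others, the four expressions are naturally evaluated as a fixed-point iteration. The following is a minimal NumPy sketch of one such alternating scheme on simulated data; it is an illustration under stated assumptions, not the authors' implementation. The function name `fit_patch_statistics`, the number of iterations, the initialization, and the simulated patch are all hypothetical. The sketch assumes that the object contribution $[\mathbf{M}\,\bm{u}]_{n,t}$ has already been subtracted from the input, and it omits the shrinkage regularization that the paper applies to the covariance estimates.

```python
import numpy as np

def fit_patch_statistics(d, n_iter=30):
    """Alternate the four updates of Eq. (126) for a single patch.

    d has shape (T, K, L): frame t stores v_{n,t} - [M u]_{n,t} as a K x L matrix.
    Returns mu (K, L), sigma2 (T,), c_spec (L, L), c_spat (K, K).
    """
    T, K, L = d.shape
    sigma2, c_spec, c_spat = np.ones(T), np.eye(L), np.eye(K)  # arbitrary start
    for _ in range(n_iter):
        # Weighted mean over frames (first line of Eq. 126)
        w = 1.0 / sigma2
        mu = np.tensordot(w, d, axes=(0, 0)) / w.sum()
        V = d - mu                                   # residual matrices V_{n,t}
        i_spec, i_spat = np.linalg.inv(c_spec), np.linalg.inv(c_spat)
        # Per-frame scale, using the trace form of the norm (Eq. 124)
        sigma2 = np.einsum('tkl,kj,tjm,ml->t', V, i_spat, V, i_spec) / (K * L)
        w = 1.0 / sigma2
        # Spectral covariance: (1 / TK) sum_t V_t^T (sigma_t^2 C_spat)^{-1} V_t
        c_spec = np.einsum('t,tkl,kj,tjm->lm', w, V, i_spat, V) / (T * K)
        # Spatial covariance: (1 / TL) sum_t V_t (sigma_t^2 C_spec)^{-1} V_t^T
        i_spec = np.linalg.inv(c_spec)
        c_spat = np.einsum('t,tkl,lm,tjm->kj', w, V, i_spec, V) / (T * L)
    return mu, sigma2, c_spec, c_spat

# Simulated patch drawn from the separable model (all values are illustrative)
rng = np.random.default_rng(1)
T, K, L = 40, 6, 4
g_spat = rng.standard_normal((K, K))
g_spec = rng.standard_normal((L, L))
true_spat = g_spat @ g_spat.T + K * np.eye(K)
true_spec = g_spec @ g_spec.T + L * np.eye(L)
true_std = rng.uniform(0.5, 2.0, size=T)             # per-frame sigma_{n,t}
noise = rng.standard_normal((T, K, L))
d = true_std[:, None, None] * (
    np.linalg.cholesky(true_spat) @ noise @ np.linalg.cholesky(true_spec).T
)

mu, sigma2, c_spec, c_spat = fit_patch_statistics(d)
print("sigma2 range:", sigma2.min(), sigma2.max())
print("c_spec shape:", c_spec.shape, "c_spat shape:", c_spat.shape)
```

Note that the factors $\sigma_{n,t}^{2}$, $\mathbf{C}_{n}^{\mathrm{spec}}$, and $\mathbf{C}_{n}^{\mathrm{spat}}$ are only defined up to compensating scale factors, so it is their product $\sigma_{n,t}^{2}\,\mathbf{C}_{n}^{\mathrm{spec}}\otimes\mathbf{C}_{n}^{\mathrm{spat}}$, rather than each factor separately, that should be compared with the simulated ground truth.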

Appendix B Additional reconstruction results on simulated synthetic disks

This appendix complements the results presented in Sects. 4.4 and 4.5 regarding the reconstruction of the flux distributions for synthetic disks. Figures 24, 25, and 26 report line cuts extracted from Figs. 11–12, 13–14, and 15–16, respectively, comparing the proposed approach to the median ASDI, PCA ASDI, and PACO ASDI baselines. Figure 27 reports line cuts extracted from Fig. 18, comparing the proposed REXPACO ASDI algorithm to its mono-spectral version (REXPACO ADI; Flasseur et al. 2021).