Gravitational Wave Signal Extraction Against Non-Stationary Instrumental Noises with Deep Neural Network

Yuxiang Xu: Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China; Center for Gravitational Wave Experiment, National Microgravity Laboratory, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China; Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China; Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), University of Chinese Academy of Sciences (UCAS), Beijing 100049, China

Minghui Du: Center for Gravitational Wave Experiment, National Microgravity Laboratory, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China

Peng Xu (Corresponding author: [email protected]): Center for Gravitational Wave Experiment, National Microgravity Laboratory, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China; Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China; Lanzhou Center of Theoretical Physics, Lanzhou University, Lanzhou 730000, China; Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), University of Chinese Academy of Sciences (UCAS), Beijing 100049, China

Bo Liang: Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China; Center for Gravitational Wave Experiment, National Microgravity Laboratory, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China; Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China; Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), University of Chinese Academy of Sciences (UCAS), Beijing 100049, China

He Wang: CAS Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China; International Centre for Theoretical Physics Asia-Pacific, University of Chinese Academy of Sciences, 100190 Beijing, China; Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), University of Chinese Academy of Sciences (UCAS), Beijing 100049, China
(September 16, 2024)
Abstract

Space-borne gravitational wave antennas, such as LISA and the LISA-like missions Taiji and Tianqin, will offer novel perspectives for exploring our Universe while introducing new challenges, especially in data analysis. Aside from known challenges such as the high dimension of the parameter space and the superposition of a large number of signals, gravitational wave detection in space will be more seriously affected by anomalies or non-stationarities in the science measurements. Considering three types of foreseeable non-stationarities, namely data gaps, transients (glitches), and time-varying noise auto-correlations, which may come from routine maintenance or unexpected disturbances during science operations, we developed a deep learning model for accurate signal extraction under such anomalous scenarios. Our model matches the performance of current state-of-the-art models in the ideal, anomaly-free scenario, while showing remarkable adaptability in extracting coalescing massive black hole binary signals against all three types of non-stationarities and even their mixtures. This work also provides new explorations into the robustness of deep learning models for data processing in space-borne gravitational wave missions.


I INTRODUCTION

Since the first landmark event GW150914 was captured by Adv-LIGO [1], nearly a hundred Gravitational Wave (GW) signals from mergers of stellar-mass compact binaries have been observed by the LIGO-Virgo collaboration, leading to numerous accomplishments in astrophysics and fundamental physics [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]. However, limited by seismic noise, ground-based GW detectors are typically sensitive to signals at frequencies higher than 10 Hz [15, 16, 17]. To capture the exciting sources from larger and heavier astrophysical systems, one needs to explore the low-frequency band of the GW spectrum with detectors of much longer baselines. The first-generation space-borne antennas, including LISA (Laser Interferometer Space Antenna) [18, 19] and the LISA-like missions Taiji [20, 21] and Tianqin [22], are expected to be launched in the 2030s and to cover the mHz band. While offering novel perspectives for exploring the Universe, such space-borne antennas will also introduce new challenges [19], especially in data analysis [23, 24, 25].

Space antennas are expected to respond to a large number of superimposed GW signals, including coalescing (super) Massive Black Hole Binaries (MBHB), extreme mass ratio inspirals, galactic compact binaries, evolving topological defects, the primordial GW background, and also un-modeled sources. To achieve the expected scientific objectives, thorough analysis of the observational data and estimation of the relevant parameters from the heavily superimposed GW signals against complicated noises are needed. Many works have addressed these issues [26, 27, 28], and at present the most accurate results are obtained by the matched filtering method for signals buried in Gaussian noise [29, 30]. However, the construction of a full-fledged scientific data analysis pipeline for space-borne antennas remains an unfinished task, especially given the recent boost from machine learning methods. Following the studies in [31, 32], accurate signal waveform extraction can be viewed as an intermediate step towards subsequent high-precision parameter estimation with machine learning methods, and improvements in its performance and robustness are worth further investigation.

With lessons learned from the LIGO-Virgo observatories [33] and precedent missions such as GRACE/GFO [34], LISA PathFinder (LPF) [35] and Taiji-1 [36], new issues in data analysis for space-borne GW detection have begun to draw more attention [37, 38, 39, 40, 25]. As mentioned, to precisely measure the related parameters and infer the physical properties of the sources, continuous measurements without disruptions in the data are normally demanded, which imposes hard challenges on the long-term stability and robustness of both the payloads and the satellite platforms. Because detailed knowledge of the instrumental noise is crucial to GW signal extraction and parameter estimation, GW detection in space would be more seriously affected by anomalies or non-stationarities in the science measurements, including instrument transients or noise bursts, data gaps, slowly varying noise auto-correlations, trends etc. According to the designs of the LISA and Taiji missions, the foreseeable anomalies or non-stationarities in science measurements may come from routine maintenance of the satellites or the onboard payloads and from unexpected environmental disturbances or instrumental anomalies; these are the main subjects of this work. Cyclostationary properties also exist, mainly due to the orbital modulation of the stochastic GW foreground from compact binaries; they are outside the scope of this work and are left for future studies.

Routine maintenance, such as antenna re-orientations and payload calibrations, could cause significant and even continuous disturbances of the satellite platforms and heavily affect the performance of the key payloads, thereby resulting in varying noise auto-correlations and even data loss or gaps. For example, the scheduled re-orientations of the transmission antennas would lead to data gaps of about 3.5 hours per week or 7 hours every two weeks [23]. In addition, unforeseen anomalies or failures due to hardware problems may take place and produce random unscheduled data gaps. Based on the experience from LPF [41, 38, 24, 40], GRACE/GFO [42, 43] and Taiji-1 [44], instrumental transients, such as acceleration glitches in gravitational reference sensor (GRS) systems and phase jumps in interferometers, may also contaminate the science data. Early studies on data processing algorithms against such possible anomalies can be found in [37, 39, 40]. Recently, Dey et al. [23] used Bayesian analysis techniques to assess the impact of data gaps and concluded that the effects of unscheduled gaps were often more significant, and Spadaro et al. [45] used a joint parameter estimation approach to evaluate the influence of GRS glitches, demonstrating that it is possible to accurately identify glitches in the absence of overlapping GW signals. These studies treated the above non-stationarities independently, and, for instrumental transients, prior knowledge of the transient models is required. In this work, given the potential and capacity of deep neural networks, we seek to develop an anomaly-model-independent signal extraction method for space GW antennas that can treat all three types of data non-stationarities or anomalies above and overcome their impacts.

Deep learning methods have already achieved considerable success in data processing for GW detection [46, 47, 32, 48, 49, 50, 51, 52, 53, 54, 55, 31]. Compared to traditional algorithms such as matched filtering [29, 30, 56], which generally require a vast template bank and thus consume substantial computational resources, deep learning methods are well known for their fast processing speed and excellent generalization ability, making rapid and anomaly-model-independent signal extraction possible. For GW detection in space, Conv-TasNet [31] has demonstrated its performance in various scenarios, completing signal extraction in $\lesssim 10^{-2}$ seconds; however, robustness tests under anomalous conditions were not addressed. In this work, focusing on the three types of possible non-stationarities for LISA or LISA-like missions, namely gaps, transients (glitches), and time-varying noise auto-correlations, we developed a deep learning model for accurate extraction of coalescing MBHB signals under such anomalous scenarios. This also provides new explorations into the robustness of deep learning models for data processing in space-borne GW missions. Our model matches the performance of the Conv-TasNet model [31] in the ideal, anomaly-free scenario, while showing remarkable adaptability in signal extraction against all three types of non-stationarities and even their mixtures.

II Dense-LSTM model

Figure 1: The architecture of our denoising autoencoder is delineated as follows. The input data is first subjected to normalization and segmentation (top left), resulting in the formation of overlapping subsequences. Each subsequence is then processed through the encoder to extract its characteristic feature vector. This is followed by passage through three bidirectional LSTM layers to yield predictive values. The final reconstructed waveform is then attained in the Dense layer (top right).

Autoencoders, commonly employed as denoising architectures, are extensively applied in fields such as image denoising [57, 58, 59, 60], signal extraction [61, 62, 63, 64], and speech enhancement [65, 66, 67, 68]. An autoencoder consists of an encoder and a decoder. The encoder is designed to extract feature mappings from the input data, while the decoder generates reconstructed data from these mappings. For GW signal extraction in LIGO-Virgo data analysis, the CNN-LSTM denoising autoencoder model has been used [69] and has shown superior performance.

As discussed in Sec. I, the long periods and the consequently much more complicated data anomalies in the science measurements of LISA and LISA-like missions have driven us to develop models capable of uncovering deeper mappings within the signals, thereby reducing the loss and alteration of crucial information under significant anomalous conditions. Simple CNN architectures fall short of these demands. In this work, we therefore use a deeper CNN network than the one in the CNN-LSTM denoising autoencoder suggested by Chatterjee et al. [69]. DenseNet is chosen as the encoder because it reduces computational overhead while maintaining accuracy comparable to other deep CNN networks [70, 71, 72, 73], and at the same time ensures faster feature extraction. More importantly, DenseNet thoroughly implements the concept of feature reuse, with each layer directly connected to all preceding layers. This enables comprehensive use of low-complexity shallow features, facilitating the derivation of a smoother and more robust decision function. Such a design ensures considerable extraction capability even when the data are strongly contaminated by the possible non-stationarities or anomalies of space-borne GW detection. For the decoder, we retain the bidirectional LSTM network because of its strength in capturing long-term dependencies within sequences.

The structure of our model is depicted in Fig 1. The input GW data for the model have a sampling frequency of 0.1 Hz and a duration of 160,000 seconds, amounting to 16,000 data points. The detailed data preparations can be found in Sec III. All data are normalized between -1 and 1 and then segmented into 16,000 overlapping subsequences of length 4. Each subsequence is fed into the encoder network for feature extraction. The feature vector of each subsequence is then processed through three bidirectional LSTM layers to predict the output of the subsequent time step. Finally, a complete waveform output is obtained through a dense layer.
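The normalization and segmentation step can be illustrated with a minimal NumPy sketch. Two details are assumptions made here for illustration and are not stated in the text: the data are scaled by their maximum absolute value, and the series is zero-padded at the end so that 16,000 samples yield exactly 16,000 windows of length 4. The function names `normalize` and `segment` are hypothetical.

```python
import numpy as np

def normalize(x):
    """Scale a series into [-1, 1] by its maximum absolute value (assumed scheme)."""
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x

def segment(x, length=4):
    """Return one window of `length` samples per input sample, with stride 1.

    The series is zero-padded at the end (an assumption) so that the number
    of windows equals the number of samples.
    """
    padded = np.concatenate([x, np.zeros(length - 1, dtype=x.dtype)])
    return np.lib.stride_tricks.sliding_window_view(padded, length)

fs = 0.1                  # sampling frequency in Hz
duration = 160_000        # duration in seconds
n = round(fs * duration)  # 16,000 samples

rng = np.random.default_rng(0)
strain = rng.standard_normal(n)       # stand-in for a TDI data stream
windows = segment(normalize(strain))
print(windows.shape)                  # (16000, 4)
```

Each row of `windows` would then be passed to the encoder; consecutive rows share three of their four samples.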

III Data preparations

For LISA and LISA-like missions, laser frequency instability noise is dominant in the science measurements, and a pre-processing technique called Time Delay Interferometry (TDI) [74, 75, 76] is employed to efficiently suppress such noise. Accordingly, our data are generated through the first-generation noise-orthogonal TDI channels $A$ and $E$; see [74] for a detailed introduction. The total TDI output data stream of channel $I$ (dubbed $s^{I}(t)$) is the combination of signal $h^{I}(t)$ and noise $n^{I}(t)$:

$$s^{I}(t) = h^{I}(t) + n^{I}(t), \quad I \in \{A, E\}, \tag{1}$$

where the detector response in terms of channel $I$ to some incident GW with polarizations $h_{\alpha}$ reads

$$h^{I}(t) = \sum_{\alpha} T_{\alpha}^{I}(t)\, h_{\alpha}(t), \quad \alpha \in \{+, \times\}, \tag{2}$$

where $T_{\alpha}^{I}(t)$ is the total transfer function of the signal, including antenna responses and TDI combinations, and $h_{\alpha}(t)$ represents the polarization components of the coalescing MBHB GW signal. Specific details of this transfer function can be found in [76, 74]. In this paper, the TDI transfer functions of signals are derived according to Taiji's mission concept and orbit configurations [77], which has a nominal arm length of $\sim 3 \times 10^{9}$ m. The PyCBC package [78] and the waveform model SEOBNRv4 [79] are employed to generate the coalescing MBHB GW templates $h_{\alpha}(t)$. On the other hand, the noise floor of the output data from a certain TDI channel is in principle determined by the corresponding TDI combinations of GRS residual acceleration noises $n_{\rm ACC}(t)$ and optical metrology system noises $n_{\rm OMS}(t)$. The resulting total instrumental noise of TDI channel $I$ can be written as

$$n^{I}(t) = T^{I}_{\rm OMS}(t)\, n_{\rm OMS}(t) + T^{I}_{\rm ACC}(t)\, n_{\rm ACC}(t), \tag{3}$$

with $T^{I}_{\rm OMS}(t)$ and $T^{I}_{\rm ACC}(t)$ being the transfer functions of optical metrology system noise and GRS residual acceleration noise, respectively [74, 76], which are common to the LISA and Taiji missions. The time series of the basic instrumental noises $n_{\rm ACC}(t)$ and $n_{\rm OMS}(t)$ are generated according to the Power Spectral Densities (PSDs) of these two noise components, whose square roots are denoted as $S_{\text{OMS}}^{1/2}(f)$ and $S_{\text{ACC}}^{1/2}(f)$. According to the designs of Taiji [77], we assume the nominal values

$$S_{\text{OMS}}^{1/2}(f) = 8 \times 10^{-12} \sqrt{1 + \left(\frac{2\,\text{mHz}}{f}\right)^{4}}\ \frac{\rm m}{\sqrt{\rm Hz}}, \tag{4}$$

$$S_{\text{ACC}}^{1/2}(f) = 3 \times 10^{-15} \sqrt{1 + \left(\frac{0.4\,\text{mHz}}{f}\right)^{2}} \times \sqrt{1 + \left(\frac{f}{8\,\text{mHz}}\right)^{4}}\ \frac{\rm m/s^{2}}{\sqrt{\rm Hz}}. \tag{5}$$
Table 1: Summary of the parameter setup for generating coalescing MBHB GW signals.

| Parameter | Description | Lower bound | Upper bound |
|---|---|---|---|
| $M_{\text{tot}}$ | Total mass of the system | $10^{6} M_{\odot}$ | $10^{8} M_{\odot}$ |
| $q$ | Mass ratio of the objects in the binary system | 0.01 | 1 |
| $s_{1}^{z}$ | Dimensionless spin parameter for the primary object | -0.99 | 0.99 |
| $s_{2}^{z}$ | Dimensionless spin parameter for the secondary object | -0.99 | 0.99 |

Figure 2: The noise curve and the typical coalescing MBHB signal (SNR $\sim$ 35.0) in the frequency domain.

Specifically, to generate the noise data in the time domain according to the PSDs, we convert Gaussian white noise to the frequency domain through a Fourier transform, multiply it by the amplitude spectral density value at each frequency to adjust the spectrum of the noise, and finally obtain the noise time series through an inverse Fourier transform. Theoretically, the total noise PSDs of the $A$, $E$ channels, $S_{n}^{I}(f),\ I \in \{A, E\}$, are the TDI combinations of $S_{\rm OMS}(f)$ and $S_{\rm ACC}(f)$ calculated according to Eq. (3); see Refs. [76, 74, 80, 81] for their specific forms.
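The three-step noise recipe described above (Fourier transform of white noise, multiplication by the amplitude spectral density, inverse transform) can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the zero-frequency bin is set to zero because the nominal spectra of Eqs. (4) and (5) diverge there, and the factor $\sqrt{f_s/2}$ is the normalization that turns unit-variance white noise into a one-sided PSD of 1 per Hz. The TDI transfer functions of Eq. (3) are not applied here.

```python
import numpy as np

def asd_oms(f):
    """OMS amplitude spectral density in m/sqrt(Hz), following Eq. (4); f in Hz."""
    return 8e-12 * np.sqrt(1.0 + (2e-3 / f) ** 4)

def asd_acc(f):
    """GRS acceleration ASD in m/s^2/sqrt(Hz), following Eq. (5); f in Hz."""
    return 3e-15 * np.sqrt(1.0 + (0.4e-3 / f) ** 2) * np.sqrt(1.0 + (f / 8e-3) ** 4)

def colored_noise(asd, n, fs, rng):
    """Shape white Gaussian noise so that its one-sided PSD follows asd(f)**2."""
    white = rng.standard_normal(n)            # unit-variance white noise
    spectrum = np.fft.rfft(white)             # step 1: to the frequency domain
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    shape = np.zeros_like(freqs)
    shape[1:] = asd(freqs[1:])                # skip f = 0, where the ASDs diverge
    # step 2: multiply by the ASD; sqrt(fs / 2) fixes the PSD normalization
    shaped = spectrum * shape * np.sqrt(fs / 2.0)
    return np.fft.irfft(shaped, n=n)          # step 3: back to the time domain

fs, n = 0.1, 16_000
rng = np.random.default_rng(0)
n_oms = colored_noise(asd_oms, n, fs, rng)    # displacement noise series
n_acc = colored_noise(asd_acc, n, fs, rng)    # acceleration noise series
```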

We established the parameter space for coalescing MBHBs as in Table 1, and employed a uniform grid to generate the diverse waveforms. Parameters such as times and phases at coalescence, sky locations, polarization angles, as well as inclinations are drawn from uniform distributions. We inject the signal with specific optimal Signal-to-Noise Ratios (SNR)

$$\mathrm{SNR} = (h^{I}|h^{I})^{1/2}, \tag{6}$$

where $h^{I}$ represents the signal template through the TDI channel $I$, and the inner product $(s^{I}|h^{I})$ is defined as

$$(s^{I}|h^{I}) = 2\int_{f_{\text{min}}}^{f_{\text{max}}} \left( \tilde{s}^{I}(f)\,\tilde{h}^{I*}(f) + \tilde{s}^{I*}(f)\,\tilde{h}^{I}(f) \right) df. \tag{7}$$

Here, $f_{\text{min}} = 3 \times 10^{-5}$ Hz and $f_{\text{max}} = 0.05$ Hz. The tilde denotes the Fourier transform and $*$ the complex conjugation. This inner product can also be used to quantify the agreement between the signal waveform $o$ extracted by the model and the whitened template waveform $h$ through channel $A$ or $E$, in terms of the overlap function

$$\mathcal{O}(o, h) = \max_{t_{c}, \phi_{c}} (\hat{o}|\hat{h}), \tag{8}$$

where $\hat{o}$ and $\hat{h}$ are the normalized waveforms, and $t_{c}$ and $\phi_{c}$ are the time and phase corresponding to the maximum overlap between $\hat{o}$ and $\hat{h}$.
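As an illustration of Eq. (8), the maximization over time and phase can be done with a single inverse FFT of the one-sided cross spectrum. The sketch below is not the authors' code: the function name `overlap` is hypothetical, the normalization is assumed to mean $(\hat{o}|\hat{o}) = (\hat{h}|\hat{h}) = 1$, time shifts are treated as circular, and no PSD weighting appears because both waveforms are taken to be whitened already.

```python
import numpy as np

def overlap(o, h, fs, fmin=3e-5, fmax=0.05):
    """Overlap of two whitened series, maximized over circular time shift and phase."""
    n = len(o)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)   # integration band of Eq. (7)
    of = np.where(band, np.fft.rfft(o), 0.0)
    hf = np.where(band, np.fft.rfft(h), 0.0)

    # Put the one-sided cross spectrum into a full-length array. Its inverse
    # FFT is a complex series over time shifts; taking the modulus maximizes
    # over a constant phase offset at each shift.
    cross = np.zeros(n, dtype=complex)
    cross[: len(freqs)] = of * np.conj(hf)
    corr = np.fft.ifft(cross)

    # Normalize so that identical waveforms give an overlap of 1. Constant
    # prefactors of the inner product cancel in this ratio.
    norm = np.sqrt(np.sum(np.abs(of) ** 2) * np.sum(np.abs(hf) ** 2))
    return n * np.max(np.abs(corr)) / norm
```

By the Cauchy-Schwarz inequality the returned value lies between 0 and 1, and it is unchanged when one waveform is circularly shifted in time or flipped in sign.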

We set the SNR level for the training dataset to 50 and generate 10,000 samples. The distances of the sources are adjusted according to the specified optimal SNR. All samples are whitened, and a Tukey window with $\alpha = \frac{1}{8}$ is adopted.
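Since the GW amplitude scales inversely with the source distance, adjusting the distance to reach a target SNR amounts to rescaling the template amplitude. The following sketch shows this together with a Tukey window, written in plain NumPy. It is illustrative only: the helper names are hypothetical, the template is a toy waveform assumed to be whitened already, and applying the window before the rescaling is an assumption, since the text does not specify the order.

```python
import numpy as np

def tukey(n, alpha=1 / 8):
    """Tukey (tapered cosine) window with 0 < alpha < 1: flat top, cosine tapers."""
    w = np.ones(n)
    edge = int(np.floor(alpha * (n - 1) / 2.0))
    k = np.arange(edge + 1)
    taper = 0.5 * (1.0 - np.cos(2.0 * np.pi * k / (alpha * (n - 1))))
    w[: edge + 1] = taper            # rising taper at the start
    w[n - edge - 1:] = taper[::-1]   # mirrored falling taper at the end
    return w

def optimal_snr(h_white, fs, fmin=3e-5, fmax=0.05):
    """Discrete version of Eqs. (6)-(7) for a whitened template."""
    n = len(h_white)
    dt = 1.0 / fs
    freqs = np.fft.rfftfreq(n, d=dt)
    hf = np.fft.rfft(h_white) * dt   # approximate continuous Fourier transform
    band = (freqs >= fmin) & (freqs <= fmax)
    df = 1.0 / (n * dt)
    return np.sqrt(4.0 * np.sum(np.abs(hf[band]) ** 2) * df)

def rescale_to_snr(h_white, fs, target_snr):
    """Rescale the amplitude (equivalently, the source distance) to a target SNR."""
    return h_white * (target_snr / optimal_snr(h_white, fs))

fs, n = 0.1, 16_000
t = np.arange(n) / fs
# Toy chirp-like template, used here only to exercise the functions
template = np.sin(2 * np.pi * (1e-3 + 2e-8 * t) * t) * (t / t[-1]) ** 2
sample = rescale_to_snr(tukey(n) * template, fs, target_snr=50.0)
```

With this convention the new distance would equal the old distance multiplied by the ratio of the current SNR to the target SNR.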

Following the same routine, we produce 2,000 test samples; see Fig 2 for an illustration. To assess the model's performance for signal extraction at relatively low SNRs and to test the robustness of our model against various data anomalies, these samples are generated with SNRs ranging from 30 to 70.

Figure 3: The figure illustrates the distribution of our training and test datasets across redshifts ranging from 1 to 20.

Our training and test datasets were sampled from redshifts ranging from 1 to 20. The resulting distribution is shown in Fig 3.

IV Training strategy

For denoising autoencoders, the Mean Squared Error (MSE) is a prevalent loss function employed to quantify the discrepancy between the network’s predictions and the true values. However, given that the GW signals fed into the network have undergone normalization, the contributions to the overall MSE from the earlier stages of the coalescing signals are rather weak compared to those from the merger phases. This can hinder the model from achieving optimal refinements. To address this, the loss function defined in [69] is adopted which is the difference of the MSE and the fractal Tanimoto similarity coefficient [82] terms

$$L_{o,h} = \frac{\sum_{i}^{n} (h_{i} - o_{i})^{2}}{n} - r^{d}_{w,o,h}, \quad (i = 1, \ldots, n) \tag{9}$$

where the first term on the right-hand side is the MSE term and the fractal Tanimoto similarity coefficient is defined as

$$r^{d}_{w,o,h} = \frac{\sum_{i}^{n} w_{i} \cdot o_{i} \cdot h_{i}}{2^{d} \sum_{i}^{n} w_{i} \cdot (o_{i}^{2} + h_{i}^{2}) - (2^{d+1} - 1) \sum_{i}^{n} w_{i} \cdot o_{i} \cdot h_{i}}. \tag{10}$$

The MSE is used for the optimization of individual data points, while the fractal Tanimoto similarity coefficient is used for the optimization of the overall data. The Tanimoto coefficient is commonly used to measure the similarity between two vectors, and the fractal coefficient, as an enhanced version, introduces a parameter $d$ to facilitate deeper levels of optimization. This approach ensures that the extracted data achieve good amplitude and phase agreement with the template data on a global level.

Another noteworthy modification is that, unlike in the original work [69] where the weight $w_{i}$ is set to $1/o_{i}$, we set $w_{i}$ equal to 1 in our approach. The long observation time in space-borne GW detection gives rise to a large number of data points with amplitudes close to zero in the generated templates. High weights such as $w_{i} = 1/o_{i}$ for these data points with $o_{i} \sim 0$ would reduce the relative weights of the data from the merger phase, so that the optimization would be dominated by data segments far from the important merger phase, which would prevent the network from converging. The parameter $d$, which adjusts the depth of optimization in similarity, is initially set to 0. As the training converges, $d$ is gradually increased to facilitate deeper levels of optimization.
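For concreteness, Eqs. (9) and (10) with $w_{i} = 1$ can be evaluated numerically as in the sketch below. It uses NumPy for clarity rather than a deep learning framework, so it illustrates the arithmetic and is not a training-ready loss; the function names are hypothetical.

```python
import numpy as np

def fractal_tanimoto(o, h, d=0, w=None):
    """Fractal Tanimoto similarity coefficient of Eq. (10).

    With w=None all weights are 1, as chosen in the text. The expression is
    undefined when both o and h vanish everywhere (zero denominator).
    """
    w = np.ones_like(o) if w is None else w
    cross = np.sum(w * o * h)
    squares = np.sum(w * (o ** 2 + h ** 2))
    return cross / (2.0 ** d * squares - (2.0 ** (d + 1) - 1.0) * cross)

def loss(o, h, d=0):
    """Eq. (9): mean squared error minus the fractal Tanimoto coefficient."""
    return np.mean((h - o) ** 2) - fractal_tanimoto(o, h, d)

h = np.sin(np.linspace(0.0, 20.0, 1000))   # toy template
print(loss(h, h, d=0))                     # perfect extraction
print(loss(0.5 * h, h, d=0))               # amplitude error at d = 0
print(loss(0.5 * h, h, d=5))               # same error, penalized more at d = 5
```

For a perfect extraction ($o = h$) the coefficient equals 1 for every $d$, so the loss reaches its minimum of $-1$. Increasing $d$ lowers the coefficient for imperfect matches, which is what makes larger $d$ a stricter similarity measure.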

To summarize, at the beginning of the training, we set the learning rate to $10^{-3}$ and $d$ to 0. As the model approaches convergence, we set $d=5$ and reduce the learning rate by a factor of 10. For the case studies, we chose learning rates of $10^{-3}$, $10^{-4}$, and $10^{-5}$ with corresponding $d$ values of 0, 5, and 10, and trained for 200 epochs in each case.
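The pairing of learning rate and $d$ summarized above can be written as a small lookup. This is only a sketch: it assumes that the three settings are applied one after another and that the switch happens after a fixed 200 epochs, whereas the text ties the switch to the model approaching convergence. The names `STAGES` and `stage_settings` are hypothetical.

```python
# (learning rate, d) pairs quoted in the text, in the order they are listed.
STAGES = [(1e-3, 0), (1e-4, 5), (1e-5, 10)]
EPOCHS_PER_STAGE = 200  # assumption: switch stages after a fixed number of epochs

def stage_settings(epoch):
    """Return (learning_rate, d) for a zero-based global epoch index."""
    stage = min(epoch // EPOCHS_PER_STAGE, len(STAGES) - 1)
    return STAGES[stage]
```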

V Results

We first tested the extraction performance of our model for the ideal case without data non-stationarities or anomalies; see the illustration in Fig. 4. More than 99% of the signals extracted by the model achieve an overlap $\geq 0.9$ with the corresponding template, and each signal can be extracted in less than $10^{-2}$ seconds. This performance is similar to that of the current state-of-the-art models [31]. In the following, we conduct various tests of our model against the possible non-stationarities or anomalies known in the literature, including data gaps, time-varying noise auto-correlations, glitches, and their mixtures. We require that at least 80% of the signals extracted by the model attain an overlap $\geq 0.9$ with the template signals, and we use this as the criterion for assessing the limits of the model's performance. A more realistic, physically motivated criterion or threshold should come from more detailed studies of false positives, uncertainties in parameter estimation, and so on, which are left for future work.
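The acceptance criterion above, together with the summary statistics reported in the tables of this section, can be sketched as follows. The function `summarize_overlaps` is a hypothetical helper that takes overlaps which have already been computed. Whether the quoted standard deviations are population or sample values is not stated, so the population form used here is an assumption.

```python
import numpy as np

def summarize_overlaps(overlaps, threshold=0.9, required_fraction=0.8):
    """Fraction of overlaps >= threshold, their mean and std, and the pass flag."""
    overlaps = np.asarray(overlaps, dtype=float)
    fraction = float(np.mean(overlaps >= threshold))
    return {
        "fraction_above": fraction,
        "mean": float(np.mean(overlaps)),
        "std": float(np.std(overlaps)),  # assumption: population standard deviation
        "passes": fraction >= required_fraction,
    }
```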

Figure 4: Example of signal extraction with our model when there is no anomaly or non-stationarity present in the data.

V.1 Data gaps

Figure 5: The left panel illustrates the model's extraction result in the presence of a scheduled long gap, while the right panel shows the result in the presence of unscheduled gaps with an occurrence frequency of about once every four hours.
Figure 6: The left panel displays the distributions of the overlaps for our test set under data gaps of different occurrence frequencies. The right panel presents the corresponding histogram of the instances where the model achieves an overlap $\geq 0.99$.

Following the discussions in [23], we use window functions to generate data gaps (see Ref. [23] for the specific forms). As discussed in Sec. I, gaps can be classified into two types: scheduled prolonged gaps and unscheduled random but short gaps. For the scheduled gaps, we emulate an extreme situation that results in a data loss of 7 hours during the inspiral phase. Unscheduled gaps can either last up to days or, due to glitch masking, cause data losses of a few minutes per day [37]. Considering the length of our data, we choose the second type of unscheduled gap here. We set each gap to cause 5 to 8 minutes of data loss, and progressively increase the occurrence frequency of such random gaps up to the threshold beyond which our model can hardly tolerate them. Moreover, the unscheduled gaps are allowed to occur anywhere in our data, while a prolonged data gap during the merger phase is not considered for obvious reasons. Our extraction results are presented in Fig. 5. Our model can proficiently extract the signals in the presence of both types of data gaps.
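To make the unscheduled-gap setup concrete, the sketch below builds a multiplicative window that vanishes inside randomly placed gaps of 5 to 8 minutes. It is not the window of Ref. [23], whose specific form is not reproduced here: the half-cosine edge taper, the taper length, the sampling interval, and the name `unscheduled_gap_mask` are all assumptions made for illustration.

```python
import numpy as np

def unscheduled_gap_mask(n_samples, dt, gap_every_hours, rng,
                         min_gap_s=300.0, max_gap_s=480.0, taper_s=60.0):
    """Return a window in [0, 1] that is 0 inside gaps and 1 far from them."""
    mask = np.ones(n_samples)
    # On average one gap per `gap_every_hours` of data.
    n_gaps = int((n_samples * dt) // (gap_every_hours * 3600.0))
    n_taper = int(round(taper_s / dt))
    # Half-cosine ramp falling from just below 1 to just above 0 (assumed edge shape).
    ramp = 0.5 * (1.0 + np.cos(np.linspace(0.0, np.pi, n_taper + 2)[1:-1]))
    for _ in range(n_gaps):
        gap_len = int(round(rng.uniform(min_gap_s, max_gap_s) / dt))
        start = int(rng.integers(n_taper, n_samples - gap_len - n_taper))
        stop = start + gap_len
        mask[start:stop] = 0.0
        # np.minimum keeps earlier gaps intact if two gaps happen to overlap.
        mask[start - n_taper:start] = np.minimum(mask[start - n_taper:start], ramp)
        mask[stop:stop + n_taper] = np.minimum(mask[stop:stop + n_taper], ramp[::-1])
    return mask

rng = np.random.default_rng(0)
# Hypothetical data length: 2 days sampled every 10 s (not taken from the paper).
mask = unscheduled_gap_mask(n_samples=17_280, dt=10.0, gap_every_hours=4.0, rng=rng)
```

The gapped data would then be obtained as `mask * data`. In this sketch the tapers lie outside the zeroed interval, so the effective data loss per gap is slightly larger than the nominal 5 to 8 minutes.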

We delved further into the threshold performance of our model, as detailed in Fig 6. The model exhibits minimal accuracy loss in scenarios with scheduled gaps before mergers. When confronted with the random unscheduled gaps, the model’s results gradually worsen as the frequency of occurrences increases. As expected, signals with lower SNRs are more susceptible to data gaps.

Table 2: For different types of data gaps, the fraction of the entire dataset for which the overlap between the model-extracted signal and the template is $\geq 0.9$. The last two columns give the mean and standard deviation of the overlap.

| Type of gap | Overlaps $\geq 0.9$ | Mean | Std. Dev. |
|---|---|---|---|
| No gap | 100.0% | 0.997 | 0.005 |
| Scheduled gap | 99.9% | 0.996 | 0.016 |
| Gap every 6 hours | 86.7% | 0.935 | 0.169 |
| Gap every 5 hours | 83.1% | 0.923 | 0.181 |
| Gap every 4 hours | 77.3% | 0.893 | 0.220 |

Figure 7: The left panel displays the distributions of the overlaps for our test set under time-varying noise auto-correlations of varying intensities. The right panel presents the corresponding histogram of the instances where the model achieves an overlap $\geq 0.99$.

In Tab. 2, we present more detailed results under different conditions. According to the predetermined criterion, our model remains robust even when unscheduled gaps occur as often as once every 5 hours.

V.2 Time-varying noise auto-correlations

Figure 8: Example of signal extraction by the model in the presence of time-varying noise auto-correlations. Here, the residual acceleration noise is set to threefold its nominal magnitude over a data segment spanning 80,000 seconds. The result demonstrates that our model effectively extracts the signal, achieving a high overlap.

We simulated non-stationary noises with time-varying auto-correlations or PSDs by altering the magnitudes of $S_{\text{OMS}}(f)$ and $S_{\text{ACC}}(f)$ of the two noise components, with the modified noise segment lasting for 80,000 seconds. In Fig. 8, we present one of our extraction samples. The limits of our model's performance against such non-stationary noise are investigated and shown in Fig. 7. As expected, data with lower SNRs are more significantly affected by time-varying noise PSDs, and the detailed results are presented in Tab. 3. Within these limits, our model can successfully extract the signal without being affected by such variations in the total noise PSD.
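One simple way to emulate such a change in a time-domain noise realization is sketched below. It assumes that the quoted factors multiply the PSD, so that amplitudes scale with the square root of the factor, and that the change switches on and off abruptly. Neither detail is specified in the text, and if the factors refer to amplitudes the square root should be dropped. The helper name `scale_noise_segment` is hypothetical.

```python
import numpy as np

def scale_noise_segment(noise, dt, start_s, psd_factor, segment_s=80_000.0):
    """Return a copy of `noise` whose PSD is multiplied by `psd_factor` in one segment."""
    scaled = np.array(noise, dtype=float, copy=True)
    i0 = int(round(start_s / dt))
    i1 = min(i0 + int(round(segment_s / dt)), scaled.size)
    # A PSD scaled by k corresponds to amplitudes scaled by sqrt(k).
    scaled[i0:i1] *= np.sqrt(psd_factor)
    return scaled
```

Because the scaling is applied to a single noise component before it enters the TDI combination, the same helper could be used for either the optical metrology or the acceleration noise series.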

However, given the predefined criterion, one finds that noise originating from the optical metrology system has a more significant impact on the final outcome. Further investigation shows that this is because increasing the optical metrology system noise (concentrated in the relatively high-frequency band) significantly reduces the SNR of coalescing MBHB signals, owing to the specific forms of the PSDs of the total noise and of the expected signal; see Fig. 9 for an illustration. It is this reduction in SNR, rather than the variation in the noise PSD itself, that makes it more difficult for the model to extract signals and produces the results shown in Fig. 7 and Tab. 3.
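The dependence of the SNR on the noise PSD can be illustrated with the standard optimal matched-filter expression $\mathrm{SNR}^2 = 4\int |\tilde h(f)|^2 / S_n(f)\,\mathrm{d}f$ for a one-sided PSD. The sketch below uses toy, dimensionless spectra chosen only to show the mechanism. They are not the mission noise models or a real MBHB spectrum, and all names are hypothetical.

```python
import numpy as np

def optimal_snr(h_f, psd, df):
    """Optimal SNR sqrt(4 * sum(|h(f)|^2 / S_n(f)) * df) on a one-sided frequency grid."""
    h_f = np.asarray(h_f)
    psd = np.asarray(psd, dtype=float)
    return float(np.sqrt(4.0 * np.sum(np.abs(h_f) ** 2 / psd) * df))

# Toy spectra (NOT the mission noise models): an "ACC-like" part that dominates
# at low frequencies and a flat "OMS-like" part that dominates elsewhere.
f = np.linspace(1e-3, 1e-1, 500)
df = f[1] - f[0]
s_acc = (3e-3 / f) ** 4
s_oms = np.ones_like(f)
h_f = np.ones_like(f)  # hypothetical flat signal spectrum

snr_nominal = optimal_snr(h_f, s_acc + s_oms, df)
snr_acc_x2 = optimal_snr(h_f, 2.0 * s_acc + s_oms, df)
snr_oms_x2 = optimal_snr(h_f, s_acc + 2.0 * s_oms, df)
```

In this toy setup most of the signal power lies where the flat component dominates, so doubling that component lowers the SNR far more than doubling the low-frequency one.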

Figure 9: The changes in the overall noise spectrum when the magnitudes of the optical metrology system noise and the acceleration noise are altered. For our test data set, the optical metrology noise affects the total SNR (SNR $\sim 35$ in this example) more significantly.
Table 3: For time-varying noise auto-correlations, the fraction of the entire dataset for which the overlap between the model-extracted signal and the template signal is $\geq 0.9$. The last two columns give the mean and standard deviation of the overlap.

| $S_{\text{OMS}}(f)$ | $S_{\text{ACC}}(f)$ | Overlaps $\geq 0.9$ | Mean | Std. Dev. |
|---|---|---|---|---|
| 1x | 1x | 100.0% | 0.997 | 0.005 |
| 1x | 2x | 92.2% | 0.962 | 0.117 |
| 1x | 3x | 81.5% | 0.923 | 0.158 |
| 1x | 4x | 56.0% | 0.842 | 0.205 |
| 2x | 1x | 78.3% | 0.894 | 0.224 |

V.3 Glitches

For instrumental transients, we consider the legacy model from LISA Pathfinder for GRS glitches as a representative [40]:

$$g(t)=\frac{\Delta v}{\tau_{1}-\tau_{2}}\left(e^{\frac{-(t-t_{0})}{\tau_{1}}}-e^{\frac{-(t-t_{0})}{\tau_{2}}}\right)\Theta(t-t_{0}), \qquad (11)$$

where $t_0$ is the time of glitch injection, $\Delta v$ is the impulse transferred by the glitch (i.e., the gain in test-mass velocity caused by the glitch), $\tau_1$ and $\tau_2$ are time scales, and $\Theta(t-t_0)$ denotes the Heaviside step function. Glitches are simulated and injected into the GRS residual acceleration noise $n_{\rm ACC}(t)$ of TM12 (i.e., the test mass onboard spacecraft 1 and facing spacecraft 2). They are then passed through $T^{I}_{\rm ACC}(t)$ (in Eq. (3)), together with the “normal” acceleration noises, to obtain the total TDI combinations with glitches. Following Ref. [40], we considered two types of glitches: short-duration glitches lasting approximately 70 seconds and long-duration glitches lasting about 3 hours. The initial amplitudes of both types of glitches were set to $1\times 10^{-14}\,\text{m s}^{-2}$ (see Tab. 4 for the detailed parameters), and we tested the robustness of our model by progressively increasing the amplitudes in each new experiment (see Tab. 5 for the tested values) until the performance fell below the threshold we set.

Table 4: The parameters of the two types of glitches. With these parameter settings, both glitches have an amplitude of $1\times 10^{-14}\,\text{m s}^{-2}$.

| Duration | $\Delta v$ (m/s) | $\tau_1$ (s) | $\tau_2$ (s) |
|---|---|---|---|
| Short | $2.86\times 10^{-13}$ | 10 | 11 |
| Long | $5.44\times 10^{-11}$ | 2000 | 2001 |
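As an illustration, Eq. (11) can be evaluated directly for the two parameter sets of Tab. 4. The function name `grs_glitch` and the time grids are choices made for this sketch. Only the glitch acceleration itself is computed; its propagation through the TDI transfer function is not modeled.

```python
import numpy as np

def grs_glitch(t, delta_v, tau1, tau2, t0=0.0):
    """Two-exponential glitch acceleration g(t) of Eq. (11)."""
    t = np.asarray(t, dtype=float)
    # Clip at zero so the exponentials are never evaluated before t0.
    elapsed = np.clip(t - t0, 0.0, None)
    shape = np.exp(-elapsed / tau1) - np.exp(-elapsed / tau2)
    # Heaviside step: the glitch vanishes before the injection time t0.
    return np.where(t >= t0, delta_v / (tau1 - tau2) * shape, 0.0)

# Parameters from Tab. 4; the time grids are illustrative.
t_short = np.arange(0.0, 200.0, 0.1)
peak_short = grs_glitch(t_short, 2.86e-13, 10.0, 11.0).max()
t_long = np.arange(0.0, 30_000.0, 1.0)
peak_long = grs_glitch(t_long, 5.44e-11, 2000.0, 2001.0).max()
# Both peaks come out close to 1e-14 m s^-2, the amplitude quoted for Tab. 4.
```

The peak is reached at $t - t_0 = \tau_1\tau_2\ln(\tau_2/\tau_1)/(\tau_2-\tau_1)$, i.e. roughly 10.5 s for the short glitch and roughly 2000 s for the long one.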

All glitches were injected with an initial time $t_0 = 0$, and the time difference between the injection points of the glitches and the mergers ranged from 22 to 38 hours. Our model proficiently extracts the coalescing MBHB signals in the presence of such instrumental transients. Fig. 10 shows an example of the model's extraction results.

Figure 10: The left panel displays the extraction result of our model when encountering a short-duration glitch with a peak value of $1\times 10^{-13}\,\text{m s}^{-2}$. The right panel shows the extraction result when encountering a long-duration glitch with a peak value of $1\times 10^{-14}\,\text{m s}^{-2}$.
Figure 11: The left panel displays the distributions of the overlaps for our test set under glitches of varying intensities and durations. The right panel presents the corresponding histogram of the instances where the model achieves an overlap $\geq 0.99$.
Table 5: For glitches with different intensities and durations, the fraction of the entire dataset for which the overlap between the model-extracted signal and the template signal is $\geq 0.9$. The last two columns give the mean and standard deviation of the overlap.

| Duration | Peak value ($\text{m s}^{-2}$) | Overlaps $\geq 0.9$ | Mean | Std. Dev. |
|---|---|---|---|---|
| None | None | 100.0% | 0.997 | 0.005 |
| Short | $1\times 10^{-14}$ | 100.0% | 0.997 | 0.005 |
| Short | $5\times 10^{-14}$ | 99.9% | 0.995 | 0.010 |
| Short | $1\times 10^{-13}$ | 94.9% | 0.977 | 0.057 |
| Short | $1.5\times 10^{-13}$ | 72.0% | 0.893 | 0.161 |
| Long | $1\times 10^{-14}$ | 100.0% | 0.996 | 0.006 |
| Long | $5\times 10^{-14}$ | 31% | 0.830 | 0.112 |

The limits of our model's performance against glitches are shown in Fig. 11, and detailed results are presented in Tab. 5. The results indicate that the performance deteriorates as the duration of the glitch increases. Specifically, for short-duration glitches, the maximum tolerable magnitude is about $1\times 10^{-13}\,\text{m s}^{-2}$, while for long-duration glitches it is about $1\times 10^{-14}\,\text{m s}^{-2}$. The long-duration glitch thus has a greater impact on the extraction results. This is because we adjust its duration to be consistent with the duration of the merger phase of the MBHB signal, which greatly affects the model's judgment and results in erroneous extraction.

V.4 Mixing three types of non-stationarities

Table 6: After mixing all anomalous or non-stationary conditions, the fraction of the entire dataset for which the overlap between the model-extracted signal and the template signal is $\geq 0.9$, together with the mean and standard deviation of the overlap. The columns are grouped as data gaps (first two), non-stationary Gaussian noise (next two), glitches (next two), and results (last three).

| Scheduled gap | Unscheduled gap | $S_{\text{OMS}}(f)$ | $S_{\text{ACC}}(f)$ | Glitch duration | Glitch peak value ($\text{m s}^{-2}$) | Overlaps $\geq 0.9$ | Mean | Std. Dev. |
|---|---|---|---|---|---|---|---|---|
| $\checkmark$ | $\times$ | 1x | 3x | Short | $5\times 10^{-14}$ | 81.6% | 0.919 | 0.167 |
| $\checkmark$ | $\times$ | 1x | 3x | Long | $1\times 10^{-14}$ | 80.5% | 0.921 | 0.160 |
| $\times$ | Gap every 10 hours | 1x | 2x | Short | $5\times 10^{-14}$ | 82.1% | 0.914 | 0.194 |
| $\times$ | Gap every 10 hours | 1x | 2x | Long | $1\times 10^{-14}$ | 81.8% | 0.905 | 0.216 |

Last but not least, we conducted tests mixing the three types of non-stationarities, and list in Tab. 6 the results achieved by our model under different conditions. When all of these are combined, our model still exhibits considerable extraction capability. When a long scheduled gap exists, our model can tolerate time-varying noise PSDs together with short-duration glitches of maximum peak value $\sim 5\times 10^{-14}\,\text{m s}^{-2}$ or long-duration glitches of $\sim 1\times 10^{-14}\,\text{m s}^{-2}$. When the occurrence frequency of unscheduled gaps reaches about once every 10 hours, our model can tolerate relatively weaker changes in the (residual acceleration) noise PSD combined with short-duration glitches of maximum peak value $\sim 5\times 10^{-14}\,\text{m s}^{-2}$ or long-duration glitches of $\sim 1\times 10^{-14}\,\text{m s}^{-2}$. These tests represent possible extreme cases in real science operations and measurements of space-borne GW antennas, and our model has shown high robustness and feasibility in typical signal extractions. Frequently occurring unscheduled gaps may have an important impact, predominantly due to the substantial loss of spectral information, which is further compounded by the additional anomalies that significantly degrade the model's signal extraction capability. In future work, we will try to improve the model's tolerance of these mixed situations in order to improve performance in more extreme cases, focusing in particular on unscheduled gaps and on mitigating their interplay with other anomalies.

VI Concluding remarks

In this work, we developed a deep learning model for accurate GW signal extraction against possible data anomalies or non-stationarities for space-borne GW antennas, such as LISA and LISA-like missions. The work also provides a robustness study of deep learning models in the data processing of space-borne GW missions. Our research focuses on three types of non-stationarities: gaps, glitches, and time-varying noise auto-correlations. Compared with the current state-of-the-art models, our model exhibits the same performance in signal extraction for the ideal, anomaly-free case. Confronted with the three types of non-stationarities, and even with their mixtures in some extreme cases, our model still shows considerable adaptability. It is worth noting that when all anomalies occur at the same time, the model's tolerance for these anomalies degrades compared with each stand-alone situation, and among them the unscheduled gap turns out to be the more important factor.

Deep learning provides a method for rapid signal detection in space-borne GW missions, and robustness research is key to ensuring that deep learning models are applicable in real detection scenarios. In the future, we will use this signal extraction model as a pre-processing step, an important part of the Taiji detection pipeline, to reduce the difficulty of subsequent scientific analyses such as GW signal detection and parameter estimation through robust extraction of GW signals. We plan to process longer datasets in future work and to incorporate a greater variety of realistic simulation scenarios, including various levels of SNR and types of anomalies. Additionally, we will focus on unscheduled gaps, seeking new improvements to mitigate the substantial impact of such anomalies on model performance.

Acknowledgements.
This work is supported by the National Key Research and Development Program of China No. 2021YFC2201901, No. 2021YFC2201903, No. 2020YFC2200601 and No. 2020YFC2200901.

References

  • Abbott et al. [2016a] B. P. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Gw150914: The advanced ligo detectors in the era of first discoveries, Physical review letters 116, 131103 (2016a).
  • Abbott et al. [2016b] B. P. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Observation of gravitational waves from a binary black hole merger, Physical review letters 116, 061102 (2016b).
  • Abbott et al. [2016c] B. P. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Gw151226: observation of gravitational waves from a 22-solar-mass binary black hole coalescence, Physical review letters 116, 241103 (2016c).
  • Abbott et al. [2016d] B. P. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Binary black hole mergers in the first advanced ligo observing run, Physical Review X 6, 041015 (2016d).
  • Scientific et al. [2017] L. Scientific, B. P. Abbott, R. Abbott, T. Abbott, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Gw170104: observation of a 50-solar-mass binary black hole coalescence at redshift 0.2, Physical review letters 118, 221101 (2017).
  • Abbott et al. [2017a] B. P. Abbott, R. Abbott, T. Abbott, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, V. Adya, et al., Gw170608: observation of a 19 solar-mass binary black hole coalescence, The Astrophysical Journal Letters 851, L35 (2017a).
  • Abbott et al. [2017b] B. P. Abbott, R. Abbott, T. Abbott, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. X. Adhikari, V. B. Adya, et al., Gw170814: a three-detector observation of gravitational waves from a binary black hole coalescence, Physical review letters 119, 141101 (2017b).
  • Abbott et al. [2017c] B. P. Abbott, R. Abbott, T. Abbott, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, V. B. Adya, et al., Gw170817: observation of gravitational waves from a binary neutron star inspiral, Physical review letters 119, 161101 (2017c).
  • Abbott et al. [2020a] B. Abbott, R. Abbott, T. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. Adhikari, V. Adya, C. Affeldt, et al., Gw190425: Observation of a compact binary coalescence with total mass ∼3.4 M⊙, The Astrophysical Journal 892, L3 (2020a).
  • Abbott et al. [2020b] R. Abbott, T. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. Adya, C. Affeldt, M. Agathos, et al., Gw190412: Observation of a binary-black-hole coalescence with asymmetric masses, Physical Review D 102, 043015 (2020b).
  • Abbott et al. [2020c] R. Abbott, T. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. Adhikari, V. Adya, C. Affeldt, M. Agathos, et al., Gw190521: a binary black hole merger with a total mass of 150 M⊙, Physical review letters 125, 101102 (2020c).
  • Abbott et al. [2021] R. Abbott, T. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. Adya, C. Affeldt, et al., Tests of general relativity with binary black holes from the second ligo-virgo gravitational-wave transient catalog, Physical review D 103, 122002 (2021).
  • Ezquiaga [2021] J. M. Ezquiaga, Hearing gravity from the cosmos: Gwtc-2 probes general relativity at cosmological scales, Physics Letters B 822, 136665 (2021).
  • Bozzola and Paschalidis [2021] G. Bozzola and V. Paschalidis, General relativistic simulations of the quasicircular inspiral and merger of charged black holes: Gw150914 and fundamental physics implications, Physical Review Letters 126, 041103 (2021).
  • Abbott et al. [2019] B. Abbott, R. Abbott, T. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. Adhikari, V. Adya, C. Affeldt, et al., Gwtc-1: a gravitational-wave transient catalog of compact binary mergers observed by ligo and virgo during the first and second observing runs, Physical Review X 9, 031040 (2019).
  • Abbott et al. [2020d] B. P. Abbott, R. Abbott, T. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, V. Adya, C. Affeldt, M. Agathos, et al., Prospects for observing and localizing gravitational-wave transients with advanced ligo, advanced virgo and kagra, Living reviews in relativity 23, 1 (2020d).
  • Freise and Strain [2010] A. Freise and K. Strain, Interferometer techniques for gravitational-wave detection, Living Reviews in Relativity 13, 1 (2010).
  • Amaro-Seoane et al. [2017] P. Amaro-Seoane, H. Audley, S. Babak, J. Baker, E. Barausse, P. Bender, E. Berti, P. Binetruy, M. Born, D. Bortoluzzi, et al., Laser interferometer space antenna, arXiv preprint arXiv:1702.00786  (2017).
  • Baker et al. [2019] J. Baker, J. Bellovary, P. L. Bender, E. Berti, R. Caldwell, J. Camp, J. W. Conklin, N. Cornish, C. Cutler, R. DeRosa, et al., The laser interferometer space antenna: unveiling the millihertz gravitational wave sky, arXiv preprint arXiv:1907.06482  (2019).
  • Gong et al. [2011] X. Gong, S. Xu, S. Bai, Z. Cao, G. Chen, Y. Chen, X. He, G. Heinzel, Y.-K. Lau, C. Liu, et al., A scientific case study of an advanced lisa mission, Classical and Quantum Gravity 28, 094012 (2011).
  • Hu and Wu [2017] W.-R. Hu and Y.-L. Wu, The taiji program in space for gravitational wave physics and the nature of gravity (2017).
  • Luo et al. [2016] J. Luo, L.-S. Chen, H.-Z. Duan, Y.-G. Gong, S. Hu, J. Ji, Q. Liu, J. Mei, V. Milyukov, M. Sazhin, et al., Tianqin: a space-borne gravitational wave detector, Classical and Quantum Gravity 33, 035010 (2016).
  • Dey et al. [2021] K. Dey, N. Karnesis, A. Toubiana, E. Barausse, N. Korsakova, Q. Baghi, and S. Basak, Effect of data gaps on the detectability and parameter estimation of massive black hole binaries with lisa, Physical Review D 104, 044035 (2021).
  • Baghi et al. [2022] Q. Baghi, N. Korsakova, J. Slutsky, E. Castelli, N. Karnesis, and J.-B. Bayle, Detection and characterization of instrumental transients in lisa pathfinder and their projection to lisa, Physical Review D 105, 042002 (2022).
  • Edwards et al. [2020] M. C. Edwards, P. Maturana-Russel, R. Meyer, J. Gair, N. Korsakova, and N. Christensen, Identifying and addressing nonstationary lisa noise, Physical Review D 102, 084062 (2020).
  • Force et al. [2006] M. L. D. C. T. Force, K. A. Arnaud, S. Babak, J. G. Baker, M. J. Benacquista, N. J. Cornish, C. Cutler, S. L. Larson, B. Sathyaprakash, M. Vallisneri, et al., An overview of the mock lisa data challenges, in AIP Conference Proceedings, Vol. 873 (American Institute of Physics, 2006) pp. 619–624.
  • Katz [2022] M. L. Katz, Fully automated end-to-end pipeline for massive black hole binary signal extraction from lisa data, Physical Review D 105, 044055 (2022).
  • Vallisneri [2005] M. Vallisneri, Synthetic lisa: Simulating time delay interferometry in a model lisa, Physical Review D 71, 022001 (2005).
  • Finn [1992] L. S. Finn, Detection, measurement, and gravitational radiation, Physical Review D 46, 5236 (1992).
  • Usman et al. [2016] S. A. Usman, A. H. Nitz, I. W. Harry, C. M. Biwer, D. A. Brown, M. Cabero, C. D. Capano, T. Dal Canton, T. Dent, S. Fairhurst, et al., The pycbc search for gravitational waves from compact binary coalescence, Classical and Quantum Gravity 33, 215004 (2016).
  • Zhao et al. [2023] T. Zhao, R. Lyu, H. Wang, Z. Cao, and Z. Ren, Space-based gravitational wave signal detection and extraction with deep neural network, Communications Physics 6, 212 (2023).
  • Wang et al. [2020a] H. Wang, S. Wu, Z. Cao, X. Liu, and J.-Y. Zhu, Gravitational-wave signal recognition of ligo data by deep learning, Physical Review D 101, 104003 (2020a).
  • De Luca et al. [2020] V. De Luca, G. Franciolini, P. Pani, and A. Riotto, Primordial black holes confront ligo/virgo data: current situation, Journal of Cosmology and Astroparticle Physics 2020 (06), 044.
  • Chen et al. [2022] J. Chen, A. Cazenave, C. Dahle, W. Llovel, I. Panet, J. Pfeffer, and L. Moreira, Applications and challenges of grace and grace follow-on satellite gravimetry, Surveys in Geophysics , 1 (2022).
  • Armano et al. [2009] M. Armano, M. Benedetti, J. Bogenstahl, D. Bortoluzzi, P. Bosetti, N. Brandt, A. Cavalleri, G. Ciani, I. Cristofolini, A. Cruise, et al., Lisa pathfinder: the experiment and the route to lisa, Classical and Quantum Gravity 26, 094001 (2009).
  • Collaboration et al. [2021] T. S. Collaboration, Y.-L. Wu, Z.-R. Luo, J.-Y. Wang, M. Bai, W. Bian, H.-W. Cai, R.-G. Cai, Z.-M. Cai, J. Cao, et al., Taiji program in space for gravitational universe with the first run key technologies test in taiji-1 (2021).
  • Baghi et al. [2019] Q. Baghi, J. I. Thorpe, J. Slutsky, J. Baker, T. Dal Canton, N. Korsakova, and N. Karnesis, Gravitational-wave parameter estimation with gaps in lisa: a bayesian data augmentation method, Physical Review D 100, 022003 (2019).
  • Armano et al. [2018] M. Armano, H. Audley, J. Baird, P. Binetruy, M. Born, D. Bortoluzzi, E. Castelli, A. Cavalleri, A. Cesarini, A. Cruise, et al., Beyond the required lisa free-fall performance: new lisa pathfinder results down to 20 μHz, Physical review letters 120, 061101 (2018).
  • Robson and Cornish [2019] T. Robson and N. J. Cornish, Detecting gravitational wave bursts with lisa in the presence of instrumental glitches, Physical Review D 99, 024019 (2019).
  • Armano et al. [2022] M. Armano, H. Audley, J. Baird, P. Binetruy, M. Born, D. Bortoluzzi, E. Castelli, A. Cavalleri, A. Cesarini, V. Chiavegato, et al., Transient acceleration events in lisa pathfinder data: Properties and possible physical origin, Physical Review D 106, 062001 (2022).
  • Armano et al. [2016] M. Armano, H. Audley, G. Auger, J. T. Baird, M. Bassan, P. Binetruy, M. Born, D. Bortoluzzi, N. Brandt, M. Caleno, et al., Sub-femto-g free fall for space-based gravitational wave observatories: Lisa pathfinder results, Physical review letters 116, 231101 (2016).
  • Frommknecht [2007] B. Frommknecht, Integrated sensor analysis of the GRACE mission, Ph.D. thesis, Technische Universität München (2007).
  • Sheard et al. [2012] B. Sheard, G. Heinzel, K. Danzmann, D. Shaddock, W. Klipstein, and W. Folkner, Intersatellite laser ranging instrument for the grace follow-on mission, Journal of Geodesy 86, 1083 (2012).
  • tai [2021] China’s first step towards probing the expanding universe and the nature of gravity using a space borne gravitational wave antenna, Communications Physics 4, 34 (2021).
  • Spadaro et al. [2023] A. Spadaro, R. Buscicchio, D. Vetrugno, A. Klein, D. Gerosa, S. Vitale, R. Dolesi, W. J. Weber, and M. Colpi, Glitch systematics on the observation of massive black-hole binaries with lisa, arXiv preprint arXiv:2306.03923  (2023).
  • George and Huerta [2018] D. George and E. Huerta, Deep neural networks to enable real-time multimessenger astrophysics, Physical Review D 97, 044039 (2018).
  • Gabbard et al. [2018] H. Gabbard, M. Williams, F. Hayes, and C. Messenger, Matching matched filtering with deep networks for gravitational-wave astronomy, Physical review letters 120, 141103 (2018).
  • Gabbard et al. [2022] H. Gabbard, C. Messenger, I. S. Heng, F. Tonolini, and R. Murray-Smith, Bayesian parameter estimation using conditional variational autoencoders for gravitational-wave astronomy, Nature Physics 18, 112 (2022).
  • Dax et al. [2021] M. Dax, S. R. Green, J. Gair, J. H. Macke, A. Buonanno, and B. Schölkopf, Real-time gravitational wave science with neural posterior estimation, Physical review letters 127, 241103 (2021).
  • Colgan et al. [2020] R. E. Colgan, K. R. Corley, Y. Lau, I. Bartos, J. N. Wright, Z. Márka, and S. Márka, Efficient gravitational-wave glitch identification from environmental data through machine learning, Physical Review D 101, 102003 (2020).
  • Cavaglia et al. [2019] M. Cavaglia, K. Staats, and T. Gill, Finding the origin of noise transients in LIGO data with machine learning, Communications in Computational Physics 25 (2019).
  • Razzano and Cuoco [2018] M. Razzano and E. Cuoco, Image-based deep learning for classification of noise transients in gravitational wave detectors, Classical and Quantum Gravity 35, 095016 (2018).
  • Ormiston et al. [2020] R. Ormiston, T. Nguyen, M. Coughlin, R. X. Adhikari, and E. Katsavounidis, Noise reduction in gravitational-wave data via deep learning, Physical Review Research 2, 033066 (2020).
  • Torres-Forne et al. [2016] A. Torres-Forne, A. Marquina, J. A. Font, and J. M. Ibanez, Denoising of gravitational wave signals via dictionary learning algorithms, Physical Review D 94, 124040 (2016).
  • Wei and Huerta [2020] W. Wei and E. Huerta, Gravitational wave denoising of binary black hole mergers with deep learning, Physics Letters B 800, 135081 (2020).
  • Cannon et al. [2021] K. Cannon, S. Caudill, C. Chan, B. Cousins, J. D. Creighton, B. Ewing, H. Fong, P. Godwin, C. Hanna, S. Hooper, et al., Gstlal: A software framework for gravitational wave discovery, SoftwareX 14, 100680 (2021).
  • Gondara [2016] L. Gondara, Medical image denoising using convolutional denoising autoencoders, in 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (IEEE, 2016) pp. 241–246.
  • Xie et al. [2012] J. Xie, L. Xu, and E. Chen, Image denoising and inpainting with deep neural networks, Advances in Neural Information Processing Systems 25 (2012).
  • Vincent et al. [2010] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.-A. Manzagol, and L. Bottou, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, Journal of Machine Learning Research 11 (2010).
  • Creswell and Bharath [2018] A. Creswell and A. A. Bharath, Denoising adversarial autoencoders, IEEE Transactions on Neural Networks and Learning Systems 30, 968 (2018).
  • Chiang et al. [2019] H.-T. Chiang, Y.-Y. Hsieh, S.-W. Fu, K.-H. Hung, Y. Tsao, and S.-Y. Chien, Noise reduction in ECG signals using fully convolutional denoising autoencoders, IEEE Access 7, 60806 (2019).
  • Saad and Chen [2020] O. M. Saad and Y. Chen, Deep denoising autoencoder for seismic random noise attenuation, Geophysics 85, V367 (2020).
  • Xia et al. [2017] M. Xia, T. Li, L. Liu, L. Xu, and C. W. de Silva, Intelligent fault diagnosis approach with unsupervised feature learning by stacked denoising autoencoder, IET Science, Measurement & Technology 11, 687 (2017).
  • Dasan and Panneerselvam [2021] E. Dasan and I. Panneerselvam, A novel dimensionality reduction approach for ECG signal via convolutional denoising autoencoder with LSTM, Biomedical Signal Processing and Control 63, 102225 (2021).
  • Araki et al. [2015] S. Araki, T. Hayashi, M. Delcroix, M. Fujimoto, K. Takeda, and T. Nakatani, Exploring multi-channel features for denoising-autoencoder-based speech enhancement, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2015) pp. 116–120.
  • Lai et al. [2016] Y.-H. Lai, F. Chen, S.-S. Wang, X. Lu, Y. Tsao, and C.-H. Lee, A deep denoising autoencoder approach to improving the intelligibility of vocoded speech in cochlear implant simulation, IEEE Transactions on Biomedical Engineering 64, 1568 (2016).
  • Feng et al. [2014] X. Feng, Y. Zhang, and J. Glass, Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition, in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2014) pp. 1759–1763.
  • Lu et al. [2013] X. Lu, Y. Tsao, S. Matsuda, and C. Hori, Speech enhancement based on deep denoising autoencoder, in Interspeech, Vol. 2013 (2013) pp. 436–440.
  • Chatterjee et al. [2021] C. Chatterjee, L. Wen, F. Diakogiannis, and K. Vinsen, Extraction of binary black hole gravitational wave signals from detector data using deep learning, Physical Review D 104, 064046 (2021).
  • Krizhevsky et al. [2012] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems 25 (2012).
  • Simonyan and Zisserman [2014] K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556 (2014).
  • He et al. [2016] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016) pp. 770–778.
  • LeCun et al. [2015] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436 (2015).
  • Tinto and Dhurandhar [2021] M. Tinto and S. V. Dhurandhar, Time-delay interferometry, Living Reviews in Relativity 24, 1 (2021).
  • Armstrong et al. [1999] J. Armstrong, F. Estabrook, and M. Tinto, Time-delay interferometry for space-based gravitational wave searches, The Astrophysical Journal 527, 814 (1999).
  • Babak et al. [2021] S. Babak, M. Hewitson, and A. Petiteau, LISA Sensitivity and SNR Calculations (2021), arXiv:2108.01167 [astro-ph, physics:gr-qc].
  • Luo et al. [2020] Z. Luo, Z. Guo, G. Jin, Y. Wu, and W. Hu, A brief analysis to Taiji: Science and technology, Results in Physics 16, 102918 (2020).
  • Nitz et al. [2022] A. Nitz, I. Harry, D. Brown, C. M. Biwer, J. Willis, T. Dal Canton, C. Capano, T. Dent, L. Pekowsky, A. R. Williamson, et al., gwastro/pycbc: v2.0.2 release of PyCBC, Zenodo (2022).
  • Bohé et al. [2017] A. Bohé, L. Shao, A. Taracchini, A. Buonanno, S. Babak, I. W. Harry, I. Hinder, S. Ossokine, M. Pürrer, V. Raymond, et al., Improved effective-one-body model of spinning, nonprecessing binary black holes for the era of gravitational-wave astrophysics with advanced detectors, Physical Review D 95, 044028 (2017).
  • Wang et al. [2020b] G. Wang, W.-T. Ni, W.-B. Han, S.-C. Yang, and X.-Y. Zhong, Numerical simulation of sky localization for LISA-Taiji joint observation, Physical Review D 102, 024089 (2020b).
  • Wang and Ni [2023] G. Wang and W.-T. Ni, Revisiting time delay interferometry for unequal-arm LISA and Taiji, Physica Scripta 98, 075005 (2023).
  • Diakogiannis et al. [2020] F. I. Diakogiannis, F. Waldner, and P. Caccetta, Looking for change? Roll the dice and demand attention, arXiv preprint arXiv:2009.02062 (2020).