Machine Learning and Climate Physics
1. INTRODUCTION
Machine learning (ML) has led to breakthroughs in various areas, from playing Go to text generated with large language models (LLMs) and, more recently, to weather forecasting (1–5). Different from Go and LLMs, the language scientists use to understand and simulate weather and climate has been equations rooted in fundamental physics. Physics-based equations, often differential equations, are essential to simulate systems where direct observations are limited and noisy—even more so for projections of future climates, where data are altogether unavailable (Figure 1). Recently, ML has emerged as an alternative tool for predictive modeling as well as for improving the understanding of climate physics (Figure 2). For instance, physics-free ML models such as neural networks (NNs), which are universal approximators of functions (12) and operators (13), trained on data from observations or physics-based simulations, have demonstrated a remarkable ability to perform accurate nowcasting (14) with lead times of a few hours, weather prediction (15) with lead times of several days, and El Niño forecasts with lead times of a year (16). It remains an open question whether some of the strategies and successes in these short-time predictions can be applied to improve climate projections, i.e., to estimate changes in the statistics of weather events (e.g., return periods of heat waves or tropical cyclones) in the next decades, centuries, and beyond.

Model: a representation of a system to make predictions; can be physics-based, ML-based, or coupled

Neural network (NN): function approximators parameterized by function operations and parameters γ that are optimized to minimize specified cost functions L

Simulation: physics-based models solved numerically
1.1. Weather Versus Climate: Nonstationarity
Emulator: a subset of models that fit the data, bypassing solving physics-based equations

The success story of ML for prediction (Section 3.2) has been primarily showcased in weather forecasting (17). Following initial attempts starting in 2019 (e.g., 18–24), by 2022–2023 ML weather models (often called emulators; Section 3.2) achieved, at a fraction of the computational cost, similar or better forecast skill than state-of-the-art physics-based weather prediction models (e.g., 1–5, 25). This success has generated excitement about using ML to improve climate projections as well. Yet climate projections involve major additional challenges (26, 27). In weather forecasting, we have a constant stream of real-time and historical data for training ML-based emulators (lower right of Figure 1a) to make predictions for a few weeks, where the statistics can be assumed to be stationary. In climate, we are often interested in predicting the climate's forced
response to changes in greenhouse gases in the atmosphere, leading to nonstationarity (28), e.g.,
climate with different mean and variability. ML models are not suited to predict the behavior
of a system substantially different from the one they have been trained on. Yet we simply do
not have observational data for the future (i.e., lower left corner in Figure 1a) to train and vali-
date future predictions; this is an issue for both ML and physics-based models but one expects
the fundamental laws of physics to hold in the future as well. We summarize the nonstation-
ary challenge and potential solutions, such as incorporating physics constraints, in Section 4.2.
Furthermore, the long-term prediction of future climate involves interactions between the atmosphere, ocean, cryosphere, land, and biosphere, which make the problem more challenging than short-term weather forecasting.
2.2.1. Parametric estimation θ. Given a discrete dataset of states measured at discrete times t_i, i.e., {x(t_i), y(t_i), z(t_i)}_{i=1}^N, parametric estimation refers to predicting the free parameters θ = [a, b, c] when the functional form of the mathematical model f(z(t), θ) is known. See Figure 2d.
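To make this concrete, the sketch below recovers θ = [a, b, c] for the classic Lorenz-63 system (47) by least-squares fitting of a simulated trajectory. This is a minimal illustration of ours, assuming clean, densely sampled data; methods such as ensemble Kalman inversion are designed for the noisy, sparse case.

```python
# Minimal sketch of parametric estimation theta = [a, b, c] for the
# Lorenz-63 system, assuming the functional form f(z, theta) is known
# and a trajectory {x(t_i), y(t_i), z(t_i)} has been observed.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def lorenz(t, state, a, b, c):
    x, y, z = state
    return [a * (y - x), x * (b - z) - y, x * y - c * z]

# Synthetic "observations" generated with the true parameters.
theta_true = (10.0, 28.0, 8.0 / 3.0)
t_obs = np.linspace(0.0, 2.0, 201)
obs = solve_ivp(lorenz, (0.0, 2.0), [1.0, 1.0, 1.0], args=theta_true,
                t_eval=t_obs, rtol=1e-8).y

def residual(theta):
    # Mismatch between the observed trajectory and the model trajectory
    # integrated with the candidate parameters theta.
    pred = solve_ivp(lorenz, (0.0, 2.0), [1.0, 1.0, 1.0], args=tuple(theta),
                     t_eval=t_obs, rtol=1e-8).y
    return (pred - obs).ravel()

theta_fit = least_squares(residual, x0=[8.0, 30.0, 2.0]).x
print(theta_fit)  # should be close to (10, 28, 8/3)
```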
2.2.2. State estimation z. State estimation involves predicting the state variables z(t) given the mathematical model f(z(t), θ), model parameters θ, and data {x(t_i), y(t_i), z(t_i)}_{i=1}^N. Estimating states is particularly useful for data interpolation when the available data are sparse in time or space, for data denoising, or for inversion when the predicted state is not measurable (e.g., Figure 2f) and, thus, completely unavailable in the data library, e.g., predicting z(t) with data of {x(t_i), y(t_i)}_{i=1}^N.
increasingly recognized (62–64), including using ML to correct model error in data assimilation (65) and emulation
of a dynamical system (66).
A GPR (Gaussian process regression) can serve as a cheap method to emulate the prediction of interest z as a function of θ. Sampling the GPR emulator with Markov chain Monte Carlo enables substantially faster UQ (uncertainty quantification) of the predictions z resulting from the plausible range of θ. GPR has also been used directly for calibrating parameters with UQ in Earth system models (57).
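The sketch below illustrates this emulate-then-sample idea on a toy problem: a GPR is trained on a handful of expensive forward-model runs, and a simple Metropolis sampler then queries only the cheap emulator. The toy forward model, noise level, and step size are our illustrative assumptions, not from the cited work.

```python
# Minimal sketch: fit a GPR emulator of a scalar prediction z as a
# function of a parameter theta, then sample theta with random-walk
# Metropolis using the emulator in place of the expensive model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
forward = lambda theta: np.sin(3 * theta) + 0.5 * theta  # stand-in for an expensive model

# Train the emulator on a handful of "expensive" forward-model runs.
theta_train = np.linspace(0.0, 2.0, 8).reshape(-1, 1)
z_train = forward(theta_train).ravel()
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(theta_train, z_train)

# Metropolis sampling of theta given an "observed" z; the likelihood
# evaluates only the cheap GPR emulator.
z_obs, sigma = forward(np.array([[1.3]])).item(), 0.1
log_post = lambda th: -0.5 * ((gpr.predict(np.array([[th]]))[0] - z_obs) / sigma) ** 2
samples, th = [], 1.0
for _ in range(5000):
    prop = th + 0.1 * rng.standard_normal()
    if np.log(rng.random()) < log_post(prop) - log_post(th):
        th = prop
    samples.append(th)
print(np.mean(samples[1000:]), np.std(samples[1000:]))  # posterior mean, spread
```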
2.2.4.2. Physics-informed neural networks (PINNs). The use of PINNs (58, 59) for planetary-
scale geophysical flow problems has started to emerge in the past few years. Introduced by Raissi
et al. (58), a PINN is a differentiable solver for partial differential equations (PDEs) that is particu-
larly useful for inverse problems involving sparse-data inference, superresolution, data denoising,
and state estimation z in data assimilation (see the sidebar titled Inverse Problem and Data Assimilation). Unlike classical ML, in which the cost function typically involves only data, a PINN encodes physics-based equations directly in the cost function (Figure 3a).
Throughout the training iterations, the optimizer identifies the best ML-parameterized states z = NN(x, t, γ) that are consistent with both the data and the governing equations. In the small-
data regime, without evaluating the NN-parameterized z against known physical laws (such as
conservation of mass, momentum, and energy), the ML predictions can be physically inconsis-
tent and nonextrapolatable beyond the available observational data (e.g., deviation from truth in
Figure 2g). In contrast, by incorporating PDEs, PINNs can achieve both physics-informed data
interpolation and extrapolation, as demonstrated by the examples in Figure 3, which cannot be
achieved by ML models trained with observational data alone.
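As a minimal sketch of this idea (our toy example, not the hurricane or ice-shelf setups below), the snippet fits states z(t) to sparse observations while penalizing the residual of an assumed governing equation, here dz/dt = -z, at collocation points; derivatives are computed exactly by automatic differentiation, as in Figure 3a.

```python
# Minimal PINN sketch: combine a data loss on sparse observations with
# an equation loss for the assumed ODE dz/dt = -z (a stand-in for the
# governing PDEs). Both terms enter one differentiable cost function.
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
t_data = torch.tensor([[0.0], [1.0], [2.0]])   # sparse observation times
z_data = torch.exp(-t_data)                    # observed z(t) = e^{-t}
t_coll = torch.linspace(0.0, 3.0, 100).reshape(-1, 1).requires_grad_(True)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    opt.zero_grad()
    # Data loss: match the sparse observations.
    loss_data = ((net(t_data) - z_data) ** 2).mean()
    # Equation loss: residual dz/dt + z = 0 at collocation points, with
    # dz/dt computed exactly by automatic differentiation.
    z = net(t_coll)
    dzdt = torch.autograd.grad(z, t_coll, torch.ones_like(z), create_graph=True)[0]
    loss_eq = ((dzdt + z) ** 2).mean()
    (loss_data + loss_eq).backward()
    opt.step()
```

Because the equation loss constrains z between and beyond the three observation times, the fit extrapolates in a way a purely data-driven fit would not.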
Figure 3b,c demonstrates the applications of PINNs on observations, ranging from estimat-
ing the initial conditions z(x, t = 0) of hurricanes for subsequent forecasts (67) to inferring the
nonmeasurable viscosity structure z(x) of Antarctic ice shelves (68). Both examples fall within
the small-data regime in the upper left corner of Figure 1a, where incorporating knowledge of
PDEs becomes crucial for solving the inverse problems; the PINN-reconstructed wind field z(x, t )
(Figure 3b) involves only sparse observations of wind velocity itself as training data, obtained from
measurements by hurricane hunter planes and dropsondes. The PINN prediction of ice viscosity
is achieved without any observations of viscosity in the training data (Figure 3c); it relies solely
on equations and other observable states (velocity and thickness fields) as training data. Thus,
both examples involve substantial extrapolation beyond the sparse observational data, i.e., limited
velocity data and no viscosity data.
As long as the same data and physics-based equations are used to solve the inverse problem, the predictions generated by properly trained PINNs are as trustworthy as those produced by established data assimilation methods. Because PINNs leverage graphics processing units (GPUs) and differentiable modeling to infer accurate initial conditions without the ensembles of forward model runs used in ensemble-based data assimilation methods (67) (see Supplemental Text, Section II for a brief comparison), they require fewer computational resources.
(Figure 3a: Schematic of a physics-informed neural network. The inputs are coordinates (x, t) and the outputs are states z; training data are injected through a data loss, and physical consistency is checked through an equation loss against the mathematical models (PDEs). Derivatives in the equation are calculated exactly using automated differentiation, and the cost function L(γ) is differentiable with respect to the state variables z(γ) and the trainable parameters γ in the ML model.)
(Figure 3b: Sparse-data inference, e.g., reconstructing initial conditions for hurricane forecasts: sparse wind-speed observations (m s−1) from Hurricane Ida and the PIML-reconstructed wind fields as functions of horizontal position (km) and pressure level (hPa).)
2.2.4.3. Equation discovery. Existing equations f (z, θ) describing the numerous processes in
the climate system, particularly the SGS processes, are far from complete. Equation discovery,
which outputs equations that are most consistent with data, has been used to tackle this prob-
lem. Inspired by earlier symbolic regression algorithms for distilling physical laws from data (72),
sparse identification of nonlinear dynamics (SINDy) (73) has emerged as a widely used method
for discovering f (z, θ) from data of the states z(t ). It demonstrates the power of sparse regres-
sion for learning the most relevant terms in the prescribed function library that describes the
data. To learn the correct f (z, θ), SINDy requires sampled data of both z(t ) and dz(t )/dt. For
many climate problems these state measurements are sparse and noisy, or entirely unavailable.
Schneider et al. (74) showed that time-averaged statistics of the states z(t ), which are available for
the climate system, can be sufficient to recover both the functional form of f (z, θ) and the noise
level of the data using sparse regression combined with EKI. Sparse EKI is robust to noisy data
and was successfully implemented to recover the Lorenz 96 equations (74). Other approaches for
equation discovery from data assimilation increments (75, 76) and from partial observations (77),
motivated by climate problems, have been proposed too.
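As an illustration of the SINDy workflow (73), the sketch below recovers the Lorenz-63 equations from a simulated trajectory using the pysindy package. It assumes clean, densely sampled states, which is precisely the requirement that sparse EKI (74) relaxes for climate problems.

```python
# Minimal SINDy sketch: recover f(z, theta) for Lorenz-63 from a densely
# sampled, noise-free trajectory via sparse regression over a prescribed
# polynomial function library. Assumes the pysindy package is installed.
import numpy as np
import pysindy as ps
from scipy.integrate import solve_ivp

def lorenz(t, s):
    x, y, z = s
    return [10.0 * (y - x), x * (28.0 - z) - y, x * y - (8.0 / 3.0) * z]

dt = 0.002
t = np.arange(0.0, 10.0, dt)
X = solve_ivp(lorenz, (t[0], t[-1]), [-8.0, 8.0, 27.0],
              t_eval=t, rtol=1e-10).y.T

# Sparse regression (sequentially thresholded least squares) selects the
# few relevant terms; dz/dt is approximated from the data internally.
model = ps.SINDy(feature_library=ps.PolynomialLibrary(degree=2),
                 optimizer=ps.STLSQ(threshold=0.1))
model.fit(X, t=dt)
model.print()  # should print the three Lorenz equations
```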
Arguably the most successful example of the application of equation discovery for climate
physics so far has been the learning of an ocean mesoscale SGS parameterization (78). Trained on
high-resolution simulation data, Zanna & Bolton (78) showed that Bayesian linear sparse regres-
sion with relevance vector machines identifies relevant terms in the prescribed function library to
discover the closed-form equations of Π(z̄) (defined in Figure 2) for eddy momentum and temperature forcing. The closed-form equation is consistent with an analytically derivable physics-based
model (79, 80). As discussed later in Section 3.1, SGS parameterizations (Figure 2) are essential for
improving the accuracy of computationally feasible low-resolution climate simulations. Although
black-box NNs have also shown promise for developing data-driven SGS parameterizations
(Section 3.1), the significant interest in equation discovery stems from their better generalization
to future climates and their interpretability (upper side of Figure 1b).
Inspired by early work on symbolic regression (72, 81), the symbolic genetic algorithm (82) was
developed to discover PDEs without the need to predetermine a function library. It uses a binary
tree to parameterize common mathematical operations (e.g., addition, multiplication, derivative,
division) and finds the correct operations such that the discovered equation matches the data.
In climate applications, genetic algorithms have been used for finding equations for cloud cover
parameterization (83) and ocean parameterization (84).
Initial value problem: a forward model f(z) with predictions z(t) subject to the initial conditions z(t_0) specified at time t_0

Boundary value problem: a forward model f(z) with predictions z(t) subject to the boundary conditions z(x_0, t), which can be time-evolving, at the boundaries x_0

…for example, between SGS parameterizations of ocean and atmosphere boundary layer turbulence that interact through air–sea fluxes. Training is typically done on subcomponents of the full climate system because it is not computationally feasible to run global climate simulations that fully resolve all SGS processes and their interactions. Concurrent observations of different SGS processes are also limited. As a result, interactions among a number of individually trained/calibrated data-driven parameterizations can lead to inaccurate or even unstable global simulations. This is an area in need of practical advancements.

3.2. Climate Emulators

"Emulator" refers to several types of tools in the climate science literature. In general, an emulator is trained to mimic the data, from physics-based simulations or observations, to substantially reduce the computational cost of producing new climate predictions, e.g., for other climate conditions within the distribution of the training data.
Emulators can be used to interpolate the projections from expensive climate simulations, making projections across different emission scenarios accessible without rerunning the simulations. Earlier use of ML for emulators followed the successful approach of traditional
pattern-scaling emulators (103, 104), which, for example, predict the change in statistics of vari-
ables of interest (e.g., regional annual-mean surface temperature or the return period of extreme
events at a later time) given a small set of inputs (e.g., year, greenhouse gas forcing, global mean sur-
face temperature). Using ML techniques (e.g., GPR, NN), emulators such as ClimateBench (105)
have been employed to estimate the climate impacts of anthropogenic emissions annually up
to 2100. However, it remains to be demonstrated that their skill is superior to that of pattern-
scaling emulators, i.e., emulators that regress regional temperature on global mean temperature
or cumulative emissions.
Although the aforementioned emulators can predict aggregated statistics within an often large
window of length scales and timescales, another type of emulator has emerged in recent years
with the aim of predicting the evolution of the climate system at fine spatiotemporal scales. These
spatiotemporal emulators leverage the success of ML-based weather forecast models, which are
physics free and trained solely on reanalysis data (106) (spanning 1979–present; see the sidebar
titled Reanalysis). Recent ML-based weather forecast models [e.g., FourCastNet (2), Pangu (3),
GraphCast (4)] are time-stepping algorithms that solve the initial value problem of predicting
the state z(t ) of the global atmosphere forward in time (from ti to ti+1 , then from ti+1 to ti+2 ,
and so on; Figure 2i). They exhibit comparable or even better skill than the best physics-based
weather prediction models for lead times of up to around 10 days (4). However, weather and
climate predictions are different problems. The former is an initial value problem, whereas the
latter is more akin to a boundary value problem in the sense that the focus is on how external
boundary conditions impact the system over longer periods of time.
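Schematically, such a time-stepping emulator is just a learned map applied autoregressively. In the sketch below, a linear layer stands in for the actual deep architectures of FourCastNet, Pangu, or GraphCast; it is an illustration of the initial value formulation, not their real interfaces.

```python
# Minimal sketch of the time-stepping (initial value) formulation used
# by ML weather emulators: a learned map advances the state from t_i to
# t_{i+1}, and forecasts are produced by applying it autoregressively.
import torch

step = torch.nn.Linear(256, 256)  # stand-in for a deep emulator architecture

def rollout(z0, n_steps):
    states, z = [], z0
    for _ in range(n_steps):
        z = step(z)              # z(t_i) -> z(t_{i+1})
        states.append(z)
    return torch.stack(states)

z0 = torch.randn(256)                # initial atmospheric state (flattened)
forecast = rollout(z0, n_steps=40)   # e.g., 40 six-hour steps = 10 days
```

Small one-step errors compound under this repeated composition, which is why stability and small-scale fidelity (Section 4.1) matter so much for climate-length rollouts.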
For climate predictions, atmospheric spatiotemporal emulators are built to solve boundary
value problems that integrate the global atmospheric state given external forcings (e.g., radia-
tive forcing) and time-evolving boundary conditions (e.g., sea-surface and land temperature) for
decades or centuries. The AI2 Climate Emulator (ACE) (10, 11) is a promising example of such
a spatiotemporal emulator trained on physics-based simulations. Similar work on oceanic spa-
tiotemporal emulators (107, 108) suggests that coupled climate emulators might start to emerge
as well.
ML spatiotemporal emulators have shown even more promise in simulating components of the
climate system whose physics are less well understood. For the cryosphere, deep learning-based
emulators for seasonal sea ice prediction have been found to outperform state-of-the-art physics-
based dynamical models in terms of forecast accuracy (109–111), with a lead time of a few months.
Some of these sea ice emulators capture atmospheric-ice-ocean interactions by training with ap-
propriate climate variables (109, 111). Because these emulators were trained directly on sea ice ob-
servational data, they learn the atmospheric-ice-ocean interactions that are incompletely param-
eterized in the physics-based dynamical models, thereby correcting the model’s structural error.
3.3.1. Conservation laws. Various methods exist for incorporating conservation laws into ML
models, such as embedding them in the loss function [e.g., PINNs (58); Section 2.2.4.2] or other
components of the ML architecture. For instance, Beucler et al. (115) demonstrated that conserv-
ing quantities like mass and energy can be enforced as hard constraints within the NN architecture.
Their architecture-constrained NN, trained as an SGS parameterization of moist convection,
significantly improved simulated climate.
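As a minimal sketch of a hard architectural constraint (a generic projection in the spirit of, but much simpler than, the constraints layer of Beucler et al.), the network below closes a toy column budget, Σ z_out = Σ z_in, exactly for every input by construction.

```python
# Minimal sketch of enforcing a linear conservation law as a hard
# architectural constraint: project the raw NN output onto the subspace
# where the column budget (e.g., mass or energy) closes exactly.
import torch

class ConservingNet(torch.nn.Module):
    def __init__(self, n=32):
        super().__init__()
        self.body = torch.nn.Sequential(torch.nn.Linear(n, 64),
                                        torch.nn.Tanh(),
                                        torch.nn.Linear(64, n))

    def forward(self, z_in):
        raw = self.body(z_in)
        # Subtract the budget violation uniformly so conservation holds
        # exactly, for every input, regardless of the learned weights.
        violation = (raw.sum(dim=-1, keepdim=True)
                     - z_in.sum(dim=-1, keepdim=True))
        return raw - violation / raw.shape[-1]

net = ConservingNet()
z_in = torch.randn(8, 32)
z_out = net(z_in)
assert torch.allclose(z_out.sum(-1), z_in.sum(-1), atol=1e-5)
```

Unlike a soft penalty in the loss, the constraint here cannot be violated at inference time, even far outside the training distribution.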
3.3.2. Symmetries and equivariances. Incorporating symmetries and equivariances has also shown advantages, particularly in the small-data regime. For a variable x, the nonlinear function g is equivariant under a transformation A if Ag(x) = g(Ax). For example, by incorporating various symmetries (e.g., scale equivariance, rotational equivariance) into convolutional neural networks (CNNs) trained on turbulence data from previous time steps, the CNNs generalize well to future time steps (8) (Figure 2h). Enforcing rotational equivariance through capsule NNs, CNNs, or customized latent spaces has improved ML-based predictions of large-scale weather patterns (116) and turbulent flows (8, 117).

Several challenges still need to be addressed. For example, ML weather forecast models have been shown to produce unstable or unphysical atmospheric circulations beyond 10 days, poorly represent small-scale processes (15, 112), and fail to reproduce the chaotic behavior of weather (113). Potential solutions to address these challenges include incorporating physical constraints into ML models (Section 3.3) and developing a deeper understanding of the different sources of error in these models (Sections 4.1 and 4.2).
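The property Ag(x) = g(Ax) can be tested numerically. The sketch below (our illustration, not code from the cited works) measures how far an ordinary, unconstrained CNN is from being equivariant to 90° rotations, the gap that equivariant architectures are designed to eliminate.

```python
# Minimal sketch of testing rotational equivariance, Ag(x) = g(Ax), for
# a CNN g and 90-degree rotations A. An unconstrained CNN is translation
# equivariant but not rotation equivariant, so the gap below is nonzero.
import torch

g = torch.nn.Sequential(torch.nn.Conv2d(1, 8, 3, padding=1), torch.nn.ReLU(),
                        torch.nn.Conv2d(8, 1, 3, padding=1))
x = torch.randn(1, 1, 32, 32)
rot = lambda t: torch.rot90(t, k=1, dims=(-2, -1))  # A: rotate by 90 degrees
gap = (g(rot(x)) - rot(g(x))).abs().max()
print(f"equivariance gap: {gap.item():.3f}")  # ~0 only for an equivariant g
```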
3.3.3. Spectrum information. Including information about the Fourier spectrum of geophys-
ical turbulence in the loss function has been shown to aid in learning small scales and reducing
spectral bias, thereby improving the stability and physical consistency of ML-based emulators
(112). See Section 4.1 for further discussions.
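A minimal sketch of such a spectrum-informed loss follows; this is our construction in the spirit of Reference 112, not its exact loss. The mismatch of the zonal energy spectra, compared on a log scale so the low-variance small scales still contribute, is added to the usual mean-squared error.

```python
# Minimal sketch of adding Fourier-spectrum information to the loss:
# penalize the mismatch between the zonal energy spectra of prediction
# and truth, alongside the standard MSE term.
import torch

def zonal_spectrum(field):
    # Mean squared amplitude per zonal wave number; field: (batch, ny, nx).
    coeffs = torch.fft.rfft(field, dim=-1)
    return (coeffs.abs() ** 2).mean(dim=(0, 1))

def spectral_loss(pred, truth, weight=0.1):
    mse = ((pred - truth) ** 2).mean()
    # Compare log spectra so the high-wave-number tail, which holds
    # little variance, still contributes to the gradient.
    spec_term = ((torch.log(zonal_spectrum(pred) + 1e-12)
                  - torch.log(zonal_spectrum(truth) + 1e-12)) ** 2).mean()
    return mse + weight * spec_term

pred = torch.randn(4, 64, 128, requires_grad=True)
truth = torch.randn(4, 64, 128)
spectral_loss(pred, truth).backward()  # differentiable end to end
```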
NNs learn large scales much more easily than small scales, which can pose challenges for multiscale climate problems.
problems. Figure 4a shows the spectrum of the one-time-step (∼6 or 12 h) prediction of upper-
level wind from a few state-of-the-art ML weather emulators. Although the predictions exhibit the
correct spectrum for up to zonal Fourier wave numbers of ∼30 (scales of 40,000 km to 500 km),
smaller scales (from 500 km to 25 km) are poorly learned. It is noteworthy that these predictions
all boast ∼99% accuracy based on anomaly pattern correlation. These errors in small
scales grow to larger scales after 10 days. Eventually, these predictions either blow up or become
unphysical (112). The same behavior is observed in simpler tasks such as reconstruction of at-
mospheric boundary layer turbulence (Figure 4b), time-stepping prediction for quasigeostrophic
(QG) turbulence (Figure 4c) and its reconstruction (Figure 4d), or even a simple 1D function
(Figure 4f ). Promising solutions include Fourier regularization of the loss function (Figure 4c,
modified from Reference 112) and random Fourier features (134, 135) (Figure 4d, modified from
Reference 76). Superposing small NNs (132, 133) via multistage NNs also reduces spectral bias substantially compared with vanilla NNs (Figure 4e, modified from Reference 132).
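Random Fourier features are simple to add in practice: inputs are passed through fixed random sinusoids so that high wave numbers are represented from initialization onward. The sketch below is our illustration (the frequency scale is a tuning choice) applied to a sharp 1D target of the kind that defeats vanilla NNs (Figure 4f).

```python
# Minimal sketch of random Fourier features (cf. 134, 135): map the
# input x through fixed random sinusoids before the MLP to mitigate
# spectral bias toward large scales.
import torch

class RFFNet(torch.nn.Module):
    def __init__(self, n_features=128, scale=10.0):
        super().__init__()
        # Fixed (untrained) random frequencies; `scale` sets the band of
        # wave numbers emphasized and is a tuning choice.
        self.register_buffer("B", scale * torch.randn(1, n_features))
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(2 * n_features, 64), torch.nn.Tanh(),
            torch.nn.Linear(64, 1))

    def forward(self, x):  # x: (batch, 1)
        proj = 2 * torch.pi * x @ self.B
        feats = torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)
        return self.mlp(feats)

x = torch.linspace(0, 1, 256).reshape(-1, 1)
y = torch.sin(50 * x)               # sharp, high-wave-number target
model = RFFNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
```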
4.1.3. Rare extreme events. Another example of epistemic error relates to rare events (e.g.,
heat waves, hurricanes, ice-shelf collapse, ocean circulation collapse). Predicting these rare events
is crucial, but they are often underrepresented or entirely absent from the training set, lead-
ing to significant data imbalance. Addressing data imbalance and improving the learning of rare
events is an active area of research. Common approaches such as resampling (136, 137), using
weighted loss function (123, 138), and learning the causal relationship that drives the rare behav-
ior (124) have shown promise. Innovative approaches, such as combining ML-based emulators
with mathematical tools for rare events (139, 140), may enable the learning of the rarest events.
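Of the common approaches, a weighted loss is the simplest to sketch: rare, extreme samples are upweighted so that the abundant typical samples do not dominate the gradient. The threshold and weight below are illustrative assumptions, not values from the cited studies.

```python
# Minimal sketch of a weighted loss for data imbalance: upweight rare,
# extreme samples (e.g., heat-wave anomalies) relative to typical ones.
import torch

def weighted_mse(pred, target, threshold=2.0, rare_weight=10.0):
    # Samples whose target magnitude exceeds `threshold` receive
    # `rare_weight` times more weight in the loss.
    w = torch.where(target.abs() > threshold,
                    torch.full_like(target, rare_weight),
                    torch.ones_like(target))
    return (w * (pred - target) ** 2).sum() / w.sum()

pred = torch.randn(1000, requires_grad=True)
target = 3.0 * torch.randn(1000)    # heavy-ish tails as a stand-in
weighted_mse(pred, target).backward()
```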
(Figure 4 panel residue; recoverable content: legends for PanguWeather, FourCastNet, FourCastNet-V2, GraphCast, and ERA5 (truth), with a −5/3 reference slope; train/test and unconditional/conditional spectra versus wave number; panel e, 2D Navier–Stokes with a multistage NN, and panel f, a sharply peaked 1D function, each comparing ground truth, one NN, and a multistage NN versus wave number or frequency.)
Figure 4
The spectral bias of neural networks (129, 130) can be widely observed in climate applications and can cause
major challenges such as instabilities. (a) State-of-the-art ML-based weather emulator predictions after one
time step (based on results from Reference 112, courtesy of Qiang Sun). (b) Atmospheric boundary layer
turbulence reconstruction (131). (c,d) QG turbulence prediction after one time step (112) (data provided by
Ashesh Chattopadhyay) or reconstruction (based on results from Reference 76, data provided by Rambod
Mojgani). (e) Reconstruction of a 2D flow field (132) using the multistage NN (133). ( f ) Reconstruction of a
1D function with sharp peaks, difficult to fit with vanilla NNs (data provided by Yongji Wang). Panels a, b,
and e adapted with permission from Reference 112 (CC BY 4.0), Reference 131 (CC BY 4.0), and
Reference 132 (CC BY 4.0), respectively. Panels c and d adapted with permission from Reference 76
(CC BY 4.0). Abbreviations: 1D, one-dimensional; 2D, two-dimensional; AI, artificial intelligence;
ERA5, the 5th generation ECMWF atmospheric reanalysis of the global climate; ML, machine learning;
NN, neural network; QG, quasigeostrophic; RFF, random Fourier feature.
This study showed promising offline results of data-driven parameterization of moist convection
across a range of cold to warm climates once temperature, relative humidity, and latent heating
were properly transformed. This approach leverages physical insight into the climate system. The main challenge is finding the appropriate transformations, which are easier to find for thermodynamics than for dynamics (e.g., wind).
of climates are emerging, providing a valuable source of training and retraining data (143–145).
However, the range of scenarios explored with Earth system models is typically restricted
to plausible future scenarios. As ML techniques become more mainstream in climate studies, it
will be important to simulate a wider range of future climates to expand the range of training data,
especially in extreme regimes.
(Figure 5 panel residue; recoverable content: relevance-propagation heatmaps with relevances R_i, R_j over input fields such as depth H and sea-surface height (SSH), on a color scale from negative to positive relevance; panel c, the state z and SGS term Π from QG turbulence simulations, with k_x–k_y spectra.)
Figure 5
Understanding what ML learns. Panels a and b illustrate how the THOR method ensures that the input data necessary for the ML model to demonstrate its learning of physics are present (120). (a, top) Training an NN to predict sections of the ocean dominated by different balances in equation terms describing the flow (colors in output) using related surface fields (e.g., wind). (a, bottom) Looking backward using XAI to see which parts of the input the NN treated as relevant (blue, not relevant; red, relevant). (b) For the pink section in the North Atlantic, only two equation terms are relevant (red boxes), and the relevances show conformance in the two maps below, e.g., where the mountain range (closed black lines) in depth (H) gives negative relevance. (c) The two leftmost panels show examples of the state z and SGS term Π (defined in Figure 2) from two setups of geophysical turbulence, separated by the dotted line, that differ in forcing scale and dynamics. The right-side panels show examples of the Fourier spectra of convolutional kernels of NNs trained as ML-based SGS parameterizations NN(z̄, γ) = Π. The Fourier analysis shows the emergence of low-pass, high-pass, and band-pass Gabor filters (141).
Panels a and b adapted with permission from Reference 120 (CC BY 4.0). Panel c adapted with permission from Reference 141
(CC BY 4.0). Abbreviations: ML, machine learning; NN, neural network; QG, quasigeostrophic; SGS, subgrid-scale; SSH, sea-surface
height; THOR, Tracking global Heating with Ocean Regimes; XAI, explainable artificial intelligence.
The increasing amount of observational data offers exciting opportunities for both equation and knowledge discovery to improve the fundamental understanding of climate physics. On the other hand, ML can be used as a tool to improve simulations. ML models can be coupled with traditional physics-based models and used to parameterize processes for which we do not yet have accurate process-level models to describe the system (e.g., sea ice rheology and cloud microphysics).
Progress will benefit from collaborations among the ML, climate sciences, and mathematics communities. For example, the numerical analysis of differential equations and the advent of digital computers played
a key role in starting the field of numerical weather and climate prediction (163). Developing
similar rigorous tools, by closely combining methods from climate physics, ML theory, and nu-
merical analysis, can potentially help with building stable, accurate, and trustworthy ML-based
models.
DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that
might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
We thank Mingjing Tong, Oliver Dunbar, Jinlong Wu, and Duncan Watson-Parris for their
helpful discussions regarding data assimilation methods, GPR with EKI, sparse learning, and
emulators, respectively. We are grateful for the valuable general feedback from Andre Souza
and Janni Yuval on this article. We also thank Qiang Sun, Ashesh Chattopadhyay, Rambod
Mojgani, and Yongji Wang for helping to remake the spectral bias figure. C.-Y.L., R.F., P.H., and
A.S. acknowledge the National Science Foundation for funding via grants DMS-2245228, AGS-
2426087, OAC-2005123, and OAC-2004492, respectively. R.F., P.H., and A.S. also acknowledge
funding from Schmidt Sciences through the Virtual Earth System Research Institute.
LITERATURE CITED
1. Keisler R. 2022. arXiv:2202.07575 [physics.ao-ph]
2. Pathak J, Subramanian S, Harrington P, Raja S, Chattopadhyay A, et al. 2022. arXiv:2202.11214
[physics.ao-ph]
3. Bi K, Xie L, Zhang H, Chen X, Gu X, Tian Q. 2023. Nature 619:533–38
4. Lam R, Sanchez-Gonzalez A, Willson M, Wirnsberger P, Fortunato M, et al. 2023. Science 382:1416–21
5. Chen L, Zhong X, Zhang F, Cheng Y, Xu Y, et al. 2023. NPJ Clim. Atmos. Sci. 6:190
6. Sonnewald M, Wunsch C, Heimbach P. 2019. Earth Space Sci. 6:784–94
7. Xiao Q, Balwada D, Jones CS, Herrero-González M, Smith KS, Abernathey R. 2023. J. Adv. Model. Earth
Syst. 15:e2023MS003709
8. Wang R, Walters R, Yu R. 2021. Paper presented at the International Conference on Learning
Representations (ICLR) 2021, Virtual Event, Austria, May 3–7
9. Yuval J, O’Gorman PA. 2020. Nat. Commun. 11:3295
10. Watt-Meyer O, Dresdner G, McGibbon J, Clark SK, Henn B, et al. 2023. arXiv:2310.02074 [physics.ao-ph]
11. Duncan JP, Wu E, Golaz JC, Caldwell PM, Watt-Meyer O, et al. 2024. Mach. Learn. Comput.
1(3):e2024JH000136
12. Hornik K, Stinchcombe M, White H. 1989. Neural Netw. 2:359–66
20. Weyn JA, Durran DR, Caruana R. 2019. J. Adv. Model. Earth Syst. 11:2680–93
21. Chattopadhyay A, Nabizadeh E, Hassanzadeh P. 2020. J. Adv. Model. Earth Syst. 12:e2019MS001958
22. Rasp S, Dueben PD, Scher S, Weyn JA, Mouatadid S, Thuerey N. 2020. J. Adv. Model. Earth Syst.
12:e2020MS002203
23. Rasp S, Thuerey N. 2021. J. Adv. Model. Earth Syst. 13:e2020MS002405
24. Clare MCA, Jamil O, Morcrette CJ. 2021. Q. J. R. Meteorol. Soc. 147:4337–57
25. Price I, Sanchez-Gonzalez A, Alet F, Ewalds T, El-Kadi A, et al. 2023. arXiv:2312.15796 [cs.LG]
26. Watson-Parris D. 2021. Philos. Trans. R. Soc. A 379:20200098
27. Schneider T, Behera S, Boccaletti G, Deser C, Emanuel K, et al. 2023. Nat. Climate Change 13:887–89
28. Palmer TN. 1999. J. Climate 12:575–91
29. Held IM. 2005. Bull. Am. Meteorol. Soc. 86:1609–14
30. Walker G. 1928. Q. J. R. Meteorol. Soc. 54:79–87
31. Corti S, Molteni F, Palmer T. 1999. Nature 398:799–802
32. Thompson DW, Solomon S. 2002. Science 296:895–99
33. Monahan AH, Fyfe JC, Ambaum MH, Stephenson DB, North GR. 2009. J. Climate 22:6501–14
34. Page J, Brenner MP, Kerswell RR. 2021. Phys. Rev. Fluids 6:034402
35. Lusch B, Kutz JN, Brunton SL. 2018. Nat. Commun. 9:4950
36. Shamekh S, Lamb KD, Huang Y, Gentine P. 2023. PNAS 120:e2216158120
37. Souza AN. 2023. arXiv:2304.03362 [physics.flu-dyn]
38. Geogdzhayev G, Souza AN, Ferrari R. 2024. Phys. D Nonlinear Phenom. 462:134107
39. Wang X, Slawinska J, Giannakis D. 2020. Sci. Rep. 10:2636
40. Rowley CW, Mezić I, Bagheri S, Schlatter P, Henningson DS. 2009. J. Fluid Mech. 641:115–27
41. Reichstein M, Camps-Valls G, Stevens B, Jung M, Denzler J, et al. 2019. Nature 566:195–204
42. Landy JC, Dawson GJ, Tsamados M, Bushuk M, Stroeve JC, et al. 2022. Nature 609:517–22
43. Martin SA, Manucharyan G, Klein P. 2024. Geophys. Res. Lett. 51(17):e2024GL110059
44. Rezvanbehbahani S, Stearns LA, Keramati R, Shankar S, van der Veen C. 2020. Commun. Earth Environ.
1:31
45. Surawy-Stepney T, Hogg AE, Cornford SL, Davison BJ. 2023. Nat. Geosci. 16:37–43
46. Lai CY, Kingslake J, Wearing MG, Chen PHC, Gentine P, et al. 2020. Nature 584:574–78
47. Lorenz EN. 1963. J. Atmos. Sci. 20:130–41
48. Iglesias MA, Law KJ, Stuart AM. 2013. Inverse Probl. 29:045001
49. Cleary E, Garbuno-Inigo A, Lan S, Schneider T, Stuart AM. 2021. J. Comput. Phys. 424:109716
50. Dunbar OR, Garbuno-Inigo A, Schneider T, Stuart AM. 2021. J. Adv. Model. Earth Syst.
13:e2020MS002454
51. Lopez-Gomez I, Christopoulos C, Langeland Ervik HL, Dunbar OR, Cohen Y, Schneider T. 2022.
J. Adv. Model. Earth Syst. 14:e2022MS003105
52. Mansfield L, Sheshadri A. 2022. J. Adv. Model. Earth Syst. 14:e2022MS003245
53. Souza AN, Wagner G, Ramadhan A, Allen B, Churavy V, et al. 2020. J. Adv. Model. Earth Syst.
12:e2020MS002108
54. Evensen G. 1994. J. Geophys. Res. Oceans 99:10143–62
55. Houtekamer PL, Zhang F. 2016. Mon. Weather Rev. 144:4489–532
56. Kovachki NB, Stuart AM. 2019. Inverse Probl. 35:095005
57. Watson-Parris D, Williams A, Deaconu L, Stier P. 2021. Geosci. Model Dev. 14:7659–72
151. Irrgang C, Boers N, Sonnewald M, Barnes EA, Kadow C, et al. 2021. Nat. Mach. Intel. 3:667–74
152. Sonnewald M, Lguensat R, Jones DC, Dueben PD, Brajard J, Balaji V. 2021. Environ. Res. Lett. 16:073008
153. Toms BA, Barnes EA, Ebert-Uphoff I. 2020. J. Adv. Model. Earth Syst. 12:e2019MS002002
154. Labe ZM, Barnes EA. 2022. Earth Space Sci. 9:e2022EA002348
155. Farge M. 1992. Annu. Rev. Fluid Mech. 24:395–458
156. Mallat S. 2016. Philos. Trans. R. Soc. A 374:20150203
157. Olshausen BA, Field DJ. 1996. Nature 381:607–9
158. Bassetti S, Hutchinson B, Tebaldi C, Kravitz B. 2023. J. Adv. Model. Earth Syst. 16(10):e2023MS004194
159. Finn TS, Durand C, Farchi A, Bocquet M, Brajard J. 2024. arXiv:2406.18417 [cs.LG]
160. Zhou A, Hawkins L, Gentine P. 2024. arXiv:2405.00018 [cs.DC]
161. Mukkavilli SK, Civitarese DS, Schmude J, Jakubik J, Jones A, et al. 2023. arXiv:2309.10808 [cs.LG]
162. Gupta A, Sheshadri A, Roy S, Gaur V, Maskey M, Ramachandran R. 2024. arXiv:2406.14775 [physics.ao-ph]
163. Balaji V. 2021. Philos. Trans. R. Soc. A 379:20200085