
Annual Review of Condensed Matter Physics

Machine Learning for Climate Physics and Simulations

Ching-Yao Lai,1 Pedram Hassanzadeh,2 Aditi Sheshadri,3 Maike Sonnewald,4 Raffaele Ferrari,5 and Venkatramani Balaji6

1 Department of Geophysics, Stanford University, Stanford, California, USA; email: [email protected]
2 Department of Geophysical Sciences and Committee on Computational and Applied Mathematics, University of Chicago, Chicago, Illinois, USA
3 Department of Earth System Science, Stanford University, Stanford, California, USA
4 Department of Computer Science, University of California, Davis, California, USA
5 Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
6 Schmidt Sciences, New York, NY, USA

Annu. Rev. Condens. Matter Phys. 2025. 16:343–65

First published as a Review in Advance on November 26, 2024

The Annual Review of Condensed Matter Physics is online at conmatphys.annualreviews.org

https://doi.org/10.1146/annurev-conmatphys-043024-114758

Copyright © 2025 by the author(s). This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See credit lines of images or other third-party material in this article for license information.

Keywords

climate, physics-informed machine learning, machine learning-informed physics, equation discovery, parameterization, emulator

Abstract

We discuss the emerging advances and opportunities at the intersection of machine learning (ML) and climate physics, highlighting the use of ML techniques, including supervised, unsupervised, and equation discovery, to accelerate climate knowledge discoveries and simulations. We delineate two distinct yet complementary aspects: (a) ML for climate physics and (b) ML for climate simulations. Although physics-free ML-based models, such as ML-based weather forecasting, have demonstrated success when data are abundant and stationary, the physics knowledge and interpretability of ML models become crucial in the small-data/nonstationary regime to ensure generalizability. Given the absence of observations, the long-term future climate falls into the small-data regime. Therefore, ML for climate physics holds a critical role in addressing the challenges of ML for climate simulations. We emphasize the need for collaboration among climate physics, ML theory, and numerical analysis to achieve reliable ML-based models for climate applications.

1. INTRODUCTION

Machine learning (ML) has led to breakthroughs in various areas, from playing Go to text generated with large language models (LLMs) and, more recently, to weather forecasting (1–5). Different from Go and LLMs, the language scientists use to understand and simulate weather and climate has been equations rooted in fundamental physics. Physics-based equations, often differential equations, are essential to simulate systems where direct observations are limited and noisy—even more so for projections of future climates, where data are altogether unavailable (Figure 1). Recently, ML has emerged as an alternative tool for predictive modeling as well as for improving the understanding of climate physics (Figure 2). For instance, physics-free ML models such as neural networks (NNs), which are universal approximators of functions (12) and operators (13), trained on data from observations or physics-based simulations, have demonstrated a remarkable ability to perform accurate nowcasting (14) with lead times of a few hours, weather prediction (15) with lead times of several days, and El Niño forecasts with lead times of a year (16). It remains an open question whether some of the strategies and successes in these short-time predictions can be applied to improve climate projections, i.e., to estimate changes in the statistics of weather events (e.g., return periods of heat waves or tropical cyclones) in the coming decades, centuries, and beyond.

Model: a representation of a system to make predictions; can be physics-based, ML-based, or coupled

Neural network (NN): function approximators parameterized by function operations and parameters γ that are optimized to minimize specified cost functions L

Simulation: physics-based models solved numerically
1.1. Weather Versus Climate: Nonstationarity

Emulator: a subset of models that fit the data, bypassing solving physics-based equations

The success story of ML for prediction (Section 3.2) has been primarily showcased in weather forecasting (17). Following initial attempts that started in 2019 (e.g., 18–24), by 2022–2023 ML weather models (often called emulators; Section 3.2) achieved, at a fraction of the computational cost, similar or better forecast skill than state-of-the-art physics-based weather prediction models (e.g., 1–5, 25). This success has generated excitement about using ML to improve climate projections as well. Yet climate projections involve major additional challenges (26, 27). In weather forecasting, we have a constant stream of real-time and historical data for training ML-based emulators (lower right of Figure 1a) to make predictions for a few weeks, over which the statistics can be assumed to be stationary.

[Figure 1: (a) ML applications in climate science (equation discovery; data assimilation and inverse problems; subgrid parameterization; supervised neural networks) arranged by availability of existing physics-based equations versus availability of data. The future climate sits in the low-data corner, and moving it toward the equation-rich regime is marked as a critical step. (b) The same applications arranged by interpretability versus availability of data.]


Figure 1
Conceptual diagram of ML applications in climate sciences with respect to the availability of existing physics-based equations,
availability of data, and interpretability. We explain different components of this figure throughout this article. (a) Existing
physics-based equations and data are two sources of information used for training ML models. (b) More physics-based equations are
not necessarily more interpretable, e.g., existing numerical weather predictions. However, equation discovery usually comes with
regularization techniques to find the simplest set of equations capturing the dominant behavior of the system, enhancing
interpretability. Abbreviation: ML, machine learning.



[Figure 2 content. Machine learning for climate physics — data-informed knowledge discovery: (a) dimensionality reduction, e.g., PCA compressing two dimensions into one principal component; (b) supervised classification versus unsupervised clustering; (c) discovery of global ocean dynamical regimes with clustering and validation. Data-informed model discovery, illustrated for dz/dt = f(z(t), θ), where z(t) are the states, t is time, and θ are model parameters:

Parametric estimation, θ — Goal: discover the parameters θ of a mathematical model that best fit a given dataset. Example method: EKI. Given: x(t), y(t), z(t) at many times, and the mathematical form of the model f. Predict: θ.

State estimation, z(t) — Goal: discover the states z(t) of a system given sparse, noisy, and discrete data. Example methods: physics-informed neural networks, physics-informed neural operators. Given: x(t), y(t) at sparse times, and the mathematical form of f, including θ. Predict: the state variables x(t), y(t), z(t) at many times.

Structural estimation, f — Goal: discover a mathematical model f that best fits a given dataset. Example methods: sparse regression (e.g., SINDy), genetic algorithms. Given: x(t), y(t), z(t) at many times. Predict: the mathematical form of the model f, including θ.

Further panels: (d) parameter estimation, finding the optimal θ* of a cost function and the distribution of plausible θ; (e) uncertainty quantification (UQ), mapping the plausible range of θ to predictions z (e.g., sea-level rise in 100 years) via a cheap emulator instead of expensive simulations from f(θ), so that extreme events are covered; (f) a state estimation example, in which a CNN infers unobserved vorticity z from sea-surface height x, compared against the truth.

Machine learning for climate simulations — SGS modeling: learn an SGS model Π(z̄) such that dz̄/dt = f(z̄, θ) + Π(z̄), where z̄ is the coarse-grained z (in LES, Π includes the Reynolds stress); given f, θ, and z̄ at many times along with its high-resolution counterpart z, predict NN(z̄, γ) = Π(z̄) or a symbolic equation. ML emulator: discover a model parameterized by ML that best fits a given dataset (e.g., ClimateBench, FourCastNet, Pangu-Weather, GraphCast, ACE); given data of states from previous time step(s) z(t_i), predict the state at future time step(s) z(t_{i+1}). Remaining panels: (g) physics-informed ML, where adding physics turns poor generalizability into good generalizability relative to the ground truth; (h) better generalization capacity via incorporating symmetries into NNs (nonequivariant versus equivariant NNs); (i) weather forecasting via an ML emulator; (j) SGS parameterization, where a low-resolution simulation with an ML-based SGS model better matches the high-resolution truth.]

Figure 2
An overview of the areas in which ML has played a role in uncovering climate physics and advancing climate simulations. Panels c, f, h,
and j adapted with permission from Reference 6 (CC BY 4.0), Reference 7 (CC BY 4.0), Reference 8 (CC BY 4.0), and Reference 9
(CC BY 4.0), respectively. Panel i adapted from Reference 4; copyright 2023 AAAS. Abbreviations: ACE, AI2 Climate Emulator (10,
11); CNN, convolutional neural network; EKI, ensemble Kalman inversion; LES, large eddy simulation; ML, machine learning;
NN, neural network; PC, principal component; PCA, principal component analysis; SGS, subgrid-scale; SINDy, sparse identification of
nonlinear dynamics.

In climate, we are often interested in predicting the climate's forced response to changes in greenhouse gases in the atmosphere, leading to nonstationarity (28), e.g., a climate with a different mean and variability. ML models are not suited to predict the behavior of a system substantially different from the one they have been trained on. Yet we simply do not have observational data for the future (i.e., the lower left corner in Figure 1a) to train and validate future predictions; this is an issue for both ML and physics-based models, but one expects the fundamental laws of physics to hold in the future as well. We summarize the nonstationarity challenge and potential solutions, such as incorporating physics constraints, in Section 4.2. Furthermore, the long-term prediction of future climate involves interactions between the atmosphere, ocean, cryosphere, land, and biosphere, which make the problem more challenging than short-term weather forecasting.

1.2. Challenges in Understanding and Simulating Climate

ML-based emulators: emulators that fit the data with ML models (e.g., deep neural networks), bypassing solving physics-based equations

Nonstationarity: systems with time-evolving statistical properties, so that a limited time series is not itself representative of the past or the future

Subgrid-scale (SGS) parameterization: a model to parameterize the effect of the unresolved (subgrid) state as a function of the resolved states

The climate system consists of interacting processes that span orders of magnitude in spatial (from microns to planetary) and temporal (from seconds to centuries) scales. Simulating the climate system to resolve all these scales is computationally challenging. Due to its multiscale nature, representing physics in the underresolved scales (e.g., cloud microphysics, turbulence) in low-resolution climate simulations—referred to as subgrid-scale (SGS) parameterization—has been a central goal for climate scientists. ML has emerged as a promising alternative for SGS parameterization due to its ability to perform equation discovery and its desirable properties as a universal function approximator, which does not require prior assumptions about the functional form of the parameterization. We summarize recent advances of ML in SGS parameterization in Section 3.1, as well as its major challenges, such as interpretability in Section 4.1 (lower right corner of Figure 1b) and uncertainty quantification (UQ) in Section 4.3. Without understanding what the ML model actually learns and the reasoning behind it, we cannot deduce when the ML model will generalize well to future climates. One approach to improve interpretability is discovering the closed-form equations that capture the data (Section 2.2.4.3; upper side of Figure 1b). Equations have long been the language physicists use to develop understanding of the systems those equations govern. With the emergence of ML, ways to comprehend and extract knowledge from ML-based models are new areas of research. Several methods developed to understand what the ML model learns are detailed in Section 4.3.

We organize this article by focusing on two distinct goals through which ML is reshaping climate science: (a) ML for climate physics and (b) ML for climate simulations. The former focuses on utilizing the increasing availability of data from the Earth system to extract understanding, including knowledge discovery (Section 2.1) and data-driven model discovery (Section 2.2). The latter discusses recent advances in accelerating simulations, including data-driven parameterization (Section 3.1) and climate emulators (Section 3.2). Section 3.3 discusses methods used to add physical constraints to ML models. For inverse problems with sparse data (upper left of Figure 1a) in the small-data regime, including physical constraints is necessary to generalize predictions to where data do not exist. Finally, the future climate also falls in the small-data regime, as no future observations exist. Along with an incomplete understanding of the physics of future climate, it lies at the bottom left of Figure 1a, making it the most challenging problem in Figure 1a. Thus, predicting climate is not merely about accelerating simulation; it essentially requires generating more physical knowledge than currently available, moving toward the tractable upper left regime of Figure 1a. While training ML models to make accurate predictions faster than physics-based simulations, making previously challenging tasks computationally tractable, is an achievement, the ability to simulate does not equate to improved physical understanding (29). We stress that accurate and fast simulations or predictions are not sufficient; deeper physical understanding of the climate is necessary to address the climate modeling challenges.
2. MACHINE LEARNING FOR CLIMATE PHYSICS

Supervised learning: algorithms that learn a mapping from input data (features) to output labels based on the training examples provided

Unsupervised learning: algorithms that discover patterns, structures, or relationships within the input data (features) without explicit guidance from labeled examples

The increase in availability of data, from both observations and high-fidelity simulations, is a key driver for new physical insights. Below, we introduce two emerging trends of research using ML to improve our understanding of climate physics using data: (a) data-informed knowledge discovery, e.g., identifying patterns and dynamical regimes in high-dimensional, complex observations and simulations, and (b) discovering data-informed predictive models.
2.1. Data-Informed Knowledge Discovery
Knowledge discovery (e.g., Figure 2a–c), such as identifying coherent patterns of dynamical sig-
nificance in spatiotemporal data, has long been a fundamental process for making discoveries in
climate science. A classical example in atmospheric science is the identification of what we now call
the El Niño Southern Oscillation (ENSO) by Walker in 1928 (30). Stationed in India during the
British occupation, Walker employed an army of Indian clerks to conduct principal component
analysis (PCA) by hand on all available data, decomposing it into orthogonal modes that revealed
coherent structures associated with ENSO. Today, we have access to vast amounts of data, both ob-
servational and computational. Advanced ML techniques emerge as powerful pattern recognition
tools that computationally scale well with increasing volumes of data. Off-the-shelf tools widely
adopted in climate sciences include supervised learning methods [e.g., random forest, Gaussian
process regression (GPR), and NN] and unsupervised learning methods such as autoencoders and
clustering algorithms (e.g., k-means, self-organizing maps). Several of the above tools have been
used in climate science for decades, long before deep learning took off.
2.1.1. Dimensionality reduction. Climate data often involve high-dimensional, nonlinearly
correlated, spatial and temporal variables, such as temperature, pressure, and precipitation, across
large geographical regions and long time periods. Dimensionality reduction has long been used for
transforming high-dimensional climate data into a lower-dimensional space, which might be more amenable to physical interpretation and to developing reduced-order predictive models. PCA, also
known as empirical orthogonal function (EOF), is one of the commonly used linear techniques for
dimensionality reduction (Figure 2a). Reducing decades of observational data into a few modes
of variability, such as ENSO, has facilitated understanding of the underlying dynamics and even
the robust detection and interpretation of the anthropogenic climate-change footprint (31, 32).
However, traditional techniques such as PCA/EOF have major limitations, such as the lack of a
dynamical meaning of the discovered modes and the linearity (33) (see also Supplemental Text,
Section I).
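To make the PCA/EOF procedure concrete, here is a minimal sketch (our illustration, not from the original article) of computing EOFs from a gridded anomaly field; the data are synthetic, and a real analysis would typically also remove the seasonal cycle and area-weight grid cells first:

```python
# Minimal PCA/EOF sketch on a synthetic (time, space) anomaly matrix.
import numpy as np

rng = np.random.default_rng(0)
n_time, n_space = 480, 1000          # e.g., 40 years of monthly gridded fields
X = rng.standard_normal((n_time, n_space))

X = X - X.mean(axis=0)               # anomalies: remove the time mean
U, S, Vt = np.linalg.svd(X, full_matrices=False)

eofs = Vt                            # spatial patterns (EOFs), one per row
pcs = U * S                          # principal-component time series
explained = S**2 / np.sum(S**2)      # fraction of variance per mode

print(f"Mode 1 explains {explained[0]:.1%} of the variance")
```

The leading EOFs are the spatial patterns of the dominant modes of variability (e.g., ENSO), and the PCs are their time series.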
ML techniques have the potential to address the linearity limitation. For example, autoen-
coders, a type of NN with high-dimensional input and output layers and a lower-dimension latent
space, are powerful tools for dimensionality reduction. A single-layer autoencoder using linear ac-
tivation functions is equivalent to PCA. In contrast, deep autoencoders with nonlinear activation
functions have more expressive power in capturing the low-dimensional representation of the



high-dimensional input (34, 35), and their applications to climate science have started to emerge.
Shamekh et al. (36), interpreting the latent space of an autoencoder, developed a new metric for
cloud and precipitation organization, enabling the development of a parameterization for moist
convection.
Markov models that describe the transition probability from one state to another are being ap-
plied to study the evolution of the climate system in a reduced space, possibly based on a PCA/EOF
projection (37, 38). Latent variable models have also shown promise in numerical model analy-
sis and predictive skill. For example, Wang et al. (39) improved ENSO prediction skill by using
kernel analog forecasting [related to the Koopman operators (40), a mathematical technique to

transform a nonlinear dynamical system into a linear one in a higher-dimensional space].


2.1.2. Finding patterns in climate data. The task of finding patterns in climate data extends
far beyond dimensionality reduction and is a fruitful area that still has much to be explored. For
example, unsupervised methods like clustering (Figure 2b) have been used to identify the balance
between terms of the equations governing simulation data and to discover global ocean dynamical
regions as parsimonious representations of the governing equations (6) (Figure 2c).
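As a rough sketch of this clustering idea (our illustration; Reference 6 uses a much more careful pipeline with validation), the snippet below groups grid points into candidate dynamical regimes with k-means. The per-point features are synthetic stand-ins for, e.g., the magnitudes of terms in the governing momentum balance:

```python
# Cluster grid points into candidate dynamical regimes with k-means.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_points, n_terms = 5000, 5                  # grid points x balance terms
features = rng.standard_normal((n_points, n_terms))

k = 4                                        # number of regimes (a modeling choice)
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)

for regime in range(k):
    print(f"regime {regime}: {np.sum(labels == regime)} grid points")
```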
Data from actual observations can often inspire new knowledge about climate systems (41).
Supervised methods facilitate the utilization of vast amounts of satellite observations, such as re-
constructing a pan-Arctic dataset of sea ice thickness during periods when data are unavailable
(42), revealing the strong nonlinear interactions with ocean eddies (43), reconstructing ocean sur-
face kinematics with sea-surface height measurements (7), and detecting icebergs to understand
their contribution to the freshwater budget (44). Extracting information from remote-sensing data
can fill missing gaps required to inform physics-based models. For example, the identification of
ice fractures (underresolved in simulations) is needed to constrain parameters for modeling ice
dynamics (45, 46). Some of the above tasks have long been done manually and often subjectively
by scientists. ML offers an efficient alternative that can be easily scaled up to all available data that
may be intractable otherwise and can be made easily accessible, reproducible, and transparent via
open-source software. That said, the design of the loss functions is still subjective.

2.2. Data-Informed Model Discovery


Apart from distilling knowledge from data, physicists have been developing predictive models to
describe observations for centuries. The utility of a model, if it accurately represents the obser-
vations, lies in its ability to make predictions when the data are unavailable, such as projections
about the future. The crux of climate physics is creating trustworthy future predictions, and deter-
mining how to construct a model that faithfully describes the data is essential. It is worth noting
that in traditional physical sciences, a model often takes the form of mathematical equations. In
modern ML literature, a model can refer to functional operations (e.g., NNs) that parameterize
relationships between specified input and output variables. For clarity, in this section, use of the
term model discovery means the discovery of mathematical equations.
Here, we broadly classify three different ML approaches that have been used for finding mod-
els to describe climate data: parametric, state, and structural estimations (Figure 2). We use a
dynamical-system example of the following form to illustrate the differences:
dz(t)/dt = f(z(t), θ),    (1)

where z(t) ∈ R^n represents the state vector of the system that evolves with time t ∈ R, and its evolution is dictated by the mathematical expression of the dynamics f and the model parameters θ (see the sidebar titled The Lorenz 63 System for an example).



THE LORENZ 63 SYSTEM
The Lorenz 63 system (47), a simplified mathematical model for atmospheric convection, is described by the
following set of ordinary differential equations:
dx/dt = a(y − x),    dy/dt = x(b − z) − y,    dz/dt = xy − cz,
where the model parameters θ = [a, b, c] are constants, and z(t ) = [x(t ), y(t ), z(t )] are the time-evolving states.
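For concreteness, a minimal integration of this system (our illustrative sketch, using the classic chaotic parameter values a = 10, b = 28, c = 8/3) generates the kind of discrete state data {x(t_i), y(t_i), z(t_i)} used in the estimation problems below:

```python
# Integrate the Lorenz 63 system dx/dt = a(y - x), dy/dt = x(b - z) - y,
# dz/dt = xy - cz for the classic chaotic parameter values.
import numpy as np
from scipy.integrate import solve_ivp

def lorenz63(t, state, a=10.0, b=28.0, c=8.0 / 3.0):
    x, y, z = state
    return [a * (y - x), x * (b - z) - y, x * y - c * z]

sol = solve_ivp(lorenz63, t_span=(0, 50), y0=[1.0, 1.0, 1.0],
                t_eval=np.linspace(0, 50, 5000))
x_t, y_t, z_t = sol.y                # sampled states {x(t_i), y(t_i), z(t_i)}
print(x_t[-1], y_t[-1], z_t[-1])
```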
2.2.1. Parametric estimation θ. Given a discrete dataset of states measured at discrete times t_i, i.e., {x(t_i), y(t_i), z(t_i)} for i = 1, …, N, parametric estimation refers to predicting the free parameters θ = [a, b, c] when the functional form of the mathematical model f(z(t), θ) is known. See Figure 2d.
2.2.2. State estimation z. State estimation involves predicting the state variables z(t) given the mathematical model f(z(t), θ), model parameters θ, and data {x(t_i), y(t_i), z(t_i)}, i = 1, …, N. Estimating states is particularly useful for data interpolation when the available data are sparse in time or space, for data denoising, or for inversion when the predicted state is not measurable (e.g., Figure 2f) and, thus, completely unavailable in the data library, e.g., predicting z(t) with data of only {x(t_i), y(t_i)}, i = 1, …, N.

2.2.3. Structural estimation f. This is also referred to as equation discovery: reconstructing the complete mathematical expression of f(z(t), θ), including the free parameters θ = [a, b, c], given only the discrete data {x(t_i), y(t_i), z(t_i)}, i = 1, …, N. The determination of the model f(z(t), θ) relies fully on the data; therefore, dense data are often needed to guarantee the success of the algorithms.
Beyond these three categories, an emerging data-driven approach involves replacing f with an
ML-based emulator. In this approach, instead of using equations, the dynamics f are represented
by black-box ML models, as detailed in Section 3.2.
2.2.4. Algorithms and examples. In this section, we list a few examples that demonstrate how
ML has influenced data-driven model discovery within the three categories described above: en-
semble Kalman inversion (EKI) for parameter θ estimation, physics-informed machine learning
(PIML) for state z estimation, and equation discovery for structural f (z, θ) estimation.
2.2.4.1. Ensemble Kalman inversion. Here, we focus on one family of methods, EKI, as an
example, as it is increasingly used in climate science. EKI (48) is a well-developed parameter es-
timation technique in the climate modeling community and has been used in various contexts
such as convection, turbulence, and clouds (49–51); gravity waves (52); and ocean convection
(53). EKI is a derivative-free (48) optimization method for parametric estimation θ, based on
ensemble Kalman filtering (EnKF) (54), which is used for estimating states z(t ) in numerical
weather prediction (NWP) (55) given noisy observations. EKI (56) attempts to find a distribu-
tion of model parameters θ that can describe time-averaged statistics of a truth, which could be
from observational data or simulations, removing dependence on state variables by utilizing long
integrations. EKI optimizes for macrophysical climate statistics (e.g., derived by averaging over
many occurrences of the event of interest).
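The core EKI update is simple to sketch. The toy example below (our illustration; the forward map G is a stand-in, not a climate model) iteratively nudges an ensemble of parameter vectors θ so that G(θ) matches noisy observations y, using only forward evaluations and ensemble covariances, with no derivatives of G:

```python
# Toy ensemble Kalman inversion (EKI): derivative-free calibration of
# parameters theta so the forward map G(theta) matches observations y.
import numpy as np

rng = np.random.default_rng(2)

def G(theta):                          # stand-in forward model (not a GCM)
    return np.array([theta[0] + theta[1], theta[0] * theta[1]])

theta_true = np.array([2.0, 3.0])
Gamma = 0.01 * np.eye(2)               # observation-noise covariance
y = G(theta_true) + rng.multivariate_normal(np.zeros(2), Gamma)

J = 50                                 # ensemble size
theta = rng.normal(0.0, 2.0, size=(J, 2))   # prior ensemble

for _ in range(20):                    # EKI iterations
    g = np.array([G(th) for th in theta])
    th_mean, g_mean = theta.mean(0), g.mean(0)
    C_tg = (theta - th_mean).T @ (g - g_mean) / J    # cross-covariance
    C_gg = (g - g_mean).T @ (g - g_mean) / J         # output covariance
    K = C_tg @ np.linalg.inv(C_gg + Gamma)           # Kalman-like gain
    noise = rng.multivariate_normal(np.zeros(2), Gamma, size=J)
    theta = theta + (y + noise - g) @ K.T            # perturbed-obs update

print("posterior mean:", theta.mean(0), "truth:", theta_true)
```

The spread of the final ensemble gives a rough measure of parametric uncertainty, which connects to the UQ discussion below.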
Quantification of parametric uncertainty is important as it illustrates how perturbations of
parameters θ that we want to estimate would translate to the predictions of z. As shown in
Figure 2e, though an optimal value of θ ∗ minimizing the cost function only captures one pre-
diction z(θ ∗ ), a range of θ could yield wide-ranging predictions (e.g., covering extreme events).
Running ensembles of forward physics-based simulations f(θ) with a range of θ for climate models to quantify these parameter-induced uncertainties is currently computationally infeasible. To address this challenge, the calibrate-emulate-sample (CES) approach (49, 50) trains a GPR (Gaussian process regression) model as a cheap emulator of the prediction of interest z as a function of θ. Sampling the GPR emulator with Markov chain Monte Carlo enables substantially faster UQ (uncertainty quantification) of the predictions z resulting from the plausible range of θ. GPR has also been used directly for calibrating parameters with UQ in Earth system models (57).

INVERSE PROBLEM AND DATA ASSIMILATION
The problem of estimating parameters θ and states z of a model f(z, θ) falls under the umbrella of inverse problems (60). The importance of parametric and state estimations lies in the fact that direct observations of model parameters and states are often unavailable, yet they are crucial for simulating and predicting both the weather and climate accurately. Various inverse problems arise in weather and climate, such as estimating parameters in climate models or determining initial conditions for improved weather forecasting. Data assimilation (61) refers to the process of combining observational data with numerical models. The synergies between data assimilation and ML have been increasingly recognized (62–64), including using ML to correct model error in data assimilation (65) and to emulate a dynamical system (66).
2.2.4.2. Physics-informed neural networks (PINNs). The use of PINNs (58, 59) for planetary-
scale geophysical flow problems has started to emerge in the past few years. Introduced by Raissi
et al. (58), a PINN is a differentiable solver for partial differential equations (PDEs) that is particu-
larly useful for inverse problems involving sparse-data inference, superresolution, data denoising,
and state estimation z in data assimilation (see the sidebar titled Inverse Problem and Data As-
similation). Unlike classical ML, in which the cost function typically involves only data, PINNs encode physics-based equations directly in the cost function (Figure 3a).
Throughout the training iterations, the optimizer identifies the best ML-parameterized states
z = N N (x, t, γ ) that are consistent with both the data and the governing equations. In the small-
data regime, without evaluating the NN-parameterized z against known physical laws (such as
conservation of mass, momentum, and energy), the ML predictions can be physically inconsis-
tent and nonextrapolatable beyond the available observational data (e.g., deviation from truth in
Figure 2g). In contrast, by incorporating PDEs, PINNs can achieve both physics-informed data
interpolation and extrapolation, as demonstrated by the examples in Figure 3, which cannot be
achieved by ML models trained with observational data alone.
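A minimal PINN sketch (our illustration, using a 1D advection equation far simpler than the geophysical applications below) shows the two-term cost function: a data loss on sparse observations plus an equation-residual loss evaluated at collocation points via automatic differentiation:

```python
# Minimal PINN sketch: fit u(x, t) to sparse data while penalizing the
# residual of du/dt + c * du/dx = 0 (a stand-in for the PDEs in the text).
import torch

torch.manual_seed(0)
c = 1.0
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

# Sparse "observations" of the true solution u = sin(x - c t).
xt_data = torch.rand(32, 2)
u_data = torch.sin(xt_data[:, :1] - c * xt_data[:, 1:])

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    # Data loss on the sparse observations.
    loss_data = ((net(xt_data) - u_data) ** 2).mean()
    # Equation loss on random collocation points (no labels needed).
    xt = torch.rand(256, 2, requires_grad=True)
    u = net(xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, :1], grads[:, 1:]
    loss_eq = ((u_t + c * u_x) ** 2).mean()
    (loss_data + loss_eq).backward()
    opt.step()
```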
Figure 3b,c demonstrates the applications of PINNs on observations, ranging from estimat-
ing the initial conditions z(x, t = 0) of hurricanes for subsequent forecasts (67) to inferring the
nonmeasurable viscosity structure z(x) of Antarctic ice shelves (68). Both examples fall within
the small-data regime in the upper left corner of Figure 1a, where incorporating knowledge of
PDEs becomes crucial for solving the inverse problems; the PINN-reconstructed wind field z(x, t )
(Figure 3b) involves only sparse observations of wind velocity itself as training data, obtained from
measurements by hurricane hunter planes and dropsondes. The PINN prediction of ice viscosity
is achieved without any observations of viscosity in the training data (Figure 3c); it relies solely
on equations and other observable states (velocity and thickness fields) as training data. Thus,
both examples involve substantial extrapolation beyond the sparse observational data, i.e., limited
velocity data and no viscosity data.
As long as the same data and physics-based equations are used to solve the inverse problem, the predictions generated by properly trained PINNs are as trustworthy as those produced by established data assimilation methods. Because PINNs leverage graphics processing units (GPUs) and differentiable modeling to infer accurate initial conditions without the ensembles of forward model runs used in ensemble-based data assimilation methods (67) (see Supplemental Text, Section II for a brief comparison), they require fewer computational resources to construct hurricane initial conditions with similar accuracy to the ensemble-based methods (69).
[Figure 3 content: (a) Physics-informed neural network schematic. A neural network maps input coordinates (space x, y and time t) through hidden layers to output states z. The cost function is L(γ) = L_data(z_data, z(γ)) + L_equation(f(z(γ))), combining a data loss (injected training data) with an equation loss (physical consistency with the mathematical models, i.e., PDEs). Derivatives in the equation are calculated exactly using automatic differentiation; the cost function L(γ) is differentiable with respect to the state variables z(γ) and the trainable parameters γ in the ML model. (b) Sparse-data inference, e.g., reconstructing initial conditions for hurricane forecasts: sparse wind-speed data for Hurricane Ida and the PIML-reconstructed wind fields. (c) Inferring a nonmeasurable state, e.g., glacial ice viscosity for ice-sheet models: velocity data (m/year) and thickness data (m) yield the PIML-inferred viscosity (Pa s).]

Figure 3
(a) The PIML algorithm and its applications in data assimilation (b,c). Panels b and c adapted with permission from Reference 67 (CC BY 4.0) and Reference 68 (CC BY 4.0), respectively. Abbreviations: ML, machine learning; NN, neural network; PDE, partial differential equation; PIML, physics-informed machine learning.


That being said, established data assimilation methods are supported by several mature theories, which are relatively lacking for PINN methods. Although several models used in climate predictions are not easily differentiable without substantial engineering effort, the development of differentiable solvers for atmospheric dynamics (70) is promising. Differentiable ice-flow solvers and emulators have also recently emerged as new tools for forward and inverse ice-flow modeling (71).

2.2.4.3. Equation discovery. Existing equations f (z, θ) describing the numerous processes in
the climate system, particularly the SGS processes, are far from complete. Equation discovery,
which outputs equations that are most consistent with data, has been used to tackle this prob-
lem. Inspired by earlier symbolic regression algorithms for distilling physical laws from data (72),
sparse identification of nonlinear dynamics (SINDy) (73) has emerged as a widely used method
for discovering f (z, θ) from data of the states z(t ). It demonstrates the power of sparse regres-
sion for learning the most relevant terms in the prescribed function library that describes the
data. To learn the correct f (z, θ), SINDy requires sampled data of both z(t ) and dz(t )/dt. For
many climate problems these state measurements are sparse and noisy, or entirely unavailable.
Schneider et al. (74) showed that time-averaged statistics of the states z(t ), which are available for
the climate system, can be sufficient to recover both the functional form of f (z, θ) and the noise
level of the data using sparse regression combined with EKI. Sparse EKI is robust to noisy data
and was successfully implemented to recover the Lorenz 96 equations (74). Other approaches for
equation discovery from data assimilation increments (75, 76) and from partial observations (77),
motivated by climate problems, have also been proposed.
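A minimal SINDy-style sketch (our illustration) recovers the first Lorenz 63 equation by sequentially thresholded least squares over a library of candidate terms; for simplicity the time derivatives are taken from the known equations, whereas in practice they must be estimated from noisy data, which is exactly the difficulty discussed above:

```python
# SINDy-style sparse regression: recover dx/dt = a(y - x) from data by
# selecting terms from a candidate library with thresholded least squares.
import numpy as np
from scipy.integrate import solve_ivp

def lorenz63(t, s, a=10.0, b=28.0, c=8.0 / 3.0):
    x, y, z = s
    return [a * (y - x), x * (b - z) - y, x * y - c * z]

sol = solve_ivp(lorenz63, (0, 20), [1.0, 1.0, 1.0],
                t_eval=np.linspace(0, 20, 4000))
X = sol.y.T                                    # states z(t)
dX = np.array([lorenz63(0, s) for s in X])     # dz/dt (ideally from data)

x, y, z = X.T
library = np.column_stack([x, y, z, x * y, x * z, y * z, x**2, y**2, z**2])
names = ["x", "y", "z", "xy", "xz", "yz", "x^2", "y^2", "z^2"]

# Sequentially thresholded least squares for the dx/dt equation.
xi = np.linalg.lstsq(library, dX[:, 0], rcond=None)[0]
for _ in range(10):
    small = np.abs(xi) < 0.1                   # sparsity threshold
    xi[small] = 0.0
    big = ~small
    xi[big] = np.linalg.lstsq(library[:, big], dX[:, 0], rcond=None)[0]

print({n: round(w, 2) for n, w in zip(names, xi) if w != 0.0})
# Expected: {'x': -10.0, 'y': 10.0}
```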
Arguably the most successful example of the application of equation discovery for climate
physics so far has been the learning of an ocean mesoscale SGS parameterization (78). Trained on
high-resolution simulation data, Zanna & Bolton (78) showed that Bayesian linear sparse regres-
sion with relevance vector machines identifies relevant terms in the prescribed function library to
discover the closed-form equations of Π(z̄) (defined in Figure 2) for eddy momentum and temper-
ature forcing. The closed-form equation is consistent with an analytically derivable physics-based
model (79, 80). As discussed later in Section 3.1, SGS parameterizations (Figure 2) are essential for
improving the accuracy of computationally feasible low-resolution climate simulations. Although
black-box NNs have also shown promise for developing data-driven SGS parameterizations
(Section 3.1), the significant interest in equation discovery stems from their better generalization
to future climates and their interpretability (upper side of Figure 1b).
Inspired by early work on symbolic regression (72, 81), the symbolic genetic algorithm (82) was
developed to discover PDEs without the need to predetermine a function library. It uses a binary
tree to parameterize common mathematical operations (e.g., addition, multiplication, derivative,
division) and finds the correct operations such that the discovered equation matches the data.
In climate applications, genetic algorithms have been used for finding equations for cloud cover
parameterization (83) and ocean parameterization (84).

3. MACHINE LEARNING FOR CLIMATE SIMULATIONS


We discuss two major directions leveraging ML methods to improve the accuracy of climate
simulations: (a) SGS parameterization, aimed at developing more accurate climate models via
better representation of small-scale (expensive to resolve) physical processes, and (b) emulators,
aimed at generating large ensembles of simulations (or directly, the statistics) at a fraction of
the computational cost of a physics-based climate simulation. These two approaches are briefly
discussed below.



UNCERTAINTIES IN CLIMATE PROJECTIONS
Climate projections are affected by three sources of uncertainty (86): (a) model uncertainty (also known as structural
error), (b) internal variability uncertainty (e.g., the signal-to-noise ratio problem), and (c) scenario uncertainty (re-
lated to how much greenhouse gas will be released in the future). Reducing model uncertainty requires developing
more accurate climate models (e.g., improving parameterization or increasing resolutions), whereas reducing the
internal variability uncertainty and scenario uncertainty requires computationally efficient climate models that can
generate long, large ensembles of simulations and explore different scenarios.

3.1. Subgrid-Scale Parameterization


There are two reasons climate models require SGS parameterization to achieve simulations on
relevant century-long timescales: (a) the process of interest varies on length scales or timescales
smaller than a climate model’s resolution, and (b) equations to describe the process are not known.
SGS parameterizations estimate the effect of these unresolved processes on the resolved scales.
Developing SGS parameterizations for climate modeling, but also for high-resolution simulations
in limited domain, has been an active area of research since the pioneering work of Smagorinsky
(85) on the first climate models in the early 1960s. Still, the approximations made in formulat-
ing these parameterizations remain a leading cause of model uncertainty (also known as structural
error; see the sidebar titled Uncertainties in Climate Projections). ML presents a potentially excit-
ing path forward in improving these SGS parameterizations or developing new ones. The general
idea is to use observations or high-resolution simulations to learn a data-driven representation or
closed-form equation of the SGS term Π (defined in Figure 2). Examples of the latter approach
were discussed in Section 2.2.4; below, we mainly focus on approaches based on NNs (Figure 2j).
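Before surveying specific results, here is a minimal sketch (our illustration; the data and architecture are synthetic placeholders, with the "SGS term" diagnosed from a toy filter rather than a high-resolution simulation) of the supervised learning task NN(z̄, γ) = Π(z̄) that the following paragraphs discuss:

```python
# Supervised SGS learning sketch: train a CNN to map the coarse-grained
# state z_bar to the SGS term Pi(z_bar). In practice z_bar and Pi are
# diagnosed from high-resolution simulations or observations.
import torch

torch.manual_seed(0)
z_bar = torch.randn(512, 1, 32, 32)        # coarse-grained states
pi_true = torch.nn.functional.avg_pool2d(  # synthetic stand-in "SGS term"
    z_bar, 3, stride=1, padding=1) - z_bar

net = torch.nn.Sequential(
    torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for epoch in range(100):
    opt.zero_grad()
    loss = ((net(z_bar) - pi_true) ** 2).mean()   # supervised MSE on Pi
    loss.backward()
    opt.step()
```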
Studies have demonstrated the ability of ML algorithms such as NNs to learn ML-based parameterizations as a supervised learning task, NN(z̄, γ) = Π(z̄), for prototypes of geophysical turbulence (e.g., 78, 87), ocean turbulence (e.g., 88, 89), moist convection and clouds (e.g., 90–96), and atmospheric gravity waves (e.g., 97–99). Some of these examples have achieved stable simulations that are more accurate than simulations with traditional physics-based parameterizations (e.g., 78, 87, 93, 95). However, though promising, this approach faces a number of challenges. Some are common among other ML applications to climate (e.g., interpretability, extrapolation) and are discussed in Section 4. Challenges specific to SGS parameterization via supervised learning include the availability of suitable high-resolution training data from numerical studies or observational campaigns, as well as issues with accuracy and stability once these ML-based SGS parameterizations are coupled [e.g., with atmosphere (93, 100) and ocean models (78)].

Model uncertainty: deviation of the physics-based model from the data, due to inaccurate parameters θ or the physics equations f itself

Structural error: a type of model uncertainty arising from inaccuracies in the equations used to represent the data or processes of interest

ML-based SGS parameterization: SGS parameterization represented by an ML model rather than physics-based equations

This discussion of SGS parameterization for climate models would not be complete if we did not emphasize the distinct challenges compared to its use in weather forecasting. In weather forecasting, the objective is to predict a specific trajectory based on initial conditions, necessitating accurate and detailed prediction of SGS physics. Conversely, climate studies aim to predict changes in the system's average behavior over decades. Thus, it is sufficient to predict the SGS statistics rather than all of their specific features. This shift requires novel ML approaches optimized to capture emergent statistics rather than detailed information from training data in supervised learning. This poses challenges, as long-term observations of climate statistics are limited, and the simulations coupled with ML-based SGS parameterizations need to be stable for a long enough timescale to learn the climate statistics from the training data. Despite these challenges, a few studies have made progress in producing stable and accurate simulations of simple climate prototypes using NNs trained with differentiable modeling (101) and EKI (100) that target the evolution of the climate variables in response to the SGS processes rather than training on the SGS processes themselves. Recently, Google Research has made strides in this direction by developing an atmospheric model's dynamical core that learns SGS physics statistics directly from reanalysis data (70). Yet numerical stability and satisfaction of global energy conservation constraints remain a challenge. Unlike in the weather literature, training from direct observations (102) has not been attempted yet, possibly because of the sparsity of global datasets with timescales long enough to capture climate statistics.

An additional challenge is how to address the interactions between different SGS processes, for example, between SGS parameterizations of ocean and atmosphere boundary layer turbulence that interact through air–sea fluxes. Training is typically done on subcomponents of the full climate system because it is not computationally feasible to run global climate simulations that fully resolve all SGS processes and their interactions. Concurrent observations of different SGS processes are also limited. As a result, interactions among a number of individually trained/calibrated data-driven parameterizations can lead to inaccurate or even unstable global simulations. This is an area in need of practical advancement.

3.2. Climate Emulators

Spatiotemporal emulator: an emulator with predictions that evolve with space and time

Initial value problem: a forward model f(z) with predictions z(t) subject to the initial conditions z(t_0) specified at time t_0

Boundary value problem: a forward model f(z) with predictions z(t) subject to the boundary conditions z(x_0, t), which can be time-evolving, at the boundaries x_0

"Emulator" refers to several types of tools in the climate science literature. In general, an emulator is trained to mimic the data, from physics-based simulations or observations, to substantially reduce the computational cost of producing new climate predictions, e.g., for other climate conditions within the distribution of the training data.
Emulators can be used to interpolate the projections from expensive climate simulations,
making their projections among different emission scenarios accessible without rerunning the
simulations. Earlier use of ML for emulators followed the successful approach of traditional
pattern-scaling emulators (103, 104), which, for example, predict the change in statistics of vari-
ables of interest (e.g., regional annual-mean surface temperature or the return period of extreme
events at a later time) given a small set of inputs (e.g., year, greenhouse gas forcing, global mean sur-
face temperature). Using ML techniques (e.g., GPR, NN), emulators such as ClimateBench (105)
have been employed to estimate the climate impacts of anthropogenic emissions annually up
to 2100. However, it remains to be demonstrated that their skill is superior to that of pattern-
scaling emulators, i.e., emulators that regress regional temperature on global mean temperature
or cumulative emissions.
Although the aforementioned emulators can predict aggregated statistics within an often large
window of length scales and timescales, another type of emulator has emerged in recent years
with the aim of predicting the evolution of the climate system at fine spatiotemporal scales. These
spatiotemporal emulators leverage the success of ML-based weather forecast models, which are
physics free and trained solely on reanalysis data (106) (spanning 1979–present; see the sidebar
titled Reanalysis). Recent ML-based weather forecast models [e.g., FourCastNet (2), Pangu (3),
GraphCast (4)] are time-stepping algorithms that solve the initial value problem of predicting
the state z(t ) of the global atmosphere forward in time (from ti to ti+1 , then from ti+1 to ti+2 ,
and so on; Figure 2i). They exhibit comparable or even better skill than the best physics-based
weather prediction models for lead times of up to around 10 days (4). However, weather and
climate predictions are different problems. The former is an initial value problem, whereas the
latter is more akin to a boundary value problem in the sense that the focus is on how external
boundary conditions impact the system over longer periods of time.
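Structurally, these spatiotemporal emulators are autoregressive: a learned map advances the state one step and is then fed its own output. A schematic rollout loop (our sketch, with an untrained placeholder network standing in for a transformer or graph network):

```python
# Autoregressive rollout of a spatiotemporal emulator: the network maps
# the state at t_i to t_{i+1} and is then fed its own prediction.
import torch

emulator = torch.nn.Sequential(       # untrained placeholder for, e.g.,
    torch.nn.Linear(256, 512),        # a transformer or graph network
    torch.nn.GELU(),
    torch.nn.Linear(512, 256),
)

state = torch.randn(1, 256)           # encoded atmospheric state z(t_0)
trajectory = [state]
with torch.no_grad():
    for _ in range(40):               # e.g., 40 six-hour steps = 10 days
        state = emulator(state)       # z(t_i) -> z(t_{i+1})
        trajectory.append(state)
```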



REANALYSIS
Reanalysis, sometimes referred to as maps without gaps, refers to a method of using a physical model to assimilate
disparate observational data streams into a combined multivariate dataset uniform in space and time. The model
fills in the data-poor regions and ensures physical consistency between variables. The output is often referred to
as reanalysis data; however, it is important to keep in mind that these data are not observations but outputs of a
forecast model. In fact, reanalysis products from different weather centers usually differ among themselves, and this
spread can be taken as a measure of uncertainty in observations and understanding.

For climate predictions, atmospheric spatiotemporal emulators are built to solve boundary
value problems that integrate the global atmospheric state given external forcings (e.g., radia-
tive forcing) and time-evolving boundary conditions (e.g., sea-surface and land temperature) for
decades or centuries. The AI2 Climate Emulator (ACE) (10, 11) is a promising example of such
a spatiotemporal emulator trained on physics-based simulations. Similar work on oceanic spa-
tiotemporal emulators (107, 108) suggests that coupled climate emulators might start to emerge
as well.
ML spatiotemporal emulators have shown even more promise in simulating components of the
climate system whose physics are less well understood. For the cryosphere, deep learning-based
emulators for seasonal sea ice prediction have been found to outperform state-of-the-art physics-
based dynamical models in terms of forecast accuracy (109–111), with a lead time of a few months.
Some of these sea ice emulators capture atmospheric-ice-ocean interactions by training with ap-
propriate climate variables (109, 111). Because these emulators were trained directly on sea ice ob-
servational data, they learn the atmospheric-ice-ocean interactions that are incompletely param-
eterized in the physics-based dynamical models, thereby correcting the model’s structural error.

3.3. Physics-Informed Machine Learning (PIML)


Despite ML’s ability to emulate weather (Section 3.2) and parameterize SGS processes when
trained on high-resolution simulations or observations (Section 3.1), there is no guarantee that its
predictions are physically sound (e.g., conservation of mass, energy). This physical inconsistency is
problematic and makes long-term climate projections using ML-based emulators and ML-based
parameterizations untrustworthy (Figure 2g, left panel; see the sidebar titled Challenges and
Opportunities of Machine Learning–Based Emulators). Incorporating physics constraints such
as conservation laws, symmetries, and more broadly, equivariances (defined below), has been
shown to alleviate a number of challenges such as instabilities and learning in the small-data
regime—Kashinath et al. (114) review earlier work in PIML for weather and climate modeling.

3.3.1. Conservation laws. Various methods exist for incorporating conservation laws into ML
models, such as embedding them in the loss function [e.g., PINNs (58); Section 2.2.4.2] or other
components of the ML architecture. For instance, Beucler et al. (115) demonstrated that conserv-
ing quantities like mass and energy can be enforced as hard constraints within the NN architecture.
Their architecture-constrained NN, trained as an SGS parameterization of moist convection,
significantly improved simulated climate.
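One simple way to realize such a hard constraint is a projection layer that corrects the raw NN output so that a conserved total is preserved exactly by construction. The sketch below (our illustration; the conserved "budget" is a stand-in, and this is not the specific construction of Reference 115) enforces that the output sums to the same total as the input:

```python
# Hard conservation constraint as a final layer: project the raw NN output
# so that the sum over the output (a stand-in for a column mass/energy
# budget) exactly matches the conserved total of the input.
import torch

class ConservingHead(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Linear(dim, dim)

    def forward(self, x):
        y = self.net(x)
        # Shift y uniformly so sum(y) == sum(x) holds by construction.
        correction = (x.sum(dim=-1, keepdim=True)
                      - y.sum(dim=-1, keepdim=True)) / y.shape[-1]
        return y + correction

model = ConservingHead(16)
x = torch.randn(4, 16)
out = model(x)
print(torch.allclose(out.sum(-1), x.sum(-1), atol=1e-5))  # True
```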

CHALLENGES AND OPPORTUNITIES OF MACHINE LEARNING–BASED EMULATORS
Emulators can address the climate response to a particular emission scenario and internal variability uncertainty (defined in the sidebar titled Uncertainties in Climate Projections). However, emulators are at best as accurate as the data they are trained on, which may still contain model uncertainty; this may be partially overcome by training with data from very high-resolution yet very expensive simulations, such as the emerging global 1-km climate simulations. Nonetheless, major questions about the stability and physical consistency of the trained spatiotemporal emulators need to be addressed. For example, ML weather forecast models have been shown to produce unstable or unphysical atmospheric circulations beyond 10 days, poorly represent small-scale processes (15, 112), and fail to reproduce the chaotic behavior of weather (113). Potential solutions to address these challenges include incorporating physical constraints into ML models (Section 3.3) and developing a deeper understanding of the different sources of error in these models (Sections 4.1 and 4.2).

3.3.2. Symmetries and equivariances. Incorporating symmetries and equivariances has also shown advantages, particularly in the small-data regime. For a variable x, the nonlinear function g is equivariant under transformation A if Ag(x) = g(Ax). For example, by incorporating various symmetries (e.g., scale equivariance, rotational equivariance) into convolutional neural networks (CNNs) trained on turbulence data from previous time steps, the CNNs generalize well to future time steps (8) (Figure 2h). Enforcing rotational equivariance through capsule NNs, CNNs, or customized latent spaces has improved ML-based predictions of large-scale weather patterns (116) and turbulent flows (8, 117).
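The equivariance condition Ag(x) = g(Ax) can be tested directly. The sketch below (our illustration) checks 90° rotational equivariance of a plain CNN layer; generic convolutions are translation equivariant but not rotation equivariant, which is why specialized architectures or data augmentation are needed:

```python
# Check rotational equivariance A g(x) = g(A x) for a CNN, where A is a
# 90-degree rotation. A plain CNN is translation-equivariant but only
# approximately rotation-equivariant, which this test makes visible.
import torch

g = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
x = torch.randn(1, 1, 32, 32)

rot = lambda t: torch.rot90(t, k=1, dims=(-2, -1))   # the transformation A
err = (rot(g(x)) - g(rot(x))).abs().max()
print(f"equivariance violation: {err:.3e}")          # ~0 only if the
                                                     # kernel is symmetric
```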
3.3.3. Spectrum information. Including information about the Fourier spectrum of geophys-
ical turbulence in the loss function has been shown to aid in learning small scales and reducing
spectral bias, thereby improving the stability and physical consistency of ML-based emulators
(112). See Section 4.1 for further discussions.
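A hedged sketch of one way such spectrum information can enter the loss (the weighting and the use of amplitude spectra here are illustrative choices, not the exact formulation of Reference 112):

```python
# Spectral regularization sketch: penalize mismatch between the Fourier
# amplitude spectra of prediction and truth, in addition to the MSE.
import torch

def spectral_loss(pred, truth, weight=0.1):
    mse = ((pred - truth) ** 2).mean()
    # Amplitude spectra along the last (e.g., zonal) dimension.
    spec_pred = torch.fft.rfft(pred, dim=-1).abs()
    spec_true = torch.fft.rfft(truth, dim=-1).abs()
    return mse + weight * ((spec_pred - spec_true) ** 2).mean()

pred, truth = torch.randn(8, 64, 128), torch.randn(8, 64, 128)
print(spectral_loss(pred, truth))
```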

4. CHALLENGES AND PROMISES


4.1. Quantifying Uncertainties of Machine Learning Models
Broadly speaking, there are two sources of error in ML-based models: errors in the training data
and the epistemic uncertainty of the ML model. The errors in data can stem from sparsity and
measurement noise, which are particularly relevant for observations, or from errors in simulation
data, which can arise from numerical errors and inaccurate physics-based equations. The epis-
temic uncertainty of the ML model arises from different sources, such as model architecture and
hyperparameters (118). Some ML techniques, such as GPR, provide rigorous estimates of uncertainty (see, e.g., References 26 and 105 for climate applications). However, for deep learning, UQ (uncertainty quantification) is more complicated and the subject of extensive research (for recent review papers in the context of scientific applications, see References 118 and 119).

Spectral bias: neural networks' tendency to preferentially capture certain frequencies of the training data

Epistemic uncertainty: deviation of the ML model from the data; can be due to approximation, optimization, and generalization errors

Understanding the sources of errors in ML models can improve their stability, physical consistency, and reliability. For example, in simulations coupled with ML-based parameterization, errors from the ML model are propagated into the simulations and vice versa, potentially leading to instabilities and nonphysical behavior. Similarly, errors in a spatiotemporal emulator can accumulate and destabilize the emulation. Because we cannot directly estimate accuracy during inference (as we do not have access to the ground truth), the best approach is to estimate the uncertainty of the ML model's output, as this uncertainty may be indicative of its accuracy. Here, we provide examples from climate science for UQ of NNs, and we also discuss two impactful sources of epistemic uncertainty related to representation error (e.g., spectral bias) and data imbalance (e.g., rare extreme events).
4.1.1. Quantifying epistemic uncertainty. A variety of techniques from the ML literature
have been employed for UQ of NNs in climate applications. For instance, deep ensembles (24,



120–122) and Bayesian NNs (24, 123) are used to assess the mean and spread of predictions as
well as the faithfulness of the NN optimization. In Reference 124, these two methodologies were
combined to reveal the consequences of architecture choice, as determined by UQ and the ability
to approximate the physical system. Other techniques, such as variational autoencoders, dropout, and abstention, have also been explored (123, 125–127). See References 128 and 123 for detailed discussions.

Climate-invariant machine learning: machine learning that utilizes relationships that stay the same across different climates to improve generalization
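The deep-ensemble recipe mentioned above is straightforward to sketch: train several networks from independent random initializations on the same data and use the spread of their predictions as an epistemic uncertainty estimate (our illustration; the networks below are untrained, for structure only):

```python
# Deep-ensemble UQ sketch: the ensemble mean is the prediction and the
# ensemble standard deviation is an (epistemic) uncertainty estimate.
import torch

def make_net(seed):
    torch.manual_seed(seed)           # independent random initialization
    return torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(),
                               torch.nn.Linear(64, 1))

ensemble = [make_net(s) for s in range(5)]   # each member would be
                                             # trained on the same data
x = torch.randn(32, 10)
with torch.no_grad():
    preds = torch.stack([net(x) for net in ensemble])
mean, spread = preds.mean(0), preds.std(0)   # prediction and uncertainty
```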
4.1.2. Spectral bias. Another example of epistemic error in ML models affecting climate
applications is the spectral bias (129) (or frequency principle; 130). Namely, NNs learn to represent the
large scales much more easily than small scales, which can pose challenges for multiscale climate
problems. Figure 4a shows the spectrum of the one-time-step (∼6 or 12 h) prediction of upper-
level wind from a few state-of-the-art ML weather emulators. Although the predictions exhibit the
correct spectrum for up to zonal Fourier wave numbers of ∼30 (scales of 40,000 km to 500 km),
smaller scales (from 500 km to 25 km) are poorly learned. It is noteworthy that these predictions
all boast ∼99% accuracy based on anomaly pattern correlation. These small-scale errors propagate
to larger scales after about 10 days, and eventually the predictions either blow up or become
unphysical (112). The same behavior is observed in simpler tasks such as reconstruction of at-
mospheric boundary layer turbulence (Figure 4b), time-stepping prediction for quasigeostrophic
(QG) turbulence (Figure 4c) and its reconstruction (Figure 4d), or even a simple 1D function
(Figure 4f ). Promising solutions include Fourier regularization of the loss function (Figure 4c,
modified from Reference 112) and random Fourier features (134, 135) (Figure 4d, modified from
Reference 76). Superposing small NNs (132, 133) via the multistage NN approach also reduces
spectral bias substantially compared with vanilla NNs (Figure 4e, modified from Reference 132).
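The random Fourier feature remedy admits a compact sketch: inputs are projected through fixed random frequencies before entering the network, biasing it toward the higher frequencies it would otherwise neglect (134, 135). The scale sigma and feature count below are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def make_fourier_features(d, n_features=128, sigma=10.0, seed=0):
    """Return a map x -> [cos(2*pi*x@B), sin(2*pi*x@B)], B ~ N(0, sigma^2).

    d: input dimension. Larger sigma emphasizes higher frequencies.
    The matrix B is sampled once and kept fixed during training.
    """
    B = np.random.default_rng(seed).normal(scale=sigma, size=(d, n_features))

    def features(x):  # x: (n_samples, d)
        proj = 2.0 * np.pi * x @ B
        return np.concatenate([np.cos(proj), np.sin(proj)], axis=1)

    return features
```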

4.1.3. Rare extreme events. Another example of epistemic error relates to rare events (e.g.,
heat waves, hurricanes, ice-shelf collapse, ocean circulation collapse). Predicting these rare events
is crucial, but they are often underrepresented or entirely absent from the training set, lead-
ing to significant data imbalance. Addressing data imbalance and improving the learning of rare
events is an active area of research. Common approaches such as resampling (136, 137), using
weighted loss function (123, 138), and learning the causal relationship that drives the rare behav-
ior (124) have shown promise. Innovative approaches, such as combining ML-based emulators
with mathematical tools for rare events (139, 140), may enable the learning of the rarest events.
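As one concrete instance of a weighted loss, the sketch below up-weights training samples whose targets exceed a threshold (e.g., a high temperature percentile for heat waves); the threshold and weight are tunable assumptions rather than values from References 123 and 138.

```python
import torch

def tail_weighted_mse(pred, target, threshold, w_tail=10.0):
    """MSE that up-weights rare, extreme samples to counter data imbalance.

    Samples with target > threshold receive weight w_tail; all others
    receive weight 1. Both knobs must be tuned per application.
    """
    weights = torch.where(target > threshold,
                          torch.full_like(target, w_tail),
                          torch.ones_like(target))
    return torch.mean(weights * (pred - target) ** 2)
```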

4.2. Nonstationarity: Out-of-Distribution Error of Machine Learning Models


Climate change is inherently nonstationary (28): The mean state and its variability change over
time. This poses a major challenge for applications of ML models to climate-change projections.
For instance, ML-based models trained on data from the current climate may not perform well for
a warmer future climate with higher greenhouse gas concentrations. Studies have already demon-
strated unstable or unphysical simulations resulting from an NN’s inability to extrapolate beyond
its training data (93, 141, 142). The nonstationarity problem raises new questions: How can we
ensure that the prediction task is within the distribution of the training data? How do we ensure
that the ML model leverages information that is climate-invariant? Would a hybrid approach,
coupling physics- and ML-based models, improve long-term climate simulations?
Examples of strategies for dealing with nonstationarity and out-of-distribution generalization
include (a) incorporating physical knowledge (142) and (b) transfer learning (87, 141). As an
example of incorporating physical knowledge, the recently proposed climate-invariant machine
learning (142) learns mappings between variables of interest that are universal across climates.

Climate-invariant machine learning: machine learning that utilizes relationships that stay the same across different climates to improve generalization.

[Figure 4 (image): log–log energy spectra for six cases. (a) AI weather models after one time step Δt: PanguWeather, FourCastNet, FourCastNet-V2, and GraphCast versus ERA5 (truth), plotted against total wave number. (b) Atmospheric boundary layer: train/test and unconditional/conditional reconstructions, with a −5/3 reference slope. (c) QG dynamics: ground truth, U-NET, and spectral loss. (d) QG with random Fourier features: truth, NN, and NN+RFF. (e) 2D Navier–Stokes (multistage NN): ground truth, one NN, and multistage NN. (f) 1D time series: spectrum of a sharply peaked 1D function u(x). See the caption below.]
Figure 4
The spectral bias of neural networks (129, 130) can be widely observed in climate applications and can cause
major challenges such as instabilities. (a) State-of-the-art ML-based weather emulator predictions after one
time step (based on results from Reference 112, courtesy of Qiang Sun). (b) Atmospheric boundary layer
turbulence reconstruction (131). (c,d) QG turbulence prediction after one time step (112) (data provided by
Ashesh Chattopadhyay) or reconstruction (based on results from Reference 76, data provided by Rambod
Mojgani). (e) Reconstruction of a 2D flow field (132) using the multistage NN (133). ( f ) Reconstruction of a
1D function with sharp peaks, difficult to fit with vanilla NNs (data provided by Yongji Wang). Panels a, b,
and e adapted with permission from Reference 112 (CC BY 4.0), Reference 131 (CC BY 4.0), and
Reference 132 (CC BY 4.0), respectively. Panels c and d adapted with permission from Reference 76
(CC BY 4.0). Abbreviations: 1D, one-dimensional; 2D, two-dimensional; AI, artificial intelligence;
ERA5, the 5th generation ECMWF atmospheric reanalysis of the global climate; ML, machine learning;
NN, neural network; QG, quasigeostrophic; RFF, random Fourier feature.

This study showed promising offline results of data-driven parameterization of moist convection
across a range of cold to warm climates once temperature, relative humidity, and latent heating
were properly transformed. This approach leverages physical insight into the climate system. The main
challenge is finding the appropriate transformations, which are easier to identify for thermodynamic
variables than for dynamic ones (e.g., wind).
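To illustrate the flavor of such transformations, the sketch below maps specific humidity to relative humidity using a textbook Tetens-type saturation formula; it is a simplified stand-in for, not a reproduction of, the transformations in Reference 142.

```python
import numpy as np

def to_relative_humidity(q, T, p):
    """Convert specific humidity q (kg/kg) to relative humidity.

    T: temperature (K); p: pressure (Pa). Uses a Tetens-type
    approximation for saturation vapor pressure over liquid water.
    Relative humidity varies less across climates than q itself,
    making it a more climate-invariant input.
    """
    e_sat = 610.94 * np.exp(17.625 * (T - 273.15) / (T - 30.11))  # Pa
    q_sat = 0.622 * e_sat / (p - 0.378 * e_sat)
    return q / q_sat
```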



Transfer learning, which is a common framework in ML for addressing out-of-distribution
generalization, involves training an ML model for a given system (e.g., the current climate) and
then retraining it with a much smaller amount of data from a new system (e.g., a warmer cli-
mate). The retrained ML model could then perform better for the new system. Several studies
have demonstrated the potential of transfer learning to address significant changes in parame-
ters θ (e.g., a 100-times increase in Reynolds number in geophysical turbulence) or forcing (87,
141). The key challenge with transfer learning is obtaining reliable data for retraining. In climate-
change prediction, we must rely on simulations, as observations from the future are unavailable.
Libraries of high-resolution global and regional simulations that strategically sample from a range
of climates are emerging, providing a valuable source of training and retraining data (143–145).
However, the range of scenarios explored with Earth system models is typically restricted
to plausible futures. As ML techniques become more mainstream in climate studies, it
will be important to simulate a wider range of future climates to expand the range of training data,
especially in extreme regimes.
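A minimal transfer-learning sketch, assuming a torch.nn.Sequential model and a small retraining dataset from the new climate, is shown below; which layers to retrain and the hyperparameters are application-dependent choices (see References 87 and 141), not prescriptions.

```python
import torch

def retrain_for_new_system(model, loader_new, lr=1e-4, epochs=10):
    """Freeze all layers but the last, then retrain on the new system.

    model: a torch.nn.Sequential trained on the original system.
    loader_new: DataLoader with a small amount of data from the new
    system (e.g., a warmer climate).
    """
    for p in model.parameters():
        p.requires_grad = False
    for p in model[-1].parameters():
        p.requires_grad = True
    opt = torch.optim.Adam(model[-1].parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader_new:
            opt.zero_grad()
            loss = torch.nn.functional.mse_loss(model(x), y)
            loss.backward()
            opt.step()
    return model
```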

4.3. Understanding What Machine Learning Is Learning


Understanding what an ML model has learned, and how and why, is essential for climate applica-
tions, especially to gain trust in and further improve such models. The use of explainable artificial
intelligence (XAI) techniques from the ML community for climate applications has gained pop-
ularity in recent years (e.g., 120, 146–148). For review papers featuring XAI in climate science,
see References 149–152. A core strategy for many XAI techniques is to identify which parts of the
inputs of an NN (e.g., regions of the atmosphere or ocean) are used to predict a specific output.
For example, Toms et al. (153) illustrated that XAI could infer scientifically meaningful information
regarding climate patterns known as El Niño events. In Labe & Barnes (154), XAI was used to
assist comparisons of climate models over the Arctic. XAI is also often used to determine whether the
ML model has gained skill by detecting meaningful patterns in the training data rather than spurious
correlations. Sonnewald & Lguensat (120) used XAI within an ensemble of NNs to determine their ML
model's accuracy (Figure 5a,b) by assessing conformance with theory. The ML model's task was to
predict ocean physical regimes, i.e., dominant balances between terms in the equation governing
the flow. Similar equation-determining frameworks appear in References 24, 121, and 124.
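Many XAI methods reduce to attributing the output to input locations; the sketch below uses plain input gradients as a minimal stand-in for the layer-wise relevance propagation used in THOR (120), which is a more sophisticated attribution scheme.

```python
import torch

def input_relevance(model, x):
    """Gradient-based saliency: |d output / d input| per input location.

    x: input tensor (e.g., maps of surface fields). Large values mark
    the inputs to which the model's prediction is most sensitive.
    """
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.abs()
```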
Standard techniques for analyzing physical systems can also be applied to understand NNs.
For example, Fourier analysis has provided insight into NNs’ learning process (129, 130). In
the context of data-driven modeling of geophysical turbulence (141), Fourier analyses of CNNs
revealed what they have learned. The convolution kernels (with over 1 million learnable param-
eters) were shown to fall into just a few classes: low- and high-pass filters, and Gabor wavelets
(Figure 5c). These findings align well with prior work that used wavelets for turbulence model-
ing (155), and even more so with theoretical ML studies on the need for such spectral filters for
learning multiscale, localized data (156, 157). More recent work has found this approach useful
in interpreting deep NNs in climate applications by examining concepts from physics and ML
together (100, 112).
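The kernel-spectrum analysis itself is straightforward to reproduce in outline: zero-pad each learned convolution kernel and inspect its 2D Fourier amplitude, as sketched below. This is an illustrative recipe in the spirit of Reference 141; the padding size is an assumption.

```python
import numpy as np
import torch

def kernel_spectra(conv_layer, size=64):
    """Fourier amplitude spectra of a torch Conv2d layer's kernels.

    Each (kh, kw) kernel is zero-padded to (size, size) before the FFT,
    so low-, high-, and band-pass structure is visible at fine resolution.
    Returns an array of shape (out_ch * in_ch, size, size).
    """
    W = conv_layer.weight.detach().cpu().numpy()  # (out_ch, in_ch, kh, kw)
    kernels = W.reshape(-1, *W.shape[-2:])
    padded = np.zeros((kernels.shape[0], size, size))
    padded[:, : W.shape[-2], : W.shape[-1]] = kernels
    return np.abs(np.fft.fftshift(np.fft.fft2(padded), axes=(-2, -1)))
```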

5. SUMMARY AND OUTLOOK


Numerous scientific discoveries and rigorous theories have been prompted by first identifying
empirical relationships. We summarized in Section 2.1 how ML can facilitate this, for
example, by accelerating the search for patterns in climate data that can be used to derive physical
understanding. We also summarized in Section 2.2 the promise ML has shown for finding closed-
form equations for poorly understood climate processes. In many aspects of the climate system,

[Figure 5 (image): (a) Schematic of the forward pass (inputs such as depth H, SSH, and wind stress curl ∇×τ; output regime map) and of relevance propagation (heatmap from negative to positive relevance). (b) "THOR method follows fluid dynamics": the depth-integrated vorticity balance 0 = ∇·(fU) − ∇×(p_b∇H) + ∇×τ − ∇×A + ∇×B, with terms labeled advection, bottom pressure torque, wind and bottom stress, lateral viscosity, and nonlinear torque. (c) QG turbulence simulations: the state z and SGS term Π, with Fourier spectra of convolution kernels over (kx, ky). See the caption below.]

Figure 5
Understanding what ML learns. Panels a and b illustrate how the THOR method ensures the input data necessary for the ML model to
demonstrate its learning of physics is present (120). (a, top) Training an NN to predict sections of the ocean dominated by different
balances in equation terms describing the flow (colors in output) using related surface fields (e.g., wind). (a, bottom) Looking backward
using XAI to see which parts of the input the NN deemed relevant (blue, not relevant; red, relevant). (b) For the pink section in the North
Atlantic, only two equation terms are relevant (red boxes), and relevances show conformance in the two maps below, e.g., where the
mountain range (closed black lines) in depth (H) gives negative relevance. (c) The two leftmost panels show examples of the state z and
SGS term Π (defined in Figure 2) from two setups of geophysical turbulence, separated by the dotted line, that differ in forcing scale
and dynamics. The right-side panels show examples of the Fourier spectra of convolutional kernels of NNs trained as ML-based SGS
parameterizations NN(z̄, γ) = Π. The Fourier analysis shows the emergence of low-pass, high-pass, and band-pass Gabor filters (141).
Panels a and b adapted with permission from Reference 120 (CC BY 4.0). Panel c adapted with permission from Reference 141
(CC BY 4.0). Abbreviations: ML, machine learning; NN, neural network; QG, quasigeostrophic; SGS, subgrid-scale; SSH, sea-surface
height; THOR, Tracking global Heating with Ocean Regimes; XAI, explainable artificial intelligence.

we do not yet have accurate process-level models to describe the system (e.g., sea ice rheology and
cloud microphysics). The increasing amount of observational data offers exciting opportunities
for both equation and knowledge discovery to improve the fundamental understanding of climate
physics.
On the other hand, ML can be used as tools to improve simulations. ML models can be
coupled with traditional physics-based models and used to parameterize processes for which



closed-form equations are not yet available (Section 3.1). ML has led to breakthroughs in
weather forecasting, a task that was not widely expected to be possible just a couple of years ago.
We discussed the challenges scientists need to overcome when moving forward from weather
forecasting to climate prediction (Sections 3.2 and 4).
ML is advancing rapidly, and new techniques and concepts that have shown great promise in
other fields are now being quickly adopted in climate science. Notable examples, as of this writ-
ing, include diffusion models (e.g., 25, 158, 159), LLMs (e.g., 160), and foundation models (see,
e.g., Reference 161 for a discussion of their design and implementation and Reference 162 for
a downstream task involving gravity waves). Progress in climate modeling could greatly benefit
from collaborations among the ML, climate sciences, and mathematics communities. For exam-
ple, the numerical analysis of differential equations and the advent of digital computers played
a key role in starting the field of numerical weather and climate prediction (163). Developing
similar rigorous tools, by closely combining methods from climate physics, ML theory, and nu-
merical analysis, can potentially help with building stable, accurate, and trustworthy ML-based
models.

DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that
might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS
We thank Mingjing Tong, Oliver Dunbar, Jinlong Wu, and Duncan Watson-Parris for their
helpful discussions regarding data assimilation methods, GPR with EKI, sparse learning, and
emulators, respectively. We are grateful for the valuable general feedback from Andre Souza
and Janni Yuval on this article. We also thank Qiang Sun, Ashesh Chattopadhyay, Rambod
Mojgani, and Yongji Wang for helping to remake the spectral bias figure. C.-Y.L., R.F., P.H., and
A.S. acknowledge the National Science Foundation for funding via grants DMS-2245228, AGS-
2426087, OAC-2005123, and OAC-2004492, respectively. R.F., P.H., and A.S. also acknowledge
funding from Schmidt Sciences through the Virtual Earth System Research Institute.

LITERATURE CITED
1. Keisler R. 2022. arXiv:2202.07575 [physics.ao-ph]
2. Pathak J, Subramanian S, Harrington P, Raja S, Chattopadhyay A, et al. 2022. arXiv:2202.11214
[physics.ao-ph]
3. Bi K, Xie L, Zhang H, Chen X, Gu X, Tian Q. 2023. Nature 619:533–38
4. Lam R, Sanchez-Gonzalez A, Willson M, Wirnsberger P, Fortunato M, et al. 2023. Science 382:1416–21
5. Chen L, Zhong X, Zhang F, Cheng Y, Xu Y, et al. 2023. NPJ Clim. Atmos. Sci. 6:190
6. Sonnewald M, Wunsch C, Heimbach P. 2019. Earth Space Sci. 6:784–94
7. Xiao Q, Balwada D, Jones CS, Herrero-González M, Smith KS, Abernathey R. 2023. J. Adv. Model. Earth
Syst. 15:e2023MS003709
8. Wang R, Walters R, Yu R. 2021. Paper presented at the International Conference on Learning
Representations (ICLR) 2021, Virtual Event, Austria, May 3–7
9. Yuval J, O’Gorman PA. 2020. Nat. Commun. 11:3295
10. Watt-Meyer O, Dresdner G, McGibbon J, Clark SK, Henn B, et al. 2023. arXiv:2310.02074 [physics.ao-ph]
11. Duncan JP, Wu E, Golaz JC, Caldwell PM, Watt-Meyer O, et al. 2024. Mach. Learn. Comput.
1(3):e2024JH000136
12. Hornik K, Stinchcombe M, White H. 1989. Neural Netw. 2:359–66

www.annualreviews.org • ML for Climate Physics and Simulations 361


13. Chen T, Chen H. 1995. IEEE Trans. Neural Netw. 6:911–17
14. Ravuri S, Lenc K, Willson M, Kangin D, Lam R, et al. 2021. Nature 597:672–77
15. Ben Bouallègue Z, Clare MC, Magnusson L, Gascon E, Maier-Gerber M, et al. 2024. Bull. Am. Meteorol.
Soc. 105:E864–83
16. Ham YG, Kim JH, Luo JJ. 2019. Nature 573:568–72
17. Rasp S, Hoyer S, Merose A, Langmore I, Battaglia P, et al. 2024. J. Adv. Model. Earth Syst.
16:e2023MS004019
18. Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018. Phys. Rev. Lett. 120:024102
19. Dueben PD, Bauer P. 2018. Geosci. Model Dev. 11:3999–4009
20. Weyn JA, Durran DR, Caruana R. 2019. J. Adv. Model. Earth Syst. 11:2680–93
21. Chattopadhyay A, Nabizadeh E, Hassanzadeh P. 2020. J. Adv. Model. Earth Syst. 12:e2019MS001958
22. Rasp S, Dueben PD, Scher S, Weyn JA, Mouatadid S, Thuerey N. 2020. J. Adv. Model. Earth Syst.
12:e2020MS002203
23. Rasp S, Thuerey N. 2021. J. Adv. Model. Earth Syst. 13:e2020MS002405
24. Clare MCA, Jamil O, Morcrette CJ. 2021. Q. J. R. Meteorol. Soc. 147:4337–57
25. Price I, Sanchez-Gonzalez A, Alet F, Ewalds T, El-Kadi A, et al. 2023. arXiv:2312.15796 [cs.LG]
26. Watson-Parris D. 2021. Philos. Trans. R. Soc. A 379:20200098
27. Schneider T, Behera S, Boccaletti G, Deser C, Emanuel K, et al. 2023. Nat. Climate Change 13:887–89
28. Palmer TN. 1999. J. Climate 12:575–91
29. Held IM. 2005. Bull. Am. Meteorol. Soc. 86:1609–14
30. Walker G. 1928. Q. J. R. Meteorol. Soc. 54:79–87
31. Corti S, Molteni F, Palmer T. 1999. Nature 398:799–802
32. Thompson DW, Solomon S. 2002. Science 296:895–99
33. Monahan AH, Fyfe JC, Ambaum MH, Stephenson DB, North GR. 2009. J. Climate 22:6501–14
34. Page J, Brenner MP, Kerswell RR. 2021. Phys. Rev. Fluids 6:034402
35. Lusch B, Kutz JN, Brunton SL. 2018. Nat. Commun. 9:4950
36. Shamekh S, Lamb KD, Huang Y, Gentine P. 2023. PNAS 120:e2216158120
37. Souza AN. 2023. arXiv:2304.03362 [physics.flu-dyn]
38. Geogdzhayev G, Souza AN, Ferrari R. 2024. Phys. D Nonlinear Phenom. 462:134107
39. Wang X, Slawinska J, Giannakis D. 2020. Sci. Rep. 10:2636
40. Rowley CW, Mezić I, Bagheri S, Schlatter P, Henningson DS. 2009. J. Fluid Mech. 641:115–27
41. Reichstein M, Camps-Valls G, Stevens B, Jung M, Denzler J, et al. 2019. Nature 566:195–204
42. Landy JC, Dawson GJ, Tsamados M, Bushuk M, Stroeve JC, et al. 2022. Nature 609:517–22
43. Martin SA, Manucharyan G, Klein P. 2024. Geophys. Res. Lett. 51(17):e2024GL110059
44. Rezvanbehbahani S, Stearns LA, Keramati R, Shankar S, van der Veen C. 2020. Commun. Earth Environ.
1:31
45. Surawy-Stepney T, Hogg AE, Cornford SL, Davison BJ. 2023. Nat. Geosci. 16:37–43
46. Lai CY, Kingslake J, Wearing MG, Chen PHC, Gentine P, et al. 2020. Nature 584:574–78
47. Lorenz EN. 1963. J. Atmos. Sci. 20:130–41
48. Iglesias MA, Law KJ, Stuart AM. 2013. Inverse Probl. 29:045001
49. Cleary E, Garbuno-Inigo A, Lan S, Schneider T, Stuart AM. 2021. J. Comput. Phys. 424:109716
50. Dunbar OR, Garbuno-Inigo A, Schneider T, Stuart AM. 2021. J. Adv. Model. Earth Syst.
13:e2020MS002454
51. Lopez-Gomez I, Christopoulos C, Langeland Ervik HL, Dunbar OR, Cohen Y, Schneider T. 2022.
J. Adv. Model. Earth Syst. 14:e2022MS003105
52. Mansfield L, Sheshadri A. 2022. J. Adv. Model. Earth Syst. 14:e2022MS003245
53. Souza AN, Wagner G, Ramadhan A, Allen B, Churavy V, et al. 2020. J. Adv. Model. Earth Syst.
12:e2020MS002108
54. Evensen G. 1994. J. Geophys. Res. Oceans 99:10143–62
55. Houtekamer PL, Zhang F. 2016. Mon. Weather Rev. 144:4489–532
56. Kovachki NB, Stuart AM. 2019. Inverse Probl. 35:095005
57. Watson-Parris D, Williams A, Deaconu L, Stier P. 2021. Geosci. Model Dev. 14:7659–72



58. Raissi M, Perdikaris P, Karniadakis GE. 2019. J. Comput. Phys. 378:686–707
59. Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L. 2021. Nat. Rev. Phys. 3:422–40
60. Tarantola A. 2005. Inverse Problem Theory and Methods for Model Parameter Estimation. Philadelphia: SIAM
61. Kalnay E. 2003. Atmospheric Modeling, Data Assimilation and Predictability. Cambridge, UK: Cambridge
Univ. Press
62. Geer AJ. 2021. Philos. Trans. R. Soc. A 379:20200089
63. Brajard J, Carrassi A, Bocquet M, Bertino L. 2021. Philos. Trans. R. Soc. A 379:20200086
64. Cheng S, Quilodrán-Casas C, Ouala S, Farchi A, Liu C, et al. 2023. IEEE/CAA J. Automat. Sin. 10:1361–
87
65. Farchi A, Laloyaux P, Bonavita M, Bocquet M. 2021. Q. J. R. Meteorol. Soc. 147:3067–84
66. Brajard J, Carrassi A, Bocquet M, Bertino L. 2020. J. Comput. Sci. 44:101171
67. Eusebi R, Vecchi GA, Lai CY, Tong M. 2024. Commun. Earth Environ. 5:8
68. Wang Y, Lai CY, Prior D, Cowen-Breen C. 2025. Science https://doi.org/10.1126/science.adp3300
69. Lu X, Wang X, Tong M, Tallapragada V. 2017. Mon. Weather Rev. 145:4877–98
70. Kochkov D, Yuval J, Langmore I, Norgaard P, Smith J, et al. 2024. Nature 632:1060–66
71. Jouvet G, Cordonnier G. 2023. J. Glaciol. 69:1941–55
72. Schmidt M, Lipson H. 2009. Science 324:81–85
73. Brunton SL, Proctor JL, Kutz JN. 2016. PNAS 113:3932–37
74. Schneider T, Stuart AM, Wu JL. 2022. J. Comput. Phys. 470:111559
75. Lang M, Jan Van Leeuwen P, Browne P. 2016. Tellus A Dyn. Meteorol. Oceanogr. 68:29012
76. Mojgani R, Chattopadhyay A, Hassanzadeh P. 2024. J. Adv. Model. Earth Syst. 16(3):e2023MS004033
77. Chen N, Zhang Y. 2023. Phys. D Nonlinear Phenom. 449:133743
78. Zanna L, Bolton T. 2020. Geophys. Res. Lett. 47:e2020GL088376
79. Anstey JA, Zanna L. 2017. Ocean Model. 112:99–111
80. Jakhar K, Guan Y, Mojgani R, Chattopadhyay A, Hassanzadeh P. 2024. J. Adv. Model. Earth Syst.
16(7):e2023MS003874
81. Koza JR. 1994. Stat. Comput. 4:87–112
82. Chen Y, Luo Y, Liu Q, Xu H, Zhang D. 2022. Phys. Rev. Res. 4:023174
83. Grundner A, Beucler T, Gentine P, Eyring V. 2024. J. Adv. Model. Earth Syst. 16:e2023MS003763
84. Ross A, Li Z, Perezhogin P, Fernandez-Granda C, Zanna L. 2023. J. Adv. Model. Earth Syst.
15:e2022MS003258
85. Smagorinsky J. 1963. Mon. Weather Rev. 91:99–164
86. Hawkins E, Sutton R. 2009. Bull. Am. Meteorol. Soc. 90:1095–108
87. Guan Y, Chattopadhyay A, Subel A, Hassanzadeh P. 2022. J. Comput. Phys. 458:111090
88. Bolton T, Zanna L. 2019. J. Adv. Model. Earth Syst. 11:376–99
89. Sane A, Reichl BG, Adcroft A, Zanna L. 2023. J. Adv. Model. Earth Syst. 15:e2023MS003890
90. Gentine P, Pritchard M, Rasp S, Reinaudi G, Yacalis G. 2018. Geophys. Res. Lett. 45:5742–51
91. Gentine P, Eyring V, Beucler T. 2021. In Deep Learning for the Earth Sciences: A Comprehensive Approach
to Remote Sensing, Climate Science and Geosciences, ed. G Camps-Valls, D Tuia, XX Zhu, M Reichstein,
pp. 307–14. Hoboken, NJ: Wiley
92. Yuval J, O’Gorman PA. 2023. J. Adv. Model. Earth Syst. 15:e2023MS003606
93. Rasp S, Pritchard MS, Gentine P. 2018. PNAS 115:9684–89
94. Grundner A, Beucler T, Gentine P, Iglesias-Suarez F, Giorgetta MA, Eyring V. 2022. J. Adv. Model.
Earth Syst. 14:e2021MS002959
95. Arcomano T, Szunyogh I, Wikner A, Hunt BR, Ott E. 2023. Geophys. Res. Lett. 50:e2022GL102649
96. Watt-Meyer O, Brenowitz ND, Clark SK, Henn B, Kwa A, et al. 2024. J. Adv. Model. Earth Syst.
16:e2023MS003668
97. Matsuoka D, Watanabe S, Sato K, Kawazoe S, Yu W, Easterbrook S. 2020. Geophys. Res. Lett.
47:e2020GL089436
98. Espinosa ZI, Sheshadri A, Cain GR, Gerber EP, DallaSanta KJ. 2022. Geophys. Res. Lett.
49:e2022GL098174
99. Hardiman SC, Scaife AA, van Niekerk A, Prudden R, Owen A, et al. 2023. Artif. Intel. Earth Syst.
2:e220081



100. Pahlavan HA, Hassanzadeh P, Alexander MJ. 2024. Geophys. Res. Lett. 51:e2023GL106324
101. Frezat H, Le Sommer J, Fablet R, Balarac G, Lguensat R. 2022. J. Adv. Model. Earth Syst.
14:e2022MS003124
102. McNally A, Lessig C, Lean P, Boucher E, Alexe M, et al. 2024. arXiv:2407.15586 [physics.ao-ph]
103. Beusch L, Gudmundsson L, Seneviratne SI. 2020. Earth Syst. Dyn. 11:139–59
104. Tebaldi C, Snyder A, Dorheim K. 2022. Earth Syst. Dyn. 13:1557–609
105. Watson-Parris D, Rao Y, Olivié D, Seland Ø, Nowack P, et al. 2022. J. Adv. Model. Earth Syst.
14:e2021MS002954
106. Hersbach H, Bell B, Berrisford P, Hirahara S, Horányi A, et al. 2020. Q. J. R. Meteorol. Soc. 146:1999–2049
107. Bire S, Lütjens B, Azizzadenesheli K, Anandkumar A, Hill CN. 2023. https://doi.org/10.22541/essoar.170110658.85641696/v1
108. Subel A, Zanna L. 2024. arXiv:2402.04342 [physics.ao-ph]
109. Andersson TR, Hosking JS, Pérez-Ortiz M, Paige B, Elliott A, et al. 2021. Nat. Commun. 12:5124
110. Wang Y, Yuan X, Ren Y, Bushuk M, Shu Q, et al. 2023. Geophys. Res. Lett. 50:e2023GL104347
111. Zhu Y, Qin M, Dai P, Wu S, Fu Z, et al. 2023. J. Geophys. Res. Atmos. 128:e2023JD039521
112. Chattopadhyay A, Sun YQ, Hassanzadeh P. 2023. arXiv:2304.07029 [physics.flu-dyn]
113. Selz T, Craig GC. 2023. Geophys. Res. Lett. 50:e2023GL105747
114. Kashinath K, Mustafa M, Wu J, Jiang C, Wang R, et al. 2021. Philos. Trans. R. Soc. A 379:20200093
115. Beucler T, Pritchard M, Rasp S, Ott J, Baldi P, Gentine P. 2021. Phys. Rev. Lett. 126:098302
116. Chattopadhyay A, Mustafa M, Hassanzadeh P, Bach E, Kashinath K. 2022. Geosci. Model Dev. 15:2221–37
117. Guan Y, Subel A, Chattopadhyay A, Hassanzadeh P. 2023. Phys. D Nonlinear Phenom. 443:133568
118. Psaros AF, Meng X, Zou Z, Guo L, Karniadakis GE. 2023. J. Comput. Phys. 477:111902
119. Abdar M, Pourpanah F, Hussain S, Rezazadegan D, Liu L, et al. 2021. Inform. Fusion 76:243–97
120. Sonnewald M, Lguensat R. 2021. J. Adv. Model. Earth Syst. 13:e2021MS002496
121. Yik W, Sonnewald M, Clare MCA, Lguensat R. 2023. arXiv:2310.13916 [physics.ao-ph]
122. Mansfield LA, Sheshadri A. 2024. J. Adv. Model. Earth. Syst. 16(7):e2024MS004292
123. Sun YQ, Pahlavan HA, Chattopadhyay A, Hassanzadeh P, Lubis SW, et al. 2024. J. Adv. Model. Earth
Syst. 16:e2023MS004145
124. Dräger S, Sonnewald M. 2024. arXiv:2402.13979 [cs.LG]
125. Guillaumin AP, Zanna L. 2021. J. Adv. Model. Earth Syst. 13:e2021MS002534
126. Foster D, Gagne DJ, Whitt DB. 2021. J. Adv. Model. Earth Syst. 13:e2021MS002474
127. Barnes EA, Barnes RJ. 2021. J. Adv. Model. Earth Syst. 13:e2021MS002575
128. Haynes K, Lagerquist R, McGraw M, Musgrave K, Ebert-Uphoff I. 2023. Artif. Intel. Earth Syst.
2:220061
129. Rahaman N, Baratin A, Arpit D, Draxler F, Lin M, et al. 2019. In Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5301–10
130. Xu ZQJ, Zhang Y, Luo T, Xiao Y, Ma Z. 2019. arXiv:1901.06523 [cs.LG]
131. Rybchuk A, Hassanaly M, Hamilton N, Doubrawa P, Fulton MJ, Martínez-Tossas LA. 2023. Phys. Fluids
35:126604
132. Ng J, Wang Y, Lai CY. 2024. arXiv:2407.17213 [cs.LG]
133. Wang Y, Lai CY. 2024. J. Comput. Phys. 504(C):112865
134. Tancik M, Srinivasan P, Mildenhall B, Fridovich-Keil S, Raghavan N, et al. 2020. Adv. Neural Inf. Proc.
Syst. 33:7537–47
135. Wang S, Wang H, Perdikaris P. 2021. Comput. Methods Appl. Mech. Eng. 384:113938
136. Miloshevich G, Cozian B, Abry P, Borgnat P, Bouchet F. 2023. Phys. Rev. Fluids 8:040501
137. Lopez-Gomez I, McGovern A, Agrawal S, Hickey J. 2023. Artif. Intel. Earth Syst. 2:e220035
138. Rudy SH, Sapsis TP. 2023. Phys. D Nonlinear Phenom. 443:133570
139. Ragone F, Wouters J, Bouchet F. 2018. PNAS 115:24–29
140. Finkel J, Webber RJ, Gerber EP, Abbot DS, Weare J. 2021. Mon. Weather Rev. 149:3647–69
141. Subel A, Guan Y, Chattopadhyay A, Hassanzadeh P. 2023. PNAS Nexus 2:pgad015
142. Beucler T, Gentine P, Yuval J, Gupta A, Peng L, et al. 2024. Sci. Adv. 10:eadj7250
143. Shen Z, Sridhar A, Tan Z, Jaruga A, Schneider T. 2022. J. Adv. Model. Earth Syst. 14:e2021MS002631



144. Sun YQ, Hassanzadeh P, Alexander MJ, Kruse CG. 2023. J. Adv. Model. Earth Syst. 15:e2022MS003585
145. Satoh M, Stevens B, Judt F, Khairoutdinov M, Lin SJ, et al. 2019. Curr. Climate Change Rep. 5:172–84
146. Mamalakis A, Barnes EA, Ebert-Uphoff I. 2022. Artif. Intel. Earth Syst. 1:e220012
147. Camps-Valls G, Reichstein M, Zhu X, Tuia D. 2020. In IGARSS 2020–2020 IEEE International Geoscience
and Remote Sensing Symposium, Waikoloa, HI, pp. 3979–82. Piscataway, NJ: IEEE
148. Mayer KJ, Barnes EA. 2021. Geophys. Res. Lett. 48:e2020GL092092
149. Bommer PL, Kretschmer M, Hedström A, Bareeva D, Höhne MMC. 2024. Artif. Intel. Earth Syst.
3:e230074
150. Flora M, Potvin C, McGovern A, Handler S. 2023. Artif. Intel. Earth Syst. 3:e230018
151. Irrgang C, Boers N, Sonnewald M, Barnes EA, Kadow C, et al. 2021. Nat. Mach. Intel. 3:667–74
152. Sonnewald M, Lguensat R, Jones DC, Dueben PD, Brajard J, Balaji V. 2021. Environ. Res. Lett. 16:073008
153. Toms BA, Barnes EA, Ebert-Uphoff I. 2020. J. Adv. Model. Earth Syst. 12:e2019MS002002
154. Labe ZM, Barnes EA. 2022. Earth Space Sci. 9:e2022EA002348
155. Farge M. 1992. Annu. Rev. Fluid Mech. 24:395–458
156. Mallat S. 2016. Philos. Trans. R. Soc. A 374:20150203
157. Olshausen BA, Field DJ. 1996. Nature 381:607–9
158. Bassetti S, Hutchinson B, Tebaldi C, Kravitz B. 2023. J. Adv. Model. Earth Syst. 16(10):e2023MS004194
159. Finn TS, Durand C, Farchi A, Bocquet M, Brajard J. 2024. arXiv:2406.18417 [cs.LG]
160. Zhou A, Hawkins L, Gentine P. 2024. arXiv:2405.00018 [cs.DC]
161. Mukkavilli SK, Civitarese DS, Schmude J, Jakubik J, Jones A, et al. 2023. arXiv:2309.10808 [cs.LG]
162. Gupta A, Sheshadri A, Roy S, Gaur V, Maskey M, Ramachandran R. 2024. arXiv:2406.14775 [physics.ao-ph]
163. Balaji V. 2021. Philos. Trans. R. Soc. A 379:20200085
