xCDAT Documentation
Release 0.6.0
Tom Vo
CONTENTS

1 Project Motivation
2 Getting Started
3 Community
4 Contributing
5 Features
6 Things We Are Striving For
7 Releases
8 Useful Resources
9 Acknowledgement
10 License
  10.1 Getting Started
  10.2 xCDAT on Jupyter and HPC Machines
  10.3 Gallery
  10.4 Presentations and Demos
  10.5 API Reference
  10.6 History
  10.7 Frequently Asked Questions
  10.8 xCDAT Community Code of Conduct
  10.9 Contributing
  10.10 Project Maintenance
  10.11 The Team
Index
xCDAT is an extension of xarray for climate data analysis on structured grids. It serves as a modern successor to the
Community Data Analysis Tools (CDAT) library.
Useful links: Documentation | Code Repository | Issues | Discussions | Releases | Mailing List
CHAPTER ONE: PROJECT MOTIVATION
The goal of xCDAT is to provide generalizable features and utilities for simple and robust analysis of climate data.
xCDAT’s design philosophy is focused on reducing the overhead required to accomplish certain tasks in xarray. xCDAT
aims to be compatible with structured grids that are CF-compliant (e.g., CMIP6). Some key xCDAT features are
inspired by or ported from the core CDAT library, while others leverage powerful libraries in the xarray ecosystem
(e.g., xESMF, xgcm, cf_xarray) to deliver robust APIs.
The xCDAT core team’s mission is to provide a maintainable and extensible package that serves the needs of the climate
community in the long-term. We are excited to be working on this project and hope to have you onboard!
CHAPTER TWO: GETTING STARTED
The best resource for getting started is the xCDAT documentation website. Our documentation provides general guidance for setting up xCDAT in an Anaconda environment on your local computer or on an HPC/Jupyter environment. We also include an API Overview and Gallery to highlight xCDAT functionality.
CHAPTER THREE: COMMUNITY
xCDAT is a community-driven open source project. We encourage discussion on topics such as version releases, feature
suggestions, and architecture design on the GitHub Discussions page.
Subscribe to our mailing list for news and announcements related to xCDAT, such as software version releases or future
roadmap plans.
Please note that xCDAT has a Code of Conduct. By participating in the xCDAT community, you agree to abide by its
rules.
CHAPTER FOUR: CONTRIBUTING
We welcome and appreciate contributions to xCDAT. Users and contributors can view and open issues on our GitHub
Issue Tracker.
For more instructions on how to contribute, please check out our Contributing Guide.
CHAPTER FIVE: FEATURES
CHAPTER SIX: THINGS WE ARE STRIVING FOR
• xCDAT supports CF compliant datasets, but will also strive to support datasets with common non-CF compliant metadata (e.g., time units in "months since ..." or "years since ...")
– xCDAT leverages cf_xarray to interpret CF attributes on xarray objects
– Refer to CF Convention for more information on CF attributes
• Robust handling of dimensions and their coordinates and coordinate bounds
– Coordinate variables are retrieved with cf_xarray using CF axis names or coordinate names found in
xarray object attributes. Refer to Metadata Interpretation for more information.
– Bounds are retrieved with cf_xarray using the "bounds" attr
– Ability to operate on both longitudinal axis orientations, [0, 360) and [-180, 180)
• Support for parallelism using dask where it is both possible and makes sense
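A minimal sketch of these features in code (the input file name is hypothetical; the helpers are xcdat's documented APIs):

import xcdat as xc

ds = xc.open_dataset("tas_Amon_example.nc")  # hypothetical CF-compliant input file
lat = xc.get_dim_coords(ds, axis="Y")  # coordinates retrieved via the CF axis name
lat_bnds = ds.bounds.get_bounds("Y")  # bounds retrieved via the "bounds" attr
ds_flipped = xc.swap_lon_axis(ds, to=(-180, 180))  # reorient the longitude axis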
CHAPTER SEVEN: RELEASES
xCDAT (released as xcdat) follows a feedback-driven release cycle using continuous integration/continuous deployment. Software releases are performed based on the bandwidth of the development team, the needs of the community, and the priority of bug fixes or feature updates.
After releases are performed on GitHub Releases, the corresponding xcdat package version will be available to download through Anaconda conda-forge, usually within a day.
Subscribe to our mailing list to stay notified of new releases.
CHAPTER EIGHT: USEFUL RESOURCES
We highly encourage you to check out the awesome resources below to learn more about Xarray and Xarray usage in climate science!
• Official Xarray Tutorials
• Xarray GitHub Discussion Forum
• Pangeo Forum
• Project Pythia
CHAPTER NINE: ACKNOWLEDGEMENT
CHAPTER TEN: LICENSE
xCDAT is licensed under the terms of the Apache License (Version 2.0 with LLVM exception).
All new contributions must be made under the Apache-2.0 with LLVM exception license.
See LICENSE and NOTICE for details.
SPDX-License-Identifier: Apache-2.0
LLNL-CODE-846944
10.1 Getting Started

10.1.1 Prerequisites
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"

Then follow the instructions for installation. We recommend you type yes in response to "Do you wish the installer to initialize Miniforge by running conda init?" to add conda and mamba to your path. Note that this will modify your shell profile (e.g., ~/.bashrc).

Note: After installation completes you may need to type ``bash`` to restart your shell (if you use bash). Alternatively, you can log out and log back in.
10.1.2 Installation
Note that xesmf is an optional dependency, which is required for using xesmf-based horizontal regridding APIs in xcdat. xesmf is not currently supported on Windows because it depends on esmpy, which also does not support Windows. Windows users can try WSL2 as a workaround.

2. Install xcdat in an existing Mamba environment (mamba install)

You can also install xcdat in an existing Mamba environment, granted that Mamba is able to resolve the compatible dependencies.
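The exact commands were lost in extraction; a sketch of this step (the environment name is illustrative):

mamba activate <ENV_NAME>
mamba install -c conda-forge xcdat xesmf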
10.1.3 Updating
New versions of xcdat will be released periodically. We recommend you use the latest stable version of xcdat for the
latest features and bug fixes.
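A sketch of the update step (assuming your xcdat environment is active):

mamba update xcdat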
10.2 xCDAT on Jupyter and HPC Machines

xCDAT should be compatible with most high performance computing (HPC) platforms. In general, xCDAT is available on Anaconda via the conda-forge channel. xCDAT follows the same convention as other conda-based packages by being installable via conda. The conda installation instructions in this guide are based on the instructions provided by NERSC.
Setup can vary depending on the exact HPC environment you are working in, so please consult your HPC documentation and/or HPC support resources. Some HPC environments might have security settings that restrict user-managed conda installations and environments.
Generally, the instructions from the Getting Started guide can also be followed for HPC machines. That guide covers installing Miniconda3 and creating a conda environment with the xcdat package.
Before installing Miniconda3, you should consult your HPC documentation to see if conda is already available; in
some cases, python and conda may be pre-installed on an HPC machine. You can check to see whether they are
available by entering which conda and/or which python in the command line (which will return their path if they
are available).
In other cases, python and conda are available via modules on an HPC machine. For example, some machines make
both available via:
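The module command was elided in extraction; the module name varies by site, but a typical invocation looks like:

module load python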
Once conda is active, you can create and activate a new xcdat environment with xesmf (a recommended dependency):
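The commands were elided; a sketch mirroring the Getting Started guide:

conda create -n xcdat -c conda-forge xcdat xesmf
conda activate xcdat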
Note that xesmf is an optional dependency, which is required for using xesmf based horizontal regridding APIs in
xcdat. xesmf is not currently supported on osx-arm64 or windows because esmpy is not yet available on these
platforms. Windows users can try WSL2 as a workaround.
You may also want to use xcdat with some additional packages. For example, you can install xcdat with matplotlib,
ipython, and ipykernel (see the next section for more about ipykernel):
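The command was elided; presumably something along these lines:

conda create -n xcdat -c conda-forge xcdat xesmf matplotlib ipython ipykernel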
The advantage of following this approach is that conda will attempt to resolve dependencies (e.g., python >= 3.8) for compatibility.
If you prefer, you can also add packages later with conda install (granted that conda is able to resolve the compatible
dependencies).
HPC systems frequently include a web interface to Jupyter, a popular web application used to perform analyses in Python. In order to use xcdat with Jupyter, you will need to create a kernel in your xcdat conda environment using ipykernel. These instructions follow those from NERSC, but setup can vary depending on the exact HPC environment you are working in, so please consult your HPC documentation. If you have not already installed ipykernel, you can install it in your xcdat environment (created above) with:
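The command was elided; presumably:

conda install -c conda-forge ipykernel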
Once ipykernel is added to your xcdat environment, you can create an xcdat kernel with:
python -m ipykernel install --user --name <ENV NAME> --display-name <ENV NAME>
After the kernel is installed, log in to the Jupyter instance on your HPC. Your xcdat kernel may be available on the home launch page (to open a new notebook or command line instance). This launcher is sometimes accessed by clicking the blue plus symbol. Alternatively, you may need to open a new Notebook, click "Kernel" on the top bar, click "Change Kernel...", and then select your xcdat kernel. You should then be able to use your xcdat environment on Jupyter.
10.3 Gallery
This gallery demonstrates how to use some of the features in xcdat. Contributions are highly welcomed and appreciated. Please check out the Contributing guide.
This work is performed under the auspices of the U.S. DOE by Lawrence Livermore National Laboratory under contract No. DE-AC52-07NA27344.
Notebook Setup
Create an Anaconda environment for this notebook using the command below:
conda create -n xcdat -c conda-forge xarray xcdat xesmf matplotlib nc-time-axis jupyter
Presentation Overview
• The CDAT (Community Data Analysis Tools) library has provided a suite of robust and comprehensive open-source climate data analysis and visualization packages for over 20 years
• A driving need for a modern successor
– Focus on a maintainable and extensible library
– Serve the needs of the climate community in the long-term
Introducing xCDAT
“Xarray introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like multidimensional arrays, which allows for a more intuitive, more concise, and less error-prone developer experience.”
—https://xarray.pydata.org/en/v2022.10.0/getting-started-guide/why-xarray.html
• Apply operations over dimensions by name
– x.sum('time')
• Select values by label (or logical location) instead of integer location
– x.loc['2014-01-01'] or x.sel(time='2014-01-01')
• Mathematical operations vectorize across multiple dimensions (array broadcasting) based on dimension
names, not shape
– x - y
• Easily use the split-apply-combine paradigm with groupby
– x.groupby('time.dayofyear').mean()
• Database-like alignment based on coordinate labels that smoothly handles missing values
– x, y = xr.align(x, y, join='outer')
• Keep track of arbitrary metadata in the form of a Python dictionary
– x.attrs
Source: https://docs.xarray.dev/en/v2022.10.0/getting-started-guide/why-xarray.html#what-labels-enable
“Xarray data models are borrowed from netCDF file format, which provides xarray with a natural and
portable serialization format.”
—https://docs.xarray.dev/en/v2022.10.0/getting-started-guide/why-xarray.html
1. ``xarray.Dataset``
• A dictionary-like container of DataArray objects with aligned dimensions
– DataArray objects are classified as “coordinate variables” or “data variables”
– All data variables have a shared union of coordinates
• Serves a similar purpose to a pandas.DataFrame
2. ``xarray.DataArray``
• A class that attaches dimension names, coordinates, and attributes to multi-dimensional arrays (aka
“labeled arrays”)
• An N-D generalization of a pandas.Series
[1]: # This style import is necessary to properly render Xarray's HTML output with
# the Jupyter RISE extension.
# GitHub Issue: https://github.com/damianavila/RISE/issues/594
# Source: https://github.com/smartass101/xarray-pydata-prague-2020/blob/main/rise.css
from IPython.display import HTML  # import added; required by the HTML(style) call below
style = """
<style>
.reveal pre.xr-text-repr-fallback {
display: none;
}
.reveal ul.xr-sections {
display: grid
}
.reveal ul ul.xr-var-list {
display: contents
}
</style>
"""
HTML(style)
[1]: <IPython.core.display.HTML object>
import xarray as xr  # import added; required by the open_dataset call below

filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds = xr.open_dataset(filepath)
[3]: ds
[3]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 ...
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) datetime64[ns] ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 ...
A class that attaches dimension names, coordinates, and attributes to multi-dimensional arrays (aka
“labeled arrays”)
Key properties:
• values: a numpy.ndarray holding the array’s values
• dims: dimension names for each axis (e.g., (‘x’, ‘y’, ‘z’))
• coords: a dict-like container of arrays (coordinates) that label each point (e.g., 1-dimensional arrays of numbers,
datetime objects or strings)
• attrs: dict to hold arbitrary metadata (attributes)
Source: https://docs.xarray.dev/en/stable/user-guide/data-structures.html#dataarray
[4]: ds.tas
[4]: <xarray.DataArray 'tas' (time: 1980, lat: 145, lon: 192)>
[55123200 values with dtype=float32]
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
• Some key xCDAT features are inspired by or ported from the core CDAT library
– e.g., spatial averaging, temporal averaging, regrid2 for horizontal regridding
• Other features leverage powerful libraries in the xarray ecosystem
– xESMF for horizontal regridding
– xgcm for vertical interpolation
– CF-xarray for CF convention metadata interpretation
• xCDAT strives to support datasets with CF compliant and common non-CF compliant metadata (time units in "months since ..." or "years since ...")
• Inherent support for lazy operations and parallelism through xarray + dask
xcdat spatial functionality is exposed by chaining the .spatial accessor attribute to the xr.Dataset object.
Source: https://xcdat.readthedocs.io/en/latest/api.html
Feature: Extend xr.open_dataset() and xr.open_mfdataset()
API: open_dataset(), open_mfdataset()
Description: Bounds generation; time decoding (CF and select non-CF time units)
• Prerequisites
– Installing xcdat
– Import xcdat
– Open a dataset and apply postprocessing operations
• Scenario 1 - Calculate the spatial averages over the tropical region
• Scenario 2 - Calculate the annual anomalies
• Scenario 3 - Horizontal regridding (bilinear, gaussian grid)
Installing xcdat
Opening a dataset
[5]: # This gives access to all xcdat public top-level APIs and accessor classes.
import xcdat as xc
# We import these packages specifically for plotting. It is not required to use xcdat.
import matplotlib.pyplot as plt
import pandas as pd
filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds = xc.open_dataset(
filepath,
add_bounds=True,
decode_times=True,
center_times=True
)
[6]: ds
[6]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
[10]: ds_avg.tas.plot(label="weighted")
[10]: <matplotlib.collections.QuadMesh at 0x10cc641f0>
fig, axes = plt.subplots(1, 2, figsize=(12, 4))  # figure setup reconstructed; original line lost

input_grid = ds.regridder.grid
input_grid.plot.scatter(x='lon', y='lat', s=5, ax=axes[0], add_colorbar=False, cmap=plt.cm.RdBu)
axes[0].set_title('Input Grid')
output_grid.plot.scatter(x='lon', y='lat', s=5, ax=axes[1], add_colorbar=False, cmap=plt.cm.RdBu)  # assumes output_grid was created earlier
axes[1].set_title('Output Grid')
plt.tight_layout()
xCDAT offers horizontal regridding with xESMF (default) and a Python port of regrid2. We will be using xESMF to
regrid.
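The regridding call itself was lost in extraction; a sketch using the documented regridder accessor (the bilinear method is an assumption):

output = ds.regridder.horizontal("tas", output_grid, tool="xesmf", method="bilinear")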
fig, axes = plt.subplots(1, 2, figsize=(12, 4))  # figure setup reconstructed; original line lost

ds.tas.isel(time=0).plot(ax=axes[0])
axes[0].set_title('Input data')
output.tas.isel(time=0).plot(ax=axes[1])
axes[1].set_title('Output data')
plt.tight_layout()
Nearly all existing xarray methods have been extended to work automatically with Dask arrays for paral-
lelism
—https://docs.xarray.dev/en/stable/user-guide/dask.html#using-dask-with-xarray
• Parallelized xarray methods include indexing, computation, concatenating and grouped operations
• xCDAT APIs that build upon xarray methods inherently support Dask parallelism
– Dask arrays are loaded into memory only when absolutely required (e.g., decoding time, handling bounds)
• Dask divides arrays into many small pieces, called “chunks” (each presumed to be small enough to fit into
memory)
• Dask array operations are lazy
– Operations queue up a series of tasks mapped over blocks
– No computation is performed until values need to be computed (lazy)
– Data is loaded into memory and computation is performed in streaming fashion, block-by-block
• The usual way to create a Dataset filled with Dask arrays is to load the data from a netCDF file or files
• You can do this by supplying a chunks argument to open_dataset() or using the open_mfdataset() function
– By default, open_mfdataset() will chunk each netCDF file into a single Dask array
– Supply the chunks argument to control the size of the resulting Dask arrays
– Xarray maintains a Dask array until it is not possible (raises an exception instead of loading into memory)
Source: https://docs.xarray.dev/en/stable/user-guide/dask.html#reading-and-writing-data
[15]: filepath = "http://esgf.nci.org.au/thredds/dodsC/master/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
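The cell that opens the dataset was elided; a hedged reconstruction consistent with the chunked output below:

ds = xc.open_dataset(filepath, chunks={"time": "auto"})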
[16]: ds
[16]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 ...
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) object dask.array<chunksize=(1980, 2), meta=np.ndarray>
lat_bnds (lat, bnds) float64 dask.array<chunksize=(145, 2), meta=np.ndarray>
lon_bnds (lon, bnds) float64 dask.array<chunksize=(192, 2), meta=np.ndarray>
tas (time, lat, lon) float32 dask.array<chunksize=(1205, 145, 192), meta=np.
˓→ndarray>
Attributes: (12/49)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 0.0
branch_time_in_parent: 87658.0
creation_date: 2020-06-05T04:06:11Z
... ...
version: v20200605
license: CMIP6 model data produced by CSIRO is li...
This is a demonstration that chunked Dataset objects work with xCDAT APIs.
• The generation of weights is serial
• The weighted average operation should be parallelized (uses xarray's .weighted() API)
• We intend on doing performance metrics and giving guidance on when to chunk
• For now, visit https://github.com/xCDAT/xcdat/discussions/376 for best practices
Coordinates:
* time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
height float64 ...
Attributes:
standard_name: air_temperature
long_name: Near-Surface Air Temperature
comment: near-surface (usually, 2 meter) air temperature
units: K
cell_methods: area: time: mean
cell_measures: area: areacella
history: 2020-06-05T04:06:10Z altered by CMOR: Treated scalar dime...
_ChunkSizes: [ 1 145 192]
Key Takeaways
Authors:
• Tom Vo
• Stephen Po-Chedley
Date: 05/26/22
Overview
This notebook demonstrates the use of general utility methods available in xcdat, including the reorientation of the
longitude axis, centering of time coordinates using time bounds, and adding and getting bounds.
Open a dataset
[2]: dataset_links = [
    "https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_187001_189412.nc",
    "https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_189501_191912.nc",
]

[3]: # NOTE: Opening a multi-file dataset will result in data variables being dask
# arrays.
ds = xcdat.open_mfdataset(dataset_links)
[4]: ds
[4]: <xarray.Dataset>
Dimensions: (lat: 180, lon: 360, nbnd: 2, time: 600)
Coordinates:
* lat (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
* lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
* time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00
Dimensions without coordinates: nbnd
Data variables:
lat_bnds (lat, nbnd) float64 dask.array<chunksize=(180, 2), meta=np.ndarray>
lon_bnds (lon, nbnd) float64 dask.array<chunksize=(360, 2), meta=np.ndarray>
gw (lat) float64 dask.array<chunksize=(180,), meta=np.ndarray>
time_bnds (time, nbnd) object dask.array<chunksize=(300, 2), meta=np.ndarray>
area (lat, lon) float64 dask.array<chunksize=(180, 360), meta=np.ndarray>
Attributes: (12/21)
ne: 30
np: 4
Conventions: CF-1.0
source: CAM
case: 20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...
title: UNSET
... ...
remap_script: ncremap
remap_hostname: acme1
remap_version: 4.9.6
map_file: /export/zender1/data/maps/map_ne30np4_to...
input_file: /p/user_pub/e3sm/baldwin32/workshop/amip...
DODS_EXTRA.Unlimited_Dimension: time
Longitude can be represented from 0 to 360 E or as 180 W to 180 E. xcdat allows you to convert between these axis orientations.
• Related API: xcdat.swap_lon_axis()
• Alternative solution: xcdat.open_mfdataset(dataset_links, lon_orient=(-180, 180))
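The conversion cell was elided; a sketch using the related API named above:

ds2 = xcdat.swap_lon_axis(ds, to=(-180, 180))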
[5]: ds.lon
[5]: <xarray.DataArray 'lon' (lon: 360)>
array([ 0.5, 1.5, 2.5, ..., 357.5, 358.5, 359.5])
Coordinates:
* lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5
Attributes:
long_name: Longitude of Grid Cell Centers
standard_name: longitude
units: degrees_east
axis: X
valid_min: 0.0
valid_max: 360.0
bounds: lon_bnds
[7]: ds2.lon
[7]: <xarray.DataArray 'lon' (lon: 360)>
array([-179.5, -178.5, -177.5, ..., 177.5, 178.5, 179.5])
Coordinates:
* lon (lon) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5
Attributes:
long_name: Longitude of Grid Cell Centers
standard_name: longitude
A given point of time often represents some time period (e.g., a monthly average). In this situation, data providers
sometimes record the time as the beginning, middle, or end of the period. center_times() places the time coordinate
in the center of the time interval (using time bounds to determine the center of the period).
• Related API: xcdat.center_times()
• Alternative solution: xcdat.open_mfdataset(dataset_links, center_times=True)
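The centering call was elided; a sketch using the related API named above:

ds3 = xcdat.center_times(ds)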
The time bounds used for centering time coordinates:
[9]: ds.time
[9]: <xarray.DataArray 'time' (time: 600)>
array([cftime.DatetimeNoLeap(1870, 2, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1870, 3, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1870, 4, 1, 0, 0, 0, 0, has_year_zero=True), ...,
cftime.DatetimeNoLeap(1919, 11, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1919, 12, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1920, 1, 1, 0, 0, 0, 0, has_year_zero=True)],
dtype=object)
Coordinates:
* time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00
Attributes:
long_name: time
bounds: time_bnds
cell_methods: time: mean
[11]: ds3.time
[11]: <xarray.DataArray 'time' (time: 600)>
array([cftime.DatetimeNoLeap(1870, 1, 16, 12, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True),
...,
cftime.DatetimeNoLeap(1919, 10, 16, 12, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True)],
dtype=object)
Coordinates:
* time (time) object 1870-01-16 12:00:00 ... 1919-12-16 12:00:00
Attributes:
long_name: time
bounds: time_bnds
cell_methods: time: mean
Add bounds
Bounds are critical to many xcdat operations. For example, they are used in determining the weights in spatial or
temporal averages and in regridding operations. add_bounds() will attempt to produce bounds if they do not exist in
the original dataset.
• Related API: xarray.Dataset.bounds.add_bounds()
• Alternative solution: xcdat.open_mfdataset(dataset_links, add_bounds=True)
– (Assuming the file doesn’t already have bounds for your desired axis/axes)
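The setup cell was elided; presumably the time bounds were dropped first so they could be regenerated (the variable name matches the cells below):

ds4 = ds.drop_vars("time_bnds")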
[13]: try:
ds4.bounds.get_bounds("T")
except KeyError as e:
print(e)
"Bounds were not found for the coordinate variable 'time'. They must be added (Dataset.
˓→bounds.add_bounds)."
[14]: # A `width` kwarg can be specified, which is the width of the bounds relative
# to the position of the nearest points. The default value is 0.5.
ds4 = ds4.bounds.add_bounds("T", width=0.5)
[15]: ds4.bounds.get_bounds("T")
[15]: <xarray.DataArray 'time_bnds' (time: 600, bnds: 2)>
array([[cftime.DatetimeNoLeap(1870, 1, 18, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True)],
[16]: # We drop the dataset axes bounds to demonstrate generating missing bounds.
ds5 = ds.drop_vars(["time_bnds", "lat_bnds", "lon_bnds"])
[17]: ds5
[17]: <xarray.Dataset>
Dimensions: (lat: 180, lon: 360, time: 600)
Coordinates:
* lat (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5
* lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5
* time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00
Data variables:
gw (lat) float64 dask.array<chunksize=(180,), meta=np.ndarray>
area (lat, lon) float64 dask.array<chunksize=(180, 360), meta=np.ndarray>
TS (time, lat, lon) float32 dask.array<chunksize=(300, 180, 360), meta=np.
˓→ndarray>
Attributes: (12/21)
ne: 30
np: 4
Conventions: CF-1.0
source: CAM
case: 20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...
title: UNSET
... ...
remap_script: ncremap
remap_hostname: acme1
remap_version: 4.9.6
map_file: /export/zender1/data/maps/map_ne30np4_to...
input_file: /p/user_pub/e3sm/baldwin32/workshop/amip...
DODS_EXTRA.Unlimited_Dimension: time
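The cell that regenerates the missing bounds was elided; a sketch using the bounds accessor (the axes list is an assumption):

ds5 = ds5.bounds.add_missing_bounds(axes=["X", "Y", "T"])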
[19]: ds5
[19]: <xarray.Dataset>
Dimensions: (lat: 180, lon: 360, time: 600, bnds: 2)
Coordinates:
* lat (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
* lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
* time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00
Dimensions without coordinates: bnds
Data variables:
gw (lat) float64 dask.array<chunksize=(180,), meta=np.ndarray>
area (lat, lon) float64 dask.array<chunksize=(180, 360), meta=np.ndarray>
TS (time, lat, lon) float32 dask.array<chunksize=(300, 180, 360), meta=np.
˓→ndarray>
lon_bnds (lon, bnds) float64 0.0 1.0 1.0 2.0 ... 358.0 359.0 359.0 360.0
lat_bnds (lat, bnds) float64 -90.0 -89.0 -89.0 -88.0 ... 89.0 89.0 90.0
time_bnds (time, bnds) object 1870-01-18 00:00:00 ... 1920-01-16 12:00:00
Attributes: (12/21)
ne: 30
np: 4
Conventions: CF-1.0
source: CAM
case: 20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...
title: UNSET
... ...
remap_script: ncremap
remap_hostname: acme1
remap_version: 4.9.6
map_file: /export/zender1/data/maps/map_ne30np4_to...
input_file: /p/user_pub/e3sm/baldwin32/workshop/amip...
DODS_EXTRA.Unlimited_Dimension: time
In xarray, you can get dimension coordinates by directly referencing their names (e.g., ds.lat). xcdat provides an alternative way to get dimension coordinates agnostically by simply passing the CF axis key to applicable APIs.
• Related API: xcdat.get_dim_coords()
Helpful knowledge:
• This API uses cf_xarray to interpret CF axis names and coordinate names in the xarray object attributes. Refer
to Metadata Interpretation for more information.
Xarray documentation on coordinates (source):
• There are two types of coordinates in xarray:
– dimension coordinates are one dimensional coordinates with a name equal to their sole dimension (marked
by * when printing a dataset or data array). They are used for label based indexing and alignment, like the
index found on a pandas DataFrame or Series. Indeed, these “dimension” coordinates use a pandas.Index
internally to store their values.
– non-dimension coordinates are variables that contain coordinate data, but are not a dimension coordinate. They can be multidimensional (see Working with Multidimensional Coordinates), and there is no relationship between the name of a non-dimension coordinate and the name(s) of its dimension(s). Non-dimension coordinates can be useful for indexing or plotting; otherwise, xarray does not make any direct use of the values associated with them. They are not used for alignment or automatic indexing, nor are they required to match when doing arithmetic (see Coordinates).
• Xarray’s terminology differs from the CF terminology, where the “dimension coordinates” are called “coordinate
variables”, and the “non-dimension coordinates” are called “auxiliary coordinate variables” (see GH1295 for
more details).
1. axis attr
[20]: ds.lat.attrs["axis"]
[20]: 'Y'
2. standard_name attr
[21]: ds.lat.attrs["standard_name"]
[21]: 'latitude'
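These attributes are what allow xcdat to fetch the same coordinates without knowing the dimension name; a sketch of the axis-agnostic call:

lat = xcdat.get_dim_coords(ds, axis="Y")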
Authors:
• Tom Vo
• Stephen Po-Chedley
Date: 05/27/22
Related APIs:
• xarray.Dataset.spatial.average()
The data used in this example can be found through the Earth System Grid Federation (ESGF) search portal.
Overview
A common data reduction in geophysical sciences is to produce spatial averages. Spatial averaging functionality in
xcdat allows users to quickly produce area-weighted spatial averages for selected regions (or full dataset domains).
In the example below, we demonstrate the opening of a (remote) dataset and spatial averaging over the global, tropical,
and Niño 3.4 domains.
We are using xarray’s OPeNDAP support to read a netCDF4 dataset file directly from its source. The data is not loaded
over the network until we perform operations on it (e.g., temperature unit adjustment).
More information on xarray's OPeNDAP support can be found here.

[2]: filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds = xcdat.open_dataset(filepath)
ds
[2]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) datetime64[ns] ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 -27.19 -27.19 -27.19 ... -25.29 -25.29
Attributes: (12/48)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 0.0
branch_time_in_parent: 87658.0
creation_date: 2020-06-05T04:06:11Z
... ...
variant_label: r10i1p1f1
version: v20200605
license: CMIP6 model data produced by CSIRO is li...
cmor_version: 3.4.0
tracking_id: hdl:21.14100/af78ae5e-f3a6-4e99-8cfe-5f2...
DODS_EXTRA.Unlimited_Dimension: time
2. Global
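The averaging cell was elided; a sketch using the related API named at the top of this example:

# (tas was presumably converted to degrees Celsius in an earlier elided cell)
ds_global_avg = ds.spatial.average("tas")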
[4]: ds_global_avg.tas
[4]: <xarray.DataArray 'tas' (time: 1980)>
array([12.52127071, 13.09115223, 13.60703132, ..., 15.5767848 ,
14.65664621, 13.84951678])
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
height float64 2.0
3. Tropical Region
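The elided cell presumably restricted the latitude domain (the bounds are an assumption):

ds_trop_avg = ds.spatial.average("tas", lat_bounds=(-20, 20))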
[7]: ds_trop_avg.tas
[7]: <xarray.DataArray 'tas' (time: 1980)>
array([25.24722608, 25.61795924, 25.96516235, ..., 26.79536823,
26.67771602, 26.27182383])
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
height float64 2.0
Niño 3.4 (5N-5S, 170W-120W): "The Niño 3.4 anomalies may be thought of as representing the average equatorial SSTs across the Pacific from about the dateline to the South American coast. The Niño 3.4 index typically uses a 5-month running mean, and El Niño or La Niña events are defined when the Niño 3.4 SSTs exceed +/- 0.4C for a period of six months or more."
—https://climatedataguide.ucar.edu/climate-data/nino-sst-indices-nino-12-3-34-4-oni-and-tni
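The elided cell presumably used the region quoted above (bounds converted to 0-360 longitude):

ds_nino_avg = ds.spatial.average("tas", lat_bounds=(-5, 5), lon_bounds=(190, 240))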
[10]: ds_nino_avg.tas
[10]: <xarray.DataArray 'tas' (time: 1980)>
array([27.00284678, 27.06796429, 26.18095324, ..., 27.17515272,
27.30917002, 27.38399379])
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
height float64 2.0
Author: Tom Vo
Date: 05/27/22
Last Edited: 08/17/22 (v0.3.1)
Related APIs:
• xarray.Dataset.temporal.average()
• xarray.Dataset.temporal.group_average()
The data used in this example can be found through the Earth System Grid Federation (ESGF) search portal.
Overview
Suppose we have netCDF4 files for air temperature data (tas) with monthly, daily, and 3hr frequencies.
We want to calculate averages using these files with the time dimension removed (a single time snapshot), and averages
by time group (yearly, seasonal, and daily).
import pandas as pd
import matplotlib.pyplot as plt
import xcdat
In this example, we will be calculating the time weighted averages with the time dimension removed (single snapshot)
for monthly tas data.
We are using xarray’s OPeNDAP support to read a netCDF4 dataset file directly from its source. The data is not loaded
over the network until we perform operations on it (e.g., temperature unit adjustment).
More information on xarray's OPeNDAP support can be found here.

[2]: filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds = xcdat.open_dataset(filepath)
ds
[2]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) datetime64[ns] ...
lat_bnds (lat, bnds) float64 ...
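The averaging cell was elided; a sketch using the related API listed above:

ds_avg = ds.temporal.average("tas", weighted=True)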
[4]: ds_avg.tas
[4]: <xarray.DataArray 'tas' (lat: 145, lon: 192)>
array([[-48.01481628, -48.01481628, -48.01481628, ..., -48.01481628,
-48.01481628, -48.01481628],
[-44.94085363, -44.97948214, -45.01815398, ..., -44.82408252,
-44.86273067, -44.9009281 ],
[-44.11875274, -44.23060624, -44.33960158, ..., -43.76766492,
-43.88593717, -44.00303006],
...,
[-18.21076615, -18.17513373, -18.13957458, ..., -18.32720478,
-18.28428828, -18.2486193 ],
[-18.50778243, -18.49301854, -18.47902819, ..., -18.55410851,
-18.5406963 , -18.52413098],
[-19.07366375, -19.07366375, -19.07366375, ..., -19.07366375,
-19.07366375, -19.07366375]])
Coordinates:
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 2.0
Attributes:
operation: temporal_avg
mode: average
freq: month
weighted: True
[5]: ds_avg.tas.plot(label="weighted")
[5]: <matplotlib.collections.QuadMesh at 0x7f120aa2a9a0>
– The grouping conventions are based on CDAT/cdutil, except for daily and hourly means, which aren't implemented in CDAT/cdutil.
• Masked (missing) data is automatically handled.
– The weight of masked (missing) data are excluded when averages are calculated. This is the same as giving
them a weight of 0.
In this example, we will be calculating the weighted grouped time averages for tas data.
We are using xarray’s OPeNDAP support to read a netCDF4 dataset file directly from its source. The data is not loaded
over the network until we perform operations on it (e.g., temperature unit adjustment).
More information on xarray's OPeNDAP support can be found here.

[6]: filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds = xcdat.open_dataset(filepath)
ds
[6]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) datetime64[ns] ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 -27.19 -27.19 -27.19 ... -25.29 -25.29
Attributes: (12/48)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 0.0
branch_time_in_parent: 87658.0
creation_date: 2020-06-05T04:06:11Z
... ...
variant_label: r10i1p1f1
version: v20200605
license: CMIP6 model data produced by CSIRO is li...
cmor_version: 3.4.0
tracking_id: hdl:21.14100/af78ae5e-f3a6-4e99-8cfe-5f2...
DODS_EXTRA.Unlimited_Dimension: time
Yearly Averages
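The cell producing ds_yearly was elided; a sketch using group_average:

ds_yearly = ds.temporal.group_average("tas", freq="year", weighted=True)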
[8]: ds_yearly.tas
[8]: <xarray.DataArray 'tas' (time: 165, lat: 145, lon: 192)>
array([[[-48.75573349, -48.75573349, -48.75573349, ..., -48.75573349,
-48.75573349, -48.75573349],
[-45.65206528, -45.69302368, -45.73506165, ..., -45.52127838,
-45.56386566, -45.60668945],
[-44.77523422, -44.90583801, -45.03297043, ..., -44.37118149,
-44.50630951, -44.64050293],
...,
[-20.50597572, -20.48132133, -20.45456505, ..., -20.58895874,
-20.55752182, -20.53087234],
[-20.79759216, -20.78425217, -20.77545547, ..., -20.83267975,
-20.82335663, -20.80768394],
[-21.20114899, -21.20114899, -21.20114899, ..., -21.20114899,
-21.20114899, -21.20114899]],
import xmovie
mov = xmovie.Movie(ds_yearly_avg.tas)
mov.save("temporal-average-yearly.gif")
Seasonal Averages
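The cell producing ds_season was elided; a sketch using group_average with the default season configuration:

ds_season = ds.temporal.group_average("tas", freq="season", weighted=True)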
[10]: ds_season.tas
[10]: <xarray.DataArray 'tas' (time: 661, lat: 145, lon: 192)>
array([[[-32.70588303, -32.70588303, -32.70588303, ..., -32.70588303,
-32.70588303, -32.70588303],
[-30.99376678, -31.03758621, -31.08932686, ..., -30.84562302,
-30.89412689, -30.94400978],
[-30.0251503 , -30.14543724, -30.26419067, ..., -29.66037178,
-29.78108025, -29.90287781],
...,
[-37.72314072, -37.68549347, -37.65416718, ..., -37.82619858,
-37.79034424, -37.75682831],
[-38.27464676, -38.26372528, -38.25014496, ..., -38.29218292,
-38.29063797, -38.28456116],
[-38.74358749, -38.74358749, -38.74358749, ..., -38.74358749,
-38.74358749, -38.74358749]],
Notice that the season of each time coordinate is represented by its middle month.
• “DJF” is represented by month 1 (“J”/January)
• “MAM” is represented by month 4 (“A”/April)
• “JJA” is represented by month 7 (“J”/July)
• “SON” is represented by month 10 (“O”/October).
This implementation design was used because datetime objects do not distinguish seasons, so the middle month is used instead.
[11]: ds_season.time
[11]: <xarray.DataArray 'time' (time: 661)>
array([cftime.DatetimeProlepticGregorian(1850, 1, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(1850, 4, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(1850, 7, 1, 0, 0, 0, 0, has_year_zero=True),
...,
cftime.DatetimeProlepticGregorian(2014, 7, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(2014, 10, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(2015, 1, 1, 0, 0, 0, 0, has_year_zero=True)],
dtype=object)
Coordinates:
height float64 2.0
Monthly Averages
NOTE:
For OPeNDAP servers, the default file size request limit is 500MB in the TDS server configuration. Opening up a
dataset over OPeNDAP also introduces an overhead compared to direct file access.
The workaround is to use Dask to request the data in manageable chunks, which overcomes file size limitations
and can improve performance.
We have a few ways to chunk our request:
1. Specify chunks with "auto" to let Dask determine the chunk size.
2. Specify the file size to chunk on (e.g., "100MB") or the number of chunks as an integer (e.g., 100 for 100 chunks).
Visit this page to learn more about chunking and performance: https://docs.xarray.dev/en/stable/user-guide/dask.html#chunking-and-performance
[12]: # The size of this file is approximately 1.45 GB, so we will be chunking our
# request using Dask to avoid hitting the OPeNDAP file size request limit for
# this ESGF node.
ds2 = xcdat.open_dataset(
    "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/3hr/tas/gn/v20200605/tas_3hr_ACCESS-ESM1-5_historical_r10i1p1f1_gn_201001010300-201501010000.nc",
    chunks={"time": "auto"},
)
ds2
[12]: <xarray.Dataset>
Dimensions: (time: 14608, lat: 145, bnds: 2, lon: 192)
Coordinates:
* time (time) datetime64[ns] 2010-01-01T03:00:00 ... 2015-01-01
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
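The cell producing ds2_monthly_avg was elided; a sketch using group_average:

ds2_monthly_avg = ds2.temporal.group_average("tas", freq="month", weighted=True)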
[14]: ds2_monthly_avg.tas
[14]: <xarray.DataArray 'tas' (time: 61, lat: 145, lon: 192)>
dask.array<truediv, shape=(61, 145, 192), dtype=float64, chunksize=(1, 145, 192),␣
˓→chunktype=numpy.ndarray>
Coordinates:
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 ...
* time (time) object 2010-01-01 00:00:00 ... 2015-01-01 00:00:00
Attributes:
operation: temporal_avg
mode: group_average
freq: month
weighted: True
Daily Averages
[15]: # The size of this file is approximately 1.17 GB, so we will be chunking our
# request using Dask to avoid hitting the OPeNDAP file size request limit for
# this ESGF node. (URL reconstructed from the 3hr dataset opened earlier.)
ds3 = xcdat.open_dataset(
    "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/3hr/tas/gn/v20200605/tas_3hr_ACCESS-ESM1-5_historical_r10i1p1f1_gn_201001010300-201501010000.nc",
    chunks={"time": "auto"},
)
[16]: ds3.tas
[16]: <xarray.DataArray 'tas' (time: 14608, lat: 145, lon: 192)>
dask.array<sub, shape=(14608, 145, 192), dtype=float32, chunksize=(913, 145, 192),␣
˓→chunktype=numpy.ndarray>
Coordinates:
* time (time) datetime64[ns] 2010-01-01T03:00:00 ... 2015-01-01
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 ...
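The cell producing ds3_day_avg was elided; a sketch using group_average:

ds3_day_avg = ds3.temporal.group_average("tas", freq="day", weighted=True)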
[18]: ds3_day_avg.tas
[18]: <xarray.DataArray 'tas' (time: 1827, lat: 145, lon: 192)>
dask.array<truediv, shape=(1827, 145, 192), dtype=float64, chunksize=(1, 145, 192),␣
˓→chunktype=numpy.ndarray>
Coordinates:
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 ...
* time (time) object 2010-01-01 00:00:00 ... 2015-01-01 00:00:00
Attributes:
operation: temporal_avg
mode: group_average
freq: day
weighted: True
Author: Tom Vo
Date: 05/27/22
Last Updated: 2/27/23
Related APIs:
• xarray.Dataset.temporal.climatology()
• xarray.Dataset.temporal.departures()
The data used in this example can be found through the Earth System Grid Federation (ESGF) search portal.
Overview
Suppose we have two netCDF4 files for air temperature data (tas).
• File 1: Monthly frequency from 1850-01-16 to 2014-12-16
– We want to calculate the annual and seasonal cycle climatologies and departures using this file.
• File 2: Hourly frequency from 2010-01-01 to 2015-01-01 (subset).
– We want to calculate the daily cycle climatologies and departures using this file.
We are using xarray's OPeNDAP support to read netCDF4 dataset files directly from their source. The data is not loaded over the network until we perform operations on it (e.g., temperature unit adjustment).
More information on xarray's OPeNDAP support can be found here.

filepath1 = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds_monthly = xcdat.open_dataset(filepath1)
The size of this file is approximately 1.17 GB, so we will be chunking our request using Dask to avoid hitting the OPeNDAP file size request limit for this ESGF node.

# URL and variable name reconstructed; the original cell was lost in extraction.
filepath2 = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/3hr/tas/gn/v20200605/tas_3hr_ACCESS-ESM1-5_historical_r10i1p1f1_gn_201001010300-201501010000.nc"
ds_hourly = xcdat.open_dataset(filepath2, chunks={"time": "auto"})
2. Calculate Climatology
Seasonal Climatology
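The climatology call was elided; a sketch using the related API (the output below is consistent with freq "season"):

season_climo = ds_monthly.temporal.climatology("tas", freq="season", weighted=True)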
[5]: season_climo.tas
[5]: <xarray.DataArray 'tas' (time: 4, lat: 145, lon: 192)>
array([[[-31.00774765, -31.00774765, -31.00774765, ..., -31.00774765,
-31.00774765, -31.00774765],
[-29.65324402, -29.685215 , -29.71771049, ..., -29.55809784,
-29.58923149, -29.62030983],
[-28.88215446, -28.98016167, -29.07778549, ..., -28.58658791,
-28.68405914, -28.78241539],
...,
[-31.36740303, -31.31291962, -31.25907516, ..., -31.54325676,
-31.47868538, -31.42434502],
[-31.88631248, -31.86421967, -31.84326553, ..., -31.95551682,
-31.93475533, -31.91006279],
[-32.83132172, -32.83132172, -32.83132172, ..., -32.83132172,
-32.83132172, -32.83132172]],
for ax in axes.flat:
ax.axes.get_xaxis().set_ticklabels([])
ax.axes.get_yaxis().set_ticklabels([])
ax.axes.axis("tight")
ax.set_xlabel("")
plt.tight_layout()
fig.suptitle("Seasonal Surface Air Temperature", fontsize=16, y=1.02)
[6]: Text(0.5, 1.02, 'Seasonal Surface Air Temperature')
Notice that the time coordinates are cftime objects, with each season ("DJF", "MAM", "JJA", and "SON") represented by its middle month.
[7]: season_climo.time
[7]: <xarray.DataArray 'time' (time: 4)>
array([cftime.DatetimeProlepticGregorian(1, 1, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(1, 4, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(1, 7, 1, 0, 0, 0, 0, has_year_zero=True),
cftime.DatetimeProlepticGregorian(1, 10, 1, 0, 0, 0, 0, has_year_zero=True)],
dtype=object)
Coordinates:
height float64 2.0
* time (time) object 0001-01-01 00:00:00 ... 0001-10-01 00:00:00
Attributes:
bounds: time_bnds
axis: T
long_name: time
standard_name: time
_ChunkSizes: 1
[8]: custom_seasons = [
["Jan", "Feb", "Mar"], # "JanFebMar"
["Apr", "May", "Jun"], # "AprMayJun"
["Jul", "Aug", "Sep"], # "JunJulAug"
["Oct", "Nov", "Dec"], # "OctNovDec"
]
c_season_climo = ds_monthly.temporal.climatology(
"tas",
freq="season",
weighted=True,
season_config={"custom_seasons": custom_seasons},
)
[9]: c_season_climo.tas
[9]: <xarray.DataArray 'tas' (time: 4, lat: 145, lon: 192)>
array([[[-38.74568939, -38.74568939, -38.74568939, ..., -38.74568939,
-38.74568939, -38.74568939],
[-36.58245468, -36.61849976, -36.65530777, ..., -36.47352982,
-36.50952148, -36.54521942],
[-35.74017334, -35.84892654, -35.95645142, ..., -35.40914154,
-35.51865387, -35.62909698],
...,
[-32.0694809 , -32.01528931, -31.96115875, ..., -32.24432373,
-32.18037796, -32.1263504 ],
[-32.59425354, -32.57166672, -32.55008316, ..., -32.66543961,
-32.64432526, -32.61899185],
[-33.51273727, -33.51273727, -33.51273727, ..., -33.51273727,
-33.51273727, -33.51273727]],
for ax in axes.flat:
ax.axes.get_xaxis().set_ticklabels([])
ax.axes.get_yaxis().set_ticklabels([])
ax.axes.axis("tight")
ax.set_xlabel("")
plt.tight_layout()
fig.suptitle("Seasonal Surface Air Temperature", fontsize=16, y=1.02)
[10]: Text(0.5, 1.02, 'Seasonal Surface Air Temperature')
Annual Climatology
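The elided call presumably computed the monthly (annual cycle) climatology:

annual_climo = ds_monthly.temporal.climatology("tas", freq="month", weighted=True)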
[12]: annual_climo.tas
[12]: <xarray.DataArray 'tas' (time: 12, lat: 145, lon: 192)>
array([[[-28.21442795, -28.21442795, -28.21442795, ..., -28.21442795,
-28.21442795, -28.21442795],
[-27.14847946, -27.17834282, -27.20867348, ..., -27.06005478,
Daily Climatology
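The elided call presumably used the hourly dataset (named ds_hourly in the reconstructed cell above):

daily_climo = ds_hourly.temporal.climatology("tas", freq="day", weighted=True)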
[14]: daily_climo.tas
[14]: <xarray.DataArray 'tas' (time: 365, lat: 145, lon: 192)>
dask.array<truediv, shape=(365, 145, 192), dtype=float64, chunksize=(1, 145, 192),␣
˓→chunktype=numpy.ndarray>
Coordinates:
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 ...
* time (time) object 0001-01-01 00:00:00 ... 0001-12-31 00:00:00
Attributes:
operation: temporal_avg
mode: climatology
freq: day
weighted: True
Seasonal Anomalies
The season_config dictionary keyword argument can be passed to .departures() for more granular configuration.
We will be sticking with the default settings.
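The departures call itself was elided; a sketch with the default settings:

season_departures = ds_monthly.temporal.departures("tas", freq="season", weighted=True)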
[16]: season_departures.tas
[16]: <xarray.DataArray 'tas' (time: 1977, lat: 145, lon: 192)>
array([[[ 4.34326172, 4.34326172, 4.34326172, ..., 4.34326172,
4.34326172, 4.34326172],
[ 3.86720657, 3.8577919 , 3.85103607, ..., 3.88219452,
3.87863541, 3.87652969],
[ 4.2077713 , 4.16518402, 4.11819077, ..., 4.32532883,
4.28820419, 4.24903488],
...,
[-12.20487785, -12.16464233, -12.10724449, ..., -12.2838459 ,
-12.24758339, -12.22140503],
[-12.55296707, -12.54230881, -12.52475929, ..., -12.58355713,
-12.5769062 , -12.56380463],
[-12.51847076, -12.51847076, -12.51847076, ..., -12.51847076,
-12.51847076, -12.51847076]],
To calculate custom seasonal cycle anomalies, we must first define our custom seasons using the season_config
dictionary and the "custom_seasons" key.
"custom_seasons" must be a list of sublists containing month strings, with each sublist representing a custom season.
* Month strings must be in the three letter format (e.g., ‘Jan’) * Each month must be included once in a custom season
* Order of the months in each custom season does not matter * Custom seasons can vary in length
[17]: custom_seasons = [
["Jan", "Feb", "Mar"], # "JanFebMar"
["Apr", "May", "Jun"], # "AprMayJun"
["Jul", "Aug", "Sep"], # "JulAugSep"
["Oct", "Nov", "Dec"], # "OctNovDec"
]
c_season_departs = ds_monthly.temporal.departures(
"tas",
freq="season",
weighted=True,
season_config={"custom_seasons": custom_seasons},
)
[18]: c_season_departs.tas
[18]: <xarray.DataArray 'tas' (time: 1980, lat: 145, lon: 192)>
array([[[ 1.15587234e+01, 1.15587234e+01, 1.15587234e+01, ...,
1.15587234e+01, 1.15587234e+01, 1.15587234e+01],
[ 1.06294823e+01, 1.06283112e+01, 1.06131477e+01, ...,
1.06636086e+01, 1.06529236e+01, 1.06392441e+01],
[ 1.07633209e+01, 1.07548561e+01, 1.07489014e+01, ...,
1.07870865e+01, 1.07799606e+01, 1.07722702e+01],
...,
[-3.13597870e+00, -3.16473389e+00, -3.20427704e+00, ...,
-3.06773376e+00, -3.09281540e+00, -3.11287689e+00],
Annual Anomalies
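The elided call presumably mirrored the annual climatology above:

annual_departures = ds_monthly.temporal.departures("tas", freq="month", weighted=True)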
[20]: annual_departures.tas
[20]: <xarray.DataArray 'tas' (time: 1980, lat: 145, lon: 192)>
array([[[ 1.02746201, 1.02746201, 1.02746201, ..., 1.02746201,
1.02746201, 1.02746201],
[ 1.19550705, 1.18815422, 1.16651344, ..., 1.25013351,
1.23219299, 1.2116642 ],
[ 1.46669388, 1.44287872, 1.42212868, ..., 1.53920364,
1.51576614, 1.49129677],
...,
[-3.27492523, -3.30706978, -3.3486805 , ..., -3.19853592,
-3.22591019, -3.24889755],
[-3.33357239, -3.35164261, -3.37155914, ..., -3.27233505,
-3.29122543, -3.31744385],
[-3.00098038, -3.00098038, -3.00098038, ..., -3.00098038,
-3.00098038, -3.00098038]],
Daily Anomalies
Leap days (if present) are dropped if the CF calendar type is "gregorian", "proleptic_gregorian", or "standard".
[22]: daily_departures.tas
[22]: <xarray.DataArray 'tas' (time: 14600, lat: 145, lon: 192)>
dask.array<getitem, shape=(14600, 145, 192), dtype=float64, chunksize=(8, 145, 192),␣
˓→chunktype=numpy.ndarray>
Coordinates:
* time (time) datetime64[ns] 2010-01-01T03:00:00 ... 2015-01-01
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 ...
Attributes:
operation: temporal_avg
mode: departures
freq: day
weighted: True
Overview
We’ll cover horizontal regridding using the xESMF and Regrid2 tools as well as various methods supported by xESMF.
It should be noted that Regrid2 treats the grid cells as being flat.
import os
import sys
We are using xarray’s OPeNDAP support to read a netCDF4 dataset file directly from its source. The data is not loaded
over the network until we perform operations on it (e.g., temperature unit adjustment).
More information on xarray's OPeNDAP support can be found here.
˓→185001-201412.nc"
ds = xcdat.open_dataset(filepath)
ds
[3]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 64, lon: 128)
Coordinates:
* time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
* lat (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
* lon (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) object ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 -25.04 -25.28 -25.49 ... -25.93 -25.73
Attributes: (12/54)
CCCma_model_hash: 7e8e715f3f2ce47e1bab830db971c362ca329419
CCCma_parent_runid: rc3.1-pictrl
CCCma_pycmor_hash: 33c30511acc319a98240633965a04ca99c26427e
CCCma_runid: rc3.1-his13
Conventions: CF-1.7 CMIP-6.2
YMDH_branch_time_in_child: 1850:01:01:00
... ...
variable_id: tas
variant_label: r13i1p1f1
version: v20190429
license: CMIP6 model data produced by The Governm...
grid_urlpath = "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/abrupt-4xCO2/r1i1p1f1/day/tas/gr2/v20180701/tas_day_GFDL-CM4_abrupt-4xCO2_r1i1p1f1_gr2_00010101-00201231.nc"
grid_ds = xcdat.open_dataset(grid_urlpath)
output_grid = grid_ds.regridder.grid
Other related APIs available for creating grids: xcdat.create_grid() and xcdat.create_uniform_grid()
4. Regridding algorithms

# Reconstructed sketch: the original cell was partially lost. It compared the
# regridding methods available through xESMF by regridding tas with each one.
methods = ["bilinear", "conservative", "patch", "nearest_s2d", "nearest_d2s"]
fig, axes = plt.subplots(3, 2, figsize=(12, 9))
axes = axes.flatten()
for i, method in enumerate(methods):
    result = ds.regridder.horizontal("tas", output_grid, tool="xesmf", method=method)
    result.tas.isel(time=0).plot(ax=axes[i], add_colorbar=False)
    axes[i].set_title(method)
axes[-1].set_visible(False)
plt.tight_layout()
5. Masking
# Reconstructed sketch (cell partially lost): xESMF applies a "mask" variable on the input dataset during regridding.
ds["mask"] = (ds.tas.isel(time=0, drop=True) > 0).astype(int)  # hypothetical mask
masked_output = ds.regridder.horizontal("tas", output_grid, tool="xesmf", method="bilinear")
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
ds["mask"].plot(ax=axes[0], cmap="binary_r")
axes[0].set_title("Mask")
masked_output.tas.isel(time=0).plot(ax=axes[1])
axes[1].set_title("Masked output")
plt.tight_layout()
import warnings
warnings.filterwarnings("ignore")
1. Open dataset
[2]: # urls for sea water potential temperature (thetao) and salinity (so) from the NCAR model in CMIP6
urls = [
    "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/so/gn/v20190308/so_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc",
    "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/thetao/gn/v20190308/thetao_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc",
]
ds = xcdat.open_mfdataset(urls, chunks={"time": 4})  # assumed open call; the chunking matches the Dask repr below
ds
2023-09-26 15:45:48,860 [WARNING]: bounds.py(add_missing_bounds:186) >> The nlat coord variable has a 'units' attribute that is not in degrees.
2023-09-26 15:45:48,966 [WARNING]: bounds.py(add_missing_bounds:186) >> The nlat coord variable has a 'units' attribute that is not in degrees.
[2]: <xarray.Dataset>
Dimensions: (lev: 60, nlat: 384, nlon: 320, time: 1980, d2: 2, vertices: 4,
bnds: 2)
Coordinates:
* lev (lev) float64 5.0 15.0 25.0 ... 4.875e+03 5.125e+03 5.375e+03
* nlat (nlat) int32 1 2 3 4 5 6 7 8 ... 377 378 379 380 381 382 383 384
* nlon (nlon) int32 1 2 3 4 5 6 7 8 ... 313 314 315 316 317 318 319 320
* time (time) object 1850-01-15 13:00:00.000007 ... 2014-12-15 12:00:00
lat (nlat, nlon) float64 dask.array<chunksize=(384, 320), meta=np.ndarray>
lon (nlat, nlon) float64 dask.array<chunksize=(384, 320), meta=np.ndarray>
Dimensions without coordinates: d2, vertices, bnds
Data variables:
time_bnds (time, d2) object dask.array<chunksize=(4, 2), meta=np.ndarray>
lat_bnds (nlat, nlon, vertices) float32 dask.array<chunksize=(384, 320, 4), meta=np.ndarray>
nlon_bnds (nlon, bnds) float64 0.5 1.5 1.5 2.5 ... 318.5 319.5 319.5 320.5
thetao (time, lev, nlat, nlon) float32 dask.array<chunksize=(4, 60, 384, 320), meta=np.ndarray>
import numpy as np

# Assumed reconstruction of the target-grid cell (the source repeated an earlier
# horizontal-regridding cell here): the 10 levels in the repr below match
# np.linspace(5, 537, 10).
lev = xcdat.create_axis("lev", np.linspace(5, 537, 10))
output_grid = xcdat.create_grid(z=lev)
output_grid
[3]: <xarray.Dataset>
Dimensions: (lev: 10, bnds: 2)
Coordinates:
* lev (lev) float64 5.0 64.11 123.2 182.3 ... 359.7 418.8 477.9 537.0
Dimensions without coordinates: bnds
Data variables:
lev_bnds (lev, bnds) float64 -24.56 34.56 34.56 93.67 ... 507.4 507.4 566.6
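The regrid call itself is truncated in this excerpt; a minimal sketch, assuming the xgcm tool and linear interpolation:
# Assumed call that produces the `output` object plotted below.
output = ds.regridder.vertical("so", output_grid, tool="xgcm", method="linear")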
output.so.isel(time=0).mean(dim="nlon").plot()
plt.gca().invert_yaxis()
2023-09-26 15:45:50,454 [WARNING]: bounds.py(add_missing_bounds:186) >> The nlat coord variable has a 'units' attribute that is not in degrees.
[5]: # Apply gsw function to calculate potential density from potential temperature (thetao) and salinity (so)
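import gsw

# Hypothetical reconstruction of the truncated step: potential density anomaly
# (sigma0). Strictly, gsw.sigma0 expects absolute salinity and conservative
# temperature; so/thetao are passed as-is here purely for illustration.
ds["dens"] = gsw.sigma0(ds.so, ds.thetao)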
ds.dens.isel(time=0).mean(dim="nlon").plot()
plt.gca().invert_yaxis()
output.so.isel(time=0).mean(dim="nlon").plot()
plt.gca().invert_yaxis()
output.so.isel(time=0).sel(lev=0, method="nearest").plot()
2023-09-26 15:47:50,128 [WARNING]: bounds.py(add_missing_bounds:186) >> The nlat coord variable has a 'units' attribute that is not in degrees.
1. Open dataset
url_ta = '...r1i1p1f1_gr_185001-189912.nc'  # URL truncated in the source; name assumed to mirror url_cl below
url_cl = 'https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/CMIP6/CMIP/E3SM-Project/E3SM-2-0/historical/r1i1p1f1/Amon/cl/gr/v20220830/cl_Amon_E3SM-2-0_historical_r1i1p1f1_gr_185001-189912.nc'
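The cell that builds the target grid is truncated here; a minimal sketch, assuming the 13 target pressure levels shown in the repr below (they match np.linspace(100000, 1, 13), in Pa):
import numpy as np

# Hypothetical reconstruction of the target pressure grid.
lev = xcdat.create_axis("lev", np.linspace(100000, 1, 13), attrs={"units": "Pa", "positive": "down"})
output_grid = xcdat.create_grid(z=lev)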
output_grid
[9]: <xarray.Dataset>
Dimensions: (lev: 13, bnds: 2)
Coordinates:
* lev (lev) float64 1e+05 9.167e+04 8.333e+04 ... 8.334e+03 1.0
Dimensions without coordinates: bnds
Data variables:
lev_bnds (lev, bnds) float64 1.042e+05 9.583e+04 ... 4.168e+03 -4.166e+03
[10]: # Remap from original pressure level to target pressure level using logarithmic interpolation
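# Assumed reconstruction of the truncated call; `ds_ta` is presumed to be the
# dataset opened from the (truncated) `ta` URL above.
output_ta = ds_ta.regridder.vertical("ta", output_grid, tool="xgcm", method="log")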
output_ta.ta.isel(time=0, lev=0).plot()
[10]: <matplotlib.collections.QuadMesh at 0x188277b10>
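The helper that builds the 4-D pressure field from the hybrid sigma-pressure terms is not shown in this excerpt; a minimal sketch, assuming the CF hybrid sigma-pressure formulation p = a*p0 + b*ps and the variable names carried in ds_cl.data_vars:
def hybrid_coordinate(p0, a, b, ps, **kwargs):
    # Hybrid sigma-pressure: p = a * p0 + b * ps; any extra data_vars are ignored.
    return a * p0 + b * ps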
pressure = hybrid_coordinate(**ds_cl.data_vars)
pressure
[11]: <xarray.DataArray (lev: 72, time: 600, lat: 180, lon: 360)>
dask.array<add, shape=(72, 600, 180, 360), dtype=float64, chunksize=(72, 4, 180, 360),␣
˓→chunktype=numpy.ndarray>
Coordinates:
* lev (lev) float64 0.9985 0.9938 0.9862 ... 0.0001828 0.0001238
* time (time) object 1850-01-16 12:00:00 ... 1899-12-16 12:00:00
* lat (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5
* lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5
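The remap call is truncated here; a minimal sketch, assuming linear interpolation and passing the 4-D pressure field as the source vertical coordinate (target_data):
# Assumed call that produces the `output_cl` object plotted below.
output_cl = ds_cl.regridder.vertical("cl", output_grid, tool="xgcm", method="linear", target_data=pressure)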
output_cl.cl.isel(time=0, lev=0).plot()
[12]: <matplotlib.collections.QuadMesh at 0x19974eed0>
[13]: output_cl.cl.isel(time=0).mean(dim='lon').plot()
plt.gca().invert_yaxis()
10.4.1 LLNL Climate and Weather Seminar Series (01/25/2023) - A Gentle Introduction to xCDAT
This work is performed under the auspices of the U.S. DOE by Lawrence Livermore National Laboratory under contract No. DE-AC52-07NA27344.
Presentation Overview
Notebook Setup
Create an Anaconda environment for this notebook using the command below:
conda create -n xcdat -c conda-forge xarray xcdat xesmf matplotlib nc-time-axis jupyter
• The CDAT (Community Data Analysis Tools) library has provided a suite of robust and comprehensive open-source climate data analysis and visualization packages for over 20 years
• A driving need for a modern successor
– Focus on a maintainable and extensible library
– Serve the needs of the climate community in the long-term
Introducing xCDAT
“Xarray introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-
like multidimensional arrays, which allows for a more intuitive, more concise, and less error-prone
developer experience.”
—https://xarray.pydata.org/en/v2022.10.0/getting-started-guide/why-xarray.html
• Apply operations over dimensions by name
– x.sum('time')
• Select values by label (or logical location) instead of integer location
– x.loc['2014-01-01'] or x.sel(time='2014-01-01')
• Mathematical operations vectorize across multiple dimensions (array broadcasting) based on dimension
names, not shape
– x - y
• Easily use the split-apply-combine paradigm with groupby
– x.groupby('time.dayofyear').mean()
• Database-like alignment based on coordinate labels that smoothly handles missing values
– x, y = xr.align(x, y, join='outer')
• Keep track of arbitrary metadata in the form of a Python dictionary
– x.attrs
Source: https://docs.xarray.dev/en/v2022.10.0/getting-started-guide/why-xarray.html#what-labels-enable
“Xarray data models are borrowed from netCDF file format, which provides xarray with a natural and
portable serialization format.”
—https://docs.xarray.dev/en/v2022.10.0/getting-started-guide/why-xarray.html
1. ``xarray.Dataset``
• A dictionary-like container of DataArray objects with aligned dimensions
– DataArray objects are classified as “coordinate variables” or “data variables”
– All data variables have a shared union of coordinates
[1]: # This style import is necessary to properly render Xarray's HTML output with
# the Jupyter RISE extension.
# GitHub Issue: https://github.com/damianavila/RISE/issues/594
# Source: https://github.com/smartass101/xarray-pydata-prague-2020/blob/main/rise.css
from IPython.display import HTML
style = """
<style>
.reveal pre.xr-text-repr-fallback {
display: none;
}
.reveal ul.xr-sections {
display: grid
}
.reveal ul ul.xr-var-list {
display: contents
}
</style>
"""
HTML(style)
[1]: <IPython.core.display.HTML object>
filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-
˓→ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_
˓→r10i1p1f1_gn_185001-201412.nc"
ds = xr.open_dataset(filepath)
[3]: ds
[3]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 ...
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) datetime64[ns] ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 ...
Attributes: (12/48)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 0.0
branch_time_in_parent: 87658.0
creation_date: 2020-06-05T04:06:11Z
... ...
variant_label: r10i1p1f1
version: v20200605
license: CMIP6 model data produced by CSIRO is li...
cmor_version: 3.4.0
tracking_id: hdl:21.14100/af78ae5e-f3a6-4e99-8cfe-5f2...
DODS_EXTRA.Unlimited_Dimension: time
[4]: ds.tas
[4]: <xarray.DataArray 'tas' (time: 1980, lat: 145, lon: 192)>
[55123200 values with dtype=float32]
Coordinates:
* time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
height float64 ...
Attributes:
standard_name: air_temperature
long_name: Near-Surface Air Temperature
comment: near-surface (usually, 2 meter) air temperature
units: K
cell_methods: area: time: mean
cell_measures: area: areacella
history: 2020-06-05T04:06:10Z altered by CMOR: Treated scalar dime...
_ChunkSizes: [ 1 145 192]
2. ``xarray.DataArray``
A class that attaches dimension names, coordinates, and attributes to multi-dimensional arrays (aka “labeled arrays”)
Key properties:
• values: a numpy.ndarray holding the array’s values
• dims: dimension names for each axis (e.g., (‘x’, ‘y’, ‘z’))
• coords: a dict-like container of arrays (coordinates) that label each point (e.g., 1-dimensional arrays of
numbers, datetime objects or strings)
• attrs: dict to hold arbitrary metadata (attributes)
Source: https://docs.xarray.dev/en/stable/user-guide/data-structures.html#dataarray
• Some key xCDAT features are inspired by or ported from the core CDAT library
– e.g., spatial averaging, temporal averaging, regrid2 for horizontal regridding
• Other features leverage powerful libraries in the xarray ecosystem
– xESMF for horizontal regridding
– xgcm for vertical interpolation
– CF-xarray for CF convention metadata interpretation
• xCDAT strives to support CF compliant datasets and common non-CF compliant metadata (e.g., time units in “months since . . . ” or “years since . . . ”)
• Inherent support for lazy operations and parallelism through xarray + dask
xcdat spatial functionality is exposed by chaining the .spatial accessor attribute to the xr.Dataset object.
Source: https://xcdat.readthedocs.io/en/latest/api.html
Feature: Extend xr.open_dataset() and xr.open_mfdataset()
    API: open_dataset(), open_mfdataset()
    Description: bounds generation; time decoding (CF and select non-CF time units); centering of time coordinates; conversion of longitudinal axis orientation
Feature: Temporal averaging
    API: ds.temporal.average(), ds.temporal.group_average(), ds.temporal.climatology(), ds.temporal.departures()
    Description: single snapshot and group average; climatology and departure; weighted or unweighted; optional seasonal configuration (e.g., custom seasons)
Feature: Geospatial averaging
    API: ds.spatial.average()
    Description: rectilinear grids; weighted; optional specification of region domain
Feature: Horizontal regridding
    API: ds.regridder.horizontal()
    Description: rectilinear and curvilinear grids; extends xESMF horizontal regridding; Python implementation of regrid2
Feature: Vertical regridding
    API: ds.regridder.vertical()
    Description: transform vertical coordinates; extends xgcm vertical interpolation; linear, logarithmic, and conservative interpolation; decode parametric vertical coordinates if required
• Prerequisites
– Installing xcdat
– Import xcdat
– Open a dataset and apply postprocessing operations
• Scenario 1 - Calculate the spatial averages over the tropical region
• Scenario 2 - Calculate temporal average
• Scenario 3 - Horizontal regridding (bilinear, gaussian grid)
Installing xcdat
Source: https://xcdat.readthedocs.io/en/latest/getting-started.html
Opening a dataset
[5]: # This gives access to all xcdat public top-level APIs and accessor classes.
import xcdat as xc
# We import these packages specifically for plotting. It is not required to use xcdat.
import matplotlib.pyplot as plt
import pandas as pd
filepath = "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-
˓→ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_
˓→r10i1p1f1_gn_185001-201412.nc"
ds = xc.open_dataset(
filepath,
add_bounds=True,
decode_times=True,
center_times=True
)
[6]: ds
[6]: <xarray.Dataset>
Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)
Coordinates:
* time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
* lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
* lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) object ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 -27.19 -27.19 -27.19 ... -25.29 -25.29
Attributes: (12/48)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 0.0
branch_time_in_parent: 87658.0
creation_date: 2020-06-05T04:06:11Z
... ...
variant_label: r10i1p1f1
version: v20200605
license: CMIP6 model data produced by CSIRO is li...
cmor_version: 3.4.0
tracking_id: hdl:21.14100/af78ae5e-f3a6-4e99-8cfe-5f2...
DODS_EXTRA.Unlimited_Dimension: time
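The averaging call itself does not survive in this excerpt; a minimal sketch of Scenario 1, assuming a tropical domain of 20°S-20°N:
# Assumed call that produces the `ds_avg` time series plotted below.
ds_avg = ds.spatial.average("tas", axis=["X", "Y"], lat_bounds=(-20, 20))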
[10]: ds_avg.tas.plot(label="weighted")
[10]: <matplotlib.collections.QuadMesh at 0x1443d9f30>
input_grid = ds.regridder.grid
fig, axes = plt.subplots(1, 2, figsize=(12, 4))  # assumed figure setup; truncated in the source
input_grid.plot.scatter(x='lon', y='lat', s=5, ax=axes[0], add_colorbar=False, cmap=plt.cm.RdBu)
axes[0].set_title('Input Grid')
axes[1].set_title('Output Grid')
plt.tight_layout()
xCDAT offers horizontal regridding with xESMF (default) and a Python port of regrid2. We will be using xESMF to
regrid.
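The regrid call is truncated here; a minimal sketch of Scenario 3, assuming a 32-latitude gaussian target grid:
# Assumed target grid and regrid call producing the `output` object plotted below.
output_grid = xc.create_gaussian_grid(32)
output = ds.regridder.horizontal("tas", output_grid, tool="xesmf", method="bilinear")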
fig, axes = plt.subplots(1, 2, figsize=(12, 4))  # assumed figure setup; truncated in the source
ds.tas.isel(time=0).plot(ax=axes[0])
axes[0].set_title('Input data')
output.tas.isel(time=0).plot(ax=axes[1])
axes[1].set_title('Output data')
plt.tight_layout()
“Nearly all existing xarray methods have been extended to work automatically with Dask arrays for parallelism”
—https://docs.xarray.dev/en/stable/user-guide/dask.html#using-dask-with-xarray
• Parallelized xarray methods include indexing, computation, concatenating and grouped operations
• xCDAT APIs that build upon xarray methods inherently support Dask parallelism
– Dask arrays are loaded into memory only when absolutely required (e.g., generating weights for averaging)
Key Takeaways
10.5.1 Overview
Most public xcdat APIs operate on xarray.Dataset objects. xcdat follows this design pattern because coordinate
variable bounds are often required to perform robust calculations. Currently, coordinate variable bounds can only be
stored on Dataset objects and not DataArray objects. Refer to this issue for more information.
xcdat.open_dataset
• center_times (bool, optional) – If True, attempt to center time coordinates using the midpoint between its upper and lower bounds. Otherwise, use the provided time coordinates, by default False.
• lon_orient (Optional[Tuple[float, float]], optional) – The orientation to use for the Dataset’s longitude axis (if it exists). Either (-180, 180) or (0, 360), by default None. Supported options include:
– None: use the current orientation (if the longitude axis exists)
– (-180, 180): represents [-180, 180) in math notation
– (0, 360): represents [0, 360) in math notation
• kwargs (Dict[str, Any]) – Additional arguments passed on to xarray.open_dataset. Refer to the xarray docs [1] for accepted keyword arguments.
Returns
xr.Dataset – Dataset after applying operations.
Notes
xarray.open_dataset opens the file with read-only access. When you modify values of a Dataset, even one
linked to files on disk, only the in-memory copy you are manipulating in xarray is modified: the original file on
disk is never touched.
References
xcdat.open_mfdataset
with any XML file that has the directory attribute. Refer to [4] for more information on CDML. NOTE: This feature is deprecated in v0.6.0 and will be removed in the subsequent release. CDAT (including cdms2/CDML) is in maintenance only mode and marked for end-of-life by the end of 2023.
• add_bounds (List[CFAxisKey] | None | bool) – List of CF axes to try to add bounds
for (if missing), by default [“X”, “Y”]. Set to None to not add any missing bounds. Please
note that bounds are required for many xCDAT features.
– This parameter calls xarray.Dataset.bounds.add_missing_bounds()
– Supported CF axes include “X”, “Y”, “Z”, and “T”
– By default, missing “T” bounds are generated using the time frequency of the coordinates.
If desired, refer to xarray.Dataset.bounds.add_time_bounds() if you require more
granular configuration for how “T” bounds are generated.
• data_var (Optional[str], optional) – The key of the data variable to keep in the Dataset,
by default None.
• decode_times (bool, optional) – If True, attempt to decode times encoded in the standard
NetCDF datetime format into cftime.datetime objects. Otherwise, leave them encoded as
numbers. This keyword may not be supported by all the backends, by default True.
• center_times (bool, optional) – If True, attempt to center time coordinates using the midpoint between its upper and lower bounds. Otherwise, use the provided time coordinates, by default False.
• lon_orient (Optional[Tuple[float, float]], optional) – The orientation to use for the
Dataset’s longitude axis (if it exists), by default None. Supported options include:
– None: use the current orientation (if the longitude axis exists)
– (-180, 180): represents [-180, 180) in math notation
– (0, 360): represents [0, 360) in math notation
• data_vars ({"minimal", "different", "all" or list of str}, optional) –
These data variables will be concatenated together:
– “minimal”: Only data variables in which the dimension already appears are included,
the default value.
– “different”: Data variables which are not equal (ignoring attributes) across all datasets
are also concatenated (as well as all for which dimension already appears). Beware: this
option may load the data payload of data variables into memory if they are not already
loaded.
– “all”: All data variables will be concatenated.
– list of str: The listed data variables will be concatenated, in addition to the “minimal”
data variables.
The data_vars kwarg defaults to "minimal", which concatenates data variables in a manner where only data variables in which the dimension already appears are included. For example, the time dimension will not be concatenated to the dimensions of non-time data variables such as “lat_bnds” or “lon_bnds”. data_vars="minimal" is required for some xCDAT functions, including spatial averaging where a reduction is performed using the lat/lon bounds.
4 https://cdms.readthedocs.io/en/latest/manual/cdms_6.html
Notes
xarray.open_mfdataset opens the file with read-only access. When you modify values of a Dataset, even
one linked to files on disk, only the in-memory copy you are manipulating in xarray is modified: the original file
on disk is never touched.
The CDAT “Climate Data Markup Language” (CDML) is a deprecated dialect of XML with a defined set of attributes. CDML is still used by current and former users of CDAT. To enable CDML users to adopt xCDAT more easily in their workflows, xCDAT can parse XML/CDML files for the directory to generate a glob or list of file paths. Refer to [4] for more information on CDML. NOTE: This feature is deprecated in v0.6.0 and will be removed in the subsequent release. CDAT (including cdms2/CDML) is in maintenance only mode and marked for end-of-life by the end of 2023.
References
xcdat.center_times
xcdat.center_times(dataset)
Centers time coordinates using the midpoint between time bounds.
Time coordinates can be recorded using different intervals, including the beginning, middle, or end of the interval. Centering time coordinates ensures that calculations using these values are performed reliably, regardless of the recorded interval.
This method attempts to get bounds for each time variable using the CF “bounds” attribute. Coordinate variables
that cannot be mapped to bounds will be skipped.
Parameters
dataset (xr.Dataset) – The Dataset with original time coordinates.
Returns
xr.Dataset – The Dataset with centered time coordinates.
xcdat.decode_time
xcdat.decode_time(dataset)
Decodes CF and non-CF time coordinates and time bounds using cftime.
By default, xarray only supports decoding time with CF compliant units [5]. This function also enables decoding time with non-CF compliant units. It skips decoding time coordinates that have already been decoded as "datetime64[ns]" or cftime.datetime.
3 https://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html
5 https://cfconventions.org/cf-conventions/cf-conventions.html#time-coordinate
For time coordinates to be decodable, they must have a “calendar” attribute set to a CF calendar type supported by cftime. CF calendar types include “noleap”, “360_day”, “365_day”, “366_day”, “gregorian”, “proleptic_gregorian”, “julian”, “all_leap”, or “standard”. They must also have a “units” attribute set to a format supported by xCDAT (“months since . . . ” or “years since . . . ”).
Parameters
dataset (xr.Dataset) – Dataset with numerically encoded time coordinates and time bounds
(if they exist). If the time coordinates cannot be decoded then the original dataset is returned.
Returns
xr.Dataset – Dataset with decoded time coordinates and time bounds (if they exist) as cftime
objects.
Raises
KeyError – If time coordinates were not detected in the dataset, either because they don’t exist
at all or their CF attributes (e.g., ‘axis’ or ‘standard_name’) are not set.
Notes
Time coordinates are represented by cftime.datetime objects because they are not restricted by the pandas.Timestamp range (years 1678 through 2262). Refer to [6] and [7] for more information on this limitation.
References
Examples
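A minimal usage sketch (the file name and its encoded, non-CF time units are hypothetical):
>>> ds = xr.open_dataset("file_with_noncf_time.nc", decode_times=False)
>>> ds_decoded = xcdat.decode_time(ds)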
>>> ds_decoded.time.encoding
{'source': None,
'dtype': dtype('int64'),
'original_shape': (3,),
'units': 'years since 2000-01-01',
'calendar': 'noleap'}
xcdat.swap_lon_axis
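The body of this entry does not survive in this excerpt; a minimal usage sketch (assuming a dataset ds with a longitude axis):
>>> ds_swapped = xcdat.swap_lon_axis(ds, to=(-180, 180))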
xcdat.compare_datasets
xcdat.compare_datasets(ds1, ds2)
Compares the keys and values of two datasets.
This utility function is especially useful for debugging tests that involve comparing two Dataset objects for being
identical or equal.
Checks include:
• Unique keys - keys that exist only in one of the two datasets.
• Non-identical keys - keys whose values fail the identical check (same dimensions, coordinates, values, name, attributes, and attributes on all coordinates).
• Non-equal keys - keys whose values fail the equal check (same dimensions, coordinates, and values, but not necessarily the same attributes). Key values that are non-equal will also be non-identical.
Parameters
• ds1 (xr.Dataset) – The first Dataset.
• ds2 (xr.Dataset) – The second Dataset.
Returns
Dict[str, Union[List[str]]] – A dictionary mapping unique, non-identical, and non-equal
keys in both Datasets.
xcdat.get_dim_coords
xcdat.get_dim_coords(obj, axis)
Gets the dimension coordinates for an axis.
This function uses cf_xarray to attempt to map the axis to its dimension coordinates by interpreting the CF axis and coordinate names found in the coordinate attributes. Refer to [1] for a list of CF axis and coordinate names that can be interpreted by cf_xarray.
If obj is an xr.Dataset, this function can return a single dimension coordinate variable as an xr.DataArray or multiple dimension coordinate variables in an xr.Dataset. If obj is an xr.DataArray, this function should return a single dimension coordinate variable as an xr.DataArray.
Parameters
• obj (Union[xr.Dataset, xr.DataArray]) – The Dataset or DataArray object.
• axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
Returns
Union[xr.Dataset, xr.DataArray] – A Dataset of dimension coordinate variables or a
DataArray for the single dimension coordinate variable.
Raises
• ValueError – If the obj is an xr.DataArray and more than one dimension is mapped to
the same axis.
• KeyError – If no dimension coordinate variables were found for the axis.
1 https://cf-xarray.readthedocs.io/en/latest/coord_axes.html#axes-and-coordinates
Notes
References
xcdat.get_dim_keys
xcdat.get_dim_keys(obj, axis)
Gets the dimension key(s) for an axis.
Each dimension should have a corresponding dimension coordinate variable, which has a 1:1 map in keys and is
denoted by the * symbol when printing out the xarray object.
Parameters
• obj (Union[xr.Dataset, xr.DataArray]) – The Dataset or DataArray object.
• axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, or “Z”)
Returns
Union[str, List[str]] – The dimension string or a list of dimension strings for an axis.
xcdat.create_axis
Examples
>>> lat, bnds = create_axis("lat", [-45, 0, 45], bounds=[[-67.5, -22.5], [-22.5, 22.5], [22.5, 67.5]])
xcdat.create_gaussian_grid
xcdat.create_gaussian_grid(nlats)
Creates a grid with Gaussian latitudes and uniform longitudes.
Parameters
nlats (int) – Number of latitudes.
Returns
xr.Dataset – Dataset with new grid, containing Gaussian latitudes.
Examples
>>> xcdat.regridder.grid.create_gaussian_grid(32)
xcdat.create_global_mean_grid
xcdat.create_global_mean_grid(grid)
Creates a global mean grid.
Bounds are expected to be present in grid.
Parameters
grid (xr.Dataset) – Source grid.
Returns
xr.Dataset – A dataset containing the global mean grid.
xcdat.create_grid
Examples
>>> z = create_axis(
>>> "lev", np.linspace(1000, 1, 20), attrs={"units": "meters", "positive": "down"}
>>> )
>>> grid = create_grid(z=z)
xcdat.create_uniform_grid
Examples
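The example body is missing here; a minimal sketch (arguments are lat start, stop, and delta, followed by lon start, stop, and delta):
>>> xcdat.create_uniform_grid(-90, 90, 4.0, -180, 180, 5.0)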
xcdat.create_zonal_grid
xcdat.create_zonal_grid(grid)
Creates a zonal grid.
Bounds are expected to be present in grid.
Parameters
grid (xr.Dataset) – Source grid.
Returns
xr.Dataset – A dataset containing a zonal grid.
10.5.3 Accessors
xcdat provides Dataset accessors, which are implicit namespaces for custom functionality that is clearly identified as separate from built-in xarray methods. xcdat implements accessors to extend xarray with custom functionality because it is the officially recommended and most common practice (over sub-classing).
In the example below, custom spatial functionality is exposed by chaining the spatial accessor attribute to the
Dataset object. This chaining enables access to the underlying spatial average() method.
Now chain the accessor attribute to the Dataset to expose the accessor class attributes, methods, or properties:
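A minimal sketch (the file path is hypothetical):
>>> import xcdat
>>> ds = xcdat.open_dataset("/path/to/file")
>>> ds.spatial.average("tas")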
Note: Accessors are created once per Dataset instance. New instances, like those created from arithmetic operations, will have new accessors created.
Classes
xcdat.regridder.xgcm.XGCMRegridder(...[, ...])
xcdat.bounds.BoundsAccessor
class xcdat.bounds.BoundsAccessor(dataset)
An accessor class that provides bounds attributes and methods on xarray Datasets through the .bounds attribute.
Examples
>>> ds = xcdat.open_dataset("/path/to/file")
>>>
>>> ds.bounds.<attribute>
>>> ds.bounds.<method>
>>> ds.bounds.<property>
Parameters
dataset (xr.Dataset) – A Dataset object.
Examples
Import:
>>> import xcdat
>>> ds.bounds.map
>>> ds.bounds.keys
>>> ds = ds.bounds.add_bounds("Y")
__init__(dataset)
Methods
__init__(dataset)
Attributes
_dataset
property map
Returns a map of axis and coordinates keys to their bounds.
The dictionary provides all valid CF compliant keys for axis and coordinates. For example, latitude will include keys for “lat”, “latitude”, and “Y”.
Returns
Dict[str, Optional[xr.DataArray]] – Dictionary mapping axis and coordinate keys to
their bounds.
property keys
Returns a list of keys for the bounds data variables in the Dataset.
Returns
List[str] – A list of sorted bounds data variable keys.
add_missing_bounds(axes)
Adds missing coordinate bounds for supported axes in the Dataset.
This function loops through the Dataset’s axes and attempts to add bounds to its coordinates if they don’t exist. “X”, “Y”, and “Z” axes bounds are the midpoints between coordinates. “T” axis bounds are based on the time frequency of the coordinates.
An axis must meet the following criteria to add bounds for it, otherwise they are ignored:
1. Axis is either “X”, “Y”, “T”, or “Z”
2. Coordinates are a single dimension, not multidimensional
3. Coordinates are a length > 1 (not singleton)
4. Bounds must not already exist
• Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
5. For the “T” axis, its coordinates must be composed of datetime-like objects (np.datetime64 or cftime).
Parameters
axes (List[str]) – List of CF axes that function should operate on. Options include “X”,
“Y”, “T”, or “Z”.
Returns
xr.Dataset
get_bounds(axis, var_key=None)
Gets coordinate bounds.
Parameters
• axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
• var_key (Optional[str]) – The key of the coordinate or data variable to get axis bounds
for. This parameter is useful if you only want the single bounds DataArray related to the
axis on the variable (e.g., “tas” has a “lat” dimension and you want “lat_bnds”).
Returns
Union[xr.Dataset, xr.DataArray] – A Dataset of N bounds variables, or a single bounds
variable DataArray.
Raises
• ValueError – If an incorrect axis argument is passed.
• KeyError – If bounds were not found for the specific axis.
add_bounds(axis)
Add bounds for an axis using its coordinates as midpoints.
This method loops over the axis’s coordinate variables and attempts to add bounds for each of them if they
don’t exist. Each coordinate point is the midpoint between their lower and upper bounds.
To add bounds for an axis, its coordinates must meet the following criteria, otherwise an error is thrown:
1. Axis is either “X”, “Y”, “T”, or “Z”
2. Coordinates are single dimensional, not multidimensional
3. Coordinates are a length > 1 (not singleton)
4. Bounds must not already exist
• Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
Parameters
axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
Returns
xr.Dataset – The dataset with bounds added.
add_time_bounds(method, freq=None, daily_subfreq=None, end_of_month=False)
Adds bounds to the “T” axis using the chosen method.
Parameters
• method ({"freq", "midpoint"}) – The method for creating time bounds for time co-
ordinates, either “freq” or “midpoint”.
– “freq”: Create time bounds as the start and end of each timestep’s period using either the
inferred or specified time frequency (freq parameter). For example, the time bounds
will be the start and end of each month for each monthly coordinate point.
– “midpoint”: Create time bounds using time coordinates as the midpoint between their
upper and lower bounds.
• freq ({"year", "month", "day", "hour"}, optional) – If method="freq", this pa-
rameter specifies the time frequency for creating time bounds. By default None, which
infers the frequency using the time coordinates.
• daily_subfreq ({1, 2, 3, 4, 6, 8, 12, 24}, optional) – If freq=="hour", this pa-
rameter sets the number of timepoints per day for time bounds, by default None.
– daily_subfreq=None infers the daily time frequency from the time coordinates.
– daily_subfreq=1 is daily
– daily_subfreq=2 is twice daily
– daily_subfreq=4 is 6-hourly
– daily_subfreq=8 is 3-hourly
– daily_subfreq=12 is 2-hourly
– daily_subfreq=24 is hourly
• end_of_month (bool, optional) – If freq=="month", this flag notes that the timepoint is
saved at the end of the monthly interval (see Note), by default False.
– Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00 for the time
interval Jan. 1 00:00 - Feb. 1 00:00. Since this method determines the month and year
from the time vector, the bounds will be set incorrectly if the timepoint is set to the end
of the time interval. For these cases, set end_of_month=True.
Returns
xr.Dataset – The dataset with time bounds added.
_drop_ancillary_singleton_coords(coord_vars)
Drop ancillary singleton coordinates from dimension coordinates.
Xarray coordinate variables retain all coordinates from the parent object. This means if singleton coordinates exist, they are attached to dimension coordinates as ancillary coordinates. For example, the “height” singleton coordinate will be attached to “time” coordinates even though “height” is related to the “Z” axis, not the “T” axis. Refer to [1] for more info on this Xarray behavior.
1 https://github.com/pydata/xarray/issues/6196
This is an undesirable behavior in xCDAT because the add bounds methods loop over coordinates related to an axis and attempt to add bounds if they don’t exist. If ancillary coordinates are present, “ValueError: Cannot generate bounds for coordinate variable ‘height’ which has a length <= 1 (singleton)” is raised. For the purpose of adding bounds, we temporarily drop any ancillary singletons from dimension coordinates before looping over those coordinates. Ancillary singletons will still be present in the final Dataset object to maintain the Dataset’s integrity.
Parameters
coord_vars (Union[xr.Dataset, xr.DataArray]) – The dimension coordinate variables
with ancillary coordinates (if they exist).
Returns
Union[xr.Dataset, xr.DataArray] – The dimension coordinate variables with ancillary
coordinates dropped (if they exist).
References
_get_bounds_keys(axis)
Get bounds keys for an axis’s coordinate variables in the dataset.
This function attempts to map bounds to an axis using cf_xarray and its interpretation of the CF “bounds”
attribute.
Parameters
axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, or “Z”).
Returns
List[str] – The axis bounds key(s).
_create_time_bounds(time, freq=None, daily_subfreq=None, end_of_month=False)
Creates time bounds for each timestep of the time coordinate axis.
This method creates time bounds as the start and end of each timestep’s period using either the inferred or
specified time frequency (freq parameter). For example, the time bounds will be the start and end of each
month for each monthly coordinate point.
Parameters
• time (xr.DataArray) – The temporal coordinate variable for the axis.
• freq ({"year", "month", "day", "hour"}, optional) – The time frequency for creat-
ing time bounds, by default None (infer the frequency).
• daily_subfreq ({1, 2, 3, 4, 6, 8, 12, 24}, optional) – If freq=="hour", this pa-
rameter sets the number of timepoints per day for bounds, by default None. If greater than
1, sub-daily bounds are created.
– daily_subfreq=None infers the freq from the time coords (default)
– daily_subfreq=1 is daily
– daily_subfreq=2 is twice daily
– daily_subfreq=4 is 6-hourly
– daily_subfreq=8 is 3-hourly
– daily_subfreq=12 is 2-hourly
– daily_subfreq=24 is hourly
• end_of_month (bool, optional) – If freq=="month", this flag notes that the timepoint is saved at the end of the monthly interval (see Note), by default False.
Returns
xr.DataArray – A DataArray storing bounds for the time axis.
Raises
• ValueError – If coordinates are a singleton.
• TypeError – If time coordinates are not composed of datetime-like objects.
Note: Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00 for the time interval Jan.
1 00:00 - Feb. 1 00:00. Since this function determines the month and year from the time vector, the
bounds will be set incorrectly if the timepoint is set to the end of the time interval. For these cases, set
end_of_month=True.
_create_yearly_time_bounds(timesteps, obj_type)
Creates time bounds for each timestep with the start and end of the year.
Bounds for each timestep correspond to Jan. 1 00:00:00 of the year of the timestep and Jan. 1 00:00:00 of
the subsequent year.
Parameters
• timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or
pd.Timestamp (casted from np.datetime64[ns] to support pandas time/date components).
• obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time
bounds based on the dtype of time_values.
Returns
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
_create_monthly_time_bounds(timesteps, obj_type, end_of_month=False)
Creates time bounds for each timestep with the start and end of the month.
Bounds for each timestep correspond to 00:00:00 on the first of the month and 00:00:00 on the first of the
subsequent month.
Parameters
• timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or
pd.Timestamp (casted from np.datetime64[ns] to support pandas time/date components).
• obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time
bounds based on the dtype of time_values.
• end_of_month (bool, optional) – Flag to note that the timepoint is saved at the end of the
monthly interval (see Note), by default False.
Returns
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
Note: Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00 for the time interval Jan.
1 00:00 - Feb. 1 00:00. Since this function determines the month and year from the time vector, the
bounds will be set incorrectly if the timepoint is set to the end of the time interval. For these cases, set
end_of_month=True.
References
_create_daily_time_bounds(timesteps, obj_type, freq=1)
Creates time bounds for each timestep with the start and end of the day, or of sub-daily intervals when freq > 1 (for example, freq=8 yields the 3-hourly bounds below):
[
["01/01/2000 00:00", "01/01/2000 03:00"],
["01/01/2000 03:00", "01/01/2000 06:00"],
...
["01/01/2000 21:00", "02/01/2000 00:00"],
]
Parameters
• timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or
pd.Timestamp (casted from np.datetime64[ns] to support pandas time/date components).
• obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time
bounds based on the dtype of time_values.
• freq ({1, 2, 3, 4, 6, 8, 12, 24}, optional) – Number of timepoints per day, by de-
fault 1. If greater than 1, sub-daily bounds are created.
– freq=1 is daily (default)
– freq=2 is twice daily
– freq=4 is 6-hourly
– freq=8 is 3-hourly
– freq=12 is 2-hourly
– freq=24 is hourly
4 https://stackoverflow.com/a/4131114
Returns
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
Raises
ValueError – If an incorrect freq argument is passed. Should be 1, 2, 3, 4, 6, 8, 12, or 24.
Notes
References
_validate_axis_arg(axis)
xcdat.spatial.SpatialAccessor
class xcdat.spatial.SpatialAccessor(dataset)
An accessor class that provides spatial attributes and methods on xarray Datasets through the .spatial attribute.
Examples
>>> ds = xcdat.open_dataset("/path/to/file")
>>>
>>> ds.spatial.<attribute>
>>> ds.spatial.<method>
>>> ds.spatial.<property>
Parameters
dataset (xr.Dataset) – A Dataset object.
__init__(dataset)
Methods
__init__(dataset)
average(data_var[, axis, weights, ...]) Calculates the spatial average for a rectilinear grid over an optionally specified regional domain.
get_weights(axis[, lat_bounds, lon_bounds, ...]) Get area weights for specified axis keys and an optional target domain.
5 https://github.com/CDAT/cdutil/blob/master/cdutil/times.py#L1093
Examples
>>> ds.lat.attrs["axis"]
>>> Y
>>>
>>> ds.lon.attrs["axis"]
>>> X
>>> ds.spatial.average(...)
>>> # The shape of the weights must align with the data var.
>>> self.weights = xr.DataArray(
>>> data=np.ones((4, 4)),
>>> coords={"lat": self.ds.lat, "lon": self.ds.lon},
>>> dims=["lat", "lon"],
>>> )
>>>
>>> ts_global = ds.spatial.average("tas", axis=["X", "Y"],
>>> weights=weights)["tas"]
Returns
xr.DataArray – A DataArray containing the region weights to use during averaging.
weights are 1-D and correspond to the specified axes (axis) in the region.
Notes
This method was developed for rectilinear grids only. get_weights() recognizes and operates on latitude and longitude, but could be extended to work with other standard geophysical dimensions (e.g., time, depth, and pressure).
_validate_axis_arg(axis)
Validates that the axis dimension(s) exists in the dataset.
Parameters
axis (List[SpatialAxis]) – List of axis dimensions to average over.
Raises
• ValueError – If a key in axis is not a supported value.
• KeyError – If the dataset does not have coordinates for the axis dimension, or the axis
attribute is not set for those coordinates.
_validate_region_bounds(axis, bounds)
Validates the bounds arg based on a set of criteria.
Parameters
• axis (SpatialAxis) – The axis related to the bounds.
• bounds (RegionAxisBounds) – The axis bounds.
Raises
• TypeError – If bounds is not a tuple.
• ValueError – If bounds has 0 elements or more than 2 elements.
• TypeError – If the bounds lower bound is not a float or integer.
• TypeError – If the bounds upper bound is not a float or integer.
• ValueError – If the axis is “Y” and the bounds lower value is larger than the upper
value.
_get_longitude_weights(domain_bounds, region_bounds)
Gets weights for the longitude axis.
This method performs longitudinal processing including (in order):
1. Align the axis orientations of the domain and region bounds to (0, 360) to ensure compatibility in the
proceeding steps.
2. Handle grid cells that cross the prime meridian (e.g., [-1, 1]) by breaking such grid cells into two (e.g.,
[0, 1] and [359, 360]) to ensure alignment with the (0, 360) axis orientation. This results in a bounds
axis of length(nlon)+1. The index of the grid cell that crosses the prime meridian is returned in order
to reduce the length of weights to nlon.
3. Scale the domain down to a region (if selected).
4. Calculate weights using the domain bounds.
5. If the prime meridian grid cell exists, use this cell’s index to handle the weights vector’s increased
length as a result of the two additional grid cells. The extra weights are added to the prime meridian
grid cell and removed from the weights vector to ensure the lengths of the weights and its corresponding
domain remain in alignment.
Parameters
• domain_bounds (xr.DataArray) – The array of bounds for the longitude domain.
• region_bounds (Optional[np.ndarray]) – The array of bounds for longitude regional
selection.
Returns
xr.DataArray – The longitude axis weights.
Raises
ValueError – If the there are multiple instances in which the domain_bounds[:, 0] > do-
main_bounds[:, 1]
_get_latitude_weights(domain_bounds, region_bounds)
Gets weights for the latitude axis.
This method scales the domain to a region (if selected). It also computes the area between two lines of latitude as the difference of the sine of the latitude bounds.
Parameters
• domain_bounds (xr.DataArray) – The array of bounds for the latitude domain.
• region_bounds (Optional[np.ndarray]) – The array of bounds for latitude regional
selection.
Returns
xr.DataArray – The latitude axis weights.
_calculate_weights(domain_bounds)
Calculate weights for the domain.
This method takes the absolute difference between the upper and lower bound values to calculate weights.
Parameters
domain_bounds (xr.DataArray) – The array of bounds for a domain.
Returns
xr.DataArray – The weights for an axis.
_swap_lon_axis(lon, to)
Swap the longitude axis orientation.
Parameters
• lon (Union[xr.DataArray, np.ndarray]) – Longitude values to convert.
• to (Literal[180, 360]) – Axis orientation to convert to, either 180 [-180, 180) or 360 [0,
360).
Returns
Union[xr.DataArray, np.ndarray] – Converted longitude values.
Notes
This does not reorder the values in any way; it only converts the values in-place between longitude conventions [-180, 180) or [0, 360).
_scale_domain_to_region(domain_bounds, region_bounds)
Scale domain bounds to conform to a regional selection in order to calculate spatial weights.
Axis weights are determined by the difference between the upper and lower boundary. If a region is selected,
the grid cell bounds outside the selected region are adjusted using this method so that the grid cell bounds
match the selected region bounds. The effect of this adjustment is to give partial weight to grid cells that
are partially in the selected regional domain and zero weight to grid cells outside the selected domain.
Parameters
• domain_bounds (xr.DataArray) – The domain’s bounds.
• region_bounds (np.ndarray) – The region bounds that the domain bounds are scaled
down to.
Returns
xr.DataArray – Scaled dimension bounds based on regional selection.
Notes
If a lower regional selection bound exceeds the upper selection bound, this algorithm assumes that the axis
is longitude and the user is specifying a region that includes the prime meridian. The lower selection bound
should not exceed the upper bound for latitude.
_combine_weights(axis_weights)
Generically rescales axis weights for a given region.
This method creates an n-dimensional weighting array by performing matrix multiplication for a list of
specified axis keys using a dictionary of axis weights.
Parameters
axis_weights (AxisWeights) – Dictionary of axis weights, where key is axis and value is
the corresponding DataArray of weights.
Returns
xr.DataArray – A DataArray containing the region weights to use during averaging.
weights are 1-D and correspond to the specified axis keys (axis) in the region.
_validate_weights(data_var, axis)
Validates the weights arg based on a set of criteria.
This method checks for the dimensional alignment between the weights and data_var. It assumes that data_var has the same keys that are specified in axis, which has already been validated using self._validate_axis() in self.average().
Parameters
• data_var (xr.DataArray) – The data variable used for validation with user supplied
weights.
• axis (List[SpatialAxis]) – List of axes dimension(s) average over.
• weights (xr.DataArray) – A DataArray containing the region area weights for averaging.
weights must include the same spatial axis dimensions found in axis and data_var, and
the same axis dims sizes as data_var.
Raises
• KeyError – If weights does not include the latitude dimension.
• KeyError – If weights does not include the longitude dimension.
• ValueError – If the axis dimension sizes between weights and data_var are mis-
aligned.
_averager(data_var, axis)
Perform a weighted average of a data variable.
This method assumes all specified keys in axis exist in the data variable. Validation for this criterion is performed in _validate_weights().
Operations include:
• Masked (missing) data receives zero weight.
• Perform weighted average over user-specified axes/axis.
Parameters
• data_var (xr.DataArray) – Data variable inside a Dataset.
• axis (List[SpatialAxis]) – List of axis dimensions to average over.
Returns
xr.DataArray – Variable that has been reduced via a weighted average.
Notes
weights must be a DataArray and cannot contain missing values. Missing values are replaced with 0 using
weights.fillna(0).
xcdat.temporal.TemporalAccessor
class xcdat.temporal.TemporalAccessor(dataset)
An accessor class that provides temporal attributes and methods on xarray Datasets through the .temporal
attribute.
This accessor class requires the dataset’s time coordinates to be decoded as np.datetime64 or cftime.datetime objects. The dataset must also have time bounds to generate weights for weighted calculations and to infer the grouping time frequency in average() (single-snapshot average).
Examples
>>> ds = xcdat.open_dataset("/path/to/file")
>>>
>>> ds.temporal.<attribute>
>>> ds.time.attrs["axis"]
>>> T
Parameters
dataset (xr.Dataset) – A Dataset object.
__init__(dataset)
Methods
__init__(dataset)
average(data_var[, weighted, keep_weights]) Returns a Dataset with the average of a data variable and the time dimension removed.
climatology(data_var, freq[, weighted, ...]) Returns a Dataset with the climatology of a data variable.
departures(data_var, freq[, weighted, ...]) Returns a Dataset with the climatological departures (anomalies) for a data variable.
group_average(data_var, freq[, weighted, ...]) Returns a Dataset with average of a data variable by time group.
The weight of masked (missing) data is excluded when averages are taken. This is the same
as giving them a weight of 0.
• keep_weights (bool, optional) – If calculating averages using weights, keep the weights
in the final dataset output, by default False.
Returns
xr.Dataset – Dataset with the average of the data variable and the time dimension removed.
Examples
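A minimal usage sketch (assuming a dataset ds with a "tas" variable and time bounds):
>>> ds_avg = ds.temporal.average("tas")
>>> ds_avg.tas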
Returns
xr.Dataset – Dataset with the average of a data variable by time group.
Examples
>>> custom_seasons = [
>>> ["Jan", "Feb", "Mar"], # "JanFebMar"
>>> ["Apr", "May", "Jun"], # "AprMayJun"
>>> ["Jul", "Aug", "Sep"], # "JulAugSep"
>>> ["Oct", "Nov", "Dec"], # "OctNovDec"
>>> ]
>>>
>>> ds_season_custom = ds.temporal.group_average(
>>> "ts",
>>> "season",
>>> season_config={"custom_seasons": custom_seasons}
>>> )
>>> ds_season_with_djf.ts.attrs
{
'operation': 'temporal_avg',
'mode': 'average',
'freq': 'season',
'weighted': 'True',
'dec_mode': 'DJF',
'drop_incomplete_djf': 'False'
}
>>> custom_seasons = [
>>> ["Jan", "Feb", "Mar"], # "JanFebMar"
>>> ["Apr", "May", "Jun"], # "AprMayJun"
>>> ["Jul", "Aug", "Sep"], # "JulAugSep"
>>> ["Oct", "Nov", "Dec"], # "OctNovDec"
>>> ]
Returns
xr.Dataset – Dataset with the climatology of a data variable.
References
Examples
>>> custom_seasons = [
>>> ["Jan", "Feb", "Mar"], # "JanFebMar"
>>> ["Apr", "May", "Jun"], # "AprMayJun"
>>> ["Jul", "Aug", "Sep"], # "JulAugSep"
>>> ["Oct", "Nov", "Dec"], # "OctNovDec"
>>> ]
>>>
>>> ds_season_custom = ds.temporal.climatology(
>>> "ts",
>>> "season",
>>> season_config={"custom_seasons": custom_seasons}
>>> )
>>> ds_season_with_djf.ts.attrs
{
'operation': 'temporal_avg',
'mode': 'climatology',
'freq': 'season',
'weighted': 'True',
'dec_mode': 'DJF',
'drop_incomplete_djf': 'False'
}
Departures are the difference between the value for a given time interval (e.g., the January average surface air temperature) and the long-term average value for that time interval (e.g., the average surface temperature over the last 30 Januaries).
Time bounds are used for generating weights to calculate weighted climatology (refer to the weighted
parameter documentation below).
Parameters
• data_var (str) – The key of the data variable for calculating departures.
• freq (Frequency) – The frequency of time to group by.
– “season”: groups by season for the seasonal cycle departures.
– “month”: groups by month for the annual cycle departures.
– “day”: groups by (month, day) for the daily cycle departures. If the CF calendar type is "gregorian", "proleptic_gregorian", or "standard", leap days (if present) are dropped to avoid inconsistencies when calculating climatologies. Refer to [2] for more details on this implementation decision.
• weighted (bool, optional) – Calculate averages using weights, by default True.
Weights are calculated by first determining the length of time for each coordinate
point using the difference of its upper and lower bounds. The time lengths are
grouped, then each time length is divided by the total sum of the time lengths to
get the weight of each coordinate point.
The weight of masked (missing) data is excluded when averages are taken. This
is the same as giving them a weight of 0.
• keep_weights (bool, optional) – If calculating averages using weights, keep the
weights in the final dataset output, by default False.
• reference_period (Optional[Tuple[str, str]], optional) – The climatolog-
ical reference period, which is a subset of the entire time series and used for
calculating departures. This parameter accepts a tuple of strings in the format
‘yyyy-mm-dd’. For example, ('1850-01-01', '1899-12-31'). If no value is
provided, the climatological reference period will be the full period covered by
the dataset.
• season_config (SeasonConfigInput, optional) – A dictionary for “season” fre-
quency configurations. If configs for predefined seasons are passed, configs for
custom seasons are ignored and vice versa.
Configs for predefined seasons:
– “dec_mode” (Literal[“DJF”, “JFD”], by default “DJF”)
The mode for the season that includes December.
∗ “DJF”: season includes the previous year December.
∗ “JFD”: season includes the same year December.
Xarray labels the season with December as “DJF”, but it is actually
“JFD”.
– “drop_incomplete_djf” (bool, by default False)
If the “dec_mode” is “DJF”, this flag drops (True) or keeps (False) time coordinates that fall under incomplete DJF seasons. Incomplete DJF seasons include the start year Jan/Feb and the end year Dec.
Configs for custom seasons:
2 https://github.com/xCDAT/xcdat/discussions/332
>>> custom_seasons = [
>>> ["Jan", "Feb", "Mar"], # "JanFebMar"
>>> ["Apr", "May", "Jun"], # "AprMayJun"
>>> ["Jul", "Aug", "Sep"], # "JulAugSep"
>>> ["Oct", "Nov", "Dec"], # "OctNovDec"
>>> ]
Returns
xr.Dataset – The Dataset containing the departures for a data var’s climatology.
Notes
This method uses xarray’s grouped arithmetic as a shortcut for mapping over all unique labels. Grouped
arithmetic works by assigning a grouping label to each time coordinate of the observation data based
on the averaging mode and frequency. Afterwards, the corresponding climatology is removed from the
observation data at each time coordinate based on the matching labels.
Refer to3 to learn more about how xarray’s grouped arithmetic works.
References
Examples
>>> ds_depart.ts.attrs
{
'operation': 'departures',
'frequency': 'season',
'weighted': 'True',
'dec_mode': 'DJF',
'drop_incomplete_djf': 'False'
}
3 https://xarray.pydata.org/en/stable/user-guide/groupby.html#grouped-arithmetic
_form_seasons(custom_seasons)
Forms custom seasons from a nested list of months.
This method concatenates the strings in each sublist to form a flat list of custom season strings.
Parameters
custom_seasons (List[List[str]]) – List of sublists containing month strings, with
each sublist representing a custom season.
Returns
Dict[str, List[str]] – A dictionary with the keys being the custom season and the
values being the corresponding list of months.
Raises
• ValueError – If exactly 12 months are not passed in the list of custom seasons.
• ValueError – If duplicate month(s) are found in the list of custom seasons.
• ValueError – If a month string is not supported.
_preprocess_dataset(ds)
Preprocess the dataset based on averaging settings.
Preprocessing operations include:
• Drop incomplete DJF seasons (leading/trailing)
• Drop leap days
Parameters
ds (xr.Dataset) – The dataset.
Returns
xr.Dataset
_drop_incomplete_djf(dataset)
Drops incomplete DJF seasons within a continuous time series.
This method assumes that the time series is continuous and removes the leading and trailing incomplete
seasons (e.g., the first January and February of a time series that are not complete, because the December
of the previous year is missing). This method does not account for or remove missing time steps anywhere
else.
Parameters
dataset (xr.Dataset) – The dataset with some possibly incomplete DJF seasons.
Returns
xr.Dataset – The dataset with only complete DJF seasons.
_drop_leap_days(ds)
Drop leap days from time coordinates.
This method is used to drop 2/29 from leap years (if present) before calculating climatology/departures for high frequency time series data, to avoid cftime breaking (ValueError: invalid day number provided in cftime.DatetimeProlepticGregorian(1, 2, 29, 0, 0, 0, 0, has_year_zero=True)).
Parameters
ds (xr.Dataset) – The dataset.
Returns
xr.Dataset
_average(data_var, time_bounds)
Averages a data variable with the time dimension removed.
Parameters
• data_var (xr.DataArray) – The data variable.
• time_bounds (xr.DataArray) – The time bounds.
Returns
xr.DataArray – The averages for a data variable with the time dimension removed.
_group_average(data_var, time_bounds)
Averages a data variable by time group.
Parameters
• data_var (xr.DataArray) – The data variable.
• time_bounds (xr.DataArray) – The time bounds.
Returns
xr.DataArray – The data variable averaged by time group.
_get_weights(time_bounds)
Calculates weights for a data variable using time bounds.
This method gets the length of time for each coordinate point by using the difference in the upper and lower
time bounds. This approach ensures that the correct time lengths are calculated regardless of how time
coordinates are recorded (e.g., monthly, daily, hourly) and the calendar type used.
The time lengths are labeled and grouped, then each time length is divided by the total sum of the time
lengths in its group to get its corresponding weight.
The sum of the weights for each group is validated to ensure it equals 1.0.
Parameters
time_bounds (xr.DataArray) – The time bounds.
Returns
xr.DataArray – The weights based on a specified frequency.
Notes
References
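A minimal sketch of this weighting scheme with synthetic monthly bounds (not xCDAT's internal implementation):
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly time bounds for a single year (12 intervals).
bnds = pd.date_range("2000-01-01", periods=13, freq="MS")
time = xr.DataArray(bnds[:-1] + (bnds[1:] - bnds[:-1]) / 2, dims="time", name="time")

# The length of each time step is the difference of its upper and lower bound.
time_lengths = xr.DataArray(
    (bnds[1:] - bnds[:-1]) / np.timedelta64(1, "D"),  # lengths in days as floats
    dims="time",
    coords={"time": time},
)

# Each time length is divided by the total length of its group (here, the
# year), so the weights within every group sum to 1.0.
group_sums = time_lengths.groupby("time.year").sum()
weights = time_lengths.groupby("time.year") / group_sums
np.testing.assert_allclose(weights.groupby("time.year").sum().values, 1.0)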
_group_data(data_var)
Groups a data variable.
This method groups a data variable by a single datetime component for the “average” mode or labeled time
coordinates for all other modes.
Parameters
data_var (xr.DataArray) – A data variable.
Returns
DataArrayGroupBy – A data variable grouped by label.
_label_time_coords(time_coords)
Labels time coordinates with a group for grouping.
This method labels time coordinates for grouping by first extracting specific xarray datetime components from the time coordinates and storing them in a pandas DataFrame. After any necessary processing is performed on the DataFrame, it is converted to a numpy array of datetime objects. This numpy array serves as the data source for the final DataArray of labeled time coordinates.
Parameters
time_coords (xr.DataArray) – The time coordinates.
Returns
xr.DataArray – The DataArray of labeled time coordinates for grouping.
[4] https://cfconventions.org/cf-conventions/cf-conventions.html#calendar
Examples
_get_df_dt_components(time_coords)
Returns a DataFrame of xarray datetime components.
This method extracts the applicable xarray datetime components from each time coordinate based on the
averaging mode and frequency, and stores them in a DataFrame.
Additional processing is performed for the seasonal frequency, including:
• If custom seasons are used, map them to each time coordinate based on the middle month of the
custom season.
• If the season with December is “DJF”, shift Decembers over to the next year so DJF seasons are correctly grouped using the previous year’s December.
• Drop obsolete columns after processing is done.
Parameters
time_coords (xr.DataArray) – The time coordinates.
Returns
pd.DataFrame – A DataFrame of datetime components.
Notes
References
_process_season_df(df )
Processes a DataFrame of datetime components for the season frequency.
Parameters
df (pd.DataFrame) – A DataFrame of xarray datetime components.
Returns
pd.DataFrame – A DataFrame of processed xarray datetime components.
_map_months_to_custom_seasons(df )
Maps the month column in the DataFrame to a custom season.
This method maps each integer value in the “month” column to its string representation, which then maps to a custom season that is stored in the “season” column. For example, the month of 1 maps to “Jan”, and “Jan” maps to the “JanFebMar” custom season.
Parameters
df (pd.DataFrame) – The DataFrame of xarray datetime components.
Returns
pd.DataFrame – The DataFrame of xarray datetime coordinates, with each row
mapped to a custom season.
_shift_decembers(df_season)
Shifts Decembers over to the next year for “DJF” seasons in-place.
For “DJF” seasons, Decembers must be shifted over to the next year in order for the xarray groupby operation to correctly label and group the corresponding time coordinates. If they aren’t shifted over, grouping is incorrectly performed with the native xarray “DJF” season (which is actually “JFD”).
Parameters
df_season (pd.DataFrame) – The DataFrame of xarray datetime components pro-
duced using the “season” frequency.
Returns
pd.DataFrame – The DataFrame of xarray datetime components with Decembers
shifted over to the next year.
Examples
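A minimal illustration of the shift on a hypothetical DataFrame of datetime components (not xCDAT's internal code):
import pandas as pd

# Datetime components for Dec 2000, Jan 2001, and Feb 2001, all belonging
# to the same "DJF" season.
df = pd.DataFrame(
    {"year": [2000, 2001, 2001], "month": [12, 1, 2], "season": ["DJF"] * 3}
)

# Shift December to the next year so the group label (2001, "DJF") spans
# Dec 2000 + Jan/Feb 2001 instead of the native "JFD" grouping.
df.loc[df["month"] == 12, "year"] += 1
print(df["year"].tolist())  # [2001, 2001, 2001]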
_map_seasons_to_mid_months(df )
Maps the season column values to the integer of its middle month.
DateTime objects don’t support storing seasons as strings, so the middle month is used to represent the season. For example, for the season “DJF”, the middle month “J” (January) is mapped to the integer value 1.
The middle month of a custom season is extracted using the ceiling of the middle index from its list of
months. For example, for the custom season “FebMarAprMay” with the list of months [“Feb”, “Mar”,
“Apr”, “May”], the index 3 is used to get the month “Apr”. “Apr” is then mapped to the integer value 4.
After mapping the season to its month, the “season” column is renamed to “month”.
Parameters
df (pd.DataFrame) – The dataframe of datetime components, including a “season”
column.
Returns
pd.DataFrame – The dataframe of datetime components, including a “month” column.
_drop_obsolete_columns(df_season)
Drops obsolete columns from the DataFrame of xarray datetime components.
For the “season” frequency, processing is required on the DataFrame of xarray datetime components,
such as mapping custom seasons based on the month. Additional datetime component values must be
included as DataFrame columns, which become obsolete after processing is done. The obsolete columns
are dropped from the DataFrame before grouping time coordinates.
Parameters
df_season (pd.DataFrame) – The DataFrame of time coordinates for the “season”
frequency with obsolete columns.
Returns
pd.DataFrame – The DataFrame of time coordinates for the “season” frequency with
obsolete columns dropped.
_convert_df_to_dt(df )
Converts a DataFrame of datetime components to cftime datetime objects.
cftime datetime objects require at least a year, month, and day value. However, some modes and time frequencies don’t require a year, month, and/or day for grouping. In these cases, a default value of 1 is used to meet this datetime requirement.
Parameters
df (pd.DataFrame) – The DataFrame of xarray datetime components.
Returns
np.ndarray – A numpy ndarray of cftime.datetime objects.
Notes
Refer to [6] and [7] for more information on the Timestamp-valid range. We use cftime.datetime objects to avoid these time range issues.
References
_keep_weights(ds)
Keep the weights in the dataset.
Parameters
ds (xr.Dataset) – The dataset.
Returns
xr.Dataset – The dataset with the weights used for averaging.
[6] https://docs.xarray.dev/en/stable/user-guide/weather-climate.html#non-standard-calendars-and-dates-outside-the-timestamp-valid-range
[7] https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamp-limitations
_add_operation_attrs(data_var)
Adds attributes to the data variable describing the operation. These attributes distinguish a data variable
that has been operated on from its original state. The attributes in netCDF4 files do not support booleans
or nested dictionaries, so booleans are converted to strings and nested dictionaries are unpacked.
Parameters
data_var (xr.DataArray) – The data variable.
Returns
xr.DataArray – The data variable with temporal averaging attributes.
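A sketch of the attribute conversion described above (the attribute values are assumptions, not the accessor's internal code):
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros(3), dims="time", name="ts")
season_config = {"dec_mode": "DJF", "drop_incomplete_djf": False}

da.attrs.update(
    {
        "operation": "temporal_avg",
        "mode": "climatology",
        "freq": "season",
        "weighted": str(True),  # booleans are stored as strings
        # Nested dicts are unpacked into top-level attributes.
        "dec_mode": season_config["dec_mode"],
        "drop_incomplete_djf": str(season_config["drop_incomplete_djf"]),
    }
)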
xcdat.regridder.accessor.RegridderAccessor
class xcdat.regridder.accessor.RegridderAccessor(dataset)
An accessor class that provides regridding attributes and methods for xarray Datasets through the .regridder
attribute.
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
>>>
>>> ds.regridder.<attribute>
>>> ds.regridder.<method>
>>> ds.regridder.<property>
Parameters
dataset (xr.Dataset) – The Dataset to attach this accessor.
__init__(dataset)
Methods
__init__(dataset)
Attributes
grid Extract the X, Y, and Z axes from the Dataset and return a new xr.Dataset.
_ds
property grid
Extract the X, Y, and Z axes from the Dataset and return a new xr.Dataset.
Returns
xr.Dataset – Containing grid axes.
Raises
• ValueError – If axis dimension coordinate variable is not correctly identified.
• ValueError – If axis has multiple dimensions (only one is expected).
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
_get_axis_data(name)
Examples
References
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
Parameters
• data_var (str) – Name of the variable to transform.
• output_grid (xr.Dataset) – Grid to transform data_var to.
• tool (str) – Name of the tool to use.
• **options (Any) – These options are passed directly to the tool. See specific
regridder for available options.
Returns
xr.Dataset – With the data_var transformed to the output_grid.
Raises
ValueError – If tool is not supported.
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
xcdat.regridder.regrid2.Regrid2Regridder
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
Regrid data:
Methods
vertical(data_var, ds)
Placeholder for base class.
horizontal(data_var, ds)
See documentation in xcdat.regridder.regrid2.Regrid2Regridder()
_output_axis_sizes(da)
Maps axes to output array sizes.
Parameters
da (xr.DataArray) – Data array containing variable to be regridded.
Returns
Dict – Mapping of axis name e.g. (“X”, “Y”, etc) to output sizes.
_regrid(input_data, axis_sizes, ordered_axis_names)
Applies regridding to input data.
Parameters
• input_data (np.ndarray) – Input multi-dimensional array on source grid.
• axis_sizes (Dict[str, int]) – Mapping of axis name e.g. (“X”, “Y”, etc) to
output sizes.
• ordered_axis_names (List[str]) – List of axis names in the order of the dimensions of input_data.
Returns
np.ndarray – Multi-dimensional array on destination grid.
_base_put_indexes(axis_sizes)
Calculates the base indexes to place cell (0, 0).
Example: For a 3D array (time, lat, lon) with shape (2, 2, 2), the offsets to place cell (0, 0) in each time step would be [0, 4].
For a 4D array (time, plev, lat, lon) with shape (2, 2, 2, 2), the offsets to place cell (0, 0) in each (time, plev) slice would be [0, 4, 8, 12].
Parameters
axis_sizes (Dict[str, int]) – Mapping of axis name e.g. (“X”, “Y”, etc) to output
sizes.
Returns
np.ndarray – Array containing the base indexes to be used in np.put operations.
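A quick check of the 4D offsets above (a sketch, not the regrid2 implementation):
import numpy as np

# Axis sizes for a hypothetical 4D (time, plev, lat, lon) array.
axis_sizes = {"T": 2, "Z": 2, "Y": 2, "X": 2}

plane_size = axis_sizes["Y"] * axis_sizes["X"]  # cells per lat-lon plane
num_planes = axis_sizes["T"] * axis_sizes["Z"]  # lat-lon planes in the flat array
base_indexes = np.arange(num_planes) * plane_size
print(base_indexes)  # [ 0  4  8 12]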
_create_output_dataset(input_ds, data_var, output_data, axis_variable_name_map,
ordered_axis_names)
Creates the output Dataset containing the new variable on the destination grid.
Parameters
• input_ds (xr.Dataset) – Input dataset containing coordinates and bounds for
unmodified axes.
• data_var (str) – The name of the regridded variable.
• output_data (np.ndarray) – Output data array.
• axis_variable_name_map (Dict[str, str]) – Map of axis name e.g. (“X”,
“Y”, etc) to variable name e.g. (“lon”, “lat”, etc).
• ordered_axis_names (List[str]) – List of axis names in the order observed for
output_data.
Returns
xr.Dataset – Dataset containing the variable on the destination grid.
xcdat.regridder.xesmf.XESMFRegridder
The **options arguments are additional values passed to the xesmf.XESMFRegridder constructor. A description of these arguments can be found in xESMF’s documentation.
Parameters
• input_grid (xr.Dataset) – Contains source grid coordinates.
• output_grid (xr.Dataset) – Contains destination grid coordinates.
• method (str) – The regridding method to apply, defaults to “bilinear”.
• periodic (bool) – Treat longitude as periodic, used for global grids.
• extrap_method (Optional[str]) – Extrapolation method, useful when moving
from a fine to coarse grid.
• extrap_dist_exponent (Optional[float]) – The exponent to raise the distance
to when calculating weights for the extrapolation method.
• extrap_num_src_pnts (Optional[int]) – The number of source points to use
for the extrapolation methods that use more than one source point.
• ignore_degenerate (bool) – Ignore degenerate cells when checking the in-
put_grid for errors. If set False, a degenerate cell produces an error.
This only applies to “conservative” and “conservative_normed” regridding meth-
ods.
• **options (Any) – Additional arguments passed to the underlying xesmf.
XESMFRegridder constructor.
Raises
• KeyError – If data variable does not exist in the Dataset.
• ValueError – If method is not valid.
• ValueError – If extrap_method is not valid.
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
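A sketch of typical usage (the create_uniform_grid call and the "ts" variable are assumptions):
>>> output_grid = xcdat.create_uniform_grid(-90, 90, 4.0, -180, 180, 5.0)
>>> ds_bilinear = ds.regridder.horizontal(
>>>     "ts", output_grid, tool="xesmf", method="bilinear"
>>> )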
Methods
vertical(data_var, ds)
Placeholder for base class.
horizontal(data_var, ds)
See documentation in xcdat.regridder.xesmf.XESMFRegridder()
xcdat.regridder.xgcm.XGCMRegridder
• method (str) – The regridding method to apply. Options include:
– linear (default)
– log
– conservative
• target_data (Optional[Union[str, xr.DataArray]]) – Data to transform tar-
get data onto, either the key of a variable in the input dataset or an xr.DataArray,
by default None.
• grid_positions (Optional[Dict[str, str]]) – Mapping of dimension posi-
tions, by default None. If None then an attempt is made to derive this argument.
• periodic (Optional[bool]) – Whether the grid is periodic, by default False.
• extra_init_options (Optional[Dict[str, Any]]) – Extra options passed to the
xgcm.Grid constructor, by default None.
• options (Optional[Dict[str, Any]]) – Extra options passed to the xgcm.
Grid.transform method.
Raises
• KeyError – If data variable does not exist in the Dataset.
• ValueError – If method is not valid.
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
Methods
horizontal(data_var, ds)
Placeholder for base class.
vertical(data_var, ds)
See documentation in xcdat.regridder.xgcm.XGCMRegridder()
_get_grid_positions()
Attributes
xarray.Dataset.bounds.map
Dataset.bounds.map
Returns a map of axis and coordinates keys to their bounds.
The dictionary provides all valid CF-compliant keys for the axes and coordinates. For example, latitude will include keys for “lat”, “latitude”, and “Y”.
Returns
Dict[str, Optional[xr.DataArray]] – Dictionary mapping axis and coordinate keys to
their bounds.
xarray.Dataset.bounds.keys
Dataset.bounds.keys
Returns a list of keys for the bounds data variables in the Dataset.
Returns
List[str] – A list of sorted bounds data variable keys.
xarray.Dataset.regridder.grid
Dataset.regridder.grid
Extract the X, Y, and Z axes from the Dataset and return a new xr.Dataset.
Returns
xr.Dataset – Containing grid axes.
Raises
• ValueError – If axis dimension coordinate variable is not correctly identified.
• ValueError – If axis has multiple dimensions (only one is expected).
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
Methods
xarray.Dataset.bounds.add_bounds
Dataset.bounds.add_bounds(axis)
Add bounds for an axis using its coordinates as midpoints.
This method loops over the axis’s coordinate variables and attempts to add bounds for each of them if they don’t
exist. Each coordinate point is the midpoint between their lower and upper bounds.
To add bounds for an axis, its coordinates must meet the following criteria, otherwise an error is thrown:
1. Axis is either “X”, “Y”, “T”, or “Z”
2. Coordinates are single dimensional, not multidimensional
3. Coordinates are a length > 1 (not singleton)
4. Bounds must not already exist
• Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.
attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
Parameters
axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
Returns
xr.Dataset – The dataset with bounds added.
Raises
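Examples
A sketch, assuming the dataset's "Y" axis coordinates lack bounds:
>>> import xcdat
>>>
>>> ds = xcdat.open_dataset("path/to/file.nc")
>>> ds = ds.bounds.add_bounds("Y")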
xarray.Dataset.bounds.add_time_bounds
xarray.Dataset.bounds.get_bounds
Dataset.bounds.get_bounds(axis, var_key=None)
Gets coordinate bounds.
Parameters
• axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
• var_key (Optional[str]) – The key of the coordinate or data variable to get axis
bounds for. This parameter is useful if you only want the single bounds DataArray
related to the axis on the variable (e.g., “tas” has a “lat” dimension and you want
“lat_bnds”).
Returns
Union[xr.Dataset, xr.DataArray] – A Dataset of N bounds variables, or a single bounds
variable DataArray.
Raises
• ValueError – If an incorrect axis argument is passed.
• KeyError – If bounds were not found for the specific axis.
xarray.Dataset.bounds.add_missing_bounds
Dataset.bounds.add_missing_bounds(axes)
Adds missing coordinate bounds for supported axes in the Dataset.
This function loops through the Dataset’s axes and attempts to add bounds to their coordinates if they don’t exist. “X”, “Y”, and “Z” axes bounds are the midpoints between coordinates. “T” axis bounds are based on the time frequency of the coordinates.
An axis must meet the following criteria to add bounds for it, otherwise they are ignored:
1. Axis is either “X”, “Y”, “T”, or “Z”
2. Coordinates are a single dimension, not multidimensional
3. Coordinates are a length > 1 (not singleton)
4. Bounds must not already exist
• Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.
attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
5. For the “T” axis, its coordinates must be composed of datetime-like objects (np.datetime64 or cftime).
Parameters
axes (List[str]) – List of CF axes that the function should operate on. Options include “X”, “Y”, “T”, or “Z”.
Returns
xr.Dataset
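Examples
A sketch of adding any missing bounds for the X, Y, and T axes:
>>> import xcdat
>>>
>>> ds = xcdat.open_dataset("path/to/file.nc")
>>> ds = ds.bounds.add_missing_bounds(axes=["X", "Y", "T"])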
xarray.Dataset.spatial.average
Examples
>>> ds.lat.attrs["axis"]
'Y'
>>>
>>> ds.lon.attrs["axis"]
'X'
>>> ds.spatial.average(...)
>>> # The shape of the weights must align with the data var.
>>> weights = xr.DataArray(
>>>     data=np.ones((4, 4)),
>>>     coords={"lat": ds.lat, "lon": ds.lon},
>>>     dims=["lat", "lon"],
>>> )
>>>
>>> ts_global = ds.spatial.average("tas", axis=["X", "Y"],
>>>     weights=weights)["tas"]
xarray.Dataset.temporal.average
Examples
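A minimal sketch of a weighted average over the entire time dimension (the file path and "ts" variable are assumptions):
>>> import xcdat
>>>
>>> ds = xcdat.open_dataset("path/to/file.nc")
>>> ds_avg = ds.temporal.average("ts", weighted=True)
>>> ds_avg.ts  # the "time" dimension is removed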
xarray.Dataset.temporal.group_average
Examples
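For example, a seasonal group average of the following form (a sketch; the "ts" variable and configuration are assumptions) yields attributes like those shown below:
>>> ds_season_with_djf = ds.temporal.group_average(
>>>     "ts",
>>>     freq="season",
>>>     season_config={"dec_mode": "DJF", "drop_incomplete_djf": False},
>>> )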
>>> ds_season_with_djf.ts.attrs
{
'operation': 'temporal_avg',
'mode': 'average',
'freq': 'season',
'weighted': 'True',
'dec_mode': 'DJF',
'drop_incomplete_djf': 'False'
}
xarray.Dataset.temporal.climatology
References
Examples
>>> custom_seasons = [
>>> ["Jan", "Feb", "Mar"], # "JanFebMar"
>>> ["Apr", "May", "Jun"], # "AprMayJun"
>>> ["Jul", "Aug", "Sep"], # "JulAugSep"
>>> ["Oct", "Nov", "Dec"], # "OctNovDec"
>>> ]
>>>
>>> ds_season_custom = ds.temporal.climatology(
>>> "ts",
>>> "season",
>>> season_config={"custom_seasons": custom_seasons}
>>> )
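A DJF-based seasonal climatology (a sketch; the configuration is an assumption) corresponds to the attributes shown below:
>>> ds_season_with_djf = ds.temporal.climatology(
>>>     "ts",
>>>     "season",
>>>     season_config={"dec_mode": "DJF", "drop_incomplete_djf": False},
>>> )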
>>> ds_season_with_djf.ts.attrs
{
'operation': 'temporal_avg',
'mode': 'climatology',
'freq': 'season',
'weighted': 'True',
'dec_mode': 'DJF',
'drop_incomplete_djf': 'False'
}
xarray.Dataset.temporal.departures
Notes
This method uses xarray’s grouped arithmetic as a shortcut for mapping over all unique labels. Grouped arith-
metic works by assigning a grouping label to each time coordinate of the observation data based on the averaging
mode and frequency. Afterwards, the corresponding climatology is removed from the observation data at each
time coordinate based on the matching labels.
Refer to [3] to learn more about how xarray’s grouped arithmetic works.
References
Examples
[3] https://xarray.pydata.org/en/stable/user-guide/groupby.html#grouped-arithmetic
>>> ds_depart.ts.attrs
{
'operation': 'departures',
'frequency': 'season',
'weighted': 'True',
'dec_mode': 'DJF',
'drop_incomplete_djf': 'False'
}
xarray.Dataset.regridder.horizontal
References
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
xarray.Dataset.regridder.vertical
Examples
Import xCDAT:
>>> import xcdat
Open a dataset:
>>> ds = xcdat.open_dataset("...")
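A sketch of vertical regridding with the xgcm tool (the "o2" variable, target levels, and grid-creation calls are assumptions):
>>> output_grid = xcdat.create_grid(
>>>     z=xcdat.create_axis("lev", [500.0, 1500.0, 2500.0])
>>> )
>>> ds_vert = ds.regridder.vertical("o2", output_grid, tool="xgcm", method="linear")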
The table below maps supported CDAT operations to their equivalent xCDAT APIs. It is especially useful for those who are transitioning from CDAT to xarray/xCDAT.
10.6 History
This minor version update consists of new features, including vertical regridding (an extension of xgcm), functions for producing accurate time bounds, and improved usability of the create_grid API. It also includes bug fixes to preserve attributes when using the regrid2 horizontal regridder and to fix multi-file dataset spatial averaging orientation and weights when longitude bounds span the prime meridian.
10.6.2 Features
10.6.3 Deprecation
Horizontal Regridding
• Improves error when axis is missing/incorrect attributes with regrid2 by Jason Boutte in https://github.com/
xCDAT/xcdat/pull/481
• Fixes preserving ds/da attributes in the regrid2 module by Jason Boutte in https://github.com/xCDAT/xcdat/pull/
468
• Fixes duplicate parameter in regrid2 docs by Jason Boutte in https://github.com/xCDAT/xcdat/pull/532
Spatial Averaging
• Fix multi-file dataset spatial average orientation and weights when lon bounds span prime meridian by Stephen
Po-Chedley in https://github.com/xCDAT/xcdat/pull/495
10.6.5 Documentation
• Typo fix for climatology code example in docs by Jiwoo Lee in https://github.com/xCDAT/xcdat/pull/491
• Update documentation in regrid2.py by Jiwoo Lee in https://github.com/xCDAT/xcdat/pull/509
• Add more fields to GH Discussions question form by Tom Vo in https://github.com/xCDAT/xcdat/pull/480
• Add Q&A GH discussions template by Tom Vo in https://github.com/xCDAT/xcdat/pull/479
• Update FAQs question covering datasets with conflicting bounds by Tom Vo in https://github.com/xCDAT/xcdat/
pull/474
• Add Google Groups mailing list to docs by Tom Vo in https://github.com/xCDAT/xcdat/pull/452
• Fix README link to CODE-OF-CONDUCT.rst by Tom Vo in https://github.com/xCDAT/xcdat/pull/444
• Replace LLNL E3SM License with xCDAT License by Tom Vo in https://github.com/xCDAT/xcdat/pull/443
• Update getting started and HPC documentation by Tom Vo in https://github.com/xCDAT/xcdat/pull/553
10.6.6 DevOps
• Fix Python deprecation comment in conda env yml files by Tom Vo in https://github.com/xCDAT/xcdat/pull/514
• Simplify conda environments and move configs to pyproject.toml by Tom Vo in https://github.com/xCDAT/
xcdat/pull/512
• Update DevOps to cache conda and fix attributes not being preserved with xarray > 2023.3.0 by Tom Vo in
https://github.com/xCDAT/xcdat/pull/465
• Update GH Actions to use mamba by Tom Vo in https://github.com/xCDAT/xcdat/pull/450
• Update constraint cf_xarray >=0.7.3 to workaround xarray import issue by Tom Vo in https://github.com/
xCDAT/xcdat/pull/547
Full Changelog: https://github.com/xCDAT/xcdat/compare/v0.5.0...v0.6.0
This long-awaited minor release includes feature updates to support an optional user-specified climatology reference
period when calculating climatologies and departures, support for opening datasets using the directory key of the
legacy CDAT Climate Data Markup Language (CDML) format (an XML dialect), and improved support for using
custom time coordinates in temporal APIs.
This release also includes a bug fix for singleton coordinates breaking the swap_lon_axis() function. Additionally,
Jupyter Notebooks for presentations and demos have been added to the documentation.
Features
• Update departures and climatology APIs with reference period by Tom Vo in https://github.com/xCDAT/xcdat/
pull/417
• Wrap open_dataset and open_mfdataset to flexibly open datasets by Stephen Po-Chedley in https://github.com/
xCDAT/xcdat/pull/385
• Add better support for using custom time coordinates in temporal APIs by Tom Vo in https://github.com/xCDAT/
xcdat/pull/415
Bug Fixes
Documentation
DevOps
This minor release includes a feature update to support datasets that have N dimensions mapped to N coordinates to
represent an axis. This means xcdat APIs are able to intelligently select which axis’s coordinates and bounds to work
with if multiple are present within the dataset. Decoding time is now a lazy operation, leading to significant upfront
runtime improvements when opening datasets with decode_times=True.
A new notebook called “A Gentle Introduction to xCDAT” was added to the documentation gallery to help guide new
xarray/xcdat users. xCDAT is now hosted on Zenodo with a DOI for citations.
There are various bug fixes for bounds, naming of spatial weights, and a missing flag for xesmf that broke curvilinear
regridding.
Features
• Support for N axis dimensions mapped to N coordinates by Tom Vo and Stephen Po-Chedley in https://github.
com/xCDAT/xcdat/pull/343
– Rename get_axis_coord() to get_dim_coords() and get_axis_dim() to get_dim_keys()
– Update spatial and temporal accessor class methods to refer to the dimension coordinate variable on the
data_var being operated on, rather than the parent dataset
• Decoding times (decode_time()) is now a lazy operation, which results in significant runtime improvements
by Tom Vo in https://github.com/xCDAT/xcdat/pull/343
Bug Fixes
• Fix add_bounds() not ignoring 0-dim singleton coords by Tom Vo and Stephen Po-Chedley in https://github.
com/xCDAT/xcdat/pull/343
• Fix name of spatial weights with singleton coord by Tom Vo in https://github.com/xCDAT/xcdat/pull/379
• Fixes xesmf flag that was missing which broke curvilinear regridding by Jason Boutte and Stephen Po-Chedley
in https://github.com/xCDAT/xcdat/pull/374
Documentation
DevOps
This patch release fixes a bug where calculating daily climatologies/departures for specific CF calendar types that have
leap days breaks when using cftime. It also includes documentation updates.
Bug Fixes
• Drop leap days based on CF calendar type to calculate daily climatologies and departures by Tom Vo and Jiwoo
Lee in https://github.com/xCDAT/xcdat/pull/350
– Affected CF calendar types include gregorian, proleptic_gregorian, and standard
– Since any implementation for handling leap days is inherently opinionated, we chose the route of least complexity and overhead (dropping the leap days before performing calculations). We may revisit adding more options for users to determine how leap days are handled, based on how valuable/desired that is.
Documentation
This patch release focuses on bug fixes related to temporal averaging, spatial averaging, and regridding. xesmf is
now an optional dependency because it is not supported on osx-arm64 and windows at this time. There is a new
documentation page for HPC/Jupyter guidance.
Bug Fixes
Temporal Average
• Fix multiple temporal avg calls on same dataset breaking by Tom Vo in https://github.com/xCDAT/xcdat/pull/
329
• Fix incorrect results for group averaging with missing data by Stephen Po-Chedley in https://github.com/xCDAT/
xcdat/pull/320
Spatial Average
• Fix spatial bugs: handle datasets with domain bounds out of order and zonal averaging by Stephen Po-Chedley
in https://github.com/xCDAT/xcdat/pull/340
Horizontal Regridding
Documentation
Dependencies
This patch release focuses on bug fixes, including handling bounds generation with singleton coordinates and the use of cftime to represent temporal averaging outputs and non-CF compliant time coordinates (to avoid the pandas Timestamp limitations).
Bug Fixes
Bounds
• Ignore singleton coordinates without dims when attempting to generate bounds by Stephen Po-Chedley in https:
//github.com/xCDAT/xcdat/pull/281
• Modify logic to not throw error for singleton coordinates (with no bounds) by Stephen Po-Chedley in https:
//github.com/xCDAT/xcdat/pull/313
• Fix TypeError with Dask Arrays from multifile datasets in temporal averaging by Stephen Po-Chedley in https:
//github.com/xCDAT/xcdat/pull/291
• Use cftime to avoid out of bounds datetime when decoding non-CF time coordinates by Stephen Po-Chedley
and Tom Vo in https://github.com/xCDAT/xcdat/pull/283
• Use cftime for temporal averaging operations to avoid out of bounds datetime by Stephen Po-Chedley and
Tom Vo in https://github.com/xCDAT/xcdat/pull/302
• Fix open_mfdataset() dropping time encoding attrs by Tom Vo in https://github.com/xCDAT/xcdat/pull/309
• Replace “time” references with self._dim in class TemporalAccessor by Tom Vo in https://github.com/
xCDAT/xcdat/pull/312
Internal Changes
Documentation
DevOps
New Contributors
New Features
Bug Fixes
• Fix add_bounds() breaking when time coords are cftime objects by Tom Vo in https://github.com/xCDAT/
xcdat/pull/241
• Fix parsing of custom seasons for departures by Tom Vo in https://github.com/xCDAT/xcdat/pull/246
• Update swap_lon_axis to ignore same systems, which was causing odd behaviors for (0, 360) by Tom Vo in
https://github.com/xCDAT/xcdat/pull/257
Breaking Changes
Documentation
Internal Changes
• Update time coordinates object type from MultiIndex to datetime/cftime for TemporalAccessor reduction
methods and add convenience methods by Tom Vo in https://github.com/xCDAT/xcdat/pull/221
• Extract method _postprocess_dataset() and make bounds generation optional by Tom Vo in https://github.
com/xCDAT/xcdat/pull/223
• Update add_bounds kwarg default value to True by Tom Vo in https://github.com/xCDAT/xcdat/pull/230
• Update decode_non_cf_time to return input dataset if the time “units” attr can’t be split into unit and reference
date by Stephen Po-Chedley in https://github.com/xCDAT/xcdat/pull/263
Full Changelog: https://github.com/xCDAT/xcdat/compare/v0.2.0...v0.3.0
New Features
• Add support for spatial averaging parallelism via Dask by Stephen Po-Chedley in https://github.com/xCDAT/
xcdat/pull/132
• Refactor spatial averaging with more robust handling of longitude spanning prime meridian by Stephen Po-
Chedley in https://github.com/xCDAT/xcdat/pull/152
• Update xcdat.open_mfdataset time decoding logic by Stephen Po-Chedley in https://github.com/xCDAT/xcdat/
pull/161
• Add function to swap dataset longitude axis orientation by Tom Vo in https://github.com/xCDAT/xcdat/pull/145
• Add utility functions by Tom Vo in https://github.com/xCDAT/xcdat/pull/205
• Add temporal utilities and averaging functionalities by Tom Vo in https://github.com/xCDAT/xcdat/pull/107
Bug Fixes
• Add exception for coords of len <= 1 or multidimensional coords in fill_missing_bounds() by Tom Vo in
https://github.com/xCDAT/xcdat/pull/141
• Update open_mfdataset() to avoid data vars dim concatenation by Tom Vo in https://github.com/xCDAT/
xcdat/pull/143
• Fix indexing on axis keys using generic map (related to spatial averaging) by Tom Vo in https://github.com/
xCDAT/xcdat/pull/172
Breaking Changes
• Rename accessor classes and methods for API consistency by Tom Vo in https://github.com/xCDAT/xcdat/pull/
142
• Rename fill_missing_bounds() to add_missing_bounds() by Tom Vo in https://github.com/xCDAT/
xcdat/pull/157
• Remove data variable inference API by Tom Vo in https://github.com/xCDAT/xcdat/pull/196
• Rename spatial file and class by Tom Vo in https://github.com/xCDAT/xcdat/pull/207
Documentation
Deprecations
Internal Changes
DevOps
New Contributors
New Features
• Add geospatial averaging API through DatasetSpatialAverageAccessor class by Stephen Po-Chedley and
Tom Vo in #87
– Does not support parallelism with Dask yet
• Add wrappers for xarray’s open_dataset and open_mfdataset to apply common operations such as:
– If the dataset has a time dimension, decode both CF and non-CF time units
– Generate bounds for supported coordinates if they don’t exist
– Option to limit the Dataset to a single regular (non-bounds) data variable while retaining any bounds data
variables
• Add DatasetBoundsAccessor class for filling missing bounds, returning mapping of bounds, returning names
of bounds keys
• Add BoundsAccessor class for accessing xcdat public methods from other accessor classes
– This will probably be the API endpoint for most users, unless they prefer importing the individual accessor classes
• Add ability to infer data variables in xcdat APIs based on the “xcdat_infer” Dataset attr
– This attr is set in xcdat.open_dataset(), xcdat.open_mfdataset(), or manually
• Utilizes cf_xarray package (https://github.com/xarray-contrib/cf-xarray)
Documentation
DevOps
xcdat supports datasets with structured grids that follow the CF convention, but will also strive to support datasets
with common non-CF compliant metadata (e.g., time units in “months since . . . ” or “years since . . . ”).
xCDAT aims to be a generalizable package that is compatible with structured grids that are CF-compliant (e.g.,
CMIP6). xCDAT’s horizontal regridder supports grids that are supported by Regrid2 and xESMF (curvilinear and
rectilinear).
xcdat leverages cf_xarray to interpret CF attributes on xarray objects. xcdat methods and functions usually accept an
axis argument (e.g., ds.temporal.average("ts")). This argument is internally mapped to cf_xarray mapping
tables that interpret the CF attributes.
– For example, if the latitude coordinate variable has the attribute bounds: "lat_bnds", its bounds are mapped to the lat_bnds data variable.
– Refer to cf_xarray Bounds Variables page for more information.
xCDAT generates bounds by using coordinate points as the midpoint between their lower and upper bounds.
Does xCDAT support generating bounds for multiple axis coordinate systems in the same dataset?
For example, there are two sets of coordinates called “lat” and “latitude” in the dataset.
Yes, xCDAT can generate bounds for axis coordinates if they are “dimension coordinates” (coordinate variables in CF
terminology) and have the required CF metadata. “Non-dimension coordinates” (auxiliary coordinate variables in CF
terminology) are ignored.
Visit Xarray’s documentation page on Coordinates for more info on “dimension coordinates” vs. “non-dimension
coordinates”.
The units attribute must be in the CF compliant format "<units> since <reference_date>". For example, "days
since 1990-01-01".
Supported CF compliant units include day, hour, minute, second, which is inherited from xarray and cftime.
Supported non-CF compliant units include year and month, which xcdat is able to parse. Note that the plural forms of these units are also accepted.
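For example, non-CF compliant time units can be decoded explicitly (a sketch; the file path is a placeholder):
>>> import xarray as xr
>>> import xcdat
>>>
>>> ds = xr.open_dataset("path/to/file.nc", decode_times=False)
>>> ds = xcdat.decode_time(ds)  # parses e.g. "months since ..." into cftime objects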
References:
• https://cfconventions.org/cf-conventions/cf-conventions#time-coordinate
xcdat supports the same CF convention calendars as xarray (based on the cftime and netCDF4-python packages).
Supported calendars include:
• 'standard'
• 'gregorian'
• 'proleptic_gregorian'
• 'noleap'
• '365_day'
• '360_day'
• 'julian'
• 'all_leap'
• '366_day'
References:
• https://cfconventions.org/cf-conventions/cf-conventions#calendar
Why does xcdat decode time coordinates as cftime objects instead of datetime64[ns]?
One unfortunate limitation of using datetime64[ns] is that it limits the native representation of dates to those that
fall between the years 1678 and 2262. This affects climate modeling datasets that have time coordinates outside of this
range.
As a workaround, xarray uses the cftime library when decoding/encoding datetimes for non-standard calendars or
for dates before year 1678 or after year 2262.
xcdat opted to decode time coordinates exclusively with cftime because it has no timestamp range limitations, sim-
plifies implementation, and the output object type is deterministic.
References:
• https://github.com/pydata/xarray/issues/789
• https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamp-limitations
xcdat aims to implement generalized functionality. This means that functionality intended to handle data quality issues
is out of scope, especially for limited cases.
If data quality issues are present, xarray and xcdat might not be able to open the datasets. Examples of data quality
issues include conflicting floating point values between files or non-CF compliant attributes that are not common.
A few workarounds include:
1. Configuring open_dataset() or open_mfdataset() keyword arguments based on your needs.
2. Writing a custom preprocess() function to feed into open_mfdataset(). This function preprocesses each dataset file individually before joining them into a single Dataset object (see the sketch below).
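A sketch of workaround 2 (the per-file fix shown is hypothetical):
import xarray as xr
import xcdat

def preprocess(ds: xr.Dataset) -> xr.Dataset:
    # Hypothetical per-file fix: cast bounds to a consistent dtype so the
    # files concatenate cleanly into a single Dataset.
    if "lat_bnds" in ds:
        ds["lat_bnds"] = ds["lat_bnds"].astype("float64")
    return ds

ds = xcdat.open_mfdataset("path/to/files/*.nc", preprocess=preprocess)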
In xarray, the default setting for checking compatibility across a multi-file dataset is compat='no_conflicts'.
In cases where variable values conflict between files, xarray raises MergeError: conflicting values for
variable <VARIABLE NAME> on objects to be combined. You can skip this check by specifying
compat="override".
If you still intend on working with these datasets and recognize the source of the issue (e.g., minor floating point diffs),
follow the workarounds below. Please proceed with caution. You should understand the potential implications of
these workarounds.
1. Pick the first bounds variable and keep dimensions the same as the input files
• This option is recommended if you know bounds values should be the same across all files, but one or
more files has inconsistent bounds values which breaks the concatenation of files into a single xr.Dataset
object.
>>> ds = xcdat.open_mfdataset(
"path/to/files/*.nc",
compat="override",
data_vars="minimal",
coords="minimal",
join="override",
)
10.7.5 Regridding
xcdat extends and provides a uniform interface to xESMF and xgcm. In addition, xcdat provides a port of the CDAT
regrid2 package.
Structured rectilinear and curvilinear grids are supported.
ds = xcdat.open_dataset(...)
grid = ds.regridder.grid
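A sketch of horizontal regridding to a new grid (the create_uniform_grid call and "ts" variable are assumptions):
import xcdat

ds = xcdat.open_dataset("path/to/file.nc")

# A hypothetical 4 x 5 degree global target grid.
output_grid = xcdat.create_uniform_grid(-90, 90, 4.0, -180, 180, 5.0)
ds_out = ds.regridder.horizontal("ts", output_grid, tool="regrid2")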
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience
for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity
and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste,
color, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
Examples of behavior that contributes to a positive environment for our community include:
• Demonstrating empathy and kindness toward other people
• Being respectful of differing opinions, viewpoints, and experiences
• Giving and gracefully accepting constructive feedback
• Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
• Focusing on what is best not just for us as individuals, but for the overall community
Examples of unacceptable behavior include:
• The use of sexualized language or imagery, and sexual attention or advances of any kind
• Trolling, insulting or derogatory comments, and personal or political attacks
• Public or private harassment
• Publishing others’ private information, such as a physical or email address, without their explicit permission
• Other conduct which could reasonably be considered inappropriate in a professional setting
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take
appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, is-
sues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
10.8.4 Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing
the community in public spaces. Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed representative at an online or offline event.
10.8.5 Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders re-
sponsible for enforcement at [email protected]. All complaints will be reviewed and investigated
promptly and fairly.
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action
they deem in violation of this Code of Conduct:
1. Correction
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the
community.
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation
and an explanation of why the behavior was inappropriate. A public apology may be requested.
2. Warning
3. Temporary Ban
Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
Consequence: A temporary ban from any sort of interaction or public communication with the community for a
specified period of time. No public or private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent
ban.
4. Permanent Ban
Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate
behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
Consequence: A permanent ban from any sort of public interaction within the community.
10.8.7 Attribution
This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.
contributor-covenant.org/version/2/1/code_of_conduct.html.
Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.
For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq.
Translations are available at https://www.contributor-covenant.org/translations.
10.9 Contributing
Contributions are welcome and greatly appreciated! Every little bit helps, and credit will always be given.
Bug Report
Look through the GitHub Issues for bugs to fix. Any unassigned issues tagged with “Type: Bug” are open for implementation.
Feature Request
Look through the GitHub Issues for feature suggestions. Any unassigned issues tagged with “Type: Enhancement” are open for implementation.
If you are proposing a feature:
• Explain in detail how it would work.
• Keep the scope as narrow as possible, to make it easier to implement.
• Remember that this is an open-source project, and that contributions are welcome :)
Features must meet the following criteria before they are considered for implementation:
1. Feature is not implemented by xarray
2. Feature is not implemented in another actively developed xarray-based package
• For example, cf_xarray already handles interpretation of CF convention attributes on xarray objects
3. Feature is not limited to specific use cases (e.g., data quality issues)
4. Feature is generally reusable
5. Feature is relatively simple and lightweight to implement and use
Documentation Update
Help improve xCDAT’s documentation, whether that be the Sphinx documentation or the API docstrings.
Community Discussion
Take a look at the GitHub Discussions page to get involved, share ideas, or ask questions.
The repository uses branch-based (core team) and fork-based (external collaborators) Git workflows with tagged soft-
ware releases.
Guidelines
Things to Avoid
Pre-commit
The repository uses the pre-commit package to manage pre-commit hooks. These hooks help with quality assurance
standards by identifying simple issues at the commit level before submitting code reviews.
We recommend using VS Code as your IDE because it is open-source and has great Python development support.
Get VS Code here: https://code.visualstudio.com
VS Code Setup
xCDAT includes a VS Code workspace file (.vscode/xcdat.code-workspace). This file automatically configures your IDE with the quality assurance tools, code line-length rulers, and more.
Make sure to follow the Local Development section below.
• Python
• Pylance
• Python Docstring Generator
• Python Type Hint
• Better Comments
• Jupyter
• Visual Studio Intellicode
Local Development
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash ./Miniconda3-latest-Linux-x86_64.sh
Do you wish the installer to initialize Miniconda3 by running conda init? [yes|no] yes
MacOS
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
$ bash ./Miniconda3-latest-MacOSX-x86_64.sh
Do you wish the installer to initialize Miniconda3 by running conda init? [yes|no] yes
$ cd xcdat
$ conda env create -f conda-env/dev.yml
$ conda activate xcdat_dev
$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
9. <OPTIONAL> During or after making changes, check for formatting or linting issues using pre-commit:
# Step 11 performs this automatically on staged files in a commit
$ pre-commit run --all-files
10. Generate code coverage report and check unit tests pass:
$ make test # Automatically opens HTML report in your browser
$ pytest # Does not automatically open HTML report in your browser
tests/test_dataset.py ..
tests/test_xcdat.py .
• The Coverage HTML report is much more detailed (e.g., exact lines of tested/untested code)
11. Commit your changes:
$ git add .
$ git commit -m "Your detailed description of your changes"
12. Make sure pre-commit QA checks pass. Otherwise, fix any caught issues.
• Most of the tools fix issues automatically so you just need to re-stage the files.
• flake8 and mypy issues must be fixed manually.
13. Push changes:
Before you submit a pull request, check that it meets these guidelines:
1. The pull request should include tests for new or modified code.
2. Link issues to pull requests.
3. If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with
a docstring, and add the feature to the list in README.rst.
4. Squash and rebase commits for a clean and navigable Git history.
When you open a pull request on GitHub, there is a template available for use.
xCDAT integrates the Black code formatter for code styling. If you want to learn more, please read about it here.
xCDAT also leverages Python Type Annotations to help the project scale. mypy performs optional static type checking
through pre-commit.
10.9.6 Testing
Testing your local changes is important to ensure the long-term maintainability and extensibility of the project. Since xCDAT is an open source library, we aim to prevent as many bugs as possible from reaching the end-user.
To get started, here are guides on how to write tests using pytest:
• https://docs.pytest.org/en/latest/
• https://docs.python-guide.org/writing/tests/#py-test
In most cases, if a function is hard to test, it is usually a symptom of it being too complex (high cyclomatic complexity).
If you are using VS code, the Python Docstring Generator extension can be used to auto-generate a docstring snippet
once a function/class has been written. If you want the extension to generate docstrings in Sphinx format, you must set
the "autoDocstring.docstringFormat": "sphinx" setting, under File > Preferences > Settings.
Note that it is best to write the docstrings once you have fully defined the function/class, since the extension will then generate the full docstring. If you make any changes to the code after a docstring is generated, you will have to manually update the affected docstrings.
More info on docstrings here: https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html
• DO explain why something is done, its purpose, and its goal. The code shows how it is done, so commenting on
this can be redundant.
• DO explain ambiguity or complexities to avoid confusion
• DO embrace documentation as an integral part of the overall development process
• DO treat documenting as code and follow principles such as Don’t Repeat Yourself and Easier to Change
• flake8 will warn you if the cyclomatic complexity of a function is too high.
– https://github.com/PyCQA/mccabe
Note: Run make help in the root of the project for a list of useful commands
$ pytest tests/test_xcdat.py
10.9.10 FAQs
Before you merge a support branch back into main, the branch is typically squashed down to a single buildable commit,
and then rebased on top of the main repo’s main branch.
Why?
• Ensures build passes from the commit
• Cleans up Git history for easy navigation
• Makes collaboration and review process more efficient
• Makes handling conflicts from rebasing simple since you only have to deal with conflicted commits
3. Squash commits:
This project uses GitHub Actions to run the CI/CD build workflow.
This workflow is triggered by Git pull_request and push (merging PRs) events to the main repo’s main branch.
Jobs:
1. Run pre-commit for formatting, linting, and type checking
2. Build conda CI/CD environment with different Python versions, install package, and run test suite
• Tom Vo <[email protected]>
• Jason Boutte
• Stephen Po-Chedley
• Jill Chengzhu Zhang
• Jiwoo Lee
Index
H
horizontal() (xarray.Dataset.regridder method)
horizontal() (xcdat.regridder.accessor.RegridderAccessor method)
horizontal() (xcdat.regridder.regrid2.Regrid2Regridder method)
horizontal() (xcdat.regridder.xesmf.XESMFRegridder method)
horizontal() (xcdat.regridder.xgcm.XGCMRegridder method)
horizontal_regrid2() (xcdat.regridder.accessor.RegridderAccessor method)
horizontal_xesmf() (xcdat.regridder.accessor.RegridderAccessor method)
K
keys (xarray.Dataset.bounds attribute)
keys (xcdat.bounds.BoundsAccessor property)
M
map (xarray.Dataset.bounds attribute)
map (xcdat.bounds.BoundsAccessor property)
O
open_dataset() (in module xcdat)
open_mfdataset() (in module xcdat)
R
Regrid2Regridder (class in xcdat.regridder.regrid2)
RegridderAccessor (class in xcdat.regridder.accessor)
S
SpatialAccessor (class in xcdat.spatial)
swap_lon_axis() (in module xcdat)
T
TemporalAccessor (class in xcdat.temporal)
V
vertical() (xarray.Dataset.regridder method)
vertical() (xcdat.regridder.accessor.RegridderAccessor method)
vertical() (xcdat.regridder.regrid2.Regrid2Regridder method)
vertical() (xcdat.regridder.xesmf.XESMFRegridder method)
vertical() (xcdat.regridder.xgcm.XGCMRegridder method)
X
XESMFRegridder (class in xcdat.regridder.xesmf)
XGCMRegridder (class in xcdat.regridder.xgcm)