ODC_Cheatsheet

Loading and analysing Earth Observation data with the Open Data Cube

Website: www.opendatacube.org
Datacube docs: https://datacube-core.readthedocs.io/en/latest/
odc-geo docs: https://odc-geo.readthedocs.io/en/latest/

Getting started

Import Python packages and connect to database:

    import datacube  # used for querying and loading data
    import odc.geo.xr  # enables additional geospatial tools

    dc = datacube.Datacube()

List available products in the datacube:

    dc.list_products()

List the measurements (e.g. bands or variables) available for each datacube product:

    dc.list_measurements()


Loading data

Load a specific product and measurements:

    ds = dc.load(
        product="ga_ls8c_ard_3",
        measurements=["nbart_red", "nbart_blue", "fmask"], ...)

Load data for a specific spatial extent:

  Degrees lat/lon coordinates (WGS84/EPSG:4326):

    dc.load(...
        y=(-32.2, -32.5),
        x=(142.2, 142.5))

  Custom coordinate reference system (e.g. Australian Albers):

    dc.load(...
        x=(948280, 981840),
        y=(-3546480, -3584720),
        crs="EPSG:3577")

Loading data by time:

  From a specific date:

    dc.load(...
        time="2020-01-01")

  From an entire year:

    dc.load(...
        time="2020")

  All data from 2020 to 2022 (inclusive of start and end):

    dc.load(...
        time=("2020", "2022"))

  All data from 2020 onward (inclusive of start):

    dc.load(...
        time=("2020", None))

Group sequential images captured along each satellite path into daily timesteps:
(only required for products with daily acquisitions, e.g. Landsat or Sentinel-2; not required for summary products like annual or monthly datasets)

    dc.load(..., group_by="solar_day")

Load and reproject data into a custom coordinate reference system and resolution grid, e.g. UTM Zone 55 S, 200 metre resolution:
(for most CRSs, the first value is negative by convention)

    dc.load(...
        output_crs="EPSG:32755",
        resolution=(-200, 200))  # -y, x

Apply custom resampling when reprojecting (default is “nearest”):

  Use “average” resampling for all bands:

    dc.load(...
        resampling="average")

  Use “nearest” resampling for the “fmask” band, “average” for all others:

    dc.load(...
        resampling={
            "fmask": "nearest",
            "*": "average"})

Lazily load data using Dask:
(used for parallelization and managing memory; chunk sizes will depend on data)

    dc.load(..., dask_chunks={"y": 2048, "x": 2048})


Preparing data for analysis

Inspect nodata attributes and cloud masking band flags:

    ds.nbart_red.odc.nodata
    ds.fmask.attrs["flags_definition"]

Setting nodata pixels (e.g. -999) to NaN:

    ds_masked = datacube.utils.masking.mask_invalid_data(ds)

Convert a cloud masking band into a boolean mask and apply to a dataset (setting cloud pixels to NaN):

    cloud_mask = datacube.utils.masking.make_mask(
        ds.fmask, fmask="cloud")
    ds_masked = ds.where(~cloud_mask)


Basic analysis with xarray

Selecting a subset of data:

  Use “.isel()” for “index selection”, e.g. select first 5 values along the data’s y and x dimensions:

    ds.isel(
        y=slice(0, 5),
        x=slice(0, 5))

  Use “.sel()” for “coordinate selection”, e.g. select all pixels between specific y and x coordinates:

    ds.sel(
        y=slice(-3867375, -3867350),
        x=slice(1516200, 1541300))

Aggregating data (e.g. min, max, mean, median, std):

  Calculate means for every pixel across time, producing a 2D image:

    ds.mean(dim="time")

  Calculate means across all pixels in each timestep, producing a 1D timeseries:

    ds.mean(dim=["y", "x"])


Plotting and exporting data

Plot on an interactive map for rapid data exploration:

    ds.isel(time=0).odc.explore()  # also works for single bands

Plotting single bands as a static plot:

  Plot a single timestep:

    ds.fmask.isel(time=0).plot()

  Plot multiple timesteps:

    ds.fmask.plot(
        col="time", col_wrap=4)

Plotting multiple bands as an RGB image:
(will auto-guess red, green and blue bands if they exist in the data)

    ds.isel(time=0).odc.to_rgba().plot.imshow()

Export data as a cloud optimised GeoTIFF raster file:

    ds.isel(time=0).fmask.odc.write_cog("output_filename.tif")


GeoBox and geospatial tools

View a dataset’s “GeoBox” defining its spatial pixel grid:

    ds.odc.geobox
    ds.odc.geobox.crs  # coordinate reference system (CRS)
    ds.odc.geobox.resolution  # spatial pixel resolution
    ds.odc.geobox.boundingbox  # spatial extent of data

Reproject a loaded dataset:

  Reproject to a different CRS:

    ds.odc.reproject(
        how="EPSG:32755")

  Reproject to another dataset’s GeoBox:

    ds_wgs84  # data in another CRS
    ds.odc.reproject(
        how=ds_wgs84.odc.geobox)

Mask or crop a dataset to the extent of a polygon:

    from odc.geo.geom import Geometry
    geopolygon = Geometry(<shapely_polygon>, crs="EPSG:4326")

    # Mask data to set pixels outside polygon to NaN
    ds_masked = ds.odc.mask(poly=geopolygon)

    # Crop data to extent of polygon (and optionally mask)
    ds_cropped = ds.odc.crop(poly=geopolygon, apply_mask=True)
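Worked example (no datacube required)

The masking, subsetting and aggregation steps from this cheatsheet can be tried without a datacube connection. The sketch below is only an illustration under stated assumptions: the dataset is synthetic, the -999 nodata value and all coordinates are made up, and plain xarray ".where()" calls stand in for "mask_invalid_data" and "make_mask", which need real product metadata.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a dc.load() result: 3 timesteps on a 4 x 4 pixel grid.
# All values, coordinates and the -999 nodata value are assumptions for illustration.
times = np.array(["2020-01-01", "2020-01-17", "2020-02-02"], dtype="datetime64[ns]")
y = np.array([-100.0, -300.0, -500.0, -700.0])  # descending, as in most north-up rasters
x = np.array([100.0, 300.0, 500.0, 700.0])

red = np.stack([np.full((4, 4), v) for v in (1000.0, 2000.0, 3000.0)])
red[0, 0, 0] = -999.0  # one nodata pixel in the first timestep

cloud = np.zeros((3, 4, 4), dtype=bool)
cloud[2, 1, 1] = True  # one cloudy pixel in the last timestep

coords = {"time": times, "y": y, "x": x}
ds = xr.Dataset({"nbart_red": (("time", "y", "x"), red)}, coords=coords)
cloud_mask = xr.DataArray(cloud, dims=("time", "y", "x"), coords=coords)

# Set nodata pixels to NaN (plain-xarray stand-in for mask_invalid_data)
ds_valid = ds.where(ds != -999.0)

# Set cloud pixels to NaN (cloud_mask is a stand-in for the make_mask output)
ds_clear = ds_valid.where(~cloud_mask)

# Aggregate: NaN pixels are skipped by default
mean_image = ds_clear.mean(dim="time")        # 2D image with dims (y, x)
mean_series = ds_clear.mean(dim=["y", "x"])   # 1D timeseries with dim (time,)

# Index selection: first 2 rows and columns
first_pixels = ds_clear.isel(y=slice(0, 2), x=slice(0, 2))

# Coordinate selection: slice bounds follow the order of the coordinate,
# so a descending y axis is sliced from the larger to the smaller value
window = ds_clear.sel(y=slice(-300, -500), x=slice(300, 500))

print(mean_image.nbart_red.values)
print(mean_series.nbart_red.values)
```

Because NaN values are ignored in the means, the nodata pixel and the cloudy pixel each average over the two remaining timesteps rather than dragging the result down. If a ".sel()" call on real data returns an empty array, checking whether the y coordinate is ascending or descending is a sensible first step.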

Designed: Robbi Bishop-Taylor (@SatelliteSci), Geoscience Australia. Modified: Feb 2024 (datacube==1.8.17, odc-geo==0.4.2)
